links: JS MOC


Illustrating Lexical Scope

We explored how scope is determined during code compilation, a model called “lexical scope”. The term “lexical” refers to the first stage of compilation (lexing/parsing).

To properly reason about our programs, it’s important to have a solid conceptual foundation of how scope works. If we rely on guesses and intuition, we may accidentally get right answers some of the time, but many other times we are far off. This isn’t a recipe for success.

Like way back in grade school, getting the right answer isn’t enough if we don’t show the correct steps to get there! We need to build accurate and helpful mental models as a foundation moving forward.

Marbles, Buckets and Bubbles… Oh My!

One metaphor that is effective for understanding scope is sorting colored marbles into buckets of their matching color.

Imagine you come across a pile of marbles, and notice that all the marbles are colored red, blue or green. Let’s sort all the marbles, dropping the red ones into a red bucket, green into a green bucket and blue into a blue bucket. After sorting, when you later need a green marble, you already know the green bucket is where to go to get it.

In this metaphor, the marbles are the variables in our program. The buckets are scopes (functions and blocks), which we just conceptually assign individual colors for our discussion purposes. The color of each marble is thus determined by which color scope we find the marble originally created in.

Let’s annotate the running program example from Chapter 1 with scope color labels:

// outer/global scope: RED
 
var students = [
    { id: 14, name: "Kyle" },
    { id: 73, name: "Suzy" },
    { id: 112, name: "Frank" },
    { id: 6, name: "Sarah" }
];
 
function getStudentName(studentID) {
    // function scope: BLUE
 
    for (let student of students) {
        // loop scope: GREEN
 
        if (student.id == studentID) {
            return student.name;
        }
    }
}
 
var nextStudent = getStudentName(73);
console.log(nextStudent);   // Suzy

We have designated three scope colors with code comments: RED (outermost global scope), BLUE (scope of function getStudentName(..)), and GREEN (scope of/inside the for loop). But it may still be difficult to recognize the boundaries of these scope buckets when looking at the code.

The image below helps visualize the boundaries of the scopes by drawing colored bubbles (aka buckets) around each. The line numbers refer to the program as written without the scope-color comments:

  1. Bubble 1 (RED) encompasses the global scope, which holds three identifiers/variables: students (line 1), getStudentName (line 8) and nextStudent (line 16).
  2. Bubble 2 (BLUE) encompasses the scope of the function getStudentName (line 8), which holds just one identifier/variable: the parameter studentID (line 8).
  3. Bubble 3 (GREEN) encompasses the scope of the for-loop (line 9), which holds just one identifier/variable: student (line 9).

Note:
Technically, the parameter studentID is not exactly in the BLUE(2) scope. We’ll unwind that confusion in “Implied Scopes” in Appendix A. For now, it’s close enough to label studentID a BLUE(2) marble.

Scope bubbles are determined during compilation based on where the functions/blocks of scope are written, the nesting inside each other, and so on. Each scope bubble is entirely contained within its parent scope bubble; a scope bubble is never partially in two different outer scopes.

Each marble (variable/identifier) is colored based on which bubble (bucket) it’s declared in, not the color of the scope it may be accessed from (e.g. students on line 9 and studentID on line 10).

Note:
Remember we asserted in Chapter 1 that id, name and log are all properties, not variables; in other words, they’re not marbles in buckets, so they don’t get colored based on any of the rules we’re discussing in this book. To understand how such property accesses are handled, read the Objects & Classes book in this series.

As the JS engine processes the program (during compilation) and finds the declaration for a variable, it essentially asks, “Which color scope (bucket/bubble) am I currently in?” The variable is designated as that same color, meaning it belongs to that bucket/bubble.

The GREEN(3) bucket is wholly nested in the BLUE(2) bucket, and similarly the BLUE(2) bucket is wholly nested in the RED(1) bucket. Scopes can nest inside each other as shown, to any depth of nesting your program needs.

References (non-declarations) to variables/identifiers are allowed if there is a matching declaration either in the current scope or in any scope above/outside the current scope, but not if the only matching declaration is in a lower/nested scope.

An expression in RED(1) bucket only has access to RED(1) marbles, not BLUE(2) or GREEN(3). An expression in the BLUE(2) bucket can either reference BLUE(2) or RED(1) marbles, not GREEN(3). And an expression in the GREEN(3) bucket has access to RED(1), BLUE(2) and GREEN(3) marbles.
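These access rules can be checked with a small sketch. The names redMarble, blueMarble, greenMarble, probe and seen are made up for illustration and are not part of the running example. typeof is used for the out-of-reach names because it reports "undefined" instead of throwing.

```javascript
// RED scope (outermost)
var redMarble = "red";
var seen = [];   // hypothetical log of what each scope could reach

function probe() {
    // BLUE scope
    var blueMarble = "blue";

    for (let greenMarble of ["green"]) {
        // GREEN scope: RED, BLUE and GREEN marbles are all reachable
        seen.push([redMarble, blueMarble, greenMarble].join(","));
    }

    // back in BLUE: the GREEN marble is not reachable from here
    seen.push(typeof greenMarble);
}

probe();

// back in RED: the BLUE marble is not reachable from here
seen.push(typeof blueMarble);

// seen now holds: "red,blue,green", "undefined", "undefined"
```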

We can conceptualize the process of determining these non-declaration marble colors during runtime as a lookup. Since the students variable reference in the for-of loop statement on line 9 is not a declaration, it has no color. So we ask the current BLUE(2) scope bucket if it has a marble matching that name. Since it doesn’t, the lookup continues with the next outer/containing scope: RED(1). The RED(1) bucket has a marble of the name students, so the loop statement’s students variable reference is determined to be a RED(1) marble.

The if (student.id == studentID) statement on line 10 is similarly determined to reference a GREEN(3) marble named student and a BLUE(2) marble named studentID.

Note:
The JS engine doesn’t generally determine these marble colors during runtime; the “lookup” here is a rhetorical device to help you understand the concepts. During compilation, most or all variable references will match already-known scope buckets, so their color is already determined and stored with each marble reference to avoid unnecessary lookups as the program runs. Chapter 3 covers this nuance.

The key takeaways from marbles and buckets (and bubbles):

  • Variables are declared in specific scopes, which can be thought of as colored marbles from matching-color buckets.
  • Any variable reference that appears in the scope where it was declared, or appears in any deeper nested scope, will be labeled a marble of that same color, unless an intervening scope “shadows” the variable declaration (see “Shadowing” in Chapter 3).
  • The determination of colored buckets, and the marbles they contain, happens during compilation. This information is used for variable (marble color) “lookups” during code execution.

A conversation among friends

Another useful metaphor for the process of analyzing variables and the scopes they come from is to imagine various conversations that occur inside the engine as the code is processed and then executed. We can “listen in” on these conversations to get a better conceptual foundation for how scopes work.

Let’s meet the members of the JS engine that will have conversations as they process our program:

  • Engine: responsible for start-to-finish compilation and execution of our JavaScript program.
  • Compiler: one of Engine’s friends; handles all the dirty work of parsing and code generation.
  • Scope Manager: another friend of Engine; collects and maintains a lookup list of all the declared variables/identifiers, and enforces a set of rules as to how these are accessible to currently executing code.

To fully understand how the JS engine works, you need to begin to think like Engine (and friends) think, ask the questions they ask, and answer their questions likewise.

Let’s explore these conversations with an example:

var students = [
    { id: 14, name: "Kyle" },
    { id: 73, name: "Suzy" },
    { id: 112, name: "Frank" },
    { id: 6, name: "Sarah" }
];
 
function getStudentName(studentID) {
    for (let student of students) {
        if (student.id == studentID) {
            return student.name;
        }
    }
}
 
var nextStudent = getStudentName(73);
 
console.log(nextStudent);
// Suzy

Let’s examine how JS is going to process that program, specifically starting with the first statement. The array and its values are just basic JS value literals (and thus unaffected by any scoping concerns), so our focus here will be on the var students = [..] declaration and initialization-assignment parts.

We typically think of that as a single statement, but that’s not how our friend Engine sees it. In fact, JS treats these as two distinct operations, one which Compiler will handle during compilation, and the other which Engine will handle during execution.

The first thing Compiler will do with the program is perform lexing to break it down into tokens, which it will then parse into a tree (AST).

Once Compiler gets to code generation, there’s more detail to consider than may be obvious. A reasonable assumption would be that Compiler produces code for the first statement such as “Allocate memory for a variable, label it students, then stick a reference to the array into that variable”. But that’s not the whole story.

Here are the steps Compiler will follow to handle that statement:

  1. Encountering var students, Compiler will ask Scope Manager to see if a variable named students already exists for that particular scope bucket. If so, Compiler would ignore this declaration and move on. Otherwise, Compiler will produce code that (at execution time) asks Scope Manager to create a new variable called students in that scope bucket.
  2. Compiler then produces code for Engine to later execute, to handle the students = [..] assignment. The code Engine runs will first ask Scope Manager if there is a variable called students accessible in the current scope bucket. If not, Engine keeps looking elsewhere (see “Nested Scope”). Once Engine finds a variable, it assigns the reference to the [..] array to it.
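The split between the two steps can be observed from inside a program. The following is a minimal sketch, reusing students with a shortened array; valueBefore and countAfter are names made up for illustration.

```javascript
// The declaration was already handled during compilation, so the name
// students exists here even though the assignment has not run yet.
// Reading it gives undefined instead of throwing a ReferenceError.
var valueBefore = students;

// At execution time, Engine performs the assignment part.
var students = [
    { id: 14, name: "Kyle" },
    { id: 73, name: "Suzy" }
];

var countAfter = students.length;   // 2
```

If the declaration and the assignment were a single runtime operation, the first read would fail, because the variable would not exist yet.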

In conversational form, the first phase of compilation for the program might play out between Compiler and Scope Manager like this:

Compiler: Hey, Scope Manager (of the global scope), I found a formal declaration for an identifier called students, ever heard of it?

(Global) Scope Manager: Nope, never heard of it, so I just created it for you.

Compiler: Hey, Scope Manager, I found a formal declaration for an identifier called getStudentName, ever heard of it?

(Global) Scope Manager: Nope, but I just created it for you.

Compiler: Hey, Scope Manager, getStudentName points to a function, so we need a new scope bucket.

(Function) Scope Manager: Got it, here’s the scope bucket.

Compiler: Hey, Scope Manager (of function scope), I found a formal parameter declaration for studentID, ever heard of it?

(Function) Scope Manager: Nope, but it’s created in this scope.

Compiler: Hey, Scope Manager (of the function), I found a for-loop that will need its own scope bucket.

The conversation is a question-and-answer exchange, where Compiler asks the current Scope Manager if an encountered identifier declaration has already been encountered. If the answer is “no”, Scope Manager creates that variable in that scope. If the answer is “yes”, the declaration is effectively skipped over, since there is nothing more for Scope Manager to do.

Compiler also signals when it runs across functions or block scopes, so that a new scope bucket and Scope Manager can be instantiated.
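The compile-time exchange can be modeled with a toy class. This is only a sketch of the metaphor: ScopeManager, declare, label and the other names below are hypothetical, and real engines expose nothing like this.

```javascript
// Hypothetical toy model of a scope bucket and its manager.
class ScopeManager {
    constructor(label, parent = null) {
        this.label = label;       // e.g. "global", "function", "for-loop"
        this.parent = parent;     // the enclosing scope bucket, if any
        this.names = new Set();   // identifiers declared in this bucket
    }

    // Compiler's question: "ever heard of it?"
    // Returns true if the name was newly created, false if already known.
    declare(name) {
        if (this.names.has(name)) return false;
        this.names.add(name);
        return true;
    }
}

var globalScope = new ScopeManager("global");
var answers = [
    globalScope.declare("students"),         // true: newly created
    globalScope.declare("getStudentName"),   // true: newly created
    globalScope.declare("students")          // false: a repeated declaration is skipped
];

// A function or block gets its own bucket, nested in the current one.
var functionScope = new ScopeManager("function", globalScope);
functionScope.declare("studentID");

var loopScope = new ScopeManager("for-loop", functionScope);
loopScope.declare("student");
```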

Later, when it comes to the execution of the program, the conversation will shift to the Engine and Scope Manager, and might play out like this:

Engine: Hey, Scope Manager (of the global scope), before we begin, can you look up the identifier getStudentName so I can assign this function to it?

(Global) Scope Manager: Yep, here’s the variable.

Engine: Hey, Scope Manager, I found a target reference for students, ever heard of it?

(Global) Scope Manager: Yes, it was formally declared for this scope. Here it is.

Engine: Thanks, I’m initializing students to undefined so it’s ready to use.

Engine: Hey, Scope Manager (of the global scope), I found a target reference for nextStudent, ever heard of it?

(Global) Scope Manager: Yes, it was formally declared for this scope, so here it is.

Engine: Thanks, I’m initializing nextStudent to undefined so it’s ready to use.

Engine: Hey, Scope Manager (of the global scope), I found a source reference for getStudentName, ever heard of it?

(Global) Scope Manager: Yes, it was formally declared for this scope. Here it is.

Engine: Great, the value in getStudentName is a function, so I’m going to execute it.

Engine: Hey, Scope Manager, now we need to instantiate the function’s scope.

The conversation is another question-and-answer exchange, where Engine first asks the current Scope Manager to look up the hoisted getStudentName identifier, so as to associate the function with it. Engine then proceeds to ask Scope Manager about the target reference for students, and so on.

To review and summarize, a statement like var students = [..] is processed in two distinct steps:

  1. Compiler sets up the declaration of the scope variable (since it wasn’t previously declared in the current scope).
  2. While Engine is executing, to process the assignment part of the statement, Engine asks Scope Manager to look up the variable, initializes it to undefined so it’s ready to use, and then assigns the array value to it.

Nested Scope

Scopes can be lexically nested to any arbitrary depth. A new scope is usually marked by a pair of curly braces {..}: function bodies, for-loops, if statements and switch statements all create a new scope nested under their parent scope. (Object literals also use curly braces but do not create scopes, and var declarations are contained only by functions, not by blocks.)

Each scope gets its own Scope Manager instance each time that scope is executed. Each scope has all its identifiers registered at the start of the scope being executed (this is called “variable hoisting”).

A reference lookup starts in the innermost scope and moves outward, like this:

Outer Scope
 _______________
|  Inner Scope  |
|  _________    |
| |________|    |
|_______________|

One of the key aspects of lexical scope is that any time an identifier reference cannot be found in the current scope, the next outer scope in the nesting is consulted; that process is repeated until an answer is found or there are no more scopes to consult.
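The outward walk can be sketched as a small function over linked scope objects. This is a conceptual model only (as noted earlier, the engine generally resolves these references during compilation); the objects red, blue and green and the function lookup are hypothetical names, filled in with the identifiers from the running example.

```javascript
// Hypothetical model: each scope lists its declared names and links to
// its parent (outer) scope.
var red = { color: "RED", names: ["students", "getStudentName", "nextStudent"], parent: null };
var blue = { color: "BLUE", names: ["studentID"], parent: red };
var green = { color: "GREEN", names: ["student"], parent: blue };

// Start in the current scope and move outward until the name is found
// or there are no more scopes to consult.
function lookup(scope, name) {
    for (let current = scope; current !== null; current = current.parent) {
        if (current.names.includes(name)) return current.color;
    }
    return null;   // the lookup failed in every available scope
}

lookup(green, "student");     // "GREEN"
lookup(green, "studentID");   // "BLUE"
lookup(blue, "students");     // "RED"
lookup(blue, "student");      // null: nested scopes are never consulted
```

The walk only ever follows parent links, which is why a reference can see outward but never into a nested scope.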

Lookup Failures

When Engine exhausts all lexically available scopes (moving outward) and still cannot find an identifier, an error condition exists. Depending on the mode of the program (strict or non-strict) and the role of the variable (target or source), this error condition will be handled differently.

Undefined mess

If the variable is a source, an unresolved identifier lookup is considered an undeclared (unknown or missing) variable, which always results in a ReferenceError being thrown. Also, if the variable is a target and the code at that moment is running in strict mode, the variable is considered undeclared and similarly throws a ReferenceError.

// Source reference to an undeclared variable
for (let biscuit of biscuits) {
    console.log(biscuit);
}
// Uncaught ReferenceError: biscuits is not defined

The error message for an undeclared variable condition, in most JS environments, looks like “ReferenceError: XYZ is not defined”. The phrase “not defined” seems almost identical to the word “undefined”, as far as the English goes. But these two are very different in JS, and this error message unfortunately creates a persistent confusion.

“Not defined” really means “not declared”, or rather “undeclared”, as in a variable that has no matching declaration in any lexically available scope. By contrast, “undefined” really means a variable was found (declared), but the variable doesn’t have any value at the moment.

To perpetuate the confusion even further, JS’s typeof operator returns the string "undefined" for variable references in either state:

var studentName;
 
typeof studentName //  "undefined"
 
typeof doesntExist // "undefined"
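Although typeof reports the same string for both states, they behave differently when the variable is read directly. The sketch below reuses studentName and doesntExist from above; declaredRead and undeclaredRead are names made up for illustration.

```javascript
var studentName;   // declared, but currently holds no value

// Reading a declared-but-empty variable is fine: the result is undefined.
var declaredRead = studentName;

// Reading an undeclared variable is a failed source reference lookup.
var undeclaredRead;
try {
    undeclaredRead = doesntExist;
} catch (err) {
    undeclaredRead = err.name;   // "ReferenceError"
}
```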

Global… What?

If the variable is a target and the code is running in non-strict mode, a surprising legacy behavior kicks in: the assignment creates a global variable.

// Target reference to an undeclared variable
chocolates = ['Five Star', 'Dairy Milk Silk'];

The above statement will throw a ReferenceError in strict mode, but creates a global variable in non-strict mode.

Always use strict mode, and always formally declare your variables. You’ll then get a helpful ReferenceError if you ever mistakenly try to assign to an undeclared variable.
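The two behaviors can be compared side by side. This sketch relies on the fact that a function body created with the Function constructor is non-strict unless it starts with a "use strict" directive; sloppyAssign, strictAssign, candies, createdGlobal and strictResult are names made up for illustration.

```javascript
// Two function bodies that each assign to an undeclared variable.
var sloppyAssign = new Function("chocolates = ['Five Star'];");
var strictAssign = new Function("'use strict'; candies = ['Toffee'];");

// Non-strict: the assignment silently creates a global variable.
sloppyAssign();
var createdGlobal = Array.isArray(globalThis.chocolates);   // true

// Strict: the same kind of assignment throws instead.
var strictResult;
try {
    strictAssign();
    strictResult = "no error";
} catch (err) {
    strictResult = err.name;   // "ReferenceError"
}

delete globalThis.chocolates;   // clean up the accidental global
```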

Building on metaphors

Think of a multi-storey building as a nested scope collection. The currently executing scope is the floor you are currently on. The top floor is the global scope.

You resolve a target or source reference by first looking on the current floor. If you don’t find it there, you take the lift to the next floor (the next outer scope), and so on, until you find it or reach the top floor.


tags: ydkjs , basics

Beautiful code lexical scope in javascript