Symbol Tables
Stacks of maps - managing scope, variables, and functions
What is a symbol table?
A symbol table is a map of names to types - but more precisely, it’s a stack of maps. Each map represents a scope, and the stack manages scope nesting.
class SymbolTable {
LinkedList<Map<String, CatScriptType>> scopes = new LinkedList<>();
SymbolTable() {
scopes.push(new HashMap<>()); // global scope
}
}
The constructor creates the first map - the global scope. Anything declared at the top level of a CatScript program ends up here.
Block-based scoping
// global scope
for (x in [1, 2, 3]) { // block scope starts
var x = 12
} // block scope ends
var x = "hello" // back to global scope - no conflict
The two variables named x don’t conflict because they exist in different scopes. When the for loop’s block ends, everything declared inside it disappears.
Push and pop
| Event | Action |
|---|---|
Enter a new block ({) |
Push a new map onto the stack |
Exit a block (}) |
Pop the top map off the stack |
symbolTable.pushScope(); // new block
symbolTable.registerSymbol("x", CatScriptType.INT);
// ... validate body statements ...
symbolTable.popScope(); // x is gone
Pushing creates a fresh map for new symbols. Popping removes all symbols introduced in that scope in one operation.
Where validation starts
// In CatScriptProgram.verify()
public void verify() {
SymbolTable symbolTable = new SymbolTable(); // creates global scope
this.validate(symbolTable); // recursive validation
// collect errors and throw if any exist
}
The symbol table is created once and threaded through every validate call in the parse tree.
Function registration
Before validating any statements, CatScript pre-registers all functions in the symbol table:
// In CatScriptProgram.validate()
registerFunctions(symbolTable); // put ALL functions into global scope first
for (Statement stmt : statements) {
stmt.validate(symbolTable);
}
This means functions can be called from anywhere in a CatScript program - no forward declarations needed.
In C, functions must be declared before they’re used, which is why header files exist. CatScript avoids this by registering everything up front.
Variable statements and the symbol table
When a var statement is validated:
- Check if the name already exists in the symbol table -> duplicate name error
- Register the name with its type in the current scope
// Pseudo-code for var statement validation
if (symbolTable.hasSymbol(variableName)) {
addError("duplicate variable name");
} else {
symbolTable.registerSymbol(variableName, resolvedType);
}
For statements and scope
For loops introduce both a new scope and a new variable:
for (x in [1, 2, 3]) {
print(x)
}
The loop variable x has its type derived from the component type of the list expression. If the list is list<int>, then x is int.
Important: push the scope before registering the loop variable, so it gets cleaned up when the scope pops:
symbolTable.pushScope();
symbolTable.registerSymbol("x", componentType);
// validate body statements
symbolTable.popScope(); // x is no longer in scope
Scoping rules - order matters
var x = "hello"
for (i in [1, 2, 3]) {
var x = 12 // ERROR: x already exists in enclosing scope
}
vs.
for (i in [1, 2, 3]) {
var x = 12 // fine - inner scope
}
var x = "hello" // fine - inner x was popped
When x is declared in the global scope before the for loop, it’s visible inside the loop. Redeclaring it is a duplicate name error. When x is declared inside the loop first, it’s removed when the scope pops, so the global x is fine.
Shadowing
CatScript does not allow variable shadowing - redeclaring a name in an inner scope that already exists in an outer scope is an error.
Rust, by contrast, allows shadowing:
let x = "hello";
{
let x = 12; // shadows outer x - legal in Rust
}
// x is back to "hello"
Implementing shadowing would mean skipping the duplicate-name check and simply looking up variables from the top of the stack downward, returning the first match.
Java scoping quirks
| Scenario | Legal? |
|---|---|
| Two local variables with the same name in the same scope | No |
| Local variable shadowing a class field | Yes |
| Instance field accessed from a static method | No - this doesn’t exist in static context |
| Static field accessed from any method | Yes |
Java effectively uses two symbol tables: one for static members and one for instance members.
Dynamic scoping (Perl)
In most modern languages, scoping is lexical - determined by where code is written. In Perl’s local keyword, scoping is dynamic - a variable’s value bleeds into called functions:
sub inner { print $var; }
sub outer { local $var = "outer"; inner(); }
$var = "global";
outer(); # prints "outer", not "global"
Dynamic scoping makes programs harder to reason about and is largely abandoned in modern languages. Even Lisp has moved to lexical scoping.
Thread-local scoping
Thread-local variables are visible only to a single thread, avoiding race conditions. Java supports this via the ThreadLocal class rather than language syntax. JavaScript operates this way by default due to its single-threaded event loop.
Summary
| Concept | Key point |
|---|---|
| Symbol table | A stack of maps (name -> type), one map per scope |
| Push/pop | Enter a block -> push; exit a block -> pop |
| Function registration | All functions registered before validation - no forward declarations needed |
| Variable declaration | Check for duplicates, then register name and type |
| For loops | Push scope before registering loop variable so it’s cleaned up on pop |
| No shadowing | CatScript doesn’t allow inner scopes to redeclare outer names |
| Lexical vs. dynamic | Modern languages use lexical scoping; dynamic scoping (Perl) is rare today |