Symbol Tables
COMP 520: Compiler Design (4 credits)
Alexander Krolik
[email protected]
MWF 8:30-9:30, TR 1080
http://www.cs.mcgill.ca/~cs520/2019/
COMP 520 Winter 2019 Symbol Tables (2)
Semantic Analysis
Until now we were concerned with the lexical and syntactic properties of source programs (i.e.
structure), ignoring their intended function. The following program is therefore syntactically valid:
var a : string;
var b : int;
var c : boolean;
Definition
Semantic analysis is a collection of compiler passes which analyze the meaning of a program, and
is largely divided into two sections
The above program may or may not produce a semantic error depending on the exact rules of the
source language.
Symbol Tables
Symbol tables are used to describe and analyze definitions and uses of identifiers.
Grammars are too weak to express these concepts; the language below is not context-free.
{ wαw | w ∈ Σ* }
To solve this problem, we use a symbol table: an extra data structure that maps identifiers to
meanings.
i       local    int
done    local    boolean
insert  method   ...
List    class    ...
x       formal   List
...     ...      ...
To handle scoping, ordering, and re-declaring, we must construct a symbol table for every program
point.
Symbol Tables
In general, symbol tables allow us to perform two important functions
We can use these relationships to enforce certain properties that were impossible in earlier phases
• ...
• Class hierarchies
– Which classes are defined;
– What is the inheritance hierarchy; and
– Is the hierarchy well-formed.
• Class members
– Which fields are defined;
– Which methods are defined; and
– What are the signatures of methods.
• Identifier use
– Are identifiers defined twice;
– Are identifiers defined when used; and
– Are identifiers used properly?
In static scoping, symbol references (identifiers) are resolved using properties of the source code (it
is also called lexical scoping for this reason)
• Blocks may (or may not) define new scopes – block scoping; and
Scoping rules
One-Pass Technology
Historically, only a single pass was performed during compilation. Elements in the global scope (or
any other scopes) were thus not visible until they were defined.
Redefinitions
If multiple definitions for the same symbol exist, use the closest definition. Identifiers in the same
scope must be unique.
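A minimal sketch of the "closest definition" rule, using C's own block scoping. The function names `innermostX` and `outerX` are hypothetical and exist only for illustration.

```c
int x = 1;  /* outermost (file-scope) definition */

/* Three definitions of x are visible here; the use resolves to the closest. */
int innermostX(void) {
    int x = 2;        /* shadows the file-scope x */
    (void)x;          /* silence the unused-variable warning */
    {
        int x = 3;    /* shadows the function-level x */
        return x;     /* resolves to this innermost definition */
    }
}

/* No closer definition exists here, so x resolves to the file-scope one. */
int outerX(void) {
    return x;
}
```

Declaring `int x` twice inside the same block would instead be rejected by a C compiler, which matches the rule that identifiers in the same scope must be unique.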
Scope Stack
Dynamic Scoping
Dynamic scoping is much less common, and significantly harder to reason about. It uses the
program state to resolve symbols, traversing the call hierarchy until it encounters a definition.
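The difference can be sketched in C. C itself is statically scoped, so the dynamic variant is only simulated here with an explicit run-time stack of bindings for a single variable `x`. All names (`readStatic`, `callerStatic`, `readDynamic`, `callerDynamic`, `bindings`) are hypothetical.

```c
#include <assert.h>

int x = 1;  /* global definition */

/* Static scoping: x resolves to the global, whoever the caller is. */
int readStatic(void) { return x; }

int callerStatic(void) {
    int x = 2;   /* local to callerStatic, invisible inside readStatic */
    (void)x;
    return readStatic();
}

/* Simulated dynamic scoping: bindings of x live on a run-time stack. */
#define MAX_BINDINGS 16
static int bindings[MAX_BINDINGS] = {1};  /* bottom entry plays the global x */
static int top = 0;

/* A use of x reads the most recent binding on the call path. */
int readDynamic(void) { return bindings[top]; }

int callerDynamic(void) {
    assert(top + 1 < MAX_BINDINGS);
    bindings[++top] = 2;          /* the caller's x becomes the closest one */
    int result = readDynamic();
    top--;                        /* the binding disappears on return */
    return result;
}
```

With these definitions `callerStatic()` yields 1 while `callerDynamic()` yields 2: the same use of `x` resolves differently depending on who called the function.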
Cactus stack
A cactus stack has multiple branches, where each element has a pointer to its parent (it’s also
called a parent pointer tree)
• scopeSymbolTable(SymbolTable *t)
• unscopeSymbolTable(SymbolTable *t)
Scope rules
• Each hash table contains the identifiers for a scope, mapped to information;
Declarations
• For each declaration, the identifier is entered in the top-most hash table;
• A use of an identifier is looked up in the hash tables from top to bottom; and
Hash Functions
What is a good hash function on identifiers?
hash = *str;
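The fragment above takes only the first character, so every identifier starting with the same letter lands in the same bucket. One common improvement, sketched below, is a shift-and-add hash that mixes in every character and is sensitive to their order. The table size `HashSize` and the exact function body are assumptions, not taken from the source.

```c
#define HashSize 317  /* assumed number of buckets */

/* Shift-and-add hash: every character contributes, and order matters. */
int Hash(const char *str) {
    unsigned int hash = 0;
    while (*str != '\0') {
        hash = (hash << 1) + (unsigned char)*str;
        str++;
    }
    return (int)(hash % HashSize);
}
```

Under this sketch `"ab"` and `"ba"` hash to different buckets, which a plain sum of characters would not achieve.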
The symbol table contains both the table of symbols and a pointer to the parent scope.
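One possible layout for these structures is sketched below. The field names `name`, `kind`, `next`, `val.localS`, `table` and `parent` follow their uses elsewhere in these slides. The bucket count `HashSize`, the full set of `SymbolKind` values (modelled on the example table of kinds) and the forward-declared `struct LOCAL` are assumptions.

```c
#include <stddef.h>

#define HashSize 317  /* assumed bucket count; the source does not give one */

/* Assumed kinds, following the example table (class, method, formal, local). */
typedef enum { classSym, methodSym, formalSym, localSym } SymbolKind;

struct LOCAL;  /* AST node for a local declaration, assumed defined elsewhere */

typedef struct SYMBOL {
    char *name;
    SymbolKind kind;
    union {
        struct LOCAL *localS;   /* declaration node when kind == localSym */
        /* other kinds would add their own members here */
    } val;
    struct SYMBOL *next;        /* next symbol in the same hash bucket */
} SYMBOL;

typedef struct SymbolTable {
    SYMBOL *table[HashSize];    /* the symbols declared in this scope */
    struct SymbolTable *parent; /* enclosing scope; NULL for the global scope */
} SymbolTable;
```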
t->parent = NULL;
return t;
}
When opening a new scope, we first construct a new symbol table and then set its parent to the
current scope. Note that by construction, the global (top-level) scope has no parent.
SymbolTable *scopeSymbolTable(SymbolTable *s) {
    SymbolTable *t = initSymbolTable();
    t->parent = s;
    return t;
}
Adding a new symbol consists of inserting an entry into the hash table (note the check for
redefinitions):
SYMBOL *putSymbol(SymbolTable *t, char *name, SymbolKind kind) {
    int i = Hash(name);
    for (SYMBOL *s = t->table[i]; s; s = s->next) {
        if (strcmp(s->name, name) == 0) {
            // redefinition in the same scope: throw an error
        }
    }
    SYMBOL *s = malloc(sizeof(SYMBOL));
    s->name = name;
    s->kind = kind;
    s->next = t->table[i];
    t->table[i] = s;
    return s;
}
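Lookup is the counterpart of insertion: search the current scope, then follow the parent pointers outward until a match is found. The sketch below is self-contained, so it repeats simplified versions of the structures. The name `getSymbol`, the constant `HashSize`, the hash body and the kinds are assumptions, and this `putSymbol` omits the redefinition check for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HashSize 317  /* assumed table size */

typedef enum { classSym, methodSym, formalSym, localSym } SymbolKind;  /* assumed */

typedef struct SYMBOL {
    char *name;
    SymbolKind kind;
    struct SYMBOL *next;         /* next entry in the same bucket */
} SYMBOL;

typedef struct SymbolTable {
    SYMBOL *table[HashSize];
    struct SymbolTable *parent;  /* enclosing scope, NULL for the global scope */
} SymbolTable;

/* Assumed shift-and-add hash; any hash into [0, HashSize) would do. */
static int Hash(const char *str) {
    unsigned int hash = 0;
    while (*str) hash = (hash << 1) + (unsigned char)*str++;
    return (int)(hash % HashSize);
}

SymbolTable *scopeSymbolTable(SymbolTable *parent) {
    SymbolTable *t = calloc(1, sizeof(SymbolTable));  /* all buckets start NULL */
    assert(t != NULL);
    t->parent = parent;
    return t;
}

/* Simplified insert: no redefinition check. */
SYMBOL *putSymbol(SymbolTable *t, char *name, SymbolKind kind) {
    int i = Hash(name);
    SYMBOL *s = malloc(sizeof(SYMBOL));
    assert(s != NULL);
    s->name = name;
    s->kind = kind;
    s->next = t->table[i];
    t->table[i] = s;
    return s;
}

/* Search the current scope first, then each enclosing scope in turn. */
SYMBOL *getSymbol(SymbolTable *t, const char *name) {
    int i = Hash(name);
    for (; t != NULL; t = t->parent) {
        for (SYMBOL *s = t->table[i]; s != NULL; s = s->next) {
            if (strcmp(s->name, name) == 0) return s;
        }
    }
    return NULL;  /* not declared in any visible scope */
}
```

Because the innermost table is searched first, a redeclaration in an inner scope shadows the outer one without any extra bookkeeping.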
Mutual Recursion
A typical symbol table implementation does a single traversal of the program. Upon identifier
This naturally assumes that declarations come before use. This works well for statement
sequences, but it fails for mutual recursion and members of classes.
A
...B...
B
...A...
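C shows the same limitation in practice: an identifier must be declared before its first use, so two mutually recursive functions need a forward declaration. A small sketch, with hypothetical functions `isEven` and `isOdd`:

```c
#include <stdbool.h>

/* Forward declaration: makes isOdd visible before its definition. */
bool isOdd(unsigned int n);

bool isEven(unsigned int n) {
    return n == 0 ? true : isOdd(n - 1);   /* uses isOdd before it is defined */
}

bool isOdd(unsigned int n) {
    return n == 0 ? false : isEven(n - 1);
}
```

Without the prototype, `isEven` would refer to `isOdd` before any declaration of it is visible.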
Mutual Recursion
Solution: Make two traversals
For cases like recursive types, the definition is not completed until the second traversal.
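A minimal sketch of the two-traversal idea on a toy problem: classes that name their superclass. The first traversal only records which classes exist, and the second resolves the references, so declaration order no longer matters. The `ClassDecl` type, the function names and the fixed-size array (used in place of a hash table) are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical toy declaration: a class that names its superclass. */
typedef struct ClassDecl {
    const char *name;
    const char *parentName;     /* NULL when there is no superclass */
    struct ClassDecl *parent;   /* filled in by the second traversal */
} ClassDecl;

#define MAX_CLASSES 64
static ClassDecl *known[MAX_CLASSES];
static size_t knownCount = 0;

/* First traversal: record every declared class, resolve nothing. */
void collectClasses(ClassDecl *decls, size_t n) {
    knownCount = 0;
    for (size_t i = 0; i < n && knownCount < MAX_CLASSES; i++) {
        known[knownCount++] = &decls[i];
    }
}

static ClassDecl *findClass(const char *name) {
    for (size_t i = 0; i < knownCount; i++) {
        if (strcmp(known[i]->name, name) == 0) return known[i];
    }
    return NULL;
}

/* Second traversal: resolve each use; returns the number of unresolved names. */
int resolveParents(ClassDecl *decls, size_t n) {
    int unresolved = 0;
    for (size_t i = 0; i < n; i++) {
        if (decls[i].parentName == NULL) continue;
        decls[i].parent = findClass(decls[i].parentName);
        if (decls[i].parent == NULL) unresolved++;
    }
    return unresolved;
}
```

A class `B` that extends `A` resolves correctly even when `B` is listed before `A`, since all names are known before any use is examined.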
JOOS
As in Java, JOOS supports mutual recursion and allows functions to reference any member of the
class regardless of lexical order.
Symbol passes
Note that no types are resolved at this time since a class is a type (and we might have recursive
types)! Resolving types is thus performed in the second pass.
Signature
In the second pass, we can now resolve both the return types and formal (parameter) types since
the class discovery is complete.
void symInterfaceTypesMETHOD(METHOD *m, SymbolTable *symbolTable) {
    if (m == NULL) {
        return;
    }
    symInterfaceTypesMETHOD(m->next, symbolTable);
    symTYPE(m->returntype, symbolTable);
    symInterfaceTypesFORMAL(m->formals, symbolTable);
}
Classes
void symImplementationCLASS(CLASS *c) {
    SymbolTable *symbolTable = scopeSymbolTable(classlib);
    symImplementationFIELD(c->fields, symbolTable);
    symImplementationCONSTRUCTOR(c->constructors, c, symbolTable);
    symImplementationMETHOD(c->methods, c, symbolTable);
}
Methods
void symImplementationMETHOD(METHOD *m, CLASS *this, SymbolTable *symbolTable) {
    if (m == NULL) {
        return;
    }
    symImplementationMETHOD(m->next, this, symbolTable);
    SymbolTable *m_symbolTable = scopeSymbolTable(symbolTable);
    symImplementationFORMAL(m->formals, m_symbolTable);
    symImplementationSTATEMENT(m->statements, this, m_symbolTable,
                               m->modifier == staticMod);
}
[...]
}
}
Note that block statements open a new scope for the body which allows variable scoping.
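The effect of opening a scope per block can be sketched on a toy statement type. Everything here is hypothetical: the `STATEMENT` layout, the list-based `Scope` (a linked list of names stands in for the hash table) and the function `checkStatements`, which counts uses of names that are not visible. Inner scopes are never freed, which is acceptable only in a sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy AST: a statement declares a name, uses a name, or is a block. */
typedef enum { declK, useK, blockK } StmtKind;

typedef struct STATEMENT {
    StmtKind kind;
    const char *name;          /* for declK and useK */
    struct STATEMENT *body;    /* for blockK: first statement of the body */
    struct STATEMENT *next;    /* next statement in the sequence */
} STATEMENT;

typedef struct Entry { const char *name; struct Entry *next; } Entry;
typedef struct Scope { Entry *entries; struct Scope *parent; } Scope;

static Scope *openScope(Scope *parent) {
    Scope *s = calloc(1, sizeof(Scope));
    assert(s != NULL);
    s->parent = parent;
    return s;
}

static void declare(Scope *s, const char *name) {
    Entry *e = malloc(sizeof(Entry));
    assert(e != NULL);
    e->name = name;
    e->next = s->entries;
    s->entries = e;
}

static int isVisible(Scope *s, const char *name) {
    for (; s != NULL; s = s->parent)
        for (Entry *e = s->entries; e != NULL; e = e->next)
            if (strcmp(e->name, name) == 0) return 1;
    return 0;
}

/* Walks a statement sequence in order; returns the number of undeclared uses. */
int checkStatements(STATEMENT *s, Scope *scope) {
    int errors = 0;
    for (; s != NULL; s = s->next) {
        switch (s->kind) {
        case declK:
            declare(scope, s->name);
            break;
        case useK:
            if (!isVisible(scope, s->name)) errors++;
            break;
        case blockK:
            /* The body gets its own scope, chained to the current one. */
            errors += checkStatements(s->body, openScope(scope));
            break;
        }
    }
    return errors;
}
```

Since the inner scope is simply dropped when the block ends, a name declared inside the block is reported as undeclared if it is used after the block.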
Locally declared variable names are added to the symbol table and associated with their declaration:
void symImplementationLOCAL(LOCAL *l, SymbolTable *symbolTable) {
    if (l == NULL) {
        return;
    }
    symImplementationLOCAL(l->next, symbolTable);
    symTYPE(l->type, symbolTable);
    SYMBOL *s = putSymbol(symbolTable, l->name, localSym);
    s->val.localS = l;
}
• A textual representation of the symbol table is printed once for every scope area;
• These tables are then compared to a corresponding manual construction for a sufficient
collection of programs;