Lec6 - SemanticAnalysis 3

This document discusses semantic analysis in compilers. Semantic analysis occurs after parsing and checks for errors beyond syntax, such as type checking, scope analysis using symbol tables, and ensuring declarations and uses agree. It is important for catching errors that parsing alone cannot detect. The document provides examples of semantic errors and explains why semantic analysis is a separate phase from parsing.

Semantic Analysis

Outline
• The role of semantic analysis in a compiler
– A laundry list of tasks

• Scope
– Static vs. Dynamic scoping
– Implementation: symbol tables

• Types
– Static analyses that detect type errors
– Statically vs. Dynamically typed languages

The Compiler Front-End
Lexical analysis: program is lexically well-formed
– Tokens are legal
• e.g. identifiers have valid names, no stray characters, etc.
– Detects inputs with illegal tokens
Parsing: program is syntactically well-formed
– Declarations have correct structure, expressions are
syntactically valid, etc.
– Detects inputs with ill-formed syntax
Semantic analysis:
– Last “front end” compilation phase
– Catches all remaining errors

Semantic analyzer
The syntax analyzer just creates a parse tree.
The semantic analyzer checks the actual meaning
of the statements represented in the parse tree.

Semantic analysis can compare the information
in one part of the parse tree to that in another
part.
Ex. It checks that a reference to a variable agrees with
its declaration, or that the parameters of a function call
match the function definition.

Semantic analysis is used for the following:
• Maintaining the symbol table for each block.
• Checking the source program for semantic errors.
• Collecting type information for code generation.
• Reporting compile-time errors in the code.
• Generating the object code (assembler or intermediate code).

The semantic analyzer will check:
• Data type of the first operand.
• Data type of the second operand.
• Whether + is binary or unary.
• The number of operands supplied to the operator, depending on
  the type of operator (unary | binary | ternary).
Beyond Syntax Errors
• What’s wrong with this C code?
  (Note: it parses correctly)

    foo(int a, char * s){...}
    int bar() {
      int f[3];
      int i, j, k;
      char q, *p;
      float k;
      foo(f[6], 10, j);
      break;
      i->val = 42;
      j = m + k;
      printf("%s,%s.\n",p,q);
      goto label42;
    }

• Undeclared identifier
• Multiply declared identifier
• Index out of bounds
• Wrong number or types of arguments to function call
• Incompatible types for operation
• break statement outside switch/loop
• goto with no label
Why Have a Separate Semantic Analysis?
Parsing cannot catch some errors

Some language constructs are not context-free


– Example: Identifier declaration and use
– An abstract version of the problem is:
L = { wcw | w ∈ (a + b)* }
– The 1st w represents the identifier’s declaration;
the 2nd w represents a use of the identifier
– This language is not context-free
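
Recognizing wcw means comparing two arbitrarily long substrings for equality, which is the same job as matching a use against its declaration. A minimal Python sketch of that check follows; the function name is_wcw is an assumption made for illustration and does not come from the slides.

```python
def is_wcw(s: str) -> bool:
    """Return True if s has the form w c w, with w over the alphabet {a, b}."""
    # There must be exactly one separator 'c'; w itself never contains 'c'.
    if s.count("c") != 1:
        return False
    declaration, use = s.split("c")
    # Both halves may only use the letters a and b.
    if any(ch not in "ab" for ch in declaration + use):
        return False
    # The "use" must repeat the "declaration" exactly.
    return declaration == use
```

The final comparison needs to remember all of w, which is the part a context-free grammar cannot express.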

What Does Semantic Analysis Do?
Performs checks beyond syntax of many kinds ...
Examples:
1. All used identifiers are declared
2. Identifiers declared only once
3. Types
4. Procedures and functions defined only once
5. Procedures and functions used with the right
   number and type of arguments
And others . . .

The requirements depend on the language


What’s Wrong?

Example 1
let string y ← "abc" in y + 42

Example 2
let integer y in x + 42

Semantic Processing: Syntax-Directed Translation
Basic idea: Associate information with language
constructs by attaching attributes to the
grammar symbols that represent these constructs
– Values for attributes are computed using semantic
rules associated with grammar productions
– An attribute can represent anything (reasonable)
that we choose; e.g. a string, number, type, etc.
– A parse tree showing the values of attributes at
each node is called an annotated parse tree
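
As a small illustration of attributes and semantic rules, the Python sketch below attaches a single attribute, val, to the nodes of a tree for sums of integer literals. The node classes Num and Add, the attribute name val, and the printed format are all assumptions chosen for the example; the slides do not fix a particular grammar.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add]


def val(node: Expr) -> int:
    """Attribute val, computed from the attributes of the node's children."""
    if isinstance(node, Num):
        return node.value                      # rule for E -> num: E.val = num.value
    return val(node.left) + val(node.right)    # rule for E -> E1 + E2: E.val = E1.val + E2.val


def annotate(node: Expr) -> str:
    """Render the annotated tree: every node is shown with its val attribute."""
    if isinstance(node, Num):
        return f"num[val={val(node)}]"
    return f"(+ [val={val(node)}] {annotate(node.left)} {annotate(node.right)})"


tree = Add(Num(1), Add(Num(2), Num(3)))
```

Calling annotate(tree) shows the value attached at each node; a type attribute could be computed by rules of the same shape.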

Attributes of an Identifier
name: character string (obtained from scanner)
scope: program region in which identifier is valid
type:
– integer
– array:
  • number of dimensions
  • upper and lower bounds for each dimension
  • type of elements
– function:
  • number and type of parameters (in order)
  • type of returned value
  • size of stack frame
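
One possible way to record these attributes is sketched below in Python. The class and field names (IntegerType, ArrayType, FunctionType, Identifier, frame_size, and so on) are hypothetical, and representing a scope by a plain label is a simplification made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IntegerType:
    pass


@dataclass(frozen=True)
class ArrayType:
    bounds: Tuple[Tuple[int, int], ...]   # (lower, upper) bound for each dimension
    element: object                       # type of the elements

    @property
    def dimensions(self) -> int:
        # The number of dimensions follows from the list of bounds.
        return len(self.bounds)


@dataclass(frozen=True)
class FunctionType:
    params: Tuple[object, ...]   # parameter types, in order
    result: object               # type of the returned value
    frame_size: int              # size of the stack frame (unit is an assumption: bytes)


@dataclass
class Identifier:
    name: str      # character string obtained from the scanner
    scope: str     # label for the program region in which the name is valid
    type: object   # one of the type descriptions above


matrix = Identifier("m", "global", ArrayType(((0, 2), (0, 3)), IntegerType()))
```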

Scope
• The scope of an identifier (a binding of a name
to the entity it names) is the textual part of
the program in which the binding is active
• Scope matches identifier declarations with uses
– Important static analysis step in most languages

• The scope of an identifier is the portion of a
program in which that identifier is accessible
• The same identifier may refer to different
things in different parts of the program
– Different scopes for same name don’t overlap
• An identifier may have restricted scope
Static vs. Dynamic Scope
• Most languages have static (lexical) scope
– Scope depends only on the physical structure
of program text, not its run-time behavior
– The determination of scope is made by the
compiler
– C, Java, ML have static scope; so do most
languages

• A few languages are dynamically scoped
– Lisp, SNOBOL
– Lisp has changed to mostly static scoping
– Scope depends on execution of the program
Static Scoping Example

let integer x ← 0 in
{
  x;
  let integer x ← 1 in
    x;
  x;
}

Uses of x refer to closest enclosing definition
Dynamic Scope
• A dynamically-scoped variable refers to the
closest enclosing binding in the execution
of the program

Example
g(y) = let integer a ← 42 in f(3);
f(x) = a;
– When invoking g(54) the result will be
42

Static vs. Dynamic Scope
program scopes (input, output);
var a: integer;
procedure first;
  begin a := 1; end;
procedure second;
  var a: integer;
  begin first; end;
begin
  a := 2; second; write(a);
end.

With static scope rules, it prints 1
With dynamic scope rules, it prints 2
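
The difference between the two rules can be traced with a small Python simulation of the Pascal program. This is only a sketch: the function run, the explicit call_stack list, and the use of dictionaries as environments are assumptions made for illustration, not a description of how a Pascal compiler works.

```python
def run(dynamic: bool) -> int:
    """Simulate the Pascal program under static or dynamic scope rules."""
    global_env = {"a": 0}        # var a: integer; (declared at the top level)
    call_stack = [global_env]    # environments of the currently active blocks

    def resolve_a() -> dict:
        if dynamic:
            # Dynamic scope: use the most recent active binding of a.
            for env in reversed(call_stack):
                if "a" in env:
                    return env
        # Static scope: first is declared at the top level, so a is the global a.
        return global_env

    def first() -> None:
        call_stack.append({})         # first declares no locals
        resolve_a()["a"] = 1          # a := 1
        call_stack.pop()

    def second() -> None:
        call_stack.append({"a": 0})   # var a: integer; (local to second)
        first()
        call_stack.pop()

    global_env["a"] = 2               # a := 2
    second()
    return global_env["a"]            # write(a)
```

Under dynamic scope the assignment in first lands in the a local to second, so the global a still holds 2 when it is written.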

Dynamic Scope (Cont.)
• With dynamic scope, bindings cannot always
be resolved by examining the program
because they are dependent on calling
sequences
• Dynamic scope rules are usually encountered
in interpreted languages
• Also, these languages usually do not
have static type checking:
– type determination is not always possible
when dynamic rules are in effect

Scope of Identifiers
• In most programming languages
identifier bindings are introduced by
– Function declarations (introduce function
names)
– Procedure definitions (introduce procedure
names)
– Identifier declarations (introduce
identifiers)
– Formal parameters (introduce identifiers)

Scope of Identifiers (Cont.)
• Not all kinds of identifiers follow the
most-closely nested scope rule

• For example, function declarations
– often cannot be nested
– are globally visible throughout the program

• In other words, a function name can be used
before it is defined

Example: Use Before Definition

foo (int x)
{
  int y;
  y = bar(x);
  ...
}
int bar (int i)
{
  ...
}
Flow-of-Control Checks
myfunc()
{ …
  break; // ERROR
}

myfunc()
{ …
  while (n)
  { …
    if (i>10)
      break; // OK
  }
}

myfunc()
{ …
  switch (a)
  { case 0:
      …
      break; // OK
    case 1:
      …
  }
}
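
A check of this kind can be sketched by walking the statements while counting how many loops or switches enclose the current position. In the Python sketch below the tuple encoding of statements and the name check_breaks are assumptions made for the example.

```python
from typing import List


def check_breaks(body: list, depth: int = 0) -> List[str]:
    """Report every break that is not nested inside a while or a switch.

    A statement is ("break",), ("while", body), ("switch", body),
    ("if", body) or ("other",), where body is a list of statements.
    """
    errors: List[str] = []
    for stmt in body:
        kind = stmt[0]
        if kind == "break":
            if depth == 0:
                errors.append("break statement outside switch/loop")
        elif kind in ("while", "switch"):
            # Entering a loop or switch makes break legal in the nested body.
            errors += check_breaks(stmt[1], depth + 1)
        elif kind == "if":
            # An if does not make break legal by itself.
            errors += check_breaks(stmt[1], depth)
    return errors
```

The three myfunc bodies correspond to [("break",)], [("while", [("if", [("break",)])])] and [("switch", [("break",)])]; only the first yields an error.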
Uniqueness Checks
myfunc()
{ int i, j, i; // ERROR

}

cnufym(int a, int a) // ERROR
{ …
}

struct myrec
{ int name;
};
struct myrec // ERROR
{ int id;
};
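
A uniqueness check over the names declared in one scope can be sketched in a few lines of Python. The helper name duplicate_names is an assumption, and a real checker would also record source positions for the error messages.

```python
from collections import Counter
from typing import Iterable, List


def duplicate_names(names: Iterable[str]) -> List[str]:
    """Return each name declared more than once in a single scope, in first-seen order."""
    counts = Counter(names)   # counts keep the order in which names first appear
    return [name for name, n in counts.items() if n > 1]
```

For the first example, duplicate_names(["i", "j", "i"]) reports i; the parameter list (a, a) and the two struct myrec declarations are caught the same way.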
Symbol Tables
Purpose: To hold information about identifiers
that is computed at some point and looked up
at later times during compilation
Examples:
– type of a variable
– entry point for a function

Operations: insert, lookup, delete

Common implementations: linked lists, hash tables
Symbol Tables
• Assuming static scope, consider:
let integer x ← 42 in E
• Idea:
– Before processing E, add definition of x
to current definitions, overriding any
other definition of x
– After processing E, remove definition of x
and, if needed, restore old definition of x

• A symbol table is a data structure that
tracks the current bindings of identifiers
A Simple Symbol Table Implementation
• Structure is a stack

• Operations
– add_symbol(x): push x and associated info, such as
  x’s type, on the stack
– find_symbol(x): search stack, starting from top, for
  x. Return first x found or NULL if none found
– remove_symbol(): pop the stack

• Why does this work?
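
A minimal Python sketch of the stack-based table, using the three operation names from the slide. The class name SimpleSymbolTable, the extra type_ argument, and the use of None in place of NULL are assumptions made for the example.

```python
from typing import List, Optional, Tuple


class SimpleSymbolTable:
    def __init__(self) -> None:
        self._stack: List[Tuple[str, str]] = []   # (name, type) pairs, top of stack last

    def add_symbol(self, name: str, type_: str) -> None:
        """Push the name and its associated info on the stack."""
        self._stack.append((name, type_))

    def find_symbol(self, name: str) -> Optional[str]:
        """Search from the top of the stack; return the first match or None."""
        for entry_name, type_ in reversed(self._stack):
            if entry_name == name:
                return type_
        return None

    def remove_symbol(self) -> None:
        """Pop the most recently added symbol."""
        self._stack.pop()
```

One way to read why this works: as long as declarations are perfectly nested, the innermost definition of a name is always the one nearest the top, so searching from the top finds it, and popping restores the definition it was hiding.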

Limitations
• The simple symbol table works for variable
declarations
– Symbols added one at a time
– Declarations are perfectly nested

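• Doesn’t work for
foo( int x, float x);
– Both parameters belong to the same scope, so the second x
must be reported as a duplicate rather than shadow the first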

• Other problems?

A Fancier Symbol Table
• enter_scope() start/push a new nested scope
• find_symbol(x) finds current x (or null)
• add_symbol(x) add a symbol x to the table
• check_scope(x) true if x defined in current
scope
• exit_scope() exits/pops the current
scope
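
The five operations can be sketched in Python as a stack of per-scope dictionaries. The class name SymbolTable, the type_ argument, and the helper declare_params are assumptions added for illustration; only the operation names come from the slide.

```python
from typing import Dict, List, Optional, Tuple


class SymbolTable:
    def __init__(self) -> None:
        self._scopes: List[Dict[str, str]] = [{}]   # innermost scope is last

    def enter_scope(self) -> None:
        self._scopes.append({})

    def exit_scope(self) -> None:
        self._scopes.pop()

    def add_symbol(self, name: str, type_: str) -> None:
        self._scopes[-1][name] = type_

    def check_scope(self, name: str) -> bool:
        """True if name is defined in the current (innermost) scope."""
        return name in self._scopes[-1]

    def find_symbol(self, name: str) -> Optional[str]:
        """Return the innermost definition of name, or None."""
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        return None


def declare_params(table: SymbolTable, params: List[Tuple[str, str]]) -> List[str]:
    """Open a new scope for the parameters; return the names declared twice."""
    table.enter_scope()
    duplicates = []
    for name, type_ in params:
        if table.check_scope(name):
            duplicates.append(name)
        else:
            table.add_symbol(name, type_)
    return duplicates
```

With check_scope, the parameter list foo(int x, float x) is reported as a duplicate of x, while an x in an outer scope is merely shadowed.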

Function/Procedure Definitions
• Function names can be used prior to their
definition
• We can’t check that for function names
– using a symbol table
– or even in one pass
• Solution
– Pass 1: Gather all function/procedure names
– Pass 2: Do the checking
• Semantic analysis requires multiple
passes
– Probably more than two
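
The two-pass idea can be sketched as follows in Python. The program encoding (each function as a name, a parameter count, and the calls in its body), the function name check_calls, and the wording of the messages are assumptions made for the example; argument types are left out to keep it short.

```python
from typing import Dict, List, Tuple

# A function is (name, parameter_count, calls), where calls lists the
# (callee_name, argument_count) pairs that appear in its body.
Function = Tuple[str, int, List[Tuple[str, int]]]


def check_calls(program: List[Function]) -> List[str]:
    # Pass 1: gather every function name together with its arity.
    arity: Dict[str, int] = {name: nparams for name, nparams, _ in program}

    # Pass 2: check each call against the gathered definitions.
    errors: List[str] = []
    for name, _, calls in program:
        for callee, nargs in calls:
            if callee not in arity:
                errors.append(f"{name}: call to undefined function {callee}")
            elif arity[callee] != nargs:
                errors.append(
                    f"{name}: {callee} expects {arity[callee]} argument(s), got {nargs}"
                )
    return errors
```

Because pass 1 sees the whole program first, foo may call bar even though bar is defined later in the text.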
Types
• What is a type?
– This is a subject of some debate
– The notion varies from language to language

• Consensus
– A type is a set of values and
– A set of operations on those values

• Type errors arise when operations are performed on
values that do not support that operation
Why Do We Need Type Systems?

Consider the assembly language fragment

add $r1, $r2, $r3

What are the types of $r1,$r2, $r3?

Types and Operations
• Certain operations are legal for values of
each type

– It doesn’t make sense to add a function
pointer and an integer in C

– It does make sense to add two integers

– But both have the same assembly
language implementation!

Type Systems
• A language’s type system specifies which
operations are valid for which types

• The goal of type checking is to ensure that
operations are used with the correct
types
– Enforces intended interpretation of values,
because nothing else will!

• Type systems provide a concise formalization
of the semantic checking rules
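
A very small checking rule can be written down directly as code. The Python sketch below covers only integer literals, string literals, variables, and +, with + restricted to two integers; the expression encoding, the names type_of and SemanticError, and the message texts are assumptions made for illustration.

```python
from typing import Dict


class SemanticError(Exception):
    pass


def type_of(expr, env: Dict[str, str]) -> str:
    """Return the type of expr, given the declared types of variables in env.

    An expression is an int literal, a str literal, ("var", name),
    or ("+", left, right).
    """
    if isinstance(expr, int):
        return "integer"
    if isinstance(expr, str):
        return "string"
    tag = expr[0]
    if tag == "var":
        if expr[1] not in env:
            raise SemanticError(f"undeclared identifier {expr[1]}")
        return env[expr[1]]
    if tag == "+":
        left, right = type_of(expr[1], env), type_of(expr[2], env)
        if left == right == "integer":
            return "integer"
        raise SemanticError(f"cannot add {left} and {right}")
    raise SemanticError(f"unknown expression {expr!r}")
```

Under these rules the two "What’s Wrong?" examples fail differently: y + 42 with y declared as a string is a type error, and x + 42 with only y declared is an undeclared identifier.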

What Can Types do For Us?
• Allow for a more efficient compilation of
programs
– Allocate right amount of space for variables
• Use fewer bits when possible
– Select the right machine operations

• Detect statically certain kinds of errors
– Memory errors
• Reading from an invalid pointer, etc.
– Violation of abstraction boundaries
– Security and access rights violations

Type Checking Overview
Three kinds of languages:

Statically typed: All or almost all checking of types
is done as part of compilation
• C, C++, ML, Haskell, Java, C#, ...

Dynamically typed: Almost all checking of types is
done as part of program execution
• Scheme, Prolog, Erlang, Python, Ruby, PHP, Perl, ...

Untyped: No type checking (machine code)
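
Python, listed above as dynamically typed, shows what checking "as part of program execution" means in practice. In this sketch (the function name describe is an assumption) the ill-typed addition is accepted when the function is defined and is rejected only if that line actually runs.

```python
def describe(flag: bool):
    # Operand types are checked only when the + is executed, so the
    # ill-typed branch causes no error until it is reached.
    if flag:
        return "abc" + 42   # raises TypeError at run time
    return 0
```

Calling describe(False) returns 0 without complaint, while describe(True) raises a TypeError; a statically typed language would reject the addition at compile time, before either call.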

The Type Wars
• Competing views on static vs. dynamic typing

• Static typing proponents say:
– Static checking catches many programming errors
at compile time
– Avoids overhead of runtime type checks

• Dynamic typing proponents say:
– Static type systems are restrictive
– Rapid prototyping easier in a dynamic type system

The Type Wars (Cont.)
• In practice, most code is written in statically
typed languages with an “escape” mechanism
– Unsafe casts in C, Java

• It is debatable whether this compromise
represents the best or worst of both worlds

Questions???
