Compiler 2 (3rd Class)
Lecture -1- : Types of parsers in compiler design
The parser is the phase of the compiler that takes a token string as input
and, with the help of the existing grammar, converts it into the corresponding
Intermediate Representation. The parser is also known as the Syntax Analyzer.
Types of Parser:
The parser is mainly classified into two categories, i.e. Top-down Parser, and
Bottom-up Parser. These are explained below:
1- Top-Down Parser:
The top-down parser is the parser that generates the parse tree for the given
input string with the help of grammar productions by expanding the non-terminals, i.e. it
starts from the start symbol and ends on the terminals. It uses leftmost derivation.
Further, the top-down parser is classified into 2 types: the recursive descent parser
and the non-recursive descent parser.
The recursive descent parser is also known as the brute-force parser or the
backtracking parser. It generates the parse tree by using brute force and
backtracking.
The non-recursive descent parser is also known as the LL(1) parser, predictive
parser, non-backtracking parser, or dynamic parser. It uses a parsing table to
generate the parse tree instead of backtracking.
2- Bottom-up Parser:
The bottom-up parser is the parser that generates the parse tree for the given
input string with the help of grammar productions by compressing the string of
terminals, i.e. it starts from the terminals and ends on the start symbol. It uses
the reverse of the rightmost derivation.
Further, the bottom-up parser is classified into two types: the LR parser and the
operator precedence parser.
The LR parser is the bottom-up parser that generates the parse tree for the given
string by using unambiguous grammar. It follows the reverse of the rightmost
derivation.
The LR parser is of four types:
a- LR(0)  b- SLR(1)  c- LALR(1)  d- CLR(1)
The operator precedence parser generates the parse tree from the given grammar
and string, but the only condition is that two consecutive non-terminals and epsilon
never appear on the right-hand side of any production.
Bottom Up Parsers / Shift Reduce Parsers
Bottom up parsers start from the sequence of terminal symbols and work
their way back up to the start symbol by repeatedly replacing grammar rules' right
hand sides by the corresponding non-terminal. This is the reverse of the derivation
process, and is called "reduction".
Example 1: Consider the grammar
S→ aABe
A→ Abc|b
B→ d
The sentence abbcde can be reduced to S by the following steps:
Sol:
abbcde
aAbcde
aAde
aABe
S
Example 2: Consider the grammar
S→ aABe
A→ Abc|bc
B→ dd
The sentence abcbcdde can be reduced to S by the following steps:
Sol:
abcbcdde
aAbcdde
aAdde
aABe
S
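The reduction steps in both examples can be checked mechanically. Below is a small Python sketch (the names are illustrative) of a brute-force, backtracking reducer: it tries replacing any substring that matches a production body with the production's head and searches for a sequence of reductions ending at the start symbol.

```python
# Grammars as (head, body) pairs; G1 and G2 are Examples 1 and 2 above.
G1 = [("S", "aABe"), ("A", "Abc"), ("A", "b"), ("B", "d")]
G2 = [("S", "aABe"), ("A", "Abc"), ("A", "bc"), ("B", "dd")]

def reduce_to_start(sentence, grammar, start="S"):
    """Return the list of sentential forms from the input to start, or None."""
    if sentence == start:
        return [sentence]
    for head, body in grammar:
        pos = sentence.find(body)
        while pos != -1:
            shorter = sentence[:pos] + head + sentence[pos + len(body):]
            path = reduce_to_start(shorter, grammar, start)
            if path is not None:
                return [sentence] + path
            pos = sentence.find(body, pos + 1)   # try the next occurrence
    return None   # dead end: backtrack

print(reduce_to_start("abbcde", G1))
print(reduce_to_start("abcbcdde", G2))
```

For Example 1 this prints the same sequence of sentential forms as the solution above, abbcde → aAbcde → aAde → aABe → S.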
Example 3: Using the following arithmetic grammar
E → E+T | T
T → T*F | F
F → (E) | id
This derivation is in fact a RightMost Derivation (RMD):
Handle Pruning:
Bottom-up parsing during a left-to-right scan of the input constructs a
rightmost derivation in reverse. Informally, a "handle" is a substring that matches
the body of a production, and whose reduction represents one step along the reverse
of a rightmost derivation.
For example, adding subscripts to the tokens id for clarity, consider the handles
during the parse of id1 * id2 according to the expression grammar
E → E+T | T
T → T*F | F
F → (E) | id
Although T is the body of the production E → T, the symbol T is not a handle in
the sentential form T * id2.
If T were indeed replaced by E, we would get the string E * id2, which cannot be
derived from the start symbol E.
Thus, the leftmost substring that matches the body of some production need not be
a handle.
E ⇒ T ⇒ T * F ⇒ T * id2 ⇒ F * id2 ⇒ id1 * id2 … (rightmost derivation)
H.W.
For this grammar
E → E+T | T
T → T*F | F
F → id | (E)
Parse the input id * id + id
Lecture -2- : LR Parser Family
Parsing Table:
The parsing table is divided into two parts: the ACTION table and the GOTO
table. The ACTION table tells the parser which move to make for the given
current state and current terminal in the input stream.
1. The ACTION function takes as arguments a state i and a terminal a (or $, the
input endmarker).
The value of ACTION [i, a] can have one of four forms:
a. Shift j, where j is a state: the parser effectively shifts input a onto the
stack, but uses state j to represent a.
b. Reduce A → β: the parser effectively reduces β on the top of the stack
to the head A.
c. Accept: the parser accepts the input and finishes parsing.
d. Error: the parser discovers an error in its input and takes some corrective
action.
2. We extend the GOTO function, defined on sets of items, to states:
if GOTO [Ii, A] = Ij, then GOTO also maps a state i and a nonterminal A to
state j.
Example: The ACTION and GOTO functions of an LR-parsing table for the
following expression grammar,
E → E+T | T
T → T*F | F
F → (E) | id
By: Dr. Ielaf Osamah
Repeated with the productions numbered:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → (E)
6. F → id
The codes for the actions are:
1. si means shift and stack state i,
2. rj means reduce by the production numbered j,
3. acc means accept,
4. blank means error.
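To make the roles of si, rj, acc, and blank entries concrete, here is a Python sketch of the table-driven LR parsing loop. The ACTION/GOTO entries below are the usual SLR(1) table for this grammar as presented in compiler textbooks; the particular state numbering (0-11) is an assumption of this sketch.

```python
# PROD[j] gives the head and the body length of production j above.
PROD = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
        4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {  # (state, terminal) -> ("s", j) shift, ("r", j) reduce, ("acc",)
    (0, "id"): ("s", 5), (0, "("): ("s", 4),
    (1, "+"): ("s", 6), (1, "$"): ("acc",),
    (2, "+"): ("r", 2), (2, "*"): ("s", 7), (2, ")"): ("r", 2), (2, "$"): ("r", 2),
    (3, "+"): ("r", 4), (3, "*"): ("r", 4), (3, ")"): ("r", 4), (3, "$"): ("r", 4),
    (4, "id"): ("s", 5), (4, "("): ("s", 4),
    (5, "+"): ("r", 6), (5, "*"): ("r", 6), (5, ")"): ("r", 6), (5, "$"): ("r", 6),
    (6, "id"): ("s", 5), (6, "("): ("s", 4),
    (7, "id"): ("s", 5), (7, "("): ("s", 4),
    (8, "+"): ("s", 6), (8, ")"): ("s", 11),
    (9, "+"): ("r", 1), (9, "*"): ("s", 7), (9, ")"): ("r", 1), (9, "$"): ("r", 1),
    (10, "+"): ("r", 3), (10, "*"): ("r", 3), (10, ")"): ("r", 3), (10, "$"): ("r", 3),
    (11, "+"): ("r", 5), (11, "*"): ("r", 5), (11, ")"): ("r", 5), (11, "$"): ("r", 5),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2,
        (4, "F"): 3, (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def lr_parse(tokens):
    """Run the LR loop; return the sequence of actions taken (rj / accept / error)."""
    tokens = tokens + ["$"]
    stack, i, trace = [0], 0, []
    while True:
        act = ACTION.get((stack[-1], tokens[i]))
        if act is None:                # blank entry: error
            return trace + ["error"]
        if act[0] == "acc":
            return trace + ["accept"]
        if act[0] == "s":              # shift: push state j, advance the input
            stack.append(act[1])
            i += 1
        else:                          # reduce by production j
            head, length = PROD[act[1]]
            del stack[len(stack) - length:]
            stack.append(GOTO[(stack[-1], head)])
            trace.append(f"r{act[1]}")

print(lr_parse(["id", "*", "id", "+", "id"]))
```

For id * id the trace is r6, r4, r6, r3, r2, accept, which is exactly the handle-pruning order: id1, then F, then id2, then T * F, then T.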
First construct the set of items I0:
I0:
E → •E + T r1
E → •T r2
T → •T * F r3
T → •F r4
F → •(E) r5
F → •id r6
I5: Goto [I0, id]
F → id• … Complete
I11: Goto [I8, )]
F → (E)• … Complete
Follow(E) = {$, +, )}
Follow(T) = {$, +, ), *}
Follow(F) = {$, +, ), *}
2. The parsing actions for state i are determined as follows:
a. If [A → α•aβ] is in Ii and GOTO (Ii, a) = Ij, then set ACTION [i, a] to
"shift j." Here a must be a terminal.
b. If [A → α•] is in Ii, then set ACTION [i, a] to "reduce A → α" for all a
in FOLLOW(A); here A may not be S'.
c. If [S' → S•] is in Ii, then set ACTION [i, $] to "accept."
If any conflicting actions result from the above rules, we say the grammar is
not SLR(1). The algorithm fails to produce a parser in this case.
3. The goto transitions for state i are constructed for all nonterminals A using the
rule: if GOTO (Ii, A) = Ij, then GOTO [i, A] = j.
4. All entries not defined by rules (2) and (3) are made "error."
5. The initial state of the parser is the one constructed from the set of items
containing [S' → •S].
Example:
Let us construct the SLR table for the augmented expression grammar.
The canonical collection of sets of LR(0) items for the grammar.
I0:
E' → •E r1
E → •E + T r2
E → •T r3
T → •T * F r4
T → •F r5
F → •(E) r6
F → •id r7
I4: Goto [I0, (]
F → (•E)
E → •E + T
E → •T
T → •T * F
T → •F
F → •(E)
F → •id
I5: Goto [I0, id]
F → id• … Complete
I10: Goto [I7, F]
T → T * F• … Complete
Goto [I7, (] = I4
Goto [I7, id] = I5
Follow(E) = {$, +, )}
Follow(T) = {$, +, ), *}
Follow(F) = {$, +, ), *}
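The FOLLOW sets listed above can be computed mechanically. The Python sketch below is a straightforward fixpoint iteration; this grammar has no epsilon productions, which simplifies the FIRST computation.

```python
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}
NONTERMS = set(GRAMMAR)

def first_of(sym, first):
    # FIRST of a single symbol: itself if terminal, its FIRST set otherwise
    return first[sym] if sym in NONTERMS else {sym}

def compute_follow(start="E"):
    # FIRST sets by fixpoint iteration (no epsilon productions here)
    first = {n: set() for n in NONTERMS}
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                f = first_of(body[0], first)
                if not f <= first[head]:
                    first[head] |= f
                    changed = True
    # FOLLOW sets by fixpoint iteration
    follow = {n: set() for n in NONTERMS}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in NONTERMS:
                        continue
                    if i + 1 < len(body):         # FOLLOW(sym) gets FIRST of what follows
                        new = first_of(body[i + 1], first)
                    else:                         # sym ends the body: gets FOLLOW(head)
                        new = follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

print(compute_follow())
```

The result reproduces Follow(E) = {$, +, )} and Follow(T) = Follow(F) = {$, +, ), *}.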
Lecture -3- : Syntax Directed Translation
Syntax Directed Translation adds augmented rules to the grammar that
facilitate semantic analysis. SDT involves passing information bottom-up and/or
top-down through the parse tree in the form of attributes attached to the nodes.
Syntax-directed translation rules use:
1. Lexical values of nodes.
2. Constants.
3. Attributes associated with the non-terminals in their definitions.
The general approach to Syntax-Directed Translation is to construct a parse tree or
syntax tree and compute the values of attributes at the nodes of the tree by visiting
them in some order. In many cases, translation can be done during parsing without
building an explicit tree.
Example
E → E+T | T
T → T*F | F
F → id
This is a grammar to syntactically validate an expression having additions and
multiplications in it. Now, to carry out semantic analysis we will augment SDT rules
to this grammar, in order to pass some information up the parse tree and check for
semantic errors, if any. In this example, we will focus on the evaluation of the given
expression, as we don’t have any semantic assertions to check in this very basic
example.
1. E → E + T { E.val = E.val + T.val }
2. E → T { E.val = T.val }
3. T → T * F { T.val = T.val * F.val }
4. T → F { T.val = F.val }
5. F → id { F.val = id.lexval }
(Figure: annotated parse tree showing the semantic analysis of S = 2 + 3 * 4.)
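The bottom-up evaluation that the annotated parse tree performs can be sketched in Python. Each branch of val() mirrors one semantic rule from the SDT above; the parse tree for 2 + 3 * 4 is written out by hand as nested tuples, since building it is the parser's job.

```python
def val(node):
    """Compute the synthesized val attribute of a parse-tree node bottom-up."""
    if node[0] == "id":                  # F -> id   { F.val = id.lexval }
        return node[1]
    if node[0] == "+":                   # E -> E+T  { E.val = E.val + T.val }
        return val(node[1]) + val(node[2])
    if node[0] == "*":                   # T -> T*F  { T.val = T.val * F.val }
        return val(node[1]) * val(node[2])

tree = ("+", ("id", 2), ("*", ("id", 3), ("id", 4)))   # parse tree of 2 + 3 * 4
print(val(tree))
```

This prints 14, the value the annotated tree computes at its root.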
S–attributed and L–attributed SDTs in Syntax directed translation
1- Synthesized attributes
A Synthesized attribute is an attribute of the non-terminal on the left-hand
side of a production. Synthesized attributes represent information that is being
passed up the parse tree. The attribute can take value only from its children
(Variables in the RHS of the production).
For example, let's say A -> BC is a production of a grammar; if A's attribute
depends on B's attributes or C's attributes, then it is a synthesized attribute.
2- Inherited attributes
An attribute of a nonterminal on the right-hand side of a production is called
an inherited attribute. The attribute can take value either from its parent or from its
siblings (variables in the LHS or RHS of the production).
For example, let's say A -> BC is a production of a grammar; if B's
attribute depends on A's attributes or C's attributes, then it is an inherited
attribute.
S-attributed SDT:
If an SDT uses only synthesized attributes, it is called an S-attributed SDT.
S-attributed SDTs are evaluated in bottom-up parsing, as the values of the
parent nodes depend upon the values of the child nodes.
Semantic actions are placed in the rightmost place of the RHS.
L-attributed SDT:
If an SDT uses both synthesized attributes and inherited attributes, with the
restriction that an inherited attribute can inherit values from left siblings
only, it is called an L-attributed SDT.
Attributes in L-attributed SDTs are evaluated in a depth-first, left-to-right
parsing manner.
Semantic actions are placed anywhere in the RHS.
For example: A -> XYZ {Y.S = A.S, Y.S = X.S, Y.S = Z.S}
is not an L-attributed grammar, since Y.S = A.S and Y.S = X.S are allowed
but Y.S = Z.S violates the L-attributed SDT definition, as the attribute inherits
a value from its right sibling.
Lecture -4- : Semantic Analysis in Compiler Design
Semantic Analysis is the third phase of the compiler. Semantic analysis makes
sure that the declarations and statements of the program are semantically correct. It is
a collection of procedures which are called by the parser as and when required by the
grammar. Both the syntax tree of the previous phase and the symbol table are used to
check the consistency of the given code. Type checking is an important part of semantic
analysis, where the compiler makes sure that each operator has matching operands.
Semantic Analyzer:
It uses the syntax tree and the symbol table to check whether the given program is
semantically consistent with the language definition. It gathers type information and
stores it in either the syntax tree or the symbol table. This type information is
subsequently used by the compiler during intermediate-code generation.
Semantic Errors:
Errors recognized by semantic analyzer are as follows:
1. Type mismatch
2. Undeclared variables
3. Reserved identifier misuse
4. Multiple declaration of variable in a scope.
5. Accessing an out-of-scope variable.
6. Actual and formal parameter mismatch.
The semantic analyzer also keeps a check that control structures are used in a
proper manner (for example: no break statement outside a loop).
Example:
float x = 10.1;
float y = x*30;
In the above example, the integer 30 will be type-cast to the float 30.0 before
the multiplication by the semantic analyzer.
Static and Dynamic Semantics:
In many compilers, the work of the semantic analyzer takes the form of
semantic action routines, invoked by the parser when it realizes that it has reached
a particular point within a grammar rule.
Of course, not all semantic rules can be checked at compile time. Those that
can are referred to as the static semantics of the language. Those that must be
checked at run time are referred to as the dynamic semantics of the language. C has
very little in the way of dynamic checks.
Examples of rules that other languages enforce at run time include the
following:
■ Variables are never used in an expression unless they have been given a value.
■ Pointers are never dereferenced unless they refer to a valid object.
■ Array subscript expressions lie within the bounds of the array.
■ Arithmetic operations do not overflow.
Semantic analysis judges whether the syntax structure constructed in the source
program derives any meaning or not.
CFG + semantic rules = Syntax Directed Definitions
For example:
int a = “value”;
This should not issue an error in the lexical and syntax analysis phases, as it is
lexically and structurally correct, but it should generate a semantic error as the
type of the assignment differs. These rules are set by the grammar of the language
and evaluated in semantic analysis. The following tasks should be performed in
semantic analysis:
Scope resolution
Type checking
Array-bound checking
If a semantic analyzer has a symbol table for each separate procedure, it can find
semantic errors that occur because of the following mistakes:
Names that aren’t declared
Operands of the wrong type for the operator they’re used with
Values that have the wrong type for the name to which they're assigned
If a semantic analyzer has a symbol table for the program as a whole, it can find
semantic errors that occur because of the following mistakes:
Procedures that are invoked with the wrong number of arguments
Procedures that are invoked with the wrong type of arguments
Function return values that are the wrong type for the context in which
they're used
If a semantic analyzer has control-flow and data-flow information for each separate
procedure, it can find semantic errors that occur because of the following mistakes:
Local variables that are used before being initialized or assigned
Local variables that are initialized or assigned, but not used
If a semantic analyzer has control-flow and data-flow information for the program
as a whole, it can find semantic errors that occur because of the following
mistakes:
Procedures that are never invoked
Procedures that have no effect
Global variables that are used before being initialized or assigned
Global variables that are initialized or assigned, but not used
Examples
1- The following code is correct:
    while (x <= 5)
        writeOut "OK";
        break;
    ;
Whereas the following one isn't, and should be rejected:
    while (x <= 5)
        writeOut "OK";
    ;
    break;
2- x = 3;
z = "abc";
y = x + z;
The three lines above should also generate a compilation error. The reason
is that the operator + is used with an int type (x) and a string type (z), even
though this kind of operation may be allowed in some languages.
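The check behind example 2 can be sketched in Python; the symbol-table contents mirror x and z above, and the error-message text is illustrative.

```python
# A minimal symbol table mapping names to their declared types.
symbol_table = {"x": "int", "z": "string"}

def check_add(lhs, rhs):
    """Type-check lhs + rhs: reject the operator when operand types differ."""
    t1, t2 = symbol_table[lhs], symbol_table[rhs]
    if t1 != t2:
        return f"type error: '+' applied to {t1} and {t2}"
    return t1   # the result type when the operands agree

print(check_add("x", "z"))
```

A real analyzer would also apply the language's coercion rules before rejecting, as the float example earlier showed.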
Lecture -5- : Semantic Analysis (TYPE checking)
A semantic analyzer checks the source program for semantic errors. Type
checking is an important part of the semantic analyzer: it is the process of
verifying and enforcing constraints of types in values, and it attempts to catch
programming errors based on the theory of types.
Two types of semantic checks are performed within this phase; these are:-
1. Static Semantic Checks are performed at compile time like:-
Type checking.
Every variable is declared before used.
Identifiers are used in appropriate contexts.
Check labels
2. Dynamic Semantic Checks are performed at run time, and the compiler
produces code that performs these checks:-
Array subscript values are within bounds.
Arithmetic errors, e.g. division by zero.
A variable is used but hasn’t been initialized.
Three kinds of languages:
1- Statically typed: All or almost all checking of types is done as part of
compilation (C, Java).
2- Dynamically typed: Almost all checking of types is done as part of program
execution (Scheme).
3- Un-typed: No type checking (machine code).
NOTE: Some programming languages such as C combine both static and
dynamic typing, i.e., some types are checked before execution while others
are checked during execution.
The design of a type checker depends on:
1- The syntactic structure of language constructs.
2- The type expressions of the language.
3- The rules for assigning types to constructs.
Type Expression and Type Systems
Type Expression
The type of a language construct will be denoted by a type expression. A type
expression is either a basic type or is formed by applying an operator called a type
constructor to other type expressions.
1- Basic type
• Integer: 7, 34, 909.
• Floating point: 5.34, 123, 87.
• Character: a, A.
• Boolean: not, and, or, xor.
2- Type constructor
Arrays: If T is a type expression, then array (I, T) is a type expression
denoting the type of an array with elements of type T and index set I.
Products: If T1 and T2 are type expressions, then their Cartesian
product T1×T2 is a type expression.
Records: The type of a record is in a sense the product of the types of
its fields. The difference between a record and a product is that the fields
of a record have names.
Pointers: If T is a type expression, then pointer (T) is a type expression
denoting the type pointer to an object of type T.
Functions: Functions take values in some domain and map them into
values in some range.
Type System
A type system is a collection of rules for assigning type expressions. In most
languages, type systems include:
1- Basic types are the atomic types with no internal structure as far as the
programmer is concerned (int, char, float,….).
2- Constructed types are arrays, records, and sets. In addition, pointers and
functions can also be treated as constructed types.
3- Type Equivalence:
Name equivalence: Types are equivalent only when they have the
same name.
Structural equivalence: Types are equivalent when they have the
same structure.
Example: C uses name equivalence for structs and structural
equivalence for arrays/pointers.
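The difference can be sketched in Python by encoding type expressions as nested tuples, e.g. ("array", bounds, element-type) or ("pointer", t); the tags and layout are an illustrative encoding, not a fixed standard. Under structural equivalence, two separately declared types compare equal whenever their trees match.

```python
def same_structure(t1, t2):
    """Structural equivalence: compare two type-expression trees node by node."""
    # atoms (basic-type names, index bounds) compare by value
    if not isinstance(t1, tuple) or not isinstance(t2, tuple):
        return t1 == t2
    return len(t1) == len(t2) and all(
        same_structure(a, b) for a, b in zip(t1, t2))

RowA = ("array", (1, 10), "int")   # two separately declared array types
RowB = ("array", (1, 10), "int")
print(same_structure(RowA, RowB))                               # same structure
print(same_structure(("pointer", "int"), ("pointer", "float"))) # different structure
```

Under name equivalence, RowA and RowB would be two distinct types despite comparing equal here, because each declaration introduces its own name.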
Lecture -6-: Intermediate Code Generation
If we generate machine code directly from source code, then for n target
machines we will have n optimizers and n code generators, but if we have a
machine-independent intermediate code, we will have only one optimizer.
Intermediate code can be either language-specific (e.g., bytecode for Java) or
language-independent (three-address code).
1- Postfix Notation –
The ordinary (infix) way of writing the sum of a and b is with the operator in the
middle: a + b.
The postfix notation for the same expression places the operator at the right
end, as ab+. In general, if e1 and e2 are any postfix expressions and + is any binary
operator, the result of applying + to the values denoted by e1 and e2 is indicated in
postfix notation by e1e2+. No parentheses are needed in postfix notation because the
position and arity (number of arguments) of the operators permit only one way to
decode a postfix expression. In postfix notation the operator follows the operands.
Example –
The postfix representation of the expression (a – b) * (c + d) + (a – b) is:
ab- cd+ * ab- +
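The claim that postfix needs no parentheses is easy to see operationally: a single left-to-right pass with a stack decodes it. A minimal Python sketch, using numeric values in place of the variable names:

```python
def eval_postfix(expr):
    """Evaluate a space-separated postfix expression with a stack."""
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    stack = []
    for tok in expr.split():
        if tok in ops:
            b = stack.pop()   # right operand is on top of the stack
            a = stack.pop()   # left operand is underneath
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))   # operand: push its value
    return stack.pop()

# (a - b) * (c + d) + (a - b) with a=5, b=3, c=2, d=4
print(eval_postfix("5 3 - 2 4 + * 5 3 - +"))
```

With those values the expression is (5-3)*(2+4)+(5-3) = 14.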
2- Three-Address Code –
A statement involving no more than three references (two for operands and
one for the result) is known as a three-address statement. A sequence of three-address
statements is known as three-address code. A three-address statement is of the form
x = y op z, where x, y, and z have addresses (memory locations).
Sometimes a statement might contain fewer than three references, but it is still
called a three-address statement.
Example – The three address code for the expression a + b * c + d :
T1=b*c
T2=a+T1
T3=T2+d
T1, T2, T3 are temporary variables.
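Generating the temporaries T1, T2, T3 can be sketched as a post-order walk of the expression tree; the tree for a + b * c + d is written out by hand here, grouped left-associatively as ((a + (b * c)) + d).

```python
import itertools

def gen_tac(node, code, counter):
    """Post-order walk: emit code for children, then a fresh temporary."""
    if isinstance(node, str):            # leaf: a variable name
        return node
    op, left, right = node
    l = gen_tac(left, code, counter)
    r = gen_tac(right, code, counter)
    temp = f"T{next(counter)}"           # fresh temporary T1, T2, ...
    code.append(f"{temp} = {l} {op} {r}")
    return temp

code = []
gen_tac(("+", ("+", "a", ("*", "b", "c")), "d"), code, itertools.count(1))
print("\n".join(code))
```

This reproduces the sequence above: T1 = b * c, then T2 = a + T1, then T3 = T2 + d.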
3- Syntax Tree –
A syntax tree is nothing more than a condensed form of a parse tree. The
operator and keyword nodes of the parse tree are moved to their parents, and a
chain of single productions is replaced by a single link. In a syntax tree the internal
nodes are operators and the leaf nodes are operands. To form a syntax tree, put
parentheses in the expression; this way it's easy to recognize which operand
should come first.
Example –
x = (a + b * c) / (a – b * c)
The operation that changes the high-level language into an assembly-like form
is called intermediate code generation; in it, each statement is divided up so
that every resulting statement has a single operation.
Example: X=A+B*C/D-Y*N
T1= B*C
T2=T1/D
T3=Y*N
T4=A+T2
T5=T4-T3
Example: Y= Cos(A*B)+C/N-X*P
T1=A*B
T2=Cos(T1)
T3=X*P
T4=C/N
T5=T2+T4
T6=T5-T3
If Condition Statement:
Example:
X=1;
If (X>Y)
{ A=A+1;
B=B-A+2;
}
P=P+1;
Example:
X=1
If ((X>Y) && (Y>=2))
{
A=A+1
B=B-A+2
}
Else X=X+1;
P=P+2+X;
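One possible three-address translation of the if/else example, written out by hand as a Python list; the label and temporary names (L1, L2, T1, ...) are invented for illustration, and && is compiled with short-circuit jumps.

```python
tac = [
    "X = 1",
    "T1 = X > Y",
    "ifFalse T1 goto L1",   # first operand of && is false -> else part
    "T2 = Y >= 2",
    "ifFalse T2 goto L1",   # second operand of && is false -> else part
    "T3 = A + 1",
    "A = T3",
    "T4 = B - A",
    "T5 = T4 + 2",
    "B = T5",
    "goto L2",              # skip over the else part
    "L1: T6 = X + 1",
    "X = T6",
    "L2: T7 = P + 2",       # code after the if/else continues here
    "P = T7 + X",
]
print("\n".join(tac))
```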
For - Loop
Example:
For (i=1; i<=10;i++)
X = X+ (i*Y);
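Similarly, a hand-written three-address sketch of the for-loop, with label and temporary names invented for illustration:

```python
loop_tac = [
    "i = 1",
    "L1: if i > 10 goto L2",   # loop exit test
    "T1 = i * Y",
    "T2 = X + T1",
    "X = T2",
    "T3 = i + 1",              # i++
    "i = T3",
    "goto L1",
    "L2:",                     # code after the loop continues here
]
print("\n".join(loop_tac))
```

The loop header turns into an explicit test-and-jump at L1 plus a back-edge (goto L1), which is the general pattern for counting loops.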
Issues in the design of a code generator
The code generator converts the intermediate representation of the source code into
a form that can be readily executed by the machine. A code generator is expected
to generate correct code. The design of the code generator should be done in such a
way that it can be easily implemented, tested, and maintained.
1. Input to code generator
The input to the code generator is the intermediate code generated by the front
end, along with information in the symbol table that determines the run-time
addresses of the data objects denoted by the names in the intermediate
representation. Intermediate code may be represented as quadruples,
triples, indirect triples, postfix notation, syntax trees, DAGs, etc. The code
generation phase proceeds on the assumption that the input is free from
all syntactic and static semantic errors, that the necessary type checking has
taken place, and that the type-conversion operators have been inserted wherever
necessary.
2. Target program
The target program is the output of the code generator. The output may be
absolute machine language, relocatable machine language, or assembly
language.
1. Absolute machine language as output has advantages that it can be
placed in a fixed memory location and can be immediately executed.
2. Relocatable machine language as an output allows subprograms and
subroutines to be compiled separately. Relocatable object modules can
be linked together and loaded by linking loader. But there is added
expense of linking and loading.
3. Assembly language as output makes code generation easier. We can
generate symbolic instructions and use the macro facilities of the assembler
in generating code. However, we need an additional assembly step after code
generation.
3. Memory Management:
Mapping the names in the source program to the addresses of data objects
is done cooperatively by the front end and the code generator. A name in a
three-address statement refers to the symbol table entry for the name. Then,
from the symbol table entry, a relative address can be determined for the name.
4. Instruction selection:
Selecting the best instructions will improve the efficiency of the program.
The instruction set should be complete and uniform. Instruction speeds and
machine idioms also play a major role when efficiency is considered. If we do
not care about the efficiency of the target program, then instruction selection
is straightforward.
For example, the three-address statements
P:=Q+R
S:=P+T
would be translated, statement by statement, into the code sequence:
MOV Q, R0
ADD R, R0
MOV R0, P
MOV P, R0
ADD T, R0
MOV R0, S
Here the fourth statement, MOV P, R0, is redundant: it reloads the value that
the third statement has just stored.
5. Register allocation:
Instructions involving register operands are usually shorter and faster than
those involving operands in memory, so efficient utilization of registers is
particularly important. Register allocation selects the set of variables that
will reside in registers at each point in the program; register assignment
then picks the specific register in which each variable will reside.
6. Evaluation order:
The code generator decides the order in which the instructions will be
executed. The order of computations affects the efficiency of the target code.
Among the many computational orders, some will require fewer registers to
hold the intermediate results. However, picking the best order in the general
case is a difficult NP-complete problem.
7. Approaches to code generation issues:
The code generator must always generate correct code. This is essential
because of the number of special cases that a code generator might face.
Some of the design goals of a code generator are:
Correct
Easily maintainable
Testable
Efficient