Compiler RNP SP Unit 4
Compilers
• Introduction to Compiling
• Phases in Compilation
• Lexical Analysis
• Syntax Analysis
– Context Free Grammars
– Top-Down Parsing
– Bottom-Up Parsing
– Ambiguity in Grammar
Prof. Reshma Pise
Translators
COMPILERS
• A compiler is a program that takes a program written in a source language
and translates it into an equivalent program in a target language,
reporting error messages for the problems it finds in the source program.
– The lexical analyzer and the syntax analyzer do similar things, but the lexical
analyzer deals with the simple non-recursive constructs of the language.
– The syntax analyzer deals with the recursive constructs of the language.
– The lexical analyzer recognizes the smallest meaningful units (tokens) in a source
program.
– The syntax analyzer works on those tokens to recognize the meaningful
structures of the programming language.
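As a rough illustration of the token level described above, the following C sketch splits a statement such as newval := oldval + 12 into tokens. The names TokenKind, Token, next_token and count_tokens are assumptions made for this example, not part of the slides, and the token set is deliberately tiny.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { TOK_ID, TOK_NUM, TOK_ASSIGN, TOK_OP, TOK_END, TOK_ERROR } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* first character of the lexeme */
    size_t length;      /* number of characters in the lexeme */
} Token;

/* Scans one token starting at *cursor and advances *cursor past it. */
Token next_token(const char **cursor) {
    const char *p = *cursor;
    while (isspace((unsigned char)*p)) p++;

    Token t = { TOK_ERROR, p, 1 };  /* default: one unrecognized character */
    if (*p == '\0') {
        t.kind = TOK_END;
        t.length = 0;
    } else if (isalpha((unsigned char)*p)) {
        t.kind = TOK_ID;            /* identifier: letter (letter | digit)* */
        while (isalnum((unsigned char)p[t.length])) t.length++;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUM;           /* number: digit+ */
        while (isdigit((unsigned char)p[t.length])) t.length++;
    } else if (p[0] == ':' && p[1] == '=') {
        t.kind = TOK_ASSIGN;
        t.length = 2;
    } else if (*p == '+' || *p == '*') {
        t.kind = TOK_OP;
    }
    *cursor = p + t.length;
    return t;
}

/* Counts the tokens in text; returns -1 if an unknown character is met. */
int count_tokens(const char *text) {
    int count = 0;
    for (;;) {
        Token t = next_token(&text);
        if (t.kind == TOK_END) return count;
        if (t.kind == TOK_ERROR) return -1;
        count++;
    }
}
```

Every pattern here is a simple loop with no recursion, which is what makes these constructs suitable for the lexical analyzer; nested structures are left to the syntax analyzer.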
Parsing Techniques
• Depending on how the parse tree is created, there are different
parsing techniques.
• These parsing techniques are categorized into two groups:
– Top-Down Parsing,
– Bottom-Up Parsing
• Top-Down Parsing:
– Construction of the parse tree starts at the root, and proceeds towards the
leaves.
– Efficient top-down parsers can be easily constructed by hand.
– Recursive Predictive Parsing, Non-Recursive Predictive Parsing (LL Parsing).
• Bottom-Up Parsing:
– Construction of the parse tree starts at the leaves, and proceeds towards the
root.
– Efficient bottom-up parsers are normally created with the help of software
tools (parser generators).
– Bottom-up parsing is also known as shift-reduce parsing.
– Operator-Precedence Parsing: simple, restrictive, easy to implement
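To make the top-down idea concrete, here is a minimal sketch of a hand-written recursive predictive parser in C. It assumes a small expression grammar chosen for this example (expr -> term { '+' term }, term -> factor { '*' factor }, factor -> NUMBER | '(' expr ')'), and the names Parser, parse_expr, parse_term, parse_factor and evaluate are hypothetical. Instead of building a parse tree it computes the value directly; the chain of function calls follows the tree from the root towards the leaves.

```c
#include <ctype.h>

typedef struct {
    const char *p;  /* next unread character */
    int ok;         /* cleared when a syntax error is found */
} Parser;

static long parse_expr(Parser *ps);

static void skip_spaces(Parser *ps) {
    while (*ps->p == ' ') ps->p++;
}

/* factor -> NUMBER | '(' expr ')' */
static long parse_factor(Parser *ps) {
    skip_spaces(ps);
    if (isdigit((unsigned char)*ps->p)) {
        long v = 0;
        while (isdigit((unsigned char)*ps->p)) v = v * 10 + (*ps->p++ - '0');
        return v;
    }
    if (*ps->p == '(') {
        ps->p++;
        long v = parse_expr(ps);
        skip_spaces(ps);
        if (*ps->p == ')') ps->p++; else ps->ok = 0;
        return v;
    }
    ps->ok = 0;  /* neither a number nor a parenthesized expression */
    return 0;
}

/* term -> factor { '*' factor } */
static long parse_term(Parser *ps) {
    long v = parse_factor(ps);
    for (;;) {
        skip_spaces(ps);
        if (*ps->p != '*') return v;
        ps->p++;
        v *= parse_factor(ps);
    }
}

/* expr -> term { '+' term } */
static long parse_expr(Parser *ps) {
    long v = parse_term(ps);
    for (;;) {
        skip_spaces(ps);
        if (*ps->p != '+') return v;
        ps->p++;
        v += parse_term(ps);
    }
}

/* Returns 1 and stores the value if text is one complete expression. */
int evaluate(const char *text, long *value) {
    Parser ps = { text, 1 };
    long v = parse_expr(&ps);
    skip_spaces(&ps);
    if (!ps.ok || *ps.p != '\0') return 0;
    *value = v;
    return 1;
}
```

Each function decides what to do by looking only at the next input character, which is what makes the parser predictive; the sketch does not check for arithmetic overflow.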
3. Semantic Analyzer
• A semantic analyzer checks the source program for semantic errors
and collects the type information for the code generation.
• Type checking is an important part of semantic analysis.
• The context-free grammars used in syntax analysis are augmented with
attributes (semantic rules)
– the result is a syntax-directed translation,
– Attribute grammars
• Ex:
newval := oldval + 12
• The type of the identifier newval must match the type of the expression
(oldval + 12)
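The type rule above can be sketched in C as a synthesized attribute computed bottom-up over an expression tree. This is only an illustration: the names Type, NodeKind, Node, type_of and check_assignment are assumptions, and the sketch assumes a language with two types and no implicit conversions.

```c
#include <stddef.h>

typedef enum { TYPE_INT, TYPE_REAL, TYPE_ERROR } Type;
typedef enum { NODE_ID, NODE_NUM, NODE_ADD } NodeKind;

typedef struct Node {
    NodeKind kind;
    Type declared;                    /* type of an identifier or constant leaf */
    const struct Node *left, *right;  /* operands of a NODE_ADD node */
} Node;

/* Synthesized attribute: the type of an expression, computed from its children. */
Type type_of(const Node *n) {
    if (n->kind != NODE_ADD) return n->declared;
    Type l = type_of(n->left);
    Type r = type_of(n->right);
    return (l == r) ? l : TYPE_ERROR;  /* assumption: operands must agree exactly */
}

/* An assignment is accepted when the target and expression types match. */
int check_assignment(Type target, const Node *expr) {
    Type t = type_of(expr);
    return t != TYPE_ERROR && t == target;
}
```

For newval := oldval + 12, the tree is a NODE_ADD node whose leaves are oldval and the constant 12; the check succeeds only when newval has the same type as that sum.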
• Ex (intermediate code):
temp1 := id2 * id3
id1 := temp1 + 1
• Ex (target code for the same statement):
MOVE id2,R1
MULT id3,R1
ADD #1,R1
MOVE R1,id1
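The step from three-address code to the MOVE/MULT/ADD instructions can be sketched as follows. This is a toy generator under strong assumptions: a single register R1, only the operators '*' and '+', operands already written in target form (for example "#1" for the constant), and any name starting with "temp" treated as a temporary that lives only in R1 until the next instruction. The names Quad, emit and generate are hypothetical.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *dst, *lhs, *rhs;  /* dst := lhs op rhs */
    char op;                      /* '*' or '+' */
} Quad;

/* Appends one instruction line to out; returns 0 if it does not fit. */
static int emit(char *out, size_t cap, const char *opcode,
                const char *a, const char *b) {
    size_t len = strlen(out);
    int n = snprintf(out + len, cap - len, "%s %s,%s\n", opcode, a, b);
    return n >= 0 && (size_t)n < cap - len;
}

/* Translates count quads into target code; returns 1 on success. */
int generate(const Quad *quads, size_t count, char *out, size_t cap) {
    const char *in_r1 = NULL;  /* name whose value R1 currently holds */
    if (cap == 0) return 0;
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        const Quad *q = &quads[i];
        int ok = 1;
        /* Load the left operand unless R1 already holds it. */
        if (in_r1 == NULL || strcmp(in_r1, q->lhs) != 0)
            ok = emit(out, cap, "MOVE", q->lhs, "R1");
        ok = ok && emit(out, cap, q->op == '*' ? "MULT" : "ADD", q->rhs, "R1");
        in_r1 = q->dst;
        /* Temporaries stay in R1; named variables are stored to memory. */
        if (strncmp(q->dst, "temp", 4) != 0)
            ok = ok && emit(out, cap, "MOVE", "R1", q->dst);
        if (!ok) return 0;
    }
    return 1;
}
```

Given the two quads temp1 := id2 * id3 and id1 := temp1 + #1, this sketch produces the four target instructions of the example. A real code generator needs proper register allocation, because a temporary may still be needed after R1 has been overwritten.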
[Figure: phases of a compiler. Tokens ➔ Parser [Syntax Analyzer] ➔ Parse tree ➔ Semantic Process [Semantic Analyzer] ➔ Code Generator [Intermediate Code Generator] ➔ Code Optimizer ➔ Optimized Intermediate Code ➔ Code Generator ➔ Target machine code]
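#include <stdio.h>

int main(void)
{
    int i, sum;
    sum = 0;
    for (i = 1; i <= 10; i++);  /* note the ';': the loop body is empty */
    sum = sum + i;              /* runs once, after the loop, with i == 11 */
    printf("%d\n", sum);
    return 0;
}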
Approaches to implementation
• Use assembly language: most efficient, but most difficult to implement
• Use a high-level language like C: efficient, but difficult to implement
• Use tools like lex or flex: easy to implement, but not as efficient as the
first two cases
lex Programming Utility
General Information:
• Input is stored in a file with a *.l extension
• The file consists of three main sections
• lex generates a C function, yylex(), stored in lex.yy.c
Using lex:
1) Specify the words to be used as tokens (using an extension of regular
expressions)
2) Run the lex utility on the source file to generate yylex(), a C
function
3) The generated code declares the global variables char *yytext and
int yyleng
lex Programming Utility (cont.)
Three sections of a lex input file:
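%{
/* Section 1: definitions (C declarations copied into lex.yy.c) */
#include <stdio.h>
int num_lines = 0, num_chars = 0;
%}
%%
\n      { ++num_lines; ++num_chars; }
.       { ++num_chars; }
%%
/* Section 3: user code (section 2, between the %% marks, holds the rules) */
int yywrap(void) { return 1; }  /* no further input after end of file */

int main(void) {
    yylex();
    printf("# of lines = %d, # of chars = %d\n",
           num_lines, num_chars);
    return 0;
}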
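For comparison, the same counting can be written by hand in plain C without lex. The sketch below mirrors the two rules (a newline counts as a line and a character, any other character counts as a character); the names Counts and count_text are assumptions made for this example, and it reads from a string rather than standard input to keep it short.

```c
typedef struct {
    int num_lines;
    int num_chars;
} Counts;

/* Hand-written equivalent of the two lex rules: every character bumps
   num_chars, and a newline also bumps num_lines. */
Counts count_text(const char *text) {
    Counts c = { 0, 0 };
    for (; *text != '\0'; text++) {
        if (*text == '\n') ++c.num_lines;
        ++c.num_chars;
    }
    return c;
}
```

With only two patterns the hand-written version is trivial; the advantage of lex shows up as the number and complexity of the token patterns grows.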
Syntax Analyzer
• Context-free grammars
• Writing a grammar
1. Top-Down Parser
– the parse tree is created top to bottom, starting from the root.
2. Bottom-Up Parser
– the parse tree is created bottom to top, starting from the leaves.
• Both top-down and bottom-up parsers scan the input from left to
right (one symbol at a time).
• Efficient top-down and bottom-up parsers can be implemented only
for sub-classes of context-free grammars.
– LL for top-down parsing
– LR for bottom-up parsing
• unambiguous grammar
➔ unique selection of the parse tree for a sentence
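[Figure: two parse trees, numbered 1 and 2, each rooted at stmt, for a nested if statement with condition E2 and statements S1 and S2]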
Ambiguity (cont.)
• We prefer the second parse tree (each else matches the closest unmatched if).
• So, we have to disambiguate our grammar to reflect this choice.
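One way to see the preferred reading is a small recursive-descent sketch in C that always attaches an else to the innermost open if. It is an illustration only, with an invented one-character token encoding: 'i' stands for "if expr then", 'e' for "else" and 's' for any other statement. The names IfParser, emit_char, parse_stmt and parse_if are hypothetical. The output is a bracketed string in which each pair of parentheses encloses one if statement.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *in;  /* remaining input tokens */
    char out[128];   /* bracketed rendering of the parse */
    size_t len;
    int ok;          /* cleared on a syntax error or a full buffer */
} IfParser;

static void emit_char(IfParser *ps, char c) {
    if (ps->len + 1 < sizeof ps->out) {
        ps->out[ps->len++] = c;
        ps->out[ps->len] = '\0';
    } else {
        ps->ok = 0;
    }
}

/* stmt -> 'i' stmt [ 'e' stmt ] | 's' */
static void parse_stmt(IfParser *ps) {
    if (*ps->in == 's') {
        ps->in++;
        emit_char(ps, 's');
        return;
    }
    if (*ps->in == 'i') {
        ps->in++;
        emit_char(ps, '(');
        emit_char(ps, 'i');
        parse_stmt(ps);
        /* Taking the else here, in the innermost call that can see it,
           is what binds it to the closest unmatched if. */
        if (ps->ok && *ps->in == 'e') {
            ps->in++;
            emit_char(ps, 'e');
            parse_stmt(ps);
        }
        emit_char(ps, ')');
        return;
    }
    ps->ok = 0;  /* unexpected token or end of input */
}

/* Returns 1 and copies the bracketed parse into out on success. */
int parse_if(const char *tokens, char *out, size_t cap) {
    IfParser ps = { tokens, "", 0, 1 };
    parse_stmt(&ps);
    if (!ps.ok || *ps.in != '\0' || ps.len + 1 > cap) return 0;
    memcpy(out, ps.out, ps.len + 1);
    return 1;
}
```

For the token string "iises" (if ... then if ... then s else s) the else ends up inside the inner pair of parentheses, that is, with the closest if. This resolves the ambiguity in the parser rather than in the grammar; rewriting the grammar itself is the approach the slide refers to.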