
Chapter 1.1

Chapter 1

Uploaded by

rohitaher1919
Copyright
© © All Rights Reserved

SEMESTER IV

Compiler Construction
Code No. : CS-366
CHAPTER 1: INTRODUCTION
• 1.1 Definition of Compiler, Aspects of compilation.
• 1.2 The structure of Compiler.
• 1.3 Phases of Compiler: Lexical Analysis, Syntax Analysis, Semantic Analysis, Intermediate Code Generation, Code Optimization, Code Generation.
• 1.4 Error Handling
• 1.5 Introduction to one-pass and multi-pass compilers, cross compiler, Bootstrapping.
• Definition of Compiler:
A compiler is software that translates a program written in a high-level language (the source language) into a low-level language (the object, target or machine language).

PHASES OF COMPILER
• Phases of a Compiler:
Compilation has two major phases, each of which is divided into several parts.
• Each part takes the output of the previous part as its input, and all parts work in a coordinated way.

a. Analysis Phase:
An intermediate representation is created from the given source code. It has four parts:
i. Lexical Analyser
ii. Syntax Analyser
iii. Semantic Analyser
iv. Intermediate Code Generator
b. Synthesis Phase:
An equivalent target program is created from the intermediate representation. It has two parts:
i. Code Optimizer
ii. Code Generator
• As the block diagram of the phases shows, every phase works with the symbol table manager.
• The symbol table manager stores information about the symbols (names) used in the program.
• Errors that occur in any phase are handled by the error handler.
• The lexical analyser divides the program into “tokens”, the syntax analyser recognizes “sentences” in the program using the syntax of the language, and the semantic analyser checks the static semantics of each construct.
• The intermediate code generator generates “abstract” code.
a. Analysis Phase:
i. Lexical Analyser:
It is also called the scanner.
• A program that performs lexical analysis is called a lexical analyser.
• The lexical analyser groups the characters of the source program into tokens and discards blank characters.
• Ex.

price := amount + rate * 50

Tokens:
Identifiers: price, amount, rate
Assignment operator: :=
Arithmetic operators: +, *
Constant: 50

The identifiers are recorded as id1 = price, id2 = amount, id3 = rate.
• Output of lexical analysis:

id1 := id2 + id3 * 50
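A minimal sketch of this step in Python, for illustration only. The token names (ASSIGN, NUMBER, ID, OP), the regular expressions and the function name tokenize are assumptions made for this sketch; they are not part of the notes. It splits the example statement into tokens, drops blanks and numbers the identifiers in order of first appearance.

```python
import re

# Token categories follow the example above; names and regexes are assumptions.
TOKEN_SPEC = [
    ("ASSIGN", r":="),
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symbols); identifiers are numbered id1, id2, ... as they first appear."""
    tokens, symbols = [], {}
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue  # blank characters are eliminated
        if kind == "ID":
            symbols.setdefault(text, f"id{len(symbols) + 1}")
            text = symbols[text]
        tokens.append((kind, text))
    return tokens, symbols

tokens, symbols = tokenize("price := amount + rate*50")
print(" ".join(text for _, text in tokens))  # id1 := id2 + id3 * 50
print(symbols)
```

The dictionary symbols plays the role of a very small symbol table: it maps each source name to its id.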


ii. Syntax Analyser:
It is also called the parser.
• The program that performs syntax analysis is called the syntax analyser or parser.
• Syntax analysis takes the tokens produced by the lexical analyser as input.
• In syntax analysis, sentences are checked grammatically.
• If a sentence is grammatically correct, a parse tree is constructed for it.
• Ex.
id1 := id2 + id3 * 50
• Parse Tree: [figure: parse tree for id1 := id2 + id3 * 50, in which id3 * 50 is grouped below the + node because * has higher precedence than +]
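The tree for the example statement can be sketched with a small recursive-descent parser in Python. This is only an illustration under assumptions: the grammar (assignment, + and * only), the nested-tuple tree format and the function name parse_assignment are chosen for this sketch and do not come from the notes.

```python
def parse_assignment(tokens):
    """Parse `id := expr`, where * binds tighter than +; returns a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        # factor -> identifier | constant
        token = advance()
        if token is None or not token.isalnum():
            raise SyntaxError(f"expected identifier or constant, got {token!r}")
        return token

    def term():
        # term -> factor { * factor }
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        # expr -> term { + term }
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = factor()
    if advance() != ":=":
        raise SyntaxError("expected ':='")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

tree = parse_assignment("id1 := id2 + id3 * 50".split())
print(tree)  # (':=', 'id1', ('+', 'id2', ('*', 'id3', '50')))
```

A grammatically wrong sentence such as `id1 := + id2` raises SyntaxError instead of producing a tree.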
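iii. Semantic Analyser:
It verifies whether the parse tree is meaningful.
• It produces a verified (annotated) parse tree.
• It performs type checking, label checking and flow control checking; type checking is its major task.
• Ex.
id1 := id2 + id3 * 50
(Here id1, id2 and id3 are real numbers and 50 is an integer, so 50 must be converted to real before the multiplication.)
• Ex.
int a, b;
char c;
c = a + b; // Syntactically correct, but a strict type checker flags it: an int result is stored in a char
(C itself accepts this statement through an implicit narrowing conversion.)
• Ex.
Sum = a + b;
with a = int, b = char, Sum = double: the analyser must check the operand types and apply the required conversions.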
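A rough Python sketch of type checking on the first example. The typing rules used here (int mixed with real is converted through an explicit inttoreal node, anything else is an error), the tuple tree format and the function name check are assumptions for illustration; a real language defines its own rules.

```python
def check(node, types):
    """Return (typed_node, type); an int operand mixed with a real one gets an explicit conversion."""
    if isinstance(node, str):
        if node.isdigit():
            return node, "int"          # integer constant
        if node not in types:
            raise TypeError(f"undeclared identifier {node}")
        return node, types[node]
    op, left, right = node
    left, left_type = check(left, types)
    right, right_type = check(right, types)
    if op == ":=":
        if left_type == "real" and right_type == "int":
            right = ("inttoreal", right)  # widening is allowed on assignment
        elif left_type != right_type:
            raise TypeError(f"cannot assign {right_type} to {left_type}")
        return (op, left, right), left_type
    if left_type == right_type:
        return (op, left, right), left_type
    if {left_type, right_type} == {"int", "real"}:
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (op, left, right), "real"
    raise TypeError(f"operator {op} cannot combine {left_type} and {right_type}")

tree = (":=", "id1", ("+", "id2", ("*", "id3", "50")))
typed_tree, _ = check(tree, {"id1": "real", "id2": "real", "id3": "real"})
print(typed_tree)
# (':=', 'id1', ('+', 'id2', ('*', 'id3', ('inttoreal', '50'))))
```

Under these assumed rules, checking `c := a + b` with a and b declared int and c declared char raises TypeError, which corresponds to the strict reading of the second example.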
iv. Intermediate Code Generator:
It is a bridge between the analysis and synthesis phases.
• It generates intermediate code, a machine-independent form that is easy to translate into target machine code.
• There are many popular forms of intermediate code, for example three-address code.
• The intermediate code is converted to machine language by the last two phases, which are platform dependent.
• Ex.
id1 := id2 + id3 * 50
temp1 := inttoreal(50)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
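The three-address code above can be produced by walking the annotated tree bottom-up and giving every operation its own temporary. The Python below is a sketch under assumptions: the nested-tuple tree, the inttoreal node and the function name to_three_address are illustrative choices, not a fixed format.

```python
def to_three_address(tree):
    """Flatten a nested-tuple tree into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def emit(node):
        if isinstance(node, str):
            return node  # an identifier or constant is already an address
        if node[0] == "inttoreal":
            operand = emit(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        op, left, right = node
        left_addr = emit(left)
        right_addr = emit(right)
        temp = new_temp()
        code.append(f"{temp} := {left_addr} {op} {right_addr}")
        return temp

    _, target, expression = tree
    result = emit(expression)
    code.append(f"{target} := {result}")
    return code

typed_tree = (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", "50"))))
for line in to_three_address(typed_tree):
    print(line)
# temp1 := inttoreal(50)
# temp2 := id3 * temp1
# temp3 := id2 + temp2
# id1 := temp3
```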
b. Synthesis Phase:

i. Code Optimization:
It improves the intermediate code.
• The optimized code takes less memory space and less execution time, without any change in functionality or correctness.
• It improves the efficiency of the program.
• Ex.

id1 := id2 + id3 * 50

temp1 := id3 * 50
id1 := id2 + temp1
(The conversion of 50 to real is done once at compile time, and the copy through temp3 is eliminated.)
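A small Python sketch of the two rewrites used in this example: converting a constant at compile time, and removing the final copy. It assumes instructions are strings of the form `dest := expr` with space-separated operands, and that a temporary copied in the last instruction is not used anywhere else; the function name optimize is hypothetical. The sketch writes the converted constant as 50.0 and does not renumber temporaries, so it keeps temp2 where the notes show temp1.

```python
import re

def optimize(code):
    """Apply two simple rewrites to a list of 'dest := expr' instruction strings."""
    instructions = [line.split(" := ", 1) for line in code]

    # 1. Perform inttoreal on constants at compile time and substitute the result.
    constants = {}
    folded = []
    for dest, expr in instructions:
        match = re.fullmatch(r"inttoreal\((\d+)\)", expr)
        if match:
            constants[dest] = f"{match.group(1)}.0"
            continue
        operands = [constants.get(word, word) for word in expr.split()]
        folded.append([dest, " ".join(operands)])

    # 2. Remove a trailing copy 'x := temp' by writing the previous result directly to x.
    if len(folded) >= 2:
        (prev_dest, prev_expr), (last_dest, last_expr) = folded[-2], folded[-1]
        if last_expr == prev_dest:
            folded[-2:] = [[last_dest, prev_expr]]

    return [f"{dest} := {expr}" for dest, expr in folded]

unoptimized = [
    "temp1 := inttoreal(50)",
    "temp2 := id3 * temp1",
    "temp3 := id2 + temp2",
    "id1 := temp3",
]
for line in optimize(unoptimized):
    print(line)
# temp2 := id3 * 50.0
# id1 := id2 + temp2
```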
ii. Target Code Generation:
The main purpose of the target code generator is to produce code that the machine can understand; this includes register allocation, instruction selection, etc.
• This is the final stage of compilation.
• The output depends on the target machine and its assembler.
• The intermediate code is translated into a sequence of machine instructions that perform the same operations.
• For every variable (identifier), a memory location or register is selected.
• Ex.
id1 := id2 + id3 * 50

If id2 = 2 and id3 = 5:

MOV R1, id3    ; R1 = 5
MUL R1, 50     ; R1 = R1 * 50 = 5 * 50 = 250
MOV R2, id2    ; R2 = 2
ADD R2, R1     ; R2 = R2 + R1 = 2 + 250 = 252
MOV id1, R2    ; id1 = 252
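The trace above can be checked with a tiny interpreter written in Python. This is a sketch, not a real machine model: it assumes a two-operand instruction format in which the first operand is the destination, that register names start with R and identifiers do not, and that only MOV, MUL and ADD exist. The function name run is hypothetical.

```python
def run(program, memory):
    """Interpret a tiny instruction set; the first operand of each instruction is the destination."""
    registers = {}

    def value(operand):
        if operand.isdigit():
            return int(operand)        # immediate constant
        if operand in registers:
            return registers[operand]  # register
        return memory[operand]         # memory location of an identifier

    def store(dest, result):
        if dest.startswith("R"):
            registers[dest] = result
        else:
            memory[dest] = result

    for line in program:
        opcode, operands = line.split(None, 1)
        dest, src = [part.strip() for part in operands.split(",")]
        if opcode == "MOV":
            store(dest, value(src))
        elif opcode == "MUL":
            store(dest, value(dest) * value(src))
        elif opcode == "ADD":
            store(dest, value(dest) + value(src))
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return memory

program = ["MOV R1, id3", "MUL R1, 50", "MOV R2, id2", "ADD R2, R1", "MOV id1, R2"]
memory = run(program, {"id2": 2, "id3": 5})
print(memory["id1"])  # 252
```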
• All six phases are associated with the symbol table manager and the error handler, as shown in the block diagram.
• Symbol Table:
It is an important data structure created and maintained by the compiler to keep track of the semantics of variables, i.e. it stores scope and binding information about names, and information about instances of entities such as variable and function names, classes, objects, etc.
• It is built in the lexical and syntax analysis phases.
• The information is collected by the analysis phases of the compiler and used by the synthesis phases to generate code.
• It is used by the compiler to achieve compile-time efficiency.
• It is used by the various phases of the compiler as follows:
1. Lexical Analysis: Creates new entries in the table, for example entries for tokens.
2. Syntax Analysis: Adds information about attribute type, scope, dimension, line of reference, use, etc. to the table.
3. Semantic Analysis: Uses the information in the table to check semantics, i.e. to verify that expressions and assignments are semantically correct (type checking), and updates the table accordingly.
4. Intermediate Code Generation: Refers to the symbol table to find out how much run-time storage is allocated and of what type; the table also helps in adding temporary variable information.
5. Code Optimization: Uses the information in the symbol table for machine-dependent optimization.
6. Target Code Generation: Generates code using the address information of the identifiers in the table.
• Symbol Table entries: Each entry in the symbol table is associated with attributes that support the compiler in different phases.
Items stored in the symbol table:
• Variable names and constants
• Procedure and function names
• Literal constants and strings
• Compiler-generated temporaries
• Labels in source languages
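One common way to organise a symbol table with scope information is a stack of dictionaries, sketched below in Python. The class name SymbolTable, its method names and the attribute keys (kind, type) are assumptions for this illustration; real compilers store many more attributes per entry.

```python
class SymbolTable:
    """Stack of scopes; each scope maps a name to a dictionary of its attributes."""

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, **attributes):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name} is already declared in this scope")
        scope[name] = attributes

    def lookup(self, name):
        for scope in reversed(self.scopes):  # the innermost declaration wins
            if name in scope:
                return scope[name]
        raise NameError(f"{name} is not declared")

table = SymbolTable()
table.declare("rate", kind="variable", type="real")
table.enter_scope()
table.declare("rate", kind="variable", type="int")  # shadows the outer rate
inner = table.lookup("rate")["type"]
table.exit_scope()
outer = table.lookup("rate")["type"]
print(inner, outer)  # int real
```

Looking up a name that was never declared raises NameError, which is the kind of condition the semantic analyser would report to the error handler.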


• Passes of Compiler:
A pass is one complete traversal of the source program.
• Compilers are classified by the number of passes they make over the source program:
i. One pass compiler (Single pass)
ii. Two pass compiler (Multi pass)
i. One pass compiler (Single pass):
If all the phases of the compiler are grouped into a single module, the result is known as a single pass compiler.
• A one-pass compiler traverses the program only once: it passes once through each part of a compilation unit and translates each part into its final machine code.
• When a line of source is processed, it is scanned and its tokens are extracted.
• Then the syntax of the line is analysed and the tree structure is built.
• After semantic analysis, the code is generated.
• The same process is repeated for each line of code until the entire program is compiled.
ii. Two pass compiler (Multi pass):
• A two-pass or multi-pass compiler processes the source code or abstract syntax tree of a program multiple times.
• In a multi-pass compiler the phases are divided into two passes:
• First Pass: is referred to as
(a) Front end
(b) Analytic part
(c) Platform independent
• The first pass includes the lexical analyser, syntax analyser, semantic analyser and intermediate code generator. These phases form the front end. They are the analytic part because they analyse the high-level language program and convert it into three-address code. The first pass is platform independent because its output, three-address code, can be used on every system; to target a different system, only the code optimization and code generation phases of the second pass need to change.
• Second Pass: is referred to as
(a) Back end
(b) Synthesis part
(c) Platform dependent

• A compiler may also use more than two passes, for example one pass per phase:
• In the first pass, the compiler reads the source program, scans it, extracts the tokens and stores the result in an output file.
• In the second pass, the compiler reads the output file produced by the first pass, builds the syntax tree and performs syntax analysis. The output of this pass is a file that contains the syntax tree.
• In the third pass, the compiler reads the output file produced by the second pass and checks whether the tree follows the rules of the language. The output of this semantic analysis pass is the annotated syntax tree.
• Passes continue in this way until the target output is produced.
• Cross Compiler:
A cross compiler is a compiler that runs on one machine 'A' and produces code for another machine 'B'.
• It is capable of creating code for a platform other than the one on which the compiler itself is running.
• Bootstrapping:
Bootstrapping is a process in which a simple language is used to translate a more complicated program, which in turn can handle an even more complicated program, and so on.
• Bootstrapping is used to produce a self-hosting compiler, that is, a compiler that can compile its own source code.
• A compiler can be characterized by three languages:
i. Source Language (S)
ii. Target Language (T)
iii. Implementation Language (I)
• In bootstrapping, a compiler is represented by a T-diagram:

Source language (S) --> Compiler --> Target language (T)
                           |
               Implementation language (I)
• Suppose we want to write a cross compiler for a new language X.
• Say the implementation language of this compiler is Y and the target code it generates is in language Z. That is, we create XYZ.
• If an existing compiler for Y runs on machine M and generates code for M, it is denoted YMM.
• If we now compile XYZ using YMM, we get a compiler XMZ.
• That is, a compiler for source language X that generates target code in language Z and runs on machine M.
• The following diagram illustrates the above scenario. [figure: T-diagrams combining XYZ and YMM into XMZ]
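The rule for combining T-diagrams can be written as a short Python sketch. The record type Compiler and the function compile_compiler are hypothetical names for this illustration. The rule encoded is the one described above: compiling a compiler changes only its implementation language, from the host compiler's source language to the host compiler's target language.

```python
from collections import namedtuple

# A compiler in T-diagram notation: source, implementation and target language.
Compiler = namedtuple("Compiler", ["source", "implementation", "target"])

def compile_compiler(program, host):
    """Translate the compiler `program` with the compiler `host` and return the result."""
    if program.implementation != host.source:
        raise ValueError(
            f"host accepts {host.source}, but the program is written in {program.implementation}"
        )
    # Source and target languages of `program` are unchanged; only its implementation moves.
    return Compiler(program.source, host.target, program.target)

xyz = Compiler("X", "Y", "Z")   # compiler for X, written in Y, generating Z
ymm = Compiler("Y", "M", "M")   # compiler for Y, running on M, generating M
xmz = compile_compiler(xyz, ymm)
print(xmz)  # Compiler(source='X', implementation='M', target='Z')
```

The same rule covers bootstrapping a C compiler from a subset C0: compiling Compiler("C", "C0", "ASM") with Compiler("C0", "ASM", "ASM") gives Compiler("C", "ASM", "ASM").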
• Example:
Compilers can be created in many different forms. Here we build a compiler that takes the C language as input and generates assembly language as output, given a machine that runs assembly language.
Step-1: First we write a compiler for a small subset of C, called C0, in assembly language.
Step-2: Then, using the small subset C0 as the implementation language, we write a compiler for the full source language C.
Step-3: Finally we compile the second compiler using compiler 1.
Step-4: Thus we get a compiler written in ASM which compiles C and generates code in ASM.
