CD Lab FPP
Prepared by,
T.Illakiya
Assistant Professor
CSE Department
Rajalakshmi Institute of Technology
Verified by
Dr. A. Saravanan
Approved by
HOD/CSE
Semester: VI
L T P
1. Course Description:
The course is intended to teach the students the basic techniques that underlie the practice of
Compiler Construction. The course will introduce the theory and tools that can be employed
in order to perform syntax-directed translation of a high-level programming language into an
executable code. These techniques can also be employed in wider areas of application,
whenever we need a syntax-directed analysis of symbolic expressions and languages and
their translation into a lower-level description. They have multiple applications for man-machine interaction, including verification and program analysis.
2. Required Background or Pre-requisite:
Basic knowledge about C Programming, System Software and Theory of computation.
3. Detailed Description of the Course:
1. Implementation of Symbol Table
2. Develop a lexical analyzer to recognize a few patterns in C. (Ex. identifiers, constants,
comments, operators etc.)
3. Implementation of Lexical Analyzer using Lex Tool
4. Generate YACC specification for a few syntactic categories.
a) Program to recognize a valid arithmetic expression that uses the operators +, -, * and /.
b) Program to recognize a valid variable which starts with a letter followed by any number of
letters or digits.
c) Implementation of Calculator using LEX and YACC
5. Convert the BNF rules into Yacc form and write code to generate Abstract Syntax Tree.
6. Implement type checking
7. Implement control flow analysis and Data flow Analysis
8. Implement any one of the storage allocation strategies (Heap, Stack, Static)
9. Construction of DAG
10. Implement the back end of the compiler which takes the three address code and produces
the 8086 assembly language instructions that can be assembled and run using an 8086
assembler. The target assembly instructions can be simple move, add, sub, jump. Also simple
addressing modes are used.
11. Implementation of Simple Code Optimization Techniques (Constant Folding, etc.)
LIST OF EQUIPMENT FOR A BATCH OF 30 STUDENTS:
Standalone desktops with C / C++ compiler and Compiler writing tools 30 Nos.
(or)
Server with C / C++ compiler and Compiler writing tools supporting 30 terminals or more.
LEX and YACC
Lab Plan
S.No | Week number (Batch 1, Batch 2) | List of Experiments
Construction of DAG
Implement Type Checking
Implement Control Flow Analysis and Data Flow Analysis
Implement the back end of the compiler which takes the three address code and produces the 8086 assembly language instructions that can be assembled and run using an 8086 assembler. The target assembly instructions can be simple move, add, sub, jump. Also simple addressing modes are used.
Implementation of Simple Code Optimization Techniques (Constant Folding, etc.)
4. Course Objectives:
1. Students will acquire knowledge of the various phases of a compiler and their use, code optimization
techniques, machine code generation, and use of the symbol table.
2. Students will be able to use compiler tools like LEX, YACC, etc.
3. Students will learn to construct LL, SLR, CLR and LALR parse tables.
4. Students will know about Syntax directed translation, synthesized and inherited attributes and
different types of compiler tools to meet the requirements of the realistic constraints of
compilers.
5. Students will learn how to optimize and effectively generate machine codes.
5. Course Outcomes:
1. Students can design and implement a prototype compiler and apply the various optimization
techniques.
2. Students will employ different compiler construction tools and can design a lexical analyzer
for a sample language.
3. Students can design a lexical analyzer for a language and apply the knowledge of LEX tool
& YACC tool to develop a scanner & parser.
4. Students can use the knowledge of patterns, tokens & regular expressions for solving a
problem in the field of data mining.
5. Students will design a new code generator algorithm and identify the right code optimization
technique to improve the performance of a program in terms of speed & space.
6.
Activity | Mode of Delivery | Workload periods
Demonstration | Execute the programs and display in projector | 45
Viva | Technical quiz in Moodle | Last 10 minutes of lab hour
Model Test | | 3
Total | | 48 Periods
7. Evaluation of Students
Course Outcome | Assessment
CO1 CO2 CO3 CO4 CO5
CO1 CO2 CO3 CO4 CO5
a H M H H H
b H H M M H
c H H H H H
d M M M M M
e H M M M M
f M M
g M M
h M
i L L
j M L M L L M
1 M M M M L
9. Notes
Lab Manual is enclosed.
10. Content beyond syllabus
1. Construction of NFA
2. LR parser
11. VIVA Questions
Translator
A program that accepts text expressed in one language and generates semantically
equivalent text expressed in another language.
source language
The input language of a translator.
target language
The output language of a translator.
assembler
A translator from an assembly language to the corresponding machine language.
compiler
A translator from a high level language to a low level language.
high-level translator
A translator from one high-level language to another.
disassembler
A translator from machine language to assembly language.
decompiler
A translator from a low level language to a high level language.
source program
The input text of an assembler or compiler.
object program
The output text of an assembler or compiler.
implementation language
The language in which a program is expressed.
tombstone diagram
A graphical representation of the overall function of a system.
cross compiler
A compiler which generates code for a machine different from the machine on which it
is run.
Portable program
A program which can be (compiled and) run on any machine.
interpretive compiler
A program which combines a compiler that produces object code in an intermediate
language with an interpreter for that intermediate language.
What is a compiler
A compiler is a computer program (or set of programs) that transforms source code
written in a programming language (the source language) into another computer
language (the target language, often having a binary form known as object code).
Difference between compilers & interpreters.
A compiler takes in the entire program, checks it for errors and translates it into target
code, which is then executed separately. An interpreter, by contrast, works line by line:
it takes one line, checks it for errors and executes it before moving on to the next.
Example of a compiled language: C
Example of an interpreted language: PHP
Language processor.
A program that processes programs written in a particular language, such as a parser, compiler or interpreter, is called a language processor.
Symbol table.
A symbol table is a data structure used by a language translator such as a compiler or
interpreter, where each identifier in a program's source code is associated with
information relating to its declaration or appearance in the source, such as its type, scope
level and sometimes its location.
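The definition above can be illustrated with a minimal sketch of a scoped symbol table. It assumes a stack-of-dictionaries design; the class name `SymbolTable` and its method names are hypothetical choices for this illustration, not part of the lab manual.

```python
class SymbolTable:
    """A scoped symbol table: a stack of dictionaries, innermost scope last."""

    def __init__(self):
        self.scopes = [{}]  # scope level 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_name):
        level = len(self.scopes) - 1
        if name in self.scopes[-1]:
            raise KeyError(f"{name} already declared at scope level {level}")
        # Store the identifier's type and scope level, as in the definition.
        self.scopes[-1][name] = {"type": type_name, "level": level}

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("count", "int")
table.enter_scope()
table.declare("count", "float")  # shadows the outer declaration
inner = table.lookup("count")
table.exit_scope()
outer = table.lookup("count")
```

Here `inner` refers to the float declaration at scope level 1 and `outer` to the int declaration at level 0. A hash-table or linked-list design, as is common in C implementations, would serve the same purpose.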
Explain Different Phases Of A Compiler With An Example
1. Lexical analysis
This is the initial part of reading and analysing the program text: the text is read and
divided into tokens, each of which corresponds to a symbol in the programming
language,
e.g., a variable name, keyword or number.
2. Syntax analysis
This phase takes the list of tokens produced by the lexical analysis and arranges these in
a tree structure (called the syntax tree) that reflects the structure of the program. This
phase is often called parsing.
3. Semantic analysis - checks the syntax tree for semantic errors, such as type mismatches.
4. Intermediate code generation - translates the program into a machine-independent intermediate form, such as three-address code.
5. Code optimization (machine independent) - looks for ways to make the code smaller and
more efficient.
6. Code generation (machine dependent) - translates the optimized intermediate code into target machine code.
7. Target program - the output of the compiler (.exe, .com, .dll, etc.).
RISC
RISC CPUs (e.g., ARM processors) have an instruction set in which every
instruction is of the same fixed length. This makes the CPU design much
simpler, since every instruction and operand fits in the pipeline with few wasted cycles.
CISC
CISC CPUs (x86, 8051, etc.) have complex, variable-length instruction sets and
operands. While complex variable-length instructions make low-level
programming much easier, they come at the price of greatly increasing the complexity of
the CPU and reducing efficiency.
Application of compiler technology.
pattern recognition systems.
diagram recognition and diagram-processing tasks by use of grammars
Lexical Analyzer
lexical analysis is the process of converting a sequence of characters into a sequence of
tokens. A program or function which performs lexical analysis is called a lexical
analyzer, lexer or scanner.
Tokens, Patterns, Lexemes
A lexical token is a sequence of characters that can be treated as a unit in the grammar
of the programming languages.
Example of tokens:
Type token (id, num, real, . . . )
Patterns
There is a set of strings in the input for which the same token is produced as output.
This set of strings is described by a rule called a pattern associated with the token.
Regular expressions are an important notation for specifying patterns.
For example, the pattern for the Pascal identifier token, id, is: id → letter (letter |
digit)*.
Lexeme
A lexeme is a sequence of characters in the source program that is matched by the
pattern for a token.
For example, the operators =, < and >= are lexemes.
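The relationship between tokens, patterns and lexemes can be sketched with a small regular-expression scanner. This is only an illustration: the token names `num`, `id` and `relop`, the Pascal-style set of relational operators, and the function name `tokenize` are assumptions made for the example.

```python
import re

# Each token name is paired with the pattern that describes its lexemes.
TOKEN_PATTERNS = [
    ("num",   r"\d+"),
    ("id",    r"[A-Za-z][A-Za-z0-9]*"),   # letter (letter | digit)*
    ("relop", r"<=|>=|<>|<|>|="),         # assumed Pascal-style operators
    ("skip",  r"\s+"),                    # whitespace produces no token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))


def tokenize(text):
    """Return a list of (token, lexeme) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "skip":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


pairs = tokenize("count1 >= 42")
```

Each pair holds the token name first and the matched lexeme second, so `count1` is a lexeme of the token `id` and `>=` is a lexeme of the token `relop`.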
Regular Expressions
The regular expressions over an alphabet Σ specify a language according to the
following rules.
1. ε is a regular expression that denotes {ε}, that is, the set containing the
empty string.
Regular Definitions
A regular definition gives names to certain regular expressions and uses those names in
other regular expressions.
Deterministic Finite Automata (DFA)
A deterministic finite automaton is a special case of a non-deterministic finite
automaton (NFA) in which
1. no state has an ε-transition
2. for each state s and input symbol a, there is at most one edge labeled a leaving s.
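As a rough illustration of the two conditions, the sketch below encodes a DFA as a dictionary keyed by (state, symbol class), so each pair can have at most one outgoing edge and there is no ε entry. The DFA shown recognizes identifiers of the form letter (letter | digit)*; the state names and the helper `char_class` are hypothetical.

```python
def char_class(ch):
    """Map a character to the input symbol class used by the DFA."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


# One target state per (state, symbol) pair: at most one edge per symbol.
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
}
ACCEPTING = {"in_id"}


def accepts(text):
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:  # no edge for this symbol: reject
            return False
    return state in ACCEPTING
```

With this table, `accepts("x25")` holds, while `accepts("9lives")` and `accepts("")` do not.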
Nondeterministic Finite Automata (NFA)
A nondeterministic finite automaton is a mathematical model that consists of
1. a set of states S;
2. a set of input symbols Σ, called the input alphabet;
3. a transition function move that maps state-symbol pairs to sets of states;
4. a state s0, called the initial or start state;
5. a set of states F, called the accepting or final states.
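The five components listed above can be sketched directly as data plus a simulation loop. In this illustration the transition function move is a dictionary from (state, symbol) pairs to sets of states, and ε is represented by `None`; these representation choices, the function names, and the example NFA for (a|b)*abb are assumptions made for the sketch.

```python
EPSILON = None  # assumed marker for ε-transitions


def epsilon_closure(states, move):
    """All states reachable from `states` using only ε-transitions."""
    stack = list(states)
    closure = set(states)
    while stack:
        s = stack.pop()
        for t in move.get((s, EPSILON), set()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def nfa_accepts(text, move, start, final):
    """Track the set of states the NFA could be in after each symbol."""
    current = epsilon_closure({start}, move)
    for symbol in text:
        reached = set()
        for s in current:
            reached |= move.get((s, symbol), set())
        current = epsilon_closure(reached, move)
    return bool(current & final)


# Example NFA for (a|b)*abb with start state 0 and F = {3}.
MOVE = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
```

For this NFA, `nfa_accepts("aabb", MOVE, 0, {3})` holds and `nfa_accepts("abab", MOVE, 0, {3})` does not.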
Synthesized Attributes:
An attribute is synthesized if its value at a parent node can be determined from
attributes of its children.
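A small sketch may help here. It assumes an expression tree written as nested tuples (operator, left, right) with numbers at the leaves, and computes a synthesized attribute, here called `val`, for each node purely from the `val` of its children. The tree encoding and the function name are hypothetical.

```python
def val(node):
    """Return the synthesized attribute 'val' of a tree node."""
    if isinstance(node, (int, float)):
        return node  # leaf: the attribute is the number itself
    op, left, right = node
    # The parent's value depends only on the values of its children.
    left_val, right_val = val(left), val(right)
    if op == "+":
        return left_val + right_val
    if op == "-":
        return left_val - right_val
    if op == "*":
        return left_val * right_val
    if op == "/":
        return left_val / right_val
    raise ValueError(f"unknown operator {op!r}")


tree = ("+", 3, ("*", 4, 5))  # represents 3 + 4 * 5
result = val(tree)
```

Evaluating the tree bottom-up gives 20 for the inner node and 23 for the root.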
Syntax-Directed Definitions:
A syntax-directed definition uses a CFG to specify the syntactic structure of the input.
A syntax-directed definition associates a set of attributes with each grammar symbol.
A syntax-directed definition associates a set of semantic rules with each production
rule.
Parsing
Parsing is the process of determining if a string of tokens can be generated by a
grammar. A parser must be capable of constructing the tree, or else the translation
cannot be guaranteed correct. For any language that can be described by a CFG,
parsing a string of n tokens takes at most O(n^3) time. However, most programming
languages are so simple that a parser requires just O(n) time with a single left-to-right
scan over the input string of n tokens.
There are two types of Parsing
1. Top-down Parsing (start from start symbol and derive string)
A Top-down parser builds a parse tree by starting at the root and working down towards
the leaves.
o Easy to generate by hand.
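A recursive-descent parser is one common hand-written form of top-down parsing. The sketch below assumes the grammar E → T {(+|-) T}, T → F {(*|/) F}, F → number | ( E ), which is chosen for illustration and is not taken from the lab manual; the function names are likewise hypothetical. It starts from the start symbol E and returns the tree as nested tuples.

```python
import re


def parse(text):
    """Parse an arithmetic expression and return its tree as nested tuples."""
    if not re.fullmatch(r"(\d+|[-+*/()]|\s)*", text):
        raise SyntaxError("illegal character in input")
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():  # E -> T {(+|-) T}
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    def term():  # T -> F {(*|/) F}
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def factor():  # F -> number | ( E )
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok.isdigit():
            return int(advance())
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()  # start from the start symbol
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

Each grammar rule becomes one function, which is why parsers of this kind are easy to write by hand. An invalid input such as `2+` raises `SyntaxError`.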
Dynamic Linking
Dynamic linking allows an object module to include only the information that is required
at load time to execute a program.
There are two types of dynamic linking, they are Load time dynamic linking and Run
time dynamic linking.
Code Generation
Code generation is the process by which a compiler's code generator converts some
internal representation of source code into a form (e.g., machine code) that can be
readily executed by a machine.
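As a rough sketch of this step, the function below translates three-address code into 8086-style assembly text using only MOV, ADD and SUB through the AX register. The tuple layout (result, left, operator, right), the function name `generate`, and the convention that every operand is a named memory word are assumptions for the illustration; a real back end would also handle register allocation, jumps and other operators.

```python
# Assumed mapping from three-address operators to 8086 mnemonics.
OPCODES = {"+": "ADD", "-": "SUB"}


def generate(three_address_code):
    """Translate (result, left, op, right) tuples into assembly lines."""
    lines = []
    for result, left, op, right in three_address_code:
        lines.append(f"MOV AX, {left}")           # load the first operand
        if op is not None:                        # op is None for a plain copy
            lines.append(f"{OPCODES[op]} AX, {right}")
        lines.append(f"MOV {result}, AX")         # store the result
    return lines


# t1 = a + b ; x = t1 - c
tac = [("t1", "a", "+", "b"), ("x", "t1", "-", "c")]
asm = generate(tac)
```

For the two statements above, the sketch emits six instructions, three per statement. The store of `t1` followed immediately by a reload of `t1` is redundant, which is the kind of inefficiency a later optimization pass could remove.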
Code Optimization
Code optimization is the process of modifying the code so that some aspect of the
software or hardware works more efficiently or uses fewer resources, for example by
running faster or using memory more efficiently.
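Constant folding is one simple optimization of this kind: subexpressions whose operands are all known constants are evaluated at compile time. The sketch below assumes expressions written as nested tuples (operator, left, right), with integers for constants and strings for variable names; the encoding and the function name `fold` are hypothetical, and division is left out to keep the example short.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Replace constant subexpressions with their computed value."""
    if not isinstance(node, tuple):
        return node  # a constant (int) or a variable name (str)
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)  # both operands known: fold now
    return (op, left, right)         # otherwise keep the operation


folded = fold(("+", "x", ("*", 2, 3)))  # x + 2 * 3
```

Here the subexpression 2 * 3 is replaced by 6 while the addition with the variable x is kept for run time.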
A CFG (context-free grammar) is a grammar which naturally generates a formal language in which clauses can be
nested inside clauses arbitrarily deeply, but where grammatical structures are not
allowed to overlap.