Compiler CH1

Uploaded by

Senay Mekonnen
Department of Computer Science

Compiler Design ()

1. Introduction

Overview and History
• Cause
   • Software for early computers was written in assembly language.
   • The benefits of reusing software on different CPUs came to significantly outweigh the cost of writing a compiler.

• The first real compiler
   • The FORTRAN compilers of the late 1950s
   • Took 18 person-years to build

Why Study Compilers?
• Build a large, ambitious software system.
• Learn how to build programming languages.
• Learn how programming languages work.
• Learn the tradeoffs in language design.
• Compilers are needed for new platforms and for new languages.

What Do Compilers Do?
• A compiler acts as a translator, transforming human-oriented programming languages into computer-oriented machine languages.
• A compiler is a program that takes a program written in a source language and translates it into an equivalent program in a target language.

Input (source program in a programming language) → Compiler → Output (.exe)
The compiler also reports error messages.

The Structure of a Compiler
There are two major parts of a compiler: Analysis and Synthesis.

Compiler
• Analysis: Lexical Analyzer, Syntax Analyzer, and Semantic Analyzer
• Synthesis: Intermediate Code Generator, Code Optimizer, and Code Generator

Analysis and Synthesis
• In the analysis phase, an intermediate representation is created from the given source program.
   • The Lexical Analyzer, Syntax Analyzer, and Semantic Analyzer are the parts of this phase.
• In the synthesis phase, the equivalent target program is created from this intermediate representation.
   • The Intermediate Code Generator, Code Optimizer, and Code Generator are the parts of this phase.

Analysis and Synthesis Con…
• Analysis: The analysis part breaks up the source program into constituent pieces and creates an intermediate representation of the source program.
   • During analysis, the operations implied by the source program are determined and recorded in a hierarchical structure called a tree.
   • Often a special kind of tree called a syntax tree is used, in which each node represents an operation and the children of a node represent the arguments of the operation.
• Synthesis: The synthesis part constructs the desired target program from the intermediate representation. Of the two parts, synthesis requires the most specialized techniques.

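The syntax tree idea can be sketched in a few lines. This is only an illustration, written in Python for brevity; the class names `Num`, `Var`, and `BinOp` and the `evaluate` helper are hypothetical and do not come from the slides. Each interior node holds an operation, and its children hold the arguments of that operation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    """Leaf node: an integer constant."""
    value: int


@dataclass
class Var:
    """Leaf node: a variable name."""
    name: str


@dataclass
class BinOp:
    """Interior node: an operation whose children are its arguments."""
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, Var, BinOp]


def evaluate(node: Node, env: dict) -> int:
    """Walk the tree bottom-up; env maps variable names to values."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    left = evaluate(node.left, env)
    right = evaluate(node.right, env)
    if node.op == "+":
        return left + right
    if node.op == "*":
        return left * right
    raise ValueError(f"unknown operator {node.op!r}")


# Syntax tree for the expression oldval + 12
tree = BinOp("+", Var("oldval"), Num(12))
```

Calling `evaluate(tree, {"oldval": 30})` walks the two children of the `+` node and then applies the operation. A real compiler would translate the tree rather than evaluate it, but the traversal has the same shape.
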
Phases of A Compiler

Source Program
   ↓
Lexical Analyzer
   ↓
Syntax Analyzer
   ↓
Semantic Analyzer
   ↓
Intermediate Code Generator
   ↓
Code Optimizer
   ↓
Code Generator
   ↓
Target Program

Alongside all phases: Symbol table, Error handlers

• Each phase transforms the source program from one representation into another representation.
• They communicate with error handlers.
• They communicate with the symbol table.

Lexical Analyzer
• The Lexical Analyzer reads the source program character by character and returns the tokens of the source program.
• A token describes a pattern of characters having the same meaning in the source program (such as identifiers, operators, keywords, numbers, delimiters, and so on).
   Ex: newval := oldval + 12 => tokens:
      newval   identifier
      :=       assignment operator
      oldval   identifier
      +        add operator
      12       a number
• Puts information about identifiers into the symbol table.
• Regular expressions are used to describe tokens (lexical constructs).
• A (Deterministic) Finite State Automaton can be used in the implementation of a lexical analyzer.

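A minimal sketch of a regular-expression-driven lexical analyzer for the example above, written in Python for brevity. The token kind names (`identifier`, `assign_op`, `add_op`, `number`) and the shape of the symbol table are assumptions made for illustration, not definitions from the slides.

```python
import re

# Each token class is described by a regular expression (assumed names).
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("assign_op", r":="),
    ("add_op", r"\+"),
    ("skip", r"\s+"),  # whitespace separates tokens and is discarded
]
MASTER = re.compile(
    "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Scan source left to right; return (tokens, symbol_table)."""
    tokens = []
    symbol_table = {}
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at position {pos}"
            )
        kind = match.lastgroup
        lexeme = match.group()
        if kind == "identifier":
            # Record the identifier; later phases would fill in attributes.
            symbol_table.setdefault(lexeme, {})
        if kind != "skip":
            tokens.append((lexeme, kind))
        pos = match.end()
    return tokens, symbol_table
```

For `newval := oldval + 12` this yields the five tokens listed on the slide and a symbol table containing `newval` and `oldval`. Tools such as lex compile the same kind of patterns into a finite automaton instead of trying them through a general regex engine.
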
Syntax Analyzer
• A Syntax Analyzer creates the syntactic structure (generally a parse tree) of the given program.
• A syntax analyzer is also called a parser.
• A parse tree describes a syntactic structure.
• In a parse tree, all terminals are at the leaves.
• All inner nodes are non-terminals in a context-free grammar.

assgstmt
   identifier
      newval
   :=
   expression
      expression
         identifier
            oldval
      +
      expression
         number
            12

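A small recursive-descent parser shows how such a parse tree might be built. This is a sketch under stated assumptions: it is written in Python for brevity, the tokens arrive as (lexeme, kind) pairs in a hypothetical format, the tree is a nested tuple, and the ambiguous rule expression → expression + expression is resolved left-associatively.

```python
def parse_assignment(tokens):
    """Parse  assgstmt -> identifier := expression  into a nested tuple."""
    pos = 0

    def expect(kind):
        # Consume one token of the given kind or report a syntax error.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][1] != kind:
            found = tokens[pos][0] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind}, found {found}")
        lexeme = tokens[pos][0]
        pos += 1
        return lexeme

    def operand():
        # expression -> identifier | number
        if pos < len(tokens) and tokens[pos][1] == "number":
            return ("expression", ("number", expect("number")))
        return ("expression", ("identifier", expect("identifier")))

    def expression():
        # expression -> expression + expression (grouped to the left)
        left = operand()
        while pos < len(tokens) and tokens[pos][1] == "add_op":
            expect("add_op")
            right = operand()
            left = ("expression", left, "+", right)
        return left

    target = ("identifier", expect("identifier"))
    expect("assign_op")
    tree = ("assgstmt", target, ":=", expression())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos][0]}")
    return tree
```

Terminals (`newval`, `:=`, `oldval`, `+`, `12`) end up at the leaves of the returned tuple, and every inner tuple is labeled with a non-terminal, matching the two properties listed above.
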
Syntax Analyzer versus Lexical Analyzer
• Which constructs of a program should be recognized by the lexical analyzer, and which ones by the syntax analyzer?
   • Both do similar things, but the lexical analyzer deals with simple, non-recursive constructs of the language.
   • The syntax analyzer deals with recursive constructs of the language.
   • The lexical analyzer simplifies the job of the syntax analyzer.
   • The lexical analyzer recognizes the smallest meaningful units (tokens) in a source program.
   • The syntax analyzer works on the smallest meaningful units (tokens) in a source program to recognize meaningful structures in the programming language.

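The split between non-recursive and recursive constructs can be illustrated with two small recognizers. This is a sketch in Python with hypothetical function names: an identifier is matched by a single regular expression, while nested parentheses need a routine that calls itself, which is the kind of work left to the syntax analyzer.

```python
import re

# Non-recursive construct: one regular expression is enough.
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def is_identifier(text):
    return IDENTIFIER.fullmatch(text) is not None


def nesting_depth(text):
    """Recursive construct:  group -> '(' group* ')'.

    Returns the maximum nesting depth of a string of parentheses and
    raises SyntaxError if the parentheses are not balanced.
    """

    def group(pos):
        # Parse one group starting at '('; return (depth, next position).
        depth = 0
        pos += 1  # consume '('
        while pos < len(text) and text[pos] == "(":
            inner, pos = group(pos)  # the rule refers to itself here
            depth = max(depth, inner)
        if pos >= len(text) or text[pos] != ")":
            raise SyntaxError("unbalanced parentheses")
        return depth + 1, pos + 1

    deepest, pos = 0, 0
    while pos < len(text):
        if text[pos] != "(":
            raise SyntaxError("unbalanced parentheses")
        inner, pos = group(pos)
        deepest = max(deepest, inner)
    return deepest
```

The identifier check needs no memory of what came before, whereas `nesting_depth` must remember how many groups are still open, which a finite automaton cannot do for arbitrary depth.
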
Semantic Analyzer
• A semantic analyzer checks the source program for semantic errors and collects the type information for code generation.
• Type checking is an important part of the semantic analyzer.
• Normally, semantic information cannot be represented by the context-free languages used in syntax analyzers.
• The context-free grammars used in syntax analysis are integrated with attributes (semantic rules).
   • The result is a syntax-directed translation.
   • Attribute grammars
• Ex: newval := oldval + 12
   • The type of the identifier newval must match the type of the expression (oldval + 12).

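A type check for the example assignment might look like the sketch below. It is written in Python for brevity and rests on assumptions that are not in the slides: the symbol table maps each identifier to a declared type name (`"int"` or `"real"`), the right-hand side is a flat sum of operands, and an expression containing any `real` operand has type `real`.

```python
def check_assignment(target, operands, symbol_table):
    """Type-check  target := operand + operand + ...

    Returns the type of the expression, or raises NameError for an
    undeclared identifier and TypeError for a type mismatch.
    """

    def type_of(operand):
        if isinstance(operand, int):
            return "int"    # integer literal such as 12
        if isinstance(operand, float):
            return "real"   # literal such as 1.5
        if operand not in symbol_table:
            raise NameError(f"undeclared identifier {operand}")
        return symbol_table[operand]

    # Assumed rule: the sum is real if any operand is real, else int.
    expr_type = "int"
    for operand in operands:
        if type_of(operand) == "real":
            expr_type = "real"

    if target not in symbol_table:
        raise NameError(f"undeclared identifier {target}")
    if symbol_table[target] != expr_type:
        raise TypeError(
            f"cannot assign {expr_type} expression to "
            f"{symbol_table[target]} variable {target}"
        )
    return expr_type
```

With `newval` and `oldval` both declared `int`, `check_assignment("newval", ["oldval", 12], table)` succeeds. Declaring `oldval` as `real` makes the same call raise `TypeError`, which is the mismatch the slide describes.
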
Intermediate Code Generation
• A compiler may produce explicit intermediate code representing the source program.
• This intermediate code is generally machine (architecture) independent, but its level is close to the level of machine code.
• Ex:
   newval := oldval * fact + 1

   id1 := id2 * id3 + 1

   Intermediate Code (Quadruples):
   MULT id2,id3,temp1
   ADD temp1,#1,temp2
   MOV temp2, ,id1

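The quadruples above can be produced by a post-order walk over the expression. The sketch below is in Python for brevity and uses a hypothetical input format: an expression is a name, an integer constant, or a nested tuple `(op, left, right)`. The `OPCODES` table and the function name are assumptions for illustration.

```python
OPCODES = {"*": "MULT", "+": "ADD"}


def gen_quadruples(target, expr):
    """Translate  target := expr  into (operation, left, right, result)."""
    quads = []
    temp_count = 0

    def walk(node):
        # Return the name of the place that holds the value of node.
        nonlocal temp_count
        if isinstance(node, int):
            return f"#{node}"  # immediate constant, written as on the slide
        if isinstance(node, str):
            return node        # an identifier is its own place
        op, left, right = node
        left_place = walk(left)
        right_place = walk(right)
        temp_count += 1
        result = f"temp{temp_count}"  # fresh temporary for this operation
        quads.append((OPCODES[op], left_place, right_place, result))
        return result

    place = walk(expr)
    quads.append(("MOV", place, "", target))
    return quads
```

For `id1 := id2 * id3 + 1`, written as `gen_quadruples("id1", ("+", ("*", "id2", "id3"), 1))`, the function returns the MULT, ADD, and MOV quadruples shown in the example.
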
Code Optimizer (for Intermediate Code Generator)
• The code optimizer optimizes the code produced by the intermediate code generator in terms of time and space.
• Ex:
   MULT id2,id3,temp1
   ADD temp1,#1,id1

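One way to obtain the shorter sequence is a peephole pass that folds a trailing MOV into the instruction that computed its source. This is a sketch in Python, not necessarily the method the slides have in mind; it assumes quadruples are `(operation, left, right, result)` tuples and that temporaries are the names starting with `temp`.

```python
def used_elsewhere(name, quads, skip):
    """True if any quadruple other than quads[skip] reads name."""
    return any(
        name in (quad[1], quad[2])
        for index, quad in enumerate(quads)
        if index != skip
    )


def fold_moves(quads):
    """Rewrite  OP a,b,t ; MOV t, ,d  as  OP a,b,d  when t is a
    temporary that no other instruction reads."""
    optimized = []
    i = 0
    while i < len(quads):
        quad = quads[i]
        nxt = quads[i + 1] if i + 1 < len(quads) else None
        if (
            nxt is not None
            and nxt[0] == "MOV"
            and nxt[1] == quad[3]
            and quad[3].startswith("temp")
            and not used_elsewhere(quad[3], quads, skip=i + 1)
        ):
            # Write the result straight into the MOV destination.
            optimized.append((quad[0], quad[1], quad[2], nxt[3]))
            i += 2
        else:
            optimized.append(quad)
            i += 1
    return optimized
```

Applied to the three quadruples for `id1 := id2 * id3 + 1`, the pass removes `temp2` and the MOV, leaving the two instructions shown in the example. The `used_elsewhere` check keeps the rewrite safe when a temporary is read again later.
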
Code Generator
• Produces the target language for a specific architecture.
• The target program is normally a relocatable object file containing the machine code.
• Ex: (assume that we have an architecture with instructions in which at least one of the operands is a machine register)

   MOVE id2,R1
   MULT id3,R1
   ADD #1,R1
   MOVE R1,id1

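A very small code generator for a one-register machine can reproduce the listing above. The sketch is in Python for brevity and makes several assumptions that are not in the slides: input is a list of `(operation, left, right, result)` quadruples, `OP src,R1` means R1 := R1 OP src, only register R1 is used, and each temporary (a name starting with `temp`) is read at most once.

```python
def generate(quads):
    """Emit two-operand instructions for a machine with one register."""
    code = []
    in_reg = None  # name of the value currently held in R1
    for i, (op, left, right, result) in enumerate(quads):
        if left != in_reg:
            code.append(f"MOVE {left},R1")  # load the left operand
        code.append(f"{op} {right},R1")     # R1 := R1 op right
        in_reg = result
        nxt = quads[i + 1] if i + 1 < len(quads) else None
        consumed_next = nxt is not None and nxt[1] == result
        # A temporary used immediately as the next left operand can stay
        # in R1; every other result is stored back to memory.
        if not (result.startswith("temp") and consumed_next):
            code.append(f"MOVE R1,{result}")
    return code
```

For the optimized quadruples `MULT id2,id3,temp1` and `ADD temp1,#1,id1` this emits the four instructions of the example, keeping `temp1` in R1 instead of storing and reloading it. A production code generator would add register allocation across several registers and instruction selection.
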
The Structure of a Compiler: Another Example

Scanner [Lexical Analyzer]
   ↓ Tokens
Parser [Syntax Analyzer]
   ↓ Parse tree
Semantic Process [Semantic Analyzer]
   ↓ Abstract Syntax Tree w/ Attributes
Code Generator [Intermediate Code Generator]
   ↓ Non-optimized Intermediate Code
Code Optimizer
   ↓ Optimized Intermediate Code
Code Generator
   ↓ Target machine code

Compiler Construction Tools
• Programs to be discussed:
   • lex – Programming utility that generates a lexical analyzer
   • yacc – Parser generator
   • gcc – The GNU Compiler Collection C compiler

General Compiler Infra-structure

Program source (stream of characters)
   ↓
Scanner (tokenizer)
   ↓ Tokens
Parser
   ↓ Syntactic Structure
Semantic Routines
   ↓ IR: Intermediate Representation (1)
Analysis/Transformations/optimizations
   ↓ IR: Intermediate Representation (2)
Code Generator
   ↓
Assembly code

Alongside these stages: Symbol and Attribute Tables

lex Programming Utility

General Information:
• Input is stored in a file with a .l extension
• The file consists of three main sections
• lex generates a C function stored in lex.yy.c
Using lex:
1) Specify the words to be used as tokens (using an extension of regular expressions)
2) Run the lex utility on the source file to generate yylex(), a C function
3) The generated code declares the global variables char* yytext and int yyleng

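lex Programming Utility

Three sections of a lex input file:

/* C declarations, #includes, and lex definitions */
%{
#include <stdio.h>
#include "header.c"
int i;
%}
/* Assumed definition: the rule below uses {INT}, which the slide does not define */
INT     [0-9]+

%%
        /* lex patterns and actions */
{INT}   { sscanf(yytext, "%d", &i);
          printf("INTEGER\n"); }
%%

/* C functions called by the above actions */
int yywrap(void) { return 1; }   /* no further input files */
int main(void) { yylex(); return 0; }
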
yacc Parser Generator
General Information:
• Input is a specification of a language
• Output is a parser for that language, the core of a compiler
• yacc generates a C function stored in y.tab.c
• A free version is available: GNU bison

Using yacc:
1) Generates a C function called yyparse()
2) yyparse() may include calls to yylex()
3) Compile this function to obtain the compiler

yacc Parser Generator

yacc source → yacc → y.tab.c
lex source → lex → lex.yy.c (pulled into y.tab.c with #include "lex.yy.c")
y.tab.c → cc → a.out

• Input source file – similar to lex input file
   • Declarations, Rules, Support routines
• Four parts of output atom: (Operation, Left Operand, Right Operand, Result)

Lex & Yacc

gcc Compiler
• General Information:
   • gcc is the GNU Project C compiler
   • A command-line program
   • gcc takes C source files as input
   • Outputs an executable: a.out
   • You can specify a different output filename

To compile, simply type: gcc -o hello hello.c -g -Wall

   • The '-o' option tells the compiler to name the executable 'hello'
   • The '-g' option adds symbolic information to hello for debugging
   • '-Wall' tells it to print a broad set of common warnings (very useful!)
   • Can also give '-O3' for aggressive optimization (gcc accepts higher levels such as '-O6' but treats them like '-O3')
