
Ch1_Introduction

Chapter 1 introduces compiler design, covering the evolution of programming languages, the definition and history of compilers, and related programs such as interpreters and assemblers. It outlines the phases of compiler design, including lexical analysis, syntax analysis, and semantic analysis, as well as the importance of studying compilers for understanding programming languages and improving programming skills. The chapter also discusses the roles of various tools and processes involved in compiling source code into executable programs.

Uploaded by

fikireselamgirma
Copyright
© All Rights Reserved

Chapter 1

Introduction to Compiler Design

Chapter – 1 : Introduction to Compiler Design 1 Bahir Dar Institute of Technology


Contents
 The Evolution of Programming Languages
 What is a compiler?
 History of compilers
 Programs related to compilers / cousins of the compiler
 Why Study Compilers?
 Analysis of the source program
 Phases of Compiler Design
• Scanner
• Parser
• Semantic Analyzer
• Intermediate Code Generator
• Code Optimizer
• Code Generator
 Symbol Tables and Error Handling
 Compiler Construction Tools

The Evolution of Programming Languages
 1940s - the first electronic computers were invented
• They were programmed in machine language, as sequences of 0s and 1s
• Tedious, error-prone, machine-dependent, and hard to understand and modify, but fast to run
 Early 1950s - mnemonic assembly languages were developed
• At first they were just mnemonic representations of machine instructions; later, macro instructions were added
 Latter half of the 1950s - higher-level languages were developed

The Evolution of Programming Languages …
• In the following decades, many more languages were created, and today there are thousands of programming languages
 They can be classified in a variety of ways.
• Classification based on generation
  • First-generation languages - machine languages
  • Second-generation languages - assembly languages
  • Third-generation languages - higher-level languages like Fortran, Cobol, Lisp, C, C++, C#, and Java
  • Fourth-generation languages - languages designed for specific applications, e.g. NOMAD for report generation, SQL for database queries
  • Fifth-generation languages - logic- and constraint-based languages like Prolog and OPS5

The Evolution of Programming Languages …
 Another classification
• Imperative languages - a program specifies how a computation is to be done.
E.g. C, C++, C#, and Java
• Declarative languages - a program specifies what computation is to be done.
E.g. Prolog, ML, and Haskell
 Within the declarative and imperative families, there are several important subclasses.


What is a compiler?
 A computer's CPU can execute only very simple, primitive operations (move, add, …)
• Recall this from your study of assembly language or computer organization
 Hence, a program for a computer must ultimately be expressed in machine language
 However, writing machine language by hand is a tedious and error-prone process
• That is why high-level programming languages are used
 Programs written in high-level languages can be very different from machine language
• So some means of bridging the gap is required
• This is where the compiler comes in.


What is a compiler? …
 A compiler is
• a program that translates a program written in a high-level programming language (suitable for human programmers) into low-level machine language (which is required by computers);
• more generally, a program that takes a program written in a source language and translates it into an equivalent program in a target language.
[Figure: source program → COMPILER → target program, with error messages as a second output of the compiler. The source program is normally written in a high-level programming language; the target program is normally the equivalent program in machine code or assembly language.]

History of compilers
 1940s:
• Early stored-program computers were programmed in machine language.
• Later, assembly languages were developed.
 1950s:
• Early high-level languages, such as FORTRAN, were developed.
• Compiler writing was a huge task.
 1960s onwards / now:
• Compiler construction has been intensively studied.
• Using software tools, a compiler can be written in a few months.

Programs related to compilers (COUSINS OF COMPILER)
 There are other translators/programs that are related to or used together with compilers, and that often come bundled with compilers in a complete language development environment.
 In general, a translator is a program that translates one language to another.
 Types of translator:
1. Interpreter 2. Compiler 3. Assembler
 Interpreters - directly execute the operations specified in the source program on inputs supplied by the user
• Do not produce a target program as a translation

Programs related to compilers (COUSINS OF COMPILER)
 Compilers vs. Interpreters
• A compiler translates the entire program at once, before it runs, whereas an interpreter translates and executes the program one statement at a time.
• Languages typically implemented with compilers: FORTRAN, COBOL, C, C++, Pascal, PL/1
• Languages typically implemented with interpreters: Lisp, Scheme, BASIC, APL, Perl, Python, Smalltalk
• Pros of compilers: fast execution (an executable file is created)
• Cons of compilers: slow processing (the whole program is translated before it can run), harder debugging (improved by IDEs), more memory required (due to object code)
• Pros of interpreters: easy debugging, fast development, less memory required
• Cons of interpreters: not well suited to large projects, slower execution

Programs related to compilers (COUSINS OF COMPILER)
Compilers: Translate a source (human-writable) program to an executable (machine-readable) program
Interpreters: Convert a source program and execute it at the same time.
Ideally, i.e.:
[Figure, compiler: Source code → Compiler → Executable code; then Input data → Executable → Output data]
[Figure, interpreter: Source code + Input data → Interpreter → Output data]
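The difference between the two diagrams can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple expression format and the PUSH/APPLY stack machine are invented for the example and do not come from the slides. The interpreter executes the expression directly, while the compiler first produces a target program that is run in a separate step.

```python
import operator

# Hypothetical source format: a number, or a tuple (operator, left, right).
OPS = {"+": operator.add, "*": operator.mul}

def interpret(expr):
    """Interpreter: execute the operations directly; no target program is produced."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](interpret(left), interpret(right))

def compile_expr(expr):
    """Compiler: translate the expression into a target program
    (a list of instructions for an invented stack machine)."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, left, right = expr
    return compile_expr(left) + compile_expr(right) + [("APPLY", op)]

def run(program):
    """Execute a previously compiled target program."""
    stack = []
    for instruction, argument in program:
        if instruction == "PUSH":
            stack.append(argument)
        else:  # APPLY: pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[argument](left, right))
    return stack.pop()

expr = ("+", 2, ("*", 3, 4))
target = compile_expr(expr)   # translation happens once
```

Calling `run(target)` repeatedly reuses the translated program, whereas `interpret(expr)` walks the source form again on every call.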


Programs related to compilers (COUSINS OF COMPILER)
 Assemblers - convert a program in assembly language into its equivalent program in machine language
 Linkers - a linker is a program that takes one or more object files generated by compilers or assemblers and combines them into a single executable program.


Programs related to compilers (COUSINS OF COMPILER)
 Loaders - load an executable into memory and start it running.
 Editors - programs used to write/edit source code
 Debuggers - programs used to find execution errors in a compiled program
 Preprocessors -
• A source program may be divided into modules stored in separate files.
• The task of collecting the source program is sometimes entrusted to a separate program, called a preprocessor.
• It may also expand shorthands, called macros, into source language statements.

Programs related to compilers (COUSINS OF COMPILER)
 Preprocessors produce input for compilers.
 They may perform the following functions.
• 1. Macro processing: a preprocessor may allow a user to define macros that are shorthands for longer constructs.
• 2. File inclusion: a preprocessor may include header files into the program text.
• 3. Rational preprocessors: these preprocessors augment older languages with more modern flow-of-control and data-structuring facilities,
e.g. while-statements or if-statements, where none exist in the programming language itself.
• 4. Language extensions: these preprocessors attempt to add capabilities to the language by what amounts to built-in macros.
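The first two functions, macro processing and file inclusion, can be sketched as follows. This is a simplified illustration, not how any particular preprocessor is implemented: the C-like `#define` and `#include` directives are handled only in their simplest form, and the `files` dictionary is a hypothetical stand-in for the file system.

```python
import re

def preprocess(text, files):
    """Expand '#include "name"' lines and object-like '#define NAME value' macros.

    `files` is a hypothetical in-memory mapping from file name to file contents.
    """
    macros = {}
    output = []
    pending = text.splitlines()
    while pending:
        line = pending.pop(0)
        include = re.fullmatch(r'\s*#include\s+"([^"]+)"\s*', line)
        define = re.fullmatch(r"\s*#define\s+(\w+)\s+(.*)", line)
        if include:
            # File inclusion: splice the included file's lines in place.
            pending = files[include.group(1)].splitlines() + pending
        elif define:
            # Macro processing: remember the shorthand and its expansion.
            macros[define.group(1)] = define.group(2).strip()
        else:
            # Replace whole-word occurrences of each macro name defined so far.
            for name, body in macros.items():
                line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, b=body: b, line)
            output.append(line)
    return "\n".join(output)

files = {"limits.h": "#define MAX 100"}
source = '#include "limits.h"\n#define STEP 2\nx = MAX + STEP;'
result = preprocess(source, files)
```

The result is the modified source program that would be handed to the compiler; here the directives disappear and only the expanded statement remains.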
Language Processing System
source program
↓ preprocessor
modified source program
↓ compiler
target assembly program
↓ assembler
relocatable machine code
↓ linker/loader (together with library files)
target machine code


Language Processing System
 A source program may be divided into modules stored in separate files.
 The task of collecting the source program is sometimes entrusted to a separate program, called a preprocessor.
 The preprocessor may also expand shorthands, called macros, into source language statements.
 Large programs are often compiled in pieces, so the relocatable machine code may have to be linked together with other relocatable object files and library files into the code that actually runs on the machine.
 The linker resolves external memory addresses, where the code in one file may refer to a location in another file.
 The loader then puts together all of the executable object files into memory for execution.

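How a linker resolves external addresses can be sketched with a two-pass procedure. The object-file layout used here (a dictionary with `size`, `defs`, and `refs`) is purely hypothetical and far simpler than any real object format; it only shows the idea of assigning base addresses first and patching references second.

```python
def link(object_files):
    """Combine relocatable object files and resolve external references.

    Each object file is a hypothetical dict with:
      "size": number of memory words it occupies,
      "defs": {symbol: offset within this file},
      "refs": [(offset within this file, symbol referred to)].
    """
    # Pass 1: give each file a base address and record absolute symbol addresses.
    base, bases, symbols = 0, [], {}
    for obj in object_files:
        bases.append(base)
        for name, offset in obj["defs"].items():
            symbols[name] = base + offset
        base += obj["size"]

    # Pass 2: patch every reference with the absolute address of its symbol.
    patches = {}
    for obj, obj_base in zip(object_files, bases):
        for offset, name in obj["refs"]:
            if name not in symbols:
                raise NameError(f"undefined symbol: {name}")
            patches[obj_base + offset] = symbols[name]
    return symbols, patches

main_o = {"size": 10, "defs": {"main": 0}, "refs": [(4, "helper")]}
util_o = {"size": 6, "defs": {"helper": 2}, "refs": []}
symbols, patches = link([main_o, util_o])
```

In this example the reference at word 4 of the first file is patched to point at `helper`, which ends up at address 12 once the second file is placed after the first.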
Compiler construction is related to other Computer Science topics
 Theory - Finite State Automata, Grammars and Parsing
 Algorithms - Graph manipulation, dynamic programming
 Data structures - Symbol tables, abstract syntax trees
 Systems - Allocation and naming, multi-pass systems, compiler construction
 Computer Architecture - Memory hierarchy, instruction selection, interlocks and latencies, parallelism
 Security - Detection of and protection against vulnerabilities
 Software Engineering - Software development environments, debugging
 Artificial Intelligence - Heuristic based search for

Analysis of the source program
 In compiling, analysis consists of three phases:
• Linear analysis, in which the stream of characters making up the source program is read from left to right and grouped into tokens, which are sequences of characters having a collective meaning.
• Hierarchical analysis, in which characters or tokens are grouped hierarchically into nested collections with collective meaning.
• Semantic analysis, in which certain checks are performed to ensure that the components of a program fit together meaningfully.
• See the slides on the phases of a compiler for details.

Why Study Compilers?
 Compilers enable programming in a high-level language instead of in machine instructions.
• Malleability, portability, modularity, programmer productivity
 Studying them increases understanding of language semantics
 Seeing the machine code generated for language constructs helps in understanding performance issues for languages
 Teaches good language design
 New devices may need device-specific languages

Why Study Compilers?
 Become a better programmer
• Insight into the interaction between languages, compilers, and hardware
 Compiler techniques are everywhere
• Parsing (little languages, interpreters, HTML)
• Database engines, query languages
• Text processing
 Fascinating blend of theory and engineering
• Direct applications of theory to practice
  • Parsing, scanning, static analysis
  • Resource allocation, “optimization”, etc.
 You might even write a compiler some day!

Grouping of Phases into Passes / Parts of compilation
 A compiler is not a single box that maps a source program into a target program.
 There are two parts to this mapping: analysis and synthesis
• Analysis (front part) [lexical, syntax, and semantic analysis]
  • Breaks up the source program into constituent pieces
  • Creates an intermediate representation of the source program
  • Reports any errors detected
  • Stores information about the source program in a data structure called a symbol table
  • Machine independent / language dependent, because these phases depend primarily on the source language
• Synthesis (back part) [code generation + code optimization]
  • Constructs the desired target program from the intermediate representation and the information in the symbol table
  • Machine dependent / language independent, because these phases depend on the target machine
 The compilation process operates as a sequence of phases,
• each of which transforms one representation of the source program into another

The Phases of a Compiler…


Lexical Analyzer (Scanner)
 Also called the lexer
 How it works:
• Reads characters from the source program.
• Groups the characters into lexemes (sequences of characters that "go together").
• Each lexeme corresponds to a token;
• i.e. for each lexeme, the lexical analyzer produces as output a token of the form (token-name, attribute-value)
• The scanner returns the next token (plus maybe some additional information) to the parser.
• The scanner may also discover lexical errors (e.g., erroneous characters).
• Starts filling the symbol table with the new symbols found

Lexical Analyzer (Scanner) …
 Tokens include, e.g.:
• “Reserved words”: do if float while
• Special characters: ( { , + - = ! /
• Names & numbers: myValue, 3.07e02
 What counts as a lexeme, a token, or a bad character depends on the definition of the source language.
 Examples of tools for lexical analysis are
• Lex
• flex
 A lexeme is a sequence of characters in the source program that is matched by the pattern for a token.


Lexical Analyzer - Examples
 Example 1: consider the statement sum = 3 + 2; in the C programming language. It is tokenized as in the table:
Lexeme   Token        Token type
sum      identifier   Identifier
=        assign       Assignment operator
3        number       Integer literal
+        addition     Addition operator
2        number       Integer literal
;        semicolon    End of statement
 Example 2: position := initial + rate * 60 ;
• All are lexemes: position, :=, initial, +, rate, *, 60 and ;
• Blanks, line breaks, etc. are scanned out

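A scanner for these two examples can be sketched with regular expressions. This is a minimal illustration rather than the output of Lex or flex: the token names follow the table above where possible, while `mult`, `skip`, and `error` are names assumed for the sketch, and the patterns cover only the lexemes that appear in the examples.

```python
import re

# Token names and patterns; alternatives are tried in the order listed.
TOKEN_SPEC = [
    ("number",     r"\d+(?:\.\d+)?"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("assign",     r":=|="),
    ("addition",   r"\+"),
    ("mult",       r"\*"),
    ("semicolon",  r";"),
    ("skip",       r"\s+"),   # blanks and line breaks are scanned out
    ("error",      r"."),     # any other character is a lexical error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token-name, lexeme) pairs for the source string."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "skip":
            continue
        if kind == "error":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        tokens.append((kind, lexeme))
    return tokens

tokens = tokenize("sum = 3 + 2;")
```

Each pair plays the role of the (token-name, attribute-value) form described earlier, with the lexeme itself standing in for the attribute value.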
Syntax Analyzer (Parser)
 Also known as hierarchical analysis / parsing
 Constructs a parse tree from symbols
 A pattern-matching problem
• The language grammar is defined by a set of rules that identify legal (meaningful) combinations of symbols
• Each application of a rule results in a node in the parse tree
• The parser applies these rules repeatedly to the program until the leaves of the parse tree are “atoms”
 If no pattern matches, it is a syntax error
 YACC and bison are tools for this

Syntax Analyzer - Example
 Source code:
position = initial + rate * 60;
 Abstract-syntax tree:
[Figure: abstract-syntax tree with = at the root; its children are position and +; the children of + are initial and *; the children of * are rate and 60]
• interior nodes of the tree are OPERATORS;
• a node’s children are its OPERANDS;
• each sub-tree forms a logical unit.
• The sub-tree with * at its root shows that, because * has higher precedence than +, the operation “rate * 60” must be performed as a unit.

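The way a parser builds such a tree can be sketched with a small recursive-descent parser. The grammar in the docstring is an assumption made for this sketch (the slides do not give one), the tree is represented as nested tuples, and the built-in tokenizer silently ignores characters it does not recognize. It also does not check that the assignment target is an identifier.

```python
import re

OPERATORS = {"=", "+", "*", ";"}

def parse_assignment(source):
    """Parse 'name = expression;' into a nested-tuple abstract syntax tree.

    Assumed grammar:
      statement -> operand '=' expr ';'
      expr      -> term ('+' term)*
      term      -> operand ('*' operand)*
      operand   -> identifier | number
    """
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[=+*;]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance(expected=None):
        nonlocal pos
        token = peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'an operand'}, found {token!r}")
        pos += 1
        return token

    def operand():
        token = advance()
        if token in OPERATORS:
            raise SyntaxError(f"expected an operand, found {token!r}")
        return token

    def term():
        node = operand()
        while peek() == "*":          # '*' binds tighter, so it is handled lower down
            advance("*")
            node = ("*", node, operand())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance("+")
            node = ("+", node, term())
        return node

    target = operand()
    advance("=")
    value = expr()
    advance(";")
    return ("=", target, value)

tree = parse_assignment("position = initial + rate * 60;")
```

Because `expr` calls `term` for each operand of +, the sub-tree for `rate * 60` is completed before it is attached under +, which is how the precedence of * over + shows up in the tree.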
Semantic Analyzer
 Checks the source program for semantic errors, e.g., type errors
• Annotates and/or changes the abstract syntax tree based on the attribute grammar
• Annotates a node that represents an expression with its type.
• Example with before and after:
 The most important activity in this phase:
• Type checking - the compiler checks that each operator has operands that are permitted by the source language specification.

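Type checking with annotation can be sketched on a nested-tuple tree for position = initial + rate * 60. Everything here is an assumption for illustration: the `SYMBOLS` table declaring all three identifiers as float, the two-type system (int and float), and the rule that an int operand is wrapped in an `inttofloat` node when it meets a float.

```python
# Hypothetical symbol table: every identifier in the example is declared float.
SYMBOLS = {"position": "float", "initial": "float", "rate": "float"}

def annotate(node):
    """Return (checked_node, type), inserting inttofloat where an int meets a float."""
    if isinstance(node, int):
        return node, "int"
    if isinstance(node, str):
        if node not in SYMBOLS:
            raise NameError(f"undeclared identifier: {node}")
        return node, SYMBOLS[node]
    op, left, right = node
    left, left_type = annotate(left)
    right, right_type = annotate(right)
    if left_type != right_type:
        # Type checking: coerce the int operand so both operands are float.
        # (A real checker would treat the target of '=' differently.)
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        left_type = "float"
    return (op, left, right), left_type

tree = ("=", "position", ("+", "initial", ("*", "rate", 60)))
typed_tree, result_type = annotate(tree)
```

After the call, the integer 60 is wrapped in a conversion node and every expression node has been found to be of type float.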
Intermediate Code Generator
 Translates from the abstract-syntax tree to intermediate code
 In other words, it takes its input from semantic analysis and converts it into intermediate code such as:
• 3-address code
• Each statement contains at most 3 operands, in addition to “:=”
• An “easy” and “universal” format that can be translated into most assembly languages.
• Here is an example of 3-address code for the abstract-syntax tree shown on the preceding slide.
– t1 = inttofloat(60)
– t2 = id3 * t1
– t3 = id2 + t2
– id1 = t3
 NB: Three-address code consists of a sequence of instructions, each of which has at most three operands.

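The translation from tree to three-address code can be sketched as a post-order walk that introduces a fresh temporary for every interior node. The nested-tuple tree format and the helper names are assumptions of this sketch; the identifiers id1, id2, id3 and the `inttofloat` operation are the ones used in the example above.

```python
def generate_tac(tree):
    """Flatten a nested-tuple syntax tree into three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        """Emit code for node and return the name or constant holding its value."""
        if not isinstance(node, tuple):
            return str(node)
        if node[0] == "inttofloat":
            operand = emit(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttofloat({operand})")
            return temp
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, value = tree          # the root is the assignment
    code.append(f"{target} = {emit(value)}")
    return code

tree = ("=", "id1", ("+", "id2", ("*", "id3", ("inttofloat", 60))))
tac = generate_tac(tree)
```

Operands are emitted before the operator that uses them, so each instruction refers only to names that already hold a value.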
Code Optimization
 Improves the efficiency of the intermediate code.
• The goal may be to make the code run faster and/or to use the least number of registers
Before optimization:
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
After optimization:
t2 = id3 * 60.0
id1 = id2 + t2
 Current trends:
• to obtain smaller, but maybe slower, equivalent code for embedded systems;
• to reduce power consumption
• to enable parallelism

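The two improvements in this example can be sketched as follows: the conversion of the constant 60 is done once at compile time, and the final copy is merged into the instruction that computes the copied value. The (dest, op, arg1, arg2) tuple format and the `copy` operation name are assumptions of this sketch, and the copy removal is only safe under the stated assumption that each temporary is used exactly once.

```python
def optimize(code):
    """Apply two simple improvements to a list of (dest, op, arg1, arg2) tuples."""
    constants = {}   # temporaries whose value is known at compile time
    folded = []
    for dest, op, arg1, arg2 in code:
        # Replace operands that are known constants.
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttofloat" and isinstance(arg1, int):
            constants[dest] = float(arg1)   # convert once, at compile time
            continue
        folded.append((dest, op, arg1, arg2))

    # Remove a copy 'x = t' that directly follows the instruction computing t
    # (assumes the temporary t is not used anywhere else).
    result = []
    for instruction in folded:
        dest, op, arg1, _ = instruction
        if op == "copy" and result and result[-1][0] == arg1:
            _, prev_op, prev_arg1, prev_arg2 = result.pop()
            result.append((dest, prev_op, prev_arg1, prev_arg2))
        else:
            result.append(instruction)
    return result

before = [
    ("t1", "inttofloat", 60, None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
after = optimize(before)
```

The four instructions shrink to two, corresponding to t2 = id3 * 60.0 and id1 = id2 + t2.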
Code Generation
 A compiler may generate
• pure machine code (machine-dependent assembly language) directly, which is rare now;
• virtual machine code.
 Generates object code from (optimized) intermediate code
Intermediate code:
t2 = id3 * 60.0
id1 = id2 + t2
Object code:
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
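A very naive code generator for this example can be sketched as below. It reuses the LDF, MULF, ADDF, and STF mnemonics from the slide, but the instruction tuple format, the rule that names of the form t1, t2, … are temporaries kept in registers, and the register numbering are all assumptions of the sketch. It hands out registers in increasing order, so its output matches the slide's code with R1 and R2 swapped.

```python
import re

OPCODES = {"+": "ADDF", "*": "MULF"}

def generate(code):
    """Translate (dest, op, arg1, arg2) tuples into assembly-like text lines.

    Assumes arg1 is a variable or temporary, arg2 may also be a float constant,
    and each temporary is used only once.
    """
    assembly = []
    location = {}        # temporary name -> register currently holding it
    next_register = 1

    def in_register(arg):
        """Return a register holding arg, loading it from memory if needed."""
        nonlocal next_register
        if arg in location:
            return location[arg]
        register = f"R{next_register}"
        next_register += 1
        assembly.append(f"LDF {register}, {arg}")
        return register

    for dest, op, arg1, arg2 in code:
        left = in_register(arg1)
        # Float constants become immediate operands.
        right = f"#{arg2}" if isinstance(arg2, float) else in_register(arg2)
        assembly.append(f"{OPCODES[op]} {left}, {left}, {right}")
        if re.fullmatch(r"t\d+", dest):
            location[dest] = left                      # temporaries stay in registers
        else:
            assembly.append(f"STF {dest}, {left}")     # program variables go to memory
    return assembly

optimized = [("t2", "*", "id3", 60.0), ("id1", "+", "id2", "t2")]
assembly = generate(optimized)
```

A production code generator would also decide which values deserve registers and what to do when registers run out; this sketch ignores both questions.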


Phases of Compilers (Summary)


Symbol Table
 Symbol table management is a part of the compiler that interacts with several of the phases
– Identifiers and their values are found in lexical analysis and placed in the symbol table
– During syntactic and semantic analysis, type and scope information is added
– During code generation, type information is used to determine which instructions to use
– During optimization, “live analysis” information may be kept in the symbol table
 Most suitably implemented as a dynamic data structure (linear list, binary tree, hash table)

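One common way to realise the hash-table option with scope information is a stack of dictionaries, sketched below. The class and method names are hypothetical, chosen only for this illustration; Python's `dict` plays the role of the hash table.

```python
class SymbolTable:
    """Scoped symbol table: a stack of hash tables (dicts), innermost scope last."""

    def __init__(self):
        self.scopes = [{}]            # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, **attributes):
        """Record a new identifier and its attributes in the current scope."""
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"redeclaration of {name}")
        scope[name] = dict(attributes)

    def lookup(self, name):
        """Search from the innermost scope outwards."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"undeclared identifier: {name}")

table = SymbolTable()
table.declare("rate", type="float")
table.enter_scope()
table.declare("rate", type="int")         # shadows the outer declaration
inner = table.lookup("rate")["type"]
table.exit_scope()
outer = table.lookup("rate")["type"]
```

Later phases can attach further attributes to the same entries, for example by adding keys to the dictionary returned by `lookup`.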
Handling Errors
 Error handling and reporting also occur across many phases
– The lexical analyzer reports invalid character sequences
– The syntactic analyzer reports invalid token sequences
– The semantic analyzer reports type and scope errors, and the like
 The compiler may be able to continue after some errors, but other errors may stop the process

Compiler Construction Tools
 Scanner generators: produce lexical analyzers
• e.g. Lex (Flex)
 Parser generators: produce syntax analyzers
• e.g. YACC (Yet Another Compiler-Compiler)
 Syntax-directed translation engines: generate intermediate code
• e.g. YACC (Bison)
 Automatic code generators: generate actual code
• i.e. they take a collection of rules for translating the intermediate language into machine language
 Data-flow engines: support optimization
• i.e. they perform code optimization using data-flow analysis, that is, the gathering of information about how values are transmitted from one part of a program to each other part

Types of compiler
 Stage compiler
• A compiler that converts the code into assembly code only.
 Just-in-time compiler
• A compiler that converts the code into machine code after the program starts execution.
 Retargetable compiler
• A compiler that can easily be modified to compile source code for different CPU architectures.
 Parallelizing compiler
• A compiler capable of compiling code for a parallel computer architecture.

Types of compiler …
 One-pass compiler
• A compiler that completes the whole compilation process in a single pass,
• i.e., it traverses the whole source code only once.
 Incremental compiler
• A compiler that compiles only the changed lines of the source code and updates the object code accordingly.
