UNIT 1 AND 2
INTRODUCTION
In the previous unit you were taken through some basic concepts you
learnt in an earlier course. This was done because of their
relevance/importance to your understanding of this course.
In this unit you will be introduced to the concept of compilers and their
importance to programme development.
1.0 OBJECTIVES
2.1 Translators
A translator is a programme that takes as input a programme written in one programming
language (the source language) and produces as output a programme in another
language (the object or target language). If the source language is a high-level language
such as COBOL, PASCAL, etc. and the object language is a low-level language such as
an assembly language or machine language, then such a translator is called a compiler.
Executing a programme written in a high-level programming language is basically a two-
step process, as illustrated in Figure 1. The source programme must first be compiled, that
is, translated into an object programme. Then the resulting object programme is loaded into
memory and executed.
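The two steps can be tried out with a minimal sketch in Python. Note the assumptions: Python's built-in compile() produces bytecode for the Python virtual machine rather than machine code, and the dictionary named memory only stands in for the memory into which an object programme is loaded. The sketch is an analogy for the compile-then-execute sequence, not a picture of how a native compiler works.

```python
# Step 1: compile the source programme into an object form
# (here a Python code object, used as a stand-in for an object programme).
source_programme = "result = (2 + 3) * 4"
object_programme = compile(source_programme, "<example>", "exec")

# Step 2: load the object form into a fresh namespace and execute it.
memory = {}
exec(object_programme, memory)

print(memory["result"])  # prints 20
```

The source text is only needed in step 1. Once the object form exists, it can be executed again without being translated again.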
There are other important types of translators besides compilers. If the source language is
assembly language and the target language is machine language, then the translator is
called an assembler. The term preprocessor is used for translators that take programmes in
one high-level language into equivalent programmes in another high-level language. For
example, there are many FORTRAN preprocessors that map "structured" versions of
FORTRAN into conventional FORTRAN.
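To make the idea of an assembler concrete, here is a minimal sketch of one. The mnemonics and opcode values in OPCODES are hypothetical and belong to an imaginary machine, not to any real instruction set. The function assemble() is likewise an illustrative name.

```python
# Hypothetical one-byte opcodes for an imaginary machine (an assumption).
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}

def assemble(assembly_text):
    """Translate assembly-language lines into a list of byte values."""
    machine_code = []
    for line in assembly_text.splitlines():
        line = line.split(";")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        mnemonic, *operands = line.split()
        machine_code.append(OPCODES[mnemonic.upper()])
        machine_code.extend(int(operand) for operand in operands)
    return machine_code

program = """
    LOAD 10   ; load the value at address 10
    ADD 11    ; add the value at address 11
    STORE 12  ; store the sum at address 12
    HALT
"""
print(assemble(program))  # prints [1, 10, 2, 11, 3, 12, 255]
```

Each mnemonic maps to exactly one opcode, so this translation is almost a table lookup. This is what distinguishes an assembler from a compiler, where one source statement may expand into many target instructions.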
As was mentioned in section 3.1, compilers and interpreters are not the only examples of
translators. In the table below are a few more:
1) Many variations:
a. many programming languages (e.g. FORTRAN, C++, Java)
b. many programming paradigms (e.g. object-oriented,
functional, logic)
c. many computer architectures (e.g. MIPS, SPARC, Intel, Alpha)
d. many operating systems (e.g. Linux, Solaris, Windows)
2) Qualities of a compiler: these are the qualities that a compiler must possess in order
to be effective and useful. They are listed below in order of importance:
a. the compiler itself must be bug-free
b. it must generate correct machine code
c. the generated machine code must run fast
d. the compiler itself must run fast (compilation time must be proportional to programme size)
e. the compiler must be portable (i.e. modular, supporting separate compilation)
f. it must print good diagnostics and error messages
g. the generated code must work well with existing debuggers
h. it must have consistent and predictable optimisation.
3) In-depth knowledge:
Building a compiler requires in-depth knowledge of:
a. programming languages (parameter passing, variable scoping, memory allocation,
etc.)
b. theory (automata, context-free languages, etc.)
c. algorithms and data structures (hash tables, graph algorithms, dynamic
programming, etc.)
d. computer architecture (assembly programming)
e. software engineering.
A typical real-world compiler usually has multiple phases, grouped into a front end and a
back end (these will be treated in greater detail in unit 3 of this module). Dividing the work
in this way increases the compiler's portability and simplifies retargeting. The front end
consists of the following phases:
• scanning: a scanner groups input characters into tokens
• parsing: a parser recognises sequences of tokens according to some grammar and generates
Abstract Syntax Trees (ASTs)
• semantic analysis: performs type checking (i.e. checking whether the variables, functions
etc. in the source programme are used consistently with their definitions and with
the language semantics) and translates ASTs into intermediate representations (IRs)
• optimisation: optimises IRs.
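The first two front-end phases can be sketched for a tiny language of arithmetic expressions. Everything here is an assumption made for illustration: the token kinds (NUM, ID, OP), the grammar (expr, term, factor), and the choice of nested tuples as the AST format. Real compilers typically use much richer token and tree structures.

```python
import re

# One token per match: a number, an identifier, or any other single character.
TOKEN_PATTERN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(.))")

def scan(text):
    """Scanner: group input characters into (kind, value) tokens."""
    tokens = []
    for number, name, symbol in TOKEN_PATTERN.findall(text):
        if number:
            tokens.append(("NUM", int(number)))
        elif name:
            tokens.append(("ID", name))
        elif symbol.strip():  # skip stray whitespace
            tokens.append(("OP", symbol))
    return tokens

def parse(tokens):
    """Parser: recognise the token sequence and build an AST of nested tuples.

    Assumed grammar:  expr   := term   (('+' | '-') term)*
                      term   := factor (('*' | '/') factor)*
                      factor := NUM | ID | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", None)

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():
        kind, value = advance()
        if kind in ("NUM", "ID"):
            return (kind, value)
        if (kind, value) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree

tokens = scan("a + 2 * 3")
print(tokens)
print(parse(tokens))  # prints ('+', ('ID', 'a'), ('*', ('NUM', 2), ('NUM', 3)))
```

Notice that the tree places the multiplication below the addition. The grammar, not the scanner, is what encodes operator precedence.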
The back end consists of the following phases:
• instruction selection: maps IRs into assembly code
• code optimisation: optimises the assembly code using control-flow and data-flow
analyses, register allocation, etc.
• code emission: generates machine code from assembly code.
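The hand-off from an optimised IR to assembly code can be sketched as follows. The sketch makes several assumptions: the IR is an expression tree of nested tuples, the optimisation shown is constant folding (one simple IR optimisation among many), and the target is a hypothetical stack machine with PUSH, LOAD, ADD, SUB, MUL and DIV instructions. It does not cover register allocation or the emission of real machine code.

```python
def fold_constants(node):
    """IR optimisation: evaluate operators whose operands are both constants."""
    if node[0] in ("NUM", "ID"):
        return node
    op, left, right = node[0], fold_constants(node[1]), fold_constants(node[2])
    if left[0] == "NUM" and right[0] == "NUM" and op in ("+", "-", "*"):
        results = {"+": left[1] + right[1],
                   "-": left[1] - right[1],
                   "*": left[1] * right[1]}
        return ("NUM", results[op])
    return (op, left, right)

def select_instructions(node):
    """Instruction selection: map the IR to stack-machine assembly lines."""
    kind = node[0]
    if kind == "NUM":
        return [f"PUSH {node[1]}"]
    if kind == "ID":
        return [f"LOAD {node[1]}"]
    op_names = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
    _, left, right = node
    # Operands first, operator last: the order a stack machine expects.
    return select_instructions(left) + select_instructions(right) + [op_names[kind]]

ir = ("+", ("ID", "a"), ("*", ("NUM", 2), ("NUM", 3)))  # represents a + 2 * 3

print(select_instructions(ir))
# prints ['LOAD a', 'PUSH 2', 'PUSH 3', 'MUL', 'ADD']
print(select_instructions(fold_constants(ir)))
# prints ['LOAD a', 'PUSH 6', 'ADD']
```

Only select_instructions() knows about the target machine. Retargeting this sketch to a different machine would mean rewriting that one function, which is the portability benefit of separating the front end from the back end.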
The generated machine code is written to an object file. This file is not executable since it
may refer to external symbols (such as system calls). The operating system provides the
following utilities to execute the code:
• linking: A linker takes several object files and libraries as input and produces one
executable object file. It retrieves the code of all the referenced functions/procedures from
the input files, puts it together in the executable object file, and resolves all external
references to real addresses. The libraries include the operating system libraries, the
language-specific libraries, and possibly user-created libraries.
• loading: A loader loads an executable object file into memory, initialises the registers, heap,
data, etc. and starts the execution of the programme.
• Relocatable shared libraries allow effective memory use when many different applications
share the same code.
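The way a linker resolves external references can be sketched with a toy model. The object-file format below (a dictionary with a "defines" entry, and references written as strings beginning with "@") is invented for this illustration and does not correspond to any real object-file format. Unlike a real linker, the sketch copies every defined function into the image, whether or not it is referenced.

```python
def link(object_files):
    """Combine toy object files into one image and resolve symbol references."""
    symbol_addresses = {}
    image = []
    # Pass 1: place every defined function and record its start address.
    for obj in object_files:
        for symbol, code in obj["defines"].items():
            if symbol in symbol_addresses:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbol_addresses[symbol] = len(image)
            image.extend(code)
    # Pass 2: replace each symbolic reference ("@name") with a real address.
    for index, word in enumerate(image):
        if isinstance(word, str) and word.startswith("@"):
            name = word[1:]
            if name not in symbol_addresses:
                raise ValueError(f"undefined symbol: {name}")
            image[index] = symbol_addresses[name]
    return image, symbol_addresses

# main refers to square, which is defined in a separate "library" object file.
main_obj = {"defines": {"main": ["CALL", "@square", "HALT"]}}
lib_obj = {"defines": {"square": ["DUP", "MUL", "RET"]}}

image, symbols = link([main_obj, lib_obj])
print(image)    # prints ['CALL', 3, 'HALT', 'DUP', 'MUL', 'RET']
print(symbols)  # prints {'main': 0, 'square': 3}
```

Two passes are needed because a reference may appear before the symbol it names has been placed. Linking main_obj on its own raises an "undefined symbol" error, which mirrors why a single object file is not yet executable.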
UNIT 2 THE STRUCTURE OF A COMPILER
1. Front end
2. Back-end
3. Tables of information
4. Runtime library
i) Front-End: The front-end is responsible for analysing the structure and meaning of the
source text. It is usually the analysis part of the compiler and contains the lexical
analyser, the syntactic analyser and the semantic analyser. This part has been automated.
ii) Back-End: The back-end is responsible for generating the target language. It contains
the intermediate code optimiser, the code generator and the code optimiser. This part has
been automated.
iii) Tables of Information: These include the symbol table and some other tables that
provide information during the compilation process.
iv) Run-Time Library: It is used for run-time system support.
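As an illustration of the tables of information, here is a minimal sketch of a symbol table with nested scopes. The class name SymbolTable, its methods, and the decision to store only a type for each name are assumptions made for this example. A production compiler records much more about each symbol, such as its storage location and scope level.

```python
class SymbolTable:
    """Maps names to their attributes; each scope may have an enclosing parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.entries = {}

    def declare(self, name, type_name):
        """Record a new name in the current scope."""
        if name in self.entries:
            raise NameError(f"{name!r} already declared in this scope")
        self.entries[name] = {"type": type_name}

    def lookup(self, name):
        """Search the current scope first, then each enclosing scope."""
        scope = self
        while scope is not None:
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        raise NameError(f"{name!r} is not declared")

global_scope = SymbolTable()
global_scope.declare("count", "int")

function_scope = SymbolTable(parent=global_scope)
function_scope.declare("ratio", "float")

print(function_scope.lookup("count")["type"])  # prints int
print(function_scope.lookup("ratio")["type"])  # prints float
```

The front-end fills the table as declarations are analysed, and later phases consult it, for instance when the semantic analyser checks that a variable is used consistently with its declared type.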
[Figure: Phases of a compiler. Boxes, in order: Source Programme, Lexical Analysis, Syntax
Analysis, Code Optimisation, Code Generation, Target Programme]