CD 1.1 Introduction to Compiler
INTRODUCTION TO COMPILER
What is a Compiler?
A compiler is essentially a translator. It translates a source program written in a
high-level programming language such as Pascal, C, or C++ into machine language for a
target computer, such as an Intel Pentium IV or AMD processor, as shown in Figure 1.1.
Beyond translating a high-level language into a low-level language, a compiler also
issues error messages: it reports the errors it finds in the source program.
The main aim of a compiler is to convert a high-level language into a low-level
language. The question then is: “If we have to convert the high-level language
into a low-level language, why don’t we write the program in a low-level
language in the first place?”
The reason is that we are not comfortable writing programs in 0’s and 1’s, that is, in
binary machine language.
We are comfortable writing them in a language similar to English and then compiling
them with software that converts the high-level language into the low-level
language.
3. Assembler:
Assembly code is a mnemonic version of machine code in which names, rather than
binary values, are used for machine instructions and memory addresses.
An assembler needs to assign memory locations or addresses to symbols/identifiers.
It should use these addresses in generating the target language, that is, the machine
language.
The assembler must ensure that the same address is used for all occurrences of a
given identifier and that no two identifiers are assigned the same address.
A simple mechanism to accomplish this is to make two passes over the input.
During the first pass whenever a new identifier is encountered, assign an address to it.
Store the identifier along with the address in a symbol table.
During the second pass, whenever an identifier is seen, its address is retrieved
from the symbol table and that value is used in the generated machine code.
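The two-pass mechanism described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a real assembler: the instruction format ("MNEMONIC operand"), the opcode values in OPCODES, and the starting address DATA_BASE are all assumptions made up for the example.

```python
# Toy two-pass "assembler" over a made-up instruction format.
# Each source line is "MNEMONIC operand"; every operand is an identifier.

OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03}  # hypothetical opcodes
DATA_BASE = 100  # hypothetical first address available for identifiers


def first_pass(lines):
    """Assign a fresh address to each identifier the first time it is seen."""
    symbol_table = {}
    next_address = DATA_BASE
    for line in lines:
        _mnemonic, operand = line.split()
        if operand not in symbol_table:
            symbol_table[operand] = next_address
            next_address += 1
    return symbol_table


def second_pass(lines, symbol_table):
    """Replace each mnemonic and identifier with its numeric value."""
    code = []
    for line in lines:
        mnemonic, operand = line.split()
        code.append((OPCODES[mnemonic], symbol_table[operand]))
    return code


source = ["LOAD a", "ADD b", "STORE c", "LOAD c", "ADD a", "STORE c"]
symbols = first_pass(source)
machine_code = second_pass(source, symbols)
print(symbols)
print(machine_code)
```

Because addresses are handed out only in the first pass and only looked up in the second, every occurrence of an identifier gets the same address, and no two identifiers share one.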
Example 1: Consider the following C code for adding two numbers:
The equivalent assembly code for adding two numbers is as follows:
The equivalent machine relocatable code for adding two numbers is as follows:
4. Loader/Linker:
To convert relocatable machine code into executable code, one more translator
is required: the loader/linker.
A loader/linker, or link editor, is a translator that takes one or more object modules
generated by a compiler and combines them into a single executable program (the
exe code).
The terms loader and linker are often used synonymously in Unix environments. On
IBM mainframe operating systems, this program is known as a linkage editor. In some
operating systems, the same program handles both object linking and the physical
loading of a program.
Other systems use the term linking for the former and loading for the latter.
Linking: This is a process where a linker takes several object files and libraries as input
and produces one executable object file, as shown in the figure. From the input files, it
retrieves the code of all the procedures that are referenced, combines it into the
executable code, and resolves all external references to actual machine addresses.
The libraries include language-specific libraries, operating system libraries, and
user-defined libraries.
Figure: Linker
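The linking step can be sketched roughly as follows. This is a simplified model, not how any particular linker is implemented: the module names (main.o, add.o, io.o, math.o), sizes, symbol offsets, and the dictionary layout of an "object module" are all hypothetical, chosen only to illustrate symbol resolution.

```python
# Each hypothetical object module records its size, the symbols it defines
# (as offsets from the module start), and the external symbols it references.
main_obj = {"name": "main.o", "size": 20, "defines": {"main": 0},
            "refs": ["add", "print_int"]}
add_obj = {"name": "add.o", "size": 8, "defines": {"add": 0}, "refs": []}
library = [
    {"name": "io.o", "size": 30, "defines": {"print_int": 4}, "refs": []},
    {"name": "math.o", "size": 50, "defines": {"sqrt": 0}, "refs": []},
]


def link(objects, library, base=0):
    """Lay modules out one after another and resolve every external reference."""
    modules = list(objects)
    defined = {sym for m in modules for sym in m["defines"]}
    # Pull in only those library modules that define a still-missing symbol.
    pending = [ref for m in modules for ref in m["refs"]]
    while pending:
        sym = pending.pop()
        if sym in defined:
            continue
        provider = next((m for m in library if sym in m["defines"]), None)
        if provider is None:
            raise ValueError(f"undefined reference to {sym!r}")
        modules.append(provider)
        defined.update(provider["defines"])
        pending.extend(provider["refs"])
    # Give each module a start address, then compute absolute symbol addresses.
    addresses = {}
    offset = base
    for m in modules:
        for sym, rel in m["defines"].items():
            addresses[sym] = offset + rel
        offset += m["size"]
    # Map every (module, referenced symbol) pair to its final address.
    resolved = {(m["name"], ref): addresses[ref]
                for m in modules for ref in m["refs"]}
    return addresses, resolved


addresses, resolved = link([main_obj, add_obj], library)
print(addresses)
print(resolved)
```

In this sketch only io.o is pulled in from the library, because it defines a symbol that main.o references; math.o is left out since nothing refers to sqrt.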
Loading: This is a process where a loader loads an executable file into memory, initializes
the registers, heap, data, etc., and starts the execution of the program.
If we look at the design of all these translators, designing a preprocessor, an
assembler, or a loader/linker is comparatively simple.
Each can be taken up as a one-month project.
Among all of them, the most complex translator is the compiler.
The design of the first FORTRAN compiler took 18 man-years.
The complexity of a compiler's design depends mainly on the source language.
Currently, many automated tools are available. With modern compiler tools like
YACC (Yet Another Compiler-Compiler, a parser generator), LEX (a lexical analyzer
generator), and data-flow engines, the design of a compiler has become much easier.