Practical File: Bubblepreet Kaur UE213025
PRACTICAL FILE
COMPILER DESIGN
January 2024 – May 2024
Submitted By:
Bubblepreet Kaur
Roll Number:
UE213025
Submitted To:
Dr. Akashdeep
Assistant Professor, Computer Science and Engineering
Description –
A compiler is a specialized software tool that translates a program written in a high-level programming
language (such as C, C++, Java, etc.) into an equivalent program in a lower-level language, typically machine
code or assembly language, that can be executed by a computer's hardware.
Compilers that translate source code to machine code target specific operating systems and computer
architectures. This type of output is sometimes referred to as object code.
The compilation process in C/C++ converts human-readable source code into machine-readable code. Along
the way it checks the syntax and semantics of the program and reports any errors or warnings it finds.
Compiler Processes :
1. Preprocessing the source code - Preprocessing is the first step in the compilation process.
The preprocessor performs the following tasks:
a. Comment removal
b. Macro expansion
c. File inclusion
d. Conditional compilation
The preprocessor converts the source file into an intermediate file with the extension .i (for C)
or .ii (for C++ with g++). This is the expanded form of the program: it contains the contents of all
included header files, the expanded macros, and only the code selected by conditional compilation.
2. Compilation of source code - The compiler translates the intermediate (.i / .ii) file into an
assembly file (.s) containing assembly-level instructions.
The compiler parses the whole program (syntax analysis) and reports any syntax errors or
warnings present in the source code in the terminal window.
3. Assembling - The assembler converts the assembly-level code (.s file) into machine code (in
binary form). The generated file has the same base name as the assembly file and is known as an
object file, with the extension .o (or .obj with some Windows toolchains).
4. Linking the object file to make an executable file - The linker combines the object file with the
library code it uses and resolves references to functions and variables defined elsewhere. Library
files are prebuilt files, typically with the extension .lib on Windows or .a / .so on Unix-like
systems. The linking process generates an executable file with a .exe / .out extension.
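The preprocessing tasks listed in step 1 can be observed on a small source file. The sketch below is only an illustration: the macros `SQUARE` and `VERBOSE` and the function `describe` are hypothetical names chosen for this example and are not part of the original experiment.

```cpp
#include <string>   // file inclusion: the header's text is pasted in at this point

// Macro expansion: each SQUARE(x) is replaced textually before compilation.
#define SQUARE(x) ((x) * (x))

// Conditional compilation: with VERBOSE set to 0, only the #else branch
// below is passed on to the compiler.
#define VERBOSE 0

/* Comment removal: this block comment does not appear in the preprocessed file. */
std::string describe(int side)
{
#if VERBOSE
    return "area of square with side " + std::to_string(side) + " = " +
           std::to_string(SQUARE(side));
#else
    return std::to_string(SQUARE(side));
#endif
}
```

If this file is compiled with -save-temps (after adding a main function that calls describe), the generated .ii file should show the header contents in place of the #include line, ((side) * (side)) in place of SQUARE(side), and only the #else branch of the function.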
Command :
Executing the command below creates all the intermediate files.
g++ -Wall -save-temps filename.cpp -o filename
.cpp File: (Source File)
Output:
The command ‘g++ -Wall -save-temps prog.cpp -o prog’ compiles the source file prog.cpp into an
executable named prog, while also generating additional intermediate files as part of the compilation
process.
We also used the ‘size’ command, which reports the sizes of the sections (text, data, bss) inside
binary files such as object files (.o) and executable files (.exe). These files contain compiled code
and data, and the executable is the final output of the compilation process.
The ‘size’ command does not report sizes for the source file (.cpp), the pre-processed file (.ii), or
the assembly file (.s), because these are plain-text intermediate files: they do not contain compiled
code or data and are not directly executable.
From the above experiment, I learned about the different stages of the compilation process and the
function of each stage:
1) The pre-processor removes comments, because they carry no meaning for the machine, and
expands the macros defined in the code: before compilation, it substitutes each use of a macro
name with the corresponding value or code.
2) Macros can make code simpler, easier to read, and quicker to type. The pre-processed
output of the ‘.cpp’ source file is written to a file with the ‘.ii’ extension.
3) The compilation phase translates the pre-processed source code into assembly code, stored in
a file with the ‘.s’ extension.
4) The assembler converts the assembly file into an object file containing machine-level code,
with the extension ‘.o’ (or ‘.obj’).
5) The linker links the library files with the object file to resolve references to symbols defined
elsewhere. It generates the executable file, here prog.exe.
Output - A file in which each lexeme is labelled as an identifier, keyword, operator, digit, etc.
Description –
Lexical analysis is the first phase of a compiler. It scans the input source file and converts it into a
sequence of tokens, i.e. keywords, identifiers, operators, digits, etc.
Approach -
Read the sample source file character by character.
Check whether the character is an operator or a special symbol.
If the character is a letter or an underscore, it starts a keyword or an identifier.
Append characters to a string for as long as they are letters or digits, then check whether the string
is a keyword or a predefined identifier.
If it is neither, check whether it is a valid identifier.
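The approach above can be sketched as a single function that works on a string instead of a file. This is a minimal illustration and not the report's own lex() routine: the function name `tokenize`, the reduced keyword set, and the rule that an operator token is at most two characters long are all assumptions made for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits `src` into (lexeme, category) pairs.
std::vector<std::pair<std::string, std::string>> tokenize(const std::string& src)
{
    // A deliberately small keyword set, chosen only for this sketch.
    static const std::set<std::string> keywords = {"int", "for", "return", "using", "namespace"};
    static const std::string operators = "+-*/%=<>";
    static const std::string specials = "()[]{};,#\"";

    std::vector<std::pair<std::string, std::string>> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {                       // whitespace is skipped
            ++i;
        } else if (std::isalpha(c) || c == '_') {    // keyword or identifier
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
            std::string word = src.substr(start, i - start);
            tokens.emplace_back(word, keywords.count(word) ? "Keyword" : "Identifier");
        } else if (std::isdigit(c)) {                // run of digits
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i])))
                ++i;
            tokens.emplace_back(src.substr(start, i - start), "Digit");
        } else if (operators.find(src[i]) != std::string::npos) {
            // Assumption: an operator is one character, or two if the next
            // character is also an operator character (e.g. "++", "<<").
            std::size_t len = 1;
            if (i + 1 < src.size() && operators.find(src[i + 1]) != std::string::npos)
                len = 2;
            tokens.emplace_back(src.substr(i, len), "Operator");
            i += len;
        } else if (specials.find(src[i]) != std::string::npos) {
            tokens.emplace_back(std::string(1, src[i]), "Special Symbol");
            ++i;
        } else {                                     // anything unrecognised
            tokens.emplace_back(std::string(1, src[i]), "Unknown");
            ++i;
        }
    }
    return tokens;
}
```

For example, tokenize("int var1=1;") yields five pairs: int as a Keyword, var1 as an Identifier, = as an Operator, 1 as a Digit, and ; as a Special Symbol. The two-character operator rule is a simplification; an expression such as a=-1 would be split incorrectly as "=-".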
Input File-
Input.txt file
#include<iostream>
using namespace std;
int main()
{
int var1=1;
for (int i=0;i<10;i++)
{
cout<<++var1;
}
return 0;
}
Checkers.h file
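#include <iostream>
#include <string>
using namespace std;
// True for letters and underscore, i.e. characters that may start a name.
bool is_alp(char s)
{
    if ((s >= 'A' && s <= 'Z') || (s == '_') || (s >= 'a' && s <= 'z'))
        return true;
    return false;
}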
bool is_digit(char s)
{
if (s >= 48 && s <= 57)
return true;
return false;
}
bool is_digit(string s)
{
if (s[0] >= 48 && s[0] <= 57)
return true;
return false;
}
bool is_keyw(string s)
{
if (s == "asm" || s == "auto" || s == "operator" || s == "friend" || s == "explicit" || s == "this" || s == "new" || s ==
"delete" || s == "inline" || s == "true" || s == "false" || s == "int" || s == "char" || s == "double" || s == "float" || s
== "bool" || s == "if" || s == "else" || s == "return" || s == "break" || s == "continue" || s == "using" || s ==
"namespace" || s == "for" ||s == "const" || s == "case" || s == "do" || s == "while" || s == "switch" || s ==
"public" || s == "protected" || s == "private" || s == "throw" || s == "catch" || s == "virtual" || s == "try" || s ==
"class" || s == "void")
return true;
return false;
}
bool is_opr(char s)
{
if (s == '+' || s == '-' || s == '*' || s == '/' || s == '%' || s == '=' || s == '<' || s == '>')
return true;
return false;
}
bool is_Id(string s)
{
    // The first character must be one that is_alp() accepts.
    if (s.empty() || !is_alp(s[0]))
        return false;
    // Every remaining character must pass is_alp() or be a digit.
    for (size_t i = 1; i < s.size(); i++)
    {
        if (!is_alp(s[i]) && !is_digit(s[i]))
            return false;
    }
    return true;
}
bool is_predefinedId(string s)
{
if (s == "include" || s == "iostream" || s == "main" || s == "std" || s == "string" || s == "cin" || s == "cout" || s
== "endl" || s == "INT_MIN" || s == "INT_MAX" || s == "NULL")
return true;
return false;
}
bool is_specialsym(char s)
{
if (s == '(' || s == ')' || s == '[' || s == ']' || s == '{' || s == '}' || s == ';' || s == ',' || s == '#' || s == '"')
return true;
return false;
}
Code –
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include "Checkers.h"
using namespace std;
fstream my_file;
fstream o_file;
vector<char> v;
string str;
char ch;
char c, chr;
void lex(char ch)
{ if (is_specialsym(ch))
o_file << ch << " ---Special Symbol" << endl;
else if (is_opr(ch))
{
chr = ch;
my_file >> ch;
cout << ch;
if (is_opr(ch))
Output –
1. Learned how the lexical analysis phase works. It identifies and categorizes tokens such as identifiers,
keywords, and operators, which prepares the source code for syntax analysis.
2. The lexical analyser produces one token for each lexeme, so the source code becomes a sequence of tokens.
3. The lexical analysis phase discards whitespace characters and comments present in the source code,
which are not relevant to the semantics of the program.
Output - A transition table is generated that shows each transition from state_1 to state_2 and the input
symbol that triggers it.
Description – One way to implement a regular expression is to convert it into a finite automaton
known as an ε-NFA (epsilon-NFA). An ε-NFA is a nondeterministic finite automaton that allows “epsilon”
transitions, which do not consume any input. This means that the automaton can move from one state to
another without consuming any characters from the input string.
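To make the idea of epsilon transitions concrete, the sketch below builds an ε-NFA from a postfix regular expression using the standard Thompson construction and then simulates it with an ε-closure. It is an illustration under stated assumptions, not the code of this experiment: the names `Nfa`, `build`, `eps_closure`, `accepts` and `EPS` are hypothetical, the character '\0' is assumed to stand for epsilon, and the operators are assumed to be '.' (concatenation), '+' (union) and '*' (closure), with the postfix input assumed to be well formed.

```cpp
#include <map>
#include <set>
#include <stack>
#include <string>
#include <utility>
#include <vector>

const char EPS = '\0';  // assumption: '\0' marks an epsilon (no-input) move

struct Nfa {
    // (state, symbol) -> next state; a multimap allows several targets.
    std::multimap<std::pair<int, char>, int> delta;
    int states = 0, start = -1, accept = -1;
    int new_state() { return states++; }
    void add(int from, char sym, int to) { delta.emplace(std::make_pair(from, sym), to); }
};

// Thompson construction over a well-formed postfix regex.
Nfa build(const std::string& postfix)
{
    Nfa n;
    std::stack<std::pair<int, int>> frags;  // (start, accept) of each fragment
    for (char c : postfix) {
        if (c == '.') {                     // concatenation: a then b
            auto b = frags.top(); frags.pop();
            auto a = frags.top(); frags.pop();
            n.add(a.second, EPS, b.first);
            frags.push({a.first, b.second});
        } else if (c == '+') {              // union: a or b
            auto b = frags.top(); frags.pop();
            auto a = frags.top(); frags.pop();
            int s = n.new_state(), f = n.new_state();
            n.add(s, EPS, a.first);  n.add(s, EPS, b.first);
            n.add(a.second, EPS, f); n.add(b.second, EPS, f);
            frags.push({s, f});
        } else if (c == '*') {              // closure: zero or more of a
            auto a = frags.top(); frags.pop();
            int s = n.new_state(), f = n.new_state();
            n.add(s, EPS, a.first);        n.add(s, EPS, f);
            n.add(a.second, EPS, a.first); n.add(a.second, EPS, f);
            frags.push({s, f});
        } else {                            // single input symbol
            int s = n.new_state(), f = n.new_state();
            n.add(s, c, f);
            frags.push({s, f});
        }
    }
    n.start = frags.top().first;
    n.accept = frags.top().second;
    return n;
}

// All states reachable from `states` using epsilon moves only.
std::set<int> eps_closure(const Nfa& n, std::set<int> states)
{
    std::vector<int> work(states.begin(), states.end());
    while (!work.empty()) {
        int q = work.back(); work.pop_back();
        auto range = n.delta.equal_range(std::make_pair(q, EPS));
        for (auto it = range.first; it != range.second; ++it)
            if (states.insert(it->second).second) work.push_back(it->second);
    }
    return states;
}

// Simulates the NFA: epsilon moves are taken without consuming input.
bool accepts(const Nfa& n, const std::string& input)
{
    std::set<int> cur = eps_closure(n, {n.start});
    for (char c : input) {
        std::set<int> next;
        for (int q : cur) {
            auto range = n.delta.equal_range(std::make_pair(q, c));
            for (auto it = range.first; it != range.second; ++it) next.insert(it->second);
        }
        cur = eps_closure(n, next);
    }
    return cur.count(n.accept) > 0;
}
```

With the postfix string "ab*." (that is, a.b* in infix), build creates six states; accepts returns true for "a" and "abb" and false for "b", because the closure loop is entered and left purely through epsilon moves.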
Approach -
Code:
#include<iostream>
#include<map>
#include<vector>
#include<string>
#include<stack>
using namespace std;
pair<int, char> p;
multimap<pair<int, char>, int> nfa;
int count = 0;
string s;
void print_S(string a){
for(int i=0;i<a.size();i++){
cout<<a[i];
}
cout<<endl;
}
void print_vec(multimap <pair
<int, char> , int>& nfa)
{
}
int prec(char c){
if(c=='*'){
return 3;
}
else if(c=='.'){
return 2;
}
else if(c=='+'){
return 1;
}
else{
return -1;
}
}
string post(string s) {
stack<char> st;
st.push('N');
int l = s.length();
string ns;
for(int i = 0; i < l; i++) {
if((s[i] >= 'a' && s[i] <= 'z')||(s[i] >= 'A' && s[i] <= 'Z')){
ns+=s[i];
}
else if(s[i] == '('){
st.push('(');
}
else if(s[i] == ')') {
while(st.top() != 'N' && st.top() != '(') {
char c = st.top();
st.pop();
ns += c;
}
if(st.top() == '(') {
char c = st.top();
st.pop();
}
}
else{
while(st.top() != 'N' && prec(s[i]) <= prec(st.top())) {
char c = st.top();
st.pop();
ns += c;
}
st.push(s[i]);
}
}
while(st.top() != 'N') {
char c = st.top();
st.pop();
ns += c;
}
return ns;
}
void concatenate(int i)
{
p.first = ++count;
if(s[i-2]=='.' || s[i-2]=='*' || s[i-2]=='+')
p.second = s[i-3];
else
p.second = s[i-2];
nfa.insert(make_pair(p, ++count));
p.first = count;
p.second = s[i-1];
nfa.insert(make_pair(p, ++count));
}
void closure(int i)
{
    p.first = ++count;
    // Use s[i-1] as the symbol unless it is itself an operator.
    if(s[i-1]!='.' && s[i-1]!='*' && s[i-1]!='+')
        p.second = s[i-1];
    else
        p.second = s[i-2];
    // Self-loop on the new state for the starred symbol.
    nfa.insert(make_pair(p, count));
}
for(int i=0;i<s.size();i++){
if(s[i]=='.')
{
concatenate(i);
}
if(s[i]=='*')
{
closure(i);
}
j=count;
if(s[i]=='+')
{
uni(i,j);
}
}
print_vec(nfa);
return 0;
}
Output: