Compiler Design Lab Manual
1. Using the LEX tool, develop a lexical analyzer to recognize a few patterns in C
(e.g. identifiers, constants, comments, operators). Create a symbol table while
recognizing identifiers.
2. Implement a lexical analyzer using the LEX tool.
3. Generate YACC specifications for a few syntactic categories.
a. Program to recognize a valid arithmetic expression that uses the operators
+, -, * and /.
b. Program to recognize a valid variable, which starts with a letter followed by
any number of letters or digits.
c. Program to recognize valid control structure syntax of the C language
(for loop, while loop, if-else, if-else-if, switch-case, etc.).
4. Generate three address code for a simple program using LEX and YACC.
5. Implement type checking using Lex and Yacc.
6. Implement simple code optimization techniques (constant folding, strength
reduction and algebraic transformation).
7. Implement the back end of the compiler, for which three address code is given
as input and 8086 assembly language code is produced as output.
L.E.2 Compiler Design
Experiment No: 1
AIM
To develop a lexical analyzer, written in C, that identifies identifiers, constants,
comments, operators, etc.
ALGORITHM
Step 1: Start the program.
Step 2: Declare all the variables and file pointers.
Step 3: Display the input program.
Step 4: Separate the keywords in the program and display them.
Step 5: Display the header files of the input program.
Step 6: Separate the operators of the input program and display them.
Step 7: Print the punctuation marks.
Step 8: Print the constants that are present in the input program.
Step 9: Print the identifiers of the input program.
PROGRAM CODE
//Develop a lexical analyzer to recognize a few patterns in C.
#include<string.h>
#include<ctype.h>
#include<stdio.h>
#include<stdlib.h>
/* Classify a word as a keyword or an identifier.
   Note: printf is a library function, not a C keyword; this program
   nevertheless reports it as one. */
void keyword(const char *str)
{
    if(strcmp("for",str)==0||strcmp("while",str)==0||strcmp("do",str)==0||
       strcmp("int",str)==0||strcmp("float",str)==0||strcmp("char",str)==0||
       strcmp("double",str)==0||strcmp("printf",str)==0||
       strcmp("switch",str)==0||strcmp("case",str)==0)
        printf("\n%s is a keyword",str);
    else
        printf("\n%s is an identifier",str);
}
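int main(void)
{
FILE *f1,*f2,*f3;
int c; /* int, not char, so that EOF can be told apart from a valid character */
char str[32];
int num[100],lineno=0,tokenvalue=0,i=0,j=0,k=0;
f1=fopen("input","r");
f2=fopen("identifier","w");
f3=fopen("specialchar","w");
if(f1==NULL||f2==NULL||f3==NULL)
{
printf("cannot open the input or work files\n");
return 1;
}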
while((c=getc(f1))!=EOF)
{
if(isdigit(c))
{
tokenvalue=c-'0';
c=getc(f1);
while(isdigit(c))
{
tokenvalue=tokenvalue*10+(c-'0'); /* shift the old digits left, then add the new one */
c=getc(f1);
}
num[i++]=tokenvalue;
ungetc(c,f1);
}
else
if(isalpha(c))
{
putc(c,f2);
c=getc(f1);
while(isdigit(c)||isalpha(c)||c=='_'||c=='$')
{
putc(c,f2);
c=getc(f1);
}
putc(' ',f2);
ungetc(c,f1);
}
else
if(c==' '||c=='\t')
printf(" ");
else
if(c=='\n')
lineno++;
else
putc(c,f3);
}
fclose(f2);
fclose(f3);
fclose(f1);
printf("\n the no's in the program are:");
for(j=0;j<i;j++)
printf("\t%d",num[j]);
printf("\n");
f2=fopen("identifier","r");
k=0;
printf("the keywords and identifier are:");
while((c=getc(f2))!=EOF)
if(c!=' ')
str[k++]=c;
else
{
str[k]='\0';
keyword(str);
k=0;
}
fclose(f2);
f3=fopen("specialchar","r");
printf("\n Special Characters are");
while((c=getc(f3))!=EOF)
printf("\t%c",c);
printf("\n");
fclose(f3);
printf("Total no of lines are:%d",lineno);
}
OUTPUT
RESULT
Thus the program for developing a lexical analyzer to recognize a few patterns in
C has been executed successfully.
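The experiment title also asks for a symbol table built while identifiers are recognized, which the program above does not construct. The following is a minimal sketch of one way to do it, assuming a fixed-size table of distinct names; `struct symtab`, `sym_lookup`, `sym_insert`, `SYM_MAX` and `SYM_LEN` are hypothetical names introduced here, not part of the original program.

```c
#include <string.h>

#define SYM_MAX 64   /* assumed capacity of the table */
#define SYM_LEN 32   /* assumed maximum name length, including '\0' */

/* Hypothetical symbol table: a flat array of distinct identifier names. */
struct symtab {
    char names[SYM_MAX][SYM_LEN];
    int count;
};

/* Return the index of name, or -1 if it has not been seen yet. */
int sym_lookup(const struct symtab *t, const char *name)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return i;
    return -1;
}

/* Insert name if it is absent and return its index; return -1 when the
   table is full or the name is too long to store. */
int sym_insert(struct symtab *t, const char *name)
{
    int i = sym_lookup(t, name);
    if (i >= 0)
        return i;
    if (t->count == SYM_MAX || strlen(name) >= SYM_LEN)
        return -1;
    strcpy(t->names[t->count], name);
    return t->count++;
}
```

In the program above, `sym_insert` could be called at the point where `keyword()` reports an identifier, so that each distinct name is stored once and later occurrences map to the same index.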
Experiment No: 2
AIM
To write a program for implementing a Lexical analyser using LEX tool in Linux
platform.
ALGORITHM
Step 1: A Lex program contains three sections: definitions, rules, and user
subroutines. Each section must be separated from the others by a line
containing only the delimiter %%. The format is as follows:
definitions
%%
rules
%%
user_subroutines
Step 2: In the definitions section, the names make up the left column and their
definitions make up the right column. Any C statements must be enclosed
between %{ and %}. An identifier is defined so that its first character is a
letter and the remaining characters are alphanumeric.
Step 3: In rules section, the left column contains the pattern to be recognized in an
input file to yylex(). The right column contains the C program fragment
executed when that pattern is recognized. The various patterns are
keywords, operators, new line character, number, string, identifier,
beginning and end of block, comment statements, preprocessor directive
statements etc.
Step 4: Each pattern may have a corresponding action, that is, a fragment of C
source code to execute when the pattern is matched.
Step 5: When yylex() matches a string in the input stream, it copies the matched
text to an external character array, yytext, before it executes any actions in
the rules section.
Step 6: In the user subroutines section, the main routine calls yylex(). yywrap() is
called when yylex() reaches the end of the input; it returns 1 when there is
no more input to scan.
Step 7: The lex command uses the rules and actions contained in the file to generate a
program, lex.yy.c, which can be compiled with the cc command. That
program can then receive input, break the input into the logical pieces
defined by the rules in the file, and run the program fragments contained in the
actions in the file.
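To make Step 2 and Step 5 concrete, here is a hand-written C sketch of what matching the identifier pattern amounts to: find the longest prefix that starts with a letter and continues with letters or digits. It illustrates the pattern only and is not the code Lex generates; `match_identifier` is a hypothetical name, and the default "C" locale is assumed so that `isalpha` and `isalnum` accept only ASCII letters and digits.

```c
#include <ctype.h>
#include <stddef.h>

/* Length of the longest prefix of s that matches [a-zA-Z][a-zA-Z0-9]*,
   or 0 if s does not start with a letter. */
size_t match_identifier(const char *s)
{
    size_t n = 0;
    if (!isalpha((unsigned char)s[0]))
        return 0;
    while (isalnum((unsigned char)s[n]))
        n++;
    return n;
}
```

For the input `sum1+2` the function returns 4, which corresponds to the text `sum1` that a scanner would place in yytext before running the action for the identifier rule.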
PROGRAM CODE
//Implementation of Lexical Analyzer using Lex tool
%{
int COMMENT=0;
%}
identifier [a-zA-Z][a-zA-Z0-9]*
%%
#.* {printf("\n%s is a preprocessor directive",yytext);}
int |
float |
char |
double |
while |
for |
struct |
typedef |
do |
if |
break |
continue |
void |
switch |
return |
else |
goto {printf("\n\t%s is a keyword",yytext);}
"/*" {COMMENT=1;}{printf("\n\t %s is a COMMENT",yytext);}
{identifier}\( {if(!COMMENT)printf("\nFUNCTION \n\t%s",yytext);}
{
return(1);
}
INPUT
//var.c
#include<stdio.h>
#include<conio.h>
void main()
{
int a,b,c;
a=1;
b=2;
c=a+b;
printf("Sum:%d",c);
}
OUTPUT
RESULT
Thus the program for implementation of Lexical Analyzer using Lex tool has been
executed successfully.
Experiment No: 3
return 1;
}
Program Name : arith_id.y
%{
#include <stdio.h>
/* This YACC program recognizes an arithmetic expression */
%}
/* tokens returned by the lexer in arith_id.l */
%token A ID
%%
statement: A'='E
|E{
printf("\n Valid arithmetic expression");
$$ = $1;
};
E: E'+'ID
| E'-'ID
| E'*'ID
| E'/'ID
| ID
;
%%
extern FILE *yyin;
int main(void)
{
do
{
yyparse();
}while(!feof(yyin));
return 0;
}
int yyerror(char *s)
{
fprintf(stderr,"%s\n",s);
return 0;
}
OUTPUT
[root@localhost]# lex arith_id.l
[root@localhost]# yacc -d arith_id.y
[root@localhost]# gcc lex.yy.c y.tab.c
[root@localhost]# ./a.out
x=a+b;
Identifier is x
Operator is EQUAL
Identifier is a
Operator is PLUS
Identifier is b
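As a cross-check on the grammar for E above, which accepts an ID followed by any number of operator and ID pairs, the same language can be recognized by a small hand-written function. This is a sketch in plain C rather than the Yacc method used in this experiment; it assumes identifiers are a letter followed by letters or digits, does not handle whitespace or the assignment form, and `skip_id` and `valid_expression` are hypothetical names.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Skip one identifier (a letter followed by letters or digits).
   Returns a pointer just past it, or NULL if none starts at s. */
static const char *skip_id(const char *s)
{
    if (!isalpha((unsigned char)*s))
        return NULL;
    while (isalnum((unsigned char)*s))
        s++;
    return s;
}

/* True if s has the shape ID (op ID)* with op one of + - * /. */
bool valid_expression(const char *s)
{
    s = skip_id(s);
    while (s != NULL && *s != '\0') {
        if (*s != '+' && *s != '-' && *s != '*' && *s != '/')
            return false;          /* an operator must follow an ID */
        s = skip_id(s + 1);        /* and an ID must follow the operator */
    }
    return s != NULL;
}
```

The function accepts `a+b*c1` and rejects `a+` and `+a`, mirroring the left-recursive rules `E: E'+'ID | ... | ID`.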
;
L:L,ID
|ID
;
T:INT
|FLOAT
|DOUBLE
;
%%
extern FILE *yyin;
int main(void)
{
do
{
yyparse();
}while(!feof(yyin));
return 0;
}
int yyerror(char *s)
{
fprintf(stderr,"%s\n",s);
return 0;
}
OUTPUT
[root@localhost]# lex variable_test.l
[root@localhost]# yacc -d variable_test.y
[root@localhost]# gcc lex.yy.c y.tab.c
[root@localhost]# ./a.out
int a,b;
Identifier is a
Identifier is b
[root@localhost]#
AIM
To recognize the valid syntax of control structures in the C programming language
using Yacc (Yet Another Compiler Compiler).
ALGORITHM
Define a Yacc specification file (.y file) that describes the grammar rules for
these control structures.
Define a grammar rule for each control structure: if-else, while, for and
switch-case.
Declare the tokens used in the grammar with %token declarations at the
beginning of the Yacc file.
PROGRAM
%{
#include <stdio.h>
%}
%token IF ELSE WHILE FOR SWITCH CASE BREAK DEFAULT
%%
program : control_structures
;
control_structures : control_structure
| control_structures control_structure
;
control_structure : if_else
| while_loop
| for_loop
| switch_case
;
int main() {
yyparse();
return 0;
}
void yyerror(const char *s) {
printf("Syntax error: %s\n", s);
}
OUTPUT
Save the Yacc specification in a file named "control.y".
Install Bison and Flex if you haven't already.
Generate the parser code using Bison:
bison -d control.y
This command generates two files: "control.tab.c" and "control.tab.h".
Create a lexer file (e.g., "control.l") and define the tokens and their
corresponding regular expressions using Flex. Include "control.tab.h" at the
beginning of the lexer file.
Generate the lexer code using Flex:
flex control.l
This command generates a file named "lex.yy.c".
Compile the generated parser and lexer code together:
gcc control.tab.c lex.yy.c -o control_parser
This command compiles the parser code "control.tab.c" and lexer code
"lex.yy.c" into an executable named "control_parser".
Run the parser on a test input file:
./control_parser < input.c
RESULT
Thus the program for control constructs with Yacc Specification has been executed
successfully.
yyparse();
}
int yyerror(char *error)
{
fprintf(stderr,"%s\n",error);
return 0;
}
OUTPUT
The output of the program can be obtained by following commands
[root@localhost]# lex calci.l
[root@localhost]# yacc -d calci.y
[root@localhost]# cc y.tab.c lex.yy.c -ll -ly -lm
[root@localhost]# ./a.out
Enter the expression: 2+2
Answer = 4
2*2+5/4
Answer = 5.25
mem = cos 45
sin 45/mem
Answer = 1
ln 10
Answer = 2.
Experiment No: 4
AIM
To perform intermediate code generation using Lex and Yacc.
ALGORITHM
The addtotable function is used to add intermediate results to the arr array and
returns the assigned temporary variable.
The threeAdd, fouradd, and triple functions are used to print the intermediate
results in different formats.
The find function is used to find the index of a result variable in the arr array.
3. EXECUTION
In the main function, the program prompts the user to enter an expression.
The yyparse function is called to start the parsing process.
The parsed expression is then evaluated and stored in the arr array.
Finally, the threeAdd, fouradd, and triple functions are called to print the
intermediate results in different formats.
CODE
Generates Three Address, Four Address, Triple Intermediate Code.
Lex
%{
#include"y.tab.h"
%}
%%
[0-9]+ {yylval.symbol=(char)(yytext[0]);return NUMBER;}
[a-z] {yylval.symbol= (char)(yytext[0]);return LETTER;}
. {return yytext[0];}
\n {return 0;}
%%
YACC
%{
#include"y.tab.h"
#include<stdio.h>
#include<ctype.h> /* isupper(), used in triple() */
int yylex(void);
void yyerror(char *s);
char addtotable(char,char,char);
int index1=0;
char temp = 'A'-1;
struct expr{
char operand1;
char operand2;
char operator;
char result;
};
%}
%union{
char symbol;
}
%left '+' '-'
%left '/' '*'
%token <symbol> LETTER NUMBER
%type <symbol> exp
%%
statement: LETTER '=' exp ';' {addtotable((char)$1,(char)$3,'=');};
exp: exp '+' exp {$$ = addtotable((char)$1,(char)$3,'+');}
|exp '-' exp {$$ = addtotable((char)$1,(char)$3,'-');}
|exp '/' exp {$$ = addtotable((char)$1,(char)$3,'/');}
|exp '*' exp {$$ = addtotable((char)$1,(char)$3,'*');}
|'(' exp ')' {$$= (char)$2;}
|NUMBER {$$ = (char)$1;}
|LETTER {$$ = (char)$1;};
%%
struct expr arr[20];
void yyerror(char *s){
printf("Error %s",s);
}
char addtotable(char a, char b, char o){
temp++;
arr[index1].operand1 =a;
arr[index1].operand2 = b;
arr[index1].operator = o;
arr[index1].result=temp;
index1++;
return temp;
}
void threeAdd(){
int i=0;
char temp='A';
while(i<index1){
printf("%c:=\t",arr[i].result);
printf("%c\t",arr[i].operand1);
printf("%c\t",arr[i].operator);
printf("%c\t",arr[i].operand2);
i++;
temp++;
printf("\n");
}
}
void fouradd(){
int i=0;
char temp='A';
while(i<index1){
printf("%c\t",arr[i].operator);
printf("%c\t",arr[i].operand1);
printf("%c\t",arr[i].operand2);
printf("%c",arr[i].result);
i++;
temp++;
printf("\n");
}
}
int find(char l){
int i;
for(i=0;i<index1;i++)
if(arr[i].result==l) break;
return i;
}
void triple(){
int i=0;
char temp='A';
while(i<index1){
printf("%c\t",arr[i].operator);
if(!isupper(arr[i].operand1))
printf("%c\t",arr[i].operand1);
else{
printf("pointer");
printf("%d\t",find(arr[i].operand1));
}
if(!isupper(arr[i].operand2))
printf("%c\t",arr[i].operand2);
else{
printf("pointer");
printf("%d\t",find(arr[i].operand2));
}
i++;
temp++;
printf("\n");
}
}
int yywrap(){
return 1;
}
int main(){
printf("Enter the expression: ");
yyparse();
threeAdd();
printf("\n");
fouradd();
printf("\n");
triple();
return 0;
}
D:= 5 * f
E:= C - D
F:= a = E
*bcA
/13B
+ABC
*5fD
-CDE
=aEF
*bc
/13
+ pointer0 pointer1
*5f
- pointer2 pointer3
= a pointer4
RESULT
Thus the program for three address code generation has been executed
successfully.
Experiment No: 5
AIM
To write a C program to implement type checking
ALGORITHM
Step 1: Track the global scope type information (e.g. classes and their members).
Step 2: Determine the type of each expression recursively, i.e. bottom-up, passing the
resulting types upwards.
Step 3: If the types are found to be correct, perform the operation.
Step 4: If the types do not match, report a semantic error.
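Step 2 can be sketched as a recursive function over an expression tree. The example below is an illustration under stated assumptions, not the program of this experiment: the tree layout (`struct node`), the `type_of` function and the typing rules (division always yields float, and a float value may not be assigned to an int variable, mirroring the checks in this experiment's program) are choices made for this sketch.

```c
#include <stddef.h>

enum type { T_INT, T_FLOAT, T_ERROR };

/* Hypothetical expression tree: a leaf carries a declared type, an
   interior node carries an operator and two children. */
struct node {
    char op;                  /* '+', '-', '*', '/', '=' or 0 for a leaf */
    enum type leaf_type;      /* used only when op == 0 */
    const struct node *left, *right;
};

/* Compute the type of an expression bottom-up; T_ERROR signals a
   semantic error somewhere in the subtree. */
enum type type_of(const struct node *n)
{
    if (n->op == 0)
        return n->leaf_type;
    enum type l = type_of(n->left);
    enum type r = type_of(n->right);
    if (l == T_ERROR || r == T_ERROR)
        return T_ERROR;       /* propagate errors upwards */
    if (n->op == '=')         /* assumed rule: no float into an int variable */
        return (l == T_INT && r == T_FLOAT) ? T_ERROR : l;
    if (n->op == '/')         /* assumed rule: division yields float */
        return T_FLOAT;
    return (l == T_FLOAT || r == T_FLOAT) ? T_FLOAT : T_INT;
}
```

With int variables `a` and `b`, the tree for `a = a / b` evaluates to `T_ERROR`, while the same assignment to a float variable evaluates to `T_FLOAT`.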
PROGRAM CODE
//To implement type checking
#include<stdio.h>
#include<stdlib.h>
int main()
{
int n,i,k,flag=0;
int c; /* int so that the value returned by getchar() is not truncated */
char vari[15],typ[15],b[15];
printf("Enter the number of variables:");
scanf(" %d",&n);
for(i=0;i<n;i++)
{
printf("Enter the variable[%d]:",i);
scanf(" %c",&vari[i]);
printf("Enter the variable-type[%d](float-f,int-i):",i);
scanf(" %c",&typ[i]);
if(typ[i]=='f')
flag=1;
}
printf("Enter the Expression(end with $):");
i=0;
getchar();
while((c=getchar())!='$' && c!=EOF && i<(int)sizeof(b)) /* stay within b[] */
{
b[i]=c;
i++; }
k=i;
for(i=0;i<k;i++)
{
if(b[i]=='/')
{
flag=1;
break; } }
for(i=0;i<n;i++)
{
if(b[0]==vari[i])
{
if(flag==1)
{
if(typ[i]=='f')
{ printf("\nthe datatype is correctly defined..!\n");
break; }
else
{ printf("Identifier %c must be a float type..!\n",vari[i]);
break; } }
else
OUTPUT
RESULT
Thus the above program is compiled and executed successfully and output is
verified.
Experiment No: 6
AIM
To write a program for implementation of Code Optimization Technique.
ALGORITHM
Step 1: Write the factorial program using both a for loop and a do-while loop to
illustrate the optimization technique.
Step 2: In a for loop, the variable initialization is performed first and the condition is
checked next. If the condition is true, the corresponding statements are executed and
the specified increment / decrement operation is performed.
Step 3: The for loop repeats until the condition fails.
Step 4: In a do-while loop, the variable is initialized and the statements are executed; only
then are the condition check and the increment / decrement operation performed.
Step 5: Comparing the two, a do-while loop executes its statements before the first
condition check, so it saves one test whenever the body is known to run at least
once.
Step 6: For such loops, the do-while form is therefore the better choice with
respect to performance.
PROGRAM CODE
//Code Optimization Technique
#include<stdio.h>
#include<string.h>
struct op
{
char l;
char r[20];
}
op[10],pr[10];
int main(void)
{
{
int a,i,k,j,n,z=0,m,q;
char *p,*l;
char temp,t;
char *tem;
printf("Enter the Number of Values:");
scanf("%d",&n);
for(i=0;i<n;i++)
{
printf("left: ");
scanf(" %c",&op[i].l);
printf("right: ");
scanf(" %19s",op[i].r); /* at most 19 characters fit in r[20] */
}
printf("Intermediate Code\n") ;
for(i=0;i<n;i++)
{
printf("%c=",op[i].l);
printf("%s\n",op[i].r);
}
for(i=0;i<n-1;i++)
{
temp=op[i].l;
for(j=0;j<n;j++)
{
p=strchr(op[j].r,temp);
if(p)
{
pr[z].l=op[i].l;
strcpy(pr[z].r,op[i].r);
z++;
}
}
}
pr[z].l=op[n-1].l;
strcpy(pr[z].r,op[n-1].r);
z++;
printf("\nAfter Dead Code Elimination\n");
for(k=0;k<z;k++)
{
printf("%c\t=",pr[k].l);
printf("%s\n",pr[k].r);
}
for(m=0;m<z;m++)
{
tem=pr[m].r;
for(j=m+1;j<z;j++)
{
p=strstr(tem,pr[j].r);
if(p)
{
t=pr[j].l;
pr[j].l=pr[m].l;
for(i=0;i<z;i++)
{
l=strchr(pr[i].r,t) ;
if(l)
{
a=l-pr[i].r;
printf("pos: %d\n",a);
pr[i].r[a]=pr[m].l;
}}}}}
printf("Eliminate Common Expression\n");
for(i=0;i<z;i++)
{
printf("%c\t=",pr[i].l);
printf("%s\n",pr[i].r);
}
for(i=0;i<z;i++)
{
for(j=i+1;j<z;j++)
{
q=strcmp(pr[i].r,pr[j].r);
if((pr[i].l==pr[j].l)&&!q)
{
pr[i].l='\0';
}
}
}
printf("Optimized Code\n");
for(i=0;i<z;i++)
{
if(pr[i].l!='\0')
{
printf("%c=",pr[i].l);
printf("%s\n",pr[i].r);
}
}
}
OUTPUT
RESULT
Thus the program to implement code optimization technique has been executed
successfully.
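The experiment title lists constant folding, strength reduction and algebraic transformation, whereas the program above performs dead code elimination and common subexpression elimination. The sketch below shows the three listed techniques applied to a single three-address instruction. The representation (`struct operand`, `struct tac`) and the function `optimize` are hypothetical, and only a few representative rewrite rules are included.

```c
#include <stdbool.h>

/* Hypothetical operand: an integer constant or a one-letter variable. */
struct operand {
    bool is_const;
    int value;      /* valid when is_const */
    char name;      /* valid when !is_const */
};

/* Hypothetical three-address instruction: dst = a op b.
   op == '=' means a plain copy dst = a (b is ignored). */
struct tac {
    char dst;
    char op;        /* '+', '-', '*', '/' or '=' */
    struct operand a, b;
};

static bool is_k(struct operand o, int k) { return o.is_const && o.value == k; }

/* Rewrite one instruction in place; return true if it changed. */
bool optimize(struct tac *t)
{
    /* Constant folding: both operands are known at compile time. */
    if (t->a.is_const && t->b.is_const && t->op != '=') {
        int x = t->a.value, y = t->b.value, r;
        switch (t->op) {
        case '+': r = x + y; break;
        case '-': r = x - y; break;
        case '*': r = x * y; break;
        case '/': if (y == 0) return false; r = x / y; break;
        default:  return false;
        }
        t->op = '=';
        t->a.value = r;
        return true;
    }
    /* Algebraic transformation: x + 0, x - 0, x * 1 and x / 1 are just x. */
    if (((t->op == '+' || t->op == '-') && is_k(t->b, 0)) ||
        ((t->op == '*' || t->op == '/') && is_k(t->b, 1))) {
        t->op = '=';
        return true;
    }
    /* Strength reduction: x * 2 becomes the cheaper x + x. */
    if (t->op == '*' && !t->a.is_const && is_k(t->b, 2)) {
        t->op = '+';
        t->b = t->a;
        return true;
    }
    return false;
}
```

For example, `t = 3 * 4` becomes the copy `t = 12`, `t = x * 2` becomes `t = x + x`, and `t = x + 0` becomes `t = x`. Division by a constant zero is left untouched so that the error still appears at run time.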
Experiment No: 7
INTRODUCTION
A compiler is a computer program that implements a programming language
specification to “translate” programs, usually a set of files that constitute the
source code written in the source language, into equivalent machine-readable
instructions (the target language, often in a binary form known as object code).
This translation process is called compilation.
BACK END
Some local optimization
Register allocation
Peep-hole optimization
Code generation
Instruction scheduling
ALGORITHM
1. Start the program.
2. Open the source file and store its contents as quadruples.
3. Check the operator in each quadruple: for an arithmetic operator, generate the
corresponding instruction; for an assignment operator, generate the assignment;
otherwise, perform unary minus on register C.
4. Write the generated code to the output file, outp.c.
5. Print the output.
6. Stop the program.
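The core of the steps above, turning one three-address statement into target instructions, can be sketched as a self-contained function. This is an illustration under assumptions, not the program of this experiment: it accepts only statements of the form `x=y+z` with single-letter operands, emits the same register-style pseudo-assembly as this experiment's listing rather than exact 8086 syntax, and `gen_target` is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/* Translate one statement of the assumed form "x=y+z" (single-letter
   operands, operator at index 3) into three pseudo-assembly lines
   written to out. Returns 0 on success, or -1 if the statement does
   not have that shape or out is too small. */
int gen_target(const char *stmt, int reg, char *out, size_t outsz)
{
    const char *opr;
    if (strlen(stmt) != 5 || stmt[1] != '=')
        return -1;
    switch (stmt[3]) {
    case '+': opr = "ADD"; break;
    case '-': opr = "SUB"; break;
    case '*': opr = "MUL"; break;
    case '/': opr = "DIV"; break;
    default:  return -1;
    }
    /* Load the first operand, apply the operator, store the result. */
    int n = snprintf(out, outsz, "MOV %c,R%d\n%s %c,R%d\nMOV R%d,%c\n",
                     stmt[2], reg, opr, stmt[4], reg, reg, stmt[0]);
    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Calling `gen_target("a=b+c", 0, buf, sizeof buf)` fills `buf` with the three lines `MOV b,R0`, `ADD c,R0` and `MOV R0,a`. Returning an error for malformed statements avoids emitting an instruction with an undefined operator.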
scanf("%s",icode[i]);
} while(strcmp(icode[i++],"exit")!=0);
printf("\n target code generation");
printf("\n************************");
i=0;
do
{
strcpy(str,icode[i]);
switch(str[3])
{
case '+': strcpy(opr,"ADD");
break;
case '-': strcpy(opr,"SUB");
break;
case '*': strcpy(opr,"MUL");
break;
case '/': strcpy(opr,"DIV");
break;
}
printf("\n\tMov %c,R%d",str[2],i);
printf("\n\t%s%c,R%d",opr,str[4],i);
printf("\n\tMov R%d,%c",i,str[0]);
}while(strcmp(icode[++i],"exit")!=0);
//getch();
}
OUTPUT
RESULT
Thus the program to generate target code from three address code (TAC) has been
executed successfully.