Compiler Design Lab Manual
VIT
Vellore Institute of Technology
(Deemed to be University under section 3 of UGC Act, 1956)
Lab Manual
EXPERIMENT 1
Problem Statement:
Concept to be applied:
A lexical analyzer reads the characters from the source code and converts them into
tokens. Different tokens or lexemes are:
Identifiers: Names of variables and functions (e.g., varName, function1).
Constants: Numeric literals (e.g., 123, 3.14).
Operators: Symbols for operations (e.g., +, -, *, /).
Comments: Single-line (// comment) and multi-line (/* comment */).
Keywords: Reserved words in C (e.g., int, return, if).
Delimiters: Punctuation (e.g., ;, {, }, (, )).
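The token categories above can be sketched as a small classifier for one complete lexeme. This is only an illustration: the enum names, the function classify_lexeme and the shortened keyword list are assumptions, not part of the lab program.

```c
#include <ctype.h>
#include <string.h>

/* Token categories from the list above (names are illustrative). */
typedef enum { TOK_KEYWORD, TOK_IDENTIFIER, TOK_CONSTANT,
               TOK_OPERATOR, TOK_DELIMITER, TOK_UNKNOWN } TokenKind;

/* Classify one complete lexeme; only a small keyword subset is checked. */
TokenKind classify_lexeme(const char *lexeme)
{
    static const char *keywords[] = { "int", "return", "if", "else", "while", "void" };
    size_t i, n = strlen(lexeme);

    if (n == 0)
        return TOK_UNKNOWN;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return TOK_KEYWORD;

    /* Identifier: a letter or underscore followed by letters, digits, underscores. */
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (i = 1; i < n; i++)
            if (!isalnum((unsigned char)lexeme[i]) && lexeme[i] != '_')
                return TOK_UNKNOWN;
        return TOK_IDENTIFIER;
    }

    /* Constant: digits with at most one decimal point. */
    if (isdigit((unsigned char)lexeme[0])) {
        int dots = 0;
        for (i = 1; i < n; i++) {
            if (lexeme[i] == '.')
                dots++;
            else if (!isdigit((unsigned char)lexeme[i]))
                return TOK_UNKNOWN;
        }
        return dots <= 1 ? TOK_CONSTANT : TOK_UNKNOWN;
    }

    if (n == 1 && strchr("+-*/%=", lexeme[0]) != NULL)
        return TOK_OPERATOR;
    if (n == 1 && strchr(";{}(),", lexeme[0]) != NULL)
        return TOK_DELIMITER;
    return TOK_UNKNOWN;
}
```

Calling classify_lexeme("varName") is meant to yield TOK_IDENTIFIER, while classify_lexeme("3.14") is meant to yield TOK_CONSTANT.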
Algorithm:
Step 1: Start the program.
Step 2: Construct a function isKeyword which receives an array of characters and checks
whether it is a keyword or not.
Step 3: Write the main function.
Step 4: Initialize operators array which has all sorts of operators present.
Step 5: Create a file named prog.txt having a simple C program.
Step 6: Open the file in read mode with fopen("prog.txt", "r").
Step 7: Read the file character by character and check whether the text read so far is a
keyword, an identifier or an operator.
Step 8: Repeat this check until fgetc(fp) returns EOF.
Step 9: Close the file by fclose(fp).
Step 10: End.
Program:
Lexical analyser code:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<ctype.h>

/* Returns 1 if buffer holds one of the listed C keywords, 0 otherwise. */
int isKeyword(const char buffer[]){
    static const char *keywords[] = {"auto","break","case","char","const","continue","default",
        "do","double","else","enum","extern","float","for","if","int","long","return","short",
        "sizeof","static","struct","switch","void","while"};
    size_t i;
    /* The keyword count is derived from the array, so no empty slots are compared. */
    for(i = 0; i < sizeof keywords / sizeof keywords[0]; ++i){
        if(strcmp(keywords[i], buffer) == 0)
            return 1;
    }
    return 0;
}

int main(){
    int ch;                         /* int, so that EOF can be told apart from a character */
    char buff[32], operators[] = "+-*/%=";
    FILE *fp;
    int j = 0;
    fp = fopen("prog.txt","r");
    if(fp == NULL){
        printf("error while opening the file\n");
        exit(1);
    }
    while((ch = fgetc(fp)) != EOF){
        if(isalnum(ch)){
            if(j < (int)sizeof buff - 1)    /* guard against buffer overflow */
                buff[j++] = (char)ch;
            continue;
        }
        /* Any non-alphanumeric character ends the current word. */
        if(j != 0){
            buff[j] = '\0';
            j = 0;
            if(isKeyword(buff))
                printf("%s is keyword\n", buff);
            else
                printf("%s is identifier\n", buff);
        }
        if(ch != '\0' && strchr(operators, ch) != NULL)
            printf("%c is operator\n", ch);
    }
    fclose(fp);
    return 0;
}
Input:
Sample file (prog.txt)
void main()
{
int b, c;
int a = b + c;
}
Output:
EXPERIMENT 2(a)
Problem Statement:
To write a Lex program to count the number of lines, spaces and tabs.
Concept to be Applied:
Lexical Analysis:
Lex is a tool for generating scanners that recognize patterns in text. The patterns are
described using regular expressions, and Lex automatically generates code that
matches these patterns in the input text.
Pattern Matching:
In Lex, the input file is processed based on patterns. For this task, we can define patterns
for:
Newline (\n): Represents the end of a line. Each time this pattern is matched,
we increment the line counter.
Space (' '): Each time a space is encountered, we increment the space counter.
Tab ('\t'): Each time a tab is encountered, we increment the tab counter.
Counters:
Three counters will be used to keep track of the number of lines, the number of
spaces and the number of tabs.
For each of the patterns (newline, space, and tab), we will associate an action that
increments the respective counter.
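For comparison, the same three counters can be sketched in plain C without Lex. The struct WsCounts and the function count_whitespace are illustrative names, not part of the lab program.

```c
#include <stddef.h>

/* One counter per pattern: newline, space, tab. */
typedef struct { int lines, spaces, tabs; } WsCounts;

/* Scan the text once and increment the counter that matches each character. */
WsCounts count_whitespace(const char *text)
{
    WsCounts c = {0, 0, 0};
    for (; *text != '\0'; text++) {
        if (*text == '\n')
            c.lines++;
        else if (*text == ' ')
            c.spaces++;
        else if (*text == '\t')
            c.tabs++;
    }
    return c;
}
```

Each branch plays the role of one Lex rule: the character test is the pattern and the increment is the action.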
Algorithm:
Step 1: Initialize Counters: Set lc (line count), sc (space count), tc (tab count), wc (word
count), ch (character count) to 0.
Step 2: Start Lexical Analysis:
Display a prompt to the user to enter a sentence.
Call the yylex() function to start processing the input.
Step 3: Processing Input:
For each token read from the input:
If the token is a newline character (\n):
Increment lc by 1 (counting the line).
Increment ch by 1 (counting the newline character itself).
If the token is a space ( ) or a tab (\t):
If the token is a space, increment sc by 1 (counting the space).
If the token is a tab, increment tc by 1 (counting the tab).
Increment ch by 1 (counting the space or tab character itself).
If the token is any other character:
Increment ch by 1 (counting the character).
If the token matches one or more non-whitespace, non-newline characters
(a word):
Increment wc by 1 (counting the word).
Increment ch by the length of the token (counting the characters in the
word).
Step 4: End of Input: The yywrap() function returns 1, indicating the end of the file/input
stream has been reached.
Step 5: Display Results: Print the number of lines (lc), number of spaces (sc), number of tabs
(tc), words (wc) and characters (ch).
Program:
/* Description/Definition Section */
%{
#include<stdio.h>
int lc=0,sc=0,tc=0,ch=0,wc=0; /* Global counters */
%}
/* Rule Section */
%%
[\n] { lc++; ch+=yyleng;}
[ ] { sc++; ch+=yyleng;}
[\t] { tc++; ch+=yyleng;}
[^\t\n ]+ { wc++; ch+=yyleng;}
%%
/* Returns 1 to signal that there is no further input */
int yywrap(){
return 1;
}
/* Main Function */
int main(){
printf("Enter the Sentence : ");
yylex();
printf("Number of lines : %d\n",lc);
printf("Number of spaces : %d\n",sc);
printf("Number of tabs, words, charc : %d , %d , %d\n",tc,wc,ch);
return 0;
}
Sample Input and Output:
EXPERIMENT 2(b)
Problem Statement:
To write a Lex program to check whether given number is an Armstrong number or not.
Concept to be Applied:
An Armstrong number is a number that is the sum of its own digits each raised to the power
of the number of digits. For example, 153 is an Armstrong number,
(1^3) + (5^3) + (3^3) = 153
Examples:
Input: 153
Output: 153 is an Armstrong number
Input: 250
Output: 250 is not an Armstrong number
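The definition can be sketched as a standalone C function, independent of Lex. The name is_armstrong is an assumption used for illustration; integer arithmetic is used for the powers so that no floating-point rounding is involved.

```c
/* Returns 1 if n equals the sum of its digits, each raised to the number of digits. */
int is_armstrong(unsigned int n)
{
    unsigned int digits = 0, t = n;
    unsigned long long sum = 0;

    /* Count the digits (0 counts as one digit). */
    do {
        digits++;
        t /= 10;
    } while (t > 0);

    /* Add digit^digits for every digit. */
    for (t = n; t > 0; t /= 10) {
        unsigned long long power = 1;
        unsigned int d = t % 10, i;
        for (i = 0; i < digits; i++)
            power *= d;
        sum += power;
    }
    return sum == n;
}
```

For 153 the loop adds 27 + 125 + 1 = 153, so the function reports an Armstrong number; for 250 it adds 0 + 125 + 8 = 133, so it does not.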
Algorithm:
Step 1: Input Reading: Read numbers from a file (or standard input).
Step 2: Length Calculation: For each number, determine the number of digits.
Step 3: Sum Calculation: Calculate the sum of each digit raised to the power of the number
of digits.
Step 4: Comparison: Compare the calculated sum with the original number.
Step 5: Output Result: Print whether the number is an Armstrong number or not.
Program:
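/* Lex program to check whether given number is Armstrong number or not */
%{
/* Definition section */
#include <stdio.h>
#include <string.h>
void check(char*);
%}
/* Rule Section */
%%
[0-9]+ check(yytext);
%%
int yywrap()
{
return 1;
}
int main()
{
/* yyin as pointer of File type */
extern FILE* yyin;
yyin = fopen("num", "r");
if (yyin == NULL) {
printf("cannot open input file\n");
return 1;
}
// The function that starts the analysis
yylex();
return 0;
}
void check(char* a)
{
int len = strlen(a), i, num = 0, temp, sum = 0;
for (i = 0; i < len; i++)
num = num * 10 + (a[i] - '0');
temp = num;
/* Algorithm Step 3: sum each digit raised to the power len */
while (num > 0) {
int digit = num % 10, power = 1;
for (i = 0; i < len; i++)
power *= digit;
sum += power;
num /= 10;
}
/* Algorithm Step 4: compare the sum with the original number */
if (sum == temp)
printf("%d is armstrong number\n", temp);
else
printf("%d is not armstrong number\n", temp);
}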
Sample Input and Output:
EXPERIMENT 3
Problem Statement:
To write a Lex program that copies file, replacing each nonempty sequence of white spaces
by a single blank.
Concept to be Applied:
The goal is to read from an input file and output its contents to another file, ensuring that:
Any sequence of one or more whitespace characters (spaces, tabs, newlines) is replaced
with a single space.
Whitespace Characters: These include spaces ( ), tabs (\t), and newline characters (\n).
Lex (Lexical Analyzer Generator): A tool used to generate a lexer that can recognize
patterns in text.
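The whitespace rule can also be sketched in plain C on an in-memory string. The function squeeze_whitespace is a hypothetical helper, and treating only space, tab and newline as whitespace follows the definition above.

```c
#include <stddef.h>

/* Copies src to dst, replacing every run of spaces, tabs and newlines
   with a single blank. dst must have room for strlen(src) + 1 characters. */
void squeeze_whitespace(const char *src, char *dst)
{
    int in_run = 0;   /* 1 while we are inside a whitespace run */

    for (; *src != '\0'; src++) {
        if (*src == ' ' || *src == '\t' || *src == '\n') {
            if (!in_run)
                *dst++ = ' ';   /* emit one blank for the whole run */
            in_run = 1;
        } else {
            *dst++ = *src;      /* non-whitespace is copied as it is */
            in_run = 0;
        }
    }
    *dst = '\0';
}
```

The flag in_run does the job that the + operator does in a regular expression: it makes the whole run count as one match.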
Algorithm:
Step 1: Read Input: Open the input file and prepare to read its contents.
Step 2: Match Whitespace: Use a regex pattern to match sequences of whitespace characters.
Step 3: Output Process: For each match:
If the match is a whitespace sequence, output a single space.
If it is a non-whitespace character, output that character as-it-is.
Step 4: Write to Output: Write the processed output to a file or standard output.
Program:
File: A5.l
%{
#include <stdio.h>
%}
ws [ \t\n]
%%
{ws}+ { /* Any nonempty run of spaces, tabs or newlines becomes one blank */
fprintf(yyout," ");
}
%%
int yywrap()
{
return 1;
}
int main()
{
// Point yyin to a file with text, this acts as input to our program
yyin = fopen("A5_input.txt","r");
// Point yyout to output file.
yyout = fopen("A5_output.txt","w");
if (yyin == NULL || yyout == NULL) {
printf("cannot open input or output file\n");
return 1;
}
yylex();
fclose(yyin);
fclose(yyout);
return 0;
}
---------------------------------------------------------------------------------------------------------------
File: A5_input.txt
Hello, Friends
Service to humanity
is
service to divinity.
If
you
don't
know
how
compiler works,
then
you don't
know how
computer works
Sample Input and Output:
EXPERIMENT 4
Problem Statement:
To write a Lex program that converts a file to Pig Latin. Assume the file is a
sequence of words (groups of letters) separated by whitespace. Every time you encounter a
word:
(a) If the first letter is a consonant, move it to the end of the word and then add ay.
(b) If the first letter is a vowel, just add ay to the end of the word.
All non-letters are copied intact to the output.
Concept to be Applied:
Pig Latin is a playful form of language transformation, primarily used in English. The rules for
converting words to Pig Latin are as follows:
1. If a word starts with a consonant, move the first letter to the end of the word and add "ay".
   Example: "hello" becomes "ellohay".
2. If a word starts with a vowel (a, e, i, o, u), simply add "ay" to the end of the word.
   Example: "apple" becomes "appleay".
3. Non-letter characters (punctuation, whitespace) should remain unchanged.
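Rules 1 and 2 can be sketched for a single word in plain C. The function pig_latin_word is an illustrative name, and the input is assumed to consist of letters only.

```c
#include <stdio.h>
#include <string.h>

/* Writes the Pig Latin form of word into out (at most out_size bytes). */
void pig_latin_word(const char *word, char *out, size_t out_size)
{
    if (word[0] == '\0') {              /* empty input: nothing to transform */
        if (out_size > 0)
            out[0] = '\0';
        return;
    }
    if (strchr("aeiouAEIOU", word[0]) != NULL)
        snprintf(out, out_size, "%say", word);                /* rule 2: vowel */
    else
        snprintf(out, out_size, "%s%cay", word + 1, word[0]); /* rule 1: consonant */
}
```

With this sketch, "hello" becomes "ellohay" and "apple" becomes "appleay", matching the examples above.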
Algorithm:
Step 1: Read Input: Open the input file and prepare to read its contents.
Step 2: Identify Words: Use regex to identify sequences of letters as words.
Step 3: Transform Each Word: For each identified word:
Program:
File: A6.l
%{
#include <stdio.h>
%}
c [a-zA-Z]
vowel [aeiouAEIOU]
%%
{vowel}{c}* {
/* First character is a vowel: append "ay" to the word */
printf("%say ",yytext);
fprintf(yyout,"%say",yytext);
}
{c}{c}* {
/* First character is a consonant: print the word without its first
character, then the first character, then "ay" */
printf("%s%cay ",yytext+1,yytext[0]);
fprintf(yyout,"%s%cay",yytext+1,yytext[0]);
}
%%
int yywrap()
{
return 1;
}
int main()
{
printf("The output is : \n");
yyin = fopen("A6_input.txt","r");
yyout = fopen("A6_outputfile1.txt","w");
if (yyin == NULL || yyout == NULL) {
printf("cannot open input or output file\n");
return 1;
}
yylex();
printf("\n\n\n\n\n");
fclose(yyin);
fclose(yyout);
/* Second pass: the same rules are applied to the first output file */
yyin = fopen("A6_outputfile1.txt","r");
yyout = fopen("A6_outputfile2.txt","w");
if (yyin == NULL || yyout == NULL) {
printf("cannot open input or output file\n");
return 1;
}
yylex();
printf("\n");
fclose(yyin);
fclose(yyout);
return 0;
}
------------------------------------------------------------------------------------------------------------
File: A6_input.txt
Many of Lifes failures are people who did not realize how close they were to success when
they gave up- Thomas Edison
Sample Input and Output:
EXPERIMENT 5
Problem Statement:
To write a C program to implement left recursion.
Concept to be Applied:
Left Recursion:
A grammar G = (V, T, S, P) is said to be left recursive if it has a production of
the form A -> Aα | β.
In other words, a grammar production is said to have left recursion if the leftmost
variable of its Right Hand Side is the same as the variable of its Left Hand Side. A
grammar containing a production having left recursion is called a Left Recursive
Grammar.
Eliminating Left Recursion
Left recursion can be eliminated by rewriting the offending productions.
Consider a nonterminal A with two productions A -> Aα | β.
The nonterminal A and its productions are said to be left recursive because the
production A -> Aα has A itself as the leftmost symbol on the right side.
The left-recursive pair of productions can be replaced by the non-left-recursive
productions:
A -> βA'
A' -> αA' | ε
Algorithm:
Step 1: Input Grammar: Read the grammar rules from the user.
Step 2: Identify Left Recursion: Analyze each production to detect left recursion.
Step 3: Transform Grammar: Convert left-recursive rules into their equivalent non-left-
recursive forms.
Step 4: Output the Transformed Grammar: Display the new grammar without left
recursion
Program:
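#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define SIZE 20
int main()
{
char pro[SIZE]="", alpha[SIZE], beta[SIZE];
int nont_terminal,i,j, index=3;
/* Assumed input format: a single production such as E->E+T|T (no spaces) */
printf("Enter the production (e.g. E->E+T|T): ");
if(scanf("%19s", pro)!=1)
return 1;
nont_terminal=pro[0];
if(nont_terminal==pro[index]) //Checking if the Grammar is LEFT RECURSIVE
{
//Getting Alpha
for(i=++index,j=0;pro[i]!='|';i++,j++){
alpha[j]=pro[i];
//Checking if there is NO Vertical Bar (|)
if(pro[i+1]==0){
printf("This Grammar CAN'T BE REDUCED.\n");
exit(0); //Exit the Program
}
}
alpha[j]='\0'; //String Ending NULL Character
//Getting Beta: skip the vertical bar and copy the rest
for(j=i+1,i=0;pro[j]!='\0';i++,j++){
beta[i]=pro[j];
}
beta[i]='\0'; //String Ending NULL character
//Showing the grammar without left recursion ('#' stands for epsilon)
printf("\nGrammar Without Left Recursion:\n\n");
printf(" %c->%s%c'\n", nont_terminal,beta,nont_terminal);
printf(" %c'->%s%c'|#\n", nont_terminal,alpha,nont_terminal);
}
else
printf("\nThis Grammar is not LEFT RECURSIVE.\n");
return 0;
}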
EXPERIMENT 6
Problem Statement:
To write a C program to implement left factoring.
Concept to be Applied:
Left Factoring:
Left factoring is a process by which the grammar with common prefixes is transformed
to make it useful for Top down parsers.
In left factoring,
The grammar obtained after the process of left factoring is called as Left Factored
Grammar.
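The core of left factoring, finding the longest common prefix of two alternatives and moving the remainders into a new non-terminal, can be sketched as follows. The function names, the new non-terminal X and the use of '#' for an empty remainder are assumptions made for this illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Length of the longest common prefix of a and b. */
size_t common_prefix_len(const char *a, const char *b)
{
    size_t n = 0;
    while (a[n] != '\0' && a[n] == b[n])
        n++;
    return n;
}

/* For A -> alt1 | alt2 with common prefix p, writes "pX" into head and
   "rest1|rest2" into tail ('#' marks an empty remainder).
   Returns 0 if the alternatives share no prefix, 1 otherwise. */
int left_factor(const char *alt1, const char *alt2,
                char *head, size_t head_size,
                char *tail, size_t tail_size)
{
    size_t n = common_prefix_len(alt1, alt2);
    if (n == 0)
        return 0;
    snprintf(head, head_size, "%.*sX", (int)n, alt1);
    snprintf(tail, tail_size, "%s|%s",
             alt1[n] != '\0' ? alt1 + n : "#",
             alt2[n] != '\0' ? alt2 + n : "#");
    return 1;
}
```

For the production A -> abc | abd this yields A -> abX and X -> c | d.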
Example:
Do left factoring in the following grammar:
S
E
Algorithm:
Step 1: Input Grammar: Read the grammar rules from the user.
Step 2: Identify Common Prefixes: For each non-terminal, check its productions for common
prefixes.
Step 3: Transform Grammar: Factor out the common prefixes and create new non-terminals.
Step 4: Output the Transformed Grammar: Display the new left-factored grammar.
Program:
#include<stdio.h>
#include<string.h>
int main()
{
char gram[20],part1[20],part2[20],modifiedGram[22],newGram[42];
int i,j=0,k=0,pos=0;
printf("Enter Production : A->");
/* fgets replaces the unsafe gets; the trailing newline is stripped */
if(fgets(gram,sizeof gram,stdin)==NULL)
return 1;
gram[strcspn(gram,"\n")]='\0';
if(strchr(gram,'|')==NULL){
printf("Expected two alternatives separated by '|'\n");
return 1;
}
for(i=0;gram[i]!='|';i++,j++)
part1[j]=gram[i];
part1[j]='\0';
for(j=++i,i=0;gram[j]!='\0';j++,i++)
part2[i]=gram[j];
part2[i]='\0';
/* The common prefix ends at the first mismatch */
for(i=0;part1[i]!='\0'&&part1[i]==part2[i];i++){
modifiedGram[k]=part1[i];
k++;
pos=i+1;
}
for(i=pos,j=0;part1[i]!='\0';i++,j++){
newGram[j]=part1[i];
}
newGram[j++]='|';
for(i=pos;part2[i]!='\0';i++,j++){
newGram[j]=part2[i];
}
modifiedGram[k]='X';
modifiedGram[++k]='\0';
newGram[j]='\0';
printf("\nLeft Factored Grammar : \n");
printf(" A->%s",modifiedGram);
printf("\n X->%s\n",newGram);
return 0;
}
Sample Input and Output:
EXPERIMENT 7
Problem Statement:
To write a C program to calculate First and Follow sets of a given grammar.
Concept to be Applied:
First and Follow:
The functions follow and followfirst are both involved in the calculation of the Follow Set of
a given Non-Terminal. The follow set of the start symbol always contains '$'. The
calculation of Follow falls under three broad cases:
If a Non-Terminal on the R.H.S. of any production is followed immediately by a
Terminal then it can immediately be included in the Follow set of that Non-Terminal.
If a Non-Terminal on the R.H.S. of any production is followed immediately by a Non-Terminal,
then the First Set of that new Non-Terminal gets included in the Follow set
of our original Non-Terminal. If an epsilon ('#') is encountered, move on
to the next symbol in the production. Epsilon itself is never included in the Follow set
of any Non-Terminal.
If reached the end of a production while calculating follow, then the Follow set of that
non-terminal will include the Follow set of the Non-Terminal on the L.H.S. of that
production. This can easily be implemented by recursion.
Assumptions:
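1. Epsilon is represented by '#'.
2. Productions are of the form A=B, where A is a single non-terminal and B can be
any combination of terminals and non-terminals.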
3. L.H.S. of the first production rule is the start symbol.
4. Grammar is not left recursive.
5. Each production of a non-terminal is entered on a different line.
6. Only uppercase letters are non-terminals and everything else is a terminal.
7.
Explanation:
Store the grammar on a 2D character array production. findfirst function is for calculating the
first of any non-terminal. Calculation of first falls under two broad cases:
If the first symbol in the R.H.S of the production is a Terminal then it can directly be
included in the first set.
If the first symbol in the R.H.S of the production is a non-terminal then call the findfirst
function again on that non-terminal. Recursion is the most natural way to handle these
cases. Here again, if the First of the new non-terminal contains an epsilon
then we have to move to the next symbol of the original production, which can again be
a Terminal or a Non-Terminal.
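The two cases for First can be sketched as a compact recursive function. This is not the lab program's findfirst; the names add_symbol and first_of and the fixed buffer size are assumptions. It relies on the stated assumptions: productions have the form A=B, '#' is epsilon, uppercase letters are non-terminals, and the grammar is not left recursive.

```c
#include <ctype.h>
#include <string.h>

/* Adds c to set (a NUL-terminated string) if it is not already present. */
static void add_symbol(char *set, char c)
{
    if (strchr(set, c) == NULL) {
        size_t n = strlen(set);
        set[n] = c;
        set[n + 1] = '\0';
    }
}

/* Appends FIRST(symbol) to out. out must be a string buffer of at least
   128 bytes, initialised to "" by the caller. */
void first_of(char symbol, const char *prods[], int count, char *out)
{
    int p;

    /* Case 1: a terminal (or '#') is its own First set. */
    if (!isupper((unsigned char)symbol)) {
        add_symbol(out, symbol);
        return;
    }

    /* Case 2: a non-terminal; look at every production symbol=... */
    for (p = 0; p < count; p++) {
        const char *rhs;
        int prefix_nullable = 1;

        if (prods[p][0] != symbol)
            continue;

        /* Walk the R.H.S. while every symbol seen so far can derive epsilon. */
        for (rhs = prods[p] + 2; *rhs != '\0' && prefix_nullable; rhs++) {
            char sub[128] = "";
            size_t i;

            first_of(*rhs, prods, count, sub);
            prefix_nullable = 0;
            for (i = 0; sub[i] != '\0'; i++) {
                if (sub[i] == '#')
                    prefix_nullable = 1;   /* move on to the next symbol */
                else
                    add_symbol(out, sub[i]);
            }
        }
        /* The whole R.H.S. can vanish, so epsilon belongs to the set. */
        if (prefix_nullable)
            add_symbol(out, '#');
    }
}
```

For the grammar E=TR, R=+TR, R=#, T=FY, Y=*FY, Y=#, F=(E), F=i, calling first_of('E', ...) collects ( and i, while first_of('R', ...) collects + and #.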
Algorithm:
Step 1: Input the Grammar:
Ask the user to input the number of productions.
For each production, prompt the user to input a production rule (e.g., A=B).
Step 2: Initialize Data Structures:
Create arrays to store First and Follow sets (calc_first[] and calc_follow[]), initialized
to empty values.
Prepare temporary arrays first[] and f[] for storing intermediate results during
calculations.
Use arrays done[] and donee[] to keep track of which non-terminals already have their
First and Follow sets calculated.
Step 3: Compute the First Sets:
For each production, check if the First set of the left-hand non-terminal (e.g., A in A=B)
has already been computed.
If not, call the findfirst() function to calculate the First set for A.
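Inside findfirst():
If the first symbol on the right-hand side is a terminal, add it to the First set.
If it is a non-terminal, call findfirst() recursively on it; if that First set contains
epsilon ('#'), continue with the next symbol of the production.
Store and print the First set for the non-terminal.
Step 4: Compute the Follow Sets:
For each non-terminal whose Follow set has not been computed, call follow().
Inside follow():
If the non-terminal is the start symbol, add '$' to its Follow set.
Look for occurrences of the non-terminal in the right-hand side of other
productions.
If followed by a terminal, add the terminal to the Follow set.
If followed by a non-terminal, add the First set of the non-terminal to the Follow
set.
If it appears at the end of a production, add the Follow set of the left-hand
non-terminal.
Store and print the Follow set for the non-terminal.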
Step 5: Output the Results: After all computations, print the First and Follow sets for each
non-terminal.
Program:
#include <ctype.h>
#include <stdio.h>
#include <string.h>
int count, n = 0;
int point1 = 0, point2, xxx;
if (xxx == 1)
continue;
if (first[i] == calc_first[point1][lark]) {
chk = 1;
break;
}
}
if (chk == 0) {
printf("%c, ", first[i]);
calc_first[point1][point2++] = first[i];
}
}
printf("}\n");
jm = n;
point1++;
}
printf("\n-----------------------------------------------\n\n");
xxx = 0;
if (xxx == 1)
continue;
land += 1;
printf("%c, ", f[i]);
calc_follow[point1][point2++] = f[i];
}
}
printf(" }\n\n");
km = m;
point1++;
}
}
// If terminal is encountered
if (!(isupper(c))) {
first[n++] = c;
return;
}
for (j = 0; j < count; j++) {
if (production[j][0] == c) {
if (production[j][2] == '#') {
if (production[q1][q2] == '\0')
first[n++] = '#';
else if (production[q1][q2] != '\0'
&& (q1 != 0 || q2 != 0)) {
findfirst(production[q1][q2], q1,
(q2 + 1));
} else
first[n++] = '#';
} else if (!isupper(production[j][2])) {
first[n++] = production[j][2];
} else {
findfirst(production[j][2], j, 3);
}
}
}
}
}
}
}
// If terminal is encountered
if (!(isupper(c)))
f[m++] = c;
else {
int i = 0, j = 1;
for (i = 0; i < count; i++) {
if (calc_first[i][0] == c)
break;
}
j++;
}
}
}
Sample Input and Output:
EXPERIMENT 8
Problem Statement:
To write a C program to implement Recursive Descent Parsing.
Concept to be Applied:
Parsing:
Parsing is the process to determine whether the start symbol can derive the program or not. If
the Parsing is successful then the program is a valid program otherwise the program is invalid.
There are generally two types of Parsers:
1. Top-Down Parsers:
In this Parsing technique we expand the start symbol to the whole program.
Recursive Descent and LL parsers are the Top-Down parsers.
2. Bottom-Up Parsers:
In this Parsing technique we reduce the whole program to start symbol.
Operator Precedence Parser, LR(0) Parser, SLR Parser, LALR Parser and CLR
Parser are the Bottom-Up parsers.
Recursive Descent Parser:
It is a kind of Top-Down Parser. A top-down parser builds the parse tree from the top to down,
starting with the start non-terminal. A Predictive Parser is a special case of Recursive Descent
Parser, where no Back Tracking is required.
Example:
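For a Recursive Descent Parser, we write one function for every non-terminal (variable) of
the grammar. The program below uses the grammar:
E -> T E'
E' -> + T E' | ε
T -> F T'
T' -> * F T' | ε
F -> ( E ) | i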
Algorithm:
Step 1: Input the Expression: Read the input string from the user. Set the cursor to point at
the start of the input string.
Step 2: Start Parsing with E: Call the E() function, which tries to match the input with the
grammar rule E -> T E'.
Step 3: Parse E (Expression): Match the first part of the expression by calling T(). Then match
the remaining part by calling Edash() (E').
Step 4: Parse Edash (E'): If the current symbol is +, consume +, call T(), and recursively call
Edash(). If no + is found, return success (empty production).
Step 5: Parse T (Term): Match the first factor by calling F(). Then match the remaining part
by calling Tdash() (T').
Step 6: Parse Tdash (T'): If the current symbol is *, consume *, call F(), and recursively call
Tdash(). If no * is found, return success (empty production).
Step 7: Parse F (Factor): If the current symbol is (, consume (, call E(), and then consume ).
If the current symbol is i, consume i. If neither ( nor i is found, return failure.
Step 8: Final Validation: If parsing is successful and the entire input string is consumed, the
string is valid. Otherwise, return a parsing error.
Program:
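#include <stdio.h>
#include <string.h>
#define SUCCESS 1
#define FAILED 0
/* One function per non-terminal of the grammar */
int E(), Edash(), T(), Tdash(), F();
const char *cursor;
char string[64];
int main()
{
puts("Enter the string");
scanf("%63s", string);
//sscanf("i+(i+i)*i", "%s", string);
cursor = string;
puts("");
puts("Input Action");
puts("--------------------------------");
/* Step 8: the string is valid only if E succeeds and all input is consumed */
if (E() && *cursor == '\0') {
puts("--------------------------------");
puts("String is successfully parsed");
return 0;
} else {
puts("--------------------------------");
puts("Error in parsing String");
return 1;
}
}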
int E()
{
printf("%-16s E -> T E'\n", cursor);
if (T()) {
if (Edash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
}
int Edash()
{
if (*cursor == '+') {
printf("%-16s E' -> + T E'\n", cursor);
cursor++;
if (T()) {
if (Edash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
} else {
printf("%-16s E' -> $\n", cursor);
return SUCCESS;
}
}
int T()
{
printf("%-16s T -> F T'\n", cursor);
if (F()) {
if (Tdash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
}
int Tdash()
{
if (*cursor == '*') {
printf("%-16s T' -> * F T'\n", cursor);
cursor++;
if (F()) {
if (Tdash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
} else {
printf("%-16s T' -> $\n", cursor);
return SUCCESS;
}
}
int F()
{
if (*cursor == '(') {
printf("%-16s F -> ( E )\n", cursor);
cursor++;
if (E()) {
if (*cursor == ')') {
cursor++;
return SUCCESS;
} else
return FAILED;
} else
return FAILED;
} else if (*cursor == 'i') {
printf("%-16s F -> i\n", cursor);
cursor++;
return SUCCESS;
} else
return FAILED;
}
EXPERIMENT 9
Problem Statement:
To write a C program to implement SLR Parsing algorithm.
Concept to be Applied:
SLR (1) Parsing
SLR(1) refers to simple LR parsing. It uses the same LR(0) items as LR(0) parsing;
the only difference is in the parsing table.
To construct the SLR(1) parsing table, we use the canonical collection of LR(0) items.
In SLR(1) parsing, a reduce move for a production is placed only under the terminals in
the Follow set of its left-hand side.
Various steps involved in the SLR (1) Parsing:
For the given input string write a context free grammar.
Check the ambiguity of the grammar.
Add Augment production in the given grammar.
Create Canonical collection of LR (0) items.
Draw the deterministic finite automaton (DFA) of the item sets.
Construct the SLR(1) parsing table.
Algorithm:
Step1: Initialize the Input and Stack:
If the action indicates Accept (state 102), declare the string as accepted.
If no valid shift or reduce action is found, declare a syntax error.
Step 6: Exit:
Stop when the input string is either successfully parsed or an error is encountered.
Program:
#include<stdio.h>
#include<string.h>
int axn[][6][2]={
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{100,6},{-1,-1},{-1,-1},{-1,-1},{102,102}},
{{-1,-1},{101,2},{100,7},{-1,-1},{101,2},{101,2}},
{{-1,-1},{101,4},{101,4},{-1,-1},{101,4},{101,4}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{101,6},{101,6},{-1,-1},{101,6},{101,6}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{100,6},{-1,-1},{-1,-1},{100,11},{-1,-1}},
{{-1,-1},{101,1},{100,7},{-1,-1},{101,1},{101,1}},
{{-1,-1},{101,3},{101,3},{-1,-1},{101,3},{101,3}},
{{-1,-1},{101,5},{101,5},{-1,-1},{101,5},{101,5}}
};//Axn Table
int gotot[12][3]={1,2,3,-1,-1,-1,-1,-1,-1,-1,-1,-1,8,2,3,-1,-1,-1,
-1,9,3,-1,-1,10,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1}; //GoTo table
int a[10];
char b[10];
int top=-1,btop=-1,i;
void push(int k)
{
if(top<9)
a[++top]=k;
}
void pushb(char k)
{
if(btop<9)
b[++btop]=k;
}
int TOS() // Returns the state on top of the state stack
{
return a[top];
}
void pop()
{
if(top>=0)
top--;
}
void popb()
{
if(btop>=0)
b[btop--]='\0';
}
void display()
{
for(i=0;i<=top;i++)
printf("%d%c",a[i],b[i]);
}
void display1(char p[],int m) //Displays The Present Input String
{
int l;
printf("\t\t");
for(l=m;p[l]!='\0';l++)
printf("%c",p[l]);
printf("\n");
}
void error()
{
printf("Syntax Error");
}
void reduce(int p)
{
int len,k,ad;
char src,*dest;
switch(p)
{
case 1:dest="E+T";
src='E';
break;
case 2:dest="T";
src='E';
break;
case 3:dest="T*F";
src='T';
break;
case 4:dest="F";
src='T';
break;
case 5:dest="(E)";
src='F';
break;
case 6:dest="i";
src='F';
break;
default:dest="\0";
src='\0';
break;
}
for(k=0;k<strlen(dest);k++)
{
pop();
popb();
}
pushb(src);
switch(src)
{
case 'E':ad=0;
break;
case 'T':ad=1;
break;
case 'F':ad=2;
break;
default: ad=-1;
break;
}
push(gotot[TOS()][ad]);
}
int main()
{
int j,st,ic;
char ip[20]="\0",an;
// clrscr();
printf("Enter any String\n");
scanf("%19s",ip); // The input must end with '$', e.g. i+i*i$
push(0);
display();
printf("\t%s\n",ip);
for(j=0;ip[j]!='\0';)
{
st=TOS();
an=ip[j];
if(an>='a'&&an<='z') ic=0;
else if(an=='+') ic=1;
else if(an=='*') ic=2;
else if(an=='(') ic=3;
else if(an==')') ic=4;
else if(an=='$') ic=5;
else {
error();
break;
}
if(axn[st][ic][0]==100) // Shift
{
pushb(an);
push(axn[st][ic][1]);
display();
j++;
display1(ip,j);
}
else if(axn[st][ic][0]==101) // Reduce
{
reduce(axn[st][ic][1]);
display();
display1(ip,j);
}
else if(axn[st][ic][1]==102) // Accept
{
printf("Given String is accepted \n");
// getch();
break;
}
else // No valid action: report the error instead of looping forever
{
error();
printf("\nGiven String is rejected \n");
break;
}
}
return 0;
}
Sample Input and Output:
EXPERIMENT 10
Problem Statement:
To write a C program to implement type checking.
Concept to be Applied:
Type Checking: Type checking ensures that operations are performed on compatible data
types. For instance, in the case of division (/), if any variable involved in the operation is of
type float, the expression must also handle the result as a float.
Expression Parsing: Parsing is the process of analyzing a string or sequence of tokens (like
the expression) to ensure it follows certain rules.
Flag Mechanism for Division: Using flags is a way to track certain conditions in a program.
A flag is essentially a boolean variable (either true or false) that indicates whether a specific
condition has occurred.
Mapping Variables to Types: Variables in programming are often associated with a type,
such as int or float. Operations on these variables need to respect these types.
Error Handling: Error handling involves detecting and managing errors or exceptions that
may occur during program execution.
Algorithm:
Step 1: Input Number of Variables: Prompt the user to enter the number of variables (n).
Initialize arrays to store variable names and types.
Scan the expression to check if it contains a division operator (/).
If division is found, set a flag indicating that the expression involves division.
Compare the first variable in the expression with the variable names stored.
If the expression contains division:
If the expression does not contain division, print that the datatype is correctly defined.
Step 6: End:
Program:
//To implement type checking
#include<stdio.h>
#include<stdlib.h>
int main()
{
int n,i,k,flag=0;
char vari[15],typ[15],b[15],c;
printf("Enter the number of variables:");
scanf(" %d",&n);
for(i=0;i<n;i++)
{
printf("Enter the variable[%d]:",i);
scanf(" %c",&vari[i]);
printf("Enter the variable-type[%d](float-f,int-i):",i);
scanf(" %c",&typ[i]);
if(typ[i]=='f')
flag=1;
}
printf("Enter the Expression(end with $):");
i=0;
getchar();
while(i<15 && (c=getchar())!='$') // Stop at '$' or when the buffer b is full
{
b[i]=c;
i++; }
k=i;
for(i=0;i<k;i++)
{
if(b[i]=='/')
{
flag=1;
break; } }
for(i=0;i<n;i++)
{
if(b[0]==vari[i])
{
if(flag==1)
{
if(typ[i]=='f')
{ printf("\nthe datatype is correctly defined..!\n");
break; }
else
{ printf("Identifier %c must be a float type..!\n",vari[i]);
break; } }
else
{ printf("\nthe datatype is correctly defined..!\n");
break; } }
}
return 0;
}
EXPERIMENT 11(a)
Problem Statement:
To write a YACC program to check whether given string is Palindrome or not.
Concept to be Applied:
YACC is the standard parser generator for the
Unix operating system. An open source program, yacc generates code for the parser in
the C programming language. The acronym is usually rendered in lowercase but is
occasionally seen as YACC or Yacc.
A palindrome is a sequence of letters, numbers, or whole words that reads the same
forwards as it does backwards.
Palindrome Examples:
123454321 (numbers)
Race car, kayak (letters)
Hannah, Otto, Ava, Bob (names)
King, are you glad you are King? (words)
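The character-level check can be sketched in plain C. The function is_palindrome is an illustrative name; it compares characters exactly, so mixed-case or spaced examples such as "Race car" would first need to be normalised (lowercased, spaces removed), which is an assumption about how those examples are meant to be read.

```c
#include <string.h>

/* Returns 1 if s reads the same forwards and backwards, 0 otherwise. */
int is_palindrome(const char *s)
{
    size_t len = strlen(s);
    size_t i, j;

    if (len == 0)
        return 1;            /* the empty string is trivially a palindrome */

    /* Compare characters from both ends, moving towards the middle. */
    for (i = 0, j = len - 1; i < j; i++, j--) {
        if (s[i] != s[j])
            return 0;
    }
    return 1;
}
```

The same loop works for odd and even lengths, because the middle character of an odd-length string is never compared with anything.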
Algorithm:
Step 3: Parser:
Step 5: Output Result: Print whether the input string is a palindrome or not.
Program:
Lexical Analyzer Source Code:
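%{
/* Definition section */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"
%}
/* %option noyywrap */
/* Rule Section */
%%
[a-zA-Z0-9]+ { /* STR is an assumed token name; the original rules are missing */
yylval.f = strdup(yytext); return STR; }
[ \t\n] { ; }
. { return yytext[0]; }
%%
int yywrap()
{
return 1;
}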
Parser Source Code:
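%{
/* Definition section */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
extern int yylex();
void yyerror(const char *msg);
int i;
int k = 0;
int flag = 0;
%}
%union {
char* f;
}
/* STR and the rule for E are assumed; the original declarations are missing */
%token <f> STR
%type <f> E
/* Rule Section */
%%
S:E{
flag = 0;
k = strlen($1) - 1;
/* Compare characters from both ends; this covers odd and even lengths */
for (i = 0; i <= k/2; i++) {
if ($1[i] != $1[k-i])
flag = 1;
}
if (flag == 1) printf("Not palindrome\n");
else printf("palindrome\n");
printf("%s\n", $1);
}
;
E:STR { $$ = $1; }
;
%%
void yyerror(const char *msg)
{
printf("Error: %s\n", msg);
}
//driver code
int main()
{
yyparse();
return 0;
}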
EXPERIMENT 11 (b)
Problem Statement:
To write a YACC program to recognize strings of the language { a^n b | n >= 5 }.
Concept to be Applied:
The language consists of strings that start with at least five 'a' characters followed by
exactly one 'b'.
Example valid strings: aaaaab, aaaaaaab, aaaaaaaaab, etc.
Example invalid strings: aaaab, aab, b, aaaa, etc.
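The language can also be checked by a short hand-written recognizer, which is handy for cross-checking the YACC grammar on the examples above. The function matches_an_b is an illustrative name, and only lowercase letters are accepted in this sketch.

```c
/* Returns 1 if s consists of at least five 'a' characters followed by
   exactly one 'b' and nothing else, 0 otherwise. */
int matches_an_b(const char *s)
{
    int count = 0;

    while (*s == 'a') {      /* consume the leading run of a's */
        count++;
        s++;
    }
    return count >= 5 && s[0] == 'b' && s[1] == '\0';
}
```

The counter plays the role of the five explicit A tokens a grammar needs before the optional repetition begins.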
Algorithm:
Step 1: Lexical Analyzer (Flex)
If the rule matches, print "valid string" and exit.
Define the non-terminal S:
It can recursively include more As or be empty (to allow
for any number of As before B).
Invoke yyparse:
Call yyparse() to start the parsing process, which will use the defined rules
and the lexical analyzer to validate the input string.
Program:
Lexical Analyzer Source Code:
%{
/* Definition section */
#include "y.tab.h"
%}
/* Rule Section */
%%
[aA] {return A;}
[bB] {return B;}
\n {return NL;}
. {return yytext[0];}
%%
int yywrap()
{
return 1;
}
Parser Source Code:
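%{
/* Definition section */
#include<stdio.h>
#include<stdlib.h>
int yylex(void);
void yyerror(const char *msg);
%}
%token A B NL
/* Rule Section */
%%
stmt: A A A A A S B NL {printf("valid string\n");
exit(0);}
;
S: S A
|
;
%%
/* Called by the parser when the input does not match the grammar */
void yyerror(const char *msg)
{
printf("invalid string\n");
exit(0);
}
//driver code
int main()
{
printf("enter the string\n");
yyparse();
return 0;
}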
Sample Input and Output:
EXPERIMENT 12
Problem Statement:
To write a program to validate the given expression using the compiler tools LEX and YACC
in LLVM.
Concepts to be Applied:
Algorithm:
Program:
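%{
#include<stdio.h>
#include"y.tab.h"
%}
%%
 /* The VARIABLE and NUMBER patterns are assumed; the original rules are missing */
[a-zA-Z_][a-zA-Z_0-9]* return VARIABLE;
[0-9]+(\.[0-9]+)? return NUMBER;
[ \t] ;
[\n] return 0;
. return yytext[0];
%%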
int yywrap()
{
return 1;
}
YACC Part:
%{
#include<stdio.h>
int yylex(void);
void yyerror(const char *msg);
%}
%token NUMBER
%token VARIABLE
%left '+' '-'
%left '*' '/' '%'
%%
S: VARIABLE'='E {
printf("\n Entered arithmetic expression is Valid\n\n");
return 0;
}
;
E:E'+'E
|E'-'E
|E'*'E
|E'/'E
|E'%'E
|'('E')'
| NUMBER
| VARIABLE
;
%%
int main()
{
printf("\n Enter Any Arithmetic Expression which can have operations Addition, "
"Subtraction, Multiplication, Division, Modulus and Round brackets:\n");
yyparse();
return 0;
}
void yyerror(const char *msg)
{
printf("\n Entered arithmetic expression is Invalid\n\n");
}
Execution Steps:
nano exp.y
yacc -d exp.y
nano exp.l
lex exp.l
clang lex.yy.c y.tab.c -w
./a.out
Sample Input and Output:
For Valid Expressions:
EXPERIMENT 13
Problem Statement:
To write a program to implement a simple calculator using Lex and YACC in LLVM.
Concept to be Applied:
Token Definition: The Lex file defines patterns (regular expressions) that match
different tokens in the input.
Numbers: The pattern [0-9]+ matches sequences of digits, which are converted
to integers using atoi() and assigned to yylval for use in the parser.
Operators: Various arithmetic operators (+, -, *, /, %, **) are recognized and
returned as tokens.
Parentheses: The characters ( and ) are recognized as tokens for grouping
expressions.
Ignoring Whitespace: Spaces and tabs are ignored, allowing the input to be
formatted freely.
Error Handling: Any unrecognized character triggers an error message,
helping to identify invalid input.
Grammar Rules: The YACC file defines the grammar for valid arithmetic expressions.
Operator Precedence and Associativity: The %left and %right declarations
establish the precedence and associativity of operators. For example,
multiplication and division have higher precedence than addition and
subtraction, and exponentiation (**) is right associative.
Error Handling: The yyerror function provides feedback when an error occurs
during parsing.
Recursive Grammar Rules: The grammar rules are recursive, which enables the parsing of
nested expressions and their evaluation in the proper order. Note that YACC generates a
bottom-up LALR(1) parser, not a recursive-descent one.
Expression Evaluation:
Abstract Syntax Tree (AST): The YACC file can be expanded to build an AST or
directly evaluate expressions based on the parsed tokens, typically by implementing
rules for combining operands and operators.
Mathematical Functions: Including <math.h> allows for advanced mathematical
operations (like power) to be used in evaluations.
Integration and Compilation:
To run this program, we need to compile both the LEX and YACC files together. This
process typically involves generating C source files from both, compiling them, and
linking the resulting object files.
Algorithm:
Step 1: Place the C declaration statements inside %{ and %}.
Step 2: Declare the tokens.
Step 3: Define the precedence and associativity of the operators and algebraic functions.
Step 4: Define the expression types and action to be done.
Step 5: In the main function get the input expression and start parsing.
Step 6: If there are no errors then print the result of the expression.
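The effect of the precedence and associativity declarations can also be seen without LEX and YACC. The following is a minimal sketch in plain C that uses a hand-written precedence-climbing evaluator, which is a different technique from the LALR(1) parser that YACC generates. The names evaluate, parse_expr, parse_primary, peek_op and ipow are hypothetical helpers chosen for this illustration, and the sketch assumes non-negative integer operands and no unary minus.

```c
#include <assert.h>
#include <ctype.h>

static const char *cur;                 /* cursor into the input text */
static long parse_expr(int min_prec);

static void skip_ws(void) { while (*cur == ' ' || *cur == '\t') cur++; }

/* Integer power by repeated multiplication (exponent assumed >= 0). */
static long ipow(long base, long exp) {
    long r = 1;
    while (exp-- > 0) r *= base;
    return r;
}

/* primary: NUMBER | '(' expr ')' */
static long parse_primary(void) {
    long v = 0;
    skip_ws();
    if (*cur == '(') {
        cur++;
        v = parse_expr(1);
        skip_ws();
        assert(*cur == ')');
        cur++;
        return v;
    }
    assert(isdigit((unsigned char)*cur));
    while (isdigit((unsigned char)*cur)) v = v * 10 + (*cur++ - '0');
    return v;
}

/* Looks at the next operator without consuming it.
   Returns its precedence (0 if there is none). */
static int peek_op(int *len, int *right_assoc) {
    skip_ws();
    *len = 1;
    *right_assoc = 0;
    if (cur[0] == '*' && cur[1] == '*') { *len = 2; *right_assoc = 1; return 3; }
    if (*cur == '*' || *cur == '/' || *cur == '%') return 2;
    if (*cur == '+' || *cur == '-') return 1;
    return 0;
}

static long parse_expr(int min_prec) {
    long lhs = parse_primary();
    for (;;) {
        int len, right;
        int prec = peek_op(&len, &right);
        char op;
        long rhs;
        if (prec == 0 || prec < min_prec) return lhs;
        op = *cur;
        cur += len;
        /* Left associative: the right operand may only hold tighter operators.
           Right associative: it may also hold operators of the same level. */
        rhs = parse_expr(right ? prec : prec + 1);
        if (len == 2)       lhs = ipow(lhs, rhs);
        else if (op == '+') lhs += rhs;
        else if (op == '-') lhs -= rhs;
        else if (op == '*') lhs *= rhs;
        else {
            assert(rhs != 0);           /* division or modulus by zero */
            lhs = (op == '/') ? lhs / rhs : lhs % rhs;
        }
    }
}

/* Evaluates one arithmetic expression given as a string. */
long evaluate(const char *text) {
    cur = text;
    return parse_expr(1);
}
```

Calling evaluate("2**3**2") groups the expression as 2**(3**2) because ** is treated as right associative, while evaluate("10-4-3") groups it as (10-4)-3 because - is left associative.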
Program:
Lex file:
%{
#include <stdlib.h>
#include "y.tab.h"
%}
%%
[0-9]+ { yylval = atoi(yytext); return NUMBER; }
[+\-*/\n] { return yytext[0]; }
"%" { return MOD; }
"**" { return POWER; }
"(" { return '('; }
")" { return ')'; }
[ \t] ; /* Ignore whitespace */
. { printf("Invalid character: %s\n", yytext); }
%%
int yywrap()
{
return 1;
}
YACC file:
%{
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
void yyerror(char* s) { printf("Error: %s\n", s); }
int yylex();
%}
EXPERIMENT 14
Problem Statement:
To write a YACC program to convert Infix expression to Postfix expression.
Concept to be Applied:
Infix Expression
In an infix expression, the operators appear between their operands (e.g., A + B).
Postfix Expression
In a postfix expression, the operators appear after their operands (e.g., A B +).
No parentheses are needed, as the order of operations is unambiguous based on the
position of the operators.
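The claim that postfix needs no parentheses can be checked with a small stack-based evaluator. The following is a minimal sketch in plain C, not part of the YACC program of this experiment. The function name eval_postfix is hypothetical, and the sketch assumes space-separated non-negative integers and the operators +, -, * and /.

```c
#include <assert.h>
#include <ctype.h>

/* Evaluates a postfix expression such as "2 3 4 * +".
   Operands are pushed; each operator pops two values and pushes the result. */
long eval_postfix(const char *s) {
    long stack[64];
    int top = 0;
    while (*s != '\0') {
        if (isdigit((unsigned char)*s)) {
            long v = 0;
            while (isdigit((unsigned char)*s)) v = v * 10 + (*s++ - '0');
            assert(top < 64);
            stack[top++] = v;
        } else if (*s == '+' || *s == '-' || *s == '*' || *s == '/') {
            long a, b;
            assert(top >= 2);
            b = stack[--top];           /* right operand was pushed last */
            a = stack[--top];
            switch (*s) {
            case '+': stack[top++] = a + b; break;
            case '-': stack[top++] = a - b; break;
            case '*': stack[top++] = a * b; break;
            default:
                assert(b != 0);         /* division by zero */
                stack[top++] = a / b;
                break;
            }
            s++;
        } else {
            s++;                        /* skip spaces between tokens */
        }
    }
    assert(top == 1);                   /* exactly one result must remain */
    return stack[0];
}
```

The infix expressions 2 + 3 * 4 and (2 + 3) * 4 become "2 3 4 * +" and "2 3 + 4 *" in postfix, and the evaluator distinguishes them purely by the position of the operators.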
Algorithm:
Step 1: Initialization:
E (Expression): Can be an expression followed by + or - and a term T, or just a term T.
T (Term): Can be a term followed by * or / and a power P, or just a power P.
P (Power): Can be a factor F followed by ^ and another power P, or just a factor F.
F (Factor): Can be an expression within parentheses or a digit.
Implement yyerror() to print an error message if parsing fails, indicating a syntax error.
The parsing continues until all input is processed (indicated by returning 0 on newline),
and a newline is printed after the entire expression is evaluated.
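The grammar described above can also be followed by hand. The following is a minimal sketch in plain C that uses recursive descent, with one function per non-terminal, instead of the bottom-up parser that YACC generates. The names to_postfix, parse_E, parse_T, parse_P, parse_F and emit are hypothetical, and unlike the YACC program the sketch separates output tokens with spaces so that multi-digit numbers stay readable.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

static const char *in;      /* cursor into the infix input */
static char out[256];       /* postfix output buffer */

/* Appends one token to the output, separated by a space. */
static void emit(const char *tok) {
    assert(strlen(out) + strlen(tok) + 2 <= sizeof out);
    if (out[0] != '\0') strcat(out, " ");
    strcat(out, tok);
}

static void skip(void) { while (*in == ' ' || *in == '\t') in++; }
static void parse_E(void);

/* F: '(' E ')' | digit */
static void parse_F(void) {
    skip();
    if (*in == '(') {
        in++;
        parse_E();
        skip();
        assert(*in == ')');
        in++;
    } else {
        char num[32];
        size_t n = 0;
        assert(isdigit((unsigned char)*in));
        while (isdigit((unsigned char)*in) && n < sizeof num - 1) num[n++] = *in++;
        num[n] = '\0';
        emit(num);
    }
}

/* P: F '^' P | F   (right recursion makes ^ right associative) */
static void parse_P(void) {
    parse_F();
    skip();
    if (*in == '^') {
        in++;
        parse_P();
        emit("^");
    }
}

/* T: T '*' P | T '/' P | P   (left recursion rewritten as a loop) */
static void parse_T(void) {
    parse_P();
    for (skip(); *in == '*' || *in == '/'; skip()) {
        char op[2] = { *in++, '\0' };
        parse_P();
        emit(op);           /* operator is printed after both operands */
    }
}

/* E: E '+' T | E '-' T | T */
static void parse_E(void) {
    parse_T();
    for (skip(); *in == '+' || *in == '-'; skip()) {
        char op[2] = { *in++, '\0' };
        parse_T();
        emit(op);
    }
}

/* Converts an infix expression to postfix; the result lives in a static buffer. */
const char *to_postfix(const char *infix) {
    in = infix;
    out[0] = '\0';
    parse_E();
    return out;
}
```

In both the sketch and the YACC program, an operator is printed only after both of its operands have been handled, which is what produces the postfix order.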
Program:
Parser Source Code:
File: C4.y
%{
/* Definition section */
#include <ctype.h>
#include<stdio.h>
#include<stdlib.h>
int yylex();
int yyerror();
%}
%token digit
/* Rule Section */
%%
/*All these grammar rules are established for operator precedence and
associativity*/
/*S prints new line after evaluating E*/
S: E {printf("\n\n");}
;
/*E can be evaluated to E+T or E-T or just T*/
E: E '+' T { printf ("+");}
| E '-' T { printf ("-");}
|T
;
/*T can be evaluated to T*P or T/P or just P*/
T: T '*' P { printf("*");}
| T '/' P { printf("/");}
|P
;
/*P can be evaluated to F^P or just F*/
P: F '^' P { printf ("^");}
|F
;
/*F can be evaluated to (E) or a number*/
F: '(' E ')'
| digit {printf("%d", $1);}
;
%%
//driver code
int main()
{
printf("Enter infix expression: ");
yyparse(); //to parse the input
return 0;
}
int yyerror(const char *s)
{
printf("NITW Error");
return 0;
}
Lexical Analyzer Source Code:
File: C4.l
%{
#include <stdlib.h>
#include "y.tab.h"
extern int yylval;
%}
%%
 /*If the token is an integer number, return it.*/
[0-9]+ {yylval=atoi(yytext); return digit;}
 /*If the token is a space or tab, ignore it.*/
[ \t] ;
 /*If the token is a new line, return 0*/
[\n] return 0;
 /*If the token didn't match any of the above, return the first character*/
. return yytext[0];
%%
int yywrap()
{
return 1;
}
Sample Input and Output:
EXPERIMENT 15
Problem Statement:
To write a YACC program to generate 3-Address code for a given expression.
Concept to be Applied:
In general, three-address instructions are represented as:
a = b op c
where a, b and c are operands and op is an operator, for example:
a = b + c
c = a * b
Algorithm:
Step 1: Define the Grammar for Arithmetic Expressions:
For each rule, associate an action to generate 3-address code for the corresponding part
of the expression.
Use temporary variables like t1, t2, etc., to store intermediate results of sub-expressions.
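The idea of attaching a code-generating action to each rule can be illustrated without YACC. The following is a minimal sketch in plain C that uses a hand-written recursive-descent translator, which is a different technique from the YACC program of this experiment. The names gen_tac, parse_sum, parse_product, parse_operand, new_temp and Operand are hypothetical. The sketch assumes statements of the form x = expr with the operators +, -, * and /, and it uses identifiers and numbers directly as operands instead of first copying them into temporaries.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef struct { char name[16]; } Operand;  /* variable, number or temporary */

static const char *src;     /* cursor into the input statement */
static char code[512];      /* generated three-address code */
static int temp_count;      /* number of temporaries created so far */

static void skip_blanks(void) { while (*src == ' ' || *src == '\t') src++; }

static Operand new_temp(void) {
    Operand t;
    snprintf(t.name, sizeof t.name, "t%d", ++temp_count);
    return t;
}

/* Appends "dst = a op b" to the generated code. */
static void emit_line(const char *dst, const char *a, char op, const char *b) {
    size_t used = strlen(code);
    snprintf(code + used, sizeof code - used, "%s = %s %c %s\n", dst, a, op, b);
}

static Operand parse_sum(void);

/* operand: identifier | number | '(' sum ')' */
static Operand parse_operand(void) {
    Operand o;
    size_t n = 0;
    skip_blanks();
    if (*src == '(') {
        src++;
        o = parse_sum();
        skip_blanks();
        assert(*src == ')');
        src++;
        return o;
    }
    assert(isalnum((unsigned char)*src));
    while (isalnum((unsigned char)*src) && n < sizeof o.name - 1) o.name[n++] = *src++;
    o.name[n] = '\0';
    return o;
}

/* product: operand (('*' | '/') operand)* ; one temporary per operator */
static Operand parse_product(void) {
    Operand left = parse_operand();
    for (skip_blanks(); *src == '*' || *src == '/'; skip_blanks()) {
        char op = *src++;
        Operand right = parse_operand();
        Operand t = new_temp();
        emit_line(t.name, left.name, op, right.name);
        left = t;
    }
    return left;
}

/* sum: product (('+' | '-') product)* */
static Operand parse_sum(void) {
    Operand left = parse_product();
    for (skip_blanks(); *src == '+' || *src == '-'; skip_blanks()) {
        char op = *src++;
        Operand right = parse_product();
        Operand t = new_temp();
        emit_line(t.name, left.name, op, right.name);
        left = t;
    }
    return left;
}

/* Translates "x = expr" into three-address code held in a static buffer. */
const char *gen_tac(const char *stmt) {
    char target[16];
    size_t n = 0, used;
    Operand result;
    src = stmt;
    code[0] = '\0';
    temp_count = 0;
    skip_blanks();
    assert(isalpha((unsigned char)*src));
    while (isalnum((unsigned char)*src) && n < sizeof target - 1) target[n++] = *src++;
    target[n] = '\0';
    skip_blanks();
    assert(*src == '=');
    src++;
    result = parse_sum();
    used = strlen(code);
    snprintf(code + used, sizeof code - used, "%s = %s\n", target, result.name);
    return code;
}
```

For the statement x = a + b * c, the multiplication is translated first because it binds tighter, so the sketch produces t1 = b * c, then t2 = a + t1, then x = t2.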
Program:
File: C6.y
%{
#include <math.h>
#include<ctype.h>
#include<stdio.h>
int yylex();
int yyerror();
int var_cnt=0;
char iden[20];
%}
%token digit
%token id
%%
/* Separating the LHS and RHS of the expression. */
S:id '=' E { printf("%s = t%d\n",iden, var_cnt-1); }
;
/* Following the operator precedence. */
/* '+','-' have the lowest precedence. Their 3-Address codes are printed after all
the other 3-Address codes are printed. */
E:E '+' T { $$=var_cnt; var_cnt++; printf("t%d = t%d + t%d;\n", $$, $1, $3 ); }
|E '-' T { $$=var_cnt; var_cnt++; printf("t%d = t%d - t%d;\n", $$, $1, $3 ); }
|T { $$=$1; }
;
/* '*','/' have the second-lowest precedence. Their 3-Address codes are printed before
the 3-Address codes of operators '+' and '-' are printed. */
T:T '*' F { $$=var_cnt; var_cnt++; printf("t%d = t%d * t%d;\n", $$, $1, $3 ); }
|T '/' F { $$=var_cnt; var_cnt++; printf("t%d = t%d / t%d;\n", $$, $1, $3 ); }
|F {$$=$1 ; }
;
/* '^' has the second-highest precedence. Its 3-Address codes are printed after the
3-Address codes of brackets are printed. */
F:P '^' F { $$=var_cnt; var_cnt++; printf("t%d = t%d ^ t%d;\n", $$, $1, $3 );}
| P { $$ = $1;}
;
/* Brackets have the highest precedence. Their 3-Address codes are printed
before all the other 3-Address codes are printed. */
/* This recursively uses the rules for E to print the
3-Address codes of the expression inside the brackets. */
P: '(' E ')' { $$=$2; }
|digit { $$=var_cnt; var_cnt++; printf("t%d = %d;\n",$$,$1); }
;
%%
int main()
{
var_cnt=0;
printf("Enter an expression : \n");
yyparse();
return 0;
}
int yyerror(const char *s)
{
printf("NITW Error\n");
return 0;
}
File: C6.l
/* Definitions */
d [0-9]+
a [a-zA-Z]+
%{
/* Including the required header files. */
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include"y.tab.h"
extern int yylval;
extern char iden[20];
%}
/*
Rules:
If any number is matched, make it the yylval and send it as a token.
If any word is matched, copy it into iden and send it as a token.
If a space or tab is matched, do nothing.
If a new line character is encountered, end the program.
If anything else is matched, send the first character of the matched
text.
*/
%%
{d} { yylval=atoi(yytext); return digit; }
{a} { strncpy(iden,yytext,sizeof(iden)-1); iden[sizeof(iden)-1]='\0'; yylval=1; return id; }
[ \t] {;}
\n return 0;
. return yytext[0];
%%
int yywrap()
{
return 1;
}
Sample Input and Output:
EXPERIMENT 16
Problem Statement:
To write a C program for implementation of Code Optimization Technique.
Concepts to be Applied:
Intermediate Code: A representation of code in a lower-level format used in compilers
before generating machine code.
Dead Code Elimination: Removing code that does not affect the program's results.
Common Subexpression Elimination: Identifying and removing duplicated
expressions to reduce redundant calculations.
Algorithm:
Step 1: Write the factorial program using both a for loop and a do-while loop to illustrate the
optimization technique.
Step 2: In a for loop, the variable is initialized first and the condition is checked next. If
the condition is true, the loop body is executed and the specified increment
/ decrement operation is performed.
Step 3: The for loop repeats until the condition fails.
Step 4: In a do-while loop, the variable is initialized and the loop body is executed first; the
increment / decrement operation and the condition check are performed afterwards.
Step 5: Comparing the two loops, do-while is preferable for optimization because the body is
executed before the condition is checked. When the body is known to run at least
once, the do-while loop evaluates the condition one time fewer than the equivalent
for loop.
Step 6: Finally, when considering Code Optimization of loops whose body always runs at least
once, do-while is best with respect to performance.
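The comparison in the steps above can be made concrete by counting how often each loop evaluates its condition. The following is a minimal sketch in plain C. The names factorial_for, factorial_do_while and LoopResult are hypothetical, and the do-while version assumes n >= 1, since its body always runs once.

```c
#include <assert.h>

typedef struct {
    unsigned long value;    /* computed factorial */
    int condition_checks;   /* how many times the loop condition was evaluated */
} LoopResult;

/* Top-tested loop: the condition is evaluated before every iteration,
   including the final failing test. */
LoopResult factorial_for(int n) {
    LoopResult r = {1, 0};
    int i;
    for (i = 1; (r.condition_checks++, i <= n); i++) {
        r.value *= (unsigned long)i;
    }
    return r;
}

/* Bottom-tested loop: the body runs first, so the initial test is skipped. */
LoopResult factorial_do_while(int n) {
    LoopResult r = {1, 0};
    int i = 1;
    assert(n >= 1);         /* the body would run even for n < 1 */
    do {
        r.value *= (unsigned long)i;
        i++;
    } while ((r.condition_checks++, i <= n));
    return r;
}
```

For n = 5 both functions compute 120, but the for loop evaluates its condition 6 times and the do-while loop 5 times. The saving is a single test per loop, and optimizing compilers commonly perform this transformation themselves (loop inversion).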
Program:
//Code Optimization Technique
#include<stdio.h>
#include<string.h>
struct op
{
char l;
char r[20];
}
op[10],pr[10];
int main()
{
{
int a,i,k,j,n,z=0,m,q;
char *p,*l;
char temp,t;
char *tem;
printf("Enter the Number of Values:");
scanf("%d",&n);
for(i=0;i<n;i++)
{
printf("left: ");
scanf(" %c",&op[i].l);
printf("right: ");
scanf(" %19s",op[i].r);
}
printf("Intermediate Code\n") ;
for(i=0;i<n;i++)
{
printf("%c=",op[i].l);
printf("%s\n",op[i].r);
}
for(i=0;i<n-1;i++)
{
temp=op[i].l;
for(j=0;j<n;j++)
{
p=strchr(op[j].r,temp);
if(p)
{
pr[z].l=op[i].l;
strcpy(pr[z].r,op[i].r);
z++;
}
}
}
pr[z].l=op[n-1].l;
strcpy(pr[z].r,op[n-1].r);
z++;
printf("\nAfter Dead Code Elimination\n");
for(k=0;k<z;k++)
{
printf("%c\t=",pr[k].l);
printf("%s\n",pr[k].r);
}
for(m=0;m<z;m++)
{
tem=pr[m].r;
for(j=m+1;j<z;j++)
{
p=strstr(tem,pr[j].r);
if(p)
{
t=pr[j].l;
pr[j].l=pr[m].l;
for(i=0;i<z;i++)
{
l=strchr(pr[i].r,t) ;
if(l)
{
a=l-pr[i].r;
printf("pos: %d\n",a);
pr[i].r[a]=pr[m].l;
}}}}}
printf("Eliminate Common Expression\n");
for(i=0;i<z;i++)
{
printf("%c\t=",pr[i].l);
printf("%s\n",pr[i].r);
}
for(i=0;i<z;i++)
{
for(j=i+1;j<z;j++)
{
q=strcmp(pr[i].r,pr[j].r);
if((pr[i].l==pr[j].l)&&!q)
{
pr[i].l='\0';
}
}
}
printf("Optimized Code\n");
for(i=0;i<z;i++)
{
if(pr[i].l!='\0')
{
printf("%c=",pr[i].l);
printf("%s\n",pr[i].r);
}
}
}
Sample Input and Output: