Compiler Design Lab Manual

The document is a lab manual for the Compiler Design Lab course at Vellore Institute of Technology for the Fall Semester 2024-25. It includes various challenging assignments and experiments related to lexical analysis, such as developing a lexical analyzer, counting lines and spaces, checking for Armstrong numbers, and transforming text to Pig Latin. Each experiment provides problem statements, concepts, algorithms, and sample programs written in C and Lex.


VIT
Vellore Institute of Technology
(Deemed to be University under section 3 of UGC Act, 1956)

School of Computer Science and Engineering


(SCOPE)

Fall Semester 2024-25

Course Code: BCSE307P


Course Title: Compiler Design Lab

Lab Manual

Faculty: Dr. Mukku Nisanth Kartheek (20151)

Dr. Sahaaya Arul Mary S A (19584)

Dr. Vetriselvi T (18842)


Dr. Suganthini C (21136)

Prof. Umadevi K.S


Head, Department of Software Systems
Head of the Department
Department of Software Systems
School of Computer Science & Engineering (SCOPE)
Vellore Institute of Technology (VIT)
Deemed to be University under section 3 of UGC Act, 1956
Vellore - 632 014, Tamil Nadu, India
CHALLENGING ASSIGNMENTS

S.NO TITLE

1. C Program to design Lexical Analyzer.
2 (a). Lex program to count the number of lines, spaces and tabs.
2 (b). Lex program to check whether given number is Armstrong number or not.
3. Lex program that copies file, replacing each nonempty sequence of white spaces by a single blank.
4. Lex program to convert text to Pig Latin.
5. C Program to eliminate Left Recursion.
6. C Program to perform Left Factoring.
7. C Program to Compute FIRST and FOLLOW of a given grammar.
8. C Program to Implement Recursive Descent Parser.
9. C Program to implement SLR Parsing algorithm.
10. C Program to implement Type Checking.
11 (a). YACC program to check whether given string is Palindrome or not.
11 (b). YACC program to recognize strings of {an .
12. Program to validate the given expression using Lex and YACC in LLVM.
13. Program to implement simple calculator using Lex and YACC in LLVM.
14. YACC program to convert Infix expression to Postfix expression.
15. YACC program to generate 3-Address code for a given expression.
16. C Program for implementation of Code Optimization Technique.

EXPERIMENT 1
Problem Statement:

To develop a lexical analyzer in C to detect identifiers, keywords and operators.

Concept to be applied:

A lexical analyzer reads the characters from the source code and converts them into
tokens. Different tokens or lexemes are:
Identifiers: Names of variables and functions (e.g., varName, function1).
Constants: Numeric literals (e.g., 123, 3.14).
Operators: Symbols for operations (e.g., +, -, *, /).
Comments: Single-line (// comment) and multi-line (/* comment */).
Keywords: Reserved words in C (e.g., int, return, if).
Delimiters: Punctuation (e.g., ;, {, }, (, )).
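
The token categories listed above can be illustrated with a small classifier. This is only a sketch: the names TokenKind and classify_lexeme are hypothetical and do not appear in the manual's program, the keyword table is deliberately shortened, comments are not handled, and the lexeme is assumed to be already separated from its neighbours.

```c
#include <ctype.h>
#include <string.h>

/* Token categories from the list above (comments are left out of this sketch). */
typedef enum { TOK_KEYWORD, TOK_IDENTIFIER, TOK_CONSTANT,
               TOK_OPERATOR, TOK_DELIMITER, TOK_UNKNOWN } TokenKind;

/* Classify one already-separated lexeme (hypothetical helper). */
TokenKind classify_lexeme(const char *lexeme)
{
    /* Shortened keyword table, for illustration only */
    static const char *keywords[] = { "int", "return", "if", "else",
                                      "while", "for", "void", "char" };
    size_t i, n = strlen(lexeme);

    if (n == 0)
        return TOK_UNKNOWN;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return TOK_KEYWORD;

    /* Identifier: a letter or underscore followed by letters, digits, underscores */
    if (isalpha((unsigned char)lexeme[0]) || lexeme[0] == '_') {
        for (i = 1; i < n; i++)
            if (!isalnum((unsigned char)lexeme[i]) && lexeme[i] != '_')
                return TOK_UNKNOWN;
        return TOK_IDENTIFIER;
    }

    /* Constant: digits with at most one decimal point */
    if (isdigit((unsigned char)lexeme[0])) {
        int dots = 0;
        for (i = 1; i < n; i++) {
            if (lexeme[i] == '.')
                dots++;
            else if (!isdigit((unsigned char)lexeme[i]))
                return TOK_UNKNOWN;
        }
        return dots <= 1 ? TOK_CONSTANT : TOK_UNKNOWN;
    }

    if (n == 1 && strchr("+-*/%=", lexeme[0]) != NULL)
        return TOK_OPERATOR;
    if (n == 1 && strchr(";{}(),", lexeme[0]) != NULL)
        return TOK_DELIMITER;
    return TOK_UNKNOWN;
}
```

For instance, classify_lexeme("varName") yields TOK_IDENTIFIER and classify_lexeme("3.14") yields TOK_CONSTANT.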

Algorithm:
Step 1: Start the program.
Step 2: Construct a function isKeyword which receives an array of characters and checks
whether it is a keyword or not.
Step 3: Write the main function.
Step 4: Initialize the operators array with the operators to be recognized.
Step 5: Create a file named prog.txt containing a simple C program.
Step 6: Open the file in read mode with fopen("prog.txt", "r").
Step 7: Read the file character by character and check whether each lexeme is a
keyword, an identifier or an operator.
Step 8: Repeat this until fgetc(fp) returns EOF.
Step 9: Close the file by fclose(fp).
Step 10: End.

Program:
Lexical analyser code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* Returns 1 if buffer holds one of the listed C keywords, 0 otherwise. */
int isKeyword(char buffer[]){
    char keywords[][10] = {"auto","break","case","char","const","continue","default",
        "do","double","else","enum","extern","float","for","if","int","long","return","short",
        "sizeof","static","struct","switch","void","while"};
    int count = sizeof(keywords) / sizeof(keywords[0]);  /* number of table entries */
    int i, flag = 0;
    for(i = 0; i < count; ++i){
        if(strcmp(keywords[i], buffer) == 0){
            flag = 1;
            break;
        }
    }
    return flag;
}

int main(){
    int ch;                         /* int, so that EOF is distinguishable from any character */
    char buff[15], operators[] = "+-*/%=";
    FILE *fp;
    int i, j = 0;
    fp = fopen("prog.txt","r");
    if(fp == NULL){
        printf("error while opening the file\n");
        exit(0);
    }
    while((ch = fgetc(fp)) != EOF)
    {
        /* Report the character if it is one of the operators */
        for(i = 0; i < 6; ++i)
        {
            if(ch == operators[i])
                printf("%c is operator\n", ch);
        }
        if(isalnum(ch)){
            if(j < 14)              /* leave room for the terminating '\0' */
                buff[j++] = ch;
        }
        else if((ch == ' ' || ch == '\n') && (j != 0)){
            /* A space or newline ends the current word */
            buff[j] = '\0';
            j = 0;
            if(isKeyword(buff) == 1)
                printf("%s is keyword\n", buff);
            else
                printf("%s is identifier\n", buff);
        }
    }
    fclose(fp);
    return 0;
}
Input:
Sample file (prog.txt)
void main()
{
int b, c;
int a = b + c;
}
Output:

EXPERIMENT 2(a)

Problem Statement:
To write a Lex program to count the number of lines, spaces and tabs.

Concept to be Applied:

Lexical Analysis:

Lex is a tool for generating scanners that recognize patterns in text. The patterns are
described using regular expressions, and Lex automatically generates code that
matches these patterns in the input text.

Pattern Matching:

In Lex, the input file is processed based on patterns. For this task, we can define patterns
for:
Newline (\n): Represents the end of a line. Each time this pattern is matched,
we increment the line counter.
Space (' '): Each time a space is encountered, we increment the space counter.
Tab ('\t'): Each time a tab is encountered, we increment the tab counter.

Counters:

Three counters will be used to keep track of the number of lines, the number of
spaces and the number of tabs.

Actions associated with Patterns:

For each of the patterns (newline, space, and tab), we will associate an action that
increments the respective counter.
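
To make the pattern/action idea concrete without Lex, the same counting can be sketched in plain C. The struct WsCounts and the function count_whitespace are hypothetical names introduced only for this illustration; each case label plays the role of one Lex pattern and the statement after it plays the role of its action.

```c
/* One counter per pattern, as described above. */
typedef struct { int lines, spaces, tabs; } WsCounts;

/* Scan a NUL-terminated string and count newlines, spaces and tabs. */
WsCounts count_whitespace(const char *text)
{
    WsCounts c = {0, 0, 0};
    for (; *text != '\0'; text++) {
        switch (*text) {
        case '\n': c.lines++;  break;   /* pattern \n  -> increment line counter  */
        case ' ':  c.spaces++; break;   /* pattern " " -> increment space counter */
        case '\t': c.tabs++;   break;   /* pattern \t  -> increment tab counter   */
        default:   break;               /* any other character is ignored here    */
        }
    }
    return c;
}
```

Calling count_whitespace("a b\tc\nd e\n") would report 2 lines, 2 spaces and 1 tab.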

Structure of a Lex Program:

A Lex program has three main sections:


1. Definitions Section: Used to declare variables or initialize counters.
2. Rules Section: Contains the regular expressions and corresponding actions.
3. User Code Section: Contains any additional C code needed for the program,
such as printing the results.

Algorithm:
Step 1: Initialize Counters: Set lc (line count), sc (space count), tc (tab count), wc (word
count), ch (character count) to 0.
Step 2: Start Lexical Analysis:
Display a prompt to the user to enter a sentence.
Call the yylex() function to start processing the input.
Step 3: Processing Input:
For each token read from the input:
If the token is a newline character (\n):
Increment lc by 1 (counting the line).
Increment ch by 1 (counting the newline character itself).
If the token is a space ( ) or a tab (\t):
If the token is a space, increment sc by 1 (counting the space).
If the token is a tab, increment tc by 1 (counting the tab).
Increment ch by 1 (counting the space or tab character itself).
If the token is any other character:
Increment ch by 1 (counting the character).
If the token matches one or more non-whitespace, non-newline characters
(a word):
Increment wc by 1 (counting the word).
Increment ch by the length of the token (counting the characters in the
word).
Step 4: End of Input: The yywrap() function returns 1, indicating the end of the file/input
stream has been reached.
Step 5: Display Results: Print the number of lines (lc), number of spaces (sc), number of tabs
(tc), words (wc) and characters (ch).

Program:
/* Definition section */
%{
#include <stdio.h>
int lc = 0, sc = 0, tc = 0, ch = 0, wc = 0; /* line, space, tab, character and word counters */
%}

/* Rule section */
%%
\n        { lc++; ch += yyleng; }
" "       { sc++; ch += yyleng; }
\t        { tc++; ch += yyleng; }
[^ \t\n]+ { wc++; ch += yyleng; }
%%

int yywrap() { return 1; }

/* After typing the input, press Ctrl+D to signal end of input */

/* Main function */
int main() {
    printf("Enter the Sentence : ");
    yylex();
    printf("Number of lines : %d\n", lc);
    printf("Number of spaces : %d\n", sc);
    printf("Number of tabs, words, charc : %d , %d , %d\n", tc, wc, ch);

    return 0;
}
Sample Input and Output:

EXPERIMENT 2(b)
Problem Statement:
To write a Lex program to check whether given number is an Armstrong number or not.

Concept to be Applied:
An Armstrong number is a number that is equal to the sum of its own digits, each raised to
the power of the number of digits. For example, 153 is an Armstrong number because
(1^3) + (5^3) + (3^3) = 1 + 125 + 27 = 153
Examples:
Input: 153
Output: 153 is an Armstrong number

Input: 250
Output: 250 is not an Armstrong number
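
The definition can be checked with integer arithmetic alone. The following is a minimal sketch, independent of Lex; the function name is_armstrong is an assumption made for this example, and it uses a repeated-multiplication loop instead of pow() to avoid floating-point rounding.

```c
/* Returns 1 if n equals the sum of its digits, each raised to the
   power of the number of digits; returns 0 otherwise. */
int is_armstrong(int n)
{
    int digits = 0, temp;
    long long sum = 0;

    if (n < 0)
        return 0;

    /* Count the digits */
    for (temp = n; temp > 0; temp /= 10)
        digits++;

    /* Add digit^digits for every digit */
    for (temp = n; temp > 0; temp /= 10) {
        long long power = 1;
        int d = temp % 10, k;
        for (k = 0; k < digits; k++)
            power *= d;
        sum += power;
    }
    return sum == n;
}
```

With this sketch, is_armstrong(153) and is_armstrong(9474) return 1, while is_armstrong(250) returns 0.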

Algorithm:
Step 1: Input Reading: Read numbers from a file (or standard input).
Step 2: Length Calculation: For each number, determine the number of digits.
Step 3: Sum Calculation: Calculate the sum of each digit raised to the power of the number
of digits.
Step 4: Comparison: Compare the calculated sum with the original number.
Step 5: Output Result: Print whether the number is an Armstrong number or not.

Program:
/* Lex program to check whether given number is Armstrong number or not */
%{
/* Definition section */
#include <stdio.h>
#include <math.h>
#include <string.h>
void check(char*);
%}
/* Rule Section */
%%
[0-9]+   check(yytext);
.|\n     ;   /* ignore anything that is not a number */
%%
int yywrap() { return 1; }

int main()
{
    /* yyin is the FILE pointer the scanner reads from */
    extern FILE* yyin;
    yyin = fopen("num", "r");
    if (yyin == NULL) {
        printf("cannot open the file num\n");
        return 1;
    }
    /* The function that starts the analysis */
    yylex();
    return 0;
}
void check(char* a)
{
    int len = strlen(a), i, num = 0;
    /* Convert the digit string to an integer */
    for (i = 0; i < len; i++)
        num = num * 10 + (a[i] - '0');

    int x = 0, y = 0, temp = num;

    /* Sum each digit raised to the power of the number of digits */
    while (num > 0) {
        y = (int)(pow((num % 10), len) + 0.5);   /* pow returns a double, so round it */
        x = x + y;
        num = num / 10;
    }
    if (x == temp)
        printf("%d is an Armstrong number\n", temp);
    else
        printf("%d is not an Armstrong number\n", temp);
}
Sample Input and Output:

EXPERIMENT 3
Problem Statement:
To write a Lex program that copies file, replacing each nonempty sequence of white spaces
by a single blank.

Concept to be Applied:
The goal is to read from an input file and output its contents to another file, ensuring that:
Any sequence of one or more whitespace characters (spaces, tabs, newlines) is replaced
with a single space.
Whitespace Characters: These include spaces ( ), tabs (\t), and newline characters (\n).
Lex (Lexical Analyzer Generator): A tool used to generate a lexer that can recognize
patterns in text.
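
As a point of comparison with the Lex approach, the same replacement can be sketched in plain C on an in-memory string. The function name collapse_whitespace is hypothetical, and the sketch assumes the caller provides an output buffer at least as large as the input.

```c
/* Copies src to dst, replacing every nonempty run of spaces, tabs and
   newlines by a single blank. dst must hold strlen(src) + 1 bytes. */
void collapse_whitespace(const char *src, char *dst)
{
    int in_run = 0;                 /* 1 while inside a whitespace run */
    for (; *src != '\0'; src++) {
        if (*src == ' ' || *src == '\t' || *src == '\n') {
            if (!in_run)
                *dst++ = ' ';       /* emit one blank for the whole run */
            in_run = 1;
        } else {
            *dst++ = *src;          /* non-whitespace is copied as it is */
            in_run = 0;
        }
    }
    *dst = '\0';
}
```

The flag in_run does the job that the + operator does in a Lex pattern: it makes a whole run of whitespace count as one match.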

Algorithm:
Step 1: Read Input: Open the input file and prepare to read its contents.
Step 2: Match Whitespace: Use a regex pattern to match sequences of whitespace characters.
Step 3: Output Process: For each match:
If the match is a whitespace sequence, output a single space.
If it is a non-whitespace character, output that character as-it-is.
Step 4: Write to Output: Write the processed output to a file or standard output.

Program:
File: A5.l
%{
#include <stdio.h>
%}
ws [ \t\n]
%%
{ws}+ { /* any nonempty run of spaces, tabs and newlines becomes one blank */
    fprintf(yyout, " ");
}
%%
int yywrap() { return 1; }

int main()
{
    /* Point yyin to a file with text, this acts as input to our program */
    yyin = fopen("A5_input.txt", "r");
    /* Point yyout to the output file */
    yyout = fopen("A5_output.txt", "w");
    if (yyin == NULL || yyout == NULL) {
        printf("error while opening the files\n");
        return 1;
    }
    yylex();    /* unmatched characters are copied to yyout by the default rule */
    fclose(yyin);
    fclose(yyout);
    return 0;
}
---------------------------------------------------------------------------------------------------------------
File: A5_input.txt
Hello, Friends
Service to humanity
is
service to divinity.
If
you
don't
know
how
compiler works,
then
you don't
know how
computer works

Sample Input and Output:

EXPERIMENT 4
Problem Statement:
To write a Lex program that converts a file to Pig Latin. Assume the file is a
sequence of words (groups of letters) separated by whitespace. Every time you encounter a
word:
(a) If the first letter is a consonant, move it to the end of the word and then add "ay".
(b) If the first letter is a vowel, just add "ay" to the end of the word. All non-letters are copied
intact to the output.

Concept to be Applied:
Pig Latin is a playful form of language transformation, primarily used in English. The rules for
converting words to Pig Latin are as follows:
1. If a word starts with a consonant, move the first letter to the end of the word and add
"ay".
   Example: "hello" becomes "ellohay".
2. If a word starts with a vowel (a, e, i, o, u), simply add "ay" to the end of the word.
   Example: "apple" becomes "appleay".
3. Non-letter characters (punctuation, whitespace) should remain unchanged.
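
The two word rules can be sketched as a single C function. The name pig_latin_word is hypothetical, and the sketch assumes the argument is one word made of letters only; splitting the input into words is left to the scanner.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if ch is a vowel in either case. */
static int is_vowel(char ch)
{
    return ch != '\0' && strchr("aeiouAEIOU", ch) != NULL;
}

/* Writes the Pig Latin form of word into out.
   out must hold strlen(word) + 3 bytes. */
void pig_latin_word(const char *word, char *out)
{
    if (word[0] == '\0') {
        out[0] = '\0';                              /* empty word stays empty */
        return;
    }
    if (is_vowel(word[0]))
        sprintf(out, "%say", word);                 /* rule 2: append "ay" */
    else
        sprintf(out, "%s%cay", word + 1, word[0]);  /* rule 1: rotate first letter, append "ay" */
}
```

Applied to the examples above, "hello" becomes "ellohay" and "apple" becomes "appleay".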

Algorithm:
Step 1: Read Input: Open the input file and prepare to read its contents.
Step 2: Identify Words: Use regex to identify sequences of letters as words.
Step 3: Transform Each Word: For each identified word:

Apply the appropriate Pig Latin transformation.


Step 4: Output Non-letter Characters: Copy non-letter characters to the output unchanged.
Step 5: Write to Output: Print the transformed words and unchanged characters.

Program:
File: A6.l
%{
#include <stdio.h>
%}
c      [a-zA-Z]
vowel  [aeiouAEIOU]
%%
{vowel}{c}* {
    /* First character is a vowel: append "ay" to the word */
    printf("%say ", yytext);
    fprintf(yyout, "%say", yytext);
}
{c}{c}* {
    /* First character is a consonant: print the word without its first
       character, then the first character, then "ay" */
    printf("%s%cay ", yytext + 1, yytext[0]);
    fprintf(yyout, "%s%cay", yytext + 1, yytext[0]);
}
%%
int yywrap() { return 1; }

int main()
{
    printf("The output is : \n");
    yyin = fopen("A6_input.txt", "r");
    yyout = fopen("A6_outputfile1.txt", "w");
    yylex();
    printf("\n\n\n\n\n");
    fclose(yyin);
    fclose(yyout);
    /* Second pass: run the scanner again over the first output file */
    yyin = fopen("A6_outputfile1.txt", "r");
    yyout = fopen("A6_outputfile2.txt", "w");
    yylex();
    printf("\n");
    return 0;
}
------------------------------------------------------------------------------------------------------------
File: A6_input.txt
Many of Lifes failures are people who did not realize how close they were to success when
they gave up- Thomas Edison
Sample Input and Output:

EXPERIMENT 5
Problem Statement:
To write a C program to eliminate left recursion.

Concept to be Applied:
Left Recursion:
A grammar G = (V, T, S, P) is said to be left recursive if it has a production of the form
A → Aα | β
In other words, a grammar production is said to have left recursion if the leftmost
variable of its Right Hand Side is the same as the variable of its Left Hand Side. A
grammar containing a production having left recursion is called a Left Recursive
Grammar.
Eliminating Left Recursion
A left recursive production can be eliminated by rewriting the offending productions.
Consider a nonterminal A with two productions
A → Aα | β
where α and β are sequences of terminals and nonterminals that do not start with A.
The nonterminal A and its productions are said to be left recursive because the
production A → Aα has A itself as the leftmost symbol on the right side.
The left-recursive pair of productions can be replaced by the non-left-recursive
productions:
A → βA'
A' → αA' | ε
Here ε denotes the empty string; the program prints it as #.
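
The rewriting of an immediately left-recursive production can be sketched as a string transformation. This is only an illustration under stated assumptions: the function name eliminate_left_recursion is hypothetical, the production is written as A->Aα|β with a single-character non-terminal and exactly two alternatives, and the empty string is written as #.

```c
#include <stdio.h>
#include <string.h>

/* Rewrites "A->Aα|β" as "A->βA'" (out1) and "A'->αA'|#" (out2).
   Returns 1 on success and 0 if the production does not have that shape.
   Each output buffer must hold strlen(pro) + 4 bytes. */
int eliminate_left_recursion(const char *pro, char *out1, char *out2)
{
    const char *alpha, *bar;
    char a = pro[0];

    /* Must look like A->A... to be immediately left recursive */
    if (strlen(pro) < 4 || pro[1] != '-' || pro[2] != '>' || pro[3] != a)
        return 0;

    alpha = pro + 4;                    /* α starts right after the leading A */
    bar = strchr(alpha, '|');
    /* Need a nonempty α, a '|', and a nonempty β that does not start with A */
    if (bar == NULL || bar == alpha || bar[1] == '\0' || bar[1] == a)
        return 0;

    sprintf(out1, "%c->%s%c'", a, bar + 1, a);
    sprintf(out2, "%c'->%.*s%c'|#", a, (int)(bar - alpha), alpha, a);
    return 1;
}
```

For the input "E->E+T|T" the sketch produces "E->TE'" and "E'->+TE'|#".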

Why do we eliminate left recursion?


Left recursion often poses problems for parsers, either because it leads them into
infinite recursion (loop) or because they expect rules in a normal form that forbids it.
Top-down parsing methods cannot handle left recursive grammars, so a transformation
is needed to eliminate left recursion.
Therefore, grammar is often pre-processed to eliminate the left recursion.

Algorithm:
Step 1: Input Grammar: Read the grammar rules from the user.
Step 2: Identify Left Recursion: Analyze each production to detect left recursion.
Step 3: Transform Grammar: Convert left-recursive rules into their equivalent non-left-
recursive forms.
Step 4: Output the Transformed Grammar: Display the new grammar without left
recursion
Program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define SIZE 20
int main()
{
    char pro[SIZE] = {0}, alpha[SIZE], beta[SIZE];
    int nont_terminal, i, j, index = 3;   /* index 3 is the first symbol after "->" */

    printf("Enter the Production (e.g. E->E+T|T): ");
    scanf("%19s", pro);                   /* bounded read, pro holds SIZE bytes */

    nont_terminal = pro[0];
    if (nont_terminal == pro[index])      // Checking if the Grammar is LEFT RECURSIVE
    {
        // Getting Alpha
        for (i = ++index, j = 0; pro[i] != '|'; i++, j++) {
            // Checking if there is NO Vertical Bar (|)
            if (pro[i] == '\0') {
                printf("This Grammar CAN'T BE REDUCED.\n");
                exit(0);                  // Exit the Program
            }
            alpha[j] = pro[i];
        }
        alpha[j] = '\0';                  // String Ending NULL Character

        if (pro[++i] != 0)                // Checking if there is Character after Vertical Bar (|)
        {
            // Getting Beta
            for (j = i, i = 0; pro[j] != '\0'; i++, j++) {
                beta[i] = pro[j];
            }
            beta[i] = '\0';               // String Ending NULL character

            // Showing Output without LEFT RECURSION
            printf("\nGrammar Without Left Recursion: \n\n");
            printf(" %c->%s%c'\n", nont_terminal, beta, nont_terminal);
            printf(" %c'->%s%c'|#\n", nont_terminal, alpha, nont_terminal);
        }
        else
            printf("This Grammar CAN'T be REDUCED.\n");
    }
    else
        printf("\n This Grammar is not LEFT RECURSIVE.\n");
    return 0;
}
Sample Input and Output:

EXPERIMENT 6
Problem Statement:
To write a C program to implement left factoring.

Concept to be Applied:
Left Factoring:
Left factoring is a process by which a grammar with common prefixes is transformed
to make it suitable for top-down parsers.
In left factoring,
We make one production for each common prefix.
The common prefix may be a terminal or a non-terminal or a combination of both.
The rest of the derivation is added by new productions.
The grammar obtained after the process of left factoring is called a Left Factored
Grammar.

Example:
Do left factoring in the following grammar:
S
E

The left factored grammar is-


S `/a
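
The core of left factoring is finding the longest common prefix of two alternatives. The sketch below is illustrative only: the function names common_prefix_len and left_factor are hypothetical, the left-hand side is fixed to A and the new non-terminal to X, and an empty remainder is written as # (an assumption of this sketch).

```c
#include <stdio.h>
#include <string.h>

/* Length of the longest common prefix of two alternatives. */
size_t common_prefix_len(const char *a, const char *b)
{
    size_t n = 0;
    while (a[n] != '\0' && a[n] == b[n])
        n++;
    return n;
}

/* Left-factors A->alt1|alt2 into "A->prefixX" (out_a) and "X->rest1|rest2" (out_x).
   Returns 0 when the alternatives share no prefix, 1 otherwise.
   Each output buffer must hold strlen(alt1) + strlen(alt2) + 8 bytes. */
int left_factor(const char *alt1, const char *alt2, char *out_a, char *out_x)
{
    size_t n = common_prefix_len(alt1, alt2);
    if (n == 0)
        return 0;
    sprintf(out_a, "A->%.*sX", (int)n, alt1);
    sprintf(out_x, "X->%s|%s",
            alt1[n] != '\0' ? alt1 + n : "#",   /* '#' marks an empty remainder */
            alt2[n] != '\0' ? alt2 + n : "#");
    return 1;
}
```

As a made-up example, factoring A->abc|abd this way gives A->abX and X->c|d.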

Algorithm:
Step 1: Input Grammar: Read the grammar rules from the user.
Step 2: Identify Common Prefixes: For each non-terminal, check its productions for common
prefixes.
Step 3: Transform Grammar: Factor out the common prefixes and create new non-terminals.

Step 4: Output the Transformed Grammar: Display the left-factored grammar.

Program:
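#include <stdio.h>
#include <string.h>
int main()
{
    char gram[20], part1[20], part2[20], modifiedGram[20], newGram[20];
    int i, j = 0, k = 0, pos = 0;
    printf("Enter Production : A->");
    if (fgets(gram, sizeof gram, stdin) == NULL)   /* fgets replaces the unsafe gets */
        return 1;
    gram[strcspn(gram, "\n")] = '\0';              /* strip the trailing newline */
    if (strchr(gram, '|') == NULL) {
        printf("Expected two alternatives separated by '|'.\n");
        return 1;
    }
    /* part1 = alternative before '|' */
    for (i = 0; gram[i] != '|'; i++, j++)
        part1[j] = gram[i];
    part1[j] = '\0';
    /* part2 = alternative after '|' */
    for (j = ++i, i = 0; gram[j] != '\0'; j++, i++)
        part2[i] = gram[j];
    part2[i] = '\0';
    /* Copy the longest common prefix, stopping at the first mismatch */
    for (i = 0; part1[i] != '\0' && part1[i] == part2[i]; i++)
        modifiedGram[k++] = part1[i];
    pos = i;
    /* The remaining suffixes become the alternatives of the new non-terminal X */
    for (i = pos, j = 0; part1[i] != '\0'; i++, j++) {
        newGram[j] = part1[i];
    }
    newGram[j++] = '|';
    for (i = pos; part2[i] != '\0'; i++, j++) {
        newGram[j] = part2[i];
    }
    modifiedGram[k] = 'X';
    modifiedGram[++k] = '\0';
    newGram[j] = '\0';
    printf("\nGrammar After Left Factoring :\n");
    printf(" A->%s", modifiedGram);
    printf("\n X->%s\n", newGram);
    return 0;
}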
Sample Input and Output:

EXPERIMENT 7

Problem Statement:
To write a C program to calculate First and Follow sets of a given grammar.

Concept to be Applied:
First and Follow:
The functions follow and followfirst are both involved in the calculation of the Follow Set of
a given Non-Terminal. The follow set of the start symbol will always contain "$". The
calculation of Follow falls under three broad cases:
If a Non-Terminal on the R.H.S. of any production is followed immediately by a
Terminal then it can immediately be included in the Follow set of that Non-Terminal.
If a Non-Terminal on the R.H.S. of any production is followed immediately by a Non-
Terminal, then the First Set of that new Non-Terminal gets included in the follow set
of our original Non-Terminal. If an epsilon ("#") is encountered, move on
to the next symbol in the production. Note that "#" is never included in the Follow set of any
Non-Terminal.
If the end of a production is reached while calculating follow, then the Follow set of that
non-terminal will include the Follow set of the Non-Terminal on the L.H.S. of that
production. This can easily be implemented by recursion.
Assumptions:
1. Epsilon is represented by '#'.
2. Productions are of the form A=B, where 'A' is a single Non-Terminal and 'B' can be
any combination of terminals and non-terminals.
3. L.H.S. of the first production rule is the start symbol.
4. Grammar is not left recursive.
5. Each production of a non-terminal is entered on a different line.
6. Only uppercase letters are non-terminals and everything else is a terminal.
7. Do not use '!' or '$' as grammar symbols, as they are reserved for special purposes.
Explanation:
Store the grammar in a 2D character array production. The findfirst function calculates the
First set of any non-terminal. Calculation of First falls under two broad cases:

If the first symbol in the R.H.S of the production is a Terminal then it can directly be
included in the first set.

If the first symbol in the R.H.S of the production is a non-terminal then call the findfirst
function again on that non-terminal. Recursion is the most natural way to handle these
cases. Here again, if the First of the new non-terminal contains an epsilon
then we have to move to the next symbol of the original production which can again be
a Terminal or a Non-Terminal.
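
The two cases above can also be computed iteratively rather than recursively. The following sketch is not the manual's findfirst function: it is a fixed-point variant with hypothetical names (compute_first, set_add), written under the same input assumptions (productions of the form A=rhs, uppercase ASCII non-terminals, '#' for epsilon).

```c
#include <ctype.h>
#include <string.h>

#define NSYM 128   /* ASCII symbols only (assumption of this sketch) */

/* Adds sym to set; returns 1 if the set changed. */
static int set_add(int set[NSYM], int sym)
{
    if (set[sym]) return 0;
    set[sym] = 1;
    return 1;
}

/* On return, first[X - 'A'][t] is 1 when t (a terminal, or '#' for epsilon)
   is in FIRST(X). Repeats the two cases until no set grows any further. */
void compute_first(const char *prods[], int count, int first[26][NSYM])
{
    int changed = 1, i, t;
    memset(first, 0, 26 * sizeof first[0]);
    while (changed) {
        changed = 0;
        for (i = 0; i < count; i++) {
            int lhs = prods[i][0] - 'A';
            const char *rhs = prods[i] + 2;        /* skip "A=" */
            int all_nullable = 1;                  /* every symbol so far can vanish */
            for (; *rhs != '\0' && all_nullable; rhs++) {
                unsigned char s = (unsigned char)*rhs;
                if (s == '#' || s >= NSYM)
                    continue;                      /* explicit epsilon adds no terminal */
                if (!isupper(s)) {                 /* case 1: terminal */
                    changed |= set_add(first[lhs], s);
                    all_nullable = 0;
                } else {                           /* case 2: non-terminal */
                    for (t = 0; t < NSYM; t++)
                        if (t != '#' && first[s - 'A'][t])
                            changed |= set_add(first[lhs], t);
                    if (!first[s - 'A']['#'])
                        all_nullable = 0;          /* stop unless it derives epsilon */
                }
            }
            if (all_nullable)
                changed |= set_add(first[lhs], '#');
        }
    }
}
```

The iteration avoids the recursion depth of the recursive approach at the cost of rescanning all productions until nothing changes; either formulation follows the same two cases.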

Algorithm:
Step 1: Input the Grammar:
Ask the user to input the number of productions.
For each production, prompt the user to input a production rule (e.g., A=B).
Step 2: Initialize Data Structures:
Create arrays to store First and Follow sets (calc_first[] and calc_follow[]), initialized
to empty values.
Prepare temporary arrays first[] and f[] for storing intermediate results during
calculations.
Use arrays done[] and donee[] to keep track of which non-terminals already have their
First and Follow sets calculated.
Step 3: Compute the First Sets:
For each production, check if the First set of the left-hand non-terminal (e.g., A in A=B)
has already been computed.
If not, call the findfirst() function to calculate the First set for A.
Inside findfirst():

If the symbol is a terminal, add it directly to the First set.


If it's a non-terminal, recursively compute the First set of the right-hand side
symbols.
Add epsilon (#) if applicable.

Store and print the First set for the non-terminal.

Step 4: Compute the Follow Sets


For each non-terminal, check if the Follow set has already been computed.
If not, call the follow() function to calculate the Follow set for that non-terminal.
Inside follow():

If the non-terminal is the start symbol, add '$' to its Follow set.
Look for occurrences of the non-terminal in the right-hand side of other
productions.
If followed by a terminal, add the terminal to the Follow set.
If followed by a non-terminal, add the First set of the non-terminal to the Follow
set.
If it appears at the end of a production, add the Follow set of the left-hand non-
terminal.

Store and print the Follow set for the non-terminal.

Step 5: Output the Results: After all computations, print the First and Follow sets for each
non-terminal.

Program:
#include <ctype.h>
#include <stdio.h>
#include <string.h>

// Functions to calculate Follow


void followfirst(char, int, int);
void follow(char c);

// Function to calculate First


void findfirst(char, int, int);

int count, n = 0;

// Stores the final result of the First Sets


char calc_first[10][100];

// Stores the final result of the Follow Sets


char calc_follow[10][100];
int m = 0;

// Stores the production rules


char production[10][10];
char f[100], first[100]; // scratch buffers; they accumulate entries for all non-terminals, so 10 is too small
int k;
char ck;
int e;
int main(int argc, char** argv)
{
int jm = 0;
int km = 0;
int i, j, choice;
char c, ch;

// Take number of productions from user


printf("Enter the number of productions: ");
scanf("%d", &count);

// Take the production rules as input from user


printf("Enter the productions (format: A=BC, use # for epsilon):\n");
for (i = 0; i < count; i++) {
printf("Production %d: ", i + 1);
scanf("%9s", production[i]); // bounded read, each row holds 10 bytes
}

// Variables used in the First and Follow calculation


int kay;
char done[count];
int ptr = -1;

// Initializing the calc_first array


for (k = 0; k < count; k++) {
for (kay = 0; kay < 100; kay++) {
calc_first[k][kay] = '!';
}
}

int point1 = 0, point2, xxx;

for (k = 0; k < count; k++) {


c = production[k][0];
point2 = 0;
xxx = 0;

// Checking if First of c has already been calculated


for (kay = 0; kay <= ptr; kay++)
if (c == done[kay])
xxx = 1;

if (xxx == 1)
continue;

// Function call to calculate First set


findfirst(c, 0, 0);
ptr += 1;

// Adding c to the calculated list


done[ptr] = c;
printf("\n First(%c) = { ", c);
calc_first[point1][point2++] = c;

// Printing the First Sets of the grammar


for (i = 0 + jm; i < n; i++) {
int lark = 0, chk = 0;

for (lark = 0; lark < point2; lark++) {

if (first[i] == calc_first[point1][lark]) {
chk = 1;
break;
}
}
if (chk == 0) {
printf("%c, ", first[i]);
calc_first[point1][point2++] = first[i];
}
}
printf("}\n");
jm = n;
point1++;
}
printf("\n-----------------------------------------------\n\n");

// Initializing the calc_follow array


char donee[count];
ptr = -1;

for (k = 0; k < count; k++) {


for (kay = 0; kay < 100; kay++) {
calc_follow[k][kay] = '!';
}
}
point1 = 0;
int land = 0;
for (e = 0; e < count; e++) {
ck = production[e][0];
point2 = 0;

xxx = 0;

// Checking if Follow of ck has already been calculated


for (kay = 0; kay <= ptr; kay++)
if (ck == donee[kay])
xxx = 1;

if (xxx == 1)
continue;
land += 1;

// Function call to calculate Follow set


follow(ck);
ptr += 1;

// Adding ck to the calculated list


donee[ptr] = ck;
printf(" Follow(%c) = { ", ck);
calc_follow[point1][point2++] = ck;

// Printing the Follow Sets of the grammar


for (i = 0 + km; i < m; i++) {
int lark = 0, chk = 0;
for (lark = 0; lark < point2; lark++) {
if (f[i] == calc_follow[point1][lark]) {
chk = 1;
break;
}
}
if (chk == 0) {

printf("%c, ", f[i]);
calc_follow[point1][point2++] = f[i];
}
}
printf(" }\n\n");
km = m;
point1++;
}
}

// Function to calculate First set


void findfirst(char c, int q1, int q2)
{
int j;

// If terminal is encountered
if (!(isupper(c))) {
first[n++] = c;
return;
}
for (j = 0; j < count; j++) {
if (production[j][0] == c) {
if (production[j][2] == '#') {
if (production[q1][q2] == '\0')
first[n++] = '#';
else if (production[q1][q2] != '\0'
&& (q1 != 0 || q2 != 0)) {
findfirst(production[q1][q2], q1,
(q2 + 1));
} else

first[n++] = '#';
} else if (!isupper(production[j][2])) {
first[n++] = production[j][2];
} else {
findfirst(production[j][2], j, 3);
}
}
}
}

// Function to calculate Follow set


void follow(char c)
{
int i, j;

// Adding "$" to the follow set of the start symbol


if (production[0][0] == c) {
f[m++] = '$';
}
for (i = 0; i < count; i++) {
for (j = 2; j < strlen(production[i]); j++) {
if (production[i][j] == c) {
if (production[i][j + 1] != '\0') {
followfirst(production[i][j + 1], i, j + 2);
}

if (production[i][j + 1] == '\0' && c != production[i][0]) {


follow(production[i][0]);
}
}

}
}
}

// Helper function for Follow set calculation


void followfirst(char c, int c1, int c2)
{
int k;

// If terminal is encountered
if (!(isupper(c)))
f[m++] = c;
else {
int i = 0, j = 1;
for (i = 0; i < count; i++) {
if (calc_first[i][0] == c)
break;
}

// Adding First set of the non-terminal to the Follow set


while (calc_first[i][j] != '!') {
if (calc_first[i][j] != '#') {
f[m++] = calc_first[i][j];
} else {
if (production[c1][c2] == '\0') {
follow(production[c1][0]);
} else {
followfirst(production[c1][c2], c1, c2 + 1);
}
}

j++;
}
}
}
Sample Input and Output:

EXPERIMENT 8

Problem Statement:
To write a C program to implement Recursive Descent Parsing.

Concept to be Applied:
Parsing:
Parsing is the process to determine whether the start symbol can derive the program or not. If
the Parsing is successful then the program is a valid program otherwise the program is invalid.
There are generally two types of Parsers:
1. Top-Down Parsers:
In this Parsing technique we expand the start symbol to the whole program.
Recursive Descent and LL parsers are the Top-Down parsers.
2. Bottom-Up Parsers:
In this Parsing technique we reduce the whole program to start symbol.
Operator Precedence Parser, LR(0) Parser, SLR Parser, LALR Parser and CLR
Parser are the Bottom-Up parsers.
Recursive Descent Parser:
It is a kind of Top-Down Parser. A top-down parser builds the parse tree from the top to down,
starting with the start non-terminal. A Predictive Parser is a special case of Recursive Descent
Parser, where no Back Tracking is required.
Example:

Before removing left recursion      After removing left recursion

E → E+T | T                         E → TE'
T → T*F | F                         E' → +TE' | ε
F → (E) | id                        T → FT'
                                    T' → *FT' | ε
                                    F → (E) | id

For Recursive Descent Parser, we write one function for every variable (non-terminal) of the grammar.
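
A compact way to see the "one function per variable" idea is a recognizer in which every function takes the current input position and returns the position after what it matched. This is a sketch with hypothetical names (parse_E, parse_Edash and so on, plus accepts); like the algorithm in this experiment, it writes id as the single character i.

```c
#include <stddef.h>

/* Each function returns the position after the matched text, or NULL on failure. */
static const char *parse_E(const char *p);

static const char *parse_F(const char *p)        /* F -> ( E ) | i */
{
    if (*p == 'i')
        return p + 1;
    if (*p == '(') {
        p = parse_E(p + 1);
        if (p != NULL && *p == ')')
            return p + 1;
    }
    return NULL;
}

static const char *parse_Tdash(const char *p)    /* T' -> * F T' | epsilon */
{
    if (*p == '*') {
        p = parse_F(p + 1);
        return p != NULL ? parse_Tdash(p) : NULL;
    }
    return p;                                    /* epsilon: consume nothing */
}

static const char *parse_T(const char *p)        /* T -> F T' */
{
    p = parse_F(p);
    return p != NULL ? parse_Tdash(p) : NULL;
}

static const char *parse_Edash(const char *p)    /* E' -> + T E' | epsilon */
{
    if (*p == '+') {
        p = parse_T(p + 1);
        return p != NULL ? parse_Edash(p) : NULL;
    }
    return p;                                    /* epsilon: consume nothing */
}

static const char *parse_E(const char *p)        /* E -> T E' */
{
    p = parse_T(p);
    return p != NULL ? parse_Edash(p) : NULL;
}

/* Returns 1 if the whole input is derivable from E, 0 otherwise. */
int accepts(const char *input)
{
    const char *end = parse_E(input);
    return end != NULL && *end == '\0';
}
```

Passing the position explicitly, instead of keeping a global cursor, is a design choice of this sketch; it makes each function easy to test in isolation.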

Algorithm:

Step 1: Input the Expression: Read the input string from the user. Set the cursor to point at
the start of the input string.

Step 2: Start Parsing with E: Call the E() function, which tries to match the input with the
grammar rule E -> T E'.

Step 3: Parse E (Expression): Match the first part of the expression by calling T(). Then match
the remaining part by calling Edash() (E').

Step 4: Parse Edash (E'): If the current symbol is +, consume +, call T(), and recursively call
Edash(). If no + is found, return success (empty production).

Step 5: Parse T (Term): Match the first factor by calling F(). Then match the remaining part
by calling Tdash() (T').

Step 6: Parse Tdash (T'): If the current symbol is *, consume *, call F(), and recursively call
Tdash(). If no * is found, return success (empty production).

Step 7: Parse F (Factor): If the current symbol is (, consume (, call E(), and then consume ).
If the current symbol is i, consume i. If neither ( nor i is found, return failure.

Step 8: Final Validation: If parsing is successful and the entire input string is consumed, the
string is valid. Otherwise, return a parsing error.

Program:
#include <stdio.h>
#include <string.h>

#define SUCCESS 1
#define FAILED 0

int E(), Edash(), T(), Tdash(), F();


const char *cursor;
char string[64];

int main()
{

puts("Enter the string");
scanf("%63s", string);
//sscanf("i+(i+i)*i", "%s", string);
cursor = string;
puts("");
puts("Input Action");
puts("--------------------------------");

if (E() && *cursor == '\0') {


puts("--------------------------------");
puts("String is successfully parsed");
return 0;
} else {
puts("--------------------------------");
puts("Error in parsing String");
return 1;
}
}

int E()
{
printf("%-16s E -> T E'\n", cursor);
if (T()) {
if (Edash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
}

int Edash()
{
if (*cursor == '+') {
printf("%-16s E' -> + T E'\n", cursor);
cursor++;
if (T()) {
if (Edash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
} else {
printf("%-16s E' -> $\n", cursor);
return SUCCESS;
}
}

int T()
{
printf("%-16s T -> F T'\n", cursor);
if (F()) {
if (Tdash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
}

int Tdash()
{
if (*cursor == '*') {
printf("%-16s T' -> * F T'\n", cursor);
cursor++;
if (F()) {
if (Tdash())
return SUCCESS;
else
return FAILED;
} else
return FAILED;
} else {
printf("%-16s T' -> $\n", cursor);
return SUCCESS;
}
}

int F()
{
if (*cursor == '(') {
printf("%-16s F -> ( E )\n", cursor);
cursor++;
if (E()) {
if (*cursor == ')') {
cursor++;
return SUCCESS;
} else
return FAILED;
} else

return FAILED;
} else if (*cursor == 'i') {
printf("%-16s F -> i\n", cursor);
cursor++;
return SUCCESS;
} else
return FAILED;
}

Sample Input and Output:

EXPERIMENT 9

Problem Statement:
To write a C program to implement SLR Parsing algorithm.

Concept to be Applied:
SLR (1) Parsing
SLR (1) refers to simple LR parsing. It uses the same canonical collection of LR (0) items as LR (0) parsing.
The only difference is in the parsing table.
To construct the SLR (1) parsing table, we use the canonical collection of LR (0) items.
In SLR (1) parsing, we place a reduce move only under the terminals in the FOLLOW set of the production's left-hand side.
Various steps involved in the SLR (1) Parsing:
For the given input string write a context free grammar.
Check the ambiguity of the grammar.
Add the augmented production to the given grammar.
Create the canonical collection of LR (0) items.
Draw the deterministic finite automaton (DFA) over the item sets.
Construct the SLR (1) parsing table.

Algorithm:
Step 1: Initialize the Input and Stack:

Take an input string from the user.


Initialize the stack with the starting state 0 and empty symbols.
Step 2: Parsing Loop:

Repeat until the entire string is processed:


Get Current State: Fetch the current state from the top of the stack.
Identify Input Symbol: Check the current input symbol (ip[j]).
Determine Action: Use the Action Table (axn) to determine whether to Shift or
Reduce based on the current state and input symbol.
Step 3: Shift Operation:

If the action is Shift:


Push the current input symbol onto the symbol stack (pushb).
Push the corresponding state from the Action Table onto the stack (push).
Move to the next input symbol.
Step 4: Reduce Operation:
If the action is Reduce:
Use the Reduce Table (production rules) to pop symbols and states from the
stack.
Based on the production rule, reduce to a non-terminal symbol.
Push the non-terminal symbol and its corresponding state from the Goto Table
(gotot) onto the stack.
Step 5: Accept or Reject:

If the action indicates Accept (action code 102 in the Action Table), declare the string as accepted.
If no valid shift or reduce action is found, declare a syntax error.
Step 6: Exit:

Stop when the input string is either successfully parsed or an error is encountered.
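
The Action Table lookup in Steps 2 to 5 can be sketched as below. The codes 100 (shift), 101 (reduce) and 102 (accept) and the column order i + * ( ) $ are taken from the program in this experiment. The names column_of and describe_action are hypothetical helpers introduced only for this sketch.

```c
#include <stdio.h>

/* Action codes as used by the axn table in this experiment. */
enum { ACT_SHIFT = 100, ACT_REDUCE = 101, ACT_ACCEPT = 102 };

/* Maps an input character to its Action Table column, or -1 if it is not a
   terminal of the grammar. Any lowercase letter counts as the identifier i. */
int column_of(char c)
{
    if (c >= 'a' && c <= 'z')
        return 0;
    switch (c) {
    case '+': return 1;
    case '*': return 2;
    case '(': return 3;
    case ')': return 4;
    case '$': return 5;
    default:  return -1;
    }
}

/* Writes a readable form of one table entry into buf:
   "s5" (shift and go to state 5), "r2" (reduce by production 2),
   "acc" (accept) or "err" (no valid action). */
void describe_action(const int entry[2], char *buf, size_t size)
{
    switch (entry[0]) {
    case ACT_SHIFT:  snprintf(buf, size, "s%d", entry[1]); break;
    case ACT_REDUCE: snprintf(buf, size, "r%d", entry[1]); break;
    case ACT_ACCEPT: snprintf(buf, size, "acc");           break;
    default:         snprintf(buf, size, "err");           break;
    }
}
```

With the table below, describe_action(axn[0][column_of('i')], buf, sizeof buf) would produce "s5", which is the usual textbook notation for the entry {100,5}.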

Program:
#include<stdio.h>
#include<string.h>
int axn[][6][2]={
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{100,6},{-1,-1},{-1,-1},{-1,-1},{102,102}},
{{-1,-1},{101,2},{100,7},{-1,-1},{101,2},{101,2}},
{{-1,-1},{101,4},{101,4},{-1,-1},{101,4},{101,4}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{101,6},{101,6},{-1,-1},{101,6},{101,6}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{100,5},{-1,-1},{-1,-1},{100,4},{-1,-1},{-1,-1}},
{{-1,-1},{100,6},{-1,-1},{-1,-1},{100,11},{-1,-1}},
{{-1,-1},{101,1},{100,7},{-1,-1},{101,1},{101,1}},
{{-1,-1},{101,3},{101,3},{-1,-1},{101,3},{101,3}},
{{-1,-1},{101,5},{101,5},{-1,-1},{101,5},{101,5}}
};//Axn Table
int gotot[12][3]={1,2,3,-1,-1,-1,-1,-1,-1,-1,-1,-1,8,2,3,-1,-1,-1,
-1,9,3,-1,-1,10,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1}; //GoTo table
int a[10];
char b[10];
int top=-1,btop=-1,i;
void push(int k)
{
if(top<9)
a[++top]=k;
}
void pushb(char k)
{

if(btop<9)
b[++btop]=k;
}
int TOS() //Returns the state on top of the state stack
{
return a[top];
}
void pop()
{
if(top>=0)
top--;
}
void popb()
{
if(btop>=0)
b[btop--]='\0';
}
void display()
{

for(i=0;i<=top;i++)
printf("%d%c",a[i],b[i]);
}
void display1(char p[],int m) //Displays The Present Input String
{
int l;
printf("\t\t");
for(l=m;p[l]!='\0';l++)
printf("%c",p[l]);
printf("\n");
}
void error()
{
printf("Syntax Error");
}
void reduce(int p)
{
int len,k,ad;
char src,*dest;
switch(p)
{
case 1:dest="E+T";
src='E';
break;
case 2:dest="T";
src='E';
break;
case 3:dest="T*F";
src='T';
break;

case 4:dest="F";
src='T';
break;
case 5:dest="(E)";
src='F';
break;
case 6:dest="i";
src='F';
break;
default:dest="\0";
src='\0';
break;
}
for(k=0;k<strlen(dest);k++)
{
pop();
popb();
}
pushb(src);
switch(src)
{
case 'E':ad=0;
break;
case 'T':ad=1;
break;
case 'F':ad=2;
break;
default: ad=-1;
break;
}

push(gotot[TOS()][ad]);
}
int main()
{
int j,st,ic;
char ip[20]="\0",an;
// clrscr();
printf("Enter any String\n");
scanf("%19s",ip); //ip holds at most 19 characters plus the terminator

push(0);
display();
printf("\t%s\n",ip);
for(j=0;ip[j]!='\0';)
{
st=TOS();
an=ip[j];
if(an>='a'&&an<='z') ic=0;
else if(an=='+') ic=1;
else if(an=='*') ic=2;
else if(an=='(') ic=3;
else if(an==')') ic=4;
else if(an=='$') ic=5;
else {
error();
break;
}
if(axn[st][ic][0]==100) //Shift
{
pushb(an);
push(axn[st][ic][1]);
display();
j++;
display1(ip,j);
}
else if(axn[st][ic][0]==101) //Reduce
{
reduce(axn[st][ic][1]);
display();
display1(ip,j);
}
else if(axn[st][ic][0]==102) //Accept
{
printf("Given String is accepted \n");
// getch();
break;
}
else //No valid action: report the error instead of looping forever
{
error();
printf("\nGiven String is rejected \n");
break;
}
}
return 0;
}

Sample Input and Output:

EXPERIMENT 10

Problem Statement:
To write a C program to implement type checking.

Concept to be Applied:
Type Checking: Type checking ensures that operations are performed on compatible data
types. For instance, in the case of division (/), if any variable involved in the operation is of
type float, the expression must also handle the result as a float.

Expression Parsing: Parsing is the process of analyzing a string or sequence of tokens (like
the expression) to ensure it follows certain rules.

Flag Mechanism for Division: Using flags is a way to track certain conditions in a program.
A flag is essentially a boolean variable (either true or false) that indicates whether a specific
condition has occurred.

Mapping Variables to Types: Variables in programming are often associated with a type,
such as int or float. Operations on these variables need to respect these types.

Error Handling: Error handling involves detecting and managing errors or exceptions that
may occur during program execution.
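
The type rule described above can be sketched as a pair of small C functions. This is a simplified illustration, assuming only two types (int and float) and treating every division as producing a float, as this experiment does. The names VarType, result_type, assignment_ok and check_assignment are hypothetical and do not appear in the lab program.

```c
/* Types known to this sketch. */
typedef enum { TYPE_INT, TYPE_FLOAT } VarType;

/* Result type of "lhs op rhs" for + - *: float if either operand is float. */
VarType result_type(VarType lhs, VarType rhs)
{
    return (lhs == TYPE_FLOAT || rhs == TYPE_FLOAT) ? TYPE_FLOAT : TYPE_INT;
}

/* Returns 1 if a value of type value may be stored in a variable of type
   target without losing a fractional part, 0 otherwise. */
int assignment_ok(VarType target, VarType value)
{
    return !(target == TYPE_INT && value == TYPE_FLOAT);
}

/* Checks "target = lhs op rhs". Assumption: a division is always treated
   as producing a float, so its target must be declared float. */
int check_assignment(VarType target, char op, VarType lhs, VarType rhs)
{
    VarType value = (op == '/') ? TYPE_FLOAT : result_type(lhs, rhs);
    return assignment_ok(target, value);
}
```

For example, check_assignment(TYPE_INT, '/', TYPE_INT, TYPE_INT) returns 0, which corresponds to the "must be a float type" message of the program in this experiment.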

Algorithm:

Step 1: Input Number of Variables: Prompt the user to enter the number of variables (n).
Initialize arrays to store variable names and types.

Step 2: Input Variables and Types:

For each variable:


Prompt the user to enter the variable's name and its type (f for float, i for int).
If any variable is of type float, set a flag to indicate this.

Step 3: Input Expression:

Prompt the user to enter an expression, ending with a $ symbol.


Store the expression in an array (b[]).

Step 4: Check for Division:

Scan the expression to check if it contains a division operator (/).
If division is found, set a flag indicating that the expression involves division.

Step 5: Type Check for Expression:

Compare the first variable in the expression with the variable names stored.
If the expression contains division:

Ensure the variable used is of type float.


If the variable is not of type float, print an error message.
Otherwise, confirm the data type is correctly defined.

If the expression does not contain division, print that the datatype is correctly defined.

Step 6: End:

Return from the program after type checking.

Program:
//To implement type checking
#include<stdio.h>
#include<stdlib.h>
int main()
{
int n=0,i,k,flag=0;
int c; /* int, so that EOF can be told apart from a character */
char vari[15],typ[15],b[15];
printf("Enter the number of variables:");
scanf(" %d",&n);
if(n<1||n>15) /* vari[] and typ[] hold at most 15 entries */
{
printf("Number of variables must be between 1 and 15\n");
return 1;
}
for(i=0;i<n;i++)
{
printf("Enter the variable[%d]:",i);
scanf(" %c",&vari[i]);
printf("Enter the variable-type[%d](float-f,int-i):",i);
scanf(" %c",&typ[i]);
if(typ[i]=='f')
flag=1;
}
printf("Enter the Expression(end with $):");
i=0;
getchar();
/* Read up to '$', end of input, or a full buffer */
while((c=getchar())!='$'&&c!=EOF&&i<15)
{
b[i]=c;
i++;
}
k=i;
for(i=0;i<k;i++)
{
if(b[i]=='/')
{
flag=1;
break;
}
}
for(i=0;i<n;i++)
{
if(k>0&&b[0]==vari[i])
{
if(flag==1)
{
if(typ[i]=='f')
{
printf("\nthe datatype is correctly defined..!\n");
break;
}
else
{
printf("Identifier %c must be a float type..!\n",vari[i]);
break;
}
}
else
{
printf("\nthe datatype is correctly defined..!\n");
break;
}
}
}
return 0;
}

Sample Input and Output:

EXPERIMENT 11 (a)
Problem Statement:
To write a YACC program to check whether given string is Palindrome or not.

Concept to be Applied:
YACC (Yet Another Compiler-Compiler) is the standard parser generator for the Unix
operating system. An open source program, yacc generates code for the parser in
the C programming language. The name is usually written in lowercase but is
occasionally seen as YACC or Yacc.
A palindrome is a sequence of letters, numbers, or whole words that reads the same
forwards as it does backwards.
Palindrome Examples:
123454321 (numbers)
Race car, kayak (letters)
Hannah, Otto, Ava, Bob (names)
King, are you glad you are King? (words)
Algorithm:

Step 1: Initialization: Include necessary libraries and define token types.

Step 2: Lexical Analyzer:

Read input characters.


Identify and return tokens:

String: Return as STR if it consists of letters.


Operators: Return mathematical symbols (e.g., +, -, *, /).
Whitespace: Ignore.

Step 3: Parser:

Define grammar rules.


Match input with rule to get the input string.

Step 4: Palindrome Check:

Calculate string length.


Loop through the first half of the string:

Compare characters from the start and end.


If all characters match, it's a palindrome; otherwise, it's not.

Step 5: Output Result: Print whether the input string is a palindrome or not.

Step 6: Error Handling: If a syntax error occurs, print an error message.
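
Step 4 can be sketched on its own in plain C, outside of YACC. This is a minimal illustration, assuming a case-sensitive comparison that does not skip spaces or punctuation. The function name is_palindrome is hypothetical.

```c
#include <string.h>

/* Returns 1 if s reads the same forwards and backwards, 0 otherwise.
   The comparison is case-sensitive and does not skip any characters. */
int is_palindrome(const char *s)
{
    size_t len = strlen(s);
    /* Only the first half needs to be visited: position i is compared
       with its mirror position len - 1 - i. */
    for (size_t i = 0; i < len / 2; i++) {
        if (s[i] != s[len - 1 - i])
            return 0;
    }
    return 1;
}
```

The loop bound len / 2 works for both odd and even lengths, because the middle character of an odd-length string always matches itself.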

Program:
Lexical Analyzer Source Code:

%{
/* Definition section */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "y.tab.h"
%}

/* %option noyywrap */

/* Rule Section */
%%

[a-zA-Z]+ {yylval.f = strdup(yytext); /* copy the text: yytext is reused for the next token */ return STR;}
[-+()*/] {return yytext[0];}
[ \t\n] {;}

%%

int yywrap()
{
return 1;
}

Parser Source Code:

%{
/* Definition section */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
extern int yylex();

void yyerror(char *msg);


int flag;

int i;
int k =0;
%}

%union {
char* f;
}

%token <f> STR


%type <f> E

/* Rule Section */
%%

S:E{
flag = 0;
k = strlen($1) - 1;
/* Compare characters from both ends, moving towards the middle.
The same loop handles odd and even lengths. */
for (i = 0; i < k - i; i++) {
if ($1[i] != $1[k-i]) {
flag = 1;
break;
}
}
if (flag == 1) printf("Not palindrome\n");
else printf("palindrome\n");
printf("%s\n", $1);
}
;

E : STR {$$ = $1;}


;

%%

void yyerror(char *msg)


{
fprintf(stderr, "%s\n", msg);
exit(1);
}

//driver code
int main()
{
yyparse();
return 0;
}

Sample Input And Output:

EXPERIMENT 11 (b)

Problem Statement:
To write a YACC program to recognize strings of { a^n b | n >= 5 }.

Concept to be Applied:
The language consists of strings that start with at least five 'a' characters followed by
exactly one 'b'.
Example valid strings: aaaaab, aaaaaaab, aaaaaaaaab, etc.
Example invalid strings: aaaab, aab, b, aaaa, etc.
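
Before writing the grammar, the language can be checked with a few lines of plain C. This is only a sketch of the same membership test, assuming that upper-case A and B are accepted as well, as in the lexical analyzer rules of this experiment. The function name is_an_b is hypothetical.

```c
/* Returns 1 if s consists of at least five a's followed by exactly one b
   and nothing else (either case), 0 otherwise. */
int is_an_b(const char *s)
{
    int count = 0;
    while (*s == 'a' || *s == 'A') {
        count++;
        s++;
    }
    if (count < 5)
        return 0;
    if (*s != 'b' && *s != 'B')
        return 0;
    s++;
    return *s == '\0';
}
```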

Algorithm:
Step 1: Lexical Analyzer (Flex)

Define Header Section:


Include the header file "y.tab.h" for token definitions.
Specify Rules for Token Recognition:
Define patterns to recognize specific characters:
Recognize a or A and return token A.
Recognize b or B and return token B.
Recognize newline character (\n) and return token NL.
Match any other character (.) and return it as itself.
Define yywrap Function:
Implement the yywrap() function that returns 1, indicating that there are no more
inputs to process.

Step 2: Parser (YACC)

Define Header Section:


Include standard libraries: <stdio.h> and <stdlib.h>.
Declare Tokens:
Declare the tokens A, B, and NL.
Define Grammar Rules:
Create rules for parsing:
Define the rule stmt:
It recognizes a sequence starting with five A tokens, followed
by zero or more As (captured by S), and ends with token
B, followed by a newline (NL).
If the rule matches, print "valid string" and exit.
Define the non-terminal S:
It can recursively include more As or be empty (to allow
for any number of additional As before B).

Implement Error Handling:


Define yyerror function to handle syntax errors:
Print "invalid string" and exit when a parsing error occurs.

Step 3: Main Function

Prompt for Input:

Print "enter the string" to prompt the user.

Invoke yyparse:

Call yyparse() to start the parsing process, which will use the defined rules
and the lexical analyzer to validate the input string.

Program:
Lexical Analyzer Source Code:
%{
/* Definition section */
#include "y.tab.h"
%}

/* Rule Section */
%%
[aA] {return A;}
[bB] {return B;}
\n {return NL;}
. {return yytext[0];}
%%

int yywrap()
{
return 1;
}
Parser Source Code:
%{
/* Definition section */
#include<stdio.h>
#include<stdlib.h>
int yylex();
int yyerror(char *msg);
%}

%token A B NL

/* Rule Section */
%%
/* Five explicit As enforce n >= 5; S matches any further As */
stmt: A A A A A S B NL {printf("valid string\n");
exit(0);}
;
S: S A
|
;
%%

int yyerror(char *msg)
{
printf("invalid string\n");
exit(0);
}

//driver code
int main()
{
printf("enter the string\n");
yyparse();
return 0;
}
Sample Input and Output:

EXPERIMENT 12
Problem Statement:

To write a program to validate the given expression using the compiler tools LEX and YACC
in LLVM.

Concepts to be Applied:

Lexical Analysis (LEX):


LEX will be used to tokenize the input expression. We can define patterns for
different types of tokens (e.g., identifiers, constants, operators, parentheses).
Each token type will have a corresponding action to return that token to the
parser.
Syntax Analysis (YACC):
YACC will define the grammar of valid expressions. It will use production rules
to specify how tokens can be combined.
The grammar will include rules for operators, parentheses, and precedence.
LLVM Integration:
LLVM can be used for advanced features, such as generating intermediate code
from valid expressions, enabling further transformations or optimizations.
Error Handling:
Implement robust error handling in both the lexer and parser to provide
meaningful feedback for invalid expressions.

Algorithm:

Step 1: Start the program.
Step 2: Read an expression.
Step 3: Check the validity of the given expression against the grammar rules using YACC.
Step 4: Print whether the given expression is valid or invalid.
Step 5: Stop the program.

Program:

//Program to recognize a valid arithmetic expression that uses operator +, -, * and /.


LEX Part:
%{

#include<stdio.h>
#include"y.tab.h"
%}
%%

[a-zA-Z]+ return VARIABLE;

[0-9]+ return NUMBER;

[ \t] ;

[\n] return 0;

. return yytext[0];

%%
int yywrap()
{
return 1;
}

YACC Part:
%{
#include<stdio.h>
int yylex();
void yyerror(const char *s);
%}
%token NUMBER
%token VARIABLE
%left '+' '-'
%left '*' '/' '%'
%left '(' ')'

%%
S: VARIABLE '=' E {
printf("\n Entered arithmetic expression is Valid\n\n");
return 0;
}
;
E: E '+' E
| E '-' E
| E '*' E
| E '/' E
| E '%' E
| '(' E ')'
| NUMBER
| VARIABLE
;
%%

int main()
{
printf("\n Enter Any Arithmetic Expression which can have operations Addition, "
"Subtraction, Multiplication, Division, Modulus and Round brackets:\n");
yyparse();
return 0;
}
void yyerror(const char *s)
{
printf("\n Entered arithmetic expression is Invalid\n\n");
}

Execution Steps:

nano exp.y
yacc -d exp.y
nano exp.l
lex exp.l
clang lex.yy.c y.tab.c -w
./a.out

Sample Input and Output:
For Valid Expressions:

For Invalid Expressions:

EXPERIMENT 13
Problem Statement:
To write a program to implement a simple calculator using Lex and YACC in LLVM.

Concept to be Applied:

Lexical Analysis (Lex file):

Token Definition: The Lex file defines patterns (regular expressions) that match
different tokens in the input.
Numbers: The pattern [0-9]+ matches sequences of digits, which are converted
to integers using atoi() and assigned to yylval for use in the parser.
Operators: Various arithmetic operators (+, -, *, /, %, **) are recognized and
returned as tokens.
Parentheses: The characters ( and ) are recognized as tokens for grouping
expressions.
Ignoring Whitespace: Spaces and tabs are ignored, allowing the input to be
formatted freely.
Error Handling: Any unrecognized character triggers an error message,
helping to identify invalid input.

Parsing (YACC file):

Grammar Rules: The YACC file defines the grammar for valid arithmetic expressions.
Operator Precedence and Associativity: The %left and %right declarations
establish the precedence and associativity of operators. For example,
multiplication and division have higher precedence than addition and
subtraction, and exponentiation (**) is right associative.
Error Handling: The yyerror function provides feedback when an error occurs
during parsing.
Recursive Grammar Rules: The grammar rules are recursive, which lets the parser generated by
YACC (a bottom-up LALR(1) parser, not a recursive descent parser) handle nested expressions
and evaluate them in the proper order.

Expression Evaluation:

Abstract Syntax Tree (AST): The YACC file can be expanded to build an AST or
directly evaluate expressions based on the parsed tokens, typically by implementing
rules for combining operands and operators.
Mathematical Functions: Including <math.h> allows for advanced mathematical
operations (like power) to be used in evaluations.

Integration and Compilation:

To run this program, we need to compile both the LEX and YACC files together. This
process typically involves generating C source files from both, compiling them, and
linking the resulting object files.

Algorithm:
Step 1: Place the C declaration statements inside %{ and %}.
Step 2: Declare the tokens.
Step 3: Define the associativity of the operators and algebraical functions.
Step 4: Define the expression types and action to be done.
Step 5: In the main function get the input expression and start parsing.
Step 6: If there are no errors then print the result of the expression.
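
The effect of the precedence and associativity declarations can be sketched without Lex and YACC as a small hand-written evaluator. This is not the parser that YACC generates. It is a precedence-climbing sketch under stated assumptions: integer arithmetic only, non-negative exponents, well-formed input, and division by zero yielding 0. All names (evaluate, parse_expr, parse_operand, int_pow) are hypothetical.

```c
#include <ctype.h>

static const char *cur;   /* cursor into the expression being evaluated */

static long parse_expr(int min_prec);

static void skip_blanks(void)
{
    while (*cur == ' ' || *cur == '\t')
        cur++;
}

/* Integer power by repeated multiplication (exponent assumed >= 0). */
static long int_pow(long base, long exp)
{
    long result = 1;
    while (exp-- > 0)
        result *= base;
    return result;
}

/* operand: NUMBER | '(' expr ')' | '-' operand  (unary minus binds tightest) */
static long parse_operand(void)
{
    skip_blanks();
    if (*cur == '-') {
        cur++;
        return -parse_operand();
    }
    if (*cur == '(') {
        cur++;
        long inner = parse_expr(1);
        skip_blanks();
        if (*cur == ')')
            cur++;
        return inner;
    }
    long value = 0;
    while (isdigit((unsigned char)*cur))
        value = value * 10 + (*cur++ - '0');
    return value;
}

/* Precedence climbing: level 1 is + -, level 2 is * / %, level 3 is **. */
static long parse_expr(int min_prec)
{
    long lhs = parse_operand();
    for (;;) {
        skip_blanks();
        char op = *cur;
        int prec = 0, right_assoc = 0, len = 1;
        if (op == '*' && cur[1] == '*') { op = '^'; prec = 3; right_assoc = 1; len = 2; }
        else if (op == '*' || op == '/' || op == '%') prec = 2;
        else if (op == '+' || op == '-') prec = 1;
        if (prec == 0 || prec < min_prec)
            break;
        cur += len;
        /* A left-associative operator needs a strictly higher level on its right. */
        long rhs = parse_expr(right_assoc ? prec : prec + 1);
        switch (op) {
        case '+': lhs += rhs; break;
        case '-': lhs -= rhs; break;
        case '*': lhs *= rhs; break;
        case '/': lhs = rhs ? lhs / rhs : 0; break;  /* assumption: x/0 gives 0 */
        case '%': lhs = rhs ? lhs % rhs : 0; break;
        case '^': lhs = int_pow(lhs, rhs); break;
        }
    }
    return lhs;
}

/* Evaluates an expression such as "2+3*4" or "2**3**2". */
long evaluate(const char *text)
{
    cur = text;
    return parse_expr(1);
}
```

Because ** is right associative, "2**3**2" is grouped as 2**(3**2), while "10-4-3" is grouped as (10-4)-3 because - is left associative.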

Program:
Lex file:
%{
#include <stdio.h>
#include <stdlib.h>
#include "y.tab.h"
%}

%%
[0-9]+ { yylval = atoi(yytext); return NUMBER; }
[+\-*/\n] { return yytext[0]; }
"%" { return MOD; }
"**" { return POWER; }
"(" { return '('; }
")" { return ')'; }
[ \t] ; /* Ignore whitespace */
. { printf("Invalid character: %s\n", yytext); }
%%

int yywrap() { return 1; }

YACC file:
%{
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
void yyerror(char* s) { printf("Error: %s\n", s);
}
int yylex();
%}

%token NUMBER MOD POWER


%left '+' '-'
%left '*' '/' MOD
%right POWER
%left UMINUS
%%

Sample Input and Output:

EXPERIMENT 14
Problem Statement:
To write a YACC program to convert Infix expression to Postfix expression.

Concept to be Applied:
Infix Expression

In an infix expression, operators appear between operands (e.g., A + B).


Parentheses and operator precedence rules dictate the order of evaluation.

Postfix Expression

In a postfix expression, the operators appear after their operands (e.g., A B +).
No parentheses are needed, as the order of operations is unambiguous based on the
position of the operators.

Advantages of Postfix Notation

No need for parentheses: The order of evaluation is determined by the position of


operators and operands.
Ease of evaluation: Postfix expressions are easier for machines to evaluate as they can
be processed in a single left-to-right pass without the need to handle operator
precedence or parentheses.
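
The single left-to-right pass mentioned above can be sketched with an explicit stack. This is a minimal illustration, assuming single-digit operands and the operators + - * / only. The function name eval_postfix is hypothetical.

```c
#include <ctype.h>

/* Evaluates a postfix expression made of single-digit operands and + - * /.
   Spaces are skipped. Returns 1 and stores the value in *result on success,
   0 if the expression is malformed or divides by zero. */
int eval_postfix(const char *expr, int *result)
{
    int stack[64];
    int top = 0;                      /* number of values on the stack */

    for (; *expr != '\0'; expr++) {
        char c = *expr;
        if (c == ' ')
            continue;
        if (isdigit((unsigned char)c)) {
            if (top == 64)
                return 0;             /* stack full */
            stack[top++] = c - '0';   /* operand: push it */
        } else {
            if (top < 2)
                return 0;             /* operator without two operands */
            int right = stack[--top];
            int left = stack[--top];
            switch (c) {
            case '+': stack[top++] = left + right; break;
            case '-': stack[top++] = left - right; break;
            case '*': stack[top++] = left * right; break;
            case '/':
                if (right == 0)
                    return 0;
                stack[top++] = left / right;
                break;
            default:
                return 0;             /* unknown character */
            }
        }
    }
    if (top != 1)
        return 0;                     /* leftover operands */
    *result = stack[0];
    return 1;
}
```

No precedence table is consulted anywhere: each operator simply applies to the two most recent values.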

Algorithm:

Step 1: Initialization:

Include necessary header files (<stdio.h>, <stdlib.h>, <ctype.h>).


Define tokens for the parser in the header file (y.tab.h).

Step 2: Define Tokens:

In the parser, define digit as a token to represent integer numbers.

Step 3: Grammar Rules:

Define grammar rules for the expressions:

S (Start Rule): Accepts an expression E and prints a newline after evaluating.


E (Expression): Can be an expression followed by + or - and another term T, or
just a term T.

T (Term): Can be a term followed by * or / and another factor P, or just a factor
P.
P (Power): Can be a factor followed by ^ and another power P, or just a factor
F.
F (Factor): Can be an expression within parentheses or a digit.

Step 4: Lexical Analysis:

Define patterns in the lexer:

If the token is a sequence of digits ([0-9]+), convert it to an integer using atoi()


and return the token digit.
Ignore whitespace (spaces and tabs).
On a newline character, return 0 to signify the end of input.
For any unmatched character, return it as is.

Step 5: Driver Code:

In the main function, prompt the user to enter an infix expression.


Call yyparse() to start the parsing process.

Step 6: Error Handling:

Implement yyerror() to print an error message if parsing fails, indicating a syntax error.

Step 7: Parsing and Evaluation:

As the parser processes the input according to the grammar rules:

Each time it successfully recognizes a rule, it may print the corresponding


operator (e.g., +, -, *, /, ^) as it evaluates the expression.
If the parser recognizes a number, it prints the integer value.

Step 8: End of Parsing:

The parsing continues until all input is processed (indicated by returning 0 on newline),
and a newline is printed after the entire expression is evaluated.
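
The same translation can be sketched in plain C with one function per non-terminal. This is a hand-written recursive descent version, not the bottom-up parser that YACC generates, so the left-recursive rules for E and T are replaced by loops. It assumes single-digit operands and no spaces in the input. All names (infix_to_postfix, parse_E, parse_T, parse_P, parse_F, emit) are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

static const char *src;   /* cursor into the infix input */
static char *dst;         /* next free position in the output buffer */
static size_t dst_room;   /* bytes left in the output buffer */

static int parse_E(void);

/* Appends one character, always keeping room for the terminating '\0'. */
static void emit(char c)
{
    if (dst_room > 1) {
        *dst++ = c;
        dst_room--;
    }
}

/* F -> ( E ) | digit */
static int parse_F(void)
{
    if (*src == '(') {
        src++;
        if (!parse_E() || *src != ')')
            return 0;
        src++;
        return 1;
    }
    if (isdigit((unsigned char)*src)) {
        emit(*src++);             /* operands are emitted immediately */
        return 1;
    }
    return 0;
}

/* P -> F ^ P | F   (right associative) */
static int parse_P(void)
{
    if (!parse_F())
        return 0;
    if (*src == '^') {
        src++;
        if (!parse_P())
            return 0;
        emit('^');                /* operator follows both operands */
    }
    return 1;
}

/* T -> T * P | T / P | P   (left recursion written as a loop) */
static int parse_T(void)
{
    if (!parse_P())
        return 0;
    while (*src == '*' || *src == '/') {
        char op = *src++;
        if (!parse_P())
            return 0;
        emit(op);
    }
    return 1;
}

/* E -> E + T | E - T | T   (left recursion written as a loop) */
static int parse_E(void)
{
    if (!parse_T())
        return 0;
    while (*src == '+' || *src == '-') {
        char op = *src++;
        if (!parse_T())
            return 0;
        emit(op);
    }
    return 1;
}

/* Converts infix to postfix. Returns 1 on success, 0 on a syntax error. */
int infix_to_postfix(const char *infix, char *postfix, size_t size)
{
    if (size == 0)
        return 0;
    src = infix;
    dst = postfix;
    dst_room = size;
    int ok = parse_E() && *src == '\0';
    *dst = '\0';
    return ok;
}
```

Each emit call for an operator corresponds to one printf action in the YACC rules: the operator is written only after both of its operands have been processed.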

Program:
Parser Source Code:
File: C4.y
%{
/* Definition section */
#include <ctype.h>
#include<stdio.h>
#include<stdlib.h>
int yylex();
int yyerror();
%}
%token digit
/* Rule Section */
%%
/*All these grammar rules are established for operator precedence and
associativity*/
/*S prints new line after evaluating E*/
S: E {printf("\n\n");}
;
/*E can be evaluated to E+T or E-T or just T*/
E: E '+' T { printf ("+");}
| E '-' T { printf ("-");}
|T
;
/*T can be evaluated to T*P or T/P or just P*/
T: T '*' P { printf("*");}
| T '/' P { printf("/");}
|P
;
/*P can be evaluated to F^P or just F*/
P: F '^' P { printf ("^");}
|F
;
/*F can be evaluated to (E) or a number*/
F: '(' E ')'
| digit {printf("%d", $1);}
;
%%

//driver code
int main()
{
printf("Enter infix expression: ");
yyparse(); //to parse the input
return 0;
}
int yyerror()
{
printf("NITW Error");
return 0;
}
Lexical Analyzer Source Code:
File: C4.l
%{
#include <stdlib.h>
#include "y.tab.h"
extern int yylval;
%}
%%
 /*If the token is an Integer number,return it.*/
[0-9]+ {yylval=atoi(yytext); return digit;}
 /*If the token is a space or tab,ignore it.*/
[ \t] ;
 /*If the token is a new line,return 0*/
[\n] return 0;
 /*If the token didn't match with any of the above,return the first
 character*/
. return yytext[0];
%%

Sample Input and Output:

EXPERIMENT 15

Problem Statement:
To write a YACC program to generate 3-Address code for a given expression.

Concept to be Applied:
In general, Three Address instructions are represented as:

a = b op c

Here, a, b and c are the operands.


Operands may be constants, names, or compiler generated temporaries. op represents
the operator.
The characteristics of Three Address instructions are-
They are generated by the compiler as an intermediate form on which Code Optimization is performed.
They use at most three addresses to represent any statement.
They are implemented as a record with the address fields.
Examples of Three Address instructions are:

a = b + c

c = a * b
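
The "record with the address fields" can be sketched as a C struct. This is one possible layout (often called a quadruple), assuming short operand names and a single-character operator. The names Quad and format_quad are hypothetical and are not used by the program in this experiment.

```c
#include <stdio.h>

/* One three-address instruction: result = arg1 op arg2. */
typedef struct {
    char op;          /* operator, e.g. '+' or '*' */
    char arg1[8];     /* first operand: name, constant or temporary */
    char arg2[8];     /* second operand */
    char result[8];   /* where the value is stored */
} Quad;

/* Formats a quadruple as "result = arg1 op arg2" into buf.
   Returns the number of characters the full text needs, as snprintf does. */
int format_quad(const Quad *q, char *buf, size_t size)
{
    return snprintf(buf, size, "%s = %s %c %s",
                    q->result, q->arg1, q->op, q->arg2);
}
```

A statement such as a = b + c * d would need two such records: one storing c * d in a compiler-generated temporary, and one adding b to that temporary.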

Algorithm:
Step 1: Define the Grammar for Arithmetic Expressions:

First, we need to define a context-free grammar (CFG) that handles arithmetic


expressions involving operators like +, -, *, /, parentheses, and numbers/identifiers.

Step 2: Associate Actions to Grammar Rules:

For each rule, associate an action to generate 3-address code for the corresponding part
of the expression.

Step 3: Maintain Temporary Variables for Intermediate Results:

Use temporary variables like t1, t2, etc., to store intermediate results of sub-expressions.

Step 4: Algorithm for 3-Address Code Generation:

Traverse the expression based on operator precedence (using YACC rules).


Generate code for smaller sub-expressions first.
Store the result in a temporary variable and use that temporary variable in the larger
expression.

Step 5: Stop the program.

Program:
File: C6.y
%{
#include <math.h>
#include<ctype.h>
#include<stdio.h>
int var_cnt=0;
char iden[20];
int yylex();
int yyerror();
%}
%token digit
%token id
%%
/* Separating the LHS and RHS of the expression. */
S:id '=' E { printf("%s = t%d\n",iden, var_cnt-1); }
;
/* Following the operator precedence. */
/* '+','-' have the least precedence. They have to be printed after all the other 3-
Address codes are printed. */
E:E '+' T { $$=var_cnt; var_cnt++; printf("t%d = t%d + t%d;\n", $$, $1, $3 );
}
|E '-' T { $$=var_cnt; var_cnt++; printf("t%d = t%d - t%d;\n", $$, $1, $3 );
}
|T { $$=$1; }
;
/* '*','/' have second least precedence. They have to be printed before the 3-
Address codes of operators '+' and '-' are printed. */
T:T '*' F { $$=var_cnt; var_cnt++; printf("t%d = t%d * t%d;\n", $$, $1, $3 ); }

|T '/' F { $$=var_cnt; var_cnt++; printf("t%d = t%d / t%d;\n", $$, $1, $3 ); }
|F {$$=$1 ; }
;
/* '^' has the second highest precedence. Its 3-Address code has to be printed after the 3-
Address codes of brackets are printed. */
F:P '^' F { $$=var_cnt; var_cnt++; printf("t%d = t%d ^ t%d;\n", $$, $1, $3 );}
| P { $$ = $1;}
;
/* Brackets have highest precedence. These 3-Address codes are to be printed
before all the others 3-Address codes are printed. */
/* This recursively calls the second rule in this set of rules for printing the
3-Address codes of the expression inside the brackets. */
P: '(' E ')' { $$=$2; }
|digit { $$=var_cnt; var_cnt++; printf("t%d = %d;\n",$$,$1); }
;
%%
int main()
{
var_cnt=0;
printf("Enter an expression : \n");
yyparse();
return 0;
}
int yyerror()
{
printf("NITW Error\n");
return 0;
}

File: C6.l
/* Definitions */
d [0-9]+
a [a-zA-Z]+
%{
/* Including the required header files. */
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include"y.tab.h"
extern int yylval;
extern char iden[20];
%}
/*
Rules:
If any number is matched, make it as the yylval and send as token.
If any word is matched, make it as the yylval and send as token.
If any delimiter is matched, does nothing about it.
If a new line character is encountered, end the program.
If anything else is matched, send the first character of the matched
text.
*/
%%
{d} { yylval=atoi(yytext); return digit; }
{a} { strcpy(iden,yytext); yylval=1; return id; }
[ \t] {;}
\n return 0;
. return yytext[0];
%%

Sample Input and Output:

EXPERIMENT 16

Problem Statement:
To write a C program for implementation of Code Optimization Technique.

Concepts to be Applied:
Intermediate Code: A representation of code in a lower-level format used in compilers
before generating machine code.
Dead Code Elimination: Removing code that does not affect the program's results.
Common Subexpression Elimination: Identifying and removing duplicated
expressions to reduce redundant calculations.
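
Dead code detection on such intermediate code can be sketched as follows. This is a simplified illustration, assuming single-character variable names and assuming that the last statement holds the program's result and is therefore always kept. The names Stmt and is_dead are hypothetical and differ from the structures used in the program of this experiment.

```c
#include <string.h>

/* One statement of intermediate code: lhs = rhs, for example a = b+c. */
typedef struct {
    char lhs;
    char rhs[20];
} Stmt;

/* Returns 1 if statement index is dead: its left-hand side is never read by
   any other statement, and it is not the final statement. Returns 0 otherwise. */
int is_dead(const Stmt *code, int count, int index)
{
    if (index == count - 1)
        return 0;                     /* assumption: the last statement is the result */
    for (int j = 0; j < count; j++) {
        if (j != index && strchr(code[j].rhs, code[index].lhs) != NULL)
            return 0;                 /* the value is used somewhere */
    }
    return 1;
}
```

For the statements a=9, b=c+d, e=c+d, f=b+e, r=f, only a=9 is dead, because a never appears on a right-hand side. The statements b=c+d and e=c+d share the same right-hand side, which makes them candidates for common subexpression elimination.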

Algorithm:
Step 1: Write the factorial program using a for loop and a do-while loop to illustrate the
optimization technique.
Step 2: In a for loop, the variable is initialized first and the condition is checked next. If
the condition is true, the corresponding statements are executed and the specified increment
/ decrement operation is performed.
Step 3: The for loop repeats until the condition fails.
Step 4: In a do-while loop, the variable is initialized and the statements are executed; the
condition check and the increment / decrement operation are performed afterwards.
Step 5: When comparing the for and do-while loops for optimization, do-while is preferred because
the statements are executed first and the condition is checked only afterwards. So, during the
statement execution itself we can detect a problem with the result, without having to
wait for the result of the specified condition.
Step 6: Finally, when considering Code Optimization in loops, do-while is preferred with respect to
performance.

Program:
//Code Optimization Technique
#include<stdio.h>
#include<string.h>
struct op
{

char l;
char r[20];
}
op[10],pr[10];
int main()
{
int a,i,k,j,n,z=0,m,q;
char *p,*l;
char temp,t;
char *tem;
printf("Enter the Number of Values:");
scanf("%d",&n);
for(i=0;i<n;i++)
{
printf("left: ");
scanf(" %c",&op[i].l);
printf("right: ");
scanf(" %19s",op[i].r); /* r holds at most 19 characters plus the terminator */
}
printf("Intermediate Code\n") ;
for(i=0;i<n;i++)
{
printf("%c=",op[i].l);
printf("%s\n",op[i].r);
}
for(i=0;i<n-1;i++)
{
temp=op[i].l;
for(j=0;j<n;j++)
{

p=strchr(op[j].r,temp);
if(p)
{
pr[z].l=op[i].l;
strcpy(pr[z].r,op[i].r);
z++;
}
}
}
pr[z].l=op[n-1].l;
strcpy(pr[z].r,op[n-1].r);
z++;
printf("\nAfter Dead Code Elimination\n");
for(k=0;k<z;k++)
{
printf("%c\t=",pr[k].l);
printf("%s\n",pr[k].r);
}
for(m=0;m<z;m++)
{
tem=pr[m].r;
for(j=m+1;j<z;j++)
{
p=strstr(tem,pr[j].r);
if(p)
{
t=pr[j].l;
pr[j].l=pr[m].l;
for(i=0;i<z;i++)

{
l=strchr(pr[i].r,t) ;
if(l)
{
a=l-pr[i].r;
printf("pos: %d\n",a);
pr[i].r[a]=pr[m].l;
}}}}}
printf("Eliminate Common Expression\n");
for(i=0;i<z;i++)
{
printf("%c\t=",pr[i].l);
printf("%s\n",pr[i].r);
}
for(i=0;i<z;i++)
{
for(j=i+1;j<z;j++)
{
q=strcmp(pr[i].r,pr[j].r);
if((pr[i].l==pr[j].l)&&!q)
{
pr[i].l='\0';
}
}
}
printf("Optimized Code\n");
for(i=0;i<z;i++)
{
if(pr[i].l!='\0')
{

printf("%c=",pr[i].l);
printf("%s\n",pr[i].r);
}
}
}
Sample Input and Output:
