CD LAB Manual R-22

The document is a laboratory manual for the Compiler Design Lab course (22CSC25) at Chaitanya Bharathi Institute of Technology, detailing the course objectives, outcomes, and a list of practical experiments. It includes information on the department's vision, mission, educational objectives, and program outcomes, along with assessment procedures and rubrics for grading. The manual serves as a comprehensive guide for students to understand compiler design principles and tools such as Lex and Yacc.

Laboratory Manual

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

PROGRAM - B.E
COMPILER DESIGN LAB (22CSC25)
SEMESTER - VI

R-22 Regulation

Prepared by:
1. Dr. V. Padmavathi, Associate Professor
2. Sri. K.Kiran Prakash, Assistant Professor
3. Dr. M. Venkata Krishna Reddy, Assistant Professor

CHAITANYA BHARATHI INSTITUTE OF TECHNOLOGY


(An Autonomous Institution, Affiliated to Osmania University, Approved by AICTE,
Accredited by NAAC with A++ Grade and Programs Accredited by NBA)
Chaitanya Bharathi Post, Gandipet, Kokapet (Vill.), Hyderabad, Ranga Reddy - 500 075, Telangana
www.cbit.ac.in


Name of the Lab Course:


COMPILER DESIGN LAB (22CSC25)
INDEX
Item
Institute Vision, Institute Mission, Quality Policy, Department Vision and Department Mission
Program Educational Objectives (PEOs), Program Outcomes (POs) and Program Specific Outcomes (PSOs)
Syllabus
Course Introduction
Assessment Procedure of CIE and SEE
General Instructions for Laboratory Classes

LABORATORY / PRACTICAL

Exp No Laboratory / Practical

1. Tokenization – By constructing DFA of Lexical Analyzer
2. Writing a scanner application using (Tools: Jlex / JFlex / Lex).
3. Implementing parser for small language.
4. Implementing Parser with scanner or Without Scanner.
5. Implementing parser with Scanner, without Scanner or with yacc/ bison generators.
6. Program to generate predictive LL1 parsing table for the grammar
7. Implementation of the language to an intermediate form (e.g. three-address code).
8. Generation of target code (in assembly language).
9. Target Code improvement with help of optimization techniques.
10. Implement Mini Compiler with Phases.
11. Program to find number of characters, spaces, lines, tabs in a given file.
12. Writing a scanner application without using Lex tool.
13. Program to demonstrate Lexical Analyser tool.
14. Implementing Recursive Descent Parser for a grammar.
15. Program to find FIRST function of a grammar.
16. Program to find FIRST function of a grammar.
17. Program to find the Canonical LR(0) collection of items.
18. Program to simulate Symbol table Management.

Note:
a) Minimum 10 experiments should be included in every course other than experiments beyond
syllabus.
b) Add project/ case studies/ student defined experiments for the courses having experiments
less than 12 in syllabus.
c) Include Minimum 2 experiments beyond prescribed experiments meeting industry problems to
challenge student learning.


Vision of Institute
To be the Centre of Excellence in Technical Education and Research.

Mission of Institute
To address the Emerging needs through Quality Technical Education
and Advanced Research.

Quality Policy
CBIT imparts value based Technical Education and Training to meet the
requirements of students, Industry, Trade/ Profession, Research and
Development Organizations for Self-sustained growth of Society.


Vision of the Department


To be in the frontiers of Computer Science and Engineering with academic excellence and
Research.
Mission of the Department
• Educate students with the best practices of Computer Science by integrating the latest
research into the curriculum
• Develop professionals with sound knowledge in theory and practice of Computer Science and
Engineering
• Facilitate the development of academia-industry collaboration and societal outreach programs
• Prepare students for full and ethical participation in a diverse society and encourage lifelong
learning
Program Educational Objectives (PEOs)
• Analyze and provide solutions for real world problems using state-of-the-art engineering,
mathematics, computing knowledge, and emerging technologies. Exhibit professional
leadership qualities and excel in interdisciplinary domains. Demonstrate human values,
professional ethics, skills and zeal for lifelong learning.
• Contribute to the research community and develop solutions to meet the needs of public and
private sectors. Work in emerging areas of research and develop solutions to meet the needs
of public and private sectors


Program Outcomes (POs)
PO1. Engineering Knowledge:
Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization for the
solution of complex engineering problems.

PO2. Problem analysis:


Identify, formulate, review, research literature, and analyse complex engineering problems reaching substantiated
conclusions using first principles of mathematics, natural sciences, and engineering sciences.

PO3. Design/development of solutions:


Design solutions for complex engineering problems and design system components or processes that meet the
specified needs with appropriate consideration for the public health and safety, and cultural, societal, and
environmental considerations.

PO4. Conduct investigations of complex problems:


Use research-based knowledge and research methods including design of experiments, analysis and interpretation
of data, and synthesis of the information to provide valid conclusions.

PO5. Modern tool usage:


Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools, including
prediction and modelling to complex engineering activities with an understanding of the limitations.

PO6. The engineer and society:


Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal, and cultural issues
and the consequent responsibilities relevant to the professional engineering practice.

PO7. Environment and sustainability:


Understand the impact of the professional engineering solutions in societal and environmental contexts, and
demonstrate the knowledge of, and need for sustainable development.

PO8. Ethics:
Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering
practice.

PO9. Individual and team work:


Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
PO10. Communication:
Communicate effectively on complex engineering activities with the engineering community and with the society
at large, such as, being able to comprehend and write effective reports and design documentation, make effective
presentations, and give and receive clear instructions.

PO11. Project management and finance:


Demonstrate knowledge and understanding of the engineering and management principles and apply these to one’s
own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.

PO12. Life-long learning:


Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the
broadest context of technological change.

Program Specific Outcomes (PSOs)


After successful completion of the program, graduates will be able to:
1. Acquire practical competency in Computer Science and Engineering through emerging
technologies and open-source platforms related to the domains.
2. Design and develop innovative products by applying principles of computer science and
engineering.
3. Successfully pursue higher education in reputed institutions and provide solutions as
entrepreneurs.

22CSC25
COMPILER DESIGN LAB
Instruction 2 P Hours per week
Duration of SEE 3 Hours
SEE 50 Marks
CIE 50 Marks
Credits 1

Pre-requisites: Data Structures, Design and analysis of algorithms, Formal language and automata theory.

Course Objectives: This course aims to:


1. Define the rules for implementing lexical analyzer and to understand the concepts behind the working of
compiler tools- Lex, Turbo C, Yacc.
2. Analyze and apply regular grammars to various source statements and expressions.
3. Implement front end of the compiler by means of generating intermediate codes, implement code optimization
techniques and error handling.
Course Outcomes: Upon completion of this course, students will be able to:
1. Implement the rules for the analyzing phases of a compiler.
2. Examine the concepts of compiler tools: Lex, Flex, Yacc, Turbo C.
3. Apply various Syntax techniques on grammars to build the parsers.
4. Generate various intermediate code representations for source code.
5. Implement the concepts of code optimization, code generation phases.

CO-PO Articulation Matrix

CO    PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2  PSO3
CO 1  3    2    -    1    -    -    -    -    -    -     -     3     1     -     -
CO 2  2    2    1    2    3    -    -    -    -    -     -     -     1     1     -
CO 3  2    2    1    1    -    -    -    -    -    -     -     1     2     -     -
CO 4  3    3    1    2    -    -    -    -    -    -     -     1     3     2     1
CO 5  3    2    1    1    3    -    -    -    -    -     -     2     1     -     1

List of Programs:
1. Tokenization – By constructing DFA of Lexical Analyzer.
2. Writing a standalone scanner application using (Tools: Jlex / JFlex / Lex).
3. Implementing parser for a small language.
4. Implementing parser with Scanner, without Scanner or with yacc/bison generators.
5. Program to generate predictive LL1 parsing table for the Expression grammar.
6. Program to generate SLR parsing table for the Expression grammar.
7. Implementation of the language to an intermediate form (e.g. three-address code).
8. Generation of target code (in assembly language).
9. Target Code improvement with help of optimization techniques.
10. Implement Mini Compiler with Phases.

Text Books:
1. Keith D Cooper & Linda Torczon, “Engineering a Compiler”, 2nd edition, Morgan Kaufmann, 2004.
2. John R Levine, Tony Mason, Doug Brown, “Lex & Yacc”, 3rd Edition, Shroff Publisher, 2007.
Suggested Reading:
1. Kenneth C Louden, “Compiler Construction: Principles and Practice”, Cengage Learning, 2005.
2. John R Levine, “Lex & Yacc”, 2nd Edition, O'Reilly Publishers, 2009.
Online Resources:
1. http://www.nptel.ac.in/courses/106108052
2. http://en.wikibooks.org/wiki/Compiler_Construction
3. http://dinosaur.compilertools.net/
4. http://epaperpress.com/lexandyacc/
COURSE INTRODUCTION
Name of the Lab Course:
COMPILER DESIGN LAB (22CSC25)

Introduction
The main goal of the Compiler Design Lab is to introduce students to the fundamentals of compiler
design and the tools necessary to conduct syntax-directed translation from a high-level programming
language into executable code. The goal of this course is to teach students how to build and deploy
language processors by exposing them to automation tools. Further, this will provide deeper insights
into understanding of advanced programming language semantics, code generation, machine-
independent optimisations, dynamic memory allocation, and object orientation.

A compiler is software that translates programs written in a high-level language into functionally
equivalent low-level programs. By studying compilers, students learn how real-world programs are
structured internally. Knowledge of compilers equips them with the practical and theoretical skills
necessary to implement a programming language. For example, a deeper grasp of how a language is
compiled helps students use that language more efficiently. The basic architecture of compilers is
also applicable to a wide variety of other programs, including browsers, 3D apps, debuggers,
simulators, and even command prompts and shells.

The compiler design lab teaches students how to use tools like LEX and YACC to generate code, and also
how to translate the syntax and semantics of programming languages into their machine-readable
equivalents.

Some of the objectives of the Compiler Design Lab are:

• To provide an understanding of the peculiarities of language translation by designing a complete
translator for a mini language.
• To understand the practical approaches of how a compiler works.
• To understand the various phases in the design of a compiler.
• To understand and analyze the role of syntax and semantics of programming languages in
compiler construction.
• To understand the design of top-down and bottom-up parsers.
• To understand syntax directed translation schemes.
• To use different tools in construction of the phases of a compiler for the mini language.
• To introduce the lex and yacc tools.

ASSESSMENT PROCEDURE AND AWARD OF CIE MARKS
Following is the subdivision for the internal marks (50) of the Lab:

(i) 20 marks for the lab internal tests

Two tests are to be conducted, i.e. one test after the 1st cycle of experiments and a second
test after the second cycle. The average of the two test marks should be
considered (20 maximum).

(ii) 30 marks for CIE

For the CIE 30 marks will be awarded based on the rubrics provided below.

These rubrics are a general guideline. Based on the lab type (programming or hardware) and the
complexity of the course, the rubrics can be customized by the department in tune with the program and
course offered. The performance indicators of the rubrics can also be changed by the
departments/programs based on need.
RUBRICS
Each parameter is scored against five descriptors: Outstanding, Good, Fair, Poor, Very Poor.

1. Pre-Experiment Preparation Work (5M)
   Outstanding: Well prepared for the experimentation with clear specifications, plan/design and additional information (5M)
   Good: Prepared for the experimentation with clear specifications and plan/design (4M)
   Fair: Adequately prepared for the experimentation without clear specifications and plan/design (3M)
   Poor: Minimal preparation and without clear specifications and plan/design (2M)
   Very Poor: Lacks preparation and without clear specifications and plan/design (1M)

2. Experimentation (Problem Solving, Methodology of Conduction) (10M)
   Outstanding: Student conducts experiment with all possible test cases in an optimized fashion (10M)
   Good: Student solves the problem and conducts experiments with all possible test cases (8M)
   Fair: Student solves the problem and conducts experiment with few possible test cases with complexity (6M)
   Poor: Student solves the problem with few test cases with complexity (4M)
   Very Poor: Student fails to solve the problem (2M)

3. Post Experiment Analysis [Viva, Inference] (5M)
   Outstanding: Demonstrates the simulation/findings/hardware results, infers and answers all the questions posed by the instructor (5M)
   Good: Demonstrates results and inference; able to answer few questions posed by the instructor (4M)
   Fair: Demonstrates results and inference; unable to answer the questions posed by the instructor (3M)
   Poor: Demonstrates partial results and inference; unable to answer the questions posed by the instructor (2M)
   Very Poor: Failed to demonstrate results and inference; unable to answer the questions posed by the instructor (1M)

4. Report Writing (5M)
   Outstanding: Report meets all requirements and is prepared in an original and creative way to engage readers (5M)
   Good: Report with well-organized content, visuals, graphics, citations and references (4M)
   Fair: Report is complete and adequate with poor grammar (3M)
   Poor: Report is complete, with poor grammar, inadequate, and failed to organize thoughts (2M)
   Very Poor: Report is incomplete, unclear, with poor grammar, and inadequate (1M)

5. Conduct (Ethics, Safety, Team Work) (5M)
   Outstanding: Excellent team spirit, strictly follows ethics and safety precautions with good team work (5M)
   Good: Follows the safety precautions, practices ethics, and poor team work (4M)
   Fair: Follows safety precautions and ethical practices and failed to exhibit teamwork (3M)
   Poor: Followed minimum safety precautions and ethical practices and failed to exhibit team work (2M)
   Very Poor: Does not follow safety precautions and ethical practices and failed to exhibit team work (1M)

TOTAL SCORE

ASSESSMENT PROCEDURE AND AWARD OF SEE MARKS
Each parameter is scored against five descriptors: Outstanding, Good, Fair, Poor, Very Poor.

1. Record (5M)
   Outstanding: The content of all the experiments is well-organized, with recorded tables and neatly drawn graphs. Results and discussions are well presented. (5M)
   Good: The content of most of the experiments is well-organized, with recorded tables and neatly drawn graphs. Results and discussion are presented. (4M)
   Fair: The content of most of the experiments is not well-organized, with recorded tables and neatly drawn graphs. Results and discussion are not presented. (3M)
   Poor: The content of few experiments is well-organized, with recorded tables and neatly drawn graphs. Results and discussion are not presented. (2M)
   Very Poor: The content of all the experiments is not well-organized, with recorded tables and neatly drawn graphs. Results and discussion are not presented. (1M)

2. Write up about the experiment (10M)
   Outstanding: Presentation of the given experiment/program is very well organized with the required content, specifications, plan/procedure of the conduct and all the required additional information. (10M)
   Good: Presentation of the given experiment/program is organized with the required content, specifications, plan/procedure of the conduct. (8M)
   Fair: Presentation of the given experiment/program is organized and is without clear specifications and plan/design. (6M)
   Poor: Presentation of the given experiment/program is minimal with clear specifications and plan/design. (4M)
   Very Poor: Presentation of the given experiment/program is minimum without clear specifications and plan/design. (2M)

3. Conduction of the experiment and observations (15M)
   Outstanding: Conducts experiment / simulates the problem with proper connections / all possible test cases. All the possible observations are noted. (15M)
   Good: Conducts experiment / simulates the problem with connections / with most of the test cases. Failed to note all observations. (12M)
   Fair: Conducts experiment / simulates the problem with proper connections / less test cases. (9M)
   Poor: Conducts experiment / simulates the problem with improper connections / less test cases. (6M)
   Very Poor: Failed to conduct experiment / simulate the problem with improper connections / less test cases. (3M)

4. Results & Analysis (10M)
   Outstanding: Demonstrates the experimental results with adequate analysis / simulation / findings / obtained results / plotting the graphs. (10M)
   Good: Demonstrates the experimental results with required analysis / simulation / findings / obtained results / plotting the graphs. (8M)
   Fair: Demonstrates the experimental results, failed to maximum analysis / simulation / findings. (6M)
   Poor: Demonstrates the partial experimental results with least analysis / simulation / findings. (4M)
   Very Poor: Failed to demonstrate the results and least analysis / simulation / findings. (2M)

5. Viva-Voce (10M)
   Outstanding: Answers most of the questions with good analytical explanation. (10M)
   Good: Answers most of the questions with good explanation. (8M)
   Fair: Answers only few of the questions with good explanation. (6M)
   Poor: Answers only few of the questions with nominal explanation. (4M)
   Very Poor: Failed to answer questions. (2M)

TOTAL SCORE

GENERAL INSTRUCTIONS FOR LABORATORY CLASSES

DOs
1. Do not enter the laboratory without prior permission.
2. Students should wear their ID cards while entering the lab.
3. Students should come in proper uniform.
4. Students should sign the LOGIN REGISTER before entering the laboratory.
5. Students should bring their observation and record notebooks to the laboratory.
6. Students should maintain silence inside the laboratory.
7. After completing the laboratory exercise, make sure to shut down the system properly.

DON'Ts
1. Do not bring bags inside the laboratory.
2. Do not wear slippers/shoes inside the laboratory.
3. Do not use the computers in an improper way.
4. Do not scribble on the desks or mishandle the chairs.
5. Do not use mobile phones inside the laboratory.
6. Do not make noise inside the laboratory.

Program-1

AIM: Program to demonstrate tokenization – By constructing DFA of Lexical Analyzer.

DESCRIPTION: The DFA (Deterministic Finite Automaton) for the lexical analysis of
an input file is implemented. The program identifies the keywords, constants, and
relational operators present in the file.

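# Program to demonstrate tokenization with a DFA that accepts the keyword "int".
# States: 0 (start) --i--> 1 --n--> 2 --t--> 3 (accepting)
DFA = {
    0: {'i': 1},
    1: {'n': 2},
    2: {'t': 3},
    3: {},
}
ACCEPTING = {3}

def main():
    ip = input("enter identifier or int keyword")
    state = 0
    print(f"transitions for {ip}")
    for ch in ip:
        nxt = DFA[state].get(ch)
        if nxt is None:
            # No transition on this character, so the lexeme is not the keyword.
            print(ip, '->', 'identifier')
            return
        print(state, '--', ch, '->', nxt)
        state = nxt
    # The keyword is recognized only if the whole input ends in the accepting state.
    print(ip, '->', 'keyword' if state in ACCEPTING else 'identifier')

if __name__ == '__main__':
    exit(main() or 0)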
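
The description mentions keywords, constants, and relational operators, while the program above walks only the keyword int. As a minimal sketch of how a table-driven DFA could cover those token classes, consider the following. The state names, the KEYWORDS set, and the helper names char_class and tokenize are illustrative assumptions, not part of the manual.

```python
# Illustrative sketch: a table-driven DFA for identifiers/keywords,
# integer constants and relational operators. All names are hypothetical.
KEYWORDS = {"int", "if", "else", "while"}   # assumed keyword set


def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    if ch in "<>=!":
        return ch
    return "other"


# TRANSITIONS[state][input class] -> next state
TRANSITIONS = {
    "start": {"letter": "ident", "digit": "number",
              "<": "rel", ">": "rel", "=": "assign", "!": "bang"},
    "ident": {"letter": "ident", "digit": "ident"},
    "number": {"digit": "number"},
    "rel": {"=": "rel_eq"},       # <= or >=
    "assign": {"=": "rel_eq"},    # ==
    "bang": {"=": "rel_eq"},      # !=
    "rel_eq": {},
}

# Accepting states and the token kind each one reports.
ACCEPT = {"ident": "identifier", "number": "constant",
          "rel": "relop", "rel_eq": "relop", "assign": "assign"}


def tokenize(text):
    tokens, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        state, j = "start", i
        last_accept = None   # (end index, kind) of the longest accepted prefix
        while j < len(text):
            nxt = TRANSITIONS[state].get(char_class(text[j]))
            if nxt is None:
                break
            state, j = nxt, j + 1
            if state in ACCEPT:
                last_accept = (j, ACCEPT[state])
        if last_accept is None:
            raise ValueError(f"unexpected character {text[i]!r} at position {i}")
        end, kind = last_accept
        lexeme = text[i:end]
        if kind == "identifier" and lexeme in KEYWORDS:
            kind = "keyword"     # keywords are identifiers found in the table
        tokens.append((kind, lexeme))
        i = end
    return tokens


if __name__ == "__main__":
    print(tokenize("int x1 <= 42"))
```

Keywords are deliberately not given their own DFA states here: the automaton recognizes any identifier and a table lookup then reclassifies reserved words, which keeps the transition table small.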

Program-2

AIM: Program to implement Scanner application using LEX TOOL.

DESCRIPTION: The Lex tool generates a scanner that recognizes the lexical patterns in the given
source program. A Lex specification consists of 3 sections:

• The declaration section contains C declarations, comments, any global variables, and named
pattern definitions.
• The rules section contains the patterns to be matched and the actions to run for each match.
• The auxiliary procedure section contains the C code, including the main function.

yyin: The input file pointer (FILE *) from which the scanner reads the source program. It defaults to standard input.

yylex: The scanning function generated by Lex. The scanning process begins with a call to yylex.

yywrap: Called when the scanner encounters the end of file. It returns 1 if there is no more input to scan, and 0 if scanning should continue with another input file.

Input: source file or any text (x.l) -> LEX TOOL -> lex.yy.c
lex.yy.c -> C COMPILER -> a.out

/*Program to implement LEXICAL ANALYZER using LEX tool*/

%{
#include <stdio.h>
int COMMENT=0;   /* set to 1 while the scanner is inside a comment */
%}
id [a-z][a-z0-9]*

%%
#.*                 {printf("\n%s is a PREPROCESSOR DIRECTIVE",yytext);}
int|double|char     {printf("\n\t%s is a KEYWORD",yytext);}
if|then|endif       {printf("\n\t%s is a KEYWORD",yytext);}
else                {printf("\n\t%s is a KEYWORD",yytext);}
"/*"                {COMMENT=1;}
"*/"                {COMMENT=0;}

{id}\(              {if(!COMMENT) printf("\n\nFUNCTION\n\t%s",yytext);}
{id}(\[[0-9]*\])?   {if(!COMMENT) printf("\n\tidentifier\t%s",yytext);}
\{                  {if(!COMMENT) {printf("\n BLOCK BEGINS");ECHO;}}
\}                  {if(!COMMENT) {printf("\n BLOCK ends");ECHO;}}
\".*\"              {if(!COMMENT) printf("\n\t %s is a STRING",yytext);}
[+\-]?[0-9]+        {if(!COMMENT) printf("\n\t%s is a NUMBER",yytext);}
\(                  {if(!COMMENT) {printf("\n\t");ECHO;printf("\t delim open parenthesis\n");}}
\)                  {if(!COMMENT) {printf("\n\t");ECHO;printf("\t delim closed parenthesis");}}
\;                  {if(!COMMENT) {printf("\n\t");ECHO;printf("\t delim semicolon");}}
\=                  {if(!COMMENT) printf("\n\t%s is an ASSIGNMENT OPERATOR",yytext);}
\<|\>               {printf("\n\t %s is relational operator",yytext);}
"+"|"-"|"*"|"/"     {printf("\n %s is an operator\n",yytext);}
"\n"                ;
%%

int main(int argc, char **argv)
{
    if (argc > 1)
        yyin = fopen(argv[1],"r");
    else
        yyin = stdin;
    yylex();
    printf("\n");
    return 0;
}

int yywrap()
{
    return 1;   /* 1 = no more input, so yylex() stops at end of file */
}

Input:

#include<stdio.h>
int main()
{
int num;
printf("enter an integer");
scanf("%d", &num);
}
Output:
$ lex lex.l
$ gcc lex.yy.c
$ ./a.out
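
The scanner above depends on two matching conventions of Lex: at each input position the rule matching the longest prefix wins, and a tie goes to the rule listed first. The following Python sketch imitates that behaviour with the standard re module. The RULES table and the scan function are illustrative names, and the rule set is a reduced version of the one in the listing, so this is an approximation of what the generated scanner does rather than a substitute for it.

```python
import re

# (token name, pattern) pairs in priority order, like the rules section of a
# Lex file. This reduced rule set is an assumption made for illustration.
RULES = [
    ("KEYWORD", r"int|double|char|if|then|endif|else"),
    ("IDENTIFIER", r"[a-z][a-z0-9]*"),
    ("NUMBER", r"[0-9]+"),
    ("RELOP", r"[<>]"),
    ("ASSIGN", r"="),
    ("OPERATOR", r"[+\-*/]"),
    ("DELIM", r"[();{}]"),
    ("SKIP", r"[ \t\n]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def scan(text):
    """Return (token name, lexeme) pairs using longest-match, first-rule-wins."""
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, regex in COMPILED:
            match = regex.match(text, pos)
            # Only a strictly longer match replaces the current best, so on a
            # tie the rule that appears earlier in RULES is kept.
            if match and len(match.group()) > best_len:
                best_name, best_len = name, len(match.group())
        if best_name is None:
            raise ValueError(f"no rule matches {text[pos]!r} at position {pos}")
        if best_name != "SKIP":
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    for token in scan("int count = count + 1;"):
        print(token)
```

The tie-breaking rule is what lets `int` be reported as a keyword even though the identifier pattern matches it too, while `integer` is still an identifier because the longer match takes priority.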

Program-3
AIM: Program to implement parser for small language

DESCRIPTION: Parser for LISP language is implemented. First define the grammar rules of
the language that the parser will be parsing. Create a lexical analyzer (also known as a lexer or
scanner) that will read in the input code and tokenize it according to the grammar rules. Create a
parser that will use the tokens generated by the lexer to construct a parse tree. The parse tree
represents the structural relationship between the different parts of the code. Use a stack-based
approach to parse the input code. The parser will push the tokens onto the stack and use a set of
rules to determine how to reduce the input code into a parse tree. Implement error handling
mechanisms to detect and recover from syntax errors in the input code. Once the parse tree has
been constructed, the parser may do additional processing to generate intermediate code or
perform semantic analysis. Finally, the parser may output the result of its processing in some
form, such as machine code or a high-level representation of the input code.
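
The stack-based approach described above can be sketched in a few lines of Python. In this sketch an explicit stack of partially built lists tracks the nesting: an opening parenthesis pushes a new list and a closing parenthesis pops it and attaches it to its parent. The function names tokenize, atom, and parse are illustrative assumptions.

```python
def tokenize(source):
    """Split a LISP expression into parentheses and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def atom(token):
    """Convert numeric tokens to float; leave everything else as a symbol."""
    try:
        return float(token)
    except ValueError:
        return token


def parse(source):
    """Return the list of top-level expressions found in source."""
    stack = [[]]                        # stack[0] collects top-level expressions
    for token in tokenize(source):
        if token == "(":
            stack.append([])            # start a new nested list
        elif token == ")":
            if len(stack) == 1:
                raise SyntaxError("unexpected ')'")
            finished = stack.pop()
            stack[-1].append(finished)  # attach it to the enclosing list
        else:
            stack[-1].append(atom(token))
    if len(stack) != 1:
        raise SyntaxError("missing ')'")
    return stack[0]


if __name__ == "__main__":
    print(parse("(first (list -1.5 (+ 2 3) 9))"))
```

Unbalanced parentheses are reported as a SyntaxError, which is one simple form of the error handling the description calls for.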

def generate_AST(string):
    # Characters that may appear in a numeric literal.
    number_symbols = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '.', '-']
    ind = 1  # skip the opening parenthesis of this expression
    arr_to_return = []
    while ind < len(string):
        char = string[ind]
        if char == "(":
            # Collect the balanced sub-expression and parse it recursively.
            open_cnt = 1
            closed_cnt = 0
            sub_str = "("
            for c in string[ind + 1:]:
                if c == "(": open_cnt += 1
                if c == ")": closed_cnt += 1
                sub_str += c
                if open_cnt == closed_cnt: break
            arr_to_return.append(generate_AST(sub_str))
            ind += len(sub_str)
        elif char == " " or char == ")":
            ind += 1
        else:
            # An atom ends at the next space, or at the closing parenthesis.
            stop_ind = string.find(" ", ind)
            if stop_ind == -1:
                stop_ind = string.find(")", ind)
            s = string[ind:stop_ind]
            if all(x in number_symbols for x in s):
                try:
                    arr_to_return.append(float(s))
                except ValueError:
                    # e.g. a lone "-" is an operator symbol, not a number
                    arr_to_return.append(s)
            else:
                arr_to_return.append(s)
            ind = stop_ind + 1
    return arr_to_return

if __name__ == "__main__":
    ip = "(first (list -1.5 (+ 2 3) 9))"  # lisp
    print("Input(LISP):", ip)
    print("Output (AST):", generate_AST(ip))

OUTPUT:

Program-4

AIM: Program to implement Parser with Scanner

Ex: To recognize a valid arithmetic expression that uses the operators +, -, *, / using
the lex and yacc tools.

DESCRIPTION: The parser phase is implemented using the Yacc tool. The scanner, i.e. the
lexical analyzer phase, is implemented using the Lex tool.
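
For comparison with the generated parser, the same language can be recognised by a small hand-written recursive-descent parser. The sketch below assumes the grammar start -> id '=' s ';', s -> id x | num x | '-' num x | '(' s ')' x, x -> op s | '-' s | empty, where op is one of +, *, /. The names TOKEN_RE, tokens, and is_valid are illustrative, and the sketch skips white space between tokens.

```python
import re

# Token classes: id, num, op (+ * /); any other non-blank character
# stands for itself, e.g. '=', '-', '(', ')' and ';'.
TOKEN_RE = re.compile(
    r"(?P<id>[A-Za-z_][A-Za-z_0-9]*)"
    r"|(?P<num>[0-9]+(?:\.[0-9]*)?)"
    r"|(?P<op>[+*/])"
    r"|(?P<ch>\S)"
)


def tokens(text):
    return [m.group() if m.lastgroup == "ch" else m.lastgroup
            for m in TOKEN_RE.finditer(text)]


def is_valid(text):
    toks = tokens(text) + [None]     # None marks the end of the input
    pos = 0

    def peek():
        return toks[pos]

    def eat(expected):
        nonlocal pos
        if toks[pos] != expected:
            raise SyntaxError(f"expected {expected!r}, found {toks[pos]!r}")
        pos += 1

    def s():                         # s -> id x | num x | '-' num x | '(' s ')' x
        if peek() in ("id", "num"):
            eat(peek())
        elif peek() == "-":
            eat("-")
            eat("num")
        elif peek() == "(":
            eat("(")
            s()
            eat(")")
        else:
            raise SyntaxError(f"unexpected token {peek()!r}")
        x()

    def x():                         # x -> op s | '-' s | empty
        if peek() in ("op", "-"):
            eat(peek())
            s()

    try:
        eat("id")
        eat("=")
        s()
        eat(";")
        eat(None)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid("a = b + 3 * (c - 2);"))   # True
    print(is_valid("a = + ;"))                # False
```

Each nonterminal becomes one function, and one token of lookahead is enough to pick the alternative, which is why this grammar is easy to parse by hand.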

Arith.l:
%{
#include "y.tab.h"
%}
%%
[a-zA-Z_][a-zA-Z_0-9]*  return id;
[0-9]+(\.[0-9]*)?       return num;
[+/*]                   return op;
[ \t]+                  ;   /* skip blanks and tabs */
.                       return yytext[0];
\n                      return 0;
%%
int yywrap()
{
    return 1;
}
Arith.y:
%{
#include<stdio.h>
int yylex(void);
int yyerror(const char *s);
int valid=1;
%}
%token num id op
%%
start : id '=' s ';'
      ;
s : id x
  | num x
  | '-' num x
  | '(' s ')' x
  ;
x : op s
  | '-' s
  |
  ;
%%
int yyerror(const char *s)
{
    valid=0;
    printf("\nInvalid expression!\n");
    return 0;
}
int main()
{
    printf("\nEnter the expression:\n");
    yyparse();
    if(valid)
    {
        printf("\nValid expression!\n");
    }
    return 0;
}

OUTPUT:

Program-5

AIM: Program to implement Parser with Scanner using Yacc/bison generators.

DESCRIPTION: Yacc stands for Yet Another Compiler-Compiler. Yacc translates a grammar
specification into a C program, y.tab.c, which is a representation of an LALR parser written in C,
along with other C routines. By compiling y.tab.c along with the ly library that contains the LR
parsing program, the desired object program is obtained. A Yacc source program has three parts:
Declarations, Rules and Supporting C-routines. The declarations part contains C declarations as
well as grammar token declarations. The rules part contains the productions and their associated
semantic actions. The third part contains the supporting C-routines. When the scanner recognizes
a number, its value is stored in the variable yylval.
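
The role of a semantic action such as $$ = $1 + $3 can be illustrated without Yacc. The Python sketch below is a hand-written recursive-descent evaluator, not the LALR parser that Yacc generates: each function returns the value of the construct it recognizes, which plays the part of $$. It assumes a separate term level so that * and / bind tighter than + and -, and the name evaluate is illustrative.

```python
import re


def evaluate(text):
    """Evaluate an integer expression with +, -, *, / and parentheses."""
    # Characters other than digits, operators and parentheses are ignored.
    toks = re.findall(r"[0-9]+|[-+*/()]", text)
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def advance():
        nonlocal pos
        pos += 1
        return toks[pos - 1]

    def expr():                     # expr : expr '+' term | expr '-' term | term
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value = value + term()     # like $$ = $1 + $3
            else:
                value = value - term()     # like $$ = $1 - $3
        return value

    def term():                     # term : term '*' factor | term '/' factor | factor
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value = value * factor()
            else:
                value = int(value / factor())   # truncate like C integer division
        return value

    def factor():                   # factor : '(' expr ')' | DIGIT
        tok = peek()
        if tok == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if tok is not None and tok.isdigit():
            return int(advance())
        raise SyntaxError(f"unexpected token {tok!r}")

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


if __name__ == "__main__":
    print(evaluate("2+3*4"))
    print(evaluate("(2+3)*4"))
```

The while loops stand in for the left-recursive rules, so operators of equal precedence are still applied from left to right.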

calc.l:
%{
#include<stdio.h>
#include<stdlib.h>
#include "y.tab.h"
%}

%%
[0-9]+  {yylval.dval = atoi( yytext ); return DIGIT;}
\n|.    return yytext[0];
%%

calc.y:
%{
/*E->E+E|E-E|E*E|E/E|(E)|DIGIT*/
#include<stdio.h>
int yylex(void);
int yyerror(const char *s);
%}

%union
{
    int dval;
}

%token <dval> DIGIT

%type <dval> expr
%type <dval> term
%type <dval> expr1

%%
line  : expr '\n'        {printf("%d\n",$1);}
      ;
expr  : expr '+' term    {$$ = $1 + $3 ;}
      | expr '-' term    {$$ = $1 - $3 ;}
      | term
      ;
      /* the term level gives * and / higher precedence than + and - */
term  : term '*' expr1   {$$ = $1 * $3 ;}
      | term '/' expr1   {$$ = $1 / $3 ;}
      | expr1
      ;
expr1 : '(' expr ')'     {$$=$2;}
      | DIGIT
      ;
%%

int main()
{
    yyparse();
    return 0;
}
int yyerror(const char *s)
{
    printf("%s",s);
    return 0;
}

Output:
$ lex calc.l
$ yacc -d calc.y
$ gcc lex.yy.c y.tab.c -ll
$ ./a.out

Program-6

AIM: Program to generate predictive LL(1) parsing table for the grammar

DESCRIPTION: A grammar whose parsing table has no multiply defined entries is said to be
LL(1). The first L in LL(1) stands for scanning the input from left to right, the second L for
producing a left most derivation and 1 for using one input symbol of look ahead at each step to
make parsing action decisions.
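
The "no multiply defined entries" condition can be checked mechanically by building the table from FIRST and FOLLOW sets. The Python sketch below does this for an illustrative expression grammar; the grammar itself, the single-character symbol convention (upper case for non-terminals, '@' for epsilon, '$' for end of input), and all function names are assumptions chosen for the example.

```python
EPS = "@"   # epsilon marker

GRAMMAR = {             # illustrative expression grammar (an assumption)
    "E": ["TR"],
    "R": ["+TR", EPS],
    "T": ["FY"],
    "Y": ["*FY", EPS],
    "F": ["(E)", "i"],
}
START = "E"


def first_of(seq, first):
    """FIRST set of a string of grammar symbols."""
    result = set()
    for sym in seq:
        if sym == EPS:
            continue
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)      # every symbol in seq can derive the empty string
    return result


def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:       # iterate until no FIRST set grows any more
        changed = False
        for nt, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of(body, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[START].add("$")
    changed = True
    while changed:
        changed = False
        for nt, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:          # what follows nt also follows sym
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


def build_table(first, follow):
    table = {}
    for nt, bodies in GRAMMAR.items():
        for body in bodies:
            body_first = first_of(body, first)
            lookaheads = body_first - {EPS}
            if EPS in body_first:
                lookaheads |= follow[nt]
            for terminal in lookaheads:
                if (nt, terminal) in table:  # multiply defined entry
                    raise ValueError(f"not LL(1): conflict at [{nt}, {terminal}]")
                table[(nt, terminal)] = body
    return table


if __name__ == "__main__":
    first = compute_first()
    follow = compute_follow(first)
    for (nt, terminal), body in sorted(build_table(first, follow).items()):
        print(f"M[{nt}, {terminal}] = {nt}->{body}")
```

A grammar that is not LL(1) makes build_table raise as soon as two productions compete for the same table cell.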

/*Program to construct PREDICTIVE LL(1) TABLE*/

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

char prod[10][20],start[2];
char nonterm[10],term[10];
char input[10],stack[50];
int table[10][10];
int te,nte;
int n;

void init(void);
void parse(void);

int main(void)
{
    init();
    parse();
    return 0;
}

/* Read the grammar, the input string and the parsing table from the user. */
void init(void)
{
    int i,j;
    printf("\nNOTE:\n");
    printf("The terminals should be entered in single lower case letters,special symbol and\n");
    printf("non-terminals should be entered in single upper case letters.\n");
    printf("extends to symbol is '->' and epsilon symbol is '@' \n");
    printf("\nEnter the no. of terminals:");
    scanf("%d",&te);
    for(i=0;i<te;i++)
    {
        printf("Enter the terminal %d:",i+1);
        scanf(" %c",&term[i]);   /* the leading space skips the pending newline */
    }
    term[i]='$';
    printf("\nEnter the no. of non terminals:");
    scanf("%d",&nte);
    for(i=0;i<nte;i++)
    {
        printf("Enter the non-terminal %d:",i+1);
        scanf(" %c",&nonterm[i]);
    }
    printf("\nEnter the no. of productions:");
    scanf("%d",&n);
    for(i=0;i<n;i++)
    {
        printf("Enter the production %d:",i+1);
        scanf("%19s",prod[i]);
    }
    printf("\nEnter the start symbol:");
    scanf(" %c",&start[0]);
    printf("\nEnter the input string:");
    scanf("%8s",input);          /* leave room for '$' and the terminator */
    strcat(input,"$");
    printf("\n\nThe productions are:");
    printf("\nProductionNo. Production");
    for(i=0;i<n;i++)
        printf("\n %d %s",i+1,prod[i]);
    printf("\n\nEnter the parsing table:");
    printf("\n Enter the production number in the required entry as mentioned above.");
    printf("\n Enter the undefined entry or error of table as '0'\n\n");
    for(i=0;i<nte;i++)
    {
        for(j=0;j<=te;j++)
        {
            printf("Entry of table[%c,%c]:",nonterm[i],term[j]);
            scanf("%d",&table[i][j]);
        }
    }
}

/* Table-driven predictive parse of the input string. */
void parse(void)
{
    int i,j,prodno;
    int top=-1,current=0;
    stack[++top]='$';
    stack[++top]=start[0];
    do
    {
        if((stack[top]==input[current])&&(input[current]=='$'))
        {
            printf("\nThe given input string is parsed\n");
            return;
        }
        else if(stack[top]==input[current])
        {
            top--;
            current++;
        }
        else if(stack[top]>='A'&&stack[top]<='Z')
        {
            for(i=0;i<nte;i++)
                if(nonterm[i]==stack[top]) break;
            for(j=0;j<=te;j++)
                if(term[j]==input[current]) break;
            /* an unknown non-terminal or terminal has no table entry */
            prodno=(i<nte&&j<=te)?table[i][j]:0;
            if(prodno==0)
            {
                printf("\nThe given input string is not parsed\n");
                return;
            }
            else
            {
                /* Replace the non-terminal on top of the stack by the right-hand
                   side, pushed in reverse; the symbols start at index 3, after "X->". */
                for(i=(int)strlen(prod[prodno-1])-1;i>=3;i--)
                {
                    if(prod[prodno-1][i]!='@')
                        stack[top++]=prod[prodno-1][i];
                }
                top--;
            }
        }
        else
        {
            printf("\nThe given input string is not parsed\n");
            return;
        }
    }while(1);
}

Output:

Program-7

AIM: To translate a source language statement into an intermediate form (three-address code).

DESCRIPTION: Three-address code is a sequence of statements of the general form x := y op z,
where x, y and z are names, constants or compiler-generated temporaries, and op stands for any
operator, such as an arithmetic operator or a logical operator on Boolean-valued data.
Three-address code is a linearized representation of a syntax tree or a DAG in which explicit
names correspond to the interior nodes of the graph.

/*Program to generate three address codes*/

#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
#include<string.h>
struct three
{
char data[10],temp[7];
}s[30];
void main()
{
char d1[7],d2[7]="t";
int i=0,j=1,len=0;
FILE *f1,*f2;
f1=fopen("sum.txt","r");
f2=fopen("out.txt","w");
while(fscanf(f1,"%s",s[len].data)!=EOF)
len++;
sprintf(d1,"%d",j); /* decimal suffix of the temporary name, e.g. 1 for t1 */
strcat(d2,d1);
strcpy(s[j].temp,d2);
strcpy(d1,"");
strcpy(d2,"t");
if(!strcmp(s[3].data,"+")) {
fprintf(f2,"%s=%s+%s",s[j].temp,s[i+2].data,s[i+4].data);

j++;
}
else if(!strcmp(s[3].data,"-"))
{
fprintf(f2,"%s=%s-%s",s[j].temp,s[i+2].data,s[i+4].data);
j++;
}
for(i=4;i<len-2;i+=2)
{
sprintf(d1,"%d",j); /* decimal suffix of the temporary name */
strcat(d2,d1);
strcpy(s[j].temp,d2);
if(!strcmp(s[i+1].data,"+"))
fprintf(f2,"\n%s=%s+%s",s[j].temp,s[j-1].temp,s[i+2].data);
else if(!strcmp(s[i+1].data,"-"))
fprintf(f2,"\n%s=%s-%s",s[j].temp,s[j-1].temp,s[i+2].data);
strcpy(d1,"");
strcpy(d2,"t");
j++;
}
fprintf(f2,"\n%s=%s",s[0].data,s[j-1].temp);
fclose(f1);
fclose(f2);
getch();
}

Input:
out = in1 + in2 + in3 - in4

Output:

Program-8

AIM: To write a program to generate code (Assembly Language code).

DESCRIPTION: The code generator produces the target code. The target code may be assembly
language code, absolute machine code or relocatable machine code. The input to the code
generator is the intermediate representation of the source program produced by the front end;
in this program the generator emits assembly language as the target code. The symbol table is
used to determine the run-time addresses of the data objects denoted by the names in the
intermediate representation.

/*Program for code generation*/

#include<stdio.h>
#include<conio.h>
#include<string.h>
struct three
{
char data[10],temp[7];
}s[30];
void main()
{
char *d1,*d2;
int i=0,len=0;
FILE *f1,*f2;
f1=fopen("exe.txt","r");
f2=fopen("exe1.txt","w");
while(fscanf(f1,"%s",s[len].data)!=EOF)
len++;
for(i=0;i<len;i++)
{
if(!strcmp(s[i].data,"="))
{
fprintf(f2,"\nLDA\t%s",s[i+1].data);
if(!strcmp(s[i+2].data,"+"))
fprintf(f2,"\nADD\t%s",s[i+3].data);

if(!strcmp(s[i+2].data,"-"))
fprintf(f2,"\nSUB\t%s",s[i+3].data);
fprintf(f2,"\nSTA\t%s",s[i-1].data);
}
}
fclose(f1);
fclose(f2);
getch();
}

exe.txt:
t1 = in1 + in2
t2 = t1 + in3
t3 = t2 - in4
out = t3

Program-9

AIM: Program to improve target code with the help of optimization techniques.
Ex: Dead code elimination and common subexpression elimination.

DESCRIPTION: The code produced by straightforward compiler algorithms can often be made to
run faster, take less space, or both. This improvement is achieved by program transformations
that are traditionally called code optimization. Common subexpression elimination, copy
propagation, dead code elimination, and constant folding are common examples of such
function-preserving transformations.

/*Program to implement Optimization technique*/


#include<stdio.h>
#include<string.h>
struct op
{
char l;
char r[20];
}op[10],pr[10];
void main()
{
int a,i,k,j,n,z=0,m,q;
char *p,*l;
char temp,t;
char *tem;
printf("Enter the Number of Values:");
scanf("%d",&n);
for(i=0;i<n;i++)
{
printf("left: ");
scanf(" %c",&op[i].l);
printf("right: ");
scanf(" %19s",op[i].r); /* r is already a char array; at most 19 characters fit */
}

printf("Intermediate Code\n") ;
for(i=0;i<n;i++)
{
printf("%c=",op[i].l);
printf("%s\n",op[i].r);
}
for(i=0;i<n-1;i++)
{
temp=op[i].l;
for(j=0;j<n;j++)
{
p=strchr(op[j].r,temp);
if(p)
{
pr[z].l=op[i].l;
strcpy(pr[z].r,op[i].r);
z++;
}
}
}
pr[z].l=op[n-1].l;
strcpy(pr[z].r,op[n-1].r);
z++;
printf("\nAfter Dead Code Elimination\n");
for(k=0;k<z;k++)
{
printf("%c\t=",pr[k].l);
printf("%s\n",pr[k].r);
}
for(m=0;m<z;m++)
{

tem=pr[m].r;
for(j=m+1;j<z;j++)
{
p=strstr(tem,pr[j].r);
if(p)
{
t=pr[j].l;
pr[j].l=pr[m].l;
for(i=0;i<z;i++)
{
l=strchr(pr[i].r,t) ;
if(l) {
a=l-pr[i].r;
printf("pos: %d\n",a);
pr[i].r[a]=pr[m].l;
}
}
}
}
}
printf("Eliminate Common Expression\n");
for(i=0;i<z;i++)
{
printf("%c\t=",pr[i].l);
printf("%s\n",pr[i].r);
}
for(i=0;i<z;i++)
{
for(j=i+1;j<z;j++)
{
q=strcmp(pr[i].r,pr[j].r);

if((pr[i].l==pr[j].l)&&!q)
{
pr[i].l='\0';
}
}
}
printf("Optimized Code\n");
for(i=0;i<z;i++)
{
if(pr[i].l!='\0')
{
printf("%c=",pr[i].l);
printf("%s\n",pr[i].r);
}
}
}

Program-10

AIM: Program to implement mini compiler with phases.

DESCRIPTION: The program shows how the source program is converted to tokens and these
tokens are processed to produce the output using Lex and Yacc tools. First define the grammar
rules of the language that the parser will be parsing. Create a lexical analyzer (also known as a
lexer or scanner) that will read in the input code and tokenize it according to the token patterns.
Create a parser that will use the tokens generated by the lexer to construct a parse tree. The parse
tree represents the structural relationship between the different parts of the code. Implement
error handling mechanisms to detect and recover from syntax errors in the input code. Once the
parse tree has been constructed, the parser may do additional processing to generate intermediate
code or perform semantic analysis to get the output.

Validvar.l:
%{
#include "y.tab.h"
%}
%%
[a-zA-Z_][a-zA-Z_0-9]* return letter;
[0-9] return digit;
. return yytext[0];
\n return 0;
%%
int yywrap()
{
return 1;
}
Validvar.y:
%{
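#include<stdio.h>
int yylex();     /* scanner generated by lex */
int yyerror();   /* defined below, called by yyparse() on a syntax error */
int valid=1;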
%}
%token digit letter
%%
start : letter s
s : letter s
| digit s
|
;
%%
int yyerror()
{
printf("\nIt is not an identifier!\n");
valid=0;
return 0;
}
int main()
{
printf("\nEnter a name to be tested for identifier ");
yyparse();
if(valid)
{
printf("\nIt is an identifier!\n");
}
return 0;
}

OUTPUT:

Program-11

AIM: Program to find number of characters, spaces, lines, tabs in a given file.

DESCRIPTION: A text file is opened in read mode and its characters are retrieved one at a time
using the fgetc() function. When the value returned is EOF the loop ends; otherwise the
counters are updated. Every character read increments the character count. A blank space
increments the space count, a tab increments the tab count, and a newline increments the line
count. The counts printed depend on the contents of the input file.

/*Program to count number of spaces, lines, characters & tabs*/

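#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#include<ctype.h>   /* isgraph() */
int main()
{
FILE *fp1;
int ch;             /* int, so that EOF can be told apart from every character */
int space=0;
int lines=0;
int tabs=0;
int chars=0;
fp1=fopen("doc.c","r");
if(fp1==NULL)
{
printf("cannot open doc.c\n");
return 1;
}
while((ch=fgetc(fp1))!=EOF)   /* stop on the failed read itself */
{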
if(isgraph(ch)||ch==' '||ch=='\n'||ch=='\t')
{
chars++;
}
if(ch==' ')
{
space++;
}
if(ch=='\n')
{
lines++;
}
if(ch=='\t')
{
tabs++;
}
}

printf("spaces\t-->%d\n",space);
printf("lines\t -->%d\n",lines+1);
printf("tabs\t -->%d\n",tabs);
printf("chars\t -->%d\n",chars);
fclose(fp1);
return 0;
}

Input:
This is Compiler Design Lab.

Output:
$ ./a.out

Program-12

AIM: Program to implement Scanner without using Lex tool.

DESCRIPTION: A scanner reads the input and reports a description of every symbol it finds.
Two files are used: the input is given in one file, and the symbols found are described in the
output file, which lists the line number, token number, token and lexeme. A token is a
meaningful category of symbols, such as keyword, identifier or operator. A lexeme is the actual
sequence of characters in the source that matches a token. The input may contain digits,
alphabetic characters, identifiers, keywords, operators, special symbols, etc. The output shows
all of these together with the line number, token number and lexeme.

Source file or any text (input) -> SCANNER -> TOKENS (output)
Fig. 1: Scanner representation

/*Standalone Scanner program */

#include<stdio.h>
#include<ctype.h>
#include<string.h>

int main()
{
FILE *input, *output;
int l=1;
int t=0;
int j=0;
int i,flag;
int ch;                          /* int, so that EOF can be detected reliably */
char str[20];
input = fopen("input.txt","r");
output = fopen("output.txt","w");
char keyword[30][30] = {"int","main","if","else","do","while"};
fprintf(output,"Line no. \t Token no. \t Token \t Lexeme\n\n");
while((ch=fgetc(input))!=EOF)    /* stop on the failed read itself */
{
i=0;
flag=0;
if( ch=='+' || ch== '-' || ch=='*' || ch=='/' )
{
fprintf(output,"%7d\t\t %7d\t\t Operator\t %7c\n",l,t,ch);
t++;
}

else if( ch==';' || ch=='{' || ch=='}' || ch=='(' || ch==')' || ch=='?' || ch=='@' || ch=='!' ||
ch=='%')
{
fprintf(output,"%7d\t\t %7d\t\t Special symbol\t %7c\n",l,t,ch);
t++;
}
else if(isdigit(ch))
{
fprintf(output,"%7d\t\t %7d\t\t Digit\t\t %7c\n",l,t,ch);
t++;
}
else if(isalpha(ch))
{
str[i]=ch;
i++;
ch=fgetc(input);
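while(isalnum(ch) && i<19)   /* leave room for the terminating '\0' */
{
str[i]=ch;
i++;
ch=fgetc(input);
}
ungetc(ch,input);   /* give back the character that ended the word */
str[i]='\0';
for(j=0;j<30;j++)   /* keyword[] has 30 rows: valid indices are 0..29 */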
{
if(strcmp(str,keyword[j])==0)
{
flag=1;
break;
}
}
if(flag==1)
{
fprintf(output,"%7d\t\t %7d\t\t Keyword\t %7s\n",l,t,str);
t++;
}
else
{
fprintf(output,"%7d\t\t %7d\t\t Identifier\t %7s\n",l,t,str);
t++;
}
}
else if(ch=='\n')
{
l++;
}

}
fclose(input);
fclose(output);
return 0;
}

Input:

int i = 60 ;
int ( ) ;

Output:

Program-13

AIM: Program to demonstrate using Lex tool


a) Program to identify the Octal or Hexadecimal number using Lex tool.

DESCRIPTION: A Lex program contains three sections: the declarations part, the rules part and
the auxiliary code. The Lex patterns for octal and hexadecimal numbers are given in the
declarations part of the source program, and the rules use them to identify matching input.

Functions used in the program are:

yylex(): The scanning process for the Lex source program begins when yylex() is called.

yywrap(): Called when the scanner encounters the end of file. It returns 1 if there is no more
input to scan, and 0 if scanning should continue with another input file.

/*Lex program to identify whether the given number is Octal or Hexadecimal*/

%{
#include<stdio.h>
%}
Oct [0][0-7]+
Hex [0][xX][0-9A-Fa-f]+

%%
{Hex} printf("this is a hexadecimal number");
{Oct} printf("this is an octal number");
%%

int main()
{
yylex();
}
int yywrap()
{
return 1;
}

Output:
$ lex octalhexadecimal.l
$ gcc lex.yy.c
$ ./a.out

AIM: b) Program to capitalize the input string using LEX TOOL.

DESCRIPTION: In the declarations section the display function is declared. The patterns are
written in the rules part, whose actions call the display function. In the auxiliary code, the
display function is defined along with main() and yywrap(). The input string written in lower
case letters is converted to upper case using toupper(). The text to be capitalized is the word
that follows the comment marker '//'.

/*LEX program to CAPITALIZE the given comment*/


%{
#include<stdio.h>
#include<ctype.h>
int k=0;
void display(char *);
%}
letter [a-zA-Z0-9]+
com "//"

%%
{com} {if(k==0) k=1;}
{letter} {if(k==1) { k=0;display(yytext);}}
%%

int main()
{
yylex();
}
void display(char *s)
{
int i;
for(i=0;s[i]!='\0';i++)
printf("%c", toupper(s[i]));
}
int yywrap()
{
return 1;
}

Output:
$ lex capital.l
$ gcc lex.yy.c
$ ./a.out
cbit
CBIT

AIM: c) Program to identify integer or real number using lex tool.

DESCRIPTION: The patterns for integers and real values are named in the definitions section
and used in the rules section. The auxiliary code contains main(), which calls yylex(), and
yywrap(). If the input is a real value, the program reports a real number. If the input is an
integer, it reports an integer.

/*Lex program to identify integer or real number */

%{
#include<stdio.h>
%}

integer [0-9]+
float [0-9]+\.[0-9]+
%%
{integer} printf("This is an integer");
{float} printf("This is a real number");
%%
int main(){
yylex();
}
int yywrap(){
return 1;
}

Output:
$ lex real.l
$ gcc lex.yy.c
$ ./a.out

Program-14

AIM: Program to implement Recursive descent parser for a grammar.

E -> E+T / T
T -> T*F / F
F -> (E) / id

DESCRIPTION: Recursive descent parsing is a top-down method that may require backtracking,
with repeated scans of the input. Predictive parsing is the special case in which no
backtracking is required. The first step is to eliminate left recursion from the given grammar
and to left factor it.
Left Recursion: A grammar is left recursive if it has a non-terminal A such that there is a
derivation A => Aα for some string α.
Immediate left recursion of the form A -> Aα / β can be removed by replacing it with
A -> βA'
A' -> αA' / ε

Left Factoring: Productions of the form A -> αβ1 / αβ2 / ... / αβn / γ can be left factored by
replacing them with
A -> αA' / γ
A' -> β1 / β2 / ... / βn

/*Program to implement a RECURSIVE DESCENT PARSER*/

#include<stdio.h>
#include<string.h>

void E(),E1(),T(),T1(),F();

int ip=0;
static char s[10];

int main()
{
ip=0;
printf("Enter the string (use i for id and end it with $):\n");
scanf("%9s",s);
printf("The string is: %s",s);
E();
if(s[ip]=='$')
printf("\nString is accepted.\nThe length of the string is %d\n",(int)strlen(s)-1);
else
printf("\nString not accepted.\n");

return 0;
}

void E()
{
T();
E1();
return;
}

void E1()
{
if(s[ip]=='+')
{
ip++;
T();
E1();
}
return;
}

void T()
{
F();
T1();
return;
}

void T1()
{
if(s[ip]=='*')
{
ip++;
F();
T1();
}
return;
}

void F()
{
if(s[ip]=='(')
{
ip++;
E();
if(s[ip]==')')

{
ip++;
}
}
else if(s[ip]=='i')
ip++;
else
printf("\nId expected ");
return;
}

Output:
$ ./a.out

Program-15

AIM: Program to find FIRST elements for the given grammar.

DESCRIPTION: To compute FIRST(x) for all grammar symbols x, apply the following rules
until no more terminals or ε can be added to any FIRST set.
1. If x is a terminal, then FIRST(x) is {x}.
2. If x -> ε is a production, then add ε to FIRST(x).
3. If x is a non-terminal and x -> y1 y2 ... yk is a production, then place a in FIRST(x) if, for
some i, a is in FIRST(yi) and ε is in all of FIRST(y1), ..., FIRST(yi-1); that is,
y1 ... yi-1 =>* ε. If ε is in FIRST(yj) for all j = 1, 2, ..., k, then add ε to FIRST(x).

/*Program to compute the FIRST of a given grammar*/

#include<stdio.h>
#include<ctype.h>

int main()
{
int i,n,j,k;
char str[10][10],f;
printf("Enter the number of productions\n");
scanf("%d",&n);
printf("Enter grammar\n");
for(i=0;i<n;i++)
scanf("%9s",str[i]);
for(i=0;i<n;i++)
{
f= str[i][0];
int temp=i;
if(isupper(str[i][3]))
{
repeat:
for(k=0;k<n;k++)
{
if(str[k][0]==str[i][3])
{
if(isupper(str[k][3]))
{
i=k;
goto repeat;
}
else
{
printf("First(%c)=%c\n",f,str[k][3]);
}

}
}
}
else
{
printf("First(%c)=%c\n",f,str[i][3]);
}
i=temp;
}
}

Output:
$ ./a.out

Program-16

AIM: Program to find FOLLOW elements for the given grammar.

DESCRIPTION: To compute FOLLOW(A), where A is the non terminal, apply the following
rules until nothing can be added to any FOLLOW set.

1. Place $ in FOLLOW(S), where S is the start symbol and $ is the input right endmarker.
2. If there is a production A -> αBβ, then everything in FIRST(β) except for ε is placed in
FOLLOW(B).
3. If there is a production A -> αB, or a production A -> αBβ where FIRST(β) contains ε (i.e.
β =>* ε), then everything in FOLLOW(A) is in FOLLOW(B).

/*Program to compute the FOLLOWS of a given grammar*/

#include<stdio.h>
#include<string.h>   /* strlen() */
#include<ctype.h>    /* isupper() */

int main()
{
int np,i,j,k;
char prods[10][10],follow[10][10],Imad[10][10];
printf("enter no. of productions\n");
scanf("%d",&np);
printf("enter grammar\n");
for(i=0;i<np;i++)
{
scanf("%9s",prods[i]);
}

for(i=0; i<np; i++)


{
if(i==0)
{
printf("Follow(%c) = $\n",prods[0][0]);//Rule1
}
for(j=3;prods[i][j]!='\0';j++)
{
int temp2=j;
//Rule-2: production A->xBb then everything in first(b) is in follow(B)
if(prods[i][j] >= 'A' && prods[i][j] <= 'Z')
{
if((strlen(prods[i])-1)==j)
{

printf("Follow(%c)=Follow(%c)\n",prods[i][j],prods[i][0]);

}
int temp=i;
char f=prods[i][j];
if(!isupper(prods[i][j+1])&&(prods[i][j+1]!='\0'))
printf("Follow(%c)=%c\n",f,prods[i][j+1]);
if(isupper(prods[i][j+1]))
{
repeat:
for(k=0;k<np;k++)
{
if(prods[k][0]==prods[i][j+1])
{
if(!isupper(prods[k][3]))
{
printf("Follow(%c)=%c\n",f,prods[k][3]);
}
else
{
i=k;
j=2;
goto repeat;
}
}
}
}
i=temp;
}
j=temp2;
}
}
}
Output:
$ ./a.out

Program-17

AIM: Program to find the canonical LR(0) items.

DESCRIPTION: A grammar for which we can construct an LR parsing table is said to be an LR
grammar. An LR parser does not have to scan the entire stack to know when the handle appears
on the top. Rather, the state symbol on top of the stack contains all the information it needs. A
grammar that can be parsed by an LR parser examining up to k input symbols on each move is
called an LR(k) grammar.

An LR(0) item of a grammar G is a production of G with a dot at some position of the right
side. One collection of sets of LR(0) items, called the canonical LR(0) collection, provides the
basis for constructing SLR parsers. To construct the canonical LR(0) collection for a grammar,
we define an augmented grammar and two functions, closure and goto.

/*Program to find CANONICAL LR(0) Collections*/

//closure.c

#include<stdio.h>
#include<string.h>

char a[8][5],b[7][5];
int c[12][5];
int w=0,e=0,x=0,y=0;
int st2[12][2],st3[12];
char sta[12],ch;
void v1(char,int);
void v2(char,int,int,int);

int main()
{
int i,j,k,l=0,m=0,p=1,f=0,g,v=0,jj[12];
//clrscr();
printf("\n\n\t*******Enter the Grammar Rules (max=3)*******\n\t");
for(i=0;i<3;i++)
{
scanf("%4s",a[i]);   /* bounded read: a[i] holds at most 4 characters plus '\0' */
printf("\t");
}
for(i=0;i<3;i++)
{
for(j=0;j<strlen(a[i]);j++)
{
for(k=0;k<strlen(a[i]);k++)
{

if(p==k)
{
b[l][m]='.';
m+=1;
b[l][m]=a[i][k];
m+=1;
}
else
{
b[l][m]=a[i][k];
m++;
}
}
p++;
l++;
m=0;
}
p=1;
}
i=0; p=0;
while(l!=i)
{
for(j=0;j<strlen(b[i]);j++)
{
if(b[i][j]=='.')
{
p++;
}
}
if(p==0)
{
b[i][strlen(b[i])]='.';
}
i++;
p=0;
}
i=0;
printf("\n\t*******Your States will be*******\n\t");
while(l!=i)
{
printf("%d--> ",i);
puts(b[i]);
i++;
printf("\t");
}
printf("\n");

v1('A',l);
p=c[0][0];
m=0;
while(m!=6)
{
for(i=0;i<st3[m];i++)
{
for(j=0;j<strlen(b[p]);j++)
{
if(b[p][j]=='.' && ((b[p][j+1]>=65 && b[p][j+1]<=90)||
(b[p][j+1]>=97&&b[p][j+1]<=122)))
{
st2[x][0]=m;
sta[x]=b[p][j+1];
v2(b[p][j+1],j,l,f);
x++;
//str();
}
else
{
if(b[p][j]=='.')
{
st2[x][0]=m;
sta[x]='S';
st2[x][1]=m;
x++;
}
}
}
p=c[m][i+1];
}
m++;
p=c[m][0];
}
g=0;
p=0;
m=0;
x=0;
getchar();
return 0;
}

void v1(char ai,int kk)


{
int i,j;
for(i=0;i<kk;i++)

{
if(b[i][2]==ai&&b[i][1]=='.')
{
c[w][e]=i;
e++;
if(b[i][2]>=65 && b[i][2]<=90)
{
for(j=0;j<kk;j++)
{
if(b[j][0]==ai && b[j][1]=='.')
{
c[w][e]=j;
e++;
}
}
}
}
}
st3[w]=e;
w++;
e=0;
}

void v2(char ai,int ii,int kk,int tt)


{
int i,j,k;
for(i=0;i<kk;i++)
{
if(b[i][ii]=='.'&& b[i][ii+1]==ai)
{
for(j=0;j<kk;j++)
{
if(b[j][ii+1]=='.' && b[j][ii]==ai)
{
c[w][e]=j;
e++;
st2[tt][1]=j;
if(b[j][ii+2]>=65 && b[j][ii+2]<=90)
{
for(k=0;k<kk;k++)
{
if(b[k][0]==b[j][ii+2] && b[k][1]=='.')
{
c[w][e]=k;
e++;
}

}
}
}
}
if((b[i][ii+1]>=65 && b[i][ii+1]<=90) && tt==1)
{
for(k=0;k<kk;k++)
{
if(b[k][0]==ai && b[k][1]=='.')
{
c[w][e]=k;
e++;
}
}
}
}
}
st3[w]=e;
w++;
e=0;
}

Output:
$ ./a.out

Program-18

AIM: Program to implement Symbol Table Management

DESCRIPTION: The symbol table is used during all phases of compilation. It keeps track of
scope and binding information about names. For an identifier, the information stored may
include its name, its address, and how it is defined (as a variable, type, function name, etc.).

#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
int main()
{
int i=0,j=0,x=0,n;
void *p,*add[15];   /* one slot per symbol; b[] holds at most 15 characters */
char ch,srch,b[15],d[15],c;
printf("Expression terminated by $:");
while((c=getchar())!='$')
{
b[i]=c;
i++;
}
n=i-1;
printf("Given Expression:");
i=0;
while(i<=n)
{
printf("%c",b[i]);
i++;
}
printf("\n Symbol Table\n");

printf("Symbol \t addr \t type");
while(j<=n)
{
c=b[j];
if(isalpha(toascii(c)))
{
p=malloc(c);
add[x]=p;
d[x]=c;
printf("\n%c \t %p \t identifier\n",c,p);
x++;
j++;
}
else if (isdigit(c))
{
p=malloc(c);
add[x]=p;
d[x]=c;
printf("\n%c \t %p \t Constant\n",c,p);
x++;
j++;
}
else
{
ch=c;
if(ch=='+'||ch=='-'||ch=='*'||ch=='=')
{
p=malloc(ch);
add[x]=p;
d[x]=ch;
printf("\n %c \t %p \t operator\n",ch,p);
x++;
}
j++;   /* always advance, so that any other character cannot stall the loop */
}
}
}

OUTPUT:

Instructions to write Lab Record/ Report

The Lab record/report shall consist of

1) Certificate
2) Index
i. Number.
ii. Name of the Program
iii. Date of program execution
iv. Signature of the faculty
v. Remarks
3) Programs
Each program shall be submitted in the following format.
i. Aim of the program
ii. Description
iii. Algorithm
iv. Program
v. Output
vi. Conclusion
vii. Student details as Header and Page No. as Footer

Continuous Internal Evaluation (CIE) for Lab (50)

Sl. No.   Evaluation components                                        Marks
1         Average of two Lab Internal-1                                20
2         Pre-Experiment Preparation work                              5
3         Experimentation (Problem solving, methodology of conduct)    5
4         Post-Experiment Analysis (Viva, Inference)                   10
5         Report Writing                                               5
6         Conduct (Ethics, Attendance, Safety, Team work)              5

Viva-Voce Question Bank

1) What is a compiler?

A computer program called a compiler converts source code written in a high-level language into
a low-level machine language.

2) What is compiler design?

Compiler design involves developing software that can read source code written in a high-level
programming language and produce code that a computer can execute. A compiler is the tool
responsible for this transformation: it reads the source code, checks it for errors, and outputs
the program in machine language. The generated binary code can be run on the computer
without further translation.

3) List various types of compilers.

The three types of compilers are:

 Single-Pass Compilers
 Two-Pass Compilers
 Multipass Compilers

4) What is an assembler?

An assembler is a piece of software that converts programs written in assembly language into
machine language.

5) What is a Symbol Table?

A symbol table is a data structure in which each identifier is represented by a record that
includes fields for the identifier's attributes. Its organization lets us quickly find the record for a
given identifier and store or retrieve information from it. The lexical analyzer adds an identifier
to the symbol table whenever it finds one, but it generally cannot determine the identifier's
attributes; these are filled in by later phases.

7) What Is Code Motion?

Code motion is an optimization that reduces the amount of code executed inside a loop. An
expression that yields the same value on every iteration of the loop (a loop-invariant
computation) is moved out of the loop and evaluated once, just before the loop is entered.

8) Explain what YACC is?

YACC (Yet Another Compiler-Compiler) is a parser generator for Unix. It is used to generate a
parser, a piece of software that determines whether the source code of a program is valid
according to the syntactic rules of the language. In most cases, YACC is used in conjunction
with Lex, a lexical analyzer generator, which produces the lexer.

9) Differentiate Tokens, Patterns, and Lexeme.

 Tokens: A token is a category of lexical units, such as keyword, identifier, operator or
constant, that the parser treats as a single symbol.
 Patterns: A pattern is a rule, usually written as a regular expression, that describes the set
of strings that form a particular token.
 Lexeme: A lexeme is an actual sequence of characters in the source code that matches
the pattern for a token. For example, count is a lexeme of the token identifier.

10) What Are The Benefits Of Intermediate Code Generation?

 A compiler for a different machine can be created by attaching a back end for the new
machine to an existing front end.
 A compiler for a different source language can be created by attaching a front end for the
new language to an existing back end.
 A machine-independent code optimizer can be applied to the intermediate code.

11) Explain lex and yacc tools?


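Lex: a scanner generator. It takes a set of patterns and produces a C routine (the scanner) that
can identify the tokens described by those patterns.

Yacc: a parser generator. It takes a concise description of a grammar and produces a C routine
that can parse that grammar.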

12) Give the structure of the lex program.


Definitions section: any initial C program code and pattern definitions
%%
Rules section: patterns and actions separated by white space
%%
User subroutines section: consists of any legal C code.

13) The lexer produced by lex is a C routine called yylex().

14) Explain yytext: it contains the text that matched the pattern.
15) The parser produced by yacc is called yyparse().
16) Why do we have to include 'y.tab.h' in lex?
y.tab.h contains the token definitions, e.g. #define letter 258.

17) Explain the structure of a yacc program?
Definitions section: declarations of the tokens used in the grammar
%%
Rules section: grammar rules, each with an optional action
%%
User subroutines section

18) What are the six phases of a compiler?

The six phases of a compiler, in order, are:

 Lexical Analysis.
 Syntactic Analysis or Parsing.
 Semantic Analysis.
 Intermediate Code Generation.
 Code Optimization.
 Code Generation.

19) What are the two types of compiler design?

The two types of compiler design are:

 Cross-compiler: A cross-compiler runs on one platform and generates machine code for a
different platform.
 Source-to-source compiler: A source-to-source compiler translates source code written in
one high-level programming language into source code in another.

20) What is meant by three address codes in the compiler?

Three-address code is an intermediate code that is simple to produce and simple to translate
into machine language. Each instruction contains at most three addresses (two operands and
one result) and a single operator. The value computed by each instruction is saved in a
temporary variable generated by the compiler.

21) What are the compiler design tools?

The tools used for compiler construction are as follows:

 Scanner Generator
 Parser Generator
 Data-flow analysis engines
 Automatic code generators
 Compiler construction toolkits
 Syntax-directed translation engines

22) Describe the Front End Of A Compiler?

A compiler's front end comprises the phases that depend mainly on the source language and are
largely independent of the target machine. The front end can also perform some code
optimization, and it includes the error handling that goes with each of its phases. The front end
includes:

 Lexical analysis
 Syntactic analysis
 Semantic analysis
 Generation of intermediate code
 Creation of the symbol table

23) Describe the Back-end Phases Of A Compiler?

The back-end phases of a compiler are the parts that depend on the target machine; they do not
rely on the source language but only on the intermediate language. They include:

 Code optimization
 Code generation, along with error handling and symbol-table operations

24) Which language is used in compiler design?

A user creates a program using the C programming language (high-level language). The software
is compiled using the C compiler, which then transforms it into an assembly program (low-level
language). Afterward, software called an assembler converts the assembly program into machine
code (object).

25) What tools are used for compiler construction?

Tools for creating compilers are the same as those for creating other programming languages like
Java and C++. Examples of this are:

 A parser
 A lexical analyzer
 A compiler frontend

26) What is bootstrapping in compiler design?

Bootstrapping is a technique in which a compiler for a language is written in that same
language. A small initial compiler, written in another language, is used to compile the first
version, after which the compiler can compile itself.

27) Can you explain context-free grammar and its importance in compiler design?

Grammar is a set of rules that specify how the sentences of a language may be formed. A
context-free grammar is a grammar in which every production has a single non-terminal on its
left side. It is important in the design of compilers because the syntax of most programming
languages is specified by a context-free grammar, and the parser relies on it to recognize the
structure of the program being translated.

28) What is Parsing in Compiler Design?

Parsing, or syntax analysis, is the process of determining whether a string of tokens can be
generated by the grammar of the language. The parser takes the linear sequence of tokens
produced by the lexical analyzer and arranges it into a hierarchical structure, the parse tree,
according to a predetermined set of rules called the grammar.

29) What is lexical analysis?

Lexical analysis is the first phase of a compiler. The lexical analyzer reads the stream of
characters making up the source program, groups the characters into lexemes, and produces a
token for each lexeme. It also typically strips out white space and comments.

30) Write a regular expression for an identifier?

An identifier begins with a letter or an underscore, followed by any number of letters, digits or
underscores. A regular expression for it is [a-zA-Z_][a-zA-Z0-9_]*. A string is a legal
identifier only if the whole string matches this pattern; for example, count1 matches, while
1count does not.

31) What Are The Properties Of Optimizing Compilers?

 The generated code should be as small and as fast as possible.
 There should be no unreachable code.
 Dead (unused or unnecessary) code should be eliminated.
 Optimizing compilers should apply code-improving transformations to the program being
compiled without changing its meaning.
 Common subexpressions should be eliminated.
 Strength reduction, code motion and the elimination of useless code should be applied.

31) Define Symbol Table.

A symbol table is a type of data structure that the compiler uses to track how the variables are
used. It keeps records of names and information about how they are used and bound.

32) What is the Application of Compilers?

Here are some important applications of Compilers:

 Compilers are helpful tools for putting into practice higher-level programming languages.
 It offers support for optimization for parallelism in computer architecture.
 It is used in designing new memory structures for computer systems.
 It is used extensively in programs that translate.
 It can be used in the synthesis of hardware, the translation of binary, and the
interpretation of database queries, among other program translations.
 It is simple to use in conjunction with various other software productivity tools.


34) Why is compiler design used?

The principles of compiler design provide an in-depth look at the translation and optimization
processes. The design of a compiler includes the primary translation mechanism, error detection,
and recovery. It covers lexical, syntax and semantic analysis in the front end, and code
generation and optimization in the back end.

35) Is compiler design difficult?

The process of building a compiler is a difficult one. A good compiler takes concepts from
formal language theory, the study of algorithms, artificial intelligence, systems design, computer
architecture, and the theory of programming languages and applies them to translate a program.

Text Books:

1. Keith D Cooper & Linda Torczon, “Engineering a Compiler”, Morgan Kaufmann, Second
edition, 2004.

2. John R Levine, Tony Mason, Doug Brown, “Lex & Yacc”, 3rd Edition, Shroff Publisher, 2007.

Suggested Reading:

1. Kenneth C Louden, “Compiler Construction: Principles and Practice”, Cengage Learning,
2005.

2. John R Levine, “Lex & Yacc”, O'Reilly Publishers, 2nd Edition, 2009.
