Lex1 Lab Manual TE Computer SPPU

Summary: This document discusses implementing a lexical analyzer for a subset of Java using the Lex tool. It explains the phases of compilation, the role of Lex, and the structure of Lex programs (declarations, rules, and auxiliary functions). The rules section matches patterns to tokens using regular expressions; examples are given for recognizing datatypes, numbers, operators, and identifiers, and running Lex generates a C file that implements the lexical analysis.

Group B

Assignment No. : 2

Aim:- Write a program using Lex specifications to implement the lexical analysis phase of a compiler to generate tokens for a subset of a 'Java' program.

Objectives:-

To understand the first phase of compiler: Lexical Analysis.

To learn and use compiler writing tools.

To understand the importance and usage of the Lex automated tool.

Tools: - Lex.

Theory:-

Introduction

The first phase of a compiler is lexical analysis. It is the process of converting a sequence of
characters into a sequence of tokens (strings with an identified "meaning"). A program that
performs lexical analysis may be called a lexer, tokenizer, or scanner. A token is a group of
characters treated as a single logical unit.
Example:-

Lexeme   Token
sum      ID
=        Assignment Op
==       Equal Op
57       Integer const
int      Data Type
,        Comma
(        Left Paren

Need of LEX :

1. Its main job is to break an input stream into more usable elements, or in other
words, to identify the "interesting bits" in the text file.
2. Lexical analysis divides the program into tokens; the parser then establishes the
relationships among those tokens.
3. A C, C++, or Java compiler needs to find the expressions, statements, declarations, blocks,
and procedures in the program.
LEX Specifications:

The structure of a Lex program consists of three parts: declarations, rules, and auxiliary functions.

Declarations

The declarations section consists of two parts: auxiliary declarations and regular definitions.

The auxiliary declarations are copied verbatim by Lex to the output lex.yy.c file. This C
code is intended for the C compiler and is not processed by the Lex tool itself. The
auxiliary declarations (which are optional) are written in C and are enclosed within
'%{' and '%}'. This part is generally used to declare functions, include header files, or define global
variables and constants.

Lex allows the use of short-hands and extensions to regular expressions for the regular
definitions. A regular definition in Lex is of the form: D R, where D is the name
representing the regular expression R.
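As a sketch, a declarations section for the Java-subset scanner might look like the following; the included header and the example global are illustrative, not required by Lex:

```lex
%{
/* auxiliary declarations: copied verbatim into lex.yy.c */
#include <stdio.h>
int token_count = 0;   /* example global variable, illustrative only */
%}

digit   [0-9]
letter  [A-Za-z]
```

Here `digit` and `letter` are regular definitions of the form D R: the name on the left stands for the regular expression on the right and can be reused in the rules section as `{digit}` and `{letter}`.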

Rules

Rules in a Lex program consist of two parts:

1. The pattern to be matched

2. The corresponding action to be executed

The pattern to be matched is specified as a regular expression.


Auxiliary functions

Lex generates C code for the rules specified in the rules section and places this code into a
single function called yylex(). In addition to this generated code, the programmer may wish
to add his own code to the lex.yy.c file; the auxiliary functions section allows the programmer to
achieve this. The auxiliary declarations and auxiliary functions are copied verbatim to the lex.yy.c
file. Once the code is written, lex.yy.c may be generated using the command lex filename.l and
compiled as gcc lex.yy.c
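Putting the three sections together, a minimal complete specification might look like this. It is a sketch of the structure, assuming a standard lex/flex installation; the patterns shown are simplified versions of the ones used later in this assignment:

```lex
%{
#include <stdio.h>
%}
digit  [0-9]
letter [A-Za-z]
%%
{digit}+                     { printf("%s is an integer\n", yytext); }
{letter}({letter}|{digit})*  { printf("%s is an identifier\n", yytext); }
[ \t\n]                      ; /* discard whitespace */
.                            { printf("%s is unknown\n", yytext); }
%%
int yywrap(void) { return 1; }
int main(void)   { yylex(); return 0; }
```

Saved as, say, sample.l, this builds with `lex sample.l` followed by `cc lex.yy.c`, producing an ./a.out that tokenizes its standard input.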

Information about built-in functions and variables.

Following are the built in functions and variables used in the LEX program.
1. Variables used:
a) yyleng: A variable that holds the length of the current input token in yytext.

b) yyin: A variable that determines the input stream for yylex() and the input
functions.
c) yytext: A variable that holds the current input token recognized by the
Lex scanner. It is accessible both within a Lex action and on
return of yylex(), and it is terminated with a null byte. If %pointer
is specified in the definitions section, yytext is defined as a
pointer to a pre-allocated array of char.
2. Functions used in Lex Program:

1. yylex(): The scanner that Lex produces. It returns a token code each time it matches
a token in the input; a zero or negative value indicates end of input or an error.

2. yywrap(): A library routine called by yylex() when it reaches end-of-file on the
input. The default version of yywrap() returns 1, which indicates that no
more input is available; yylex() then returns 0, indicating EOF.
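A custom yywrap() placed in the auxiliary functions section can instead switch yyin to another file and return 0, which tells yylex() to continue scanning. This is a sketch of that idea; the file name "b.java" is hypothetical:

```c
/* Hypothetical yywrap(): continue scanning from a second input file. */
int yywrap(void) {
    static int switched = 0;
    if (!switched) {
        FILE *next = fopen("b.java", "r");   /* hypothetical second file */
        if (next) {
            yyin = next;
            switched = 1;
            return 0;   /* 0 = more input available, yylex() keeps going */
        }
    }
    return 1;           /* 1 = no more input, yylex() returns 0 */
}
```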

How to execute lex program?

There are following steps to execute lex program.

Step 1: Save file with extension '.l'. Consider file name as temp, and then it will be saved as

temp.l

Step 2: The 'lex' tool processes the specification and produces a C file named lex.yy.c.

Step 3: The C compiler compiles this file and produces an executable ./a.out.


The generated analyzer can then be run on a Java source file using the commands below.

Commands

$ lex sample.l
$ cc lex.yy.c
$ ./a.out a.java

REGULAR DEFINITIONS USED IN PROGRAM:

digit  [0-9]
letter [A-Za-z]

Alternation in a pattern is expressed by the '|' operator.

Datatypes:

"int"|"char"|"float"|"double" {printf("\n %s is a datatype",yytext);}

If the token matches int, char, float, or double, it is declared a datatype.

Integer numbers:

{digit}+ {printf("\n %s is an integer number",yytext);}

If one or more digits are found, the token is declared an integer number.
Real number:-

{digit}+"E"("+"|"-")?{digit}+ {printf("\n %s is a real number",yytext);}

If the token matches the above RE, it is declared a real number.

Floating point no.:-

{digit}+"."{digit}+("E"("+"|"-")?{digit}+)? {printf("\n %s is a floating point number",yytext);}

Escape sequence:-

"\\a"|"\\n"|"\\t"|"\\b"|"\\\\" {printf("\n %s is an escape sequence",yytext);}

If the token matches one of the above, it is declared an escape sequence. (The backslashes are doubled so the pattern matches the two-character sequence as written in the source text, not the control character itself.)

Relational operators:-

"<"|">"|"<="|">=" {printf("\n %s is a relational operator",yytext);}

If the token matches, it is declared a relational operator.

Assignment operator:-

"=" {printf("\n %s is an assignment operator",yytext);}

If the token matches, it is declared an assignment operator.

Arithmetic operator:

"+"|"-"|"*"|"/" {printf("\n %s is an arithmetic operator",yytext);}

If the token matches, it is declared an arithmetic operator.

Special Character:

"{"|"}"|"["|"]"|"("|")"|"'"|"\""|"\\"|":"|","|";" {printf("\n %s is a special character",yytext);}

If the token matches, it is declared a special character.

Identifier:

{letter}({letter}|{digit})* {printf("\n %s is an identifier",yytext);}

If the token matches, it is declared an identifier.

Conclusion:- Hence, we successfully generated the tokens of a Java program using the lexical analyzer generated by Lex.
FAQs

1) What are tokens?

2) What is lexical analysis?
3) Explain the working of Lex.
4) Define the terms lexeme and pattern.
5) What are the different parts of a Lex program?
6) Explain the working of the rules section in Lex.
7) What are the yywrap() function and the yytext variable?
