Lex1 Lab Manual TE Computer SPPU
Assignment No. : 2
Aim:- Write a program using Lex specifications to implement the lexical analysis phase of a
compiler and generate tokens for a subset of a 'Java' program.
Objectives:-
Tools: - Lex.
Theory:-
Introduction
Lexical analysis is the first phase of a compiler. It converts a sequence of characters into a
sequence of tokens (strings with an identified "meaning"). A program that performs lexical
analysis may be called a lexer, tokenizer, or scanner. A lexeme is the group of characters
matched in the input, and a token is the category assigned to that lexeme.
Example:-
Lexeme    Token
Sum       ID
=         Assignment Op
==        Equal Op
57        Integer const
int       Data Type
,         Comma
(         Left Paren
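The lexeme-to-token mapping in the example can be sketched in plain C. This is only an illustration of what a scanner decides for each lexeme, not the code that Lex generates; the function name classify_lexeme and the "Unknown" fallback are assumptions made for this sketch.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: maps one lexeme to the token name used in the example. */
const char *classify_lexeme(const char *lexeme)
{
    size_t i, len = strlen(lexeme);

    if (len == 0) return "Unknown";

    /* Fixed lexemes are checked first, so "int" is not reported as an ID. */
    if (strcmp(lexeme, "int") == 0) return "Data Type";
    if (strcmp(lexeme, "==") == 0)  return "Equal Op";
    if (strcmp(lexeme, "=") == 0)   return "Assignment Op";
    if (strcmp(lexeme, ",") == 0)   return "Comma";
    if (strcmp(lexeme, "(") == 0)   return "Left Paren";

    /* Only digits: integer constant. */
    if (isdigit((unsigned char)lexeme[0])) {
        for (i = 1; i < len; i++)
            if (!isdigit((unsigned char)lexeme[i])) return "Unknown";
        return "Integer const";
    }

    /* A letter followed by letters or digits: identifier. */
    if (isalpha((unsigned char)lexeme[0])) {
        for (i = 1; i < len; i++)
            if (!isalnum((unsigned char)lexeme[i])) return "Unknown";
        return "ID";
    }
    return "Unknown";
}
```

For instance, classify_lexeme("Sum") yields "ID" and classify_lexeme("==") yields "Equal Op". A Lex specification expresses the same decisions as pattern and action pairs instead of hand-written comparisons.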
Need of LEX:
1. Its main job is to break up an input stream into more usable elements or, in other
words, to identify the “interesting bits” in a text file.
2. Lexical analysis divides the input into tokens; the program then needs to establish the
relationships among those tokens.
3. A C, C++, or Java compiler needs to find the expressions, statements, declarations, blocks
and procedures in the program.
LEX Specifications:
Declarations
The declarations section consists of two parts: auxiliary declarations and regular definitions.
The auxiliary declarations are copied as-is by LEX to the output file lex.yy.c. This C code is
meant for the C compiler and is not processed by the LEX tool. The auxiliary declarations
(which are optional) are written in C and are enclosed within '%{' and '%}'. They are
generally used to declare functions, include header files, or define global variables and
constants.
LEX allows the use of shorthands and extensions to regular expressions in the regular
definitions. A regular definition in LEX has the form: D R, where D is the symbol
representing the regular expression R.
Rules
LEX generates C code for the rules specified in the Rules section and places this code into a
single function called yylex(). In addition to this LEX-generated code, the programmer may wish
to add their own code to the lex.yy.c file. The auxiliary functions section allows the programmer
to do this. The auxiliary declarations and auxiliary functions are copied as-is to the lex.yy.c
file. Once the code is written, lex.yy.c may be generated using the command lex "filename.l" and
compiled as gcc lex.yy.c
The following built-in functions and variables are used in a LEX program.
1. Variables used:
a) yyleng: A variable that holds the length of the current token stored in “yytext”.
b) yyin: A variable that determines the input stream for yylex() and the input
functions.
c) yytext: A variable that holds the current input token recognized by the
LEX scanner. It is accessible both within a LEX action and on
return from yylex(). It is terminated with a null byte. If %pointer
is specified in the definitions section, yytext is defined as a
pointer to a pre-allocated array of char.
2. Functions used in a LEX program:
a) yylex(): The scanner that lex produces. It returns a token when it has located one
in the input. A negative or zero value indicates an error or end of input.
b) yywrap(): A library routine called by yylex() when it gets EOF from
yygetc. The default version of yywrap() returns 1, which indicates that no
more input is available. yylex() then returns 0, indicating EOF.
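The hand-off between yylex() and yywrap() at end of input can be imitated in plain C. The sketch below is not Lex-generated code: demo_yylex, demo_yywrap, demo_count_tokens and the two in-memory input strings are hypothetical stand-ins, used only to show that a return value of 0 from the wrap routine means "continue with the next input" and 1 means "no more input, so the scanner returns 0".

```c
#include <stddef.h>
#include <ctype.h>

/* Hypothetical stand-ins for the Lex runtime; none of these names come from Lex. */
static const char *demo_inputs[] = { "ab 12", "cd" };  /* two in-memory "files" */
static size_t demo_current = 0;                        /* index of the active input */
static const char *demo_pos = NULL;                    /* read position in that input */

/* Plays the role of yywrap(): 0 after switching to another input, 1 when none is left. */
static int demo_yywrap(void)
{
    if (demo_current + 1 < sizeof demo_inputs / sizeof demo_inputs[0]) {
        demo_pos = demo_inputs[++demo_current];
        return 0;
    }
    return 1;
}

/* Plays the role of yylex(): returns 1 for each whitespace-separated word,
   and 0 once every input has been consumed. */
static int demo_yylex(void)
{
    if (demo_pos == NULL) demo_pos = demo_inputs[demo_current];

    for (;;) {
        while (isspace((unsigned char)*demo_pos)) demo_pos++;
        if (*demo_pos != '\0') break;      /* a word starts here */
        if (demo_yywrap()) return 0;       /* end of all input */
    }
    while (*demo_pos != '\0' && !isspace((unsigned char)*demo_pos)) demo_pos++;
    return 1;
}

/* Driver loop: keep calling the scanner until it reports end of input. */
int demo_count_tokens(void)
{
    int count = 0;
    while (demo_yylex() != 0) count++;
    return count;
}
```

With the two inputs above, the driver counts the words of both strings, because the wrap routine returns 0 once before finally returning 1. A real Lex program follows the same loop shape in main(), calling yylex() until it returns 0.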
Step 1: Save the file with the extension '.l'. If the file name is temp, it is saved as
temp.l.
Step 2: Run the 'lex' tool on this file; it produces a C source file named lex.yy.c.
Commands
$ lex sample.l
$ cc lex.yy.c
$ ./a.out a.java
digit [0-9]
letter [A-Za-z]
Datatypes:
{digit}+("E"("+"|"-")?{digit}+) {printf("\n %s is real number", yytext);}
If the token matches the above RE, it is declared a real number.
{digit}+"."{digit}+("E"("+"|"-")?{digit}+) {printf("\n %s is floating point number", yytext);}
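Read as "one or more digits, the letter E, an optional sign, then one or more digits", the real-number pattern can be mirrored by a hand-written check in C. This is a sketch for understanding what the regular expression accepts, not what Lex emits; the names skip_digits and is_exponent_real are hypothetical.

```c
#include <stddef.h>
#include <ctype.h>

/* Hypothetical helper: advances past one or more digits; returns NULL if there are none. */
static const char *skip_digits(const char *p)
{
    if (!isdigit((unsigned char)*p)) return NULL;
    while (isdigit((unsigned char)*p)) p++;
    return p;
}

/* Returns 1 if s has the form digits, 'E', optional '+' or '-', digits; otherwise 0. */
int is_exponent_real(const char *s)
{
    const char *p = skip_digits(s);          /* {digit}+ */
    if (p == NULL || *p != 'E') return 0;    /* "E" */
    p++;
    if (*p == '+' || *p == '-') p++;         /* ("+"|"-")? */
    p = skip_digits(p);                      /* {digit}+ */
    return p != NULL && *p == '\0';          /* nothing may follow */
}
```

Under this reading, "12E+5" and "3E7" are accepted, while "12", "E5" and "12E" are rejected because a required part is missing.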
Escape sequence:-
Relational operators:-
Assignment operator:-
Arithmetic operator:
Special Character:
Identifier:
{letter}({letter}|{digit})* {printf("\n %s is identifier", yytext);}
If the token matches, it is declared an identifier.
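The identifier rule, a letter followed by any number of letters or digits, can be checked by hand in C as a way to see what the pattern accepts. The function name is_identifier is hypothetical, and the sketch assumes the default "C" locale so that isalpha() covers exactly A-Z and a-z, as in the letter definition.

```c
#include <ctype.h>

/* Returns 1 if s is a letter followed by zero or more letters or digits; otherwise 0. */
int is_identifier(const char *s)
{
    if (!isalpha((unsigned char)*s)) return 0;       /* first character: {letter} */
    for (s++; *s != '\0'; s++)
        if (!isalnum((unsigned char)*s)) return 0;   /* rest: ({letter}|{digit})* */
    return 1;
}
```

Because it follows the letter and digit definitions exactly, this check rejects names containing '_' or '$', which full Java identifiers allow; extending the letter definition would be needed to cover them.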
Conclusion: - Hence, we successfully generated the tokens of a Java program using the Lex
analyzer (LEX).
FAQs