Varshil Shah Exp 1


BHARATIYA VIDYA BHAVAN’S

SARDAR PATEL INSTITUTE OF TECHNOLOGY


(Empowered Autonomous Institute Affiliated to University of Mumbai)
[Knowledge is Nectar]

Department of Computer Engineering

Course - System Programming and Compiler Construction (SPCC)

UID 2022301013

Name Varshil Anand Shah

Class and Batch TE Computer Engineering - Batch A

Date 19 January 2024

Lab # 1

Aim Design a lexical analyser for different programming languages and implement it using the Lex tool.
Assignment

Objective The objective of this experiment is to create a lexer using Lex (or Flex) that tokenizes a
Java source code file. The lexer identifies Java keywords (if, else, while, etc.) and
identifiers (variable names, class names, etc.), along with integer constants, operators and
punctuation. For each recognized token, the program prints the token's category and, where
relevant, the matched text.

Theory A lexical analyzer, commonly known as a lexer, performs the first phase of the compilation
process. Its primary task is to break the source code into tokens, which are the
smallest meaningful units of a programming language.

Token Definition -
1. Keywords (if, else, while): Recognized reserved words in the language.
2. Identifiers: Sequences of letters, digits, and underscores, starting with a letter.
3. Numeric Literals: Sequences of digits representing numbers.
4. Operators (+, -, *, /, =, ==, !=, <, >): Arithmetic, assignment and relational operators.
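The four token classes above can be sketched in plain C as a function that classifies one complete lexeme. This is only an illustration of the definitions, not the lexer used in this experiment; the names TokenType and classify are hypothetical, and only the three example keywords from the list are included.

```c
#include <ctype.h>
#include <string.h>

/* Token categories from the list above (names are illustrative). */
typedef enum {
    TOK_KEYWORD, TOK_IDENTIFIER, TOK_NUMBER, TOK_OPERATOR, TOK_UNKNOWN
} TokenType;

/* Classify one complete lexeme according to the token definitions. */
TokenType classify(const char *s) {
    static const char *keywords[] = { "if", "else", "while" };
    size_t i, n = strlen(s);

    if (n == 0) return TOK_UNKNOWN;

    /* Keywords are checked first so that "if" is not reported as an identifier. */
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(s, keywords[i]) == 0) return TOK_KEYWORD;

    /* Identifier: a letter followed by letters, digits or underscores. */
    if (isalpha((unsigned char)s[0])) {
        for (i = 1; i < n; i++)
            if (!isalnum((unsigned char)s[i]) && s[i] != '_') return TOK_UNKNOWN;
        return TOK_IDENTIFIER;
    }

    /* Numeric literal: digits only. */
    if (isdigit((unsigned char)s[0])) {
        for (i = 1; i < n; i++)
            if (!isdigit((unsigned char)s[i])) return TOK_UNKNOWN;
        return TOK_NUMBER;
    }

    /* Single-character arithmetic operators. */
    if (n == 1 && strchr("+-*/", s[0]) != NULL) return TOK_OPERATOR;

    return TOK_UNKNOWN;
}
```

For example, classify("time") yields TOK_IDENTIFIER and classify("22") yields TOK_NUMBER, while classify("2x") yields TOK_UNKNOWN because an identifier must start with a letter.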

Lexical Rules:
1. Regular expressions define the patterns for each token type.
2. The lexer uses these rules to identify and categorize tokens.
3. Whitespace and comments are ignored to streamline tokenization.
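To see what such patterns accept, the sketch below checks whole strings against two token patterns using the POSIX <regex.h> functions. This is only an illustration: flex does not call these functions, it translates the patterns into a table-driven automaton when the scanner is generated. The helper matches and the constants IDENTIFIER_RE and NUMBER_RE are hypothetical names.

```c
#include <regex.h>
#include <stddef.h>

/* Anchored POSIX extended patterns mirroring the token definitions. */
static const char IDENTIFIER_RE[] = "^[A-Za-z][A-Za-z0-9_]*$";
static const char NUMBER_RE[]     = "^[0-9]+$";

/* Return 1 if the whole string s matches the pattern, 0 otherwise. */
int matches(const char *pattern, const char *s) {
    regex_t re;
    int ok;

    /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

With these patterns, matches(IDENTIFIER_RE, "time") returns 1 and matches(NUMBER_RE, "22a") returns 0.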

Token Output:
1. For each recognized token, the lexer outputs a message indicating its type.
2. Identifiers and numeric literals include their actual values.
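A minimal sketch of this output convention in C is shown below: tokens whose text varies carry their lexeme in the message, while fixed tokens such as the semicolon print only their name. The enum Kind and the function format_token are hypothetical; the message strings follow the ones used in the Lex file of this experiment.

```c
#include <stdio.h>

/* A few token kinds, enough to show both message styles. */
typedef enum { T_KEYWORD, T_IDENTIFIER, T_INTEGER, T_SEMICOLON } Kind;

/* Write the report line for one token into buf.
   Returns the message length, as reported by snprintf, or -1 for an unknown kind. */
int format_token(char *buf, size_t size, Kind kind, const char *lexeme) {
    switch (kind) {
    case T_KEYWORD:    return snprintf(buf, size, "Keyword: %s", lexeme);
    case T_IDENTIFIER: return snprintf(buf, size, "Identifier: %s", lexeme);
    case T_INTEGER:    return snprintf(buf, size, "Integer Constant: %s", lexeme);
    case T_SEMICOLON:  return snprintf(buf, size, "Semicolon"); /* lexeme not needed */
    }
    return -1;
}
```

Writing into a buffer instead of printing directly keeps the formatting step separate from the output step, which makes it easy to check the messages.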

Implementation Steps -

Lexer Generation -
1. The Lex tool is employed to generate a lexical analyzer from the provided
specifications.
2. For this experiment, we used flex to process the lex (.l) file, e.g. flex lexer.l
3. Each lexical rule pairs a regular expression with an action written in C.
4. Running flex on the .l file automatically generates a new file called lex.yy.c.

Compilation -
1. With the lex.yy.c file generated, the next step is to produce the executable file.
2. For this, we use the gcc command, e.g. gcc lex.yy.c
-o <executable_file_name> -lfl.
3. This command creates the executable file (a .exe file on Windows).
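The -lfl flag links the flex support library, which supplies a default yywrap(), the function the generated scanner calls when it reaches the end of its input. A possible alternative, not the one used in this experiment, is to define yywrap() yourself in the user-code section of the .l file, in which case the -lfl flag can typically be dropped:

```c
/* Called by the generated scanner at end of input.
   Returning 1 means there are no further input files to scan. */
int yywrap(void) {
    return 1;
}
```

This only works without -lfl when the .l file also defines its own main(), as the Lex file in this experiment does.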

Execution -
1. To run the executable, we need to provide some input data so that it can scan it for
keywords, identifiers and other tokens.
2. The assignment called for the Java language, so the input.txt file contains a short
Java program.
3. Run the command - ./lexer input.txt

Output -
1. The terminal displays every keyword, identifier and other token found in the input,
followed by the total number of keywords.

Implementation / Lex file (lexer.l) -


Code
%{
#include <stdio.h>
int keywords = 0; /* running count of keywords matched */
%}

DIGIT      [0-9]
NUMBER     {DIGIT}+
TEXT       [A-Za-z]
KEYWORDS   "class"|"static"|"void"|"int"|"float"|"double"|"long"|"String"|"if"|"else"|"switch"|"case"|"for"|"while"|"do"
IDENTIFIER {TEXT}({DIGIT}|{TEXT}|"_")*

%%
[ \t\n] ; /* Ignore whitespace */

"/*"([^*]|\*+[^*/])*\*+"/" ; /* Ignore block comments */

\"[^"\n]*\" { printf("String: %s\n", yytext); } /* String literal, no escape handling */

{KEYWORDS} { printf("Keyword: %s\n", yytext); keywords++; }


"=" { printf("Assignment Operator\n"); }
"{" { printf("Left Curly Brace\n"); }
"}" { printf("Right Curly Brace\n"); }
"(" { printf("Left Parenthesis\n"); }
")" { printf("Right Parenthesis\n"); }
";" { printf("Semicolon\n"); }
{NUMBER} { printf("Integer Constant: %s\n", yytext); }
{IDENTIFIER} { printf("Identifier: %s\n", yytext); }

"+" { printf("Addition Operator\n"); }
"-" { printf("Subtraction Operator\n"); }
"*" { printf("Multiplication Operator\n"); }
"/" { printf("Division Operator\n"); }
"==" { printf("Equality Operator\n"); }
"!=" { printf("Inequality Operator\n"); }
"<" { printf("Less Than Operator\n"); }
">" { printf("Greater Than Operator\n"); }

. { printf("Unrecognized character: %s\n", yytext); }

%%

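int main(int argc, char *argv[]) {
    /* The input file name is required as the first argument. */
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <input_file>\n", argv[0]);
        return 1;
    }

    yyin = fopen(argv[1], "r");
    if (!yyin) {
        perror("Error opening file");
        return 1;
    }

    yylex(); /* scan the whole file, running the rule actions */
    printf("Total keywords used: %d\n", keywords);
    fclose(yyin);
    return 0;
}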
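Two matching rules of flex explain how the operator rules above interact: when several patterns match at the same position, the longest match wins, and when two patterns match the same length, the rule listed first wins. This is why "==" is reported as an Equality Operator rather than two Assignment Operators, and why the keyword rule is listed before the identifier rule. The following C sketch imitates the longest-match choice for a few of the operators; the function longest_operator is a hypothetical helper, not part of the generated scanner.

```c
#include <string.h>

/* Return the length of the longest operator that starts at s (0 if none)
   and store its description in *name (NULL if none). */
size_t longest_operator(const char *s, const char **name) {
    static const struct { const char *text; const char *name; } ops[] = {
        { "=",  "Assignment Operator" },
        { "==", "Equality Operator" },
        { "!=", "Inequality Operator" },
        { "<",  "Less Than Operator" },
        { ">",  "Greater Than Operator" },
    };
    size_t best = 0, i;

    *name = NULL;
    for (i = 0; i < sizeof ops / sizeof ops[0]; i++) {
        size_t len = strlen(ops[i].text);
        /* Only a strictly longer match replaces the current one,
           so on a tie the entry listed first is kept. */
        if (len > best && strncmp(s, ops[i].text, len) == 0) {
            best = len;
            *name = ops[i].name;
        }
    }
    return best;
}
```

Called on the text "== 18", the function returns 2 and selects the Equality Operator, even though the single "=" entry also matches at that position.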

Input file (input.txt) -


class Main {
    public static void main(String args[]) {
        int time = 22;
        if (time < 10) {
            System.out.println("UID: 2022301013");
        } else if (time == 18) {
            System.out.println("Name: Varshil Shah");
        } else {
            System.out.println("TY COMPS. DIV B and Batch A");
        }
    }
}

Output Final output -

Commands -
flex lexer.l
gcc lex.yy.c -o lexer -lfl
./lexer input.txt

Output -
Keyword: class
Identifier: Main
Left Curly Brace
Identifier: public
Keyword: static
Keyword: void
Identifier: main
Left Parenthesis
Keyword: String
Identifier: args
[]Right Parenthesis
Left Curly Brace
Keyword: int
Identifier: time
Assignment Operator
22Semicolon
Keyword: if
Left Parenthesis
Identifier: time
Less Than Operator
10Right Parenthesis
Left Curly Brace
Identifier: System
.Identifier: out
.Identifier: println
Left Parenthesis
String: "UID: 2022301013"
Right Parenthesis
Semicolon
Right Curly Brace
Keyword: else
Keyword: if
Left Parenthesis
Identifier: time
Equality Operator
18Right Parenthesis
Left Curly Brace
Identifier: System
.Identifier: out
.Identifier: println
Left Parenthesis
String: "Name: Varshil Shah"
Right Parenthesis
Semicolon
Right Curly Brace
Keyword: else
Left Curly Brace
Identifier: System
.Identifier: out
.Identifier: println
Left Parenthesis
String: "TY COMPS. DIV B and Batch A"
Right Parenthesis
Semicolon
Right Curly Brace
Right Curly Brace
Right Curly Brace
Total keywords used: 9
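The keyword total can be cross-checked independently of the lexer. The sketch below is a hand-written C counter, assuming the same keyword list as the Lex file and assuming that text inside double quotes is skipped; count_keywords and is_keyword are hypothetical helper names.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if the word of the given length is in the keyword list. */
static int is_keyword(const char *w, size_t len) {
    static const char *kw[] = {
        "class", "static", "void", "int", "float", "double", "long", "String",
        "if", "else", "switch", "case", "for", "while", "do"
    };
    size_t i;
    for (i = 0; i < sizeof kw / sizeof kw[0]; i++)
        if (strlen(kw[i]) == len && strncmp(w, kw[i], len) == 0)
            return 1;
    return 0;
}

/* Count whole-word keywords in src, ignoring string literals. */
int count_keywords(const char *src) {
    int count = 0;
    const char *p = src;

    while (*p) {
        if (*p == '"') {                           /* skip a string literal */
            p++;
            while (*p && *p != '"') p++;
            if (*p) p++;
        } else if (isalpha((unsigned char)*p)) {   /* scan one whole word */
            const char *start = p;
            while (isalnum((unsigned char)*p) || *p == '_') p++;
            if (is_keyword(start, (size_t)(p - start))) count++;
        } else {
            p++;                                   /* any other character */
        }
    }
    return count;
}
```

Applied to the input file above, the counter finds class, static, void, String, int, if, else, if and else, which agrees with the reported total of 9. Words such as println are not counted because the comparison is on whole words, not substrings.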

Conclusion This experiment demonstrates the creation of a basic lexical analyzer using the
Lex tool, covering the essential steps of tokenizing a programming language: writing the token
patterns, generating and compiling the scanner, and producing meaningful output for the
language constructs in a Java program.

References 1. GeeksforGeeks -
https://www.geeksforgeeks.org/what-is-lex-in-compiler-design
