
Recognition of Token in Lexical Analysis

Smarak Basak    11500222009
Tanmay Paul     11500222001
Subhojit Pachal 11500222014
Nishita Dey     11500222004


Introduction to Lexical Analysis

Lexical Analysis is the first phase of the compiler, also known as the
scanner. It converts the high-level input program into a sequence of
tokens.
1. Lexical analysis is implemented by taking each lexeme and analysing
it to produce tokens.
2. The output is a sequence of tokens (identified properly in the
symbol table) that is sent to the parser for syntax analysis.

Tokens obtained during lexical analysis are recognized by Finite
Automata. A Finite Automaton (FA) is a simple, idealized machine that
can be used to recognize patterns within input taken from a character
set or alphabet.
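As a minimal sketch of pattern recognition by a finite automaton, the hand-coded DFA below accepts identifiers (a letter or underscore followed by letters, digits, or underscores). The state names are illustrative assumptions, not taken from the slides.

```python
# Hedged sketch: a two-state DFA that recognizes identifiers.
# States: "start" (nothing consumed yet) and "in_id" (accepting).

def is_identifier(lexeme: str) -> bool:
    state = "start"
    for ch in lexeme:
        if state == "start":
            if ch.isalpha() or ch == "_":
                state = "in_id"      # first char must be letter/underscore
            else:
                return False         # dead state: reject immediately
        elif state == "in_id":
            if not (ch.isalnum() or ch == "_"):
                return False         # illegal character inside identifier
    return state == "in_id"          # accept only if at least one char read

print(is_identifier("abs_zero_Kelvin"))  # True
print(is_identifier("2fast"))            # False
```

A real scanner combines many such automata (for numbers, operators, keywords) into one machine, usually generated from regular expressions.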
What is a Token?
A lexical token is a sequence of characters that can be treated as a
unit in the grammar of a programming language. Examples of tokens:
•Identifiers (id, number, real, . . . )
•Punctuation tokens ((), {}, ;, . . . )
•Operators (*, +, -, /)
•Alphabetic tokens (keywords)

Examples of non-tokens:
•Comments, preprocessor directives, macros,
blanks, tabs, newlines, etc.

Lexeme: A sequence of input characters that is matched by a pattern
and comprises a single token is called a lexeme, e.g. “float”,
“abs_zero_Kelvin”, “=”, “-”, “273”, “;”.
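The lexemes above can be produced by a simple pattern-based splitter. The statement below is an assumed reconstruction from those lexemes; the slide lists only the lexemes themselves.

```python
import re

# Hedged sketch: splitting a statement into lexemes with one regular
# expression. The statement is an assumption reconstructed from the
# slide's lexeme list.
statement = 'float abs_zero_Kelvin = -273;'

# Alternatives: identifier/keyword, integer literal, single-char symbols.
lexeme_pattern = re.compile(r'[A-Za-z_]\w*|\d+|[=\-;]')
lexemes = lexeme_pattern.findall(statement)
print(lexemes)  # ['float', 'abs_zero_Kelvin', '=', '-', '273', ';']
```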
How Does a Lexical Analyzer Work?
1. Input preprocessing: This stage involves cleaning up the input text
and preparing it for lexical analysis.
2. Tokenization: This is the process of breaking the input text into a
sequence of tokens.
3. Token classification: In this stage, the lexer determines the type
of each token.
4. Token validation: In this stage, the lexer checks that each token is
valid according to the rules of the programming language.
5. Output generation: In this final stage, the lexer generates the
output of the lexical analysis process, which is typically a list of
tokens.
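The five stages above can be sketched as a minimal lexer for a tiny C-like language. The token categories and keyword list are illustrative assumptions, not a definitive implementation.

```python
import re

# Hedged sketch of the five stages for a tiny C-like language.
KEYWORDS = {"int", "float", "return"}   # assumed keyword list

TOKEN_SPEC = [
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[+\-*/=]"),
    ("PUNCTUATOR", r"[(){};,]"),
    ("SKIP",       r"\s+"),             # stage 1: discard blanks/tabs/newlines
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def lex(source: str):
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)            # stage 2: tokenization
        if m is None:                            # stage 4: invalid input
            raise SyntaxError(f"illegal character {source[pos]!r} at {pos}")
        kind = m.lastgroup                       # stage 3: classification
        if kind == "IDENTIFIER" and m.group() in KEYWORDS:
            kind = "KEYWORD"
        if kind != "SKIP":
            tokens.append((kind, m.group()))     # stage 5: output token list
        pos = m.end()
    return tokens

print(lex("int a = b - 2;"))
```

Running it on `int a = b - 2;` yields seven (type, lexeme) pairs, with whitespace dropped and `int` reclassified from identifier to keyword.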
You can observe that comments have been omitted from the token stream.
As another example, consider the printf statement below.

There are 5 valid tokens in this printf statement.

Exercise 1: Count the number of tokens: identifiers, punctuators,
semicolons.

Exercise 2: This line will be compiled and sent to the first phase of
the compiler, i.e. the lexical analyser, which then sends its tokens to
the parser for further compilation.
We can represent the statement in the form of lexemes and tokens as
under:

Lexeme   Token          Lexeme   Token
&&       LOGICAL_AND    a        IDENTIFIER
(        LPAREN         =        ASSIGNMENT
a        IDENTIFIER     a        IDENTIFIER
>=       COMPARISON     -        ARITHMETIC
b        IDENTIFIER     2        INTEGER
)        RPAREN         ;        SEMICOLON
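A lexer could emit (lexeme, token) pairs like those in the table as follows. The input fragment is an assumed reconstruction from the table's lexeme columns, and the token names broadly follow the table rather than any standard naming scheme.

```python
import re

# Hedged sketch: emit (lexeme, token) pairs for a fragment reconstructed
# from the table. Longer operators (>=, &&) must precede their prefixes.
TOKEN_SPEC = [
    ("LOGICAL_AND", r"&&"),
    ("COMPARISON",  r">="),
    ("ASSIGNMENT",  r"="),
    ("ARITHMETIC",  r"-"),
    ("LPAREN",      r"\("),
    ("RPAREN",      r"\)"),
    ("SEMICOLON",   r";"),
    ("INTEGER",     r"\d+"),
    ("IDENTIFIER",  r"[A-Za-z_]\w*"),
    ("SKIP",        r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def pairs(source: str):
    return [(m.group(), m.lastgroup)
            for m in MASTER.finditer(source)
            if m.lastgroup != "SKIP"]

for lexeme, token in pairs("&& (a >= b) a = a - 2 ;"):
    print(lexeme, token)
```

Note the ordering of `TOKEN_SPEC`: if `=` were listed before `>=`, the lexer would wrongly split `>=` into two tokens, which is why real scanners apply a longest-match rule.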
Advantages
1. Simplifies Parsing: Breaking the source code into tokens makes it
easier for the later phases to understand and work with the code.
2. Error Detection: Lexical analysis detects lexical errors, such as
misspelled keywords or undefined symbols, early in the compilation
process.
3. Efficiency: Once the source code is converted into tokens,
subsequent phases of compilation or interpretation can operate more
efficiently.
Disadvantages
1. Limited Context: Lexical analysis operates on individual tokens and
does not consider the overall context of the code.
2. Overhead: Although lexical analysis is necessary for compilation or
interpretation, it adds an extra layer of processing overhead.
3. Debugging Challenges: Lexical errors detected during the analysis
phase may not always provide clear indications of their origin in the
original source code.
