Compiler Design: - Top-Down Parsing With A Recursive Descent Parser

This document discusses top-down parsing with a recursive descent parser. It begins by explaining that a parser takes the sequence of tokens from a lexical analyzer and outputs intermediate code representations or error messages. It then describes how top-down and bottom-up parsing work and that recursive descent parsing is simplest if the grammar is not complicated. The document goes on to explain how recursive descent parsing works by defining functions for each grammar rule and uses lookahead, first sets, and follow sets to determine the parsing path. It also discusses techniques for error recovery.

Uploaded by

Nirmala Varadaraju

Compiler Design 5.

Top-Down Parsing with a Recursive Descent Parser

Kanat Bolazar February 2, 2010

Parsing
The lexical analyzer has translated the source program into a sequence of tokens.
The parser must translate the sequence of tokens into an intermediate representation.
Assume the interface is that the parser can call getNextToken to get the next token from the lexical analyzer.
The parser can also call a function named emit that puts out intermediate representations, currently unspecified.

The parser outputs error messages if the syntax of the source program is wrong.

Parsing: Top-Down, Bottom-Up

Given a grammar such as:
E -> 0 | E + E

And a string to parse such as "0 + 0".
A parser can parse top-down, from the start symbol (E above):
E -> E + E -> 0 + E -> 0 + 0

Or parse bottom-up, grouping terminals into the RHS of rules:
0 + 0 <- E + 0 <- E + E <- E

Usually, parsing is done as tokens are read in:

Top-down: After seeing 0, we don't yet know which rule to use; after seeing 0 +, we can expand E to E + E.
Bottom-up: 0 can be reduced to E right away, without seeing +.

Parsing: Top-Down, Bottom-Up

Generally:
top-down is easier to understand and to implement directly
bottom-up is more powerful, allowing more complicated grammars
top-down parsing may require changes to the grammar

Top-down parsing can be done:

programmatically (recursive descent)
by table lookup and transitions

Bottom-up parsing is normally table-driven.
If the grammar is not complicated, the simplest approach is to implement a recursive-descent parser.
A predictive recursive descent parser, the kind described here, does not require backtracking to take alternative paths along the parse (derivation) path.

Recursive Descent Parsing

For every BNF rule (production) of the form
<phrase1> -> E
the parser defines a function to parse phrase1, whose body parses the rule E:
void parsePhrase1( ) { /* parse the rule E */ }

Here E consists of a sequence of non-terminal and terminal symbols.
This requires that the grammar has no left recursion: a left-recursive rule would make parsePhrase1 call itself before consuming any token, so it would never terminate.

Parsing a rule
A sequence of non-terminal and terminal symbols,
Y1 Y2 Y3 ... Yn
is recognized by parsing each symbol in turn.
For each non-terminal symbol, Y, call the corresponding parse function parseY.
For each terminal symbol, y, call a function
expect(y)
that will check if y is the next symbol in the source program.

The terminal symbols are the token types from the lexical analyzer.
If the variable currentsymbol always contains the next token:
expect(y): if (currentsymbol == y) then getNextToken() else SyntaxError()
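The expect mechanism above can be sketched in Python as follows. This is only an illustration: the TokenStream class, the ParseError exception, and the "EOT" end marker are assumptions introduced here, and tokens are assumed to be plain strings.

```python
class ParseError(Exception):
    """Raised when the next token is not the one the grammar requires."""


class TokenStream:
    """Holds the lookahead token in `currentsymbol` (hypothetical helper)."""

    def __init__(self, tokens):
        self._tokens = iter(tokens)
        self.currentsymbol = None
        self.get_next_token()  # load the first lookahead token

    def get_next_token(self):
        # Assumption: "EOT" is returned once the input is exhausted.
        self.currentsymbol = next(self._tokens, "EOT")

    def expect(self, y):
        # Consume y if it is the next symbol, otherwise report a syntax error.
        if self.currentsymbol == y:
            self.get_next_token()
        else:
            raise ParseError(f"expected {y!r}, found {self.currentsymbol!r}")
```

Raising an exception stops at the first error; a parser that should keep going needs an error routine that skips ahead instead.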

Simple parse function example

Suppose that there was a grammar rule
<program> -> class <classname> { <field-decl> <method-decl> }
Then:
parseProgram():
  expect(class);
  parseClassname();
  expect({);
  parseFieldDecl();
  parseMethodDecl();
  expect(});
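A runnable Python sketch of this parse function is given here. The slides do not define the rules for classname, field-decl, or method-decl, so each is stood in for by a single placeholder token ("id", "field", "method"); the Parser class and ParseError are likewise assumptions made for illustration.

```python
class ParseError(Exception):
    pass


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOT"]  # assumed end-of-text marker
        self.pos = 0

    @property
    def currentsymbol(self):
        return self.tokens[self.pos]

    def expect(self, y):
        if self.currentsymbol != y:
            raise ParseError(f"expected {y!r}, found {self.currentsymbol!r}")
        if self.pos < len(self.tokens) - 1:  # never step past EOT
            self.pos += 1

    def parse_program(self):
        # One call per symbol on the right-hand side, in order.
        self.expect("class")
        self.parse_classname()
        self.expect("{")
        self.parse_field_decl()
        self.parse_method_decl()
        self.expect("}")

    # Placeholder rules (hypothetical): each non-terminal is one token.
    def parse_classname(self):
        self.expect("id")

    def parse_field_decl(self):
        self.expect("field")

    def parse_method_decl(self):
        self.expect("method")
```

Calling Parser(["class", "id", "{", "field", "method", "}"]).parse_program() returns normally, while a token list missing the closing brace raises ParseError.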

Look-Ahead
In general, one non-terminal may have more than one production, so it might seem that more than one function is needed to parse that non-terminal.
Instead, we write one function and insist that it can decide which rule to parse just by looking ahead one symbol in the input:
<sentence> -> 'if' '(' <expr> ')' <block> | 'while' '(' <expr> ')' <block> ...

Then parseSentence can have the form

if (currentsymbol == "if") ... // parse first rule
elsif (currentsymbol == "while") ... // parse second rule
...
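The one-symbol look-ahead dispatch can be sketched in Python as below. To keep it self-contained, <expr> is assumed to be a single "id" token and <block> an empty pair of braces; these simplifications, the returned rule names, and ParseError are illustrative assumptions.

```python
class ParseError(Exception):
    """Raised when no rule matches the lookahead token."""


def parse_sentence(tokens):
    """Recognize one <sentence> and return the name of the rule used."""
    pos = 0

    def current():
        return tokens[pos] if pos < len(tokens) else "EOT"

    def expect(y):
        nonlocal pos
        if current() != y:
            raise ParseError(f"expected {y!r}, found {current()!r}")
        pos += 1

    def parse_condition_and_block():
        # Assumption: <expr> is one "id" token and <block> is "{" "}".
        for y in ("(", "id", ")", "{", "}"):
            expect(y)

    if current() == "if":  # one token of lookahead selects the rule
        expect("if")
        parse_condition_and_block()
        return "if-rule"
    elif current() == "while":
        expect("while")
        parse_condition_and_block()
        return "while-rule"
    raise ParseError(f"no rule for <sentence> starts with {current()!r}")
```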

First and Follow Sets

First(E) is the set of terminal symbols that may appear at the beginning of a sentence derived from E.
It may also include ε if E can generate an empty string.

Follow(<N>), where <N> is a non-terminal symbol in the grammar, is the set of terminal symbols that can follow immediately after any sentence derived from any rule of N.
In this grammar:
E -> 0 | E + E

First(0) = {0}
First(E + E) = {0}
First(E) = {0}
Follow(E) = {+, EOF}

Grammar Restriction 1
Grammar Restriction 1 (for top-down parsing): The First sets of alternative rules for the same LHS must be disjoint (so we know which path to take upon seeing a first terminal symbol/token).
Notice: This is not true in the grammar above. Upon seeing 0, we don't know whether to take the 0 path or the E + E path.
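The restriction can be checked mechanically once the First sets of the alternatives are known. The following Python sketch assumes the First sets are already available as Python sets; the function name first_sets_disjoint is introduced here for illustration.

```python
from itertools import combinations


def first_sets_disjoint(first_sets):
    """Return True if no terminal appears in the First sets of two alternatives."""
    return all(a.isdisjoint(b) for a, b in combinations(first_sets, 2))


# E -> 0 | E + E : both alternatives can start with 0, so the check fails.
expression_alternatives = [{"0"}, {"0"}]

# <sentence> -> 'if' ... | 'while' ... : the alternatives start differently.
sentence_alternatives = [{"if"}, {"while"}]
```

Here first_sets_disjoint(expression_alternatives) is False and first_sets_disjoint(sentence_alternatives) is True.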

Recognizing Possibly Empty Sentences

In a strict context-free grammar, there may be rules in which the rhs is ε, the empty string.
Or in an extended BNF grammar, there may be the specification that some part of the rhs of the rule occurs 0 or 1 times:
<phrase1> -> [ <phrase2> ]
Then we recognize the possibly empty occurrence of phrase2 by
if (currentsymbol is in First(<phrase2>)) then parsePhrase2()
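As a small Python illustration of the optional case, consider a hypothetical rule <decl> -> 'var' id [ <init> ] ';' with <init> -> '=' const. This grammar does not come from the slides; it is assumed here only to show the membership test against First(<init>).

```python
class ParseError(Exception):
    pass


FIRST_INIT = {"="}  # First(<init>) for the assumed rule <init> -> '=' const


def parse_decl(tokens):
    """Recognize <decl> and report whether the optional <init> was present."""
    pos = 0

    def current():
        return tokens[pos] if pos < len(tokens) else "EOT"

    def expect(y):
        nonlocal pos
        if current() != y:
            raise ParseError(f"expected {y!r}, found {current()!r}")
        pos += 1

    def parse_init():
        expect("=")
        expect("const")

    expect("var")
    expect("id")
    has_init = current() in FIRST_INIT  # parse <init> only if it can start here
    if has_init:
        parse_init()
    expect(";")
    return has_init
```

The test is safe because ';', the only symbol that can follow <init> in this rule, is not in First(<init>).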

Recognizing Sequences
In a context-free grammar, you often have rules that specify that any number of occurrences of a phrase can occur:
<arglist> -> <arg> <arglist> | ε
In extended BNF, we replace this with the * to indicate 0 or more occurrences:
<arg>*
We can recognize these sequences by using iteration. If there is a rule of the form
<phrase1> -> <phrase2>*
we can recognize the phrase2 occurrences by
while (currentsymbol is in First(<phrase2>)) do parsePhrase2()
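The iteration can be sketched in Python for the <arg>* case. The slides do not define <arg>, so the sketch assumes <arg> -> id | const, which makes each argument exactly one token; the function name and return shape are also assumptions.

```python
FIRST_ARG = {"id", "const"}  # First(<arg>) under the assumed rule <arg> -> id | const


def parse_arglist(tokens):
    """Recognize <arg>* at the start of tokens; return (args, remaining tokens)."""
    pos = 0
    args = []
    # Keep parsing <arg> for as long as the lookahead can start one.
    while pos < len(tokens) and tokens[pos] in FIRST_ARG:
        args.append(tokens[pos])  # parsing <arg> consumes one token here
        pos += 1
    return args, tokens[pos:]
```

For example, parse_arglist(["id", "const", ")"]) returns (["id", "const"], [")"]), and an input that starts with ")" yields an empty argument list without any error.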

Grammar Restriction 2
In either of the previous cases, where the grammar symbol may generate sentences which are empty, the grammar must be restricted.
Suppose that <phrase2> is the symbol that can occur 0 times.
Require that the sets First(<phrase2>) and Follow(<phrase2>) be disjoint.
Grammar Restriction 2: If a nonterminal may occur 0 times, its First and Follow sets must be disjoint (so we know whether to parse it or skip it on seeing a terminal symbol/token).

Multiple Rules
Suppose that there is a nonterminal symbol with multiple rules where each rhs is nonempty:
<phrase1> -> E1 | E2 | E3 | . . . | En
Then we can write ParsePhrase1 as follows:
if (currentsymbol is in First( E1 )) then ParseE1
elsif (currentsymbol is in First( E2 )) then ParseE2
...
elsif (currentsymbol is in First( En )) then ParseEn
else Syntax Error
If any rhs can be empty, then don't give the syntax error.
Remember the first grammar restriction:
The sets First( E1 ), ..., First( En ) must be disjoint

Example Expression Grammar

Suppose that we have a grammar
<expr> -> <term> { <op> <term> }*
<term> -> const | ( <expr> )
<op> -> + | -
Parsing functions:
void parseExpr ( )
{ parseTerm();
  while (cursym in First(<op>))
  { getNextToken();
    parseTerm();
  }
}

void parseTerm ( )
{ if (cursym == const) then getNextToken()
  else if (cursym == '(') then
  { getNextToken();
    parseExpr();
    expect(')')
  }
  else SyntaxError()
}
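A runnable Python version of these two parsing functions might look like the sketch below. Tokens are assumed to be the strings "const", "+", "-", "(" and ")"; the ExprParser class, the accepts helper, ParseError, and the "EOT" marker are names introduced for illustration.

```python
class ParseError(Exception):
    pass


FIRST_OP = {"+", "-"}  # First(<op>)


class ExprParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOT"]  # assumed end-of-text marker
        self.pos = 0

    @property
    def cursym(self):
        return self.tokens[self.pos]

    def get_next_token(self):
        if self.cursym != "EOT":  # stay on EOT once it is reached
            self.pos += 1

    def expect(self, y):
        if self.cursym != y:
            raise ParseError(f"expected {y!r}, found {self.cursym!r}")
        self.get_next_token()

    def parse_expr(self):
        # <expr> -> <term> { <op> <term> }*
        self.parse_term()
        while self.cursym in FIRST_OP:
            self.get_next_token()  # consume the operator
            self.parse_term()

    def parse_term(self):
        # <term> -> const | ( <expr> )
        if self.cursym == "const":
            self.get_next_token()
        elif self.cursym == "(":
            self.get_next_token()
            self.parse_expr()
            self.expect(")")
        else:
            raise ParseError(f"a term cannot start with {self.cursym!r}")


def accepts(tokens):
    """Return True if the whole token list is one <expr>."""
    parser = ExprParser(tokens)
    try:
        parser.parse_expr()
        parser.expect("EOT")  # reject trailing tokens
    except ParseError:
        return False
    return True
```

With this sketch, accepts("const + ( const - const )".split()) is True, while accepts(["const", "+"]) is False because a term is missing after the operator.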

First Sets
Here we give a more formal, and more detailed, definition of a First set, starting with any non-terminal.
If we have a set of rules for a non-terminal <phrase1>,
<phrase1> -> E1 | E2 | E3 | . . . | En
then First(<phrase1>) = First(E1) + . . . + First(En) (set union)
For any right-hand side, Y1 Y2 Y3 ... Yn, we make cases on the form of the rule:
First(a Y2 Y3 ... Yn) = {a}, for any terminal symbol a
First(N Y2 Y3 ... Yn) = First(N), for any non-terminal N that does not generate the empty string
First([N] M) = First(N) + First(M) (0 or 1 occurrence of N)
First({N}* M) = First(N) + First(M) (0 or more occurrences)
First(ε) = {ε}
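These cases can be turned into a fixed-point computation. The Python sketch below handles plain BNF only, so it assumes the [N] and {N}* forms have been rewritten into ordinary productions with ε alternatives; the grammar encoding (a dict from non-terminal to a list of tuples, with the empty tuple standing for ε) is a choice made here, not something the slides prescribe.

```python
EPSILON = "ε"


def first_of_sequence(symbols, first, grammar):
    """First set of a sequence Y1 ... Yn, given the current First sets."""
    result = set()
    for sym in symbols:
        if sym not in grammar:  # terminal: it starts the sequence
            result.add(sym)
            return result
        result |= first[sym] - {EPSILON}
        if EPSILON not in first[sym]:  # sym cannot vanish, so stop here
            return result
    result.add(EPSILON)  # every symbol can be empty (or the sequence is empty)
    return result


def compute_first(grammar):
    """Iterate until no First set grows any further."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                new = first_of_sequence(rhs, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


# <expr> -> <term> { <op> <term> }* rewritten with a helper non-terminal "rest".
GRAMMAR = {
    "expr": [("term", "rest")],
    "rest": [("op", "term", "rest"), ()],
    "term": [("const",), ("(", "expr", ")")],
    "op": [("+",), ("-",)],
}
```

For this grammar, compute_first(GRAMMAR) gives {"const", "("} for both expr and term, {"+", "-"} for op, and {"+", "-", "ε"} for rest.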

Follow Sets
To define the set Follow(T), examine the cases of where the non-terminal T may appear on the rhs of a rule in the grammar.
N -> S T U or N -> S [T] U or N -> S {T}* U
If U never generates an empty string, then Follow(T) includes First(U)
If U can generate an empty string, then Follow(T) includes First(U) (without ε) and Follow(N)
N -> S T or N -> S [ T ] or N -> S { T }*
Follow(T) includes Follow(N)
In the { T }* forms, T can also be followed by another T, so Follow(T) also includes First(T)
The Follow set of the start symbol should contain EOT, the end of text marker

Include the Follow set of all occurrences of T from the rhs of rules to make the set Follow(T)
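A Python sketch of the Follow computation is given here, again for plain BNF with the [T] and {T}* forms assumed to be rewritten into ordinary productions. The grammar encoding, the helper non-terminal "rest", and the function names are assumptions; First sets are recomputed inside the block so that it stands alone.

```python
EPSILON = "ε"
EOT = "EOT"


def first_of_sequence(symbols, first, grammar):
    result = set()
    for sym in symbols:
        if sym not in grammar:  # terminal
            result.add(sym)
            return result
        result |= first[sym] - {EPSILON}
        if EPSILON not in first[sym]:
            return result
    result.add(EPSILON)
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for rhs in productions:
                new = first_of_sequence(rhs, first, grammar)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, start):
    first = compute_first(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(EOT)  # the start symbol is followed by end of text
    changed = True
    while changed:
        changed = False
        for n, productions in grammar.items():
            for rhs in productions:
                for i, t in enumerate(rhs):
                    if t not in grammar:
                        continue  # only non-terminals have Follow sets
                    first_u = first_of_sequence(rhs[i + 1:], first, grammar)
                    new = first_u - {EPSILON}
                    if EPSILON in first_u:  # U is empty or can vanish
                        new |= follow[n]
                    if not new <= follow[t]:
                        follow[t] |= new
                        changed = True
    return follow


GRAMMAR = {
    "expr": [("term", "rest")],
    "rest": [("op", "term", "rest"), ()],
    "term": [("const",), ("(", "expr", ")")],
    "op": [("+",), ("-",)],
}
```

For this grammar with start symbol expr, Follow(expr) comes out as {EOT, )}, Follow(term) as {+, -, EOT, )}, and Follow(op) as {const, (}.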

Simple Error Recovery

To enable the parser to keep parsing after a syntax error, the parser should be able to skip symbols until it finds a synchronizing symbol.
E.g. in parsing a sequence of declarations or statements, skipping to a ; should enable the parser to start parsing the next declaration or statement

General Error Recovery

A more general technique allows the syntax error routine to be given a list of symbols that it should skip to.
void syntaxError(String msg, Symbols StopSymbols)
{ give error with msg;
  while (! (currentsymbol in StopSymbols)) { getNextToken(); }
}
assuming that there is a type called Symbols of terminal symbols
we may want to pass an error code instead of a message

Each recursive descent procedure should also take StopSymbols as a parameter, and may modify these to pass to any procedure that it calls.
E.g. if there is a procedure to parse the parameter list of a method call, then it can have ) as a stop symbol
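The stop-symbol technique can be sketched in Python as follows. The statement rule <statement> -> id = const ; is a hypothetical example chosen for brevity, and the Parser class, its errors list, and the "EOT" marker are assumptions made for this illustration.

```python
EOT = "EOT"


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + [EOT]
        self.pos = 0
        self.errors = []  # collected messages instead of stopping at the first

    @property
    def currentsymbol(self):
        return self.tokens[self.pos]

    def get_next_token(self):
        if self.currentsymbol != EOT:  # never skip past the end of text
            self.pos += 1

    def syntax_error(self, msg, stop_symbols):
        self.errors.append(msg)
        while self.currentsymbol not in stop_symbols:
            self.get_next_token()

    def expect(self, y, stop_symbols):
        if self.currentsymbol == y:
            self.get_next_token()
            return True
        self.syntax_error(f"expected {y}, found {self.currentsymbol}", stop_symbols)
        return False

    def parse_statement(self, stop_symbols):
        # Hypothetical rule: <statement> -> id = const ;
        stops = stop_symbols | {";"}  # add this rule's own synchronizing symbol
        # all() stops at the first failed expect, after recovery has run.
        ok = all(self.expect(t, stops) for t in ("id", "=", "const", ";"))
        if not ok and self.currentsymbol == ";":
            self.get_next_token()  # resume after the synchronizing ';'

    def parse_statements(self):
        while self.currentsymbol != EOT:
            self.parse_statement({EOT})
```

Given the tokens of "id = const ; id const ; id = const ;", the second statement produces one error, the parser skips to its ';', and the third statement is still parsed.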

Stop Symbols
If the parser is trying to parse the rhs E of a non-terminal N -> E, then the stop symbols are those symbols which the parser is prepared to recognize after a sentence generated by E.
Remove anything ambiguous from Follow(N)

The stop symbols should always also contain the end of text symbol, EOT, so that the syntax error routine never tries to skip over symbols past the end of the program.
