Midterm 2011 Solution
Solution:
a) After some basic simplifications, which consist mostly of eliminating consecutive ε-transitions, we arrive at the NFA depicted below.
[Figure: simplified NFA. Recoverable labels: start state 0; states 1, 5, 6, 10, 11, 12, 13; edges labeled digit.]
b) For simplicity, this DFA omits the edges that lead to the error state, as they are obvious from the context. For instance, out of state {0,1,2,5,6,7,8} there are many edges to the error state, labeled with all the characters except E or e.
[Figure: DFA with states {0,1,2,5,6,7,8}, {3,5,6,7,8}, {4,5,6,7,8}, {9,11}, {10,11}, {11,12,13} and error; edges labeled E, e, + and digit. Figure note: a missing transition goes to the error state.]
c) The sequence of DFAs resulting from the refinement, using selected discriminating input tokens, is shown below.
[Figure: successive refinements of the DFA into partitions P1 through P5 over the states {0,1,2,5,6,7,8}, {3,5,6,7,8}, {4,5,6,7,8}, {9,11}, {10,11}, {11,12,13} and error; edges labeled E, e, + and digit. Figure note: a missing transition goes to the error state.]
(c) Second/third partition using '+'/'-' as well as 'E' and 'e' as discriminating inputs.
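The refinement procedure used in part c) can be sketched in a few lines. This is a minimal illustration and not the exam's own DFA: the five-state automaton below (states A to E over the inputs a and b) is a hypothetical example, and a missing entry in the transition map stands for the implicit error state.

```python
def refine(states, alphabet, delta, accepting):
    """Split groups of states until no input symbol tells two members apart."""
    # Initial partition: accepting versus non-accepting states.
    partition = [g for g in (set(accepting), set(states) - set(accepting)) if g]
    changed = True
    while changed:
        changed = False
        for group in partition:
            for symbol in alphabet:
                # Bucket the states by the group their transition lands in.
                buckets = {}
                for s in group:
                    target = delta.get((s, symbol))  # None models the error state
                    key = next(
                        (i for i, g in enumerate(partition) if target in g), -1
                    )
                    buckets.setdefault(key, set()).add(s)
                if len(buckets) > 1:
                    # The symbol discriminates: replace the group by its pieces.
                    partition.remove(group)
                    partition.extend(buckets.values())
                    changed = True
                    break
            if changed:
                break  # restart the scan over the updated partition
    return partition


# Hypothetical DFA used only for illustration.
STATES = ["A", "B", "C", "D", "E"]
DELTA = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}
result = refine(STATES, ["a", "b"], DELTA, accepting={"E"})
groups = sorted(sorted(g) for g in result)
```

In this example the input b first separates D from the other non-accepting states, then separates B, and A and C end up in the same group because no input distinguishes them.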
For this grammar answer the following questions:
a) [10 points] Compute the set of LR(1) items for this grammar and draw the corresponding DFA. Do not forget to augment the grammar with the initial production S → Stat $ as production (0).
b) [10 points] Construct the corresponding LR parsing table.
c) [05 points] Would this grammar be LR(0)? Why or why not? (Note: you do not need to construct the set of LR(0) items; rather, look at the ones you have already made for LR(1) to answer this question.)
d) [05 points] Show the stack contents, the input and the rules used during parsing for the input string w = begin begin x end end $.
e) [05 points] Would this grammar be suitable to be parsed using a top-down LL parsing method? Why, or why not?
f) [05 points] Under the assumption that you had to perform error correction, what synchronizing set would you use to recover from a syntactic error in the Body non-terminal symbol?
Answer:
a) [10 points] The set of LR(1) items and the corresponding DFA are shown below:
s0:
  S → • Stat $, {}
  Stat → • Block, {$}
  Block → • begin Block end, {$}
  Block → • Body, {$}
  Body → • x, {$}
s1 (from s0 on Stat):
  S → Stat • $, {}
s2 (from s0 on Block):
  Stat → Block •, {$}
s3 (from s0 on Body):
  Block → Body •, {$}
s4 (from s0 on x):
  Body → x •, {$}
s5 (from s0 on begin):
  Block → begin • Block end, {$}
  Block → • begin Block end, {end}
  Block → • Body, {end}
  Body → • x, {end}
s6 (from s5 on Block):
  Block → begin Block • end, {$}
s7 (from s5 on begin, from s7 on begin):
  Block → begin • Block end, {end}
  Block → • begin Block end, {end}
  Block → • Body, {end}
  Body → • x, {end}
s8 (from s5 on Body, from s7 on Body):
  Block → Body •, {end}
s9 (from s6 on end):
  Block → begin Block end •, {$}
s10 (from s7 on Block):
  Block → begin Block • end, {end}
s11 (from s5 on x, from s7 on x):
  Body → x •, {end}
s12 (from s10 on end):
  Block → begin Block end •, {end}
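The item sets above can be cross-checked mechanically. The sketch below is a minimal LR(1) closure/goto construction; it assumes the grammar is the one implied by the items (productions (0) to (4), none of them nullable), uses None for the empty lookahead of the start item, and does not follow the end marker $ out of the accepting state. The state numbers it produces follow discovery order and need not match s0 to s12.

```python
GRAMMAR = [
    ("S", ("Stat", "$")),                 # (0)
    ("Stat", ("Block",)),                 # (1)
    ("Block", ("begin", "Block", "end")), # (2)
    ("Block", ("Body",)),                 # (3)
    ("Body", ("x",)),                     # (4)
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST sets of the non-terminals (valid because nothing is nullable)."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


FIRST = first_sets()


def closure(items):
    """Items are triples (production index, dot position, lookahead)."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot, look = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue
        beta = rhs[dot + 1:]
        if beta:
            lookaheads = FIRST[beta[0]] if beta[0] in NONTERMINALS else {beta[0]}
        else:
            lookaheads = {look}
        for index, (lhs, _) in enumerate(GRAMMAR):
            if lhs != rhs[dot]:
                continue
            for la in lookaheads:
                item = (index, 0, la)
                if item not in items:
                    items.add(item)
                    work.append(item)
    return frozenset(items)


def goto(state, symbol):
    moved = {
        (p, d + 1, la)
        for p, d, la in state
        if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol
    }
    return closure(moved)


def build_automaton():
    states = [closure({(0, 0, None)})]
    edges = {}
    for state in states:  # the list grows while new states are discovered
        symbols = {GRAMMAR[p][1][d] for p, d, _ in state if d < len(GRAMMAR[p][1])}
        for symbol in sorted(symbols - {"$"}):
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            edges[(states.index(state), symbol)] = states.index(target)
    return states, edges


states, edges = build_automaton()
```

Under these assumptions the construction yields 13 states and 15 transitions, the same counts as the DFA listed above.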
c) [5 points] If we were to compute the set of LR(0) items, they would lead to essentially the same DFA (states that differ only in their lookahead sets would coincide), and as one can observe there is no state that contains both a reduction operation and a shift operation. This means that the LR(0) parsing table would not have any shift/reduce conflicts, so this particular grammar can be parsed using the LR(0) parsing method. What is the advantage of using the LR(1) method? In this particular case, the LR(1) approach leads to a sparser table, and hence a smaller one if sparse encoding methods are used.
d) [5 points] The stack contents for the input string w = begin begin x end end $ are shown below.
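Stack (top at left)     Input                      Action
s0                      begin begin x end end $    shift s5
s5 s0                   begin x end end $          shift s7
s7 s5 s0                x end end $                shift s11
s11 s7 s5 s0            end end $                  r(4), g8
s8 s7 s5 s0             end end $                  r(3), g10
s10 s7 s5 s0            end end $                  shift s12
s12 s10 s7 s5 s0        end $                      r(2), g6
s6 s5 s0                end $                      shift s9
s9 s6 s5 s0             $                          r(2), g2
s2 s0                   $                          r(1), g1
s1 s0                   $                          accept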
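The trace in part d) can be replayed with a small table-driven parser. The ACTION and GOTO entries below are an assumption: they are read off the LR(1) states s0 to s12 from part a), since the parsing table of part b) is not reproduced in this solution. The function name `parse` is illustrative.

```python
# Production number -> (left-hand side, length of right-hand side).
PRODUCTIONS = {1: ("Stat", 1), 2: ("Block", 3), 3: ("Block", 1), 4: ("Body", 1)}

# Entries derived from the LR(1) item sets (an assumption, see the lead-in).
ACTION = {
    (0, "begin"): ("shift", 5), (0, "x"): ("shift", 4),
    (1, "$"): ("accept", None),
    (2, "$"): ("reduce", 1),
    (3, "$"): ("reduce", 3),
    (4, "$"): ("reduce", 4),
    (5, "begin"): ("shift", 7), (5, "x"): ("shift", 11),
    (6, "end"): ("shift", 9),
    (7, "begin"): ("shift", 7), (7, "x"): ("shift", 11),
    (8, "end"): ("reduce", 3),
    (9, "$"): ("reduce", 2),
    (10, "end"): ("shift", 12),
    (11, "end"): ("reduce", 4),
    (12, "end"): ("reduce", 2),
}
GOTO = {
    (0, "Stat"): 1, (0, "Block"): 2, (0, "Body"): 3,
    (5, "Block"): 6, (5, "Body"): 8,
    (7, "Block"): 10, (7, "Body"): 8,
}


def parse(tokens):
    """Return the list of production numbers used, in reduction order."""
    stack = [0]  # state stack, top at the end of the list
    rules = []
    pos = 0
    while True:
        action = ACTION.get((stack[-1], tokens[pos]))
        if action is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state s{stack[-1]}")
        kind, arg = action
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            del stack[-length:]  # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])
            rules.append(arg)
        else:
            return rules


rules_used = parse("begin begin x end end $".split())
```

For the exam's input string the reductions come out in the order 4, 3, 2, 2, 1, matching the r(4), r(3), r(2), r(2), r(1) sequence of the trace.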
e) [5 points] This grammar does have the LL(1) property, since for every non-terminal symbol with more than one production the FIRST sets of the right-hand sides of those productions are disjoint.
f) [5 points] Possibly the best synchronization set would be the FOLLOW set of the Body non-terminal, which in this case consists of the end token. Hence, the parser would skip all the input tokens until it reads end, and then use a reduction operation to pretend it has seen an instance of Body.
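The disjointness argument in part e) can be verified with a short check. This sketch assumes the grammar implied by the LR(1) items and relies on the fact that no production is nullable, so comparing the FIRST sets of the alternatives is sufficient; the helper names are illustrative.

```python
from itertools import combinations

GRAMMAR = {
    "S": [["Stat", "$"]],
    "Stat": [["Block"]],
    "Block": [["begin", "Block", "end"], ["Body"]],
    "Body": [["x"]],
}


def first_of_symbol(symbol, seen=()):
    """FIRST of a single symbol; terminals are their own FIRST set."""
    if symbol not in GRAMMAR:
        return {symbol}
    if symbol in seen:  # guard against left-recursive cycles
        return set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        result |= first_of_symbol(rhs[0], seen + (symbol,))
    return result


def ll1_conflicts():
    """List the alternatives of a non-terminal whose FIRST sets overlap."""
    conflicts = []
    for nonterminal, alternatives in GRAMMAR.items():
        for left, right in combinations(alternatives, 2):
            overlap = first_of_symbol(left[0]) & first_of_symbol(right[0])
            if overlap:
                conflicts.append((nonterminal, left, right, overlap))
    return conflicts


conflicts = ll1_conflicts()
```

Only Block has two alternatives here; their FIRST sets are {begin} and {x}, so the conflict list is empty.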
Answer:
a) [10 points] A simple grammar for this SDT is depicted below. It enforces that the allowed strings are multiples of 4, when interpreted as decimal values, by requiring a trailer that includes the pattern 100. Leading zero digits are allowed freely in the grammar using a classic recursive construct.
(0) S     → List Tail   | List.pos = 3; S.value = List.value + Tail.value
(1) List0 → List1 Bit   | List1.pos = List0.pos + 1; Bit.pos = List0.pos; List0.value = List1.value + Bit.value
(2) List  → Bit         | Bit.pos = List.pos; List.value = Bit.value
(3) Bit   → 1           | Bit.value = 2^Bit.pos
(4) Bit   → 0           | Bit.value = 0
(5) Tail  → Lead 0 0    | Lead.pos = 2; Tail.value = Lead.value
(6) Lead  → 1           | Lead.value = 2^Lead.pos
b) [15 points] We define two integer attributes for the List and Bit non-terminal symbols of the grammar, namely pos and value. The inherited attribute pos denotes the position of the instance in the parse tree, whereas the synthesized attribute value holds the value computed so far for the substring rooted at that particular non-terminal symbol. The semantic rules for the SDT scheme are shown above, next to each of the productions.
c) [10 points] The parse tree, the corresponding attribute values and a possible evaluation order are shown below.
[Figure: annotated parse tree with evaluation-order labels (0) through (6). Recoverable attribute values: S: value = 24 + 4 = 28; upper List: pos = 3, value = 16 + 8 = 24; lower List: pos = 4, value = 16; Bit: pos = 4, value = 2^4 = 16; Tail subtree: pos = 2, value = 2^2 = 4.]
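The attribute flow in the parse tree can be mimicked with a few recursive functions. This is a minimal sketch under stated assumptions: pos is passed down as a parameter (inherited), value is the return value (synthesized), the trailer 100 contributes 2^2, and the input 11100 is inferred from the attribute values in the tree (16 + 8 + 4 = 28). The function names are illustrative.

```python
def bit_value(bit, pos):
    """Bit -> 1 contributes 2^pos, Bit -> 0 contributes 0."""
    return 2 ** pos if bit == "1" else 0


def list_value(bits, pos):
    """List0 -> List1 Bit | Bit, with pos of the rightmost bit passed down."""
    if len(bits) == 1:
        return bit_value(bits, pos)
    # The sub-list sits one position further to the left than the last bit.
    return list_value(bits[:-1], pos + 1) + bit_value(bits[-1], pos)


def s_value(string):
    """S -> List Tail, where Tail is the fixed trailer 100."""
    if len(string) < 4 or set(string) - {"0", "1"} or not string.endswith("100"):
        raise ValueError("expected at least one bit followed by the trailer 100")
    lead_value = 2 ** 2  # Lead -> 1 at position 2
    return list_value(string[:-3], 3) + lead_value


value = s_value("11100")
```

For any accepted string the result agrees with reading the string as a binary number, for example `int("11100", 2)`.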