Lectures Merged
Compiler Design
Simona Motogna
S. Motogna - LFTC
FLCD: Formal Languages & Compiler Design
• Historical reasons
• Be a better programmer
• Performant algorithms
Organization Issues
PRESENCE IS MANDATORY
Most interesting stuff for students
• Moodle:
• All course resources
• Homeworks
• Assignments
• Labs
• Points / grades
Minimal Conditions to Pass
Bonus
Lab work
• 10 laboratory tasks
• Weighted grades:
Lab grade
Bonus points:
- “awesome” solutions
- Extra work
I wish …
What is a compiler?
Interpreter?
A little bit of history …
• Fortran: 1954-1957, Backus
• Lisp: 1962, McCarthy
• Pascal: 1968 - 1970, N. Wirth
• C: 1969 - 1973, D. Ritchie
• Java: 1995, J. Gosling
Structure of a compiler (take notes!)
[Figure: phases of a compiler]
• Analysis: source code/program → Scanning (lexical analysis) → tokens & ST → Parsing (syntactical analysis) → annotated syntax tree
• Synthesis: intermediary code generation → intermediary code → intermediary code optimization → optimized intermediary code → object code generation → object code/program
• Alongside the phases: Symbol Table management, Error handling
Chapter 1. Scanning
Definition = treats the source program as a sequence of characters,
detects lexical tokens, classifies and codifies them
Algorithm Scanning v1
While (not(eof)) do
detect(token);
classify(token);
codify(token);
End_while
Detect (take notes!)
I am a student.I am Simona
if (x==y) {x=y+2}
Classify
• Classes of tokens:
• Identifiers
• Constants
• Reserved words (keywords)
• Separators
• Operators
S. Motogna - LFTC
Codify
• May be codification table
OR
code for identifiers and constants
identifier, constant
S. Motogna - LFTC
Algorithm Scanning v2
While (not(eof)) do
    detect(token);
    if token is reserved word OR operator OR separator
        then genPIF(token, 0)
    else
        …
        endif
    endif
endwhile
Example: a=a+b
ST:
1 a
2 b
Remarks:
• genPIF = adds a pair (token, position) to PIF
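The scanning loop can be sketched in Python as below. This is only an illustration under stated assumptions: the keyword, operator and separator sets, the regular expression used for `detect`, and the handling of identifiers and constants (look up or insert in the ST, then store the ST position in the PIF under a generic `id`/`const` code) are all hypothetical choices, not the course's reference solution.

```python
import re

# Hypothetical token classes for a tiny language (assumptions for this sketch).
RESERVED = {"if", "else", "while"}
OPERATORS = {"==", "=", "+", "-", "*", "/"}
SEPARATORS = {"(", ")", "{", "}", ";"}

# detect(token): skip blanks, then match one token.
TOKEN_RE = re.compile(r"\s*(==|[A-Za-z_]\w*|\d+|[=+\-*/(){};])")

def scan(source):
    """Return (PIF, ST) for `source`; raise ValueError on a lexical error."""
    st = []    # symbol table; position of an entry = list index + 1
    pif = []   # program internal form: list of (token, position) pairs
    pos = 0
    source = source.rstrip()
    while pos < len(source):                      # While (not(eof))
        match = TOKEN_RE.match(source, pos)       # detect(token)
        if match is None:
            raise ValueError(f"lexical error at offset {pos}")
        token = match.group(1)
        pos = match.end()
        if token in RESERVED or token in OPERATORS or token in SEPARATORS:
            pif.append((token, 0))                # genPIF(token, 0)
        else:
            if token not in st:                   # search and insert in ST
                st.append(token)
            kind = "const" if token[0].isdigit() else "id"
            pif.append((kind, st.index(token) + 1))
    return pif, st

pif, st = scan("a=a+b")
print(st)    # ['a', 'b']
print(pif)   # [('id', 1), ('=', 0), ('id', 1), ('+', 0), ('id', 2)]
```

For the input `a=a+b` the sketch produces a symbol table with `a` at position 1 and `b` at position 2, matching the ST shown next to the algorithm.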
S. Motogna - LFTC
Example (sem?)
• https://babeljs.io/docs/en/
• https://www.antlr.org/ and https://github.com/antlr/antlr4
• https://www.programiz.com/python-programming/online-compiler/
• https://www.w3schools.com/python/python_compiler.asp
S. Motogna - LFTC
Course 2
S. Motogna - LFTC
Symbol Table
Definition = contains all information collected during compiling
regarding the symbolic names from the source program
Variants:
- Unique symbol table – contains all symbolic names
- distinct symbol tables: IT (identifiers table) + CT (constants
table)
S. Motogna - LFTC
ST organization
Remark: search and insert
S. Motogna - LFTC
Hash table
• K = set of keys (symbolic names)
• A = set of positions (|A| = m; m –prime number)
h:K→A
h(k) = (val(k) mod m) + 1
Toy hash function to use at lab: sum of ASCII codes of chars
• Conflicts: k1 ≠ k2 , h(k1) = h(k2)
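A minimal sketch of a hash-based symbol table using the toy function h(k) = (val(k) mod m) + 1, with val(k) taken as the sum of the character codes. Resolving conflicts by separate chaining and the class and method names are assumptions made for this illustration.

```python
class HashSymbolTable:
    """Symbol table over positions 1..m; conflicts handled by chaining (assumption)."""

    def __init__(self, m=13):                         # m should be a prime number
        self.m = m
        self.buckets = [[] for _ in range(m + 1)]     # index 0 is unused

    def h(self, key):
        # h(k) = (val(k) mod m) + 1, val(k) = sum of ASCII codes of the chars
        return sum(ord(ch) for ch in key) % self.m + 1

    def insert(self, key):
        """Search and insert; return (position, index inside the chain)."""
        chain = self.buckets[self.h(key)]
        if key not in chain:
            chain.append(key)
        return self.h(key), chain.index(key)

    def search(self, key):
        chain = self.buckets[self.h(key)]
        return (self.h(key), chain.index(key)) if key in chain else None

table = HashSymbolTable(m=13)
print(table.insert("ab"))   # (1, 0)
print(table.insert("ba"))   # (1, 1)  conflict: same characters, same sum
print(table.search("zz"))   # None
```

The pair returned by `insert` can be stored in the PIF as the position of an identifier or constant.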
S. Motogna - LFTC
Visibility domain (scope)
• Each scope – separate ST
• Structure -> inclusion tree
Example:
int main(){
  … int a;
  void f()
  {float a;
   … int h() {…}
  }
  …
  void g()
  {char a;
   …
  }
}
Hierarchical structure of STs:
main
  f
    h
  g
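The inclusion tree of symbol tables can be sketched as follows. The `Scope` class and the lookup rule (search the current scope first, then walk up to the enclosing scopes) are assumptions for illustration; the slide only states that each scope has its own ST.

```python
class Scope:
    """One symbol table per scope, linked to the enclosing scope."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.symbols = {}

    def declare(self, ident, type_):
        self.symbols[ident] = type_

    def lookup(self, ident):
        """Return (type, scope name) of the innermost visible declaration."""
        scope = self
        while scope is not None:
            if ident in scope.symbols:
                return scope.symbols[ident], scope.name
            scope = scope.parent
        return None

# Mirrors the example: main contains f and g, f contains h.
main = Scope("main"); main.declare("a", "int")
f = Scope("f", main); f.declare("a", "float")
h = Scope("h", f)
g = Scope("g", main); g.declare("a", "char")

print(h.lookup("a"))   # ('float', 'f')
print(g.lookup("a"))   # ('char', 'g')
```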
S. Motogna - LFTC
Formal Languages
- basic notions-
S. Motogna - FL&CD
Examples of languages
• natural (ex. English, Romanian)
• programming (ex. C,C++, Java, Python)
• formal
S. Motogna - FL&CD
Example
a boy has a dog
S → PV
P → a N
N → boy or N → dog (N → boy|dog)
V → QC
Q → has
C → BN
B → a
• A → α = rule
• S,P,V,N,Q,C,B = nonterminal symbols
• a, boy, dog, has = terminal symbols
Remarks
1. Sentence = word, sequence (contains only terminal symbols); denoted w.
In general: w = a1a2...an
2. S⇒PV⇒a NV⇒a NQC⇒a N has C - sentential form
3. The rules guarantee syntactical correctness, but not semantical
correctness (A dog has a boy)
S. Motogna - FL&CD
Grammar
• Definition: A (formal) grammar is a 4-tuple: G=(N,Σ,P,S)
with the following meanings:
• N – set of nonterminal symbols and |N| < ∞
• Σ - set of terminal symbols (alphabet) and |Σ|<∞
• P – finite set of productions (rules), with the property:
P ⊆ (N∪Σ)∗ N (N∪Σ)∗ × (N∪Σ)∗
• S∈N – start symbol / axiom
Remarks:
1. (α,β)∈P is a production denoted α→β
2. N ∩ Σ = ∅
Notation, for A = {a}:
A* = transitive and reflexive closure = {a,aa,aaa,…} ∪ {a^0}
A+ = {a,aa,aaa,…}
X^0 = ε
S. Motogna - FL&CD
Binary relations defined on (N ∪ Σ)∗
• Direct derivation
α ⇒ β , α,β ∈ (N ∪ Σ)∗ if α=x1xy1 , β=x1yy1 and x→y∈P
(x is transformed in y)
• k derivation
α ⇒^k β, α,β ∈ (N ∪ Σ)∗
sequence of k direct derivations α⇒α1⇒α2⇒...⇒αk−1⇒β, α,α1,α2,...,αk−1,β ∈ (N∪Σ)∗
• + derivation
α ⇒+ β if ∃ k>0 such that α ⇒^k β (there exists at least one direct derivation)
• * derivation
α ⇒* β if ∃ k≥0 such that α ⇒^k β; namely, α ⇒* β ⇔ α ⇒+ β OR α ⇒^0 β (α=β)
S. Motogna - FL&CD
Definition: Language generated by a grammar G=(N,Σ,P,S) is:
L(G) = {w∈Σ∗ | S ⇒* w}
Remarks:
1. S ⇒* α, α∈(N∪Σ)∗ = sentential form
   S ⇒* w, w∈Σ∗ = word / sequence
2. Operations defined for languages (sets):
L1∪L2, L1∩L2, L1−L2, L̄ (complement), L+ = ⋃_{k>0} L^k, L∗ = ⋃_{k≥0} L^k
Concatenation: L = L1L2 = {w1w2 | w1∈L1, w2∈L2}
Example: L1 = {a,b,aa}, L2 = {c,d,cd}
L1L2 = {ac,ad,acd,bc,bd,bcd,aac,aad,aacd}
3. |w|=0 (empty word - denoted ε)
Definition: Two grammars G1 and G2 are equivalent if they generate the same language:
L(G1)=L(G2)
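The language operations can be checked on small finite languages with Python sets. The function names below are hypothetical helpers for this sketch; the empty word ε is represented by the empty string.

```python
def concat(l1, l2):
    """L1L2 = {w1w2 | w1 in L1, w2 in L2}."""
    return {w1 + w2 for w1 in l1 for w2 in l2}

def power(lang, k):
    """L^k, with L^0 = {ε}."""
    result = {""}
    for _ in range(k):
        result = concat(result, lang)
    return result

def star_up_to(lang, n):
    """Finite approximation of L*: the union of L^0 .. L^n."""
    return set().union(*(power(lang, k) for k in range(n + 1)))

L1 = {"a", "b", "aa"}
L2 = {"c", "d", "cd"}
print(sorted(concat(L1, L2)))
# ['aac', 'aacd', 'aad', 'ac', 'acd', 'ad', 'bc', 'bcd', 'bd']
```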
S. Motogna - FL&CD
Chomsky hierarchy(based on form α → β∈ P)
• type 0 : no restriction
• type 1 : context dependent grammar (x1Ay1 → x1γy1)
• type 2 : context free grammar (A → α ∈ P, where A∈N and α∈(N ∪ Σ)∗)
• type 3 : regular grammar ( A → aB|a ∈ P)
Remark :
type 3 ⊆ type 2 ⊆ type 1 ⊆ type 0
S. Motogna - FL&CD
Notations
o A,B,C,... – nonterminal symbols
o S ∈ N – start symbol
o a,b,c,... ∈ Σ – terminal symbol
o α,β,γ ∈ (N ∪ Σ)∗ - sentential forms
o ε – empty word
o x,y,z,w ∈ Σ∗ - words
o X,Y,U,... ∈ (N ∪ Σ) – grammar symbols (nonterminal or terminal)
S. Motogna - FL&CD
[Figure: the door machinery, a diagram whose transitions are labelled with letters (a, c, d, n, r, u)]
Problem: The door to the tower is closed by the Red Dragon, using a
complicated machinery. Prince Charming has managed to steal the
plans and is asking for your help. Can you help him determine all
the person names that can unlock the door?
Course 3&4
Formal Languages
- Basic notions -
S. Motogna - FL&CD
Regular languages
S. Motogna - FL&CD
S.Motogna - Formal Languages & Compiler Design
Finite Regular
Automata grammars
Finite Automata
• Intuitive model
[Figure: input tape holding symbols of Σ (here a n a), read by a control unit (CU) that is in state q]
S. Motogna - FL&CD
Definition: A finite automaton (FA) is a 5-tuple
M = (Q,Σ,δ,q0,F)
where:
• Q - finite set of states (|Q|<∞)
• Σ - finite alphabet (|Σ|<∞)
• δ – transition function : δ:Q×Σ→P(Q)
• q0 – initial state q0 ∊ Q
• F⊆Q – set of final states
S. Motogna - FL&CD
Remarks
1. Q∩Σ=∅
2. δ:Q×Σ→P(Q), ε∈Σ^0 - relation δ(q,ε)=p NOT allowed
3. If |δ(q,a)|≤1 for every q∈Q and a∈Σ => deterministic finite automaton (DFA)
4. If |δ(q,a)|>1 for some q and a (more than one state obtained as result) =>
nondeterministic finite automaton (NFA)
S. Motogna - FL&CD
Configuration C=(q,x)
where:
- q state
- x unread sequence from input: x ∊ ∑*
S. Motogna - FL&CD
Relations between configurations
• ⊢ move / transition (simple, one step)
(q,ax) ⊢ (p,x), p ∈ δ(q,a)
• ⊢^k k move (a sequence of k simple transitions): C0 ⊢ C1 ⊢ ... ⊢ Ck
• ⊢+ + move
C ⊢+ Cʹ : ∃ k>0 such that C ⊢^k Cʹ
• ⊢* * move (star move)
C ⊢* Cʹ : ∃ k≥0 such that C ⊢^k Cʹ
S. Motogna - FL&CD
Definition: Language accepted by FA M = (Q,Σ,δ,q0,F) is:
L(M) = { w ∈ Σ∗ | (q0,w) ⊢* (qf,ε), qf ∈ F }
Remarks
1. 2 finite automata M1 and M2 are equivalent if and only if they
accept the same language
L(M1)=L(M2)
2. ε ∈ L(M) ⇔ q0∈F (initial state is final state)
S. Motogna - FL&CD
Representing FA
1. List of all elements
2. Table
3. Graphical representation
Example: M=(Q,Σ,δ,p,F)
Q = {p,q,r}
Σ = {a,b}
F = {r}
δ(p,a) = q
δ(q,a) = q
δ(q,b) = r
δ(p,b) = r
Table:
     a   b
p    q   r
q    q   r
r    -   -
[Figure: transition graph with states p, q, r]
(p,aab) ⊢ (q,ab) ⊢ (q,b) ⊢ (r,ε) => aab accepted
(p,aba) ⊢ (q,ba) ⊢ (r,a) => aba not accepted
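The acceptance test can be sketched by tracking the set of states reachable after each input symbol, which works for both DFA and NFA since δ maps into P(Q). The dictionary encoding of δ and the function name `accepts` are assumptions of this sketch.

```python
def accepts(delta, q0, finals, w):
    """Return True if (q0, w) |-* (qf, ε) for some final state qf."""
    current = {q0}
    for a in w:
        # Follow every possible transition on symbol a; missing entries mean "no move".
        current = set().union(*(delta.get((q, a), set()) for q in current))
    return bool(current & finals)

# The automaton from the example above: Q = {p,q,r}, F = {r}
DELTA = {
    ("p", "a"): {"q"},
    ("q", "a"): {"q"},
    ("q", "b"): {"r"},
    ("p", "b"): {"r"},
}

print(accepts(DELTA, "p", {"r"}, "aab"))   # True
print(accepts(DELTA, "p", {"r"}, "aba"))   # False
```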
S. Motogna - FL&CD
Remember
• Finite automaton
M = (Q,Σ,δ,q0,F)
L(M) = { w ∈ Σ∗ | (q0,w) ⊢* (qf,ε), qf ∈ F }
Reg exp
Maths
Finite Regular
Automata grammars
Proof: constructive
i. G = ({S}, Σ, ∅, S) – regular grammar such that L(G) = ∅
Proof: constructive
L1,L2 right linear languages => ∃G1, G2 such that
G1 = (N1, 𝚺1,P1,S1) and L1 = L(G1)
G2 = (N2, 𝚺2,P2,S2) and L2 = L(G2) assume N1∩N2 = ∅
P5 = P1 U {S5 → 𝝴} U
{S5→ 𝛂1| S1→ 𝛂 1∊ P1} U
{A→ aS1| if A→ a ∊ P1}
Proof:
=> Apply lemma 1 and lemma 2 (to follow, similar to RG)!
<= Construct a system of regular expression equations where:
- Indeterminates – states
- Coefficients – terminals
- Equation for A: all the possibilities that put the FA in state A
- Equation of the form: X = Xa + b => solution X = ba*
Regular exp = union of solutions corresponding to final states
Example:
q1 = q3 0 + ε
q2 = q1 0 + q1 1 + q2 0 + q3 0
q3 = q2 1
F3 = F2 ∪ {q ∊ F1 | if q02 ∊ F2}
δ3(q,a) = δ1(q,a), if q ∊ Q1−F1
          δ1(q,a) ∪ δ2(q02,a), if q ∊ F1
          δ2(q,a), if q ∊ Q2
L(M3) = L(M1)L(M2)
PROOF!!! Homework
L(M3) = L(M1)*
PROOF!!! Homework
S. Motogna - FL&CD
Context free grammar (cfg)
• Productions of the form: A → α, A∊N, α∊(N∪Σ)*
• More powerful than regular grammars
S. Motogna - FL&CD
Syntax tree
Definition: A syntax tree corresponding to a cfg G = (N, 𝜮,P,S) is a tree
obtained in the following way:
1. Root is the starting symbol S
2. Nodes ∊ N∪𝜮:
1. Internal nodes ∊N
2. Leaves ∊𝜮
3. For a node A the descendants in order from left to right are X1, X2, ..., Xn
only if A → X1X2... Xn∊ P
Remarks:
a) Parse tree = syntax tree – result of parsing (syntactic analysis)
b) Derivation tree – condition 2.2 not satisfied
c) Abstract syntax tree (AST) ≠ syntax tree (semantic analysis)
S. Motogna - FL&CD
Syntax tree (cont)
Property: In a cfg G = (N, 𝜮,P,S), w ∊ L(G) if and only if there exists a
syntax tree with frontier w.
Proof: HomeWork
S. Motogna - FL&CD
Example: S → aSbS | c; w = aacbcbc
Leftmost derivation: S => aSbS => aaSbSbS => aacbSbS => aacbcbS => aacbcbc
Rightmost derivation: S => aSbS => aSbc => aaSbSbc => aaSbcbc => aacbcbc
Definition: A cfg G = (N, Σ, P, S) is ambiguous if for some w ∊ L(G) there exist
2 distinct syntax trees with frontier w.
Example:
S. Motogna - FL&CD
Parsing (syntax analysis) modeled with cfg:
THEN
S. Motogna - FL&CD
• Unproductive symbols
• Inaccessible symbols
• ε-productions
• Single productions
1. Determine elements (symbols/productions): greedy algorithm
2. Eliminate them: construct equivalent grammar
Unproductive symbols
Definition
A nonterminal A is unproductive in a cfg if it does not generate any word:
{w | A =>* w, w ∈ Σ*} = ∅.
Algorithm 1: Elimination of unproductive symbols
input: G = (N, Σ, P, S)
output: G’ = (N’, Σ, P’, S), L(G) = L(G’)
// idea: build N0,N1,... recursively (until saturation)
step 1: N0 = ∅; i:=1;
step 2: Ni = Ni-1 ∪ {A | A→α ∈ P, α ∈ (Ni-1 ∪ Σ)*}
step 3: if Ni <> Ni-1 then i:=i+1; goto step 2
        else N’ = Ni
step 4: if S ∉ N’ then L(G) = ∅
        else P’ = {A→α | A→α ∈ P, A ∈ N’ and α ∈ (N’ ∪ Σ)*}
S. Motogna - FL&CD
Example
G = ({S,A,B,C,D}, {a,b,c}, P, S)
P: S → aA | aC
   A → AB
   B → b
   C → aC | CD
   D → b
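Algorithm 1 can be sketched in Python and run on the example grammar. The encoding is an assumption of this sketch: each symbol is a single character, a grammar is a dictionary from nonterminal to a list of right-hand sides, and the function names are hypothetical.

```python
def productive(grammar, terminals):
    """Build N0, N1, ... until saturation and return N'."""
    n = set()
    while True:
        new = n | {a for a, rhss in grammar.items()
                   if any(all(x in n or x in terminals for x in rhs) for rhs in rhss)}
        if new == n:
            return n
        n = new

def eliminate_unproductive(grammar, terminals, start):
    """Return P' as a dict, or None when S is unproductive (L(G) is empty)."""
    n = productive(grammar, terminals)
    if start not in n:
        return None
    return {a: [rhs for rhs in rhss if all(x in n or x in terminals for x in rhs)]
            for a, rhss in grammar.items() if a in n}

G = {"S": ["aA", "aC"], "A": ["AB"], "B": ["b"], "C": ["aC", "CD"], "D": ["b"]}
SIGMA = {"a", "b", "c"}

print(sorted(productive(G, SIGMA)))            # ['B', 'D']
print(eliminate_unproductive(G, SIGMA, "S"))   # None
```

On the example, only B and D are productive, so S ∉ N’ and the generated language is empty.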
S. Motogna - FL&CD
Inaccessible symbols
Definition
A symbol X ∈ N∪Σ is inaccessible in a cfg if X does not
appear in any sentential form: ∀ S =>* α, X ∉ α
S. Motogna - FL&CD
ε-productions
Definition
A cfg G=(N,Σ,P,S) is without ε-productions if
1. P ∌ A → ε (ε-productions)
OR
2. ∃ S→ε and S ∉ rhs(p), ∀p ∊ P
Algorithm 3: Elimination of ε-productions
input: cfg G = (N, Σ, P, S)
output: cfg G’ = (N’, Σ, P’, S’)
step 1: construct N̄ = {A | A ∈ N, A =>+ ε}
   1.a. N0 := {A | A→ε ∈ P}; i := 1;
   1.b. Ni := Ni-1 ∪ {A | A→α ∈ P, α ∈ N*i-1}
   1.c. if Ni <> Ni-1 then i:=i+1; goto step 1.b
        else N̄ = Ni
step 2: Let P’ = set of productions built:
   2.a. if A → α0B1α1B2α2...Bkαk ∈ P, k>=0,
        and for i := 1,k Bi ∈ N̄
        and αj ∉ N̄, j := 0,k
        then add to P’ all productions of the form
        A → α0X1α1X2α2...Xkαk
        where Xi is Bi or ε (not A→ε)
   2.b. if S ∈ N̄ then add S’ to N’ and S’ → S | ε to P’
        else N’ := N; S’ := S.
Example (step 1):
A → BC
B → ε
C → ε
S. Motogna - FL&CD
Example
G = ({S,A,B}, {a,b}, P, S)
P: S → aA | aAbB
   A → aA | B
   B → bB | ε
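Algorithm 3 can be sketched on this example as follows. The encoding (single-character symbols, the empty string standing for ε, a dictionary from nonterminal to right-hand sides) and the function names are assumptions for this illustration.

```python
from itertools import product

def nullable(grammar):
    """Step 1: the set of nonterminals A with A =>+ ε."""
    n = {a for a, rhss in grammar.items() if "" in rhss}
    while True:
        new = n | {a for a, rhss in grammar.items()
                   if any(rhs and all(x in n for x in rhs) for rhs in rhss)}
        if new == n:
            return n
        n = new

def eliminate_epsilon(grammar, start):
    """Step 2: return (P', S') of an equivalent grammar without ε-productions."""
    nbar = nullable(grammar)
    new = {}
    for a, rhss in grammar.items():
        out = set()
        for rhs in rhss:
            # Every nullable symbol is either kept or replaced by ε.
            choices = [(x, "") if x in nbar else (x,) for x in rhs]
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate:                 # never add A -> ε
                    out.add(candidate)
        new[a] = sorted(out)
    if start in nbar:                         # step 2.b: S' -> S | ε
        new[start + "'"] = [start, ""]
        start = start + "'"
    return new, start

G = {"S": ["aA", "aAbB"], "A": ["aA", "B"], "B": ["bB", ""]}
print(sorted(nullable(G)))          # ['A', 'B']
print(eliminate_epsilon(G, "S")[0]["S"])
# ['a', 'aA', 'aAb', 'aAbB', 'ab', 'abB']
```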
S. Motogna - FL&CD
Single productions
Definition
A production of the form A → B is called single production or
renaming rule.
S. Motogna - FL&CD
Example
G = ({E,T,F}, {a,(,),+,*}, P, E)
P: E → E+T | T
   T → T*F | F
   F → (E) | a
S. Motogna - FL&CD
Parsing
• Cfg G = (N, Σ, P, S): check if w ∊ L(G)
• Construct parse tree
• How:
1. Top-down vs. Bottom-up
2. Recursive vs. linear
[Figure 3.2: Construction of the tree by LL(1) parsing]
S.Motogna - FL&CD
Course 6
S.Motogna - FL&CD
Problem: Parsing (construct the parse tree)
if the source program is syntactically correct
then construct syntax tree
else ”syntax error”
source program is syntactically correct = w ∈ L(G) ⇔ S =>* w
S.Motogna - FL&CD
Parsing
• How:
1. Top-down (descendent) vs. Bottom-up (ascendent)
2. Recursive vs. linear
[Figure 3.2: Construction of the tree by LL(1) parsing]
S.Motogna - FL&CD
Result – parse tree -representation
• Arbitrary tree – child sibling representation
S.Motogna - FL&CD
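[Figure: tree with root S (1) and children a (2), S (3), b (4), S (5); c (6) is the child of node 3, c (7) is the child of node 5]
index | Info | Parent | Right sibling
1     | S    | 0      | 0
2     | a    | 1      | 0
3     | S    | 1      | 2
4     | b    | 1      | 3
5     | S    | 1      | 4
6     | c    | 3      | 0
7     | c    | 5      | 0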
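A small sketch of how the table can be traversed. It reads the last column as it appears in the table above, where each row stores the index of the sibling listed just before it and 0 means none; that reading and the helper names are assumptions of this sketch.

```python
# (index, info, parent, sibling) rows copied from the table above
ROWS = [
    (1, "S", 0, 0),
    (2, "a", 1, 0),
    (3, "S", 1, 2),
    (4, "b", 1, 3),
    (5, "S", 1, 4),
    (6, "c", 3, 0),
    (7, "c", 5, 0),
]

def children(rows, parent):
    """Children of `parent`, ordered by following the sibling links."""
    by_link = {row[3]: row for row in rows if row[2] == parent}
    ordered, link = [], 0
    while link in by_link:
        row = by_link[link]
        ordered.append(row)
        link = row[0]
    return ordered

def frontier(rows, index=1):
    """Concatenate the leaves of the subtree rooted at `index`, left to right."""
    info = next(row[1] for row in rows if row[0] == index)
    kids = children(rows, index)
    if not kids:
        return info
    return "".join(frontier(rows, kid[0]) for kid in kids)

print(frontier(ROWS))   # acbc
```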
S.Motogna - FL&CD
Descendent recursive parser
• Example
S -> aSbS | aS | c
S.Motogna - FL&CD
Formal model
• Configuration
(s, i, α, β)
where:
• s = state of the parsing, can be:
   • q = normal state
   • b = back state
   • f = final state - corresponding to success: w ∊ L(G)
   • e = error state – corresponding to failure: w ∉ L(G)
• i – position of current symbol in input sequence
w = a1a2…an, i ∊ {1,...,n+1}
• α = working stack, stores the way the parse is built
• β = input stack, part of the tree to be built
Initial configuration: (q,1,ε,S)
Final configuration: (f,n+1,α,ε)
Define moves between configurations
S.Motogna - FL&CD
Expand
WHEN: head of input stack is a nonterminal
where:
A → 𝜸1 | 𝜸2 | … represents the productions corresponding to A
1 = first prod of A
S.Motogna - FL&CD
Advance
WHEN: head of input stack is a terminal = current symbol from input
S.Motogna - FL&CD
Momentary insuccess
WHEN: head of input stack is a terminal ≠ current symbol from input
S.Motogna - FL&CD
Back
WHEN: head of working stack is a terminal
S.Motogna - FL&CD
Another try
WHEN: head of working stack is a nonterminal
S.Motogna - FL&CD
Success
(q,n+1, 𝜶, 𝜀) ⊢ (f,n+1, 𝜶, 𝜀)
S.Motogna - FL&CD
Algorithm
S.Motogna - FL&CD
w ∊ L(G) - HOW
• Process 𝜶:
• From left to right (reverse if stored as stack)
• Skip terminal symbols
• Nonterminals – index of prod
• Example: 𝜶 = S1 a S2 a S3 c b S3 c
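The moves listed above (expand, advance, momentary insuccess, back, another try, success) can be sketched as one loop over configurations. The exact form of each move below follows the usual formulation of this backtracking parser and is an assumption of the sketch, as are the encoding (single-character symbols, 0-based position i) and the function names.

```python
def descendent_parse(grammar, start, w):
    """Return the working stack α on success, or None if w is not in L(G)."""
    state, i = "q", 0
    alpha = []          # working stack: terminals, or (nonterminal, production index)
    beta = [start]      # input stack, head at index 0
    while state not in ("f", "e"):
        if state == "q":
            if not beta:
                state = "f" if i == len(w) else "b"          # success / insuccess
            elif beta[0] in grammar:                          # expand
                a = beta.pop(0)
                alpha.append((a, 0))
                beta = list(grammar[a][0]) + beta
            elif i < len(w) and beta[0] == w[i]:              # advance
                alpha.append(beta.pop(0))
                i += 1
            else:                                             # momentary insuccess
                state = "b"
        else:
            top = alpha.pop()
            if isinstance(top, str):                          # back
                i -= 1
                beta.insert(0, top)
            else:                                             # another try
                a, j = top
                del beta[:len(grammar[a][j])]
                if j + 1 < len(grammar[a]):
                    alpha.append((a, j + 1))
                    beta = list(grammar[a][j + 1]) + beta
                    state = "q"
                elif not alpha:                               # nothing left to retry
                    state = "e"
                else:
                    beta.insert(0, a)
    return alpha if state == "f" else None

def production_string(alpha):
    """Skip terminals; keep the (1-based) production index of each nonterminal."""
    return [j + 1 for item in alpha if not isinstance(item, str) for _, j in [item]]

G = {"S": ["aSbS", "aS", "c"]}
print(production_string(descendent_parse(G, "S", "aacbc")))   # [1, 2, 3, 3]
```

For w = aacbc the working stack ends as S1 a S2 a S3 c b S3 c, which gives the production string 1 2 3 3. The loop does not terminate for left recursive grammars.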
S.Motogna - FL&CD
When does the algorithm never stop?
• S->S𝝰 – expand infinitely (left recursive)
S.Motogna - FL&CD
LL(1) Parser
S.Motogna - FL&CD
[Figure 3.2: Construction of the tree by LL(1) parsing]
Linear algorithm
FIRST
• The prediction of length k represents the next k symbols generated from the current configuration.
• ≈ first k terminal symbols that can be generated from α
• Definition: FIRSTk [ASU86] computes the first k symbols obtainable by successive derivations from a given sentential form:
FIRSTk : (N ∪ Σ)* → P(Σ^k)
FIRSTk(α) = {u | u ∈ Σ*, α ⇒* ux, |u| = k or α ⇒* u, |u| ≤ k}
(the first k symbols of α)
S.Motogna - FL&CD
L1={a, 𝛆}
L2={0,1}
L1⨁L2 ={a,0,1}
S.Motogna - FL&CD
• prediction of length k: ai+1 . . . ai+k
S.Motogna - FL&CD
FIRSTk
• Which are the first k terminal symbols that can be generated from A?
• https://forms.office.com/r/kNHNGW7XtC
S.Motogna - FL&CD
Construct FIRST
Computing the entries of the LL(1) parse table depends on the values of the function FIRST. To describe a method for computing FIRST we need the following property:
• FIRST1 denoted FIRST (FIRST of length 1)
• Remarks [GJ90]:
• If L1, L2 are 2 languages over alphabet Σ, then the concatenation of length 1 is:
L1 ⊕ L2 = {w | x ∈ L1, y ∈ L2, xy = w, |w| ≤ 1 or xy = wz, |w| = 1}
S.Motogna - FL&CD
A -> BC
B -> DA
D -> a
F0(A)=F0(B)=∅; F0(D)={a}
A F1(A) =F0(A) U {…| A->BC F0(B)⊕F(D)}= ∅
F1(B) ={a}
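The iterative construction F0, F1, ... with the ⊕ operation can be sketched as below. The encoding (single-character symbols, empty string for ε), the early exit once ε can no longer be a prefix, and the sample grammar S → aA | aAbB, A → aA | B, B → bB | ε are choices made for this illustration.

```python
EPS = ""   # the empty word ε

def oplus(l1, l2):
    """Concatenation of length 1: keep the first symbol of every xy (or ε)."""
    return {(x + y)[:1] for x in l1 for y in l2}

def first_sets(grammar, terminals):
    """Iterate until no FIRST set changes (saturation)."""
    first = {t: {t} for t in terminals}
    first.update({a: set() for a in grammar})
    changed = True
    while changed:
        changed = False
        for a, rhss in grammar.items():
            for rhs in rhss:
                acc = {EPS}
                for x in rhs:
                    acc = oplus(acc, first[x])
                    if EPS not in acc:      # later symbols cannot contribute
                        break
                if not acc <= first[a]:
                    first[a] |= acc
                    changed = True
    return first

G = {"S": ["aA", "aAbB"], "A": ["aA", "B"], "B": ["bB", ""]}
first = first_sets(G, {"a", "b"})
print(sorted(first["A"]))   # ['', 'a', 'b']
```

With this encoding, `oplus({"a", ""}, {"0", "1"})` gives `{"a", "0", "1"}`, the same result as the L1 ⊕ L2 example shown earlier.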
S.Motogna - FL&CD
• prediction ai - the next symbol on the input tape
S.Motogna - FL&CD
FIRST
• ≈ first terminal symbols that can be generated from 𝛼
FOLLOW
• ≈ next symbol generated after/ following A
S.Motogna - FL&CD
LL(k)
• L = left (sequence is read from left to right)
• L = left (use leftmost derivation)
• Prediction of length k
LL(k) Principle
• In any moment of parsing, action is uniquely determined by:
   • Closed part (a1…ai)
   • Current symbol A
   • Prediction ai+1…ai+k (length k)
[Figure: tree with root S and current symbol A above the input a1 … ai-1 ai, followed by the prediction ai+1…ai+k]
Definition
Definition 3.1 [AU73]:
• A cfg is LL(k) if for any 2 leftmost derivations (⇒l) we have:
1. S ⇒*l wAα ⇒l wβα ⇒*l wx;
2. S ⇒*l wAα ⇒l wγα ⇒*l wy;
such that FIRSTk(x) = FIRSTk(y), then β = γ.
The definition can be restated as follows: for any sentential form wAα, the first k symbols derivable from Aα uniquely determine the production that can be applied to A in order to obtain a derivation of a word (sequence of terminal symbols) that begins with w and continues with those k symbols. This condition is sometimes difficult to verify …
Theorem
The necessary and sufficient condition for a grammar to be LL(k) is
that for any pair of distinct productions of a nonterminal (A → β,
A → γ, β≠γ) the condition holds:
FIRSTk(βα) ∩ FIRSTk(γα) = ∅, ∀α such that S =>* uAα
S.Motogna - FL&CD
LL(1) Parser
• Prediction of length 1
• Steps:
1) construct FIRST, FOLLOW Executed 1 time
2) Construct LL(1) parse table
3) Analyse sequence based on moves between configurations
S.Motogna - FL&CD
Step 2: Construct LL(1) parse table
• Possible action depends on:
• Current symbol ∈ N∪𝚺
• Possible prediction ∈ 𝚺
• Add a special character “$” ( ∉ N∪𝚺) – marking for “empty stack”
= > table:
• One line for each symbol ∈ N∪𝚺 ∪{$}
• One column for each symbol ∈ 𝚺 ∪{$}
S.Motogna - FL&CD
The special character '$' (∉ N∪Σ) marks the end of the sequence and gets one line and one column in the table; during the actual parsing it removes the need for empty-stack checks.
Rules LL(1) table
1. M(A, a) = (α, i), ∀a ∈ FIRST(α), a ≠ ε, A → α production in P with number i;
   M(A, b) = (α, i), if ε ∈ FIRST(α), ∀b ∈ FOLLOW(A), A → α production in P with number i;
2. M(a, a) = pop, ∀a ∈ Σ;
3. M($, $) = acc;
4. M(x, a) = err (error) otherwise.
S.Motogna - FL&CD
Remark
S.Motogna - FL&CD
Step 3: Define configurations and moves
• INPUT:
• Language grammar G = (N, 𝚺, P,S)
• LL(1) parse table
• Sequence to be parsed w =a1…an
• OUTPUT:
If (w ∈L(G)) then string of productions
else error & location of error
S.Motogna - FL&CD
LL(1) configurations
(α, β, π)
where:
• α = input stack
• β = working stack
• π = output (result)
Initial configuration: (w$, S$, ε)
Final configuration: ($, $, π)
S.Motogna - FL&CD
Moves
A sequence is accepted based on the empty stack criterion. The moves are defined as follows:
1. Push – put in stack
(ux, Aα$, π) ⊢ (ux, βα$, πi), if M(A, u) = (β, i)
(pop A and push the symbols of β)
2. Pop – take off from stack (from both stacks, if their heads coincide)
(ux, aα$, π) ⊢ (x, α$, π), if M(a, u) = pop
3. Accept – the final configuration was reached
($, $, π) ⊢ acc
4. Error – otherwise, denoted err
S.Motogna - FL&CD
Algorithm LL(1) parsing
• INPUT:
   • LL(1) table with NO conflicts;
   • G – grammar (productions)
   • Input sequence w = a1a2 . . . an
• OUTPUT:
   • sequence accepted or not?
   • If yes then string of productions
S.Motogna - FL&CD
Algorithm LL(1) parsing (cont)
alpha := w$; beta := S$; pi := ɛ; config := (alpha, beta, pi)
go := true;
while go do
    if M(head(beta),head(alpha))=(b,i) then
        ActionPush(config)
    else
        if M(head(beta),head(alpha))=pop then
            ActionPop(config)
        else
            if M(head(beta),head(alpha))=acc then
                go:=false; s:=”acc”;
            else go:=false; s:=”err”;
            end if
        end if
    end if
end while
if s=”acc” then
    write(”Sequence accepted”);
    write(pi)
else
    write(”Sequence not accepted”)
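The loop above can be sketched in Python with a hand-written table. The grammar (1) S → aSb, (2) S → c and its table are hypothetical examples chosen for this sketch; pop and accept are decided by comparing the two stack heads instead of being stored as table entries.

```python
# Hypothetical LL(1) grammar: (1) S -> aSb, (2) S -> c
TABLE = {("S", "a"): ("aSb", 1), ("S", "c"): ("c", 2)}

def ll1_parse(table, start, w):
    """Return (True, string of productions) or (False, 0-based error position)."""
    alpha = list(w) + ["$"]       # input stack, head at index 0
    beta = [start, "$"]           # working stack, head at index 0
    pi = []                       # output: production numbers
    while True:
        top, look = beta[0], alpha[0]
        if (top, look) in table:                  # push: replace A by its rhs
            rhs, i = table[(top, look)]
            beta[0:1] = list(rhs)
            pi.append(i)
        elif top == look == "$":                  # accept
            return True, pi
        elif top == look:                         # pop from both stacks
            alpha.pop(0)
            beta.pop(0)
        else:                                     # error, with its location
            return False, len(w) - (len(alpha) - 1)

print(ll1_parse(TABLE, "S", "aacbb"))   # (True, [1, 1, 2])
print(ll1_parse(TABLE, "S", "acbb"))    # (False, 3)
```

The second call shows how the parser can report the location of the error: the unread part of the input starts at the offending symbol.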
S.Motogna - FL&CD
Remarks
1) LL(1) parser provides location of the error
I -> if C then S T
T -> ɛ | else S // is LL(1)
S.Motogna - FL&CD
Play time!!!
• Menti.com cod: 42 60 49
S.Motogna - FL&CD
Course 8
LR(k) parsing
S.Motogna - FL&CD
Reminder:
S.Motogna - FL&CD
LR(k)
• L = left – sequence is read from left to right
• R = right – use rightmost derivations
• k = length of prediction
• Enhanced grammar
• G = (N, Σ,P,S)
• G’ = (N ∪ {S’}, Σ, P ∪ {Sʹ → S}, Sʹ), S’ ∉ N (S’ does NOT appear in any rhp)
S.Motogna - FL&CD
LR(k)
• Ascendent
S.Motogna - FL&CD
• Definition 1: If in a cfg G = (N, Σ, P, S) we have
S ⇒*r αAw ⇒r αβw, where α ∈ (N∪Σ)∗, A ∈ N, w ∈ Σ∗, then
any prefix of sequence αβ is called live prefix in G.
• Definition 3: LR(k) item [A → α.β,u] is valid for the live prefix γα if:
S ⇒*r γAw ⇒r γαβw
u = FIRSTk(w)
S.Motogna - FL&CD
Definition 4: A cfg G = (N, Σ, P, S) is LR(k), for k>=0, if
1. S’ ⇒*r αAw ⇒r αβw
2. S’ ⇒*r γBx ⇒r αβy
3. FIRSTk(w) = FIRSTk(y)
=> α = γ AND A = B AND x = y
S.Motogna - FL&CD
• [A → αβ.,u] – special case: prefix is all rhp - apply reduce
S.Motogna - FL&CD
What LR items will be in the same state?
The behaviour of the parser is characterized by:
• the action to be performed, and
• the transition to another state.
That is why LR(k) parsing tables have two components: an action part and a shift part, called "goto".
Which are these states and how are they determined?
• [A → α.Bβ,u] valid for live prefix γα =>
S ⇒*r γAw ⇒r γαBβw
u = FIRSTk(w)
• B → δ ∈ P => S ⇒*r γAw ⇒r γαBβw ⇒*r γαδw’
=> [B → .δ,u] valid for live prefix γα
S.Motogna - FL&CD
LR(k) parsing:
LR(0), SLR, LR(1), LALR
• Define item
• Construct set of states Executed 1 time
• Construct table
S.Motogna - FL&CD
LR(0) Parser
S.Motogna - FL&CD
2. Construct set of states
• What a state contains – Algorithm closure_LR(0)
• How to move from a state to another – Function goto_LR(0)
• Construct set of states – Algorithm ColCan_LR(0)
Canonical collection
S.Motogna - FL&CD
closure(I) = I ∪ {[B → .δ] | [A → α.Bβ] ∈ I}, according to the remark above.
Algorithm Closure_LR(0)
INPUT: I – LR(0) item; G’ – enhanced grammar
OUTPUT: C = closure(I);
C := {I};
repeat
    for ∀[A → α.Bβ] ∈ C do
        for ∀B → γ ∈ P do
            if [B → .γ] ∉ C then
                C = C ∪ [B → .γ]
            end if
        end for
    end for
until C no longer changes
S.Motogna - FL&CD
Algorithm ColCan_LR(0)
INPUT: G’ – enhanced grammar
OUTPUT: C – canonical collection of states
C := ∅;
s0 := closure({[S’ → .S]})
C := C ∪ {s0};
repeat
    for ∀s ∈ C do
        for ∀X ∈ N ∪ Σ do
            if goto(s, X) ≠ ∅ and goto(s, X) ∉ C then
                C = C ∪ goto(s, X)
            end if
        end for
    end for
until C no longer changes
Example: S → aS | bSc | dA, A → dc
Compute goto(s0,S), goto(s0,A), goto(s0,a), goto(s0,b), goto(s0,c) = ∅, goto(s0,d)
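The three pieces (closure, goto, canonical collection) can be sketched together. Items are encoded as triples (left-hand side, right-hand side, dot position); symbols are single characters and "S'" plays the role of the new start symbol. The grammar S → aS | bSc | dA, A → dc is used as it reads in the example above, and all function names are assumptions of this sketch.

```python
def closure(items, grammar):
    """Add [B -> .γ] for every item with the dot in front of nonterminal B."""
    c = set(items)
    changed = True
    while changed:
        changed = False
        for (_, rhs, dot) in list(c):
            if dot < len(rhs) and rhs[dot] in grammar:
                for prod in grammar[rhs[dot]]:
                    item = (rhs[dot], prod, 0)
                    if item not in c:
                        c.add(item)
                        changed = True
    return frozenset(c)

def goto(state, x, grammar):
    """Move the dot over x in every item that allows it, then take the closure."""
    moved = {(l, r, d + 1) for (l, r, d) in state if d < len(r) and r[d] == x}
    return closure(moved, grammar) if moved else frozenset()

def canonical_collection(grammar, start):
    enhanced = dict(grammar)
    enhanced["S'"] = [start]                       # S' -> S
    symbols = sorted(set(grammar) | {ch for rhss in grammar.values()
                                     for rhs in rhss for ch in rhs})
    states = [closure({("S'", start, 0)}, enhanced)]
    for s in states:                               # the list grows while we iterate
        for x in symbols:
            t = goto(s, x, enhanced)
            if t and t not in states:
                states.append(t)
    return states

GRAMMAR = {"S": ["aS", "bSc", "dA"], "A": ["dc"]}
STATES = canonical_collection(GRAMMAR, "S")
print(len(STATES))        # 11
print(len(STATES[0]))     # 4 items in s0
```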
S.Motogna - FL&CD
A → c
3. Construct LR(0) table
• one line for each state
• 2 parts:
• Action: one column (for a state, action is unique because prediction is
ignored)
• Goto: one column for each symbol X ∈ N ∪ Σ
S.Motogna - FL&CD
Rules LR(0) table
1. if [A → α.β] ∈ si then action(si)=shift
2. if [A → β.] ∈ si and A ≠ Sʹ then action(si)=reduce l, where l =
number of production A → β
3. if [Sʹ → S.] ∈ si then action(si)=acc
4. if goto(si, X) = sj then the table entry goto(si, X) = sj
5. otherwise = error
S.Motogna - FL&CD
Remarks
1) Initial state of parser = state containing [Sʹ → .S]
2) No shift from accept state:
if s is accept state then goto(s, X) = ∅, ∀X ∈ N ∪ Σ.
3) If in state s action is reduce then goto(s, X) = ∅, ∀X ∈ N ∪ Σ.
4) Argument G’: Let G = ({S},{a,b,c},{S → aSbS,S → c},S)
states [S → aSbS.] and [S → c.] – accept / reduce ?
S.Motogna - FL&CD
Remarks (cont)
5) A grammar is NOT LR(0) if the LR(0) table contains conflicts:
• shift – reduce conflict: a state contains items of the form [A → α.β]
and [B → γ.], yielding to 2 distinct actions for that state
S.Motogna - FL&CD
4. Define configurations and moves
• INPUT:
• Grammar G’ = (NU{S’}, 𝚺, P U {S’->S},S’)
• LR(0) table
• Input sequence w =a1…an
• OUTPUT:
if (w ∈L(G)) then string of productions
else error & location of error
S.Motogna - FL&CD
LR(0) configurations
(α, β, π)
where:
• α = working stack
• β = input stack
• π = output (result) stack
Initial configuration: ($s0, w$, ε)
Final configuration: ($sacc, $, π)
S.Motogna - FL&CD
Moves
1. Shift
if action(sm) = shift AND head(β) = ai AND goto(sm,ai) = sj then
($s0x1 ...xmsm,ai ...an$, 𝜋) ⊢ ($s0x1 ...xmsmaisj,ai+1 ...an$, 𝜋)
2. Reduce
if action(sm) = reduce l AND (l) A → xm−p+1 ...xm AND goto(sm−p,A) = sj then
($s0 ...xmsm,ai ...an$, 𝜋) ⊢ ($s0 ...xm−psm−pAsj,ai ...an$,l 𝜋)
3. Accept
if action(sm) = accept then ($sm,$, 𝜋)=acc
4. Error - otherwise
S.Motogna - FL&CD
LR(0) Parsing Algorithm
• INPUT:
   • LR(0) table – conflict free
   • grammar G’: productions numbered
   • Input sequence w = a1…an
• OUTPUT:
if (w ∈ L(G)) then string of productions
else error & location of error
S.Motogna - FL&CD
LR(0) Parsing Algorithm
state := 0;
alpha := ‘$s0’; beta := ‘w$’; phi := ‘’; end := false
config := (alpha, beta, phi);
Repeat
    if action(state) = ‘shift’ then
        ActionShift(config)
    else
        if action(state) = ‘reduce l’ then
            ActionReduce(config)
        else
            if action(state) = ‘accept’ then
                write(”success”); write(phi);
                end := true;
            if action(state) = ‘error’ then
                write(”error”)
                end := true
Until end
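A sketch of the same loop in Python, driven by a hand-written LR(0) table. The grammar (1) S → aA, (2) A → bA, (3) A → c and the ACTION/GOTO dictionaries are hypothetical examples built for this illustration; the working stack interleaves symbols and state numbers as in the configurations above.

```python
# Hypothetical grammar: (1) S -> aA, (2) A -> bA, (3) A -> c
PRODUCTIONS = {1: ("S", "aA"), 2: ("A", "bA"), 3: ("A", "c")}
ACTION = {0: "shift", 1: "accept", 2: "shift", 3: ("reduce", 1),
          4: "shift", 5: ("reduce", 3), 6: ("reduce", 2)}
GOTO = {(0, "S"): 1, (0, "a"): 2, (2, "A"): 3, (2, "b"): 4, (2, "c"): 5,
        (4, "A"): 6, (4, "b"): 4, (4, "c"): 5}

def lr0_parse(w):
    """Return the string of productions, or None if w is rejected."""
    alpha = ["$", 0]              # working stack: $ s0
    beta = list(w) + ["$"]        # input stack
    pi = []                       # output
    while True:
        state = alpha[-1]
        act = ACTION[state]
        if act == "shift":
            nxt = GOTO.get((state, beta[0]))
            if nxt is None:                       # error: no transition
                return None
            alpha += [beta.pop(0), nxt]
        elif act == "accept":
            return pi if beta == ["$"] else None
        else:                                     # reduce l
            _, l = act
            lhs, rhs = PRODUCTIONS[l]
            del alpha[len(alpha) - 2 * len(rhs):] # pop |rhs| symbol/state pairs
            alpha += [lhs, GOTO[(alpha[-1], lhs)]]
            pi.insert(0, l)                       # l is put in front of π

print(lr0_parse("abbc"))   # [1, 2, 2, 3]
print(lr0_parse("ab"))     # None
```

Because each reduce puts its production number in front of π, the output reads as the rightmost derivation from the start symbol.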
S.Motogna - FL&CD
Course 9
LR(k) Parsing (cont.)
S.Motogna - FL&CD
LR(k) parsing:
LR(0), SLR, LR(1), LALR
• Define item
• Construct set of states Executed 1 time
• Construct table
S.Motogna - FL&CD
Algorithm ColCan_LR(0)
INPUT: G’ – enhanced grammar
OUTPUT: C – canonical collection of states
C := ∅;
s0 := closure({[S’ → .S]}) // state corresponding to prod. of S’ = initial state
C := C ∪ {s0}; // initialize collection with s0
repeat
    for ∀s ∈ C do
        for ∀X ∈ N ∪ Σ do
            if goto(s, X) ≠ ∅ and goto(s, X) ∉ C then
                C = C ∪ goto(s, X) // add new state
            end if
        end for
    end for
until C no longer changes
S.Motogna - FL&CD
closure(I) = I ∪ {[B → .δ] | [A → α.Bβ] ∈ I}
Algorithm Closure
I = LR(0) item of the form [A->𝜶.𝜷]
goto(s,X): in state s, search LR(0) item that has dot in front of symbol X.
Move the dot after symbol X and call closure for this new item.
S.Motogna - FL&CD
SLR Parser
• SLR = Simple LR
• Prediction = next symbols on input sequence
• Remark:
LR(0) – lots of conflicts – solved if considering prediction
=>
1. LR(0) canonical collection of states – prediction of length 0
2. Table and parsing sequence – prediction of length 1
S.Motogna - FL&CD
SLR Parsing:
S.Motogna - FL&CD
Construct SLR table
Remarks:
1. Prediction = next symbol from input sequence => FOLLOW
- see LL(1)
2. Structure – LR(k):
• Lines - states
• action + goto
   action – a column for each prediction ∈ Σ
   goto – a column for each symbol X ∈ N∪Σ
Optimize table structure: merge action and goto columns for Σ
Remark (LR(0) table):
• if s is accept state then goto(s, X) = ∅, ∀X ∈ N ∪ Σ.
• If in state s action is reduce then goto(s, X) = ∅, ∀X ∈ N ∪ Σ.
S.Motogna - FL&CD
SLR table
     | Action (and goto) | GOTO
     | a1 … an           | B1 … Bm
s0   |                   |
s1   |                   |
…    |                   |
sk   |                   |
a1,…,an ∈ Σ
B1,...,Bm ∈ N
s0,…,sk - states
S.Motogna - FL&CD
Rules for SLR table
1. If [A → α.β] ∈ si and goto(si,a) = sj then action(si,a) = shift sj
// dot is not at the end
S.Motogna - FL&CD
Remarks
1. Similarity with LR(0)
2. A grammar is SLR if the SLR table does not contain conflicts (more
than one value in a cell)
S.Motogna - FL&CD
Parsing sequences
• INPUT:
• Grammar G’ = (NU{S’}, 𝚺, P U {S’->S},S’)
• SLR table
• Input sequence w =a1…an
• OUTPUT:
if (w ∈L(G)) then string of productions
else error & location of error
S.Motogna - FL&CD
SLR = LR(0) configurations
(α, β, π)
where:
• α = working stack
• β = input stack
• π = output (result)
Initial configuration: ($s0, w$, ε)
Final configuration: ($sacc, $, π)
S.Motogna - FL&CD
Moves
head(β) = prediction
1. Shift
if action(sm,ai) = shift sj then
($s0x1...xmsm, ai...an$, π) ⊢ ($s0x1...xmsmaisj, ai+1...an$, π)
2. Reduce
if action(sm,ai) = reduce t AND (t) A → xm−p+1...xm AND goto(sm−p,A) = sj then
($s0...xmsm, ai...an$, π) ⊢ ($s0...xm−psm−pAsj, ai...an$, tπ)
3. Accept
if action(sm,$) = accept then ($sm, $, π) = acc
4. Error - otherwise
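The four moves above can be traced with a small table-driven driver. The sketch below is an illustration only: the grammar (1) S → (S), (2) S → x, its hand-built SLR table with FOLLOW(S) = { ), $ }, and the names PRODUCTIONS, ACTION, GOTO and lr_parse are assumptions made for the example, not part of the course material. The working stack alternates symbols and states, as in the configurations, and each reduce puts the production number in front of the output π.

```python
# Hypothetical grammar (assumption): (1) S -> ( S )   (2) S -> x, enhanced with S' -> S
PRODUCTIONS = {1: ("S", ["(", "S", ")"]), 2: ("S", ["x"])}

# Hand-built SLR table for that grammar; state numbers are illustrative.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}


def lr_parse(tokens):
    """Return the string of productions for `tokens`, or raise SyntaxError."""
    work = ["$", 0]                 # working stack: $ s0
    rest = list(tokens) + ["$"]     # input stack: w $
    output = []                     # π
    position = 0
    while True:
        state, prediction = work[-1], rest[position]
        kind, argument = ACTION.get((state, prediction), ("error", None))
        if kind == "shift":
            work += [prediction, argument]
            position += 1
        elif kind == "reduce":
            head, body = PRODUCTIONS[argument]
            del work[len(work) - 2 * len(body):]   # pop p symbols and p states
            exposed = work[-1]                     # state s(m-p)
            work += [head, GOTO[(exposed, head)]]
            output.insert(0, argument)             # t π
        elif kind == "accept":
            return output
        else:
            raise SyntaxError(f"unexpected {prediction!r} at position {position}")


if __name__ == "__main__":
    print(lr_parse("((x))"))
```

Keeping the symbols on the stack is not needed for parsing itself (the states suffice), but it keeps the sketch close to the notation of the configurations.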
LR(1) Parser
[A→𝜶.𝜷,u]
Construct LR(1) set of states
• Alg ColCan_LR1
• Function goto_LR1
• Alg Closure_LR1
Algorithm ColCan_LR1
INPUT: G’ – enhanced grammar
OUTPUT: C1 – canonical collection of states
C1 = ∅
s0 = Closure_LR1({[S’ → .S, $]})
C1 := C1 ∪ {s0}
Repeat
  for ∀s ∊ C1 do
    for ∀X ∊ N ∪ Σ do
      T = goto_LR1(s, X)
      if T ≠ ∅ and T ∉ C1 then
        C1 = C1 ∪ {T}
      endif
    endfor
  endfor
Until C1 unchanged
Function goto_LR1
goto_LR1 : P(ℰ1) × (N ∪ Σ) → P(ℰ1)
where ℰ1 = set of LR(1) items
Algorithm Closure_LR1
LR(k) parsing tables have two components: an action part and a shift part, called "goto". Which are the states and how are they determined?
• [A → α.Bβ, u] valid for live prefix γα
  => S ⇒*r γAw ⇒r γαBβw and u = FIRSTk(w)
• [B → δ] ∈ P
  => S ⇒*r γAw ⇒r γαBβw ⇒r γαδβw
  => [B → .δ, b] valid for live prefix γα, ∀b ∊ FIRST(βu) // First(βw) = First(βu)
• The set that contains all the items valid for the same live prefix forms a state of the automaton.
∀[A → α.Bβ, a] ∈ closure(C), ∀B → δ ∈ P: [B → .δ, b] ∈ closure(C) for ∀b ∈ FIRST(βa)
Algorithm Closure_LR1
Algorithm 3.11 ClosureLR1
INPUT: I - LR(1) item; G’ - enhanced grammar; FIRST(X), ∀X ∈ N ∪ Σ;
OUTPUT: C1 = closure(I);
C1 := {I};
repeat
  for ∀[A → α.Bβ, a] ∈ C1 do
    for ∀B → γ ∈ P do
      for ∀b ∈ FIRST(βa) do
        if [B → .γ, b] ∉ C1 then
          C1 = C1 ∪ {[B → .γ, b]}
        end if
      end for
    end for
  end for
until C1 no longer changes
The definition of the goto function is updated to:
goto_LR1(s, X) = Closure_LR1({[A → αX.β, u] | [A → α.Xβ, u] ∈ s})
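The LR(1) closure and goto can be sketched in Python as follows. The sketch rests on several assumptions that are not in the course material: the toy grammar S' → S, S → CC, C → cC | d, a precomputed FIRST table, the absence of ε-productions (so FIRST(βa) is FIRST of the first symbol of β, or {a} when β is empty), and the hypothetical names GRAMMAR, FIRST, first_of, closure_lr1 and goto_lr1. An item [A → α.β, a] is stored as (head, body, dot position, prediction).

```python
# Hypothetical toy grammar (assumption): S' -> S, S -> C C, C -> c C | d
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}

# FIRST sets, precomputed by hand for this grammar (it has no ε-productions).
FIRST = {"S": {"c", "d"}, "C": {"c", "d"}, "c": {"c"}, "d": {"d"}}


def first_of(beta, a):
    """FIRST(βa) under the no-ε assumption."""
    return FIRST[beta[0]] if beta else {a}


def closure_lr1(items):
    """For [A -> α.Bβ, a] and B -> γ, add [B -> .γ, b] for every b in FIRST(βa)."""
    c1 = set(items)
    changed = True
    while changed:  # repeat until C1 no longer changes
        changed = False
        for head, body, dot, a in list(c1):
            if dot < len(body) and body[dot] in GRAMMAR:
                nonterminal = body[dot]
                for b in first_of(body[dot + 1:], a):
                    for production in GRAMMAR[nonterminal]:
                        item = (nonterminal, production, 0, b)
                        if item not in c1:
                            c1.add(item)
                            changed = True
    return frozenset(c1)


def goto_lr1(state, symbol):
    """Move the dot over `symbol`, keep the prediction, then close."""
    moved = {(h, b, d + 1, a) for h, b, d, a in state
             if d < len(b) and b[d] == symbol}
    return closure_lr1(moved) if moved else frozenset()


if __name__ == "__main__":
    s0 = closure_lr1({("S'", ("S",), 0, "$")})
    for item in sorted(s0):
        print(item)
```

Compared with the LR(0) version, the only change is that every item carries its prediction and new items receive their predictions from FIRST(βa).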
Construct LR(1) table
• Structure – same as SLR
• Rules:
1. if [A → α.aβ, u] ∈ si and goto(si,a) = sj then action(si,a) = shift sj
2. if [A → β., u] ∈ si and A ≠ Sʹ then action(si,u) = reduce l, where l – number of production A → β
3. if [Sʹ → S., $] ∈ si then action(si,$) = acc
4. if goto(si, X) = sj then goto(si, X) = sj, ∀X ∈ N
5. otherwise = error
Remarks
1. A grammar is LR(1) if the LR(1) table does not contain conflicts
4. Define configurations and moves
• INPUT:
• Grammar G’ = (N ∪ {S’}, Σ, P ∪ {S’ → S}, S’)
• LR(1) table
• Input sequence w =a1…an
• OUTPUT:
if (w ∈L(G)) then string of productions
else error & location of error
LR(1) configurations
(α, β, π)
where:
• α = working stack
• β = input stack
• π = output (result)
Initial configuration: ($s0, w$, ε)
Final configuration: ($sacc, $, π)
Moves
head(β) = prediction
1. Shift
if action(sm,ai) = shift sj then
($s0x1...xmsm, ai...an$, π) ⊢ ($s0x1...xmsmaisj, ai+1...an$, π)
2. Reduce
if action(sm,ai) = reduce t AND (t) A → xm−p+1...xm AND goto(sm−p,A) = sj then
($s0...xmsm, ai...an$, π) ⊢ ($s0...xm−psm−pAsj, ai...an$, tπ)
3. Accept
if action(sm,$) = accept then ($sm, $, π) = acc
4. Error - otherwise
LALR Parser
• LALR = Look Ahead LR(1)
• why?
LALR principle
[A → α.β, u] ∈ si
[A → α.β, v] ∈ sj
=> [A → α.β, u|v] ∈ si,j
[A → αβ., u] ∈ si apply reduce (k) then goto(si,A) = sm
[A → αβ., v] ∈ sj apply reduce (k) then goto(sj,A) = sn
LALR Parsing
• Same as LR(1)
• Number of LALR states = number of SLR / LR(0) states
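The merging step of the LALR principle can be illustrated with a few lines of Python. This is a sketch under assumptions: LR(1) states are represented as frozensets of (head, body, dot position, prediction) tuples, and merge_by_core is a hypothetical helper name, not something defined in the course. States that contain the same items once predictions are ignored are merged, and their predictions are united.

```python
from collections import defaultdict


def merge_by_core(states):
    """Merge LR(1) states that have the same core (items without predictions)."""
    merged = defaultdict(set)
    for state in states:
        core = frozenset((head, body, dot) for head, body, dot, _ in state)
        merged[core] |= state  # union of the predictions: u | v
    return [frozenset(items) for items in merged.values()]


if __name__ == "__main__":
    # Two illustrative states with the same core [C -> d.] and one different state.
    si = frozenset({("C", ("d",), 1, "c"), ("C", ("d",), 1, "d")})
    sj = frozenset({("C", ("d",), 1, "$")})
    sk = frozenset({("S", ("C", "C"), 2, "$")})
    for state in merge_by_core([si, sj, sk]):
        print(sorted(state))
```

Because the cores are exactly the LR(0) item sets, the number of merged states matches the number of LR(0) states, which is the remark made above.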
LR(k) Parsers
• LR(0):
  • Items ignore prediction
  • Reduce can be applied only in singular states (states that contain one item)
  • Many conflicts
• SLR:
  • Uses the same items as LR(0)
  • Considers prediction when reducing
  • Eliminates several LR(0) conflicts (not all)
• LR(1):
  • Performant algorithm for the set of states
  • Generates few conflicts
  • Generates many states
• LALR:
  • Merges LR(1) states corresponding to the same kernel
  • Most used algorithm (most performant)
Quiz time
Parsing - recap
           | Descendent                   | Ascendent
Recursive  | Descendent recursive parser  | Ascendent recursive parser
Linear     | LL(1)                        | LR(0), SLR, LR(1), LALR
Parsing - recap
Eliminating conflicts is not always easy, so it is desirable to avoid them. The least restrictive class is that of LR(1) grammars, but the corresponding parser has other drawbacks, to which we will return. Figure 3.4 illustrates the inclusion between the types of grammars considered in parsing. Note that there is no obvious correlation between LL(1) grammars and LR(k) grammars: an LL(1) grammar may be LALR, SLR or even LR(0), but every LL(1) grammar is LR(1).
[Figure: nested classes LR(0) ⊂ SLR ⊂ LALR(1) ⊂ LR(1); LL(1) lies inside LR(1) and overlaps the other classes]
Figure 3.5: Relation between different classes of grammars depending on the parsing method
Structure of compiler
• analysis: Source program → scanning → Sequence of tokens → parsing → Parse tree → semantic analysis → Annotated syntax tree
• synthesis: Annotated syntax tree → generate intermediary code → Intermediary code → optimize intermediary code → Optimized intermediary code → generate object code → Object program
Course 10
Important notice
• 9.12.2021
  7.30 - Course Formal Languages and Compiler Design
  9.20 - Course Formal Languages and Compiler Design
• 16.12.2021
  7.30 – Course Parallel and Distributed Programming
  9.20 – Course Parallel and Distributed Programming
LEX & YACC
1. Have you heard about these tools?
Scanning & Parsing Tools
• Scanning => lex
• Parsing => yacc
Lex – Unix utility (flex – free reimplementation of lex, also available on Windows)
INPUT FILE FORMAT
definitions
%%
rules
%%
user code
Example 1:
%%
pattern action
The user code section:
• Is optional (if it is missing, the %% separator following the rules section can also be omitted). If it exists, the user defined C code it contains is copied without any change at the end of the file lex.yy.c.
• Normally, the user code section may contain:
- function main() containing call(s) to yylex(), if we want the scanner to work autonomously (for ex., to test it);
- other functions called from yylex() (for ex. yywrap() or functions called during actions); in this case, the user code from the definitions section must contain either prototypes or #include directives for the headers containing the prototypes
Launching the execution:
$ lex spec.lxi
$ gcc lex.yy.c -o your_lex
$ ./your_lex < input.txt
options: http://dinosaur.compilertools.net/flex/manpage.html
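The "pattern action" rule format can be mimicked in a few lines of Python to see how a generated scanner behaves. This is not lex and not its generated C code, only a rough sketch: the rule list RULES, the token class names and the function scan are assumptions made for illustration. Like lex, the sketch picks the longest match and, on a tie, the rule listed first.

```python
import re

# Hypothetical rules (pattern, token class); None means "skip", like an empty lex action.
RULES = [
    (r"[ \t\n]+", None),
    (r"if|then|else", "KEYWORD"),
    (r"[A-Za-z_][A-Za-z0-9_]*", "IDENTIFIER"),
    (r"[0-9]+", "CONSTANT"),
    (r"==|[=+\-*/<>]", "OPERATOR"),
    (r"[(){};]", "SEPARATOR"),
]


def scan(text):
    """Return the list of (token class, lexeme) pairs found in `text`."""
    tokens = []
    position = 0
    while position < len(text):
        best = None
        for pattern, kind in RULES:
            match = re.compile(pattern).match(text, position)
            # longest match wins; on equal length the earlier rule is kept
            if match and (best is None or match.end() > best[0].end()):
                best = (match, kind)
        if best is None:
            raise ValueError(f"lexical error at position {position}")
        match, kind = best
        if kind is not None:
            tokens.append((kind, match.group()))
        position = match.end()
    return tokens


if __name__ == "__main__":
    for token in scan("if (x==y) {x=y+2}"):
        print(token)
```

The input string is the one used in the scanning chapter; with these rules "ifx" is classified as an identifier because the identifier pattern matches more characters than the keyword pattern.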
Example
yacc
Parsing (syntax analysis) modeled with cfg => yacc generates:
• LALR parser
• C code
A yacc grammar file has four main sections:
%{
C declarations
%}
yacc declarations
%%
Grammar rules
%%
Additional C code
The yacc declarations section
• contains declarations that define terminal and nonterminal symbols, specify precedence, and so on.
The grammar rules section
• contains one or more yacc grammar rules of the following general form:
result : components... { C statements }
       ;
• for example:
exp: exp '+' exp
   ;
• More on
http://catalog.compilertools.net/lexparse.html
Example
Course 11
Push-Down Automata
(PDA)
Intuitive Model
Definition
• A push-down automaton (PDA) is a 7-tuple M = (Q, Σ, Γ, δ, q0, Z0, F)
where:
• Q – finite set of states
• Σ – alphabet (finite set of input symbols)
• Γ – stack alphabet (finite set of stack symbols)
• δ : Q x (Σ ∪ {ε}) x Γ → 𝒫(Q x Γ*) – transition function
• q0 ∈ Q – initial state
• Z0 ∈ Γ – initial stack symbol
• F ⊆ Q – set of final states
Push-down automaton
Transition is determined by:
• Current state
• Current input symbol
• Head of stack
Configurations and transition / moves
• Configuration:
(q, x, α) ∈ Q x Σ* x Γ*
where:
• PDA is in state q
• Input band contains x
• Stack contains α (the head of the stack is its first symbol)
• Initial configuration (q0, w, Z0)
Configurations and moves(cont.)
Representations
• Enumerate
• Table
• Graphic
Construct PDA
• L = {0^n 1^n | n ≥ 1}
• States, stack, moves?
1. States:
• Initial state:q0 – beginning and process symbols ‘0’
• When first symbol ‘1’ is found – move to new state => q1
• Final: final state q2
2. Stack:
• Z0 – initial symbol
• X – to count symbols:
• When reading a symbol ’0’ – push X in stack
• When reading a symbol ‘1’ – pop X from stack
Example 1 (enumerate)
M = ({q0,q1,q2}, {0,1}, {Z0,X}, δ, q0, Z0, {q2})
δ(q0,0,Z0) = (q0,XZ0)
δ(q0,0,X) = (q0,XX)
δ(q0,1,X) = (q1,ε)
δ(q1,1,X) = (q1,ε)
Final state: δ(q1,ε,Z0) = (q2,Z0)
Empty stack: δ(q1,ε,Z0) = (q1,ε)
Final state: (q0,0011,Z0) ⊢ (q0,011,XZ0) ⊢ (q0,11,XXZ0) ⊢ (q1,1,XZ0) ⊢ (q1,ε,Z0) ⊢ (q2,ε,Z0)
Empty stack: (q0,0011,Z0) ⊢ ... ⊢ (q1,ε,Z0) ⊢ (q1,ε,ε)
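Example 1 (table)
          | 0       | 1     | ε
q0   Z0   | q0,XZ0  |       |
     X    | q0,XX   | q1,ε  |
q1   Z0   |         |       | q2,Z0 (empty stack: q1,ε)
     X    |         | q1,ε  |
q2   Z0   |         |       |
     X    |         |       |
Example 1 (graphic)
q0 → q0: 0, Z0➝XZ0 and 0, X➝XX (push)
q0 → q1: 1, X➝ε (pop)
q1 → q1: 1, X➝ε (pop)
q1 → q2: ε, Z0➝Z0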
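The PDA of Example 1 (final state criterion) can be simulated directly from its transition function. The sketch below is an illustration with assumed names (DELTA, FINAL, accepts); it relies on the fact that this particular automaton is deterministic, so a single configuration is followed instead of a set of configurations. The stack is a Python list whose first element is the head of the stack, and the empty string stands for ε.

```python
# Transition function of Example 1; "" stands for ε, stacks are lists with the head first.
DELTA = {
    ("q0", "0", "Z0"): ("q0", ["X", "Z0"]),  # push
    ("q0", "0", "X"): ("q0", ["X", "X"]),    # push
    ("q0", "1", "X"): ("q1", []),            # pop
    ("q1", "1", "X"): ("q1", []),            # pop
    ("q1", "", "Z0"): ("q2", ["Z0"]),        # ε-move to the final state
}
FINAL = {"q2"}


def accepts(word):
    """Final state criterion: all input consumed and the PDA is in a final state."""
    state, stack, position = "q0", ["Z0"], 0
    while stack:
        head = stack[0]
        if position < len(word) and (state, word[position], head) in DELTA:
            state, pushed = DELTA[(state, word[position], head)]
            position += 1
        elif (state, "", head) in DELTA:
            state, pushed = DELTA[(state, "", head)]
        else:
            break  # no move is possible
        stack = pushed + stack[1:]  # replace the head of the stack
    return position == len(word) and state in FINAL


if __name__ == "__main__":
    for word in ["0011", "01", "001", "0101", ""]:
        print(repr(word), accepts(word))
```

A nondeterministic PDA would need a search over all reachable configurations; that generalization is left out to keep the sketch short.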
Properties
Theorem 1: For any PDA M, there exists a PDA M’ such that Lε(M) = Lf(M’)
Theorem 2: For any PDA M, there exists a context free grammar G such that Lε(M) = L(G)
Theorem 3: For any context free grammar G there exists a PDA M such that L(G) = Lε(M)
HW
• Parser:
• Descendent recursive
• LL(1)
• LR(0), SLR, LR(1)
Corresponding PDA
Structure of compiler
• analysis: Source program → scanning → Sequence of tokens → parsing → Syntax tree → semantic analysis → Annotated abstract syntax tree
• synthesis: Annotated abstract syntax tree → generate intermediary code → Intermediary code → optimize intermediary code → Optimized intermediary code → generate object code → Object program
Semantic analysis
• Parsing – result: syntax tree (ST)
Example
Semantic analysis
• Attach meanings to syntactical constructions of a program
• What:
• Identifiers -> values / how to be evaluated
• Statements -> how to be executed
• Declaration -> determine space to be allocated and location to be stored
• Examples:
• Type checking
• Verify properties
• How:
• Attribute grammars
• Manual methods
Attribute grammar
• Syntactical constructions (nonterminals) – attributes
∀ 𝑋 ∈ 𝑁 ∪ Σ: 𝐴(𝑋)
∀ 𝑝 ∈ 𝑃: 𝑅(𝑝)
Definition
AG = (G,A,R) is called attribute grammar where:
Example 1
• G = ({N,B}, {0,1}, P, N)
P: N -> NB     N1.v = 2 * N2.v + B.v
   N -> B      N.v = B.v
   B -> 0      B.v = 0
   B -> 1      B.v = 1
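The evaluation rules of Example 1 compute the value of a binary number. A minimal sketch in Python, assuming the input is given as a string of bits and using hypothetical function names b_value and n_value (one function per nonterminal, each returning the synthesized attribute v):

```python
def b_value(bit):
    """B -> 0 gives B.v = 0; B -> 1 gives B.v = 1."""
    return {"0": 0, "1": 1}[bit]


def n_value(bits):
    """N -> B gives N.v = B.v; N -> N B gives N1.v = 2 * N2.v + B.v."""
    if len(bits) == 1:
        return b_value(bits)
    return 2 * n_value(bits[:-1]) + b_value(bits[-1])


if __name__ == "__main__":
    print(n_value("1011"))
```

The recursion follows the left-recursive production N -> NB: the last bit is B and everything before it is the inner N.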
Evaluate attributes
• Traverse the tree: can be an infinite cycle
Steps
• What? - decide what you want to compute (type, value, etc.)
• Decide attributes:
• How many
• Which attribute is defined for which symbol
• Attach evaluation rules:
• For each production – which rule/rules
Example 2 (L-attribute grammar)
Decl -> DeclTip ListId      ListId.type = DeclTip.type
ListId -> Id                Id.type = ListId.type
ListId -> ListId, Id        ListId2.type = ListId1.type
                            Id.type = ListId1.type
Example 3 (S-attribute grammar)
ListDecl -> ListDecl; Decl  ListDecl1.dim = ListDecl2.dim + Decl.dim
ListDecl -> Decl            ListDecl.dim = Decl.dim
Decl -> Type ListId         Decl.dim = Type.dim * ListId.no
Type -> int                 Type.dim = 4
Type -> long                Type.dim = 8
ListId -> Id                ListId.no = 1
ListId -> ListId, Id        ListId1.no = ListId2.no + 1
Manual methods
• Symbolic execution
• Using control flow graph, simulate on stack how the program will behave
• [Grune – Modern Compiler Design]
Course 12
Structure of compiler
• analysis: Source program → scanning → Sequence of tokens → parsing → Parse tree → semantic analysis → Annotated syntax tree
• synthesis: Annotated syntax tree → generate intermediary code → Intermediary code → optimize intermediary code → Optimized intermediary code → generate object code → Object program
Generate intermediary code
Language 1, Language 2, ..., Language m → Intermediary code → Machine 1, Machine 2, ..., Machine n
Representations of intermediary code
• Annotated tree: intermediary code is generated in semantic analysis
• Polish postfix form:
• No parentheses
• Operators appear in the order of execution
• Ex.: MSIL
3 address code
= sequence of simple format statements, close to object code, with the
following general form:
< result > = < arg1 > < op > < arg2 >
Represented as:
- Quadruples
- Triples
- Indirect triples
• Quadruples:
< op > < arg1 > < arg2 > < result >
• Triples:
< op > < arg1 > < arg2 >
Special cases:
1. Expressions with unary operator: < result > = < op > < arg2 >
2. Assignment of the form a := b => the 3 address code is a = b (no operator and no 2nd argument)
3. Unconditional jump: the statement is goto L, where L is the label of a 3 address code statement
4. Conditional jump: if c goto L: if c evaluates to true then jump unconditionally to the statement labeled with L, else (if c evaluates to false) execute the next statement
5. Function call p(x1, x2, ..., xn) – sequence of statements: param x1, param x2, ..., param xn, call p, n
6. Indexed variables: < arg1 >, < arg2 >, < result > can be array elements of the form a[i]
7. Pointers, references: &x, ∗x
Example: b∗b−4∗a∗c
Quadruples:
op   arg1  arg2  rez
*    b     b     t1
*    4     a     t2
*    t2    c     t3
-    t1    t3    t4
Triples:
nr   op   arg1  arg2
(1)  *    b     b
(2)  *    4     a
(3)  *    (2)   c
(4)  -    (1)   (3)
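Both tables of this example can be generated mechanically from the expression tree. The sketch below is an illustration only: it borrows Python's own ast module as a stand-in parser (an assumption, since the course builds its own parser), and the names OPS, quadruples and triples are hypothetical. Temporaries are numbered in the order in which the operations are emitted.

```python
import ast

OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}


def quadruples(expression):
    """Return a list of (op, arg1, arg2, result) for an arithmetic expression."""
    quads = []

    def visit(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            left = visit(node.left)       # operands first (postorder traversal)
            right = visit(node.right)
            temp = f"t{len(quads) + 1}"   # new temporary for this operation
            quads.append((OPS[type(node.op)], left, right, temp))
            return temp
        raise ValueError("unsupported expression")

    visit(ast.parse(expression, mode="eval").body)
    return quads


def triples(quads):
    """Drop the result column and refer to earlier statements by their number."""
    number = {q[3]: f"({i})" for i, q in enumerate(quads, 1)}
    return [(op, number.get(a1, a1), number.get(a2, a2)) for op, a1, a2, _ in quads]


if __name__ == "__main__":
    quads = quadruples("b*b-4*a*c")
    for quad in quads:
        print(quad)
    for index, triple in enumerate(triples(quads), 1):
        print(f"({index})", triple)
```

The only difference between the two representations is how an intermediate result is named: by a temporary variable (quadruples) or by the number of the statement that computes it (triples).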
Example 2
If (a<2) then a=b else a=b*b
Optimize intermediary code
• Local optimizations:
• Perform computation at compile time – constant values
• Eliminate redundant computations
• Eliminate inaccessible code – if…then...else...
• Loop optimizations:
• Factorization of loop invariants (move invariant computations out of the loop)
• Reduce the power of operations (replace costly operations with cheaper ones)
Eliminate redundant computations
Common subexpressions: C * B and D + C * B.
Example:
D:=D+C*B
A:=D+C*B
C:=D+C*B
The corresponding 3 address code, represented as triples, is:
(1) * C B
(2) + D (1)
(3) := (2) D
(4) * C B
(5) + D (4)
(6) := (5) A
(7) * C B
(8) + D (7)
(9) := (8) C
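One way to eliminate the redundant computations of this example is to remember which expressions are still available and to reuse their results. The sketch below is a simplified illustration, not the course's algorithm: it works on quadruples instead of triples (so that results have names), it assumes every temporary is assigned exactly once, and the names CODE and eliminate_redundant are hypothetical. An assignment to a variable invalidates every remembered expression that uses that variable.

```python
# Quadruples for D:=D+C*B; A:=D+C*B; C:=D+C*B; (":=", source, None, target) is an assignment.
CODE = [
    ("*", "C", "B", "t1"), ("+", "D", "t1", "t2"), (":=", "t2", None, "D"),
    ("*", "C", "B", "t4"), ("+", "D", "t4", "t5"), (":=", "t5", None, "A"),
    ("*", "C", "B", "t7"), ("+", "D", "t7", "t8"), (":=", "t8", None, "C"),
]


def eliminate_redundant(quads):
    """Remove computations whose value is already held by an earlier temporary."""
    available = {}  # (op, arg1, arg2) -> temporary that holds the value
    alias = {}      # removed temporary -> temporary that replaces it
    result = []
    for op, arg1, arg2, target in quads:
        arg1, arg2 = alias.get(arg1, arg1), alias.get(arg2, arg2)
        if op == ":=":
            result.append((op, arg1, arg2, target))
            # the assigned variable changes: forget expressions that use it
            available = {key: temp for key, temp in available.items()
                         if target not in key[1:]}
            continue
        key = (op, arg1, arg2)
        if key in available:
            alias[target] = available[key]  # reuse the earlier result
        else:
            available[key] = target
            result.append((op, arg1, arg2, target))
    return result


if __name__ == "__main__":
    for quad in eliminate_redundant(CODE):
        print(quad)
```

On this input C * B is computed once, and D + C * B is computed twice because D changes after the first statement; the second and third statements share one result.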
Factorization of loop invariants
What is a loop invariant?
The optimization consists in moving the invariant instruction before the loop, so that it is executed only once.
Example 6.3. A program sequence that contains a loop invariant, before and after optimization:
Before:
for(i=0; i<=n; i++)
{ x=y+z;
  a[i]=i*x}
After:
x=y+z;
for(i=0; i<=n; i++)
{ a[i]=i*x}
Reducing the power of operations
This optimization aims to replace costly operations (for example multiplication) with cheaper operations (addition).
3 solutions
V1: P = a[0]
V2:
V3: P = a[n]
    For i=1 to n
      P = P*v + a[n-i]
Reduce the power of operations
The new value is obtained from the value computed at iteration i − 1.
Example 6.4. Considering the following loop, in which v is a loop invariant, it can be optimized as follows:
Before:
for(i=k; i<=n; i++)
{ t=i*v;
  ...}
After:
t1=k*v;
for(i=k; i<=n; i++)
{ t=t1;
  t1=t1+v; ...}
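That the two loops of this example compute the same values of t can be checked with a short Python sketch. The function names before and after are hypothetical, and the values of t are collected in a list only so that the two versions can be compared.

```python
def before(k, n, v):
    """Original loop: one multiplication per iteration."""
    values = []
    for i in range(k, n + 1):
        t = i * v
        values.append(t)
    return values


def after(k, n, v):
    """Optimized loop: the multiplication is replaced by a running addition."""
    values = []
    t1 = k * v            # computed once, before the loop
    for _ in range(k, n + 1):
        t = t1
        t1 = t1 + v       # value for the next iteration
        values.append(t)
    return values


if __name__ == "__main__":
    print(before(2, 6, 3))
    print(after(2, 6, 3))
```

The transformation is valid because v does not change inside the loop, so consecutive values of i*v always differ by v.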
Course 13
Structure of compiler
• analysis: Source program → scanning → Sequence of tokens → parsing → Parse tree → semantic analysis → Annotated syntax tree
• synthesis: Annotated syntax tree → generate intermediary code → Intermediary code → optimize intermediary code → Optimized intermediary code → generate object code → Object program
Generate object code
= translate intermediary code statements into statements of object
code (machine language)
Computer with accumulator
• A stack machine consists of:
• a stack for storing and manipulating values (store subexpressions and
results)
• Accumulator – to execute operation
• 2 types of statements:
• move and copy values in and from head of stack to accumulator
• Operations on stack head, functioning as follows: operands are popped from
stack, execute operation and then put the result in stack
Example: 4 * (5+1)
Code                 | acc | stack
acc ← 4              | 4   | <>
push acc             | 4   | <4>
acc ← 5              | 5   | <4>
push acc             | 5   | <5,4>
acc ← 1              | 1   | <5,4>
acc ← acc + head     | 6   | <5,4>
pop                  | 6   | <4>
acc ← acc * head     | 24  | <4>
pop                  | 24  | <>
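The trace of this example can be reproduced with a tiny simulator of the accumulator machine. The instruction encoding (tuples such as ("load", 4) or ("push",)) and the names PROGRAM and run are assumptions made for the sketch; the stack is a Python list whose first element is the head of the stack.

```python
# Code for 4 * (5 + 1), one tuple per line of the trace.
PROGRAM = [
    ("load", 4), ("push",),
    ("load", 5), ("push",),
    ("load", 1),
    ("add",), ("pop",),
    ("mul",), ("pop",),
]


def run(program):
    """Execute the program and return the final (accumulator, stack)."""
    acc, stack = None, []
    for instruction in program:
        operation = instruction[0]
        if operation == "load":      # acc ← constant
            acc = instruction[1]
        elif operation == "push":    # push acc
            stack.insert(0, acc)
        elif operation == "add":     # acc ← acc + head
            acc = acc + stack[0]
        elif operation == "mul":     # acc ← acc * head
            acc = acc * stack[0]
        elif operation == "pop":     # pop
            stack.pop(0)
        else:
            raise ValueError(f"unknown instruction {operation!r}")
    return acc, stack


if __name__ == "__main__":
    print(run(PROGRAM))
```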
Computer with registers
• Registers +
• Memory
• Instructions:
• LOAD v,R – load value v in register R
• STORE R,v – store the value from register R in memory location v
• ADD R1,R2 – add the value from register R2 to the value from register R1 and
store the result in R1 (initial value is lost!)
2 aspects:
• Register allocation – way in which variables are stored and manipulated;
• Instruction selection
Remarks:
1. A register can be available or occupied =>
VAR(R) = set of variables whose values are stored in register R
2. For every variable, the place (register, stack or memory) in which the
current value of the variable exists =>
MEM(x) = set of locations in which the value of variable x exists (will
be stored in Symbol Table)
Example: F := A ∗ B − (C + B) ∗ (A * B)
Intermediary code  | Object code  | VAR             | MEM
                   |              | VAR(R0) = {}    |
                   |              | VAR(R1) = {}    |
(1) T1 = A * B     | LOAD A, R0   | VAR(R0) = {A}   |
                   | MUL R0, B    | VAR(R0) = {T1}  | MEM(T1) = {R0}
(2) T2 = C + B     | LOAD C, R1   |                 |
                   | ADD R1, B    | VAR(R1) = {T2}  | MEM(T2) = {R1}
(3) T3 = T2 * T1   | MUL R1, R0   | VAR(R1) = {T3}  | MEM(T2) = {}
                   |              |                 | MEM(T3) = {R1}
(4) F := T1 – T3   | SUB R0, R1   | VAR(R0) = {F}   | MEM(T1) = {}
                   | STORE R0, F  | VAR(R1) = {}    | MEM(F) = {R0, F}
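The object code of this example can be checked by executing it on a small simulated register machine. The sketch makes several assumptions that are not in the slides: two registers named R0 and R1, arithmetic instructions whose second operand is either a register or a memory variable, and the hypothetical names REGISTERS, ARITHMETIC, PROGRAM and execute.

```python
REGISTERS = {"R0", "R1"}
ARITHMETIC = {
    "ADD": lambda x, y: x + y,
    "SUB": lambda x, y: x - y,
    "MUL": lambda x, y: x * y,
}

# Object code generated for F := A * B - (C + B) * (A * B)
PROGRAM = [
    ("LOAD", "A", "R0"), ("MUL", "R0", "B"),
    ("LOAD", "C", "R1"), ("ADD", "R1", "B"),
    ("MUL", "R1", "R0"),
    ("SUB", "R0", "R1"), ("STORE", "R0", "F"),
]


def execute(program, memory):
    """Run the program; `memory` maps variable names to values and is updated."""
    registers = {}
    for operation, first, second in program:
        if operation == "LOAD":        # LOAD v,R
            registers[second] = memory[first]
        elif operation == "STORE":     # STORE R,v
            memory[second] = registers[first]
        else:                          # OP R,x: result stays in R
            operand = registers[second] if second in REGISTERS else memory[second]
            registers[first] = ARITHMETIC[operation](registers[first], operand)
    return memory


if __name__ == "__main__":
    print(execute(PROGRAM, {"A": 2, "B": 3, "C": 4})["F"])
```

With A = 2, B = 3 and C = 4 the expected value is 2*3 − (4+3)*(2*3), which gives a quick check that the register assignments in the table do not overwrite a value that is still needed.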
More about Register Allocation
• Registers – limited resource
• Registers – perform operations / computations
• Variables much more than registers
Live variables
• Determine the number of variables that are live (used)
Example:
a=b+c
d=a+e
e=a+c
   | op | op1 | op2 | rez
1  | +  | b   | c   | a
2  | +  | a   | e   | d
3  | +  | a   | c   | e
   | 1 | 2 | 3
a  | x | x | x
b  | x |   |
c  | x | x | x
d  |   | x |
e  |   | x | x
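The table of this example can be computed from the quadruples. The sketch below uses a coarse model that is an assumption of the illustration: a variable is considered live from the first statement in which it occurs to the last one. The names QUADS, live_intervals and registers_needed are hypothetical.

```python
# Quadruples (op, op1, op2, rez) of the example: a=b+c, d=a+e, e=a+c
QUADS = [("+", "b", "c", "a"), ("+", "a", "e", "d"), ("+", "a", "c", "e")]


def live_intervals(quads):
    """Map each variable to (first statement, last statement) in which it occurs."""
    intervals = {}
    for number, (_, op1, op2, rez) in enumerate(quads, 1):
        for variable in (op1, op2, rez):
            first, _ = intervals.get(variable, (number, number))
            intervals[variable] = (first, number)
    return intervals


def registers_needed(intervals):
    """Largest number of variables that are live at the same statement."""
    last = max(end for _, end in intervals.values())
    return max(
        sum(1 for start, end in intervals.values() if start <= step <= end)
        for step in range(1, last + 1)
    )


if __name__ == "__main__":
    intervals = live_intervals(QUADS)
    for variable in sorted(intervals):
        print(variable, intervals[variable])
    print("registers needed:", registers_needed(intervals))
```

Intervals of this kind are the input of both allocation strategies: graph coloring connects variables whose intervals overlap, while linear scan traverses the intervals in order of their start.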
Graph coloring allocation (Chaitin et al., 1982)
• Graph:
• nodes = live variables that should be allocated to registers
• edges = connect variables whose live ranges are simultaneously live
Linear scan allocation (Poletto et al., 1999)
• determine all live ranges, each represented as an interval
• intervals are traversed chronologically
• greedy algorithm
Instruction selection
Example: F := A ∗ B − (C + B) ∗ (A * B)
Intermediary code  | Object code  | Object code (alternative selection) | VAR             | MEM
                   |              |                                     | VAR(R0) = {}    |
                   |              |                                     | VAR(R1) = {}    |
(1) T1 = A * B     | LOAD A, R0   |                                     | VAR(R0) = {T1}  | MEM(T1) = {R0}
                   | MUL R0, B    |                                     |                 |
(2) T2 = C + B     | LOAD C, R1   |                                     | VAR(R1) = {T2}  | MEM(T2) = {R1}
                   | ADD R1, B    | STORE R0, T1                        |                 |
(3) T3 = T2 * T1   | MUL R1, R0   | MUL R0, R1                          | VAR(R1) = {T3}  | MEM(T2) = {}
                   |              |                                     |                 | MEM(T3) = {R1}
(4) F := T1 – T3   |              | LOAD T1, R1                         |                 |
Alan Turing
• Enigma (cryptography)
• Turing test
• Turing machine (1937)
Turing Machine
• Mathematical model for computation
• Abstract machine
• Can simulate any algorithm
Turing Machine
• Input band (infinite)
• Reading head
• Control Unit: states
• Transitions / moves
Turing machine – definition
7-tuple M = (Q, Γ, b, Σ, δ, q0, F) where:
• Q – finite set of states
• Γ – alphabet (finite set of band symbols)
• b ∈ Γ – blank (symbol)
• Σ ⊆ Γ \ {b} – input alphabet
• δ : (Q \ F) x Γ → Q x Γ x {L,R} – transition function (L = left, R = right)
• q0 ∈ Q – initial state
• F ⊆ Q – set of final states
Example – palindrome over {0,1}
• 001100, 00100, 101101 etc. accepted
• 00110, 1011 etc. not accepted
001100
Example – palindrome over {0,1}
     | 0         | 1         | b
q0   | (p1,b,R)  | (p2,b,R)  | (qf,b,R)
p1   | (p1,0,R)  | (p1,1,R)  | (q1,b,L)
p2   | (p2,0,R)  | (p2,1,R)  | (q2,b,L)
q1   | (qr,b,L)  |           | (qf,b,R)
q2   |           | (qr,b,L)  | (qf,b,R)
qr   | (qr,0,L)  | (qr,1,L)  | (q0,b,R)
qf   |           |           |
• q0 – delete 0 in left side and search 0 in right side; delete 1 in left side and search 1 in right side
• p1, p2 – shift right
• q1 and q2 – process 0 and 1 on the right (on right is 0 or 1?)
• qf – final state
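The transition table of the palindrome machine can be run with a short simulator. The sketch is an illustration with assumed names (DELTA, accepts) and one assumption about the table layout: q1 has its move on symbol 0 and q2 on symbol 1, which matches their role of checking the rightmost symbol. The band is a dictionary that returns the blank b for every cell that was never written.

```python
from collections import defaultdict

B = "b"  # blank symbol

DELTA = {
    ("q0", "0"): ("p1", B, "R"), ("q0", "1"): ("p2", B, "R"), ("q0", B): ("qf", B, "R"),
    ("p1", "0"): ("p1", "0", "R"), ("p1", "1"): ("p1", "1", "R"), ("p1", B): ("q1", B, "L"),
    ("p2", "0"): ("p2", "0", "R"), ("p2", "1"): ("p2", "1", "R"), ("p2", B): ("q2", B, "L"),
    ("q1", "0"): ("qr", B, "L"), ("q1", B): ("qf", B, "R"),
    ("q2", "1"): ("qr", B, "L"), ("q2", B): ("qf", B, "R"),
    ("qr", "0"): ("qr", "0", "L"), ("qr", "1"): ("qr", "1", "L"), ("qr", B): ("q0", B, "R"),
}


def accepts(word):
    """Return True if the machine reaches the final state qf on `word`."""
    band = defaultdict(lambda: B, enumerate(word))  # infinite band of blanks
    state, head = "q0", 0
    while state != "qf":
        move = DELTA.get((state, band[head]))
        if move is None:
            return False          # no transition: the machine halts and rejects
        state, band[head], direction = move
        head += 1 if direction == "R" else -1
    return True


if __name__ == "__main__":
    for word in ["001100", "00100", "101101", "00110", "1011"]:
        print(word, accepts(word))
```

Each round erases the first remaining symbol, walks to the right end, checks that the last remaining symbol is the same, erases it and walks back, so the number of steps grows quadratically with the length of the input.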
Example: 0110
(q0, 0110) |- (p1, 110) |- (p1, 110b) |- (q1, 110) |- (qr, 11) |- (qr, 11) |- (qr, b11) |- (q0, 11) |- . . .
https://turingmachinesimulator.com
index
• course details
• introduction
• scanning
• context for grammar
• grammar
• finite automata
• regular grammars
• regular sets & expressions
• transformations (RG ⇔ FA ⇔ RE ⇔ RG)
• pumping lemma
• cfg
• eq transformations of cfg
• parsing
• descendent recursive parser
• LL(1) Parser
• LR(k) parsing
• LR(0) Parser
• SLR Parser
• LR(1) Parser
• LALR Parser
• parsing recap
• lex & yacc
• PDA
• context for attribute grammar
• attribute grammar
• intermediary code
• 3 address code
• Optimize intermediary code
• Generate object code
• Turing machines