Introduction to Automata and Complex Theory

Third Year - First Semester 2024

Dr. Salah Eldin Shaban

Introduction to Automata and Complex Theory Dr. Salah Eldin Shaban 1


Chapter 3: Context-Free Languages
Objectives
 Context-Free Grammars
 Pushdown Automata
 Deterministic Context-Free Languages (CFLs)
 Relationship between PDA and CFLs
 Properties of Context Free Languages
 Decision Algorithms
 Deterministic Context-Free Grammars


Introduction
 Methods of Describing Languages are:
 Finite Automata
 Regular Expressions
 Context-Free Grammars
 Characteristics of Context-Free Grammars (CFGs) include:
 A more powerful method of describing languages.
 They can describe certain features that have a recursive
structure, which makes them useful in a variety of applications.
 The languages associated with Context-Free Grammars (CFGs) are
called Context-Free Languages (CFLs).
 If G is a grammar, the language specified by G is L(G).


Context-Free Grammars (CFGs)
 A Grammar consists of a collection of substitution/rewriting
rules or productions.
 Each rule appears as a line in the grammar, comprising a symbol
and a string separated by an arrow. (e.g. A → 0A1)
 The symbol is called a variable or non-terminal. (e.g. A, B)
 The string consists of variables or non-terminals and other
symbols called terminals. (non-terminals A, B and terminals 0, 1, #)
 One variable is designated as the start variable. It usually occurs
on the left-hand side of the topmost rule. (the non-terminal A)
 For example, the grammar G1: A → 0A1
A → B
B → #
G1 contains three rules. G1’s variables are A and B, where A is the start
variable. Its terminals are 0, 1, and #.
Context-Free Grammars (CFGs) Cont.
 Use a Grammar to describe a language by generating each
string of that language in the following manner.
1. Write down the start variable. It is the variable on the left-hand
side of the top rule, unless specified otherwise.
2. Find a variable that is written down and a rule that starts with
that variable. Replace the written down variable with the right-
hand side of that rule.
3. Repeat step 2 until no variables remain.
 For example, grammar G1 generates the string 000#111.
 The sequence of substitutions to obtain a string is called a
derivation. A derivation of string 000#111 in grammar G1 is
A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111
 The same information may be represented pictorially with a
parse tree. An example of a parse tree is shown as follows.
Context-Free Grammars (CFGs) Cont.
 A parse tree for 000#111 in grammar G1

 Several rules with the same left-hand variable, such as A → 0A1
and A → B, can be abbreviated into a single line A → 0A1 | B,
using the symbol “ | ” as an “or”.
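The generation procedure for G1 can be replayed mechanically. The following is a minimal sketch, not part of the original slides: the dictionary layout, the function name `derive`, and the idea of passing the chosen alternatives as a list of indices are all assumptions made for illustration, and every symbol is assumed to be a single character.

```python
# Grammar G1 from the text: each variable maps to its alternatives.
G1 = {"A": ["0A1", "B"], "B": ["#"]}

def derive(rules, start, choices):
    """Replay a derivation: at each step, replace the leftmost variable
    by the alternative whose index is the next entry of `choices`."""
    form = start
    steps = [form]
    for choice in choices:
        # position of the leftmost variable in the current sentential form
        pos = next(i for i, sym in enumerate(form) if sym in rules)
        form = form[:pos] + rules[form[pos]][choice] + form[pos + 1:]
        steps.append(form)
    return steps

# Use A → 0A1 three times, then A → B, then B → #.
steps = derive(G1, "A", [0, 0, 0, 1, 0])
print(" ⇒ ".join(steps))
```

The printed chain is the derivation of 000#111 given above, one sentential form per step.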


Context-Free Grammars (CFGs) Cont.
 A second example of a CFG, G2 describes a fragment of the
English language.
⟨SENTENCE⟩ → ⟨NOUN-PHRASE⟩ ⟨VERB-PHRASE⟩
⟨NOUN-PHRASE⟩ → ⟨CMPLX-NOUN⟩| ⟨CMPLX-NOUN⟩ ⟨PREP-PHRASE⟩
⟨VERB-PHRASE⟩ → ⟨CMPLX-VERB⟩ | ⟨CMPLX-VERB⟩ ⟨PREP-PHRASE⟩
⟨PREP-PHRASE⟩ → ⟨PREP⟩ ⟨CMPLX-NOUN⟩
⟨CMPLX-NOUN⟩ → ⟨ARTICLE⟩ ⟨NOUN⟩
⟨CMPLX-VERB⟩ → ⟨VERB⟩ | ⟨VERB⟩ ⟨NOUN-PHRASE⟩
⟨ARTICLE⟩ → a | the
⟨NOUN⟩ → boy | girl | flower
⟨VERB⟩ → touches | likes | sees
⟨PREP⟩ → with


Context-Free Grammars (CFGs) Cont.
 Grammar G2 has 10 variables (the capitalized grammatical
terms written inside angle brackets); 27 terminals (the standard
English alphabet plus a space character); and 18 rules.
 Strings in L(G2) include:
a boy sees
the boy sees a flower
a girl with a flower likes the boy
 Each of these strings has a derivation in grammar G2.
 The following is a derivation of the first string on this list.
⟨SENTENCE⟩ ⇒ ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩
⇒ ⟨CMPLX-NOUN⟩⟨VERB-PHRASE⟩ ⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨VERB-PHRASE⟩
⇒ a ⟨NOUN⟩⟨VERB-PHRASE⟩ ⇒ a boy ⟨VERB-PHRASE⟩
⇒ a boy ⟨CMPLX-VERB⟩ ⇒ a boy ⟨VERB⟩
⇒ a boy sees
Formal Definition of a CFG
 A Grammar is a list of rules which can be used to produce or
generate all the strings of a language, and which does not
generate any strings which are not in the language.
 Mathematically, a grammar is defined as a 4-tuple:
G = (Σ, V, R, S)
 Σ is a finite set of characters, called the input alphabet, the input
symbols, or terminal symbols. It is disjoint from V.
 V is a finite set of symbols, distinct from the terminal symbols,
called non-terminal symbols (grammar variables).
 R is a finite list of rewriting rules, also called productions, which
define how strings in the language may be generated. Each of these
rewriting rules is of the form α → β, where α and β are arbitrary
strings of terminals and non-terminals, and α is not null.
Formal Definition of a CFG Cont.
 S ∈ V is a particular non-terminal called the goal symbol, which
represents exactly all the strings in the language. It is the variable on
the left-hand side of the top rule, unless specified otherwise.
 The set of terminals and non-terminals is called the vocabulary
of the grammar. For reference purposes, each of the grammars
shown will be numbered (G1, G2, G3, ...).
 For example, let the productions of a grammar be:
S → aSa
S → bSb
S → c
Then G = (Σ, V, R, S) with
Σ = {a, b, c}, V = {S}, R = {S → aSa, S → bSb, S → c}, and start symbol S
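The 4-tuple can be written down directly as a data structure. This is a minimal sketch under stated assumptions: the class name `Grammar`, its field names, and the helper `is_well_formed` are hypothetical, and each symbol is assumed to be one character so that rule sides can be plain strings.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    terminals: frozenset   # Σ
    variables: frozenset   # V
    rules: tuple           # R, as (left side, right side) pairs of strings
    start: str             # S

def is_well_formed(g):
    """Check the side conditions stated in the definition."""
    vocabulary = g.terminals | g.variables
    return (
        g.terminals.isdisjoint(g.variables)            # Σ is disjoint from V
        and g.start in g.variables                     # S ∈ V
        and all(lhs != "" for lhs, _ in g.rules)       # left side is not null
        and all(set(lhs + rhs) <= vocabulary for lhs, rhs in g.rules)
    )

# The example grammar with productions S → aSa, S → bSb, S → c.
G = Grammar(
    terminals=frozenset("abc"),
    variables=frozenset("S"),
    rules=(("S", "aSa"), ("S", "bSb"), ("S", "c")),
    start="S",
)
print(is_well_formed(G))
```

Replacing the start symbol by a terminal, for instance `G._replace(start="a")`, makes the check fail because the start symbol must be a variable.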


Formal Definition of a CFG Cont.
 A formal notion of a context-free grammar (CFG).
 If u, v, and w are strings of variables and terminals, and A →
w is a rule of the grammar, we say that uAv yields uwv,
written uAv ⇒ uwv. Say that u derives v, written u ⇒∗ v, if u
= v or if a sequence u1, u2, ..., uk exists for k ≥ 0 and u ⇒ u1 ⇒ u2
⇒ ... ⇒ uk ⇒ v.
 The language of the grammar is {w ∈ Σ∗ | S ⇒∗ w}.
 In grammar G1, V = {A, B}, Σ = {0, 1, #}, S = A, and R is the
collection of the three rules.
 In grammar G2,
V = {⟨SENTENCE⟩, ⟨NOUN-PHRASE⟩, ⟨VERB-PHRASE⟩, ⟨PREP-PHRASE⟩,
⟨CMPLX-NOUN⟩, ⟨CMPLX-VERB⟩, ⟨ARTICLE⟩, ⟨NOUN⟩, ⟨VERB⟩, ⟨PREP⟩}, and
Σ = {a, b, c, ..., z, “ ”}. The symbol “ ” is the blank symbol.


Formal Definition of a CFG Cont.
 The following symbols will have particular meanings:
A, B, C, . . . a single non-terminal.
a, b, c, . . . a single terminal.
. . ., X, Y, Z a single terminal or non-terminal.
. . ., x, y, z a string of terminals.
, , g, . . . a string of terminals and non-terminals.
Thus, a generic production can be written as A  
 Terminals: Lower case letters, operator symbols, punctuation
symbols, digits, boldface strings are all terminals.
 Non-Terminals: Upper case letters, lower case italic names are
usually non terminals.


Formal Definition of a CFG Cont.
 The Backus-Naur Form (BNF)
 BNF is the most common way of specifying productions.
 The non-terminals are enclosed in angle brackets ‹ ›,
 The arrow is replaced by ::=, as in ‹ S › ::= a ‹ S › b, which is the
BNF version of the grammar rule S → a S b
 So, in BNF, productions have the form
left side → definition
where left side ∈ (Σ ∪ V)∗ and definition ∈ (Σ ∪ V)∗
 The left side must contain at least one non-terminal, that is, left side ∩ V ≠ ϕ.
 For example, let the productions of a grammar be:
S → a S a, S → b S b, S → c
These can be represented as: S → a S a | b S b | c


Formal Definition of a CFG Cont.
 A derivation is a sequence of rewriting rules, applied to the
starting non-terminal, ending with a string of terminals.
 It demonstrates that a particular string is a member of the language.
 Assuming S is the starting non-terminal, derivations are written as:
S ⇒ α ⇒ β ⇒ γ ⇒ … ⇒ x
where α, β, and γ are strings of terminals and/or non-terminals, and x
is a string of terminals.
 The one-step derivation is defined by α A β ⇒ α γ β
where A → γ is a production in the grammar
 The language generated by G is defined by L(G) = {w ∈ Σ∗ | S ⇒+ w}
 A sentential form is the goal or start symbol, or any string that
can be derived from it, that is, any string w such that S ⇒∗ w
where w ∈ (Σ ∪ V)∗
Formal Definition of a CFG Cont.
 A recursive grammar permits derivations of the form
A ⇒+ w1 A w2 (where A ∈ V and w1, w2 ∈ (Σ ∪ V)∗)
 It is left recursive if A ⇒+ A w and right recursive if A ⇒+ w A
 A self-embedding grammar permits derivations of the form
A ⇒+ w1 A w2 (where A ∈ V and w1, w2 ∈ (Σ ∪ V)∗)
but where w1 or w2 contains at least one terminal, that is, (w1 ∩ Σ) ∪
(w2 ∩ Σ) ≠ ϕ.
 In addition, for a one-step derivation α A β ⇒ α γ β we define
⇒ is leftmost (written ⇒lm) if α does not contain a non-terminal
⇒ is rightmost (written ⇒rm) if β does not contain a non-terminal
Reflexive-transitive closure ⇒∗ (zero or more steps)
Positive closure ⇒+ (one or more steps)


Formal Definition of a CFG Cont.
 A grammar G = ({+, *, (, ), -, id}, {E}, R, E) with productions R
R = { E → E + E | E * E | ( E ) | - E | id }
 Example derivations:
E ⇒ - E ⇒ - id
E ⇒rm E + E ⇒rm E + id ⇒rm id + id
E ⇒∗ E
E ⇒∗ id + id
E ⇒+ id * id + id
 Two grammars, G1 and G2, are said to be equivalent if L(G1) =
L(G2) – i.e., they specify the same language.
 Consider a grammar with productions E → E + E | E ∗ E | x | y | z


Formal Definition of a CFG Cont.
 A leftmost derivation of x + y ∗ z is:
E ⇒ E+E ⇒ x+E ⇒ x+E∗E ⇒ x+y∗E ⇒ x+y∗z
 A rightmost derivation of x + y ∗ z is:
E ⇒ E+E ⇒ E+E∗E ⇒ E+E∗z ⇒ E+y∗z ⇒ x+y∗z
 Another leftmost derivation of x + y ∗ z is:
E ⇒ E∗E ⇒ E+E∗E ⇒ x+E∗E ⇒ x+y∗E ⇒ x+y∗z
 The following is neither left- nor rightmost:
E ⇒ E+E ⇒ E+E∗E ⇒ E+y∗E ⇒ x+y∗E ⇒ x+y∗z
 In general, a grammar can admit several different derivations
for a particular string. For example, with the rules S → aSA | BA,
A → ab, B → bA (the rules applied in the steps below):
S ⇒ aSA ⇒ aBAA ⇒ abAAA ⇒ ababAA ⇒ abababA ⇒ abababab
S ⇒ aSA ⇒ aSab ⇒ aBAab ⇒ aBabab ⇒ abAabab ⇒ abababab
S ⇒ BA ⇒ bAA ⇒ babA ⇒ babab
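Whether a derivation is leftmost or rightmost can be checked step by step. The sketch below is an illustration only: the function name `follows_order` and the list-of-strings encoding of a derivation are assumptions, the ASCII `*` stands in for ∗, and symbols are single characters. A step passes if the next sentential form results from rewriting the leftmost (or rightmost) variable with some rule.

```python
RULES = {"E": ["E+E", "E*E", "x", "y", "z"]}

def follows_order(derivation, rules, leftmost=True):
    """True if every step rewrites the leftmost (or rightmost) variable."""
    for current, following in zip(derivation, derivation[1:]):
        positions = [i for i, sym in enumerate(current) if sym in rules]
        if not positions:
            return False  # nothing left to rewrite, yet another step follows
        pos = positions[0] if leftmost else positions[-1]
        candidates = {
            current[:pos] + rhs + current[pos + 1:]
            for rhs in rules[current[pos]]
        }
        if following not in candidates:
            return False
    return True

d_left = ["E", "E+E", "x+E", "x+E*E", "x+y*E", "x+y*z"]
d_right = ["E", "E+E", "E+E*E", "E+E*z", "E+y*z", "x+y*z"]
d_neither = ["E", "E+E", "E+E*E", "E+y*E", "x+y*E", "x+y*z"]

for name, d in [("left", d_left), ("right", d_right), ("neither", d_neither)]:
    print(name, follows_order(d, RULES, True), follows_order(d, RULES, False))
```

The three lists are the first, second, and fourth derivations of x + y ∗ z shown above.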
Formal Definition of a CFG Cont.
 A derivation example, G3 consists of:
R = { S → 0S0 | 1S1 | 0 | 1 }
 Four rules, the terminal symbols {0, 1}, and
 The starting non-terminal S.
 An example of a derivation using G3 is:
S ⇒ 0S0 ⇒ 00S00 ⇒ 001S100 ⇒ 0010100
 Thus, 0010100 is in L(G3).
 G3 specifies language of palindromes of odd length over the
alphabet {0,1}.
 A palindrome is a string which reads the same from left to right
as it does from right to left.
L(G3) = {0, 1, 000, 010, 101, 111, 00000, . . . }
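The claim that G3 generates exactly the odd-length palindromes can be spot-checked by enumerating short strings. This is a minimal sketch, assuming single-character symbols and a grammar without ε-rules (so sentential forms never shrink and can be pruned by length); the function name `language_up_to` is hypothetical.

```python
from collections import deque

G3 = {"S": ["0S0", "1S1", "0", "1"]}

def language_up_to(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`.
    Assumes no ε-rules, so a form longer than max_len can be discarded."""
    seen, found = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        pos = next((i for i, sym in enumerate(form) if sym in rules), None)
        if pos is None:          # no variables left: a string of the language
            found.add(form)
            continue
        for rhs in rules[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

words = language_up_to(G3, "S", 5)
print(sorted(words, key=lambda w: (len(w), w)))
```

Every string found has odd length and reads the same in both directions, which matches the description of L(G3) for lengths up to 5.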


Formal Definition of a CFG Cont.
 A derivation example, G4 consists of:
R = { S → ASB | ε, A → a, B → b }
 Four rules, the terminal symbols {a, b}; ε is not a terminal.
 The starting non-terminal S.
 An example of a derivation using G4 is:
S ⇒ ASB ⇒ aSB ⇒ aASBB ⇒ aaSBB ⇒ aaBB ⇒ aabB ⇒ aabb
 Thus, aabb is in L(G4). G4 specifies the set of all strings of a's and
b's which contain the same number of a's as b's and in which all
the a's precede all the b's.
L(G4) = {ε, ab, aabb, aaabbb, aaaabbbb, aaaaabbbbb, . . . }
= {a^n b^n | n ≥ 0}.
 G4 language is the set of all strings of a's and b's which consist of
zero or more a's followed by exactly the same number of b's.
Formal Definition of a CFG Cont.
 A Derivation/Parse Tree is a pictorial representation of a
derivation.
 The root of the tree is labeled by the start symbol.
 Each leaf of the tree is labeled by a terminal (= token) or ε in the
derived string.
 Each interior node is labeled by a non-terminal in a sentential form.
 If A → X1 X2 … Xn is a production, then node A has immediate
children X1, X2, …, Xn where each Xi is a terminal, a non-terminal, or ε
 A parse tree for the string aaabbb using G4 is:


Formal Definition of a CFG Cont.
 A Derivation/Parse Tree.
 The yield of a parse tree is the
concatenation of its leaves from
left to right.
 The yield is always a string that is
derived from the root node.
 The yield is a terminal string whose
symbols are labels from Σ ∪ {ε}.


Formal Definition of a CFG Cont.
 For example, consider grammar G5 = ({S}, {a, b}, R, S), listed here in
the order (variables, terminals, rules, start variable).
 The set of rules, R, is S → aSb | SS | ε.
 G5 generates strings such as abab, aaabbb, and aababb.
 L(G5) is the language of all strings of properly nested parentheses,
if a is read as “(” and b as “)”.
 The right-hand side of a rule may be the empty string ε.
 Consider grammar G6 = (V, Σ, R, ⟨EXPR⟩).
V is {⟨EXPR⟩, ⟨TERM⟩, ⟨FACTOR⟩} and Σ is {a, +, x, (, )}.
 The rules are
⟨EXPR⟩ → ⟨EXPR⟩ + ⟨TERM⟩ | ⟨TERM⟩
⟨TERM⟩ → ⟨TERM⟩ x ⟨FACTOR⟩ | ⟨FACTOR⟩
⟨FACTOR⟩ → ( ⟨EXPR⟩ ) | a
 The two strings a + a x a and (a + a) x a can be generated with
grammar G6. The parse trees are shown in the following figure.
[Figure: parse trees for the strings a + a x a and (a + a) x a in grammar G6]


Designing Context-Free Grammars
 The Designing Techniques.
 Designing CFGs, like designing FAs, requires creativity.
 CFGs are even trickier to construct than FAs because we are more
accustomed to programming a machine for specific tasks than to
describing languages with grammars.
 The following techniques are helpful, singly or in combination, when
you are faced with the problem of constructing a CFG for a CFL.
 First, construct a complex CFG as the union of simpler CFGs.
 Break a complex CFL into simpler pieces and then construct individual
grammars for each piece.
 Merge the individual grammars into a grammar for the original language
by combining their rules and then adding the new rule S → S1|S2| …|Sk,
where the variables Si are the start variables for the individual grammars.
 Solving several simpler problems is often easier than solving
complicated ones.
Designing Context-Free Grammars Cont.
 For example, to get a grammar for the language {0^n 1^n | n ≥ 0} ∪
{1^n 0^n | n ≥ 0},
 First construct the grammar S1 → 0 S1 1 | ε for the language {0^n 1^n | n
≥ 0} and the grammar S2 → 1 S2 0 | ε for the language {1^n 0^n | n ≥ 0}
and then add the rule S → S1 | S2 to give the grammar
S → S1 | S2
S1 → 0 S1 1 | ε
S2 → 1 S2 0 | ε
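The merge step can be sketched as a small function. The representation is an assumption made for illustration: a grammar is a pair (rules, start variable), each right-hand side is a list of symbols, the empty list stands for ε, and the helper name `union_grammar` is hypothetical. The sketch also assumes the individual grammars already use distinct variable names.

```python
def union_grammar(grammars, new_start="S"):
    """Combine grammars by adding the rule S → S1 | S2 | ... | Sk."""
    combined = {new_start: [[start] for _, start in grammars]}
    for rules, _ in grammars:
        for var, alternatives in rules.items():
            if var in combined:
                raise ValueError(f"variable {var!r} is used twice")
            combined[var] = alternatives
    return combined

g1 = ({"S1": [["0", "S1", "1"], []]}, "S1")   # {0^n 1^n | n ≥ 0}
g2 = ({"S2": [["1", "S2", "0"], []]}, "S2")   # {1^n 0^n | n ≥ 0}

G = union_grammar([g1, g2])
for var, alternatives in G.items():
    print(var, "→", " | ".join(" ".join(rhs) or "ε" for rhs in alternatives))
```

Raising an error on a repeated variable name is a design choice of this sketch; renaming the variables automatically would work as well.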
 Second, constructing a CFG for a regular language is easy if you
first construct a DFA for that language. You can convert any DFA
into an equivalent CFG as follows:
 Make a variable Ri for each state qi of the DFA.
 Add the rule Ri → aRj to the CFG if δ(qi, a) = qj is a transition in the DFA.
Designing Context-Free Grammars Cont.
 Add the rule Ri → ε if qi is an accept state of the DFA.
 Make R0 the start variable of the grammar, where q0 is the start state
of the machine.
 Verify on your own that the resulting CFG generates the same
language that the DFA recognizes.
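The DFA-to-CFG conversion steps can be sketched as follows. The encoding is an assumption: `delta` is a dictionary from (state, symbol) to state, right-hand sides are tuples, and the empty tuple stands for ε. The variables are numbered in the order the states are listed, so the start state should be listed first for its variable to be R0. The example DFA (even number of 1s) and the helper `generates` are hypothetical and only serve to compare the grammar with the machine.

```python
def dfa_to_cfg(states, delta, start, accepting):
    """Return (rules, start variable) of a CFG equivalent to the DFA."""
    var = {q: f"R{i}" for i, q in enumerate(states)}
    rules = {var[q]: [] for q in states}
    for (q, a), target in delta.items():
        rules[var[q]].append((a, var[target]))   # Ri → a Rj
    for q in accepting:
        rules[var[q]].append(())                 # Ri → ε
    return rules, var[start]

def generates(rules, start, word):
    """Follow the only possible derivation of `word` (the DFA is deterministic)."""
    current = start
    for a in word:
        targets = [rhs[1] for rhs in rules[current] if rhs and rhs[0] == a]
        if not targets:
            return False
        current = targets[0]
    return () in rules[current]

# Hypothetical DFA accepting binary strings with an even number of 1s.
states = ["even", "odd"]
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
rules, start = dfa_to_cfg(states, delta, "even", {"even"})
print(start, rules)
print(generates(rules, start, "0110"), generates(rules, start, "010"))
```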
 Third, certain CFLs contain strings with two substrings that are
“linked” in the sense that a machine for such a language would
need to remember an unbounded amount of information about one
of the substrings to verify that it corresponds properly to the other
substring. For example, in {0^n 1^n | n ≥ 0} a machine would need to
remember the number of 0s to verify that it equals the number of 1s.
You can construct a CFG to handle this situation by using a rule of
the form R → uRv, which generates strings wherein the portion
containing the u’s corresponds to the portion containing the v’s.
Designing Context-Free Grammars Cont.
 Finally, in more complex languages, the strings may contain
certain structures that appear recursively as part of other (or the
same) structures. That situation occurs in the grammar that
generates arithmetic expressions. Any time the symbol a appears,
an entire parenthesized expression might appear recursively instead.
To achieve this effect, place the variable symbol generating the
structure in the location of the rules corresponding to where that
structure may recursively appear.


Ambiguity
 A CFG is ambiguous if there is more than one derivation
tree for a particular string.
 Such a string may have more than one interpretation.
 The languages generated only by ambiguous grammars are called
inherently ambiguous.
 For example, there are two different derivation trees for the string
var + var * var using grammar G6
G6: Expr → Expr + Expr | Expr * Expr | ( Expr ) | var | const


Ambiguity Cont.
 One way to resolve an ambiguity is to rewrite the grammar of
the language to be unambiguous, e.g. G7 is an unambiguous version of G6.
 There is only one derivation tree for var + var * var using grammar G7.
G7:
1. Expr → Expr + Term | Term
2. Term → Term * Factor | Factor
3. Factor → ( Expr ) | var | const
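The difference between the ambiguous grammar G6 and its rewrite G7 can be checked by counting derivation trees for var + var * var. The counter below is a sketch, not a general parser: it assumes the grammar has no ε-rules and no cycle of unit rules (true for both grammars here), so each symbol covers at least one token and the recursion terminates. The function name and the tuple-based rule encoding are assumptions.

```python
from functools import lru_cache

G6 = {"Expr": (("Expr", "+", "Expr"), ("Expr", "*", "Expr"),
               ("(", "Expr", ")"), ("var",), ("const",))}
G7 = {"Expr": (("Expr", "+", "Term"), ("Term",)),
      "Term": (("Term", "*", "Factor"), ("Factor",)),
      "Factor": (("(", "Expr", ")"), ("var",), ("const",))}

def count_parse_trees(rules, start, tokens):
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_symbol(sym, i, j):
        """Number of trees with root `sym` whose yield is tokens[i:j]."""
        if sym not in rules:                       # terminal symbol
            return 1 if j - i == 1 and tokens[i] == sym else 0
        return sum(count_seq(rhs, i, j) for rhs in rules[sym])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        """Number of ways the symbol sequence `rhs` derives tokens[i:j]."""
        if not rhs:
            return 1 if i == j else 0
        first, rest = rhs[0], rhs[1:]
        total = 0
        # leave at least one token for every remaining symbol
        for k in range(i + 1, j - len(rest) + 1):
            left = count_symbol(first, i, k)
            if left:
                total += left * count_seq(rest, k, j)
        return total

    return count_symbol(start, 0, len(tokens))

tokens = "var + var * var".split()
n6 = count_parse_trees(G6, "Expr", tokens)
n7 = count_parse_trees(G7, "Expr", tokens)
print(n6, n7)
```

A count greater than one for some string shows that the grammar is ambiguous; a count of one for this particular string does not by itself prove that G7 is unambiguous for every string.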


Ambiguity Cont.
 Another example of ambiguity in programming languages is
the conditional statement as defined by grammar G8:
G8: IfStmt → if Cond then Stmt | if Cond then Stmt else Stmt
 Two different derivation trees for a string: if Cond then if Cond then
Stmt else Stmt are:


Ambiguity Cont.
 A derivation tree for a string: if Cond then if Cond then OtherStmt
else OtherStmt using grammar G9 is:

G9:
1. IfStmt → Matched | Unmatched
2. Matched → if Cond then Matched else Matched | OtherStmt
3. Unmatched → if Cond then IfStmt | if Cond then Matched else Unmatched


Chomsky Hierarchy
 Noam Chomsky defined four levels of grammars according to
complexity, and a corresponding class of automata (abstract
machine types) has been identified for each level.
Chomsky
Language Class   Grammar             Recognizer
3                Regular             Finite State Automaton
2                Context-Free        Push-Down Automaton
1                Context-Sensitive   Linear-Bounded Automaton
0                Unrestricted        Turing Machine


Chomsky Hierarchy Cont.
1. Unrestricted grammar is one in which there are no restrictions on
the rewriting rules. Each production rule is of the form α → β where α
and β are arbitrary strings of grammar symbols with α ≠ ε.
2. Context-Sensitive grammar is one in which each rule must be of the
form α A γ → α β γ, where α and γ are any strings of terminals and
non-terminals (including ε), β is a non-empty string of terminals and
non-terminals, and A represents a single non-terminal. The right side
is at least as long as the left side, that is, |α A γ| ≤ |α β γ|.
3. Context-Free Grammar (CFG) is one in which each rule must be of
the form A → α, where A represents a single non-terminal and α is
any string of terminals and non-terminals (A ∈ N and α ∈ (Σ ∪ N)∗,
where N is the set of non-terminals).
4. Regular Grammar - If all production rules of a CFG are of the form
A → wB or A → w, where A and B are non-terminals and w ∈ Σ∗, then
we say that it is a right linear grammar. If all production rules of a CFG
are of the form A → Bw or A → w, we call it a left linear grammar.
A regular grammar is one that is either right linear or left linear.
Chomsky Hierarchy Cont.
 Every context-sensitive grammar is in the unrestricted class.
 Every CFG is in the context-sensitive and unrestricted classes.
 Every regular grammar is in the context-free, context-
sensitive, and unrestricted classes.

 L(regular) ⊂ L(context free) ⊂ L(context sensitive) ⊂ L(unrestricted)
 Where L(T) = {L(G) | G is of type T}, that is, the set of all languages
generated by grammars G of type T.
Chomsky Hierarchy Cont.
 A grammar G is said to be
 Regular if it is right linear, where each production is of the form
A → wB or A → w
or left linear, where each production is of the form
A → Bw or A → w
 Context-free if each production is of the form
A → α where A ∈ N and α ∈ (N ∪ Σ)∗
 Context-sensitive if each production is of the form
α A β → α γ β
where A ∈ N, α, γ, β ∈ (N ∪ Σ)∗, |γ| > 0
 Unrestricted if there are no restrictions on its productions
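These four tests can be applied to a rule list mechanically. The following sketch returns the most restrictive Chomsky type whose form every production satisfies. It is an illustration under assumptions: rules are (left side, right side) pairs of strings with single-character symbols, the function names are hypothetical, and the context-sensitive test uses the strict form α A β → α γ β with |γ| > 0, so a grammar containing a rule such as S → ε together with longer left sides is reported as type 0.

```python
def _is_context_rule(lhs, rhs, variables):
    """True if lhs → rhs has the form αAβ → αγβ with A a variable, |γ| > 0."""
    for p, sym in enumerate(lhs):
        if sym not in variables:
            continue
        alpha, beta = lhs[:p], lhs[p + 1:]
        gamma_len = len(rhs) - len(alpha) - len(beta)
        if (gamma_len > 0 and rhs[:len(alpha)] == alpha
                and rhs[len(rhs) - len(beta):] == beta):
            return True
    return False

def chomsky_type(rules, variables):
    """Most restrictive type (3, 2, 1 or 0) that all rules satisfy."""
    def right_linear(rhs):                      # w or wB
        body = rhs[:-1] if rhs and rhs[-1] in variables else rhs
        return not any(s in variables for s in body)

    def left_linear(rhs):                       # w or Bw
        body = rhs[1:] if rhs and rhs[0] in variables else rhs
        return not any(s in variables for s in body)

    single = all(len(lhs) == 1 and lhs in variables for lhs, _ in rules)
    if single and (all(right_linear(r) for _, r in rules)
                   or all(left_linear(r) for _, r in rules)):
        return 3
    if single:
        return 2
    if all(_is_context_rule(l, r, variables) for l, r in rules):
        return 1
    return 0

print(chomsky_type([("S", "aS"), ("S", "b")], {"S"}))
print(chomsky_type([("S", "aSb"), ("S", "ab")], {"S"}))
print(chomsky_type([("aB", "ab"), ("CB", "CX"), ("S", "aSBC")], {"S", "B", "C", "X"}))
print(chomsky_type([("AB", "BA")], {"A", "B"}))
```

The four example grammars are hypothetical and chosen so that each type occurs once.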


Chomsky Hierarchy Cont.
 Examples:
 Every finite language is regular! (construct a FSA for strings in
L(G))
 L1 = { a^n b^n | n ≥ 1 } is context free
 L2 = { a^n b^n c^n | n ≥ 1 } is context sensitive
 G10 is an example of a context-sensitive grammar.
G10:
1. S → aSBC | ε
2. aB → ab
3. bB → bb
4. C → c
5. CB → CX
6. CX → BX
7. BX → BC
Chomsky Hierarchy Cont.
 The Chomsky normal form (simplification of CFGs).
 It is often convenient to have CFGs in simplified form.
 The Chomsky normal form is one of the simplest and most useful
forms. It is useful in giving algorithms for working with CFGs.
 A CFG is in Chomsky normal form if every rule is of the form
1. A → BC
2. A → a
where a is any terminal and A, B, and C are any variables—
except that B and C may not be the start variable.
 In addition, permit the rule S → ε, where S is the start variable.
 Theorem: Any context-free language is generated by a
context-free grammar in Chomsky normal form.
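The definition translates directly into a check on the shape of every rule. This is a minimal sketch with an assumed encoding: `rules` maps each variable to a list of right-hand sides given as tuples of symbols, the empty tuple stands for ε, and any symbol that is not a key of `rules` is treated as a terminal. The function name `is_cnf` and both example grammars are hypothetical.

```python
def is_cnf(rules, start):
    """True if every rule is A → BC, A → a, or S → ε for the start variable S."""
    for var, alternatives in rules.items():
        for rhs in alternatives:
            if len(rhs) == 0:
                if var != start:                 # only S → ε is permitted
                    return False
            elif len(rhs) == 1:
                if rhs[0] in rules:              # A → a needs a terminal
                    return False
            elif len(rhs) == 2:
                # A → BC: both must be variables, neither the start variable
                if any(s not in rules or s == start for s in rhs):
                    return False
            else:
                return False
    return True

in_cnf = {"S": [("A", "B"), ()], "A": [("a",)], "B": [("b",)]}
not_cnf = {"S": [("a", "S", "b"), ()]}
print(is_cnf(in_cnf, "S"), is_cnf(not_cnf, "S"))
```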


Chomsky Hierarchy Cont.
 The conversion of any grammar G into Chomsky normal form.
 First, add a new start variable S0 and the rule S0 → S, where S was
the original start variable, so that the start variable does not occur
on the right-hand side of a rule.
 Then, eliminate all ε-rules of the form A → ε where A is not the
start variable.
 Also eliminate all unit rules of the form A → B.
 In both cases, patch up the grammar to be sure that it still
generates the same language.
 Finally, convert the remaining rules into the proper form.


Reduction of CFGs
 There are several ways to restrict the format of a CFG
without reducing its language generation power.
 Let L be a non-empty context-free language. Then L can be generated
by a CFG from which the following have been eliminated:
1. Useless symbols, those terminals or non-terminals that do
not appear in any derivation of a terminal string from the
start symbol.
2. Unit productions, those of the form X → Y for some non-
terminals X and Y.
3. ε-productions, those of the form X → ε for some non-
terminal X.


Elimination of Useless Production/Symbols from CFG

 A symbol is useful only if it derives a string of terminals (it is
generating) and it is reachable from the start symbol. A symbol that
derives a terminal string but is not reachable from the start symbol
is useless.
 All terminals are generating symbols.
 A symbol that is useful will be both generating and reachable.
 For example: Find the reduced grammar that is equivalent to G11

G11: 1. S → S B | a C
2. A → b S C a
3. B → a S B | b B C
4. C → a B C | a d


Elimination of Useless Production/Symbols from CFG Cont.

Solution:
1. Since C → a d, C is a generating symbol and
2. Since S → a C, S is also a generating symbol.
3. According to the production A → b S C a, A is also a generating symbol.
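4. Every production for B (B → a S B and B → b B C) has B on its right side,
so B never derives a terminal string and is not a generating symbol.
5. So, we eliminate B and every production that mentions it (S → S B,
B → a S B, B → b B C, and C → a B C), and the grammar becomes: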

G11: 1. S → a C
2. A → b S C a
3. C → a d


Elimination of Useless Production/Symbols from CFG Cont.

Solution:
S is the start symbol, and A does not appear on the right side of any
production that can be reached from S, hence A is not reachable.

So, by eliminating A we get

G11: 1. S → a C
2. C → a d

which is the reduced grammar equivalent to the given grammar, containing no
useless symbols.
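The two passes of the solution (first keep the generating symbols, then keep the reachable ones) can be sketched as follows. The encoding is an assumption: `rules` maps each variable to a list of right-hand sides given as lists of symbols, and any symbol that is not a key of `rules` is a terminal. The function name `remove_useless` is hypothetical.

```python
def remove_useless(rules, start):
    # Pass 1: find the generating variables by repeating until nothing changes.
    generating = set()
    changed = True
    while changed:
        changed = False
        for var, alternatives in rules.items():
            if var not in generating and any(
                all(s not in rules or s in generating for s in rhs)
                for rhs in alternatives
            ):
                generating.add(var)
                changed = True
    # Keep only productions built from terminals and generating variables.
    kept = {
        var: [rhs for rhs in alternatives
              if all(s not in rules or s in generating for s in rhs)]
        for var, alternatives in rules.items() if var in generating
    }
    # Pass 2: find the variables reachable from the start symbol.
    reachable, stack = {start}, [start]
    while stack:
        for rhs in kept.get(stack.pop(), []):
            for s in rhs:
                if s in kept and s not in reachable:
                    reachable.add(s)
                    stack.append(s)
    return {var: alts for var, alts in kept.items() if var in reachable}

G11 = {
    "S": [["S", "B"], ["a", "C"]],
    "A": [["b", "S", "C", "a"]],
    "B": [["a", "S", "B"], ["b", "B", "C"]],
    "C": [["a", "B", "C"], ["a", "d"]],
}
reduced = remove_useless(G11, "S")
print(reduced)
```

The order of the passes matters: removing non-generating symbols first can make further symbols unreachable, as happens with A here.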


Eliminating Unit Productions
 A unit production in a CFG is a production of the form:
Non-terminal → one non-terminal (A → B)
Algorithm to eliminate unit productions:
While (there exists a unit production A → B)
{
Select a unit production A → B such that there exists a production
B → α, where α is not a single non-terminal.
for (every non-unit production B → α)
add the production A → α to the grammar.
Eliminate A → B from the grammar.
}


Eliminating Unit Productions Cont.
 Example: Remove the unit productions from G12:
G12: 1. S → A B
2. A → a
3. B → C | b
4. C → D
5. D → E
6. E → a
 Solution: There are three unit productions in the grammar (I):
I: 3. B → C
4. C → D
5. D → E
Replacing them with the non-unit productions they lead to gives (II):
II: G12: 1. S → A B
2. A → a
3. B → a | b
4. C → a
5. D → a
6. E → a
C, D, and E are no longer reachable from S, so removing them gives (III):
III: G12: 1. S → A B
2. A → a
3. B → a | b
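Stage II of the example can be reproduced in code. Note that this sketch swaps in a closure-based variant rather than the loop given earlier: for every variable it collects all variables reachable through unit productions alone and copies their non-unit productions. The encoding (variable mapped to a set of tuples) and the function name are assumptions made for illustration.

```python
def eliminate_unit_rules(rules):
    """Return an equivalent rule set without productions of the form A → B."""
    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in rules

    result = {}
    for var in rules:
        # all variables reachable from `var` using unit productions only
        closure, stack = {var}, [var]
        while stack:
            for rhs in rules[stack.pop()]:
                if is_unit(rhs) and rhs[0] not in closure:
                    closure.add(rhs[0])
                    stack.append(rhs[0])
        result[var] = {
            rhs for b in closure for rhs in rules[b] if not is_unit(rhs)
        }
    return result

G12 = {
    "S": {("A", "B")},
    "A": {("a",)},
    "B": {("C",), ("b",)},
    "C": {("D",)},
    "D": {("E",)},
    "E": {("a",)},
}
stage_two = eliminate_unit_rules(G12)
for var, alternatives in stage_two.items():
    print(var, "→", " | ".join(sorted(" ".join(rhs) for rhs in alternatives)))
```

The variables C, D, and E still appear in the output; dropping them is the separate reachability step that leads to stage III.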


Eliminate ε-Productions
 Null productions are of the form A → ε.
 Not all ε-productions can be removed from a grammar if the
language contains ε as a word; if it does not, all of them can be removed.
 In each CFG, a non-terminal N is nullable if there is a production
N → ε or there is a derivation that starts at N and leads to ε:
N ⇒ ... ⇒ ε
 If A → ε is a production to be eliminated, look for all productions whose
right side contains A, and replace the occurrences of A by ε, in every
possible combination, to obtain the new non-ε-productions.
 Add the resultant non-ε-productions to the grammar to keep the
language the same.


Eliminate ε-Productions Cont.
 Example: Remove the null productions from G13:
G13: 1. S → A B A C
2. A → a A | ε
3. B → b B | ε
4. C → c
 Solution:
There are two null productions in the grammar, A → ε and B → ε.
To eliminate A → ε, we must change the productions containing A on the right
side. Those productions are S → A B A C and A → a A.
So, replace each occurrence of A by ε. The four new productions are in (I):
I: 1. S → B A C | A B C | B C
2. A → a


Eliminate ε-Productions Cont.
 Add these productions to the grammar and eliminate A → ε (II).

II: G13: 1. S → A B A C | B A C | A B C | B C
2. A → a A | a
3. B → b B | ε
4. C → c

 To eliminate B → ε, change the productions containing B on the right
side. Doing that generates the new productions in (III).

III: 1. S → A A C | A C | C
2. B → b


Eliminate ε-Productions Cont.
 Add these productions to the grammar and remove the
production B → ε from the grammar.
 The new grammar after removal of ε-productions is:

G13: 1. S → A B A C | A B C | B A C | B C | A A C | A C | C

2. A → a A | a
3. B → b B | b
4. C → c
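The final grammar can be cross-checked by removing both null productions in one pass. This sketch first computes the nullable variables and then, for each production, emits every variant in which some nullable symbols are dropped. The encoding (variable mapped to a set of tuples, with the empty tuple for ε) and the function name are assumptions; the sketch does not handle the case where the start symbol itself is nullable.

```python
from itertools import product

def eliminate_epsilon_rules(rules):
    # Nullable variables: those that can derive ε in one or more steps.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, alternatives in rules.items():
            if var not in nullable and any(
                all(s in nullable for s in rhs) for rhs in alternatives
            ):
                nullable.add(var)
                changed = True
    result = {}
    for var, alternatives in rules.items():
        new_alternatives = set()
        for rhs in alternatives:
            # a nullable symbol may be kept or dropped; others are always kept
            options = [((s,), ()) if s in nullable else ((s,),) for s in rhs]
            for pick in product(*options):
                candidate = tuple(s for part in pick for s in part)
                if candidate:                    # never add a new ε-production
                    new_alternatives.add(candidate)
        result[var] = new_alternatives
    return result

G13 = {
    "S": {("A", "B", "A", "C")},
    "A": {("a", "A"), ()},
    "B": {("b", "B"), ()},
    "C": {("c",)},
}
without_eps = eliminate_epsilon_rules(G13)
for var, alternatives in without_eps.items():
    print(var, "→", " | ".join(sorted(" ".join(rhs) for rhs in alternatives)))
```

Because the alternatives are stored in a set, A C is listed once even though it arises in two ways (dropping either occurrence of A together with B).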


Reduction of CFGs Cont.
 A collective example:
 Consider the following CFG and convert it to Chomsky normal form.
1. The original CFG is shown on the left. The result of applying the first step to
make a new start variable appears on the right.
S → ASA | aB          S0 → S
A → B | S             S → ASA | aB
B → b | ε             A → B | S
                      B → b | ε
2. Remove the ε-rule B → ε (result on the left), then the ε-rule A → ε (result on the right).
S0 → S                S0 → S
S → ASA | aB | a      S → ASA | aB | a | SA | AS | S
A → B | S | ε         A → B | S
B → b                 B → b


Reduction of CFGs Cont.
3. Remove the unit rule S → S (result on the left), then the unit rule S0 → S (result on the right).
S0 → S                             S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS         S → ASA | aB | a | SA | AS
A → B | S                          A → B | S
B → b                              B → b
Remove the unit rule A → B (result on the left), then the unit rule A → S (result on the right).
S0 → ASA | aB | a | SA | AS        S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS         S → ASA | aB | a | SA | AS
A → S | b                          A → b | ASA | aB | a | SA | AS
B → b                              B → b
4. Convert the remaining rules into the proper form by adding additional variables and rules.


Reduction of CFGs Cont.
 The final grammar in Chomsky normal form is equivalent to the
original grammar. (The resulting grammar has been simplified by using
a single variable U and rule U → a.)
S0 → AA1 | UB | a | SA | AS
S → AA1 | UB | a | SA | AS
A → b | AA1 | UB | a | SA | AS
A1 → SA
U→a
B→b
