readingMaterialweek4-5

This document introduces context-free languages and grammars, which are more powerful than finite automata and regular expressions for describing languages, particularly those with recursive structures. It explains the formal definition of context-free grammars, provides examples, and discusses their applications in programming language specification and compilation. The document also outlines techniques for designing context-free grammars, emphasizing the importance of creativity in their construction.

Uploaded by

squooshy2003

2

CONTEXT-FREE
LANGUAGES

In Chapter 1 we introduced two different, though equivalent, methods of describing
languages: finite automata and regular expressions. We showed that many
languages can be described in this way but that some simple languages, such as
{0^n 1^n | n ≥ 0}, cannot.
In this chapter we present context-free grammars, a more powerful method
of describing languages. Such grammars can describe certain features that have
a recursive structure, which makes them useful in a variety of applications.
Context-free grammars were first used in the study of human languages. One
way of understanding the relationship of terms such as noun, verb, and preposition
and their respective phrases leads to a natural recursion because noun phrases
may appear inside verb phrases and vice versa. Context-free grammars help us
organize and understand these relationships.
An important application of context-free grammars occurs in the specification
and compilation of programming languages. A grammar for a programming lan-
guage often appears as a reference for people trying to learn the language syntax.
Designers of compilers and interpreters for programming languages often start
by obtaining a grammar for the language. Most compilers and interpreters con-
tain a component called a parser that extracts the meaning of a program prior to
generating the compiled code or performing the interpreted execution. A num-
ber of methodologies facilitate the construction of a parser once a context-free
grammar is available. Some tools even automatically generate the parser from
the grammar.

Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the
eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional
content at any time if subsequent rights restrictions require it.

The collection of languages associated with context-free grammars are called
the context-free languages. They include all the regular languages and many
additional languages. In this chapter, we give a formal definition of context-free
grammars and study the properties of context-free languages. We also introduce
pushdown automata, a class of machines recognizing the context-free languages.
Pushdown automata are useful because they allow us to gain additional insight
into the power of context-free grammars.

2.1
CONTEXT-FREE GRAMMARS
The following is an example of a context-free grammar, which we call G1.

A → 0A1
A → B
B → #

A grammar consists of a collection of substitution rules, also called productions.
Each rule appears as a line in the grammar, comprising a symbol and
a string separated by an arrow. The symbol is called a variable. The string
consists of variables and other symbols called terminals. The variable symbols
often are represented by capital letters. The terminals are analogous to the in-
put alphabet and often are represented by lowercase letters, numbers, or special
symbols. One variable is designated as the start variable. It usually occurs on
the left-hand side of the topmost rule. For example, grammar G1 contains three
rules. G1’s variables are A and B, where A is the start variable. Its terminals are
0, 1, and #.
You use a grammar to describe a language by generating each string of that
language in the following manner.
1. Write down the start variable. It is the variable on the left-hand side of the
top rule, unless specified otherwise.
2. Find a variable that is written down and a rule that starts with that variable.
Replace the written down variable with the right-hand side of that rule.
3. Repeat step 2 until no variables remain.
For example, grammar G1 generates the string 000#111. The sequence of
substitutions to obtain a string is called a derivation. A derivation of string
000#111 in grammar G1 is
A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111.
You may also represent the same information pictorially with a parse tree. An
example of a parse tree is shown in Figure 2.1.

FIGURE 2.1
Parse tree for 000#111 in grammar G1
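
The three-step generation procedure above can be sketched in a few lines of Python. This is only an illustration: the dictionary layout for G1, the function name `derive`, and the convention of choosing each rule by its index are assumptions of the sketch, not notation from the text.

```python
# Grammar G1 from the text; storing it as a dict is an illustrative choice.
G1 = {"A": ["0A1", "B"], "B": ["#"]}

def derive(start, choices, rules):
    """Rewrite the leftmost variable once per entry in `choices`.

    Each entry is the index of the alternative to apply to that variable.
    Returns the list of strings seen along the derivation.
    """
    current = start
    steps = [current]
    for choice in choices:
        # The leftmost variable is the first symbol that has rules.
        pos = next(i for i, ch in enumerate(current) if ch in rules)
        var = current[pos]
        current = current[:pos] + rules[var][choice] + current[pos + 1:]
        steps.append(current)
    return steps

steps = derive("A", [0, 0, 0, 1, 0], G1)
print(" ⇒ ".join(steps))
```

Applying A → 0A1 three times, then A → B, then B → # reproduces the derivation of 000#111 given above.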

All strings generated in this way constitute the language of the grammar.
We write L(G1) for the language of grammar G1. Some experimentation with
the grammar G1 shows us that L(G1) is {0^n #1^n | n ≥ 0}. Any language that can
be generated by some context-free grammar is called a context-free language
(CFL). For convenience when presenting a context-free grammar, we abbreviate
several rules with the same left-hand variable, such as A → 0A1 and A → B,
into a single line A → 0A1 | B, using the symbol “ | ” as an “or”.
The following is a second example of a context-free grammar, called G2 ,
which describes a fragment of the English language.

⟨SENTENCE⟩ → ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩
⟨NOUN-PHRASE⟩ → ⟨CMPLX-NOUN⟩ | ⟨CMPLX-NOUN⟩⟨PREP-PHRASE⟩
⟨VERB-PHRASE⟩ → ⟨CMPLX-VERB⟩ | ⟨CMPLX-VERB⟩⟨PREP-PHRASE⟩
⟨PREP-PHRASE⟩ → ⟨PREP⟩⟨CMPLX-NOUN⟩
⟨CMPLX-NOUN⟩ → ⟨ARTICLE⟩⟨NOUN⟩
⟨CMPLX-VERB⟩ → ⟨VERB⟩ | ⟨VERB⟩⟨NOUN-PHRASE⟩
⟨ARTICLE⟩ → a | the
⟨NOUN⟩ → boy | girl | flower
⟨VERB⟩ → touches | likes | sees
⟨PREP⟩ → with

Grammar G2 has 10 variables (the capitalized grammatical terms written inside
brackets); 27 terminals (the standard English alphabet plus a space character);
and 18 rules. Strings in L(G2) include:

a boy sees
the boy sees a flower
a girl with a flower likes the boy

Each of these strings has a derivation in grammar G2. The following is a derivation
of the first string on this list.

⟨SENTENCE⟩ ⇒ ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩
⇒ ⟨CMPLX-NOUN⟩⟨VERB-PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨VERB-PHRASE⟩
⇒ a ⟨NOUN⟩⟨VERB-PHRASE⟩
⇒ a boy ⟨VERB-PHRASE⟩
⇒ a boy ⟨CMPLX-VERB⟩
⇒ a boy ⟨VERB⟩
⇒ a boy sees

FORMAL DEFINITION OF A CONTEXT-FREE GRAMMAR

Let’s formalize our notion of a context-free grammar (CFG).

DEFINITION 2.2
A context-free grammar is a 4-tuple (V, Σ, R, S), where
1. V is a finite set called the variables,
2. Σ is a finite set, disjoint from V , called the terminals,
3. R is a finite set of rules, with each rule being a variable and a
string of variables and terminals, and
4. S ∈ V is the start variable.

If u, v, and w are strings of variables and terminals, and A → w is a rule of the
grammar, we say that uAv yields uwv, written uAv ⇒ uwv. Say that u derives v,
written u ⇒* v, if u = v or if a sequence u1, u2, . . . , uk exists for k ≥ 0 and
u ⇒ u1 ⇒ u2 ⇒ . . . ⇒ uk ⇒ v.
The language of the grammar is {w ∈ Σ* | S ⇒* w}.
In grammar G1, V = {A, B}, Σ = {0, 1, #}, S = A, and R is the collection
of the three rules appearing on page 102. In grammar G2,
V = {⟨SENTENCE⟩, ⟨NOUN-PHRASE⟩, ⟨VERB-PHRASE⟩,
⟨PREP-PHRASE⟩, ⟨CMPLX-NOUN⟩, ⟨CMPLX-VERB⟩,
⟨ARTICLE⟩, ⟨NOUN⟩, ⟨VERB⟩, ⟨PREP⟩},
and Σ = {a, b, c, . . . , z, “ ”}. The symbol “ ” is the blank symbol, placed invisibly
after each word (a, boy, etc.), so the words won’t run together.
Often we specify a grammar by writing down only its rules. We can identify
the variables as the symbols that appear on the left-hand side of the rules and
the terminals as the remaining symbols. By convention, the start variable is the
variable on the left-hand side of the first rule.
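
The convention just described, recovering V, Σ, and S from the rules alone, can be illustrated with a short Python sketch. The input format (symbols separated by spaces, `->` for the arrow) and the name `infer_grammar` are assumptions made for this example only.

```python
def infer_grammar(rule_lines):
    """Return (variables, terminals, rules, start) from lines like 'A -> 0 A 1 | B'.

    Symbols on a right-hand side are separated by spaces (an assumption of
    this sketch, so that multi-character symbols are unambiguous).
    """
    rules = {}
    for line in rule_lines:
        head, body = line.split("->")
        head = head.strip()
        alternatives = [alt.split() for alt in body.split("|")]
        rules.setdefault(head, []).extend(alternatives)
    # Variables are the symbols that appear on a left-hand side.
    variables = set(rules)
    # Terminals are all remaining symbols.
    symbols = {sym for alts in rules.values() for alt in alts for sym in alt}
    terminals = symbols - variables
    # The start variable is the left-hand side of the first rule.
    start = rule_lines[0].split("->")[0].strip()
    return variables, terminals, rules, start

V, Sigma, R, S = infer_grammar(["A -> 0 A 1", "A -> B", "B -> #"])
print(sorted(V), sorted(Sigma), S)
```

For grammar G1 this recovers V = {A, B}, Σ = {0, 1, #}, and start variable A, matching the 4-tuple stated earlier.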

EXAMPLES OF CONTEXT-FREE GRAMMARS

EXAMPLE 2.3
Consider grammar G3 = ({S}, {a, b}, R, S). The set of rules, R, is
S → aSb | SS | ε.
This grammar generates strings such as abab, aaabbb, and aababb. You can
see more easily what this language is if you think of a as a left parenthesis “(”
and b as a right parenthesis “)”. Viewed in this way, L(G3 ) is the language of
all strings of properly nested parentheses. Observe that the right-hand side of a
rule may be the empty string ε.
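
As an informal check of this claim, the following Python sketch compares a membership test that follows the three rules of G3 with a direct nesting check, for every string over {a, b} up to length 6. The function names are hypothetical, and an exhaustive check on short strings is evidence, not a proof.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def in_G3(s):
    """Membership test that mirrors the rules S → aSb | SS | ε."""
    if s == "":
        return True                                    # S → ε
    if s[0] == "a" and s[-1] == "b" and in_G3(s[1:-1]):
        return True                                    # S → aSb
    # S → SS, trying every split into two nonempty pieces
    return any(in_G3(s[:i]) and in_G3(s[i:]) for i in range(1, len(s)))

def balanced(s):
    """Read a as '(' and b as ')' and check proper nesting."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0

for n in range(7):
    for chars in product("ab", repeat=n):
        s = "".join(chars)
        assert in_G3(s) == balanced(s)
```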

EXAMPLE 2.4
Consider grammar G4 = (V, Σ, R, ⟨EXPR ⟩).
V is {⟨EXPR⟩, ⟨TERM⟩, ⟨FACTOR⟩} and Σ is {a, +, ×, (, )}. The rules are
⟨EXPR⟩ → ⟨EXPR⟩+⟨TERM⟩ | ⟨TERM⟩
⟨TERM⟩ → ⟨TERM⟩×⟨FACTOR⟩ | ⟨FACTOR⟩
⟨FACTOR⟩ → ( ⟨EXPR⟩ ) | a
The two strings a+a×a and (a+a)×a can be generated with grammar G4.
The parse trees are shown in the following figure.

FIGURE 2.5
Parse trees for the strings a+a×a and (a+a)×a

A compiler translates code written in a programming language into another
form, usually one more suitable for execution. To do so, the compiler extracts
the meaning of the code to be compiled in a process called parsing. One representation
of this meaning is the parse tree for the code, in the context-free
grammar for the programming language. We discuss an algorithm that parses
context-free languages later in Theorem 7.16 and in Problem 7.45.
Grammar G4 describes a fragment of a programming language concerned
with arithmetic expressions. Observe how the parse trees in Figure 2.5 “group”
the operations. The tree for a+a×a groups the × operator and its operands
(the second two a’s) as one operand of the + operator. In the tree for (a+a)×a,
the grouping is reversed. These groupings fit the standard precedence of multiplication
before addition and the use of parentheses to override the standard
precedence. Grammar G4 is designed to capture these precedence relations.
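
To see the grouping concretely, here is a minimal recursive-descent sketch in Python with one function per variable of G4. It is not the parsing algorithm referred to above. Because the rules for ⟨EXPR⟩ and ⟨TERM⟩ are left-recursive, the sketch replaces that recursion with loops, which is a common workaround and an assumption of this example. The multiplication symbol is written as ×, and the nested-tuple output format is likewise an illustrative choice.

```python
def parse(text):
    """Return a nested-tuple grouping of an expression over {a, +, ×, (, )}."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected):
        nonlocal pos
        if peek() != expected:
            raise SyntaxError(f"expected {expected!r} at position {pos}")
        pos += 1

    def expr():                      # EXPR → EXPR + TERM | TERM
        node = term()
        while peek() == "+":
            eat("+")
            node = ("+", node, term())
        return node

    def term():                      # TERM → TERM × FACTOR | FACTOR
        node = factor()
        while peek() == "×":
            eat("×")
            node = ("×", node, factor())
        return node

    def factor():                    # FACTOR → ( EXPR ) | a
        if peek() == "(":
            eat("(")
            node = expr()
            eat(")")
            return node
        eat("a")
        return "a"

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree

print(parse("a+a×a"))      # ('+', 'a', ('×', 'a', 'a'))
print(parse("(a+a)×a"))    # ('×', ('+', 'a', 'a'), 'a')
```

In the first result the × subexpression is an operand of +, and in the second the grouping is reversed, as in the two parse trees.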

DESIGNING CONTEXT-FREE GRAMMARS

As with the design of finite automata, discussed in Section 1.1 (page 41), the
design of context-free grammars requires creativity. Indeed, context-free gram-
mars are even trickier to construct than finite automata because we are more
accustomed to programming a machine for specific tasks than we are to describ-
ing languages with grammars. The following techniques are helpful, singly or in
combination, when you’re faced with the problem of constructing a CFG.
First, many CFLs are the union of simpler CFLs. If you must construct a CFG for
a CFL that you can break into simpler pieces, do so and then construct individual
grammars for each piece. These individual grammars can be easily merged into
a grammar for the original language by combining their rules and then adding
the new rule S → S1 | S2 | · · · | Sk , where the variables Si are the start variables
for the individual grammars. Solving several simpler problems is often easier
than solving one complicated problem.
For example, to get a grammar for the language {0^n 1^n | n ≥ 0} ∪ {1^n 0^n | n ≥ 0},
first construct the grammar

S1 → 0 S1 1 | ε

for the language {0^n 1^n | n ≥ 0} and the grammar

S2 → 1 S2 0 | ε

for the language {1^n 0^n | n ≥ 0} and then add the rule S → S1 | S2 to give the
grammar

S → S1 | S2
S1 → 0 S1 1 | ε
S2 → 1 S2 0 | ε.
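
The merging step is mechanical, as the following Python sketch suggests. The representation (a dict from each variable to a list of right-hand sides, with the empty list standing for ε) and the name `union_grammar` are assumptions of the sketch. It also assumes the individual grammars use disjoint variable names.

```python
def union_grammar(grammars, new_start="S"):
    """Merge grammars given as (start_variable, rules) pairs.

    Assumes the grammars use disjoint variable names and that none of
    them already uses `new_start`.
    """
    # The new rule S → S1 | S2 | ... | Sk
    merged = {new_start: [[start] for start, _ in grammars]}
    # Combine the rules of the individual grammars unchanged.
    for _, rules in grammars:
        merged.update(rules)
    return merged

g1 = ("S1", {"S1": [["0", "S1", "1"], []]})   # [] stands for ε
g2 = ("S2", {"S2": [["1", "S2", "0"], []]})
G = union_grammar([g1, g2])
for head, bodies in G.items():
    print(head, "→", " | ".join(" ".join(b) or "ε" for b in bodies))
```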

Second, constructing a CFG for a language that happens to be regular is easy
if you can first construct a DFA for that language. You can convert any DFA into
an equivalent CFG as follows. Make a variable Ri for each state qi of the DFA.
Add the rule Ri → aRj to the CFG if δ(qi, a) = qj is a transition in the DFA. Add
the rule Ri → ε if qi is an accept state of the DFA. Make R0 the start variable of
the grammar, where q0 is the start state of the machine. Verify on your own that
the resulting CFG generates the same language that the DFA recognizes.
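
A minimal Python sketch of this conversion follows. The two-state DFA used here (binary strings ending in 1) is a hypothetical example that does not come from the text, and the variable naming scheme `R_q` is an assumption of the sketch.

```python
def dfa_to_cfg(states, delta, start, accept):
    """Build the rules R_q → a R_q' and R_q → ε from a DFA.

    `delta` maps (state, symbol) to the next state. The empty list
    stands for ε on a right-hand side.
    """
    rules = {f"R_{q}": [] for q in states}
    for (q, a), q_next in delta.items():
        rules[f"R_{q}"].append([a, f"R_{q_next}"])   # one rule per transition
    for q in accept:
        rules[f"R_{q}"].append([])                   # ε-rule for an accept state
    return rules, f"R_{start}"

# Hypothetical DFA: accepts binary strings that end in 1.
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q1",
}
rules, start = dfa_to_cfg(["q0", "q1"], delta, "q0", ["q1"])
for head, bodies in rules.items():
    print(head, "→", " | ".join(" ".join(b) or "ε" for b in bodies))
```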
Third, certain context-free languages contain strings with two substrings that
are “linked” in the sense that a machine for such a language would need to re-
member an unbounded amount of information about one of the substrings to
verify that it corresponds properly to the other substring. This situation occurs
in the language {0^n 1^n | n ≥ 0} because a machine would need to remember the
number of 0s in order to verify that it equals the number of 1s. You can construct
a CFG to handle this situation by using a rule of the form R → uRv, which gen-
erates strings wherein the portion containing the u’s corresponds to the portion
containing the v’s.
Finally, in more complex languages, the strings may contain certain structures
that appear recursively as part of other (or the same) structures. That situation
occurs in the grammar that generates arithmetic expressions in Example 2.4.
Any time the symbol a appears, an entire parenthesized expression might appear
recursively instead. To achieve this effect, place the variable symbol generating
the structure in the location of the rules corresponding to where that structure
may recursively appear.

AMBIGUITY
Sometimes a grammar can generate the same string in several different ways.
Such a string will have several different parse trees and thus several different
meanings. This result may be undesirable for certain applications, such as pro-
gramming languages, where a program should have a unique interpretation.
If a grammar generates the same string in several different ways, we say that
the string is derived ambiguously in that grammar. If a grammar generates some
string ambiguously, we say that the grammar is ambiguous.
For example, consider grammar G5 :

⟨EXPR⟩ → ⟨EXPR⟩+⟨EXPR⟩ | ⟨EXPR⟩×⟨EXPR⟩ | ( ⟨EXPR⟩ ) | a

This grammar generates the string a+a×a ambiguously. The following figure
shows the two different parse trees.

FIGURE 2.6
The two parse trees for the string a+a×a in grammar G5

This grammar doesn’t capture the usual precedence relations and so may
group the + before the × or vice versa. In contrast, grammar G4 generates
exactly the same language, but every generated string has a unique parse tree.
Hence G4 is unambiguous, whereas G5 is ambiguous.
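
The ambiguity of G5 can be observed by counting parse trees. The Python sketch below counts them by trying each rule of G5 at the root and, for the two binary rules, each position of the operator. The function name is hypothetical, and the multiplication symbol is written as ×.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_trees_G5(s):
    """Number of parse trees for the string s from EXPR in grammar G5."""
    total = 0
    if s == "a":
        total += 1                                     # EXPR → a
    if len(s) >= 2 and s[0] == "(" and s[-1] == ")":
        total += count_trees_G5(s[1:-1])               # EXPR → ( EXPR )
    for i, ch in enumerate(s):
        if ch in "+×":                                 # EXPR → EXPR op EXPR
            total += count_trees_G5(s[:i]) * count_trees_G5(s[i + 1:])
    return total

print(count_trees_G5("a+a×a"))     # 2
print(count_trees_G5("(a+a)×a"))   # 1
```

The count of 2 for a+a×a corresponds to the two trees in Figure 2.6, while the parenthesized string has only one.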
Grammar G2 (page 103) is another example of an ambiguous grammar. The
sentence the girl touches the boy with the flower has two different
derivations. In Exercise 2.8 you are asked to give the two parse trees and observe
their correspondence with the two different ways to read that sentence.
Now we formalize the notion of ambiguity. When we say that a grammar
generates a string ambiguously, we mean that the string has two different parse
trees, not two different derivations. Two derivations may differ merely in the
order in which they replace variables yet not in their overall structure. To con-
centrate on structure, we define a type of derivation that replaces variables in a
fixed order. A derivation of a string w in a grammar G is a leftmost derivation if
at every step the leftmost remaining variable is the one replaced. The derivation
preceding Definition 2.2 (page 104) is a leftmost derivation.

DEFINITION 2.7
A string w is derived ambiguously in context-free grammar G if
it has two or more different leftmost derivations. Grammar G is
ambiguous if it generates some string ambiguously.

Sometimes when we have an ambiguous grammar we can find an unambiguous
grammar that generates the same language. Some context-free languages,
however, can be generated only by ambiguous grammars. Such languages are
called inherently ambiguous. Problem 2.29 asks you to prove that the language
{a^i b^j c^k | i = j or j = k} is inherently ambiguous.

CHOMSKY NORMAL FORM

When working with context-free grammars, it is often convenient to have them
in simplified form. One of the simplest and most useful forms is called the
Chomsky normal form. Chomsky normal form is useful in giving algorithms
for working with context-free grammars, as we do in Chapters 4 and 7.

DEFINITION 2.8
A context-free grammar is in Chomsky normal form if every rule is
of the form
A → BC
A → a
where a is any terminal and A, B, and C are any variables—except
that B and C may not be the start variable. In addition, we permit
the rule S → ε, where S is the start variable.

THEOREM 2.9
Any context-free language is generated by a context-free grammar in Chomsky
normal form.

PROOF IDEA We can convert any grammar G into Chomsky normal form.
The conversion has several stages wherein rules that violate the conditions are
replaced with equivalent ones that are satisfactory. First, we add a new start
variable. Then, we eliminate all ε-rules of the form A → ε. We also eliminate
all unit rules of the form A → B. In both cases we patch up the grammar to be
sure that it still generates the same language. Finally, we convert the remaining
rules into the proper form.

PROOF First, we add a new start variable S0 and the rule S0 → S, where
S was the original start variable. This change guarantees that the start variable
doesn’t occur on the right-hand side of a rule.
Second, we take care of all ε-rules. We remove an ε-rule A → ε, where A
is not the start variable. Then for each occurrence of an A on the right-hand
side of a rule, we add a new rule with that occurrence deleted. In other words,
if R → uAv is a rule in which u and v are strings of variables and terminals, we
add rule R → uv. We do so for each occurrence of an A, so the rule R → uAvAw
causes us to add R → uvAw, R → uAvw, and R → uvw. If we have the rule
R → A, we add R → ε unless we had previously removed the rule R → ε. We
repeat these steps until we eliminate all ε-rules not involving the start variable.
Third, we handle all unit rules. We remove a unit rule A → B. Then,
whenever a rule B → u appears, we add the rule A → u unless this was a unit
rule previously removed. As before, u is a string of variables and terminals. We
repeat these steps until we eliminate all unit rules.
Finally, we convert all remaining rules into the proper form. We replace each
rule A → u1 u2 · · · uk, where k ≥ 3 and each ui is a variable or terminal symbol,
with the rules A → u1 A1, A1 → u2 A2, A2 → u3 A3, . . . , and Ak−2 → uk−1 uk.
The Ai’s are new variables. We replace any terminal ui in the preceding rule(s)
with the new variable Ui and add the rule Ui → ui.
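
The final stage of the conversion can be sketched in Python as follows. This covers only the last step, for a single rule whose right-hand side has at least two symbols, and the names of the new variables (`A_1`, `A_2`, ... and `U_x` for a terminal x) are choices made for this sketch, not notation fixed by the proof.

```python
def to_binary_rules(head, body, variables):
    """Split head → u1 u2 ... uk (k ≥ 2) into rules with two variables each.

    Returns a list of (head, body) pairs. Fresh variable names are
    hypothetical and assumed not to clash with existing ones.
    """
    known = set(variables)
    chain = []
    current = head
    # A → u1 A1, A1 → u2 A2, ..., ending with a rule for the last two symbols.
    for i in range(len(body) - 2):
        fresh = f"{head}_{i + 1}"
        known.add(fresh)
        chain.append((current, [body[i], fresh]))
        current = fresh
    chain.append((current, list(body[-2:])))

    # Replace each terminal u by a new variable U_u and add the rule U_u → u.
    result, added = [], set()
    for h, b in chain:
        new_body = []
        for sym in b:
            if sym in known:
                new_body.append(sym)
            else:
                new_body.append(f"U_{sym}")
                if sym not in added:
                    added.add(sym)
                    result.append((f"U_{sym}", [sym]))
        result.append((h, new_body))
    return result

print(to_binary_rules("S", ["A", "S", "A"], {"S", "A", "B"}))
print(to_binary_rules("S", ["a", "B"], {"S", "A", "B"}))
```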

EXAMPLE 2.10
Let G6 be the following CFG and convert it to Chomsky normal form by using
the conversion procedure just given. The series of grammars presented illus-
trates the steps in the conversion. Rules shown in bold have just been added.
Rules shown in gray have just been removed.

1. The original CFG G6 is shown on the left. The result of applying the first
step to make a new start variable appears on the right.
S → ASA | aB          S0 → S
A → B | S             S → ASA | aB
B → b | ε             A → B | S
                      B → b | ε

2. Remove ε-rules B → ε, shown on the left, and A → ε, shown on the right.

S0 → S                        S0 → S
S → ASA | aB | a              S → ASA | aB | a | SA | AS | S
A → B | S | ε                 A → B | S | ε
B → b | ε                     B → b

3a. Remove unit rules S → S, shown on the left, and S0 → S, shown on the
right.
S0 → S                                S0 → S | ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS | S        S → ASA | aB | a | SA | AS
A → B | S                             A → B | S
B → b                                 B → b

3b. Remove unit rules A → B and A → S.

S0 → ASA | aB | a | SA | AS        S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS         S → ASA | aB | a | SA | AS
A → B | S | b                      A → S | b | ASA | aB | a | SA | AS
B → b                              B → b

4. Convert the remaining rules into the proper form by adding additional vari-
ables and rules. The final grammar in Chomsky normal form is equivalent to G6 .
(Actually the procedure given in Theorem 2.9 produces several variables Ui and
several rules Ui → a. We simplified the resulting grammar by using a single
variable U and rule U → a.)
S0 → AA1 | UB | a | SA | AS
S → AA1 | UB | a | SA | AS
A → b | AA1 | UB | a | SA | AS
A1 → SA
U → a
B → b
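
As a sanity check on the result, the following Python sketch tests the conditions of Definition 2.8 rule by rule and applies them to the final grammar and to the original G6. The dict representation and the name `is_cnf` are assumptions of the sketch, with the empty list standing for ε.

```python
def is_cnf(rules, start):
    """Check every rule against the conditions of Chomsky normal form."""
    variables = set(rules)
    for head, bodies in rules.items():
        for body in bodies:
            if len(body) == 0:
                ok = head == start                      # only S → ε is permitted
            elif len(body) == 1:
                ok = body[0] not in variables           # A → a, a terminal
            elif len(body) == 2:
                # A → BC with B and C variables other than the start variable
                ok = all(s in variables and s != start for s in body)
            else:
                ok = False
            if not ok:
                return False
    return True

common = [["A", "A1"], ["U", "B"], ["a"], ["S", "A"], ["A", "S"]]
final = {
    "S0": list(common),
    "S": list(common),
    "A": [["b"]] + common,
    "A1": [["S", "A"]],
    "U": [["a"]],
    "B": [["b"]],
}
G6 = {"S": [["A", "S", "A"], ["a", "B"]], "A": [["B"], ["S"]], "B": [["b"], []]}

print(is_cnf(final, "S0"))   # True
print(is_cnf(G6, "S"))       # False
```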

2.2
PUSHDOWN AUTOMATA
In this section we introduce a new type of computational model called pushdown
automata. These automata are like nondeterministic finite automata but have an
extra component called a stack. The stack provides additional memory beyond
the finite amount available in the control. The stack allows pushdown automata
to recognize some nonregular languages.
Pushdown automata are equivalent in power to context-free grammars. This
equivalence is useful because it gives us two options for proving that a language is
context free. We can give either a context-free grammar generating it or a push-
down automaton recognizing it. Certain languages are more easily described in
terms of generators, whereas others are more easily described by recognizers.
The following figure is a schematic representation of a finite automaton. The
control represents the states and transition function, the tape contains the in-
put string, and the arrow represents the input head, pointing at the next input
symbol to be read.

FIGURE 2.11
Schematic of a finite automaton

With the addition of a stack component we obtain a schematic representation
of a pushdown automaton, as shown in the following figure.

FIGURE 2.12
Schematic of a pushdown automaton

A pushdown automaton (PDA) can write symbols on the stack and read them
back later. Writing a symbol “pushes down” all the other symbols on the stack.
At any time the symbol on the top of the stack can be read and removed. The
remaining symbols then move back up. Writing a symbol on the stack is of-
ten referred to as pushing the symbol, and removing a symbol is referred to as
popping it. Note that all access to the stack, for both reading and writing, may
be done only at the top. In other words a stack is a “last in, first out” storage
device. If certain information is written on the stack and additional information
is written afterward, the earlier information becomes inaccessible until the later
information is removed.
Plates on a cafeteria serving counter illustrate a stack. The stack of plates
rests on a spring so that when a new plate is placed on top of the stack, the plates
below it move down. The stack on a pushdown automaton is like a stack of
plates, with each plate having a symbol written on it.
A stack is valuable because it can hold an unlimited amount of information.
Recall that a finite automaton is unable to recognize the language {0^n 1^n | n ≥ 0}
because it cannot store very large numbers in its finite memory. A PDA is able to
recognize this language because it can use its stack to store the number of 0s it
has seen. Thus the unlimited nature of a stack allows the PDA to store numbers of
unbounded size. The following informal description shows how the automaton
for this language works.
Read symbols from the input. As each 0 is read, push it onto the stack. As
soon as 1s are seen, pop a 0 off the stack for each 1 read. If reading the
input is finished exactly when the stack becomes empty of 0s, accept the
input. If the stack becomes empty while 1s remain or if the 1s are finished
while the stack still contains 0s or if any 0s appear in the input following
1s, reject the input.
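
The informal description above translates almost line for line into a short Python sketch that uses a list as the stack. The function name is hypothetical. The flag `seen_one` stands in for the state change the automaton makes when the first 1 arrives.

```python
def accepts_0n1n(w):
    """Follow the informal description: push 0s, then pop one 0 per 1."""
    stack = []
    seen_one = False
    for ch in w:
        if ch == "0":
            if seen_one:
                return False          # a 0 appears after some 1
            stack.append("0")
        elif ch == "1":
            seen_one = True
            if not stack:
                return False          # stack empty while 1s remain
            stack.pop()
        else:
            return False              # symbol outside the alphabet {0, 1}
    return not stack                  # accept only if no 0s are left over

for w in ["", "01", "000111", "001", "011", "0101"]:
    print(repr(w), accepts_0n1n(w))
```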
As mentioned earlier, pushdown automata may be nondeterministic. Deterministic
and nondeterministic pushdown automata are not equivalent in power.
Nondeterministic pushdown automata recognize certain languages that no deterministic
pushdown automata can recognize, as we will see in Section 2.4. We
give languages requiring nondeterminism in Examples 2.16 and 2.18. Recall
that deterministic and nondeterministic finite automata do recognize the same
class of languages, so the pushdown automata situation is different. We focus on
nondeterministic pushdown automata because these automata are equivalent in
power to context-free grammars.

FORMAL DEFINITION OF A PUSHDOWN AUTOMATON

The formal definition of a pushdown automaton is similar to that of a finite
automaton, except for the stack. The stack is a device containing symbols drawn
from some alphabet. The machine may use different alphabets for its input and
its stack, so now we specify both an input alphabet Σ and a stack alphabet Γ.
At the heart of any formal definition of an automaton is the transition func-
tion, which describes its behavior. Recall that Σε = Σ ∪ {ε} and Γε = Γ ∪ {ε}.
The domain of the transition function is Q × Σε × Γε . Thus the current state,
next input symbol read, and top symbol of the stack determine the next move of
a pushdown automaton. Either symbol may be ε, causing the machine to move
without reading a symbol from the input or without reading a symbol from the
stack.
For the range of the transition function we need to consider what to allow
the automaton to do when it is in a particular situation. It may enter some
new state and possibly write a symbol on the top of the stack. The function δ
can indicate this action by returning a member of Q together with a member
of Γε , that is, a member of Q × Γε . Because we allow nondeterminism in this
model, a situation may have several legal next moves. The transition function
incorporates nondeterminism in the usual way, by returning a set of members of
Q × Γε, that is, a member of P(Q × Γε). Putting it all together, our transition
function δ takes the form δ : Q × Σε × Γε → P(Q × Γε).

DEFINITION 2.13
A pushdown automaton is a 6-tuple (Q, Σ, Γ, δ, q0 , F ), where Q, Σ,
Γ, and F are all finite sets, and
1. Q is the set of states,
2. Σ is the input alphabet,
3. Γ is the stack alphabet,
4. δ : Q × Σε × Γε → P(Q × Γε) is the transition function,
5. q0 ∈ Q is the start state, and
6. F ⊆ Q is the set of accept states.

A pushdown automaton M = (Q, Σ, Γ, δ, q0, F) computes as follows. It accepts
input w if w can be written as w = w_1 w_2 · · · w_m, where each w_i ∈ Σε and
sequences of states r_0, r_1, . . . , r_m ∈ Q and strings s_0, s_1, . . . , s_m ∈ Γ* exist that
satisfy the following three conditions. The strings s_i represent the sequence of
stack contents that M has on the accepting branch of the computation.
1. r_0 = q0 and s_0 = ε. This condition signifies that M starts out properly, in
the start state and with an empty stack.
2. For i = 0, . . . , m − 1, we have (r_{i+1}, b) ∈ δ(r_i, w_{i+1}, a), where s_i = at
and s_{i+1} = bt for some a, b ∈ Γε and t ∈ Γ*. This condition states that M
moves properly according to the state, stack, and next input symbol.
3. r_m ∈ F. This condition states that an accept state occurs at the input end.

EXAMPLES OF PUSHDOWN AUTOMATA

EXAMPLE 2.14
The following is the formal description of the PDA (page 112) that recognizes
the language {0^n 1^n | n ≥ 0}. Let M1 be (Q, Σ, Γ, δ, q1, F), where
Q = {q1 , q2 , q3 , q4 },
Σ = {0,1},
Γ = {0, $},
F = {q1 , q4 }, and
δ is given by the following table, wherein blank entries signify ∅.

Input: | 0                              | 1                              | ε
Stack: | 0        | $        | ε        | 0        | $        | ε        | 0        | $        | ε
q1     |          |          |          |          |          |          |          |          | {(q2,$)}
q2     |          |          | {(q2,0)} | {(q3,ε)} |          |          |          |          |
q3     |          |          |          | {(q3,ε)} |          |          |          | {(q4,ε)} |
q4     |          |          |          |          |          |          |          |          |

We can also use a state diagram to describe a PDA, as in Figures 2.15, 2.17,
and 2.19. Such diagrams are similar to the state diagrams used to describe finite
automata, modified to show how the PDA uses its stack when going from state
to state. We write “a,b → c” to signify that when the machine is reading an
a from the input, it may replace the symbol b on the top of the stack with a c.
Any of a, b, and c may be ε. If a is ε, the machine may make this transition
without reading any symbol from the input. If b is ε, the machine may make
this transition without reading and popping any symbol from the stack. If c
is ε, the machine does not write any symbol on the stack when going along this
transition.

FIGURE 2.15
State diagram for the PDA M1 that recognizes {0^n 1^n | n ≥ 0}
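
The transition notation “a,b → c” maps directly onto a dictionary keyed by (state, a, b), and the acceptance condition can then be tested by searching over all branches of the computation. The Python sketch below does this for M1. The function name, the use of the empty string for ε, and the restriction to single-character stack symbols are assumptions of the sketch. The search may fail to terminate for a PDA whose ε-moves can grow the stack without bound, which is not the case for M1.

```python
def pda_accepts(delta, start, accept, w):
    """Search every branch of a PDA's computation on input w.

    `delta` maps (state, a, b) to a set of (state, c) pairs, where "" plays
    the role of ε. The stack is a string whose top is at index 0.
    """
    seen = set()
    frontier = [(start, 0, "")]
    while frontier:
        config = frontier.pop()
        if config in seen:
            continue
        seen.add(config)
        state, pos, stack = config
        if pos == len(w) and state in accept:
            return True                       # accept state at the end of input
        inputs = [""] + ([w[pos]] if pos < len(w) else [])
        tops = [""] + ([stack[0]] if stack else [])
        for a in inputs:
            for b in tops:
                for nxt, c in delta.get((state, a, b), ()):
                    # Read a, pop b, push c.
                    frontier.append((nxt, pos + len(a), c + stack[len(b):]))
    return False

# Transition function of M1; entries not listed are the empty set.
M1 = {
    ("q1", "", ""): {("q2", "$")},
    ("q2", "0", ""): {("q2", "0")},
    ("q2", "1", "0"): {("q3", "")},
    ("q3", "1", "0"): {("q3", "")},
    ("q3", "", "$"): {("q4", "")},
}
for w in ["", "01", "0011", "001", "0101"]:
    print(repr(w), pda_accepts(M1, "q1", {"q1", "q4"}, w))
```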

The formal definition of a PDA contains no explicit mechanism to allow the
PDA to test for an empty stack. This PDA is able to get the same effect by initially
placing a special symbol $ on the stack. Then if it ever sees the $ again, it knows
that the stack effectively is empty. Subsequently, when we refer to testing for an
empty stack in an informal description of a PDA, we implement the procedure in
the same way.
Similarly, PDAs cannot test explicitly for having reached the end of the input
string. This PDA is able to achieve that effect because the accept state takes effect
only when the machine is at the end of the input. Thus from now on, we assume
that PDAs can test for the end of the input, and we know that we can implement
it in the same manner.

EXAMPLE 2.16
This example illustrates a pushdown automaton that recognizes the language

{a^i b^j c^k | i, j, k ≥ 0 and i = j or i = k}.

Informally, the PDA for this language works by first reading and pushing
the a’s. When the a’s are done, the machine has all of them on the stack so
that it can match them with either the b’s or the c’s. This maneuver is a bit
tricky because the machine doesn’t know in advance whether to match the a’s
with the b’s or the c’s. Nondeterminism comes in handy here.
Using its nondeterminism, the PDA can guess whether to match the a’s with
the b’s or with the c’s, as shown in Figure 2.17. Think of the machine as having
two branches of its nondeterminism, one for each possible guess. If either of
them matches, that branch accepts and the entire machine accepts. Problem 2.57
asks you to show that nondeterminism is essential for recognizing this language
with a PDA.

FIGURE 2.17
State diagram for PDA M2 that recognizes
{a^i b^j c^k | i, j, k ≥ 0 and i = j or i = k}
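
The two-branch idea can be imitated in Python by running one deterministic pass per guess and accepting if either pass succeeds. This sketch simulates the guessing only. It is not a transcription of the state diagram for M2, and the function names are hypothetical.

```python
def branch(w, match_with):
    """One branch of the guess: push the a's, then pop one per `match_with` symbol."""
    stack = []
    phase = "a"
    for ch in w:
        if ch not in "abc" or ch < phase:
            return False              # bad symbol, or symbols out of a*b*c* order
        phase = ch
        if ch == "a":
            stack.append("a")
        elif ch == match_with:
            if not stack:
                return False          # more matching symbols than a's
            stack.pop()
        # symbols of the other kind are read without touching the stack
    return not stack                  # every a was matched

def accepts_example_2_16(w):
    """Accept if either guess (match a's with b's, or with c's) succeeds."""
    return branch(w, "b") or branch(w, "c")

for w in ["", "aabbc", "aabcc", "aab", "bc", "ba"]:
    print(repr(w), accepts_example_2_16(w))
```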

EXAMPLE 2.18
In this example we give a PDA M3 recognizing the language {ww^R | w ∈ {0,1}*}.
Recall that w^R means w written backwards. The informal description and state
diagram of the PDA follow.
Begin by pushing the symbols that are read onto the stack. At each point,
nondeterministically guess that the middle of the string has been reached and
then change into popping off the stack for each symbol read, checking to see that
they are the same. If they were always the same symbol and the stack empties at
the same time as the input is finished, accept; otherwise reject.

FIGURE 2.19
State diagram for the PDA M3 that recognizes {ww^R | w ∈ {0,1}*}

Problem 2.58 shows that this language requires a nondeterministic PDA.
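
A deterministic program can imitate the nondeterministic guess in the description of M3 by simply trying every possible position for the middle. The Python sketch below does so with an explicit stack. The function name is hypothetical, and trying all guesses in a loop is a simulation device, not something a single PDA branch does.

```python
def accepts_wwR(s):
    """Try every guess for the middle of the string, one branch per guess."""
    if set(s) - {"0", "1"}:
        return False                      # symbol outside the alphabet {0, 1}
    for middle in range(len(s) + 1):
        stack = list(s[:middle])          # pushing phase
        ok = True
        for ch in s[middle:]:             # popping phase
            if not stack or stack.pop() != ch:
                ok = False                # mismatch, or stack emptied too early
                break
        if ok and not stack:              # stack empties as the input ends
            return True
    return False

for s in ["", "00", "0110", "010", "01"]:
    print(repr(s), accepts_wwR(s))
```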

