Context-Free Languages and Pushdown Automata

The document introduces pushdown automata (PDAs), which are similar to nondeterministic finite automata (NFAs) but can also push and pop symbols on a stack. PDAs recognise the context-free languages, a larger class than the regular languages recognised by NFAs alone; the language {a^n b^n : n ≥ 0} is given as an example of a non-regular context-free language recognisable by a PDA. Context-free grammars are introduced as an equivalent way to define context-free languages, in which words are generated by substituting variables according to production rules.


Chapter 2

Context-free languages and pushdown automata

In the previous section we considered a particularly simple model of a machine: a
computer with no memory. We saw that this type of machine is able to recognise
all languages that can be described by regular expressions. However, we have also
seen that some very simple languages, such as {a^n b^n : n ≥ 0}, are not regular.
In this section we add some memory to our machines, and see that they can now
recognise a bigger class of languages, although still not all languages.

1. Pushdown automata
See Sipser §2.2 for more information about this section.
The memory that we add to our machines will be in the form of a stack: we can
put letters onto the top of the stack (called pushing) and read and remove letters
from the top of the stack (called popping), but we can't read any more of the stack
than the top letter. A good example to think of is a stack of plates in a cafeteria:
we can put plates on the top, and take plates from the top, but trying to take
plates from further down can lead to problems...

A pushdown automaton P is similar to an NFA, in that it starts in a state s0 and
reads a word w = a1 a2 . . . an from left to right. At each step P nondeterministically
reads the next letter of w or the empty letter ε. The difference from an NFA is that
P can simultaneously also read and pop the top letter from the stack, although it
doesn't have to.
If X is an alphabet and ε is the empty word, we write X_ε for X ∪ {ε}.

Definition 1.1. A pushdown automaton (PDA) is a 6-tuple P = (Q, Σ, Γ, δ, q0, T), where:


1. Q is a finite set of states.

2. Σ is a finite set, the input alphabet: the input to P is a word in Σ*.

3. Γ is a finite set, the stack alphabet: the word on the stack at every point is a
word from Γ*.

4. δ : Q × Σ_ε × Γ_ε → P(Q × Γ_ε) is a non-deterministic transition function.

5. q0 ∈ Q is the start state.

6. T ⊆ Q is the set of accept states.

We interpret the transition function δ as follows. The input is a triple (q, a, x)
of the current state q ∈ Q, a letter a ∈ Σ that is the next letter of the input word
(or ε), and a letter x ∈ Γ (or ε) to read and pop from the stack. If a = ε then P
doesn't check the input word, and if x = ε then P doesn't check the stack.
The output of δ is a set of ordered pairs (q′, y), where q′ ∈ Q and y ∈ Γ_ε. It
is a set because the PDA P is non-deterministic, and will try all output pairs in
parallel. The state q′ ∈ Q is the state that P moves to, and the letter y ∈ Γ_ε is the
letter (possibly ε) to be pushed to the stack.
If we want to read a letter from the stack without changing the stack, we pop
it off, and then push the same letter back on again.

Definition 1.2. A PDA P = (Q, Σ, Γ, δ, q0, T) accepts an input word w ∈ Σ* if w
can be written as w = w1 w2 . . . wm, where each wi ∈ Σ ∪ {ε}, and there is a sequence
of states q0, q1, . . . , qm ∈ Q and words u0 = ε, u1, . . . , um ∈ Γ* such that qm ∈ T
and P moves properly according to the transition function δ. We do not require
the stack to be empty at the end.
The language L(P) is the set of all words in Σ* that P accepts.

Example 1.3. We design a PDA P1 to recognise the language {0^n 1^n : n ≥ 0}.
The idea is that as long as P1 is reading 0s, it should push them onto the stack.
Once it starts seeing 1s, it should pop a 0 from the stack for each 1 that it sees. It
should accept a word only if it reaches the last 1 at the same time as it pops the
last 0.
A PDA can't check whether the stack is empty, so we start by pushing a special
symbol (usually $) onto the stack, so that when we pop it, P1 knows that the stack
is empty. The only important thing about the special symbol is that you only use
it for the bottom of the stack, to avoid confusion!
Here is a diagram of P1:

This diagram is similar to the ones used to describe a DFA or NFA; the only
difference is the labels on the arrows. We write a, b → c to indicate that when the
machine reads an a from the input, it can pop a b from the stack and push a c onto
the stack.

• If a = ε, then P1 can make this move without reading any letters from the
input word. If a ≠ ε, but the next letter of the input word is not a, then this
branch of computation terminates (just like an NFA).
• If b = ε, then P1 does not pop anything from the stack. If b ≠ ε, but the top
letter on the stack is not b, then this branch of computation terminates.
• If c = ε, then P1 does not push anything onto the stack.
Since P1 is quite small, here is a formal description of it as well.

P1 = ({q0, q1, q2, q3}, {0, 1}, {0, $}, δ, q0, {q0, q3})

The transitions are given by the following table: the entry in row qj and column
(b, c) is δ(qj, b, c). An entry ∅ means that P1 rejects that path of computation:
remember that when P1 has finished reading the input word, it can stop using the
transition function.

Input:           0                      1                      ε
Pop from stack:  0   $   ε              0   $   ε              0   $   ε
q0               ∅   ∅   ∅              ∅   ∅   ∅              ∅   ∅   {(q1, $)}
q1               ∅   ∅   {(q1, 0)}      {(q2, ε)}   ∅   ∅      ∅   ∅   ∅
q2               ∅   ∅   ∅              {(q2, ε)}   ∅   ∅      ∅   {(q3, ε)}   ∅
q3               ∅   ∅   ∅              ∅   ∅   ∅              ∅   ∅   ∅
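Since P1 is fully specified by this table, we can execute it directly. The following Python sketch (the encoding is ours, not from the notes: the empty string plays the role of ε, and the stack top is the end of a string) explores all nondeterministic branches of P1:

```python
from collections import deque

# Transition table of P1 above; "" plays the role of ε.
DELTA = {
    ("q0", "", ""): {("q1", "$")},
    ("q1", "0", ""): {("q1", "0")},
    ("q1", "1", "0"): {("q2", "")},
    ("q2", "1", "0"): {("q2", "")},
    ("q2", "", "$"): {("q3", "")},
}
ACCEPT = {"q0", "q3"}

def accepts(w: str) -> bool:
    # A configuration is (state, position in w, stack); breadth-first
    # search tries every nondeterministic branch in parallel.
    seen, todo = set(), deque([("q0", 0, "")])
    while todo:
        q, i, stack = todo.popleft()
        if (q, i, stack) in seen:
            continue
        seen.add((q, i, stack))
        if i == len(w) and q in ACCEPT:
            return True
        for (state, a, x), outs in DELTA.items():
            if state != q:
                continue
            if a and (i >= len(w) or w[i] != a):
                continue  # next input letter does not match
            if x and not stack.endswith(x):
                continue  # top of stack does not match
            for q2, y in outs:
                todo.append((q2, i + (1 if a else 0),
                             (stack[:-1] if x else stack) + y))
    return False
```

For instance, accepts("0011") is True and accepts("001") is False, matching {0^n 1^n : n ≥ 0}.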

Theorem 1.4. 1. Let L be a regular language. Then L is recognised by a PDA.

2. PDAs can also recognise some non-regular languages.

Proof. Part 1. Since L is regular, it is recognised by a DFA M = (A, S, F, s0, T).
We mimic this DFA with a PDA P. The 6-tuple defining P is (S, A, ∅, δ, s0, T),
where δ is defined as follows: for each s ∈ S and each a ∈ A, set δ(s, a, ε) =
{(F(a, s), ε)}; for all s ∈ S, set δ(s, ε, ε) = ∅. Then on input any w ∈ A*, the PDA
P goes through the same sequence of states as M, and never looks at the stack.
Part 2. We saw in Chapter 1 Example 5.3 that {a^m b^m : m > 0} is not a regular
language, so {0^n 1^n : n ≥ 0} is also not regular. However, in Example 1.3 we have
constructed a PDA to accept {0^n 1^n : n ≥ 0}. ∎
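The construction in Part 1 is mechanical, and can be sketched in code. Everything below is our own encoding (in particular the example DFA); note that the argument order F(a, s) matches the proof:

```python
def dfa_to_pda(A, S, F, s0, T):
    # Theorem 1.4, Part 1: wrap the DFA (A, S, F, s0, T) as a PDA
    # (S, A, ∅, δ, s0, T) whose stack alphabet is empty.
    delta = {}
    for s in S:
        for a in A:
            delta[(s, a, "")] = {(F(a, s), "")}  # copy the DFA move; ignore the stack
        delta[(s, "", "")] = set()               # no ε-moves
    return (S, A, set(), delta, s0, T)

# Example DFA (our own): words over {0, 1} with an even number of 1s.
even, odd = "even", "odd"
F = lambda a, s: {even: odd, odd: even}[s] if a == "1" else s
Q, Sigma, Gamma, delta, q0, T = dfa_to_pda({"0", "1"}, {even, odd}, F, even, {even})
```

Running the resulting transition table state-by-state reproduces the DFA's computation, with the stack untouched throughout.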

Example 1.5. Let's design a PDA P2 to recognise the language

L = {a^i b^j c^k : i, j, k ≥ 0, and i = j or i = k}.

The PDA P2 first pushes all of the as onto the stack. It then uses nondeterminism
to test two things at once. One branch of the computation tests whether the number
of as is equal to the number of bs, and then ignores the cs. The other branch ignores
the bs, and then tests whether the number of as is equal to the number of cs. The
PDA P2 tests equality of the numbers of letters by popping one a from the stack
for each letter (b or c) it reads, and checking whether it reaches the bottom of the
stack at the same time as it finishes reading the bs or cs, depending on the branch
of computation. If either branch matches, then the PDA accepts.

Here is a diagram of P2:
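P2's two branches amount to the following membership test (a plain Python check with our own names, not itself a PDA): the word must have the shape a*b*c*, and then one of the two counts must match.

```python
import re

def in_L(w: str) -> bool:
    # shape a^i b^j c^k, then branch 1 (i == j) or branch 2 (i == k)
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    if m is None:
        return False  # letters out of order: neither branch can accept
    i, j, k = (len(g) for g in m.groups())
    return i == j or i == k
```

The `or` on the last line is exactly what the PDA's nondeterministic split achieves: it suffices for one branch to succeed.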

Exercise: Show that the language L from Example 1.5 is not regular.
In Example 1.5 we used the full power of nondeterminism to simultaneously test
i = j or i = k. It can be shown that nondeterminism makes a difference to PDAs:
there are languages (including the one from Example 1.5) that can be recognised
by a (nondeterministic) PDA, but that cannot be recognised by a machine that is
like a PDA but deterministic. We will not study such deterministic machines.

2. Context-free languages
See Sipser §2.1 for more information on this section.
We saw in Chapter 1 that there are three equivalent ways to define regular
languages: as the languages recognised by DFAs, as the languages recognised by
NFAs, and as the languages described by a regular expression.
We now give a new way of defining languages, called context-free grammars
(CFGs). We will see later that this is another way of defining the class of languages
accepted by PDAs. Context-free grammars differ from regular expressions in that,
rather than being static objects which either do or don't match a given word, they
consist of sets of rules from which we make words.

Definition 2.1. A context-free grammar (CFG) is a 4-tuple (N, Σ, R, S), where

1. N is a finite set of variables;

2. Σ is a finite set of terminals: we require N ∩ Σ = ∅;

3. R is a finite set of production rules A → w, where A ∈ N and w ∈ (N ∪ Σ)*;

4. S ∈ N is the start variable.

Variables are normally written with upper case letters and terminals with lower
case letters or numbers. If we have more than one rule with the same left side we
usually group them together: instead of writing, for example, A → w1 and A → w2,
we write A → w1 | w2. This means that either rule can be applied.

Example 2.2. Our first context-free grammar G1 is ({S}, {0, 1}, R, S), where R
contains two production rules: R = {S → 0S1 | ε}.

Just as for regular expressions, each context-free grammar G = (N, Σ, R, S)
defines a language L(G) over Σ, the alphabet of terminals. Each word of the
language is made as follows:
1. The initial word is just the start variable S.
2. Choose one of the variables in the current word, and one of the production
rules that has that variable as the left hand side. Replace that variable in the
current word with the string on the right hand side of the production rule.
3. Repeat Step 2 until no variables remain in the current word.
A sequence of substitutions like this is called a derivation. We write the derivation

S ⇒ w1 ⇒ w2 ⇒ . . . ⇒ w.

If we aren't interested in the intermediate steps, we write this as S ⇒* w.

Definition 2.3. A language L is context-free if there exists a CFG G such that
L = L(G).

Example 2.2, ctd. One derivation is S ⇒ ε. Another is

S ⇒ 0S1 ⇒ 00S11 ⇒ 000S111 ⇒ 000111.

We see that S ⇒* 0^n 1^n for all n ≥ 0. Furthermore, these are all the words that can
be derived from S, so L(G1) = {0^n 1^n : n ≥ 0}, and this language is context-free.
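The derivation procedure of Steps 1–3 can be run mechanically. The sketch below (our own encoding: a grammar maps each variable, written as an upper-case letter, to the list of its alternatives) enumerates the leftmost derivations of G1 breadth-first, pruning sentential forms that already contain too many terminals:

```python
from collections import deque

G1 = {"S": ["0S1", ""]}  # the grammar of Example 2.2; "" is ε

def words(rules, start, max_len):
    # Return all terminal words of length <= max_len derivable from start.
    found, seen, todo = set(), set(), deque([start])
    while todo:
        w = todo.popleft()
        if w in seen:
            continue
        seen.add(w)
        variables = [i for i, c in enumerate(w) if c.isupper()]
        if not variables:
            if len(w) <= max_len:
                found.add(w)  # Step 3: no variables remain
            continue
        if sum(not c.isupper() for c in w) > max_len:
            continue  # terminals never disappear, so prune this branch
        i = variables[0]  # always the leftmost variable: a leftmost derivation
        for rhs in rules[w[i]]:  # Step 2: apply one production rule
            todo.append(w[:i] + rhs + w[i + 1:])
    return found
```

Here words(G1, "S", 6) returns {"", "01", "0011", "000111"}, agreeing with the derivations above.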

Example 2.4. Consider the CFG G2, which has variables {R, S, T, X}, terminals
{a, b} and start symbol R. The production rules of G2 are

R → XRX | S
S → aTb | bTa
T → XTX | X | ε
X → a | b.

Let's try some derivations. If we want to make words in L(G2) then our
derivations should start from R.
1. R ⇒ S ⇒ aTb ⇒ ab
2. R ⇒ S ⇒ bTa ⇒ bXa ⇒ baa
3. R ⇒ XRX ⇒ XXRXX ⇒ XXSXX ⇒ XXbTaXX ⇒* babaXX ⇒* babaaa

So ab, baa and babaaa are all in L(G2).


However, ε, a, b ∉ L(G2): every derivation must use exactly one of the two rules
S → aTb or S → bTa exactly once, and so every word in L(G2) contains at least
one a and at least one b.
To further explore a language, it can be useful to look at derivations starting
from other variables. We observe that X can turn into any single letter, so for any
word w of length n we have X^n ⇒* w. Next, notice that

T ⇒* X^n T X^n for all n = 1, 2, . . .
X^n T X^n ⇒ X^{2n+1}
X^n T X^n ⇒ X^{2n}

So from T we can reach any word.
Similarly, R ⇒* X^n S X^n for any n ≥ 0. Continuing to reason like this, we
can show that L(G2) consists of all words w = w1 . . . wn ∈ {a, b}* that are not
palindromes: there must be a number k such that w_k ≠ w_{n−k+1}.
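The non-palindrome description of L(G2) can be spot-checked by machine for short words. In this sketch (our own encoding; min_yields computes the shortest terminal word each variable can produce, which keeps the search finite) we derive every word of L(G2) of length at most 3 and compare against the non-palindromes over {a, b}:

```python
from itertools import product

G2 = {"R": ["XRX", "S"], "S": ["aTb", "bTa"], "T": ["XTX", "X", ""], "X": ["a", "b"]}

def min_yields(rules):
    # Least fixpoint: length of the shortest terminal word derivable
    # from each variable (upper-case letters are variables).
    m = {A: float("inf") for A in rules}
    changed = True
    while changed:
        changed = False
        for A, alts in rules.items():
            best = min(sum(m[c] if c.isupper() else 1 for c in rhs) for rhs in alts)
            if best < m[A]:
                m[A], changed = best, True
    return m

def derive(rules, start, max_len):
    m, found, seen, todo = min_yields(rules), set(), set(), [start]
    while todo:
        w = todo.pop()
        if w in seen:
            continue
        seen.add(w)
        if sum(m[c] if c.isupper() else 1 for c in w) > max_len:
            continue  # this sentential form can no longer yield a short word
        i = next((k for k, c in enumerate(w) if c.isupper()), None)
        if i is None:
            found.add(w)
            continue
        for rhs in rules[w[i]]:  # leftmost derivations only
            todo.append(w[:i] + rhs + w[i + 1:])
    return found

non_pals = {w for n in range(4) for t in product("ab", repeat=n)
            for w in ["".join(t)] if w != w[::-1]}
```

Both sides come out as {ab, ba, aab, abb, baa, bba}, as the reasoning above predicts.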

Context-free grammars were originally used in the study of human (natural) lan-
guages. An important application is the compilation and specification of program-
ming languages. Designers of compilers and interpreters often start by obtaining a
grammar for the language, and most compilers and interpreters contain a parser,
which extracts the meaning of the programme before either generating the compiled
code or running the programme. A parser is constructed (either automatically or
by hand) from the CFG that describes the language.

3. Pushdown automata and context-free languages

Here is the key theorem linking context-free grammars and pushdown automata.
The key ideas for the proof are contained in Sipser §2.2.

Theorem 3.1. A language is context-free if and only if some pushdown automaton
recognises it.

Having a grammar version and a machine version of a formal language is
extremely useful – for some problems one will be more helpful than the other.

Corollary 3.2. The class of regular languages is a proper subset of the class of
context-free languages.

Proof. We showed in Theorem 1.4 that the regular languages are a proper subset
of the languages recognised by PDAs. The result now follows from Theorem 3.1.

We won't prove Theorem 3.1, but will give the method of going from a context-free
grammar to a pushdown automaton, which gives an idea of how one might prove
one direction.
Let the context-free grammar be G = (V, Σ, R, S). We will design a pushdown
automaton P which mimics all possible leftmost derivations of G. (A leftmost
derivation is one where at every step the leftmost remaining variable is replaced.
One can show that if w ∈ L(G) then there is a leftmost derivation which produces
w.) The PDA P uses nondeterminism to mimic all of them at once. It will accept
an input word w if one of the derivations produces w.
The PDA P will use its stack to store each step of the derivations: it will
pop the variable that it wants to replace, and push on the right hand side of the
corresponding rule of the CFG. This is described in Step 3, below, where we show
how to push words of length greater than 1 onto the stack.
There is one further complication, due to the fact that P can only read the top
letter of the stack: the next variable that P wishes to replace could be hidden under
some terminals. The PDA deals with this by matching the terminals at the top of
the stack to the letters of w, as they are produced: reading the stack from top to
bottom, these will appear in the same order as they are in w. More specifically, if
P is in state qloop (see Step 1, below), and there is a terminal a ∈ Σ on the top of
the stack, P reads the next letter b of w. If a = b then P pops a off the stack: see
Step 4, below. If a ≠ b, then this branch of computation rejects w.

1. Start with four states: q0, q1, qloop and qaccept. Make q0 the start state, and
qaccept the only accept state.
Most of the work of P happens at qloop. Whenever P is in qloop the stack
will contain the result of a derivation in G (except with some initial terminals
missing, if they have already been matched with the input word w).

2. Put an arrow labelled ε, ε → $ from q0 to q1, and an arrow labelled ε, ε → S
from q1 to qloop.
We mark the bottom of the stack, and then start the derivation with S.
Put an arrow labelled ε, $ → ε from qloop to qaccept.
Once the stack is empty we accept the input word if we've read, and matched,
all of its letters.
These are the only arrows to or from q0, q1, and qaccept.

3. For each rule in R of the form A → u, where A ∈ V and u ∈ (V ∪ Σ)*, add a
chain of new states and arrows as follows:

(a) If u = u1 u2 . . . un, then add n − 1 new states (or none if u = ε).
We'll call them q2, q3, . . . , qn, but you don't have to label them.
(b) Add transitions:
i. From qloop to qn, labelled ε, A → un.
ii. For i = n − 1, n − 2, . . . , 2, from qi+1 to qi, labelled ε, ε → ui.
iii. From q2 to qloop, labelled ε, ε → u1.
We add a sequence of transitions which pop A from the stack and push
u onto the stack in reverse order. When P returns to qloop, the current
word in the derivation will be read from the top of the stack to the bottom.

4. For each terminal a ∈ Σ, put an arrow from qloop to qloop labelled a, a → ε.
These match up the initial terminals of the current word with the next letters
of the input word w.
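Steps 1–4 translate directly into code. In the sketch below (our own representation: transitions are keyed by (state, input letter, popped letter), the empty string plays ε, and the stack top is the end of a string), cfg_to_pda builds the arrows and a bounded search runs the resulting PDA:

```python
from itertools import count

def cfg_to_pda(rules, terminals, start):
    # rules: list of (A, u) pairs, with variables and terminals single characters
    fresh, delta = count(2), {}
    def add(src, a, x, dst, y):
        delta.setdefault((src, a, x), set()).add((dst, y))
    add("q0", "", "", "q1", "$")           # Step 2: mark the stack bottom
    add("q1", "", "", "qloop", start)      # Step 2: start the derivation with S
    add("qloop", "", "$", "qaccept", "")   # Step 2: bottom reached, accept
    for A, u in rules:                     # Step 3: one chain per rule A -> u
        if len(u) <= 1:
            add("qloop", "", A, "qloop", u)
        else:
            chain = ["qloop"] + [f"r{next(fresh)}" for _ in u[:-1]]
            add("qloop", "", A, chain[1], u[-1])  # pop A, push u_n
            for k in range(1, len(chain)):
                dst = chain[k + 1] if k + 1 < len(chain) else "qloop"
                add(chain[k], "", "", dst, u[len(u) - 1 - k])  # push u in reverse
    for a in terminals:                    # Step 4: match terminals with input
        add("qloop", a, a, "qloop", "")
    return delta

def accepts(delta, w, bound):
    # Search over configurations (state, position, stack); `bound` caps the
    # stack height so that the search terminates (enough for short words).
    seen, todo = set(), [("q0", 0, "")]
    while todo:
        cfg = todo.pop()
        q, i, st = cfg
        if cfg in seen or len(st) > bound:
            continue
        seen.add(cfg)
        if q == "qaccept" and i == len(w):
            return True
        for (s, a, x), outs in delta.items():
            if s != q or (a and (i >= len(w) or w[i] != a)) \
                    or (x and not st.endswith(x)):
                continue
            for q2, y in outs:
                todo.append((q2, i + (1 if a else 0), (st[:-1] if x else st) + y))
    return False

# G1 from Example 2.2: S -> 0S1 | ε
delta = cfg_to_pda([("S", "0S1"), ("S", "")], "01", "S")
```

With this delta, accepts recognises exactly the words 0^n 1^n that fit within the stack bound, as Example 3.3 below illustrates by hand.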

Example 3.3. We show how to turn the CFG G1 from Example 2.2 into a PDA.
Here’s what the PDA looks like after Steps 1 and 2:

Here’s the final PDA. The chains of states for each rule that we added in Step 3
have been labelled:

I’ve also labelled the loops that we add at Step 4.



Example 3.4. Consider the CFG ({S, T}, {a, b}, R, S), where R is

S → aTb | b
T → Ta | ε.

There are four rules in R here, so the PDA has four chains added in Step 3 (but
two of them are very short!).

4. The pumping lemma for context-free languages

See §2.3 of Sipser for more information on this section.
To show that a language is context-free, we give either a CFG or a PDA for it.
Every finite language is regular, and hence by Corollary 3.2 is context-free. In this
section we state the Pumping Lemma for context-free languages: just as for regular
languages, the job of this lemma is to show that an infinite language is not
context-free. The only difference from the Pumping Lemma from Section 5 of
Chapter 1 is that whereas with regular languages we broke the word up into three
parts, and pumped the middle one, in this case we break the word up into five
parts, and pump the second and fourth.

Lemma 4.1 (The Pumping Lemma for context-free languages). Let L be a
context-free language. Then there exists an integer N (the pumping length) such
that, if w ∈ L and |w| > N, then w = uvxyz, where

1. |vxy| < N,

2. |v| + |y| > 0,

3. for each i ≥ 0, the word uv^i xy^i z ∈ L.

Condition 1 means that although we don't know how long the first part u of w
is, we do know that vxy can't be too long. Condition 2 says that we're not allowed
to choose v and y to both be ε (otherwise Condition 3 would trivially hold).

The proof of Lemma 4.1 is not terribly complicated, but it relies on the concept
of a parse tree, which has not been introduced in these notes. Hence we omit its
proof, and instead just give some examples of how to use it.

Example 4.2. Prove that the language L = {a^n b^n c^n : n ≥ 1} is not context-free.
We assume that L is context-free, and get a contradiction. Since L is
context-free, the Pumping Lemma for context-free languages applies. Let N be the
pumping length. Consider the word w = a^N b^N c^N. Then w ∈ L and |w| > N. We
must show that no matter how we write w = uvxyz, if Conditions 1 and 2 hold
then there exists an i such that uv^i xy^i z ∉ L.
By Condition 1, |vxy| < N, so vxy cannot be of the form a^i b^N c^j. Hence either
vxy is a power of one of a, b or c; or vxy is of the form a^i b^j or b^i c^j (for some
i, j > 0).
If either v or y is equal to a^i b^j or b^i c^j (with both i and j greater than 0),
then uv^2 xy^2 z ∉ L(a* b* c*), and so uv^2 xy^2 z ∉ L.
Hence v and y are both powers of single letters. Then at most two of a, b, c
appear in v and y. At least one of v and y is not ε, by Condition 2, so at least one
of a, b, c appears in v and y. So uv^2 xy^2 z does not contain equal numbers of as, bs
and cs, and so uv^2 xy^2 z ∉ L.
Thus Condition 3 of the Pumping Lemma is violated, contradicting our
assumption that the Pumping Lemma applies. Hence L is not context-free.
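For a small value of N, the case analysis above can be checked exhaustively by machine. This is an illustration rather than a proof (the genuine pumping length is unknown, and we only test i = 0 and i = 2); all names are ours:

```python
def in_L(w: str) -> bool:
    # membership in {a^n b^n c^n : n >= 1}
    n = len(w) // 3
    return n >= 1 and w == "a" * n + "b" * n + "c" * n

def refutes_pumping(w: str, N: int, member) -> bool:
    # True if EVERY split w = uvxyz with |vxy| < N and |v| + |y| > 0
    # fails Condition 3 for i = 0 or i = 2, so w witnesses non-pumpability.
    L = len(w)
    for a in range(L + 1):                             # u = w[:a]
        for b in range(a, L + 1):                      # v = w[a:b]
            for c in range(b, L + 1):                  # x = w[b:c]
                for d in range(c, min(a + N, L + 1)):  # y = w[c:d]; |vxy| < N
                    if b == a and d == c:
                        continue                       # violates Condition 2
                    u, v, x, y, z = w[:a], w[a:b], w[b:c], w[c:d], w[d:]
                    if member(u + x + z) and member(u + v*2 + x + y*2 + z):
                        return False                   # this split survives pumping
    return True
```

For contrast, the same check on a^N b^N in the context-free language {a^n b^n} finds a split that pumps, so refutes_pumping returns False there.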

Example 4.3. Prove that the language L2 = {ss : s ∈ {0, 1}*} is not context-free.
We assume that L2 is context-free, and get a contradiction. Since L2 is
context-free, the Pumping Lemma for context-free languages applies. Let N be the
pumping length.
The first issue is finding a good choice of w = ss, so that when we pump it we
leave the language L2. For example, one possibility might be w0 = 0^N 1 0^N 1, but
by writing w0 as uvxyz where v = 0^i in the first subword 0^N, and y = 0^i in the
second subword 0^N, this choice of w can in fact be pumped.
A better choice of w is 0^N 1^N 0^N 1^N. Then w ∈ L2 and |w| > N. We must
consider all ways of writing w = uvxyz so that Conditions 1 and 2 of the Pumping
Lemma both hold, and show that Condition 3 always fails. Condition 3 implies
that there exists an s1 ∈ {0, 1}* such that w1 = uv^2 xy^2 z = s1 s1 ∈ L2.
First suppose that uvxy is a subword of 0^N 1^N (the beginning of w). Notice
that the first letter of w1 is 0, so s1 begins with 0. If v = 0^i and y = 0^j, then
uv^2 xy^2 z = 0^{N+i+j} 1^N 0^N 1^N, with i + j > 0 by Condition 2, and i + j < N by
Condition 1. Then counting back from the end of w1, we see that the first letter of
s1 is 1, contradicting our earlier observation that the first letter of s1 is 0. If v = 1^i
and y = 1^j then uv^2 xy^2 z = 0^N 1^{N+i+j} 0^N 1^N, and we get the same contradiction.
Similarly, if v = 0^i and y = 1^j then uv^2 xy^2 z = 0^{N+i} 1^{N+j} 0^N 1^N, and we get
the same contradiction. Finally, if either of v or y is of the form 0^i 1^j, then
uv^2 xy^2 z must finish with 1^N 0^N 1^N, so again the first letter of s1 is 1, giving the
same contradiction.
Instead suppose that vxyz is a subword of 0^N 1^N (the end of w). Then reasoning
as in the previous case, but considering now the last letter of s1 (which must be a
1, since the last letter of w is a 1), we get a similar contradiction.
Hence vxy must be contained in the middle subword 1^N 0^N of w, and vxy must
contain at least one 1 and at least one 0. But then uv^0 xy^0 z = uxz = 0^N 1^i 0^j 1^N,
where i < N or j < N or both, and so uxz ∉ L2.
Thus Condition 3 of the Pumping Lemma is violated in every case, contradicting
our assumption that the Pumping Lemma applies. Hence L2 is not context-free.
