THE THEORY OF
PARSING, TRANSLATION,
AND COMPILING
VOLUME I: PARSING
ALFRED V. AHO
JEFFREY D. ULLMAN
PRENTICE-HALL, INC.
1972
ISBN: 0-13-914556-7
Library of Congress Catalog Card No. 72-1073
ALFRED V. AHO
JEFFREY D. ULLMAN
CONTENTS
PREFACE xiii

0 MATHEMATICAL PRELIMINARIES
1 AN INTRODUCTION TO COMPILING 53
1.1 Programming Languages 53
1.1.1 Specification of Programming Languages 53
1.1.2 Syntax and Semantics 55
Bibliographic Notes 57
2 ELEMENTS OF LANGUAGE THEORY 83
2.1.2 Grammars 84
2.1.3 Restricted Grammars 91
2.1.4 Recognizers 93
Exercises 96
Bibliographic Notes 102
3 THEORY OF TRANSLATION 212

5 ONE-PASS NO BACKTRACK PARSING 333
APPENDIX 501
BIBLIOGRAPHY 519
INDEX TO LEMMAS, THEOREMS, AND ALGORITHMS 531
INDEX TO VOLUME I 533
0 MATHEMATICAL PRELIMINARIES
This section will briefly review some of the most basic concepts from set
theory: relations, functions, orderings, and the usual operations on sets.
0.1.1. Sets
Example 0.1
Let the nonnegative integers be atoms. Then A = {1, {2, 3}, 4} is a set.
A's members are 1, {2, 3}, and 4. The member {2, 3} of A is also a set. Its
members are 2 and 3. However, the atoms 2 and 3 are not members of
A itself. We could equivalently have written A = {4, 1, {3, 2}}. Note that
#A = 3. □
A useful way of defining sets is by means of a predicate, a statement
involving one or more unknowns which has one of two values, true or false.
The set defined by a predicate consists of exactly those elements for which
the predicate is true. However, we must be careful what predicate we choose
to define a set, or we may attempt to define a set that could not possibly exist.
Example 0.2
The phenomenon alluded to above is known as Russell's paradox. Let
P(X) be the predicate "X is not a member of itself"; i.e., X ∉ X. Then we
might think that we could define the set Y of all X such that P(X) was true;
i.e., Y consists of exactly those sets that are not members of themselves.
Since most common sets seem not to be members of themselves, it is tempting
to suppose that set Y exists.
But if Y exists, we should be able to answer the question, "Is Y a member
of itself?" But this leads to an impossible situation. If Y ~ Y, then P(Y) is
false, and Y is not a member of itself, by definfi:ion of Y. Hence, it is not
possible that Y ~ Y. Conversely, suppose that Y ~ Y. Then, by definition
of Y again, Y ~ Y. We see that Y ~ Y implies Y ~ Y and that Y ~ Y
implies Y ~ Y. Since either Y ~ Y or Y ~ Y is true, both are true, a situa-
tion which we shall assume is impossible. One "way out" is to accept that
set Y does not exist. [Z]
The normal way to avoid Russell's paradox is to define sets only by those
predicates P(X) of the form "X is in A and P1(X)," where A is a known set
and P1 is an arbitrary predicate. If the set A is understood, we shall just write
P1(X) for "X is in A and P1(X)."
Example 0.3
Let P(X) be the predicate "X is a nonnegative even integer." That is,
P(X) is "X is in the set of nonnegative integers and P1(X)," where P1(X) is the predicate
"X is even." Then A = {X | P(X)} is the set which is often written
{0, 2, 4, ..., 2n, ...}. Colloquially, we can assume that the set of nonnegative
integers is understood, and write A = {X | X is even}. □
There are several basic operations on sets which can be used to construct
new sets.
DEFINITION
Let A and B be sets. The union of A and B, written A ∪ B, is the set
containing all elements in A together with all elements in B. Formally,
A ∪ B = {x | x ∈ A or x ∈ B}.†
The intersection of A and B, written A ∩ B, is the set of all elements
that are in both A and B. Formally, A ∩ B = {x | x ∈ A and x ∈ B}.
The difference of A and B, written A - B, is the set of all elements in A
that are not in B. If A = U, the set of all elements under consideration (or
the universal set, as it is sometimes called), then U - B is often written B̄
and called the complement of B.
Note that we have referred to the universal set as the set of all objects
"under consideration." We must be careful to be sure that U exists. For
example, if we choose U to be "the set of all sets," then we would have
Russell's paradox again. Also, note that B̄ is not well defined unless we
assume that complementation with respect to some known universe is
implied.
In general, A - B = A ∩ B̄. Venn diagrams for these set operations are
shown in Fig. 0.2.
means A1 ∪ A2 ∪ A3 ∪ ...
†Note that we may not have a set guaranteed to include A ∪ B, so this use of predicate
definition appears questionable. In axiomatic set theory, the existence of A ∪ B is
taken to be an axiom.
DEFINITION
Let A be a set. The power set of A, written 𝒫(A) or sometimes 2ᴬ, is the
set of all subsets of A. That is, 𝒫(A) = {B | B ⊆ A}.†
Example 0.4
Let A = {1, 2}. Then 𝒫(A) = {∅, {1}, {2}, {1, 2}}. As another example,
𝒫(∅) = {∅}. □
In general, if A is a finite set of m members, 𝒫(A) has 2ᵐ members. The
empty set is a member of 𝒫(A) for every A.
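As a small illustration, the power set of a finite set can be enumerated directly. The following Python sketch is ours, not the text's (the function name powerset is an assumption for illustration):

    from itertools import chain, combinations

    def powerset(s):
        """Return the power set of the finite set s as a set of frozensets."""
        elems = list(s)
        # Take all subsets of size 0, 1, ..., len(s).
        subsets = chain.from_iterable(combinations(elems, r) for r in range(len(elems) + 1))
        return {frozenset(c) for c in subsets}

    # For A = {1, 2} this yields {{}, {1}, {2}, {1, 2}}: 2**2 = 4 members, as in Example 0.4.
    print(powerset({1, 2}))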
We have observed that the members of a set are considered to be un-
ordered. It is often convenient to have ordered pairs of objects available for
discourse. We thus make the following definition.
DEFINITION
Let a and b be objects. Then (a, b) denotes the ordered pair consisting of
a and b in that order. We say that (a, b) = (c, d) if and only if a = c and
b = d. In contrast, {a, b} = {b, a}.
Ordered pairs can be considered sets if we define (a, b) to be the set
{a, {a, b}}. It is left to the Exercises to show that {a, {a, b}} = {c, {c, d}} if and
only if a = c and b = d. Thus this definition is consistent with what we
regard to be the fundamental property of ordered pairs.
DEFINITION
The Cartesian product of sets A and B, denoted A × B, is
{(a, b) | a ∈ A and b ∈ B}.
Example 0.5
Let A = {1, 2} and B = {2, 3, 4}. Then
A × B = {(1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4)}. □
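A corresponding Python sketch of the Cartesian product (the helper name is again ours):

    def cartesian_product(A, B):
        """Return A x B as a set of ordered pairs (a, b)."""
        return {(a, b) for a in A for b in B}

    # Reproduces Example 0.5: 2 * 3 = 6 ordered pairs.
    print(cartesian_product({1, 2}, {2, 3, 4}))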
0.1.3. Relations
†The existence of the power set of any set is an axiom of set theory. The other set-defining
axioms, in addition to the power set axiom and the union axiom previously
mentioned, are:
(1) If A is a set and P a predicate, then {X | P(X) and X ∈ A} is a set.
(2) If X is an atom or set, then {X} is a set.
(3) If A is a set, then {X | for some Y, we have X ∈ Y and Y ∈ A} is a set.
DEFINITION
Let A and B be sets. A relation from A to B is any subset of A × B. If
A = B, we say that the relation is on A. If R is a relation from A to B, we
write a R b whenever (a, b) is in R. We call A the domain of R, and B the
range of R.
Example 0.6
Let A be the set of integers. The relation < is {(a, b) | a is less than b}.
We thus write a < b exactly when we would expect to do so. □
DEFINITION
The relation {(b, a) | (a, b) ∈ R} is called the inverse of R and is often
denoted R⁻¹.
A relation is a very general concept. Often a relation may possess certain
properties to which special names have been given.
DEFINITION
Let A be a set and R a relation on A. We say that R is
(1) Reflexive if a R a for all a in A,
(2) Symmetric if "a R b" implies "b R a" for a, b in A, and
(3) Transitive if "a R b and b R c" implies "a R c" for a, b, c in A. The
elements a, b, and c need not be distinct.
Relations obeying these three properties occur frequently and have addi-
tional properties as a consequence. The term equivalence relation is used to
describe a relation which is reflexive, symmetric, and transitive.
An important property of equivalence relations is that an equivalence
relation R on a set A partitions A into disjoint subsets called equivalence
classes. For each element a in A we define [a], the equivalence class of a, to
be the set {b | a R b}.
Example 0.7
Consider the relation of congruence modulo N on the nonnegative inte-
gers. We say that a ≡ b mod N (read "a is congruent to b modulo N") if
there is an integer k such that a - b = kN. As a specific case let us take
N = 3. Then the set {0, 3, 6, ..., 3n, ...} forms an equivalence class, since
3n ≡ 3m mod 3 for all integer values of m and n. We shall use [0] to denote
this class. We could have used [3] or [6] or [3n], since any element of an
equivalence class can be used as a representative of that class.
The two other equivalence classes under the relation congruence modulo
3 are
[1] = {1, 4, 7, ..., 3n + 1, ...}
[2] = {2, 5, 8, ..., 3n + 2, ...}
The union of the three sets [0], [1], and [2] is the set of all nonnegative integers.
Thus we have partitioned the set of all nonnegative integers into the three
disjoint equivalence classes [0], [1], and [2] by means of the equivalence rela-
tion congruence modulo 3 (Fig. 0.3). □
Fig. 0.3 The set of all nonnegative integers partitioned into the equivalence classes [0], [1], and [2].
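A short Python sketch of the partition described in Example 0.7, computed for an initial segment of the nonnegative integers (the function name and the cutoff are our own choices):

    def equivalence_classes_mod(N, limit):
        """Partition {0, 1, ..., limit - 1} into classes under congruence modulo N."""
        classes = {}
        for x in range(limit):
            # a and b are congruent mod N exactly when they leave the same remainder,
            # so the remainder serves as a representative of the class.
            classes.setdefault(x % N, []).append(x)
        return classes

    # For N = 3 this produces [0] = {0, 3, 6, ...}, [1] = {1, 4, 7, ...}, [2] = {2, 5, 8, ...}.
    print(equivalence_classes_mod(3, 12))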
In the literature the term partial order is sometimes used to denote what
we call a reflexive partial order.
0.1.6. Mappings

DEFINITION

M⁻¹ : B → A such that M⁻¹(b) = a if and only if M(a) = b. If there exists
b in B for which there is no a in A such that M(a) = b, then M⁻¹ will be a
partial function.
The notion of a bijection is used to define the cardinality of a set, which,
informally speaking, denotes the number of elements the set contains.
DEFINITION
Two sets A and B are of equal cardinality if there is a bijection M from
A to B.
Example 0.9
{0, 1, 2} and {a, b, c} are of equal cardinality. To prove this, use, for
example, the bijection M = {(0, a), (1, b), (2, c)}. The set of integers is
equal in cardinality to the set of even integers, even though the latter is
a proper subset of the former. A bijection we can use to prove this would be
{(i, 2i) | i is an integer}. □
We can now define precisely what we mean by a finite and an infinite set.†
DEFINITION
A set S is finite if it is equal in cardinality to the set {1, 2, ..., n} for some
integer n. A set is infinite if it is equal in cardinality to a proper subset of
itself. A set is countable if it is equal in cardinality to the set of positive
integers. It follows from Example 0.9 that every countable set is infinite.
An infinite set that is not countable is called uncountable.
Examples of countable sets are
(1) The set of all positive and negative integers,
(2) The set of even integers, and
(3) {(a, b) la and b are integers}.
Examples of uncountable sets are
(1) The set of real numbers,
(2) The set of all mappings from the integers to the integers, and
(3) The set of all subsets of the positive integers.
EXERCISES
0.1.1. Write out the sets defined by the following predicates. Assume that
A = {0, 1, 2, 3, 4, 5, 6}.
(a) {X | X is in A and X is even}.
†We have used these terms previously, of course, assuming that their intuitive meaning
was clear. The formal definitions should be of some interest, however.
†Strictly speaking, (a1, ..., am) means ((...((a1, a2), a3), ...), am), according to the
definition of Aᵐ.
0.1.18. Let A be a finite set and let B ⊆ A. Show that if M : A → B is a bijection,
then A = B.
0.1.19. Let A and B have m and n elements, respectively. Show that there are
nᵐ total functions from A to B. How many (not necessarily total)
functions from A to B are there?
*0.1.20. Let A be an arbitrary (not necessarily finite) set. Show that the sets
𝒫(A) and {M | M is a total function from A to {0, 1}} are of equal cardinality.
0.1.21. Show that the set of all integers is equal in cardinality to
(a) The set of primes.
(b) The set of pairs of integers.
Hint: Define a linear order on the set of pairs of integers by
(i1, j1) R (i2, j2) if and only if i1 + j1 < i2 + j2, or i1 + j1 = i2 + j2
and i1 < i2.
0.1.22. Set A is "larger than" B if A and B are of different cardinality but B
is equal in cardinality to a subset of A. Show that the set of real numbers
between 0 and 1, exclusive, is larger than the set of integers. Hint:
Represent real numbers by unique decimal expansions. In contradic-
tion, suppose that the two sets in question were of equal cardinality.
Then we could find a sequence of real numbers r1, r2, ... which
included all real numbers r, 0 < r < 1. Can you find a real number r
between 0 and 1 which differs in the ith decimal place from ri for all i?
"0.1.23. Let R be a linear order on a finite set A. Show that there exists a unique
element a ~ A such that a R b for all b ~ A --{a}. Such an element
a is called the least element. If A is infinite, does there always exist a
least element ?
"0.1.24. Show that [a, [a, b}] = {c, [c, d}} if and only if a = c and b = d.
0.1.25. Let R be a partial order on a set A. Show that if a R b, then b R a is
false.
"0.1.26. Use the power set and union axioms to help show that if A and B are
sets, then A × B is a set.
**0.1.27. Show that every set is either finite or infinite, but not both.
"0.1.28. Show that every countable set is infinite.
"0.1.29. Show that the following sets have the same cardinality:
(1) The set of real numbers between 0 and 1,
(2) The set of all real numbers,
(3) The set of all mappings from the integers to the integers, and
(4) The set of all subsets of the positive integers.
**0.1.30. Show that 𝒫(A) is always larger than A for any set A.
0.1.31. Show that if R is a partial order on a set A, then the relation R' given
by R' = R ∪ {(a, a) | a ∈ A} is a reflexive partial order on A.
0.1.32. Show that if R is a reflexive partial order on a set A, then the relation
R' = R - {(a, a) | a ∈ A} is a partial order on A.
In this book we shall be dealing primarily with sets whose elements are
strings of symbols. In this section we shall define a number of terms dealing
with strings.
0.2.1. Strings
We shall ordinarily use capital Greek letters for alphabets. The letters
a, b, c, and d will represent symbols and the letters t, u, v, w, x, y, and z
generally represent strings. We shall represent a string of i a's by aⁱ. For
example, a¹ = a,† a² = aa, a³ = aaa, and so forth. Then, a⁰ is e, the empty
string.
DEFINITION

The concatenation of two strings x and y, written xy, is the string consisting of the
symbols of x followed by the symbols of y. For example, if x = ab and y = cd, then xy = abcd. For all strings
x, xe = ex = x.

†We thus identify the symbol a and the string consisting of a alone.
0.2.2. Languages
DEFINITION
A language over an alphabet Σ is a set of strings over Σ. This definition
surely encompasses almost everyone's notion of a language. FORTRAN,
ALGOL, PL/I, and even English are included in this definition.
Example 0.10
DEFINITION
We let Σ* denote the set containing all strings over Σ, including e. For
example, if Σ is the binary alphabet {0, 1}, then
Σ* = {e, 0, 1, 00, 01, 10, 11, 000, 001, ...}.
Every language over Σ is a subset of Σ*. The set of all strings over Σ but
excluding e will be denoted by Σ⁺.
Example 0.11
CONVENTION
When no confusion results we shall often denote a set consisting of a
single element by the element itself. Thus, according to this convention,
a* = {a}*.
DEFINITION
A language L such that no string in L is a proper prefix (suffix) of any
other string in L is said to have the prefix (suffix) property.
For example, a* does not have the prefix property, but {aⁱb | i ≥ 0} does.
Example 0.12
Suppose that we wish to change every instance of 0 in a string to a and
every 1 to bb. We can define a homomorphism h such that h(0) = a and
h(1) = bb. Then if L is the language {0ⁿ1ⁿ | n ≥ 1}, h(L) = {aⁿb²ⁿ | n ≥ 1}. □
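A Python sketch of the homomorphism of Example 0.12, extended symbol by symbol to strings (representing h as a dictionary is our own encoding):

    def apply_homomorphism(h, w):
        """Apply the homomorphism h, given as a dict from symbols to strings, to the string w."""
        return "".join(h[c] for c in w)

    h = {"0": "a", "1": "bb"}
    # h maps 0^n 1^n to a^n b^(2n); for n = 3:
    print(apply_homomorphism(h, "000111"))   # prints "aaabbbbbb"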
If h" E~ --~ Ez* is a h o m o m o r p h i s m , then the relation h-1. E2* ---~ 6~(E~*),
defined below, is called an inverse homomorphism. If y is in E2*, then h-~(y)
is the set of strings over E1 which get mapped by h to y. That is, h-~(y) =
(xlh(x) = y}. I f L is a language over E2, then h-I(L)is the language over Ex
consisting of those strings which get mapped by h into a string in L. Formally,
h - ' ( L ) = U h-'(y) = (xlh(x) ~ L].
y EL
Example 0.13
Let h be a homomorphism such that h(0) = a and h(1) = a. It follows
that h⁻¹(a) = {0, 1} and h⁻¹(a*) = {0, 1}*.
As a second example, suppose that h is a homomorphism such that
h(0) = a and h(1) = e. Then h⁻¹(e) = 1* and h⁻¹(a) = 1*01*. Here 1*01*
denotes the language {1ⁱ01ʲ | i, j ≥ 0}, which is consistent with our definitions
and the convention which identifies a and {a}. □
EXERCISES
0.2.1. Give all the (a) prefixes, (b) suffixes, and (c) substrings of the string abc.
0.2.2. Prove or disprove: L⁺ = L* - {e}.
0.2.3. Let h be the homomorphism defined by h(0) = a, h(1) = bb, and h(2) = e.
What is h(L), where L = {012}*?
0.2.4. Let h be as in Exercise 0.2.3. What is h⁻¹({ab}*)?
*0.2.5. Prove or disprove the following:
(a) h⁻¹(h(L)) = L.
(b) h(h⁻¹(L)) = L.
0.2.6. Can L* or L⁺ ever be ∅? Under what circumstances are L* and L⁺
finite?
0.3.1. Proofs
Example 0.14
Suppose then that S(n) is the statement
1 + 3 + 5 + ... + (2n - 1) = n²
That is, the sum of the first n odd integers is n², a perfect square. Suppose we wish to show
that S(n) is true for all positive integers. Thus N = {1, 2, 3, ...}.
Basis. For n = 1 we have 1 = 1².
Inductive Step. Assuming S(1), ..., S(n) are true [in particular, that S(n)
is true], we have
1 + 3 + 5 + ... + (2n - 1) + (2n + 1) = n² + 2n + 1 = (n + 1)²,
so S(n + 1) is true, and the induction is complete.
Often a statement (theorem) may read "P if and only if Q" or "P is
a necessary and sufficient condition for Q," where P and Q are themselves
statements. The terms if, only if, necessary, and sufficient have precise mean-
ings in logic.
A logical connective is a symbol that can be used to create a statement
out of simpler statements. For example, and, or, not, implies are logical
connectives, not being a unary connective and the others binary connectives.
If P and Q are statements, then P and Q, P or Q, not P, and P implies Q are
also statements.
The symbol ∧ is used to denote and, ∨ to denote or, ∼ to denote not,
and → to denote implies.
There are well-defined rules governing the truth or falsehood of a state-
ment containing logical connectives. For example, the statement P and Q is
true only when both P is true and Q is also true. We can summarize the
properties of a logical connective by a table, called a truth table, which dis-
plays the value of a composite statement in terms of the values of its compo-
nents. Figure 0.5 shows the truth table for the logical connectives and, or,
not, and implies.
P   Q   P ∧ Q   P ∨ Q   ∼P   P → Q
F   F     F       F       T      T
F   T     F       T       T      T
T   F     F       T       F      F
T   T     T       T       F      T

Fig. 0.5 Truth tables for and, or, not, and implies.
From the table (Fig. 0.5) we see that P → Q is false only when P is true
and Q is false. It may seem a little odd that if P is false, then P implies Q is true regardless of the truth of Q.
EXERCISES
DEFINITION
Propositional calculus is a good example of a mathematical system. Formally,
propositional calculus can be defined as a system S consisting of
(1) A set of primitive symbols,
(2) Rules for generating well-formed statements,
(3) A set of axioms, and
(4) Rules of inference.
tWe assume "not" takes precedence over "implies." Thus the proper phrasing of the
sentence is (not P) implies (not Q). In general, "not" takes precedence over "and," which
takes precedence over "or," which takes precedence over "implies."
(1) The primitive symbols of S are (, ), →, ∼, and an infinite set of statement
letters a1, a2, a3, .... The symbol → can be thought of as implies and ∼ as not.
(2) A well-formed statement is formed by one or more applications of the
following rules:
(a) A statement letter is a statement.
(b) If A and B are statements, then so are (∼A) and (A → B).
(3) Let A, B, and C be statements. The axioms of S are
A1: (A → (B → A))
A2: ((A → (B → C)) → ((A → B) → (A → C)))
(4) The rule of inference is modus ponens; i.e., from the statements (A → B)
and A we can infer the statement B.
We shall leave out parentheses wherever possible. The statement a → a is a
theorem of S and has as proof the sequence of statements
(i) (a → ((a → a) → a)) → ((a → (a → a)) → (a → a)) from A2 with
A = a, B = (a → a), and C = a.
(ii) a → ((a → a) → a) from A1.
(iii) (a → (a → a)) → (a → a) by modus ponens from (i) and (ii).
(iv) a → (a → a) from A1.
(v) a → a by modus ponens from (iii) and (iv).
"0.3.1. Prove that ( ~ a --~ a) ~ a is a theorem of S.
0.3.2. A tautology is a statement that is true for all possible truth values of
the statement variables. Show that every theorem of S is a tautology.
Hint: Prove the theorem by induction on the n u m b e r of steps necessary
to obtain the theorem.
**0.3.3. Prove the converse of Exercise 0.3.2, i.e., that every tautology is a theorem.
Thus a simple method to determine whether a statement of propositional
calculus is a theorem is to determine whether that statement is a tautology.
(A brute-force check of this kind is sketched after these exercises.)
0.3.4. Give the truth table for the statement if P then if Q then R.
DEFINITION
Boolean algebra can be interpreted as a system for manipulating
truth-valued variables using logical connectives informally interpreted
as and, or, and not. Formally, a Boolean algebra is a set B together with
operations · (and), + (or), and ¯ (not). The axioms of Boolean algebra
are the following: For all a, b, and c in B,
(1) a + (b + c) = (a + b) + c (associativity)
    a · (b · c) = (a · b) · c.
(2) a + b = b + a (commutativity)
    a · b = b · a.
(3) a · (b + c) = (a · b) + (a · c) (distributivity)
    a + (b · c) = (a + b) · (a + c).
Show that then S(a) is true for all a in A. Note that this is a generaliza-
tion of the principle of simple induction.
0.3.12. Show that there are only four unary logical connectives. Give their truth
tables.
0.3.13. Show that there are 16 binary logical connectives.
0.3.14. Two logical statements are equivalent if they have the same truth table.
Show that
(a) ∼(P ∧ Q) is equivalent to ∼P ∨ ∼Q.
(b) ∼(P ∨ Q) is equivalent to ∼P ∧ ∼Q.
0.3.15. Show that P → Q is equivalent to ∼Q → ∼P.
0.3.16. Show that P → Q is equivalent to (P ∧ ∼Q) → false.
*0.3.17. A set of logical connectives is complete if for any logical statement we
can find an equivalent statement containing only those logical connectives.
Show that {∧, ∼} and {∨, ∼} are complete sets of logical connectives.
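As Exercises 0.3.2 and 0.3.3 assert, the theorems of S are exactly the tautologies, so theoremhood in propositional calculus can be decided by an exhaustive truth-table check. A brute-force Python sketch of such a check follows (encoding statements as Python functions of a truth assignment is our own device):

    from itertools import product

    def is_tautology(stmt, letters):
        """stmt maps a truth assignment (a dict from letters to bools) to a bool.
        Return True if stmt is true under every assignment to the given letters."""
        return all(stmt(dict(zip(letters, values)))
                   for values in product([False, True], repeat=len(letters)))

    implies = lambda p, q: (not p) or q

    # (~a -> a) -> a, the statement of Exercise 0.3.1, is a tautology:
    print(is_tautology(lambda v: implies(implies(not v["a"], v["a"]), v["a"]), ["a"]))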
BIBLIOGRAPHIC NOTES
0.4.1. Procedures
Example 0.15
0.4.2. Algorithms
Example 0.16
Consider the procedure of Example 0.15. We observe that steps 1 and 2
must be executed alternately. After step 1, step 2 must be executed. After
step 2, step 1 may be executed, or there may be no next step; i.e., the proce-
dure halts. We can prove that for every input p and q, the procedure halts
after at most 2q steps,† and that thus the procedure is an algorithm. The
proof turns on observing that the value r computed in step 1 is less than
the value of q, and that, hence, successive values of q when step 1 is executed
form a monotonically decreasing sequence. Thus, by the qth time step 2 is
executed, r, which cannot be negative and is less than the current value of q,
must attain the value zero. When r = 0, the procedure halts. □
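Example 0.15, the Euclidean procedure analyzed above, is not reproduced in this excerpt. The following Python sketch shows the two-step procedure as we read it from the analysis; the step numbers in the comments refer to that reading, not to the original statement:

    def euclid_gcd(p, q):
        """Euclidean algorithm for the greatest common divisor of positive integers p and q."""
        while True:
            r = p % q          # step 1: compute the remainder r of p divided by q
            if r == 0:
                return q       # step 2: if r = 0, halt with answer q; otherwise
            p, q = q, r        #         continue with the smaller pair (q, r)

    print(euclid_gcd(72, 30))  # prints 6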
There are several reasons why a procedure may fail to halt on some inputs.
It is possible that a procedure can get into an infinite loop under certain
conditions. For example, if a procedure contained the instruction
Step 1: If x = 0, then go to Step 1, else halt,
then for x = 0 the procedure would never halt. Variations on this situation
are countless.
Our interest will be almost exclusively in algorithms. We shall be inter-
ested not only in proving that algorithms are correct, but also in evaluating
algorithms. The two main criteria for evaluating how well algorithms perform
will be
(1) The number of elementary mechanical operations executed as a func-
tion of the size of the input (time complexity), and
(2) How large an auxiliary memory is required to hold intermediate
results that arise during the execution, again as a function of the size of the
input (space complexity).

†In fact, 4 log₂ q is an upper bound on the number of steps executed for q > 1. We
leave this as an exercise.
Example 0.17
0.4.5. Problems
We shall use the word problem in a rather specific way in this book.
DEFINITION
Example 0.18
An example of a problem is "x is less than y, for integers x and y." More
colloquially, we can express the statement in question form and delete men-
tion of the type of x and y: "Is x less than y ?"
DEFINITION
An instance of a problem is a set of allowable values for its unknowns.
For example, the instances of the problem of Example 0.18 are ordered
pairs of integers.
A mapping from the set of instances of a problem to {yes, no} is called
a solution to the problem. If this mapping can be specified by an algorithm,
then the problem is said to be (recursively) decidable or solvable. If no
algorithm exists to specify this mapping, then the problem is said to be
(recursively) undecidable or unsolvable.
One of the remarkable achievements of twentieth-century mathematics
was the discovery of problems that are undecidable. We shall see later that
undecidable problems seriously hamper the development of a broadly appli-
cable theory of computation.
Example 0.19
Let us discuss the particular problem "Is procedure P an algorithm?"
Its analysis will go a long way toward exhibiting why some problems are
undecidable. First, we must assume that all procedures are specified in some
formal system such as those mentioned earlier in this section.
It appears that every formal specification language for procedures admits
only a countable number of procedures. While we cannot prove this in gen-
eral, we give one example, the formalism for representing absolute machine
language programs, and leave the other mentioned specifications for the
Exercises. Any absolute machine language program is a finite sequence of
0's and 1's (which we imagine are grouped 32, 36, 48, or some number, to
a machine word).
Suppose that we have a string of 0's and 1's representing a machine language
program. We can assign an integer to this program by giving its position
in some well ordering of all strings of 0's and 1's. One such ordering
can be obtained by ordering the strings of 0's and 1's in terms of increasing
length and lexicographically ordering strings of equal length by treating each
string as a binary number. Since there are only a finite number of strings of
any length, every string in {0, 1}* is thus mapped to some integer. The first
few strings in this ordering and their corresponding integers are:
Integer String
1 e
2 0
3 1
4 00
5 01
6 10
7 11
8 000
9 001
In this fashion we see that for each machine language program we can find
a unique integer and that for each integer we can find a certain machine
language program.
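A Python sketch of this well ordering of {0, 1}*, pairing each string with a positive integer by increasing length and lexicographically within a length (the function names are ours):

    def string_to_integer(w):
        """Position of w in the ordering e, 0, 1, 00, 01, 10, 11, 000, ... (1-based)."""
        # There are 2**0 + 2**1 + ... + 2**(len(w) - 1) = 2**len(w) - 1 shorter strings,
        # and int(w, 2) strings of the same length precede w lexicographically.
        return 2 ** len(w) + (int(w, 2) if w else 0)

    def integer_to_string(n):
        """Inverse mapping: the nth string in the same ordering."""
        length = n.bit_length() - 1
        return format(n - 2 ** length, "b").zfill(length) if length else ""

    print([integer_to_string(n) for n in range(1, 10)])  # ['', '0', '1', '00', '01', ...]
    print(string_to_integer("001"))                      # 9, as in the table above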
It seems that no matter what formalism for specifying procedures is taken,
we shall always be able to find a one-to-one correspondence between pro-
cedures and integers. Thus it makes sense to talk about the ith procedure in
any given formalism for specifying procedures. Moreover, the correspon-
dence between procedures and integers is sufficiently simple that one can,
given an integer i, write out the ith procedure, or given a procedure, find its
corresponding number.
Let us suppose that there is a procedure Pj which is an algorithm and
takes as input a specification of a procedure in our formalism and returns
the answer "yes" if and only if its input is an algorithm. All known formalisms
for procedure specification have the property that procedures can be com-
bined in certain simple ways. In particular, given the hypothetical procedure
(algorithm) Pj, we could construct an algorithm Pk to work as follows:
ALGORITHM Pk
(1) Pk takes as input the specification of a procedure P in our formalism.
(2) Pk applies Pj to P. If Pj answers "yes," i.e., P is an algorithm, and if P takes one procedure specification
as input and gives "yes" or "no" as output, Pk applies P to itself (P) as input.
(We assume that procedure specifications are such that these questions about
input and output forms can be ascertained by inspection. The assumption
is true in known cases.)
(3) Pk gives output "no" or "yes" if P gives output "yes" or "no," respec-
tively.
We see that Pk is an algorithm, on the assumption that Pj is an algorithm.
Also Pk requires one input. But what does Pk do when its input is itself?
Presumably, Pj determines that Pk is an algorithm [i.e., Pj(Pk) = "yes"].
Pk then simulates itself on itself. But now Pk cannot give an output that is
consistent. If Pk determines that this simulation gives "yes" as output, Pk
gives "no" as output. But Pk just determined that it gave "yes" when applied
to itself. A similar paradox occurs if Pk finds that the simulation gives "no."
We must conclude that it is fallacious to assume that the algorithm Pj exists,
and thus the question "Is P an algorithm?" is not decidable for any of the
known procedure formalisms. □
YlYz "'" Ym" We shall call such a sequence a viable sequence for this instance
of Post's correspondence problem. We shall often use xlx2...xm to represent
the viable sequence.
Example 0.20
Consider the following instance of Post's correspondence problem over
{a, b}:
{(abbb, b), (a, aab), (ba, b)}
The sequence (a, aab), (a, aab), (ba, b), (abbb, b) is viable, since
(a)(a)(ba)(abbb) = (aab)(aab)(b)(b).
The instance {(ab, aba), (aba, baa), (baa, aa)} of Post's correspondence
problem has no viable sequences, since any such sequence must begin with
the pair (ab, aba), and from that point on, the total number of a's in the
first components of the pairs in the sequence will always be less than the num-
ber of a's in the second components. □
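Although the existence of a viable sequence is, as we shall see, undecidable in general, checking whether one particular proposed sequence is viable is purely mechanical. A Python sketch (names ours):

    def is_viable(sequence):
        """sequence is a list of pairs (x_i, y_i) drawn, with repetition allowed, from an
        instance of Post's correspondence problem; it is viable when the concatenation
        of the first components equals the concatenation of the second components."""
        first = "".join(x for x, _ in sequence)
        second = "".join(y for _, y in sequence)
        return first == second

    # The viable sequence of Example 0.20:
    print(is_viable([("a", "aab"), ("a", "aab"), ("ba", "b"), ("abbb", "b")]))  # True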
EXERCISES
0.4.1. A perfect number is an integer which is equal to the sum of all its
divisors (including 1 but excluding the number itself). For example,
6 = 1 + 2 + 3 and 28 = 1 + 2 + 4 + 7 + 14 are the first two per-
fect numbers. (The next three are 496, 8128, and 33550336.) Construct
a procedure which has input i and output the ith perfect number. (At
present it is not known whether there are a finite or infinite number of
perfect numbers.)
0.4.2. Prove that the Euclidean algorithm of Example 0.15 is correct.
0.4.3. Provide an algorithm to add two n-digit decimal numbers. How much
time and space does the algorithm require as a function of n ? (See
Winograd [1965] for a discussion of the time complexity of addition.)
0.4.4. Provide an algorithm to multiply two n-digit decimal numbers. How
much time and space does the algorithm require ? (See Winograd [1967]
and Cook and Aanderaa [1969] for a discussion of the time com-
plexity of multiplication.)
0.4.5. Give an algorithm to multiply two integer-valued n by n matrices.
Assume that integer arithmetic operations can be done in one step.
What is the speed of your algorithm? If it is proportional to n³ steps,
see Strassen [1969] for an asymptotically faster one.
0.4.6. Let L ⊆ {a, b}*. The characteristic function for L is a mapping
fL : Z → {0, 1}, where Z is the set of nonnegative integers, such that
fL(i) = 1 if the ith string in {a, b}* is in L and fL(i) = 0 otherwise.
Show that L is recursively enumerable if and only if fL is a partial
recursive function.
0.4.7. Show that L is a recursive set if and only if both L and L̄ are recursively
enumerable.
0.4.8. Let P be a procedure which defines a recursively enumerable set
L ⊆ {a, b}*. From P construct a procedure P' which will generate all
and only all the elements of L. That is, the output of P' is to be an
infinite string of the form x1 # x2 # x3 # ..., where L = {x1, x2, ...}.
Hint: Construct P' to apply i steps of procedure P to the jth string in
{a, b}* for all (i, j), in a reasonable order.
DEFINITION

A Turing machine consists of a finite set of states (Q), tape symbols
(Γ), and a function δ (the next move function, i.e., program) that maps
a subset of Q × Γ to Q × Γ × {L, R}. A subset Σ ⊆ Γ is designated
as the set of input symbols and one symbol in Γ - Σ is designated the
blank. One state, q0, is designated the start state. The Turing machine
operates on a tape, one square of which is pointed to by a tape head.
All but a finite number of squares hold the blank at any time. A configuration
of a Turing machine is a pair (q, α ↑ β), where q is the state,
αβ is the nonblank portion of the tape, and ↑ is a special symbol,
indicating that the tape head is at the square immediately to its right.
(↑ does not occupy a square.)
The next configuration after configuration (q, α ↑ β) is determined
by letting A be the symbol scanned by the tape head (the leftmost
symbol of β, or the blank if β = e) and finding δ(q, A). Suppose that
δ(q, A) = (p, A', D), where p is a state, A' a tape symbol, and D = L
or R. Then the next configuration is (p, α' ↑ β'), where α'β' is formed
from α ↑ β by replacing the A to the right of ↑ by A' and then moving
the symbol ↑ in direction D (left if D = L, right if D = R). It may be
necessary to insert a blank at one end in order to move ↑.
The Turing machine can be thought of as a formalism for defining
procedures. Its input may be any finite length string w in Σ*. The
procedure is executed by starting with configuration (q0, ↑ w) and repeatedly
computing next configurations. If the Turing machine halts, i.e.,
it has reached a configuration for which no move is defined (recall that
δ may not be specified for all pairs in Q × Γ), then the output is the
nonblank portion of the Turing machine's tape.
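A compact Python sketch of the configuration-to-configuration computation just described; the configuration (q, α ↑ β) is encoded as a state plus the two tape halves, and giving δ as a Python dictionary is our own representation, not the text's:

    def run_turing_machine(delta, q0, w, blank="B", max_steps=10000):
        """Simulate a Turing machine whose next-move function delta maps (state, symbol)
        to (state, symbol, 'L' or 'R'), started in state q0 on input w.
        Returns the nonblank portion of the tape if the machine halts."""
        q, left, right = q0, "", w            # configuration (q, left ^ right)
        for _ in range(max_steps):
            a = right[0] if right else blank  # scanned symbol: leftmost symbol of beta
            if (q, a) not in delta:           # no move defined, so the machine halts
                return (left + right).strip(blank)
            q, b, d = delta[(q, a)]
            right = b + right[1:]             # overwrite the scanned square with b
            if d == "R":
                left, right = left + right[0], right[1:]
            else:                             # d == "L"; insert a blank on the left if needed
                if not left:
                    left = blank
                left, right = left[:-1], left[-1] + right
        raise RuntimeError("no halt within max_steps; the machine may be a procedure but not an algorithm")

    # A tiny machine that changes every 0 to 1, moving right, and halts on the first blank.
    delta = {("q0", "0"): ("q0", "1", "R"), ("q0", "1"): ("q0", "1", "R")}
    print(run_turing_machine(delta, "q0", "0101"))   # prints "1111"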
*0.4.9. Exhibit a Turing machine that, given an input w in {0, 1}*, will write
YES on its tape if w is a palindrome (i.e., w = wᴿ) and write NO
otherwise, halting in either case.
*0.4.10. Assume that all Turing machines use a finite subset of some countable
set of symbols a1, a2, ... for their states and tape symbols. Show that
there is a one-to-one correspondence between the integers and Turing
machines.
**0.4.11. Show that there is no Turing machine which halts on all inputs (i.e.,
algorithm) and determines, given integer i written in binary on its tape,
whether the ith Turing machine halts. (See Exercise 0.4.10.)
*0.4.12. Let a1, a2, ... be a countable set of symbols. Show that the set of
finite-length strings over these symbols is countable.
*0.4.13. Informally describe a Turing machine which takes a pair of integers
i and j as input, and halts if and only if the ith Turing machine halts
when given the jth string (as in Exercise 0.4.12) as input. Such a Turing
machine is called universal.
**0.4.14. Show that there exists no Turing machine which always halts, takes
input (i, j), a pair of integers, and prints YES on its tape if Turing
machine i halts with input j and NO otherwise. Hint: Assume such
a Turing machine existed, and derive a contradiction as in Example
0.19. The existence of a universal Turing machine is useful in many
proofs.
**0.4.15. Show that there is no Turing machine (not necessarily an algorithm)
which determines whether an arbitrary Turing machine is an algorithm.
Note that this statement is stronger than Exercise 0.4.14, where we
essentially showed that no such Turing machine which always halts
exists.
"0.4.16. Show that it is undecidable whether a given Turing machine halts when
started with blank tape.
0.4.17. Show that the problem of determining whether a statement is a theorem
in propositional calculus is decidable. Hint: See Exercises 0.3.2 and
0.3.3.
0.4.18. Show that the problem of deciding whether a string is in a particular
recursive set is decidable.
0.4.19. Does Post's correspondence problem have a viable sequence in the
following instances?
(a) (01, 011), (10, 000), (00, 0).
(b) (1, 11), (11, 101), (101, 011), (011, 1011).
How do you reconcile being able to answer this exercise with the fact
that Post's correspondence problem is undecidable ?
0.4.20. Show that Post's correspondence problem with strings restricted to be
over the alphabet {a} is decidable. How do you reconcile this result
with the undecidability of Post's correspondence problem?
BIBLIOGRAPHIC NOTES
Davis [1965] is a good anthology of many early papers in the study of proce-
dures and algorithms. Turing's paper [Turing, 1936-1937] in which Turing machines
first appear makes particularly interesting reading if one bears in mind that the
paper was written before modern electronic computers were conceived.
The study of recursive and partial recursive functions is part of a now well-
developed branch of mathematics called recursive function theory. Rogers [1967],
Kleene [1952], and Davis [1958] are good references in this subject.
Post's correspondence problem first appeared in Post [1947]. The partial cor-
respondence problem preceding Exercise 0.4.24 is from Knuth [1965].
Computational complexity is the study of algorithms from the point of view
of measuring the number of primitive operations (time complexity) or the amount
of auxiliary storage (space complexity) required to compute a given function.
Borodin [1970] and Hartmanis and Hopcroft [1971] give readable surveys of
this topic, and Irland and Fischer [1970] have compiled a bibliography on this
subject.
Solutions to many of the *'d exercises in this section can be found in Minsky
[1967] and Hopcroft and Ullman [1969].
Example 0.21
Let G = (A, R), where A = {1, 2, 3, 4} and R = {(1, 1), (1, 2), (2, 3),
(2, 4), (3, 4), (4, 1), (4, 3)}. We can draw a picture of the graph G by numbering
four points 1, 2, 3, 4 and drawing an arrow from point a to point b if
(a, b) is in R. Figure 0.6 shows a picture of this directed graph. □
Let G1 = (A1, R1) and G2 = (A2, R2) be graphs. We say G1 and G2 are
equal (or the same) if there is a bijection f : A1 → A2 such that a R1 b if and
only if f(a) R2 f(b).
Example 0.22
Let G1 = ({a, b, c}, {(a, b), (b, c), (c, a)}) and G2 = ({0, 1, 2}, {(1, 0), (2, 1),
(0, 2)}). Let the labeling of G1 be defined by f1(a) = f1(b) = X, f1(c) = Y,
g1((a, b)) = g1((b, c)) = α, g1((c, a)) = β. Let the labeling of G2 be
f2(0) = f2(2) = X, f2(1) = Y, g2((0, 2)) = g2((2, 1)) = α, and g2((1, 0)) = β.
G1 and G2 are shown in Fig. 0.7.
G1 and G2 are equal. The correspondence is h(a) = 0, h(b) = 2, h(c) = 1.
Fig. 0.7 The labeled graphs (a) G1 and (b) G2.
DEFINITION
A dag (short for directed acyclic graph) is a directed graph that has no
cycles. Figure 0.8 shows an example of a dag.
A node having in-degree 0 will be called a base node. One having out-
degree 0 is called a leaf. In Fig. 0.8, nodes 1, 2, 3, and 4 are base nodes and
nodes 2, 4, 7, 8, and 9 are leaves.
0.5.3. Trees
numbered 1. We shall follow the convention of drawing trees with the root
on top and having all arcs directed downward. Adopting this convention we
can omit the arrowheads.
THEOREM 0.3
A tree T has the following properties:
(1) T is acyclic.
(2) For each node in a tree there is a unique path from the root to that
node.
Proof. Exercise. □
DEFINITION
A subtree of a tree T = (A, R) is any tree T' = (A', R') such that
(1) A' is nonempty and contained in A,
(2) R' = (A' × A') ∩ R, and
(3) No node of A - A' is a descendant of a node in A'.
For example,
is a subtree of the tree in Fig. 0.9. We say that the root of a subtree dominates
the subtree.
DEFINITION
An ordered directed graph is a pair (A, R) where A is a set of vertices as
before and R is a set of linearly ordered lists of edges such that each element
of R is of the form ((a, b1), (a, b2), ..., (a, bn)), where a is a distinct member
of A. This element would indicate that, for vertex a, there are n arcs leaving
a, the first entering vertex b1, the second entering vertex b2, and so forth.
Example 0.23
Figure 0.10 shows a picture of an ordered directed graph. The linear
ordering on the arcs leaving a vertex is indicated by numbering the arcs
leaving a vertex by 1, 2, ..., n, where n is the out-degree of that vertex.
The formal specification for Fig. 0.10 is (A, R), where A = {a, b, c} and
R = {((a, c), (a, b), (a, b), (a, a)), ((b, c))}. □
Notice that Fig. 0.10 is not a directed graph according to our definition,
since there are two arcs leaving node a and entering node b. (Recall that
in a set there is only one instance of each element.)
As for unordered graphs, we define the notions of labeling and equality
of ordered graphs.
DEFINITION
A labeling of an ordered graph G = (A, R) is a pair of mappings f and
g such that
(1) f : A → S for some set S (f labels the nodes), and
(2) g maps R to sequences of symbols from some set T such that g maps
((a, b1), ..., (a, bn)) to a sequence of n symbols of T. (g labels the edges.)
Labeled graphs G1 = (A1, R1) and G2 = (A2, R2) with labelings (f1, g1)
and (f2, g2), respectively, are equal if there exists a bijection h : A1 → A2
such that
(1) R1 contains ((a, b1), ..., (a, bn)) if and only if R2 contains
((h(a), h(b1)), ..., (h(a), h(bn))),
(2) f1(a) = f2(h(a)) for all a in A1, and
(3) g1(((a, b1), ..., (a, bn))) = g2(((h(a), h(b1)), ..., (h(a), h(bn)))).
Informally, two labeled ordered graphs are equal if there is a one-to-one
correspondence between nodes that preserves the node and edge labels. If
the labeling functions all have a range with one element, then the graph is
essentially unlabeled, and only condition (1) needs to be shown. Similarly,
only the node labeling or only the edge labeling may map to a single element,
and condition (2) or (3) will become trivial.
For each ordered graph (A, R), there is an underlying unordered graph
(A, R') formed by allowing R' to be the set of (a, b) such that there is a list
((a, b1), ..., (a, bn)) in R, and b = bi for some i, 1 ≤ i ≤ n.
An ordered dag is an ordered graph whose underlying graph is a dag.
An ordered tree is an ordered graph (A, R) whose underlying graph is
a tree, and such that if ((a, b1), ..., (a, bn)) is in R, then bi ≠ bj if i ≠ j.
Unless otherwise stated, we shall assume that the direct descendants of
a node of an ordered dag or tree are always linearly ordered from left to
right in a diagram.
There is a great distinction between ordered graphs and unordered graphs
from the point of view of when two graphs are the same.
For example, the two trees T1 and T2 in Fig. 0.11 are equivalent if T1 and
T2 are unordered. But if T1 and T2 are ordered, then T1 and T2 are not the
same.

Fig. 0.11 Two trees.
Many theorems about dags, and especially trees, can be proved by induc-
tion, but it is often not clear on what to base the induction. Theorems which
yield to this kind of proof are often of the form that something is true for
all, or a certain subset of, the nodes of the tree. Thus we must prove some-
thing about nodes of the tree, and we need some parameter of nodes such
that the inductive step can be proved.
Two such parameters are the depth of a node, the minimum path length
(or in the case of a tree, the path length) from a base node (root in the case
of a tree) to the given node, and the height (or level) of a node, the maximum
path length from the node to a leaf.
Another approach to inductions on finite ordered trees is to order the
nodes in some way and perform the induction on the position of the node
in that sequence. Two common orderings are defined below.
DEFINITION

Let T be an ordered tree with root r and with subtrees T1, T2, ..., Tk of the root, in order. The preorder of the nodes of T consists of r followed by the nodes of T1 in preorder, then the nodes of T2 in preorder, and so on through Tk. The postorder of the nodes of T consists of the nodes of T1 in postorder, then those of T2 in postorder, and so on through Tk, followed by r.
Example 0.24
Consider the ordered tree of Fig. 0.12. The preorder of the nodes is
123456789. The postorder is 342789651. □
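A Python sketch of the two orders, for trees encoded as nested pairs (node, list of subtrees); this encoding is ours, and the sample tree is our reconstruction of Fig. 0.12 from Example 0.24 and from the right-bracketed representation given later in this section:

    def preorder(tree):
        """Root first, then the nodes of each subtree in preorder, left to right."""
        node, children = tree
        return [node] + [n for child in children for n in preorder(child)]

    def postorder(tree):
        """The nodes of each subtree in postorder, left to right, then the root last."""
        node, children = tree
        return [n for child in children for n in postorder(child)] + [node]

    # Fig. 0.12 (as reconstructed): root 1 with subtrees rooted at 2 (children 3, 4)
    # and 5, where 5 has one child 6 whose children are 7, 8, and 9.
    fig_0_12 = (1, [(2, [(3, []), (4, [])]),
                    (5, [(6, [(7, []), (8, []), (9, [])])])])
    print("".join(map(str, preorder(fig_0_12))))   # 123456789
    print("".join(map(str, postorder(fig_0_12))))  # 342789651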
such that all edges point downward. The linear order is given by the position
of nodes in the column.
For example, under this type of transformation the dag of Fig. 0.8 could
look as shown in Fig. 0.13.
Formally, we say that R' is a linear order that embeds a partial order R on
a set A if R' is a linear order and R ⊆ R', i.e., a R b implies that a R' b for
all a, b in A. Given a partial order R, there are many linear orders that embed
R (Exercise 0.5.5). The following algorithm finds one such linear order.
ALGORITHM 0.1
Topological sort.
Input. A partial order R on a finite set A.
Output. A linear order R' on A such that R ~ R'.
Method. Since A is a finite set, we can represent the linear order R' on
A as a list a1, a2, ..., an such that ai R' aj if i < j, and A = {a1, ..., an}.
The following steps construct this sequence of elements:
(1) Let i = 1, A1 = A, and R1 = R.
(2) If Ai is empty, halt, and a1, a2, ..., ai-1 is the desired linear order.
Otherwise, let ai be an element in Ai such that a Ri ai is false for all a ∈ Ai.
(3) Let Ai+1 be Ai - {ai} and Ri+1 be Ri ∩ (Ai+1 × Ai+1). Then let i
be i + 1 and repeat step (2). □
If we represent a partial order as a dag, then Algorithm 0.1 has a particularly
simple interpretation. At each step (Ai, Ri) is a dag and ai is a base
node of (Ai, Ri). The dag (Ai+1, Ri+1) is formed from (Ai, Ri) by deleting
node ai and all edges leaving ai.
Example 0.25
Let A = {a, b, c, d} and R = {(a, b), (a, c), (b, d), (c, d)}. Since a is the
only node in A such that a' R a is false for all a' ∈ A, we must choose a1 = a.
Then A2 = {b, c, d} and R2 = {(b, d), (c, d)}; we now choose either b or
c for a2. Let us choose a2 = b. Then A3 = {c, d} and R3 = {(c, d)}. Continuing,
we find a3 = c and a4 = d.
The complete linear order R' is {(a, b), (b, c), (c, d), (a, c), (b, d), (a, d)}.
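A Python sketch of Algorithm 0.1, with the partial order represented as a finite set of ordered pairs (that representation, and the function name, are ours):

    def topological_sort(A, R):
        """Algorithm 0.1: return a list a1, ..., an whose order embeds the partial order R on A."""
        remaining = set(A)
        pairs = {(a, b) for (a, b) in R if a != b}
        order = []
        while remaining:                              # step (2): halt when A_i is empty
            # choose an element a_i of A_i with no a in A_i such that a R_i a_i holds
            a_i = next(a for a in remaining
                       if not any((b, a) in pairs for b in remaining))
            order.append(a_i)
            remaining.remove(a_i)                     # step (3): delete a_i and its edges
            pairs = {(a, b) for (a, b) in pairs if a in remaining and b in remaining}
        return order

    # Example 0.25 (either b or c may legitimately come second):
    print(topological_sort({"a", "b", "c", "d"},
                           {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}))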
THEOREM 0 . 4
Algorithm 0.1 produces a linear order R' which embeds the given partial
order R.
Proof. A simple inductive exercise.
But there are also other representations. For example, we can use nested
brackets to indicate the nodes at each depth of a tree. Recall that the depth
of a node in a tree is the length of the path from the root to that node. For
example, in Fig. 0.9, node 1 is at depth 0, node 3 is at depth 1, and node 6
is at depth 2. The depth of a tree is the length of the longest path. The tree of
Fig. 0.9 has depth 2.
Using brackets to indicate depth, the tree of Fig. 0.9 could be represented
as 1(2, 3(4, 5, 6)). We shall call this the left-bracketed representation, since
a subtree is represented by the expression appearing inside a balanced pair
of parentheses and the node which is the root of that subtree appears imme-
diately to the left of the left parenthesis.
DEFINITION
In general the left-bracketed representation of a tree T can be obtained by
applying the following recursive rules to T. The string lrep(T) denotes the
left-bracketed representation of tree T.
(1) If T has a root numbered a with subtrees T1, T2, ..., Tk in order,
then lrep(T) = a(lrep(T1), lrep(T2), ..., lrep(Tk)).
(2) If T has a root numbered a with no direct descendants, then
lrep(T) = a.
If we delete the parentheses from a left-bracketed representation of a tree,
we are left with a preorder of the nodes.
We can also obtain a right-bracketed representation for a tree T, rrep(T),
as follows:
(1) If T has a root numbered a with subtrees T1, T2, ..., Tk, then rrep(T)
= (rrep(T1), rrep(T2), ..., rrep(Tk))a.
(2) If T has a root numbered a with no direct descendants, then
rrep(T) = a.
Thus rrep(T) for the tree of Fig. 0.12 would be ((3, 4)2, ((7, 8, 9)6)5)1.
In this representation, the direct ancestor is immediately to the right of
the first right parenthesis enclosing that node. Also, note that if we delete
the parentheses we are left with a postorder of the nodes.
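A Python sketch of lrep and rrep on the same nested-pair tree encoding used in the preorder/postorder sketch earlier in this section:

    def lrep(tree):
        """Left-bracketed representation: a(lrep(T1), ..., lrep(Tk)), or just a for a leaf."""
        node, children = tree
        if not children:
            return str(node)
        return str(node) + "(" + ", ".join(lrep(c) for c in children) + ")"

    def rrep(tree):
        """Right-bracketed representation: (rrep(T1), ..., rrep(Tk))a, or just a for a leaf."""
        node, children = tree
        if not children:
            return str(node)
        return "(" + ", ".join(rrep(c) for c in children) + ")" + str(node)

    fig_0_12 = (1, [(2, [(3, []), (4, [])]),
                    (5, [(6, [(7, []), (8, []), (9, [])])])])
    print(lrep(fig_0_12))   # 1(2(3, 4), 5(6(7, 8, 9)))
    print(rrep(fig_0_12))   # ((3, 4)2, ((7, 8, 9)6)5)1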
Another representation of a tree is to list the direct ancestor of nodes
1, 2, ..., n of a tree in that order. The root would be recognized by letting
its ancestor be 0.
Example 0.26
The tree shown in Fig. 0.14 would be represented by 0122441777. Here
0 in position 1 indicates that node 1 has "node 0" as its direct ancestor (i.e.,
node 1 is the root). The 1 in position 7 indicates that node 7 has direct ances-
tor 1.
        1  2  3  4
    1 [ 1  1  0  0 ]
    2 [ 0  0  1  1 ]
    3 [ 0  0  0  1 ]
    4 [ 1  0  1  0 ]

Fig. 0.15 Boolean matrix for Fig. 0.6.
"t'That is, use the usual formula for matrix multiplication with the Boolean operations
• and + for multiplication and addition, respectively.
Since step (3) is executed once for all possible values of i, j, and k, Algorithm
0.2 is n³ in time complexity.
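Algorithm 0.2 itself is not reproduced in full in this excerpt, but statement (0.5.1) below pins down what its step (3) computes: after the pass with k = l, mij holds the cost of a cheapest path from i to j that passes through no intermediate node numbered above l. A Python sketch of that computation under this reading (numbering the nodes from 0 and the function name are our choices):

    def min_cost_paths(cost):
        """cost[i][j] is the cost of the edge from node i to node j (use a very large
        value where there is no edge, and 0 on the diagonal). Returns the matrix m of
        cheapest path costs, allowing each node k in turn as an intermediate node."""
        n = len(cost)
        m = [row[:] for row in cost]                      # start from the edge costs
        for k in range(n):                                # one pass of step (3) per node k
            for i in range(n):
                for j in range(n):
                    m[i][j] = min(m[i][j], m[i][k] + m[k][j])
        return m

    INF = 10 ** 9
    print(min_cost_paths([[0, 3, INF], [INF, 0, 1], [2, INF, 0]]))

The three nested loops make the n³ time bound of the preceding paragraph immediate.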
It is not immediately clear that Algorithm 0.2 does produce the minimum
cost of any path from node i to j. Thus we should prove that Algorithm 0.2
does what it claims.
THEOREM 0.5
Statement (0.5.1). After step (3) is executed with k = l, m_ij has the smallest
value expressible as a sum of the form c_{v1,v2} + ... + c_{v(m-1),vm}, where v1 = i,
vm = j, and none of v2, ..., v(m-1) is greater than l.
We shall call this minimum value the correct value of m_ij with k = l.
This value is the cost of a cheapest path from node i to node j which does
not pass through a node whose number is higher than l.
Inductive Step. Assume that statement (0.5.1) is true for all l < l0. Let us
consider the value of m_ij after step (3) has been executed with k = l0.
Suppose that the minimum sum c_{v1,v2} + ... + c_{v(m-1),vm} for m_ij with k = l0
is such that no v_p, 2 ≤ p ≤ m - 1, is equal to l0. From the inductive hypothesis,
c_{v1,v2} + ... + c_{v(m-1),vm} is the correct value of m_ij with k = l0 - 1, so
c_{v1,v2} + ... + c_{v(m-1),vm} is also the correct value of m_ij with k = l0.
Now suppose that the minimum sum s_ij = c_{v1,v2} + ... + c_{v(m-1),vm} for m_ij
with k = l0 is such that v_p = l0 for some 2 ≤ p ≤ m - 1. That is, s_ij is the
cost of the path v1, v2, ..., vm. We can assume that there is no node v_q on
this path, q ≠ p, such that v_q is l0. Otherwise the path v1, v2, ..., vm contains
a cycle, and we can delete at least one term from the sum c_{v1,v2} + ... + c_{v(m-1),vm}
without increasing the value of the sum s_ij. Thus we can always find a sum
for s_ij in which v_p = l0 for only one value of p, 2 ≤ p ≤ m - 1.
Let us assume that 2 < p < m - 1. The cases p = 2 and p = m - 1
are left to the reader. Let us consider the sums s_{i,vp} = c_{v1,v2} + ... + c_{v(p-1),vp}
and s_{vp,j} = c_{vp,v(p+1)} + ... + c_{v(m-1),vm} (the costs of the paths from node i to node
vp and from node vp to node j in the sum s_ij). From the inductive hypothesis
we can assume that s_{i,vp} is the correct value for m_{i,vp} with k = l0 - 1 and
that s_{vp,j} is the correct value for m_{vp,j} with k = l0 - 1. Thus when step (3)
is executed with k = l0, m_ij is correctly given the value m_{i,vp} + m_{vp,j}.
We have thus shown that statement (0.5.1) is true for all l. When l = n,
statement (0.5.1) states that at the end of Algorithm 0.2, m_ij has the lowest
possible value. □
ALGORITHM 0.3
Finding the set of nodes accessible from a given node of a directed graph.
Input. Graph (A, R), with A a finite set and a in A.
Output. The set of nodes b in A such that a R* b.
Method. We form a list L and update it repeatedly. We shall also mark
members of A during the course of the algorithm. Initially, all members of
A are unmarked. The nodes marked will be those accessible from a.
(1) Set L = a and mark a.
(2) If L is empty, halt. Otherwise, let b be the first element on list L.
Delete b from L.
(3) For all c in A such that b R c and c is unmarked, add c to the bottom
of list L, mark c, and go to step (2). □
We leave a proof that Algorithm 0.3 works correctly to the Exercises.
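A Python sketch of Algorithm 0.3, using a Python list for the list L and a set for the marks (representing the relation R as a set of pairs is our choice):

    def accessible(A, R, a):
        """Algorithm 0.3: the set of nodes b in A such that a R* b."""
        L = [a]                    # step (1): the list starts with a, which is marked
        marked = {a}
        while L:                   # step (2): halt when L is empty
            b = L.pop(0)           # otherwise take the first element of L and delete it
            for c in A:            # step (3): add every unmarked successor of b
                if (b, c) in R and c not in marked:
                    marked.add(c)
                    L.append(c)
        return marked

    # The graph of Example 0.21, starting from node 2:
    R = {(1, 1), (1, 2), (2, 3), (2, 4), (3, 4), (4, 1), (4, 3)}
    print(accessible({1, 2, 3, 4}, R, 2))   # {1, 2, 3, 4}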
EXERCISES
0.5.1. What is the maximum number of edges a dag with n nodes can have ?
0.5.2. Prove Theorem 0.3.
0.5.3. Give the pre- and postorders for the tree of Fig. 0.14. Give left- and
right-bracketed representations for the tree.
*0.5.4. (a) Design an algorithm that will map a left-bracketed representation
of a tree into a right-bracketed representation.
(b) Design an algorithm that will map a right-bracketed representation
of a tree into a left-bracketed representation.
0.5.5. How many linear orders embed the partial order of the dag of Fig. 0.8 ?
0.5.6. Complete the proof of Theorem 0.5.
0.5.7. Give upper bounds on the time and space necessary to implement
Algorithm 0.1. Assume that one memory cell is needed to store any
node name or integer, and that one elementary step is needed for each
of a reasonable set of primitive operations, including the arithmetic
operations and examination or alteration of a cell in an array indexed
by a known integer.
0.5.8. Let A = {a, b, c, d} and R = {(a, b), (b, c), (a, c), (b, d)}. Find a linear
order R' such that R ⊆ R'. How many such linear orders are there?
DEFINITION
An undirected graph G is a triple (A, E, f) where A is a set of nodes,
E is a set of edge names, and f is a mapping from E to the set of unordered
pairs of nodes. If f(e) = {a, b}, then we mean that edge e connects
nodes a and b. A path in an undirected graph is a sequence of
nodes a0, a1, a2, ..., an such that there is an edge connecting ai-1 and
ai for 1 ≤ i ≤ n. An undirected graph is connected if there is a path
between every pair of distinct nodes.
DEFINITION
An undirected tree can be defined recursively as follows. An
undirected tree is a set of one or more nodes with one distinguished
node r called the root of the tree. The remaining nodes can be parti-
tioned into zero or more sets T1, ..., Tk, each of which forms a tree.
The trees T1, ..., Tk are called the subtrees of the root, and an
undirected edge connects r with all of and only the subtree roots.
A spanning tree for a connected undirected graph G is a tree which
contains all nodes of G.
0.5.9. Provide an algorithm to construct a spanning tree for a connected
undirected graph.
0.5.10. Let (A, R) be an unordered graph such that A = {1, 2, 3, 4} and
R = {(1, 2), (2, 3), (4, 1), (4, 3)}. Find R⁺, the transitive closure of R.
Let the adjacency matrix for R be M. Compute M⁺ and show that
M⁺ is the adjacency matrix for (A, R⁺).
0.5.11. Show that Algorithm 0.2 takes time proportional to n³ in basic steps
similar to those mentioned in Exercise 0.5.7.
0.5.12. Prove that Algorithm 0.3 marks node b if and only if a R⁺ b.
0.5.13. Show that Algorithm 0.3 takes time proportional to the maximum of
#A and #R.
0.5.14. The following are three unordered directed graphs. Which two are the
same?
G1 = ({a, b, c}, {(a, b), (b, c), (c, a)})
G2 = ({a, b, c}, {(b, a), (a, c), (b, c)})
G3 = ({a, b, c}, {(c, b), (c, a), (b, a)})
0.5.15. The following are three ordered directed graphs with only nodes labeled.
Which two are the same?
G1 = ({a, b, c}, {((a, b), (a, c)), ((b, a), (b, c)), ((c, b))})
G2 = ({a, b, c}, {((a, c)), ((b, c), (b, a)), ((c, b), (c, a))})
G3 = ({a, b, c}, {((a, c), (a, b)), ((b, c)), ((c, a), (c, b))})
Programming Exercises
0.5.21. Write a program that will construct an adjacency matrix from a linked
list representation of a graph.
0.5.22. Write a program that will construct a linked list representation of a
graph from an adjacency matrix.
0.5.23. Write programs to implement Algorithms 0.1, 0.2, and 0.3.
BIBLIOGRAPHIC NOTES
A=B,C
L GOTO L
The English sentence "The pig is in the pen" has a grammatical structure which is indicated by the tree of Fig. 1.1, whose
nodes are labeled by syntactic categories and whose leaves are labeled by
the terminal symbols, which, in this case, are English words.
Likewise, a program written in a programming language can be broken
Fig. 1.1 Tree structure for English sentence. (The interior nodes are labeled with the syntactic categories <sentence>, <noun phrase>, <verb phrase>, <adjective>, <noun>, <verb>, <preposition>, and <phrase>; the leaves are the words the, pig, is, in, the, pen.)
a + b * c
may have a syntactic structure given by the tree of Fig. 1.2.† The term parsing
or syntactic analysis is given to the process of finding the syntactic structure of a program.
Fig. 1.2 Tree from arithmetic expression.
†The use of three syntactic categories, <expression>, <term>, and <factor>, rather than
just <expression>, is forced on us by our desire that the structure of an arithmetic expression
be unique. The reader should bear this in mind, lest our subsequent examples of the
syntactic analysis of arithmetic expressions appear unnecessarily complicated.
BIBLIOGRAPHIC NOTES
Several other algebraic languages were also developed at that time, but FORTRAN
emerged as the most widely used language. Since that time hundreds of high-level
programming languages have been developed. Sammet [1969] gives an account
of many of the languages in existence in the mid-1960's.
Much of the theory of programming languages and compilers has lagged
behind the practical development. A great stimulus to the theory of formal lan-
guages was the use of what is now known as Backus Naur Form (BNF) in the
syntactic definition of ALGOL 60 (Naur [1963]). This report, together with the
early work of Chomsky [1959a, 1963], stimulated the vigorous development of
the theory of formal languages during the 1960's. Much of this book presents
results from language theory which have relevance to the design and understand-
ing of language translators.
Most of the early work on language theory was concerned with the syntactic
definition of languages. The semantic definition of languages, a much more difficult
question, received less attention and even at the time of the writing of this book
was not a fully resolved matter. Two good anthologies on the formal specification
of semantics are Steel [1966] and Engeler [1971]. The IBM Vienna laboratory
definition of PL/I [Lucas and Walk, 1969] is one example of a totally formal
approach to the specification of a major programming language.
One of the more interesting developments in programming languages has been
the creation of extensible languages--languages whose syntax and semantics can
be changed within a program. One of the earliest and most commonly proposed
schemes for language extension is the macro definition. See McIlroy [1960], Leaven-
worth [1966], and Cheatham [1966], for example. Galler and Perlis [1967] have
suggested an extension scheme whereby new data types and new operators can be
introduced into ALGOL. Later developments in extensible languages are con-
tained in Christensen and Shaw [1969] and Wegbreit [1970]. ALGOL 68 is an
example of a major programming language with language extension facilities [Van
Wijngaarden, 1969].
1.2. AN OVERVIEW OF COMPILING
nature of a problem is essential and will make the techniques for solution
of that problem applicable to other basically similar problems.
A source program in a programming language is nothing more than
a string of characters. A compiler ultimately converts this string of characters
into a string of bits, the object code. In this process, subprocesses with
the following names can often be identified:
(1) Lexical analysis.
(2) Bookkeeping, or symbol table operations.
(3) Parsing or syntax analysis.
(4) Code generation or translation to intermediate code (e.g. assembly
language).
(5) Code optimization.
(6) Object code generation (e.g. assembly).
In any given compiler, the order of the processes may be slightly different
from that shown, and several of the processes may be combined into a single
phase. Moreover, a compiler should not be shattered by any input it receives;
it must be capable of responding to any input string. For those input strings
which do not represent syntactically valid programs, appropriate diagnostic
messages must be given.
We shall describe the first five phases of compilation briefly. These phases
do not necessarily occur separately in an actual compiler. However, it is
often convenient to conceptually partition a compiler into these phases in
order to isolate the problems that are unique to that part of the compilation
process.
The lexical analysis phase comes first. The input to the compiler and hence
the lexical analyzer is a string of symbols from an alphabet of characters.
In the reference version of PL/I for example, the terminal symbol alphabet
contains the 60 symbols
A B C ... Z   $ @ #
0 1 2 ... 9
blank
= + - * / ( ) , . ; : ' & | ¬ > < ? % _
In a program, certain combinations of symbols are often treated as a
single entity. Some typical examples of this would include the following:
(1) In languages such as PL/I a string of one or more blanks is normally
treated as a single blank.
(2) Certain languages have keywords such as BEGIN, END, GOTO,
DO, INTEGER, and so forth which are treated as single entities.
Example 1.1
Consider the statement

COST = (PRICE + TAX) * 0.98

The lexical analysis phase would find COST, PRICE, and TAX to be tokens
of type identifier and 0.98 to be a token of type constant. The characters
=, (, +, ), and * are tokens by themselves. Let us assume that all constants
and identifiers are to be mapped into tokens of the type <id>. We assume
that the data component of a token is a pointer to an entry in a table contain-
ing the actual name of the identifier together with other data we have col-
lected about that particular identifier. The first component of a token is used
by the syntactic analyzer for parsing. The second component is used by the
code generation phase to produce appropriate machine code.
Thus the output of the lexical analyzer operating directly on our input
string would be the following sequence of tokens:

<id>1 = ( <id>2 + <id>3 ) * <id>4
Lexical analysis is easy if tokens of more than one character are isolated
by characters which are tokens themselves. In the example above, =, (, +,
), and * cannot appear as part of an identifier, so COST, PRICE, and TAX
can be readily distinguished as tokens.
However, lexical analysis may not be so easy in general. For example,
consider the following valid FORTRAN statements:
(1) DO 10 I = 1.15
(2) DO 10 I = 1,15
Since FORTRAN ignores blanks, statement (1) is an assignment to the
variable DO10I, while statement (2) begins a DO loop; the lexical analyzer
cannot tell the two apart until it reaches the period or the comma. Similarly,
given the PL/I statement
DECLARE(X1, X2, ..., Xn)
the lexical analyzer would have no way of telling whether DECLARE was
a function identifier and X1, X2, ..., Xn were its arguments or whether
DECLARE was a keyword causing the identifiers X1, X2, ..., Xn to have
the attribute (or attributes) immediately following the right parenthesis.
Here the distinction would have to be made on what follows the right paren-
thesis. But since n can be arbitrarily large, the PL/I lexical analyzer might
have to look ahead an arbitrary distance. However, there is another approach
to lexical analysis that is less convenient but it avoids the problem of arbitrary
lookahead.
We shall define two extreme approaches to lexical analysis. Most tech-
niques in use fall into one or the other of these categories and some are
a combination of the two:
(1) A lexical analyzer is said to operate directly if, given a string of input
text and a pointer into that text, the analyzer will determine the token imme-
diately to the right of the place pointed to and move the pointer to the right
of the portion of text forming the token.
(2) A lexical analyzer is said to operate indirectly if, given a string of
text, a pointer into that text, and a token type, it will determine whether the
text immediately to the right of the place pointed to forms a token of that
type and, if so, move the pointer to the right of the portion of text forming
the token.
Example 1.2
Consider the FORTRAN text
DO 10 I = 1,15
with the pointer currently at the left end. An indirect lexical analyzer would
respond "yes" if asked for a token of type DO or a token of type (identifier).
In the former case, the pointer would be moved two symbols to the right,
and in the latter case, five symbols to the right.
A direct lexical analyzer would examine the text up to the comma and
conclude that the next token was of type DO. The pointer would then move
two symbols to the right, although many more symbols would be scanned in
the process. □
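A minimal sketch (mine, not the book's) of how the two modes of operation might look in code, using the distinction illustrated in Example 1.2; the token classes and helper names are assumptions for illustration only.

    def direct_next_token(text, pos):
        """Direct operation: return (token_type, new_pos) for the token
        immediately to the right of pos, scanning ahead as far as needed."""
        rest = text[pos:]
        if rest.startswith('DO') and ',' in rest:   # a comma ahead marks a DO loop
            return 'DO', pos + 2
        i = pos
        while i < len(text) and text[i].isalnum():
            i += 1
        return 'identifier', i

    def indirect_has_token(text, pos, token_type):
        """Indirect operation: answer whether a token of the given type
        starts at pos, and if so where the pointer would move."""
        if token_type == 'DO':
            ok = text[pos:pos + 2] == 'DO'
            return ok, pos + 2 if ok else pos
        i = pos
        while i < len(text) and text[i].isalnum():
            i += 1
        return i > pos, i

    text = 'DO 10 I = 1,15'.replace(' ', '')   # blanks ignored, as in FORTRAN
    print(direct_next_token(text, 0))          # ('DO', 2)
    print(indirect_has_token(text, 0, 'identifier'))   # (True, 5) -- "DO10I"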
1.2.3. Bookkeeping

During compilation a table is created in which information about identifiers
is stored. Such a table is often called a symbol table.
The table will list all identifiers together with the relevant information con-
cerning each identifier.
Suppose that we encounter the statement

DIMENSION A(10,20)
1.2.4. Parsing

An expression such as A + B * C conveys that B and C are first to be multi-
plied and that then the result is added to A. No other ordering of the opera-
tions will produce the desired calculation.
Parsing is one of the best-understood phases of compilation. From a set
of syntactic rules it is possible to automatically construct parsers which will
make sure that a source program obeys the syntactic structure defined by
these syntactic rules. In Chapters 4-7 we shall study several different parsing
techniques and algorithms for generating a parser from a given grammar.
The output from the parser is a tree which represents the syntactic struc-
ture inherent in the source program. In many ways this tree structure is closely
related to the parsing diagrams we used to make for English sentences in
elementary school.
Example 1.3
Suppose that the output of the lexical analyzer is the string of tokens

(1.2.1)    <id>1 = ( <id>2 + <id>3 ) * <id>4

This string conveys the information that the following three operations are
to be performed in exactly the following way:
(1) <id>3 is to be added to <id>2,
(2) The result of (1) is to be multiplied by <id>4, and
(3) The result of (2) is to be stored in the location reserved for <id>1.
This sequence of steps can be pictorially represented in terms of a labeled
tree, as shown in Fig. 1.3. That is, the interior nodes of the tree represent
actions which must be taken. The direct descendants of each node either
represent values to which the action is to be applied (if the node is labeled by
an identifier or is an interior node) or help to determine what the action
should be. (In particular, the =, +, and * signs do this.) Note that the paren-
theses in (1.2.1) do not explicitly appear in the tree, although we might want
to show them as direct descendants of n1. The role of the parentheses is only
to influence the order of computation. If they did not appear in (1.2.1), then
the usual convention that multiplication "takes precedence" over addition
would apply, and the first step would be to multiply <id>3 and <id>4. □
The tree built by the parser is used to generate a translation of the input
program. This translation could be a program in machine language, but more
often it is in an intermediate language such as assembly language or "three
address code." (The latter is a sequence of simple statements; each involves
no more than three identifiers, e.g. A = B, A = B + C, or GOTO A.)
If the compiler is to do extensive code optimization, then code of the
three address type is preferred. Since three address code does not pin com-
putations to specific computer registers, it is easier to use registers to advan-
tage when optimizing. If little or no optimization is to be done, then assembly
or even machine code is preferred as an intermediate language. We shall give
a running example of a translation into an assembly type language to illus-
trate the salient points of the translation process.
For this discussion let us assume that we have a computer with one work-
ing register (the accumulator) and assembly language instructions of the form

    Instruction      Effect
    LOAD m           c(m) → accumulator
    ADD m            c(accumulator) + c(m) → accumulator
    MPY m            c(accumulator) * c(m) → accumulator
    STORE m          c(accumulator) → m
    LOAD =m          m → accumulator
    ADD =m           c(accumulator) + m → accumulator
    MPY =m           c(accumulator) * m → accumulator

Here the notation c(m) → accumulator, for example, means the contents of
memory location m are to be placed in the accumulator. The expression
= m denotes the numerical value m. With these comments, the effects of
the seven instructions should be obvious.
The output of the parser is a tree (or some representation of one) which
represents the syntactic structure inherent in the string of tokens coming out
of the lexical analyzer. From this tree, and the information stored in the
symbol table, it is possible to construct the object code. In practice, tree con-
struction and code generation are often carried out simultaneously, but
conceptually it is easier to think of these two processes as occurring serially.
There are several methods for specifying how the intermediate code is to be
constructed from the syntax tree. One method which is particularly elegant
and effective is the syntax-directed translation. Here we associate with each
node n a string C(n) of intermediate code. The code for node n is constructed
by concatenating the code strings associated with the descendants of n and
other fixed strings in a fixed order. Thus translation proceeds from the bot-
tom up (i.e., from the leaves toward the root). The fixed strings and fixed
order are determined by the algorithm used. More will be said about this
in Chapters 3 and 9.
An important problem which arises is how to select the code C(n) for
each node n such that C(n) at the root is the desired code for the entire
statement. In general, some interpretation must be placed on C(n) such that
the interpretation can be uniformly applied to all situations in which node
n can appear.
For arithmetic assignment statements, the desired interpretation is fairly
natural and will be explained in the following paragraphs. In general, the
interpretation must be specified by the compiler designer if the method of
syntax-directed translation is to be used. This task may be easy or hard, and
in difficult cases, the detailed structure of the tree may have to be adjusted
to aid in the translation process.
For a specific example, we shall describe a syntax-directed translation of
simple arithmetic expressions. We notice that in Fig. 1.3, there are three
types of interior nodes, depending on whether their middle descendant is
labeled =, +, or *. These three types of nodes are shown in Fig. 1.4, where
n1, n2, and n3 denote, from left to right, the direct descendants of n.

Fig. 1.4 Types of interior nodes.
(1) If n is a node of type (a), then C(n) will be code which computes the
value of the expression on the right and stores it in the location reserved for
the identifier labeling the left descendant.
(2) If n is a node of type (b) or (c), then C(n) is code which, when preceded
by the operation code LOAD, brings to the accumulator the value of the
subtree dominated by n.
Thus, in Fig. 1.3, when preceded by LOAD, C(n1) brings to the accumu-
lator the value of <id>2 + <id>3, and C(n2) brings to the accumulator the
value of (<id>2 + <id>3) * <id>4. C(n3) is code which brings the latter value
to the accumulator and stores it in the location of <id>1.
We must consider how to build C(n) from the code for n's descendants.
In what follows, we assume that assembly language statements are to be
generated in one string, with a semicolon or a new line separating the state-
ments. Also, we assume that assigned to each node n of the tree is a level
number l(n), which denotes the maximum length of a path from that node to
a leaf. Thus, l(n) = 0 if n is a leaf, and if n has descendants n1, ..., nk, then
l(n) = max(l(n1), ..., l(nk)) + 1. We can compute l(n) bottom up, at the same time
that C(n) is computed. The purpose of recording levels is to control the use
of temporary stores. We must never store two needed quantities in the same
temporary location simultaneously. Figure 1.5 shows the level numbers of
each node in the tree of Fig. 1.3.
Fig. 1.5 Level numbers.

We shall now define a syntax-directed code generation algorithm to
compute C(n) for all nodes n of a tree consisting of leaves, a root of type
(a), and interior nodes of either type (b) or type (c).
ALGORITHM 1.1
Syntax-directed translation of simple assignment statements.
Input. A labeled ordered tree representing an assignment statement
involving arithmetic operations -+- and • only. We assume that the level of
each node has been computed.
Output. Assembly language code to perform the assignment.
Method. Do steps (1) and (2) for all nodes of level 0. Then do steps (3),
(4), (5) on all nodes of level 1, then level 2, and so forth, until all nodes have
been acted upon.
(1) Suppose that n is a leaf with label <id>j.
(i) Suppose that entry j in the identifier table is a variable. Then
C(n) is the name of that variable.
(ii) Suppose that entry j in the identifier table is a constant k. Then
C(n) is '=k'.†
(2) If n is a leaf with label =, *, or +, then C(n) is the empty string.
(In this algorithm, we do not need or wish to produce an output for leaves
labeled =, *, or +.)
(3) If n is a node of type (a) and its descendants are n1, n2, and n3, then
C(n) is 'LOAD ' C(n3) '; STORE ' C(n1).
(4) If n is a node of type (b) and its descendants are n1, n2, and n3, then
C(n) is C(n3) '; STORE $' l(n) '; LOAD ' C(n1) '; ADD $' l(n).
This sequence of instructions uses a temporary location whose name is
the character $ followed by the level number of node n. It is straightforward
to see that when this sequence is preceded by LOAD, the value finally resid-
ing in the accumulator will be the sum of the values of the expressions domi-
nated by nl and n3.
We make two comments on the choice of temporary names. First, these
names are chosen to start with $ so that they cannot be confused with the
identifier names in FORTRAN. Second, because of the way l(n) is chosen,
we can claim that C(n) contains no reference to a temporary $i if i is greater
than l(n). Thus, in particular, C(n1) contains no reference to '$' l(n). We can
thus guarantee that the value stored into '$' l(n) will still be there when it is
added to the accumulator.
(5) If all is as in (4) but node n is of type (c), then C(n) is
C(n3) '; STORE $' l(n) '; LOAD ' C(n1) '; MPY $' l(n).
This code has the desired effect, with the desired result appearing in the
accumulator. □
1"For emphasis, we surround with quotes those strings which represent themselves,
rather than naming a string.
We leave a proof of the correctness for Algorithm 1.1 for the Exercises.
It proceeds recursively on the height (i.e., level) of a node.
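The following is one way Algorithm 1.1 might be rendered in Python; it is a sketch of mine, not the book's program, and the tree representation (nested tuples) and helper names are my own choices. Level numbers are computed as defined above, and the code strings are built bottom up exactly as in steps (1)-(5).

    def level(n):
        """l(n): 0 for a leaf, otherwise 1 + the maximum over the descendants."""
        if n[0] in ('var', 'const', 'op'):
            return 0
        return 1 + max(level(c) for c in n)

    def code(n):
        """C(n) as in Algorithm 1.1, steps (1)-(5)."""
        kind = n[0]
        if kind == 'var':                      # step (1i): a variable names itself
            return n[1]
        if kind == 'const':                    # step (1ii): a constant k becomes '=k'
            return '=' + str(n[1])
        if kind == 'op':                       # step (2): =, + and * leaves yield nothing
            return ''
        n1, n2, n3 = n                         # interior node: descendants left to right
        l = level(n)
        if n2[1] == '=':                       # step (3), type (a)
            return 'LOAD ' + code(n3) + '; STORE ' + code(n1)
        op = 'ADD' if n2[1] == '+' else 'MPY'  # steps (4) and (5), types (b) and (c)
        return (code(n3) + '; STORE $' + str(l) + '; LOAD ' + code(n1)
                + '; ' + op + ' $' + str(l))

    # The tree of Fig. 1.3 for  COST = (PRICE + TAX) * 0.98
    n1 = (('var', 'PRICE'), ('op', '+'), ('var', 'TAX'))
    n2 = (n1, ('op', '*'), ('const', 0.98))
    n3 = (('var', 'COST'), ('op', '='), n2)
    print(code(n3).replace('; ', '\n'))

Run on the tree of Fig. 1.3, the sketch prints the eight-instruction sequence derived in Example 1.4 below.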
Example 1.4
Let us apply Algorithm 1.1 to the tree of Fig. 1.3. The tree given in Fig.
1.6 has the code associated with each node explicitly shown on the tree.
The nodes labeled <id>1 through <id>4 are given the associated code COST,
PRICE, TAX, and =0.98, respectively, by rule (1). Applying rule (4) at node
n1, whose level is 1, we obtain

C(n1) = 'TAX; STORE $1; LOAD PRICE; ADD $1'
Thus, when preceded by LOAD, C(n1) produces the sum of PRICE and
TAX in the accumulator, although it does it in an awkward way. The code
optimization process can "iron out" some of this awkwardness, or the rules
by which the object code is constructed can be elaborated to take care of
some special cases.
Next we can evaluate C(n2) using rule (5), and get
C(n2) = '=0.98; STORE $2; LOAD ' C(n1) '; MPY $2'
Here, C(n1) is the string mentioned in the previous paragraph, and $2
is used as temporary, since l(n2) = 2.
We evaluate C(n3) using rule (3) and get
C(n3) = 'LOAD ' C(n2) '; STORE COST'
The list of assembly language instructions (with semicolons replaced by
new lines) which form the translation of our original "COST . . ." state-
ment is

(1.2.2)    LOAD =0.98
           STORE $2
           LOAD TAX
           STORE $1
           LOAD PRICE
           ADD $1
           MPY $2
           STORE COST
Example 1.5
These four transformations have been selected for their applicability to
(1.2.2). In general there would be a large set of transformations, and they
would be tried in various combinations. In (1.2.2), we notice that rule (1)
applies to LOAD PRICE; ADD $1, and we can, on speculation, tempo-
rarily replace these instructions by LOAD $1; ADD PRICE, obtaining
the code
We can now apply rule (4) to the sequence LOAD =0.98; STORE $2.
These two instructions are deleted and $2 in the instruction MPY $2 is
replaced by MPY =0.98. The final code is

(1.2.5)    LOAD TAX
           ADD PRICE
           MPY =0.98
           STORE COST
The code of (1.2.5) is the shortest that can be obtained using our four
transformations and is the shortest under any set of reasonable transfor-
mations. □
†A similar simplification could be obtained using rule (4) directly. However, we are
trying to give some examples of how different types of transformations can be used.
In general, when the compiler comes to a point in the input stream where
it cannot continue producing a valid parse, some compilers attempt to make
a "minimal" change in the input in order for the parse to proceed. Some
possible changes are
(1) Alteration of a single character. For example, if the parser is given
"identifier" INTEJER by the lexical analyzer and it is not proper for an
identifier to appear at this point in the program, the parser may guess that
the keyword INTEGER was meant.
(2) Insertion of a single token. For example, the parser can replace 2C
by 2 * C. (2 + C would do as well, but in this case, we "know" that 2 * C
is more likely.)
(3) Deletion of a single token. For example, a comma is often incorrectly
inserted after the 10 in a FORTRAN statement such as DO 10 I = 1, 20.
(4) Simple permutation of tokens. For example, INTEGER I might be
written incorrectly as I INTEGER.
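A minimal sketch of how a parser driver might generate single-change repair candidates of the kinds just listed; the token representation and the way candidates would be retried against the parser are assumptions of mine, not a construction from the text. Change (1), alteration of a single character, acts on the spelling of a token and is omitted here.

    def repair_candidates(tokens, i, insertable=('*', '+', ',', ';')):
        """Candidate token strings obtained from `tokens` by one small change
        near position i, where the parse first failed."""
        out = []
        # (2) insertion of a single token
        for t in insertable:
            out.append(tokens[:i] + [t] + tokens[i:])
        # (3) deletion of a single token
        out.append(tokens[:i] + tokens[i + 1:])
        # (4) simple permutation (transposition) of adjacent tokens
        if i + 1 < len(tokens):
            out.append(tokens[:i] + [tokens[i + 1], tokens[i]] + tokens[i + 2:])
        return out

    # e.g. 2C, with the failure at 'C', suggests among others 2 * C:
    print(repair_candidates(['2', 'C'], 1))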
In many programming languages, statements are easily identified. If it
becomes hopeless to parse a particular (ill-formed) statement, even after
applying changes such as those above, it is often possible to ignore the state-
ment completely and continue parsing as though this ill-formed, statement
did not appear.
In general, however, there is very little of a mathematical nature known
about error recovery algorithms and algorithms to generate "good" diag-
nostics. In Chapters 4 and 5, we shall discuss certain parsing algorithms, LL,
LR, and Earley's algorithm, which have the property that as soon as the
input stream is such that there is no possible following sequence which
could make a well-formed input, the algorithms announce this fact. This
property is useful in error recovery and analysis, but some parsing algorithms
discussed do not possess it.
1.2.8. Summary
1.2.8. Summary

Fig. 1.7 (block diagram of a compiler: a source program passes through the phases of compilation to an object program, with bookkeeping and error analysis attached to the phases).
generated from this structure, are detected during code generation. An exam-
ple of this situation would be a variable used without declaration. The parser
ignores the data component of tokens and so could not detect this error.
The symbol tables (bookkeeping) are produced in the lexical analysis
process and in some situations also during syntactic analysis when, say,
attributes and the identifiers to which they refer are connected in the tree
structure being formed. These tables are used in the code generation phase
and possibly in the assembly phase of compilation.
A final phase, which we refer to as assembly, is shown in Fig. 1.7. In this
phase the intermediate code is processed to produce the final machine lan-
guage representation of the object program. Some compilers may produce
machine language code directly as the result of code generation, so that the
assembly phase may not be explicitly present.
The model of a compiler we have portrayed in Fig. 1.7 is a first-order
approximation to a real compiler. For example, some compilers are designed
to operate using a very small amount of storage and as a consequence may
EXERCISES
Research Problem
There are many research areas and open problems concerned with compiling
and translation of algorithms. These will be mentioned in more appropriate chap-
ters. However, we mention one here, because this area will not be treated in any
detail in the book.
1.2.14. Develop techniques for proving compilers correct. Some work has been
done in this area and in the more general area of proving programs
and/or algorithms correct. (See the following Bibliographic Notes.)
However, it is clear that more work in the area is needed.
An entirely different approach to the problem of producing reliable
compilers is to develop theory applicable to their empirical testing.
That is, we assume we "know" our compiling algorithms to be correct.
We want to test whether a particular program implements them correctly.
In the first approach, above, one would attempt to prove the equivalence
of the written program and abstract compiling algorithm. The second
approach suggested is to devise a finite set of inputs to the compiler
such that if these are compiled correctly, one can say with reasonable
certainty (say a 99% confidence level) that the compiler program has
no bugs. Apparently, one would have to make some assumption about
the frequency and nature of programming errors in the compiler pro-
gram itself.
BIBLIOGRAPHIC NOTES
1.3. OTHER APPLICATIONS OF PARSING AND
TRANSLATING ALGORITHMS
In this section we shall mention two areas, other than compiling, in which
hierarchical structures such as those found in parsing and translating
algorithms can play a major role. These are the areas of natural language
translation and pattern recognition.
Example 1.6
Our example concerns graphs called "d-charts,"t which can be thought
of as the flow charts for a programming language whose programs are
defined by the following rules:
(1) A simple assignment statement is a program.
(2) If S1 and S2 are programs, then so is S1; S2.
(3) If S1 and S2 are programs and A is a predicate, then
if A then S1 else S2
is a program.
(4) If S1 is a program and A is a predicate, then
while A do S1 end
is a program.
We can write flow charts for all such programs, where the nodes (blocks)
of the flow chart represent code either to test a predicate or perform a simple
assignment statement. All the d-charts can be constructed by beginning with
a single node, representing a program, and repeatedly replacing nodes repre-
senting programs by one of the three structures shown in Fig. 1.8. These
replacement rules correspond to rules (2), (3), and (4) above, respectively.
The rules for connecting these structures to the rest of the graph are
the following. Suppose that node no is replaced by the structure of Fig.
1.8(a), (b), or (c).
(1) Edges entering no now enter nx, n3, or n6, respectively.
(2) An edge from no to node n is replaced by an edge from n2 to n in
Fig. 1.8(a), by edges from both n4 and n5 to n in Fig. 1.8(b), and by an edge
from n6 to n in Fig. 1.8(c).
Nodes n3 and n6 represent predicate tests and may not be further replaced.
The other nodes represent programs and may be further replaced.
Fig. 1.8 The three structures that may replace a program node: (a) nodes n1 and n2 in sequence; (b) predicate node n3 with branches to n4 and n5; (c) predicate node n6 with loop node n7.

Fig. 1.11 Tree describing d-chart structure.
can be defined using web grammars (suitably generalized from Example 1.8),
for example, the class of planar graphs or the class of binary trees.
BIBLIOGRAPHIC NOTES
In this section, we shall discuss from a general point of view the two
principal methods of defining languages--the generator and the recognizer.
We shall discuss only the most common kind of generator, the Chomsky
grammar. We treat recognizers in somewhat greater generality, and in sub-
sequent sections we shall introduce some of the great variety of recognizers
that have been studied.
2.1.1. Motivation
2.1.2. Grammars
Example 2.1
An example of a grammar is G1 = ({A, S}, {0, 1}, P, S), where P consists
of
S → 0A1
0A → 00A1
A → e
The nonterminal symbols are A and S and the terminal symbols are 0 and
1. □
Example 2.2
Let us consider grammar G1 of Example 2.1 and the following derivation:
S ⇒ 0A1 ⇒ 00A11 ⇒ 0011. That is, in the first step, S is replaced by 0A1
according to the production S → 0A1. At the second step, 0A is replaced
by 00A1, and at the third, A is replaced by e. We may say that S ⇒+ 0011,
S ⇒* 0011, and that 0011 is in L(G1). It can be shown that

L(G1) = {0^n 1^n | n ≥ 1}
CONVENTION

We abbreviate a set of productions with the same left side,
α → β1
α → β2
...
α → βn
by writing α → β1 | β2 | ⋯ | βn.
Example 2.3
Example 2.4
Let G0 = ({E, T, F}, {a, +, *, (, )}, P, E), where P consists of the produc-
tions
E → E + T | T
T → T * F | F
F → (E) | a
E ⇒ E + T
  ⇒ T + T
  ⇒ F + T
  ⇒ a + T
  ⇒ a + T * F
  ⇒ a + F * F
  ⇒ a + a * F
  ⇒ a + a * a
L(G0) is the set of arithmetic expressions that can be built up using the sym-
bols a, +, *, (, and ). □
The grammar in Example 2.4 will be used repeatedly in the book and is
always referred to as G0.
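Since G0 recurs throughout the book, a small illustration may help: the sketch below is a recursive-descent recognizer for L(G0), written against an equivalent right-iterative reading of the grammar (E → E + T as written is left-recursive). It is my own illustration, not a construction from the text.

    def accepts(w):
        """True iff the string w (e.g. 'a+a*a') is in L(G0)."""
        pos = 0

        def peek():
            return w[pos] if pos < len(w) else None

        def eat(c):
            nonlocal pos
            if peek() == c:
                pos += 1
                return True
            return False

        def factor():                       # F -> ( E ) | a
            if eat('a'):
                return True
            return eat('(') and expr() and eat(')')

        def term():                         # T -> F (* F)*
            if not factor():
                return False
            while eat('*'):
                if not factor():
                    return False
            return True

        def expr():                         # E -> T (+ T)*
            if not term():
                return False
            while eat('+'):
                if not term():
                    return False
            return True

        return expr() and pos == len(w)

    assert accepts('a+a*a') and accepts('(a+a)*a') and not accepts('a+*a')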
Example 2.5
Let G be defined by
S → aSBC | abC
CB → BC
bB → bb
bC → bc
cC → cc
An example of a derivation in G is
S ⇒ aSBC
  ⇒ aabCBC
  ⇒ aabBCC
  ⇒ aabbCC
  ⇒ aabbcC
  ⇒ aabbcc
Now let G be the grammar with the productions
S → CD        Ab → bA
C → aCA       Ba → aB
C → bCB       Bb → bB
AD → aD       C → e
BD → bD       D → e
Aa → aA
An example of a derivation in G is
S ⇒ CD
  ⇒ aCAD
  ⇒ abCBAD
  ⇒ abBAD
  ⇒ abBaD
  ⇒ abaBD
  ⇒ ababD
  ⇒ abab
We shall show that L(G) = {ww | w ∈ {a, b}*}. That is, L(G) consists of
strings of a's and b's of even length such that the first half of each string is
the same as the second half.
Since L(G) is a set, the easiest way to show that L(G) = {ww | w ∈ {a, b}*}
is to show that {ww | w ∈ {a, b}*} ⊆ L(G) and that L(G) ⊆ {ww | w ∈ {a, b}*}.
To show that {ww | w ∈ {a, b}*} ⊆ L(G) we must show that every string
of the form ww can be derived from S. By a simple inductive proof we can
show that the following derivations are possible in G:
(1) S ⇒ CD.
(2) For n ≥ 0,
C ⇒^n c1 c2 ⋯ cn C Xn Xn-1 ⋯ X1
   ⇒ c1 c2 ⋯ cn Xn Xn-1 ⋯ X1
where each ci is in {a, b}, and Xi is A if ci = a and B if ci = b.
G is said to be
(1) Right-linear if each production in P is of the form A → xB or
A → x, where A and B are in N and x is in Σ*.
(2) Context-free if each production in P is of the form A → α, where
A is in N and α is in (N ∪ Σ)*.
(3) Context-sensitive if each production in P is of the form α → β, where
|α| ≤ |β|.
A grammar with no restrictions as above is called unrestricted.
The grammar of Example 2.3 is a right-linear grammar. Another example
of a right-linear grammar is the grammar with the productions
S → 0S | 1S | e
which generates the language {0, 1}*. The context-free grammar with the
productions
S → AS | e
A → 0 | 1
generates the language {0, 1}*, which, as we have seen, can also be generated
by a right-linear grammar.
We should also mention that there are a number of grammatical models
that have been recently introduced outside the Chomsky hierarchy. Some of
the motivation in introducing new grammatical models is to find a generative
device that can better represent all the syntax and/or semantics of pro-
gramming languages. Some of these models are introduced in the Exercises.
2.1.4. Recognizers
Input tape: a0 a1 a2 ⋯ an
Input head
Finite state control
g(Z1Z2 ⋯ Zn, Y1 ⋯ Yk) = Y1 ⋯ Yk Z2 ⋯ Zn.
We say that a recognizer accepts an input string w if, starting from the
initial configuration with w on the input tape, the recognizer can make
a sequence of moves and end in a final configuration.
We should point out that a nondeterministic recognizer may be able to
make many different sequences of moves from an initial configuration. How-
ever, if at least one of these sequences ends in a final configuration, then
the initial input string will be accepted.
The language defined by a recognizer is the set of input strings it accepts.
For each class of grammars in the Chomsky hierarchy there is a natural
class of recognizers that defines the same class of languages. These recognizers
are finite automata, pushdown automata, linear bounded automata, and
Turing machines. Specifically, the following characterizations of the Chomsky
languages exist:
(1) A language L is right-linear if and only if L is defined by a (one-way
deterministic) finite automaton.
(2) A language L is context-free if and only if L is defined by a (one-way
nondeterministic) pushdown automaton.
(3) A language L is context-sensitive if and only if L is defined by a (two-
way nondeterministic) linear bounded automaton.
(4) A language L is recursively enumerable if and only if L is defined by
a Turing machine.
The precise definition of these recognizers will be found in the Exercises
and later sections. Finite automata and pushdown automata are important
in the theory of compiling and will be studied in some detail in this chapter.
EXERCISES
2.1.13. Prove that the grammar G1 of Example 2.1 generates {0^n 1^n | n ≥ 1}.
Hint: Observe that each sentential form has at most one nonterminal.
Thus productions can be applied in only one place in a string.
DEFINITION
In an unrestricted grammar G there are many ways of deriving a
given sentence that are essentially the same, differing only in the order
in which productions are applied. If G is context-free, then we can
represent these essentially similar derivations by means of a derivation
tree. However, if G is context-sensitive or unrestricted, we can define
equivalence classes of derivations in the following manner.
Let G = (N, Σ, P, S) be an unrestricted grammar. Let D be the
set of all derivations of the form S ⇒* w. That is, elements of D
are sequences of the form (α0, α1, ..., αn) such that α0 = S, αn ∈ Σ*,
and αi-1 ⇒ αi for 1 ≤ i ≤ n.
Define a relation R0 on D by (α0, α1, ..., αn) R0 (β0, β1, ..., βn)
if and only if there is some i between 1 and n - 1 such that
(1) αj = βj for all 1 ≤ j ≤ n such that j ≠ i.
(2) We can write αi-1 = γ1γ2γ3γ4γ5 and αi+1 = γ1δγ3εγ5 such that
γ2 → δ and γ4 → ε are in P, and either αi = γ1δγ3γ4γ5 and
βi = γ1γ2γ3εγ5, or conversely.
Let R be the least equivalence relation containing R0. Each equiva-
lence class of R represents the essentially similar derivations of a given
sentence.
**2.1.14. What is the maximum size of an equivalence class of R (as a function
of n and |αn|) if G is
(a) Right-linear.
(b) Context-free.
(c) Such that every production is of the form α → β and |α| < |β|.
*2.1.15. Let G be defined by
S → A0B | B1A
A → BB | 0
B → AA | 1
What is the size of the equivalence class under R which contains the
derivation
1000B ⇒ 10001
DEFINITION
A grammar G is said to be unambiguous if each w in L(G) appears
as the last component of a derivation in one and only one equivalence
class under R, as defined above. For example,
S → abC | aB
B → bc
bC → bc
A → X1ψ1X2ψ2 ⋯ Xnψn,   n ≥ 0
and
Af → X1ψ1X2ψ2 ⋯ Xnψn,   n ≥ 0
S → Tg
T → Tf
T → ABC
Af → aA
Bf → bB
Cf → cC
Ag → a
Bg → b
Cg → c
Then L(G) = {a^n b^n c^n | n ≥ 1}. For example, aabbcc has the derivation
S ⇒ Tg
  ⇒ Tfg
  ⇒ AfgBfgCfg
  ⇒ aAgBfgCfg
  ⇒ aaBfgCfg
  ⇒ aabBgCfg
  ⇒ aabbCfg
  ⇒ aabbcCg
  ⇒ aabbcc □
(a) f(i) = 0, if i is even
         = 1, if i is odd.
(b) f(i) = a, if i is even
         = b, if i is odd.
(c) f(i) = 0, if i is even and the input symbol under the input head is a
         = 1, otherwise.
2.1.31. Which of the following could be memory store functions for the
recognizer in Exercise 2.1.30?
(a) g(i, X) = 0
    g(i, Y) = i + 1.
(b) g(i, X) = 0
    g(i, Y) = i + 1, if the previous store instruction was X
            = i + 2, if the previous store instruction was Y.
DEFINITION
Open Problems
2.1.34. Is the complement of a context-sensitive language always context-
sensitive ?
The recognizer of Exercise 2.1.26 is called a linear bounded auto-
maton (LBA). If we make it deterministic, we have a deterministic LBA
(DLBA).
2.1.35. Is every context-sensitive language recognized by a DLBA ?
2.1.36. Is every indexed language recognized by a DLBA ?
By Exercise 2.1.28, a positive answer to Exercise 2.1.35 implies a
positive answer to Exercise 2.1.36.
BIBLIOGRAPHIC NOTES
machine appeared in McCulloch and Pitts [1943]. The study of recognizers was
stimulated by the work of Moore [1956] and Rabin and Scott [1959].
A significant amount of effort in language theory has been expended in deter-
mining the algebraic properties of classes of languages and in determining decida-
bility results for classes of grammars and recognizers. For each of the four classes
of grammars in the Chomsky hierarchy there is a class of recognizers which defines
precisely those languages generated by that class of grammars. These observations
have led to a study of abstract families of languages and recognizers in which
classes of languages are defined in terms of algebraic properties. Certain algebraic
properties in a class of languages are necessary and sufficient to guarantee the
existence of a class of recognizers for those languages. Work in this area was
pioneered by Ginsburg and Greibach [1969] and Hopcroft and Ullman [1967].
Book [1970] gives a good survey of language theory circa 1970.
Haines [1970] claims that the left context grammars in Exercise 2.1.6 generate
exactly the context-sensitive languages. Exercise 2.1.28 is from Aho [1968].
DEFINITION
Let Σ be a finite alphabet. We define a regular set over Σ recursively in
the following manner:
(1) ∅ (the empty set) is a regular set over Σ.
(2) {e} is a regular set over Σ.
(3) {a} is a regular set over Σ for all a in Σ.
(4) If P and Q are regular sets over Σ, then so are
(a) P ∪ Q.
(b) PQ.
(c) P*.
(5) Nothing else is a regular set.
Thus a subset of Σ* is regular if and only if it is ∅, {e}, or {a}, for some
a in Σ, or can be obtained from these by a finite number of applications of
the operations union, concatenation, and closure.
We shall define a convenient method for denoting regular sets over
a finite alphabet Σ.
DEFINITION
Regular expressions over Σ and the regular sets they denote are defined
recursively, as follows:
(1) ∅ is a regular expression denoting the regular set ∅.
(2) e is a regular expression denoting the regular set {e}.
(3) a in Σ is a regular expression denoting the regular set {a}.
(4) If p and q are regular expressions denoting the regular sets P and Q,
respectively, then
(a) (p + q) is a regular expression denoting P ∪ Q.
(b) (pq) is a regular expression denoting PQ.
(c) (p)* is a regular expression denoting P*.
(5) Nothing else is a regular expression.
We shall use the shorthand notation p+ to denote the regular expression
pp*. Also, we shall remove redundant parentheses from regular expressions
whenever no ambiguity can arise. In this regard, we assume that * has the
highest precedence, then concatenation, and then +. Thus, 0 + 10* means
(0 + (1(0*))).
Example 2.8
Some examples of regular expressions are
(1) 01, denoting {01}.
(2) 0*, denoting {0}*.
(3) (0 + 1)*, denoting {0, 1}*.
(4) (0 + 1)*011, denoting the set of all strings of 0's and 1's ending in 011.
(5) (a + b)(a + b + 0 + 1)*, denoting the set of all strings in {0, 1, a, b}*
beginning with a or b.
(6) (00 + 11)*((01 + 10)(00 + 11)*(01 + 10)(00 + 11)*)*, denoting the
set of all strings of 0's and 1's containing both an even number of 0's and
an even number of 1's. □
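Denotations like (4) and (6) can be checked mechanically. The sketch below (mine, not the book's) translates the two expressions into the syntax of Python's re module, in which the book's + becomes |, and compares them with a brute-force count over all short strings:

    import re
    from itertools import product

    # (4): (0 + 1)*011   and   (6): even number of 0's and even number of 1's
    r4 = re.compile(r'(0|1)*011$')
    r6 = re.compile(r'(00|11)*((01|10)(00|11)*(01|10)(00|11)*)*$')

    for n in range(0, 9):
        for w in map(''.join, product('01', repeat=n)):
            assert bool(r4.match(w)) == w.endswith('011')
            even = (w.count('0') % 2 == 0) and (w.count('1') % 2 == 0)
            assert bool(r6.match(w)) == even
    print('expressions (4) and (6) agree with their descriptions up to length 8')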
It should be quite clear that for each regular set we can find at least one
regular expression denoting that regular set. Also, for each regular expression
we can construct the regular set denoted by that regular expression. Unfor-
tunately, for each regular set there is an infinity of regular expressions denot-
ing that set.
We shall say two regular expressions are equal ( = ) if they denote the same
set.
Some basic algebraic properties of regular expressions are stated in the
following lemma.
LEMMA 2.1
Let α, β, and γ be regular expressions. Then
(1) α + β = β + α        (2) ∅* = e
(2.2.1) X = aX + b
where a and b are regular expressions. We can easily verify by direct substitu-
tion that X = a*b is a solution to Eq. (2.2.1). That is to say, when we sub-
stitute the set represented by a*b in both sides of Eq. (2.2.1), then each side
of the equation represents the same set.
We can also have sets of equations that define languages. For example,
consider the pair of equations
(2.2.2)    X = a1X + a2Y + a3
           Y = b1X + b2Y + b3
One solution to this pair of equations is
X = (a1 + a2b2*b1)*(a3 + a2b2*b3)
Y = (b2 + b1a1*a2)*(b3 + b1a1*a3)
However, we should first mention that not all regular expression equa-
tions have unique solutions. For example, if
(2.2.3) X = αX + β
is a regular expression equation and α denotes a set which contains the empty
string, then X = α*(β + γ) is also a solution to (2.2.3) for all γ. (γ does not
even have to be regular. See Exercise 2.2.7.) Thus Eq. (2.2.3) has an infinity
Example 2.9
X3 = 01*(0X2 + e) + 1X3,
If we now work on (2.2.5), which was not changed by the previous step,
we replace X2 by 1*0X3 in (2.2.7), and obtain
We now reach step 5 of Algorithm 2.1. From Eq. (2.2.8) we obtain the
solution for X3:
Since X3 does not appear in (2.2.4), that equation is not modified. We then
solve (2.2.10), obtaining
The output of Algorithm 2.1 is the set of Eqs. (2.2.9), (2.2.11), and (2.2.13).
□
Since g is a solution, we have g(Xj) = αj0 ∪ αj1g(X1) ∪ ⋯ ∪ αjng(Xn) for
all j. In particular, αj0 ⊆ g(Xj) and αjkg(Xk) ⊆ g(Xj) for all j and k. Thus,
wm is in g(Xjm), wm-1wm is in g(Xjm-1), and so forth. Finally, w = w1w2 ⋯ wm
is in g(Xj1) = g(Xi). But then we have a contradiction, since we supposed
that w was not in g(Xi). Thus we can conclude that f(Xi) ⊆ g(Xi) for all i.
It immediately follows that f is the minimal fixed point of Q. □
LEMMA 2.4
Let Q1 and Q2 be the set of equations before and after a single applica-
tion of step 2 of Algorithm 2.1. Then Q1 and Q2 have the same minimal
fixed point.
Proof. Suppose that in step 2 the equation
Ai = αi0 + αiiAi + αi,i+1Ai+1 + ⋯ + αinAn
is used to eliminate Ai, and that
Aj = αj0 + αjiAi + αj,i+1Ai+1 + ⋯ + αjnAn
is the equation for Aj, j > i, in Q1. In Q2 the equation for Aj becomes
(2.2.15)    Aj = β0 + βi+1Ai+1 + ⋯ + βnAn
where
β0 = αj0 + αjiαii*αi0
βk = αjk + αjiαii*αik    for i < k ≤ n
We can use Lemma 2.3 to express the minimal fixed points of Q1 and
Q2, which we shall denote by f1 and f2, respectively. From the form of Eq.
(2.2.15), every string in f2(Aj) is in f1(Aj). This follows from the fact that
any string w which is in the set denoted by αjiαii*αik can be expressed as
w1w2 ⋯ wm, where w1 is in αji, wm is in αik, and w2, ..., wm-1 are in αii.
Thus, w is the concatenation of a sequence of strings in the sets denoted by
coefficients of Q1, for which the subscripts satisfy the condition of Lemma
2.3. A similar observation holds for strings in αjiαii*αi0. Thus it can be shown
that f2(Aj) ⊆ f1(Aj).
w = W l . . ' W m , for some sequence of nonzero subscripts ll, . . . . lm such
that Wm is in 0ct.0, wp is in ~I~1.... 1 ~ p < m, and 11 = j. We can group the
w / s uniquely, such that we can write w : Yl "'" Y,, where yp : w, . . . w,,
and
(1) If 1, ~ i, then s = t + 1.
(2) If l, > i, then s is chosen such that l , + ~ , . . . , l, are all i and l,+~ =/= i.
It follows that in either case, yp is in the corresponding coefficient in the equation
of Q2 for Aj, and hence w is in f2(Aj). We conclude that f1(Aj) = f2(Aj)
for all j. □
LEMMA 2.5
Let Q1 and Q2 be the sets of equations before and after a single appli-
cation of step 5 in Algorithm 2.1. Then Q1 and Q2 have the same minimal
fixed points.
Proof. Exercise, similar to Lemma 2.4.
THEOREM 2.1
Algorithm 2.1 correctly determines the minimal fixed point of a set of
standard form equations.
Proof. After step 5 has been applied for all j, the equations are all of
the form Ai = αi, where αi is a regular expression over Σ. The minimal
fixed point of such a set is clearly f(Ai) = αi. □
2.2.2. Regular Sets and Right-Linear Grammars
L(G3) = L(G1) ∪ L(G2) because for each derivation S3 ⇒+ w in G3 there is
either a derivation S1 ⇒+ w in G1 or S2 ⇒+ w in G2, and conversely. Since G3
is a right-linear grammar, L(G3) is a right-linear language.
(ii) Let G4 be the right-linear grammar (N1 ∪ N2, Σ, P4, S1) in which
P4 is defined as follows:
(1) If A → xB is in P1, then A → xB is in P4.
(2) If A → x is in P1, then A → xS2 is in P4.
(3) All productions in P2 are in P4.
productions of the form A → x in P4 that "came out of" P1 we can write
tions used in the derivation S1 ⇒+ xS2 arose from rules (1) and (2) of the
construction of P4. Thus we must have the derivations S1 ⇒+ x in G1 and
S2 ⇒+ y in G2.
Hence, L(G4) ⊆ L(G1)L(G2). It thus follows that L(G4) = L(G1)L(G2).
(iii) Let G5 = (N1 ∪ {S5}, Σ, P5, S5) such that S5 is not in N1 and P5
is constructed as follows:
(1) If A → xB is in P1, then A → xB is in P5.
(2) If A → x is in P1, then A → xS1 and A → x are in P5.
(3) S5 → S1 | e are in P5.
We can now equate the class of right-linear languages with the class of
regular sets.
THEOREM 2.2
A language is a regular set if and only if it is a right-linear language.
Proof.
Only if." This portion follows from Lemmas 2.6 and 2.7 and induction
on the number of applications of the definition of regular set necessary to
show a particular regular set to be one.
If: Let G = (N, Σ, P, S) be a right-linear grammar with N = {A1, ..., An}.
We can construct a set of regular expression equations in standard form
with the nonterminals in N as indeterminates. The equation for Ai is
Example 2.10
S → 0A | 1S | e
A → 0B | 1A
B → 0S | 1B
We shall now consider a fourth way, as the sets defined by finite automata.
A finite automaton is one of the simplest recognizers. Its "infinite" memory
is null. Ordinarily, the finite automaton consists only of an input tape and
a finite control. Here, we shall allow the finite control to be nondeterministic,
but restrict the input head to be one way. In fact, we require that the input
head shift right on every move.† The two-way finite automaton is considered
in the Exercises.
†Recall that, by definition, a one-way recognizer does not shift its input head left but
may keep it stationary during a move. Allowing a finite automaton to keep its input head
stationary does not permit the finite automaton to recognize any language not recognizable
by a conventional finite automaton.
Example 2.11
Let M = ({p, q, r}, {0, 1}, δ, p, {r}) be a finite automaton, where δ is
specified as follows:

                    Input
          δ         0       1
State     p        {q}     {p}
          q        {r}     {p}
          r        {r}     {r}
M accepts all strings of 0's and 1's which have two consecutive 0's. That
is, state p is the initial state and can be interpreted as "Two consecutive 0's
have not yet appeared, and the previous symbol was not a 0." State q means
"Two consecutive 0's have not appeared, but the previous symbol was a 0."
State r means "Two consecutive 0's have appeared." Note that once state r
is entered, M remains in that state.
On input 01001, the only possible sequence of configurations, beginning
with the initial configuration (p, 01001), is
(p, 01001) ⊢ (q, 1001)
           ⊢ (p, 001)
           ⊢ (q, 01)
           ⊢ (r, 1)
           ⊢ (r, e)
Thus, 01001 is in L(M). □
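A small sketch (mine) that simulates M, using the transition function as reconstructed above, and confirms that 01001 is accepted while, say, 0101 is not:

    delta = {('p', '0'): 'q', ('p', '1'): 'p',
             ('q', '0'): 'r', ('q', '1'): 'p',
             ('r', '0'): 'r', ('r', '1'): 'r'}

    def accepts(w, start='p', final=('r',)):
        """Run the deterministic finite automaton M on the input string w."""
        state = start
        for a in w:
            state = delta[(state, a)]
        return state in final

    assert accepts('01001')        # contains two consecutive 0's
    assert not accepts('0101')     # does not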
Example 2.12
Let us design a nondeterministic finite automaton to accept the set of
strings in {1, 2, 3} such that the last symbol in the input string also appears
previously in the string. That is, 121 is accepted, but 31312 is not. We shall
have a state q0, which represents the idea that no attempt has been made to
recognize anything. In this state the automaton (or an existence of it,
anyway) is "coasting in neutral." We shall have states q1, q2, and q3, which
represent the idea that a "guess" has been made that the last symbol of the
string is the subscript of the state. We have one final state, qf. In addition
to remaining in q0, the automaton can go to state qa if a is the next input.
If (an existence of) the automaton is in qa, it can go to qf if it sees another a.
The automaton goes no place from qf, since the question of acceptance must
be decided anew as each symbol on its input tape becomes the "last." We
specify M formally as
Input
1 2 3
Since (q0, 12321) ⊢* (qf, e), the string 12321 is in L(M). Note that certain
configurations are repeated in Fig. 2.2, and for this reason a directed acyclic
graph might be a more suitable representation for the configurations entered
by M. □
In the transition diagrams for these automata we have
indicated the start state by pointing to it with an arrow labeled "start," and
final states have been circled.
We shall define a deterministic finite automaton as a special case of the
nondeterministic variety.
DEFINITION
As a special case of (2.2.16), ({q0}, w) ⊢* (S', e) in M' for some S' in F' if and
only if (q0, w) ⊢* (p, e) in M for some p in F. Thus, L(M') = L(M).
Example 2.13
Let us construct a finite automaton M' = (Q, {1, 2, 3}, δ', {q0}, F) accept-
ing the language of M in Example 2.12. Since M has 5 states, it seems that
M' has 32 states. However, not all of these are accessible from the initial
state. That is, we call a state p accessible if there is a w such that
(q0, w) ⊢* (p, e), where q0 is the initial state. Here, we shall construct only the
accessible states.
We begin by observing that {q0} is accessible. δ'({q0}, a) = {q0, qa} for
a = 1, 2, and 3. Let us consider the state {q0, q1}. We have δ'({q0, q1}, 1) =
                                      Input
                                    1     2     3
State  A = {q0}                     B     C     D
       B = {q0, q1}                 E     F     G
       C = {q0, q2}                 F     H     I
       D = {q0, q3}                 G     I     J
       E = {q0, q1, qf}             E     F     G
       F = {q0, q1, q2}             K     K     L
       G = {q0, q1, q3}             M     L     M
       H = {q0, q2, qf}             F     H     I
       I = {q0, q2, q3}             L     N     N
       J = {q0, q3, qf}             G     I     J
       K = {q0, q1, q2, qf}         K     K     L
       L = {q0, q1, q2, q3}         P     P     P
       M = {q0, q1, q3, qf}         M     L     M
       N = {q0, q2, q3, qf}         L     N     N
       P = {q0, q1, q2, q3, qf}     P     P     P
{q0, q1, qf}. Proceeding in this way, we find that a set of states of M is acces-
sible if and only if:
(1) It contains q0, and
(2) If it contains qf, then it also contains q1, q2, or q3.
The complete set of accessible states, together with the 8' function, is
given in Fig. 2.4.
The initial state of M' is A, and the set of final states consists of E, H, J,
K, M, N, and P.
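The accessible-subset construction used in this example is easy to mechanize. The sketch below is a generic version (mine, not the book's); since the exact transition table of Example 2.12 is not reproduced above, it is demonstrated on a small illustrative NFA over {0, 1} accepting strings that end in 01.

    def subset_construction(delta, start, finals):
        """Build the accessible states of the equivalent DFA.
        delta maps (state, symbol) to a set of states."""
        symbols = {a for (_, a) in delta}
        start_set = frozenset([start])
        dstates, worklist, dtrans = {start_set}, [start_set], {}
        while worklist:
            S = worklist.pop()
            for a in symbols:
                T = frozenset(q for s in S for q in delta.get((s, a), ()))
                dtrans[(S, a)] = T
                if T not in dstates:
                    dstates.add(T)
                    worklist.append(T)
        dfinals = {S for S in dstates if S & finals}
        return dstates, dtrans, dfinals

    # Illustrative NFA (not from the text): accepts strings over {0,1} ending in 01.
    nfa = {('s', '0'): {'s', 'a'}, ('s', '1'): {'s'}, ('a', '1'): {'f'}}
    states, trans, finals = subset_construction(nfa, 's', {'f'})
    print(len(states), 'accessible subsets')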
and (q2, w) ⊢* (q, e). This result, together with the definition of F, yields
L(M) = L(M1) ∪ L(M2).
(ii) To construct a finite automaton M to recognize L1L2, let
M = (Q1 ∪ Q2, Σ, δ, q1, F), where δ is defined by
(1) δ(q, a) = δ1(q, a) for all q in Q1 - F1,
(2) δ(q, a) = δ1(q, a) ∪ δ2(q2, a) for all q in F1, and
(3) δ(q, a) = δ2(q, a) for all q in Q2.
Let
F = F2, if q2 is not in F2
F = F1 ∪ F2, if q2 is in F2
That is, M begins by simulating M1. When M reaches a final state of M1,
it may, nondeterministically, imagine that it is in the initial state of M2
by rule (2). M will then simulate M2. Let x be in L1 and y in L2. Then
(q1, xy) ⊢* (q, y) for some q in F1. If x = e, then q = q1. If y ≠ e, then, using
one rule from (2) and zero or more from (3), (q, y) ⊢* (r, e) for some r in F2.
If y = e, then q is in F, since q2 is in F2. Thus, xy is in L(M). Suppose that w
is in L(M). Then (q1, w) ⊢* (q, e) for some q in F. There are two cases to
consider depending on whether q is in F2 or q is in F1. Suppose that q is in F2.
Then we can write w = xay for some a in Σ such that
2.2.5. Summary
THEOREM 2.5
The following statements are equivalent:
(1) L is a regular set.
(2) L is a right-linear language.
(3) L is a finite automaton language.
(4) L is a nondeterministic finite automaton language.
(5) L is denoted by a regular expression. □
EXERCISES
2.2.1. Which of the following are regular sets? Give regular expressions for
those which are.
(a) The set of words with an equal number of 0's and 1's.
(b) The set of words in {0, 1}* with an even number of 0's and an odd
number of 1's.
(c) The set of words in {0, 1}* whose length is divisible by 3.
(d) The set of words in {0, 1}* with no substring 101.
2.2.2. Show that the set of regular expressions over Σ is a CFL.
2.2.3. Show that if L is any regular set, then there is an infinity of regular
expressions denoting L.
2.2.4. Let L be a regular set. Prove directly from the definition of a regular
set that L R is a regular set. Hint: Induction on the number of applica-
tions of the definition of regular set used to show L to be regular.
2.2.5. Show the following identities for regular expressions α, β, and γ:
(a) α(β + γ) = αβ + αγ.          (g) (α + β)γ = αγ + βγ.
(b) α + (β + γ) = (α + β) + γ.   (h) ∅* = e.
                                  (i) α* + α = α*.
(d) αe = eα = α.                  (j) (α*)* = α*.
(e) α∅ = ∅α = ∅.                  (k) (α + β)* = (α*β*)*.
(f) α + ∅ = α.                    (l) α + α = α.
2.2.6. Solve the following set of regular expression equations:
A1 = (01" + 1)A1 + A2
A2 = 11 + 1A1 + 00A3
A3 = e + A 1 ÷A2
(2.2.18) X = αX + β
X = α1X + α2Y + α3
Y = β1X + β2Y + β3
DEFINITION
A right-linear grammar G = (N, Σ, P, S) is called a regular gram-
mar when
(1) All productions with the possible exception of S → e are of the
form A → aB or A → a, where A and B are in N and a is in Σ.
(2) If S → e is in P, then S does not appear on the right of any
production.
2.2.13. Show that every regular set has a regular grammar. Hint: There are
several ways to do this. One way is to apply a sequence of transforma-
tions to a right-linear grammar G which will map G into an equivalent
regular grammar. Another way is to construct a regular grammar
directly from a finite automaton.
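One way to carry out the second hint, constructing a regular grammar directly from a finite automaton, is sketched below. This is my illustration of the usual construction (one nonterminal per state), not the book's; note that the e-production case would need the extra care required by the definition above when the start state can be re-entered.

    def fa_to_regular_grammar(delta, start, finals):
        """Regular grammar for L(M), one nonterminal per state of a
        deterministic finite automaton M."""
        prods = []
        for (q, a), p in delta.items():
            prods.append((q, a + p))          # A -> aB  for delta(A, a) = B
            if p in finals:
                prods.append((q, a))          # A -> a   if delta(A, a) is final
        if start in finals:
            prods.append((start, ''))         # S -> e (needs a fresh start symbol
                                              # if S occurs on any right side)
        return prods

    # The automaton of Example 2.11 (two consecutive 0's), states p, q, r:
    delta = {('p', '0'): 'q', ('p', '1'): 'p',
             ('q', '0'): 'r', ('q', '1'): 'p',
             ('r', '0'): 'r', ('r', '1'): 'r'}
    for lhs, rhs in fa_to_regular_grammar(delta, 'p', {'r'}):
        print(lhs, '->', rhs if rhs else 'e')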
2.2.14. Construct a regular grammar for the regular set generated by the
right-linear grammar
A → B | C
B → 0B | 1B | 011
C → 0D | 1C | e
D → 0C | 1D
DEFINITION
A production A → α of right-linear grammar G = (N, Σ, P, S) is
useless if there do not exist strings w and x in Σ* such that
DEFINITION
Open Problem
2.2.31. How close to the bound of Exercise 2.2.28 for converting n-state two-
way nondeterministic finite automata to k-state finite automata is it
actually possible to come ?
BIBLIOGRAPHIC NOTES
Example 2.14
Consider the finite automaton M whose transition graph is shown in
Fig. 2.5.
Fig. 2.6 Reduced machine.
Proof The "only if" portion is trivial. The "if" portion is trivial if F
has 0 or n states. Therefore, assume the contrary.
We shall show that the following condition must hold on the k-indistin-
guishability relations:
THEOREM 2.6
M ' of Algorithm 2.2 has the smallest number of states of any finite
automaton accepting L(M).
Proof. Suppose that M " had fewer states than M' and that L(M") -- L(M).
Each equivalence class under --~ is nonempty, so each state of M' is
accessible. Thus there exist strings w and x such that (q0', w)1~,, (q, e) and
(q0', x)[~,, (q, e), where qo' is initial state of M " , but w and x take M' to
different states. Hence, w and x take M to different states, say p and r, which
are distinguishable. That is, there is some y such that exactly one of wy
and xy is in L(M). But wy and xy must take M " to the same state, namely,
that state s such that (q, y)[~,, (s, e). Thus is not possible that exactly one
of wy and xy is in L(M"), as supposed.
Example 2.15
Let us find a reduced finite automaton for the finite automaton M whose
transition graph is shown in Fig. 2.7. The equivalence classes for ≡^k, k ≥ 0,
are as follows:
For ≡^0: {A, F}, {B, C, D, E}
For ≡^1: {A, F}, {B, E}, {C, D}
For ≡^2: {A, F}, {B, E}, {C, D}.
Since ≡^1 = ≡^2, we have ≡ = ≡^1. The reduced machine M' is ({[A], [B], [C]},
{a, b}, δ', [A], {[A]}), where δ' is defined as
a b
Here we have chosen [A] to represent the equivalence class {A, F}, [B] to
represent {B, E}, and [C] to represent {C, D}.
l"~--(ql, yz)
l-'~-"(q l, z)
I-~- (q2, e)
must be a valid sequence of moves for all i ≥ 0. Since w = xyz is in L, xy^iz
is in L for all i ≥ 1. The case i = 0 is handled similarly. □
Example 2.16
We shall use the pumping lemma to show that L = {0^n 1^n | n ≥ 1} is not
a regular set. Suppose that L is regular. Then for a sufficiently large n, 0^n 1^n
can be written as xyz such that y ≠ e and xy^iz ∈ L for all i ≥ 0. If y ∈ 0+
or y ∈ 1+, then xz = xy^0z ∉ L. If y ∈ 0+1+, then xyyz ∉ L. We have a con-
tradiction, so L cannot be regular. □
We say that a set A is closed under the n-ary operation θ if θ(a1, a2, ..., an)
is in A whenever ai is in A for 1 ≤ i ≤ n. For example, the set of integers is
closed under the binary operation addition.
In this section we shall examine certain operations under which the class
of regular sets is closed. We can then use these closure properties to help
determine whether certain languages are regular. We already know that if
L1 and L2 are regular sets, then L1 ∪ L2, L1L2, and L1* are regular.
DEFINITION
A class of sets is a Boolean algebra of sets if it is closed under union,
intersection, and complementation.
THEOREM 2.8
The class of regular sets included in Σ* is a Boolean algebra of sets for
any alphabet Σ.
Proof. We shall show closure under complementation. We already have
closure under union, and closure under intersection follows from the set-
theoretic law A ∩ B = ¬(¬A ∪ ¬B) (Exercise 0.1.4). Let M = (Q, A, δ, q0, F)
be any finite automaton with A ⊆ Σ. It is easy to show that every regular
set L ⊆ Σ* has such a finite automaton. Then the finite automaton
M' = (Q, A, δ, q0, Q - F) accepts A* - L(M). Note that the fact that M is
completely specified is needed here. Now ¬L(M), the complement with respect
to Σ*, can be expressed as ¬L(M) = L(M') ∪ Σ*(Σ - A)Σ*. Since
Σ*(Σ - A)Σ* is regular, the regularity of ¬L(M) follows from the closure of
regular sets under union. □
THEOREM 2.9
The class of regular sets is closed under reversal.
Proof. Let M = (Q, Σ, δ, q0, F) be a finite automaton defining the regu-
lar set L. To define L^R we "run M backward." That is, let M' be the nondeter-
ministic finite automaton (Q ∪ {q0'}, Σ, δ', q0', F'), where F' is {q0} if e ∉ L
and F' = {q0, q0'} if e ∈ L.
The class of regular sets is closed under most common language theoretic
operations. More of these closure properties are explored in the Exer-
cises.
ALGORITHM 2.4
Example 2.17
We can enumerate the Turing machines. (See the Exercises in Section 0.4.)
Let M1, M2, ... be such an enumeration. We can define the integers to be
a representation of the regular sets as follows:
(1) If Mi accepts a regular set, then let integer i represent that regular set.
(2) If Mi does not accept a regular set, then let integer i represent {e}.
Each integer thus represents a regular set, and each regular set is repre-
sented by at least one integer. It is known that for the representation of
Turing machines used here the emptiness problem is undecidable (Exercise
0.4.16). Suppose that it were decidable whether integer i represented ∅.
Then it is easy to see that Mi accepts ∅ if and only if i represents ∅. Thus
the emptiness problem is undecidable for regular sets when regular sets are
specified in this manner. □
EXERCISES
2.3.1. Given a finite automaton with n accessible states, what is the smallest
number of states the reduced machine can have ?
2.3.2. Find the minimum state finite automaton for the language specified by
       the finite automaton M = ({A, B, C, D, E, F}, {0, 1}, δ, A, {E, F}), where
       δ is given by

                    Input
          State     0     1
            A       B     C
            B       E     F
            C       A     A
            D       F     E
            E       D     F
            F       D     E
2.3.3. Show that for all n there is an n-state finite automaton such that
       ≡^(n-2) ≠ ≡^(n-3).
DEFINITION
We say that a relation R on Σ* is right-invariant if x R y implies
xz R yz for all x, y, z in Σ*.
2.3.5. Show that L is a regular set if and only if L is the union of some of the
       equivalence classes of a right-invariant equivalence relation R of finite
       index. Hint: Only if: Let R be the relation x R y if and only if
       (q0, x) ⊢* (p, e), (q0, y) ⊢* (q, e), and p = q. (That is, x and y take a finite
       automaton defining L to the same state.) Show that R is a right-invariant
       equivalence relation of finite index. If: Construct a finite automaton
       for L using the equivalence classes of R for states.
DEFINITION
We say that E is the coarsest right-invariant equivalence relation for
a language L ⊆ Σ* if x E y if and only if for all z ∈ Σ* we find
xz ∈ L exactly when yz ∈ L.
The following exercise states that every right-invariant equivalence
relation defining a language is always contained in E.
2.3.6. Let L be the union of some of the equivalence classes of a right-invariant
       equivalence relation R on Σ*. Let E be the coarsest right-invariant
       equivalence relation for L. Show that E ⊇ R.
*2.3.7. Show that the coarsest right invariant equivalence relation for a lan-
guage is of finite index if and only if that language is a regular set.
2.3.8. Let M = (Q, Σ, δ, q0, F) be a reduced finite automaton. Define the
       relation E on Σ* as follows: x E y if and only if (q0, x) ⊢* (p, e),
       (q0, y) ⊢* (q, e), and p = q. Show that E is the coarsest right-invariant
       equivalence relation for L(M).
DEFINITION
An equivalence relation R on Σ* is a congruence relation if R is
both left- and right-invariant (i.e., if x R y, then wxz R wyz for all
w, x, y, z in Σ*).
2.3.9. Show that L is a regular set if and only if L is the union of some of the
equivalence classes of a congruence relation of finite index.
2.3.10. Show that if M1 and M2 are two reduced finite automata such that
        L(M1) = L(M2), then the transition graphs of M1 and M2 are the same.
*2.3.11. Show that Algorithm 2.2 is of time complexity n^2. (That is, show that
        there exists a finite automaton M with n states such that Algorithm 2.2
        requires n^2 operations to find the reduced automaton for M.) What is
        the expected time complexity of Algorithm 2.2?
It is possible to find an algorithm for minimizing the states in a
finite automaton which always runs in time no greater than n log n,
where n is the number of states in the finite automaton to be reduced.
The most time-consuming part of Algorithm 2.2 is the determination
of the equivalence classes under ≡ in step 2 using the method suggested
in Lemma 2.11. However, we can use the following algorithm in step 2
to reduce the time complexity of Algorithm 2.2 to n log n.
This new algorithm refines partitions on the set of states in a
manner somewhat different from that suggested by Lemma 2.11.
Initially, the states are partitioned into final and nonfinal states.
Then, suppose that we have the partition consisting of the set of
blocks {π1, π2, ..., π_(k-1)}. A block π_i in this partition and an input
symbol a are selected and used to refine this partition. Each block π_j
such that δ(q, a) ∈ π_i for some q in π_j is split into two blocks π_j'
and π_j'' such that π_j' = {q | q ∈ π_j and δ(q, a) ∈ π_i} and π_j'' = π_j − π_j'.
Thus, in contrast with the method in Lemma 2.11, here blocks are
refined when the successor states on a given input have previously
been shown inequivalent.
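The splitting step just described can be illustrated in Python. The sketch below is our own and is deliberately naive: it re-examines every block on every pass instead of managing the worklist needed for the n log n bound, but it shows how blocks are refined by a chosen block and input symbol.

def refine_partition(states, alphabet, delta, final):
    # Indistinguishability classes of a completely specified DFA, computed by
    # repeatedly splitting blocks, starting from {final, nonfinal}.
    blocks = [b for b in (set(final), set(states) - set(final)) if b]
    changed = True
    while changed:
        changed = False
        for splitter in list(blocks):
            for a in alphabet:
                new_blocks = []
                for b in blocks:
                    inside = {q for q in b if delta[(q, a)] in splitter}
                    outside = b - inside
                    if inside and outside:       # b is split by (splitter, a)
                        new_blocks += [inside, outside]
                        changed = True
                    else:
                        new_blocks.append(b)
                blocks = new_blocks
    return blocks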
ALGORITHM 2.6
([2}, otherwise
(4) Set k = 3.
(d) S e t k = k + l .
(8) Go to step 5.
2.3.12. Apply Algorithm 2.6 to the finite automata in Example 2.15 and
Exercise 2.3.2.
2.3.13. Prove that Algorithm 2.6 correctly determines the indistinguishability
classes of a finite automaton.
*'2.3.14. Show that Algorithm 2.6 can be implemented in time n log n.
2.3.15. Show that the following are not regular sets:
        (a) {0^n 1 0^n | n ≥ 1}.
        (b) {ww | w is in {0, 1}*}.
        (c) L(G), where G is defined by productions S → aSbS | e.
        (d) {a^(n^2) | n ≥ 1}.
        (e) {a^p | p is a prime}.
        (f) {w | w is in {0, 1}* and w has an equal number of 0's and 1's}.
2.3.16. Let f(m) be a monotonically increasing function such that for all n
        there exists m such that f(m + 1) > f(m) + n. Show that {a^f(m) | m ≥ 1}
        is not regular.
DEFINITION
Let L1 and L2 be languages. We define the following operations:
(1) L1/L2 = {w | for some x ∈ L2, wx is in L1}.
(2) INIT(L1) = {w | for some x, wx is in L1}.
(3) FIN(L1) = {w | for some x, xw is in L1}.
(4) SUB(L1) = {w | for some x and y, xwy is in L1}.
(5) MIN(L1) = {w | w ∈ L1 and for no proper prefix x of w is
    x ∈ L1}.
(6) MAX(L1) = {w | w ∈ L1 and for no x ≠ e is wx ∈ L1}.
Example 2.18
Let L1 = {0^n 1^n 0^m | n, m ≥ 1} and L2 = 1*0*.
Then L1/L2 = L1 ∪ {0^i 1^j | i ≥ 1, j ≤ i}.
L2/L1 = ∅.
D a . a = Ox(Oaa)
**2.3.21. Let L be a regular set. Show that {x | xy ∈ L for some y such that
        |x| = |y|} is regular.
A generalization of Exercise 2.3.21 is the following.
**2.3.22. Let L be a regular set and f ( x ) a polynomial in x with nonnegative
integer coefficients. Show that
Open Problem
2.3.32. Find a fast algorithm (say, one which takes time n k for some constant
k on automata of n states) which gives a minimum state nondeter-
ministic finite automaton equivalent to a given one.
Programming Exercises
2.3.33. Write a program that takes as input a finite automaton, right-linear
grammar, or regular expression and produces as output an equivalent
finite automaton, right-linear grammar, or regular expression. For
example, this program can be used to construct a finite automaton from
a regular expression.
2.3.34. Construct a program that takes as input a specification of a finite
automaton M and produces as output a,reduced finite automaton that
is equivalent to M.
2.3.35. Write a program that will simulate a nondeterministic finite automaton.
2.3.36. Construct a program that determines whether two specifications of a
regular set are equivalent.
BIBLIOGRAPHIC NOTES
The minimization of finite automata was first studied by Huffman [1954] and
Moore [1956]. The closure properties of regular sets and decidability results for
finite automata are from Rabin and Scott [1959].
The Exercises contain some of the many results concerning finite automata
and regular sets. Algorithm 2.6 is from Hopcroft [1971]. Exercise 2.3.22 has been
proved by Kosaraju [1970]. The derivative of a regular expression was defined by
Brzozowski [1964].
There are many techniques to minimize incompletely specified finite automata
(Exercise 2.3.27). Ginsburg [1962] and Prather [1969] consider this problem.
Kameda and Weiner [1968] give a partial solution to Exercise 2.3.32.
The books by Gill [1962], Ginsburg [1962], Harrison [1965], Minsky [1967],
Booth [1967], Ginzburg [1968], Arbib [1969], and Salomaa [1969a] cover finite
automata in detail.
Thompson [1968] outlines a useful programming technique for constructing a
recognizer from a regular expression.
2.4. CONTEXT-FREE LANGUAGES
A labeled ordered tree D is a derivation tree (or parse tree) for a context-
free grammar G(A) = (N, Σ, P, A) if
(1) The root of D is labeled A.
(2) If D1, ..., Dk are the subtrees of the direct descendants of the root
and the root of D_i is labeled X_i, then A → X1 ⋯ Xk is a production in P.
D_i must be a derivation tree for G(X_i) = (N, Σ, P, X_i) if X_i is a nonterminal,
and D_i is a single node labeled X_i if X_i is a terminal.
(3) Alternatively, if D1 is the only subtree of the root of D and the root
of D1 is labeled e, then A → e is a production in P.
Example 2.19
The trees in Fig. 2.8 are derivation trees for the grammar G = G(S)
defined by S → aSbS | bSaS | e. □
We note that there is a natural ordering on the nodes of an ordered tree.
That is, the direct descendants of a node are ordered "from the left" as defined
[Fig. 2.8 Derivation trees (four trees, (a) through (d)) not reproduced.]
Example 2.20
The set of nodes consisting of only the root is a cut. Another cut is the
set of leaves. The set of circled nodes in Fig. 2.9 is a cut. E]
DEFINITION
LEMMA 2.12
Let S = α0, α1, ..., αn be a derivation of αn from S in CFG G =
(N, Σ, P, S). Then there is a derivation tree D for G such that D has frontier
αn and interior frontiers α0, α1, ..., α_(n-1) (among others).
Proof. We shall construct a sequence of derivation trees D_i, 0 ≤ i ≤ n,
such that the frontier of D_i is α_i.
Let D_0 be the derivation tree consisting of the single node labeled S.
Suppose that α_i = β_i A γ_i and this instance of A is rewritten to obtain
α_(i+1) = β_i X1 X2 ⋯ Xk γ_i. Then the derivation tree D_(i+1) is obtained from D_i
by adding k direct descendants to the leaf labeled with this instance of A
(i.e., the node which contributes the (|β_i| + 1)st symbol to the frontier of D_i)
and labeling these direct descendants X1, X2, ..., Xk, respectively. It should
be evident that the frontier of D_(i+1) is α_(i+1). The construction of D_(i+1) from
D_i is shown in Fig. 2.10.
[Fig. 2.10 (construction of D_(i+1) from D_i) not reproduced.]
There are two derivations that can be constructed from a derivation tree
which will be of particular interest to us.
DEFINITION
Example 2.21
Let G0 be the CFG
    E → E + T | T
    T → T * F | F
    F → (E) | a
The derivation tree shown in Fig. 2.11 represents ten equivalent derivations
[Fig. 2.11 Example of a tree (derivation tree) not reproduced.]
DEFINITION
E ~ E + T I E * T[ T
Z~ (E) ta
THEOREM 2.12
Algorithm 2.7 says "YES" if and only if S ⇒* w for some w in Σ*.
Proof. We first prove the following statement by induction on i:
The case k = 0 (i.e., production A → e) is not ruled out. The inductive
step is complete.
The definition of N_i assures us that if N_i = N_(i-1), then N_i = N_(i+1) = ⋯.
We must show that if A ⇒* w for some w ∈ Σ*, then A is in N_e. By the above
comment, all we need to show is that A is in N_i for some i. We show the
following by induction on n:
(2.4.2) If A ⇒^n w, then A is in N_i for some i
DEFINITION
We say that a symbol X in N ∪ Σ is inaccessible in a CFG G =
(N, Σ, P, S) if X does not appear in any sentential form.
"l'This is an "obvious" comment that requires a little thought. Think about the deriva-
n+l
tion tree for the derivation A ~ w. wy is the frontier of the subtree with root Xj.
Method.
(1) Let V0 = {S} and set i = 1.
(2) Let V_i = {X | some A → αXβ is in P and A is in V_(i-1)} ∪ V_(i-1).
(3) If V_i ≠ V_(i-1), set i = i + 1 and go to step (2). Otherwise, let
        N' = V_i ∩ N
        Σ' = V_i ∩ Σ
        P' be those productions in P which involve only symbols in V_i
        G' = (N', Σ', P', S).  □
There is a great deal of similarity between Algorithms 2.7 and 2.8. Note
that in Algorithm 2.8, since V_i ⊆ N ∪ Σ, step (2) of the algorithm can be
repeated at most a finite number of times. Moreover, a straightforward proof
by induction on i shows that S ⇒* αXβ for some α and β if and only if X is
in V_i for some i.
We are now in a position to remove all useless symbols from a CFG.
ALGORITHM 2 . 9
appear in at least one derivation of the form S ⇒* wXy ⇒* wxy. Note that
applying Algorithm 2.8 first and then applying Algorithm 2.7 will not always
result in a grammar with no useless symbols.
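The following sketch (our own; a grammar is a dict from nonterminals to lists of right sides, each right side a list of symbols) combines the two iterations in the order required by Algorithm 2.9: first the generating nonterminals of Algorithm 2.7, then the accessible symbols of Algorithm 2.8.

def generating_nonterminals(productions, terminals):
    # N_i iteration: nonterminals that derive some terminal string.
    gen = set()
    changed = True
    while changed:
        changed = False
        for A, rhss in productions.items():
            if A not in gen and any(all(X in terminals or X in gen for X in rhs)
                                    for rhs in rhss):
                gen.add(A)
                changed = True
    return gen

def accessible_symbols(productions, start):
    # V_i iteration: symbols reachable from the start symbol.
    acc = {start}
    changed = True
    while changed:
        changed = False
        for A, rhss in productions.items():
            if A in acc:
                for rhs in rhss:
                    for X in rhs:
                        if X not in acc:
                            acc.add(X)
                            changed = True
    return acc

def remove_useless(productions, terminals, start):
    gen = generating_nonterminals(productions, terminals)
    p1 = {A: [r for r in rhss if all(X in terminals or X in gen for X in r)]
          for A, rhss in productions.items() if A in gen}
    acc = accessible_symbols(p1, start)
    return {A: [r for r in rhss if all(X in acc for X in r)]
            for A, rhss in p1.items() if A in acc}

# Example: S -> a | A,  A -> AB,  B -> b.  A and B turn out to be useless.
G = {'S': [['a'], ['A']], 'A': [['A', 'B']], 'B': [['b']]}
print(remove_useless(G, {'a', 'b'}, 'S'))   # {'S': [['a']]}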
THEOREM 2.13
G' of Algorithm 2.9 has no useless symbols, and L(G') = L(G).
Proof. We leave it for the Exercises to show that L(G') = L(G). Suppose
that A ∈ N' is useless. From the definition of useless, there are two cases
to consider.
Case 1: S ⇒* αAβ is false in G' for all α and β. In this case, A would have
Case 2: S ⇒* αAβ in G' for some α and β, but A ⇒* w is false in G' for all
w in Σ'*.
Then A is not removed in step (2), and, moreover, if A ⇒* γBδ in G, then B is not
clude that A ⇒* w is also false in G for all w, and A is eliminated in step (1).
The proof that no terminal of G' is useless is handled similarly and is
left for the Exercises. □
Example 2.22
Consider the grammar G = ({S, A, B}, {a, b}, P, S), where P consists of
    S → a | A
    A → AB
    B → b
    S' → e | S
Example 2.23
Consider the grammar of Example 2.19 with productions
THEOREM 2.14
Algorithm 2.10 produces an e-free grammar equivalent to its input
grammar.
Proof. By inspection, G' of Algorithm 2.10 is e-free. To prove that
The proof of (2.4.3) is left for the Exercises. Substituting S for A in (2.4.3),
we see that for w ≠ e, w ∈ L(G) if and only if w ∈ L(G'). The fact that
e ∈ L(G) if and only if e ∈ L(G') is evident. Thus, L(G) = L(G'). □
Another transformation on grammars which we find useful is the removal
of productions of the form A → B, which we shall call single productions.
ALGORITHM 2.11
Removal of single productions.
Input. An e-free C F G G.
Output. An equivalent e-free C F G G' with no single productions.
Method.
Example 2.24
Let us apply Algorithm 2.11 to the grammar G0 with productions
    E → E + T | T
    T → T * F | F
    F → (E) | a
In step (1), N_E = {E, T, F}, N_T = {T, F}, N_F = {F}. After step (2), P' becomes
    E → E + T | T * F | (E) | a
    T → T * F | (E) | a
    F → (E) | a   □
THEOREM 2.15
In Algorithm 2.11, G' has no single productions, and L(G) = L(G').
LEMMA 2.14
Let G = (N, Σ, P, S) be a CFG and A → αBβ be in P for some B ∈ N
and α and β in (N ∪ Σ)*. Let B → γ1 | γ2 | ⋯ | γk be all the B-productions
in P. Let G' = (N, Σ, P', S), where
Example 2.25
Let us replace the production A → aAA in the grammar G having the
two productions A → aAA | b. Applying Lemma 2.14, assuming that α = a,
B = A, and β = A, we would obtain G' having productions
    A → aaAAA | abA | b.
[Figure comparing derivation trees in G and in G' not reproduced.]
DEFINITION
ALGORITHM 2.12
Conversion to Chomsky normal form.
Input. A proper CFG G = (N, Σ, P, S) with no single productions.
Output. A CFG G' in CNF, with L(G) = L(G').
Method. From G we shall construct an equivalent CNF grammar G'
as follows. Let P' be the following set of productions:
(1) Add each production of the form A → a in P to P'.
(2) Add each production of the form A → BC in P to P'.
(3) If S → e is in P, add S → e to P'.
(4) For each production of the form A → X1 ⋯ Xk in P, where k > 2,
add to P' the following set of productions. We let X_i' stand for X_i if X_i is in
N, and let X_i' be a new nonterminal if X_i is in Σ.
each production of G' with a nonterminal a', and then to each production
with a nonterminal of the form ⟨X_i ⋯ X_j⟩. The resulting grammar will
be G. □
Example 2.26
Let G be the proper CFG defined by
    S → aAB | BA
    A → BBB | a
    B → AS | b
Applying Algorithm 2.12, we obtain the CNF grammar G' with productions
    S → a'⟨AB⟩ | BA
    A → B⟨BB⟩ | a
    B → AS | b
    ⟨AB⟩ → AB
    ⟨BB⟩ → BB
    a' → a   □
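Algorithm 2.12 can be sketched directly. The Python fragment below is our own illustration (right sides are tuples of one-character symbols, and the bracketed names <X...X> are built as strings); it assumes its input is a proper grammar with no single productions, as the algorithm requires.

def to_cnf(productions, terminals):
    # Convert a proper CFG with no single productions to Chomsky normal form.
    # `productions` maps a nonterminal to a list of right sides (tuples).
    new_prods = {}

    def add(A, rhs):
        new_prods.setdefault(A, [])
        if tuple(rhs) not in new_prods[A]:
            new_prods[A].append(tuple(rhs))

    def primed(X):
        # X' stands for X if X is a nonterminal, else a new nonterminal X'
        if X in terminals:
            add(X + "'", (X,))              # X' -> X
            return X + "'"
        return X

    for A, rhss in productions.items():
        for rhs in rhss:
            if len(rhs) == 1:               # A -> a (or S -> e)
                add(A, rhs)
            elif len(rhs) == 2 and all(X not in terminals for X in rhs):
                add(A, rhs)                 # A -> BC already in CNF
            else:
                syms = [primed(X) for X in rhs]
                left = A
                for i in range(len(syms) - 2):
                    rest = '<' + ''.join(rhs[i + 1:]) + '>'
                    add(left, (syms[i], rest))
                    left = rest
                add(left, (syms[-2], syms[-1]))
    return new_prods

# The proper CFG of Example 2.26: S -> aAB | BA, A -> BBB | a, B -> AS | b
G = {'S': [('a', 'A', 'B'), ('B', 'A')],
     'A': [('B', 'B', 'B'), ('a',)],
     'B': [('A', 'S'), ('b',)]}
for A, rhss in to_cnf(G, {'a', 'b'}).items():
    print(A, '->', ' | '.join(''.join(r) for r in rhss))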
We next show that it is possible to find for each CFL a grammar in which
every production has a right side beginning with a terminal. Central to the
construction is the idea of left recursion and its elimination.
DEFINITION
LEMMA 2.15
Let G = (N, Z, P, S) be a C F G in which
Example 2.27
Let G0 be our usual grammar with productions
    E → E + T | T
    T → T * F | F
    F → (E) | a
Eliminating the immediate left recursion from the E- and T-productions, as
in Lemma 2.15, we obtain
    E → T | TE'
    E' → + T | + TE'
    T → F | FT'
†Note that the A → β_i's are in the initial and final sets of A-productions.
[Figure illustrating the transformation of left-recursive derivation trees
(a left-recursive chain of A's becomes a right-recursive chain of A''s) not
reproduced.]
ALGORITHM 2.13
Elimination of left recursion.
Input. A proper CFG G = (N, Σ, P, S).
Output. A CFG G' with no left recursion.
Method.
(1) Let N = {A1, ..., An}. We shall first transform G so that if A_i → α
is a production, then α begins either with a terminal or some A_j such that
j > i. For this purpose, set i = 1.
(2) Let the A_i-productions be A_i → A_i α1 | ⋯ | A_i αm | β1 | ⋯ | βp, where
no β_j begins with A_k if k ≤ i. (It will always be possible to do this.) Replace
these A_i-productions by
    A_i → β1 | ⋯ | βp | β1 A_i' | ⋯ | βp A_i'
    A_i' → α1 | ⋯ | αm | α1 A_i' | ⋯ | αm A_i'
where A_i' is a new variable. All the A_i-productions now begin with a terminal
or A_k for some k > i.
(3) If i = n, let G' be the resulting grammar, and halt. Otherwise, set
i = i + 1 and j = 1.
(4) Replace each production of the form A_i → A_j α by the productions
A_i → β1 α | ⋯ | βm α, where A_j → β1 | ⋯ | βm are all the A_j-productions. It
will now be the case that all A_j-productions begin with a terminal or A_k,
for k > j, so all A_i-productions will then also have that property.
(5) If j = i − 1, go to step (2). Otherwise, set j = j + 1 and go to step (4).
□
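A compact version of the same idea is the classical substitution-plus-new-variable scheme sketched below. This is our own illustration: it assumes a proper grammar with no e-productions or cycles, fixes the order A1, ..., An up front, and does not reproduce the bookkeeping of steps (3) and (5) literally.

def eliminate_left_recursion(nonterminals, productions):
    # `nonterminals` fixes the order A1, ..., An; `productions` maps each
    # nonterminal to a list of right sides, each a tuple of symbols.
    prods = {A: [tuple(r) for r in productions[A]] for A in nonterminals}
    new_nts = []
    for i, Ai in enumerate(nonterminals):
        # Substitute earlier nonterminals occurring first on a right side
        # (step (4)), so no Ai-production starts with Aj for j < i afterwards.
        for Aj in nonterminals[:i]:
            expanded = []
            for rhs in prods[Ai]:
                if rhs[:1] == (Aj,):
                    expanded.extend(beta + rhs[1:] for beta in prods[Aj])
                else:
                    expanded.append(rhs)
            prods[Ai] = expanded
        # Remove immediate left recursion (step (2)) with a new variable Ai'.
        recursive = [rhs[1:] for rhs in prods[Ai] if rhs[:1] == (Ai,)]
        others = [rhs for rhs in prods[Ai] if rhs[:1] != (Ai,)]
        if recursive:
            Ai_p = Ai + "'"
            new_nts.append(Ai_p)
            prods[Ai] = others + [rhs + (Ai_p,) for rhs in others]
            prods[Ai_p] = recursive + [rhs + (Ai_p,) for rhs in recursive]
    return nonterminals + new_nts, prods

# Example 2.27: E -> E+T | T, T -> T*F | F, F -> (E) | a, with order E, T, F.
G = {'E': [('E', '+', 'T'), ('T',)],
     'T': [('T', '*', 'F'), ('F',)],
     'F': [('(', 'E', ')'), ('a',)]}
nts, out = eliminate_left_recursion(['E', 'T', 'F'], G)
for A in nts:
    print(A, '->', ' | '.join(' '.join(r) for r in out[A]))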
THEOREM 2.18
Every CFL has a non-left-recursive grammar.
Proof. Let G be a proper grammar for CFL L. If we apply Algorithm
2.13, the only transformations used are those of Lemmas 2.14 and 2.15.
Thus the resulting G' generates L.
We must show that G' is free of left recursion. The following two state-
ments are proved by induction on a quantity which we shall subsequently
define:
(2.4.4) After step (2) is executed for i, all A_i-productions begin with
a terminal or A_k, for k > i
(2.4.5) After step (4) is executed for i and j, all A_i-productions begin with
a terminal or A_k, for k > j
k ≤ j such that A_i ⇒_lm A_j β ⇒_lm A_k γ ⇒_lm A_i α. We must now show that no A_i'
Example 2.28
Let G be
A → BC | a
B → CA | Ab
C → AB | CC | a
Example 2.29
Consider the grammar G with productions
E → T | TE'
E' → + T | + TE'
T → F | FT'
T' → * F | * FT'
F → (E) | a
Take E' < E < T' < T < F as the linear order on nonterminals.
E → (E)' | a | (E)'T' | aT' | (E)'E' | aE' | (E)'T'E' | aT'E'
E' → + T | + TE'
T → (E)' | a | (E)'T' | aT'
T' → * F | * FT'
F → (E)' | a
)' → )
A → AaB | BB | b
B → aA | BAa | Bd | c

(2.4.6)    A = AaB + BB + b
           B = aA + BAa + Bd + c
DEFINITION
d = d R + _B
Example 2.30
Let us consider the grammar whose corresponding defining equations
are (2.4.7), that is,
    A → AaB | BB | b
    B → aA | BAa | Bd | c
We rewrite these equations according to step (2) of Algorithm 2.15 as

(2.4.9)    [W X]   [aB    ∅   ] [W X]   [aB    ∅   ]
           [Y Z] = [B    Aa+d ] [Y Z] + [B    Aa+d ]

The grammar corresponding to (2.4.8) and (2.4.9) is
    A → bW | aAY | cY | b
    B → bX | aAZ | cZ | aA | c
    W → aBW | aB
    X → aBX
    Y → BW | AaY | dY | B
    Z → BX | AaZ | dZ | Aa | d
EXERCISES
[Fig. 2.14 Unlabelled derivation tree (nodes n1, ..., n7) not reproduced.]
2.4.4. Show that the following are equivalent statements about a CFG G and
sentence w:
(a) w is the frontier of two distinct derivation trees of G.
(b) w has two distinct leftmost derivations in G.
(c) w has two distinct rightmost derivations in G.
**2.4.5. What is the largest number of different derivations that are represen-
table by the same derivation tree of n nodes ?
2.4.6. Convert the grammar
    S → A | B
    A → aB | bS | b
    B → AB | Ba
    C → AS | b

    S → ABC
    A → BB | e
    B → CC | a
    C → AA | b

    S → A | B
    A → C | D
    B → D | E
    C → S | a | e
    D → S | b
    E → S | c | e
Xn+l = n + 1
n)
(These are the Catalan numbers.)
*2.4.30. Show that if L is a CFL containing no sentence of length less than 2,
        then L has a grammar with all productions of the form A → aαb.
2.4.31. Show that every CFL has a grammar in which if X1 X2 ⋯ Xk is the
        right side of a production, then X1, ..., Xk are all distinct.
DEFINITION
A CFG G = (N, Σ, P, S) is linear if every production is of the form
A → wBx or A → w for w and x in Σ* and B in N.
2.4.32. Show that every linear language without e has a grammar in which
        each production is of one of the forms A → aB, A → Ba, or A → a.
*2.4.33. Show that every CFL has a grammar G = (N, Σ, P, S) such that if A
        is in N − {S}, then {w | A ⇒* w and w is in Σ*} is infinite.
2.4.34. Show that every CFL has a recursive grammar. Hint: Use Lemma 2.14
        and Exercise 2.4.33.
DEFINITION
Programming Exercises
2.4.38. Construct a program that eliminates all useless symbols from a CFG.
2.4.39. Write a program that maps a CFG into an equivalent proper CFG.
2.4.40. Construct a program that removes all left recursion from a CFG.
2.4.41. Write a program that decides whether a given derivation tree is a valid
derivation tree for a CFG.
BIBLIOGRAPHIC NOTES
2.5. PUSHDOWN AUTOMATA
[Figure of a pushdown automaton (read-only input tape a1 a2 ⋯ an, finite
state control, pushdown list Z1 Z2 ⋯ Zm) not reproduced.]
where
(1) Q is a finite set of state symbols representing the possible states of
the finite state control,
(2) Σ is a finite input alphabet,
(3) Γ is a finite alphabet of pushdown list symbols,
(4) δ is a mapping from Q × (Σ ∪ {e}) × Γ to the finite subsets of
Q × Γ*,
(5) q0 ∈ Q is the initial state of the finite control,
(6) Z0 ∈ Γ is the symbol that appears initially on the pushdown list
(the start symbol), and
(7) F ⊆ Q is the set of final states.
A configuration of P is a triple (q, w, α) in Q × Σ* × Γ*, where
(1) q represents the current state of the finite control.
(2) w represents the unused portion of the input. The first symbol of w
is under the input head. If w = e, then it is assumed that all of the input
tape has been read.
(3) α represents the contents of the pushdown list. The leftmost symbol
of α is the topmost pushdown symbol. If α = e, then the pushdown list is
assumed to be empty.
A move by P will be represented by the binary relation ⊢_P (or ⊢ whenever
P is understood) on configurations. We write
Example 2.31
Let us give a pushdown automaton for the language L = {0^n 1^n | n ≥ 0}.
Let P = ({q0, q1, q2}, {0, 1}, {Z, 0}, δ, q0, Z, {q0}), where
P operates by copying the initial string of 0's from its input tape onto its
pushdown list and then popping one 0 from the pushdown list for each 1
that is seen on the input. Moreover, the state transitions ensure that all 0's
must precede the 1's. For example, with the input string 0011, P would make
the following sequence of moves:
Example 2.32
Let us design a pushdown automaton for the language L = {ww^R | w ∈ {a, b}+}.
Let P = ({q0, q1, q2}, {a, b}, {Z, a, b}, δ, q0, Z, {q2}), where
(1) δ(q0, a, Z) = {(q0, aZ)}
(2) δ(q0, b, Z) = {(q0, bZ)}
(3) δ(q0, a, a) = {(q0, aa), (q1, e)}
(4) δ(q0, a, b) = {(q0, ab)}
(5) δ(q0, b, a) = {(q0, ba)}
(6) δ(q0, b, b) = {(q0, bb), (q1, e)}
(7) δ(q1, a, a) = {(q1, e)}
(8) δ(q1, b, b) = {(q1, e)}
(9) δ(q1, e, Z) = {(q2, e)}
P initially copies some of its input onto its pushdown list, by rules (1),
(2), (4), and (5) and the first alternatives of rules (3) and (6). However, P is
nondeterministic. Anytime it wishes, as long as its current input matches
the top of the pushdown list, it may enter state q~ and begin matching its
pushdown list against the input. The second alternatives of rules (3) and (6)
represent this choice, and the matching continues by rules (7) and (8). Note
that if P ever fails to find a match, then this instance of P "dies." However,
since P is nondeterministic, it makes all possible moves. If any choice causes
P to expose the Z on its pushdown list, then by rule (9) that Z is erased and
state q2 entered. Thus P accepts if and only if all matches are made.
For example, with the input string abba, P can make the following
sequences of moves, among others"
(1) (q0, abba, Z) ⊢ (q0, bba, aZ)
                  ⊢ (q0, ba, baZ)
                  ⊢ (q0, a, bbaZ)
                  ⊢ (q0, e, abbaZ)
(2) (q0, abba, Z) ⊢ (q0, bba, aZ)
                  ⊢ (q0, ba, baZ)
                  ⊢ (q1, a, aZ)
                  ⊢ (q1, e, Z)
                  ⊢ (q2, e, e).
Since the sequence (2) ends in final state q2, P accepts the input string abba.
Again it is relatively easy to show that if w = c1 c2 ⋯ cn cn c(n-1) ⋯ c1,
each c_i in {a, b}, 1 ≤ i ≤ n, then
    (q0, w, Z) ⊢* (q1, e, Z)
               ⊢ (q2, e, e).
Thus, L ⊆ L(P).
It is not quite as easy to show that if (q0, w, Z) ⊢* (q2, e, α) for some
α ∈ Γ*, then w is of the form xx^R for some x in (a + b)+ and α = e. This
proof is left for the Exercises. We can then conclude that L(P) = L.
The pushdown automaton of Example 2.32 quite clearly brings out the
nondeterministic nature of a PDA. From any configuration of the form
(q0, aw, aα) it is possible for P to make one of two moves: either push
another a on the pushdown list or pop the a from the top of the pushdown
list.
We should emphasize that although a nondeterministic pushdown
automaton may provide a convenient abstract definition for a language,
the device must be deterministically simulated to be realized in practice. In
Chapter 4 we shall discuss systematic methods for simulating nondetermin-
istic pushdown automata.
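One brute-force way to simulate a nondeterministic PDA deterministically is to carry along the whole set of configurations it could be in, as in the following sketch (ours; acceptance is by final state with the input exhausted, and the step bound is only a crude safeguard against looping configurations). The efficient simulation methods are the subject of Chapter 4.

def pda_accepts(pda, w, max_steps=10000):
    # pda = (delta, q0, Z0, final); delta maps (state, input-or-'', top) to a
    # set of (state, pushed-string) pairs; the leftmost character of a pushed
    # string becomes the new top of the pushdown list.
    delta, q0, Z0, final = pda
    configs = {(q0, w, Z0)}
    for _ in range(max_steps):
        if any(q in final and rest == '' for q, rest, _ in configs):
            return True
        new = set()
        for q, rest, stack in configs:
            top = stack[:1]
            if top:
                for a in ('', rest[:1]):
                    for r, push in delta.get((q, a, top), set()):
                        new.add((r, rest if a == '' else rest[1:],
                                 push + stack[1:]))
        if new <= configs:                 # closure reached: nothing new
            return any(q in final and rest == '' for q, rest, _ in configs)
        configs |= new
    return False

# The PDA of Example 2.32, accepting {w w^R | w in {a,b}+}:
d = {('q0', 'a', 'Z'): {('q0', 'aZ')}, ('q0', 'b', 'Z'): {('q0', 'bZ')},
     ('q0', 'a', 'a'): {('q0', 'aa'), ('q1', '')},
     ('q0', 'a', 'b'): {('q0', 'ab')}, ('q0', 'b', 'a'): {('q0', 'ba')},
     ('q0', 'b', 'b'): {('q0', 'bb'), ('q1', '')},
     ('q1', 'a', 'a'): {('q1', '')}, ('q1', 'b', 'b'): {('q1', '')},
     ('q1', '', 'Z'): {('q2', '')}}
P = (d, 'q0', 'Z', {'q2'})
print(pda_accepts(P, 'abba'), pda_accepts(P, 'abab'))   # True False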
In this section we shall define some variants of PDA's and relate the lan-
guages defined to the original PDA languages. First we would like to bring
out a fundamental aspect of the behavior of a PDA which should be quite
intuitive. This can be stated as "What transpires on top of the pushdown
list is independent of what is under the top of the pushdown list."
LEMMA 2.20
Let P = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA. If (q, w, A) ⊢^n (q', e, e), then
(q, w, Aα) ⊢^n (q', e, α) for all A ∈ Γ and α ∈ Γ*.
Proof. A proof by induction on n is quite elementary. For n = 1, the
lemma is certainly true. Assuming that it is true for all 1 ≤ n < n', let
(q, w, A) ⊢^(n') (q', e, e). Such a sequence of moves must be of the form
•]'This is another of those "obvious" statements which may require some thought•
Imagine the P D A running through the indicated sequence of configurations. Eventually,
the length of the pushdown list becomes k - 1 for the first time. Since none of X2 . . . Xk
has ever been the top symbol, they must still be there, so let n l be the number of elapsed
moves• Then wait until the length of the list first becomes k -- 2 and let n2 be the number
of additional moves made. Proceed in this way until the list becomes empty•
list by some other finite-length string in a single move. Recall that our original
version of PDA could replace only the topmost symbol on top of the push-
down list on a given move.
DEFINITION
Example 2.33
Let us define an extended PDA P to recognize L = {ww^R | w ∈ {a, b}*}.
Let P = ({q, p}, {a, b}, {a, b, S, Z}, δ, q, Z, {p}), where
(1) δ(q, a, e) = {(q, a)}
(2) δ(q, b, e) = {(q, b)}
(3) δ(q, e, e) = {(q, S)}
(4) δ(q, e, aSa) = {(q, S)}
(5) δ(q, e, bSb) = {(q, S)}
(6) δ(q, e, SZ) = {(p, e)}
With input aabbaa, P can make the following sequence of moves:
P operates by first storing a prefix of the input on the pushdown list. Then
a centermarker S is placed on top of the pushdown list. P then places the
next input symbol on the pushdown list and replaces aSa or bSb by S on
the list. P continues in this fashion until all of the input is used. If SZ then
remains on the pushdown list, P erases SZ and enters the final state. D
These rules cause the buffer in the finite control to fill up (i.e.,
contain m symbols).
(4) q_1 = [q0, Z0 Z1^(m-1)]. The buffer initially contains Z0 on top and m − 1
Z1's below. Z1's are used as a special marker for the bottom of the pushdown
list.
(5) F1 = {[q, α] | q ∈ F, α ∈ Γ1*}.
It is not difficult to show that
    (q, aw, X1 ⋯ Xk X(k+1) ⋯ Xn) ⊢ (r, w, Y1 ⋯ Yl X(k+1) ⋯ Xn)
if and only if ([q, α], aw, β) ⊢* ([r, α'], w, β'), where
(1) αβ = X1 ⋯ Xn Z1^m,
(2) α'β' = Y1 ⋯ Yl X(k+1) ⋯ Xn Z1^m,
(3) |α| = |α'| = m, and
(4) Between the two configurations of P1 shown is none whose state has
a second component (buffer) of length m. Direct examination of the rules
of P1 is sufficient.
Thus, (q0, w, Z0) ⊢* (q, e, α) for some q in F and α in Γ* if and only if
tWe shall usually make the set of final states ~ if the PDA is to accept by empty
pushdown list. Obviously, the set of final states could be anything we wished.
We can now use these results to show that the PDA languages are exactly
the context-free languages. In the following lemma we construct the natural
(nondeterministic) "top-down" parser for a context-free grammar.
LEMMA 2.24
Let G = (N, Σ, P, S) be a CFG. From G we can construct a PDA R
such that L_e(R) = L(G).
Proof. We shall construct R to simulate all leftmost derivations in G.
Let R = ({q}, Σ, N ∪ Σ, δ, q, S, ∅), where δ is defined as follows:
(1) If A → α is in P, then δ(q, e, A) contains (q, α).
(2) δ(q, a, a) = {(q, e)} for all a in Σ.
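This construction is mechanical, as the sketch below illustrates (our own representation: single-character symbols, right sides given as strings, and the resulting transition table in the same dictionary style as the simulation sketch earlier). Since R accepts by empty pushdown list, a simulator would test for an empty list rather than a final state.

def top_down_pda(productions, terminals, start):
    # Nondeterministic top-down recognizer of Lemma 2.24: one state q,
    # pushdown alphabet N union Sigma, acceptance by empty pushdown list.
    delta = {}
    for A, rhss in productions.items():
        for rhs in rhss:
            # rule (1): delta(q, e, A) contains (q, alpha)
            delta.setdefault(('q', '', A), set()).add(('q', rhs))
    for a in terminals:
        # rule (2): delta(q, a, a) = {(q, e)}
        delta[('q', a, a)] = {('q', '')}
    return delta, 'q', start, set()        # no final states

# G0:  E -> E+T | T,  T -> T*F | F,  F -> (E) | a
G0 = {'E': ['E+T', 'T'], 'T': ['T*F', 'F'], 'F': ['(E)', 'a']}
delta, q0, Z0, F = top_down_pda(G0, set('a+*()'), 'E')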
We now want to show that
wl
Now suppose that A ⇒^m w for some m > 1. The first step of this deriva-
tion must be of the form A ⇒ X1 X2 ⋯ Xk, where X_i ⇒^(m_i) x_i for some
m_i < m, 1 ≤ i ≤ k, and where x1 x2 ⋯ xk = w. Then
A ⇒* w.
For n = 1, w = e and A → e is in P. Let us assume that this statement is
true for all n' < n. Then the first move made by R must be of the form
    A ⇒ X1 ⋯ Xk
      ⇒* x1 X2 ⋯ Xk
      ⇒* x1 x2 ⋯ x(k-1) Xk
      ⇒* x1 x2 ⋯ xk = w
is a derivation of w from A in G.
Example 2.34
Let us construct a PDA P such that L_e(P) = L(G0), where G0 is our usual
grammar for arithmetic expressions. Let P = ({q}, Σ, Γ, δ, q, E, ∅), where δ
is defined as follows:
(1) δ(q, e, E) = {(q, E + T), (q, T)}.
(2) δ(q, e, T) = {(q, T * F), (q, F)}.
(3) δ(q, e, F) = {(q, (E)), (q, a)}.
(4) δ(q, b, b) = {(q, e)} for all b ∈ {a, +, *, (, )}.
With input a + a * a, P can make the following moves among others:
Notice that in this sequence of moves P has used the rules in a sequence that
corresponds to a leftmost derivation of a + a * a from E in G0. □
Example 2.35
Consider the grammar with the following productions"
S → Ac | Bd
A → aAb | ab
B → aBbb | abb
This grammar generates the language {a^n b^n c | n ≥ 1} ∪ {a^n b^(2n) d | n ≥ 1}.
Consider the right-sentential form aabbbbd. The only handle of this string
is abb, since aBbbd is a right-sentential form. Note that although ab is the
right side of the production A → ab, ab is not a handle of aabbbbd since
aAbbbd is not a right-sentential form. □
[Fig. 2.16 Handle pruning (three derivation trees, (a) through (c)) not
reproduced.]
with the top at the left, one can create a PDA doing exactly the same things,
but with the pushdown top at the right, by reversing all strings in Γ*. For
example, (p, VWX) ∈ δ(q, a, YZ) becomes (p, XWV) ∈ δ(q, a, ZY). Of
course, one must specify the fact that the top is now at the right. Conversely,
a PDA with top at the right can easily be converted to one with the top at
the left.
We see that the 7-tuple notation for PDA's can be interpreted as two
different PDA's, depending on whether the top is taken at the right or left.
We feel that the notational convenience which results from having these two
conventions outweighs any initial confusion. As the "default condition,"
unless it is specified otherwise, ordinary PDA's have their pushdown tops
on the left and extended PDA's have their pushdown tops on the right.
LEMMA 2.25
Let G = (N, Σ, P, S) be a CFG. From G we can construct an extended PDA
R such that L(R) = L(G).† R can "reasonably" be said to operate by handle
pruning.
Proof. Let R = ({q, r}, Σ, N ∪ Σ ∪ {$}, δ, q, $, {r}) be an extended PDA‡
in which δ is defined as follows:
(1) δ(q, a, e) = {(q, a)} for all a ∈ Σ. These moves cause input symbols
to be shifted on top of the pushdown list.
(2) If A → α is in P, then δ(q, e, α) contains (q, A).
(3) δ(q, e, $S) = {(r, e)}.
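The behavior of R can also be illustrated directly by a brute-force search over its configurations, as in the following sketch (ours; the pushdown list is a string with its top at the right, and single-character symbols are assumed). A shift appends the next input symbol, a reduction replaces a right side on top of the list by its left side, and acceptance occurs when the list is $S and the input is exhausted.

def shift_reduce_accepts(productions, start, w, limit=100000):
    # Brute-force simulation of the extended PDA of Lemma 2.25 (handle pruning).
    rhs_to_lhs = [(rhs, A) for A, rhss in productions.items() for rhs in rhss]
    seen, stack = set(), [(w, '$')]
    steps = 0
    while stack and steps < limit:
        steps += 1
        rest, pd = stack.pop()
        if (rest, pd) in seen:
            continue
        seen.add((rest, pd))
        if rest == '' and pd == '$' + start:
            return True
        if rest:                                    # rule (1): shift
            stack.append((rest[1:], pd + rest[0]))
        for rhs, A in rhs_to_lhs:                   # rule (2): reduce
            if pd.endswith(rhs):
                stack.append((rest, pd[:-len(rhs)] + A))
    return False

G0 = {'E': ['E+T', 'T'], 'T': ['T*F', 'F'], 'F': ['(E)', 'a']}
print(shift_reduce_accepts(G0, 'E', 'a+a*a'),
      shift_reduce_accepts(G0, 'E', 'a+'))          # True False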
We shall show that R operates by computing right-sentential forms of G,
starting with a string of all terminals (on R's input) and ending with the
string S. The inductive hypothesis which will be proved by induction on n is
(2.5.3) S ⇒*_rm αAy ⇒^n_rm xy implies (q, xy, $) ⊢* (q, y, $αA)
†Obviously, Lemma 2.25 is implied by Lemmas 2.23 and 2.24. It is the construction
that is of interest here.
‡Our convention puts pushdown tops on the right.
nonterminal. By (2.5.3), S ⇒*_rm γBzy ⇒^(n-1)_rm xy implies (q, xy, $) ⊢* (q, zy, $γB).
Also, (q, zy, $γB) ⊢* (q, y, $γBz) ⊢ (q, y, $αA) is a valid sequence of moves.
We conclude that (2.5.3) is true. Since (q, e, $S) ⊢ (r, e, e), we have
L(G) ⊆ L(R).
We must now show the following, in order to conclude that L(R) ⊆ L(G),
and hence, L(G) = L(R):
The basis, n = 0, holds vacuously. For the inductive step, assume that
(2.5.4) is true for all values of n < m. When the top symbol of the pushdown
list of R is a nonterminal, we know that the last move of R was caused by
rule (2) of the definition of ,3. Thus we can write
Example 2.36
Z ∈ Γ. We shall then show that [qZr] ⇒* w if and only if (q, w, Z) ⊢* (r, e, e).
Formally, let G = (N, Σ, P, S), where
(1) N = {[qZr] | q, r ∈ Q, Z ∈ Γ} ∪ {S}.
(2) The productions in P are constructed as follows:
(a) If δ(q, a, Z) contains (r, X1 ⋯ Xk), k ≥ 1, then add to P all
productions of the form
†R has its pushdown list top on the left, since we did not state otherwise.
and Z ∈ Γ, [qZr] ⇒+ w if and only if (q, w, Z) ⊢+ (r, e, e). We leave the proof
for the Exercises. Then, S ⇒ [q0 Z0 q] ⇒+ w if and only if (q0, w, Z0) ⊢+ (q, e, e)
for q in Q. Thus, L_e(R) = L(G). □
We can summarize these results in the following theorem.
THEOREM 2.21
The following statements are equivalent:
(1) L is L(G) for a CFG G.
(2) L is L(P) for a PDA P.
(3) L is L_e(P) for a PDA P.
(4) L is L(P) for an extended PDA P.
Proof. (3) → (1) by Lemma 2.26. (1) → (3) by Lemma 2.24. (4) → (2)
by Lemma 2.21, and (2) → (4) is trivial. (2) → (3) by Lemma 2.22 and
(3) → (2) by Lemma 2.23. □
Example 2.37
Let us construct a DPDA for the language L = {wcw^R | w ∈ {a, b}+}.
Let P = ({q0, q1, q2}, {a, b, c}, {Z, a, b}, δ, q0, Z, {q2}), where the rules of δ
are
Until P sees the centermarker c, it stores its input on the pushdown list.
When the c is reached, P goes to state q1 and proceeds to match its subsequent
input against the pushdown list. A proof that L(P) = L is left for the Exer-
cises. □
tlf the extended PDA has its pushdown list top at the left, replace "suffix" by "prefix."
e-moves without creating a shorter pushdown list; that list might grow
indefinitely or cycle between several different strings.
Note that there are nonlooping configurations which after popping part
of their list using e-moves enter a looping configuration. We shall show that
it is impossible to make an infinite number of e-moves from a configuration
unless a looping configuration is entered after a finite, calculable number of
moves.
If P enters a looping configuration in the middle of the input string, then
P will not use any more input, even though P might satisfy Lemma 2.27.
Given a D P D A P, we want to modify P to form an equivalent D P D A P'
such that P' can never enter a looping configuration.
ALGORITHM 2.16
Proof. We first prove that step (1) correctly determines C1 ∪ C2. If (q, A)
is in C1 ∪ C2, then, obviously, (q, e, A) ⊢* (r, e, α). Conversely, suppose
that (q, e, A) ⊢* (r, e, α).
Case 1: There exists β ∈ Γ*, with |β| > n1 n2 l, and (q, e, A) ⊢* (p, e, β)
⊢* (r, e, α) for some p ∈ Q. If we consider, for j = 1, 2, ..., n1 n2 l + 1,
the configurations which P entered in the sequence of moves (q, e, A) ⊢*
(p, e, β) the last time the pushdown list had length j, then we see that there
must exist q' and A' such that at two of those times the state of P was q' and
A' was on top of the list. In other words, we can write (q, e, A) ⊢* (q', e, A'δ)
⊢^m (q', e, A'γδ) ⊢* (p, e, β). Thus, (q, e, A) ⊢* (q', e, A'δ) ⊢* (q', e, A'γ^j δ) for
all j ≥ 0 by Lemma 2.20. Here, m > 0, so an infinity of e-moves can be made
from configuration (q, e, A), and (q, A) is in C1 ∪ C2.
Case 2: Suppose that the opposite of case 1 is true, namely that for all β
such that (q, e, A) ⊢* (p, e, β) ⊢* (r, e, α) we have |β| ≤ n1 n2 l. Since there
are n3 + 1 different β's, n1 possible states, and n2 + n2^2 + ⋯ + n2^(n1 n2 l)
= (n2^(n1 n2 l + 1) − n2)/(n2 − 1) possible pushdown lists of length at most n1 n2 l,
there must be some repeated configuration. It is immediate that (q, A) is
in C1 ∪ C2.
The proof that step (2) correctly apportions C1 ∪ C2 between C1 and C2
is left for the Exercises. □
DEFINITION
enters state r from p and stays in state r without altering the pushdown list.
Thus, L(P') = L(P).
It is necessary to show that P ' is continuing. Rules (3), (4), and (5) assure
us that no violation of the "continuing" condition occurs if P enters a looping
configuration. It is necessary to observe only that if P is in a configuration
which is not looping, then within a finite number of moves it must either
(1) Make a non-e-move or
(2) Enter a configuration which has a shorter pushdown list.
Moreover, (2) cannot occur indefinitely, because the pushdown list is
initially of finite length. Thus either (1) must eventually occur or P enters
a looping configuration after some instance of (2). We may conclude that
P ' is continuing. [~]
where δ(q, a, Z) = (p, γ), i = 0 if p ∉ F, and i = 1 if p ∈ F.
(ii) If q ∈ Q, Z ∈ Γ, and δ(q, e, Z) = (p, γ), then
EXERCISES
2.5.1. Construct PDA's accepting the complements (with respect to {a, b}*)
       of the following languages:
       (a) {a^n b^n a^n | n ≥ 1}.
       (b) {ww^R | w ∈ {a, b}*}.
       (c) {a^m b^n a^m b^n | m, n ≥ 1}.
       (d) {ww | w ∈ {a, b}*}.
       Hint: Have the nondeterministic PDA "guess" why its input is not in
       the language and check that its guess is correct.
2.5.2. Prove that the PDA of Example 2.32 accepts {ww^R | w ∈ {a, b}+}.
2.5.3. Show that every CFL is accepted by a PDA which never increases the
length of its pushdown list by more than one on a single move.
2.5.4. Show that every CFL is accepted by a PDA P = (Q, Σ, Γ, δ, q0, Z0, F)
       such that if (p, γ) is in δ(q, a, Z), then either γ = e, γ = Z, or γ = YZ
       for some Y ∈ Γ. Hint: Consider the construction of Lemma 2.21.
2.5.5. Show that every CFL is accepted by a PDA which makes no e-moves.
Hint: Recall that every CFL has a grammar in Greibach normal form.
2.5.6. Show that every CFL is L(P) for some two-state PDA P.
2.5.7. Complete the proof of Lemma 2.23.
2.5.8. Find bottom-up and top-down recognizers (PDA's) for the following
       grammars:
       (a) S → aSb | e.
       (b) S → AS | b
           A → SA | a.
       (c) S → SS | A
           A → 0A1 | S | 01.
2.5.9. Find a grammar generating L(P), where
       P = ({q0, q1, q2}, {a, b}, {Z0, A}, δ, q0, Z0, {q2})
       and δ is given by
Open Questions
2.5.22. Does there exist a language accepted by a 2PDA that is not accepted by
a 2DPDA ?
2.5.23. Does there exist a CFL which is not accepted by any 2DPDA ?
Programming Exercises
2.5.24. Write a program that simulates a deterministic PDA.
*2.5.25. Devise a programming language that can be used to specify pushdown
automata. Construct a compiler for your programming language. A
source program in the language is to define a PDA P. The object program
is to be a recognizer which given an input string w simulates the behavior
of P on w in some reasonable sense.
2.5.26. Write a program that takes as input CFG G and constructs a nondeter-
ministic top-down (or bottom-up) recognizer for G.
BIBLIOGRAPHIC NOTES
The importance of pushdown lists, or stacks, as they are also known, in lan-
guage processing was recognized by the early 1950's. Oettinger [1961] and
Schutzenberger [1963] were the first to formalize the concept of a pushdown auto-
maton. The equivalence of pushdown automaton languages and context-free
languages was demonstrated by Chomsky [1962] and Evey [1963].
Two-way pushdown automata have been studied by Hartmanis et al. [1965],
Gray et al. [1967], Aho et al. [1968], and Cook [1971].
DEFINITION
with the descendants of lg deleted, we see that A ⇒+ vAx, where v and x are
the frontiers from the descendant leaves of lr to the left and right, respec-
tively, of lg. Finally, let w be the frontier of the subtree dominated by lg.
Putting all these derivations together, we have S ⇒+ uAy ⇒+ uwy, and
for all i ≥ 1, S ⇒+ uAy ⇒+ uvAxy ⇒+ uv^2 A x^2 y ⇒+ ⋯ ⇒+ uv^i A x^i y ⇒+ uv^i w x^i y.
Thus condition (4) is satisfied. Moreover, u has at least one distinguished
position, the descendant of some direct descendant of 1~. v likewise
has at least one distinguished position, descending from 1r. Thus condition
(2) is satisfied. Condition (1) is satisfied, since w has a distinguished position,
namely np.
To see that condition (3), that v w x has no more than k distinguished
positions, is satisfied, we observe that b 1, being the 2m -t-- 3rd branch node
from the end of path n 1, . . . , np, has no more than k distinguished positions.
Since 1r is a descendant of b 1, our desired result is immediate.
We should also consider the alternative case in which at least m -+ 2 of
b l, . . . , b2,,+ 3 are right branch nodes. However, this case is handled symmet-
rically, and we shall find condition (2) satisfied because x and y each have
distinguished positions. [~
Let L be a CFL. Then there exists a constant k such that if |z| ≥ k and
z ∈ L, then we can write z = uvwxy such that vx ≠ e, |vwx| ≤ k, and for
all i, uv^i w x^i y is in L.
Proof. In Theorem 2.24, choose any CFG for L and let all positions of
each sentence be distinguished. □
It is the corollary to Theorem 2.24 that we most often use when proving
certain languages not to be context-free. Theorem 2.24 itself will be used
when we talk about inherent ambiguity of CFL's in Section 2.6.5.
Example 2.38
Let us use the pumping lemma to show that L = {a^(n^2) | n ≥ 1} is not a
CFL. If L were a CFL, then we would have an integer k such that if n^2 ≥ k,
then a^(n^2) = uvwxy, where v and x are not both e and |vwx| ≤ k. In particular,
let n be k itself. Certainly k^2 ≥ k. Then uv^2 w x^2 y is supposedly in L. But
since |vwx| ≤ k, we have 1 ≤ |vx| ≤ k, so k^2 < |uv^2 w x^2 y| ≤ k^2 + k.
But the next perfect square after k^2 is (k + 1)^2 = k^2 + 2k + 1. Since
k^2 + k < k^2 + 2k + 1, we see that |uv^2 w x^2 y| is not a perfect square. But
by the pumping lemma, uv^2 w x^2 y is in L, which is a contradiction. □
Example 2.39
Closure properties can often be used to help prove that certain languages
are not context-free, as well as being interesting from a theoretical point of
view. In this section we shall summarize some of the major closure properties
of the context-free languages.
DEFINITION
L' = {x1 x2 ⋯ xn | a1 a2 ⋯ an ∈ L,
         x1 ∈ L_(a1),
         x2 ∈ L_(a2),
         ⋮
         xn ∈ L_(an)}
is in ℒ.
Example 2.40
Let L = {0^n 1^n | n ≥ 1}, L_0 = {a}, and L_1 = {b^m c^m | m ≥ 1}. Then the sub-
stitution of L_0 and L_1 into L is
L' = {a^n b^(m1) c^(m1) b^(m2) c^(m2) ⋯ b^(mn) c^(mn) | n ≥ 1, m_i ≥ 1}.  □
THEOREM 2.25
The class of context-free languages is closed under substitution.
Proof. Let L ⊆ Σ* be a CFL, where Σ = {a1, a2, ..., an}. Let L_a ⊆ Σ_a*
be a CFL for each a in Σ. Call the language that results from the substitution
of the L_a's for a in L by the name L'. Let G = (N, Σ, P, S) be a CFG for L
and G_a = (N_a, Σ_a, P_a, a') be a CFG for L_a. We assume that N and all N_a are
mutually disjoint. Let G' = (N', Σ', P', S), where
(1) N' = ⋃_(a∈Σ) N_a ∪ N.
(2) Σ' = ⋃_(a∈Σ) Σ_a.
(3) Let h be the homomorphism on N ∪ Σ such that h(A) = A for all A in
N and h(a) = a' for a in Σ. Let P' = {A → h(α) | A → α is in P} ∪ ⋃_(a∈Σ) P_a.
Thus, P' consists of the productions of the G_a's together with the produc-
tions of G with all terminals made (primed) nonterminals. Let a1 ⋯ an be
in L and x_i in L_(a_i), for 1 ≤ i ≤ n. Then S ⇒* a1' ⋯ an' ⇒* x1 a2' ⋯ an' ⇒* ⋯
in G'.
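The construction of G' is easy to mechanize. The sketch below is our own illustration (grammars are dicts from nonterminals to lists of right sides; each substituted grammar is supplied with its own start symbol, which plays the role of a', and the nonterminal sets are assumed disjoint, as in the proof).

def substitution_grammar(G, start, terminals, Gs):
    # G' for substituting L_a = L(Gs[a][0]) into L(G), as in Theorem 2.25.
    # Gs maps a terminal a to (grammar, start symbol playing the role of a').
    def h(X):
        return Gs[X][1] if X in terminals else X     # h(a) = a', h(A) = A
    P = {A: [[h(X) for X in rhs] for rhs in rhss] for A, rhss in G.items()}
    for grammar, _ in Gs.values():
        for B, rhss in grammar.items():
            P.setdefault(B, []).extend(rhss)          # adjoin G_a's productions
    return P, start

# Example 2.40: L = {0^n 1^n}, L_0 = {a}, L_1 = {b^m c^m}.
G  = {'S': [['0', 'S', '1'], ['0', '1']]}
G0 = ({'Z': [['a']]}, 'Z')                            # Z plays the role of 0'
G1 = ({'W': [['b', 'W', 'c'], ['b', 'c']]}, 'W')      # W plays the role of 1'
P, S = substitution_grammar(G, 'S', {'0', '1'}, {'0': G0, '1': G1})
for A, rhss in P.items():
    print(A, '->', ' | '.join(' '.join(r) for r in rhss))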
There are many other operations under which the context-free languages
are closed. Some of these operations will be discussed in the Exercises. We
shall conclude this section by providing a few applications of closure prop-
erties in showing that certain sets are not context-free languages.
Example 2.41
L = {ww | w ∈ {a, b}+} is not a context-free language. Suppose that L
were context-free. Then L' = L ∩ a+b+a+b+ = {a^m b^n a^m b^n | m, n ≥ 1} would
also be context-free by Theorem 2.26. But from Exercise 2.6.3(e), we know
that L' is not a context-free language. □
Example 2.42
L = {ww | w ∈ {c, f}+} is not a context-free language. Let h be the homo-
morphism h(c) = a and h(f) = b. Then h(L) = {ww | w ∈ {a, b}+}, which by
the previous example is not a context-free language. Since the CFL's are
closed under homomorphism (corollary to Theorem 2.25), we conclude
that L is not a CFL. □
Example 2.43
A L G O L is not a context-free language. Consider the following class of
A L G O L programs:
Let LA be the set of all valid A L G O L programs. Let R be the regular set
denoted by the regular expression
We have already seen that the emptiness problem is decidable for context-
free grammars. Algorithm 2.7 will accept any context-free grammar G as
input and determine whether or not L(G) is empty.
Let us consider the membership problem for CFG's. We must find
an algorithm which given a context-free grammar G = (N, E, P, S) and
a word w in E*, will determine whether or not w is in L(G). Obtaining an
efficient algorithm for this problem will provide much of the subject matter
of Chapters 4-7. However, from a purely theoretical point of view we
can immediately conclude that the membership problem is solvable for
CFG's, since we can always transform G into an equivalent proper context-
free grammar G' using the transformations of Section 2.4.2. Neglecting
the empty word, a proper context-free grammar is a context-sensitive gram-
mar, so we can apply the brute force algorithm for deciding the membership
problem for context-sensitive grammars to G'. (See Exercise 2.1.19.)
Let us consider the equivalence problem for context-free grammars.
Unfortunately, here we encounter a problem which is not decidable. We
shall prove that there is no algorithm which, given any two CFG's G1 and G2,
can determine whether L(G1) = L(G2). In fact, we shall show that even given
a C F G G1 and a right-linear grammar G2 there is no algorithm to determine
whether L(Gi) = L(G2). As with most undecidable problems, we shall show
that if we can solve the equivalence problem for CFG's, then we can solve
Post's correspondence problem. We can construct from an instance of Post's
correspondence problem two naturally related context-free languages.
DEFINITION
Let C = (x 1, y l ) , . . . , (x,, y,) be an instance of Post's problem over
alphabet E. Let I = {1, 2 , . . . , n}, assume that I ~ E = .•, and let Lc
LEMMA 2.31
Example 2.44
We can easily show that the complement of L = {a^n b^n c^n | n ≥ 1} is a CFL.
A sentence w is in the complement if and only if one or more of the following hold:
(1) w is not in a+b+c+.
(2) w = a^i b^j c^k and i ≠ j.
(3) w = a^i b^j c^k and j ≠ k.
The set satisfying (1) is regular, and the sets satisfying (2) and (3) are each
context-free, as the reader can easily show by constructing nondeterministic
PDA's recognizing them. Since the CFL's are closed under union, the complement
of L is a CFL. But if the complement of L were a deterministic CFL, then L
would be likewise, by Theorem 2.23. But L is not even a CFL. □
2.6.5. Ambiguity
Example 2.45
Perhaps the most famous example of ambiguity in a programming lan-
guage is the dangling else. Consider the grammar G with productions
has two derivation trees as shown in Fig. 2.18. The derivation tree in Fig.
2.18(a) imposes the interpretation
[Fig. 2.18 Two derivation trees (for the dangling else) not reproduced.]
S → A | B
A → x_i A i | x_i i, for 1 ≤ i ≤ n
B → y_i B i | y_i i, for 1 ≤ i ≤ n
Example 2.46
Let us consider the grammar and language of the previous example.
The reason that grammar G is ambiguous is that an else can be associated
with two different then's. For this reason, programming languages which
allow both if-then-else and if-then statements can be ambiguous. This
ambiguity can be removed if we arbitrarily decide that an else should be
attached to the last preceding then, as in Fig. 2.18(b).
We can revise the grammar of Example 2.45 to have two nonterminals,
S1 and S2. We insist that S2 generate if-then-else, while S1 is free to generate
either kind of statement. The rules of the new grammar are
    S1 → if b then S1
    S1 → if b then S2 else S1
    S1 → a
    S2 → if b then S2 else S2
    S2 → a
The fact that only S2 precedes else ensures that between the then-else
pair generated by any one production must appear either the single symbol a
or another else. Thus the structure of Fig. 2.18(a) cannot occur. In Chapter 5
we shall develop deterministic parsing methods for various grammars,
including the current one, and shall be able at that time to prove our new
grammar to be unambiguous. □
often harder to parse than unambiguous ones, we shall mention some of the
more common constructs of this nature here so that they can be recognized
in practice.
A proper grammar containing the productions A → AA | α will be am-
biguous because the substring AAA has two parses:
[Two derivation trees for AAA not reproduced.]
A → AB | B
B → β
or the productions
A → BA | B
B → β
Example 2.47
Let L = {a^i b^j c^l | i = j or j = l}. L is an inherently ambiguous CFL.
Intuitively, the reason is that the words with i = j must be generated by
a set of productions different from those generating the words with j = l.
At least some of the words with i = j = l must be generated by both mecha-
nisms.
One CFG for L is
    S → AB | DC
    A → aA | e
    B → bBc | e
    C → cC | e
    D → aDb | e
other. Thus there exists a sentential form t1 A t2 B t3, where the t's are terminal
strings. For all i and j, t1 v^i w x^i t2 (v')^j w' (x')^j t3 would presumably be in L. But
|v| = |x| and |v'| = |x'|. Also, x and v' consist exclusively of b's, v consists
of a's, and x' consists of c's. Thus choosing i and j equal and sufficiently
large will ensure that the above word has more b's than a's or c's. We may
thus conclude that G is ambiguous and that L is inherently ambiguous. □
EXERCISES
2.6.1. Let L be a context-free language and R a regular set. Show that the
following languages are context-free:
(a) INIT(L).
(b) FIN(L).
(c) SUB(L).
(d) L/R.
(e) L ~ R.
The definitions of these operations are found in the Exercises of Sec-
tion 2.3 on p. 135.
2.6.2. Show that if L is a CFL and h a homomorphism, then h^(-1)(L) is a CFL.
Hint: Let P be a PDA accepting L. Construct P' to apply h to each of
its input symbols in turn, store the result in a buffer (in the finite con-
trol), and simulate P on the symbols in the buffer. Be sure that your
buffer is of finite length.
2.6.3. Show that the following are not CFL's:
       (a) {a^i b^i c^j | j ≤ i}.
       (b) {a^i b^j c^k | i < j < k}.
       (c) The set of strings with an equal number of a's, b's, and c's.
       (d) {a^i b^j a^j b^i | j ≠ i}.
       (e) {a^m b^n a^m b^n | m, n ≥ 1}.
       (f) {a^i b^j c^k | none of i, j, and k are equal}.
       (g) {nHa^n | n is a decimal integer ≥ 1}. (This construct is representative
       of FORTRAN Hollerith fields.)
**2.6.4. Show that every CFL over a one-symbol alphabet is regular. Hint:
Use the pumping lemma.
**2.6.5. Show that the following are not always CFL's when L is a CFL:
       (a) MAX(L).
       (b) MIN(L).
       (c) L^(1/2) = {x | for some y, xy is in L and |x| = |y|}.
*2.6.6. Show the following pumping lemma for linear languages. If L is a linear
       language, there is a constant k such that if z ∈ L and |z| ≥ k, then
       z = uvwxy, where |uvxy| ≤ k, vx ≠ e, and for all i, uv^i w x^i y is in L.
2.6.7. Show that {a^n b^n a^m b^m | n, m ≥ 1} is not a linear language.
*2.6.8. A one-turn PDA is one which in any sequence of moves first writes
       symbols on the pushdown list and then pops symbols from the push-
       down list. Once it starts popping symbols from the pushdown list, it
       can then never write on its pushdown list. Show that a CFL is linear
       if and only if it can be recognized by a one-turn PDA.
*2.6.9. Let G = (N, 2~, P, S) be a CFG, Show that the following are CFL's"
(a) {tg[S ==~ tg}.
Im
(b) { ~ l s ~ ~}.
rm
simulates P but keeps on each cell of its pushdown list the information,
" F o r what states p of M and q of P does there exist w that will take
M from state p to a final state and cause P to accept if started in state
q with this cell the top of the pushdown list ?" We must show that
there is but a finite amount of information for each cell and that P '
can keep track of it as the pushdown list grows and shrinks. Once we
know how to construct P', the four desired D P D A ' s are relatively easy
to construct.
2.6.21. Show that for deterministic CFL L and regular set R, the following
        may not be deterministic CFL's:
        (a) RL.
        (b) {x | x^R ∈ L}.
        (c) {x | for some y ∈ R, we have yx ∈ L}.
        (d) h(L), for homomorphism h.
2.6.22. Show that h-I(L) is a deterministic C F L if L is.
**2.6.23. Show that Qc u Pc is an inherently ambiguous C F L whenever it is
not empty.
**2.6.24. Show that it is undecidable whether a CFG G generates an inherently
        ambiguous language.
*2.6.25. Show that the grammar of Example 2.46 is unambiguous.
**2.6.26. Show that the language L1 ∪ L2, where L1 = {a^n b^n a^m b^m | m, n ≥ 1}
        and L2 = {a^n b^m a^m b^n | m, n ≥ 1}, is inherently ambiguous.
**2.6.27. Show that the CFG with productions S → aSbSc | aSb | bSc | d is
        ambiguous. Is the language inherently ambiguous?
*2.6.28. Show that it is decidable for a DPDA P whether L(P) has the prefix
        property. Is the prefix property decidable for an arbitrary CFL?
DEFINITION
A Dyck language is a CFL generated by a grammar G =
({S}, Σ, P, S), where Σ = {a1, ..., ak, b1, ..., bk} for some k ≥ 1 and
P consists of the productions S → SS | a1 S b1 | a2 S b2 | ⋯ | ak S bk | e.
**2.6.29. Show that given alphabet Σ, we can find an alphabet Σ', a Dyck lan-
        guage L_D ⊆ Σ'*, and a homomorphism h from Σ'* to Σ* such that for any
        CFL L ⊆ Σ* there is a regular set R such that h(L_D ∩ R) = L.
*2.6.30. Let L be a CFL and S(L) = {i | for some w ∈ L, we have |w| = i}.
        Show that S(L) is a finite union of arithmetic progressions.
DEFINITION
An n-vector is an n-tuple of nonnegative integers. If v1 = (a1, ..., an)
and v2 = (b1, ..., bn) are n-vectors and c a nonnegative integer, then
v1 + v2 = (a1 + b1, ..., an + bn) and cv1 = (ca1, ..., can). A set S
of n-vectors is linear if there are n-vectors v0, ..., vk such that
S = {v | v = v0 + c1 v1 + ⋯ + ck vk, for some nonnegative integers
S ~ SSIOSI le
Open Problem
2.6.37. Is it decidable, for D P D A ' s PI and P2, whether L(P1) -- L(P2)?
Research Problems
2.6.38. Develop methods for proving certain grammars to be unambiguous.
By Theorem 2.30 it is impossible to find a method that will work for
BIBLIOGRAPHIC NOTES
We shall not attempt to reference here all the numerous papers that have been
written on context-free languages. The works by Hopcroft and Ullman [1969],
Ginsburg [1966], Gross and Lentin [1970], and Book [1970] contain many of the
references on the theoretical developments of context-free languages.
Theorem 2.24, Ogden's lemma, is from Ogden [1968]. Bar-Hillel et al. [1961]
give several of the basic theorems about closure properties and decidability results
of CFL's. Ginsburg and Greibach [1966] give many of the basic properties of
deterministic CFL's.
Cantor [1962], Floyd [1962a], and Chomsky and Schutzenberger [1963] inde-
pendently discovered that it is undecidable whether a CFG is ambiguous. The
existence of inherently ambiguous CFL's was noted by Parikh [1966]. Inherently
ambiguous CFL's are treated in detail by Ginsburg [1966] and Hopcroft and
Ullman [1969].
The Exercises contain many results that appear in the literature. Exercise 2.6.19
is from Stearns [1967]. The constructions hinted at in Exercise 2.6.20 are given in
detail by Hopcroft and Ullman [1969]. Exercise 2.6.29 is proved by Ginsburg
[1966]. Exercise 2.6.31 is known as Parikh's theorem and was first given by Parikh
[1966]. Exercise 2.6.32 is from Salomaa [1969b]. Exercise 2.6.33 is from Chomsky
[1959a]. Exercise 2.6.36 is from Blattner [1972].
3  THEORY OF TRANSLATION
Example 3.1
A α alpha       N ν nu
B β beta        Ξ ξ xi
Γ γ gamma       O o omicron
Δ δ delta       Π π pi
E ε epsilon     P ρ rho
Z ζ zeta        Σ σ sigma
H η eta         T τ tau
Θ θ theta       Υ υ upsilon
I ι iota        Φ φ phi
K κ kappa       X χ chi
Λ λ lambda      Ψ ψ psi
M μ mu          Ω ω omega
]'The term "Polish" is used, as this notation was first described by the Polish mathe-
matician Lukasiewicz, whose name is significantly harder to pronounce than is "Polish."
Example 3.2
Consider the infix expression (a + b) * c. This expression is of the form E1 * E2, where E1 = (a + b) and E2 = c. Thus the prefix and postfix Polish expressions for E2 are both c. The prefix expression for E1 is the same as that for a + b, which is +ab. Thus the prefix expression for (a + b) * c is *+abc.
Similarly, the postfix expression for a + b is ab+, so the postfix expression for (a + b) * c is ab+c*. □
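The recursive rule used in this example translates directly into a program. The following sketch (an illustration in Python, not from the book) computes both Polish forms of an expression supplied as a nested (operator, left operand, right operand) tuple:

def prefix(e):
    if isinstance(e, str):
        return e                       # an operand is its own prefix form
    op, e1, e2 = e
    return op + prefix(e1) + prefix(e2)

def postfix(e):
    if isinstance(e, str):
        return e                       # an operand is its own postfix form
    op, e1, e2 = e
    return postfix(e1) + postfix(e2) + op

# (a + b) * c, written as a nested tuple
expr = ("*", ("+", "a", "b"), "c")
print(prefix(expr))    # *+abc
print(postfix(expr))   # ab+c*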
Example 3.3
Consider the following translation schema which defines the translation
{(x, x^R) | x ∈ {0, 1}*}. That is, for each input x, the output is x reversed. The rules defining this translation are
(1) S → 0S    S = S0
(2) S → 1S    S = S1
(3) S → e     S = e
Let us compute an output for the input 001. We begin with the translation form (S, S) and apply rule (1) to this form. To do so, we expand the first S using the production S → 0S. Then we replace the output sentential form S by S0 in accordance with the translation element S = S0. For the time being, we can think of the translation element simply as a production S → S0. Thus we obtain the translation form (0S, S0). We can expand each S in this new translation form by using rule (1) again to obtain (00S, S00). If we then apply rule (2), we obtain (001S, S100). If we then apply rule (3), we obtain (001, 100). No further rules can be applied to this translation form and thus (001, 100) is in the translation defined by this translation schema. □
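The three rules can be carried out mechanically while the input is scanned. The following sketch (an illustration, not from the book) builds the output sentential form exactly as in the example, with rule (3) finally erasing the remaining S:

def reverse_translation(x):
    out = ""                    # the part of the output form to the right of S
    for symbol in x:            # rule (1) for 0, rule (2) for 1
        out = symbol + out      # S is rewritten to S0 or to S1
    return out                  # rule (3): S -> e, e erases the remaining S

print(reverse_translation("001"))   # 100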
such that (α_0, β_0) = (S, S), (α_n, β_n) = (x, y), and each β_i is obtained by applying to β_{i-1} the translation element corresponding to the production used in going from α_{i-1} to α_i at the "corresponding" place. The string y is an output for x.
Often the output sentential forms can be created at the time the input is
being parsed.
Example 3.4
Consider the following translation scheme which maps arithmetic expressions of L(G0) to postfix Polish:
E → E + T    E = ET+
E → T        E = T
T → T * F    T = TF*
T → F        T = F
F → (E)      F = E
F → a        F = a
Let us determine the output for the input a + a * a. To do so, let us first find a leftmost derivation of a + a * a from E using the productions of the translation scheme. Then we compute the corresponding sequence of translation forms as shown:
(E, E) ⇒ (E + T, ET+)
       ⇒ (T + T, TT+)
       ⇒ (F + T, FT+)
       ⇒ (a + T, aT+)
       ⇒ (a + T * F, aTF*+)
       ⇒ (a + F * F, aFF*+)
       ⇒ (a + a * F, aaF*+)
       ⇒ (a + a * a, aaa*+)
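Because the output form can be built while the input is parsed, the scheme above can be realized by one recursive routine per nonterminal. The following sketch (an illustration, not from the book) returns the postfix translation of a sentence of L(G0); the iterations inside E and T stand in for the left-recursive rules:

def E(s, i):
    out, i = T(s, i)
    while i < len(s) and s[i] == "+":
        right, i = T(s, i + 1)
        out += right + "+"          # E -> E + T,  E = ET+
    return out, i

def T(s, i):
    out, i = F(s, i)
    while i < len(s) and s[i] == "*":
        right, i = F(s, i + 1)
        out += right + "*"          # T -> T * F,  T = TF*
    return out, i

def F(s, i):
    if s[i] == "a":
        return "a", i + 1           # F -> a,  F = a
    assert s[i] == "("              # F -> ( E ),  F = E
    out, i = E(s, i + 1)
    assert s[i] == ")"
    return out, i + 1

print(E("a+a*a", 0)[0])             # aaa*+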
The translation schemata in Examples 3.3 and 3.4 are special cases of
an important class of translation schemata called syntax-directed translation
schemata.
DEFINITION
(1) (S, S) is a translation form, and the two S's are said to be associated.
(2) If (αAβ, α'Aβ') is a translation form, in which the two explicit instances of A are associated, and if A → γ, γ' is a rule in R, then (αγβ, α'γ'β') is a translation form. The nonterminals of γ and γ' are associated in the translation form exactly as they are associated in the rule. The nonterminals of α and β are associated with those of α' and β' in the new translation form exactly as in the old. The association will again be indicated by superscripts, when needed, and this association is an essential feature of the form.
If the forms (αAβ, α'Aβ') and (αγβ, α'γ'β'), together with their associations, are related as above, then we write (αAβ, α'Aβ') ⇒_T (αγβ, α'γ'β').
We use ⇒+_T, ⇒*_T, and ⇒^k_T to stand for the transitive closure, reflexive-transitive closure, and k-fold product of ⇒_T. As is customary, we shall drop the subscript T whenever possible.
The translation defined by T, denoted τ(T), is the set of pairs {(x, y) | (S, S) ⇒*_T (x, y), x ∈ Σ*, y ∈ Δ*}.
Example 3.5
Consider the SDTS T = ({S}, {a, +}, {a, +}, R, S), where R has the rules
DEFINITION
P = {A → α | A → α, β is in R},
Example 3.6
Let us consider the SDTS T = ({S, A}, {0, 1}, {a, b}, R, S), where R
consists of
"['Note that fl may not be uniquely determined from A and ~. If more than one rule
is applicable, the choice can be arbitrary.
[Figure 3.2: the derivation trees built for Example 3.6, showing the input 00111 and the output bbbaa; the diagrams cannot be reproduced here.]
We next apply step (2) to the first two direct descendants of the root.
Application of step (2) to the second of these descendants results in two more
calls of step (2). The resulting tree is shown in Fig. 3.2(c). Notice that
(00111, bbbaa) is in z(T).
The basis, one interior node, is trivial. All direct descendants are leaves, and there must be a rule A → x, y in R.
For the inductive step, assume that statement (3.1.1) holds for smaller trees, and let the root of E have direct descendants with labels X1, ..., Xk. Then x = x_1 ... x_k, where X_j ⇒* x_j in the input grammar, 1 ≤ j ≤ k. Let the direct descendants of the root of E' have labels Y1 ... Yl. Then y = y_1 ... y_l, where Y_j ⇒* y_j in the output grammar, 1 ≤ j ≤ l. Also, there is a rule A → X1 ... Xk, Y1 ... Yl in R.
If X_j is a nonterminal, then it is associated with some Y_{p_j}, where X_j = Y_{p_j}. By the inductive hypothesis (3.1.1), (X_j, X_j) ⇒* (x_j, y_{p_j}). Because of the permutation of nodes in step (2b), we know that the jth symbol of the output frontier obtained so far is
(a) y_j if Y_j is in N and is associated with one of X1, ..., Xm, and
(b) Y_j otherwise.
Thus Eq. (3.1.1) follows.
Part (2) of the theorem is a special case of the following statement:
(3.1.2) If (A, A) ⇒^i (x, y), then there is a derivation tree D in G_i, with root labeled A, frontier x, and a sequence of choices in step (2b) so that the application of step (2) to D gives a tree with frontier y.
We comment that the order in which step (2) of Algorithm 3.1 is applied
to nodes is unimportant. We could choose any order that considered each
interior node exactly once. This statement is also left for the Exercises.
DEFINITION
Example 3.7
The following simple SDTS maps the arithmetic expressions in L(Go)
to arithmetic expressions with no redundant parentheses:
†Note that the underlying grammar is ambiguous, but that each input word has exactly one output.
[Figure 3.3: a finite transducer, consisting of a read-only input tape a_1 a_2 ... a_n, a finite control, and a write-only output tape.]
an input x, the input string x must take M from an initial state to a final
state.
Example 3.8
Let us design a finite transducer which recognizes arithmetic expressions
generated by the productions
S → a + S | a − S | + S | − S | a
and removes redundant unary operators from these expressions. For exam-
ple, we would translate - - a - - 4 - - a ~ a into - - a - a-q-a. In this
language, a represents an identifier, and an arbitrary sequence of unary +'s
and --'s is permitted in front of an identifier. Notice that the input language
is a regular set. Let M = (Q, Σ, Δ, δ, q0, F), where
(1) Q = {q0, q1, q2, q3, q4}.
(2) Σ = {a, +, −}.
(3) Δ = Σ.
(4) δ is defined by the transition graph of Fig. 3.4. A label x/y on an edge directed from the node labeled q_i to the node labeled q_j indicates that δ(q_i, x) contains (q_j, y).
(5) F = {q1}.
M starts in state q0 and determines whether there are an odd or even
[Figure 3.4: transition graph for M; its edges carry labels such as +/e and −/e.]
(q2, a , - a- a)
1- (ql, e, - - a -- a + a)
Example 3.9
Let M = ({q0, q1}, {a}, {b}, δ, q0, {q1}) and let δ(q0, a) = {(q1, b)} and δ(q1, e) = {(q1, b)}. Then
(q0, a, e) ⊢ (q1, e, b) ⊢^i (q1, e, b^{i+1})
is a valid sequence of moves for all i ≥ 0. Thus, τ(M) = {(a, b^i) | i ≥ 1}. □
Example 3.10
The language generated by the following grammar G is not regular:
DEFINITION
We can also define extended PDT's with their pushdown list top on the
right in a way analogous to extended PDA's.
Example 3.11
Let P be the pushdown transducer
DEFINITION
If P = (Q, Σ, Γ, Δ, δ, q0, Z0, F) is a pushdown transducer, then the pushdown automaton (Q, Σ, Γ, δ', q0, Z0, F), where δ'(q, a, Z) contains (r, γ) if and only if δ(q, a, Z) contains (r, γ, y) for some y, is called the PDA underlying P.
We say that the PDT P = (Q, Σ, Γ, Δ, δ, q0, Z0, F) is deterministic (a DPDT) when
(1) For all q ∈ Q, a ∈ Σ ∪ {e}, and Z ∈ Γ, δ(q, a, Z) contains at most one element, and
(2) If δ(q, e, Z) ≠ ∅, then δ(q, a, Z) = ∅ for all a ∈ Σ.†
Clearly, if L is the domain of τ(P) for some pushdown transducer P, then L = L(P'), where P' is the pushdown automaton underlying P.
†Note that this definition is slightly stronger than saying that the underlying PDA is deterministic. The latter could be deterministic, but (1) may not hold because the PDT can give two different outputs on two moves which are otherwise identical. Also note that condition (2) implies that if δ(q, a, Z) ≠ ∅ for some a ∈ Σ, then δ(q, e, Z) = ∅.
Many of the results proved in Section 2.5 for pushdown automata carry
over naturally to pushdown transducers. In particular, the following lemma
can be shown in a way analogous to Lemmas 2.22 and 2.23.
LEMMA 3.1
A translation T is τ(P1) for a pushdown transducer P1 if and only if T is τ_e(P2) for a pushdown transducer P2.
Proof Exercise.
A pushdown transducer, particularly a deterministic pushdown trans-
ducer, is a useful model of the syntactic analysis phase of compiling. In Sec-
tion 3.4 we shall use the pushdown transducer in this phase of compiling.
Now we shall prove that a translation is a simple SDT if and only if it
can be defined by a pushdown transducer. Thus the pushdown transducers
characterize the class of simple SDT's in the same manner that pushdown
automata characterize the context-free languages.
LEMMA 3.2
Let T = (N, Σ, Δ, R, S) be a simple SDTS. Then there is a pushdown transducer P such that τ_e(P) = τ(T).
Proof. Let G_i be the input grammar of T. We construct P to recognize L(G_i) top-down as in Lemma 2.24.
To simulate a rule A → α, β of T, P will replace A on top of its pushdown list by α with the output symbols of β intermeshed. That is, if α = x0A1x1 ... Anxn and β = y0A1y1 ... Anyn, then P will place x0y0A1x1y1 ... Anxnyn on its pushdown list. We need, however, to distinguish between the symbols of Σ and those of Δ, so that the word x_iy_i can be broken up correctly. If Σ and Δ are disjoint, there is no problem, but to take care of the general case, we define a new alphabet Δ' corresponding to Δ but known to be disjoint from Σ. That is, let Δ' consist of new symbols a' for each a ∈ Δ. Then Σ ∩ Δ' = ∅. Let h be the homomorphism defined by h(a) = a' for each a in Δ.
Let P = ({q}, Σ, N ∪ Σ ∪ Δ', Δ, δ, q, S, ∅), where δ is defined as follows:
(1) If A → x0B1x1 ... Bkxk, y0B1y1 ... Bkyk is a rule in R with k ≥ 0, then δ(q, e, A) contains (q, x0y0'B1x1y1' ... Bkxkyk', e), where y_i' = h(y_i), 0 ≤ i ≤ k.
(2) δ(q, a, a) = {(q, e, e)} for all a in Σ.
(3) δ(q, e, a') = {(q, e, a)} for all a in Δ.
By induction on m and n, we can show that, for A in N and m, n ≥ 1,
(3.1.3) (A, A) ⇒^m (x, y) for some m if and only if (q, x, A, e) ⊢^n (q, e, e, y) for some n
where the x_i's are in Σ* and the h(y_i)'s denote strings in (Δ')*, with the y_i's in Δ*. Then x0 must be a prefix of x, and the next moves of P remove x0 from the input and pushdown list and then emit y0. Let x' be the remaining input. There must be some prefix u1 of x' that causes the level holding B1 to be popped from the pushdown list. Let v1 be emitted up to the time the pushdown list first becomes shorter than |B1 ... Bkxkh(yk)|. Then (q, u1, B1, e) ⊢* (q, e, e, v1) by a sequence of fewer than n moves. By inductive hypothesis (3.1.3), (B1, B1) ⇒* (u1, v1).
Reasoning in this way, we find that we can write x as x0u1x1 ... ukxk and y as y0v1y1 ... vkyk so that (B_i, B_i) ⇒* (u_i, v_i) for 1 ≤ i ≤ k. Since rule A → x0B1x1 ... Bkxk, y0B1y1 ... Bkyk is clearly in R, we have (A, A) ⇒* (x, y).
As a special case of (3.1.3), we have (S, S) ⇒* (x, y) if and only if (q, x, S, e) ⊢* (q, e, e, y), so τ_e(P) = τ(T). □
Example 3.12
The simple SDTS T having rules
E → +EE, EE+
E → *EE, EE*
E → a, a
yields, by the construction of Lemma 3.2, the pushdown transducer
P = ({q}, {a, +, *}, {E, a, +, *, a', +', *'}, {a, +, *}, δ, q, E, ∅),
where δ is defined by
(1) δ(q, e, E) = {(q, +EE+', e), (q, *EE*', e), (q, aa', e)}
(2) δ(q, b, b) = {(q, e, e)} for all b in {a, +, *}
(3) δ(q, e, b') = {(q, e, b)} for all b in {a, +, *}.
This is a nondeterministic pushdown transducer. Example 3.11 gives an equivalent deterministic pushdown transducer. □
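The construction of Lemma 3.2 is easy to mechanize. The following sketch (an illustration, not from the book) builds the moves of P from a simple SDTS given as a list of rules; the rule set shown is the one assumed above for Example 3.12, which maps prefix Polish expressions over {a, +, *} to postfix Polish:

rules = [("E", "+EE", "EE+"),
         ("E", "*EE", "EE*"),
         ("E", "a",   "a")]
input_alphabet = {"a", "+", "*"}
output_alphabet = {"a", "+", "*"}

def split(s, nonterminals):
    """Split s into terminal strings x0, ..., xk and nonterminals B1, ..., Bk."""
    xs, bs, cur = [], [], ""
    for c in s:
        if c in nonterminals:
            xs.append(cur); bs.append(c); cur = ""
        else:
            cur += c
    xs.append(cur)
    return xs, bs

def build_pdt(rules):
    nonterminals = {A for A, _, _ in rules}
    prime = lambda y: "".join(c + "'" for c in y)   # the homomorphism h(a) = a'
    delta = []   # moves, written as (input, top of stack, string pushed, output)
    for A, alpha, beta in rules:
        xs, bs = split(alpha, nonterminals)
        ys, _ = split(beta, nonterminals)
        pushed = ""
        for j, B in enumerate(bs):
            pushed += xs[j] + prime(ys[j]) + B      # rule (1) of Lemma 3.2
        pushed += xs[-1] + prime(ys[-1])
        delta.append(("", A, pushed, ""))
    for a in input_alphabet:
        delta.append((a, a, "", ""))                # rule (2): match an input symbol
    for a in output_alphabet:
        delta.append(("", a + "'", "", a))          # rule (3): emit an output symbol
    return delta

for move in build_pdt(rules):
    print(move)        # the first move printed is ('', 'E', "+EE+'", '')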
LEMMA 3.3
Let P = (Q, Σ, Γ, Δ, δ, q0, Z0, F) be a pushdown transducer. Then there is a simple SDTS T such that τ(T) = τ_e(P).
Proof. The construction is similar to that of obtaining a CFG from a PDA. Let T = (N, Σ, Δ, R, S), where
(1) N = {[pAq] | p, q ∈ Q, A ∈ Γ} ∪ {S}.
(2) R is defined as follows:
(a) If δ(p, a, A) contains (r, X1X2 ... Xk, y), then if k > 0, R contains the rules
Thus we have (S, S) ⇒ ([q0Z0q], [q0Z0q]) ⇒* (x, y) if and only if (q0, x, Z0, e) ⊢* (q, e, e, y). Hence τ(T) = τ_e(P). □
Example 3.13
Using the construction in the previous lemma, let us build a simple SDTS from the pushdown transducer in Example 3.11. We obtain the SDTS T = (N, {a, +, *}, {a, +, *}, R, S), where N = {[qXq] | X ∈ {+, *, E}} ∪ {S} and where R has the rules
S → a, a
S → +SS, SS+
S → *SS, SS*     □
THEOREM 3.2
T is a simple SDT if and only if T is τ(P) for some pushdown transducer P.
Proof. Immediate from Lemmas 3.1, 3.2, and 3.3. □
In Chapter 9 we shall introduce a machine called the pushdown processor which is capable of defining all syntax-directed translations.
EXERCISES
3.1.1. An operator with one argument is called a unary operator, one with two arguments a binary operator, and in general an operator with n arguments is called an n-ary operator. For example, − can be either a unary operator (as in −a) or a binary operator (as in a − b). The degree of an operator is the number of arguments it takes. Let Θ be a set of operators each of whose degree is known and let Σ be a set of operands. Construct context-free grammars G1 and G2 to generate the prefix Polish and postfix Polish expressions over Θ and Σ.
"3.1.2. The "precedence" of infix operators determines the order in which
the operators are to be applied. If binary operator 0t "takes precedence
over" 02, then a02 b01 c is to be evaluated as a02 (b01 c). For
example, • takes precedence over + , so a + b • c means a + (b • c)
rather than (a + b) • c. Consider the Boolean operators ~ (equiva-
lence), ~ (implication), V (or), A (and), and --7 (not). These operators
are Iisted in order of increasing precedence. - 7 is a unary operator and
the others are binary. As an example, ---7 (a V b) _= - 7 a A --7 b has the
implied parenthesization (--a (a V b)) :- ((--~ a) A (--1 b)). Construct a
CFG which generates all valid Boolean expressions over these opera-
tors and operands a, b, c with no superfluous parentheses.
†Note that this comma separates the two parts of the rule.
〈var〉 → 〈id〉, 〈id〉
〈exp〉 → 〈id〉, 〈id〉
〈statement〉 → 〈var〉 ← 〈exp〉, 〈var〉 ← 〈exp〉
〈id〉 → a〈id〉, a〈id〉
〈id〉 → b〈id〉, b〈id〉
〈id〉 → a, a
〈id〉 → b, b
Why is this not an SDTS ? What should the output be for the following
input sentence:
or
A-----~x,y
(1) s0 = 1.
(2) If a_i is an m-ary operator, let s_i = s_{i-1} + m − 1.
(3) If a_i ∈ Σ, let s_i = s_{i-1} − 1.
Prove that a_1 ... a_n is a prefix expression if and only if s_n = 0 and s_i > 0 for all i < n.
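The counting test of Exercise 3.1.16 is easy to program. The following sketch (an illustration, not a solution given in the book) applies it to strings over operators of known degree; the degree table used below is a hypothetical example:

def is_prefix_expression(symbols, degree):
    s = 1                                   # s0 = 1
    for i, a in enumerate(symbols, 1):
        s += degree[a] - 1                  # operator of degree m: s_i = s_{i-1} + m - 1
        if s <= 0 and i < len(symbols):     # s_i must stay positive before the end
            return False
    return s == 0                           # and s_n must be 0

deg = {"+": 2, "*": 2, "-": 1, "a": 0, "b": 0, "c": 0}
print(is_prefix_expression("*+abc", deg))   # True  (prefix form of (a + b) * c)
print(is_prefix_expression("+ab*c", deg))   # False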
"3.1.17. Let a l . . . a , be a prefix expression in which a l is an m-ary operator.
Prove that the unique way to write a l . . . a , as a a w ~ ' . ' W m , where
wl . . . . ,Wm are prefix expressions, is to choose wj, 1 < j < m, so that
it ends with the first ak such that sk = m - - j .
"3.1.18. Show that every prefix expression with binary operators comes from
a unique infix expression with no redundant parentheses.
3.1.19. Restate and prove Exercises 3.1.16-3.1.18 for postfix expressions.
3.1.20. Complete the proof of Theorem 3.1.
"3.1.21. Prove that the order in which step (2) of Algorithm 3.1 is applied t o
nodes does not affect the resulting tree.
3.1.22. Prove Lemrna 3.1.
3.1.23. Give pushdown transducers for the simple SDT's defined by the trans-
lation schemata of Examples 3.5 and 3.7.
3.1.2,4. Construct a grammar for SNOBOL4 statements that reflects the
associativity and precedence of operators given in Appendix A.2.
3.1.25. Give an SDTS that defines the (empty store) translation of the following
PDT"
where ~ is given by
Open Problems
3.1.29. Is it decidable whether two deterministic finite transducers are equivalent?
3.1.30. Is it decidable whether two deterministic pushdown transducers are equivalent?
Research Problem
3.1.31. It is known to be undecidable whether two nondeterministic finite
transducers are equivalent (Exercise 3.1.28). Thus we cannot "mini-
mize" them in the same sense that we minimized finite automata in
Section 3.3.1. However, there are some techniques that can serve to
make the number of states smaller. Can you find a useful collection of
these ? The same can be attempted for PDT's.
BIBLIOGRAPHIC NOTES
DEFINITION
Example 3.14
The translation T = {(a^n, a^n) | n ≥ 1} is characterized by 0^+, since T = {(h1(w), h2(w)) | w ∈ 0^+}, where h1(0) = h2(0) = a. □
Example 3.15
The translation T = {(a^n, a^n) | n ≥ 1} is strongly characterized by L1 = {a^n b^n | n ≥ 1}. It is also strongly characterized by L2 = {w | w consists of an equal number of a's and b's}. The homomorphisms in each case are h1(a) = a, h1(b) = e and h2(a) = e, h2(b) = a. T is not strongly characterized by the language 0^+. □
LEMMA 3.4
It follows that (S, x, e) ⊢* (f, e, y) if and only if (S, S) ⇒* (x, y). The details are left for the Exercises. Thus, τ(T) = τ(M). □
THEOREM 3.3
Thus, (q0, x, e) ⊢* (q, e, y), with q in F, if and only if q0 ⇒* wq ⇒ w, where h1(w) = x and h2(w) = y. Hence, T = {(h1(w), h2(w)) | w ∈ L(G)}. Thus, L(G) strongly characterizes T. □
COROLLARY
In much the same fashion we can show an analogous result for simple
syntax-directed translations.
THEOREM 3.4
be a rule in R.
A straightforward induction on n shows that
(1) If A ⇒^n w in G1, then (A, A) ⇒^n (h1(w), h2(w)) in T1.
(2) If (A, A) ⇒^n (x, y) in T1, then there is some w such that A ⇒^n w in G1, h1(w) = x, and h2(w) = y.
Thus, τ(T1) = T.
Only if: Let T = τ(T2), where T2 = (N, Σ, Δ, R, S), and let Δ' = {a' | a ∈ Δ} be an alphabet of new symbols. Construct CFG G2 = (N, Σ ∪ Δ', P, S), where P contains production A → x0y0'B1x1y1' ... Bkxkyk' for each rule A → x0B1x1 ... Bkxk, y0B1y1 ... Bkyk in R; y_i' is y_i with each symbol a ∈ Δ replaced by a'. Let h1 and h2 be the obvious homomorphisms, h1(a) = a for a ∈ Σ, h1(a') = e for a' ∈ Δ', h2(a) = e for a ∈ Σ, and h2(a') = a for a ∈ Δ. Again it is elementary to prove by induction that
(1) If A ⇒^n w in G2, then (A, A) ⇒^n (h1(w), h2(w)) in T2.
(2) If (A, A) ⇒^n (x, y) in T2, then for some w, we have A ⇒^n w in G2, h1(w) = x, and h2(w) = y.
COROLLARY
We can use Theorems 3.3 and 3.4 to show that certain translations are not
regular translations or not simple SDT's. It is easy to show that the domain
and range of every simple SDT is a CFL. But there are simple syntax-directed
translations whose domain and range are regular sets but which cannot be
specified by any finite transducer or even pushdown transducer.
Example 3.16
Consider the simple SDTS T with rules
S → 0S, S0
S → 1S, S1
S → e, e
τ(T) = {(w, w^R) | w ∈ {0, 1}*}. We shall show that τ(T) is not a regular translation.
Suppose that τ(T) is a regular translation. Then there is some regular language L which strongly characterizes τ(T). We can assume without loss of generality that L ⊆ {0, 1, a, b}*, and that the two homomorphisms involved are h1(0) = 0, h1(1) = 1, h1(a) = h1(b) = e and h2(0) = h2(1) = e, h2(a) = 0, h2(b) = 1.
If L is regular, it is accepted by a finite automaton M with s states for some s. There must be some z ∈ L such that h1(z) = 0^s 1^s and h2(z) = 1^s 0^s. This is because (0^s 1^s, 1^s 0^s) ∈ τ(T). All 0's precede all 1's in z, and all b's precede all a's. Thus the first s symbols of z are only 0's and b's. If we consider the states entered by M when reading the first s symbols of z, we see that these cannot all be different; we can write z = uvw such that (q0, z) ⊢* (q, vw) ⊢+ (q, w) ⊢* (p, e), where |uv| ≤ s, |v| ≥ 1, and p ∈ F. Then uvvw is in L. But h1(uvvw) = 0^{s+m} 1^s and h2(uvvw) = 1^{s+n} 0^s, where not both m and n are zero. Thus, (0^{s+m} 1^s, 1^{s+n} 0^s) ∈ τ(T), a contradiction. We conclude that τ(T) is not a regular translation. □
Example 3.17
Consider the SDTS T with the rules
S → A^(1)cA^(2), A^(2)cA^(1)
A → 0A, 0A
A → 1A, 1A
A → e, e
Here τ(T) = {(ucv, vcu) | u, v ∈ {0, 1}*}. We shall show that τ(T) is not a simple SDT.
Suppose that L is a CFL which strongly characterizes τ(T). We can suppose that Δ' = {c', 0', 1'}, L ⊆ ({0, 1, c} ∪ Δ')*, and that h1 and h2 are the obvious homomorphisms. For every u and v in {0, 1}*, there is a word z_uv in L such that h1(z_uv) = ucv and h2(z_uv) = vcu. We consider two cases, depending on whether c precedes or follows c' in certain of the z_uv's.
Case 1: For all u there is some v such that c precedes c' in z_uv. Let R be the regular set {0, 1, 0', 1'}*c{0, 1, 0', 1'}*c'{0, 1, 0', 1'}*. Then L ∩ R is a CFL, since the CFL's are closed under intersection with regular sets. Note that L ∩ R is the set of sentences in L in which c precedes c'. Let M be the finite transducer which, until it reads c, transmits 0's and 1's, while skipping over primed symbols. After reading c, M does nothing until it reaches c'. Subsequently, M prints 0 for 0' and 1 for 1', skipping over 0's and 1's. Then M(L ∩ R) is a CFL, since the CFL's are closed under finite transductions, and in this case M(L ∩ R) = {uu | u ∈ {0, 1}*}. The latter is not a CFL by Example 2.41.
Case 2: For some u there is no v such that c precedes c' in z_uv. Then for every v there is a u such that c' precedes c in z_uv. An argument similar to case 1 shows that if L were a CFL, then {vv | v ∈ {0, 1}*} would also be a CFL. We leave this argument for the Exercises.
We conclude that τ(T) is not strongly characterized by any context-free language and hence is not a simple SDT. □
Let 𝒯_r denote the class of regular translations, 𝒯_s the simple SDT's, and 𝒯 the SDT's. From these examples we have the following result.
THEOREM 3.5
𝒯_r ⊊ 𝒯_s ⊊ 𝒯.
Proof. 𝒯_s ⊆ 𝒯 is by definition. 𝒯_r ⊆ 𝒯_s is immediate when one realizes that a finite transducer is a special case of a PDT. Proper inclusion follows from Examples 3.16 and 3.17. □
(1) There exist no x ∈ Σ* and y ∈ Δ* such that (A, A) ⇒* (x, y), or
(2) For no α1 and α2 in (N ∪ Σ)* and β1 and β2 in (N ∪ Δ)* does
(S, S) ⇒* (α1Aα2, β1Aβ2).
LEMMA 3.6
Every SDT of order k is defined by an SDTS of order k with no useless
nonterminals.
Proof. Exercise analogous to Theorem 2.13. □
LEMMA 3.7
Every SDT T of order k ≥ 2 is defined by an SDTS T1 = (N, Σ, Δ, R, S), where if A → α, β is in R, then either
(1) α and β are in N*, or
(2) α is in Σ* and β in Δ*.
Moreover, T1 has no useless nonterminals.
Proof. Let T2 = (N', Σ, Δ, R', S) be an SDTS with no useless nonterminals such that τ(T2) = T. We construct R from R' as follows. Let A → x0B1x1 ... Bkxk, y0C1y1 ... Ckyk be a rule in R', with k > 0. Let π be the permutation on the set of integers 1 to k such that the nonterminal B_i is associated with the nonterminal C_{π(i)}. Introduce new nonterminals A', D1, ..., Dk and E0, ..., Ek, and replace the rule by
Since each D_i and E_i has only one rule, it is easy to see that the effect of all these new rules is exactly the same as the rule they replace. Rules in R' with no nonterminals on the right are placed directly in R. Let N be N' together with the new nonterminals. Then τ(T2) = τ(T1) and T1 satisfies the conditions of the lemma. □
LEMMA 3.8
𝒯_2 = 𝒯_3.
Proof. It suffices, by Lemma 3.7, to show how a rule of the form A → B1B2B3, C1C2C3 can be replaced by two rules with two nonterminals in each component of the right side. Let π be the permutation such that B_i is associated with C_{π(i)}. There are six possible values for π. In each case, we can introduce a new nonterminal D and replace the rule in question by two rules as shown in Fig. 3.6.
It is straightforward to check that the effect of the new rules is the same as the old in each case.
LEMMA 3.9
The permutation π_k is defined, for even k, by
π_k(i) = (k + i + 1)/2, if i is odd
π_k(i) = i/2, if i is even
Thus, π_4 is [3, 1, 4, 2] and π_6 is [4, 1, 5, 2, 6, 3]. Define π_k for k odd by
π_k(i) = (k + 1)/2, if i = 1
π_k(i) = k − i/2 + 1, if i is even
π_k(i) = (i − 1)/2, if i is odd and i ≠ 1
For example, if a1, a2, a3, and a4 are called a, b, c, and d, then
T4 = {(a^i b^j c^k d^l, c^k a^i d^l b^j) | i, j, k, l ≥ 0}
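The permutation π_k can be checked against the examples above. The following sketch (an illustration, not from the book; the closed form for odd k is reconstructed from the partly garbled text and should be taken as an assumption) reproduces π_4 and π_6:

def pi(k, i):
    if k % 2 == 0:                              # even k
        return (k + i + 1) // 2 if i % 2 == 1 else i // 2
    if i == 1:                                  # odd k
        return (k + 1) // 2
    return k - i // 2 + 1 if i % 2 == 0 else (i - 1) // 2

print([pi(4, i) for i in range(1, 5)])   # [3, 1, 4, 2]
print([pi(6, i) for i in range(1, 7)])   # [4, 1, 5, 2, 6, 3]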
Tk. We assume without loss of generality that T satisfies Lemma 3.9, and
hence Lemmas 3.6 and 3.7. We shall prove, by contradiction, that T cannot
exist.
DEFINITION
Let a_i, a_j, and a_l be distinct members of Σ_k. We say that a_j is between a_i and a_l if either
(1) i < j < l, or
(2) π_k(i) < π_k(j) < π_k(l).
Thus a symbol is formally between two others if it appears physically between them either in the domain or range of T_k.
LEMMA 3.12
Let A cover Σ_k, and let A → B1 ... Bm, C1 ... Cm be a rule satisfying Lemma 3.11. If B_i covers {a_r} and also covers {a_s}, and a_t is between a_r and a_s, then B_i covers {a_t}, and for no j ≠ i does B_j cover {a_t}.
Proof. Let us suppose that r < t < s. Suppose that B_j covers {a_t}, j ≠ i. There are two cases to consider, depending on whether j < i or j > i.
Case 1: j < i. Since in the underlying grammar of T, B_i derives a string with a_r in it and B_j derives a string with a_t in it, we have (A, A) ⇒* (x, y), where x has an instance of a_t preceding an instance of a_r. Then by Lemma 3.6, there exists such a sentence in the domain of T_k, which we know not to be the case.
Case 2: j > i. Allow B_i to derive a sentence with a_s in it, and we can similarly find a sentence in the domain of T_k with a_s preceding a_t.
By contradiction, we rule out the possibility that r < t < s. The only other possibility, that π_k(r) < π_k(t) < π_k(s), is handled similarly, reasoning about the range of T_k. Thus no B_j, j ≠ i, covers {a_t}. If some B_j covers a set containing a_t, then that B_j certainly covers {a_t}. Thus by Lemma 3.11, B_i covers some set containing a_t, and hence covers {a_t}. □
LEMMA 3.13
If A covers Σ_k, k ≥ 4, then there is some rule A → B1 ... Bm, C1 ... Cm and some i, 1 ≤ i ≤ m, such that B_i covers Σ_k.
Proof. We shall do the case in which k is even. The case of odd k is similar and will be left for the Exercises. Let A → B1 ... Bm, C1 ... Cm be a rule satisfying Lemma 3.11. Since m ≤ k − 1 by hypothesis about T, there must
be some B_i which covers two members of Σ_k; say B_i covers {a_r, a_s}, r ≠ s. Hence, B_i covers {a_r} and {a_s}, and by Lemma 3.12, if a_t is between a_r and a_s, then B_i covers {a_t} and no B_j, j ≠ i, covers {a_t}.
If we consider the range of T_k, we see that, should B_i cover {a_{k/2}} and {a_{k/2+1}}, then it covers {a} for all a ∈ Σ_k, and no other B_j covers any {a}. It will follow by Lemma 3.11 that B_i covers Σ_k. Reasoning further, if B_i covers {a_m} and {a_n}, where m ≤ k/2 and n > k/2, then consideration of the domain assures us that B_i covers {a_{k/2}} and {a_{k/2+1}}.
Thus, if one of r and s is equal to or less than k/2, while the other is greater than k/2, the desired result is immediate.
The other cases are that r ≤ k/2 and s ≤ k/2, or r > k/2 and s > k/2. But in the range, any distinct r and s, both equal to or less than k/2, have some a_t, t > k/2, between them. Likewise, if r > k/2 and s > k/2, we find some a_t, t ≤ k/2, between them. The lemma thus follows in any case. □
LEMMA 3.14
T_k is in 𝒯_k − 𝒯_{k−1}, for k ≥ 4.
Proof. Clearly, T_k is in 𝒯_k. It suffices to show that T, the hypothetical SDTS of order k − 1, does not exist. Since S certainly covers Σ_k, by Lemma 3.13 we can find a sequence of nonterminals A0, A1, ..., A_{#N} in N, where A0 = S and for 0 ≤ i < #N, there is a rule A_i → α_iA_{i+1}β_i, γ_iA_{i+1}δ_i. Moreover, for all i, A_i covers Σ_k. Not all the A's can be distinct, so we can find i and j, with i < j and A_i = A_j. By Lemma 3.6, we can find w1, ..., w10 so that for all p ≥ 0,
(S, S) ⇒* (w1w5^p A_i w6^p w2, w3w7^p A_i w8^p w4)
       ⇒* (w1w5^p w9 w6^p w2, w3w7^p w10 w8^p w4).
By Lemma 3.9(1) we can assume that not all of α_i, β_i, γ_i, and δ_i are e, and by Lemma 3.9(2) that not all of w5, w6, w7, and w8 are e.
For each a ∈ Σ_k, it must be that w5w6 and w7w8 have the same number of occurrences of a, or else there would be a pair in τ(T) not in T_k. Since A_i covers Σ_k, should w5, w6, w7, and w8 have any symbol but a1 or a_k, we could easily choose w9 to obtain a pair not in T_k. Hence there is an occurrence of a1 or a_k in w7 or w8. Since A_i covers Σ_k again, we could choose w10 to yield a pair not in T_k. We conclude that T does not exist, and that T_k is not in 𝒯_{k−1}. □
THEOREM 3.8
With the exception of k = 2, 𝒯_k is properly contained in 𝒯_{k+1} for k ≥ 1.
Proof. The case k = 1 is Lemma 3.5. The other cases are Lemma 3.14. □
An interesting practical consequence of Theorem 3.8 is that while it may be attractive to build a compiler writing system that assumes the underlying grammar to be in Chomsky normal form, such a system is not capable of performing every syntax-directed translation of which a more general system is capable. However, it is likely that a practically motivated SDT would at worst be in 𝒯_3 (and hence in 𝒯_2).
EXERCISES
"3.2.1. Let T be a SDT. Show that there is a constant c such that for each x in
the domain of T, there exists y such that (x, y) 6 T and [y I<_ c([ x[ ÷ 1).
*3.2.2 (a) Show that if T1 is a regular translation and Tz is an SDT, then
Ti o T2 = {(x, z)[ for some y, (x, y) ~ T1 and (y, z) ~ T2} is an SDT.I"
(b) Show that T1 o T2 is simple if T2 is.
3.2.3 (a) Show that if T is an SDT, then T-~ is an SDT.
(b) Show that T-1 is simple if T is.
*3.2.4 (a) Let T~ be a regular translation and T2 an SDT. Show that T2 o T~
is an SDT.
(b) Show that T2 o T~ is simple if T2 is.
3.2.5. Give strong characterizing languages for
(a) The SDT Example 3.5.
(b) The SDT of Example 3.7.
(c) The SDT of Example 3.12
3.2.6. Give characterizing languages for the SDT's of Exercise 3.2.5 which do
not strongly characterize them.
3.2.7. Complete the proof of Lemma 3.4.
3.2.8. Complete case 2 of Example 3.17.
3.2.9. Show that every simple SDT is defined by a simple SDTS with no useless
nonterminals.
3.2.10. Let T1 be a simple SDT and T2 a regular translation. Is T1 ∘ T2 always a simple SDT?
3.2.11. Prove Lemma 3.6.
†Often, this operation on translations, called composition, is written with the operands in the opposite order. That is, our definition above would be for T2 ∘ T1, not T1 ∘ T2. We shall nevertheless use the definition given here, for the sake of natural appearance.
A → aA, aA
A → e, e
B → bB, bB
B → e, e
C → cC, cC
C → e, e
D → dD, dD
D → e, e
and one other rule. Give the minimum order of τ(T) if that additional rule is
(a) S → ABCD, ABCD.
(b) S → ABCD, BCDA.
(c) S → ABCD, DBCA.
(d) S → ABCD, BDAC.
3.2.15. Show that if T is defined by a DPDT, then T is strongly characterized by
a deterministic context-free language.
3.2.16. Is the converse of Exercise 3.2.15 true?
3.2.17. Prove the corollaries to Theorems 3.3 and 3.4.
BIBLIOGRAPHIC NOTES
The concept of a characterizing language and the results of Sections 3.2.1 and
3.2.2 are from Aho and Ullman [1969b]. The results of Section 3.2.3 are from
Aho and Ullman [1969a].
Lexical analysis is the first phase of the compiling process. In this phase,
characters from the source program are read and collected into single logical
items called tokens. Lexical analysis is important in compilation for several
reasons. Perhaps most significant, replacing identifiers and constants in
a program by single tokens makes the representation of a program much
more convenient for later processing. Lexical analysis further reduces the
length of the representation of the program by removing irrelevant blanks
and comments from the representation of the source program. During sub-
sequent stages of compilation, the compiler may make several passes over
the internal representation of the program. Consequently, reducing the length
of this representation by lexical analysis can reduce the overall compilation
time.
In many situations the constructs we choose to isolate as tokens are some-
what arbitrary. For example, if a language allows complex number constants
of the form
The sets of allowable character strings that form the identifiers and other
tokens of programming languages are almost invariably regular sets. For
example, FORTRAN identifiers are described by "from one to six letters or
digits, beginning with a letter." This set is clearly regular and has the regular
expression
†Recall that we do not distinguish between a regular expression and the set it denotes if the distinction is clear.
Example 3.18
We can specify the FORTRAN identifiers by the following sequence of regular definitions:
〈letter〉 = A | B | ... | Z
〈digit〉 = 0 | 1 | ... | 9
〈identifier〉 = 〈letter〉(〈letter〉 | 〈digit〉)*5
〈digit〉 = 0 | 1 | ... | 9
〈sign〉 = + | − | e
〈integer〉 = 〈sign〉〈digit〉+
〈decimal〉 = 〈sign〉(〈digit〉* . 〈digit〉+ | 〈digit〉+ . 〈digit〉*)
〈constant〉 = 〈integer〉 | 〈decimal〉 | 〈decimal〉 E 〈integer〉
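These regular definitions can be rendered as ordinary regular expressions. The following sketch (an illustration, not from the book; the constant syntax below is a reconstruction of the definitions above, not an exact quotation of any language's rules) uses Python's re module:

import re

identifier = re.compile(r"[A-Z][A-Z0-9]{0,5}$")    # <letter>(<letter>|<digit>)*5
digits     = r"[0-9]"
sign       = r"(\+|-)?"                            # <sign> = + | - | e
integer    = rf"{sign}{digits}+"
decimal    = rf"{sign}({digits}*\.{digits}+|{digits}+\.{digits}*)"
constant   = re.compile(rf"({integer}|{decimal}|{decimal}E{integer})$")

print(bool(identifier.match("I125")))      # True
print(bool(identifier.match("TOOLONG7")))  # False: more than six characters
print(bool(constant.match("-12.5E+3")))    # True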
ALGORITHM 3.2
Construction of a nondeterministic finite automaton from an extended
regular expression.
(h) R0 is R1^{*n}. Do the same as in step (g), but in part (iii) F0 is defined as {[q, i] | q ∈ F1, 1 ≤ i ≤ n} instead.
THEOREM 3.9
We comment that in parts (g) and (h) of Algorithm 3.2 the second component of the state of M0 can be implemented efficiently in software as a counter, in many cases, even when the automaton is converted to a deterministic version. This is so because in many cases R1 has the prefix property, and a word in R1^{*n} can be broken into words in R1 trivially. For example, R1 might be 〈digit〉 as in Example 3.18, and all members of 〈digit〉 are of length 1.
Example 3.20
To apply case (g), we construct states [q4, i] and [q7, i], for 1 ≤ i ≤ 5. The final states are [q4, i], 1 ≤ i ≤ 5, and [q7, 1]. The last is also the initial state. We have a machine (Q5, Σ, δ5, [q7, 1], F5), where F5 is as above, and δ5([q7, 1], a) = {[q4, 1]}; δ5([q4, i], a) = {[q4, i + 1]}, for all a in Σ and i =
†That is, two states of a nondeterministic finite automaton can be identified if both are final or both are nonfinal and on each input they transfer to the same set of states. There are other conditions under which two states of a nondeterministic finite automaton can be identified, but this condition is all that is needed here.
1, 2, 3, 4. Thus states [q7, 2], ..., [q7, 5] are not accessible and do not have to appear in Q5. Hence, Q5 = F5.
To obtain the final automaton for 〈identifier〉 we use case (d). The resulting automaton is
M = ({q1, q2, [q4, 1], ..., [q4, 5]}, Σ, δ, q1, {q2, [q4, 1], ..., [q4, 5]}),
where δ is defined by
(1) δ(q1, α) = {q2} for all letters α.
(2) δ(q2, α) = {[q4, 1]} for all α in Σ.
(3) δ([q4, i], α) = {[q4, i + 1]} for all α in Σ and 1 ≤ i < 5.
Note that [q7, 1] is inaccessible and has been removed from M. Also, M is deterministic here, although it need not be in general.
The transition graph for this machine is shown in Fig. 3.7. □
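The counter implementation mentioned after Theorem 3.9 is visible in a direct simulation of M. The following sketch (an illustration, not from the book) keeps the second component of the states [q4, i] as an integer:

LETTERS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
DIGITS  = set("0123456789")

def is_identifier(w):
    """Accept one to six letters or digits, beginning with a letter."""
    if not w or w[0] not in LETTERS:
        return False                    # delta(q1, letter) = {q2}; anything else fails
    count = 0                           # q2 corresponds to count 0, [q4, i] to count i
    for c in w[1:]:
        if c not in LETTERS | DIGITS or count == 5:
            return False                # no transition is defined
        count += 1                      # delta([q4, i], c) = {[q4, i + 1]}
    return True                         # q2 and every [q4, i] are final

for w in ["X", "I125", "ALPHA7", "TOOLONG7", "7UP"]:
    print(w, is_identifier(w))          # the last two print False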
When the lexical analysis is direct, one must search for one of a large
number of tokens. The most efficient way is generally to search for these in
parallel, since the search often narrows quite quickly. Thus the model of
a direct lexical analyzer is many finite automata operating in parallel, or to be exact, one finite transducer simulating many automata and emitting a signal as to which of the automata has successfully recognized a string.
If we have a set of nondeterministic finite a u t o m a t a to simulate in parallel
and their state sets are disjoint, we can merge the state sets and next state
functions to create one nondeterministic finite automaton, which may be
converted to a deterministic one by Theorem 2.3. (The only nuance is that
the initial state of the deterministic automaton is the set of all initial states
of the components.) Thus it is more convenient to merge before converting
to a deterministic device than the other way round.
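The merging step and the subset construction can be sketched as follows (an illustration, not from the book; the two component NFAs are toy examples chosen here, not taken from the text):

def subset_construction(nfas, alphabet):
    """Each NFA is (states, delta, start, finals), delta: (state, symbol) -> set of states."""
    start = frozenset(nfa[2] for nfa in nfas)       # set of all component start states
    delta = {}
    for _states, d, _start, _finals in nfas:        # merge the next-state functions
        for key, targets in d.items():
            delta.setdefault(key, set()).update(targets)
    all_finals = set().union(*(nfa[3] for nfa in nfas))
    dstates, dtrans, dfinal, work = {start}, {}, set(), [start]
    while work:
        S = work.pop()
        if S & all_finals:
            dfinal.add(S)
        for a in alphabet:
            T = frozenset(q2 for q in S for q2 in delta.get((q, a), ()))
            dtrans[S, a] = T                        # the empty set acts as a dead state
            if T and T not in dstates:
                dstates.add(T); work.append(T)
    return dstates, dtrans, start, dfinal

# Toy components: one NFA accepting "ab", one accepting one or more a's.
nfa1 = ({"p0", "p1", "p2"}, {("p0", "a"): {"p1"}, ("p1", "b"): {"p2"}}, "p0", {"p2"})
nfa2 = ({"r0", "r1"}, {("r0", "a"): {"r1"}, ("r1", "a"): {"r1"}}, "r0", {"r1"})
dstates, dtrans, start, dfinal = subset_construction([nfa1, nfa2], "ab")
print(len(dstates), "deterministic states;", len(dfinal), "of them final")   # 4 ... 3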
The combined deterministic automaton can be considered to be a simple
kind of finite transducer. It emits the token name and, perhaps, information
that will locate the instance of the token. Each state of the combined automa-
ton represents states from various of the component automata. Apparently,
when the combined automaton enters a state which contains a final state
of one of the component automata, and no other states, it should stop and
emit the name of the token for that component automaton. However, matters
are often not that simple.
For example, if an identifier can be any string of characters except for
a keyword, it does not make for good practice to define an identifier by the
exact regular set, because it is complicated and requires many states. Instead,
one uses a simple definition for identifier (Example 3.18 is one such) and
leaves it to the combined automaton to make the right decision.
In this case, should the combined automaton enter a state which included
a final state for one of the keyword automata and a state of the automaton
for identifiers and the next input symbol (perhaps a blank or special sign)
indicated the end of the token, the keyword would take priority, and indi-
cation that the keyword was found would be emitted.
Example 3.21
Let us consider a somewhat abstract example. Suppose that identifiers
are composed of any string of the four symbols D, F, I, and O, followed by
a blank (b), except for the keywords DO and IF, which need not be followed
by a blank, but may not be followed immediately by any of the letters D,
F, I, or O.
The identifiers are recognized by the finite automaton of Fig. 3.8(a),
DO by that of Fig. 3.8(b), and IF by Fig. 3.8(c). (All automata here are deter-
ministic, although that need not be true in general, of course.)
The merged automaton is shown in Fig. 3.9. State q2 indicates that an identifier has been found. However, states {q1, q8} and {q1, q5} are ambiguous. They might indicate IF or DO, respectively, or they might just indicate the initial portion of some identifier, such as DOOF. To resolve the conflict, the lexical analyzer must look at an additional character. If a D, O, I, or F follows, we had the prefix of an identifier. If anything else, including a blank, follows (assume that there are more characters than the five mentioned), we enter new states, q9 or q10, and emit a signal to the effect that DO or IF, respectively, was detected, and that it ends one symbol previously. If we enter q2, we emit a signal saying that an identifier has been found, ending one symbol previously.
Since it is the output of the device, not the state, that is important, states q2, q9, and q10 can be identified and, in fact, will have no representation at all in the implementation. □
[Fig. 3.8 Automata for lexical analysis: (a) identifiers, (b) DO, (c) IF; diagrams not reproducible here.]
[Fig. 3.9: the merged automaton, with transitions on D, F, I, O and on all other characters; diagram not reproducible here.]
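The lookahead decision described in Example 3.21 can be programmed directly. The following sketch (an illustration, not from the book; the function name and token representation are choices made here) scans one token and gives the keyword automata priority when the lookahead ends the token:

def next_token(text, pos):
    """Return (token_kind, lexeme, new_pos) for the token starting at pos."""
    letters = set("DFIO")
    end = pos
    while end < len(text) and text[end] in letters:
        end += 1
    word = text[pos:end]
    # The keyword automata take priority when the next character ends the token.
    if word[:2] in ("DO", "IF") and (pos + 2 == len(text) or text[pos + 2] not in letters):
        return ("keyword", word[:2], pos + 2)
    if word and end < len(text) and text[end] == " ":   # identifiers need a trailing blank
        return ("identifier", word, end + 1)
    raise ValueError("no token at position %d" % pos)

print(next_token("DO ", 0))      # ('keyword', 'DO', 2)
print(next_token("DOOF ", 0))    # ('identifier', 'DOOF', 5)
print(next_token("IF+1", 0))     # ('keyword', 'IF', 2)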
very few next characters lead to the same state, it may be too expensive in space to allocate a full table for each such state. A reasonable compromise between time and space considerations, for many states, would be to use binary decisions to weed out those few characters that cause a transition to an unusual state.
EXERCISES
3.3.1. Give regular expressions for the following extended regular expressions:
(a) (a^{+3}b^{+3})^{*2}.
(b) (a | b)* − (ab)*.
(c) (aa | bb)^{*4} ∩ a(ab | ba)+b.
†Unlike Fig. 3.8(a), Fig. 3.9 does not permit the empty string to be an identifier.
Research Problem
3.3.8. Give an algorithm to choose an implementation for direct lexical ana-
lyzers. Your algorithm should be able to accept some indication of the
desired time-space trade off. You may not wish to implement the symbol-
by-symbol action of a finite automaton, but rather allow for the possibility
of other actions. For example, if many of the tokens were arithmetic
signs of length 1, and these had to be separated by blanks, as in
SNOBOL, it might be wise to separate out these tokens from others as
the first move of the lexical analyzer by checking whether the second
character was blank.
Programming Exercises
3.3.9. Construct a lexical analyzer for one of the programming languages given
in the Appendix. Give consideration to how the lexical analyzer will
recover from lexical errors, particularly misspellings.
3.3.10. Devise a programming language based on extended regular expressions.
Construct a compiler for this language. The object language program
should be an implementation of the lexical analyzer described by the
source program.
BIBLIOGRAPHIC NOTES
The AED RWORD (Read a WORD) system was the first major system to use
finite state machine techniques in the construction of lexical analyzers. Johnson
et al. [1968] provide an overview of this system.
An algorithm that constructs from a regular expression a machine language
program that simulates a corresponding nondeterministic finite automaton is given
by Thompson [1968]. This algorithm has been used as a pattern-matching mecha-
nism in a powerful text-editing language called QED.
A lexical analyzer should be designed to cope with lexical errors in its input.
Some examples of lexical errors are
(1) Substitution of an incorrect symbol for a correct symbol in a token.
(2) Insertion of an extra symbol in a token.
(3) Deletion of a symbol from a token.
(4) Transposition of a pair of adjacent symbols in a token.
Freeman [1964] and Morgan [1970] describe techniques which can be used to
detect and recover from errors of this nature. The Bibliographic Notes at the end
of Section 1.2 provide additional references to error detection and recovery in
compiling.
3.4. PARSING
We say that a sentence w in L(G) for some CFG G has been parsed when
we know one (or perhaps all) of its derivation trees. In a translator, this tree
may be "physically" constructed in the computer memory, but it is more
likely that its representation is more subtle. One can deduce the parse tree
by watching the steps taken by the syntax analyzer, although the connection
would hardly be obvious at first.
Fortunately, most compilers parse by simulating a PDA which is recog-
nizing the input either top-down or bottom-up (see Section 2.5). We shall
see that the ability of a PDA to parse top-down is associated with the ability
of a PDT to map input strings to their leftmost derivations. Bottom-up pars-
ing is similarly associated with mapping input strings to the reverse of their
rightmost derivations. We shall thus treat the parsing problem as that of
mapping strings to either leftmost or rightmost derivations. While there are
many other parsing strategies, these two definitions serve as the significant
benchmarks.
Some other parsing strategies are mentioned in various parts of the book.
In the Exercises at the end of Sections 3.4, 4.1, and 5.1 we shall discuss left-
corner parsing, a parsing method that is both top-down and bottom-up in
nature. In Section 6.2.1 of Chapter 6 we shall discuss generalized top-down
and bottom-up parsing.
DEFINITION
Example 3.22
Consider the grammar G0, where the productions are numbered as shown:
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → a
The left parse of the sentence a * (a + a) is 23465124646. The right parse of a * (a + a) is 64642641532.
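The left parse can be produced while the sentence is parsed. The following sketch (an illustration, not from the book) emits the production numbers of G0 in leftmost-derivation order, handling the left-recursive productions by iteration; it reproduces the left parse given above:

def parse_E(s, i):
    sub, i = parse_T(s, i)
    parses = [sub]
    while i < len(s) and s[i] == "+":
        sub, i = parse_T(s, i + 1)
        parses.append(sub)
    out = [1] * (len(parses) - 1) + [2]    # E => E + T (repeatedly), then E => T
    for p in parses:
        out += p                           # then expand each T, leftmost first
    return out, i

def parse_T(s, i):
    sub, i = parse_F(s, i)
    parses = [sub]
    while i < len(s) and s[i] == "*":
        sub, i = parse_F(s, i + 1)
        parses.append(sub)
    out = [3] * (len(parses) - 1) + [4]    # T => T * F (repeatedly), then T => F
    for p in parses:
        out += p
    return out, i

def parse_F(s, i):
    if s[i] == "a":
        return [6], i + 1                  # F -> a
    if s[i] == "(":
        inner, i = parse_E(s, i + 1)
        assert s[i] == ")"
        return [5] + inner, i + 1          # F -> ( E )
    raise ValueError("not a sentence of L(G0)")

parse, _ = parse_E("a*(a+a)", 0)
print("".join(map(str, parse)))            # 23465124646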
Example 3.23
Let G0 be the usual grammar with productions numbered as in Example 3.22. Then T_l = ({E, T, F}, {+, *, (, ), a}, {1, ..., 6}, R, E), where R consists of
E → E + T, 1ET
E → T, 2T
T → T * F, 3TF
T → F, 4F
F → (E), 5E
F → a, 6
The pair of derivation trees in Fig. 3.10 shows the translation defined for a * (a + a). □
The following theorem is left for the Exercises.
THEOREM 3.10
Let G = (N, Σ, P, S) be a CFG. Then τ(T_l) = {(w, π) | S ⇒^π w by a leftmost derivation}.
Proof. We can prove by induction that (A, A) ⇒* (w, π) in T_l if and only if
Note that M_l is almost, but not quite, the PDT that one obtains by Lemma 3.2 from the SDTS T_l.
Example 3.24
Let us construct a left parser for G0. Here
With the input string a + a * a, M_l can make the following sequence of moves, among others:
Let us now turn our attention to the right-parsing problem. Consider the rightmost derivation of a + a * a from E in G0:
E ⇒ E + T        (1)
  ⇒ E + T * F    (3)
  ⇒ E + T * a    (6)
  ⇒ E + F * a    (4)
  ⇒ E + a * a    (6)
  ⇒ T + a * a    (2)
  ⇒ F + a * a    (4)
  ⇒ a + a * a    (6)
THEOREM 3.12
Let G = (N, Σ, P, S) be a CFG. Then τ_e(M_r) = {(w, π^R) | S ⇒^π w by a rightmost derivation}.
Proof. The proof is similar to that of Lemma 2.25 and is left for the Exercises. □
Example 3.25
The right parser for G0 would be
where
δ(q, e, E + T) = {(q, E, 1)}
δ(q, e, T) = {(q, E, 2)}
δ(q, e, T * F) = {(q, T, 3)}
δ(q, e, F) = {(q, T, 4)}
δ(q, e, (E)) = {(q, F, 5)}
δ(q, e, a) = {(q, F, 6)}
δ(q, b, e) = {(q, b, e)} for all b in Σ
δ(q, e, $E) = {(q, e, e)}
Thus, M_r would produce the right parse 64264631 for the input string a + a * a. □
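The reductions of M_r can be imitated by a hand-coded shift-reduce routine. The following sketch (an illustration, not from the book; the one-symbol-lookahead reduce decisions were chosen by hand for G0 and are an assumption, not the general construction) emits production numbers in the order of its reductions, i.e., the right parse:

def right_parse(w):
    tokens = list(w) + ["$"]               # "$" marks the right end of the input
    stack, out, i = [], [], 0
    while True:
        la = tokens[i]                     # one symbol of lookahead
        if stack and stack[-1] == "a":
            stack[-1] = "F"; out.append(6)                     # reduce F -> a
        elif len(stack) >= 3 and stack[-3:] == ["(", "E", ")"]:
            del stack[-3:]; stack.append("F"); out.append(5)   # reduce F -> ( E )
        elif stack and stack[-1] == "F":
            if len(stack) >= 3 and stack[-3:-1] == ["T", "*"]:
                del stack[-3:]; stack.append("T"); out.append(3)   # T -> T * F
            else:
                stack[-1] = "T"; out.append(4)                     # T -> F
        elif stack and stack[-1] == "T" and la != "*":
            if len(stack) >= 3 and stack[-3:-1] == ["E", "+"]:
                del stack[-3:]; stack.append("E"); out.append(1)   # E -> E + T
            else:
                stack[-1] = "E"; out.append(2)                     # E -> T
        elif la == "$" and stack == ["E"]:
            return out                     # accept
        elif la != "$":
            stack.append(la); i += 1       # shift the next input symbol
        else:
            raise ValueError("input not in L(G0)")

print(right_parse("a+a*a"))      # [6, 4, 2, 6, 4, 6, 3, 1]
print(right_parse("a*(a+a)"))    # [6, 4, 6, 4, 2, 6, 4, 1, 5, 3, 2]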
In both cases we shall permit the DPDT to use an endmarker to delimit the
right end of the input string.
Note that all grammars are left- and right-parsable in an informal sense,
but it is determinism that is reflected in the formal definition.
We find that the classes of left- and right-parsable grammars are incom-
mensurate; that is, neither is a subset of the other. This is surprising in view
of Section 8.1, where we shall show that the LL grammars, those which can
be left-parsed deterministically in a natural way, are a subset of the LR
grammars, those which can be right-parsed deterministically in a natural way.
The following examples give grammars which are left- (right-) parsable but
not right- (left-) parsable.
Example 3.26
Let G1 be defined by
THEOREM 3.13
The classes of left- and right-parsable grammars are incommensurate.
Proof. By Examples 3.26 and 3.27. □
We can extend the leftmost and rightmost derivation notations to SDT's by saying that (α, β) ⇒^π_lm (γ, δ) if and only if (α, β) ⇒* (γ, δ) by a sequence of rules such that the leftmost nonterminal of α is replaced at each step and these rules, with translation elements deleted, form the sequence of productions π. We define ⇒^π_rm for SDT's analogously.
DEFINITION
LEMMA 3.15
Proof. Here the symbol c plays the role of a right endmarker. Suppose that with input w, P emitted some non-e string, say dx, where d = a or b. Let d̄ be the other of a and b, and consider the action of P with wd̄c as input. It must emit some string, but that string begins with d. Hence, P does not map wd̄c to d̄w^Rcwd̄, as demanded. Thus, P may not emit any output until the right endmarker c is reached. At that time, it has some string α_w on its pushdown list and is in state q_w.
Informally, α_w must be essentially w, in which case, by erasing α_w, P can emit w^R. But once P has erased α_w, P cannot then "remember" all of w in order to print it. A formal proof of the lemma draws upon the ideas outlined in Example 3.26. We shall sketch such a proof here.
Consider inputs of the form w = a^i. Then there are integers j and k, a state q, and strings α and β such that when the input is a^{j+nk}c, P will place αβ^n on its pushdown list and enter state q. Then P must erase the pushdown list down to α at or before the time it emits w^Rc. But since α is independent of w, it is no longer possible to emit w. □
THEOREM 3.15
There exists a simple SDTS T = (N, Σ, Δ, R, S) such that there is no DPDT P for which τ(P) = {(π^R, x) | (S, S) ⇒^π (w, x) by a rightmost derivation, for some w}.
Proof. Let T be defined by the rules
(1) S → Sa, aSa
(2) S → Sb, bSb
(3) S → c, c
Then the set of right parses of the underlying grammar G is 3(1 + 2)*. If we let h(1) = a and h(2) = b, then the desired τ(P) is {(3α, h(α)^R c h(α)) | α ∈ {1, 2}*}. If P existed, with or without a right endmarker, then we could easily construct a DPDT to define the translation {(wc, w^Rcw) | w ∈ {a, b}*}, in contradiction of Lemma 3.15. □
We conclude that both left parsing and right parsing are of interest, and
we shall study both in succeeding chapters. Another type of parsing which
embodies features of both top-down and bottom-up parsing is left-corner
parsing. Left-corner parsing will be treated in the Exercises.
DEFINITION
Let G1 = (N1, Σ, P1, S1) and G2 = (N2, Σ, P2, S2) be CFG's such that L(G1) = L(G2). We say that G2 left-covers G1 if there is a homomorphism h from P2 to P1 such that
(1) If S2 ⇒^π w by a leftmost derivation, then S1 ⇒^{h(π)} w by a leftmost derivation, and
(2) For all π such that S1 ⇒^π w by a leftmost derivation, there exists π' such that S2 ⇒^{π'} w by a leftmost derivation and h(π') = π.
Example 3.28
Let G1 be the grammar
(1) S → 0S1
(2) S → 01
and G2 be the following CNF grammar equivalent to G1:
(1) S → AB
(2) S → AC
(3) B → SC
(4) A → 0
(5) C → 1
We see that G2 left-covers G1 with the homomorphism h(1) = 1, h(2) = 2, and h(3) = h(4) = h(5) = e. For example,
G2 also right-covers G1, and in this case, the same h can be used. For example,
G1 does not left- or right-cover G2. Since both grammars are unambiguous, the mapping between parses is fixed. Thus a homomorphism g showing that G1 was a left cover would have to map 1^n 2 into (143)^n 24(5)^{n+1}, which can easily be shown to be impossible. □
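Applying the cover homomorphism to a parse is a simple string mapping. The following sketch (an illustration, not from the book; the particular G2 left parse of 0011 is computed here for the example, since the book's worked derivations are not reproduced above) maps a left parse of G2 to the corresponding left parse of G1:

# h(1) = 1, h(2) = 2, h(3) = h(4) = h(5) = e, as in Example 3.28
h = {1: "1", 2: "2", 3: "", 4: "", 5: ""}

def image(parse):
    return "".join(h[p] for p in parse)

# A left parse of 0011 in G2, and its image, which is a left parse of 0011 in G1.
print(image([1, 4, 3, 2, 4, 5, 5]))   # 12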
Example 3.29
The key step in the Chomsky normal form construction (Algorithm 2.12) is the replacement of a production A → X1 ... Xn, n > 2, by A → X1B1, B1 → X2B2, ..., B_{n−2} → X_{n−1}Xn. The resulting grammar can be shown to left-cover the original if we map production A → X1B1 to A → X1 ... Xn and each of the productions B1 → X2B2, ..., B_{n−2} → X_{n−1}Xn to the empty string. If we wish a right cover instead, we may replace A → X1 ... Xn by A → B1Xn, B1 → B2X_{n−1}, ..., B_{n−2} → X1X2. □
EXERCISES
Example 3.30
Figure 3.11 shows a parse tree for the sentence bbaaab generated
by the following grammar:
(1) S → AS      (2) S → BB
(3) A → bAA     (4) A → a
(5) B → b       (6) B → e
DEFINITION
Example 3.31
T_lc for the grammar of the previous example is
3.4.21. Prove that (w, π) is in τ(T_lc) if and only if π is a left-corner parse for w.
3.4.22. Show that for each CFG there is a (nondeterministic) PDT which maps
the sentences of the language to their left-corner parses.
3.4.23. Devise algorithms which will map a left-corner parse into (1) the
corresponding left parse and (2) the corresponding right parse and
conversely.
3.4.24. Show that if G3 left- (right-) covers G2 and G2 left- (right-) covers G1, then G3 left- (right-) covers G1.
3.4.25. Let G1 be a cycle-free grammar. Show that G1 is left- and right-covered by grammars with no single productions.
3.4.26. Show that every cycle-free grammar is left- and right-covered by gram-
mars in CNF.
*3.4.27. Show that not every CFG is covered by an e-free grammar.
3.4.28. Show that Algorithm 2.9, which eliminates useless symbols, produces
a grammar which left- and right-covers the original.
**3.4.29. Show that not every proper grammar is left- or right-covered by a grammar in GNF. Hint: Consider the grammar S → S0 | S1 | 0 | 1.
**3.4.30. Show that Exercise 3.4.29 still holds if the homomorphism in the
definition of cover is replaced by a finite transduction.
"3.4.31. Does Exercise 3.4.29 still hold if the homomorphism is replaced by a
pushdown transducer mapping ?
Research Problem
3.4.32. It would be nice if whenever G2 left- or right-covered G1, every SDTS with G1 as underlying grammar were equivalent to an SDTS with G2 as underlying grammar. Unfortunately, this is not so. Can you find the conditions relating G1 and G2 so that the SDT's with underlying grammar G1 are a subset of those with underlying grammar G2?
BIBLIOGRAPHIC NOTES
4 GENERAL PARSING METHODS
4.1.1. Simulation of a PDT
A PDA M = (Q, Σ, Γ, δ, q0, Z0, F) is halting if for each w in Σ*, there is a constant k_w such that if (q0, w, Z0) ⊢^m (q, x, γ), then m < k_w. A PDT is halting if its underlying PDA is halting.
[Fig. 4.1 Moves of parser: the tree of configurations C0 through C16 that T can reach from the initial configuration C0.]
C0 represents the initial configuration (q, aacbc, S, e). The rules of T show that two next configurations are possible from C0, namely C1 = (q, acbc, SbS, 1) and C2 = (q, acbc, S, 2). (The ordering here is arbitrary.) From C1, T can enter configurations C3 = (q, cbc, SbSbS, 11) and C4 = (q, cbc, SbS, 12). From C2, T can enter configurations C11 = (q, cbc, SbS, 21) and C15 = (q, cbc, S, 22). The remaining configurations are determined uniquely.
One way to determine all parses for the given input string is to determine all accepting configurations which are accessible from C0 in the tree of configurations. This can be done by tracing out all possible paths which begin at C0 and terminate in a configuration from which no next move is possible. We can assign an order in which the paths are tried by ordering the choices of next moves available to T for each combination of state, input symbol, and symbol on top of the pushdown list. For example, let us choose (q, SbS, 1) as the first choice and (q, S, 2) as the second choice of move whenever the rule δ(q, a, S) is applicable.
Let us now consider how all the accepting configurations of T can be determined: by systematically tracing out all possible sequences of moves of T. From C0 suppose that we make the first choice of next move to obtain C1. From C1 we again take the first choice to obtain C3. Continuing in this fashion we follow the sequence of configurations C0, C1, C3, C5, C6, C7. C7 represents the terminal configuration (q, e, bS, 1133), which is not an accepting configuration. To determine if there is another terminal configuration, we can "backtrack" up the tree until we encounter a configuration from which another choice of next move not yet considered is available. Thus we must be able to restore configuration C6 from C7. Going back to C6 from C7 can involve moving the input head back on the input, recovering what was previously on the pushdown list, and deleting any output symbols that were emitted in going from C6 to C7. Having restored C6, we must also have available to us the next choice of moves (if any). Since no alternate choices exist in C6, we continue backtracking to C5, and then C3 and C1.
From C1 we can then use the second choice of move for δ(q, a, S) and obtain configuration C4. We can then continue through configurations C8 and C9 to obtain C10 = (q, e, e, 1233), which happens to be an accepting configuration.
We can then emit the left parse 1233 as output. If we are interested in obtaining only one parse for the input we can halt at this point. However, if we are interested in all parses, we can proceed to backtrack to configuration C0 and then try all configurations accessible from C2. C14 represents another accepting configuration, (q, e, e, 2133).
We would then halt after all possible sequences of moves that T could
have made have been considered. If the input string had not been syntactically
well formed, then all possible move sequences would have to be considered.
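The exhaustive search just described can be written as a short recursive procedure. The following sketch (an illustration, not from the book) enumerates every left parse of aacbc for the grammar S → aSbS | aS | c, trying the alternates in the order (1), (2), (3) used above:

GRAMMAR = {"S": [("1", "aSbS"), ("2", "aS"), ("3", "c")]}

def parses(form, pos, w, parse, results):
    if not form:                                   # nothing left to expand or match
        if pos == len(w):
            results.append("".join(parse))         # an accepting configuration
        return
    head, rest = form[0], form[1:]
    if head in GRAMMAR:                            # expand the leftmost nonterminal
        for number, alternate in GRAMMAR[head]:
            parses(alternate + rest, pos, w, parse + [number], results)
    elif pos < len(w) and w[pos] == head:          # match a terminal and advance
        parses(rest, pos + 1, w, parse, results)
    # otherwise: a dead end, so backtrack

results = []
parses("S", 0, "aacbc", [], results)
print(results)                                     # ['1233', '2133']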
The name top-down parsing comes from the idea that we attempt to
produce a parse tree for the input string starting from the top (root) and
working down to the leaves. We begin by taking the given grammar and
numbering in some order the alternates for every nonterminal. That is, if
A → α1 | α2 | ... | α_k are all the A-productions in the grammar, we assign some ordering to the α_i's (the alternates for A).
For example, consider the grammar mentioned in the previous section.
The S-productions are
S → aSbS | aS | c
and let us use them in the order given. That is, aSbS will be the first alternate
for S, aS the second, and c the third. Let us assume that our input string is
aacbc. We shall use an input pointer which initially points at the leftmost
symbol of the input string.
Briefly stated, a top-down parser attempts to generate a derivation tree
for the input as follows. We begin with a tree containing one node labeled S.
That node is the initial active node. We then perform the following steps
recursively:
(a) (b)
I
c ¢
i I
¢
(c) (d)
the second input symbol, we advance the input pointer to the third input
symbol.
We then expand the leftmost S in Fig. 4.2(b), but this time we cannot use
either the first or second alternate because then the resulting left-sentential
form would not be consistent with the input string. Thus we must use the
third alternate to obtain Fig. 4.2(c). We can now advance the input pointer
from the third to the fourth and then to the fifth input symbol, since the next
two active symbols in the left-sentential form represented by Fig. 4.2(c) are
c and b.
We can expand the leftmost S in Fig. 4.2(c) using the third alternate for
S to obtain Fig. 4.2(d). (The first two alternates are again inconsistent with
the input.) The fifth terminal symbol is c, and thus we can advance the input
pointer one symbol to the left. (We assume that there is a marker to denote
the end of the input string.) However, there are more symbols generated by
Fig. 4.2(d), namely bS, than there are in the input string, so we now know
that we are on the wrong track in finding a correct parse for the input.
Recalling the pushdown parser of Section 4.1.1, we have at this point gone
through the sequence of configurations Co, C1, C3, C5, C6, C7. There is no
next move possible from C 7.
We must now find some other left-sentential form. We first see if there is
another alternate for the production used to obtain the tree of Fig. 4.2(d)
from the previous tree. There is none, since we used S ~ c to obtain Fig.
4.2(d) from Fig. 4.2(c). We then return to the tree of Fig. 4.2(c) and reset
the input pointer to position 3 on the input. We determine if there is another
alternate for the production used to obtain Fig. 4.2(c) from the previous tree.
Again there is none, since we used S - 4 c to obtain Fig. 4.2(c) from Fig.
4.2(b). We thus return to Fig. 4.2(b), resetting the input pointer to position 2.
We used the first alternate for S to obtain Fig. 4.2(b) from Fig. 4.2(a), so now
we try the second alternate and obtain the tree of Fig. 4.3(a).
We can now advance the input pointer to position 3, since the a generated
matches the a at position 2 in the input string. Now, we may use only the
third alternate to expand the leftmost S in Fig. 4.3(a) to obtain Fig. 4.3(b).
The input symbols at positions 3 and 4 are now matched, so we can advance
the input pointer to position 5. We can apply only the third alternate for S
in Fig. 4.3(b), and we obtain Fig. 4.3(c). The final input symbol is matched
with the rightmost symbol of Fig. 4.3(c). We thus know that Fig. 4.3(c) is
a valid parse for the input. At this point we can backtrack to continue look-
ing for other parses, or terminate.
Because our grammar is not left-recursive, we shall eventually exhaust all
possibilities by backtracking. That is, we would be at the root, and all alter-
nates for S would have been tried. At this point we can halt, and if we have
not found a parse, we can report that the input string is not syntactically
well formed.
288 GENERAL PARSING METHODS CHAP. 4
a S s
!
c
(a) (b)
a . S c
I
¢
(c)
Fig. 4.3 Further attempts at parsing.
d D(d)
1 1
2 2
3 5
4 26
5 677
6 458330
Fig. 4.4 Values of D(d).
D ( d ) grows very rapidly, faster than 2 2`-` for d > 3. (Also, see Exercise
4.1.4.) This growth is so huge that any grammar in which two productions
of this form need to be considered could not possibly be reasonably parsed
using this modification of the top-down parsing algorithm.
For these reasons the approach generally taken is to apply the top-down
parsing algorithm only to grammars that are free of left recursion.
Method.
(i) For each nonterminal A in N, order the alternates for A. Let A t be
the index for the ith alternate of A. For example, if A --~ t~litz2[.-. [~k
290 GENERAL PARSING METHODS CHAP. 4
are all the A-productions in P and we have ordered the alternates as shown,
then A 1 is the index for as, A2 is the index for 0~2, and so forth.
(2) A 4-tuple (s, i, ~, fl) will be used to denote a configuration of the
algorithm"
(a) s denotes the state of the algorithm.
(b) i represents the location of the input pointer. We assume that
the n + 1st "input symbol" is $, the right endmarker.
(c) 0~ represents the first pushdown list (L1).
(d) fl represents the second pushdown list (L2).
The top of 0c will be on the right and the top of fl will be on the left. L2 repre-
sents the "current" left-sentential form, the one which our expansion of
nonterminals has produced. Referring to our informal description of top-
down parsing in Section 4.1.1, the symbol on top of L2 is the symbol labeling
the active node of the derivation tree being generated. L1 represents the cur-
rent history of the choices of alternates made and the input symbols over
which the input head has shifted. The algorithm will be in one of three states
q, b, or t; q denotes normal operation, b denotes backtracking, and t is the
terminating state.
(3) The initial configuration of the algorithm is (q, 1, e, S$).
(4) There are six types of steps. These steps will be described in terms of
their effect on the configuration of the algorithm. The heart of the algorithm
is to compute successive configurations defined by a "goes to" relation, t--.
The notation (s, i, a, fl) ~ (s', i', a', fl') means that if the current configu-
ration is (s, i, ~, fl), then we are to go next into the configuration (s', i', ~', fl').
Unless otherwise stated, i can be any integer from 1 to n + 1, a a string in
(X w I)*, where I is the set of indices for the alternates, and fl a string in
(N U X)*. The six types of move are as follows:
(a) Tree expansion
We have reached the end of the input and have found a left-
sentential form which matches the input. We can recover the left
parse from a by applying the following homomorphism h to a:
h(a) -- e for all a in E; h(A,) = p, where p is the production num-
ber associated with the production A ~ ?, and ? is the ith alter-
nate for A.
(d) Unsuccessful match of input symbol and derived symbol
Example 4.1
Let us consider the operation of Algorithm 4.1 using the grammar G
with productions
(1) E > T-4- E
(2) E >T
(3) T - - - . F . T
(4) T >F
(5) F - >a
We shall now show that Algorithm 4.1 does indeed produce a left parse
for w according to G if one exists.
DEFINITION
A partial left parse is the sequence of productions used in a leftmost
derivation of a left-sententiai form. We say that a partial left parse is con-
sistent with the input string w if the associated left-sentential form is consis-
tent with w.
Let G = (N, X, P, S) be the non-left-recursive g r a m m a r of Example 4.1
and let w = a 1 - . . an be the input string. The sequence of consistent partial
left parses for w, n0, tel, z~2, . . . . zti, • • • is defined as follows:
(1) no is e and represents a derivation of S from S. (z~0 is not strictly
a parse.)
(2) n1 is the production n u m b e r for S ---~ ~, where ~z is the first alternate
for S.
(3) ztt is defined as follows: Suppose that S "'-~=~ xAy. Let fl be the lowest
n u m b e r e d alternate for A, if it exists, such that we can write xfl?-- xyO,
where ~ is either e or begins with a nonterminal and xy is a prefix of w.
Then n ~ - - n i _ i A k , where k is the n u m b e r of alternate ft. In this case we
call rt~ a continuation of n,._~. If, on the other hand, no such fl exists, or
S .... =~ x for some terminal string x, then let j be the largest integer less
than i - 1 such that the following conditions hold:
294 GENERALPARSING METHODS CHAP. 4
(a) Let S ~'=~ xB?, and let nj+ 1 be a continuation of nj, with alternate
ek replacing B in the last step of nj+ 1. Then there exists an alter-
nate e,, for B which follows ek in the order of alternates for B.
(b) We can write Xem? = xyO, where O is e or begins with a nonter-
minal; xy is a prefix of w. Then ni -- njB,,, where B m is the num-
ber of production B --, e~. In this case, we call nt a modification
of ~i- 1-
(c) 7r~ is undefined if (a) or (b) does not apply.
Example 4.2
For the g r a m m a r G of Example 4.1 and the input string a + a, the
sequence of consistent partial left parses is
e
1
13
14
145
1451
14513
14514
1452
14523
14524
145245
2
23
24
t i n fact, a stronger result is possible; i is linear in n. However, this result suffices for
the time being and will help prove the stronger result.
SEC. 4.1 BACKTRACK PARSIN6 295
COROLLARY
Let G = (N, ~, P, S) be a non-left-recursive grammar. Then there is
i
a constant c' such that if S ~ w B e and w ~ e, then [el <_ c'[ w 1.
lm
Proof. Referring to Fig. 4.5, we have shown that the path from the root to
n o is no longer than k(I w l + 2). Thus, [e[_~ kl(I w[ -q- 2). Choose c' = 3kl. E]
LEMMA 4.2
Let G = (N, Z , P , S) be a C F G with no useless nonterminals and
w = ala 2 . . . a, an input string in Z*. The sequence of consistent left parses
for w is finite if and only if G is not left-recursive.
296 GENERALPARSING METHODS CHAP. 4
DEFINITION
(q, I, e, S$) }----(q, j~, 71, ~1 $) t--- (q, J2, 7,2, fl~ $),
THEOREM 4.1
Algorithm 4.1 produces a left parse for w if one exists and otherwise
emits an error message.
Proof. F r o m Lemma 4.3 we see that the algorithm cycles through all
consistent partial left parses until either a left parse is found for the input
or all consistent partial left parses are exhausted. From Lemma 4.2 we know
that the number of partial left parses is finite, so the algorithm must even-
tually terminate. [[]
i= 1 -q-J1 + . . . -q-it
< 1 + (l- 1)ct + k/c~ I~1-gZc~
< kZc~ I~1-- (g -- 1)Zci
< kZcxl ~I -- jZc~.
COROLLARY 1
Let G = (N, ~, P, S) be a non-left-recursive grammar. Then there exists
i
a constant c' such that if S =-~ wAa and w ~ e, then i < c'[w I.
lm
Proof. By the corollary to Lemma 4.1, there is a constant c" such that
la [ ~ c"] w 1. By Lemma 4.4, i_~ c[ wAoc [. Since I wan[ ~ (2 -q- c") [w l, the
choice c' = c(2 -q-- c") yields the desired result. ~ .
COROLLARY 2
Let G = (N, ~, P, S) be a non-left-recursive grammar. Then there is
a constant k such that if 7t is a partial left parse consistent with sentence w,
and S " = ~ xe, where e is either e or begins with a nonterminal, then
I~1 _< k(Iwl + ~).
Proof. If x -~ e, then by Corollary 1, we have 17tl _~ c'lx 1. Certainly,
Ixt_<lwl, so l nl _< c'lwl. As an exercise, we can show that if x = e, then
l r~l _< c'. Thus, 17tl _< c'(lw[ + 1) in either case. [Z]
THEOREM 4.2
There is a constant c such that Algorithm 4.1, with input w of length
n ~ 1, uses no more than cn cells, if one cell only is needed for each symbol
on the two lists of the configurations.
Proof. Except possibly for the last expansion made, list L2 is part of
a left-sentential form a such that S ~=-~ 0¢, where 7~ is a partial left parse
consistent with w. By Corollary 2 to Lemma 4.4, I nl _< k(l w l + 1). Since
there is a bound on the length of the right side of any production, say/, we
know that i a l <_ kl(i w l -+- 1) < 2kl I w !. Thus the length of L2 is no greater
than 2kl l w I -q- l -- 1 < 3kl l w 1.
List L1 consists of part of the left-sentential form ~ (most or all of the
SEC. 4.1 BACKTRACK P A R S I N G 299
THEOREM 4.3
There is a constant c such that Algorithm 4.1, when its input w is of length
n ~> 1, makes no more than e" elementary operations, provided the calcu-
lation of one step of Algorithm 4.1 takes a constant number of elementary
operations.
Proof. By Corollary 2, every partial left parse consistent with w is of
length at most c~n for some ca. Thus there are at most e~ different partial
left parses consistent with w for some constant e z. Algorithm 4.1 computes
at most n configurations between configurations whose contents of L1
describe consecutive partial left parses. The total number of configurations
computed by Algorithm 4.1 is thus no more than nel. From the binomial
theorem the relation nc~ < (c 2 -+- 1)" is immediate. Choose c to be (c 2 + 1)m,
where m is the maximum number of elementary operations required to com-
pute one step of Algorithm 4.1. U
Theorem 4.3 is in a sense as strong as possible. That is, there are non-
left-recursive grammars which cause Algorithm 4.1 to spend an exponential
amount of time, because there are c" partial left parses consistent with some
words of length n.
Example 4.3
Let G = ({S}, {a, b}, P, S), where P consists of S --~ aSS l e. Let X(n) be
the number of different leftmost parses of a", and let Y(n) be the number of
partial left parses consistent with a". The following recurrence equations
define X(n) and Y(n):
X(0) = i
(4.1.2) n-I
X(n)-- Z X(i)X(n- 1--i)
i=0
Y(0)---- 2
(4.1.3) n-I
Y(n) = Y ( n - 11 + ~] X(i) Y ( n - 1 -- i)
i=0
Line (4.1.2) comes from the fact that every derivation for a sentence a"
with n ~ 1 begins with production S ---, aSS. The remaining n -- 1 a's can
be divided any way between the two S's. In line (4.1.3), the Y(n -- 1) term
corresponds to the possibility that after the first step S =~ aSS, the second
S is never rewritten; the summation corresponds to the possibility that
the first S derives a t for some i. The formula Y(0) -- 2 is from the observation
:300 GENERAL PARSING METHODS CHAP. 4
that the null derivation and the derivation S :::> e are consistent with string e.
From Exercise 2.4.29 we have
X(n)-- 1 (2~)
n+l
so Xfn) :> 2"-1. Thus
n-1
Y(n) > Y ( n - 1 ) + ~] 2'-1 Y ( n - 1 - - i )
i=0
FIRST k(a) -- [x t~ ::~ xfl and [xl = k or a =:~ x and i xl < k}.
lm
That is, FIRSTk(a ) consists of all terminal prefixes of length k (or less if a
derives a terminal string of length less than k) of the terminal strings that
can be derived from ~.
(2) We can look ahead at the next k input symbols to determine whether
a given alternate should be used. For example, we can tabulate, for each
alternate a, a lookahead set FIRSTk(a ). If no prefix of the remaining input
string is contained in FIRSTk(a), we can immediately reject a and try the
next alternate. This technique is very useful both when the given input is
in L(G) and when it is not in L(G). In Chapter 5 we shall see that for certain
classes of grammars the use of lookahead can entirely eliminate the need
for backtracking.
(3) We can add bookkeeping features which will allow faster backtrack-
ing. For example, if we know that the last m-productions applied have no
applicable next alternates, when failure occurs we can skip back directly to
the position where there is an applicable alternate.
(4) We can restrict the amount of backtracking that can be done. We
shall discuss parsing techniques of this nature in Chapter 6.
Another severe problem with backtrack parsing is its poor error-locating
capability. If an input string is not syntactically well formed, then a compiler
SEC. 4.1 BACKTRACKPARSING 301
should announce which input symbols are in error. Moreover, once one error
has been found, the compiler should recover from that error so that parsing
can resume in order to detect any additional errors that might occur.
If the input string is not syntactically well formed, then the backtracking
algorithm as formulated will merely announce error, leaving the input
pointer at the first input symbol. To obtain more detailed error information,
we can incorporate error productions into the grammar. Error productions
are used to generate strings containing common syntactic errors and would
make syntactically invalid strings well formed. The production numbers in
the output corresponding to these error productions can then be used to
signal the location of errors in the input string. However, from a practical
point of view, the parsing algorithms presented in Chapter 5 have better
error-announcing capabilities than backtracking algorithms with error pro-
ductions.
4.1.5. Bottom-Up Parsing
a b a
(a)
/~.b a/A~b
(b)
A B
a b a b a
(c)
Ajs
a/ ~b a/!~a (d)
Fig. 4.6 P a r t i a l p a r s e trees in b o t t o m -
up parse.
We shift a on the pushdown list and find that no reductions are possible.
We then backtrack to the last position at which we made a reduction, namely
where the pushdown list contained Aab (b is on top here) and we replaced
ab by A, i.e., when the partial tree was that of Fig. 4.6(a). Since no other
reduction is possible, we now shift instead of reducing. The pushdown list
now contains Aaba. We can then reduce aba to B, to obtain Fig. 4.6(c).
Next, we replace AB by S and thus have a complete tree, shown in Fig. 4.6(d).
This method can be viewed as considering all possible sequences of moves
of a nondeterministic right parser for a grammar. However, as with top-
down parsing, we must avoid situations in which the number of possible
moves is infinite.
One such pitfall occurs when a grammar has cycles, that is, derivations
+
ALGORITHM 4.2
Bottom-up backtrack parsing.
Input. C F G G = (N, E, P, S) with no cycles or e-productions, whose
productions are numbered 1 to p, and an input string w = alaz .. • a,, n ~ 1.
Output. One right parse for w if one exists. The output "error" otherwise.
Method.
(1) Order the productions arbitrarily.
(2) We shall couch our algorithm in the 4-tuple configurations similar
to those used in Algorithm 4.1. In a configuration (s, i, a, fl)
(a) s represents the state of the algorithm.
(b) i represents the current location of the input pointer. We assume
the n + 1st input symbol is $, the right endmarker.
(c) a represents a pushdown list L1 (whose top is on the right).
(d) fl represents a pushdown list L2 (whose top is on the left).
As before, the algorithm can be in one of three states q, b, or t. L1 will hold
a string of terminals and nonterminals that derives the portion of input to
the left of the input pointer. L2 will hold a history of the shifts and reductions
necessary to obtain the contents of L1 from the input.
(3) The initial configuration of the algorithm is (q, 1, $, e).
(4) The algorithm itself is as follows. We begin by trying to apply step 1.
Step 1" Attempt to reduce
provided A --~ fl is the jth production in P and fl is the first right side in
the linear ordering in (1) that is a suffix of aft. The production number is
written on L2. If step 1 applies, return to step 1. Otherwise go to step 2.
Step 2: Shift
(q, i, oc, r) [-- (q, i -k 1, oca~,s?)
h(s) = e
h(j) = j for all production numbers
if the jth production in P is A ----~fl and the next production in the ordering
of (1) whose right side is a suffix of ~fl is B ~ fl', numbered k. Note that
t~fl = t~'fl'. Go to step 1. (Here we have backtracked to the previous reduc-
tion, and we try the next alternative reduction.)
if the top entry on L2 is the shift symbol. (Here all alternatives at position i
have been exhausted, and the shift action must be undone. The input pointer
moves left, the terminal symbol a t is removed from L1 and the symbol s is
removed from L2.) D
Example 4.4
Let us apply this bottom-up parsing algorithm to the grammar G with
productions
sEc. 4.1 BACKTRACK PARSING 305
(1) E - - - ~ E + T
(2) E ~T
(3) T ~ T, F
(4) T~ ~F
(5) F~ ~ a
DEFINITION
LEMMA 4.5
THEOREM 4.4
Algorithm 4.2 correctly finds a right parse of w if one exists, and signals
an error otherwise.
Proof By Lemma 4.5, the number of partial right parses consistent with
the input is finite. It is left as an Exercise to show that unless Algorithm 4.2
finds a parse, it cycles through all partial right parses in a natural order.
Namely, each partial right parse can be coded by a sequence of production
indices and shift symbols (s). Algorithm 4.2 considers each such sequence
that is a partial right parse in a lexicographic order. That lexicographic order
is determined by an order of the symbols, placing s last, and ordering the
production indices as in step 1 of Algorithm 4.2. Note that not every sequence
of such symbols is a consistent partial right parse. [-7
Paralleling the analysis for Algorithm 4.1, we can also show that the
lengths of the lists in the configurations for Algorithm 4.2 remain linear in
the input length.
THEOREM 4.5
Let one cell be needed for each symbol on a list in a configuration of
Algorithm 4.2, and let the number of elementary operations needed to com-
pute one step of Algorithm 4.2 be bounded. Then for some constants c a and
c 2, Algorithm 4.2 requires can space and c~ time, when given input of length
n~l.
Proof Exercise. [~]
(2) We can attempt to order the reductions so that the most likely reduc-
tions are made first.
(3) We can add information to determine whether certain reductions
will lead to success. For example, if the first reduction uses the production
A ~ ax . . . ak, where a~ is the first input symbol and we know that there is
no y in Z* such that S ~ Ay, then this reduction can be immediately ruled
out. In general we want to be sure that if $~ is on L1, then ~ is the prefix
of a right-sentential form. While this test is complicated in general, certain
notions, such as precedence, discussed in Chapter 5, will make it easy to rule
out many ~'s that might appear on L1.
(4) We can add features to make backtracking faster. For example, we
might store information that will allow us to directly recover the previous
configuration at which a reduction was made.
Some of these considerations are explored in Exercises 4.1.12-4.1.14 and
4.1.25. The remarks on error detection and recovery with the backtracking
top-down algorithm also apply to the bottom-up algorithm.
EXERCISES
D(1) = 1
D(d) = ( D ( d - 1) 2) + 1
is D(d) = [k~], where k is a real number and [x] is the greatest integer
_< x. Here, k = 1.502837 . . . .
4.1.5. Complete the proof of Corollary 2 to Lemma 4.4.
4.1.6. Modify Algorithm 4.1 to refrain from using an alternate if it is impos-
sible to derive the next k input symbols, for fixed k, from the resulting
left-sentential form.
4.1.7. Modify Algorithm 4.1 to work on an arbitrary grammar by putting
bounds on the length to which L1 and L2 can grow.
*'4.1.8. Give a necessary and sufficient condition on the input grammar such
that Algorithm 4.1 will never enter the backtrack mode.
4.1.9. Prove L e m m a 4.5.
4.1.10. Prove Theorem 4.5.
4.1.11. Modify Algorithm 4.2 to work for an arbitrary C F G by bounding the
length of the lists L1 and L2.
4.1.12. Modify Algorithm 4.2 to run faster by checking that the partial right
parse together with the input to the right of the pointer does not con-
tain any sequence of k symbols that could not be part of a right-
sentential form.
4.1.13. Modify Algorithms 4.1 and 4.2 to backtrack to any specially designated
previous configuration using a finite number of reasonably defined
elementary operations.
*'4.1.14. Give a necessary and sufficient condition on a grammar such that
Algorithm 4.2 will operate with no backtracking. What if the modifi-
cation of Exercise 4.1.12 is first m a d e ?
4.1.15. Find a cycle-free grammar with no e-productions on which Algorithm
4.2 takes an exponential amount of time.
4.1.16. Improve the bound of Lemma 4.4 if the grammar has no e-productions.
4.1.17. Show that if a grammar G with no useless symbols has either a cycle
or an e-production, then Algorithm 4.2 will not terminate on any
sentence not in L(G).
DEFINITION
We shall outline a programming language in which we can write
nondeterministic algorithms. We call the language N D F (nondeter-
ministic F O R T R A N ) , because it consists of F O R T R A N - l i k e statements
plus the statement C H O I C E (nl . . . . . nk), where k ~ 2 and nl . . . . . nk
are statement numbers.
To define the meaning of an N D F program, we postulate the
existence of an interpreter capable of executing any finite number of
EXERCISES 309
Example 4.5
The following N D F program prints the legend NOT A PRIME
one or more times if the input is not a prime number, and prints
nothing if it is a prime"
READ N
I=l
PICK A VALUE OF I G R E A T E R T H A N 1
1 I=I+l
CHOICE (1, 2)
2 IF (I .EQ. N) STOP
F I N D IF I IS A DIVISOR OF N A N D NOT EQUAL
TO N
IF ( ( N / I ) , I .NE. N) STOP
WRITE ("NOT A PRIME")
STOP
"4.1.38. Write an N D F program which prints all answers to the "eight queens
problem." (Select eight points on an 8 x 8 grid so that no two lie on
any row, column, or diagonal line.)
'4.1.19. Write N D F programs to simulate a left or right parser.
It would be nice if there were an algorithm which determined if a
given N D F program could run forever on some input. Unfortunately
this is not decidable for F O R T R A N , or any other programming lan-
guage. However, we can make such a determination if we assume that
branches (from IF and assigned GOTO statements, not CHOICE
statements) are externally controlled by a "demon" who is trying to
make the program run forever, rather than by the values of the pro-
gram variables. We say an N D F program is halting if for each input
there is no sequence of branches and nondeterministic choices that
cause any copy of the program to run beyond some constant number
of executed statements, the constant being a function of the number of
310 GENERAL PARSING METHODS CHAP. 4
input cards available for data. (Assume that the program halts if it
attempts to read and no data is available.)
• 4.1.20. Give an algorithm to determine whether an N D F program is halting
under the assumption that no D O loop index is ever decremented.
"4.1.21. Give an algorithm which takes a halting N D F program and constructs
from it an equivalent A L G O L program. By "equivalent program,"
we have to mean that the A L G O L program does input and output
in an order which the N D F program might do it, since no order for the
N D F program is known. ALGOL, rather than F O R T R A N , is preferred
here, because recursion is very convenient.
4.1.22. Let G = (N, ~, P, S) be a CFG. From G construct a C F G G' such
that L(G') = ~* and if S '~=> w, then S ~==~ w.
G G'
A P D T (with pushdown top on the left) that behaves as a nondeter-
ministic left-corner parser for a grammar can be constructed from the
grammar. The parser will use as pushdown symbols nonterminals,
terminals, and special symbols of the form [A, B], where A and B are
nonterminals.
Nonterminals and terminals appearing on the pushdown list are
goals to be recognized top-down. In a symbol [A, B], A is the current
goal to be recognized and B is the nonterminal which has just been
recognized bottom-up. F r o m a C F G G = (N, 2~, P, S) we can construct
a P D T M = ([q}, E, N x N W N W E, A, ~, q, S, ~ ) which will be a
left-corner parser for G. Here A = [ 1, 2 . . . . , p} is the set of production
numbers, and 5 is defined as follows:
(1) Suppose that A ~ ot is the ith production in P.
(a) If 0t is of the form Bfl, where B E N, then O(q, e, [C, B])
contains (q, fl[C, A], i) for all C ~ N. Here we assume that
we have recognized the left-corner B bottom-up so we
establish the symbols in fl as goals to be recognized top-
down. Once we have recognized fl, we shall have recognized
an A.
(b) If 0t does not begin with a nonterminal, then O(q, e, C)
contains (q, a[C, A], i) for all nonterminals C. Here, once
is recognized, the nonterminal A will have been recognized.
(2) 6(q, e, [A, A]) contains (q, e, e) for all A ~ N. Here an instance
of the goal A which we have been looking for has been recognized. If
this instance of A is not a left corner, we remove [A, A] from the push-
down list signifying that this instance of A was the goal we sought.
(3) O(q, a, a ) = [(q, e, e)} for all a ~ E. Here the current goal is
a terminal symbol which matches the current input symbol. The goal,
being satisfied, is removed.
M defines the translation
Example 4.6
Consider the C F G G = (N, ~, P, S) with the productions
(1) E ~ E q- T
(2) E ~ T
(3) T - >F t T
(4) T ~F
(5) F - > (e)
(6) F - - + a
( a t a Jr- a, E, e)
The second rule in (1 d) is applicable (so is the first), so the PDT can go
into configuration
!
T F
I
a
1 F
1
a
I Fig. 4.7 Derivation tree for a t a -k a.
31 2 GENERALPARSING METHODS CHAP. 4
Here we are saying that the left corner of the production T ~ F 1" T
will now be recognized once we find t and T. We can then enter the
following configurations:
Example 4.7
Consider the grammar S - - ~ Scl ab and the input string abc. Ini-
tially, the parser would have ($, Q0) on the pushdown list, where Q0
is S---~. Scl .ab. We can then shift the first input symbol and write
(a, Q1) on the pushdown list, where Q1 is S - - - , . Sc[. a.b. Here, we
can be beginning the productions S ~ Scl ab, or we could have seen
the first a of production S ~ ab. Shifting the next input symbol b,
we would write (b, Qz) on the pushdown list, where Q1 is
S - - ~ . Sc] .ab.. We can then reduce using production S ~ ab. The
pushdown list would now contain ($, Qo)(S, Q3), where Q3 is
S ~ .S.c[.ab. [[]
BIBLIOGRAPHIC NOTES
We shall study two parsing methods that work for all context-free gram-
mars, the Cocke-Younger-Kasami algorithm and Earley's algorithm. Each
algorithm requires n 3 time and n 2 space, but the latter requires only n z time
when the underlying grammar is unambiguous. Moreover, Earley's algorithm
can be made to work in linear time and space for most of the grammars which
can be parsed in linear time by the methods to be discussed in subsequent
chapters.
In the last section we observed that the top-down and bottom-up back-
tracking methods may take an exponential amount of time to parse according
to an arbitrary grammar. In this section, we shall give a method guaranteed
to do the job in time proportional to the cube of the input length. It is essen-
tially a "dynamic programming" method and is included here because of
its simplicity. It is doubtful, however, that it will find practical use, for three
reasons:
(1) n 3 time is too much to allow for parsing.
(2) The method uses an amount of space proportional to the square of
the input length.
(3) The method of the next section (Earley's algorithm) does at least as
well in all respects as this one, and for many grammars does better.
The method works as follows. Let G = (N, Z, P, S) be a Chomsky normal
form C F G with no e-production. A simple generalization works for non-
C N F grammars as well, but we leave this generalization to the reader. Since
a cycle-free C F G can be left- or right-covered by a C F G in Chomsky normal
form, the generalization is not too important.
Let w = a l a 2 . . . a n be the input string which is to be parsed according
to G. We assume that each a~ is in X for 1 < i < n. The essence of the
SEC. 4.2 TABULAR PARSING METHODS 315
Method.
(1) Set tit = [A I A ~ a~ is in P} for each i. After this step, if t~1 contains
+
Since I < k < j, both k and j -- k are less than j. Thus both t~k and ti+k,j_ k
are computed before t~j is computed. After this step, if t~j contains A, then
+ +
(3) Repeat step (2) until tij is known for all 1 _~ i < n, and 1 < j
n--i+l.
Example 4.8
Consider the C N F grammar G with productions
tNote that we are not discussing in detail how this is to be done. Obviously, the com-
putation involved can be done by computer. When we discuss the time complexity of
Algorithm 4.3, we shall give details of this step that enable it to be done efficiently.
316 GENERAL PARSING METHODS cam. 4
S > AA IASIb
A > SA IASla
Let abaab be the input string. The parse table T that results from Algorithm
4.3 is shown in Fig. 4.8. F r o m step (1), tit = {A} since A ~ a is in P and
at = a. In step (2) we add S to t32, since S ---~ A A is in P and A is in both
t3t and t4t. Note that, in general, if the tt~'s are displayed as shown, we can
A,S A,S
A,S S A,S
A,S A S A,S
jt 1 A S A A S
Then, if B is in tie and C is in t~+k.j-k for some k such that 1 ~ k < j and
A ~ B C is in P, we add A to ttj. That is, we move up the ith column and
down the diagonal extending to the right of cell ttj simultaneously, observing
the nonterminals in the pairs of cells as we go.
Since S is in t~5, abaab is in L(G). [B
THEOREM 4.6
If Algorithm 4.3 is applied to C N F g r a m m a r G and input string a~ .. • a,,
+
then upon termination, A is in t~j if and only if A ==~ a s . . . a~+~_~.
Proof. The proof is a straightforward induction on j and is left for the
Exercises. The most difficult step occurs in the "if" portion, where one must
+
observe that i f j > 1 and A ==~ a s . . . at+i_~, then there exist nonterminals
+
B and C and integer k such that A ~ B C is in P, B ~ a~ . . . an+k_ 1, and
+
C ~ at+ k • • • a,+j_ l. E]
shall assume that we have several integer variables available, one of which
is n, the input length. An elementary operation, for the purposes of this discus-
sion, is one of the following'
(1) Setting a variable to a constant, to the value held by some variable,
or to the sum or difference of the value of two variables or constants;
(2) Testing if two variables are equal,
(3) Examining and/or altering the value of t~j, if i and j are the current
values of two integer variables or constants, or
(4) Examining a,, the ith input symbol, if i is the value of some variable.
We note that operation (3) is a finite operation if the grammar is known
in advance. As the grammar becomes more complex, the amount of space
necessary to store t,.j and the amount of time necessary to examine it both
increase, in terms of reasonable steps of a more elementary nature. However,
here we are interested only in the variation of time with input length. It is
left to the reader to define some more elementary steps to replace (3) and
find the functional variation of the computation time with the number of
nonterminais and productions of the grammar.
CONVENTION
We take the notation '~f(n) is 0(g(n))" to mean that there exists a constant
k such that for all n ~ 1, f(n) ~ kg(n). Thus, when we say that Algorithm
4.3 operates in time 0(n3), we mean that there exists a constant k for which
it never takes more than kn 3 elementary operations on a word of length n.
THEOREM 4.7
Algorithm 4.3 requires 0(n 3) elementary operations of the type enumerated
above to compute tij for all i and j.
Proof To compute tit for all i merely requires that we set i = 1 [opera-
tion(l)], then repeatedly set t~l to ~A]A --~ a~ is in P~} [operations (3) and (4)],
test if i = n [operation (2)], and if not, increment i by I [operation (1)].
The total number of elementary operations performed is 0(n).
Next, we must perform the following steps to compute t~j"
(1) Set j -- 1.
(2) Test ifj -- n. If not, increment j by 1 and perform line(j), a procedure
to be defined below.
(3) Repeat step (2) until j = n.
Exclusive of operations required for line(j), this routine involves 2 n - 2
elementary operations. The total number of elementary operations required
for Algorithm 4.3 is thus 0(n) plus ~ = 2 l(j), where l(j)is the number of
elementary operations used in line(j). We shall show that l(j) is 0(n 2) and
thus that the total number of operations is 0(n0.
318 GENERAL PARSING METHODS CHAP. 4
The procedure line(j) computes all entries t~j such that 1 _< i < n -
j -I-- 1. It embodies the procedure outlined in Example 4.8 to compute tij. It is
defined as follows (we assume that all t~j initially have value N)'
(1) Let i = 1 a n d j ' = n - - j - k 1.
(2) L e t k = 1.
(3) Let k' -- i q- k and j " = j - k.
(4) Examine t;k and tk,i,,. Let
(5) Increment k by 1.
(6) If k - - j , go to step (7). Otherwise, go to step (3).
(7) If i-----j', halt. Otherwise do step (8).
(8) Increment i by 1 and go to step (2).
We observe that the above routine consists of an inner loop, (3)-(6),
and an outer loop, (2)-(8). The inner loop is executedj -- 1 times (for values
of k from 1 to j - 1) each time it is entered. At the end, t,.j has the value
defined in Algorithm 4.3. It consists of seven elementary operations, and so
the inner loop uses 0(j) elementary operations each time it is entered.
The outer loop is entered n -- j -Jr- 1 times and consists of 0(j) elementary
operations each time it is entered. Since j ~ n, each computation of line(j)
takes 0(n 2) operations.
Since line(j) is computed n times, the total number of elementary opera-
tions needed to execute Algorithm 4.3 is thus 0(n3). [--]
We shall now describe how to find a left parse from the parse table.
The method is given by Algorithm 4.4.
ALGORITHM 4.4
Left parse from parse table.
Input. A Chomsky normal form C F G G - - ( N , E, P, S) in which the
productions in P are numbered from 1 to p, an input string w -- a~a 2 . . . a,,
and the parse table T for w constructed by Algorithm 4.3.
Output. A left parse for w or the signal "error."
Method. We shall describe a recursive routine gen(i,j, A) to generate
+
a left parse corresponding to the derivation A = , a i a ~ + 1 . . . ai+j_ 1. The
lm
routine gen(i, j, A) is defined as follows"
(1) I f j -- 1 and the mth production in P is A --, a,., then emit the produc-
tion number m.
(2) If j > 1, k is the smallest integer, 1 < k < j, such that for some B
in t;k and C in t~+k,j-k, A --* B C is a production in P, say the mth. (There
SEC. 4.2 TABULAR PARSING METHODS 31 9
may be several choices for A --~ B C here. We can arbitrarily choose the one
with the smallest m.) Then emit the production number m and execute
gen(i, k, B), followed by gen(i -t-- k, j - k, C).
Algorithm 4.4, then, is to execute gen(1, n, S), provided that S is in t~,. If
S is not in t,,, emit the message "error."
Example 4.9
Let G be the grammar with the productions
(1) S ~ ~ A A
(2) S ~ AS
(3) S >b
(4) A ~ SA
(5) A ~ AS
(6) A ,a
320 GENERAL PARSING METHODS CHAP. 4
Let w = abaab be the input string. The parse table for w is given in Example
4.8.
Since S is in T 15, w is in L ( G ) . To find a left parse for abaab we call routine
gen(1, 5, S). We find A in t~l and in t24 and the production S ---~ A A in the
set of productions. Thus we emit 1 (the production number for S---~ A A )
and then call gen(1, 1, A) and gen(2, 4, A). gen(1, 1, A) gives the production
number 6. Since S is in t21 and A is in t33 and A --~ S A is the fourth produc-
tion, gen(2, 4, A) emits 4 and calls gen(2, 1, S) followed by gen(3, 3, A).
Continuing in this fashion we obtain the left parse 164356263.
Note that G is ambiguous; in fact, abaab has more than one left parse.
It is not in general possible to obtain all parses of the input from a parse
table in less than exponential time, as there may be an exponential number
of left parses for the input. D
We should mention that Algorithm 4.4 can be made to run faster if,
when we construct the parse table and add a new entry, we place pointers to
those entries which cause the new entry to appear (see Exercise 4.2.21).
In this section we shall present a parsing method which will parse an input
string according to an arbitrary C F G using time 0(n 3) and space 0(n2), where
n is the length of the input string. Moreover, if the C F G is unambiguous,
the time variation is quadratic, and on most grammars for programming
languages the algorithm can be modified so that both the time and space
variations are linear with respect to input length (Exercise 4.2.18). We shall
first give the basic algorithm informally and later show that the computation
can be organized in such a manner that the time bounds stated above can
be obtained.
The central idea of the algorithm is the following. Let G = (N, Z, P, S)
be a C F G and let w = a l a 2 . . . a n be an input string in Z*. An object of
the form [A--~ X i X 2 . . . X k • Xk+l... Xm, i] is called an item for w if
A --~ X'~ . . - X'~ is a production in P and 0 ~ i ~ n. The dot between X k and
Xk+~ is a metasymbol not in N or Z. The integer k can be any number
including 0 (in which case the • is the first symbol) or m (in which case it
is the last).t
For each integer j, 0 ~ j ~ n, we shall construct a list of items Ij such
that [A --~ t~ • fl, i] is in Ij for 0 ~ i ~ j if and only if for some }, and ~, we
have S ~ ~.4~, ~ ~ a~ . . . a,, and tx ~ a,.l " " aj. Thus the second com-
ponent of the item and the number of the list on which it appears bracket
the portion of the input derived from the string ~. The other conditions on
the item merely assure us of the possibility that the production A --~ t~fl
could be used in the way indicated in some input sequence that is consistent
with w up to position j.
The sequence of lists Io, I1,. • •, I, will be called the parse lists for the
input string w. We note that w is in L(G) if and only if there is some item of
the form [S ~ ~ . , 0] in I,.
We shall now describe an algorithm which, given any g r a m m a r , will
generate the parse lists for any input string.
ALGORITHM 4.5
Earley's parsing algorithm.
Input. C F G G = (N, E, P, S) and an input string w -- ala 2 • .. a, in Z*.
Output. The parse lists Io, I~ . . . . . I,.
Method. First, we construct Io as follows"
(1) If S --~ a is a production in P, add [S ~ • 0~, 0] to Io.
N o w perform steps (2) and (3) until no new items can be added to Io.
(2) If [B ~ y-, 0] is on Io,t add [A --~ ~B • p, 01 for all [A --- a • Bp, 0]
on I o.
(3) Suppose that [A ~ ~ . Bfl, 0] is an item in I o. A d d to Io, for all
productions in P of the form B ~ y, the item [B ~ • y, 0] (provided this item
is not already in Io).
We now construct Ij, having constructed I0, I 1 , . . . , Ij_ t-
(4) For each [B --~ 0~ • aft, i] in Ij_~ such that a -- aj, add [B --. 0~a • fl, i]
to I v.
N o w perform steps (5) and (6) until no new items can be added.
(5) Let [A --~ ~,., i] be an item in Ij. Examine It for items of the form
[B ~ 0~ • Ap, k]. F o r each one found, we add [B ~ ~A • fl, k] to Ij.
(6) Let [A ~ 0~- Bfl, i] be an item in Ij. For all B ~ 7 in P, we add
[B ~ • y, j] to Ij.
N o t e that consideration of an item with a terminal to the right of the dot
yields no new items in steps (2), (3), (5) and (6).
The algorithm, then, is to construct Ij for 0 ~ j _~ n. [~]
Example 4.10
Let us consider the g r a m m a r G with the productions
(1) E >T + E
(2) E-- >T
(3) T - - - > F , T
(4) T >F
(5) F ~ > (E)
(6) F - - - ~ a
tNote that ? can be e. This is the way rule (2) becomes applicable initially.
322 GENERAL PARSING METHODS CHAP. 4
and let (a q - a ) , a be the input string. From step (1) we add new items
[E----,. T + E, 0] and [ E - - , . T, 0] to I 0. These items are considered by
adding to I0 the items [T ~ • F • T, 0] and [T --, • F, 0] from rule (3). Con-
tinuing, we then add [F ~ • (E), 0] and [F--~ • a, 0]. No more items can
be added to I 0.
We now construct I~. By rule (4) we add I F - - , (. E), 0], since aa -----(.
Then rule (6) causes [ E ~ . T + E , 1], [ E ~ . T , 1], [ T - - ~ . F , T , 1],
[T--, • F, 1], [F ~ • (E), 1], and [F----, • a, 1] to be added. Now, no more
items can be added to I~.
To construct 12, we note that a2 = a and that by rule (4) [F----, a . , 1] is
to be added to Iz. Then by rule (5), we consider this item by going to I~ and
looking for items with F immediately to the right of the dot. We find two,
and add [ T - - , F . • T, 1] and [T--~ F . , 1] to 12. Considering the first of
these yields nothing, but the second causes us to again examine I~, this time
for items with. T in them. Two more items are added to/2, [E ~ T . q- E, 1]
and [E--~ T . , 1]. Again the first yields nothing, but the second causes
[F ~ (E .), 0] to be added to 12. Now no more items can be added to 12,
so I2 is complete.
The values of all the lists are given in Fig. 4.9.
Io 11 Iz
[E----> • T q - E, 0] I F - - ~ (. E ) , 0 ] [F---. a . , 1]
[E~. T, 0] [E~. T - t - E , 1] [T~F..T, 1]
[T---~ • F . T, 0] [ E - - ~ • T, 1] [ T - - ~ F . , 1]
[T--~ • F, 0] [ T - - ~ . F . T , 1] [ E - - ~ T . W E , l]
[F--~ • (E), 01 [T----~ • F, 11 [E ~ T . , 1]
[F ~ ° a, 0] [F ~ • (E), 1] [ F - o (E .), 0]
[F ---~ . a , l ]
13 h ls
[E ~ T q- • E, 11 [F ~ a., 3] [F ~ ( E ) . , 0]
[E------~ • T-t- E, 31 [T----~ F. • T, 3] [T----~ F . , T, 0]
[E ~ • T, 31 [T ~ F., 31 [T ~ F . , 0]
[T---> • F , T, 3] [E ~ T. + E, 31 [E ~ T . + E, 01
[T --~ • F, 3] [E ~ T., 3] [E --~ T . , 0]
[F---~ • (E), 31 [E----* T + E . , 1]
[F ~ • a, 31 [F ~ ( E . ) , 01
16 I7
[T-.F.. T, 0] [F---~ a . , 6]
[T---, • F , T, 6] IT---, F . • T, 61
[T----~ • F, 6] [T----~ F . , 61
[F ~ • (E), 6] [T ~ F , T . , 01
[F ~ • a, 6] [E ----* T . q- E, 01
[E--~ T.,0]
strings ~,' and 6' such that S =-~ ? ' A J ' and y ' = > a 1 • • • a r It then follows that
= t~'aj ==~ at+ 1 "'" aj, and the inductive hypothesis is satisfied with ~, = ~,'
and ~ = 6'.
Next, suppose that [,4 ---. a . fl, i] is added by rule (5). Then 0~ = ogB
for some B in N, and for some k , [,4 ~ o~' • B f l , i] is on I k. Also, [B --, 1/., k]
.
is on Ij for some I/in (N u E)*. By the inductive hypothesis, r / ~ > ak+ 1 "" • aj
and 0~'==~ a ~ + l . . . a k . Thus, tx = o g B = > a ~ + l . . . a j. Also by hypothesis,
there exist ~,' and 6' such that S ~ ?'A6' and ~,'=> a 1 .-- ai. Again, the rest
of the inductive hypothesis is satisfied with y = ~,' and J = 6'.
The remaining case, in which [A ---. ct. fl, i] is added by rule (6), has
0c = e and i = j. Its elementary verification is left to the reader, and we
conclude the "only if" portion.
I f: The "if" portion is the p r o o f of the statement
(4.2.1) If S =~ ~,AO, ~, =~ a 1 . . . a~, A ---, ctfl is in P, and tz = ~ a~+l " ' " aj,
then [A --. ~ • fl, i] is on list Ij
=> a ~ . " a j l l O , . But in the first derivation, 0 ~ ' ==> a l " ' ' a k , and in the
second, 03~' =-~ a l " " a v Then there are two distinct derivation trees for
some al " " an, with ~'B deriving ai+l " " aj in two different ways.
326 GENERAL PARSING METHODS CHAP. 4
THEOREM 4.11
In all cases, Algorithm 4.5 can be executed in 0(n 3) reasonably defined
elementary operations when the input is of length n.
Proof. Exercise.
Our last portion of the analysis of Earley's algorithm concerns the method
of constructing a parse from the completed lists. For this purpose we give
Algorithm 4.6, which generates a right parse from the parse lists. We choose
to produce a right parse because the algorithm is slightly simpler. A left
parse can also be found with a simple alteration in the algorithm.
Also for the sake of simplicity, we shall assume that the grammar at hand
has no cycles. If a cycle does exist in the grammar, then it is possible to have
328 GENERAL PARSING METHODS CHAP. 4
arbitrarily many parses for some input strings. However, Algorithm 4.6 can
be modified to accommodate grammars with cycles (Exercise 4.2.23).
It should be pointed out that as for Algorithm 4.4, we can make Algorithm
4.6 simpler by placing pointers with each item added to a list in Algorithm
4.5. Those pointers give the one or two items which lead to its placement on
its list.
ALGORITHM 4.6
Construction of a right parse from the parse lists.
Input. A cycle-free C F G G = (N, X, P, S) with the productions in P
numbered from 1 to p, an input string w = a 1 . . . a n, and the parse lists
I0, I a , . . . , I, for w.
Output. 7r, a right parse for w, or an "error" message.
Method. If no item of the form [S---~ 0c., 0] is on In, then w is not in
L(G), so emit "error" and halt. Otherwise, initialize the parse 7r to e and
execute the routine R([S----~ ~z., 0], n) where the routine R is defined as
follows:
Routine R([A ---+ f l . , i], j ) :
(1) Let n be h followed by the previous value of ~z, where h is the number
of production A ---~ ft. (We assume that 7r is a global variable.)
(2) If fl = X i X z . . . Xm, set k = m and 1 = j.
(3) (a) If Xk ~ X, subtract 1 from both k a n d / .
(b) If X k ~ N, find an item [Xk ~ 7' ", r] in I~ for some r such that
[A ~ X ~ X z . . . Xk_~ . X k . . . Xm, i] is in I,. Then execute
R([Xk ---' 7' ", r], l). Subtract 1 from k and set l = r.
(4) Repeat step (3) until k = 0. Halt. E]
Algorithm 4.6 works by tracing out a rightmost derivation of the input
string using the parse lists to determine the productions to use. The routine
R called with arguments [A --~ f l . , i] and j appends to the left end of the
current partial parse the number corresponding to the production A ~ ft.
If fl = voBlvlB2v 2 . . . B,v,, where B 1 , . . . , Bs are all the nonterminals in fl,
then the routine R determines the first production used to expand each Bt,
say B,---~ fl,, and the position in the input string w immediately before
the first terminal symbol derived from B,. The following recursive calls of
R are then made in the order shown:
R([B, --,/~,.,i,],j,)
R([B,_, ~ /L-~ ", ~,-~],L-,)
°
SEC. 4.2 T A B U L A R P A R S I N G METHODS 329
where
(1) j, = j - Iv, I and
(2) jq -- iq+x --[vq [for 1 ~ q < s.
Example 4.11
Let us apply Algorithm 4.6 to the parse lists of Example 4.10 in order to
produce a right parse for the input string (a + a) • a. Initially, we can execute
R([E----~ T . , 0], 7). In step (1), n gets value 2, the number associated with
production E ~ T. We then set k = 1 and 1 = 7 and execute step (3b) of
Algorithm 4.6. We find [T ----~ F • T . , 0] on 17 and [E ~ • T, 0] on Io. Thus
we execute R([T ~ F , T . , 0], 7), which results in production number 3
being appended to the left of n. Thus, n = 32. Following this call of R, in
step (2) we set k = 3 and l = 7.
Step (3b) is then executed with k = 3. We find [T---, F . , 6] on I0 and
IT ~ F , . T, 0] on /6, so we call R([T----, F . , 6], 7). After completion of
this call, we set k = 2 and l = 6. In Step (3a) we consider • and set k = 1
and 1 = 5. We then find [ F - - , ( E ) . , 0] on ls and [T---,. FF, T, 0] on I0,
so we call R([F---~ ( E ) . , 0], 5).
Continuing in this fashion we obtain the right parse 64642156432.
The calls of routine R are shown in Fig. 4.10 superimposed on the deri-
vation tree for (a + a) • a. D
E R([E~ T-,01,7)
!
i R([T~F,T.,01,7)
( )
i R([,,,,,~,~E~
T+E., 1],4) F R([F~a -,61,7)
R([T~F-,ll,2) T/ E R([E~T.,3],4)
a
I
I +
R([F~a.,11,2) F T R([T~F.,31,4)
a F R([Foa., 31,4)
THEOREM 4.12
A l g o r i t h m 4.6 correctly finds a right parse of a 1 . . . a, if one exists, a n d
can be m a d e to operate in time 0(n2).
P r o o f A straightforward induction on the order of the calls o f routine
R shows that a right parse is produced. We leave this portion of the p r o o f for
the Exercises.
In a m a n n e r analogous to T h e o r e m 4.10, we can show that a call of
R([A --~ f t . , i], j ) takes time 0 ( ( j - 0 2) if we can show that step (3b) takes
0 ( j - i) elementary operations. To do so, we must preprocess the lists in such
a way that the time taken to examine all the finite n u m b e r o f items on I k
whose second c o m p o n e n t is l requires a fixed c o m p u t a t i o n time. That is, for
each parse list, we must link the items with a c o m m o n second c o m p o n e n t
a n d establish a header pointing to the first entry on that list. This preprocess-
ing can be done in time 0(n 2) in an obvious way.
In step (3b), then, we examine the items on list Iz with second c o m p o n e n t
r = l, l - 1 , . . . , i until a desired item of the f o r m [X k --, 7 ", r] is found.
The verification that we have the desired item takes fixed time, since all items
with second c o m p o n e n t i on I, can be f o u n d in finite time. The total a m o u n t
of time spent in step (3b) is thus p r o p o r t i o n a l to j - i. [--]
EXERCISES
4.2.24. What is the maximum number of items that can appear in a list Ij in
Algorithm 4.5 ?
*4.2.25. A grammar G is said to be of finite ambiguity if there is a constant k
such that if w is in L(G), then w has no more than k distinct left parses.
Show that Earley's algorithm takes time 0(n 2) on all grammars of finite
ambiguity.
Open Problems
There is little known about the actual time necessary to parse an
arbitrary context-free grammar. In fact, no good upper bounds are
known for the time it takes to recognize sentences in L(G) for arbitrary
CFG G, let alone parse it. We therefore propose the following open
problems and research areas.
4.2.26. Does there exist an upper bound lower than 0(n 3) on the time needed
to recognize an arbitrary CFL on some reasonable model of a random
access computer or a multitape Turing machine ?
4.2.27. Does there exist an upper bound better than 0(n 2) on the time needed
to recognize unambiguous CFL's ?
Research Problems
4.2.28. Find a CFL which cannot be recognized in time f(n) on a random
access computer or Turing machine (the latter would be easier), where
f(n) grows faster than n; i.e., lim,_.~ (n/f (n)) = 0. Can you find a CFL
which appears to take more than 0(n 2) time for recognition, even if you
cannot prove this to be so ?
4.2.29. Find large classes of CFG's which can be parsed in linear time by
Eadey's algorithm. Find large classes of ambiguous CFG's which can
be parsed in time 0(n 2) by Earley's algorithm. It should be mentioned
that all the deterministic CFL's have grammars in the former class.
Programming Exercises
4.2.30. Use Earley's algorithm to construct a parser for one of the grammars
in the Appendix.
4.2.31. Construct a program that takes as input any CFG G and produces as
output a parser for G that uses Earley's algorithm.
BIBLIOGRAPHIC NOTES
tThis does not involve an extension of the definition of a DPDT. The k "lookahead
symbols" are stored in the finite control.
333
334 ONE-PASSNO BACKTRACK PARSING CHAP. 5
Example 5.1
Let ~ ---- a b a c A a B . The closed portion of ~ is abac; the open portion is
A a B . If ~ ---- abc, then abc is its closed portion and e its open portion. Its
border is at the right end.
sEc. 5.1 LL(k) GRAMMARS 335
=* . . . =~ et~ such that ~ =* w, then we can construct ~+1, the next step of
lm lm Ira
the derivation, by observing op,ly the closed portion of ~i and a "little
more," the "little more" being the next k input symbols of w. (Note that
the closed portion of ~t is a prefix of w.) It is important to observe that if
we do not see all of w when ~,+1 is constructed, then we do not really know
what terminal string is ultimately derived from S. Thus the LL(k)condition
implies that ~;+i is substantially independent (except for the next k terminal
symbols) of what is derived from the open portion of ~.
Viewed in terms of a derivation tree, we can construct a derivation tree
for a sentence w x y in an LL(k) grammar starting from the root and working
top-down determfiaistieally. Specifically, if we have constructed the partial
derivation tree with frontier wArE, then knowing w and the first k symbols
of x y we would know which production to use to expand A. The outline of
the complete tree is shown in Fig. 5.1.
w x y
Example 5.2
Let G 1 be the grammar S ~ a A S ] b, A ---~ a l b S A . Intuitively, G1 is LL(1)
because given C, the leftmost nonterminal in any left-sentential form, and
c, the next input symbol, there is at most one production for C capable of
deriving a terminal string beginning with c. Going to the definition of an
LL(1) grammar, if S = ~ w S a ~ wfla ~ w x and S ~ w S a = ~ wTa ~ wy
lra lm lm lrn lm lra
and x and y start with the same symbol, we must have fl = 7'. Specifically, if
x and y start with a, then production S ---~ a A S was used and fl = 7' = a A S .
S---~ b is not a possible alternative. Conversely, if x and y start with b,
S ~ b must be the production used, and fl = 7' = b. Note that x = y = e
is impossible, since S does not derive e in G~.
A similar argument prevails when we consider two derivations
S ~ wAoc ~ wile ~ w x and S ~ wAoc ~ wT'a =-> wy. [~
lm lm lm tm lrn Ira
Example 5.3
Let us consider the more complicated case of the grammar G 2 defined by
S ---~ e t a b A , A ~ S a a l b . We shall show that G2 is LL(2). To do this, we shall
show that if wBtx is any left-sentential form of G2 and w x is a sentence in
L(G), then there is at most one production B ~ f l in Gz such that
FIRST2(flt~ ) contains FIRST2(x ). Suppose that S ~ wSt~ ~ wflo~ ==~ w x
lm lm lm
and S ~ wSoc :=~ w~,o~ ~ wy, where the first two symbols of x and y agree if
lm Im lm
they exist. Since G2 is a linear grammar, ~ must be in (a + b)*. In fact,
we can say more. Either w = ~ = e, or the last production used in the deri-
vation S ~ wSoc was A ~ Saa. (There is no other way for S to "appear"
lm
in a sentential form.) Thus either ~ = e or ~ begins with aa.
Suppose that S ---~ e is used going from wSt~ to wilts. Then fl = e, and x
is either e or begins with aa. Likewise, if S ~ e is used going from wSt~ to
w},~, then ~ = e and y = e, or y begins with aa. If S ~ abA is used going
from wSoc to wflo~, then fl = a b A , and x begins with ab. Likewise, if S ---~ a b A
is used going from wStx to wToc, then ~, = a b A , and y begins with ab. There
are thus no possibilities other than x = y = e, x and y begin with aa, or both
begin with ab. Any other condition on the first two symbols of x and y implies
that one or both derivations are impossible. In the first two cases, S ---~ e is
used in both derivations, and fl = ? = e. In the third case, S ---~ a b A must
be used, and fl = ~, = abA.
It is left for the Exercises to prove that the situation in which A is the
symbol to the right of the border of the sentential form in question does not
yield a contradiction of the LL(2) condition. The reader should also verify
that G~ is not LL(1). D
Example 5.4
Let us consider the grammar G 3 = ({S, A, B}, {0, 1, a, b}, P3, S), where
P3 consists of S --~ A IB, A --~ a A b l O, B ~ aBbb l l. L(G3) is the language
(a'Ob"in ~ 0} u {a"lb2"! n ~ 0}. G 3 is not LL(k) for any k. Intuitively, if
we begin scanning a string of a's which is arbitrarily long, we do not know
whether the production S ---~ A or S ~ B was first used until a 0 or 1 is seen.
Referring to the definition of LL(k) grammar, we may take w = a = e,
fl = A, ~, = B, x = akOb k, and y = a klb 2k in the derivations
0 ,
S ~ S ~ A ~ akOb k
lm lm
0 *
S ~ S ~ B ~ a k l b 2k
lm lm
to satisfy conditions (1) and (2) of the definition. Moreover, x and y agree in
338 ONE-PASSNO BACKTRACKPARSING CHAP. 5
the first k symbols. However, the conclusion that fl -- ~, is false. Since k can
be arbitrary here, we may conclude that G 3 is not an LL grammar. In fact,
in Chapter 8 we shall show that L(G3) has no LL(k) grammar. F---]
We shall show that we can parse LL(k) grammars very conveniently using
what we call a k-predictive parsing algorithm. A k-predictive parsing algorithm
for a CFG G = (N, X, P, S) uses an input tape, a pushdown list, and an
output tape as shown in Fig. 5.2. The k-predictive parsing algorithm attempts
to trace out a leftmost derivation of the string placed on its input tape.
Input head
i] F ,i
Output tape
7
Fig. 5.2 Predictive parsing algorithm.
The input tape contains the input string to be parsed. The input tape is
read by an input head capable of reading the next k input symbols (whence
the k in k-predictive). The string scanned by the input head will be called the
lookahead string. In Fig. 5.2 the substring u of the input string wux represents
the lookahead string.
The pushdown list contains a string X0c$, where X'a is a string of push-
down symbols and $ is a special symbol used as a bottom of the pushdown
list marker. The symbol X is on top of the pushdown list. We shall use r' to
represent the alphabet of pushdown list symbols (excluding $).
The output tape contains a string 7z of production indices.
We shall represent the configuration of a predictive parsing algorithm by
a triple (x, X~, n), where
(1) x represents the unused portion of the original input string.
(2) Xa represents the string on the pushdown list (with X on top).
(3) n is the string on the output tape.
For example, the configuration in Fig. 5.2 is (ux, Xa$, ~r).
SEC. 5.1 LL(k) GRAMMARS 339
Example 5.5
Let us construct a 1-predictive parsing algorithm et for G~, the simple
LL(1) grammar of Example 5.2. First, let us number the productions of G1
as follows:
(1) S - ,~ aAS
(2) S ~ b
(3) A .~ a
(4) A~ ,~bSA
L o o k a h e a d string
a e
Symbol
on top of aAS, 1 b, 2 error
pushdown , ,
Using this table, et would parse the input string abbab as follows"
For the first move M(S, a) = (aAS, 1), so S on top of the pushdown list is
replaced by aAS, and production number 1 is written on the output tape.
For the next move M(a, a) = pop so that a is removed from the pushdown
list, and the input head is moved one position to the right.
Continuing in this fashion, we obtain the accepting configuration
(e, $, 14232). It should be clear that 14232 is a left parse of abbab, and, in
fact, a~ is a valid 1-predictive parsing algorithm for G 1.
COROLLARY
Let ~ be a valid k-predictive parsing algorithm for G. Then there is
a deterministic left parser for G.
Example 5.6
Let us construct a deterministic left parser P~ for the 1-predictive parsing
algorithm in Example 5.5. Since the grammar is simple, we can obtain
a smaller DPDT if we move the input head one symbol to the right on each
move. The left parser will use $ as both a right endmarker on the input tape
and as a bottom of the pushdown list marker.
Let P~ = ([q0, q, accep(}, (a, b, $}, {S, A, a, b, $~},~, q0, $, [accept]), where
is defined as follows"
~(q0, e, $) = (q, S$, e)
~(q, a, S) = (q, AS, 1)
~(q, b, S) = (q, e, 2)
~(q, a, A) = (q, e, 3)
~(q, b, A) = (q, SA, 4)
~(q, $, $) = (accept, e, e)
We shall show that for every LL(k) grammar we can mechanically con-
struct a valid k-predictive parsing algorithm. Since the parsing table is the
heart of the predictive parsing algorithm, we shall show how a parsing table
can be constructed from the grammar. We begin by examining the impli-
cations of the LL(k) definition.
The LL(k) definition states that, given a left-sentential form w a s , then
w and the next k input symbols following w will uniquely determine which
production is to be used to expand A. Thus at first glance it might appear
that we have to remember all of w to determine which production is to be
used next. However, this is not the case. The following theorem is fundamen-
tal to an understanding of LL(k) grammars.
THEOREM 5.2
Let G = (N, E, P, S) be a CFG. Then G is LL(k) if and only if the follow-
ing condition holds" If A ---~ fl and A ---~ ~, are distinct productions in P,
then FIRSTk(fl00 ~ FIRSTk(T00 = ~ ' for all wAtt such that S = ~ wAtx.
lm
Proof
Only if: Suppose that there exist w, A, ~, fl, and 7 as above, but
FIRSTk(fltx) ~ FIRSTk(7~) contains x. Then by definition of FIRST, we
have derivations S ~ wAoc =~ wflo~ =~ w x y and S ~ wAoc =~ wTo~ =-~ w x z
Im Im lm lm lm lm
for some y and z. (Note that here we need the fact that N has no useless
nonterminals, as we assume for all grammars.) If Ix I < k, then y = z = e.
Since fl ~ 7, G is not LL(k).
lf: Suppose that G is not LL(k). Then there exist two derivations
S ~ wAoc ~ wfloc ~ w x and S =~ wAoc =-~ wT~ =-~ wy such that x and y
lm lm lm lm lm lm
agree up to the first k places, but fl ~ ~,. Then A ---~ fl and A ~ y are distinct
productions in P, and FIRST(fiE) and FIRST(?~) each contain the string
FIRST(x), which is also FIRST(y). [~]
Example 5.7
The grammar G having the two productions S - - - , a S a cannot be EL(l),
since F I R S T 1 ( a S ) = F I R S T , ( a ) = a. Intuitively, in parsing a string begin-
SEC. 5.1 LL(k) GRAMMARS 343
ning with an a, looking only at the first input symbol we would not know
whether to use S ~ a S or S --~ a to expand S. On the other hand, G is LL(2).
Using the notation in Theorem 5.2, if S =~ wAoc, then A = S and 0c = e.
lm
The only two productions for S are as given, so that fl = a S and ~, = a.
Since FIRST2(aS) = aa and FIRST2(a ) = a, G is LL(2) by Theorem 5.2. E]
Thus we can show that a grammar G is LL(1) if and only if for each set
of A-productions A --~ tz~ [ ~ 2 1 - ' ' I~n the following conditions hold"
(1) FIRSTi(tz~), F I R S T i ( t z 2 ) , . . . , FIRSTi(tzn) are all pairwise disjoint.
(2) If ~t ~ e, then FIRST~(~i) A FOLLOW~(A) = ~ for 1 < j < n,
i~j.
These conditions are merely a restatement of Theorem 5.3. We should cau-
tion the reader that, appealing as it may seem, Theorem 5.3 does not gener-
alize directly. That is, tet G be a C F G such that statement (5.1.1) holds"
Example 5.8
Consider the grammar C, defined by
S > aAaalbAba
A >ble
Example 5.9
Consider the grammar G with the two productions S ~ Sa[ b. From
i
Theorem 5.2, consider the derivation S ~ Sa t, i ~ 0 with A = S, ~ = e,
fl = Sa, and ~, = b. Then for i ~ k, FIRSTk(Saa t) A FIRSTk(ba ~) = ba k- ~.
Thus, G cannot be LL(k) for any k. [[]
Example 5.10
Let G be the grammar S ----~ Salb, which we saw in Example 5.9 was not
SEC. 5.1 LL(k) GRAMMARS 345
S > bS'
S' > aS' l e
Example 5.11
Consider the LL(2) grammar G with the two productions S---~ aS! a.
We can "factor" these two productions by writing them as S---~ a(S[ e).
That is, we assume that concatenation distributes over alternation (the
vertical bar). We can then replace these productions by
S > aA
A >Sic
Example 5.12
Let us consider producing a parsing table for the grammar G with pro-
ductions
(1) E ~ TE' (2) E ' .--, -b T E '
(3) E'---~ e (4) T ~ FT'
(5) T'---~ • F T ' (6) T'--~ e
(7) F ~ (E) (8) F ~ a
Using Theorem 5.3 the reader can verify that G is an LL(1) grammar.
I n fact, the discerning reader will observe that G has been obtained from Go
using the transformation eliminating left recursion as in Example 5.10. Go
is not LL, by the way.
Let us now compute the entries for the E-row using step (1) of Algorithm
a ( ) + • e
TE', 1 TE', 1
E
t
e, 3 +TE', 2 e, 3
,,
FT', 4 FT', 4
T
it
e, 6 e, 6 • FT', 5 e, 6
a, 8 (E),7
pop
pop
pop
pop
pop
accept
5.1. Here, FIRSTI[TE' ] = {(, a}, so M[E, (] = [TE', 1] and M[E, a] = [TE', 1].
All other entries in the E-row are error. Let us now compute the entries
for the E'-row. We note FIRST1[-+- TE'] = -t-, so M[E', .+] = [-q- TE', 2].
Since E'--~ e is a production, we must compute FOLLOWl[E' ] = {e,)}.
Thus, M[E', e] = M[E', )] = [e, 3]. All other entries for E' are error. Continu-
ing in this fashion, we obtain the parsing table for G shown in Fig. 5.4.
Error entries have been left blank.
The 1-predictive parsing algorithm using this table would parse the input
string (a • a) in the following sequence of moves:
THEOREM 5.4
Algorithm 5.1 produces a valid parsing table for an LL(I) grammar G.
Proof. We first note that if G is an LL(1) grammar, then at most one
value is defined in step (1) of Algorithm 5.1 for each entry M(A, a) of the
parsing matrix. This observation is merely a restatement of Theorem 5.3.
Next, a straightforward induction on the number of moves executed by
a 1-predictive parsing algorithm ~ using the parsing table M shows that if
(xy, S$, e)1--~-(y, oc$, zt), then S ~ x0~. Another induction on the number
of steps in a leftmost derivation can be used to show the converse, namely
348 ONE-PASS NO BACKTRACK PARSING CHAP. 5
that if S"=~ xa, where a is the open portion of xa, and FIRSTI(y ) is in
FIRSTI(a), then (xy, S$, e) l * (Y, aS, n). It then follows that (w, S$, e)
I*-- (e, $, ~z) if and only if S "=* w. Thus ff is a valid parsing algorithm for
G, and M a valid parsing table for G. V-]
Example 5.13
Let L, = [e, abb) and L2 : [b, bab]. Then L1 O2 L2 = {b, ba, ab}.
P r o o f Exercise. [B
DEFINITION
which can be used in the first step of a derivation A x ==~ uv for any x ~ L
and v ~ Z*. Each set of strings Yt gives all possible prefixes of length
up to k of terminal strings which can follow a string derived from B~ when
we use the production A ~ a, where a = x o B l x i B 2 x 2 . . . BmXm, in any
derivation of the form A x ~ ax ~ uv, with x in L.
lm
By Theorem 5.2, G = (N, Z, P, S) is not LL(k) if and only if there exists
in (N U Z)* such that
Example 5.14
Consider the LL(2) grammar
S > aAaalbAba
A >ble
Let us compute the LL(2) table Ts, te~, which we shall denote To. Since
S ---~aAaa is a production, we compute FIRST2(aAaa ) Q2 {e} = {aa, ab}.
Likewise, S---~ bAba is a production, and FIRST2(bAba ) e 2 [e} = {bb}.
Thus we find To(aa)= (S---~ aAaa, Y). Y is the local follow set for A;
Y = FIRST2(aa)02 [e] = [aa}. The string aa is the string to the right of
A in the production S---~ aAaa. Continuing in this fashion, we obtain the
table To shown below"
Table To
Production Sets
aa S ~ aAaa {aa}
ab S ~ aAaa {aa}
bb S ~ bAba {ba}
add to 3 the LL(k) table Ts,,r,, for 1 < i < m, if Ts,,r, is not already in 3.
(4) Repeat step (3) until no new LL(k) tables can be added to 3.
Example 5.15
Let us construct the relevant set of LL(2) tables for the grammar
S ~ aAaalbAba
A >ble
ba A ----~b
aa a----~e u
ba a ----~ e
bb A ----~ b
At this point 3 = {Ts,{e], ZA,{aa} , ZA,{ba] ~ and no new entries can be added to
3 in Algorithm 5.2 so that the three LL(2) tables in 3 are the relevant LL(2)
table for G.
From the relevant set of LL(k) tables for an LL(k) grammar G we can
use the following algorithm to construct a valid parsing table for G. The k-
predictive parsing algorithm using this parsing table will actually use the
LL(k) tables themselves as nonterminal symbols on the pushdown list.
ALGORITHM 5.3
A parsing table for an LL(k) grammar G = (N, X, P, S).
Input. An LL(k) C F G G = (N, X, P, S) and 3, the set of LL(k) tables
for G.
Output. M, a valid parsing table for G.
Method. M is defined on (3 U X u [$}) × X.k as follows:
(1) If A --~ xoBlxlB2x2 ... BmX m is the ith production in P and TA,L is
352 ONE-PASS NO BACKTRACK PARSING CHAP. 5
using the relevant set of LL(2) tables constructed in Example 5.15. The pars-
ing table resulting from Algorithm 5.3 is shown in Fig. 5.5. In Fig. 5.5,
To = Ts, t,~, T1 = Ta,~aa~, and T2 = TA,Cba~-Blank entries indicate error.
aa ab a ba bb b e
T1 e,4 b,3
T2 e,4 b,. 3
accept
THEOREM 5.5
(5.1.2) S "=~ x0c if and only if there is some ~' in (3 u ~)* such that
h(og) = ~, and (xy, To$, e)I--~- (y, oc'$, n) for all y such that
0~~ y. To is the LL(k) table associated with S and {el.
If." From the manner in which the parsing table is constructed, whenever
a production number i is emitted corresponding to the ith production
A ---~fl, the parsing algorithm replaces a table T such that h(T) = A by a string
fl' such that h ( f l ' ) = ft. The "if" portion of statement (5.1.2)can thus be
proved by a straightforward induction on the number of moves made by
the parsing algorithm.
Only if." Here we shall show that
(5.1.3) If A ===~x, then the parsing algorithm will make the sequence of
moves (xy, T, e) ~ (y, e, n) for any LL(k) table T associated with
A and L, where L = FIRSTk(0 0 for some ~ such that S ==~wAoc,
lm
and y is in L.
(YzX2 " ' " ymXmY, T2x2 " ' " TmXm, iffl)
,
As another example, let us construct the parsing table for the LL(2)
grammar G 2 of Example 5.3.
Example 5.17
Consider the LL(2) grammar G2
(1) S >e
(2) S > abA
(3) A ~ Saa
(4) A >b
Let us first construct the relevant LL(2) tables for G 2. We begin by con-
structing To = Ts.t,~"
see. 5.1 LL(k) GRAMMARS 355
Table To
Production Sets
e S-----re m
ab S~ abA {e]
b A --, b
aa A---~ Saa [aa}
ab A---~ Saa {aa}
aa S ---~ e m
ab S ~ abA {aa}
aa A ----~Saa {aa}
ab A.--, Saa {aa}
ba A --, b
From these LL(2) tables we obtain the parsing table shown in Fig. 5,6.
The 2-predictive parsing algorithm using this parsing table would parse
the input string abaa by the following sequence of moves"
(abaa, To $, e) R (abaa, ab T1 $, 2)
1- (baa, bT1 $, 2)
R (aa, Ti $, 2)
[-- (aa, T2aa$, 23)
i-- (aa, aa$, 231)
[- (a, aS, 231)
t- (e, $, 231) [B
356 ONE-PASSNO BACKTRACK PARSING CHAP. 5
aa ab a ba bb b e
To ab T1, 2 e, 1
Tl T2aa, 3 T2aa, 3 b,4
T2 e, 1 ab T3 , 2
T3 T2aa, 3 T2aa, 3 b, 4
a pop pop pop
S accept
(1) S ==~wAoc,
tm
(4) As F~_~(A) ~ Ft(A) ~ E,k for all A and i, eventuaIly we must reach
an i for which Ft_i(A) = Ft(A) for all A in N. Let FIRSTk(A ) = Ft(A) for
that value of i. E]
Example 5.18
Let us construct the sets F~(X), assuming that k has the value l, for
the grammar G with productions
S > BA
A >-q-BAle
B > DC
C > , DCle
D > (S)la
Initially,
F0(S) = F0(B) =
Fo(A) = [-t-, e}
Fo(C) = [*, e}
Fo(D) = [(, a}
Then F,(B)={(, a} and Fi(X) = Fo(X) for all other X. Then F z ( S ) = [(, a}
and F 2 ( X ) = El(X) for all other X. F3(X) = F2(X) for all X, so that
THEOREM 5.7
Algorithm 5.5 correctly computes FIRSTk(A ).
Proof. We observe that if for all X in N U ~, Fi_i(X)= Ft(X), then
SEC. 5.1 LL(k) GRAMMARS 359
Ft(X) = Fj(X) for all j > i and all X. Thus we must prove that x is in
FIRSTk(A ) if and only if x is in Fj(A) for some j.
lf: We show that F / A ) ~ FIRSTk(A) by induction on j. The basis,
j = 0, is trivial. Let us consider a fixed value of j and assume that the
hypothesis is true for smaller values ofj.
If x is in Fj(A), then either it is in Fj_ i(A), in which case the result is
immediate, or we can find A ~ Y1 . . . Yn in P, with xp in F~_ l(Yp), where
1 ~ p ~ n, such that x = FIRSTk(x 1 . . . xn). By the inductive hypothesis,
xp is in FIRSTk(Yp). Thus there exists, for each p, a derivation Y~ ==~ yp,
where xp is FIRSTk(yp). Hence, A =-~ y l . . . y ~ . We must now show that
x = FIRSTk(y I . . . y~), and thus conclude that x is in FIRSTk(A ).
Case 1: tx~ . . . xn[ < k. Then yp = xp for each p, and x = y t . . . y ~ .
Since Yl "'" Yn is in FIRSTk(A ) in this case, x is in FIRSTk(A ).
Case 2." For some s > 0, Ix1 " " x,I < k but Ix1 " . . x~+11>_ k. Then
yp = xp for 1 ~ p ~ s, and x is the first k symbols of x~ . . . x,+~. Since
x,+~ is a prefix of Ys+l, x is a prefix of y1 . . . y,+l and hence of y~ . . . y~.
Thus, x is FIRSTk(A ).
/.
ALGORITHM 5.6
Computation of tr(A).
Input. A C F G G = (N, X, P, S).
Output.
Method. We shall compute, for all A and B in N, sets a(A, B) such that
$
a(A,B)={L[LcE* k, and for some x and a, A = ~ x B a and L =
lm
FIRST(e)}. We construct sets tri(A, B) for each A and B and for i = 0, 1. . . .
as follows"
(1) Let a o ( A , B ) = {L ~ Z*k[A---~flBa is in P and L = FIRST(a)}.
(2) Assume that tr~_~(A, B) has been computed for all A and B. Define
at(A, B) as follows"
(a) If L is in at_ i(A, B), place L in tri(A, B).
(b) If there is a production A ---~ X1 . " X, in P, place L in at(A, B)
if for some j, 1 _~j < n, there is a set L' in trt_~(Xj, B) and L =
L' ~ k FIRST(Xj+i) @~k "'" ~ k FIRST(X,).
(3) When for some i, an(A, B) = a~_ x(A, B) for all A and B, let a(A, B) =
at(A, B). Since for all i, a~_~(A, B ) ~ at(A, B ) ~ (P(Z*k), such an i must
exist.
(4) The desired set is a(S, A). [~]
THEOREM 5.8
In Algorithm 5.6, L is in a(S, A) if and only if for some w ~ Z* and
a ~ (N u Z)*, S ==~wAa and L = FIRSTk(a ).
lm
Proof The proof is similar to that of the previous theorem and is left
for the Exercises.
Example 5.19
Let us test the grammar G with productions
S > AS[e
A ~ aAlb
for the LL(1) condition.
We begin by computing F I R S T i ( S ) = [e, a, b] and FIRSTi(A) = {a, b}.
We then must compute a(S) = a(S, S) and a(A) = a(S, A). From step
(1) of Algorithm 5.4 we have
EXERCISES
S~ aAaBl bAbB
A- ~ alab
B- ~aBla
A ------~ 0cA'
h ' ---> ]31~,
portion of the input derived from X1 (the symbol X1 is the left corned
and the next k input symbols. In the formal definition, should Xx be
a terminal, we may then look only at the next k - 1 symbols. This
restriction is made for the sake of simplicity in stating an interesting
theorem which will be Exercise 5.1.33. In Fig. 5.7, we would recognize
production A ~ X1 . . . Xp after seeing wx and the first k symbols
(k - 1 if Xa is in E) of y. Note that if G were LL(k), we could recognize
the production "sooner," specifically, once we had seen w and
FIRSTk(Xy).
w x y
Example 5.20
Consider the following grammar G with productions
(1) S - - * S + A (2) S--~ A
(3) A --> A . B (4) A--> B
(5) B---> ( S ) (6) B ~ a
G is an LC(1) grammar. G is, in fact, Go slightly disguised. A left-
corner parsing table for G is shown in Fig. 5.8.
The parser using this left-corner parsing table would make the
following sequence of moves on input ~ a , a):
A a[A,B] , 6 <S>[A,B] , 5
L. . . . . . . . . . . , ,, ~h
B a[B,B], 6 <S>[B,B], 5
a pop
< pop
> pop
+[ pop
pop
sl accept
The reader can easily verify that 56436242 is the correct left corner
parse for (a • a).
"5.1.27. Show that the grammar with productions
S----+ A [ B
A ~ aAblO
B - > aBbb l 1
E----->.E + T [ T
T--->" T . F [ F
F----> PI" F I P
P----~" (E) l a
Research Problems
5.1.35. Find transformations which can be used to convert non-LL(k) grammars
into equivalent LL(1) grammars.
Programming Exercises
5.1.36. Write a program that takes as input an arbitrary C F G G and con-
structs a 1-predictive parsing table for G if G is LL(i).
5.1.37. Write a program that takes as input a parsing table and an input string
and parses the input string using the given parsing table.
5.1.38. Transform one of the grammars in the Appendix into an LL(1) grammar.
Then construct an LL(1) parser for that grammar.
5.1.39. Write a program that tests whether a grammar is LL(1).
Let M be a parsing table for an LL(1) grammar G. Suppose that we
are parsing an input string and the parser has reached the configuration
(ax, Xoc, ~z). If M(X, a ) = error, we would like to announce that an
error occurred at this input position and transfer to an error recovery
routine which modifies the contents of the pushdown list and input
tape so that parsing can proceed normally. Some possible error recovery
strategies are
(1) Delete a and try to continue parsing.
(2) Replace a by a symbol b such that M(X, b ) ~ error and con-
tinue parsing.
(3) Insert a symbol b in front of a on the input such that M(X, b)
error and continue parsing. This third technique should be used with
care since an infinite loop is easily possible.
(4) Scan forward on the input until some designated input symbol
b is found. Pop symbols from the pushdown list until a symbol X is
found such that X : ~ bfl for some ft. Then resume normal parsing.
We also might list for each pair (X, a) such that M(X, a) = error,
several possible error recovery methods with the most promising
method listed first. It is entirely possible that in some situations inser-
368 ONE-PASS NO BACKTRACK PARSING CHAP. 5
BIBLIOGRAPHIC NOTES
LL(k) grammars were first defined by Lewis and Steams [1968]. In an early
version of that paper, these grammars were called TD(k) grammars, TD being an
acronym for top-down. Simple LL(1) grammars were first investigated by Koren-
jak and Hopcroft [1966], where they were called s-grammars.
The theory of LL(k) grammars was extensively developed by Rosenkrantz and
Steams [1970], and the answers to Exercises 5.1.21-5.1.24 can be found there.
LL(k) grammars and other versions of deterministic top-down grammars have
been considered by Knuth [1967], Kurki-Suonio [1969], Wood [1969a, 1970] and
Culik [1968].
P.M. Lewis, R.E. Stearns, and D.J. Rosenkrantz have designed compilers
for ALGOL and FORTRAN whose syntax analysis phase is based on an LL(1)
parser. Details of the ALGOL compiler are given by Lewis and Rosenkrantz
[1971]. This reference also contains an LL(1) grammar for ALGOL 60.
LC(k) grammars were first defined by Rosenkrantz and Lewis [1970]. Clues
to Exercises 5.1.27-5.1.34 can be found there.
list, the PDT will shift some number (possibly none) of the leading symbols
of x onto the pushdown list until the right end of the handle of o~t is found.
In this case, the string y is shifted onto the pushdown list.
Then the PDT must locate the left end of the handle. Once this has been
done, the PDT will replace the handle (here fly), which is on top of the push-
down list, b y the appropriate nonterminal (here B) and emit the number of
the production B----~ fly. The PDT now has yB on its pushdown list, and
the unexpended input is z. These strings are the open and closed portions,
respectively, of the right-sentential form ~_ i.
Note that the handle of ocAx can never lie entirely within ~, although it
could be wholly within x. That is, ~t-1 could be of the form o~AxlBx2, and
a production of the form B ---~y, where x l y x 2 = x could be applied to obtain
0~. Since Xl could be arbitrarily long, many shifts may occur before ~t can be
reduced to ~_ 1.
To sum up, there are three decisions which a shift-reduce parsing
370 ONE-PASS NO BACKTRACK PARSING CHAP. 5
algorithm must make. The first is to determine before each move whether to
shift an input symbol onto the pushdown list or to call for a reduction. This
decision is really the determination of where the right end of a handle occurs
in a right-sentential form.
The second and third decisions occur after the right end of a handle is
located. Once the handle is known to lie on top of the pushdown list, the left
end of the handle must be located within the pushdown list. Then, when
the handle has been thus isolated, we must find the appropriate nonterminal
by which it is to be replaced.
A grammar in which no two distinct productions have the same right
side is said to be uniquely invertible (UI) or, alternatively, backwards deter-
ministic. It is not difficult to show that every context-free language is gener-
ated by at least one uniquely invertible context-free grammar.
If a grammar is uniquely invertible, then once we have isolated the handle
of a right-sentential form, there is exactly one nonterminal by which it can
be replaced. However, many useful grammars are not uniquely invertible,
So in general we must have some mechanism for knowing with which non-
terminal to replace a handle.
Example 5.21
Let us consider the grammar G with the productions
(1) S ~ SaSb
(2) S ~e
Let us parse the sentence aabb using a pushdown list and a shift-reduce
parsing algorithm. We shall use $ as an endmarker for both the input string
and the bottom of the pushdown list.
We shall describe the shift-reduce parsing algorithm in terms of configu-
rations consisting of triples of the form (~X, x, n), where
(1) ~X represents the contents of the pushdown list, with X on top;
(2) x is the unexpended input; and
(3) n is the output to this point.
We can picture this configuration as the configuration of an extended PDT
with the state omitted and the pushdown list preceeding the input. In
Section 5.3.1 we shall give a formal description of a shift-reduce parsing
algorithm.
Initially, the algorithm will be in configuration ($, aabb$, e). The algo-
SEC. 5.2 DETERMINISTIC BOTTOM-UP PARSING 371
rithm must then recognize that the handle of the right-sentential form aabb
is e, occurring at the left end, and that this handle is to be reduced to S.
We defer describing the actual mechanism whereby handle recognition
occurs. Thus the algorithm must next enter the configuration ($S, aabb$, 2).
It will then shift an input symbol on top of the pushdown list to enter
configuration ($Sa, abb$, 2). Then it will recognize that the handle e is
on top of the pushdown list and make a reduction to enter configuration
( $SaS, abb $, 22).
Continuing in this fashion, the algorithm would make the following
sequence of moves:
In this section we shall define a large class of grammars for which we can
always construct deterministic right parsers. These grammars are the LR(k)
grammars.
Informally, we say that a grammar is LR(k) if given a rightmost derivation
S = ao - ~ al =~ a2 ~ . . . - ~ am = z, we can isolate the handle of each
rm rm rm rm
right-sentential form and determine which nonterminal is to replace the han-
dle by scanning ai from left to right, but only going at most k symbols past
the right end of the handle of ai.
Suppose that a,_~ = aAw and a~ = aflw, where fl is the handle of a~.
Suppose further that /3 = X1X2... X,.. If the grammar is LR(k), then we
can be sure of the following facts:
(1) Knowing aX~X~ ... Xj and the first k symbols of Xj+~... X,w, we
can be certain that the right end of the handle has not been reached until
j = r .
(2) Knowing aft and at most the first k symbols of w, we can always deter-
mine that fl is the handle and that fl is to be reduced to A.
372 ONE-PASSNO BACKTRACK PARSING CHAP. 5
(3) When ~_ ~ = S, we can signal with certainty that the input string is
to be accepted.
Note that in going through the sequence ~m, ~ - ~ , ' ' ' , ~0, we begin by
looking at only FIRSTk(~m)= FIRSTk(w ). At each step our lookahead
string will consist only o f k or fewer terminal symbols.
We Shall now define the term LR(k) grammar. But before we do so, we
first introduce the simple concept of an augmented grammar.
DEFINITION
DEFINITION
Let G = (N, E, P, S) be a C F G and let G ' = (N', ~, P', S') be its aug-
mented grammar. We say that G is LR(k), k ~ 0, if the three conditions
:g
(1) S =:~ ~aw ::~ ~pw,
rill I'm
(3) F I R S T k ( w ) = FIRSTk(y )
imply that otAy = yBx.
The reason we cannot always use this definition is that if the start symbol
appears on the right side of some production we may not be able to determine
whether we have reached the end of the input string and should accept or
whether we should continue parsing.
Example 5.22
Consider the grammar G with the productions
S >Sa[a
If we ignore the restriction against the start symbol appearing on the right
side of a production, i.e., use the alternative definition, G would be an LR(0)
grammar.
However, using the correct definition, G is not LR(0), since the three
conditions
0
(1) S'"---~ S"----'-~ S,
Gt rnl Gt rrll
Perhaps the best way to describe the behavior of an LR(k) parser is via
a running example.
Let us consider the g r a m m a r G of Example 5.21, which we can verify is
an LR(1) g r a m m a r . The a u g m e n t e d g r a m m a r G' is
(0) S ' - - + S
(1) S - >SaSb
(2) S >e
To 2 X 2 T1 X X
T1 S X A X T2 X
T2 2 2 X T3 X X
T3 S S X X T4 Ts
75 2 2 X r6 X X
T5 1 X 1 X X X
r6 S S X X T4 TT,
T7 1 X X X X
Legend
i -= reduce using production i
S ~ shift
A -= accept
X ~ error
Example 5.23
Let us apply Algorithm 5.7 to the initial configuration (To, aabb, e)
using the LR(1) tables of Fig. 5.9. The lookahead string here is a. The parsing
action function of To on a is reduce 2, where production 2 is S --~ e. By step
(2b), we are to remove 2[el = 0 symbols from the pushdown list and emit 2.
The table on top of the pushdown list after this process is still T 0. Since
the goto part of table To with argument S is T~, we then place STi on top of
the pushdown list to obtain the configuration (ToST~, aabb, 2).
Let us go through this cycle once more. The lookahead string is still a.
The parsing action of T~ on a is shift, so we remove a from the input and
place a on the pushdown list. The goto function of Tt on a is T2, so after this
step we have reached the configuration (ToSTlaT2, abb, 2).
Continuing in this fashion, the LR parser would make the following
sequence of moves:
rm
Example 5.24
S - - - ~ CID
C > aClb
D > aDlc
t In fact, G~ is LR(0).
378 ONE-PASS NO BACKTRACK PARSING CHAP. 5
Let us refer to the LR(1) definition, and suppose that we have derivation
S' ~ tzAw =~, ocflw and S ' ==~ 7 B x =~ t~fly. Then since G'i is right-linear,
VIII rm rm r/n
Example 5.25
Let G 2 be the left-linear g r a m m a r with productions
S= > A b l Bc
A~ > Aale
B >Bale
Note that L(G2) = L(G1) for G1 above. However, G 2 is not LR(k) for any k.
Suppose that G2 is LR(k). Consider the two rightmost derivations in
the a u g m e n t e d g r a m m a r G~"
These two derivations satisfy the hypotheses of the LR(k) definition with
tz = e, fl = e, w = akb, y = e, and y = akc. Since A ~ B, G2 is not LR(k).
Moreover, this violation of the LR(k) condition holds for any k, so that
G2 i s n o t L R . 5
Example 5.26
A situation in which the location of a handle cannot be uniquely deter-
mined is found in the grammar G 3 with the productions
S-----~AB
A ----~ a
B ~ CDIaE
C .-----~. ab
D- >bb
E- > bba
G3 is not LR(1). We can see this by considering the two rightmost derivations
in the augmented grammar"
and
S' ~ S ~ AB ~ AaE ~ Aabba
u A w
In Example 5.25 we would argue that G 2 is not LR(k) because after seeing
the first k a's, we cannot determine whether production A ~ e or B ---~ e is
to be used to derive the empty string at the beginning of the input. We
cannot tell which production is used until we see the last input symbol, b
or c. In Chapter 8 we shall try to make rigorous arguments of this type, but
although the notion is intuitively appealing, it is rather difficult to formalize.
for aA from those for aft in a way that can be "implemented" on a pushdown
transducer (or possibly in any other convenient way). Thus we must consider
a table that includes enough information to computethe table corresponding
to aA from that for aft if it is decided that aAw ==~ apw for an appropriate w.
rm
We thus make the following definitions.
DEFINITION
Note that fix may be e and that every viable prefix has at least one valid
LR(k) item.
Example 5.27
Consider grammar G1 of Example 5.24. Item [C ---~ a • C, e] is valid for
aaa, since there is a derivation S ~ aaC => aaaC. That is, ~ = aa and
rill rill
w = e in this example.
Note the similarity of our definition of item here to that found in the
description of Earley's algorithm. There is an interesting relation between
the two when Earley's algorithm is applied to an LR(k) grammar. See
Exercise 5.2.16.
The LR(k) items associated with the viable prefixes of a grammar are
the key to understanding how a deterministic right parser for an LR(k)
grammar works. In a sense we are primarily interested in LR(k) items of
the form [A ~ fl -, u], where the dot is at the right end of the production.
These items indicate which productions can be used to reduce right-sententia!
forms. The next definition and the following theorem are at the heart of
LR(k) parsing.
DEFINITION
We define the e-free first function, EFF~(00 as follows (we shall delete
the k and/or G when clear)"
(1) If a does not begin with a nonterminal, then EFFk(a) = FIRSTk(00.
(2) If a begins with a nonterminal, then
*
Example 5.28
Consider the grammar G with the productions
S >AB
A >Bale
B~ > CbIC
C >cle
FIRST2(S) = {e, a, b, c, ab, ac, ba, ca, cb}
EFF2(S ) = {ca, cb} D
where FIRSTk(w ) = FIRSTk(y ) and l r,~t >_ Is,at but ?Bx ~ gay.
Proof. We know by the LR(k) definition that we can find derivations
satisfying all conditions, except possibly the condition ]76[ > I~Pl. Thus,
assume that Iral < I~Pl. We shall show that there is another counter-
example to the LR(k) condition, where ?~ plays the role of t~,a in that con-
dition.
Since we are given that ?6x = ~fly and i ral < I~/~ !, we find that for some
z in E+, we can write gfl = ?6z. Thus we have the derivations
and
S' ~ ~Aw ~ ~pw = ?d~zw
rill rill
SEC. 5.2 DETERMINISTIC BOTTOM-UP PARSING 383
:o
S' ~ ocAw ~ ~pw
rill rill
and
where FIRSTk(W) = FIRSTk(x ) = u ---- v. Since the two items are distinct,
either A ~ A~ or fl ~ fl~. In either case we have a violation of the LR(k)
definition.
384 ONE-PASS NO BACKTRACK PARSING CHAP. 5
and
where aft = a~fl~ and FIRSTk(zx ) = u. But then G is not LR(k), since
a A z x cannot be equal to a~Alx if z ~ E+.
Case 3: Suppose that f12 contains at least one nonterminal symbol. Then
f12 ==~ u~Bu3 ==~ u~u2u3, where u~u2 ~ e, since a leading nonterminal is not
rm rm
and
rm rill
alfllu~Bu3x ~ alfllulu2u3x
I'm rm
such that alfl~ = aft and uluzu3x = uy. The LR(k) definition requires that
OCAUlUzU3X = ~xfllulBu3x. That is, aAu~u2 = a~fl~u~B. Substituting aft for
a~fl~, we must have Aulu2 = flu~B. But since uluz ~ e, this is impossible.
Note that this is the place where the condition that u is in EFFk(fl2v ) is
required. If we had replaced E F F by F I R S T in the statement of the theorem,
then u~u2 could be e and aAu~u2u3x could then be equal to a~fl~u~Bu3x (if
ulu2 = e and fl = e).
If: Suppose that G is not LR(k). Then there are two derivations in the
augmented grammar
and
S' ~ },Bx
rm
such that the length of its open portion is no more than lafll + 1. That is,
I~A~ I _<[~fll + 1. Then we can write (5.2.2) as
(5.2.3)
rm rm rm
where alfl = aft. Thus el = a and aAy = o~Bx, contrary to the hypothesis
that G is not LR(k). [~]
ALGORITHM 5.8
Construction of V~(?).
Input. C F G G = (N, X, P, S) and 7 in (N u X)*.
Output. V~(r).
Method. If 7 = X t X 2 " ' " X,, we construct V~(7)by constructing Vk(e),
v~(x~), v~(xxx~), . . ., v~(x,x~ ... x.).
(1) We construct Vk(e) as follows"
(a) If S ~ ~ is in P, add [S ----, • ~, e] to Vk(e).
(b) If [A --~ • B~, u] is in Ve(e) and B ~ fl is in P, then for each x
in FIRSTk(~u) add [ B - - , . ,8, x] to Vk(e), provided it is not
already there.
(c) Repeat step (b) until no more new items can be added to Vk(e).
(2) Suppose that we have constructed V,(X1X2... Xi_x), i ~ n. We
construct Vk(XtX2 "'" Xi) as follows"
(a) If [A---* ~ - X , fl, u] is in Vk(Xx... Xi_,), add [A--* ~Xt" fl, u]
to Vk(X, . . . X,).
(b) If [A --* ~ • Bfl, u] has been placed in Vk(Xi "'" X3 and B ---, J
is in P, then add [B---.. J, x] to V k ( X , . . . X3 for each x in
FIRSTk(flu), provided it is not already there.
(c) Repeat step (2b) until no more new items can be added to
v~(x, ... x,). El
DEFINITION
Note that step (2) is really independent of X1 . " X~_ ,, depending only on the
set Vk(Xi ... Xt-,) itself.
Example 5.29
Let us construct Va(e), V,(S), and Vx(Sa) for the augmented grammar
S' >S
S > SaSb
S >e
SEC. 5.2 DETERMINISTIC BOTTOM-UP PARSING 387
(Note, however, that Algorithm 5.8 does not require that the grammar be
augmented.) We first compute V(e) using step 1 of Algorithm 5.8. In step
(la) we add [S' --, • S, e] to V(e). In step (lb) we add [S ----~ • SaSb, e] and
[S ~ . , e] to V(e). Since [S ----~ • SaSb, e] is now in V(e), we must also add
IS ---~ • SaSb, x] and [S ---~., x] to V(e) for all x in FIRST(aSb) = a. Thus,
V(e) contains the following items"
IS' ~ • S, e]
[S --->-. SaSb, e/a]
[S~ > ., e/a]
Here we have used the shorthand notation [A ~ ~x. ,8, x ~ / x 2 / " ' / x , ] for
the set of items [A ----~~ . ,8, x~], [A ---~ 0~ • fl, x 2 ] , . . . , [A ~ ~ . fl, x,]. To
obtain V(S), we compute GOTO(V(e), S). F r o m step (2a) we add the three
items [S' ---~ S . , e] and IS ---~ S . aSb, e/a] to V(S). Computing the closure
adds no new items to V(S), so V(S) is
IS' > S . , e]
[S ~ S . aSb, e/a]
• 'TXfly =-~ ocAw, where ~'~, = X ~ . . . X~_~, and every step in the deri-
rill rill
S =~ X , . . . X , A @ =~ X , . . . X, A z =~ X , . . . X, fl2z,
rm rm rm
Algorithm 5.8 provides a method for constructing the set of LR(k) items
valid for any viable prefix. In the construction of a right parser for an LR(k)
grammar G we are interested in the sets of items which are valid for all viable
prefixes of G, namely the collection of the sets of valid items for G. Since
a grammar contains a finite number of productions, the number of sets of
SEC. 5.2 DETERMINISTIC BOTTOM-UP PARSING 389
items is also finite, but often very large. If ? is a viable prefix of a right
sententiai form ?w, then we shall see that Vk(?) contains all the information
about ? needed to continue parsing yw.
The following algorithm provides a systematic method for computing
the sets of LR(k) items for G.
ALGORITHM 5.9
DEFINITION
If G is a CFG, then the collection of sets of valid LR(k) items for its
augmented grammar will be called the canonical collection of sets of LR(k)
items for G.
Note that it is never necessary to compute GOTO(~, S'), as this set of
items will always be empty.
Example 5.30
Let us compute the canonical collection of sets of LR(1) items for the
grammar G whose augmented grammar contains the productions
St = > S
S > SaSb
S >e
GOTO(120, a) and GOTO(a0, b) are both empty, since neither a nor b are
viable prefixes of G. Next we must compute GOTO(al, X) for X ~ {S, a, b}.
GOTO(~i, S) and GOTO(~i, b) are empty and a; 2 = GOTO(12~, a) is
Grammar Symbol
S a b
of 121 122 m
Items 122 123
123 124 t25
124 126
125
126 124 127
127
Note that GOTO(12, X) will always be empty if all items in 12 have the dot at
the right end of the production. Here, 125 and 127 are examples of such sets
of items.
The reader should note the similarity in the GOTO table above and
the GOTO function of the LR(1) parser for G in Fig. 5.9. [Z]
SEC. 5.2 DETERMINISTIC BOTTOM-UP PARSING 391
THEOREM 5.11
Algorithm 5.9 correctly determines ~.
Proof. By Theorem 5.10 it sumces to prove that a set of items (~ is placed
in S if and only if there exists a derivation S =* ocAw =~ 6¢flw, where 7 is
rm rm
a prefix of 0eft and 6 - - Vk(7). The "only if" portion is a straightforward
induction on the order in which the sets of items are placed in S. The "if"
portion is a no less straightforward induction on the length of 7. These are
both left for the Exercises. [-]
Example 5.31
Let us test the grammar in Example 5.30 for LR(l)-ness. We have
S = {60, • • •, 67}. The only sets of LR(1) items which need to be tested are
those that contain a dot at the right end of a production. These sets of items
are 60, 6 I, 6 z, 64, 65, and 67.
Let us consider 60. I n the items [S' --, • S, e] and [ S - - , • SaSb, e/a] in
6o, EFF(S) and EFF(Sa) are both empty, so no violation of consistency
with the items [S - - , . , e/a] occurs.
392 ONE-PASS NO BACKTRACK PARSING CHAP. 5
canonical collection of sets of LR(k) items for G, then there can be no con-
flicts between actions specified by rules (la), (1 b), and (l c) above.
We say that the table T(~) is associated with a viable prefix ? of G if
a =
DEFINITION
The canonical set of LR(k) tables for an LR(k) grammar G is the pair
(~, To), where 5 the set of LR(k) tables associated with the canonical collec-
tion of sets of LR(k) items for G. T o is the LR(k) table associated with V~(e).
W e shall usually represent a canonical LR(k) parser as a table, of which
each row is an LR(k) table.
The LR(k) parsing algorithm given as Algorithm 5.7 using the canonical
set of LR(k) tables will be called the canonical LR(k)parsing algorithm or
canonical LR(k) parser, for short.
W e shall now summarize the process of constructing the canonical set of
LR(k) tables from an LR(k) grammar.
ALGORITHM 5.11
(o) s'
(1) S > SaSb
(2) S >e
Let us construct To = <f0, go>, the table associated with a~0. Since k = 1,
the possible lookahead strings are a, b, and e. Since ~o contains the items
[S ~ . , e/a], f o ( e ) = fo(a)= reduce 2. From the remaining items in ~to we
determine that fo(b)= error [since EFF(S00 is empty]. To compute the
GOTO function go, we note that GOTO(a~o, S ) = ~i and GOTO(~o, X) is
empty otherwise. If T1 is the name given to T(~i), then go(S)= Tx and
go(X) = e r r o r for all other X. We have now completed the computation of
To. We can represent To as follows"
fo go
b ,Sc a b
To 2 X 2 T1 X X
Y*, then the right end of the handle of Xa . . . X~uy must occur somewhere
to the right of X~.
sEc. 5.2 DETERMINISTIC B O T T O M - U P P A R S I N G 395
=~ e, ==~ w and S ==~ fl~ =~ . . . ==~ tim :=~ W, consider the smallest i such
rm rm rm rm rm
that OCn-t ~ flm-r A violation of the LR(k) definition for any k is immediate.
We leave details for the Exercises. It follows that the canonical LR(k) parsing
algorithm for an LR(k) grammar G produces a right parse for an input w
if and only if w ~ L(G).
It may not be completely obvious at first that the canonical LR(k) parser
operates in linear time, even when the elementary operations are taken to
be its own steps. That such is the case is the next theorem.
THEOREM 5.13
Both the LL(k) and LR(k) parser implementations seem to require plac-
ing large tables on the pushdown list. Actually, we can avoid this situation,
as follows:
(1) Make one copy of each possible table in memory. Then, on the push-
down list, replace the tables by pointers to the tables.
(2) Since both the LL(k) tables and LR(k) tables return the names of
other tables, we can use pointers to the tables instead of names.
We note that the grammar symbols are actually redundant on the push-
down list and in practice would not be written there.
EXERCISES
item [A--~ 0c-fl, j] (in the sense of Earley's algorithm) on list Ii.
.
Show that there is a derivation S ==~ y x such that item [A - ~ 0c • fl, u]
rm
"5.2.18. Use Exercise 5.2.16 to show that if G is LR(k), then Earley's algorithm
with k symbol lookahead (see Exercise 4.2.17) takes linear time and
space.
5.2.19. Let X be any symbol. Show that EFFk(Xa) = EFF~(X) ~ FIRST~(a).
5.2.20. Use Exercise 5.2.19 to give an efficient algorithm to compute EFF(t~)
for any 0~.
5.2.21. Give formal details to show that cases 1 and 3 of Theorem 5.9 yield
violations of the LR(k) condition.
5.2.22. Complete the proof of Theorem 5.11.
5.2.23. Prove the correctness of Algorithm 5.10.
5.2.24. Prove the correctness of Algorithm 5.11 by showing each of the obser-
vations following that algorithm.
In Chapter 8 we shall prove various results regarding LR grammars.
The reader may wish to try his hand at some of them now (Exercises
5.2.25-5.2.28).
• *5.2.25. Show that every LL(k) grammar is an LR(k) grammar.
• *5.2.26. Show that every deterministic C F L has an LR(1) grammar.
• 5.2.27. Show that there exist grammars which are (deterministically) right-
parsable but are not LR.
• 5.2.28. Show that there exist languages which are LR but not LL.
Programming Exercises
5.2.33, Write a program to test whether an arbitrary grammar is LR(1). Esti-
mate how much time and space your program will require as a function
of the size of the input grammar.
5.2.34. Write a program that uses an LR(1) parsing table as in Fig. 5.9 to parse
input strings.
5.2.35. Write a program that generates an LR(1) parser for an LR(1) grammar.
5.2.36. Construct an LR(1) parser for a small grammar.
*5.2.37. Write a program that tests whether an arbitrary set of LR(1) tables
forms a valid parser for a given CFG.
Suppose that an LR(1) parser is in the configuration (ocT, ax, 7t)
and that the parsing action associated with T and a is error. As in LL
parsing, at this point we would like to announce error and transfer to
an error recovery routine that modifies the input and/or the pushdown
list so that the LR(1) parser can continue. As in the LL case we can
delete the input symbol, change it, or insert another input symbol
depending on which strategy seems most promising for the situation
at hand.
Leinius [1970] describes a more elaborate strategy in which LR(1)
tabIes stored in the pushdown list are consulted.
5.2.38. Write an LR(1) grammar for a sma11 language. Devise an error recovery
procedure to be used in conjunction with an LR(i) parser for this
grammar. Evaluate the efficacy of your procedure.
BIBLIOGRAPHIC NOTES
LR(k) grammars were first defined by Knuth [1965]. Unfortunately, the method
given in this section for producing an LR parser will result in very large parsers
for grammars of practical interest. In Chapter 7 we shall investigate techniques
developed by De Remer [1969], Korenjak [1969], and Aho and Ullman [i971],
which can often be used to construct much smaller LR parsers.
The LR(k) concept has also been extended to context-sensitive grammars by
Walters [1970].
The answer to Exercises 5.2.8-5.2.10 are given by Hopcroft and Ullman [1969].
Exercise 5.2.12 is from Knuth [1965].
5.3. PRECEDENCE G R A M M A R S
DEHNITION
Let G = (N, ~, P, S) be a C F G in which the productions have been
numbered from I to p. A shift-reduce parsing algorithm for G is a pair of
functions (2 = (f, g),tt where f is called the shift-reduce function and g the
reduce function. These functions are defined as follows:
(1) f maps V* x (~ U [$})* to {shift, reduce, error, accept], where
V = N U ~ u {$}, and $ is a new symbol, the endmarker.
(2) g maps V* x (E u [$})* to {1, 2 . . . . , p, error], under the constraint
that if g(a, w) = i, then the right side of production i is a suffix of a.
A shift-reduce parsing algorithm uses a left-to-right input scan and
a pushdown list. The function f decides on the basis of what is on the push-
down list and what remains on the input tape whether to shift the current
input symbol onto the pushdown list or call for a reduction. If a reduction
is called for, then the function g is invoked to decide what reduction to make.
We can view the action of a shift-reduce parsing algorithm in terms of
configurations which are triples of the form
where
(1) $X1 " " Xm represents the pushdown list, with Xm on top. Each X't
is in N W E, and $ acts as a bottom of the pushdown list marker.
(2) al . " a, is the remaining portion of the original input, a 1 is the cur-
rent input symbol, and $ acts as a right endmarker for the input.
(3) p l . . . Pr is the string of production numbers used to reduce the origi-
nal input to X~ . . . Xma~ " " a,.
We can describe the action of (2 by two relations, t--a-and !-¢-, on configu-
rations. (The subscript (2 will be dropped whenever possible.)
(1) If f ( a , aw) = shift, then (a, aw, r~)p- (aa, w, z~) for all a in V*, w in
(E U {$})*, and n in { 1 , . . . , p ] * .
(2) If f(afl, w) = reduce, g(o~fl, w) = i, and production i is A ~ fl, then
(aft, w, n:) ~ (aA, w, n:i).
(3) If f ( a , w) = accept, then (a, w, n ) ~ accept.
(4) Otherwise, (a, w, 70 ~ error.
•lThese functions are not the functions associated with an LR(k) table.
SEC. 5.3 PRECEDENCE GRAMMARS 401
Example 5.33
Let us construct a shift-reduce parsing algorithm ~ = (f, g) for the
grammar G with productions
($, aabb$, e)
The first move is determined by f ( $ , aabb$), which we see, from the speci-
fication o f f , is reduce. To determine the reduction we consult g($, aabb$),
which we find is 2. Thus the first move is the reduction
The next move is determined by f($S, aabb$), which is shift. Thus the next
move is
($S, aabb$, 2) ~ ($Sa, abb$, 2)
subsequently in this chapter will need only information near the top of the
stack. We therefore adopt the following convention.
CONVENTION
I f f and g are functions of a shift-reduce parsing algorithm and f(0~, w)
is defined, then we assume that f(floc, wx) = f(oc, w) for all fl and x, unless
otherwise stated. The analogous statement applies to g.
5.3.2, Simple Precedence Grammars
Example 5.34
Let G have the productions
S > aSSblc
The precedence relations for G, together with the added precedence relations
involving the endmarkers, are shown in the precedence matrix of Fig. 5.11.
Each entry gives the precedence relations that hold between the symbol
labeling the row and the symbol labeling the column. Blank entries are
interpreted as error.
a b c
• <. • <.
<. ~ <.
<. <.
Fig. 5.11 Precedence relations.
ers. We have $ < a, a < c, and c 3> c. The handle of accb is the first c, so
the precedence relations have isolated this handle.
LEMMA 5.3
THEOREM 5.14
Let G = (N, E, P, S) be a proper C F G with no e-productions. If
then
(1) For p < i < k, either X~+~ < X~ or Xt+~ "--- X~;
(2) Xk+a <~Xk;
(3) For k > i ;> 1, Xt+~ --" X~; and
(4) X~ > a~.
Proof. The proof will proceed by induction on n. For n = 0, we have
$S$ ~ $Xk . . . Xa$. F r o m the definition of the precedence relations we
SEC. 5.3 PRECEDENCE GRAMMARS 407
$S$~X,,...Xk+IAa,...aq
n
COROLLARY 1
COROLLARY 2
Every simple precedence grammar is unambiguous.
P r o o f . All we need to do is observe that for any right-sentential form fl,
other than S, the previous right-sentential form a such that a r~m fl is unique.
From Corollary 1 we know that the handle of fl can be uniquely determined
by scanning fl surrounded by endmarkers from left to right until the first 3>
relation is found, and then scanning back until a <~ relation is encountered.
The handle lies between these points. Because a simple precedence grammar is
uniquely invertible, the nonterminal to which the handle is to be reduced is
unique. Thus, ~ can be uniquely found from ft. D
We note that since we are dealing only with proper grammars, the fact
that this and subsequent parsing algorithms operate in linear time is not
difficult to prove. The proofs are left for the Exercises.
We shall now describe how a deterministic right parser can be constructed
for a simple precedence grammar.
408 ONE-PASS NO BACKTRACK PARSING CI-IAP. 5
ALGORITHM 5.12
Shift-reduce parsing algorithm for a simple precedence grammar.
Input. A simple precedence grammar G = (N, X, P, S) in which the
productions in P are numbered from 1 to p.
Output. CZ= ( f , g), a shift-reduce parsing algorithm.
Method.
(1) The shift-reduce parsing algorithm will employ $ as a bottom marker
for the pushdown list and a right endmarker for the input.
(2) The shift-reduce function f will be independent of the contents of
the pushdown list except for the topmost symbol and independent of the
remaining input except for the leftmost input symbol. Thus we shall define
f only on (N u X U {$}) x (X u [$}), except in one case (rule c).
(a) f(X, a) = shift if X < a or X - - ~ a.
(b) f ( X , a) = reduce if X -> a.
(c) f($S, $) = accept.l"
(d) f ( X , a) = error otherwise.
(These rules can be implemented by consulting the precedence matrix itself.)
(3) The reduce function g depends only on the string on top of the push-
down list up to one symbol below the handle. The remaining input does not
affect g. Thus we define g only on (N U X t,.) {$})* as follows:
(a) g(X~+lXkXk_l "'" XI, e) = i if Xk+~ < Xk, Xj+~ " Xj for k > j
> 1, and production i is A ~ X k X k - ~ ' ' " Xi. (Note that the
reduce function g is only invoked when X~ 3> a, where a is the
current input symbol.)
(b) g(~z, e) = error, otherwise. [Z]
Example 5.35
Let us construct a shift-reduce parsing algorithm ~ = ( f , g) for the gram-
mar G with productions
(1) S > aSSb
(2) S ~ c
The precedence relations for G are given in Fig. 5.11 on p. 405. We can
use the precedence matrix itself for the shift-reduce function f. The reduce
function g is as follows"
(1) g(XaSSb) = 1 if X ~ IS, a, $].
(2) g(Xc) = 2 if X ~ [S, a, $}.
(3) g(00 = error, otherwise.
tNote that this rule may take priority over rules (2a) and (2b) when X = S and a = $.
SEC. 5.3 PRECEDENCE GRAMMARS 409
In configuration (Sac, cb$, e), for example, we have f(c, b ) = reduce and
g(ac, e) = 2. Thus
(Sac, cb $, e) R ( $aS, cb $, 2)
Let us examine the behavior of a on aeb, an input not in L(G). With
acb as input a would make the following moves:
THEOREM 5.15
Algorithm 5.12 constructs a valid shift-reduce parsing algorithm for
a simple precedence grammar.
Proof. The proof is a straightforward consequence of Theorem 5.14,
the unique invertibility property, and the construction in Algorithm 5.12.
The details are left for the Exercises. El
410 ONE-PASSNO BACKTRACK PARSING CHAP. 5
the first .~ relation is encountered. The handle lies between the <~ and 3>
relations.
This discussion motivates the following definition.
DEFINITION
Example 5.36
Consider the grammar G having the productions
S > 0 S l l 1011
The (1, 1) precedence relations for G are shown in Fig. 5.12. Since 1 ~- 1
and 1 3> 1, G is not a (1, 1) precedence grammar.
Let us use Algorithm 5.13 to compute the (2, 1) precedence relations for G.
We start by computing S. Initially, $ = [$S$, $$S}. We consider $S$ by
adding $0S, 0S1, S11, 11 $, (these are all the substrings of $0S11 $ of length 3),
S 0 1 $
"_,.> .>
and $01 and 011 (substrings of $0115 of length 3). Consideration of $$S
adds $$0. Consideration of $0S adds $00, 00S, and 001. Consideration of
0S1 adds 111, and consideration of 00S adds 000. These are all the members
of $.
To construct <Z, we consider those strings in S with S at the right. We
obtain $$ <Z 0, $0 ~ 0, and 00 ~ 0. To construct -~-, we again consider the
strings in S with S at the right and find $0 " S, 0S " 1, S1 " 1, $0 " 1,
01 ~ 1, 00-~- S, and 00-~- 1.
To construct .>, we consider strings in $ with S in the middle. We find
11 3> $ from $S$ and i 1 3> 1 from 0S1.
The (2,1)precedence relations for G are shown in Fig. 5.13. Strings of
length 2 which are not in the domain of ~-, <Z, or 3> do not appear.
S 0 l $
$$ <.
$0
0S
l
00 <.
01
SI
11 i -> ">
Fig. 5.13 (2, 1) precedence relations.
THEOREM 5.16
Algorithm 5.13 correctly computes 4 , -~-, and 3>.
Proof We first show that S is defined correctly. That is, 7, ~ S if and
only ifl r l = m + n and 7' is a substring of t~flu, where $mS$" *~rmt~Aw ==>,:m
~flW
and u = FIRST,(w).
Only if: The proof is by induction on the order in which strings are added
to $. The basis, the first two members of $, is immediate. For the induction,
suppose that 7 is added to $, because txAx is in S and A --~ fl is in P; that is,
~, is a substring of ¢tflx. Since ~Ax is in $, from the inductive hypothesis we
414 ONE-PASS NO BACKTRACK PARSING CHAP. 5
We may show the following theorem, which is the basis of the shift-reduce
parser for uniquely invertible (m, n) precedence grammars analogous to
that of Algorithm 5.12.
THEOREM 5 . 1 7
(5.3.1) l-m
x , x , _ , . . . x , + xAax ... a,
l-m> XpXp_ 1 . . . Xk +l Xk . . . X l a 1 . . . aq
i
$~S$" ~ r m ~,Bw
- -r m~ ~ ' ~ 2 w
(5.3.2)
------>
rm
XpXp_l ... Xk+lAal ... aq
COROLLARY
If G of Theorem 5.17 is an (m, n) precedence grammar, then Theorem
5.17 can be strengthened by adding the condition that no other relation holds
between the strings in question to each of (1)-(4) in the statement of the
theorem. D
production was to be used in making the reduction, and thus had to examine
these symbols anyway.
To make this scheme work, we must be able to determine which produc-
tion to use in case the right side of one production is a suffix of the right
side of another. For example, suppose that ocflTw is a right-sentential form in
which the right end of the handle occurs between ? and w. If A----~ ? and
B ~ / 3 7 are two productions, then it is not apparent which production should
be used to make the reduction.
We shall restrict ourselves to applying the longest applicable production.
The weak precedence grammars are one class of grammars for which this rule
is the correct one.
DEFINITION
Example 5.37
The grammar G with the following productions is an example of a weak
precedence g r a m m a r t :
E > E -t- TI + TI T
T >T,F[F
F > (E)la
t It should be obvious that G is related to our favorite grammar Go. In fact, L(G)
is just L(Go) with superfluous unary + signs, as in -q- a . (q-- a -k- a), included. Go is
another example of a uniquely invertible weak precedence grammar which is not a simple
precedence grammar.
:l:The fact that these three productions have the same left side is coincidental.
SEC. 5.3 PRECEDENCE GRAMMARS 417
E T F a ( )
_• <. <.
LEMMA 5.5
Let G be as in Lemma 5.4, and suppose that G is uniquely invertible.
If there is no production of the form A--~ txXfl, then in the derivation
418 ONE-PASSNO BACKTRACK PARSING CHAP. 5
Thus the essence of the parsing algorithm for uniquely invertible weak
precedence grammars is that we can scan a right-sentential form (surrounded
by endmarkers) from left to right until we encounter the first 3> relation.
This relation delimits the right end of the handle. We then examine symbols
one at a time to the left of .~. Suppose that B ~ fl is a production and we
see Xfl to the left of the .~ relation. If there is no production of the form
A--* ~X,8, then by Lemma 5.5, fl is the handle. If there is a production
A --. ocXfl, then we can infer by Lemma 5.4 that B --* fl is not applicable.
Thus the decision whether to reduce fl can be made examining only one
symbol to the left of ft.
We can thus construct a shift-reduce parsing algorithm for each uniquely
invertible weak precedence grammar.
ALGORITHM 5.14
Shift-reduce parsing algorithm for weak precedence grammars.
Input. A uniquely invertible weak precedence grammar G = (N, Z, P, S)
in which the productions are numbered from 1 to p.
Output. ~Z = ( f , g), a shift-reduce parsing algorithm for G.
Method. The construction is similar to Algorithm 5.12. The shift-reduce
function f is defined directly from the precedence relations:
(1) f ( X , a) = shift if X < a or X ~ - a.
(2) f ( X , a) = reduce if X .> a.
(3) f($S, $) = accept.
(3) f ( X , a) = error otherwise.
The reduce function g is defined to reduce using the longest applicable
production"
(4) g(Xfl) = i if B ~ fl is the ith production in P and there is no produc-
tion in P of the form A ~ ocXfl for any A and u.
(5) g(cx) = error otherwise.
THEOREM 5.18
Algorithm 5.14 constructs a valid shift-reduce parsing algorithm for G.
SEC. 5.3 PRECEDENCE GRAMMARS 419
Example 5.38
S~0Sl11011
S > OSAltOA1
A- >1
S A 0 1
, ,
S "= ~
Ai "=
0 =" I ~ <-
1 "~ "~
$ <-
Fig. 5.15 Precedence relations for G'.
420 ONE-PASS NO BACKTRACK PARSING CHAP, 5
Example 5.39
Consider the grammar G with productions
E >E+TIT
T >T*FIF
F > al(E) la(L )
L > L,E[E
E >E+TIT
T > T* FIF
F >al(E) la(Z, E) Ia(E)
L >L, EIE
Since L no longer appears to the left of), we do not have E 3> ) in this gram-
mar. We can easily verify that G' is a weak precedence grammar. D
Proof.
lf: Let G = (N, l~, P, S) be a simple precedence grammar. Then clearly,
condition (1) of the definition of weak precedence grammar is satisfied.
Suppose that condition (2) were not satisfied. That is, there exist A ~ txXYfl
and B ---~ Yfl in P, and either X < B or X ~ B. Then X <~ Y, by Lemma
5.3. But X = ' Y because of the production A ---, o~XYfl. This situation is
impossible because G is a precedence grammar.
(1) Let N' be N plus new symbols of the form [0c] for each e ~ e such
that A ---~ fie is in P for some A and ft.
(2) Let P ' consist of the following productions:
(a) IX]----~ X for each [X] in N' such that X is in N U l~.
(b) [X~] ----~ X[e] for each [Xe] in N', where X is in N U E and 0c ~ e.
(c) A ~ [0c] for each A ----~t~ in P.
We shall show that 4 , ~-, and .> for the grammar G' are mutually dis-
joint. No conflicts can involve the endmarker. Thus let X and Y be in
N' U E. We observe that
(1) If X < Y, t h e n X i s i n N U E ;
(2) If X ~ Y, then X is in N U E, and Y is in N ' -- N, since right sides
of length greater than one only appear in rule (2b); and
(3) If X .2> Y, then X is in N' U E and Y is in E.
COROLLARY
Example 5.40
E T F [TI [E)] a ( ) + • $
] °
I .> ,>
EXERCISES
"5.3.18. Give a simple precedence grammar for the language {0nalnln > 1}
u {0nbl2nin >__ 1}.
'5.3.19. Show that every context-free grammar with no e-productions can be
transformed into a (1, 1) precedence grammar.
5.3.20. For C F G G = (N, X, P, S), define the relations 2,/z, and p as follows:
(1) AAX if A ---~ X~ is in P for some 0~.
(2) X/t Y if A ~ ~XYfl is in P for some 0~ and ft. Also, $/zS and
S/z$.
(3) XpA if A ---~ ~ X is in P for some 0~.
Show the following relations between the Wirth-Weber precedence
relations and the above relations ( + denotes the transitive closure;
• denotes reflexive and transitive closure)'
(a) < = / z 2 ÷.
(b) ~ u {($, S), (S, $)} = ~.
(c) 3> = p*ltA* n ((N u ~) x ~).
• "5.3.21. Show that it is undecidable whether a given grammar is an extended
precedence grammar [i.e., whether it is (m, n) precedence for some m
and n].
• 5.3.22. Show that if G is a weak precedence grammar, then G is an extended
precedence grammar (for some m and n).
5.3.23. Show t h a t a is in FOLLOWl(A) if and only if A < a, A ~ a, or A 3> a.
5.3.24. Generalize Lemma 5.3 to extended precedence grammars.
5.3.25. Suppose that we relax the extended precedence conditions to permit
0c < w and 0~ ~ w if they are generated only by rules (lb) and (2b).
Give a shift-reduce algorithm to parse any grammar meeting the
relaxed definition.
Research Problem
5.3.26. Find transformations which can be used to convert grammars into simple
or weak precedence grammars.
Open Problem
5.3.27. Is every simple precedence language generated by a simple precedence
grammar in which the start symbol does not appear on the fight side
of any production? It would be nice if so, as otherwise, we might
attempt to reduce when $S is on the pushdown list and $ on the input.
Programming Exercises
5.3.28. Write a program to construct the Wirth-Weber precedence relations
for a context-free grammar G. Use your program on the grammar for
PL360 in the Appendix.
5.3.29. Write a program that takes a context-free grammar G as input and
constructs a shift-reduce parsing algorithm for G, if G is a simple
precedence grammar. Use your program to construct a parser for PL360.
426 ONE-PASS NO BACKTRACK PARSING CHAP. 5
5.3.30. Write a program that will test whether a grammar is a uniquely invert-
ible weak precedence grammar.
5.3.31. Write a program to construct a shift-reduce parsing algorithm for a
uniquely invertible weak precedence grammar.
BIBLIOGRAPHIC NOTES
S' ~r m aaw =~
rm
apw,
in the augmented grammar G' = (N t2 [S'}, Z, P W {S' ---~ S}, S'), then the
handle fl and the production A ---~ fl which is used to reduce the handle in
aflw can be uniquely determined by
(1) Scanning aflw from left to right until the handle is encountered.
(2) Basing the decision of whether ~ is the handle of aflw, where ~,~ is
a prefix of aft, only on ~, the m symbols to the left of 6 and n symbols to
the right of 6.
(3) Choosing for the handle the leftmost substring which includes, or is
to the right of, the rightmost nonterminal of aflw, from among possible
candidates suggested in (2).
For notational convenience we shall append m $'s to the left and n $'s to
the right of every right-sentential form. With the added $'s we can be sure
that there will always be at least m symbols to the left of and n symbols to
the right of the handle in a padded right-sentential form.
DEFmmON
G = (N, Z, P, S) is an (m, n)-bounded right-context (BRC) grammar if
the four conditions"
(1) $'ns'$" ~G ' rm
aAw ~G" r m aflw and
(2) $ms'$" ~
G' rm
?Bx ~G" r m y~x = ~' fly are rightmost derivations in the
augmented grammar G ' : (N U {S'}, Z, P U {S' ---~ S}, S').
(3) lxl_<lyl
(4) the last m symbols of a and a' coincide, and the first n symbols of
w and y coincide
imply that t~'Ay----~Bx; that is, a ' = 7,, A = B, and y----x.
A grammar is BRC if it is (m, n)-BRC for some m and n.
428 ONE-PASS NO BACKTRACK PARSING CHAP. 5
S >Sala
Example 5.41
The grammar G i with productions
S > aAc
A > Abblb
is a (1, 0)-BRC grammar. The right-sentential forms (other than S' and S)
are aAb2"c for all n > 0 and ab2"+le for n > 0. The possible handles are
aAe, Abb, and b, and in each right-sentential form the handle can be uniquely
determined by scanning the sentential form from left to right until aAe or
Abb is encountered or b is encountered with an a to its left. Note that neither
b in Abb could possibly be a handle by itself, because A or b appears to its
left.
On the other hand, the grammar G2 with productions
S > aAc
A > bAb[b
Example 5.42
The grammar G with productions
S > aAlbB
A ---+ 0A l 1
B----+ 0BI 1
Referring to the BRC definition, we note that a -- $ma0m, 0~' --- ~' --- $mb0m,
fl - - 6 = 1, and y = w - - x -- $n. Then ~ and ~' end in the same m symbols,
0 m; w and y begin with the same n, $"; and [xl < ]y l, but ~'Ay =/=?Bx. (A and
B are themselves in the BRC definition.)
The grammar with productions
S > aA [bA
A ---~ 0A I 1
Condition (3) in the definition of BRC may at first seem odd. However,
it is this condition that guarantees that if, in a right-sentential form a'fly, fl
is the leftmost substring which is the right side of some production A ~ fl
and the left and right Context of fl in a'fly is correct, then the string a'Ay
which results after the reduction will be a right-sentential form.
The BRC grammars are related to some of the classes of grammars we
have previously considered in this chapter. As mentioned, they are a subset
of the LR grammars. The BRC grammars are extended precedence grammars,
and every uniquely invertible (m, n) precedence grammar is a BRC grammar.
The (1, 1)-BRC grammars include all uniquely invertible weak precedence
grammars. We shall prove this relation first.
THEOREM 5.20
If G = (N, E, P, S) is a uniquely invertible weak precedence grammar,
then it is a (1, 1)-BRC grammar.
430 ONE-PASSNO BACKTRACK PARSING CHAP. 5
$S'$ ~ rm
aAw ~ riD.
aflw
and
where a and a' end in the same symbol; w and y begin with the same symbol;
and l xl _< l yl, but ~,Bx ~ a'Ay. Since G is weak precedence, by Theorem
5.14 applied to 75x, we encounter the 3> relation first between 5 and x.
Applying Theorem 5.14 to ~flw, we encounter 3> between fl and w, and since
w and y begin with the same symbol, we encounter 3> between fl and y.
Thus, l a'13l >_ I~'~ I. Since we are given Ixl _< i y l, we must have a'fl = 7di
and x = y.
If we can show that fl = O, we shall have a' = 7- But by unique invertibil-
ity, A = B. We would then contradict the hypothesis that 7Bx ~ ~'Ay.
If f l ¢ J, then one is a suffix of the other. We consider cases to show
/ / = ,~.
Case 1:]3 = eX5 for some e and X. X is the last symbol of y, and there-
fore we have X <~ B or X " B by Theorem 5.14 applied to right-sentential
form ?Bx. This violates the weak precedence condition.
Case 2: J = eXfl for some e and X. This case is symmetric to the above.
We conclude that fl = O and that G is (1, 1)-BRC. [Z
THEOREM 5.21
Every (m, k)-BRC grammar is an LR(k) grammar.
Proof. Let G = (N, Z,P, S) be (m, k)-BRC but not LR(k). Then by
Lemma 5.2, we have two derivations in the augmented grammar G'
S' ~ r i l l aA w - -r m~ oq3w
and
COROLLARY
(1) Either 1~[ = m --t-/, where l is the length of the longest right side in
P o r i e l < m -q- I and ~ begins with $".
(2) Ixl = n.
(3) There is a derivation $ms'$" ~rm flay :72 flYY, where txx is a substring
of ]32,ypositioned so that ~ lies within fl? and does not include the last symbol
of fly.
We delete G, m, and n from ~(A) and 9Z when they are obvious.
The intention is that the appearance of substring ocflx in scanning a right-
sentential form from left to right should indicate that the handle is fl and
that it is to be reduced to A whenever (e, fl, x) is in ~C(A). The appearance
of ex, when (e, x) is in 9Z, indicates that we do not have the handle yet, but
it is possible that the handle exists to the right of e. The following lemma
assures us that this is, in fact, the case.
LEMMA 5.6
G = (N, E, P, S) is (m, n)-BRC if and only if
(1) Let A --, fl and B ~ 6 be distinct productions. Then if (~, fl, x) is in
~Cm,,(A) and (~,, 6, x) is in 5Cm.,(B), then ~fl is not a suffix of ~,O, or vice versa;
(2) For all A ~ N, if (0~, fl, x) is in ~Cm.,(A), then (Oo~fl,x) is not in 9Zn,.
for any 0.
Proof.
If: Suppose that G is not (m, n)-BRC. Then we can find derivations in
the augmented grammar G
and
where 0~ and 0F coincide in the last m places, w and y coincide in the first n
places, and Ix I_< ]y 1, but 7Bx ~ ogAy. Let e be the last m places of 0~ and z
the first n places of w. Then (e, fl, z) is in ~(A). If x ~ y, and Ix[ _< ]y 1,we
must have (Oefl, z) in 9Z for some 0 and thus condition (2) is violated. If x = y,
then 07, 6, z) is in ~(B), where I/is the last m symbols of 7. If A --> fl and
B --~ d~ are the same, then with x = y we conclude 7Bx = oFAy, contrary to
hypothesis. But since one of ~6 or eft is a suffix of the other, we have a
violation of (1) if A --, fl and B ~ ~ are distinct.
Only if: Given a violation of (1) or (2), a violation of the (m, n)-BRC
condition is easy to construct. We leave this part for the Exercises. [Z
ALGORITHM 5.17
Method.
(1) Let f ( a , w) = shift if (~, w) is in 9Zm...
(2) f(a, w) = reduce if ~ = axa2, and (~1, a2, w) is in 5Cm.,(A) for some
A, unless A = S', a~ = $, and a2 = S.
(3) f($ss, $") = accept.
(4) f ( a , w) = error otherwise.
(5) g(a, w) = i if we can write a = a~a2, (aa, ~2, w) is in ~(A), and the
ith production is A --~ ~2.
(6) g(a, w) = error otherwise. D
THEOREM 5.23
Example 5.43
(0) s ' ~s
(1) s , OA
(2) S ,~ 1S
(3) a ~ OA
(4) A 71
G is (1, 0)-BRC. To compute 5C(A), 5C(S), and ~ , we need the set of
strings of length 3 or less that can appear in the viable prefix of a right-
sentential form and have a nonterminal. These are $S', $S, $0A, SIS, 00A,
11 S, 10A, and substrings thereof.
We calculate
i~ consists of the pairs (a, e), where a is $, $0, $00, 000, $1, $11, $10,
111,100, or 110. The functions f and g are given in Fig. 5.17. By "ending of
a" we mean the shortest suffix of a necessary to determine f ( a , e) and to
determine g(a, e) if necessary.
$0A reduce
10A reduce
00A reduce
$IS reduce
11S reduce
O1 reduce
O0 shift
$0 shift
$10 shift
110 shift
$1 shift
$11 shift
111 shift
$ shift
$S accept
Example 5.44
Consider the (non-UI) weak precedence grammar G with productions
S- > aA l bB
A > CAIIC1
B > DBE1 [DE1
C >0
D- >0
E >1
G generates the language [a0"l"ln ~ 1} U (b0"l 2"In > 1], which we shall show
in Chapter 8 not to be a simple precedence language. Precedence relations
for G are given in Fig. 5.19 (p. 437). Note that G is not uniquely invertible,
because 0 appears as the right side in two productions, C --~ 0 and D ~ 0.
However, ~1,0(C) = [(a, 0, e), (C, 0, e)} and ~l,0(D) = {(b, 0, e), (D, 0, e)~
Thus, if we have isolated 0 as the handle of a right-sentential form, then
the symbol immediately to the left of 0 will determine whether to reduce
the 0 to C or to D. Specifically, we reduce 0 to C if that symbol is a or C
and we reduce 0 to D if that symbol is b or D. [~]
- 0
L~
,d
o
.,,.f
0
a0
•~'-~ - ;,,~
1,1
436
SEe. 5.4 OTHER CLASSES OF SHIFT-REDUCE PARSABLE GRAMMARS 437
S A B C D E a b 0 1 $
.>
<. .>
<. <.
- <.
i °
0 .> .>
1~ .> .>
$ <. <.
THEOREM 5.24
Algorithm 5.18 is valid for G.
Proof. Exercise. It suffices to show that every MSP grammar is a BRC
grammar, and then show that the functions of Algorithm 5.18 agree with
those of Algorithm 5.17. D
Example 5.45
The grammar G o is a classic example of an operator precedence grammar"
(1) E ~ E + T (2) E ~ T
(3) T ~ T , F (4) T ~ F
(5) F --~ (E) (6) F ---~ a
The operator precedence relations are given in Fig. 5.20. 5
( a * + ) $
COROLLARY
Example 5.46
Let us parse the string (a -+- a) • a according to the operator precedence
relations of Fig. 5.20 obtained from G o. However, we shall not worry about
nonterminals and merely keep their place with the symbol E. That way we
do not have to worry about whether F should be reduced to T, or T to F
(although in this particular case, we could handle such matters by going
outside the methods of operator precedence parsing). We are effectively
parsing according to the g r a m m a r G"
(1) E >E + E
(3) E > E, E
(s) E , (E)
(6) E >a
shift if b ~ c or b ~-. c
reduce if b 3> c
(1) f(b},, c) =
accept if b = $, 7 = E, and c = $
error otherwise
We can verify that 661563 is indeed a skeletal right parse for (a q-- a) • a
according to G. We can view this skeletal parse as a tree representation of
(a + a) • a, as shown in Fig. 5.21. [ ]
( E )
E + E
I
a
I
a Fig. 5.21 Skeletalparse tree.
442 ONE-PASSNO BACKTRACKPARSING CHAP. 5
Example 5.46 is a special case of a technique that works for many gram-
mars, especially those that define languages which are sets of arithmetic
expressions. Involved is the construction of a new grammar with all nonter-
minals of the old grammar replaced by one nonterminal and single produc-
tions deleted. If we began with an operator precedence grammar, we can
always find one parse of each input by a shift-reduce algorithm. Quite often
the new grammar and its parser are sufficient for the purposes of translation,
and in such situations the operator precedence parsing technique is a particu-
larly simple and efficient one.
DEFINITION
Let G = (N, X, P, S) be an operator grammar. Define Gs = ([S}, X, P', S),
the skeletal grammar for G, to consist of all productions S----~ Xi " " Xm
such that there is a production A ~ Y1 "" " Ym in P, and for 1 < i < m,
(1) Xt = Y, if Y, ~ X.
(2) X ~ = S if Yt ~ N.
However, we do not allow S ---~ S in P'.
We should warn the reader that L(G) ~ L(G,) and in general L(G,) may
contain strings not in L(G). We can now give a shift-reduce algorithm for
operator precedence grammars.
ALGORITHM 5.19
Operator precedence parser.
Input. An operator precedence grammar G = (N, X, P, S).
Output. Shift-reduce parsing functions f and g for G,.
Method. Let fl be S or e.
(1) f(afl, b) = shift if a < b or a " b.
(2) f(afl, b) = reduce if a -> b.
(3) f($S, $ ) = accept.
(4) f ( a , w) = error otherwise.
(5) g(aflby, w ) = / i f
(a) fl is S or e;
(b) a < b;
(c) The " relation holds between consecutive terminal symbols of
y, if any; and
(d) Production i of G, is S --~ flby.
(6) g(g, w) = error otherwise. D
LEMMA 5.8
If g is a right-sentential form of an operator grammar, then the symbol
appearing immediately to the left of the handle cannot be a nonterminal.
Proof. If it were, then the right-sentential form to which ~z is reduced
would have two adjacent nonterminals. [Z]
THEOREM 5.26
Algorithm 5.19 parses all sentences in L(G).
Proof. By the corollary to Theorem 5.25, the first 3> and the previous <
correctly isolate a handle. Lemma 5.7 justifies the restriction that fl be only
S or e (rather than any string in S*). Lemma 5.8 justifies inclusion of fl in
the handle in rule (5). D
What we shall next discuss is not another parsing algorithm, but rather
a language in which deterministic (nonbacktracking) top-down and bottom-
up parsing algorithms can be described. This language is called the Floyd-
Evans production language, and a number of compilers have been imple-
mented using this syntactic metalanguage. The name is somewhat of a mis-
nomer, since the statements of the language need not refer to any particular
productions in a grammar. A program written in Floyd-Evans productions
is a specification of a parsing algorithm with a finite state control influencing
decisions.'l"
A production language parser is a list of production language statements.
Each statement has a label, and the labels can be considered to be the states
of the finite control. We assume that no two statements have the same label.
The statements act on an input string and a pushdown list and cause a right
parse to be constructed. We can give an instantaneous description of the
parser as a configuration of the form
where
(1) q is the label of the currently active statement;
1"Wemight add that this is not the ultimate generalization of shift-reduce algorithms.
The LR(k) parsing algorithm uses a finite control and also keeps extra information on its
pushdown list. In fact, a DPDT might really be considered the most general kind of shift-
reduce algorithm. However, as we saw in Section 3.4 the DPDT is not really constrained
to parse by making reductions according to the grammar for which its output is a presumed
parse, as is the LR(k) algorithm and the algorithms of Sections 5.3 and 5.4.
Z
444 ONE=PASSNO BACKTRACK PARSING CHAP. 5
and statement L1 is
L1 says that if the string on top of the pushdown list is ~ and the current
input symbol is a, then replace ~ by fl, emit the string s, move the input head
one symbol to the right (indicated by the presence of ,), and go next to
statement L2. Thus the parser would enter the configuration (L2, 7fl, x, ns).
The symbol a may be e, in which case the current input symbol is not relevant,
although if the • is present, an input symbol will be shifted anyway.
If statement L1 did not apply, because the top of the pushdown list did
not match ~ or the current input symbol was not a, then the statement imme-
diately following L1 on the list of statements must be applied next.
Both labels on a statement are optional (although we assume that each
statement has a name for use in configurations). If the symbol --~ is missing,
then the pushdown list is not to be changed, and there would be no point in
having fl ~ e. If the symbol • is missing, then the input head is not to be
moved. If the (next label~ is missing, the next statement on the list is always
taken.
Other possible actions are accept and error. A blank in the action field
indicates that no action is to be taken other than the pattern matching and
possible reduction.
Initially, the parser is in configuration (L, $, w$, e), where w is the input
string to be parsed and L is a designated statement. The statements are then
serially checked until an applicable statement is found. The various actions
specified by this statement are performed, and then control is transferred to
the statement specified by the next label.
The parser continues until an error or accept action is encountered. The
output is valid only when the accept is executed.
SEC. 5.4 OTHER CLASSES OF S H I F T - R E D U C E PARSABLE GRAMMARS 445
Example 5.47
Example 5.48
Let G consist of the productions
(1) S: > aS
(2) S > bS
(3) S >a
L0: # ~#1 ,
LI: a >S [ emit 3 L4
L2: b ! L0
L3: $ [ error
L4: aS >S l emit 1 L4
L5: bS S [ emit 2 L4
L6: $S $ > $S$[ accept •
L7: I error
The input is accepted at statement L6. However, with input aa, the follow-
ing sequence of moves is made"
448 ONE-PASS NO BACKTRACK PARSING CHAP. 5
Context-free
grammars
~ U ~ m b i g u o u s ~
Floyd,-Evans~ " CFG's
parsable I Operator
R precedence
BRC LR(I)
!
I
MSP
Simple MSP
Uniquely
invertible
extended
precedenc/
Uniquely
invertible
weak
precedence
Simple
precedence
Fig. 5.22 Hierarchy of grammars.
easy to implement and work quite efficiently. The (1, 1)-precedence grammars
are also easy to parse, but obtaining a (1, 1)-precedence grammar for a language
often requires the addition of many single productions of the form A ---~ X
to make the precedence relations disjoint. Also, there are many deter-
ministic CFL's for which no uniquely invertible simple or weak precedence
grammar exists.
The LR(1) technique presented in this chapter closely follows Knuth's
original work. The resulting parsers can be extremely large. However, the
techniques to be presented in Chapter 7 produce LR(1) parsers whose size
and operating speed are competitive with precedence parsers for a wide
variety o f programming languages. See Lalonde et al. [1971] for some
empirical results. Since the LR(1) grammars embrace a large class of gram-
mars, LR(1) parsing techniques are also attractive.
Finally we should point out that it is often possible to improve the per-
formance of any given parsing technique in a specific application. In Chapter
7 we shall discuss some methods which can be used to reduce the size and
increase the speed of parsers.
EXERCISES
5.4.1. Give a shift reduce parsing algorithm based on the (1, 0)-BRC technique
for G1 of Example 5.41.
5.4.2. Which of the following grammars are (1, 1)-BRC?
(a) S--~ aAIB
A ---~ 0A1 la
B ----~0B1 lb.
(b) S --. aA l bB
,4 ~ 0AllOl
B ---, 0Bli 01.
(c) E----,E ÷ TI E - TIT
T---, T . F! T/F! F
F---. (E)I-- EIa.
DEFINITION
5.4.3. Show that every (m, n)-BC grammar is an (m, n)-BRC grammar.
5.4.4. Give a shift-reduce parsing algorithm for BC grammars.
5.4.5. Give an example of a BRC grammar that is not BC.
5.4.6. Show that every uniquely invertible extended precedence grammar is
BRC.
5.4.7. Show that every BRC grammar is extended precedence (not necessarily
uniquely invertible, of course).
5.4.8. For those grammars of Exercise 5.4.2 which are (1, 1)-BRC, give shift-
reduce parsing algorithms and implement them with decision trees.
5.4.9. Prove the "only if" portion of Lemma 5.6.
5.4.10. Prove Theorem 5.22.
5.4.11. Which of the following grammars are simple MSP grammars ?
(a) Go.
(b) S--~ A [ B
A ~ 0All01
B~ 2B1 !1.
(c) S - - , A I B
A - ~ OAllOl
B--~ OB1 I1.
(d) S--~ A I B
A ~ 0A1 [01
B --~ 01B1 [01.
5.4.12. Show that every uniquely invertible weak precedence grammar is a
simple MSP grammar.
5.4.13. Is every (m, n; m, n)-MSP grammar an (m, n)-BRC grammar .9
5.4.14. Prove Theorem 5.24.
5.4.15. Are the following grammars operator precedence ?
(a) The grammar of Exercise 5.4.2(b).
(b) S ~ if B then S else 5'
S ~ if B then S
S---~ s
B---~ B o r b
B---~ b.
(c) 5' --~ if B then St else S
S --~ if B then S
$1 --~ if B then $1 else $1
S--, s
S1----). S
B---~ B o r b
B--lb.
The intention in (b) and (c)is that the terminal symbols are if, then,
else, or, b, and s.
452 ONE-PASSNO BACKTRACK PARSING CHAP. 5
5.4.16. Give the skeletal grammars for the grammars of Exercise 5.4.15.
5.4.17. Give shift-reduce parsing functions for those grammars of Exercise
5.4.15 which are operator precedence.
5.4.18. Prove Theorem 5.25.
5.4.19. Show that the skeletal grammar G, is uniquely invertible for every
operator grammar G.
*5.4.20. Show that every operator precedence language has an operator prece-
dence grammar with no single productions.
"5.4.21. Show that every operator precedence language has a uniquely invertible
operator precedence grammar.
5.4.22. Give production language parsers for the grammars of Exercise 5.4.2.
5.4.23. Show that every BRC grammar has a Floyd-Evans production lan-
guage parser.
5.4.24. Show that every LL(k) grammar has a production language parser
(generating left parses).
**5.4.25. Show that it is undecidable whether a grammar is
(a) BRC.
(b) BC.
(c) MSP.
5.4.26. Prove that a grammar G = (N, E, P, S) is simple MSP if and only if
it is a weak precedence grammar and if A ---~ ~ and B ~ 0~ are in P,
A ~ B, then I(A) ~ I(B) = ~ .
*5.4.27. Suppose we relax the condition on an operator grammar that it be
proper and have no e-productions. Show that under this new definition,
L is an operator precedence language if and only if L - [e} is an
operator precedence language under our definition.
5.4.28. Extend Domolki's algorithm as presented in the Exercises of Section 4.1
to carry along information on the pushdown list so that it can be used
to parse BRC grammars.
DEFINITION
where the C's are replaced at each step, the J ' s are all in {el u V - T,
and X is in T. Then we say that X .> a.
(4) If S ~ aXfl and a is in {e) u V - T , then $ < X . I f f l i s i n
[el kJ V - - T, then X - > $.
N o t e that if T = Z, we have defined the operator precedence rela-
tions, and if T = V, we have the W i r t h - W e b e r relations.
Example 5,49
Consider Go, with token set A = {F, a, (,), + , ,]. We find (..-~),
since (E) is a right side, and E is not a token. We have -t- < *, since
there is a right side E --t--T, and a derivation T ~ T . F, and T is not
a token. Also, -4- .> + , since there is a right side E -4- T and a deriva-
tion E ~ E + T, and T is not a token. The A-canonical relations for
Go are shown in Fig. 5.23.
a + * ( ) F $
<. <.
Example 5.50
Let A be as in Example 5.49. The A-skeletal grammar for Go is
So >S0 + S o I S 0 . F [ F
F > (S0) [a D
Research Problem
5.4.32. Develop transformations which can be used to make grammars BRC,
simple precedence, or operator precedence.
Programming Exercises
5.4.33. Write a program that tests whether a given grammar is an operator
precedence grammar.
5.4.34. Write a program that constructs an operator precedence parser for an
operator precedence grammar.
5.4.35. Find an operator precedence grammar for one of the languages in the
Appendix and then construct an operator precedence parser for that
language.
5.4.36. Write a program that constructs a bounded-right-context parser for a
grammar G if G is (1, 1)-BRC.
5.4.37. Write a program that constructs a simple mixed strategy precedence
parser for a grammar G if G is simple MSP.
5.4.38. Define a programming language centered around the Floyd-Evans
production language. Construct a compiler for this programming
language.
BIBLIOGRAPHIC NOTES 455
BIBLIOGRAPHIC NOTES
In this section we shall define two formalisms for limited backtrack pars-
ing algorithms that create parse trees top-down, exhaustively trying all
alternates for each nonterminal, until one alternate has been found which
derives a prefix of the remaining input. Once such an alternate is found, no
other alternates will be tried. Of course, the "wrong" prefix may have been
found, and in this case the algorithm will not backtrack but will fail. Fortu-
456
SEC. 6.1 LIMITED BACKTRACK TOP-DOWN PARSING 457
6.1.1. TDPL
the same language as its underlying C F G defines. We shall therefore not tie
our algorithm to a particular CFG, but will treat it as a formalism for lan-
guage definition and syntactic analysis in its own right.
Let us consider a concrete example. If
S > Ac
A >alab
are productions and the alternates are taken in the order shown, then the
limited backtrackalgorithm will not recognize the sentence abc. The non-
terminal S called at input position 1 will call A at input position 1. Using
the first alternate, A reports success and moves the input pointer to position
2. However, c does not match the second input symbol, so S reports failure
starting at input position 1. Since A reported success the first time it was
called, it will not be called to try the second alternate. Note that we can
avoid this difficulty by writing the alternates for A as
A > abla
A > BC/D
or
A >a
Note that D gets called unless both B and C succeed. We shall later
explore a parsing system in which D is called only if B fails. Note also that
if both B and C succeed, then the alternate D can never be called. This
feature distinguishes TDPL from the general top-down parsing algorithm of
Chapter 4.
The special statements A ---~ a, A ~ e, and A ---~f a r e handled as follows:
DEFINITION
Example 6.1
Let P be the T D P L program (~S, A, B, C}, ~a, b}, R, S), where R is the
sequence of statements
S ~ AB/C
A=>a
B ~ CB/A
C----~ b
Let us investigate the action of P on the input string aba using the rela-
tions defined above. To begin, since S ----~AB/C is the rule for S, S calls A
with input aba. A recognizes the first input symbol and returns success.
Using part (3) of the previous definition we can write A =~ (a I ba, s). Then,
S calls B with input ba. Since B ---~ CB/A is the rule for B, we must examine
the behavior of C on ba. We find that C matches b and returns success. Using
(3) we write C =~ (b l a, s).
Then B calls itself recursively with input a. However, C fails on a and so
C ~ (~" a, f ) . B then calls A with input a. Since A matches a, A ~ (a I, s).
Since A succeeds, the second call of B succeeds. Using rule (4d) we write
B=~,. (a ~',s).
Returning to the first call of B on input ba, both C and B have succeeded,
so this call of B succeeds and we can write B ~ (ba I, s) using rule (4a).
Now returning to the call of S, both A and B have succeeded. Thus, S
matches aba and returns success. Using rule (4a) we can write S ~ (aba I, s).
Thus, aba is in L(P).
It is not difficult to show that L(P) = ab*a + b. [~]
rule for A be A ---~ B C ] D . Suppose that for i = 1 and 2, A ~ (xt ~"y~, ri)
was formed by rule (4) from B ~ (ui ~"% tt) and (possibly) C =g (u~ I v~, t~)
and/or D ~ (u~' ~"vt", t'/). Then m 1 < n 1, so the inductive hypothesis applies
to give ua = u2, vl = v2, and tl = t2. N o w two cases, depending on the value
of t t, have to be considered.
tt tt tot/ tt
Case 1: t~ t2 = f Then since l~ < na, we have u~ = u2, = %, and
t'~' = t~'. Since x~ u~, Yi vi, and r~ = t~ for i = 1 and 2 in this case,
the desired result follows.
' ' = v~ for i - - 1 and 2. Since k~ < nl, we
Case 2: t 1 -- t~ = s. Then u~v~
may conclude that u'a = u'z, v', = v~, ' and t', = t,.
' If t'l = s, then xi = u~u'
?
yt = v~, and r,. = s for i - - 1 and 2. We reach the desired conclusion. If
t'l = f, the argument proceeds with u,.~t and v,.~t as in case 1. U]
Example 6.2
Consider the extended TDPL program P = ({E, F, T}, {a, (,), + , ,}, R, S),
where R consists of
E >T+E/T
T >F,T/F
F ~ (E)/a
E > TX+E/T
T > FX, T/F
F - - + XcEX>/X,,
Xa >a
x< >(
x, >)
x+ ---~ +
X, >,
By rule (5), the first rule is replaced by E --~ B i / T and B~ ---~ TX+E. By rule
(4), B1 ---~ TX+E is replaced by B~ ---~ TB2 and B2 ---~ X+E. Then, these are
replaced by B 1 ---~ TBz/D, Bz ~ X+E/D, and D----,f Rule E ~ B i / T is
replaced by E---, B aC/T and C ~ e. The entire set of rules constructed is
ALGORITHM 6.1
Derivation tree from the execution of a TDPL program.
Input. A TDPL program P = (N, E, R, S) and a sentence w in X* such
that S ~ (w r', s).
output. A derivation tree for w.
Method. The heart of the algorithm is a recursive routine buildtree which
takes as argument a statement of the form A ~ (x ~'y, s) and builds a tree
whose root is labeled A and whose frontier is x. Routine buildtree is initially
called with the statement S =%- (w ~', s) as argument.
Routine buildtree: Let A ~ (x r"Y, s) be input to the routine.
(1) If the rule for A is A ---, a or A ---~ e, then create a node with label A
and one direct descendant, labeled a or e, respectively. Halt.
(2) If the rule for A is m~A ~ BC/D and we can write x = x tx2 such that
ml
B =:~ (x 1 r"x2y, s) and c =:~ (x 21 Y, s), create a node labeled A. Execute
routine buildtree with argument B ~ (x 1 I xzY, s), and then with argument
mi
C =:~ (x2 I Y, s). Attach the resulting trees to the node labeled A, so that
the roots of the trees resulting from the first and second calls are the left and
right direct descendants of the node. Halt.
(3) If the rule for A is A ----~B C / D but (2) does not hold, then it must be
m3
that D :=~ (x ~'y, s). Call routine buildtree with this argument and make
the root of the resulting tree the lone direct descendant of a created node
labeled A. Halt.
Note that routine buildtree calls itself recursively only with smaller values
of m, so Algorithm 6.1 must terminate.
Example 6.3
Let us use Algorithm 6.1 to construct a parse tree generated by the TDPL
program P of Example 6.1 for the input sentence aba.
We initially call routine buildtree with the statement S ~ (aba I, s) as
argument. The rule S ~ A B / C succeeds because A and B each succeed,
SEC. 6.1 LIMITEDBACKTRACKTOP-DOWNPARSING 465
recognizing a and ba, respectively. We then call routine buildtree twice, first
with argument A =~. (a I, s) and then with argument B ~ (ha I, s). Thus
the tree begins as shown in Fig. 6.I(a). A succeeds directly on a, so the node
labeled A is given one descendant labeled a. B succeeds because its rule is
B ~ CB/A and C and B succeed on b and a, respectively. Thus the tree
grows to that in Fig. 6.1 (b).
A/s S
a
i /\
C B
(a) (b)
A
I/ C BIB
b A
I
a
(c)
Fig. 6.1 Construction from parse tree in TDPL.
LEMMA 6.2
LEMMA 6.3
THEOREM 6.1
(6.1.1) [qZp] ~ (w ~"x, s), for any x, if and only if (q, wx, Z)[ ÷ (p, x, e)
(6.1.2) If (q, wx, Z)l-~- (p, x, e), then for all p' ~ p, [qZp'] ~ (1 wx, f )
i:
468 LIMITED BACKTRACK PARSING ALGORITHMS CHAP. 6
where a is e or the first symbol of wx. By rule (2) or rules (5) and (6),
[qZp] ~ (a l y, s), where ay = wx, and [qZp'] -----~(r wx, f ) for all p ' ~ p.
Thus the basis is proved.
Suppose that the result is true for numbers of moves fewer than the num-
ber required to go from configuration (q, wx, Z) to (p, x, e). Let w = aw' for
some a ~ ~ U [e}. There are two cases to consider.
Case 1: The first move is (q, wx, Z ) ~ (q', w'x, X) for some X in F.
By the inductive hypothesis, [q'Xp] ~ (w' r"x, s) and [qZp'] ~ (I w'x, f ) for
p' ~ p. Thus by rule (3) and rules (5) and (7), we have [qXp] =-~ (w r"x, s)
and [qXp'] ~ (r" wx, f) for all p' ~ p. The extended rules of P should be
first translated into rules of the original type to prove these contentions
rigorously.
Case 2: For some X and Y in I', we have, assuming that w ' = yw",
(q, wx, Z) ~---(q', w' x, X Y ) I--e-(q", w" x, Y)I--~- (p, x, e), where the pushdown
list always has at least two symbols between configurations (q', w'x, XY)
and (q", w"x, Y). By the inductive hypothesis, [q'Xq"] ~ (y I w"x, s) and
[q"Yp] ~ (w" I x, s). Also, if p' ~ q", then [q'Xp'] ~ (I w'x, f). Suppose
first that a = ¢. If We examine rule (4) and use the definition of the extended
T D P L statements, we see that every sequence of the form [q'Xp'][p'Yp] fails
for p' -~ q". However, [q'Xq"][q"Yp] succeeds and so [qZp] ~ (w ["x, s)
as desired.
We further note that if p ' -~ p, then [q"Yp'] =~. (F"w"x, f), so that all
terms [q'Xp"][p"Yp'] fail. (If p " ~ q", then [q'Xp"] fails, and if p " = q",
then [p"Yp'] fails.) Thus, [qZp'] ~ (r" wx, f ) for p' ~ p. The case in which
a ~ E is handled similarly, using rules (5) and (8).
We must now show the "only if" portion of (6.1.1). If [qZp] ~ (w ~"x, s)
then [qZp] ~ (w I x, s) for some n.f We prove the result by induction on n.
If n = 1, then rule (2) must have been used, and the result is elementary.
Suppose that it true for n < no, and let [qZp] ~ (w ~"x, s).
Case 1: The rule for [qZp] is [qZp]---, [q'Xp]. Then J(q, e, Z ) = (p, X),
and [q'Xp] ~ (w I x, s), where nl < no. By the inductive hypothesis,
(q', wx, X)l.-~- (p, x, e). Thus, (q, wx, Z)I -e- (p, x, e).
Case 2: The rule for [qZp] is [qZp] ~ [q'Xqo][qoYp]/ "" /[q'Xqk][qkYp].
Then we can write w = w'w" such that for some p', [q'Xp'] ~ (w' I w"x, s)
and [p'Yp] ~ (w" r"x, s), where nl and nz are less than n 0. By hypothesis,
(q', w'w"x, XY)[---(p', w"x, Y)[---(p, x, e). By rule (4), ~(q, e, Z) =
(q', XY). Thus, (q, wx, Z) ~ (p, x, e).
tThe step counting must be performed by converting the extended rules to the original
form.
SEC. 6.1 LIMITED BACKTRACK TOP-DOWN PARSING 469
Case 3: The rule for [qZp] is defined by rule (5). That is, O(q, e, Z ) = ~ .
Then it is not possible that w = e, so let w = aw'. If the rule for nonterminal
[qZp]° is [qZp]a ~ e, we know that O(q, a, Z) = (p, e), so w' = e, w = a, and
(q, w, Z) ~ (p, e, e). The situations in which the rule for [qZp], is defined by
(7) or (8)are handled analogously to cases 1 and 2, respectively. We omit
these considerations.
To complete the proof of the theorem, we note that S ~ (w l, s) if and
only if for some p, [qoZoP] ~ (w I, s). By (6.1.1), [qoZoP] ~ (w l, s) if and
only if (q0, w, Z0)[-~-- (p, e, e). Thus, L(P) = L,(M).
COROLLARY
(1) If A has rule A ---~ a, for a in Z U [e}, then A ~ (a ~"w, s) for all
w ~ Z*, and A ~ (I w, f ) for all w ~ Z* which do not have prefix a.
(2) If A has rule A ~ f, then A =g. (~' w, f ) for all w ~ Z*.
(3) If A has rule A ----~B[C, D], then the following hold"
(a) If B =~ (w I xy, s) and C ~ (x r"y, s), then A ,,___~i(wx I Y, s).
(b) If B ~ (w r"x, s) and C ~ (F"x, f), then A"-----~2 (l wx, f).
(c) If B =~ (~ wx, f ) and D ~ (w r' x, s), then A m____~)(W~'X, S).
(d) If B ~ (I w, f ) and D ~ (r' w, f ) , then A m---e-~~ (I w, f).
Example 6.4
s A[C, El
C- > S[B, E]
A >a
B >b
E >e
A ~ ([" bb, f )
E~ (~"bb, s)
S~ (I bb, s)
B ___k. (b I b, s)
c (b I b, s)
A~ (a I bb, s)
S ---2-.,.(ab I b, s)
B (b s)
SEC. 6.1 LIMITED BACKTRACK TOP-DOWN PARSING 471
C~ (abb r', s)
A~ (a r"abb, s)
s ~ (aabb l, s) D
Example 6.5
We construct a G T D P L program to recognize the non-CFL [0"1"2" In > 1}.
By Example 6.4, we know how to write rules that check whether the
string at a certain point begins with 0"1" or 1"2". Our strategy will be to first
check that the input has a prefix of the form 0 ~ 1"2 for m ~ 0. If not, we shall
arrange it so that acceptance cannot occur. If so, we shall arrange an inter-
mediate failure outcome that causes the input to be reconsidered from
the beginning. We shall then check that the input is of the form 0'l J2 J.
Thus both tests will be met if and only if the input is of the form 0"1"2" for
n~l.
We shall need nonterminals that recognize a single terminal or cause
immediate success or failure; let us list them first"
(1) X >0
(2) Y- 3i
(3) Z ~2
(4) E >e
(5) r >f
(6) S , ~ A[Z, Z]
(7) A > X[B, El
(8) B - - ÷ A[Y, E]
Rules (7), (8), (1), (2), and (4) correspond to those of Example 6.4 exactly.
Rule (6) assures that S~ will recognize what A recognizes (0mlm), followed by
2. Note that A always succeeds, so the rule for $1 could be $1 ---~A[Z, W]
for any W.
Next, we must write rules that recognize 0* followed by 1~2J for some j.
The following rules suffice"
472 LIMITED BACKTRACK PARSING ALGORITHMS CHAP. 6
Rules (10), (11), (2), (3), and (4) correspond to Example 6.4, and C
recognizes 1J2j. The rule for $2 works as follows. As long as there is a prefix
of O's on the input, $2 recognizes one of them and calls itself further along
the input. When X fails, i.e., the input pointer has shifted over the O's, C is
called and recognizes a prefix of the form t j 2 j. Note that C always succeeds,
so S z always succeeds.
We must now put the subprograms for S~ and $2 together. We first
create a nonterrL aal $3, which never consumes any input, but succeeds or
fails as S~ fails or succeeds. The rule for $3 is
Note that if S~ succeeds, $3 will call F, which must fail and retract the input
pointer to the place where S~ was called. If S~ fails, $3 calls E, which succeeds.
Thus, $3 uses no input in any case. Now we can let S be the start symbol,
with rule
We now establish two theorems about GTDPL programs. First, the class
of TDPL definable languages is contained in the class of GTDPL definable
languages. Second, every language defined by a GTDPL program can be
recognized in linear time on a reasonable random access machine.
SEC. 6.1 LIMITED BACKTRACK TOP-DOWN PARSING 473
THEOREM 6 . 2
A > A'[E, D]
A' > B[C, F]
The main result of this section is that we can simulate the successful
recognition of an input sentence by a G T D P L program (and hence a T D P L
474 LIMITEDBACKTRACK PARSING ALGORITHMS CHAP. 6
THEOREM 6.3
THEOREM 6.4
For each GTDPL program there is a constant c such that Algorithm 6.2
takes no more than cn elementary steps on an input string of length n > 1,
where elementary steps are of the type used for Algorithm 4.3.
Proof The crux of the proof is to observe that in step (3) we cycle through
all the nonterminals at most k times for any given j. D
Example 6.6
Let P = (N, E, R I, E), where
N= [E,E+,T,T.,F,F', X , Y , P , M , A , L , R],
(1) g • T[E+, X]
(2) g+ ,PIE, Y]
(3) Z > F[T., X]
(4) T. > M[T, Y]
(5) F • L[F', A]
(6) F' • E[R, X]
(7) X---> f
(8) Y >e
(9) P >+
(I0) M • *
(11) A •a
(12) L: >(
(13) R •)
string (Y) serves. Then we can interpret statement (1) as saying that an expres-
sion is a term followed by something recognized by E÷, consisting of either
the empty string or an alternating sequence of q-'s and terms beginning with
•q- and ending in a term. A similar relation applies to statements (3) and (4).
Statements (5) and (6) say that a factor (F) is either ( followed by an
expression followed by ) or, if no ( is present, a single symbol a.
Now, suppose that (a + a) • a is the input to Algorithm 6.2. The matrix
[ t j constructed by Algorithm 6.2 is shown in Fig. 6.2.
Let us compute the entries in the eighth column of the matrix. The entries
for P, M, A, L, and R have value f, since they look for input symbols in
and the eighth input symbol is the right endmarker. X" always yields value
f, and Y always yields value 0. Applying step (3) of Algorithm 6.2, we find
that in the first cycle through step (4) the values for E+, T., and F can be
filled in and are 0, 0, and f, respectively. On the second cycle, T is given the
value f. The values for E and F ' can be computed on the third cycle.
( a -4- a ) • a $
E 7 3 f 1 f f 1 f
E+ 0 0 2 0 0 0 0 0
T 7 1 f 1 f f 1 f
T, 0 0 0 0 0 2 0 0
F 5 1 f 1 f f 1 f
F' f 4 f 2 f f f f
X f f f f f f f f
Y 0 0 0 0 0 0 0 0
P f f 1 f f f f f
M f f f f f 1 f f
A f 1 f 1 f f 1 f
L 1 f f f f f f f
R f f f f 1 f f f
where j = I wl. Here, Y is "called," and the position of the input head.
when Y is called is recorded, along with the entry on the pushdown list
for Y.
(2) Let ~(q, e, Z) = (begin, Y), where Y ~ F, and q = success or failure.
Then (q, w I x, (Z, i)7) t- (begin, .w ["x, (Y, i)r). Here Z "transfers" to Y.
The input position associated with Y is the same as that associated with Z.
(3) Let ~(begin, a, Z) = (q, e) for a in E u {e}. If q = success, then
(begin, w Iax, (Z, i)~,) ~- (success, wa ~ x, ~,). If a is not a prefix of x or
q = failure, then (begin, w I x, (Z, i)7) ~ (failure, u I v, ~,), where uv = wx
and [ul = i. In the latter case the input pointer is retracted to the location
given by the pointer on top of the pushdown list.
Note that if ~(begin, a, Z) = (success, e), then the next state of the parsing
machine is success if the unexpended input string begins with a and failure
otherwise.
Let ~ be the transitive closure of ~ . The language defined by M, denoted
L(M), is {w[w is in E* and (begin, I w, (Z0, 0))[ --~- (success, w r', e)}.
Example 6,7
Let M = (Q, (a, b}, (Z0, Y, A, B, E}, ~, begin, Zo), where 6 is given by
(1) 6(begin, e, Zo) = (begin, YZo)
(2) O(success, e, Z o ) = (begin, Z0)
(3) $(failure, e, Z0) = (begin, E)
(4) $(begin, e, Y) = (begin, A Y)
(5) O(success, e, Y ) = (begin, Y)
(6) 6(failure, e, Y) = (begin, B)
(7) $(begin, a, A) = (success, e)
(8) O(begin, b, B) = (success, e)
(9) O(begin, e, E) = (success, e)
M recognizes e or any string of a's and b's ending in b, but does so in
a peculiar way. A and B recognize a and b, respectively. When Y begins, it
looks for an a, and i f Y finds it, Y "transfers" to itself. Thus the pushdown
list remains intact, and a's are consumed on the input. If b or the end of
the input is reached, Y in state failure causes the top of the pushdown list
to be erased. That is, Y is replaced by B, and, whether B succeeds or fails,
that B is eventually erased.
Z0 calls Y and transfers to itself in the same way that Y calls A. Thus
SEC. 6.1 LIMITED BACKTRACK TOP-DOWN PARSING 479
any string of a's and b's ending in b will eventually cause Z0 to be erased and
state success entered. The action of M on input abaa is given by the following
sequence of configurations"
Note that abaa is not accepted because the end of the input was not
reached at the last step. However, ab alone would be accepted. It is important
also to note that in the fourth from last configuration B is not "called" but
replaces Y. Thus the number 2, rather than 4, appears on top of the list, and
when B fails, the input head backtracks. [-7
We shall now prove that a language is defined by a parsing machine if
and only if it is defined by a GTDPL program.
LEMMA 6.5
If L -- L(M) for some parsing machine M -- (Q, E, r', 6, begin, z0), then
L -- L(P) for some GTDPL program P.
400 LIMITED BACKTRACK PARSING ALGORITHMS CHAP. 6
This inference requires an inductive proof in its own right, but is left for
the Exercises.
SEC. 6.1 LIMITED BACKTRACK T O P - D O W N PARSING 481
(6.1.8) (begin, T'wx, (Z, 0)) ~ (begin, ~' wx, (Y~, 0)(Z, 0))
and
(6.1.13) (begin, ~'w, (Z, 0)) ~ (begin, ~"w, (Y~, O)(Z, 0))
and
Case 2 : Y 1 ~ (~ w, f ) and Y3 =%. ([' w, f). This case is similar and left
to the reader.
lf: The "if" portion of the proof is similar to the foregoing, and we leave
the details for the Exercises.
As a special case of (6.1.3), Z0 =~ (w l', s) if and only if (begin, [' w, (Z0, 0))
!-~-- (success, w [', e), so L(M) = L(P). D
LEMMA 6.6
If L = L(P) for some GTDPL program P, then L = L(M) for a parsing
machine M.
Proof. Let P = (N, X, R, S) and define M = (Q, X, N, ~, begin, S).
Define ~ as follows:
(1) If R contains rule A ---~ B[C, D], let ~(begin, e, A) = (begin, BA),
$(sueeess, e, A) = (begin, C) and dr(failure, e, A) = (begin, D).
(2) (a) If A----~ a is in R, where a is in X u [e}, let $(begin, a, A ) =
(success, e).
(b) If A ---~ f is in R, let 6(begin, e, A) = (failure, e).
A proof that L ( M ) = L(G) is straightforward and left for the Exercises.
D
THEOREM 6.5
A language L is L(M) for some parsing machine M if and only if it is
L(P) for some GTDPL program P.
Proof Immediate from Lemmas 6.5 and 6.6. D
EXERCISES
"6.1.3. Show that for every LL(1) language there is a GTDPL program which
recognizes the language with no backtracking; i.e., the parsing machine
constructed by Lemrna 6.6 never moves the input pointer t o the left
between successive configurations.
"6.1.4. Show that it is undecidable whether a TDPL program P = (N, X~, R, S)
recognizes
(a) ~.
(b) X*.
6.1.5. Show that every TDPL or GTDPL program is equivalent to one in
which every nonterminal has a rule. Hint: Show that if A has no rule,
you can give it rule A ~ A A / A (or the equivalent in GTDPL) with no
change in the language recognized.
"6.1.6. Give a TDPL program equivalent to the following extended program.
What is the language defined ? From a practical point of view, what
defects does this program have ?
s ~ Zln/C
h ------~ a
B -----~ S C A
C--+ b
6.1.13. Show that there are TDPL (and GTDPL) programs in which the
number of statements executed by the parsing machine of Lemma 6.6
is an exponential function of the length of the input string.
6.1.14. Construct a GTDPL program to simulate the meaning of the rule
A ~ BC/(D1, D2) mentioned on p. 473.
6.1.15. Find a GTDPL program which defines the language L ( M ) , where M
is the parsing machine given in Example 6.7.
6.1.16. Find parsing machines to recognize the languages of Exercise 6.1.2.
DEFINITION
A TDPL or GTDPL program P = (N, ~Z, R, S) has a partial accep-
tance failure on w if w = uv such that v -~ e and S ~ (u T"v, s). We
say that P is well formed if for every w in E*, either S ~ (1' w, f ) or
s=~ (w, Ls).
"6.1.17. Show that if L is a TDPL language (alt. G T D P L language) and $ is
a new symbol, then L$ is defined by a TDPL program (alt. GTDPL
program) with no partial acceptance failures.
"6.1.18. Let L1 be defined by a TDPL (alt. GTDPL) program and L2 by a well-
formed TDPL (alt. GTDPL) program. Show that
(a) L1 U L2,
(b) L,2,
(c) L1 ~ L2, and
(d) L1 -- .L2
are TDPL (alt. GTDPL) languages.
"6.1.19. Show that every GTDPL program with no partial acceptance failure is
equivalent to a well-formed G T D P L program. Hint: It suffices to look
for and eliminate "left recursion." That is, if we have a normal form
G T D P L program, create a CFL by replacing rules ,4 --~ B[C, D] by
productions A ~ B C and A ~ D. Let A ~ a or A ~ e be produc-
tions of the CFL also. The "left recursion" referred to is in the CFL
constructed.
*'6.1.20. Show that it is undecidable for a well-formed TDPL program
P = (N, X, R, S) whether L ( P ) = ~ . Note: The natural embedding of
Post's correspondence problem proves Exercise 6.1.4(a), but does not
always yield a well-formed program.
6.1.21. Complete the proof of Lemma 6.5.
6.1.22. Prove Lemma 6.6.
Open P r o b l e m s
6.1.23. Does there exist a context-free language which is not a GTDPL lan-
guage ?
6.1.24. Are the TDPL languages closed under complementation ?
s~c. 6.2 LIMITED BACKTRACK BOTTOM-UP PARSING 485
Programming Exercises
6.1.27. Design an interpreter for parsing machines. Write a program that takes
an extended GTDPL program as input and constructs from it an
equivalent parsing machine which the interpreter can then simulate.
6.1.28. Design a programming language centered around GTDPL (or TDPL)
which can be used to specify translators. A source program would be
the specification of a translator and the object program would be the
actual translator. Construct a compiler for this language.
BIBLIOGRAPHIC NOTES
tTMG comes from the word "transmogrify," whose meaning is "to change in appear-
ance or form, especially, strangely or grotesquely.'"
486 LIMITEDBACKTRACK PARSING ALGORITHMS CHAP. 6
Example 6.8
Consider the grammar G with productions
S • AalBb
A > 0A1101
B > 0Bl11011
G generates the language [O"Ya[n > 1} U {0"12"b ]n > 1}, which is not a
deterministic context-free language. However, we can clearly parse G by
first moving the input pointer to the end of an input string to see whether
the last symbol is a or b and then returning to the beginning of the string and
parsing as though the string were 0"1" or 0"12,, as appropriate. [-7
Example 6.9
Consider the grammar G having productions
S > OABb[OaBc
A >a
B~ >Blll
L(G) is the regular set 0al+(b + c), but G is not LR. However, we can parse
G bottom-up if we defer the decision of whether a is a phrase in a sentential
form until we have scanned the last input symbol. That is, an input string of
the form 0al" can be reduced to OaB independently of whether it is followed
by b or c. In the former case, OaBb is first reduced to OABb and then to S.
In the latter case, OaBc is reduced directly to S. Of course, we shall not pro-
duce either a left or right parse. [Z]
6,2.2. T w o - S t a c k Parsers
the first pushdown list and if the string fl is on top of the second, then we
can replace 0c by ~, on the first pushd.own list and fl by y on the second. Rules
of type (1) correspond to a shift in a shift-reduce parsing algorithm. Those of
type (2) are related to reduce moves; the essential difference is that the
symbol A, which is the left side of the production involved, winds up on
the top of the second pushdown list rather than the first. This arrangement
corresponds to limited backtracking. It is possible to move symbols from
the first pushdown list to the second (which acts as the input tape), but only
at the time of a reduction. Of course, rules of type (1) allow symbols to move
from the second list to the first at any time.
A configuration of a two-stack parser T is a triple (0~, fl, n), where tx
$(N u E)*, fl ~ (N u E)* $, and rt is a string of pairs consisting of an integer
and a production number. Thus, n could be part of a parse of some string
in L(G). We say that (0c, fl, zt)]-T (0¢, fl', n') if
(1) ~ = txltx2, fl = fl2fll, (t~2, f12) "-~ (r, ~) is a rule of T;
(2) ~' = 0c17, fl' = ~flx ; and
(3) If ((x2, f12) ~ (Y, d~) is a type 1 rule, then zt' = rt; if a type 2 rule and
production i is the applicable production, then ~z'= n(i, j), where j is equal
to I ~ ' 1 - 1.t
Note that the first stack has its top at the right and that the second has
its at the left.
We define I-~-, I~--, and [-~- from I T in the usual manner. The subscript
T will be dropped whenever possible.
The translation defined by T, denoted z(T), is [(w,n)l($, w$, e)! ~
($, S$, zt)3. We say that T is valid for G if for every w ~ L(G), there exists
a bottom-up parse n of w such that (w, rt) ~ z(T). It is elementary to show
that if (w, zt) ~ z(T), then rt is a bottom-up parse of w.
T is deterministic if whenever (0~1, ill) ----~(Yl, ~1) and (~2, flz) --~ (Y2, ~2)
are rules such that 0~ is a suffix of ~2 or vice versa and fll is a prefix of
f12 or vice versa, then ~'1 = ~'2 and di~ = ~2. Thus for each configuration C,
there is at most one C' such that C ~- C'.
Example 6.10
The string (3, 2)(4, 4)(4, 3)(2, 1)(1, 0) is a bottom-up parse of abbaa,
corresponding to the derivation
The two-stack parser has an anomaly in common with the general shift-
reduce parsing algorithms; if a grammar is ambiguous, it may still be pos-
400 LIMITED BACKTRACK PARSING ALGORITHMS CHAP. 6
Example 6.11
Let G be defined by the productions
S----~'AIB
A > aAla
B > Bala
Example 6.12
Let G be the grammar with productions
a b S A $
<. <.
.> .>
Fig. 6.3 "Precedence" relations.
where i v is 1 or 2, 1 < j _< n. Note the last 3n moves alternately shift S and
A, and then reduce either aSA or bSA to S.
It is easy to check that T is deterministic, so no other sequences of moves
are possible with words in L(G). Since all reductions of T are according to
productions of G, it follows that T is a two-stack parser for G. D
Example 6.13
Example 6.14
(1) < = / t 2 +.
(2) ± = u.
(3) > = p + a 2 * m ( N u X ) x £.
Example 6.15
Here
494 LIMITEDBACKTRACK PARSING ALGORITHMS CHAP. 6
LEMMA 6.7
LEMMA 6.8
Proof. Suppose that G is. Let G have Colmerauer relations < , --', and
>• , and let T b e the induced two-stack parser. Since G is assumed to be proper,
there exist x and y in Z* such that X *=~ x and Y *=~ y. Since X/z Y, there
is a production A ~ aXYfl and strings wl, w2, w3, and w4 in I;* such
that S *~ wlAw4 ==~ wlocXYflw4 ==~ * wlwzXYw3w 4 ==~ * wlw2xyw3w4. Since
X p*,uA + Y, there exists a production B ~ 7ZC5 such that Z ~ ~,'X, C =~-
YO', and for some zl, z2, z3, and z4, we have S ~ zaBz4 =-~ za?ZCSza
zly?'XYO'Sz4 =~ zlz~ XYz3 z4 =~ zlz2xyz3z4.
By Lemma 6.7, we may assume that X " Y. Let us watch the processing
by T of the two strings u = wlw2xyw 3 w4 and v = zlz2xyz3z 4. In particular,
let us concentrate on the strings to which x and y are reduced in each case,
and whether these strings appear on stack 1, stack 2, or spread between
them. Let 01, 0 2 , . . . be the sequence of strings to which xy is reduced
in u and W1, W2,. • • that sequence in v. We know that there is some j such
SEC. 6.2 LIMITED BACKTRACK BOTTOM-UP PARSING 495
"l'Note that we are using symbols such as X and Y to represent specific instances of
that symbol in the derivations, i.e., particular nodes of the derivation tree. We trust that
the intended meaning will be clear.
LEMMA 6.9
Let G = (N, Σ, P, S) be a CFG such that for some X and Y in N ∪ Σ, X ρ⁺μ Y and X μλ* Y. Then G is not a Colmerauer grammar.
Proof. The proof is left for the Exercises, and is similar, unfortunately, to Lemma 6.8. Since X ρ⁺μ Y, we can find A → αZYβ in P such that Z ⇒* γ′X. Since X μλ* Y, we can find B → γXCδ in P such that C ⇒* Yδ′. By the properness of G, we can find words u and v in L(G) such that each derivation of u involves the production A → αZYβ and the derivation of γ′X from that Z; each derivation of v involves B → γXCδ and the derivation of Yδ′ from C. In each case, X derives x and Y derives y for some x and y in Σ*.
As in Lemma 6.8, we watch what happens to xy in u and v. In v, we find that Y must be reduced before X, while in u, either X and Y are reduced at the same time (if Z ⇒* γ′X is a trivial derivation) or X is reduced before
LEMMA 6.10
Let G = (N, Σ, P, S) be any proper grammar. Then if αXYβ is any sentential form of G, we have X ρ*μλ* Y.
Proof. Elementary induction on the length of a derivation of αXYβ. □
LEMMA 6.11
Let G = (N, Σ, P, S) be unambiguous and proper, with μ ∩ ρ*μλ⁺ = ∅ and ρ⁺μ ∩ μλ* = ∅. If αYX₁ ⋯ XₖZβ is a sentential form of G, then the conditions X₁ μ X₂, …, Xₖ₋₁ μ Xₖ, Y ρ*μλ⁺ X₁, and Xₖ ρ⁺μλ* Z imply that X₁ ⋯ Xₖ is a phrase of αYX₁ ⋯ XₖZβ.
Proof. If not, then there is some other phrase of αYX₁ ⋯ XₖZβ which includes X₁.
Case 1: Assume that X₂, …, Xₖ are all included in this other phrase. Then either Y or Z is also included, since the phrase is not X₁ ⋯ Xₖ. Assuming that Y is included, then Y μ X₁. But we know that Y ρ*μλ⁺ X₁, so that μ ∩ ρ*μλ⁺ ≠ ∅. If Z is included, then Xₖ μ Z. But we also have Xₖ ρ⁺μλ* Z. If λ* represents at least one instance of λ, i.e., Xₖ ρ⁺μλ⁺ Z, then μ ∩ ρ*μλ⁺ ≠ ∅. If λ* represents zero instances of λ, then Xₖ ρ⁺μ Z. Since Xₖ μ Z, we have Xₖ μλ* Z, so ρ⁺μ ∩ μλ* ≠ ∅.
Case 2: Xᵢ is in the phrase, but Xᵢ₊₁ is not, for some i such that 1 ≤ i < k. Let the phrase be reduced to A. Then by Lemma 6.10 applied to the sentential form to which we may reduce αYX₁ ⋯ XₖZβ, we have A ρ*μλ* Xᵢ₊₁, and hence Xᵢ ρ⁺μλ* Xᵢ₊₁. But we already have Xᵢ μ Xᵢ₊₁, so either μ ∩ ρ*μλ⁺ ≠ ∅ or ρ⁺μ ∩ μλ* ≠ ∅, depending on whether at least one or zero instances of λ are represented by λ* in ρ⁺μλ*. □
LEMMA 6.12
Let G = (N, Σ, P, S) be a CFG which is unambiguous, proper, and uniquely invertible and for which μ ∩ ρ*μλ⁺ = ∅ and ρ⁺μ ∩ μλ* = ∅. Then G is a Colmerauer grammar.
Proof. We define Colmerauer precedence relations as follows:
THEOREM 6.6
Example 6.16
(A table of precedence relations over b, S, A, and $, containing <· and ·> entries.)
EXERCISES
6.2.1. Which of the following are top-down parses in Go ? What word is derived
if it is ?
(a) (1, 0) (3, 2) (5, 4) (2, 0) (4, 2) (2, 5) (4, 0) (4, 5) (6, 5) (6, 2) (6, 0).
(b) (2, 0) (4, 0) (5, 0) (6, 1).
6.2.2. Give a two-stack parser valid for Go.
6.2.3. Which of the following are Colmerauer grammars?
(a) G₀
(b) S → aA | bB
    A → 0A1 | 01
    B → 0B11 | 011
(c) S → aAB | b
    A → bSB | a
    B → a
*6.2.4. Show that if G is proper, uniquely invertible, μ ∩ ρ*μλ⁺ = ∅, and ρ⁺μ ∩ μλ* = ∅, then G is unambiguous. Can you use this result to strengthen Theorem 6.6?
6.2.5. Show that every uniquely invertible regular grammar is a Colmerauer grammar.
6.2.6. Show that every uniquely invertible grammar in GNF such that ρ⁺μ ∩ μ = ∅ is a Colmerauer grammar.
6.2.7. Show that the two-stack parser of Example 6.12 is valid.
6.2.8. Show, using Theorem 6.6, that every simple precedence grammar is a
Colmerauer grammar.
6.2.9. Prove Lemma 6.9.
6.2.10. Prove Lemma 6.10.
6.2.11. Let G be a Colmerauer grammar with Colmerauer precedence relations <·, ≐, and ·> such that the induced two-stack parser not only parses every word in L(G), but correctly parses every sentential form of G. Show that
(a) μ ⊆ ≐.
(b) μλ⁺ ⊆ <·.
(c) ρ⁺μ ⊆ ·>.
(d) ρ⁺μλ⁺ ⊆ <· ∪ ·>.
Open Problem
6.2.18. Characterize the class of CFG's, unambiguous or not, which have valid
deterministic two-stack parsers induced by disjoint "precedence" rela-
tions.
BIBLIOGRAPHIC NOTES
Colmerauer grammars and Theorem 6.6 were first given by Colmerauer [1970].
These ideas were related to the token set concept by Gray and Harrison [1969].
Cohen and Culik [1971] consider an LR(k)-based scheme which effectively incor-
porates backtrack.
tThis is a special case of Lemma 6.7 and is included only for completeness.
APPENDIX
of this language in two parts. The first part consists of the high-level produc-
tions which define the base language. This base language can be used as
a block-structured algebraic language by itself.
The second part of the description is the set of productions which defines
the extension mechanism. The extension mechanism allows new forms of
statements and functions to be declared by means of a syntax macro defini-
tion statement using production 37. This production states that an instance
of a (statement) can be a (syntax macro definition), which, by productions
39 and 40, can be either a (statement macro definition) or a (function macro
definition).
In productions 41 and 42 we see that each of these macro definitions
involves a (macro structure) and a (definition). The (macro structure)
portion defines the form of the new syntactic construct, and the (definition)
portion gives the translation that is to be associated with the new syntactic
construct. Both the (macro structure) and (definition) can be any string of
nonterminal and terminal symbols except that each nonterminal in the
(definition) portion must appear in the (macro structure). (This is similar
to a rule in an SDTS except that here there is no restriction on how many
times one nonterminal can be used in the translation element.)
We have not given the explicit rules for (macro structure) and (defini-
tion). In fact, the specification that each nonterminal in the (definition)
portion appear in the (macro structure) portion of a syntax macro definition
cannot be specified by context-free productions.
Production 37 indicates that we can use any instance of a (macro struc-
ture) defined in a statement macro definition wherever (statement) appears
in a sentential form. Likewise, production 43 allows us to use any instance
of a (macro structure) defined in a function macro definition anywhere
(primary) appears in a sentential form.
For example, we can define a sum statement by the derivation
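Operationally, the macro mechanism amounts to matching a statement against the (macro structure) pattern and substituting the matched nonterminals into the (definition) template. The toy Python sketch below illustrates only that substitution idea; the pattern syntax, the "sum" macro, and the generated text are all invented for the example and are not part of the base language described here.

```python
import re

# Toy statement-macro table: each entry pairs a macro-structure pattern with a
# definition template.  Nonterminals in the pattern become named groups; every
# nonterminal used in the template must also occur in the pattern, mirroring the
# restriction stated above.  The macro itself is invented for illustration.
MACROS = [
    (r"sum (?P<variable>\w+) over (?P<expression>.+)",
     "{variable} <- 0; for each x in {expression} do {variable} <- {variable} + x"),
]

def expand(statement):
    """Expand a statement that matches some macro structure; otherwise return it unchanged."""
    for pattern, template in MACROS:
        m = re.fullmatch(pattern, statement)
        if m:
            return template.format(**m.groupdict())
    return statement

print(expand("sum total over a(1), a(2), a(3)"))
```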
High-Level Productions
1 (program) →
    (block)
2 (block) →
    begin (opt local ids) (statement list) end
3 (opt local ids) →
    (opt local ids) local (identifier) ; | e
5 (statement list) →
    (statement) | (statement list) ; (statement)
7 (statement) →
    (variable) ← (expression) | goto (identifier) |
    if (expression) then (statement) | (block) | result (expression) |
    (label) : (statement)
13 (expression) →
    (arithmetic expression) (relation op) (arithmetic expression) |
    (arithmetic expression)
15 (arithmetic expression) →
    (arithmetic expression) (add op) (term) | (term)
17 (term) →
    (term) (mult op) (primary) | (primary)
19 (primary) →
    (variable) | (constant) | ( (expression) ) | (block)
23 (variable) →
    (identifier) | (identifier) ( (expression list) )
25 (expression list) →
    (expression list) , (expression) | (expression)
27 (relation op) →
    < | ≤ | = | ≠ | > | ≥
33 (add op) →
    + | −
35 (mult op) →
    * | /
Extension Mechanism
37 (statement) →
    (syntax macro definition) | (macro structure)
39 (syntax macro definition) →
    (statement macro definition) | (function macro definition)
41 (statement macro definition) →
    smacro (macro structure) define (definition) endmacro
42 (function macro definition) →
    fmacro (macro structure) define (definition) endmacro
43 (primary) →
    (macro structure)
Notes
High-Level Productions
1 (statement>--~
(assignment statement> i (matching statement>!
(replacement statement>l(degenerate statement>l<end statement>
6 (assignment statement>
(optional label> <subject field> (equal> (object field> (goto field>
<eos>
7 (matching statement>
<optional label> (subject field> <pattern field> <goto field> (eos>
8 (replacement statement> --~
<optional label> <subject field> <pattern field> <equal> <object
field> (goto field> (eos>
9 (degenerate statement> ---~
(optional label> <subject field> (goto field> <eos> I
(optional label> (goto field> (eos>
11 (end statement> --~
END <eos> l END <blanks> <label> (eos> I
END <blanks> END (eos>
14 <optional label>
(label> [e
16 <subject field> --,
(blanks> (element>
17 (equal) ---~
(blanks) =
18 (object field)
(blanks) (expression)
19 (goto field)
(blanks) : (optional blanks)(basic goto)l e
21 (basic goto)----~
(goto) !S (goto) (optional blanks) (optional F goto) I
F (goto)(optional blanks)(optional S goto)
24 (goto)---~
((expression)) [ < (expression) >
26 (optional S goto) ---~
S (goto) Ie
28 (optional F goto) --~
F(goto)le
30 (eos)
(optional blanks~ ;i (optional blanks~ (eol~
32 (pattern field~
(blanks) (expression~
33 (element)
(optional unaries~ (basic element~
34 (optional unaries~---~
(operator) (optional unaries) 1e
36 ~basic element~
~identifier~ [~literal~ i (function call) I(reference~ I((expressionS)
41 (function call~ ---~
(identifier~ ((arg list~)
42 (reference~
(identifier) < (arg list) >
43 (arg list~ ---~
(arg list), (expression~ I(expression)
45 (expression~----~
(optional blanks~ (element) (optional blanks) [
(optional blanks~ (operation~ (optional blanks) 1
(optional blanks~
48 (optional blanks~
(blanks)! e
50 (operation~
(element) (binary) (element) l (element) (binary> (expression~
Regular Definitions for Lexical Syntax
(digit~ =
0111213141516171819
(letter) =
AtBICI...IZ
(alphanumeric) =
(letter) l (digit)
(identifier) =
(letter) ((alphanumeric) i. I )*
(blanks) =
(blank character) +
(integer) =
(digit) +
(real) =
(integer). (integer) i (integer).
(operator) =
~l?l$1.ltt%l,I/l#1+l-l@llt&
(binary) =
(blanks 5 [(blanks) (operator) (blanks51 (blanks ~ ** (blanks)
(sliteral 5 =
' ((EBCDIC charactery -- ')* 't
(dliteral) =
" ( ( E B C D I C character) -- ")* "
(literal) =
(sliteral) [(dliteral) [(integer) I(real)
(label) =
(alphanumeric) ((EBCDIC character) -- ((blank character) 1;) )*
-- E N D
Lexical Variables
(blank character)
( E B C D I C character)
(eol~:l:
High-Level Productions
1 (register)--~
(identifier)
2 (cell identifier) ----~
(identifier)
3 (procedure identifier)----~
(identifier)
4 (function identifier)
(identifier)
5 (cell)---~
(cell identifier) ](celll)) l (cell2))
8 (celll) --,
(cell2) (arith op) (number)! (cell3) (number)
lO (cell2)
(cell3) (register)
11 (cell3)
(cell identifier) (
12 (unary op)--~
abs [ neg t neg abs
15 (arith opt
+i-I,i/!++i--
21 (logical op) ---~
and iorlxor
24 (shift op)---~
shla i shra Ishll [shrl
28 (register assignment) --~
(register) := (cell) 1
(register) := (number)l
(register) := (string)[
(register) := (register) i
(register) "= (unary op) (cell) !
(register) := (unary op)(number)l
(register) := (unary op)(register) !
(register) := @ (cell)]
(register assignment) (arith op) (cell) l
(register assignment) (arith op) (number)[
(register assignment) (arith op) (register) !
(register assignment) (logical op) (cell) !
(register assignment) (logical op) (number)[
(register assignment) (logical op) (register) !
(register assignment) (shift op) (number) I
(register assignment) (shift op) (register)
44 (funcl) --~
(func2) (number) I
(func2) (register)[
(func2) (cell) [
(func2) (string)
48 (func2)--~
(function identifier) ) ](func 1),
50 (case sequence)----~
case (register) of begin I(case sequence) (statement) ;
52 (simple statement) ---~
(cell) := (register) I(register assignment) Inull [goto (identifier) [
(procedure identifier) ](function identifier) !(func 1) ( ]
(case sequence) end I(blockbody) end
61 (relation) --~
<1=1>1<=1>=[~=
67 (not)---~
--1
68 (condition)
(register) (relation) (cell) I
(register) (relation) (number)[
(register) (relation) (register) [
(register) (relation) (string) !
overflow 1(relation) I
(cell) [(not) (cell)
76 (compound condition)
(condition) ](comp aor) (condition)
78 (comp aor)--~
(compound condition) and 1(compound condition) or
80 (cond then.)----~
(compound condition) then
81 (true part) ---~
(simple statement) else
82 (while)--~
while
83 (cond do)---~
(compound condition) do
84 (assignment step) ---~
(register assignment) step (number)
85 (limit)
until (register) Iuntil (cell) Iuntil (number)
88 (do)
do
89 (statement*)
(simple statement)[
if (cond then) (statement*) I
if (cond then) (true part) (statement*) I
(while) (cond do) (statement*) l
for (assignment step) (limit) (do) (statement*)
94 (statement)----~
(statement*)
95 (simple type)
short integer [ integer ! logical treal [ long real Ibyte [ character
102 (type) ---~
(simple type) !array (number) (simple type)
104 (decll)--~
(type) (identifier) I(decl2) (identifier)
106 (decl2)
(decl7),
107 (decl3) ---~
(decll~ =
108 (decl4) --~
(decl3) (I (decl5),
110 (decl5~ ---o
(decl4) (number) I(decl4) (string)
112 (decl6) ---~
(decl3)
113 (dee17) --~
(decll) I
(decl6) (number) [
(decl6) (string) I
(decl5))
117 ~function declarationl)
function I(function declaration7)
119 (function declaration2) ---~
(function declarationl) (identifier)
120 (function declaration3)---~
(function declaration2) (
121 (function declaration4) ---~
(function declaration3) (number)
122 (function declaration5)
(function declaration4),
123 (function declaration6)
(function declaration5) (number)
124 (function declaration7) --~
(function declaration6) )
125 (synonymous dcl) ----~
(identifier)
(string)
(number)
A.4. A S Y N T A X - D I R E C T E D TRANSLATION
S C H E M E FOR PAL
High-Level Productions
I (program)--~
(definition list)I <expressio n)
3 (definition list)
def (definition) (definition list) ]def (definition)
5 (expression)
let (definition) in (expression)]
fn (by p a r t ) . (expression)]
(where expression)
8 (where expression)
(valof expression) where (rec definition) ](valof expression)
10 (valof expression)
valof (command) [(command)
12 (command)--~
(labeled command) ; (command) ](labeled command)
14 (labeled command)
(variable) : (labeled command)[(conditional command)
16 (conditional command) --~
test (boolean) ifso (labeled command) ifnot (labeled command)[
test (boolean) ifnot (labeled command) ifso (labeled command)[
†A complete description of PAL is given in: John M. Wozencraft and Arthur Evans, Jr., Notes on Programming Linguistics, Department of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, Mass., July 1969. The syntax is reprinted by permission of the authors.
‡F. L. DeRemer, Practical Translators for LR(k) Languages, Ph.D. Thesis, M.I.T., Cambridge, Mass., 1969, by permission of the author.
1 (program)=
(definition list) I(expression)
3 (definition list) =
(definition) (definition list) defl (definition) lastdef
5 (expression)=
(definition) (expression) let l
(bv part) (expression) lambda I
(where expression)
8 (where expression) =
(valof expression) (rec definition) where I
(valof expression)
10 (valof expression) =
(command) valof !(command)
12 (command)=
(labeled command)(command); I(labeled command)
14 (labeled command) =
(variable) (labeled command): I(conditional command)
16 (conditional command) =
(boolean) (labeled command) (labeled command) test-true I
(boolean) (labeled command) (labeled command) test-false I
Regular Definitions
<uppercase letter> =
    A | B | C | ⋯ | Z
<lowercase letter> =
    a | b | c | ⋯ | z
<digit> =
    0 | 1 | 2 | ⋯ | 9
<letter> =
    <uppercase letter> | <lowercase letter>
<alphanumeric> =
    <letter> | <digit>
<truthvalue> =
    true | false
<variable head> =
    <digit>+ (<letter> | _) |
    <lowercase letter>+ (<uppercase letter> | <digit> | _) |
    <uppercase letter> | _
<variable> =
    <lowercase letter> | <variable head> (<alphanumeric> | _)*
<integer> =
    <digit>+
<real> =
    <digit>+ . <digit>+
<quotation element> =
    (any character other than * or ') |
    *n | *t | *b | *s | ** | *' | *k | *r
<quotation> =
    ' <quotation element>* '
<constant> =
    <integer> | <real> | <quotation> | <truthvalue> | e
<relational functor> =
    gr | ge | eq | ne | ls | le
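Regular definitions of this kind translate almost directly into the regular-expression syntax of most programming languages. The fragment below renders a few of the PAL token classes; the escape-sequence set and the treatment of the quotation class follow the definitions above as reconstructed here, so the details should be taken as approximate.

```python
import re

# A few PAL lexical classes written as Python regular expressions (illustrative only).
DIGIT      = r"[0-9]"
INTEGER    = rf"{DIGIT}+"
REAL       = rf"{DIGIT}+\.{DIGIT}+"
TRUTHVALUE = r"true|false"
# <quotation>: quoted text in which * introduces an escape such as *n or *t.
QUOTATION  = r"'(?:[^*']|\*[ntbsk*'r])*'"

TOKEN = re.compile(
    rf"(?P<real>{REAL})|(?P<integer>{INTEGER})|"
    rf"(?P<truthvalue>\b(?:{TRUTHVALUE})\b)|(?P<quotation>{QUOTATION})"
)

for m in TOKEN.finditer("3.14 42 true 'a*nb'"):
    print(m.lastgroup, m.group())
```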
BIBLIOGRAPHY
GRAY, J. N., and M. A. HARRISON [1969]. Single pass precedence analysis. IEEE Conf. Record of 10th Annual Symposium on Switching and Automata Theory, pp. 106-117.
GRAY, J. N., M. A. HARRISON, and O. IBARRA [1967]. Two way pushdown automata. Information and Control 11:1, 30-70.
GREIBACH, S. A. [1965]. A new normal form theorem for context-free phrase structure grammars. J. ACM 12:1, 42-52.
GREIBACH, S., and J. HOPCROFT [1969]. Scattered context grammars. J. Computer and System Sciences 3:3, 233-247.
GRIES, D. [1971]. Compiler Construction for Digital Computers. Wiley, New York.
GRIFFITHS, T. V. [1968]. The unsolvability of the equivalence problem for Λ-free nondeterministic generalized machines. J. ACM 15:3, 409-413.
GRIFFITHS, T. V., and S. R. PETRICK [1965]. On the relative efficiencies of context-free grammar recognizers. Comm. ACM 8:5, 289-300.
GRISWOLD, R. E., J. F. POAGE, and I. P. POLONSKY [1971]. The SNOBOL4 Programming Language (2nd ed.). Prentice-Hall, Inc., Englewood Cliffs, N.J.
GROSS, M., and A. LENTIN [1970]. Introduction to Formal Grammars. Springer, Berlin.
HAINES, L. H. [1970]. Representation Theorems for Context-Sensitive Languages. Department of Electrical Engineering and Computer Sciences, Univ. of California, Berkeley.
HALMOS, P. R. [1960]. Naive Set Theory. Van Nostrand Reinhold, New York.
HALMOS, P. R. [1963]. Lectures on Boolean Algebras. Van Nostrand Reinhold, New York.
HARARY, F. [1969]. Graph Theory. Addison-Wesley, Reading, Mass.
HARRISON, M. A. [1965]. Introduction to Switching and Automata Theory. McGraw-Hill, New York.
HARTMANIS, J., and J. E. HOPCROFT [1970]. An overview of the theory of computational complexity. J. ACM 18:3, 444-475.
HARTMANIS, J., P. M. LEWIS II, and R. E. STEARNS [1965]. Classifications of computations by time and memory requirements. Proc. IFIP Congress 65. Spartan, New York, pp. 31-35.
HAYS, D. G. [1967]. Introduction to Computational Linguistics. American Elsevier, New York.
HEXT, J. B., and P. S. ROBERTS [1970]. Syntax analysis by Domolki's algorithm. Computer J. 13:3, 263-271.
HOPCROFT, J. E. [1971]. An n log n Algorithm for Minimizing States in a Finite Automaton. CS71-190, Computer Science Department, Stanford Univ., Stanford, Calif. Also in Theory of Machines and Computations (Z. Kohavi and A. Paz, eds.), Academic Press, New York, 1972, pp. 189-196.
INDEX TO VOLUME I
Input symbol, 113, 168, 218, 224 Left-bracketed representation (for trees),
Input tape, 93-96 46
Intermediate (of an indexed grammar), Left-corner parse, 278-280, 310-312,
100 362-367
Intermediate code, 59, 65-70 Left-corner parser, 310-312
Interpreter, 55 Left cover, 275-277, 280, 307
Intersection, 4, 197, 201,208, 484 Left factoring, 345
Inverse (of a relation), 6, 10-11 Left linear grammar, 122
Inverse finite transducer mapping, 227 Leftmost derivation, 142-143, 204, 318-
Irland, M. I., 36 320
Irons, E. T., 77, 237, 314, 455 Left parsable grammar, 271-275, 341
Irreflexive relation, 9 Left parse (see Leftmost derivation, Top-
Item (Earley's algorithm), 320, 331,397- down parsing)
398 Left parse language, 273, 277
Item (LR(k)), 381 (see also Valid item) Left parser, 266--268
Left recursion, 484 (see also Left re,cur-
sive grammar)
Left-recursive grammar, 153-158, 287-
288, 294-298, 344-345
Johnson, W. L., 263 Left sentential form, 143
Leinius, R. P., 399, 426
Length
of a derivation, 86
of a string, 16
Lentin, A., 211
Lewis, P. M., II, !92, 237, 368
Kameda, T., 138 Lexical analysis, 59-63, 72-74, 251-264
Kasami, T., 332 Lexicographic order, 13
Keyword, 59, 259 Linear bounded automaton, 100 (see also
Kleene, S. C., 36, 124 Context-sensitive grammar)
Knuth, D. E., 36, 58, 368, 399, 485 Linear grammar/language, 165-170, 207-
Korenjak, A. J., 368, 399 208, 237
Kosaraju, S. R., 138 Linear order, 10, 13-14, 43-45
k-predictive parsing algorithm (see Pre-
Linear set, 209-210
dictive parsing algorithm)
LL(k) grammar/language, 73, 268, 333-368,
Kuno, S., 313
397-398, 448, 452
Kurki-Suonio, R., 368
LL(k) table, 349-351,354-355
LL(1) grammar, 342-349, 483
Loeckx, J., 455
Logic, 19-25
Logical connective, 21-25
Labeled graph, 38, 42 Lookahead, 300, 306, 331, 334-336, 363,
Lalonde, W. R., 450 371
Lambda calculus, 29 Looping (in a pushdown automaton),
Language, 16-17, 83-84, 86, 96, 114, 169 186-189
(see also Recognizer, Grammar) LR(k) grammar/language, 73, 271, 369,
LBA (see Linear bounded automaton) 371-399, 402, 424, 428, 430, 448
LC(k) grammar/language, 362-367 LR(k) table, 374-376, 392-394, 398
Leavenworth, B. M., 58, 501 LR (1) grammar, 410, 448-450
Lee, E. S., 450 Lueas, P., 58
Lee, J. A. N., 76 Lukasiewicz, J., 214
Partial acceptance failure (of a TDPL or Prefix expression, 214-215, 229, 236
GTDPL program), 484 Prefix property, 17, 19, 209
Partial correspondence problem, 36 Preorder (of a tree), 43
Partial function, 10 Problem, 29-36
Partial left parse, 293-296 Procedure, 25-36
Partial order, 9-10, 13-15, 43-45 Product
Partial recursive function (see Recursive of languages, 17 (see also Concatena-
function) tion )
Partial right parse, 306 of relations, 7
Path, 39, 51 Production, 85, 100
Pattern recognition, 79-82 Production language, 443
Paul, M., 455 Proof, 19-21, 43
Paull, M. C., 166 Proof by induction, 20-21, 43
Pav!idis, T., 82 Proper grammar, 150
PDA (see Pushdown automaton) Propositional calculus, 22-23, 35
PDT (see Pushdown transducer)
Pumping lemma
Perles, M., 211
for context-free languages, 195-196
Perlis, A. J., 58
Petrick, S. R., 314 (see also Ogden's lemma)
Pfaltz, J. L., 82 for regular sets, 128-129
Phrase, 486 Pushdown automaton, 167-192, 201,282
Pitts, E., 103 (see also Deterministic pushdown
PL/I, 501 automaton)
PL360, 507-511 Pushdown symbol, 168
Poage, J. F., 505 Pushdown transducer, 227-233, 237,265-
Polish notation (see Prefix expression, 268, 282-285 (see also Determin-
Postfix expression) istic pushdown transducer)
Polonsky, I. P., 505
Porter, J. H., 263
Position (in a string), 193
Q
Post, E. L., 29, 36
Postfix expression, 214-215, 217-218, Question (see Problem)
229, 512
Postorder (of a tree), 43
Post's correspondence problem, 32-36,
199-201
Post system, 29 Rabin, M. O., 103, 124
Power set, 5, 12 Randell, B., 76
Prather, R. E., 138 Range, 6, 10
Precedence (of operators), 65, 233-234 Recognizer, 93-96, 103 (see also Finite
Precedence conflict, 419-420 automaton, Linear bounded auto-
Precedence grammar/language, 399-400, maton, Parsing machine, Push-
403-404 (see also Extended prece- down automaton, Turing machine,
dence grammar, Mixed-strategy Two-stack parser)
precedence grammar, Operator Recursive function, 28
precedence grammar, Simple pre- Recursive grammar, 153, 163
cedence grammar, T-canonical Recursively enumerable set, 28, 34, 92,
precedence grammar, Weak prece- 97, 500
dence grammar) Recursive set, 28, 34, 99
Predecessor, 37 Reduced finite automaton, 125-128
Predicate, 2 Reflexive relation, 6
Predictive parsing algorithm, 338-348, Reflexive-transitive closure (see Closure,
351-356 reflexive and transitive)
This second volume of THE THEORY OF PARSING, TRANSLATION, AND COMPILING completes the definitive work in the field of compiling theory.
PARSING, TRANSLATION,
AND COMPILING
ALFRED V. AHO
JEFFREY D. ULLMAN
PRENTICE-HALL, INC.
ISBN: 0-13-914564-8
Library of Congress Catalog Card No. 72-1073
CONTENTS
PREFACE vii
10 BOOKKEEPING 788
11 CODE OPTIMIZATION 844
INDEX TO VOLUMES I AND II 989
7 TECHNIQUES FOR PARSER OPTIMIZATION
A matrix whose entries are either −1, 0, +1, or "blank" will be called a precedence matrix. There are obvious applications for precedence matrices; for a precedence grammar we can identify
−1 with <·
0 with ≐
+1 with ·>
blank with error
In this section we shall show how a precedence matrix can often be concisely represented by a pair of vectors called linear precedence functions.
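In other words, the pair (f, g) replaces an n × n table by 2n numbers, at the cost of losing the blank (error) entries. The small sketch below shows the representation; the data anticipate the vectors of Example 7.1 below, and the function name is invented for illustration.

```python
# Representing a precedence matrix by two vectors f and g: the relation between
# X and Y is recovered by comparing f[X] with g[Y].  The values are those of
# Example 7.1 below.
f = {"S": 1, "a": 0, "b": 0, "c": 2, "$": 0}
g = {"S": 0, "a": 1, "b": 1, "c": 1, "$": 0}

def relation(x, y):
    """Return the precedence relation encoded by the linear precedence functions."""
    if f[x] < g[y]:
        return "<."   # shift-type relation (-1 entry)
    if f[x] > g[y]:
        return ".>"   # reduce-type relation (+1 entry)
    return "="        # 0 entry; blank entries cannot be recovered

print(relation("c", "$"))   # '.>' since f[c] = 2 > g[$] = 0
```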
Example 7.1
Consider the simple precedence grammar G with productions

S → aSc | bSc | c

Fig. 7.1 Wirth-Weber precedence relations for G (over the symbols S, a, b, c, and $).

We convert this matrix of precedence relations into a precedence matrix by replacing

<· by −1
≐ by 0
·> by +1

and leaving blank entries unchanged. We can then represent this precedence matrix by the linear precedence functions

f = (1, 0, 0, 2, 0)
g = (0, 1, 1, 1, 0)

Fig. 7.2 Precedence matrix M (rows and columns indexed 1 through 5 by S, a, b, c, and $).
We can easily verify that these are linear precedence functions for M. For example, f₄ = 2 and g₅ = 0. Thus, since f₄ > g₅, f and g faithfully represent the +1 entry M₄₅.
The entry M₄₁ in the precedence matrix is blank. However, f₄ = 2 and g₁ = 0. Thus, if we use f and g to represent M, we would reconstruct M₄₁ as +1 (since f₄ > g₁). Likewise, the blank entries M₁₁, M₁₅, M₄₂, and M₄₃ would all be represented by +1's, and M₁₂, M₁₃, M₂₅, M₃₅, M₅₁, and M₅₅ would be represented by 0's.
would be represented by O's.
The blank entries in the original precedence matrix represent error con-
ditions. Thus, if we use linear precedence functions to represent the prece-
dence relations in this fashion, we shall lose the ability to detect an error
when none of the three precedence relations holds. However, this error will
eventually be caught by attempting a reduction and discovering that there is
no production whose right side is on top of the pushdown list. Nevertheless,
this delay in error detection could be an unacceptable price to pay for the
convenience of using precedence functions in place of precedence matrices,
depending on how important early error detection is in the particular com-
piler involved. E]]
Example 7.2
We can overcome much of this loss of timely error detection by implementing a shift-reduce parsing algorithm for a precedence grammar in which we associate both the precedence relations <· and ≐ with shift and ·> with reduce. Moreover, for the shift-reduce parsing action function we need only the precedence relations from N ∪ Σ ∪ {$} to Σ ∪ {$}. For example, we can associate <· and ≐ with −1 and ·> with +1 and obtain the precedence matrix M′ in Fig. 7.3 from Fig. 7.1.

Fig. 7.3 Reduced precedence matrix M′ (rows S, a, b, c, $; four columns indexed by a, b, c, and $).

The blank entries represent error conditions. We can find linear precedence functions f and g for M′. These linear precedence functions have the advantage that they reproduce the blank entries M₁₄, M₂₄, M₃₄, and M₅₄ as 0 (since f₁ = f₂ = f₃ = f₅ = g₄). We can thus use 0 to denote an error condition and in this way preserve error detection that was present in the original matrix M′. We shall consider this problem in greater detail in Section 7.1.3.
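The parsing decision described in Example 7.2 then reduces to a single comparison per step. A minimal sketch, with invented names, under the convention that equality signals an error:

```python
# Shift-reduce decision driven by weak precedence functions, as in Example 7.2:
# f[X] < g[a] means shift, f[X] > g[a] means reduce, and f[X] == g[a] is kept
# free to signal an error.  The function name and arguments are illustrative.
def decide(f, g, stack_top, lookahead):
    if f[stack_top] < g[lookahead]:
        return "shift"
    if f[stack_top] > g[lookahead]:
        return "reduce"
    return "error"
```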
In step (4) of Algorithm 7.1 we can use the following general technique to determine whether a directed graph G is cyclic or acyclic:
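The technique itself is not reproduced at this point in the text, but Exercise 7.1.5 hints at a coloring scheme that runs in time proportional to the number of nodes plus edges. A standard depth-first formulation of the same test, offered here as a stand-in rather than as the book's own procedure, is:

```python
def is_acyclic(graph):
    """graph maps each node to a list of successors; returns True iff there is no cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    color = {v: WHITE for v in graph}

    def visit(v):
        color[v] = GRAY
        for w in graph.get(v, []):
            c = color.get(w, WHITE)
            if c == GRAY:                 # back edge: a cycle exists
                return False
            if c == WHITE and not visit(w):
                return False
        color[v] = BLACK
        return True

    return all(visit(v) for v in graph if color[v] == WHITE)
```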
Example 7.3
Consider the precedence matrix M of Fig. 7.4.
Fig. 7.4 Precedence matrix (a 5 × 5 matrix with entries −1, 0, +1, and blank).
THEOREM 7.1
A precedence matrix has linear precedence functions if and only if its
linearization graph is acyclic.
Proof.
If: We first note that Algorithm 7.1 emits f and g only if the linearization graph is acyclic. It suffices to show that if f and g are computed by Algorithm 7.1, then
(1) Mᵢⱼ = 0 implies that fᵢ = gⱼ,
(2) Mᵢⱼ = +1 implies that fᵢ > gⱼ, and
(3) Mᵢⱼ = −1 implies that fᵢ < gⱼ.
Assertion (1) is immediate from step (2) of Algorithm 7.1. To prove assertion (2), we note that if Mᵢⱼ = +1, then edge (F̂ᵢ, Ĝⱼ) is added to the linearization graph. Hence, fᵢ > gⱼ, since the length of a longest path to a leaf from node F̂ᵢ must be at least one more than the length of a longest path from Ĝⱼ if the linearization graph is acyclic. Assertion (3) follows similarly.
Only if: Suppose that a precedence matrix M has linear precedence functions f and g but that the linearization graph for M has a cycle consisting of the sequence of nodes N₁, N₂, …, Nₖ, Nₖ₊₁, where Nₖ₊₁ = N₁ and k > 1. Then by step (3), for all i, 1 ≤ i ≤ k, we can find nodes Hᵢ and Iᵢ₊₁ such that
(1) Hᵢ and Iᵢ₊₁ are original F's and G's;
(2) Hᵢ and Iᵢ₊₁ are represented by Nᵢ and Nᵢ₊₁, respectively; and
(3) either Hᵢ is Fₘ, Iᵢ₊₁ is Gₙ, and Mₘₙ = +1, or Hᵢ is Gₘ, Iᵢ₊₁ is Fₙ, and Mₙₘ = −1.
We observe by rule (2) that if nodes Fₘ and Gₙ are represented by the same Nₜ, then fₘ must equal gₙ if f and g are to be linearizing functions for M. Let f and g be the supposed linearizing functions for M. Let hᵢ be fₘ if Hᵢ is Fₘ and let hᵢ be gₘ if Hᵢ is Gₘ. Let h′ᵢ be fₘ if Iᵢ is Fₘ and let h′ᵢ be gₘ if Iᵢ is Gₘ. Then h₁ > h′₂ = h₂ > h′₃ = ⋯ = hₖ > h′ₖ₊₁.
But since Nₖ₊₁ is N₁, we have h′ₖ₊₁ = h₁. However, we just showed that h₁ > h′ₖ₊₁. Thus, a precedence matrix with a cyclic linearization graph cannot have linear precedence functions. □
COROLLARY
We can try to find precedence functions for any matrix whose entries
have at most three values. The applicability of this technique is not affected
by what the entries represent. To illustrate this point, in this section we shall
show how precedence functions can be applied to represent operator pre-
cedence relations.
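The proof of Theorem 7.1 indicates how the function values are obtained: F- and G-nodes are merged along 0 entries, an edge runs from the larger side to the smaller for each +1 and −1 entry, and each value is the length of a longest path leaving the corresponding merged node. The following sketch follows that reading; it is a reconstruction for illustration, not a transcription of Algorithm 7.1 itself, and the names used are invented.

```python
# A reconstruction of the computation behind Algorithm 7.1 / Theorem 7.1.
# M maps (i, j) to -1, 0, or +1 for rows 1..n and columns 1..m; blanks are absent.
def linear_precedence_functions(M, n, m):
    parent = {("F", i): ("F", i) for i in range(1, n + 1)}
    parent.update({("G", j): ("G", j) for j in range(1, m + 1)})

    def find(x):                       # union-find without path compression
        while parent[x] != x:
            x = parent[x]
        return x

    for (i, j), v in M.items():        # merge F_i and G_j on 0 entries
        if v == 0:
            parent[find(("F", i))] = find(("G", j))

    succ = {find(x): set() for x in parent}
    for (i, j), v in M.items():        # add an edge for each +1 / -1 entry
        fi, gj = find(("F", i)), find(("G", j))
        if v == +1:
            succ[fi].add(gj)           # f_i must exceed g_j
        elif v == -1:
            succ[gj].add(fi)           # g_j must exceed f_i

    memo = {}
    def longest(v, on_path=frozenset()):
        if v in on_path:
            raise ValueError("linearization graph is cyclic")
        if v not in memo:
            memo[v] = max((1 + longest(w, on_path | {v}) for w in succ[v]), default=0)
        return memo[v]

    try:
        f = [longest(find(("F", i))) for i in range(1, n + 1)]
        g = [longest(find(("G", j))) for j in range(1, m + 1)]
    except ValueError:
        return None                    # no linear precedence functions exist
    return f, g
```

On the matrix of Example 7.1 this reconstruction yields values satisfying the three assertions in the proof above, though not necessarily the exact vectors printed there.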
Example 7.4
Consider our favorite grammar G₀ with productions

E → E + T | T
T → T * F | F
F → (E) | a

Fig. 7.7 Reduced precedence matrix M′ (the operator precedence relations for G₀ over $, (, +, *, a, and ), written with entries −1, 0, and +1).

Here the operator precedence relations <·, ≐, and ·> are treated as −1, 0, and +1, respectively. If linear precedence functions are found, then the precedence relation between X and Y is determined by applying the first function to X and the second to Y. In this case, all pairs X and Y will have some precedence relation between them, so error detection is delayed until either the end of the input is reached or an impossible reduction is called for.
However, the linear precedence function technique can be applied to
the representation of shift-reduce parsing decisions with an opportunity of
retaining some of the error-checking capability present in the blank entries
of the original matrix of precedence relations. Let us define a weak precedence matrix as an m × n matrix M whose entries are −1, +1, and blank. The −1 entries generally denote shifts, the +1 entries reductions, and the blank entries errors. Such a matrix can be used to describe the shift-reduce function of a shift-reduce parsing algorithm for a weak precedence grammar, a (1, 1)-precedence grammar, or a simple mixed strategy precedence grammar. We say that vectors f and g are weak precedence functions for a weak precedence matrix M if fᵢ < gⱼ whenever Mᵢⱼ = −1 and fᵢ > gⱼ whenever Mᵢⱼ = +1.
The condition fᵢ = gⱼ can then be used to denote an error condition, represented by a blank entry Mᵢⱼ. In general, we may not always be able to have fᵢ = gⱼ wherever Mᵢⱼ is blank, but we would like to retain as much of the error-detecting capability of the original matrix as possible.
Thus, we can view the problem of finding weak precedence functions for a weak precedence matrix M as one of finding functions which will produce as many 0's for the critical blank entries of M as possible. We choose not to fill in all blanks of the weak precedence matrix with 0's immediately, since this would restrict the number of useful weak precedence matrices that have weak precedence functions. Some blank entries may have to be changed to −1 or +1 in order for weak precedence functions to exist (Exercise 7.1.9). In addition, some blank entries may never be consulted by the parser, so these entries need not be represented by 0's.
The concept of independent nodes in a directed acyclic graph is of importance here. We say that two nodes N₁ and N₂ of a directed acyclic graph are independent if there is no path from N₁ to N₂ or from N₂ to N₁.
We could use Algorithm 7.1 directly to produce weak precedence functions for a weak precedence matrix M, but this algorithm as given did not attempt to maximize the number of 0's produced for blank entries. However, we shall use the first three steps of Algorithm 7.1 to produce a linearization graph for M.
From Theorem 7.1 we know that M has weak precedence functions if and only if the linearization graph for M is acyclic. The independent nodes of the linearization graph determine which blank entries of M can be preserved. That is, we can have fᵢ = gⱼ if and only if Fᵢ and Gⱼ are independent nodes. Of course, if we choose to have fᵢ = gⱼ, then there may be other pairs of independent nodes whose corresponding numbers cannot be made equal.
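Independence of two nodes is simply mutual unreachability, so it can be checked with two searches. A small sketch (the names are invented):

```python
def reachable(graph, src, dst):
    """True if there is a path from src to dst in the directed graph."""
    seen, frontier = set(), [src]
    while frontier:
        v = frontier.pop()
        if v == dst:
            return True
        if v not in seen:
            seen.add(v)
            frontier.extend(graph.get(v, []))
    return False

def independent(graph, n1, n2):
    """Two nodes of a DAG are independent if neither reaches the other."""
    return not reachable(graph, n1, n2) and not reachable(graph, n2, n1)
```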
Example 7.5
The matrix of Wirth-Weber precedence relations for the grammar G₀ is shown in Fig. 7.9. The columns corresponding to nonterminals have been deleted, since we shall use this matrix only for shift-reduce decisions. The corresponding reduced weak precedence matrix is shown in Fig. 7.10, and the linearization graph that results from this reduced matrix is shown in Fig. 7.11. In this graph the nodes labeled F₁ and G₄ are independent. Also, G₂ and G₄ are independent, but F₁ and G₃ are not. □

Fig. 7.9 Precedence relations for G₀ (columns a, (, ), +, *, and $).

Fig. 7.10 Reduced weak precedence matrix (four columns; one row each for E; for T; for the merged symbols F, a, ); and for the merged symbols (, +, *, $).
We partition the nodes of the linearization graph into clusters of independent nodes such that the total number of F-G pairs together in a cluster is as large as possible and no one cluster contains both descendants and ancestors of another cluster. In general there may be many different sets of clusters possible, and certain F-G pairs may be more desirable than others. This part of the process may well be a large combinatorial problem.
However, once we have partitioned the graph into a set of clusters, we can then find a linear order < on the clusters such that, for clusters C and C′, C < C′ if C contains a node that is a descendant of a node in C′. If C₀, C₁, …, Cₖ is the sequence of clusters in this linear order, we then assign 0 to all nodes in C₀, 1 to all nodes in C₁, and so forth.
Example 7.6
Consider the linearization graph for G₀ shown in Fig. 7.11. The set {F₁, F₄, G₄} is an example of a cluster of independent nodes, and so are {F₄, G₂, G₄}, {F₂, G₁}, {G₁, G₃}, and {F₃, G₁}. However, the cluster {G₁, G₃} is not desirable, since both nodes in this cluster are labeled by G's and thus it would not produce a 0 entry in the weak precedence matrix. The cluster {F₃, G₁} might be more desirable than the cluster {F₂, G₁}, since f₃ = g₁ will detect errors whenever aa, a(, )a, or )( appear in an input string, while f₂ = g₁ will detect errors only for the pairs Ta and T(. Also, note that if we detect an error whenever aa appears, we shall not be able to reduce a to F, so the adjacencies Fa and Ta would never occur.
Thus, one possible clustering of nodes is {F₁}, {F₄, G₂, G₄}, {F₂}, {G₃}, {F₃, G₁}. Taking the linear order on clusters to be the left-to-right order
DEFINITION
Let 𝒜₁ = (f₁, g₁)† and 𝒜₂ = (f₂, g₂) be two shift-reduce parsing algorithms for a context-free grammar G = (N, Σ, P, S). We say that 𝒜₁ and 𝒜₂ are exactly equivalent if their behavior on each input string is identical: that is, if an input string w is in L(G), then both parsing algorithms accept w. If w is not in L(G), then both parsers announce error after the same number of steps and in the same phase. If in the reduction phase, an error relation is found after scanning an equal number of symbols down the stack.
We shall determine which blank entries in the canonical matrix of Wirth-Weber precedence relations can be changed without affecting the parser.
Suppose that both algorithms reach configuration Q = (X₁ ⋯ Xₘ, a₁ ⋯ aᵣ, π) and we find Q ⊢_{𝒜_c} error; then Q ⊢_𝒜 error, and, moreover, the mechanism of error detection is the same in both 𝒜 and 𝒜_c.
Let us first assume that in configuration Q, f_c(Xₘ, a₁) = error but f(Xₘ, a₁) ≠ error. We shall show that a contradiction arises. Thus, suppose that Xₘ ?_c a₁ but that Xₘ ? a₁ does not hold. By condition (2), Xₘ must be a nonterminal. By condition (3), for all Y such that Xₘ → αY is in P, Y ·>_c a₁ is false.
Examination of the precedence parsing algorithm indicates that the only way for a nonterminal to be on top of the stack is for the previous move to have been a reduction. Then there is some production Xₘ → αY in P such that the move of both parsers before configuration Q was entered is (X₁ ⋯ Xₘ₋₁αY, a₁ ⋯ aᵣ, π′) ⊢ Q. But this implies that Y ·>_c a₁, a contradiction.
The other possibility is that in configuration Q, g_c(X₁ ⋯ Xₘ, e) = error but g(X₁ ⋯ Xₘ, e) ≠ error. The only case that needs to be considered here is that in which Xₘ ·>_c a₁, Xₘ ·> a₁, and there is some s such that Xₛ ?_c Xₛ₊₁, while Xᵢ ≐_c Xᵢ₊₁ and Xᵢ ≐ Xᵢ₊₁ for s < i < m, but the relation Xₛ ? Xₛ₊₁ does not hold. We claim that Xₛ₊₁ must be a nonterminal, because the only way 𝒜_c could place a terminal above Xₛ on the stack is if Xₛ <·_c Xₛ₊₁ or
By condition (4), we cannot have Xₛ <· Y if Xₛ₊₁ → Yα is in P. But the only way that Xₛ₊₁ could appear next to Xₛ on the stack is for a reduction of some Yα to Xₛ₊₁ to occur. That is, there must be some configuration (X₁ ⋯ Xₛ Yα, b₁ ⋯ bₖ, π″) leading to Q such that
Example 7.7
Consider the following simple precedence grammar G′:

E → E + A | A
A → T
T → T * F | F
F → (B | a
B → E)

(The matrix of precedence relations for G′ over E, A, T, F, B, a, (, ), +, *, and $.)
Example 7.8
Using conditions (1)-(3) of Theorem 7.2 on the weak precedence relations for G₀ shown in Fig. 7.9 (p. 553), we find that all the blanks in the last six rows are essential. The only other essential blank is (E, $), since E → T is a production and T ·> $.
Examining the linearization graph of Fig. 7.11, we find that there are no precedence functions such that every essential blank is represented by 0.
†Recall that reductions do not depend on the precedence matrix in a weak precedence parser.
This would require, for example, that nodes F₄, G₂, G₃, and G₄ all be placed in one cluster.
At this point we might give up trying to use precedence functions to
implement the parser. However, we can consider using a slightly weaker
definition of equivalence between parsers.
Exact equivalence is very stringent. In practical situations we would be
willing to say that two shift-reduce parsing algorithms are equivalent if they
either both accept the same input strings or both announce error at the same
position on erroneous input strings. Thus, one parser could announce error
while the other made several reductions (but no shift moves) before announc-
ing error. Under this definition, which we shall call simply equivalence, we
can modify precedence relations even more drastically but still preserve
equivalence. (See Exercise 7.1.13.)
With this weaker definition of equivalence we can show that a shift-reduce parsing algorithm using the precedence functions given below for E, T, F, a, (, ), +, *, and $ is equivalent to the canonical weak precedence parser for G₀.

(Table of the linear precedence function values f and g for E, T, F, a, (, ), +, *, and $.)
EXERCISES
7.1.1. Find linear weak precedence functions for the following grammars or prove that none exist:
(a) S → SA | A
    A → (S) | ( )
(b) E → E + T | +T | T
    T → T * F | F
    F → (E) | a
7.1.2. Show that if M ' is a matrix formed from M by permuting some rows
and/or columns, then the vectors f and g produced by Algorithm 7.1
for M ' will be permutations of those produced for M.
7.1.3. Find linear precedence functions for the matrix of Fig. 7.14.
7.1.4. Find an algorithm to determine whether a matrix has linear precedence
functions f and g such that f = g.
Fig. 7.14 Matrix (a 6 × 6 matrix with entries −1, 0, +1, and blank).
"7.1.5. (a) Show that the technique given after Algorithm 7.1 for determining
whether a directed graph is acyclic actually works.
(b) Show that this technique can be implemented to work in time
0(n) + 0(e), where n is the number of nodes and e is the number
of edges in the given graph. H i n t : Choose a node in the graph.
Color all nodes on a path extending from this node until either a
leaf or previously colored node is encountered. If a leaf is found,
remove it, back up to its immediate ancestor, and then continue the
coloring process.
7.1.6. (a) Show that the labeling technique given after Algorithm 7.1 will
find the length of a longest path beginning at each node.
(b) Show that this technique can be implemented in time 0(n) + 0 (e),
where n is the number of nodes and e the number of edges in the
graph.
7.1.7. Give an algorithm which takes a matrix M with entries −1, 0, +1, and blank and a constant k and determines whether there exist vectors f and g such that
(1) If Mᵢⱼ = −1, then fᵢ + k < gⱼ;
(2) If Mᵢⱼ = 0, then |fᵢ − gⱼ| ≤ k;
(3) If Mᵢⱼ = +1, then fᵢ > gⱼ + k.
DEFINITION
Let M be a weak precedence matrix. We say a sequence of integers i₁, i₂, …, iₖ, where k is even and greater than 3, is a cycle of M if
(1) M_{i_j i_{j+1}} = −1 for odd j, M_{i_{j+1} i_j} = +1 for even j, and M_{i₁ iₖ} = +1, or
(2) M_{i_j i_{j+1}} = +1 for odd j, M_{i_{j+1} i_j} = −1 for even j, and M_{i₁ iₖ} = −1.
7.1.8. Show that there exist weak precedence functions for a weak precedence matrix M if and only if M contains no cycle.
7.1.9. Let M be a weak precedence matrix and let i, j, k, and l be indices such that either
(1) M_{ik} = M_{jl} = −1, M_{jk} = +1, and M_{il} is blank, or
(2) M_{ik} = M_{jl} = +1, M_{jk} = −1, and M_{il} is blank.
Let M′ be M with M_{il} replaced by −1 in case (1) and by +1 in case (2). Show that f and g are weak precedence functions for M if and only if f and g are weak precedence functions for M′.
DEFINITION
We say that two rows (columns) of a precedence matrix are com-
patible if whenever they differ one is blank. We can merge compatible
rows (columns) by replacing them by a single row ( c o l u m n ) w h i c h
agrees with all their nonblank entries.
7.1.10. Show that the operations of row and column merger preserve the property of not having linearizing functions.
We can also use linear precedence functions to represent the <· and ≐ relations used by the reduce function in the shift-reduce parsing algorithm constructed by Algorithm 5.12. First, we construct a weak precedence matrix M in which −1 represents <·, +1 represents ≐, and blanks represent both ·> and error. We then attempt to find linear precedence functions for M, again attempting to represent as many blanks as possible by 0's.
7.1.11. Represent the <· and ≐ relations of Fig. 7.13 with linear precedence functions. Use Theorem 7.2 to locate the essential blanks and attempt to preserve these blanks.
"7.1.12. Show that under the definition of exact equivalence for weak prece-
dence parsers a blank entry (X, Y) of the matrix of Wirth-Weber
precedence relations is an essential blank if and only if one of the fol-
lowing conditions holds"
(1) X and Y are in ~ U {$}; or
(2) X is in N, Y is in X U {$}, and there is a production X ~ czZ
such that Z 3>c Y.
In the following problems, "equivalent" is used in the sense of
Example 7.8.
*7.1.13. Let 𝒜_c and 𝒜 be shift-reduce parsing algorithms for a simple precedence grammar as in Theorem 7.2. Prove that 𝒜_c is equivalent to 𝒜 if and only if the following conditions are satisfied:
(1) (a) If X <·_c Y, then X <· Y.
    (b) If X ≐_c Y, then X ≐ Y.
    (c) If X ·>_c a, then X ·> a.
(2) If b ?_c a, then b <· a is false.
(3) If A ?_c a and A <· a or A ≐ a, then there is no derivation A ⇒_rm α₁X₁ ⇒_rm ⋯ ⇒_rm αₘXₘ, m ≥ 1, such that for 1 ≤ i < m, Xᵢ ?_c a and Xᵢ ·> a, and Xₘ ·>_c a, or Xₘ is a terminal and Xₘ ·> a.
(4) If A₁ <· a or A₁ ≐ a for some a, then there does not exist a derivation A₁ ⇒ A₂ ⇒ ⋯ ⇒ Aₘ ⇒ Bα, m ≥ 1, a symbol X, and a production B → Yβ such that
    (a) X ?_c Aᵢ but X <· Aᵢ, for 2 ≤ i ≤ m;
    (b) X ?_c B but X <· B; and
    (c) X <· Y.
7.1.14. Show that the parser using the precedence functions of Example 7.8
is equivalent to the canonical precedence parser for Go.
7.1.15. Let M be a matrix of precedence relations constructed from M_c, the matrix of canonical Wirth-Weber precedence relations, by replacing some blank entries by ·>. Show that the parsers constructed from M and M_c by Algorithm 5.12 (or 5.14) are equivalent.
"7.1.16. Consider a shift-reduce parsing algorithm for a simple precedence
grammar in which after each reduction a check is made to determine
whether the < or ~ relation holds between the symbol that was
immediately to the left of the handle and the nonterminal to Which the
handle is reduced. Under what conditions will an arbitrary matrix of
precedence relations yield a parser that is exactly equivalent (or equiv-
alent) to the parser of this form constructed from the canonical Wirth-
Weber precedence relations ?
*'7.1.17. Show that every CFL has a precedence grammar (not necessarily
uniquely invertible) for which linear precedence functions can be found.
Research Problems
7.1.18. Give an efficient algorithm to find linear precedence functions for a weak precedence grammar G that yields a parser which is equivalent to the canonical precedence parser for G.
7.1.19. Devise good error recovery routines to be used in conjunction with precedence functions.
Programming Exercises
7.1.20. Construct a program that implements Algorithm 7.1.
7.1.21. Write a program that implements a shift-reduce parsing algorithm using linear precedence functions to implement the f and g functions.
7.1.22. Write a program that determines whether a CFG is a precedence grammar that has linear precedence functions.
7.1.23. Write a program that takes as input a simple precedence grammar G that has linear precedence functions and constructs for G a shift-reduce parser utilizing the precedence functions.
BIBLIOGRAPHIC NOTES
Floyd [1963] used linear precedence functions to represent the matrix of operator
precedence relations. Wirth and Weber [1966] suggested their use for representing
Wirth-Weber precedence relations. Algorithms to compute linear precedence
functions have been given by Floyd [1963], Wirth [1965], Bell [1969], Martin [1972],
and Aho and Ullman [1972a].
Theorem 7.2 is from Aho and Ullman [1972b], which also contains answers to
Exercises 7.1.13 and 7.1.15. Exercise 7.1.17 is from Martin [1972].
7.2. OPTIMIZATION OF FLOYD-EVANS PARSERS

For weak precedence grammars the precedence relations <· and ≐ indicate shift and ·> indicates a reduction.
The E-row of Fig. 7.9 generates the statements

(7.2.1) (Four statements: shift ) and go to S); shift + and go to S+; accept when the pushdown list contains $E and the input is $; otherwise report error.)
The first statement states that if E is on top of the pushdown list and the
current input symbol is ), then we shift ) onto the pushdown list, read the
next input symbol, and go to the statement labeled S). If this statement does
not apply, we see whether the current input symbol is + . If the second
statement does not apply, we next see if the current input symbol is $. The
relevant action here would be to go into the halting state accept if the
pushdown list contained $E. Otherwise, we report error. Note that no reduc-
tions are possible if E is the top stack symbol.
Since the first component of the label indicates the symbol on top of
the pushdown list, we can in many cases avoid unnecessary checking of
the top symbol of the pushdown list if we know what it is. Knowing that
E is on top of the pushdown list, we could replace the statements (7.2.1) by
SE:   | )  → ) |   •   S)
      | +  → + |   •   S+
    $#| $              accept
      |                error
Notice that it is important that the error statement appear last. When E is
on top of the pushdown list, the current input symbol must be ), + , or $.
Otherwise we have an error. By ordering the statements accordingly, we can
first check for ), then for + , and then for $, and if none of these is the cur-
rent input symbol, we report error.
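Statement groups of this kind are easy to represent as data: each statement carries an optional stack test, an optional input test, an action, and a successor label. The sketch below encodes the SE group above in that spirit; the record layout and the driver function are assumptions made for the example, not the production-language implementation discussed in the text.

```python
# One Floyd-Evans-style statement group: the statements are tried in order, the
# error statement last.  The encoding is an illustrative assumption.
SE_GROUP = [
    {"input": ")", "shift": True,  "goto": "S)"},
    {"input": "+", "shift": True,  "goto": "S+"},
    {"input": "$", "stack": ["$", "E"], "accept": True},
    {"error": True},
]

def run_group(group, stack, tokens, pos):
    """Execute the first applicable statement; return (next label, input position)."""
    for stmt in group:
        if stmt.get("error"):
            raise SyntaxError("error at position %d" % pos)
        if "input" in stmt and tokens[pos] != stmt["input"]:
            continue
        if "stack" in stmt and stack != stmt["stack"]:
            continue
        if stmt.get("accept"):
            return ("accept", pos)
        if stmt.get("shift"):
            stack.append(tokens[pos])   # shift the input symbol onto the pushdown list
            return (stmt["goto"], pos + 1)
    return (None, pos)
```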
The row for T in Fig. 7.9 generates the statements

        ST:  | *   → * |   •   S*
        RT:  E+T|  → E |           CT
             T|    → E |           CT
(7.2.2) CT:  | )                   SE
             | +                   SE
             | $                   SE
             |                     error
Here the precedence relation T ≐ * generates the first statement. The precedence relations T ·> ), T ·> +, and T ·> $ indicate that with T on top of the pushdown list we are to reduce. Since we are dealing with a weak
precedence grammar, we always reduce using the longest applicable pro-
duction, by Lemma 5.4. Thus, we first look to see if E + T appears on top
of the pushdown list. If so, we replace E + T by E. Otherwise, we reduce
T to E. When the two R T statements are applicable, we know that T is on
top of the pushdown list. Thus we could use
RT:  E+#| → E |           CT
       #| → E |           CT
and again avoid the unnecessary checking of the top symbol on the push-
down list.
After we perform the reduction, we check to see whether it was legal.
That is, we check to see whether the current input symbol is either ), + ,
or $. The group of checking statements labeled CT is used for this purpose.
We report error if the current input symbol is not ), + , or $. Reducing first
and then checking to see if we should have made a reduction may not always
be desirable, but by performing these actions in this order we shall be able
to merge common checking operations.
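Trying the longest right side first is exactly what the ordering of the RT statements accomplishes. As a sketch of the same idea (the data layout is an assumption for the example):

```python
# Reduce using the longest applicable right side first, as the RT group does:
# try E + T before the single-symbol right side T.  Right sides are listed
# longest first; each entry pairs a right side with the nonterminal it yields.
REDUCTIONS_FOR_T = [
    (["E", "+", "T"], "E"),
    (["T"], "E"),
]

def reduce_top(stack, reductions):
    """Replace the longest matching right side on top of the stack."""
    for rhs, lhs in reductions:
        if stack[-len(rhs):] == rhs:
            del stack[-len(rhs):]
            stack.append(lhs)
            return lhs
    return None

stack = ["$", "E", "+", "T"]
reduce_top(stack, REDUCTIONS_FOR_T)   # stack becomes ['$', 'E']
```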
To implement this checking, we shall introduce a computed goto state-
ment of the form
G:   #|                   S#
indicating that the top symbol of the pushdown list is to become the last
symbol of the label.
Now we can replace the checking statements in (7.2.2) by the following sequence of statements:

CT:  | )                  G
     | +                  G
     | $                  G
     |                    error
G:   #|                   S#
We shall then be able to use these checking statements in other sequences. For example, if a reduction in G₀ is accomplished with T on top of the stack, the new top of the stack must be E. Thus, the statements in the CT group could all transfer to SE. However, in general, reductions to several different nonterminals could be made, and the computed goto is quite useful in establishing the new top of the stack.
      | aⱼ → aⱼ|   •   Saₖ
RXᵢ:  α₁#| → A₁|  emit p₁   CXᵢ
      α₂#| → A₂|  emit p₂   CXᵢ
      ⋯
CXᵢ:  | b₁                  G
      | b₂                  G
      ⋯
      | bₗ                  G
      |                     error
If j is zero, then the first statement of the RXᵢ group also has label SXᵢ. If k is zero, the error statement in the RXᵢ group has label RXᵢ. If Xᵢ is the start symbol, then we do as above and also add the statement

    $#| $               accept
G:    #|                S#  □
Example 7.9
Consider the grammar Go. From the F row of the precedence matrix
we would get the following statements:
Note that the third statement is useless, as the second statement will always
produce a successful match. We could, of course, incorporate a test into
Algorithm 7.2 which would cause useless statements not to be produced.
From now on we shall assume useless statements are not generated.
From the a row we get the following statements:
| +                   G
| *                   G
| $                   G
|                     error

Notice that the checking statements for a are identical to those for F.
†Note the use of multiple labels for a location. Here, the SF group is empty.
In the next section we shall outline an algorithm which will merge redun-
dant statements. In fact, the checking statements labeled CT could also be
merged with CF if we write
Our merging algorithm will also consider partial mergers of this nature.
The row labeled ( in the precedence matrix generates the statements
| ( → (|   •   S(
| a → a|   •   Sa
|              error

Similar statements are also generated by the rows labeled +, *, and $. □
We shall leave the verification of the fact that Algorithm 7.2 produces
a valid right parser for G for the Exercises.
We construct a merged shift matrix Ms from the matrix of precedence relations as follows:
(1) Delete all ·> entries and replace the ≐ entries by <·. (Since we care only about shifts, the <· and ≐ relations can be identified.)
(2) If two or more rows of the resulting matrix are identical, replace them by one row in Ms, with the new row identified with the set of symbols in N ∪ Σ ∪ {$} with which the original rows were associated.
(3) Delete all rows with no <· entries and call the resulting matrix Ms.
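The three steps just listed can be carried out mechanically on a dictionary representation of the relation matrix. The following sketch (the names and the representation are invented for the example) keeps only shift entries, drops rows without any, and merges rows whose shift entries coincide.

```python
# Build the merged shift matrix Ms from a matrix of precedence relations, following
# steps (1)-(3) above.  `relations` maps (row_symbol, col_symbol) to '<.', '=', or '.>'.
def merged_shift_matrix(relations):
    # Step (1): keep only shift entries, identifying the '=' relation with '<.'.
    shift_cols = {}
    for (row, col), rel in relations.items():
        if rel in ("<.", "="):
            shift_cols.setdefault(row, set()).add(col)
    # Step (3): rows with no shift entries never enter shift_cols, so they vanish.
    # Step (2): merge rows with identical shift entries, labelling the merged row
    # with the set of original row symbols.
    merged = {}
    for row, cols in shift_cols.items():
        merged.setdefault(frozenset(cols), set()).add(row)
    return {frozenset(rows): cols for cols, rows in merged.items()}
```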
Example 7.10
The merged shift matrix for G₀ from Fig. 7.9 is shown in Fig. 7.15. This merged shift matrix is a concise representation of the situations in which the parser is to make a shift move.

Fig. 7.15 Merged shift matrix for G₀ (columns (, a, ), +, *, and $).
Example 7.11
Consider the shift matrix Ms given in Fig. 7.16. The shift graph associated with Ms is shown in Fig. 7.17. □

Fig. 7.16 Shift matrix Ms (rows Y₁ through Y₅; columns a₁ through a₆).
It should be clear that the shift graph is a directed acyclic graph with a single root, ∅. The number of shift statements generated by Algorithm 7.2 is equal to the number of shift (<· and ≐) entries in the precedence matrix M. Using the shift matrix Ms and merging rows with similar shift entries, we can reduce the number of shift statements that are required. The technique is to construct a minimum cost directed spanning tree (subset of the edges which forms a tree, with all nodes included) for the shift graph, where the cost of a spanning tree is the sum of the labels of the edges in the tree.
A path from ∅ to Y to Z in the shift graph (A, R) has the following interpretation. The label l(∅, Y) gives the number of shift statements generated for row Y of Ms. Thus, the number of shift statements that would be generated for rows Y and Z is l(∅, Y) + l(∅, Z). However, if there is a path from ∅ to Y to Z in the graph, we can first generate the shift statements for row Y. To generate the shift statements for row Z, we can use the shift statements for row Y and precede them by those shift statements for row Z which are not already present. Thus, we would generate the following sequence of shift statements for rows Y and Z:

∅:   #|                 R#
The statements for the nodes can be placed in any order. However, if the statement

|                        L′

immediately precedes the statement labeled L′, then the former statement may be deleted. □
Example 7.12
The tree of Fig. 7.18 is a spanning tree for the shift graph of Fig. 7.17. The following sequence of statements could be generated from the tree of Fig. 7.18 by Algorithm 7.3. Of course, the sequence of the statements is not completely fixed by Algorithm 7.3, and other sequences are possible. By SYᵢ is meant the set of labels corresponding to row Yᵢ in the shift graph.
SY₄:  | a₁ → a₁|   •   Sa₁
      | a₆ → a₆|   •   Sa₆
SY₂:  | a₅ → a₅|   •   Sa₅
SY₅:  | a₃ → a₃|   •   Sa₃
      | a₆ → a₆|   •   Sa₆
SY₃:  | a₁ → a₁|   •   Sa₁
      | a₄ → a₄|   •   Sa₄
SY₁:  | a₂ → a₂|   •   Sa₂
∅:    #|               R#  □
THEOREM 7.3
Algorithm 7.3 produces a sequence of production language statements
which may replace the shift statements generated by Algorithm 7.2, with
no change in the parsing action of the program.
P r o o f We observe that when started at the sequence of statements gener-
ated by Algorithm 7.3 for node Y, the statements which may subsequently
be executed are precisely those generated for the ancestors of node Y. It is
straightforward to show that these statements test for the presence in the
lookahead position of exactly those symbols whose columns are covered
by row Y of the shift matrix.
The statement with label ~ ensures that if no shift is made, we transfer
to the proper R-group.
The spanning tree for a given shift graph which produces the fewest shift statements by Algorithm 7.3 is surprisingly easy to find. We observe that, neglecting the unconditional transfer statements, the number of statements generated by Algorithm 7.3 (all of which are of the form | a → a| • Sa for some a) is exactly the sum of the labels of the edges in the tree.
ALGORITHM 7.4
Minimum cost spanning tree from shift graph.
Input. Shift graph (A, R) for a precedence matrix M.
Output. Spanning tree (A, R′) such that Σ_{(X,Y)∈R′} l(X, Y) is minimal.
Method. For each node Y in A other than the root, choose a node X such that l(X, Y) is smallest among all edges entering Y. Add (X, Y) to R′. □
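Since exactly one edge enters each non-root node of any spanning tree, each node can simply keep its cheapest entering edge, which is all Algorithm 7.4 does. A direct sketch (names invented):

```python
# Minimum-cost directed spanning tree of a shift graph, in the sense of Algorithm 7.4:
# every node other than the root keeps its cheapest entering edge.  `edges` maps
# (X, Y) to the label l(X, Y); it is assumed every non-root node has an entering edge.
def min_cost_spanning_tree(nodes, edges, root):
    tree = set()
    for y in nodes:
        if y == root:
            continue
        entering = [(label, x) for (x, yy), label in edges.items() if yy == y]
        label, x = min(entering)          # cheapest edge entering y
        tree.add((x, y))
    return tree
```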
Example 7.13
The spanning tree in Fig. 7.18 is obtained from the shift graph of Fig. 7.17 using Algorithm 7.4. □
THEOREM 7.4
The number of shift statements generated by Algorithm 7.3 from a spanning tree is minimized for a given shift graph when the tree produced by Algorithm 7.4 is chosen.
Proof. Since every node except the root of (A, R′) has a unique direct ancestor, (A, R′) must be a tree. Since in every spanning tree of (A, R) one edge enters each node other than the root, the minimality of (A, R′) is immediate. □
Example 7.14
The Me matrix for G Ois
( a ) + •
< <
<
Y <
SEC. 7 . 2 OPTIMIZATION OF FLOYD-EVANS PARSERS 575
SE: ) - >) • S)
+ >+ .S+
$#t $ accept
fg
ST: * >* ,S,
fg: # R#
RT: E+# >E emit 1 CT
4; >E emit 2 CT
SF: RF: T,~ >T emit 3 CF
# >T emit 4 CF
Sa: Ra: # >F emit 6 Ca
s): R): (E# >F emit 5 C)
RE: R(: R+: R,: R$: error
CF: Ca: C):
CT: )
÷
$
error
G: # s#
tHere, this statement can be executed only with E on top of the stack. If it could be
executed otherwise, we would have to replace ~ by E (or in general, by the start symbol).
576 TECHNIQUES FOR PARSER OPTIMIZATION CHAP. 7
The penultimate statement plays the role of ~ for the checking state-
ments. D
la ~F I emit 6 Ca
EXERCISES
7.2.1. Use Algorithm 7.2 to generate Floyd-Evans parsers for the following
weak precedence grammars:
(a) s - - ~ S + Il I
I ~ (S) la(S)ta
(b) S - - , 0Sll01
(c) E ~ E + TI T
T--~, T , F I F
e~etele
P----, (E)Ia
7.2.2. Use the techniques of this section to improve the parsers constructed
in Exercise 7.2.1.
7.2.3. For the weak precedence matrix of Fig. 7.19, find the shift and reduce
matrices.
EXERCISES 577
Xl X2 X3 X4 X5 X6
I
X1 <" <.,'= _~ .> .>
, ,.
X4 "> ~_ .~ <.
X8 .
7.2.4. From the shift and reduce matrices constructed in Exercise 7.2.3, con-
struct the shift and reduce graphs.
7.2.5. Use Algorithm 7.3 to find shortest sequences of shift and checking state-
ments for the graphs of Exercise 7.2.4.
*7.2.6. Devise an algorithm to generate a deterministic left parser in production
language for an LL(1) grammar.
7.2.7. Using the algorithm developed in Exercise 7.2.6, construct a left parser
in production language for the following LL(1) grammar:
E-----~. TE"
E'- ~ + TE'ie
T - - - - ~ FT"
T'- ~ , FT'[e
F- x (E)la
*7.2.8. Use the techniques of this section to improve the parser constructed in
Exercise 7.2.7.
*7.2.9. Devise an algorithm to generate a deterministic right parser in pro-
duction language for an LR(1) grammar.
7.2.10. Using the algorithm developed in Exercise 7.2.9, construct right parsers
for Go and the grammar in Exercise 7.2.7.
"7.2.11. Use the techniques of this section to improve the parsers constructed in
Exercise 7.2.10. Compare the resulting parsers with those in Examples
5.47 and 7.14 and the one in Exercise 7.2.8.
"7.2.12. Is it possible to test a production language parser to determine if it is
a valid weak precedence parser for a given grammar ?
578 TECHNIQUES FOR PARSER OPTIMIZATION CHAP. 7
Research Problems
It should be evident that this section does not go very deeply into
its subject matter and that considerable further improvements can be
made in parsers implemented in production language. We therefore
suggest the following areas for further research.
7.2.14. Study the optimizations which are possible when various kinds of shift-
reduce algorithms are to be implemented in production language. In
particular, one might examine the algorithms used to parse LL(k), BRC,
extended precedence, simple mixed strategy precedence, and LR(k)
grammars.
7.2.15. Extend the parser optimizations given in this section by allowing post-
ponement of error detection, statement merger and/or other reasonable
alterations of the production language program.
7.2.16. Develop an alternative to production language for t h e implementation
of parsing algorithms. Your language should have the property that
each statement can be implemented by some constant number of machine
statements per character in the statement of your language. A "reason-
able" random access machine should serve as a benchmark here.
Programming Exercises
7,2,17. Design elementary operations that can be used to implement Floyd-
Evans production language statements. Construct an interpreter that
will execute these elementary operations.
7.2.18. Construct a compiler which will take a program in production language
and generate for it a sequence of elementary operations which can be
executed by the interpreter in Exercise 7.2.17.
7,2,19, Write a program that will construct production language parsers for a
useful class of context-free grammars.
7,2,20, Construct a production language parser for one of the grammars in the
Appendix of Volume I. Incorporate an error recovery routine which gets
called whenever error is announced. The error recovery routine should
adjust the stack and/or input so that normal parsing can resume.
BIBLIOGRAPHIC NOTES
Production language and variants thereof have been popular for implementing
parsers. Techniques for the generation of production language parsers have been
developed by a number of people, including Beals [1969], Beals et al. [1969],
DeRemer [1968], Earley [1966], and Haynes and Schutte [1970]. The techniques
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 579
presented in this section for generating production language parsers were origi-
nated by Cheatham [1967]. The use of computed goto's in production language
and the optimization method in Section 7.2.2 is due to Ichbiah and Morse [1970].
Some error recovery techniques for production language parsers are described in
LaFrance [1970].
niques for constructing small LR(k) parsers. Many of these techniques can
also be applied to LL(k) grammars.
In this section we shall discuss LR(k) parsers from a general point of
view. We shall say that two LR(k) parsers are equivalent if given an input
string w either they both accept w or they both announce error at the same
symbol in w. This is exactly akin to the notion of "equivalence" we encoun-
tered in Example 7.8.
In this section we shall present several transformations which can be
used to reduce the size of an LR(k) parser, while producing an equivalent
LR(k) parser. In Section 7.4 we shall present techniques which can be used
to produce directly, from certain types of LR(k) grammars, LR(k) parsers
that are considerably smaller than, but equivalent to, the canonical LR(k)
parser. In addition, the techniques discussed in this section can also be
applied to these parsers. Finally, in Section 7.5 we shall consider a more
detailed implementation of an LR parser in which common scanning actions
can be merged.
A set of LR(k) tables forms the basis of the LR(k) parsing algorithm
(Algorithm 5.7). In general, there are many different sets of tables which
can be used to construct equivalent parsers for the same LR(k) grammar.
Consequently, we can search for a set of tables with certain desirable prop-
erties, e.g., smallness.
To understand what changes can be made to a set of LR(k)tables, let us
examine the behavior of the LR(k) parsing algorithm in detail. This algorithm
places LR(k) tables on the pushdown list. The LR(k) table on top of the push-
down list dictates the behavior of the parsing algorithm. Each table is a pair
of functions ( f , g). Recall that f, the parsing action function, given a look-
ahead string, tells us what parsing action to take. The action may be to (1)
shift the next input symbol onto the pushdown list, (2) reduce the top of
the pushdown list according to a named production, (3) announce comple-
tion of the parsing, or (4) declare that a syntactic error has been found in
the input. The second function g, the goto function, is invoked after each
shift action and each reduce action. Given a symbol of the grammar, the goto
function returns either the name of another table or an error notation.
A sample LR(1) table is shown in Fig. 7.20. In the LR(k) parsing
action goto
a b e S A a b
75:
s 3 x T4 X T7 X
Fig. 7.20 LR(1) table.
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 581
algorithm, a table can influence the operation of the parser in two ways.
First, suppose that table T~ is at the top of the pushdown list. Then the
parsing action function of Ti influences events. For example, if b is the look-
ahead string, then the parser calls for a reduction using production 3. Or if
a is the lookahead string, the parser shifts the current input symbol (here, a)
onto the pushdown list, and since g(a) = TT, the name of table T7 follows a
onto the top of the pushdown list.t
The second way in which a table can influence the action of the parser
appears immediately after a reduction. Suppose that the pushdown list is
• ATlbT2ST3 and that a reduction using the production S ~ bS is called
for by T3. The parser will then remove four symbols (two grammar symbols
and two tables) from the stack, leaving ~AT1 there.l: At this point table T~
is exposed. The nonterminal S is then placed on top of the stack, and the goto
function of T1 is invoked to determine that table T4 = g(S) is to be placed
on top of S.
We shall take the viewpoint that the important times in an LR parsing
process occur when a new table has just been placed on top of the pushdown
list. We call such a table a governing table. We shall examine the charac-
teristics of an LR parser in terms of the sequence of governing tables. If a
governing table T calls for a shift, then the next governing table is determined
by the goto function of T in a straightforward manner. If T calls for a reduc-
tion, on the other hand, the next governing table is determined by the goto
function of the ith table from the top of the pushdown list, where i is the
length of the right-hand side of the reducing production. What table might
be there may seem hard to decide, but we can give an algorithm to determine
the set of possible tables.
With the viewpoint of governing tables in mind, we shall attempt to deter-
mine when two sets of LR(k) tables give rise to equivalent parsers. We shall
set performance criteria that elaborate on what "equivalent" means. First,
we shall give the definition of a set of LR(k) tables that extends the definition
given in Section 5.2.5. Included as both a possible action and a possible
goto entry will be a special symbol ~, which can be interpreted as "don't
care." It turns out, as we shall see, that many of the error entries in a set of
LR(k) tables are never exercised; that is, the LR(k) parsing algorithm will
never consult certain error entries no matter what the input is. Thus, we can
change these entries in any fashion whatsoever, and the parser will stili operate
in the same way.
DEFINITION
Let G be a CFG. A set o f LR(k) tables for G is a pair (3, To), where 3 is
a set of tables for G and To, called the initial table, is in 3. A table for G is
a pair of functions ( f , g), where
(1) f i s a mapping from X.k to the set consisting of ~, error, shift, accept,
and reduce i for all production indices i, and
(2) g maps N u X to 3 u {q~,error}.
When To is understood, we' shall refer to (3, To) simply as 3. The canonical
set of LR(k) tables constructed in Section 5.2.5 is a set of LR(k) tables in
the formal sense used here. Note that (p never appears in a canonical set of
LR(k) tables.
Example 7.15
Let G be defined by the productions
(1) S ~ SA
(2) S ~ A
(3) A --. aA
(4) A --~ b
In Fig. 7.21 is a set of LR(1) tables for G.
action goto
a b e S A a b
S S X X 73 T~ 72
T2 4 4 4
T3 ~o ~o T1 X T2
Fig. 7.21 Set of LR(1) tables.
We shall see that the tables of Fig. 7.21 do not in any sense parse accord-
ing to the grammar G. They merely "fit" the grammar, in that the tables
defined use only symbols in G, and the reductions called for use productions
which actually exist in G.
Let (3, To) be a set of LR(k) tables for a C F G G = (N, 2, P, S). A con-
figuration of the LR(k) parser for (3, To) is a triple ( T o X i T 1 . . . X,,Tm, w, ~z),
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 583
where
(1) T~ is in 3, 0 _~ i < m, and To is the initial table;
(2) X~isinNuZ, l<i<m;
(3) w is in Z*; and
(4) zt is an output string of production numbers.
The first component of a configuration represents the contents of the
pushdown list, the second the unused input, and the third the parse found
so far.
An initial configuration of the parser is one of the form (To, w, e) for
some w in Z*. As before, we shall express a move of the parsing algorithm
by the relation 1--- on configurations defined as follows.
Suppose that the parser is in configuration (To Xi T i . . . XmTm, w, rt), where
Tm = ( f , g> is a table in 3. Let w be in Z* and let u = FIRSTk(w). That is, w is
the string remaining on the input, and u is the lookahead string.
(1) If f(u) = shift and w = aw', where a ~ Z, then
where T = g(a). Here, we make a shift move in which the next input symbol,
a, is shifted onto the pushdown list and then the table g(a) is placed on top
of the pushdown list.
(2) Suppose that f(u) = reduce i and that production i is A ~ y, where
I rl = r. Suppose further that r < m and that T m - r -~ ( f ' , g'). Tm-r is the
table that is exposed when the string X'm_r+lZm_r+l . . . X m T m is removed
from the pushdown list. Then we say that
tThis will never happen if the canonical set of tables is used. In general, this situation
is undesirable and should only be permitted if we are sure an error will be declared shortly.
584 TECHNIQUES FOR PARSER O P T I M I Z A T I O N ' CHAP. 7
there are not enough symbols on the pushdown list) or g'(a) is not a table
name.
We call (ToX~T1 . . . XmTm, w, n) an error indication if there is no next
configuration. However, as an exception, we call (ToST1, e, n) an accepting
configuration if T~ : ( f , g) and f(e) ~ accept.
We define [_.L., I_~._,and [_t_in the usual manner. We say that configuration
C is accessible if Co [-~- C for some initial configuration C 0, We shall now
summarize the parsing algorithm.
ALGORITHM 7.5
LR(k) parsing algorithm.
Input. A C F G G - - ( N , E, P, S), a set (3, To) of LR(k) tables for G, and
an input string w ~ E*.
Output. A sequence of productions n or an error indication.
Method.
(1) Construct the initial configuration (To, w, e).
(2) Let C be the latest configuration constructed. Construct the next
configuration C' such that C ~- C' and then repeat step (2). If there is no
next configuration C', go to step (3).
(3) Let C - - ( a , x, n) be the last configuration constructed. If C is an
accepting configuration, then emit n and halt. Otherwise, indicate error.
Example 7.16
Let us trace through the sequence of moves made by Algorithm 7.5 on
input ab using the LR(1) tables of Fig. 7.21 (p. 582) with T~ as the initial
table.
The initial configuration is (T1, ab, e). The action of T 1 on lookahead a
is shift and the goto of T1 on a is TI, so
(TiaT~bT2, e, e) F- (T~aT~AT3, e, 4)
We can now describe what it means for two sets of LR(k) tables to be
equivalent. The weakest equivalence we might be interested in would require
that, using Algorithm 7.5, the two sets of tables produce the same parse
for those sentences in the language L(G) and that one would not parse a sen-
tence not parsed by the other. The error condition might be detected at
different times by the two sets of tables.
The strongest equivalence we might consider would be one which required
that the two sets of tables produce identical sequences of parsing actions.
That is, suppose that (To, w, e) and (T~,, w, e) are initial configurations for
two sets of tables 3 and 3'. Then for any i ~ 0
though the other set has stopped parsing. We have the following motivation
for this definition.
When an error occurs, we wish to detect it using either set of tables, and
we want the position of the error to be as apparent as possible. It will not
do for one set of tables to detect an error while the other shifts a large number
of input symbols before the error is detected. The reason for this requirement
is that one would in practice like to discover an error as close to where it
occurred as possible, for the purpose of producing intelligent and intelligible
diagnostics.
In practice, on encountering an error, one would transfer to an error
recovery routine which would modify the remaining input string and/or
the contents of the pushdown list so that the parser could proceed to parse the
rest of the input and detect as many errors as possible in one pass over
the input. It would be unfortunate if large portions of the input had been
processed in a meaningless way before the error was detected. We are thus
motivated to make the following definition of equivalence o n sets of LR
tables. It is a special case of the informal notion of equivalence discussed
in Section 7.1.
DEFINITION
Let (3, To) and (3', T~) be two sets of L R ( k ) t a b l e s for a context-free
grammar G = (N, E, P, S).
Let w be an input string in E*, C O = (To, w, e), and C~ = (T~, w, e).
Let C o ~ C ~ ]--Czl-- "'" and C ~ [ - C'~ ~ C ~ ~ . . . be the respective
sequences of configurations constructed by Algorithm 7.5. We say that
(3, To) and (3', T~) are equivalent if the following four conditions hold, for
all i ~> 0 and for arbitrary w.
(1) If C~ and C~ both exist, then we can express these configurations
as Ct = (ToX1Zl . . . XmTm, X, ~) and C't -- (ToXiT'~ . . . ~Y(mT~, x, n), that
is, as long as both sequences of configurations exist, they are identical except
for table names.
(2) Ct is an accepting configuration if and only if C', is an accepting
configuration.
(3) If C~ is defined but C~ is not, then the second components of C~_
and C~ are the same.
(4) If C'~ is defined but C~ is not, then the second components of C~_~
and C'~ are the same.
What conditions (3) and (4) are saying is that once one of the sets of tables
has detected an error, the other must not consume any more input, that is,
not call for any shift actions. However, conditions (3) and (4) allow one set
of tables to call for one or more reduce actions while the other set has halted
with a don't care or error action.
Notice that neither set of tables has to be valid for G. However, if two
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 587
sets of tables are equivalent and one is valid for G, then the other must also
be valid for G.
Example 7.17
Consider the LR(1) grammar G with the productions
(1) S---, aSb
(2) S--Tab
(3, To), the canonical set of LR(1) tables for G, is shown in Fig. 7.22. Figure
7.23 shows another set of LR(1) tables, ('11, U0), for G. Let us consider the
action goto
a b e S a b
To S X X T1 72 X
X X A X X X
T2 S S X T3 T4 Ts
T3 X S X X X T6
/'4 S S X T7 T4 78
Ts X X 2 X X X
T6 X X 1 X X X
T7 X S X x x T9
T8 X 2 X X X X
T9 X 1 X X X X
Fig. 7.22 (3, To)
behavior of the LR(1) parsing algorithm using 3 and '11 on the input string
abb. Using 3, the parsing algorithm would make the following sequence
of moves"
(To, aab, e) ~ (ToaT2, bb, e)
(ToaTzbTs, b, e)
The last configuration is an error indication. Using the set of tables 'It, the
parsing algorithm would proceed as follows"
action goto
a b e S a b
Uo S X X u~ u2 ~0
U1 ,p X A
S S X u3 u2 u4
U3 ~p S X ~0 ~ Us
u4 X 2 2 ~o ~0 ~0
X 1 1 ~o ~o ~o
Fig. 7.23 ('lt, U0)
Many of the error entries in a canonical set of LR(k) tables are never
used by the LR(k) parsing algorithm. Such error entries can be replaced by
~o's, which are truly don't cares in the sense that these entries never influence
the computation of the next configuration for any accessible configuration.
We show that all error symbols in the goto field of a canonical set of tables
can be replaced by ~'s and that if a given table can become the governing
table only immediately after a reduction, then the same replacements can
be made in the action field.
DEFINITION
Let (3, To) be a set of LR(k) tables, k ~ 1, and let
(3) If f(u) = reduce i, production i is A ---~ Y1 "'" Y,, r > 0, and Tm-r
is ( f ' , g'), then g'(A) ~ ~.
Informally, a set of LR(k) tables is tp-inaccessible if whatever ~ entries
appear in the tables are never referred to by Algorithm 7.5 during the parsing
of any input string. We shall now give an algorithm which replaces as many
error entries as possible by ~ in the canonical set of LR(k) tables while keep-
ing the resulting set of tables c-inaccessible.
Thus, we can identify the error entries in a canonical set of LR(k) tables
which are never consulted by the LR(k) parsing algorithm by using this
algorithm to change the unused error entries to ~p's.
ALGORITHM 7.6
In step (1) of Algorithm 7.6 all error entries in the goto functions are
changed to ~a entries, because when k > 1, a canonical LR(k) parser will
always detect an error immediately after a shift move. Hence, an error entry
in the goto field will never be exercised.
Step ( 2 ) o f Algorithm 7.6 replaces error by ~a in the action field of table
590 TECHNIQUES FOR PARSER OPTIMIZATION CHAP. 7
Example 7.18
Let G be the LR(I) grammar with productions
(1) S - - ~ S a S b
(2) S---, e
The canonical set of LR(k) tables for G is shown in Fig. 7.24(a), and the tables
after application of Algorithm 7.6 are shown in Fig. 7.24(b).
Note that in Fig. 7.24(b) all error's in the right-hand portions of the
tables (goto fields) have been replaced by ~0's. In the action fields, To has been
left intact by rule (2a) of Algorithm 7.6. The only shift actions occur in tables
T3 and T7; these result in T4, T~, or T7 becoming the governing table. Thus,
e r r o r entries in the action fields of these tables are left intact. We have changed
e r r o r to ~oelsewhere. [~]
THEOREM 7.5
The set of tables 3' constructed by Algorithm 7.6 is (0-inaccessible and
equivalent to the canonical set 3.
P r o o f The equivalence of :3 and :3' is immediate, since in Algorithm 7.6,
the only alterations of 3 are to replace error's by ~0's, and the LR(k) parsing
algorithm does not distinguish between error and ~0 in any way.t We shall
now show that if a ~0entry is encountered by Algorithm 7.5 using the set of
tables 3', then 3' was not properly constructed from 3.
Suppose that 3' is not ~0-inaccessible. Then there must be some smallest
i such that Co ]'---" C, where Co is an initial configuration and in configuration
C Algorithm 7.5 consults a ~0-entry of 3'. Since To is not altered in step (2a),
we must have i > 0. Let C = (To X1T1 " " XmTm, w, 7z), where T,, = ( f , g )
and FIRSTk(w ) = u. There are three ways in which a ~0-entry might be
encountered.
tThe purpose of the ~0'sis only to mark entries which can be changed.
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 591
action goto
b e S a b
To 2 X 2 X X
T1 S X A X T2 X
T2 2 2 X X X
T3 S S X X r4 rs
T4 2 2 X T6 X X
1 X 1 X X X
T6 S S X X T4 T7
T7 1 1 X X X X
action ~oto
a b e S a b
To 2 X 2 T~ ~
T1 S ~o A T2
T2 2 2 X 75 :
T3 S S ,p ~ T4 ~
T4 2 2 X r6 ~
Ts 1 X 1 ~p tp ~o
T6 S S : r4 r7
T7 1 1 X ,p ~o ,p
Fig. 7.24 A set of LR(1) tables before
(b) ~-Free Set of Tables and after application of Algorithm 7.6.
Case 1: Suppose that f(u) -- qo.Then by step (2aii) and (2aiii) of Algorithm
7.6, the previous move of the parser could not have been shift and must have
been reduce. Thus, Co [i-~ C'k- C, where
[Xm ~ Y~ " " Yr ", u]. Recalling the definition of a valid item, there is
some y ~ Z* such that the string Xi . . . XmUy is a right-sentential form.
Suppose that the nonterminal Arm is introduced by production A --~ o~Xzfl.
That is, in the augmented grammar we have the derivation
s' rm rm
7 x.flx
- rm
We can also show that Algorithm 7.6 changes as many error entries in
the canonical set of tables to ~ as possible. Thus, if we change any e r r o r
entry in 3' to (p, the resulting set of tables will no longer be ~0-inaccessible.
T4 and T~ with T~. And if T3 and T4 missed being compatible only because
they had T~ and T2 in corresponding goto entries, we could still do the
merger.
We shall describe this merger algorithm by defining a compatible parti-
tion on a set of tables. We shall then show that all members of each block
in the compatible partition may be simultaneously merged into a single table.
DEFINITION
Let (3, To) be a set of ¢p-inaccessible LR(k) tables and let II =[$1, • • •, Sp}
be a partition on 3. That is, 8, U $2 U -.- U 8p -- 3, and for all i ~ j, ~;,
and Sj are disjoint. We say that II is a compatible partition of width p if for
all blocks 8~, 1 < i ~ p, whenever ( f l , g , ) and (fz, g , ) are in St, it follows
that
(1) f~(u) ~f2(u) implies that at least one off,(u) andf2(u) is ~0and that
(2) g~(X) ~ g2(X) implies that either
(a) At least one of g~(X) and g2(X) is ~, or
(b) g~(X) and g2(X) are in the same block of II.
We can find compatible partitions of a set of LR(k) tables using tech-
niques reminiscent of those used to find indistinguishable states of an incom-
pletely specified finite automaton. Our goal is to find compatible partitions
of least width. The following algorithm shows how we can use a compatible
partition of width p on a set of LR(k) tables to find an equivalent set con-
taining p tables.
ALGORITHM 7.7
Merger by compatible partitions.
Input. A ~0-inaccessible set (3, To) o f LR(k) tables and a compatible
partition II = [ $ ~ , . . . , Sp) on 3.
Output. An equivalent cp-inaccessible set (3', To) of LR(k) tables such
that :/4:3' = p.
Method.
(1) For all i, 1 _~ i _~ p, construct the table Ut = ( f , g ) from the block
St of II as follows-
(a) Suppose that ( f ' , g ' ) is in 8t and that for lookahead string u,
f'(u) ~ cp. Then let f(u) = f ' ( u ) . If there is no such table in St,
set f(u) = ~o.
(b) Suppose that ( f ' , g') is in $~ and that g'(X) is in block ~j. Then
set g ( X ) = Uj. If there is no table in St with g'(X) in $i, set
g(x) = ~o.
(2) T~ is the table constructed from the block containing To. [--]
Example 7.19
Consider Go, our usual grammar for arithmetic expressions"
(1) E - - D E + T
(2) E ~ T
(3) T ~ T, F
(4) T----, F
(5) F--~ (E)
(6) r---~a
That is, the only difference between the LR(k) parser using 3 and 3' is that
in using 3' the parser replaces table T in 3 by the representative of the block
of T in the partition H.
We shall prove statement (7.3.1) by induction on i. Let us consider the
"only if" portion. The basis, i -- 0, is trivial. For the inductive step, assume
that statement (7.3.1) is true for i. Now consider the i + 1st move. Since
;3 is to-inaccessible, the actions of Tm and T" on FIRSTk(x ) are the same.
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 595
action goto
a + • ( ) e E T F a + • ( )
To S X X S X X T1 T2T3T 4 ~0 SO T5 ~0
T1 ~o S ,p ¢ so A
2 S ,p ~ 2
T3 so 4 4 ,# ~ 4
T4 X 6 6 X X 6 ~0 SO ~0 ~0 ~0 SO
T6 S X X S X X ~0 T13 T3 Z4 ~P SO T5 SO
T7 S X X S X X SO SO T14 T4 ~P ,p T5 SO
r8 so S so ¢ S ~0 ~o ~o ~0 ~0 T16 SO tp T15
T9 so 2 S so 2 so SO ~p SO tp ~p T17 SO tp
Tlo ,p 4 4 so 4 ,p ~p ~ tp ~p ~ SO ~p ~p
Tll X 6 6 X 6 X SO SO ~P SO ~# SO ~P ~P
T~3 so 1 S so so 1 SO SO tp SO ~p T7 SO SO
T14 ~o 3 3 ¢ ~p 3 ~p ~p ~p SO ~P SO SO ~P
T15 X 5 5 X X 5 9 ~ 9 9 9 ~P SO 9
T~9 so 1 S so 1 so ~p ~ ~0 SO ~o Tl7 SO ~p
r:o so 3 3 so 3 so SO ~p SO SO ~p ~o SO ~p
T~ X 5 5 X 5 X ~P ~ SO SO 9 SO 9 9
Suppose that the action is shift, that a is the first symbol of x, and that the
goto entry of Tm on a is T. By Algorithm 7.7 and the definition of compatible
partition, the goto of T'~ on a is T' if and only if T' is the representative of
the block of II containing T.
596 TECHNIQUESFOR PARSER OPTIMIZATION CHAP. 7
action goto
a + * ( ) e E T F a + * ( )
ro S X X S X X ~q r2 vl r4 ~ ~ rs ~
rl SO S SO ~o SO A
T2 ,# 2 S ~0 SO 2
T4 X 6 6 X X 6 SO SO SO SO SO SO SO SO
r9 SO 2 S ~o 2 ~o ~o ~0 SO SO ~0 T17 SO SO
Tll X 6 6 X 6 X SO ~0 SO SO ~o SO SO ~0
T~3 SO 1 S SO SO 1 so so so so so T7 so so
Ti5 X 5 5 X X 5 SO SO SO SO SO SO ~P SO
Ti9 SO 1 S SO 1 SO ~0 SO SO SO SO T17 SO SO
X 5 5 X 5 X SO SO SO SO SO SO SO SO
Ux SO 4 4 SO 4 4 SO SO SO SO SO SO SO SO
U~ SO 3 3 SO 3 3 SO SO SO SO SO SO SO SO
action
a + . ( ) e
T4: X 6 6 X X 6
Taa: X 6 6 X 6 X
Let (3, To) be a set of LR(k) tables for grammar G = (N, X, P, S). We
extend the GOTO function of Section 5.2.3 to tables and strings of grammar
symbols. GOTO maps 5 × (N U X)* to 3 as follows:
(1) GOTO(T, e) = T for all T in 3.
(2) If T = ( f , g>, GOTO(T, X) --- g ( X ) for all X in N U E and T in 5.
(3) GOTO(T, eX) -- GOTO(GOTO(T, e), X) for all 0~ in (N W E)* and
T i n :3.
We say table that T in (5, To) has height r if GOTO(T0, e) = T implies
that [a[ > r.
We shall also have occasion to use GOTO-~, the "inverse" of the GOTO
function. GOTO -~ maps 5 x (N u Z)* to the subsets of 5. We define
GOTO-~(T, t~) = IT' [GOTO(T', 00 = T}.
Finally, we define a function NEXT(T, p), where T is in 3 and p is produc-
tion A ~ Xx . . . X, as follows:
(1) If T does not have height r, then NEXT(T, p) is undefined.
(2) If T has height r, then NEXT(T, p ) - - [ T ' ] t h e r e exists T" ~ 5 and
~ (N U E) r such that T" ~ GOTO-I(T, e) and T ' - - G O T O ( T " , A)}.
Thus, NEXT(T, p) gives all tables which could be the next governing
table after T if T is on t o p of the pushdown list and calls for a reduction by
production p. Note that there is no requirement that Xx ..- Xr be the top
r grammar symbols on the pushdown list. The only requirement is that there
be at least r grammar symbols on the list. If the tables are the canonical
ones, we can show that only for ~t -- X 1 --. Xr among strings of length r
will GOTO -1 (T, e) be nonempty.
Certain algebraic properties of the GOTO and NEXT functions are left
for the Exercises.
The GOTO function for a set of LR(k) tables can be conveniently por-
trayed in terms of a labeled graph. The nodes of the GOTO graph are labeled
by table names, and an edge labeled X is drawn from node T,. to node Tj if
GOTO(T,., X) = Tj. Thus, if GOTO(T, X i ) ( 2 . . . X~) = T ' , then there will
be a path from node T to node T' whose edge labels spell out the string
X1 X2 --..Yr. The height of a table T can then be interpreted as the length
of the shortest path from To to T in the GOTO graph.
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 599
Example 7.20
The GOTO graph for the set of tables in Fig. 7.25 (p. 595) is shown in
Fig. 7.27.
F ~ )
)j,,,_J +
From this graph we can deduce that GOTO[T6, (E)] = T15, since
GOTO[T6, (] = Ts, GOTO[Ts, E] = Ts, and GOTO[Ts, )] = T15.
Table T6 has height 2, so NEXT(T6, 5), where production 5 is F ~ (E),
is undefined.
Let us now compute NEXT(T1 s, 5). The only tables from which there is
a p a t h of length 3 to T15 are To, T6, and TT. Then GOTO(T0, F ) ~ - T 3 ,
GOTO(T6, F) = T3, and GOTO(TT, F) = T 14, and so NEXT(T1 s, 5) =
{T3, T14}. [~
We shall now give an algorithm whereby a set of q~-inaccessible tables
can be modified to allow certain errors to be detected later in time, although
not in terms of distance covered on the input. The algorithm we give here is
not as general as possible, but it should give an indication of how the more
general modifications can be performed.
We shall change certain error entries and q~-entries in the action field to
r e d u c e entries. For each entry to be changed, we specify the production to
be used in the new reduce move. We collect permissible changes into what
we call a postponement set. Each element of the postponement set is a triple
(T, u, i), where T is a table name, u is a lookahead string, and i is a produc-
tion number. The element (T, u, i) signifies that we are to change the action
of table T on lookahead u to reduce i.
DEFINITION
Let (3, To) be a set of LR(k) tables for a grammar G = (N, E, P, S).
We call (P, a subset of 3 × E *k × P, a postponement set for (3, To) if the
following conditions are satisfied.
If (T, u, i) is in 6' with T = ( f , g), then
(1) f ( u ) = e r r o r or ~0;
(2) If production i is A --~ e and T = GOTO(To, fl), then e is a suffix
of fl;
(3) There is no i' such that (T, u, i') is also in (P; and
(4) If T' is in NEXT(T, i) and T' = ( f ' , g'), then f'(u) = error or ~.
Condition (1) states that only error entries and (p-entries are to be changed
to reduce entries. Condition (2) ensures that a reduction by production i
will occur only if a appears on top of the pushdown list. Condition (3) ensures
uniqueness, and condition (4) implies that reductions caused by introducing
extra reduce actions will eventually be caught without a shift occurring.
Referring to condition (4), note that (T', u, j) may also be in (P. In this
case the value o f f ' ( u ) will also be changed from error or ~ to r e d u c e j. Thus,
several reductions may be made in sequence before error is announced.
Finding a postponement set for a set of LR(k) tables which will maximize
the total number of compatible tables in the set is a large combinatorial
problem. In one of the examples to follow we shall hint at some heuristic
techniques which can be used to find appropriate postponement sets. How-
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 601
ever, we shall first show how a postponement set is used to modify a given
set of LR(k) tables.
ALGORITHM 7.8
Postponement of error checking.
Input. An LR(k) grammar G----(N, ~, P, S), a g-inaccessible set (3, To)
of LR(k) tables for G, and a postponement set 6).
Output. A a-inaccessible set 3' of LR(k) tables equivalent to 3.
Method.
(1) For each (T, u, i) in 6', where T = ( f , g), change f(u) to reduce i.
(2) Suppose that (T, u, i) is in 6) and that production i is A ----~0c. For all
T' = ( f ' , g ' ) such that GOTO(T', ~) = T and g'(A) = ~, change g'(A)
to e r r o r .
(3) Suppose that (T, u, i) is in 6' and that T' = ( f ' , g'~ is in NEXT(T, i).
Iff'(u) = q~, change f'(u) to e r r o r .
(4) Let 3' be the resulting set of tables with the original names retained.
D
E x a m p l e 7.21
THEOREM 7.7
action goto
a b e S A B a b
r0 S S X T1T2 SO T3 T4
~q SO SO A SO SO SO SO SO
r2 S S SO Ts T2 ~ T3 T4
r3 S S X ~ r6rTT8
r4 X X 2 SO SO SO SO SO
rs SO SO 1 SO SO SO SO SO
r6 3 3 SO SO so so SO so
r7 S S X ~ Tg r7 T8
7q 5 5 X SO so so SO SO
r9 4 4 SO SO so SO so SO
action goto
a b e S A B a b
To S S X TiT2XT3T4
Ti SO SO A SO SO SO SO
73 S S SO Ts T2 X T3 T4
T3 s s x X , T6 TT T8
r4 5 5 2 SO SO SO SO SO
r5 SO SO 1 SO SO SO SO SO
r~ 3 3 SO SO SO SO SO SO
T7 S S X x ~0 Tg T7 Ts
T8 5 5 2 so so so so ¸ so
r9 4 4 SO so so so so so
action goto
b e S A B a b
ro S S X T1T 1 X T3 T4
75 S S A Ts Ti X T3 T4
75 S S X x ~ Ts rT r4
r4 5 5 2 ~o ~o ~o ~o ~o
r5 3 3 1 ~o ~o so ~o ~o
T7 S S X X ~o T9 TT T4
r9 ¸ 4 4 ~o ~o ~p ~p tp ¢
a number of heuristics used in the example. Since these heuristics will not be
delineated elsewhere, the reader is urged to examine this example with care.
Example 7.22
Consider the tables for G O shown in Fig. 7.25 (p. 595). Our general
strategy will be to use Algorithm 7.8 to replace error actions by reduce
actions in order to increase the number of tables with similar parsing action
functions. In particular, we shall try to merge into one table all those tables
which call for the same reductions.
Let us try to arrange to merge tables T~ 5 and T21 , because they reduce
according to production 5, and tables T4 and T11, which reduce according
to production 6.
To merge T~ 5 and T21, we must make the action of T15 on ) be reduce 5
and the action of T2~ on e be reduce 5. Now we must check the actions of
NEXT(Tls, 5) = {T3, T14} and NEXT(T2~, 5) = {Tl0, T2o} on ) and e. Since
T3 and T14 each have (~ action on), we could change these ~'s to error's and
be done with it. However, then T3 and T1 o would no longer be compatible
nor would T14 and T20---~so we would be wise to change the actions of T3
and T14 on ) instead to reduce 4 and reduce 3, respectively.
We must then cheek NEXT(T3, 4) = NEXT(T14, 3) = [T2, T13}. A similar
argument tells us that we should not change the actions of T2 and T13 on )
to error, but rather to reduce 2 and reduce 1, respectively. Further, we see that
NEXT(T2, 2) = [Ti } = NEXT(T13, 1). There is nothing wrong with changing
the action of Ti on ) to error, so at this point we have taken into account all
modifications needed to change the action of T15 on ) to reduce 5.
We must now consider what happens if we change the action of T21 on
e to reduce 5. NEXT(T21, 4) = ITs0, T20}, but we do not want to change
the actions of Tlo and T2o on e to error, because then we could not possibly
merge these tables with T3 and T14. We thus change the actions of T10 and
T20 on e to reduce 4 and reduce 3, respectively. We find that
make these changes to T4 and T11 without further ado. The complete post-
ponement set consists of the following elements"
[T2, ), 2] [Tg, e, 2]
[T3, ), 4] [T~ o, e, 4]
[T4, ), 6] [T~ ,, e, 6]
[T,3, ), 1] [T,9, e, 1]
[T,,, ), 3] [Tzo, e, 3]
[r, 5, ), 5] [r2,, e, 5]
The result of applying Algorithm 7.8 to the tables of Fig. 7.25 with this
postponement set is shown in Fig. 7.31. Note that no error entries are intro-
duced into the goto field.
Looking at Fig. 7.3 l, we see that the following pairs of tables are imme-
diately compatible"
T~-Ti0
T~-T~i
T~-T~o
T~-T~i
T~-T, ~
T~'T12
T~'T17
T~-T~
T~-T~
T~-Ti ~
If we apply Algorithm 7.7 with the partition whose blocks are the above
pairs and the singletons {To} and {T1}, we obtain the set of tables shown in
Fig. 7.32. The smaller index of the paired tables is used as representative
in each case. It is interesting to note that the "SLR" method of Section
7.4.1 can construct the particular set of tables shown in Fig. 7.32 directly
from Go.
To illustrate the effect of error postponement, let us parse the erroneous
input string a). Using the tables in Fig. 7.25, the canonical parser would
606 TECHNIQUES FOR PARSER OPTIMIZATION CHAP. 7
action ,,goto
a + • ( ) e E T F a + * ( )
ro S X X S X X T1 T: T3 T4 ~ ~ Ts
rl ~p S ~ ~p X A
r~ ,# 2 S ~ 2 2
r3 4 4 ¢ 4 4
r4 X 6 6 X 6 6
r6 S X X S X X g9 T13 T3 T4 ho ~0 T5 g9
r7 S X X S X X tp tp T14 T4 ho tp T5
r8 ~. S ,¢ ¢ S X ~0 ~ ~0 ~ T16 ~0 ~ T15
r9 2 S ,~ 2 2 ~p ~o tp tp tp T17 ~p tp
r~o ,~ 4 4 ¢ 4 4
Tll X 6 6 X 6 6
r~4 ,~ 3 3 ~o 3 3
rls X 5 5 X 5 5 tp ~ ~ tp ~o tp tp ,¢
T19 ~p 1 S ~ 1 1 ~p tp ho 9 9 T17 tp ~p
r~o 3 3 ~o 3 3
r~l X 5 5 X 5 5 ~p ~p tp tp ~p ~o ¢ ~o
action goto
a + * ( ) e E T F a + • (
r0 S X X S X X
T1 ~p S ~ ~p X A ~p ~p ~p ~o T6 ~p ~p
T2 ~p 2 S ~p 2 2
T3 ,# 4 4 ~0 4 4 ~p ~o ~p ~p ~p ~p ~p ~p
T4 X 6 6 X 6 6
T5 S X X S X X r8 r2 r3 r4
T6 S X X S X X r~3 r3 73
T7 S X X S X X TI4 T4
r8 S ¢ ¢ S X
T13 ,p 1 S ~o 1 1
T14 ~p 3 3 ~ 3 3
T15 X 5 5 X 5 5
However, using the tables in Fig. 7.32 to parse this same input string,
the parser would now make the following sequence of moves:
section has to do with single productions, and we shall treat it rather more
informally than we did previous modifications.
A production of the form A ~ B, where A and B are nonterminals, is
called a single production. Productions of this nature occur frequently in
grammars describing programming languages. For instance, single produc-
tions often arise when a context-free grammar is used to describe the prece-
dence levels of operators in programming languages. From example, if a
string al-t- az * a3 is to be interpreted as a 1 ÷ (a 2 • a3), then we say the
operator • has higher precedence than the oPerator ÷ .
Our grammar G Ofor arithmetic expressions m a k e s , of higher precedence
than + . The productions in Go are
(1) E ~ E + T
(2) E----~ T
(3) T---~ T . F
(4) T--~ F
(5) F ~ (E)
(6) F--~ a
We can think of the nonterminals E, T, and F as generating expressions on
different precedence levels reflecting the precedence levels of the operators.
E generates the first level of expressions. These are strings of T's separated
by -t-'s. The operator -q- is on the first precedence level. T generates the second
level of expressions consisting of F's separated by • ;s. The third level of
expressions are those generated by F, and we can consider these to be the
primary expressions.
Thus, when we parse the string al -q- a2 * a3 according to G 0, we must
first parse a 2 • a3 as a T before combining this Twith al into an expression E.
The only function served by the two single productions E ~ T and
T ----~F is to permit an expression on a higher precedence level to be trivially
reduced to an expression on a lower precedence level. In a compiler the
translation rules usually associated with these single productions merely
state that the translations for the nonterminal on the left are the same as
those for the nonterminal on the right. Under this condition, we may, if we
wish, eliminate reductions by the single production.
Some programming languages have operators on 12 or more different
precedence levels. Thus, if we are parsing according to a grammar which
reflects a hierarchy of precedence levels, the parser will often make many
sequences of reductions by single productions. We can speed up the parsing
process considerably if we can eliminate these sequences of reductions, and
in most practical cases we can do so without affecting the translation that is
being computed.
In this section we shall describe a transformation on a set of LR(k) tables
which has the effect of eliminating reductions by single productions wherever
desired.
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 600
Let (3, To) be a 0-inaccessible set of LR(k) tables for an LR(k) grammar
G = (N, l~, P, S), and assume that 3 has as many (p-entries as possible. Sup-
pose that A ~ B is a single production in P.
Now, suppose the LR(k) parsing algorithm using this set of tables has
the property that whenever a handle Y~ Y2 "'" Y, is reduced to B on look-
ahead string u, B is then immediately reduced to A. We can often modify
this set of tables so that Y~ • • • Y,. is reduced to A in one step. Let us examine
the conditions under which this can be done.
Let the index of production A ---~ B be p. Suppose that T = ( f , g~ is
a table such that f ( u ) = reduce p for some lookahead string u. Let 3 ' =
GOTO-~(T, B) and let '11. = {U IU = GOTO(T', A) and T' ~ 3'}. 3' consists
of those tables which can appear immediately below B on the pushdown
list when T is the table above B. If (3, To) is the canonical set of LR(k)
tables, then 'It is NEXT(T, p)(Exercise 7.3.19).
To eliminate the reduction by production p, we would like to change
the entry g'(B) from T to g'(A) for each ( f ' , g ' ) in 3'. Then instead of making
the two moves
We can make this change in g'(B) provided that the entries in T and all
tables in q.t are in agreement except for those lookaheads which call
for a reduction by production p. That is, let T = ( f , g ) and suppose
q'[ -~- ( ( f l, g~), ( f z, gz), . . . , (fro, g,,)}" Then we require that
down list that were not placed there by the original parser. For example,
suppose that the original parser makes a reduction to B and then calls for
a shift, as follows"
(?T' Y1 U1 Y2 U2 ... Y,. Ur, aw, rt) ~ (?T'A U', aw, zti)
(TT'A U'aT", w, hi)
Suppose that T = ( f , g), T' = ( f ' , g'), and U = ( f . , g,). Then, U is g'(A).
Table U' has been constructed from U according to rule (4) above. Thus,
if f ( v ) = shift, where v = FIRST(aw), we know that f~(v) is also shift.
Moreover, we know that gi(a) will be the same as g(a). Thus, the new parser
makes a sequence of moves which is correct except that it ignores the ques-
tion of whether the reduction by A ~ B was actually made or not.
In subsequent moves the grammar symbols on the pushdown list are
never consulted. Since the goto entries of U' and T agree, we can be sure
that the two parsers will continue to behave identically (except for reductions
by single production A ~ B).
We can repeat this modification on the new set of tables, attempting to
eliminate as many reductions by semantically insignificant single produc-
tions as possible.
Example 7.23
Let us eliminate reductions by single productions wherever possible in the
set of LR(1) tables for G Oin Fig. 7.32 (p. 607). Table T2 calls for reduction
by production 2, which is E ~ T. The set of tables which can appear imme-
diately below Tz on the stack is [To, T5}, since GOTO(T0, E ) = T1 and
GOTO(Ts, E) -- Ts.
We must check that except for the reduce 2 entries, tables T2 and T1 are
compatible and T2 and T8 are compatible. The action of T2 on • is shift. The
action of T1 and T8 on • is ~o. The goto of T2 on • is T7. The goto of T1 and
T8 on • is ~. Therefore, T2 and T1 are compatible and T2 and T8 are com-
patible. Thus, we can change the goto o f table To on nonterminal T from T2
to T1 and the goto of T5 on T from T2 to Ts. We must also change the action
of both T1 and Ts on • from f to shift, since the action of T2 on • is shift.
Finally, we change the goto of both T1 and T8 on • from (p to T7 since the
goto of T2 on • is Z7.
Table T2 is now inaccessible from To and thus can be removed.
SEC. 7.3 TRANSFORMATIONS ON SETS OF LR(k) TABLES 611
[To, (a + a) • a, el I- [To(Ts, a + a) • a, e]
[To(T, aT4, + a) • a, e]
]--- [To(T~ET8, + a) • a, 6]
[To(TsET8 + T6, a) • a, 6]
[- [To(TsET8 + T6aT4, ) • a, 6]
1-- [To(TsET8 + T6ET~3, ) • a, 66]
~- [To(TsET8, ) • a, 661]
[To(TsETs)Tls, • a, 661]
t-- [ToET~, * a, 6615]
[ToET1, TT, a, 6615]
[ToET~ ,TTaT+, e, 6615]
t-- [ToET~,TTET~4, e, 66156]
~- [ToETi , e, 661563]
612 TECHNIQUESFOR PARSER OPTIMIZATION CHAP. 7
action goto
a + -, ( ) e E T F a + • ( )
To S X X S X X ~q rl ~ r4 ~ ~ r5
r~ ~p S S ~p X A
T~ X 6 6 X 6 6
T~ S X X S X X T8 r8 T8 T4 ~ ~ r5
T~ S X X S X X
T7 S X X S X X ~ r~4 r4 ~ ~ r~ ~
T8 ~p S S ,p S X
Tx~ ~p 1 S ~p 1 1
rl4 ,~ 3 3 ~p 3 3
X 5 ~ 5 X 5 5
action got__..o
a + * ( ) e E a + * ( )
To S X X S X X 7q r4 ~ ~ r5
r~ ~p S S ~p X A
T4 X 6 6 X 6 6
T5 S X X S X X r8 T4 ~ ~ T5
r~ S X X S X X T~3 r4 ~ ~ Ts
r7 S X X S X X r14 r4 ~ ~ r5
r8 S S ~ S X ~, ,p T6 T7 ~ T~5
T~3 ~p 1 S ,# 1 1
Ti4 ~p 3 3 ~p 3 3 ~o ~p ~p tp ~p ~¢
TI5 X 5 5 X 5 5
THEOREM 7.8
If Algorithm 7.9 is applied to an LR(1) grammar G and its canonical set
of LR(1) tables 3, if G has no more than one single production for any non-
terminal, and if no nonterminal derives e alone, then the resulting set of
tables has no reductions by single productions.
]'Note that f i x c o u l d be e here, in which case etl calls for a reduction on lookahead b.
Otherwise, (~1 calls for a shift.
EXERCISES 61 5
Proof. Intuitively, Lemmas 7.1 and 7.2 assure that all pairs T1 and Tz
considered in step (3) do in fact meet the conditions of that step. A formal
proof is left for the Exercises.
EXERCISES
7.3.1. Construct the canonical sets of LR(I) tables for the following grammars:
(a) S---. A B A C
A - - . aD
B---, b l c
C--, cld
D---~ D010
(b) S ---, a S S [ b
(c) S --~ SSa l b
(d) E ~ E + T [ T
T---, T * F1F
V ~ P? FIP
P ---~ (E)1 al a(L)
L--~L, EIE
7.3.2. Use the canonicalset of tables from Exercise 7.3.1(a) and Algorithm
7.5 to parse the input string aObaOOc.
7.3.3. Show how Algorithm 7.5 can be implemented by
(a) a deterministic pushdown transducer,
(b) a Floyd-Evans production language parser.
7.3.4. Construct a Floyd-Evans production language parser for Go from
(a) The LR(1) tables in Fig. 7.33(a).
(b) The LR(1) tables in Fig. 7.33(b).
Compare the resulting parsers with those in Exercise 7.2.11.
7.3.5. Consider the following grammar G which generates the language
L = {anOa~bn[i, n ~ 0) W {Oa"laicn[i, n ~ 0}:
S- +AIOB
A ---+ aAb[O[OC
B~ aBc [ I I1C
C - - - ~ aC[a
7.3.7. Use the techniques of this section to find a smaller equivalent set of
LR(1) tables for each of the grammars of Exercise 7.3.1.
7.3.8. Let ~ be the canonical collection of sets of LR(k) items for an LR(k)
grammar G = (N, E, P, S). Let 12 be a set of items in ,~. Show that
(a) If item [A ~ tg • fl, u] is in 12, then u ~ FOLLOWk(A).
(b) If 12 is not the initial set of items, then 12 contains at least one item
of the form [A --~ tzX. fl, u] for some X in N w E.
(c) If [B ~ • fl, v] is in 12 and B ~ S', then there is an item of the
form [A ~ tz - BT, u] in 12.
(d) If [A --~ tg • Bfl, u] is in 12 and EFFi(Bflu) contains a, then there is
an item of the form [C ~ • a~', v] in 12 for some ~, and v. (This result
provides an easy method for computing the shift entries in an LR(1)
parser.)
*7.3.9. Show that if an error entry is replaced by tp in 3', the set of LR(k)
tables constructed by Algorithm 7.6, the resulting set of tables will no
longer be ~0-inaccessible.
"7.3.10. Show that a canonical LR(k) parser will announce error either in the
initial configuration or immediately after a shift move.
*'7.3.11. Let G be an LR(k) grammar. Give upper and lower bounds on the
number of tables in the canonical set of LR(k) tables for G. Can you
give meaningful upper and lower bounds on the number of tables in
an arbitrary valid set of LR(k) tables for G ?
"7.3.12. Modify Algorithm 7.6 to construct a tp-inaccessible set of LR(0) tables
for an LR(0) grammar.
"7.3.13. Devise an algorithm to find all a-entries in an arbitrary set of LR(k)
tables.
"7.3.14. Devise an algorithm to find all tp-entries in an arbitrary set of LL(k)
tables.
*'7.3.15. Devise an algorithm to find all a-entries in an LC(k) parsing table.
"7.3.16. Devise a reasonable algorithm to find compatible partitions on a set of
LR(k) tables.
7.3.17. Find compatible partitions for the sets of LR(1) tables in Exercise 7.3.1.
7.3.18. Show that the relation of compatibility of LR(k) tables is reflexive and
symmetric but not transitive.
7.3.19. Let (3, To) be the canonical set of LR(k) tables for an LR(k) grammar
G. Show that GOTO(T0, t~) is not empty if and only if t~ is a viable
prefix of G. Is this true for an arbitrary valid set of LR(k) tables for G ?
*7.3.20, Let (3, To) be the canonical set of LR(k) tables for G. Find an upper
bound on the height of any table in 3 (as a function of G).
7.3.21. Let 3 be the canonical set of LR(k) tables for LR(k) grammar
G = (N, E, P, S). Show that for all T e 3, NEXT(T, p) is the set
{GOTO(T', A)I T' ~ G O T O - I ( T , t~), where production p is A ~ t~}.
EXERCISES 617
DEFINITION
The following productions generate arithmetic expressions in which
0 1 , 0 2 , . . . , 0. represent binary operators on n different precedence
levels. 01 has the lowest precedence, and 0. the highest. The operators
associate from left to right.
Eo - - ~ Eo$1 E1 i E1
E1 ~ EiOzEz I E2
E0
( Eo )
E0 01 E1
action goto
a O/ ( ): e E a 0] ( )
[T~ ,i ] X S X X A X X [ r4,/] X X
T2 X 3 X 3 3 X X X X X
[Ts,i ] X S X S X X X IT4,/] X T7
tT6~I X R1 X R2 R2 X X [r,,/l X X
T7 X 2 X 2 2 X X X X X
**7.3.30. Show that the parser in Fig. 7.35 correctly parses all expressions gen-
erated by the tagged grammar.
7.3.31. Construct an LR(1) parser for the untagged grammar with operators
on n different precedence levels. How big is this parser compared with
620 TECHNIQUES FOR PARSER OPTIMIZATION CHAP. 7
the tagged parser in Fig. 7.35. Compare the operating speed of the two
parsers.
*7.3.32. Construct a tagged LR(1)-like parser for expressions with binary opera-
tors of which some associate from left to right and others from right
to left.
**7.3.33. The following tagged grammar will generate expressions with binary
operators on n different precedence levels"
(1) E~ - - , (Eo)Rt.,, 0 < i< n
(2) E~ --> aRt,,, 0 ~ i~ n
(3) Ri, k ---> OyEyRt.j_t 0 ~ i < j ~ k ~ n
(4) R~,j~e 0~i~j~n
Construct a tagged LL(1)-like parser for this grammar. Hint: Although
this grammar has two tags on R, only the first tag is needed by the
parser.
7.3.34. Complete the proof Of Lemma 7.1 and its corollary.
7.3.35. Prove Lemma 7.2.
7.3.36. Complete the proof of Theorem 7.8.
Open P r o b l e m
7.3.37. Under what conditions is it possible to merge all the goto columns for
the nonterminals after eliminating reductions by single productions,
as we did for Go ? The reader should consider the possibility of relating
this question to operator precedence. Recall that Go is an operator
precedence grammar.
Research Problems
7.3.38. Develop additional techniques for modifying sets of LR tables, while
preserving "equivalence" in the sense we have been using the term.
7.3.39. Develop techniques for compactly representing LR tables, taking advan-
tage of ~ entries.
P r o g r a m m i n g Exercises
7.3.40. Design elementary operations that can be used to implement an LR(1)
parser. Some of these operations might be: read an input symbol, push
a symbol on the pushdown list, pop a certain number of symbols from
the pushdown list, emit an output, and so forth. Construct an inter-
preter that will execute these elementary operations.
7.3.41. Construct a program that takes as input a set of LR(1) tables and
produces as output a sequence of elementary instructions that imple-
ments the LR(1) parser using this set of tables.
SEC. 7.4 TECHNIQUES FOR CONSTRUCTING LR(k) PARSERS 621
7.3.42. Construct a program that takes as input a set of LL(1) tables and
produces as output a sequence of elementary instructions that simulates
the LL(1) parser using this set of tables.
7.3.43. Write a program to add don't care entries to the canonical set of LR
tables.
*7.3.44. Write a program to apply some heuristics for error postponement and
table merger, with the goal of producing small sets of tables.
7.3.45. Implement Algorithm 7.9 to eliminate reductions by single productions
where possible.
BIBLIOGRAPHIC NOTES
The saving in this approach is due to the fact that we would use lookahead
only where lookahead is needed. For many grammars this approach will
produce a set of tables which is considerably smaller than the canonical set
of LR(k) tables for G. However, for some LR(k) grammars this method does
not work at all.
We shall also consider another approach to the design of LR(k) parsers.
In this approach, we split a large grammar into smaller pieces, constructing
sets of LR(k) items for the pieces and then combining the sets of items to
form larger sets of items. However, not every splitting of an LR(k) grammar
G is guaranteed to produce pieces from which we can construct a valid set
of tables for G.
Example 7.24
Let G Obe our usual grammar
E ~E+TIT
T >T*F[F
F ~ (E) la
tWe could use FOLLOWk_I(A) here, since fl must generate at least one symbol of
any string in EFFk(fl FOLLOWk(A)).
SEC. 7.4 TECHNIQUES FOR CONSTRUCTING LR(k) PARSERS 623
items as follows. The goto entries are constructed in the obvious way, as
though the tables were LR(0) tables. But for lookahead a, what should the
action be? Should we shift, or should we reduce by production A----~ ct.
The answer lies in whether or not a E F O L L O W i (A). If a ~ FOLLOWi(A),
then it is impossible that a is in EFFi(?) , by the definition of an SLR(1)
grammar. Thus, reduce is the appropriate parsing action. Conversely, if a is
not in F O L L O W i (A), then it is impossible that reduce is correct. If a is in
EFF(~,), then shift is the action; otherwise, error is correct. This algorithm
is summarized below.
ALGORITHM 7.10
Construction of a set of LR(k) tables for an SLR(k) grammar.
Input. An SLR(k) grammar G = (N, E, P, S) and So, the canonical col-
lection of sets of LR(0) items for G.
Output. (3, To), a set of LR(k) tables for G, which we shall call the SLR(k)
set of tables for G.
Method. Let ~ be a set of LR(0) items in So. The LR(k) table T associated
with ~ is the pair ( f , g), constructed as follows:
(1) For all u in Z *k,
(a) f(u) = shift if [A --~ ~ • fl, e] is in ~, ,8 ~ e, and u is in the set
EFFk(fl FOLLOWk(A)).
(b) f ( u ) = reduce i if [d---~ 0c., e] is in ~, d - - ~ o~ is production i
in P, and u is in FOLLOWk(A).
(c) f(e) = accept if [S' --~ S -, e] is in ~2.'~
(d) f ( u ) = error otherwise.
(2) For all X in N t.3 X, g(X) is the table constructed from G O T O ( a , X).
To, the initial table, is the one associated with the set of items containing
[s' ~ . S, d.
Example 7.25
Let us construct the SLR(1) set of tables from the sets of items of Fig.
7.36 (p. 623). We use the name Tt for the table constructed from ~ r We
shall consider the construction of table T2 only.
"l'The canonical collection of sets of items is constructed from the augmented grammar.
SEC. 7.4 TECHNIQUES FOR CONSTRUCTING LR(k) PARSERS 625
action goto
a + * ( ) e E T F a + • ( )
r0 S X X S X X T~r2T3T4X x T5 X
rl X S X X X A X X X X T6 X X X
r2 X 2 S X 2 2 X X X X X T7 X X
7q X 4 4 X 4 4 X X X X X X X X
r4 X 6 6 X 6 6 X X X X X X X X
r5 S X X S X X Ts T2 T3 T4 X XTsX
r6 S X X S X X X Tg T3 T4 X X T5 X
r7 S X X S X X X X TlO Tg X T5 X
r8 X S X X S X X X X X T6 x XT~
r9 X 1 S X 1 1 x x x x x rT x x
T10 X 3 3 X 3 3 X X X X X X X X
X 5 5 X 5 5 X X X X X X X X
Except for names and ~ entries, this set o f tables is exactly the same as
that in Fig. 7.32 (p. 607).
• core of a set of items R as the set of bracketed first components of the items
in that set. For example, the core of [A ~ 0c • fl, u] is [A ~ 0c • fl].t We shall
denote the core of a set of items R by CORE(R).
Each set of items in So is distinct, but there may be several sets of items
in Sk with the same core. However, it can easily be shown that So =
{CORE(et) let ~ Sk}.
Let us define the function h on tables which corresponds to the function
C O R E on sets of items. We let h(T) -- T' if T is the canonical LR(k) table
associated with R and T' is the SLR(k) table constructed by Algorithm 7.10
from CORE(R). It is easy to verify that h commutes with the GOTO function.
That is, GOTO(h(T), %) = h(GOTO(T, X)).
As before, let
Let $~ = [R'IR ~ So}. We know that (3, To) is the same set of LR(k) tables
as that constructed from S'0 using the method of Section 5.2.5. We shall show
that (5, To) can also be obtained by applying a sequence of transformations to
(3c, To), the canonical set of LR(k) tables for G. The necessary steps are the
following.
(1) Let 6' be the postponement set consisting of those triples (T, u, i)
such that the action of T on u is error and the action of h(T) on u is reduce i.
Use Algorithm 7.8 on 6" and (3 c, Tc) to obtain another set of tables (3'c, T'c).
(2) Apply Algorithm 7.7 to merge all pairs of tables T~ and Tz such that
h(T1) = h(Tz). The resulting set of tables is (3, To).
Let (T, u, i) be an element of 6". To show that 6' satisfies the requirements
of being a postponement set for 3c, we must show that if T " = ( f " , g " )
is in NEXT(T, i), then f " ( u ) = error. To this end, suppose that production
i is A ---, 0c and T " = GOTO(To, flA) for some viable prefix flA. Then
T = GOTO(T0,/~).
In contradiction let us suppose t h a t f " ( u ) ~ error. Then there is some item
[B ~ y • ,5, v] valid for flA, where u is in EFF(~v).:I: Every set of items,
except the initial set of items, contains an item in which there is at least one
symbol to the left of the dot. (See Exercise 7.3.8.) The initial set of items is
valid only for e. Thus, we may assume without loss of generality that 7 = y'A
for some y'. Then [B ---~ y' • A6, v] is valid for fl, and so is [A ---~ • ~, u].
Thus, [A --~ ~ . , u] is valid for p~, and f ( u ) should not be error as assumed.
We conclude that 6' is indeed a legitimate postponement set for ~o.
Let ~3~ be the result of applying Algorithm 7.8 to 3c using the postpone-
tWe shall not bother to distinguish between [A ---~ ~ • ,8] and [A ---~ • • ,13,e].
~Note that this statement is true independent of whether ~ = e or not.
SEC. 7.4 TECHNIQUES FOR CONSTRUCTING LR(k) PARSERS 627
ment set (P. Now suppose that T is a table in 3c associated with the set of
items ~ and that T' is the corresponding modified table in 31- Then the only
difference between T and T' is that T' may call for a reduction when T
announces an error. This will occur whenever u is in FOLLOW(A) and the
only items in ~ of the form [A ---, ~ . , v] have v ~ u. This follows from the
fact that because of rule (lb) in Algorithm 7.10, T' will call for a reduction
on all u such that u is FOLLOW(A) and [A --~ ct •] is an item in CORE(a).
We can now define a partition II -- [(B t, ( B 2 , . . . , 6~r} on 31 which groups
tables T1 and Tz in the same block if and only if h(T1) --h(T2). The fact
that h commutes with GOTO ensures that II will be a compatible partition.
Merging all tables in each block of this compatible partition using Algorithm
7.7 then produces 3.
Since Algorithms 7.7 and 7.8 each preserve the equivalence of a set of
tables, we have shown that 3, the set of LR(k) tables for G, is equivalent to
3c, the canonical set of LR(k) tables for G. [--]
Example 7.26
Consider the LR(1) grammar G with productions
(1) S ~ Aa
(2) S ~ dAb
628 TECHNIQUES FOR PARSER OPTIMIZATION CHAP. 7
(3) S ~ cb
(4) S ~ dca
(5) A-re
Even though L ( G ) consists only of four sentences, G is not an SLR(k) gram-
mar for any k :> 0. The canonical collection of sets of LR(0) items for G is
given in Fig. 7.38. The second components of the items have been omitted
~0: s' ~ .s
s ~ . A a [ . d A b [ .cb [ .dea
A ---. . c
~x : S ' ---> S .
¢gz : S ~ A .a
a,3: S---* d.Ab[d.ea
A--+.e
a~4 : S ---> c . b
A .--~ c.
a~6 : S ~ dA.b
~a0: S ~ dca.
and we have used the notation A --, al • fla i a~ • f121 "'" l a , " ft, as short-
hand for the n items [ A - - , a l . f l ~ ] , [ A ~ a s - f l 2 ] , . . . , [ A ~ a , . f l , ] .
There are two sets of items that are inconsistent, 64 a n d 67. Moreover,
since FOLLOW(A) -- {a, b}, Algorithm 7.10 will not produce unique parsing
actions from 64 and 67 on the lookaheads b and a, respectively.
However, let us examine the GOTO function on the sets of items as
graphically shown in Fig. 7.39.? We see that the only way to get to 64 from
6 0 is to have e on the pushdown list. If we reduce e to A, from the produc-
tions of the grammar, we see that a is the only symbol that can then follow
A. Thus, 7'4, the table constructed from 64, would have the following unique
parsing actions"
c d
tNote that this graph is acyclic but that, in general, a goto graph has cycles.
SEC. 7.4 TECHNIQUESFOR CONSTRUCTINGLR(k) PARSERS 629
S a c
Similarly, from the GOTO graph we see that the only way to get to ~7 from
a0 is to have de on the pushdown list. In this context if c is reduced to A,
the only symbol that can then legitimately follow A is b. Thus, the parsing
actions for TT, the table constructed from a7, would be
a b c d
The remaining LR(1) tables for G can be constructed using Algorithm 7.10
directly. [-]
Example 7.27
Consider the LR(1) grammar G with productions
(1) S ~ Aa
(2) S--~ dAb
(3) S ~ Bb
(4) S -~ dBa
(5) A - ~ c
(6) B - - - ~ c
This grammar is quite similar to the one in Example 7.26, but it is not an
L A L R grammar. The canonical collection of sets of LR(0) items for the
augmented grammar is shown in Fig. 7.40. The set of items et s is inconsistent
because we do not know whether we should reduce by production A --~ e
or B ~ e. Since FOLLOW(A) -- FOLLOW(B) = {a, b}, using these sets
as lookaheads will not resolve this ambiguity. Thus G is not SLR(I).
124: S ~ d.Ab
S ~ d.Ba
A .---~ . c
B ---~ . c
S A c
b A B
t
T~" reduce 6 reduce 5 error error
TJt" reduce 5 reduce 6 error error
Example 7.28
Let Go be the usual grammar and let N' = {E, T} be the splitting nonter-
minals. Then P consists of
T > ~'* F I F
F > (/~) [a
merge certain sets of items having common cores. The similarity to the SLR
algorithm should be apparent. In fact, we shall see that the SLR algorithm
is really a grammar-splitting algorithm with N' = N.
The complete technique can be summarized as follows.
(1) Given a grammar G = (N, E, P, S), we ascertain a suitable splitting
set of nonterminals N' ~ N. We include S in N'. This set should be large
enough so that the component grammars are small, and we can readily
construct sets of LR(1) tables for each component. At the same time, the
number of components should not be so large that the method will fail to
produce a set of tables. (This comment applies only to non-SLR grammars.
If the grammar is SLR, any choice for N ' will work, and choosing N ----N'
yields the smallest set of tables.)
(2) Having chosen N', we compute the component grammars using
Algorithm 7.11.
(3) Using Algorithm 7.12, below, we compute the sets of LR(1) items for
each component grammar.
(4) Then, using Algorithm 7.13, we combine the component sets of items
into S, a collection of sets of items for the original grammar. This process
may not always yield a collection of consistent sets of items for the original
grammar. However, if $ is consistent, we then construct a set of LR(1)
tables from 8 in the usual manner.
ALGORITHM 7.12
Construction of sets of LR(1) items for the component grammars of
a given grammar.
Input. A grammar G----(N, E, P, S), a subset N' of N, with S ~ N',
and the component grammars GA for each A ~ N'.
Output. Sets of LR(1) items for each component grammar.
Method. For notational convenience let N' = [Sx, Sz,. • •, am}. We shall
denote Gs, as Gi.
If O, is a set of LR(1) items, we compute O0', the closure of ~ with respect
to GA, in a manner similar, but not identical, to Algorithm 5.8. a ' is defined
as follows:
(i) Oo ~ O~'. (That is, all-items in 6t are in ~2'.)
(2) If [B --~ ~ • Cfl, u] is in a,' and C --~ 7 is a production in G~, then
[C ~ • 7, v] is in a ' for all v in FIRST~(fl'u), where fl' is fl with each symbol
in N replaced by the original symbol in N.
Thus all lookahead strings are in E*, while the first components of the
items reflect productions in GA.
For each Gi, we construct $i, the collection of sets of LR(1) items for G~,
as follows:
634 TECHNIQUESFOR PARSER OPTIMIZATION CHAP. 7
/[E--+ -£' + L + / ) / e ]
~o~. / [E---~ L + I)le]
[e--~ D.+~, + l)le]
[E---> ~., + I ) l e]
[E---> ~ + ' L + I)le]
a~- [E--> ~ -t- ~', ÷ / ) / e ]
Fig. 7.42 Sets of items for Gz.
SEC. 7.4 TECHNIQUES FOR CONSTRUCTING LR(k) PARSERS 635
a,0r"
{
[T----, . ~ . F ,
[T----, .F,
[F---, •(E),
[F---~-a,
+/./)/e]
+/./)/e]
+ / • / ) / e]
÷/./)/e]
aT: [T--~ f-*F, + l * l ) l e ]
t~" [T--> F., +l./)/e]
a~'" [F--~ (./~), +/*/)/el
a~- [F---~ a., + / • / ) / e]
f[T---~ ~'*.F, ÷ / , / ) / e ]
~sr" I[F--~'(E)' + / , / ) / e ]
,[F---, .a, + / • / ) / e]
a~': [F~(E.), +/,/)/e]
eta': [T----~f , F . , + / , / ) / e ]
at: [F--~(E)., +/,/)/e]
a set of LR(1) tables for the original grammar, provided that certain condi-
tions hold.
ALGORITHM 7.13
Construction of a set of LR(1) tables from the sets of LR(1) items of
component grammars.
lnput. A C F G G = (N, E, P, $1), a splitting set N' = ($1, $2,. • . , Sin},
and a collection {6g, 6 ] , . . . , 6~,} of sets of LR(1) items for each component
grammar Gi.
Output. A valid set of LR(1) tables for G or an indication that the sets of
items will not yield a valid set of tables.
Method.
(1) In the first component of each item, replace each symbol of the form
,Si by S~. Each such S~ is in N'. Retain the original name for each set of items
so altered.
(2) Let ~'0 = [[S'1 ~ • $I, e]}. Apply the following augmenting operation
to a0, and call the resulting set of items ~'0- a0 will then be the "initial" set
of items in the merged collection of sets of items.
Augmenting Operation. If a set of items 6 contains an item with a first
c o m p o n e n t of the form A ~ ~ • Bfl and B *~
G
Sit, for some S i in N', y in
(N U ~)*, then add 6~ to 6. Repeat this process until no new sets of items
can be added to 6.
(3) We shall now construct ,~, the collection of sets of items accessible
from tt o. Initially, let S = (tt0). We then perform step (4) until no new sets
of items can be added to 8.
636 TECHNIQUES FOR PARSER OPTIMIZATION CHAP. 7
Example 7.30
Let us apply Algorithm 7.13 to the sets of items in Figs. 7.42 and 7.43.
The effect of step (1) should be obvious. Step (2) first creates the set of items
ao = { [ E ' - - , • E , e]}, and after applying the augmenting operation, a0 =
{[E' --~ • E, el} U ago U ao~.
At the beginning of step (3), $ = [a0}. Applying step (4), we first compute
That is, GOTO({[E' --~ • E, el}, E) = [[E'--, E . , e]} and GOTO(ag, E) = af.
GOTO(ff,~, E) is empty. The augmenting operation does not enlarge g,.
We then compute g2 = GOTO(go, T) = ff,f Y ~'. The augmenting operation
does not enlarge a 2. Continuing in this fashion, we obtain the following col-
lection of sets of items for ,~"
a4 = ~
a~ = ~ u ~ u ~
~rs = elf u e~
tThe GOTO function for G],, is meant here. However if X is splitting nonterminal then
use X in place of X.
:l:The K honors A. J. Korenjak, inventor of the method being described.
SEC. 7.4 TECHNIQUES FOR CONSTRUCTING LR(k) PARSERS 637
alo = a,~
atx : ~
action goto
a + * ( ) e E T F a + * ( )
To s X x s x x T1 T2 T3 T4 X xTsx
T1 X S X X X A X X X X r6 X X X
T2 X 2 S X 2 2 X X X X X T7 X X
T3 X 4 4 X 4 4 X X X X X X X X
T4 X 6 6 X 6 6 X X X X X X X X
T5 S X X S X X T8 T2 T3 T4 X X Ts X
T6 S X X S X X X T9 T3 T4 X X T5 X
T7 S X X S X X X X TlO T4 X X T5 X
T8 X S X X S X X X X X 76 X X Ti1
T9 X 1 S X 1 1 X X X X X T7 X X
Tlo X 3 3 X 3 3 X X X X X X X X
Tll X 5 5 X 5 5 X X X X X X X X
We shall now show that this approach yields a set of LR(1) tables that is
equivalent to the canonical set of LR(1) tables. We begin by characterizing
the merged collection of sets of items generated in step (4) of Algorithm 7.13.
DEFINITION
[S' --~ S -, e] is in ~ can be handled easily, and we shall omit these details
here. Suppose that [A---~ ~ . p, a] is in a = KGOTO(a0, 7"X). There are
three ways in which [.4 ----~ t~ • fl, a] can be added to a.
Case 1" Suppose that [A ~ tx- ,8, a] is in GOTO(a/~', X) for some p,
a : a ' X and that [A --~ a' • Xfl, a] is in a[,". Then [A --~ a' . Xfl, a] is quasi-
valid for y', and it follows that [A --~ a • fl, a] is quasi-valid for ~,.
Case 2." Suppose that [A --~ a • fl, a] is in GOTO(a/,", X) and that a = e.
Then there is an item [B ---, $1X" COz, b] in GOTO(~/,", X), and C *=, A w,
rm
where a ~ FIRST(wO2b). Then [ B - - , ~ . XCO2, b] is quasi-valid for ?',
and [B ~ ~ i X . Cd~2, b] is quasi-valid for ?. If a is the first symbol of w or
w ---- e and a comes from ~2, then [A --~ a • fl, a] is valid for ~,. Likewise, if
[B --~ O l X . C~2, b] is valid for ),, so is [A - ~ a • fl, a].
Thus, suppose that w-----e, Oz =~ e, a = b, and [B---. O l X " C~2, b] is
quasi-valid, but not valid for ?. Then there is a derivation
quasi-valid for y, then it is in g. We omit the easier case, where the item is
actually valid for ?. Let us assume that [A ~ e • fl, a] is quasi-valid, but not
valid. Then there is a derivation
the item [A ~ ~ • fl, a] is added to g, either in the modified closure of the set
containing [B ~ • e, b] or in a subsequent augmenting operation. D
THEOREM 7.10
Let (N, E, P, S) be a CFG. Let (3, To) be the set of LR(1) tables for G
generated by Algorithm 7.13. Let (3 o To) be the canonical set. Then the two
sets of tables are equivalent.
Proof We observe by Lemma 7.3 that the table of 3 associated with string
y agrees in action with the table of 3c associated with ~, wherever the latter is
not error. Thus, if the two sets are inequivalent, we can find an input w such
that (To, w) ~ (TcX~T~ ... XmTm, x)t using 3~, and an error is then declared,
while
using 5.
"l'We have omitted the output field in these configurations for simplicity.
-~ 7 i~ - ~ :
640 TECHNIQUES FOR PARSER OPTIMIZATION CHAP. 7
Suppose that table U, is constructed from the set of items ~. Then ~ has
some member [A ~ a - fl, b] such that ,8 ~ e and a is in EFF(fl). Since
[A --~ a • fl, b] is quasi-valid for Y1 "'" Y,, by Lemma 7.3, there is a deri-
vation S' *~
rm
7'Ay =~
rm
Taffy for some y, where 7'a = Yi "'" Y,. Since we have
derivation Yi . . . Y,, *~
rm
X~ . . . Xm, it follows that there is an item
[B --~ ~ • e, c] valid for X~ . . . Xm, where a is in EFF(ec). (The case e = e,
where a = e, is not ruled out. In fact, it will occur whenever the sequence of
steps
It follows from Lemma 7.3 that the SLR sets of items are the same as those
generated by Algorithm 7.13. [~]
Example 7.31
Consider the following LR(1) grammar"
(1) S - - , Aa
(2) S--~ dab
(3) s ~ cb
SEC. 7.4 TECHNIQUES FOR CONSTRUCTING LR(k) PARSERS 641
(4) S---, BB
(5) A ~c
(6) B---, Bc
(7) B--, b
action goto
a b c d e S A B a b c d
To X S S S X T1 T2 T3 X T4 T5 T6
T~ X X X X A X X X X X X X
T2 S X X X X x x x TT x x x
T3 X S S X X x x T8 X T4 T9 X
T4 X 7 7 X 7 X X X X X X X
T5 5 S X X X X X X X Tlo X X
T6 X X S X X X Tll. X X X T12 X
V7 X X X X 1 X X X X X X X
T8 X X S X 4 x x x x X Tg X
r9 X 6 6 X 6 X X X X X X X
T~o X X X X 3 X X X X X X X
X S X X X X X X X Tt3 X X
r~2 X 5 X X X X X X X X X X
T~3 X X X X 2 X X X X X X X
EXERCISES
"7.4.1. Consider the class {G1, G2 . . . . } of LR(0) grammars, where G. has the
following productions"
S ~Ai l~i<n
A~- >aiA i l <i~j<n
A~ > aiBi l bi 1 < i _~ n
Bi ~ ajBt l bi 1 ~ i, j < n
Show that the number of tables in the canonical set of LR(0) tables
for G. is exponential in n.
7.4.2. Show that each grammar in Exercise 7.3.1 is SLR(1).
7.4.3. Show that every LR(0) grammar is an SLR(0) grammar,
*7.4.4. Show that every SMSP grammar is an SLR(1) grammar.
7.4.5. Show that the grammar in Example 7.26 is not SLR(k) for any k _> 0.
*7.4.6. Show that every LL(1) grammar is an SLR(1) grammar. Is every LL(2)
grammar an SLR(2) grammar ?
7.4.7. Using Algorithm 7.10, construct a parser for each grammar in Exercise
7.3.1.
7.4.8. Let ,~c be the canonical collection of sets of LR(k) items for G. Let So
be the sets of LR(0) items for G. Show that ,~c and 80 have the same
sets of cores. Hint: Let ~ = GOTO(~0, ~), where ~0 is the initial set
of So, and proceed by induction on l~ I.
7.4.9. Show that CORE(GOTO(tY, ~)) = GOTO(CORE(~), 06), where ~ ~ Sc
as above.
DEFINITION
A grammar G = (N, E , P , S) is said to be lookahead LR(k)
[LALR(k)] if the following algorithm succeeds in producing LR(k)
tables:
(1) Construct So, the canonical collection of sets of LR(k) items
for G.
(2) For each tY ~ ~c, let (Y' be the union of those (B ~ ~ such
that CORE((B) = CORE(G).
(3) Let S be the set of those ~ ' constructed in step (2). Construct
a set of LR(k) tables from S in the usual manner.
7.4.10. Show that if G is SLR(k), then it is LALR(k).
EXERCISES 643
7.4.11. Show that the LALR table-constructing algorithm above yields a set of
tables equivalent to the canonical set.
7.4.12. (a) Show that the grammar in Example 7.26 is LALR(1).
(b) Show that the grammar in Example 7.27 is not LALR(k) for any k.
7.4.13. Let G be defined by
S----~L= RIR
L----~ * RI a
R -----~ L
Research Problems
7.4.27. F i n d additional ways of constructing small sets of LR(k) tables for
LR(k) grammars without resorting to the detailed transformations of
Section 7.3. Your methods need not work for all LR(k) grammars but
should be applicable to at least some of the practically important
grammars, such as those listed in the Appendix of VolUme 1.
7.4.28. When an LR(k) parser enters an error configuration, in practice we
would call an error recovery routine that modifies the input and the
pushdown list so that normal parsing can resume. One method of
modifying the error configuration of an LR(1) parser is to search
forward on the input tape until we find one of certain input symbols.
Once such an input symbol a has been found, we look down into the
pushdown list for a table T : (f, g~ such that T was constructed from
a set of items ~ containing an item of the form [A ---, • 0c, a], A ~ S.
The errOr recovery procedure is to delete all input symbols up to a
a n d to remove all symbols and tables above T on the pushdown list.
The nonterminal A is then placed on top of the pushdown list and
table g(A) is placed on top of A. Because of Exercise 7.3.8(c),
g(A) ~ error. The effect of this error recovery action is to assume that
the grammar symbols above T on the pushdown list together with the
input symbols up to a form an instance of A. Evaluate the effectiveness
of this error recovery procedure, either empirically or theoretically.
A reasonable criterion of effectiveness is the likelihood of properly
correcting the errors chosen from a set of "likely" programmer errors.
7.4.29. When a grammar is split, the component grammars can be parsed in
different ways. Investigate ways to combine various types of parsers for
the components. In particular, is it possible to parse one component
bottom up and another top down ?
SEC. 7.5 PARSING AUTOMATA 645
Programming Exercises
7.4.30. Write a program that constructs the SLR(1) set of tables from an
SLR(I) grammar.
7.4.31. Write a program that finds all inaccessible error entries in a set of
LR(1) tables.
7.4.32. Write a program to construct an LALR(1) parser from an LALR(1)
grammar.
7.4.33. Construct an SLR(1) parser with error recovery for one of the gram-
mars in the Appendix of Volume I.
BIBLIOGRAPHIC NOTES
Simple LR(k) grammars and LALR(k) grammars were first studied by DeRemer
[1969, 1971]. The technique of constructing the canonical set of LR(0) items for a
grammar and then using lookahead to resolve parsing decision ambiguities was
also advocated by DeRemer. The grammar-splitting approach to LR parser design
was put forward by Korenjak [1969].
Exercise 7.4.1 is from Earley [1968]. The error recovery procedure in Exercise
7.4.28 was suggested by Leinius [1970]. Exercise 7.4.26 is from Aho and Ullman
[1972d].
7.5. PARSING A U T O M A T A
tWe remove loci -- 1 symbols, rather than t~1 symbols because the table corresponding
to the rightmost symbol of 0cis in control and does not appear on the list.
SEC. 7.5 PARSING AUTOMATA 647
that table does not appear but is indicated by the fact that program control
lies with that table.
We shall now define the parsing automaton which executes these parsing
actions directly. There is a state of the automaton for each state (i.e., table)
in the above sense. The input symbols for the automaton are the terminals
of the grammar and the state names themselves. A shift state makes tran-
sitions only on terminals, and a reduce state makes transitions only on state
names. I n fact, a state T calling for a reduction according to production
A ~ ~ need have a transition specified for state T' only when T is in
GOTO(T', ~).
We should remember that this automaton is more than a finite automaton,
in that the states have side effects on a pushdown list. That is, each time
a state transition occurs, something happens to the pushdown list which is
not reflected in the finite automaton model of the system. Nevertheless, we
can reduce the number of states of the parsing automaton in a manner similar
to Algorithm 2.2. The difference in this case is that we must be sure that all
subsequent side effects are the same if two states are to be placed in the same
equivalence class. We now give a formal definition of a parsing automaton.
DEFINITION
Let G = (N, Z, P, S) be an LR(0) grammar and (3, To) its set of LR(0)
tables. We define an incompletely specified automaton M, called the canonical
parsing automaton for G. M is a 5-tuple (3, Z U 3 U {$}, ~, To, [T~}), where
(1) 3 is the set of states.
(2) E U 3 is the set of possible input symbols. The symbols in Z are on
the input tape, and those i n 3 are on the pushdown list. Thus, 3 is both the
set of states and a subset of the inputs to the parsing automaton.
(3) ~ is a mapping from 3 × (Z U 3) to 3. c~is defined as follows:
(a) If T ~ 3 is a shift state, 6(T, a ) = GOTO(T, a) for all a ~ 2~.
(b) If T ~ 3 is a reduce state calling for a reduction by production
A ~ ~x and if T' is in GOTO-I(T, tz) [i.e., T = GOTO(T', tz)],
then c~(T, T') = GOTO(T', A).
(c) O(T, X) is undefined otherwise.
The canonical parsing automaton is a finite transducer with side effects
on a pushdown list. Its behavior can be described in terms of configurations
consisting of 4-tuples of the form (~t, T, w, n), where
(1) 0c represents the contents of the pushdown list (with the topmost
symbol on the right).
(2) Z is the governing state.
(3) w is the unexpended input.
(4) n is the output string to this point.
Moves can be reflected by a relation ~- on configurations. If T is a shift
648 TECHNIQUES FOR PARSER OPTIMIZATION CHAP. 7
state and J(T, a) = T', we write (oc, T, aw, n) ~ (ocT, T', w, n). If T calls for
a reduction according to the ith production A ~ ~, and J(T, T') ---- T", we
write (ocT'fl, T, w, n) ~ (ocT', T", w, hi) for all fl of length I~, [ -- 1. If l~'[----0,
then (~z, T, w, n) ~ (aT, T", w, hi). In this case, T and T' are the same. Note
that if we had included the controlling state symbol as the top symbol of
the pushdown list, we would have the configurations of the usual LR parsing
algorithm.
We define ~--., I--, and ~ in the usual fashion. An initial configuration is
one of the form (e, To, w, e), and an accepting configuration is one of the form
(To, T1, e, z0. If (e, To, w, e)]--- (To, T1, e, n), then we say that n is the parse
produced by M for w.
Example 7.32
Let us consider the LR(0) grammar G
(l) S--~ aA
(2) S--~ aB
(3) A --~ bA
(4) A --* c
(5) B - - , bB
(6) B--~ d
generating the regular set ab*(c + d). The ten LR(0) tables for G are listed
in Fig. 7.46.
action goto
e S A B a b c d
To T1 X X T2 X X X
T1 X X X X X X X
T2 x T3 T4 X T5 T, T7
T3 X X X X X X X
T4 X X X X X X X
T5 x T8 T9 X T5 T, T7
T6 X X X X X X X
T7 X X X X X X X
T8 X X X X X X X
T9 X X X X X X X
To, T2, and T5 are shift states. Thus, we have a parsing automaton with
the following shift rules"
6(To, a) = T~
,~(T~, b) = T~
6(T~, c) = T~
6(T~, d) = r~
6(T~, b) = T~
6(T~, c) = T~
6(T5, d) = T 7
We compute the transition rules for the reduce states T3, T4, T6, T7, Ts and
Tg. Table T3 reduces using production S ~ aA. Since GOTO-I(T3, aA)={To}
and GOTO(To, S ) = TI, ~(T3, To) = T1 is the only rule for T3. Table T7
reduces by B ---~ d. Since GOTO-I(TT, d) = [T2, Ts}, GOTO(T2, B) = T4, and
GOTO(Ts, B) = Tg, the rules for T7 are ~(TT, T2) = T4 and ~6(T7, Ts) = Tg.
The reduce rules are summarized below"
~(T3, To) : T 1
~(Z,, To) : T,
~(T6, T2) : T3 ~(T6, Ts) : Ts
6(T~, T9 = T, ~(T7, Ts) : T9
~(Ts, T 9 = Z~ ~(T8, Ts) : Ts
6(T~, T9 : T~ ~(Tg, Ts) : T9
Thus, the parsing automaton produces the parse 431 for the input abc.
D
650 TECHNIQUESFOR PARSER OPTIMIZATION CHAP. 7
start
¢ c
T7
Ts
T5 T8 T 9 ~ T5
/,o
the state name now on top of the pushdown list, say U, and then transfers
control to GOTO(U, A).
In the transition graph we can replace a reduce state T by a pop state,
which we shall also call T, and an interrogation state, which we shall call T'.
All edges entering the old T still enter the new T, but all edges leaving the old
T now leave T'. One unlabeled edge goes from the new T to T'. This trans-
formation is sketched in Fig. 7.48, where production i is A--~ 0~.
Shift states and the accept state will not be split here.
T Pop
State
Interrogation
State
Old reduce state
Example 7.33
The split parsing automaton from Fig. 7.47 is shown in Fig. 7.49. We
show shift states by [], pop states by A , and interrogation and accept states
by (D.
To compare the behavior of this split automaton with the automaton in
Example 7.32, consider the sequence of moves the split automaton makes on
input abc :
(ToTzTs, Ts, e, 4)
I--- (TOT2, T~, e, 43)
J-- (TOT2, T3, e, 43)
(To, T~, e, 431)
l- (To, T~, e, 431)
D
start
b °
L I a¸ (
s
the same number of input symbols. Thus, we are using the same definition
of equivalence as for two sets of LR(k) tables.
If the canonical and split canonical automata are run side by side, then
it is easy to see that the resulting pushdown lists are the same each time
the split automaton enters a shift or interrogation state. Thus, it should be
evident that these two automata are equivalent.
There are two kinds of simplifications that can be made to split parsing
automata. The first is to eliminate certain states completely if their actions
are not needed. The second is to merge states which are indistinguishable.
The first kind of simplification eliminates certain interrogation states.
If an interrogation state has out-degree 1, it may be removed. The pop
state connected to it will be connected directly, by an unlabeled edge, to
the state to which the interrogation state was connected.
We call the automaton constructed from a split canonical automaton
by applying this simplification a semireduced automaton.
Example 7.34
Let us consider the split parsing automaton of Fig. 7.49. T~ and T~ have
only one transition, on To. Applying our transformation, these states and
the To transitions are eliminated. The resulting semireduced automaton is
shown in Fig. 7.50.
THEOREM 7.12
A split canonical parsing automaton M1 and its semireduced automaton
M~ are equivalent.
Proof. An interrogation state does not change the symbols appearing on
the pushdown list. Moreover, if an interrogation state T has only one transi-
tion, then the state labeling that transition must appear at the top of the push-
down list whenever M1 enters state T. This follows from the definition of
the GOTO function and of the canonical automaton. Thus, the first trans-
formation does not affect any sequence of stack, input or output moves
made by the automaton. [Z]
start
T2 - 7
T5
DEFINITION
Let M1 = (Q, E, J, q0, {ql}) be a semireduced automaton. Let Q' be
the set of equivalence classes of -- and let [q] stand for the equivalence class
656 T E C H N I Q U E S FOR P A R S E R O P T I M I Z A T I O N CI-IAP. 7
containing q. The reduced automaton for M1 is M2 -- (Q', E', 6', [q0], {[ql]}),
where
(1) E ' - - ( Z - Q) u Q';
(2) For all q ~ Q and a ~ E U {e} -- Q, 6'([q], a) -- [~(q, a)]; and
(3) For all q and p ~ Q, 5'([q], [p]) = [~(q, p)].
Example 7.36
In Example 7.35 we found that T~ ~ T~ and T~ ~ T~. The transition
graph of the reduced automaton for Fig. 7.50 is shown in Fig. 7.51. T~ and
T~r have been chosen as representatives for the two equivalence classes with
more than one member. [~]
start -
a b
/
7
T1
reduced automata are equivalent. Essentially, the two automata will always
make similar sequences of moves. The reduced automaton enters a state
representing the equivalence class of each state entered by the semireduced
automaton. We state the correspondence formally as follows.
THEOREM 7.13
Let Mi be a semireduced automaton and Mz the corresponding reduced
automaton. Then for all i > 0, there exist T , , . . . , T~, T such that
if and only if
where To is the initial state of Mi and [u] denotes the equivalence class of
state u.
Proof. Elementary induction on i. [~
COROLLARY
list operations can be isolated with the hope of merging common operations.
Here, we shall consider the following state-splitting scheme.
Let Tbe a state corresponding to table T = ( f , g). We split this state into
read, push, pop, and interrogation states as follows"
(1) w e create a read state labeled T which reads the next input symbol.
Read states are indicated by G.
(2) If f(a) = shift, we then create a push state labeled T" and draw an edge
with label a from the read state T to this push state. If g(a) = T', we then
draw an unlabeled edge from T" to the read state labeled T'. The push state
T" has two side effects. The input symbol a is removed from the input, and
the table name T is pushed on top of the pushdown list. We indicate push
states by ~ .
(3) If f(a) = reduce i, then we create a pop state T1 and an interrogation
state T2. A n edge labeled a is drawn from T to T1. If production i is A ~ a,
then the action of state T~ is to remove l a[ -- 1 symbols from the top of the
pushdown list and to emit the production number i. If a = e, then T1 places
the original table name T on the pushdown list. Pop states are indicated by ~..
An unlabeled edge is then drawn from T1 to T2. The action of T2 is to examine
the symbol now on top of the pushdown list. If GOTO-~(T, a) contains T'
and GOTO(T', A) = T", then an edge labeled T' is drawn from T2 to the
read state of T". Interrogation states are also indicated by G. The labels on
the edges leaving distinguish these circles from read states.
Thus, state T would be represented as in Fig. 7.52 if f(a) = shift and
f(b) = reduce i.
Read State b
3'
7i Pop State
Example 7.37
(1) s---, A s
(2) A ---~ aAb
(3) A ---, e
(4) B---~ b B
(5) B----~b
A set of LR(1) tables for G is shown in Fig. 7.53.
action goto
a b e S A B a b
To S 3 X T1 T2 X T3 X
T~ X X A X X X X X
T2 X S X X X T4 x r5
T3 S 3 X x /'6 X T3 X
T4 X X 1 X X X X X
T5 X S 5 x X rT X rs
T6 X S X X X X X T8
r7 X X 4 X X X X X
r8 X 2 X X X X X X
Example 7.38
Let us consider the automaton of Fig. 7.54. There are three interrogation
states with out-degree 1, namely T~, T~, and T]. These states and the edges
leaving can all be eliminated.
660 TECHNIQUESFOR PARSER OPTIMIZATION CHAP. 7
Start
r3
T5
T2
T0
Next, let us consider read state T6. The only way T6 can be reached is via
the paths from T3 and T~ or T8 and T~. The previous input symbol label is b
in either case, meaning that if T3 or 7'8 see b on the input, they transfer to T~
or T~ for a reduction. The b remains on the input until 7'6 examines it and
decides to transfer to T b for a shift. Since we know that the b is there, 7'6 is
superfluous; T~ can push the state name 7'6 on the pushdown list without
looking at the next input symbol, since that input symbol must be b if the
automaton has reached state/'6.
SEC. 7.5 PARSINGAUTOMATA 661
Start
To
T3
entries for each LR(1) table could be stored as a list of pairs (A, T) meaning
on nonterminat A place T on top of the stack. The gotos on terminals could
be encoded in the shift statements themselves. Note that no p-entries would
have to be stored. The optimizations of Section 7.2 merging common se-
quences of statements would then be applicable.
These approaches to parser design have several practical advantages.
First, we can mechanically debug the resulting parser by generating input
strings that will check the behavior of the parser. F o r example, we can easily
construct input strings that cause each useful entry in an LL(1) or LR(1)
table to be exercised. Another advantage of LL(1) and LR(1) parsers,
especially the former, is that minor changes to the syntax or semantics can
be made by simply changing the appropriate entries in a parsing table.
Finally, the reader should be aware that certain ambiguous grammars
have " L L " o r " L R " parsers that are formed by resolving parsing action con-
flicts in an apparently arbitrary manner (see Exercise 7.5.14). Design of
parsers of this type warrants further research.
EXERCISES
DEFINITION
7.5.9. Show that a C F G is LR(k) if and only if each right-sentential form t~flw
such that S" *~
rm
~Aw ==~
rm
~flw has a characteristic string which may be
determined from only ~fl and FIRSTk(W).
7.5.10. Let G = (N, E, P, S) be an LR(0) grammar and (3, To) its canonical set
of LR(0) tables. Let M = (3, E U 3, ~, To, {T1}) be the canonical
parsing automaton for G. Let M ' = (3 U (qr}, E u E', 6', To, {qs]) be
the deterministic finite automaton constructed from M by letting
(1) ~'(T, a) -- ~(T, a) for all T in 3 and a in E.
(2) O'(T, :~,.) -- qs ifO(T, T') is defined and T' is areduce state calling
for a reduction using production i.
Show that L(M') is the set of characteristic strings for G.
7.5.11. Give an algorithm for merging "equivalent" states of the semireduced
automaton constructed for an LR(1) grammar, where equivalence is
taken to mean that the two states have transitions on the same set of
symbols and transitions on each symbol are to equivalent states.t
*%5.12. Suppose that we modify the definition of "equivalence" in Exercise
7.5.11 to admit the equivalence of states that transfer to equivalent states
on symbol a whenever both states have a transition on a. Is the resulting
automaton equivalent (in the formal sense, meaning one may not shift
if the other declares error) to the semireduced automation ?
7.5.13. Suppose a read state T of an LR(1) parsing automaton has all its tran-
sitions to pop states which reduce by the same production. Show that if
we delete T and merge all those pop states to one, the new automaton will
make the reduction independent of the lookahead, but will be equivalent
to the original automaton,
"7.5.14. Let G be the ambiguous grammar with productions
S -+ if b then SEIa
E --~ else S [ e
Research Problems
7.5.15. Apply the technique used here--breaking a parser into a large number
of active components and merging or eliminating some of t h e m - - t o
parsers other than LR ones. For example, the technique in Section 7.2
tThis definition can be made precise by defining relations 0=, __1_. . . . as in Lemma 7.2.
BIBLIOGRAPHIC NOTES 665
Programming Exercises
7.5.18. Design elementary operations that can be used to implement split
canonical parsing automata. Construct an interpreter for these elemen-
tary operations.
7.5.19. Write a program that will take a split canonical parsing automaton and
construct from it a sequence of elementary operations that simulates
the behavior of the parsing automaton.
7.5.20. Construct two LR(1) parsers for one of the grammars in the Appendix
of Volume 1. One LR(1) parser should be the interpretive LR(1) parser
working from a set of LR(1) tables. The other should be a sequence of
elementary operations simulating the parsing automaton. Compare the
size and speed of the parsers.
BIBLIOGRAPHIC NOTES
666
SEC. 8.1 THEORY OF LL LANGUAGES 667
The operator precedence languages are a proper subset of the simple pre-
cedence languages and incommensurate with the LL languages. In Chapter
5 we saw that the class of UI weak precedence languages is the same as the
simple precedence languages. Thus, we have the hierarchy of languages
shown in Fig. 8.1.
Deterministic
Context-free Languages
Simple
Precedence
LL
Operator
Precedence
In this chapter we shall derive this hierarchy and mention the most strik-
ing features of each class of languages. This chapter is organized into three
sections. In the first, we shall discuss LL languages and their properties.
In the second, we shall investigate the class of deterministic languages, and
in the third section we shall discuss the simple precedence and operator
precedence languages.
This chapter is the caviar of the book. However, it is not essential to
a strict diet of"theory of compiling," and it can be skipped on a first reading.'t
8.1. T H E O R Y OF LL L A N G U A G E S
Our first task is to prove that every LL(k) grammar is an LR(k) grammar.
This result can be intuited by the following argument. Consider the deriva-
tion tree sketched in Fig. 8.2.
S S
W X1 W X2
(a) (b)
LEMMA 8.1
•~Throughout this book we are assuming that a grammar has no useless productions.
670 THEORY OF DETERMINISTIC PARSING CHAP. 8
In these derivations, we can assume that i a n d j are both greater than zero.
Otherwise, we would have, for example,
S'-~ S'---> S
which would imply that G was left-recursive and hence that G was not LL.
Thus, for the remainder of this proof we can assume that we can replace S'
by S in derivations (8.1.1) and (8.1.2).
Let x,, xp, x v, and x ~ b e terminal strings derived from a, fl, ~,, and t~,
respectively, such that x, xpx2 = x~x~y. Consider the leftmost derivations
which correspond to the derivations
and
Specifically, let
(8.1.5) S ~l m x.Arl ~ x.flrl ~I m x.xprl ~I m x.xpx,
and
(8.1.6) S -Y-->
lm
x~BO --->
Im
x~OO --->
* x~x~O ::=,
lm
* x~x ~y
lm
Let us fix our attention on the parse tree T of derivation (8.1.7). Let nA
be the node corresponding to A in x~Arl and nn the node corresponding to
B in XyBO. These nodes are shown in Fig. 8.4. Note that nB may be a descen-
dant of nA. There cannot really be overlap between xp and x~. Either they
are disjoint or x~ is a subword of xp. We depict them this way merely to
imply either case.
Let us now consider two rightmost derivations associated with parse
tree T. In the first, T is expanded rightmost up to (and including) node nA;
in the second the parse tree is expanded up to node nB. The latter derivation
can be written as
s ~r m ~By ~1-m ~,,~y
SEC. 8.1 THEORY OF LL LANGUAGES 671
Xa X# X2
t , • 13 L l. T . JIL II J
x~ x6 Y
The grammar
S >A[B
A > aAb[O
B > aBbb[ 1
LR
Left Right
Parsable Parsable
LL
t I I is the sequence of interior nodes cff T in the order in which the nodes are expanded
in a leftmost derivation.
SEC. 8.1 THEORY OF LL LANGUAGES 673
We claim that each of the six classes of grammars depicted in Fig. 8.5 is
nonempty. We know that there exists an LL grammar, so we must s h o w
the following.
THEOREM 8.2
There exist grammars which are
(1) LR and left-parsable but not LL.
(2) LR but not left-parsable.
(3) Left- and right-parsable but not LR.
(4) Right-parsable but not LR or left-parsable.
(5) Left-parsable but not right-parsable.
Proof Each of the following grammars inhabits the appropriate region.
(1) The grammar Go with productions
S >AIB
A ~ aaA l aa
B > aaBla
S >AbIAc
A ~ ABla
B ~a
is LR(1) but not left-parsable. See Example 3.27 (p. 272 of Volume I).
(3) The grammar Go with productions
S- ~AblBc
A ~ Aala
B ~ Bala
S >AblBc
A ~ACla
B ~ BCIa
C- ~a
S >BAbICAc
A > BAIa
B >a
C >a
is left-parsable but not right-parsable. See Example 3.26 (p. 271, Volume I).
DEFINITION
ALGORITHM 8.1
Conversion to a nonnullable grammar,
lnput. C F G G = (N, E, P, S).
Output. A nonnullable context-free grammar G1 = (Na, ~E' P~, S~) such
that L(Gi) = L(G) - - [e}.
Method.
A: >BmX~ "'" X .
A >X~ ...X~
(c) In addition, if A is itself nullable, we also add to P' all the produc-
tions in (a) and (b) above with A instead of A on the left.
(4) If A ---~ e is in P, we add A ~ e to P'.
(5) Let G~ = (Na, Z, Pa, S~) be G ' = (N', Z, P', $1) with all useless sym-
bols and productions removed. D
Example 8.1
Let G be the LL(1) grammar with productions
S---->AB
A >aAle
B >bAle
S > ABIB
S > ABIB
676 THEORY OF DETERMINISTIC PARSING CHAP. 8
Since S is the new start symbol, we find that S is now inaccessible. The
final set of productions constructed by Algorithm 8.1 is
S > ABIB
A > aA
A > aAle
B > bA
B >bAle
The following theorem shows that Algorithm 8.1 preserves the LL(k)-
ness of a grammar.
THEOREM 8.3
If G1 is constructed from G by Algorithm 8.1, then
(1) L ( G , ) = L(G) -- {e}, a n d
(2) If G is LL(k), then so is G1.
Proof
(1) Straightforward induction on the length of strings shows that for all
A in N,
s, ~
Gt lm
w ~ Gt
- - .lm wp~ G~*-~
lm
wx
$1 t71
~ l m wAoc 17x
~ l m wToc ==~
(71 l m
wy
S ~G l m wAh(a) ===~
G lm
wfl'h(fl)h(a) *~ wh(fl)h(a) ~G l m wx
G lm
,
s ~ wAh(~) ~ w~'h(~)h(~) ~ wh(r)h(~) ~ wy
G lm G lm G Im G lm
lm wO~ - -Im
wfl'h(fl)h(~)=wO~ ==~ ~ ... lm ~. wO. = wh(,O)h(oO
DEFINITION
LEMMA 8.2
and
The following algorithm can be used to prove that every LL(k) language
without e has an LL(k + 1) grammar with no e-productions.
ALGORITHM 8.2
Elimination of e-productions from an LL(k) grammar.
Input. An LL(k) grammar Ga = (N~, E, P1, S1)-
Output. An LL(k + 1) grammar G = (N, Z, P, S) such that L ( G ) =
L(G,) -- {e}.
Method.
(1) First apply Algorithm 8.1 to obtain a nonnullable LL(k) grammar
G 2 = (N 2, Z, P2, $2).
(2) Eliminate from G2 each nonterminal A that derives only the empty
string by deleting A from the right-hand sides of productions in which it
appears and then deleting all A-productions. Let the resulting grammar be
G 3 : ( N 3, E, P3, $2)"
(3) Construct grammar G = (N, Z, P, S) as follows:
(a) Let N be the set of symbols [Xa] such that
(i) X is a nonnullable symbol of G3,
(ii) a is a string of nullable symbols,
SEC. 8.1 THEORY OF LL LANGUAGES 679
Example 8.2
Let us consider the g r a m m a r of Example 8.1. Algorithm 8.1 has already
been applied in that example. Step (2) of Algorithm 8.2 does not affect the
grammar. We shall generate the productions of g r a m m a r G as needed to
assure that each nonterminal involved appears in some left-sentential form.
The start symbol is [S]. There are two S-productions, with right-hand sides
A-B and/~. Since A a n d / ~ are nonnullable but B is nullable, g-a(A-B) = [A-B]
and g-a(B-) -- [/~]. Thus, by rule (i) of step (3c), we have productions
[Sl -~ ~ [ABll[B]
[B]-- ~ [bA]
We now apply rules (ii) and (iii) to the nonterminal [aAB]. There is one
non-e-production for A and one for B. Since g - ~ ( a A B ) - - [ a A B ] , we add
[aAB] ~ a[bA]
680 T H E O R Y OF D E T E R M I N I S T I C P A R S I N G CHAP. 8
[(tAB], >a
THEOREM 8.4
S~ w a s - - ~ wp~ * WX
G lm G lm G lm
and
S ~G lm wAoc ~G lm wyt~ ==~
G lm
wy
If 6 begins with a terminal, say ~ - - a 6 ' , case (2b) or (2c) must apply.
The two derivations in G3 replace a certain prefix of ~ by e, followed by
the application of a non-e-production in case (2b). It is easy to argue that
fl -- 3, in either case.
We shall now prove that every LL(k) language has an LL(k + 1) grammar
in Greibach normal form (GNF). This theorem has several important appli-
cations and will be used as a tool to derive other results. Two preliminary
lemmas are needed.
LEMMA 8.3
No LL(k) grammar is left-recursive.
Proof Suppose that G = (N, E, P, S) has a left-recursive nonterminal A.
+
Then there is a derivation A =~ A~. If ~ ~ e, then it is easy to show that G
is ambiguous and hence cannot be LL. Thus, assume that ~ *~ v for some
v ~ l~+. We can further assume that A *~ u for some u ~ E* and that there
exists a derivation
S =L~
lm
wA,~~l m wAock,5 l*-~
m
wuvkx
LEMMA 8 . 4
ALGORITHM 8.3
THEOREM 8.5
Let G 1 = (N1, I2, P,, $1) and G z = (N2, E, P2, $2) be LL(k) grammars
in G N F such that L ( G , ) = L(G2). Then there is a constant p, depending
on G 1 and G 2, with the following property. Suppose that Si ---> wa ---> wx
Gt l m Gt l m
and that $2 G2
~ l m w fl G2 *I m wy, where a and fl are the open portions of w0~and
===>
wfl, and FIRSTk(x) = FIRSTk(y). Then ITHO'(a) -- THO,(fl) ! ~ p.t
P r o o f Let t be the maximum of THO'(y) or TH"'(?) such that ? is a right-
hand side of a production in P1 or P2, respectively. Let p = t(k ÷ 1), and
suppose in contradiction that
that a =~- G~
zu, because by (8.1.11) THO,(a, z) > Jzu [. Since G1 is LL(k), if
there is any leftmost derivation of wzu in G~, it begins with the derivation
$1 Gt
- - l~m wa. Thus, wzu is not in L(G1) , contradicting the assumption that
L(G1) = L(G2). We conclude that THO,(a) -- THO,(fl) < p = t(k + 1). The
case THO~(fl) -- THO,(e) > p is handled symmetrically. ~
LEMMA 8.6
It is decidable, for DPDA P, whether P accepts all strings over its input
alphabet.
Proof By Theorem 2.23, L(P), the complement of L(P), is a deterministic
language and hence a CFL. Moreover, we can effectively construct a CFG
G such that L(G) = L(P). Algorithm 2.7 can be used to test if L(G) = ~ .
Thus, we can determine whether P accepts all strings over its input alphabet.
It is decidable, for two LL(k) grammars G 1 = (N1, El, P1, $1) and
G2 = (N2, Z2, e2, $2), whether L(G~) = L ( G 2 ) .
Proof. We first construct, by Algorithm 8.3, G N F grammars G'~ and G~
equivalent to G 1 and Gz, respectively (except possibly for the empty string,
SEC. 8.1 THEORY OF LL LANGUAGES 685
i w ! i
Parser
case the open portions of the two current left-sentential forms, together with
some extra information appended to the nonterminals to guide the parsing.
We can thus think of the stack contents as consisting of symbols of G'i and
G~. The extra information is carried along automatically.
Note that the two open portions may not take the same amount of space.
However, since we can bound from above the difference in their thicknesses,
then, whenever L(G1) = L(G2), we know that P can simulate both derivations
by reading and writing a fixed distance down its pushdown list. Since G'I
and G~ are in GNF, P alternately simulates one step of the derivation in G'i,
686 THEORY OF DETERMINISTIC PARSING CHAP. 8
one in G~, and then moves its input head one position. If one parse reaches
an error condition, the simulation of the parse in the remaining grammar
continues until it reaches an error or accepting configuration.
It is necessary only to explain how the two open portions can be placed
so that they have approximately the same length, on the assumption that
L(G1) = L(G2). By Lemma 8.5, there is a constant p such that the thicknesses
of the two open portions, resulting from processing the prefix of any input
string, do not differ by more than p.
For each grammar symbol of thickness t, P will reserve t cells of the
appropriate track of its pushdown list, placing the symbol on one of them.
Since G'x and G~ are in G N F , there are no nullable symbols in either grammar,
and so t > 1 in each case. Since the two strings ~ and fl of Fig. 8.6 differ in
thickness by at most p, their representations on P's pushdown list differ in
length by at most p cells.
To complete the proof, we design P to reject its input if the two open
portions on its pushdown list ever have thicknesses differing by more than
p symbols. By Lemma 8.5, L ( G ~ ) ~ L(G2) in this case. Also, should the
thicknesses never differ by more than p, P accepts its input if and only if it
finds a parse of that input in both G'i and G~ or in neither of G'~ and G~.
Thus, P accepts all strings over its input alphabet if and only if L(Gi) =
L(G2). E]
We shall show that for all k > 0 the LL(k) languages are a proper subset
of the LL(k + 1) languages. As we shall see, this situation is in direct con-
trast to the situation for LR languages, where for each LR language we can
find an LR(1) grammar.
Consider the sequence of languages L1, L 2 , . . , L k , . . . , where
S > aT
T >S A I A
A= > bB[c
B > bk-ld[ e
We now show that every LL(k) grammar for L k must contain at least one
e-production.
SEC. 8.1 THEORY OF LL LANGUAGES 687
LEMMA 8.7
Lk is not generated by any LL(k) grammar without e-productions.
P r o o f Assume the contrary. Then we may, by steps (2)-(4) of Algorithm
8.3, find an LL(k) grammar, G - - - ( N , {a, b, c, d}, P, S), in G N F such that
L(G)--L k. We shall now proceed to show that any such grammar must
generate sentences not in L k.
Consider the sequence of strings ~, i - - 1, 2 , . . . , such that
S==~ ==~
lm lm
for some J. Since G is LL(k) and in G N F , ~t is unique for each i. For if not,
let ~ = ~j. Then it is easy to show that a~+kbj+k is in L(G), which is contrary
to assumptions if i ~ j. Thus, we can find i such that I~,1>_2 k - - 1.
Pick a value of i such that 0ct = / 3 B y for some fl and ? in N* and B ~ N
such that lPt and 17,1 are at least k ~ 1. Since G is LL(k), the derivation of
the sentence a~+k-~b ~+~'-~ is of the form
i+k-- 1 =j+l+m.
The existence of the latter derivation follows from the fact that the sentence
at+k-~bJ+Z+k-~db r" agrees with a~+k-lb i+k-I for (i + k -- 1) + (j + 1 + k - - 1)
symbols. Thus, ? =~
* bk-ldb ~.
Putting these partial derivations together, we can obtain the derivation
THEOREM 8.7
Example 8.3
Let us consider Gk, the natural LL(k -+- 1) grammar for the language L k of
Lemma 8.7, defined by
S ~ aSAlaA
A ~ bkdlb[c
S > a[S, a]
A , b[A, b]lc[A, el
[S, al "~ S A I A
[A, b] > bk-ldle
[A, c] ~e
It is left for the Exercises to prove that G k and G~, are, respectively,
LL(k + 1) and LL(k).
THEOREM 8.8
EXERCISES
E - - - ~ TE"
E'- ~ + TE'[e
T - - - ~ FT'
T'- ~,FT'[e
F---> al(E)
8.1.17. Prove that THG(afl) = THa(a) ÷ THe(fl) and that if a *~ fl, then
THe(a) .~ WHe(fl).
8.1.18. Give algorithms to compute THe(a) and THG(a, z).
8.1.19. Show that G1 of Algorithm 8.1 left-covers G of that algorithm.
8.1.20. Show that every LL(k) grammar G is left-covered by an LL(k + 1)
grammar in GNF.
8.1.21. For k ~ 2, show that every LL(k) grammar without e-productions is
left-covered by an LL(k -- 1) grammar.
*'8.1.22. Show that it is decidable, given an LR(k) grammar G, whether there
exists a k' such that G is LL(k').
BIBLIOGRAPHIC NOTES
Theorem 8.1 was first suggested by Knuth [1967]. The results in Sections 8.1.2
and 8.1.3 first appeared in Rosenkrantz and Stearns [1970]. Solutions to Exercises
8.1.14-8.1.16 and 8.1.22 can be found in there also. The hierarchy of LL(k) lan-
guages was first noted by Kurki-Suonio [1969].
Several earlier papers gave decidability results related to Theorem 8.6. Korenjak
and Hopcroft [1966] showed that it was decidable whether two simple LL(1)
grammars were equivalent. McNaughton [1967] showed that equivalence was
decidable for parenthesis grammars, which are grammars in which the right-hand
side of each production is surrounded by a pair of parentheses, which do not
appear elsewhere within any production. Independently, Paull and Unger [1968a]
showed that it was decidable whether two grammars were structurally equivalent,
meaning that they generate the same strings, and that their parse trees are the
same except for labels. (Two grammars are structurally equivalent if and only if
the parenthesis grammars constructed from them are equivalent.)
of them, will be in the classes mentioned above. We shall first define the
special properties desired in a DPDA.
DEFINITION
but for no q" does (q', e, Z)[.z_ (q", e, Z). We construct the final D P D A P6 in
our sequence from Ps by giving q the moves of q' in each situation above.
P6 is then the desired D P D A P.
The detailed construction corresponding to these intuitive ideas is left
for the Exercises. [Z]
DEFINITION
[qq'] ---->a
[rq'] ~ [rq]a
[qq'] ~ [sp]
[rq'] ~ [rq][sp]
. L
694 THEORY OF DETERMINISTIC PARSING CHAP. 8
state q, M can write only a fixed number of symbols on its pushdown list
before scanning another input symbol. That is, there exists a finite sequence
of states q 1,. • •, qk such that q l = q, 6(qi, e, Z) ---- (q~+1, Yt Z) for 1 ~_ i < k
and all Z, and q~ is a scan state. The sequence has no repeats, and k ---- 1 is
possible. The justification is that should there be a repeat, then M is not
loop-free; if the sequence is longer than # Q, then there must be a repeat.
We call this sequence of states the write sequence f o r state q.
THEOREM 8.10
If G = (N, X, P, S) is the canonical g r a m m a r constructed from a normal
form D P D A M = (Q, X, F, 8, q0, z0, [qs]), then L(G) = L ( M ) -- [e}.
P r o o f Here we shall prove that [qq'] generates exactly the input strings
for which a traverse from q to q' is possible. To do so, we shall prove the fol-
lowing statement inductively"
COROLLARY 2
If L ~ E* is a deterministic language and ¢ is not in E, then L¢ has
a canonical grammar. [~]
LEMMA 8.9
A canonical g r a m m a r is a (not necessarily UI) (1, 1)-precedence gram-
mar.t
P r o o f . Let G = (N, X , P , S) be a canonical g r a m m a r . We consider
the three possible precedence conflicts and show that none can occur.
Case 1: Suppose that X < Y a n d X ~- Y. Since X-~- Y, there must be
a p r o d u c t i o n A - 0 X Y of type 2 or 4. Thus, X = [qq'], and either Y ~ Z
a n d q' is a scan state or Y = [pp'] and p' is an erase state.
Since X < Y, there m u s t also be a p r o d u c t i o n B - 0 X C of type 4, where
+
C =~ Ya for some a. Let X = [qq'] as above. Then q' must be a write state,
because B --~ X C is a type 4 production. Moreover, Y must be of the f o r m
[pp'], where p' is an erase state, and hence A - 0 X Y is of type 4. We may con-
clude from the f o r m of type 4 productions that p is the second state in the
write sequence o f q'. Because B - 0 X C is a type 4 production, we may also
conclude that C = [pp"] for some p " .
+
N o w , let us consider the derivation C =~ Ya, which we may write as
lm
l l
[sis,] ~ [s~si]~ =- ... =- [s.s,]~°,
lrn lm lm
where [sis'z] = [pp"] a n d [s,s'~] = [pp']. We observe from the form of pro-
ductions that for each i either s~+ 1 = s~ (if [s#'~] is replaced by a p r o d u c t i o n
of type 2 or 4) or s~+ 1 is the state following s~ in the write sequence of q' (if
[s#',] is replaced by a p r o d u c t i o n of type 3). Only in the latter case will s~+ ~ be
an erase state, and thus we may conclude that since s'~ ( = p') is an erase state,
s. ( = p) follows s._ ~ on the write sequence of q'. Since s,_ x is either p or fol-
lows p on that sequence, we may conclude that p appears twice in the write
sequence of q'. Since this would imply a loop, we conclude that there are no
conflicts between <~ and " in a canonical g r a m m a r .
Case 2: X < Y a n d X ~ Y. Since X < Y, we may conclude as in case 1
that X -- [qq'], where q' is a write state. But if X -> Y, then there is a produc-
tion A --~ B Z , where B =~ a X a n d Z *=~ Yfl. The form of the productions
+
assures us that if B =~ a[qq'], then q, is an erase state. But we already found
q' to be a write state. W e m a y conclude that no conflicts between < and ->
exist.
Case 3: X "-- Y and X ~ Y. Since X ~-- Y, we may conclude as in case 1
that X --[qq'], where q' is a write or scan state. But, since X -> Y, we may
conclude as in case 2 that q' is an erase state.
Thus, a canonical g r a m m a r is a precedence g r a m m a r .
?To prove that a canonical grammar is simple MSP, we need only prove it to b e w e a k
precedence, rather than (1, 1)-precedence. However, the additional portion of this lemma
is interesting and easy to prove.
SEC. 8.2 GRAMMARS GENERATING DETERMINISTIC LANGUAGES 697
THEOREM 8.11
A c a n o n i c a l g r a m m a r is a simple m i x e d strategy p r e c e d e n c e g r a m m a r .
P r o o f By T h e o r e m 8.10 a n d L e m m a s 8.8 a n d 8.9, it suffices to s h o w t h a t
for every canonical g r a m m a r G = (N, E, P, S)
(I) If A ~ a X Y f l a n d B ~ Yfl are in P, t h e n X is n o t in I(B); a n d
(2) If A --~ a a n d B - - , a are in P, A :/: B, t h e n I(A) ~ I(B) -- ;2.
W e have (1) i m m e d i a t e l y , since if X <~ B or X ~ B, t h e n X <~ Y. But if
A ~ a X Y f l is a p r o d u c t i o n , then X ~" Y, a n d so we have a p r e c e d e n c e
conflict, in violation o f L e m m a 8.9.
N o w let us consider (2). A --~ a a n d B ~ a c a n n o t be distinct type 2
p r o d u c t i o n s , for if A -- [qq'], B = [pp'], a n d a = [rr']a, a n d if G comes f r o m
a D P D A M -- (Q, Z, F, ~, qo, Z0, [qs}), we have
THEOREM 8.12
If G~ is the grammar constructed in Algorithm 8.4, then L(G~)-~ L.
Proof. Since every sentence w in L(G) is of the form x¢ for x ~ Z ÷, it
follows that for every A ~ N, either A =~ G
u implies u ~ Z ÷, or A =~ G
u
implies u = re, where v ~ Z*. Let us call nonterminals of the first kind
intermediate and nonterminals of the latter type completing. A straightfor-
ward induction on the length of derivations shows that if A is an intermediate
nonterminal, then A *~ u if and only if A =:~ u, u ~ ~ * . Likewise, if A is
G Gt
THEOREM 8.13
If L is a deterministic language and e is not in L, then L is generated by
a simple MSP grammar.
Proof. Let L ~ Z ÷ and ¢ not be in Z. Then by Corollary 2 to Theorem
8.10, L¢ is generated by a canonical grammar G = (N, Z, P, S), which by
Theorem 8.11 is a simple MSP grammar. Let G I = (N~, Z, P t, S) be the
grammar constructed from G by Algorithm 8.4. Then L(Gi) = L, and we
shall show that G~ is also simple MSP.
SEC. 8.2 GRAMMARS GENERATING DETERMINISTIC LANGUAGES 699
COROLLARY 1
Every deterministic language L with the prefix property has a (1, 0)-
BRC grammar.
Proof. If e is not in L, the result is immediate. If e ∈ L, then L = {e}. It is easy to find a (1, 0)-BRC grammar for this language. □
COROLLARY 2
If L ⊆ Σ* is a deterministic language and ¢ is not in Σ, then L¢ has a (1, 0)-BRC grammar. □
THEOREM 8.16
(1) Every deterministic language with the prefix property has an LR(0)
grammar.
(2) Every deterministic language has an LR(1) grammar.
Proof. From Theorem 5.21, every (m, k)-BRC grammar is an LR(k) grammar. The result is thus immediate from Theorem 8.15 and Corollary 1 to Theorem 8.14. □
Example 8.4
Let G be defined by the following productions:

S → AB
A → a | b
B → AC
C → D
D → a
Then G₁ is defined by

S_S → A_S B_A
A_S → a | b
B_A → A_A C_A
A_A → a | b
C_A → D_A
D_A → a
ss, ~ A~B~IA~B~
A~ ~a
A~ ,b
B ~~ A~AC~ I AbaCa~
A~ >a
AbA ~ >b
C~: >a
In step (4), we string out A~, AS, and C,5, and we string out A~ and A].
The resulting productions of Ga are
p' = r'. That is, let δ be the transition function of the DPDA underlying G. Then δ(p, e, Z) = (s, YZ) for some Y, and δ(s', e, Y) = (p', e) = (r', e). Thus, A = B, and A_X = B_X.

The case X = $ is handled similarly, with q₀, the start state of the underlying DPDA, playing the role of q' in the above. □
LEMMA 8.11
LEMMA 8.12
THEOREM 8.17
G₄ of Algorithm 8.5 is a UI (2, 1)-precedence grammar.

Proof. Since for each a in Σ, P₄ has at most one production A → a, it should be clear that G₄ is UI. It is easy to show that the only new
COROLLARY 1
COROLLARY 2
If L ⊆ Σ* is a deterministic language and ¢ is not in Σ, then L¢ has a UI (2, 1)-precedence grammar. □
EXERCISES
P = ({q₀, q₁, q₂, q₃, q_f}, {0, 1}, {Z₀, Z₁, X}, δ, q₀, Z₀, {q_f})

δ(q₀, e, Y) = (q₁, Z₁Y)
δ(q₁, 0, Y) = (q₂, Y)
δ(q₁, 1, Y) = (q₃, Y)
δ(q₂, e, Y) = (q₁, XY)
δ(q₃, e, X) = (q₁, e)
δ(q₃, e, Z₁) = (q_f, e)
δ(q₃, e, Z₀) = (q_f, e)†
8.2.3. Identify the write, scan, and erase states in Exercise 8.2.2.
8.2.4. Give formal constructions for Theorem 8.9.
The following three exercises refer to a canonical grammar
G = (N, E, P, S).
8.2.5. Show that
(a) If [qq'] → a is in P, then q is a scan state.
(b) If [qq'] → [pp']a is in P, then p' is a scan state.

†This rule is never used but appears for the sake of the normal form.
(c) If [qq'] → [pp'] is in P, then q is a write state and p' an erase state.
(d) If [qq'] → [pp'][rr'] is in P, then p' is a write state and r' is an erase state.

8.2.6. Show that if [qq'][pp'] appears as a substring in a right-sentential form of G, then
(a) q' is a write state.
(b) q' ≠ p.
(c) p is in the write sequence of q'.

8.2.7. Show that if [qq'] ⇒⁺ α[pp'], then p' is an erase state.

8.2.8. Prove the "only if" portion of Theorem 8.10.

8.2.9. Give a formal proof of Theorem 8.15.
8.2.9. Give a formal proof of Theorem 8.15.
8.2.10. Use Algorithm 8.5 to find UI (2, 1)-precedence grammars for the following deterministic languages:
(a) {a0ⁿc0ⁿ | n ≥ 0} ∪ {b0ⁿc0²ⁿ | n ≥ 0}.
(b) {0ⁿa1ⁿ0ᵐ | n > 0, m > 0} ∪ {0ᵐb1ⁿ0ⁿ | n > 0, m > 0}.
8.2.11. Complete the case X = $ in Lemma 8.10.
8.2.12. Complete the proof of Theorem 8.17.
8.2.13. Prove Theorem 8.18.
"8.2.14. Show that a CFL has an LR(0) grammar if and only if it is determin-
istic and has the prefix property.
8.2.15. Show that a CFL has a (1, 0)-BRC grammar if and only if it is deter-
ministic and has the prefix property.
*'8.2.16. Show that every deterministic language has an LR(1) grammar in
(a) CNF.
(b) GNF.
8.2.17. Show that if A → α and B → α are type 4 productions of a canonical grammar, then A = B.

*8.2.18. If G of Algorithm 8.4 is a canonical grammar, does G₁ constructed in that algorithm right-cover G? What if G is an arbitrary grammar?

8.2.19. Complete the proof of Theorem 8.12.

*8.2.20. Show that every LR(k) grammar is right-covered by a (1, k)-BRC grammar. Hint: Modify the LR(k) grammar by replacing each terminal a on the right of productions by a new nonterminal X_a and adding production X_a → a. Then modify nonterminals of the grammar
*'8.2.21. Show that every LR(k) grammar is right-covered by an LR(k) grammar
which is also a (not necessarily UI) (1, 1)-precedence grammar.
*8.2.22. Show that G4 of Algorithm 8.5 right-covers G of that algorithm.
Open Problems
8.2.23. Is every LR(k) grammar covered by an LR(1) grammar?

8.2.24. Is every LR(k) grammar covered by a UI (2, 1)-precedence grammar? A positive answer here would yield a positive answer to Exercise 8.2.23.

8.2.25. We stated this one in Chapter 2, but no one has solved it yet, so we shall state it again. Is the equivalence problem for DPDA's decidable? Since all the constructions of this section can be effectively carried out, we have many equivalent forms for this problem. For example, one might show that the equivalence problem for simple MSP grammars or for UI (2, 1)-precedence grammars is decidable.
BIBLIOGRAPHIC NOTES

Theorem 8.11 was first derived by Aho, Denning, and Ullman [1972]. Theorems 8.15 and 8.16 initially appeared in Knuth [1965]. Theorem 8.18 is from Graham [1970]. Exercise 8.2.21 is from Gray and Harrison [1969].
We have seen that many classes of grammars generate exactly the deter-
ministic languages. However, there are also several important classes of
grammars which do not generate all the deterministic languages. The LL
grammars are such a class. In this section we shall study another such class,
the simple precedence grammars. We shall show that the simple precedence
languages are a proper subset of the deterministic languages and are incommensurate with the LL languages. We shall also show that the operator precedence languages are a proper subset of the simple precedence languages.

L₁ = {a0ⁿ1ⁿ | n ≥ 1} ∪ {b0ⁿ1²ⁿ | n ≥ 1}
THEOREM 8.19
The simple precedence languages form a proper subset of the deterministic
languages.
Proof. Clearly, every simple precedence grammar is an LR(1) grammar. To prove proper inclusion, we shall show that there is no simple precedence grammar that generates the deterministic language L₁.

Intuitively, the reason for this is that any simple precedence parser for L₁ cannot keep count of the number of 0's in an input string and at the same time know whether an a or a b was seen at the beginning of the input. If the parser stores the first input symbol on the pushdown list followed by the succeeding string of 0's, then, when the 1's are encountered on the input, the parser will not know whether to match one or two 1's with each 0 on the pushdown list without first erasing all the 0's stored on the stack. On the other hand, if the parser tries to maintain on top of the pushdown list an indication of whether an a or b was initially seen, then it must make a sequence of reductions while reading the 0's, which destroys the count of the number of 0's seen on the input.

We shall now construct a formal proof motivated by this intuitive reasoning. Suppose that G = (N, Σ, P, S) is a simple precedence grammar such that L(G) = L₁. We shall show as a contradiction that any such grammar must also derive sentences not in L₁.

Suppose that an input string a0ⁿw, w ∈ 1*, is to be parsed by the simple precedence parser constructed for G according to Algorithm 5.12. As a0ⁿ is the prefix of some sentence in L₁ for all n, each 0 must eventually be shifted onto the stack. Let αᵢ be the stack contents after the ith 0 is shifted. If αᵢ = αⱼ for some i < j, then a0ⁱ1ⁱ and a0ʲ1ⁱ would either both be accepted or both be rejected by the parser, and so αᵢ ≠ αⱼ if i ≠ j.
Thus, for any constant c we can find an αᵢ such that |αᵢ| > c and αᵢ is a prefix of every αⱼ, j > i (for if not, then we could construct an arbitrarily long sequence αⱼ₁, αⱼ₂, ... such that |αⱼₜ| = |αⱼₜ₋₁| for t ≥ 2 and thus find two identical α's). Choose i as small as possible. Then there must be some shortest string β ≠ e such that for each k, αᵢβᵏ is αᵢ₊ₘₖ for some m > 0. The reason is that since αᵢ is never erased as long as 0's appear on the input, the symbols written on the stack by the parser do not depend on αᵢ. The behavior of the parser on input a0ⁿ must be cyclic, and it must enlarge the stack or repeat the same stack contents (and we have just argued that it may not do the latter).

Now, let us consider the behavior of the parser on an input string of the form b0ⁿx. Let γₖ be the stack after reading b0ᵏ. We may argue as above that for some γⱼ, j as small as possible, there is a shortest string δ ≠ e such that for each k, γⱼδᵏ = γⱼ₊qₖ for some q > 0. In fact, since γⱼ is never erased, we must have δ = β and q = m. That is, a simple induction on r ≥ 0 shows that if after reading a0^{i+r} the stack holds αᵢθ, then after reading b0^{j+r} the stack will hold γⱼθ.
Consider the moves made by the parser acting on the input string a0^{i+mk}1^{i+mk}. After reading a0^{i+mk}, the stack will contain αᵢβᵏ. Then let s be the largest number such that after reading 1^{i+mk−s} the parser will have αᵢψ left on its stack for some ψ in (N ∪ Σ)* (i.e., on the next 1 input, one of the symbols of αᵢ is involved in a reduction). It is easy to show that s is unique; otherwise, the parser would accept a string not in L₁.

Similarly, let r be the largest number such that beginning with γⱼβᵏ on its stack the parser with input 1^{2(j+mk)−r} makes a sequence of moves ending up with some γⱼψ on its stack. Again, r must be unique.

Then for all k, the input b0^{j+mk}1^{i+mk−s+r} must be accepted, since b0^{j+mk} causes the stack to become γⱼβᵏ, and 1^{i+mk−s} causes the stack to become γⱼψ. The erasure of the β's occurs independently of whether αᵢ or γⱼ is below, and 1ʳ causes acceptance. But since m ≠ 0, it is impossible that i + mk − s + r = 2(j + mk) for all k.

We conclude that L₁ is not a simple precedence language. □
THEOREM 8 . 2 0
S → aA | bB
A → 0A1 | 01
B → 0B11 | 011

S → A | B
A → 0A1 | a
B → 0B2 | b  □
Example 8.5
Consider the grammar

S → a | aAbS
A → a | aSbA

S₁ → [S] | [S, A]

From step (4) we discover that all [S]- and [A]-productions are useless, and so the resulting grammar is

S₁ → [S, A]
LEMMA 8.14

We thus have the following normal form for operator precedence grammars.

THEOREM 8.21

If L is an operator precedence language, then L = L(G) for some operator precedence grammar G = (N, Σ, P, S) such that
(1) G is UI,
(2) S appears on the right of no production, and
(3) the only single productions in P have S on the left.

Proof. Apply Algorithms 2.11 and 8.6 to an arbitrary operator precedence grammar. □
grammar G₁ such that L(G₁) = ¢L(G), where ¢ is a left endmarker. We can then construct a UI weak precedence grammar G₂ such that L(G₂) = L(G). In this way we shall show that the operator precedence languages are a subset of the simple precedence languages.
ALGORITHM 8.7
Example 8.6
Let G be defined by

S → A
A → aAbAc | aAd | a

We shall generate only the useful portion of N'₁ and P'₁ in Algorithm 8.7. We begin with nonterminal [¢S]. Its production is [¢S] → [¢A]. The productions for [¢A] are [¢A] → ¢[aA][bA]c, [¢A] → ¢[aA]d, and [¢A] → ¢a. The productions for [aA] and [bA] are constructed similarly. Thus, G₁ is

[¢S] → [¢A]
[¢A] → ¢[aA][bA]c | ¢[aA]d | ¢a
[aA] → a[aA][bA]c | a[aA]d | aa
[bA] → b[aA][bA]c | b[aA]d | ba  □
LEMMA 8.15
In Algorithm 8.7 the grammar G₁ is a UI weak precedence grammar such that L(G₁) = ¢L(G).

Proof. Unique invertibility is easy to show, and we omit the proof. To show that G₁ is a weak precedence grammar, we must show that ⋖ and ≐ are disjoint from ⋗ in G₁.† Let us define the homomorphism g by
(1) g(a) = a for a ∈ Σ,
(2) g(¢) = $, and
(3) g([aA]) = a.

It suffices to show that
(1) If X ⋖ Y in G₁, then g(X) ⋖ g(Y) or g(X) ≐ g(Y) in G.
(2) If X ≐ Y in G₁, then g(X) ⋖ g(Y) or g(X) ≐ g(Y) in G.
(3) If X ⋗ Y in G₁, then g(X) ⋗ g(Y) in G.
Case 1: Suppose that X ⋖ Y in G₁, where X ≠ ¢ and X ≠ $. Then there is a right-hand side in P₁, say αX[aA]β, such that [aA] ⇒*_{G₁} Yγ for some γ. If α is not e, then it is easy to show that there is a right-hand side in P with substring h(X[aA])‡ and thus g(X) ≐ a in G. But the form of the productions in P₁ implies that g(Y) must be a. Thus, g(X) ≐ g(Y).

If α is e and X ∈ Σ, then the left-hand side associated with right-hand side αX[aA]β is of the form [XB] for some B ∈ N. Then there must be a production in P whose right-hand side has substring XB. Moreover, B → aAh(β) is a production in P. Hence, X ⋖ a in G. Since g(X) = X in this case, the conclusion g(X) ⋖ g(Y) follows. If α is e and X = [bB] for some b ∈ Σ and B ∈ N, let the left-hand side associated with right-hand side αX[aA]β be [bC]. Then there is a right-hand side in P with substring bC, and C → BaAh(β) is in P. Thus, b ⋖ a in G, and g(X) ⋖ g(Y) follows. The case in which α = e and X = [¢B] for some B ∈ N is easily handled, as is the case where X itself is ¢ or $.
Case 2: The case X ≐ Y is handled similarly to Case 1.

Case 3: X ⋗ Y. Assume that Y ≠ $. Then there is some right-hand side in P₁, say α[aA]Zβ, such that [aA] ⇒⁺_{G₁} γX and Z ⇒*_{G₁} Yδ for some γ and δ. The form of productions in P₁ again implies that either Z = Y and both are in Σ ∪ {$}, or Z = [aB] and Y = [aC] or Y = a, for some B and C in N. In any case, g(Z) = g(Y).

There must be a right-hand side in P with substring Ag(Z) because

†Here and subsequently, the symbols ⋖, ≐, and ⋗ refer to operator precedence relations in G and Wirth-Weber precedence relations in G₁.
‡h is the homomorphism in Algorithm 8.7.
THEOREM 8.22
If L is an operator precedence language, then L is a simple precedence language.

Proof. By Theorem 8.21 and Lemma 8.15, if L ⊆ Σ⁺ is an operator precedence language, then ¢L is a UI weak precedence language, where ¢ is not in Σ. A straightforward generalization of Algorithm 8.4 allows us to remove the left endmarker from the grammar for ¢L constructed by Algorithm 8.7. (The form of productions in Algorithm 8.7 assures that the resulting grammar will be UI and proper.) The actual proof is left for the Exercises. By Theorem 5.16, L is a simple precedence language. □
[Fig. 8.7 The hierarchy of classes of languages: CFL, deterministic CFL, LR(1), (1,1)-BRC, UI (2,1)-precedence, simple MSP, simple precedence, UI weak precedence, LL, and operator precedence.]
(5) {a0"l"ln >__ 1} U {b0"lZ"ln >__ 1] is LL(1) but not simple precedence.
(6) [aO"l"]n >. 1} w {aO"2Z"ln > I} u {b0"l Z"ln > 1} is deterministic but
not LL or simple precedence.
(7) {0"t"[n > 1} U {0"lZ"l n ~> 1} is context-free but not deterministic.
Proofs that these languages have the ascribed properties are requested
in the Exercises.
EXERCISES
8.3.1. Show that the grammar for L₁ given in Theorem 8.20 is LL(2). Find an LL(1) grammar for L₁.

*8.3.2. Prove that the language L₂ = {0ⁿa1ⁿ | n ≥ 1} ∪ {0ⁿb2ⁿ | n ≥ 1} is not an LL language. Hint: Assume that L₂ has an LL(k) grammar in GNF.

*8.3.3. Show that L₂ of Exercise 8.3.2 is an operator precedence language.
8.3.4. Prove that Algorithm 2.11 (elimination of single productions) preserves
the properties of
(a) Operator precedence.
(b) (m, n)-precedence.
(c) (m, n)-BRC.
8.3.5. Prove Lemma 8.14.
*8.3.6. Why does Algorithm 8.6 not necessarily convert an arbitrary precedence grammar into a simple precedence grammar?
8.3.7. Convert the following operator precedence grammar to an equivalent
simple precedence grammar:
Show that there is no simple precedence parser for L that will announce
error immediately after reading the 1 in an input string of the form
a"laib. (See Exercise 7.3.5.)
Open Question
8.3.18. What is the relationship of the class of UI (1, k)-precedence languages, k > 1, to the classes shown in Fig. 8.7? The reader should be aware of Exercise 8.3.14.
BIBLIOGRAPHIC NOTES
which the compiler is to run, the language for which the compiler is being
developed, the number of people engaged in implementing the compiler,
and so forth. It is even possible to implement all phases in one pass. What
the optimal number of passes to implement a given compiler should be is
a topic that is beyond the scope of this book.
(9.1.1)    A = B + C * −D

with the normal order of precedence for the operators and assignment symbol (=) has the postfix Polish representation†

A B C D − * + =

and the prefix Polish representation

= A + B * C − D

†In this representation, − is a unary operator and *, +, and = are all binary operators.
In postfix Polish notation, the operands appear from left to right in the
order in which they are used. The operators appear right after the operands
and in the order in which they are used. Postfix Polish notation is often used
as an intermediate language by interpreters. The execution phase of the inter-
preter can evaluate the postfix expression using a pushdown list for an accu-
mulator. (See Example 9.4.)
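To make the stack-based evaluation concrete, here is a minimal Python sketch; it is only an illustration of the idea described above (the token names and the use of '~' for unary minus are choices made here, not notation from the text).

def eval_postfix(tokens, env):
    stack = []                      # the pushdown list used as an accumulator
    for t in tokens:
        if t == '+':
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif t == '*':
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif t == '~':              # unary minus
            stack.append(-stack.pop())
        else:                       # an operand: push its value
            stack.append(env[t])
    return stack.pop()

# B + C * (-D) in postfix, as in expression (9.1.1) without the assignment:
print(eval_postfix(['B', 'C', 'D', '~', '*', '+'], {'B': 2, 'C': 3, 'D': 4}))   # -10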
Both types of Polish expressions are linear representations of the syntax
tree for expression (9.1.1), shown in Fig. 9.1. This tree reflects the syntactic
structure of expression (9.1.1). We can also use this tree itself as the inter-
mediate program, encoding it as a linked list structure.
Fig. 9.1 Syntax tree.

T₁ ← − D
T₂ ← * C T₁
T₃ ← + B T₂
A ← T₃†

†Note that the assignment operator must be treated differently from other operators. A simple "optimization" is to replace T₃ by A in the third line and to delete the fourth line.
1: − D
2: * C (1)
3: + B (2)
4: = A (3)
Here, S'₁ and S'₂ are the postfix representations of S₁ and S₂, respectively; EQUAL? is a Boolean-valued binary operator that has the value true if its two arguments are equal and false otherwise. L2 is a constant which names the beginning of S'₂. JFALSE is a binary operator which causes a jump to the location given by its second argument if the value of the first argument is false and has no effect if the first argument is true. L is a constant which is the first instruction following S'₂. JUMP is a unary operator that causes a jump to the location given by its argument.

A derivation tree for statement (9.1.2) is shown in Fig. 9.2. The important syntactic information in this derivation tree can be represented by the syntax tree shown in Fig. 9.3, where S'₁ and S'₂ represent syntax trees for S₁ and S₂.
if < expression > then < statement > else < statement >
INTEGER I

will be translated into a command that enters the attribute "integer" in the symbol table location reserved for identifier I. There will be no explicit representation for this statement in the intermediate program.
A ← + B C

LOAD B
ADD C
STORE A
EXERCISES
9.1.1. Draw syntax trees for the following source language statements:
(a) A = (B − C)/(B + C) (as in FORTRAN).
(b) I = LENGTH(C1 || C2) (as in PL/I).
(c) if B > C then
      if D > E then A := B + C else A := B − C
    else A := B * C (as in ALGOL).
†Let us assume for simplicity that there is only one type of arithmetic. If more than one, e.g., fixed and floating, is available, then the translation of + will depend on symbol table information about the attributes of B and C.
9.1.2. Define postfix Polish representations for the programs in Exercise 9.1.1.
9.1.3. Generate multiple address code with named results for the statements
in Exercise 9.1.1.
9.1.4. Construct a deterministic pushdown transducer that maps prefix Polish
notation into postfix Polish notation.
9.1.5. Show that there is no deterministic pushdown transducer that maps postfix Polish notation into prefix Polish notation. Is there a nondeterministic pushdown transducer that performs this mapping? Hint: See Theorem 3.15.
9.1.6. Devise an algorithm using a pushdown list that will evaluate a postfix
Polish expression.
9.1.7. Design a pushdown transducer that takes as input an expression w in
L(Go) and produces as output a sequence of commands that will build
a syntax tree (or multiple address code) for w.
9.1.8. Generate assembly code for your favorite computer for the programs
in Exercise 9.1.1.
• 9.1.9. Devise algorithms to generate assembly code for your favorite computer
for intermediate programs representing arithmetic assignments when
the intermediate program is in
(a) Postfix Polish notation.
(b) Multiple address code with named results.
(c) Multiple address code with implicitly named results.
(d) The form of a syntax tree.
"9.1.10. Design an intermediate language that is suitable for the representation
of some subset of F O R T R A N (or PL/I or ALGOL) programs. The
subset should include assignment statements and some control state-
ments. Subscripted variables should also be allowed.
• "9.1.11. Design a code generator that will map an intermediate program of
Exercise 9.1.10 into machine code for your favorite computer.
BIBLIOGRAPHIC NOTES
Unfortunately, it is impossible to specify the best object code even for common
source language constructs. However, there are several papers and books that
discuss the translation of various programming languages. Backus et al. [1957]
give the details of an early FORTRAN compiler. Randell and Russell [1964] and Grau et al. [1967] discuss the implementation of ALGOL 60. Some details of PL/I implementation are given in IBM [1969].
There are many publications describing techniques that are useful in code
generation. Knuth [1968a] discusses and analyzes various storage allocation tech-
niques. Elson and Rake [1970] consider the generation of code from a tree-
structured intermediate language. Wilcox [1971] presents some general models
for code generation.
Example 9.1
Suppose that we wish to map expressions generated by the grammar G, below, into prefix Polish expressions, on the assumption that *'s are to take precedence over +'s, e.g., a * a + a has prefix expression + * a a a, not * a + a a. G is given by E → a + E | a * E | a. However, there is no syntax-directed translation scheme which uses G as an underlying grammar and which can define this translation. The reason for this is that the output grammar of such an SDTS must be a linear CFG, and it is not difficult to show that the set of prefix Polish expressions over {+, *, a} corresponding to the infix expressions in L(G) is not a linear context-free language. However, this particular translation can be defined using a simple SDTS with G₀ [except for production F → (E)] as the underlying grammar.
Example 9.2
Let T be the simple SDTS with the rules
S → aSbSc, 1S2S3
S → d, 4
Otherwise,

δ(q, X, Y) = (error, e, e)

†In these rules we have taken the liberty of producing an output symbol and shifting the input as soon as a production has been recognized, rather than doing these actions in separate steps as indicated in the proof of Theorem 9.1.
Intuitively, the reason for this is that a translation element may require the generation of output long before it can be ascertained that the production to which this translation element is attached is actually used.
Example 9.3
Consider the simple SDTS T with the rules
S → Sa, aSa
S → Sb, bSb
S → e, e
Example 9.4
Postfix translations are more useful than it might appear at first. Here,
let us consider an extended DPDT that maps the arithmetic expressions of L(G₀) into machine code for a very convenient machine. The computer for
this example has a pushdown stack for an accumulator. The instruction
LOAD X
puts the value held in location X on top of the stack; all other entries on
the stack are pushed down. The instructions A D D and MPY, respectively,
add and multiply the top two levels of the stack, removing the two levels
but then pushing the result on top of the stack. We shall use semicolons to
separate these instructions.
The SDTS we have in mind is
E → E + T,   E T 'ADD;'
E → T,       T
T → T * F,   T F 'MPY;'
T → F,       F
F → (E),     E
F → a,       'LOAD a;'
In this example and the ones to follow we shall use the SNOBOL convention
of surrounding literal strings in translation rules by quote marks. Quote
marks are not part of the output string.
With input a + ( a . a)$, the D P D T enters the following sequence of
configurations; we have deleted the LR(k) tables from the pushdown list.
We have also deleted certain obvious configurations from the sequence as well
as the states and bottom of stack marker.
[e, a ÷ (a • a)$, e]
lad ÷(a.a)$, e]
--[F, -q-- (a • a)$, LOAD a;]
~[E, •-t- (a • a)$, LOAD a;]
I--L-[E + (a, • a)$, LOAD a;]
!--[E + (F, • a)$, LOAD a; LOAD a;]
- - [ E + (T, • a)$, LOAD a; LOAD a;]
[_L [E -+- ( T , a, )$, LOAD a; LOAD a;]
--[E + (T, F, )$, LOAD a; LOAD a; LOAD a;]
--[E + (T, )$, LOAD a; LOAD a; LOAD a; MPY;]
1--[E --q- (E, )$, LOAD a; LOAD a; LOAD a; MPY;]
Note that if the different a's representing identifiers are indexed, so that the input expression becomes, say, a₁ + (a₂ * a₃), then the output code would be

LOAD a₁
LOAD a₂
LOAD a₃
MPY
ADD
While the computer model used in Example 9.4 was designed for the purpose of demonstrating a syntax-directed translation of expressions with few of the complexities of generating code for more common machine models, the postfix scheme is capable of defining useful classes of translations. In the remainder of this chapter, we shall show how object code can be generated for other machine models using a pushdown transducer operating on what is in essence a simple postfix SDTS with an underlying LR grammar.
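To make the flavor of such a translator concrete, here is a small Python sketch in the spirit of the SDTS of Example 9.4. It is only an illustration: it uses a right-recursive variant of G₀ (E → T + E | T, T → F * T | F, F → (E) | a) so that a short recursive-descent parser suffices, and emits LOAD, ADD, and MPY as each production is recognized; it is not the LR-based DPDT of the example.

def translate(expr):
    tokens = list(expr)
    pos = 0
    code = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(t):
        nonlocal pos
        assert peek() == t, f"expected {t!r} at position {pos}"
        pos += 1

    def E():                 # E -> T '+' E | T   emits ...; ADD;
        T()
        if peek() == '+':
            expect('+'); E(); code.append('ADD;')

    def T():                 # T -> F '*' T | F   emits ...; MPY;
        F()
        if peek() == '*':
            expect('*'); T(); code.append('MPY;')

    def F():                 # F -> '(' E ')' | a   emits LOAD a;
        nonlocal pos
        if peek() == '(':
            expect('('); E(); expect(')')
        else:
            code.append(f'LOAD {peek()};'); pos += 1

    E()
    return ' '.join(code)

print(translate('a+(b*c)'))   # LOAD a; LOAD b; LOAD c; MPY; ADD;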
Suppose that we have a simple SDTS which has an underlying LR(k)
grammar, but which is not postfix. How can such a translation be performed?
One possible technique is to use the following multipass translation scheme.
This technique illustrates a cascade connection of DPDT's. However, in
practice this translation would be implemented in one pass using the tech-
niques of the next section for arbitrary SDTS's.
Let T = (N, Σ, Δ, R, S) be a semantically unambiguous simple SDTS with an underlying LR(k) grammar G. We can design a four-stage translation scheme to implement τ(T). The first stage consists of a DPDT. The input to the first stage is the input string w$. The output of the first stage is π, the right parse for w according to the underlying input grammar G. The second stage reverses π to create πᴿ, the right parse in reverse.†

The input to the third stage will be πᴿ. The output of the third stage will be the translation defined by the simple SDTS T' = (N, Σ', Δ, R', S), where

†Recall that π, the right parse, is the reverse of the sequence of productions used in a rightmost derivation. Thus, πᴿ begins with the first production used and ends with the last production used in a rightmost derivation. To obtain πᴿ we can merely read the buffer in which π is stored backward.
[Figure: the cascade of translation stages, with the right parse π passed to stage 2, its reversal πᴿ passed to stage 3, and the intermediate output yᴿ passed to stage 4.]
While the pushdown transducer is adequate for defining all simple SDTS's on an LL grammar and for some simple SDTS's on an LR grammar, we need a more versatile model of a translator when doing
(1) Nonsimple SDTS's,
(2) Nonpostfix simple SDTS's on an LR grammar,
(3) Simple SDTS's on a non-LR grammar, and
(4) Syntax-directed translation when the parsing is not deterministic and one-pass, such as with the algorithms of Chapters 4 and 6.
We shall now define a new device called a pushdown processor (PP) for
defining syntax-directed translations that map strings into graphs. A push-
down processor is a PDT whose output is a labeled directed graph, generally
a tree or part of a tree which the processor is constructing. The major feature
of the PP is that its pushdown list, in addition to pushdown symbols, can
hold pointers to nodes in the output graph.
Like the extended PDA, the pushdown processor can examine the top k
cells of its pushdown list for any finite k and can manipulate the contents of
these cells arbitrarily. Unlike the PDA, if these k cells include some pointers
to the output graph, the PP can modify the output graph by adding or deleting
directed edges connected to the nodes pointed to. The PP can also create
new nodes, label them, create pointers to them, and create edges between
these nodes and the nodes pointed to by those pointers on the top k cells of
the pushdown list.
As it is difficult to develop a concise lucid notation for such manipulations,
we shall use written descriptions of the moves of the PP. Since each move of
the PP can involve only a finite number of pointers, nodes, and edges, such
descriptions are, in principle, possible, but we feel that a formal notation
would serve to obscure the essential simplicity of the translation algorithms
involved. We proceed directly to an example.
Example 9.5
Let us design a pushdown processor P to map the arithmetic expressions of L(G₀) into syntax trees. In this case, a syntax tree will be a tree in which each interior node is labeled by + or * and leaves are labeled by a. The following table gives the parsing and output actions that the pushdown processor is to take under various combinations of current input symbol and symbol on top of the pushdown list. P has been designed from the SLR(1) parser for G₀ given in Fig. 7.37 (p. 625).

However, here we have eliminated table T4 from Fig. 7.37, treating F → a as a single production, and have renamed the tables as follows:
In addition, $ is used as a right endmarker on the input. The parsing and output actions of P are given in Figs. 9.5 and 9.6. P uses the LR(1) tables to determine its actions. In addition P will attach a pointer to tables T1, T2, T3, and T4 when these tables are placed on the pushdown list.† These pointers are to the output graph being generated. However, the pointers do not affect the parsing actions. In parsing we shall not place the grammar symbols on the pushdown list. However, the table names indicate what that grammar symbol would be.

The last column of Fig. 9.5 gives the new LR(1) table to be placed on top of the pushdown list after a reduce move. Blank entries denote error situations. A new input symbol is read only after a shift move. The numbers in the table refer to the actions described in Fig. 9.6.

Let us trace the behavior of P on input a₁ * (a₂ + a₃)$. We have subscripted the a's for clarity. The sequence of moves made by P is as follows:

†In practice, these pointers can be stored immediately below the tables on the pushdown list.
[Fig. 9.5 Parsing and output actions of P: for each table on top of the pushdown list and each current input symbol (a, +, *, (, ), $), the entry gives a shift or a numbered output action, together with a goto column.]
(1) Create a new node n labeled a. Push the symbol [T1, p] on top of the pushdown list, where p is a pointer to n. Read a new input symbol.
(2) At this point, the top of the pushdown list contains a string of four symbols of the form X[Ti, p₁] + [Tj, p₂], where p₁ and p₂ are pointers to nodes n₁ and n₂, respectively. Create a new node n labeled +. Make n₁ and n₂ the left and right direct descendants of n. Replace [Ti, p₁] + [Tj, p₂] by [T, p], where T = goto(X) and p is a pointer to node n.
(3) Same as (2) above with * in place of +.
(4) Same as (1) with T3 in place of T1.
(5) Same as (1) with T4 in place of T1.
(6) Same as (1) with T2 in place of T1.
(7) The pushdown list now contains X([T, p]), where p is a pointer to some node n. Replace ([T, p]) by [T', p], where T' = goto(X).
[Figure: the syntax tree built by P for a₁ * (a₂ + a₃), with interior nodes labeled * and + and leaves a₁, a₂, and a₃.]
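The bookkeeping behind actions (1)-(3) can be pictured with a small Python sketch; it is not taken from the text, only an illustration: the pushdown list holds pointers to subtrees already built, and each reduction pops two pointers, creates a new interior node labeled + or *, and pushes a pointer to that node. (The driver below is fed postfix input purely to keep it short; the tree-building steps are the ones the pushdown processor performs on a reduce move.)

class Node:
    def __init__(self, label, left=None, right=None):
        self.label, self.left, self.right = label, left, right

def build_tree(postfix):
    stack = []                                       # holds pointers to subtrees, like [Ti, p]
    for sym in postfix:
        if sym in '+*':
            right, left = stack.pop(), stack.pop()   # as in actions (2) and (3)
            stack.append(Node(sym, left, right))     # new interior node
        else:
            stack.append(Node(sym))                  # action (1): a leaf labeled a
    return stack.pop()

def frontier(n):
    return n.label if n.left is None else f'({frontier(n.left)}{n.label}{frontier(n.right)})'

# a1 * (a2 + a3) in postfix:
print(frontier(build_tree(['a1', 'a2', 'a3', '+', '*'])))   # (a1*(a2+a3))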
This is the first of three sections showing how various parsing algorithms
can be extended to implement an SDTS by the use of a deterministic push-
down processor instead of a pushdown transducer. We begin by giving
an algorithm to implement an arbitrary SDTS with an underlying LR
grammar.
ALGORITHM 9.1

SDTS on an LR grammar.

Input. A semantically unambiguous SDTS T = (N, Σ, Δ, R, S) with underlying LR(k) grammar G = (N, Σ, P, S) and an input w in Σ*.
Example 9.6
Let Algorithm 9.1 be applied to the SDTS
with input abbab. The underlying grammar is SLR(1), and we shall omit
discussion of the LR(1) tables, assuming that they are there and guide the
parse properly. We shall list the successive stack contents entered by the push-
down processor and then show the tree structure pointed to by each of the
pointers. The LR(1) tables on the stack have been omitted.
THEOREM 9.4
(4) If M's pushdown list becomes empty when it has reached the end of the input sentence, it accepts; the output is the tree which has been constructed with root n₁. □
Example 9.7
We shall consider an example drawn from the area of natural language
translation. It is a little known fact that an SDTS forms a precise model for
the translation of English to another commonly spoken natural language,
pig Latin. The following rules informally define the translation of a word
in English to the corresponding word in pig Latin:
(1) If a word begins with a vowel, add the suffix YAY.
(2) If a word begins with a nonempty string of consonants, move all
consonants before the first vowel to the back of the word and append
suffix AY.
(3) One-letter words are not changed.
(4) U following a Q is a consonant.
(5) Y beginning a word is a vowel if it is not followed by a vowel.
We shall give an SDTS that incorporates only rules (1) and (2). It is left
for the Exercises to incorporate the remaining rules.
The rules of the SDTS are as follows:

<vowel> → 'U', 'U'
<consonant> → 'B', 'B'
<consonant> → 'C', 'C'
Input                Stack
(1)  THE$            <word>p₁
(2)  THE$            <consonants>p₂<vowel>p₃<letters>p₄
(3)  THE$            <consonant>p₅<consonants>p₆<vowel>p₃<letters>p₄
(4)  THE$            T<consonants>p₆<vowel>p₃<letters>p₄
(5)  HE$             <consonants>p₆<vowel>p₃<letters>p₄
(6)  HE$             <consonant>p₇<vowel>p₃<letters>p₄
(7)  HE$             H<vowel>p₃<letters>p₄
(8)  E$              <vowel>p₃<letters>p₄
(9)  E$              E<letters>p₄
(10) $               <letters>p₄
(11) $               e
The tree structures after steps 1, 2, 6, and 11 are shown in Fig. 9.10(a)-(d), respectively. □
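For readers who want to see rules (1) and (2) in executable form, here is a small Python sketch of the word-level translation only; it is not the SDTS or the pushdown processor, and, as in the example, the refinements of rules (3)-(5) for Q/U and initial Y are deliberately omitted.

VOWELS = set('AEIOU')

def pig_latin_word(word):
    # Rule (1): a word beginning with a vowel gets the suffix YAY.
    if word[0] in VOWELS:
        return word + 'YAY'
    # Rule (2): move the consonants before the first vowel to the back and append AY.
    for i, letter in enumerate(word):
        if letter in VOWELS:
            return word[i:] + word[:i] + 'AY'
    return word + 'AY'              # no vowel at all: treat the whole word as consonants

print(pig_latin_word('THE'))        # ETHAY
print(pig_latin_word('APPLE'))      # APPLEYAY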
As in the previous section, there is an easy proof that the current algorithm
performs the correct translation and that on a suitable random access
machine the algorithm can be implemented to run in time which is linear in
the input length. For the record, we state the following theorem.
THEOREM 9.5
Algorithm 9.2 constructs a pushdown processor which produces as output
a tree whose frontier is the translation of the input string.
Proof. We can prove by induction that an input string w has the net effect of erasing nonterminal A and pointer p from the pushdown list if and only if (A, A) ⇒*_T (w, x), where x is the frontier of the subtree whose root is the node pointed to by p (after erasure of A and p and the symbols to which A is expanded). Details are left for the Exercises.
[Fig. 9.10 Translation to pig Latin: the tree structures built after steps 1, 2, 6, and 11.]
that the parsing machine already has pointers to its input; these pointers are
kept on one cell along with an information symbol.) The rules for manipu-
lating these pointers are the same as for the pushdown processor, and we shall
not discuss the matter in any more detail.
Before giving the translation algorithm associated with the parsing
machine, let us discuss how the notion of a syntax-directed translation
carries over to the GTDPL programs of Section 6.1. It is reasonable to suppose that we shall associate a translation with a "call" of a nonterminal if and only if that call succeeds.
Let P = (N, Σ, R, S) be a GTDPL program. Let us interpret a GTDPL statement A → a, where a is in Σ ∪ {e}, as though it were an attempt to apply production A → a in a CFG. Then, in analogy with the SDTS, we would expect to find associated with that rule a string of output symbols. That is, the complete rule would be A → a, w, where w is in Δ*, Δ being the "output alphabet."

The only other GTDPL statement which might yield a translation is A → B[C, D], where A, B, C, and D are in N. We can suppose that two CFG productions are implied here, namely A → BC and A → D. If B and C succeed, we want the translation of A to involve the translations of B and C. Therefore, it appears natural to associate with rule A → B[C, D] a translation element of the form wBxCy or wCxBy, where w, x, and y are in Δ*. (If B = C, then there must be a correspondence specified between the nonterminals of the rule and those of the translation element.)

If B fails, however, we want the translation of A to involve the translation of D. Thus, a second translation element, of the form uDv, must be associated with the rule A → B[C, D]. We shall give a formal definition of such a translation-defining method and then discuss how the parsing machine can be generalized to perform such translations.
DEFINITION
component is the input string, with input head position indicated; the
second component is an output string, and the third is the outcome, either
success or failure.
(1) If A has rule A → a, y, then for all u in Σ*, A ⇒ (a↑u, y, s). If v is in Σ* but does not begin with a, then A ⇒ (↑v, e, f).
(2) If A has rule A → B[C, D], y₁Ey₂Fy₃, y₄Dy₅, where E = B and F = C or vice versa, then the following hold:
(a) If B ⇒^{n₁} (u₁↑u₂u₃, x₁, s) and C ⇒^{n₂} (u₂↑u₃, x₂, s), then

A ⇒^{n₁+n₂+1} (↑u₁u₂, e, f)

(d) If B ⇒^{n₁} (↑u, e, f) and D ⇒^{n₂} (↑u, e, f), then A ⇒^{n₁+n₂+1} (↑u, e, f).

Note that if A ⇒ (u↑v, y, f), then u = e and y = e. That is, on failure the input pointer is not moved and no translation is produced. It also should be observed that in case (2b) the translation of B is "canceled" when C fails.

We let ⇒⁺ be the union of ⇒ⁿ for n ≥ 1. The translation defined by P, denoted τ(P), is {(w, x) | S ⇒⁺ (w↑, x, s)}.
Example 9.8
Let us define a G T D P L p r o g r a m with output that performs the pig
Latin translation of Example 9.7. Here we shall use lowercase output and
the following nonterminals, whose correspondence with the previous exam-
ple is listed below. Note that X and C3 represent strings of nonterminals.
750 TRANSLATION AND CODE GENERATION CRAP. 9
W (word)
C1 (consonant)
C2 (consonants)
C3 (consonant)*
Li (letter)
Lz (letters~
V (vowel)
X (vowel)(letters)
(b) If, on the other hand, B returns failure in rule A → B[C, D], M will make the following sequence of moves:

THEOREM 9.6
Example 9.9
Let us show how a parsing machine would implement the translation of
Example 9.8. The usual format will be followed. Configurations will be
indicated, followed by the constructed tree. Moves representing recognition
by C1, V, and L 1 of consonants, vowels, and letters will not be shown.
success   a↑nd   Xp₆p₅
begin     a↑nd   L2p₇
begin     a↑nd   L1p₈L2p₈p₇
success   an↑d   L2p₈p₇
begin     an↑d   L2p₉
begin     an↑d   L1p₁₀L2p₁₀p₉
EXERCISES
(a) a * (a + a).
(b) ((a + a) * a) + a.
9.2.4. Show how a pushdown processor would translate the word abbaaaa according to the SDTS

9.2.5. Show that there is no SDTS which performs the translation of Example 9.1 on the given grammar E → a + E | a * E | a.

9.2.6. Give a formal construction for the PDT M of Theorem 9.1.

*9.2.7. Prove that every DPDT defines a postfix simple syntax-directed translation on an LR(1) grammar. Hint: Show that every DPDT can be put in a normal form analogous to that of Section 8.2.1.

*9.2.8. Show that there exist translations T = {(x$, y)} such that T is definable by a DPDT but {(x, y) | (x$, y) ∈ T} is not definable by any DPDT. Contrast this result with Exercise 2.6.20(b).
9.2.9. Give a formal proof of Theorem 9.3.
9.2.13. Show how a parsing machine with pointers would translate the word abb according to the following GTDPL program with output:
"9.2.17. Extend the notion of a processor with pointers to a graph and give
translation algorithms for SDTS's based on the following algorithms:
(a) Algorithm 4.1.
(b) Algorithm 4.2.
(c) The Cocke-Younger-Kasami algorithm.
(d) Earley's algorithm.
(e) A two-stack parser (see Section 6.2).
(f) A Floyd-Evans parser.
Research Problem
9.2.18. In implementing a translation, we are often concerned with the efficiency
with which the translation is performed. Develop optimization tech-
niques similar to those in Chapter 7 to find efficient translators for
useful classes of syntax-directed translations.
Programming Exercises
9.2.19. Construct a program that takes as input a simple SDTS on an LL(1)
grammar and produces as output a translator which implements the
given SDTS.
9.2.20. Construct a program that produces a translator which implements a
postfix simple SDTS on an LALR(1) grammar.
9.2.21. Construct a program that produces a translator which implements an
arbitrary SDTS on an LALR(1) grammar.
BIBLIOGRAPHIC NOTES
Lewis and Stearns [1968] were the first to prove that a simple SDTS with an
underlying LL(k) grammar can be implemented by a deterministic pushdown trans-
ducer. They also showed that a simple postfix SDTS on an LR(k) grammar can
be performed on a D P D T and that every D P D T translation can be effectively de-
scribed by a simple postfix SDTS on an LR(k) grammar (Exercise 9.2.7). The push-
down processor was introduced by Aho and Ullman [1969a].
In many compiler-compilers and compiler writing systems, the formalism used
to describe the object compiler is similar to a syntax-directed translation scheme.
The syntax of the language for which a compiler is being constructed is specified
in terms of a context-free grammar. Semantic routines are also specified and
associated with each production. The object compiler that is produced can be
modeled by a pushdown processor; as the object compiler parses its input, the
semantic routines are invoked to compute the output. TDPL with output is a
simplified model of the TMG compiler writing system [McClure, 1965]. McIlroy [1972] has implemented an extension of TMG to allow GTDPL-type parsing rules and simulation of bottom-up parsers. GTDPL with output is a simplified model for the META family of compiler-compilers [Schorre, 1964].
Two classes of simple SDTS's which each include the simple SDTS's on an
LL grammar and the postfix simple SDTS's on an LR grammar are mentioned by
Lewis and Stearns [1968] and Aho and Ullman [1972g]. Each of these classes is
implementable on a DPDT.
9.3. GENERALIZED TRANSLATION SCHEMES
and more useful class of translations. Here we shall adopt the point of view
that the most general translation element that can be associated with a pro-
duction can be any type of function. The main extensions are to allow several
translations at each node of the parse tree, to allow use of other than string-
valued variables, and to allow a translation at one node to depend on trans-
lations at its direct ancestor, as well as its direct descendants.
We shall also discuss the important matter of the timing of the evaluation
of translations at various nodes.
Our first extension of the SDTS will allow each node in the parse tree to
possess several string-valued translations. As in the SDTS, each translation
depends on the translations of the various direct descendants of the node in
question. However, the translation elements can be arbitrary strings of output
symbols and symbols representing the translations at the descendants. Thus,
translation symbols can be repeated.
DEFINITION
A ..... Am=/L
Example 9.10
Let T = ({S}, {a}, {a}, {S₁, S₂}, R, S) be a GSDTS, where R consists of the following rules:

S → aS,   S₁ = S₁S₂S₂a,   S₂ = S₂a
S → a,    S₁ = a,   S₂ = a

Then τ(T) = {(aⁿ, a^{n²}) | n ≥ 1}. For example, a⁴ has the parse tree of Fig. 9.13(a). The values of the two translations at each interior node are shown in Fig. 9.13(b).

For example, to calculate the value of S₁ at the root, we substitute into the expression S₁S₂S₂a the values for S₁ and S₂ at the node below the root. These values are a⁹ and a³, respectively. A proof that τ(T) maps aⁿ to a^{n²} reduces to observing that (n + 1)² = n² + 2n + 1. □
[Fig. 9.13 Generalized syntax-directed translation: (a) the parse tree for a⁴; (b) the translation values at its interior nodes, from the root downward S₁ = a¹⁶, S₂ = a⁴; S₁ = a⁹, S₂ = a³; S₁ = a⁴, S₂ = a²; S₁ = a, S₂ = a.]
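The two synthesized translations can be checked with a few lines of Python; the sketch below is not part of the formalism, it only evaluates S₁ and S₂ bottom-up for a chain of n − 1 applications of S → aS above S → a, and confirms that |S₁| = n².

def gsdts_values(n):
    # Bottom-up evaluation of the rules
    #   S -> a,   S1 = a,           S2 = a
    #   S -> aS,  S1 = S1 S2 S2 a,  S2 = S2 a
    s1, s2 = 'a', 'a'                       # values at the lowest interior node
    for _ in range(n - 1):                  # one step per use of S -> aS
        s1, s2 = s1 + s2 + s2 + 'a', s2 + 'a'
    return s1, s2

for n in (1, 2, 3, 4):
    s1, s2 = gsdts_values(n)
    print(n, len(s1), len(s2))              # prints n, n*n, n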
Example 9.11
We shall give an example of formal differentiation of expressions involv-
ing the constants 0 and 1, the variable x, and the functions sine, cosine, +, and *. The following grammar generates the expressions:

E → E + T | T
T → T * F | F
F → (E) | sin(E) | cos(E) | x | 0 | 1

E → E + T     E₁ = E₁ + T₁
              E₂ = E₂ + T₂
E → T         E₁ = T₁
              E₂ = T₂
T → T * F     T₁ = T₁ * F₁
              T₂ = T₂ * F₁ + (T₁) * F₂
T → F         T₁ = F₁
              T₂ = F₂
F → (E)       F₁ = (E₁)
              F₂ = (E₂)
F → sin(E)    F₁ = sin(E₁)
              F₂ = cos(E₁) * (E₂)
F → cos(E)    F₁ = cos(E₁)
              F₂ = −sin(E₁) * (E₂)
F → x         F₁ = x
              F₂ = 1
F → 0         F₁ = 0
              F₂ = 0
F → 1         F₁ = 1
              F₂ = 0

We leave for the Exercises a proof that if (α, β) is in τ(T), then β is the derivative of α. β may contain some redundant parentheses.
The derivation tree for sin(cos(x)) + x is given in Fig. 9.14.
The values of the translation symbols at each of the interior nodes are listed below:

n₃, n₂            x                   1
n₁₂, n₁₁, n₁₀     x                   1
n₉, n₈, n₇        cos(x)              −sin(x) * (1)
n₆, n₅, n₄        sin(cos(x))         cos(cos(x)) * (−sin(x) * (1))
n₁                sin(cos(x)) + x     cos(cos(x)) * (−sin(x) * (1)) + 1
[Fig. 9.14 The derivation tree for sin(cos(x)) + x, with interior nodes n₁ through n₁₂.]
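A direct way to check these rules is to evaluate the two translations recursively over a small expression tree. The Python sketch below is only an illustration of the scheme (the tuple-based node representation and the function name are invented here); it returns the pair (F₁, F₂) = (expression, derivative) as strings.

# A node is ('x',), ('0',), ('1',), ('sin', e), ('cos', e), ('+', e1, e2) or ('*', e1, e2).
def translate(node):
    op = node[0]
    if op == 'x':
        return 'x', '1'
    if op in ('0', '1'):
        return op, '0'
    if op == 'sin':
        e1, e2 = translate(node[1])
        return f'sin({e1})', f'cos({e1})*({e2})'
    if op == 'cos':
        e1, e2 = translate(node[1])
        return f'cos({e1})', f'-sin({e1})*({e2})'
    if op == '+':
        a1, a2 = translate(node[1]); b1, b2 = translate(node[2])
        return f'{a1}+{b1}', f'{a2}+{b2}'
    if op == '*':
        a1, a2 = translate(node[1]); b1, b2 = translate(node[2])
        return f'{a1}*{b1}', f'{a2}*{b1}+({a1})*{b2}'
    raise ValueError(op)

# sin(cos(x)) + x, as in Fig. 9.14:
print(translate(('+', ('sin', ('cos', ('x',))), ('x',))))
# ('sin(cos(x))+x', 'cos(cos(x))*(-sin(x)*(1))+1')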
Output. A dag from which we can recover the output y such that (x, y) is in τ(T).

Method. Let 𝒜 be an LR(k) parser for G. We shall construct a pushdown processor M with an endmarker, which will simulate 𝒜 and construct a dag. If 𝒜 has nonterminal A on its pushdown list, M will place below A one pointer for each translation symbol Aᵢ in Γ. Thus, corresponding to a node labeled A on the parse tree will be as many nodes of the dag as there are translations of A, i.e., symbols Aᵢ in Γ. The action of M is as follows:
(1) If 𝒜 shifts, M does the same.
(2) Suppose that 𝒜 is about to reduce according to production A → α, with translation elements A₁ = β₁, ..., Aₘ = βₘ. At this point M will have α on top of its pushdown list, and immediately below each nonterminal in α there will be pointers to each of the translations of that nonterminal. When M makes the reduction, M first creates m nodes, one for each translation symbol Aᵢ. The direct descendants of these nodes are determined by the symbols in β₁, ..., βₘ. New nodes for output symbols are created. The node for translation symbol Bₖ in Γ is the node indicated by that pointer below the nonterminal B in α which represents the kth translation of B. (As usual, if there is more than one B in α, the particular instance of B referred to will be indicated in the translation element by a superscript.) In making the reduction, M replaces α and its pointers by A and the m pointers to the translations for A. For example, suppose that 𝒜 reduces according to the underlying production of the rule

Then M would have the string pB₁pB₂B pC₁pC₂C on top of its pushdown list (C is on top), where the p's are pointers to nodes representing translations. In making the reduction, M would replace this string by pA₁pA₂A, where pA₁ and pA₂ are pointers to the first and second translations of A. After the reduction the output dag is as shown in Fig. 9.15. We assume that pXᵢ points to the node for Xᵢ.
(3) If M has reached the end of the input string and its pushdown list contains S and some pointers, then the pointer to the node for S₁ is the root of the desired output dag. □
THEOREM 9.7
If Algorithm 9.5 is applied to the dag produced by Algorithm 9.4, then
the output of Algorithm 9.5 is the translation of the input x to Algorithm 9.4.
Proof. Each node n produced by Algorithm 9.4 corresponds in an obvious way to the value of a translation symbol at a particular node of the parse tree for x. A straightforward induction on the height of a node shows that R(n) does produce the value of that translation symbol.
Example 9.12
Let us apply Algorithms 9.4 and 9.5 to the GSDTS of Example 9.10 (p. 759), with input aaaa.

The sequence of configurations of the processor is [with LR(1) tables omitted, as usual]
(1) e          aaaa$
(2) a          aaa$
(3) aa         aa$
(4) aaa        a$
(5) aaaa       $
(6) aaap₁p₂S   $
(7) aap₃p₄S    $
(8) ap₅p₆S     $
(9) p₇p₈S      $

The trees constructed after steps 6, 7, and 9 are shown in Fig. 9.16(a)-(c). Nodes on the left correspond to values of S₁ and those on the right to values of S₂.

The application of Algorithm 9.5 to the dag of Fig. 9.16(c) requires many invocations of the procedure R(n). We begin with node n₁, since that corresponds to S₁. The sequence of calls of R(n) and the generations of output a will be listed. A call of R(nᵢ) is indicated simply as nᵢ. The sequence is

n₁n₃n₅n₇an₅an₅aan₆n₅aan₆n₅aaan₄n₆n₅aaan₄n₆n₅aaaa. □
[Fig. 9.16 Dag constructed by Algorithm 9.4: (a)-(c) the structures after steps 6, 7, and 9.]
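The effect of sharing in the dag can be seen with a short Python sketch; it is not Algorithms 9.4-9.5 themselves, only an illustration (with an invented node encoding) of the key point that a node is re-traversed once per pointer to it, so a dag with O(n) nodes can yield an output of length n².

# Node i (1 <= i <= n) has two dag nodes: S1[i] with children
# (S1[i-1], S2[i-1], S2[i-1], 'a') and S2[i] with children (S2[i-1], 'a');
# at the bottom both translations are just 'a'.
def build_dag(n):
    s1, s2 = {1: ['a']}, {1: ['a']}
    for i in range(2, n + 1):
        s1[i] = [('S1', i - 1), ('S2', i - 1), ('S2', i - 1), 'a']
        s2[i] = [('S2', i - 1), 'a']
    return {'S1': s1, 'S2': s2}

def R(dag, kind, i, out):
    # Emit the frontier below dag node kind[i]; shared nodes are re-traversed,
    # once for each pointer to them, in the spirit of procedure R(n).
    for child in dag[kind][i]:
        if child == 'a':
            out.append('a')
        else:
            R(dag, child[0], child[1], out)

out = []
R(build_dag(4), 'S1', 4, out)
print(len(out), ''.join(out) == 'a' * 16)    # 16 True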
complete, the pointer will again be at the top of the pushdown list, and the
translations for all descendants of n will have been computed. (We can show
this inductively.) It is then possible to compute the translation of n exactly
as if the parse were bottom-up. We leave the details of such a generalization
for the Exercises.
We conclude this section with several examples of more general trans-
lation schemes.
Example 9.13
We shall elaborate on Section 1.2.5, wherein we spoke of the generation of code for arithmetic expressions for a single-accumulator random access machine. Specifically, we assume that the assembly language instructions

ADD α
MPY α
LOAD α
STORE α
†If we were working from the syntax tree, rather than the parse tree, reductions by E → T, T → F, and F → (E) would not appear. We would then have no need to implement the trivial translation rules associated with these productions.
(4) T → F      T₁ = F₁
               T₂ = F₂
(5) F → (E)    F₁ = E₁
               F₂ = E₂
Example 9.14
The code produced by Example 9.13 is by no means optimal. A con-
siderable improvement can be made by observing that if the right operand is
a single identifier, we need not load it and store it in a temporary. We shall
therefore add a third translation for E, T, and F, which is a Boolean variable
[Figure: a parse tree with the translation values attached to its nodes, e.g. E₁ = LOAD a₂; STORE $1; LOAD a₁; ADD $1 with E₂ = 2 at an interior node, and F₁ = LOAD aᵢ, F₂ = 1 at the leaves.]
with value true if and only if the expression dominated by the node is a single
identifier.
The first translation of E, T, and F is again code to compute the expres-
sion. However, if the expression is a single identifier, then this translation is
only the " N A M E " of that identifier. Thus, the translation scheme does not
"work" for single identifiers. This should cause little trouble, since the expres-
sion grammar is presumably part of a grammar for assignment and the
translation for assignments such as A ~ B can be handled at a higher level.
The new rules are the following:
                   else E₁ ';ADD' T₁
                 else if E₃ then T₁ ';STORE $1 ;LOAD' E₁ ';ADD $1'
                 else T₁ ';STORE $' E₂ ';' E₁ ';ADD $' E₂
              E₂ = max(E₂, T₂) + 1
              E₃ = false

(2) E → T     E₁ = T₁
              E₂ = T₂
              E₃ = T₃

(3) T → T * F T₁ = if F₃ then
                   if T₃ then 'LOAD' T₁ ';MPY' F₁
                   else T₁ ';MPY' F₁
                 else if T₃ then F₁ ';STORE $1 ;LOAD' T₁ ';MPY $1'
                 else F₁ ';STORE $' T₂ ';' T₁ ';MPY $' T₂
              T₂ = max(T₂, F₂) + 1
              T₃ = false

(4) T → F     T₁ = F₁
              T₂ = F₂
              T₃ = F₃

(5) F → (E)   F₁ = E₁
              F₂ = E₂
              F₃ = E₃

(6) F → a     F₁ = NAME(a)
              F₂ = 1
              F₃ = true
In rule (1), the formula for E₁ checks whether either or both arguments are single identifiers. If the right-hand argument is a single identifier, then the code generated causes the left-hand argument to be computed and the right-hand argument to be added to the accumulator. If the left-hand argument is a single identifier, then the code generated for this argument is the

Note that these rules do not assume that + and * are commutative. □
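The three translations of Example 9.14 are easy to prototype as a recursive function over a syntax tree. The Python sketch below is an approximation written for illustration; it follows the general idea of rules (1)-(6) rather than reproducing them verbatim, and the node encoding is invented here. It returns the triple (code, level, is_identifier).

# A node is ('id', name) or (op, left, right) with op '+' or '*'.
def gen(node):
    if node[0] == 'id':
        return node[1], 1, True              # F1 = NAME(a), F2 = 1, F3 = true
    op, left, right = node
    c1, l1, id1 = gen(left)
    c2, l2, id2 = gen(right)
    instr = 'ADD' if op == '+' else 'MPY'
    if id2:                                  # the right operand is a single identifier
        code = f'LOAD {c1};{instr} {c2}' if id1 else f'{c1};{instr} {c2}'
    elif id1:                                # only the left operand is an identifier
        code = f'{c2};STORE $1;LOAD {c1};{instr} $1'
    else:                                    # neither is: park the right operand in a temporary
        code = f'{c2};STORE ${l1};{c1};{instr} ${l1}'
    return code, max(l1, l2) + 1, False

code, level, _ = gen(('+', ('id', 'a1'), ('*', ('id', 'a2'), ('id', 'a3'))))
print(code)     # LOAD a2;MPY a3;STORE $1;LOAD a1;ADD $1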
Our next example again deals with arithmetic expressions. It shows that if three-address code is chosen as the intermediate language, we can write what is essentially a simple SDTS, implementable by a deterministic pushdown transducer that holds some extra information in the cells of its pushdown list.
Example 9.15
Let us translate L(G₀) to a sequence of three-address statements of the form A ← + BC and A ← * BC, meaning that A is to be assigned the sum or product, respectively, of B and C. In this example A will be a string of the form $i, where i is an integer. The principal translations, E₁, T₁, and F₁, will be a sequence of three-address statements which evaluate the expression dominated by the node in question; E₂, T₂, and F₂ are integers indicating levels, as in the previous examples. E₃, T₃, and F₃ will be the name of a variable which has been assigned the value of the expression by the aforementioned code. This name is a program variable in the case that the expression is a single identifier and a temporary name otherwise. The following is the translation scheme:

E → T         E₁ = T₁
              E₂ = T₂
              E₃ = T₃

T → T * F     T₁ = T₁ F₁ '$' max(T₂, F₂) '← *' T₃ F₃ ';'
              T₂ = max(T₂, F₂) + 1
              T₃ = '$' max(T₂, F₂)

T → F         T₁ = F₁
              T₂ = F₂
              T₃ = F₃
F → (E)       F₁ = E₁
              F₂ = E₂
              F₃ = E₃

F → a         F₁ = e
              F₂ = 1
              F₃ = NAME(a)
We leave it to the reader to observe that the rules for E₁, T₁, and F₁ form a postfix simple SDTS if we assume that the values of the second and third translations of E, T, and F are output symbols. A practical method of implementation is to parse G₀ in an LR(1) manner by a DPDT which keeps the values of the second and third translations on its stack. That is, each pushdown cell holding E will also hold the values of E₂ and E₃ for the associated node of the parse tree (and similarly for cells holding T and F).

The translation is implemented by emitting

'$' max(E₂, T₂) '← +' E₃ T₃ ';'

every time a reduction of E + T to E is called for, where E₂, E₃, T₂, and T₃ are the values attached to the pushdown cells involved in the reduction. Reductions by T → T * F are treated analogously, and nothing is emitted when other reductions are made.

We should observe that since the second and third translations of E, T, and F can assume an infinity of values, the device doing the translation is not, strictly speaking, a DPDT. However, the extension is easy to implement in practice on a random access computer. □
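The same idea can be prototyped with an explicit stack holding only a (level, name) pair per subexpression. The Python sketch below is an illustration with invented names, not the DPDT itself: it emits a three-address statement at each '+' or '*' reduction, and the postfix-ordered input stands in for the sequence of LR reductions described above.

def three_address(postfix):
    stack, code = [], []              # stack holds (level, name) per subexpression
    for sym in postfix:
        if sym in ('+', '*'):
            l2, n2 = stack.pop()      # the T (or F) operand
            l1, n1 = stack.pop()      # the E (or T) operand
            temp = max(l1, l2)
            code.append(f'${temp} <- {sym} {n1} {n2};')
            stack.append((temp + 1, f'${temp}'))
        else:
            stack.append((1, sym))    # a single identifier: level 1, its own name
    return code

for line in three_address(['a1', 'a2', 'a3', '*', '+']):
    print(line)
# $1 <- * a2 a3;
# $2 <- + a1 $1;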
Example 9.16
We shall generate assembly code for control statements of the if-then-else form. We presume that the nonterminal S stands for a statement and that one of its productions is S → if B then S else S. We suppose that S has a translation S₁ which is code to execute that statement. Thus, if a production for S other than the one shown is used, we can presume that S₁ is correctly computed.

Let us assume that B stands for a Boolean expression and that it has two translations, B₁ and B₂, which are computed by other rules of the translation
EQUAL α, β

It generates no code but causes the assembler to treat the locations named α and β as the same location.

The EQUAL instruction is needed because the two instances of S on the right-hand side of the production S → if B then S else S each have a name for the instruction they expect to execute next. We must make sure that a location allotted for one serves for the other as well.
The translation elements for the production mentioned are

S → if B then S⁽¹⁾ else S⁽²⁾    S₁ = 'EQUAL' S₂⁽¹⁾ ',' S₂⁽²⁾ ';' ⋯ S₁⁽¹⁾ ';JUMP' ⋯

Thus, whether B is true or false, the location S₂⁽¹⁾ (which now equals S₂⁽²⁾) will be reached.
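The label plumbing involved here can be illustrated with a short sketch. The Python below is not the book's translation elements (those use the EQUAL pseudo-operation and the B₁/B₂ conventions); it is a common label-based variant shown only for illustration: the code for B is assumed to end so that JFALSE transfers to the else-part when the condition is false, and the then-part ends with a jump to a common exit label.

from itertools import count

_labels = count(1)
def new_label():
    return f'L{next(_labels)}'

def gen_if(test_code, then_code, else_code):
    # test_code, then_code, else_code are lists of assembly lines (placeholders here).
    l_else, l_exit = new_label(), new_label()
    return (test_code + [f'JFALSE {l_else}']
            + then_code + [f'JUMP {l_exit}', f'{l_else}:']
            + else_code + [f'{l_exit}:'])

for line in gen_if(['<code for B>'], ['<code for S1>'], ['<code for S2>']):
    print(line)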
Let us consider the nested statement
Example 9.17
As the last example in this section, we consider generating object code
for a call of the form
S ~ call a A
776 TRANSLATION A N D CODE GENERATION CHAP. 9
A: >el(L)
L >E, LIE
That is, a statement can be the keyword call followed by an identifier and
an argument list (A). The argument list can be the empty string or a paren-
thesized list (L) of expressions. We assume that each expression E has two
translations E1 and E2, where the translation E1 is object code that leaves
the value of E in the accumulator and the translation E2 is the name of
a temporary location for storing the value of E. The names for the tempo-
raries are created by the function NEWLABEL. The following rules perform
the desired translation:
             A2 = e
L → E, L     L1 = E1 ';STORE' E2 ';' L1
             L2 = 'ARG' E2 ';' L2
L → E        L1 = E1 ';STORE' E2
             L2 = 'ARG' E2
For example, the statement call AB(E^(1), E^(2)) would, if the temporaries
for E^(1) and E^(2) were $$1 and $$2, respectively, be compiled into
of the values of the translations at its direct ancestor as well as its direct
descendants. We make the following definition.
DEFINITION
Example 9.18
Let us consider the translation rule
A1^(1) = a A1^(2) A1^(3) b
A1^(3) = a A1^(1)
Example 9.19
Let us compile code for arithmetic expressions to be executed on a
machine with two fast registers, denoted the A and B registers. The relevant
instructions are
LOADA α
LOADB α
STOREA α
STOREB α
ADDA α
ADDB α
MPY α
ATOB
The meaning of the first six instructions should be obvious. We can load,
store, or perform addition in either register. We presume that the MPY
instruction takes its left argument in the B register and leaves the result in
the A register, as is the case for floating-point arithmetic on some computers.
The last instruction, ATOB, with no argument, transfers the contents of
the A register to the B register.
We shall build our translation on G0, exactly as we did in Example 9.13.
The translations E1, T1, and F1 will represent code to compute the value of
the associated input expression, sometimes leaving the result in the A register
and sometimes in B. However, the code for E1 at the root of the parse tree
for the input expression will always leave the value of the expression in the A
register. E2, T2, and F2 are integers which measure the height of the node,
as in Example 9.13. There will be translations E3, T3, and F3 which are
Boolean and have the value true if and only if the value of the expression is
to be left in the A register. These last three translations are all inherited,
while the first six are synthesized. We dispense with the code-improving
feature of Example 9.14. The rules of the translation scheme are
E^(1) → E^(2) + T    E1^(1) = if E3^(1) then
                                 T1 ';STOREA $' E2^(2) ';' E1^(2) ';ADDA $' E2^(2)
                              else
                                 T1 ';STOREA $' E2^(2) ';' E1^(2) ';ADDB $' E2^(2)
                     E2^(1) = max(E2^(2), T2) + 1
                     E3^(2) = E3^(1)†
                     T3 = true

E → T                E1 = T1
                     E2 = T2
                     T3 = E3

T^(1) → T^(2) * F    T1^(1) = if T3^(1) then ...
T → F                ...
                     F3 = T3

F → (E)              F1 = E1
                     F2 = E2
                     E3 = F3

F → a                F1 = if F3 then 'LOADA' NAME(a)
                          else 'LOADB' NAME(a)
                     F2 = 1

†We assume that all Boolean translations initially have the value true. Thus, if E3^(1)
refers to the root, it is already defined.
[Figure: the parse tree for the input expression, with nodes n1 through n5.]

Translation   Node   Value
T3            n5     true
E3            n3     false
F3            n1     false
F3            n2     true
F1            n2     LOADA a2
F2            n2     1
F1            n1     LOADB a1
F2            n1     1
E1            n3     LOADA a2; STOREA $1; LOADB a1; ADDB $1
E2            n3     2
F3            n4     true
F1            n4     LOADA a3
F2            n4     1
T1            n5     LOADA a3; STOREA $2; LOADA a2; STOREA $1; LOADB a1; ADDB $1; MPY $2
T2            n5     3

□
9.3.4. A Word About Timing
EXERCISES
9.3.4. Show that the translation {(a^n, a^m) | m is the integer part of √n} cannot
be defined by any GSDTS.
9.3.5. For Exercise 9.3.1(a)-(c), give the dags produced by Algorithm 9.4
with inputs a^3, a^4, and 011, respectively.
9.3.6. Give the sequence of nodes visited by Algorithm 9.5 when applied to
the three dags of Exercise 9.3.5.
*9.3.7. Embellish Example 9.16 to include the production S → while B do S,
with the intended meaning that the expression B is tested and then the
statement S executed, alternately, until the value of B becomes
false.
9.3.8. The following grammar generates PL/I-like declarations:
D → ( L ) M
L → a, L | D, L | a | D
M → m1 | m2 | ... | mk
numbers (possibly with a binary point). L stands for a list of bits and
B for bit. The translation elements are arithmetic formulas. The trans-
lation element N1 represents a rational number, the value of the binary
number derived by the nonterminal N. The translation elements L1, L2,
and B1 take integer values. For example, 11.01 has the translation 3¼.
Show that τ(T) = {(b, d) | b is a binary number and d is the value of b}.
"9.3.11. Consider the following translation scheme with the same underlying
grammar as in Exercise 9.3.10 but involving both synthesized and
inherited attributes:
B2 = L2^(1)
L2^(2) = L2^(1) + 1
The parse tree for 11.01 together with the values of the translation
elements associated with each node is shown in Fig. 9.20. Note that
to compute the translation element N1 we must first compute the L3's
to the right of the radix point bottom-up, then the L2's top-down, and
finally the L1's bottom-up. Show that this translation scheme defines
the same translation as the scheme in Exercise 9.3.10.
"9.3.12. Show that any translation that can be performed using inherited and
synthesized translations can be performed using synthesized translations
only. Hint: There is no restriction on the structure of a translation.
Thus, one translation defined at a node can be the entire subtree that
it dominates.
9.3.13. Can every translation using synthesized translations be performed using
inherited translations only?
**9.3.14. Give an algorithm to test whether a given translation scheme involving
inherited and synthesized attributes is circular.
[Fig. 9.20: the parse tree for 11.01 with the values of the translation elements B1, B2, L1, L2, and L3 at each node.]
A → I := E
E → E <adop> T | T
T → T <mulop> F | F
F → ( E ) | I
I → a | a ( L )
L → E, L | E
<adop> → + | −
<mulop> → * | /
Research Problem
9.3.20. Translation of arithmetic expressions can become quite complicated if
operands can be of many different data types. For example, we could
be dealing with identifiers that could be Boolean, string, integer, real,
or complex--the last three in single, double, or perhaps higher preci-
sions. Moreover, some identifiers could be in dynamically allocated
storage, while others are statically allocated. The number of c9mbina-
tions can easily be large enough to make the translation elements asso-
ciated with a production such as E ~ E + T quite cumbersome. Given
a translation which spells out the desired code for each case, can you
develop an automatic way of simplifying the notation ? For instance,
in Example 9.19, the then and else portions of the translation of E~ 1~
differ only in the single character A or B at the end of ADD. Thus,
almost a factor of 2 in space could be saved if a more versatile defining
mechanism were used.
Programming Exercise
"9.3.21. Construct a translation scheme that maps a subset of F O R T R A N into
intermediate language as in Exercise 9.1.10. Write a program to imple-
ment this translation scheme. Implement the code generator designed
in Exercise 9.1.11. Combine these programs with a lexical analyzer to
BIBLIOGRAPHIC NOTES
10.1. SYMBOL TABLES
The term symbol table is given to a table which stores names and infor-
mation associated with these names. Symbol tables are an integral feature of
virtually all compilers. A symbol table is pictured in Fig. 10.1. The entries
in the name field are usually identifiers. If names can be of different lengths,
then it is more convenient for the entry in the name field to be a pointer to
a storage area in which the names are actually stored.
The entries in the data field, sometimes called descriptors, provide infor-
mation that has been collected about each name. In some situations a dozen
or more pieces of information are associated with a given name. For example,
we might need to know the data type (real, integer, string, and so forth) of
an identifier; whether it was perhaps a label, a procedure name, or a formal
parameter of a procedure; whether it was to be given statically or dynamically
allocated storage; or whether it was an identifier with structure (e.g., an
array), and if so, what the structure was (e.g., the dimensions of an array).
If the number of pieces of information associated with a given name is vari-
[Fig. 10.1: a symbol table with a NAME field and a DATA field; for example, the name I with data INTEGER and the name LOOP with data LABEL.]
able, then it is again convenient to store a pointer in the data field to this
information.
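A minimal sketch of such a table in Python (not from the book; the attribute names used here, such as type and kind, are purely illustrative):

class SymbolTable:
    def __init__(self):
        self.entries = {}          # name -> descriptor (the data field)

    def insert(self, name, **attributes):
        # Create the entry on first sight; later passes may add more attributes.
        self.entries.setdefault(name, {}).update(attributes)

    def lookup(self, name):
        return self.entries.get(name)   # None if the name has no entry yet

table = SymbolTable()
table.insert("I", type="integer")
table.insert("LOOP", kind="label")
print(table.lookup("LOOP"))            # {'kind': 'label'}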
For example, if a statement such as

GOTO LOOP

is found in the source program, then the compiler must check that the identi-
fier LOOP appears as the label of an appropriate statement in the program.
This information will be found in the symbol table (although not necessarily
at the time at which the goto statement is processed). The second use of
the information in the symbol table is in generating code. For example, if
we have a FORTRAN statement of the form

A = B + C
in the source program, then the code that is generated for the operator +
depends on the attributes of identifiers B and C (e.g., are they fixed- or
floating-point, in or out of "common," and so forth).
The lexical analyzer enters names and information into the symbol table.
For example, whenever the lexical analyzer discovers an identifier, it consults
the symbol table to see whether this token has previously been used. If not,
the lexical analyzer inserts the name of this identifier into the symbol table
along with any associated information. If the identifier is already present in
the symbol table at some location l, then the lexical analyzer produces the
token (<identifier>, l) as output.
Thus, every time the lexical analyzer finds a token, it consults the symbol
table. Therefore, to design an efficient compiler, we must, given an instance
of an identifier, be able to rapidly determine whether or not a location in
the symbol table has been reserved for that identifier. If no such entry exists,
we must then be able to insert the identifier quickly into the table.
Example 10.1
Let us suppose that we are compiling a FORTRAN-like language and
wish to use a single token type <identifier> for all variable names. When
the (direct) lexical analyzer first encounters an identifier, it could enter into
a symbol table information as to whether this identifier was fixed- or floating-
point. The lexical analyzer obtains the information by observing the first
letter of the identifier. Of course, a previous declaration of the identifier to be
a function or subroutine or not to obey the usual fixed-floating convention
would already appear in the symbol table and would overrule the attempt
by the lexical analyzer to store its information. □
Example 10.2
Let us suppose that we are compiling a language in which array declara-
tions are defined by the following productions:

array <array list>
<identifier> | ( <integer> ) <array definition>
The ways in which the information stored in the symbol table is used are
numerous. As a simple example, every subexpression in an arithmetic expres-
sion may need mode information, so that the arithmetic operators can be
interpreted as fixed, floating, complex, and so forth. This information is
collected from the symbol table for those leaves which have identifier or
constant labels and is passed up the tree by rules such as fixed + floating =
floating and floating + complex = complex. Alternatively, the language, and
hence the compiler, may prohibit mixed mode expressions altogether (e.g.,
as in some versions of FORTRAN).
then the direct access table provides a very fast mechanism for storing and
retrieving information about items. However, we would quickly discard the
idea of using a direct access table for most symbol table applications, since
the size of the table would be prohibitive and most of it would never be used.
For example, the number of FORTRAN identifiers (a letter followed by up
to five letters or digits) is about 1.33 × 10^9.
Another possible method of storage is to use a pushdown list. If a new
item is encountered, its name and a pointer to information concerning that
item are pushed onto the pushdown list. Here, the size of the table is propor-
tional to the number of items actually encountered, and new items can be
inserted very quickly. However, the retrieval of information about an item
requires that we search the list until the item is found. Thus, retrieval on
the average requires time proportional to the number of items on the list.
This technique is often adequate for small lists. In addition, it has advantages
when a block-structured language is being compiled, as a new declaration
of a variable can be pushed on top of an old one. When the block ends, all
its declarations are popped off the list and the old declarations of the vari-
ables are still there.
A third method, which is faster than the pushdown list, is to use a binary
search tree. In a binary search tree each node can have a left direct descendant
and a right direct descendant. We assume that data items can be linearly
ordered by some relation < , e.g., the relation "precedes in alphabetical
order." Items are stored as the labels of the nodes of the tree. When the first
item, say ~1, is encountered, a root is created and labeled ~1. If ct2 is the next
item and ~2 < ~1, then a leaf labeled ~2 is added to the tree and this leaf is
made the left direct descendant of the root. (If 0cl < ~z, then this leaf would
have been made the right direct descendant.) Each new item causes a new
leaf to be added to the tree in such a position that at all times the tree will
have the following property. Suppose that N is any node in the tree and that
N is labeled ft. If node N has a left subtree containing a node labeled ~, then
< ft. If node N has a right subtree with a node labeled 7, then fl < 7.
The following algorithm can be used to insert items into a binary search
tree.
ALGORITHM 10.1
Insertion of items into a binary search tree.
Input. A sequence α1, ..., αn of items from a set of items A with a linear
order < on A.
Output. A binary tree whose nodes are each labeled by one of α1, ..., αn,
with the property that if node N is labeled α and some descendant N' of N
is labeled β, then β < α if and only if N' is in the left subtree of N.
Method.
(1) Create a single node (the root) and label it α1.
(2) Suppose that α1, ..., αi−1 have been placed in the tree, i > 0. If
i = n + 1, halt. Otherwise, insert αi into the tree by executing step (3) begin-
ning at the root.
(3) Let this step be executed at node N with label β.
(a) If αi < β and N has a left direct descendant, Nl, execute step (3)
at Nl. If N has no left direct descendant, create such a node and
label it αi. Return to step (2).
(b) If β < αi and N has a right direct descendant, Nr, execute step (3)
at Nr. If N has no right direct descendant, create such a node
and label it αi. Return to step (2). □
The method of retrieval of items is essentially the same as the method for
insertion, except that one must check at each node encountered in step (3)
whether the label of that node is the desired item.
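For concreteness, here is a small Python sketch of Algorithm 10.1 together with the retrieval walk just described; the class and function names are ours, not the book's.

class Node:
    def __init__(self, item):
        self.item = item
        self.left = None      # items smaller than self.item go here
        self.right = None     # items larger than self.item go here

def insert(root, item):
    # Insert item into the binary search tree rooted at root; return the root.
    if root is None:
        return Node(item)
    n = root
    while True:
        if item < n.item:
            if n.left is None:
                n.left = Node(item)
                return root
            n = n.left
        else:
            if n.right is None:
                n.right = Node(item)
                return root
            n = n.right

def retrieve(root, item):
    # Same walk as insertion, but stop when the label matches.
    n = root
    while n is not None and n.item != item:
        n = n.left if item < n.item else n.right
    return n

root = None
for x in ["XY", "M", "QB", "ACE", "OP"]:   # the sequence of Example 10.3
    root = insert(root, x)
print(retrieve(root, "QB") is not None)    # True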
Example 10.3
Let the sequence of items input to Algorithm 10.1 be XY, M, QB, ACE,
and OP. We assume that the ordering is alphabetic. The tree constructed is
shown in Fig. 10.3. □
It can be shown that, after n items have been placed in a binary search
tree, the expected number of nodes which must be searched to retrieve one
of them is proportional to log n. This cost is acceptable, although hash
tables, which we shall discuss next, give a faster expected retrieval time.
The most efficient and commonly used method for the bookkeeping neces-
sary in a compiler is the hash table. A hash storage symbol table is shown
schematically in Fig. 10.4.
[Fig. 10.4: a hash storage symbol table. The hashing function h maps an item α to an address h(α) in a table of n locations (0 through n − 1), each holding a NAME field and a POINTER field leading to the information about α.]
a pointer field. Initially, each entry in the hash table is assumed to be empty.†
If an item α has been encountered, then some location in the hash table,
usually h(α), contains an entry whose name field contains α (or possibly
a pointer to a location in a name table in which α is stored) and whose pointer
field holds a pointer to a block in the data storage table containing the infor-
mation associated with α.
The data storage table need not be physically distinct from the hash
table. For example, if k words of information are needed for each item,
then it is possible to use a hash table of size kn. Each item stored in the hash
table would occupy a block of k consecutive words of storage. The appro-
priate location in the hash table for an item α can then readily be found by
multiplying h(α), the hash address for α, by k and using the resulting address
as the location of the first word in the block of words for α.
The hashing function h is actually a list of functions h0, h1, ..., hm, each
from the set of items to the set of integers {0, 1, ..., n − 1}. We shall call
h0 the primary hashing function. When a new item α is encountered, we can
use the following algorithm to compute h(α), the hash address of α. If α has
been previously encountered, h(α) is the location in the hash table at which
α is stored. If α has not been encountered, then h(α) is an empty location
into which α can be stored.
ALGORITHM 10.2
Computation of a hash table address.
Input. An item α, a hashing function h consisting of a sequence of functions
h0, h1, ..., hm, each from the set of items to the set of integers {0, 1, ..., n − 1},
and a (not necessarily empty) hash table with n locations.
Output. The hash address h(α) and an indication of whether α has been
previously encountered. If α has already been entered into the hash table,
h(α) is the location assigned to α. If α has not been encountered, h(α) is
an empty location into which α is to be stored.
Method.
(1) We compute h0(α), h1(α), ..., hm(α) in order using step (2) until no
"collision" occurs. If hm(α) produces a collision, we terminate this algorithm
with a failure indication.
(2) Compute hi(α) and do the following:
(a) If location hi(α) in the hash table is empty, let h(α) = hi(α),
report that α has not been encountered, and halt.
(b) If location hi(α) is not empty, check the name entry of this
location.† If the name is α, let h(α) = hi(α), report that α has already
been entered, and halt. If the name is not α, a collision occurs
and we repeat step (2) to compute the next alternate address.

†Sometimes it is convenient to put the reserved words and standard functions in the
symbol table at the start.
Each time a location hi(α) is examined, we say that a probe of the table
is made.
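The probe loop of Algorithm 10.2 can be sketched in Python as follows; the primary function based on character codes and the linear alternate functions are illustrative choices only, not the book's.

def hash_address(item, hashes, table):
    # Try h0, h1, ... in order; report (location, already_present).
    for h in hashes:
        loc = h(item)
        if table[loc] is None:        # empty: item not previously encountered
            return loc, False
        if table[loc][0] == item:     # name matches: item already entered
            return loc, True
        # otherwise a collision occurred; try the next alternate function
    raise RuntimeError("all alternate addresses collided")

n = 10
hashes = [lambda s, i=i: (sum(map(ord, s)) + i) % n for i in range(n)]
table = [None] * n
for name in ["A", "W", "EF"]:
    loc, found = hash_address(name, hashes, table)
    if not found:
        table[loc] = (name, {"info": None})   # seize the slot and attach data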
When the hash table is sparsely filled, collisions are rare, and for a new
item α, h(α) can be computed very quickly, usually just by evaluating the
primary hashing function, h0(α). However, as the table fills, it becomes
increasingly likely that for each new item α, h0(α) will already contain another
item. Thus, collisions become more frequent as more items are inserted into
the table, and thus the number of probes required to determine h(α) increases.
However, it is possible to design hash tables whose overall performance is
much superior to binary search trees.
Ideally, for each distinct item encountered we would like the primary
hashing function h0 to yield a distinct location in the hash table. This, of
course, is not generally feasible because the total number of possible items
is usually much larger than n, the number of locations in the table. In
practice, n will be somewhat larger than the number of distinct items ex-
pected. However, some course of action must be planned in case the table
overflows.
To store information about an item α, we first compute h(α). If α has not
been previously encountered, we store the name α in the name field of loca-
tion h(α). [If we are using a separate name table, we store α in the next empty
location in the name table and put a pointer to this location in the name
field of location h(α).] Then we seize the next available block of storage in
the data storage table and put a pointer to this block in the pointer field of
location h(α). We can then insert the information in this block of the data
storage table.
Likewise, to fetch information about an item α, we can compute h(α),
if it exists, by Algorithm 10.2. We can then use the pointer in the pointer field
to locate the information in the data storage table associated with item α.
Example 10.4
Let us choose n = 10 and let an item consist of any string of capital
R o m a n letters. We define CODE(a), where a is a string of letters to be the
sum of the "numerical value" of each letter in a, where A has numerical
value of 1, B has value 2, and so forth. Let us define hj(a), for 0 < j _~ 9,
?If the name entry contains a pointer to a name table, we need to consult the name
table to determine the actual name.
[Fig. 10.5: a hash table of 10 locations (0 through 9) holding the items A, W, and EF; each occupied location's name field points to its data ("data for A," "data for W," "data for EF").]
Here we search forward from the primary location h0(α) until no collision
occurs. If we reach location n − 1, we proceed to location 0. This method
is simple to implement, but clusters tend to occur once several collisions are
encountered. For example, given that h0(α) produces a collision, the prob-
ability that h1(α) will also produce a collision is greater than average.
A more efficient method of generating alternate addresses is to use

hi(α) = [h0(α) + ai² + bi] mod n

where a and b are suitably chosen constants.
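A sketch of this idea in Python, with the constants a and b and the primary function chosen arbitrarily for illustration:

def make_probe_sequence(h0, n, a=1, b=1):
    # h_i(x) = (h0(x) + a*i*i + b*i) mod n for i = 0, 1, ..., n-1
    def probes(item):
        base = h0(item)
        return [(base + a * i * i + b * i) % n for i in range(n)]
    return probes

n = 10
h0 = lambda s: sum(map(ord, s)) % n
print(make_probe_sequence(h0, n)("LOOP")[:5])   # the first five probe locations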
A somewhat different method of resolving collisions, called chaining, is
discussed in the Exercises.
(10.1.2)    p(i1 ... ik) = Σ_{i not among i1, ..., ik} p(i1 ... ik i)    if k < n
Example 10.5
Let n = 3 and let the probabilities of the six permutations be
Permutation     Probability
[0, 1, 2]           .1
[0, 2, 1]           .2
[1, 0, 2]           .1
[1, 2, 0]           .3
[2, 0, 1]           .2
[2, 1, 0]           .1
Similarly, we obtain
where the rightmost sum is taken over all sets S of k locations which do not
contain i.
If h0(α) is in S but h1(α) is not in S, then we shall succeed on the second try.
Therefore, the probability that we fail on the first try but succeed on the
second is given by

Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} p(ij) Σ_S p(S)

where the rightmost sum is taken over all sets S such that #S = k, i ∈ S,
and j ∉ S. Note that p(ij) = 0 if i = j.
Proceeding in this manner, we arrive at the following formula for E(k, n),
the expected number of probes required to insert an item into a table in which
k of the n locations are filled:

(10.1.5)    E(k, n) = Σ_{m=1}^{k+1} m Σ_w p(w) Σ_S p(S)

where
(1) The middle summation is taken over all w which are strings of distinct
locations of length m and
(2) The rightmost sum is taken over all sets S of k locations such that all
but the last symbol of w is in S. (The last symbol of w is not in S.)
The first summation assumes that m steps are required to compute the
primary location and its first m -- 1 alternates. Note that if k < n, an empty
location will always be found after at most k + 1 tries.
Example 10.6
Let us use the statistics of Example 10.5 to compute E(2, 3). Equation
(10.1.5) gives

E(2, 3) = Σ_{m=1}^{3} m Σ_w p(w) Σ_S p(S)

where the rightmost sum is over all S such that #S = 2 and all but
the last symbol of w is in S.
Another figure of merit used to evaluate hashing systems is R(k, n), the
expected number of probes required to retrieve an item from a table in which
k out of n locations are filled. However, this figure of merit can be readily
computed from E(k, n). We can assume that each of the k items in the table
is equally likely to be retrieved. Thus, the expected retrieval time is equal to
the average number of probes that were required to originally insert these
k items into the table. That is,
R(k, n) = (1/k) Σ_{i=0}^{k−1} E(i, n)
For this reason, we shall consider E(k, n) as the exclusive figure of merit.
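In code, the relation can be expressed as follows (a sketch; E here stands for whatever insertion-cost function applies to the hashing system under study):

def retrieval_cost(E, k, n):
    # R(k, n) = (1/k) * sum_{i=0}^{k-1} E(i, n): the items now in the table were
    # inserted into tables with 0, 1, ..., k-1 filled locations, respectively.
    return sum(E(i, n) for i in range(k)) / k

# For instance, with a hypothetical constant two-probe insertion cost:
print(retrieval_cost(lambda i, n: 2.0, 100, 1000))   # 2.0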
A natural conjecture is that performance is best when a hashing system
is random, on the grounds that any nonrandomness can only make certain
locations more likely to be filled than others and that these are exactly the
locations more likely to be examined when we attempt to insert new items.
While this will be seen not to be precisely true, the exact optimum is not
known. Random hashing is conjectured to be optimum in the sense of mini-
mum retrieval time, and other common hashing systems do not compare
favorably with a random hashing system. We shall therefore calculate E(k, n)
for a random hashing system.
LEMMA 10.1
If a hashing system is random, then
(1) For all sequences w of locations such that 1 ≤ |w| ≤ n,

p(w) = (n − |w|)!/n!

(2) For all subsets S of {0, 1, ..., n − 1},

p(S) = 1/(n choose #S)

Proof.
(1) Using (10.1.2), this is an elementary induction on (n − |w|), starting
at |w| = n and ending at |w| = 1.
(2) A simple argument of symmetry assures us that p(S) is the same for
all S of size k. Since the number of sets of size k is (n choose k), part (2) is immedi-
ate. □
LEMMA 10.2
If n ≥ k, then Σ_{j=0}^{k} (n − j choose k − j) = (n + 1 choose k).
THEOREM 10.1
If a hashing system is random, then E(k, n) = (n + 1)/(n + 1 -- k).
Proof. Let us suppose that we have a hash table with k out of n locations
filled. We wish to insert the k + 1st item α. It follows from Lemma 10.1(2)
that every set of k locations has the same probability of being filled. Thus
E(k, n) is independent of which k locations are actually filled. We can there-
fore assume without loss of generality that locations 0, 1, 2, ..., k − 1
are filled.
To determine the expected number of probes required to insert α, we
examine the sequence of addresses obtained by applying h to α. Let this
sequence be h0(α), h1(α), ..., h_{n−1}(α). By definition, all such sequences of
n locations are equally probable.
But E(k, n) = Σ_{j≥1} q_j, where q_j is just the probability that the first j − 1
locations in the sequence h0(α), h1(α), ..., h_{n−1}(α) are between 0 and k − 1,
i.e., that at least j probes are required to insert the k + 1st item. By Lemma 10.1(1),
this quantity is

k(k − 1) ... (k − j + 2) / [n(n − 1) ... (n − j + 2)] = [k!(n − j + 1)!] / [(k − j + 1)! n!] = (n − j + 1 choose k − j + 1) / (n choose k)

Then, by Lemma 10.2,

E(k, n) = Σ_{j=1}^{k+1} (n − j + 1 choose k − j + 1) / (n choose k) = (n + 1 choose k) / (n choose k) = (n + 1)/(n + 1 − k) □
We observe from Theorem 10.1 that for large n and k, the expected time
to insert an item depends only on the ratio of k and n and is approximately
1/(1 − ρ), where ρ = k/n. This function is plotted in Fig. 10.6.
The ratio k/n is termed the load factor. When the load factor is small,
the insertion time increases with k, the number of filled locations, at a slower
rate than log k, and hashing is thus superior to a binary search. Of course,
if k approaches n, that is, as the table gets filled, insertion becomes very
expensive, and at k = n, further insertion is impossible unless some mecha-
nism is provided to handle overflows. One method of dealing with overflows
is suggested in Exercises 10.1.11 and 10.1.12.
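A small Python check of Theorem 10.1's formula against the load-factor approximation (the table size n = 1000 is an arbitrary choice):

def E(k, n):
    # Theorem 10.1: expected probes to insert into a random-hashing table
    # with k of its n locations filled.
    return (n + 1) / (n + 1 - k)

n = 1000
for rho in (0.2, 0.5, 0.8, 0.9):
    k = int(rho * n)
    print(rho, round(E(k, n), 2), round(1 / (1 - rho), 2))
# At load factor 0.5 an insertion costs about 2 probes; at 0.9, about 10.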
The expected number of trials to insert an item is not the only criterion
of goodness of a hashing scheme. One also desires that the computation of
the hashing functions be simple. The hashing schemes we have considered
compute the alternate functions h1(α), ..., h_{n−1}(α) not from α itself but from
h0(α), and this is characteristic of most hashing schemes. This arrangement is
efficient because h0(α) is an integer of known length, while α may be arbitrarily
long. We shall call such a method hashing on locations. A more restricted
case, and one that is even easier to implement, is linear hashing, where hi(α)
is given by (h0(α) + i) mod n. That is, successive locations in the table are
tried until an empty one is found; if the bottom of the table is reached, we
proceed to the top. Example 10.4 (p. 796) is an example of linear hashing.
[Fig. 10.6: Expected insertion time as a function of load factor k/n for random hashing; the curve 1/(1 − ρ) rises sharply as k/n approaches 1.]
When we hash on locations, there is, for each location i, exactly one permu-
tation, Πi, that begins with i and has nonzero probability. We can denote
the probability of Πi by p_i. We shall denote the second entry in Πi, the first
alternate of i, by a_i. If p_i = 1/n for each i, we call the system random hashing
on locations.
THEOREM 10.2
E(2, n) is smaller for random hashing than for random hashing on loca-
tions for all n > 3.
Proof. We know by Theorem 10.1 that E(2, n) for random hashing is
(n + 1 ) / ( n - 1). We shall derive a lower bound on E(2, n) for hashing on
locations. Let us suppose that the first three items to be entered into the table
have permutations Πi, Πj, and Πb, respectively. We shall consider two
cases, depending on whether i = j or not.
Case 1: i ≠ j. This occurs with probability (n − 1)/n. The expected num-
ber of trials to insert the third item is seen to be

E(2, n) ≥ ((n + 1)/(n − 1))((n − 1)/n) + ((n + 3)/n)(1/n) = (n² + 2n + 3)/n²
The point of the previous example and theorem is that many simple
hashing schemes do not meet the performance of random hashing. Intuitively,
the cause is that when nonrandom schemes are used, there is a tendency for
the same location to be tried over and over again. Even if the load factor is
small, with high probability there will still be some locations that have been
tried many times. If a scheme such as hashing on locations is used, each
time a primary location h0(~) is filled, all the alternates of h0(~) which were
tried before will be tried again, resulting in poor performance.
The foregoing does not imply that one should not use a scheme such
Example 10.8
Let the permutations [0123] and [1032] have probability .2 and let [2013],
[2103], [3012], and [3102] have probability .15, all others having zero probabil-
ity. We can calculate E(2, 4) directly by (10.1.5), obtaining the value 1.665.
This value is smaller than the figure 5/3 for random hashing.
EXERCISES
10.1.1. Use Algorithm 10.1 to insert the following sequence of items into a
binary search tree: T, D, H, F, A, P, O, Q, W, TO, TH. Assume that
the items have alphabetic order.
10.1.2. Design an algorithm which will take a binary search tree as input and
list all elements stored in the tree in order. Apply your algorithm to
the tree constructed in Exercise 10.1.1.
"10.1.3. Show that the expected time to insert (or retrieve) one item in a binary
search tree is O(log n), where n is the number of nodes in the tree.
What is the maximum amount of time required to insert any one item ?
"10.1.4. What information about FORTRAN variables and constants is needed
in the symbol table for code generation ?
10.1.5. Describe a symbol table storage mechanism for a block-structured
language such as ALGOL in which the scope of a variable X is limited
to a given block and all blocks contained in that block in which X is
not redeclared.
10.1.6. Choose a table size and a primary hashing function h0. Compute
h0(α), where α is drawn from the set of (a) FORTRAN keywords,
(b) ALGOL keywords, and (c) PL/I keywords. What is the maximum
number of items with the same primary hash address? You may wish
to do this calculation by computer. Sammet [1969] will provide the
needed sets of keywords.
"10.1.7. Show that R(k, n) for random hashing approximates (-- 1/p) log (1 --p),
where p = k/n. Plot this function.
Hint: Approximate (n/k) ~k~-o~(n + 1)/(n -- i + 1) by an integral.
*10.1.8. Consider the following pseudorandom number generator. This genera-
tor creates a sequence r1, r2, ..., r_{n−1} of numbers which can be used
to compute hi(α) = [h0(α) + r_i] mod n for 1 ≤ i ≤ n − 1. Each time
h z ~ = h o _ ~ ( p -2 1) + i t 2 mod P
DEFINITION
Another technique for resolving collisions that is more efficient in
terms of insertion and retrieval time is chaining. In this method one
field is set aside in each entry of the hash table to hold a pointer to
additional entries with the same primary hash address. All entries with
the same primary address are chained on a linked list starting at that
primary location.
There are several methods of implementing chaining. One method,
called direct chaining, uses the hash table itself to store all items. To
insert an item α, we consult location h0(α).
(1) If that location is empty, α is installed there. If h0(α) is filled
and is the head of a chain, we find an empty entry in the hash table
by any convenient mechanism and place this entry on the chain
headed by h0(α).
(2) If h0(α) is filled but not by the head of a chain, we move the
current entry β in h0(α) to an empty location in the hash table and
insert α in h0(α). [We must recompute h(β) to keep β in the proper
chain.]
This movement of entries is the primary disadvantage of direct
chaining. However, the method is fast. Another advantage of the
technique is that when the table becomes full, additional items can be
placed in an overflow table with the same insertion and retrieval
strategy.
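A Python sketch of direct chaining under these rules; the slot layout, the primary function, and the free-slot search are our own choices, and overflow is not handled:

n = 8
table = [None] * n                    # each slot: [name, data, next slot] or None
h0 = lambda s: sum(map(ord, s)) % n   # an arbitrary primary hashing function

def free_slot():
    return next(i for i, e in enumerate(table) if e is None)

def insert(name, data=None):
    loc = h0(name)
    if table[loc] is None:                      # empty: start a new chain here
        table[loc] = [name, data, None]
    elif h0(table[loc][0]) == loc:              # occupied by the head of its own chain
        new = free_slot()                       # place the new item elsewhere
        table[new] = [name, data, table[loc][2]]
        table[loc][2] = new                     # and link it into the chain
    else:                                       # occupied by an item belonging elsewhere
        moved, new = table[loc], free_slot()
        table[new] = moved                      # move that item out of the way
        pred = h0(moved[0])                     # and repair its own chain's link to it
        while table[pred][2] != loc:
            pred = table[pred][2]
        table[pred][2] = new
        table[loc] = [name, data, None]         # the new item heads its chain here

def retrieve(name):
    loc = h0(name)
    while loc is not None and table[loc] is not None and table[loc][0] != name:
        loc = table[loc][2]
    return None if loc is None or table[loc] is None else table[loc]

for x in ["AB", "I", "LOOP", "BA"]:
    insert(x)
print(retrieve("BA")[0])    # BA, found on the chain headed at h0("BA")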
10.1.11. Show that if alternate locations are chosen randomly, then R(k, n),
the expected retrieval time for direct chaining, is 1 + ρ/2, where
ρ = k/n. Compare this function with R(k, n) in Exercise 10.1.7.
Another chaining technique which does not require items to be
moved uses an index table in front of the hash table. The primary
hashing function h0 computes addresses in the index table. The entries
in the index table are pointers to the hash table, whose entries are
filled in sequence.
To insert an item α in this new scheme, we compute h0(α), which
is an address in the index table. If h0(α) is empty, we seize the next
available location in the hash table and insert α into that location.
We then place a pointer to this location in h0(α).
If h0(α) already contains a pointer to a location in the hash table,
we go to that location. We then search down the chain headed by
that location. Once we reach the end of the chain, we take the next
available location in the hash table, insert α into that location, and
then attach this location to the end of the chain.
If we fill the hash table in order beginning from the top, we can
find the next available location very quickly. In this scheme no items
ever need to be moved because each entry in the index table always
points to the head of a chain.
Moreover, overflows can be simply accommodated in this scheme
by adding additional space to the end of the hash table.
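A sketch of the index-table scheme in Python; again the primary function and record layout are illustrative:

n = 8
index = [None] * n                 # index[h0(x)] -> location of chain head, or None
store = []                         # the hash table proper, filled in sequence
h0 = lambda s: sum(map(ord, s)) % n

def insert(name, data=None):
    store.append([name, data, None])   # [name, data, next location in chain]
    loc = len(store) - 1               # next available location in the hash table
    i = h0(name)
    if index[i] is None:
        index[i] = loc                 # start a new chain
    else:
        j = index[i]
        while store[j][2] is not None: # walk to the end of the chain
            j = store[j][2]
        store[j][2] = loc              # attach the new location to the chain

def lookup(name):
    j = index[h0(name)]
    while j is not None and store[j][0] != name:
        j = store[j][2]
    return None if j is None else store[j]

insert("A"); insert("I"); insert("LOOP")
print(lookup("LOOP")[0])               # LOOP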
"10.1.12. What is the expected retrieval time for a chaining scheme with an
index table? Assume that the primary' locations are uniformly dis-
tributed.
10.1.13. Consider a random hashing system with n locations as in Section
10.1.5. Show that if S is a set of k locations and i ∉ S, then
Σ_w p(wi) = 1/(n − k), where the sum is taken over all w such that w
is a string of k or fewer distinct locations in S.
"10.1.15. Suppose that items are strings of from one to six capital R o m a n letters.
Let CODE(a) be the function defined in Example 10.4. Suppose that
item tx has probability (1/6)26-1~,I. Compute the probabilities of the
permutations on {0, 1 , . . . , n -- 1} if
(a) h,(ct) = (CODE(t~) + i)mod n, 0 < i < n - 1.
Thus, if a given hashing system is better than random for some k, there
is a smaller k' for which performance is worse than random. Hint:
Show that if a hashing system is (k − 1)-uniform but not k-uniform,
then E(k, n) > (n + 1)/(n − k + 1).
"10.1.19. Give an example of a hashing systemwhich is k-uniform for all k but
is not random.
"10.1.20. Generalize Example 10.7 to the case of unequal probabilities for the
cyclic permutations.
"10.1.21. Strengthen Theorem 10.2 to include systems which hash on locations
but do not have equal pt's.
Open Problems
10.1.22. Is random hashing optimal in the sense of expected retrieval time?
That is, is it true that R(k, n) is always bounded below by
Research Problem
10.1.25. In certain uses of a hash table, the items entered are known in advance.
Examples are tables of library routines or tables of assembly language
operation codes. If we know what the residents of the hash table are,
we have the opportunity to select our hashing system to minimize the
expected lookup time. Can you provide an algorithm which takes the
list of items to be stored and yields a hashing system which is efficient
to implement, yet has a low lookup time for this particular loading of
the hash table?
Programming Exercises
10.1.26. Implement a hashing system that does hashing on locations. Test the
behavior of the system on FORTRAN keywords and common func-
tion names.
10.1.27. Implement a hashing system that uses chaining to resolve collisions.
Compare the behavior of this system with that in Exercise 10.1.26.
BIBLIOGRAPHIC NOTES
Hash tables are also known as scatter storage tables, key transformation tables,
randomized tables, and computed entry tables. Hash tables have been used by
programmers since the early 1950's. The earliest paper on hash addressing is by
Peterson [1957]. Morris [1968] provides a good survey of hashing techniques. The
answer to Exercise 10.1.7 can be found there.
Methods of computing the alternate functions to reduce the expected number
of collisions are discussed by Maurer [1968], Radke [1970], and Bell [1970]. An
answer to Exercise 10.1.10 can be found in Radke [1970]. Ullman [1972] discusses
k-uniform hashing systems.
Knuth [1973] is a good reference on binary search trees and hash tables.
10.2. PROPERTY GRAMMARS
10.2.1. Motivation
begin block 1
    A
    begin block 2
        B
        begin block 3
            C
        end block 3
        D
    end block 2
    E
    begin block 4
        F
    end block 4
    G
end block 1
have level number 2 and index 1. Identifiers in area F of block 4 would have
level number 2 and index 2.
If an identifier in the block with level i and index j is referenced, we look
in the symbol table for a definition of that identifier with the same level
number and index. However, if that identifier is nowhere defined in the block
with level number i, we would then look for a definition of that identifier
in the block of level number i -- 1 which contained the block of level number
i and index j, and so forth. If we encounter a definition at the desired level
but with too small an index, we may delete the definition, as it will never
again apply. Thus, a pushdown list is useful for the storing of definitions of
each identifier as encountered. The search described is also facilitated if the
index of the currently active block at each level is available.
For example, if an identifier K is encountered in region C and no defini-
tion of K appeared in region C, we would accept a definition of K appear-
ing in region B (or D, if declarations after use are permitted). However, if
no definition of K appeared in regions B or D, we would then accept a
definition of K in regions A, E, or G. But we would not look in region F for
a definition of K.
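One possible realization of this level-index bookkeeping, sketched in Python with our own data layout (a pushdown list of definitions per identifier and a record of the active block index at each level); none of these names come from the book:

definitions = {}     # identifier -> list of (level, index, info), innermost last
active = {}          # level -> index of the currently active block at that level

def define(ident, level, index, info=None):
    definitions.setdefault(ident, []).append((level, index, info))

def lookup(ident, level):
    stack = definitions.get(ident, [])
    while stack:
        lev, idx, info = stack[-1]
        if lev <= level and active.get(lev) == idx:
            return info          # defined in this block or an enclosing block
        stack.pop()              # stale definition: its block can never apply again
    return None

active.update({1: 1, 2: 1})                  # inside block 1 (level 1) and block 2 (level 2)
define("K", 1, 1, "declared in region A")
print(lookup("K", 2))                        # found via the enclosing block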
The level-index method of recording definitions can be used for languages
with the conventional nesting of definitions, e.g., ALGOL and PL/I. How-
ever, in this section we shall discuss a more general formalism, called prop-
erty grammars, which permits arbitrary conventions regarding scope of
definition. Property grammars have an inherent generality and elegance
which stem from the uniformity of the treatment of identifiers and their
properties. Moreover, they can be implemented in an amount of time which
is essentially linear in the length of the compiler input. While the constant
The table [1 : 2]
associates property 2 with index 1 and the neutral property with all other
indices. This table can then be interpreted as meaning that the identifier
associated with index 1 (namely B) has the property "declared real."
In general if we have the structure

A
T
X1 X2 ... Xk
T1 T2 ... Tk
Let ⇒*_G, or ⇒* if G is understood, be the reflexive, transitive closure of ⇒_G. The
language defined by G, denoted L(G), is the set of all a1T1 a2T2 ... anTn such
that for some table T
(1) S T ⇒* a1T1 a2T2 ... anTn;
(2) Each aj is in Σ;
(3) For all indices i, T(i) is in F; and
(4) For each j, Tj maps all indices, or all but one index, to v0.
We should observe that although the definition of a derivation is top-
down, the definition of a property grammar lends itself well to bottom-up
parsing. If we can determine the tables associated with the terminals, we can
construct the tables associated with each node of the parse tree deterministic-
ally, since μ is a function of the tables associated with the direct descendants
of a node.
It should be clear that if G is a property grammar, then the set

{a1a2 ... an | a1T1 a2T2 ... anTn is in L(G)
for some sequence of tables T1, T2, ..., Tn}
Example 10.9
We shall give a rather lengthy example using property grammars to handle
declarations in a block-structured language. We shall also show how, if
the underlying CFG is deterministically parsable in a bottom-up way, the
tables can be deterministically constructed as the parse proceeds.†
Let G = (N, Σ, P, S, V, 0, {0}, μ) be a property grammar with
(i) N = {<block>, <statement>, <declaration list>, <statement list>,
<variable list>}. The nonterminal <variable list> generates a list of variables
used in a statement. We are going to represent a statement by the actual
variables used in the statement rather than giving its entire structure. This

†Our example grammar happens to be ambiguous but will illustrate the points to be
made.
Informally, production (1) says that a block is a declaration list and a list of
statements surrounded by begin and end. Production (4) says that a statement
can be a block; productions (5) and (6) say that a statement is a list of the
variables used therein, possibly prefixed with a label. Production (7) says that
a statement can be a goto statement. Productions (8) and (9) say that a vari-
able list is a string of 0 or more a's, and productions (10) and (11) say that
a declaration list is a string of 0 or more declare's.
(iv) V = {0, 1, 2, 3, 4} is a set of properties with the following meanings:
0 Identifier does not appear in the string derived from this node (neutral
property).
1 Identifier is declared to be a variable.
2 Identifier is a label of a statement.
3 Identifier is used as a variable but is not (insofar as the descendants
of the node in question are concerned) yet declared.
4 Identifier is used as a goto target but has not yet appeared as a label.
    s        μ(1, s)
0 0 0 0         0
0 1 0 0         0
0 1 3 0         0
0 0 3 0         3
0 0 2 0         0
The only possible property for all integers associated with begin and end
is 0 and hence the two columns of O's. In the declaration list each identifier
will have property 0 ( = not declared) or 1 ( = declared). If an identifier is
declared, then within the body of the block, i.e., the statement list, it can
be used only as a variable (3) or not used (0). In either case, the identifier is
not declared insofar as the program outside the block is concerned, and so
we give the identifier the 0 property. Thus, the second and third lines appear
as they do.
If an identifier is not declared in this block, it may still be used, either
as a label or variable. If used as a variable (property 3), this fact must be
transmitted outside the block, so we can check that it is declared at some
appropriate place, as in line 4. If an identifier is defined as a label within
the block, this fact is not transmitted outside the block (line 5), because
a label within the block may not be transferred to from outside the block.
Since # is not defined for other values of s, the property grammar catches
uses of labels not found within the block as well as uses of declared variables
as labels within the block. A label used as a variable will be caught at another
point.
†This differs from the convention of ALGOL, e.g., in that ALGOL allows transfers
to a block which surrounds the current one. We use this convention to make the handling
of labels differ from that of identifiers.
   s      μ(2, s)
 0 0         0
 3 3         3
 0 3         3
 3 0         3
 4 4         4
 0 4         4
 4 0         4
 4 2         2
 2 4         2
 0 2         2
 2 0         2

 s    μ(3, s)
 0       0
 2       2
 3       3
 4       4

 s    μ(4, s)
 0       0
 3       3
The philosophy for production (3) also applies for production (4).
   s      μ(5, s)
 0 0         0
 0 3         3
 2 0         2

 s    μ(6, s)
 0       0
 3       3

 s    μ(7, s)
 0       0
 4       4

   s      μ(8, s)
 0 0         0
 3 0         3
 0 3         3
 3 3         3

 s    μ(9, s)
 e       0

   s      μ(10, s)
 0 0         0
 0 1         1
 1 0         1
 1 1         1

 s    μ(11, s)
 e       0
begin
    declare [1 : 1]
    declare [2 : 1]
    begin
        label [1 : 2] a [2 : 3]
        goto [1 : 4]
    end
    a [1 : 3]
end
That is, in the outer block, identifiers 1 and 2 are declared as variables
by symbols declare[1 : 1] and declare[2 : 1]. Then, in the inner block, identifier
1 is declared and used as a label (which is legitimate) by symbols label[1 : 2]
and goto[l : 4], respectively, and identifier 2 is used as a variable by symbol
a[2 : 3]. Returning to the outer block, 1 is used as a variable by symbol
a[1:3].
A parse tree with tables associated with each node is given in Fig. 10.9.
[Fig. 10.9: the parse tree for this input string, with the property table attached to each node.]
consisting of one or more fields, each of which can contain some data or
a pointer to another cell. The cells will be used to construct linked lists.
Suppose that there are k properties in V. The property table associated
with a grammar symbol on the pushdown list is represented by a data struc-
ture consisting of up to k property lists and an intersection list. Each property
list is headed by a property list header cell. The intersection list is headed by
an intersection list leader cell. These header cells are linked as shown in Fig.
10.10.
The property header cell has three fields:
[Fig. 10.10: a pushdown list entry for a grammar symbol B holds a pointer to its property table: an intersection list header linked to the headers for the property lists.]
The intersection list header cell contains only a pointer to the first cell
on the intersection list. All cells on the property lists and the intersection lists
are index cells. An index cell consists of four pointer fields:
[Figure: an index cell and its four pointer fields.]
Example 10.10
Suppose that we have a pushdown list containing grammar symbols B
and C with C on top. Let the tables associated with these entries be, respec-
tively,
T~ = [ 1 : v ~ , 2 : v 2 , 5 : v v S : v ~ , 8 : v z ]
and
Then a possible implementation for these tables is shown in Fig. 10.11.

[Fig. 10.11: the data structures for the tables of B and C on the pushdown list, with property header cells, index cells, the intersection list of T1, and links to the symbol table.]

Circles
indicate index cells. The number inside the circle is the index represented
by that cell. Dotted lines indicate the links of the intersection list. Note that
the intersection list of the topmost table is empty, by definition, and that
the intersection list of table T1 consists of index cells for indices 2, 5, and 8.
Dashed lines indicate links to the hash table and to cells representing the
same index on other tables. We show these only for indices 2 and 3 to avoid
clutter. □
Suppose that we are parsing and that the parser calls for B C on its stack
to be reduced to A. We must compute table T for A from tables T1 and T2
for B and C, respectively. Since C is the top symbol, the intersection list
of T1 contains exactly those indices having a nonneutral property on both
T1 and T2 (hence the name "intersection list"). These indices will be set aside
for later consideration.
Those indices which are not on the intersection list of T1 have the neutral
property on at least one of T1 or T2. Thus, their property on T is a function
only of their one nonneutral property. Neglecting those indices on the inter-
section list, the data structure representing table T can be constructed by
combining various trees of T1 and T2. After doing so, each entry on the inter-
section list of T1 is treated separately and made to point to the appropriate
cell of T.
Before formalizing these ideas, we should point out that in practice we
would expect that the properties can be partitioned into disjoint subsets such
that we can express V as V1 × V2 × ... × Vm for some relatively large m.
The various components V1, V2, ... would in general be small. For example,
V1 might contain two elements designating "real" and "integer"; V2 might
have two elements "single precision" and "double precision"; V3 might
consist of "dynamically allocated" and "statically allocated," and so on.
One element of each of V1, V2, ... can be considered the default condition
and the product of the default elements is the neutral property. Finally, we
may expect that the various components of an identifier's property can be
determined independently of the others.
If this situation pertains, it is possible to create one property header for
each nondefault element of V1, one for each nondefault element of V2, and
so on. Each index cell is linked to several property headers, but at most one
from any Vi. If the links to the headers for Vi are made distinct from those
to the headers of Vj for i ≠ j, then the ideas of this section apply equally
well to this situation, and the total number of property headers will approxi-
mate the sum of the sizes of the Vi's rather than their product.
We shall now give a formal algorithm for implementing a property gram-
mar. For simplicity in exposition, we restrict our consideration to property
grammars whose underlying CFG is in Chomsky normal form.
ALGORITHM 10.3
Table handling for property grammar implementation.
Input. A property grammar G = (N, Σ, P, S, V, v0, F, μ) whose underly-
ing CFG is in Chomsky normal form. We shall assume that a nonneutral
property is never mapped into the neutral property; that is, if μ(p, v1v2) = v0,
then v1 = v2 = v0. [This condition may be easily assumed, because if we
find μ(p, v1v2) = v0 but v1 or v2 is not v0, we could on the right-hand side
replace v0 by v0', a new, nonneutral property, and introduce rules that would
make v0' "look like" v0.] Also, part of the input to this algorithm is a shift-
reduce parsing algorithm for the underlying CFG.
Output. A modified shift-reduce parsing algorithm for the underlying
CFG, which while parsing computes the tables associated with those nodes
o f the parse tree corresponding to the symbols or the pushdown list.
Method. Let us suppose that each table has the format of Fig. 10.10,
that is, an intersection list and a list of headers, at most one for each property,
with a tree of indices with that property attached to each header. The opera-
tion of the table mechanism will be described in two parts, depending on
whether a terminal or two nonterminals are reduced. (Recall that the underly-
ing grammar is in Chomsky normal form.)
Part 1: Suppose that a terminal symbol a is shifted onto the pushdown
list and reduced to a nonterminal A. Let the required table for A be [i : v].
To implement this operation, we shall shift A onto the pushdown list directly
and create the table [i : v] for A as follows.
(1) In the entry for A at the top of the pushdown list, place a pointer to
a single property header cell having property v and count 1. This property
header points to an intersection list header with an empty intersection list.
(2) Create C, an index cell for i.
(3) Place a pointer in the first field of C to the property header cell.
(4) Make the second field of C blank.
(5) Place a pointer in the third field of C to the hash table entry for i and
also make that hash table entry point to C.
(6) If there was previously another cell C' which was linked to the hash
table entry for i, place C' on the intersection list of its table. (Specifically,
make the intersection list header point to C' and make the third field in C'
point to the previous first cell of the intersection list if there was one.)
(7) Place a pointer in the fourth field of C to C'.
(8) Make the pointer in the third field of C' point to C.
Part 2: Now, suppose that two nonterminals are reduced to one, say by
production A ----~ B D . Let T1 and T2 be the tables associated with B and D,
respectively. Then do the following to compute T, the table for A.
(1) Consider each index cell on the intersection list of T1. (Recall that T2
has no intersection list.) Each such cell represents an entry for some index i
on both T1 and T2. Find the properties of this index on these tables by
Algorithm 10.4.† Let these properties be v1 and v2. Compute v = μ(p, v1v2),

†Obviously, one can find the property of the index by going from its cells on the two
tables to the roots of the trees on which the cells are found. However, in order that the table
handling as a whole be virtually linear in time, it is necessary that following the path to
the root be done in a special way. This method will be described subsequently in Algorithm
10.4.
where p is production A → BD. Make a list of all index cells on the inter-
section list along with their new properties and the old contents of the cells
on T1 and T2.
(2) Consider the property header cells of T1. Change the property of
the header cell with property v to the property μ(p, vv0). That is, assume
that all indices with property v on T1 have the neutral property on T2.
(3) Consider the property header cells of T2. Change the property of
the header cell with property v to μ(p, v0v). That is, assume that all indices
with property v on T2 have the neutral property on T1.
(4) Now, several of the property header cells formerly belonging to T1
and T2 may have the same property. These are merged by the following
steps, which combine two trees into one"
(a) Change the property header cell with the smaller count (break
ties arbitrarily) into a dummy index cell not corresponding to
any index.
(b) Make the new index cell point to the property header cell with
the larger count.
(c) Adjust the count of the remaining property header cell to be
the sum of the counts of the two headers plus 1, so that it reflects
the number of index cells in the tree, including dummy cells.
(5) Now, consider the list of indices created in step (1). For each such
index,
(a) Create a new index cell C.
(b) Place a pointer in the first field of C to the property header cell
with the correct property and adjust the count in that header cell.
(c) Place pointers in the third field of C to the hash table location for
that index and from this hash table entry to C.
(d) Place a pointer in the fourth field of C to the first index cell below
(on the pushdown list) having the same index.
(e) Now, consider C1 and C2, the two original cells representing this
index on T1 and T2. Make C1 and C2 into "dummy" cells by
preserving the pointers in the first field of C1 and C2 (their links
to their ancestors in their trees) but by removing the pointers in
the third and fourth fields (their links to the hash table and to
cells on other tables having the same index). Thus, the newly
created index cell C plays the role of the two cells C1 and C2
that have been made dummy cells.
(6) Dummy cells that are leaves can be returned to available storage. □
Example 10.11
Let us consider the two property tables of Example 10.10 (Fig. 10.11
on p. 826). Suppose that μ(p, st) is given by the following table:
  s     t     μ(p, st)
 v0    v1      v1
 v0    v2      v2
 v0    v3      v3
 v1    v0      v1
 v2    v0      v2
 v2    v1      v3
 v2    v3      v2

[Fig. 10.12(a), (b): the trees of property lists as they are combined during the reduction.]
has been made a direct descendant of the header, while in Fig. 10.11 it was
a direct descendant of the node numbered 3. This is an effect of Algorithm
10.4 and occurred when the intersection list of T1 was examined. Node 2 in
Fig. 10.12(b) has been moved for the same reason.
In the last step, we consider the indices on the intersection list of Ti.
New index cells are created for these indices; the new cells point directly to
the appropriate header. All other cells for that index in table T are made
dummy cells. Dummy cells with no descendants are then removed. The
resulting table T is shown in Fig. 10.13. The symbol table is not shown.
Note that the intersection list of T is empty.
[Fig. 10.13: the resulting table T for A on the pushdown list.]

Now suppose that an input symbol is shifted onto the pushdown list and
reduced to D[2 : v1]. Then the index cell for 2, which points to v1 in Fig.
10.13, is linked to the intersection list of its table. The changes are shown
in Fig. 10.14. □
We shall now give the algorithm whereby we inquire about the property
of index i on table T. This algorithm is used in Algorithm 10.3 to find the
property of indices on the intersection list.
[Fig. 10.14: the tables of A and D after the shift and reduction, with the index cell for 2 linked to the intersection list; dummy cells are marked.]
832 BOOKKEEPING CHAP. 10
ALGORITHM 10.4
Finding the property of an index.
Input. An index cell on some table T. We assume that tables are structured
as in Algorithm 10.3.
Output. The property of that index in table T.
Method.
(1) Follow pointers from the index cell to the root of the tree on which
it appears. Make a list of all cells encountered on this path.
(2) Make each cell on the path, except the root itself, point to the root
directly. (Of course, the cell on the path immediately before the root already
does so.) The property recorded in the root, that is, in the property header
cell, is the desired output. □
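The path-compression step can be sketched in a few lines of Python. This is only an illustration; the cell representation below, with a parent field and a property field kept at the root, is an assumption of ours and not the book's data layout.

class Cell:
    def __init__(self, prop=None):
        self.parent = None      # pointer toward the property header cell
        self.prop = prop        # meaningful only at the root

def find_property(cell):
    # (1) Follow pointers to the root, remembering the path.
    path = []
    while cell.parent is not None:
        path.append(cell)
        cell = cell.parent
    # (2) Make every cell on the path point directly at the root.
    for c in path:
        c.parent = cell
    return cell.prop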
The remainder of this chapter is devoted to the analysis of the time com-
plexity of Algorithms 10.3 and 10.4. To begin, we define two functions F
and G which will be used throughout this section.
DEFINITION
We define F(n) by the recurrence:
      F(1) = 1
      F(n) = 2^F(n-1)   for n > 1
The first few values of F are:
      n     F(n)
      1         1
      2         2
      3         4
      4        16
      5     65536
Now let us define G(n) to be the least integer i such that F(i) ≥ n. G(n)
grows so slowly that it is reasonable to say that G(n) ≤ 6 for all n which are
representable in a single computer word, even in floating-point notation.
Alternatively, we could define G(n) to be the number of times we have to
apply log2 to n in order to produce a number equal to or less than 0.
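As a concrete illustration (a sketch only; the function names are ours), both definitions of G can be computed directly:

import math

def F(n):
    # F(1) = 1, F(n) = 2 ** F(n - 1)
    return 1 if n == 1 else 2 ** F(n - 1)

def G(n):
    # Least i such that F(i) >= n.
    i = 1
    while F(i) < n:
        i += 1
    return i

def G_alt(n):
    # Number of times log2 must be applied to n to reach a value <= 0.
    count = 0
    x = float(n)
    while x > 0:
        x = math.log2(x)
        count += 1
    return count

# For example, G(65536) == G_alt(65536) == 5.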
Example 10.12
Suppose that we have objects a1, a2, . . . , a6 and the sequence of instructions
      merge(A1, A2, A2)
      merge(A3, A4, A4)
      merge(A5, A6, A6)
      merge(A2, A4, A4)
      merge(A4, A6, A6)
      find(a3)
After executing the first instruction, A2 is {a1, a2}. After the second instruction,
A4 is {a3, a4}. After the third instruction, A6 is {a5, a6}. Then after the
instruction merge(A2, A4, A4), A4 becomes {a1, a2, a3, a4}. After the last
merge instruction, A6 = {a1, a2, . . . , a6}. Then, the instruction find(a3)
prints the name A6, which is the response to this sequence of instructions.
accomplishing this is to use two vectors OBJECT and SET such that
OBJECT(a) is a pointer to the node representing a and SET(A) is a pointer
to the root of the tree representing set A.
Initially, we construct n nodes, one for each object ai. The node for ai
is the root of a one-node tree. Initially, this root is labeled Ai and has a count
of 1.
(1) To execute the instruction merge(A, B, C), locate the roots of the trees
for A and B [via SET(A) and SET(B)]. Compare the counts of the trees
named A and B. The root of the smaller tree is made a direct descendant of
the larger. (Break ties arbitrarily.) The larger root is given the name C, and
its count becomes the sum of the counts of A and B.† Place a pointer in
location SET(C) to the root of C.
(2) To execute the instruction find(a), determine the node representing a
via OBJECT(a). Then follow the path from that node to the root r of its
tree. Print the name found at r. Make all nodes on this path, except r, direct
descendants of r.‡
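The following Python sketch illustrates steps (1) and (2) above. The class and method names are ours, and the representation of nodes is only one plausible choice, not the book's.

class MergeFind:
    def __init__(self, pairs):
        # pairs: list of (object, initial set name); each object starts as
        # the root of a one-node tree with count 1.
        self.parent = {}
        self.count = {}
        self.name = {}
        self.set_root = {}          # the SET vector
        for obj, set_name in pairs:
            self.parent[obj] = None
            self.count[obj] = 1
            self.name[obj] = set_name
            self.set_root[set_name] = obj

    def merge(self, A, B, C):
        ra, rb = self.set_root[A], self.set_root[B]
        if self.count[ra] < self.count[rb]:
            ra, rb = rb, ra          # make ra the root of the larger tree
        self.parent[rb] = ra         # smaller root becomes a direct descendant
        self.count[ra] += self.count[rb]
        self.name[ra] = C            # the surviving root is named C
        self.set_root[C] = ra

    def find(self, obj):
        path, node = [], obj
        while self.parent[node] is not None:   # follow the path to the root
            path.append(node)
            node = self.parent[node]
        for p in path:                          # path compression
            self.parent[p] = node
        return self.name[node]

Running the instruction sequence of Example 10.12 on this sketch gives the expected answer:

mf = MergeFind([("a%d" % i, "A%d" % i) for i in range(1, 7)])
mf.merge("A1", "A2", "A2"); mf.merge("A3", "A4", "A4")
mf.merge("A5", "A6", "A6"); mf.merge("A2", "A4", "A4")
mf.merge("A4", "A6", "A6")
print(mf.find("a3"))    # prints A6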
Example 10.13
Let us consider the sequence of instructions in Example 10.12. After
executing the first three merge instructions, we would have three trees, as
shown in Fig. 10.15. The roots are labeled with a set name and a count.
[Fig. 10.15 The three trees, with roots labeled A2, A4, and A6]
(The count is not shown.) Then executing the instruction merge(A2, A4, A4),
we obtain the structure of Fig. 10.16. After the final merge instruction
merge(A4, A6, A6), we obtain Fig. 10.17. Then, executing the instruction
find(a3), we print the name A6 and make nodes a3 and a4 direct descendants
of the root. (a4 is already a direct descendant.) The final structure is shown
in Fig. 10.18. □
†The analogy between this step and the merge procedure of Algorithm 10.3 should
be obvious. The discrepancy in the way counts are handled has to do simply with the
question of whether the root is counted or not. Here it is; in Algorithm 10.3 it was not.
‡The analogy to Algorithm 10.4 should be obvious.
[Figs. 10.16-10.18 The tree structures after merge(A2, A4, A4), after merge(A4, A6, A6), and after find(a3)]
root. All subsequent results are predicated on this assumption. From this point
on we shall assume that n, the number of objects, has been fixed, and that
the sequence of instructions is of length O(n).
DEFINITION
We define the rank of a node on one of the structures created by
Algorithm 10.5 as follows.
(1) A leaf is of rank 0.
(2) If a node N ever has a direct descendant of rank i, then N is of rank
at least i + 1.
(3) The rank of a node is the least integer consistent with (2).
It may not be immediately apparent that this definition is consistent.
However, if node M is made a direct descendant of node N in Algorithm
10.5, then M will never subsequently be given any more direct descendants.
Thus the rank of M may be fixed at that time. For example, in Fig. 10.17
the rank of node a6 can be fixed at 1 since a6 has one direct descendant of
rank 0 and a6 subsequently acquires no new descendants.
The next three lemmas state some properties of the rank of a node.
LEMMA 10.4
Let N be a root of rank i created by Algorithm 10.5. Then N has at least
2^i descendants.
Proof. The basis, i = 0, is clear, since a node is trivially its own descendant.
For the inductive step, suppose node N is a root of rank i. Then, N
must have at some time been given a direct descendant M of rank i - 1.
Moreover, M must have been made a direct descendant of N in step (1) of
Algorithm 10.5, or else the rank of N would be at least i + 1. This implies
that M was then a root so that, by the inductive hypothesis, M has at least
2^(i-1) descendants at that time, and in step (1) of Algorithm 10.5, N has at
least 2^(i-1) descendants at that time. Thus, N has at least 2^i descendants after
the merger. As long as N remains a root, it cannot lose descendants. □
LEMMA 10.5
At all times during the execution of Algorithm 10.5, if N has a direct
descendant M, then the rank of N is greater than the rank of M.
Proof. Straightforward induction on the number of instructions executed. □
COROLLARY
where log2^(1)(n) = log2 n and log2^(k+1)(n) = log2^(k)(log2(n)). That is, log2^(k)
is the function which applies the log2 function k times. For example,
in this case. By Lemma 10.5, the new direct ancestor of M is of higher rank
than its previous direct ancestor. Thus, if M is in rank group j, M may be
charged at most log2^(j)(n) time units before its direct ancestor becomes one
of a lower rank group. From that time on, M will never be charged; the cost
of moving M will be borne by the find instruction executed, as described in
the paragraph above.
Clearly the charge to all the find instructions is O(nG(n)). To find an upper
bound on the total charge to all the objects we sum over all rank groups the
maximum charge to each node in the group times the maximum number of
nodes in the group. Let gj be the number of nodes in rank group j and cj
the charge to all nodes in group j. Then:
(10.2.1)        gj ≤ Σ n·2^(-k),   the sum taken over k from log2^(j+1)(n) to log2^(j)(n),
by Lemma 10.6.
The terms of (10.2.1) form a geometric series with ratio 1/2, so their
sum is no greater than twice the first term. Thus gj ≤ 2n·2^(-log2^(j+1)(n)), which
is 2n/log2^(j)(n). Now cj is bounded above by gj·log2^(j)(n), so cj ≤ 2n. Since j
may vary only from 1 to G(n), we see that O(nG(n)) units of time are charged
to nodes. It follows that the total cost of executing Algorithm 10.5 is
O(nG(n)). □
†If we think of the typical use of these properties, e.g., when a is reduced to F in G0
and properties of the particular identifier a are desired, we see that the assumption is quite
plausible.
It thus suffices to show that each index and header cell we create can be
modeled as an object node, that there are O(n) of them, and that all manipulations
can be expressed exactly as some sequence of merge and find instructions.
The following is a complete list of all the cells ever created.
(1) 2n "objects" correspond to the n header cells and n index cells created
during the shift operation (part 1 of Algorithm 10.3). We can cause the index
cell to point to the header cell by a merge operation.
(2) At most n objects correspond to the new index cells created in step
(5) of part 2 of Algorithm 10.3. These cells can be made to point to the
correct root by an appropriate merge operation.
Thus, there are at most 3n objects (and n in Algorithm 10.5 means 3n
here). Moreover, the number of instructions needed to manipulate the sets
and objects when "simulating" Algorithms 10.3 and 10.4 by Algorithm 10.5
is O(n). We have commented that at most 3n merge instructions suffice to
initialize tables after a shift [(1) above] and to attach new index cells to headers
[(2) above]. In addition, O(n) merge instructions suffice in step (4) of part 2
of Algorithm 10.3 when two sets of indices having the same property are
merged. This follows from the fact that the number of distinct properties is
fixed and that only n - 1 reductions can be made.
Lemma 10.3 implies that O(n) find instructions suffice to account for the
examination of properties of indices on an intersection list [step (1) of part 2
of Algorithm 10.3]. Finally, we assume in the hypothesis of the theorem
that O(n) additional find instructions are needed to determine properties of
indices (presumably for use in translation). If we put all these instructions
in the order dictated by the parser and Algorithm 10.3, we have a sequence
of O(n) instructions. Thus, the present theorem follows from Lemma 10.3
and Theorem 10.3. □
EXERCISES
      E → E + T | E ⊕ T | T
      T → (E) | a
10.2.5. Generalize Algorithm 10.3 to grammars which are not in CNF. Your
generalized algorithm should have the same time complexity as the
original.
"10.2.6. Let us modify our property grammar definition by requiring that no
terminal symbol have the all-neutral table. If G is such a property
grammar, let L'(G) be (al . . . anlalT1 . . . anTn is in L(G) for some
(not all-neutral) tables T1 . . . . . Tn}. Show that L'(G) need not be a
CFG. Hint: Show that (aibJckli < j < k} can be so generated.
**10.2.7. Show that for "property grammars" G, as modified in Exercise 10.2.6,
it is undecidable whether L'(G) = ∅, even if the underlying CFG of
G is right-linear.
10.2.8. Let G be the underlying CFG of Example 10.9. Suppose the terminal
declare associates one of two properties with an index: either
"declared real" or "declared integer." Define a property grammar on
G such that if the implementation of Algorithm 10.3 is used, the highest
table on the pushdown list having a particular index i with nonneutral
property (i.e., the one with the cell for i pointed to by the hash table
entry for i) will have the currently valid declaration for identifier i as
the property for i. Thus, the decision whether an identifier is real or
integer can be made as soon as its use is detected on the input stream.
10.2.9. Find the output of Algorithm 10.5 and the final tree structure when
given a set of objects {a1, . . . , a12} and the following sequence of
instructions. Assume that in case of tie counts, the root of Ai becomes
a descendant of the root of Aj if i < j.
      merge(A1, A2, A1)
      merge(A3, A1, A1)
      merge(A4, A5, A4)
      merge(A6, A4, A4)
      merge(A7, A8, A7)
      merge(A9, A7, A7)
      merge(A10, A11, A10)
      merge(A12, A10, A10)
      find(a1)
      merge(A1, A4, A1)
      merge(A7, A10, A2)
      find(a7)
      merge(A1, A2, A1)
      find(a3)
      find(a4)
**10.2.10. Suppose that Algorithm 10.5 were modified to allow either root to be
made a descendant of the other when mergers were made. Show that
the revised algorithm is of time complexity at best O(n log n).
Open Problems
10.2.11. Is Algorithm 10.5 as stated in this book really O(nG(n)) in complexity,
or is it O(n), or perhaps something in between?
10.2.12. Is the revised algorithm of Exercise 10.2.10 of time complexity
O(n log n)?
Research Problem
10.2.13. Investigate or characterize the kinds of properties of identifiers which
can be handled correctly by property grammars.
BIBLIOGRAPHIC NOTES
Property grammars were first defined by Stearns and Lewis [1969]. Answers
to Exercises 10.2.6 and 10.2.7 can be found there. An n log log n method of imple-
menting property grammars is discussed by Stearns and Rosenkrantz [1969].
To our knowledge, Algorithm 10.5 originated with R. Morris and M. D. McIlroy,
but was not published. The analysis of the algorithm is due to Hopcroft and
Ullman [1972a]. Exercise 10.2.10 is from Fischer [1972].
11 CODE OPTIMIZATION
One of the most difficult and least understood problems in the design of
compilers is the generation of "good" object code. The two most common
criteria by which the goodness of a program is judged are its running time
and size. Unfortunately, for a given program it is generally impossible to
ascertain the running time of the fastest equivalent program or the length of
the shortest equivalent program. As mentioned in Chapter 1, we must be
content with code improvement, rather than true optimization, when programs
have loops.
Most code improvement algorithms can be viewed as the application of
various transformations on some intermediate representation of the source
program in an attempt to manipulate the intermediate program into a form
from which more efficient object code can be produced. These code improve-
ment transformations can be applied at any point in the compilation process.
One common technique is to apply the transformations to the intermediate
language program that occurs after syntactic analysis but before code gen-
eration.
Code improvement transformations can be classified as being either
machine-independent or machine-dependent. An example of machine-inde-
pendent optimization would be the removal of useless statements from a
program, those which do not in any way affect its output.† Such machine-
independent transformations would be beneficial in all compilers.
" Machine-dependent transformations would attempt to transform a pro-
gram into a form whereby advantage could be taken of special-purpose
†Since such statements should not normally appear in a program, it is likely that an
error is present, and thus the compiler ought to inform the user of the uselessness of the
statement.
      A ← θ B1 ⋯ Br
Example 11.1
Let I = {A, B}, let U = {F, G}, and let P consist of the statements†
      T ← A + B
      S ← A - B
      T ← T * T
      S ← S * S
      F ← T + S
      G ← T - S
Then
      v(S) = (A - B) * (A - B)
      v(F) = (A + B) * (A + B) + (A - B) * (A - B)
      v(G) = (A + B) * (A + B) - (A - B) * (A - B)
and the value of the block is the set
      {(A + B) * (A + B) + (A - B) * (A - B),
       (A + B) * (A + B) - (A - B) * (A - B)}.‡
Note that we are assuming that no algebraic laws pertain. If the usual laws
of algebra applied, then we could write F = 2(A^2 + B^2) and G = 4AB. □
†In displaying a list of statements we often use a new line in place of a semicolon to
separate statements. In examples we shall use infix notation for binary operators.
‡In prefix notation the value of the block is
      {+ * +AB +AB * -AB -AB, - * +AB +AB * -AB -AB}.
11.1.2. Transformations on Blocks
We observe that given two blocks ℬ1 and ℬ2, we can test whether ℬ1
and ℬ2 are equivalent by computing their values v(ℬ1) and v(ℬ2) and determining
whether v(ℬ1) = v(ℬ2). However, there are an infinity of blocks
equivalent to any given block.
For example, if ℬ = (P, I, U) is a block, X is a variable not mentioned
in ℬ, A is an input variable, and θ is any operator, then we can append the
statement X ← θ A ⋯ A to P as many times as we choose without changing
the value of ℬ.
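The equivalence test can be made concrete with a small Python sketch that computes v(ℬ) by forward substitution, building the prefix expression held by each output variable. The tuple representation of statements and the function name are ours, not the book's.

def block_value(stmts, inputs, outputs):
    # stmts: list of (target, operator, [operands]); inputs/outputs: variable names.
    expr = {v: v for v in inputs}          # each variable's current prefix expression
    for target, op, args in stmts:
        expr[target] = op + "".join(expr[a] for a in args)
    # v(B) is the set of expressions held by the output variables.
    return {expr[v] for v in outputs}

# Two blocks are equivalent exactly when their block_value results are equal sets.

For the block of Example 11.1 this returns exactly the two prefix expressions listed in the footnote there.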
Under a reasonable cost function, not all equivalent blocks are equally
efficient. Given a block ℬ, there are various transformations that we can
apply to map ℬ into an equivalent, and possibly more desirable, block ℬ'.
Let 𝒥 be the set of all transformations which preserve the equivalence of
blocks. We shall show that each transformation in 𝒥 can be implemented
by a finite sequence of four primitive transformations on blocks. We shall
then characterize those sequences of transformations which lead to a block
that is optimal under a reasonable cost criterion.
DEFINITION
Let ℬ = (P, I, U) be a block with P = S1; S2; . . . ; Sn. For notational
uniformity we shall adopt the convention that all members of the input
set I are assigned at a zeroth statement, S0, and all members of the output
set U are referenced at an (n + 1)st statement, Sn+1.
Variable A is active immediately after time t if
(1) A is assigned by some statement Si;
(2) A is not assigned by statements Si+1, Si+2, . . . , Sj;
(3) A is referenced by statement Sj+1; and
(4) 0 ≤ i ≤ t ≤ j ≤ n.
If j above is as large as possible, then the sequence of statements Si+1,
Si+2, . . . , Sj+1 is said to be the scope of statement Si and the scope of this
assignment of variable A. If A is an output variable and not assigned after
Si, then j = n + 1, and U is also said to be in the scope of Si. (This follows
from the above convention; we state it only for emphasis.)
If a block contains a statement S such that the variable assigned in S
is not active immediately after this statement, then the scope of S is null,
and S is said to be a useless statement. Put another way, S is useless if S sets
a variable that is neither an output variable nor subsequently referenced.
Example 11.2
Consider the following block, where α, β, and γ are lists of zero or more
statements:
      A ← B + C
      β
      D ← A * E
      γ
Example 11.3
Let ℬ = (P, I, U), where I = {A, B, C}, U = {F, G}, and P consists of
      F ← A + A
      G ← F * C
      F ← A + B
      G ← A * B
The second statement is useless, since its scope is null. Thus one application
of T1 maps ℬ into ℬ1 = (P1, I, U), where P1 is
      F ← A + A
      F ← A + B
      G ← A * B
In ℬ1, the input variable C is now useless, and the first statement in P1 is
also useless. Thus, we can apply transformation T1 twice in succession to
obtain ℬ2 = (P2, {A, B}, U), where P2 consists of
      F ← A + B
      G ← A * B
Note that ℬ2 is obtained whether we first remove input variable C or whether
we first remove the first statement in P1. □
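To make the effect of T1 concrete, here is a small Python sketch (the representation of statements as (target, operator, operands) tuples is our own) that repeatedly deletes useless statements and then drops unreferenced input variables:

def remove_useless(stmts, inputs, outputs):
    # stmts: list of (target, op, [operands]); returns a reduced block.
    changed = True
    while changed:
        changed = False
        for k in range(len(stmts) - 1, -1, -1):
            target = stmts[k][0]
            # target is live only if it is referenced before being reassigned,
            # or if it still holds an output value at the end of the block.
            live = target in outputs
            for t2, _, args in stmts[k + 1:]:
                if target in args:
                    live = True
                    break
                if t2 == target:
                    live = False        # killed before any reference
                    break
            if not live:
                del stmts[k]             # one application of T1
                changed = True
    referenced = {a for _, _, args in stmts for a in args}
    inputs = [v for v in inputs if v in referenced]
    return stmts, inputs, outputs

On the block of Example 11.3 this leaves the two statements F ← A + B; G ← A * B with inputs {A, B}, as in ℬ2.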
      α
      β
      B ← θ C1 ⋯ Cr
and
(1) β' is β with all references to A changed to D in the scope of the
explicitly shown A, and
(2) γ' is γ with all references to A and B changed to D in the scopes of
the explicitly shown A and B.
If the scope of A or B extends to Sn+1, then U' is U with A or B changed to
D. Otherwise U' = U.
D can be any symbol which does not change the value of the block.
Any symbol not mentioned in P is suitable, and some symbols of P might
also be usable.
Example 11.4
Suppose that ℬ = (P, {A, B}, {F, G}), where P consists of
      S ← A + B
      F ← A * S
      R ← B + B
      T ← A * S
      G ← T * R
The output set becomes {D, G}. D may be any new symbol or one of the
variables F, A, S, or T. It is easy to check that letting D be B, R, or G changes
the value of the program. □
T3" Renaming
Clearly, the name of an assigned variable is irrelevant insofar as the value
of a block 63 = (P,/, U) is concerned. Suppose that statement S~ in P is
A ~ 0 B1 .. • Br and that C is a variable that is not active in the scope of Si.
Then we can let ( B ' = ( P ' , / , U'), where P' is P with S~ replaced by
C ~ 0 B1 -.- Br and with all references to A replaced by references to C
in the scope of S~. If U is in the scope of S~, then U' is U with A changed
to C. Otherwise U' = U. Transformation T3 maps 63 into 63'.
Example 11.5
Let ℬ = (P, {A, B}, {F}), where P is
      T ← A * B
      T ← T + A
      F ← T * T
Using T3 to rename the variable assigned by the first statement as S, we obtain the block whose statements are
      S ← A * B
      T ← S + A
      F ← T * T
Note that only the first assignment of T has been replaced by S. □
T4" Flipping
Let (B = ( P , / , U) be a block in which statement S~ is A ~--- 0 B1 .." Br,
statement Si+l is C ~ ¢D1 . . . D s, A is not one of C, D 1 , . . . , D s, and C
is not one of A, Bi, • - . , Br. Then transformation T4 maps the block (B into
6~' = ( P ' , / , U), where P' is P with S t and Si+ ~ interchanged.
Example 11.6
Let ℬ = (P, {A, B}, {F, G}) in which P is
      F ← A + B
      G ← A * B
T4 can be applied to transform ℬ into (P', {A, B}, {F, G}), where P' is
      G ← A * B
      F ← A + B
However, T4 cannot map the block ℬ1 = (P1, {A, B}, {F, G}), where P1 is
      F ← A + B
      G ← F * A
into the block ℬ2 = (P2, {A, B}, {F, G}), where P2 is
      G ← F * A
      F ← A + B
In fact, ℬ2 is not even a block, because variable F is used without a previous
definition.
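A direct way to check whether T4 applies to two adjacent statements is sketched below (again with statements represented as (target, operator, operands) tuples, a representation of our own choosing):

def can_flip(si, sj):
    # si = (A, theta, Bs) and sj = (C, psi, Ds) are adjacent statements.
    a, _, bs = si
    c, _, ds = sj
    # T4 requires that neither statement assigns a variable that the other
    # references or assigns; then interchanging them cannot change the block.
    return a != c and a not in ds and c not in bs

# can_flip(("F", "+", ["A", "B"]), ("G", "*", ["A", "B"]))  -> True
# can_flip(("F", "+", ["A", "B"]), ("G", "*", ["F", "A"]))  -> False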
We shall now define certain equivalence relations that reflect the action
of the four transformations defined.
DEFINITION
Let S be a subset of {1, 2, 3, 4}. We say that ℬ1 ⇒_S ℬ2 if one application
of transformation Ti changes ℬ1 into ℬ2, where i is in S. We say ℬ1 ⇔_S ℬ2
if there is a sequence 𝒞0, . . . , 𝒞n of blocks such that
(1) 𝒞0 = ℬ1;
(2) 𝒞n = ℬ2;
(3) For each i, 0 ≤ i < n, either 𝒞i ⇒_S 𝒞i+1 or 𝒞i+1 ⇒_S 𝒞i.
Thus, ⇔_S is the least equivalence relation containing ⇒_S and reflects
the idea that the transformations can be applied in either direction.
CONVENTION
In this section we shall show that for each block ℬ = (P, I, U) we can
find a directed acyclic graph (dag) D that represents ℬ in a natural way.
Each leaf of D corresponds to one input variable in I and each interior node
of D corresponds to a statement of P. The transformations on blocks considered
in the previous section can then be applied to dags with equal ease.
DEFINITION
Example 11.7
Let ℬ = (P, {A, B}, {F, G}) be a block, where P consists of the statements
      T ← A + B
      F ← A * T
      T ← B + F
      G ← B * T
[figure: the dag for ℬ, with interior nodes n1, n2, n3, and n4 corresponding to the four statements]
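The construction of a dag from a block is essentially value numbering; the following Python sketch (our own representation, since the book's formal definition is not reproduced here) captures the idea: each input variable gets a leaf, and each statement gets an interior node whose descendants are the nodes currently associated with its operands.

def build_dag(stmts, inputs):
    # Returns the dag nodes and the node currently associated with each variable.
    nodes = []                       # node id -> (operator or None, tuple of child ids)
    current = {}                     # variable name -> node id currently holding it
    for v in inputs:
        current[v] = len(nodes)
        nodes.append((None, ()))     # one leaf per input variable
    for target, op, args in stmts:
        children = tuple(current[a] for a in args)
        current[target] = len(nodes)
        nodes.append((op, children)) # one interior node per statement
    return nodes, current

# For the block of Example 11.7:
# build_dag([("T", "+", ["A", "B"]), ("F", "*", ["A", "T"]),
#            ("T", "+", ["B", "F"]), ("G", "*", ["B", "T"])], ["A", "B"])
# yields two leaves and four interior nodes, one per statement.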
Each dag represents an equivalence class of ⇔_{3,4} in a natural way. That is,
if a block ℬ1 can be transformed into ℬ2 by some sequence of transformations
T3 and T4, then blocks ℬ1 and ℬ2 have the same dag, and conversely. Half
of this assertion is the following lemma, which is left to the reader to check
using the definitions.
LEMMA 11.1
If ℬ1 ⇒_{3,4} ℬ2, then D(ℬ1) = D(ℬ2).
Proof. Exercise. □
COROLLARY
If ℬ1 ⇔_{3,4} ℬ2, then D(ℬ1) = D(ℬ2). □
The more difficult portion of our assertion is the other direction. For its
proof we need the following definition and lemma.
DEFINITION
A block ℬ = (P, I, U) is said to be open if
(1) No statement in P is of the form A ← θ B1 ⋯ Br, where A is in I, and
(2) No two statements in P assign the same variable.
In an open block ℬ = (P, I, U), a distinct variable Xi not in I is assigned
by each statement Si in P. The following lemma states that an open block
can always be created by renaming variables using only transformation T3.
LEMMA 11.2
Let ℬ = (P, I, U) be a block. Then there is an equivalent open block
ℬ' = (P', I, U') such that ℬ ⇔_3 ℬ'.
Proof. Exercise. □
The following theorem shows that two blocks have the same dag if and
only if one block can be transformed into the other by renaming and flipping.
THEOREM 11.2
D(ℬ1) = D(ℬ2) if and only if ℬ1 ⇔_{3,4} ℬ2.
Proof. The "if" portion is the corollary to Lemma 11.1. Thus, it suffices
to consider two blocks ℬ1 = (P1, I1, U1) and ℬ2 = (P2, I2, U2) such that
D(ℬ1) = D(ℬ2) = D. Since the dags are identical, the input sets must be
the same, and so we may let I1 = I2 = I. Also, the number of statements in
P1 and P2 must be the same, and so we may suppose P1 = S1; . . . ; Sn and
P2 = R1; . . . ; Rn.
Using T3, the renaming transformation, we can construct two open
blocks ℬ'1 = (P'1, I'1, U'1) and ℬ'2 = (P'2, I'2, U'2) having the same set of assigned
variables, such that
(1) ℬ'1 ⇔_3 ℬ1;
(2) ℬ'2 ⇔_3 ℬ2;
(3) Let P'1 = S'1; . . . ; S'n and P'2 = R'1; . . . ; R'n. Then S'i and R'j assign
the same variable if and only if they correspond to the same node of D.
[Observe by the corollary to Lemma 11.1 that D(ℬ'1) = D(ℬ'2) = D.]
In creating the open blocks, we first rename all the variables of ℬ1 and ℬ2
with entirely new names. Then, we can rename again to satisfy condition (3).
Now we shall construct a sequence of blocks 𝒞0, . . . , 𝒞n such that
(4) 𝒞0 = ℬ'1;
(5) 𝒞n = ℬ'2;
(6) 𝒞i ⇔_4 𝒞i+1 for 0 ≤ i < n.
3 4 3
COROLLARY
Example 11.8
Consider the two blocks ℬ1 = (P1, {A, B}, {F}) and ℬ2 = (P2, {A, B}, {F}),
with P1 and P2 as follows:
      P1                    P2
      C ← A * A             C ← B * B
      D ← B * B             D ← A * A
      E ← C - D             E ← D + C
      F ← C + D             C ← D - C
      F ← E / F             F ← C / E
Blocks ℬ1 and ℬ2 have the same dag, which is shown in Fig. 11.2. Using T3,
we can map ℬ1 and ℬ2 into open blocks ℬ'1 = (P'1, {A, B}, {X5}) and
ℬ'2 = (P'2, {A, B}, {X5}) so that condition (3) in the proof of Theorem 11.2
is satisfied. P'1 and P'2 are shown below.
      P'1                   P'2
      X1 ← A * A            X2 ← B * B
      X2 ← B * B            X1 ← A * A
      X3 ← X1 - X2          X4 ← X1 + X2
      X4 ← X1 + X2          X3 ← X1 - X2
      X5 ← X3 / X4          X5 ← X3 / X4
Then, beginning with block 𝒞0 having the list of statements P'1, we can readily
construct the blocks 𝒞1, 𝒞2, 𝒞3, 𝒞4, and 𝒞5 in the proof of Theorem 11.2.
Block 𝒞1 is obtained by using T4 to move the second statement in front of
the first, as shown below:
      𝒞1                    𝒞3
      X2 ← B * B            X2 ← B * B
      X1 ← A * A            X1 ← A * A
      X3 ← X1 - X2          X4 ← X1 + X2
      X4 ← X1 + X2          X3 ← X1 - X2
      X5 ← X3 / X4          X5 ← X3 / X4
Then 𝒞2 = 𝒞1. Block 𝒞3 is constructed from 𝒞2, using T4 to move the fourth
statement in front of the third as shown above, and 𝒞4 and 𝒞5 are both the
same as 𝒞3. □
We shall now show that ℬ1 ≡ ℬ2 if and only if ℬ1 ⇔_{1,2,3,4} ℬ2. In fact there
is a stronger result, namely that ℬ1 ≡ ℬ2 if and only if ℬ1 ⇔_{1,2} ℬ2. That is,
transformations T1 and T2 are sufficient to map any block into any other
equivalent block. We shall leave the proof of this stronger result for the
Exercises. (See Exercises 11.1.9 and 11.1.10.)
DEFINITION
LEMMA 11.4
If 63s = (P1, Is, Us) and 632 = (/2, 12, U2) are equivalent reduced blocks,
then E(P~) = E(P2).
Proof If E(Ps) ~ E(P2), we may, without loss of generality, let ~ be the
last computed expression in E(Pa)--E(P2). Since v ( ~ ) = v(632) and each
expression may be uniquely split into subexpressions, it follows that ~ is not
a subexpression of any expression in v(63~). Thus, the statement computing
r/in P~ is useless and can be eliminated using transformation T~, contradicting
the assumption that 63~ was reduced. Details are left for the Exercises. D
THEOREM 11.3
COROLLARY
All reduced blocks equivalent to a given block have the same dag. □
We can now put the various pieces together to obtain the result that
the four transformations are sufficient to transform a block into any of its
equivalents.
THEOREM 11.4
ℬ1 ≡ ℬ2 if and only if ℬ1 ⇔_{1,2,3,4} ℬ2.
Proof. The "if" portion is the corollary to Theorem 11.1. Conversely,
assume that ℬ1 ≡ ℬ2. Then there exist reduced blocks ℬ'1 and ℬ'2 such that
ℬ1 ⇔_{1,2} ℬ'1 and ℬ2 ⇔_{1,2} ℬ'2. By the corollary to Theorem 11.1, ℬ1 ≡ ℬ'1 and
ℬ2 ≡ ℬ'2. Thus, ℬ'1 ≡ ℬ'2. By Theorem 11.3, D(ℬ'1) = D(ℬ'2). By Theorem
11.2, ℬ'1 ⇔_{3,4} ℬ'2. Hence, ℬ1 ⇔_{1,2,3,4} ℬ2. □
DEFINITION
Lemma 11.5 states that given a block ℬ we can confine our search for
an equivalent optimal block to the set of reduced blocks equivalent to ℬ.
The following lemma states that only reduced blocks equivalent to a given
reduced block ℬ will be found by applying a sequence of transformations
T3 and T4 to ℬ.
LEMMA 11.6
If ℬ1 is a reduced block and ℬ1 ⇔_{3,4} ℬ2, then ℬ2 is reduced.
Proof. Exercise. □
Our next result shows that if we have an open block initially, then a se-
quence of renamings followed by a flip can be replaced by the flip followed
by the renamings.
LEMMA 11.7
Let ℬ1 be an open block and ℬ1 ⇔_3 ℬ2 ⇒_4 ℬ3. Then there exists a block
ℬ'2 such that ℬ1 ⇒_4 ℬ'2 ⇔_3 ℬ3.
Proof. Exercise. □
Proof. Let ℬ'' be any reduced block equivalent to ℬ. We can transform
ℬ'' into ℬ', an open block equivalent to ℬ'', using only T3. By Lemma 11.6,
ℬ' is reduced as well as open.
Let ℬ2 be an optimal reduced block equivalent to ℬ. By Lemma 11.5,
ℬ2 exists. Thus, D(ℬ2) = D(ℬ') by the corollary to Theorem 11.3. By
Theorem 11.2, ℬ' ⇔_{3,4} ℬ2. We observe that T3 and T4 are their own "inverses,"
that is, 𝒞 ⇒_{3,4} 𝒞' if and only if 𝒞' ⇒_{3,4} 𝒞. Hence, we can find a sequence of
blocks 𝒞1, . . . , 𝒞n such that ℬ' = 𝒞1, ℬ2 = 𝒞n, and 𝒞i ⇒_{3,4} 𝒞i+1 for 1 ≤ i < n.
Using Lemma 11.7 iteratively, we can move all uses of T4 ahead of those of
T3. Thus, we can find ℬ1 such that ℬ' ⇔_4 ℬ1 ⇔_3 ℬ2.
Example 11.9
We shall now take an example that has some interesting ideas not found
elsewhere in the book. The reader is urged to examine it closely. Let us
consider generating machine code for blocks. We postulate a computer
with a single accumulator and the following assembly language instructions
with meanings as shown.
(1) LOAD M. Here the contents of memory location M are loaded into
the accumulator.
(2) STORE M. Here the contents of the accumulator are stored into
memory location M.
(3) θ M2, M3, . . . , Mr. Here θ is the name of an r-ary operator. The first
argument of θ is in the accumulator, the second in memory location M2,
the third in memory location M3, and so forth. The result obtained by applying
θ to its arguments is placed in the accumulator.
      LOAD B1
      θ B2, . . . , Br
      STORE A
      F = (A + B) * (A - B)
      G = (A - B) * (A - C) * (B - C)
      T ← A + B
      S ← A - B
      F ← T * S
      T ← A - B
      S ← A - C
      R ← B - C
      T ← T * S
      G ← T * R
      X1 ← A + B
      X2 ← A - B
      X3 ← X1 * X2
      X4 ← A - C
      X5 ← B - C
      X6 ← X2 * X4
      X7 ← X6 * X5
The dag for ℬ2 is shown in Fig. 11.4. Node ni is created from the statement
of P2 which sets Xi.
We observe that there are a large number of programs into which ℬ2
can be transformed using only T4. We leave it for the Exercises to show
that this number is the same as the number of linear orders of which the
partial order represented by Fig. 11.4 is a subset.
An upper bound on that number would be 7!, the number of permutations
of the seven statements. However, the actual number will be less in
this case, as not all statements of P2 can ever pass over each other by using
T4. For example, the third statement of P2 must always follow the second,
because the third references X2 and the second defines it. Note that an application
of T3 may change the name of X2 but that the same relation will hold
with a new name.
Another interpretation of the limits on T4's ability to reorder the block
is to observe that in any such reordering, each node of D(ℬ2) will correspond
to some statement. The statement corresponding to an interior node n
cannot precede any statement corresponding to an interior node which is
a descendant of node n.
While the problem of this example is simple enough to enumerate all
linear orderings of P2, we cannot afford the time to do this for an arbitrary
block. Some heuristic that will produce good, although not necessarily
optimal, orderings quickly is needed. We propose one here. The following
algorithm produces a linear ordering of the nodes of a dag. The desired block
has statements corresponding to these nodes in reverse order. We express
the algorithm as follows:
[Fig. 11.4 The dag for ℬ2: n7 is the root, with direct descendants n6 and n5; n6 has direct descendants n2 and n4; n3 has direct descendants n1 and n2]
      X5 ← B - C
      X4 ← A - C
      X2 ← A - B
      X6 ← X2 * X4
      X7 ← X6 * X5
      X1 ← A + B
      X3 ← X1 * X2
      (a) From ℬ2            (b) From ℬ3
      LOAD A                 LOAD B
      ADD B                  SUBTR C
      STORE X1               STORE X5
      LOAD A                 LOAD A
      SUBTR B                SUBTR C
      STORE X2               STORE X4
      LOAD X1                LOAD A
      MULT X2                SUBTR B
      STORE X3               STORE X2
      LOAD A                 MULT X4
      SUBTR C                MULT X5
      STORE X4               STORE X7
      LOAD B                 LOAD A
      SUBTR C                ADD B
      STORE X5               MULT X2
      LOAD X2                STORE X3
      MULT X4
      MULT X5
      STORE X7
†However, care must be exercised if the operands of a commutative operator are func-
tions with side effects. For example, f(x) + g(y) may not be equal to g(y) + f(x) if the
function f alters the value of y.
α + (β + γ) = (α + β) + γ.†
(3) A binary operator θ1 distributes over a binary operator θ2 if
α θ1 (β θ2 γ) = (α θ1 β) θ2 (α θ1 γ). For example, multiplication distributes
over addition because α * (β + γ) = α * β + α * γ. The same caveats as
for (1) and (2) also apply here.
(4) A unary operator θ is a self-inverse if θθα = α for all α. For example,
Boolean not and unary minus are self-inverses.
(5) An expression e is said to be an identity under a (binary) operator
θ if e θ α = α θ e = α for all α. Some common examples of identity expressions
are
(a) The constant 0 is an identity under addition. So is any expression
which has the value 0, such as α - α, α * 0, (-α) + α, and so
forth.
(b) The constant 1 is a multiplicative identity.
(c) The Boolean constant true is a conjunctive identity.
(That is, α and true = α for all α.)
(d) The Boolean constant false is a disjunctive identity.
(That is, α or false = α for all α.)
If 𝒜 is a set of algebraic laws, we say that expression α is equivalent to
expression β under 𝒜, written α ≡_𝒜 β, if α can be transformed into β using
the algebraic laws in 𝒜.
Example 11.10
Suppose that we have the expression
      A * (B * C) + (B * A) * D + A * E
Using the algebraic laws above, this can be transformed into
      (A * B) * (C + D) + A * E
Finally, applying the associative law to the first term and then the distributive
law, we can write the expression as
      A * (B * (C + D) + E)
†One must also use this transformation with care. For example, suppose x is very
much larger than y, z = -x, and floating-point calculation is done. Then (y + x) + z
may give 0 as a result, while y + (x + z) gives y as an answer.
Example 11.11
If + is commutative, then the transformation on blocks corresponding
to this algebraic law would allow us to replace a statement of the form
X ← A + B in a block by the statement X ← B + A.
The associated transformation on dags would allow us to replace the
structure
by the structure
Example 11.12
Let us consider the transformation on blocks corresponding to the
associative law for +. Here we can replace a sequence of two statements
of the form
      X ← B + C
      Y ← A + X
by the three statements
      X ← B + C
      X' ← A + B
      Y ← X' + C
where X' is a new variable. This transformation would have the following
analog on dags:
[figure: the analogous transformation on dags]
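A sketch of this associative rewriting at the statement level (tuple representation as before; the fresh-variable naming scheme is ours):

def associate_plus(stmts, i):
    # Apply the associative transformation to statements i and i+1, which
    # must have the form  X <- B + C ;  Y <- A + X.
    x, op1, (b, c) = stmts[i]
    y, op2, (a, x2) = stmts[i + 1]
    assert op1 == op2 == "+" and x2 == x
    xp = x + "'"                      # a new variable (assumed unused)
    return (stmts[:i + 1] +
            [(xp, "+", (a, b)), (y, "+", (xp, c))] +
            stmts[i + 2:])

# associate_plus([("X", "+", ("B", "C")), ("Y", "+", ("A", "X"))], 0)
# yields  X <- B + C ;  X' <- A + B ;  Y <- X' + C.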
Example 11.13
Consider the block ℬ = (P, I, {Y}), where I = {A, B, C, D, E, F} and
P is the following sequence of statements:
      X1 ← B - C
      X2 ← A * X1
      X3 ← E * F
      X4 ← D * X3
      Y ← X2 * X4
ℬ computes the expression
      Y = (A * (B - C)) * (D * (E * F))
[Fig. 11.6 Dag for ℬ]
      X3 ← E * F
      X4 ← D * X3
by the three statements
      X3 ← E * F
      X'3 ← D * E
      X4 ← X'3 * F
Now the statement X3 ← E * F is useless and can be deleted by transformation
T1. Then using the associative transformation, we can replace the
statements
      X4 ← X'3 * F
      Y ← X2 * X4
by the statements
      X4 ← X'3 * F
      X'4 ← X2 * X'3
      Y ← X'4 * F
Deleting the now useless statement X4 ← X'3 * F, we are left with the block
      X1 ← B - C
      X2 ← A * X1
      X'3 ← D * E
      X'4 ← X2 * X'3
      Y ← X'4 * F
Now if we apply the associative transformation once more to the third and
fourth statements, we obtain (after deleting the resulting useless statement)
the block
      X1 ← B - C
      X2 ← A * X1
      X''3 ← X2 * D
      X'4 ← X''3 * E
      Y ← X'4 * F
Finally, using the commutative law to interchange the operands of the second
statement, we obtain the block ℬ', whose statements are
      X1 ← B - C
      X2 ← X1 * A
      X''3 ← X2 * D
      X'4 ← X''3 * E
      Y ← X'4 * F
The dag for ℬ' is shown in Fig. 11.7. ℬ' has a cost of 7, the lowest possible
cost for a block equivalent to ℬ. In the next section we shall give a systematic
method for optimizing arithmetic expressions using the associative
and commutative algebraic laws.
EXERCISES
      T ← A + B
      R ← A * T
      S ← B + C
      F ← R * S
      T ← A * A
      R ← A + B
      S ← A * R
      G ← S + T
(a) What is v(ℬ)?
(b) Indicate the scope of each statement in P.
(c) Does P have any useless statements?
(d) Transformation T2 is applicable to the first and sixth statements.
What values may D (as defined on p. 851) take in this application
of T2?
(e) Draw a dag for ℬ.
(f) Find an equivalent reduced block for ℬ.
(g) How many different reduced blocks are equivalent to ℬ except for
renaming? (More technically, let ℬ' be an open reduced block
equivalent to ℬ. What is the cardinality of {ℬ'' | ℬ'' ⇔_4 ℬ'}?)
(h) Find a block equivalent to ℬ that is optimal according to the cost
criterion of Example 11.9 (p. 863).
11.1.2. Prove that transformations T1, T3, and T4 preserve block equivalence
(that is, if ℬ ⇒_i ℬ', then v(ℬ) = v(ℬ') for i = 1, 3, and 4).
†Do not forget that the set of possible names of variables is infinite. Thus, some bookkeeping
techniques such as those mentioned in Section 10.1 must be used.
*11.1.9. Show that if ℬ1 ⇒_3 ℬ2, then there is a block ℬ3 such that ℬ3 ⇒_1 ℬ1
and ℬ3 ⇒_2 ℬ2. Thus, transformation T3 can be implemented using
one application of T1 in reverse followed by one application of T2.
*11.1.10. Show that if ℬ1 ⇒_4 ℬ2, then there is a block ℬ3 such that ℬ3 ⇒_{1,2} ℬ2
and ℬ3 ⇒_{1,2} ℬ1.
DEFINITION
A set S of transformations on blocks is complete if v(ℬ1) = v(ℬ2)
implies that ℬ1 ⇔_S ℬ2. S is minimal complete if no proper subset
of S is complete.
Exercises 11.1.9 and 11.1.10 show that {T1, T2} is complete. The
following two exercises show that {T1, T2} is minimal complete.
*11.1.11. Show that block ℬ = (P, {A, B}, {C, D}) cannot be transformed into
ℬ' = (P', {A, B}, {C, D}), where P and P' are as shown, using only transformations
T1, T3, and T4.
      P                     P'
      E ← A + B             C ← A + B
      D ← E * E             D ← C * C
      C ← A + B
      P                     P'
      C ← A * B             C ← A + B
      C ← A + B
*11.1.15. Consider blocks as defined but also include statements of the form
A ← B with the obvious meaning. Find a complete set of transformations
for such blocks.
• "11.1.16. Assume that addition is commutative. Let T5 be the transformation
which replaces a statement A ~ B + C by A ~ C + B. Show that T5
together with transformations T1 and T2 transform two blocks into
one another if and only if they are equivalent under the commutative
law of addition.
• "11.1.17. Assume that addition is associative. Le(T6 be the transformation which
replaces two statements X ~ A + B; Y ~ X + C by the three state-
ments X ~ A + B ; X'~B+C; Y~A +X'or the statements
X ~ - - B + C;Y ~---A + X b y X ~ - - B + C; X" + - A + B;Y~-- X" + C,
where X' is a new variable. Show that T6, T1, and Tz transform two
blocks into one another if and only if they are equivalent under the as-
sociative law of addition.
11.1.18. What is the transformation on blocks that corresponds to the distributive
law of * over +? What is the corresponding transformation on
dags?
• "11.1.19. Show that there exist sets of algebraic laws for which it is recursively
undecidable whether two expressions are equivalent.
DEFINITION
An algebraic law is operand-preserving if no operands are created
or destroyed by one application of the algebraic law. For example,
the commutative and associative laws are operand-preserving but the
distributive law is not.
An algebraic law is operator-preserving if the number of operators
is not affected by one application of the law. The algebraic law
θθα = α (self-inverse) is not operator-preserving, but the law
(α - β) - γ = α - (β + γ) is.
The number of interior nodes and the number of leaves in the
dag associated with a block are preserved when the transformations
corresponding to operator- and operand-preserving algebraic laws are
applied to the block.
"11.1.20. Show that under a set of operator- and operand-preserving algebraic
laws it is decidable whether two blocks are equivalent.
• "11.1.21. Extend Theorem 11.5 to apply to optimization of blocks using both
the topological transformations of Section 11.1.2 and an arbitrary
collection of operator- and operand-preserving algebraic transfor-
mations.
"11.1.22. Consider blocks in which variables can represent one-dimensional
arrays. Let us consider assignment statements of the form
(1) A(X) ~ - B and
(2) B ~ A(X),
Research Problems
11.1.33. Using the cost criterion of Example 11.9, or some other interesting cost
criterion, find a fast algorithm to find an optimal block equivalent to
a given one.
11.1.34. Find a collection of algebraic transformations that is useful in opti-
mizing a large class of programs. Devise efficient techniques for
applying these transformations.
P r o g r a m m i n g Exercises
11.1.35. Using a suitable representation for dags, implement transformations
T1 and T2 of this section.
11.1.36. Implement the heuristic suggested in Example 11.9 to "optimize" code
for a one-accumulator machine.
BIBLIOGRAPHIC NOTES
The presentation in this section follows Aho and Ullman [1972e]. Igarashi [1968]
discusses transformations on similar blocks with A ← B statements permitted
and names of output variables considered important. DeBakker [1971] considers
blocks in which all statements are of the form A ← B. Bracha [1972] treats straight-line
blocks with forward jumps.
Richardson [1968] proved that no algorithm to "simplify" expressions exists
when the expressions are taken to be over quite simple operators. The answer to
Exercise 11.1.19 can be found in his article. Caviness [1970] also treats classes of
algebraic laws for which equivalence of blocks is undecidable.
Floyd [1961a] and Breuer [1969] have considered algorithms to find common
subexpressions in straight-line blocks when certain algebraic laws pertain. Aho
and Ullman [1972f] discuss the equivalence of blocks with structured variables as
in Exercise 11.1.22. Some techniques useful for Exercise 11.1.32 can be found in
Aho, Sethi, and Ullman [1972].
Let us now turn our attention to the design of a code generator which
produces assembly language code for blocks. The input to the code generator
is a block consisting of a sequence of assignment statements. The output is
an equivalent assembly language program.
We would like the resulting assembly language program to be good under
some cost function such as number of assembly language instructions or
number of memory fetches. Unfortunately, as mentioned in the last section,
there is no efficient algorithm known that will produce optimal assembly
code, even for the simple "one-accumulator" machine of Example 11.9.
In this section we shall provide an efficient algorithm for generating
assembly code for a restricted class of blocks: those that represent one
expression with no identical operands. For this class of blocks our algorithm
will generate assembly language code that is optimal under a variety of cost
criteria, including program length and number of accumulators used.
While the assumption of no identical operands is certainly not realistic,
it is often a good first-order approximation. Moreover, if we are to generate
code using a syntax directed translation with synthesized attributes only, the
assumption is quite convenient. Finally, experience has shown that the prob-
lem of generating optimal code for expressions with even one pair of identical
operands is extremely difficult in comparison.
A block representing one expression has only one output variable. For
example, the assignment statement F = Z * (X + Y) can be represented by
the block ℬ = (P, {X, Y, Z}, {F}), where P is
      R ← X + Y
      F ← Z * R
Example 11.14
Consider the following assembly language program with two accumu-
lators A and B. The values of the accumulators after each instruction are
shown beside each instruction in infix notation, as usual.
                              v(A)            v(B)
      LOAD X, A              X               undefined
      ADD A, Y, A            X + Y           undefined
      LOAD Z, B              X + Y           Z
      MULT B, A, A           Z * (X + Y)     Z
We can assign values to the nodes of a tree from the bottom as follows:
(1) If node n is a leaf labeled X, then n has value X.
(2) If n is an interior node labeled θ with direct descendants n1 and n2
whose values are v1 and v2, then n has value θv1v2.
The value of a tree is the value of its root. For example, the value of the
tree in Fig. 11.8 is Z * (X + Y) in infix notation.
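A direct recursive rendering of this definition (function name and tuple representation are ours; values are written in prefix notation):

def tree_value(node):
    # node is ("X",) for a leaf, or (op, left, right) for an interior node.
    if len(node) == 1:
        return node[0]                        # a leaf's value is its label
    op, left, right = node
    return op + tree_value(left) + tree_value(right)   # prefix: op v1 v2

# tree_value(("*", ("Z",), ("+", ("X",), ("Y",)))) == "*Z+XY",
# i.e., Z * (X + Y) in infix notation.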
Let us briefly discuss the relation between the intermediate language
blocks of Section 11.1 and the assembly language programs we have just
defined. First, given a reduced block in which
(1) All operators are binary,
(2) Each input variable is referenced once, and
(3) There is exactly one output variable,
the dag associated with the block will be a tree. This tree is a syntax tree in
our current terminology. The value of the expression is also the value of
the block.
We can naturally convert the intermediate language block to an assembly
language program, statement by statement. It turns out that if this conversion
takes account of the possibility that desired values are already in accumu-
lators, then we can produce an optimal assembly program from a given
reduced open block using only transformation T4, as suggested by Theorem
11.5, and then performing conversion to assembly language.
However, it may not be entirely obvious that the above is true; the reader
should verify these facts for himself. What we achieve by essentially rework-
ing many of the definitions of Section 11.1 for assembly language programs
is to show that there is no strange optimal assembly language program which
is not related by any natural statement-by-statement conversion to an inter-
mediate language block obtainable from a reduced open block and trans-
formation T4.
Example 11.15
Output. An assembly language program P such that v(A1) after the last
instruction of P is v(T); i.e., P computes the expression represented by T,
leaving the result in accumulator A1.
Method. We assume that T has been labeled using Algorithm 11.1. We
then execute the following procedure code(n, i) recursively. The input to code
is a node n of T and an integer i between 1 and N. The integer i means that
accumulators Ai, Ai+1, . . . , AN are currently available to compute the expression
for node n. The output of code(n, i) is a sequence of assembly language
instructions which computes the value v(n), leaving the result in accumulator
Ai.
Initially we execute code(n0, 1), where n0 is the root of T. The sequence
of instructions generated by this call of the procedure code is the desired
assembly language program.
Procedure code(n, i).
as shown:
      OPθ Ai, X, Ai
Here X is the variable associated with leaf n2, and OPθ is the operation code
for operation θ. The output of code(n, i) is the output of code(n1, i) followed
by the instruction OPθ Ai, X, Ai.
Later we shall show that the following relationships between l1, l2, and i
hold when steps (5), (6), and (7) of Algorithm 11.2 are invoked:
      Step        Relation
      (5)         i ≤ N - l1
      (6)         i ≤ N - l2
      (7)         i = 1
Note also that Algorithm 11.2 requires instructions of type (4) of the form
      OPθ A, B, A
      OPθ A, B, B
By making the procedure code slightly more complicated in step (5), we can
eliminate the need for instructions of the form
      OPθ A, B, B
Example 11.16
Let T be the syntax tree consisting of the single node X (labeled 1). From
step (2) code is the single instruction LOAD X, A1. □
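The full text of procedure code and of the labeling Algorithm 11.1 is not reproduced above, so the following Python sketch is only our reconstruction of the general scheme; in particular, the leaf-labeling rule and the exact case analysis are assumptions and need not coincide step for step with the book's algorithm.

labels = {}          # node id -> label computed by label()
temp_count = 0       # counter for STORE temporaries

def label(node, is_left=True):
    # Assumed rule: a leaf gets 1 if it is a left descendant, else 0; an
    # interior node gets max(l1, l2) if the labels differ, l1 + 1 if equal.
    if len(node) == 1:
        l = 1 if is_left else 0
    else:
        _, n1, n2 = node
        l1, l2 = label(n1, True), label(n2, False)
        l = max(l1, l2) if l1 != l2 else l1 + 1
    labels[id(node)] = l
    return l

def code(node, i, N, out):
    # Emit instructions leaving the value of node in accumulator A_i,
    # using only accumulators A_i ... A_N, with STOREs when forced.
    global temp_count
    if len(node) == 1:
        out.append("LOAD %s, A%d" % (node[0], i))
        return
    op, n1, n2 = node
    l1, l2 = labels[id(n1)], labels[id(n2)]
    if len(n2) == 1:                       # right operand is a leaf
        code(n1, i, N, out)
        out.append("%s A%d, %s, A%d" % (op, i, n2[0], i))
    elif l1 >= N and l2 >= N:              # both subtrees need all accumulators
        code(n2, i, N, out)
        temp_count += 1
        t = "TEMP%d" % temp_count
        out.append("STORE A%d, %s" % (i, t))
        code(n1, i, N, out)
        out.append("%s A%d, %s, A%d" % (op, i, t, i))
    elif l1 >= l2:                         # evaluate the harder (left) side first
        code(n1, i, N, out)
        code(n2, i + 1, N, out)
        out.append("%s A%d, A%d, A%d" % (op, i, i + 1, i))
    else:                                  # right side is harder
        code(n2, i, N, out)
        code(n1, i + 1, N, out)
        out.append("%s A%d, A%d, A%d" % (op, i, i + 1, i))

tree = ("MULT", ("Z",), ("ADD", ("X",), ("Y",)))
label(tree)
prog = []
code(tree, 1, 2, prog)
# prog: LOAD Z, A1 / LOAD X, A2 / ADD A2, Y, A2 / MULT A1, A2, A1,
# which agrees with the program derived in Example 11.17.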
Example 11.17
Let T be the labeled syntax tree in Fig. 11.10. The assembly language
program for T using Algorithm 11.2 with N = 2 is produced as follows.
The following sequence of calls of code(n, i) is generated. We also show the
step of Algorithm 11.2 which is invoked during each call. Here, we indicate
a node by the variable or operator associated with it.
      code(*, 1)     (3c)
      code(Z, 1)     (2)
      code(+, 2)     (3a)
      code(X, 2)     (2)
'"7"32 T4: T1
r3
. 1 ~ ULTAI'A2'A1
T3"T2
Ti: LOAD Z, ADD A2, Y, A 2
G O
T2: LOAD X, A 2
The call code(X, 2) generates the instruction LOAD X, A2, which is the translation
associated with node X. The call code(+, 2) generates the instruction
sequence
      LOAD X, A2
      ADD A2, Y, A2
This program is similar (but not identical) to that in Example 11.14. The
value in accumulator A1 at the end of this program is clearly Z * (X + Y).
□
Example 11.18
Let us apply Algorithm 11.2 with N = 2 to the syntax tree in Fig. 11.9
(p. 883). The following sequence of calls of code(n, i) is generated. Here *L
refers to the left descendant of /, *R to the right descendant of /, -L to the
right descendant of *L, and -R to the right descendant of *R. The step of
Algorithm 11.2 which is applicable during each call is also shown.
      Call              Step
      code(/, 1)        (3d)
      code(*R, 1)       (3c)
      code(D, 1)        (2)
      code(-R, 2)       (3a)
      code(E, 2)        (2)
      code(*L, 1)       (3c)
      code(A, 1)        (2)
      code(-L, 2)       (3a)
      code(B, 2)        (2)
      LOAD D, A1
      LOAD E, A2
      SUBTR A2, F, A2
      MULT A1, A2, A1
      STORE A1, TEMP1
      LOAD A, A1
      LOAD B, A2
      SUBTR A2, C, A2
      MULT A1, A2, A1
      DIV A1, TEMP1, A1
We shall prove that the label of the root of the labeled syntax tree pro-
duced by Algorithm 11.1 is the smallest number of accumulators needed to
compute that expression without using any STORE instructions.
We begin by making several observations about Algorithm 11.2.
LEMMA 11.8
The program produced by procedure code(n, i) in Algorithm 11.2 correctly
computes the value of node n, leaving that value in the ith accumulator.
Proof. An elementary induction on the height of a node. □
LEMMA 11.9
If Algorithm 11.2, with N accumulators available, is applied to the root
of a syntax tree, then when procedure code(n, i) is called on node n with
label l, either
(1) l > N and N accumulators are available for this call (i.e., i = 1), or
(2) l ≤ N and at least l accumulators are available for this call (i.e.,
i ≤ N - l + 1).
Proof. Another elementary induction, this time on the number of calls
of code(n, i) made prior to the call in question. □
THEOREM 11.6
Let T be a syntax tree and let N be the number of available accumulators.
Let l be the label of the root of T. Then there exists a program to compute
T which uses no STORE instructions if and only if l ≤ N.
Proof
If." If I ~ N, then step (7) of procedure code(n, i) is never executed. That
is, a node whose two direct descendants have labels equal to or greater than N
has a label at least N + 1 itself. Step (7) is the only step which generates a
STORE instruction. Therefore, if l ~ N, the program constructed by Algo-
rithm 11.2 has no STORE's.
Only if: Assume that 1 > N. Since N ~ 1, we must have 1 ~ 2. Suppose
that the conclusion is false. Then we may assume without loss of generality
that T has a program P which computes it using N accumulators, that P
has no STORE statements, and that there is no syntax tree T' which has
fewer nodes than T and also violates the conclusion. Since the label of the
root of T exceeds 1, T cannot be a single leaf. Let n be the root and let n1
and n2 be its direct descendants, with labels l1 and l2, respectively.
Case 1" 1~ : 1. The only way the value of n can be computed is for the
value of n~ to appear at some time in an accumulator, since n~ cannot be
a leaf. We form a new program P ' from P by deleting those statements follow-
ing the computation of the value of n I . Then P ' computes the subtree with
root n~ and has no STORE's. Thus, a violation with fewer nodes than T
occurs, contrary to our assumption about T.
Case 2:12 : l. This case is similar to case 1.
Case 3" 11 : 12 : 1 - 1. We have assumed that no two leaves have the
same associated variable name. We can assume without loss of generality
that P is "as short as possible," in the sense that if any statement were deleted,
the value of n would no longer appear in the same accumulator at the end
of P. Thus, the first statement of P must be LOAD X, A, where X is the
variable name associated with some leaf of T, for any other first statement
could be deleted.
Let us assume that X is the value of a leaf which is a descendant of n1
(possibly n1 itself). The case in which X is a value of a descendant of n2 is
symmetric and will be omitted. Then until n1 is computed, there is always at
least one accumulator which holds a value involving X. This value could
not be used in a correct computation of the value of n2. We may conclude
that from P we can find a program P' which computes the value of n2, with
label l - 1, which uses no STORE's and no more than N - 1 accumulators
at any time. We leave it to the reader to show that from P' we can find an
equivalent P'' which never mentions more than N - 1 different accumulators.
(Note that P' may mention all N accumulators, even though it is not "using"
more than N - 1 at any time.) Thus, the subtree of n2 forms a smaller violation
of our conditions, contradicting the minimality of T. We conclude
that no violation can occur. □
Example 11.19
Consider the syntax tree of Fig. 11.9 (p. 883) again, with N = 2. The only
major node is the root. There are four minor nodes, the leaves with values
A, B, D, and E. □
LEMMA 11.10
Let T be a syntax tree. There exists a program to compute T using m
LOAD's if and only if T has no more than m minor nodes.
Proof. If we examine procedure code(n, i) of Algorithm 11.2, we find that
only step (2) introduces a LOAD statement. Since step (2) applies only to
minor nodes, the "if" portion is immediate.
The "only if" portion is proved by an argument similar to that of Theorem
11.6, making use of the facts that the only way the value of a leaf can appear
in an accumulator is for it to be "LOADed" and that the left argument of
any operator must be in an accumulator. □
LEMMA 11.11
Let T be a syntax tree. There exists a program P to compute T using M
STORE's if and only if T has no more than M major nodes.
Proof.
If: Again referring to procedure code(n, i), only step (7) introduces a
STORE, and it applies only to major nodes.
Only if: This portion is by induction on the number of nodes in T. The
basis, a tree with one node, is trivial, as the label of the root is 1, and there
are thus no major nodes. Assume the result for syntax trees of up to k - 1
nodes, and let T have k nodes.
Consider a program P which computes T, and let M be the number of
major nodes of T. We can assume without loss of generality that P has as
few STORE's as any program computing T. If M = 0, the desired result is
immediate, and so assume that M ≥ 1. Then P has at least one STORE,
because the label of a major node is at least N + 1, and if no STORE's were
present in P, a violation of Theorem 11.6 would occur.
The value stored by the first STORE instruction of P must be the value
of some node n of T, or else a program with fewer STORE's than P but com-
puting T could easily be found. Moreover, we may assume that n is not
a leaf for the same reason. Let T' be the syntax tree formed from T by making
node n a leaf and giving it some new name X as value. Then T' has fewer nodes
than T, and so the inductive hypothesis applies to it. We can find a program
P ' which evaluates T' using exactly one fewer STORE than P. P ' is constructed
from P by deleting exactly those statements needed to compute the first
value stored and replacing subsequent references in P to the location used
for that STORE by the name X until a new value is stored there.
If we can show that T' has at least M - 1 major nodes, we are done,
since by the inductive hypothesis, we can then conclude that P' has at least
M - 1 STORE's and thus that P has at least M STORE's.
We observe that no descendant of n in T can be major, since a violation
of Theorem 11.6 would occur. Consider a major node n' of T. If n is not
a descendant of n', then n' will be a major node in T'. Thus, it suffices to
consider those major nodes n1, n2, . . . on the path from n to the root of T.
By the argument of case 3 of Theorem 11.6, n cannot itself be major. The
first node, n1, if it exists, may no longer be major in T'. However, the label
of n1 in T' is at least N, because the direct descendant of n1 that is not an
ancestor of n must have a label at least N in T and T'. Thus, n2, n3, . . . are
still major nodes in T'. We conclude that T' has at least M - 1 major nodes.
The induction is now complete. □
THEOREM 11.7
the tree and Algorithm 11.2 yields one such instruction for each interior
node, the theorem follows.
Example 11.20
As pointed out in Example 11.19, the arithmetic expression of Fig. 11.9
has one major and four minor nodes (assuming that N = 2). It also has five
interior nodes. Thus, at least ten statements are necessary to compute it.
The program of Example 11.18 has ten statements. Note that one of these is
a STORE, four are LOAD's, and the rest operations. □
Given a set 𝒜 of algebraic laws, we say that two syntax trees T1 and T2
are equivalent under 𝒜, written T1 ≡_𝒜 T2, if there exists a sequence of transformations
derived from these laws which will transform T1 into T2. We shall
write [T]_𝒜 to denote the equivalence class of trees {T' | T ≡_𝒜 T'}.
Thus, if we are given a syntax tree T and we know that a certain set
of algebraic laws prevails, then to find an optimal program for T we might
want to search [T]_𝒜 for an expression tree with the minimum cost. Once we
have found a minimum cost tree, we can apply Algorithm 11.2 to find the
optimal program. Theorem 11.7 guarantees that the resulting program will
be optimal.
If each law preserves the number of operators, as do the commutative
and associative laws, then we need only minimize the sum of major and
minor nodes. As an example, we shall give algorithms to do this minimization,
first in the case that some operators are commutative and second in the case
that some commutative operators are also associative.
Given a syntax tree T and a set 𝒜 of algebraic laws, the next algorithm
will find a syntax tree T' in [T]_𝒜 of minimal cost provided that 𝒜 contains
only commutative laws applying to certain operators. Algorithm 11.2 can
then be applied to T' to find the optimal program for the original tree T.
ALGORITHM 11.3
Minimal-cost syntax tree assuming some commutative operators.
Input. A syntax tree T (with three or more nodes) and a set 𝒜 of commutative laws.
Output. A syntax tree in [T]𝒜 of minimal cost.
Method. The heart of the algorithm is a recursive procedure commute(n) which takes a node n of the syntax tree as argument and returns as output a modified subtree with node n as root. Initially, commute(n0) is called, where n0 is the root of the given tree T.
Procedure commute(n).
(1) If node n is a leaf, commute(n) = n.
(2) If node n is an interior node, there are two cases to consider:
(a) Suppose that node n has two direct descendants n1 and n2 (in this order) and that the operator attached to n is commutative. If n1 is a leaf and n2 is not, then the output of commute(n) is the tree of Fig. 11.12(a).
(b) In all other cases the output of commute(n) is the tree of Fig. 11.12(b). □
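The two trees of Fig. 11.12 are not reproduced here; the sketch below assumes the usual reading of the procedure, namely that Fig. 11.12(a) interchanges the two operands (so that the non-leaf operand is evaluated first) while Fig. 11.12(b) keeps their original order, with commute applied recursively to both operands in either case. The Node class and the set COMMUTATIVE are illustrative names, not taken from the text.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        op: Optional[str] = None        # None for a leaf
        name: Optional[str] = None      # operand name, used only by leaves
        left: Optional["Node"] = None
        right: Optional["Node"] = None

    COMMUTATIVE = {"+", "*"}            # assumed set of commutative operators

    def commute(n: Node) -> Node:
        """Sketch of procedure commute from Algorithm 11.3."""
        if n.op is None:                                    # case (1): a leaf
            return n
        left, right = commute(n.left), commute(n.right)
        if n.op in COMMUTATIVE and left.op is None and right.op is not None:
            left, right = right, left                       # case (2a): Fig. 11.12(a)
        return Node(op=n.op, left=left, right=right)        # case (2b): Fig. 11.12(b)

For example, a node computing A * (B + C) with the leaf A as its first operand would be reordered so that B + C is evaluated before A is loaded.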
Example 11.21
Consider Fig. 11.9 (p. 883) and assume that only * is commutative. Then the result of applying Algorithm 11.3 to that tree is shown in Fig. 11.13. Note that the label of the root of Fig. 11.13 is 2 and that there are two minor nodes.
Fig. 11.12 Result of the commute procedure: (a) and (b).
Thus, if two accumulators are available, only seven statements are needed to compute this tree, compared with ten for Fig. 11.9. □
THEOREM 11.8
If the only algebraic law permitted is the commutative law for certain operators, then Algorithm 11.3 produces that syntax tree in the equivalence class of the given tree which has the least cost.
Proof. It is easy to see that the commutative law cannot change the number of interior nodes. A simple induction on the height of a node shows that Algorithm 11.3 minimizes the number of minor nodes and the label that would be associated with each node after applying Algorithm 11.1. Hence, the number of major nodes is also minimized. □
The situation is more complex when certain operators are both commutative and associative. In this case we can often transform the tree extensively to reduce the number of major nodes.
DEFINITION
Let T be a syntax tree. A set S of two or more nodes of T is a cluster if
(1) Each node of S is an interior node with the same associative and commutative operator.
(2) The nodes of S, together with their connecting edges, form a tree.
(3) No proper superset of S has properties (1) and (2).
The root of the cluster is the root of the tree formed as in (2) above. The direct descendants of a cluster S are those nodes of T which are not in S but are direct descendants of a node in S.
Example 11.22
Consider the syntax tree of Fig. 11.14, where + and * are considered associative and commutative, while no other algebraic laws pertain.
Fig. 11.14 A syntax tree with its three clusters (Cluster 1, Cluster 2, and Cluster 3) circled.
The three clusters are circled. The cluster which includes the root of the tree has as direct descendants, in order from the left, the root of cluster 2, the node to which the − operator is attached, and the root of cluster 3. □
DEFINITION
Let T be a syntax tree in which certain operators are both associative and commutative. The associative tree T' for T is formed by replacing each cluster S of T by a single node n having the same associative and commutative operator as the nodes of the cluster S. The direct descendants of the cluster in T are made direct descendants of n in T'.
Example 11.23
Consider the syntax tree T in Fig. 11.15. Assuming that + and * are both associative and commutative, we obtain the clusters which are circled in Fig. 11.15. The associative tree for T is shown in Fig. 11.16. Note that the associative tree is not necessarily a binary tree.
We can label the nodes of an associative tree with integers from the bottom up as follows:
(1) A leaf which is the leftmost direct descendant of its ancestor is labeled 1. All other leaves are labeled 0.
(2) Let n be an interior node having nodes n1, n2, ..., nm, m ≥ 2, with labels l1, l2, ..., lm as direct descendants.
(a) If one of l1, l2, ..., lm is larger than all the others, let that integer be the label of node n.
(b) If node n has a commutative operator and ni is an interior node with li = 1 and the rest of n1, ..., ni−1, ni+1, ..., nm are leaves, then label node n by 1.
(c) Provided that (b) does not apply, if li = lj for some i ≠ j and li is greater than or equal to all other lk's, let the label of node n be li + 1.
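For concreteness, the labeling rules above can be rendered as a short recursive procedure over an n-ary associative tree. The class NTree, its fields, and the set COMMUTATIVE below are illustrative names for a sketch of the rules just stated, not code from the text.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class NTree:
        op: Optional[str] = None            # None for a leaf
        name: Optional[str] = None
        children: List["NTree"] = field(default_factory=list)

    COMMUTATIVE = {"+", "*"}

    def assoc_label(n: NTree, leftmost: bool = True) -> int:
        """Label a node of an associative tree, bottom up, per rules (1)-(2)."""
        if n.op is None:                                    # rule (1)
            return 1 if leftmost else 0
        labels = [assoc_label(c, i == 0) for i, c in enumerate(n.children)]
        top = max(labels)
        if labels.count(top) == 1:                          # rule (2a): a unique maximum
            return top
        interior = [(c, l) for c, l in zip(n.children, labels) if c.op is not None]
        if n.op in COMMUTATIVE and len(interior) == 1 and interior[0][1] == 1:
            return 1                                        # rule (2b)
        return top + 1                                      # rule (2c): a tie among the largest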
Example 11.24
Consider the associative tree in Fig. 11.16. The labeled associative tree is shown in Fig. 11.17.
Note that condition (2b) of the labeling procedure applies to the third and fourth direct descendants of the root, since * is a commutative operator. □
We now give an algorithm which takes a given syntax tree and produces the tree in its equivalence class with the smallest cost.
ALGORITHM 11.4
Minimal-cost syntax tree, assuming that certain operators are commutative and that certain operators are both associative and commutative, but that no other algebraic laws pertain.
Input. A syntax tree T and a set 𝒜 of commutative and associative-commutative laws.
Output. A syntax tree in [T]𝒜 of minimal cost.
Method. First create T', the labeled associative tree for T. Then compute acommute(n0), where acommute is the procedure defined below and n0 is the root of T'. The output of acommute(n0) is a syntax tree in [T]𝒜 of minimal cost.
Procedure acommute(n).
The argument n is a node of the labeled associative tree. If n is a leaf, acommute(n) is n itself. If n is an interior node, there are three cases to consider:
(1) Suppose that node n has two direct descendants n1 and n2 (in this order) and that the operator attached to n is commutative (and possibly associative).
(a) If n1 is a leaf and n2 is not, then the output of acommute(n) is the tree of Fig. 11.18(a).
(b) Otherwise, acommute(n) is the tree of Fig. 11.18(b).
(2) Suppose that θ, the operator attached to n, is commutative and associative and that n has direct descendants n1, n2, ..., nm, m ≥ 3, in order from the left.
Let nmax be a node among n1, ..., nm having the largest label. If two or more nodes have the same largest label, then choose nmax to be an interior node. Let p1, p2, ..., pm−1 be, in any order, the remaining nodes in {n1, ..., nm} − {nmax}.
Then the output of acommute(n) is the binary tree of Fig. 11.19, where each ri, 1 ≤ i ≤ m − 1, is a new node with the associative and commutative operator θ of n attached.
Fig. 11.18 Trees produced in case (1) of acommute: (a) and (b).
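Figures 11.18 and 11.19 are not reproduced here, and case (3) of the procedure (covering the remaining operators) falls outside this excerpt. The sketch below therefore makes explicit assumptions rather than asserting the book's figures: Fig. 11.18 is taken to mirror Fig. 11.12 of Algorithm 11.3, Fig. 11.19 is taken to be the left-deep chain of new nodes r1, ..., rm−1 in which nmax is evaluated first and the remaining operands are folded in one at a time, and operators not covered by cases (1) and (2) are simply recursed into without reordering. It reuses Node and COMMUTATIVE from the sketch after Algorithm 11.3 and NTree and assoc_label from the labeling sketch; ASSOC_COMMUTATIVE is likewise an assumed name.

    ASSOC_COMMUTATIVE = {"+", "*"}       # assumed associative and commutative operators

    def acommute(n: NTree) -> Node:
        """Sketch of procedure acommute (Algorithm 11.4) under the assumptions above."""
        if n.op is None:
            return Node(name=n.name)                         # a leaf
        kids = n.children
        if n.op in ASSOC_COMMUTATIVE and len(kids) >= 3:     # case (2)
            labels = [assoc_label(c, i == 0) for i, c in enumerate(kids)]
            top = max(labels)
            tied = [c for c, l in zip(kids, labels) if l == top]
            interior = [c for c in tied if c.op is not None]
            nmax = (interior or tied)[0]                     # prefer an interior node
            rest = [c for c in kids if c is not nmax]
            tree = acommute(nmax)                            # nmax is evaluated first
            for p in rest:                                   # assumed shape of Fig. 11.19
                tree = Node(op=n.op, left=tree, right=acommute(p))
            return tree
        a, b = acommute(kids[0]), acommute(kids[1])          # two direct descendants
        if n.op in COMMUTATIVE and a.op is None and b.op is not None:
            a, b = b, a                                      # case (1a): assumed Fig. 11.18(a)
        return Node(op=n.op, left=a, right=b)                # case (1b) and remaining operators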
Example 11.25
Let us apply Algorithm 11.4 to the labeled associative tree in Fig. 11.17. Applying acommute to the root, case (2) applies, and we choose to treat the first direct descendant from the left as nmax. The binary tree which is the output of Algorithm 11.4 is shown in Fig. 11.20. □
Case 2: n1 is in S, but n2 is not. Since T1 has fewer nodes than T, the inductive hypothesis applies to it. Thus, in T1, S − {n} has at least r − 2 major nodes if l2 ≥ N and at least r − 1 major nodes if l2 < N. In the latter case, the conclusion is trivial. In both cases, the result is trivial if r ≤ 1. Thus, consider the case r > 1 and l2 ≥ N. Then S − {n} has at least one direct descendant with label ≥ N, so l1 ≥ N. Thus, n is a major node, and S contains at least r − 1 major nodes.
is a syntax tree T whose root, after applying the labeling Algorithm 11.1, has the same label as the root of A. No tree in [T]𝒜 has a root with label smaller than the label of the root of A, and no tree in [T]𝒜 has fewer major or minor nodes.
Suppose otherwise. Then let T be a smallest tree violating one of those conditions. Let θ be the operator at the root of T.
Case 1: θ is neither associative nor commutative. Every associative or commutative transformation on T must take place wholly within the subtree dominated by one of the two direct descendants of the root of T. Thus, whether the violation is on the label, the number of major nodes, or the number of minor nodes, the same violation must occur in one of these subtrees, contradicting the minimality of T.
Case 2: θ is commutative but not associative. This case is similar to case 1, except that now the commutative transformation may be applied to the root. Since step (1) of procedure acommute takes full advantage of this transformation, any violation by T again implies a violation in one of its subtrees.
Case 3: θ is commutative and associative. Let S be the cluster containing the root. We may assume that no violation occurs in any of the subtrees whose roots are the direct descendants of S. Any application of an associative or commutative transformation must take place wholly within one of these subtrees, or wholly within S. Inspection of the result of step (2) of procedure acommute assures us that the number of minor nodes resulting from cluster S is minimized. By Lemma 11.12, the number of major nodes resulting from S is as small as possible (inspection of the result of procedure acommute is necessary to see this), and hence the label of the root is as small as possible.
Finally, we observe that the alterations made by Algorithm 11.4 can always be accomplished by applications of the associative and commutative transformations. □
EXERCISES
OP0 A,B,A
11.2.12. Let
!l if b > 0
sign(a, b) : if b = 0
[al if b < 0
A3 into the second, A5 into the third, and A7 into the fourth. In the second step we would add A2 to register 1, A4 to register 2, A6 to register 3, and A8 to register 4. After this step register 1 would contain A1 + A2, register 2 would contain A3 + A4, and so forth. At the third step we would add register 2 to register 1 and register 4 to register 3. At the fourth step we would add register 3 to register 1.
Define an N-register machine in which up to N parallel operations can be executed in one step. Assuming this machine, modify Algorithm 11.1 to generate optimal code (in the sense of fewest steps) for single arithmetic expressions with distinct operands.
Research Problem
11.2.24. Find an efficient algorithm that will generate optimal code of the type
mentioned in this section for an arbitrary block.
Programming Exercise
11.2.25. Write programs to implement Algorithms 11.1-11.4.
BIBLIOGRAPHIC NOTES
Many papers have been written on the generation of good code for arithmetic
expressions for a specific machine or class of machines. Floyd [1961a] discusses a
number of optimizations involving arithmetic expressions including detection of
common subexpressions. He also suggested that the second operand of a non-
commutative binary operator be evaluated first. Anderson [1964] gives an algorithm
for generating code for a one-register machine that is essentially the same as the
code produced by Algorithm 11.1 when N = 1. Nakata [1967] and Meyers [1965]
give similar results.
The number of registers required to compute an expression tree has been inves-
tigated by Nakata [1967], Redziejowski [1969], and Sethi and Ullman [1970].
Algorithms 11.1-11.4 as presented here were developed by Sethi and Ullman [1970].
Exercise 11.2.11 was suggested by P. Stockhausen. Beatty [1972] and Frailey [1970]
discuss extensions involving the unary minus operator. An extension of Algo-
rithm 11.2 to certain dags was made by Chen [1972].
There are no known efficient algorithms for generating optimal code for arbi-
trary expressions. One heuristic technique for making register assignments in a
sequence of expression evaluations is to use the following algorithm.
Suppose that expression α is to be computed next and its value stored in a fast register (accumulator).
(1) If the value of α is already stored in some register i, then do not recompute α. Register i is now "in use."
(2) If the value of α is not in any register, store the value of α in the next unused register, say register j. Register j is now in use. If there is no unused register available, store the contents of some register k in main memory, and store the value of α in register k. Choose register k to be that register whose value will be unreferenced for the longest time.
Belady [1966] has shown that this algorithm is optimal in some situations. However, the model assumed by this algorithm (which was designed for paging) does not exactly model straight-line code. In particular, it assumes the order of computation to be fixed, while as we have seen in Sections 11.1 and 11.2, there is often much advantage to be had by reordering computations.
A similar register allocation problem is discussed by Horwitz et al. [1966].
They assume that we are given a sequence of operations which reference and
change values. The problem is to assign these values to fast registers so that the
number of loads and stores from the fast registers to main memory is minimized.
Their solution is to select a least-cost path in a dag of possible solutions. Techniques
for reducing the size of the dag are given. Further investigation of register alloca-
tion where order of computation is not fixed has been done by Kennedy [1972]
and Sethi [1972].
Translating arithmetic expressions into code for parallel computers is discussed
by Allard et al. [1964], Hellerman [1966], Stone [1967], and Baer and Bovet [1968].
goto (label)
read A
write B
read Ai
read Az
read A,
if A r B goto L
means that if the relation r holds between the current values of A and B,
then control is to be transferred to statement labeled L. Otherwise, control
passes to the following statement.
A definition statement (or definition for short) is a statement of the form read A or of the form A ← θ B1 ⋯ Br. Both statements are said to define the variable A.
We shall make some further assumptions about programs. Variables
are simple variables, e.g., A, B, C , . . . , or simple variables indexed by one
simple variable or constant, e.g., A(1), A(2), A(I), or A(J). Further, we shall
assume that all variables referenced in a program must be either input vari-
ables (i.e., appear in a previous read statement) or have been previously
defined by an assignment statement. Finally, we shall assume that each
program has at least one halt statement and that if a program terminates,
then the last statement executed is a halt statement.
Execution of a program begins with the first statement of the program
and continues until a halt statement is encountered. We suppose that each
variable is of known type (e.g., integer, real) and that its value at any time
during the execution is either undefined or is a quantity of the appropriate
type. (It will be assumed that all operators used are appropriate to the types
of the variables to which they apply and that conversion of types occurs
when appropriate.)
In general the input variables of a program are those variables associated
with read statements and the output variables are the variables associated
with write statements. An assignment of a value to each input variable each
time it is read is called an input setting. The value of a program under an input
setting is the sequence of values written by the output variables during
the execution of the program. We say that two programs are equivalent if
for each input setting the two programs have the same value.t
This definition of equivalence is a generalization of the definition of equivalent blocks used in Section 11.1. To see this, suppose that two blocks ℬ1 = (P1, I1, U1) and ℬ2 = (P2, I2, U2) are equivalent in the sense of Section 11.1. We convert ℬ1 and ℬ2 into programs 𝒫1 and 𝒫2 in the obvious way.
tWe are assuming that the meaning of each operator and relational symbol, as well as
the data type of each variable, is established. Thus, our notion of equivalence differs from
that of the schematologists (see for example, Paterson [1968] or Luckham et al. [1970]),
in that they require two programs to give the same value not only for each input setting,
but for each data type for the variables and for each set of functions and relations that
we substitute for the operators and relational symbols.
That is, we place read statements for the variables in I1 and I2 in front of P1 and P2, respectively, and place write statements for the variables in U1 and U2 after P1 and P2. Then we append a halt statement to each program. However, we must add the write statements to P1 and P2 in such a fashion that each output variable is printed at least once and that the sequences of values printed will be the same for both 𝒫1 and 𝒫2. Since ℬ1 is equivalent to ℬ2, we can always do this.
The programs 𝒫1 and 𝒫2 are easily seen to be equivalent no matter what the space of input settings is and no matter what interpretation is placed on the functions represented by the operators appearing in 𝒫1 and 𝒫2. For example, we could choose the set of prefix expressions for the input space and interpret an application of operator θ to expressions e1, ..., er to yield θe1 ⋯ er.
However, if ℬ1 and ℬ2 are not equivalent and 𝒫1 and 𝒫2 are programs that correspond to ℬ1 and ℬ2, respectively, then there will always be a set of data types for the variables and interpretations for the operators that causes 𝒫1 and 𝒫2 to produce different output sequences. In particular, let the variables have prefix expressions as a "type" and let the effect of operator θ on prefix expressions e1, e2, ..., ek be the prefix expression θe1e2⋯ek.
Of course, we may make assumptions about data types and the algebra connected with the function and relation symbols that will cause 𝒫1 and 𝒫2 to be equivalent. In that case ℬ1 and ℬ2 will be equivalent under the corresponding set of algebraic laws.
Example 11.26
Consider the following program for the Euclidean algorithm described on p. 26 (Volume I). The output is to be the greatest common divisor of two positive integers p and q.
read p
read q
loop: r ← remainder(p, q)
if r = 0 goto done
p ← q
q ← r
goto loop
done: write q
halt
If, for example, we assign the input variables p and q the values 72 and 56, respectively, then the output variable q in the write statement will have the value 8, the greatest common divisor of 72 and 56. □
Example 11.27
For some types of data we might assume that a * a = 0 if and only if a = 0. If we assume such a law, then the following program is equivalent to the one in Example 11.26:
read p
read q
loop: r ← remainder(p, q)
t ← r * r
if t = 0 goto done
p ← q
q ← r
goto loop
done: write q
halt
Example 11.28
Consider the program of Example 11.26. There are four block entries, namely the first statement in the program, the statement labeled loop, the assignment statement p ← q, and the statement labeled done.
Thus, there are four basic blocks in the program. These blocks are given below:
Block 1:  read p
          read q
Block 2:  loop: r ← remainder(p, q)
          if r = 0 goto done
Block 3:  p ← q
          q ← r
          goto loop
Block 4:  done: write q
          halt  □
From the blocks of a program we can construct a graph that resembles the familiar flow chart for the program.
DEFINITION
A flow graph for a program P is a directed graph whose nodes are the basic blocks of P. There is an edge from block ℬ1 to block ℬ2 if ℬ2 can immediately follow ℬ1 in some execution of P, that is, if the last statement of ℬ1 is a conditional or unconditional transfer to the first statement of ℬ2, or if ℬ2 follows ℬ1 in the program and ℬ1 does not end in an unconditional transfer. The block containing the first statement of P is called the begin node of the flow graph.
Example 11.29
The flow graph for the program of Example 11.26 is given in Fig. 11.22. Block 1 is the begin node. □
Fig. 11.22 Flow graph for Example 11.26: Block 1 (read p; read q) → Block 2 (loop: r ← remainder(p, q); if r = 0 goto done); Block 2 → Block 3 (p ← q; q ← r; goto loop) → Block 2; Block 2 → Block 4 (done: write q; halt).
Let F be a flow graph whose blocks have names chosen from a set A. A sequence of blocks ℬ1 ⋯ ℬn in A* is a (block) computation path of F if
(1) ℬ1 is the begin node of F, and
(2) For 1 < i ≤ n, there is an edge from block ℬi−1 to ℬi.
In other words, a computation path ℬ1 ⋯ ℬn is a path from ℬ1 to ℬn in F such that ℬ1 is the begin node.
We say that block ℬ' dominates ℬ if ℬ' ≠ ℬ and every path from the begin node to ℬ contains ℬ'. We say that ℬ' directly dominates ℬ if
(1) ℬ' dominates ℬ, and
(2) If ℬ″ dominates ℬ and ℬ″ ≠ ℬ', then ℬ″ dominates ℬ'.
Thus, block ℬ' directly dominates ℬ if ℬ' is the block "closest" to ℬ which dominates ℬ.
Example 11.30
Referring to Fig. 11.22, the sequence 1 2 3 2 3 2 4 is a computation path. Block 1 directly dominates block 2 and dominates blocks 3 and 4. Block 2 directly dominates blocks 3 and 4. □
LEMMA 11.14
Every block except the begin node (which has no dominators) has a unique direct dominator.
Proof. Let S be the set of blocks that dominate some block ℬ. By Lemma 11.13 the dominance relation is a (strict) linear order on S. Thus, S has a minimal element, which must be the direct dominator of ℬ. (See Exercise 0.1.23.) □
Example 11.31
Let us compute the direct dominators for the flow graph of Fig. 11.22 using Algorithm 11.5. Here A = {ℬ1, ℬ2, ℬ3, ℬ4}. The successive values of DOM(ℬ) after considering ℬi, 2 ≤ i ≤ 4, are given below:

              DOM(ℬ2)   DOM(ℬ3)   DOM(ℬ4)
   Initial      ℬ1         ℬ1        ℬ1
   i = 2        ℬ1         ℬ2        ℬ2
   i = 3        ℬ1         ℬ2        ℬ2
   i = 4        ℬ1         ℬ2        ℬ2

Let us compute the line for i = 2. Deleting block ℬ2 makes blocks ℬ3 and ℬ4 inaccessible. We have thus determined that ℬ2 dominates ℬ3 and ℬ4. Prior to this point, DOM(ℬ2) = DOM(ℬ3) = ℬ1, and so by step (2) of Algorithm 11.5 we set DOM(ℬ3) to ℬ2. Likewise, DOM(ℬ4) is set to ℬ2. Deleting block ℬ3 or ℬ4 does not make any block inaccessible, so no further changes occur. □
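Algorithm 11.5 itself is not reproduced in this excerpt, so the following is a reconstruction of the computation it performs, pieced together from Example 11.31 and the proof of Theorem 11.10: delete each block ℬi in turn, note which blocks become inaccessible from the begin node (exactly the blocks ℬi dominates), and promote DOM(ℬ) to ℬi whenever ℬi dominates ℬ and DOM(ℬ) currently equals DOM(ℬi). Treat it as a sketch of that reconstruction, not as the algorithm verbatim; the function and variable names are ours.

    def direct_dominators(succ, begin):
        """Reconstruction of the direct-dominator computation (see caveat above).
        succ maps each block to the list of its direct successors."""
        def reachable_without(removed):
            seen, stack = set(), [begin]
            while stack:
                b = stack.pop()
                if b in seen or b == removed:
                    continue
                seen.add(b)
                stack.extend(succ[b])
            return seen

        blocks = list(succ)
        dom = {b: begin for b in blocks if b != begin}   # initially DOM(b) is the begin node
        for bi in blocks:
            if bi == begin:
                continue
            dominated = set(blocks) - reachable_without(bi) - {bi}   # step (1)
            for b in dominated:                          # step (2)
                if b != begin and dom[b] == dom[bi]:
                    dom[b] = bi
        return dom

Applied to the flow graph of Fig. 11.22, with succ = {1: [2], 2: [3, 4], 3: [2], 4: []}, this returns DOM(2) = 1 and DOM(3) = DOM(4) = 2, as in Example 11.31.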
THEOREM 11.10
When Algorithm 11.5 terminates, DOM(ℬ) is the direct dominator of ℬ.
Proof. We first observe that step (1) correctly determines those ℬ's dominated by ℬi, for ℬi dominates ℬ if and only if every path to ℬ from the begin node of F goes through ℬi.
We show by induction on i that after step (2) is executed, DOM(ℬ) is that block ℬh, 1 ≤ h ≤ i, which dominates ℬ but which, in turn, is dominated by all ℬj's, 1 ≤ j ≤ i, which also dominate ℬ. That such a ℬh must exist follows directly from Lemma 11.13. The basis, i = 2, is trivial.
Let us turn to the inductive step. If ℬi+1 does not dominate ℬ, the conclusion is immediate from the inductive hypothesis. If ℬi+1 does dominate ℬ, but there is some ℬj, 1 ≤ j ≤ i, such that ℬj dominates ℬ and ℬi+1 dominates ℬj, then DOM(ℬ) ≠ DOM(ℬi+1). Thus, DOM(ℬ) does not change, which correctly fulfills the inductive hypothesis. If ℬi+1 dominates ℬ but is dominated by all ℬj's which dominate ℬ, 1 ≤ j ≤ i, we claim that prior to this step, DOM(ℬ) = DOM(ℬi+1). For if not, there must be some ℬk, 1 ≤ k ≤ i, which dominates ℬi+1 but not ℬ, which is impossible by Lemma 11.13(1). Thus DOM(ℬ) is correctly set to ℬi+1, completing the induction. □
an output variable also fall into this category. In Section 11.4, we shall provide a tool for implementing this transformation in a program with loops.
Example 11.32
Consider the flow graph shown in Fig. 11.23. In this flow graph block ℬ1 dominates blocks ℬ2, ℬ3, and ℬ4. Suppose that all assignment statements involving the variables A, B, C, and D are as shown in Fig. 11.23. Then the expression B + C has the same value when it is computed in blocks ℬ1, ℬ3, and ℬ4. Thus, it is unnecessary to recompute the expression B + C in blocks ℬ3 and ℬ4. In block ℬ1 we can insert the assignment statement X ← A after the statement A ← B + C. Here X is a new variable name. Then in blocks ℬ3 and ℬ4 we can replace the statements A ← B + C and G ← B + C by the simple assignment statements A ← X and G ← X, respectively, without affecting the value of the program. Note that since A is computed in block ℬ2, we cannot use A in place of X. The resulting flow graph is shown in Fig. 11.24.
The assignment A ← X now in ℬ3 is redundant and can be eliminated. Also note that if the statement F ← A + G in block ℬ2 were changed to B ← A + G, then we could no longer replace G ← B + C by G ← X in block ℬ4. □
Fig. 11.23 Flow graph before the transformation:
  ℬ1:  B ← D + D;  C ← D * D;  A ← B + C
  ℬ2:  A ← B * C;  F ← A + G
  ℬ3:  A ← B + C;  E ← A * A
  ℬ4:  G ← B + C
  ℬ5
Fig. 11.24 Flow graph after the transformation:
  ℬ1:  B ← D + D;  C ← D * D;  A ← B + C;  X ← A
  ℬ2:  A ← B * C;  F ← A + G
  ℬ3:  A ← X;  E ← A * A
  ℬ4:  G ← X
  ℬ5
Example 11.33
Suppose that we have the block
read R
PI ← 3.14159
A ← 4/3
B ← A * PI
C ← R ↑ 3
V ← B * C
write V
Since PI, A, and B are each assigned values that can be computed at compile time (B = (4/3) * 3.14159 = 4.18878), the block can be replaced by the equivalent program:
read R
C ← R ↑ 3
V ← 4.18878 * C
write V
4. Reduction in Strength
Reduction in strength involves the replacement of one operator, requiring a substantial amount of machine time for execution, by a less costly computation. For example, suppose that a PL/I source program contains the statement
I = LENGTH(S1 || S2)
where S1 and S2 are strings of variable length. The operator || denotes string concatenation. String concatenation is relatively expensive to implement. However, suppose we replace this statement by the equivalent statement
I = LENGTH(S1) + LENGTH(S2)
We would now have to perform the length operation twice and perform one addition. But these operations are substantially less expensive than string concatenation.
Other examples of this type of optimization are the replacement of certain multiplications by additions and the replacement of certain exponentiations by repeated multiplication. For example, we might replace the statement C ← R ↑ 3 by the sequence
C ← R * R
C ← C * R
Example 11.34
Consider the abstract flow graph of Fig. 11.25. {2, 3, 4, 5} is a strongly connected region with entry 2. {4} is a strongly connected region with entry 4. {3, 4, 5, 6} is a region with entry 3. {2, 3, 7} is a region with entry 2. Another region with entry 2 is {2, 3, 4, 5, 6, 7}. The latter region is maximal in that every other region with entry 2 is contained in this region. □
THEOREM 11.11
Let F be a flow graph. Block ℬ in F is an entry block of a region if and only if there is some block ℬ' such that there is an edge from ℬ' to ℬ and ℬ either dominates ℬ' or is ℬ'.
Proof.
Only if: Suppose that ℬ is the entry block of region S. If S = {ℬ}, the result is trivial. Otherwise, let ℬ' be in S, ℬ' ≠ ℬ. Then ℬ dominates ℬ', for if not, then there is a path from the begin node to ℬ' that does not pass through ℬ, violating the assumption that ℬ is the unique entry block. Thus, the entry block of a region dominates every other block in the region. Since there is a path from every member of S to ℬ, there must be at least one ℬ' in S − {ℬ} which links directly to ℬ.
If: The case in which ℬ = ℬ' is trivial, and so assume that ℬ ≠ ℬ'. Define S to be ℬ together with those blocks ℬ″ such that ℬ dominates ℬ″ and there is a path from ℬ″ to ℬ which passes only through nodes dominated by ℬ. By hypothesis, ℬ and ℬ' are in S. We must show that S is a region with entry ℬ. Clearly, condition (2) of the region definition is satisfied, and so we
must show that there is a path from the begin node to ℬ that does not pass through any other block in S. Let 𝒞1 ⋯ 𝒞n ℬ be a shortest computation path leading to ℬ. If some 𝒞j is in S, then there is an i, 1 ≤ i < j, such that 𝒞i = ℬ, because ℬ dominates 𝒞j. But then 𝒞1 ⋯ 𝒞i is a shorter computation path leading to ℬ, contradicting the choice of a shortest one. Thus, condition (1) of the definition of a strongly connected region holds. □
The set S constructed in the "if" portion of Theorem 11.11 is clearly the maximal region with entry ℬ. It would be nice if there were a unique region with entry ℬ, but unfortunately this is not always the case. In Example 11.34, there are three regions with entry 2. Nevertheless, Theorem 11.11 is useful in constructing an efficient algorithm to compute maximal regions, which are unique.
Unless a region is maximal, the entry block may dominate blocks not in the region, and blocks in the region may be accessible from these blocks. In Example 11.34, e.g., region {2, 3, 7} can be reached via block 6. We therefore say that a region is single-entry if every edge entering a block of the region, other than the entry block, comes from a block inside the region. In Example 11.34, region {2, 3, 4, 5, 6, 7} is a single-entry region. In what follows, we assume regions to be single-entry, although the generalization to all regions is not difficult.
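The construction in the "if" part of Theorem 11.11 translates directly into a procedure for the maximal region with a given entry block: collect, working backwards from the entry over predecessor edges, every block that the entry dominates and that can reach the entry through dominated blocks. The sketch below assumes a predicate dom(a, b) that is true when block a dominates block b (for instance, derived from the DOM values of Algorithm 11.5); the function name and argument conventions are illustrative.

    def maximal_region(succ, dom, entry):
        """Maximal region with entry block `entry` (sketch of Theorem 11.11's construction)."""
        pred = {n: [] for n in succ}
        for n in succ:
            for m in succ[n]:
                pred[m].append(n)
        region = {entry}
        stack = [p for p in pred[entry] if dom(entry, p)]
        while stack:
            b = stack.pop()
            if b in region:
                continue
            region.add(b)
            stack.extend(p for p in pred[b] if dom(entry, p) and p not in region)
        return region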
1. Code Motion
There are several transformations in which knowledge of regions can be used to improve code. A principal one is code motion: we can move a region-independent computation outside the region. Suppose that within some single-entry region the variables Y and Z are not changed, but the statement X ← Y + Z appears. We may move the computation of Y + Z to a newly created block which links only to the entry block of the region.† All links from outside the region that formerly went to the entry now go to the new block.
†The addition of such a block may make the flow graph unconstructable from any program. However, the property of constructability from a program is never used here.
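The following sketch carries out this transformation on a very simplified program representation: each block is a list of (target, operator, operands) assignments, and a statement is moved only when none of its operands is defined anywhere in the region and its target is defined exactly once there (a deliberately conservative condition added here for safety; it is not stated in the text). Redirecting the edges that enter the region to the new block is left to the caller, and the names are illustrative.

    def move_invariant(blocks, region, preheader="new_block"):
        """Move region-independent computations of a single-entry region into a new block.
        blocks: dict block-name -> list of (target, op, operands) triples."""
        defined_in_region = [t for b in region for (t, _, _) in blocks[b]]
        moved = []
        new_blocks = dict(blocks)
        for b in region:
            kept = []
            for target, op, operands in blocks[b]:
                invariant = all(x not in defined_in_region for x in operands)
                defined_once = defined_in_region.count(target) == 1
                (moved if invariant and defined_once else kept).append((target, op, operands))
            new_blocks[b] = kept
        new_blocks[preheader] = moved        # this block links only to the region's entry
        return new_blocks

Taking only the assignment statements of the region {ℬ2, ℬ3} of Fig. 11.26 below, this moves T ← J + 1 (J and the constant 1 are not defined in the region, and T is defined only there) into the new block, which is exactly the transformation shown in Fig. 11.27.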
Example 11.35
It may appear that region-invariant computations would not occur except in the most carelessly written programs. However, let us consider the following inner DO loop of a FORTRAN source program, where J is defined outside the loop:
      K = 0
      DO 3 I = 1, 1000
3     K = J + 1 + I + K
The intermediate program for this portion of the source program might be
K ← 0
I ← 1
loop: T ← J + 1
S ← T + I
K ← S + K
if I = 1000 goto done
I ← I + 1
goto loop
done: halt
Fig. 11.26 Flow graph:
  ℬ1:  K ← 0;  I ← 1
  ℬ2:  T ← J + 1;  S ← T + I;  K ← S + K;  I = 1000?
  ℬ3:  I ← I + 1   (back to ℬ2)
  ℬ4:  halt
We observe that {ℬ2, ℬ3} in Fig. 11.26 is a region with entry ℬ2. The statement T ← J + 1 is invariant in the region, so it may be moved to a new block, as shown in Fig. 11.27.
While the number of statements in the flow graphs of Figs. 11.26 and 11.27 is the same, the presumption is that statements in a region will tend to be executed frequently, so that the expected time of execution has been decreased. □
2. Induction Variables
Another useful transformation concerns the elimination of what we shall call induction variables.
Fig. 11.27 Revised flow graph:
  ℬ1:  K ← 0;  I ← 1
  ℬ'2 (new block):  T ← J + 1
  ℬ2:  S ← T + I;  K ← S + K;  I = 1000?
  ℬ3:  I ← I + 1   (back to ℬ2)
  ℬ4:  halt
DEFINITION
Let S be a single-entry region with entry ℬ and let X be some variable appearing in a statement of the blocks of S. Let ℬ1 ⋯ ℬn 𝒞1 ⋯ 𝒞m be any computation path such that each 𝒞i is in S, 1 ≤ i ≤ m, and ℬn, if it exists, is not in S. Define X1, X2, ... to be the sequence of values of X each time X is assigned in the sequence 𝒞1 ⋯ 𝒞m. If X1, X2, ... forms an arithmetic progression (with positive or negative difference) for arbitrary computation paths as above, then we say that X is an induction variable of S.
We shall also consider X to be an induction variable if it is undefined the first time through ℬ and forms an arithmetic progression otherwise. In this case, it may be necessary to initialize it appropriately on entry to the region from outside, in order that the optimizations to be discussed here may be performed.
Note that it is not trivial to find all the induction variables in a region. In fact, it can be proved that no algorithm to do so exists (Exercise 11.3.17). Nevertheless, we can detect enough induction variables in common situations to make the concept worth considering.
Example 11.36
In Fig. 11.27, the region {ℬ2, ℬ3} has entry ℬ2. If ℬ2 is entered from ℬ'2 and the flow of control passes repeatedly from ℬ2 to ℬ3 and then back to ℬ2, the variable I takes on the values 1, 2, 3, .... Thus, I is an induction variable. Less obviously, S is an induction variable, since it takes on the values T + 1, T + 2, T + 3, .... However, K is not an induction variable, because it takes on the values T + 1, 2T + 3, 3T + 6, .... □
Example 11.37
Consider Fig. 11.27. We shall eliminate the induction variable I, which qualifies under the criteria listed above. Its role will be played by S. We observe that after executing ℬ2, S has the value T + I, so when control passes from ℬ3 back to ℬ2, the relation S = T + I − 1 must hold. We can thus replace the statement S ← T + I by S ← S + 1. But we must then initialize S correctly in ℬ'2, so that when control goes from ℬ'2 to ℬ2, the value of S after executing the statement S ← S + 1 is T + 1. Clearly, in block ℬ'2, we must introduce the new statement S ← T after the statement T ← J + 1.
We must then revise the test I = 1000? so that it is an equivalent test on S. When the test is executed, S has the value T + I. Consequently, an equivalent test is S = T + 1000. Since T + 1000 is region-invariant, we compute R ← T + 1000 once before the loop and replace the test by S = R?, obtaining the flow graph of Fig. 11.28. Since I is now useless it can be removed, and T can also be eliminated by initializing S ← J + 1 and R ← S + 1000 directly; the final flow graph is shown in Fig. 11.29.
Fig. 11.28 Further revised flow graph: S is now updated by S ← S + 1, the initialization code is K ← 0; I ← 1; T ← J + 1; S ← T; R ← T + 1000, and the loop test has become S = R?.
Fig. 11.29 Final flow graph: ℬ1 contains K ← 0; S ← J + 1; R ← S + 1000; the loop block ℬ2 contains S ← S + 1; K ← S + K; S = R?; ℬ4 is halt.
Fig. 11.30 Assembly code.
(a) Program from Fig. 11.26:
      LOAD  =0
      STORE K
      LOAD  =1
LOOP: STORE I
      LOAD  J
      ADD   =1
      ADD   I
      ADD   K
      STORE K
      LOAD  I
      SUBTR =1000
      JZERO DONE
      LOAD  I
      ADD   =1
      JUMP  LOOP
DONE: END
(b) Program from Fig. 11.29:
      LOAD  =0
      STORE K
      LOAD  J
      ADD   =1
      STORE S
      ADD   =1000
      STORE R
LOOP: LOAD  S
      ADD   =1
      STORE S
      ADD   K
      STORE K
      LOAD  S
      SUBTR R
      JNZ   LOOP
      END
Observe that the length of the program in Fig. 11.30(b) is the same as that of Fig. 11.30(a). However, the loop in Fig. 11.30(b) is shorter than the loop in Fig. 11.30(a) (8 instructions vs. 12), which is the important factor when time is considered. □
3. Reduction in Strength
An interesting form of reduction in strength is possible within regions. If within a region there is a statement of the form A ← B * I, where the value of B is region-independent and the values of I at that statement form an arithmetic progression, we can replace the multiplication by the addition or subtraction of a quantity which is the product of the region-independent value and the difference of the arithmetic progression of the induction variable. It is necessary to initialize properly the quantity computed by the former multiplication statement.
Example 11.38
Consider the following portion of a source program,
      DO 5 J = 1, N
      DO 5 I = 1, M
5     A(I, J) = B(I, J)
which sets array A equal to array B, assuming that both A and B are M by N arrays. Suppose that element A(I, J) is stored in location A + M * (J − 1) + I − 1 for 1 ≤ I ≤ M, 1 ≤ J ≤ N. Let us make a similar assumption about B(I, J). For convenience, let us denote location A + L by A(L). Then the following partially optimized intermediate program might be created from this source program:
M' ← M − 1
N' ← N − 1
J ← −1
outer: J ← J + 1
I ← −1
K ← M * J
loop: I ← I + 1
L ← K + I
A(L) ← B(L)
if I < M' goto loop
if J < N' goto outer
halt
The flow graph for this program is shown in Fig. 11.31. In this flow graph the value of M is region-independent, and at the statement K ← M * J the values of the induction variable J form an arithmetic progression with difference 1. We can therefore replace the multiplication K ← M * J by the addition K ← K + M, initializing K to −M outside the region. The result is the flow graph of Fig. 11.32.
Fig. 11.31 Flow graph:
  ℬ1:  M' ← M − 1;  N' ← N − 1;  J ← −1
  ℬ2:  J ← J + 1;  I ← −1;  K ← M * J
  ℬ3:  I ← I + 1;  L ← K + I;  A(L) ← B(L);  I < M'?
  ℬ4:  J < N'?
  ℬ5:  halt
4. Loop Unrolling
The final code-improving transformation which we shall consider is exceedingly simple but often overlooked. It is loop unrolling.
Fig. 11.32 New flow graph:
  ℬ1:  M' ← M − 1;  N' ← N − 1;  J ← −1;  K ← −M
  ℬ2:  J ← J + 1;  I ← −1;  K ← K + M
  ℬ3:  I ← I + 1;  L ← K + I;  A(L) ← B(L);  I < M'?
  ℬ4:  J < N'?
  ℬ5:  halt
Fig. 11.33 Final flow graph:
  ℬ1:  L ← −1;  T ← M * N
  ℬ2:  L ← L + 1;  A(L) ← B(L);  L < T?
  ℬ3:  halt
Consider the flow graph in Fig. 11.34. Blocks ℬ2 and ℬ3 are executed 100 times. Thus 100 test instructions are executed. We could dispense with all 100 test instructions by "unrolling" the loop. That is, the loop could be unfolded into a straight-line block consisting of 100 assignment statements:
A(1) ← B(1)
A(2) ← B(2)
⋮
A less frivolous approach would be to unroll the loop "once" to obtain the flow graph in Fig. 11.35. The program in Fig. 11.35 is longer, but fewer instructions are executed. In Fig. 11.35 only 50 test instructions are used, versus 100 for the program in Fig. 11.34.
Fig. 11.34 Flow graph:
  ℬ1:  I ← 1
  ℬ2:  A(I) ← B(I);  I > 100?
  ℬ3:  I ← I + 1   (back to ℬ2)
  ℬ4:  halt
Fig. 11.35 Unrolled flow graph:
  ℬ1:  I ← 1
  ℬ2:  A(I) ← B(I);  I ← I + 1;  A(I) ← B(I);  I > 100?
  ℬ3:  I ← I + 1   (back to ℬ2)
  ℬ4:  halt
EXERCISES
11.3.2. What functions are computed by the following two intermediate-language programs?
(a) read N
    S ← 0
    I ← 1
    loop: S ← S + I
    if I ≥ N goto done
    I ← I + 1
    goto loop
    done: write S
    halt
(b) read N
    T ← N + 1
    T ← T * N
    T ← T * .5
    write T
    halt
Are the two programs equivalent if N and I represent integers and S and T represent reals?
11.3.3. Consider the following program P:
    read A, B
    R ← 1
    C ← A * A
    D ← B * B
    if C < D goto X
    E ← A * A
    R ← R + 1
    E ← E + R
    write E
    halt
X:  E ← B * B
    R ← R + 2
    E ← E + R
    write E
    if E > 100 goto Y
    halt
Y:  R ← R − 1
    goto X
Construct a flow graph for P.
11.3.4. Find the dominators and direct dominators of each node in the flow
graph of Fig. 11.36.
11.3.5. Let ON(ℬ1, ℬ2) be the set of blocks that can appear on a path from block ℬ1 to block ℬ2 (without going through ℬ1 again, although the path may go through ℬ2 more than once) in a flow graph. Show that if ℬ1 dominates ℬ2, then
ON(ℬ1, ℬ2) = {ℬ | there is a path from ℬ to ℬ2 when ℬ1 is deleted from the flow graph}.
What is the time required to compute ON(ℬ1, ℬ2)?
11.3.12. Give algorithms to find all (a) regions, (b) single-entry regions, and (c) maximal regions in a flow graph.
*11.3.13. Give an algorithm to detect some of the induction variables in a single-entry region.
*11.3.14. Generalize the algorithm in Exercise 11.3.13 to handle regions that are not single-entry.
11.3.15. Give an algorithm to move region-independent computations out of a (not necessarily single-entry) region. Hint: Blocks outside the region that can reach the region other than by the region entry are permitted to change variables involved in the region-invariant computation.
We may need to place new blocks between blocks outside the region
and blocks within the region.
**11.3.16. Show that it is undecidable whether two programs are equivalent. Hint: Choose appropriate data types and interpretations for the operators.
**11.3.17. Show that it is undecidable whether a variable is an induction variable.
11.3.18. Generalize the notion of the scope of variables and statements to programs with loops. Give an algorithm to compute the scope of a variable in a program with loops.
*11.3.19. Extend transformations T1-T4 of Section 11.1 to apply to programs with no backward loops (programs with assignment statements and conditional statements of the form if x R y goto L, where L refers to a statement after this conditional statement).
*11.3.20. Show that it is undecidable whether a program will ever terminate.
Research Questions
11.3.21. Characterize the machine models for which the transformations we
have described will result in faster-running programs.
11.3.22. Develop algorithms that will detect large classes of the phenomena
with which we have been dealing in this section, e.g., loop-invariant
computations or induction variables. Note that, for most of these
phenomena, there is no algorithm to detect all such instances.
Open Question
11.3.23. Is it possible to compute direct dominators of an n-node flow graph in less than O(n²) steps? It is reasonable to suppose that O(n²) is the best we can do for the entire dominance relation, since it takes that long just to print the answer in matrix form.
BIBLIOGRAPHIC NOTES
There are several papers that have proposed various optimizing transformations
for programs. Nievergelt [1965], Marill [1962], McKeeman [1965], and Clark [1967]
list a number of machine independent transformations. Gear [1965] proposes an
optimizer capable of some common subexpression elimination, the propagation of
constants, and loop optimizations such as strength reduction and removal of in-
variant computations. Busam and Englund [1969] discuss similar optimizations in
the context of FORTRAN. Allen and Cocke [1972] provide a good survey of these
techniques. Allen [1969] discusses a global optimization scheme based on finding
the strongly connected regions of a program.
The dominator approach to code optimization was pioneered by Lowry and Medlock [1969], although the idea of the dominance relation comes from Prosser [1959].
There has been a great deal of theoretical work on program schemas, which are
similar to our flow graphs, but with unspecified spaces for the values of variables
and unspecified functions for operators. Two fundamental papers regarding equiva-
lence between such schemas independent of the actual spaces and functions are
Ianov [1958] and Luckham et al. [1970]. Kaplan [1970] and Manna [1973] survey
the area.
11.4. DATA FLOW ANALYSIS
11.4.1. Intervals
DEFINITION
Let F be a flow graph and let n be a node of F. The interval with header n, denoted I(n), is the set of nodes of F constructed as follows:
(1) n is in I(n).
(2) If m is a node not yet in I(n) and every edge entering m leaves a node in I(n), add m to I(n). Repeat this step until no more nodes can be added to I(n).
Example 11.39
Consider the flow graph of Fig. 11.37.
Let us consider the interval with header n1, the begin node. By step (1), I(n1) includes n1. Since the only edge to enter node n2 leaves n1, we add n2 to I(n1). Node n3 cannot be added to I(n1), since n3 can be entered from node n5 as well as n2. No other nodes can be added to I(n1). Thus, I(n1) = {n1, n2}.
Now let us consider I(n3). By step (1), n3 is in I(n3). However, we cannot add n4 to I(n3), since n4 may be entered via n6 (as well as n3) and n6 is not in I(n3). No other nodes can be added to I(n3), and so I(n3) = {n3}.
Continuing in this fashion, we can partition this flow graph into the following intervals:
ALGORITHM 11.6
Partitioning a flow graph into disjoint intervals.
Input. A flow graph F.
Output. A set of disjoint intervals whose union is all the nodes of F.
Method.
(1) We shall associate with each node of F two parameters, a count and
a reach. The count of a node n is a number which is initially the number of
edges entering n. While executing the algorithm, the count of n is the number
of these edges which have not yet been traversed. The reach of n is either
undefined or some node of F. Initially, the reach of each node is undefined,
except for the begin node, whose reach is itself. Eventually, the reach of a
node n will be the first interval header h found such that there is an edge
from some node in I(h) to n.
(2) We create a list of nodes called the header list. Initially, the header
list contains only the begin node of F.
(3) If the header list is empty, halt. Otherwise, let n be the next node on
the header list. Remove n from the header list.
(4) Then use steps (5)-(7) to construct the interval I(n). In these steps
the direct successors of I(n) are added to the header list.
(5) I(n) is constructed as a list of nodes. Initially, I(n) contains only node n, and n is "unmarked."
(6) Select an unmarked node n' on I(n), mark n', and for each node n″ such that there is an edge from n' to n″ perform the following operations:
(a) Decrease the count of n" by 1.
(b) (i) If the reach of n" is undefined, set it to n and do the follow-
ing. If the count of n is now 0 (having been 1), then add n"
to I (n) and go to step (7); otherwise, add n" to the header list
if not already there and go to step (7).
(ii) If the reach of n" is n and the count of n" is 0, add n" to
I(n) and remove n" from the header list, if it is there. Go to
step (7).
If neither (i) nor (ii) applies, do nothing in part (b).
(7) If an unmarked node remains in I(n), return to step (6). Otherwise, I(n) is complete, and we return to step (3). □
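A direct transcription of Algorithm 11.6 follows as a sketch. The function and variable names are ours, and Python sets and deques stand in for the lists in the algorithm; otherwise the steps correspond one for one.

    from collections import deque

    def intervals(succ, begin):
        """Sketch of Algorithm 11.6.  succ maps each node to its list of direct successors."""
        nodes = list(succ)
        count = {n: 0 for n in nodes}
        for n in nodes:
            for m in succ[n]:
                count[m] += 1              # step (1): count = number of entering edges
        reach = {begin: begin}             # step (1): reach defined only for the begin node
        header_list = deque([begin])       # step (2)
        on_header_list = {begin}
        result = {}
        while header_list:                 # step (3)
            n = header_list.popleft()
            on_header_list.discard(n)
            interval = [n]                 # step (5)
            unmarked = deque([n])
            while unmarked:                # steps (6)-(7)
                n1 = unmarked.popleft()    # select a node of I(n) and mark it
                for n2 in succ[n1]:
                    count[n2] -= 1                          # (6a)
                    if n2 not in reach:                     # (6b)(i)
                        reach[n2] = n
                        if count[n2] == 0:
                            interval.append(n2)
                            unmarked.append(n2)
                        elif n2 not in on_header_list:
                            header_list.append(n2)
                            on_header_list.add(n2)
                    elif reach[n2] == n and count[n2] == 0:  # (6b)(ii)
                        interval.append(n2)
                        unmarked.append(n2)
                        if n2 in on_header_list:
                            header_list.remove(n2)
                            on_header_list.discard(n2)
            result[n] = interval
        return result

Run on the flow graph of Example 11.40 below, this produces I(n1) = {n1}, I(n2) = {n2}, and I(n3) = {n3, n4, n5, n6, n7, n8}, following the same sequence of count and reach updates traced in that example.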
DEFINITION
From the intervals of a flow graph F, we can construct another flow
graph I(F) which we call the derived graph of F. I(F) is defined as follows:
(1) I(F) has one node for each interval constructed in Algorithm 11.6.
(2) The begin node of I(F) is the interval containing the begin node of F.
(3) There is an edge from interval I to interval J if and only if I ≠ J and there is an edge from a node of I to the header of J.
I(F), the derived graph of a flow graph F, shows the flow of control among
the intervals of F. Since I(F) is a flow graph itself, we can also construct
I(I(F)), the derived graph of I(F). Thus, given a flow graph F0 we can construct a sequence of flow graphs F0, F1, ..., Fn, which we call the derived sequence of F0, in which Fi+1 is the derived graph of Fi, for 0 ≤ i < n, and Fn is its own derived graph [i.e., I(Fn) = Fn]. We say that Fi is the ith derived graph of F0. Fn is called the limit of F0. It is not hard to show that Fn always exists and is unique.
If Fn is a single node, then F0 is said to be reducible.
It is interesting to note that if F 0 is constructed from an actual pro-
gram, there is a high probability that F 0 will be reducible. In Section 11.4.3,
we shall discuss a node-splitting technique whereby every irreducible flow
graph can be transformed into one that is reducible.
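Using the intervals function from the sketch above, the derived graph can be built directly from conditions (1)-(3) of the definition; again, the names below are illustrative, not from the text.

    def derived_graph(succ, begin):
        """Construct I(F): one node per interval, with edges per condition (3)."""
        parts = intervals(succ, begin)                # header -> member nodes, from the previous sketch
        owner = {m: h for h, members in parts.items() for m in members}
        dsucc = {h: set() for h in parts}
        for n in succ:
            for m in succ[n]:
                if owner[n] != owner[m] and m == owner[m]:   # an edge to the header of another interval
                    dsucc[owner[n]].add(owner[m])
        return dsucc, owner[begin]

Iterating derived_graph until the node set no longer shrinks yields the derived sequence F0, F1, ..., Fn; F0 is reducible exactly when that limit has a single node.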
Example 11.40
Let us use Algorithm 11.6 to construct the intervals for the flow graph of Fig. 11.38.
The begin node is n1. Initially, the header list contains only n1. To construct I(n1), we add n1 to I(n1) as an unmarked node. We mark n1 by considering n2, the direct successor of n1. In doing so, we decrease the count of n2 from its initial value of 2 to 1, set the reach of n2 to n1, and add n2 to the header list. At this point no unmarked nodes remain in I(n1), so I(n1) = {n1} is complete.
The header list now contains n2, the successor of I(n1). To compute I(n2), we add n2 to I(n2) and then consider n3, whose count is 2. We decrease the count of n3 by 1, set the reach of n3 to n2, and add n3 to the header list. Thus, we find that I(n2) = {n2}.
The header list now contains n3, the successor of I(n2). To compute I(n3), we begin by placing n3 in I(n3). We then consider nodes n4 and n5, decreasing their counts from 1 to 0, making their reach n3, and adding both n4 and n5 as unmarked nodes to I(n3). We mark n4 by decreasing the count of n6 from its initial value of 2 to 1, making n3 the reach of n6, and adding n6 to the header list. When we mark n5 on I(n3), we change the count of n6 from 1 to 0, remove n6 from the header list, and add n6 to I(n3).
To mark n6 on I(n3), we make the count of n7 0, set the reach of n7 to n3, and add n7 to I(n3). Node n3 is considered next, since there is an edge from n6 to n3. Since its reach is n2, n3 does not affect I(n3) or the header list at this point. To mark n7, we make the count of n8 0, set the reach of n8 to n3, and add n8 to I(n3). Node n2 is also a successor of n7, but since the reach of n2 is n1, n2 is not added to I(n3) or the header list. Finally, to mark n8, no operations are needed, since n8 has no successors. At this point no unmarked nodes remain in I(n3), and so I(n3) = {n3, n4, n5, n6, n7, n8}.
The header list is now empty, and so the algorithm terminates. In summary, we have partitioned the flow graph into three disjoint intervals:
I(n1) = {n1}
I(n2) = {n2}
I(n3) = {n3, n4, n5, n6, n7, n8}
From these intervals we can construct the first derived flow graph F1. We can
then apply Algorithm 11.6 to F1 to obtain its intervals. Repeating this entire
process, we construct the sequence of derived flow graphs shown in Fig.
11.39. □
Example 11.41
Consider the flow graph F in Fig. 11.40. The intervals for F are
I(n1) = {n1}
I(n2) = {n2}
I(n3) = {n3}
Fig. 11.39 The derived sequence of flow graphs F1, F2, and F3 for Example 11.40.
THEOREM 11.13
Algorithm 11.6 constructs a set of disjoint intervals whose union is the
entire graph.
We shall show how interval analysis can be used to determine the data
flow within a reducible graph. The particular problem that we shall discuss is
that of determining for each block ℬ and for each variable A of a reducible flow graph at which statements of the program A could have last been defined when control reaches ℬ. Subsequently, we shall extend the basic interval
analysis algorithm to irreducible flow graphs.
It is worthwhile pointing out that part of the merit in the interval approach
to data flow analysis lies in treating sets as packed bit vectors. The logical
AND, OR, and NOT operations on bit vectors serve to compute set inter-
sections, unions, and complements in a way that is quite efficient on most
computers.
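As a small illustration of this bit-vector view (not taken from the text), definitions numbered 0 through m − 1 can be packed into a machine word or, in Python, an integer, so that each of the set operations above costs a single logical operation per word:

    m = 5                                  # say the program has five definitions d0, ..., d4
    ALL = (1 << m) - 1
    IN_B    = 0b00101                      # {d0, d2}
    TRANS_B = 0b00110                      # {d1, d2}
    GEN_B   = 0b10000                      # {d4}
    OUT_B   = (IN_B & TRANS_B) | GEN_B     # intersection is AND, union is OR -> {d2, d4}
    NOT_IN  = ALL & ~IN_B                  # complement relative to all definitions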
We shall now construct tables that give, for each block ℬ in a program, all locations l at which a given variable A is defined, such that there is a path from l to ℬ along which A is not redefined. This information can be used to determine the possible values of A upon entering ℬ.
We begin by defining four set-valued functions on blocks.
DEFINITION
(1) IN(ℬ) = {d in P | d is a definition statement and there is a computation path from d to the first statement of ℬ, such that no statement in this path after d redefines the variable defined by d}.
(2) OUT(ℬ) = {d in P | d is a definition statement and there is a computation path from d through the last statement of ℬ, such that no statement in this path after d redefines the variable defined by d}.
(3) TRANS(ℬ) = {d in P | the variable defined by d is not defined by any statement in ℬ}.
(4) GEN(ℬ) = {d in ℬ | the variable defined by d is not subsequently defined in ℬ}.
Informally, IN(ℬ) contains those definitions that can be active going into ℬ. OUT(ℬ) contains those definitions that can be active coming out of ℬ. TRANS(ℬ) contains the definitions transmitted through ℬ without redefinition in ℬ. GEN(ℬ) contains those definitions generated in ℬ that are still active on leaving ℬ. It is easy to show that
OUT(ℬ) = (IN(ℬ) ∩ TRANS(ℬ)) ∪ GEN(ℬ)
Example 11.42
Consider the following program:
S1: I ← 1
S2: J ← 0
S3: J ← J + I
S4: read I
S5: if I < 100 goto S8
S6: write J
S7: halt
S8: I ← I * I
S9: goto S3
We have labeled each statement for convenience. The flow graph for this program is shown in Fig. 11.41. Each block has been explicitly labeled.
Let us determine IN, OUT, TRANS, and GEN for block ℬ2.
Statement S1 defines I, and S1, S2, S3 is a computation path that does not define I (except at S1). Since this path goes from S1 to the first statement of ℬ2, we see that S1 ∈ IN(ℬ2). In this manner we can show that IN(ℬ2) = {S1, S2, S3, S8}, OUT(ℬ2) = {S3, S4}, TRANS(ℬ2) = ∅, and GEN(ℬ2) = {S3, S4}. □
SI" ! ~ 1
$2" J ~ 0
"¢_i
i=1
IN(CB) -- U OUT((B~)
k
= U [(IN(6~,) n TRANS(CB3) u GEN(CB,)]
i=1
To compute IN(ℬ), we could write this set equation for each block in the program, along with IN(ℬ0) = ∅, where ℬ0 is the begin block, and then attempt to solve the collection of simultaneous equations.† However, we shall give an alternative method of solution that takes advantage of the interval structure of flow graphs. We first define what we mean by an entrance and an exit of an interval.
†As with the regular expression equations of Section 2.2, the solution may not be unique. Here we want the smallest solution.
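Before turning to the interval method, note that the simultaneous equations can also be solved by straightforward iteration from the empty sets, which converges to the desired smallest solution. The sketch below does exactly that, with Python sets standing in for bit vectors; it is offered only for comparison with Algorithm 11.7, and the names are ours.

    def reaching_definitions(pred, gen, trans, begin):
        """Iteratively solve IN(B) = union over predecessors P of (IN(P) ∩ TRANS(P)) ∪ GEN(P)."""
        IN = {b: set() for b in pred}
        changed = True
        while changed:
            changed = False
            for b in pred:
                if b == begin:
                    continue                       # IN(begin block) = empty set
                new = set()
                for p in pred[b]:
                    new |= (IN[p] & trans[p]) | gen[p]   # OUT(p)
                if new != IN[b]:
                    IN[b], changed = new, True
        return IN

On the flow graph of Fig. 11.41, with pred = {ℬ1: [], ℬ2: [ℬ1, ℬ3], ℬ3: [ℬ2], ℬ4: [ℬ2]} and the GEN and TRANS sets tabulated in Example 11.45, this iteration yields the same IN sets that Algorithm 11.7 produces there.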
DEFINITION
Let P be a program and F0 its flow graph. Let F0, F1, ..., Fn be the derived sequence of F0. Each node in Fi, i ≥ 1, is an interval of Fi−1 and is called an interval of order i.
The entrance of an interval of order 1 is the interval header. (Note that this header is a block of the program.) The entrance of an interval of order i > 1 is the entrance of the header of that interval. Thus, the entrance of any interval is a basic block of the underlying program P.
An exit of I(n), an interval of order 1, is the last statement of a block ℬ in I(n) such that ℬ has a direct descendant which is either the interval header n or a block not in I(n). An exit of an interval I(n) of order i > 1 is the last statement of a block ℬ contained within I(n) such that there is an edge in F0 from ℬ either to the header of interval n or to a block outside I(n).
Note that each interval has one entrance and zero or more exits.
Example 11.43
Let F0 be the flow graph in Fig. 11.41. Using Algorithm 11.6, we obtain
I1 = I(ℬ1) = {ℬ1}
I2 = I(ℬ2) = {ℬ2, ℬ3, ℬ4}
and obtain the limit flow graph F2, also shown in Fig. 11.42.
[Figure labels only: "Interval I in Fi" and "Intervals in Fi−1".]
Example 11.44
Let us consider F0 of Fig. 11.41 and F1 and F2 of Fig. 11.42. In F1, interval I2 is {ℬ2, ℬ3, ℬ4} and has exit S9. Thus, IN(I2) = IN(ℬ2) = {S1, S2, S3, S8}, and OUT(I2, S9) = OUT(ℬ3) = {S3, S8}.
TRANS(I2, S9) = ∅, since TRANS(ℬ2) = ∅.
GEN(I2, S9) contains S8, since there is a sequence of blocks consisting of ℬ3 alone, with S8 in GEN(ℬ3, S9). Also, S3 is in GEN(I2, S9), because of the sequence of blocks ℬ2, ℬ3, with exits S5 and S9. That is, S3 is in GEN(ℬ2, S5), ℬ2 is a direct predecessor of ℬ3, and S3 is in TRANS(ℬ3, S9). □
We shall now give an algorithm to compute IN(ℬ) for all blocks of a program P. The following algorithm works only for those programs that have a reducible flow graph. Modifications necessary to do the computation for irreducible flow graphs are given in the next section.
ALGORITHM 11.7
Computation of the IN function.
Input. A reducible flow graph F0 for a program P.
Output. IN(ℬ) for each block ℬ of P.
Method.
(1) Let F0, F1, ..., Fk be the derived sequence of F0. Compute TRANS(ℬ) and GEN(ℬ) for all blocks ℬ of F0.
(2) For i = 1, ..., k, in turn, compute TRANS(I, s) and GEN(I, s) for all intervals I of order i and exits s of I. The recursive definition of these functions assures that this can be done.
(3) Define IN(I) = ∅, where I is the lone interval of order k. Set i = k.
(4) Do the following for all intervals of order i. Let I = {I1, ..., In} be
Example 11.45
Let us apply Algorithm 11.7 to the flow graph of Fig. 11.41.
It is straightforward to compute GEN and TRANS for the four blocks of F0. These results are summarized below:

            GEN             TRANS
  ℬ1      {S1, S2}          ∅
  ℬ2      {S3, S4}          ∅
  ℬ3      {S8}              {S2, S3}
  ℬ4      ∅                 {S1, S2, S3, S4, S8}

For example, since ℬ3 defines only the variable I, ℬ3 "kills" the previous definitions of I but transmits the definitions of J, namely S2 and S3. Since no block defines a variable twice, all definition statements within a block are in GEN of that block.
We observe that I1, consisting of ℬ1 alone, has one exit, the statement S2.
†If an interval has two exits connecting to the same next interval, they can be "merged" for efficiency of implementation. The "merger" consists of taking the union of the GEN and TRANS functions.
Since paths in I1 are trivial, GEN(I1, S2) = {S1, S2} and TRANS(I1, S2) is the empty set.
I2 has exit S9. We saw in Example 11.44 that GEN(I2, S9) = {S3, S8} and TRANS(I2, S9) = ∅.
We can thus begin to compute the IN function. As required, IN(I3) = ∅. Then we can apply step (4) of Algorithm 11.7 to the two subintervals of I3. The only permissible order for these is I1, I2. We compute, in step (4a), IN(I1) = IN(I3) = ∅, and in step (4b),
OUT(I1, S2) = (IN(I1) ∩ TRANS(I1, S2)) ∪ GEN(I1, S2) = {S1, S2}.
Continuing in this fashion, we obtain
IN(ℬ1) = ∅
IN(ℬ2) = {S1, S2, S3, S8}
IN(ℬ3) = {S3, S4}
IN(ℬ4) = {S3, S4} □
THEOREM 11.14
In Algorithm 11.7, for all basic blocks ℬ of P, IN(ℬ) is the set of definitions d such that there is a path in F0 from d to the first statement of ℬ along which no statement redefines the variable defined by d. □
Example 11.46
Consider the irreducible flow graph F in Fig. 11.40 (p. 943). We can split node n3 into two copies, n′3 and n″3, to obtain the flow graph F′ shown in Fig. 11.44. The intervals for F′ are
I1 = I(n1) = {n1, n′3}
I2 = I(n2) = {n2, n″3}
F′1, the first derived graph of F′, will have two nodes, as shown in Fig. 11.45. The second derived graph of F′ consists of a single node. Thus by node splitting we have transformed F into a reducible flow graph F′. □
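Node splitting of this kind is easy to express on an adjacency-list representation. The sketch below replaces a node by one copy per entering edge, each copy inheriting the original node's successors; it ignores self-loops, and the naming of the copies is only illustrative.

    def split_node(succ, target):
        """Split `target` into one copy per entering edge (sketch; self-loops not handled)."""
        entering = [n for n in succ for m in succ[n] if m == target and n != target]
        new_succ = {n: [m for m in ms if m != target] for n, ms in succ.items()}
        copies = []
        for i, p in enumerate(entering, 1):
            copy = f"{target}'{i}"                       # e.g. n3'1, n3'2
            new_succ[p].append(copy)
            new_succ[copy] = [m for m in succ[target] if m != target]
            copies.append(copy)
        del new_succ[target]
        return new_succ, copies

For instance, split_node({'n1': ['n2', 'n3'], 'n2': ['n3'], 'n3': ['n2']}, 'n3') reproduces the splitting of Example 11.46: n3 becomes two copies, one entered from n1 and one from n2, each with an edge back to n2.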
ALGORITHM 11.8
General computation of the IN function.
Input. An arbitrary flow graph F for a program P.
Output. IN(ℬ) for each block ℬ of P.
Method.
(1) Compute GEN(ℬ) and TRANS(ℬ) for each block ℬ of F. Then apply step (2) recursively to F. The input to step (2) is a flow graph G with GEN(I, s) and TRANS(I, s) known for each node I of G and each exit s of I. The output of step (2) is IN(I) for each node I of G.
(2) (a) Let G be the input to this step and let G, G1, ..., Gk be the derived sequence of G. If Gk is a single node, proceed exactly as in Algorithm 11.7. If Gk is not a single node, we may compute GEN and TRANS for all the nodes of G1, ..., Gk as in Algorithm 11.7. Then, by Lemma 11.15, Gk has some node other than the begin node with more than one entering edge. Select one such node I. If I has j entering edges, replace I by new nodes I1, ..., Ij. One edge enters each of I1, ..., Ij, each from a different node from which an edge previously entered I.
(b) For each exit s of I, create an exit si of Ii, 1 ≤ i ≤ j, and imagine that in F there is an edge from each si to the entrance of every node to which s connected in Gk. Define GEN(Ii, si) = GEN(I, s) and TRANS(Ii, si) = TRANS(I, s).
Example 11.47
Consider the flow graph F0 of Fig. 11.46(a). We can compute F1 = I(F0), which is shown in Fig. 11.46(b). However, I(F1) = F1, so we must apply the node-splitting procedure of step (2). Let node {n2, n5} be I, and split I into I1 and I2. The result is shown in Fig. 11.47. We have chosen to connect n1 to I1 and {n3, n4} to I2. Edges from I1 and I2 to {n3, n4} have been drawn. Actually, each exit of I is duplicated, one for I1 and one for I2. It is the duplicated exits which connect to the entrance of {n3, n4}, a fact which is represented by the two edges in Fig. 11.47. Note that the graph of Fig. 11.47 is reducible.
THEOREM 11.15
Algorithm 11.8 always terminates.
Proof. By Lemma 11.15, each call of step (2) is either on a reducible
graph, in which case the call surely terminates, or there is a node I which
can be split. We observe that each of I1, . . ., Ij created in step (2) has a single
entering edge. Thus, when the interval construction is applied, they will
each find themselves in an interval with another node as header. We conclude
that the next call of step (2) will be on graphs with at least one fewer node,
so Algorithm 11.8 must terminate. □
THEOREM 11.16
Algorithm 11.8 correctly computes the IN function.
Proof. It suffices to observe that GEN and TRANS for I1, . . ., Ij in step
(2) are the same as for I. Moreover, IN(I) is clearly IN(I1) ∪ ··· ∪ IN(Ij), and OUT(I)
is OUT(I1) ∪ ··· ∪ OUT(Ij). Since each Ii connects wherever I connects, the IN function
for nodes other than I in Gk is the same as in G′. Thus, a simple induction on
the number of calls of step (2) shows that IN is correctly computed. □
EXERCISES
11.4.1. Construct the derived sequence of flow graphs for the flow graphs in
Fig. 11.32 (p. 931) and Fig. 11.36 (p. 934). Are the flow graphs reduci-
ble?
11.4.2. Give additional examples of irreducible flow graphs.
11.4.3. Prove Theorem 11.12(1) and (2).
*11.4.4. Show that Algorithm 11.6 can be implemented to run in time propor-
tional to the number of edges in flow graph F.
N ← 2
Y: I ← 2
W: if I < N goto X
write N
Z: N ← N + 1
goto Y
X: J ← remainder(N, I)
if J = 0 goto Z
I ← I + 1
goto W

read I
if I = 1 goto X
Z: if I > 10 goto Y
X: J ← I + 3
write J
W: I ← I + 1
goto Z
Y: I ← I - 1
if I > 15 goto W
halt
11.4.15. Show that every d-chart (See Section 1.3.2) is reducible. Hint: Use Exer-
cise 11.4.14.
11.4.16. Show that every FORTRAN program in which every transfer to a
previous statement of the program is caused by a DO loop has a
reducible flow graph.
**11.4.17. Show that one can determine in time O(n log n) whether a program
flow graph is reducible. Hint: Use Exercise 11.4.12.
*11.4.18. What is the relation between the concepts of intervals and single-entry
regions?
*11.4.19. Give an interval-based algorithm that determines for each expression
(say A + B) and each block ℬ whether every execution of the program
must reach a statement which computes A + B (i.e., there is a state-
ment of the form C ← A + B) and which does not subsequently redefine
A or B. Hint: If ℬ is not the begin block, let IN(ℬ) = ∩ OUT(ℬi),
where the ℬi's are all the direct predecessors of ℬ. Let OUT(ℬ) be
(IN(ℬ) ∩ X) ∪ Y
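The definitions of X and Y fall outside this excerpt; in the hedged sketch below they are read as the set of expressions transmitted through a block and the set the block makes available on exit, which is an assumed interpretation. The sketch evaluates the same equations by straightforward iteration, for comparison only; it is not the interval-based algorithm the exercise asks for, and its graph representation is illustrative.

    # Hedged sketch: iterative solution of
    #   IN(B)  = intersection of OUT(B') over the direct predecessors B'
    #   OUT(B) = (IN(B) & X[B]) | Y[B]
    def must_be_available(blocks, preds, begin, X, Y, universe):
        IN = {b: set() if b == begin else set(universe) for b in blocks}
        OUT = {b: (IN[b] & X[b]) | Y[b] for b in blocks}
        changed = True
        while changed:
            changed = False
            for b in blocks:
                new_in = IN[b]
                if b != begin and preds[b]:
                    new_in = set(universe)
                    for p in preds[b]:
                        new_in &= OUT[p]          # meet over predecessors
                new_out = (new_in & X[b]) | Y[b]
                if (new_in, new_out) != (IN[b], OUT[b]):
                    IN[b], OUT[b], changed = new_in, new_out, True
        return IN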
**11.4.26. Give algorithms requiring O(n log n) bit vector steps to compute the IN
functions of Algorithm 11.7 or Exercise 11.4.19 for flow graphs of n
nodes.
**11.4.27. Show that a flow graph is reducible if and only if its edge set can be
partitioned into two sets E1 and E2, where (1) E1 forms a dag, and
(2) if (m, n) is in E2, then m = n, or n dominates m. (A sketch based on
this characterization follows Exercise 11.4.28.)
**11.4.28. Give an O(n log n) algorithm to compute direct dominators for an
n-node reducible graph.
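As a point of reference for Exercises 11.4.27 and 11.4.28, the sketch below tests reducibility directly from the characterization in 11.4.27. It uses the simple iterative dominator computation rather than any O(n log n) method, assumes every node is reachable from the begin node, and its successor-list graph representation is an assumption of the illustration.

    def dominators(nodes, succs, begin):
        # Simple iterative dominator sets; not the fast method the exercises ask for.
        preds = {n: [] for n in nodes}
        for m in nodes:
            for n in succs[m]:
                preds[n].append(m)
        dom = {n: set(nodes) for n in nodes}
        dom[begin] = {begin}
        changed = True
        while changed:
            changed = False
            for n in nodes:
                if n == begin:
                    continue
                new = set(nodes)
                for p in preds[n]:
                    new &= dom[p]
                new |= {n}
                if new != dom[n]:
                    dom[n], changed = new, True
        return dom

    def is_reducible(nodes, succs, begin):
        # Edge (m, n) belongs to E2 when m == n or n dominates m; the graph is
        # reducible exactly when the remaining edges E1 form a dag.
        dom = dominators(nodes, succs, begin)
        e1 = {m: [n for n in succs[m] if m != n and n not in dom[m]] for m in nodes}
        WHITE, GREY, BLACK = 0, 1, 2
        color = {n: WHITE for n in nodes}
        def has_cycle(u):
            color[u] = GREY
            for v in e1[u]:
                if color[v] == GREY or (color[v] == WHITE and has_cycle(v)):
                    return True                  # cycle found in E1
            color[u] = BLACK
            return False
        return not any(color[n] == WHITE and has_cycle(n) for n in nodes)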
Research Problems
11.4.29. Suggest some additional data flow information (other than that men-
tioned in Algorithm 11.7 and Exercises 11.4.19 and 11.4.20) which
would be useful for code optimization purposes. Give algorithms to
compute these, both for reducible and for irreducible flow graphs.
11.4.30. Are there techniques to compute the IN function of Algorithm 11.8 or
other data flow functions that are superior to node splitting for irre-
ducible graphs? By "superior," we are assuming that bit vector
operations are permissible, or else the algorithms of Exercises 11.4.24
and 11.4.25 are clearly optimal.
BIBLIOGRAPHIC NOTES
BIBLIOGRAPHY FOR VOLUMES I AND II

AHO, A. V. [1968]
Indexed grammars--an extension of context-free grammars.
J. ACM 15:4, 647-671.
AHO, A. V. (ed.) [1973]
Currents in the Theory of Computing.
Prentice-Hall, Englewood Cliffs, N.J.
AHO, A. V., P. J. DENNING, and J. D. ULLMAN [1972]
Weak and mixed strategy precedence parsing.
J. ACM 19:2, 225-243.
AHO, A. V., J. E. HOPCROFT, and J. D. ULLMAN [1968]
Time and tape complexity of pushdown automaton languages.
Information and Control 13:3, 186-206.
AHO, A. V., J. E. HOPCROFT, and J. D. ULLMAN [1972]
On finding lowest common ancestors in trees.
Proc. Fifth Annual ACM Symposium on Theory of Computing (May, 1973), 253-265.
AHO, A. V., S. C. JOHNSON, and J. D. ULLMAN [1972]
Deterministic parsing of ambiguous grammars.
Unpublished manuscript, Bell Laboratories, Murray Hill, N.J.
AHO, A. V., R. SETHI, and J. D. ULLMAN [1972]
Code optimization and finite Church-Rosser systems.
In Design and Optimization of Compilers (R. Rustin, ed.).
Prentice-Hall, Englewood Cliffs, N.J., pp. 89-106.
AHO, A. V., and J. D. ULLMAN [1969a]
Syntax directed translations and the pushdown assembler.
J. Computer and System Sciences 3:1, 37-56.
BELADY, L. A. [1966]
A study of replacement algorithms for a virtual storage computer.
IBM Systems J. 5, 78-82.
BELL, J. R. [1969]
A new method for determining linear precedence functions for precedence grammars.
Comm. ACM 12:10, 316-333.
BELL, J. R. [1970]
The quadratic quotient method: a hash code eliminating secondary clustering.
Comm. ACM 13:2, 107-109.
BERGE, C. [1958]
The Theory of Graphs and Its Applications.
Wiley, New York.
BIRMAN, A., and J. D. ULLMAN [1970]
Parsing algorithms with backtrack.
IEEE Conference Record of 11th Annual Symposium on Switching and Automata Theory, pp. 153-174.
BLATTNER, M. [1972]
The unsolvability of the equality problem for sentential forms of context-free languages.
Unpublished Memorandum, UCLA, Los Angeles, Calif. To appear in JCSS.
BOBROW, D. G. [1963]
Syntactic analysis of English by computer--a survey.
Proc. AFIPS Fall Joint Computer Conference, Vol. 24.
Spartan, New York, pp. 365-387.
BOOK, R. V. [1970]
Problems in formal language theory.
Proc. Fourth Annual Princeton Conference on Information Sciences and Systems, pp. 253-256. Also see Aho [1973].
BOOTH, T. L. [1967]
Sequential Machines and Automata Theory.
Wiley, New York.
BORODIN, A. [1970]
Computational complexity--a survey.
Proc. Fourth Annual Princeton Conference on Information Sciences and Systems, pp. 257-262. Also see Aho [1973].
BRACHA, N. [1972]
Transformations on loop-free program schemata.
Report No. UIUCDCS-R-72-516, Department of Computer Science, University of Illinois, Urbana.
BRAFFORT, P., and D. HIRSCHBERG (eds.) [1963]
Computer Programming and Formal Systems.
North-Holland, Amsterdam.
BREUER, M. A. [1969]
Generation of optimal code for expressions via factorization.
Comm. ACM 12:6, 333-340.
BROOKER, R. A., and D. MORRIS [1963]
The compiler-compiler.
Annual Review in Automatic Programming, Vol. 3.
Pergamon, Elmsford, N.Y., pp. 229-275.
BRUNO, J. L., and W. A. BURKHARD [1970]
A circularity test for interpreted grammars.
Technical Report 88. Computer Sciences Laboratory, Department of Electrical Engineering, Princeton University, Princeton, N.J.
BRZOZOWSKI, J. A. [1962]
A survey of regular expressions and their applications.
IRE Trans. on Electronic Computers 11:3, 324-335.
BRZOZOWSKI, J. A. [1964]
Derivatives of regular expressions.
J. ACM 11:4, 481-494.
BUSAM, V. A., and D. E. ENGLUND [1969]
Optimization of expressions in Fortran.
Comm. ACM 12:12, 666-674.
CANTOR, D. G. [1962]
On the ambiguity problem of Backus systems.
J. ACM 9:4, 477-479.
CAVINESS, B. F. [1970]
On canonical forms and simplification.
J. ACM 17:2, 385-396.
CHEATHAM, T. E. [1965]
The TGS-II translator-generator system.
Proc. IFIP Congress 65. Spartan, New York, pp. 592-593.
CHEATHAM, T. E. [1966]
The introduction of definitional facilities into higher level programming languages.
Proc. AFIPS Fall Joint Computer Conference, Vol. 30.
Spartan, New York, pp. 623-637.
CHEATHAM, T. E. [1967]
The Theory and Construction of Compilers (2nd ed.).
Computer Associates, Inc., Wakefield, Mass.
CHEATHAM, T. E., and K. SATTLEY [1964]
Syntax directed compiling.
Proc. AFIPS Spring Joint Computer Conference, Vol. 25. Spartan, New York, pp. 31-57.
CHURCH, A. [1956]
Introduction to Mathematical Logic.
Princeton University Press, Princeton, N.J.
CLARK, E. R. [1967]
On the automatic simplification of source language programs.
Comm. ACM 10:3, 160-164.
COCKE, J. [1970]
Global common subexpression elimination.
ACM SIGPLAN Notices 5:7, 20-24.
COCKE, J., and J. T. SCHWARTZ [1970]
Programming Languages and Their Compilers (2nd ed.).
Courant Institute of Mathematical Sciences, New York University, New York.
COHEN, D. J., and C. C. GOTLIEB [1970]
A list structure form of grammars for syntactic analysis.
Computing Surveys 2:1, 65-82.
COHEN, R. S., and K. CULIK, II [1971]
LR-regular grammars--an extension of LR(k) grammars.
IEEE Conference Record of 12th Annual Symposium on Switching and Automata Theory, pp. 153-165.
COLMERAUER, A. [1970]
Total precedence relations.
J. ACM 17:1, 14-30.
CONWAY, M. E. [1963]
Design of a separable transition-diagram compiler.
Comm. ACM 6:7, 396-408.
CONWAY, R. W., and W. L. MAXWELL [1963]
CORC: the Cornell computing language.
Comm. ACM 6:6, 317-321.
CONWAY, R. W., and W. L. MAXWELL [1968]
CUPL--an approach to introductory computing instruction.
Technical Report No. 68-4. Department of Computer Science, Cornell University, Ithaca, N.Y.
CONWAY, R. W., et al. [1970]
PL/C. A high performance subset of PL/I.
Technical Report 70-55. Department of Computer Science, Cornell University, Ithaca, N.Y.
COOK, S. A. [1971]
Linear time simulation of deterministic two-way pushdown automata.
Proc. IFIP Congress 71, TA-2. North-Holland, Amsterdam, pp. 174-179.
COOK, S. A., and S. D. AANDERAA [1969]
On the minimum computation time of functions.
Trans. American Math. Soc. 142, 291-314.
FLOYD, R. W. [1962b]
On ambiguity in phrase structure languages.
Comm. ACM 5:10, 526-534.
FLOYD, R. W. [1963]
Syntactic analysis and operator precedence.
J. ACM 10:3, 316-333.
FLOYD, R. W. [1964a]
Bounded context syntactic analysis.
Comm. ACM 7:2, 62-67.
FLOYD, R. W. [1964b]
The syntax of programming languages--a survey.
IEEE Trans. on Electronic Computers EC-13:4, 346-353.
FLOYD, R. W. [1967a]
Assigning meanings to programs.
In Schwartz [1967], pp. 19-32.
FLOYD, R. W. [1967b]
Nondeterministic algorithms.
J. ACM 14:4, 636-644.
FRAILEY, D. J. [1970]
Expression optimization using unary complement operators.
ACM SIGPLAN Notices 5:7, 67-85.
FREEMAN, D. N. [1964]
Error correction in CORC, the Cornell computing language.
Proc. AFIPS Fall Joint Computer Conference, Vol. 26.
Spartan, New York, pp. 15-34.
GALLER, B. A., and A. J. PERLIS [1967]
A proposal for definitions in ALGOL.
Comm. ACM 10:4, 204-219.
GARWICK, J. V. [1964]
GARGOYLE, a language for compiler writing.
Comm. ACM 7:1, 16-20.
GARWICK, J. V. [1968]
GPL, a truly general purpose language.
Comm. ACM 11:9, 634-638.
GEAR, C. W. [1965]
High speed compilation of efficient object code.
Comm. ACM 8:8, 483-487.
GENTLEMAN, W. M. [1971]
A portable coroutine system.
Proc. IFIP Congress 71, TA-3. North-Holland, Amsterdam, pp. 94-98.
GILL, A. [1962]
Introduction to the Theory of Finite State Machines.
McGraw-Hill, New York.
GINSBURG, S. [1962]
An Introduction to Mathematical Machine Theory.
Addison-Wesley, Reading, Mass.
GINSBURG, S. [1966]
The Mathematical Theory of Context-Free Languages.
McGraw-Hill, New York.
GINSBURG, S., and S. A. GREIBACH [1966]
Deterministic context-free languages.
Information and Control 9:6, 620-648.
GINSBURG, S., and S. A. GREIBACH [1969]
Abstract families of languages.
Memoir American Math. Soc. No. 87, 1-32.
GINSBURG, S., and H. G. RICE [1962]
Two families of languages related to ALGOL.
J. ACM 9:3, 350-371.
GINZBURG, A. [1968]
Algebraic Theory of Automata.
Academic Press, New York.
GLENNIE, A. [1960]
On the syntax machine and the construction of a universal compiler.
Technical Report No. 2. Computation Center, Carnegie-Mellon University, Pittsburgh, Pa.
GRAHAM, R. L. [1972]
Bounds on multiprocessing anomalies and related packing algorithms.
Proc. AFIPS Spring Joint Computer Conference, Vol. 40.
AFIPS Press, Montvale, N.J., pp. 205-217.
GRAHAM, R. M. [1964]
Bounded context translation.
Proc. AFIPS Spring Joint Computer Conference, Vol. 25.
Spartan, New York, pp. 17-29.
GRAHAM, S. L. [1970]
Extended precedence languages, bounded right context languages and deterministic languages.
IEEE Conference Record of 11th Annual Symposium on Switching and Automata Theory, pp. 175-180.
GRAU, A. A., U. HILL, and H. LANGMAACK [1967]
Translation of ALGOL 60.
Springer-Verlag, New York.
GRAY, J. N. [1969]
Precedence parsers for programming languages.
Ph.D. Thesis, Department of Computer Science, University of California, Berkeley.
HARRISON, M. A. [1965]
Introduction to Switching and Automata Theory.
McGraw-Hill, New York.
HARTMANIS, J., and J. E. HOPCROFT [1970]
An overview of the theory of computational complexity.
J. ACM 18:3, 444-475.
HARTMANIS, J., P. M. LEWIS, II, and R. E. STEARNS [1965]
Classifications of computations by time and memory requirements.
Proc. IFIP Congress 65. Spartan, New York, pp. 31-35.
HAYNES, H. R., and L. J. SCHUTTE [1970]
Compilation of optimized syntactic recognizers from Floyd-Evans productions.
ACM SIGPLAN Notices 5:7, 38-51.
HAYS, D. G. [1967]
Introduction to Computational Linguistics.
American Elsevier, New York.
HECHT, M. S., and J. D. ULLMAN [1972a]
Flow graph reducibility.
SIAM J. on Computing 1:2, 188-202.
HECHT, M. S., and J. D. ULLMAN [1972b]
Unpublished memorandum, Department of Electrical Engineering, Princeton University.
HELLERMAN, H. [1966]
Parallel processing of algebraic expressions.
IEEE Trans. on Electronic Computers EC-15:1, 82-91.
HEXT, J. B., and P. S. ROBERTS [1970]
Syntax analysis by Domolki's algorithm.
Computer J. 13:3, 263-271.
HOPCROFT, J. E. [1971]
An n log n algorithm for minimizing states in a finite automaton.
CS71-190. Computer Science Department, Stanford University, Stanford, Cal.
Also in Theory of Machines and Computations, Z. Kohavi and A. Paz (eds.).
Academic Press, New York, pp. 189-196.
HOPCROFT, J. E., and J. D. ULLMAN [1967]
An approach to a unified theory of automata.
Bell System Tech. J. 46:8, 1763-1829.
HOPCROFT, J. E., and J. D. ULLMAN [1969]
Formal Languages and Their Relation to Automata.
Addison-Wesley, Reading, Mass.
HOPCROFT, J. E., and J. D. ULLMAN [1972a]
Set merging algorithms.
Unpublished memorandum. Department of Computer Science, Cornell University, Ithaca, N.Y.
KNUTH, D. E. [1965]
On the translation of languages from left to right.
Information and Control 8:6, 607-639.
KNUTH, D. E. [1967]
Top-down syntax analysis.
Lecture Notes. International Summer School on Computer Programming, Copenhagen.
Also in Acta Informatica 1:2 (1971), 79-110.
KNUTH, D. E. [1968a]
The Art of Computer Programming, Vol. 1: Fundamental Algorithms.
Addison-Wesley, Reading, Mass.
KNUTH, D. E. [1968b]
Semantics of context-free languages.
Math. Systems Theory 2:2, 127-146.
Also see Math. Systems Theory 5:1, 95-95.
KNUTH, D. E. [1971]
An empirical study of FORTRAN programs.
Software--Practice and Experience 1:2, 105-134.
KNUTH, D. E. [1973]
The Art of Computer Programming, Vol. 3: Sorting and Searching.
Addison-Wesley, Reading, Mass.
KORENJAK, A. J. [1969]
A practical method for constructing LR(k) processors.
Comm. ACM 12:11, 613-623.
KORENJAK, A. J., and J. E. HOPCROFT [1966]
Simple deterministic languages.
IEEE Conference Record of 7th Annual Symposium on Switching and Automata Theory, pp. 36-46.
KOSARAJU, S. R. [1970]
Finite state automata with markers.
Proc. Fourth Annual Princeton Conference on Information Sciences and Systems, p. 380.
KUNO, S., and A. G. OETTINGER [1962]
Multiple-path syntactic analyzer.
Information Processing 62 (IFIP Congress), Popplewell (ed.).
North-Holland, Amsterdam, pp. 306-311.
KURKI-SUONIO, R. [1969]
Notes on top-down languages.
BIT 9, 225-238.
LAFRANCE, J. [1970]
Optimization of error recovery in syntax directed parsing algorithms.
ACM SIGPLAN Notices 5:12, 2-17.
MARILL, M. [1962]
Computational chains and the size of computer programs.
IRE Trans. on Electronic Computers EC-11:2, 173-180.
MARTIN, D. F. [1972]
A Boolean matrix method for the computation of linear precedence functions.
Comm. ACM 15:6, 448-454.
MAURER, W. D. [1968]
An improved hash code for scatter storage.
Comm. ACM 11:1, 35-38.
MCCARTHY, J. [1963]
A basis for the mathematical theory of computation.
In Braffort and Hirschberg [1963], pp. 33-71.
MCCARTHY, J., and J. A. PAINTER [1967]
Correctness of a compiler for arithmetic expressions.
In Schwartz [1967], pp. 33-41.
MCCLURE, R. M. [1965]
TMG--a syntax directed compiler.
Proc. ACM National Conference, Vol. 20, pp. 262-274.
MCCULLOCH, W. S., and W. PITTS [1943]
A logical calculus of the ideas immanent in nervous activity.
Bulletin of Math. Biophysics 5, 115-133.
MCILROY, M. D. [1960]
Macro instruction extensions of compiler languages.
Comm. ACM 3:4, 214-220.
MCILROY, M. D. [1968]
Coroutines.
Unpublished manuscript, Bell Laboratories, Murray Hill, N.J.
MCILROY, M. D. [1972]
A manual for the TMG compiler writing language.
Unpublished memorandum, Bell Laboratories, Murray Hill, N.J.
MCKEEMAN, W. M. [1965]
Peephole optimization.
Comm. ACM 8:7, 443-444.
MCKEEMAN, W. M. [1966]
An approach to computer language design.
CS48. Computer Science Department, Stanford University, Stanford, Cal.
MCKEEMAN, W. M., J. J. HORNING, and D. B. WORTMAN [1970]
A Compiler Generator.
Prentice-Hall, Englewood Cliffs, N.J.
MCNAUGHTON, R. [1967]
Parenthesis grammars.
J. ACM 14:3, 490-500.
RADKE, C. E. [1970]
The use of quadratic residue search.
Comm. ACM 13:2, 103-109.
RANDELL, B., and L. J. RUSSELL [1964]
ALGOL 60 Implementation.
Academic Press, New York.
REDZIEJOWSKI, R. R. [1969]
On arithmetic expressions and trees.
Comm. ACM 12:2, 81-84.
REYNOLDS, J. C. [1965]
An introduction to the COGENT programming system.
Proc. ACM National Conference, Vol. 20, p. 422.
REYNOLDS, J. C., and R. HASKELL [1970]
Grammatical coverings.
Unpublished memorandum, Syracuse University.
RICHARDSON, D. [1968]
Some unsolvable problems involving elementary functions of a real variable.
J. Symbolic Logic 33, 514-520.
ROGERS, H., JR. [1967]
Theory of Recursive Functions and Effective Computability.
McGraw-Hill, New York.
ROSEN, S. (ed.) [1967a]
Programming Systems and Languages.
McGraw-Hill, New York.
ROSEN, S. [1967b]
A compiler-building system developed by Brooker and Morris.
In Rosen [1967a], pp. 306-331.
ROSENKRANTZ, D. J. [1967]
Matrix equations and normal forms for context-free grammars.
J. ACM 14:3, 501-507.
ROSENKRANTZ, D. J. [1968]
Programmed grammars and classes of formal languages.
J. ACM 16:1, 107-131.
ROSENKRANTZ, D. J., and P. M. LEWIS, II [1970]
Deterministic left corner parsing.
IEEE Conference Record of 11th Annual Symposium on Switching and Automata Theory, pp. 139-152.
ROSENKRANTZ, D. J., and R. E. STEARNS [1970]
Properties of deterministic top-down grammars.
Information and Control 17:3, 226-256.
SALOMAA, A. [1966]
Two complete axiom systems for the algebra of regular events.
J. ACM 13:1, 158-169.
SALOMAA, A. [1969a]
Theory of Automata.
Pergamon, Elmsford, N.Y.
SALOMAA, A. [1969b]
On the index of a context-free grammar and language.
Information and Control 14:5, 474-477.
SAMELSON, K., and F. L. BAUER [1960]
Sequential formula translation.
Comm. ACM 3:2, 76-83.
SAMMET, J. E. [1969]
Programming Languages: History and Fundamentals.
Prentice-Hall, Englewood Cliffs, N.J.
SCHAEFER, M. [1973]
A Mathematical Theory of Global Program Optimization.
Prentice-Hall, Englewood Cliffs, N.J., to appear.
SCHORRE, D. V. [1964]
META II, a syntax oriented compiler writing language.
Proc. ACM National Conference, Vol. 19, pp. D1.3-1-D1.3-11.
SCHUTZENBERGER, M. P. [1963]
On context-free languages and pushdown automata.
Information and Control 6:3, 246-264.
SCHWARTZ, J. T. (ed.) [1967]
Mathematical Aspects of Computer Science.
Proc. Symposia in Applied Mathematics, Vol. 19.
American Mathematical Society, Providence.
SCOTT, D., and C. STRACHEY [1971]
Towards a mathematical semantics for computer languages.
Proc. Symposium on Computers and Automata, Microwave Research Institute Symposia Series, Vol. 21.
Polytechnic Institute of Brooklyn, New York, pp. 19-46.
SETHI, R. [1973]
Validating register allocations for straight line programs.
Ph.D. Thesis, Department of Electrical Engineering, Princeton University.
SETHI, R., and J. D. ULLMAN [1970]
The generation of optimal code for arithmetic expressions.
J. ACM 17:4, 715-728.
SHANNON, C. E., and J. MCCARTHY (eds.) [1956]
Automata Studies.
Princeton University Press, Princeton, N.J.
SHAW, A. C. [1970]
Parsing of graph-representable pictures.
J. ACM 17:3, 453-481.
SHEPHERDSON, J. C. [1959]
The reduction of two-way automata to one-way automata.
IBM J. Research 3, 198-200. Reprinted in Moore [1964], pp. 92-97.
STEARNS, R. E. [1967]
A regularity test for pushdown machines.
Information and Control 11:3, 323-340.
STEARNS, R. E. [1971]
Deterministic top-down parsing.
Proc. Fifth Annual Princeton Conference on Information Sciences and Systems, pp. 182-188.
STEARNS, R. E., and P. M. LEWIS, II [1969]
Property grammars and table machines.
Information and Control 14:6, 524-549.
STEARNS, R. E., and D. J. ROSENKRANTZ [1969]
Table machine simulation.
IEEE Conference Record of 10th Annual Symposium on Switching and Automata Theory, pp. 118-128.
STEEL, T. B. (ed.) [1966]
Formal Language Description Languages for Computer Programming.
North-Holland, Amsterdam.
STONE, H. S. [1967]
One-pass compilation of arithmetic expressions for a parallel processor.
Comm. ACM 10:4, 220-223.
STRASSEN, V. [1969]
Gaussian elimination is not optimal.
Numerische Mathematik 13, 354-356.
SUPPES, P. [1960]
Axiomatic Set Theory.
Van Nostrand Reinhold, New York.
TARJAN, R. [1972]
Depth first search and linear graph algorithms.
SIAM J. on Computing 1:2, 146-160.
THOMPSON, K. [1968]
Regular expression search algorithm.
Comm. ACM 11:6, 419-422.
TURING, A. M. [1936]
On computable numbers, with an application to the Entscheidungsproblem.
Proc. London Mathematical Soc. Ser. 2, 42, 230-265. Corrections, Ibid., 43 (1937), 544-546.
ULLMAN, J. D. [1972a]
A note on hashing functions.
J. ACM 19:3, 569-575.
ULLMAN, J. D. [1972b]
Fast Algorithms for the Elimination of Common Subexpressions.
Technical Report TR-106, Dept. of Electrical Engineering, Princeton University, Princeton, N.J.
UNGER, S. H. [1968]
A global parser for context-free phrase structure grammars.
Comm. ACM 11:4, 240-246, and 11:6, 427.
VAN WIJNGAARDEN, A. (ed.) [1969]
Report on the algorithmic language ALGOL 68.
Numerische Mathematik 14, 79-218.
WALTERS, D. A. [1970]
Deterministic context-sensitive languages.
Information and Control 17:1, 14-61.
WARSHALL, S. [1962]
A theorem on Boolean matrices.
J. ACM 9:1, 11-12.
WARSHALL, S., and R. M. SHAPIRO [1964]
A general purpose table driven compiler.
Proc. AFIPS Spring Joint Computer Conference, Vol. 25.
Spartan, New York, pp. 59-65.
WEGBREIT, B. [1970]
Studies in extensible programming languages.
Ph.D. Thesis, Harvard University, Cambridge, Mass.
WILCOX, T. R. [1971]
Generating machine code for high-level programming languages.
Technical Report 71-103. Department of Computer Science, Cornell University, Ithaca, N.Y.
WINOGRAD, S. [1965]
On the time required to perform addition.
J. ACM 12:2, 277-285.
WINOGRAD, S. [1967]
On the time required to perform multiplication.
J. ACM 14:4, 793-802.
WIRTH, N. [1965]
Algorithm 265: Find precedence functions.
Comm. ACM 8:10, 604-605.
WIRTH, N. [1968]
PL360--a programming language for the 360 computers.
J. ACM 15:1, 37-74.
WIRTH, N., and H. WEBER [1966]
EULER--a generalization of ALGOL and its formal definition, Parts 1 and 2.
Comm. ACM 9:1-2, 13-23, and 89-99.
WISE, D. S. [1971]
Domolki's algorithm applied to generalized overlap resolvable grammars.
Proc. Third Annual ACM Symposium on Theory of Computing, pp. 171-184.
WOOD, D. [1969a]
The theory of left factored languages.
Computer J. 12:4, 349-356, and 13:1, 55-62.
WOOD, D. [1969b]
A note on top-down deterministic languages.
BIT 9:4, 387-399.
WOOD, D. [1970]
Bibliography 23: Formal language theory and automata theory.
Computing Reviews 11:7, 417-430.
WOZENCRAFT, J. M., and A. EVANS, JR. [1969]
Notes on Programming Languages.
Department of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, Mass.
YERSHOV, A. P. [1966]
ALPHA--an automatic programming system of high efficiency.
J. ACM 13:1, 17-24.
YOUNGER, D. H. [1967]
Recognition and parsing of context-free languages in time n³.
Information and Control 10:2, 189-208.
INDEX TO LEMMAS, THEOREMS,
AND ALGORITHMS
INDEX TO VOLUMES I AND II
Nonterminal, 85, 100, 218, 458
Normal form deterministic pushdown automaton, 690-695
Northcote, R. S., 578
Nullable symbol, 674-680

O

Object code, 59, 720
Oettinger, A., 192, 313
Ogden, W., 211
Ogden's lemma, 192-196
(1, 1)-bounded-right-context grammar, 429-430, 448, 690, 701
(1, 0)-bounded-right-context grammar, 690, 699-701, 708
One-turn pushdown automaton, 207-208
One-way recognizer, 94
Open block, 856-858
Open portion (of a sentential form), 334, 369
Operator grammar/language, 165, 438
Operator precedence grammar/language, 439-443, 448-450, 452, 550-551, 711-718
Order (of a syntax-directed translation), 243-251
Ordered dag, 42 (see also Dag)
Ordered graph, 41-42
Ordered tree, 42-44 (see also Tree)
Ore, O., 52
OUT, 944-958
Out degree, 39
Output symbol, 218, 224
Output variable, 844
ω-inaccessible set of LR(k) tables, 588-597, 601, 613, 616

P

Pager, D., 621
Painter, J. A., 77
Pair, C., 426
PAL, 512-517
Parallel processing, 905
Parenthesis grammar, 690
Parikh, R. J., 211
Parikh's theorem, 209-211
Park, D. M. R., 909, 937
Parse lists, 321
Parse table, 316, 339, 345-346, 348, 351-356, 364-365, 374
Parse tree, 139-143, 179-180, 220-222, 273, 379, 464-466 (see also Syntax tree)
Parsing, 56, 59, 63-65, 72-74, 263-280, 722, 781 (see also Bottom-up parsing, Shift-reduce parsing, Top-down parsing)
Parsing action function (see Action function)
Parsing automaton, 645-665
Parsing machine, 477-482, 484, 747-748, 750-753
Partial acceptance failure (of a TDPL or GTDPL program), 484
Partial correspondence problem, 36
Partial function, 10
Partial left parse, 293-296
Partial order, 9-10, 13-15, 43-45, 865
Partial recursive function (see Recursive function)
Partial right parse, 306
Pass (of a compiler), 723-724, 782
Paterson, M. S., 909, 937
Path, 39, 51
Pattern recognition, 79-82
Paul, M., 455
Paull, M. C., 166, 690
Pavlidis, T., 82
PDA (see Pushdown automaton)
PDT (see Pushdown transducer)
Perfect induction, 20
Perles, M., 211
Perlis, A. J., 58
Peterson, W. W., 811
Petrick, S. R., 314
Pfaltz, J. L., 82
Phase (of compilation), 721, 781-782
Phrase, 486
Pig Latin, 745-747
Pitts, E., 103
PL/I, 501
PL360, 507-511
Poage, J. F., 505
Polish notation (see Prefix expression, Postfix expression)
Polonsky, I. P., 505
Pop state, 650, 658
Porter, J. H., 263
Position (in a string), 193
UI (see Unique invertibility)
Ullman, J. D., 36, 102-103, 192, 211,
Walk, K., 58
Walters, D. A., 399
Warshall, S., 52, 77
Warshall's algorithm, 48-49
WATFOR, 721
PRENTICE-HALL, Inc.
Englewood Cliffs, New Jersey
Printed in the United States of America