An Algorithm For Optimal Lambda Calculus Reduction
John Lamping
Xerox PARC
We present an algorithm for lambda expression reduction that avoids any copying that could later cause duplication of work. It is optimal in the sense defined by Lévy. The basis of the algorithm is a graphical representation of the kinds of commonality that can arise from substitutions; the idea can be adapted to represent other kinds of expressions besides lambda expressions. The algorithm is also well suited to parallel implementations, consisting of a fixed set of local graph rewrite rules.

Overview

The lambda calculus[1] defines beta reduction (reducing the application of a lambda term to an argument) in terms of substitution; each occurrence of the variable bound by the lambda is replaced by a copy of the argument. This can create extra work, since any work needed to simplify the argument must be repeated on each copy.

Lévy[4, 5] has shown that there are lambda expressions for which any order of reduction duplicates work. One example is

((λg.(g (g (λx.x))))
 (λh.((λf.(f (f (λz.z))))
      (λw.(h (w (λy.y)))))))

which has two redexes, an outer one ((λg . . .) . . .), and an inner one ((λf . . .) . . .). If the outer redex is reduced first the inner redex will be duplicated, and each copy will have to be reduced. On the other hand, if the inner redex is reduced first, its argument, (λw.(h (w (λy.y)))), will be duplicated, which will cause redundant work later once a value for h is determined. Four copies of the application (h (w (λy.y))) will ultimately need to be simplified if the inner redex is reduced first, compared to only two copies if the outer redex is reduced first, one for each different value of h.

In general, a tension can occur when a function that is used several times, (λh . . .) in the example above, contains a subfunction that it uses several times, (λw.(h (w (λy.y)))) in the example above. The tension occurs, as in this case, when the subfunction can be simplified given either the value of the argument to the function or the value of the argument to the subfunction. It is possible to simplify each use of the subfunction (to fold it inline) before any of the uses of the function are simplified, so that the work of simplifying each use of the subfunction can be shared among all uses of the function. Alternatively, it is possible to simplify the subfunction (to partially evaluate it) each time the main function is used, so that the work of simplifying the subfunction can be shared among all uses of the subfunction. But whichever simplification is done first ends up duplicating the work of the other.

The technique of graph reduction[9] avoids some copying by treating a lambda expression as a tree, which is represented by an acyclic directed graph. Lambda calculus reductions are then modeled by graph operations. Since the lambda expression is represented by a graph, identical subtrees (identical subexpressions) can be represented with a single piece of graph. In particular, reducing a redex doesn't require copying the argument; each use of the argument simply gets a link to the original. This can save work later because one simplification step in a shared section of the graph corresponds to what would be multiple simplifications in the represented lambda expression.
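For concreteness, here is a minimal Python sketch, not taken from the paper, of beta reduction by substitution; the tuple encoding and the names subst and beta are ours. Every occurrence of the bound variable gets its own copy of the argument, so any work needed to simplify the argument is repeated once per copy; a graph representation instead lets all uses point at one copy.

# Terms: ('var', name) | ('lam', name, body) | ('app', fun, arg).
# Bound names are assumed distinct, so no capture handling is shown.

def subst(term, name, arg):
    """Replace free occurrences of `name` in `term` by a fresh copy of `arg`."""
    kind = term[0]
    if kind == 'var':
        return arg if term[1] == name else term
    if kind == 'lam':
        return term if term[1] == name else ('lam', term[1], subst(term[2], name, arg))
    return ('app', subst(term[1], name, arg), subst(term[2], name, arg))

def beta(redex):
    """Reduce ((lam x. body) arg) by substitution, copying arg for each x."""
    _, (_, x, body), arg = redex
    return subst(body, x, arg)

# ((lam f. (f (f z))) (g w)): the argument (g w) is copied twice, so any
# work spent simplifying it would be done twice over.
redex = ('app',
         ('lam', 'f', ('app', ('var', 'f'), ('app', ('var', 'f'), ('var', 'z')))),
         ('app', ('var', 'g'), ('var', 'w')))
print(beta(redex))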
… definitions and lemmas in the proofs of correctness and optimality.

One discouraging note: while both an informal argument and experience with an implementation of the algorithm indicate that the amount of bookkeeping work the algorithm requires for each beta reduction step is proportional to the cost of doing a substitution, we haven't proven this. The difficulty appears to be in correctly formulating the bound.

Simplified Execution Example

Figure 1 illustrates the step by step execution of a simplified version of the algorithm. This section explains what is happening in the figure.

Like graph reduction, the algorithm treats a lambda expression as a tree, which it represents with a rooted graph. Graph A in figure 1 shows the graph the algorithm uses to represent the lambda expression

((λg.(g (g (λx.x))))
 (λh.((λf.(f (f (λz.z))))
      (h (λy.y)))))

The nodes in the graph fall into two categories. The ordinary nodes represent parts of a lambda expression. These are the application nodes, written @; lambda nodes, written λx; and variable nodes, written x. The control nodes, on the other hand, don't represent parts of a lambda expression, but instead control how the graph represents a tree. Here, these are the fan nodes, written ∨. This graph resembles the one that a standard graph reduction algorithm would use to represent the same lambda expression. The obvious difference is the two fan nodes, which explicitly show where sharing is occurring.

Throughout figure 1, the thickly drawn link in each graph indicates where the next rule execution occurs. In graph A, this is the topmost application, where the rule which simulates beta reductions (rule I.a in figure 3) will be executed, resulting in graph B. The rule indicates that a subgraph that matches the left hand side of the arrow should be replaced by an instance of the right hand side of the arrow. Node variables, notated as circles, match any node. On the right hand side, they show how the resulting subgraph should be connected with the rest of the graph. This particular rule eliminates nodes and changes connections; other rules will create nodes as well. For a formal semantics of a similar graph rewrite rule system, see Barendregt et al[2].

The rule presumes that all the variable occurrences bound by a given lambda are represented by a single variable node, one variable node per lambda node. This property means that the beta reduction rule doesn't have to do any copying; it can just connect the argument to where the variable was, as illustrated in the figure. The property is first established when a lambda expression is translated into a graph by establishing one variable node for each lambda node, using fan nodes to consolidate the different variable instances bound by a lambda into one variable node.

There is a technical problem in making sure that the variable node matched by the rule is the mate of the lambda node matched by the rule; we resolve this by assuming that there is a link in the graph between the lambda node and its corresponding variable node, indicated in the picture by their common variable name rather than by a line. This also means that the left hand side of the rule is, in fact, connected.

Another execution of rule I.a, this time on the application of the (λf . . .), yields graph C, which represents the lambda expression

((λh.((h (λy.y)) ((h (λy.y)) (λz.z))))
 ((λh.((h (λy.y)) ((h (λy.y)) (λz.z))))
  (λx.x)))

where the subexpression (h (λy.y)) occurs four times, although it occurs only once in the graph.

So far, the algorithm has done nothing different from what graph reduction would do; the next rule execution will change that. The only redexes in the lambda expression are applications of one of the copies of (λh . . .). Since those copies are represented by a shared piece of graph, ordinary graph reduction would have to copy the shared piece before it was possible to proceed with a beta reduction.

The algorithm takes the more subtle step that results in graph D. Rule II.c in figure 3 replaces a shared lambda by two lambdas with a shared body; it replaces the variable node paired with the original lambda node with a fan node leading to two variable nodes for the two lambda nodes (ignore the annotation i on the nodes in the rule for the moment). The "upside down" fan node is no different in kind from the other fan nodes; it is just more convenient to draw it in that orientation. The node type is independent of orientation because the graph isn't directed; although application and lambda nodes will always be oriented the same way with respect to the root of the graph, the control nodes can occur in either orientation. The definition of the graph and the operation of the rules only consider which sites of each node are connected to which sites of the other nodes, with no notion of direction. (It would be possible to define an equivalent algorithm on directed graphs, but the distinctions imposed by the directedness of the links would necessitate gratuitously repetitious node types and rules.)
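As a rough illustration of why rule I.a needs no copying, the following Python sketch (our encoding, not the paper's formalism) performs the relinking on an undirected port graph, with all of the fan and bracket bookkeeping of the full algorithm omitted. The site names 'up', 'fun', 'arg', 'body', and 'use', and the explicit lam.var link, are our rendering of the assumptions just described; in the full algorithm the other rules first move any intervening control nodes out of the way.

class Node:
    """A graph node; `kind` is 'app', 'lam', 'var', 'fan', etc.  Each arc
    attaches at a named site; `ports` maps a site to the (node, site) at
    the other end of the arc."""
    def __init__(self, kind, name=None):
        self.kind, self.name, self.ports = kind, name, {}

def link(a, sa, b, sb):
    """Record an undirected arc between site sa of a and site sb of b."""
    a.ports[sa] = (b, sb)
    b.ports[sb] = (a, sa)

def rule_Ia(app):
    """Beta reduction as relinking: the lambda's body replaces the redex,
    and the argument is connected where the lambda's single variable node
    was.  Nothing below either side is copied, so sharing is preserved."""
    lam, _ = app.ports['fun']
    var = lam.var                    # the assumed lambda-variable link
    above = app.ports['up']          # whatever pointed at the redex
    body = lam.ports['body']
    arg = app.ports['arg']
    use = var.ports['use']           # the single consolidated occurrence
    link(above[0], above[1], body[0], body[1])
    link(use[0], use[1], arg[0], arg[1])
    # the detached @, lambda, and variable nodes are simply dropped

# Tiny check on ((lam x. (g x)) a): afterwards the root is linked to (g a).
root, g, a = Node('root'), Node('var', 'g'), Node('var', 'a')
app0, app1, lam, x = Node('app'), Node('app'), Node('lam'), Node('var', 'x')
lam.var = x
link(root, 'down', app0, 'up'); link(app0, 'fun', lam, 'up'); link(app0, 'arg', a, 'up')
link(lam, 'body', app1, 'up'); link(app1, 'fun', g, 'up'); link(app1, 'arg', x, 'use')
rule_Ia(app0)
assert root.ports['down'] == (app1, 'up') and app1.ports['arg'] == (a, 'up')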
Starting from the perspective of the root of the graph, however, it is natural to superimpose a sense of direction on the graph and to distinguish an "upside down" fan node from a "right side up" fan node. From this theoretical perspective, an "upside down" fan functions as a fan-out, rather than as a fan-in. To see the meaning of a fan-out, first notice that in a graph without fan-outs, every path from the root that follows the superimposed sense of direction to reach an ordinary node represents a path in the lambda expression tree (equivalently, a node in the lambda expression tree, since there is a one-to-one pairing in trees between nodes and paths from the root). The same is true in a graph with fan-outs, except that there is a restriction on the paths. Every fan-out along a path will be paired with some fan-in along the path, and the path must follow the branch out of the fan-out that is marked the same (* or o) as the branch it followed into the fan-in. Thus, graph D stands for the lambda expression

((λh.((h (λy.y)) ((h (λy.y)) (λz.z))))
 ((λh'.((h' (λy.y)) ((h' (λy.y)) (λz.z))))
  (λx.x)))

because the fan-out is paired with the top fan-in, which means that paths from the λh, which go in the * branch of the top fan-in, must go out the * branch of the fan-out to the h variable; and correspondingly for paths from the λh'.

If, on the other hand, the fan-out were paired with the lower fan-in, the graph would stand for the lambda expression

((λh.((h' (λy.y)) ((h' (λy.y)) (λz.z))))
 ((λh'.((h (λy.y)) ((h (λy.y)) (λz.z))))
  (λx.x)))

which has an h and h' incorrectly interchanged. Determining which fan-outs pair with which fan-ins is a crucial issue, but that is the issue that is not addressed by this simplified version of the algorithm. The full algorithm uses several additional kinds of control nodes to delimit scopes within which fan-ins and fan-outs can pair. For this simplified explanation, we have omitted those control nodes and the rules that manipulate them. They account for about half of the rule executions of the full algorithm. For the moment, we will assume that the pairing of fan-ins and fan-outs is determined omnisciently.

Finally, it is important to note that the notion of paths is part of an explanation of what lambda expression a graph stands for, not something which the algorithm is directly sensitive to. The algorithm simply does local operations, which are in accord with the bigger picture without being directly cognizant of it.

After the λh node has been split, the beta reduction rule is applicable again. Applying it to reduce the application of (λh' . . .) results in graph E.

Various rules are applicable at this point; assume that the algorithm turns to the bottom application node in graph E, which represents applications of two different functions: h, and (λx.x). To make progress, it is necessary to duplicate the application node, so the two functions can be reduced separately. This is the only situation where the algorithm duplicates an application node: when the node represents applications of different functions. This stipulation means that an application node is not duplicated unless the resulting nodes will represent different work; it is the key to the optimality of the algorithm. Graph F shows the result of rule IV.e, which duplicates the application; the two new applications get their respective functions and share their common argument. The new graph represents the same lambda expression.

Again, various of the algorithm's rules apply at various points of this graph. Assuming the algorithm chooses to reduce the application of (λx.x), the result is shown in graph G. Next, the rule for duplicating a shared lambda node can be applied to the λy node, giving graph H. In this case, the common body shared between the two new lambda nodes is a single edge. Graph I shows what then happens when the lower fan-out meets its paired fan-in: rule V.b shows how they annihilate each other and connect corresponding links, * with * and o with o. This operation leaves the graph representing the same lambda expression as before; the lambda expression tree represented by the middle graph only reflected paths that went through corresponding links of the fans, and those two valid routes are replaced by two links in the right hand graph.

There is still a fan-out meeting a fan-in, but these fans aren't paired with each other; the fan-out is instead paired with the upper fan-in. All four routes going through the two fans contribute to the lambda expression

((λh.((h (λy.y)) ((h (λy.y)) (λz.z))))
 ((λy'.y') ((λy'.y') (λz.z))))

represented by the graph, so they must all be preserved. Rule V.a shows how the fans should duplicate each other, resulting in graph J, which has a new link for each of the four routes through the two fans in the former graph. The mechanism for deciding whether to use rule V.a or rule V.b will be taken up in the next section.
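For intuition only, the following small Python sketch (ours) shows how a path's fan-in choices dictate its fan-out exits, under the naive assumption that a fan-out pairs with the most recently entered unmatched fan-in. Deciding the pairing correctly is exactly what this simplified presentation omits and what the full algorithm's extra control nodes provide.

def cross_fan_in(context, branch):
    """Entering a fan-in on branch '*' or 'o' records the choice."""
    return context + [branch]

def cross_fan_out(context):
    """At the paired fan-out the path must leave on the same-marked branch."""
    return context[:-1], context[-1]

ctx = cross_fan_in([], '*')        # a path from the lambda-h side enters on '*'
ctx, out = cross_fan_out(ctx)      # so the fan-out sends it to the h variable
assert out == '*' and ctx == []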
The algorithm proceeds through the rest of figure 1 using the five rules already illustrated. A few more steps beyond the figure finally result with (λz.z).

Execution Example

One of the key points of the algorithm is that a subpiece of the graph which contains fan-outs can represent different lambda expressions for different paths which reach it. For example, the graph segment

[a fan-out whose two branches lead to the variables h and h'; picture not recoverable from the scan]

which arose during the simplified example, represents either h or h', depending on the history of the path that reaches it. Any path from the root can be thought of as accumulating a context that records which branches of fan-ins it traversed, so that it knows which branches of fan-outs to take. The context combines with the structure of a piece of graph to determine which lambda expression is represented. Like the notion of paths, this notion of context is not a part of the algorithm, but rather a tool to analyze and understand the algorithm.

The context must not only record which branches of fan-ins were taken, but must organize that information so that it is possible to determine which fan-ins pair with which fan-outs. This section deals with the issue of how to do that.

An obvious idea is to label the fans in the graph, with fans paired if and only if they have identical labels. To see why this scheme isn't adequate, first notice that graphs containing looping paths, for example

[a looping graph with fans labeled 0 and 1; picture not recoverable from the scan]

… segment:

[a graph segment built from two applications over x and y; picture not recoverable from the scan]

It then becomes necessary to determine which traversal of a fan-in should be paired with a particular traversal of a fan-out. Assume in the graph that the fan-in labeled 0 is paired with the fan-out labeled 0, and similarly for the fans labeled 1. Any legal path that reaches the fan-out labeled 0 will have gone through the fan-in labeled 0 twice. If the traversal of the fan-out should be paired with the second traversal of the fan-in then the graph segment represents the expression ((x y) (x y)). The segment can occur with this interpretation, for example, near the end of the simplification of

[lambda expression not recoverable from the scan]

On the other hand, if the traversal of the fan-out should be paired with the first traversal of the fan-in then the graph segment represents the different expression ((x x) (y y)). The segment can occur with this interpretation, for example, near the end of the simplification of

[lambda expression not recoverable from the scan]

The labels on the fans aren't able to distinguish the two cases.

Our solution adds a notion of enclosure to delimit the interaction of fans along paths. Specifically, traversals of fans from different enclosures along a path never pair. Without any intervening enclosures, adjacent identically labeled fans pair, so that the above graph represents the expression ((x y) (x y)). But bracket nodes (to be explained shortly) can be added to the graph:

[the same graph with bracket nodes added; picture not recoverable from the scan]
… represents the initial lambda expression

((λg.(g (g (λx.x))))
 (λh.((λf.(f (f (λz.z))))
      (h (λy.y)))))

with graph A of figure 2, which is identical to the graph of the simplified example, except that each fan is in its own enclosure, represented in the graph by two new kinds of control nodes, drawn as bracket shapes and called bracket nodes, which indicate enclosure boundaries. (For the moment, ignore the difference between the two kinds of bracket nodes and ignore the number next to all control nodes in the graph.) The bracketing nodes can be viewed as delimiting enclosures, which indicate how fans pair; or they can be viewed as serving to organize the context and to control its accumulation, which will then determine how fans pair. The two circles on the illustration of graph A are not an actual part of the graph structure, but serve as an aid to visualizing the enclosures. (It isn't always possible to illustrate such boundaries on a graph, because brackets enclose segments of paths, not segments of graphs. For the sample execution, the two will coincide, but that wasn't the case in the graph above.) The open end of a bracket node points toward the inside of the enclosure it is a part of. From the point of view of a path starting at the root of the graph, a bracket node crossed from its closed side toward its open side looks like an open bracket; crossing it puts the path inside a new enclosure. Similarly, a bracket node crossed from its open side toward its closed side looks like a close bracket. As will become apparent later in the example, enclosures can nest or overlap, and a single enclosure can be composed of disjoint regions.

The effect of the brackets on the context accumulated by a path can be deduced by noting that since the fans inside an enclosure cannot pair with fans outside the enclosure, once a path leaves an enclosure its context doesn't reflect any enclosed fans; from outside, it can never matter which branches were taken inside. This requires that the context look like a stack of separate frames or levels, one for each level of nesting of enclosures. Each level of the context records the branches taken through a sequence of fans all at the same level of nesting. An open bracket starts a new level of the context to reflect entering a new enclosure, and a close bracket discards the top level to reflect leaving that enclosure.

The brackets in the initial graph set up a crucial transparency property maintained by the algorithm: if a path segment in the graph corresponds in the lambda expression to going between a lambda and an occurrence of the variable it binds, then the control nodes along the path segment will have no net effect on the accumulated context. The initial bracketing accomplishes this by enclosing each fan, so that the effect of the fan on the context will be discarded. This is fine since, initially, no fan pairs with any other.

The transparency property is needed for the correctness of rule I.a, which simulates beta reduction. Recall that the rule disconnects the argument from the application and reconnects it where the bound variable node was. If the argument were to see different contexts in its new location than it did in its old, it would represent different lambda expressions after the beta reduction than it did before, and the transformation would have been incorrect. But consider any path to the application. The one step extensions of the path to the argument and to the lambda have the same context, since there are no intervening control nodes, and transparency ensures that any extensions of the path corresponding to variable occurrences bound by the lambda have that same context. So after the beta reduction rule runs, any path to the former argument will have the same context as the path in the original expression that represented the argument.

Two applications of rule I.a yield graph B in figure 2 (in this example, we will often show several rule executions at once). At this point, the simplified algorithm would have duplicated the λh node. But now there is a close bracket between the fan and the lambda node. There are a couple of ways of looking at what has to happen, each of which points out a problem.

The immediate goal is to get the bracket node out of the way, presumably by moving it below the λh node, thus putting the lambda node and fan-in in the same enclosure. But in order to preserve the transparency property, it would also be necessary to add a node above the h node that undoes the effect of the close bracket. The problem is that there is no way to undo the irreversible effect of a general close bracket. From the point of view of the context there is no way to know what information the node threw away; from the point of view of enclosures, the enclosure has been closed off and there is no way back in.

One step further ahead of this goal is the need to prepare for the duplication of the λh node. This will require the fan-in to be moved below the λh node and be paired with a fan-out placed where the h variable node is now. For the fans to be paired, they will have to be in the same enclosure. But that enclosure shouldn't include most of the body of the λh, otherwise the fans might incorrectly pair with some other fan inside the body (this isn't a problem in the current graph, but could be in general).

The solution is to have a disconnected enclosure that includes the fan-in and the λh node in one part, and the h node in the other. The first step is rule III.a, which rewrites the general close bracket to a new kind of control node, a conditional close bracket, as shown in graph C. A conditional bracket indicates that the area of the graph it encloses might be one component of a disconnected enclosure which has another component in the direction of the link through the bracket.
This is indicated with a dashed line in illustrations of enclosures. As in this graph, some of the brackets around a region can be conditional brackets while others are unconditional brackets. Other parts of the enclosure to which the region belongs will never be found in the direction of links that cross unconditional brackets.

A conditional enclosure boundary doesn't necessarily mean that the area of graph it encloses has another component in the direction of the link through the bracket; that is why it is called conditional. For two areas to be part of the same enclosure, they must be at the same level of enclosure nesting and there must be a path segment between them that enters both at conditional brackets. Put negatively, a conditionally enclosed area will never be part of the same enclosure as an unconditionally enclosed area or with an area at a different level of nesting. The rules reflect these principles by having back-to-back conditional brackets cancel each other (rule VII.c), while having a conditional bracket directly enclosed by an unconditional bracket turn into an unconditional bracket (rule VII.d).

In terms of the context, a conditional close bracket suspends a level of context. It says that the level it brackets should be encapsulated, but not thrown away. It acts rather like a closure-forming operator, wrapping the contents of the level it brackets into a capsule which is placed in the newly-uncovered next lower level. Inversely, a conditional open bracket expects to find a capsule as the most recent item of the level, which it opens up to form a new level. On the other hand, if there are capsules on a level that a general close bracket discards, the capsules are discarded with everything else.

Changing a general bracket to a conditional bracket changes the contexts accumulated by paths and could thus potentially change the lambda expression represented by a graph, but the algorithm maintains an independence property to preclude that possibility in this case: the top level of the context will never influence which lambda expression is represented by the section of graph below a lambda node. This property means, essentially, that each lambda node has a level of context available for keeping track of different instantiations of the lambdas it represents, which won't interact with the graph below the lambda node. The property is sufficient to ensure the correctness of changing a close bracket above a lambda to a conditional close bracket, since the only difference in context that results from the change is an additional capsule in the top level, which can't affect the lambda expression represented by the graph.

Returning to the example, since the effect of a conditional bracket can be undone, rule II.a can move the conditional bracket below the λh node, putting its reverse above the h node to preserve the transparency of path segments from the λh node to the h node. The result, graph D, has an enclosure that consists of disconnected regions.

Now the lambda node is ready to be duplicated by rule II.c; the fan-out and fan-in would be in the same enclosure, and thus correctly pair with each other. But to stay on the topic of conditional brackets, we will assume that the algorithm first deals with them some more. The algorithm tries to merge disconnected regions of a single enclosure by moving conditional brackets so that they enclose as large a region as possible; when the disconnected regions meet, the brackets can cancel, merging the regions. From graph D, the two conditional brackets can be propagated over applications by rules IV.b and IV.d, resulting in graph E. In each case a conditional bracket on one branch of the application turns into conditional brackets on each of the other two branches, so that the application becomes included in the enclosed region.

Now two of the conditional brackets have reached unconditional brackets. The conditional brackets are destined to meet and cancel, but there is another enclosure between them. In order for the conditional brackets to meet, at least one of them will have to go through that enclosure. The first step must be to transpose a conditional bracket past an unconditional bracket so that their respective enclosures overlap. But transposing the brackets without adjusting something else would change which close brackets close which open brackets, giving entirely rearranged enclosures, not the desired overlapping enclosures. The solution, implemented by rules VII.f and VII.h (with one more execution of IV.d thrown in), is shown in graph F, where the numbers that are attached to every control node are used to indicate overlapping enclosures.

A control node with a number i interacts with nodes i levels of enclosure removed. Thus the conditional bracket nodes in graph F with number 1 don't form an enclosure with the unconditional brackets they face, but rather with brackets one level of enclosure further out, forming the illustrated regions. Again, the outer two regions are two disconnected components of a single enclosure.

In terms of context, a control node with non-zero number doesn't act on the top level of the context, but rather acts as many levels down in the context as the value of its number (the top level being the 0th). The conditional bracket nodes numbered 1 in graph F are acting not at the level of bracketing that directly encloses them, but rather at one level further out. In general, by keeping track of relative offsets of levels, the numbers make it possible to move enclosure boundaries past each other without getting them tangled.
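The description of levels and capsules can be read as a small state machine on contexts. The following Python class is our own rendering of that reading, not the paper's formal definition; level numbers are handled only for fans here, and the way indices interact with bracket nodes is left out.

class Context:
    """A path's context: a stack of levels.  The top level is level 0; a
    control node numbered i acts i levels further down (only fans take the
    index in this sketch)."""
    def __init__(self):
        self.levels = [[]]                 # levels[-1] is the top level

    def level(self, i=0):
        return self.levels[-1 - i]

    def fan_in(self, branch, i=0):         # record the branch taken, '*' or 'o'
        self.level(i).append(branch)

    def fan_out(self, i=0):                # the paired fan-out consumes it
        return self.level(i).pop()

    def open_bracket(self):                # entering an enclosure: new level
        self.levels.append([])

    def close_bracket(self):               # leaving it: the level is discarded,
        self.levels.pop()                  # capsules and all

    def cond_close_bracket(self):          # suspend the top level as a capsule
        capsule = self.levels.pop()        # on the newly uncovered level
        self.level().append(('capsule', capsule))

    def cond_open_bracket(self):           # reopen the most recent capsule
        tag, capsule = self.level().pop()
        assert tag == 'capsule'
        self.levels.append(capsule)

# An enclosed fan's choice survives a conditional close/open pair but would
# be lost to a plain close bracket.
ctx = Context()
ctx.open_bracket(); ctx.fan_in('*')
ctx.cond_close_bracket(); ctx.cond_open_bracket()
assert ctx.fan_out() == '*'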
Next, rule II.a moves one of the conditional brackets below the λy node while rule VI.f moves another above a fan-in, giving graph G. Enclosures at deeper levels don't restrict how fans can pair, so moving the level 1 conditional bracket over the fan-in can't affect how the fan is paired.

At this point, the different regions of the single conditional enclosure are touching and rule VII.c can merge them, giving (with the aid of another execution of rule II.a) graph H. (The arc near the top of the graph indicates that everything below it is part of an enclosure.) Rule VII.c required the conditional brackets to have the same level number; otherwise they would have belonged to different enclosures and should have passed through each other, following rule VII.n. What started in graph D as an enclosure with two disconnected regions has become a single large region. What started back in graph B as independent enclosures have become nested enclosures, preparing the way for lambda reduction while avoiding possible mispairings of fans. The conditional brackets were the instrument of the transformation. In general, conditional enclosure boundaries merge disjoint regions by expanding them until they encounter an enclosing boundary; in this example there was no enclosing boundary so the conditional boundaries expanded to fill the entire surrounding graph structure.

Rule II.c could have been executed any time since graph D. Executing it now gives graph I, which is ready for rule I.b, a version of the beta reduction rule that accommodates a bracket node, giving graph J; the level-0 bracket keeps the λx node out of the enclosure. Rule IV.e duplicates the application to give graph K, where a fan-in is facing a fan-out across an enclosure boundary. The enclosure indicates that the fans are not paired, so they should duplicate each other, rather than connect corresponding links. Rule VI.a takes the first step, moving the fan-out inside the enclosure, but incrementing its number to give graph L. Just as with brackets, the number i on the fan indicates that the fan logically belongs i levels of enclosure out. Since the fans now facing each other have different numbers, they belong to different nestings of enclosures, and are thus not paired with each other; they should duplicate each other. All of the enclosure mechanism is ultimately in service of making sure that fans have the right numbers when they meet. Rule V.a does the actual duplication, yielding graph M, then two applications of rule VI.c take the fan-outs back out of the enclosure, yielding graph N. Even though rule VI.c is written upside down compared to the segments it matches in graph M, the rule fires; the graph is undirected, so rules can match in any orientation to the graph as a whole.

Only one other notable thing happens in the execution past graph N. Rule IV.a applies at graph N to move the restricted close bracket with number 0 over the application. This splits the enclosure into two independent enclosures, which would be incorrect if fans in the two enclosures had interacted or if disconnected regions within the two enclosures had been part of one enclosure. This can't happen because a restricted bracket is never allowed to be used to bracket a fan from the point or to bracket a conditional bracket from the outside. These restrictions mean that an enclosure can always be split in the vicinity of a restricted bracket, justifying the use of rule IV.a. The general bracket doesn't have these restrictions, and so the algorithm knows less when it encounters one. For this reason, general brackets are used as little as possible. In fact, they only occur as close brackets with number 0.

Outline of Proofs

Correctness

First, a quick summary of the algorithm: A legal graph is a rooted undirected graph and contains nodes of only the following types and arities. The ordinary nodes are the lambda nodes, variable nodes, and application nodes:

[node pictures not recoverable from the scan]

The control nodes are the fan nodes, general bracket nodes, restricted bracket nodes, and conditional bracket nodes:

[node pictures not recoverable from the scan]

As shown, each control node has an associated level number, a non-negative integer, which will always be 0 for general bracket nodes. Finally, there are two periphery node types, the root node and void nodes:

[node pictures not recoverable from the scan]

Each graph has one root node, which is its root; each graph presented so far should have had a root node connected to its top node. Void nodes haven't appeared in the examples; they connect to unused parts of a graph. Each arc of a node attaches at a distinguishable site, indicated in the pictures by relative location. In terms of standard graph theory, the nodes are labeled (the label is the node type, including the level number on a control node) and the arcs of a node are ordered.

Running the algorithm consists of encoding a lambda expression into a graph, which will be described shortly, followed by executing the rewrite rules (then, optionally, reinterpreting the graph as a lambda expression). The rules of figure 3 are the heart of the algorithm. Rules I.a and I.b simulate beta reduction while the other rules of figure 3 get the control nodes out of the way of a potential beta reduction. Only the beta reduction rules change which lambda expression is represented by the graph. The rules in figure 4 eliminate unreachable graph structures (these result from reducing lambdas that have no instance of their bound variable). Those of figure 5 take advantage of special situations to simplify the graph structure, usually getting rid of a control node or two. The algorithm works without the rules of figure 5, but does less bookkeeping work if the rules are available. For example, when an implementation of the algorithm is run on a computation of 6! in the unary representation of Church numerals, it performs only about half as many control rule executions when those rules are available and are given priority. All of the rules that move control nodes move them in a consistent direction; the algorithm can't get into a loop of just shuffling control nodes back and forth. Further, the rules are capable of getting any combination of control nodes out of the way of a potential beta reduction; if a lambda expression has a redex, the algorithm will be able to reduce it. It remains to show that the algorithm correctly simulates beta reduction.
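Operationally, running the rewrite system as just summarized might look like the Python sketch below; it is ours, and the rule objects with find_match and apply methods are hypothetical stand-ins for the local rewrites of figures 3 through 5, listed with the cheap simplification rules first.

def run(graph, rules):
    """Fire local rewrites until none applies.  `rules` is assumed to be
    ordered so that the rules to be given priority come first."""
    while True:
        for rule in rules:
            site = rule.find_match(graph)     # local pattern match
            if site is not None:
                rule.apply(graph, site)       # local rewrite
                break
        else:                                 # no rule matched anywhere
            return graph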
The proof of correctness proceeds by defining the lambda expression represented by a graph and showing that an execution of the beta reduction rule on the graph does, in fact, simulate a collection of beta reductions on the lambda expression, while the remaining rules don't affect the represented lambda expression.

As described earlier, each proper path through the graph corresponds to a path through the lambda expression. A proper path is one which starts at the root and at each fan-out takes the branch corresponding to the branch it took into the paired fan-in. This can be defined precisely by defining the context accumulated by a path and how fan-outs find their matched fan-in in that context. For this discussion, we will use the informal description given in the previous section.

The definition of the tree represented by a graph glossed over the issue of how variables in the represented lambda expression should be named. Equivalently, it ignored the issue of how the scoping of lambdas is represented. Not surprisingly, the control nodes hold the answer. We present some properties relating lambda nodes to control nodes, which legal graphs must obey, show how to establish them in the initial encoding of the lambda expression as a graph, and show how they define how the graph represents scoping. The first three properties have already been mentioned in the examples.

There is a potential for confusion in the following definitions because the names "lambda" and "variable" might refer either to nodes in the graph or to vertices of the represented tree. To avoid ambiguity, we will always use lambda node and variable node to refer to nodes in the graph and use lambda and variable occurrence to refer to vertices in the tree.

Property 1 (pairing) There is a one-to-one pairing between lambda nodes and variable nodes; a variable node represents all variable occurrences bound by all lambdas represented by its paired lambda node and represents nothing else.

Since one lambda node can represent several lambdas, this property still allows one variable node to represent variable occurrences bound by several lambdas, provided its paired lambda node represents all those lambdas.

The property poses three minor difficulties in translating a lambda expression into a graph. First, it requires every variable to be bound by some lambda; this can be resolved by adding lambdas to the front of the expression to bind any otherwise unbound variables. Second, it presents a problem with expressions where a lambda doesn't bind any variables, for example (λx.(λy.x)). The solution is to go ahead and include variable nodes for all lambda nodes, but to connect unused variable nodes to void nodes. The example would be represented as

[graph for (λx.(λy.x)), with the unused variable node for λy connected to a void node; picture not recoverable from the scan]

This situation is the way that void nodes are introduced into the graph. Later, if the λy is reduced, the argument will be connected to the void node and garbage collected by the rules of figure 4. The final problem is that some lambda might bind several variable occurrences; in this case fan-ins are used to collect the references and connect them to the single variable node, as was done in the example.

Property 2 (transparency) For any proper path segment that represents a sequence of links in the tree from a lambda to a variable occurrence bound by the lambda, the control nodes along the segment will have no net effect on any allowable context for the segment.

That is, the segment yields the identity transformation on any allowable context. This property can be set up initially by using brackets to encapsulate any fan-ins that were introduced to satisfy the pairing property, putting restricted brackets on each branch and general brackets on the points, as was done in the example.

Property 3 (independence) For any proper path to a lambda node, the make-up of the top level of its context at the lambda node will have no effect on how the path can be extended.
More operationally, if control nodes that affected only the top level of the context were to be added just above the lambda node, they would not affect the tree represented by the graph. This is the property that justifies rule III.a, converting a general bracket above a lambda node to a conditional bracket.

Property 4 (nesting) For any proper path segment that represents a sequence of links in the tree from a lambda to a variable occurrence free in the lambda, the context transformation determined by the segment will discard the make-up of the top level of the entering context.

More operationally, if control nodes that affected only the top level of the context were to be added just above the lambda node, their effects would be discarded by any extension of the path that represents a variable occurrence free in the lambda. This property ensures that beta substitutions preserve the independence property; if an expression is substituted for a free variable inside a lambda, the top level of the context at the lambda won't be visible inside the expression. This property didn't come up in the example, because the example didn't have any free variables inside lambdas.

When initially encoding a lambda expression, this property is established by placing an open restricted bracket just above any lambda that contains free variables and placing one general close bracket just above a variable for each lambda in which the variable is free. This also ends up pairing up the open and close brackets, so as to preserve the transparency property. For example, the expression (λx.(λy.(λz.(x (y z))))) is represented by the graph

[graph for (λx.(λy.(λz.(x (y z))))); picture not recoverable from the scan]

All four properties talk about which lambdas bind which variable occurrences, one in terms of the pairing of lambda nodes with variable nodes and the other three in terms of contexts. In a legal graph, they must agree. The transparency and nesting properties are enough to completely determine which lambdas in the tree bind which variable occurrences. Even in a case where a path loops through the same lambda node several times before reaching a bound variable node, they determine which trip through the lambda node represents the lambda that binds the variable occurrence.

To see how the properties determine binding patterns, consider a path to a variable node and the variable occurrence it represents. Of the trips the path makes through lambda nodes, consider the latest one for which the make-up of its top level of context is still available at the variable node. The nesting property implies that the variable occurrence is not free in the lambda represented by that trip, so that lambda or some deeper one must bind the variable occurrence. But all deeper lambdas are represented by later trips through lambda nodes, for which the top level of the context is not available at the variable node. The transparency property implies that lambdas represented by those trips can't bind the variable, so the lambda in question must. In summary, for any path terminating at a variable node, the properties imply that the variable occurrence represented by the path is bound by the lambda represented by the nearest trip through a lambda node for which the make-up of the top level of the accumulated context is still available at the variable node.

We define the names of variables in the tree represented by the graph in accord with the properties. Every lambda in the tree is defined to bind a variable of a different name, and the name of each variable occurrence is then defined to be the same as that of the lambda that the properties indicate should bind it. This definition assigns all tracking of scoping to the control nodes in the graph. Variable names only appear in the interpretation of the algorithm, not in the algorithm itself. For example, although the definition of beta reduction in the lambda calculus may require substantial renaming to avoid variable capture, the beta reduction rule in the algorithm only needs to change a few links; only the interpretation does renaming. Of course, the algorithm must end up with the same bindings that the lambda calculus would, and it does.

The proof of correctness is an induction, showing that if the graph satisfies the properties, then each rule preserves the properties and each rule preserves the lambda expression represented by the graph, except for the beta substitution rules (I.a and I.b), which simulate beta reductions.

Optimality

The optimality of the algorithm follows from its not duplicating nodes, except when necessary. In particular, application nodes are only duplicated if a fan-out is on their function link and lambda nodes are only duplicated if a fan-in is above them. These are exactly the two situations when a fan node is impeding a potential beta reduction.
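In terms of the port-graph sketch used earlier (again our encoding, with the hypothetical site names 'fun' and 'up'), the two conditions just named are easy to state.

def may_duplicate_application(app):
    """An application node is duplicated only when a fan-out sits on its
    function link, i.e. it stands for applications of different functions."""
    neighbour, _ = app.ports['fun']
    return neighbour.kind == 'fan'

def may_duplicate_lambda(lam):
    """A lambda node is duplicated only when a fan-in sits directly above
    it, i.e. it stands for several instances of the lambda."""
    neighbour, _ = lam.ports['up']
    return neighbour.kind == 'fan'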
The optimality is demonstrated by relating the algorithm's copying with a labelling on lambda expression links that Lévy defines to specify optimality. Lévy defines a parallel reduction step consisting of reducing all identically labelled redexes in one step. The task is to show that each beta reduction step of the algorithm corresponds to one of Lévy's parallel reductions.

It would be nice if any two links in a lambda expression with identical labels under Lévy's labelling were represented by a single link in the graph, but the actual case is slightly more involved. It is possible to define the prerequisite chain of a vertex in the lambda expression, which will be a sequence of links in the lambda expression, all of which must be involved in beta reductions before the vertex can be involved in a beta reduction. Then we can show

Lemma 1 At any point during the execution of the rules of the algorithm, if two vertices in the tree represented by the graph have prerequisite chains of the same length and if corresponding links in the chains have matching labels, then the chains have the same representation in the graph.

In the case of redexes, the link from the application to the function is a prerequisite chain by itself, and so the lemma guarantees that all identically labelled redexes will be represented by the same structure in the graph, and thus reduced in a single step.

Lévy's results then imply that no order of executing the rules of the algorithm will duplicate work. But it is possible that some rule executions might do useless work, that is, do beta reductions inside a subexpression that is eventually discarded. The solution is to impose a normal order strategy on the rule execution. An implementation would keep track of the part of the graph that represents the redex that would be reduced in normal order (this can be done efficiently) and would only execute rules on that part.

Of course, such a strategy reduces the opportunities for parallelism. A parallel implementation of the rules might dispense with normal order to achieve high parallelism at the cost of possibly doing some work that was later discarded. The proof that the rules never duplicate work would still apply to such a system.

Acknowledgments

Jim des Rivières and Jean-Jacques Lévy helped simplify the algorithm. They and Alan Bawden, Pavel Curtis, Dan Friedman, Julia Lawall, and Dan Rabin suggested substantial improvements to earlier drafts.

References

[1] H. P. Barendregt. The Lambda Calculus: its Syntax and Semantics, volume 103 of Studies in Logic and the Foundations of Mathematics. North-Holland, 1981.

[2] H. P. Barendregt et al. LEAN, an intermediate language based on graph rewriting. Parallel Computing, 9:163-177, 1989.

[3] Vinod Kathail. Private communication, 1989.

[4] Jean-Jacques Lévy. Réductions correctes et optimales dans le lambda-calcul. PhD thesis, Université de Paris, 1978.

[5] Jean-Jacques Lévy. Optimal reductions in the lambda-calculus. In J. P. Seldin and J. R. Hindley, editors, To H.B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, pages 159-191. Academic Press, 1980.

[6] John Staples. Efficient evaluation of lambda expressions: a new strategy. Technical Report 23, University of Queensland, Department of Computer Science, St. Lucia, Queensland, 4067, Australia, 1980.

[7] John Staples. Two-level expression representation for faster evaluation. In Hartmut Ehrig, Manfred Nagl, and Grzegorz Rozenberg, editors, Graph-Grammars and their Application to Computer Science: 2nd International Workshop. Springer-Verlag, 1982. Lecture Notes in Computer Science 153.

[8] D. A. Turner. A new implementation technique for applicative languages. Software Practice and Experience, 9(1), 1979.

[9] C. P. Wadsworth. Semantics and Pragmatics of the λ-calculus. PhD thesis, Oxford University, 1971.
[Figures 1 and 2: the step-by-step execution graphs for the simplified and full examples; the diagrams are not recoverable from the scan.]

Figure 3: Rules
[The rule diagrams (rules I.a through VII.n) are not recoverable from the scan.]

[Figures 4 and 5: the rules that eliminate unreachable graph structure (VIII.a-VIII.d) and the simplification rules (IX.d, IX.e, IX.h, among others); the diagrams are not recoverable from the scan.]