Data Structure Optimization For Functional Programs
Maciej Godek
Student ID no. 181131
Gdańsk 2017
This work is licensed under a Creative Commons “Attribution-ShareAlike 4.0 International” license.
Abstract
The purpose of this work is to develop techniques for executing programs written in a functional style efficiently. The work consists of two parts. The first presents some classic techniques for transforming functional programs into imperative form, as well as some basic methods of proving statements about program properties. In the second part, a method for transforming a certain class of programs operating on lists into equivalent programs operating on arrays is proposed. Furthermore, the conditions that allow a functional implementation of the quicksort algorithm to be transformed into an optimal imperative form are analyzed.
All source programs and transformations are expressed in the purely functional subset of the algorithmic language Scheme described in chapter 2. The target computation model is a variant of the RAM machine, whose model and instruction set are described in depth in chapter 3, including an implementation that uses some imperative features of the Scheme programming language.
Chapter 4 presents some classic techniques for transforming programs expressed in the previously described subset of Scheme into sequences of instructions for the RAM machine; in particular, the conversion to Continuation-Passing Style and Tail-Call Optimization are described.
Chapter 5 describes a simplified variant of the Boyer-Moore system, including a full list of axioms used for proving theorems about programs expressed in the previously described subset of the Scheme programming language. Unlike the original Boyer-Moore system, however, the system elaborated in this work cannot prove theorems on its own, and can only serve as a proof-checker for the proofs provided by its user.
In chapter 6, an original method for converting functional programs into forms that receive and pass arrays is developed. The source language is the purely functional subset of Scheme described in chapter 2, and the target language is the full Scheme language, including its imperative features. The proposed conversion method is only a sketch and certainly requires elaboration.
Chapter 7 deals with the automatic conversion of a functional variant of the quicksort algorithm into an imperative form, although it does not arrive at a working conversion algorithm.
Keywords
data structure, program transformation, compiler, theorem prover, functional programming
Contents

1 Introduction
1.1 Motivation
1.1.1 Historical perspective – algorithms
1.1.2 Programming as expressing ideas
1.1.3 Referential transparency
1.1.4 Historical perspective – data structures
1.2 Formulation
1.3 Structure of this work
1.4 Related work
1.5 Acknowledgements
I The Organon

4 Compilation
4.1 Continuation-Passing Style
4.2 Conversion to Continuation-Passing Style
4.3 Generating machine code
4.4 Conclusion

II The Substance

B Y-lining

Bibliography
1 Introduction
1.1 Motivation
Functional programming is a valuable technique for building large and reusable
software systems. Contrasted with more widespread approaches, its main
advantage is that it simplifies reasoning about a program's behavior. Furthermore, it allows programmers to provide less detail about the program, making it amenable to running in a variety of different setups.
What those languages had in common is that their main purpose was to instruct the computer how to perform computations. In addition, the means of expression of those languages were designed so that they could easily be translated directly to the machine code of most CPUs. With time, however, the way of thinking about computers changed significantly. No longer were they mere tools for performing scientific computations; they were becoming larger and larger systems that were expected to run multiple programs simultaneously in real time.
Furthermore, the complexity of the applications was only increasing, and
the old ways of thinking were often insufficient for managing that complexity.
As the performance of computers improved, our requirements with regard to programming systems began to shift. For many applications it was no longer necessary to squeeze the maximum performance out of every CPU cycle, as there were other factors, such as network latency, that bounded the overall performance of the application.
Moreover, there turned out to be a physical limit on the clock speed of a single CPU core, which caused hardware manufacturers to focus on delivering processors with more and more cores of the same speed. Consequently, the ability to distribute the execution of a program across many cores, or even many machines, can often be more profitable than the ability of a sequential program to perform a few more instructions per second.
The abundance of processing power had yet another consequence. As programmers no longer had to worry (too much) about computational resources, they could experiment more with the means of expression of their programming languages. After all, it turned out that the two most costly aspects of computer application development were (1) programmers' time and (2) programmers' mistakes. It therefore seems prudent to focus on shortening the time of application development and on minimizing the possibilities of making mistakes.
¹ The idea of using natural language to program computers has been criticized at length by Edsger Dijkstra, who concluded that “machines to be programmed in our native tongues —be it Dutch, English, American, French, German, or Swahili— are as damned difficult to make as they would be to use” [17].
² However, certain problems with the way the language of mathematics is customarily used to lay out physics have been pointed out by Gerald Sussman, who proposed to use a programming language for that purpose instead [77].
def factorial(n):
if n == 0:
return 1
else:
return n * factorial(n - 1)
>>> factorial(5)
=== 5 * factorial(4)
=== 5 * 4 * factorial(3)
...
=== 5 * 4 * 3 * 2 * 1 * 1
=== 120
One attempt to solve this problem was to separate data structure implementations from their interfaces. This idea was embodied in the Standard Template Library of the C++ programming language, which provided a few different data structures that could be used with a fixed set of generic functions [75][74].
It was also at the core of the SQL family of languages, which actually
were designed as a unified interface to a plenitude of various data structures.
Another idea was to provide a few basic, most commonly used data structures in the core language. These typical data structures usually are: an ordered sequence of elements of arbitrary length (lists), an ordered sequence of elements of fixed length (tuples), a mapping between key values and corresponding target values (dictionaries, usually implemented using hash tables), and an unordered set of elements of arbitrary size (sets).
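In Python, for instance, all four of these structures are available directly as literals (an illustrative sketch; the variable names are ours):

```python
seq = [1, 2, 3]                # list: ordered sequence of arbitrary length
point = (4, 5)                 # tuple: ordered sequence of fixed length
ages = {"ada": 36, "bob": 41}  # dictionary: mapping from keys to values
seen = {1, 2, 3}               # set: unordered collection without duplicates

seq.append(4)                  # lists can grow...
ages["eve"] = 29               # ...and dictionaries can gain new keys
```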
The latter approach has been employed by many popular programming
languages, such as Perl, Python, PHP, Ruby or JavaScript. Although the
performance of their versatile built-in data structures is usually worse than
in the case of tailored solutions, it is often sufficiently good for practical
applications, and the greater conceptual simplicity of the source code makes
its development and maintenance easier.
An extreme version of this approach was proposed in 1958 by John McCarthy, who designed the LISP programming language [61]. LISP used one data structure to represent collections, namely – singly linked lists. They turned out to be expressive enough to embrace not only sets and multisets, but also dictionaries (as lists of key-value pairs).
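A dictionary represented as a list of key-value pairs can be sketched in Python as follows (the function name assoc follows the LISP tradition; the code itself is our illustration, not McCarthy's):

```python
def assoc(key, alist):
    """Return the first (key, value) pair whose key matches, or None."""
    for k, v in alist:
        if k == key:
            return (k, v)
    return None

capitals = [("France", "Paris"), ("Poland", "Warsaw")]
```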
Since its inception, singly linked lists became the predominant data structure in functional programming [4], which – in conjunction with the technique of garbage collection (also pioneered by McCarthy) – allowed the development of programs devoid of state mutation.
Unfortunately, singly linked lists have some undesired properties that disqualify them in a number of applications. For example, the access time to a random element is linear (as opposed to constant, as is the case for arrays), and so is the time needed to compute the length of a list. Furthermore, modern CPUs are optimized to process data that is organized in vectors, and the non-locality of the link reference operation may contribute to an increased number of cache misses, let alone the doubled memory consumption caused by the need to store an additional pointer along with each element of the list.
For a long time, those deficiencies prevented functional programming from becoming widespread. Even the programming languages that were advertised as functional usually incorporated arrays in the repertoire of their primitive data structures, and specified their order of evaluation [66] (or provided some sophisticated means to do so [81]) in order to be able to use them in a predictable way.
In 2002, Phil Bagwell proposed a data structure called VList, which merged the functionality of linked lists with vectors.
1.2 Formulation
Let's consider the following program – a variant of quicksort, written in the Haskell programming language:
1 qsort [] = []
2 qsort (p:xs) = (qsort below) ++ [p] ++ (qsort above)
3 where
4 below = filter (< p) xs
5 above = filter (>= p) xs
The central idea of the algorithm is expressed in line 2: the resulting sequence consists of the (sorted) elements smaller than the pivot, followed by the pivot, followed by the (sorted) elements greater than or equal to the pivot.
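For readers more familiar with Python, the same algorithm can be sketched as follows (our illustrative transcription, not part of the source language of this work):

```python
def qsort(xs):
    if not xs:
        return []
    p, *rest = xs                          # p:xs in the Haskell original
    below = [x for x in rest if x < p]     # filter (< p) xs
    above = [x for x in rest if x >= p]    # filter (>= p) xs
    return qsort(below) + [p] + qsort(above)
```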
Let’s contrast it with the imperative version from [14]:
Quicksort(A, p, r)
1  if p < r
2      q ← Partition(A, p, r)
3      Quicksort(A, p, q − 1)
4      Quicksort(A, q + 1, r)

Partition(A, p, r)
1  x ← A[r]
2  i ← p − 1
3  for j ← p to r − 1
4      if A[j] ≤ x
5          i ← i + 1
6          exchange A[i] ↔ A[j]
7  exchange A[i + 1] ↔ A[r]
8  return i + 1
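A direct Python transcription of this pseudocode may make the control flow easier to trace (the transcription is ours; bounds are inclusive, as in the original):

```python
def partition(A, p, r):
    x = A[r]                   # the pivot is the last element of the subarray
    i = p - 1
    for j in range(p, r):      # j ← p to r − 1
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q - 1)
        quicksort(A, q + 1, r)
```

To sort an entire array A, the initial call becomes quicksort(A, 0, len(A) - 1), since Python indexes from 0 rather than 1.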
These definitions come with the remark that “to sort an entire array A, the initial call is Quicksort(A, 1, length[A])”.
There are a few things to note here. First, the imperative definition is much more difficult to follow, because array indexing adds another layer of indirection. Also, the imperative version is less composable: because it modifies its argument A, if the old array is to be used in some other part of the program, the programmer has to remember to make a copy.
On the other hand, the qsort function defined in Haskell can only superficially be called quick: both the filter function and the list concatenation operator ++ allocate new storage, and the time complexity of the ++ operator is proportional to the length of its leftmost argument. Furthermore, each use of the filter function traverses the input list once, so the list may be traversed twice per invocation of qsort. By contrast, the Partition function traverses the input array only once.
There is also a significant difference in the contents of the data structures at the intermediate steps of the computation: the above and below lists contain elements in the order in which they appear in the xs list, while the result of Partition is generally difficult to determine. The reason why this doesn't matter too much is that the elements eventually end up sorted, regardless of the initial and intermediate arrangements of the elements.
Clearly, there is a big difference between the functional qsort and the imperative Quicksort, because the latter is based on the clever idea of Partitioning an array (in place).

However, we can perceive the Haskell program as a sort of cross-section of its imperative counterpart. In particular, it should be possible to mechanically transform certain classes of functional programs so that they use arrays instead of lists, and reuse previously allocated storage instead of allocating new storage. This is what this work is about.
1.5 Acknowledgements
First and foremost, I would like to thank my family and my girlfriend for
supporting my crazy decision of returning to the University to fill the gap
I should also mention Radek Potyraj and Andrzej Macuk, who created the right conditions at the Fellows company, which allowed me to try to continue my studies. I am grateful to Michał Janke for dragging me to their company, and I regret that in the end things didn't turn out as well as they could have. I appreciate the attitude of my colleagues at work, who have been showing interest in this work and in my opinion and expertise, making me feel like someone important.
My apologies to everyone who might feel that his or her name is missing here. For purely ecological reasons, I'm unable to do justice to all of you here. (This also includes a lot of people whom I never had any opportunity to meet in person, and whose work I had to admire from a distance, be it spatial or temporal.)
Part I
The Organon
2 The Source Language
First we must define the terms “noun” and “verb”, then the terms
“denial” and “affirmation”, then “proposition” and “sentence”.
– Aristotle, On Interpretation
2.1 Overview
In the previous chapter, we presented two variants of seemingly the same
algorithm – one written in the functional language Haskell, and the other
presented in its classical imperative form.
In this work, we are going to analyze the general conditions that need
to be satisfied in order to be able to transform functional programs that
operate on lists into equivalent imperative programs that operate on arrays,
and search for the means that can be used in order to detect whether those
conditions are actually satisfied.
We shall begin by defining a programming language that will be used to express algorithms on the data structures that are to be optimized, and by specifying a computational model that will be the target of our transformations.
An ideal programming language for our purpose would possess the following traits: it would be simple to describe, simple to process, and powerful enough to express any computable function.
For those reasons, we decided to choose a purely functional subset of the Scheme programming language [66], devoid of the set! instruction and the call-with-current-continuation control operator. Furthermore, although the specification of Scheme defines the strict (or applicative) order of evaluation, the programs presented in this work shall never rely on this.
Lastly, as the point of this work is to describe techniques for optimizing programs that operate on lists, our subset of Scheme need not support vectors, in spite of their presence in the specification.
2.2 Syntax
Syntactically, Scheme employs a fully parenthesized prefix notation built around so-called s-expressions, which can be described with the following BNF-style grammar:
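The grammar itself did not survive in this copy; a typical BNF-style description of s-expressions looks roughly as follows (a reconstruction, not necessarily the author's exact grammar):

```
<expression> ::= <atom> | <list> | ' <expression>
<list>       ::= ( <sequence> )
<sequence>   ::= <empty> | <expression> <sequence> | <expression> . <expression>
<atom>       ::= <symbol> | <number>
```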
2.3 Semantics
The value of a number is that number itself (thus we say that numbers are
self-evaluating). The value of a symbol is the value that has been bound to
that symbol. A symbol is said to be bound if either:
• it has been defined using a define special form (or its derivative) in
the current lexical context,
(lambda (x y) x)

Here (x y) is the list of arguments, and the final x is the body.
Note that the symbols that appear in the argument list must be unique,
i.e. it must not be the case that the same symbol appears on the list more
than once. Also, the body must be a valid Scheme program.
The list of arguments need not be proper. In such cases, the dotted tail
argument represents a list of optional (variadic) arguments.
Although the lambda form is sufficient to express any computation [41],
including operations on natural numbers as well as lists[1], Scheme provides
some additional primitive forms for convenience.
For example, the definition

(define x 5)

(where x is the definiendum and 5 is the definiens)
causes x to be bound to the value 5. The define special form is the only
primitive special form that is not considered an expression.
Also, since – unlike lambda – the define form does not create a new scope, but rather extends the current one, it makes it possible to define recursive functions.
Although the basic form of define is (define name value), we are going to treat usages like (define (f x) ...) as short-hand for (define f (lambda (x) ...)), and so on¹.
The if form

(if <condition> <consequent> <alternative>)

If the <condition> evaluates to #false, then the value of the whole expression is the value of <alternative>; otherwise, it is the value of <consequent>. (Therefore, every value other than #false is considered true in the context of an if expression. Such values will be referred to as truth-ish throughout this text.)
The quote form is used to input literal data. For example, the value of the
expression (quote x) is the symbol x. The quote operator is redundant for
self-evaluating expressions (e.g. numbers), so there is no practical difference
between expressions (quote 1) and 1.
The operator is not idempotent, though: the value of the expression (quote (quote 1)) is a list of two elements, whose first element is the symbol quote and whose second element is the number 1.
As noted in the section describing the syntax, the expression (quote x) can be abbreviated as 'x, so – consequently – the expression (quote (quote 1)) can equivalently be written as ''1, as well as '(quote 1) and (quote '1).
Although the source programs that we are going to write are purely applicative, the resulting program will occasionally contain some procedural constructs. The begin form is used for sequencing operations. Its value is the value of the last expression in that form (in particular, (begin x) and x are equivalent).
¹ In particular, (define ((f x) y) ...) can be treated as a short-hand for (define f (lambda (x) (lambda (y) ...))). This generalized feature, called curried definitions, is not provided by most implementations of Scheme.
The set! form is used for performing assignment. For example, the value of the expression²
(begin
(define x 5)
(set! x (+ x 1))
(* x x))
is the number 36. There is no need to use begin in the body of the
lambda form: (lambda args (begin actions ...)) and (lambda args
actions ...) are equivalent. Most typically, the begin form is used in one
or both branches of the if form.
The let form is used to create local bindings, and is defined so that

(let ((<name> <value>) ...)
  <body>)

expands to

((lambda (<name> ...) <body>) <value> ...)
In the case of the let form, the bindings of variables to all values occur
simultaneously, so for example in the expression
1 (let ((x 2)
2 (y (+ x 1)))
3 (+ x y))
² It may be questionable whether that program can actually be called an expression, as it introduces a definition into its current scope. The snippet presents a very bad style of programming, but it also presents the meaning of the special forms discussed in this section.
the symbol x in line 2 does not refer to the value from the binding in line 1, but to the value from the outer scope of the expression. This behavior is often undesired in practice.
For that reason, Scheme provides a sequential variant of let called let*, which is defined so that

(let* ((<name1> <value1>)
       (<name2> <value2>)
       ...)
  <body>)

expands to

(let ((<name1> <value1>))
  (let* ((<name2> <value2>)
         ...)
    <body>))

The and form

The and special form performs evaluation in a short-circuited manner:

(and <condition1> <condition2> ...)

expands to

(if <condition1>
    (and <condition2> ...)
    #false)

and

(and <final-condition>)

expands to

<final-condition>

thereby causing the and form to expand either to #false or to the value of its <final-condition>.
The or form

Like the and form, the or special form performs evaluation in a short-circuited manner, and is also defined to evaluate to the value of its succeeding clause. This last requirement causes a slight complication under the strict model of evaluation, as it requires capturing the value of an expression in order to make sure that it is evaluated only once:

(or <condition1> <condition2> ...)

expands to

(let ((result <condition1>))
  (if result
      result
      (or <condition2> ...)))

The cond form

The cond form allows chains of conditions to be expressed conveniently:

(cond (<condition1>
       <value1>)
      (<condition2>
       <value2>)
      ...)

gets expanded to
(if <condition1>
<value1>
(cond (<condition2>
<value2>)
...))
The quasiquote form (in conjunction with two helper keywords, namely – unquote and unquote-splicing) is used for creating data conveniently. The convenience stems from the fact that the syntaxes (quasiquote x), (unquote x) and (unquote-splicing x) can be abbreviated as `x, ,x and ,@x, respectively. This allows one to create data that contains some variable elements in it; for example, the value of the expression

(let ((x 3)
      (y '(5 6 7)))
  `(1 2 ,x 4 ,@y 8))

is the list (1 2 3 4 5 6 7 8).
Scheme allows functions to return more than one value using the values
form. For example, the form (values 1 2 3) returns three values: 1, 2
and 3.
Normally (e.g. in the context of a function call) only the first value is taken into account, and the remaining ones are ignored. Scheme provides a special form call-with-values, which allows one to capture the remaining return values. For example, (call-with-values (lambda () (values 1 2 3)) list) passes the values 1, 2 and 3 as subsequent arguments to the list function (effectively creating the list (1 2 3)).
We are going to assume here that the let* form can be abused to receive multiple values [21], so that, for example
is equivalent to
(call-with-values
(lambda () (values 1 2 3))
(lambda (a b c) (list a b c)))
Note that the multiple return value feature will never be used in this work when Scheme is to be treated as the subject language (i.e. as data to be processed by compilers, interpreters, theorem provers and other transformers).
The is form
(is a related-to? b)
expands to
(related-to? a b)
(is _ related-to? x)
expands to
As noted before, the lambda form contains a list of arguments and a nested
Scheme expression, referred to as the body of the lambda form.
The value of the application of a function created by a lambda form
is simply the value obtained by evaluating its body in the lexical scope
extended with bindings of its arguments to the values of the operands of the
combination.
For example, the value of the expression

((lambda (x y) x) 5 10)

is 5.
The ! symbol in the body of the lambda expression refers to the binding
in the environment in which the lambda expression is evaluated (the so-
called top-level environment), and not the environment introduced by the
let form (in which the ! symbol gets bound to the value of the lambda
expression).
This is not the case with the define forms. The define form introduces
a new binding to the current environment, so the definition
(note the additional pair of parentheses around the nested f and generating
(Z F)):
There is also the random function that takes an integer n and evaluates
to a random integer between 0 and n-1. Note that although programs that
contain a call to the random function are no longer referentially transparent,
they can still be analyzed in terms of the substitution model of computation.
The basic function for constructing lists is called cons. It takes two arguments. The value of the expression (cons a b) is a (typically) newly allocated pair (a . b) (also called a cons cell) from the garbage-collected heap.
In order to retrieve the values from a cons cell, one can use the accessor functions car and cdr. In particular, for any values a and b, the expression (car (cons a b)) evaluates to a, and the expression (cdr (cons a b)) evaluates to b.
The predicate pair? can be used to check whether a given object is a cons cell, so for any values a and b, the expression (pair? (cons a b)) evaluates to a truth-ish value. On the other hand, for any value x, if x is an atom, then (pair? x) evaluates to #false.
The apply function can be used to apply a function to a list of arguments. In particular, if l is the list (a1 a2 ... an), then (apply f l) is equivalent to (f a1 a2 ... an).
The notion of identity in Scheme may seem a bit complex at first. The primitive predicate eq? can be used to check whether two expressions evaluate to the same object, that is, an object that occupies the same space in the computer memory.
In particular, it need not be the case that (eq? (cons 1 2) (cons 1 2)), because the evaluation of each expression (cons 1 2) may allocate new memory instead of re-using the already allocated storage.
On the other hand, it is guaranteed that two instances of a symbol with
the same shape are eq?, for example (eq? ’abc ’abc) evaluates to a truth-
ish value, and that two symbols with different shapes are not eq?, so for
example (eq? ’abc ’def) evaluates to #false. Also, it is guaranteed that
the empty list, i.e. ’(), is always eq? to itself.
The situation gets more complicated in the case of numbers. Typically, two instances of the same number will be eq? if the number is small enough to fit into a machine word. However, since Scheme supports arbitrary precision arithmetic, additional storage may be allocated on the heap to store the results of arithmetic operations. In such cases, two instances of the same number may not be eq?.
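CPython exhibits an analogous behavior with its identity operator is (an illustration of the same phenomenon, not of Scheme itself; the interning of small integers is a CPython implementation detail):

```python
a = int("100")        # built at run time, so the compiler cannot fold them
b = int("100")
print(a is b)         # True in CPython: small integers are interned

c = int("10") ** 50   # big integers get freshly allocated storage,
d = int("10") ** 50   # so two equal bignums need not be identical
print(c == d)         # numerically equal...
print(c is d)         # ...but typically not the same object
```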
(define (equal? a b)
(or (eq? a b)
(and (pair? a)
(pair? b)
(equal? (car a) (car b))
(equal? (cdr a) (cdr b)))
(and (number? a)
(number? b)
(= a b))))
The above definition reads as follows. Two objects are equal? either if
they are eq?, or if they are both pair?, their cars are equal? and cdrs are
equal?, or they are both number? and they are = (numerically equal).
The actual equal? function available in Scheme is a bit more powerful,
as it can take arbitrarily many arguments, and evaluates to #false if any
two differ (in the sense implied by the above definition) from each other.
List processing
The list function evaluates to a list of its (evaluated) arguments, for ex-
ample (list 1 (* 1 2) (+ 1 (* 1 2))) evaluates to the list (1 2 3). The
list function could be defined simply as (lambda x x), because if the list
of arguments of a lambda expression is improper, then they are captured in
a list and bound with the dotted tail of the argument list.
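Python's variadic parameter plays the same role as the dotted tail; a one-line analogue of (lambda x x) (the name list_ is ours, chosen to avoid shadowing the built-in):

```python
def list_(*args):       # *args captures all arguments into a tuple,
    return [*args]      # like the dotted tail capturing them into a list
```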
The map function takes an n-ary function f (for n ≥ 1) and n lists of length k, and returns a new list of length k whose elements are obtained by applying f to the subsequent n-tuples of elements drawn from those lists.
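Python's built-in map works analogously for n-ary functions, so the behavior described above can be observed directly:

```python
# each element of the result is obtained by applying the function
# to the subsequent pairs drawn from both lists
sums = list(map(lambda a, b: a + b, [1, 2, 3], [10, 20, 30]))
```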
Non-standard functions
There are a few functions that are going to be used in this work that are not
a part of the standard Scheme. They are explained here briefly, and their
definitions are given in the appendix A.
Quantifiers

It is common in logic and mathematics to express statements using the quantifiers for all (∀) and exists (∃). For example, the sentence “All men are mortal” could be translated to predicate calculus as ∀x [man(x) => mortal(x)]. We usually assume that there is a domain (universe) of all objects that we can talk about.
In Scheme, we usually have to be a bit more specific. It is customary to define two functions, usually called every and any, that take a predicate and a list of objects from the domain; every evaluates to #false unless every object from the list satisfies the predicate, any evaluates to #false unless some object in the list satisfies the predicate, and otherwise they evaluate to some truth-ish value.
So while the most faithful translation of the above example to Scheme
would have the form
it is much more customary to assume that we have a list of all men, and
check whether each element of that list satisfies the predicate mortal?, i.e.
(every mortal? men).
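The two quantifier functions can be sketched in Python over its built-ins all and any (the names mirror the Scheme ones; the trailing underscore in any_ merely avoids shadowing the built-in):

```python
def every(pred, xs):
    """True iff every element of xs satisfies pred."""
    return all(pred(x) for x in xs)

def any_(pred, xs):
    """True iff at least one element of xs satisfies pred."""
    return any(pred(x) for x in xs)

mortal = lambda x: True      # a toy predicate: everything is mortal
men = ["socrates", "plato"]
```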
Set operators

Lists can be interpreted not only as sequences, but also as sets. On such occasions, the order of elements on the list becomes immaterial. An element is thought of as being a member of the set represented by a list if it is a member of that list, which is denoted as (member element list) or (is element member list).
One can easily define the usual set operators, such as union, intersection and difference. However, the order of elements contained in the result of these operations is undefined and should not be relied on (in particular, it might be the case that any of these functions, called with the same arguments, returns its result in a different order upon each invocation).
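The three operators can be sketched in Python over plain lists (our illustration; as noted above, the order of the results is deliberately left unspecified):

```python
def union(a, b):
    return a + [x for x in b if x not in a]

def intersection(a, b):
    return [x for x in a if x in b]

def difference(a, b):
    return [x for x in a if x not in b]
```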
Folding

Suppose that we are given a list, for example [a1, a2, a3, ...], and a binary operator ○. The simplest variant of the fold operation computes the value of the expression (a1 ○ a2 ○ a3 ○ ...).
Note that the above formulation is ambiguous. For example,

fold(○, [a1, a2, a3, a4])

i.e.

(a1 ○ a2 ○ a3 ○ a4)

can be interpreted as either

(((a1 ○ a2) ○ a3) ○ a4) (2.1)

or

(a1 ○ (a2 ○ (a3 ○ a4))) (2.2)

or

((a1 ○ a2) ○ (a3 ○ a4)). (2.3)
The interpretation (2.1) is called left fold and the interpretation (2.2) is called right fold.

Of course, if the operator ○ is associative, the interpretation is (by definition) insignificant from the denotational point of view (although the amount of resources used by those interpretations might differ under various circumstances).
Also, it is easy to see that, for the above fold operation to make sense, it must be the case that ○ ∶ A × A ↦ A.

This requirement can be loosened a bit, though. By introducing an additional argument e, we allow the operator to have the type B × A ↦ B in the case of the left fold, or A × B ↦ B in the case of the right fold: then, fold-left(○, e, [a1, a2, ..., an]) is interpreted as ((...((e ○ a1) ○ a2) ○ ...) ○ an), and fold-right(○, e, [a1, a2, ..., an]) is interpreted as (a1 ○ (a2 ○ (... ○ (an ○ e)...))).
The purpose of the additional argument e can be explained as follows. Imagine that you have a state of the world, S, and a (possibly infinite) list L of actions that occur in the order in which they appear on the list. There is an update function ⊲ that takes a state and an action and returns an updated state. Under such circumstances, the evolution of the world can be modeled as fold-left(⊲, S, L).
If the operation ○ has a neutral element (i.e. an element 1 such that, for
any valid x, 1 ○ x = x in the case of left fold, or x ○ 1 = x in the case of right
fold), it is often convenient to choose it as the argument e.
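Both directions of fold, with the extra argument e, can be sketched in Python (functools.reduce is the standard library's left fold; fold_right is our own loop):

```python
from functools import reduce

def fold_left(op, e, xs):
    # ((...((e ∘ a1) ∘ a2) ∘ ...) ∘ an)
    return reduce(op, xs, e)

def fold_right(op, e, xs):
    # (a1 ∘ (a2 ∘ (... ∘ (an ∘ e)...)))
    acc = e
    for x in reversed(xs):
        acc = op(x, acc)
    return acc
```

With a non-associative operator such as subtraction, the two directions differ: fold_left over [1, 2, 3] with e = 0 gives ((0 − 1) − 2) − 3 = −6, while fold_right gives 1 − (2 − (3 − 0)) = 2.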
Pattern matching
We can write programs that operate on s-expressions using the primitive
functions pair?, car, cdr and eq?. For example, if we want to know whether
a given s-expression exp represents a sum of exactly two elements, say,
(+ a b) (for some a and b), we can write a compound condition
If we would like to capture the first and the second operand of plus using, say, the let form, we would obtain:
(match exp
((’+ a b)
;; here a and b are bound to the first
;; and the second operand of +
...)
;; possibly other matches go next
...)
expands to the above condition and binding list. Furthermore, we can establish a convention that whenever a compound expression appears in the place of an argument in the lambda form (or any derivative form, such as let or let*), it gets pattern-matched, so for instance (lambda ((a . b)) body) would be interpreted as (lambda (x) (match x ((a . b) body))) (where the variable x does not occur free in the body form). The _ (underscore) symbol has a special meaning: it matches anything, but does not get bound to any value.
We will sometimes be making use of a bit more exotic (and less obvious) feature of the pattern matcher, namely – the ... (ellipsis) operator. It behaves similarly to the ,@ (unquote-splicing) operator in that it “unsplices” a list:
(match ’(1 2 3 4 5)
((x y ... z)
;; x is bound to 1,
;; y -- to the list (2 3 4),
;; and z -- to 5
...))
A transformation of a program containing define forms into a program that only uses lambda, let and
let* (and a fixed-point combinator) is given in appendix B.
Also, for the sake of brevity, we shall ignore the issues concerning user-
defined syntactic extensions. We shall also assume that programs do not
rebind any of the syntactic keywords (like quote, lambda or if), so for
example we shall not be concerned with programs such as
((lambda (lambda)
(lambda lambda))
(lambda (quote)
(quote quote)))
((’quote literal)
literal)
((operator . operands)
(let ((procedure (value operator environment))
(arguments (map (lambda (operand)
(value operand environment))
operands)))
(application procedure arguments)))
(_
(cond ((symbol? expression)
(lookup expression environment))
((number? expression)
expression)
(else
(error ’unrecognized-expression expression))))
))
The most important thing left to explain is how procedure application
occurs: first, we extend the original environment of the procedure with
the values of the arguments, and then we execute the program defined
in the body of the lambda form. (The only exception concerns the primitive
functions, which are handled using the primitive apply.)
The tie function ties the parameter names with their values, extending
a given environment. Although it is not required by the Scheme standard,
the tie function can be defined to support destructured bindings:
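One possible definition is sketched below. The frame representation of (name value) entries and the treatment of dotted parameter lists are our assumptions about what “destructured bindings” amounts to here:

```scheme
(define (tie parameters values environment)
  (cond ((symbol? parameters)
         ;; a "rest" parameter: bind the symbol to all remaining values
         `((,parameters ,values) . ,environment))
        ((pair? parameters)
         ;; destructure recursively: tie the car, then the cdr
         (tie (car parameters) (car values)
              (tie (cdr parameters) (cdr values) environment)))
        (else
         ;; () matched against an exhausted list of values
         environment)))
```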
The initial environment should contain all the primitive functions avail-
able in the language. In the simplest case, it should be sufficient to define
it as
Footnote (continued): …ing a single cons cell, whereas the (key value) list normally uses two cons cells. However,
since data structure optimization is the core theme of this work, we don’t value such argu-
ments too much. We believe that improper lists should only be used for constructing and
destructuring lists, and that actual data should only be stored using proper lists (unless
the data stored represents the Scheme source. However, even on such occasions we could
treat the dot as a special symbol, rather than a part of syntax). Note also, that another
popular representation of frames consists of two lists, where the first contains symbols’
names, and the second consists of the corresponding values, for example ((a b c) (1 2
3)). This representation is often called a rib cage[18].
(define initial-environment
‘(((cons ,cons)
(car ,car)
(cdr ,cdr)
(eq? ,eq?)
(pair? ,pair?)
(number? ,number?)
;; ...
)))
3 The computation model and the target language

In the previous chapter, we have expressed the subset of Scheme of our interest
in this very subset. We used the representations of symbols, numbers and
lists provided by the host Scheme implementation.
However, this level of detail is insufficient for our purpose, because it
does not reflect the capabilities and limitations of real machines.
In particular, typical computer architectures consist of registers and
an array of memory cells that can hold integers from some limited
range.
(define pair
’((left word)
(right word)))
Of course, the procedures for getting and setting values of memory cells
would need to operate in the context of some memory object:
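For instance, the memory could be represented as a message-passing closure over a bytevector, in the style of the machine object shown later (the names make-memory, memory-at and set-memory-at! are hypothetical):

```scheme
(define (make-memory size)
  (let ((cells (make-bytevector size 0)))
    (lambda (message . arguments)
      (match `(,message . ,arguments)
        (('memory-at address)
         (bytevector-u8-ref cells address))
        (('set-memory-at! address value)
         (bytevector-u8-set! cells address value))))))
```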
(define unallocated-memory 0)
(if (not-enough-memory?)
(collect-garbage! memory))
We are going to assume that, apart from the array of heap memory described
in the previous section, the machine also has a separate storage area
for the control stack, and that it also has an area of (read-only) program
memory, with a special register called next-instruction, which points to
the instruction to be executed in the next step of the computation.
Furthermore, it has a set of general-purpose registers.
;; auxiliary definitions:
(define (next)
(machine ’set-next-instruction!
(+ (machine ’next-instruction) 1))
(execute program machine))
(define (fetch-instruction)
(vector-ref program (machine ’next-instruction)))
Footnote 6: We chose a double stroke (||) to signify bit-wise or, because a single stroke is not
a proper symbol in Scheme. The double ampersand (&&) was chosen for consistency of
notation. Admittedly, those symbols can be confusing to people who are accustomed to
the C programming language, where these “doubled” symbols are used to denote logical
disjunction and conjunction rather than bit-wise operations, which are denoted using
single | and &.
The body of the execute function consists of a dispatch over the current
instruction, which makes the informal description given earlier precise.
;; the body of ‘execute’ begins here:
((’goto address)
(goto address))
((’push register/value)
(machine ’push! (value register/value))
(next))
((’pop register)
(machine ’set-value-in! register (machine ’pop!))
(next))
((’halt)
machine)
))) ;; the definition of ‘execute’ ends here
The compare function also resorts to the built-in numerical comparison predicates of Scheme:
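The elided definition might look along these lines (a sketch):

```scheme
(define (compare operator left right)
  (match operator
    ('= (= left right))
    ('< (< left right))
    ('<= (<= left right))
    ('> (> left right))
    ('>= (>= left right))
    ('<> (not (= left right)))))
```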
Footnote 7: Actually, the SRFI-60 functions were eventually included in the (rnrs arithmetic bitwise (6)) library specified in the R6RS document[70].
((’memory-at address)
(let ((bytes (map (lambda (i)
(bytevector-u8-ref memory i))
(range 0 (- MACHINE-WORD-SIZE 1)))))
((number/base 256) bytes)))
((’value-in register)
(let (((register . value) (assoc register registers)))
value))
((’next-instruction)
(this-machine ’value-in ’next-instruction))
((’set-next-instruction! value)
(this-machine ’set-value-in! ’next-instruction value))
((’push! value)
(set! stack (cons value stack)))
((’pop!)
(let (((top . below) stack))
(set! stack below)
top))
))
this-machine))
Note that we used the for control structure and the range function.
Although their meanings should be intuitive to the reader, they are not
part of standard Scheme, so they are defined in appendix A, just
like the functions number/base and machine-word-bytes.
(define factorial
’#((n <- 5) ;0
(acc <- 1) ;1
(if n = 0 goto 6) ;2
(acc <- acc * n) ;3
(n <- n - 1) ;4
(goto 2) ;5
(halt) ;6
))
It uses exactly two registers, called n and acc. It uses neither the stack
nor any memory cells, so it can run on a machine with no memory (other than
the registers):
In order to run the program on the machine, one simply has to type in
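Assuming a constructor, say make-machine, that builds the message-passing machine object shown earlier (the name and arity are our guess), running the program could amount to:

```scheme
(define tiny-machine (make-machine))
(execute factorial tiny-machine)
```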
After the computation terminates, the acc register contains the result
(which should be retrievable using the (tiny-machine ’value-in ’acc)
command).
3.2.4 Assembler
The machine code for computing the factorial function from the previous
section was written in a highly non-composable style, because it contained
instructions such as (if n = 0 goto 6) or (goto 2) – adding a single
instruction at the beginning of the program would ruin the logic of the
program.
For this reason, it is convenient to introduce labels to mark certain entry
points into the program. To represent the labels, we are going to use the
extension to Scheme known as keywords, as defined in the Scheme Request
For Implementation 88 [26] document, because they do not interfere with
our decision to use symbols to denote registers.
The program for computing the factorial expressed in this position-independent
way could look as follows:
’((n <- 5)
(acc <- 1)
factorial:
(if n = 0 goto end:)
(acc <- acc * n)
(n <- n - 1)
(goto factorial:)
end:
(halt))
where
4 Compilation
In the previous two chapters, we have shown the source language that we
wish to express our programs in, and the target language that models the
machine code that is actually used by real computers to perform
computations.
In a way, those two languages are complete opposites of each other:
the first one is about composing functions, allows no side effects such as
assignment, and provides an implicit memory model. The other makes memory
operations explicit, exchanges information solely by means of assignment,
and the only thinkable way of performing composition is by sequencing
operations and subprograms.
The transformation from the first sort of languages to the second has
traditionally been called compilation, and some of its popular techniques
will be presented in this chapter.
We shall begin by transforming Scheme programs into a special form
that does not contain any nested function calls, and hence should be easier
to translate into a program for our machine.
(define (delta a b c)
(- (* b b) (* 4 (* a c))))
Prior to computing the value of the whole expression, we need to have our
machine compute the values of the sub-expressions and store them somewhere.
Provided that our machine has a sufficient number of registers, we could
expect it to compile to the following sequence of machine instructions:
Footnote 1:
Note that we use a new register to hold the result of each intermediate computation. Of
(bb <- b * b)
(ac <- a * c)
(4ac <- 4 * ac)
(bb-4ac <- bb - 4ac)
Let us now ask the opposite question: given a sequence of machine
instructions, how can we express them in Scheme (or λ calculus)?
We typically imagine that a von Neumann machine operates by altering
its current state (and indeed, this is how we implemented our virtual machine
in Scheme).
However, we could imagine that there is something quite different going
on: each assignment to a register can be perceived not as actually altering
some value, but as creating a new scope where the original variable has been
shadowed with a new one (bound with the altered value), and where the rest
of the program is evaluated.
In order to clarify things a bit, we can define an auxiliary function pass
that takes a value and a procedure and simply passes the value to the
procedure:
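The pass function itself is presumably nothing more than function application with the arguments flipped:

```scheme
(define (pass value procedure)
  (procedure value))
```

so that (pass (* b b) (lambda (bb) ...)) first computes (* b b) and then hands the result over to the continuation.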
(bb <- b * b)
... the rest of the program ...
as
(pass (* b b)
(lambda (bb)
... the rest of the computation ...))
The procedure that represents the rest of the computation has traditionally
been called a continuation, and the form of a program where control is
passed explicitly to continuations is called continuation-passing style[73].
If we were to define our delta procedure using the continuation-passing
style, we would need to extend its argument list with a continuation, i.e.
a parameter that would explain what to do next with the value that our
function has computed. Moreover, we could demand that all the functions
that we use behave in the same way, i.e. that instead of the function pass,
course, the real machines usually have a limited number of registers, but the assumption
that each register is assigned exactly once leads to the form of programs called Static
Single-Assignment form (or SSA for short), which is used, for example, in the machine
language of the LLVM virtual machine.
we would have functions pass* and pass- that would compute the values
of operations * and - and pass them to their continuations. This way we
make sure that there are no nested expressions in our program.
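Under this convention, the operator-specific passing functions, and a CPS version of delta built from them, might be sketched as follows:

```scheme
(define (pass* a b continuation)
  (continuation (* a b)))

(define (pass- a b continuation)
  (continuation (- a b)))

;; delta in continuation-passing style: b*b - 4*a*c
(define (pass-delta a b c return)
  (pass* b b
         (lambda (bb)
           (pass* a c
                  (lambda (ac)
                    (pass* 4 ac
                           (lambda (4ac)
                             (pass- bb 4ac return))))))))
```

(Identifiers like 4ac, while used throughout this work, may be rejected by stricter Scheme readers.)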
(define (abs n)
(if (is n >= 0)
n
(- n)))
The corresponding machine code would look more or less like this:
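Assuming the conditional jump supports a >= comparison and that constants may appear as operands, the elided code might read:

```
abs:
(if n >= 0 goto end:)
(n <- 0 - n)
end:
(result <- n)
```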
The above code was trivial in that it used the condition that is directly
representable in our machine code. However in general we can place arbi-
trarily nested Scheme expressions as the if’s <condition> clauses.
Continuations can also be used for returning multiple values. For
example, the code that finds the roots of a quadratic equation would need to
check whether the discriminant ∆ is non-negative in order to proceed with
the computation of the roots:
(define (quadratic-roots a b c)
(cond ((is (delta a b c) > 0)
(values (/ (- (- b) (sqrt (delta a b c))) (* 2 a))
(/ (+ (- b) (sqrt (delta a b c))) (* 2 a))))
((is (delta a b c) = 0)
(/ (- b) (* 2 a)))))
Note that we have used the values form, which Scheme provides for
returning multiple values. Although we didn’t introduce it as part
of our host language, its meaning in the context of the discussion regarding
continuation-passing style is obvious (we simply pass more than one value
to the continuation). Of course, we could have instead returned a list of
values, but that would force us to fix on some particular representation
of lists, which we want to avoid at this moment.
Note also that we defined quadratic-roots to return meaningful values
only if the delta is either zero? or positive? – that is, if it is negative,
then the expression quadratic-roots has no values (or, in other words, its
value is unspecified).
Lastly, some readers may find it displeasing that we didn’t capture
the value of (delta a b c) using the let form (which is something that
we would normally do to avoid redundant computations and, more
specifically, not to repeat ourselves). We ask those readers to be forgiving,
as our goal at this point is to demonstrate the correspondence between
various Scheme programs and their CPS counterparts, rather than to promote
good programming practices. In other words, we are in the position of
a surgeon performing an operation on a patient: jogging is in general
good for health, but it would be insane to recommend it to
someone who is lying on the operating table with open veins.
The computation of quadratic-roots is a bit tricky:
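A sketch of that computation in continuation-passing style, with the non-trivial root computations elided just as in the original (pass-delta, pass> and pass= name the CPS versions of the corresponding operations; compute-both-roots and compute-single-root are placeholders for the omitted parts):

```scheme
(define pass-quadratic-roots
  (lambda (a b c return)
    (pass-delta a b c
      (lambda (delta/1)
        (pass> delta/1 0
          (lambda (delta>0/2)
            (if delta>0/2
                (compute-both-roots a b delta/1 return)
                ;else
                (pass= delta/1 0
                  (lambda (delta=0/3)
                    (if delta=0/3
                        (compute-single-root a b return)
                        ;else
                        (return)))))))))))
```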
Some (rather trivial) parts of the code for computing roots were omitted
for clarity. It should be clear now that for complex conditions we simply
compute the value of a condition, and then pass it to a continuation that
takes the result and, depending on its value, either executes the CPS version
of its <then> branch or the CPS version of its <else> branch.
Note also that – in order to avoid accidental name clashes – we generated
a new name for the result of each evaluated (or executed) expression.
Let’s now consider the following definition of the factorial function:
(define (factorial n)
(if (= n 0)
1
;else
(* n (factorial (- n 1)))))
We can imagine that its continuation-passing version could look like this:
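One such version, written with readable names (the mechanical transformation developed below produces the same shape, only with generated names):

```scheme
(define (pass-factorial n return)
  (pass= n 0
    (lambda (n=0)
      (if n=0
          (return 1)
          ;else
          (pass- n 1
            (lambda (n-1)
              (pass-factorial n-1
                (lambda (factorial/n-1)
                  (pass* n factorial/n-1 return)))))))))
```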
The questions that we need to ask are: (1) how do we transform arbitrary
functional Scheme code to continuation passing style and (2) how do we
transform continuation passing style program to machine code.
((function . arguments)
(let ((simple-arguments (map (lambda (argument)
(if (compound? argument)
(original-name argument)
;else
argument))
arguments)))
(passing-arguments arguments simple-arguments
‘(,(passing-function function)
,@simple-arguments
,continuation))))
(_
‘(,continuation ,expression))))
;; the definition of ‘passing’ ends here
Footnote 2: Although in this work we have consistently been passing the continuation as the last argument for the sake of clarity, in practice it might be a better idea to make it the first argument, because that would make it possible to handle variadic functions properly.
Footnote 3: Note that, for clarity of presentation, we depart from the definition of Scheme in that we do not allow complex expressions in the head (function) position.
((argument . next)
(let (((name . names) names))
(if (compound? argument)
(passing-arguments next names
(passing argument
‘(lambda (,name) ,final)))
;else
(passing-arguments next names final))))))
The passing-program function takes a program, that is, a sequence of
definitions followed by an expression, and converts each of the definitions to
continuation-passing style. For the sake of simplicity, we shall assume here
that all the definitions are function definitions. Note that we need to pass
the additional return argument, which is stripped away after the conversion.
We assume that the program passes its result to the exit continuation.
(define (passing-program program)
(let ((((’define names functions) ... expression) program))
‘(,@(map (lambda (name function)
(let (((’return pass-function) (passing function
’return)))
‘(define ,(passing-function name)
,pass-function)))
names functions)
,(passing expression ’exit))))
The complete code, with the implementations of passing-function and
original-name, can be found in appendix E.
We can check that the value of
(passing-program ’((define !
(lambda (n)
(if (= n 0)
1
(* n (! (- n 1))))))
(! 5)))
is the form
((define pass-!
(lambda (n return)
(pass= n 0
(lambda (n=0/1)
(if n=0/1
(return 1)
;else
(pass- n 1
(lambda (n-1/3)
(pass-! n-1/3
(lambda (!/n-1/2)
(pass* n !/n-1/2 return))))))))))
(pass-! 5 exit))
4.3 Generating machine code

A question we have not yet addressed is where arguments and return values
reside. On our machine, they can be passed through registers, through the
stack, or through the memory heap.
Typically, passing values through registers is the most efficient and therefore
the most desirable. However, since the number of registers in a CPU is usually
small, some other conventions often need to be established (for example,
the first few arguments can be passed through registers, and the remaining
ones through the stack or heap).
Another question is how a function can know where control should
be transferred after it finishes its execution. Typically, this information is
stored on the call stack, which holds the appropriate return address in the
caller’s code.
However, while the use of the stack is in general inevitable, it may
sometimes be more desirable to store the return address in a register, and only
save it on the stack when invoking another function (because this can decrease
the number of memory accesses, which are typically more expensive than
register accesses).
The latter option, although it may seem less obvious, allows us to perceive
function calls as gotos that pass arguments[73], where the return address is
just another argument to be passed.
(define pass-!
(lambda (n return)
(pass= n 0
(lambda (n=0/1)
(if n=0/1
(return 1)
;else
(pass- n 1
(lambda (n-1/3)
(pass-! n-1/3
(lambda (!/n-1/2)
(pass* n !/n-1/2 return))))))))))
factorial:
(if n <> 0 goto else:)
(result <- 1)
(goto return)
else:
(n-1 <- n - 1)
(push n)
(push return)
(n <- n-1)
(return <- proceed:)
(goto factorial:)
proceed:
(pop return)
(pop n)
(n-1! <- result)
(n*n-1! <- n * n-1!)
(result <- n*n-1!)
(goto return)
(define !+
(lambda (n a)
(if (= n 0)
a
;else
(!+ (- n 1) (* n a)))))
(define pass-!+
(lambda (n a return)
(pass= n 0
(lambda (n=0/1)
(if n=0/1
(return a)
;else
(pass* n a
(lambda (n*a/3)
(pass- n 1
(lambda (n-1/2)
(pass-!+ n-1/2 n*a/3 return))))))))))
factorial+:
(if n <> 0 goto else:)
(result <- a)
(goto return)
else:
(n*a <- n * a)
(n-1 <- n - 1)
(n <- n-1)
(a <- n*a)
(goto factorial+:)
The code does not perform any stack operations, and it is clear that the
function call is performed just as a simple goto with register assignment.
Note that the calling function has to know the names of the registers
that are used to pass arguments to the called function. It may also have
to know what registers are used by the called function internally (including
the registers used by all functions that are called by the called function, as
well as registers used by the functions called by these functions, and so on)
in order to know whether it should save their values on the stack before the
call, and restore them afterwards.
Also, the code generated by our procedure is wasteful with regard to the
number of used registers. Normally, computers have a limited number of
registers, and compilers try to reuse them as much as possible in order to
minimize the number of accesses to RAM (which is typically much slower
than manipulating register values).
It would therefore be more realistic to rename the arguments to functions
in a systematic way, and also minimize the number of registers that are
used within a procedure (this process is called register allocation in the
literature[84] [51]).
However, since these issues have very little to do with the substance of this
work, we will proceed with our assumption that the number of registers of
our machine is sufficient to perform any computation we desire (which, at
this very moment, is either computing a factorial or – ultimately – sorting
an array).
We therefore assume that each calling function knows at least the names
of the registers for each defined procedure; these will be available via the
argument-names helper function.
The code for branching is a bit tricky, as we need to undo some of the
effects of our CPS transformation to handle the conditionals properly (as we
noted in chapter 2, Scheme provides the Boolean values #true and #false, but
here – for simplicity – we assume that the instruction (if a >?< b goto
c) can only be generated from code of the form (pass<?> a b (lambda
(a<?>b) (if a<?>b ...))). Furthermore, to attain some readability, we
shall invert the condition in the comparison and jump to the
else branch). We use the sign function, which converts names like pass=
or pass+ to operators like = or +.
We generate a new label for the <else> branch, add a and b to the set
of used registers, generate a branching instruction followed by assembly for
the <then> expression, followed by the label for the <else> branch, followed
by machine code for the <else> expression.
((operator . operands)
(call operator operands registers))))
;; the definition of ‘assembly’ ends here
((defined-function? operator)
(call-defined operator operands registers))
((anonymous-function? operator)
(call-anonymous operator operands registers))))
(match continuation
((’lambda (result) body)
(let* ((proceed (new-label ’proceed))
(sequel (assembly body registers))
(registers (intersection registers
(used-registers sequel))))
‘(,@(save registers)
,@(pass arguments function)
(push return)
(return <- ,proceed)
(goto ,entry)
,proceed
(pop return)
,@(restore registers)
,@sequel)))
Given all these helper functions, we can now express the compilation
of a whole program. As noted earlier, we assume that a program is a
sequence of function definitions followed by a single expression. We therefore
need to compile both the definitions and the expression. Furthermore, we
need to take into account what should happen after our program finishes its
execution. Obviously, we want our machine to halt.
Footnote 4: A careful reader probably noticed that we’re lacking the definitions of anonymous-function? and call-anonymous. These definitions are trivial, as the call to an anonymous function boils down to register assignment followed by execution of the assembly code of the body of that function. They would contribute nothing to the examples presented here, so we allowed ourselves to omit them. They are of course available in appendix E.
We can observe that the programs for computing the factorial function
behave roughly as we expected them to: the tail-recursive version does not
perform any stack operations and only uses goto to transfer control. The
other version saves the return register on the stack prior to the call, along
with other registers that are needed in the sequel.
4.4 Conclusion
Although the compiler presented in this chapter successfully transforms
some high level functions to efficient machine code, it is of course by no
means complete. It does not handle higher order functions properly, nor
does it support arbitrary precision arithmetic. Moreover, it does not perform
any register allocation and uses a new register for storing each intermediate
result, which makes it inapplicable to real machines. It does, however, serve
its purpose, in that it gives a rough overview of the compilation process.
5 Reasoning about programs
In the previous chapter we have seen how an arbitrary Scheme program can
be transformed to a particular form that has certain properties which make
it suitable for execution on a sequential machine. In particular, this form
specified the order of evaluation of arguments, which would otherwise be
unspecified.
In this chapter, we will present a broader class of Scheme to Scheme
transformations, called equational reasoning.
As the name suggests, the purpose of these transformations is reasoning,
that is, drawing certain conclusions about programs.
Broadly speaking, we have already seen a simple example of a reasoning
system, namely – the evaluator itself, which allowed us to conclude the
values of expressions for given arguments.
However, this system only allowed us to establish some very specific
properties, such as that the value of the expression (+ 2 2) is the number
4.
For the purpose of this work, we would like to be able to prove our claims
about some more abstract properties of our program, like that the qsort
function actually sorts a given sequence, or that at least the length of its
output is the same as the length of its input, and that all the elements that
were present in the input are also present in the output.
(define (equal-same x)
(equal? (equal? x x) #true))
(assume equal-same)
(define (equal-if x y)
(if (equal? x y) (equal? x y)))
(assume equal-if)
(assume if-true)
Footnote 3: Choosing two names that differ by only a single character to denote two completely
opposite notions may not seem to be a very good idea. We are drawing inspiration here
from the creator of the Scala programming language, Martin Odersky, who did the same
thing choosing the names val and var for declaring immutable and mutable variables,
respectively. We are hoping that, since the worst ideas in Computer Science seem to also
be the ones that last the longest, this work would actually turn out to be influential in
some regards.
(assume if-false)
(assume if-same)
(assume if-nest-then)
(assume if-nest-else)
(assure negation-inversion)
(if condition
(if (if condition #false #true) result #true)
(if (if condition #false #true) result #true))
(if condition
(if #false result #true)
(if (if condition #false #true) result #true))
(if condition
(if #false result #true)
(if #true result #true))
We can now reduce the same <then> branch as before using if-false:
(if condition
#true
(if #true result #true))
(if condition
#true
result)
We can now substitute this result into the original context, i.e. as the right
hand side of the equal? expression from the definition of negation-inversion:
We see that the right hand side is identical to the left hand side, which
allows us to apply the equal-same rule, yielding
#true
The core function for our reasoning system should take an expression, a
path to the sub-expression of our interest, and an axiom with a hint specifying
how it is meant to be used, and it should return an expression with the
sub-expression transformed appropriately. For example,
Footnote 6: It should be easy to see that the focus function could also be defined using fold-left
over list-ref:
(define (focus expression path)
(fold-left list-ref expression path))
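For instance, with the zero-based indexing of list-ref, the path (2 1) focuses on the first operand of the inner application:

```scheme
(focus '(if a (f b) c) '(2 1))  ;; => b
```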
and
should evaluate to
Note also that the rule of inference needs to be able to access the axioms,
theorems and definitions that we refer to. However, since it would be
inconvenient to pass them around to the rewrite function, we will make use of
an extension to Scheme known as parameters[25], and have the current-book
parameter default to the core axioms and definitions.
In our rewriting rule, we need to differentiate between theorems
(including axioms) and definitions because, in the case of definitions, we
can only replace the definiendum with the corresponding definiens, while in the
case of theorems, we can replace one side of the conclusion with the other
(this distinction should make it clear why we decided to mark axioms and
theorems using the assume and assure keywords):
the symbol quote or any of the symbols contained in the variables list
(this condition can be assured by systematic α renaming of all the bound
variables of a program [73]).
The replace-subexpression function is defined as follows:
(_
’())))
(Note that the code assumes that all the derived special forms such as
and, or and single-armed if are expanded).
For example, conclusions+premises of the body of if-nest-then, i.e.
(conclusions+premises
’(if condition
(equal? (if condition then else) then)
#true))
and the value for the body of if-nest-else is also a singleton list:
’(if c
(if a
’(2 2)
’(2 3))
(if b
’(3 2)
’(3 3)))
checking:
5.3 Totality
The rewrite-definition function assumed that there is nothing wrong
with replacing an application of a function with the body of that function,
where the formal arguments are replaced with the values that the function is
applied to – just as in most circumstances there is nothing wrong with
replacing an application of a function to some arguments with the actual
value of that function for those arguments.
However, it is not obvious that a function actually has a value. Consider
the following definition:
(define (partial x)
(not (partial x)))
The function partial is not a total function, because it does not have
a defined value for every argument (as a matter of fact, it doesn’t have a
definite value for any argument), and consequently, a program whose value
relies on the value of partial function may itself have no definite value.
The Boyer-Moore system doesn’t allow expanding the definitions of
functions that have not been proven total, because they could be used to prove
a contradiction[32], thereby depriving the deductive system of its cognitive
value. Therefore, in order to be able to rewrite-definition in a legitimate
way, a totality claim for that function must be proven first.
While it is impossible to provide a universal procedure that would decide
whether a given function is total, there exist certain classes of functions for
which such proofs can be derived by purely mechanical means.
In the case of recursive functions it is easy to see that if the argument
that controls the recursion shrinks (in some general sense) towards the base
case with each recursive call, then the function will eventually reach its
base case and terminate.
This “general sense of argument shrinking” is called a measure of a
function, which is a function that maps arguments to natural numbers. For
many arithmetic functions, a common measure is just the identity function.
For functions whose arguments are structures/expressions, the measure can
be defined as
(define (size x)
(if (pair? x)
(+ 1 (size (car x)) (size (cdr x)))
;else
0))
(define (natural?/size x)
(equal? (natural? (size x)) #true))
(assume natural?/size)
Footnote 7: Note that, especially in lazy languages, there are functions that do not satisfy this
condition, but have a definite value nevertheless. Consider, for example, the definition:
(define (numbers-from start)
(cons start (numbers-from (+ start 1))))
While this function may call itself potentially indefinitely many times, it is total (in the
domain of numbers).
(define (size/car x)
(if (pair? x)
(equal? (< (size (car x)) (size x)) #true)))
(assume size/car)
(define (size/cdr x)
(if (pair? x)
(equal? (< (size (cdr x)) (size x)) #true)))
(assume size/cdr)
The following function can be used to obtain the totality claim for any
recursive function:
((function . arguments)
(if (equal? function name)
‘(and (< ,(substitute args arguments measure) ,measure)
. ,(map claim arguments))
;else
‘(and . ,(map claim arguments))))
(_
#true)))
(define (append a b)
(if (pair? a)
(cons (car a) (append (cdr a) b))
;else
b))
We can obtain its totality claim by evaluating
(totality-claim ’append ’(a b)
’(if (pair? a)
(cons (car a) (append (cdr a) b))
b)
’(size a))
which produces
(and (natural? (size a))
(and (and #true)
(if (pair? a)
(and (and #true)
(and (< (size (cdr a)) (size a))
(and #true)
#true))
#true)))
Apparently, our totality claim contains many redundant (and #true)
and #true conditions. They can be easily removed by expanding and into if
and applying the if-true axiom:
(if (natural? (size a))
(if (pair? a)
(< (size (cdr a)) (size a))
#true)
#false)
The proof is done by applying the natural?/size and if-true axioms,
which allow us to rewrite this formula as
(if (pair? a)
(< (size (cdr a)) (size a))
#true)
and size/cdr allows us to reduce the expression to
(if (pair? a)
#true
#true)
It is now easy to see that this expression is equal? to #true (by if-same).
5.4 Induction and recursion
(define (list? l)
(or (equal? l '())
(and (pair? l)
(list? (cdr l)))))
(assure associative?/append)
List induction over the l1 argument provides us with the following claim:
(define (car/cons x y)
(equal? (car (cons x y)) x))
(assume car/cons)
(define (cdr/cons x y)
(equal? (cdr (cons x y)) y))
(assume cdr/cons)
(define (pair?/cons x y)
(equal? (pair? (cons x y)) #true))
(assume pair?/cons)
(define (cons/car+cdr x)
(if (pair? x)
(equal? x (cons (car x) (cdr x)))))
(assume cons/car+cdr)
(define (cons-equal-car x y z)
(equal? (equal? (cons x z) (cons y z))
(equal? x y)))
(assume cons-equal-car)
(define (cons-equal-cdr x y z)
(equal? (equal? (cons x y) (cons x z))
(equal? y z)))
(assume cons-equal-cdr)
Continuing our proof, we can now replace the expressions (append (cons
x l1) ...) with the body of the definition of append:
Some irrelevant bits of the expression were replaced with ###. They
appear twice, in the alternatives of if expressions whose conditions are
[pair? (cons x l1)] – which, by virtue of pair?/cons, are equal? to
#true.
Similarly, the expressions [car (cons x l1)] can be replaced with x by
car/cons, and [cdr (cons x l1)] can be replaced with l1 by cdr/cons,
yielding
This allows us to perform the if-lifting and transform the whole ex-
pression to
We can now use the equal-if axiom to rewrite whichever side of equality
we choose, say, left to right:
5.5 Conclusion
The purpose of the proofs presented in this chapter was to exemplify some
methods that are useful for proving properties of programs. It is hard to
deny that – without any assistance from computer tools that help to trace
nested parentheses – the structures of expressions may seem obscure, and
indeed, some more advanced typesetting features would certainly be helpful.
We hope that the presentation was instructive nevertheless.
ACL2, the descendant of the original Boyer-Moore system, is capable of
proving a large class of theorems about programs automatically, using
principles that were laid out in this chapter. The source code of ACL2 is
publicly available9.
This chapter ends the first part of this work, whose purpose was to
present various tools that can be helpful for the task that we set to ourselves
in the first chapter.
9 https://github.com/acl2/acl2
Part II
The Substance
6
List recursion and array-receiving style
By now, we should have a fairly detailed idea of how Scheme programs
ought to be executed on register machines, and how to check whether our
programs possess certain properties that are of interest to us.
In this chapter we will try to formulate certain properties that should be
useful for us if we wish to make our compiler use arrays in place of linked
lists.
(define (map f l)
(if (null? l)
'()
;else
(cons (f (car l)) (map f (cdr l)))))
1 For the clarity of presentation, we are not going to employ the match and quasiquote macros in our subject programs, and only use them in the meta-programs.
(define (reverse-map f l)
  (define (traverse l result)
    (if (null? l)
        result
        ;else
        (traverse (cdr l) (cons (f (car l)) result))))
  (traverse l '()))
The problem with this function is that the elements of the output list
are in reverse order – for example, (reverse-map square '(1 2 3))
would construct the list (9 4 1). Of course, we could now define map by using
reverse-map with the identity function
(define (map f l)
(reverse-map f (reverse-map (lambda (x) x) l)))
and while, defined this way, map would indeed use a constant amount of
stack space, it would traverse the list twice, and needlessly generate (length
l) cons-cells of garbage.
(define (map! f l)
(define (iterate point)
(if (null? point)
l
;else
(begin
(set-car! point (f (car point)))
(iterate (cdr point)))))
(iterate l))
The above code refers to some procedures that are a part of our interface
to arrays, namely beyond?, next, start, memory-ref and memory-set!.
The start function returns a pointer to the first element of an array, the
beyond? predicate checks whether a given pointer points outside of an array,
and the next function returns a pointer to the element following a given one.
The memory-ref and memory-set! procedures read and modify the value of
the memory cell pointed to by a given pointer.
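To give the flavour of this interface, a map-like traversal over arrays might be sketched as follows (a sketch only: the name array-map and the argument order of memory-set! are our assumptions, and the target array is assumed to have been allocated by the caller):

```scheme
(define (array-map f source target)
  ;; walk both arrays in lockstep, writing (f x) for each
  ;; element x of source into the corresponding cell of target
  (define (iterate source-index target-index)
    (if (beyond? source-index source)
        target
        ;else
        (begin
          (memory-set! target-index (f (memory-ref source-index)))
          (iterate (next source-index) (next target-index)))))
  (iterate (start source) (start target)))
```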
For now, we deliberately avoid providing any concrete implementation
of this interface, so that we don't need to decide whether all the elements
of an array have uniform size, nor whether the addresses of subsequent
elements should be ascending or descending.
The questions that arise are:
1. Under what circumstances can we transform a recursive function to
the array-receiving style?
We don't know the exact answers to these questions, but we shall propose
an initial attempt at addressing them.
By analogy with the term tail recursion, we shall call the circumstances
under which we can use arrays instead of lists a list recursion (to be
distinguished from tree recursion or free recursion).
The first approximation of a list recursion is that it is a function f whose
tail expression is either:
• a recursive call to f, or
The last condition could actually be loosened a bit: the second argument
to cons could itself be a cons, and so on, until we make the recursive call, or
it could be a call to a function which evaluates to a list or to an expression
of the form (cons x y), where y is an argument of that function, and it is
bound to the value of the recursive call (f . args). However important,
these nuances obscure the point that we are trying to make, so we shall
ignore them for the moment.
In order to put what we have just said more formally, we need to specify
what we mean by tail expressions of a given expression. When our expres-
sion has a form (if <test> <then> <else>), then its tail expressions are
the tail expressions of <then> and tail expressions of <else>. Otherwise,
if it has the form ((lambda <args> <body>) . <values>), then the tail
expressions are the tail expressions of <body> with <values> substituted
for <args> throughout. Otherwise, the tail expressions are a singleton con-
taining only the expression itself:
((function . arguments)
(and (every (lambda (arg)
(not (calling? arg name)))
arguments)
(equal? name function)))
(_
(is tail member args))))
((function . args)
(or (equal? function name)
(calling? function name)
(any (lambda (arg)
(calling? arg name))
args)))
(_
#false)))
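The prose definition of tail expressions given above can be written down directly as a meta-program; the following sketch assumes the substitute helper used in chapter 5:

```scheme
(define (tail-expressions expression)
  (match expression
    (('if test then else)
     (append (tail-expressions then)
             (tail-expressions else)))
    ((('lambda args body) . values)
     ;; substitute the values for the formal arguments in the body
     (tail-expressions (substitute args values body)))
    (_
     `(,expression))))
```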
While these notions clearly need elaboration (for example, the fact
that we use substitute with the recursive call makes it possible to construct
forms that would never terminate), they should be sufficient to indicate some
conditions that permit us to convert a list-recursive function to the
array-receiving style.
(array-receiving '(map (f l)
                       (if (null? l)
                           '()
                           (cons (f (car l))
                                 (map f (cdr l))))))
(match expression
((function . args)
(if (is function member '(cdr null?))
(apply union (map possibly args))
(apply union (map list-arguments args))))
(_
'())))
(('cdr expression)
 (if (is expression member list-args)
     `(next ,(symbol-append expression '-index))
     ;else
     argument))
((function . args)
 `(,function . ,(map convert-argument args)))
(_
 argument)))
(('null? x)
 (if (is x member list-args)
     `(beyond? ,(symbol-append x '-index) ,x)
     ;else
     expression))
(('quote ())
 'target)
((function . arguments)
 (if (eq? function name)
     `(step ,next-target
            . ,(map convert-argument arguments))
     ;else
     `(,function . ,(map convert-argument arguments))))
(_
 (if (is expression member list-arguments)
     (symbol-append expression '-index)
     ;else
     expression))))
Of course, there are means that allow us to avoid accidental name clashes
with names such as step or target, but for the time being we ignore
this issue completely.
While the array-receiving function may not be perfect, it is general
enough to be able to transform some functions other than map. For example,
one can easily check that the array-receiving version of the function range
defined as
is the following:
(define (filter p l)
(if (null? l)
'()
;else
(if (p (car l))
(cons (car l) (filter p (cdr l)))
;else
(filter p (cdr l)))))
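For example:

```scheme
(e.g.
  (filter even? '(1 2 3 4 5 6)) ===> (2 4 6))
```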
The universality of Lisp-based systems stems from the fact that the cons
operator, invoked from within functions, is responsible for memory allocation,
while the responsibility for reclaiming memory that is no longer in use
belongs to the garbage collector.
The array receiving style, however, transfers the burden of memory al-
location from a callee to a caller. In order for this to be possible, the caller
needs to know how much memory the called function is going to need, and
allocate it prior to the call, or – if it is able to prove that some sufficiently
large area of memory won’t be used in the rest of the program – reuse some
previously allocated area.
While this problem can be hard to solve in general, there are clearly
situations in which it is relatively easy. For example, it should not be hard
to prove the following lemmas (assuming the totality of f and p):
(define (map-length f l)
(if (and (list? l) (unary-function? f))
(equal? (length (map f l))
(length l))))
(assure map-length)
(define (append-length a b)
(if (and (list? a) (list? b))
(equal? (length (append a b))
(+ (length a) (length b)))))
(assure append-length)
(assure range-length)
(define (filter-length p l)
(if (and (list? l) (unary-predicate? p))
(equal? (max (length l) (length (filter p l)))
(length l))))
(assure filter-length)
These lemmas can be used to infer the amount of memory that needs to
be allocated for a given function. Note that, depending on the situation, this
information doesn't necessarily need to be available prior to a function call:
one can imagine that the target argument to an array-receiving function
could be located at the end of the heap, and the array could grow as needed.
Reusing memory
However, if we are able to infer the size of the output of a function prior to
the call, we could potentially overwrite some object that is no longer
needed for the computation. This, in turn, could decrease the program's reliance
on garbage collection, increasing overall performance.
This observation, in turn, leaves us with the following question: how
can we know the lifetimes of heap allocated objects? The intuitive answer
is that these lifetimes span between the creation of an object, and the last
point at which any of the variables referring to that object is used.
One can imagine at least two counterexamples to this intuition, though.
The first one is a function which might return some of its arguments. Con-
sider the following procedure:
(define (random-argument . arguments)
(list-ref arguments (random (length arguments))))
Unless we force a function like this to copy its return value
(which would likely be unreasonable, given the goal we set for ourselves), we
cannot assume that any of its arguments is no longer used in the
code following the call to that function – at least as long as the result of that
procedure is still in use.
Of course, it might be tempting to ask under which circumstances we can
prove that a function does not return any data structure shared by any of
its arguments – and this is indeed an interesting question. For the
time being, though, we shall allow ourselves to leave it unanswered.
Shared objects
The second example is of greater significance to us, because it has more
to do with the goal that we set for ourselves in the first chapter. It might
be the case (often a desirable one) that a list produced by some function
should be a part of another list. In particular, if a function application (or
its result) is the first argument to append, we would wish to arrange the
computation just by placing the memory areas of its arguments side by side,
avoiding any actual calls to append and memory copying whatsoever.
It therefore seems that the question, how do we prove that an allocated
object is no longer needed, is in general non-trivial, and instead of solving
it for the general case, we need to focus on some particular cases that serve
our goal.
Let’s consider a simple example of the aforementioned optimization of
append:
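The exact definition is not essential for the argument; any definition along the following lines would do (the bounds passed to range are our assumption, chosen so that the length of numbers is amount):

```scheme
(define (numbers&squares amount)
  (let* ((numbers (range 0 amount))
         (squares (map square numbers)))
    (append numbers squares)))
```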
We can infer from the range-length lemma that the length of numbers
is amount, and likewise – from map-length – that the length of squares
is the length of numbers, i.e. amount. Finally, by append-length we
can conclude that the amount of memory that has to be allocated for
numbers&squares is (+ amount amount).
We could therefore expect that – if the numbers&squares function isn't
itself meant to be array-receiving – it could be transformed to the following
form:
((operator . operands)
`(,(reduce operator) . ,(map reduce operands)))
(_
expression)))
(('map f l)
 (size l))
(('range lo hi)
 `(max 0 (- ,hi ,lo)))
2 We use the name size in a sense that is different from the one characterized in chapter 5, where it meant the number of cons cells used by an object. Here we take size to mean the number of elements of a sequence. We hope that the reader doesn't get confused by this ambiguity.
(('filter p l)
 (size l))
(('append x y)
 `(+ ,(size x) ,(size y)))
(('cons x y)
 `(+ 1 ,(size y)))
(_
 `(length ,expression))))
The transformation itself needs to split the definitions into the categories
specified at the beginning of this section:
It will turn out that we will want to check whether a given expression
is an application of an array-receiving function, i.e. whether it belongs to
the list-recursive set3:
((function . arguments)
(let* (((name args body) (find (lambda ((name _ _))
(equal? name function))
list-recursive))
(list-args (intersection (list-arguments body)
args))
(args/array-passing (filter array-receiving?
arguments)))
(array-passing-library
 '((define (map f l)
     (if (null? l)
         '()
         (cons (f (car l))
               (map f (cdr l)))))
(define (square x)
(* x x))
((define (square x)
(* x x))
The first difference could be fixed by elaborating our method for obtaining
array-passing versions of functions: after performing the reduce
operation, we would need to extract common sub-expressions and perform
some analysis to see whether some calls could be eliminated without any
harm to the result of the computation.
The fact that arguments aren’t passed explicitly, but only cause side
effects on the content of the memory area, is more bothersome in the case of
functions such as filter, whose result size cannot be known a priori.
6.3 Conclusion
We have presented a sketch of a method that allows us to replace functional
programs that operate on lists with destructive programs that operate
on arrays.
Surely, the method is far from perfect, but it works for some simple
examples.
Some problems, like the synthesis of the size function (defined on page
105) for a given set of array-receiving definitions, require some elaboration
and seem to form whole research topics on their own.
Others – like the lifetime analysis of some specific heap areas –
reveal a lot of similarities to topics that are already well examined in
computer science, and in the field of compiler construction in particular.
7
Trying to make Quicksort quick again
Quicksort:
(if first >= last goto end:)
(push return)
(return <- partitioned:)
(goto Partition:)
partitioned:
(push last)
(push result)
(return <- left-sorted:)
(last <- result - 1)
(goto Quicksort:)
left-sorted:
(pop result)
(pop last)
(pop return)
(first <- result + 1)
(goto Quicksort:)
end:
(goto return)

Partition:
(pivot <- [last])
(trail <- first - 1)
(front <- first)
loop:
(if front >= last goto done:)
(item <- [front])
(if item > pivot goto next:)
(trail <- trail + 1)
(swap <- [trail])
([front] <- swap)
([trail] <- item)
next:
(front <- front + 1)
(goto loop:)
done:
(result <- trail + 1)
(item <- [result])
(swap <- [last])
([result] <- swap)
([last] <- item)
(goto return)
functions. For this reason, they won’t be very useful to us. Instead, we
prefer to obtain an imperative version of Quicksort in Scheme.
(else
(parts! back (+ front 1)))))
(parts! 1 1)))
1 It may seem surprising that array-set! takes the value as its second argument, and the array index as its last argument – contrary to vector-set! known from Scheme. The code that we're showing here has been written and tested with Guile Scheme, which supports its own API for shared and multidimensional arrays [48].
The (split-at list n) function defined in [67] returns two values – the
first one contains the first n elements of the original list, and the second
one – the remaining elements.
This allows us to express qsort in the following way:
7.2 Transformation
The question which now arises is: under what circumstances are we allowed
to rewrite quicksort to something like quicksort!, and Hoare-partition
to something like Hoare-partition!?
Before attempting to answer it, let's note that there are a few bothersome
things about the definitions of quicksort! and Hoare-partition!. As we
noted before, the base case of Hoare-partition! contains the (swap! 0
middle array) instruction, which silently inserts the pivoting element of the
array in between the slices.
Furthermore, the algorithm works because the slices returned by
Hoare-partition! belong to a contiguous region.
Lastly, the quicksort! procedure doesn't give a clue about the structure
of the result, contrary to its functional counterpart.
This observation prompts the following hint: perhaps we could
use the structural information contained in the `(,@(quicksort below)
,head ,@(quicksort above)) expression to automatically generate a call
to swap!?
(define (insertions x l)
  (match l
    (()
     `((,x)))
    ((head . tail)
     `((,x ,head . ,tail) . ,(map (lambda (y)
                                    `(,head . ,y))
                                  (insertions x tail))))))
(define (permutations l)
  (match l
    (()
     '(()))
    ((head . tail)
     (append-map (lambda (sub)
                   (insertions head sub))
                 (permutations tail)))))
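For instance:

```scheme
(e.g.
  (insertions 1 '(2 3)) ===> ((1 2 3) (2 1 3) (2 3 1)))

(e.g.
  (permutations '(1 2 3)) ===> ((1 2 3) (2 1 3) (2 3 1)
                                (1 3 2) (3 1 2) (3 2 1)))
```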
The problem with the ordering? lemma is that it does not conform
to the specification of <rule> from chapter 5, and hence we wouldn’t know
how to use it. It can be expressed in a more operational form:
We can deduce that there is a free cell available to the left of the below
list (actually, it is occupied by the value of head, but we don't care about
that too much, since we already managed to store this value in a local
variable), and that – in order to make the result of the append function fit
the allocated storage – we need to move its last element into that free cell
and then place the value of head in the previous position of the last element.
In other words, given the appropriate circumstances, we wish to trans-
form the above invocation of append into something like
(begin
(when (is (length below) > 0)
(array-set! list (last below) 0)
(array-set! list head (length below)))
(quicksort! below)
(quicksort! above)
list)
Note that the array argument in the Parts! helper function
doesn't change between the calls, and could therefore be removed.
A much more puzzling question is: what makes this transformation so
straightforward? Unfortunately, we have no answer to it. The fact is that
the code for Hoare-partition was itself derived from code that was
based on array slices.
The transformation of quicksort is somewhat more complex. It exploits
certain properties regarding the memory layout of allocated objects that we
talked about earlier. We hope that the intended meaning can be inferred
from the names of predicates that are used to express these properties.
The () pattern
The pattern () is mapped to the condition (is (array-length array) =
0). Moreover, the result ’() is mapped to the value of array. Unlike in the
case of the functional variant, we cannot simply return any empty array (in
particular, we cannot allocate a new empty array, or return some generic
object that would represent an empty array).
Our optimization would need to figure out that the result we’re returning
is actually contained in one of its arguments. Deciding which argument it
is supposed to be is not an easy task in general (on the other hand, in this
particular case we don’t have many candidates to consider).
expression, and
is simply mapped to
Of course, in the case of actual code, all the let, let*, match and is
forms would be expanded into if, lambda and call-with-values forms prior
to the transformation.
Intuitively, the validity of this principle stems from the requirement that
the function must be a deterministic permutation, which means that when
it is passed any permutation of a collection of elements, then the order of
elements in its result will always be the same.
Now this principle may seem very particular, as if it were cut out espe-
cially to tackle our problem, and we admit that this was indeed the case.
4. from the point of view of the rest of the computation, before can
be treated as a set, i.e. the order of the elements contained in it is
irrelevant;
time, swifter methods for optimizing programs will be developed.
Furthermore, this effort need only be made by people who specialize in such
optimization, allowing the majority of language users to express their
programs in a way that is simply convenient, without having to worry too much
about their performance.
We also believe that the approach to program optimization presented in
this work may prove itself much more scalable than the more conventional
approach, where programmers achieve speed-ups by modifying their original
programs, because the same optimization could be used by more than just
one program.
Of course, when it comes to optimization, it is reasonable to focus on
the most common cases first. For this reason, we suggest that it might be a
good idea to extend our transformation to handle matrix operations (where
a matrix is to be represented as a list of lists of equal length).
Moreover, we believe that the development of a formal system for reason-
ing about the time and space complexity of functions would be an important
step towards making an automatic system for optimizing programs.
Another idea to pursue is to have the compiler deduce the entropy of
certain variables to minimize the amount of bits that are used to represent
them. This way, it should sometimes be possible to have a couple of values
stored in a single register.
An even more radical idea is to feed the compiler with both the program
and the description of the instruction set of the target processor, and have
it automatically come up with a sequence of instructions that is (in some
sense) isomorphic with the original program.
Judging by the number of conferences, functional programming tech-
niques have recently been getting more recognition in the industry. A likely
reason for this state of affairs is that computer hardware is cheap and fast
enough to run programs that were written with readability and
maintainability in mind, rather than performance – and indeed, the costs
of programmers' mistakes and overlong development time often exceed the
costs of hardware by a few orders of magnitude. It is therefore reasonable
to search for techniques of increasing software reliability, even if the price to
pay is increased consumption of computing resources.
However, the availability of cheap processing power should not be an
excuse for wasting it. We believe (and hope that we have managed to show this
to some extent in this work) that functional programs may benefit from better
maintainability without any performance loss whatsoever, and that the
process of programming could be simplified further by moving the burden
of dealing with data structures from the programmer to the compiler. As
programming is becoming a more and more popular activity, while
programmers are not necessarily becoming more competent, we suspect that this
could even prevent some catastrophes in the future.
Part III
Appendices
Appendix A
Non-standard functions
(define (subset? x y)
(every (is _ member y) x))
The for loop used in chapter 3 is a macro that could be defined in the
following way:
(define-syntax for
(syntax-rules (in)
((for element in sequence . actions)
(for-each (lambda (element) . actions)
sequence))))
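For example, the following loop displays each element of the list in turn:

```scheme
(for x in '(1 2 3)
  (display x))
```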
(e.g.
(range 1 10)
===> (1 2 3 4 5 6 7 8 9 10))
(e.g.
((number/base 2) '(1 0 0)) ===> 4
((number/base 10) '(1 0 0)) ===> 100)
(e.g.
((digits/base 2) 4) ===> '(1 0 0)
((digits/base 10) 140) ===> '(1 4 0))
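A definition of digits/base consistent with these examples might look as follows (our sketch, using the curried define described in appendix F; the helper digits is our own name):

```scheme
(define ((digits/base base) number)
  ;; accumulate the digits starting from the least significant one,
  ;; so that the result comes out most-significant-first
  (define (digits n accumulated)
    (if (= n 0)
        accumulated
        ;else
        (digits (quotient n base)
                (cons (remainder n base) accumulated))))
  (if (= number 0)
      '(0)
      ;else
      (digits number '())))
```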
Appendix B
Y-lining
(even? 5)
Appendix C
Macro expansion
Prior to evaluation, we need to convert all the special forms like let or and
into a program consisting solely of the primitive forms lambda and if.
The R5RS specification of Scheme provides a special language for defining
new syntactic extensions, called syntax-rules.
While there are free implementations available, we believe that it is too
complex for our purpose. Instead, we are going to propose a language that
is similar, but slightly simpler.
As in the case of syntax-rules, we shall be writing down our macros
using patterns and templates. For example, we’d like to be able to define
the core Scheme macros in the following way:
(define core-macros
  '((('let ((name value) ...)
       . body)
     (('lambda (name ...) . body) value ...))
    (('let* () . body)
     ('begin . body))
    (('and)
     #true)
    (('and last)
     last)
    (('or)
     #false)
    (('or last)
     last)
but
((head/pattern . tail/pattern)
(match form
((head/form . tail/form)
(let ((bound (apply bind head/pattern head/form
bound-variables)))
(and bound
(apply bind tail/pattern tail/form bound))))
(_
#false)))
(_
(if (symbol? pattern)
(merge-bindings `((,pattern . ,form)) bound-variables)
;else
(and (equal? pattern form)
bound-variables)))))
where
We have used a feature of the cond form that we didn't describe in
chapter 2: if the condition is followed by the => symbol, then the following
expression must be a function of one argument.
If the value of the condition is other than #false, then it is passed to
that function, yielding the value of the cond expression.
As we can see, the definition of bind is rather straightforward: we must
only consider seven cases. The first one is the occurrence of a literal, which
is compared using equal?.
The second is a pattern followed by an ellipsis. Since it is a bit complex,
it is handled by a separate function called bind-sequence that is explained
below1.
The third is when pattern is a pair. In this case, the form being pattern-
matched must also be a pair, and we should be able to bind the head of the
pattern with the head of the form, and the tail of the pattern with the tail
of the form, unifying the bindings.
Otherwise, the pattern is either a literal (such as a number) or a symbol.
If it is a literal, it is compared with the form using the equal? predicate.
Otherwise it may either be bound or unbound. If it is bound, then
form must be equal? to the bound value. Otherwise, a new binding is
added to the bound-variables.
Adding support for ellipses is a bit tricky. When we encounter the ...
symbol, we need to make sure that we’re both able to match some prefix
of the form so that each element matches the pattern preceding the ...
symbol, and that the part of the form that didn’t get into the prefix matches
the remainder of the pattern.
The bind-sequence function will need to call bind recursively on some
prefix of the form being pattern-matched, and then zip the resulting bindings
(note that the zip-bindings function requires that the order of bindings is
the same for each invocation of bind):
(e.g.
(zip-bindings '(((a . 1) (b . 2) (c . 3))
((a . 4) (b . 5) (c . 6))
((a . 7) (b . 8) (c . 9))))
===> ((a 1 4 7) (b 2 5 8) (c 3 6 9)))
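A definition of zip-bindings consistent with this example might be the following (a sketch; it assumes that every binding list mentions the same names in the same order):

```scheme
(define (zip-bindings list-of-bindings)
  ;; take the variable names from the first binding list,
  ;; then collect each name's values across all binding lists
  (let ((names (map car (car list-of-bindings))))
    (map (lambda (name)
           `(,name . ,(map (lambda (bindings)
                             (cdr (assoc name bindings)))
                           list-of-bindings)))
         names)))
```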
The carry function is used for testing smaller and smaller prefixes (and –
accordingly – longer and longer suffixes) until it finds a division that satisfies
the condition:
(e.g.
(prefix-length even? '(2 4 6 7 8 9)) ===> 3)
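prefix-length itself admits a straightforward recursive definition; a sketch:

```scheme
(define (prefix-length condition l)
  ;; the length of the longest prefix of l whose elements
  ;; all satisfy the condition
  (if (and (pair? l) (condition (car l)))
      (+ 1 (prefix-length condition (cdr l)))
      ;else
      0))
```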
((head . tail)
(union (used-symbols head)
(used-symbols tail)))
(_
 (if (symbol? expression)
     `(,expression)
     ;else
     '()))))
((head . tail)
 `(,(fill-template head #;with bindings)
   . ,(fill-template tail #;with bindings)))
(_
(cond ((and (symbol? template)
(assoc template bindings))
=> (lambda ((key . value))
value))
(else
template)))))
(e.g.
(unzip-bindings '((a 1 2 3) (b 1 2 3) (c 1 2 3) (d . 4)) '(a c e))
===> (((a . 1) (c . 1) (a 1 2 3) (b 1 2 3) (c 1 2 3) (d . 4))
((a . 2) (c . 2) (a 1 2 3) (b 1 2 3) (c 1 2 3) (d . 4))
((a . 3) (c . 3) (a 1 2 3) (b 1 2 3) (c 1 2 3) (d . 4))))
C.3 Expansion
Having bind and fill, we can now construct our expander:
(define (expand expression macros)
(_
expression))))
(define (expand expression)
(match expression
(('quote _)
 expression)
(('lambda args body)
 `(lambda ,args ,(expand body)))
(('if condition then else)
 `(if ,(expand condition)
      ,(expand then)
      ,(expand else)))
((operator . operands)
 (let ((transformed (fix transform expression)))
   (if (equal? expression transformed)
       `(,(expand operator) . ,(map expand operands))
       ;else
       (expand transformed))))
(_
expression)))
(expand expression))
(e.g.
(expand '(let* ((a 5) (b (* a 2)))
(or (> a b)
(+ a b)))
core-macros) ===> ((lambda (a)
((lambda (b)
(begin
((lambda (##result#1)
(if ##result#1
##result#1
(+ a b)))
(> a b))))
(* a 2)))
5))
One can see that there is a symmetry between the patterns and the
templates in the definition of core-macros. This could prompt someone to
equip the expand function with the facility of reverting the expansion.
We have indeed made a successful attempt in this direction, although it
wasn't mature enough to be incorporated here.
Appendix D
Hudak quicksort
(define (ref v n)
(list-ref v (- n 1)))
(define (quicksort v)
(qsort v 1 (length v)))
One can easily see that the code fails to work as expected:
(e.g.
(quicksort '(4 3 9 8 7 1 2 6 5))
===> (4 4 4 4 7 7 7 7 9))
Appendix E
The compiler
Below is the full source code of the compiler from chapter 4. It has been
tested with Guile 2.0.11.
(define primitive-operators
'((+ pass+)
(- pass-)
(* pass*)
(/ pass/)
(% pass%)
(&& pass&&)
(|| pass||)
(^ pass^)
(<< pass<<)
(>> pass>>)
(= pass=)
(< pass<)
(<= pass<=)
(<> pass<>)
(>= pass>=)
(> pass>)))
(define mutually-negating-comparisons
'((< >=)
(<= >)
(= <>)))
(define original-name
(let ((number 0))
(lambda base
(match base
(()
(set! number 0))
((base)
(set! number (+ number 1))
(string->symbol (string-append
(->string (expression-name base))
"/"
(->string number))))))))
((function . arguments)
(let ((simple-arguments (map (lambda (argument)
(if (compound? argument)
(original-name argument)
argument))
arguments)))
(passing-arguments arguments simple-arguments
`(,(passing-function function)
,@simple-arguments
,continuation))))
(_
`(,continuation ,expression))))
(e.g.
(begin
(original-name)
(passing-program '((define !
(lambda (n)
(if (= n 0)
1
(* n (! (- n 1))))))
(! 5))))
===>
((define pass-!
(lambda (n return)
(pass= n
0
(lambda (n=0/1)
(if n=0/1
(return 1)
(pass- n
1
(lambda (n-1/3)
(pass-!
n-1/3
(lambda (!/n-1/2)
(pass* n !/n-1/2 return))))))))))
(pass-! 5 exit)))
(define new-label
(let ((label-counter 0))
(lambda parts
(cond ((null? parts)
(set! label-counter 0))
(else
(set! label-counter (+ label-counter 1))
(apply label `(,@parts - ,label-counter)))))))
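For illustration (assuming that the label helper, which is not listed here, glues its arguments into a colon-suffixed symbol, as the assembly listings in this appendix suggest), new-label can be expected to behave like this:

```scheme
(new-label)            ;; no arguments: reset the counter
(new-label 'else)      ;; presumably yields else-1:
(new-label 'proceed)   ;; presumably yields proceed-2:
```

These are exactly the labels (else-1:, proceed-2:) that appear in the assembly output of passing-program->assembly below.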
(('push register/value)
(maybe-register register/value))
(('goto register/value)
(maybe-register register/value))
(_
'())))
(('pop register)
`(,register))
((target '<- . _)
`(,target))
(_
'())))
((operator . operands)
(call operator operands registers))))
((defined-function? operator)
(call-defined operator operands registers))
#;((anonymous-function? operator)
(call-anonymous operator operands registers))))
(_ ;; a "return" continuation
`((result <- ,left ,(sign operator) ,right)
(goto ,continuation))))))
(e.g.
(begin
(new-label)
(passing-program->assembly
'((define pass-!
(lambda (n return)
(pass= n 0
(lambda (n=0/1)
(if n=0/1
(return 1)
;else
(pass- n 1
(lambda (n-1/3)
(pass-! n-1/3
(lambda (!/n-1/2)
(pass* n !/n-1/2
return))))))))))
(pass-! 5 return))))
===> ((return <- end:)
(n <- 5)
(goto !:)
!:
(if n <> 0 goto else-1:)
(result <- 1)
(goto return)
else-1:
(n-1/3 <- n - 1)
(push n)
(n <- n-1/3)
(push return)
(return <- proceed-2:)
(goto !:)
proceed-2:
(pop return)
(pop n)
(result <- n * !/n-1/2)
(goto return)
end:
(halt)))
(define (compile scheme-program)
(assemble (passing-program->assembly
(passing-program scheme-program))))
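The two worked examples above compose: feeding the factorial program to compile first CPS-transforms it (the passing-program example) and then flattens it into assembly (the passing-program->assembly example), before handing the result to assemble:

```scheme
;; (compile '((define ! (lambda (n)
;;                        (if (= n 0)
;;                            1
;;                            (* n (! (- n 1))))))
;;            (! 5)))
;; pipes the program through the two transformations shown
;; above and passes the resulting assembly to `assemble'.
```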
Appendix F
Overriding the core Scheme bindings
Readers who are familiar with the Scheme programming language proba-
bly noticed that the way it has been used in this work deviates from the
standards defined in [66] and [70] because of the destructuring that can
be performed in lambda, let and let* forms, as well as the possibility of
creating curried definitions using the define form.
Since the define and lambda forms are core bindings, they cannot, in
principle, be redefined. However, the module systems present in some
Scheme implementations make it possible to shadow the core bindings with
user-defined ones.
This section shows how this can be done with the module system avail-
able in Guile. The pattern matching is performed using the (ice-9 match)
module that is shipped with Guile. It is a subset of the (grand scheme)
glossary which is maintained by the author of this work1 .
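A minimal sketch of such a module definition (our illustration, assuming Guile's define-module interface): the #:replace clause exports the user-defined forms under the core names, so that importing modules see them instead of the core bindings, without the warnings that ordinary re-exports of core names would provoke:

```scheme
;; Export mlambda as `lambda' and cdefine as `define',
;; shadowing the core bindings in any module that imports
;; (my extensions). The module name is hypothetical.
(define-module (my extensions)
  #:use-module (ice-9 match)
  #:replace ((mlambda . lambda)
             (cdefine . define)))
```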
(define-syntax mlambda
(lambda (stx)
(syntax-case stx ()
¹ https://github.com/plande/grand-scheme
(define-syntax primitive-lambda
(syntax-rules ()
((_ . whatever)
(lambda . whatever))))
(define-syntax cdefine
(syntax-rules ()
((_ ((head . tail) . args) body ...)
(cdefine (head . tail)
(mlambda args body ...)))
((_ (name . args) body ...)
(define name (mlambda args body ...)))
((_ . rest)
(define . rest))
))
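To see how the first rule produces curried definitions, consider the expansion of a definition with a two-level head (our example):

```scheme
;; (cdefine ((compose f g) x)
;;   (f (g x)))
;; matches the first rule and rewrites to
;; (cdefine (compose f g)
;;   (mlambda (x) (f (g x))))
;; which the second rule then turns into
;; (define compose
;;   (mlambda (f g)
;;     (mlambda (x) (f (g x)))))
```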
(define-syntax match-let/error
(syntax-rules ()
((_ ((structure expression) ...)
body + ...)
((match-lambda* ((structure ...) body + ...)
(_ (error ’match-let/error (current-source-location)
’((structure expression) ...)
expression ...)))
expression ...))))
159
(define-syntax named-match-let-values
(lambda (stx)
(syntax-case stx ()
((_ ((identifier expression) ...) ;; optimization: plain "let" form
body + ...)
(every identifier? #’(identifier ...))
#’(let ((identifier expression) ...)
body + ...))
Bibliography
[1] Abelson, Harold and Gerald Jay Sussman with Julie Sussman, Structure
and Interpretation of Computer Programs, Second Edition, MIT Press,
1996, ISBN 0-262-01153-0
https://mitpress.mit.edu/sicp/full-text/book/book.html
[3] Backus, John, Can Programming Be Liberated from the von Neumann
Style? A Functional Style and Its Algebra of Programs, ACM Turing
Award Lecture, 1977,
http://worrydream.com/refs/Backus-CanProgrammingBeLiberated.
pdf
[4] Bagwell, Phil, Fast Functional Lists, Hash-Lists, Deques and Variable
Length Arrays, 2002
https://infoscience.epfl.ch/record/64410/files/techlists.
pdf
[5] Baker, Henry G., Unify and Conquer (Garbage, Updating, Aliasing, ...)
in Functional Languages,
http://www.pipeline.com/~hbaker1/Share-Unify.html
[6] Baker, Henry G., Shallow Binding Makes Functional Arrays Fast,
http://www.pipeline.com/~hbaker1/ShallowArrays.html
[10] Boyer, Robert S. and J Strother Moore, Proving Theorems About LISP
Functions, Journal of the Association for Computing Machinery, Vol.
[13] Chase, David R., Garbage Collection and Other Optimizations, Ph.D.
Thesis, Rice University, August, 1987
https://scholarship.rice.edu/bitstream/handle/1911/16127/
8900220.PDF?sequence=1&isAllowed=y
[14] Cormen, Thomas H., Charles E. Leiserson, Ronald L. Rivest and Clif-
ford Stein, Introduction to Algorithms, Third Edition, MIT Press, 2009,
ISBN 9780262533058
[16] Dennett, Daniel C., Intuition Pumps and Other Tools for Thinking,
New York, W. W. Norton & Company, 2013. ISBN 0393082067
[33] Friedman, Daniel P. and Matthias Felleisen, The Little Schemer, Fourth
Edition, MIT Press 1996, ISBN 0-262-56099-2
[34] Friedman, Daniel P., Mitchell Wand and Christopher T. Haynes, Es-
sentials of Programming Languages, Third Edition, MIT Press, 2008,
ISBN 0-262-06279-8
[38] Godek, Panicz Maciej, Maszyna RAM i predykat Kleenego (in Polish),
notes from the course in Theory of Computation by Marcin Mostowski
at the University of Warsaw (faculty of Philosophy), 2012
https://github.com/panicz/writings/raw/master/archive/
predykat-kleenego.pdf
[46] Hunt, Andrew and David Thomas, The Pragmatic Programmer: From
Journeyman to Master, Pearson Education, 2000
[49] Peyton Jones, Simon, Paul Hudak, John Hughes and Philip Wadler,
A History of Haskell: Being Lazy with Class, Third ACM SIGPLAN
History of Programming Languages Conference (HOPL-III), San
Diego, 2007
http://research.microsoft.com/en-us/um/people/simonpj/
Papers/history-of-haskell/history.pdf
[53] Knuth, Donald, Literate Programming, CSLI Lecture Notes, no. 27,
1992, ISBN 0-937073-80-6
[55] Kranz, David, Richard Kelsey, Jonathan Rees, Paul Hudak, James
Philbin, and Norman Adams, ORBIT: An Optimizing Compiler for
Scheme, SIGPLAN ’86 Proceedings of the 1986 SIGPLAN symposium
on Compiler Construction, Palo Alto, California, USA, June 25-27, 1986
https://www.cs.purdue.edu/homes/suresh/590s-Fall2002/
papers/Orbit.pdf
[58] Landin, Peter J., The Next 700 Programming Languages, in Communi-
cations of the ACM, Volume 9, Number 3, March 1966
http://thecorememory.com/Next_700.pdf
[60] Moseley, Ben and Peter Marks, Out of the Tar Pit, 2006
http://shaffner.us/cs/papers/tarpit.pdf
[66] Kelsey, R., W. Clinger, J. Rees (editors), Revised5 Report on the Algo-
rithmic Language Scheme, in Higher-Order and Symbolic Computation,
Vol. 11, No. 1, August, 1998 and ACM SIGPLAN Notices, Vol. 33, No.
9, September 1998
http://www.schemers.org/Documents/Standards/R5RS/
[70] Sperber, M., Kent R. Dybvig, Matthew Flatt, Anton van Straaten
(editors), Revised6 Report on the Algorithmic Language Scheme,
September 2007
[73] Steele, Guy Lewis Jr, and Gerald Jay Sussman, Lambda the Ultimate
Declarative, MIT AI Lab Memo, 1976,
http://repository.readscheme.org/ftp/papers/ai-lab-pubs/
AIM-379.pdf
[77] Sussman, Gerald Jay and Jack Wisdom, Structure and Interpretation
of Classical Mechanics, MIT Press, 2001, ISBN 0-262-19455-4
https://mitpress.mit.edu/sites/default/files/titles/
content/sicm/book.html
[78] van Heijenoort, Jean, From Frege to Gödel: A Source Book in Mathe-
matical Logic, 1879-1931, Harvard University Press, 1967
[83] Wright, Andrew K., and Robert Cartwright, A Practical Soft Type
System for Scheme in ACM Transactions on Programming Languages
and Systems, Vol. 19 No. 1, January 1997, Pages 87-152
http://www.iro.umontreal.ca/~feeley/cours/ift6232/doc/
pres2/practical-soft-type-system-for-scheme.pdf