
Functional Programming

Neil Leslie
Centre for Logic, Language and Computation
School of Mathematical and Computing Sciences
Victoria University of Wellington

2008

[Cover illustration: natural deduction proof trees over the terms L, M, N, P and R, with Bβ and ∃Bβ inference steps.]
Neil Leslie asserts his moral right to be identified as the author of this work.

Produced: 9:14 July 11, 2008 © Neil Leslie, 2000-2008


Contents

1 Introduction 1
1.1 Course outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Texts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Compilers and interpreters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Re-cap of what you (are supposed to have) learned in COMP 304 . . . . . . . . . . 2
1.4 Next chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4.1 How to read a scientific paper . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Why functional programming matters 5


2.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 The argument . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 Higher-order functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.4 Lazy evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.5 Example: α-β pruning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.5.1 Game trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.5.2 Minimax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.5.3 α-β . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3 Limitations of reduce 15
3.1 Trying to use reduce . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Parameterising on the sorting order . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Finding the minimum of a list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.1 First solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3.2 Second solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3.3 Third solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

4 More lazy programming 21


4.1 Evaluation order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.2 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

5 List comprehensions 25
5.1 Pythagorean triples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.2 Quicksort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
5.3 n Queens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

6 Case study: searching a graph 31


6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

7 Types 33
7.1 Hindley-Milner, and parametric polymorphism . . . . . . . . . . . . . . . . . . . . . 33
7.2 Extensions to Hindley-Milner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
7.3 Basic types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.4 Type synonyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.5 Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.5.1 Declaring a class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36


7.5.2 Section summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37


7.6 Algebraic types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
7.6.1 Enumerated types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
7.6.2 Algebraic types and type classes . . . . . . . . . . . . . . . . . . . . . . . . . 38
7.6.3 Tagged Unions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
7.6.4 Inductively defined algebraic types . . . . . . . . . . . . . . . . . . . . . . . . 40
7.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

8 Abstract data types 45


8.1 Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
8.2 Queue ADT using modules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
8.2.1 Altering our definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
8.2.2 Equality on ADTs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
8.3 A set ADT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
8.4 Relations and graphs using the set ADT . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.4.1 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.4.2 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
8.5 Searching a graph using list comprehensions . . . . . . . . . . . . . . . . . . . . . . 56
8.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

9 Parser combinators 59
9.1 The type of parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
9.2 From grammar to parser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
9.2.1 Simple parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
9.2.2 Combining parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
9.2.3 Parsing balanced brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
9.2.4 More uses for <@ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.3 Extending our parsing toolset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.3.1 sp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.3.2 just . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
9.3.3 some . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
9.3.4 <:&> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
9.3.5 Kleene ∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
9.3.6 Kleene + . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
9.3.7 first . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
9.3.8 bang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
9.3.9 optionally . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
9.4 Parsing sequences of items . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
9.4.1 Example: S-expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
9.4.2 Lists with meaningful separators . . . . . . . . . . . . . . . . . . . . . . . . . 73
9.5 <&=> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
9.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

10 The λ-calculus 77
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
10.2 Syntax of λ-terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
10.2.1 α convertibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
10.2.2 De Bruijn terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
10.3 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
10.3.1 Dynamic scope in LISP, Jensen’s device . . . . . . . . . . . . . . . . . . . . . . 84
10.4 Conversions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
10.4.1 β reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
10.5 The Church-Rosser Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
10.6 Normal form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
10.7 Reduction strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
10.7.1 Leftmost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
10.7.2 Other reduction strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87


10.8 Representing data and functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87


10.8.1 Booleans . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
10.8.2 The Church numerals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
10.8.3 λ-definability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
10.8.4 Section summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
10.9 HNF and WHNF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
10.10 Graphs and laziness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
10.10.1 η reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
10.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

11 The typed λ-calculus 93


11.1 Types for λ-terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
11.2 Type inference, and the principal type algorithm . . . . . . . . . . . . . . . . . . . . 95
11.3 Terms which can’t be typed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
11.4 A typed λ-calculus with recursion operators . . . . . . . . . . . . . . . . . . . . . . . 96
11.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

12 Continuations 99
12.1 Introducing tail-recursion and continuations . . . . . . . . . . . . . . . . . . . . . . . 99
12.2 Some simple functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
12.2.1 A CPS version of Fibonacci’s function . . . . . . . . . . . . . . . . . . . . . . 100
12.2.2 Historical note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
12.3 Uses of continuations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
12.3.1 Continuations and I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
12.4 Further examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
12.4.1 Functions on types not defined inductively . . . . . . . . . . . . . . . . . . . . 104
12.4.2 Functions on inductively defined types . . . . . . . . . . . . . . . . . . . . . . 105

13 Case study: unification 111


13.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
13.1.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
13.2 Representing terms in Haskell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
13.3 Representing substitutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
13.3.1 The identity substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
13.3.2 Composing functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
13.4 Answering our question . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
13.4.1 About substitutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
13.5 Continuing to answer our question . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
13.5.1 First clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
13.5.2 Second and third clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
13.5.3 Final clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
13.6 Termination of unifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
13.7 A trick to avoid mutual recursion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
13.8 A theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
13.9 Another view of unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
13.10 Yet another view of unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
13.11 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
13.12 Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
13.13 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
13.14 Code summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

14 Case study: unification in continuation-passing style 125


14.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
14.2 Basic types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
14.3 Top-level design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
14.3.1 Starting to define cpsunifier . . . . . . . . . . . . . . . . . . . . . . . . . 125
14.3.2 Second and third clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126


14.3.3 The final clause of cpsunifier . . . . . . . . . . . . . . . . . . . . . . . . . 127


14.4 Implementing cpsunifierl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
14.4.1 cpsextend and cpsmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
14.4.2 The final clause of cpsunifierl . . . . . . . . . . . . . . . . . . . . . . . . 128
14.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

15 Case study: computing principal types for λ-terms 129


15.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
15.2 Getting started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
15.2.1 Type expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
15.3 Getting ahead of ourselves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
15.4 Type assignment itself . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
15.4.1 Typing a variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
15.4.2 Typing an abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
15.4.3 Typing an application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
15.5 Example derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
15.6 Turning these rules into an algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 133
15.6.1 Key to the algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
15.6.2 printyp of a variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
15.6.3 printyp of an abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
15.6.4 printyp of an application term . . . . . . . . . . . . . . . . . . . . . . . . . 135
15.7 Improving the code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
15.7.1 Improving printyp of a variable . . . . . . . . . . . . . . . . . . . . . . . . 135
15.7.2 Improving printyp of an abstraction . . . . . . . . . . . . . . . . . . . . . . 136
15.7.3 Improving printyp of an application . . . . . . . . . . . . . . . . . . . . . . 136
15.8 Even more improvements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
15.9 Tidying up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
15.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
15.11 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

16 Case study: SKI-ing 141


16.1 Preamble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
16.2 λ calculus and combinatory logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
16.2.1 An interesting observation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
16.3 Reducing CL terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
16.4 Mimicking abstraction using SKI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
16.4.1 λw x.x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
16.4.2 λw x.P, x ∉ FV(P) . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
16.4.3 λw x.U V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
16.4.4 λw . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
16.5 Compiling λ-terms to CL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
16.5.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
16.5.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

17 Reasoning about functions 145


17.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
17.2 Proof by induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
17.2.1 The natural numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
17.2.2 An example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
17.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

18 Monads 153
18.1 A categorical perspective on monads . . . . . . . . . . . . . . . . . . . . . . . . . . 153
18.2 A computer scientist’s view of monads . . . . . . . . . . . . . . . . . . . . . . . . . . 153
18.2.1 Input/Output in functional programming . . . . . . . . . . . . . . . . . . . . 154
18.2.2 IO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
18.2.3 Generalising from IO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156


18.3 Examples of monads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157


18.3.1 The trivial (or identity) monad . . . . . . . . . . . . . . . . . . . . . . . . . . 157
18.3.2 The list monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
18.3.3 The parsing monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
18.3.4 The Maybe monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
18.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

19 Monad example: an evaluator 161


19.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
19.2 Without monads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
19.2.1 Handling exceptions without a monad . . . . . . . . . . . . . . . . . . . . . . 161
19.2.2 Adding state without a monad . . . . . . . . . . . . . . . . . . . . . . . . . . 162
19.2.3 Adding traces without a monad . . . . . . . . . . . . . . . . . . . . . . . . . 163
19.3 Using monads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
19.3.1 The identity monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
19.3.2 The exception monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
19.3.3 The state monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
19.3.4 The output monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
19.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

20 Worked Example : writing an evaluator for λ terms 169


20.1 β reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
20.2 Turning this into Haskell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
20.3 Rules for substitutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
20.3.1 Substitution in Haskell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
20.3.2 What a state! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
20.4 Solution 1: following our noses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
20.4.1 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
20.5 Solution 2: Use CPS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
20.5.1 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
20.6 Solution 3: Or, we could use a monad . . . . . . . . . . . . . . . . . . . . . . . . . 175
20.6.1 Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

21 An SECD machine 179


21.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
21.2 Informal description of the SECD machine . . . . . . . . . . . . . . . . . . . . . . . 179
21.2.1 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
21.3 Module SECD2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
21.4 Imports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
21.5 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
21.5.1 Dealing with a term on the control stack . . . . . . . . . . . . . . . . . . . . . 181
21.5.2 Dealing with a unary operator on the control stack . . . . . . . . . . . . . . . 182
21.5.3 Dealing with a binary operator on the control stack . . . . . . . . . . . . . . 182
21.5.4 Dealing with If on the control stack . . . . . . . . . . . . . . . . . . . . . . . 182
21.5.5 Dealing with At on the control stack . . . . . . . . . . . . . . . . . . . . . . . 183
21.6 Some miscellaneous definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
21.7 Module TERMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
21.8 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
21.9 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
21.9.1 Top-level call to be exported . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
21.9.2 Parsers which produce terms . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
21.9.3 Parsing parts of terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
21.9.4 Parsing to Haskell types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
21.9.5 Parsing tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
21.9.6 Defined combinators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
21.9.7 Utility functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
21.10 Module: Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

Programs

2.1 Polymorphic lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5


2.2 Summing a list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.3 More simple list functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.4 reduce . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.5 Defining functions using reduce . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.6 Function composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.7 map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.8 The next approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.9 A list of approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.10 within and sqroot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.11 relative and sqroot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.12 The type of Hughes’s game trees . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.13 redtree and redtrees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.14 reptree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.15 gametree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.16 maximise and minimise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.17 A putative evaluation of a position . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.18 Pruning a tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.19 Minimax evaluation of a game tree . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.20 Re-implementing minimise and maximise . . . . . . . . . . . . . . . . . . . . . 11
2.21 Re-implementing mapmin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.22 omit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.23 minleq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.24 Re-defining evaluate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.25 Binary trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

3.1 Towards sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15


3.2 Towards sorting 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.3 listrec . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.4 reduce defined via listrec . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.5 Towards sorting 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.6 Insertion sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.7 Parameterising insertion sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.8 An incorrect program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.9 A better program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.10 An even better program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

4.1 The combinator K, and a bomb . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

5.1 Pairs of integers and characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25


5.2 Pythagorean triples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
5.3 Quicksort defined using list comprehensions . . . . . . . . . . . . . . . . . . . . . 26
5.4 Quicksort defined without using list comprehensions . . . . . . . . . . . . . . . . . 27
5.5 Starting n Queens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.6 Developing n Queens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.7 A safety check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
5.8 A complete check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.9 n Queens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29


7.1 The identity function id . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33


7.2 How lists might have been defined. . . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.3 How tuples might have been defined. . . . . . . . . . . . . . . . . . . . . . . . . . 35
7.4 How strings might have been defined. . . . . . . . . . . . . . . . . . . . . . . . . 35
7.5 How the equality class might have been defined. . . . . . . . . . . . . . . . . . . . 36
7.6 Bool as an instance of Eq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
7.7 Int as a silly instance of Eq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
7.8 Ord as a defined class. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
7.9 Bool as an instance of Ord . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
7.10 Overusing names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
7.11 Days of the week . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
7.12 Working days . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
7.13 Days of the week, deriving class membership . . . . . . . . . . . . . . . . . . . . . 38
7.14 The Either type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
7.15 The Shape type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
7.16 Some functions on shapes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
7.17 when . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
7.18 The Maybe type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
7.19 Two total functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
7.20 The natural numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
7.21 A function on the natural numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 41
7.22 One conversion of a Nat to a String . . . . . . . . . . . . . . . . . . . . . . . . 41
7.23 natrec . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
7.24 natbomb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
7.25 Odd and Even, using mutual induction . . . . . . . . . . . . . . . . . . . . . . . . 42
7.26 Double odd and even, using mutual recursion . . . . . . . . . . . . . . . . . . . . 42
7.27 Some definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
7.28 One version of Student . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
7.29 Another version of Student . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

8.1 Starting to define a queue ADT . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46


8.2 Continuing to define a queue ADT . . . . . . . . . . . . . . . . . . . . . . . . . . 46
8.3 queue a using newtype . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
8.4 Queue a using data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
8.5 EmptyQ and isEmptyQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
8.6 addQ and remQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
8.7 A queue ADT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
8.8 Showing a queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
8.9 Queue a using data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
8.10 addQ and remQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
8.11 Less naïve queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
8.12 A signature for sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
8.13 Starting to define sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
8.14 Adding more definitions to the set ADT . . . . . . . . . . . . . . . . . . . . . . . . 50
8.15 And some more . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
8.16 Set union . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
8.17 Set intersection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
8.18 Showing a set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.19 The signature for Thompson’s sets . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.20 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
8.21 image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.22 setImage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.23 composition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.24 limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
8.25 transClos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
8.26 The type of a search function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54


8.27 Listifying a set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54


8.28 Finding new descendants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
8.29 Breadth-first search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8.30 Starting to define depth-first search. . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8.31 depthSearch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8.32 depthList . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
8.33 Paths through a graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

9.1 A type for parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59


9.2 A better type for parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
9.3 An even better type for parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
9.4 lbr and rbr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
9.5 single_symbol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
9.6 token . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
9.7 symbol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
9.8 satisfy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
9.9 Defining single_symbol using satisfy . . . . . . . . . . . . . . . . . . . . . 61
9.10 digit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
9.11 <|> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
9.12 <&> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
9.13 fail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
9.14 succeed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
9.15 epsilon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
9.16 An attempt at a parser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
9.17 The structure of a binary tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
9.18 Applying a function to the output of a parser . . . . . . . . . . . . . . . . . . . . . 63
9.19 A parser for balanced brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
9.20 A neater parser for balanced brackets . . . . . . . . . . . . . . . . . . . . . . . . 64
9.21 <& and &> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
9.22 An even neater parser for balanced brackets . . . . . . . . . . . . . . . . . . . . . 64
9.23 Nesting of brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
9.24 A higher-order function to parse and use balanced brackets . . . . . . . . . . . . 65
9.25 Using foldparens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.26 digit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
9.27 sp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
9.28 foldparens2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
9.29 just . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
9.30 some . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
9.31 DetParser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
9.32 <:&> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
9.33 uncurry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
9.34 curry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
9.35 star . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
9.36 plus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
9.37 identifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
9.38 first . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
9.39 identifier 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
9.40 plus_bang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
9.41 star_bang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
9.42 bang . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
9.43 natural . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
9.44 foldl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
9.45 optionally . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
9.46 integer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
9.47 pack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
9.48 begin and end symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71


9.49 Using pack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71


9.50 listOf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
9.51 commaList . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
9.52 A type for s-expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
9.53 A parser for space lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
9.54 A parser for s-expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
9.55 The type of chainl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
9.56 chainl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
9.57 expr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
9.58 chainr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
9.59 exprr . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
9.60 <&=> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
9.61 twoints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
9.62 nest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

10.1 Factorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
10.2 Factorial, again . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
10.3 Factorial, another time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

12.1 The length of a list 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99


12.2 The length of a list 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
12.3 The length of a list 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
12.4 A CPS factorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
12.5 Fibonacci’s function in Haskell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
12.6 A structurally recursive version of Fibonacci’s function . . . . . . . . . . . . . . . . 101
12.7 A neater version of Fibonacci’s function . . . . . . . . . . . . . . . . . . . . . . . . 101
12.8 A CPS version of Fibonacci’s function . . . . . . . . . . . . . . . . . . . . . . . . . 101
12.9 Looping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
12.10 Computing Fibonacci’s function in C . . . . . . . . . . . . . . . . . . . . . . . . . 102
12.11 “H.C.F. by the standard process” in Haskell . . . . . . . . . . . . . . . . . . . . . . 103
12.12 cpsid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
12.13 apply . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
12.14 split . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
12.15 fst and snd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
12.16 split . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
12.17 cpsfst and cpssnd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
12.18 cpsnat2str . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
12.19 cpsappend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
12.20 A more CPS cpsnat2str . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
12.21 Another more CPS cpsnat2str . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
12.22 More CPS functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
12.23 The general form of CPS functions on lists . . . . . . . . . . . . . . . . . . . . . . 107
12.24 cpsreduce . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
12.25 Representing λ-terms in Haskell . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
12.26 An attempt at an unparser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
12.27 A more CPS unparser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
12.28 Adding brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
12.29 Using pattern matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
12.30 Using pattern matching 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
12.31 A better CPS unparser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
12.32 A non CPS unparser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

13.1 A type for terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112


13.2 Another type for terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
13.3 Yet another type for terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
13.4 A type for substitutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
13.5 Constructing a substitution to replace the variable x with the term t . . . . . . . . . 112


13.6 A variant of Program 13.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113


13.7 The identity substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
13.8 A variant of Program 13.7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
13.9 Constructing the extension of a substitution . . . . . . . . . . . . . . . . . . . . . . 113
13.10 Composing two substitutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
13.11 An operator for composing two substitutions . . . . . . . . . . . . . . . . . . . . . 114
13.12 Constructing cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
13.13 Beginning to find a unifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
13.14 The first clause of Program 13.13 . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
13.15 A naïve attempt at the second clause of Program 13.13 . . . . . . . . . . . . . . . 116
13.16 The occurs check . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
13.17 A less naïve attempt at the second and third clauses of Program 13.13 . . . . . . . 116
13.18 The occurs check, using a different type . . . . . . . . . . . . . . . . . . . . . . . . 116
13.19 Clause Four . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
13.20 Clause Four – expanded version . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
13.21 Clause Four – second expanded version . . . . . . . . . . . . . . . . . . . . . . . 118
13.22 Types of unifier and unifierall . . . . . . . . . . . . . . . . . . . . . . . . . 118
13.23 unifier :: Term -> Term -> Substitution . . . . . . . . . . . . . 118
13.24 unifierall :: Terms -> Terms -> Substitution . . . . . . . . . . . 118
13.25 unifierall :: [Term] -> [Term] -> Substitution . . . . . . . . . . 119
13.26 Avoiding mutual recursion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
13.27 A different type for terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
13.28 All the unifier code . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

14.1 CPS substitutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125


14.2 Types of cpsunifier and cpsunifierl . . . . . . . . . . . . . . . . . 125
14.3 The first clause of cpsunifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
14.4 cpsif . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
14.5 cpsoccurs, cpsoccursl and cpsany . . . . . . . . . . . . . . . . . . . . . . . 126
14.6 any . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
14.7 The second and third clauses of cpsunifier . . . . . . . . . . . . . . . . . . . . 127
14.8 The final clause of cpsunifier . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
14.9 The first three clauses of cpsunifierl . . . . . . . . . . . . . . . . . . . . . . . 127
14.10 cpsextend and cpsmap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
14.11 The final clause of cpsunifierl . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

15.1 A type for λ-terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129


15.2 A type for types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
15.3 An unparser for types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
15.4 Substitutions on types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
15.5 Unifying types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
15.6 A stock of arbitrary types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
15.7 Retrieving a type for a variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
15.8 Computing a principal type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
15.9 The type of printyp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
15.10 Typing a Var . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
15.11 Typing an Abs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
15.12 Typing an App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
15.13 Improving printyp on a Var . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
15.14 Nicer typing of an Abs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
15.15 Nicer typing of an App . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
15.16 An improved definition of princtype . . . . . . . . . . . . . . . . . . . . . . . . 137
15.17 Final version of printyp? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
15.18 Prettifying types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

16.1 A type for CL terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141


17.1 An extracted program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148


17.2 A variant of our extracted program . . . . . . . . . . . . . . . . . . . . . . . . . . 149
17.3 Another variant of our extracted program . . . . . . . . . . . . . . . . . . . . . . . 149

18.1 show . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154


18.2 print . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
18.3 putStrLn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
18.4 putStrLn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
18.5 echo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
18.6 echo2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
18.7 >> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
18.8 Monad m . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
18.9 fail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
18.10 >@> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
18.11 Monad laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
18.12 Trivial >>= and return . . . . . . . . . . . . . . . . . . . . . . . . . 157
18.13 >>=, return and fail for the list monad . . . . . . . . . . . . . . . 158
18.14 >>=, return and fail for the MParser monad . . . . . . . . . . . . . 158
18.15 >>=, return and fail for the Maybe monad . . . . . . . . . . . . . . 158

19.1 In the beginning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161


19.2 An error type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
19.3 An exception type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
19.4 Exception handling evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
19.5 Showing Exc a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
19.6 A state transformer type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
19.7 Applying a state transformer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
19.8 Showing Exc a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
19.9 Handling state with a state transformer . . . . . . . . . . . . . . . . . . . . . . . . 163
19.10 A type to hold a value and some output . . . . . . . . . . . . . . . . . . . . . . . . 163
19.11 Formatting output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
19.12 Evaluating with a trace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
19.13 The basic monadic evaluator, using do . . . . . . . . . . . . . . . . . . . . . . . . 164
19.14 The basic monadic evaluator, using >>= . . . . . . . . . . . . . . . . . 165
19.15 The identity monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
19.16 evalID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
19.17 Exc as a monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
19.18 raise :: Exception -> Exc a . . . . . . . . . . . . . . . . . . . . . . . . 165
19.19 raise :: Exception -> Exc a . . . . . . . . . . . . . . . . . . . . . . . . 166
19.20 raise :: Exception -> Exc a, using >>= . . . . . . . . . . . . . . 166
19.21 The state monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
19.22 Updating a counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
19.23 The state evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
19.24 The state evaluator, using >>= . . . . . . . . . . . . . . . . . . . . . 167
19.25 The output monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
19.26 The out function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
19.27 The tracing evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
19.28 The tracing evaluator, using >>= . . . . . . . . . . . . . . . . . . . . 168

20.1 A type for λ-terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169


20.2 Naïvely reducing a term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
20.3 Naïvely reducing an application term . . . . . . . . . . . . . . . . . . . . . . . . . 170
20.4 Oops! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
20.5 Reduce . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
20.6 Reducing a term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
20.7 Reducing an application term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
20.8 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172


20.9 freein . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173


20.10 Reduce, again . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
20.11 cpsred . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
20.12 cpsred’ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
20.13 CPS version of subst . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
20.14 usefrees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
20.15 cpsred’ with let . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
20.16 Setting up a state monad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
20.17 red using >>= . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
20.18 red using do . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
20.19 red’ using >>= . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
20.20 red’ using do . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
20.21 subst using >>= . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
20.22 subst using do . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

Figures

9.1 A grammar for bracket expressions . . . . . . . . . . . . . . . . . . . . . . . . . . 60

10.1 A grammar for λ-terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78


10.2 Some λ-terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
10.3 Equality rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
10.4 The length of a λ-term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
10.5 Subterms of a λ-term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
10.6 occurences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
10.7 Some well-known combinators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
10.8 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
10.9 β-reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
10.10 Extending the equality rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
10.11 The diamond property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
10.12 Behaviour of IF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
10.13 Defining IF, TRUE and FALSE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
10.14 Church numerals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
10.15 Lots of fix-point finders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
10.16 η-reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

11.1 A grammar for types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

12.1 Evaluating cpsfac 3 k . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100


12.2 Evaluating cpsfibs 3 k . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
12.3 Plotkin’s CPS-conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
12.4 CPS-translation some particular terms . . . . . . . . . . . . . . . . . . . . . . . . 103

Proof figures

7.1 ∨ Introduction left . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39


7.2 ∨ Introduction right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
7.3 ∨ Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

10.1 ∀ elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
10.2 Invalid ∀ elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
10.3 Valid ∀ elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
10.4  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
10.5 Alternative clause for  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
10.6 Confluence of  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

11.1 Type assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94


11.2 The implicational fragment of minimal propositional logic . . . . . . . . . . . . . . 94
11.3 I:a→a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
11.4 K:a→b→a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
11.5 II : a → a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
11.6 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
11.7 Two new redexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
11.8 Typing listrec(d, e, l) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

12.1 A tail call . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

15.1 Typing a variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131


15.2 Typing an abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
15.3 Typing an application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
15.4 Typing λf x.f x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
15.5 A second typing for λf x.f x . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

16.1 Modus Ponens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142


16.2 →E . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
16.3 Typing SKK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
16.4 Proving ` A ⊃ A . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

17.1 Natural Numbers, as a type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145


17.2 Mathematical induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
17.3 Structural induction on lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
17.4 Total induction on Nat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
17.5 (∃m : Nat)(Zero = 2m ∨ Zero = 2m + 1) . . . . . . . . . . . . . . . . . . . . . . . 147
17.6 Induction step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
17.7 Sub-proof Π1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
17.8 Sub-proof Π2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
17.9 Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

20.1 Reducing a variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169


20.2 Reducing an abstraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
20.3 Reducing an application, 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
20.4 Reducing an application, 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
20.5 Reducing an application, 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
20.6 Substituting in a variable, 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170


20.7 Substituting in a variable, 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170


20.8 Substituting in an application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
20.9 Substituting in an abstraction, 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
20.10 Substituting in an abstraction, 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
20.11 Substituting in an abstraction, 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

1 Introduction
These are the notes that I have prepared for COMP 432: Functional Programming since 2000. They
are notes, not a text book. In particular, there is no guarantee that they will be error-free: caveat lector.

1.1 Course outline


The official course outline reads:

COMP 432 Functional Languages


Functional programming languages are based on a very simple model of computation.
Their simple foundations means that they provide us with a very clear setting in which
to study many important ideas in programming, such as (function) abstraction, data abstraction, genericity, type polymorphism and overloading. An understanding of functional
programming languages and techniques, as well as being valuable for its own sake, also
provides the programmer with a new perspective on other programming paradigms.

This is a course about functional programming as programming, not about the implementation of
functional programming languages, nor about the λ-calculus as a topic of study in its own right.1
To understand programming we must learn both principles and practice. Each informs the other:
unprincipled practice is almost always poor practice; unpracticed principles are almost always poorly
understood. Hence this course will involve writing programs, and thinking about programs.
Extended examples will be used to illustrate the concepts we are trying to understand.

1.1.1 Texts
The textbook on which I have relied most is:

• Haskell The Craft of Functional Programming by Simon Thompson [77].

The first edition of this is on Three-day Reserve in the Library.


The following texts were also borrowed from heavily:

• ML for the Working Programmer by Larry Paulson [61];

• Programming in Prolog by Chris Mellish and William F. Clocksin [12];

Other interesting texts include:

• An Introduction to Functional Programming Systems Using Haskell by A. J. T. (Tony) Davie [14];

• The Implementation of Functional Programming Languages by Simon Peyton Jones [64];

• Functional Programming and Parallel Graph Rewriting by Rinus Plasmeijer and Marko van Eekelen [65];

• The Haskell School of Expression Learning Functional Programming Through Multimedia by Paul
Hudak [40];

• Purely Functional Data Structures by Chris Okasaki [59];

• Clause and Effect by William F. Clocksin[11].


1 Not that there is anything wrong with those topics.


1.2 Compilers and interpreters


In this course we will use HUGS:

Hugs is an interpreter for Haskell, a standard non-strict functional programming language.


Hugs implements almost all of version 1.4 of the Haskell standard, except for the module
system. The name HUGS is a mnemonic (sic) for the Haskell User’s Gofer System.

There is a HUGS interpreter available on the MCS system. Type hugs at the prompt. Emacs users may
find the Haskell mode useful. Haskell is also available for the Mac and for PC’s.

1.3 Re-cap of what you (are supposed to have) learned in COMP 304
COMP 304 Programming Languages spent a number of lectures discussing functional programming
in Haskell. I think you should know:

• functional programming is about evaluating expressions;

• functions defined on inductively defined data types (e.g. Int, lists, trees) typically use recursion;

• Haskell has a polymorphic type system;

• Haskell allows us to define higher-order functions;

• the Haskell system can infer the type of a well-typed expression;

• Haskell is a sugaring of the λ-calculus;

• Haskell uses lazy evaluation.

1.4 Next chapter


In the next Chapter we discuss the paper ‘Why Functional Programming Matters’ [41]. You will get a
lot more from the discussion if you have read the paper before-hand.

1.4.1 How to read a scientific paper


Reading a scientific paper is not like reading a novel.2 Scientific papers are very highly structured
documents. Consequently, there is a structured way to read them:

1. Read the title.

2. Read the abstract. Ask yourself the question ‘What is this paper trying to tell us?’ Now you have
a general idea of what the paper is about.

3. Flick to the back of the paper, reading only the headings. This gives you an idea of the structure
of the paper, and an impression of how the paper tells you about what it tells you about.

4. Read the conclusion. Ask yourself the question ‘What was actually achieved in the paper?’

5. Look at the references. These tell you what other work was used to construct the current work.

6. Go back to the beginning and read the paper carefully. Don’t be afraid to skip sections at the
first reading.

A well-written scientific paper encourages the reader to move through these steps.
2 Nor usually as interesting.


1.5 Summary
From this chapter you should have learned:
• some administrative details;
• what you should have learned in COMP 304;
• how to read a scientific paper.


Questions
1. Why are you doing this course?
2. What do you expect from this course?
3. Use the WWW to find sites related to functional programming languages. Bookmark the best
ones.
4. What advantages do you think functional programming languages have compared to:
a) object-oriented languages, such as C++, SmallTalk and Java;
b) logic programming languages, such as Prolog and Gödel;
c) imperative programming languages, such as C.

5. What disadvantages do you think functional programming languages have compared to:
a) object-oriented languages, such as C++, SmallTalk and Java;
b) logic programming languages, such as Prolog and Gödel;
c) imperative programming languages, such as C.

6. Look at the Standard Prelude provided with the HUGS system. What is in it and why?

2 Why functional programming matters
2.1 Summary
In [41] John Hughes argues that two features of functional languages, higher-order functions and
lazy evaluation contribute greatly to support for program modularity. Since modularity is of great
importance in good programming, functional languages are of importance in the ‘real world’.

2.2 The argument


The paper begins by arguing that describing the advantages of functional programming in terms
of what functional programming is not is not useful. We should attempt to characterise functional
programming in terms of what it offers, not what it denies. An analogy is made with the structured
programming movement. The benefit of structured programming is not that it denies the programmer
gotos, but that it supports modular design.
If we are to aim for modular design, i.e. if large structures are made from smaller parts, we must
provide ways to put the parts together. Hughes expresses this as saying that we must provide the right
sort of ‘glue’. The claim is that higher-order functions and lazy evaluation are two very powerful glues
for programs.
In Section 3 of his paper Hughes discusses higher-order functions, and in Section 4 he discusses lazy
evaluation. Section 5 looks in more detail at a programming example from artificial intelligence: α-β
pruning. Section 6 is the conclusion.

2.3 Higher-order functions


In Section 3, ‘Glueing Functions Together’, Hughes shows how higher-order functions support modularity. We begin by examining some simple functions on lists. Although the original paper uses
Miranda,1 we take the liberty of translating the examples into Haskell. We can think of Haskell lists as
being defined as:

data [a] = Nil
         | Cons a [a]

Program 2.1: Polymorphic lists

As usual we write:
• [] for Nil,
• [1] for Cons 1 [] (or Cons 1 Nil),
• [1, 2] for Cons 1 [2] (or Cons 1 (Cons 2 []), or Cons 1 (Cons 2 Nil)).
When we define a function like sum:

sum :: [Int] -> Int

sum [] = 0
sum (h:t) = h + (sum t)

Program 2.2: Summing a list


1 Miranda is a trademark of Research Software Limited.


we observe that the only parts of this definition which are unique to summation are the 0 in the first
clause and the + in the second. If we have not seen many examples of list processing functions we can
look at a few more:

prod :: [Int] -> Int

prod [] = 1
prod (h:t) = h * (prod t)

and_all, or_all :: [Bool] -> Bool

and_all [] = True
and_all (h:t) = h && (and_all t)

or_all [] = False
or_all (h:t) = h || (or_all t)

evens :: [Int] -> [Int]

evens [] = []
evens (h:t) = if (even h) then (h : (evens t))
else (evens t)

Program 2.3: More simple list functions

In all these functions we see a common pattern:


• in the case of [] we supply a value;
• in the case of Cons h t we supply an auxiliary function which can use:
– h
– what we get by computing with t
We can write a higher-order function which captures this regularity:

reduce :: (a -> b -> b) -> b -> [a] -> b

reduce f x [] = x
reduce f x (h:t) = f h (reduce f x t)

Program 2.4: reduce

We can now define the functions from Programs 2.2 and 2.3 using reduce:

sum = reduce (+) 0

prod = reduce (*) 1

and_all = reduce (&&) True

or_all = reduce (||) False

evens = reduce
(\x y -> if (even x) then (x : y) else y)
[]

Program 2.5: Defining functions using reduce


Other examples of higher-order functions which we can use to glue other functions together are:

• function composition

• map

Function composition is defined in the Standard Prelude as:

(.) :: (b -> c) -> (a -> b) -> (a -> c)

(f . g) x = f (g x)

Program 2.6: Function composition

We can define map as:

map :: (a -> b) -> [a] -> [b]

map _ [] = []
map f (h:t) = (f h) : (map f t)

Program 2.7: map

We can also define higher-order functions analogous to reduce for other inductively defined types.
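For instance, the natural numbers form another inductively defined type, and they admit a reduce-like operator of their own. This is an illustrative sketch; the type Nat and the operator natrec are our own names, not part of the standard prelude:

data Nat = Zero | Succ Nat

-- natrec replaces Zero with z and each Succ with s, just as
-- reduce f x replaces [] with x and each (:) with f.
natrec :: b -> (b -> b) -> Nat -> b

natrec z s Zero = z
natrec z s (Succ n) = s (natrec z s n)

For example, natrec n Succ m computes the sum of m and n.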

2.4 Lazy evaluation


The second sort of glue that we have is lazy evaluation, which is discussed in Section 4 ‘Glueing Programs Together’. Lazy evaluation only does as much computation as we really need. Lazy evaluation
lets us plug whole programs together. Suppose f and g are programs, then (g . f) input will
compute g (f input). Only as much of f input will be computed as is needed for g. So we don’t
need to worry that f input might be very large, or even non-terminating.
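As a small illustration, suppose g needs only a little of what f produces. Assuming nothing beyond the standard prelude (firstDoubles is our own name), the following terminates even though map (*2) [1..] on its own describes an infinite list:

firstDoubles :: [Int]

firstDoubles = (take 5 . map (*2)) [1..]
-- evaluates to [2, 4, 6, 8, 10]: laziness computes only what take demands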
The examples used in Section 4 are numerical algorithms. First we look at the Newton-Raphson
algorithm for computing (approximations to) the square root of a number N :

aₙ₊₁ = (aₙ + N/aₙ) / 2

Since the aᵢ converge quickly we only want to compute as many of them as we need to reach some tolerance, usually denoted ε. Hughes asserts that the usual conventional programs for computing square roots are very un-modular. We can use laziness to improve modularity. We do this by lazily generating the list of approximations, and testing for the situation when two successive approximations are within ε of each other.
We obtain the next approximation from the current one by:

next :: Float -> Float -> Float

next n x = (x + n/x)/2.0

Program 2.8: The next approximation

If we define a function rept we can define a list of approximations:


rept :: (a -> a) -> a -> [a]

rept f a = a : (rept f (f a))

approxs :: Float -> Float -> [Float]

approxs n a0 = rept (next n) a0

Program 2.9: A list of approximations

The list approxs n a0 is, of course, infinite. But, so long as we only need a finite initial segment
of it we can use laziness to operate on it.
Now we can define a function to test whether two successive values in a list are within  of each other,
and hence define a square root function:

within :: Float -> [Float] -> Float

within eps (a0:a1:as)
  | abs (a0 - a1) <= eps = a1
  | otherwise            = within eps (a1:as)

sqroot :: Float -> Float -> Float -> Float

sqroot start eps square =
  within eps (approxs square start)

Program 2.10: within and sqroot
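As an illustrative check, approximating the square root of 2 from the starting guess 1.0 with a tolerance of 0.0001, we would expect a session something like this (the exact digits displayed depend on the implementation):

Gofer?
sqroot 1.0 0.0001 2.0
1.41421 :: Float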

Evaluation of sqroot start eps square will terminate, even though the list of approximations
is infinite. We can use laziness to support modularity here. Suppose we decide that it is not the differ-
ence between successive approximations that we care about but the ratio of successive approximations.
Then we can define:

relative :: Float -> [Float] -> Float

relative eps (a0:a1:as)
  | abs (a0 - a1) <= eps * abs a1 = a1
  | otherwise                     = relative eps (a1:as)

sqroot :: Float -> Float -> Float -> Float

sqroot start eps square =
  relative eps (approxs square start)

Program 2.11: relative and sqroot

We have not had to alter the code which generates the approximations, only the code which com-
pares the values.
Similar examples from numerical differentiation and integration are presented where lazy evaluation
helps support modularisation.

2.5 Example: α-β pruning


In Section 5, ‘An Example from Artificial Intelligence’, Hughes presents α-β pruning. This is a standard
technique from AI: [7, 68] both give short descriptions. The basic scenario is that we are implementing


a game-playing program for a two-player game, in which the computer is one of the players. We can
represent the current state of play, and we have a function which allows us to generate the state of play
after each possible move by either player. The game can then be represented as a tree, with states of
play as values at nodes. The sub-trees of each node represent the consequences of the possible valid
moves from that node. Our task is to let the computer pick the best move for it to make. We use a
minimax strategy to pick the best move, and we use α-β pruning to cut out branches of the tree where
the best move cannot reside.

2.5.1 Game trees


We use the trees defined by Hughes in an earlier section:

data Tree a = Node a [Tree a]

Program 2.12: The type of Hughes’s game trees

Each game tree is a node with a value and some (possibly none, of course) sub-trees.2 Recall that
we can define two mutually recursive higher-order functions on such trees:

redtree :: (a -> b -> c) ->
           (c -> b -> b) ->
           b ->
           Tree a -> c

redtree f g a (Node v trees) = f v (redtrees f g a trees)

redtrees :: (a -> b -> c) ->
            (c -> b -> b) ->
            b ->
            [Tree a] -> b

redtrees f g a [] = a
redtrees f g a (tree:trees) = g (redtree f g a tree)
                                (redtrees f g a trees)

Program 2.13: redtree and redtrees

Each state of play will be represented by an object of type Position, the details of which we do not
care about. The moves available from any given position can be represented by moves, a function of
type Position -> [Position]. A game tree is built by applying moves to the tree that we start
with. A complete game tree will have the initial state of play as the value at the node. The subtrees will be the list of trees with values of possible initial moves at their nodes, and developments from them as sub-trees. This structure is very like the list of approximations that we looked at in §2.4. We can
define reptree, an analogue of rept:

reptree :: (a -> [a]) -> a -> Tree a

reptree f a = Node a (map (reptree f) (f a))

Program 2.14: reptree

Now we can define a function gametree which takes a position p and constructs the game tree
which can develop from that position:
2 Trees like this are sometimes called ‘rose trees’.


gametree :: Position -> (Tree Position)

gametree p = reptree moves p

Program 2.15: gametree

We have a function, which Hughes calls static, which evaluates positions, i.e. static has type
Position -> Float. If we map static over the game tree we can find the values of all the states
of play. Selecting the best move can be done after inspecting this tree of values of states of play. We
use a minimax strategy to choose the next move: to do this we need to inspect the subtrees of any node
to decide how good it really is. Although a node may have a good rating using static, subsequent
moves may end up being poor.

2.5.2 Minimax
Anyhow, let’s implement minimax. We are trying to produce a function which will give us the ‘actual’
value of each position, a value which we obtain by considering all the reachable positions. We define
maximise and minimise:

maximise, minimise :: Ord a => Tree a -> a

maximise (Node n []) = n
maximise (Node n subs) = maximum (map minimise subs)

minimise (Node n []) = n
minimise (Node n subs) = minimum (map maximise subs)

Program 2.16: maximise and minimise

Now we might try to define:

evaluate :: Position -> Float

evaluate = maximise . maptree static . gametree

Program 2.17: A putative evaluation of a position
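The function maptree used here is the analogue of map for these trees. Hughes assumes it; it is not defined in these notes, but a minimal sketch is:

maptree :: (a -> b) -> Tree a -> Tree b

maptree f (Node v trees) = Node (f v) (map (maptree f) trees)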

Hughes points out that there are two, related, problems here. If the game tree is infinite then this
definition will never terminate, and even for finite but large trees this definition is practically unworkable.
Now the problem of selecting a move in a game is starting to look very much like the one we had
before:

• the game tree may be very large;

• it may not even be finite;

• consequently, we will only look at an initial part of it.

So we need to implement a way to cut off the upper branches of the tree, that is to prune it:

prune :: Int -> (Tree a) -> Tree a

prune 0 (Node v _) = Node v []
prune n (Node v trees) = Node v (map (prune (n-1)) trees)

Program 2.18: Pruning a tree

Now we can define:


evaluate :: Position -> Float

evaluate = maximise . maptree static . prune 5 . gametree

Program 2.19: Minimax evaluation of a game tree

Here prune 5 . gametree uses lazy evaluation to implement a 5-move lookahead. Hughes makes the point here that laziness supports greater modularity. If we did not have the ability to exploit
laziness we would have had to fold the pruning into the generation of the game tree. Notice also that
the two really problematic functions are:

• maximise, the last function, and

• gametree, the first one.

Laziness allows us to generate the input for maximise only as it is needed, and to reclaim storage
as it becomes free, thus giving a useful optimisation. Again laziness supports modularity: to do this
without laziness would require us to lump all these functions together.

2.5.3 α-β
What we have so far implements minimax searching. We can optimise this using α-β pruning. The
crucial point to observe is that we are interested in the maximum minimum (or the minimum maximum)
in the tree. Thus we can often cut parts of the tree out without ever visiting them. We begin by re-
implementing minimise and maximise:

maximise, minimise :: Ord a => Tree a -> a
maximise', minimise' :: Ord a => Tree a -> [a]

maximise = maximum . maximise'
minimise = minimum . minimise'

maximise' (Node n []) = [n]
maximise' (Node n subs) = mapmin (map minimise' subs)
  where mapmin = map minimum

minimise' (Node n []) = [n]
minimise' (Node n subs) = mapmax (map maximise' subs)
  where mapmax = map maximum

Program 2.20: Re-implementing minimise and maximise

The point of doing this is to re-define mapmax and mapmin to allow them to ignore the ignorable.
We follow Hughes and only show the re-definition of mapmin, asserting that mapmax can be given a
similar treatment. What we shall do is omit minima which cannot be the largest minimum. We define:

mapmin :: Ord a => [[a]] -> [a]

mapmin (nums:numss) =
(minimum nums) : (omit (minimum nums) numss)

Program 2.21: Re-implementing mapmin

The function omit is defined as:


omit :: Ord a => a -> [[a]] -> [a]

omit pot [] = []
omit pot (ns:nss)
| minleq ns pot = omit pot nss
| otherwise = (minimum ns) : (omit (minimum ns) nss)

Program 2.22: omit

Omit takes a potential maximum minimum and ignores minima less than this. Minleq is where the
clever bit comes in: it is given a potential maximum minimum and a list of numbers. Minleq is true
if the minimum of the list is less than or equal to the potential maximum minimum. If any number in
the list is less than or equal to the potential maximum minimum, then the minimum of the list surely
is too. (The property of being the minimum in a list is that the minimum is less than or equal to any
number in the list, and ≤ is transitive!) Hence minleq does not have to look at all the values in the
list. We define minleq as:

minleq :: Ord a => [a] -> a -> Bool

minleq [] _ = False
minleq (n:ns) pot = (n <= pot) || minleq ns pot

Program 2.23: minleq

We re-implement mapmax mutatis mutandis, and then we can re-define evaluate:

evaluate = maximise . maptree static . prune 8 . gametree

Program 2.24: Re-defining evaluate

Hughes points out that we have made a number of efficiency gains through laziness, and hence we
can look 8 moves ahead now!
Following this development has been, I think, quite hard. The main point is that laziness has supported modularity. We have been able to improve evaluate as defined in Program 2.19 to implement
α-β pruning by making purely local changes to maximise. The great modularity of evaluate also
allows us to implement other optimisations simply.

2.6 Conclusion
The conclusions of the paper are that:
• modularity is known to be a good thing;
• functional programming languages offer higher-order functions and laziness;
• higher-order functions and laziness support modularity;
• hence functional programming languages have a lot to offer.


Questions
1. Define append and map using reduce.
2. filter :: (a -> Bool) -> [a] -> [a] takes a test, of type a -> Bool and filters
items from the input list which fail the test. Define filter using reduce.
3. Define the function evens from Program 2.3 using filter.
4. Binary trees can be defined in Haskell as:

data BinTree a = Leaf
               | Node a (BinTree a) (BinTree a)

Program 2.25: Binary trees

Define the function redbintree, the analogue of reduce for binary trees.
5. Define 3 functions inorder, preorder and postorder to flatten a tree (i.e. to convert a
BinTree a into a [a]) using redbintree. The three functions should list the values using in-,
pre-, and post-order traversals.
6. Define treemap :: (a -> b) -> (BinTree a) -> BinTree b, which maps a func-
tion over a tree, using redbintree.
7. The two functions rept in Program 2.9 and reptree in Program 2.14 have some obvious
similarities. Is it possible to define, in Haskell, a function which generalises them? What type
would this function have?
8. Draw the game tree for a game of noughts and crosses (tic-tac-toe).
9. Define a type Oxo which represents the state of a game of noughts and crosses.
10. Generate a game tree for noughts and crosses.
11. Implement mapmax.

3 Limitations of reduce
The function reduce is very useful, but it has limitations. In this Chapter we shall look at two of these.

3.1 Trying to use reduce


In this section we shall try to implement a sorting algorithm using the reduce operator of [41]. Unfor-
tunately reduce is not quite up to the job.
A sorting algorithm takes a list and returns an ordered permutation of it. In Haskell’s type system the
type of such a function can be expressed as: Ord a => [a] -> [a].
We begin by observing:

• the empty list is an ordered permutation of the empty list;

• if t' is an ordered permutation of t, and f h t' is an ordered permutation of h:t', then f h t' is an ordered permutation of h:t.

So, if we can define an appropriate f, we can define a sorting algorithm as:

sort :: Ord a => [a] -> [a]

sort = reduce f []
where f x l = ??

Program 3.1: Towards sorting

Now we have the task of defining a suitable f, which takes a value and an ordered list, and returns
an ordered list.
We begin by observing:

• if we are given a value v and the empty list then we can construct the singleton list [v] which is
ordered;

• if we are given a value v and a non-empty, ordered list then we can construct an ordered list by
comparing v with the head of the list. If v is less than the head of the list then we cons v onto
the list given to us, otherwise we cons the head of the list onto what we got from constructing
an ordered list from v and the tail of the list given to us.

We attempt to implement this using reduce:

f :: Ord a => a -> [a] -> [a]

f v = reduce (\h r -> if v < h
                      then (v:h:??)
                      else h:r)
             [v]

Program 3.2: Towards sorting 2

We have a problem here: we do not have access to the tail of the original list! This sorting algorithm
does not fit into exactly the pattern which reduce generalises. We can scratch our heads for a bit
at this point: fortunately the solution to this problem is well-known. Instead of reduce we use an


operator called listrec. We discuss listrec in § 11.4 on page 96, and [55, 58, 76] explain in much
more detail. This operator is defined as follows:

listrec :: b -> (a -> [a] -> b -> b) -> [a] -> b

listrec d _ [] = d
listrec d e (h:t) = e h t (listrec d e t)

Program 3.3: listrec

We can define reduce using listrec:

reduce :: (a -> b -> b) -> b -> [a] -> b

reduce f d = listrec d (\h _ r -> f h r)

Program 3.4: reduce defined via listrec

We can define the function f from Program 3.2 as:

f :: Ord a => a -> [a] -> [a]

f v = listrec [v]
(\h t r -> if v < h
then (v:h:t)
else h:r)

Program 3.5: Towards sorting 3

And we can define the sorting algorithm as:

sort :: Ord a => [a] -> [a]

sort = reduce insert []
  where insert x =
          listrec [x] (\h t r -> if x < h
                                 then (x:h:t)
                                 else h:r)

Program 3.6: Insertion sort

Although anything we can do with reduce can be done with listrec, in many situations reduce
is more convenient and neater.
One key point to notice here is that we have defined insertion sort, not a particularly good algorithm.
There are better sorting algorithms which do not follow the pattern of reduce and listrec.

3.2 Parameterising on the sorting order


In the example above we used type classes. We can, instead, parameterise the sorting function on the
ordering required:


sort :: (a -> a -> Bool) -> [a] -> [a]

sort ord = reduce insert []
  where insert x =
          listrec [x] (\h t r -> if (ord x h)
                                 then (x:h:t)
                                 else h:r)

Program 3.7: Parameterising insertion sort

Now we can pass sort < or > depending on whether we want to sort numbers in ascending or
descending order, a feature which generalises to other orders, and to other types.
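For example, we would expect behaviour along these lines:

Gofer?
sort (<) [4,2,5,1]
[1, 2, 4, 5] :: [Int]

Gofer?
sort (>) [4,2,5,1]
[5, 4, 2, 1] :: [Int]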

3.3 Finding the minimum of a list


Given a list of integers we might want to find the minimum value in the list. Doing this with reduce
forces us to think carefully.
For some reason naïve programmers try to implement minl as:

minl :: [Int] -> Int

minl [] = -1
minl [s] = s
minl (h:t) = min h (minl t)

Program 3.8: An incorrect program

This program does not fit the pattern of reduce, but this is not its only flaw.
We might expect to have:

minl (l ++ m) = min (minl l) (minl m)

But consider the list [1], which we choose to write as [1] ++ []. Now we can show that 1 = −1.
The problem is that minl is simply not defined for the empty list. Pretending that it is will only lead us
into trouble. Furthermore, reduce only lets us write total functions.
There is more than one way to solve this problem. We look at two, and will see a third later on.

3.3.1 First solution


Our first attempt at error handling is to do it all ourselves. We use the Maybe type from Program 7.18
on page 40.
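Program 7.18 is not reproduced here; for this example it is enough to assume a declaration along the following lines, with constructors Error and Ok as used below. (Note that this is the notes’ own type, not the standard prelude’s Maybe with Nothing and Just.)

data Maybe a = Error
             | Ok a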

minl :: [Int] -> Maybe Int
minl' :: Int -> [Int] -> Int

minl [] = Error
minl (h:t) = Ok (minl' h t)

minl' x [] = x
minl' x (h:t) = min x (minl' h t)

Program 3.9: A better program

We get this behaviour:

Gofer?
minl []
Error :: Maybe Int

Gofer?
minl [4,5,3,6,2]
Ok 2 :: Maybe Int
Now, this program can be written using reduce. Often we can find a program using reduce which
will do the job for us, even if we have to think a bit.
However, if we want to program in this fashion we end up writing a lot of code to handle errors.

3.3.2 Second solution


The designers of Haskell built a special value error :: String -> a into the language. The type
of error is very unusual. We can define minl in a very natural way:

minl'' :: [Int] -> Int

minl'' [] = error "Trying to find minimum of empty list."
minl'' [x] = x
minl'' (h:t) = min h (minl'' t)

Program 3.10: An even better program

We get this behaviour:


Gofer?
minl'' []

Program error: Trying to find minimum of empty list.

Gofer?
minl'' [4,5,3,6,2]
2 :: Int
Letting the system handle errors for us is more convenient.

3.3.3 Third solution


We will see a related technique using monads in § 18.3.4 on page 158 and § 19.3.2 on page 165.

3.4 Summary
In this Chapter we highlighted some limitations of reduce. We presented another operator, called
listrec, which allows us to overcome one of these limitations. We also briefly discussed error
handling.


Questions
1. Define treerec, the analogue of listrec for the binary trees defined in Program 2.25.
2. Does the following hold?

(∀l, m ∈ [Int]) minl (l ++ m) = min (minl l) (minl m)

3. Define minl of Program 3.9 on page 17 using reduce.

4. Explain why the type of error is very unusual.

4 More lazy programming
Hughes [41] argues strongly that laziness has a lot of advantages. In this Chapter we will look more
closely at laziness, following the presentation given in Chapter 17 of [77].1 [61] has a rather nice
chapter on laziness, despite (or perhaps because of) being about ML, a strict functional language.
When we use lazy evaluation:

1. we only evaluate an expression if we have to;

2. we only evaluate an expression as far as we have to;

3. we only evaluate an expression once.
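The third point is about sharing: once an argument has been bound to a variable it is evaluated at most once, however many times the variable is used. A minimal illustrative sketch (double is our own name, using only the standard prelude):

double :: Int -> Int

double x = x + x

-- In double (sum [1..10000]) the argument sum [1..10000] is
-- evaluated at most once; both occurrences of x share the result.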

As a simple illustration of lazy evaluation we show how to take the first few items from an infinite list.
Haskell gives us a convenient notation for lists of integers:2

Gofer?
[1..5]
[1, 2, 3, 4, 5] :: [Int]
(44 reductions, 84 cells)
Gofer?
[1, 3..10]
[1, 3, 5, 7, 9] :: [Int]
(46 reductions, 94 cells)

This notation also allows us to describe infinite lists:

Gofer?
[1, 3 ..]
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29,
31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57,
59, 61, 63, 65, 67, 69, 71, 73, 75{Interrupted!}

(181 reductions, 498 cells, 1 garbage collection)

The function take :: Int -> [a] -> [a] from the standard prelude returns the first n items
of a list. Lazy evaluation allows us to take the first 5 items from an infinite list:

Gofer?
take 5 [1, 3..]
[1, 3, 5, 7, 9] :: [Int]
(28 reductions, 75 cells)

We saw lots more examples of laziness in action in Chapter 2.

4.1 Evaluation order


So far we have been rather non-specific about the details of how a Haskell program is actually evalu-
ated. To explain how lazy evaluation works we need to be a bit more specific.
Suppose we are trying to evaluate:
f 1 (g 2 3)
1 This is Chapter 13 in the first edition.
2 The examples are from a session with MacGofer.


where f and g are functions. There are two possible expressions we could choose to work on first: the whole application f 1 (g 2 3), or the argument g 2 3.
Two questions spring immediately to mind:
1. Does it matter which we choose?
2. If it does matter, what difference does it make?
The λ-calculus provides the best setting in which to provide answers to these questions: when we
look at the λ-calculus in Chapter 10 we will give more detailed explanations, but for the moment we
will work with Haskell.
The answer to question 1 is yes: we would not have bothered to spend so much time building up to
it if it was not! Question 2 clearly has a much more interesting answer.
The evaluation strategy that Haskell adopts is that it picks the leftmost, outermost expression to
evaluate first. In the example above we would first evaluate the outer application itself:
f 1 (g 2 3)
This strategy is what is required for lazy evaluation. Lazy evaluation is, however, more than just
leftmost, outermost evaluation as we also must take care never to evaluate further than we must, nor
twice. The leftmost, outermost evaluation strategy has one remarkable property: if any evaluation
strategy finds a value of an expression then leftmost, outermost will. Leftmost, outermost is also called
normal order evaluation.
Applicative order evaluation evaluates the arguments to a function before evaluating the function. In
the example above we would first evaluate:
g 2 3
If we think in terms of parameter-passing mechanisms then applicative order corresponds to call-by-
value, and normal order to call-by-name.
Notice that normal-order evaluation may find a value for an expression when applicative order
cannot. Consider the (contrived) example:

bomb :: Int

bomb = bomb + 1

k :: a -> b -> a

k x y = x

Program 4.1: The combinator K, and a bomb

Attempting to evaluate bomb gives this result:


Gofer?
bomb
Error: the Macintosh stack collided with the heap
However we can happily evaluate k 1 bomb:
Gofer?
k 1 bomb
1
(2 reductions, 6 cells)
If we used applicative-order evaluation we would not be able to evaluate k 1 bomb.
When we discuss the λ-calculus we will introduce the notion of a normal form for a λ-term: evaluation
of Haskell expressions corresponds to reduction of a λ-term to normal form. We can also show that the
λ-calculus has the Church-Rosser property: if two evaluation strategies reduce a λ-term to a normal
form, then these normal forms are the same. In other words normal forms are unique, and hence we
are justified in treating the normal form of an expression as its value. More of this in Chapter 10.


4.2 Summary
Laziness has three parts. Normal-order evaluation deals with one part of laziness: not evaluating
expressions that we can ignore. The λ-calculus is the best environment in which to discuss many of the
notions introduced in this Chapter, so we will re-visit them later.


Questions
1. Define take and drop. The function drop has type Int -> [a] -> [a] and drops the first
n items from a list.
2. Lazy evaluation has a correspondence with ‘call-by-name’ parameter passing. Find out what
parameter-passing mechanism is used by:
• Algol-60
• Algol-68
• Pascal
• C
• C++
• Java
• LISP
• LISP, originally
• Scheme
• ML
• Miranda
• SmallTalk
• your favourite language
Why did the language designers make these choices?

3. Notice that normal-order evaluation may find a value for an expression when applica-
tive order cannot.
Does this mean that there are functions we can write in Haskell that we can’t write in a strict
language like ML? You may want to read Chapter 5 of ML for the Working Programmer [61].
4. Find out what strictness analysis is.

5 List comprehensions
We pause to discuss list comprehensions. List comprehensions were introduced in Miranda as ZF-
expressions. ZF is a set theory due to Zermelo and Fraenkel. In ZF we can define set comprehensions,
e.g.
{x|x ∈ N ∧ x mod 2 = 0}

List comprehensions allow us to use a similar notation to describe lists:

Gofer?
[x | x <- [0..], x `mod` 2 == 0]
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28,
30, 32, 34, 36, 38, 40{Interrupted!}

List comprehensions give us a very powerful way to describe lists. There is one crucial difference be-
tween set comprehensions in ZF and list comprehensions in Haskell. The items in a list comprehension
are generated in a particular order, and sometimes we need to think carefully about how they are gen-
erated. This is rather similar to the distinction between the declarative and the procedural semantics of
Prolog.
The syntax for list comprehensions provided by Haskell is quite simple. After the list bracket we have
an expression, then a vertical bar, and then a sequence of generators and tests. A generator is a
pattern followed by <- followed by a list. The symbol <- is supposed to remind us of ∈. A test is an
expression of type Bool.
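For example, a comprehension with a single generator and a single test behaves as we would expect:

Gofer?
[x * x | x <- [1 .. 10], odd x]
[1, 9, 25, 49, 81] :: [Int]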
We can have more than one generator, and more than one test. If we are using more than one
generator we need to be careful about the order in which the results are generated. If we define this
function:

pairs :: Int -> [(Int,Char)]

pairs p = [(n, m) | n <- [1 .. p],
                    m <- ['a' .. chr (ord 'a' + n - 1)]]

Program 5.1: Pairs of integers and characters

It is evaluated as follows (with a little bit of formatting for clarity):

Gofer?
pairs 5
[(1,'a'),
 (2,'a'), (2,'b'),
 (3,'a'), (3,'b'), (3,'c'),
 (4,'a'), (4,'b'), (4,'c'), (4,'d'),
 (5,'a'), (5,'b'), (5,'c'), (5,'d'), (5,'e')]

5.1 Pythagorean triples


The following list comprehension computes Pythagorean triples:


[(n*n - m*m, 2*n*m, n*n + m*m) | n <- [2 ..],
                                 m <- [1 .. n-1],
                                 gcd n m == 1,
                                 odd (m + n)]

Program 5.2: Pythagorean triples

The definition looks obscure: but it comes from a careful consideration of the problem. Powerful
programming techniques are also helped by careful thought. Observe that:

(n² − m²)² = n⁴ + m⁴ − 2n²m²
(2nm)² = 4n²m²
(n² − m²)² + (2nm)² = n⁴ + m⁴ + 2n²m²
n⁴ + m⁴ + 2n²m² = (n² + m²)²

Hence the sum of the squares of the first two is the square of the third, so they certainly are
Pythagorean triples. The first generator selects n from the list of natural numbers from 2 upwards.
The second generator selects m from the list of natural numbers from 1 to n − 1. The (n, m) pairs
are generated in just the same order as the pairs in Program 5.1. The stipulation that n and m are
mutually prime ensures that we ignore triples such as (pa, po, ph), as the triple (a, o, h) is already in the
list. The stipulation that m + n is odd ensures that we ignore triples of the form (2o, 2a, 2h) as the triple
(a, o, h) is already in the list.
And, in case you don’t believe the argument above, the first 10 items in this list are:

[(3,4,5), (5,12,13), (15,8,17), (7,24,25), (21,20,29),
 (9,40,41), (35,12,37), (27,36,45), (11,60,61),
 (45,28,53)]

5.2 Quicksort
As an example of a function defined using list comprehensions we give a version of quicksort.
Quicksort is defined as follows:

• The empty list is sorted.

• To sort a list with a head and a tail we sort the values in the tail less than the head and append
that to the head consed onto what we got from sorting the values in the tail greater than the
head.

quick :: Ord a => [a] -> [a]

quick [] = []
quick (h:t) = quick [x | x <- t, x < h] ++
h :
quick [x | x <- t, x >= h]

Program 5.3: Quicksort defined using list comprehensions

We can also define quick without using list comprehensions:


quick :: Ord a => [a] -> [a]

quick [] = []
quick (h:t) = quick smalls ++
h :
quick bigs
where (smalls, bigs) = partition h t

partition :: Ord a => a -> [a] -> ([a],[a])

partition _ [] = ([], [])
partition pivot (a:as) =
  if a < pivot
  then (a : wees, bigs)
  else (wees, a : bigs)
  where (wees, bigs) = partition pivot as

Program 5.4: Quicksort defined without using list comprehensions

Which should we prefer? What are the criteria on which to choose? Paulson [61] writes:

Correctness must come first. Clarity must usually come second and efficiency third. When-
ever you sacrifice clarity for efficiency, be sure to measure the improvement in performance
and decide whether the sacrifice is worth it. A judicious mixture of realism and principle,
with plenty of patience, makes for efficient programs.

Programs 5.3 and 5.4 are both correct.1 Program 5.3 is, surely, clearer. Check their relative
efficiencies out for yourselves.

5.3 n Queens

And now we program n Queens. The original problem is to place 8 Queens on a chess board, such
that no two Queens threaten each other. We call an arrangement where no two Queens threaten
each other safe. The number 8 plays no significant role, so we generalise to the problem of placing
n Queens safely on an n × n chess board. We could represent the positions of the Queens using an n × n array. This is not a very good representation as we know that no two Queens can possibly be in the same row. For each row all we need to record is which column is occupied, so we can use a list of integers. We can also notice that no two Queens can be in the same column either, so solutions to the problem must be permutations of the list [1..n]. One algorithm to solve the problem is simply to
generate all permutations, and then test each one to find whether it is a safe arrangement. Even using
lazy evaluation this is not a very good solution. Instead we define a recursive solution where we place
n Queens by adding one Queen to the nth row of a board on which we have already safely placed
n − 1 Queens in the first n − 1 rows.
So we begin by defining:

1 The proof is left to the interested reader.


type Solution = [Int]

queens :: Int -> [Solution]

queens n = place n n

place :: Int -> Int -> [Solution]

place _ 0 = [[]]
place size n = ??

Program 5.5: Starting n Queens

Now we must fill out the ??. Of course we will use a list comprehension. As we are writing a recursive
function we expect to use place size (n - 1) :: [Solution] in one of the generators. The
solutions to place size (n - 1) are the ways to place n − 1 Queens safely in the first n − 1 rows.
If we pick one of these solutions, and pick a value for the column to put the nth Queen in, and show
that this is a safe arrangement then we have solved the problem. So now we are working along these
lines:

place size n = [p ++ [q] | p <- place size (n - 1),
                           q <- [1 .. size],
                           safe p q]

Program 5.6: Developing n Queens

We still need to explain what safe is. If p has length l then we are placing a Queen on column q
in row l + 1. So we must check that the co-ordinates (l + 1, q) do not clash with the co-ordinates of
any of the existing Queens. Remember p is a list of the columns, and the ith item in p is in row i.
There is a function in the standard prelude zip :: [a] -> [b] -> [(a, b)] which we can use
to generate the list of co-ordinates implicit in p. The list we are interested in is simply:

zip [1..] p

If we have two pairs of co-ordinates (r1 , c1 ) and (r2 , c2 ) then the check for safety is that:

• the rows differ,

• the columns differ,

• the positive diagonals (r + c) differ, and

• the negative diagonals (r − c) differ.

We know that the rows differ, so we only need to check the other three conditions. In Haskell we can
express this as:

check :: Num a => (a,a) -> (a,a) -> Bool

check (i,j) (m,n) =
  not (j==n || (i+j==m+n) || (i-j==m-n))

Program 5.7: A safety check

We need to perform the check for all the co-ordinates we have generated, so safe is defined as:


safe :: Solution -> Int -> Bool

safe p n =
and [ check (i,j) (m,n) | (i,j) <- zip [1..] p ]
where m = 1 + length p

Program 5.8: A complete check

The function and :: [Bool] -> Bool is from the standard prelude, and is defined as reduce
(&&) True.
The whole program is:

type Solution = [Int]

queens :: Int -> [Solution]

queens n = place n n

place :: Int -> Int -> [Solution]

place _ 0 = [[]]
place size n = [p ++ [q] | p <- place size (n - 1),
q <- [1 .. size],
safe p q]

safe :: Solution -> Int -> Bool

safe p n =
and [ check (i,j) (m,n) | (i,j) <- zip [1..] p ]
where m = 1 + length p

check :: Num a => (a,a) -> (a,a) -> Bool

check (i,j) (m,n) =
  not (j==n || (i+j==m+n) || (i-j==m-n))

Program 5.9: n Queens
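As an illustrative check, the 4 × 4 board admits exactly two safe arrangements, and we would expect something like:

Gofer?
queens 4
[[2, 4, 1, 3], [3, 1, 4, 2]] :: [Solution]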

Working through the development of this algorithm may suggest a change in our representation
of the solutions. We used a very irredundant representation of a solution, and we generated the co-
ordinates when we needed them. We might instead decide to trade some space off against time and
use a quadruple (r, c, r + c, r − c) to represent the position of each Queen. Some careful analysis is
needed here. [7] discusses the n Queens problem in Prolog at some length.

5.4 Summary
In this Chapter we have introduced list comprehensions. They provide us with a powerful and succinct
way to describe lists. They operate using a ‘generate-and-test’ mechanism.
We will see more examples of list comprehensions as we proceed.


Questions
1. Define filter using list comprehensions.
2. Define map using list comprehensions.
3. Program n Queens without using list comprehensions.

6 Case study: searching a graph
Graphs turn up all over the place. One simple and appealing problem where graphs come to our aid
is the following, from Programming in Prolog by Chris Mellish and William F. Clocksin [12]:

It is a dark and stormy night. As you drive slowly down a lonely country road, your car
breaks down, and you stop in front of a splendid palace. You go to the door, find it open,
and begin looking for a telephone. How do you search the palace without getting lost, and
know that you have searched every room? Also, what is the shortest path to the telephone?
It is just for such emergencies that graph searching methods have been devised.

Before we can get to the stage of writing a solution to this problem we have to look at how we will
represent a graph. We expect to represent a graph as a relation, where a relation will be represented
as a set of pairs. So we will have to look at how we can represent a set. Sets, of course, are best
represented using an abstract data type, so we will have to look at how Haskell handles ADT’s. In
order properly to discuss ADT’s we need to say something about algebraic types too. So our starting
point is a long way away from the graph search problem.

6.1 Summary
In order to complete the task of this Chapter we have to take a major detour through the type system
that Haskell provides us with. We will return to our initial problem in § 8.5 on page 56.

7 Types
Typing is a Good Thing. Our main concern is that the type system offers benefits for the programmer.
For example, an attempt to take the head of an integer can be spotted as a type error at compile time.
Early detection of errors makes them easier to fix. However we should also be aware that the type
system also provides the compiler with information. For example, in the standard reference on the C
programming language [44] types are first mentioned on p. 9, and, at the first mention of types, we
are told how many bytes are to be allocated for each of the basic types. Later on (p. 42) we are told
that:
A char is just a small integer, so chars may be freely used in arithmetic expressions.
There is clearly a tension here. Knowing that a letter and a number are both stored in the same amount
of space may allow the programmer to perform some cute tricks, but code full of cute tricks is hard
to understand and rarely clear, or easily modifiable. We follow Paulson’s advice quoted in § 5.2 on
page 27, and leave the information for the compiler implicit. It is the compiler writer’s job to exploit
this information, not the programmer’s.

7.1 Hindley-Milner, and parametric polymorphism


Haskell has a higher-order, polymorphic type system, based on a system of type assignment described
by Roger Hindley [37] and later Robin Milner [56]. Hindley’s work was unknown to Milner. Hindley
was proving a mathematical result about a system of combinatory logic, Milner was working on the
design of ML. Polymorphism allows us to use type variables. A type like a -> a is implicitly universally
quantified over types. This is similar to the situation we have in logic where A, B and so on are
propositional variables, and a formula like A ⊃ A is implicitly universally quantified over propositions.1
The Hindley-Milner type system has some useful features. Consider the identity function. In the λ-
calculus:
λx.x
or in Haskell:

id x = x

Program 7.1: The identity function id

The identity function has type:


• Int -> Int
• Float -> Float
• Array(BinTree(Char)) -> Array(BinTree(Char))
• Term -> Term
• ...
Id has an infinity of types. All these types are instances of the type:

α→α

where α is a variable which ranges over types. This is the principal type of the identity function. In the
Hindley-Milner system:
1 The analogy between propositions and types is enough to make a career out of.


1. every typable term has a principal type;


2. principal types, being principal, are unique.
Furthermore, and this is the useful bit, given any term:
1. it is possible to say whether it is typable;
2. if the term is typable, it is possible to give the principal type.
Thus we can implement a type inference or principal type algorithm. [38] gives an excellent presentation of the principal type algorithm, which we will look at in § 11.1 on page 93. From the
programmer’s point of view type inference with polymorphic typing gives the security of a type system
with much of the flexibility of an un-typed language.
In Hindley-Milner, as all type variables are implicitly universally quantified, we omit the quantifier.
Hence we write:
α→α
instead of:
∀α.α → α
In Haskell we write a -> a. Notice that, just as:
∀α.α → α
and:
∀σ.σ → σ
are indistinguishable, so are a -> a and s -> s.
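We can watch type inference at work in an interpreter session: Hugs reports the inferred principal type of an expression with the :t command. Illustratively:

Gofer?
:t \x -> x
\x -> x :: a -> a

Gofer?
:t map
map :: (a -> b) -> [a] -> [b]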

7.2 Extensions to Hindley-Milner


The parametric polymorphism that we have discussed so far is based on universal quantification at the
level of types. The other form of polymorphism that we see is often called ad hoc polymorphism, or
overloading. With overloading it can be argued that we have foolishly used one name to stand for two
different functions. We use different names for:
• the function which reverses a list, and
• the function which finds the length of a list.
Why should we not use different names for:
• the function which adds two integers,
• the function which adds two floating point numbers, and
• the function which forms the resultant of two vectors?
Although there need be no similarity between the algorithms involved there are surely some sim-
ilarities between the properties of the functions involved. The use of type classes lets us express the
similarities, while hiding the details of the differences. We can, if we are logically minded, think of type
classes in terms of existential quantification over types. The other situation where we want to perform
some information-hiding involving types is with ADT’s. We can think of ADT’s in terms of existential
quantification over types, too.
The restrictions that are imposed on the type system of Haskell are essentially there to permit type
inference. We can imagine more expressive type systems, but in these systems type inference may
become impractical or impossible. This is just like the situation we find in logic, where we have decision
procedures for some logics, and where we know that no decision procedure can exist for some other
logics. This is not an accidental correspondence, but a consequence of thinking of propositions and types as being related.
After this rather abstract description of typing, we look at the particulars of the type system of Haskell.
In Haskell the names of types start with a capital.


7.3 Basic types


Haskell has a number of basic types:

• Int integers . . . , -1, 0, 1, . . .

• Float floating point numbers e.g. 3.14159, 2.0.

• Char characters e.g. ’a’, ’A’, ’0’. The most recent versions of Haskell (are supposed to) use
Unicode.

• Bool the type with the two values True and False in it.

• other numeric types Rational, Integer,. . .

Lists and tuples are also basic to Haskell. Lists behave as if they were defined by Program 2.1 on
page 5. The concrete syntax that Haskell uses for a list and for the type of lists is the same. Out of
context it is not clear whether [a] is the type of lists of objects of type a, or whether it is the singleton
list containing the variable a. So an alternative way to express Program 2.1 would be:

data [a] = []
         | a : [a]

Program 7.2: How lists might have been defined.

The concrete syntax for tuples has the same property, and tuples behave as if they were defined by:

data (a1, ..., an) = (a1, ..., an)

Program 7.3: How tuples might have been defined.

There are lots of functions defined on the basic types in the standard prelude.

7.4 Type synonyms


We can give a name to an existing type. For example, the type of strings behaves as if it was defined
as:

type String = [Char]

Program 7.4: How strings might have been defined.

Notice that we start a type synonym off with type, rather than data.
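As a sketch of our own (the names Point, Path and distance are not from the notes), synonyms can make signatures more readable:

type Point = (Float, Float)
type Path  = [Point]

-- Euclidean distance between two points
distance :: Point -> Point -> Float
distance (x1, y1) (x2, y2) = sqrt ((x2 - x1)^2 + (y2 - y1)^2)

A synonym introduces no new values: a Point simply is a (Float, Float), and the two can be used interchangeably.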

7.5 Classes
We have already mentioned type classes, which, to borrow a phrase from Phil Wadler and Stephen
Blott [84], let us make ad hoc polymorphism less ad hoc. A class is a collection of types. A type which
is in a class is called an instance of that class. Classes can be derived from other classes. The derived
class inherits operations. There are some basic classes, and we can also define type classes ourselves.
The basic classes include:

• Eq

• Ord

• Enum


• Bounded
• Show
• Read
• Num
• Fractional

7.5.1 Declaring a class


We declare a class by giving a signature. For example the class Eq can be defined as:

class Eq a where
  (==), (/=) :: a -> a -> Bool
  x /= y = not (x == y)
  x == y = not (x /= y)

Program 7.5: How the equality class might have been defined.

The definitions of == and /= are the default definitions. Any instance of the class will pick up the
default definitions, unless these are over-ridden. For example Bool is an instance of the Eq class:

instance Eq Bool where
  True  == True  = True
  False == False = True
  _     == _     = False

Program 7.6: Bool as an instance of Eq

The instance definition of == over-rides the default definition, and the default definition defines /=
for Bool.
There is nothing to stop us from defining, for example:

instance Eq Int where
  n == m = even (n * m)

Program 7.7: Int as a silly instance of Eq

Haskell does not check that our definition of == is sensible: that is up to us.
The class Ord is a derived class. It is defined as:

class Eq a => Ord a where
  (<), (<=), (>), (>=) :: a -> a -> Bool
  max, min             :: a -> a -> a

  x < y  = x <= y && x /= y
  x >= y = y <= x
  x > y  = y < x

  max x y | x >= y = x
          | y >= x = y
  min x y | x <= y = x
          | y <= x = y

Program 7.8: Ord as a defined class.


We can define Bool as an instance of Ord:

instance Ord Bool where
  False <= x = True
  True  <= x = x

Program 7.9: Bool as an instance of Ord

7.5.2 Section summary


In this section we have seen Haskell’s type classes. Type classes were an innovative feature of Haskell.
We can think of them as:

• making ad hoc polymorphism less ad hoc;

• giving us some features of object-orientation;

• providing more support for large-scale programming;

• adding some existential typing to the Hindley-Milner type system.

7.6 Algebraic types


Haskell, in common with other modern, functional programming languages, allows us to define al-
gebraic types. When we define an algebraic type we explain how to construct the values of the type.
Given this description we can explain the general form which functions over the type take.
The syntax is that we start with the keyword data then give the name of the type (which must start with
a capital), then a single equals sign, then the patterns for the values of the types, in a list separated by
vertical bars. Each pattern is a constructor (which must start with a capital), followed by some (perhaps
none) type names. Notice that, just as we overused the list and tuple brackets, we can use the same
name for a type and a constructor:

data Foo = Foo

Program 7.10: Overusing names

7.6.1 Enumerated types


As a more sensible example, we might define the days of the week as:

data DaysOfTheWeek = Monday
                   | Tuesday
                   | Wednesday
                   | Thursday
                   | Friday
                   | Saturday
                   | Sunday

Program 7.11: Days of the week

DaysOfTheWeek is an enumerated type.


To write a function over DaysOfTheWeek we will typically supply seven clauses, one clause to match
each constructor. For example:


workday :: DaysOfTheWeek -> Bool

workday Monday = True


workday Tuesday = True
workday Wednesday = True
workday Thursday = True
workday Friday = True
workday Saturday = False
workday Sunday = False

Program 7.12: Working days

Of course, we can use the wildcard pattern _ to cut out some clauses.
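For instance, a sketch (the name workday' is ours) of the same function using the wildcard:

workday' :: DaysOfTheWeek -> Bool
workday' Saturday = False
workday' Sunday   = False
workday' _        = True

The clauses are tried in order, so the wildcard only catches the five working days.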

7.6.2 Algebraic types and type classes


There is an obvious way in which we can make each algebraic type an instance of the Eq class. Two
values are equal if they have the same constructor and the same sub-parts. We can tell Haskell that
this is all we need to consider for the equality on the type. We can do this with other classes as well.
For example, we can re-define DaysOfTheWeek as:

data DaysOfTheWeek = Monday
                   | Tuesday
                   | Wednesday
                   | Thursday
                   | Friday
                   | Saturday
                   | Sunday
                   deriving (Eq, Ord, Enum, Show, Read)

Program 7.13: Days of the week, deriving class membership

And now we can do:2


Main> [Monday .. Friday]
[Monday,Tuesday,Wednesday,Thursday,Friday]
Main> Wednesday == Friday
False
Main> Wednesday < Friday
True

7.6.3 Tagged Unions


We can describe tagged unions using algebraic types. We can use tags to ‘inject’ values of a number
of different types into another type. For example we can define a type Either a b, whose values are
either values of a or of b (along with a tag to say which):

data Either a b = Inl a
                | Inr b

Program 7.14: The Either type

The Either type is rather general. As a more concrete example, suppose we were trying to represent
shapes. We might define a type like:3
2 From a session with HUGS on the SMCS Unix system.
3 As indeed Thompson does on p246 of [77].


data Shape = Circle Float
           | Rectangle Float Float
           deriving (Eq, Ord, Show, Read)

Program 7.15: The Shape type

We can write functions like these:

isRound :: Shape -> Bool

isRound (Circle _)      = True
isRound (Rectangle _ _) = False

area :: Shape -> Float

area (Circle r)      = pi * r * r
area (Rectangle l b) = l * b

Program 7.16: Some functions on shapes

We can think of the two constructors Circle and Rectangle as functions of types Float ->
Shape and Float -> Float -> Shape, respectively.
Let’s return to Either. We can think of Inl and Inr as constructor functions too. Inl has type a ->
Either a b and Inr has type b -> Either a b.
We can write a higher-order function to make use of a value of type Either a b. This function will
take two auxiliary functions, of types a -> c and b -> c, and will apply the appropriate one.

when :: (a -> c) -> (b -> c) -> (Either a b) -> c

when d _ (Inl x) = d x
when _ e (Inr y) = e y

Program 7.17: when
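Here is one way when might be used, a sketch of ours (the function render and its type are assumptions, not from the notes):

render :: Either Int Bool -> String
render = when (\n -> "the number " ++ show n)
              (\b -> "the boolean " ++ show b)

-- render (Inl 3)    gives "the number 3"
-- render (Inr True) gives "the boolean True"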

You may want to compare Inl, Inr, and when with the natural deduction rules for ∨ introduction
and elimination:

  A
—————— ∨ I left
A ∨ B

Rule 7.1: ∨ Introduction left

  B
—————— ∨ I right
A ∨ B

Rule 7.2: ∨ Introduction right

           [A]  [B]
            ·    ·
            ·    ·
A ∨ B       C    C
—————————————————— ∨ E
         C

Rule 7.3: ∨ Elimination

One tagged union type which has a lot of uses is Maybe a, defined as follows:


data Maybe a = Error
             | Ok a

Program 7.18: The Maybe type

A total function from α to β is defined for all values in α. A partial function from α to β is undefined
for some values in α. We can use Maybe to turn partial functions into total ones. Given a partial
function, f , from α to β, we define a total function, g, from α to Maybe β, which co-incides with f
wherever f is defined, and has the value Error elsewhere. For example:

totalhead :: [a] -> Maybe a

totalhead [] = Error
totalhead l  = Ok (head l)

totalfac :: Int -> Maybe Int

totalfac n
  | n < 0     = Error
  | otherwise = Ok (fac n)

fac :: Int -> Int

fac 0 = 1
fac n = n * fac (n - 1)

Program 7.19: Two total functions

Now we get this behaviour:

Gofer?
fac (-3)
Error: the Macintosh stack collided with the heap
Gofer?
totalfac (-3)
Error :: Maybe Int
(9 reductions, 16 cells)
Gofer?
fac 3
6 :: Int
(11 reductions, 17 cells)
Gofer?
totalfac 3
Ok 6 :: Maybe Int
(18 reductions, 28 cells)

One important application of the totalising nature of Maybe is in error handling. We have already
seen this in § 3.3.1 on page 17, and Question 9 on page 43 relates to this. We will see a related
technique using monads in § 18.3.4 on page 158.

7.6.4 Inductively defined algebraic types


We can define algebraic types using induction. We have already seen examples of lists and trees.
Another simple inductively defined type is the type of natural numbers:


data Nat = Zero
         | Succ Nat

Program 7.20: The natural numbers

The two constructors of Nat are Zero and Succ:

• Zero is nullary and lets us make a natural number from nothing at all;

• Succ is unary and lets us construct a new Nat from an existing one.

In general, to write a function on the natural numbers we will write two clauses, one saying what to
do with Zero, and one saying what to do with Succ n, where n :: Nat. Such functions will look like:

natfun :: Nat -> Sometype

natfun Zero = xxx


natfun (Succ n) = yyy

Program 7.21: A function on the natural numbers

An immediate question is: what can we use to fill in xxx and yyy? In both xxx and yyy natfun
itself is in scope, so we can write recursive functions like:

nat2str :: Nat -> String

nat2str Zero = "0"


nat2str (Succ n) = "s(" ++ (nat2str n) ++ ")"

Program 7.22: One conversion of a Nat to a String

The function for natural numbers which is analogous to listrec (Program 3.3 on page 16) is
natrec:

natrec :: a -> (Nat -> a -> a) -> Nat -> a

natrec d _ Zero = d
natrec d e (Succ n) = e n (natrec d e n)

Program 7.23: natrec
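To see natrec at work, here are two sketches of ours (the names addNat and nat2str' are not from the notes):

-- addition, by structural recursion on the second argument
addNat :: Nat -> Nat -> Nat
addNat m = natrec m (\_ r -> Succ r)

-- nat2str, rewritten via natrec
nat2str' :: Nat -> String
nat2str' = natrec "0" (\_ r -> "s(" ++ r ++ ")")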

The function nat2str follows the pattern described by natrec. Suppose we are using natrec to
define a function f. When defining f (Succ n) we are only allowed to use the value of f at n. This
pattern of recursion is called primitive or structural recursion.4 Haskell has a more liberal notion than
structural recursion. When we are defining f (Succ n) we can use the value of f at any number
whatsoever. So we can define functions like:

natbomb :: Nat -> String

natbomb Zero = "0"
natbomb (Succ n) =
  "s(" ++ (natbomb (Succ (Succ n))) ++ ")"

Program 7.24: natbomb


4 Primitive recursion is used when only the natural numbers are being discussed; structural recursion is used when lists, trees and so on are also being discussed. The use of the term ‘primitive recursion’ follows [45], apparently.


Functions defined using structural recursion have pleasant termination properties: functions like
natbomb don’t. Why then do we allow this unlimited notion of recursion? As always you get nothing
for nothing. There are functions on the natural numbers that are clearly computable which cannot
be computed using (just) structural recursion on the natural numbers. The most famous of these are
the Ackermann functions. The original Ackermann function was constructed to show that the primitive
recursive functions are a proper subset of the computable functions. This function is clearly computable
but is not definable using primitive recursion.5 We have also seen other useful functions which are not
structurally recursive, such as quick from § 5.2 on page 26 and maximise and minimise from
Program 2.16 on page 10. The pleasant termination properties of structurally recursive functions come
from the fact that recursive calls are on values lower in the natural order generated by the inductive
definition of the type. In order to show termination (ignoring laziness, for the moment) for the more
general recursion that Haskell makes available we have to find some well-founded ordering such that
recursive calls are below the initial one in this order. In the case of quicksort this ordering is the length
of the list on which the recursive calls are made.
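For reference, here is one common presentation of the Ackermann function, written as a sketch in Haskell (ours; the notes do not give a definition). Each recursive call is smaller in the lexicographic ordering on the pair of arguments, which is the well-founded ordering that justifies termination:

ack :: Integer -> Integer -> Integer
ack 0 n = n + 1
ack m 0 = ack (m - 1) 1
ack m n = ack (m - 1) (ack m (n - 1))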

Important point
The crucial point is that there is a very close relationship between types defined by induction and
functions defined by recursion.

Mutual induction, mutual recursion


We can have mutually inductive types, and mutually recursive functions. One slightly contrived example
is of the odd and even numbers. An even number is either zero or the successor of an odd number,
and an odd number is the successor of an even number. So we get:

data Even = Zero
          | Esucc Odd

data Odd = Osucc Even

Program 7.25: Odd and Even, using mutual induction

Functions written on Odd and Even will, typically, be mutually recursive. For example:

twiceodd  :: Odd  -> Even
twiceeven :: Even -> Even

twiceodd (Osucc even) = Esucc (Osucc (twiceeven even))

twiceeven Zero        = Zero
twiceeven (Esucc odd) = Esucc (Osucc (twiceodd odd))

Program 7.26: Double odd and even, using mutual recursion
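As a further sketch (ours, not from the notes), conversions to Int show the same mutually recursive shape:

evenToInt :: Even -> Int
oddToInt  :: Odd  -> Int

evenToInt Zero      = 0
evenToInt (Esucc o) = 1 + oddToInt o

oddToInt (Osucc e) = 1 + evenToInt e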

7.7 Summary
In this Chapter we have looked at:
• parametric polymorphism;
• ad hoc polymorphism, and type classes;

• algebraic data types.

5 [63] goes into much more detail about this sort of stuff.


Questions
1. Draw an inheritance diagram for the basic classes.
2. What are the merits and demerits of the use of the notation [a] both for the type of lists and for
particular lists?
3. Investigate the relationship between Haskell’s type classes and object-oriented programming.
4. What are the rules for defining classes, properly?
5. Define types to represent:
a) arithmetic expressions, e.g. 5 + 6, 4 × −6, 2 × (3 + 8);
b) Prolog terms, e.g. X, x, f(x, X, g(h));
c) logical expressions, e.g. A ∧ B, A ⊃ (B ⊃ ¬(C ∨ A)).
6. Define oddrec and evenrec, the structural recursion operators for Odd and Even in Program
7.25. What observation can you make about oddrec and evenrec?
7. Define a type for Prolog terms using mutual recursion.

8. Given these definitions:

type Age   = Int
type FName = String
type SName = String

data Major = COMP
           | MATH
           | STAT
           | OPRE
           | LOCO

Program 7.27: Some definitions

We can define a type to represent students as:

type Student = (Age, FName, SName, Major)

Program 7.28: One version of Student

or as:

data Student = Student Age FName SName Major

Program 7.29: Another version of Student

Which is preferable, and why?


9. If you are trying to report errors to a person, the simple report that there is an error is usually not
very useful. Modify the Maybe type in Program 7.18 on page 40 to allow you to report errors
more informatively.
10. Write a function adder :: Maybe Int -> Maybe Int -> Maybe Int which performs
addition on Maybe Int values.

8 Abstract data types
There are various more-or-less equivalent definitions of ADT’s in the literature:

• An ADT is ‘a type, and a collection of functions on that type’ [59];

• ‘An abstract type consists of the type name, the signature of the type, and the implementation
equations for the names in the signature’ [77];

• ‘An abstract data type consists of a type together with an explicit set of operations, which are the
only means by which to compute with values of that type.’ [61].

Implementing ADT’s requires us to hide some information, specifically exactly how the ADT is realized. In this respect ADT’s are like type classes, and one of the formal approaches to ADT’s involves describing them using existentially quantified types. [8] discusses this further, although not in a specifically functional setting. In Haskell we achieve the information hiding by using modules.

8.1 Modules
Haskell has a module system. Using modules allows us to keep local information local, and only export
what should be exported. A module has an interface which tells us what is exported from the module,
that is which parts of it another module can import. The module system of Haskell gives us control over
what is imported and exported.

8.2 Queue ADT using modules


As a simple example of the implementation of an ADT we look at queues. A queue is a ‘first in, first
out’ list. We need to be able to:

• create an empty queue;

• add a new item to the queue;

• remove an item from the queue;

• test a queue to check whether it is empty or not.

We call the module that we are creating Queue, and we will define, and export:

• a type Queue;

• emptyQ :: Queue a

• addQ :: a -> Queue a -> Queue a

• remQ :: Queue a -> (a, Queue a)

• isEmptyQ :: Queue a -> Bool

We start out like this:


module Queue
( Queue,
emptyQ,
isEmptyQ,
addQ,
remQ
) where
????

Program 8.1: Starting to define a queue ADT

This tells us that the module Queue defines (and exports) a type Queue and functions emptyQ, addQ,
remQ, isEmptyQ. There is no information here about the types of these functions, nor about their
properties. We adopt the convention used in [77] and add the types as comments:

module Queue
( Queue,
emptyQ, -- Queue a
isEmptyQ, -- Queue a -> Bool
addQ, -- a -> Queue a -> Queue a
remQ      -- Queue a -> (a, Queue a)
) where
????

Program 8.2: Continuing to define a queue ADT

Now we must fill in the ???? with the details of the implementation. Our first choice is which type
we will use to represent queues. Lists look like a good bet so we declare:

newtype Queue a = Qu [a]

Program 8.3: Queue a using newtype

The keyword newtype is a new one. Program 8.3 has exactly the same meaning as:

data Queue a = Qu [a]

Program 8.4: Queue a using data

but, for a one-constructor type, newtype is implemented in a more efficient manner.


Now we must give definitions of the functions we are exporting. EmptyQ and isEmptyQ involve no
real decision:

emptyQ = Qu []

isEmptyQ (Qu []) = True


isEmptyQ _ = False

Program 8.5: EmptyQ and isEmptyQ

To define addQ and remQ we have to make a decision. Do we add elements to the head or the tail
of the list? Suppose we decide to add them at the tail. Then we would define:


addQ a (Qu q) = Qu (q ++ [a])

remQ q@(Qu xs)
  | not (isEmptyQ q) = (head xs, Qu (tail xs))
  | otherwise        = error "remQ"

Program 8.6: addQ and remQ

There are two more novelties in this:


• the pattern q@(Qu xs) allows us to use q as a name for Qu xs;
• error is a standard function which allows us to handle errors.
The whole ADT is given in Program 8.7.

module Queue
  ( Queue,
    emptyQ,   -- Queue a
    isEmptyQ, -- Queue a -> Bool
    addQ,     -- a -> Queue a -> Queue a
    remQ      -- Queue a -> (a, Queue a)
  ) where

newtype Queue a = Qu [a]

emptyQ = Qu []

isEmptyQ (Qu []) = True
isEmptyQ _       = False

addQ a (Qu q) = Qu (q ++ [a])

remQ q@(Qu xs)
  | not (isEmptyQ q) = (head xs, Qu (tail xs))
  | otherwise        = error "remQ"

Program 8.7: A queue ADT

Now we can do things like:


Queue> isEmptyQ emptyQ
True
and attempt things like:
Queue> addQ 6 emptyQ
ERROR: Cannot find "show" function for:
*** Expression : addQ 6 emptyQ
*** Of type : Queue Integer
We need to make Queue an instance of the Show class, which we can do by adding:

instance Show a => Show (Queue a) where
  show (Qu l) = show l

Program 8.8: Showing a queue

Now we can do:


Queue> addQ 6 emptyQ


[6]

There is one minor problem with the interaction between type classes and modules. We did not
declare show to be exported, but it is now available everywhere. Although show is not likely to cause
us problems it is conceivable that the class system could allow details of the concrete representation to
leak out, with unwanted consequences.
One option that we might have taken would be to use a data declaration and define:

data Queue a = Qu [a]
               deriving Show

Program 8.9: Queue a using data

If we make this change we get this behaviour:

Queue> addQ 6 emptyQ


Qu [6]

which is slightly different.

8.2.1 Altering our definitions

The definitions of addQ and remQ given in Program 8.6 were based on a decision about where to add
and remove from the queue. Had we made the other choice we would have defined:

addQ a (Qu q) = Qu (a:q)

remQ q@(Qu xs)
  | not (isEmptyQ q) = (last xs, Qu (init xs))
  | otherwise        = error "remQ"

Program 8.10: addQ and remQ

So, what difference does this make? As far as the equations which hold about adding and removing
items from queues go there is no difference at all. However, in Program 8.6 it is cheap to remove
items and expensive to add them; whereas in Program 8.10 it is cheap to add items and expensive
to remove them. Part of the point of having ADT’s is so that we can find better representations of our
datatypes and use them, without having to change our code globally. Can we find a representation of
queues in which it is cheap both to add and remove items? If we use a pair of lists, one, representing
the back of the queue, for adding to, and one, representing the front of the queue, for removing from,
then adding and removing will both be cheap, most of the time. Sometimes we will try to remove an
item from a queue which has a back, but no front, and this operation will be expensive. However this
operation will be no more expensive than the expensive operation in the naïve representation, and will
be performed a lot less often, and so overall we get a benefit. Our new implementation will now look
like:


data Queue a = Qu [a] [a]

emptyQ = Qu [] []

isEmptyQ (Qu [] []) = True
isEmptyQ _          = False

addQ x (Qu xs ys) = Qu xs (x : ys)

remQ (Qu (x:xs) ys) = (x, Qu xs ys)
remQ (Qu [] [])     = error "remQ"
remQ (Qu [] ys)     = remQ (Qu (reverse ys) [])

Program 8.11: Less naïve queues

[59] gives a very thorough treatment of issues relating to purely functional data structures, particularly
in terms of their efficiency, and is well-worth further study.

8.2.2 Equality on ADT’s


One interesting issue is raised by the question of when two queues are equal. In the naïve
representation two queues were the same when they were Qu l and Qu m and the two lists l and m
were the same list. For the two list representation of Program 8.11 this need no longer be the case. For
example Qu [1,2,3] [] and Qu [] [3,2,1] represent the same queue, even though they are
clearly made from different lists. These two queues have the same behaviour when we remove items
from them. In this sense we can think of queues (and other ADT’s) as being defined, not by how they
were assembled, but by how they may be decomposed.
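If we wanted Haskell’s == to agree with this behavioural view, we could say so explicitly; a sketch of ours, assuming the two-list representation of Program 8.11:

instance Eq a => Eq (Queue a) where
  Qu f1 b1 == Qu f2 b2 = f1 ++ reverse b1 == f2 ++ reverse b2

On this definition Qu [1,2,3] [] == Qu [] [3,2,1] holds, as we would want.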

8.3 A set ADT


As a longer example we look at a set ADT. A set is a collection of items (all of the same type, as far as
we are concerned). We do not care how many copies of each item in the set there are in the set, nor
do we care what order the items in the set occur in.
We shall follow the implementation of sets given in [77], but we will simplify somewhat. Thompson
chooses to define sets of items on which an order is defined. Doing this simplifies considerably some
of the implementations of the set operations, but it would appear to be a leaking of information about
the representation. There are also operations available on sets of items from an ordered type which are
not available on sets of items from a type which does not have an ordering.
The operations that we need to define include:

• set intersection;

• set union;

• set difference;

• membership test;

• equality test;

• forming the empty set;

• adding an item to a set;

• counting the items in a set;

• showing a set.


So, the signature for the sets that we are defining will look like:

module Set
(Set,
empty, -- Set a
addSet, -- Ord a => a -> Set a -> Set a
memSet, -- Ord a => Set a -> a -> Bool
union, inter, diff, -- Ord a => Set a -> Set a -> Set a
eqSet, -- Eq a => Set a -> Set a -> Bool
showSet, -- (a -> String) -> Set a -> String
card -- Set a -> Int
)
where
????

Program 8.12: A signature for sets

There are a lot of other set operations that we might define in the ADT itself. For example we can
make a singleton set by inserting an item into the empty set; or we might decide that this is such a
commonly required operation that we should define it in the ADT.
Now we fill in the ???? in Program 8.12. We follow Thompson and choose to represent sets as
ordered lists, without repetitions. We will import the definitions from the List module, but since union
is already defined for lists we hide this definition. And we explain how Set a is an instance of Eq and
Ord. So we begin with:

import List hiding (union)

newtype Set a = SetI [a]

instance Eq a => Eq (Set a) where
  (==) = eqSet

instance Ord a => Ord (Set a) where
  (<=) = leqSet

Program 8.13: Starting to define sets

Next we start to define the operations. We start with empty and addSet:

empty :: Set a

empty = SetI []

addSet :: Ord a => a -> Set a -> Set a

addSet i (SetI s) = SetI (adds i s)

adds :: Ord a => a -> [a] -> [a]

adds i [] = [i]
adds i l@(h:t)
  | i < h     = i : l
  | i == h    = l
  | otherwise = h : (adds i t)
Program 8.14: Adding more definitions to the set ADT


This looks fine, but it might strike us that we could define a function to turn a list into a set, and define
both empty and addSet in terms of this. With the definitions we have, we have ensured that sets are
represented by ordered lists with no repetitions. Now we can get on with defining more operations:

memSet :: Ord a => Set a -> a -> Bool

memSet (SetI []) _ = False
memSet (SetI (h:t)) y
  | h < y     = memSet (SetI t) y
  | h == y    = True
  | otherwise = False

eqSet :: Eq a => Set a -> Set a -> Bool

eqSet (SetI l) (SetI m) = l == m

card :: Set a -> Int

card (SetI l) = length l

Program 8.15: And some more

Next we define set union:

union :: Ord a => Set a -> Set a -> Set a

union (SetI l) (SetI m) = SetI (uni l m)

uni :: Ord a => [a] -> [a] -> [a]

uni [] m = m
uni l [] = l
uni l@(h:t) m@(i:s)
  | h < i     = h : (uni t m)
  | h == i    = h : (uni t s)
  | otherwise = i : (uni l s)

Program 8.16: Set union

and intersection:

inter :: Ord a => Set a -> Set a -> Set a

inter (SetI l) (SetI m) = SetI (int l m)

int :: Ord a => [a] -> [a] -> [a]

int [] _ = []
int _ [] = []
int l@(h:t) m@(i:s)
  | h < i     = int t m
  | h == i    = h : (int t s)
  | otherwise = int l s

Program 8.17: Set intersection

And finally, we explain how to show a set:


showSet :: (a -> String) -> Set a -> String

showSet f (SetI s) = concat (map ((++"\n") . f) s)

Program 8.18: Showing a set
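For example (on our reading of the code, not a transcript from the notes): showSet show (addSet 2 (addSet 1 empty)) produces the string "1\n2\n", one element per line.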

8.4 Relations and graphs using the set ADT


Now we proceed to use the set ADT to define relations and graphs. The set ADT which Thompson gives
has the following signature:

module Set
(Set,
empty, -- Set a
sing, -- a -> Set a
memSet, -- Ord a => Set a -> a -> Bool
union, inter, diff, -- Ord a => Set a -> Set a -> Set a
eqSet, -- Eq a => Set a -> Set a -> Bool
subSet, -- Ord a => Set a -> Set a -> Bool
makeSet, -- Ord a => [a] -> Set a
mapSet, -- Ord b => (a -> b) -> Set a -> Set b
filterSet,-- (a -> Bool) -> Set a -> Set a
foldSet, -- (a -> a -> a) -> a -> Set a -> a
showSet, -- (a -> String) -> Set a -> String
card -- Set a -> Int
)
where
????

Program 8.19: The signature for Thompson’s sets

Completing this ADT is left as an exercise (Question 4 on page 57).

8.4.1 Relations
A binary relation on a set is a set of pairs. In Haskell we can express this as:

type Relation a = Set (a, a)

Program 8.20: Relations

The set operations are therefore available on relations.


There are various operations that we will want to define on relations themselves. We follow Thompson and look at some of these.

Image
The image of a value v under a relation R is the set of all values x such that R(v, x). How do we
construct this set? We must find the second item in all the pairs in the relation whose first item is v.
Recall that a relation is a set, so:

filterSet ((==v).fst) r

will find the set of pairs in r which have v as the left item. Now we just need to form the set of the
right items of these pairs. So, all we need is:


image :: Ord a => Relation a -> a -> Set a

image r v = mapSet snd (filterSet ((==v).fst) r)

Program 8.21: image

Set image
We can extend this notion to form images of sets. The image of a set is the set of the images of the
items in the set. We can compute this by forming the set of images (of type Set (Set a)), and then
forming the union of these.

setImage :: Ord a => Relation a -> Set a -> Set a

setImage r = unionSet . mapSet (image r)

unionSet :: Ord a => Set (Set a) -> Set a

unionSet = foldSet union empty

Program 8.22: setImage

Composition
Given two relations we can form their composition, which will itself be a relation. In set comprehension
notation the composition is given by:

{(x, y) | (∃z)(R₁(x, z) ∧ R₂(z, y))}

How do we compute this? Thompson does it by computing the set of all pairs of the pairs in the two
relations (i.e. pairs of pairs), then filtering out those which we are not interested in, and then taking
only the ‘outer’ items from the pair of pairs:

compose :: Ord a => Relation a -> Relation a -> Relation a

compose r1 r2 =
  mapSet outers (filterSet innerseq (setProduct r1 r2))
  where
    outers   ((x, _), (_, y)) = (x, y)
    innerseq ((_, w), (z, _)) = w == z

Program 8.23: composition

setProduct :: (Ord a, Ord b) => Set a -> Set b -> Set (a, b) takes two sets and computes the
set of pairs, where the first item in each pair is from the first set and the second is from the second set.
Completing this definition is left as an exercise (Question 6).

Transitive closure of a relation


If we have a relation we can compute its transitive closure. The transitive closure of a relation is formed
by composing it with itself, until we reach a relation which is not changed when composed with the
initial relation. This relation is transitive,1 and is a closure of the initial relation, and so is called the
transitive closure.
So, to compute a transitive closure we need to be able to compose relations, and to test when we
have reached the limit of this process. The limit function is a more general one and can be defined as
follows:
1 Prove this.


limit :: Eq a => (a -> a) -> a -> a

limit f x
  | x == (f x) = x
  | otherwise  = limit f (f x)

Program 8.24: limit
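For example (a sketch of ours, not from the notes): repeatedly halving a non-negative Int reaches the fixed point 0, since halving 0 gives 0 again:

halveToZero :: Int -> Int
halveToZero = limit (`div` 2)

-- halveToZero 100 gives 0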

The transitive closure of a relation is then the limit of composing the relation with itself:

transClos :: Ord a => Relation a -> Relation a

transClos r = limit extend r
  where extend s = s `union` (s `compose` r)

Program 8.25: transClos

8.4.2 Graphs
We can treat a relation as a description of a (directed) graph: the relation records which nodes have
arcs between them. Of course, as soon as we see a graph we are determined to search it. And as
soon as we think of searching we are not sure whether to choose depth- or breadth-first search. So we
follow Thompson and implement both. The type of both searches is the same:

breadthFirst :: Ord a => Relation a -> a -> [a]


depthFirst :: Ord a => Relation a -> a -> [a]

Program 8.26: The type of a search function

Both these functions take a graph and a node and return all the nodes reachable from the given
node. The difference lies in the ordering of the list returned.
Implementing the solution to this problem brings us up against the ‘abstraction barrier’: we are
going to make use of a function to listify a set. Thompson defines this (as an extension to the set ADT)
as:

flatten :: Set a -> [a]

flatten (SetI l) = l

Program 8.27: Listifying a set

This really is a case of the implementation details leaking out. However we will live with this.
For both search strategies we will make use of a function to find the unvisited (immediate) descen-
dants of a given node:

findDescs :: Ord a => Relation a -> [a] -> a -> [a]

findDescs r old here =
  flatten ((image r here) `diff` (makeSet old))

Program 8.28: Finding new descendants

Image r here computes the set of descendants of here. makeSet old is the set of nodes
already visited. Taking the difference of these two sets gives us the new nodes.


Breadth-first
Performing a breadth-first search is very like computing the transitive closure: we must simply build up
a list of visited nodes (in order, of course) until we reach a limit. Given a list of nodes reached after n
steps, the list of nodes at n + 1 steps will be this list plus those new nodes reachable from them. If r is the
relation and xs the list of nodes already visited then map (findDescs r xs) xs :: [[a]] is
the list of lists of new nodes. In the standard prelude concat :: [[a]] -> [a] is defined which
will concatenate a list of lists into one list. So, concat (map (findDescs r xs) xs) :: [a]
is the list of new nodes reachable from xs. As r is a graph, not a tree, it is possible that there will
be duplicates in the list. Again the standard prelude comes to our rescue: nub :: Eq a => [a] ->
[a] removes second and subsequent occurrences of an item from a list. Hence breadth-first search
can be implemented as:

breadthFirst :: Ord a => Relation a -> a -> [a]

breadthFirst r v =
  limit step [v]
  where step xs =
          xs ++ nub (concat (map (findDescs r xs) xs))

Program 8.29: Breadth-first search

Depth-first
Depth-first search is slightly less straightforward. We must carry a list of visited nodes around, and so
we define:

depthFirst :: Ord a => Relation a -> a -> [a]


depthSearch :: Ord a => Relation a -> a -> [a] -> [a]

depthFirst r x = depthSearch r x []

Program 8.30: Starting to define depth-first search.

The function depthSearch can be defined as:

depthSearch r x old =
  x : depthList r (findDescs r (x:old) x) (x:old)

Program 8.31: depthSearch

Where depthList will find all the descendants of a list of nodes. It is defined as:

depthList :: Ord a => Relation a -> [a] -> [a] -> [a]

depthList _ [] _ = []
depthList r (h:t) old =
  next ++ (depthList r t (old ++ next))
  where
    next = if elem h old
             then []
             else depthSearch r h old

Program 8.32: depthList

Notice that this definition uses mutual recursion.
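To check our understanding, a small worked example (ours; it assumes makeSet from Program 8.19). On our reading of the definitions, the searches should behave as the comments say:

g :: Relation Int
g = makeSet [(1,2), (1,3), (2,4), (3,4), (4,1)]

-- breadthFirst g 1 should give [1,2,3,4]: both neighbours of 1 are
-- visited before their common descendant 4
-- depthFirst g 1 should give [1,2,4,3]: the branch through 2 is
-- exhausted before 3 is visited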


8.5 Searching a graph using list comprehensions


Now we can solve the problem we set ourselves in Chapter 6, of searching a graph using list comprehensions. We are trying to construct the list of routes from one node to another. Each route will be a
list of nodes, so we are constructing a list of lists. We assume that the graph may be cyclic, so we need
to carry a list of already visited nodes with us, which we must avoid. The full definition of this function
is:

routes :: Ord a => Relation a -> a -> a -> [a] -> [[a]]

routes r from to been
  | from == to = [[from]]
  | otherwise  =
      [from:thru | via  <- nbhrs r from \\ been,
                   thru <- routes r via to (from:been)]

nbhrs :: Ord a => Relation a -> a -> [a]

nbhrs r x = flatten (image r x)

Program 8.33: Paths through a graph

8.6 Summary
In this Chapter we have looked at:
• the module mechanism of Haskell;
• how we can define ADT’s using this;
• examples of ADT’s;
• examples of the use of an ADT.


Questions
1. Define mkSet :: Ord a => [a] -> Set a, and define both empty and addSet in terms
of this.
2. Define diff, from Program 8.12. Diff a b is the set of items in a, but not b.
3. Define leqSet, from Program 8.13.
4. Define the remaining operations in Program 8.19. One of the types given is not as general as it
might be. Which one is it?
5. [From [77]] A binary search tree is an object of BinTree a (as defined in Program 2.25 on
page 13) whose elements are ordered. The empty tree is ordered. A tree Node v l r is
ordered if: all the values in l are smaller than v; all the values in r are greater than v; l and r
are themselves ordered. Implement a search tree ADT.
6. Complete the definition of compose from Program 8.23.
7. Define the reflexive and the symmetric closures of a relation.
8. Implement best-first search.

9. Compare the implementation of depth-first and breadth-first search given in §8.4.2 with those
given in [61], [7], and your favourite imperative programming textbook.
10. Compare the algorithm in § 8.5 with that given in [12].

9 Parser combinators
Parser combinators1 are discussed in a number of places, with tutorial introductions in [25, 42, 20].
[77] uses parser combinators as an example where laziness is exploited. We follow the development
given in [20]. The parser combinators that we define in this Chapter will allow us to construct recursive-descent parsers. [15] discusses applications of recursive-descent techniques to compiling in some
depth.

9.1 The type of parsers


Our initial notion of a parser is that a parser is a function from lists of symbols to parse trees. So our
first stab at the type of a parser is:

type Parser symbol result = [symbol] -> result

Program 9.1: A type for parsers

This type looks like a good start, but a moment’s reflection will convince us that it is not really
good enough. This type is really the type of a rather monolithic parser. It is better not to try to build
monolithic parsers, but to use the support we have for modularity to build parser components. What
sort of behaviour would we expect of a parsing component? It should make sense of what it can parse,
and leave the rest of the string for the next component to work on. The type of a function which does
this is:

type Parser symbol result = [symbol] -> ([symbol], result)

Program 9.2: A better type for parsers

A parser now takes a list of symbols and returns a pair consisting of a list of symbols and a parse
tree. Can we do better than this? Suppose there are a number of different ways in which an initial part
of a list of symbols can be understood by our parser. For example, we may have a rule which says
that a variable name is a sequence of lower-case letters. Then the string which begins "xyz" could
be the variable x followed by something, the variable xy followed by something, or the variable xyz
followed by something. One way to deal with situations like this is to use some sort of backtracking
search mechanism, where we come back to explore alternative solutions when the one we first picked
fails. Another way, rather than relying on failure, is to work with a list of successes. This technique
is described in [81]. We have already used it in § 8.5 on page 56 when we implemented depth-first
search using list comprehensions. Instead of producing just one possible pair we produce them all.
Then we have this type:

type Parser symbol result =
  [symbol] -> [([symbol], result)]

Program 9.3: An even better type for parsers

Now a parser is a function from lists of symbols to lists of (list of symbols, parse tree) pairs. In many
practical cases we are trying to parse strings, i.e. lists of Char, and so the type of parsers will usually
be String -> [(String, result)].
1 Remember that ‘combinator’ is just another word for ‘function’.


9.2 From grammar to parser


The following grammar is a grammar for expressions consisting of balanced brackets:

Expr −→ Lbra Expr Rbra Expr
     |  ε

Lbra −→ ‘(’
Rbra −→ ‘)’

Figure 9.1: A grammar for bracket expressions

Examples of such expressions are: (), ()()(), ((()())), ()()()(((((()())))())). We shall develop parser
combinators as we try to write a parser for this grammar.
The parser for Lbra will accept a (, that for Rbra a ). The parser for Expr will consist of a choice
between the ε parser and a sequence of parsers.

9.2.1 Simple parsers


We could write the following combinators to recognise left and right brackets:

lbr, rbr :: Parser Char Char

lbr ('(' : xs) = [(xs, '(')]
lbr _          = []

rbr (')' : xs) = [(xs, ')')]
rbr _          = []

Program 9.4: lbr and rbr

We would then get this behaviour:


Gofer?
lbr "(123"
[("123",’(’)] :: [([Char],Char)]
Gofer?
lbr "123"
[] :: [([Char],Char)]
The parsers lbr and rbr are clearly generalisable. We can define:

single_symbol :: Eq s => s -> Parser s s

single_symbol _ [] = []
single_symbol sym (h:t)
  | sym == h  = [(t, h)]
  | otherwise = []

Program 9.5: single_symbol

Now we get this behaviour:


Gofer?
(single_symbol ’(’ ) "(123"
[("123",’(’)] :: [([Char],Char)]


The parser single_symbol is clearly also generalisable to parse tokens larger than one symbol.
The parser token2 will recognise a token:

token :: Eq [s] => [s] -> Parser s [s]

token k xs
  | k == take n xs = [(drop n xs, k)]
  | otherwise      = []
  where n = length k

Program 9.6: token

Now we get this behaviour:


Gofer?
(token "xxx") "xxxx"
[("x","xxx")] :: [([Char],[Char])]
We can recognise a single symbol using token:

symbol :: Eq [s] => s -> Parser s [s]

symbol s = token [s]

Program 9.7: symbol

And we get this behaviour:


Gofer?
(symbol ’(’) "(123"
[("123","(")] :: [([Char],[Char])]
We can generalise single_symbol from Program 9.5 on the preceding page in another way. We
can write a parser which can recognise a symbol which satisfies some condition:

satisfy :: (s -> Bool) -> Parser s s

satisfy f (x:xs) = if (f x) then [(xs, x)] else []


satisfy f _ = []

Program 9.8: satisfy

We can define single_symbol using satisfy:

single_symbol :: Eq s => s -> Parser s s

single_symbol sym = satisfy (==sym)

Program 9.9: Defining single_symbol using satisfy

More importantly we can now define parsers like digt, a parser to recognise digits:

digt :: Parser Char Char

digt = satisfy isDigit

Program 9.10: digt


2 Notice that this is not the same as Thompson’s parser of the same name.


9.2.2 Combining parsers


To implement a parser for the grammar in Figure 9.1 on page 60 we clearly need to be able to write
functions to combine parsers. The first parser combining function that we write is the function which
lets us express selection between two alternatives:

infixr 4 <|>
(<|>) :: (Parser s a) -> (Parser s a) -> Parser s a
(p1 <|> p2) xs = p1 xs ++ p2 xs

Program 9.11: <|>

For convenience we define <|> as an infix operator. Thompson calls this operator alt. The parser
p1 <|> p2 finds all the parses that p1 finds and all the parses that p2 finds.
The second parser combining function that we define is <&>, which Thompson calls >*>.

infixr 6 <&>
(<&>) :: (Parser s a) -> (Parser s b) -> Parser s (a, b)
(p1 <&> p2) xs = [(xs2, (v1, v2)) | (xs1, v1) <- p1 xs,
                                    (xs2, v2) <- p2 xs1]

Program 9.12: <&>

Again, for convenience, we define <&> as an infix operator. The parser p1 <&> p2 finds all the
parses which can be obtained by using p2 after using p1.
We can build a parser which always fails:

fail :: Parser s r
fail _ = []

Program 9.13: fail

And we can build a parser which always succeeds:

succeed :: r -> Parser s r


succeed v xs = [(xs, v)]

Program 9.14: succeed

And we can use succeed to implement epsilon, a parser for the empty string:

epsilon = succeed []

Program 9.15: epsilon

9.2.3 Parsing balanced brackets


We can attempt to use <&> and <|> to define a parser for balanced brackets like:

bras = (lbr <&> bras <&> rbr) <&> bras
       <|> epsilon

Program 9.16: An attempt at a parser

However we soon run into a problem:


Gofer?
Reading script file "moreparsers.hs":
Parsing...Done
Dependency analysis...Done
Type checking...
ERROR "moreparsers.hs" (line 5): Type error in application
*** expression : (lbr <&> bras <&> rbr) <&> bras <|> epsilon
*** term : (lbr <&> bras <&> rbr) <&> bras
*** type : [Char] -> [([Char],((Char,(a,Char)),a))]
*** does not match : [b] -> [([b],[c])]

We have simply not thought about the types involved. We need to think carefully about the type of
parse trees that we are trying to construct. In this case we might define a type like:

data Tree = Nil | Node (Tree, Tree)

Program 9.17: The structure of a binary tree

We could then define a parser which constructed a tree like this for us. A moment’s reflection will show
us that we can generalise here. Suppose we were not interested in merely that the brackets balanced,
but also are interested in the depth of their nesting. We might implement a function to compute the
depth of nesting which parsed a string of brackets, and constructed a tree, and then traversed the tree
to compute the nesting. Whenever we construct a data structure and then take it to pieces we should
hear alarm bells ringing. Why put something together only to take it apart? Why not compute what is
required as the tree is constructed? What we need is a way to apply an arbitrary function to the output
of the parser, not just a tree constructing function. We can define a function to do just this for us, using
a list comprehension:

infixl 5 <@
(<@) :: (Parser s a) -> (a -> b) -> Parser s b

(p0 <@ f) xs = [(ys, f v) | (ys, v) <- p0 xs]

Program 9.18: Applying a function to the output of a parser

The function which we apply is sometimes called a semantic function.


Now we can define bras as:

bras :: Parser Char Tree

bras = ((lbr <&> bras) <&> rbr) <&> bras
       <@ (\(((_, a), _), b) -> Node(a, b))
       <|> succeed Nil

Program 9.19: A parser for balanced brackets

We have replaced epsilon <@ k Nil with succeed Nil, for brevity. We get this behaviour:

Gofer?
bras "()"
[([],Node (Nil,Nil)), ("()",Nil)] :: [([Char],Tree)]
(64 reductions, 172 cells)
Gofer?
bras "(())"
[([],Node (Node (Nil,Nil),Nil)), ("(())",Nil)]
:: [([Char],Tree)]
(109 reductions, 277 cells)


Gofer?
bras "()()"
[([],Node (Nil,Node (Nil,Nil))),
("()",Node (Nil,Nil)),
("()()",Nil)] :: [([Char],Tree)]
(118 reductions, 326 cells)
This is fine, as far as behaviour goes. If we look at the semantic function (\(((_, a), _), b)
-> Node(a, b)) we see that it involves a rather complicated pattern to pick out the things we are
interested in and ignore the rest. The structure of this pattern reflects the structure of the parser itself.
In larger examples this pattern may become very much more complicated, and in any case we have
written the structure of the parser out in the parser already, and it is surely foolish to write it out again
just to do some pattern matching. Instead we can choose to use <@ to allow us to ignore the ignorable
when it is made, and rewrite program 9.19 as:

bras :: Parser Char Tree

bras = ((lbr <&> bras <@ snd)
        <&> rbr <@ fst)
       <&> bras <@ (\(a, b) -> Node(a, b))
       <|> succeed Nil

Program 9.20: A neater parser for balanced brackets

The final semantic function is certainly neater: in fact it could just be Node, but the use of fst and
snd makes the parser more complicated than it might be. Hence we define two new operators:

infixr 6 <&
(<&) :: (Parser s a) -> (Parser s b) -> Parser s a
p <& q = p <&> q <@ fst

infixr 6 &>
(&>) :: (Parser s a) -> (Parser s b) -> Parser s b
p &> q = p <&> q <@ snd

Program 9.21: <& and &>

And now we can define:

bras :: Parser Char Tree

bras = (lbr &> bras <& rbr) <&> bras <@ Node
       <|> succeed Nil

Program 9.22: An even neater parser for balanced brackets

We claimed above that one of the advantages of <@ was that it would make it easy to implement
function which parsed a string of balanced brackets and computed the depth of the nesting:

nst :: Parser Char Int

nst = (lbr &> nst <& rbr) <&> nst
      <@ (\(x, y) -> max (x+1) y)
      <|> succeed 0

Program 9.23: Nesting of brackets


We get this behaviour:

Gofer?
nst "()()"
[([],1), ("()",1), ("()()",0)] :: [([Char],Int)]
(142 reductions, 317 cells)
Gofer?
nst "(())()"
[([],2), ("()",2), ("(())()",0)] :: [([Char],Int)]
(191 reductions, 408 cells)

We see immediately that nst and bras follow the same pattern, and whenever we see the same
pattern in two places we write a higher-order function to capture it:

foldparens :: ((a, a) -> a) -> a -> (Parser Char a)

foldparens f b = p
  where p = (lbr &> p <& rbr) <&> p <@ f
            <|> succeed b

Program 9.24: A higher-order function to parse and use balanced brackets

Now we can define bras and nst as:

bras = foldparens Node Nil

nst = foldparens (\(x, y) -> max (x+1) y) 0

Program 9.25: Using foldparens

9.2.4 More uses for <@


The operator <@ lets us do all sorts of fun things. For example we can use it to parse a char and
return an Int by adapting Program 9.10:

digit :: Parser Char Int

digit = satisfy isDigit <@ (\c -> (ord c - ord '0'))

Program 9.26: digit

In a more complicated example, where we might be parsing a program, we can see that <@ can
be used to apply whatever manipulations we may require to be performed on the expression of the
language.

9.3 Extending our parsing toolset


We can continue to build up our toolkit of handy parsing functions, concentrating on parsing Strings.

9.3.1 sp
We would prefer our parsers not to get upset by white space. We define a function, sp, which will
apply a parser after dropping any white space, using one of the functions from the Standard Prelude:


sp :: (Parser Char a) -> Parser Char a

sp p = p . dropWhile isSpace

Program 9.27: sp

We can now go back and re-define foldparens, and nst to ignore white space:

foldparens2 :: ((a, a) -> a) -> a -> (Parser Char a)

foldparens2 f b = p
  where p = ((sp lbr) &> p <& (sp rbr)) <&> p <@ f
            <|> sp (succeed b)

nst2 = foldparens2 (\(x, y) -> max (x+1) y) 0

Program 9.28: foldparens2

Now we get this behaviour:

Gofer?
nst2 "( (() ) )()( ) "
[([],3),
("( ) ",3),
("()( ) ",3),
("( (() ) )()( ) ",0)] :: [([Char],Int)]
(796 reductions, 1445 cells)

9.3.2 just
Another handy function is just, which ensures that there is nothing left over from the parse:

just :: (Parser s a) -> Parser s a

just p = filter (null . fst) . p

Program 9.29: just

Just works by filtering out the successful parses which have some work left to do. For example:

Gofer?
(just nst2) "( (() ) )()( ) "
[([],3)] :: [([Char],Int)]
(756 reductions, 1296 cells)

9.3.3 some
Just is not quite the correct function: we really only want to see the 3 and not all the brackets. We
can remedy this by defining a new function some. Some is used where we know that there is exactly
one parse.

some p = snd . head . just p

Program 9.30: some

Now we can do things like:


Gofer?
(some nst2) "( (() ) )()( ) "
3 :: Int
(592 reductions, 1071 cells)

Whereas just has type (Parser a b) -> Parser a b, some has type (Parser a b) ->
[a] -> b. Since the effect of some is to construct a deterministic parser we define:

type DetParser a b = [a] -> b

Program 9.31: DetParser

We can now give some the type (Parser a b) -> DetParser a b.

9.3.4 <:&>
There are a number of times when we can expect to parse a sequence of items, and return a list of
parse trees. For situations like this we define the operator <:&>, which makes a list from whatever <&>
gives back:

infixr 6 <:&>

(<:&>) :: (Parser s a) -> (Parser s [a]) -> Parser s [a]

p <:&> q = p <&> q <@ (uncurry (:))

Program 9.32: <:&>

The function uncurry is from the standard prelude. It is defined as follows:

uncurry :: (a -> b -> c) -> (a,b) -> c

uncurry f = \(x, y) -> f x y

Program 9.33: uncurry

There is also a function curry :: ((a, b) -> c) -> a -> b -> c:

curry :: ((a, b) -> c) -> a -> b -> c

curry f = \x y -> f(x, y)

Program 9.34: curry


9.3.5 Kleene ∗
The first use we find for <:&> is to define ∗. A∗ is a list of zero or more A’s. We choose to implement
a unary function called star:

-- Kleene star
star :: (Parser s a) -> Parser s [a]

star p = p <:&> star p
         <|> succeed []

Program 9.35: star


9.3.6 Kleene +
And of course we next define +. A+ is a list of one or more A’s. We choose to implement a unary
function called plus:

plus :: (Parser s a) -> Parser s [a]

plus p = p <:&> star p

Program 9.36: plus

Suppose in our favourite language an identifier is any sequence of one or more alphabetic charac-
ters:

-- identifier
identifier :: Parser Char String

identifier = plus (satisfy isAlpha)

Program 9.37: identifier

We now get this behaviour:

Gofer?
identifier "haskell"
[([],"haskell"),
("l","haskel"),
("ll","haske"),
("ell","hask"),
("kell","has"),
("skell","ha"),
("askell","h")] :: [([Char],[Char])]
(321 reductions, 864 cells)

9.3.7 first
This is probably not how we want a parser for an identifier to behave. We almost certainly don’t want
the backtracking in case of a later failure, and so we define first:

-- first gives back the first parse only


first :: (Parser s a) -> Parser s a

first p = (take 1) . p

Program 9.38: first

We can now define:

-- identifier
identifier :: Parser Char String

identifier = (first . plus) (satisfy isAlpha)

Program 9.39: identifier 2

And now we get this behaviour:


Gofer?
identifier "haskell"
[([],"haskell")] :: [([Char],[Char])]
(154 reductions, 351 cells)

9.3.8 bang
Because first affects backtracking behaviour it reminds us of Prolog’s !. So we define:

plus_bang :: (Parser s a) -> Parser s [a]

plus_bang = first . plus

Program 9.40: plus_bang

Of course we can compose first with star as well, and so we define:

star_bang :: (Parser s a) -> Parser s [a]

star_bang = first . star

Program 9.41: star_bang

And of course we see that these two definitions follow the same pattern, so we define:

bang :: ((Parser s a) -> Parser s [a]) ->
        (Parser s a) -> Parser s [a]

bang pc = first . pc

Program 9.42: bang

Now we can define parsers like these:

--- natural numbers

digit :: Parser Char Int
digit = satisfy isDigit <@ (\c -> (ord c - ord '0'))

natural :: Parser Char Int

natural = plus_bang digit <@ (foldl (\a b -> 10*a + b) 0)

Program 9.43: natural

The function foldl is from the standard prelude:

foldl :: (a -> b -> a) -> a -> [b] -> a

foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs

Program 9.44: foldl

Foldl is a tail-recursive function which uses an accumulating argument, and can be used like this:
Gofer?
foldl (flip (:)) [] [1..5]
[5, 4, 3, 2, 1] :: [Int]


Natural behaves like this:

Gofer?
(some natural) "123"
123 :: Int
(94 reductions, 195 cells)

9.3.9 optionally
We know that there are many grammars where we have optional elements, so we define a combinator
for these:

optionally :: (Parser s a) -> Parser s [a]

optionally p = p <@ (\x -> [x])
               <|> epsilon

Program 9.45: optionally

And now we can define parsers like this:

-- integer
minussign = symbol '-'

integer :: Parser Char Int

integer = optionally minussign <&> natural <@ f
  where f ([], n) = n
        f (_,  n) = -n

Program 9.46: integer

Integer behaves like this:

Gofer?
(some integer) "123"
123 :: Int
(121 reductions, 254 cells)
Gofer?
(some integer) "-123"
-123 :: Int
(120 reductions, 253 cells)

9.4 Parsing sequences of items


Since there are a lot of occasions when we will expect to parse a sequence of items we shall write
special combinators for this. If opensym and closesym are parsers for opening and closing symbols
respectively, then we can define pack:

pack :: (Parser s a) -> (Parser s b) ->
        (Parser s c) -> Parser s b

pack opensym p closesym = opensym &> p <& closesym

Program 9.47: pack

If we make definitions like:


lparen = sp (symbol '(')   -- allow leading spaces
rparen = sp (symbol ')')   -- allow leading spaces

lsq = symbol '['
rsq = symbol ']'

lbrace = symbol '{'
rbrace = symbol '}'

begin = token "begin"
end_  = token "end"        -- end already used by Haskell

Program 9.48: begin and end symbols

we can use pack like this:

parenthesized p = pack lparen p rparen

sqparenthesized p = pack lsq p rsq

braced p = pack lbrace p rbrace

compound s = pack begin s end_

Program 9.49: Using pack

The sorts of things that we will typically pack between two symbols will be lists of items separated by
(meaningless) separators, so lets define a parser combinator for these:

listOf :: (Parser s a) -> (Parser s b) -> Parser s [a]

listOf p s = p <:&> star (s &> p)
             <|> epsilon

Program 9.50: listOf

One particular type of list that we will require will be lists separated by commas:

comma = sp (symbol ',')    -- allow leading spaces

commaList :: (Parser Char a) -> Parser Char [a]

commaList p = listOf p comma

Program 9.51: commaList

Now we get this behaviour:


Gofer?
(parenthesized (commaList integer)) "(-1,2,3,-4)"
[([],[-1, 2, 3, -4])]
(662 reductions, 1280 cells)

9.4.1 Example: S-expressions


We can now write a parser for LISP-like s-expressions. We take a description of s-expressions from
p. 343 of [28]:


An expression is either an atom or a list. An atom is a string of characters. . . A list is a
sequence of atoms or lists, separated by spaces and bracketed by parentheses.
      ⋮
A symbol (an atom) is either a number or a name.

This definition is slightly confused, but the gist is clear enough.


When we parse an s-expression we will construct a value of this type:

data Sexpr = Ident String


| Number Int
| Compound [Sexpr]

Program 9.52: A type for s-expressions

In Program 9.39 on page 68 we gave a suitable parser for identifiers, and in Program 9.46 on
page 70 we gave a suitable parser for integers. All we need do now is describe a list of items separated
by spaces:

spaces :: Parser Char String

spaces = (first . plus) (satisfy isSpace)

spacelist :: (Parser Char a) -> Parser Char [a]

spacelist p = listOf p spaces

Program 9.53: A parser for space lists

And so an s-expression can be parsed by:

sexpr :: Parser Char Sexpr

sexpr = identifier <@ Ident


<|> integer <@ Number
<|> parenthesized (spacelist sexpr) <@ Compound

Program 9.54: A parser for s-expressions

Now we get this behaviour:

Gofer?
sexpr "(set x 3)"
[([],Compound [Ident "set", Ident "x", Number 3])]
Gofer?
sexpr "(lambda (x y) x)"
[([],Compound [Ident "lambda",
Compound [Ident "x", Ident "y"], Ident "x"])]
Gofer?
sexpr "(defun k (lambda (x y) x))"
[([],Compound [Ident "defun",
Ident "k",
Compound [Ident "lambda",
Compound [Ident "x", Ident "y"],
Ident "x"]])]


9.4.2 Lists with meaningful separators


Now we will look at parsing lists of items where the separators are meaningful. The lists that we have
looked at so far only have meaningless separators, like commas and spaces. What if the separators
themselves were meaningful? We might, for example, have an expression like:

5 + 4 − −3 ∗ 12

We can think of this expression as a list of integers separated by meaningful symbols. We have
already seen examples of expressions like this, when we use foldr (a.k.a. reduce) and foldl. We
can think of foldr (+) 0 [1..5] as a short-hand for:

(1 + (2 + (3 + (4 + (5 + 0)))))

and foldl (+) 0 [1..5] as a short-hand for:

(((((0 + 1) + 2) + 3) + 4) + 5)

Since + is associative these two expressions have the same value. This is not the case for a non-
associative operation, such as division on floating point numbers:

Gofer?
foldr (/) 1.0 [1.0 .. 5.0]
1.875 :: Float
(44 reductions, 82 cells)
Gofer?
foldl (/) 1.0 [1.0 .. 5.0]
0.00833333 :: Float
(44 reductions, 92 cells)

Notice that foldr makes the operator associate to the right, foldl to the left.
We shall explain how to parse a string like "1+2+3+4+5", and return the value we would expect
from computing 1 + 2 + 3 + 4 + 5. We are clearly going to adapt the listOf parser, but whereas
listOf threw the separators away we must keep and use them. Now we define a parser which will
parse a list of values separated by symbols representing functions, and apply the functions on the fly.
The type of this parser is:

chainl :: (Parser s a) -> (Parser s (a -> a -> a)) ->
          Parser s a

Program 9.55: The type of chainl

We have jumped ahead of ourselves by calling the parser combinator chainl as we are presuming
that there will be a chainr. When we use chainl we will typically instantiate the type as:

(Parser Char Int) -> (Parser Char (Int-> Int -> Int)) -> Parser Char Int

We can define chainl as:

chainl p s =
p <&> star (s <&> p) <@
(\(e0, l) -> foldl (\x (op, y) -> op x y) e0 l)

Program 9.56: chainl
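To see how chainl works, consider applying it (via expr, defined below) to "1-2-3": the parser
p <&> star (s <&> p) produces the pair (1, [((-), 2), ((-), 3)]), and the foldl then applies the
operators on the fly, associating to the left:

foldl (\x (op, y) -> op x y) 1 [((-), 2), ((-), 3)]
=> (1 - 2) - 3
=> -4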

Now we can define:


plussign = symbol '+'

-- minussign = symbol '-'

applyop :: Parser Char (Int -> Int -> Int)


applyop = (plussign <@ k (+) <|> minussign <@ k (-))

expr :: Parser Char Int

expr = chainl integer applyop

Program 9.57: expr

And we get this behaviour:


Gofer?
(some expr) "1+2+3+4+5"
15 :: Int
(492 reductions, 1037 cells)
Gofer?
(some expr) "1-2+3-4+5"
3 :: Int
(492 reductions, 993 cells)
We can define chainr:

chainr :: (Parser s a) -> (Parser s (a -> a -> a)) ->
          Parser s a

chainr p s =
star (p <&> s) <&> p <@
(\(l, e0) -> foldr (\(x, op) y -> op x y) e0 l)

Program 9.58: chainr

And then a right-associative version of expr:

exprr :: Parser Char Int

exprr = chainr integer applyop

Program 9.59: exprr

And we get this behaviour:


Gofer?
(some exprr) "1+2+3+4+5"
15 :: Int
(526 reductions, 1091 cells)
Gofer?
(some exprr) "1-2+3-4+5"
5 :: Int
(544 reductions, 1094 cells)
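We can check these results by hand: chainl makes the operators associate to the left, so expr reads
"1-2+3-4+5" as ((((1 − 2) + 3) − 4) + 5) = 3, while chainr makes them associate to the right, so
exprr reads it as (1 − (2 + (3 − (4 + 5)))) = 5.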

9.5 <&=>
We return to the problem of dealing with tuples that we came across before when we defined &> and
<&. We can define a new parser combinator to compose parsers in a way similar to <&>. We call the
new combinator <&=>:


infixr 6 <&=>

(<&=>) :: (Parser s a) -> (a -> Parser s b) -> Parser s b

(p1 <&=> p2) xs = [ts | (xs1, v1) <- p1 xs,
                        ts <- p2 v1 xs1]

Program 9.60: <&=>

Some authors call <&=> bind. We have made <&=> deal with taking the tuples to pieces.
We can use <&=> in a similar way to the way we used <&>. For example, a parser to read and add
two integers:

twoints :: Parser Char Int

twoints = integer <&=> (\i ->


symbol '+' <&=> (\_ ->
integer <&=> (\j ->
succeed (i+j))))

Program 9.61: twoints

We get this behaviour:


Gofer?
twoints "2+2"
[([],4)] :: [([Char],Int)]
(164 reductions, 313 cells)
We can define the nesting parser as:

nest :: Parser Char Int

nest = (lbr &> nest <& rbr) <&=> (\x ->


nest <@ (\y ->
max (x+1) y))
<|> succeed 0

Program 9.62: nest

We will re-visit <&=> when we look at monads in Chapter 18.

9.6 Summary
In this Chapter we have covered a lot of material.


Questions
1. Write some parsers for e.g. arithmetic expressions, λ-terms, Prolog terms, your favourite pro-
gramming language.
2. Write the Chapter summary.
3. Find out why <&=> is also called bind.

10 The λ-calculus
In this Chapter we will look at the un-typed λ-calculus. In Chapter 11 we will look at the typed theory.
This should be the most formal part of the course, but we will avoid full formality wherever possible.

10.1 Introduction
The λ-calculus was invented in the 1930s by the American logician Alonzo Church [9]. Church was
interested in the foundations of mathematics, and he hoped to construct a language in which all and
only the consistent part of mathematics could be expressed. We now know, thanks to Gödel [30], that
this is impossible, but the λ-calculus has proved to be useful in other ways, particularly in computer
science. In this respect Church was like Columbus: he set off in search of one thing, but found another,
rather more interesting, thing.
The λ-calculus was one of the first formal models of computation. Alan Turing’s work on Turing
machines [79] was contemporary,1 and the phrase ‘Turing machine’ was first used in print by Church,
in a review of [79] in the Journal of Symbolic Logic. At about the same time Emil Post proposed another
formal model of computation [67]. There have been various other formal models of computation
proposed, such as:

• Minsky systems;

• Markov systems;

• µ-recursive functions [45];

• unlimited register machines (URM’s) [73];

• the Warren Abstract Machine (WAM) [1];

• the Java Virtual Machine (JVM) [51].

Although they look very different all of these models of computation have been formally shown to
describe the same class of functions. Church’s Thesis is the assertion that this class of functions coincides
with our intuitive notion of what a computable function is, and can properly be called the computable
functions. Church’s Thesis is not amenable to formal proof as it relates an informal notion with a
formal one. To give a formal proof of Church’s Thesis we should have to formalise correctly our
informal notion of what a computable function is: that is we should have to have a proof of Church’s
Thesis to hand already.
The main feature of the λ-calculus is that it is a notation to discuss functions as rules. This contrasts
with the notion of functions as graphs. Since the latter part of the 19th century it has been common
for mathematicians to attempt to found mathematics on set theory, and to treat functions as a derived
notion. Using this model we treat a function F as a relation, that is as a set of pairs, with the extra
condition that if (x, y) ∈ F and (x, z) ∈ F then y = z. This is an extensional notion of what a function
is, and is quite reasonable for (some) mathematical applications, but it is a poor model for computer
scientists. The graph of the addition function is infinite: where do we store this infinite graph in a finite
machine? We have also seen many examples of pairs of functions which have the same graph, but
which behave very differently. We do not want to identify all sorting algorithms as being the same
function. As computer scientists we are concerned very much with algorithmic aspects of functions, and
we naturally want to think of a function as a rule, and treat the rules themselves as objects of study. The
λ-calculus provides us with an excellent setting in which to investigate functions-as-rules. Just as the
WAM provides us with a model for the implementation of Prolog, and the JVM provides us with a model
1 Church was Alan Turing’s Ph.D. supervisor at Princeton.


for the implementation of Java, so the λ-calculus provides us with a model for the implementation of
our favourite functional programming language. Of course, since all of these models of computation
are formally equivalent, they all provide us with an implementation model for each other, but not
necessarily a very user-friendly model.
The λ-calculus itself is very simple, but the theory which can be built around it is very rich. The
two best-known texts on the λ-calculus are probably [4] and [39]. [4] covers (almost) everything that
you could possibly want to know about the untyped theory; [39] is more introductory, and covers the
relationship with combinatory logic in some depth. [5] describes the contribution of the λ-calculus to
computer science.

10.2 Syntax of λ-terms


In general we will follow [38] with respect to the syntax of the λ-calculus. The λ-calculus is, essentially,
a purely syntactic theory,2 and most of the definitions have to be carefully framed. Life is made a lot
easier if we stick with one author for the technical3 parts.
A term of the pure λ-calculus is either:

• a variable; or

• an abstraction of a variable over some term; or

• the application of a term to a term.

Such terms are called ‘pure’ because they do not involve constants. Applications and abstractions
are sometimes called compound terms. Variables (and, in the impure theory, constants) are sometimes
called atomic.
We will use x, y, z . . . , possibly with subscripts for (arbitrary) variables, and L, M , N . . . , for
(arbitrary) terms. We can describe such terms using this (almost formal) grammar:

Term −→ Var
| λVar . Term
| Term Term
| (Term)

Var −→ x, y, z, . . .

Figure 10.1: A grammar for λ-terms

The terms described by our grammar have far too many brackets in them, so we adopt the usual
conventions:

• we omit the outer brackets;

• application associates to the left, so we can write L M N rather than (L M ) N

• abstraction associates to the right, so we can write λx.λy.λz.M rather than λx.(λy.(λz.M ))

• we allow one λ to stand for many, and write λxyz.M rather than λx.λy.λz.M

• application binds tighter than abstraction, so we write λx.M N rather than λx.(M N )
2 At least as far as I am concerned.
3 That is: fiddly.


By this definition the following are all λ-terms:

x          y            λx.x           λx.y
x y        λy.(λx.y)    λx.(λx.x)      (λx.x x)(λz.z z)
Figure 10.2: Some λ-terms

The λ-calculus is a theory about when two terms are equal. We have the following basic rules for
equality:

M = M (reflexivity)
if L = M then M = L (symmetry)
if L = M and M = N then L = N (transitivity)
if L = M then NL = NM
if L = M then LN = M N
if L = M then λx.L = λx.M
Figure 10.3: Equality rules

We need to define various notions relating to the syntax of λ-terms. All these definitions should be
relatively familiar to computer scientists: the λ-calculus is the prototypical programming language,
after all.

Length of a term

The length of a term is the number of occurrences of variables in it. It is defined inductively:

|x| = 1
|M N | = |M | + |N |
|λx.M | = 1 + |M |

Figure 10.4: The length of a λ-term
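We can make these definitions concrete with a small Haskell sketch. The representation below (the
names Term, Var, App and Lam are ours, not part of the calculus) will be re-used in later sketches:

data Term = Var String
          | App Term Term
          | Lam String Term

-- The length of a term, clause by clause from Figure 10.4.
termLength :: Term -> Int
termLength (Var _)   = 1
termLength (App m n) = termLength m + termLength n
termLength (Lam _ m) = 1 + termLength m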

Subterms

The notion of subterm is also defined inductively:

subterms(x) = {x}
subterms(M N ) = {M N } ∪ subterms(M ) ∪ subterms(N )
subterms(λx.M ) = {λx.M } ∪ subterms(M )

Figure 10.5: Subterms of a λ-term

Notice that:

• every term is a subterm of itself;

• z is a subterm of λz.M iff it is a subterm of M
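Again we can sketch this in Haskell, re-using the Term type above; we use a list where the definition
uses a set, so repeated subterms appear more than once:

-- The subterms of a term, following Figure 10.5.
subterms :: Term -> [Term]
subterms t@(Var _)   = [t]
subterms t@(App m n) = t : subterms m ++ subterms n
subterms t@(Lam _ m) = t : subterms m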


Occurrence
The notion of an occurrence is a slightly different notion from that of a sub-term. A term N occurs in
a term M at a location, and it may occur at more than one location. A location can be labelled by
the path through the λ-term (considered as a tree) required to get there. We write P to indicate an
occurrence of P .
An occurrence of λx is called an abstractor and the occurrence of x in an abstractor is called a
binding occurrence. So z occurs in λz.M , even if it is not a subterm of M . Occurrences of a variable
in a term which are not binding occurrences are called non-binding occurrences.
We can define the occurrences in a term as follows:

occurrences(M) = occs(M, ↓)
occs(x, path) = [(path, x)]
occs(M N, path) = [(path, M N)] ++ occs(M, path↙) ++ occs(N, path↘)
occs(λx.M, path) = [(path, λx.M), (path↙, x)] ++ occs(M, path↘)

Figure 10.6: occurrences

We are using a cute notation of sequences of arrows to represent the paths, but strings of 0’s and
1’s will do fine too.

Components
All the occurrences of terms in M , other than binding occurrences of variables, are called components
of M .

Scope
If λx.P is a component of some term, then we say that:
• P is the body of λx.P
• P is the scope of the abstractor λx
The covering abstractors of P are the abstractors whose scopes contain P .
Notice that by this definition scopes don’t have holes.

Free and bound occurrences of variables


A non-binding occurrence of a variable x in a term is a bound occurrence if it is in the scope of an
abstractor λx; otherwise it is a free occurrence.
Every occurrence of a variable in a term is:
• a binding occurrence; or
• a bound occurrence; or
• a free occurrence.

Free and bound variables


A variable is free in a term M if it has a free occurrence in M.
A variable is bound in a term M if it has a bound occurrence in M.
A variable may be both bound and free in a term. For example, x is both bound and free in:

x(λx.x)

The set of free variables of a term M is written FV(M ).
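In Haskell, re-using the Term type from the earlier sketch (and again using lists for sets):

import Data.List (nub, delete)

-- The free variables of a term.
fv :: Term -> [String]
fv (Var x)   = [x]
fv (App m n) = nub (fv m ++ fv n)
fv (Lam x m) = delete x (fv m)

For example, fv applied to the representation of x(λx.x) returns ["x"].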


Combinators

A term in which no variable occurs free is called a closed term or a combinator. The following table gives
some well-known combinators, and their commonly used names:

λx y z.x(y z) aka B
λx y z.x z y aka C
λx y.x aka K
λx.x aka I
λx y z.x z(y z) aka S
λx.(λy.x(y y))(λv.x(v v)) aka Y
λx y.y(x x y) aka A
AA aka Θ
λx y.y aka 0
λx y.x y aka 1
λx y.xⁿy aka n

Figure 10.7: Some well-known combinators

10.2.1 α convertibility
Two terms which differ only in the names of their bound variables are said to be α-convertible. We
consider two α-convertible terms to be the same term. Thus, for example, λx.x and λy.y are indistin-
guishable, as are λxvu.ux(λy.xuλw.w) and λzux.xz(λy.zxλq.q).
If two terms L and M are syntactically identical we write:

L≡M

Bound variable clashes

A term M in which an abstractor λx occurs, and in which a variable x occurs outside the scope of the
abstractor (and not in the abstractor itself!) is said to have a bound variable clash.
Consider the term:
λx.λx.x

We do not know whether it is the same term as:

λy.λx.y

or is the same term as:

λx.λy.y

Any term M with a bound variable clash can be replaced by a term N without a bound variable
clash, such that M ≡ N , by renaming the bound variable, using a new name which we pick from our
endless supply of variable names, and which does not occur anywhere in the term (or terms) we are
considering. Such a variable is called a fresh variable. In future we will attempt to avoid terms with
bound variable clashes. Changing the notion of scope to allow scopes with holes is probably easiest,
but it is not done (explicitly) in [38].


10.2.2 De Bruijn terms


Since the names of bound variables are irrelevant, and seem to cause no end of problems, perhaps we
should adopt a notation where bound variables are not named at all. Such a notation was developed
by De Bruijn for the AUTOMATH project [17, 18]. The AUTOMATH project was a mechanical mathematics
checking project. We can think of it as an attempt to explain mathematics to a machine. The formal
system of AUTOMATH required a lot of manipulations to be performed on λ-terms, and so De Bruijn
came up with a notation in which the representation of λ-terms, and the manipulations required on
them, were relatively easy to express.
Consider the term:

λx.λy.x(λz.y z w)

and the term:


λp.λq.p(λr.q r w)
These are the same term.
It would be useful if they were identical. De Bruijn’s idea was to measure the distance (i.e. the number
of λ’s) from the bound occurrence of a variable to the binding occurrence. We start counting at 1. The
binding occurrences can then be replaced just by λ’s. The following table give the correspondence
between some ordinary terms and De Bruijn terms:
Ordinary term                De Bruijn term
λx.x                         λ1
λy.y                         λ1
(λy.y)λx.x                   (λ1)λ1
λx.x(λy.xyy)                 λ1(λ211)
λp.λq.p(λr.q r w)            λλ2(λ214)
Free variables in De Bruijn terms can be represented by numbers larger than the ‘λ-depth’.
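We can sketch the conversion in Haskell, re-using the Term type from earlier. The environment env
lists the binders passed on the way down, innermost first; if we start it off with the free variables of
the whole term, those variables end up numbered beyond the λ-depth, as above (elemIndex is from
Data.List):

import Data.List (elemIndex)

data DB = DBVar Int | DBApp DB DB | DBLam DB

toDB :: [String] -> Term -> DB
toDB env (Var x)   = case elemIndex x env of
                       Just i  -> DBVar (i + 1)  -- distance to the binder, counting from 1
                       Nothing -> error ("unknown variable: " ++ x)
toDB env (App m n) = DBApp (toDB env m) (toDB env n)
toDB env (Lam x m) = DBLam (toDB (x : env) m)

For example, toDB ["w"] applied to the representation of λp.λq.p(λr.q r w) yields λλ2(λ214), as in
the table above.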

Section Summary
De Bruijn terms deal with many of the problems associated with the naming of (bound) variables, by not
giving bound variables names. Each De Bruijn term can be thought of as representing an equivalence
class of ordinary λ-terms: if L ≡ M then they both correspond to the same De Bruijn term.

10.3 Substitution
We leave De Bruijn terms and continue working with ordinary λ-terms. We can define a notion of
substitution for terms. In fact we can define a variety of (sensible4 ) notions of substitution. We choose
to define a notion which is relatively close to an implementable function. One important feature of the
substitutions that we define is that substitution is always possible, although this may require re-naming
of bound variables.
We write:
[N/x]M
which we read as ‘substitute N for x in M ’, or ‘replace x with N in M ’. Beware that different authors
use different notations: instead of [N/x]M [4] writes M [x := N ] and [61] writes M [N/x].
We must take care not to capture variables. If L ≡ M then [N/x]L ≡ [N/x]M . Consider:

[x/y]λz.zy

This is just:
λz.zx
If we naïvely perform the substitution:
[x/y]λx.xy
4 We can define any number of nonsensical notions of substitution.


we get:
λx.xx
which is a different term. When we substitute a term N into an abstraction we must take care that no
free variable in N comes into the scope of the abstractor. We can ensure this by re-naming the bound
variable, using a fresh variable. So:

[x/y]λx.x y ≡ λv.v x

Substitution is defined by induction on terms:

[N/x]y = N                            if x ≡ y
       = y                            otherwise

[N/x](L M) = ([N/x]L)([N/x]M)

[N/x](λy.M) = λy.([N/x]M)             if y ∉ FV(N) and x ≢ y
            = λz.([N/x]([z/y]M))      otherwise, where z is fresh

Figure 10.8: Substitution

In the third clause we could rename the bound variable anyway, even if it was not needed, but this is
just pointless work. We could, perhaps, save a little work if we observe that [N/x]P = P if x ∉ FV(P),
but for the moment we are more concerned with clarity.
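As a Haskell sketch, re-using Term and fv from the earlier sketches; fresh is a crude way of producing
a variable name not in a given list:

-- subst n x m computes [n/x]m, clause by clause from Figure 10.8.
subst :: Term -> String -> Term -> Term
subst n x (Var y)
  | x == y    = n
  | otherwise = Var y
subst n x (App l m) = App (subst n x l) (subst n x m)
subst n x (Lam y m)
  | y `notElem` fv n && x /= y = Lam y (subst n x m)
  | otherwise                  = Lam z (subst n x (subst (Var z) y m))
  where z = fresh (fv n ++ fv m ++ [x, y])

-- A fresh variable: the first name vi not in the list of used names.
fresh :: [String] -> String
fresh used = head [v | v <- candidates, v `notElem` used]
  where candidates = ["v" ++ show i | i <- [0..]]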

More on capturing free variables


The problems connected to capturing free variables are not restricted only to the λ-calculus. They turn
up in many other situations as well.
Let x, y and z be natural numbers. Consider the following:

(∀x)(∀y)(∃z)(x = y + z ∨ y = x + z)

This is the valid proposition that for every two natural numbers there is a natural number which is
their difference. The difference may be zero, of course.
The rule for ∀ elimination is:

(∀v)P
[t/v]P
Rule 10.1: ∀ elimination

If we are not careful we can use this rule to show:

(∀x)(∀y)(∃z)(x = y + z ∨ y = x + z)
(∀y)(∃z)(z = y + z ∨ y = z + z)
Rule 10.2: Invalid ∀ elimination

When we should have:

(∀x)(∀y)(∃z)(x = y + z ∨ y = x + z)
(∀y)(∃w)(z = y + w ∨ y = z + w)
Rule 10.3: Valid ∀ elimination


In the conclusion to Rule 10.2 z is bound, in the conclusion to Rule 10.3 z is free. The conclusion
to Rule 10.2 is that every natural number is either zero or is even, the conclusion to Rule 10.3 is that,
for every natural number, there is a difference (possibly zero, of course) between it and an arbitrary
natural number.

10.3.1 Dynamic scope in LISP, Jensen’s device . . .


We have been very careful not to capture free variables when performing substitutions. Not everyone
has been so careful. This mistake was made by Hilbert when writing with Ackermann and when writing
with Bernays [35, 36, 10], and lies behind Jensen’s device [62], and the problems associated with
dynamic scoping in LISP [53], and the caution which is often required when using macro expansion
languages. [5] reports that Robin Gandy recalled pointing out an instance of this mistake to Alan
Turing and that Turing’s response was that ‘This remark is worth £100 a month’. Curiously, Jensen’s
device is described as ‘ingenious’ in [21], and appears as an advanced technique in [32]. Computer
scientists and logicians would appear to regard constructions of the absurd differently.

10.4 Conversions
What we have done so far is to describe the syntax of λ-terms, and the (syntactic) operation of substi-
tution on them. We now connect substitution to an operation on the meaning of terms.

10.4.1 β reduction
If we have an abstraction applied to a term we can simplify the term, using a β reduction:

(λx.M )N B1β [N/x]M

Figure 10.9: β-reduction

We use the symbol Bβ to denote a sequence (possibly empty) of β-reductions.


A reversed β-reduction is called a β expansion.
If we have two terms, L and M which we can convert into each other by a series of β-reductions and
β-expansions then we write L =β M .
There is a distinction between =β and ≡. We can now extend our rules for equality given in Figure
10.3 with:

if L =β M then L = M
Figure 10.10: Extending the equality rules

Redex
A redex is a reducible expression, that is, an occurrence of an application of an abstraction.

10.5 The Church-Rosser Theorem


The Church-Rosser Theorem5 tells us something about the global properties of sequences of β-reductions.

Theorem 1 Church-Rosser If L Bβ M and L Bβ N then there is some P such that M Bβ P and N Bβ P .

5 Actually, this is confluence, but we will ignore the distinction. If a system (such as the λ-calculus with β-reduction) is Church-
Rosser, then it is also confluent. Conversely every confluent system is also Church-Rosser. For full details consult [3].



The
Church-
Rosser The-
orem is often
called the diamond
property, be-
cause we can
draw it
as:

                L
              /   \
         Bβ  /     \  Bβ
            /       \
           v         v
          M           N
            \       /
        ∃Bβ  \     /  ∃Bβ
              \   /
               v v
                P
Figure 10.11: The diamond property

Proof
See [4], for the full details of proofs of the Church-Rosser theorem. We will outline one, due to Tait and
to Martin-Löf.
First we need a lemma: If a relation has the diamond property then so does its transitive closure.
Proof: draw a diagram!
Now we define a relation ⇒ on λ-terms such that:

• ⇒ is confluent;

• the transitive closure of ⇒ is Bβ.

Clearly ⇒ will be related to B1β.

We define ⇒ as:

M ⇒ M

M ⇒ M′
─────────────
λx.M ⇒ λx.M′

M ⇒ M′    N ⇒ N′
─────────────────
M N ⇒ M′N′

M ⇒ M′    N ⇒ N′
──────────────────────
(λx.M)N ⇒ [N′/x]M′

Rule 10.4: ⇒

The final clause could be written as:

M ⇒ M′    N ⇒ N′    (λx.M′)N′ B1β Q
────────────────────────────────────
(λx.M)N ⇒ Q

Rule 10.5: Alternative clause for ⇒

Now we need to show:

1. ⇒ is confluent;

2. the transitive closure of ⇒ is Bβ.

Showing 1 means that we must prove:

M ⇒ L
──────────────────────────────────
(∀N)(M ⇒ N ⊃ (∃Q)(L ⇒ Q & N ⇒ Q))

Rule 10.6: Confluence of ⇒

The proof of this proceeds by ∀ introduction, ⊃ introduction, and then by induction on the definition
of ⇒.
We show 2 as follows. We think of our relations as sets of pairs. The reflexive closure of B1β is
a subset of ⇒, which is itself a subset of Bβ. However, since Bβ is the transitive closure of the reflexive
closure of B1β it is also the transitive closure of ⇒. □

Now we proceed to look at corollaries of the Church-Rosser theorem.

Corollary 1.1 If M =β N then there is some P such that M Bβ P and N Bβ P .

Proof
Application of Church-Rosser Theorem. 

10.6 Normal form


A term is in normal form (NF) if it contains no redexes.
If P is in normal form then P cannot be further reduced, so Bβ must leave P unchanged.

Corollary 1.2 Normal forms are unique: any term which can be reduced to normal form can only be
reduced to one normal form.

Proof
Application of Church-Rosser Theorem. 

Because normal forms are unique we can take the normal form of a term to be the value of the term.

Corollary 1.3 The λ-calculus is consistent.

Proof
By ‘consistent’ we mean that we cannot show P =β Q, for all P , Q. Let L and M be in normal form,
and be distinct (i.e. we do not have L ≡ M ). Suppose L =β M . Then L would have two normal forms,
itself and M . But this contradicts Corollary 1.2, hence ¬(L =β M ). Hence it is not the case that we
can show P =β Q, for all P , Q. 


10.7 Reduction strategies


Suppose we are given a term and asked to reduce it. What strategy should we choose to select which
redex to reduce first? (At this point you should remind yourself of § 4.1 on page 21.)

10.7.1 Leftmost
The leftmost redex of a term P is the redex whose leftmost parenthesis is to the left of all the other
parentheses of redexes in P . The leftmost reduction of a term P is the sequence of terms generated
from P by reducing the leftmost redex of each term generated.

Theorem 2 A term has a normal form iff its leftmost reduction is finite. The last term in the reduction
is the normal form.

Proof
See [4]. 

This theorem tells us that if any reduction strategy will find a normal form then the leftmost strategy
will find it. It does not tell us that every term has a normal form, nor even give us a bound on how long
we will have to look for a normal form.

10.7.2 Other reduction strategies


Look in [64].

10.8 Representing data and functions


We have built up quite a lot of theoretical apparatus now. We are now in a position to show how we
can represent data and functions in the pure λ-calculus. We will take a two stage approach to this:

• first we show that we can represent the booleans, then we show that we can represent natural
numbers;

• then we get a bit serious and outline what is required to show that the λ-definable functions
coincide with the recursively definable functions.

We don’t get too serious: we do not properly define the class of recursive functions, nor do we
carry the proof through properly.

10.8.1 Booleans
The booleans are the values TRUE and FALSE. They have the property that TRUE and FALSE are
different. Furthermore we need to define IF where IF behaves like this:

IF TRUE L M = L
IF FALSE L M = M

Figure 10.12: Behaviour of IF

How do we construct λ-terms which behave like this? To begin with IF had better take three ar-
guments, so it will look like λx y z.Q. The term (IF Q) will return either its first or second argument
depending on whether Q is TRUE or FALSE.6 Since we can’t test cases we will simply let TRUE and
6 We don’t care what happens if Q is neither.


FALSE decide whether to return the first or second argument. IF, TRUE and FALSE can then be defined
as:

IF =def λx y z.x y z
TRUE =def λx y.x
FALSE =def λx y.y

Figure 10.13: Defining IF, TRUE and FALSE

So:
IF TRUE L M

is just:
(λx y z.x y z)(λw v.w)L M
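
Writing out the abbreviations and reducing the leftmost redex at each step we can check that this
behaves as required:

(λx y z.x y z)(λw v.w)L M
B1β (λy z.(λw v.w)y z)L M
B1β (λz.(λw v.w)L z)M
B1β (λw v.w)L M
B1β (λv.L)M
B1β L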

10.8.2 The Church numerals


In this section we introduce the Church numerals, and claim that they are adequate as numerals.
The Church numerals are just 0 and n from 10.7:

λx y.y aka 0
λx y.xⁿy aka n

Figure 10.14: Church numerals
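
Here xⁿy abbreviates n applications of x to y: so 1 is λx y.x y, 2 is λx y.x(x y), and 3 is
λx y.x(x(x y)).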

With these numerals we can define addition, multiplication and exponentiation. (Exercise).

10.8.3 λ-definability
Now we outline the proof that the recursively definable functions are λ-definable. Recall that the class
of recursive functions is the least class of functions which:

• includes the initial functions:

  – zero: Z(n) = 0
  – successor: Succ(n) = n + 1
  – projections: Pᵢⁿ(x₁, . . . , xₙ) = xᵢ

• is closed under:

  – composition,
  – primitive recursion,
  – minimisation;

We will not show all the details here: the interesting part is how we can represent recursion, so we
look at this. The representation of the minimisation follows similar lines, but we will not cover this.


Recursion via fix-point finders


To show primitive recursion and minimisation we need to take a detour through fix-point finders. We
start by explaining what a fixed-point finder is. This material is based on (extracts of) Chapter 6 of [4],
and on Chapter 3 of [39].
A fixed point of a function f is a value x such that x = f (x).
Theorem 3 (∀F )(∃X)(X Bβ F X)

Proof
Let W =def λx.F (xx) and X =def W W . Now, W W B1β F (W W ), which is just F (X). 

A fix-point finder is a combinator M such that:


(∀F )M F = F (M F )
i.e. M F is a fixed-point of F . It is not immediately apparent that such combinators must exist.
However, careful inspection of Theorem 3 lets us construct one: λf.(λy.f (y y))(λv.f (v v)). This is just
Y from 10.7 on page 81.
We have:
Yf = f (Yf )
but we do not have:
Yf Bβ f (Yf ) NO!
However the combinator Θ from 10.7 on page 81 does have the property:
Θf Bβ f (Θf )
So, although Y is simpler than Θ, in this respect at least, Θ has some advantages. There are in fact
lots of fix-point combinators. We can define:

Y⁰ =def Y
Yⁿ⁺¹ =def Yⁿ(SI)

Figure 10.15: Lots of fix-point finders

Y¹ is just Θ.
This is all good stuff, but what does it tell us about recursion? We will show that we can use Y to
represent the recursively defined functions. As an example consider the factorial function:

fac :: Int -> Int

fac 0 = 1
fac n = n * fac (n -1)

Program 10.1: Factorial

We could write this as:

fac :: Int -> Int

fac n = if (n == 0)
then 1
else (n * fac (n -1))

Program 10.2: Factorial, again


Or as:

fac :: Int -> Int

fac = \n -> if (n == 0)
then 1
else (n * fac (n -1))

Program 10.3: Factorial, another time

Now, we will suppose that we are allowed to make recursive definitions in the λ-calculus, and can
write definitions like:
fac =rec λn.if (n = 0) then 1 else n ∗ fac(n − 1)
We have used =rec to emphasise that this is a recursive definition.
We will show that we can give a definition of fac which does not require that we use recursion. The
other parts of the definition are all things that we know how to encode in the λ-calculus already.
The first step is to do a β expansion, and introduce a redex:

fac =rec (λf.(λn.if (n = 0) then 1 else n ∗ f (n − 1)))fac

Now we have an equation of the form:

q = Hq

In other words, fac is a fixed-point of:

λf.(λn.if (n = 0) then 1 else n ∗ f (n − 1))

We have just shown how to find a fixed-point of a function by using Y.


So now we can write:

fac = Y(λf.(λn.if (n = 0) then 1 else n ∗ f (n − 1)))

This definition does not use recursion explicitly: we are using Y instead. So, we have shown that
we can represent the factorial function in the λ-calculus, without needing to add a new mechanism for
making recursive definitions. We can generalise from here to show that any recursively defined function
can be encoded in the λ-calculus.
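
We can replay this in Haskell. Y itself cannot be given a Haskell type (as we shall see in Chapter 11),
so in this sketch fix is itself defined by a recursive equation, but fac below contains no explicit
recursion:

fix :: (a -> a) -> a
fix f = f (fix f)

fac :: Int -> Int
fac = fix (\f n -> if n == 0 then 1 else n * f (n - 1))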

10.8.4 Section summary


In this section we have shown that the formalism of the λ-calculus is rich enough for us to write any
computable function in (at least if we accept Church’s Thesis). This leads us to the conclusion that the
λ-calculus is worthy of study as a paradigmatic functional programming language, and further, that
study of the λ-calculus will assist our understanding of all programming languages. Alas and alack!
we do not have time to indulge in this study in this course.

10.9 HNF and WHNF


We defined the notion of a normal form in 10.6 on page 86. We can define weaker notions of normal
form as follows:

• A term is in head normal form (HNF) if it is of the form

λx₁ . . . xₙ.x M₁ . . . Mₘ

where n, m ≥ 0


• A term is in weak head normal form (WHNF) if it is of the form

λx₁ . . . xₘ.L₁ . . . Lₙ

where n, m ≥ 0, m > n

Any term which is in NF is also in HNF, although not necessarily vice versa, and a term which is in
HNF is also in WHNF, although not necessarily vice versa.
These notions of normal form are of interest to us, because they represent intermediate points on
the road to (fully) normal forms. A term in WHNF tells us enough about itself for us to say what its
outermost form is, without telling us exactly what its value is. But this is just what we need for laziness.

10.10 Graphs and laziness


So let’s think about lazy evaluation again. Recall from Chapter 4 on page 21 that there are three parts
to laziness:
1. only evaluate what we have to;
2. only evaluate as far as we need to;
3. only evaluate once.
We have seen that:
• 1 is handled by normal order evaluation;
• 2 is handled by evaluating to WHNF.7
We will assert that 3 can be handled by representing λ-terms as graphs, rather than trees.

10.10.1 η reduction
So far we have only discussed the λ-calculus with β-reduction. This calculus is sometimes called the
λβ -calculus. We can add another notion of reduction, called η-reduction:

λx.(M x) B1η M    [x ∉ FV(M)]

Figure 10.16: η-reduction
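
For example, λx.(f x) B1η f, but λx.(x x) is not an η-redex, since x occurs free in x x.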

Now we should go through all the previous definitions of reduction, redex and so on and call
them β-redex, β-normal form, and so on. Then we can define η-redex, η-normal form and so on.
The calculus with both β and η reductions is sometimes called the λβη -calculus. The Church-Rosser
Theorem holds for the λβη -calculus.

10.11 Summary
In this chapter we have looked at the un-typed λ-calculus.
In fact we have looked at one untyped calculus: there are a lot of variations on the principal theme,
and we have not touched on these. The interested reader is referred to [4] for more details.

7 Notice how laziness and a weak head go together!

11 The typed λ-calculus
There are two intertwined aspects to the relation between λ-calculus and functional programming:

1. functional programs can be compiled to λ-terms;


2. a functional programming language is simply an enrichment of the λ-calculus (often expressed
as ‘a syntactic sugaring’).

If we focus on 2 it strikes us immediately that the λ-calculus that we have discussed so far is un-typed,
and types play a hugely important rôle in Haskell. We need to consider a typed λ-calculus. There is
one crucial consequence of adding types: any typable term has a normal form. Hence the slogan:
typing implies termination.
This is a remarkable result. Typing is a purely static, syntactic notion, and yet it tells us about the
dynamic behaviour of terms. Unfortunately we get nothing for nothing. We are unable to type any fix-
point finder. This is a bit of a blow as we used fix-point finders to show that the recursive functions were
λ-definable. If we have no fix-point finders then we are forced to add recursion operators explicitly.
We must take care here. If we add recursion operators that are too powerful we will lose the property
‘typing implies termination’. On the other hand if we add weak recursion operators we find that there
are functions which we cannot express. The choice which Haskell makes is, of course, to give up ‘typing
implies termination’.

11.1 Types for λ-terms


We are going to explain how to give types for (pure) λ-terms. We follow the treatment from Chapter 2
of [38], with the occasional deviation.
To begin with we assume that we have an infinite stock of type variables a, b, c, d . . .. We also have a
stock of arbitrary types, which are denoted by lower-case Greek letters (with the obvious exception of
λ!). The types themselves are described by:

Type −→ Type → Type


| TyVar
| (Type)

TyVar −→ a, b, c, . . .

Figure 11.1: A grammar for types

As ever we introduce a convention to dispense with brackets:


• → associates to the right, so we can write α → β → γ instead of α → (β → γ)
We need to make some definitions.
A type assignment is an expression of the form:

M :τ

where M is a term and τ a type.


A type context or environment is a set of type assignments. Upper-case Greek letters Γ, ∆ . . . are
usually used for type contexts. We will typically abuse notation and omit braces and commas from
sets.1
The rules for type assignment given in [38] are (almost):

Γ ↦ x : α    (x : α ∈ Γ)

Γ ↦ P : σ → τ    ∆ ↦ Q : σ
──────────────────────────── → Elim
Γ∆ ↦ P Q : τ

Γ, x : σ ↦ P : τ
───────────────── → Intro
Γ ↦ λx.P : σ → τ

Rule 11.1: Type assignment

These rules are expressed in a hybrid system which mixes Gentzen’s natural deduction (N style)
systems, and his sequent calculus (L style) systems to get the worst of both. Such systems might be
called NL systems and offer neither the freedom of natural deduction proper, nor the precision of the
sequent calculus proper. If one were a logician, and wrote ⊃ instead of →, and upper-case Roman
letters instead of lower-case Greek ones, and ignored the λ-terms one would get:

Γ ↦ A    (A ∈ Γ)

Γ ↦ A ⊃ B    ∆ ↦ A
──────────────────── ⊃ Elim
Γ∆ ↦ B

Γ, A ↦ B
────────── ⊃ Intro
Γ ↦ A ⊃ B

Rule 11.2: The implicational fragment of minimal propositional logic

We shall present some inferences of types:

x : a ↦ x : a
──────────────── → Intro
↦ λx.x : a → a

Rule 11.3: I : a → a

x : a, y : b ↦ x : a
────────────────────── → Intro
x : a ↦ λy.x : b → a
────────────────────── → Intro
↦ λxy.x : a → b → a

Rule 11.4: K : a → b → a

Notice that II Bβ I, so that II had better have the ‘same’ type as I:

x : a → a ↦ x : a → a                x : a ↦ x : a
───────────────────────── → Intro    ───────────────── → Intro
↦ λx.x : (a → a) → a → a             ↦ λx.x : a → a
───────────────────────── Defn.      ───────────────── Defn.
↦ I : (a → a) → a → a                ↦ I : a → a
──────────────────────────────────────────────────────── → Elim
↦ II : a → a

Rule 11.5: II : a → a

We will make clearer exactly what ‘same’ means in the next section.
1 The notation we are using is originally from Gentzen’s sequent calculus [27] where Γ denotes a list (sequence) of formulas.
The use of sets frees us from having to state a number of so-called ‘structural’ rules.


We see that there is an obvious relationship between proof search and type inference. In the next
section we will outline a type inference algorithm.

11.2 Type inference, and the principal type algorithm


The type system that we are considering is very simple: we only have type variables and function types.
In this type system we have an interesting fact: every typable term has a principal type. The principal
type is the most general type that a term can have, in the sense that every type which a term can be
given is an instance of the principal type. For example, I can be given any of the following types:

a→a
(a → b) → a → b
((a → b) → a → b) → (a → b) → a → b

The language that we are using here should sound reminiscent of the language that we use when
talking about unification of simple terms (like Prolog terms, for example). Recall that if two Prolog
terms unify then they have a most general unifier. The similarity is no accident, as the type inference
algorithm depends on unification. One way to think of unification is in terms of equation solving, or
constraint satisfaction and we can think of the type inference algorithm as setting up constraints on
types (alternatively: equations between type variables).
The principal type algorithm takes a term of the λ-calculus and returns either:

• the principal type of the term; or

• the information that the term has no type.

The basis of the algorithm is that we use the rules from Rule 11.1 to tell us how to attempt to
construct a type for the term we have been given. We use unification to match the types. Since we
construct a most-general unifier, we construct a principal type.
We defer the details of the type inference algorithm until Chapter 15.

11.3 Terms which can’t be typed


There are terms which we cannot provide a type for. For example, if we use the type checking of Haskell
itself:

Gofer?
\x -> x x

ERROR: Type error in application


*** expression : x x
*** term : x
*** type : a -> b
*** does not match : a
*** because : unification would give infinite type

Gofer?
\x -> (\y -> x(y y))(\v -> x(v v))
ERROR: Type error in application
*** expression : y y
*** term : y
*** type : a -> b
*** does not match : a
*** because : unification would give infinite type


The second of these terms presents us with a problem: it is the Y combinator and in Chapter 10
we used Y to represent the recursive functions. The untypability of Y would seem to be a blow to
the expressive power of the un-typed λ-calculus. A moment’s reflection will convince us that this is
inevitable. If any typable term has a normal form, and terms are programs, and normal forms values
then there must be terms which we cannot type, or else we would be able to solve the halting problem.
All is not totally lost. We can define recursion operators, and give types for them, and we can use them
to express programs. Our problem now is that there are, inevitably, things we can’t express.
Notice that Haskell’s type system does not have the property that well-typed programs will terminate:
typing and termination are separated. Well-typed programs have fewer ways to go wrong than un-
typed programs.

11.4 A typed λ-calculus with recursion operators


We finish this Chapter off by introducing my favourite typed λ-calculus with structural recursion opera-
tors.
We allow ourselves the ability to extend both the λ-calculus and the type system, in a controlled
fashion. We must, of course, take care to add only conservative extensions.2 For example, we can add
the type of lists directly. The rules for typing a list are:

↦ [] : List(τ)

Γ ↦ P : τ    ∆ ↦ Q : List(τ)
──────────────────────────────
Γ, ∆ ↦ Cons(P, Q) : List(τ)
Rule 11.6: Lists

The rules should remind you strongly of the type declaration for lists in Haskell given in Program 7.2
on page 35. Along with this definition we define a structural recursion operator for lists, listrec. We
must explain how to compute with listrec, i.e. we define new redexes and give their reduction rules:

l B []    d B d′
────────────────────
listrec(d, e, l) B d′

l B Cons(h, t)    e(h, t, listrec(d, e, t)) B e′
─────────────────────────────────────────────────
listrec(d, e, l) B e′
Rule 11.7: Two new redexes

An expression like listrec(d, e, l) is a definition of a function on lists (just like it is in Haskell!).
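
Ignoring the types for a moment, we can sketch listrec as an ordinary Haskell function, where the
two reduction rules above become the two defining equations (the dependent result type γ(l) of the
typing rule below collapses here to a fixed type b):

listrec :: b -> (a -> [a] -> b -> b) -> [a] -> b
listrec d e []    = d
listrec d e (h:t) = e h t (listrec d e t)

For example, listrec 0 (\_ _ r -> 1 + r) computes the length of a list.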


The rule for typing listrec(d, e, l) looks like this:3

Γ ↦ l : List(τ)    ∆ ↦ d : γ([])    Θ, h : τ, t : List(τ), r : γ(t) ↦ e(h, t, r) : γ(Cons(h, t))
──────────────────────────────────────────────────────────────────────────────────────────────────
Γ, ∆, Θ ↦ listrec(d, e, l) : γ(l)
Rule 11.8: Typing listrec(d, e, l)

If we use the structural recursion operator listrec we retain the property ‘typing is termination’. We
lose this property if we use a more powerful recursion.
The extended type system that we are considering here, with dependent types, and the ability to
add new types freely is very expressive. In fact it is expressive enough to allow us to write program
specifications as types. Already in Haskell we can think of the type of a function as describing its
behaviour, that is as a partial specification. The specification of a program is a proposition, and
2 At least if we wish to retain consistency, and ‘typing as termination’.
3 We are allowing dependent types, which is a large step away from the system we have had so far, and much stronger than
the system which Haskell has.

96
11.5 Summary

the identification of propositions as types is often called the Curry-Howard isomorphism, although I
personally prefer ‘propositions-as-types analogy’. The step from Rules 11.1 to 11.2 is, essentially,
the recognition of propositions as types. [19] gives a general introduction to the ideas behind Curry-
Howard. The work of Per Martin-Löf [55] and others [58, 69, 76] on type theory is an extended exercise
on the utility of this idea in programming.

11.5 Summary
In this Chapter we have:
• discussed typing of the λ-calculus;
• mentioned the principal type algorithm;

• discussed an extension to the λ-calculus, and the type system.

12 Continuations
Informally, a continuation is a function which tells us ‘what to do next’ [2]. We can think of the contin-
uation as embodying the ‘future’ of the computation. It is natural to think of state in dynamic terms,
so this intuition helps us to see how continuations give us one way to handle stateful computation in a
functional setting.
Sections 12.1 to 12.3 are based on Chapter 3 of [50].
In many situations continuation-based approaches can be replaced by approaches based on the use
of monads, which we discuss in Chapter 18.

12.1 Introducing tail-recursion and continuations


A computation rule involves a tail call if it is of the form:

x₁ −→ x₁′    ψ(x₁′, . . .) −→ α
─────────────────────────────────
φ(x₁, . . .) −→ α

Rule 12.1: A tail call

This can be read roughly as: ‘To evaluate the function φ, we must evaluate the function ψ.’ Alas the
crudity of this reading leads to two different definitions of tail recursion:
• the first is that a function is tail-recursive if recursive calls are tail calls (for example, §10.2 of
[52]);
• the second is that a function is tail-recursive if all calls are tail calls (for example, §6.8 of [76]).
As a simple example we look at 3 versions of a function to compute the length of a list.
We will call a function which is not tail-recursive a direct function. The function len in Program 12.1
is a direct function to compute the length of a list.

len :: [a] -> Int

len [] = 0
len (h:t) = 1 + len t

Program 12.1: The length of a list 1

The function len1 in Program 12.2 is tail-recursive by the first definition, but not the second. A call
to len1 l 0 will compute the length of the list l.

len1 :: [a] -> Int -> Int

len1 [] n = n
len1 (h:t) n = len1 t (n + 1)

Program 12.2: The length of a list 2

The function len2 in Program 12.3 is a variant of this function which is tail-recursive by the second
definition too. A call to len2 l f will compute the value of the function f applied to the length of the
list l.


len2 :: [a] -> (Int -> b) -> b

len2 [] k = k 0
len2 (h:t) k = len2 t (\n -> k(n + 1))

Program 12.3: The length of a list 3

The auxiliary argument k to len2 in Program 12.3 is called a tail function or continuation. A
function like len2 is said to written in continuation-passing style. We will now illustrate some of the
uses of continuations.

12.2 Some simple functions


In this section we shall look at some more simple Haskell functions and present tail-recursive, continuation-
passing versions of them.
We begin by looking at a very simple example, the factorial function, given in Program 10.1 on
page 89.
A CPS version of factorial will take a tail function as an auxiliary argument. The tail function expresses
how to continue, so the type of our new function will be Int -> (Int -> a) -> a. We shall call
our new function cpsfac:

cpsfac :: Int -> (Int -> a) -> a

cpsfac 0 k = k 1
cpsfac n k = cpsfac (n - 1) (\x -> k(n * x))

Program 12.4: A CPS factorial

A call to cpsfac n f will compute the value of the function f applied to the value of n!. Notice
that in fac in Program 10.1 the multiplication is outside the recursion, whereas in cpsfac in Program
12.4 the multiplication is inside the recursion. This is the typical pattern that we see when we write a
CPS function.
Suppose we call cpsfac 3 k for some arbitrary k. Evaluation will be as follows (we name some of
the expressions involved to aid readability):

cpsfac 3 k
=> cpsfac 2 (\x -> k(3 * x))
=> cpsfac 1 (\y -> k’(2 * y))
-- where k’ is \x -> k(3 * x)
=> cpsfac 0 (\z -> k’’(1 * z))
-- where k’’ is \y -> k’(2 * y)
=> k’’’ 1
-- where k’’’ is \z -> k’’(1 * z)
=> k’’ 1
=> k’ 2
=> k 6

Figure 12.1: Evaluating cpsfac 3 k

12.2.1 A CPS version of Fibonacci’s function


As a second example we look at Fibonacci’s function for modelling the growth of rabbit populations.1
The direct version of Fibonacci’s function that we first write in Haskell looks like:
1 The oldest known description is in [48]. [80] gives many interesting properties of this function.


fib :: Int -> Int

fib 0 = 1
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)

Program 12.5: Fibonacci’s function in Haskell

Notice that this function is not written in primitive recursive form. Since we are interested in functions
written in primitive recursive form we look at a primitive recursive version of Fibonacci’s function. We
define an auxiliary function:

fibs :: Int -> (Int, Int)

fibs 0 = (1, 1)
fibs n = (snd(fibs (n - 1)),
snd(fibs (n - 1)) + fst(fibs (n - 1)))

Program 12.6: A structurally recursive version of Fibonacci’s function

A call to fst(fibs n) will compute fib n.


A neater definition uses a let:

fibs :: Int -> (Int, Int)

fibs 0 = (1, 1)
fibs n = let (lo, hi) = fibs (n-1)
in (hi, hi + lo)

Program 12.7: A neater version of Fibonacci’s function

These two functions are essentially the same: the let just aids readability.2 CPS-converting these
functions is straightforward: again we supply an auxiliary argument and again the order of the opera-
tions on the right gets inverted:

cpsfibs :: Int -> ((Int, Int) -> a) -> a

cpsfibs 0 k = k (1, 1)
cpsfibs n k = cpsfibs (n - 1)
(\(lo, hi) -> k(hi, hi + lo))

Program 12.8: A CPS version of Fibonacci’s function

Evaluation of cpsfibs 3 k for some arbitrary k will be as follows (allowing for the simplification
of arithmetic expressions):

2 It also allows the compiler to make a major optimisation!


cpsfibs 3 k
=> cpsfibs 2 (\(a, b) -> k(b, b + a))
=> cpsfibs 1 (\(p, q) -> k’(q, q + p))
-- where k’ is \(a, b) -> k(b, b + a)
=> cpsfibs 0 (\(x, y) -> k’’(y, y + x))
-- where k’’ is \(p, q) -> k’(q, q + p)
=> k’’’(1, 1) -- where k’’’ is \(x, y) -> k’’(y, y + x)
=> k’’(1, 2)
=> k’(2, 3)
=> k (3, 5)

Figure 12.2: Evaluating cpsfibs 3 k

We are now at a point where we can make some observations about the CPS versions of the functions
we have produced. The first observation is that the time complexity of cpsfibs is much better than
that of fib. The second observation is that there is a possibility to optimise the evaluation of cpsfibs.
Whatever structure we build to represent the call to cpsfibs 3 k can simply be replaced by the
structure that we build to represent the call to cpsfibs 2 (\(a, b) -> k(b, b + a)). In fact
the algorithm that we have written is very similar to the C [44] function in Program 12.10.
The function repeat in Program 12.9 implements a looping construct in Haskell.

repeat :: Int -> (a -> b) -> a -> b

repeat 0 f a = f a
repeat n f a = repeat (n-1) (\x -> f x) a

Program 12.9: Looping

int fib( int n ){

    int lo, hi;
    int i;

    lo = 0;
    hi = 1;

    for( i = 0 ; i <= n ; i++ ){
        hi = hi + lo;
        lo = hi - lo;
    }
    return lo;
}

Program 12.10: Computing Fibonacci’s function in C

So we see that the use of tail-recursive functions allows us to write algorithms which behave like
imperative loops, and which a compiler can treat in the same way as it treats an imperative loop. This
is a crucial point: using continuations lets us write imperative algorithms in a functional language. Thus
we see that we can use continuations to allow us to handle computations which involve state. The uses
of continuations which we discuss later in this chapter are essentially applications of this observation
to various specific problems.

12.2.2 Historical note


One of the first programs to be run on “the first machine that had all the components now classically
regarded as characteristic of the basic computer” [72] was “H.C.F. by the standard process” [86]. We
can assume that this means Euclid’s algorithm.


This is a standard example of a tail-recursive function:

hcf n 0 = n
hcf n m = hcf m (n `rem` m)

Program 12.11: “H.C.F. by the standard process” in Haskell

12.3 Uses of continuations


In the previous section we presented some very simple functions and CPS versions of them. In this
section we will describe some of the ways in which continuations can be exploited. Some of the uses
of continuations, particularly in compiling require us to be able to convert any function into CPS. CPS
conversion was first discussed in [23], and a number of variants have since appeared in the literature.
We only discuss here the conversion presented in [66], as we only intend to give a flavour of what is
involved. Fuller discussion and comparison of different CPS conversions for λ-calculi can be found in,
for example, [34, 75].
In [66] Plotkin is interested in the relationship between call-by-value and call-by-name. He defines
the following CPS-conversion to allow us to map terms from a language with call-by-name to one with
call-by-value:

x → λk.kx
λx.M → λk.k(λx.M̄)
M N → λk.M̄(λm.N̄(λn.mnk))

Figure 12.3: Plotkin’s CPS-conversion

Where k, m and n are chosen to avoid variable capture in the usual way. Plotkin proves some
theorems relating to the values of M and M̄ when call-by-name and call-by-value evaluation strategies
are used.
The effect of Plotkin’s CPS conversion on some terms is shown in Figure 12.4. We have named the
newly introduced bound variables kₙ.

x → λk₁.k₁x
λk.kx → λk₁.k₁(λk.(λk₂.(λk₃.k₃k)(λk₄.(λk₅.k₅x)(λk₆.k₄k₆k₂))))
λx.x → λk₁.k₁(λx.(λk₂.k₂x))
xy → λk₁.(λk₂.k₂x)(λk₃.(λk₄.k₄y)(λk₅.k₃k₅k₁))

Figure 12.4: CPS-translation of some particular terms

We can make a number of observations:


• the conversion is not idempotent;
• new redexes may be introduced by the conversion;
• the behaviour of the interpreter is reflected in the structure of the term produced.
Making the structure of the term reflect its evaluation gives us an insight into why CPS-conversions
are of interest in compiling. This helps us to formalise the informal observations that we were able to
make in the previous section about the small functions we looked at. As far as compilation is concerned
we see:


• CPS-conversion can produce terms which tell us useful things about how they will be evaluated;

• such terms are likely to tell us a lot of things which are not really very useful, so they need to be
optimised.

There is a large literature on the use of CPS in compilation: [2, 24] provide a good start.
Another direction that we can follow from the use of CPS-conversions leads us to the extraction of
constructive content from classical proofs. Notoriously, classical proofs need not contain any construc-
tive content. However, we know that we can use the double-negation transformation to produce a
intuitionistic theorem from a classical one. In [26] Friedman introduced a related technique, called
A-translation, which allowed him to show that Peano arithmetic is a conservative extension of Heyting
arithmetic, for Π02 sentences. The constructive content of Friedman’s proof is that we can convert a
classical proof of a Π02 sentence into a constructive one. The constructed proof can be interpreted
as the application of a non-local control operator applied to a CPS-conversion of the classical proof.
The control operator allows us to replace the current evaluation context with a different one, just as
goto allows us to make non-local jumps in imperative programs. There is an extensive literature on
control operators: [13, 71] are a beginning. [33, 57] provide much more detail on the extraction of
constructive content from classical proofs, and the relation with control operators.
Just as in imperative programming we can make jumps available to the programmer by providing
goto we can make the control operator available to the programmer. This is done in Scheme [43]
and Standard ML of New Jersey [2] using a callcc (call-with-current-continuation) primitive. [74]
describes the representation of jumps with continuations in more detail. Just as goto allows the
programmer to invent control structures, so does callcc, with all that this entails. Continuations
allow us to implement threads, as discussed in, for example, [2] and [22].
Again, connected with their rôle of representing control in a functional setting continuations have
applications in denotational semantics as discussed in, for example, [70].

12.3.1 Continuations and I/O


In [5] Barendregt points out that many functional programming languages can be considered ‘autistic’
in that they lack any ability to communicate with the outside world3 . This is, as he points out, partly
because our thinking is in terms of the evaluation of functions, an activity which naturally concerns
itself with termination, whereas for communication we need a notion of process, and the evolution of
processes is a continuing activity. Continuations help us formalise this intuition, as, in a simple model
of I/O, we can do three things:

• we can stop;

• we can read a value, and continue by performing some computation with it;

• we can write a value, and continue performing computations.

At this level the world itself is almost as simple: it is just a pair of lists of natural numbers (the values
read and the values written). We see that CPS functions let us thread a state or world value through our programs.
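A minimal sketch of this model of I/O (the names World, Process and run are ours, not from the text):

type World = ([Integer], [Integer])  -- (values still to be read, values written so far)

data Process = Stop                       -- we can stop
             | Read (Integer -> Process)  -- read a value, continue computing with it
             | Write Integer Process      -- write a value, continue

run :: Process -> World -> World
run Stop        w          = w
run (Read f)    (i:is, os) = run (f i) (is, os)
run (Read _)    ([],   os) = ([], os)    -- input exhausted, so stop
run (Write n p) (is,   os) = run p (is, os ++ [n])

Each Read carries a continuation, so running a process threads the world value through the program in exactly the way just described.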

12.4 Further examples


Now we present some simple examples of direct functions and their CPS equivalents. We begin by
looking at functions on some types not defined inductively.

12.4.1 Functions on types not defined inductively


The identity function, given in Program 7.1 on page 33, does nothing but return its argument. The CPS
version of the identity function should do nothing but pass its argument on to its continuation.
³ As Estragon puts it, in Beckett’s Waiting for Godot: ‘Nothing happens, nobody comes, nobody goes, it’s awful!’


cpsid :: a -> (a -> b) -> b

cpsid a c = c a

Program 12.12: cpsid

Notice that cpsid is just apply, with its arguments supplied the other way around.

apply :: (a -> b) -> a -> b

apply f x = f x

Program 12.13: apply
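For instance (our own check), cpsid and apply agree once the arguments are flipped:

cpsid 3 show   -- ==> "3"
apply show 3   -- ==> "3"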

The function split, given in Program 12.14, is the basic function to split pairs (2-tuples) into their
components.

split :: (a -> b -> c) -> (a, b) -> c

split f (a, b) = f a b

Program 12.14: split

The functions fst and snd can be defined in terms of split:

fst :: (a, b) -> a


snd :: (a, b) -> b

fst = split (\x y -> x)


snd = split (\x y -> y)

Program 12.15: fst and snd

We can write a CPS version of split:

cpssplit :: (a -> b -> c) -> (a,b) -> (c -> d) -> d

cpssplit f (a, b) k = k (f a b)

Program 12.16: cpssplit

We can then write CPS versions of fst and snd:

cpsfst :: (a,b) -> (a -> c) -> c


cpssnd :: (a,b) -> (b -> c) -> c

cpsfst = cpssplit (\x y -> x)


cpssnd = cpssplit (\x y -> y)

Program 12.17: cpsfst and cpssnd
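For example (our own examples):

cpsfst ("a", 2) id     -- ==> "a"
cpssnd ("a", 2) (+1)   -- ==> 3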

12.4.2 Functions on inductively defined types


Now we look at some functions defined on inductively defined types. In order to be thorough we begin
with the type of natural numbers, as defined in Program 7.20 on page 41. We can write a CPS version
of nat2string (Program 7.22 on page 41)


cpsnat2str :: Nat -> (String -> a) -> a

cpsnat2str Zero k = k "0"


cpsnat2str (Succ n) k =
cpsnat2str n (\r -> k ("s(" ++ r ++ ")"))

Program 12.18: cpsnat2str

The use of the direct append (++) is a bit unfortunate: we should define a CPS version of append:

cpsappend :: [a] -> [a] -> ([a] -> b) -> b

cpsappend [] a k = k a
cpsappend (h:t) a k = cpsappend t a (\r -> k (h:r))

Program 12.19: cpsappend
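An easy induction (ours) shows that cpsappend xs ys k computes k (xs ++ ys); for example:

cpsappend "ab" "cd" id       -- ==> "abcd"
cpsappend [1,2] [3] length   -- ==> 3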

Now we can re-define cpsnat2str:

cpsnat2str :: Nat -> (String -> a) -> a

cpsnat2str Zero k = k "0"


cpsnat2str (Succ n) k =
cpsnat2str n (\p ->
cpsappend p ")" (\q ->
cpsappend "s(" q k ))

Program 12.20: A more CPS cpsnat2str

We could just as easily have defined:

cpsnat2str :: Nat -> (String -> a) -> a

cpsnat2str Zero k = k "0"


cpsnat2str (Succ n) k =
cpsnat2str n (\p ->
cpsappend "s(" p (\q ->
cpsappend q ")" k ))

Program 12.21: Another more CPS cpsnat2str

Now we have imposed an order on the evaluation that was not explicit before.
We can define CPS versions of addition, multiplication, and exponentiation:


infixr .+.
infixr .*.
infixr .^.

Zero .+. m = \k -> k m


(Succ n) .+. m = \k -> ((n .+. m) (\r -> k(Succ r)))

Zero .*. m = \k -> k Zero


(Succ n) .*. m = \k -> (n .*. m) (\r -> (m .+. r) k)

n .^. Zero = \k -> k (Succ Zero)


n .^. (Succ m) = \k -> (n .^. m) (\r -> (n .*. r) k)

Program 12.22: More CPS functions
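For example (our own test), we can add two and two and print the result using cpsnat2str from Program 12.20:

two :: Nat
two = Succ (Succ Zero)

-- (two .+. two) (\r -> cpsnat2str r id)  ==>  "s(s(s(s(0))))"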

Example: cpsreduce

There is a general pattern to the CPS functions that we write. In the case of lists we have the following
basic form:

f [] k = k D
f (h:t) k = f t (\r -> k (E h r))

Here D and E are expressions supplied by us.


Program 12.23: The general form of CPS functions on lists

Having identified this basic pattern we can produce a CPS version of reduce:

cpsreduce :: a -> (b -> a -> a) -> [b] -> (a -> c) -> c

cpsreduce d _ [] k = k d
cpsreduce d e (h:t) k = cpsreduce d e t (\r -> k (e h r))

Program 12.24: cpsreduce

The same can be done for all the other inductively defined types we have come across.
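For instance (our own instantiations of the pattern), cpsappend from Program 12.19 and a CPS sum both fall out of cpsreduce:

cpsappend' :: [a] -> [a] -> ([a] -> b) -> b
cpsappend' xs ys = cpsreduce ys (:) xs   -- D = ys, E = (:)

cpssum :: [Int] -> (Int -> a) -> a
cpssum = cpsreduce 0 (+)                 -- D = 0, E = (+)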

Example: λ-terms

Suppose we define the type of λ-terms as follows:

type Var = String

data Term = Var Var


| App Term Term
| Abs Var Term

Program 12.25: Representing λ-terms in Haskell

One obvious thing to do with λ-terms is to write a pretty-printer, or unparser. The CPS unparser will
have type Term -> (String -> a) -> a. We adopt a strategy of writing a function which we know
to be flawed, and then correcting it. The deliberate error that we make is to ignore the use of brackets in
the string we are generating. The following function is a first attempt:


cpsunparse :: Term -> (String -> a) -> a

cpsunparse (Var x) k = k x
cpsunparse (App l m) k =
cpsunparse l (\lstring ->
cpsunparse m (\mstring ->
k(lstring ++ " " ++ mstring)))
cpsunparse (Abs x m) k =
cpsunparse m (\mstring ->
k("\\ " ++ x ++ " . " ++ mstring))

Program 12.26: An attempt at an unparser

This function does not use CPS for appending the strings. We use Program 12.19 on page 106 and
define:

up2 :: Term -> (String -> a) -> a

up2 (Var x) k = k x
up2 (App l m) k =
up2 l (\lstring ->
up2 m (\mstring ->
cpsappend lstring " " (\r ->
cpsappend r mstring k)))
up2 (Abs x m) k =
up2 m (\mstring ->
cpsappend "\\ " x (\r ->
cpsappend r " . " (\s ->
cpsappend s mstring k)))

Program 12.27: A more CPS unparser

This is beginning to look very typical of CPS code. We have pushed the continuation k as far into
the function as possible. Consequently we have named a lot of intermediate results.
This code is still not correct, as it fails to put brackets in where they are needed:

Gofer?
up2 (App (Var "x") (App (Var "y") (Var "z"))) id
x y z

Gofer?
up2 (App (App (Var "x") (Var "y")) (Var "z")) id
x y z

Gofer?
up2 (Abs "x" (App (Var "x") (Var "y"))) id
\ x . x y

Gofer?
up2 (App (Abs "x" (Var "x")) (Var "y")) id
\ x . x y
Now we address this problem. We know that we will need to bracket expressions, so we define⁴:

⁴ We resist the temptation to abstract over the sort of brackets used.


brack :: String -> (String -> a) -> a

brack string k =
cpsappend "(" string (\r -> cpsappend r ")" k)

Program 12.28: Adding brackets

We can use Haskell’s pattern matching to solve the problem. Instead of having one clause to deal
with application terms we will have several:

up2 (App l m@(App _ _)) k = -- needs brackets on m


up2 (App l@(Abs _ _ ) m) k = -- needs brackets on l
up2 (App l m) k = -- no brackets needed

Program 12.29: Using pattern matching

These clauses now become:

up2 (App l m@(App _ _)) k =


up2 l (\lstring ->
up2 m (\mstring0 ->
brack mstring0 (\mstring ->
cpsappend lstring " " (\r ->
cpsappend r mstring k))))
up2 (App l@(Abs _ _ ) m) k =
up2 l (\lstring0 ->
up2 m (\mstring ->
brack lstring0 (\lstring ->
cpsappend lstring " " (\r ->
cpsappend r mstring k))))
up2 (App l m) k =
up2 l (\lstring ->
up2 m (\mstring ->
cpsappend lstring " " (\r ->
cpsappend r mstring k)))

Program 12.30: Using pattern matching 2

The full function is given in Program 12.31 on the following page. For comparison, the same function
written in non-CPS style, is given in Program 12.32 on the next page. Now we get:
Gofer?
up2 (App (App (Var "x") (Var "y")) (Var "z")) id
x y z

Gofer?
up2 (App (Var "x") (App (Var "y") (Var "z"))) id
x (y z)

Gofer?
up2 (Abs "x" (App (Var "x") (Var "y"))) id
\ x . x y

Gofer?
up2 (App (Abs "x" (Var "x")) (Var "y")) id
(\ x . x) y


up2 :: Term -> (String -> a) -> a

up2 (Var x) k = k x
up2 (App l m@(App _ _)) k =
up2 l (\lstring ->
up2 m (\mstring0 ->
brack mstring0 (\mstring ->
cpsappend lstring " " (\r ->
cpsappend r mstring k))))
up2 (App l@(Abs _ _ ) m) k =
up2 l (\lstring0 ->
up2 m (\mstring ->
brack lstring0 (\lstring ->
cpsappend lstring " " (\r ->
cpsappend r mstring k))))
up2 (App l m) k =
up2 l (\lstring ->
up2 m (\mstring ->
cpsappend lstring " " (\r ->
cpsappend r mstring k)))
up2 (Abs x m) k =
up2 m (\mstring ->
cpsappend "\\ " x (\r ->
cpsappend r " . " (\s ->
cpsappend s mstring k)))

Program 12.31: A better CPS unparser

brack1 :: String -> String

brack1 string = "(" ++ string ++ ")"

up3 :: Term -> String

up3 (Var x) = x
up3 (App l m@(App _ _)) =
up3 l ++ " " ++ (brack1 (up3 m))
up3 (App l@(Abs _ _ ) m) =
brack1 (up3 l) ++ " " ++ (up3 m)
up3 (App l m) =
up3 l ++ " " ++ (up3 m)
up3 (Abs x m) =
"\\ " ++ x ++ " . " ++ (up3 m)

Program 12.32: A non CPS unparser

13 Case study: unification
13.1 Preamble
Let a term be either:

• a variable;

• a functor applied to a (possibly empty) list of terms.

Question 1 Given two terms t1 and t2 , is there a way to replace the variables in t1 and t2 with terms,
such that the two resulting terms are the same term?

13.1.1 Examples
In the following x, y, z are variables, f, g, h functors. In the possible replacements all variables not
mentioned are assumed to be left unchanged.

Term Term Possible replacements


x y x→y
x y y→x
x y y → f, x → f
x y y → f, x → f, z → f
x f x→f
x f x → f, z → x
f f
f f x → y, y → z, z → x
f g no possible replacement
f (x) f (y) x → g, y → g
f (x) f (g) x→g
f f (g) no possible replacement
f (x) x no possible replacement
f (x, g, y) f (f, z, x) x → f, z → g, y → f

Table 13.1: Some examples

Informally, from this table we can see:

• not every pair of terms can be made the same;

• if a pair of terms can be made the same then there are lots of ways to do this, but;

• some of the ways look better than others.

13.2 Representing terms in Haskell


If we are going to code up a solution to this problem then our first task is to represent terms. One
obvious type to represent the abstract syntax of terms is:


type Var = String


type Functor = String

data Term = Var Var


| Compound Functor [Term]

Program 13.1: A type for terms

We should derive the show function automatically, of course. There are a lot of possible variations
on this type. One interesting one eschews the use of lists, and uses two mutually inductive types:

type Var = String


type Functor = String

data Term = Var Var


| Compound Functor Terms

data Terms = None


| Some Term Terms

Program 13.2: Another type for terms

We could abstract over the types of variables and functors:

data Term v f = Var v


| Compound f [Term v f]

Program 13.3: Yet another type for terms

We will work with the type from Program 13.1.


Notice that there is a difference between "x" which has type Var, and Var "x", which has type
Term, even though we might informally call both of them “the variable x”.
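For example (our own encodings, using the type from Program 13.1), the terms f(x, g) and y from Table 13.1 become:

tm1, tm2 :: Term
tm1 = Compound "f" [Var "x", Compound "g" []]  -- f(x, g); the constant g is a functor with no arguments
tm2 = Var "y"                                  -- the variable y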

13.3 Representing substitutions


“Replacing variables by terms” is also known as performing a substitution. A substitution is just a
function which replaces variables by terms. Hence the obvious type to represent substitutions is:

type Substitution = Var -> Term

Program 13.4: A type for substitutions

In order to replace a single variable by a term, leaving all other variables unaffected, we define a
function called mksubst:

mksubst :: Term -> Var -> Substitution

mksubst t x = \y -> if x == y then t else (Var y)

Program 13.5: Constructing a substitution to replace the variable x with the term t

We could just as easily have defined:


mksubst :: Term -> Var -> Substitution

mksubst t x y = if x == y then t else (Var y)

Program 13.6: A variant of Program 13.5

Now we get:
Main> (mksubst (Compound "f" []) "x") "y"
Var "y" :: Term
Main> (mksubst (Compound "f" []) "x") "x"
Compound "f" [] :: Term

13.3.1 The identity substitution


The identity function is the function which simply returns its argument unchanged. It has type α → α.
Hence the identity substitution ι should simply return its argument unchanged. However the type of
substitutions is not an instance of α → α, so it cannot return the same value. What we do instead is to
turn the variable into a term, in the obvious way:

idsubst :: Substitution

idsubst x = Var x

Program 13.7: The identity substitution

We can η-reduce here to get:

idsubst :: Substitution

idsubst = Var

Program 13.8: A variant of Program 13.7

13.3.2 Composing functions


Suppose σ and τ are substitutions. How do we form their composition σ ◦ τ ?
Simply trying to use function composition directly won’t work as the types don’t match, in the same
way as we saw in the case of the identity substitution. Hence what we need to do is to form a new
function which takes a term and replaces occurrences of variables in it. If σ is a substitution then σ∗
is its extension, and σ∗ does to occurrences of variables in a term what σ does to variables. We can
define extend as:

extend :: Substitution -> Term -> Term

extend sigma (Var x) = sigma x


extend sigma (Compound f xs) =
Compound f (map (extend sigma) xs)

Program 13.9: Constructing the extension of a substitution

Now we get:
Main> (extend (mksubst (Compound "f" []) "x")) (Var "y")
Var "y" :: Term
Main> (extend (mksubst (Compound "f" []) "x")) (Var "x")
Compound "f" [] :: Term


Main> (extend
(mksubst (Compound "f" []) "x"))
(Compound "g" [(Var "x"), (Var "y")])
Compound "g" [Compound "f" [],Var "y"] :: Term

And now we can form the composition of two substitutions:

compose :: Substitution -> Substitution -> Substitution

compose sigma tau = (extend sigma) . tau

Program 13.10: Composing two substitutions

If we want to we could define an operator to aid readability:

infixr #

(#) :: Substitution -> Substitution -> Substitution

sigma # tau = (extend sigma). tau

Program 13.11: An operator for composing two substitutions
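As a quick check (our own example; recall that σ ◦ τ applies τ first), composing the substitution y → x with x → f sends y to f:

-- ((mksubst (Compound "f" []) "x") # (mksubst (Var "x") "y")) "y"
--   ==> Compound "f" []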

13.4 Answering our question


Now we return to Question 1. First, however, we make some preliminary definitions.

Definition 1 (Unifier) A unifier of two terms t1 and t2 is a substitution σ such that σ∗ t1 and σ∗ t2 are
equal terms.

If two terms have a unifier then they are said to be unifiable.


Question 1 can be rephrased as:

Question 2 Given two terms, are they unifiable?

We need to make a little detour into the theory of substitutions.

13.4.1 About substitutions


Before we begin, apologies for the pedantic nature of the following material, but this all needs to be
done carefully.

Definition 2 (Basis) The basis of a substitution σ is the set of variables x such that σx differs from ιx.

The basis of a substitution is the set of variables which the substitution really affects. All the substitu-
tions that we are concerned with will have a finite basis.

Definition 3 (Set map) Let f be a function and S be a set. The set map of f on S is the set {f s|s ∈
S}.

Definition 4 (Cycle) Let σ be a substitution and B be its basis. σ is a cycle if the set map of σ on B
is the same as the set map of ι on B.

Informally, a cycle permutes some set of variables among themselves. We can define functions like the following
to construct cycles:


cycle2 :: (Var, Var) -> Substitution

cycle2 (x, y) v
| v == x = Var y
| v == y = Var x
| otherwise = Var v

cycle3 :: (Var, Var, Var) -> Substitution

cycle3 (x, y, z) v
| v == x = Var z
| v == y = Var x
| v == z = Var y
| otherwise = Var v

Program 13.12: Constructing cycles
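For instance (our own check), cycle2 ("x","y") swaps x and y, and composing it with itself restores them, so it is a cycle and, as noted below, not idempotent:

-- cycle2 ("x","y") "x"                           ==>  Var "y"
-- ((cycle2 ("x","y")) # (cycle2 ("x","y"))) "x"  ==>  Var "x"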

Definition 5 (Generality) Let σ and τ be substitutions. We say that σ is more general than τ if there
is a substitution γ such that τ = γ ◦ σ.

The phrase more general than is a slight misnomer. First, every substitution is more general than
itself since every substitution is the same as itself composed with the identity substitution. Second, let γ
and δ be cycles which have the same basis. Then γ and δ are more general than each other.

Definition 6 (Idempotent) A substitution σ is idempotent if σ = σ ◦ σ.

If a substitution is idempotent then applying it once is enough. Notice that cycles are not, with the
exception of ι, idempotent.

13.5 Continuing to answer our question


Now we return to Question 2. We will write a function called unifier to compute the unifier of two
terms. We know that not every pair of terms has a unifier, so we expect to use error. An alternative
would be to use the Maybe type. We expect unifier to look like:

unifier :: Term -> Term -> Substitution

unifier (Var x) (Var y) =


unifier (Var x) (Compound g ys) =
unifier (Compound f xs) (Var y) =
unifier (Compound f xs) (Compound g ys) =

Program 13.13: Beginning to find a unifier

13.5.1 First clause


The first clause is easy:

unifier (Var x) (Var y) = mksubst (Var y) x

Program 13.14: The first clause of Program 13.13

Notice that this substitution is a most general, idempotent unifier. We could just as easily have chosen
mksubst (Var x) y. This too is a most general, idempotent unifier.


13.5.2 Second and third clauses


The second and third clauses are clearly going to have very similar solutions. The naïve solution is:

unifier (Var x) (Compound g ys) =


mksubst (Compound g ys) x

Program 13.15: A naïve attempt at the second clause of Program 13.13

Unfortunately, when we apply the unifier of x and f (x) to x and f (x) we get f (x) and f (f (x)):

Main> (extend (unifier (Var "x") (Compound "f" [Var "x"])))


(Var "x")
Compound "f" [Var "x"] :: Term
Main> (extend (unifier (Var "x") (Compound "f" [Var "x"])))
(Compound "f" [Var "x"])
Compound "f" [Compound "f" [Var "x"]] :: Term

So, close, but no cigar. The solution is that we have to check whether the variable occurs in the
compound term with which it is being unified. We can write the occurs check:

occurs :: Var -> Term -> Bool

occurs x (Var z) = x == z
occurs x (Compound _ ys) = any (occurs x) ys

Program 13.16: The occurs check

Now we re-write the second and third clauses as:

unifier (Var x) (Compound g ys) =


if (any (occurs x) ys)
then error "occurs"
else mksubst (Compound g ys) x
unifier (Compound f xs) (Var y) =
if (any (occurs y) xs)
then error "occurs"
else mksubst (Compound f xs) y

Program 13.17: A less naïve attempt at the second and third clauses of Program 13.13

Once again we have constructed most general, idempotent unifiers.


Suppose we had used the type in Program 13.2. Then we would have written two mutually recursive
functions to implement the occurs check:

occurs :: Var -> Term -> Bool


occurs_all :: Var -> Terms -> Bool

occurs x (Var z) = x == z
occurs x (Compound _ ys) = occurs_all x ys

occurs_all _ None = False


occurs_all x (Some term terms) =
(occurs x term) || (occurs_all x terms)

Program 13.18: The occurs check, using a different type


In some ways this is actually a more natural way to express this function, and we will see that the
final clause of unifier follows this pattern.

13.5.3 Final clause

Now we look at the final clause. We know that we can only unify two compound terms which have the
same functor. We also know that the functors must have the same arity. We can check that functors are
the same easily. The arity of a functor is implicit in the length of the list of terms it is applied to. We
could check this by comparing the lengths of the lists. We are working our way towards:

unifier (Compound f xs) (Compound g ys) =


if not ((length xs == length ys) && (f == g))
then error "functors don’t match"
else {- do something with xs and ys -}

Program 13.19: Clause Four

What do we then do with xs and ys? Perhaps we should match on the lists in clause four:

unifier (Compound f []) (Compound g []) =


if not ((length [] == length []) && (f == g))
then error "functors don’t match"
else idsubst
unifier (Compound f []) (Compound g ys) = -- ys is not []
if not ((length [] == length ys) && (f == g))
then error "functors don’t match"
else error "can’t get here"
unifier (Compound f xs) (Compound g []) = -- xs is not []
if not ((length xs == length []) && (f == g))
then error "functors don’t match"
else error "can’t get here"
unifier (Compound f (x:xs)) (Compound g (y:ys)) =
if not ((length (x:xs) == length (y:ys)) && (f == g))
then error "functors don’t match"
else let sigma = unifier x y
in {- do something with sigma xs and ys -}

Program 13.20: Clause Four – expanded version

The most general, idempotent unifier of a constant with itself is the identity substitution.

Clearly the testing of the length in this function is a waste of time. Furthermore we know that we
need to apply sigma to xs and ys. So we are getting towards:


unifier (Compound f []) (Compound g []) =


if (f /= g)
then error "functors don’t match"
else idsubst
unifier (Compound f []) (Compound g (_:_)) =
error "arities"
unifier (Compound f (_:_)) (Compound g []) =
error "arities"
unifier (Compound f (x:xs)) (Compound g (y:ys)) =
if (f /= g)
then error "functors don’t match"
else let sigma = unifier x y
xs’ = map (extend sigma) xs
ys’ = map (extend sigma) ys
in {- do something with xs’ ys’ -}

Program 13.21: Clause Four – second expanded version

We still have to do something with xs’ and ys’. We just don’t know how to unify a list of terms.
Perhaps we should look at an analogy with Program 13.18, and implement a pair of mutually recursive
functions with types:

unifier :: Term -> Term -> Substitution


unifierall :: Terms -> Terms -> Substitution

Program 13.22: Types of unifier and unifierall

We then get:

unifier (Var x) (Var y) = mksubst (Var y) x


unifier (Var x) (Compound g ys) =
if (any (occurs x) ys)
then error "occurs"
else mksubst (Compound g ys) x
unifier (Compound f xs) (Var y) =
if (any (occurs y) xs)
then error "occurs"
else mksubst (Compound f xs) y
unifier (Compound f xs) (Compound g ys) =
if (f == g)
then unifierall xs ys
else error "functors differ"

Program 13.23: unifier:: Term -> Term -> Substitution

and:

unifierall None None = idsubst


unifierall None (Some _ _) = error "arities"
unifierall (Some _ _) None = error "arities"
unifierall (Some x xs) (Some y ys) =
let sigma = unifier x y
xs’ = termsmap (extend sigma) xs
ys’ = termsmap (extend sigma) ys
tau = unifierall xs’ ys’
in tau # sigma

Program 13.24: unifierall :: Terms -> Terms -> Substitution
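Program 13.24 relies on a helper termsmap, described below; a minimal sketch (our code, which also answers Question 1 in § 13.12):

termsmap :: (Term -> Term) -> Terms -> Terms
termsmap _ None        = None
termsmap f (Some t ts) = Some (f t) (termsmap f ts)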


The function termsmap maps a function over a Terms, just like map maps a function over a list. So
we have solved the problem of unifying lists of terms. Notice also that we only test whether the functors
are the same at one place. Now if we go back to the type we previously had we get:

unifierall [] [] = idsubst
unifierall [] (_:_) = error "arities"
unifierall (_:_) [] = error "arities"
unifierall (x : xs) (y : ys) =
let sigma = unifier x y
xs’ = map (extend sigma) xs
ys’ = map (extend sigma) ys
tau = unifierall xs’ ys’
in tau # sigma

Program 13.25: unifierall :: [Term] -> [Term] -> Substitution

If σ is the most general unifier of x and y, τ the most general unifier of map σ∗ xs and map σ∗ ys
then τ ◦ σ is the most general unifier of x : xs and y : ys. Although the composition of two idempotent
substitutions need not be idempotent (for example, two idempotent substitutions may compose to form
a cycle), τ ◦ σ is, in fact, idempotent.
The full code is given in Program 13.28 on page 123.
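As a test (our own, using the last row of Table 13.1; the name testUnify is ours):

testUnify :: [Term]
testUnify = map (unifier (Compound "f" [Var "x", Compound "g" [], Var "y"])
                         (Compound "f" [Compound "f" [], Var "z", Var "x"]))
                ["x", "y", "z"]
-- testUnify ==> [Compound "f" [],Compound "f" [],Compound "g" []]
-- i.e. x -> f, y -> f, z -> g, as Table 13.1 predicted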

13.6 Termination of unifier


Recursive calls to unifierall are not on structural sub-parts of the initial term: xs’ will not in general
be the tail of x:xs. Hence we need to use some other argument to show that calls to unifier will
terminate. We need to make a few more definitions.

Definition 7 (Chain) Let ≺ be a partial order. A chain is a sequence x1, x2, x3, . . . such that if i < j
then xj ≺ xi.

Definition 8 (Well-founded ordering) ≺ is well-founded if every chain is finite.

Note: sometimes well-foundedness is defined as “there are no infinite chains”.


If recursive calls to a function are on values below the initial value in some well-founded order then
the function embodies well-founded recursion, and is guaranteed to terminate.
An obvious well-founded order is that of decreasing length of lists. The function quicksort termi-
nates because every recursive call is on a list shorter than the original one.
We can find a well-founded order on (pairs of) terms such that every recursive call to unifier or
unifierall is on a smaller (pair of) term(s) – the situation is made slightly more complicated by the
use of two mutually recursive functions, but the principle is the same. It is crucial for the well-founded
order that we use that the unifiers are most-general and idempotent.
Well-founded recursion is very similar to a method from intuitionistic mathematics called bar induc-
tion.

13.7 A trick to avoid mutual recursion


If we had a pathological aversion to mutual recursion we could code up the final clause of Program
13.21 as:


unifier (Compound f (x:xs)) (Compound g (y:ys)) =


if (f /= g)
then error "functors don’t match"
else let sigma = unifier x y
xs’ = map (extend sigma) xs
ys’ = map (extend sigma) ys
tau = unifier (Compound "DUMMY" xs’)
(Compound "DUMMY" ys’)
in tau # sigma

Program 13.26: Avoiding mutual recursion

This particular trick relies on constructing terms with a dummy functor "DUMMY". Of course we would
waste time comparing "DUMMY" with itself, and it is bad programming practice to build a structure just
to take it apart.

13.8 A theorem
In order to answer Question 1 we have had to take quite a detour into the theory of substitutions and
of well-founded orders. The program that we have developed embodies a proof of this theorem:
Theorem 4 For any two terms t1 and t2 , either t1 and t2 have a most-general, idempotent unifier,
or they have no unifier.

13.9 Another view of unification


Rather than focussing on unifying pairs of terms, suppose we focus on unifying pairs of lists of terms,
like:
[x, y, f (a, b)] and [z, b, f (x, y)]
If we apply the unifier of these term lists to the variable list
[x, y, z]
we get
[a, b, a]
Now, suppose we change the syntax a little and write the pair of lists as a list of pairs:
(x, z)
(y, b)
(f (a, b), f (x, y))
and write the results as a list of pairs too
(x, a)
(y, b)
(z, a)
Suppose we change the syntax a little more and write the pairs of terms with just an = between them.
Then we get:
x = z (13.1)
y = b (13.2)
f (a, b) = f (x, y) (13.3)
and:
x = a (13.4)
y = b (13.5)
z = a (13.6)


It is no accident that the set of equations (13.4) to (13.6) is the solved set for the set (13.1) to (13.3).
The task of solving a set of equations is just to find replacements for the variables such that the left-
and right-hand sides of each equation are the same. So by writing a unification algorithm we have
written an equation solver.

13.10 Yet another view of unification


There is another view of unification – since we find the most-general unifier of two terms we can see it
is giving us the “best” match of two terms (if any match exists). This is how we often think of unification
in Prolog programming.
The type inference algorithm that the Hindley-Milner type system uses finds a principal type for any
typeable term. A principal type for a term is a type of which any other type for the term is an instance.
In this sense the principal type is the “best” type we can find for the term. We find principal types
by using unification to find the “best” match inside the type inference algorithm. We will develop a
principal-type algorithm in Chapter 15.

13.11 Summary
We have developed a unification algorithm in Haskell. We have touched on a number of issues, some
to do with Haskell, and some to do with unification:

• we saw how to represent terms using mutually inductive types;

• we saw how to represent terms without using mutually inductive types by using lists and sleight-
of-hand;

• we saw how to write mutually recursive functions;

• we saw how to avoid writing mutually recursive functions using a hack;

• we saw how to use higher-order functions to treat substitutions as Haskell functions, saving our-
selves a lot of bookkeeping;

• we saw well-founded recursion;

• we saw that unification is equation solving;

• we claimed that unification finds the “best” match.

13.12 Questions
1. Write termsmap.

2. Make the error messages more useful.

3. Write an equation solver based on the unification algorithm.

4. Write an accumulator-based unifier :: Substitution -> Term -> Term -> Substitution.

5. Use the Maybe type and write unifier :: Term -> Term -> (Maybe Substitution)

6. Write an unparser for Terms.

7. Write a unification algorithm for the type of terms in Program 13.27. In this type a term is either
a variable, or a constant, or the application of a term to a term. So x, f, f x, (f x)y are legitimate
terms corresponding to our terms x, f, f (x), f (x, y). Now we also have things like xy, (f x)(wz),
which don’t correspond to anything we had before.

8. Write a principal-type algorithm.


9. Write a parser for Terms.

type Var = String


type Const = String

data Term = Var Var


| Const Const
| App Term Term

Program 13.27: A different type for terms

13.13 Background
Most of the development of this chapter follows [49], which in turn relied on [54]. The type given in Program
13.27 is used in [60]. More details on bar induction can be found in [78]. The principal-type algorithm
is described in [38].


13.14 Code summary

type Var = String


type Functor = String

data Term = Var Var


| Compound Functor [Term]
deriving Show

type Substitution = Var -> Term

mksubst :: Term -> Var -> Substitution


mksubst t x y = if x == y then t else (Var y)

idsubst :: Substitution
idsubst = Var

extend :: Substitution -> Term -> Term


extend sigma (Var x) = sigma x
extend sigma (Compound f xs) = Compound f (map (extend sigma) xs)

infixr #
(#) :: Substitution -> Substitution -> Substitution
sigma # tau = (extend sigma). tau

occurs :: Var -> Term -> Bool


occurs x (Var z) = x == z
occurs x (Compound _ ys) = any (occurs x) ys

unifier :: Term -> Term -> Substitution


unifierall :: [Term] -> [Term] -> Substitution
unifier (Var x) (Var y) = mksubst (Var y) x
unifier (Var x) (Compound g ys) =
if (any (occurs x) ys)
then error "occurs"
else mksubst (Compound g ys) x
unifier (Compound f xs) (Var y) =
if (any (occurs y) xs)
then error "occurs"
else mksubst (Compound f xs) y
unifier (Compound f xs) (Compound g ys) =
if (f == g)
then unifierall xs ys
else error "functors differ"

unifierall [] [] = idsubst
unifierall [] (_:_) = error "arities"
unifierall (_:_) [] = error "arities"
unifierall (x : xs) (y : ys) =
let sigma = unifier x y
xs’ = map (extend sigma) xs
ys’ = map (extend sigma) ys
tau = unifierall xs’ ys’
in tau # sigma

Program 13.28: All the unifier code

14 Case study: unification in
continuation-passing style
14.1 Preamble
In Chapter 12 we discussed continuations, and in Chapter 13 we discussed unification. In this Chapter
we present a CPS unification algorithm.

14.2 Basic types


We use the same type for terms that we used before, defined in Program 13.1 on page 112. We will
also use the same type for substitutions, as defined in Program 13.4 on page 112. We resisted the
temptation to use a continuation-passing type for substitutions, like:

type CPSedSubs a = (Term -> a) -> Var -> a

Program 14.1: CPS substitutions

Since substitutions are not in CPS we have a number of ‘direct’ functions on substitutions, exactly as
before:

• mksub :: Var -> Term -> Substitution, which is mksubst from Program 13.6 on page 113 with its arguments flipped

• idsubst :: Substitution from Program 13.8 on page 113

• extend :: Substitution -> Term -> Term from Program 13.9 on page 113

14.3 Top-level design


As before we expect to have two mutually recursive functions, but now with the following type declara-
tion:

cpsunifier :: Term -> Term ->
              (Substitution -> a) -> a

cpsunifierl :: [Term] -> [Term] ->
              (Substitution -> a) -> a

Program 14.2: Types of cpsunifier and cpsunifierl

The continuation that we supply will be a function which makes use of a substitution, for example
something like: \ s -> map s ["x", "y", "z"].
As we saw in Chapter 12 the order of operations will seem to be inverted when compared to the
direct functions.

14.3.1 Starting to define cpsunifier


The first clause of cpsunifier is straightforward:


cpsunifier (Var x) (Var y) k = k (mksub x (Var y))

Program 14.3: The first clause of cpsunifier

All we are doing here is applying the continuation to the substitution we have constructed.

14.3.2 Second and third clauses


In the direct function the second and third clauses had the form:

• if (occurs ...) then ... else ....

If we invert the order we expect to have:

• cpsoccs ... (\b -> cpsif b ... ... k).

So we need to define cpsif and cpsoccs.

cpsif :: Bool -> a -> a -> (a -> b) -> b

cpsif b t e k = if b then k t else k e

Program 14.4: cpsif

As we expect, cpsoccs and cpsoccsl must be defined together, and they require us to define
cpsany:

cpsoccs :: Var -> Term -> (Bool -> a) -> a


cpsoccsl :: Var -> [Term] -> (Bool -> a) -> a

cpsoccs x (Var y) k = k (x == y)
cpsoccs x (Compound _ ys) k = cpsoccsl x ys k

cpsoccsl x = cpsany (\v c -> cpsoccs x v c)

cpsany :: (a -> (Bool -> b) -> b) -> [a] ->


(Bool -> b) -> b

cpsany _ [] k = k False
cpsany f (h:t) k = cpsany f t (\b -> f h (\c -> k(c||b)))

Program 14.5: cpsoccs, cpsoccsl and cpsany

Notice that the test we supply to cpsany is itself in CPS. Compare with any from the prelude:

any _ [] = False
any f (h:t) = (f h) || any f t

Program 14.6: any

And now the second and third clauses are:


cpsunifier (Var x) (Compound g ys) k =


cpsoccsl x ys (\b ->
cpsif b (error "occurs")
(mksub x (Compound g ys)) k)
cpsunifier (Compound f xs) (Var y) k =
cpsoccsl y xs (\b ->
cpsif b (error "occurs")
(mksub y (Compound f xs)) k)

Program 14.7: The second and third clauses of cpsunifier

14.3.3 The final clause of cpsunifier


The final clause of cpsunifier is where we call cpsunifierl. We also get the inversion of the
order of the operations:

cpsunifier (Compound f xs) (Compound g ys) k =


cpsunifierl xs ys (\s ->
cpsif (f == g) s (error "functors") k)

Program 14.8: The final clause of cpsunifier

14.4 Implementing cpsunifierl


The first three clauses of cpsunifierl are easy:

cpsunifierl [] [] k = k idsubst
cpsunifierl (_:_) [] k = k (error "arities")
cpsunifierl [] (_:_) k = k (error "arities")

Program 14.9: The first three clauses of cpsunifierl

The final clause is, as always, where the hard work is. Given two lists of terms (x:xs) and (y:ys)
and something to do next k we must:

• find the unifier of x and y, call it s, and next

• find s’s extension, t, and next

• map t over xs to get xs’, and next

• map t over ys to get ys’, and next

• find the unifier of xs’ and ys’, call this u, and next

• find u’s extension, v, and next

• apply k to v ◦ s

So we need to implement CPS functions cpsextend and cpsmap.

14.4.1 cpsextend and cpsmap


We can implement cpsextend and cpsmap easily:


cpsextend :: Substitution -> ((Term -> Term) -> a) -> a

cpsextend sigma k = k (extend sigma)

cpsmap :: (a -> b) -> [a] -> ([b] -> c) -> c

cpsmap _ [] k = k []
cpsmap f (h:t) k = cpsmap f t (\r -> k((f h) : r))

Program 14.10: cpsextend and cpsmap

14.4.2 The final clause of cpsunifierl


And now we just have to turn the list above into Haskell. For improved readability we have added the
types of the abstracted variables in the various continuations as comments.

cpsunifierl (x:xs) (y:ys) k =


cpsunifier x y (\s -> -- s :: Substitution
cpsextend s (\t -> -- t :: Term -> Term
cpsmap t xs (\xs’ -> -- xs’ :: [Term]
cpsmap t ys (\ys’ -> -- ys’ :: [Term]
cpsunifierl xs’ ys’ (\u -> -- u :: Substitution
cpsextend u (\v -> -- v :: Term -> Term
k (v.s)))))))

Program 14.11: The final clause of cpsunifierl

And now we get:


Main> cpsunifier
(Var "x")
(Var "y")
(\s -> map s ["x", "y"])
[Var "y",Var "y"]
Main> cpsunifier
(Compound "f" [Var "x", Compound "g" [], Var "y"])
(Compound "f" [Var "y", Var "x", Var "z"])
(\s -> map s ["x", "y", "z"])
[Compound "g" [],Compound "g" [],Compound "g" []]

14.5 Summary
We presented a CPS version of a unification algorithm.

15 Case study: computing principal types
for λ-terms

15.1 Preamble
We will develop a type assignment algorithm in Haskell. We have already mentioned type assignment
in Chapter 11. The rules for type assignment were given in Rule 11.1. We will develop an algorithm
based on these rules. We begin with some definitions.

Definition 9 (Closed term) A λ-term is closed if it contains no occurrences of any free variables.

Definition 10 (Pure λ calculus) There are no constants in the pure λ calculus.

Definition 11 (Principal type) A principal type for a term is a type for the term. Furthermore every
type for the term is a substitution instance of it.

Theorem 5 (Principal type theorem) Every closed term of the pure λ calculus either has a principal
type, or has no type.

The principal-type algorithm is described in [38]. Haskell’s own type inference is just a slightly more
subtle version of this algorithm.
We give some examples of principal types assignable to λ-terms in § 15.11 on page 140.

15.2 Getting started


We need to make some definitions to get started. Since we are going to deal with typing of λ-terms in
a typed language based on the λ-calculus there is lots of scope for confusion. We will need to have
Haskell types for λ-terms and for their types.
Program 15.1 gives a suitable type for λ-terms.

type Var = String

data LTerm = Var Var


| App LTerm LTerm
| Abs Var LTerm

Program 15.1: A type for λ-terms

15.2.1 Type expressions


The type system that we are dealing with has only type variables and function types. A function type
is just a pair of types, but modern notation is to use an arrow to form pairs. Program 15.2 gives a
suitable type for types.


type TyVar = String

infixr :->

data Type = TyVar TyVar


| Type :-> Type

instance Show Type where


show = unparseTy

Program 15.2: A type for types

The unparser is defined in Program 15.3.

unparseTy :: Type -> String

unparseTy (TyVar tvar) = tvar


unparseTy ((t1 :-> t2) :-> t3) =
let t1t2str = unparseTy (t1 :-> t2)
t3str = unparseTy t3
in ’(’:t1t2str ++ ") -> " ++ t3str
unparseTy (t1 :-> t2) =
let t1str = unparseTy t1
t2str = unparseTy t2
in t1str ++ " -> " ++ t2str

Program 15.3: An unparser for types

As usual we have → as right-associative.

15.3 Getting ahead of ourselves


We know that we will need to be able to unify type expressions. The code in Programs 15.4 and 15.5
is an adaptation of the code that we presented in Chapter 13 to compute unifiers. Computing unifiers
for type expressions is easier than computing unifiers for the terms we used before – there is only one
functor, and its arity is 2. The code in Program 15.4 deals with substitutions.
The code in Program 15.5 deals with unification itself. We choose to use an accumulator-based
algorithm. Later we will implement the type inference algorithm using an accumulator too, and we will
see that we can improve our code slightly by exploiting the similarity between the two functions.

15.4 Type assignment itself


Now we explain how to assign types to terms. There are three cases to consider:

• we have a variable;

• an abstraction;

• an application term.

Life is quite straightforward: we appeal to some proof theory that we do not discuss and rely on the
fact that if we can assign a type to a term then we can assign a type using the rules in the most obvious
way – we rely on there being principal derivations.


type Substitution = TyVar -> Type

mksubst :: Type -> TyVar -> Substitution

mksubst t x y = if x == y then t else (TyVar y)

idsubst :: Substitution

idsubst = TyVar

extend :: Substitution -> Type -> Type

extend sigma (TyVar x) = sigma x


extend sigma (t1 :-> t2) =
(extend sigma t1) :-> (extend sigma t2)

infixr #

(#) :: Substitution -> Substitution -> Substitution

sigma # tau = (extend sigma). tau

Program 15.4: Substitutions on types

15.4.1 Typing a variable


We start by explaining how to assign a type to a variable.

x : α ∈ Γ
Γ ↦ x : α

Rule 15.1: Typing a variable

Γ is an environment, i.e. a collection of variable/type pairs. This rule tells us that if x : α is in the
environment then the variable x can be assigned the type α. In this rule α is an arbitrary type.

15.4.2 Typing an abstraction


The second rule explains how to assign a type to an abstraction:

x : α, Γ ↦ y : β
Γ ↦ λx.y : α → β

Rule 15.2: Typing an abstraction

This rule tells us that in order to type an abstraction we add a new binding to the environment, and
type the body of the abstraction in the new environment.

15.4.3 Typing an application


The third rule explains how to assign a type to an application:


unifier :: Type -> Type -> Substitution

unifier t1 t2 = trunifier idsubst t1 t2

trunifier :: Substitution -> Type -> Type -> Substitution

trunifier sigma (TyVar x) (TyVar y) =


(mksubst (TyVar y) x) # sigma
trunifier sigma (TyVar x) (t1 :-> t2) =
if ((occurs x t1) || occurs x t2)
then error ": unification would give infinite type"
else (mksubst (t1 :-> t2) x) # sigma
trunifier sigma (t1 :-> t2) (TyVar y) =
if ((occurs y t1) || occurs y t2)
then error ": unification would give infinite type"
else (mksubst (t1 :-> t2) y) # sigma
trunifier sigma (t1 :-> t2) (s1 :-> s2) =
let sigma’ = trunifier sigma t1 s1
t2’ = extend sigma’ t2
s2’ = extend sigma’ s2
in trunifier sigma’ t2’ s2’

occurs :: TyVar -> Type -> Bool

occurs x (TyVar z) = x == z
occurs x (t1 :-> t2) = (occurs x t1) || occurs x t2

Program 15.5: Unifying types

Γ ↦ f : α → β    Γ ↦ x : α
Γ ↦ f x : β

Rule 15.3: Typing an application

This tells us that in order to type an application we should type the term being applied and the term
to which it is applied.

15.5 Example derivation


Here is a derivation of a type for λf x.f x:

[f : a → b, x : a] ↦ f : a → b    [f : a → b, x : a] ↦ x : a
[f : a → b, x : a] ↦ f x : b
[f : a → b] ↦ λx.f x : a → b
↦ λf x.f x : (a → b) → a → b

Rule 15.4: Typing λf x.f x

This is not the only type we can infer for λf x.f x.


[f : a → a, x : a] ↦ f : a → a    [f : a → a, x : a] ↦ x : a
[f : a → a, x : a] ↦ f x : a
[f : a → a] ↦ λx.f x : a → a
↦ λf x.f x : (a → a) → a → a

Rule 15.5: A second typing for λf x.f x

We can claim that (a → b) → a → b is a better type for λf x.f x than (a → a) → a → a is, since
(a → a) → a → a is a substitution instance of (a → b) → a → b. In fact, (a → b) → a → b is the
principal type for λf x.f x. The language used here should strongly remind you of the language used
in unification, and there is a close connection. When we compute principal type we will use unification
when we need to match.

15.6 Turning these rules into an algorithm


We can use Rules 15.1, 15.2 and 15.3 as the basis of a type assignment algorithm. We read the rules
from the bottom up.
In Rule 15.2 we pick a new type α for the binding variable. We must have a stock of fresh type
variables, as given in Program 15.6:

supply :: Char -> [String]

supply c = map (\n -> c : (show n)) [1..]

freshtypevars = supply ’t’

Program 15.6: A stock of arbitrary types
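For example (our own check):

-- take 3 freshtypevars  ==>  ["t1","t2","t3"]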

Program 15.7 implements lookuptype, which takes a λ-variable and an environment, and returns
the type of the variable. Since we are dealing with closed terms, and the environment is added to using
Rule 15.2, every variable we need to check will have a type in the environment.

lookuptype :: (Show a, Eq a) => a -> [(a,b)] -> b

lookuptype v [] = error ((show v) ++ " is not bound.")


lookuptype v ((x,ty):xts) =
if v == x
then ty
else lookuptype v xts

Program 15.7: Retrieving a type for a variable

The type for lookuptype is more general than we need: Var -> [(Var, TyVar)] -> TyVar
would do.

15.6.1 Key to the algorithm


The key to the algorithm is to read the rules upwards. We treat the rules as telling us how to type a
compound expression by giving types for its sub-expressions (again, we are just assuming that this is
reasonable, but we could look at the literature to confirm that this is valid). Since we don’t know at
the start what the type of an expression will be we just pick an arbitrary type and “fill in the details”
as we work our way upwards. We fill in the details by accumulating a substitution. We ensure that
the type we finally compute is the principal type by using unification, which computes most general

133
15 Case study: computing principal types for λ-terms

unifiers, to match types when required. Later we will see that some of the unification that we are doing
is unnecessary, but for the moment, whenever we need to match types we will unify.
Initially we have an empty context, and the identity substitution. We also have a supply of arbitrary
types. Since we are accumulating a substitution, we will need to return this. We also need to return the
type of the term, and, since we may consume some of our stock of arbitrary types we need to return
the unused part of this.
We are working our way towards the program given in Program 15.8.

princtype :: LTerm -> Type

princtype tm =
let (_, _, typ) =
printyp idsubst [] (supply ’t’) tm
in typ

Program 15.8: Computing a principal type

The type of printyp is given in Program 15.9.

printyp :: Substitution ->


[(Var, TyVar)] ->
[TyVar] ->
LTerm ->
(Substitution, [TyVar], Type)

Program 15.9: The type of printyp

One important point to notice is that we choose only to build up information in the substitution: the
environment will consist of λ-variable/type-variable pairs only.

15.6.2 printyp of a variable


Program 15.10 explains how to compute the type for a variable.

printyp sigma env (tv:tyvars) (Var x) =


let t = lookuptype x env
delta = unifier (TyVar tv) (sigma t)
sigma’ = delta # sigma
t’ = sigma’ tv
in (sigma’, tyvars, t’)

Program 15.10: Typing a Var

We pick a new arbitrary type, tv. We retrieve t, the type variable we picked for the variable. The
currently accumulated substitution tells us what new details we have about t, so we find the unifier
of TyVar tv and sigma t. This unifier is not very interesting as one of the type expressions being
unified is a type variable which is sure not to occur in the other. We will return to this observation later.
Now, we update our accumulated substitution by composing the unifier with the previously accumulated
substitution. Finally, the type we have computed for the variable is given by applying this substitution
to the new arbitrary type that we picked.
We assert, but do not prove, that this is the principal type.

15.6.3 printyp of an abstraction


Program 15.11 explains how to compute the type for an abstraction.


printyp sigma env (tv1:tv2:tyvars) (Abs x m) =


let (sigma’, tyvars’, mtype) =
printyp sigma ((x, tv1):env) tyvars m
tv1’ = sigma’ tv1
delta = unifier (tv1’ :-> mtype) (TyVar tv2)
sigma’’ = delta # sigma’
t’ = sigma’’ tv2
in (sigma’’, tyvars’, t’)

Program 15.11: Typing an Abs

First we compute the principal type of the body, given the current accumulated substitution, and
extending the environment with a binding of a new arbitrary type to the abstracted variable. Next we
make use of the newly accumulated substitution on the arbitrary type we chose for the variable. We
choose a second arbitrary type for the abstraction term itself, and compute the unifier of this with tv1’
:-> mtype. We then use this to update the accumulated substitution and also compute the type we
were interested in.
We assert, but do not prove, that we compute the principal type.

15.6.4 printyp of an application term


Typing application terms is the most complicated part, but it follows the same structure as before.
Program 15.12 explains how to compute the type for an application term.

printyp sigma env (tv1:tv2:tyvars) (App l m) =


let (sigma’, tyvars’, ltype) =
printyp sigma env tyvars l
tau =
unifier ltype ((TyVar tv1) :-> (TyVar tv2))
(sigma’’, tyvars’’, mtype) =
printyp (tau # sigma’) env tyvars’ m
delta = unifier mtype (sigma’’ tv1)
sigma’’’ = delta # sigma’’
lmtype = sigma’’’ tv2
in (sigma’’’, tyvars’’, lmtype)

Program 15.12: Typing an App

It is crucial that we ensure that we take account of any information we gained when typing l when
we attempt to type m. It is here that we will get the "unification would give infinite type"
error when we try to type λx.xx.
We assert, but do not prove, that we compute the principal type.

15.7 Improving the code


Careful inspection of our code allows us to see how we can improve it slightly, mainly so that fewer
intermediate results are calculated, and fewer redundant function calls are made.

15.7.1 Improving printyp of a variable


In Program 15.10 we define delta as:

• unifier (TyVar tv) (sigma t).

The function unifier is defined in terms of an accumulator-based auxiliary function trunifier.


So we know that this definition will expand to:


• trunifier idsubst (TyVar tv) (extend sigma t).

We have just chosen tv from our stock of shiny new type variables, so we know that tv does not
occur in (extend sigma t). Hence we know that the result of this call will be:

• (mksubst (extend sigma t) tv) # idsubst

So delta is (mksubst (extend sigma t) tv) # idsubst


All we do with delta is compose it with sigma to obtain sigma’. Now, sigma’ is defined as:

• (mksubst (extend sigma t) tv) # idsubst # sigma

Since composition with the identity substitution does nothing, this is just:

• (mksubst (extend sigma t) tv) # sigma

We notice that t’ is only used once, and its definition is very small, so we put it in place.
So now we have:

printyp sigma env (tv:tyvars) (Var x) =


let t = lookuptype x env
sigma’ = (mksubst (sigma t) tv) # sigma
in (sigma’, tyvars, sigma’ tv)

Program 15.13: Improving printyp on a Var

Notice that we would have obtained the same result if we had thought of passing sigma into
trunifier as the accumulated unifier, rather than starting from scratch with the identity substitution.

15.7.2 Improving printyp of an abstraction


We can use similar reasoning to improve Program 15.11. We have just picked tv2, so it does not
occur elsewhere. Hence we could define sigma’’ as:

• (mksubst (tv1’ :-> mtype) tv2) # sigma’

This would give us:

printyp sigma env (tv1:tv2:tyvars) (Abs x m) =


let (sigma’, tyvars’, mtype) =
printyp sigma ((x, tv1):env) tyvars m
tv1’ = sigma’ tv1
sigma’’ =
(mksubst (tv1’ :-> mtype) tv2) # sigma’
in (sigma’’, tyvars’, sigma’’ tv2)

Program 15.14: Nicer typing of an Abs

Again we have eliminated a call to unifier, which is probably conceptually clearer as there is no
unification going on.

15.7.3 Improving printyp of an application


We can use similar reasoning on the typing of an application term. In this case however, we will not be
able to eliminate the unifications, as they may really do something.
In Program 15.12 we have:

• tau = unifier ltype ((TyVar tv1) :-> (TyVar tv2))


There is nothing much that we can do about the definition of tau, except to feel that we have possibly
built (TyVar tv1) :-> (TyVar tv2) just to take it apart. The only subsequent use of tau is that
it is composed with sigma’, so we might as well exploit the accumulating nature of trunifier, and
define:

• mu = trunifier sigma’ ltype ((TyVar tv1) :-> (TyVar tv2))

We can then use mu in place of tau # sigma’. Similarly we can dispense with delta and just
define:

• sigma’’’ = trunifier sigma’’ mtype (sigma’’ tv1)

We end up with Program 15.15

printyp sigma env (tv1:tv2:tyvars) (App l m) =


let (sigma’, tyvars’, ltype) =
printyp sigma env tyvars l
mu = trunifier sigma’
ltype
((TyVar tv1) :-> (TyVar tv2))
(sigma’’, tyvars’’, mtype) =
printyp mu env tyvars’ m
sigma’’’ = trunifier sigma’’
mtype
(sigma’’ tv1)
in (sigma’’’, tyvars’’, sigma’’’ tv2)

Program 15.15: Nicer typing of an App

We have still named some intermediate results, but this code (and that in Programs 15.13 and
15.14) is cleaner than the code we originally wrote.

15.8 Even more improvements


In each of the three clauses for printyp we return a triple of the form:

• (gamma, tyvars, gamma v)
  :: (Substitution, [TyVar], Type)

This is silly. We should instead re-write our code so that we return a triple of the form:

• (gamma, tyvars, v)
  :: (Substitution, [TyVar], TyVar)

We will need to change princtype, as shown in Program 15.16.

princtype :: LTerm -> Type

princtype tm =
let (sigma, _, v) = printyp idsubst [] (supply ’t’) tm
in sigma v

Program 15.16: An improved definition of princtype

On reflection, the previous definition of princtype in Program 15.8 really is a bit odd looking.
However, it seemed plausible at the time that we wrote it.
The final version of printyp is given in Program 15.17.


printyp :: Substitution ->


[(Var, TyVar)] ->
[TyVar] ->
LTerm ->
(Substitution, [TyVar], TyVar)

printyp sigma env (tv:tyvars) (Var x) =


let t = lookuptype x env
sigma’ = (mksubst (sigma t) tv) # sigma
in (sigma’, tyvars, tv)

printyp sigma env (tv1:tv2:tyvars) (Abs x m) =


let (sigma’, tyvars’, mv) =
printyp sigma ((x, tv1):env) tyvars m
sigma’’ = (mksubst
((sigma’ tv1) :-> (sigma’ mv))
tv2)
# sigma’
in (sigma’’, tyvars’, tv2)

printyp sigma env (tv1:tv2:tyvars) (App l m) =


let (sigma’, tyvars’, lv) =
printyp sigma env tyvars l
mu = trunifier sigma’
(sigma’ lv)
((TyVar tv1) :-> (TyVar tv2))
(sigma’’, tyvars’’, mv) =
printyp mu env tyvars’ m
sigma’’’ = trunifier sigma’’
(sigma’’ mv)
(sigma’’ tv1)
in (sigma’’’, tyvars’’, tv2)

Program 15.17: Final version of printyp?

This code, although much shorter than the code we originally wrote, is clearer and easier to under-
stand. If only we could have written this in the first place!

15.9 Tidying up
We are now more-or-less finished, except for some tidying up that we might want to do to the way
results are printed. The code that we have written so far behaves like this:

PT> i
^ x . x :: LTerm
PT> princtype i
t1 -> t1 :: Type
PT> k
^ x y . x :: LTerm
PT> princtype k
t1 -> t3 -> t1 :: Type
PT> one
((^ n m z . z n m) ^ x y . y) (^ x . x) :: LTerm
PT> princtype one
((t18 -> t20 -> t20) -> (t23 -> t23) -> t12) -> t12
:: Type


PT> one’
^ z . z (^ x y . y) (^ x . x) :: LTerm
PT> princtype one’
((t8 -> t10 -> t10) -> (t13 -> t13) -> t4) -> t4 :: Type
PT> two
(^ n . ((^ n m z . z n m) ^ x y . y) n)
(((^ n m z . z n m) ^ x y . y) (^ x . x)) :: LTerm
PT> princtype two
((t22 -> t24 -> t24) -> (((t45 -> t47 -> t47) ->
(t50 -> t50) -> t39) -> t39) -> t16) -> t16 :: Type
PT> two’
^ z . z (^ x y . y) (^ z . z (^ x y . y) (^ x . x))
:: LTerm
PT> princtype two’
((t8 -> t10 -> t10) -> (((t20 -> t22 -> t22) ->
(t25 -> t25) -> t16) -> t16) -> t4) -> t4 :: Type
These types are correct, but don’t look very pretty. It is also a bit unfortunate that the type of one
and one’ is not obviously the same. We can correct this flaw by defining a way to prettify types, as in
Program 15.18 on the next page.
Now we get:
PT> (prettytypes.princtype) i
a -> a :: Type
PT> (prettytypes.princtype) k
a -> b -> a :: Type
PT> (prettytypes.princtype) one
((a -> b -> b) -> (c -> c) -> d) -> d :: Type
PT> (prettytypes.princtype) one’
((a -> b -> b) -> (c -> c) -> d) -> d :: Type
PT> (prettytypes.princtype) two
((a -> b -> b) -> (((c -> d -> d) ->
(e -> e) -> f) -> f) -> g) -> g :: Type
PT> (prettytypes.princtype) two’
((a -> b -> b) -> (((c -> d -> d) ->
(e -> e) -> f) -> f) -> g) -> g :: Type
And finally, we can ask Haskell for a type for one’:
PT> \z -> z (\ x y -> y)(\x -> x)
ERROR - Cannot find "show" function for:
*** Expression : \z -> z (\x y -> y) (\x -> x)
*** Of type : ((a -> b -> b) -> (c -> c) -> d) -> d
And it agrees with us, even down to the names of the type variables.

15.10 Summary
In this note we have developed a type inference algorithm. This algorithm will take a closed λ-term
and return either a principal type for the term, if the term has a type; or the information that the term
has no type.
The algorithm was motivated by considering the rules for type assignment. Its correctness is justified
by appeal to some proof theory that we have not had space to cover. Unification is crucial to the
computation of a principal type.
We could extend this algorithm to deal with any term (not just closed terms). Free variables are
allocated types when we first meet them, and we have to be careful to pass an environment around.
We can also extend the algorithm to deal with other types (e.g. products) if we can give the
appropriate proof rules, and give similar proof theory to justify their use.


prettytypes ty =
  let (_, thetype, _) = prettify [] prettynames ty
  in thetype

prettynames = (map (\c -> [c]) ['a' .. 'z']) ++
              map (\n -> "tv" ++ (show n)) [1..]

prettify [] (name:names) (TyVar x) =
  ([(x, name)], TyVar name, names)
prettify ((y,yname):env) names (TyVar x) =
  if x == y
  then ((y,yname):env, TyVar yname, names)
  else
    let (nenv, pretty, nunames) =
          prettify env names (TyVar x)
    in ((y,yname):nenv, pretty, nunames)
prettify env names (x :-> y) =
  let (nuenv, prettyx, nunames) =
        prettify env names x
      (nunuenv, prettyy, nununames) =
        prettify nuenv nunames y
  in (nunuenv, prettyx :-> prettyy, nununames)

Program 15.18: Prettifying types

15.11 Examples
λxy.x : a → b → a
λxyz.xz(yz) : (a → b → c) → (a → b) → a → c
λx.x : a → a
λf x.x : a → b → b
λf x.f x : (a → b) → a → b
λf x.f (f x) : (a → a) → a → a
λf x.f (f (f x)) : (a → a) → a → a
λf x.f (f (f (f x))) : (a → a) → a → a
λf x.f (f (f (f (f x)))) : (a → a) → a → a
λf x.f (f (f (f (f (f x))))) : (a → a) → a → a
λf x.f (f (f (f (f (f (f x)))))) : (a → a) → a → a
λf x.f (f (f (f (f (f (f (f x))))))) : (a → a) → a → a
λf x.f (f (f (f (f (f (f (f (f x)))))))) : (a → a) → a → a
λf x.f (f (f (f (f (f (f (f (f (f x))))))))) : (a → a) → a → a
λxypq.xp(ypq) : (a → b → c) → (a → d → b) → a → d → c
λf gx.f (gx) : (a → b) → (c → a) → c → b
λxy.yx : a → (a → b) → b
λf gx.f (gx) : (a → b) → (c → a) → c → b
λnmz.znm : a → b → (a → b → c) → c
λxy.x : a → b → a
λxy.y : a → b → b
λp.p(λxy.x) : ((a → b → a) → c) → c
λp.p(λxy.y) : ((a → b → b) → c) → c
λx.x : a → a
λn.n(λxy.x) : ((a → b → a) → c) → c
Notice that Y and Θ are untypeable.

16 Case study: SKI-ing

16.1 Preamble
In this Chapter we will explain how to compile λ-terms to Combinatory Logic terms.
Section 16.2 introduces Combinatory Logic (CL), a theory closely related to the λ-calculus.
Section 16.3 asks you to write a function to reduce CL terms.
Section 16.4 explains how we can mimic abstraction in CL.
Section 16.5 explains how to translate λ-terms to CL.

16.2 λ calculus and combinatory logic


The theory of combinatory logic is very closely related to the λ-calculus. A combinator is just a function,
and corresponds to a closed λ-term.
Two combinators are of special interest to us, S and K.
A term of combinatory logic is either a variable or S or K or is the application of a CL term to a CL
term. As usual application is left-associative.
We can use the following Haskell data type to represent CL terms:

type CLVar = String

data CLTerm = CLVar CLVar -- never "S" or "K"


| S
| K
| CLApp CLTerm CLTerm

instance Show CLTerm where


show = clunparse

clunparse (CLVar x) = x
clunparse S = "S"
clunparse K = "K"
clunparse (CLApp l n@(CLApp _ _))
= (clunparse l) ++ bracket(clunparse n)
clunparse (CLApp l n)
= (clunparse l) ++ clunparse n

Program 16.1: A type for CL terms

Recall that we have previously defined:

S =def λxyz.xz(yz)
K =def λxy.x

Pedantic note: S and K are (names of) terms of the λ-calculus, S and K are terms of combinatory
logic.


16.2.1 An interesting observation


S and K are both typeable terms:

λxyz.xz(yz) : (a → b → c) → (a → b) → a → c
λxy.x : a → b → a

Instantly, we recognise that if we replace → by ⊃ and write the letters in upper case the types of S
and K can be written:
(A ⊃ B ⊃ C) ⊃ (A ⊃ B) ⊃ A ⊃ C
A ⊃ B ⊃ A

Recall that Modus Ponens is the rule:

⊢ A ⊃ B    ⊢ A
MP
⊢ B

Rule 16.1: Modus Ponens

And, of course, if we treat the two types as axiom schemata we have, in combination with Modus
Ponens, a system which is complete for minimal implicational logic.
Now, we look at the rules that we saw for type checking, and we see the rule:

x:α→β y:α
(→ E)
xy : β

Rule 16.2: → E

We can type SKK:

S : (γ → (γ → γ) → γ) → (γ → γ → γ) → γ → γ K : γ → (γ → γ) → γ
(→ E)
SK : (γ → γ → γ) → γ → γ K:γ→γ→γ
(→ E)
SKK : γ → γ

Rule 16.3: Typing SKK

This is just this proof:

⊢ (A ⊃ (A ⊃ A) ⊃ A) ⊃ (A ⊃ A ⊃ A) ⊃ A ⊃ A    ⊢ A ⊃ (A ⊃ A) ⊃ A
(⊃ E)
⊢ (A ⊃ A ⊃ A) ⊃ A ⊃ A    ⊢ A ⊃ A ⊃ A
(⊃ E)
⊢ A ⊃ A

Rule 16.4: Proving ⊢ A ⊃ A

If we reduce SKK we get λx.x which does indeed have type γ → γ.

16.3 Reducing CL terms


We can reduce CL terms in a similar way to the way we reduce λ-terms. Just as we introduced a single
step of β reduction and then computed its reflexive transitive closure, so first we introduce a single step
of reduction on CL terms:
Kxy ▷1 x
Sxyz ▷1 xz(yz)


So:
SKKp
▷1 Kp(Kp)
▷1 p
Note: In the λ calculus SKK reduces to λx.x; in CL SKK does not simplify.
The CL term SKK does not have an abstraction in it (it is a CL term and abstractions do not figure in
CL terms), and yet it can mimic a λ-term which does have an abstraction. It turns out that this is not a
fluke: we can use S and K to give ourselves the power of λ.
Since the term SKK is just the identity function we can freely extend our theory of CL terms, without
affecting anything very much, by adding a new primitive combinator I, with the reduction rule:

Ix ▷1 x

Question Extend the type of CL terms to include I.


Question Write a Haskell function clreduce which implements the reflexive transitive closure of ▷1.
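Since these are exercises, what follows is only a hedged sketch of one possible answer, not the official one. It assumes I has been added to CLTerm as a new constructor (as the first Question asks), and it normalises applicatively, so it may loop on CL terms with no normal form.

-- A hedged sketch only: arguments are fully reduced first, and a
-- K, S or I redex that then appears at the root is fired by clapply.
clreduce :: CLTerm -> CLTerm
clreduce (CLApp p q) = clapply (clreduce p) (clreduce q)
clreduce t           = t

-- clapply rebuilds an application whose parts are already reduced.
clapply :: CLTerm -> CLTerm -> CLTerm
clapply I x                     = x            -- Ix   reduces to x
clapply (CLApp K x) _           = x            -- Kxy  reduces to x
clapply (CLApp (CLApp S x) y) z =
  clreduce (CLApp (CLApp x z) (CLApp y z))     -- Sxyz reduces to xz(yz)
clapply p q                     = CLApp p q    -- no redex at the root

On the representation of SKKp this sketch gives CLVar "p", matching the reduction shown above.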

16.4 Mimicking abstraction using SKI


We have just seen that we can use I to mimic λx.x. In fact we can extend this idea to mimic abstraction
completely within combinatory logic.
We will extend the syntax of CL with an abstraction operator called λw. The superscript w is for weak.
Section 16.5.2 gives pointers to more details on λw.

16.4.1 λw x.x
We begin with:
λw x.x =def I
Now, λw mimics λ (the abstraction operator of the λ-calculus) in the sense that (λw x.x)M reduces
in a single step to M, and (λx.x)P also reduces in a single step to P. We have only shown that we
can mimic λx.x, and we need to extend the definition.

16.4.2 λw x.P, x ∉ FV(P)


Suppose we have a term λx.P, where x ∉ FV(P). If we apply λx.P to an arbitrary term M and do a
single reduction, we get P. So, in order to mimic λx.P we need a CL term which gives P when applied
to an arbitrary term. But we have such a term – KP. So we define:

λw x.P =def KP,  x ∉ FV(P)

16.4.3 λw x.UV
Now suppose we have (λx.UV)M. This reduces in a single step to:
• ([M/x]U)([M/x]V)
Now, (λx.U)M reduces in a single step to [M/x]U, so what we have is:
• (λx.U)M((λx.V)M)
Hence:
• (λx.UV)M = (λx.U)M((λx.V)M)
Let's switch to looking at S. Recall:
• Sfgy ▷1 fy(gy)
So:

• S(λw x.U)(λw x.V)M ▷1 (λw x.U)M((λw x.V)M)

This tells us that S(λw x.U)(λw x.V)M has the same behaviour as (λx.UV)M, and hence we make
the definition:

λw x.UV =def S(λw x.U)(λw x.V)

16.4.4 λw
The full definition of λw is:
λw x.x  =def I
λw x.P  =def KP,  x ∉ FV(P)
λw x.UV =def S(λw x.U)(λw x.V)
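As a hedged sketch, these three equations transcribe almost directly into Haskell over CL terms (assuming the CLTerm type of Program 16.1 extended with I; the helper clfreein is our own invention, not part of the text):

-- clfreein x t: does the variable x occur in the CL term t?
clfreein :: CLVar -> CLTerm -> Bool
clfreein x (CLVar y)   = x == y
clfreein _ S           = False
clfreein _ K           = False
clfreein _ I           = False
clfreein x (CLApp p q) = clfreein x p || clfreein x q

-- lamw x m mimics the abstraction λw x.m, clause by clause.
lamw :: CLVar -> CLTerm -> CLTerm
lamw x (CLVar y)
  | x == y    = I                              -- λw x.x  = I
lamw x p
  | not (x `clfreein` p) = CLApp K p           -- λw x.P  = KP, x ∉ FV(P)
lamw x (CLApp u v)                             -- λw x.UV = S(λw x.U)(λw x.V)
  = CLApp (CLApp S (lamw x u)) (lamw x v)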

16.5 Compiling λ-terms to CL


The ability to mimic abstraction is of more than merely academic interest. We can take the definition
of λw , and use it as the basis for an inductive translation from λ-terms into CL terms. A variable in the
λ-calculus becomes a variable in CL. An application term in the λ calculus becomes an application
term in CL. The tricky part is to deal with abstraction terms. Translating λx.x is easy. So is translating
λx.y, and λx.UV. The problem arises when we need to translate λx.λy.Q. The trick here is to translate
λy.Q first. This will give a CL term Q′ which, by definition, will not be an abstraction. Then we can
apply the obvious translation to x and Q′.
For example:
(λxy.x)′
⇒ λx.(λy.x)′
⇒ (λx.Kx)′
⇒ S(λx.K)′(λx.x)′
⇒ S(KK)I
We have been gloriously cavalier in this derivation – things like λx.Kx are neither λ-terms nor terms
of combinatory logic. Haskell’s type system will not allow us to play fast and loose with types like this.
Just as (λxy.x)uv ▷β u, we get the following derivation:

S(KK)Iuv
▷ (KK)u(Iu)v
▷ K(Iu)v
▷ Iu
▷ u

Thus we can compile λ-terms into CL terms.


Question Write a function skicompile :: LTerm -> CLTerm which converts from a λ-term
to a CL term. You will (probably) need to use an auxiliary function skicompile' :: Var ->
CLTerm -> CLTerm to handle the conversion of a term of the form λxy.M.
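One possible shape for such a compiler, as a hedged sketch (it assumes Var and CLVar are both String, uses the LTerm constructors Var, App and Abs from earlier Chapters, and uses lamw from the sketch above in the role of skicompile'):

skicompile :: LTerm -> CLTerm
skicompile (Var x)   = CLVar x      -- variables translate to variables
skicompile (App m n) = CLApp (skicompile m) (skicompile n)
skicompile (Abs x m) = lamw x (skicompile m)
                       -- translate the body first, then mimic the
                       -- abstraction with λw, as described above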

16.5.1 Summary
Now we can reduce λ-terms, by compiling them to CL terms and then reducing the CL term. As so
often we have only looked at the first steps. For example there are lots of optimisations that we can
make in both the compilation and reduction phases.

16.5.2 Background
This material is covered in greater detail in [39], [4] and [64]. The treatments in [39] and [4] focus
on technical aspects of λw, and other similar operators. On the other hand, [64] concentrates on
implementation issues. The first practical use of combinators in compiling functional languages was
by David Turner in SASL. The definition of λw goes back at least as far as the late 1950s.

17 Reasoning about functions
At the end of Chapter 11 we touched on the issue of reasoning about programs. In this Chapter we
will continue this work, although not in exactly the same formal framework as we used in Chapter 11.
In the main we will be concerned about reasoning about program correctness, rather than program
efficiency. We will follow the treatment from Chapter 6 of [61].
Reasoning about functions can be difficult and expensive: not reasoning about them can be fatal.

17.1 Preliminaries
We assume that we have the ‘usual’ logical language: ∧, ∨, ¬, ⊃, ∀, ∃.
The specification for a program is, typically, a proposition of the form:

For all inputs there is some output which has such-and-such a property.

We will formalise this as:


(∀i : Input)(∃o : Output)S(i, o)

where S formalises the relationship between input and output. Although this is the typical form of a
program specification, not all specifications have this form. For example we might specify a parser for
a grammar g as:
(∀s : String)(G(g, s) ∨ ¬G(g, s))

i.e. a parser is a function which takes a string and a grammar and tells whether or not the string is
generated by the grammar. Specifications of this form can be massaged into the ∀∃ form.
If we consider the specification as a proposition we will see that there is (usually) a close connection
between a proof of the proposition and a program which meets the specification. In fact, in cer-
tain logics proofs and programs are just the same thing. Even in logics where we cannot make this
identification there is usually much to be learned about algorithms from studying proofs.

17.2 Proof by induction


We will focus on proving properties of functions written over inductive data types. Such functions, as
we have seen in previous Chapters, are typically defined by recursion. Proofs of their properties are
typically inductive proofs. So we begin by looking at induction on natural numbers.

17.2.1 The natural numbers


Recall that the type of the natural numbers can be defined as in Program 7.20 on page 41.
We can write this in a more ‘logical’ way as:

n : Nat
Zero : Nat Succ(n) : Nat

Rule 17.1: Natural Numbers, as a type

This inductive definition justifies an induction rule, to allow us to prove C(n) for arbitrary n : Nat:


[C(m)]
·
·
·
C(Zero) C(Succ(m))
C(n)

Rule 17.2: Mathematical induction

Paulson [61] calls this rule ‘mathematical induction’. It is probably better called ‘structural induction
on Nat’. Informally we can read it as:
If we can show that C holds of Zero, and we can show that if C holds of m then it holds
of Succ(m), then C holds of arbitrary n.
We can write a similar rule for lists:

[C(t)]
·
·
·
C([]) C(Cons(h, t))
C(l)

Rule 17.3: Structural induction on lists

We have been a bit cavalier here: we really should say what h, t, and l are. Rule 17.3 bears a
strong resemblance to Rule 11.8 on page 96.
Structural induction is not the only form of induction we can have. We can use total induction, where
we are allowed to assume that C holds for all m < n:

[(∀m < n)C(m)]


·
·
·
C(n)
Note: n fresh
C(p)

Rule 17.4: Total induction on Nat

Although total induction looks more powerful than structural induction, this is not the case: a proof
by total induction can be simulated by a structural induction whose induction predicate is (∀m < n)C(m).
We are assuming here that < is the ‘obvious’ order on Nat, but we can write an induction rule using
any well-founded order ≺. An order is well-founded if every chain is finite, where a chain is a sequence
x1 ≻ x2 ≻ x3 ≻ x4 ≻ ….
The induction step in the proof corresponds to recursion in the function definition. In the cases of
structural and mathematical induction we do not need to perform a separate proof that our function
terminates, as this is guaranteed by <. When we use well-founded induction the proof that the
relation ≺ is well-founded is exactly the proof that our function terminates. Thus quicksort terminates
because the ordering of lists on length is well-founded.

17.2.2 An example
To illustrate proof by structural induction we prove that every natural number is either even or odd. First
of all we need to clarify what we mean by ‘even’ and ‘odd’. A number n is even if there is a number
m such that n = 2m, and it is odd if there is a number m such that n = 2m + 1.1 So now we are trying
to prove:
(∀n : Nat)(∃m : Nat)(n = 2m ∨ n = 2m + 1)
The first step in the proof is ∀ introduction, leaving us to prove:
(∃m : Nat)(k = 2m ∨ k = 2m + 1)
1 We are assuming that we understand ‘1’, ‘2’, equality, addition and multiplication!


for arbitrary k. Now we use the induction rule, to give ourselves two sub-problems:
(∃m : Nat)(Zero = 2m ∨ Zero = 2m + 1)
and:
[(∃m : Nat)(p = 2m ∨ p = 2m + 1)]
·
·
·
(∃q : Nat)(Succ(p) = 2q ∨ Succ(p) = 2q + 1)
The base case is easy:

Arithmetic
Zero = 2 ∗ Zero
∨ I left
Zero = 2 ∗ Zero ∨ Zero = 2 ∗ Zero + 1
∃I
(∃m : Nat)(Zero = 2m ∨ Zero = 2m + 1)

Rule 17.5: (∃m : Nat)(Zero = 2m ∨ Zero = 2m + 1)

The induction step is slightly harder:

[p = 2m ∨ p = 2m + 1] Π1 Π2
∨E
[(∃m : Nat)(p = 2m ∨ p = 2m + 1)] (∃q : Nat)(Succ(p) = 2q ∨ Succ(p) = 2q + 1)
∃E
(∃q : Nat)(Succ(p) = 2q ∨ Succ(p) = 2q + 1)

Rule 17.6: Induction step

where the sub-proofs Π1 and Π2 are:

[p = 2m]
=
Succ(p) = Succ(2m)
Arithmetic
Succ(p) = 2m + 1
∨ I right
Succ(p) = 2m ∨ Succ(p) = 2m + 1
∃I
(∃q : Nat)(Succ(p) = 2q ∨ Succ(p) = 2q + 1)

Rule 17.7: Sub-proof Π1

and:

[p = 2m + 1]
=
Succ(p) = Succ(2m + 1)
Arithmetic
Succ(p) = 2m + 2
Arithmetic
Succ(p) = 2(m + 1)
∨ I left
Succ(p) = 2(m + 1) ∨ Succ(p) = 2(m + 1) + 1
∃I
(∃q : Nat)(Succ(p) = 2q ∨ Succ(p) = 2q + 1)

Rule 17.8: Sub-proof Π2

So these two subproofs complete Proof 17.6. Thus both the base case and the induction step have
been established. 
From this proof we observe a number of points:


• formal proofs can be tedious, and we should use a mechanical proof assistant;

• the structure of the proof reflects the structure of a recursive algorithm which will compute m
where n = 2m ∨ n = 2m + 1;

• we could get our mechanical proof assistant to extract the algorithm.

The algorithm that we are constructing looks like:

data OR a b = Inl a
| Inr b

when :: (a -> b) -> (c -> b) -> OR a c -> b

when d e (Inl l) = d l
when d e (Inr r) = e r

evenorodd :: Int -> OR Int Int

evenorodd 0 = Inl 0
evenorodd n = when (\l -> Inr l)
(\r -> Inl (r + 1))
(evenorodd (n - 1))

Program 17.1: An extracted program

The first clause is generated from the base case, Proof 17.5. The second clause is generated from
the induction step. The function when arises from the use of the rule for ∨ elimination in Proof 17.6.
The uses of Inr and Inl arise from the uses of ∨ introduction on the right and left in the sub-proofs
Π1 and Π2 respectively. The call to evenorodd (n - 1) corresponds to the assumption made in the
induction step.
evenorodd behaves like this:

Gofer?
evenorodd 0
Inl 0 :: OR Int Int
Gofer?
evenorodd 1
Inr 0 :: OR Int Int
Gofer?
evenorodd 13
Inr 6 :: OR Int Int

Thus the algorithm does not merely tell us that the number is even or odd, but also provides us with
a witness. In this case we are probably not terribly bothered about the witness, but, for example, in the
case of a parser the witness is the parse tree for the string, or the error message that informs us that
the string is not generated by the grammar.
In the current case we might look at the function we have constructed and decide to throw away the
witnessing information. We might decide that evenorodd should only return either Inl "even" or
Inr "odd". Thinking for a little bit allows us to see that we are returning one or other of two distinct
values. We already have a perfectly good type with two distinct values in it, called Bool, so let’s use it.
Now our program will look like:


evenorodd :: Int -> Bool

evenorodd 0 = True
evenorodd n = if (evenorodd (n - 1))
then False
else True

Program 17.2: A variant of our extracted program

The second clause of evenorodd is a clumsy way to write not (evenorodd (n - 1)), so we
have now:

evenorodd :: Int -> Bool

evenorodd 0 = True
evenorodd n = not (evenorodd (n - 1))

Program 17.3: Another variant of our extracted program

17.3 Summary
In this Chapter we presented a quick introduction to reasoning about functional programs. The steps
involved were:
• a problem was expressed informally;
• this was formalised as a specification, which we treated as a proposition to prove;
• we proved the proposition, using induction;
• we inspected the proof, and from it extracted an algorithm;

• we performed some transformations on this algorithm, mostly to throw some information away
that we did not consider important.


Rule 17.9: Proof (typeset sideways in the original). The full derivation assembles the base case
(Rule 17.5) and the induction step (Rules 17.6–17.8) to conclude, by induction,
(∃m : Nat)(k = 2m ∨ k = 2m + 1), and then, by ∀ introduction,
(∀n : Nat)(∃m : Nat)(n = 2m ∨ n = 2m + 1).

18 Monads
In this Chapter we will look at monads in functional programming. The notion of a monad1 comes to
us from category theory (see, for example, [46]) where they are also known as ‘standard constructions’
or ‘triples’. The precise definition of a monad that computer scientists use is very slightly different to that
employed by category theorists. Monads escaped from category theory into computer science via the
work of Eugenio Moggi, an Italian then working at Edinburgh University. Monads were then adopted
with enthusiasm by Phil Wadler (see, for example, [83, 82]), an American then working at Glasgow
University.

18.1 A categorical perspective on monads


Since monads come from category theory we begin by looking at them from a categorical perspective.
This necessitates us discussing category theory, and really just distracts us from making progress.
We note very briefly that Lambek and Scott [46] give the following definition of a monad:2
A triple (T, η, µ) on a category A consists of:

• a functor3 T ;

• a natural transformation η : 1A → T ; and

• a natural transformation µ : T² → T

with the following properties:

µ ◦ Tη = 1T
1T = µ ◦ ηT
µ ◦ µT = µ ◦ Tµ

So, a monad is a functor and two natural transformations which obey some laws. The relevance of
this notion to computer science is not immediately obvious.
The functor T will turn out to be a type, and the natural transformations η and µ will turn out to be
functions. The monad laws will turn out to be properties of the functions.
The insight that Moggi had was that a monad allows us to treat computations as things to manipulate.
At this point we decide to omit the details of the original path from category theory to computer
science, and turn immediately to a computer scientist’s view of monads.

18.2 A computer scientist’s view of monads


Now we follow the treatment from Chapter 18 of [77]. Thompson introduces the IO type family before
moving on to monads, so we will follow his example.
Recall that the cunning thing about monads is that they let us deal with values, and with computations
which yield values.
1 Not to be confused with Leibniz's ‘windowless monads’. Apparently, ‘monad’ is a pun on ‘monoid’.
2 Which they call a ‘triple’.
3 ‘Functor’ has a technical meaning in category theory, and its use here should not be confused with the use of ‘functor’ when

discussing the syntax of e.g. Prolog.


18.2.1 Input/Output in functional programming


Recall that the meaning of an expression is its value. As each expression should have one meaning, so
it should have one value. Now, suppose we decide to add I/O naïvely to a functional language. One
obvious model is that we treat the input as a stream of values, and we have functions like getInt
:: Int, which takes the first item from the input stream. This approach works, in that it enables us
to implement a functional language with I/O, but it has a high price. We now have an expression
getInt which has (or may have) more than one value. This has all sorts of unpleasant consequences.
For example we can no longer go safely from (\ x -> x == x) getInt to True. (\ x -> x
== x) getInt reduces to getInt == getInt, and this may turn out to be 0 == 1. The normal
good behaviour of expressions in functional programming languages in expressions involving equality
is normally called referential transparency, and sometimes referred to as Liebniz’ Law. We can recover
referential transparency if we decide that getInt really has type World -> (World, Int). This is
fine, but now we are going to have to add the World as a parameter to every function that might be
involved in I/O, or maybe have it as an extra unwritten parameter everywhere (like a global variable!).
Copying the world around all over the place might strike us as hopeless since very little of the world
will actually be altered by any given function. The monadic approach is an attempt to tread a path
between these two extremes. First we look at the IO types.
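Before doing so, a hedged sketch may make the cost of explicit world-passing concrete. The type World and the function getIntW below are invented purely for illustration; they are not real Haskell I/O:

-- A hedged illustration of world-passing, under invented names.
data World = World                 -- stands for the state of the outside world

getIntW :: World -> (World, Int)
getIntW w = (w, 0)                 -- a stub; a real version would consume input

-- Every caller must now plumb the world through by hand:
twoInts :: World -> (World, (Int, Int))
twoInts w0 = let (w1, x) = getIntW w0
                 (w2, y) = getIntW w1
             in (w2, (x, y))

Referential transparency is restored – getIntW applied to distinct worlds may legitimately return distinct values – but at the price of threading a World through every definition.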

18.2.2 IO
The basic intuition behind the IO types is that a value of IO a is an action (a program) which performs
some I/O and returns a value of type a.

The type ()
Haskell has a type () which has exactly one value, confusingly also called (). As a datatype () is
pretty impoverished as there is not much data in the value (). However a value of type IO () is an
I/O action which returns only () as its value.

Reading input
We can read input using the standard functions getLine :: IO String and getChar :: IO Char.
These behave as we would expect them to.

Writing output
We can write output using putStr :: String -> IO ().
Recall that Haskell provides a class Show, and show is defined as:

show :: Show a => a -> String

Program 18.1: show

We can use show to implement a print function:

print :: Show a => a -> IO ()

print = putStr . (++ "\n") . show

Program 18.2: print

return
There is a builtin function return :: a -> IO a, which does no I/O, but simply returns a value
of type a. return x simply returns x. This is not quite pointless, as we will see later.


The do notation
We now introduce the do notation. do allows us to compose I/O actions in a way very reminiscent of
writing imperative programs.4
We can use do to put a string on a line:

putStrLn :: String -> IO ()

putStrLn str = do putStr str
                  putStr "\n"

Program 18.3: putStrLn

We can write a function to write the string n times:

putNStr :: Int -> String -> IO ()

putNStr n str = if n <= 1
                then putStrLn str
                else do
                  putStr str
                  putNStr (n - 1) str

Program 18.4: putNStr

This function looks remarkably like an imperative loop!

<-
We can use <- to name the result of an I/O action. For example we can write a program to echo lines:

echo :: IO ()

echo = do theline <- getLine
          putStrLn theline

Program 18.5: echo

We can think of the process of naming as a form of assignment. Crucially <- only permits single
assignment: this is not the same as the destructive assignment that we (you!) are familiar with from
imperative programming.

>>= and >>
We can define the function do using an operator (>>=) :: IO a -> (a -> IO b) -> IO b.5 The
operator >>= is a sequencing operator. It is called bind. It passes the result of the first operation to the
second one. We can use >>= in place of do:

echo2 :: IO ()

echo2 = getLine >>= \theline ->
        putStrLn theline

Program 18.6: echo2


4 The irony that one of the most advanced topics in functional programming is allowing ourselves to write imperative programs
should not be overlooked.
5 Thompson [77] has the wrong type for this function on p 400.


The do notation is often more convenient to use, but we can think of it as being implemented by >>=.
We can define the operator (>>) :: IO a -> IO b -> IO b as:

(>>) :: IO a -> IO b -> IO b

a >> b = a >>= \_ -> b

Program 18.7: >>

The effect of >> is to throw the value returned by its first operand away.
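For example, a small sketch using putStr and putStrLn from above (greet is our own name):

-- greet prints two strings in sequence; the () returned by the
-- first action is discarded by >>.
greet :: IO ()
greet = putStr "Hello, " >> putStrLn "world"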

Summary of the I/O operators


We have a type IO a, and we have defined some functions which use values of IO a, including:
• return :: a -> IO a
• (>>=) :: IO a -> (a -> IO b) -> IO b
• (>>) :: IO a -> IO b -> IO b
We can also observe that the type IO a provides us with an imperative I/O programming language
on top of Haskell.

Further discussion
For a much fuller treatment of I/O in functional programming see [31].

18.2.3 Generalising from IO


I/O is clearly important in itself. There are other situations which resemble I/O in that there is some
notion of state involved:
• Pseudo-random number generators;
• Errors: instead of using the Maybe types (see 7.18 on page 40) we could have an error state;
• Iteration: Thompson shows how to use do and <- to write looping I/O programs;
• When implementing substitution in the λ-calculus we pass around a supply of fresh variables.
We can think of this as being like a state value;
• Imperative programming can be thought of as programming with state;
• ...
Clearly at this point the key to the generalisation is going to be revealed to be treating I/O as a
monad. The key insight is to interpret IO as a functor, and the I/O operators as natural transformations,
and then to express the monad laws as properties that the I/O operators must obey. At this point we
present the Haskell constructor class Monad:

class Monad m where
  (>>=)  :: m a -> (a -> m b) -> m b
  return :: a -> m a
  (>>)   :: m a -> m b -> m b
  fail   :: String -> m a

Program 18.8: Monad m

This defines a class of type constructors. The default definition of >> is as in Program 18.7, and that
for fail is:


fail :: String -> m a

fail s = error s

Program 18.9: fail

We still have to present the monad laws. These are most easily expressed using a derived
composition operator >@>, called Kleisli composition:

(>@>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)

f >@> g = \x -> (f x) >>= g

Program 18.10: >@>

The monad laws can now be framed as:

return >@> f = f
f >@> return = f

(f >@> g) >@> h = f >@> (g >@> h)

Program 18.11: Monad laws

When we declare a monad we have to prove the monad laws for ourselves – Haskell is incapable of
checking these automatically.
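As a hedged sketch of what such a proof looks like, here is the first law checked for the trivial monad of Program 18.12 below (where return = id and x >>= f = f x), written as equational reasoning in comments:

--   (return >@> f) x
-- = return x >>= f     -- definition of >@>
-- = x >>= f            -- return = id
-- = f x                -- definition of >>=
--
-- so return >@> f = f. The other two laws go through just as directly.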

18.3 Examples of monads


Not only IO, but other things also form monads... Now we list some examples of monads, again
following Thompson.

18.3.1 The trivial (or identity) monad


We stated above that monads let us deal with computations which yield values. The most trivial
computation is no computation at all, and from this we get the trivial or identity monad.
The type m a is just a itself, and we define:

-- (>>=)  :: a -> (a -> b) -> b
-- return :: a -> a

x >>= f = f x

return = id

Program 18.12: Trivial >>= and return

18.3.2 The list monad


Suppose we want to represent non-deterministic computation. Suppose that we have a computation
which has many possible results. We might use a list to collect these results, and we can indeed define
a list monad, using:


-- (>>=)  :: [a] -> (a -> [b]) -> [b]
-- return :: a -> [a]
-- fail   :: String -> [a]

instance Monad [] where
  xs >>= f = concat (map f xs)
  return x = [x]
  fail _   = []

Program 18.13: >>=, return and fail for the list monad

If a computation fails then it gives us the empty list of possible results.


If we have a computation we can make a collection of it by creating a singleton list.
If we have a collection of possible results, and a function, then we can create a collection of new
results for the collection formed by applying the function.
Notice that for lists we do not rely on the default fail.
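As a hedged usage sketch (pairs is our own name), here is a small nondeterministic computation in the list monad: every choice for x is combined with every choice for y.

pairs :: [(Int, Char)]
pairs = [1, 2, 3] >>= \x ->
        "ab"      >>= \y ->
        return (x, y)
-- pairs = [(1,'a'),(1,'b'),(2,'a'),(2,'b'),(3,'a'),(3,'b')]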

18.3.3 The parsing monad


The list monad, and its interpretation, reminds us of the treatment that we had before about parsing.
We can indeed define a parsing monad.

data MParser a b = MParser (Parser a b)

-- (>>=)  :: MParser a b -> (b -> MParser a c) ->
--           MParser a c
-- return :: b -> MParser a b
-- fail   :: String -> MParser a b

instance Monad (MParser a) where
  return q = MParser (succeed q)
  fail s   = MParser Parsers.fail
  (MParser p) >>= f =
    MParser (\s ->
      concat [mparse (f x) rest | (rest, x) <- p s])

mparse (MParser p) = p

Program 18.14: >>=, return and fail for the MParser monad

18.3.4 The Maybe monad


We can define the Maybe monad using:

-- (>>=)  :: Maybe a -> (a -> Maybe b) -> Maybe b
-- return :: a -> Maybe a
-- fail   :: String -> Maybe a

instance Monad Maybe where
  (Ok x) >>= f = f x
  Error  >>= f = Error
  return       = Ok
  fail _       = Error

Program 18.15: >>=, return and fail for the Maybe monad
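As a hedged usage sketch (safediv and divtwice are our own names, and we use the Ok and Error constructors of the notes' Maybe type): failure propagates through a chain of computations without any explicit case analysis.

safediv :: Int -> Int -> Maybe Int
safediv _ 0 = Error
safediv x y = Ok (x `div` y)

-- Both divisions are chained with >>=; an Error from either one
-- propagates to the final result automatically.
divtwice :: Int -> Int -> Int -> Maybe Int
divtwice x y z = safediv x y >>= \q -> safediv q z
-- divtwice 100 5 2 = Ok 10;  divtwice 100 0 2 = Error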


18.4 Summary
In this Chapter we have introduced and discussed monads from two perspectives: as a concept from
category theory, and as a way to generalise the treatment of I/O to the structuring of other computa-
tions.


19 Monads example: an evaluator
19.1 Introduction
We build an evaluator, following Ch 10 of [6].
Bird’s example illustrates how we can use monads to help structure computations.

19.2 Without monads


First we work without monads. We start off with exceptionally simple terms, and an exceptionally simple
evaluator. No monads to be seen anywhere.

data Term = Con Int
          | Div Term Term

eval :: Term -> Int

eval (Con x)   = x
eval (Div n d) = (eval n) `div` (eval d)

Program 19.1: In the beginning

And we get this sort of behaviour:

Birdeval> eval (Div (Div (Con 66) (Con 3)) (Con 11))
2
Birdeval> eval (Div (Con 1) (Con 0))

Program error: {primDivInt 1 0}

We will follow Bird and add extra functionality to our evaluator, first directly, and second by using a
monad.
We will add:

• exception handling

• a counter for the number of divisions done

• an output of the trace

19.2.1 Handling exceptions without a monad


When we get an error we want our evaluator to report it, and to terminate normally, rather than
crashing. Naïvely we might define an error type as:

type Mess = String

data Error a = OK a
| Error Mess

Program 19.2: An error type


However we are going to think of ourselves as handling exceptions and we define:

type Exception = String

data Exc a = Return a
           | Raise Exception

Program 19.3: An exception type

eval :: Term -> Exc Int

eval (Con i)   = Return i
eval (Div n d) = h (eval n)
  where
    h (Raise e)  = Raise e
    h (Return x) = h' (eval d)
      where h' (Raise e') = Raise e'
            h' (Return y) =
              if y == 0
              then Raise "division by 0"
              else Return (x `div` y)

Program 19.4: Exception handling evaluator

Now we get this behaviour:

Birdeval> eval (Div (Div (Con 66) (Con 3)) (Con 11))
ERROR - Cannot find "show" function for:
*** Expression : eval (Div (Div (Con 66) (Con 3)) (Con 11))
*** Of type : Exc Int

Doh! We should have made Exc a an instance of Show:

instance Show a => Show (Exc a) where
  show (Raise e)  = "Exception: " ++ e
  show (Return v) = "Value: " ++ show v

Program 19.5: Showing Exc a

Now we get this behaviour:

Birdeval> eval (Div (Div (Con 66) (Con 3)) (Con 11))
Value: 2
Birdeval> eval (Div (Con 1) (Con 0))
Exception: division by 0
Birdeval> eval (Div (Div (Con 12) (Con 0)) (Con 99))
Exception: division by 0

19.2.2 Adding state without a monad


Now, suppose we want to vary the original evaluator to tell us how many divisions were done in the
evaluation. In an imperative language we might use a global variable, which we would increment
whenever we did a division. In a functional language we must pass a parameter around which we use
to accumulate the number of divisions done. We are using an instance of a more general notion of
state, so we define:


type State = Int

newtype St a = MkSt (State -> (a, State))

Program 19.6: A state transformer type

A value of St a is a state transformer, i.e. a function which takes a state and returns a value paired
with a (new) state. In order to make use of a value of type St a we define

applyst :: St a -> State -> (a, State)

applyst (MkSt f) s = f s

Program 19.7: Applying a state transformer

We remember to make St a an instance of Show:

instance Show a => Show (St a) where
  show f = "Value: " ++ show x ++ " Count: " ++ show s
    where (x, s) = applyst f 0

Program 19.8: Showing St a

We can use state transformers to handle state:

eval (Con i)   = MkSt (\s -> (i, s))
                 -- MkSt f where f s = (i, s)
eval (Div n d) = MkSt f
  where
    f s = (x `div` y, s'' + 1)
      where (x, s')  = applyst (eval n) s
            (y, s'') = applyst (eval d) s'

Program 19.9: Handling state with a state transformer

Now we get this behaviour:

Birdeval> eval (Div (Div (Con 66) (Con 3)) (Con 11))
Value: 2 Count: 2
Birdeval> eval (Div (Div (Con 12) (Con 0)) (Con 99))
Value:
Program error: {primDivInt 12 0}

19.2.3 Adding traces without a monad


The third thing we do is add a trace facility. First we define a type to hold a value and some output:

type Output = String

newtype Out a = MkOut (Output, a)

instance Show a => Show (Out a) where
  show (MkOut (outa, a)) = outa ++ "Value " ++ show a

Program 19.10: A type to hold a value and some output


Then we define a function to format some output (after remembering to declare Term as an instance
of Show):

cr = "\n"

line :: Term -> Int -> Output

line t x = "Evaluating " ++ show t
           ++ " gives " ++ show x ++ cr

Program 19.11: Formatting output

And now we can define an evaluator with a trace:

eval :: Term -> Out Int

eval (Con i)   = MkOut (line (Con i) i, i)
eval (Div n d) =
  MkOut (outn ++ outd ++ line (Div n d) a, a)
  where
    MkOut (outn, n') = eval n
    MkOut (outd, d') = eval d
    a = n' `div` d'

Program 19.12: Evaluating with a trace

Now we get this behaviour:


Birdeval> eval (Div (Div (Con 66) (Con 3)) (Con 11))
Evaluating Con 66 gives 66
Evaluating Con 3 gives 3
Evaluating Div (Con 66) (Con 3) gives 22
Evaluating Con 11 gives 11
Evaluating Div (Div (Con 66) (Con 3)) (Con 11) gives 2
Value 2
Birdeval> eval (Div (Div (Con 12) (Con 0)) (Con 99))
Evaluating Con 12 gives 12
Evaluating Con 0 gives 0
Evaluating Div (Con 12) (Con 0) gives
Program error: {primDivInt 12 0}

19.3 Using monads


Now we introduce monads into the game. The basic monadic evaluator looks like:

eval :: Monad m => Term -> m Int

eval (Con i)   = return i
eval (Div n d) =
  do x <- eval n
     y <- eval d
     return (x `div` y)

Program 19.13: The basic monadic evaluator, using do

Or, if you prefer bind:


eval (Con i)   = return i
eval (Div n d) =
  eval n >>= (\x ->
  eval d >>= (\y ->
  return (x `div` y)))

Program 19.14: The basic monadic evaluator, using >>=

19.3.1 The identity monad

newtype Id a = MkId a

instance Monad Id where
  return x       = MkId x
  (MkId x) >>= f = f x

instance Show a => Show (Id a) where
  show (MkId v) = "Value: " ++ show v

Program 19.15: The identity monad

Defining evalID is trivial:

evalID :: Term -> Id Int

evalID = eval

Program 19.16: evalID

We get this behaviour:

Birdeval> evalID (Div (Div (Con 66) (Con 3)) (Con 11))
Value: 2

19.3.2 The exception monad

instance Monad Exc where
  return x         = Return x
  (Raise e)  >>= _ = Raise e
  (Return v) >>= f = f v

Program 19.17: Exc as a monad

Bird defines a new function, raise :: Exception -> Exc a

raise :: Exception -> Exc a

raise = Raise

Program 19.18: raise :: Exception -> Exc a

In this case raise is a trivial function, but for other monads we may define less trivial functions.
Now we need to define a new monadic evaluator, evalEx :: Term -> Exc Int


evalEx (Con i)   = return i
evalEx (Div n d) =
  do x <- evalEx n
     y <- evalEx d
     if y == 0
       then raise "division by 0"
       else return (x `div` y)

Program 19.19: evalEx

NB: Bird has this function wrong.


If we were afraid of do:

evalEx (Con i)   = return i
evalEx (Div n d) =
  evalEx n >>= (\x ->
  evalEx d >>= (\y ->
  if y == 0
    then raise "division by 0"
    else return (x `div` y)))

Program 19.20: evalEx, using >>=

We get this behaviour:

Birdeval> evalEx (Div (Div (Con 66) (Con 0)) (Con 11))
Exception: division by 0
Birdeval> evalEx (Div (Div (Con 66) (Con 3)) (Con 11))
Value: 2
Birdeval> evalEx (Div (Div (Con 66) (Con 3)) (Con 0))
Exception: division by 0

19.3.3 The state monad

instance Monad St where
  return x = MkSt f where f s = (x, s)
             -- MkSt (\s -> (x, s))
  p >>= q  = MkSt f
    where
      f s = applyst (q x) s'
        where
          (x, s') = applyst p s

Program 19.21: The state monad

Next we define a function specific to the state monad, which just increments a counter:

tick :: St ()

tick = MkSt (\s -> ((), s + 1))

-- Bird has
-- tick = MkSt f where f s = ((), s + 1)

Program 19.22: Updating a counter


The evaluator just has to tick over when it does a division:

evalSt (Con i)   = return i
evalSt (Div n d) =
  do x <- evalSt n
     y <- evalSt d
     tick
     return (x `div` y)

Program 19.23: The state evaluator

or

evalSt (Con i)   = return i
evalSt (Div n d) =
  evalSt n >>= (\x ->
  evalSt d >>= (\y ->
  tick >>
  return (x `div` y)))

Program 19.24: The state evaluator, using >>=

We get this behaviour:

Birdeval> evalSt (Div (Div (Con 66) (Con 3)) (Con 11))
Value: 2 Count: 2

19.3.4 The output monad


And of course Out is a monad:

instance Monad Out where
  return x = MkOut ("", x)
  p >>= q  = MkOut (ox ++ oy, y)
    where
      MkOut (ox, x) = p
      MkOut (oy, y) = q x

Program 19.25: The output monad

And we have a function specific to this monad:

out :: Output -> Out ()

out o = MkOut (o, ())

Program 19.26: The out function

The evaluator is then:


evalOut (Con i) =
  do
    out (line (Con i) i)
    return i
evalOut (Div n d) =
  do
    x <- evalOut n
    y <- evalOut d
    out (line (Div n d) (x `div` y))
    return (x `div` y)

Program 19.27: The tracing evaluator

evalOut (Con i) =
  out (line (Con i) i) >>
  return i
evalOut (Div n d) =
  evalOut n >>= (\x ->
  evalOut d >>= (\y ->
  out (line (Div n d) (x `div` y)) >>
  return (x `div` y)))

Program 19.28: The tracing evaluator, using >>=

We get this behaviour:


Birdeval> evalOut (Div (Div (Con 66) (Con 3)) (Con 11))
Evaluating Con 66 gives 66
Evaluating Con 3 gives 3
Evaluating Div (Con 66) (Con 3) gives 22
Evaluating Con 11 gives 11
Evaluating Div (Div (Con 66) (Con 3)) (Con 11) gives 2
Value 2

19.4 Summary
We have used monads to implement different variants of an evaluator.

20 Worked Example: writing an evaluator for λ terms
20.1 β reduction
For the pure, untyped λβ-calculus we have the following (big step) reduction rules:

x ▷β x

Rule 20.1: Reducing a variable

M ▷β M′
λx.M ▷β λx.M′

Rule 20.2: Reducing an abstraction

M ▷β PQ    N ▷β N′
MN ▷β PQN′

Rule 20.3: Reducing an application, 1

M ▷β x    N ▷β N′
MN ▷β xN′

Rule 20.4: Reducing an application, 2

M ▷β λx.P    [N/x]P ▷β P′
MN ▷β P′

Rule 20.5: Reducing an application, 3

20.2 Turning this into Haskell


Let’s see what happens if we take a naïve approach to turning these rules into Haskell.
First we define a datatype for λ terms:

data Term var = Var var
              | App (Term var) (Term var)
              | Abs var (Term var)
              deriving Show

Program 20.1: A type for λ-terms


We will write a function called red :: Term a -> Term a. We expect that red will
have three clauses, and that the third clause will itself be analysed into three cases, which we will
handle by using an auxiliary function red' :: (Term a) -> (Term a) -> Term a.

red :: Term a -> Term a

red (Var x)   = Var x
red (Abs x m) = Abs x (red m)
red (App m n) = let m' = red m
                in red' n m'

Program 20.2: Naïvely reducing a term

The function red' lets us deal with the case of reducing a possible redex.

-- Second arg to red' is already in NF

red' n (Var x)   = App (Var x) (red n)
red' n (App p q) = App (App p q) (red n)
red' n (Abs x p) = red (subst n x p)

Program 20.3: Naïvely reducing an application term

All we need to do now is define subst and we are done.

20.3 Rules for substitutions


The rules for substitution are:

[N/x]x −→ N

Rule 20.6: Substituting in a variable, 1

x ≠ y
[N/x]y −→ y

Rule 20.7: Substituting in a variable, 2

[N/x]P −→ P′    [N/x]Q −→ Q′
[N/x]PQ −→ P′Q′

Rule 20.8: Substituting in an application

[N/x]λx.P −→ λx.P

Rule 20.9: Substituting in an abstraction, 1

[N/x]P −→ P′
x ≠ y, y ∉ FV(N)
[N/x]λy.P −→ λy.P′

Rule 20.10: Substituting in an abstraction, 2

[z/y]P −→ P′    [N/x]P′ −→ P″
x ≠ y, y ∈ FV(N), z fresh
[N/x]λy.P −→ λz.P″

Rule 20.11: Substituting in an abstraction, 3

20.3.1 Substitution in Haskell


Writing a substitution function will be very much like writing the reduce function. The two sets of rules
are, after all, very similar. We expect to be able to write something like this:

subst n x (Var y)
  | x == y    = n
  | otherwise = (Var y)
subst n x (App p q) = App (subst n x p) (subst n x q)
subst n x (Abs y p)
  | x == y             = (Abs y p)
  | not (y `freein` n) = Abs y (subst n x p)
  | otherwise          = ?????

Program 20.4: Oops!

20.3.2 What a state!


So there is a problem here: we cannot generate a fresh variable from nothing. We need to make use
of some notion of a state.

20.4 Solution 1: following our noses


The first solution is that we have a way to generate fresh variables, and pass this around as a parameter.
The simplest way is just to have a lazy list of names (which we must guarantee to be fresh). When we
need a fresh variable we just take the head of the list, and use the tail of the list as the stock of
guaranteed fresh variables. Now our code needs to involve a lot of bookkeeping.
We will define reduce in terms of red.

reduce :: Eq a => [a] -> Term a -> Term a

reduce frees = snd . (red frees)

Program 20.5: Reduce

The red function has to return a pair consisting of a stock of fresh variables and a term. This function
is a lot more readable if we make use of lets. There is now a lot of bookkeeping involved in our code:
only in one place in the substitution function do we actually make use of the stock of fresh variables.


red :: Eq a => [a] -> Term a -> ([a], Term a)

red frees (Var x)   = (frees, Var x)
red frees (Abs x m) = let (frees', m') = red frees m
                      in
                        (frees', Abs x m')
red frees (App m n) = let (frees', m') = red frees m
                      in
                        red' frees' n m'

Program 20.6: Reducing a term

red' :: Eq a => [a] -> Term a -> Term a -> ([a], Term a)

red' frees n (Var x)   = let (frees', n') = red frees n
                         in
                           (frees', App (Var x) n')
red' frees n (App p q) = let (frees', n') = red frees n
                         in
                           (frees', App (App p q) n')
red' frees n (Abs x p) =
  let (frees', p') = subst frees n x p
  in red frees' p'

Program 20.7: Reducing an application term

We could, of course, have used Eq a => ([a], Term a) -> ([a] , Term a) for the type
of red.

subst :: Eq a => [a] -> Term a -> a -> Term a
         -> ([a], Term a)

subst frees n x (Var y)
  | x == y    = (frees, n)
  | otherwise = (frees, Var y)
subst frees n x (App p q) =
  let (frees', p')  = subst frees n x p
      (frees'', q') = subst frees' n x q
  in
    (frees'', App p' q')
subst frees n x (Abs y p)
  | x == y = (frees, Abs y p)
  | not (y `freein` n) =
      let (frees', p') = subst frees n x p
      in
        (frees', Abs y p')
  | otherwise =
      let (z:frees')      = frees
          (frees'', p')   = subst frees' (Var z) y p
          (frees''', p'') = subst frees'' n x p'
      in
        (frees''', Abs z p'')

Program 20.8: Substitution


freein :: Eq a => a -> Term a -> Bool

freein v (Var w)   = v == w
freein v (App p q) = (freein v p) || (freein v q)
freein v (Abs x q) = not (v == x) && (freein v q)

Program 20.9: freein
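Putting the pieces together, here is a hedged usage sketch (freshnames and example are our own names, for illustration only): reducing (λxy.x)y forces a renaming, so we supply an infinite stock of names that we promise are fresh.

freshnames :: [String]
freshnames = ["v" ++ show n | n <- [0 :: Int ..]]

example :: Term String
example = reduce freshnames
                 (App (Abs "x" (Abs "y" (Var "x"))) (Var "y"))
-- example = Abs "v0" (Var "y"): the bound y was renamed to v0,
-- so the free y was not captured.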

20.4.1 Comments
This code works, but much of it is just concerned with the passing around of the stock of fresh variables.
As we might have anticipated, only in the subst function do we make use of the stock of variables.
We have also made very extensive use of lets, to aid program readability.

20.5 Solution 2: Use CPS


Generally, anywhere that we use lots of lets is a prime candidate for converting to CPS. Just for fun we
will use the type lifted version of Eq a => ([a], Term a) -> ([a], Term a) as the type for
the CPS'ed version of red. We will re-order the arguments to cpsred' to ensure that cpsred' (x
:: Term a) has the same type as cpsred. We will also resist the temptation to convert absolutely
everything to CPS.

reduce :: Eq a => [a] -> Term a -> Term a

reduce frees t = cpsred (frees, t) snd

Program 20.10: Reduce, again

cpsred :: Eq a => ([a], Term a) ->
          (([a], Term a) -> t) -> t

cpsred (frees, Var x) k = k (frees, Var x)
cpsred (frees, Abs x m) k =
  cpsred (frees, m) (\(fs, t) ->
    k (fs, Abs x t))
cpsred (frees, App m n) k =
  cpsred (frees, m) (\(fs, m') ->
    cpsred' m' (fs, n) k)

Program 20.11: cpsred

cpsred' :: Eq a => Term a -> ([a], Term a) ->
           (([a], Term a) -> t) -> t

cpsred' (Var x) (frees, n) k =
  cpsred (frees, n) (\(fs, n') ->
    k (fs, App (Var x) n'))
cpsred' (App p q) (frees, n) k =
  cpsred (frees, n) (\(fs, n') ->
    k (fs, App (App p q) n'))
cpsred' (Abs x p) (frees, n) k =
  cpssubst n x (frees, p) (\(fs, p') ->
    cpsred (fs, p') k)

Program 20.12: cpsred'

We do a little bit of re-arrangement of the arguments to get the type of cpssubst to look like we
want it to.


cpssubst :: Eq a => Term a -> a -> ([a], Term a) ->
            (([a], Term a) -> t) -> t

cpssubst n x (frees, Var y) k
  | x == y    = k (frees, n)
  | otherwise = k (frees, Var y)
cpssubst n x (frees, App p q) k =
  cpssubst n x (frees, p) (\(fs, p') ->
    cpssubst n x (fs, q) (\(fs', q') ->
      k (fs', App p' q')))
cpssubst n x (frees, Abs y p) k
  | x == y = k (frees, Abs y p)
  | not (y `freein` n) =
      cpssubst n x (frees, p) (\(fs, p') ->
        k (fs, Abs y p'))
  | otherwise =
      usefrees frees (\(fr, frs) ->
        cpssubst (Var fr) y (frs, p) (\(frs', p') ->
          cpssubst n x (frs', p') (\(frs'', p'') ->
            k (frs'', Abs fr p''))))

Program 20.13: CPS version of subst

There is one difference here, in that we have had to write usefrees:

usefrees :: [a] -> ((a, [a]) -> t) -> t

usefrees (f:fs) k = k (f, fs)

Program 20.14: usefrees

We make use of freein from Program 20.9.

20.5.1 Comments
This code suffers from a real burden of notation. We could, perhaps, have improved readability by
using let expressions to get code like Program 20.15. Once again we have spent a lot of syntax on
something which gets passed around unaltered for most of the code. The structure of this code is,
basically, what we will get for the monad-based code. The big win of the monad-based code is that
we almost never need to mention the state, once we have wrapped it up in the monad.

cpsred' :: Eq a => Term a -> ([a], Term a) ->
           (([a], Term a) -> t) -> t

cpsred' (Var x) (frees, n) k =
  let k' = \(fs, n') -> k (fs, App (Var x) n')
  in
    cpsred (frees, n) k'
cpsred' (App p q) (frees, n) k =
  let k' = \(fs, n') -> k (fs, App (App p q) n')
  in
    cpsred (frees, n) k'
cpsred' (Abs x p) (frees, n) k =
  let k' = \(fs, p') -> cpsred (fs, p') k
  in cpssubst n x (frees, p) k'

Program 20.15: cpsred' with let


20.6 Solution 3: Or, we could use a monad

In Chapter 19 we saw a simple evaluator which used a state variable to count the number of divisions it
had performed. We can use the same technique to deal with the stock of fresh variables. For simplicity
we will just use an integer for the state. The code which deals with setting the monad up is:

type State = Int

newtype St a = MkSt (State -> (a, State))

applyst (MkSt f) s = f s

instance Monad St where
  return x = MkSt f where f s = (x, s)
  p >>= q  = MkSt f
    where
      f s = applyst (q x) s'
        where
          (x, s') = applyst p s

instance Show a => Show (St a) where
  show f = "Value: " ++ show x ++
           " Next: " ++ (mkvar s)
    where (x, s) = applyst f 0

fresh = MkSt (\s -> (mkvar s, s+1))

mkvar n = '_' : (show n)

Program 20.16: Setting up a state monad
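As a hedged check that the plumbing works (demo is our own name, for illustration), two successive calls to fresh yield distinct variables and leave the counter advanced:

demo :: St (String, String)
demo = fresh >>= \v1 ->
       fresh >>= \v2 ->
       return (v1, v2)
-- applyst demo 0 = (("_0", "_1"), 2)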

One significant difference is that red has a different type from before. For each of the functions we
will give a version using bind (>>=), and then a version defined using the do notation.

red :: Term [Char] -> St (Term [Char])

red (Var x)   = return (Var x)
red (Abs x m) =
  red m >>= (\m' ->
  return (Abs x m'))
red (App m n) =
  red m >>= (\m' ->
  red' m' n)

Program 20.17: red using >>=


red :: Term [Char] -> St (Term [Char])

red (Var x)   = return (Var x)
red (Abs x m) = do
  m' <- red m
  return (Abs x m')
red (App m n) = do
  m' <- red m
  red' m' n

Program 20.18: red using do

red' :: Term [Char] -> Term [Char] -> St (Term [Char])

red' (Var x) n =
  red n >>= (\n' ->
  return (App (Var x) n'))
red' (App p q) n =
  red n >>= (\n' ->
  return (App (App p q) n'))
red' (Abs x p) n =
  subst n x p >>= (\p' ->
  red p')

Program 20.19: red' using >>=

red' :: Term [Char] -> Term [Char] -> St (Term [Char])

red' (Var x) n = do
  n' <- red n
  return (App (Var x) n')
red' (App p q) n = do
  n' <- red n
  return (App (App p q) n')
red' (Abs x p) n = do
  p' <- subst n x p
  red p'

Program 20.20: red' using do


subst :: Term [Char] -> [Char] -> Term [Char]
         -> St (Term [Char])

subst n x (Var y)
  | x == y    = return n
  | otherwise = return (Var y)
subst n x (App p q) =
  subst n x p >>= (\p' ->
  subst n x q >>= (\q' ->
  return (App p' q')))
subst n x (Abs y p)
  | x == y = return (Abs y p)
  | not (y `freein` n) =
      subst n x p >>= (\p' ->
      return (Abs y p'))
  | otherwise =
      fresh >>= (\z ->
      subst (Var z) y p >>= (\p' ->
      subst n x p' >>= (\p'' ->
      return (Abs z p''))))

Program 20.21: subst using >>=

subst :: Term [Char] -> [Char] -> Term [Char]
         -> St (Term [Char])

subst n x (Var y)
  | x == y    = return n
  | otherwise = return (Var y)
subst n x (App p q) = do
  p' <- subst n x p
  q' <- subst n x q
  return (App p' q')
subst n x (Abs y p)
  | x == y = return (Abs y p)
  | not (y `freein` n) = do
      p' <- subst n x p
      return (Abs y p')
  | otherwise = do
      z   <- fresh
      p'  <- subst (Var z) y p
      p'' <- subst n x p'
      return (Abs z p'')

Program 20.22: subst using do

20.6.1 Comments
For the monad-based code there is certainly a burden of setting up the monad, and we have been
less abstract (any type with a next function defined on it should have been suitable for the state) than
with the other definitions, but there can be little doubt that the code written with the do notation is very
much closer to Programs 20.2, 20.3, 20.4 than any of the other code is. This must count as a very
strong argument in favour of monads.


21 An SECD machine
21.1 Introduction
This Chapter presents some example code to implement a version of the SECD machine of Landin
[47]. The code is in 3 modules:

• the machine itself is in §21.3

• a data type for λ-terms, in §21.7

• an implementation of stacks, in §21.10
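The code below leans on the Stack module throughout, so here, as a hedged sketch, is a minimal list-based implementation providing just the operations the machine uses; the version in §21.10 may well differ in detail:

module Stack
  ( Stack, emptyStack, isEmptyStack, push, pop )
where

newtype Stack a = Stack [a]
                  deriving (Eq, Show)

emptyStack :: Stack a
emptyStack = Stack []

isEmptyStack :: Stack a -> Bool
isEmptyStack (Stack xs) = null xs

push :: a -> Stack a -> Stack a
push x (Stack xs) = Stack (x : xs)

pop :: Stack a -> (a, Stack a)   -- errors on an empty stack
pop (Stack (x:xs)) = (x, Stack xs)
pop (Stack [])     = error "pop: empty stack"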

21.2 Informal description of the SECD machine


The λ expressions that we use will include variables, numbers, ‘built-in’ functions and application and
abstraction terms.
The SECD machine has 4 stacks:

S a stack where results are stored

E an environment consisting of identifier/value pairs

C the control, the expression being executed

D the dump, where we can copy the current state of the machine

The operation of the SECD machine can be given in terms of a while loop. If there is no more work
to do we stop, and return the result on the stack; otherwise we inspect the top of the control stack to
see what we must push onto and pop from the various stacks to make progress.

21.2.1 Evaluation
while not finished(S, E, C, D) do
{if empty(C)
then do
{resume(S, E, C, D)}
else do
{ case top(C) of
...
}
}

21.3 Module SECD2


module SECD2
(
)
where


21.4 Imports
import Stack
import Terms

The possible forms of a result are:


data Result = Value Value
            | Closure Var Term EnvSt
            deriving (Eq, Show)

The control stack will contain either a term or a constant signalling that some particular state has
been reached. At arises when we have evaluated an application term, If when we have evaluated a
conditional expression.

data Control = At
             | If
             | PrimUnCall UnOp
             | PrimBinCall BinOp
             | Term Term
             deriving (Eq, Show)
Now we set the stacks up:
type Binding = (Var, Result)

type EnvSt  = Stack Binding
type ResSt  = Stack Result
type ContSt = Stack Control
type Dump   = Stack (ResSt, EnvSt, ContSt)

data SECD = Quad ResSt EnvSt ContSt Dump
          -- deriving (Eq, Show)

21.5 Functions
We start execution by calling load of a term. The term gets loaded, and evaluation begins.

doit :: String -> Result

doit = load . parsetm

load :: Term -> Result

load t =
  eval (Quad emptyStack
             emptyStack
             (push (Term t) emptyStack)
             emptyStack)

The evaluation function inspects the control stack. If it is empty, and the dump is also empty then
we are finished, and we stop and return a value. If the dump is not empty, we resume the suspended
computation. If the control stack is not empty we pop it and proceed as appropriate.

eval :: SECD -> Result

eval (Quad s e c d)
  | isEmptyStack c =
      if (isEmptyStack d)
      then stop s
      else resume s d
  | otherwise =
      let (ct, cr) = pop c
      in case ct of
           (Term t)         -> auxterm t (Quad s e cr d)
           At               -> auxat (Quad s e cr d)
           If               -> auxif (Quad s e cr d)
           (PrimUnCall op)  -> auxunop op (Quad s e cr d)
           (PrimBinCall op) -> auxbinop op (Quad s e cr d)

When we are finished we return the top of the result stack.

stop :: Stack a -> a

stop = fst . pop

When we must resume, the result stack will have a single value in it. We will crash with a pattern-
matching error if the result stack is not a singleton. We resume by putting the value on the top of the
stack that we moved to the dump, and restoring the old environment, control stack and dump.

resume :: ResSt -> Dump -> Result

resume s d = let (v, emptyStack)     = pop s
                 ((s', e', c'), d')  = pop d
             in eval (Quad (push v s') e' c' d')

If there is still work to do we look at the item on top of the control stack, and take the appropriate
action. All of these auxiliary functions are designed to give a pattern-matching error if computation
cannot proceed. The general pattern is as follows:

• if we have a term, we break the term up and push items onto the stacks. If the term is evaluable
we will leave a marker on the control stack to indicate that, when we reach it again, we can
perform some evaluation;

• if we have a marker we perform the appropriate action, such as making a primitive function call.

21.5.1 Dealing with a term on the control stack


One oddity is the way that we deal with conditional expressions. We push the then and else expressions
onto the control stack, then we push a marker If, and the test onto the control stack. When we
return to the If marker there will be a boolean on the result stack. This is quite different from the
technique described in [29].

auxterm :: Term -> SECD -> Result


auxterm (Const v) (Quad s e c d) =
eval (Quad (push (Value v) s) e c d)
auxterm (Var v) (Quad s e c d) =
eval (Quad (push (lookUp v e) s) e c d)
auxterm (PrimUnTm op t) (Quad s e c d) =
eval (Quad s e (push (Term t) (push (PrimUnCall op) c)) d)
auxterm (PrimBinTm op t1 t2) (Quad s e c d) =
eval (Quad s e (push (Term t2)
(push (Term t1)
(push (PrimBinCall op) c))) d)
auxterm (ITE b thn els) (Quad s e c d) =
eval (Quad s e (push (Term b)
(push If
(push (Term thn)
(push (Term els) c)))) d)
auxterm (Abs v t) (Quad s e c d) =
eval (Quad (push (Closure v t e) s) e c d)
auxterm (App t1 t2) (Quad s e c d) =
eval (Quad s e (push (Term t2) (push (Term t1) (push At c))) d)


The lookup function is:


lookUp :: Var -> EnvSt -> Result
lookUp x e
| isEmptyStack e = error ("lookup " ++ (show x))
| otherwise = let ((y, vy), e1) = pop e
in if x == y
then vy
else (lookUp x e1)
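Because the environment is a stack searched from the top, the most recent binding for a variable shadows any earlier one. A small illustrative call (using the Value constructor of Result, as above):

-- > lookUp "x" (push ("x", Value (NumVal 2)) (push ("x", Value (NumVal 1)) emptyStack))
-- Value (NumVal 2)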

21.5.2 Dealing with a unary operator on the control stack


We pop a value from the result stack and call a primitive function.
auxunop :: UnOp -> SECD -> Result
auxunop op (Quad s e c d) =
let (Value v, s') = pop s
in eval (Quad (push (Value ((primUncall op) v)) s') e c d)

Primitive unary operations


primUncall :: UnOp -> Value -> Value
primUncall Negative (NumVal n) = NumVal (-n)
primUncall Not (TruVal b) = TruVal (not b)
primUncall IsZero (NumVal n) = TruVal (n==0)

21.5.3 Dealing with a binary operator on the control stack


We pop two values from the result stack and call a primitive function.
auxbinop :: BinOp -> SECD -> Result
auxbinop op (Quad s e c d) =
let (Value v1, s') = pop s
(Value v2, s'') = pop s'
in eval (Quad (push (Value ((primBincall op) v1 v2)) s'') e c d)

Primitive binary operations


primBincall :: BinOp -> Value -> Value -> Value
primBincall Plus (NumVal n) (NumVal m) = NumVal (n + m)
primBincall Minus (NumVal n) (NumVal m) = NumVal (n - m)
primBincall Times (NumVal n) (NumVal m) = NumVal (n * m)
primBincall Divide (NumVal n) (NumVal m) = NumVal (n `div` m)
primBincall Remainder (NumVal n) (NumVal m) = NumVal (n `rem` m)

21.5.4 Dealing with If on the control stack


If the top of the control stack is If then we have been evaluating a conditional expression. The top of
the result stack will be a Boolean value, and the next two items on the control stack will be the then and
else expressions. We choose the appropriate one and push it back onto the control stack (ignoring the
fact that half the time we do useless work, pushing back the term we have only just popped).
auxif :: SECD -> Result
auxif (Quad s e c d) =
let (Value (TruVal b), s') = pop s
(t1, c') = pop c
(t2, c'') = pop c'
choice = if b then t1 else t2
in eval (Quad s' e (push choice c'') d)
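For instance, when evaluating If IsZero 0 Then 1 Else 2 the machine reaches the state sketched below (abbreviating terms), and auxif selects the then branch:

-- just before auxif runs (the If marker has already been popped):
--   S = [Value (TruVal True)]    C = [Term 1, Term 2]
-- after auxif:
--   S = []                       C = [Term 1]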


21.5.5 Dealing with At on the control stack


If the top of the control stack is At then we have been evaluating an application term. The top of the
result stack will be a closure. We pop the next item from the result stack, and add a new binding to
the environment of the closure. We dump the current stacks, and proceed to evaluate the body of the
closure in the new environment.
auxat :: SECD -> Result
auxat (Quad s e c d) =
let (Closure x t e', s') = pop s
(v, s'') = pop s'
in eval (Quad emptyStack
(push (x, v) e')
(push (Term t) emptyStack)
(push (s'', e, c) d))
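A hand trace of the application (^x. x) 0 shows the dump at work. We abbreviate Value (NumVal 0) to Value 0, write stack tops leftmost, and write e and c for the environment and control stack at the point of the call:

--   C = [Term 0, Term (^x.x), At]   S = []
--   C = [Term (^x.x), At]           S = [Value 0]
--   C = [At]                        S = [Closure "x" (Var "x") e, Value 0]
-- auxat dumps the current stacks and evaluates the body under the new binding:
--   C = [Term (Var "x")]   S = []   E = [("x", Value 0)]   D = [([], e, c)]
--   C = []                 S = [Value 0]
-- the control stack is now empty but the dump is not, so resume restores
-- the dumped state with Value 0 pushed onto it, and stop returns Value 0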

21.6 Some miscellaneous definitions


w, bigOmega :: Term
w = (Abs "x" (App (Var "x") (Var "x")))
bigOmega = App w w

--add s, k, i, theta

itm = Abs "x" (Var "x")


ktm = Abs "x" (Abs "y" (Var "x"))
stm = Abs "x"
(Abs "y"
(Abs "z" (App (App (Var "x") (Var "z")) (App (Var "y") (Var "z")))))

skk = App (App stm ktm) ktm

zero = Const (NumVal 0)


one = Const (NumVal 1)
two = Const (NumVal 2)

mktm op n m = PrimBinTm op n m

xminus1 = Abs "x" (mktm Minus (Var "x") one)

iszero t = PrimUnTm IsZero t

ch0 = Abs "f" (Abs "x" (Var "x"))


ch1 = Abs "f" (Abs "x" (App (Var "f") (Var "x")))
ch2 = App (App chplus ch1) ch1
ch3 = App (App chplus ch1) ch2
ch4 = App (App chplus ch1) ch3
ch5 = App (App chplus ch1) ch4
ch6 = App (App chplus ch1) ch5

ch36 = App (App chtimes ch6) ch6


ch1296 = App (App chtimes ch36) ch36

chplus = Abs "x"


(Abs "y"


(Abs "p"
(Abs "q" (App (App (Var "x") (Var "p"))
(App (App (Var "y") (Var "p")) (Var "q"))))))

chtimes = Abs "f"


(Abs "g"
(Abs "x" (App (Var "f") (App (Var "g") (Var "x")))))

chpower = Abs "f"


(Abs "g" (App (Var "g") (Var "f")))

ch2int ch = App (App ch (Abs "v" (PrimBinTm Plus (Var "v") one)))
zero
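As a quick sanity check, ch2int converts a Church numeral to a machine integer; a plausible session:

-- > load (ch2int ch2)
-- Value (NumVal 2)
-- > load (ch2int ch36)
-- Value (NumVal 36)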

21.7 Module: Terms


This module defines the type of terms, and implements a parser. The parser code is not included in
this chapter, as it is based on the material in Chapter 9.

module Terms
(parsetm,
Var,
NumVal,
TruVal,
Value(NumVal, TruVal),
UnOp(Negative, Not, IsZero),
BinOp(Plus, Minus, Times, Divide, Remainder),
Term(Const, Var, PrimUnTm, PrimBinTm, ITE, Abs, App)
)

where
import ParserLib

21.8 Types
First some synonyms:

type Var = String


type NumVal = Int
type TruVal = Bool

We will have Boolean and integer values:

data Value = NumVal NumVal


| TruVal TruVal
deriving (Eq, Show)

And some operators:

data UnOp = Negative


| Not
| IsZero
deriving (Eq, Show)


data BinOp = Plus


| Minus
| Times
| Divide
| Remainder
deriving (Eq, Show)

Now we explain how to form terms.

data Term = Const Value


| Var Var
| PrimUnTm UnOp Term
| PrimBinTm BinOp Term Term
| ITE Term Term Term
| Abs Var Term
| App Term Term
deriving (Eq, Show)

21.9 Functions
21.9.1 Top-level call to be exported

parsetm :: String -> Term


parsetm = some absp
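A plausible session, assuming the combinators behave as in Chapter 9:

-- > parsetm "^x y. x + y"
-- Abs "x" (Abs "y" (PrimBinTm Plus (Var "x") (Var "y")))
-- > parsetm "K 1 2"
-- App (App (Abs "_x" (Abs "_y" (Var "_x"))) (Const (NumVal 1))) (Const (NumVal 2))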

21.9.2 Parsers which produce terms


In this section we have all the functions which have type Parser Char Term, as well as those functions
with type Parser Char [Term] (or similar) which are used to remove left recursion.
Abstraction terms are at the top-level.

absp :: Parser Char Term


absp = lambda &> (nespacelist name) <&> dot &> absp <@ uncurry mkabs
<|> ifthenelse

Then, If-Then-Else expressions. We force brackets on nested If-Then-Elses.

ifthenelse :: Parser Char Term


ifthenelse = ifsym &> notp <&> thensym &> notp <&> elsesym &> notp <@ mkITE
<|> notp

Then Not:

notp :: Parser Char Term


notp = notsym &> notp <@ PrimUnTm Not
<|> zerop

Then IsZero:

zerop :: Parser Char Term


zerop = isZerosym &> zerop <@ PrimUnTm IsZero
<|> multp

Next we have the terms formed with the arithmetic operators:


multp :: Parser Char Term


multp = addp <&> multpaux <@ uncurry mklaop
<|> addp

multpaux :: Parser Char [(BinOp, Term)]


multpaux = (multsymp <&> addp) <&> multpaux <@ (uncurry (:))
<|> multsymp <&> addp <@ singleton

addp :: Parser Char Term


addp = negativep <&> addpaux <@ uncurry mklaop
<|> negativep

addpaux :: Parser Char [(BinOp, Term)]


addpaux = (addsymp <&> negativep) <&> addpaux <@ (uncurry (:))
<|> addsymp <&> negativep <@ singleton

negativep :: Parser Char Term


negativep = negativesym &> negativep <@ PrimUnTm Negative
<|> appp

Application binds very tightly:

appp :: Parser Char Term


appp = constp <&> apppaux <@ (foldl1 App) . (uncurry (:))
<|> constp

apppaux :: Parser Char [Term]


apppaux = constp <&> apppaux <@ (uncurry (:))
<|> constp <@ singleton

Finally we have constants, variables, defined combinators, and bracketed terms.

constp :: Parser Char Term


constp = value <@ Const
<|> name <@ Var
<|> comb
<|> bra &> absp <& ket

comb :: Parser Char Term


comb = yCombinator
<|> pairCombinator
<|> fstCombinator
<|> sndCombinator
<|> sCombinator
<|> kCombinator
<|> iCombinator

21.9.3 Parsing parts of terms


In this section we have all the parsers which return sub-parts of terms.

multsymp :: Parser Char BinOp


multsymp = timesp <|> divp <|> modp

addsymp :: Parser Char BinOp


addsymp = plusp <|> minusp


plusp, minusp, timesp, divp, modp :: Parser Char BinOp


plusp = plussym <@ k Plus
minusp = minussym <@ k Minus
timesp = timessym <@ k Times
divp = divsym <@ k Divide
modp = modsym <@ k Remainder

value :: Parser Char Value


value = natural <@ NumVal
<|> truval <@ TruVal

21.9.4 Parsing to Haskell types


In this section we have the parsers which return values in some Haskell type which we want to retain.

truval :: Parser Char Bool


truval = falsesym <@ k False
<|> truesym <@ k True

natural :: Parser Char Int


natural = sp (first (plus digit))
<@ (foldl (\ a b -> 10*a + b) 0)
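The fold assembles the digits most-significant first; for example, on the input "123":

-- foldl (\ a b -> 10*a + b) 0 [1,2,3]
--   = ((0*10 + 1)*10 + 2)*10 + 3
--   = 123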

digit :: Parser Char Int


digit = satisfy isdigit
<@ (\ c -> (fromEnum c - fromEnum '0'))

21.9.5 Parsing tokens


In this section we have the parsers which return strings which we want to discard. Roughly, this is
tokenisation.

name,
plussym, minussym, timessym, divsym, modsym, negativesym,
truesym, falsesym, notsym, isZerosym,
ifsym, thensym, elsesym,
lambda, dot, bra, ket :: Parser Char String

name = sp (first (plus (satisfy ( \ c -> ('a' <= c && c <= 'z' )))))

plussym = sp (symbol '+')

minussym = sp (symbol '-')
timessym = sp (symbol '*')
divsym = sp (symbol '/')
modsym = sp (symbol '%')

negativesym = sp (symbol '~')


truesym = sp (token "True")
falsesym = sp (token "False")
notsym = sp (token "Not")

isZerosym = sp (token "IsZero")

ifsym = sp (token "If ")


thensym = sp (token "Then ")


elsesym = sp (token "Else ")

lambda = sp (symbol '^')

dot = sp (symbol '.')

bra = sp (symbol '(')

ket = sp (symbol ')')

21.9.6 Defined combinators


yCombinator :: Parser Char Term
yCombinator = sp (symbol 'Y')
<@ k (Abs "_x" (App (Abs "_y" (App (Var "_x") (App (Var "_y") (Var "_y"))))
(Abs "_y" (App (Var "_x") (App (Var "_y") (Var "_y"))))))

pairCombinator :: Parser Char Term


pairCombinator = sp (token "Pair")
<@ k (Abs "_x"
(Abs "_y"
(Abs "_z" (App (App (Var "_z") (Var "_x")) (Var "_y")))))

fstCombinator :: Parser Char Term


fstCombinator = sp (token "Fst")
<@ k (Abs "_p" (App (Var "_p") (Abs "_x" (Abs "_y" (Var "_x")))))

sndCombinator :: Parser Char Term


sndCombinator = sp (token "Snd")
<@ k (Abs "_p" (App (Var "_p") (Abs "_x" (Abs "_y" (Var "_y")))))

sCombinator :: Parser Char Term


sCombinator = sp (symbol 'S')
<@ k (Abs "_x"
(Abs "_y"
(Abs "_z"
(App (App (Var "_x") (Var "_z"))
(App (Var "_y") (Var "_z"))))))

kCombinator :: Parser Char Term


kCombinator = sp (symbol 'K') <@ k (Abs "_x" (Abs "_y" (Var "_x")))

iCombinator :: Parser Char Term


iCombinator = sp (symbol 'I') <@ k (Abs "_x" (Var "_x"))

21.9.7 Utility functions


These are utility functions used elsewhere.

mklaop :: Term -> [(BinOp, Term)] -> Term


mklaop a [] = error "mklaop should never get here!"
mklaop a [(op, b)] = PrimBinTm op a b
mklaop a ((op, b):more) = mklaop (PrimBinTm op a b) more
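mklaop builds a left-associated tree from an operand and a list of (operator, operand) pairs, so, for example:

-- mklaop t1 [(Minus, t2), (Minus, t3)]
--   = mklaop (PrimBinTm Minus t1 t2) [(Minus, t3)]
--   = PrimBinTm Minus (PrimBinTm Minus t1 t2) t3
-- hence t1 - t2 - t3 parses as (t1 - t2) - t3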


mkabs :: [Var] -> Term -> Term


mkabs [] _ = error "mkabs should never get here!"
mkabs [v] t = Abs v t
mkabs (v:vs) t = Abs v (mkabs vs t)

mkITE :: (Term, (Term, Term)) -> Term


mkITE (i, (t, e)) = ITE i t e

isdigit :: Char -> Bool


isdigit c = ’0’ <= c && c <= ’9’

singleton :: a -> [a]


singleton s = [s]

k :: a -> b -> a
k x y = x

21.10 Module: Stack


A very small module implementing stacks.

module Stack
( Stack,
emptyStack,
isEmptyStack,
push,
pop
) where

newtype Stack a = St [a]

instance Show a => Show (Stack a) where


show (St l) = show l

instance Eq a => Eq (Stack a) where


(St []) == (St []) = True
(St []) == _ = False
_ == (St []) = False
s1 == s2 = (pop s1) == (pop s2)

emptyStack = St []

isEmptyStack (St st) = st == []

push val (St st) = St (val : st)

pop (St []) = error "Can't pop empty stack"


pop (St (h:t)) = (h, St t)
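For example (the Show instance displays a stack as its underlying list, top first):

-- > pop (push 2 (push 1 emptyStack))
-- (2,[1])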

Bibliography
[1] Hassan Aït-Kaci. Warren’s Abstract Machine: A Tutorial Reconstruction. MIT Press, Cambridge,
Massachusetts, USA, 1991. Also available from https://www.isg.sfu.ca/~hak/documents/wam.html.
[2] Andrew Appel. Compiling with Continuations. Cambridge University Press, Cambridge, England,
1992.
[3] Franz Baader and Tobias Nipkow. Term Rewriting and All That. Cambridge University Press,
Cambridge, England, 1998.
[4] Henk Barendregt. The Lambda Calculus Its Syntax and Semantics, volume 103 of Studies in
Logic and the Foundations of Mathematics. North-Holland, Amsterdam, The Netherlands, revised
edition, 1984.
[5] Henk Barendregt. The impact of the lambda calculus in logic and computer science. Bulletin of
Symbolic Logic, 3(2):181–215, 1997. Also available at https://www.math.ucla.edu/~asl/bsl/0302/0302-003.ps.
[6] Richard Bird. Introduction to Functional Programming using Haskell. Prentice-Hall, second edition,
1998.
[7] Ivan Bratko. Prolog Programming for Artificial Intelligence. International Computer Science Series.
Addison-Wesley, Wokingham, England, 1986.
[8] Luca Cardelli and Peter Wegner. On understanding types, data abstraction, and polymor-
phism. Computing Surveys, 17(4):471–522, 1985. Available from https://www.research.microsoft.com/Users/luca/Papers/OnUnderstanding.A4.ps.
[9] Alonzo Church. The Calculi of Lambda Conversion. Princeton University Press, Princeton, NJ, USA,
1941.
[10] Alonzo Church. Introduction to Mathematical Logic, volume 1. Princeton University Press, Prince-
ton, New Jersey, USA, second, enlarged edition, 1956.
[11] W F Clocksin. Clause and Effect Prolog Programming for the Working Programmer. Springer-
Verlag, Berlin, 1997.
[12] W F Clocksin and C S Mellish. Programming in Prolog. Springer-Verlag, Berlin, third, revised and
extended edition, 1987.
[13] Olivier Danvy and Andrej Filinski. Representing control: a study of the CPS transformation. Mathematical
Structures in Computer Science, 1992. Also Tech Report CIS-91-2, Kansas State University.
[14] Anthony J T Davie. An Introduction to Functional Programming Systems Using Haskell, volume 27
of Cambridge Computer Science Texts. Cambridge University Press, Cambridge, England, 1992.
[15] Anthony J T Davie and Ronald Morrison. Recursive Descent Compiling. Ellis Horwood, 1981.
[16] Martin Davis, editor. The Undecidable. Raven Press, Hewlett, New York, USA, 1965.
[17] N G de Bruijn. Lambda calculus notation with nameless dummies, a tool for automatic formula
manipulation. Indag. Math., 34:381–392, 1972.
[18] N G de Bruijn. A survey of the project AUTOMATH. In J Roger Hindley and Jonathan P Seldin,
editors, To H. B. Curry: Essays in Combinatory Logic, Lambda-Calculus and Formalism, pages
579–607. Academic Press, New York, NY, USA, 1980.


[19] Phillipe de Groote. The Curry-Howard Isomorphism, volume 8 of Cahiers du Centre de Logique.
Academia, Louvain-la-Neuve, Belgium, 1995.

[20] Paul de Mast, Jan-Marten Jansen, Dick Bruin, Jeroen Fokker, Pieter Koopman, Sjaak Smetsers,
Marko van Eekelen, and Rinus Plasmeijer. Functional Programming in Clean. Unpublished draft,
2000. Available from https://www.cs.kun.nl/~clean/Manuals/Clean_Book/clean_book.html.

[21] Edsger Wybe Dijkstra. A Primer of ALGOL 60 Programming. APIC Studies in Data Processing.
Academic Press, London, England, 1962.

[22] Richard P Draves, Brian N Bershad, Richard F Rashid, and Randall W Dean. Using continua-
tions to implement thread management and communication in operating systems. In 13th ACM
Symposium on Operating Systems Principles, pages 122–136. ACM Press, 1991.

[23] Michael J Fischer. Lambda calculus schemata. Sigplan Notices, 7:104–109, 1972.

[24] Cormac Flanagan, Amr Sabry, Bruce F Duba, and Matthias Felleisen. The essence of compiling
with continuations. In Conference on Programming Language Design and Implementation, 1993.

[25] Jeroen Fokker. Functional parsers. In Johan Jeuring and Erik Meijer, editors, Advanced Functional
Programming, Tutorial Text of the First International Spring School on Advanced Functional Pro-
gramming Techniques, volume 925 of Lecture Notes in Computer Science, pages 1–23. Springer,
1995. Also available from https://www.cs.uu.nl/staff/IDX/sds.html.

[26] Harvey Friedman. Classically and intuitionistically provably recursive functions. In G.H. Müller
and D. S. Scott, editors, Higher Set Theory, pages 21–27. Springer, 1977.

[27] Gerhard Gentzen. Investigations into logical deduction. In M E Szabo, editor, The Collected Papers
of Gerhard Gentzen, pages 68–131. North-Holland, Amsterdam, The Netherlands, 1969.

[28] Carlo Ghezzi and Mehdi Jazayeri. Programming Language Concepts. John Wiley & Sons, New
York, NY, USA, third edition, 1998.

[29] Hugh Glaser, Chris Hankin, and David Till. Principles of Functional Programming. Prentice/Hall
International, London, England, 1984.

[30] Kurt Gödel. On formally undecidable propositions of Principia Mathematica and related systems
I. In Davis [16], pages 5–38. Originally published in German as Über formal unentscheidbare
Sätze der Principia Mathematica und verwandter Systeme I, Monatshefte für Mathematik und
Physik, Vol. 38, pp. 173–198, 1931.

[31] Andrew Donald Gordon. Functional Programming and Input/Output. PhD, University of Cam-
bridge, 1992. Also Technical Report 285, Cambridge University Computer Laboratory, and pub-
lished as a Distinguished Dissertation in Computer Science, Cambridge University Press.

[32] J S Green. ALGOL Programming for KDF9. An English Electric Leo mini-manual. English Electric-
Leo Computers Ltd, Stoke-on-Trent, England, 1963.

[33] Timothy G Griffin. A formulae-as-types notion of control. In Seventeenth Annual ACM Symposium
on Principles of Programming Languages (POPL 17), pages 47–58. ACM Press, 1990.

[34] John Hatcliff and Olivier Danvy. A generic account of continuation-passing styles. In Popl 94
: ACM Symposium on Principles of Programming Languages, pages 458–471. Association for
Computing Machinery, 1994.

[35] David Hilbert and W Ackermann. Principles of Mathematical Logic. Chelsea Publishing Company,
New York, USA, 1950. A translation, with corrections, of the second German edition of Grundzüge
der theoretischen Logik of 1938.

[36] David Hilbert and Paul Bernays. Grundlagen der Mathematik, volume 1. Springer, Berlin, Ger-
many, 1934. In German.


[37] J Roger Hindley. The principal type-scheme of an object in combinatory logic. Transactions of the
American Mathematical Society, 146(12):29–60, 1969.
[38] J Roger Hindley. Basic Simple Type Theory, volume 42 of Cambridge Tracts In Theoretical Computer
Science. Cambridge University Press, Cambridge, 1997.
[39] J Roger Hindley and Jonathan P Seldin. Introduction to Combinators and λ-Calculus, volume 1
of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, England,
1986.
[40] Paul Hudak. The Haskell School of Expression Learning Functional Programming Through Multi-
media. Cambridge University Press, Cambridge, England, 2000.
[41] John Hughes. Why functional programming matters. In David Turner, editor, Research Topics in
Functional Programming. Addison-Wesley, Reading, Massachusetts, USA, 1990. Also appeared in
The Computer Journal 32(2), 1989.
[42] Graham Hutton. Higher-order functions for parsing. Journal of Functional Programming,
2(3):323–343, 1992. Also available from https://www.cs.nott.ac.uk/~gmh/bib.html#parsing.
[43] Richard Kelsey, William Clinger, and Jonathan Rees. Revised5 report on the algorithmic language
Scheme. Journal of Higher Order and Symbolic Computation, 11(1):7–105, 1998. Also appears
in ACM SIGPLAN Notices 33(9), September 1998.
[44] Brian W Kernighan and Dennis M Ritchie. The C Programming Language. Prentice Hall Software
Series. Prentice Hall, Englewood Cliffs, New Jersey, USA, second edition, 1988.
[45] Stephen C Kleene. General recursive functions of natural numbers. In Davis [16], pages 237–
253. Originally published in Mathematische Annalen 112(5):727–742, 1936.
[46] J Lambek and P J Scott. Introduction to higher order categorical logic, volume 7 of Cambridge
Studies in Advanced Mathematics. Cambridge University Press, Cambridge, England, 1986.
[47] P J Landin. The mechanical evaluation of expressions. The Computer Journal, 6:308–320, 1964.
[48] Leonardo of Pisa (Fibonacci). Liber Abaci. Second edition, 1228. In Latin.
[49] Neil Leslie. Specification And Implementation Of A Unification Algorithm In Martin-Löf’s Type
Theory. MSc, St Andrews, 1993.
[50] Neil Leslie. Continuations and Martin-Löf’s Type Theory. PhD, Massey University, 2000.
[51] Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. Java series. Addison-
Wesley, second edition, 1999. Available from https://java.sun.com/docs/books/vmspec/index.html.
[52] Kenneth C Louden. Programming Languages Principles and Practice. PWS-KENT Series in Com-
puter Science. PWS Publishing Company, Boston, Massachusetts, USA, 1993.
[53] John McCarthy. History of LISP. In Wexelblat [85], pages 173–197.
[54] Zohar Manna and Richard Waldinger. Deductive synthesis of the unification algorithm. Science
of Computer Programming, 1:5–48, 1981.
[55] Per Martin-Löf. Intuitionistic Type Theory, volume 1 of Studies in Proof Theory Lecture Notes.
Bibliopolis, Napoli, Italy, 1984. Notes taken by Giovanni Sambin from a series of lectures given
in Padua, June 1980.
[56] Robin Milner. A theory of type polymorphism in programming. Journal of Computer and System
Sciences, 17(3):348–375, 1978.
[57] Chetan R Murthy. Extracting Constructive Content From Classical Proofs. PhD, Cornell University,
1990. Also available as TR 89-1151 from Dept. of Computer Science, Cornell University.


[58] Bengt Nordström, Kent Petersson, and Jan M Smith. Programming in Martin-Löf’s Type Theory An
Introduction. Clarendon Press, Oxford, England, 1990.

[59] Chris Okasaki. Purely Functional Data Structures. Cambridge University Press, Cambridge, Eng-
land, 1998.

[60] Larry Paulson. Verifying the unification algorithm in LCF. Science of Computer Programming,
5:143–169, 1985.

[61] Laurence C Paulson. ML for the Working Programmer. Cambridge University Press, Cambridge,
England, 1991.

[62] Alan J Perlis. The American side of the development of ALGOL. In Wexelblat [85], pages 75–91.

[63] Rózsa Péter. Recursive Functions in Computer Theory. The Ellis Horwood Series in Computers and
Their Applications. Ellis Horwood, Chichester, England, 1981. Originally published in German
as Rekursive Funktionen in der Komputer-Theorie, 1977.

[64] Simon L Peyton Jones. The Implementation of Functional Programming Languages. Prentice-Hall
International Series on Computer Science. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, USA,
1987.

[65] Rinus Plasmeijer and Marko van Eekelen. Functional Programming and Parallel Graph Rewriting.
International Computer Science Series. Addison-Wesley, Wokingham, England, 1993.

[66] Gordon D Plotkin. Call-by-name, call-by-value and the λ-calculus. Theoretical Computer Science,
1:125–159, 1975.

[67] Emil L Post. Finite combinatory processes. Formulation 1. In Davis [16], pages 288–292. Originally
published in The Journal of Symbolic Logic 1:103–105, 1936.

[68] Stuart Russell and Peter Norvig. Artificial Intelligence A Modern Approach. Prentice Hall Studies
in Artificial Intelligence. Prentice Hall, Upper Saddle River, New Jersey, USA, 1995.

[69] Giovanni Sambin and Jan Smith, editors. Twenty-Five Years of Constructive Type Theory, vol-
ume 36 of Oxford Logic Guides. Clarendon Press, Oxford, England, 1998.

[70] David A Schmidt. Denotational Semantics A Methodology for Language Development. Wm. C.
Brown, Dubuque, Iowa, USA, 1986.

[71] Helmut Schwichtenberg. Proofs, lambda terms and control operators. In Helmut Schwichtenberg,
editor, Logic of Computation, pages 309–348. Springer, Heidelberg, Germany, 1997. Proceed-
ings of the NATO Advanced Study Institute on Logic of Computation, held at Marktoberdorf,
Germany, July 25 – August 6, 1995.

[72] Brian J Shelburne and Christopher P Burton. Early programs on the Manchester Mark I prototype.
IEEE Annals of the History of Computing, 20(3):4–15, July-September 1998. See also
https://www.computer50.org.

[73] J C Shepherson and H E Sturgis. Computability of recursive functions. Journal of the Association
for Computing Machinery, 10:217–255, 1963.

[74] Christopher Strachey and Christopher P Wadsworth. Continuations: A mathematical semantics
for handling full jumps. Technical Monograph PRG-11, Oxford University Computing Laboratory
Programming Research Group, 1974.

[75] Hayo Thielecke. Categorical Structure of Continuation Passing Style. PhD, University of Edinburgh,
1997. Also available as technical report ECS-LFCS-97-376.

[76] Simon Thompson. Type Theory and Functional Programming. International Computer Science
Series. Addison-Wesley, Wokingham, England, 1991.


[77] Simon Thompson. Haskell The Craft of Functional Programming. International Computer Science
Series. Addison-Wesley, Wokingham, England, second edition, 1999.
[78] Anne Sjerp Troelstra and Dirk van Dalen. Constructivism in Mathematics An Introduction Volume 1,
volume 121 of Studies in Logic and the Foundations of Mathematics. North-Holland, Amsterdam,
The Netherlands, revised edition, 1988.
[79] Alan Turing. On computable numbers, with an application to the Entscheidungsproblem. In Davis
[16], pages 115–153. Originally published in Proceedings of the London Mathematical Society
Series 2 vol. 42 1936-1937 pp. 230–265.
[80] Nikolai Nikolaevich Vorobev. The Fibonacci Numbers. Heath, Boston, Massachusetts, USA, 1963.
Translation of Chisla Fibonachchi, published in Russian, 1951.
[81] Philip Wadler. How to replace failure by a list of successes. In 2nd International Conference on
Functional Programming Languages and Computer Architecture. Springer-Verlag, 1985.
[82] Philip Wadler. The essence of functional programming. In 19th Symposium on Principles of
Programming Languages. ACM Press, 1992.
[83] Philip Wadler. Monads for functional programming. In M Broy, editor, Marktoberdorf
Summer School on Program Design Calculi, volume 118 of NATO ASI Series F: Computer
and systems sciences, Berlin, Germany, 1992. Springer. This paper is available from
https://www.cs.bell-labs.com/who/wadler/topics/monads.html.
[84] Philip Wadler and Stephen Blott. How to make ad-hoc polymorphism less ad hoc. In 16th
Symposium on Principles of Programming Languages. ACM Press, 1989. Also available from
https://cm.bell-labs.com/who/wadler/papers/class/class.ps.gz.
[85] Richard L Wexelblat, editor. History of Programming Languages. ACM Monograph. Academic
Press, New York, New York, USA, 1981.
[86] F.C. Williams and T. Kilburn. Electronic digital computers. Nature, 162:487, September 1948.
See also https://www.computer50.org.
