The Science of Functional
Programming. Part I
A tutorial, with examples in Scala
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free
Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections,
no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the appendix entitled “GNU Free
Documentation License” (Appendix F on page 1265).
A Transparent copy of the source code for the book is available at https://github.com/winitzki/sofp and
includes LyX, LaTeX, graphics source files, and build scripts. A full-color hyperlinked PDF file is available at
https://github.com/winitzki/sofp/releases under “Assets”. The source code may also be included as a “file
attachment” named sofp-src.tar.bz2 within a PDF file. To extract, run `pdftk sofp.pdf unpack_files output .`
and then `tar jxvf sofp-src.tar.bz2`. See the file README_build.md for build instructions.
This book is a pedagogical in-depth tutorial and reference on the theory of functional programming (FP) as it was
practiced at the beginning of the XXI century. Starting from issues found in practical coding, the book builds up the
theoretical intuition, knowledge, and techniques that programmers need for rigorous reasoning about types and code.
Examples are given in Scala, but most of the material applies equally to other FP languages.
The book’s topics include working with FP-style collections; reasoning about recursive functions and types; the
Curry-Howard correspondence; laws, structural analysis, and code for functors, monads, and other typeclasses based
on exponential-polynomial data types; techniques of symbolic derivation and proof; free typeclass constructions; and
practical applications of parametricity.
Long and difficult, yet boring explanations are logically developed in excruciating detail through 1906 Scala code
snippets, 192 statements with step-by-step derivations, 104 diagrams, 223 examples with tested Scala code, and 310
exercises. Discussions build upon each chapter’s material further.
Beginners in FP will find tutorials about the map/reduce programming style, type parameters, disjunctive types,
and higher-order functions. For more advanced readers, the book shows the practical uses of the Curry-Howard
correspondence and the parametricity theorems without unnecessary jargon; proves that all the standard monads (e.g.,
List or State) satisfy the monad laws; derives lawful instances of Functor and other typeclasses from types; shows that
monad transformers need 18 laws; and explains the use of parametricity for reasoning about the Church encoding and the
free typeclasses.
Readers should have a working knowledge of programming; e.g., be able to write code that prints the number of
distinct words in a sentence. The difficulty of this book’s mathematical derivations is at the level of undergraduate
multivariate calculus, similar to that of multiplying matrices or simplifying the expressions:
$\frac{1}{x-2} - \frac{1}{x+2}$ and $\frac{d}{dx}\left( (x+1)\, f(x)\, e^{-x} \right)$ .
The author received a Ph.D. in theoretical physics. After a career in academic research, he works as a software engineer.
Contents
Preface
Formatting conventions used in this book
I Introductory level
1 Mathematical formulas as code. I. Nameless functions
1.1 Translating mathematics into code
1.1.1 First examples
1.1.2 Nameless functions
1.1.3 Nameless functions and bound variables
1.2 Aggregating data from sequences
1.3 Filtering and truncating a sequence
1.4 Examples
1.4.1 Aggregations
1.4.2 Transformations
1.5 Summary
1.6 Exercises
1.6.1 Aggregations
1.6.2 Transformations
1.7 Discussion
1.7.1 Functional programming as a paradigm
1.7.2 Iteration without loops
1.7.3 The mathematical meaning of “variables”
1.7.4 Nameless functions in mathematical notation
1.7.5 Named and nameless expressions
1.7.6 Historical perspective on nameless functions
Preface
This book is a reference text and a tutorial that teaches functional programmers
how to reason mathematically about types and code, in a manner directly rele-
vant to software practice. The material ranges from introductory (Part I) to ad-
vanced (Part III). The book assumes a certain amount of mathematical experience
(at about the level of undergraduate algebra or calculus) as well as some experi-
ence writing code in general-purpose programming languages.
The vision of this book is to explain the mathematical theory that guides the
practice of functional programming. So, all mathematical developments in this
book are motivated by practical programming issues and are accompanied by
Scala code illustrating their usage. For instance, the laws for standard typeclasses
(functors, monads, etc.) are first motivated heuristically through code examples.
Then the laws are formulated as mathematical equations and proved rigorously.
To achieve a clearer presentation of the material, the book uses certain non-
standard notations (see Appendix A on page 1157) and terminology (Appendix B
on page 1167). The presentation is self-contained, defining and explaining all re-
quired techniques, notations, and Scala features. All code examples have been
tested to work but are intended only for explanation and illustration. As a rule,
the code is not optimized for performance. Although the code examples are in
Scala, the material in this book also applies to many other languages.
A software engineer needs to learn only those few fragments of mathematical
theory that answer questions arising in the programming practice. So, this book
keeps theoretical material at the minimum: vita brevis, ars longa. The scope of the
required mathematical knowledge is limited to first notions of set theory, formal
logic, and category theory. Concepts such as functors or natural transformations
arise from the practice of reasoning about code and are first explained without
reference to category theory.
This book is not an introduction to current theoretical research in functional pro-
gramming. Instead, the focus is on material known to be practically useful. The
book organically develops the scope of theoretical concepts that help program-
mers write code or answer practical questions about code. That includes construc-
tions such as “filterable functors” and “applicative contrafunctors” but excludes a
number of theoretical developments that do not appear to have significant ap-
plications. For instance, this book does not talk about introduction/elimination
rules, strong normalization, complete partial orders, domain theory, model the-
ory, adjoint functors, co-ends, pullbacks, or topoi.
The first part of the book introduces functional programming. Readers already
scala> s.product
res0: Int = 3628800
• In the introductory chapters, type expressions and code examples are writ-
ten in the Scala syntax. In Chapters 4–5, the book introduces a mathematical
notation for types: e.g., the Scala type expression ((A, B)) => Option[A] is
written as 𝐴 × 𝐵 → 1 + 𝐴. Chapters 4–7 also develop a more concise notation
for code. For example, the functor composition law (in Scala: _.map(f).map(g)
== _.map(f andThen g)) is written in the code notation as:
$f^{\uparrow L} \,\#\, g^{\uparrow L} = (f \,\#\, g)^{\uparrow L}$ ,
where $L$ is a functor and $f^{:A \to B}$ and $g^{:B \to C}$ are arbitrary functions of the
specified types. The notation $f^{\uparrow L}$ denotes the function $f$ lifted to the functor $L$
and replaces Scala’s syntax x.map(f) where x is of type L[A]. The symbol #
denotes the forward composition of functions (Scala’s method andThen). If
the notation still appears hard to follow after going through Chapters 5–6,
readers will benefit from working through Chapter 7, which explains the
code notation more systematically and clarifies it with additional examples.
Appendix A on page 1157 summarizes this book’s notation for types and
code.
• Frequently used methods of standard typeclasses, such as Scala’s flatten,
flatMap, etc., are denoted by shorter words and are labeled by the type
constructor they belong to. For instance, the methods pure, flatten, and flatMap
for a monad $M$ are denoted by $\text{pu}_M$, $\text{ftn}_M$, and $\text{flm}_M$ when writing code
formulas and proofs of laws.
• Derivations are written in a two-column format. The right column contains
formulas in the code notation. The left column gives an explanation or in-
dicates the property or law used to derive the expression at right from the
previous expression. A green underline shows the parts of an expression
that will be rewritten in the next step:
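expect to equal $\text{pu}_M$ :          $\text{pu}_M^{\uparrow\text{Id}} \,\#\, \text{pu}_M \,\#\, \text{ftn}_M$
lifting to the identity functor :  $= \text{pu}_M \,\#\, \text{pu}_M \,\#\, \text{ftn}_M$
left identity law of $M$ :         $= \text{pu}_M$ .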
When the two-column presentation becomes too wide to fit the page, the
explanations are placed before the next step’s line:
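expect to equal $\text{pu}_M$ :
$\text{pu}_M^{\uparrow\text{Id}} \,\#\, \text{pu}_M \,\#\, \text{ftn}_M$
lifting to the identity functor :
$= \text{pu}_M \,\#\, \text{pu}_M \,\#\, \text{ftn}_M$
left identity law of $M$ :
$= \text{pu}_M$ .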
A green underline is sometimes also used at the last step of a derivation,
to indicate the sub-expression that resulted from the most recent rewriting.
Other than providing hints to help clarify the steps, the green text and the
green underlines play no role in symbolic derivations.
• The symbol is used occasionally to indicate more clearly the end of a defi-
nition, a derivation, or a proof.
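The functor composition law and the sample derivation shown in the conventions above can be spot-checked in Scala. The following is a minimal sketch, not code from the book: it assumes that the functor $L$ is List and that the monad $M$ is Option, and the names f, g, xs, mapTwice, mapOnce, pu, ftn, and lhs are chosen here only for illustration. Checking a few values does not prove a law; it only illustrates what the equations say.

```scala
// Functor composition law for L = List (an assumption made for this sketch).
val f: Int => String = n => n.toString    // hypothetical f of type A => B
val g: String => Int = s => s.length      // hypothetical g of type B => C
val xs: List[Int] = List(1, 20, 300)

val mapTwice: List[Int] = xs.map(f).map(g)     // _.map(f).map(g)
val mapOnce: List[Int] = xs.map(f andThen g)   // _.map(f andThen g)

// The sample derivation for M = Option (also an assumption).
def pu[A]: A => Option[A] = x => Some(x)                 // "pure" for Option
def ftn[A]: Option[Option[A]] => Option[A] = _.flatten   // "flatten" for Option

// Lifting pu to the identity functor gives pu itself, so the starting
// expression of the derivation is the forward composition pu # pu # ftn.
def lhs[A]: A => Option[A] = pu[A] andThen pu[Option[A]] andThen ftn[A]
```

With these definitions, mapTwice and mapOnce are the same list, and lhs[Int](42) gives the same value as pu[Int](42), in agreement with the last line of the derivation.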
Part I
Introductory level
1 Mathematical formulas as code. I.
Nameless functions
1.1 Translating mathematics into code
1.1.1 First examples
We begin by implementing some computational tasks in Scala.
Example 1.1.1.1 Find the product of integers from 1 to 10 (the factorial of 10,
usually denoted by 10!).
Solution First, we write a mathematical formula for the result:
$10! = 1 * 2 * ... * 10$ , or in mathematical notation : $10! = \prod_{k=1}^{10} k$ .
We can then write Scala code in a way that resembles the last formula:
scala> (1 to 10).product
res0: Int = 3628800
The syntax (1 to 10) produces a sequence of integers from 1 to 10. The product
method computes the product of the numbers in the sequence.
The code (1 to 10).product is an expression, which means that (1) the code can
be evaluated and yields a value, and (2) the code can be used inside a larger ex-
pression. For example, we could write:
scala> 100 + (1 to 10).product + 100 // The code `(1 to 10).product` is a sub-expression.
res0: Int = 3629000
The Scala interpreter indicates that the result of (1 to 10).product is a value 3628800
of type Int. If we need to define a name for that value, we use the “val” syntax:
scala> val fac10 = (1 to 10).product
fac10: Int = 3628800
Equation (1.1) indicates that 𝑛 must be from the set of positive integers, denoted
by N in mathematics. This is similar to specifying the type (n: Int) in the Scala
code. So, the argument’s type in the code specifies the domain of a function (the
set of admissible values of a function’s argument).
Having defined the function f, we can now apply it to an integer value 10 (or,
as programmers say, “call” the function f with argument 10):
scala> f(10)
res6: Int = 3628800
It is a type error to apply f to a non-integer value:
scala> f("abc")
<console>:13: error: type mismatch;
found : String("abc")
required: Int
This reads as “a function that maps 𝑛 to the product of all 𝑘 where 𝑘 goes from 1
to 𝑛”. The Scala expression implementing this mathematical formula is:
(n: Int) => (1 to n).product
This expression shows Scala’s syntax for a nameless function. Here, n: Int is the
function’s argument variable, while (1 to n).product is the function’s body. The
function arrow (=>) separates the argument variable from the body.
Functions in Scala (whether named or nameless) are treated as values, which
means that we can also define a Scala value as:
scala> val fac = (n: Int) => (1 to n).product
fac: Int => Int = <function1>
We see that the value fac has the type Int => Int, which means that the function
fac takes an integer (Int) argument and returns an integer result value. What is the
value of the function fac itself? As we have just seen, the standard Scala interpreter
prints <function1> as the “value” of fac. Another Scala interpreter called ammonite
prints this:
scala@ val fac = (n: Int) => (1 to n).product
fac: Int => Int = ammonite.$sess.cmd0$$$Lambda$1675/2107543287@1e44b638
The long number could indicate an address in memory. We may imagine that
a “function value” represents a block of compiled code. That code will run and
evaluate the function’s body whenever the function is applied to an argument.
Once defined, a function can be applied to an argument value like this:
scala> fac(10)
res1: Int = 3628800
Functions can be also used without naming them. We may directly apply a name-
less factorial function to an integer argument 10 instead of writing fac(10):
scala> ((n: Int) => (1 to n).product)(10)
res2: Int = 3628800
We would rarely write code like this. Instead of creating a nameless function
and then applying it right away to an argument, it is easier to evaluate the expres-
sion symbolically by substituting 10 instead of n in the function body:
((n: Int) => (1 to n).product)(10) == (1 to 10).product
Of course, it is better to avoid repeating the value 12345. To achieve that, define n
as a value in an expression block like this:
scala> { val n = 12345; n * n * n + n * n }
res3: Int = 322687002
Defined in this way, the value n is visible only within the expression block. Out-
side the block, another value named n could be defined independently of this n.
For this reason, the definition of n is called a local-scope definition.
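The local scope of n can be checked directly. In this sketch, the outer value n and the names result and exact are hypothetical, introduced only for illustration. The last definition repeats the computation with Long arithmetic: the Int result shown above is much smaller than 12345³ because 32-bit Int arithmetic silently wraps around.

```scala
// An outer value that happens to be named `n` (a made-up name clash).
val n: String = "outer"

// The inner `n` is visible only inside the braces and shadows the outer `n` there.
val result: Int = { val n = 12345; n * n * n + n * n }

// The same formula with Long arithmetic, which does not overflow for this input.
val exact: Long = { val n = 12345L; n * n * n + n * n }
```

After these definitions, the outer n is still the string "outer", result is 322687002 as in the interpreter output, and exact is 1881518362650.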
Nameless functions are convenient when they are themselves arguments of
other functions, as we will see next.
Example 1.1.2.1 Define a function that takes an integer argument 𝑛 and deter-
mines whether 𝑛 is a prime number.
Solution By definition, 𝑛 is prime if, for all 𝑘 between 2 and 𝑛 − 1, the remainder
after dividing 𝑛 by 𝑘 (denoted by 𝑛%𝑘) is nonzero. We can write this as a
mathematical formula using the “forall” symbol (∀):
isPrime (𝑛) = ∀𝑘 ∈ [2, 𝑛 − 1]. (𝑛%𝑘) ≠ 0 . (1.2)
This formula has two parts: first, a range of integers from 2 to 𝑛 − 1, and second, a
requirement that all these integers 𝑘 should satisfy the given condition: (𝑛%𝑘) ≠ 0.
Formula (1.2) is translated into Scala code as:
def isPrime(n: Int) = (2 to n - 1).forall(k => n % k != 0)
This code looks closely similar to the mathematical notation, except for the ar-
row after 𝑘 that introduces a nameless function (k => n % k != 0). We do not need
to specify the type Int for the argument k of that nameless function. The Scala
compiler knows that k is going to iterate over the integer elements of the range (2
to n - 1), which effectively forces k to be of type Int because types must match.
We can now apply the function isPrime to some integer values:
scala> isPrime(12)
res3: Boolean = false
scala> isPrime(13)
res4: Boolean = true
As we can see from the output above, the function isPrime returns a value of type
Boolean. Therefore, the function isPrime has type Int => Boolean.
A function that returns a Boolean value is called a predicate.
In Scala, it is strongly recommended (although often not mandatory) to specify
the return types of named functions. The required syntax looks like this:
def isPrime(n: Int): Boolean = (2 to n - 1).forall(k => n % k != 0)
This would bring the syntax closer to Eq. (1.2). However, there still remains the
second difference: The symbol 𝑘 is used as an argument of a nameless function
(k => n % k != 0) in the Scala code, while the formula:
∀𝑘 ∈ [2, 𝑛 − 1]. (𝑛%𝑘) ≠ 0 , (1.3)
does not seem to define such a function but defines the symbol 𝑘 that goes over the
range [2, 𝑛 − 1]. The variable 𝑘 is then used for writing the predicate (𝑛%𝑘) ≠ 0.
Let us investigate the role of 𝑘 more closely. The mathematical variable 𝑘 is
accessible only inside the expression “∀𝑘...” and makes no sense outside that ex-
pression. This becomes clear by looking at Eq. (1.2): The variable 𝑘 is not present
in the left-hand side and could not possibly be used there. The name “𝑘” is accessi-
ble only in the right-hand side, where it is first mentioned as the arbitrary element
𝑘 ∈ [2, 𝑛 − 1] and then used in the sub-expression “𝑛%𝑘”.
So, the mathematical notation in Eq. (1.3) says two things: First, we use the
name 𝑘 for integers from 2 to 𝑛 − 1. Second, for each of those 𝑘 we evaluate the
expression (𝑛%𝑘) ≠ 0, which can be viewed as a certain function of 𝑘 that returns
a Boolean value. Translating the mathematical notation into code, it is natural to
use the nameless function 𝑘 → (𝑛%𝑘) ≠ 0 and to write Scala code applying this
nameless function to each element of the range [2, 𝑛 − 1] and checking that all
result values be true:
(2 to n - 1).forall(k => n % k != 0)
Just as the mathematical notation defines the variable 𝑘 only in the right-hand
side of Eq. (1.2), the argument k of the nameless Scala function k => n % k != 0 is
defined within that function’s body and cannot be used in any code outside the
expression n % k != 0.
Variables that are defined only inside an expression and are invisible outside
are called bound variables, or “variables bound in an expression”. Variables that
are used in an expression but are defined outside it are called free variables, or
“variables occurring free in an expression”. These concepts apply equally well
to mathematical formulas and to Scala code. For example, in the mathematical
expression ∀𝑘. (𝑛%𝑘) ≠ 0, the variable 𝑘 is bound (defined and only visible within
that expression’s scope) but the variable 𝑛 is free: it must be defined somewhere
outside the expression ∀𝑘. (𝑛%𝑘) ≠ 0.
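To see free and bound variables in Scala code, one can try to give the nameless function a name. This is only an illustrative sketch; the names noDivisor and isPrime2 are not from the text. The helper must receive n as an argument because n occurs free in k => n % k != 0, while k stays bound inside that nameless function.

```scala
// `n` is free in the nameless function `k => n % k != 0`: its value comes
// from outside (here, from the argument of isPrime). `k` is bound.
def isPrime(n: Int): Boolean = (2 to n - 1).forall(k => n % k != 0)

// Extracting the nameless function forces us to supply the free variable `n`.
def noDivisor(n: Int): Int => Boolean = k => n % k != 0

// The same computation, written with the extracted helper.
def isPrime2(n: Int): Boolean = (2 to n - 1).forall(noDivisor(n))
```

Both definitions return the same results, e.g., false for 12 and true for 13.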
The main difference between free and bound variables is that bound variables
can be locally renamed at will, unlike free variables. To see this, consider that we
could rename 𝑘 to 𝑧 and write instead of Eq. (1.2) an equivalent definition:
isPrime (𝑛) = ∀𝑧 ∈ [2, 𝑛 − 1]. (𝑛%𝑧) ≠ 0 .
At this point we can apply a simplification trick to this code. The nameless func-
tion 𝑘 → 𝑝(𝑘) does exactly the same thing as the (named) function 𝑝: It takes an
argument, which we may call 𝑘, and returns 𝑝(𝑘). So, we can simplify the Scala
code above to:
(1 to n).forall(p)
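A minimal check of this simplification, assuming a hypothetical predicate p defined with def (the names p, viaLambda, and viaName are made up here):

```scala
// A hypothetical predicate: true when k is not divisible by 7.
def p(k: Int): Boolean = k % 7 != 0

// Passing the nameless function k => p(k) ...
def viaLambda(n: Int): Boolean = (1 to n).forall(k => p(k))

// ... gives the same result as passing the named function p directly.
def viaName(n: Int): Boolean = (1 to n).forall(p)
```

Both versions return true for n = 6 and false for n = 7, since 7 is the first integer in the range for which p returns false.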
1.2 Aggregating data from sequences
Here we defined a helper function isEven in order to write more easily a formula
for countEven. In mathematics, complicated formulas are often split into simpler
parts by defining helper expressions.
We can write the Scala code similarly. We first define the helper function isEven;
the Scala code can be written in a style quite similar to the mathematical formula:
def isEven(k: Int): Int = (k % 2) match {
case 0 => 1 // First, check if it is zero.
case _ => 0 // The underscore means "otherwise".
}
For such a simple computation, we could also write shorter code using a name-
less function:
val isEven = (k: Int) => if (k % 2 == 0) 1 else 0
Given this function, we now need to translate into Scala code the expression
$\sum_{k \in L} \text{isEven}(k)$. We can represent the list $L$ using the data type List[Int] from the
Scala standard library.
To compute $\sum_{k \in L} \text{isEven}(k)$, we must apply the function isEven to each element
of the list $L$, which will produce a list of some (integer) results, and then we will
need to add all those results together. It is convenient to perform these two steps
separately. This can be done with the functions map and sum, defined in the Scala
standard library as methods for the data type List.
The method sum is similar to product and is defined for any List of numerical
types (Int, Float, Double, etc.). It computes the sum of all numbers in the list:
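As a minimal illustration (the sample lists and the names total and totalD are chosen here, not taken from the text):

```scala
// Sum of a list of integers.
val total: Int = List(1, 2, 3).sum

// The same method works for other numerical types, e.g., Double.
val totalD: Double = List(1.5, 2.5).sum
```

Here total is 6 and totalD is 4.0.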
5 Certain features of Scala allow programmers to write code that looks like f(x) but actually uses
an automatic conversion for the argument x, default argument values, or implicit arguments.
In those cases, replacing the code x => f(x) by f will fail to compile.
The method map needs more explanation. This method takes a function as its
second argument and applies that function to each element of the list. All the
results are stored in a new list, which is then returned as the result value:
scala> List(1, 2, 3).map(x => x * x + 100 * x)
res1: List[Int] = List(101, 204, 309)
Short functions are often defined inline, while longer functions are defined sepa-
rately with a name.
A method, such as map, can be also used with a “dotless” (infix) syntax:
scala> List(1, 2, 3) map func1 // Same as List(1, 2, 3).map(func1)
res3: List[Int] = List(101, 204, 309)
If the transforming function is used only once (such as func1 in the example
above), and especially for simple computations such as $x \to x^2 + 100x$, it is easier
to use a nameless function.
We can now combine the methods map and sum to define countEven:
def countEven(s: List[Int]) = s.map(isEven).sum
This code can be also written using a nameless function instead of isEven:
def countEven(s: List[Int]): Int = s
.map { k => if (k % 2 == 0) 1 else 0 }
.sum
In Scala, methods are often used one after another in a chain. For instance,
s.map(...).sum means: first apply s.map(...), which returns a new list; then apply
sum to that new list. To make the code more readable, we may put each of the
chained methods on a new line.
To test this code, let us run it in the Scala interpreter. In order to let the inter-
preter work correctly with multi-line code, we will enclose the code in braces:
scala> def countEven(s: List[Int]): Int = {
     |   s.map { k => if (k % 2 == 0) 1 else 0 }
     |     .sum
     | }
def countEven: (s: List[Int])Int
scala> countEven(List(1,2,3,4,5))
res0: Int = 2
Note that the Scala interpreter prints the types differently for named functions
(i.e., functions declared using def). It prints (s: List[Int])Int for a function of
type List[Int] => Int.
1.3 Filtering and truncating a sequence
The methods forall, exists, filter, and takeWhile require a predicate as an argu-
ment. The forall method returns true if and only if the predicate returns true for
all values in the list. The exists method returns true if and only if the predicate
holds (returns true) for at least one value in the list. These methods can be written
as mathematical formulas like this:
forall (𝑆, 𝑝) = ∀𝑘 ∈ 𝑆. 𝑝(𝑘) = true ,
exists (𝑆, 𝑝) = ∃𝑘 ∈ 𝑆. 𝑝(𝑘) = true .
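A few sample evaluations may make these definitions concrete. The list s and the predicates below are hypothetical examples, not taken from the text:

```scala
val s: List[Int] = List(1, 2, 3, 4, 5)

// forall: the predicate must hold for every element.
val allPositive: Boolean = s.forall(k => k > 0)    // every element is positive
val allEven: Boolean = s.forall(k => k % 2 == 0)   // fails already for 1

// exists: the predicate must hold for at least one element.
val someEven: Boolean = s.exists(k => k % 2 == 0)  // holds for 2
val someLarge: Boolean = s.exists(k => k > 10)     // holds for no element
```

Here allPositive and someEven are true, while allEven and someLarge are false.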
The filter method returns a list that contains only the values for which a pred-
icate returns true:
scala> List(1, 2, 3, 4, 5).filter(k => k != 3) // Exclude the value 3.
res5: List[Int] = List(1, 2, 4, 5)
The takeWhile method truncates a given list. More precisely, takeWhile returns a
new list that contains the initial portion of values from the original list for which
the predicate remains true:
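As a small illustration (the sample list is chosen here to match the filter example above; the names truncated and filtered are hypothetical), takeWhile stops at the first value for which the predicate is false, while filter only skips that value:

```scala
val xs: List[Int] = List(1, 2, 3, 4, 5)

// takeWhile keeps the initial values until the predicate first fails (at 3).
val truncated: List[Int] = xs.takeWhile(k => k != 3)

// filter, by contrast, examines every element and drops only the value 3.
val filtered: List[Int] = xs.filter(k => k != 3)
```

Here truncated is List(1, 2) while filtered is List(1, 2, 4, 5).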
1.4 Examples
1.4.1 Aggregations
Example 1.4.1.1 Improve the code for isPrime by limiting the search to $k \le \sqrt{n}$:
Solution Use takeWhile to truncate the initial list when 𝑘 ∗ 𝑘 ≤ 𝑛 becomes false:
def isPrime(n: Int): Boolean = {
(2 to n - 1)
.takeWhile(k => k * k <= n)
.forall(k => n % k != 0)
}
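One way to gain confidence in the improved code is to compare it with the earlier definition on a range of inputs. The sketch below uses the hypothetical names isPrimeSlow and isPrimeFast for the two versions; the upper limit 1000 is an arbitrary choice.

```scala
// The earlier definition: checks all k from 2 to n - 1.
def isPrimeSlow(n: Int): Boolean = (2 to n - 1).forall(k => n % k != 0)

// The improved definition: stops as soon as k * k exceeds n.
def isPrimeFast(n: Int): Boolean =
  (2 to n - 1)
    .takeWhile(k => k * k <= n)
    .forall(k => n % k != 0)

// Do the two definitions agree for all n from 2 to 1000?
val agree: Boolean = (2 to 1000).forall(n => isPrimeSlow(n) == isPrimeFast(n))

// The primes up to 30 according to the improved definition.
val primesTo30: List[Int] = (2 to 30).filter(isPrimeFast).toList
```

The two versions agree because any composite 𝑛 has a divisor 𝑘 with 𝑘 ∗ 𝑘 ≤ 𝑛, so the truncated search cannot miss a divisor.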
Example 1.4.1.2 Compute this product of absolute values: $\prod_{k=1}^{10} |\sin (k + 2)|$.
Solution
(1 to 10)
.map(k => math.abs(math.sin(k + 2)))
.product
Example 1.4.1.3 Compute $\sum_{k \in [1,10];\ \cos k > 0} \sqrt{\cos k}$ (the sum goes only over $k$ such
that $\cos k > 0$).
Solution
(1 to 10)
.filter(k => math.cos(k) > 0)
.map(k => math.sqrt(math.cos(k)))
.sum
It is safe to compute $\sqrt{\cos k}$, because we have first filtered the list by keeping only
values $k$ for which $\cos k > 0$. Let us check that this is so:
scala> (1 to 10).toList.filter(k => math.cos(k) > 0).map(x => math.cos(x))
res0: List[Double] = List(0.5403023058681398, 0.28366218546322625,
0.9601702866503661, 0.7539022543433046)
Example 1.4.1.7 Define a function 𝑝 that takes a list of integers and a function
f: Int => Int, and returns the largest value of 𝑓 (𝑥) among all 𝑥 in the list.
Solution
def p(s: List[Int], f: Int => Int): Int = s.map(f).max
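A brief usage sketch follows; the sample list and the squaring function are chosen here only for illustration. Note that max throws an exception when the list is empty, so this version of p assumes a non-empty argument.

```scala
def p(s: List[Int], f: Int => Int): Int = s.map(f).max

// The values of x * x on the list are 4, 9, and 1; the largest is 9.
val largest: Int = p(List(2, -3, 1), x => x * x)
```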
1.4.2 Transformations
Example 1.4.2.1 Given a list of lists, s: List[List[Int]], select the inner lists of
size at least 3. The result must be again of type List[List[Int]].
Solution To “select the inner lists” means to compute a new list containing only
the desired inner lists. We use filter on the outer list s. The predicate for the filter
is a function that takes an inner list and returns true if the size of that list is at least
3. Write the predicate as a nameless function, t => t.size >= 3, where t is of type
List[Int]:
def f(s: List[List[Int]]): List[List[Int]] = s.filter(t => t.size >= 3)
The Scala compiler deduces from the code that the type of t is List[Int] because
we apply filter to a list of lists of integers.
Example 1.4.2.2 Find all integers 𝑘 ∈ [1, 10] such that there are at least three
different integers 𝑗, where 1 ≤ 𝑗 ≤ 𝑘, each 𝑗 satisfying the condition 𝑗 ∗ 𝑗 > 2 ∗ 𝑘.
Solution
scala> (1 to 10).toList.filter(k =>
(1 to k).filter(j => j*j > 2*k).size >= 3)
res0: List[Int] = List(6, 7, 8, 9, 10)
The argument of the outer filter is a nameless function that also uses a filter.
The inner expression:
(1 to k).filter(j => j*j > 2*k)
computes a list of all 𝑗’s that satisfy the condition 𝑗 ∗ 𝑗 > 2 ∗ 𝑘. The size of that
list is then compared with 3. In this way, we impose the requirement that there
should be at least 3 values of 𝑗. We can see how the Scala code closely follows the
mathematical formulation of the task.
1.5 Summary
Functional programs are mathematical formulas translated into code. Table 1.1
summarizes the tools explained in this chapter and gives implementations of
some mathematical constructions in Scala. We have also shown methods such
as takeWhile that do not correspond to widely used mathematical symbols.
What problems can one solve with these techniques?
• Transform and aggregate data from lists using map, filter, sum, and other
methods from the Scala standard library.
What are examples of problems that are not solvable with these tools?
• Example 2: Compute a list of partial sums from a given list of integers. For
example, the list [1, 2, 3, 4] should be transformed into [1, 3, 6, 10].
1.6 Exercises
1.6.1 Aggregations
Exercise 1.6.1.1 Define a function that computes a “staggered factorial” (denoted
by 𝑛!!) for positive integers. It is defined as 1 · 3 · ... · 𝑛 when 𝑛 is odd and as
2 · 4 · ... · 𝑛 when 𝑛 is even. For example, 8!! = 384 and 9!! = 945.
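Exercise 1.6.1.2 Machin’s formula⁷ converges to 𝜋 faster than Example 1.4.1.5:
$\frac{\pi}{4} = 4 \arctan \frac{1}{5} - \arctan \frac{1}{239}$ ,
$\arctan \frac{1}{n} = \frac{1}{n} - \frac{1}{3} \frac{1}{n^3} + \frac{1}{5} \frac{1}{n^5} - ... = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1} n^{-2k-1}$ .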
7 http://turner.faculty.swau.edu/mathematics/materialslibrary/pi/machin.html
Implement a function that computes the series for $\arctan \frac{1}{n}$ up to a given number
of terms, and compute an approximation of 𝜋 using this formula. Show that 12
terms of the series are sufficient for a full-precision Double approximation of 𝜋.
Exercise 1.6.1.3 Check numerically that $\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}$. First, define a function of
𝑛 that computes a partial sum of that series until 𝑘 = 𝑛. Then compute the partial
sum for a large value of 𝑛 and compare with the limit value.
Exercise 1.6.1.4 Using the function isPrime, check numerically the Euler product
formula⁸ for Riemann’s zeta function $\zeta(4)$. It is known⁹ that $\zeta(4) = \frac{\pi^4}{90}$:
$\zeta(4) = \prod_{k \ge 2;\ k \text{ is prime}} \frac{1}{1 - \frac{1}{k^4}} = \frac{\pi^4}{90}$ .
1.6.2 Transformations
Exercise 1.6.2.1 Define a function add20 of type List[List[Int]] => List[List[Int]]
that adds 20 to every element of every inner list. A sample test:
scala> add20( List( List(1), List(2, 3) ) )
res0: List[List[Int]] = List(List(21), List(22, 23))
Exercise 1.6.2.5 Define a function of type List[Double] => List[Double] that per-
forms a “normalization” of a list: it finds the element having the largest absolute
value and, if that value is zero, returns the original list; if that value is nonzero,
divides all elements by that value and returns a new list. Test with:
scala> normalize(List(1.0, -4.0, 2.0))
res0: List[Double] = List(0.25, -1.0, 0.5)
8 http://tinyurl.com/4rjj2rvc
9 https://tinyurl.com/yxey4tsd
1.7 Discussion
1.7.1 Functional programming as a paradigm
Functional programming (FP) is a paradigm — an approach that guides program-
mers to write code in specific ways, applicable to a wide range of tasks.
The main idea of FP is to write code as a mathematical expression or formula. This
allows programmers to derive code through logical reasoning rather than through
guessing, similarly to how books on mathematics reason about mathematical for-
mulas and derive results systematically, without guessing or “debugging.” Like
mathematicians and scientists who reason about formulas, functional program-
mers can reason about code systematically and logically, based on rigorous princi-
ples. This is possible only because code is written as a mathematical formula.
Mathematical intuition is useful for programming tasks because it is backed
by the vast experience of working with data over millennia of human history. It
took centuries to invent flexible and powerful notation, such as $\sum_{k \in S} p(k)$, and to
develop the corresponding rules of calculation. Converting formulas into code,
FP capitalizes on the power of those reasoning tools.
As we have seen, the Scala code for certain computational tasks corresponds
quite closely to mathematical formulas (although programmers do have to write
out some details that are omitted in the mathematical notation). Just as in mathe-
matics, large code expressions may be split into smaller expressions when needed.
Expressions can be reused, composed in various ways, and written independently
from each other. Over the years, the FP community has developed a toolkit of
functions (such as map, filter, flatMap, etc.), which are not standard in mathemati-
cal literature but proved to be useful in practical programming.
Mastering FP involves practicing to write programs as “formulas translated into
code”, building up the specific kind of applied mathematical intuition, and getting
familiar with certain concepts adapted to a programmer’s needs. The FP commu-
nity has discovered a number of specific programming idioms founded on math-
ematical principles but driven by practical necessities of writing software. This
book explains the theory behind those idioms, starting from code examples and
heuristic ideas, and gradually building up the techniques of rigorous reasoning.
This chapter explored the first significant idiom of FP: iterative calculations per-
formed without loops in the style of mathematical expressions. This technique can
be used in any programming language that supports nameless functions.
Mathematics has the convention that a variable, such as $x$, does not change
its value within a formula. Indeed, there is no mathematical notation even to
talk about “changing” the value of $x$ inside the formula $x^2 + x$. It would be quite
confusing if a mathematics textbook said “before adding the last $x$ in the formula
$x^2 + x$, we change that $x$ by adding 4 to it”. If the “last $x$” in $x^2 + x$ needs to have a 4
added to it, a mathematics textbook will just write the formula $x^2 + x + 4$.
Arguments of nameless functions are also immutable. Consider, for example:
$$f(n) = \sum_{k=0}^{n} (k^2 + k) .$$
Here, $n$ is the argument of the function $f$, while $k$ is the argument of the nameless
function $k \to k^2 + k$. Neither $n$ nor $k$ can be “modified” in any sense within the
expressions where they are used. The symbols $k$ and $n$ stand for some integer
values, and these values are immutable. Indeed, it is meaningless to say that we
“modified the integer 4”. In the same way, we cannot modify $k$.
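As a minimal sketch, the summation formula for f(n) can be translated into Scala using a nameless function passed to map, in the style of Chapter 1. The value name `example` is introduced here only for illustration.

```scala
// f(n) is the sum of k * k + k over all k from 0 to n, written without loops.
// The nameless function k => k * k + k never modifies its argument k.
def f(n: Int): Int = (0 to n).map(k => k * k + k).sum

val example = f(3) // 0 + 2 + 6 + 12 = 20
```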
So, a variable in mathematics remains constant within the expression where it is
defined; in that expression, a variable is essentially a “named constant”. Of course,
a function 𝑓 can be applied to different values 𝑥, to compute a different result 𝑓 (𝑥)
each time. However, a given value of 𝑥 will remain unmodified within the body
of the function 𝑓 while 𝑓 (𝑥) is being computed.
Functional programming adopts this convention from mathematics: variables
are immutable named constants. (Scala also has mutable variables, but we will not
consider them in this book.)
In Scala, function arguments are immutable within the function body:
def f(x: Int) = x * x + x // Cannot modify `x` here.
The type of each mathematical variable (such as integer, vector, etc.) is also fixed.
Each variable is a value from a specific set (e.g., the set of all integers, the set of all
vectors, etc.). Mathematical formulas such as $x^2 + x$ do not express any “checking”
that $x$ is indeed an integer and not, say, a vector, in the middle of evaluating $x^2 + x$.
The types of all variables are checked in advance.
Functional programming adopts the same view: Each argument of each func-
tion must have a type that represents the set of possible allowed values for that
function argument. The programming language’s compiler will automatically
check the types of all arguments in advance, before the program runs. A program
that calls functions on arguments of incorrect types will not compile.
The second usage of variables in mathematics is to denote expressions that will
be reused. For example, one writes: let $z = \frac{x-y}{x+y}$ and now compute $\cos z + \cos 2z + \cos 3z$.
Again, the variable $z$ remains immutable, and its type remains fixed.
In Scala, this construction (defining an expression to be reused later) is written
with the “val” syntax. Each variable defined using “val” is a named constant, and
its type and value are fixed at the time of definition. Type annotations for “val’’s
are optional in Scala. For instance, we could write:
val x: Int = 123
1.7 Discussion
We could also omit the type annotation “:Int” and write more concisely:
val x = 123
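To connect the “val” syntax with the mathematical example of a reused expression, here is a small sketch that names the intermediate value z and then uses it three times. The function name `g` and the value name `atEqual` are assumptions made for this illustration.

```scala
// Compute cos z + cos 2z + cos 3z, where z = (x - y) / (x + y).
def g(x: Double, y: Double): Double = {
  val z = (x - y) / (x + y) // A named constant: defined once, reused below.
  math.cos(z) + math.cos(2 * z) + math.cos(3 * z)
}

val atEqual = g(1.0, 1.0) // Here z == 0, so each cosine equals 1 and the sum is 3.0.
```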
The mathematical convention is that one may rename the integration variable at
will, and so these formulas define the same function 𝑓 .
In programming, one situation when a variable “may be renamed at will” is
when the variable represents an argument of a function. We can see that the notations
$\frac{dx}{1+x}$ and $\frac{dz}{1+z}$ correspond to a nameless function whose argument was renamed
from $x$ to $z$. In FP notation, this nameless function would be denoted as $z \to \frac{1}{1+z}$,
and the integral rewritten as code such as:
integration(0, x, { z => 1.0 / (1 + z) } )
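To make this call concrete, here is a hypothetical stand-in for `integration`, written as a simple midpoint-rule approximation. The signature, the default number of steps `n`, and the names `r1`, `r2` are assumptions for this sketch and need not match any definition given elsewhere in the book. The sketch also illustrates that renaming the bound variable from z to x does not change the result.

```scala
// Hypothetical helper: approximate the integral of g over [a, b] by the midpoint rule.
def integration(a: Double, b: Double, g: Double => Double, n: Int = 1000): Double = {
  val h = (b - a) / n // Width of each of the n sub-intervals.
  (0 until n).map(i => g(a + (i + 0.5) * h)).sum * h
}

val r1 = integration(0, 1, { z => 1.0 / (1 + z) })
val r2 = integration(0, 1, { x => 1.0 / (1 + x) }) // Same function; the argument is renamed.
```

Both values approximate the natural logarithm of 2, and they are equal because the two nameless functions differ only in the name of the bound variable.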
Now compare the mathematical notations for integration and for summation:
$\int_0^x \frac{dz}{1+z}$ and $\sum_{k=0}^{100} \frac{1}{1+k}$. The integral defines a bound variable $z$ via the special symbol
“$d$”, while the summation places a bound variable $k$ in a subscript under $\sum$. The
1 Mathematical formulas as code. I. Nameless functions
10 https://en.wikipedia.org/wiki/Simpson%27s_rule
The entire code is one large expression, with a few sub-expressions (s1, s2, etc.)
defined within the local scope of the function (that is, within the function’s
body). The code contains no loops. This is similar to the way a mathematical
text would define Simpson’s rule. In other words, this code is written in the FP
paradigm. Similar code can be written in any programming language that sup-
ports nameless functions as arguments of other functions.
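As a minimal sketch of the style described above, Simpson’s rule can be written as a single expression with a few local values and no loops. The function name `simpson`, the explicit even step count `n`, and the local name `dx` are assumptions for this illustration; only the names s1 and s2 are taken from the text.

```scala
// Simpson's rule with n sub-intervals (n must be even):
// integral ≈ (dx / 3) * (g(a) + g(b) + 4 * s1 + 2 * s2).
def simpson(a: Double, b: Double, g: Double => Double, n: Int): Double = {
  require(n > 0 && n % 2 == 0, "n must be positive and even")
  val dx = (b - a) / n
  val s1 = (1 until n by 2).map(k => g(a + k * dx)).sum // Odd-numbered interior points.
  val s2 = (2 until n by 2).map(k => g(a + k * dx)).sum // Even-numbered interior points.
  (g(a) + g(b) + 4 * s1 + 2 * s2) * dx / 3
}
```

For example, simpson(0, 1, x => x * x * x, 2) returns 0.25 up to rounding, since Simpson’s rule is exact for cubic polynomials.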
2 Mathematical formulas as code.
II. Mathematical induction
We will now study more flexible ways of working with data collections in the
functional programming paradigm. The Scala standard library has methods for
performing general iterative computations, that is, computations defined by in-
duction. Translating mathematical induction into code is the focus of this chapter.
First, we need to become fluent in using tuple types with Scala collections.
The type expression (Int, String) denotes the type of this pair.
A triple is defined in Scala like this:
val b: (Boolean, Int, Int) = (true, 3, 4)
Pairs and triples are examples of tuples. A tuple can contain several values called
parts or fields of a tuple. A tuple’s parts can have different types, but the type of
each part (and the number of parts) is fixed once and for all. It is a type error to
use incorrect types in a tuple, or an incorrect number of parts:
scala> val bad: (Int, String) = (1, 2)
<console>:11: error: type mismatch;
found : Int(2)
required: String
val bad: (Int, String) = (1, 2)
^
scala> val bad: (Int, String) = (1, "a", 3)
<console>:11: error: type mismatch;
found : (Int, String, Int)
required: (Int, String)
val bad: (Int, String) = (1, "a", 3)
^
Parts of a tuple can be accessed by number, starting from 1. The Scala syntax for
tuple accessor methods looks like ._1, for example:
scala> val a = (123, "xyz")
a: (Int, String) = (123,xyz)
scala> a._1
res0: Int = 123
scala> a._2
res1: String = xyz
scala> a._5
<console>:13: error: value _5 is not a member of (Int, String)
a._5
^
Type errors are detected at compile time, before any computations begin.
Tuples can be nested such that any part of a tuple can be itself a tuple:
scala> val c: (Boolean, (String, Int), Boolean) = (true, ("abc", 3), false)
c: (Boolean, (String, Int), Boolean) = (true,(abc,3),false)
scala> c._1
res0: Boolean = true
scala> c._2
res1: (String, Int) = (abc,3)
scala> c._2._1
res2: String = abc
To define functions whose arguments are tuples, we could use the tuple acces-
sors. An example of such a function is:
def f(p: (Boolean, Int), q: Int): Boolean = p._1 && (p._2 > q)
The first argument, p, of this function, has a tuple type. The function body uses
accessor methods (._1 and ._2) to compute the result value. Note that the second
part of the tuple p is of type Int, so it is valid to compare it with an integer q. It
would be a type error to compare the tuple p with an integer using the expression
p > q. It would be also a type error to apply the function f to an argument p that
has a wrong type, e.g., the type (Int, Int) instead of (Boolean, Int).
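A short usage sketch may help here. The function f is repeated from the text, while the sample arguments and the names `r1`, `r2`, `r3` are chosen only for illustration.

```scala
def f(p: (Boolean, Int), q: Int): Boolean = p._1 && (p._2 > q)

val r1 = f((true, 10), 5)  // true: the flag is set and 10 > 5.
val r2 = f((false, 10), 5) // false: the first part of the tuple is false.
val r3 = f((true, 1), 5)   // false: 1 > 5 does not hold.
```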
2.1 Tuple types
The value g is a tuple of three integers. After defining g, we define the three
variables x, y, z at once in a single val definition. We imagine that this definition
“destructures” the data structure contained in g and decomposes it into three parts,
then assigns the names x, y, z to these parts. The types of x, y, z are also assigned
automatically.
In the example above, the left-hand side of the destructuring definition contains
the tuple pattern (x, y, z) that looks like a tuple, except that its parts are names
x, y, z that are so far undefined. These names are called pattern variables. The de-
structuring definition checks whether the structure of the value of g “matches” the
given pattern. (If g does not contain a tuple with exactly three parts, the definition
will fail.) This computation is called pattern matching.
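A destructuring definition of this kind can be sketched as follows. The specific values 1, 2, 3 and the name `total` are assumptions chosen for illustration.

```scala
val g = (1, 2, 3)     // A tuple of three integers.
val (x, y, z) = g     // The pattern (x, y, z) matches g and defines x, y, z at once.
val total = x + y + z // The pattern variables are ordinary named constants of type Int.
```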
Pattern matching is often used for working with tuples. Look at this example:
scala> (1, 2, 3) match { case (a, b, c) => a + b + c }
res0: Int = 6
The expression { case (a, b, c) => ... } is called a case expression. It performs
pattern matching on its argument. The pattern matching will “destructure” (i.e.,
decompose) a tuple and try to match it to the given pattern (a, b, c). In this
pattern, a, b, c are new, as yet undefined variables; they are called pattern variables.
If the pattern matching succeeds, the pattern variables a, b, c are assigned
their values, and the function body can proceed to perform its computation. In
this example, the pattern variables a, b, c will be assigned values 1, 2, and 3, and
so the expression evaluates to 6.
Pattern matching is especially convenient for nested tuples. Here is an example
where a nested tuple p is destructured by pattern matching:
def t1(p: (Int, (String, Int))): String = p match {
case (x, (str, y)) => str + (x + y).toString
}
The type structure of the argument (Int, (String, Int)) is visually repeated in the
pattern (x, (str, y)), making it clear that x and y become integers and str becomes
a string after pattern matching.
If we rewrite the code of t1 using the tuple accessor methods instead of pattern
matching, the code will look like this:
def t2(p: (Int, (String, Int))): String = p._2._1 + (p._1 + p._2._2).toString
This code is shorter but harder to read. For example, it is not immediately clear
that p._2._1 is a string. It is also harder to modify this code: Suppose we want to
change the type of the tuple p to ((Int, String), Int). Then the new code is:
def t3(p: ((Int, String), Int)): String = p._1._2 + (p._1._1 + p._2).toString
It takes time to verify, by going through every accessor method, that the function
t3 computes the same expression as t2. In contrast, the code is changed easily
when using the pattern matching expression instead of the accessor methods. We
only need to change the type and the pattern:
def t4(p: ((Int, String), Int)): String = p match {
case ((x, str), y) => str + (x + y).toString
}
It is easy to see that t4 and t1 compute the same result. Also, the names of pattern
variables may be chosen to get more clarity.
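To check this claim on a concrete value, we can apply t1 and t4 to tuples that hold the same data in the two nesting arrangements. The sample data and the names `r1`, `r4` are chosen only for illustration.

```scala
def t1(p: (Int, (String, Int))): String = p match {
  case (x, (str, y)) => str + (x + y).toString
}

def t4(p: ((Int, String), Int)): String = p match {
  case ((x, str), y) => str + (x + y).toString
}

val r1 = t1((1, ("a", 2))) // "a" + (1 + 2).toString
val r4 = t4(((1, "a"), 2)) // The same data with a different nesting.
```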
Sometimes we only need to use certain parts of a tuple in a pattern match. The
following syntax is used to make that clear:
scala> val (x, _, _, z) = ("abc", 123, false, true)
x: String = abc
z: Boolean = true
The underscore symbol (_) denotes the parts of the pattern that we want to ignore.
The underscore will always match any value regardless of its type.
Scala has a shorter syntax for functions such as {case (x, y) => y} that extract
elements from tuples. The syntax looks like (t => t._2) or equivalently _._2, as
illustrated here:
scala> val p: ((Int, Int )) => Int = { case (x, y) => y }
p: ((Int, Int)) => Int = <function1>
res1: Int = 2
scala> q._1(3)
res0: Int = 4
In this way, we can use the standard methods such as map, filter, max, sum to ma-
nipulate sequences of tuples. The names of the pattern variables (“fruit”, “count”)
are chosen to help us remember the meaning of the parts of tuples.
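As an illustration, here is a sketch that applies these methods to a small list of pairs. The counts in `basket` and the names `total`, `inStock`, `largest` are assumptions made for this example.

```scala
val basket: List[(String, Int)] = List(("apples", 3), ("pears", 2), ("lemons", 0))

// Sum of all counts, using a case expression to name the parts of each pair.
val total: Int = basket.map { case (fruit, count) => count }.sum

// Names of the fruits with a nonzero count; _._1 selects the first part of a pair.
val inStock: List[String] = basket.filter { case (fruit, count) => count > 0 }.map(_._1)

// The largest count; _._2 selects the second part of a pair.
val largest: Int = basket.map(_._2).max
```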
We can easily transform a list of tuples into a list of values of a different type:
scala> basket.map { case (fruit, count) =>
val isAcidic = (fruit == "lemons")
(fruit, isAcidic)
}
res3: List[(String, Boolean)] = List((apples,false), (pears,false),
(lemons,true))
In the Scala syntax, a nameless function written with braces { ... } may define
local values in its body. The return value of the function is the last expression
written in the function body. In this example, the return value of the nameless
The same result is obtained by first creating a sequence of key/value pairs and
then converting that sequence into a dictionary via the method toMap:
List(("apples", 3), ("oranges", 2), ("pears", 0)).toMap
The same method works for other collection types such as Seq, Vector, and Array.
The Scala library defines a special infix syntax for pairs via the arrow symbol
->. The expression x -> y is equivalent to the pair (x, y):
scala> "apples" -> 3
res0: (String, Int) = (apples,3)
With this syntax, the code for creating a dictionary is easier to read:
Map("apples" -> 3, "oranges" -> 2, "pears" -> 0)
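The following sketch checks that the different ways of creating a dictionary give equal results. The names `d1`, `d2`, `d3` are introduced only for this illustration.

```scala
val d1 = Map(("apples", 3), ("oranges", 2), ("pears", 0))        // Explicit pairs.
val d2 = List(("apples", 3), ("oranges", 2), ("pears", 0)).toMap // A sequence of pairs, then toMap.
val d3 = Map("apples" -> 3, "oranges" -> 2, "pears" -> 0)        // The arrow syntax for pairs.

val oranges = d3("oranges") // Look up the value for an existing key.
val missing = d3.get("kiwis") // A safe lookup returns None for an absent key.
```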
The ArrayBuffer is one of the many list-like data structures in the Scala library.
All these data structures are subtypes of the common “sequence” type Seq. The
methods defined in the Scala standard library sometimes return different imple-
mentations of the Seq type for reasons of performance.
The standard library has several methods that need tuple types, such as map and
filter (when used with dictionaries), toMap, zip, and zipWithIndex. The methods
flatten, flatMap, groupBy, and sliding also work with most collection types, including
dictionaries and sets. It is important to become familiar with these methods,
because they help in writing code that uses sequences, sets, and dictionaries. Let
us now look at these methods one by one.
The methods map and toMap Chapter 1 showed how the map method works on
sequences: the expression xs.map(f) applies a given function f to each element of
the sequence xs, gathering the results in a new sequence. In this sense, we can say
that the map method “iterates over” sequences. The map method works similarly on
dictionaries, except that iterating over a dictionary of type Map[K, V] when applying
map looks like iterating over a sequence of pairs, Seq[(K, V)]. If d: Map[K, V] is
a dictionary, the argument f of d.map(f) must be a function operating on tuples of
type (K, V). Typically, such functions are written using case expressions:
val fruitBasket = Map("apples" -> 3, "pears" -> 2, "lemons" -> 0)
When using map to transform a dictionary into a sequence of pairs, the result is
again a dictionary. But when an intermediate result is not a sequence of pairs, we
may need to use toMap:
scala> fruitBasket.map { case (fruit, count) => (fruit, count * 2) }
res1: Map[String,Int] = Map(apples -> 6, pears -> 4, lemons -> 0)
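Here is a sketch of a case where an intermediate result is not a collection of pairs, so that toMap is needed to get a dictionary back. The names `names` and `nameLengths` are assumptions made for this illustration.

```scala
val fruitBasket = Map("apples" -> 3, "pears" -> 2, "lemons" -> 0)

// The function returns a String rather than a pair, so the result is not a Map.
val names: Iterable[String] = fruitBasket.map { case (fruit, count) => fruit }

// Build pairs again and convert them into a dictionary with toMap.
val nameLengths: Map[String, Int] = names.map(name => (name, name.length)).toMap
```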
The methods zip and zipWithIndex The zip method takes two sequences and
produces a sequence of pairs, taking one element from each sequence:
scala> val s = List(1, 2, 3)
s: List[Int] = List(1, 2, 3)
scala> val t = List(true, false, true)
t: List[Boolean] = List(true, false, true)
scala> s.zip(t)
res3: List[(Int, Boolean)] = List((1,true), (2,false), (3,true))
scala> s zip t
res4: List[(Int, Boolean)] = List((1,true), (2,false), (3,true))
In the last line, the equivalent “dotless” infix syntax (s zip t) is shown to illustrate
a syntax convention of Scala that we will sometimes use.
The zip method works equally well on dictionaries: in that case, dictionaries are
automatically converted to sequences of pairs before applying zip.
The zipWithIndex method creates a sequence of pairs where the second value in
the pair is a zero-based index:
scala> List("a", "b", "c").zipWithIndex
res5: List[(String, Int)] = List((a,0), (b,1), (c,2))
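Two further uses of these methods can be sketched as follows. The sample data and the names `letters`, `evenPositions`, `truncated` are assumptions for this illustration. The sketch also shows that zip stops at the end of the shorter sequence.

```scala
val letters = List("a", "b", "c", "d")

// Keep only the elements at even (zero-based) positions.
val evenPositions: List[String] =
  letters.zipWithIndex
    .filter { case (letter, i) => i % 2 == 0 }
    .map { case (letter, i) => letter }

// When the lengths differ, the extra elements of the longer sequence are dropped.
val truncated: List[(Int, String)] = List(1, 2, 3).zip(List("x", "y"))
```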
The method flatten converts a nested sequence type, such as List[List[A]], into
a simple List[A] by concatenating all inner sequences into one:
scala> List(List(1, 2), List(2, 3), List(3, 4)).flatten
res6: List[Int] = List(1, 2, 2, 3, 3, 4)
In Scala, sequences and other collections (such as sets and dictionaries) are gener-
ally concatenated using the operation ++. For example:
scala> List(1, 2, 3) ++ List(4, 5, 6) ++ List(0)
res7: List[Int] = List(1, 2, 3, 4, 5, 6, 0)
So, one can say that the flatten method inserts the operation ++ between all the
inner sequences.
Note that flatten removes only one level of nesting at the top of the data type.
If applied to a List[List[List[Int]]], the flatten method returns a List[List[Int]]
with inner lists unchanged:
scala> List(List(List(1), List(2)), List(List(2), List(3))).flatten
res8: List[List[Int]] = List(List(1), List(2), List(2), List(3))
The method flatMap is closely related to flatten and can be seen as a shortcut,
equivalent to first applying map and then flatten:
scala> List(1, 2, 3, 4).map(n => (1 to n).toList)
res9: List[List[Int]] = List(List(1), List(1, 2), List(1, 2, 3), List(1, 2, 3,
4))
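To illustrate the stated equivalence, here is a sketch that compares map followed by flatten with a single flatMap. The names `nested`, `viaFlatten`, `viaFlatMap` are introduced only for this example.

```scala
val nested: List[List[Int]] = List(1, 2, 3, 4).map(n => (1 to n).toList)

val viaFlatten: List[Int] = nested.flatten                                 // First map, then flatten.
val viaFlatMap: List[Int] = List(1, 2, 3, 4).flatMap(n => (1 to n).toList) // The same in one step.
```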
The argument of the groupBy method is a function that computes a “key” out of each
sequence element. The key can have an arbitrarily chosen type. (In the current
example, that type is Int.) The result of groupBy is a dictionary that maps each key
to the sub-sequence of values that have that key. (In the current example, the type
of the dictionary is therefore Map[Int, Seq[String]].) The order of elements in the
sub-sequences remains the same as in the original sequence.
As another example of using groupBy, the following code will group together all
numbers that have the same remainder after division by 3:
scala> List(1, 2, 3, 4, 5).groupBy(k => k % 3)
res13: Map[Int,List[Int]] = Map(2 -> List(2, 5), 1 -> List(1, 4), 0 -> List(3))
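Since the result of groupBy is a dictionary, it can be processed further with map and a case expression. As a sketch (the names `groups` and `sizes` are assumptions for this example), we can count how many numbers fall into each group:

```scala
val groups: Map[Int, List[Int]] = List(1, 2, 3, 4, 5).groupBy(k => k % 3)

// Replace each sub-sequence by its length; the result is again a dictionary.
val sizes: Map[Int, Int] = groups.map { case (remainder, values) => (remainder, values.length) }
```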
The method sortBy sorts a sequence according to a sorting key. The argument of
sortBy is a function that computes the sorting key from a sequence element. This
gives us flexibility to sort elements in a custom way:
scala> Seq(1, 2, 3).sortBy(x => -x)
res0: Seq[Int] = List(3, 2, 1)
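The sorting key is often computed from a part of a tuple. The following sketch uses assumed sample data (`basket`) and illustrative names for the results:

```scala
val basket = Seq(("apples", 3), ("pears", 2), ("lemons", 0))

val byCount = basket.sortBy { case (fruit, count) => count }      // Increasing count.
val byCountDesc = basket.sortBy { case (fruit, count) => -count } // Decreasing count.
val byName = basket.sortBy(_._1)                                  // Alphabetical order of names.
```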
Example 2.1.5.2 Count how many times $\cos x_i > \sin x_i$ occurs in a sequence $x_i$.
Hint: use count, assume xs: Seq[Double].
Solution The method count takes a predicate and returns the number of se-
quence elements for which the predicate is true:
xs.count { x => math.cos(x) > math.sin(x) }
We could also reuse the solution of Exercise 2.1.5.1 that computed the cosine and
the sine values. The code would then become:
xs.map { x => (math.cos(x), math.sin(x)) }
.count { case (cosine, sine) => cosine > sine }
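Both versions can be tried on a small sample sequence. The sample values in `xs` and the names `direct`, `viaPairs` are assumptions chosen for this sketch.

```scala
val xs: Seq[Double] = Seq(0.0, 0.5, math.Pi / 2, math.Pi)

// Count directly with a predicate.
val direct: Int = xs.count { x => math.cos(x) > math.sin(x) }

// First build pairs (cosine, sine), then count with a case expression.
val viaPairs: Int = xs
  .map { x => (math.cos(x), math.sin(x)) }
  .count { case (cosine, sine) => cosine > sine }
```

The cosine exceeds the sine only at 0.0 and 0.5, so both computations give 2.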
Example 2.1.5.3 For given sequences $a_i$ and $b_i$ of Double values, compute the
sequence of differences $c_i = a_i - b_i$.
Hint: use zip, map, and assume as and bs have equal length.
Solution We can use zip on as and bs, which gives a sequence of pairs:
as.zip(bs): Seq[(Double, Double)]
We then compute the differences $a_i - b_i$ by applying map to this sequence:
as.zip(bs).map { case (a, b) => a - b }
Example 2.1.5.4 In a given sequence $p_i$, count how many times $p_i > p_{i+1}$ occurs.
Hint: use zip and tail.
Solution Given ps: Seq[Double], we can compute ps.tail. The result is a se-
quence that is one element shorter than ps, for example:
scala> val ps = Seq(1, 2, 3, 4)
ps: Seq[Int] = List(1, 2, 3, 4)
scala> ps.tail
res0: Seq[Int] = List(2, 3, 4)
Taking a zip of the two sequences ps and ps.tail, we get a sequence of pairs:
scala> ps.zip(ps.tail)
res1: Seq[(Int, Int)] = List((1,2), (2,3), (3,4))
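Given such a sequence of pairs of neighboring elements, the counting step can be sketched with the count method and a case expression. The function name `countDecreases` is an assumption, and the sketch assumes a non-empty input because tail is not defined for an empty sequence.

```scala
// Count the positions i where ps(i) > ps(i + 1). Assumes ps is non-empty.
def countDecreases(ps: Seq[Double]): Int =
  ps.zip(ps.tail).count { case (a, b) => a > b }

val increasing = countDecreases(Seq(1.0, 2.0, 3.0, 4.0)) // 0: no element exceeds its successor.
val mixed = countDecreases(Seq(1.0, 3.0, 2.0, 2.0, 1.0)) // 2: the pairs (3, 2) and (2, 1).
```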
Because ps.tail is one element shorter than ps, the resulting sequence of pairs
is also one element shorter than ps. So, it is not necessary to truncate ps before
Example 2.1.5.5 For a given $k > 0$, compute the sequence $c_i = \max(b_{i-k}, ..., b_{i+k})$,
starting at $i = k$.
Solution Applying the sliding method to a list gives a list of nested lists:
val b = List(1, 2, 3, 4, 5) // An example of a possible sequence `b`.
scala> b.sliding(3).toList
res0: List[List[Int]] = List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5))
For each $i$, we need to obtain a list of $2k + 1$ nearby elements $(b_{i-k}, ..., b_{i+k})$. So, we
need to use sliding(2 * k + 1) to obtain a window of the required size. Now we
can compute the maximum of each of the nested lists by using the map method on
the outer list, with the max method applied to the nested lists. So, the argument of
the map method must be the function x => x.max (where x will have type List[Int]):
def c(b: List[Int], k: Int) = b.sliding(2 * k + 1).toList.map(x => x.max)
because, in Scala, _.max is the same as the nameless function x => x.max. Test this:
scala> c(b = List(1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1), k = 1) // Write the argument names for clarity.
res0: Seq[Int] = List(3, 4, 5, 6, 6, 6, 5, 4, 3)
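As a sketch, the two spellings of the nameless function can be compared side by side. The name `cShort` and the second test with k = 2 are assumptions added for illustration.

```scala
def c(b: List[Int], k: Int): List[Int] = b.sliding(2 * k + 1).toList.map(x => x.max)

// The same computation with the shorter syntax _.max for the nameless function.
def cShort(b: List[Int], k: Int): List[Int] = b.sliding(2 * k + 1).toList.map(_.max)

val sample = List(1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1)
val k1 = cShort(sample, 1) // Windows of size 3.
val k2 = cShort(sample, 2) // Windows of size 5.
```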
We would like to get List((1,1), (1,2), (1,3)) etc., and so we use map on the inner
list with a nameless function y => (1, y) that converts a number into a tuple:
scala> List(1, 2, 3).map { y => (1, y) }
res0: List[(Int, Int)] = List((1,1), (1,2), (1,3))
The curly braces in {y => (1, y)} are only for clarity. We could also use round
parentheses and write List(1, 2, 3).map(y => (1, y)).
Now, we need to have (x, y) instead of (1, y) in the argument of map, where x
iterates over List(1, 2, 3) in the outside scope:
scala> val s = List(1, 2, 3).map(x => List(1, 2, 3).map { y => (x, y) })
s: List[List[(Int, Int)]] = List(List((1,1), (1,2), (1,3)), List((2,1), (2,2),
(2,3)), List((3,1), (3,2), (3,3)))
This is almost what we need, except that the nested lists need to be concatenated
into a single list. This is exactly what flatten does:
scala> val s = List(1, 2, 3).map(x => List(1, 2, 3).map { y => (x, y) }).flatten
s: List[(Int, Int)] = List((1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1),
(3,2), (3,3))
This is the list of keys for the required dictionary. The dictionary needs to map
each pair of integers (x, y) to x * y. To create that dictionary, we will apply toMap
to a sequence of pairs (key, value), which in our case needs to be of the form of
a nested tuple ((x, y), x * y). To achieve that, we use map with a function that
computes the product and creates those nested tuples:
scala> val s = List(1, 2, 3).flatMap(x => List(1, 2, 3).map { y => (x, y) }).
map { case (x, y) => ((x, y), x * y) }
s: List[((Int, Int), Int)] = List(((1,1),1), ((1,2),2), ((1,3),3), ((2,1),2),
((2,2),4), ((2,3),6), ((3,1),3), ((3,2),6), ((3,3),9))
We can simplify this code if we notice that we are first mapping each y to a tu-
ple (x, y), and later mapping each tuple (x, y) to a nested tuple ((x, y), x * y).
Instead, the entire computation can be done in the inner map operation:
scala> val s = List(1, 2, 3).flatMap(x => List(1, 2, 3).map { y => ((x, y), x *
y) } )
s: List[((Int, Int), Int)] = List(((1,1),1), ((1,2),2), ((1,3),3), ((2,1),2),
((2,2),4), ((2,3),6), ((3,1),3), ((3,2),6), ((3,3),9))
Applying toMap, we convert this list of tuples to a dictionary. Also, for better read-
ability, we use Scala’s pair syntax, key -> value, which is equivalent to writing the
tuple (key, value):
(1 to 10).flatMap(x => (1 to 10).map { y => (x, y) -> x * y }).toMap
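The resulting dictionary can be used as a lookup table keyed by pairs of integers. In this sketch, the name `table` and the sample lookups are assumptions for illustration.

```scala
val table: Map[(Int, Int), Int] =
  (1 to 10).flatMap(x => (1 to 10).map { y => (x, y) -> x * y }).toMap

val product = table((3, 7)) // Look up the value stored under the key (3, 7).
val entries = table.size    // One entry per pair (x, y) with x and y from 1 to 10.
```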
Example 2.1.5.7 For a given sequence $x_i$, compute the maximum of all of the
numbers $x_i$, $x_i^2$, $\cos x_i$, $\sin x_i$. Hint: use flatMap and max.
Solution We will compute the required value if we take max of a list containing
all of the numbers. To do that, first map each element of the list xs: Seq[Double]
Example 2.1.5.9 Write the solution of Example 2.1.5.8 as a function with type
parameters Name and Addr instead of the fixed type String.
Solution In Scala, the syntax for type parameters in a function definition is:
def rev[Name, Addr](...) = ...
The type of the argument is Map[Name, Addr], while the type of the result is Map[Addr,
Name]. So, we use the type parameters Name and Addr in the type signature of the
function. The final code is:
def rev[Name, Addr](dict: Map[Name, Addr]): Map[Addr, Name] =
dict.map { case (name, addr) => (addr, name) }
The body of the function rev remains the same as in Example 2.1.5.8; only the type
signature changes. This is because the function rev works in the same way for
dictionaries of any type. For this reason, it was easy for us to change the specific
type String into type parameters in that function.
When the function rev is applied to a dictionary of a specific type, the Scala
compiler will automatically set the type parameters Name and Addr that fit the re-
quired types of the dictionary’s keys and values. For example, if we apply rev
to a dictionary of type Map[Boolean, Seq[String]], the type parameters will be set
automatically as Name = Boolean and Addr = Seq[String]:
scala> val d = Map(true -> Seq("x", "y"), false -> Seq("z", "t"))
d: Map[Boolean, Seq[String]] = Map(true -> List(x, y), false -> List(z, t))
scala> rev(d)
res0: Map[Seq[String], Boolean] = Map(List(x, y) -> true, List(z, t) -> false)
Type parameters can be also set explicitly when using the function rev. If the type
parameters are chosen incorrectly, the program will not compile:
scala> rev[Boolean, Seq[String]](d)
res1: Map[Seq[String],Boolean] = Map(List(x, y) -> true, List(z, t) -> false)
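The same function can be applied to dictionaries of other types without any changes. In the following sketch, the sample data and the names `byName`, `byId`, `roundTrip` are hypothetical. Reversing twice restores the original dictionary here because all values are distinct.

```scala
def rev[Name, Addr](dict: Map[Name, Addr]): Map[Addr, Name] =
  dict.map { case (name, addr) => (addr, name) }

val byName: Map[String, Int] = Map("alice" -> 1, "bob" -> 2)
val byId: Map[Int, String] = rev(byName)   // Type parameters are set to Name = String, Addr = Int.
val roundTrip: Map[String, Int] = rev(byId) // Now Name = Int and Addr = String.
```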
Solution Begin by grouping the words by length. The library method groupBy
takes a function that computes a “grouping key” from each element of a sequence.
To group by word length (computed via the method length), we write:
words.groupBy { word => word.length }
It remains to swap the length and the list of words and to sort the result by in-
creasing length. We can do this in any order: first sort, then swap; or first swap,
then sort. The final code is:
words
.groupBy(_.length)
.toSeq
.sortBy { case (len, words) => len }
.map { case (len, words) => (words, len) }
This can be written somewhat shorter if we use the code _._1 (equivalent to x =>
x._1) for selecting the first parts from pairs and swap for swapping the two elements
of a pair:
words.groupBy(_.length).toSeq.sortBy(_._1).map(_.swap)
In computations like this, the Scala compiler verifies at each step that the oper-
ations are applied to values of the correct types. Writing down the intermediate
types will help us write correct code.
For instance, sortBy is defined for sequences but not for dictionaries, so it would
be a type error to apply sortBy to a dictionary without first converting it to a se-
quence using toSeq. The type of the intermediate result after toSeq is Seq[ (Int,
Seq[String]) ], and the sortBy operation is applied to that sequence. So, the se-
quence element matched by { case (len, words) => len } is a tuple having the type
(Int, Seq[String]). Then the pattern variables len and words must have types Int
and Seq[String] respectively.
If we visualize how the type of the sequence should change at every step, we
can more quickly understand how to implement the required task. Begin by writ-
ing down the intermediate types that would be needed during the computation:
words: Seq[String] // After groupBy() by word length, will have type:
Map[Int, Seq[String]] // To sort by word length, convert to a sequence:
Seq[ (Int, Seq[String]) ] // Sort by the `Int` value; type is unchanged:
Seq[ (Int, Seq[String]) ] // It remains to swap the parts of the tuples:
Seq[ (Seq[String], Int) ] // We are done.
Having written down these types, we are better assured that the computation can
be done correctly. Writing the code becomes straightforward, since we are guided
by the already known types of the intermediate results:
words.groupBy(_.length).toSeq.sortBy(_._1).map(_.swap)
This example illustrates the main benefits of reasoning about types: it gives
direct guidance about how to organize the computation, together with a greater
confidence about code correctness.
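One way to practice this kind of reasoning is to write each intermediate result as a separate val with an explicit type annotation, so that the compiler checks every step. The sample `words` and the names of the intermediate values are assumptions made for this sketch.

```scala
val words: Seq[String] = Seq("the", "food", "is", "good")

val grouped: Map[Int, Seq[String]] = words.groupBy(_.length) // Group by word length.
val pairs: Seq[(Int, Seq[String])] = grouped.toSeq           // Convert to a sequence of pairs.
val sorted: Seq[(Int, Seq[String])] = pairs.sortBy(_._1)     // Sort by length; type is unchanged.
val result: Seq[(Seq[String], Int)] = sorted.map(_.swap)     // Swap the parts of each pair.
```

If any annotation does not agree with the actual type of the expression, the code will not compile, which points directly to the step that needs to be corrected.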
Exercise 2.1.7.7 Given p: Seq[String] and q: Seq[Int] of equal length and assum-
ing that values in q do not repeat, compute a Map[Int, String] mapping numbers
from q to the corresponding strings from p.
Exercise 2.1.7.8 Write the solution of Exercise 2.1.7.7 as a function with type pa-
rameters P and Q instead of the fixed types Int and String. The function’s argu-
ments should be of types Seq[Q] and Seq[P], and the return type should be Map[P,
Q]. Run some tests using types P = Double and Q = Set[Boolean].
the output must be: Map("apple" -> 10, "pear" -> 3, "lemon" -> 2).
Hint: use groupBy, map, sum.
Exercise 2.1.7.11 (a) Given two sets, p: Set[Int] and q: Set[Int], compute a set
of type Set[(Int, Int)] as the Cartesian product of the sets p and q. This is the set
of all pairs (x, y) where x is an element from p and y is an element from q.
(b) Implement this computation as a function with type parameters I, J instead
of Int. The required type signature and a sample test:
def cartesian[I, J](p: Set[I], q: Set[J]): Set[(I, J)] = ???
Note that the same task for integer numbers (instead of floating-point numbers)
can be implemented via length, map, sum, and zip:
def digitsToInt(ds: Seq[Int]): Int = {
val n = ds.length
// Compute a sequence of powers of 10, e.g., [1000, 100, 10, 1].
val powers: Seq[Int] = (0 to n - 1).map(k => math.pow(10, n - 1 - k).toInt)
// Sum the powers of 10 with coefficients from `ds`.
(ds zip powers).map { case (d, p) => d * p }.sum
}
For this task, the required computation can be written as the formula:
$$r = \sum_{k=0}^{n-1} d_k * 10^{n-1-k} .$$
The sequence of powers of 10 can be computed separately and “zipped” with the
sequence of digits $d_k$. However, for floating-point numbers, the sequence of powers
of 10 depends on the position of the “dot” character. Methods such as map or zip
cannot compute a sequence whose next elements depend on previous elements
and the dependence is described by some custom function.
2.2 Converting a sequence into a single value
• The base case of the induction: We need to specify what value the function
f returns for an empty sequence, Seq(). The standard method isEmpty can be
used to detect empty sequences. In case the function f is defined only for
non-empty sequences, we need to specify what the function f returns for a
one-element sequence such as Seq(x), with any x.
• The inductive step: Assuming that the function f is already computed for
some sequence xs (the inductive assumption), how to compute the function
f for a sequence with one more element x? The sequence with one more
element is written as xs :+ x. So, we need to specify how to compute f(xs
:+ x) assuming that f(xs) is already known.
Once these two computations are specified, the function f is defined (and can in
principle be computed) for an arbitrary input sequence.
With this approach, the inductive definition of the method sum looks like this:
The base case is that the sum of an empty sequence is 0. That is, Seq().sum ==
0. The inductive step says that when the result xs.sum is already known for a
sequence xs, and we have a sequence that has one more element x, then the new
result is equal to xs.sum + x. In code, this is (xs :+ x).sum == xs.sum + x.
The inductive definition of the function digitsToInt goes like this: The base case
is an empty sequence of digits, Seq(), and the result is 0. This is a convenient
base case even if we never need to apply digitsToInt to an empty sequence. The
inductive step: If digitsToInt(xs) is already known for a sequence xs of digits, and
we have a sequence xs :+ x with one more digit x, then:
digitsToInt(xs :+ x) == digitsToInt(xs) * 10 + x
Let us write inductive definitions for the methods length, max, and count.
The method length Base case: The length of an empty sequence is zero, so we
write: Seq().length == 0.
Inductive step: if xs.length is known then (x +: xs).length == xs.length + 1.
The method max The maximum element of a sequence is undefined for empty
sequences.
Base case: for a one-element sequence, Seq(x).max == x.
Inductive step: if xs.max is known then (x +: xs).max == math.max(x, xs.max).
The method count computes the number of a sequence’s elements satisfying a
predicate p.
Base case: for an empty sequence, Seq().count(p) == 0.
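These equations can be checked on sample values using the standard library methods. In the sketch below, the sample data `xs`, `x`, the predicate `p`, and the names of the checks are assumptions. The last check uses the usual inductive step for count (add 1 when the new element satisfies the predicate), which is stated here as an assumption of this sketch.

```scala
val xs = Seq(3, 1, 4)
val x = 5
val p: Int => Boolean = n => n % 2 == 1 // A sample predicate: "n is odd".

// Base cases.
val lengthBase = Seq().length == 0
val maxBase = Seq(x).max == x
val countBase = Seq[Int]().count(p) == 0

// Inductive steps, with the new element x prepended to xs.
val lengthStep = (x +: xs).length == xs.length + 1
val maxStep = (x +: xs).max == math.max(x, xs.max)
val countStep = (x +: xs).count(p) == xs.count(p) + (if (p(x)) 1 else 0)
```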
In this way, we show that the property holds for x +: xs and ys assuming it holds
for xs and ys.
There are two main ways of translating mathematical induction into code. The
first way is to write a recursive function. The second way is to use a standard li-
brary function, such as foldLeft or reduce. Often it is better to use library functions,
but sometimes the code is more transparent when recursion is explicit. So, let us
consider each of these ways in turn.
The if/else expression separates the base case from the inductive step. In the
inductive step, it is convenient to split the given sequence s into its first element
x, or the “head” of s, and the remainder (“tail”) sequence xs. So, we split s as s = x
+: xs rather than as s = xs :+ x.
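As an illustration of this head/tail split, a recursive length function could be sketched as follows. The name lengthS matches the function discussed later in this section; the body shown here is an assumption, not necessarily the exact code the text refers to.

```scala
// A minimal sketch: the if/else separates the base case from the inductive step.
def lengthS(s: Seq[Int]): Int =
  if (s.isEmpty) 0          // Base case: an empty sequence has length 0.
  else 1 + lengthS(s.tail)  // Inductive step: split s = x +: xs and recurse on xs.
```

The recursive call sits inside the expression 1 + lengthS(...), so this sketch is not tail-recursive.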
For computing the sum of a numerical sequence, the order of summation does
not matter. But the order of operations will matter for many other computational
tasks. We will need to choose whether the inductive step should split the sequence
as s = x +: xs or as s = xs :+ x, depending on the task at hand.
Let us implement digitsToInt according to the inductive definition shown in
Section 2.2.1:
def digitsToInt(s: Seq[Int]): Int = if (s.isEmpty) 0 else {
  val x = s.last // To split s = xs :+ x, compute x
  val xs = s.init // and xs.
  digitsToInt(xs) * 10 + x // Call digitsToInt(...) recursively.
}
In this example, it is important to split the sequence s into xs :+ x and not into x
+: xs. The reason is that digits increase their numerical value from right to left, so
the correct result is computed as digitsToInt(xs) * 10 + x if we split s into xs :+ x.
For that splitting, we use the standard library methods init and last.
These examples show how mathematical induction is converted into recursive
code. This approach often works but has two technical problems. The first prob-
lem is that the code will fail due to a stack overflow when the input sequence s is
long enough. In the next subsection, we will see how this problem is solved (at
least in some cases) using tail recursion.
The second problem is that all inductively defined functions will use the same
code for checking the base case and for splitting the sequence s into the subse-
quence xs and the extra element x. This repeated common code can be put into
a library function, and the Scala library provides such functions. We will look at
using them in Section 2.2.4.
scala> lengthS(s)
java.lang.StackOverflowError
at .lengthS(<console>:12)
at .lengthS(<console>:12)
at .lengthS(<console>:12)
...
The problem is not due to insufficient main memory: we are able to compute
and hold in memory the entire sequence s. The problem is with the code of the
function lengthS. This function calls itself inside the expression 1 + lengthS(...).
Let us visualize how the computer evaluates that code:
lengthS(Seq(1, 2, ..., 100000))
= 1 + lengthS(Seq(2, ..., 100000))
= 1 + (1 + lengthS(Seq(3, ..., 100000)))
= ...
The code of lengthS will repeat the inductive step, that is, the “else” part of the
“if/else”, about 100000 times. Each time, the intermediate sub-expression with
nested computations 1 + (1 + (...)) will get larger. That sub-expression needs
to be held somewhere in memory until the function body goes into the base case,
with no more recursive calls. When that happens, the intermediate sub-expression
will contain about 100000 nested function calls still waiting to be evaluated. A
special area of memory called stack memory is dedicated to storing the arguments
for all not-yet-evaluated nested function calls. Due to the way computer memory
is managed, the stack memory has a fixed size and cannot grow automatically. So,
when the intermediate expression becomes large enough, it causes an overflow of
the stack memory and crashes the program.
One way to avoid stack overflows is to use a trick called tail recursion. Using
tail recursion means rewriting the code so that all recursive calls occur at the end
positions (at the “tails”) of the function body. In other words, each recursive call
must be itself the last computation in the function body, rather than placed inside
other computations. Here is an example of tail-recursive code:
def lengthT(s: Seq[Int], res: Int): Int =
  if (s.isEmpty) res
  else lengthT(s.tail, res + 1)
In this code, one of the branches of the if/else returns a fixed value without doing
2.2 Converting a sequence into a single value
any recursive calls, while the other branch returns the result of a recursive call to
lengthT(...).
It is not a problem that the recursive call to lengthT has some sub-expressions
such as res + 1 as its arguments, because all these sub-expressions will be com-
puted before lengthT is recursively called. The recursive call to lengthT is the last
computation performed by this branch of the if/else. A tail-recursive function
can have many if/else or match/case branches, with or without recursive calls; but
all recursive calls must be always the last expressions returned.
The Scala compiler will always use tail recursion when possible. Additionally,
Scala has a feature for verifying that a function’s code is tail-recursive: the tailrec
annotation. If a function with a tailrec annotation is not tail-recursive (or is not
recursive at all), the program will not compile.
The code of lengthT with a tailrec annotation looks like this:
import scala.annotation.tailrec
(The import declaration is needed whenever the code uses the tailrec annotation.)
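Combining the import with the definition of lengthT shown earlier, the annotated function would presumably read as follows. This is a sketch: only the annotation is new relative to the earlier code.

```scala
import scala.annotation.tailrec

// The compiler rejects this definition if the recursive call is not in tail position.
@tailrec def lengthT(s: Seq[Int], res: Int): Int =
  if (s.isEmpty) res
  else lengthT(s.tail, res + 1)
```

With the annotation in place, a long input such as (1 to 100000).toList is processed without growing the stack.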
Let us trace the evaluation of this function on an example:
lengthT(Seq(1, 2, 3), 0)
= lengthT(Seq(2, 3), 0 + 1) // = lengthT(Seq(2, 3), 1)
= lengthT(Seq(3), 1 + 1) // = lengthT(Seq(3), 2)
= lengthT(Seq(), 2 + 1) // = lengthT(Seq(), 3)
= 3
When length is implemented like that, users will not be able to call lengthT directly,
because lengthT is only visible within the body of the length function.
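The arrangement just described, with lengthT nested inside length, might look like the sketch below. The type parameter A and the argument name rest are assumptions made for this illustration.

```scala
import scala.annotation.tailrec

def length[A](s: Seq[A]): Int = {
  // The helper is local to `length`, so callers cannot pass a wrong initial accumulator.
  @tailrec def lengthT(rest: Seq[A], res: Int): Int =
    if (rest.isEmpty) res
    else lengthT(rest.tail, res + 1)

  lengthT(s, 0) // The accumulator always starts at 0.
}
```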
Another possibility in Scala is to use a default value for the res argument:
@tailrec def length[A](s: Seq[A], res: Int = 0): Int =
  if (s.isEmpty) res
  else length(s.tail, res + 1)
Giving a default value for a function argument is the same as defining two func-
tions: one with that argument and one without. For example, the syntax:
def f(x: Int, y: Boolean = false): Int = ... // Function body.
is equivalent to defining two functions with the same name but different numbers
of arguments:
def f(x: Int, y: Boolean): Int = ... // Define the function body here.
def f(x: Int): Int = f(x, false) // Call the function defined above.
Using a default argument, we can define the tail-recursive helper function and the
main function at once, making the code shorter.
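A small self-contained illustration of default arguments follows. The function scale is hypothetical and serves only to show both ways of calling a function that has a default value.

```scala
// Hypothetical function, used only to illustrate default arguments.
def scale(x: Int, factor: Int = 2): Int = x * factor

val a = scale(10)    // Uses the default value: factor = 2.
val b = scale(10, 3) // Overrides the default value.
```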
The accumulator trick works in a large number of cases, but it may be not ob-
vious how to introduce the accumulator argument, what its initial value must be,
and how to define the inductive step for the accumulator. In the example with the
lengthT function, the accumulator trick works because of the special mathematical
property of the expression being computed:
$$ 1 + (1 + (1 + (\ldots + 1))) = (((1 + 1) + 1) + \ldots) + 1 . $$
This equation follows from the associativity law of addition. So, the computation
can be rearranged to group all additions to the left. During the evaluation, the
accumulator’s value corresponds to a certain number of left-grouped parentheses,
((0 + 1) ...) + 1. In code, it means that intermediate expressions are fully computed
before making recursive calls. So, recursive calls always occur outside all other
sub-expressions — that is, in tail positions. There are no sub-expressions that
need to be stored on the stack until all the recursive calls are complete.
However, not all computations can be rearranged in that way. Even if a code
rearrangement exists, it may not be immediately obvious how to find it.
An example is a tail-recursive version of the function digitsToInt from the pre-
vious subsection, where the sub-expression digitsToInt(xs) * 10 + x was a non-
tail-recursive call. To transform the code into a tail-recursive form, we need to
rearrange the computation:
so that the multiplications group to the left. We can do this by rewriting 𝑟 as:
It follows that the digit sequence s must be split into the leftmost digit and the rest,
s == s.head +: s.tail. So, a tail-recursive implementation of the above formula is:
@tailrec def fromDigits(s: Seq[Int], res: Int = 0): Int =
  // `res` is the accumulator.
  if (s.isEmpty) res
  else fromDigits(s.tail, 10 * res + s.head)
Despite a similarity between this code and the code of digitsToInt from the previ-
ous subsection, the implementation of fromDigits cannot be directly derived from
the inductive definition of digitsToInt. We need a separate proof that fromDigits(s,
0) computes the same result as digitsToInt(s). This can be proved by using the
following property:
Statement 2.2.3.1 For any s: Seq[Int] and r: Int, the following equation holds:
fromDigits(s, r) == digitsToInt(s) + r * math.pow(10, s.length)
Proof We use induction on the length of s. To shorten the proof, denote
sequences by [1, 2, 3] instead of Seq(1, 2, 3) and temporarily write $d(s)$ instead of
digitsToInt(s) and $f(s, r)$ instead of fromDigits(s, r). Then an inductive
definition of $f(s, r)$ is:
$$ f([], r) = r , \qquad f([x] +\!+ s, r) = f(s, 10 * r + x) . \tag{2.1} $$
In this notation, Statement 2.2.3.1 reads:
$$ f(s, r) = d(s) + r * 10^{|s|} . \tag{2.2} $$
We prove Eq. (2.2) by induction. For the base case $s = []$, we have $f([], r) = r$
and $d([]) + r * 10^0 = r$ since $d([]) = 0$ and $|s| = 0$. The resulting equality $r = r$
proves the base case.
To prove the inductive step, we assume that Eq. (2.2) holds for a given sequence
$s$. Then write the inductive step. We use the symbol $\overset{?}{=}$ to denote equations
we still need to prove:
$$ f([x] +\!+ s, r) \overset{?}{=} d([x] +\!+ s) + r * 10^{|s|+1} . \tag{2.3} $$
We will transform the left-hand side and the right-hand side separately, hoping to
obtain the same expression. The left-hand side of Eq. (2.3) is:
$$ \begin{aligned}
& f([x] +\!+ s, r) \\
\text{use Eq. (2.1)}: \quad & = f(s, 10 * r + x) \\
\text{use Eq. (2.2)}: \quad & = d(s) + (10 * r + x) * 10^{|s|} .
\end{aligned} $$
The right-hand side of Eq. (2.3) contains $d([x] +\!+ s)$, which we now need to rewrite.
Assuming that $d(s)$ correctly calculates a number from its digits, we use a
property of decimal notation: a digit $x$ in front of $n$ other digits has the value
$x * 10^n$. This property can be formulated as an equation:
$$ d([x] +\!+ s) = x * 10^{|s|} + d(s) . \tag{2.4} $$
So, the right-hand side of Eq. (2.3) can be rewritten as:
$$ \begin{aligned}
& d([x] +\!+ s) + r * 10^{|s|+1} \\
\text{use Eq. (2.4)}: \quad & = x * 10^{|s|} + d(s) + r * 10^{|s|+1} \\
\text{factor out } 10^{|s|}: \quad & = d(s) + (10 * r + x) * 10^{|s|} .
\end{aligned} $$
We have successfully transformed both sides of Eq. (2.3) to the same expression.
We have not yet proved that the function $d$ satisfies the property in Eq. (2.4).
That proof also uses induction. Begin by writing the code of $d$ in a short notation:
$$ d([]) = 0 , \qquad d(s +\!+ [y]) = d(s) * 10 + y . \tag{2.5} $$
The base case is Eq. (2.4) with $s = []$. It is proved by:
$$ x = d([] +\!+ [x]) = d([x] +\!+ []) = x * 10^0 + d([]) = x . $$
The inductive step assumes Eq. (2.4) for a given $x$ and a given sequence $s$, and
needs to prove that for any $y$, the same property holds with $s +\!+ [y]$ instead of $s$:
$$ d([x] +\!+ s +\!+ [y]) \overset{?}{=} x * 10^{|s|+1} + d(s +\!+ [y]) . \tag{2.6} $$
The left-hand side of Eq. (2.6) is transformed into its right-hand side like this:
$$ \begin{aligned}
& d([x] +\!+ s +\!+ [y]) \\
\text{use Eq. (2.5)}: \quad & = d([x] +\!+ s) * 10 + y \\
\text{use Eq. (2.4)}: \quad & = (x * 10^{|s|} + d(s)) * 10 + y \\
\text{expand parentheses}: \quad & = x * 10^{|s|+1} + d(s) * 10 + y \\
\text{use Eq. (2.5)}: \quad & = x * 10^{|s|+1} + d(s +\!+ [y]) .
\end{aligned} $$
This establishes Eq. (2.6) and concludes the proof.
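Statement 2.2.3.1 can also be spot-checked numerically. The sketch below repeats the two definitions from this section and compares both sides of the equation for given inputs. The helpers pow10 and checkStatement are hypothetical names introduced here; pow10 uses integer arithmetic in place of math.pow, which returns a Double. Such a check illustrates the statement for particular inputs and does not replace the proof.

```scala
import scala.annotation.tailrec

def digitsToInt(s: Seq[Int]): Int =
  if (s.isEmpty) 0 else digitsToInt(s.init) * 10 + s.last

@tailrec def fromDigits(s: Seq[Int], res: Int = 0): Int =
  if (s.isEmpty) res else fromDigits(s.tail, 10 * res + s.head)

// Hypothetical helper: integer power of 10 (the product of an empty sequence is 1).
def pow10(n: Int): Int = Seq.fill(n)(10).product

// Hypothetical helper: compares both sides of Statement 2.2.3.1 for given s and r.
def checkStatement(s: Seq[Int], r: Int): Boolean =
  fromDigits(s, r) == digitsToInt(s) + r * pow10(s.length)
```

For s = [2, 4, 0, 5] and r = 7, both sides evaluate to 72405. Inputs must be small enough to avoid Int overflow.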
• (Base case.) For an empty sequence, we have f(Seq()) = b0, where b0: B is a
given value.
The code for f is written using recursion and the methods init and last:
def f[A, B](s: Seq[A]): B =
  if (s.isEmpty) b0
  else g(s.last, f(s.init))
We can now refactor this code into a generic utility function, by turning b0 and g
into parameters. A possible implementation is:
def leftFold[A, B](s: Seq[A], b: B, g: (A, B) => B): B =
  if (s.isEmpty) b
  else g(s.last, leftFold(s.init, b, g))
We call this function a “left fold” because it aggregates (or “folds”) the sequence
starting from the leftmost element.
In this way, we have defined a general method of computing any inductively
defined aggregation function on a sequence. The function leftFold implements
the logic of aggregation defined via mathematical induction. Using leftFold, we
can write concise implementations of methods such as sum, max, and many other
aggregation functions. The method leftFold already contains all the code neces-
sary to set up the base case and the inductive step. The programmer just needs to
specify the expressions for the initial value b and for the updater function g.
As a first example, let us use leftFold for implementing the sum method:
def sum(s: Seq[Int]): Int = leftFold(s, 0, (x, y) => x + y )
To understand in detail how leftFold works, let us trace the evaluation of this
function when applied to Seq(1, 2, 3):
sum(Seq(1, 2, 3)) == leftFold(Seq(1, 2, 3), 0, g)
// Here, g = (x, y) => x + y, so g(x, y) = x + y.
== leftFold(Seq(2, 3), g(0, 1), g) // g(0, 1) = 1.
== leftFold(Seq(2, 3), 1, g) // Now expand the code of `leftFold`.
== leftFold(Seq(3), g(1, 2), g) // g(1, 2) = 3; expand the code.
== leftFold(Seq(), g(3, 3), g) // g(3, 3) = 6; expand the code.
== 6
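The trace above consumes the sequence from the left and passes the accumulator as the first argument of g. A tail-recursive variant consistent with that trace could be sketched as follows. The argument order g: (B, A) => B and the explicit lambda parameter types are assumptions made for this sketch.

```scala
import scala.annotation.tailrec

// The accumulator b is updated before each recursive call, so the call is in tail position.
@tailrec def leftFold[A, B](s: Seq[A], b: B, g: (B, A) => B): B =
  if (s.isEmpty) b
  else leftFold(s.tail, g(b, s.head), g)

// Explicit parameter types keep type inference simple across Scala versions.
def sum(s: Seq[Int]): Int = leftFold[Int, Int](s, 0, (x: Int, y: Int) => x + y)
```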
The second argument of leftFold is the accumulator argument. The initial value of
the accumulator is specified when first calling leftFold. At each iteration, the new
accumulator value is computed by calling the updater function g, which uses the
previous accumulator value and the value of the next sequence element. To visu-
alize the process of recursive evaluation, it is convenient to write a table showing
the sequence elements and the accumulator values as they are updated:
Current element x    Old accumulator value    New accumulator value
1                    0                        1
2                    1                        3
3                    3                        6
The accumulator has type Int, while the sequence elements can have an arbitrary
type, parameterized by A. The foldLeft method works in the same way for all
types of accumulators and all types of sequence elements.
Since foldLeft is tail-recursive, stack overflows will not occur even with long
sequences. The method foldLeft is available in the Scala library for all collections,
including dictionaries and sets.
It is important to gain experience using the foldLeft method. The Scala library
contains several other methods similar to foldLeft, such as foldRight, fold, and
reduce. In the following sections, we will mostly focus on foldLeft because the
other fold-like operations are similar.
If we are sure that the function will never be called on empty sequences, we can
implement max in a simpler way by using the reduce method:
def max(s: Seq[Int]): Int = s.reduce { (x, y) => if (y > x) y else x }
Example 2.2.5.2 For a given non-empty sequence xs: Seq[Double], compute the
minimum, the maximum, and the mean as a tuple $(x_{\min}, x_{\max}, x_{\text{mean}})$. The
sequence should be traversed only once; i.e., the entire code must be xs.foldLeft(...),
using foldLeft only once.
Solution Without the requirement of using a single traversal, we would write:
(xs.min, xs.max, xs.sum / xs.length)
However, this code traverses xs at least three times, since each of the aggregations
xs.min, xs.max, and xs.sum iterates over xs. We need to combine the four inductive
definitions of min, max, sum, and length into a single inductive definition of some
function. What is the type of that function’s return value? We need to accumulate
intermediate values of all four numbers (min, max, sum, and length) in a tuple. So,
the required type of the accumulator is (Double, Double, Double, Int). To avoid
repeating a long type expression, we can define a type alias for it, say, D4:
scala> type D4 = (Double, Double, Double, Int)
defined type alias D4
The updater updates each of the four numbers according to the definitions of their
inductive steps:
def update(p: D4, x: Double): D4 = p match { case (min, max, sum, length) =>
  (math.min(x, min), math.max(x, max), x + sum, length + 1)
}
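To complete the picture, the updater can be combined with an initial accumulator value and a single foldLeft. The sketch below is one possible arrangement: the initial value and the wrapper name minMaxMean are assumptions. The infinities act as neutral elements for math.min and math.max.

```scala
type D4 = (Double, Double, Double, Int)

def update(p: D4, x: Double): D4 = p match { case (min, max, sum, length) =>
  (math.min(x, min), math.max(x, max), x + sum, length + 1)
}

// Assumed initial value: neutral elements for min, max, sum, and length.
val init: D4 = (Double.PositiveInfinity, Double.NegativeInfinity, 0.0, 0)

// Hypothetical wrapper; xs must be non-empty for the mean to be defined.
def minMaxMean(xs: Seq[Double]): (Double, Double, Double) = {
  val (min, max, sum, length) = xs.foldLeft(init)(update) // A single traversal.
  (min, max, sum / length)
}
```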
Example 2.2.5.3 Implement the map method for sequences by using foldLeft. The
input sequence should be of type Seq[A] and the output sequence of type Seq[B],
where A and B are type parameters. The required type signature of the function
and a sample test:
def map[A, B](xs: Seq[A])(f: A => B): Seq[B] = ???
Solution The required code should build a new sequence by applying the
function f to each element. How can we build a new sequence using foldLeft?
The evaluation of foldLeft consists of iterating over the input sequence and accu-
mulating some result value, which is updated at each iteration. Since the result of
a foldLeft is always equal to the last computed accumulator value, it follows that
the new sequence should be that accumulator value. So, we need to update the
accumulator by appending the value f(x), where x is the current element of the
input sequence:
def map[A, B](xs: Seq[A])(f: A => B): Seq[B] =
  xs.foldLeft(Seq[B]()) { (acc, x) => acc :+ f(x) }
Example 2.2.5.5 Implement the function digitsToDouble using foldLeft. The argu-
ment is of type Seq[Char]. As a test, digitsToDouble(Seq('3','4','.','2','5')) must
evaluate to 34.25. Assume that all input characters are either digits or a dot (so,
negative numbers are not supported).
While the dot character was not yet seen, the updater function multiplies the
previous result by 10 and adds the current digit. After the dot character, the up-
dater function must add to the previous result the current digit divided by a factor
that represents increasing powers of 10. In other words, the update computation
$n' = g(n, c)$ must be defined by:
$$ g(n, c) = \begin{cases}
n * 10 + c & \text{if the digit is before the dot} \\
n + c / f & \text{if after the dot, where } f = 10, 100, 1000, \ldots \text{ for each new digit}
\end{cases} $$
The updater function 𝑔 has only two arguments: the current digit and the previ-
ous accumulator value. So, the changing factor 𝑓 must be part of the accumulator
value, and must be multiplied by 10 at each digit after the dot. If the factor 𝑓 is not
a part of the accumulator value, the function 𝑔 will not have enough information
for computing the next accumulator value correctly. So, the updater computation
must be $n' = g(n, c, f)$, not $n' = g(n, c)$.
For this reason, we choose the accumulator type as a tuple (Double, Boolean,
Double) where the first number is the result 𝑛 computed so far, the Boolean flag
indicates whether the dot was already seen, and the third number is 𝑓 , that is,
the power of 10 by which the current digit will be divided if the dot was already
seen. Initially, the accumulator tuple will be equal to (0.0, false, 10.0). Then the
updater function is implemented like this:
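A sketch of such an updater, together with the function that calls foldLeft, is given below. It follows the accumulator layout described above, (result so far, dot-seen flag, current factor). The names update and initAcc, and the conversion of a digit character via c - '0', are assumptions of this sketch.

```scala
def update(acc: (Double, Boolean, Double), c: Char): (Double, Boolean, Double) =
  acc match { case (num, flag, factor) =>
    if (c == '.') (num, true, factor) // Record that the dot was seen.
    else {
      val digit = c - '0' // Numeric value of a digit character.
      if (flag) (num + digit / factor, flag, factor * 10) // Digit after the dot.
      else (num * 10 + digit, flag, factor)               // Digit before the dot.
    }
  }

def digitsToDouble(d: Seq[Char]): Double = {
  val initAcc = (0.0, false, 10.0)
  val (num, _, _) = d.foldLeft(initAcc)(update) // Only the first part is needed.
  num
}
```

Because the digits after the dot are accumulated in floating point, results should be compared to expected values within a small tolerance.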
The result of calling d.foldLeft is a tuple (num, flag, factor), in which only the first
part, num, is needed. In Scala’s pattern matching syntax, the underscore (_) denotes
pattern variables whose values are not needed in the code. We could get the first
part using the accessor method ._1, but the code will be more readable if we show
all parts of the tuple (num, _, _).
Example 2.2.5.6 Implement a function toPairs that converts a sequence of type
Seq[A] to a sequence of pairs, Seq[(A, A)], by putting together the adjacent ele-
ments pairwise. If the initial sequence has an odd number of elements, a given
default value of type A is used to fill the last pair. The required type signature and
an example test:
def toPairs[A](xs: Seq[A], default: A): Seq[(A, A)] = ???
Solution We need to accumulate a sequence of pairs, and each pair needs two
values. However, we iterate over values in the input sequence one by one. So,
a new pair can be made only once every two iterations. The accumulator needs
to hold the information about the current iteration being even or odd. For odd-
numbered iterations, the accumulator also needs to store the previous element
that is still waiting for its pair. Therefore, we choose the type of the accumulator
to be a tuple (Seq[(A, A)], Seq[A]). The first sequence is the intermediate
result, and the second sequence is the “holdover”: it holds the previous element
We will call foldLeft with this updater and then perform some post-processing to
make sure we create the last pair in case the last iteration is odd-numbered, i.e.,
when the “holdover” is not empty after foldLeft is finished. In this implementa-
tion, we use pattern matching to decide whether a sequence is empty:
def toPairs[A](xs: Seq[A], default: A): Seq[(A, A)] = {
  type Acc = (Seq[(A, A)], Seq[A]) // Type alias, for brevity.
  def init: Acc = (Seq(), Seq())
  def updater(acc: Acc, x: A): Acc = acc match {
    case (result, Seq()) => (result, Seq(x))
    case (result, Seq(prev)) => (result :+ ((prev, x)), Seq())
  }
  val (result, holdover) = xs.foldLeft(init)(updater)
  holdover match { // May need to append the last element to the result.
    case Seq() => result
    case Seq(x) => result :+ ((x, default))
  }
}
This code shows examples of partial functions that are applied safely. One of these
partial functions is used in this sub-expression:
holdover match {
  case Seq() => ...
  case Seq(a) => ...
}
This code works when holdover is empty or has length 1 but fails for longer se-
quences. In the implementation of toPairs, the value of holdover will always be a
sequence of length at most 1, so it is safe to use this partial function.
Exercise 2.2.6.3 Use foldLeft to implement the zipWithIndex method for sequences.
The required type signature and a sample test:
def zipWithIndex[A](xs: Seq[A]): Seq[(A, Int)] = ???
Exercise 2.2.6.6 Split a sequence into batches by “weight” computed via a given
function. The total weight of items in any batch should not be larger than a given
maximum weight. The required type signature and a sample test:
2.3 Generating a sequence from a single value
Exercise 2.2.6.7 Use foldLeft to implement a groupBy function. The type signature
and a test:
def groupBy[A, K](xs: Seq[A])(by: A => K): Map[K, Seq[A]] = ???
Hints: The accumulator should be of type Map[K, Seq[A]]. Use the methods
updated and getOrElse to work with dictionaries. The method getOrElse fetches a
value from a dictionary by key but returns a default value if the key is not in the
dictionary:
scala> Map("a" -> 1, "b" -> 2).getOrElse("a", 300)
res0: Int = 1
The method updated produces a new dictionary that contains a new value for the
given key, whether or not that key already exists in the dictionary:
scala> Map("a" -> 1, "b" -> 2).updated("c", 300) // Key is new.
res0: Map[String,Int] = Map(a -> 1, b -> 2, c -> 300)
scala> Map("a" -> 1, "b" -> 2).updated("a", 400) // Key already exists.
res1: Map[String,Int] = Map(a -> 400, b -> 2)
scala> digitsOf(2405)
res0: Seq[Int] = List(2, 4, 0, 5)
We cannot implement digitsOf using map, zip, or foldLeft, because these methods
work only if we already have a sequence; but the function digitsOf needs to create
a new sequence. We could create a sequence via the expression (1 to n) if the
required length of the sequence were known in advance. However, the function
The stream is ready to start computing the next elements of the sequence (so far,
only the first element, 2, has been computed). In order to see the next elements,
we need to stop the stream at a finite size and then convert the result to a list:
scala> Stream.iterate(2) { x => x + 10 }.take(6).toList
res1: List[Int] = List(2, 12, 22, 32, 42, 52)
If we try to evaluate toList on a stream without first limiting its size via take or
takeWhile, the program will keep producing more elements until it runs out of
memory and crashes.
Streams have methods such as map, filter, and flatMap similar to sequences. For
instance, the method drop skips a given number of initial elements:
scala> Seq(10, 20, 30, 40, 50).drop(3)
res2: Seq[Int] = List(40, 50)
This example shows that in order to evaluate drop(3), the stream had to compute
its elements up to 32 (but the subsequent elements are still not computed).
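The behavior of drop on a stream can be sketched with the same stream as in the earlier example. The value names below are hypothetical. Note that Stream is deprecated in Scala 2.13 in favor of LazyList; the sketch keeps Stream to match the rest of this section.

```scala
val s = Stream.iterate(2) { x => x + 10 } // 2, 12, 22, 32, ...
val dropped = s.drop(3)                   // Skips 2, 12, 22.
val firstAfterDrop = dropped.head         // The element 32.
val nextThree = dropped.take(3).toList    // Forces only three elements.
```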
To figure out the code for digitsOf, we first write this function as a mathematical
formula. To compute the digits of, say, $n = 2405$, we need to divide $n$ repeatedly
by 10, getting a sequence $n_k$ of intermediate numbers ($n_0 = 2405$, $n_1 = 240$, ...) and
the corresponding sequence of last digits, $n_k \% 10$ (in this example: 5, 0, ...). The
sequence $n_k$ is defined using mathematical induction:
2.4 Transforming a sequence into another sequence
• Base case: $n_0 = n$.
• Inductive step: $n_{k+1} = \lfloor n_k / 10 \rfloor$ for $k = 0, 1, 2, ...$
Here $\lfloor n_k / 10 \rfloor$ is the mathematical notation for the integer division by 10.
Let us tabulate the evaluation of the sequence $n_k$ for $n = 2405$:
k           0      1     2    3    4    5    6
n_k         2405   240   24   2    0    0    0
n_k % 10    5      0     4    2    0    0    0
The numbers $n_k$ will remain all zeros after $k = 4$. It is clear that the useful part of
the sequence is before it becomes all zeros. In this example, the sequence $n_k$ needs
to be stopped at $k = 4$. The sequence of digits then becomes [5, 0, 4, 2], and we
need to reverse it to obtain [2, 4, 0, 5]. For reversing a sequence, the Scala library
has the standard method reverse. So, a complete implementation for digitsOf is:
def digitsOf(n: Int): Seq[Int] =
  if (n == 0) Seq(0) else { // n == 0 is a special case.
    Stream.iterate(n) { nk => nk / 10 }
      .takeWhile { nk => nk != 0 }
      .map { nk => nk % 10 }
      .toList.reverse
  }
We can shorten the code by using the syntax (_ % 10) instead of { nk => nk % 10 }:
def digitsOf(n: Int): Seq[Int] =
  if (n == 0) Seq(0) else { // n == 0 is a special case.
    Stream.iterate(n)(_ / 10)
      .takeWhile(_ != 0)
      .map(_ % 10)
      .toList.reverse
  }
The type signature of the method Stream.iterate can be written as:
def iterate[A](init: A)(next: A => A): Stream[A]
the elements of the new sequence are defined by induction and depend on previous
elements. An example of this kind is computing the partial sums of a given
sequence $x_i$, say $b_k = \sum_{i=0}^{k-1} x_i$. This formula defines $b_0 = 0$, $b_1 = x_0$,
$b_2 = x_0 + x_1$, $b_3 = x_0 + x_1 + x_2$, etc. A definition via mathematical induction
may be written like this:
• Base case: $b_0 = 0$.
• Inductive step: Given $b_k$, we define $b_{k+1} = b_k + x_k$ for $k = 0, 1, 2, ...$
The Scala library method scanLeft implements a general sequence-to-sequence
transformation defined in this way. The code implementing the partial sums is:
def partialSums(xs: Seq[Int]): Seq[Int] = xs.scanLeft(0){ (x, y) => x + y }
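As a quick illustration, the sketch below applies partialSums to a short sequence. The value name sums is hypothetical. The result includes the base case $b_0 = 0$, so it has one more element than the input.

```scala
def partialSums(xs: Seq[Int]): Seq[Int] = xs.scanLeft(0) { (x, y) => x + y }

// b_0 = 0, b_1 = 1, b_2 = 1 + 2, b_3 = 1 + 2 + 3, b_4 = 1 + 2 + 3 + 4.
val sums = partialSums(Seq(1, 2, 3, 4))
```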
2.5 Summary
We have seen a number of ways for translating mathematical induction into Scala
code. What problems can we solve now?
Table 2.1 shows Scala code implementing those tasks. Iterative calculations are
implemented by translating mathematical induction directly into code. In the
functional programming paradigm, the programmer does not need to write loops
or use array indices. Instead, the programmer reasons about sequences as mathe-
matical values: “Starting from this value, we get that sequence, then transform it
into that other sequence,” etc. This is a powerful way of working with sequences,
dictionaries, and sets. Many kinds of programming errors (such as using an incor-
rect array index) are avoided from the outset, and the code is shorter and easier to
read than code written via loops.
What problems cannot be solved with these tools? There is no automatic recipe
for converting an arbitrary function into a tail-recursive one. The accumulator
trick does not always work! In some cases, it is impossible to implement tail
recursion in a given recursive computation. An example of such a computation is
the “merge-sort” algorithm where the function body must contain two recursive
calls within a single expression. (It is impossible to rewrite two recursive calls as
one tail call.)
What if our recursive code cannot be transformed into tail-recursive code via
the accumulator trick, but the recursion depth is so large that stack overflows
occur? There exist special techniques (e.g., “continuations” and “trampolines”)
that convert non-tail-recursive code into code that runs without stack overflows.
Those techniques are beyond the scope of this chapter.
2.5.1 Examples
Example 2.5.1.1 Compute the smallest $n$ such that $f(f(f(...f(1)...))) \geq 1000$, where
the function $f$ is applied $n$ times. Test with $f(x) = 2x + 1$.
Solution Define a stream of values [1, 𝑓 (1), 𝑓 ( 𝑓 (1)), ...] and use takeWhile to
stop the stream when the values reach 1000. The number 𝑛 is then found as the
length of the resulting sequence:
scala> Stream.iterate(1)(x => 2 * x + 1).takeWhile(x => x < 1000).toList
res0: List[Int] = List(1, 3, 7, 15, 31, 63, 127, 255, 511)
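The number $n$ itself can be obtained by taking the length of that sequence. The sketch below assumes the test function $f(x) = 2x + 1$; the value name n is introduced here for illustration.

```scala
def f(x: Int): Int = 2 * x + 1

// The values 1, f(1), ..., up to 511 are below 1000; there are 9 of them,
// so applying f nine times is the first to reach a value >= 1000 (namely 1023).
val n = Stream.iterate(1)(f).takeWhile(_ < 1000).length
```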
Example 2.5.1.2 (a) For a given Stream[Int], compute the stream of the largest
values seen so far.
(b) Compute the stream of 𝑘 largest values seen so far (𝑘 is a given integer
parameter).
Solution We cannot use max or sort the entire stream, since the length of the
stream is not known in advance. So, we need to use scanLeft, which will build the
output stream one element at a time.
(a) Maintain the largest value seen so far in the accumulator of the scanLeft:
def maxSoFar(xs: Stream[Int]): Stream[Int] =
  xs.scanLeft(xs.head) { (max, x) => math.max(max, x) }.drop(1)
We use drop(1) to remove the initial value (xs.head) because it is not useful for our
result but is always produced by scanLeft.
To test this function, let us define a stream whose values go up and down:
val s = Stream.iterate(0)(x => 1 - 2 * x)
scala> s.take(10).toList
res0: List[Int] = List(0, 1, -1, 3, -5, 11, -21, 43, -85, 171)
scala> maxSoFar(s).take(10).toList
res1: List[Int] = List(0, 1, 1, 3, 3, 11, 11, 43, 43, 171)
(b) We again use scanLeft, where now the accumulator needs to keep the largest
𝑘 values seen so far. There are two ways of maintaining this accumulator: First, to
have a sequence of 𝑘 values that we sort and truncate each time. Second, to use a
data structure such as a priority queue that automatically keeps values sorted and
its length bounded. For the purposes of this example, let us avoid using special
data structures:
def maxKSoFar(xs: Stream[Int], k: Int): Stream[Seq[Int]] = {
  // The initial value of the accumulator is an empty Seq() of type Seq[Int].
  xs.scanLeft(Seq[Int]()) { (seq, x) =>
    // Sort in descending order, and take the first k values.
    (seq :+ x).sorted.reverse.take(k)
  }.drop(1) // Skip the undesired first value.
}
Example 2.5.1.3 Find the last element of a non-empty sequence. (Hint: use
reduce.)
Solution This function is available in the Scala library as the standard method
last on sequences. Here we need to re-implement it using reduce. Begin by writing
an inductive definition:
• (Base case.) last(Seq(x)) == x.
• (Inductive step.) last(x +: xs) == last(xs) assuming xs is non-empty.
The reduce method implements an inductive aggregation similarly to foldLeft,
except that for reduce the base case always returns x for a 1-element sequence
Seq(x). This is exactly what we need here, so the inductive definition is directly
translated into code, with the updater function 𝑔(𝑥, 𝑦) = 𝑦:
def last[A](xs: Seq[A]): A = xs.reduce { (x, y) => y }
Example 2.5.1.4 (a) Count the occurrences of each distinct word in a string:
def countWords(s: String): Map[String, Int] = ???
(b) Count the occurrences of each distinct element in a sequence of type Seq[A].
Solution (a) We split the string into an array of words via s.split(" ") and
apply a foldLeft to that array, since the computation is a kind of aggregation over
the array of words. The accumulator of the aggregation will be a dictionary of
word counts for all the words seen so far:
def countWords(s: String): Map[String, Int] = {
  val init: Map[String, Int] = Map()
  s.split(" ").foldLeft(init) { (dict, word) =>
    val newCount = dict.getOrElse(word, 0) + 1
    dict.updated(word, newCount)
  }
}
The groupBy creates a dictionary in one function call rather than one entry at a time.
But the resulting dictionary contains word lists instead of word counts, so we use
map to compute the length of each word list:
scala> "a a b b b c".split(" ").groupBy(w => w)
res0: Map[String,Array[String]] = Map(b -> Array(b, b, b), a -> Array(a, a), c -> Array(c))
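Putting the two steps together, a groupBy-based version of countWords could be sketched as follows. The pattern variable names word and occurrences are assumptions of this sketch.

```scala
def countWords(s: String): Map[String, Int] =
  s.split(" ")
    .groupBy(w => w) // Dictionary from each word to the array of its occurrences.
    .map { case (word, occurrences) => (word, occurrences.length) }
```

For the string "a a b b b c", this sketch produces the counts a -> 2, b -> 3, c -> 1.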
(b) The main code of countWords does not depend on the fact that words are
of type String. It will work in the same way for any other type of keys for the
dictionary. So, we keep the same code (except for renaming word to x) and replace
String by a type parameter A in the type signature:
def countValues[A](xs: Seq[A]): Map[A, Int] =
  xs.foldLeft(Map[A, Int]()) { (dict, x) =>
    val newCount = dict.getOrElse(x, 0) + 1
    dict.updated(x, newCount)
  }
Example 2.5.1.5 (a) Implement the binary search algorithm for a sorted sequence
xs: Seq[Int] as a function returning the index of the requested value goal (assume
that xs always contains goal):
@tailrec def binSearch(xs: Seq[Int], goal: Int): Int = ???
}
}
We will first figure out the type and the initial value of the accumulator, then
implement the updater.
The information required for the recursive call must show the segment of the
sequence where the target number is present. That segment is defined by two
indices $i$, $j$ representing the left and the right bounds of the sub-sequence, such
that the target element is $x_n$ with $x_i \le x_n \le x_{j-1}$. It follows that the accumulator
should be a pair of two integers $(i, j)$. The initial value of the accumulator is the
pair $(0, N)$, where $N$ is the length of the entire sequence. The search is finished
when $i + 1 = j$. For convenience, we introduce two accumulator values (left and
right) for $i$ and $j$:
@tailrec def binSearch(xs: Seq[Int], goal: Int)(left: Int = 0, right: Int = xs.length): Int = {
  // Check whether `goal` is at one of the boundaries.
  if (right - left <= 1 || xs(left) == goal) left
  else {
    val middle = (left + right) / 2
    // Determine which half of the array contains `goal`.
    // Update the accumulator accordingly.
    val (newLeft, newRight) =
      if (goal < xs(middle)) (left, middle)
      else (middle, right)
    binSearch(xs, goal)(newLeft, newRight) // Tail-recursive call.
  }
}
Here we used a feature of Scala that allows us to set xs.length as a default value
for the argument right of binSearch. This works because right is in a different argu-
ment list from xs. Default values in an argument list may depend on arguments
in a previous argument list. However, this code:
def binSearch(xs: Seq[Int], goal: Int, left: Int = 0, right: Int = xs.length)
will generate an error. Arguments in the same argument list cannot depend on
each other. (The error will say not found: value xs.)
(b) We can visualize the binary search as a procedure that generates a stream of
progressively tighter bounds for the location of goal. The initial bounds are (0,
xs.length), and the final bounds are (k, k + 1) for some k. We can generate the
sequence of bounds using Stream.iterate and stop the sequence when the bounds
become sufficiently tight. To detect that, we use the find method:
def binSearch(xs: Seq[Int], goal: Int): Int = {
type Acc = (Int, Int)
val init: Acc = (0, xs.length)
Stream.iterate(init)(updater)
.find { case (x, y) => y - x <= 1 } // Find an element with tight bounds.
.get._1 // Take the `left` bound from that.
}
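The listing above refers to a function updater whose definition is not shown in this excerpt. A possible completion is sketched below, reusing the update logic from part (a). It assumes Scala 2.13 or later, where LazyList takes the place of Stream; the name binSearchStream and the body of updater are illustrative assumptions rather than the book's own code.

```scala
// A sketch under stated assumptions: `updater` mirrors the accumulator update of part (a).
def binSearchStream(xs: Seq[Int], goal: Int): Int = {
  type Acc = (Int, Int)
  val init: Acc = (0, xs.length)
  val updater: Acc => Acc = { case (left, right) =>
    if (right - left <= 1 || xs(left) == goal) (left, left + 1) // Make the bounds tight.
    else {
      val middle = (left + right) / 2
      if (goal < xs(middle)) (left, middle) else (middle, right)
    }
  }
  LazyList.iterate(init)(updater)
    .find { case (x, y) => y - x <= 1 } // Find an element with tight bounds.
    .get._1                             // Take the left bound from that.
}
```

As in part (a), the sketch assumes that xs is sorted and contains goal.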
Let us compute the sequence $[s_0, s_1, s_2, ...]$ by repeatedly applying SD to some
number, say, 99:
scala> Stream.iterate(99)(SD).take(10).toList
res1: List[Int] = List(99, 18, 9, 9, 9, 9, 9, 9, 9, 9)
We need to stop the stream when the values start to repeat, keeping the first re-
peated value. In the example above, we need to stop the stream after the value 9
(but include that value). One solution is to transform the stream via scanLeft into a
stream of pairs of consecutive values, so that it becomes easier to detect repetition:
scala> Stream.iterate(99)(SD).scanLeft((0,0)) { case ((prev, x), next) => (x,
next) }.take(8).toList
res2: List[(Int, Int)] = List((0,0), (0,99), (99,18), (18,9), (9,9), (9,9),
(9,9), (9,9))
This looks right; it remains to remove the first parts of the tuples:
def sdSeq(n: Int): Seq[Int] = Stream.iterate(n)(SD) // Stream[Int]
.scanLeft((0,0)) { case ((prev, x), next) => (x, next) } // Stream[(Int, Int)]
.drop(1).takeWhile { case (x, y) => x != y } // Stream[(Int, Int)]
.map(_._2) // Stream[Int]
.toList // List[Int]
scala> sdSeq(99)
res3: Seq[Int] = List(99, 18, 9)
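The function SD is used above, but its definition is not part of this excerpt. Judging from the output (99, 18, 9), it computes the sum of decimal digits; the sketch below is written under that assumption, for non-negative arguments, and uses LazyList (Scala 2.13 or later) in place of Stream. The name sdSeqSketch is hypothetical.

```scala
// Assumption: SD(n) is the sum of the decimal digits of a non-negative integer n.
def SD(n: Int): Int = n.toString.map(_.asDigit).sum

def sdSeqSketch(n: Int): List[Int] =
  LazyList.iterate(n)(SD)
    .scanLeft((0, 0)) { case ((_, x), next) => (x, next) } // Pairs (previous, current).
    .drop(1)
    .takeWhile { case (x, y) => x != y }                   // Stop at the first repetition.
    .map(_._2)
    .toList
```

Because the initial pair is (0, 0), an input of 0 yields an empty list in this sketch; that edge case is not covered by the sample test.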
The function should create a stream of values of type A with the initial value init.
Next elements are computed from previous ones via the function next until it re-
turns None. (The type Option is explained in Section 3.2.3.) An example test:
scala> unfold(0) { x => if (x > 5) None else Some(x + 2) }
res0: Stream[Int] = Stream(0, ?)
scala> res0.toList
res1: List[Int] = List(0, 2, 4, 6)
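The type signature of unfold is not shown in this excerpt; from the test above it can be inferred as unfold[A](init: A)(next: A => Option[A]). A minimal recursive sketch under that assumption, returning a LazyList (which replaces Stream in Scala 2.13 and later), could look like this:

```scala
// A sketch: emit `init`, then continue from next(init) for as long as it returns Some(...).
def unfold[A](init: A)(next: A => Option[A]): LazyList[A] =
  init #:: (next(init) match { // The tail is evaluated only on demand.
    case Some(x) => unfold(x)(next)
    case None    => LazyList.empty[A]
  })
```

The recursion is safe for long streams because the tail of the stream is not evaluated until it is requested.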
Example 2.5.1.8 For a given stream $[s_0, s_1, s_2, ...]$ of type Stream[T], compute the
“half-speed” stream $h = [s_0, s_0, s_1, s_1, s_2, s_2, ...]$. The half-speed sequence $h$ is
defined as $h_{2k} = h_{2k+1} = s_k$ for $k = 0, 1, 2, ...$
Solution We use map to replace each element $s_i$ by a sequence containing two
copies of $s_i$. Let us try this on a sample sequence:
scala> Seq(1, 2, 3).map( x => Seq(x, x))
res0: Seq[Seq[Int]] = List(List(1, 1), List(2, 2), List(3, 3))
The result is almost what we need, except we need to flatten the nested list:
scala> Seq(1, 2, 3).map( x => Seq(x, x)).flatten
res1: Seq[Int] = List(1, 1, 2, 2, 3, 3)
The composition of map and flatten is flatMap, so the final code is:
def halfSpeed[T](str: Stream[T]): Stream[T] = str.flatMap(x => Seq(x, x))
Example 2.5.1.9 (The loop detection problem.) Stop a given stream $[s_0, s_1, s_2, ...]$
at a place $k$ where the sequence repeats itself; that is, an element $s_k$ equals some
earlier element $s_i$ with $i < k$.
Solution The trick is to create a half-speed sequence $h_i$ out of $s_i$ and then find
an index $k > 0$ such that $h_k = s_k$. (The condition $k > 0$ is needed because we
will always have $h_0 = s_0$.) If we find such an index $k$, it would mean that either
$s_k = s_{k/2}$ or $s_k = s_{(k-1)/2}$; in either case, we will have found an element $s_k$ that
equals an earlier element.
As an example, for an input sequence $s = [1, 3, 5, 7, 9, 3, 5, 7, 9, ...]$ we obtain the
half-speed sequence $h = [1, 1, 3, 3, 5, 5, 7, 7, 9, 9, 3, 3, ...]$. Looking for an index $k > 0$
such that $h_k = s_k$, we find that $s_7 = h_7 = 7$. The element $s_7$ indeed repeats an earlier
element (although $s_7$ is not the first such repetition).
There are in principle two ways of finding an index $k > 0$ such that $h_k = s_k$:
First, to iterate over a list of indices $k = 1, 2, ...$ and evaluate the condition $h_k = s_k$
as a function of $k$. Second, to build a sequence of pairs $(h_i, s_i)$ and use takeWhile to
stop at the required index. In the present case, we cannot use the first way because
we do not have a fixed set of indices to iterate over. Also, the condition $h_k = s_k$
cannot be directly evaluated as a function of $k$ because $s$ and $h$ are streams that
compute elements on demand, not lists whose elements are computed in advance
and ready for use.
So, the code must iterate over a stream of pairs $(h_i, s_i)$:
def stopRepeats[T](str: Stream[T]): Stream[T] = {
val halfSpeed = str.flatMap(x => Seq(x, x))
val result = halfSpeed.zip(str) // Stream[(T, T)]
.drop(1) // Enforce the condition k > 0.
.takeWhile { case (h, s) => h != s } // Stream[(T, T)]
.map(_._2) // Stream[T]
str.head +: result // Prepend the first element that was dropped.
}
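To check the code against the worked example, the sketch below restates stopRepeats with LazyList (assuming Scala 2.13 or later) and builds the sample stream 1, 3, 5, 7, 9, 3, 5, 7, 9, ... The names stopRepeatsLazy, cycle, and sample are hypothetical.

```scala
def stopRepeatsLazy[T](str: LazyList[T]): LazyList[T] = {
  val halfSpeed = str.flatMap(x => Seq(x, x))
  val result = halfSpeed.zip(str)
    .drop(1)                             // Enforce the condition k > 0.
    .takeWhile { case (h, s) => h != s } // Stop just before the first index where h == s.
    .map(_._2)
  str.head #:: result                    // Prepend the first element that was dropped.
}

// The sample stream: 1 followed by the cycle 3, 5, 7, 9 repeated forever.
val cycle = Seq(3, 5, 7, 9)
val sample: LazyList[Int] = 1 #:: LazyList.from(0).map(i => cycle(i % 4))
```

For this input, the first index k > 0 at which the half-speed element equals the stream element is k = 7, so stopRepeatsLazy(sample) should contain the seven elements 1, 3, 5, 7, 9, 3, 5.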
Example 2.5.1.10 Reverse each word in a string but keep the order of words:
def revWords(s: String): String = ???
scala> noDups("abbcdeeeeefddgggggh")
res0: String = abcdefdgh
Solution A string is automatically converted into a sequence of characters
when we use methods such as map or zip on it. So, we can use s.zip(s.tail) to
get a sequence of pairs $(s_k, s_{k+1})$ where $s_k$ is the $k$-th character of the string $s$. A
filter will then remove elements $s_k$ for which $s_{k+1} = s_k$:
scala> val s = "abbcd"
s: String = abbcd
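The remainder of this solution is not included in the excerpt. One way to complete the idea described above is sketched below; the handling of the last character and of the empty string is an assumption made for this sketch.

```scala
def noDups(s: String): String =
  if (s.isEmpty) s
  else {
    val kept = s.zip(s.tail)            // Pairs of each character with its successor.
      .filter { case (a, b) => a != b } // Drop a character when the next one is equal.
      .map(_._1)
    kept.mkString + s.last              // The last character has no successor; keep it.
  }
```

Of each run of equal characters, only the final one survives the filter, so every run is reduced to a single character.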
2.5.2 Exercises
Exercise 2.5.2.1 Define a function dsq that computes the sum of squared digits of
a given integer; for instance, dsq(123) = 14 (see Example 2.5.1.6). Generalize dsq
to take as an argument a function f: Int => Int replacing the squaring operation.
The required type signature and a sample test:
def digitsFSum(x: Int)(f: Int => Int): Int = ???
Stop the stream when it reaches 1 (as one would expect³ it will).
Exercise 2.5.2.3 For a given integer 𝑛, compute the sum of cubed digits, then the
sum of cubed digits of the result, etc.; stop the resulting sequence when it repeats
itself, and so determine whether it ever reaches 1. (Use Exercise 2.5.2.1.)
def cubes(n: Int): Stream[Int] = ???
scala> cubes(123).take(10).toList
res0: List[Int] = List(123, 36, 243, 99, 1458, 702, 351, 153, 153, 153)
scala> cubes(2).take(10).toList
res1: List[Int] = List(2, 8, 512, 134, 92, 737, 713, 371, 371, 371)
scala> cubes(4).take(10).toList
res2: List[Int] = List(4, 64, 280, 520, 133, 55, 250, 133, 55, 250)
scala> cubesReach1(10)
res3: Boolean = true
scala> cubesReach1(4)
res4: Boolean = false
Exercise 2.5.2.4 For a, b, c of type Set[Int], compute the set of all sets of the form
Set(x, y, z) where x is from a, y from b, and z from c. The required type signature
and a sample test:
³ https://en.wikipedia.org/wiki/Collatz_conjecture
Exercise 2.5.2.7 Reverse a sentence’s word order, but keep the words unchanged:
def revSentence(s: String): String = ???
scala> revSentence("A quick brown fox") // Words are separated by one space.
res0: String = "fox brown quick A"
Exercise 2.5.2.8 (a) Reverse an integer’s digits (see Example 2.5.1.6) as shown:
def revDigits(n: Int): Int = ???
scala> revDigits(12345)
res0: Int = 54321
scala> findPalindrome(123)
res0: Long = 444
scala> findPalindrome(83951)
res1: Long = 869363968
Exercise 2.5.2.10 Transform a given sequence xs: Seq[Int] into a sequence of type
Seq[(Int, Int)] of pairs that skip one neighbor. Implement this transformation as
a function skip1 with a type parameter A instead of the type Int. The required type
signature and a sample test:
def skip1[A](xs: Seq[A]): Seq[(A, A)] = ???
Exercise 2.5.2.11 (a) For a given integer interval $[n_1, n_2]$, find the largest integer
$k \in [n_1, n_2]$ such that the decimal representation of $k$ does not contain any of the
digits 3, 5, or 7.
(b) For a given integer interval $[n_1, n_2]$, find the integer $k \in [n_1, n_2]$ with the
largest sum of decimal digits.
(c) A positive integer 𝑛 is called a perfect number if it is equal to the sum of
its divisors (integers 𝑘 such that 1 ≤ 𝑘 < 𝑛 and 𝑘 divides 𝑛). For example, 6 is a
perfect number because its divisors are 1, 2, and 3, and 1 + 2 + 3 = 6, while 8 is
not a perfect number because its divisors are 1, 2, and 4, and 1 + 2 + 4 = 7 ≠ 8.
Write a function that determines whether a given number 𝑛 is perfect. Determine
all perfect numbers up to one million.
Exercise 2.5.2.12 Transform a sequence by removing adjacent repeated elements
when they are repeated more than 𝑘 times. Repetitions up to 𝑘 times should re-
main unchanged. The required type signature and a sample test:
def removeDups[A](s: Seq[A], k: Int): Seq[A] = ???
The function should create a stream of values of type B by repeatedly applying the
given function next until it returns None. At each iteration, next should be applied
to the value of type A returned by the previous call to next. An example test:
scala> unfold2(0) { x => if (x > 5) None else Some((x + 2, s"had $x")) }
res0: Stream[String] = Stream(had 0, ?)
scala> res0.toList
Exercise 2.5.2.14 (a) Remove repeated elements (whether adjacent or not) from a
sequence of type Seq[A]. (This reproduces the standard library’s method distinct.)
(b) For a sequence of type Seq[A], remove all elements that are repeated (whether
adjacent or not) more than 𝑘 times:
def removeK[A](k: Int, xs: Seq[A]): Seq[A] = ???
scala> removeK(2, Seq("a", "b", "a", "b", "b", "c", "b", "a"))
res0: Seq[String] = List(a, b, a, b, c)
Exercise 2.5.2.15 For a given sequence xs: Seq[Double], find a subsequence that
has the largest sum of values. The sequence xs is not sorted, and its values may
be positive or negative. The required type signature and a sample test:
def maxsub(xs: Seq[Double]): Seq[Double] = ???
scala> maxsub(Seq(1.0, -1.5, 2.0, 3.0, -0.5, 2.0, 1.0, -10.0, 2.0))
res0: Seq[Double] = List(2.0, 3.0, -0.5, 2.0, 1.0)
2.6 Discussion and further developments
This kind of error may crash a program at run time. Unlike the type errors we
saw before, which occur at compilation time (i.e., before the program can start),
run-time errors occur while the program is running and only when an invalid
situation actually happens — say, when some partial function gets an incorrect in-
put. The incorrect input may occur at any time after the program started running,
which may crash the program in the middle of a long computation.
So, it seems clear that we should avoid writing code that generates such errors.
For instance, we will prefer to apply max only to sequences that are known to be
non-empty.
Sometimes, a function that uses pattern matching turns out to be a partial func-
tion because its pattern matching code fails on certain input data.
If none of the cases matches in a pattern matching expression, the code will
throw an exception (a MatchError). In functional programming, we usually want to
avoid that situation because reasoning about program correctness becomes hard.
In most cases, programs can be rewritten to avoid the possibility of match errors.
An example of an unsafe pattern matching expression is:
def h(p: (Int, Int)): Int = p match { case (x, 0) => x }
scala> h( (1, 0) )
res0: Int = 1
scala> h( (1, 2) )
scala.MatchError: (1,2) (of class scala.Tuple2$mcII$sp)
at .h(<console>:12)
... 32 elided
Here, the pattern contains a pattern variable x and a constant 0. This pattern only
matches tuples whose second part is equal to 0. If the second part of the argument is
nonzero, a match error occurs and the program crashes. So, h is a partial function.
Pattern matching errors never happen if we match a tuple of correct size with a
pattern such as (x, y, z), because each pattern variable will always match a value.
So, pattern matching with a pattern such as (x, y, z) is infallible (never fails at
run time) when applied to a tuple with 3 elements.
Another way in which pattern matching can be made infallible is by including
a pattern that matches everything:
p match {
case (x, 0) => ... // This only matches certain tuples.
case _ => ... // This matches everything else.
}
If the first pattern (x, 0) fails to match the value p, the second pattern will be tried
(and will always succeed). The case patterns in a match expression are tried in the
order they are written. So, a match expression may be made infallible by adding a
“match-all” underscore pattern.
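As an illustration, the partial function h shown earlier can be made total by adding a match-all case. The name hTotal and the fallback value 0 are arbitrary choices made for this sketch:

```scala
def hTotal(p: (Int, Int)): Int = p match {
  case (x, 0) => x // Matches only tuples whose second part is 0.
  case _      => 0 // Matches everything else (fallback value chosen arbitrarily).
}
```

Whether a fallback value is meaningful depends on the application; the point here is only that no input can cause a MatchError.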
scala> f( (2, 4) )
res0: Int = 6
The argument of f is the variable x of a tuple type (Int, Int), but there is also a
pattern variable x in the case expression. The pattern variable x matches the first
part of the tuple and has type Int. Because variables are locally scoped, the pattern
variable x is only defined within the expression x + y. The argument x:(Int,Int)
is a completely different variable that has a different type.
The code works correctly but is confusing to read because of the name clash
between the two quite different variables, both named x. Another negative con-
sequence of the name clash is that the argument x:(Int,Int) is invisible within the
case expression: if we write “x” in that expression, we will get the pattern variable
x:Int. One says that the argument x:(Int,Int) has been shadowed by the pattern
variable x (which is a “bound variable” inside the case expression).
This problem is easy to avoid: we can give the pattern variable another name.
Since the pattern variable is locally scoped, it can be renamed within its scope
without affecting any other code:
def f(x: (Int, Int)): Int = x match { case (a, b) => a + b }
scala> f( (2,4) )
res0: Int = 6
At this point, we have not defined a stopping condition for this stream. In some
sense, streams may be seen as “infinite” sequences, although in practice a stream
is always finite because programs cannot run infinitely long. Also, computers
cannot store infinitely many values in memory.
More precisely, streams are “partially computed” rather than “infinite”. The main
difference between arrays and streams is that a stream’s elements are computed on
demand and not all initially available, while an array’s elements are all computed
in advance and are immediately available.
A lazy value (declared as lazy val in Scala) is computed only when it is needed
in some other expression. Once computed, a lazy value stays in memory and will
not be re-computed.
An “on-call” value is re-computed every time it is used. In Scala, on-call values
are denoted via def declarations as well as via call-by-name function arguments.
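The difference between eager, lazy, and on-call values can be observed by counting how many times the defining expression is evaluated. The sketch below uses a mutable counter purely for demonstration; all names in it are hypothetical.

```scala
// Returns how many times `compute` ran for the eager, the lazy, and the on-call value.
def evaluationCounts(): (Int, Int, Int) = {
  var count = 0
  def compute(): Int = { count += 1; 42 }

  val eager = compute()             // Evaluated immediately, exactly once.
  val afterEager = count

  lazy val lzy = compute()          // Not evaluated at this point.
  val beforeLazy = count
  val lazySum = lzy + lzy           // First use evaluates; second use reuses the result.
  val lazyRuns = count - beforeLazy

  def onCall: Int = compute()       // Re-evaluated at every use.
  val beforeOnCall = count
  val onCallSum = onCall + onCall   // Two uses, two evaluations.
  val onCallRuns = count - beforeOnCall

  (afterEager, lazyRuns, onCallRuns)
}
```

The lazy value is used twice but computed once, while the on-call value is computed once per use.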
Most collection types in Scala (such as List, Array, Set, and Map) are eager. All
elements of an eager collection are already evaluated.
A stream is a lazy collection. Elements of a stream are computed when first
needed. After that, they remain in memory and will not be computed again:
scala> val str = Stream.iterate(1)(_ + 1)
str: Stream[Int] = Stream(1, ?)
scala> str.take(10).toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> str
res1: Stream[Int] = Stream(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ?)
scala> 1 until 5
The types Range and Range.Inclusive are defined in the Scala standard library and
are iterators. They behave as collections and support the usual methods (map,
filter, etc.), but they do not store previously computed values in memory.
The view method Eager collections such as List or Array can be converted to iter-
ators by using the view method. This is necessary when intermediate collections
consume too much memory when fully evaluated. For example, consider the
computation of Example 2.1.5.7 where we used flatMap to replace each element of
an initial sequence by three new numbers before computing max of the resulting
collection. If instead of three new numbers we wanted to compute three million
new numbers each time, the intermediate collection created by flatMap would re-
quire too much memory, and the computation would crash:
scala> (1 to 10).flatMap(x => 1 to 3000000).max
java.lang.OutOfMemoryError: GC overhead limit exceeded
Even though the range (1 to 10) is an iterator, a subsequent flatMap operation cre-
ates an intermediate collection that is too large for our computer’s memory. We
can use view to avoid this:
scala> (1 to 10).view.flatMap(x => 1 to 3000000).max
res0: Int = 3000000
The choice between using streams and using iterators is dictated by memory
constraints. Except for that, streams and iterators behave similarly to other se-
quences. We may write programs in the map/reduce style, applying standard
methods such as map, filter, etc., to streams and iterators. Mathematical reason-
ing about transforming a sequence is the same, whether the sequence is eager,
lazy, or on-call.
The Iterator class The Scala library class Iterator has methods such as iterate
and others, similarly to Stream. However, Iterator does not behave as a value in
the mathematical sense:
scala> val iter = (1 until 10).toIterator
iter: Iterator[Int] = non-empty iterator
scala> iter
res2: Iterator[Int] = empty iterator
Evaluating the expression iter.toList two times produces a different result the
second time. As we see from the Scala output, the value iter has become “empty”
after the first use.
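This behavior can be reproduced with the short sketch below. It uses the iterator method (toIterator is deprecated in newer Scala versions); the name iteratorTwice is hypothetical.

```scala
def iteratorTwice(): (List[Int], List[Int]) = {
  val iter = (1 until 10).iterator
  val first = iter.toList  // Consumes all elements of the iterator.
  val second = iter.toList // The iterator is now exhausted.
  (first, second)
}
```

The same expression iter.toList gives a full list the first time and an empty list the second time, so iter cannot be treated as a value.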
scala> x.toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> x.toList
res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
Collections such as List, Map, or Stream are immutable. Some elements of a Stream
may not be evaluated yet, but this does not affect its value-like behavior:
scala> val str = (1 until 10).toStream
str: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> str.toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> str.toList
res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> v.toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> v.toList
res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
Due to the lack of value-like behavior, programs written using Iterator do not
obey the usual rules of mathematical reasoning. This makes it easy to write wrong
The result [5, 9, 3, 7, 9] is incorrect, but not in an obvious way: the sequence was
stopped at a repetition, as we wanted, but some of the elements of the given se-
quence are missing (while other elements are present). It is difficult to debug a
program that produces partially correct numbers.
The error in this code occurs in the expression halfSpeed.zip(iter) due to the
fact that halfSpeed was itself defined via iter. The result is that iter is used twice
in this code, which leads to errors. Creating an Iterator and using it twice in the
same expression can give wrong results or even fail with an exception:
scala> val s = (1 until 10).toIterator
s: Iterator[Int] = non-empty iterator
scala> str.toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6)
scala> str.toList
res1: List[Int] = List(1, 2, 3, 4, 5, 6)
scala> str.zip(str).toList
res2: List[(Int, Int)] = List((1,1), (2,2), (3,3), (4,4), (5,5), (6,6))
Instead of Iterator, we can use Stream and view when lazy or on-call collections
are required. Newer versions of Scala replace Stream with LazyList, which is a
lazily evaluated (and possibly infinite) stream. Libraries such as scalaz and fs2
also provide streams with correct value-like behavior.
The mutable behavior of Iterator is an example of a “side effect”. A function has
a side effect if the function’s code performs some external action in addition to
computing a result value. Examples of side effects are: modifying a value stored
in memory; starting and stopping processes or threads; reading or writing files;
printing; sending or receiving data over a network; showing images on a display;
playing or recording sounds; getting photos or videos from a digital camera.
Code that performs side effects does not behave as a value. Evaluating such
code twice will perform the side effect twice, which is not the same as just re-using
the result value twice. A function with a side effect may return different values
each time it is called, even when the same arguments are given to the function.
(For example, a digital camera will typically return a different image each time.)
Pure functions are those that contain no code with side effects. A pure function
will always return the same result value when applied to the same arguments. So,
pure functions behave similarly to functions that are used in mathematics.
This book focuses on pure functions and on mathematical reasoning about them.
Statements such as “the map method cannot implement sum because it can only apply
element-wise transformations to sequences” are correct only if the code is restricted to
pure functions without side effects. Otherwise we would write code like this:
def sum(xs: Seq[Int]): Int = {
  var result: Int = 0 // A mutable variable.
  xs.map { x => result += x } // Side effect: mutation.
  result
}
3 The logic of types. I. Disjunctive types
Disjunctive types describe values that belong to a disjoint set of alternatives.
To see how Scala implements disjunctive types, we need to begin by looking at
“case classes”.
We would prevent this kind of mistake if we could use two different types, with
names such as MySock and Payment, for the two kinds of data. There are three basic
ways of defining a new named type in Scala: using a type alias, using a class (or
“trait”), and using an opaque type.
Opaque types (hiding a type under a new name) are a feature of Scala 3. An opaque type
can be seen as a case class with a single field but without the cost of memory allocation.
Here, we will focus on type aliases and case classes.
A type alias is an alternative name for an existing (already defined) type. We
could use type aliases in our example to add clarity to the code:
type MySockTuple = (Double, String)
type PaymentTuple = (Double, String)
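Note that a type alias only adds a name: the compiler treats MySockTuple and PaymentTuple as the same type (Double, String). The sketch below, with the hypothetical names totalPaid, sock, and mixedUp, compiles even though a sock is passed where a payment is expected:

```scala
type MySockTuple = (Double, String)
type PaymentTuple = (Double, String)

def totalPaid(ps: Seq[PaymentTuple]): Double = ps.map(_._1).sum

val sock: MySockTuple = (10.5, "white")
val mixedUp: Double = totalPaid(Seq(sock)) // Compiles: the two aliases are interchangeable.
```

So, type aliases can make code easier to read but do not by themselves prevent this kind of mix-up.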
s: MySockTuple = (10.5,white)
scala> paid.amount
res3: Double = 25.0
The mix-up error is now a type error detected by the compiler:
def totalAmountPaid(ps: Seq[Payment]): Double = ps.map(_.amount).sum
grams can run only if all types match. This prevents a broad class of run-time
errors that occur due to wrong types.
Just as tuples can have any number of parts, case classes can have any number
of parts, but the part names must be distinct, for example:
case class Person(firstName: String, lastName: String, age: Int)
scala> noether.firstName
res5: String = Emmy
scala> noether.age
res6: Int = 137
This data type carries the same information as a tuple (String, String, Int). How-
ever, the declaration of a case class Person gives the programmer several features
that make working with the tuple’s data more convenient and less error-prone.
Some (or all) part names may be specified when creating a case class value:
scala> val poincaré = Person(firstName = "Henri", lastName = "Poincaré", 165)
poincaré: Person = Person(Henri,Poincaré,165)
This error is due to an incorrect order of parts when creating a case class value.
However, parts can be specified in any order when using part names:
scala> val p = Person(age = 137, lastName = "Noether", firstName = "Emmy")
p: Person = Person(Emmy,Noether,137)
A part of a case class can have the type of another case class, creating a type similar
to a nested tuple:
case class BagOfSocks(sock: MySock, count: Int)
val bag = BagOfSocks(MySock(10.5, "white"), 6)
scala> bag.sock.size
res7: Double = 10.5
This case class can accommodate every type A. We may now create values of
MySockX containing a value of any given type, say Int:
scala> val s = MySockX(10.5, "white", 123)
s: MySockX[Int] = MySockX(10.5,white,123)
Because the value 123 has type Int, the type parameter A in MySockX[A] was auto-
matically set to the type Int. The result has type MySockX[Int]. The programmer
does not need to specify that type explicitly.
Each time we create a value of type MySockX, a specific type will have to be used
instead of the type parameter A. If we want to be explicit, we may write the type
parameter like this:
scala> val s = MySockX[String](10.5, "white", "last pair")
s: MySockX[String] = MySockX(10.5,white,last pair)
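The definition of MySockX is not shown in this excerpt. A definition consistent with the examples above might look as follows. The field name size is taken from the later use sock.size; the names color and extra are guesses made for this sketch, as are the names inferred and explicit.

```scala
// Hypothetical reconstruction: a sock with an additional part of an arbitrary type A.
case class MySockX[A](size: Double, color: String, extra: A)

val inferred = MySockX(10.5, "white", 123)                 // A is inferred as Int.
val explicit = MySockX[String](10.5, "white", "last pair") // A is written explicitly.
```

In both cases the type parameter A is fixed when the value is created, either by inference or explicitly.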
We can write parametric code working with MySockX[A], that is, keeping the type
parameter A in the code. For example, a function that checks whether a sock of
type MySockX[A] fits the author’s foot can be written as:
def fits[A](sock: MySockX[A]): Boolean = sock.size >= 10.5 && sock.size <= 11
This function is defined for all types A at once, because its code works in the same
way regardless of what A is. Scala will set the type parameter A automatically
when we apply fits to an argument:
scala> fits(MySockX(10.5, "blue", List(1, 2, 3))) // Using MySockX[List[Int]].
res0: Boolean = true
This code forces the type parameter A to be List[Int], and so we may omit the
type parameter of fits. When types become more complicated, it may be helpful
to write out some type parameters. The compiler can detect a mismatch between
the type parameter A = List[Int] used in the “sock” value and the type parameter
A = Int in the function fits:
scala> fits[Int](MySockX(10.5, "blue", List(1, 2, 3)))
<console>:15: error: type mismatch;
found : List[Int]
required: Int
fits[Int](MySockX(10.5, "blue", List(1, 2, 3)))
3.1 Scala’s “case classes”
Case classes may have several type parameters, and the types of the parts may
use these type parameters. Here is an artificial example of a case class using type
parameters in different ways:
case class Complicated[A, B, C, D](x: (A, A), y: (B, Int) => A, z: C => C)
This case class contains parts of different types that use the type parameters A, B,
C in tuples and functions. The type parameter D is not used at all; this is allowed
(and occasionally useful).
A type with type parameters, such as MySockX or Complicated, is called a type con-
structor. A type constructor “constructs” a new type, such as MySockX[Int], from a
given type parameter Int. Values of type MySockX cannot be created without setting
the type parameter. So, it is important to distinguish the type constructor, such as
MySockX, from a type that can have values, such as MySockX[Int].
Case classes may have one part or zero parts, similarly to the one-part and zero-
part tuples:
case class B(z: Int) // Tuple with one part.
case class C() // Tuple with no parts.
The following table shows the correspondence between tuples and case classes:
There are two main differences between case class C() and case object C:
• A case object cannot have type parameters, while we could define a case
class C[X, Y, Z]() with type parameters X, Y, Z, etc.
• A case object is allocated in memory only once, while new values of a case
class C() will be allocated in memory each time C() is evaluated.
Other than that, case class C() and case object C have the same meaning: a named
tuple with zero parts, which we may also view as a “named Unit” type. This book
will not use case objects because case classes are sufficient.
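The second difference can be observed via reference equality (the eq method). In the sketch below, the names C0 and Single are hypothetical stand-ins for a zero-part case class and a case object:

```scala
case class C0()    // A named tuple with zero parts.
case object Single // A case object.

val sameValue  = C0() == C0()     // true: case classes are compared by their parts.
val sameObject = C0() eq C0()     // false: each evaluation of C0() creates a new object.
val oneObject  = Single eq Single // true: the case object exists only once.
```

For most purposes only the == comparison matters, which is why the two forms can be used interchangeably.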
In both situations, case classes can be used as patterns. The following code is an
example of a destructuring definition with case classes:
case class MySock(size: Double, color: String)
case class BagOfSocks(sock: MySock, count: Int)
3.2 Disjunctive types
scala> printBag(bag)
res0: String = bag has 6 white socks of size 10.5
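The definition of printBag is not included in this excerpt. A sketch that is consistent with the output above and uses a destructuring definition could be:

```scala
case class MySock(size: Double, color: String)
case class BagOfSocks(sock: MySock, count: Int)

def printBag(bag: BagOfSocks): String = {
  val BagOfSocks(MySock(size, color), count) = bag // Destructuring definition.
  s"bag has $count $color socks of size $size"
}
```

The single val line defines three pattern variables (size, color, count) at once by matching the nested structure of bag.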
A case expression can match a value, extract some pattern variables, and com-
pute a result:
def fits(bag: BagOfSocks): Boolean = bag match {
case BagOfSocks(MySock(size, _), _) => (size >= 10.5 && size <= 11.0)
}
In the code of this function, the value of bag is matched against the pattern ex-
pression BagOfSocks(MySock(size, _), _). This pattern will define size as a pattern
variable of type Double and assign the corresponding part of the case class to that
variable. For example, the value BagOfSocks(MySock(10.5, "white"), 6) matched
against BagOfSocks(MySock(size, _), _) assigns 10.5 to size. The symbols “_” mean
that we just ignore other parts of the case classes and do not create any pattern
variables for them (because we do not need them in this code).
The syntax for pattern matching for case classes is similar to the syntax for pat-
tern matching for tuples, except for the presence of names of the case classes. For
example, by removing the case class names from the pattern:
case BagOfSocks(MySock(size, _), _) => ...
we obtain the nested tuple pattern:
case ((size, _), _) => ...
that could be used for values of type ((Double, String), Int). So, within pattern
matching expressions, case classes behave as tuple types with added names.
Scala’s “case classes” got their name from their use in case expressions. It is
usually more convenient to use case expressions with case classes than to use de-
structuring definitions.
termines that the array does not contain 𝑥. It is convenient if the algorithm could
return a value of a single type (say, SearchResult) that represents either an index at
which 𝑥 is found, or the absence of an index.
More generally, we may have computations that either return a result or generate
an error and fail to produce a result. It is then convenient to return a value of a
single type (say, Result) that represents either a correct result or an error message.
In certain computer games, one has different types of “rooms”, each room hav-
ing certain properties depending on its type. Some rooms are dangerous because
of monsters, other rooms contain useful objects, certain rooms allow you to fin-
ish the game, and so on. We want to represent all the different kinds of rooms
uniformly as a type Room. A value of type Room should automatically describe the
room’s relevant properties in each case.
In all these situations, data comes in several mutually exclusive shapes. This
sort of data can be represented by a single type if that type is able to describe a
mutually exclusive set of cases:
• RootsOfQ must be either the empty tuple (), or a Double value, or a tuple of
type (Double, Double)
• SearchResult must be either an Int value or the empty tuple ()
• Result must be either an Int value or a String error message
We see that the empty tuple, i.e., the Unit type, is natural to use in these situations.
It is also helpful to assign names to each of the cases:
• RootsOfQ is “no roots” with value (), or “one root” with value Double, or “two
roots” with value (Double, Double)
• SearchResult is “index” with an Int value, or “not found” with value ()
• Result is “value” of type Int or “error message” of type String
Scala’s case classes provide exactly what we need here: named tuples with
zero, one, two, or more parts. So, it is natural to use case classes instead of tuples:
• RootsOfQ is a value of the form NoRoots(), or of the form OneRoot(x: Double), or
of the form TwoRoots(x: Double, y: Double)
• SearchResult is a value of the form Index(x: Int) or of the form NotFound()
• Result is a value of the form Value(x: Int) or Error(message: String)
Our three examples are now described as types that allow us to select one case
class out of a given set. It remains to see how Scala defines such types. For in-
stance, the definition of RootsOfQ needs to indicate that the case classes NoRoots,
OneRoot, and TwoRoots are the only possibilities allowed by the type RootsOfQ. The
Scala syntax for that definition looks like this:
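A sketch of such definitions, reconstructed from the case class names listed above and written in the same "sealed trait / final case class" form that is used for the Result type. The field names x and y are the ones given in the list above; the sample value at the end is only an illustration:

```scala
// Disjunctive type for the real roots of a quadratic equation.
sealed trait RootsOfQ
final case class NoRoots() extends RootsOfQ
final case class OneRoot(x: Double) extends RootsOfQ
final case class TwoRoots(x: Double, y: Double) extends RootsOfQ

// Disjunctive type for the result of searching in an array.
sealed trait SearchResult
final case class Index(x: Int) extends SearchResult
final case class NotFound() extends SearchResult

// A sample value: an equation with exactly one real root.
val x: RootsOfQ = OneRoot(2.0)
```

Every value of type RootsOfQ is built with exactly one of the three case classes, and every value of type SearchResult with one of the two.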
3.2 Disjunctive types
The definition of the Result type is parameterized, so that we can describe results
of any type (while error messages are always of type String):
sealed trait Result[A]
final case class Value[A](x: A) extends Result[A]
final case class Error[A](message: String) extends Result[A]
The “sealed trait / final case class” syntax defines a type that represents a
choice of one case class from a fixed set of case classes. This kind of type is called
a disjunctive type (or a co-product type) in this book. The keywords final and
sealed tell the Scala compiler that the given set of case classes within a disjunctive
type is fixed and unchangeable.
How can we use a given value, say, x: RootsOfQ? Disjunctive types fit well with
pattern matching. In Chapter 2, we used pattern matching with syntax such as
{ case (x, y) => ... }. To use pattern matching with disjunctive types, we write
several case patterns because we need to detect several possible cases of the dis-
junctive type:
def print(r: RootsOfQ): String = r match {
case NoRoots() => "no real roots"
case OneRoot(r) => s"one real root: $r"
case TwoRoots(x, y) => s"real roots: ($x, $y)"
}
3 The logic of types. I. Disjunctive types
scala> print(x)
res0: String = "one real root: 2.0"
Each case pattern will introduce its own pattern variables, such as r, x, y in the
code above. Each pattern variable is defined only within the local scope, that is,
within the scope of its case expression. It is impossible to make a mistake where
we, say, refer to the variable r within the code that handles the case of two roots.
If the code only needs to work with a subset of cases, we can match all other
cases with an underscore character (as in case _):
scala> x match {
case OneRoot(r) => s"one real root: $r"
case _ => "have something else"
}
res1: String = one real root: 2.0
The match/case expression represents a choice over possible values of a given type.
Note the similarity with this code:
def f(x: Int): Int = x match {
case 0 => println(s"error: must be nonzero"); -1
case 1 => println(s"error: must be greater than 1"); -1
case _ => x
}
The values 0 and 1 are some possible values of type Int, just as OneRoot(4.0) is
a possible value of type RootsOfQ. When used with disjunctive types, match/case
expressions will usually cover the complete list of possibilities. If the list of cases
is incomplete, the Scala compiler will print a warning:
scala> def g(x: RootsOfQ): String = x match {
case OneRoot(r) => s"one real root: $r"
}
<console>:14: warning: match may not be exhaustive.
It would fail on the following inputs: NoRoots(), TwoRoots(_, _)
This code defines a partial function g that can be applied only to values of the form
OneRoot(...) and will fail (throwing an exception) for other values.
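The run-time failure can be observed directly. The following is a minimal sketch, assuming the RootsOfQ definitions used in this section (repeated here so that the code is self-contained). The value names ok and failed are chosen only for this illustration:

```scala
sealed trait RootsOfQ
final case class NoRoots() extends RootsOfQ
final case class OneRoot(x: Double) extends RootsOfQ
final case class TwoRoots(x: Double, y: Double) extends RootsOfQ

// Non-exhaustive match: the compiler prints a warning but accepts the code.
def g(x: RootsOfQ): String = x match {
  case OneRoot(r) => s"one real root: $r"
}

val ok: String = g(OneRoot(2.0)) // The only case that `g` can handle.

// Applying `g` to any other case throws a scala.MatchError at run time.
val failed: Boolean =
  try { g(NoRoots()); false }
  catch { case _: MatchError => true }
```

This is why it is usually preferable to cover all cases, or to add a final "case _" clause.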
Let us look at more examples of using the disjunctive types we just defined.
Example 3.2.2.1 Given a sequence of quadratic equations, compute a sequence
containing their real roots as values of type RootsOfQ.
Solution Define a case class representing a quadratic equation $x^2 + bx + c = 0$:
case class QEqu(b: Double, c: Double)
The following function determines how many real roots an equation has:
def solve(quadraticEqu: QEqu): RootsOfQ = {
val QEqu(b, c) = quadraticEqu // Destructure QEqu.
val d = b * b / 4 - c
if (d > 0) {
val s = math.sqrt(d)
TwoRoots(- b / 2 - s, - b / 2 + s)
} else if (d == 0.0) OneRoot(- b / 2)
else NoRoots()
}
If the function solve will not be used often, we may want to write it inline as a
nameless function:
def findRoots(equs: Seq[QEqu]): Seq[RootsOfQ] = equs.map { case QEqu(b, c) =>
(b * b / 4 - c) match {
case d if d > 0 =>
val s = math.sqrt(d)
TwoRoots(- b / 2 - s, - b / 2 + s)
case 0.0 => OneRoot(- b / 2)
case _ => NoRoots()
}
}
This code depends on some features of Scala syntax. We can use the function ex-
pression { case QEqu(b, c) => ... } directly as the argument of map, destructuring
QEqu at the same time. The if/else expression is replaced by an “embedded” if
within a case expression, which is easier to read.
Test the final code:
scala> findRoots(Seq(QEqu(1, 1), QEqu(2, 1)))
res4: Seq[RootsOfQ] = List(NoRoots(), OneRoot(-1.0))
rs.filter {
case OneRoot(x) => true
case _ => false
}.map { case OneRoot(x) => x }
In the map operation, we need to cover only the one-root case because the two other
possibilities have been excluded (“filtered out”) by the preceding filter operation.
We can implement the same function by using the standard library’s collect
method that performs the filtering and mapping operation in one step:
def singleRoots(rs: Seq[RootsOfQ]): Seq[Double]
= rs.collect { case OneRoot(x) => x }
In that case, the array’s element at the computed index will not be equal to goal.
We should return NotFound() in that case. We use a match/case expression for the
new logic:
def safeBinSearch(xs: Seq[Int], goal: Int): SearchResult =
binSearch(xs, goal) match {
case n if xs(n) == goal => Index(n)
case _ => NotFound()
}
Example 3.2.2.4 Use the disjunctive type Result[Int] to implement “safe arith-
metic”, where a division by zero or a square root of a negative number gives an
error message. Define arithmetic operations directly for values of type Result[Int].
Abandon further computations on any error.
Solution Begin by implementing the (integer-valued) square root as a func-
tion from Result[Int] to Result[Int]:
def sqrt(r: Result[Int]): Result[Int] = r match {
case Value(x) if x >= 0 => Value(math.sqrt(x).toInt)
case Value(x) => Error(s"error: sqrt($x)")
case Error(m) => Error(m) // Keep the error message.
}
The square root is computed only if we have the Value(x) case, and only if 𝑥 ≥ 0. If
the argument r was already an Error case, we keep the error message and perform
no further computations.
To implement the addition operation, we need a bit more work:
def add(rx: Result[Int], ry: Result[Int]): Result[Int] = (rx, ry) match {
case (Value(x), Value(y)) => Value(x + y)
case (Error(m), _) => Error(m) // Keep the first error message.
case (_, Error(m)) => Error(m) // Keep the second error message.
}
This code illustrates nested patterns that match the tuple (rx, ry) against various
possibilities. When written in this way, the code is clearer than code written with
nested if/else expressions.
Implementing the multiplication operation results in almost the same code:
def mul(rx: Result[Int], ry: Result[Int]): Result[Int] = (rx, ry) match {
case (Value(x), Value(y)) => Value(x * y)
case (Error(m), _) => Error(m)
case (_, Error(m)) => Error(m)
}
To avoid repetition, we may define a general function (map2) that “maps” binary
operations on integers to operations on Result[Int] types:
def map2(rx: Result[Int], ry: Result[Int])(op: (Int, Int) => Int): Result[Int] =
(rx, ry) match {
case (Value(x), Value(y)) => Value(op(x, y))
case (Error(m), _) => Error(m)
case (_, Error(m)) => Error(m)
}
Now we can easily “map” any binary operation on integers to a binary operation
on Result[Int], assuming that the operation itself never generates an error:
def sub(rx: Result[Int], ry: Result[Int]): Result[Int] =
map2(rx, ry) { (x, y) => x - y }
Custom code is still needed for operations that may generate errors:
def div(rx: Result[Int], ry: Result[Int]): Result[Int] = (rx, ry) match {
case (Value(x), Value(y)) if y != 0 => Value(x / y)
case (Value(x), Value(y)) => Error(s"error: $x / $y")
case (Error(m), _) => Error(m)
case (_, Error(m)) => Error(m)
}
We can now test the “safe arithmetic” on simple calculations. Let us see what
happens after an error:
scala> add(Value(1), Value(2))
res10: Result[Int] = Value(3)
Let us check that all further computations are abandoned once an error occurs.
Indeed, the following example shows that the error message for 20 + 1/0 never
mentions 20:
scala> add(Value(20), div(Value(1), Value(0)))
res12: Result[Int] = Error(error: 1 / 0)
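The standard library's Option type follows the same "sealed trait" pattern. A minimal sketch of how an Option-like type may be defined is shown here with the hypothetical names Opt, Empty, and Full (chosen to avoid clashing with the library's Option, None, and Some). It is an illustration and not the library's actual source code:

```scala
// An Option-like disjunctive type with a covariant type parameter (+T).
sealed trait Opt[+T]
case object Empty extends Opt[Nothing]        // A case object has no type parameters.
final case class Full[T](t: T) extends Opt[T]

// Due to covariance, the single value Empty is usable as Opt[T] for any T.
val a: Opt[Int] = Empty
val b: Opt[String] = Empty
val c: Opt[Int] = Full(42)
```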
This code is similar to the type SearchResult defined in Section 3.2.1, except that
Option has a type parameter instead of a fixed type Int. Another difference is the
use of a case object instead of an empty case class, such as None(). Since Scala’s
case objects cannot have type parameters, the type parameter in the definition of
None must be set to the special type Nothing, which is a type with no values, also
called the void type (not to be confused with Java or C’s void keyword!). The
special type annotation +T makes None usable as a value of type Option[T] for any
type T; see Section 6.1.8 for more details.
An alternative (implemented, e.g., in the scalaz library) is to define the empty
option value as:
final case class None[T]() extends Option[T]
At the two sides of “case None => None”, the value None has different types, namely
Option[Long] and Option[Seq[Long]]. Since these types are declared in the type sig-
nature of the function getDigits, the Scala compiler is able to figure out the types
of all expressions in the match/case construction. So, pattern-matching code can be
written without explicit type annotations such as (None: Option[Long]).
If we now need to compute the number of digits, we can write:
def numberOfDigits(phone: Option[Long]): Option[Long] = getDigits(phone) match {
case None => None
case Some(digits) => Some(digits.length)
}
It is then natural to generalize this function to arbitrary types using type param-
eters instead of a fixed type Long. The resulting function is usually called fmap in
functional programming libraries:
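A sketch of fmap is given here, together with a possible implementation of the helper digitsOf. The body of digitsOf is an assumption made for this illustration (it presumes a non-negative argument); only its name and type are implied by the surrounding text:

```scala
// Assumed helper: the decimal digits of a non-negative number.
def digitsOf(n: Long): Seq[Long] =
  n.toString.toList.map(c => (c - '0').toLong)

// Lift a function of type A => B to a function of type Option[A] => Option[B].
def fmap[A, B](f: A => B): Option[A] => Option[B] = {
  case None    => None       // Nothing to transform.
  case Some(a) => Some(f(a)) // Transform the value inside the Option.
}

val lifted: Option[Long] => Option[Seq[Long]] = fmap(digitsOf)
```

The function argument comes first, so that fmap(digitsOf) is itself a function that can be applied to an Option value.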
scala> fmap(digitsOf)(Some(4096))
res0: Option[Seq[Long]] = Some(List(4, 0, 9, 6))
scala> fmap(digitsOf)(None)
res1: Option[Seq[Long]] = None
We say that the fmap operation lifts a given function f of type A => B to a new
function of type Option[A] => Option[B].
It is important to keep in mind that the code case Some(a) => Some(f(a)) changes
the type of the option value. On the left side of the arrow, the type is Option[A],
while on the right side it is Option[B]. The Scala compiler knows this from the
given type signature of fmap, so an explicit type parameter, which we could write
as Some[B](f(a)), is not needed.
The Scala library implements an equivalent function as a method of the Option
class, with the syntax x.map(f) rather than fmap(f)(x). We can concisely rewrite the
previous code using these methods:
def getDigits(phone: Option[Long]): Option[Seq[Long]] = phone.map(digitsOf)
def numberOfDigits(phone: Option[Long]): Option[Long] =
phone.map(digitsOf).map(_.length)
We see that the map operation for the Option type is analogous to the map operation
for sequences.
The similarity between Option[A] and Seq[A] is clearer if we view Option[A] as
a special kind of “sequence” whose length is restricted to be either 0 or 1. So,
Option[A] can have all the operations of Seq[A] except operations such as concat
that may grow the sequence beyond length 1. The standard operations defined
on Option include map, filter, zip, forall, exists, flatMap, and foldLeft.
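A few of these operations are illustrated in the following sketch. The value names some and none are chosen only for this example. Each result is what one would expect from a sequence of length 1 or 0:

```scala
val some: Option[Int] = Some(3)
val none: Option[Int] = None

val doubled   = some.map(_ * 2)     // Like mapping over a 1-element sequence.
val filtered  = some.filter(_ > 5)  // The predicate fails, so the result is empty.
val anyBig    = some.exists(_ > 1)  // true: the single element satisfies the predicate.
val allBig    = none.forall(_ > 1)  // true: holds vacuously for the empty case.
val asString  = some.flatMap(n => if (n > 0) Some(n.toString) else None)
val asLists   = (some.toList, none.toList) // Lengths 1 and 0.
```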
Example 3.2.3.2 Given a phone number as Option[Long], extract the country code
if it is present. The result must be again of type Option[Long]. Assume that the
country code is the digits in front of a 10-digit phone number; for the phone num-
ber 18004151212, the country code is 1.
Solution If the phone number is a positive integer 𝑛, we may compute the
country code simply as n / 10000000000L. However, if the result of that division is
zero, we should return an empty Option (i.e., the value None) rather than 0:
def countryCode(phone: Option[Long]): Option[Long] = phone match {
case None => None
case Some(n) =>
val countryCode = n / 10000000000L
if (countryCode != 0L) Some(countryCode) else None
}
Notice that we have reimplemented the code pattern similar to map, namely “if None
then return None, else return Some(...)”. So, we may try to rewrite the code as:
def countryCode(phone: Option[Long]): Option[Long] = phone.map { n =>
val countryCode = n / 10000000000L
if (countryCode != 0L) Some(countryCode) else None
} // Type error: the result is Option[Option[Long]], not Option[Long].
This code does not compile: we are returning an Option[Long] within a function
lifted via map, so the resulting type is Option[Option[Long]]. Use flatten to convert
Option[Option[Long]] to the required type Option[Long]:
def countryCode(phone: Option[Long]): Option[Long] = phone.map { n =>
val countryCode = n / 10000000000L
if (countryCode != 0L) Some(countryCode) else None
}.flatten // Types are correct now.
Since the flatten follows a map, rewrite the code using flatMap:
def countryCode(phone: Option[Long]): Option[Long] = phone.flatMap { n =>
val countryCode = n / 10000000000L
if (countryCode != 0L) Some(countryCode) else None
}
Another way of implementing this example is to notice the code pattern “if con-
dition does not hold, return None, otherwise keep the value”. For an Option type,
this is equivalent to the filter operation (recall that filter returns an empty se-
quence if the predicate never holds). The code is:
def countryCode(phone: Option[Long]): Option[Long] =
  phone.map(_ / 10000000000L).filter(_ != 0L)
scala> countryCode(Some(18004151212L))
res0: Option[Long] = Some(1)
scala> countryCode(Some(8004151212L))
res1: Option[Long] = None
Example 3.2.3.3 Add a new requirement to Example 3.2.3.2: if the country code
is not present, return the default country code 1.
Solution This is an often used code pattern: “if empty, substitute a default
value”. The Scala library has the method getOrElse for this purpose:
scala> Some(100).getOrElse(1)
res2: Int = 100
scala> None.getOrElse(1)
res3: Int = 1
So, we can implement the new requirement as:
scala> countryCode(Some(8004151212L)).getOrElse(1L)
res4: Long = 1
Using Option with collections Several Scala library methods return an Option as
a result. Examples are find, headOption, and lift for sequences, as well as get for
dictionaries.
The find method returns the first element satisfying a predicate:
scala> (1 to 10).find(_ > 5)
res0: Option[Int] = Some(6)
The headOption method returns the first element of a sequence, unless the se-
quence is empty. This is equivalent to lift(0):
scala> Seq(1, 2, 3).headOption
res4: Option[Int] = Some(1)
The get method is a safe by-key access to dictionaries, unlike the direct access that
may fail with an exception:
scala> Map(10 -> "a", 20 -> "b")(10)
res8: String = a
Similarly, lift is a safe by-index access to collections, unlike the direct access that
may fail with an exception:
scala> Seq(10, 20, 30)(0)
res9: Int = 10
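For comparison, here is a short sketch of the safe access methods get, lift, and headOption. The value names dict and xs are chosen only for this example. A missing key or an out-of-range index gives None instead of an exception:

```scala
val dict = Map(10 -> "a", 20 -> "b")
val xs = Seq(10, 20, 30)

val found    = dict.get(10) // The key exists.
val missing  = dict.get(30) // No such key: None, no exception.
val first    = xs.lift(0)   // The index is within range.
val tooFar   = xs.lift(5)   // The index is out of range: None, no exception.
val sameHead = xs.headOption == xs.lift(0)
val noHead   = Seq.empty[Int].headOption
```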
The Either type The standard disjunctive type Either[A, B] has two type param-
eters and is often used for computations that report errors. By convention, the first
type (A) is the type of error, and the second type (B) is the type of the (non-error)
result. The names of the two cases are Left and Right. A possible definition of
Either may be written as:
sealed trait Either[A, B]
final case class Left[A, B](value: A) extends Either[A, B]
final case class Right[A, B](value: B) extends Either[A, B]
To test:
scala> logError(Right(123), -1)
res1: Int = 123
Why use Either instead of Option for computations that may fail? When a miss-
ing result is an error, we will usually need to know the reason why the result is
unavailable. The Either type may provide detailed information about such errors,
which Option cannot do. An Option type is mostly used in cases where the absence
of a result is not an error.
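To illustrate the difference, here is a minimal sketch using the standard library's Either. The function names safeDiv and describe are hypothetical and are introduced only for this example:

```scala
// Division that reports the reason for a failure in the Left case.
def safeDiv(x: Int, y: Int): Either[String, Int] =
  if (y != 0) Right(x / y) else Left(s"error: $x / $y")

// Pattern matching gives access to either the result or the error details.
def describe(r: Either[String, Int]): String = r match {
  case Left(message) => s"failed with $message"
  case Right(value)  => s"result is $value"
}
```

With Option, the failing case would be just None, and the information about which division failed would be lost.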
The Either type generalizes the type Result defined in Section 3.2.1 to an ar-
bitrary error type instead of String. We have seen its usage in Example 3.2.2.4,
where the code pattern was “if value is present, do a computation, otherwise keep
the error”. This code pattern is implemented by the map method of Either:
scala> Right(1).map(_ + 1)
res0: Either[Nothing, Int] = Right(2)

scala> Left[String, Int]("error").map(_ + 1)
res1: Either[String, Int] = Left("error")
The type Nothing was filled in by the Scala compiler because we did not specify
programs correctly: it is hard to figure out and to keep in mind all the possible ex-
ceptions that a given library function may “throw” in its code (and in the code of all
other libraries being used). Instead of using exceptions for indicating errors, Scala
programmers can write functions that return a disjunctive type, such as Either,
describing both a correct result and an error condition. Users of these functions
will have to do pattern matching on the result values. This helps programmers to
avoid forgetting to handle an error situation that the code is likely to encounter.
Nevertheless, programmers will often need to use Java or Scala libraries that
throw exceptions. To help write code for these situations, the Scala library provides
a disjunctive type called Try. The type Try[A] is equivalent to Either[Throwable, A],
where Throwable is the general type of all exceptions (i.e., values to which a throw
operation can be applied). The two parts of the disjunctive type Try[A] are called
Failure and Success[A] (instead of Left[Throwable, A] and Right[Throwable, A] in the
Either type). The class constructor Try(expr) will catch all “planned” exceptions
thrown while the expression expr is evaluated (see footnote 2).
If the evaluation of expr succeeds and returns a value x: A, the value of Try(expr)
will be Success(x). Otherwise it will be Failure(t), where t: Throwable is a value
containing details about the exception. Here is an example of using Try:
import scala.util.{Try, Success, Failure}
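A minimal sketch of such an example follows. The value names p and q and the helper show are assumptions made for this illustration. The conversion "xyz".toInt throws a NumberFormatException, which Try() catches:

```scala
import scala.util.{Try, Success, Failure}

val p: Try[Int] = Try("xyz".toInt)  // The exception is caught: a Failure value.
val q: Try[Int] = Try("0002".toInt) // No exception: a Success value.

// Pattern matching distinguishes the two parts of the disjunctive type.
def show(t: Try[Int]): String = t match {
  case Success(n) => s"parsed $n"
  case Failure(e) => s"failed: ${e.getClass.getSimpleName}"
}
```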
The code Try("xyz".toInt) does not generate any exceptions and will not crash the
program. Any computation that may throw a planned exception can be enclosed in
a Try(), and the exception will be caught and encapsulated within the disjunctive
type as a Failure(...) value.
The methods map, filter, flatMap, and fold are defined for the Try class similarly
to the Either type. One additional feature of Try is to catch exceptions generated
by the function arguments of map, filter, flatMap, and other standard methods:
scala> val y = q.map(y => throw new Exception("ouch"))
y: Try[Int] = Failure(java.lang.Exception: ouch)
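The same behavior can be sketched for filter. In the following self-contained example, the definition of q and the message "huh" are assumptions made for this illustration:

```scala
import scala.util.{Try, Success, Failure}

val q: Try[Int] = Try("0002".toInt) // A Success value.

// Exceptions thrown inside the function arguments are caught by `map` and `filter`.
val y: Try[Int] = q.map(n => throw new Exception("ouch"))
val z: Try[Int] = q.filter(n => throw new Exception("huh"))

// The caught exceptions remain available for inspection.
val messages: Seq[String] = Seq(y, z).collect { case Failure(e) => e.getMessage }
```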
In this example, the values y and z were computed successfully even though excep-
tions were thrown while the function arguments of map and filter were evaluated.
Further code can use pattern matching on the values y and z and examine those
exceptions. However, it is important that these exceptions were caught and the
program did not crash, meaning that further code is able to run.
Footnote 2: Try() will not catch exceptions of class java.lang.Error and its subclasses. Those exceptions
are intended to represent unplanned, serious error situations.
While the standard types Try and Either will cover many use cases, program-
mers can also define custom disjunctive types in order to represent all the antic-
ipated failures or errors in the business logic of a particular application. Repre-
senting all errors in the types helps assure that the program will not crash because
of an exception that we forgot to handle or did not even know about.
The type NInt has two disjunctive parts: N1 and N2. But the case class N2 contains a
value of type NInt as if the type NInt were already defined.
A type whose definition uses that same type is called a recursive type. The type
NInt is an example of a recursive disjunctive type.
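A sketch of a definition matching this description is shown here. The field names x and n, the sample values, and the helper function depth are assumptions made for this illustration:

```scala
sealed trait NInt
final case class N1(x: Int) extends NInt
final case class N2(n: NInt) extends NInt // Uses the type NInt recursively.

// Values are built by wrapping an N1 value in zero or more layers of N2.
val a: NInt = N1(10)
val b: NInt = N2(N2(N1(10)))

// Pattern matching on a recursive type is naturally written as a recursive function.
def depth(v: NInt): Int = v match {
  case N1(_)     => 0
  case N2(inner) => 1 + depth(inner)
}
```

Values of type NInt can be created because the case class N1 does not require a previously created value of type NInt.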
We might imagine defining a disjunctive type X whose parts recursively refer to
the same type X (and/or to each other) in complicated ways. What kind of data
would be represented by such a type X, and in what situations would X be useful?
For instance, the simple definition:
final case class Bad(x: Bad)
is useless since we cannot create a value of type Bad unless we already have a value
x of type Bad. This is an example of an infinite loop in type recursion. We will
never be able to create values of type Bad, which means that the type Bad is “void”
(has no values, like the special Scala type Nothing).
Section 8.5.1 will derive precise conditions under which a recursive type is not
void. For now, we will look at the recursive disjunctive types that are used most
often: lists and trees.
3.3 Lists and trees as recursive disjunctive types
However, this definition is not practical: we cannot define a separate case class
for each possible length. Instead, we define the type List[A] via mathematical
induction on the length of the list:
• Base case: empty list, case class List0[A]().
• Inductive step: given a list of a previously defined length, say $\texttt{List}_{n-1}$, define
a new case class $\texttt{List}_n$ describing a list with one more element of type A. So,
we could define $\texttt{List}_n = (\texttt{A}, \texttt{List}_{n-1})$.
Let us try to write this inductive definition as code:
sealed trait ListI[A] // Inductive definition of a list.
final case class List0[A]() extends ListI[A]
final case class List1[A](x: A, next: List0[A]) extends ListI[A]
final case class List2[A](x: A, next: List1[A]) extends ListI[A]
??? // Still need an infinitely long definition.
To avoid writing an infinitely long type definition, we use a trick. Note that the
definitions of List1, List2, etc., have a similar form (while List0 is not similar). To
replace the definitions List1, List2, etc., by a single definition ListN, we write the
type ListI[A] inside the case class ListN:
sealed trait ListI[A] // Inductive definition of a list.
final case class List0[A]() extends ListI[A]
final case class ListN[A](x: A, next: ListI[A]) extends ListI[A]
The type definition has become recursive. For this trick to work, it is important to
use ListI[A] and not ListN[A] inside the case class ListN[A]. Otherwise, we would
get an infinite loop in type recursion (similarly to case class Bad shown before).
Since we obtained the definition of type ListI[A] via a trick, let us verify that the
code actually defines the disjunctive type we wanted.
To create a value of type ListI[A], we must use one of the two available case
classes. Using the first case class, we may create a value List0(). Since this empty
case class does not contain any values of type A, it effectively represents an empty
list (the base case of the induction). Using the second case class, we may create a
value ListN(x, next) where x is of type A and next is an already constructed value
of type ListI[A]. This represents the inductive step because the case class ListN is a
named tuple containing A and ListI[A]. Now, the same consideration recursively
applies to constructing the value next, which must be either an empty list or a
pair containing a value of type A and another list. The assumption that the value
next: ListI[A] is already constructed is equivalent to the inductive assumption
that we already have a list of a previously defined length. So, we have verified
that ListI[A] implements the inductive definition shown above.
Examples of values of type ListI are the empty list List0(), a one-element list
ListN(x, List0()), and a two-element list ListN(x, ListN(y, List0())).
To illustrate writing pattern-matching code using this type, let us implement
the method headOption:
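A possible implementation is sketched here, with the definition of ListI repeated so that the code is self-contained. The sample values empty and two are chosen only for this illustration:

```scala
sealed trait ListI[A] // Inductive definition of a list.
final case class List0[A]() extends ListI[A]
final case class ListN[A](x: A, next: ListI[A]) extends ListI[A]

// Return the first element if the list is non-empty.
def headOption[A]: ListI[A] => Option[A] = {
  case List0()     => None    // An empty list has no first element.
  case ListN(x, _) => Some(x) // Ignore the rest of the list.
}

val empty: ListI[Int] = List0()
val two: ListI[Int] = ListN(1, ListN(2, List0()))
```

No recursion is needed here because only the outermost case class is examined.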
Because “operator-like” case class names, such as ::, support the infix syntax, we
may write expressions such as head :: tail instead of ::(head, tail). This syntax
can be also used in pattern matching on List values, with code that looks like this:
def headOption[A]: List[A] => Option[A] = {
case Nil => None
case head :: tail => Some(head)
}
Examples of values created using Scala’s standard List type are the empty list Nil,
a one-element list x :: Nil, and a two-element list x :: y :: Nil. The same syntax
x :: y :: Nil is used both for creating values of type List and for pattern matching
on such values.
The Scala library also defines the helper function List(), so that List() is the
same as Nil and List(1, 2, 3) is the same as 1 :: 2 :: 3 :: Nil. Lists are easier to
read in the syntax List(1, 2, 3). Pattern matching may also use that syntax:
val x: List[Int] = List(1, 2, 3)
x match {
case List(a) => ...
case List(a, b, c) => ...
case _ => ...
}
The base case is an empty list, and we return again an empty list:
def map[A, B](xs: List[A])(f: A => B): List[B] = xs match {
case Nil => Nil
...
In the inductive step, we have a pair (head, tail) in the case class ::, with head: A
and tail: List[A]. The pair can be pattern-matched with the syntax head :: tail.
The map function should apply the argument f to the head value, which will give
the first element of the resulting list. The remaining elements are computed by the
induction assumption, i.e., by a recursive call to map:
def map[A, B](xs: List[A])(f: A => B): List[B] = xs match {
case Nil => Nil
case head :: tail => f(head) :: map(tail)(f)
}
Reasoning by induction, we start with the base case xs == Nil, where the only
possibility is to return the value init:
def foldLeft[A, R](xs: List[A])(init: R)(f: (R, A) => R): R = xs match {
case Nil => init
...
The inductive step for foldLeft says that, given the values head: A and tail: List[A],
we need to apply the updater function to the previous accumulator value. That
value is init. So, we apply foldLeft recursively to the tail of the list once we have
the updated accumulator value:
import scala.annotation.tailrec
@tailrec def foldLeft[A, R](xs: List[A])(init: R)(f: (R, A) => R): R =
xs match {
case Nil => init
case head :: tail =>
val newInit = f(init, head) // Update the accumulator.
foldLeft(tail)(newInit)(f) // Recursive call to `foldLeft`.
}
Without the explicit type annotation (Nil: List[A]), the Scala compiler will decide
that Nil has type List[Nothing], and the types will not match later in the code. In
Scala, the initial value for foldLeft often needs an explicit type annotation.
The reverse function can be used to obtain a tail-recursive implementation of
map for List. The idea is to first use foldLeft to accumulate transformed elements:
scala> Seq(1, 2, 3).foldLeft(Nil:List[Int])((prev, x) => (x * x) :: prev)
res0: List[Int] = List(9, 4, 1)
This achieves stack safety at the cost of traversing the list twice. (This code is
shown only as an example. The Scala library implements List’s map using mutable
variables to improve performance.)
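The two steps can be combined into a single function. The following is a sketch under the assumption that the library's foldLeft and reverse methods for List are stack-safe. The name mapSafe is hypothetical:

```scala
// A stack-safe `map` for List: accumulate in reverse order, then reverse.
def mapSafe[A, B](xs: List[A])(f: A => B): List[B] =
  xs.foldLeft(Nil: List[B])((prev, x) => f(x) :: prev) // First traversal.
    .reverse                                           // Second traversal.

val squares = mapSafe(List(1, 2, 3))(x => x * x)
val longList = mapSafe((1 to 100000).toList)(_ + 1) // No stack overflow.
```

The type annotation (Nil: List[B]) is needed for the same reason as in the reverse function.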
Example 3.3.2.1 A definition of the non-empty list is similar to List except that
the empty-list case is replaced by a 1-element case:
sealed trait NEL[A]
final case class Last[A](head: A) extends NEL[A]
final case class More[A](head: A, tail: NEL[A]) extends NEL[A]
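Values of type NEL are conveniently created from a first element and an ordinary list of remaining elements. A sketch of such a conversion function is shown here. The name toNEL and its argument order are taken from the tests in this section, while the body is an assumption:

```scala
sealed trait NEL[A]
final case class Last[A](head: A) extends NEL[A]
final case class More[A](head: A, tail: NEL[A]) extends NEL[A]

// Build a non-empty list from a first element `x` and a possibly empty `rest`.
def toNEL[A](x: A, rest: List[A]): NEL[A] = rest match {
  case Nil       => Last(x)
  case y :: tail => More(x, toNEL(y, tail)) // Not tail-recursive.
}

val three: NEL[Int] = toNEL(10, List(20, 30))
```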
To test:
scala> toNEL(1, List()) // Result = [1].
res0: NEL[Int] = Last(1)
The head method is safe for non-empty lists, unlike head for an ordinary List:
def head[A]: NEL[A] => A = {
case Last(x) => x
case More(x, _) => x
}
Example 3.3.2.2 Use foldLeft to implement a reverse function for the type NEL.
The required type signature and a sample test:
def reverse[A]: NEL[A] => NEL[A] = ???
scala> reverse(toNEL(10, List(20, 30))) // The result must be [30, 20, 10].
res3: NEL[Int] = More(30,More(20,Last(10)))
Solution We will use foldLeft to build up the reversed list as the accumulator
value. It remains to choose the initial value of the accumulator and the updater
function. We have already seen the code for reversing the ordinary list via the
foldLeft method:
def reverse[A](xs: List[A]): List[A] =
  xs.foldLeft(Nil: List[A])((prev, x) => x :: prev)
However, we cannot reuse the same code for non-empty lists by writing More(x,
prev) instead of x :: prev, because the foldLeft operation works with non-empty
lists differently. Since lists are always non-empty, the updater function is always
applied to an initial value, and the code works incorrectly:
def reverse[A](xs: NEL[A]): NEL[A] =
foldLeft(xs)(Last(head(xs)): NEL[A])((prev,x) => More(x, prev))
scala> reverse(toNEL(10, List(20, 30))) // The result is [30, 20, 10, 10].
res4: NEL[Int] = More(30,More(20,More(10,Last(10))))
The last element, 10, should not have been repeated. It was repeated because the
initial accumulator value already contained the head element 10 of the original
list. However, we cannot set the initial accumulator value to an empty list, since
a value of type NEL[A] must be non-empty. It seems that we need to handle the
case of a one-element list separately. So, we begin by matching on the argument
of reverse, and apply foldLeft only when the list is longer than 1 element:
def reverse[A]: NEL[A] => NEL[A] = {
case Last(x) => Last(x) // `reverse` is a no-op.
case More(x, tail) => // Use foldLeft on `tail`.
foldLeft(tail)(Last(x): NEL[A])((prev, x) => More(x, prev))
}
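To check this solution, the following self-contained sketch puts all the pieces together. The implementations of toNEL and of foldLeft for NEL are assumptions made for this illustration, since only their usage appears in the text above:

```scala
import scala.annotation.tailrec

sealed trait NEL[A]
final case class Last[A](head: A) extends NEL[A]
final case class More[A](head: A, tail: NEL[A]) extends NEL[A]

// Assumed helper: build a non-empty list from a head and a List of the rest.
def toNEL[A](x: A, rest: List[A]): NEL[A] = rest match {
  case Nil       => Last(x)
  case y :: tail => More(x, toNEL(y, tail))
}

// Assumed foldLeft for NEL: the updater is applied at least once.
@tailrec def foldLeft[A, R](n: NEL[A])(init: R)(f: (R, A) => R): R = n match {
  case Last(x)       => f(init, x)
  case More(x, tail) => foldLeft(tail)(f(init, x))(f)
}

def reverse[A]: NEL[A] => NEL[A] = {
  case Last(x)       => Last(x) // `reverse` is a no-op.
  case More(x, tail) =>         // Use foldLeft on `tail`.
    foldLeft(tail)(Last(x): NEL[A])((prev, x) => More(x, prev))
}

val reversed: NEL[Int] = reverse(toNEL(10, List(20, 30)))
```

The head element 10 is now used only once, as the initial accumulator value Last(10).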
Exercise 3.3.2.3 Implement a function toList that converts a non-empty list into
an ordinary Scala List. The required type signature and a sample test:
def toList[A](nel: NEL[A]): List[A] = ???
Exercise 3.3.2.4 Implement a map function for the type NEL. Type signature and a
sample test:
def mapNEL[A,B](xs: NEL[A])(f: A => B): NEL[B] = ???
Here are some examples of code expressions and the corresponding trees:
Branch(Branch(Leaf("a1"), Leaf("a2")), Leaf("a3"))
[Figure: a binary tree whose root has two children: a subtree with the leaves 𝑎1 and 𝑎2, and a leaf 𝑎3.]
Note that this function cannot be made tail-recursive using the accumulator trick,
because foldLeft needs to call itself twice in the Branch case.
To verify that foldLeft works as intended, let us run a simple test:
val t: Tree2[String] = Branch(Branch(Leaf("a1"), Leaf("a2")), Leaf("a3"))
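A self-contained sketch of such a test is shown here. The definitions of Tree2 and foldLeft are assumptions reconstructed from the way they are used in the text (the case class names Leaf and Branch, and two recursive calls in the Branch case):

```scala
sealed trait Tree2[A]
final case class Leaf[A](a: A) extends Tree2[A]
final case class Branch[A](x: Tree2[A], y: Tree2[A]) extends Tree2[A]

// Fold over the leaves from left to right. Not tail-recursive.
def foldLeft[A, R](t: Tree2[A])(init: R)(f: (R, A) => R): R = t match {
  case Leaf(a)        => f(init, a)
  case Branch(t1, t2) =>
    val r1 = foldLeft(t1)(init)(f) // First recursive call: the left subtree.
    foldLeft(t2)(r1)(f)            // Second recursive call: the right subtree.
}

val t: Tree2[String] = Branch(Branch(Leaf("a1"), Leaf("a2")), Leaf("a3"))

// Collect the leaf values in the order of traversal.
val leaves: List[String] = foldLeft(t)(List[String]())((acc, a) => acc :+ a)
```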
Since we used a non-empty list NEL, a Branch() value is guaranteed to have at least
one branch. If we used an ordinary List instead, we could (by mistake) create a
tree with empty branches.
Exercise 3.3.4.1 Define the function foldLeft for a rose tree of type TreeN[A] shown
above. Assume that a foldLeft function is already available for the type NEL. The
required type signature and a test:
The case Branch1 describes a perfect-shaped tree with total depth 1, the case Branch2
has total depth 2, and so on. The non-trivial step is to notice that each case class
Branch𝑛 uses the previous case class’s data structure with the type parameter set to
(A, A) instead of A. So, we can rewrite the above definition as:
sealed trait PTree[A]
final case class Leaf[A](x: A) extends PTree[A]
final case class Branch1[A](xs: Leaf[(A, A)]) extends PTree[A]
final case class Branch2[A](xs: Branch1[(A, A)]) extends PTree[A]
??? // Need an infinitely long definition.
We can now apply the type recursion trick: replace the type Branch𝑛−1 [(A, A)] in
the definition of Branch𝑛 by the recursively used type PTree[(A, A)]. Now we can
define a perfect-shaped binary tree:
sealed trait PTree[A]
final case class Leaf[A](x: A) extends PTree[A]
final case class Branch[A](xs: PTree[(A, A)]) extends PTree[A]
Since we used some tricks to figure out the definition of PTree[A], let us verify
that this definition actually describes the recursive disjunctive type we wanted.
The only way to create a structure of type PTree[A] is to create a Leaf[A] or a
Branch[A]. A value of type Leaf[A] is itself a perfect-shaped tree. It remains to
consider the case of Branch[A]. Creating a Branch[A] requires a previously created
PTree with values of type (A, A) instead of A. By the inductive assumption, the
previously created PTree[A] would have the correct shape. Now, it is clear that if
we replace the type parameter A by the pair (A, A), a perfect-shaped tree such as
the tree with leaves 𝑎1, 𝑎2, 𝑎3, 𝑎4 is replaced by the tree with leaves
𝑎1′, 𝑎1″, 𝑎2′, 𝑎2″, 𝑎3′, 𝑎3″, 𝑎4′, 𝑎4″ (each leaf value 𝑎𝑖 became
a pair 𝑎𝑖′, 𝑎𝑖″). That tree is again perfect-shaped but is one level deeper. We see that
PTree[A] is a correct definition of a perfect-shaped binary tree.
Example 3.3.5.1 Define a (non-tail-recursive) map function for a perfect-shaped
binary tree. The required type signature and a test:
def map[A, B](t: PTree[A])(f: A => B): PTree[B] = ???
In the inductive step, we are given a previous tree value xs: PTree[(A, A)]. It is
clear that we need to apply map recursively to xs. Let us try:
def map[A, B](t: PTree[A])(f: A => B): PTree[B] = t match {
case Leaf(x) => Leaf(f(x))
case Branch(xs) => Branch(map(xs)(f)) // Type error!
}
Here, map(xs)(f) has an incorrect type of the function f. Since xs has type PTree[(A,
A)], the recursive call map(xs)(f) requires f to be of type ((A, A)) => (B, B) instead
of A => B. So, we need to provide a function of the correct type instead of f. A
function of type ((A, A)) => (B, B) will be obtained out of f: A => B if we apply
f to each part of the tuple (A, A). The code for that function is { case (x, y) =>
(f(x), f(y)) }. Therefore, we can implement map as:
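A minimal sketch of that implementation, with the PTree definition repeated so that the block is self-contained (the test value t is an illustrative assumption):

```scala
sealed trait PTree[A]
final case class Leaf[A](x: A) extends PTree[A]
final case class Branch[A](xs: PTree[(A, A)]) extends PTree[A]

def map[A, B](t: PTree[A])(f: A => B): PTree[B] = t match {
  case Leaf(x)    => Leaf(f(x))
  // The recursive call works on PTree[(A, A)], so `f` is applied to both parts of each pair.
  case Branch(xs) => Branch(map(xs) { case (x, y) => (f(x), f(y)) })
}

// A perfect-shaped tree of depth 2 holding the values 1, 2, 3, 4.
val t: PTree[Int] = Branch(Branch(Leaf(((1, 2), (3, 4)))))
val t10: PTree[Int] = map(t)(_ * 10)
```

The recursive call uses map at a different type parameter, (A, A) instead of A. Scala accepts this because the return type of map is declared explicitly.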
scala> depth(Branch(Branch(Leaf((("a","b"),("c","d"))))))
res2: Int = 2
We will need one case class for each of Sqrt, Add, Mul, and Div. An additional oper-
ation, Num, will lift ordinary integers into “safe integers”. So, we define the disjunc-
tive type (Arith) for the “safe arithmetic” sub-language as:
sealed trait Arith
final case class Num(x: Int) extends Arith
final case class Sqrt(x: Arith) extends Arith
final case class Add(x: Arith, y: Arith) extends Arith
final case class Mul(x: Arith, y: Arith) extends Arith
final case class Div(x: Arith, y: Arith) extends Arith
A value of type Arith is either a Num(x) for some integer x, or an Add(x, y) where x
and y are previously defined Arith expressions, or another operation.
This type definition is similar to the binary tree type if we rename Leaf to Num
and Branch to Add:
sealed trait Tree
final case class Leaf(x: Int) extends Tree
final case class Branch(x: Tree, y: Tree) extends Tree
However, the Arith type is a tree that supports four different types of branches,
some with branching number 1 and others with branching number 2.
This example illustrates the structure of an AST: it is a tree of a specific shape,
with leaves and branches chosen from a specified set of allowed possibilities. In
the “safe arithmetic” example, we have a single allowed type of leaf (Num) and four
allowed types of branches (Sqrt, Add, Mul, and Div).
This completes the first stage of implementing the sub-language. We may now
use the disjunctive type Arith to create expressions in the sub-language. For
example, √16 ∗ (1 + 2) is represented by:
scala> val x: Arith = Mul(Sqrt(Num(16)), Add(Num(1), Num(2)))
x: Arith = Mul(Sqrt(Num(16)),Add(Num(1),Num(2)))
(Diagram: the expression tree for x, with root Mul, branches Sqrt and Add, and leaves Num(16), Num(1), Num(2).)
The expressions 20 + 1/0 and 10 ∗ √−1 are represented by:
scala> val y: Arith = Add(Num(20), Div(Num(1), Num(0)))
y: Arith = Add(Num(20),Div(Num(1),Num(0)))
scala> run(y)
res1: Either[String, Int] = Left("error: 1 / 0")
scala> run(z)
res2: Either[String, Int] = Left("error: sqrt(-1)")
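The evaluator run is used above without its code being shown here. The following is a hedged sketch of how such a function could look, not necessarily the implementation used in this chapter. It assumes Scala 2.12 or later (where Either is right-biased), assumes that Sqrt truncates the square root to an integer, and chooses error messages that match the outputs shown above:

```scala
sealed trait Arith
final case class Num(x: Int) extends Arith
final case class Sqrt(x: Arith) extends Arith
final case class Add(x: Arith, y: Arith) extends Arith
final case class Mul(x: Arith, y: Arith) extends Arith
final case class Div(x: Arith, y: Arith) extends Arith

// Evaluate an expression; the first error encountered stops the computation.
def run(e: Arith): Either[String, Int] = e match {
  case Num(x)  => Right(x)
  case Sqrt(x) => run(x).flatMap { r =>
    if (r < 0) Left(s"error: sqrt($r)")
    else Right(math.sqrt(r.toDouble).toInt) // Assumption: truncate to an integer.
  }
  case Add(x, y) => for { a <- run(x); b <- run(y) } yield a + b
  case Mul(x, y) => for { a <- run(x); b <- run(y) } yield a * b
  case Div(x, y) =>
    for {
      a <- run(x)
      b <- run(y)
      r <- if (b == 0) Left(s"error: $a / 0") else Right(a / b)
    } yield r
}

val x: Arith = Mul(Sqrt(Num(16)), Add(Num(1), Num(2)))
val y: Arith = Add(Num(20), Div(Num(1), Num(0)))
```

With these assumptions, run(x) gives Right(12) and run(y) gives Left("error: 1 / 0").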
3.4 Summary
What problems can we solve now?
3.4.1 Examples
Example 3.4.1.1 Define a disjunctive type DayOfWeek representing the seven days
of a week.
Solution Since each day carries no information except the day’s name, we can
use empty case classes and represent the day’s name via the name of the case class:
sealed trait DayOfWeek
final case class Sunday() extends DayOfWeek
final case class Monday() extends DayOfWeek
final case class Tuesday() extends DayOfWeek
final case class Wednesday() extends DayOfWeek
final case class Thursday() extends DayOfWeek
final case class Friday() extends DayOfWeek
final case class Saturday() extends DayOfWeek
This data type is analogous to an enumeration type in C or C++:
typedef enum { Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday }
DayOfWeek;
Example 3.4.1.2 Modify DayOfWeek so that on Fridays the values additionally rep-
resent names of restaurants and amounts paid, and on Saturdays a wake-up time.
Solution For the days where additional information is given, we use non-
empty case classes:
sealed trait DayOfWeekX
final case class Sunday() extends DayOfWeekX
final case class Monday() extends DayOfWeekX
final case class Tuesday() extends DayOfWeekX
final case class Wednesday() extends DayOfWeekX
final case class Thursday() extends DayOfWeekX
final case class Friday(restaurant: String, amount: Int) extends DayOfWeekX
final case class Saturday(wakeUpAt: java.time.LocalTime) extends DayOfWeekX
This data type is no longer equivalent to an enumeration type.
Example 3.4.1.3 Define a disjunctive type that describes the real roots of the
equation 𝑎𝑥² + 𝑏𝑥 + 𝑐 = 0, where 𝑎, 𝑏, 𝑐 are arbitrary real numbers. Write a function
that returns a value of that type and solves a given equation of the form
𝑎𝑥² + 𝑏𝑥 + 𝑐 = 0.
Solution Begin by solving the equation and enumerating all the possible cases.
It may happen that 𝑎 = 𝑏 = 𝑐 = 0, and then all 𝑥 are roots. If 𝑎 = 𝑏 = 0 but 𝑐 ≠ 0,
the equation is 𝑐 = 0, which has no roots. If 𝑎 = 0 but 𝑏 ≠ 0, the equation becomes
𝑏𝑥 + 𝑐 = 0, having a single root. If 𝑎 ≠ 0 and 𝑏² > 4𝑎𝑐, we have two distinct real
roots. If 𝑎 ≠ 0 and 𝑏² = 4𝑎𝑐, we have one real root. If 𝑎 ≠ 0 and 𝑏² < 4𝑎𝑐, we have no real
roots. The resulting type definition can be written as:
sealed trait RootsOfQ2
final case class AllRoots() extends RootsOfQ2
final case class ConstNoRoots() extends RootsOfQ2
final case class Linear(x: Double) extends RootsOfQ2
final case class NoRealRoots() extends RootsOfQ2
final case class OneRootQ(x: Double) extends RootsOfQ2
final case class TwoRootsQ(x: Double, y: Double) extends RootsOfQ2
This disjunctive type contains six parts: three parts are empty tuples and two
parts are single-element tuples; but this is not a useless redundancy. We would
lose information if we reused Linear for the two cases (𝑎 = 0, 𝑏 ≠ 0) and (𝑎 ≠ 0,
𝑏² = 4𝑎𝑐), or if we reused a single case class such as NoRoots() for the two
different no-roots cases (ConstNoRoots and NoRealRoots).
To solve a given equation, we need to decide which part of the disjunctive type
to return. The code is:
def solveQ2(a: Double, b: Double, c: Double) : RootsOfQ2 = (a, b, c) match {
case (0.0, 0.0, 0.0) => AllRoots()
case (0.0, 0.0, _) => ConstNoRoots() // The equation is `c = 0` with nonzero `c`.
case (0.0, _, _) => Linear(-c / b)
case _ => // We match here only if `a` is nonzero.
val d = b * b - 4 * a * c
val p = - b / (2.0 * a)
if (d < 0.0) NoRealRoots()
else if (d == 0.0) OneRootQ(p)
else {
val s = math.sqrt(d) / (2.0 * a)
TwoRootsQ(p - s, p + s)
}
}
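As a quick check, the sketch below repeats the type and the solver in self-contained form and lists one sample equation for each of the six cases. The sample coefficients are illustrative assumptions, chosen so that the floating-point results are exact:

```scala
sealed trait RootsOfQ2
final case class AllRoots() extends RootsOfQ2
final case class ConstNoRoots() extends RootsOfQ2
final case class Linear(x: Double) extends RootsOfQ2
final case class NoRealRoots() extends RootsOfQ2
final case class OneRootQ(x: Double) extends RootsOfQ2
final case class TwoRootsQ(x: Double, y: Double) extends RootsOfQ2

def solveQ2(a: Double, b: Double, c: Double): RootsOfQ2 = (a, b, c) match {
  case (0.0, 0.0, 0.0) => AllRoots()
  case (0.0, 0.0, _)   => ConstNoRoots()
  case (0.0, _, _)     => Linear(-c / b)
  case _               => // We match here only if `a` is nonzero.
    val d = b * b - 4 * a * c
    val p = -b / (2.0 * a)
    if (d < 0.0) NoRealRoots()
    else if (d == 0.0) OneRootQ(p)
    else {
      val s = math.sqrt(d) / (2.0 * a)
      TwoRootsQ(p - s, p + s)
    }
}

val samples: Seq[RootsOfQ2] = Seq(
  solveQ2(0, 0, 0),  // AllRoots()
  solveQ2(0, 0, 1),  // ConstNoRoots()
  solveQ2(0, 2, -4), // Linear(2.0)
  solveQ2(1, 0, 1),  // NoRealRoots()
  solveQ2(1, 2, 1),  // OneRootQ(-1.0)
  solveQ2(1, -3, 2)  // TwoRootsQ(1.0, 2.0)
)
```

Comparing Double values with == is reliable here only because the sample coefficients are small integers. For general inputs, a tolerance-based comparison of the discriminant would be a safer design.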
Example 3.4.1.4 Define a function rootAverage that computes the average value of
all real roots of a general quadratic equation, where the set of roots is represented
by the type RootsOfQ2 defined in Example 3.4.1.3. The required type signature is:
val rootAverage: RootsOfQ2 => Option[Double] = ???
Return None if the average is undefined (no roots or all values are roots).
Solution The average is defined only in cases Linear, OneRootQ, and TwoRootsQ.
In all other cases, we must return None. We implement this via pattern matching:
val rootAverage: RootsOfQ2 => Option[Double] = roots => roots match {
case Linear(x) => Some(x)
case OneRootQ(x) => Some(x)
case TwoRootsQ(x, y) => Some((x + y) * 0.5)
case _ => None
}
We do not need to enumerate all other cases since the underscore (_) matches
everything that the previous cases did not match.
In Scala, the often-used code pattern x => x match { case ... => ... } can be
shortened to just the nameless function { case ... => ... }. Then the code is:
val rootAverage: RootsOfQ2 => Option[Double] = {
case Linear(x) => Some(x)
case OneRootQ(x) => Some(x)
case TwoRootsQ(x, y) => Some((x + y) * 0.5)
case _ => None
}
Test it:
scala> Seq(NoRealRoots(), OneRootQ(1.0), TwoRootsQ(1.0, 2.0),
AllRoots()).map(rootAverage)
res0: Seq[Option[Double]] = List(None, Some(1.0), Some(1.5), None)
It remains to remove the None values and to compute the mean of the resulting
sequence. The Scala library defines the flatten method that removes Nones and
transforms Seq[Option[A]] into Seq[A]:
scala> largest.flatten
res0: Seq[Double] = List(0.9346072365885472, 1.1356234869160806,
0.9453181931646322, 1.1595052441078866, 0.5762252742788...
Now compute the mean of the last sequence. Since the flatten operation is pre-
ceded by map, we can replace it by a flatMap. The final code is:
val largest = Seq.fill(100)(QEqu(random(), random()))
.map(solve)
.flatMap {
case OneRoot(x) => Some(x)
case TwoRoots(x, y) => Some(math.max(x, y))
case _ => None
}
In line 3, we wrote the type annotation eab: Either[A, B] only for clarity. It is not
required here since the Scala compiler can deduce the type of the pattern variable
eab from the fact that we are matching a value of type Option[Either[A, B]].
In the scope of line 2, we need to return a value of type Either[A, Option[B]]. A
value of that type must be either a Left(x) for some x: A, or a Right(y) for some y:
Option[B], where y must be either None or Some(z) with a z: B. However, in our case
the code is of the form case None => ???, and we cannot produce any values x: A
or z: B since A and B are arbitrary, unknown types. The only remaining possibility
is to return Right(y) with y = None, and so the code must be:
case None => Right(None) // No other choice here.
In the next scope, we can perform pattern matching on the value eab:
case Some(eab: Either[A, B]) => eab match {
case Left(a) => ???
case Right(b) => ???
}
It remains to figure out what expressions to write in each case. In the case
Left(a) => ???, we have a value of type A, and we need to compute a value of type
Either[A, Option[B]]. We use the same argument as before: The return value must
be Left(x) for some x: A, or Right(y) for some y: Option[B]. At this point, we have
a value of type A but no values of type B. So, we have two possibilities: to return
Left(a) or to return Right(None). If we decide to return Left(a), the code is:
1 def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = {
2 case None => Right(None) // No other choice here.
3 case Some(eab) => eab match {
4 case Left(a) => Left(a) // Could return Right(None) here.
5 case Right(b) => ???
6 }
7 }
Should we return Left(a) or Right(None) in line 4? Both choices will satisfy the
required return type Either[A, Option[B]]. However, if we return Right(None) in
that line, we will ignore the given value a: A, losing information. So, we return
Left(a) in line 4.
Similarly, we find in line 5 that we may return Right(None) or Right(Some(b)).
Both choices will have the required return type (Either[A, Option[B]]), but the first
choice ignores the given value b: B. To preserve information, we need to make the
second choice:
1 def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = {
2 case None => Right(None)
3 case Some(eab) => eab match {
4 case Left(a) => Left(a)
5 case Right(b) => Right(Some(b))
6 }
7 }
We can now refactor this code into a somewhat more readable form by using
nested patterns:
def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = {
case None => Right(None)
case Some(Left(a)) => Left(a)
case Some(Right(b)) => Right(Some(b))
}
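To see that this function preserves information, we can apply it to each of the three possible shapes of input. The sketch repeats f1 so that it is self-contained; the concrete types Int and String and the name g are chosen only for illustration:

```scala
def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = {
  case None           => Right(None)
  case Some(Left(a))  => Left(a)
  case Some(Right(b)) => Right(Some(b))
}

// Fix the type parameters to obtain an ordinary function value.
val g: Option[Either[Int, String]] => Either[Int, Option[String]] = f1[Int, String]

val r1 = g(None)             // Right(None)
val r2 = g(Some(Left(1)))    // Left(1)
val r3 = g(Some(Right("a"))) // Right(Some("a"))
```

The three inputs produce three different outputs, so the input can be recovered from the output.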
Option[(A, B)]. A value of that type is either None or Some((x, y)) where we would
need to choose some x: A and y: B. Since A and B are arbitrary types, we cannot
produce new values x and y from scratch. The only way of obtaining x and y is
to set x = a and y = b. So, our choices are to return Some((a, b)) or None. We reject
returning None since that would unnecessarily lose information. Thus, we continue
writing code as:
1 def f2[A, B]: (Option[A], Option[B]) => Option[(A, B)] = {
2 case (Some(a), Some(b)) => Some((a, b))
3 case (Some(a), None) => ???
In lines 4–5, we find that there is no choice other than returning None. So, we can
simplify the code:
def f2[A, B]: (Option[A], Option[B]) => Option[(A, B)] = {
case (Some(a), Some(b)) => Some((a, b))
case _ => None // No other choice here.
}
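A short usage sketch for f2, repeated here in self-contained form (the concrete types Int and String and the name h are illustrative assumptions):

```scala
def f2[A, B]: (Option[A], Option[B]) => Option[(A, B)] = {
  case (Some(a), Some(b)) => Some((a, b))
  case _                  => None // No other choice here.
}

val h: (Option[Int], Option[String]) => Option[(Int, String)] = f2[Int, String]

val both    = h(Some(1), Some("a")) // Some((1, "a"))
val onlyOne = h(Some(1), None)      // None
val neither = h(None, None)         // None
```

Unlike f1 above, this function necessarily loses information: the value 1 in the second call cannot be recovered from the result None.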
3.4.2 Exercises
Exercise 3.4.2.1 Define a disjunctive type CellState representing the visual state
of one cell in the “Minesweeper”3 game: A cell can be closed (showing nothing),
or show a bomb, or be open and show the number of bombs in neighbor cells.
Exercise 3.4.2.2 In the context of the “Minesweeper” game (Exercise 3.4.2.1), count
the total number of cells with zero neighbor bombs shown by implementing a
function with type signature Seq[Seq[CellState]] => Int.
Exercise 3.4.2.3 Define a disjunctive type RootOfLinear representing all possibili-
ties for the solution of the equation 𝑎𝑥 + 𝑏 = 0 for arbitrary real 𝑎, 𝑏. (The possibil-
ities are: no roots; one root; all 𝑥 are roots.) Implement the solution as a function
solve1 with type signature:
def solve1: ((Double, Double)) => RootOfLinear = ???
3 https://en.wikipedia.org/wiki/Minesweeper_(video_game)
Exercise 3.4.2.8 Use pattern matching to implement functions with given type
signatures, preserving information as much as possible:
def f1[A, B]: Option[(A, B)] => (Option[A], Option[B]) = ???
def f2[A, B]: Either[A, B] => (Option[A], Option[B]) = ???
def f3[A, B, C]: Either[A, Either[B, C]] => Either[Either[A, B], C] = ???
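[Figure 3.1: the domain of RootsOfQ, pictured as a point for NoRoots(), a line with coordinate x for OneRoot(x), and a plane with coordinates x, y for TwoRoots(x, y).]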
at a given place (say, Option[Option[A]] instead of Option[A] for the second value
in the list) should cause a type error. Implement (not necessarily tail-recursive)
functions map and foldLeft for ListX. The type signatures:
def map[A, B](lx: ListX[A])(f: A => B): ListX[B] = ???
def foldLeft[A, R](lx: ListX[A])(init: R)(f: (R, A) => R): R = ???
The type RootsOfQ represents the set of admissible values of the argument r, that
is, the mathematical domain of the function isDoubleRoot. What kind of domain is
that? The set of real roots of a quadratic equation 𝑥² + 𝑏𝑥 + 𝑐 = 0 can be empty, or it
can contain a single real number 𝑥, or a pair of real numbers (𝑥, 𝑦). Geometrically,
a number 𝑥 is pictured as a point on a line (a one-dimensional space), and a pair
of numbers (𝑥, 𝑦) is pictured as a point in a Cartesian plane (a two-dimensional
space). The no-roots case corresponds to a zero-dimensional space, which can be
pictured as a single point (see Figure 3.1). The point, the line, and the plane do not
intersect (i.e., have no common points). Together, they form the set of the possible
roots of the quadratic equation 𝑥² + 𝑏𝑥 + 𝑐 = 0.
In the mathematical notation, a one-dimensional real space is denoted by ℝ, a
3.5 Discussion and further developments
Consider the value 𝑢 used by the mathematical set of pairs (NoRoots, 𝑢) with 𝑢 ∈ ℝ⁰. Since ℝ⁰
consists of a single point, there is only one possible value of 𝑢. Similarly, the Unit
type in Scala has only one distinct value, written as (). A case class with no parts,
such as NoRoots, has only one distinct value, written as NoRoots(). The Scala value
NoRoots() is fully analogous to the mathematical notation (NoRoots, 𝑢) with 𝑢 ∈ ℝ⁰.
So, case classes with no parts are similar to Unit except for an added name. For
instance, NoRoots() can be regarded as the Unit value () with name NoRoots. For this
reason, this book calls them “named unit” types.
This type does not include any labels telling us which of the values is present.
Without a label, we (and the compiler) will not know whether a given value of
type i_d_l represents an int, a double, or a long. This will lead to errors that are
hard to detect.
Programming languages of the C family (C, C++, Objective C, Java) support
enumeration (enum) types, which are a limited form of disjunctive types, and a
switch operation, which is a limited form of pattern matching. An enum type decla-
ration in Java looks like this:
enum Color { RED, GREEN, BLUE; }
If we add extra data to the enum types, allowing the tuples to be non-empty, and
extend the switch expression to be able to handle the extra data, we will recover
the full functionality of disjunctive types. A definition of RootsOfQ could then look
like this:
enum RootsOfQ { // This is not valid in Java!
NoRoots(), OneRoot(Double x), TwoRoots(Double x, Double y);
}
4 The programming languages Ada and Pascal support disjunctive types but no other FP features.
Scala 3 has a shorter syntax for disjunctive types5 that resembles Java’s “enum”:
enum RootsOfQ {
case NoRoots
case OneRoot(x: Double)
case TwoRoots(x: Double, y: Double)
}
For comparison, the syntax for a disjunctive type equivalent to RootsOfQ in OCaml
and Haskell is:
(* OCaml *)
type roots_of_q = NoRoots | OneRoot of float | TwoRoots of float * float
-- Haskell
data RootsOfQ = NoRoots | OneRoot Double | TwoRoots (Double, Double)
This is more concise than the Scala syntax. When reasoning about disjunctive
types, it is inconvenient to write out long type definitions. Chapter 5 will intro-
duce a mathematical notation designed for efficient reasoning about types. That
notation is even more concise than the syntax of Haskell or OCaml.
5 https://dotty.epfl.ch/docs/reference/enums/adts.html
There is a similar connection between logical conjunctions and tuple types.
Consider the named tuple (i.e., a case class) TwoRoots(x: Double, y: Double). We can
have a value of type TwoRoots only if we have two values of type Double. Rewriting
this sentence as a logical formula, we get:
We find that tuples are related to logical conjunctions in the same way as dis-
junctive types are related to logical disjunctions. This is the main reason for choos-
ing the name “disjunctive types”.6
The correspondence between disjunctions, conjunctions, and data types is ex-
plained in more detail in Chapter 5. For now, we note that the operations of con-
junction and disjunction are not sufficient to produce all possible logical expres-
sions. To obtain a complete logic, it is also necessary to have the logical
implication 𝐴 → 𝐵 (“if 𝐴 is true then 𝐵 is true”). It turns out that the implication 𝐴 → 𝐵
is related to the function type A => B in the same way as the disjunction operation
is related to disjunctive types and the conjunction to tuples. In Chapter 4, we will
study function types in depth.
6 Disjunctive types are also called sum types, co-product types, variants, and tagged unions. This
book uses the terms “disjunctive types” and “co-product types” interchangeably.
4 The logic of types. II. Curried
functions
4.1 Functions that return functions
4.1.1 Motivation and first examples
Consider the task of preparing a logger function that prints messages with a con-
figurable prefix.
A simple logger function can be a value of type String => Unit, such as:
val logger: String => Unit = { message => println(s"INFO: $message") }
scala> warn("goodbye")
WARN: goodbye
The values info and warn can be used by any code that needs a logging function.
It is important that the prefix is “baked into” functions created by logWith. A
logger such as warn will always print messages with the prefix "WARN", and the
prefix cannot be changed any more. This is because the value prefix is treated as
a local constant within the body of the nameless function computed and returned
by logWith. For instance, the body of the function warn is equivalent to:
{ val prefix = "WARN"; (message => println(s"$prefix: $message")) }
scala> c = 1000
c: Int = 1000
scala> f(10)
res1: Int = 1210
we would expect that info is the same value as logWith("INFO"), and so the code
info("hello") should have the same effect as the code logWith("INFO")("hello").
This is indeed so:
scala> logWith("INFO")("hello")
INFO: hello
The syntax logWith("INFO")("hello") looks like the function logWith applied to two
arguments. Yet, logWith was defined as a function with a single argument of
type String. This is not a contradiction because logWith("INFO") returns a func-
tion that accepts an additional argument. So, expressions logWith("INFO") and
logWith("INFO")("hello") are both valid. In this sense, we are allowed to apply
logWith to one argument at a time.
A function that can be applied to arguments in this way is called a curried
function.
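For reference, here is a minimal sketch of a logWith function that is consistent with the behavior described above. It is restated as an assumption and may differ in detail from the definition used earlier in this section:

```scala
// A function of one argument (the prefix) that returns a new logger function.
def logWith(prefix: String): String => Unit = { message =>
  println(s"$prefix: $message")
}

val info: String => Unit = logWith("INFO") // The prefix "INFO" is now fixed.
val warn: String => Unit = logWith("WARN")

info("hello")            // Prints "INFO: hello".
logWith("INFO")("hello") // Prints the same line: both arguments applied in sequence.
```

The type of logWith, viewed as a function value, is String => String => Unit.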
While a curried function can be applied to one argument at a time, an uncurried
function must be applied to all arguments at once, e.g.:
def prefixLog(prefix: String, message: String): Unit = println(s"$prefix: $message")
The type of the curried function logWith is String => (String => Unit). By Scala’s
syntax conventions, the function arrow (=>) groups to the right. So, the parentheses
in the type expression String => (String => Unit) are not needed. The function’s
type can be written as String => String => Unit.
The type String => String => Unit is different from (String => String) => Unit,
which is the type of a function returning Unit and having a single argument of
type String => String.
When an argument’s type is a function type, e.g., String => String, it must be
enclosed in parentheses, as in (String => String) => Unit.
In general, a curried function takes an argument and returns another function
that again takes an argument and returns another function, and so on, until fi-
nally a non-function type is returned. So, the type signature of a curried function
generally looks like A => B => C => ... => R => S, where A, B, ..., R are the curried
arguments and S is the “final” result type.
For example, in the type expression A => B => C => D the types A, B, C are the
types of curried arguments, and D is the final result type. It takes time to get used
to reading this kind of syntax.
In Scala, functions defined with multiple argument lists (enclosed in multiple
pairs of parentheses) are curried functions. We have seen examples of curried
functions before:
def map[A, B](xs: Seq[A])(f: A => B): Seq[B]
def fmap[A, B](f: A => B)(xs: Option[A]): Option[B]
def foldLeft[A, R](xs: Seq[A])(init: R)(update: (R, A) => R): R
The type signatures of these functions can be also written equivalently without
argument names, although this is less convenient in practical coding:
def map[A, B]: Seq[A] => (A => B) => Seq[B]
def fmap[A, B]: (A => B) => Option[A] => Option[B]
def foldLeft[A, R]: Seq[A] => R => ((R, A) => R) => R
The function takes an integer x and returns the expression y => x - y, which is
a function of type Int => Int. The code of f1 can be written equivalently as:
val f1: Int => Int => Int = { x => y => x - y }
The function f2 has type signature (Int, Int) => Int. Calling f1 and f2 requires
different syntax:
scala> f1(20)(4)
res0: Int = 16
scala> f2(20, 4)
res1: Int = 16
The main difference is that f2 must be applied at once to both arguments, while
f1 could be applied to just the first argument (20). Applying a curried function to
some but not all possible arguments is called a partial application. The result of
evaluating f1(20) is a function that can be later applied to another argument:
scala> val r1 = f1(20)
r1: Int => Int = <function1>
scala> r1(4)
res2: Int = 16
(The type annotation Int => Int is required in line 1.) This code creates a func-
tion r2 by applying f2 to the first argument but not to the second. Then r2 is the
same function as r1 defined above; i.e., r2 returns the same values for the same
arguments as r1. A more verbose syntax for a partial application is:
scala> val r3: Int => Int = { x => f2(20, x) } // Same as r2 above.
r3: Int => Int = <function1>
scala> r3(4)
res4: Int = 16
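The three ways of performing a partial application can be compared side by side. In this sketch, the definitions of f1 and f2 are restated as assumptions that are consistent with the results shown above (both compute x - y):

```scala
val f1: Int => Int => Int = { x => y => x - y } // Curried.
def f2(x: Int, y: Int): Int = x - y             // Uncurried.

val r1: Int => Int = f1(20)             // Partial application of a curried function.
val r2: Int => Int = f2(20, _)          // Underscore syntax for an uncurried function.
val r3: Int => Int = { x => f2(20, x) } // The same, written out in full.
```

All three values are functions of type Int => Int that subtract their argument from 20.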
We can see that a curried function, such as f1, is better adapted for partial ap-
plication than f2, because the syntax is shorter. However, the types of functions f1
and f2 are equivalent: for any f1 of type Int => Int => Int we can reconstruct f2
of type (Int, Int) => Int and vice versa, without loss of information:
def f2new(x: Int, y: Int): Int = f1(x)(y) // f2new is equal to f2
def f1new: Int => Int => Int = { x => y => f2(x, y) } // f1new is equal to f1
It is clear that the function f1new computes the same results as f1, and that the
function f2new computes the same results as f2. The equivalence of the functions
f1 and f2 is not equality — these functions are different; but each of them can be re-
constructed from the other. The one-to-one correspondence between all functions
of type Int => Int => Int and all functions of type (Int, Int) => Int is what we call
the “equivalence of types”.
More generally, a curried function has a type signature of the form A => B => C
=> ... => R => S, where A, B, C, ..., S are some types. A function with this type signa-
ture is equivalent to an uncurried function with type signature (A,B,C,...,R) => S.
The uncurried function takes all arguments at once, while the curried function
takes one argument at a time. Other than that, these two functions compute the
same results given the same arguments.
We have seen how a curried function can be converted to an equivalent un-
curried one, and vice versa. The Scala library defines the methods curried and
uncurried that convert between these forms of functions. To convert between f2
and f1:
scala> val f1c = (f2 _).curried
f1c: Int => (Int => Int) = <function1>
The syntax (f2 _) is needed in Scala to convert methods to function values. Recall
that Scala has two ways of defining a function: one as a method (defined using
def), another as a function value (defined using val). The extra underscore is un-
necessary in Scala 3.
The methods curried and uncurried are quick to implement (see Section 4.2.1
below). These functions are called the currying and uncurrying transformations.
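A small round-trip sketch using the library methods. Here f2 is defined as a function value (an assumption made for this illustration), so that no method-to-function conversion is needed in either Scala 2 or Scala 3:

```scala
val f2: (Int, Int) => Int = (x, y) => x - y

// Function2 has a `curried` method; the Function object has `uncurried`.
val f1c: Int => Int => Int = f2.curried
val f2u: (Int, Int) => Int = Function.uncurried(f1c)

val a = f1c(20)(4) // 16
val b = f2u(20, 4) // 16
```

Converting to the curried form and back gives a function that computes the same results as the original f2.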
4.2 Fully parametric functions
We can introduce type parameters into the type signature of swap to make it fully
parametric:
def swap[A, B](p: (A, B)): (B, A) = p match {
case (x, y) => (y, x)
}
Converting swap into a fully parametric function is possible because the operation
of swapping the parts of a tuple (A, B) works in the same way for all types A, B. No
changes were made in the body of the function. The specialized version of swap
working on (Double, Double) can be obtained from the fully parametric version of
swap if we set the type parameters as A = Double, B = Double.
In contrast, the function cos_sin performs a computation that is specific to the
type Double. That computation cannot be generalized to an arbitrary type param-
eter A instead of the type Double. For instance, the code of cos_sin uses the function
math.sqrt, which is defined only for the type Double.
To generalize cos_sin to a fully parametric function that works with a type pa-
rameter A, we would need to replace all computations specific to the type Double
by new arguments working with the type parameter A. For example, we could in-
troduce two new arguments (named, say, distance and ratio) and replace cos_sin
by the fully parametric function cos_sin_parametric:
def cos_sin_parametric[A](p: (A, A), distance: (A, A) => A, ratio: (A, A) => A): (A, A) =
  p match {
    case (x, y) =>
      val r = distance(x, y)
      (ratio(x, r), ratio(y, r))
  }
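To recover a computation on Double values from the parametric version, we pass the Double-specific operations as arguments. The sketch below assumes that the original cos_sin divides each coordinate by the Euclidean distance from the origin, which is suggested by its use of math.sqrt; the name cosSinDouble is hypothetical:

```scala
def cos_sin_parametric[A](p: (A, A), distance: (A, A) => A, ratio: (A, A) => A): (A, A) =
  p match {
    case (x, y) =>
      val r = distance(x, y)
      (ratio(x, r), ratio(y, r))
  }

// Set A = Double and supply the operations that are specific to Double.
def cosSinDouble(p: (Double, Double)): (Double, Double) =
  cos_sin_parametric[Double](p, (x, y) => math.sqrt(x * x + y * y), _ / _)

val result = cosSinDouble((3.0, 4.0)) // (0.6, 0.8)
```

All knowledge about the type Double is now contained in the two function arguments, while the body of cos_sin_parametric works in the same way for any type A.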
A fully parametric function has all its arguments typed with type parameters or
with some combinations of type parameters, i.e., type expressions such as (A, B)
or X => Either[X, Y].
The swap operation for pairs is already defined in the Scala library:
scala> (1, "abc").swap
res0: (String, Int) = (abc,1)
If needed, other swapping functions can be implemented for tuples with more
elements, e.g.:
def swapAC[A, B, C]: ((A, B, C)) => (C, B, A) = { case (x, y, z) => (z, y, x) }
The Scala syntax requires double parentheses around tuple types of arguments but
not around the tuple type of a function’s result. So, the function cos_sin may be
written as a value like this:
val cos_sin: ((Double, Double)) => (Double, Double) = ...
Further examples of fully parametric functions are the identity function, the
const function, the function composition methods, and the currying / uncurrying
transformations.
In the mathematical notation, we write the identity function as “id” for brevity.
The function available in the Scala library as Function.const[C, X] takes an argu-
ment c of type C and returns a new function that always returns c:
def const[C, X](c: C): X => C = (_ => c)
The syntax _ => c is used to emphasize that the new returned function ignores its
argument. One-argument functions that ignore their argument are called constant
functions.
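A short usage sketch of const as defined above (the value name always5 is an illustrative assumption):

```scala
def const[C, X](c: C): X => C = (_ => c)

// The type parameters C = Int and X = String are inferred from the annotation.
val always5: String => Int = const(5)

val a = always5("abc") // 5
val b = always5("")    // 5
```

The result does not depend on the argument of always5, which is what makes it a constant function.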
scala> h(40)
res36: String = Result x = 45.67
The Scala compiler derives the type of h automatically as Int => String.
This book denotes the forward composition by the symbol # (which can be read
as “before”). We define 𝑓 # 𝑔 (reads “ 𝑓 before 𝑔”) by:
𝑓 # 𝑔 ≝ 𝑥 → 𝑔(𝑓(𝑥)) .   (4.1)
This type signature requires the types of the function arguments to match in a
certain way, or else the composition is undefined (and the code would produce
a type error). The method andThen is an example of a function that both returns a
new function and takes other functions as arguments.
The backward composition of two functions 𝑓 and 𝑔 works in the opposite
order: first 𝑔 is applied and then 𝑓 . This operation is denoted by the symbol ◦
(pronounced “after”):
𝑓 ◦ 𝑔 ≝ 𝑥 → 𝑓(𝑔(𝑥)) .   (4.2)
In Scala, the backward composition is called compose and used as f compose g. This
method may be implemented as a fully parametric function:
def compose[X, Y, Z](f: Y => X)(g: Z => Y): Z => X = { z => f(g(z)) }
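The two composition methods from the Scala library can be compared directly. The functions f and g below are illustrative assumptions:

```scala
val f: Int => Int = x => x + 1
val g: Int => String = x => s"value $x"

val h1: Int => String = f andThen g // Forward composition: first f, then g.
val h2: Int => String = g compose f // Backward composition: g after f.

val r1 = h1(41) // "value 42"
val r2 = h2(41) // "value 42"
```

Both h1 and h2 compute g(f(x)). Only the order in which the two functions are written differs.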
We have already seen the methods curried and uncurried from the Scala library.
As an illustration, here is the code for the uncurrying transformation (converting
curried functions to uncurried):
def uncurry[A, B, R](f: A => B => R): ((A, B)) => R = { case (a, b) => f(a)(b) }
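The opposite transformation can be sketched in the same style. The function curry below is an assumption written by analogy with uncurry, and the names sub, subU, and subC are hypothetical:

```scala
def uncurry[A, B, R](f: A => B => R): ((A, B)) => R = { case (a, b) => f(a)(b) }

// Converts a function of a tuple argument into a curried function.
def curry[A, B, R](f: ((A, B)) => R): A => B => R = { a => b => f((a, b)) }

val sub: Int => Int => Int = x => y => x - y
val subU: ((Int, Int)) => Int = uncurry(sub)
val subC: Int => Int => Int = curry(subU)

val u = subU((20, 4)) // 16
val c = subC(20)(4)   // 16
```

Applying uncurry and then curry returns a function that computes the same results as the original sub.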
These examples show that fully parametric functions perform operations so
general that they work in the same way for all types. Some arguments of fully
parametric functions may have complicated types such as A => B => R, which are
type expressions built up from type parameters. But fully parametric functions
do not use values of specific types such as Int or String.
Functions with type parameters are often called “generic”. This book uses the
term “fully parametric” to designate a certain restricted kind of generic functions.
• The two identity laws: the composition of any function 𝑓 with an identity
function (identity[A]) will give again the function 𝑓 .
These laws hold equally for the forward and the backward composition, since
those are just syntactic variants of the same operation. Let us write these laws
rigorously as equations and prove them.
Proofs with forward composition. The composition of the identity function with
an arbitrary function 𝑓 on the left is written as 𝑓 # id. The composition with the
function 𝑓 on the right is written as id # 𝑓 . In both cases, the result must be equal
to the function 𝑓 . The resulting two laws are:
left identity law:   id # 𝑓 = 𝑓 ,        right identity law:   𝑓 # id = 𝑓 .
To prove that these laws hold, we need to show that the functions at both sides
of the laws give the same result when applied to an arbitrary value 𝑥. Let us first
clarify how the type parameters must be set for all types to match consistently.
The laws must hold for an arbitrary function 𝑓 . Assume that 𝑓 has the type
signature 𝐴 → 𝐵, where 𝐴 and 𝐵 are arbitrary types (type parameters). Consider
the left identity law. The function (id # 𝑓 ) is, by definition (4.1), a function that
takes an argument 𝑥, applies id to that 𝑥, and then applies 𝑓 to the result:
id # 𝑓 = (𝑥 → 𝑓(id(𝑥))) .
If 𝑓 has type 𝐴 → 𝐵, its argument must be of type 𝐴, or else the types will not
match. Therefore, the identity function must have type 𝐴 → 𝐴, and the argu-
ment 𝑥 must have type 𝐴. With these choices of the type parameters, the function
(𝑥 → 𝑓 (id(𝑥))) will have type 𝐴 → 𝐵. This type matches the right-hand side of
the law, which is just 𝑓 . We add type annotations to the code as superscripts:
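id :𝐴→𝐴 # 𝑓 :𝐴→𝐵 = (𝑥 :𝐴 → 𝑓(id(𝑥))) :𝐴→𝐵 .
Since id(𝑥) = 𝑥 by definition of the identity function, the code simplifies to:
id # 𝑓 = (𝑥 → 𝑓(id(𝑥))) = (𝑥 → 𝑓(𝑥)) = 𝑓 .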
The last step works since 𝑥 → 𝑓 (𝑥) is a function that takes an argument 𝑥 and
applies 𝑓 to that argument. This is the same function as 𝑓 . We say that 𝑥 → 𝑓 (𝑥)
is an expanded form of the function 𝑓 .
We turn to the right identity law, 𝑓 # id = 𝑓 . Write out the left-hand side:
𝑓 # id = (𝑥 → id ( 𝑓 (𝑥))) .
To check that the types match, assume that 𝑓 :𝐴→𝐵 . Then 𝑥 must have type 𝐴, and
the identity function must have type 𝐵 → 𝐵. The result of id ( 𝑓 (𝑥)) will also have
type 𝐵. With these choices of type parameters, all types match:
𝑓 :𝐴→𝐵 # id :𝐵→𝐵 = (𝑥 :𝐴 → id(𝑓(𝑥))) :𝐴→𝐵 .
Since id(𝑓(𝑥)) = 𝑓(𝑥), we find:
𝑓 # id = (𝑥 → id(𝑓(𝑥))) = (𝑥 → 𝑓(𝑥)) = 𝑓 .
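The identity laws can also be spot-checked in Scala. This is only a sanity check on a few sample values and not a proof, because functions cannot be compared for equality directly. The function f and the sample range are arbitrary choices for this sketch.

```scala
val f: Int => String = x => "n = " + x

// Identity functions at the two types required by the laws.
val idA: Int => Int = identity[Int]
val idB: String => String = identity[String]

val left: Int => String  = idA andThen f  // Corresponds to id # f.
val right: Int => String = f andThen idB  // Corresponds to f # id.

// Compare the results with f on sample arguments.
val lawsHoldOnSamples: Boolean =
  (-5 to 5).forall(x => left(x) == f(x) && right(x) == f(x))
```

The type annotations mirror the derivation: the identity on the left of f has type Int => Int, while the identity on the right has type String => String.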
We now apply both sides of the laws to an arbitrary value 𝑥 :𝐴 . For the left identity
law, we find:
id ◦ 𝑓 = (𝑥 → id ( 𝑓 (𝑥))) = (𝑥 → 𝑓 (𝑥)) = 𝑓 .
The types are checked by assuming that 𝑓 has the type 𝑓 :𝐴→𝐵 . The types in 𝑔 ◦ 𝑓
match only when 𝑔 :𝐵→𝐶 , and then 𝑔 ◦ 𝑓 is of type 𝐴 → 𝐶. The type of ℎ must be
ℎ:𝐶→𝐷 for the types in ℎ ◦ (𝑔 ◦ 𝑓 ) to match. We can write the associativity law with
type annotations as:
ℎ :𝐶→𝐷 ◦ (𝑔 :𝐵→𝐶 ◦ 𝑓 :𝐴→𝐵 ) = (ℎ :𝐶→𝐷 ◦ 𝑔 :𝐵→𝐶 ) ◦ 𝑓 :𝐴→𝐵 .
scala> fid("abc")
res0: String = abc
scala> fid(true)
res1: Boolean = true
scala> fid(0)
res2: Int = -1
While Scala allows us to write this kind of code, the result is confusing: the type
signature A => A does not indicate a special behavior with A = Int. In any case, fid
is not a fully parametric function.
Let us see whether the identity laws of function composition hold when using
fid[A] instead of the correct function identity[A]. To see that, we compose fid with
a simple function f_1 defined by:
def f_1: Int => Int = { x => x + 1 }
The composition (f_1 andThen fid) has type Int => Int. Since f_1 has type Int =>
Int, Scala will automatically set the type parameter A = Int in fid[A]:
scala> def f_2 = f_1 andThen fid // 𝑓2 = 𝑓1 # fid
f_2: Int => Int
By the identity law, we should have 𝑓2 = 𝑓1 # id = 𝑓1 . But we can check that f_1 and
f_2 are not equal:
scala> f_1(0)
res3: Int = 1
scala> f_2(0)
res4: Int = 0
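The session above can be reproduced with the following self-contained sketch. The definition of fid used here is an assumption reconstructed from the outputs shown (fid(0) gives -1, other types are returned unchanged): it inspects the run-time type of its argument, which a fully parametric function may not do. The type parameter of fid is written explicitly to keep type inference out of the picture.

```scala
// Assumed definition: behaves as the identity except on Int values, which are decremented.
def fid[A]: A => A = {
  case x: Int => (x - 1).asInstanceOf[A]
  case x      => x
}

val f_1: Int => Int = x => x + 1
val f_2: Int => Int = f_1 andThen fid[Int]  // Should equal f_1 if fid were an identity.

// Check the right identity law on sample values.
val lawHolds: Boolean = (0 to 10).forall(x => f_1(x) == f_2(x))
```

The value lawHolds is false, so the violation of the law is detected purely by comparing results, without reading the code of fid.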
It is important that we are able to detect that fid is not a fully parametric func-
tion by checking whether some equation holds, without looking at the code of
fid. In this book, we will always formulate any desired properties through equa-
tions or “laws”. To verify that a law holds, we will perform symbolic calcula-
tions similar to the proofs in Section 4.2.2. These calculations are symbolic in
the sense that we are manipulating symbols (such as 𝑥, 𝑓 , 𝑔, ℎ) without substi-
tuting any specific values for these symbols but only using some general rules
and properties. This is similar to symbolic calculations in mathematics, such as
(𝑥 − 𝑦)(𝑥² + 𝑥𝑦 + 𝑦²) = 𝑥³ − 𝑦³. In the next section, we will get more experience
with symbolic calculations relevant to functional programming.
4.3 Symbolic calculations with nameless functions
(𝑥 → 𝑥 + 10)(2) = 2 + 10 = 12 .
To run this computation in Scala, we need to add a type annotation to the nameless
function as in (𝑥 :Int → 𝑥 + 10)(2). The code is:
scala> ((x: Int) => x + 10)(2)
res0: Int = 12
Curried function calls such as 𝑓(𝑥)(𝑦) or (𝑥 → expr(𝑥))(𝑦)(𝑧) may look unfamiliar
and confusing. We need to get some experience working with them.
Consider the expression (x => y => x - y)(20)(4), and begin with the curried
argument 20. Applying a nameless function of the form (x => ...) to 20 means
substituting x = 20 into the body of the function. After that substitution, we obtain
the expression y => 20 - y, which is again a nameless function. Applying that
function to the remaining argument (4) means substituting y = 4 into the body of
y => 20 - y. We get the expression 20 - 4, which equals 16. Test in Scala:
scala> ((x: Int) => (y: Int) => x - y)(20)(4)
res1: Int = 16
Applying a curried function such as x => y => z => expr(x,y,z) to three curried
arguments 10, 20, and 30 means substituting x = 10, y = 20, and z = 30 into the
expression expr(x,y,z).
This calculation is made easier by the convention that f(g)(h) means first apply-
ing f to g and then applying the result to h. In other words, function application
groups to the left: f(g)(h) = (f(g))(h). It would be confusing if function applica-
tion grouped to the right and f(g)(h) meant first applying g to h and then applying
f to the result. If that were the syntax convention, it would be harder to reason
about applying a curried function to its arguments.
We see that the right grouping of the function arrow => is well adapted to the left
grouping of function applications. All functional languages follow these syntactic
conventions.
To make calculations shorter, we will write code in a mathematical notation
rather than in the Scala syntax. Type annotations are written with a colon in the
(𝑥 → 𝑥 ∗ 2) (10) = 10 ∗ 2 = 20 .
( 𝑝 → 𝑧 → 𝑧 ∗ 𝑝) (𝑡) = (𝑧 → 𝑧 ∗ 𝑡) .
( 𝑝 → 𝑧 → 𝑧 ∗ 𝑝) (𝑡)(4) = (𝑧 → 𝑧 ∗ 𝑡)(4) = 4 ∗ 𝑡 .
Some results of these computations are integer values such as 20; other results are
nameless functions such as 𝑧 → 𝑧 ∗ 𝑡. Verify this in Scala:
scala> ((x: Int) => x * 2)(10)
res3: Int = 20
( 𝑓 → 𝑝 → 𝑓 ( 𝑝)) (𝑔 → 𝑔(2)) (𝑥 → 𝑥 + 4)
use Eq. (4.7) : = ( 𝑝 → 𝑝(2)) (𝑥 → 𝑥 + 4)
substitute 𝑝 = (𝑥 → 𝑥 + 4) : = (𝑥 → 𝑥 + 4) (2)
substitute 𝑥 = 2 : = 2+4 = 6 .
𝑝, which is Int → Int. So, 𝑓 ’s type must be (Int → Int) → 𝐴 for some type 𝐴.
Since in our example 𝑓 = (𝑔 → 𝑔(2)), types match only if 𝑔 has type Int → Int.
But then 𝑔(2) has type Int, and so we must have 𝐴 = Int. Thus, the type of 𝑓 is
(Int → Int) → Int. We know enough to write the Scala code now:
scala> ((f: (Int => Int) => Int) => p => f(p))(g => g(2))(x => x + 4)
res6: Int = 6
Type annotations for 𝑝, 𝑔, and 𝑥 may be omitted: Scala’s compiler can figure out
the missing types from the given type of 𝑓 . However, extra type annotations often
make code clearer.
const𝐶,𝑋 ≝ 𝑐 :𝐶 → _:𝑋 → 𝑐 ,        id𝐴 ≝ 𝑎 :𝐴 → 𝑎 .
The types will match in the expression const(id) only if the argument of the func-
tion const has the same type as the type of id. Since const is a curried function,
we need to look at its first curried argument, which is of type 𝐶. The type of id is
𝐴 → 𝐴, where 𝐴 is (so far) an arbitrary type. So, the type parameter 𝐶 in const𝐶,𝑋
must be equal to 𝐴 → 𝐴:
𝐶=𝐴→𝐴 .
The type parameter 𝑋 in const𝐶,𝑋 is not constrained, so we keep it as 𝑋. The result
of applying const to id is of type 𝑋 → 𝐶, which equals 𝑋 → 𝐴 → 𝐴. In this way,
we find:
const 𝐴→𝐴,𝑋 (id 𝐴 ) : 𝑋 → 𝐴 → 𝐴 .
The types 𝐴 and 𝑋 remain arbitrary. The type 𝑋 → 𝐴 → 𝐴 is the most general type
for the expression const(id) because we have not made any assumptions about the
types except requiring that all functions must always be applied to arguments of
the correct types.
To compute the value of const(id), it remains to substitute the code of const and
id. Since we already checked the types, we may omit all type annotations:
const (id)
definition of const : = (𝑐 → 𝑥 → 𝑐)(id)
apply function, substitute 𝑐 = id : = 𝑥 → id
definition of id : = 𝑥 → 𝑎 → 𝑎 .
The function (𝑥 → 𝑎 → 𝑎) takes an argument 𝑥 :𝑋 and returns the identity func-
tion 𝑎 :𝐴 → 𝑎. It is clear that the argument 𝑥 is ignored by this function. So, we can
rewrite it equivalently as:
const (id) = _:𝑋 → 𝑎 :𝐴 → 𝑎 .
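The result of this calculation can be mirrored in Scala. In the sketch below, const and id are re-declared locally, and the name constId is introduced only for this example. The type parameters are set explicitly to 𝐶 = 𝐴 → 𝐴, as derived above.

```scala
def const[C, X](c: C): X => C = (_ => c)
def id[A]: A => A = a => a

// const(id) with C = A => A; the result has type X => A => A.
def constId[A, X]: X => A => A = const[A => A, X](id[A])

// The first argument is ignored; the second is returned unchanged.
val r1: Int = constId[Int, String]("ignored")(42)
```

The value r1 equals 42 for any string passed as the first argument, as expected from const(id) = _ → 𝑎 → 𝑎.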
Example 4.3.2.2 Implement a function twice that takes a function f: Int => Int
as its argument and returns a function that applies f twice. For instance, if the
function f is { x => x + 3 }, the result of twice(f) should be equal to the function x
=> x + 6. Test this with the expression twice(x => x + 3)(10). After implementing
the function twice, generalize it to a fully parametric function.
Solution According to the requirements, the function twice must return a new
function of type Int => Int. So, the type signature of twice is:
def twice(f: Int => Int): Int => Int = ???
Since twice(f) must be a new function with an integer argument, we begin the
code of twice by writing a new nameless function { (x: Int) => ... },
def twice(f: Int => Int): Int => Int = { (x: Int) => ??? }
The new function must apply f twice to its argument, that is, it must return
f(f(x)). We can finish the implementation now:
def twice(f: Int => Int): Int => Int = { x => f(f(x)) }
The type annotation (x: Int) can be omitted. Let us verify that twice(x => x+3)(10)
equals 10 + 6:
scala> val g = twice(x => x + 3) // Expect g to be equal to the function { x => x + 6 }.
g: Int => Int = <function1>
twice𝐴 ≝ 𝑓 :𝐴→𝐴 → 𝑥 :𝐴 → 𝑓(𝑓(𝑥)) = 𝑓 :𝐴→𝐴 → 𝑓 # 𝑓 .    (4.8)
The procedure of deriving the most general type for a given code is called type
inference. In Example 4.3.2.2, the presence of the type parameter 𝐴 and the type
signature ( 𝐴 → 𝐴) → 𝐴 → 𝐴 have been “inferred” from the code 𝑓 → 𝑥 →
𝑓 ( 𝑓 (𝑥)).
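A possible Scala rendering of the fully parametric twice from Eq. (4.8) is sketched below, using andThen for the forward composition 𝑓 # 𝑓. The values add6 and quad are illustrative names, and the type parameter is written explicitly at each call for clarity.

```scala
// Fully parametric version: works for any type A.
def twice[A](f: A => A): A => A = f andThen f

val add6: Int => Int = twice[Int](x => x + 3)           // Adds 3 two times.
val quad: String => String = twice[String](s => s + s)  // Doubles a string two times.
```

The same code now works with A = Int and with A = String: add6(10) gives 16, and quad("ab") gives "abababab".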
Example 4.3.2.3 Consider the fully parametric function twice defined in Exam-
ple 4.3.2.2. What is the most general type of twice(twice), and what computation
does it perform? Test your answer on the expression twice(twice)(x => x + 3)(10).
What are the type parameters in that expression?
Solution Note that twice(twice) means that the function twice is used as its own
argument, i.e., this is twice(f) with f = twice. We begin by assuming unknown
type parameters as twice[A](twice[B]). The function twice[A] of type ( 𝐴 → 𝐴) →
𝐴 → 𝐴 can be applied to the argument twice[B] only if twice[B] has type 𝐴 → 𝐴.
But twice[B] is of type (𝐵 → 𝐵) → 𝐵 → 𝐵. The symbol → groups to the right, so
we have:
(𝐵 → 𝐵) → 𝐵 → 𝐵 = (𝐵 → 𝐵) → (𝐵 → 𝐵) .
This can match with 𝐴 → 𝐴 only if we set 𝐴 = (𝐵 → 𝐵). So, the most general type
of twice(twice) is:
twice𝐵→𝐵 (twice𝐵 ) : (𝐵 → 𝐵) → 𝐵 → 𝐵 .
After checking that types match, we may omit types from further calculations.
Example 4.3.2.2 defined twice with the def syntax. To use twice as an argument in
the expression twice(twice), it is convenient to define twice as a value, val twice
= ... However, the function twice needs type parameters, and Scala 2 does not
directly support val definitions with type parameters. Scala 3 supports type pa-
rameters appearing together with arguments in a nameless function:
val twice = [A] => (f: A => A) => (x: A) => f(f(x)) // Valid only in Scala 3.
Keeping this in mind, we use the definition of twice from Eq. (4.8): twice ( 𝑓 ) =
𝑓 # 𝑓 , which omits the curried argument 𝑥 :𝐴 and makes the calculation shorter.
Substituting that into twice(twice), we find:
twice(twice) = twice # twice = 𝑓 → twice(twice(𝑓)) = 𝑓 → twice(𝑓 # 𝑓) = 𝑓 → 𝑓 # 𝑓 # 𝑓 # 𝑓 .
This confirms that twice(twice)(x => x + 3) equals the function x => x + 12.
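This conclusion can be tested with the def-based twice by writing the type parameters explicitly, following the assignment 𝐴 = 𝐵 → 𝐵 with 𝐵 = Int. The names fourTimes and add12 are chosen for this sketch only.

```scala
def twice[A](f: A => A): A => A = x => f(f(x))

// Outer twice uses A = Int => Int; inner twice uses B = Int.
val fourTimes: (Int => Int) => (Int => Int) = twice[Int => Int](twice[Int])

// Applying the function x => x + 3 four times adds 12.
val add12: Int => Int = fourTimes(x => x + 3)
```

Evaluating add12(10) gives 22, which agrees with 10 + 12.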
Example 4.3.2.4 (a) Infer a general type signature with type parameter(s) for the
given function p:
def p[...]:... = { f => f(2) }
(b) Could we choose the type parameters in the expression p(p) such that the types
match?
Solution (a) In the nameless function 𝑓 → 𝑓 (2), the argument 𝑓 must be itself
a function with an argument of type Int, otherwise the sub-expression 𝑓 (2) is ill-
typed. So, types will match if 𝑓 has type Int → Int or Int → String or similar. The
most general case is when 𝑓 has type Int → 𝐴, where 𝐴 is an arbitrary type (i.e.,
a type parameter); then the value 𝑓 (2) has type 𝐴. Since the nameless function
𝑓 → 𝑓 (2) has an argument 𝑓 of type Int → 𝐴 and a result 𝑓 (2) of type 𝐴, we find
that the type of 𝑝 must be (Int → 𝐴) → 𝐴. With this type assignment, all types
match. The type parameter 𝐴 remains undetermined and is added to the type
signature of the function p. The code is:
def p[A]: (Int => A) => A = { f => f(2) }
(b) The expression p(p) applies p to itself, just as twice(twice) did in Exam-
ple 4.3.2.3. Begin by writing p(p) with unknown type parameters: p[A](p[B]). Then
try to choose A and B so that the types match in that expression. Does the type of
p[B], which is (Int => B) => B, match the type of the argument of p[A], which is
Int => A, with some choice of A and B? A function type P => Q matches X => Y only
if P = X and Q = Y. So, (Int => B) => B can match Int => A only if Int => B matches
Int and if B = A. But it is impossible for Int => B to match Int, no matter how we
choose B.
We conclude that the expression p[A](p[B]) has a problem: for any choice of A
and B, some type will be mismatched. One says that the expression p(p) is not
well-typed. Such expressions contain a type error and are rejected by the Scala
compiler.
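For comparison, here is a short sketch of well-typed uses of p with the type parameter set explicitly. The sample functions and the names r1, r2 are illustrative. The ill-typed expression p(p) is mentioned only in a comment, since it would not compile.

```scala
def p[A]: (Int => A) => A = { f => f(2) }

val r1: Int = p[Int](x => x + 40)             // Applies x => x + 40 to 2.
val r2: String = p[String](x => "n = " + x)   // Applies x => "n = " + x to 2.

// An expression such as p[Int](p[String]) is rejected by the compiler:
// the argument has type (Int => String) => String, but Int => Int is required.
```

The two well-typed calls evaluate to 42 and "n = 2".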
In the examples seen so far, we inferred the most general type of a code ex-
pression simply by trying to make all function types match the types of their
arguments. The Damas-Hindley-Milner algorithm² performs type inference (or
determines that there is a type error) for any code containing functions, tuples,
and disjunctive types.
4.4 Summary
Table 4.1 shows the notations introduced in this chapter.
What can we do using this chapter’s techniques?
• Make functions that return new functions and/or take functions as argu-
ments.
• Simplify expressions symbolically when functions are applied to arguments.
• Derive a general type for a given code expression (perform type inference).
• Convert functions to a fully parametric form when possible.
The following examples and exercises illustrate these techniques further.
2 https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system#Algorithm_W
4.4.1 Examples
Example 4.4.1.1 Implement a function that applies a given function 𝑓 repeatedly
to an initial value 𝑥0 , until a given function cond returns true:
def converge[X](f: X => X, x0: X, cond: X => Boolean): X = ???
Solution We call find on a lazy stream that keeps applying f; the search stops at the
first element for which the condition is true:
def converge[X](f: X => X, x0: X, cond: X => Boolean): X =
Stream.iterate(x0)(f) // Type is Stream[X].
.find(cond) // Type is Option[X].
.get // Type is X.
The method get is a partial function that can be applied only to non-empty Option
values. It is safe to call get here, because the stream is unbounded and, if the
condition cond never becomes true, the program will run out of memory (since
Stream.iterate keeps all computed values in memory) or the user will run out of
patience. So, _.find(cond) can never return an empty Option value. Of course, it is
not satisfactory that the program crashes when the sequence does not converge.
Exercise 4.4.2.2 will implement a safer version of this function by limiting the
allowed number of iterations.
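A variant of the same idea can be sketched with Iterator.iterate, which does not retain previously computed elements. This is an alternative written for illustration (the name convergeIter is not from the text); it still does not terminate if cond never becomes true. Note also that in Scala 2.13 and later, Stream is deprecated in favor of LazyList.

```scala
// Same logic as above, but based on an Iterator instead of a Stream.
def convergeIter[X](f: X => X, x0: X, cond: X => Boolean): X =
  Iterator.iterate(x0)(f) // Type is Iterator[X]: x0, f(x0), f(f(x0)), ...
    .find(cond)           // Type is Option[X].
    .get                  // Type is X.

// Example: keep doubling, starting from 1, until the value exceeds 1000.
val firstAbove1000: Int = convergeIter[Int](_ * 2, 1, _ > 1000)
```

Here the first power of 2 exceeding 1000 is returned, which is 1024.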
A tail-recursive implementation that works in constant memory is:
import scala.annotation.tailrec
@tailrec def converge[X](f: X => X, x0: X, cond: X => Boolean): X =
  if (cond(x0)) x0 else converge(f, f(x0), cond)
To test this code, compute an approximation to √𝑞 by Newton’s method with the
iteration function 𝑓(𝑥) = ½ (𝑥 + 𝑞/𝑥). We iterate 𝑓(𝑥) starting with 𝑥₀ = 𝑞/2 until a
given precision is obtained:
def approx_sqrt(q: Double, precision: Double): Double = {
def cond(x: Double): Boolean = math.abs(x * x - q) <= precision
def iterate_sqrt(x: Double): Double = 0.5 * (x + q / x)
converge(iterate_sqrt, q / 2, cond)
}
Newton’s method for √𝑞 is guaranteed to converge when 𝑞 ≥ 0. Test it:
scala> approx_sqrt(25, 1.0e-8)
res0: Double = 5.000000000016778
Example 4.4.1.2 Using both def and val, define a Scala function that takes an
integer x and returns a function that adds x to its argument.
Solution Let us first write down the required type signature. The function
must take an integer argument x: Int, and the return value must be a function of
type Int => Int:
def add_x(x: Int): Int => Int = ???
We are required to return a function that adds x to its argument. Let us call that
argument z, to avoid confusion with the x. So, we are required to return the function
{ z => z + x }. Since functions are values, we return a new function by writing a
nameless function expression:
def add_x(x: Int): Int => Int = { z => z + x }
To implement the same function by using a val, we first convert the type signature
of add_x to the equivalent curried type Int → Int → Int. Now we can write the
Scala code of a function add_x_v:
val add_x_v: Int => Int => Int = { x => z => z + x }
The function add_x_v is equal to add_x except for using the val syntax instead of def.
We do not need to write the type of the arguments x and z since we already wrote
the type Int → Int → Int of add_x_v.
Example 4.4.1.3 Using def and val, implement a curried function prime_f that
takes a function 𝑓 and an integer 𝑥, and returns true when 𝑓 (𝑥) is prime. Use the
function isPrime from Section 1.1.2.
Solution First, determine the required type signature of prime_f. The value
𝑓 (𝑥) must have type Int, or else we cannot check whether it is prime. So, 𝑓 must
have type Int → Int. Since prime_f should be a curried function, we need to put
each argument into its own set of parentheses:
def prime_f(f: Int => Int)(x: Int): Boolean = ???
To implement the same function using val, rewrite its type signature as:
val prime_f: (Int => Int) => Int => Boolean = ???
(The parentheses around Int => Int are mandatory as Int => Int => Int => Boolean
would be a completely different type.) The implementation is:
val prime_f: (Int => Int) => Int => Boolean = { f => x => isPrime(f(x)) }
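To try this code in isolation, one needs some implementation of isPrime. The version below is an assumption made for this sketch (a simple trial division) and is not necessarily the code of Section 1.1.2; the name check is also illustrative.

```scala
// Assumed helper: trial division up to the square root of n.
def isPrime(n: Int): Boolean =
  n >= 2 && (2 to math.sqrt(n.toDouble).toInt).forall(k => n % k != 0)

val prime_f: (Int => Int) => Int => Boolean = { f => x => isPrime(f(x)) }

// Partial application: fix the function f and obtain a predicate on integers.
val check: Int => Boolean = prime_f(x => x * x + 1)
```

Since prime_f is curried, supplying only the first argument yields a function Int => Boolean. For instance, check(4) is true because 17 is prime, while check(3) is false because 10 is not.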
Example 4.4.1.5 Infer the most general type for the fully parametric function:
def q[...]: ... = { f => g => g(f) }
What types are inferred for the expressions q(q) and q(q(q))?
Solution To begin, assume 𝑓 :𝐴 with a type parameter 𝐴. In the sub-expression
𝑔 → 𝑔( 𝑓 ), the curried argument 𝑔 must itself be a function, because it is being
applied to 𝑓 as 𝑔( 𝑓 ). So, we assign types as 𝑓 :𝐴 → 𝑔 :𝐴→𝐵 → 𝑔( 𝑓 ), where 𝐴 and
𝐵 are type parameters. Then the final returned value 𝑔( 𝑓 ) has type 𝐵. Since there
are no other constraints on the types, the types 𝐴 and 𝐵 remain arbitrary, so we
add them to the type signature:
def q[A, B]: A => (A => B) => B = { f => g => g(f) }
To match types in the expression q(q), we first assume arbitrary type parame-
ters and write q[A, B](q[C, D]). We need to introduce new type parameters 𝐶, 𝐷
because those type parameters may need to be set differently from 𝐴, 𝐵 when we
try to match the types in the expression q(q).
The type of the first curried argument of q[A, B], which is 𝐴, must match the
entire type of q[C, D], which is 𝐶 → (𝐶 → 𝐷) → 𝐷. So, we must choose 𝐴 as:
𝐴 = 𝐶 → (𝐶 → 𝐷) → 𝐷 .
The type of q(q) becomes:
𝑞 𝐴,𝐵 (𝑞𝐶,𝐷 ) : ((𝐶 → (𝐶 → 𝐷) → 𝐷) → 𝐵) → 𝐵 ,
where 𝐴 = 𝐶 → (𝐶 → 𝐷) → 𝐷 .
scala> def qqq[A, B, C, D]: ((((A => (A => B) => B) => C) => C) => D) => D =
q(q(q))
qqq: [A, B, C, D]=> ((((A => ((A => B) => B)) => C) => C) => D) => D
We did not need to write any type parameters within the expressions q(q) and
q(q(q)) because the full type signature was declared for each of these expressions.
Since the Scala compiler did not print any error messages, we are assured that the
types match correctly.
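The type found for q(q) can be checked in the same manner. The sketch below spells out all type parameters explicitly instead of relying on inference; the names qq and result are illustrative, and the type parameters are named C, D, B to follow the derivation above.

```scala
def q[A, B]: A => (A => B) => B = { f => g => g(f) }

// q(q) with A = C => (C => D) => D, as derived in the text.
def qq[C, D, B]: ((C => (C => D) => D) => B) => B =
  q[C => (C => D) => D, B](q[C, D])

// qq passes the function q to its argument k. Here k applies q to 5 and to a formatter.
val result: String = qq[Int, String, String](k => k(5)(n => "n = " + n))
```

Since qq(k) equals k(q), the value result is q(5)(n => "n = " + n), that is, "n = 5".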
Example 4.4.1.6 For the following expressions, infer the most general types or
show that the expression is not well-typed with simple types:
(a) 𝑓 → 𝑓 ( 𝑓 ) .
(b) 𝑓 → 𝑓 (ℎ → ℎ( 𝑓 )) .
(c) 𝑓 → 𝑔 → 𝑓 (ℎ → ℎ(𝑔)) .
By “simple types” we mean that 𝑓 , 𝑔, ℎ cannot have their own type parameters.
Solution (a) The type of 𝑓 is unknown, so we begin by assigning an arbitrary
type 𝐴 to it. Types now need to match in the expression 𝑓 ( 𝑓 ) with 𝑓 :𝐴 . So, the
type 𝐴 must be a function type whose argument is again of type 𝐴. We can write
that function type as 𝐴 → 𝐵, where 𝐵 is another arbitrary type. Now, types match
only if 𝐴 and 𝐴 → 𝐵 are the same type. But there are no simple types 𝐴 and 𝐵 such
that 𝐴 = 𝐴 → 𝐵. So, the expression 𝑓 → 𝑓 ( 𝑓 ) is not well-typed.
This conclusion holds only because we do not allow the function 𝑓 to have
its own type parameters. Otherwise, the expression 𝑓 ( 𝑓 ) could be well-typed.
See, for instance, Example 4.3.2.3 showing that the expression twice(twice) is well-
typed.
(b) Begin by assigning type parameters as 𝑓 :𝐴 and ℎ:𝐵 , where 𝐴 and 𝐵 are un-
known. To match types in ℎ( 𝑓 ), the type of ℎ must be a function type with an
argument of type 𝐴. So, we must have 𝐵 = 𝐴 → 𝐶, where 𝐶 is unknown. Then
ℎ( 𝑓 ) has type 𝐶, and ℎ → ℎ( 𝑓 ) has type ( 𝐴 → 𝐶) → 𝐶. This is the type of an
argument of 𝑓 , so 𝐴 = (( 𝐴 → 𝐶) → 𝐶) → 𝐷, where 𝐷 is unknown. But we cannot
have a simple type 𝐴 that satisfies the type equation 𝐴 = (( 𝐴 → 𝐶) → 𝐶) → 𝐷.
We conclude that the expression 𝑓 → 𝑓 (ℎ → ℎ( 𝑓 )) is not well-typed.
(c) Begin by assigning type parameters as 𝑓 :𝐴 , 𝑔 :𝐵 , ℎ:𝐶 . To match types in ℎ(𝑔),
we must have 𝐶 = 𝐵 → 𝐷. Then ℎ → ℎ(𝑔) has type 𝐶 → 𝐷, and that must be the
type of 𝑓 ’s argument. So, we must have:
𝐴 = (𝐶 → 𝐷) → 𝐸 = ((𝐵 → 𝐷) → 𝐷) → 𝐸 .
There are no other restrictions. We have found the most general type:
𝑓 :((𝐵→𝐷)→𝐷)→𝐸 → 𝑔 :𝐵 → 𝑓(ℎ :𝐵→𝐷 → ℎ(𝑔)) , which has type (((𝐵 → 𝐷) → 𝐷) → 𝐸) → 𝐵 → 𝐸 .
( 𝑓 → 𝑔 → 𝑔( 𝑓 )) ( 𝑓 → 𝑔 → 𝑔( 𝑓 )) ( 𝑓 → 𝑓 (10)) ,
It follows that 𝑓 must have the same type as 𝑥 → 𝑦 → 𝑦(𝑥), while 𝑔 must have the
same type as ℎ → ℎ(10). The type of 𝑔, which we know as 𝐴 → 𝐵, will match the
type of ℎ → ℎ(10), which we know as (Int → 𝐸) → 𝐸, only if 𝐴 = (Int → 𝐸) and
𝐵 = 𝐸. It follows that 𝑓 has type Int → 𝐸. At the same time, the type of 𝑓 must
match the type of 𝑥 → 𝑦 → 𝑦(𝑥), which is 𝐶 → (𝐶 → 𝐷) → 𝐷. This can work only
if 𝐶 = Int and 𝐸 = (𝐶 → 𝐷) → 𝐷 = (Int → 𝐷) → 𝐷.
In this way, we have found all the relationships between the type parameters 𝐴,
𝐵, 𝐶, 𝐷, 𝐸 in Eq. (4.11). The type 𝐷 remains arbitrary, while the type parameters
𝐴, 𝐵, 𝐶, 𝐸 are expressed as:
𝐴 = Int → (Int → 𝐷) → 𝐷 ,    𝐵 = 𝐸 = (Int → 𝐷) → 𝐷 ,    𝐶 = Int .
The entire expression in Eq. (4.11) is a full application of a curried function, and
thus has the same type as the “final” result expression 𝑔( 𝑓 ), which has type 𝐵. So,
the entire expression in Eq. (4.11) has type 𝐵 = (Int → 𝐷) → 𝐷.
Having established that types match, we can now omit the type annotations
and rewrite the code:
The type of this expression is (Int → 𝐷) → 𝐷 with a type parameter 𝐷. Since the
argument 𝑦 is an arbitrary function, we cannot simplify either 𝑦(10) or 𝑦 → 𝑦(10)
any further. So, the final simplified form of Eq. (4.10) is 𝑦 :Int→𝐷 → 𝑦(10).
To test this, we first define the function 𝑓 → 𝑔 → 𝑔( 𝑓 ) as in Example 4.4.1.5:
def q[A, B]: A => (A => B) => B = { f => g => g(f) }
To help Scala evaluate Eq. (4.11), we need to set the type parameters for the first q
function as q[A, B] where 𝐴 and 𝐵 are given by Eqs. (4.12)–(4.13):
scala> def s[D] = q[Int => (Int => D) => D, (Int => D) => D](q)(r)
s: [D]=> (Int => D) => D
To verify that the function 𝑠 𝐷 indeed equals 𝑦 :Int→𝐷 → 𝑦(10), we apply 𝑠 𝐷 to some
functions of type Int → 𝐷, say, with 𝐷 = Boolean or 𝐷 = Int:
scala> s(_ > 0) // Set D = Boolean and evaluate (10 > 0).
res6: Boolean = true
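For reference, the whole test can be written as one self-contained sketch. The definition of r below is an assumption: it encodes the function ℎ → ℎ(10) that appears in the expression being tested, with a type parameter E. All type parameters are given explicitly rather than left to inference.

```scala
def q[A, B]: A => (A => B) => B = { f => g => g(f) }

// Assumed definition of r: applies its argument to the number 10.
def r[E]: (Int => E) => E = { h => h(10) }

// q(q)(r) with A = Int => (Int => D) => D and B = (Int => D) => D.
def s[D]: (Int => D) => D =
  q[Int => (Int => D) => D, (Int => D) => D](q[Int, D])(r[(Int => D) => D])
```

With these definitions, s[Boolean](_ > 0) evaluates (10 > 0) and gives true, while s[Int](_ + 20) gives 30. Both results agree with the simplified form 𝑦 → 𝑦(10).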
4.4.2 Exercises
Exercise 4.4.2.1 Revise the function from Exercise 1.6.2.4, making it a curried
function and replacing the hard-coded number 100 by a curried first argument.
The type signature should become Int => List[List[Int]] => List[List[Int]].
Exercise 4.4.2.2 Implement the function converge from Example 4.4.1.1 as a curried
function with an additional argument to set the maximum number of iterations,
returning Option[X] as the final result type. The new version of converge
should return None if the convergence condition is not satisfied after the given
maximum number of iterations. The type signature and an example test:
@tailrec def convergeN[X](cond: X => Boolean)(x0: X)(maxIter: Int)(f: X => X):
Option[X] = ???
Exercise 4.4.2.4 For id and const as defined above, what are the types of id(id),
id(id)(id), id(id(id)), id(const), and const(const)? Simplify these code expressions
by symbolic calculations.
Exercise 4.4.2.5 For the function twice from Example 4.3.2.2, show that the function
twice(twice(f)) is the same as twice(twice)(f) for any f: Int => Int.
Exercise 4.4.2.6 For the function twice from Example 4.3.2.2, infer the most general
type for the function twice(twice(twice)). What does that function do? Test
your answer on an example.
Exercise 4.4.2.7 Define a function thrice similarly to twice except it should apply
a given function 3 times. What does the function thrice(thrice(thrice)) do?
Exercise 4.4.2.8 Define a function ence similarly to twice except it should apply a
given function 𝑛 times, where 𝑛 is an additional curried argument.
Exercise 4.4.2.9 Define a fully parametric function flip(f) that swaps arguments
for any given uncurried function f having two arguments. To test:
def f(x: Int, y: Int) = x - y // Expect f(10, 2) == 8.
val g = flip(f) // Now expect g(2, 10) == 8.
Exercise 4.4.2.10 Write a function curry2 converting a function of type (A, A) => A
into an equivalent curried function of type A => A => A.
The function f1 has type signature Int => Int and order 1, so it is not a higher-order
function.
def f2(x: Int): Int => Int = (z => z + x)
The function f2 has type signature Int => Int => Int and is a higher-order function
of order 2.
def f3(g: Int => Int): Int = g(123)
The function f3 has type signature (Int => Int) => Int and is a higher-order func-
tion of order 2.
Note that f2 is a higher-order function only because its return value is of a func-
tion type. An equivalent computation can be performed by an uncurried function
that is not higher-order:
scala> def f2u(x: Int, z: Int): Int = z + x // Type signature (Int, Int) => Int
4.5 Discussion and further developments
Here, two bound variables named 𝑥 are defined in two scopes: one in the scope
of 𝑓 , another in the scope of the nameless function 𝑥 → 1/(1 + 𝑥). The convention in
mathematics is to treat these two 𝑥’s as two completely different variables that just
happen to have the same name. In sub-expressions where both of these bound
variables are visible, priority is given to the bound variable defined in the smaller
inner scope. The outer definition of 𝑥 is then shadowed (hidden) by the inner
definition of 𝑥. For this reason, evaluating 𝑓 (10) will give:
𝑓 (10) = ∫₀¹⁰ 𝑑𝑥 / (1 + 𝑥) = logₑ (11) ≈ 2.398 ,
rather than ∫₀¹⁰ 𝑑𝑥 / (1 + 10) = 10/11. The outer definition 𝑥 = 10 is shadowed within the
expression 1/(1 + 𝑥) by the definition of 𝑥 in the smaller local scope of 𝑥 → 1/(1 + 𝑥).
Since this is the standard mathematical convention, the same convention is
adopted in functional programming. A variable defined in a function scope (i.e.,
a bound variable) will shadow any outside definitions of a variable with the same
name.
Name shadowing is not advisable in practical programming, because it usually
decreases the clarity of code and so invites errors. Consider the nameless function:
𝑥→𝑥→𝑥 ,
and let us decipher this confusing syntax. The symbol → groups to the right, so
𝑥 → 𝑥 → 𝑥 is the same as 𝑥 → (𝑥 → 𝑥). It is a function that takes 𝑥 and returns
𝑥 → 𝑥. Since the argument 𝑥 in (𝑥 → 𝑥) may be renamed to y without changing
the function, we can rewrite the code to:
𝑥 → (𝑦 → 𝑦) .
Having removed name shadowing, we can more easily understand this code and
reason about it. For instance, it becomes clear that this function ignores its argu-
ment 𝑥 and always returns the same value (the identity function 𝑦 → 𝑦). So, we
can rewrite (𝑥 → 𝑥 → 𝑥) as (_ → 𝑦 → 𝑦), which is clearer.
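The same shadowing behavior can be observed in Scala. In this short sketch, the type Int is chosen arbitrarily so that the nameless functions can be given a type; the names shadowed and renamed are illustrative.

```scala
// The inner x shadows the outer x, so the body refers to the second argument.
val shadowed: Int => Int => Int = x => x => x

// The same function with the shadowing removed.
val renamed: Int => Int => Int = _ => y => y
```

Both functions ignore their first argument: shadowed(1)(2) and renamed(1)(2) each return 2.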
3 The operator syntax has a long history in programming. It is used in Unix shell commands, for
example cp file1 file2, and also in the language Tcl. In LISP-like languages, function applica-
tions are enclosed in parentheses but the arguments are space-separated, for example (f 10 20).
The code that calls summation2 is easier to read because the curried argument is
syntactically separated from the rest of the code by curly braces. This is especially
useful when the curried argument is itself a function with a complicated body,
since Scala’s curly braces syntax allows function bodies to contain local definitions
(val or def) of new bound variables.
Another feature of Scala is the “dotless” method syntax: for example, xs map f is
equivalent to xs.map(f) and f andThen g is equivalent to f.andThen(g). The “dotless”
syntax is available only for infix methods, such as map, defined on specific types
such as Seq. In Scala 3, the “dotless” syntax is generally enabled by the infix
def annotation. Do not confuse Scala’s “dotless” method syntax with the operator
syntax used in Haskell and other languages.
How can we implement pa? Since pa(x)(f) must return a function of type B => C,
we have no choice other than to begin writing a nameless function in the code:
def pa[A, B, C](x: A)(f: (A, B) => C): B => C = { y: B =>
??? // Need to compute a value of type C in this scope.
}
In the inner scope, we need to compute a value of type C, and we have values x: A,
y: B, and f: (A, B) => C. How can we compute a value of type C? If we knew that
C = Int when pa(x)(f) is applied, we could have simply selected a fixed integer
value, say, 1, as the value of type C. If we knew that C = String, we could have
selected a fixed string, say, "hello", as the value of type C. But a fully parametric
function cannot use any knowledge of the types of its actual arguments.
4 See https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system
5 See http://dysphoria.net/2009/06/28/hindley-milner-type-inference-in-scala/
So, a fully parametric function cannot produce a value of an arbitrary type C
from scratch. The only way of producing a value of type C is by applying the
function f to arguments of types A and B. Since the types A and B are arbitrary, we
cannot obtain any values of these types other than x: A and y: B. So, the only way
of getting a value of type C is to compute f(x, y). Thus, the body of pa must be:
def pa[A, B, C](x: A)(f: (A, B) => C): B => C = { y => f(x, y) }
In this way, we have unambiguously derived the body of this function from its type
signature, by assuming that the function must be fully parametric.
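A brief usage sketch may help to see what pa does: it fixes the first argument of a two-argument function. The sample function (string repetition) and the name repeat are arbitrary choices for this example.

```scala
def pa[A, B, C](x: A)(f: (A, B) => C): B => C = { y => f(x, y) }

// Fix the first argument "ab" of a function that repeats a string n times.
val repeat: Int => String = pa("ab")((s: String, n: Int) => s * n)
```

Here the type parameters are inferred as A = String, B = Int, C = String, and repeat(3) returns "ababab".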
Another example is the operation of forward composition 𝑓 ⨟ 𝑔 viewed as a fully
parametric function with type signature:
def before[A, B, C](f: A => B, g: B => C): A => C = ???
To implement before, we need to create a nameless function of type A => C:
def before[A, B, C](f: A => B, g: B => C): A => C = { x: A =>
??? // Need to compute a value of type C in this scope.
}
In the inner scope, we need to compute a value of type 𝐶 from the values 𝑥 :𝐴 ,
𝑓 :𝐴→𝐵 , and 𝑔 :𝐵→𝐶 . Since the type 𝐶 is arbitrary, the only way of obtaining a value
of type 𝐶 is by applying 𝑔 to an argument of type 𝐵. In turn, the only way of
obtaining a value of type 𝐵 is to apply 𝑓 to an argument of type 𝐴. Finally, we
have only one value of type 𝐴, namely 𝑥 :𝐴 . So, the only way of obtaining the
required result is to compute 𝑔( 𝑓 (𝑥)).
We have derived the body of the function from its type signature:
def before[A, B, C](f: A => B, g: B => C): A => C = { x => g(f(x)) }
Chapter 5 will show how code can be derived from type signatures for a wide
range of fully parametric functions.
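A short usage sketch of before follows. The names BeforeDemo, length, isEven, and hasEvenLength are hypothetical and chosen only for illustration.

```scala
object BeforeDemo {
  // The implementation derived above from the type signature.
  def before[A, B, C](f: A => B, g: B => C): A => C = { x => g(f(x)) }

  val length: String => Int = _.length
  val isEven: Int => Boolean = _ % 2 == 0

  // Here A = String, B = Int, C = Boolean.
  val hasEvenLength: String => Boolean = before(length, isEven)
}
```

For these arguments, before(length, isEven) should agree with the standard library expression length andThen isEven.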
5 The logic of types. III. The
Curry-Howard correspondence
Fully parametric functions (introduced in Section 4.2) perform operations so gen-
eral that their code works in the same way for all types. An example of a fully
parametric function is:
def before[A, B, C](f: A => B, g: B => C): A => C = { x => g(f(x)) }
We have seen in Section 4.5.4 that for certain functions of this kind one can de-
rive the code unambiguously from the type signature. There exists a mathematical
theory (called the Curry-Howard correspondence) that gives precise conditions
for the possibility of deriving a function’s code from its type. There is also a sys-
tematic derivation algorithm that either produces the function’s code or proves
that the given type signature cannot be implemented. This chapter describes the
main results and applications of that theory to functional programming.
If this program compiles without type errors, it means that the types match and,
in particular, that the function f is able to compute a value x of type Either[A, B].
It is sometimes impossible to compute a value of a certain type in fully parametric
code. For example, the fully parametric function fmap shown in Example 3.2.3.1
cannot compute a value of type A:
def fmap[A, B](f: A => B): Option[A] => Option[B] = {
val x: A = ??? // Cannot compute x here!
...
}
The reason is that no fully parametric code can compute values of type A “from
scratch”, that is, without using any previously given value of type A and without
Since the case None has no values of type A, we are unable to compute a value x in
that scope (as long as fmap remains a fully parametric function).
“Being able” to compute x: A means that, if needed, the code should be able to
return x as a result value. This requires computing x in all cases, not just within
one part (case ...) of a pattern-matching expression. For that, one would need to
implement the following type signature via fully parametric code:
def bad[A, B](f: A => B)(pa: Option[A]): A = ??? // Cannot implement.
So, the question “can we compute a value of type A within a fully parametric
function with arguments of type B and C” is equivalent to the question “can we
implement a fully parametric function of type (B, C) => A”. From now on, we will
focus on the latter kind of question.
Here are some other examples where no fully parametric code can implement a
given type signature:
def bad2[A, B](f: A => B): A = ???
def bad3[A, B, C](p: A => Either[B, C]): Either[A => B, A => C] = ???
The problem with bad2 is that no data of type A is given, while the given function
f returns values of type B, not A.
The problem with bad3 is that it needs to hard-code the decision of whether to
return the Left or the Right part of Either. That decision cannot depend on the
function p because one cannot pattern-match on a function, and because bad3 does
not receive any data of type A and so cannot call p. Suppose bad3 is hard-coded
to always return a Left(f) with some f: A => B. It is then necessary to compute f
from p, but that is impossible: the given function p may return either Left(b) or
Right(c) for different values of its argument (of type A). This data is insufficient to
create a function of type A => B. Similarly, bad3 is not able to return Right(f) with
some f: A => C.
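For contrast, the opposite direction of bad3 can be implemented by fully parametric code, because each case of the input already contains a function that accepts a value of type A. The sketch below is only an illustration; the names Bad3Converse and good3 are hypothetical.

```scala
object Bad3Converse {
  // The decision between Left and Right is taken from the given value e,
  // so nothing needs to be hard-coded.
  def good3[A, B, C](e: Either[A => B, A => C]): A => Either[B, C] = { a =>
    e match {
      case Left(f)  => Left(f(a))  // Here f: A => B.
      case Right(g) => Right(g(a)) // Here g: A => C.
    }
  }
}
```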
Could we try to switch between functions of type A => B and A => C depending
on a given value of type A? This idea means that we are working with a different
type signature, which has an additional argument of type A. That type signature
5.1 Values computed by fully parametric functions
So, when working with fully parametric code and looking at some type signature
of a function, we may ask: is that type signature implementable, and if so,
can we derive the code by just “following the types”?
It is remarkable that this question makes sense at all. When working with non-
FP languages, the notion of fully parametric functions is usually not relevant, and
implementations cannot be derived from types. But in functional programming,
fully parametric functions are used often. It is then important for the programmer
to know whether a given fully parametric type signature can be implemented,
and if so, to be able to derive the code.
Can we prove rigorously that the functions bad, bad2, bad3 cannot be imple-
mented by any fully parametric code? Or, perhaps, we are mistaken and a clever
trick could produce some code for those type signatures?
So far, we only saw informal arguments about whether values of certain types
can be computed. To make those arguments rigorous, we need to translate statements
such as “a fully parametric function before can compute a value of type
A => C” into mathematical formulas with rules for proving them true or false.
The first step towards a rigorous mathematical formulation is to choose a precise
notation. In Section 3.5.3, we denoted by CH ( 𝐴) the proposition “we Can
Have a value of type 𝐴 within a fully parametric function”. When writing the
code of that function, we may use the function’s arguments, which might have
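types, say, 𝑋, 𝑌 , ..., 𝑍. So, we are interested in proving statements like this:
a fully parametric function can compute a value of type 𝐴 given arguments of types 𝑋, 𝑌 , ..., 𝑍 . (5.1)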
Here 𝑋, 𝑌 , ..., 𝑍, 𝐴 may be either type parameters or more complicated type ex-
pressions, such as 𝐵 → 𝐶 or (𝐶 → 𝐷) → 𝐸, built from some type parameters.
If arguments of types 𝑋, 𝑌 , ..., 𝑍 are given, it means we “already have” values
of those types, i.e., the propositions CH (𝑋), CH (𝑌 ), ..., CH (𝑍) will be true.
So, proposition (5.1) is equivalent to “CH ( 𝐴) is true assuming CH (𝑋), CH (𝑌 ),
..., CH (𝑍) are true”. In mathematical logic, a statement of this form is called a
sequent and is denoted using the symbol ⊢ (called the “turnstile”):
$$\mathcal{CH}(X), \mathcal{CH}(Y), ..., \mathcal{CH}(Z) \vdash \mathcal{CH}(A) . \tag{5.2}$$
The assumptions CH (𝑋), CH (𝑌 ), ..., CH (𝑍) are called premises and the proposition
CH ( 𝐴) is called the goal of the sequent.
Sequents provide a notation for questions about implementability of fully para-
metric functions. Since our goal is to answer such questions rigorously, we will
need to be able to prove sequents of the form (5.2). The following sequents corre-
spond to the type signatures we just saw:
So far, we only saw informal arguments towards proving the first two sequents
and disproving the last two. We will now develop tools for proving sequents
rigorously.
In formal logic, sequents are proved by starting with certain axioms and fol-
lowing certain derivation rules. Different choices of axioms and derivation rules
will give different logics. We need to discover the correct logic for reasoning about
sequents with CH -propositions. To find that logic’s complete set of axioms and
derivation rules, we will systematically examine all the types and code construc-
tions that are possible in a fully parametric function. The resulting logic is known
under the name “constructive propositional logic”. That logic’s axioms and deriva-
tion rules directly correspond to programming language constructions allowed
by fully parametric code. For that reason, constructive propositional logic gives
correct answers about implementable and non-implementable type signatures of
fully parametric functions.
We will then be able to borrow the results and methods available in the mathe-
matical literature. The main result is an algorithm (called LJT) for finding a proof
for a given sequent in the constructive propositional logic. If a proof is found,
the algorithm also provides the code of a function that has the type signature cor-
responding to the sequent. If a proof is not found, it means that the given type
signature cannot be implemented by fully parametric code.
Tuples and case classes with more than two parts are denoted by 𝐴 × 𝐵 × 𝐶 or
𝐴 × 𝐵 × 𝐶 × 𝐷, etc. For example, the Scala definition:
case class Person(firstName: String, lastName: String, age: Int)
$$\text{RootsOfQ} \overset{\text{def}}{=} 1 + \text{Double} + \text{Double} \times \text{Double} .$$
The type notation is significantly shorter because it omits all case class names and
part names from the Scala type definitions.
To clarify our notation for parameterized types, consider this code:
def f[A, B]: A => (A => B) => B = { x => g => g(x) }
The type notation for the type signature of f may be written as:
$$f : \forall(A, B).\ A \to (A \to B) \to B .$$
The type quantifier ∀( 𝐴, 𝐵) (reads “for all 𝐴 and 𝐵”) indicates that 𝑓 can be used
with all types 𝐴 and 𝐵.
In Scala, type expressions can be named, and those names (called type aliases)
can be used to make code shorter. Type aliases may also contain type parameters.
Defining and using a type alias for the type signature of the function f looks like
this:
type F[A, B] = A => (A => B) => B
def f[A, B]: F[A, B] = { x => g => g(x) }
This is written in the type notation by placing all type parameters into superscripts:
$$F^{A,B} \overset{\text{def}}{=} A \to (A \to B) \to B ,$$
$$f^{A,B} : F^{A,B} \overset{\text{def}}{=} x^{:A} \to g^{:A \to B} \to g(x) .$$
In Scala 3, the function f can be written as a value (val) via this syntax:
val f: [A, B] => A => (A => B) => B = { // Valid only in Scala 3.
[A, B] => (x: A) => (g: A => B) => g(x)
}
This syntax closely corresponds to the code notation 𝐴,𝐵 → 𝑥 :𝐴 → 𝑔 :𝐴→𝐵 → 𝑔(𝑥).
The precedence of operators in the type notation is chosen in order to write
fewer parentheses in some frequently used type expressions. The rules of prece-
dence are:
• The type product operator (×) groups stronger than the disjunctive operator
(+), so that type expressions such as 𝐴 + 𝐵 × 𝐶 have the same operator prece-
dence as in arithmetic. That is, 𝐴 + 𝐵 × 𝐶 means 𝐴 + (𝐵 × 𝐶). This convention
makes type expressions easier to read.
• The function type arrow (→) groups weaker than the operators + and ×,
so that often-used types such as 𝐴 → 1 + 𝐵 (representing A => Option[B]) or
𝐴 × 𝐵 → 𝐶 (representing ((A, B)) => C) can be written without any paren-
theses. Type expressions such as ( 𝐴 → 𝐵) × 𝐶 will require parentheses but
are needed less often.
• The type quantifiers group weaker than all other operators, so we can write
types such as ∀𝐴. 𝐴 → 𝐴 → 𝐴 without parentheses. This is helpful because
type quantifiers are most often placed at the top level of a type expression.
When that is not the case, parentheses are necessary. An example is the type
expression (∀𝐴. 𝐴 → 𝐴 → 𝐴) → 1 + 1.
So, the proposition CH ( Unit) is always true. In the type notation, the Unit type is
denoted by 1. We may write the rule as CH (1) = 𝑇𝑟𝑢𝑒.
Named unit types also have a single value that is always possible to compute.
For example:
final case class N1()
defines a named unit type. We can compute a value of type N1 without using any
previously given values:
val x: N1 = N1()
So, the proposition CH ( N1) is always true. In the type notation, named unit types
are also denoted by 1, same as the Unit type itself.
1b) Rule for the void type The Scala type Nothing has no values, so the propo-
sition CH ( Nothing) is always false. The type Nothing is denoted by 0 in the type
notation. So, the rule is CH (0) = 𝐹𝑎𝑙𝑠𝑒.
1c) Rule for primitive types For a specific primitive (or library-defined) type
such as Int or String, the corresponding CH -proposition is always true because
we may always create a constant value of that type, e.g.:
def f[...]: ... {
...
val x: String = "abc" // We can always compute a `String` value.
...
}
So, the rule for primitive types is the same as that for the Unit type. For example,
CH (String) = 𝑇𝑟𝑢𝑒.
2) Rule for tuple types To compute a value of a tuple type (A, B) requires com-
puting a value of type A and a value of type B. This is expressed by the logical
formula CH ( (A, B)) = CH ( 𝐴) ∧ CH (𝐵). A similar formula holds for case classes,
as Eq. (3.2) shows. In the type notation, the tuple (A, B) is written as 𝐴 × 𝐵, and
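tuples with more parts are written similarly. So, we write the rule for tuples as:
$$\mathcal{CH}(A \times B \times ... \times C) = \mathcal{CH}(A) \wedge \mathcal{CH}(B) \wedge ... \wedge \mathcal{CH}(C) .$$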
3) Rule for disjunctive types A disjunctive type may consist of several cases.
Having a value of a disjunctive type means to have a value of (at least) one of
those cases. An example of translating this relationship into a formula was shown
by Eq. (3.1). For the standard disjunctive type Either[A, B], we have the logical
formula CH ( Either[A, B]) = CH ( 𝐴) ∨ CH (𝐵). In the type notation, disjunctive
types with more than two parts are written similarly as 𝐴 + 𝐵 + ... + 𝐶. So, the rule
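for disjunctive types is written as:
$$\mathcal{CH}(A + B + ... + C) = \mathcal{CH}(A) \vee \mathcal{CH}(B) \vee ... \vee \mathcal{CH}(C) .$$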
4) Rule for function types Consider now a function type such as A => B. This
type is written in the type notation as 𝐴 → 𝐵. To compute a value of that type, we
need to write code like this:
val f: A => B = { (a: A) =>
??? // Compute a value of type B in this scope.
}
The inner scope of this function needs to compute a value of type 𝐵, and the given
value a: A may be used for that. So, CH ( 𝐴 → 𝐵) is true if and only if we are able
to compute a value of type 𝐵 when we are given a value of type 𝐴. To translate
this statement into the language of logical propositions, we need to use the logical
implication, CH ( 𝐴) ⇒ CH (𝐵), which means that CH (𝐵) can be proved if we
already have a proof of CH ( 𝐴). So, the rule for function types is:
CH ( 𝐴 → 𝐵) = CH ( 𝐴) ⇒ CH (𝐵) .
5) Rule for parameterized types Consider this function with type parameters:
def f[A, B]: A => (A => B) => B = { x => g => g(x) }
Being able to define the body of such a function is the same as being able to com-
pute a value of type A => (A => B) => B for all possible Scala types A and B. In the
notation of logic, this is written as:
CH (∀( 𝐴, 𝐵). 𝐴 → ( 𝐴 → 𝐵) → 𝐵) ,
or equivalently: ∀( 𝐴, 𝐵). CH ( 𝐴 → ( 𝐴 → 𝐵) → 𝐵) .
The symbol ∀ means “for all” and is called the universal quantifier in logic. We
read ∀𝐴. CH (𝐹 𝐴 ) as the proposition “for all types A, we can compute a value of
type F[A]”. Here, F[A] can be any type expression that depends on A (or even a
type expression that does not depend on A).
So, the rule for parameterized types of the form ∀𝐴. 𝐹 𝐴 is:
CH (∀𝐴. 𝐹 𝐴 ) = ∀𝐴. CH (𝐹 𝐴 ) .
The rules just shown will allow us to express CH -propositions for complicated
types via CH -propositions for type parameters. Then any type signature can be
rewritten as a sequent that contains CH -propositions only for the individual type
parameters.
In this way, we find a correspondence between a fully parametric type signa-
ture and a logical sequent that expresses the statement “the type signature can be
implemented”. This is the first part of the Curry-Howard correspondence.
Table 5.1 summarizes the type notation and shows how to translate it into logic
formulas with CH -propositions. Apart from recursive types (which we do not
consider in this chapter), Table 5.1 lists all type constructions that may be used in
the code of a fully parametric function.
Example 5.1.4.1 Define a function delta taking an argument x and returning the
pair (x, x). Derive the most general type for this function. Write the type signa-
ture of delta in the type notation, and translate it into a CH -proposition. Simplify
the CH -proposition if possible.
Solution Begin by writing the code of the function:
def delta(x: ...) = (x, x)
To derive the most general type for delta, first assume x: A, where A is a type
parameter; then the tuple (x, x) has type (A, A). We do not see any constraints on
the type parameter A. So, A represents an arbitrary type and needs to be added to
the type signature of delta:
def delta[A](x: A): (A, A) = (x, x)
We find that the most general type of delta is A => (A, A). We also note that delta
seems to be the only way of implementing a fully parametric function with type
signature A => (A, A).
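A brief usage sketch: because delta is fully parametric, the same code applies to arguments of any type. The object name DeltaDemo is an assumption made for this illustration.

```scala
object DeltaDemo {
  // The most general type signature derived above.
  def delta[A](x: A): (A, A) = (x, x)

  val pairOfInts: (Int, Int) = delta(1)               // A = Int
  val pairOfStrings: (String, String) = delta("abc")  // A = String
}
```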
We will use the letter Δ for the function delta. In the type notation, the type
signature of Δ is:
Δ𝐴 : 𝐴 → 𝐴 × 𝐴 .
So, the proposition CH (Δ) (meaning “the function Δ can be implemented”) is:
CH (Δ) = ∀𝐴. CH ( 𝐴 → 𝐴 × 𝐴) .
In the type expression 𝐴 → 𝐴 × 𝐴, the product symbol (×) binds stronger than the
function arrow (→), so the parentheses in 𝐴 → ( 𝐴 × 𝐴) may be omitted.
CH ( 𝐴 → 𝐴 × 𝐴)
rule for function types : = CH ( 𝐴) ⇒ CH ( 𝐴 × 𝐴)
rule for tuple types : = CH ( 𝐴) ⇒ (CH ( 𝐴) ∧ CH ( 𝐴)) .
It is intuitively clear that the proposition CH (Δ) is true: it just says that if
CH ( 𝐴) is true then CH ( 𝐴) and CH ( 𝐴) is true. The point of writing CH (Δ) in
a mathematical notation is to prepare for proving that proposition rigorously.
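Example 5.1.4.2 The standard types Either[A, B] and Option[A] are written in the
type notation as:
$$\text{Either}^{A,B} \overset{\text{def}}{=} A + B , \qquad \text{Option}^{A} \overset{\text{def}}{=} 1 + A .$$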
$$\text{UserAction} \overset{\text{def}}{=} \text{String} \times \text{String} + \text{String} + \text{Long} . \tag{5.3}$$
The type operation × groups stronger than +, as in arithmetic. To derive the type
notation (5.3), we first drop all names from case classes and get three nameless
tuples (String, String), (String), and (Long). Each of these tuples is then converted
into a product using the operator ×, and all products are “summed” in the type
notation using the operator +.
Example 5.1.4.4 The parameterized disjunctive type Either3 is a generalization
of Either:
sealed trait Either3[A, B, C]
final case class Left[A, B, C](x: A) extends Either3[A, B, C]
final case class Middle[A, B, C](x: B) extends Either3[A, B, C]
final case class Right[A, B, C](x: C) extends Either3[A, B, C]
This disjunctive type is written in the type notation as $\text{Either3}^{A,B,C} \overset{\text{def}}{=} A + B + C$.
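A value of type Either3 is used by pattern matching with one case per part of the disjunction. The following sketch wraps the definitions in an object (so that the names Left and Right refer to the Either3 case classes rather than to the standard library classes); the names Either3Demo and describe are hypothetical.

```scala
object Either3Demo {
  sealed trait Either3[A, B, C]
  final case class Left[A, B, C](x: A) extends Either3[A, B, C]
  final case class Middle[A, B, C](x: B) extends Either3[A, B, C]
  final case class Right[A, B, C](x: C) extends Either3[A, B, C]

  // One case for each of the three parts of A + B + C.
  def describe(e: Either3[Int, String, Boolean]): String = e match {
    case Left(n)   => s"Int: $n"
    case Middle(s) => s"String: $s"
    case Right(b)  => s"Boolean: $b"
  }
}
```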
Example 5.1.4.5 Define a Scala type constructor F corresponding to the type notation:
$$F^{A} \overset{\text{def}}{=} 1 + \text{Int} \times A \times A + \text{Int} \times (\text{Int} \to A) .$$
Solution The formula for 𝐹 𝐴 defines a disjunctive type F[A] with three parts.
To implement F[A] in Scala, we need to choose names for each of the disjoint parts,
which will become case classes. For the purposes of this example, let us choose
names F1, F2, and F3. Each of these case classes needs to have the same type pa-
rameter A. So, we begin writing the code as:
sealed trait F[A]
final case class F1[A](...) extends F[A]
final case class F2[A](...) extends F[A]
final case class F3[A](...) extends F[A]
Each of these case classes represents one part of the disjunctive type: F1 represents
1, F2 represents Int × 𝐴 × 𝐴, and F3 represents Int × (Int → 𝐴). It remains to choose
names and define the case classes:
sealed trait F[A]
final case class F1[A]() extends F[A] // Named unit type.
final case class F2[A](n: Int, x1: A, x2: A) extends F[A]
final case class F3[A](n: Int, f: Int => A) extends F[A]
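To see the three parts of F[A] in use, here is a small sketch that pattern-matches on a value of type F[A]. The object FDemo and the helper function values are assumptions introduced only for this illustration.

```scala
object FDemo {
  sealed trait F[A]
  final case class F1[A]() extends F[A] // Named unit type.
  final case class F2[A](n: Int, x1: A, x2: A) extends F[A]
  final case class F3[A](n: Int, f: Int => A) extends F[A]

  // A hypothetical helper: list the values of type A obtainable from each part.
  def values[A](fa: F[A]): List[A] = fa match {
    case F1()          => Nil           // The part 1 holds no values of type A.
    case F2(_, x1, x2) => List(x1, x2)  // The part Int × A × A holds two values.
    case F3(n, f)      => List(f(n))    // Apply the stored function to the stored Int.
  }
}
```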
Solution This is a curried function, so we first rewrite the type signature as:
def fmap[A, B]: (A => B) => Option[A] => Option[B]
The type notation for Option[A] is 1 + 𝐴. Now we can write the type signature of
fmap as:
$$\text{fmap}^{A,B} : (A \to B) \to 1 + A \to 1 + B ,$$
$$\text{or equivalently:} \quad \text{fmap} : \forall(A, B).\ (A \to B) \to 1 + A \to 1 + B .$$
We do not put parentheses around 1 + 𝐴 and 1 + 𝐵 because the function arrow (→)
groups weaker than the other type operations. But parentheses around ( 𝐴 → 𝐵)
are required.
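For reference, a minimal sketch of code with this type signature is shown below; it is the usual implementation for Option, written here only to connect the type notation with Scala code. The object name FmapDemo is an assumption.

```scala
object FmapDemo {
  def fmap[A, B](f: A => B): Option[A] => Option[B] = {
    case Some(a) => Some(f(a)) // The part A of 1 + A is mapped to the part B of 1 + B.
    case None    => None       // The part 1 of 1 + A is mapped to the part 1 of 1 + B.
  }
}
```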
We will usually prefer to write type parameters in superscripts rather than un-
der type quantifiers. For example, we will prefer to write the type signature of an
identity function as id 𝐴 : 𝐴 → 𝐴 rather than as id : ∀𝐴. 𝐴 → 𝐴.
5.2 The logic of CH -propositions
𝑓 𝐴,𝐵,𝐶 : 1 + 𝐴 + 𝐵 + 𝐶 → ( 𝐴 → 1 + 𝐵) → 1 + 𝐵 + 𝐶 .
∀(𝛼, 𝛽). (𝛼 ⇒ 𝛽) ⇒ 𝛼 ⇒ 𝛼 .
scala> toLeft(123)
res0: Either[Int, Nothing] = Left(123)
scala> toRight("abc")
res1: Either[Nothing, String] = Right("abc")
We can write the functions toLeft and toRight in the code notation as:
$$\text{toLeft}^{A,B} \overset{\text{def}}{=} x^{:A} \to x + \mathbb{0}^{:B} , \qquad \text{toRight}^{A,B} \overset{\text{def}}{=} y^{:B} \to \mathbb{0}^{:A} + y .$$
The code notation shows values of disjunctive types without using Scala class
names such as Either, Right, and Left. This shortens the writing and speeds up
reasoning about code.
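One possible Scala definition of these two functions is sketched below. The exact form of the definitions (ordinary methods with two type parameters, wrapped in an object named ToLeftRightDemo) is an assumption and may differ from the definitions used for the session shown above.

```scala
object ToLeftRightDemo {
  // Code notation: x → x + 0 (the value x goes into the left part).
  def toLeft[A, B](x: A): Either[A, B] = Left(x)

  // Code notation: y → 0 + y (the value y goes into the right part).
  def toRight[A, B](y: B): Either[A, B] = Right(y)

  val l: Either[Int, String] = toLeft[Int, String](123)
  val r: Either[Int, String] = toRight[Int, String]("abc")
}
```

When a type parameter is not specified and cannot be inferred from the argument, Scala typically sets it to Nothing, which is consistent with the types printed in the session above.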
In the notation 𝟘:𝐴 + 𝑦, we use the symbol 𝟘 rather than an ordinary zero (0), to
avoid suggesting that 0 is a value of the void type 0. The void type 0 has no values, unlike
the Unit type, 1, which has a value denoted by 1 in the code notation.
Type annotations such as 𝟘:𝐴 are helpful to remind ourselves about the type
parameter 𝐴 used, e.g., by the disjunctive value 𝟘:𝐴 + 𝑦 :𝐵 in the body of toRight[A,
B]. Without that type annotation, 𝟘 + 𝑦 :𝐵 needs to be interpreted as a value of type
Either[A, B], where the type parameter 𝐴 must be determined by matching with
the types of other expressions. When it is clear what types are being used, we may
omit type annotations and write simply 𝟘 + 𝑦 instead of 𝟘:𝐴 + 𝑦 :𝐵 .
The type notation for pattern matching is also unconventional because it uses
“function matrices” (matrices whose elements are functions). To motivate that, we
view a match/case expression as a set of functions that map parts of a disjunctive
type into parts of another disjunctive type. Consider this example code:
1 def f: Either[Int, String] => Either[String, Int] = {
2 case Left(x) => Right(10 * x)
3 case Right(y) => Left("a" + y + "b")
4 }
If we ignore the type names (Left and Right), we will see that line 2 is similar to the
function x => 10 * x of type Int => Int, while line 3 is similar to the function y =>
"a" + y + "b" of type String => String. These functions become matrix elements in
the “function matrix” for f:
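$$f^{:\text{Int} + \text{String} \to \text{String} + \text{Int}} \overset{\text{def}}{=} \begin{array}{|c||cc|} \hline & \text{String} & \text{Int} \\ \hline \text{Int} & \mathbb{0} & x \to 10 * x \\ \text{String} & y \to \text{"a"} + y + \text{"b"} & \mathbb{0} \\ \hline \end{array} .$$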
The rows of the matrix correspond to the case rows in the Scala code. There is one
row for each part of the disjunctive type of the input argument. The columns of
the matrix correspond to the parts of the disjunctive type of the output. The matrix
element in the first row and second column is the function 𝑥 → 10 ∗ 𝑥 that corre-
sponds to line 2 in the Scala code. The result type for that case is Right[Nothing,
Int], which is written as 0 + Int in the type notation. The function 𝑥 → 10 ∗ 𝑥 is
written in the second column to indicate that the result type is 0 + Int. The matrix
element in the first row and the first column is written as 𝟘 because no value of
the type Left is returned in that case.
The matrix element in the second row and first column is the function 𝑦 →
"a" + 𝑦 + "b" that corresponds to line 3 in the Scala code. The result type for that
case is String + 0. The other matrix element in the second row is written as 𝟘,
according to the return type String + 0.
In this way, we translate all lines of the match/case expression into a code matrix.
In each row of the matrix, there can be only one element that is not 𝟘.
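The correspondence between matrix rows and case lines can be checked by running the example function on one value from each part of the input type. In this sketch, the object name MatrixDemo is an assumption; the function f is the one shown above.

```scala
object MatrixDemo {
  def f: Either[Int, String] => Either[String, Int] = {
    case Left(x)  => Right(10 * x)       // First row, second column: x → 10 * x.
    case Right(y) => Left("a" + y + "b") // Second row, first column: y → "a" + y + "b".
  }

  val fromInt: Either[String, Int] = f(Left(2))
  val fromString: Either[String, Int] = f(Right("x"))
}
```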
It turns out that the matrix notation is well adapted to computing forward compo-
sitions of functions that operate on disjunctive types. We will see many examples
of such computations later in this book. In this chapter, we will use code matrices
in Example 5.3.2.4, Example 5.3.4.6, and some others.
$$\frac{}{\Gamma \vdash \mathcal{CH}(1)} \quad (\text{create unit}) .$$
The “fraction with a label” represents a proof rule. The denominator of the fraction
is the target sequent that we need to prove. The numerator of the fraction can have
zero or more other sequents that need to be proved before the target sequent can
be proved. In this case, the set of previous sequents is empty: the target sequent is
an axiom and so requires no previous sequents for its proof. The label “create unit”
is an arbitrary name used to refer to the rule.
2) Use a given value At any place within the code of a fully parametric function,
we may use one of the function’s arguments, say 𝑥 :𝐴 . If some argument has type
𝐴, it means that we already have a value of type 𝐴. So, the corresponding proposition,
𝛼 ≝ CH ( 𝐴), belongs to the set of premises of the sequent we are trying to
prove. To indicate this, we write the set of premises as “Γ, 𝛼”. The code construct
x computes a value of type 𝐴, i.e., shows that “𝛼 is true with premises Γ, 𝛼”. That
proposition is the meaning of the sequent Γ, 𝛼 ⊢ 𝛼. The proof code for that sequent
is an expression that just returns the value 𝑥:
$$\text{Proof}\,(\Gamma, \alpha \vdash \alpha)_{\text{given } x^{:A}} = x .$$
Here, the subscript “given 𝑥 :𝐴 ” indicates that the value 𝑥 :𝐴 must come from the
premises. In this case, the set of premises is Γ, 𝛼 and so the proposition 𝛼 must
have been already proved. The proof of 𝛼 will give a value 𝑥 :𝐴 .
Actually, the premises for the sequent Γ, 𝛼 ⊢ 𝛼 may give us not only a value
𝑥 :𝐴 but also some other values of other types. We may collectively denote those
values by 𝑝 :Γ . But the proof of the sequent Γ, 𝛼 ⊢ 𝛼 does not need to use 𝑝. To
show that explicitly, we may write:
$$\text{Proof}\,(\Gamma, \alpha \vdash \alpha)_{\text{given } p^{:\Gamma},\, x^{:A}} = x .$$
The sequent Γ, 𝛼 ⊢ 𝛼 is an axiom since its proof requires no previous sequents;
a value of type 𝐴 is already given in the premises. We denote this axiom by:
$$\frac{}{\Gamma, \alpha \vdash \alpha} \quad (\text{use value}) .$$
3) Create a function At any place in the code, we may compute a nameless func-
tion of type, say, 𝐴 → 𝐵, by writing (x: A) => expr as long as a value expr of type
𝐵 can be computed in the inner scope of the function. The code for expr is also
required to be fully parametric; it may use x and/or other values visible in that
scope. So, we now need to answer the question of whether a fully parametric
function can compute a value of type 𝐵, given an argument of type 𝐴 as well as
all other arguments previously given to the parent function. This question is an-
swered by a sequent whose premises contain one more proposition, CH ( 𝐴), in
addition to all previously available premises. Translating this into the language of
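CH -propositions, we find that we will prove the sequent:
$$\Gamma \vdash \mathcal{CH}(A \to B) \quad = \quad \Gamma \vdash \mathcal{CH}(A) \Rightarrow \mathcal{CH}(B) \quad = \quad \Gamma \vdash \alpha \Rightarrow \beta$$
if we can prove the sequent Γ, CH ( 𝐴) ⊢ CH (𝐵) = Γ, 𝛼 ⊢ 𝛽. In the notation of
formal logic, this is a derivation rule (rather than an axiom) and is written as:
$$\frac{\Gamma, \alpha \vdash \beta}{\Gamma \vdash \alpha \Rightarrow \beta} \quad (\text{create function}) .$$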
The turnstile symbol (⊢) groups weaker than other operators. So, we can write
sequents such as (Γ, 𝛼) ⊢ (𝛽 ⇒ 𝛾) with fewer parentheses: Γ, 𝛼 ⊢ 𝛽 ⇒ 𝛾.
What code corresponds to the “create function” rule? The proof of Γ ⊢ 𝛼 ⇒ 𝛽
depends on a proof of another sequent. So, the corresponding code must be a
function that takes a proof of the previous sequent as an argument and returns a
proof of the new sequent. We call that function a proof transformer.
By the CH correspondence, a proof of a sequent corresponds to a code expression
of the type given by the goal of the sequent. That expression may use arguments
of types corresponding to the premises of the sequent. So, a proof of the
sequent Γ, 𝛼 ⊢ 𝛽 is an expression exprB of type 𝐵 that may use a given value of
type 𝐴 as well as any other arguments given previously. Then we can write the
proof code for the sequent Γ ⊢ 𝛼 ⇒ 𝛽 as the nameless function (x: A) => exprB.
This function has type 𝐴 → 𝐵 and requires us to already have a suitable expression
exprB. This exactly corresponds to the proof rule “create function”. That rule’s
proof transformer is:
$$\text{Proof}\,(\Gamma \vdash \alpha \Rightarrow \beta)_{\text{given } p^{:\Gamma}} = x^{:A} \to \text{Proof}\,(\Gamma, \alpha \vdash \beta)_{\text{given } p^{:\Gamma},\, x^{:A}} .$$
4) Use a function At any place in the code, we may apply an already defined
function of type 𝐴 → 𝐵 to an already computed value of type 𝐴. The result will
be a value of type 𝐵. This corresponds to assuming CH ( 𝐴 → 𝐵) and CH ( 𝐴), and
then deriving CH (𝐵). The notation for this proof rule is:
$$\frac{\Gamma \vdash \alpha \qquad \Gamma \vdash \alpha \Rightarrow \beta}{\Gamma \vdash \beta} \quad (\text{use function}) .$$
The code corresponding to this proof rule takes previously computed values x: A
and f: A => B, and writes the expression f(x). This can be written as a function
application:
$$\text{Proof}\,(\Gamma \vdash \beta) = \text{Proof}\,(\Gamma \vdash \alpha \Rightarrow \beta)\,\big(\text{Proof}\,(\Gamma \vdash \alpha)\big) .$$
Here we omitted the subscripts “given 𝑝 :Γ ” for brevity, since all sequents have the
same premises Γ.
5) Create a tuple If we have already computed some values a: A and b: B, we
may write the expression (a, b) and so compute a value of the tuple type (A, B).
The proof rule is:
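$$\frac{\Gamma \vdash \alpha \qquad \Gamma \vdash \beta}{\Gamma \vdash \alpha \wedge \beta} \quad (\text{create tuple}) .$$
In the code notation, 𝑎 × 𝑏 means the pair (a, b), so we can write the code as:
$$\text{Proof}\,(\Gamma \vdash \alpha \wedge \beta) = \text{Proof}\,(\Gamma \vdash \alpha) \times \text{Proof}\,(\Gamma \vdash \beta) .$$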
This rule describes creating a tuple of 2 values. A larger tuple, such as (w, x,
y, z), can be expressed via nested pairs, e.g., as (w, (x, (y, z))). So, it suffices
to have a derivation rule for creating pairs. That rule allows us to derive the
rules for creating all larger tuples, without having to define separate rules for, say,
Γ ⊢ 𝛼 ∧ 𝛽 ∧ 𝛾.
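The claim that larger tuples can be expressed via nested pairs can be illustrated by a pair of conversion functions. This is only a sketch; the names NestedTupleDemo, nest, and unnest are hypothetical.

```scala
object NestedTupleDemo {
  // Convert a 4-tuple into nested pairs, using only pair creation.
  def nest[W, X, Y, Z](t: (W, X, Y, Z)): (W, (X, (Y, Z))) =
    (t._1, (t._2, (t._3, t._4)))

  // Convert nested pairs back into a 4-tuple, using only the accessors _1 and _2.
  def unnest[W, X, Y, Z](p: (W, (X, (Y, Z)))): (W, X, Y, Z) =
    (p._1, p._2._1, p._2._2._1, p._2._2._2)
}
```

The two functions are inverses of each other, so no information is lost by working with pairs only.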
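$$\frac{\Gamma \vdash \alpha \wedge \beta}{\Gamma \vdash \alpha} \quad (\text{use tuple-1}) , \qquad \frac{\Gamma \vdash \alpha \wedge \beta}{\Gamma \vdash \beta} \quad (\text{use tuple-2}) .$$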
where we introduced the notation 𝜋1 and 𝜋2 to mean the Scala code _._1 and _._2.
Since all tuples can be expressed through pairs, it is sufficient to have proof
rules for pairs.
7) Create a disjunctive value The type Either[A, B] corresponding to the disjunc-
tion 𝛼 ∨ 𝛽 can be used to define any other disjunctive type; e.g., a disjunctive type
with three parts can be expressed as Either[A, Either[B, C]]. So, it suffices to have
proof rules for a disjunction of two propositions.
There are two ways of creating a value of the type Either[A, B]: the code ex-
pressions are Left(x: A) and Right(y: B). The values x: A or y: B must have been
computed previously (and correspond to previously proved sequents). So, the
sequent proof rules are:
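$$\frac{\Gamma \vdash \alpha}{\Gamma \vdash \alpha \vee \beta} \quad (\text{create Left}) , \qquad \frac{\Gamma \vdash \beta}{\Gamma \vdash \alpha \vee \beta} \quad (\text{create Right}) .$$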
The corresponding proof transformers can be written using the case class names
Left and Right as:
8) Use a disjunctive value Pattern matching is the basic way of using a value of
type Either[A, B]:
val result: C = (e: Either[A, B]) match {
case Left(x: A) => expr1(x)
case Right(y: B) => expr2(y)
}
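$$\frac{\Gamma \vdash \alpha \vee \beta \qquad \Gamma, \alpha \vdash \gamma \qquad \Gamma, \beta \vdash \gamma}{\Gamma \vdash \gamma} \quad (\text{use Either}) .$$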
We found eight proof rules shown in Table 5.2. These rules define the intuition-
istic propositional logic, also called constructive propositional logic. We will
call this logic “constructive” for short.
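As an informal summary, each of the eight rules can be mirrored by a small fully parametric Scala function whose arguments play the role of already proved sequents. This is a sketch for intuition only, not a formal encoding of sequents; the object name ProofRules and all function names in it are assumptions.

```scala
object ProofRules {
  // 1) create unit: a value of type Unit is always available.
  def createUnit: Unit = ()
  // 2) use value: return a value that is already given.
  def useValue[A](x: A): A = x
  // 3) create function: an expression of type B that may use x: A gives a function A => B.
  def createFunction[A, B](body: A => B): A => B = (x: A) => body(x)
  // 4) use function: apply a given function to a given value.
  def useFunction[A, B](x: A, f: A => B): B = f(x)
  // 5) create tuple.
  def createTuple[A, B](a: A, b: B): (A, B) = (a, b)
  // 6) use tuple: the two projections.
  def useTuple1[A, B](p: (A, B)): A = p._1
  def useTuple2[A, B](p: (A, B)): B = p._2
  // 7) create a disjunctive value.
  def createLeft[A, B](x: A): Either[A, B] = Left(x)
  def createRight[A, B](y: B): Either[A, B] = Right(y)
  // 8) use a disjunctive value: pattern matching with one function per case.
  def useEither[A, B, C](e: Either[A, B], f: A => C, g: B => C): C = e match {
    case Left(x)  => f(x)
    case Right(y) => g(y)
  }

  // Combining rules: a function of type X => (X, X) built from
  // "create function", "create tuple", and "use value".
  def delta[X]: X => (X, X) =
    createFunction[X, (X, X)](x => createTuple(useValue(x), useValue(x)))
}
```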
Solution First, we formulate the task as proving the proposition “For any type
𝑋, we can have a value of type 𝑋 → 𝑋 × 𝑋”. This corresponds to the proposition
CH (∀𝑋. 𝑋 → 𝑋 × 𝑋). That proposition will be the goal of a sequent. The function
has no arguments, so there are no premises for the sequent. We denote an empty
set of premises by the symbol ∅ . So, the sequent is written as:
$$\emptyset \vdash \mathcal{CH}(\forall X.\ X \to X \times X) .$$
We denote 𝜒 ≝ CH (𝑋) and rewrite this sequent using the rules of Table 5.1. The
result is a sequent involving just 𝜒:
$$\forall \chi.\ \emptyset \vdash \chi \Rightarrow \chi \wedge \chi .$$
Next, we look for a proof of this sequent. For brevity, we will omit the quantifier
∀𝜒 since it will be present in front of every sequent.
We search through the proof rules in Table 5.2, looking for “denominators” that
match our current sequent. If we find such a rule, we will apply that rule to our
sequent. Then we will need to prove the sequents in the rule’s “numerator”.
Beginning with ∅ ` 𝜒 ⇒ 𝜒 ∧ 𝜒, we find a match with the rule “create function”:
$$ \frac{\Gamma, \alpha \vdash \beta}{\Gamma \vdash \alpha \Rightarrow \beta} \;(\text{create function}) $$
The denominator of that rule is Γ ` 𝛼 ⇒ 𝛽. This pattern will match our sequent
(∅ ` 𝜒 ⇒ 𝜒 ∧ 𝜒) if we set Γ = ∅, 𝛼 = 𝜒, and 𝛽 = 𝜒 ∧ 𝜒. So, we are allowed to apply
the rule “create function” with these assignments.
After these assignments, the rule “create function” becomes:
$$ \frac{\emptyset, \chi \vdash \chi \wedge \chi}{\emptyset \vdash \chi \Rightarrow \chi \wedge \chi} \quad . $$
Now the rule says: we will prove the denominator (∅ ` 𝜒 ⇒ 𝜒 ∧ 𝜒) if we first
prove the numerator (∅, 𝜒 ` 𝜒 ∧ 𝜒).
The set of premises ∅, 𝜒 is the union of an empty set and the set having a single
premise 𝜒. So, we can write the last sequent also as 𝜒 ` 𝜒 ∧ 𝜒 if we like.
To prove that sequent, we again look for a rule whose denominator matches our
sequent. That rule is “create tuple”:
$$ \frac{\Gamma \vdash \alpha \qquad \Gamma \vdash \beta}{\Gamma \vdash \alpha \wedge \beta} \;(\text{create tuple}) $$
The denominator (Γ ` 𝛼 ∧ 𝛽) will match our sequent (𝜒 ` 𝜒 ∧ 𝜒) if we assign Γ = 𝜒,
𝛼 = 𝜒, 𝛽 = 𝜒. With these assignments, the rule says that we need to prove two
sequents (Γ ` 𝛼 and Γ ` 𝛽), which are in fact the same sequent (𝜒 ` 𝜒).
To prove that sequent, we apply the axiom “use value”:
$$ \frac{}{\Gamma, \alpha \vdash \alpha} \;(\text{use value}) \quad . $$
5 The logic of types. III. The Curry-Howard correspondence
Figure 5.1 (proof tree, from the initial sequent down to the leaves):
∅ ⊢ 𝜒 ⇒ 𝜒 ∧ 𝜒
    𝜒 ⊢ 𝜒 ∧ 𝜒
        𝜒 ⊢ 𝜒
        𝜒 ⊢ 𝜒
$$ \frac{}{\emptyset, \chi \vdash \chi} \quad . $$
This axiom says that the sequent ∅, 𝜒 ` 𝜒 (or equivalently 𝜒 ` 𝜒) is already true
with nothing more needed to prove. So, the proof is finished.
We may visualize the proof as a tree shown in Figure 5.1. The tree starts with
the initial sequent and applies rules that require us to prove other sequents. The
tree stops with axioms in leaf positions.
Now we need to extract code from the proof. We begin with the leaves of the
tree and go back towards the top of the proof.
The axiom “use value” has the proof code 𝑥, where 𝑥 is given in the premises:
$$ \text{Proof}(\Gamma, \alpha \vdash \alpha)_{\text{given } x^{:A}} = x \quad . $$
The proof uses this axiom twice with 𝛼 = 𝜒. Recall that 𝜒 denotes CH (𝑋), and so
the value 𝑥 must have type 𝑋. So, we write:
$$ \text{Proof}(\chi \vdash \chi)_{\text{given } x^{:X}} = x \quad . $$
The previous rule used by the proof was “create tuple”. Its proof code is:
That rule was used with Γ = 𝜒, 𝛼 = 𝜒, and 𝛽 = 𝜒. So, the proof code becomes:
Finally, the first rule “create function” has the proof code:
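Read bottom-up, the three steps of this proof (“use value” twice, then “create tuple”, then “create function”) assemble into a one-line function. The following is a hand-written sketch of what the derivation is expected to yield; the names `DuplicateExample` and `duplicate` are chosen here for illustration.

```scala
object DuplicateExample {
  // "create function": introduce x of type X as a new premise.
  // "create tuple": build a pair from two proofs of the same sequent 𝜒 ⊢ 𝜒.
  // "use value" (twice): each component of the pair is just x.
  def duplicate[X]: X => (X, X) = { (x: X) => (x, x) }
}
```

For instance, `DuplicateExample.duplicate[Int](3)` evaluates to the pair `(3, 3)`.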
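Denote 𝛼 ≝ CH (𝐴) and 𝛽 ≝ CH (𝐵), and rewrite the sequent using the rules of
Table 5.1 to obtain a logic formula that involves just 𝛼 and 𝛽:
$$ \forall(\alpha, \beta).\; \emptyset \vdash ((\alpha \Rightarrow \alpha) \Rightarrow \beta) \Rightarrow \beta \quad . \qquad (5.6) $$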
The next step is to prove the sequent (5.6). For brevity, we will omit the quanti-
fier ∀(𝛼, 𝛽) since it will be present in front of every sequent.
Begin by looking for a proof rule whose “denominator” has a sequent similar to
Eq. (5.6), i.e., has an implication (𝑝 ⇒ 𝑞) in the goal. We have only one rule with
the “denominator” of the form Γ ` 𝑝 ⇒ 𝑞. That rule is “create function”, which we
will rewrite for clarity as:
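$$ \frac{\Gamma, p \vdash q}{\Gamma \vdash p \Rightarrow q} \;(\text{create function}) $$
To match the denominator, we use this rule with the assignments Γ = ∅, 𝑝 ≝ (𝛼 ⇒ 𝛼) ⇒ 𝛽, and 𝑞 ≝ 𝛽:
$$ \frac{(\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \beta}{\emptyset \vdash ((\alpha \Rightarrow \alpha) \Rightarrow \beta) \Rightarrow \beta} \;(\text{create function}) $$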
The rule’s numerator now requires us to prove the sequent (𝛼 ⇒ 𝛼) ⇒ 𝛽 ⊢ 𝛽.
We may write that sequent as 𝛾 ⊢ 𝛽, where we defined 𝛾 ≝ (𝛼 ⇒ 𝛼) ⇒ 𝛽 for
brevity.
So, the next step is to prove the sequent 𝛾 ` 𝛽. The premise (𝛾) contains an
implication. But there is no proof rule whose denominator has a premise in the
197
5 The logic of types. III. The Curry-Howard correspondence
form of an implication (𝑝 ⇒ 𝑞). Instead, we have the rule “use function” whose
denominator contains an arbitrary sequent:
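$$ \frac{\Gamma \vdash p \qquad \Gamma \vdash p \Rightarrow q}{\Gamma \vdash q} \;(\text{use function}) $$
$$ \frac{\gamma \vdash \alpha \Rightarrow \alpha \qquad \gamma \vdash (\alpha \Rightarrow \alpha) \Rightarrow \beta}{\gamma \vdash \beta} \;(\text{use function}) $$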
The sequent in the numerator 𝛾, 𝛼 ` 𝛼 is proved directly by the axiom “use value”.
The sequent 𝛾 ` (𝛼 ⇒ 𝛼) ⇒ 𝛽 is the same as 𝛾 ` 𝛾 and is also proved by the axiom
“use value”.
The proof of the sequent (5.6) is now complete and can be drawn as a tree (see
Figure 5.2). The next step is to convert that proof to Scala code.
To do that, we combine the code expressions that correspond to each of the
proof rules we used. We need to retrace the proof backwards, starting from the
leaves of the tree and going towards the root. We will then combine the corre-
sponding Proof (...) code expressions.
Begin with the left-most leaf: “use value”. That rule gives the code 𝑥 :𝐴 :
Here “given 𝑥 :𝐴 ” means that 𝑥 :𝐴 must be a proof of the premise 𝛼 in the sequent
𝛾, 𝛼 ` 𝛼 (recall that 𝛼 denotes CH ( 𝐴), and so 𝑥 has type 𝐴). We need to use the
same 𝑥 :𝐴 when we write the code for the previous rule, “create function”:
Note that in this code we are able to use a value 𝑥 of type 𝐴 even though no
such value is given as an argument of our function s[A, B]. The reason is that the
sequent 𝛾, 𝛼 ` 𝛼 has an extra premise 𝛼 added to the set of premises at this step of
the proof. Once we are finished with this step, we again will not have any values
of type 𝐴 available. In the code, this corresponds to the local scoping of the bound
value 𝑥 :𝐴 in the function 𝑥 :𝐴 → 𝑥.
Figure 5.2 (proof tree, from the initial sequent down to the leaves):
∅ ⊢ ((𝛼 ⇒ 𝛼) ⇒ 𝛽) ⇒ 𝛽
    (𝛼 ⇒ 𝛼) ⇒ 𝛽 ⊢ 𝛽
        (𝛼 ⇒ 𝛼) ⇒ 𝛽 ⊢ 𝛼 ⇒ 𝛼
            (𝛼 ⇒ 𝛼) ⇒ 𝛽, 𝛼 ⊢ 𝛼
        (𝛼 ⇒ 𝛼) ⇒ 𝛽 ⊢ (𝛼 ⇒ 𝛼) ⇒ 𝛽
We continue tracing the proof tree bottom-up. The right-most leaf “use value”
corresponds to the code 𝑓 :( 𝐴→𝐴)→𝐵 , where 𝑓 is the code corresponding to the
premise 𝛾 = (𝛼 ⇒ 𝛼) ⇒ 𝛽. So, we can write:
The previous rule (“use function”) combines the two preceding proofs:
Going further backwards, we find that the rule applied before “use function” was
“create function”. We need to provide the same 𝑓 :( 𝐴→𝐴)→𝐵 as in the premise above,
and so we obtain the code:
Proof (∅ ` ((𝛼 ⇒ 𝛼) ⇒ 𝛽) ⇒ 𝛽)
= 𝑓 :( 𝐴→𝐴)→𝐵 → Proof ((𝛼 ⇒ 𝛼) ⇒ 𝛽 ` 𝛽) given 𝑓 :( 𝐴→𝐴)→𝐵
= 𝑓 :( 𝐴→𝐴)→𝐵 → 𝑓 (𝑥 :𝐴 → 𝑥) .
def s[A, B]: ((A => A) => B) => B = { (f: (A => A) => B) => f(x => x) }
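To make the correspondence between proof steps and code more visible, the same function can be written with each step given a name. This is a sketch for illustration only; the object name `SDerivation` and the local name `identityOnA` are assumptions made here.

```scala
object SDerivation {
  def s[A, B]: ((A => A) => B) => B = { (f: (A => A) => B) => // "create function": f proves the premise 𝛾
    val identityOnA: A => A = x => x // "create function" over "use value": x is visible only inside this lambda
    f(identityOnA)                   // "use function": apply the proof of 𝛾 to the proof of 𝛼 ⇒ 𝛼
  }
}
```

A possible use: `SDerivation.s[Int, String](g => g(10).toString)` applies the given argument to the identity function and returns `"10"`.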
We found the proof tree in Figure 5.2 by combining various proof rules that
match our sequents. But we had to guess how to apply the “use function” rule: it
was not obvious how to assign the rule’s variable 𝑝. If we somehow find a proof
tree for a sequent, we can derive the corresponding code (perform “code inference”
from type). As we have seen, choosing the proof rules from Table 5.7 requires
guessing or trying different possibilities.
In other words, the rules in Table 5.7 do not provide an algorithm for finding a
proof tree automatically. It turns out that one can replace the rules in Table 5.7 by
a different but equivalent set of derivation rules that do give an algorithm (called
the “LJT algorithm”, see Section 5.2.5 below). That algorithm either finds that the
given formula cannot be proved, or it finds a proof and infers code that has the
given type signature.
The library curryhoward² implements the LJT algorithm. Here are some examples
of using this library for code inference. We will run the ammonite³ shell to load the
library more easily.
As a non-trivial (but artificial) example, consider the type signature:
∀( 𝐴, 𝐵). (((( 𝐴 → 𝐵) → 𝐴) → 𝐴) → 𝐵) → 𝐵 .
It is not obvious whether a function with this type signature exists. The LJT algo-
rithm can figure that out and derive the code automatically. The library does this
via the method implement:
@ import $ivy.`io.chymyst::curryhoward:0.3.8`, io.chymyst.ch._
@ def f[A, B]: ((((A => B) => A) => A) => B) => B = implement
defined function f
@ println(f.lambdaTerm.prettyPrint)
a => a (b => b (c => a (d => c)))
The code 𝑎 → 𝑎 (𝑏 → 𝑏 (𝑐 → 𝑎 (𝑑 → 𝑐))) was produced automatically for the
function f. The function f has been compiled and is ready to be used in any sub-
sequent code.
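One way to convince ourselves that the printed lambda-term has the required type is to paste it into an ordinary Scala definition and let the compiler check it. The sketch below does not use the curryhoward library; the names `InferredCode` and `fManual` are chosen here for illustration.

```scala
object InferredCode {
  // The term a => a (b => b (c => a (d => c))) written as plain Scala.
  // The compiler infers: a: (((A => B) => A) => A) => B,  b: (A => B) => A,  c: A,  d: (A => B) => A.
  def fManual[A, B]: ((((A => B) => A) => A) => B) => B =
    a => a(b => b(c => a(d => c)))
}
```

If this definition compiles, the term is a valid fully parametric implementation of the type signature.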
A compile-time error occurs when no fully parametric function has the given
type signature:
@ def g[A, B]: ((A => B) => A) => A = implement
cmd3.sc:1: type ((A => B) => A) => A cannot be implemented
The logical formula corresponding to this type signature is:
∀(𝛼, 𝛽). ((𝛼 ⇒ 𝛽) ⇒ 𝛼) ⇒ 𝛼 . (5.7)
2 https://github.com/Chymyst/curryhoward
3 http://ammonite.io/#Ammonite-Shell
This formula is known as Peirce’s law⁴ and gives an example showing that the
logic of types in functional programming languages is not Boolean (other exam-
ples are shown in Sections 5.2.6 and 5.5.4). Peirce’s law is true in Boolean logic but
does not hold in the constructive logic, i.e., it cannot be derived using the proof
rules in Table 5.7. If we try to implement g[A, B] with the type signature shown
above via fully parametric code, we will fail to write code that compiles without
type errors. This is because no such code exists, not because we are insufficiently
clever. The LJT algorithm can prove that the given type signature cannot
be implemented. The curryhoward library will then print an error message, and
compilation will fail.
As another example, let us verify that the type signature from Section 5.1.1 is
not implementable:
@ def bad2[A, B, C](g: A => Either[B, C]): Either[A => B, A => C] = implement
cmd4.sc:1: type (A => Either[B, C]) => Either[A => B, A => C] cannot be
implemented
The LJT algorithm will sometimes find several inequivalent proofs of the same
logic formula. In that case, each of the different proofs will be automatically trans-
lated into code. The curryhoward library uses heuristics to try finding the code that
has the least information loss. In many cases, the heuristics will select the imple-
mentation that is most useful to the programmer.
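A small hand-written illustration of inequivalent proofs: the type signature 𝐴 → 𝐴 → 𝐴 has two fully parametric implementations that behave differently. The names below are chosen here; the sketch makes no claim about which implementation the curryhoward library would select.

```scala
object TwoProofs {
  // Both functions have the same type signature A => A => A,
  // but they correspond to different proofs of 𝛼 ⇒ 𝛼 ⇒ 𝛼.
  def first[A]: A => A => A  = x => _ => x // Returns the first argument and discards the second.
  def second[A]: A => A => A = _ => y => y // Returns the second argument and discards the first.
}
```

Each of these functions discards one of its arguments, so neither is clearly preferable from the point of view of information loss.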
The rules of constructive logic and the LJT algorithm define rigorously what
it means to write code “guided by the types”. However, in order to use the LJT
algorithm well, a programmer needs to learn how to infer code from types by
hand. We will practice doing that throughout the book.
𝐴, 𝐵 ∨ 𝐶 ` ( 𝐴 ∧ 𝐵) ∨ 𝐶 .
We expect that this sequent is provable because we can write the corresponding
Scala code:
def f[A, B, C](a: A): Either[B, C] => Either[(A, B), C] = {
case Left(b) => Left((a, b))
case Right(c) => Right(c)
}
4 https://en.wikipedia.org/wiki/Peirce%27s_law
How can we obtain a proof of this sequent via Table 5.7? We could potentially
apply the rules “create Left”, “create Right”, “use Either”, and “use function”. But we
will get stuck at the next step, no matter what rule we choose. Let us see why:
To apply “create Left”, we first need to prove the sequent 𝐴, 𝐵 ∨ 𝐶 ` 𝐴 ∧ 𝐵. But
this sequent cannot be proved: we do not necessarily have values of both types
𝐴 and 𝐵 if we are only given values of type 𝐴 and of type Either[B, C]. To apply
“create Right”, we need to prove the sequent 𝐴, 𝐵 ∨ 𝐶 ` 𝐶. Again, we find that this
sequent cannot be proved. The next choice is the rule “use Either” that matches
any goal of the sequent as the proposition 𝛾. But we are then required to choose
two new propositions (𝛼 and 𝛽) such that we can prove 𝐴, 𝐵 ∨ 𝐶 ` 𝛼 ∨ 𝛽 as well
as 𝐴, 𝐵 ∨ 𝐶, 𝛼 ` ( 𝐴 ∧ 𝐵) ∨ 𝐶 and 𝐴, 𝐵 ∨ 𝐶, 𝛽 ` ( 𝐴 ∧ 𝐵) ∨ 𝐶. It is not clear how we
should choose 𝛼 and 𝛽 in order to make progress with the proof. The remaining
rule, “use function”, similarly requires us to choose a new proposition 𝛼 such that
we can prove 𝐴, 𝐵 ∨ 𝐶 ` 𝛼 and 𝐴, 𝐵 ∨ 𝐶 ` 𝛼 ⇒ (( 𝐴 ∧ 𝐵) ∨ 𝐶). The rules give us no
guidance for choosing 𝛼 appropriately.
The rules in Table 5.2 are not helpful for proof search because the rules “use
function” and “use Either” require us to choose new unknown propositions and to
prove sequents more complicated than the ones we had before.
For instance, the rule “use function” gives a proof of Γ ` 𝛽 only if we first choose
some other proposition 𝛼 and prove the sequents Γ ` 𝛼 and Γ ` 𝛼 ⇒ 𝛽. The rule
does not tell us how to choose the proposition 𝛼 correctly. We need to guess the
correct 𝛼 by trial and error. Even after choosing 𝛼 in some way, we will have to
prove a more complicated sequent (Γ ` 𝛼 ⇒ 𝛽). It is not guaranteed that we are
getting closer to finding the proof of the initial sequent (Γ ` 𝛽).
It is far from obvious how to overcome that difficulty. Mathematicians have
studied the constructive logic for more than 60 years, trying to replace the rules
in Table 5.7 by a different but equivalent set of derivation rules that require no
guessing when looking for a proof. The first partial success came in 1935 with an
algorithm called “LJ”.⁵ The LJ algorithm works in many cases but still has a significant
problem: one of its derivation rules may be applied infinitely many times,
leading to an infinite loop. So, the LJ algorithm is not guaranteed to terminate
without some heuristics for avoiding infinite loops. This problem is solved by a
modification of the LJ algorithm, called LJT, first formulated in 1992.⁶
We will begin with the LJ algorithm. Although that algorithm does not guaran-
tee termination, it is simpler to understand and to apply by hand. Then we will
show how to modify the LJ algorithm in order to obtain the always-terminating
LJT algorithm.
The LJ algorithm Figure 5.3 shows the LJ algorithm’s axioms and derivation
rules. Each rule says that the bottom sequent will be proved if proofs are given for
5 See https://en.wikipedia.org/wiki/Sequent_calculus#Overview
6 An often cited paper by R. Dyckhoff is https://philpapers.org/rec/DYCCSC. For the history
of that research, see https://research-repository.st-andrews.ac.uk/handle/10023/8824
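$$ \frac{}{\Gamma, X \vdash X} \;(\text{Id}) \qquad \frac{}{\Gamma \vdash \top} \;(\text{True}) $$
$$ \frac{\Gamma, A \Rightarrow B \vdash A \qquad \Gamma, B \vdash C}{\Gamma, A \Rightarrow B \vdash C} \;(\text{Left}\Rightarrow) \qquad \frac{\Gamma, A \vdash B}{\Gamma \vdash A \Rightarrow B} \;(\text{Right}\Rightarrow) $$
$$ \frac{\Gamma, A_i \vdash C}{\Gamma, A_1 \wedge A_2 \vdash C} \;(\text{Left}\wedge_i) \qquad \frac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \wedge B} \;(\text{Right}\wedge) $$
$$ \frac{\Gamma, A \vdash C \qquad \Gamma, B \vdash C}{\Gamma, A \vee B \vdash C} \;(\text{Left}\vee) \qquad \frac{\Gamma \vdash A_i}{\Gamma \vdash A_1 \vee A_2} \;(\text{Right}\vee_i) $$
Figure 5.3: Axioms and derivation rules of the LJ algorithm. Each of the rules
“(Left∧𝑖 )” and “(Right∨𝑖 )” has two versions, with 𝑖 = 1 or 𝑖 = 2.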
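$$ S_2 \overset{\text{def}}{=} (\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \alpha \Rightarrow \alpha \quad , \qquad S_3 \overset{\text{def}}{=} \beta \vdash \beta \quad . $$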
Sequent 𝑆3 follows from the “(Id)” axiom, so it remains to prove 𝑆2 . Since 𝑆2 con-
tains an implication both as a premise and as the goal, we may apply either the
rule “(Left ⇒)” or the rule “(Right ⇒)”. We choose to apply “(Left ⇒)” and get two
new sequents:
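$$ S_4 \overset{\text{def}}{=} (\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \alpha \Rightarrow \alpha \quad , \qquad S_5 \overset{\text{def}}{=} \beta \vdash \alpha \Rightarrow \alpha \quad . $$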
Notice that 𝑆4 = 𝑆2 . So, our proof search is getting into an infinite loop trying to
prove the same sequent 𝑆2 over and over again. We can prove 𝑆5 but this will not
help us break the loop.
Once we recognize the problem, we backtrack to the point where we chose to
apply “(Left ⇒)” to 𝑆2 . That was a bad choice, so let us instead apply “(Right ⇒)”
to 𝑆2 . This yields a new sequent 𝑆6 :
$$ S_6 \overset{\text{def}}{=} (\alpha \Rightarrow \alpha) \Rightarrow \beta, \alpha \vdash \alpha \quad . $$
This sequent follows from the “(Id)” axiom. There are no more sequents to prove,
so the proof of 𝑆0 is finished. It can be drawn as a proof tree like this:
𝑆0 → (Right ⇒) → 𝑆1 → (Left ⇒), which branches into:
    𝑆2 → (Right ⇒) → 𝑆6 → (Id)
    𝑆3 → (Id)
The nodes of the proof tree are axioms or derivation rules, and the edges are in-
termediate sequents required by the rules. Some rule nodes branch into several
sequents because some rules require more than one new sequent to be proved.
The leaves of the tree are axioms that do not require proving any further sequents.
Extracting code from proofs According to the Curry-Howard correspondence,
a sequent of the form CH ( 𝐴), CH (𝐵), ..., CH (𝐶) ` CH (𝑋) represents the task of
writing a fully parametric code expression of type 𝑋 that uses some given values
of types 𝐴, 𝐵, ..., 𝐶. The sequent is true (i.e., can be proved) if that code expression
can be found. So, the code serves as an “evidence of proof” for the sequent.
In the previous subsection, we have found a proof of the sequent 𝑆0 , which
represents the task of writing a fully parametric function with type signature
(( 𝐴 → 𝐴) → 𝐵) → 𝐵. Let us now see how we can extract the code of that function
from the proof of the sequent 𝑆0 .
We start from the leaves of the proof tree and move step by step towards the
initial sequent. At each step, we shorten the proof tree by replacing some sequent
by its corresponding evidence-of-proof code. Eventually we will replace the initial
sequent by its corresponding code. Let us see how this procedure works for the
proof tree of the sequent 𝑆0 shown in the previous section.
Since the leaves are axioms, let us write the code corresponding to each axiom
of LJ:
we will assume that the propositions 𝛼 and 𝛽 correspond to types 𝐴 and 𝐵; that
is, 𝛼 ≝ CH (𝐴) and 𝛽 ≝ CH (𝐵).
The leaves in the proof tree for 𝑆0 are the “(Id)” axioms used to prove the sequents
𝑆3 and 𝑆6 . Let us write the code that serves as the “evidence of proof” for
these sequents. For brevity, we denote 𝛾 ≝ (𝛼 ⇒ 𝛼) ⇒ 𝛽 and 𝐶 ≝ (𝐴 → 𝐴) → 𝐵,
so that 𝛾 = CH (𝐶). Then we can write:
$$ S_3 \overset{\text{def}}{=} \beta \vdash \beta \quad , \qquad \text{Proof}(S_3)_{\text{given } y^{:B}} = y \quad , $$
$$ S_6 \overset{\text{def}}{=} \gamma, \alpha \vdash \alpha \quad , \qquad \text{Proof}(S_6)_{\text{given } q^{:C},\, x^{:A}} = x \quad . $$
Note that the proof of 𝑆6 does not use the first given value 𝑞 :𝐶 (corresponding to
the premise 𝛾).
We now shorten the proof tree by replacing the sequents 𝑆3 and 𝑆6 by their
“evidence of proof”:
𝑆0 → (Right ⇒) → 𝑆1 → (Left ⇒), which branches into:
    𝑆2 → (Right ⇒) → (𝑥) given 𝑞 :𝐶 , 𝑥 :𝐴
    (𝑦) given 𝑦 :𝐵
The next step is to consider the proof of 𝑆2 , which is found by applying the rule
“(Right ⇒)”. This rule promises to give a proof of 𝑆2 if we have a proof of 𝑆6 .
In order to extract code from that rule, we can write a function that transforms a
proof of 𝑆6 into a proof of 𝑆2 . That function is the proof transformer corresponding
to the rule “(Right ⇒)”. That rule and its transformer are defined as:
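$$ \frac{\Gamma, A \vdash B}{\Gamma \vdash A \Rightarrow B} \;(\text{Right}\Rightarrow) : \qquad \text{Proof}(\Gamma \vdash A \Rightarrow B)_{\text{given } p^{:\Gamma}} = x^{:A} \to \text{Proof}(\Gamma, A \vdash B)_{\text{given } p^{:\Gamma},\, x^{:A}} \quad . $$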
Applying the proof transformer to the known proof of 𝑆6 , we obtain a proof of 𝑆2 :
Proof (𝑆2 )given 𝑞 :𝐶 = 𝑥 :𝐴 → Proof (𝑆6 )given 𝑞 :𝐶 , 𝑥 : 𝐴 = (𝑥 :𝐴 → 𝑥)given 𝑞 :𝐶 .
The proof tree can be now shortened to:
𝑆0 → (Right ⇒) → 𝑆1 → (Left ⇒), which branches into:
    (𝑥 :𝐴 → 𝑥) given 𝑞 :𝐶
    (𝑦) given 𝑦 :𝐵
The next step is to get the proof of 𝑆1 obtained by applying the rule “(Left ⇒)”.
That rule requires two previous sequents, so its transformer is a function of two
previously obtained proofs:
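$$ \frac{\Gamma, A \Rightarrow B \vdash A \qquad \Gamma, B \vdash C}{\Gamma, A \Rightarrow B \vdash C} \;(\text{Left}\Rightarrow) : $$
$$ \text{Proof}(\Gamma, A \Rightarrow B \vdash C)_{\text{given } p^{:\Gamma},\, q^{:A \to B}} = \text{Proof}(\Gamma, B \vdash C)_{\text{given } p^{:\Gamma},\, b^{:B}} $$
$$ \text{where} \quad b^{:B} \overset{\text{def}}{=} q\big( \text{Proof}(\Gamma, A \Rightarrow B \vdash A)_{\text{given } p^{:\Gamma},\, q^{:A \to B}} \big) \quad . $$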
In the proof tree shown above, we obtain a proof of 𝑆1 by applying that proof
transformer to the proofs of 𝑆2 and 𝑆3 :
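$$ \text{Proof}(S_1)_{\text{given } q^{:C}} = \text{Proof}(S_3)_{\text{given } b^{:B}} \quad \text{where} \quad b^{:B} \overset{\text{def}}{=} q\big(\text{Proof}(S_2)\big)_{\text{given } q^{:C}} $$
$$ = b \quad \text{where} \quad b^{:B} \overset{\text{def}}{=} q(x^{:A} \to x)_{\text{given } q^{:C}} \quad = q(x^{:A} \to x)_{\text{given } q^{:C}} \quad . $$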
Substituting this proof into the proof tree, we shorten the tree to:
𝑆0 → (Right ⇒) → 𝑞(𝑥 :𝐴 → 𝑥) given 𝑞 :𝐶
It remains to obtain the proof of 𝑆0 by applying the proof transformer of the rule
“(Right ⇒)”:
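$$ \text{Proof}(S_0) = \text{Proof}\big(\emptyset \vdash ((\alpha \Rightarrow \alpha) \Rightarrow \beta) \Rightarrow \beta\big) $$
$$ = q^{:(A \to A) \to B} \to \text{Proof}(S_1)_{\text{given } q^{:C}} = q^{:(A \to A) \to B} \to q(x^{:A} \to x) \quad . $$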
The proof tree is now shortened to just the code 𝑞 :( 𝐴→𝐴)→𝐵 → 𝑞(𝑥 :𝐴 → 𝑥), which
has type (( 𝐴 → 𝐴) → 𝐵) → 𝐵. So, that code is an evidence of proof for 𝑆0 . In
this way, we have derived the code of a fully parametric function from its type
signature.
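The two proof transformers used in this derivation can be sketched as Scala higher-order functions. In this sketch, a proof of a sequent “given” some premises is modeled as a function from the premise values to the goal value, and the set of premises Γ is modeled by a single type parameter `G` (with `Unit` standing for the empty set). These encoding choices, and the names `rightImp`, `leftImp`, and `s0`, are assumptions made here for illustration.

```scala
object ProofTransformers {
  // (Right ⇒): a proof of Γ, A ⊢ B is transformed into a proof of Γ ⊢ A ⇒ B.
  def rightImp[G, A, B](proof: (G, A) => B): G => A => B =
    p => x => proof(p, x)

  // (Left ⇒): proofs of Γ, A ⇒ B ⊢ A and Γ, B ⊢ C are transformed into a proof of Γ, A ⇒ B ⊢ C.
  def leftImp[G, A, B, C](proofA: (G, A => B) => A, proofC: (G, B) => C): (G, A => B) => C =
    (p, q) => proofC(p, q(proofA(p, q)))

  // Retrace the proof of S0 from the leaves towards the root.
  def s0[A, B]: ((A => A) => B) => B = {
    type C = (A => A) => B
    val s6: (C, A) => A = (_, x) => x     // (Id) axiom for S6; the premise q is unused.
    val s3: (Unit, B) => B = (_, y) => y  // (Id) axiom for S3.
    val s2: C => A => A = rightImp(s6)    // (Right ⇒) gives the proof of S2.
    val s1: (Unit, C) => B =              // (Left ⇒) combines the proofs of S2 and S3.
      leftImp[Unit, A => A, B, B]((_, q) => s2(q), s3)
    q => s1((), q)                        // (Right ⇒) with an empty Γ gives the proof of S0.
  }
}
```

Evaluating `ProofTransformers.s0[Int, String](g => g(3).toString)` gives `"3"`, the same result as for the simplified code 𝑞 → 𝑞(𝑥 → 𝑥).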
Figure 5.4 shows the proof transformers for all the rules of the LJ algorithm.
Apart from the special rule “(Left ⇒)”, all other rules have proof transformers us-
ing just one of the code constructions (“create function”, “create tuple”, “use tuple”,
etc.) allowed within fully parametric code.
The LJT algorithm As we have seen, the LJ algorithm enters a loop if the rule
“(Left ⇒)” gives a sequent we already had at a previous step. That rule requires
us to prove two new sequents:
$$ \frac{\Gamma, A \Rightarrow B \vdash A \qquad \Gamma, B \vdash C}{\Gamma, A \Rightarrow B \vdash C} \;(\text{Left}\Rightarrow) \quad . $$
A sign of trouble is that the first of these sequents (Γ, 𝐴 ⇒ 𝐵 ` 𝐴) does not have
a simpler form than the initial sequent (Γ, 𝐴 ⇒ 𝐵 ` 𝐶). So, it is not clear that we
are getting closer to completing the proof. If 𝐴 = 𝐶, the new sequent will simply
repeat the initial sequent, immediately creating a loop.
In some cases, a repeated sequent will occur after more than one step. It is not
easy to formulate rigorous conditions for stopping the loop or for avoiding the
rule “(Left ⇒)”.
The LJT algorithm solves this problem by removing the rule “(Left ⇒)” from the
LJ algorithm. Instead, four new rules are introduced. Each of these rules contains
a different pattern instead of 𝐴 in the premise 𝐴 ⇒ 𝐶:
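$$ \frac{\Gamma, A, B \vdash D}{\Gamma, A, A \Rightarrow B \vdash D} \;(\text{Left}\Rightarrow_A) \;\; (A \text{ is atomic}) \qquad \frac{\Gamma, A \Rightarrow B \Rightarrow C \vdash D}{\Gamma, (A \wedge B) \Rightarrow C \vdash D} \;(\text{Left}\Rightarrow_\wedge) $$
$$ \frac{\Gamma, B \Rightarrow C \vdash A \Rightarrow B \qquad \Gamma, C \vdash D}{\Gamma, (A \Rightarrow B) \Rightarrow C \vdash D} \;(\text{Left}\Rightarrow_\Rightarrow) \qquad \frac{\Gamma, A \Rightarrow C, B \Rightarrow C \vdash D}{\Gamma, (A \vee B) \Rightarrow C \vdash D} \;(\text{Left}\Rightarrow_\vee) $$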
The rule “Left ⇒ 𝐴 ” applies only if the implication starts with an “atomic” type
expression, i.e., a single type parameter or a unit type. In all other cases, the
implication must start with a conjunction, a disjunction, or an implication, which
means that one of the three remaining rules will apply.
The LJT algorithm retains all the rules in Figure 5.4 except the rule “(Left ⇒)”,
which is replaced by the four new rules. It is far from obvious that the new rules
are equivalent to the old ones. It took mathematicians several decades to come up
with the LJT rules and to prove their validity. This book will rely on that result
and will not attempt to prove it.
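To see how the four new rules remove the need for guessing, here is a minimal sketch of a provability checker in the style of LJT. It only answers whether a sequent can be proved and does not extract code. It is not the curryhoward implementation. The simplifying assumptions are: the unit type is omitted, premises are kept in a `Set`, and the left rule for conjunctions adds both parts of the conjunction to the premises. All names are chosen here for illustration.

```scala
object LjtSketch {
  sealed trait Formula
  final case class Atom(name: String) extends Formula
  final case class Imp(a: Formula, b: Formula) extends Formula
  final case class And(a: Formula, b: Formula) extends Formula
  final case class Or(a: Formula, b: Formula) extends Formula

  // Returns true if the sequent (ctx ⊢ goal) has a proof. Every recursive call
  // receives a simpler sequent, so this backtracking search terminates.
  def provable(ctx: Set[Formula], goal: Formula): Boolean = {
    val byRightRules: Boolean = goal match {
      case Atom(_)   => ctx.contains(goal)                     // (Id)
      case Imp(a, b) => provable(ctx + a, b)                   // (Right ⇒)
      case And(a, b) => provable(ctx, a) && provable(ctx, b)   // (Right ∧)
      case Or(a, b)  => provable(ctx, a) || provable(ctx, b)   // (Right ∨), both versions
    }
    byRightRules || ctx.exists { premise =>
      val rest = ctx - premise
      premise match {
        case Atom(_)   => false
        case And(a, b) => provable(rest + a + b, goal)                          // (Left ∧)
        case Or(a, b)  => provable(rest + a, goal) && provable(rest + b, goal)  // (Left ∨)
        case Imp(Atom(x), b) =>                                                 // (Left ⇒ A), A atomic
          rest.contains(Atom(x)) && provable(rest + b, goal)
        case Imp(And(a, b), c) => provable(rest + Imp(a, Imp(b, c)), goal)      // (Left ⇒ ∧)
        case Imp(Or(a, b), c)  => provable(rest + Imp(a, c) + Imp(b, c), goal)  // (Left ⇒ ∨)
        case Imp(Imp(a, b), c) =>                                               // (Left ⇒ ⇒)
          provable(rest + Imp(b, c), Imp(a, b)) && provable(rest + c, goal)
      }
    }
  }

  val a: Formula = Atom("A")
  val b: Formula = Atom("B")
  val s0: Formula = Imp(Imp(Imp(a, a), b), b)      // ((𝛼 ⇒ 𝛼) ⇒ 𝛽) ⇒ 𝛽
  val peirce: Formula = Imp(Imp(Imp(a, b), a), a)  // ((𝛼 ⇒ 𝛽) ⇒ 𝛼) ⇒ 𝛼
}
```

With these definitions, `LjtSketch.provable(Set.empty, LjtSketch.s0)` is expected to return `true`, while the same call with `LjtSketch.peirce` is expected to return `false`, in agreement with the discussion of Peirce's law.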
The proof transformers for the new rules are shown in Figure 5.5. Figures 5.4–5.5
define the set of proof transformers sufficient for using the LJT algorithm in prac-
tice. The curryhoward library7 implements those proof transformers.
The most complicated of the new rules is the rule “(Left ⇒⇒ )”. It is far from
obvious why the rule Left ⇒⇒ is useful or even correct. This rule is based on a
non-trivial logic identity:
(( 𝐴 → 𝐵) → 𝐶) → ( 𝐴 → 𝐵) ⇐⇒ (𝐵 → 𝐶) → ( 𝐴 → 𝐵) .
(( 𝐴 → 𝐵) → 𝐶) → 𝐵 → 𝐶 .
𝑓 = 𝑘 :( 𝐴→𝐵)→𝐶 → 𝑏 :𝐵 → 𝑘 (_:𝐴 → 𝑏) .
The function 𝑓 occurs in the proof transformer for the rule Left ⇒⇒ (shown below
in Figure 5.5). Note that this 𝑓 applies 𝑘 to a function (_ → 𝑏) that ignores its
argument. We expect to be able to simplify the resulting expression at the place
where (_ → 𝑏) is applied to some argument expression, which we can then ignore.
For this reason, applying the transformer for the rule Left ⇒⇒ results in evidence-
of-proof code that is longer than the code obtained via LJ’s rule transformers. The
code obtained via the LJT algorithm needs to be simplified symbolically.
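The function 𝑓 translates directly into Scala. The sketch below only restates the formula for 𝑓 in Scala syntax; the object name `LeftImpImp` is chosen here.

```scala
object LeftImpImp {
  // f = k => b => k(_ => b): converts a value of type (A => B) => C into a value of type B => C.
  def f[A, B, C]: ((A => B) => C) => B => C =
    k => b => k(_ => b) // The function (_ => b) ignores its argument of type A.
}
```

For example, `LeftImpImp.f[Int, Int, String](h => h(0).toString)(7)` returns `"7"`: whatever argument is passed to the function `_ => 7` is ignored.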
As an example of using the LJT algorithm, we again prove the sequent from the
previous section: 𝑆0 = ∅ ` ((𝛼 ⇒ 𝛼) ⇒ 𝛽) ⇒ 𝛽. At each step, only one LJT rule
applies to each sequent. The initial part of the proof tree looks like this:
∅ ⊢ ((𝛼 ⇒ 𝛼) ⇒ 𝛽) ⇒ 𝛽 → (Right ⇒) → (𝛼 ⇒ 𝛼) ⇒ 𝛽 ⊢ 𝛽 → (Left ⇒⇒ ), which branches into:
    𝛼 ⇒ 𝛽 ⊢ 𝛼 ⇒ 𝛼
    𝛽 ⊢ 𝛽
The proofs for the sequents 𝛽 ` 𝛽 and 𝛼 ⇒ 𝛽 ` 𝛼 ⇒ 𝛼 are the same as before:
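$$ \frac{\Gamma, A, B \vdash D}{\Gamma, A, A \Rightarrow B \vdash D} \;(\text{Left}\Rightarrow_A) : \qquad \text{Proof}(\Gamma, A, A \Rightarrow B \vdash D)_{\text{given } p^{:\Gamma},\, x^{:A},\, q^{:A \to B}} = \text{Proof}(\Gamma, A, B \vdash D)_{\text{given } p,\, x,\, q(x)} $$
$$ \frac{\Gamma, A \Rightarrow B \Rightarrow C \vdash D}{\Gamma, (A \wedge B) \Rightarrow C \vdash D} \;(\text{Left}\Rightarrow_\wedge) : \qquad \text{Proof}(\Gamma, (A \wedge B) \Rightarrow C \vdash D)_{\text{given } p^{:\Gamma},\, q^{:A \times B \to C}} = \text{Proof}(\Gamma, A \Rightarrow B \Rightarrow C \vdash D)_{\text{given } p,\, (a^{:A} \to b^{:B} \to q(a \times b))} $$
$$ \frac{\Gamma, A \Rightarrow C, B \Rightarrow C \vdash D}{\Gamma, (A \vee B) \Rightarrow C \vdash D} \;(\text{Left}\Rightarrow_\vee) : \qquad \text{Proof}(\Gamma, (A \vee B) \Rightarrow C \vdash D)_{\text{given } p^{:\Gamma},\, q^{:A + B \to C}} = \text{Proof}(\Gamma, A \Rightarrow C, B \Rightarrow C \vdash D)_{\text{given } p,\, r,\, s} $$
$$ \text{where} \quad r \overset{\text{def}}{=} a^{:A} \to q(a + 0) \quad \text{and} \quad s \overset{\text{def}}{=} b^{:B} \to q(0 + b) $$
$$ \frac{\Gamma, B \Rightarrow C \vdash A \Rightarrow B \qquad \Gamma, C \vdash D}{\Gamma, (A \Rightarrow B) \Rightarrow C \vdash D} \;(\text{Left}\Rightarrow_\Rightarrow) : \qquad \text{Proof}(\Gamma, (A \Rightarrow B) \Rightarrow C \vdash D)_{\text{given } p^{:\Gamma},\, q^{:(A \to B) \to C}} = \text{Proof}(\Gamma, C \vdash D)_{\text{given } p,\, c} $$
$$ \text{where} \quad c^{:C} \overset{\text{def}}{=} q\big( \text{Proof}(\Gamma, B \Rightarrow C \vdash A \Rightarrow B)_{\text{given } p,\, r} \big) \quad \text{and} \quad r^{:B \to C} \overset{\text{def}}{=} b^{:B} \to q(\_^{:A} \to b) $$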
Figure 5.5: Proof transformers for the four new rules of the LJT algorithm.
Substituting these proofs into the proof transformer of the rule “(Left ⇒⇒ )” pro-
duces this code:
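$$ \text{Proof}((\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \beta)_{\text{given } q^{:(A \to A) \to B}} = q\big( \text{Proof}(\alpha \Rightarrow \beta \vdash \alpha \Rightarrow \alpha)_{\text{given } r^{:A \to B}} \big) \quad \text{where} \quad r^{:A \to B} = a^{:A} \to q(\_^{:A} \to a) $$
$$ = q(x^{:A} \to x) \quad . $$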
The proof of 𝛼 ⇒ 𝛽 ` 𝛼 ⇒ 𝛼 does not actually use the intermediate value 𝑟 :𝐴→𝐵
provided by the proof transformer. As a symbolic simplification step, we may
simply omit the code of 𝑟. The curryhoward library always performs symbolic sim-
plification after applying the LJT algorithm.
The reason the LJT algorithm terminates is that each rule replaces a given sequent
by one or more sequents with simpler premises or goals.⁸ This guarantees
that the proof search will terminate either with a complete proof or with a sequent
8 The paper http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.35.2618 shows
that the LJT algorithm terminates by giving an explicit decreasing measure on sequents.
$$ (\alpha \Rightarrow \beta) \overset{\text{def}}{=} ((\neg\alpha) \vee \beta) \quad . \qquad (5.8) $$
To verify whether a formula is true in the Boolean logic, we can substitute either
𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒 into every variable and check if the formula has the value 𝑇𝑟𝑢𝑒 in
all possible cases. The result can be arranged into a truth table. The formula is
true if all values in its truth table are 𝑇𝑟𝑢𝑒.
Disjunction, conjunction, negation, and implication operations are described by
this truth table:
Using this table, we find that the formula 𝛼 ⇒ 𝛼 has the value 𝑇𝑟𝑢𝑒 in all cases,
whether 𝛼 itself is 𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒. This check is sufficient to show that ∀𝛼. 𝛼 ⇒ 𝛼
is true in Boolean logic.
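The truth-table check is easy to automate by enumerating all Boolean values of the variables. The following sketch uses names chosen here (`TruthTables`, `implies`, `isTautology1`, `isTautology2`) and defines the Boolean implication as in Eq. (5.8).

```scala
object TruthTables {
  // Boolean implication defined via Eq. (5.8): (a ⇒ b) = (¬a) ∨ b.
  def implies(a: Boolean, b: Boolean): Boolean = !a || b

  val booleans: List[Boolean] = List(true, false)

  // A formula with one variable is true if it evaluates to true in all rows of its truth table.
  def isTautology1(formula: Boolean => Boolean): Boolean =
    booleans.forall(formula)

  // The same check for a formula with two variables (four rows in the truth table).
  def isTautology2(formula: (Boolean, Boolean) => Boolean): Boolean =
    booleans.forall(a => booleans.forall(b => formula(a, b)))
}
```

With these definitions, `TruthTables.isTautology1(a => TruthTables.implies(a, a))` confirms that ∀𝛼. 𝛼 ⇒ 𝛼 is true in Boolean logic.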
Here is the truth table for the formulas ∀(𝛼, 𝛽). (𝛼 ∧ 𝛽) ⇒ 𝛼 and ∀(𝛼, 𝛽). 𝛼 ⇒
(𝛼 ∧ 𝛽). The first formula is true since all values in its column are 𝑇𝑟𝑢𝑒, while the
second formula is not true since one value in the last column is 𝐹𝑎𝑙𝑠𝑒:
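𝛼        𝛽        𝛼 ∧ 𝛽    (𝛼 ∧ 𝛽) ⇒ 𝛼    𝛼 ⇒ (𝛼 ∧ 𝛽)
𝑇𝑟𝑢𝑒     𝑇𝑟𝑢𝑒     𝑇𝑟𝑢𝑒     𝑇𝑟𝑢𝑒            𝑇𝑟𝑢𝑒
𝑇𝑟𝑢𝑒     𝐹𝑎𝑙𝑠𝑒    𝐹𝑎𝑙𝑠𝑒    𝑇𝑟𝑢𝑒            𝐹𝑎𝑙𝑠𝑒
𝐹𝑎𝑙𝑠𝑒    𝑇𝑟𝑢𝑒     𝐹𝑎𝑙𝑠𝑒    𝑇𝑟𝑢𝑒            𝑇𝑟𝑢𝑒
𝐹𝑎𝑙𝑠𝑒    𝐹𝑎𝑙𝑠𝑒    𝐹𝑎𝑙𝑠𝑒    𝑇𝑟𝑢𝑒            𝑇𝑟𝑢𝑒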
Table 5.3 shows more examples of logical formulas that are true in Boolean logic.
Each formula is first written in terms of CH -propositions (we denote 𝛼 ≝ CH (𝐴)
and 𝛽 ≝ CH (𝐵) for brevity) and then as a Scala type signature of a function. So,
all these type signatures can be implemented.
Table 5.3: Examples of logical formulas that are true theorems in Boolean logic.
Table 5.4 shows some examples of formulas that are not true in Boolean logic.
Translated into type formulas and then into Scala, these formulas yield type sig-
natures that cannot be implemented by fully parametric functions.
Table 5.4: Examples of logical formulas that are not true in Boolean logic.
At first sight, it may appear from these examples that whenever a formula is
true in Boolean logic, the corresponding type signature can be implemented in
code, and vice versa. However, this is incorrect: the rules of Boolean logic are not
fully suitable for reasoning about types in a functional language. False Boolean
formulas do correspond to unimplementable type signatures. But not all true
∀( 𝐴, 𝐵, 𝐶). ( 𝐴 → 𝐵 + 𝐶) → ( 𝐴 → 𝐵) + ( 𝐴 → 𝐶) , (5.9)
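The function bad3 cannot be implemented via fully parametric code, as we already
discussed in Section 5.1.1. Now, the type signature (5.9) gives this CH -proposition:
$$ \forall(\alpha, \beta, \gamma).\; (\alpha \Rightarrow (\beta \vee \gamma)) \Rightarrow ((\alpha \Rightarrow \beta) \vee (\alpha \Rightarrow \gamma)) \quad . \qquad (5.10) $$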
It turns out that this formula is true in Boolean logic. To prove this, we need to
show that Eq. (5.10) is equal to 𝑇𝑟𝑢𝑒 for any Boolean values of the variables 𝛼, 𝛽,
𝛾. One way is to rewrite the expression (5.10) using the rules of Boolean logic:
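$$ \alpha \Rightarrow (\beta \vee \gamma) $$
$$ \text{definition of} \Rightarrow \text{via Eq. (5.8)}: \qquad = (\neg\alpha) \vee \beta \vee \gamma \quad , $$
$$ (\alpha \Rightarrow \beta) \vee (\alpha \Rightarrow \gamma) $$
$$ \text{definition of} \Rightarrow \text{via Eq. (5.8)}: \qquad = (\neg\alpha) \vee \beta \vee (\neg\alpha) \vee \gamma $$
$$ \text{property } x \vee x = x \text{ in Boolean logic}: \qquad = (\neg\alpha) \vee \beta \vee \gamma \quad . $$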
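The same conclusion can be cross-checked by brute force over all eight combinations of Boolean values. This is a sketch with names chosen here; it checks only the Boolean formula and says nothing about implementability of the type signature.

```scala
object BooleanCheck {
  // Boolean implication defined via Eq. (5.8).
  def implies(a: Boolean, b: Boolean): Boolean = !a || b

  val booleans: List[Boolean] = List(true, false)

  // Both sides of the implication reduce to (¬𝛼) ∨ 𝛽 ∨ 𝛾, so they agree in every case.
  val sidesAreEqual: Boolean =
    booleans.forall { a => booleans.forall { b => booleans.forall { c =>
      implies(a, b || c) == (implies(a, b) || implies(a, c))
    }}}

  // Therefore the formula (𝛼 ⇒ (𝛽 ∨ 𝛾)) ⇒ ((𝛼 ⇒ 𝛽) ∨ (𝛼 ⇒ 𝛾)) is true in all eight cases.
  val formulaHoldsInAllCases: Boolean =
    booleans.forall { a => booleans.forall { b => booleans.forall { c =>
      implies(implies(a, b || c), implies(a, b) || implies(a, c))
    }}}
}
```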
With the CH correspondence in mind, we may say that the existence of the code
x => _ => x with the type 𝐴 → (𝐵 → 𝐴) “is” a proof of the logical formula (5.11),
because it shows how to compute a value of type ∀( 𝐴, 𝐵). 𝐴 → 𝐵 → 𝐴.
The Curry-Howard correspondence maps logic formulas such as (𝛼 ∨ 𝛽) ∧ 𝛾 into
type expressions such as ( 𝐴 + 𝐵) × 𝐶. We have seen that types behave similarly to
logic formulas in one respect: A logic formula is a true theorem of constructive
logic when the corresponding type signature can be implemented as a fully para-
metric function, and vice versa.
It turns out that the similarity ends here. In other respects, type expressions
behave as arithmetic expressions and not as logic formulas. For this reason, the
type notation used in this book denotes disjunctive types by 𝐴 + 𝐵 and tuples by
𝐴 × 𝐵, which is designed to remind us of arithmetic expressions (such as 1 + 2 and
2 × 3) rather than of logical formulas (such as 𝐴 ∨ 𝐵 and 𝐴 ∧ 𝐵).
An important use of the type notation is for writing equations with types. Can
we use the arithmetic intuition for writing type equations such as:
( 𝐴 + 𝐵) × 𝐶 = 𝐴 × 𝐶 + 𝐵 × 𝐶 ? (5.12)
In this section, we will learn how to check whether one type expression is equiv-
alent to another.
∀( 𝐴, 𝐵, 𝐶). ( 𝐴 ∨ 𝐵) ∧ 𝐶 = ( 𝐴 ∧ 𝐶) ∨ (𝐵 ∧ 𝐶) . (5.13)
∀( 𝐴, 𝐵, 𝐶). ( 𝐴 ∨ 𝐵) ∧ 𝐶 ⇒ ( 𝐴 ∧ 𝐶) ∨ (𝐵 ∧ 𝐶) , (5.14)
∀( 𝐴, 𝐵, 𝐶). ( 𝐴 ∧ 𝐶) ∨ (𝐵 ∧ 𝐶) ⇒ ( 𝐴 ∨ 𝐵) ∧ 𝐶 . (5.15)
9 See https://en.wikipedia.org/wiki/Distributive_property#Rule_of_replacement
𝑓1𝐴,𝐵,𝐶 : ( 𝐴 + 𝐵) × 𝐶 → 𝐴 × 𝐶 + 𝐵 × 𝐶 ,
𝑓2𝐴,𝐵,𝐶 : 𝐴 × 𝐶 + 𝐵 × 𝐶 → ( 𝐴 + 𝐵) × 𝐶 .
Since the two logical formulas (5.14)–(5.15) are true theorems in constructive logic,
we expect to be able to implement the functions f1 and f2. We could use the
proof rules of the LJT algorithm to obtain proofs of Eqs. (5.14)–(5.15) and to derive
implementations of f1 and f2. Instead, let us exercise our intuition and write the
Scala code directly.
To implement f1, we need to perform pattern matching on the argument:
def f1[A, B, C]: ((Either[A, B], C)) => Either[(A, C), (B, C)] = {
case (Left(a), c) => Left((a, c)) // No other choice here.
case (Right(b), c) => Right((b, c)) // No other choice here.
}
In both cases, we have only one possible expression of the correct type.
Similarly, the implementation of f2 leaves us no choices:
def f2[A, B, C]: Either[(A, C), (B, C)] => (Either[A, B], C) = {
case Left((a, c)) => (Left(a), c) // No other choice here.
case Right((b, c)) => (Right(b), c) // No other choice here.
}
The code of f1 and f2 never discards any given values; in other words, these
functions appear to preserve information. We can formulate this property rig-
orously as a requirement that an arbitrary value x: (Either[A, B], C) be mapped
by f1 to some value y: Either[(A, C), (B, C)] and then mapped by f2 back to the
same value x. Similarly, any value y of type Either[(A, C), (B, C)] should be trans-
formed by f2 and then by f1 back to the same value y.
Let us write those conditions as equations:
If these equations hold, it means that all the information in a value 𝑥 :(𝐴+𝐵)×𝐶 is
completely preserved inside the value 𝑦 ≝ 𝑓1 (𝑥); the original value 𝑥 can be recovered
as 𝑥 = 𝑓2 (𝑦). Then the function 𝑓1 is the inverse of 𝑓2 . Conversely, all
the information in a value 𝑦 :𝐴×𝐶+𝐵×𝐶 is preserved inside 𝑥 ≝ 𝑓2 (𝑦) and can be
recovered by applying 𝑓1 . Since the values 𝑥 :(𝐴+𝐵)×𝐶 and 𝑦 :𝐴×𝐶+𝐵×𝐶 are arbitrary,
it will follow that the data types themselves, ( 𝐴 + 𝐵) × 𝐶 and 𝐴 × 𝐶 + 𝐵 × 𝐶, carry
equivalent information. Such types are called equivalent or isomorphic.
Generally, we say that types 𝑃 and 𝑄 are equivalent or isomorphic (denoted
𝑃 ≅ 𝑄) when there exist functions 𝑓1 : 𝑃 → 𝑄 and 𝑓2 : 𝑄 → 𝑃 that are inverses of
each other. We can write that using the notation ( 𝑓1 # 𝑓2 )(𝑥) ≝ 𝑓2 ( 𝑓1 (𝑥)) as:
𝑓1 # 𝑓2 = id , 𝑓2 # 𝑓1 = id .
5.3 Equivalence of types
(In Scala, the forward composition 𝑓1 # 𝑓2 is the function f1 andThen f2. We omit
type annotations since we already checked that the types match.) If these condi-
tions hold, there is a one-to-one correspondence between values of types 𝑃 and 𝑄.
In other words, the data types 𝑃 and 𝑄 carry equivalent information.
To verify that the Scala functions f1 and f2 defined above are inverses of each
other, we first check if 𝑓1 # 𝑓2 = id. Applying 𝑓1 # 𝑓2 means to apply 𝑓1 and then
to apply 𝑓2 to the result. Begin by applying 𝑓1 to an arbitrary value 𝑥 :( 𝐴+𝐵)×𝐶 . A
value 𝑥 of that type can be in only one of the two disjoint cases: a tuple (Left(a),
c) or a tuple (Right(b), c), for some values a:A, b:B, and c:C. The Scala code of f1
maps these tuples to Left((a, c)) and to Right((b, c)) respectively; we can see this
directly from the code of f1. We then apply 𝑓2 to those values, which maps them
back to a tuple (Left(a), c) or to a tuple (Right(b), c) respectively, according to
the code of f2. These tuples are exactly the value 𝑥 we started with. So, applying
𝑓1 # 𝑓2 to an arbitrary 𝑥 :(𝐴+𝐵)×𝐶 returns that value 𝑥. This is the same as saying that
𝑓1 # 𝑓2 = id.
To check whether 𝑓2 # 𝑓1 = id, we apply 𝑓2 to an arbitrary value 𝑦 :𝐴×𝐶+𝐵×𝐶 , which
must be one of the two disjoint cases, Left((a, c)) or Right((b, c)). The code of
f2 maps these two cases into tuples (Left(a), c) and (Right(b), c) respectively.
Then we apply f1 and map these tuples back to Left((a, c)) and Right((b, c))
respectively. It follows that applying 𝑓2 and then 𝑓1 will always return the initial
value. As a formula, this is written as 𝑓2 # 𝑓1 = id.
By looking at the code of f1 and f2, we can directly observe that these functions
are inverses of each other: the tuple pattern (Left(a), c) is mapped to Left((a,
c)), and the pattern (Right(b), c) to Right((b, c)), or vice versa. It is visually clear
that no information is lost and that the original values are returned by function
compositions 𝑓1 # 𝑓2 or 𝑓2 # 𝑓1 .
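The round-trip argument above can also be spot-checked on sample values. The following is a minimal sketch: the code of f1 and f2 is reconstructed here from the description in the text (it is an assumption that it matches the earlier definitions), and the sample lists xs and ys are illustrative names that cover both disjoint cases of each type.

```scala
// f1 and f2 as described in the text: they move the value c in and out of the disjunction.
def f1[A, B, C]: ((Either[A, B], C)) => Either[(A, C), (B, C)] = {
  case (Left(a), c)  => Left((a, c))
  case (Right(b), c) => Right((b, c))
}

def f2[A, B, C]: Either[(A, C), (B, C)] => ((Either[A, B], C)) = {
  case Left((a, c))  => (Left(a), c)
  case Right((b, c)) => (Right(b), c)
}

// Sample values (hypothetical), one for each disjoint case.
val xs: List[(Either[Int, String], Boolean)] = List((Left(1), true), (Right("b"), false))
val ys: List[Either[(Int, Boolean), (String, Boolean)]] = List(Left((1, true)), Right(("b", false)))

// Apply f1 and then f2 to each x, and f2 and then f1 to each y.
val roundTripX = xs.map(f1[Int, String, Boolean].andThen(f2[Int, String, Boolean]))
val roundTripY = ys.map(f2[Int, String, Boolean].andThen(f1[Int, String, Boolean]))
```

Both round trips return the original lists. A check on sample values is only an illustration and does not replace the proof, which covers all values of the given types.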
We find that the logical identity (5.13) leads to an equivalence of the correspond-
ing types:
( 𝐴 + 𝐵) × 𝐶 ≅ 𝐴 × 𝐶 + 𝐵 × 𝐶 . (5.16)
To get Eq. (5.16) from Eq. (5.13), we need to convert a logical formula to an arith-
metic expression by replacing the disjunction operations ∨ by + and the conjunc-
tions ∧ by × everywhere.
As another example of a logical identity, consider the associativity law for con-
junction:
(𝛼 ∧ 𝛽) ∧ 𝛾 = 𝛼 ∧ (𝛽 ∧ 𝛾) . (5.17)
The corresponding types are ( 𝐴 × 𝐵) × 𝐶 and 𝐴 × (𝐵 × 𝐶); in Scala, ((A, B), C) and
(A, (B, C)). We can define functions that convert between these types without
information loss:
def f3[A, B, C]: (((A, B), C)) => (A, (B, C)) = { case ((a, b), c) => (a, (b, c)) }
def f4[A, B, C]: ((A, (B, C))) => ((A, B), C) = { case (a, (b, c)) => ((a, b), c) }
5 The logic of types. III. The Curry-Howard correspondence
By applying these functions to arbitrary values of types ((A, B), C) and (A, (B,
C)), it is easy to see that the functions f3 and f4 are inverses of each other. This
is also directly visible in the code: the nested tuple pattern ((a, b), c) is mapped
to the pattern (a, (b, c)) and back. So, the types ( 𝐴 × 𝐵) × 𝐶 and 𝐴 × (𝐵 × 𝐶) are
equivalent. We will often write 𝐴 × 𝐵 × 𝐶 without parentheses.
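As a quick illustration of the associativity equivalence, we may apply both compositions to sample values. This is a sketch under one assumption: both functions are typed here as taking a single tuple argument, so that the result of one can be passed directly to the other. The values x and y are hypothetical test data.

```scala
// Regroup a nested tuple: ((a, b), c) <-> (a, (b, c)).
def f3[A, B, C]: (((A, B), C)) => (A, (B, C)) = { case ((a, b), c) => (a, (b, c)) }
def f4[A, B, C]: ((A, (B, C))) => ((A, B), C) = { case (a, (b, c)) => ((a, b), c) }

val x: ((Int, String), Boolean) = ((1, "a"), true)
val y: (Int, (String, Boolean)) = (1, ("a", true))

// f3 followed by f4 returns x; f4 followed by f3 returns y.
val backToX = f4[Int, String, Boolean](f3[Int, String, Boolean](x))
val backToY = f3[Int, String, Boolean](f4[Int, String, Boolean](y))
```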
Does a logical identity always correspond to an equivalence of types? It turns
out that it does not. A simple example of a logical identity that does not correspond
to a type equivalence is:
𝑇𝑟𝑢𝑒 ∨ 𝛼 = 𝑇𝑟𝑢𝑒 . (5.18)
Since the CH correspondence maps the logical constant 𝑇𝑟𝑢𝑒 into the unit type 1,
the type equivalence corresponding to Eq. (5.18) is 1 + 𝐴 ≅ 1. The type denoted by
1 + 𝐴 means Option[A] in Scala, so the corresponding equivalence is Option[A] ≅ Unit.
Intuitively, this type equivalence should not hold: an Option[A] may carry a value
of type A, which cannot possibly be stored in a value of type Unit. We can verify
this intuition rigorously by proving that any fully parametric functions with type
signatures 𝑔1 : 1 + 𝐴 → 1 and 𝑔2 : 1 → 1 + 𝐴 will not satisfy 𝑔1 # 𝑔2 = id. To verify
this, we note that 𝑔2 : 1 → 1 + 𝐴 must have this type signature:
def g2[A]: Unit => Option[A] = ???
This function must always return None, since a fully parametric function cannot
produce values of an arbitrary type A from scratch. Therefore, 𝑔1 # 𝑔2 is also a func-
tion that always returns None. The function 𝑔1 # 𝑔2 has type signature 1 + 𝐴 → 1 + 𝐴
or, in Scala syntax, Option[A] => Option[A], and is not equal to the identity function,
because the identity function does not always return None.
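The loss of information can be observed directly in code. In this sketch, g1 and g2 are the only fully parametric implementations of their type signatures (g1 must return the unique value of type Unit, and g2 must return None); the name roundTrip is introduced here only for illustration.

```scala
def g1[A]: Option[A] => Unit = _ => ()    // Discards its argument.
def g2[A]: Unit => Option[A] = _ => None  // Cannot produce a value of type A.

// The composition g1 # g2 has type Option[Int] => Option[Int] but is not the identity.
val roundTrip: Option[Int] => Option[Int] = g1[Int].andThen(g2[Int])
```

Applying roundTrip to Some(123) gives None: the value 123 cannot be recovered, so g1 and g2 are not inverses.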
Another example of a logical identity that does not correspond to a type equiv-
alence is the distributive law:
∀( 𝐴, 𝐵, 𝐶). ( 𝐴 ∧ 𝐵) ∨ 𝐶 = ( 𝐴 ∨ 𝐶) ∧ (𝐵 ∨ 𝐶) , (5.19)
which is “dual” to the law (5.13), i.e., it is obtained from Eq. (5.13) by swapping all
conjunctions (∧) with disjunctions (∨). In logic, a dual formula to an identity is
also an identity. The CH correspondence maps Eq. (5.19) into the type equation:
∀( 𝐴, 𝐵, 𝐶). ( 𝐴 × 𝐵) + 𝐶 ≟ ( 𝐴 + 𝐶) × (𝐵 + 𝐶) . (5.20)
(1 + 10) × 20 = 1 × 20 + 10 × 20 .
The logical identity in Eq. (5.19), which does not yield a type equivalence, leads
to an incorrect arithmetic equation (5.20), e.g., (1 × 10) + 20 ≠ (1 + 20) × (10 + 20).
Similarly, the associativity law (5.17) leads to a type equivalence and to the arith-
metic identity:
(𝑎 × 𝑏) × 𝑐 = 𝑎 × (𝑏 × 𝑐) .
The logical identity in Eq. (5.18), which does not yield a type equivalence, leads
to an incorrect arithmetic statement (“1 + 𝑎 = 1 for all 𝑎”).
Table 5.5 summarizes these and other examples of logical identities and the
corresponding type equivalences. In all rows, quantifiers such as ∀𝛼 or ∀( 𝐴, 𝐵)
are implied as necessary.
Because the type notation is similar to the ordinary arithmetic notation, it is easy
to translate a possible type equivalence into an arithmetic equation. In all cases,
valid arithmetic identities correspond to type equivalences, and failures to obtain
a type equivalence correspond to incorrect arithmetic identities. With regard to
type equivalence, types such as 𝐴 + 𝐵 and 𝐴 × 𝐵 behave similarly to arithmetic
expressions such as 10 + 20 and 10 × 20 and not similarly to logical formulas such
as 𝛼 ∨ 𝛽 and 𝛼 ∧ 𝛽.
Logical identity                              Type equivalence (if it holds)
𝑇𝑟𝑢𝑒 ∨ 𝛼 = 𝑇𝑟𝑢𝑒                               1 + 𝐴 ≇ 1
𝑇𝑟𝑢𝑒 ∧ 𝛼 = 𝛼                                  1 × 𝐴 ≅ 𝐴
𝐹𝑎𝑙𝑠𝑒 ∨ 𝛼 = 𝛼                                 0 + 𝐴 ≅ 𝐴
𝐹𝑎𝑙𝑠𝑒 ∧ 𝛼 = 𝐹𝑎𝑙𝑠𝑒                             0 × 𝐴 ≅ 0
𝛼 ∨ 𝛽 = 𝛽 ∨ 𝛼                                 𝐴 + 𝐵 ≅ 𝐵 + 𝐴
𝛼 ∧ 𝛽 = 𝛽 ∧ 𝛼                                 𝐴 × 𝐵 ≅ 𝐵 × 𝐴
(𝛼 ∨ 𝛽) ∨ 𝛾 = 𝛼 ∨ (𝛽 ∨ 𝛾)                     (𝐴 + 𝐵) + 𝐶 ≅ 𝐴 + (𝐵 + 𝐶)
(𝛼 ∧ 𝛽) ∧ 𝛾 = 𝛼 ∧ (𝛽 ∧ 𝛾)                     (𝐴 × 𝐵) × 𝐶 ≅ 𝐴 × (𝐵 × 𝐶)
(𝛼 ∨ 𝛽) ∧ 𝛾 = (𝛼 ∧ 𝛾) ∨ (𝛽 ∧ 𝛾)               (𝐴 + 𝐵) × 𝐶 ≅ 𝐴 × 𝐶 + 𝐵 × 𝐶
(𝛼 ∧ 𝛽) ∨ 𝛾 = (𝛼 ∨ 𝛾) ∧ (𝛽 ∨ 𝛾)               (𝐴 × 𝐵) + 𝐶 ≇ (𝐴 + 𝐶) × (𝐵 + 𝐶)
Table 5.5: Logic identities with disjunction and conjunction, and the possible type
equivalences.
We already verified the first line and the last three lines of Table 5.5. Other
identities are verified in a similar way. Let us begin with lines 3 and 4 of Table 5.5,
which involve the proposition 𝐹𝑎𝑙𝑠𝑒 and the corresponding void type 0 (Scala’s
Nothing). Reasoning about the void type needs a special technique that we will
now develop while verifying the type isomorphisms 0 × 𝐴 ≅ 0 and 0 + 𝐴 ≅ 𝐴.
Example 5.3.2.1 Verify the type equivalence 0 × 𝐴 ≅ 0.
Solution Recall that the type notation 0 × 𝐴 represents the Scala tuple type
(Nothing, A). To demonstrate that the type (Nothing, A) is equivalent to the type
Nothing, we need to show that the type (Nothing, A) has no values. Indeed, how
could we create a value of type, say, (Nothing, Int)? We would need to fill both
parts of the tuple. We have values of type Int, but we can never get a value of type
Nothing. So, regardless of the type A, it is impossible to create any values of type
(Nothing, A). In other words, the set of values of the type (Nothing, A) is empty.
But that is the definition of the void type Nothing. The types (Nothing, A) (denoted
by 0 × 𝐴) and Nothing (denoted by 0) are both void and therefore equivalent.
Example 5.3.2.2 Verify the type equivalence 0 + 𝐴 ≅ 𝐴.
Solution The type notation 0 + 𝐴 corresponds to the Scala type Either[Nothing,
A]. We need to show that any value of that type can be mapped without loss of
information to a value of type A, and vice versa. This means implementing func-
tions 𝑓1 : 0 + 𝐴 → 𝐴 and 𝑓2 : 𝐴 → 0 + 𝐴 such that 𝑓1 # 𝑓2 = id and 𝑓2 # 𝑓1 = id.
The argument of 𝑓1 is of type Either[Nothing, A]. How can we create a value of
that type? Our only choices are to create a Left(x) with x: Nothing, or to create a
Right(y) with y: A. However, we cannot create a value x of type Nothing because
the type Nothing has no values. We cannot create a Left(x). The only remaining
possibility is to create a Right(y) with some value y of type A. So, any values of
type 0 + 𝐴 must be of the form Right(y), and we can extract that y to obtain a value
of type A:
def f1[A]: Either[Nothing, A] => A = {
  case Right(y) => y
  // No need for `case Left(x) => ...` since no `x` can ever be given as `Left(x)`.
}
For the same reason, there is only one implementation of the function f2:
def f2[A]: A => Either[Nothing, A] = { y => Right(y) }
It is clear from the code that the functions f1 and f2 are inverses of each other.
Example 5.3.2.3 Verify the type equivalence 𝐴 × 1 ≅ 𝐴.
Solution The corresponding Scala types are the tuple (A, Unit) and the type
A. We need to implement functions 𝑓1 : ∀𝐴. 𝐴 × 1 → 𝐴 and 𝑓2 : ∀𝐴. 𝐴 → 𝐴 × 1
and to demonstrate that they are inverses of each other. The Scala code for these
functions is:
def f1[A]: ((A, Unit)) => A = { case (a, ()) => a }
def f2[A]: A => (A, Unit) = { a => (a, ()) }
Now let us write a proof in the code notation. The codes of 𝑓1 and 𝑓2 are:
\[ f_1 \overset{\text{def}}{=} a^{:A} \times 1 \to a , \qquad f_2 \overset{\text{def}}{=} a^{:A} \to a \times 1 , \]
This shows that both compositions are identity functions. Another way of writ-
ing the proof is by computing the function compositions symbolically, without
applying to a value 𝑎 :𝐴 :
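\begin{align*}
f_1 \# f_2 &= (a \times 1 \to a) \# (a \to a \times 1) = (a \times 1 \to a \times 1) = \text{id}^{A \times 1} , \\
f_2 \# f_1 &= (a \to a \times 1) \# (a \times 1 \to a) = (a \to a) = \text{id}^{A} .
\end{align*}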
The functions f1 and f2 are implemented by code that can be derived unambigu-
ously from the type signatures. For instance, the line case Left(a) => ... is re-
quired to return a value of type Either[B, A] by using a given value a: A. The only
way of doing that is by returning Right(a).
It is clear from the code that the functions f1 and f2 are inverses of each other. To
verify that rigorously, we need to show that f1 andThen f2 is equal to an identity
function. The function f1 andThen f2 applies f2 to the result of f1. The code of
f1 contains two case ... lines, each returning a result. So, we need to apply f2
separately in each line. Evaluate the code symbolically:
(f1 andThen f2) == {
case Left(a) => f2(Right(a))
case Right(b) => f2(Left(b))
} == {
case Left(a) => Left(a)
case Right(b) => Right(b)
}
The result is a function of type Either[A, B] => Either[A, B] that does not change
its argument; so, it is equal to the identity function.
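The symbolic evaluation above can be cross-checked by running the code on sample values. The sketch below repeats the definitions of f1 and f2 so that it is self-contained; the list samples is hypothetical test data. It also compares f1 with the standard library method Either#swap, which performs the same exchange of Left and Right.

```scala
def f1[A, B]: Either[A, B] => Either[B, A] = {
  case Left(a)  => Right(a)
  case Right(b) => Left(b)
}

def f2[A, B]: Either[B, A] => Either[A, B] = {
  case Left(y)  => Right(y)
  case Right(x) => Left(x)
}

// One sample value for each of the two disjoint cases.
val samples: List[Either[Int, String]] = List(Left(1), Right("b"))

// f1 andThen f2 should leave every sample unchanged.
val results = samples.map(f1[Int, String].andThen(f2[Int, String]))
```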
Let us now write the function f1 in the code notation and perform the same
derivation. We will also develop a useful notation for functions operating on dis-
junctive types.
The pattern matching construction in the Scala code of f1 is similar to a pair of
functions with types A => Either[B, A] and B => Either[B, A]. One of these func-
tions is applied depending on whether the argument of f1 has type 𝐴 + 0 or 0 + 𝐵.
So, we may write the code of f1 as:
\[ f_1 \overset{\text{def}}{=} x^{:A+B} \to \begin{cases} \text{if } x = a^{:A} + 0^{:B} : & 0^{:B} + a^{:A} \\ \text{if } x = 0^{:A} + b^{:B} : & b^{:B} + 0^{:A} \end{cases} \]
Since both the argument and the result of 𝑓1 are disjunctive types with 2 parts
each, the code notation represents 𝑓1 as a 2 × 2 matrix that maps the input parts to
the output parts:
def f1[A, B]: Either[A, B] => Either[B, A] = {
  case Left(a) => Right(a)
  case Right(b) => Left(b)
}
\[ f_1 \overset{\text{def}}{=} \begin{array}{c||cc} & B & A \\ \hline A & 0 & a^{:A} \to a \\ B & b^{:B} \to b & 0 \end{array} . \]
The matrix element in row 𝐴 and column 𝐴 is a function 𝑎 :𝐴 → 𝑎 of type 𝐴 → 𝐴
that corresponds to the line case Left(a) => Right(a) in the Scala code. The matrix
element in row 𝐴 and column 𝐵 is written as 0 because no value of that type is
returned. The matrix row 𝐵 contains just the function 𝑏 :𝐵 → 𝑏 in the first column.
In the second column, row 𝐵 contains a 0.
The code of 𝑓2 is written similarly. Let us rename arguments for clarity:
def f2[A, B]: Either[B, A] => Either[A, B] = {
case Left(y) => Right(y)
case Right(x) => Left(x)
}
\[ f_2 \overset{\text{def}}{=} \begin{array}{c||cc} & A & B \\ \hline B & 0 & y^{:B} \to y \\ A & x^{:A} \to x & 0 \end{array} . \]
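The forward composition 𝑓1 # 𝑓2 is computed by the standard rules of row-by-column
matrix multiplication (see https://en.wikipedia.org/wiki/Matrix_multiplication).
Any terms containing 0 are omitted, and the remaining functions are composed:
\begin{align*}
f_1 \# f_2 &= \begin{array}{c||cc} & B & A \\ \hline A & 0 & a^{:A} \to a \\ B & b^{:B} \to b & 0 \end{array} \;\#\; \begin{array}{c||cc} & A & B \\ \hline B & 0 & y^{:B} \to y \\ A & x^{:A} \to x & 0 \end{array} \\
\text{matrix composition}: \quad &= \begin{array}{c||cc} & A & B \\ \hline A & (a^{:A} \to a) \# (x^{:A} \to x) & 0 \\ B & 0 & (b^{:B} \to b) \# (y^{:B} \to y) \end{array} \\
\text{function composition}: \quad &= \begin{array}{c||cc} & A & B \\ \hline A & \text{id} & 0 \\ B & 0 & \text{id} \end{array} = \text{id}^{:A+B \to A+B} .
\end{align*}
Several features of the matrix notation are helpful in such calculations. The
parts of the code of 𝑓1 are automatically composed with the corresponding parts
of the code of 𝑓2 . To check that the types match in the function composition, we
just need to compare the types in the output row \( \begin{Vmatrix} B & A \end{Vmatrix} \) of 𝑓1 with the input
column \( \begin{Vmatrix} B \\ A \end{Vmatrix} \) of 𝑓2 . Once we verified that all types match, we may omit the type
annotations and write the same derivation more concisely as:
\begin{align*}
f_1 \# f_2 &= \begin{Vmatrix} 0 & a^{:A} \to a \\ b^{:B} \to b & 0 \end{Vmatrix} \# \begin{Vmatrix} 0 & y^{:B} \to y \\ x^{:A} \to x & 0 \end{Vmatrix} \\
\text{matrix composition}: \quad &= \begin{Vmatrix} (a^{:A} \to a) \# (x^{:A} \to x) & 0 \\ 0 & (b^{:B} \to b) \# (y^{:B} \to y) \end{Vmatrix} \\
\text{function composition}: \quad &= \begin{Vmatrix} \text{id} & 0 \\ 0 & \text{id} \end{Vmatrix} = \text{id} .
\end{align*}
The identity function is represented by this diagonal matrix: \( \begin{Vmatrix} \text{id} & 0 \\ 0 & \text{id} \end{Vmatrix} \).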
Exercise 5.3.2.5 Verify the type equivalence 𝐴 × 𝐵 ≅ 𝐵 × 𝐴.
Exercise 5.3.2.6 Verify the type equivalence ( 𝐴 + 𝐵) + 𝐶 ≅ 𝐴 + (𝐵 + 𝐶). Since Section 5.3.1
proved the equivalences ( 𝐴 + 𝐵) + 𝐶 ≅ 𝐴 + (𝐵 + 𝐶) and ( 𝐴 × 𝐵) × 𝐶 ≅
𝐴 × (𝐵 × 𝐶), we may write 𝐴 + 𝐵 + 𝐶 and 𝐴 × 𝐵 × 𝐶 without any parentheses.
Exercise 5.3.2.7 Verify the type equivalence:
( 𝐴 + 𝐵) × ( 𝐴 + 𝐵) = 𝐴 × 𝐴 + 2 × 𝐴 × 𝐵 + 𝐵 × 𝐵 ,
String. So, the total number of possible different strings will be finite, depending
on the computer. Similarly, the set of all possible values of type List[Int] will be a
finite set.
But this introduces an arbitrary limit on the total size of data, which is incon-
venient for reasoning about programs. For instance, string concatenation and list
concatenation will become partial functions; those operations will fail when the
total size of the result is larger than the memory limit. It is more convenient to
imagine a computer with an infinite array of memory locations. On that com-
puter, each program is still only allowed to use a finite amount of memory, but
that amount is not limited in advance. Then all basic operations on data types be-
come total functions; for example, the concatenation of any two strings is always
well-defined.
In the model of an infinite computer, the set of all possible strings will be a
countably infinite set consisting of all possible character sequences (where char-
acters come from a finite set). Similarly, the set of all possible values of type
List[Int] will be a countably infinite set. This makes it difficult to reason about
the total number of values of a given type. So, for the purposes of this section, we
will limit our consideration to types 𝐴, 𝐵, ..., that have finite cardinalities.
The next step is to consider the cardinality of types such as 𝐴 × 𝐵 and 𝐴 + 𝐵.
If the types 𝐴 and 𝐵 have cardinalities | 𝐴| and |𝐵|, it follows that the set of all
distinct pairs (A, B) has | 𝐴| × |𝐵| elements. So, the cardinality of the type 𝐴 × 𝐵
is equal to the (arithmetic) product of the cardinalities of 𝐴 and 𝐵. The set of all
pairs, denoted in mathematics by:
{(𝑎, 𝑏) | 𝑎 ∈ 𝐴, 𝑏 ∈ 𝐵} ,
| 𝐴 × 𝐵| = | 𝐴| × |𝐵| ,
| 𝐴 + 𝐵| = | 𝐴| + |𝐵| .
The type notation, 𝐴 × 𝐵 for (A,B) and 𝐴 + 𝐵 for Either[A, B], translates directly
into type cardinalities.
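For types with a small number of values, these cardinality formulas can be confirmed by enumeration. The sketch below uses two hypothetical helper functions, allPairs and allEithers, which build every value of (A, B) and Either[A, B] from given lists of all values of A and B.

```scala
// Hypothetical helpers: enumerate all values of a product type and of a disjunctive type.
def allPairs[A, B](as: List[A], bs: List[B]): List[(A, B)] =
  for { a <- as; b <- bs } yield (a, b)

def allEithers[A, B](as: List[A], bs: List[B]): List[Either[A, B]] = {
  val lefts: List[Either[A, B]]  = as.map(a => Left(a))
  val rights: List[Either[A, B]] = bs.map(b => Right(b))
  lefts ++ rights
}

val booleans: List[Boolean] = List(false, true)                    // |Boolean| = 2
val options: List[Option[Boolean]] = None :: booleans.map(Some(_)) // |1 + 2| = 3

val pairCount   = allPairs(options, booleans).size   // |A × B| = 3 × 2
val eitherCount = allEithers(options, booleans).size // |A + B| = 3 + 2
```

Here pairCount is 6 and eitherCount is 5, in agreement with the product and the sum of the cardinalities.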
The last step is to notice that two types can be equivalent, 𝑃 ≅ 𝑄, only if their
cardinalities are equal, |𝑃| = |𝑄|. When the cardinalities are not equal, |𝑃| ≠ |𝑄|, it
The presence of an arbitrary choice in this code is a warning sign. In f1, we could
map None to Left(false) or to Left(true) and adjust the rest of the code accord-
ingly. The type equivalence holds with either choice. So, these types are equiva-
lent, but there is no natural choice of the conversion functions f1 and f2 because
the meaning of those data types will be application-dependent. We call this type
equivalence accidental.
Example 5.3.3.1 Are the types Option[A] and Either[Unit, A] equivalent? Check
whether the corresponding logic identity and arithmetic identity hold.
Solution Begin by writing the given types in the type notation: Option[A] is writ-
ten as 1 + 𝐴, and Either[Unit, A] is written also as 1 + 𝐴. The notation already
indicates that the types are equivalent. But let us verify explicitly that the type
notation is not misleading us here.
To establish the type equivalence, we need to implement two functions:
def f1[A]: Option[A] => Either[Unit, A] = ???
def f2[A]: Either[Unit, A] => Option[A] = ???
The code clearly shows that f1 and f2 are inverses of each other. This verifies the
type equivalence.
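One possible way of filling in the `???` placeholders is sketched below. Each pattern-matching case maps one part of the disjunction 1 + 𝐴 to the corresponding part on the other side, so there is no arbitrary choice to make. The sample list is hypothetical test data.

```scala
def f1[A]: Option[A] => Either[Unit, A] = {
  case None    => Left(())
  case Some(x) => Right(x)
}

def f2[A]: Either[Unit, A] => Option[A] = {
  case Left(()) => None
  case Right(x) => Some(x)
}

// Both disjoint cases of Option[Int]; the round trip should leave them unchanged.
val sampleOptions: List[Option[Int]] = List(None, Some(10))
val roundTrip = sampleOptions.map(f1[Int].andThen(f2[Int]))
```

The Scala standard library offers the same conversions as Option#toRight and Either#toOption, so f1 agrees with `_.toRight(())` and f2 with `_.toOption`.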
The logic identity is 𝑇𝑟𝑢𝑒 ∨ 𝐴 = 𝑇𝑟𝑢𝑒 ∨ 𝐴 and holds trivially. It remains to
check the arithmetic identity, which relates the cardinalities of types Option[A] and
Either[Unit, A]. Assume that the cardinality of type A is | 𝐴|. Any possible value of
type Option[A] must be either None or Some(x), where x is a value of type A. So, the
number of distinct values of type Option[A] is 1 + | 𝐴|. All possible values of type
Either[Unit, A] are of the form Left(()) or Right(x), where x is a value of type A. So,
the cardinality of type Either[Unit, A] is 1 + | 𝐴|. We see that the arithmetic identity
holds: the types Option[A] and Either[Unit, A] have equally many distinct values.
This example shows that the type notation is helpful for reasoning about type
equivalences. The answer was found immediately when we wrote the type nota-
tion (1 + 𝐴) for the given types.
Logical identity                          Type equivalence                        Arithmetic identity
(𝑇𝑟𝑢𝑒 ⇒ 𝛼) = 𝛼                            1 → 𝐴 ≅ 𝐴                               𝑎^1 = 𝑎
(𝛼 ∨ 𝛽) ⇒ 𝛾 = (𝛼 ⇒ 𝛾) ∧ (𝛽 ⇒ 𝛾)           𝐴 + 𝐵 → 𝐶 ≅ (𝐴 → 𝐶) × (𝐵 → 𝐶)           𝑐^(𝑎+𝑏) = 𝑐^𝑎 × 𝑐^𝑏
(𝛼 ∧ 𝛽) ⇒ 𝛾 = 𝛼 ⇒ (𝛽 ⇒ 𝛾)                 𝐴 × 𝐵 → 𝐶 ≅ 𝐴 → 𝐵 → 𝐶                   𝑐^(𝑎×𝑏) = (𝑐^𝑏)^𝑎
𝛼 ⇒ (𝛽 ∧ 𝛾) = (𝛼 ⇒ 𝛽) ∧ (𝛼 ⇒ 𝛾)           𝐴 → 𝐵 × 𝐶 ≅ (𝐴 → 𝐵) × (𝐴 → 𝐶)           (𝑏 × 𝑐)^𝑎 = 𝑏^𝑎 × 𝑐^𝑎
Table 5.6: Logical identities with implication, and the corresponding type equivalences
and arithmetic identities.
complicated (and practically useless) ways. The code of those functions will be
much larger than the available memory of a realistic computer. So, the number
of practically implementable functions of type 𝐴 → 𝐵 can be much smaller
than |𝐵|^|𝐴| . Since the code of a function is a list of bytes that needs to fit into the
computer’s memory, the number of implementable functions is no larger than the
number of possible byte lists.
Nevertheless, the formula |𝐵|^|𝐴| is useful since it shows the number of distinct
functions that are possible in principle, on an imaginary computer with infinite
memory (although we still need to limit our consideration to types 𝐴, 𝐵 with finite
cardinalities | 𝐴|, |𝐵|). When types 𝐴 and 𝐵 have only a small number of distinct
values (for example, with 𝐴 = Option[Boolean] and 𝐵 = Either[Boolean, Boolean]),
the formula |𝐵|^|𝐴| gives an exact and practically relevant answer.
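For the small types just mentioned, the count can be confirmed by enumeration. In this sketch, a function from a finite type is represented by a Map that lists its result for every argument; the helper allFunctions is a hypothetical name introduced for illustration.

```scala
// Hypothetical helper: all functions from a finite domain to a finite codomain,
// each represented as a Map from arguments to results.
def allFunctions[A, B](domain: List[A], codomain: List[B]): List[Map[A, B]] =
  domain.foldLeft(List(Map.empty[A, B])) { (partial, a) =>
    // Extend every partially defined function by each possible result for `a`.
    for { m <- partial; b <- codomain } yield m + (a -> b)
  }

val as: List[Option[Boolean]] = List(None, Some(false), Some(true)) // |A| = 3
val bs: List[Either[Boolean, Boolean]] =
  List(Left(false), Left(true), Right(false), Right(true))          // |B| = 4

val fs = allFunctions(as, bs)
```

The list fs contains 4 × 4 × 4 = 64 distinct maps, which is |𝐵| raised to the power |𝐴| with |𝐴| = 3 and |𝐵| = 4.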
Let us now look for logic identities and arithmetic identities involving function
types. Table 5.6 lists the available identities and the corresponding type equivalences.
(In the last column, we defined 𝑎 ≝ | 𝐴|, 𝑏 ≝ |𝐵|, and 𝑐 ≝ |𝐶 | for brevity.)
It is notable that no logic identity is available for the formula 𝛼 ⇒ (𝛽 ∨ 𝛾), and
correspondingly no type equivalence is available for the type expression 𝐴 →
𝐵 + 𝐶 (although there is an identity for 𝐴 → 𝐵 × 𝐶). Reasoning about types of
the form 𝐴 → 𝐵 + 𝐶 is more complicated because those types usually cannot be
rewritten as simpler types.
We will now prove some of the type identities in Table 5.6.
Example 5.3.4.1 Verify the type equivalence 1 → 𝐴 ≅ 𝐴.
Solution Recall that the type notation 1 → 𝐴 means the Scala function type
Unit => A. There is only one value of type Unit. The choice of a function of type
Unit => A is the same as the choice of a value of type A. So, the type 1 → 𝐴 has | 𝐴|
distinct values, and the arithmetic identity holds.
\[ f_1 \overset{\text{def}}{=} h^{:1 \to A} \to h(1) , \qquad f_2 \overset{\text{def}}{=} x^{:A} \to 1 \to x . \]
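A Scala sketch of these two functions is shown below, assuming the straightforward translation of the code notation: the unit value 1 is written as `()` and the type 1 → 𝐴 as `Unit => A`. The sample function h is hypothetical.

```scala
def f1[A]: (Unit => A) => A = h => h(())    // Apply h to the only value of type Unit.
def f2[A]: A => Unit => A   = x => _ => x   // Wrap x into a function that ignores its argument.

val h: Unit => Int = _ => 7

val backToValue: Int = f1[Int](f2[Int](10))         // f2 and then f1
val backToFunction: Unit => Int = f2[Int](f1[Int](h)) // f1 and then f2
```

Functions cannot be compared with `==`, so backToFunction is compared with h by applying both to the single possible argument `()`.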
The type 1 → 𝐴 is equivalent to the type 𝐴 in the sense of carrying the same
information, but these types are not exactly the same. An important difference
between these types is that a value of type 𝐴 is available immediately, while a
value of type 1 → 𝐴 is a function that still needs to be applied to an argument
(of type 1) before a value of type 𝐴 is obtained. The type 1 → 𝐴 may represent
an “on-call” value of type 𝐴. That value is computed on demand every time it is
requested. (See Section 2.6.3 for more details about “on-call” values.)
The void type 0 needs special reasoning, as the next examples show:
Example 5.3.4.2 Verify the type equivalence 0 → 𝐴 ≅ 1.
Solution To verify that a type 𝑋 is equivalent to the Unit type, we need to show
that there is only one distinct value of type 𝑋. So, let us find out how many values
the type 0 → 𝐴 has. Consider a value of that type, which is a function 𝑓 :0→𝐴 from
the type 0 to a type 𝐴. Since there exist no values of type 0, the function 𝑓 will
never be applied to any arguments and so does not need to compute any actual
values of type 𝐴. So, 𝑓 is a function whose body may be “empty”. At least, 𝑓 ’s
body does not need to contain any expressions of type 𝐴. In Scala, such a function
can be written as:
def absurd[A]: Nothing => A = { ??? }
This code will compile without type errors. An equivalent code is:
def absurd[A]: Nothing => A = { x => ??? }
The symbol ??? is defined in the Scala library and represents code that is “not
implemented”. Trying to evaluate this symbol will produce an error:
scala> val x = ???
scala.NotImplementedError: an implementation is missing
Since the function absurd can never be applied to an argument, this error will never
happen. So, one can pretend that the result value (which will never be computed)
has any required type, e.g., the type 𝐴. In this way, the compiler will accept the
definition of absurd.
Let us now verify that there exists only one distinct function of type 0 → 𝐴. Take
any two functions of that type, 𝑓 :0→𝐴 and 𝑔 :0→𝐴 . Are they different? The only way
of showing that 𝑓 and 𝑔 are different is by finding a value 𝑥 such that 𝑓 (𝑥) ≠ 𝑔(𝑥).
But then 𝑥 would be of type 0, and there are no values of type 0. So, we will never
be able to find the required value 𝑥. It follows that any two functions 𝑓 and 𝑔 of
type 0 → 𝐴 are equal, 𝑓 = 𝑔. In other words, there exists only one distinct value
of type 0 → 𝐴. Since the cardinality of the type 0 → 𝐴 is 1, we obtain the type
equivalence 0 → 𝐴 ≅ 1.
Example 5.3.4.3 Show that 𝐴 → 0 ≇ 0 and 𝐴 → 0 ≇ 1, where 𝐴 is an arbitrary
unknown type.
Solution To prove that two types are not equivalent, it is sufficient to show
that their cardinalities are different. Let us determine how the cardinality of the
\[ f^{:A \to 1} \overset{\text{def}}{=} \_ \to 1 . \]
We can show that there exists only one distinct function of type 𝐴 → 1 (that is, the
type 𝐴 → 1 has cardinality 1). Assume that 𝑓 and 𝑔 are two such functions, and
try to find a value 𝑥 :𝐴 such that 𝑓 (𝑥) ≠ 𝑔(𝑥). We cannot find any such 𝑥 because
𝑓 (𝑥) = 1 and 𝑔(𝑥) = 1 for all 𝑥. So, any two functions 𝑓 and 𝑔 of type 𝐴 → 1 must
be equal to each other. The cardinality of the type 𝐴 → 1 is 1.
Any type having cardinality 1 is equivalent to the Unit type (1). So, 𝐴 → 1 ≅ 1.
Example 5.3.4.5 Denote by _:𝐴 → 𝐵 the type of constant functions of type 𝐴 → 𝐵
(functions that ignore their argument). Show that the type _:𝐴 → 𝐵 is equivalent
to the type 𝐵, as long as 𝐴 ≠ 0.
Solution An isomorphism between the types 𝐵 and _:𝐴 → 𝐵 is given by the
two functions:
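\begin{align*}
f_1 &: B \to \_^{:A} \to B , & f_1 &\overset{\text{def}}{=} b \to \_ \to b ; \\
f_2 &: (\_^{:A} \to B) \to B , & f_2 &\overset{\text{def}}{=} k^{:\_ \to B} \to k(x^{:A}) ,
\end{align*}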
where 𝑥 is any value of type 𝐴. That value exists since the type 𝐴 is not void. The
function 𝑓2 does not depend on the choice of 𝑥 because 𝑘 is a constant function, so
𝑘 (𝑥) is the same for all 𝑥. In other words, the function 𝑘 satisfies 𝑘 = (_ → 𝑘 (𝑥))
with any chosen 𝑥. To prove that 𝑓1 and 𝑓2 are inverses:
𝑓1 # 𝑓2 = (𝑏 → _ → 𝑏) # (𝑘 → 𝑘 (𝑥)) = 𝑏 → (_ → 𝑏)(𝑥) = (𝑏 → 𝑏) = id ,
𝑓2 # 𝑓1 = (𝑘 → 𝑘 (𝑥)) # (𝑏 → _ → 𝑏) = 𝑘 → _ → 𝑘 (𝑥) = 𝑘 → 𝑘 = id .
𝐴 + 𝐵 → 𝐶 ≅ ( 𝐴 → 𝐶) × (𝐵 → 𝐶) .
𝑓1 : ( 𝐴 + 𝐵 → 𝐶) → ( 𝐴 → 𝐶) × (𝐵 → 𝐶) ,
\[ f_1 \overset{\text{def}}{=} h^{:A+B \to C} \to \big( a^{:A} \to h(a + 0^{:B}) \big) \times \big( b^{:B} \to h(0^{:A} + b) \big) . \]
For the function f2, we need to apply pattern matching to both curried argu-
ments and then return a value of type C. This can be achieved in only one way:
def f2[A, B, C](f: A => C, g: B => C): Either[A, B] => C = {
case Left(a) => f(a)
case Right(b) => g(b)
}
We write this function in the code notation like this:
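𝑓2 : ( 𝐴 → 𝐶) × (𝐵 → 𝐶) → 𝐴 + 𝐵 → 𝐶 ,
\[ f_2 \overset{\text{def}}{=} f^{:A \to C} \times g^{:B \to C} \to \begin{array}{c||c} & C \\ \hline A & a \to f(a) \\ B & b \to g(b) \end{array} . \]
The matrix in the last line has only one column because the result type, 𝐶, is not
known to be a disjunctive type. We may also simplify the functions, e.g., replace
𝑎 → 𝑓 (𝑎) by just 𝑓 , and write:
\[ f_2 \overset{\text{def}}{=} f^{:A \to C} \times g^{:B \to C} \to \begin{array}{c||c} & C \\ \hline A & f \\ B & g \end{array} . \]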
(omitting types):
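\begin{align*}
f_1 \# f_2 &= \big( h \to (a \to h(a + 0)) \times (b \to h(0 + b)) \big) \# \Big( f \times g \to \begin{Vmatrix} f \\ g \end{Vmatrix} \Big) \\
\text{compute composition}: \quad &= h \to \begin{Vmatrix} a \to h(a + 0) \\ b \to h(0 + b) \end{Vmatrix} .
\end{align*}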
To proceed, we need to simplify the expressions ℎ(𝑎 + 0) and ℎ(0 + 𝑏). We rewrite
the argument ℎ (an arbitrary function of type 𝐴 + 𝐵 → 𝐶) in the matrix notation:
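\[ h \overset{\text{def}}{=} \begin{array}{c||c} & C \\ \hline A & a \to p(a) \\ B & b \to q(b) \end{array} = \begin{array}{c||c} & C \\ \hline A & p \\ B & q \end{array} , \]
where 𝑝 :𝐴→𝐶 and 𝑞 :𝐵→𝐶 are new arbitrary functions. Since we already checked the
types, we can omit all type annotations and write ℎ as:
\[ h \overset{\text{def}}{=} \begin{Vmatrix} p \\ q \end{Vmatrix} . \]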
To evaluate expressions such as ℎ(𝑎 + 0) and ℎ(0 + 𝑏), we need to use one of the
rows of this matrix. The correct row will be selected automatically by the rules of
matrix multiplication if we place a row vector to the left of the matrix and use the
convention of omitting terms containing 0:
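\[ \begin{Vmatrix} a & 0 \end{Vmatrix} \triangleright \begin{Vmatrix} p \\ q \end{Vmatrix} = a \triangleright p , \qquad \begin{Vmatrix} 0 & b \end{Vmatrix} \triangleright \begin{Vmatrix} p \\ q \end{Vmatrix} = b \triangleright q . \]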
Here we used the symbol ⊲ to separate an argument from a function when the
argument is written to the left of the function. The symbol ⊲ (pronounced “pipe”)
is defined by 𝑥 ⊲ 𝑓 ≝ 𝑓 (𝑥). In Scala, this operation is available as x.pipe(f) as of
Scala 2.13 (after importing scala.util.chaining._).
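The following sketch illustrates the pipe operation in Scala 2.13 or later. The functions p and q and the values left and right are hypothetical; h is built from p and q in the same way as the one-column matrix with rows 𝑝 and 𝑞.

```scala
import scala.util.chaining._

val p: Int => String     = a => s"p($a)"
val q: Boolean => String = b => s"q($b)"

// The function h with rows p and q: the matching case selects the row.
val h: Either[Int, Boolean] => String = {
  case Left(a)  => p(a)
  case Right(b) => q(b)
}

val left: Either[Int, Boolean]  = Left(10)    // Corresponds to a + 0.
val right: Either[Int, Boolean] = Right(true) // Corresponds to 0 + b.

val viaPipe: String  = left.pipe(h) // (a + 0) ⊲ h
val viaApply: String = h(left)      // h(a + 0)
```

Both viaPipe and viaApply are equal to p(10), and right.pipe(h) is equal to q(true): the pipe only changes the order in which the argument and the function are written.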
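We can write values of disjunctive types, such as 𝑎 + 0, as row vectors \( \begin{Vmatrix} a & 0 \end{Vmatrix} \):
\[ h(a + 0) = (a + 0) \triangleright h = \begin{Vmatrix} a & 0 \end{Vmatrix} \triangleright h . \qquad (5.21) \]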
With these notations, we compute further. Omit all terms applying 0 or applying
something to 0:
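\begin{align*}
h(a + 0) &= \begin{Vmatrix} a & 0 \end{Vmatrix} \triangleright h = \begin{Vmatrix} a & 0 \end{Vmatrix} \triangleright \begin{Vmatrix} p \\ q \end{Vmatrix} = a \triangleright p = p(a) , \\
h(0 + b) &= \begin{Vmatrix} 0 & b \end{Vmatrix} \triangleright h = \begin{Vmatrix} 0 & b \end{Vmatrix} \triangleright \begin{Vmatrix} p \\ q \end{Vmatrix} = b \triangleright q = q(b) .
\end{align*}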
Now we can complete the proof of 𝑓1 # 𝑓2 = id:
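\begin{align*}
f_1 \# f_2 &= h \to \begin{Vmatrix} a \to h(a + 0) \\ b \to h(0 + b) \end{Vmatrix} \\
\text{previous equations}: \quad &= \begin{Vmatrix} p \\ q \end{Vmatrix} \to \begin{Vmatrix} a \to p(a) \\ b \to q(b) \end{Vmatrix} \\
\text{simplify functions}: \quad &= \begin{Vmatrix} p \\ q \end{Vmatrix} \to \begin{Vmatrix} p \\ q \end{Vmatrix} = \text{id} .
\end{align*}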
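\[ \text{composition}: \quad = f \times g \to \Big( a \to \begin{Vmatrix} a & 0 \end{Vmatrix} \triangleright \begin{Vmatrix} f \\ g \end{Vmatrix} \Big) \times \Big( b \to \begin{Vmatrix} 0 & b \end{Vmatrix} \triangleright \begin{Vmatrix} f \\ g \end{Vmatrix} \Big) \]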
apply functions : = 𝑓 × 𝑔 → (𝑎 → 𝑎 ⊲ 𝑓 ) × (𝑏 → 𝑏 ⊲ 𝑔)
definition of ⊲ : = 𝑓 × 𝑔 → (𝑎 → 𝑓 (𝑎)) × (𝑏 → 𝑔(𝑏))
simplify functions : = ( 𝑓 × 𝑔 → 𝑓 × 𝑔) = id .
In this way, we have proved that 𝑓1 and 𝑓2 are mutual inverses. The proofs
appear long because we took time to motivate and introduce new notation for
applying matrices to row vectors. Once this notation is understood, a proof of
𝑓1 # 𝑓2 = id can be written as:
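\begin{align*}
f_1 \# f_2 &= \big( h \to (a \to (a + 0) \triangleright h) \times (b \to (0 + b) \triangleright h) \big) \# \Big( f \times g \to \begin{Vmatrix} f \\ g \end{Vmatrix} \Big) \\
\text{composition}: \quad &= h \to \begin{Vmatrix} a \to \begin{Vmatrix} a & 0 \end{Vmatrix} \triangleright h \\ b \to \begin{Vmatrix} 0 & b \end{Vmatrix} \triangleright h \end{Vmatrix} = \begin{Vmatrix} p \\ q \end{Vmatrix} \to \begin{Vmatrix} a \to \begin{Vmatrix} a & 0 \end{Vmatrix} \triangleright \begin{Vmatrix} p \\ q \end{Vmatrix} \\ b \to \begin{Vmatrix} 0 & b \end{Vmatrix} \triangleright \begin{Vmatrix} p \\ q \end{Vmatrix} \end{Vmatrix} \\
\text{apply functions}: \quad &= \begin{Vmatrix} p \\ q \end{Vmatrix} \to \begin{Vmatrix} a \to a \triangleright p \\ b \to b \triangleright q \end{Vmatrix} = \begin{Vmatrix} p \\ q \end{Vmatrix} \to \begin{Vmatrix} p \\ q \end{Vmatrix} = \text{id} .
\end{align*}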
Proofs in the code notation are shorter than in Scala syntax because certain names
and keywords (such as Left, Right, case, match, etc.) are omitted. From now on, we
will prefer to use the code notation in proofs, keeping in mind that one can always
convert the code notation to Scala.
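As an illustration of converting the code notation back to Scala, here is a sketch of both functions for the equivalence 𝐴 + 𝐵 → 𝐶 ≅ (𝐴 → 𝐶) × (𝐵 → 𝐶). It makes one assumption that differs from the two-argument form of f2 used earlier: f2 is written here as a function of a single tuple argument, so that f1 and f2 can be applied one after another. The sample functions h and pair are hypothetical.

```scala
// f1 splits a function on Either[A, B] into a pair of functions.
def f1[A, B, C]: (Either[A, B] => C) => (A => C, B => C) =
  h => ((a: A) => h(Left(a)), (b: B) => h(Right(b)))

// f2 combines a pair of functions into one function on Either[A, B].
def f2[A, B, C]: ((A => C, B => C)) => Either[A, B] => C = {
  case (f, g) => {
    case Left(a)  => f(a)
    case Right(b) => g(b)
  }
}

val h: Either[Int, String] => Int = {
  case Left(a)  => a + 1
  case Right(s) => s.length
}
val pair: (Int => Int, String => Int) = ((a: Int) => a * 2, (s: String) => s.length)

val h2   = f2[Int, String, Int](f1[Int, String, Int](h))    // f1 and then f2
val back = f1[Int, String, Int](f2[Int, String, Int](pair)) // f2 and then f1
val inputs: List[Either[Int, String]] = List(Left(10), Right("abc"))
```

Since functions cannot be compared with `==`, the results are compared pointwise: h2 agrees with h on all inputs, and the two functions in back agree with those in pair.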
The function arrow (→) binds weaker than the pipe operation (⊲), so the formula
𝑥 → 𝑦 ⊲ 𝑧 means 𝑥 → (𝑦 ⊲ 𝑧). We will review the pipe notation more systematically
in Chapter 7.
Example 5.3.4.7 Verify the type equivalence:
𝐴 × 𝐵 → 𝐶 ≅ 𝐴 → 𝐵 → 𝐶 .
The Scala code can be derived from the type signatures unambiguously:
def f1[A,B,C]: (((A, B)) => C) => A => B => C = g => a => b => g((a, b))
def f2[A,B,C]: (A => B => C) => ((A, B)) => C = h => { case (a, b) => h(a)(b) }
\begin{align*}
f_1 &= g^{:A \times B \to C} \to a^{:A} \to b^{:B} \to g(a \times b) , \\
f_2 &= h^{:A \to B \to C} \to (a \times b)^{:A \times B} \to h(a)(b) .
\end{align*}
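The two conversions can be tried out on sample functions. The sketch below repeats the definitions of f1 and f2 so that it is self-contained; the functions g and h are hypothetical examples (both repeat a string n times).

```scala
def f1[A, B, C]: (((A, B)) => C) => A => B => C = g => a => b => g((a, b))
def f2[A, B, C]: (A => B => C) => ((A, B)) => C = h => { case (a, b) => h(a)(b) }

// An uncurried function on a tuple and the corresponding curried function.
val g: ((Int, String)) => String = { case (n, s) => s * n }
val h: Int => String => String   = n => s => s * n

val curried   = f1[Int, String, String](g)
val uncurried = f2[Int, String, String](h)
val roundTrip = f2[Int, String, String](f1[Int, String, String](g))
```

With these definitions, curried(2)("ab"), uncurried((2, "ab")), and roundTrip((2, "ab")) all evaluate to "abab", the same result as g((2, "ab")).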
5.4 Summary
What tasks can we perform now?
• Convert a fully parametric type signature into a logical formula and:
– Decide whether the type signature can be implemented in code.
– If possible, derive the code using the CH correspondence.
• Use the type notation (Table 5.1) for reasoning about types to:
– Decide type equivalence using the rules in Tables 5.5–5.6.
– Simplify type expressions before writing code.
• Use the matrix notation and the pipe notation to write code that works on
disjunctive types.
What tasks cannot be performed with these tools?
• Automatically generate code for recursive functions. The CH correspondence
is based on propositional logic, which cannot describe recursion. Accord-
ingly, recursion is absent from the eight code constructions of Section 5.2.2.
Recursive functions need to be coded by hand.
• Automatically generate code satisfying a property (e.g., isomorphism). We
may generate some code, but the CH correspondence does not guarantee
that properties will hold. We need to verify the required properties manu-
ally, after deriving the code.
• Express complicated conditions (e.g., “array is sorted”) in a type signature.
This can be done using dependent types (i.e., types that directly depend on
values in some way). This is an advanced technique beyond the scope of this
book. Programming languages such as Coq, Agda, and Idris fully support
dependent types, while Scala has only limited support.
• Generate code using type constructors with known methods (e.g., the map
method).
As an example of using type constructors with known methods, consider this type
signature:
def q[A, B]: Array[A] => (A => Option[B]) => Array[Option[B]]
Can we generate the code of this function from its type signature? We know that
the Scala library defines a map method on the Array class. So, an implementation of
q is:
def q[A, B]: Array[A] => (A => Option[B]) => Array[Option[B]] = { arr => f =>
  arr.map(f) }
However, it is hard to create an algorithm that can derive this implementation au-
tomatically from the type signature of q via the Curry-Howard correspondence.
The algorithm would have to convert the type signature of q into this logical for-
mula:
\[ \mathcal{CH}(\text{Array}^{A}) \Rightarrow \mathcal{CH}(A \to \text{Opt}^{B}) \Rightarrow \mathcal{CH}(\text{Array}^{\text{Opt}^{B}}) . \qquad (5.22) \]
To derive an implementation, the algorithm would need to use the available map
method for Array. That method has the type signature:
To derive the CH -proposition (5.22), the algorithm will need to assume that the
CH -proposition:
\[ \mathcal{CH}\big( \forall (A, B).\ \text{Array}^{A} \to (A \to B) \to \text{Array}^{B} \big) \qquad (5.23) \]
already holds. In other words, Eq. (5.23) must be one of the premises of a sequent.
Reasoning about premises such as Eq. (5.23) requires first-order logic — a logic
whose proof rules can handle quantified types such as ∀( 𝐴, 𝐵) inside premises.
However, first-order logic is undecidable: no algorithm can find a proof (or verify
the absence of a proof) in all cases.
The constructive propositional logic with the rules listed in Table 5.2 is decid-
able, i.e., it has an algorithm that either finds a proof or disproves any given for-
mula. However, that logic cannot handle type constructors such as Array. It also
cannot handle premises containing type quantifiers such as ∀( 𝐴, 𝐵), because all
the available logic rules have the quantifiers placed outside the premises.
So, code for functions such as q can only be derived by trial and error, informed
by intuition. This book will help programmers to acquire the necessary intuition
and technique.
5.4.1 Examples
Example 5.4.1.1 Find the cardinality of the type P = Option[Option[Boolean] =>
Boolean]. Write P in the type notation.
Solution Begin with the type Option[Boolean], which can be either None or Some(x)
with some value x: Boolean. Because the type Boolean has 2 possible values, the
type Option[Boolean] has 3 possible values:
|1 + Boolean| = |1 + 2| = 1 + 2 = 3 .
|𝑃| = |1 + (1 + 2 → 2)| = 1 + |1 + 2 → 2| = 1 + 8 = 9 .
Example 5.4.1.2 Implement a Scala type P[A] given by this type notation:
$$ P^A \triangleq 1 + A + \text{Int} \times A + (\text{String} \to A) . $$
Solution To translate type notation into Scala code, begin by defining the
disjunctive types as case classes, choosing class names for convenience. In this case,
$P^A$ is a disjunctive type with four parts, so we need four case classes:
sealed trait P[A]
final case class P1[A](???) extends P[A]
final case class P2[A](???) extends P[A]
final case class P3[A](???) extends P[A]
final case class P4[A](???) extends P[A]
Each of the case classes represents one part of the disjunctive type. Now we write
the contents for each of the case classes, in order to implement the data in each of
the disjunctive parts:
sealed trait P[A]
final case class P1[A]() extends P[A]
final case class P2[A](x: A) extends P[A]
final case class P3[A](n: Int, x: A) extends P[A]
final case class P4[A](f: String => A) extends P[A]
Example 5.4.1.3 Find an equivalent disjunctive type for the type P = (Either[A,
B], Either[C, D]).
Solution Begin by writing the given type in the type notation. The tuple be-
comes a product type, and Either becomes a disjunctive type:
$$ P \triangleq (A + B) \times (C + D) . $$
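$$ f_2 \triangleq a^{:A} \to a + 0^{:A} = \begin{array}{|c||cc|} & A & A \\ \hline A & a^{:A} \to a & 0 \end{array} = \begin{array}{|c||cc|} & A & A \\ \hline A & \text{id} & 0 \end{array} . $$
The composition of these functions is not equal to identity:
$$ f_1 \,\#\, f_2 = \begin{Vmatrix} \text{id} \\ \text{id} \end{Vmatrix} \,\#\, \begin{Vmatrix} \text{id} & 0 \end{Vmatrix} = \begin{Vmatrix} \text{id} & 0 \\ \text{id} & 0 \end{Vmatrix} , \quad \text{while we have} \quad \text{id}^{:A+A \to A+A} = \begin{Vmatrix} \text{id} & 0 \\ 0 & \text{id} \end{Vmatrix} . $$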
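$$ f_1 \triangleq a_1^{:A} \times a_2^{:A} \to a_1 = \pi_1^{:A \times A \to A} , \qquad f_2 \triangleq a^{:A} \to a \times a = \Delta^{:A \to A \times A} . $$
$$ f_1 \,\#\, f_2 = (a_1 \times a_2 \to a_1) \,\#\, (a \to a \times a) = (a_1 \times a_2 \to a_1 \times a_1) \neq \text{id} = (a_1 \times a_2 \to a_1 \times a_2) . $$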
( 𝐴 ∧ 𝐵 ⇒ 𝐶) ⇒ ( 𝐴 ⇒ 𝐶) ∨ (𝐵 ⇒ 𝐶)
and (( 𝐴 ⇒ 𝐶) ∨ (𝐵 ⇒ 𝐶)) ⇒ (( 𝐴 ∧ 𝐵) ⇒ 𝐶) .
( 𝐴 × 𝐵 → 𝐶) → ( 𝐴 → 𝐶) + (𝐵 → 𝐶) and ( 𝐴 → 𝐶) + (𝐵 → 𝐶) → 𝐴 × 𝐵 → 𝐶 .
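$$ f_2 \triangleq \begin{array}{|c||c|} & A \times B \to C \\ \hline A \to C & g^{:A \to C} \to a \times b \to g(a) \\ B \to C & h^{:B \to C} \to a \times b \to h(b) \end{array} . $$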
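Let us now show that the logical identity:
((𝛼 ∧ 𝛽) ⇒ 𝛾) = ((𝛼 ⇒ 𝛾) ∨ (𝛽 ⇒ 𝛾)) (5.24)
holds in Boolean logic. Rewriting each implication via Eq. (5.8), ( 𝑝 ⇒ 𝑞) = (¬𝑝 ∨ 𝑞),
and applying the law of de Morgan, the left-hand side becomes ¬(𝛼 ∧ 𝛽) ∨ 𝛾 = ¬𝛼 ∨ ¬𝛽 ∨ 𝛾,
while the right-hand side becomes (¬𝛼 ∨ 𝛾) ∨ (¬𝛽 ∨ 𝛾) = ¬𝛼 ∨ ¬𝛽 ∨ 𝛾.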
Both sides of Eq. (5.24) are equal to the same formula, ¬𝛼 ∨ ¬𝛽 ∨ 𝛾, so the identity
holds.
This calculation does not work in the constructive logic because its proof rules
can derive neither the Boolean formula (5.8) nor the law of de Morgan, ¬(𝛼 ∧ 𝛽) =
(¬𝛼 ∨ ¬𝛽).
Another way of proving the Boolean identity (5.24) is to enumerate all possible
truth values for the variables 𝛼, 𝛽, and 𝛾. The left-hand side, (𝛼 ∧ 𝛽) ⇒ 𝛾, can be
𝐹𝑎𝑙𝑠𝑒 only if 𝛼 ∧ 𝛽 = 𝑇𝑟𝑢𝑒 (that is, both 𝛼 and 𝛽 are 𝑇𝑟𝑢𝑒) and 𝛾 = 𝐹𝑎𝑙𝑠𝑒. For all
other truth values of 𝛼, 𝛽, and 𝛾, the formula (𝛼 ∧ 𝛽) ⇒ 𝛾 is 𝑇𝑟𝑢𝑒. Let us determine
when the right-hand side, (𝛼 ⇒ 𝛾) ∨ (𝛽 ⇒ 𝛾), can be 𝐹𝑎𝑙𝑠𝑒. This can happen only
if both parts of the disjunction are 𝐹𝑎𝑙𝑠𝑒. That means 𝛼 = 𝑇𝑟𝑢𝑒, 𝛽 = 𝑇𝑟𝑢𝑒, and
𝛾 = 𝐹𝑎𝑙𝑠𝑒. So, the two sides of the identity (5.24) are both 𝑇𝑟𝑢𝑒 or both 𝐹𝑎𝑙𝑠𝑒
with any choice of truth values of 𝛼, 𝛽, and 𝛾. In Boolean logic, this is sufficient to
prove the identity (5.24).
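The truth-table argument can be mechanized. The sketch below is an illustration, not code from the book (the names BooleanIdentityCheck, implies, and identityHolds are hypothetical). It enumerates all eight assignments of truth values and compares the two sides, (𝛼 ∧ 𝛽) ⇒ 𝛾 and (𝛼 ⇒ 𝛾) ∨ (𝛽 ⇒ 𝛾):

```scala
object BooleanIdentityCheck {
  // Boolean implication: (p ⇒ q) = (¬p ∨ q).
  def implies(p: Boolean, q: Boolean): Boolean = !p || q

  val bools: Seq[Boolean] = Seq(false, true)

  // True if both sides agree for every choice of truth values.
  val identityHolds: Boolean =
    (for { a <- bools; b <- bools; c <- bools }
      yield implies(a && b, c) == (implies(a, c) || implies(b, c))).forall(x => x)
}
```

Such an exhaustive check is a valid proof only in Boolean logic, where every variable is either True or False; it says nothing about provability in the constructive logic.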
The following example shows how to use the formulas from Tables 5.5–5.6 to
derive the type equivalence of complicated type expressions without need for
proofs.
Example 5.4.1.6 Use known formulas to verify the type equivalences without
direct proofs:
(a) 𝐴 × ( 𝐴 + 1) × ( 𝐴 + 1 + 1) ≅ 𝐴 × (1 + 1 + 𝐴 × (1 + 1 + 1 + 𝐴)).
(b) 1 + 𝐴 + 𝐵 → 1 × 𝐵 ≅ (𝐵 → 𝐵) × ( 𝐴 → 𝐵) × 𝐵.
Solution (a) We can expand brackets in type expressions as in arithmetic:
𝐴 × ( 𝐴 + 1) ≅ 𝐴 × 𝐴 + 𝐴 × 1 ≅ 𝐴 × 𝐴 + 𝐴 ,
𝐴 × ( 𝐴 + 1) × ( 𝐴 + 1 + 1) ≅ ( 𝐴 × 𝐴 + 𝐴) × ( 𝐴 + 1 + 1)
≅ 𝐴 × 𝐴 × 𝐴 + 𝐴 × 𝐴 + 𝐴 × 𝐴 × (1 + 1) + 𝐴 × (1 + 1)
≅ 𝐴 × 𝐴 × 𝐴 + 𝐴 × 𝐴 × (1 + 1 + 1) + 𝐴 × (1 + 1) .
The result looks like a polynomial in 𝐴, which we can now rearrange into the
required form:
𝐴 × 𝐴 × 𝐴 + 𝐴 × 𝐴 × (1 + 1 + 1) + 𝐴 × (1 + 1) ≅ 𝐴 × (1 + 1 + 𝐴 × (1 + 1 + 1 + 𝐴)) .
(b) Keep in mind that the conventions of the type notation make the function
arrow (→) group weaker than other type operations. So, the type expression 1 +
𝐴 + 𝐵 → 1 × 𝐵 means a function from 1 + 𝐴 + 𝐵 to 1 × 𝐵.
Begin by using the equivalence 1 × 𝐵 ≅ 𝐵 to obtain 1 + 𝐴 + 𝐵 → 𝐵. Now we use
another rule:
𝐴 + 𝐵 → 𝐶 ≅ ( 𝐴 → 𝐶) × (𝐵 → 𝐶)
and derive the equivalence:
1 + 𝐴 + 𝐵 → 𝐵 ≅ (1 → 𝐵) × ( 𝐴 → 𝐵) × (𝐵 → 𝐵) .
Finally, we use the equivalence (1 → 𝐵) ≅ 𝐵 and the commutativity of the product:
𝐵 × ( 𝐴 → 𝐵) × (𝐵 → 𝐵) ≅ (𝐵 → 𝐵) × ( 𝐴 → 𝐵) × 𝐵 .
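A quick sanity check of both equivalences is to compare cardinalities. Equal cardinalities are a necessary condition for a type equivalence, not a proof of it. The sketch below uses hypothetical names (EquivalenceCardinalities, lhsA, rhsA, lhsB, rhsB) and the standard counting rules: sums add, products multiply, and a function type 𝐴 → 𝐵 has |𝐵| to the power |𝐴| values.

```scala
object EquivalenceCardinalities {
  // (a) Both sides as functions of n = |A|.
  def lhsA(n: Int): BigInt = BigInt(n) * (n + 1) * (n + 2)       // |A × (A + 1) × (A + 1 + 1)|
  def rhsA(n: Int): BigInt = BigInt(n) * (BigInt(n) * (n + 3) + 2) // |A × (1 + 1 + A × (1 + 1 + 1 + A))|

  // (b) Both sides as functions of a = |A| and b = |B|.
  def lhsB(a: Int, b: Int): BigInt = BigInt(b).pow(1 + a + b)    // |1 + A + B → 1 × B|
  def rhsB(a: Int, b: Int): BigInt =
    BigInt(b).pow(b) * BigInt(b).pow(a) * BigInt(b)              // |(B → B) × (A → B) × B|

  val sizes: Seq[Int] = 0 to 5

  // True if the cardinalities agree for all tested sizes of A and B.
  val allAgree: Boolean =
    sizes.forall(n => lhsA(n) == rhsA(n)) &&
      sizes.forall(a => sizes.forall(b => lhsB(a, b) == rhsB(a, b)))
}
```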
Example 5.4.1.8 Show that one cannot implement the type signature Reader[A,
T] => (A => B) => Reader[B, T] by a fully parametric function.
Solution Expand the type signature and try implementing this function:
def m[A, B, T] : (A => T) => (A => B) => B => T = { r => f => b => ??? }
Given values 𝑟 :𝐴→𝑇 , 𝑓 :𝐴→𝐵 , and 𝑏 :𝐵 , we need to compute a value of type 𝑇:
𝑚 = 𝑟 :𝐴→𝑇 → 𝑓 :𝐴→𝐵 → 𝑏 :𝐵 →???:𝑇 .
The only way of getting a value of type 𝑇 is to apply 𝑟 to some value of type 𝐴:
𝑚 = 𝑟 :𝐴→𝑇 → 𝑓 :𝐴→𝐵 → 𝑏 :𝐵 → 𝑟 (???:𝐴 ) .
However, we do not have any values of type 𝐴. We have a function 𝑓 :𝐴→𝐵 that
consumes values of type 𝐴, and we cannot use 𝑓 to produce any values of type
𝐴. So, it seems that we are unable to fill the typed hole ???:𝐴 and implement the
function m.
In order to verify that m is unimplementable, we need to prove that the logical
formula:
∀(𝛼, 𝛽, 𝜏). (𝛼 ⇒ 𝜏) ⇒ (𝛼 ⇒ 𝛽) ⇒ (𝛽 ⇒ 𝜏) (5.25)
is not true in the constructive logic. We could use the curryhoward library for that:
@ def m[A, B, T] : (A => T) => (A => B) => B => T = implement
cmd1.sc:1: type (A => T) => (A => B) => B => T cannot be implemented
def m[A, B, T] : (A => T) => (A => B) => B => T = implement
^
Compilation Failed
Another way is to check whether this formula is true in Boolean logic. A formula
that holds in constructive logic will always hold in Boolean logic, because all rules
shown in Section 5.2.3 preserve Boolean truth values (see Section 5.5.4 for a proof).
It follows that any formula that fails to hold in Boolean logic will also not hold in
constructive logic.
It is relatively easy to check whether a given Boolean formula is always equal
to 𝑇𝑟𝑢𝑒. Simplifying Eq. (5.25) with the rules of Boolean logic, we find:
(𝛼 ⇒ 𝜏) ⇒ (𝛼 ⇒ 𝛽) ⇒ (𝛽 ⇒ 𝜏)
use Eq. (5.8) : = ¬(𝛼 ⇒ 𝜏) ∨ ¬(𝛼 ⇒ 𝛽) ∨ (𝛽 ⇒ 𝜏)
use Eq. (5.8) : = ¬(¬𝛼 ∨ 𝜏) ∨ ¬(¬𝛼 ∨ 𝛽) ∨ (¬𝛽 ∨ 𝜏)
use de Morgan’s law : = (𝛼 ∧ ¬𝜏) ∨ (𝛼 ∧ ¬𝛽) ∨ ¬𝛽 ∨ 𝜏
use identity ( 𝑝 ∧ 𝑞) ∨ 𝑞 = 𝑞 : = (𝛼 ∧ ¬𝜏) ∨ ¬𝛽 ∨ 𝜏
use identity ( 𝑝 ∧ ¬𝑞) ∨ 𝑞 = 𝑝 ∨ 𝑞 : = 𝛼 ∨ ¬𝛽 ∨ 𝜏 .
This formula is not identically 𝑇𝑟𝑢𝑒: it is 𝐹𝑎𝑙𝑠𝑒 when 𝛼 = 𝜏 = 𝐹𝑎𝑙𝑠𝑒 and 𝛽 = 𝑇𝑟𝑢𝑒.
So, Eq. (5.25) is not true in Boolean logic, therefore it is also not true in constructive
logic. By the CH correspondence, we conclude that the type signature of m cannot
be implemented by a fully parametric function.
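The Boolean check of Eq. (5.25) can also be done by enumeration instead of symbolic simplification. This is a sketch with hypothetical names (Formula525Check, formula, counterexamples); it lists every assignment of (𝛼, 𝛽, 𝜏) for which the formula is False:

```scala
object Formula525Check {
  // Boolean implication: (p ⇒ q) = (¬p ∨ q).
  def implies(p: Boolean, q: Boolean): Boolean = !p || q

  val bools: Seq[Boolean] = Seq(false, true)

  // The Boolean formula (α ⇒ τ) ⇒ (α ⇒ β) ⇒ (β ⇒ τ); implication groups to the right.
  def formula(a: Boolean, b: Boolean, t: Boolean): Boolean =
    implies(implies(a, t), implies(implies(a, b), implies(b, t)))

  // All assignments (α, β, τ) that make the formula False.
  val counterexamples: Seq[(Boolean, Boolean, Boolean)] =
    for { a <- bools; b <- bools; t <- bools if !formula(a, b, t) } yield (a, b, t)
}
```

The only counterexample found this way is 𝛼 = False, 𝛽 = True, 𝜏 = False, in agreement with the simplified formula 𝛼 ∨ ¬𝛽 ∨ 𝜏.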
Example 5.4.1.9 Define the type constructor $P^A \triangleq 1 + A + A$ and implement map
for it, with the type signature $\text{map}^{A,B} : P^A \to (A \to B) \to P^B$. To check that map
preserves information, verify the law map(p)(x => x) == p for all p: P[A].
Solution It is implied that map should be fully parametric and information-preserving.
Begin by defining a Scala type constructor for the notation $P^A \triangleq 1 + A + A$:
sealed trait P[A]
final case class P1[A]() extends P[A]
final case class P2[A](x: A) extends P[A]
final case class P3[A](x: A) extends P[A]
Now we can write code to implement the required type signature. Each time we
have several choices of an implementation, we will choose to preserve informa-
tion as much as possible.
def map[A, B]: P[A] => (A => B) => P[B] =
p => f => p match {
case P1() => P1() // No other choice.
case P2(x) => ???
case P3(x) => ???
}
In the case P2(x), we are required to produce a value of type $P^B$ from a value
$x^{:A}$ and a function $f^{:A \to B}$. Since $P^B$ is a disjunctive type with three parts, we can
produce a value of type $P^B$ in three different ways: P1(), P2(...), and P3(...). If we
return P1(), we will lose the information about the value x. If we return P3(...),
we will preserve the information about x but lose the information that the input
value was a P2 rather than a P3. By returning P2(...) in that scope, we preserve the
entire input information.
The value under P2(...) must be of type 𝐵, and the only way of getting a value
of type 𝐵 is to apply 𝑓 to 𝑥. So, we return P2(f(x)).
Similarly, in the case P3(x), we should return P3(f(x)). The final code of map is:
def map[A, B]: P[A] => (A => B) => P[B] = p => f => p match {
case P1() => P1() // No other choice here.
case P2(x) => P2(f(x)) // Preserve information.
case P3(x) => P3(f(x)) // Preserve information.
}
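Before proving the law symbolically, it can be spot-checked on sample values. The sketch below repeats the definitions from this example inside a hypothetical object MapLawCheck and tests the identity law on one value from each part of the disjunction. Type parameters are written explicitly at the call site so that the example does not depend on type inference for a method without value arguments.

```scala
object MapLawCheck {
  sealed trait P[A]
  final case class P1[A]() extends P[A]
  final case class P2[A](x: A) extends P[A]
  final case class P3[A](x: A) extends P[A]

  def map[A, B]: P[A] => (A => B) => P[B] = p => f => p match {
    case P1()  => P1()        // No other choice here.
    case P2(x) => P2(f(x))    // Preserve information.
    case P3(x) => P3(f(x))    // Preserve information.
  }

  // One sample value for each part of the disjunctive type.
  val samples: Seq[P[Int]] = Seq(P1[Int](), P2(10), P3(20))

  // The identity law: map(p)(x => x) == p.
  val identityLawHolds: Boolean =
    samples.forall(p => map[Int, Int](p)(x => x) == p)
}
```

A test on sample values cannot replace the symbolic proof, but it would immediately expose an implementation that returns, say, P3(f(x)) in the P2 case.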
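To verify the given law, we first write a matrix notation for map:
$$ \text{map}^{A,B} \triangleq p^{:1+A+A} \to f^{:A \to B} \to p \triangleright \begin{array}{|c||ccc|} & 1 & B & B \\ \hline 1 & \text{id} & 0 & 0 \\ A & 0 & f & 0 \\ A & 0 & 0 & f \end{array} . $$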
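The required law is written as an equation map(𝑝)(id) = 𝑝, called the identity
law. Substituting the code notation for map, we verify the law:
$$ \text{map}(p)(\text{id}) = p \triangleright \begin{array}{|c||ccc|} & 1 & A & A \\ \hline 1 & \text{id} & 0 & 0 \\ A & 0 & \text{id} & 0 \\ A & 0 & 0 & \text{id} \end{array} = p \triangleright \text{id} = p . $$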
Example 5.4.1.10 Implement map and flatMap for Either[L, R], applied to the type
parameter L.
Solution For a type constructor, say, $P$, the standard type signatures for map
and flatMap are:
$$ \text{map} : P^A \to (A \to B) \to P^B , \qquad \text{flatMap} : P^A \to (A \to P^B) \to P^B . $$
If a type constructor has more than one type parameter, e.g., $P^{A,S,T}$, one can define
the functions map and flatMap applied to a chosen type parameter. For example,
when applied to the type parameter $A$, the type signatures are:
$$ \text{map} : P^{A,S,T} \to (A \to B) \to P^{B,S,T} , \qquad \text{flatMap} : P^{A,S,T} \to (A \to P^{B,S,T}) \to P^{B,S,T} . $$
Being “applied to the type parameter $A$” means that the other type parameters $S$, $T$
in $P^{A,S,T}$ remain fixed while the type parameter $A$ is replaced by $B$ in the type
signatures of map and flatMap.
For the type Either[L, R] (in the type notation, 𝐿 + 𝑅), we keep the type param-
eter 𝑅 fixed while 𝐿 is replaced by 𝑀. So, we obtain the type signatures:
map : 𝐿 + 𝑅 → (𝐿 → 𝑀) → 𝑀 + 𝑅 ,
flatMap : 𝐿 + 𝑅 → (𝐿 → 𝑀 + 𝑅) → 𝑀 + 𝑅 .
def flatMap[L, M, R]: Either[L, R] => (L => Either[M, R]) => Either[M, R] =
  e => f => e match {
    case Left(x)  => f(x)
    case Right(y) => Right(y)
  }
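The map function with the signature 𝐿 + 𝑅 → (𝐿 → 𝑀) → 𝑀 + 𝑅 can be written in the same style. The following is a sketch consistent with that type signature rather than a listing quoted from the text; the object name EitherLeftOps is hypothetical, and flatMap is repeated so that the block is self-contained.

```scala
object EitherLeftOps {
  // map applied to the type parameter L: transform a Left value, keep a Right value unchanged.
  def map[L, M, R]: Either[L, R] => (L => M) => Either[M, R] =
    e => f => e match {
      case Left(x)  => Left(f(x))
      case Right(y) => Right(y)
    }

  // flatMap applied to the type parameter L: the function f decides between Left and Right.
  def flatMap[L, M, R]: Either[L, R] => (L => Either[M, R]) => Either[M, R] =
    e => f => e match {
      case Left(x)  => f(x)
      case Right(y) => Right(y)
    }
}
```

For instance, EitherLeftOps.map[Int, String, Boolean](Left(1))(_.toString) evaluates to Left("1"), while a Right(true) argument is returned unchanged.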
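$$ \text{map} \triangleq e^{:L+R} \to f^{:L \to M} \to e \triangleright \begin{array}{|c||cc|} & M & R \\ \hline L & f & 0 \\ R & 0 & \text{id} \end{array} , $$
$$ \text{flatMap} \triangleq e^{:L+R} \to f^{:L \to M+R} \to e \triangleright \begin{array}{|c||c|} & M + R \\ \hline L & f \\ R & y^{:R} \to 0^{:M} + y \end{array} . $$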
Note that the code matrix for flatMap cannot be split into the 𝑀 and 𝑅 columns
because we do not know in advance which part of the disjunctive type 𝑀 + 𝑅 will
be returned when we evaluate 𝑓 (𝑥 :𝐿 ).
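Example 5.4.1.11 Define a type constructor $\text{State}^{S,A} \equiv S \to A \times S$ and implement
the functions:
(a) $\text{pure}^{S,A} : A \to \text{State}^{S,A}$.
(b) $\text{map}^{S,A,B} : \text{State}^{S,A} \to (A \to B) \to \text{State}^{S,B}$.
(c) $\text{flatMap}^{S,A,B} : \text{State}^{S,A} \to (A \to \text{State}^{S,B}) \to \text{State}^{S,B}$.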
Solution It is assumed that all functions must be fully parametric and pre-
serve as much information as possible. We define the type alias:
type State[S, A] = S => (A, S)
(a) The type signature is 𝐴 → 𝑆 → 𝐴 × 𝑆, and there is only one implementation:
def pure[S, A]: A => State[S, A] = a => s => (a, s)
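In the code notation, this is written as:
$$ \text{pu}^{S,A} \triangleq a^{:A} \to s^{:S} \to a \times s . $$
(b) The required type signature is:
$$ \text{map}^{S,A,B} : (S \to A \times S) \to (A \to B) \to S \to B \times S . $$
We perform code reasoning with typed holes:
$$ \text{map} \triangleq t^{:S \to A \times S} \to f^{:A \to B} \to s^{:S} \to \text{???}^{:B} \times \text{???}^{:S} . $$
The only way of getting a value of type $B$ is to apply $f$ to some value of type $A$:
$$ \text{map} \triangleq t^{:S \to A \times S} \to f^{:A \to B} \to s^{:S} \to f(\text{???}^{:A}) \times \text{???}^{:S} . $$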
The only possibility of filling the typed hole ???:𝐴 is to apply 𝑡 to a value of type
𝑆. We already have such a value, 𝑠 :𝑆 . Computing 𝑡 (𝑠) yields a pair of type 𝐴 × 𝑆,
from which we may take the first part (of type 𝐴) to fill the typed hole ???:𝐴 . The
second part of the pair is a value of type 𝑆 that we may use to fill the second typed
hole, ???:𝑆 . So, the Scala code is:
1 def map[S, A, B]: State[S, A] => (A => B) => State[S, B] = {
2 t => f => s =>
3 val (a, s2) = t(s)
4 (f(a), s2) // We could also return `(f(a), s)` here.
5 }
Why not return the original value s in the tuple 𝐵 × 𝑆, instead of the new value
s2? The reason is that we would like to preserve information as much as possible.
If we return (f(a), s) in line 4, we will have discarded the computed value s2,
which is a loss of information.
To write the code notation for map, we need to destructure the pair that $t(s)$
returns. We can write explicit destructuring code like this:
$$ \text{map} \triangleq t^{:S \to A \times S} \to f^{:A \to B} \to s^{:S} \to (a^{:A} \times s_2^{:S} \to f(a) \times s_2)(t(s)) . $$
If we temporarily denote by $q$ the following destructuring function:
$$ q \triangleq (a^{:A} \times s_2^{:S} \to f(a) \times s_2) , $$
we will notice that the expression $s \to q(t(s))$ is a function composition applied
to $s$. So, we rewrite $s \to q(t(s))$ as the composition $t \,\#\, q$ and obtain shorter code:
$$ \text{map} \triangleq t^{:S \to A \times S} \to f^{:A \to B} \to t \,\#\, (a^{:A} \times s^{:S} \to f(a) \times s) . $$
Shorter formulas are often easier to reason about in derivations, although not nec-
essarily easier to read when converted to program code.
(c) The required type signature is:
$$ \text{flatMap}^{S,A,B} : (S \to A \times S) \to (A \to S \to B \times S) \to S \to B \times S . $$
We perform code reasoning with typed holes:
$$ \text{flatMap} \triangleq t^{:S \to A \times S} \to f^{:A \to S \to B \times S} \to s^{:S} \to \text{???}^{:B \times S} . $$
To fill $\text{???}^{:B \times S}$, we need to apply $f$ to some arguments, since $f$ is the only function
that returns any values of type $B$. Applying $f$ to two values will yield a value of
type $B \times S$, just as we need:
$$ \text{flatMap} \triangleq t^{:S \to A \times S} \to f^{:A \to S \to B \times S} \to s^{:S} \to f(\text{???}^{:A})(\text{???}^{:S}) . $$
To fill the new typed holes, we need to apply $t$ to an argument of type $S$. We have
only one given value $s^{:S}$ of type $S$, so we must compute $t(s)$ and destructure it:
$$ \text{flatMap} \triangleq t^{:S \to A \times S} \to f^{:A \to S \to B \times S} \to s^{:S} \to (a \times s_2 \to f(a)(s_2))(t(s)) . $$
Translating this notation into Scala code, we obtain:
def flatMap[S, A, B]: State[S, A] => (A => State[S, B]) => State[S, B] = {
  t => f => s =>
    val (a, s2) = t(s)
    // We could also return `f(a)(s)` here, but that would lose information.
    f(a)(s2)
}
In order to preserve information, we choose not to discard the computed value s2.
The code notation for this flatMap can be simplified to:
$$ \text{flatMap} \triangleq t^{:S \to A \times S} \to f^{:A \to S \to B \times S} \to t \,\#\, (a \times s \to f(a)(s)) . $$
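To see how the three functions work together, here is a small usage sketch. The definitions of pure, map, and flatMap follow this example; the object name StateDemo, the counter-like value next, and the combined value twice are hypothetical and serve only as an illustration. The state is an Int counter that each step reads and increments.

```scala
object StateDemo {
  type State[S, A] = S => (A, S)

  def pure[S, A]: A => State[S, A] = a => s => (a, s)

  def map[S, A, B]: State[S, A] => (A => B) => State[S, B] = t => f => s => {
    val (a, s2) = t(s)
    (f(a), s2)            // Keep the updated state s2.
  }

  def flatMap[S, A, B]: State[S, A] => (A => State[S, B]) => State[S, B] = t => f => s => {
    val (a, s2) = t(s)
    f(a)(s2)              // Pass the updated state s2 to the next step.
  }

  // Return the current counter value and increment the state.
  val next: State[Int, Int] = s => (s, s + 1)

  // Run `next` twice and pair up the two results.
  val twice: State[Int, (Int, Int)] =
    flatMap[Int, Int, (Int, Int)](next)(x => map[Int, Int, (Int, Int)](next)(y => (x, y)))

  // Start with the counter equal to 10.
  val result: ((Int, Int), Int) = twice(10)
}
```

Here result equals ((10, 11), 12): the second step sees the state produced by the first one. Had flatMap returned f(a)(s), both steps would have read the same counter value 10, which illustrates the loss of information discussed above.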
5.4.2 Exercises
Exercise 5.4.2.1 Find the cardinality of the following Scala type:
type P = Option[Boolean => Option[Boolean]]
Show that P is equivalent to Option[Boolean] => Boolean, but the equivalence is ac-
cidental and not “natural”.
Exercise 5.4.2.2 Verify the type equivalences 𝐴 + 𝐴 ≅ 2 × 𝐴 and 𝐴 × 𝐴 ≅ 2 → 𝐴,
where 2 denotes the Boolean type.
Exercise 5.4.2.3 Show that 𝛼 ⇒ (𝛽 ∨ 𝛾) ≠ (𝛼 ⇒ 𝛽) ∧ (𝛼 ⇒ 𝛾) in constructive and
Boolean logic.
Exercise 5.4.2.4 Verify the type equivalence ( 𝐴 → 𝐵 × 𝐶) ≅ ( 𝐴 → 𝐵) × ( 𝐴 → 𝐶)
with full proofs.
Exercise 5.4.2.5 Use known rules to verify the type equivalences without need
for proofs:
(a) ( 𝐴 + 𝐵) × ( 𝐴 → 𝐵) ≅ 𝐴 × ( 𝐴 → 𝐵) + (1 + 𝐴 → 𝐵) .
(b) ( 𝐴 × (1 + 𝐴) → 𝐵) ≅ ( 𝐴 → 𝐵) × ( 𝐴 → 𝐴 → 𝐵) .
(c) 𝐴 → (1 + 𝐵) → 𝐶 × 𝐷 ≅ ( 𝐴 → 𝐶) × ( 𝐴 → 𝐷) × ( 𝐴 × 𝐵 → 𝐶) × ( 𝐴 × 𝐵 → 𝐷) .
Exercise 5.4.2.6 Write the type notation for Either[(A, Int), Either[(A, Char),
(A, Float)]]. Transform this type into an equivalent type of the form 𝐴 × (...).
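Exercise 5.4.2.7 Define a type $\text{OptE}^{T,A} \triangleq 1 + T + A$ and implement
information-preserving map and flatMap for it, applied to the type parameter $A$. Get the same
result using the equivalent type $(1 + A) + T$, i.e., Either[Option[A], T]. The required
type signatures are:
$$ \text{map}^{A,B,T} : \text{OptE}^{T,A} \to (A \to B) \to \text{OptE}^{T,B} , $$
$$ \text{flatMap}^{A,B,T} : \text{OptE}^{T,A} \to (A \to \text{OptE}^{T,B}) \to \text{OptE}^{T,B} . $$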
Exercise 5.4.2.8 Implement the map function for the type constructor P from
Example 5.4.1.2. The required type signature is $P^A \to (A \to B) \to P^B$. Preserve
information as much as possible.
Exercise 5.4.2.9 For the type constructor 𝑄 defined in Exercise 5.1.5.1, define the
map function, preserving information as much as possible:
Although tools such as the curryhoward library can sometimes derive code from
types, it is beneficial if a programmer is able to derive an implementation by hand
or to determine that an implementation is impossible. For instance, the program-
mer should recognize that the type signature:
def f[A, B]: A => (A => B) => B
has only one fully parametric implementation, while the following two type sig-
natures have none:
def g[A, B]: A => (B => A) => B
def h[A, B]: ((A => B) => A) => B
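For the first signature, the implementation can be found by the same typed-hole reasoning as in the examples of this chapter: the only way to obtain a value of type 𝐵 is to apply the given function to the given value of type 𝐴. A minimal sketch (the object name ParametricDemo and the argument names a, k are hypothetical):

```scala
object ParametricDemo {
  // The only fully parametric implementation: apply the given function k to the given value a.
  def f[A, B]: A => (A => B) => B = a => k => k(a)
}
```

For example, ParametricDemo.f[Int, String](42)(_.toString) evaluates to "42". The signatures of g and h offer no such route: in g, the available function consumes a value of type 𝐵 instead of producing one, and in h, no value of type 𝐴 or 𝐵 is available at all.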
Exercises in this chapter help to build up the required technique and intuition. The
two main guidelines for code derivation are: “values of parametric types cannot
be constructed from scratch” and “one must hard-code the decision to return a
chosen part of a disjunctive type when no other disjunctive value is given”. These
guidelines can be justified by referring to the rules of proof in Table 5.2. Sequents
producing a value of type 𝐴 can be proved only if there is a premise containing
𝐴 or a function that returns a value of type 𝐴.13 One can derive a disjunction
without hard-coding only if one already has a disjunction in the premises (and
then the rule “use Either” could apply).
Throughout this chapter, we require all code to be fully parametric. This is
because the CH correspondence gives useful, non-trivial results only for param-
eterized types and fully parametric code. For concrete, non-parameterized types
(Int, String, etc.), one can always produce some values even with no previous data.
So, the propositions CH (Int) or CH (String) are always true.
Consider the function (x: Int) => x + 1. Its type signature, Int => Int, may be
implemented by many other functions, such as x => x - 1, x => x * 2, etc. So, the
type signature Int => Int is insufficient to specify the code of the function, and
deriving code from that type is not a meaningful task. Only a fully parametric
type signature, such as 𝐴 → ( 𝐴 → 𝐵) → 𝐵, could give enough information for de-
riving the function’s code. Additionally, we must require the code of functions to
be fully parametric. Otherwise we will be unable to reason about code derivation
from type signatures.
Validity of a CH -proposition CH (𝑇) means that we can implement some value
of the given type 𝑇. But this does not give any information about the properties
of that value, such as whether it satisfies any laws. This is why type equivalence
(which requires the laws of isomorphisms) is not determined by an equivalence
of logical formulas.
13 This is proved rigorously by R. Dyckhoff as the “Theorem” in section 6 (“Goal-directed pruning”),
see https://research-repository.st-andrews.ac.uk/handle/10023/8824
It is useful for programmers to be able to transform type expressions to equivalent
simpler types before starting to write code. The type notation introduced
in this book is designed to help programmers to recognize patterns in type
expressions and to reason about them more easily. We have shown that a type
equivalence corresponds to each standard arithmetic identity such as (𝑎 + 𝑏) + 𝑐 =
𝑎 + (𝑏 + 𝑐), (𝑎 × 𝑏) × 𝑐 = 𝑎 × (𝑏 × 𝑐), 1 × 𝑎 = 𝑎, (𝑎 + 𝑏) × 𝑐 = 𝑎 × 𝑐 + 𝑏 × 𝑐, and so on.
Because of this, we are allowed to transform and simplify types as if they were
arithmetic expressions, e.g., to rewrite:
1 × ( 𝐴 + 𝐵) × 𝐶 + 𝐷 ≅ 𝐷 + 𝐴 × 𝐶 + 𝐵 × 𝐶 .
The type notation makes this reasoning more intuitive.
These results apply to all type expressions built up using product types, disjunc-
tive types (also called “sum” types because they correspond to arithmetic sums),
and function types (also called “exponential” types because they correspond to
arithmetic exponentials). Type expressions that contain only products and sum
types are called polynomial.14 Type expressions that also contain function types
are called exponential-polynomial. We focus on exponential-polynomial types
because they are sufficient for almost all design patterns used in functional pro-
gramming.
There are no types corresponding to subtraction or division, so arithmetic equations
such as:
$$ (1 - t) \times (1 + t) = 1 - t \times t , \quad \text{and} \quad \frac{t + t \times t}{t} = 1 + t , $$
do not directly yield any type equivalences. However, consider this well-known
formula:
$$ \frac{1}{1 - t} = 1 + t + t^2 + t^3 + ... + t^n + ... . $$
At first sight, this formula appears to involve subtraction, division, and an infinite
series, and so cannot be directly translated into a type equivalence. However, the
formula can be rewritten as:
$$ \frac{1}{1 - t} = L(t) \quad \text{where} \quad L(t) \triangleq 1 + t + t^2 + t^3 + ... + t^n \times L(t) . \qquad (5.26) $$
The definition of $L(t)$ is finite and only contains additions and multiplications. So,
Eq. (5.26) can be translated into a type equivalence:
$$ L^A \cong 1 + A + A \times A + A \times A \times A + ... + \underbrace{A \times ... \times A}_{n \text{ times}} \times L^A . \qquad (5.27) $$
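In the simplest case 𝑛 = 1, Eq. (5.27) reads 𝐿^𝐴 ≅ 1 + 𝐴 × 𝐿^𝐴, which has the shape of a recursive type definition. The following sketch shows one way of writing such a type in Scala. The names RecursiveListDemo, L, Empty, Cons, and toList are hypothetical, and the conversion to the standard List is included only to show that values of this type are finite sequences.

```scala
object RecursiveListDemo {
  // L[A] ≅ 1 + A × L[A]: either an empty value, or a value of type A together with another L[A].
  sealed trait L[A]
  final case class Empty[A]() extends L[A]
  final case class Cons[A](head: A, tail: L[A]) extends L[A]

  // Convert a value of type L[A] to a standard List[A].
  def toList[A](l: L[A]): List[A] = l match {
    case Empty()          => Nil
    case Cons(head, tail) => head :: toList(tail)
  }

  // A value with three elements, corresponding to the part A × A × A of the expansion.
  val example: L[Int] = Cons(1, Cons(2, Cons(3, Empty[Int]())))
}
```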
5.5 Discussion and further developments
• For each axiom and proof rule of the logic, provide a code construction in
the language.
Mathematicians have studied different logics, such as modal logic, temporal logic,
or linear logic. Compared with the constructive logic, those other logics have
some additional type operations. For instance, modal logic adds the operations
“necessarily” and “possibly”, and temporal logic adds the operation “until”. For
each logic, mathematicians have determined the minimal complete sets of oper-
ations, axioms, and proof rules that do not lead to inconsistency. Programming
language designers can use this mathematical knowledge by choosing a logic and
translating it into a minimal “core” of a programming language. Code in that
language will be guaranteed never to crash as long as all types match. This math-
ematical guarantee (known as type safety) is a powerful help for programmers
since it automatically prevents a large number of coding errors. So, programmers
will benefit if they use languages designed using the CH correspondence.
Practically useful programming languages will of course need more features
than the minimal set of mathematically necessary features derived from a chosen
logic. Language designers need to make sure that all added features are consistent
with the core language.
At present, it is still not fully understood how a practical programming lan-
guage could use, say, modal or linear logic as its logic of types. Experience sug-
gests that, at least, the operations of the plain constructive logic should be avail-
able. So, it appears that the six type constructions and the eight code constructions
will remain available in all future languages of functional programming.
It is possible to apply the FP paradigm while writing code in any program-
ming language. However, some languages lack certain features that make FP tech-
niques easier to use in practice. For example, in a language such as C++ or Java,
one can easily use the map/reduce operations but not disjunctive types. More ad-
vanced FP constructions (such as typeclasses) are impractical in those languages:
the required code becomes too hard to read and to write without errors, which
negates the advantages of rigorous reasoning about functional programs.
Some programming languages, such as Haskell and OCaml, were designed
specifically for advanced use and exploration of the FP paradigm. Other lan-
guages, such as F#, Scala, Swift, and Rust, have different design goals but still
support enough FP features to be considered FP languages. This book uses Scala,
but the same constructions may be implemented in other FP languages in a simi-
lar way. Differences between OCaml, Haskell, F#, Scala, Swift, Rust, and other FP
languages do not play a significant role at the level of detail needed in this book.
The else branch does not return a value, but x is declared to have type Double. For
this code to type-check, both branches must return values of the same type. So,
the compiler needs to pretend that the else branch also returns a value of type
Double. The compiler first assigns the type Nothing to the expression throw ... and
then automatically uses the conversion Nothing => Double to convert that type to
Double. In this way, types will match in the definition of the value x.
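A minimal sketch of the situation described here, assuming a hypothetical function safeSqrt and a hypothetical error message (the names are illustrative only):

```scala
object NothingDemo {
  def safeSqrt(t: Double): Double = {
    // The `else` branch is a `throw` expression of type Nothing.
    // It is accepted where a Double is expected, so both branches type-check.
    val x: Double = if (t >= 0.0) math.sqrt(t) else throw new Exception("negative argument")
    x
  }
}
```

Calling NothingDemo.safeSqrt(4.0) returns 2.0, while NothingDemo.safeSqrt(-1.0) throws the exception and never produces a value for x.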
This book does not discuss exceptions in much detail. The functional program-
ming paradigm does not use exceptions because their presence prevents mathe-
matical reasoning about code.
As another example of using the void type, suppose an external library imple-
ments a function:
def parallel_run[E, A, B](f: A => Either[E, B]): Either[E, B] = ???
Returning an error is now impossible (the type Nothing has no values). If the func-
tion parallel_run is fully parametric, it will work in the same way with all types 𝐸,
including 𝐸 = 0. The code implements our intention via type parameters, giving
a compile-time guarantee of correct results.
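The consequence of setting 𝐸 = 0 can be illustrated without implementing parallel_run itself. In the sketch below, the names NoErrorDemo, getResult, and task are hypothetical. A value of type Either[Nothing, B] can only be a Right, so the result can be extracted without any error handling:

```scala
object NoErrorDemo {
  // The Left case can never occur at run time because the type Nothing has no values.
  // Its body still type-checks: a value of type Nothing is accepted where a B is expected.
  def getResult[B](e: Either[Nothing, B]): B = e match {
    case Right(b)         => b
    case Left(impossible) => impossible
  }

  // A hypothetical computation whose type says that it cannot return an error.
  val task: Int => Either[Nothing, String] = n => Right(n.toString)
}
```

For example, NoErrorDemo.getResult(NoErrorDemo.task(5)) evaluates to "5".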
So far, none of our examples involved the logical negation operation. It is de-
fined as:
$$ \neg\alpha \triangleq (\alpha \Rightarrow False) . $$
Its practical use in functional programming is as limited as that of 𝐹𝑎𝑙𝑠𝑒 and the
void type. (The type corresponding to 𝛼 ⇒ 𝐹𝑎𝑙𝑠𝑒 is the function type 𝐴 → 0 or, in
Scala, A => Nothing.) However, logical negation plays an important role in Boolean
logic.
Solution The result type of bad5 is a disjunctive type, but the argument type is
not. The code of bad5 cannot pattern-match on its argument f to make a decision
about returning a Left or a Right. So, bad5 must hard-code that decision and either
always return a value of type ( 𝐴 → 0) + 0:𝐵 , or always return a value of type
0:𝐴→0 + 𝐵. A value of type 𝐵 cannot be computed from scratch because the type 𝐵
is unknown. The only way of getting a value of type 𝐵 would be by computing
f(x) for some x: A, but no values of type 𝐴 are available. The remaining possibility
is to return a value of type 𝐴 → 0. But the type 𝐴 → 0 is void unless 𝐴 = 0 (see
Example 5.3.4.3). As the type 𝐴 is unknown, we cannot write code that works in
the same way for all types 𝐴 and produces a value of type 𝐴 → 0. So, the function
bad5 cannot be implemented via fully parametric code.
Nevertheless, as we will now show, any theorem of constructive logic is also a
theorem of Boolean logic. The reason is that all eight rules of constructive logic
(Section 5.2.3) also hold in Boolean logic.
Table 5.7: Proof rules of constructive logic are true also in the Boolean logic.
To verify that a formula is true in Boolean logic, it is sufficient to check that
the value of the formula is 𝑇𝑟𝑢𝑒 for all possible truth values (𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒) of its
variables. A sequent such as 𝛼, 𝛽 ⊢ 𝛾 is true in Boolean logic if and only if 𝛾 = 𝑇𝑟𝑢𝑒
under the assumption that 𝛼 = 𝛽 = 𝑇𝑟𝑢𝑒. So, the sequent 𝛼, 𝛽 ⊢ 𝛾 is translated into
the Boolean formula:
(𝛼, 𝛽 ⊢ 𝛾) = ((𝛼 ∧ 𝛽) ⇒ 𝛾) = (¬𝛼 ∨ ¬𝛽 ∨ 𝛾) .
Table 5.7 translates all proof rules of Section 5.2.3 into Boolean formulas. The first
two lines are axioms, while the subsequent lines are Boolean theorems that can be
verified by calculation.
To simplify the calculations, note that all terms in the formulas contain the op-
eration (¬Γ ∨ ...) corresponding to the context Γ. Now, if Γ is 𝐹𝑎𝑙𝑠𝑒, the entire
formula becomes automatically 𝑇𝑟𝑢𝑒, and there is nothing else to check. So, it
remains to verify the formula in case Γ = 𝑇𝑟𝑢𝑒, and then we can simply omit all
instances of ¬Γ in the formulas. Let us show the Boolean derivations for the rules
“use function” and “use Either”; other formulas are checked in a similar way:
formula “use function” : (𝛼 ∧ (𝛼 ⇒ 𝛽)) ⇒ 𝛽
use Eq. (5.8) : = ¬(𝛼 ∧ (¬𝛼 ∨ 𝛽)) ∨ 𝛽
de Morgan’s laws : = ¬𝛼 ∨ (𝛼 ∧ ¬𝛽) ∨ 𝛽
use identity ( 𝑝 ∧ ¬𝑞) ∨ 𝑞 = 𝑝 ∨ 𝑞 : = ¬𝛼 ∨ 𝛼 ∨ 𝛽 = 𝑇𝑟𝑢𝑒 .
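The Boolean formulas for the proof rules can also be verified by enumerating truth values. The sketch below uses hypothetical names (ProofRuleCheck, useFunction, useEither). The formula for “use Either” is written here in an assumed form with the context Γ already set to True, namely ((𝛼 ∨ 𝛽) ∧ (𝛼 ⇒ 𝛾) ∧ (𝛽 ⇒ 𝛾)) ⇒ 𝛾, which may differ in presentation from the corresponding line of Table 5.7.

```scala
object ProofRuleCheck {
  // Boolean implication: (p ⇒ q) = (¬p ∨ q).
  def implies(p: Boolean, q: Boolean): Boolean = !p || q

  val bools: Seq[Boolean] = Seq(false, true)

  // "use function": (α ∧ (α ⇒ β)) ⇒ β
  val useFunction: Boolean =
    bools.forall(a => bools.forall(b => implies(a && implies(a, b), b)))

  // "use Either" (assumed form): ((α ∨ β) ∧ (α ⇒ γ) ∧ (β ⇒ γ)) ⇒ γ
  val useEither: Boolean =
    bools.forall(a => bools.forall(b => bools.forall(c =>
      implies((a || b) && implies(a, c) && implies(b, c), c))))
}
```

Both values evaluate to true, confirming that these two formulas hold for all truth values of their variables.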
Since each proof rule of the constructive logic is translated into a true formula in
Boolean logic, it follows that a proof tree in the constructive logic will be translated
into a tree of Boolean formulas that have value 𝑇𝑟𝑢𝑒 for each axiom or proof rule.
So, a proof tree for a sequent such as ∅ ⊢ 𝑓 (𝛼, 𝛽, 𝛾) is translated into a tree of
Boolean implications that look like this:
Since (𝑇𝑟𝑢𝑒 ⇒ 𝑥) = 𝑥 for any 𝑥, the Boolean formula 𝑓 (𝛼, 𝛽, 𝛾) will be proved 𝑇𝑟𝑢𝑒.
To see how this works in practice, consider the proof tree shown in Figure 5.2
(page 199). Each step in that proof is made via an axiom or via a derivation rule.
Denoting for brevity 𝛾 ≜ (𝛼 ⇒ 𝛼) ⇒ 𝛽, let us translate each of those axioms and
rules into Boolean formulas that (as we know) all have value 𝑇𝑟𝑢𝑒:
In Boolean logic, one may prove that a value “should exist” by showing that
the non-existence of a value is contradictory in some way. However, any prac-
tically useful program needs to “construct” (i.e., to compute) actual values. The
“constructive” logic got its name from this requirement. So, it is the constructive
logic (not the Boolean logic) that provides correct reasoning about the types of
values computable by fully parametric functional programs.
If we drop the requirement of full parametricity, we could implement the law of
excluded middle. Special features of Scala (reflection, type tags, and type casts)
allow programmers to compare types as values and to determine what type was
given to a type parameter when a function is applied:
import scala.reflect.runtime.universe._
// Convert the type parameter T into a special value.
def getType[T: TypeTag]: Type = weakTypeOf[T]
// Compare types A and B.
def equalTypes[A: TypeTag, B: TypeTag]: Boolean = getType[A] =:= getType[B]
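The definition of excludedMiddle used in the session below is not shown in this listing. The following sketch is one way such a function could look; it is an illustration under stated assumptions, not the original code. To stay within the standard library, it swaps the TypeTag-based comparison for scala.reflect.ClassTag, which distinguishes types only up to erasure but is sufficient for the cases Nothing, Int, and Boolean. The object name ExcludedMiddleDemo and the sample values 123 and true are illustrative choices.

```scala
import scala.reflect.ClassTag

object ExcludedMiddleDemo {
  // Not fully parametric: the code inspects which type was given to the type parameter A.
  def excludedMiddle[A](implicit tag: ClassTag[A]): Either[A, A => Nothing] =
    if (tag == ClassTag.Nothing)
      // A = Nothing: return an identity function, cast to the type Nothing => Nothing.
      Right(((x: Any) => x).asInstanceOf[A => Nothing])
    else if (tag == ClassTag.Int)
      // A = Int: produce some value of type Int.
      Left(123.asInstanceOf[A])
    else if (tag == ClassTag.Boolean)
      // A = Boolean: produce some value of type Boolean.
      Left(true.asInstanceOf[A])
    else ??? // Further cases would be needed to support other types.
}
```

With these definitions, ExcludedMiddleDemo.excludedMiddle[Int] returns Left(123). For the void type, the tag can be passed explicitly, as in ExcludedMiddleDemo.excludedMiddle[Nothing](ClassTag.Nothing), which returns a Right and avoids relying on implicit resolution for the type Nothing.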
scala> excludedMiddle[Int]
res0: Either[Int,Int => Nothing] = Left(123)
scala> excludedMiddle[Nothing]
res1: Either[Nothing,Nothing => Nothing] = Right(<function1>)
In this code, we check whether 𝐴 = 0. If so, we can implement 𝐴 → 0 as an
identity function of type 0 → 0. Otherwise, we know that 𝐴 is one of the existing
Scala types (Int, Boolean, etc.), which are not void and have values that we can
simply write down one by one in the subsequent code.
Explicit type casts, such as 123.asInstanceOf[A], are needed because the Scala
compiler cannot know that A is Int in the scope where we return Left(123). Without
a type cast, the compiler will not accept 123 as a value of type A in that scope.
The method asInstanceOf is dangerous because the code x.asInstanceOf[T] dis-
ables the type checking for the value x. This tells the Scala compiler to believe that
x has type T even when the type T is inconsistent with the actually given code of x.
The resulting programs compile but may give unexpected results or crash. These
errors would have been prevented if we did not disable the type checking. In this
book, we will avoid writing such code whenever possible.
Essay: Towards functional data
engineering with Scala
Data engineering is among the highest-demand1 novel occupations in the IT world
today. Data engineers create software pipelines that process large volumes of
data efficiently. Why did the Scala programming language emerge as a premier
tool2 for crafting the foundational data engineering technologies such as Spark or
Akka? Why is Scala in high demand3 within the world of big data?
There are reasons to believe that the choice of Scala was not accidental.
Data is math
Humanity has been working with data at least since Babylonian tax tables4 and
the ancient Chinese number books.5 Mathematics summarizes several millennia’s
worth of data processing experience in a few fundamental tenets:
• Values of different type (population count, land area, distance, price, loca-
tion, time, growth percentage, etc.) need to be handled separately. For ex-
ample, it is an error to add a distance to a population count.
Violating these tenets produces nonsense (see Fig. 5.1 for a real-life illustration).
The power of the principles of mathematics extends over all epochs and all
cultures; math is the same in San Francisco, in Rio de Janeiro, in Kuala-Lumpur,
and in Pyongyang (Fig. 5.2).
1 http://archive.is/mK59h
2 https://tinyurl.com/4wwsedrz
3 https://techcrunch.com/2016/06/14/scala-is-the-new-golden-child/
4 https://www.nytimes.com/2017/08/29/science/trigonometry-babylonian-tablet.html
5 https://quatr.us/china/science/chinamath.htm
The power of abstraction
Summary
Only Scala has all of the features required for industrial-grade functional pro-
gramming:
List of Tables
1.1 Translating mathematics into code. . . . . . . . . . . . . . . . . . . . 19
1.2 Nameless functions in various programming languages. . . . . . . 29
List of Figures
3.1 The disjoint domain represented by the type RootsOfQ. . . . . . . . . 132
Index
examples (with code), 16, 40, 59, 70, 99, 105, 116, 121, 125, 154, 159, 181, 218, 224, 226, 235
exception, 82, 100, 110, 253
exercises, 20, 46, 79, 118, 119, 122, 130, 165, 185, 222, 247
expanded form of a function, 147
exponent, 225
exponential-polynomial type, 250
expression, 7
expression block, 10
factorial function, 7
first-order logic, 235
formal logic, 186
forward composition, 145, 148
free variable, 12
fully parametric
    code, 143, 146, 187
    code constructions, 187
function as a value, 9, 142
function composition, 145
functional programming paradigm, 22
generic functions, 146
Kurt Gödel, 257
labeled union, 133, 223
lambda-function
    see “nameless function”, 28
law of de Morgan, 239
law of excluded middle, 257, 258
lazy collection, 85
lazy value, 85
lifting, 106
LJT algorithm, 200, 202, 209
local scope, 10, 27, 84, 100
logical axiom, 190
logical implication, 180, 185
loop detection, 76
Machin’s formula, 20
map/reduce programming style, 16, 28
mathematical induction, 20, 23, 49
    base case, 49
    inductive assumption, 49
    inductive step, 49
matrix notation, 220
method syntax, 11
mutable value, 88
infallible, 83
pattern variables, 33
Peirce’s law, 201
perfect numbers, 81
perfect-shaped tree, 120
pipe notation, 188, 231
    operator precedence, 233
planned exception, 110
polynomial type, 250
predicate, 10
procedure, 131
product type, 223
product types, 177
proof (in logic), 176
proof by induction, 50
proof transformer, 192
proof tree, 204
proposition (in logic), 135
pure function, 87, 89
recursive function, 50
    accumulator argument, 53
recursive types, 112
    infinite loop, 112
referential transparency, 87
Richard Bornat, 186, 210
Riemann’s zeta function, 21
rose tree, 119
Roy Dyckhoff, 202, 249
run-time error, 83
runner, 124
Scala method, 142
Scala’s Iterator class, 86–88
sequent (in logic), 176
    goal, 176
    premises, 176
shadowed name, 84, 169
side effect, 89
Simpson’s rule, 26
six type constructions, 177
stack memory, 52
staggered factorial function, 20
stream, 66
sum type
    see “disjunctive type”, 223
symbolic calculations, 150
tail recursion, 52
total function, 82
trampolines, 70
⊲-notation
    see “pipe notation”, 231
truth table, 210
tuples, 31
    accessors, 32
    as function arguments, 144
    fields, 31
    nested, 32
    nested in pattern matching, 68
    parts, 31
turnstile (⊢) symbol, 176, 192
type alias, 59, 91, 178
type annotation, 105, 128
type casts, 258
type constructor, 95
type equivalence, 142, 214
    accidental, 224, 247
type error, 8, 31, 32, 45, 83, 92
type expression, 144
type inference, 156, 158, 171
type notation, 177, 182
    operator precedence, 178
type parameter, 36, 94
type safety, 252
typed hole, 241
types, 24
    equivalent, 214
    exponential-polynomial types, 250
    isomorphic, 214
    polynomial types, 250
uncurried function, 139, 142
uncurrying, 142, 146
undecidable logic, 235
unevaluated expression, 122
value-like behavior, 87
variable, 23, 24
void type, 104, 112, 217, 228, 252
Wallis product, 17
well-typed expression, 158