The Science of Functional
Programming. Part I
A tutorial, with examples in Scala

by Sergei Winitzki, Ph.D.

Version of September 4, 2024

Published by lulu.com in 2024


Copyright © 2018-2024 by Sergei Winitzki

Print on demand at lulu.com

ISBN (e-book): 978-0-359-76877-6


ISBN (vol. 1): 978-1-4710-4004-7

Source hash (sha256): cc147a546a04a29eaca33d732eae46d8a4e2b52b357f6bd30f4f99f07d1e6863


Git commit: b120b8a972e81a7e59a8aec320143ddde01c2fe1
PDF file built on Wed, 04 Sep 2024 09:48:58 +0000 by pdfTeX 3.14159265-2.6-1.40.20 (TeX Live 2019/Debian) on Linux

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Docu-
mentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections,
no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the appendix entitled “GNU Free
Documentation License” (Appendix F on page 1265).

A Transparent copy of the source code for the book is available at https://github.com/winitzki/sofp and
includes LyX, LaTeX, graphics source files, and build scripts. A full-color hyperlinked PDF file is available at
https://github.com/winitzki/sofp/releases under “Assets”. The source code may be also included as a “file
attachment” named sofp-src.tar.bz2 within a PDF file. To extract, run `pdftk sofp.pdf unpack_files output .`
and then `tar jxvf sofp-src.tar.bz2`. See the file README_build.md for build instructions.

This book is a pedagogical in-depth tutorial and reference on the theory of functional programming (FP) as it was
practiced at the beginning of the XXI century. Starting from issues found in practical coding, the book builds up the
theoretical intuition, knowledge, and techniques that programmers need for rigorous reasoning about types and code.
Examples are given in Scala, but most of the material applies equally to other FP languages.

The book’s topics include working with FP-style collections; reasoning about recursive functions and types; the
Curry-Howard correspondence; laws, structural analysis, and code for functors, monads, and other typeclasses based
on exponential-polynomial data types; techniques of symbolic derivation and proof; free typeclass constructions; and
practical applications of parametricity.

Long and difficult, yet boring explanations are logically developed in excruciating detail through 1906 Scala code
snippets, 192 statements with step-by-step derivations, 104 diagrams, 223 examples with tested Scala code, and 310
exercises. Discussions build upon each chapter’s material further.

Beginners in FP will find tutorials about the map/reduce programming style, type parameters, disjunctive types,
and higher-order functions. For more advanced readers, the book shows the practical uses of the Curry-Howard
correspondence and the parametricity theorems without unnecessary jargon; proves that all the standard monads (e.g.,
List or State) satisfy the monad laws; derives lawful instances of Functor and other typeclasses from types; shows that
monad transformers need 18 laws; and explains the use of parametricity for reasoning about the Church encoding and the
free typeclasses.

Readers should have a working knowledge of programming; e.g., be able to write code that prints the number of
distinct words in a sentence. The difficulty of this book’s mathematical derivations is at the level of undergraduate
multivariate calculus, similar to that of multiplying matrices or simplifying the expressions:

\[ \frac{1}{x-2} - \frac{1}{x+2} \quad \text{and} \quad \frac{d}{dx} \left( (x+1) f(x) e^{-x} \right) . \]

The author received a Ph.D. in theoretical physics. After a career in academic research, he works as a software engineer.
Contents
Preface
Formatting conventions used in this book

I Introductory level
1 Mathematical formulas as code. I. Nameless functions
1.1 Translating mathematics into code
1.1.1 First examples
1.1.2 Nameless functions
1.1.3 Nameless functions and bound variables
1.2 Aggregating data from sequences
1.3 Filtering and truncating a sequence
1.4 Examples
1.4.1 Aggregations
1.4.2 Transformations
1.5 Summary
1.6 Exercises
1.6.1 Aggregations
1.6.2 Transformations
1.7 Discussion
1.7.1 Functional programming as a paradigm
1.7.2 Iteration without loops
1.7.3 The mathematical meaning of “variables”
1.7.4 Nameless functions in mathematical notation
1.7.5 Named and nameless expressions
1.7.6 Historical perspective on nameless functions

2 Mathematical formulas as code. II. Mathematical induction
2.1 Tuple types
2.1.1 Examples: Using tuples
2.1.2 Pattern matching for tuples
2.1.3 Using tuples with collections
2.1.4 Treating dictionaries as collections
2.1.5 Examples: Tuples and collections
2.1.6 Reasoning about type parameters in collections
2.1.7 Exercises: Tuples and collections
2.2 Converting a sequence into a single value
2.2.1 Inductive definitions of aggregation functions
2.2.2 Implementing functions by recursion
2.2.3 Tail recursion
2.2.4 Implementing general aggregation (foldLeft)
2.2.5 Examples: Using foldLeft
2.2.6 Exercises: Using foldLeft
2.3 Generating a sequence from a single value
2.4 Transforming a sequence into another sequence
2.5 Summary
2.5.1 Examples
2.5.2 Exercises
2.6 Discussion and further developments
2.6.1 Total and partial functions
2.6.2 Scope and shadowing of pattern matching variables
2.6.3 Lazy values and sequences. Iterators and streams

3 The logic of types. I. Disjunctive types
3.1 Scala’s “case classes”
3.1.1 Tuple types with names
3.1.2 Case classes with type parameters
3.1.3 Tuples with one part and with zero parts
3.1.4 Pattern matching for case classes
3.2 Disjunctive types
3.2.1 Motivation and first examples
3.2.2 Examples: Pattern matching for disjunctive types
3.2.3 Standard disjunctive types: Option, Either, Try
3.3 Lists and trees as recursive disjunctive types
3.3.1 The recursive type List
3.3.2 Tail recursion with List
3.3.3 Binary trees
3.3.4 Rose trees
3.3.5 Perfect-shaped trees
3.3.6 Abstract syntax trees
3.4 Summary
3.4.1 Examples
3.4.2 Exercises
3.5 Discussion and further developments
3.5.1 Disjunctive types as mathematical sets
3.5.2 Disjunctive types in other programming languages
3.5.3 Disjunctions and conjunctions in formal logic

4 The logic of types. II. Curried functions
4.1 Functions that return functions
4.1.1 Motivation and first examples
4.1.2 Curried and uncurried functions
4.1.3 Equivalence of curried and uncurried functions
4.2 Fully parametric functions
4.2.1 Function composition
4.2.2 Laws of function composition
4.2.3 Example: A function that is not fully parametric
4.3 Symbolic calculations with nameless functions
4.3.1 Calculations with curried functions
4.3.2 Examples: Deriving a function’s type from its code
4.4 Summary
4.4.1 Examples
4.4.2 Exercises
4.5 Discussion and further developments
4.5.1 Higher-order functions
4.5.2 Name shadowing and the scope of bound variables
4.5.3 Operator syntax for function applications
4.5.4 Deriving a function’s code from its type

5 The logic of types. III. The Curry-Howard correspondence
5.1 Values computed by fully parametric functions
5.1.1 Motivation and outlook
5.1.2 Type notation for standard type constructions
5.1.3 Rules for writing CH-propositions
5.1.4 Examples: Type notation
5.1.5 Exercises: Type notation
5.2 The logic of CH-propositions
5.2.1 Motivation and first examples
5.2.2 Short notation for fully parametric code
5.2.3 The rules of proof for CH-propositions
5.2.4 Examples: Deriving code from proofs of CH-propositions
5.2.5 The LJT algorithm
5.2.6 Failure of Boolean logic in reasoning about CH-propositions
5.3 Equivalence of types
5.3.1 Logical identity does not correspond to type equivalence
5.3.2 Arithmetic identity corresponds to type equivalence
5.3.3 Type cardinalities and type equivalence
5.3.4 Type equivalence involving function types
5.4 Summary
5.4.1 Examples
5.4.2 Exercises
5.5 Discussion and further developments
5.5.1 Using the Curry-Howard correspondence for writing code
5.5.2 Implications for designing new programming languages
5.5.3 Practical uses of the void type (Scala’s Nothing)
5.5.4 Relationship between Boolean logic and constructive logic
5.5.5 The constructive logic and the law of excluded middle

Essay: Towards functional data engineering with Scala
Data is math
Functional programming is math
The power of abstraction
Scala is Java on math
Summary

List of Tables

List of Figures

Preface
This book is a reference text and a tutorial that teaches functional programmers
how to reason mathematically about types and code, in a manner directly rele-
vant to software practice. The material ranges from introductory (Part I) to ad-
vanced (Part III). The book assumes a certain amount of mathematical experience
(at about the level of undergraduate algebra or calculus) as well as some experi-
ence writing code in general-purpose programming languages.
The vision of this book is to explain the mathematical theory that guides the
practice of functional programming. So, all mathematical developments in this
book are motivated by practical programming issues and are accompanied by
Scala code illustrating their usage. For instance, the laws for standard typeclasses
(functors, monads, etc.) are first motivated heuristically through code examples.
Then the laws are formulated as mathematical equations and proved rigorously.
To achieve a clearer presentation of the material, the book uses certain non-
standard notations (see Appendix A on page 1157) and terminology (Appendix B
on page 1167). The presentation is self-contained, defining and explaining all re-
quired techniques, notations, and Scala features. All code examples have been
tested to work but are intended only for explanation and illustration. As a rule,
the code is not optimized for performance. Although the code examples are in
Scala, the material in this book also applies to many other languages.
A software engineer needs to learn only those few fragments of mathematical
theory that answer questions arising in the programming practice. So, this book
keeps theoretical material at the minimum: vita brevis, ars longa. The scope of the
required mathematical knowledge is limited to first notions of set theory, formal
logic, and category theory. Concepts such as functors or natural transformations
arise from the practice of reasoning about code and are first explained without
reference to category theory.
This book is not an introduction to current theoretical research in functional pro-
gramming. Instead, the focus is on material known to be practically useful. The
book organically develops the scope of theoretical concepts that help program-
mers write code or answer practical questions about code. That includes construc-
tions such as “filterable functors” and “applicative contrafunctors” but excludes a
number of theoretical developments that do not appear to have significant ap-
plications. For instance, this book does not talk about introduction/elimination
rules, strong normalization, complete partial orders, domain theory, model the-
ory, adjoint functors, co-ends, pullbacks, or topoi.
The first part of the book introduces functional programming. Readers already
familiar with functional programming could skim the glossary (Appendix B on
page 1167) for unfamiliar terminology and then start reading Chapter 5.
Participation in the meetup “San Francisco Types, Theorems, and Programming
Languages” (https://www.meetup.com/sf-types-theorems-and-programming-languages/)
initially motivated the author to begin working on this book. Thanks
are due to Adrian King, Hew Wolff, Peter Vanderbilt, and Young-il Choo for
inspiration and support in that meetup. The author appreciates the work of Joseph
Kim and Jim Kleck who did many of the exercises and reported some errors in
earlier versions of this book. The author also thanks Bill Venners for many helpful
comments on the draft, and Harald Gliebe, Andreas Röhler, and Philip Schwarz
for contributing corrections to the text via GitHub. The author is grateful to
Frederick Pitts, Hew Wolff, and several anonymous GitHub contributors who reported
errors in the draft and made helpful suggestions, and to Barisere Jonathan for
valuable assistance with setting up automatic builds.
No generative AI was used for creating or editing this book.

Formatting conventions used in this book

• Text in boldface indicates a new concept or term that is being defined at that
place in the text. Italics means logical emphasis. Example:
An aggregation is a function from a collection of values to a single
value.
• Equations are numbered per chapter: Eq. (1.3). Statements, examples, and
exercises are numbered per subsection: Example 1.4.1.1 is in subsection 1.4.1,
which belongs to Chapter 1.
• Scala code is written inline using a small monospaced font: val a = "xyz".
Longer code examples are written in separate code blocks and may also
show the Scala interpreter’s output for certain lines:
val s = (1 to 10).toList

scala> s.product
res0: Int = 3628800

• In the introductory chapters, type expressions and code examples are written
in the Scala syntax. In Chapters 4–5, the book introduces a mathematical
notation for types: e.g., the Scala type expression ((A, B)) => Option[A] is
written as 𝐴 × 𝐵 → 1 + 𝐴. Chapters 4–7 also develop a more concise notation
for code. For example, the functor composition law (in Scala: _.map(f).map(g)
== _.map(f andThen g)) is written in the code notation as:

\[ f^{\uparrow L} \,\#\, g^{\uparrow L} = (f \,\#\, g)^{\uparrow L} , \]

where 𝐿 is a functor and $f^{:A \to B}$ and $g^{:B \to C}$ are arbitrary functions of the
specified types. The notation $f^{\uparrow L}$ denotes the function 𝑓 lifted to the functor 𝐿
and replaces Scala’s syntax x.map(f) where x is of type L[A]. The symbol #
denotes the forward composition of functions (Scala’s method andThen). If
the notation still appears hard to follow after going through Chapters 5–6,
readers will benefit from working through Chapter 7, which explains the
code notation more systematically and clarifies it with additional examples.
Appendix A on page 1157 summarizes this book’s notation for types and
code.
• Frequently used methods of standard typeclasses, such as Scala’s flatten,
flatMap, etc., are denoted by shorter words and are labeled by the type
constructor they belong to. For instance, the methods pure, flatten, and flatMap
for a monad 𝑀 are denoted by $\text{pu}_M$, $\text{ftn}_M$, and $\text{flm}_M$ when writing code
formulas and proofs of laws.
• Derivations are written in a two-column format. The right column contains
formulas in the code notation. The left column gives an explanation or in-
dicates the property or law used to derive the expression at right from the
previous expression. A green underline shows the parts of an expression
that will be rewritten in the next step:
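\begin{align*}
\text{expect to equal } \text{pu}_M : \quad & \text{pu}_M^{\uparrow \text{Id}} \,\#\, \text{pu}_M \,\#\, \text{ftn}_M \\
\text{lifting to the identity functor} : \quad & = \text{pu}_M \,\#\, \text{pu}_M \,\#\, \text{ftn}_M \\
\text{left identity law of } M : \quad & = \text{pu}_M .
\end{align*}
When the two-column presentation becomes too wide to fit the page, the
explanations are placed before the next step’s line:
\begin{align*}
& \text{expect to equal } \text{pu}_M : \\
& \quad \text{pu}_M^{\uparrow \text{Id}} \,\#\, \text{pu}_M \,\#\, \text{ftn}_M \\
& \text{lifting to the identity functor} : \\
& \quad = \text{pu}_M \,\#\, \text{pu}_M \,\#\, \text{ftn}_M \\
& \text{left identity law of } M : \\
& \quad = \text{pu}_M .
\end{align*}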
A green underline is sometimes also used at the last step of a derivation,
to indicate the sub-expression that resulted from the most recent rewriting.
Other than providing hints to help clarify the steps, the green text and the
green underlines play no role in symbolic derivations.
• The symbol  is used occasionally to indicate more clearly the end of a defi-
nition, a derivation, or a proof.
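
As a small illustration of the functor composition law quoted in the conventions above, the following sketch checks the Scala form of the law for the standard List functor on one sample input. The object name FunctorLawCheck and the sample functions f and g are hypothetical and chosen only for this illustration. Checking a single input illustrates the law but does not prove it.

```scala
object FunctorLawCheck {
  // Hypothetical sample functions f: Int => String and g: String => Int.
  val f: Int => String = n => "x" * n
  val g: String => Int = s => s.length + 1

  // Left-hand side of the law: map with f, then map with g.
  def lhs(xs: List[Int]): List[Int] = xs.map(f).map(g)

  // Right-hand side of the law: a single map with the forward composition of f and g.
  def rhs(xs: List[Int]): List[Int] = xs.map(f andThen g)

  def main(args: Array[String]): Unit = {
    val sample = List(1, 2, 3)
    println(lhs(sample) == rhs(sample)) // Both sides equal List(2, 3, 4) for this sample.
  }
}
```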

Part I

Introductory level
1 Mathematical formulas as code. I.
Nameless functions
1.1 Translating mathematics into code
1.1.1 First examples
We begin by implementing some computational tasks in Scala.
Example 1.1.1.1 Find the product of integers from 1 to 10 (the factorial of 10,
usually denoted by 10!).
Solution First, we write a mathematical formula for the result:
\[ 10! = 1 * 2 * \dots * 10 , \quad \text{or in mathematical notation:} \quad 10! = \prod_{k=1}^{10} k . \]

We can then write Scala code in a way that resembles the last formula:
scala> (1 to 10).product
res0: Int = 3628800

The syntax (1 to 10) produces a sequence of integers from 1 to 10. The product
method computes the product of the numbers in the sequence.
The code (1 to 10).product is an expression, which means that (1) the code can
be evaluated and yields a value, and (2) the code can be used inside a larger ex-
pression. For example, we could write:
scala> 100 + (1 to 10).product + 100 // The code `(1 to 10).product` is a sub-expression.
res0: Int = 3629000

scala> 3628800 == (1 to 10).product
res1: Boolean = true

The Scala interpreter indicates that the result of (1 to 10).product is a value 3628800
of type Int. If we need to define a name for that value, we use the “val” syntax:
scala> val fac10 = (1 to 10).product
fac10: Int = 3628800

Example 1.1.1.2 Define a function to compute the factorial of an integer argument 𝑛.
Solution A mathematical formula for this function can be written as:

\[ f(n) = \prod_{k=1}^{n} k . \]

The corresponding Scala code is:


def f(n: Int) = (1 to n).product
In Scala’s def syntax, we need to specify the type of a function’s argument; here,
we wrote n: Int. In the usual mathematical notation, types of function arguments
are either not written at all, or written separately from the formula:
\[ f(n) = \prod_{k=1}^{n} k , \quad \forall n \in \mathbb{N} . \tag{1.1} \]

Equation (1.1) indicates that 𝑛 must be from the set of positive integers, denoted
by N in mathematics. This is similar to specifying the type (n: Int) in the Scala
code. So, the argument’s type in the code specifies the domain of a function (the
set of admissible values of a function’s argument).
Having defined the function f, we can now apply it to an integer value 10 (or,
as programmers say, “call” the function f with argument 10):
scala> f(10)
res6: Int = 3628800
It is a type error to apply f to a non-integer value:
scala> f("abc")
<console>:13: error: type mismatch;
found : String("abc")
required: Int
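
To illustrate how the argument type relates to the domain, here is a minimal sketch that evaluates the factorial at the edges of its domain. The object name FactorialDomain and the BigInt-based function factorialBig are assumptions made for this illustration and do not appear in the text above. The sketch relies on two facts about Scala's standard library: the range (1 to 0) is empty, and the product of an empty range is 1.

```scala
object FactorialDomain {
  // The function from Example 1.1.1.2, with an explicit return type.
  def f(n: Int): Int = (1 to n).product

  // Hypothetical variant that uses BigInt to avoid 32-bit overflow for larger n.
  def factorialBig(n: Int): BigInt = (1 to n).map(BigInt(_)).product

  def main(args: Array[String]): Unit = {
    println(f(0))             // The range (1 to 0) is empty, so the product is 1.
    println(f(10))            // 3628800, as computed in the text.
    println(factorialBig(20)) // 2432902008176640000, which does not fit into an Int.
  }
}
```

The type Int admits more values than the set of positive integers, so the type annotation describes the domain only approximately: the code accepts 0 and negative arguments and returns 1 for them.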

1.1.2 Nameless functions


Both the code written above and Eq. (1.1) involve naming the function as “𝑓”.
Sometimes a function does not really need a name (say, if the function is used
only once). “Nameless” mathematical functions may be denoted using the symbol
→ (pronounced “maps to”) like this:¹

\[ x \to \text{some formula that may use } x . \]
¹ In mathematics, an often used symbol for “maps to” is ↦, but this book uses a simpler arrow
symbol (→) that is visually similar.

So, a mathematical notation for the nameless factorial function is:
\[ n \to \prod_{k=1}^{n} k . \]

This reads as “a function that maps 𝑛 to the product of all 𝑘 where 𝑘 goes from 1
to 𝑛”. The Scala expression implementing this mathematical formula is:
(n: Int) => (1 to n).product

This expression shows Scala’s syntax for a nameless function. Here, n: Int is the
function’s argument variable,² while (1 to n).product is the function’s body. The
function arrow (=>) separates the argument variable from the body.³
Functions in Scala (whether named or nameless) are treated as values, which
means that we can also define a Scala value as:
scala> val fac = (n: Int) => (1 to n).product
fac: Int => Int = <function1>

We see that the value fac has the type Int => Int, which means that the function
fac takes an integer (Int) argument and returns an integer result value. What is the
value of the function fac itself? As we have just seen, the standard Scala interpreter
prints <function1> as the “value” of fac. Another Scala interpreter called ammonite⁴
prints this:
scala@ val fac = (n: Int) => (1 to n).product
fac: Int => Int = ammonite.$sess.cmd0$$$Lambda$1675/2107543287@1e44b638

The long number could indicate an address in memory. We may imagine that
a “function value” represents a block of compiled code. That code will run and
evaluate the function’s body whenever the function is applied to an argument.
Once defined, a function can be applied to an argument value like this:
scala> fac(10)
res1: Int = 3628800

Functions can be also used without naming them. We may directly apply a name-
less factorial function to an integer argument 10 instead of writing fac(10):
scala> ((n: Int) => (1 to n).product)(10)
res2: Int = 3628800

We would rarely write code like this. Instead of creating a nameless function
and then applying it right away to an argument, it is easier to evaluate the expres-
sion symbolically by substituting 10 instead of n in the function body:
((n: Int) => (1 to n).product)(10) == (1 to 10).product

² In computer science, argument variables are called “parameters” of a function, while an
“argument” is a value to which the function is actually applied. This book uses the word
“argument” for both, following the mathematical usage.
³ Some programming languages use the symbols -> or => for the function arrow; see Table 1.2.
⁴ See https://ammonite.io/

If a nameless function uses the argument several times, as in this code:

((n: Int) => n * n * n + n * n)(12345)

we could substitute the argument and eliminate the nameless function:


12345 * 12345 * 12345 + 12345 * 12345

Of course, it is better to avoid repeating the value 12345. To achieve that, define n
as a value in an expression block like this:
scala> { val n = 12345; n * n * n + n * n }
res3: Int = 322687002

Defined in this way, the value n is visible only within the expression block. Out-
side the block, another value named n could be defined independently of this n.
For this reason, the definition of n is called a local-scope definition.
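
A minimal sketch of local scope, using only the expression-block syntax shown above: two separate blocks each define their own n, and neither definition affects the other. The object name LocalScopeDemo and the value names a and b are hypothetical.

```scala
object LocalScopeDemo {
  // Each block defines its own local value n; the two definitions are independent.
  val a: Int = { val n = 3; n * n * n + n * n }  // 27 + 9
  val b: Int = { val n = 10; n * n * n + n * n } // 1000 + 100

  def main(args: Array[String]): Unit = {
    println(a) // 36
    println(b) // 1100
  }
}
```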
Nameless functions are convenient when they are themselves arguments of
other functions, as we will see next.
Example 1.1.2.1 Define a function that takes an integer argument 𝑛 and deter-
mines whether 𝑛 is a prime number.
Solution By definition, 𝑛 is prime if, for all 𝑘 between 2 and 𝑛 − 1, the remainder
after dividing 𝑛 by 𝑘 (denoted by 𝑛%𝑘) is nonzero. We can write this as a
mathematical formula using the “forall” symbol (∀):

\[ \text{isPrime}(n) = \forall k \in [2, n-1] .\ (n \% k) \neq 0 . \tag{1.2} \]

This formula has two parts: first, a range of integers from 2 to 𝑛 − 1, and second, a
requirement that all these integers 𝑘 should satisfy the given condition: (𝑛%𝑘) ≠ 0.
Formula (1.2) is translated into Scala code as:
def isPrime(n: Int) = (2 to n - 1).forall(k => n % k != 0)

This code looks closely similar to the mathematical notation, except for the ar-
row after 𝑘 that introduces a nameless function (k => n % k != 0). We do not need
to specify the type Int for the argument k of that nameless function. The Scala
compiler knows that k is going to iterate over the integer elements of the range (2
to n - 1), which effectively forces k to be of type Int because types must match.
We can now apply the function isPrime to some integer values:
scala> isPrime(12)
res3: Boolean = false

scala> isPrime(13)
res4: Boolean = true

As we can see from the output above, the function isPrime returns a value of type
Boolean. Therefore, the function isPrime has type Int => Boolean.
A function that returns a Boolean value is called a predicate.
In Scala, it is strongly recommended (although often not mandatory) to specify
the return types of named functions. The required syntax looks like this:
def isPrime(n: Int): Boolean = (2 to n - 1).forall(k => n % k != 0)
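
As a usage sketch, the predicate isPrime can itself be passed as an argument to another function. The example below assumes the standard library method filter, which this section has not yet introduced, and a hypothetical object name PrimeDemo. It also shows an edge case of this simple definition: for 𝑛 < 3 the range (2 to n - 1) is empty, and forall on an empty range returns true, so the code reports 1 as prime.

```scala
object PrimeDemo {
  // The definition from the text, with an explicit return type.
  def isPrime(n: Int): Boolean = (2 to n - 1).forall(k => n % k != 0)

  // Keep only those integers from 2 to 30 for which the predicate returns true.
  val primesUpTo30: Seq[Int] = (2 to 30).filter(isPrime)

  def main(args: Array[String]): Unit = {
    println(primesUpTo30) // Contains 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.
    println(isPrime(1))   // true: the range (2 to 0) is empty, so forall holds trivially.
  }
}
```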


1.1.3 Nameless functions and bound variables


The code for isPrime differs from the mathematical formula (1.2) in two ways.
One difference is that the interval [2, 𝑛 − 1] is in front of forall. Another is that
the Scala code uses a nameless function (k => n % k != 0), while Eq. (1.2) does not
seem to use such a function.
To understand the first difference, we need to keep in mind that the Scala syn-
tax such as (2 to n - 1).forall(k => ...) means to apply a function called forall
to two arguments: the first argument is the range (2 to n - 1), and the second
argument is the nameless function (k => ...). In Scala, the method syntax x.f(z),
and the equivalent infix syntax x f z, means that a function f is applied to its two
arguments, x and z. In the ordinary mathematical notation, this would be 𝑓 (𝑥, 𝑧).
Infix notation is widely used when it is easier to read: for instance, we write 𝑥 + 𝑦
rather than something like 𝑝𝑙𝑢𝑠 (𝑥, 𝑦).
A single-argument function could be also defined as a method, and then the
syntax is x.f, as in the expression (1 to n).product shown before.
The methods product and forall are already provided in the Scala standard li-
brary, so it is natural to use them. If we want to avoid the method syntax, we
could define a function forAll with two arguments and write code like this:
forAll(2 to n - 1, k => n % k != 0)
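
One possible definition of such a two-argument function is sketched here. The name forAll comes from the text, but its signature and the object name ForAllDemo are assumptions made for illustration; the sketch simply delegates to the library method forall.

```scala
object ForAllDemo {
  // Hypothetical two-argument function: checks that the predicate p holds for every element of range.
  def forAll(range: Range, p: Int => Boolean): Boolean = range.forall(p)

  // The primality test rewritten without the method syntax.
  def isPrime(n: Int): Boolean = forAll(2 to n - 1, k => n % k != 0)

  def main(args: Array[String]): Unit = {
    println(isPrime(12)) // false
    println(isPrime(13)) // true
  }
}
```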

This would bring the syntax closer to Eq. (1.2). However, there still remains the
second difference: The symbol 𝑘 is used as an argument of a nameless function
(k => n % k != 0) in the Scala code, while the formula:

\[ \forall k \in [2, n-1] .\ (n \% k) \neq 0 \tag{1.3} \]

does not seem to define such a function but defines the symbol 𝑘 that goes over the
range [2, 𝑛 − 1]. The variable 𝑘 is then used for writing the predicate (𝑛%𝑘) ≠ 0.
Let us investigate the role of 𝑘 more closely. The mathematical variable 𝑘 is
accessible only inside the expression “∀𝑘...” and makes no sense outside that ex-
pression. This becomes clear by looking at Eq. (1.2): The variable 𝑘 is not present
in the left-hand side and could not possibly be used there. The name “𝑘” is accessi-
ble only in the right-hand side, where it is first mentioned as the arbitrary element
𝑘 ∈ [2, 𝑛 − 1] and then used in the sub-expression “𝑛%𝑘”.
So, the mathematical notation in Eq. (1.3) says two things: First, we use the
name 𝑘 for integers from 2 to 𝑛 − 1. Second, for each of those 𝑘 we evaluate the
expression (𝑛%𝑘) ≠ 0, which can be viewed as a certain function of 𝑘 that returns
a Boolean value. Translating the mathematical notation into code, it is natural to
use the nameless function 𝑘 → (𝑛%𝑘) ≠ 0 and to write Scala code applying this
nameless function to each element of the range [2, 𝑛 − 1] and checking that all
result values are true:
(2 to n - 1).forall(k => n % k != 0)


Just as the mathematical notation defines the variable 𝑘 only in the right-hand
side of Eq. (1.2), the argument k of the nameless Scala function k => n % k != 0 is
defined within that function’s body and cannot be used in any code outside the
expression n % k != 0.
Variables that are defined only inside an expression and are invisible outside
are called bound variables, or “variables bound in an expression”. Variables that
are used in an expression but are defined outside it are called free variables, or
“variables occurring free in an expression”. These concepts apply equally well
to mathematical formulas and to Scala code. For example, in the mathematical
expression ∀𝑘. (𝑛%𝑘) ≠ 0, the variable 𝑘 is bound (defined and only visible within
that expression’s scope) but the variable 𝑛 is free: it must be defined somewhere
outside the expression ∀𝑘. (𝑛%𝑘) ≠ 0.
The main difference between free and bound variables is that bound variables
can be locally renamed at will, unlike free variables. To see this, consider that we
could rename 𝑘 to 𝑧 and write instead of Eq. (1.2) an equivalent definition:

isPrime (𝑛) = ∀𝑧 ∈ [2, 𝑛 − 1] . (𝑛%𝑧) ≠ 0 .

def isPrime(n: Int): Boolean = (2 to n - 1).forall(z => n % z != 0)

The argument z in the nameless function z => n % z != 0 is a bound variable: it
may be renamed without changing any code outside that function. But n is a free
variable within z => n % z != 0 (it is not defined inside that function, so it must be
defined outside). If we wanted to rename n in the sub-expression z => n % z != 0,
we would also need to change all the code that involves the variable n outside that
sub-expression, or else the program would become incorrect.
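The renaming of a bound variable can be checked directly. The following is a minimal sketch, not taken from the text: the names isPrimeK, isPrimeZ, and the enclosing object BoundRenaming are chosen here only for illustration. It compares the two definitions of isPrime that differ only in the name of the bound variable:

```scala
object BoundRenaming {
  // The definition from the text, with the bound variable named k.
  def isPrimeK(n: Int): Boolean = (2 to n - 1).forall(k => n % k != 0)

  // The same definition after renaming the bound variable k to z.
  def isPrimeZ(n: Int): Boolean = (2 to n - 1).forall(z => n % z != 0)

  // Both versions should agree on every input; we check a small range.
  val agree: Boolean = (2 to 100).forall(n => isPrimeK(n) == isPrimeZ(n))
}
```

Renaming the free variable n inside only one of the nameless functions (without also renaming the argument of the enclosing def) would not compile, since the new name would be undefined.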
Mathematical formulas use bound variables in constructions such as ∀𝑘. 𝑝(𝑘),
∃𝑘. 𝑝(𝑘), $\sum_{k=a}^{b} f(k)$, $\int_0^1 k^2 \, dk$, $\lim_{n \to \infty} f(n)$, and $\operatorname{argmax}_k f(k)$. When translating
mathematical expressions into code, we need to recognize the bound variables
present in the mathematical notation. For each bound variable, we create a name-
less function whose argument is that variable, e.g., k => p(k) or k => f(k) for the
examples just shown. Then our code will correctly reproduce the behavior of
bound variables in mathematical expressions.
As an example, the mathematical formula ∀𝑘 ∈ [1, 𝑛] . 𝑝(𝑘) has a bound variable
𝑘 and is translated into Scala code as:
(1 to n).forall(k => p(k))

At this point we can apply a simplification trick to this code. The nameless func-
tion 𝑘 → 𝑝(𝑘) does exactly the same thing as the (named) function 𝑝: It takes an
argument, which we may call 𝑘, and returns 𝑝(𝑘). So, we can simplify the Scala
code above to:
(1 to n).forall(p)

The simplification of 𝑥 → 𝑓 (𝑥) to just 𝑓 is always possible for functions 𝑓 of a
single argument.⁵
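As an illustration of translating bound variables into nameless functions, here is a small sketch. The predicate p, the function f, and the object name BoundVariables are hypothetical and chosen only for this example; the method maxBy comes from the Scala standard library and is not discussed in the text:

```scala
object BoundVariables {
  val n = 10

  def p(k: Int): Boolean = k * k > 50 // A hypothetical predicate p(k).
  def f(k: Int): Int = k * k          // A hypothetical function f(k).

  // ∃k ∈ [1, n]. p(k), written with an explicit nameless function.
  val existsLong: Boolean = (1 to n).exists(k => p(k))

  // The same computation after simplifying k => p(k) to just p.
  val existsShort: Boolean = (1 to n).exists(p)

  // The sum of f(k) for k from 1 to n; the bound variable k is the argument.
  val sumOfF: Int = (1 to n).map(k => f(k)).sum

  // argmax over k ∈ [1, n] of the expression k * (n - k).
  val argmax: Int = (1 to n).maxBy(k => k * (n - k))
}
```

In each case, the bound variable of the mathematical notation appears in the code only as the argument of a nameless function.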

1.2 Aggregating data from sequences


Consider the task of counting the even numbers contained in a given list 𝐿 of
integers. For example, the list [5, 6, 7, 8, 9] contains two even numbers: 6 and 8.
A mathematical formula for this task can be written using the “sum” operation
(denoted by $\sum$):

$$\text{countEven}(L) = \sum_{k \in L} \text{isEven}(k) ,$$
$$\text{isEven}(k) = \begin{cases} 1 & \text{if } (k \% 2) = 0 , \\ 0 & \text{otherwise} . \end{cases}$$

Here we defined a helper function isEven in order to write more easily a formula
for countEven. In mathematics, complicated formulas are often split into simpler
parts by defining helper expressions.
We can write the Scala code similarly. We first define the helper function isEven;
the Scala code can be written in a style quite similar to the mathematical formula:
def isEven(k: Int): Int = (k % 2) match {
case 0 => 1 // First, check if it is zero.
case _ => 0 // The underscore means "otherwise".
}

For such a simple computation, we could also write shorter code using a name-
less function:
val isEven = (k: Int) => if (k % 2 == 0) 1 else 0

Given this function, we now need to translate into Scala code the expression
$\sum_{k \in L} \text{isEven}(k)$. We can represent the list 𝐿 using the data type List[Int] from the
Scala standard library.
To compute $\sum_{k \in L} \text{isEven}(k)$, we must apply the function isEven to each element
of the list 𝐿, which will produce a list of some (integer) results, and then we will
need to add all those results together. It is convenient to perform these two steps
separately. This can be done with the functions map and sum, defined in the Scala
standard library as methods for the data type List.
The method sum is similar to product and is defined for any List of numerical
types (Int, Float, Double, etc.). It computes the sum of all numbers in the list:
scala> List(1, 2, 3).sum
res0: Int = 6

⁵ Certain features of Scala allow programmers to write code that looks like f(x) but actually uses
an automatic conversion for the argument x, default argument values, or implicit arguments.
In those cases, replacing the code x => f(x) by f will fail to compile.

The method map needs more explanation. This method takes a function as its
argument (in effect a second argument, the first being the list itself) and applies
that function to each element of the list. All the
results are stored in a new list, which is then returned as the result value:
scala> List(1, 2, 3).map(x => x * x + 100 * x)
res1: List[Int] = List(101, 204, 309)

In this example, the argument of map is the nameless function 𝑥 → 𝑥² + 100𝑥.
This function will be used repeatedly by map to transform each integer from the
sequence List(1, 2, 3), creating a new list as a result.
It is equally possible to define the transforming function separately, give it a
name, and then use it as the argument to map:
scala> def func1(x: Int): Int = x * x + 100 * x
func1: (x: Int)Int

scala> List(1, 2, 3).map(func1)
res2: List[Int] = List(101, 204, 309)

Short functions are often defined inline, while longer functions are defined sepa-
rately with a name.
A method, such as map, can also be used with a “dotless” (infix) syntax:
scala> List(1, 2, 3) map func1 // Same as List(1, 2, 3).map(func1)
res3: List[Int] = List(101, 204, 309)

If the transforming function is used only once (such as func1 in the example
above), and especially for simple computations such as 𝑥 → 𝑥² + 100𝑥, it is easier
to use a nameless function.
We can now combine the methods map and sum to define countEven:
def countEven(s: List[Int]) = s.map(isEven).sum

This code can be also written using a nameless function instead of isEven:
def countEven(s: List[Int]): Int = s
.map { k => if (k % 2 == 0) 1 else 0 }
.sum

In Scala, methods are often used one after another in a chain. For instance,
s.map(...).sum means: first apply s.map(...), which returns a new list; then apply
sum to that new list. To make the code more readable, we may put each of the
chained methods on a new line.
To test this code, let us run it in the Scala interpreter. In order to let the inter-
preter work correctly with multi-line code, we will enclose the code in braces:
scala> def countEven(s: List[Int]): Int = {
     |   s.map { k => if (k % 2 == 0) 1 else 0 }
     |    .sum
     | }
def countEven: (s: List[Int])Int

scala> countEven(List(1,2,3,4,5))
res0: Int = 2

scala> countEven( List(1,2,3,4,5).map(x => x * 2) )
res1: Int = 5

Note that the Scala interpreter prints the types differently for named functions
(i.e., functions declared using def). It prints (s: List[Int])Int for a function of
type List[Int] => Int.
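The two-step computation (map, then sum) can be compared with a one-step alternative. The sketch below assumes the standard library method count, which is not introduced in the text; the names countEven2 and CountEven are chosen here for illustration:

```scala
object CountEven {
  // The version from the text: map each element to 0 or 1, then add up.
  def countEven(s: List[Int]): Int =
    s.map(k => if (k % 2 == 0) 1 else 0).sum

  // An alternative: count the elements for which a Boolean predicate holds.
  def countEven2(s: List[Int]): Int = s.count(k => k % 2 == 0)
}
```

Both functions should return 2 for the list [5, 6, 7, 8, 9] used in the example above. The version with map and sum follows the mathematical formula more literally, while count avoids creating the intermediate list.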

1.3 Filtering and truncating a sequence


In addition to the methods sum, product, map, and forall that we have already seen, the
Scala standard library defines many other useful methods. We will now take a
look at using the methods max, min, exists, size, filter, and takeWhile.
The methods max, min, and size are self-explanatory:
scala> List(10, 20, 30).max
res2: Int = 30

scala> List(10, 20, 30).min
res3: Int = 10

scala> List(10, 20, 30).size
res4: Int = 3

The methods forall, exists, filter, and takeWhile require a predicate as an argu-
ment. The forall method returns true if and only if the predicate returns true for
all values in the list. The exists method returns true if and only if the predicate
holds (returns true) for at least one value in the list. These methods can be written
as mathematical formulas like this:

forall (𝑆, 𝑝) = ∀𝑘 ∈ 𝑆. 𝑝(𝑘) = true ,

exists (𝑆, 𝑝) = ∃𝑘 ∈ 𝑆. 𝑝(𝑘) = true .
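The two formulas above can be written as ordinary two-argument Scala functions. This is only a sketch: the function names forAll and exists, the sample list, and the object name Quantifiers are assumptions made for this illustration.

```scala
object Quantifiers {
  // forall(S, p) and exists(S, p) as functions of two arguments.
  def forAll(s: Seq[Int], p: Int => Boolean): Boolean = s.forall(p)
  def exists(s: Seq[Int], p: Int => Boolean): Boolean = s.exists(p)

  val s = List(1, 2, 3, 4, 5)

  val allPositive: Boolean = forAll(s, k => k > 0)
  val someEven: Boolean = exists(s, k => k % 2 == 0)

  // "There exists k with p(k)" is the same as "not (for all k, not p(k))".
  val duality: Boolean =
    exists(s, k => k > 4) == !forAll(s, k => !(k > 4))

  // On an empty list, forall returns true and exists returns false.
  val emptyForAll: Boolean = forAll(List[Int](), k => k > 0)
  val emptyExists: Boolean = exists(List[Int](), k => k > 0)
}
```

The behavior on the empty list agrees with the usual mathematical convention for quantifiers over an empty set.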

The filter method returns a list that contains only the values for which a pred-
icate returns true:
scala> List(1, 2, 3, 4, 5).filter(k => k != 3) // Exclude the value 3.
res5: List[Int] = List(1, 2, 4, 5)

The takeWhile method truncates a given list. More precisely, takeWhile returns a
new list that contains the initial portion of values from the original list for which
the predicate remains true:
scala> List(1, 2, 3, 4, 5).takeWhile(k => k != 3) // Truncate at the value 3.
res6: List[Int] = List(1, 2)
In all these cases, the predicate’s argument, k, will be of the same type as the
elements in the list. In the examples shown above, the elements are integers (i.e.,
the lists have type List[Int]), therefore k must be of type Int.
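The difference between filter and takeWhile becomes visible when the predicate holds again after failing once. A short sketch with a sample list chosen for this illustration (the object name FilterVsTakeWhile is also an assumption):

```scala
object FilterVsTakeWhile {
  val s = List(1, 2, 5, 1, 2)

  // filter examines every element and keeps all values below 3.
  val filtered: List[Int] = s.filter(k => k < 3)      // List(1, 2, 1, 2)

  // takeWhile stops at the first element (here, 5) that fails the predicate.
  val truncated: List[Int] = s.takeWhile(k => k < 3)  // List(1, 2)

  // If the first element already fails, takeWhile returns an empty list.
  val nothing: List[Int] = s.takeWhile(k => k > 3)    // List()
}
```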
The methods sum and product are defined for lists of numeric types, such as Int or
Float. The methods max and min are defined on lists of “orderable” types (including
String, Boolean, and the numeric types). The other methods are defined for lists of
all types.
Using these methods, we can solve many problems that involve transforming
and aggregating data stored in lists, arrays, sets, and other data structures that
work as “containers storing values”. In this context, a transformation is a func-
tion taking a container with values and returning a new container with changed
values. (We speak of “transformation” even though the original container remains
unchanged.) Examples of transformations are filter and map. An aggregation is a
function taking a container of values and returning a single value. Examples of
aggregations are max and sum.
Writing programs by chaining together various methods of transformation and
aggregation is known as programming in the map/reduce style.
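A small sketch of this style, with sample data and names (MapReduceStyle, data, evens, squares) chosen only for illustration: two transformations (filter and map) are followed by one aggregation (sum), and the original list remains unchanged.

```scala
object MapReduceStyle {
  val data = List(3, 8, 1, 6, 7)

  // Transformations: each step returns a new list.
  val evens: List[Int] = data.filter(k => k % 2 == 0) // List(8, 6)
  val squares: List[Int] = evens.map(k => k * k)      // List(64, 36)

  // Aggregation: a single value is computed from a list.
  val total: Int = squares.sum                        // 100

  // The same computation written as one chain of methods.
  val chained: Int = data.filter(k => k % 2 == 0).map(k => k * k).sum
}
```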

1.4 Examples
1.4.1 Aggregations

Example 1.4.1.1 Improve the code for isPrime by limiting the search to 𝑘² ≤ 𝑛:

isPrime (𝑛) = ∀𝑘 ∈ [2, 𝑛 − 1] such that if 𝑘 ∗ 𝑘 ≤ 𝑛 then (𝑛%𝑘) ≠ 0 .

Solution Use takeWhile to truncate the initial list when 𝑘 ∗ 𝑘 ≤ 𝑛 becomes false:
def isPrime(n: Int): Boolean = {
(2 to n - 1)
.takeWhile(k => k * k <= n)
.forall(k => n % k != 0)
}
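One way of gaining confidence in this improvement is to compare it with the earlier definition on a range of inputs. The following sketch is not part of the original solution; the names isPrimeSlow, isPrimeFast, and PrimeCheck are chosen here for illustration:

```scala
object PrimeCheck {
  // The earlier definition: test all k from 2 to n - 1.
  def isPrimeSlow(n: Int): Boolean = (2 to n - 1).forall(k => n % k != 0)

  // The improved definition: test only those k for which k * k <= n.
  def isPrimeFast(n: Int): Boolean =
    (2 to n - 1)
      .takeWhile(k => k * k <= n)
      .forall(k => n % k != 0)

  // The two definitions should agree; we check all n from 2 to 1000.
  val agree: Boolean = (2 to 1000).forall(n => isPrimeSlow(n) == isPrimeFast(n))

  // The prime numbers up to 30, selected with filter.
  val primesUpTo30: List[Int] = (2 to 30).filter(isPrimeFast).toList
}
```

The truncation is justified because any divisor 𝑘 of 𝑛 with 𝑘 ∗ 𝑘 > 𝑛 is paired with the divisor 𝑛/𝑘, which is smaller and has already been tested.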
Example 1.4.1.2 Compute this product of absolute values: $\prod_{k=1}^{10} |\sin(k + 2)|$.
Solution
(1 to 10)
.map(k => math.abs(math.sin(k + 2)))
.product
Example 1.4.1.3 Compute $\sum_{k \in [1,10];\ \cos k > 0} \sqrt{\cos k}$ (the sum goes only over 𝑘 such
that cos 𝑘 > 0).
Solution
(1 to 10)
.filter(k => math.cos(k) > 0)
.map(k => math.sqrt(math.cos(k)))
.sum

It is safe to compute $\sqrt{\cos k}$, because we have first filtered the list by keeping only
values 𝑘 for which cos 𝑘 > 0. Let us check that this is so:
scala> (1 to 10).toList.filter(k => math.cos(k) > 0).map(x => math.cos(x))
res0: List[Double] = List(0.5403023058681398, 0.28366218546322625,
0.9601702866503661, 0.7539022543433046)

Example 1.4.1.4 Compute the average of a non-empty list of type List[Double],

$$\text{average}(s) = \frac{1}{n} \sum_{i=0}^{n-1} s_i .$$

Solution We need to divide the sum by the length of the list:
def average(s: List[Double]): Double = s.sum / s.size

scala> average(List(1.0, 2.0, 3.0))
res0: Double = 2.0
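Example 1.4.1.5 Given 𝑛, compute the Wallis product⁶ truncated up to $\frac{2n}{2n+1}$:

$$\text{wallis}(n) = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdots \frac{2n}{2n+1} .$$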
Solution Define the helper function wallis_frac(i) that computes the 𝑖-th fraction.
The method toDouble converts integers to Double numbers:
def wallis_frac(i: Int): Double =
  ((2 * i).toDouble / (2 * i - 1)) * ((2 * i).toDouble / (2 * i + 1))

def wallis(n: Int) = (1 to n).map(wallis_frac).product

scala> math.cos(wallis(10000)) // Should be close to 0.
res0: Double = 3.9267453954401036E-5

scala> math.cos(wallis(100000)) // Should be even closer to 0.
res1: Double = 3.926966362362075E-6
The cosine of wallis(n) tends to zero for large 𝑛 because the limit of the Wallis
product is 𝜋/2.
Example 1.4.1.6 Check numerically the following infinite product formula:

$$\prod_{k=1}^{\infty} \left( 1 - \frac{x^2}{k^2} \right) = \frac{\sin \pi x}{\pi x} .$$
⁶ https://en.wikipedia.org/wiki/Wallis_product

Solution Compute this product up to 𝑘 = 𝑛 for 𝑥 = 0.1 with a large value of 𝑛,
say 𝑛 = 10⁵, and compare with the right-hand side:
def sine_product(n: Int, x: Double): Double =
(1 to n).map(k => 1.0 - x * x / k / k).product

scala> sine_product(n = 100000, x = 0.1) // Arguments may be named, for clarity.
res0: Double = 0.9836317414461351

scala> math.sin(math.Pi * 0.1) / math.Pi / 0.1
res1: Double = 0.9836316430834658

Example 1.4.1.7 Define a function 𝑝 that takes a list of integers and a function
f: Int => Int, and returns the largest value of 𝑓 (𝑥) among all 𝑥 in the list.
Solution
def p(s: List[Int], f: Int => Int): Int = s.map(f).max

Here is a test for this function:
scala> p(List(2, 3, 4, 5), x => 60 / x)
res0: Int = 30

1.4.2 Transformations
Example 1.4.2.1 Given a list of lists, s: List[List[Int]], select the inner lists of
size at least 3. The result must be again of type List[List[Int]].
Solution To “select the inner lists” means to compute a new list containing only
the desired inner lists. We use filter on the outer list s. The predicate for the filter
is a function that takes an inner list and returns true if the size of that list is at least
3. Write the predicate as a nameless function, t => t.size >= 3, where t is of type
List[Int]:
def f(s: List[List[Int]]): List[List[Int]] = s.filter(t => t.size >= 3)

scala> f(List( List(1,2), List(1,2,3), List(1,2,3,4) ))
res0: List[List[Int]] = List(List(1, 2, 3), List(1, 2, 3, 4))

The Scala compiler deduces from the code that the type of t is List[Int] because
we apply filter to a list of lists of integers.
Example 1.4.2.2 Find all integers 𝑘 ∈ [1, 10] such that there are at least three
different integers 𝑗, where 1 ≤ 𝑗 ≤ 𝑘, each 𝑗 satisfying the condition 𝑗 ∗ 𝑗 > 2 ∗ 𝑘.
Solution
scala> (1 to 10).toList.filter(k =>
(1 to k).filter(j => j*j > 2*k).size >= 3)
res0: List[Int] = List(6, 7, 8, 9, 10)

The argument of the outer filter is a nameless function that also uses a filter.
The inner expression:
(1 to k).filter(j => j*j > 2*k).size >= 3

computes a list of all 𝑗’s that satisfy the condition 𝑗 ∗ 𝑗 > 2 ∗ 𝑘. The size of that
list is then compared with 3. In this way, we impose the requirement that there
should be at least 3 values of 𝑗. We can see how the Scala code closely follows the
mathematical formulation of the task.

Mathematical notation                            Scala code
-----------------------------------------------  ------------------------------
$x \to \sqrt{x^2 + 1}$                           x => math.sqrt(x * x + 1)
$[1, 2, ..., n]$                                 (1 to n)
$[f(1), ..., f(n)]$                              (1 to n).map(k => f(k))
$\sum_{k=1}^{n} k^2$                             (1 to n).map(k => k * k).sum
$\prod_{k=1}^{n} f(k)$                           (1 to n).map(f).product
$\forall k \in [1, ..., n].\ p(k)$ holds         (1 to n).forall(k => p(k))
$\exists k \in [1, ..., n].\ p(k)$ holds         (1 to n).exists(k => p(k))
$\sum_{k \in S \text{ such that } p(k) \text{ holds}} f(k)$    s.filter(p).map(f).sum

Table 1.1: Translating mathematics into code.

1.5 Summary
Functional programs are mathematical formulas translated into code. Table 1.1
summarizes the tools explained in this chapter and gives implementations of
some mathematical constructions in Scala. We have also shown methods such
as takeWhile that do not correspond to widely used mathematical symbols.
What problems can one solve with these techniques?

• Compute mathematical expressions involving sums, products, and quantifiers,
  based on integer ranges, such as $\sum_{k=1}^{n} f(k)$.

• Transform and aggregate data from lists using map, filter, sum, and other
methods from the Scala standard library.

What are examples of problems that are not solvable with these tools?

• Example 1: Compute the smallest 𝑛 ≥ 1 such that 𝑓 ( 𝑓 ( 𝑓 (... 𝑓 (0)...))) ≥ 1000,
  where the given function 𝑓 is applied 𝑛 times.

• Example 2: Compute a list of partial sums from a given list of integers. For
example, the list [1, 2, 3, 4] should be transformed into [1, 3, 6, 10].

• Example 3: Perform binary search over a sorted list of integers.

These computations require a general case of mathematical induction. Chapter 2
will explain how to implement these tasks using recursion as well as using library
methods such as foldLeft.
Library functions we have seen so far, such as map and filter, implement a re-
stricted class of iterative operations on lists: namely, operations that process each
element of a given list independently and accumulate results. In those cases, the
number of iterations is known (or at least bounded) in advance. For instance,
when computing s.map(f), the number of function applications is given by the
size of the initial list. However, Example 1 requires applying a function 𝑓 repeat-
edly until a given condition holds — that is, repeating for an initially unknown
number of times. So, it is impossible to write an expression containing map, filter,
takeWhile, etc., that solves Example 1. We could write the solution of Example 1
as a formula by using mathematical induction, but we have not yet seen how to
translate that into Scala code.
An implementation of Example 2 is shown in Section 2.4. This cannot be im-
plemented with operations such as map and filter because they cannot produce
sequences whose next elements depend on previous values.
Example 3 defines the search result by induction: the list is split in half, and
search is performed recursively (i.e., using the inductive hypothesis) in the half
that contains the required value. This computation requires an initially unknown
number of steps.

1.6 Exercises
1.6.1 Aggregations
Exercise 1.6.1.1 Define a function that computes a “staggered factorial” (denoted
by 𝑛!!) for positive integers. It is defined as 1 · 3 · ... · 𝑛 when 𝑛 is odd and as
2 · 4 · ... · 𝑛 when 𝑛 is even. For example, 8!! = 384 and 9!! = 945.
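Exercise 1.6.1.2 Machin’s formula⁷ converges to 𝜋 faster than Example 1.4.1.5:

$$\frac{\pi}{4} = 4 \arctan \frac{1}{5} - \arctan \frac{1}{239} ,$$
$$\arctan \frac{1}{n} = \frac{1}{n} - \frac{1}{3} \frac{1}{n^3} + \frac{1}{5} \frac{1}{n^5} - \ldots = \sum_{k=0}^{\infty} \frac{(-1)^k}{2k+1} n^{-2k-1} .$$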

⁷ http://turner.faculty.swau.edu/mathematics/materialslibrary/pi/machin.html

Implement a function that computes the series for $\arctan \frac{1}{n}$ up to a given number
of terms, and compute an approximation of 𝜋 using this formula. Show that 12
terms of the series are sufficient for a full-precision Double approximation of 𝜋.
Exercise 1.6.1.3 Check numerically that $\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6}$. First, define a function of
𝑛 that computes a partial sum of that series until 𝑘 = 𝑛. Then compute the partial
sum for a large value of 𝑛 and compare with the limit value.
Exercise 1.6.1.4 Using the function isPrime, check numerically the Euler product
formula⁸ for the Riemann zeta function 𝜁(4). It is known⁹ that $\zeta(4) = \frac{\pi^4}{90}$:

$$\zeta(4) = \prod_{k \geq 2;\ k \text{ is prime}} \frac{1}{1 - \frac{1}{k^4}} = \frac{\pi^4}{90} .$$

1.6.2 Transformations
Exercise 1.6.2.1 Define a function add20 of type List[List[Int]] => List[List[Int]]
that adds 20 to every element of every inner list. A sample test:
scala> add20( List( List(1), List(2, 3) ) )
res0: List[List[Int]] = List(List(21), List(22, 23))

Exercise 1.6.2.2 An integer 𝑛 is called a “3-factor” if it is divisible by only three
different integers 𝑖, 𝑗, 𝑘 such that 1 < 𝑖 < 𝑗 < 𝑘 < 𝑛. Compute the set of all
“3-factor” integers 𝑛 among 𝑛 ∈ [1, ..., 1000].
Exercise 1.6.2.3 Given a function f: Int => Boolean, an integer 𝑛 is called a “3-𝑓”
if there are only three different integers 1 < 𝑖 < 𝑗 < 𝑘 < 𝑛 such that 𝑓 (𝑖), 𝑓 ( 𝑗),
and 𝑓 (𝑘) are all true. Define a function that takes 𝑓 as an argument and returns
a sequence of all “3-𝑓” integers among 𝑛 ∈ [1, ..., 1000]. What is the type of that
function? Implement Exercise 1.6.2.2 using that function.
Exercise 1.6.2.4 Define a function at100 of type List[List[Int]] => List[List[Int]]
that selects only those inner lists whose largest value is at least 100. Test with:
scala> at100( List( List(0, 1, 100), List(60, 80), List(1000) ) )
res0: List[List[Int]] = List(List(0, 1, 100), List(1000))

Exercise 1.6.2.5 Define a function normalize of type List[Double] => List[Double] that
performs a “normalization” of a list: it finds the element having the largest absolute
value and, if that value is zero, returns the original list; if that value is nonzero,
divides all elements by that value and returns a new list. Test with:
scala> normalize(List(1.0, -4.0, 2.0))
res0: List[Double] = List(0.25, -1.0, 0.5)

⁸ http://tinyurl.com/4rjj2rvc
⁹ https://tinyurl.com/yxey4tsd

1.7 Discussion
1.7.1 Functional programming as a paradigm
Functional programming (FP) is a paradigm — an approach that guides program-
mers to write code in specific ways, applicable to a wide range of tasks.
The main idea of FP is to write code as a mathematical expression or formula. This
allows programmers to derive code through logical reasoning rather than through
guessing, similarly to how books on mathematics reason about mathematical for-
mulas and derive results systematically, without guessing or “debugging.” Like
mathematicians and scientists who reason about formulas, functional program-
mers can reason about code systematically and logically, based on rigorous princi-
ples. This is possible only because code is written as a mathematical formula.
Mathematical intuition is useful for programming tasks because it is backed
by the vast experience of working with data over millennia of human history. It
took centuries to invent flexible and powerful notation, such as $\sum_{k \in S} p(k)$, and to
develop the corresponding rules of calculation. Converting formulas into code,
FP capitalizes on the power of those reasoning tools.
As we have seen, the Scala code for certain computational tasks corresponds
quite closely to mathematical formulas (although programmers do have to write
out some details that are omitted in the mathematical notation). Just as in mathe-
matics, large code expressions may be split into smaller expressions when needed.
Expressions can be reused, composed in various ways, and written independently
from each other. Over the years, the FP community has developed a toolkit of
functions (such as map, filter, flatMap, etc.), which are not standard in mathemati-
cal literature but proved to be useful in practical programming.
Mastering FP involves practice in writing programs as “formulas translated into
code”, building up the specific kind of applied mathematical intuition, and getting
familiar with certain concepts adapted to a programmer’s needs. The FP commu-
nity has discovered a number of specific programming idioms founded on math-
ematical principles but driven by practical necessities of writing software. This
book explains the theory behind those idioms, starting from code examples and
heuristic ideas, and gradually building up the techniques of rigorous reasoning.
This chapter explored the first significant idiom of FP: iterative calculations per-
formed without loops in the style of mathematical expressions. This technique can
be used in any programming language that supports nameless functions.

1.7.2 Iteration without loops


In mathematical notation, iterative computations are written without loops. As
an example, consider the formula for the standard deviation (𝜎) estimated from a
data sample [𝑥₁, ..., 𝑥ₙ]:

$$\sigma = \sqrt{ \frac{1}{n-1} \sum_{i=1}^{n} x_i^2 - \frac{1}{n(n-1)} \left( \sum_{i=1}^{n} x_i \right)^2 } .$$

Here the index 𝑖 goes over the integer range [1, ..., 𝑛]. No mathematics textbook
would define the standard deviation 𝜎 via loops or by saying “now repeat this
equation 𝑛 times”. Indeed, it is unnecessary to evaluate a formula such as $x_i^2$ many
times, as the value of $x_i^2$ remains the same every time. It is just as unnecessary to
“repeat” a mathematical equation.
Instead of loops, mathematicians write expressions such as $\sum_{i=1}^{n} s_i$, where symbols
such as $\sum_{i=1}^{n}$ denote certain iterative computations. Such computations are
defined rigorously using mathematical induction. The FP paradigm has devel-
oped rich tools for translating mathematical induction into code. This chapter fo-
cuses on methods such as map, filter, and sum. The next chapter shows more gen-
eral methods for implementing inductive computations. Those methods can be
combined in flexible ways, enabling programmers to write iterative code without
loops. For example, the value 𝜎 defined by the formula shown above is computed
by this code:
def sigma(xs: Seq[Double]): Double = {
val n = xs.length.toDouble
val xsum = xs.sum
val x2sum = xs.map(x => x * x).sum
math.sqrt(x2sum / (n - 1) - xsum * xsum / n / (n - 1))
}

scala> sigma(Seq(10, 20, 30))
res0: Double = 10.0
The programmer can avoid writing loops because all iterative computations are
delegated to functions such as map, filter, sum, and others. It is the job of the library
and the compiler to translate those high-level functions into low-level machine
code. The machine code will likely contain loops, but the programmer does not
need to see that machine code or to reason about it.
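As a cross-check, the same quantity may be computed from the deviations of each value from the mean, which is another common way of writing the formula for 𝜎. The sketch below is an illustration rather than part of the original text; the names sigma2, mean, and StdDev are chosen here, and the function sigma is copied from the code above:

```scala
object StdDev {
  // The code from the text: uses the sum of squares and the square of the sum.
  def sigma(xs: Seq[Double]): Double = {
    val n = xs.length.toDouble
    val xsum = xs.sum
    val x2sum = xs.map(x => x * x).sum
    math.sqrt(x2sum / (n - 1) - xsum * xsum / n / (n - 1))
  }

  // An equivalent formula: the sum of squared deviations from the mean,
  // divided by n - 1. Still no loops, only map and sum.
  def sigma2(xs: Seq[Double]): Double = {
    val n = xs.length.toDouble
    val mean = xs.sum / n
    math.sqrt(xs.map(x => (x - mean) * (x - mean)).sum / (n - 1))
  }
}
```

The two functions should agree up to rounding errors, because the sum of (𝑥ᵢ − mean)² equals the sum of 𝑥ᵢ² minus the square of the sum of 𝑥ᵢ divided by 𝑛.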

1.7.3 The mathematical meaning of “variables”


The usage of variables in functional programming is similar to how mathematical
literature uses variables. In mathematics, variables are used first of all as
arguments of functions; e.g., the formula:

$$f(x) = x^2 + x$$

contains the variable 𝑥 and defines a function 𝑓 that takes 𝑥 as its argument (to be
definite, assume that 𝑥 is an integer) and computes the value 𝑥² + 𝑥. The body of
the function is the expression 𝑥² + 𝑥.
Mathematics has the convention that a variable, such as 𝑥, does not change
its value within a formula. Indeed, there is no mathematical notation even to
talk about “changing” the value of 𝑥 inside the formula 𝑥² + 𝑥. It would be quite
confusing if a mathematics textbook said “before adding the last 𝑥 in the formula
𝑥² + 𝑥, we change that 𝑥 by adding 4 to it”. If the “last 𝑥” in 𝑥² + 𝑥 needs to have a 4
added to it, a mathematics textbook will just write the formula 𝑥² + 𝑥 + 4.
Arguments of nameless functions are also immutable. Consider, for example:

$$f(n) = \sum_{k=0}^{n} (k^2 + k) .$$

Here, 𝑛 is the argument of the function 𝑓, while 𝑘 is the argument of the nameless
function 𝑘 → 𝑘² + 𝑘. Neither 𝑛 nor 𝑘 can be “modified” in any sense within the
expressions where they are used. The symbols 𝑘 and 𝑛 stand for some integer
values, and these values are immutable. Indeed, it is meaningless to say that we
“modified the integer 4”. In the same way, we cannot modify 𝑘.
So, a variable in mathematics remains constant within the expression where it is
defined; in that expression, a variable is essentially a “named constant”. Of course,
a function 𝑓 can be applied to different values 𝑥, to compute a different result 𝑓 (𝑥)
each time. However, a given value of 𝑥 will remain unmodified within the body
of the function 𝑓 while 𝑓 (𝑥) is being computed.
Functional programming adopts this convention from mathematics: variables
are immutable named constants. (Scala also has mutable variables, but we will not
consider them in this book.)
In Scala, function arguments are immutable within the function body:
def f(x: Int) = x * x + x // Cannot modify `x` here.
The type of each mathematical variable (such as integer, vector, etc.) is also fixed.
Each variable is a value from a specific set (e.g., the set of all integers, the set of all
vectors, etc.). Mathematical formulas such as 𝑥² + 𝑥 do not express any “checking”
that 𝑥 is indeed an integer and not, say, a vector, in the middle of evaluating 𝑥² + 𝑥.
The types of all variables are checked in advance.
Functional programming adopts the same view: Each argument of each func-
tion must have a type that represents the set of possible allowed values for that
function argument. The programming language’s compiler will automatically
check the types of all arguments in advance, before the program runs. A program
that calls functions on arguments of incorrect types will not compile.
The second usage of variables in mathematics is to denote expressions that will
be reused. For example, one writes: let $z = \frac{x-y}{x+y}$ and now compute cos 𝑧 + cos 2𝑧 +
cos 3𝑧. Again, the variable 𝑧 remains immutable, and its type remains fixed.
In Scala, this construction (defining an expression to be reused later) is written
with the “val” syntax. Each variable defined using “val” is a named constant, and
its type and value are fixed at the time of definition. Type annotations for “val”s
are optional in Scala. For instance, we could write:
val x: Int = 123

We could also omit the type annotation “:Int” and write more concisely:
val x = 123

Here, it is clear that this x is an integer. Nevertheless, it is often helpful to write
out the types. If we do so, the compiler will check that the types match correctly
and give an error message whenever wrong types are used. For example, a type
error is detected when using a String instead of an Int:
scala> val x: Int = "123"
<console>:11: error: type mismatch;
found : String("123")
required: Int
val x: Int = "123"
^
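The mathematical phrase “let 𝑧 = (𝑥 − 𝑦)/(𝑥 + 𝑦) and now compute cos 𝑧 + cos 2𝑧 + cos 3𝑧” mentioned in this section can be sketched in Scala with a val. The function name h and the object name NamedConstants are chosen here only for illustration:

```scala
object NamedConstants {
  def h(x: Double, y: Double): Double = {
    // "let z = (x - y) / (x + y)": a named constant, reused three times below.
    val z = (x - y) / (x + y)
    // Writing "z = 0.0" at this point would not compile,
    // because a val cannot be reassigned.
    math.cos(z) + math.cos(2 * z) + math.cos(3 * z)
  }
}
```

For 𝑥 = 𝑦 = 1 we get 𝑧 = 0, so the result should be 3 · cos 0 = 3.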

1.7.4 Nameless functions in mathematical notation


Functions in mathematics are mappings from one set to another. A function does
not necessarily need a name; we just need to define the mapping. However, name-
less functions have not been widely used in the conventional mathematical no-
tation. It turns out that nameless functions are important in functional program-
ming because, in particular, they allow programmers to write code with a straight-
forward and consistent syntax.
Nameless functions contain bound variables that are invisible outside the func-
tion’s scope. This property is directly reflected by the prevailing mathematical
conventions. Compare the formulas:
$$f(x) = \int_0^x \frac{dx}{1+x} ; \qquad f(x) = \int_0^x \frac{dz}{1+z} .$$

The mathematical convention is that one may rename the integration variable at
will, and so these formulas define the same function 𝑓 .
In programming, one situation when a variable “may be renamed at will” is
when the variable represents an argument of a function. We can see that the notations
$\frac{dx}{1+x}$ and $\frac{dz}{1+z}$ correspond to a nameless function whose argument was renamed
from 𝑥 to 𝑧. In FP notation, this nameless function would be denoted as $z \to \frac{1}{1+z}$,
and the integral rewritten as code such as:
integration(0, x, { z => 1.0 / (1 + z) } )

Now compare the mathematical notations for integration and for summation:
$\int_0^x \frac{dz}{1+z}$ and $\sum_{k=0}^{x} \frac{1}{1+k}$. The integral defines a bound variable 𝑧 via the special symbol
“𝑑”, while the summation places a bound variable 𝑘 in a subscript under $\sum$. The
notation could be made more consistent by using nameless functions explicitly,
for example like this:

denote summation by $\sum_0^x \left( k \to \frac{1}{1+k} \right)$ instead of $\sum_{k=0}^{x} \frac{1}{1+k}$ ,
denote integration by $\int_0^x \left( z \to \frac{1}{1+z} \right)$ instead of $\int_0^x \frac{dz}{1+z}$ .

In the new notation, the summation symbol $\sum_0^x$ does not mention the name “𝑘”
but takes a function as an argument. Similarly, the integration symbol $\int_0^x$ does
not mention “𝑧” and does not use the special symbol “𝑑” but takes a function as
an argument. Written in this way, the operations of summation and integration
become functions that take functions as arguments. The above summation may be
written as a Scala function:
summation(0, x, { y => 1.0 / (1 + y) } )
We could implement summation(a, b, g) as:
def summation(a: Int, b: Int, g: Int => Double): Double = (a to b).map(g).sum

scala> summation(1, 10, x => math.sqrt(x))
res0: Double = 22.4682781862041
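Since summation takes a function as its argument, the name of the bound variable plays no role in the result. A small sketch (the upper limit 100 and the names s1, s2, s3, g, and Summation are chosen here for illustration; the function summation is copied from the code above):

```scala
object Summation {
  // The function from the text.
  def summation(a: Int, b: Int, g: Int => Double): Double = (a to b).map(g).sum

  val x = 100

  // The same sum written with two different names for the bound variable.
  val s1: Double = summation(0, x, k => 1.0 / (1 + k))
  val s2: Double = summation(0, x, y => 1.0 / (1 + y))

  // A named function can be passed as well.
  def g(k: Int): Double = 1.0 / (1 + k)
  val s3: Double = summation(0, x, g)
}
```

All three values should be equal because exactly the same computation is performed in each case.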
Integration requires longer code since the computations are more complicated.
Simpson’s rule¹⁰ gives the following formulas for numerical integration:

$$\text{simpson}(a, b, g, \varepsilon) = \frac{\delta_x}{3} \left( g(a) + g(b) + 4 s_1 + 2 s_2 \right) ,$$
$$\text{where} \quad n = 2 \left\lfloor \frac{b-a}{\varepsilon} \right\rfloor , \quad \delta_x = \frac{b-a}{n} ,$$
$$s_1 = \sum_{k=1,3,...,n-1} g(a + k \delta_x) , \quad s_2 = \sum_{k=2,4,...,n-2} g(a + k \delta_x) .$$

Here is a straightforward line-by-line translation of these formulas into Scala:
def simpson(a: Double, b: Double, g: Double => Double, eps: Double): Double = {
  // First, we define some helper values and functions corresponding
  // to the definitions "where n = ..." in the mathematical formulas.
  val n: Int = 2 * ((b - a) / eps).toInt
  val delta_x = (b - a) / n
  val s1 = (1 to (n - 1) by 2).map { k => g(a + k * delta_x) }.sum
  val s2 = (2 to (n - 2) by 2).map { k => g(a + k * delta_x) }.sum
  // Now we can write the expression for the final result.
  delta_x / 3 * (g(a) + g(b) + 4 * s1 + 2 * s2)
}

10 https://en.wikipedia.org/wiki/Simpson%27s_rule


scala> simpson(0, 5, x => x*x*x*x, eps = 0.01) // The answer is 625.


res0: Double = 625.0000000004167

scala> simpson(0, 7, x => x*x*x*x*x*x, eps = 0.01) // The answer is 117649.


res1: Double = 117649.00000014296

The entire code is one large expression, with a few sub-expressions (s1, s2, etc.)
defined within the local scope of the function (that is, within the function’s
body). The code contains no loops. This is similar to the way a mathematical
text would define Simpson’s rule. In other words, this code is written in the FP
paradigm. Similar code can be written in any programming language that sup-
ports nameless functions as arguments of other functions.
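
The call integration(0, x, ...) shown earlier is not defined in the text. As a minimal sketch, we may assume that it is implemented by reusing Simpson's rule with a fixed precision parameter. The name integration and the value eps = 0.001 are assumptions made only for this illustration:

```scala
// Simpson's rule, repeated from the text so that this sketch is self-contained.
def simpson(a: Double, b: Double, g: Double => Double, eps: Double): Double = {
  val n: Int = 2 * ((b - a) / eps).toInt
  val delta_x = (b - a) / n
  val s1 = (1 to (n - 1) by 2).map { k => g(a + k * delta_x) }.sum
  val s2 = (2 to (n - 2) by 2).map { k => g(a + k * delta_x) }.sum
  delta_x / 3 * (g(a) + g(b) + 4 * s1 + 2 * s2)
}

// Hypothetical helper: integration with a fixed (assumed) precision parameter.
def integration(a: Double, b: Double, g: Double => Double): Double =
  simpson(a, b, g, eps = 0.001)

// The integral of 1 / (1 + z) from 0 to x is known to equal log(1 + x).
def f(x: Double): Double = integration(0, x, { z => 1.0 / (1 + z) })
```

The result of f(2.0) can be compared with math.log(3.0). The name of the argument z plays no role outside the nameless function, just as the integration variable plays no role outside the integral.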

1.7.5 Named and nameless expressions


It is a significant advantage if a programming language supports unnamed (or
“nameless”) expressions. To see this, consider a familiar situation where we take
the absence of names for granted.
In today’s programming languages, we may directly write expressions such as
(x + 123) * y / (4 + x). Note that the entire expression does not need to have a
name. Parts of that expression (e.g., the sub-expressions x + 123 or 4 + x) also do
not have separate names. It would be inconvenient if we needed to assign a name
to each sub-expression. The code for (x + 123) * y / (4 + x) would look like this:
{
val r0 = 123
val r1 = x + r0
val r2 = r1 * y
val r3 = 4
val r4 = r3 + x
val r5 = r2 / r4 // Do we still remember what `r2` means?
r5
}

This style of programming resembles assembly languages: every sub-expression
(that is, every step of every calculation) needs to be written into a separate
memory address or a CPU register.
Programmers become more productive when their programming language sup-
ports nameless expressions. This is also common practice in mathematics; names
are assigned when needed, but most expressions remain nameless.
It is also useful to be able to create nameless data structures. For instance, a
dictionary (also called a “map” or a “hashmap”) is created in Scala with this code:
Map("a" -> 1, "b" -> 2, "c" -> 3)

This is a nameless expression whose value is a dictionary. In programming languages
that do not have such a construction, programmers have to write special
code that creates an initially empty dictionary and then fills in one value at a time:

// Scala code creating a dictionary:
Map("a" -> 1, "b" -> 2, "c" -> 3)

// Shortest Java code for the same:
new HashMap<String, Integer>() {{
    put("a", 1);
    put("b", 2);
    put("c", 3);
}}
Nameless functions are useful for the same reason as other nameless values:
they allow us to build larger programs from simpler parts in a uniform way.

1.7.6 Historical perspective on nameless functions


What this book calls (for clarity) a “nameless function” is also known as an anony-
mous function, a function expression, a function literal, a closure, a lambda func-
tion, a lambda expression, or just a “lambda”.
Nameless functions were first used in 1936 in a theoretical programming lan-
guage called “𝜆-calculus”. In that language,11 all functions are nameless and have
a single argument. The Greek letter 𝜆 is a syntax separator that denotes function
arguments in nameless functions. For example, the nameless function 𝑥 → 𝑥 + 1
would be written as 𝜆𝑥. 𝑎𝑑𝑑 𝑥 1 in 𝜆-calculus if it had a function 𝑎𝑑𝑑 for adding
integers (but it does not).
In most programming languages that were in use until around 1990, all func-
tions required names. But by 2015, the use of nameless functions in the map/reduce
programming style turned out to be so productive that most newly created lan-
guages included nameless functions, while older languages added that feature.
Table 1.2 shows the year when various languages supported nameless functions.

11 Although called a “calculus,” it is a (drastically simplified) programming language, not related to
differential or integral calculus. Practitioners of functional programming do not need to study
the theory of 𝜆-calculus. The practically relevant knowledge that comes from 𝜆-calculus will
be explained in Chapter 4.


Language          Year  Code for 𝑘 → 𝑘 + 1
𝜆-calculus        1936  𝜆𝑘. 𝑎𝑑𝑑 𝑘 1
typed 𝜆-calculus  1940  𝜆𝑘 : 𝑖𝑛𝑡. 𝑎𝑑𝑑 𝑘 1
LISP              1958  (lambda (k) (+ k 1))
ALGOL 68          1968  (INT k) INT: k + 1
Standard ML       1973  fn (k: int) => k + 1
Caml              1985  fun (k: int) -> k + 1
Erlang            1986  fun(K) -> K + 1 end
Haskell           1990  \ k -> k + 1
Oz                1991  fun {$ K} K + 1 end
R                 1993  function(k) k + 1
Python 1.0        1994  lambda k: k + 1
JavaScript        1995  function(k) { return k + 1; }
Mercury           1995  func(K) = K + 1
Ruby              1995  lambda { |k| k + 1 }
Lua 3.1           1998  function(k) return k + 1 end
Scala             2003  (k: Int) => k + 1
F#                2005  fun (k: int) -> k + 1
C# 3.0            2007  delegate(int k) { return k + 1; }
Clojure           2009  (fn [k] (+ k 1))
C++ 11            2011  [] (int k) { return k + 1; }
Go                2012  func(k int) int { return k + 1 }
Julia             2012  function(k :: Int) k + 1 end
Kotlin            2012  { k: Int -> k + 1 }
Swift             2014  { (k: Int) -> Int in return k + 1 }
Java 8            2014  (int k) -> k + 1
Rust              2015  |k: i32| k + 1

Table 1.2: Nameless functions in various programming languages.

2 Mathematical formulas as code.
II. Mathematical induction
We will now study more flexible ways of working with data collections in the
functional programming paradigm. The Scala standard library has methods for
performing general iterative computations, that is, computations defined by in-
duction. Translating mathematical induction into code is the focus of this chapter.
First, we need to become fluent in using tuple types with Scala collections.

2.1 Tuple types


2.1.1 Examples: Using tuples
Many standard library methods in Scala work with tuple types. A simple example
of a tuple is a pair of values, e.g., a pair of an integer and a string. The Scala syntax
for this type of pair is:
val a: (Int, String) = (123, "xyz")

The type expression (Int, String) denotes the type of this pair.
A triple is defined in Scala like this:
val b: (Boolean, Int, Int) = (true, 3, 4)

Pairs and triples are examples of tuples. A tuple can contain several values called
parts or fields of a tuple. A tuple’s parts can have different types, but the type of
each part (and the number of parts) is fixed once and for all. It is a type error to
use incorrect types in a tuple, or an incorrect number of parts:
scala> val bad: (Int, String) = (1, 2)
<console>:11: error: type mismatch;
found : Int(2)
required: String
val bad: (Int, String) = (1, 2)
^
scala> val bad: (Int, String) = (1, "a", 3)
<console>:11: error: type mismatch;
found : (Int, String, Int)
required: (Int, String)
val bad: (Int, String) = (1, "a", 3)
^


Parts of a tuple can be accessed by number, starting from 1. The Scala syntax for
tuple accessor methods looks like ._1, for example:
scala> val a = (123, "xyz")
a: (Int, String) = (123,xyz)

scala> a._1
res0: Int = 123

scala> a._2
res1: String = xyz

It is a type error to access a tuple part that does not exist:


scala> a._0
<console>:13: error: value _0 is not a member of (Int, String)
a._0
^

scala> a._5
<console>:13: error: value _5 is not a member of (Int, String)
a._5
^

Type errors are detected at compile time, before any computations begin.
Tuples can be nested such that any part of a tuple can be itself a tuple:
scala> val c: (Boolean, (String, Int), Boolean) = (true, ("abc", 3), false)
c: (Boolean, (String, Int), Boolean) = (true,(abc,3),false)

scala> c._1
res0: Boolean = true

scala> c._2
res1: (String, Int) = (abc,3)

scala> c._2._1
res2: String = abc

To define functions whose arguments are tuples, we could use the tuple acces-
sors. An example of such a function is:
def f(p: (Boolean, Int), q: Int): Boolean = p._1 && (p._2 > q)

The first argument, p, of this function, has a tuple type. The function body uses
accessor methods (._1 and ._2) to compute the result value. Note that the second
part of the tuple p is of type Int, so it is valid to compare it with an integer q. It
would be a type error to compare the tuple p with an integer using the expression
p > q. It would be also a type error to apply the function f to an argument p that
has a wrong type, e.g., the type (Int, Int) instead of (Boolean, Int).
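
As a small illustration of applying the function f defined above, note that the tuple argument is written in its own pair of parentheses. The argument values here are chosen arbitrarily:

```scala
def f(p: (Boolean, Int), q: Int): Boolean = p._1 && (p._2 > q)

val r1: Boolean = f((true, 5), 3)   // Computes true && (5 > 3).
val r2: Boolean = f((true, 2), 3)   // Computes true && (2 > 3).
val r3: Boolean = f((false, 5), 3)  // The first part of the tuple is false.
```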

2.1.2 Pattern matching for tuples


Instead of using accessor methods when working with tuples, it is often conve-
nient to use pattern matching. Pattern matching occurs in Scala as:

• a destructuring definition: val 𝑝𝑎𝑡𝑡𝑒𝑟𝑛 = ...

• a case expression: case 𝑝𝑎𝑡𝑡𝑒𝑟𝑛 => ...

Here is an example of a destructuring definition:


scala> val g = (1, 2, 3)
g: (Int, Int, Int) = (1,2,3)

scala> val (x, y, z) = g


x: Int = 1
y: Int = 2
z: Int = 3

The value g is a tuple of three integers. After defining g, we define the three
variables x, y, z at once in a single val definition. We imagine that this definition
“destructures” the data structure contained in g and decomposes it into three parts,
then assigns the names x, y, z to these parts. The types of x, y, z are also assigned
automatically.
In the example above, the left-hand side of the destructuring definition contains
the tuple pattern (x, y, z) that looks like a tuple, except that its parts are names
x, y, z that are so far undefined. These names are called pattern variables. The de-
structuring definition checks whether the structure of the value of g “matches” the
given pattern. (If g does not contain a tuple with exactly three parts, the definition
will fail.) This computation is called pattern matching.
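
A destructuring definition may also use a nested pattern. The following sketch reuses the nested tuple c from the previous subsection; the names flag1, name, count, and flag2 are made up for this illustration:

```scala
val c: (Boolean, (String, Int), Boolean) = (true, ("abc", 3), false)

// The pattern repeats the nested structure of the tuple.
val (flag1, (name, count), flag2) = c

// The same values obtained through accessor methods, for comparison.
val nameViaAccessor: String = c._2._1
val countViaAccessor: Int = c._2._2
```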
Pattern matching is often used for working with tuples. Look at this example:
scala> (1, 2, 3) match { case (a, b, c) => a + b + c }
res0: Int = 6

The expression { case (a, b, c) => ... } is called a case expression. It performs
pattern matching on its argument. The pattern matching will “destructure” (i.e.,
decompose) a tuple and try to match it to the given pattern (a, b, c). In this
pattern, a, b, c are as yet undefined new variables; they are called pattern variables.
If the pattern matching succeeds, the pattern variables a, b, c are assigned
their values, and the function body can proceed to perform its computation. In
this example, the pattern variables a, b, c will be assigned values 1, 2, and 3, and
so the expression evaluates to 6.
Pattern matching is especially convenient for nested tuples. Here is an example
where a nested tuple p is destructured by pattern matching:
def t1(p: (Int, (String, Int))): String = p match {
case (x, (str, y)) => str + (x + y).toString
}


scala> t1((10, ("result is ", 2)))


res0: String = result is 12

The type structure of the argument (Int, (String, Int)) is visually repeated in the
pattern (x, (str, y)), making it clear that x and y become integers and str becomes
a string after pattern matching.
If we rewrite the code of t1 using the tuple accessor methods instead of pattern
matching, the code will look like this:
def t2(p: (Int, (String, Int))): String = p._2._1 + (p._1 + p._2._2).toString

This code is shorter but harder to read. For example, it is not immediately clear
that p._2._1 is a string. It is also harder to modify this code: Suppose we want to
change the type of the tuple p to ((Int, String), Int). Then the new code is:
def t3(p: ((Int, String), Int)): String = p._1._2 + (p._1._1 + p._2).toString

It takes time to verify, by going through every accessor method, that the function
t3 computes the same expression as t2. In contrast, the code is changed easily
when using the pattern matching expression instead of the accessor methods. We
only need to change the type and the pattern:
def t4(p: ((Int, String), Int)): String = p match {
case ((x, str), y) => str + (x + y).toString
}

It is easy to see that t4 and t1 compute the same result. Also, the names of pattern
variables may be chosen to get more clarity.
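
To check this claim on a concrete value, we can apply t1 and t4 to the same data arranged in the two different tuple shapes. This is only a sketch with arbitrarily chosen test data:

```scala
def t1(p: (Int, (String, Int))): String = p match {
  case (x, (str, y)) => str + (x + y).toString
}

def t4(p: ((Int, String), Int)): String = p match {
  case ((x, str), y) => str + (x + y).toString
}

// The same three values, nested in two different ways.
val viaT1: String = t1((10, ("result is ", 2)))
val viaT4: String = t4(((10, "result is "), 2))
```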
Sometimes we only need to use certain parts of a tuple in a pattern match. The
following syntax is used to make that clear:
scala> val (x, _, _, z) = ("abc", 123, false, true)
x: String = abc
z: Boolean = true

The underscore symbol (_) denotes the parts of the pattern that we want to ignore.
The underscore will always match any value regardless of its type.
Scala has a shorter syntax for functions such as {case (x, y) => y} that extract
elements from tuples. The syntax looks like (t => t._2) or equivalently _._2, as
illustrated here:
scala> val p: ((Int, Int )) => Int = { case (x, y) => y }
p: ((Int, Int)) => Int = <function1>

scala> p((1, 2))


res0: Int = 2

scala> val q: ((Int, Int )) => Int = (t => t._2)


q: ((Int, Int)) => Int = <function1>

scala> q((1, 2))


res1: Int = 2

scala> Seq( (1, 10), (2, 20), (3, 30) ).map(_._2)


res2: Seq[Int] = List(10, 20, 30)
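
The three ways of writing such a function can be compared side by side. In this sketch, the value names are chosen only for illustration:

```scala
val pairs: Seq[(Int, Int)] = Seq((1, 10), (2, 20), (3, 30))

val viaCase = pairs.map { case (_, y) => y }  // Ignore the first part via the underscore.
val viaAccessor = pairs.map(t => t._2)        // Use the accessor method explicitly.
val viaUnderscore = pairs.map(_._2)           // The shortest syntax.

val firstParts = pairs.map(_._1)              // Extract the first parts instead.
```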

2.1.3 Using tuples with collections


Tuples can be combined with any other types without restrictions. For instance,
we can define a tuple of functions:
val q: (Int => Int, Int => Int) = (x => x + 1, x => x - 1)

scala> q._1(3)
res0: Int = 4

We can create a list of tuples:


val r: List[(String, Int)] = List(("apples", 3), ("oranges", 2), ("pears", 0))

We could define a tuple of lists of tuples of functions, or any other combination.


Here is an example of using the standard method map to transform a list of tu-
ples. (As usual, we speak of “data transformation” even though the original list
remains unchanged.) The argument of map must be a function taking a tuple as its
argument. It is convenient to use pattern matching for writing such functions:
scala> val basket: List[(String, Int)] = List(("apples", 3), ("pears", 2),
("lemons", 0))
basket: List[(String, Int)] = List((apples,3), (pears,2), (lemons,0))

scala> basket.map { case (fruit, count) => count * 2 }


res1: List[Int] = List(6, 4, 0)

scala> basket.map { case (fruit, count) => count * 2 }.sum


res2: Int = 10

In this way, we can use the standard methods such as map, filter, max, sum to ma-
nipulate sequences of tuples. The names of the pattern variables (“fruit”, “count”)
are chosen to help us remember the meaning of the parts of tuples.
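
For instance, filter and max can be applied to the same basket in a similar way. The following lines are a sketch; the value names nonEmpty, largestCount, and names are assumptions made for this illustration:

```scala
val basket: List[(String, Int)] = List(("apples", 3), ("pears", 2), ("lemons", 0))

// Keep only the fruits that are actually present in the basket.
val nonEmpty = basket.filter { case (fruit, count) => count > 0 }

// The largest count among all fruits.
val largestCount = basket.map { case (fruit, count) => count }.max

// The names of the fruits that are present.
val names = nonEmpty.map { case (fruit, count) => fruit }
```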
We can easily transform a list of tuples into a list of values of a different type:
scala> basket.map { case (fruit, count) =>
val isAcidic = (fruit == "lemons")
(fruit, isAcidic)
}
res3: List[(String, Boolean)] = List((apples,false), (pears,false),
(lemons,true))

In the Scala syntax, a nameless function written with braces { ... } may define
local values in its body. The return value of the function is the last expression
written in the function body. In this example, the return value of the nameless
function is the tuple (fruit, isAcidic).

2.1.4 Treating dictionaries as collections


In the Scala standard library, tuples are frequently used as types of intermediate
values. For instance, tuples are used when iterating over dictionaries. The Scala
type Map[K, V] represents a dictionary with keys of type K and values of type V.
Here K and V are type parameters. Type parameters represent unknown types that
will be chosen later, when working with values having specific types.
In order to create a dictionary with given keys and values, we can write:
Map(("apples", 3), ("oranges", 2), ("pears", 0))

The same result is obtained by first creating a sequence of key/value pairs and
then converting that sequence into a dictionary via the method toMap:
List(("apples", 3), ("oranges", 2), ("pears", 0)).toMap

The same method works for other collection types such as Seq, Vector, and Array.
The Scala library defines a special infix syntax for pairs via the arrow symbol
->. The expression x -> y is equivalent to the pair (x, y):
scala> "apples" -> 3
res0: (String, Int) = (apples,3)

With this syntax, the code for creating a dictionary is easier to read:
Map("apples" -> 3, "oranges" -> 2, "pears" -> 0)
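
The three ways of creating a dictionary shown above give equal results. A short sketch that checks this (the names d1, d2, d3 are chosen only for illustration):

```scala
val d1 = Map(("apples", 3), ("oranges", 2), ("pears", 0))
val d2 = List(("apples", 3), ("oranges", 2), ("pears", 0)).toMap
val d3 = Map("apples" -> 3, "oranges" -> 2, "pears" -> 0)

val allEqual: Boolean = d1 == d2 && d2 == d3

// Looking up a value by its key.
val orangeCount: Int = d3("oranges")
```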

The method toSeq converts a dictionary into a sequence of pairs:


scala> Map("apples" -> 3, "oranges" -> 2, "pears" -> 0).toSeq
res20: Seq[(String, Int)] = ArrayBuffer((apples,3), (oranges,2), (pears,0))

The ArrayBuffer is one of the many list-like data structures in the Scala library.
All these data structures are subtypes of the common “sequence” type Seq. The
methods defined in the Scala standard library sometimes return different imple-
mentations of the Seq type for reasons of performance.
The standard library has several methods that need tuple types, such as map and
filter (when used with dictionaries), toMap, zip, and zipWithIndex. The methods
flatten, flatMap, groupBy, and sliding also work with most collection types, includ-
ing dictionaries and sets. It is important to become familiar with these methods,
because it will help writing code that uses sequences, sets, and dictionaries. Let
us now look at these methods one by one.
The methods map and toMap Chapter 1 showed how the map method works on
sequences: the expression xs.map(f) applies a given function f to each element of
the sequence xs, gathering the results in a new sequence. In this sense, we can say
that the map method “iterates over” sequences. The map method works similarly on
dictionaries, except that iterating over a dictionary of type Map[K, V] when applying
map looks like iterating over a sequence of pairs, Seq[(K, V)]. If d: Map[K, V] is
a dictionary, the argument f of d.map(f) must be a function operating on tuples of
type (K, V). Typically, such functions are written using case expressions:
val fruitBasket = Map("apples" -> 3, "pears" -> 2, "lemons" -> 0)

scala> fruitBasket.map { case (fruit, count) => count * 2 }


res0: Seq[Int] = ArrayBuffer(6, 4, 0)

When using map to transform a dictionary into a sequence of pairs, the result is
again a dictionary. But when an intermediate result is not a sequence of pairs, we
may need to use toMap:
scala> fruitBasket.map { case (fruit, count) => (fruit, count * 2) }
res1: Map[String,Int] = Map(apples -> 6, pears -> 4, lemons -> 0)

scala> fruitBasket.map { case (fruit, count) => (fruit, count, count * 2) }.


map { case (fruit, _, count2) => (fruit, count2 / 2) }.toMap
res2: Map[String,Int] = Map(apples -> 3, pears -> 2, lemons -> 0)

The method filter works on dictionaries by iterating on key/value pairs. The
filtering predicate must be a function of type ((K, V)) => Boolean. For example:
scala> fruitBasket.filter { case (fruit, count) => count > 0 }
res2: Map[String,Int] = Map(apples -> 3, pears -> 2)

The methods zip and zipWithIndex The zip method takes two sequences and
produces a sequence of pairs, taking one element from each sequence:
scala> val s = List(1, 2, 3)
s: List[Int] = List(1, 2, 3)

scala> val t = List(true, false, true)


t: List[Boolean] = List(true, false, true)

scala> s.zip(t)
res3: List[(Int, Boolean)] = List((1,true), (2,false), (3,true))

scala> s zip t
res4: List[(Int, Boolean)] = List((1,true), (2,false), (3,true))

In the last line, the equivalent “dotless” infix syntax (s zip t) is shown to illustrate
a syntax convention of Scala that we will sometimes use.
The zip method works equally well on dictionaries: in that case, dictionaries are
automatically converted to sequences of pairs before applying zip.
The zipWithIndex method creates a sequence of pairs where the second value in
the pair is a zero-based index:
scala> List("a", "b", "c").zipWithIndex
res5: List[(String, Int)] = List((a,0), (b,1), (c,2))
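
Two further properties of these methods are worth a quick check: the result of zipWithIndex is an ordinary sequence of pairs that can be transformed with map, and zip stops at the end of the shorter sequence when the lengths differ. A sketch with arbitrarily chosen data:

```scala
val letters = List("a", "b", "c")

// Pair each element with its index, then swap the parts of each pair.
val indexed: List[(Int, String)] = letters.zipWithIndex.map { case (s, i) => (i, s) }

// When the lengths differ, zip stops at the end of the shorter sequence.
val truncated: List[(Int, String)] = List(1, 2, 3, 4, 5).zip(letters)
```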


The method flatten converts a nested sequence type, such as List[List[A]], into
a simple List[A] by concatenating all inner sequences into one:
scala> List(List(1, 2), List(2, 3), List(3, 4)).flatten
res6: List[Int] = List(1, 2, 2, 3, 3, 4)

In Scala, sequences and other collections (such as sets and dictionaries) are gener-
ally concatenated using the operation ++. For example:
scala> List(1, 2, 3) ++ List(4, 5, 6) ++ List(0)
res7: List[Int] = List(1, 2, 3, 4, 5, 6, 0)

So, one can say that the flatten method inserts the operation ++ between all the
inner sequences.
Note that flatten removes only one level of nesting at the top of the data type.
If applied to a List[List[List[Int]]], the flatten method returns a List[List[Int]]
with inner lists unchanged:
scala> List(List(List(1), List(2)), List(List(2), List(3))).flatten
res8: List[List[Int]] = List(List(1), List(2), List(2), List(3))

The method flatMap is closely related to flatten and can be seen as a shortcut,
equivalent to first applying map and then flatten:
scala> List(1, 2, 3, 4).map(n => (1 to n).toList)
res9: List[List[Int]] = List(List(1), List(1, 2), List(1, 2, 3), List(1, 2, 3,
4))

scala> List(1, 2, 3, 4).map(n => (1 to n).toList).flatten


res10: List[Int] = List(1, 1, 2, 1, 2, 3, 1, 2, 3, 4)

scala> List(1, 2, 3, 4).flatMap(n => (1 to n).toList)


res11: List[Int] = List(1, 1, 2, 1, 2, 3, 1, 2, 3, 4)

The flatMap operation transforms a sequence by replacing each element by some
number (zero or more) of new elements.
At first sight it may be unclear why flatMap is useful, as map and flatten appear
to be unrelated. (Should we also combine filter and flatten into a “flatFilter”?)
However, we will see later in this book that flatMap describes nested iterations
and can be generalized to many other data types. This chapter’s examples and
exercises will illustrate the use of flatMap with sequences.
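
As a first small illustration of replacing each element by zero or more new elements, consider the following sketch (the rule applied to odd and even numbers is invented for this example):

```scala
val xs = List(1, 2, 3, 4)

// Each odd number is dropped (replaced by an empty list);
// each even number n is replaced by two elements: n and n * 10.
val ys: List[Int] = xs.flatMap { n => if (n % 2 == 0) List(n, n * 10) else List() }

// The same result is obtained via map followed by flatten.
val zs: List[Int] = xs.map { n => if (n % 2 == 0) List(n, n * 10) else List() }.flatten
```

Here the resulting list has the same length as the original one only by coincidence: in general, flatMap may shorten or lengthen a sequence.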
The method groupBy rearranges a sequence into a dictionary where some ele-
ments of the original sequence are grouped together into subsequences. For ex-
ample, given a sequence of words, we can group all words that start with the letter
"y" into one subsequence, and all other words into another subsequence. This is
accomplished by the following code:
scala> Seq("xenon", "yogurt", "zebra").groupBy(s => if (s startsWith "y") 1
else 2)
res12: Map[Int,Seq[String]] = Map(1 -> List(yogurt), 2 -> List(xenon, zebra))


The argument of the groupBy method is a function that computes a “key” out of each
sequence element. The key can have an arbitrarily chosen type. (In the current
example, that type is Int.) The result of groupBy is a dictionary that maps each key
to the sub-sequence of values that have that key. (In the current example, the type
of the dictionary is therefore Map[Int, Seq[String]].) The order of elements in the
sub-sequences remains the same as in the original sequence.
As another example of using groupBy, the following code will group together all
numbers that have the same remainder after division by 3:
scala> List(1, 2, 3, 4, 5).groupBy(k => k % 3)
res13: Map[Int,List[Int]] = Map(2 -> List(2, 5), 1 -> List(1, 4), 0 -> List(3))
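
Since the result of groupBy is a dictionary, it can be transformed further with map and a case expression. The following sketch counts words by their first letter; the word list and the value names are assumptions made for this illustration:

```scala
val words = Seq("xenon", "yogurt", "zebra", "yak", "zoo")

// Group the words by their first letter.
val byFirstLetter: Map[Char, Seq[String]] = words.groupBy(w => w.head)

// Replace each group by its size, keeping the keys.
val counts: Map[Char, Int] = byFirstLetter.map { case (letter, ws) => (letter, ws.length) }
```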

The method sliding creates a sequence of sliding windows of a given width:


scala> (1 to 10).sliding(4).toList
res14: List[IndexedSeq[Int]] = List(Vector(1, 2, 3, 4), Vector(2, 3, 4, 5),
Vector(3, 4, 5, 6), Vector(4, 5, 6, 7), Vector(5, 6, 7, 8), Vector(6, 7, 8,
9), Vector(7, 8, 9, 10))

After creating a nested sequence, we can apply an aggregation operation to the
inner sequences. For example, the following code computes a sliding-window
average with window width 50 over an array of 100 numbers:
scala> (1 to 100).map(x => math.cos(x)).sliding(50).map(_.sum /
50).take(5).toList
res15: List[Double] = List(-0.005153079196990285, -0.0011160413780774369,
0.003947079736951305, 0.005381273944717851, 0.0018679497047270743)
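
The same computation is easier to follow on a short list where the averages can be checked by hand. This is a sketch with made-up data and a window width of 3:

```scala
val data = List(1.0, 2.0, 3.0, 4.0, 5.0)
val width = 3

// The windows are (1, 2, 3), (2, 3, 4), (3, 4, 5); we average each of them.
val averages: List[Double] = data.sliding(width).map(w => w.sum / width).toList
```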

The method sortBy sorts a sequence according to a sorting key. The argument of
sortBy is a function that computes the sorting key from a sequence element. This
gives us flexibility to sort elements in a custom way:
scala> Seq(1, 2, 3).sortBy(x => -x)
res0: Seq[Int] = List(3, 2, 1)

scala> Seq("xx", "z", "yyy").sortBy(word => word) // Sort alphabetically.


res1: Seq[String] = List(xx, yyy, z)

scala> Seq("xx", "z", "yyy").sortBy(word => word.length) // Sort by word length.


res2: Seq[String] = List(z, xx, yyy)

Sorting by the elements themselves, as we have done here with .sortBy(word => word),
is only possible if the element’s type has a well-defined ordering. For
strings, this is the alphabetic ordering, and for integers, the standard arithmetic
ordering. For such types, a convenience method sorted is defined, and works
equivalently to sortBy(x => x):
scala> Seq("xx", "z", "yyy").sorted
res3: Seq[String] = List(xx, yyy, z)
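
The sortBy method combines well with tuples: the sorting key can be computed from any part of the tuple by using a case expression. A sketch using the fruit basket from the earlier examples (the value names are chosen for illustration):

```scala
val basket = List(("apples", 3), ("pears", 2), ("lemons", 0))

val byCount = basket.sortBy { case (fruit, count) => count }             // Smallest count first.
val byCountDescending = basket.sortBy { case (fruit, count) => -count }  // Largest count first.
val byName = basket.sortBy { case (fruit, count) => fruit }              // Alphabetically by name.
```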


2.1.5 Examples: Tuples and collections


Example 2.1.5.1 For a given sequence 𝑥𝑖 , compute the sequence of pairs of values
(cos 𝑥𝑖 , sin 𝑥𝑖 ).
Hint: use map, assume xs: Seq[Double].
Solution We need to produce a sequence that has a pair of values correspond-
ing to each element of the original sequence. This transformation is exactly what
the map method does. So, the code is:
xs.map { x => (math.cos(x), math.sin(x)) }

Example 2.1.5.2 Count how many times cos 𝑥𝑖 > sin 𝑥𝑖 occurs in a sequence 𝑥𝑖 .
Hint: use count, assume xs: Seq[Double].
Solution The method count takes a predicate and returns the number of se-
quence elements for which the predicate is true:
xs.count { x => math.cos(x) > math.sin(x) }
We could also reuse the solution of Example 2.1.5.1 that computed the cosine and
the sine values. The code would then become:
xs.map { x => (math.cos(x), math.sin(x)) }
.count { case (cosine, sine) => cosine > sine }

Example 2.1.5.3 For given sequences 𝑎𝑖 and 𝑏𝑖 of Double values, compute the se-
quence of differences 𝑐𝑖 = 𝑎𝑖 − 𝑏𝑖 .
Hint: use zip, map, and assume as and bs have equal length.
Solution We can use zip on as and bs, which gives a sequence of pairs:
as.zip(bs): Seq[(Double, Double)]
We then compute the differences 𝑎𝑖 − 𝑏𝑖 by applying map to this sequence:
as.zip(bs).map { case (a, b) => a - b }

Example 2.1.5.4 In a given sequence 𝑝𝑖 , count how many times 𝑝𝑖 > 𝑝𝑖+1 occurs.
Hint: use zip and tail.
Solution Given ps: Seq[Double], we can compute ps.tail. The result is a se-
quence that is one element shorter than ps, for example:
scala> val ps = Seq(1, 2, 3, 4)
ps: Seq[Int] = List(1, 2, 3, 4)

scala> ps.tail
res0: Seq[Int] = List(2, 3, 4)
Taking a zip of the two sequences ps and ps.tail, we get a sequence of pairs:
scala> ps.zip(ps.tail)
res1: Seq[(Int, Int)] = List((1,2), (2,3), (3,4))
Because ps.tail is one element shorter than ps, the resulting sequence of pairs
is also one element shorter than ps. So, it is not necessary to truncate ps before
computing ps.zip(ps.tail). Now apply the count method:


ps.zip(ps.tail).count { case (a, b) => a > b }
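
To test this code, we may use a short sequence where the answer is easy to count by hand. The sample values below are arbitrary:

```scala
val ps = Seq(1.0, 3.0, 2.0, 5.0, 4.0, 4.0)

// The pairs of neighbors are (1, 3), (3, 2), (2, 5), (5, 4), (4, 4).
val pairs: Seq[(Double, Double)] = ps.zip(ps.tail)

// Only the pairs (3, 2) and (5, 4) satisfy the condition a > b.
val decreases: Int = pairs.count { case (a, b) => a > b }
```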

Example 2.1.5.5 For a given 𝑘 > 0, compute the sequence $c_i = \max(b_{i-k}, \ldots, b_{i+k})$,
starting at 𝑖 = 𝑘.
Solution Applying the sliding method to a list gives a list of nested lists:
val b = List(1, 2, 3, 4, 5) // An example of a possible sequence `b`.

scala> b.sliding(3).toList
res0: List[List[Int]] = List(List(1, 2, 3), List(2, 3, 4), List(3, 4, 5))

For each 𝑖, we need to obtain a list of 2𝑘 + 1 nearby elements $(b_{i-k}, \ldots, b_{i+k})$. So, we
need to use sliding(2 * k + 1) to obtain a window of the required size. Now we
can compute the maximum of each of the nested lists by using the map method on
the outer list, with the max method applied to the nested lists. So, the argument of
the map method must be the function x => x.max (where x will have type List[Int]):
def c(b: List[Int], k: Int) = b.sliding(2 * k + 1).toList.map(x => x.max)

This code can be written more concisely using the syntax:


def c(b: List[Int], k: Int) = b.sliding(2 * k + 1).toList.map(_.max)

because, in Scala, _.max is the same as the nameless function x => x.max. Test this:
scala> c(b = List(1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1), k = 1) // Write the argument names for clarity.
res0: List[Int] = List(3, 4, 5, 6, 6, 6, 5, 4, 3)

Example 2.1.5.6 Create a 10 × 10 multiplication table as a dictionary having the
type Map[(Int, Int), Int]. For example, a 3 × 3 multiplication table would be given
by this dictionary:
Map( (1, 1) -> 1, (1, 2) -> 2, (1, 3) -> 3, (2, 1) -> 2,
(2, 2) -> 4, (2, 3) -> 6, (3, 1) -> 3, (3, 2) -> 6, (3, 3) -> 9 )

Hint: use flatMap and toMap.


Solution We are required to make a dictionary that maps pairs of integers (x,
y) to x * y. Begin by creating the list of keys for that dictionary, which must be a
list of pairs (x, y) of the form List((1,1), (1,2), ..., (2,1), (2,2), ...). We need
to iterate over a sequence of values of x; and for each x, we then need to iterate
over another sequence to provide values for y. Try this computation:
scala> val s = List(1, 2, 3).map(x => List(1, 2, 3))
s: List[List[Int]] = List(List(1, 2, 3), List(1, 2, 3), List(1, 2, 3))

We would like to get List((1,1), (1,2), (1,3)) etc., and so we use map on the inner
list with a nameless function y => (1, y) that converts a number into a tuple:
scala> List(1, 2, 3).map { y => (1, y) }
res0: List[(Int, Int)] = List((1,1), (1,2), (1,3))


The curly braces in {y => (1, y)} are only for clarity. We could also use round
parentheses and write List(1, 2, 3).map(y => (1, y)).
Now, we need to have (x, y) instead of (1, y) in the argument of map, where x
iterates over List(1, 2, 3) in the outside scope:
scala> val s = List(1, 2, 3).map(x => List(1, 2, 3).map { y => (x, y) })
s: List[List[(Int, Int)]] = List(List((1,1), (1,2), (1,3)), List((2,1), (2,2),
(2,3)), List((3,1), (3,2), (3,3)))

This is almost what we need, except that the nested lists need to be concatenated
into a single list. This is exactly what flatten does:
scala> val s = List(1, 2, 3).map(x => List(1, 2, 3).map { y => (x, y) }).flatten
s: List[(Int, Int)] = List((1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1),
(3,2), (3,3))

It is shorter to write .flatMap(...) instead of .map(...).flatten:


scala> val s = List(1, 2, 3).flatMap(x => List(1, 2, 3).map { y => (x, y) })
s: List[(Int, Int)] = List((1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1),
(3,2), (3,3))

This is the list of keys for the required dictionary. The dictionary needs to map
each pair of integers (x, y) to x * y. To create that dictionary, we will apply toMap
to a sequence of pairs (key, value), which in our case needs to be of the form of
a nested tuple ((x, y), x * y). To achieve that, we use map with a function that
computes the product and creates those nested tuples:
scala> val s = List(1, 2, 3).flatMap(x => List(1, 2, 3).map { y => (x, y) }).
map { case (x, y) => ((x, y), x * y) }
s: List[((Int, Int), Int)] = List(((1,1),1), ((1,2),2), ((1,3),3), ((2,1),2),
((2,2),4), ((2,3),6), ((3,1),3), ((3,2),6), ((3,3),9))

We can simplify this code if we notice that we are first mapping each y to a tu-
ple (x, y), and later mapping each tuple (x, y) to a nested tuple ((x, y), x * y).
Instead, the entire computation can be done in the inner map operation:
scala> val s = List(1, 2, 3).flatMap(x => List(1, 2, 3).map { y => ((x, y), x *
y) } )
s: List[((Int, Int), Int)] = List(((1,1),1), ((1,2),2), ((1,3),3), ((2,1),2),
((2,2),4), ((2,3),6), ((3,1),3), ((3,2),6), ((3,3),9))

Applying toMap, we convert this list of tuples to a dictionary. Also, for better read-
ability, we use Scala’s pair syntax, key -> value, which is equivalent to writing the
tuple (key, value):
(1 to 10).flatMap(x => (1 to 10).map { y => (x, y) -> x * y }).toMap
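
To verify the result, we can give the dictionary a name and look up a few entries. In this sketch, the names table, size, and product are assumptions made for illustration:

```scala
val table: Map[(Int, Int), Int] =
  (1 to 10).flatMap(x => (1 to 10).map { y => (x, y) -> x * y }).toMap

val size: Int = table.size        // One entry for each pair (x, y).
val product: Int = table((7, 8))  // The key is a pair, so it is written in its own parentheses.
```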

Example 2.1.5.7 For a given sequence $x_i$, compute the maximum of all of the
numbers $x_i$, $x_i^2$, $\cos x_i$, $\sin x_i$. Hint: use flatMap and max.
Solution We will compute the required value if we take max of a list containing
all of the numbers. To do that, first map each element of the list xs: Seq[Double]
into a sequence of four numbers:

scala> val xs = List(0.1, 0.5, 0.9) // An example list of `Double` values.
xs: List[Double] = List(0.1, 0.5, 0.9)

scala> xs.map { x => Seq(x, x * x, math.cos(x), math.sin(x)) }


res0: List[Seq[Double]] = List(List(0.1, 0.010000000000000002,
0.9950041652780258, 0.09983341664682815), List(0.5, 0.25,
0.8775825618903728, 0.479425538604203), List(0.9, 0.81, 0.6216099682706644,
0.7833269096274834))

This list is almost what we need, except we need to flatten it:


scala> res0.flatten
res1: List[Double] = List(0.1, 0.010000000000000002, 0.9950041652780258,
0.09983341664682815, 0.5, 0.25, 0.8775825618903728, 0.479425538604203, 0.9,
0.81, 0.6216099682706644, 0.7833269096274834)

It remains to take the maximum of the resulting numbers:


scala> res1.max
res2: Double = 0.9950041652780258

The final code (starting from a given sequence xs) is:


xs.flatMap { x => Seq(x, x * x, math.cos(x), math.sin(x)) }.max

Example 2.1.5.8 From a dictionary of type Map[String, String] mapping names to
addresses, and assuming that the addresses do not repeat, compute a dictionary
of type Map[String, String] mapping the addresses back to names.
Solution Iterating over a dictionary looks like iterating over a list of (key,
value) pairs. The result is converted to a Map automatically.
dict.map { case (name, addr) => (addr, name) } // This has type Map[String, String].
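
The assumption that addresses do not repeat matters because the addresses become keys, and a dictionary holds one value per key. The sketch below uses made-up sample data (the names `dict`, `reversed`, `shared`, and `collapsed` are illustrative) to show the reversal and what happens when two names share an address.

```scala
// Sample data invented for illustration.
val dict: Map[String, String] = Map("Alice" -> "1 Main St", "Bob" -> "2 Oak Ave")
val reversed: Map[String, String] = dict.map { case (name, addr) => (addr, name) }

// If two names share an address, the reversed dictionary keeps only one of them.
val shared: Map[String, String] = Map("Alice" -> "1 Main St", "Carol" -> "1 Main St")
val collapsed: Map[String, String] = shared.map { case (name, addr) => (addr, name) }
```

Here `collapsed` has a single entry, so one of the two names is lost.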

Example 2.1.5.9 Write the solution of Example 2.1.5.8 as a function with type
parameters Name and Addr instead of the fixed type String.
Solution In Scala, the syntax for type parameters in a function definition is:
def rev[Name, Addr](...) = ...

The type of the argument is Map[Name, Addr], while the type of the result is Map[Addr,
Name]. So, we use the type parameters Name and Addr in the type signature of the
function. The final code is:
def rev[Name, Addr](dict: Map[Name, Addr]): Map[Addr, Name] =
dict.map { case (name, addr) => (addr, name) }

The body of the function rev remains the same as in Example 2.1.5.8; only the type
signature changes. This is because the function rev works in the same way for
dictionaries of any type. For this reason, it was easy for us to change the specific
type String into type parameters in that function.
When the function rev is applied to a dictionary of a specific type, the Scala
compiler will automatically set the type parameters Name and Addr that fit the re-
quired types of the dictionary’s keys and values. For example, if we apply rev
to a dictionary of type Map[Boolean, Seq[String]], the type parameters will be set
automatically as Name = Boolean and Addr = Seq[String]:
scala> val d = Map(true -> Seq("x", "y"), false -> Seq("z", "t"))
d: Map[Boolean, Seq[String]] = Map(true -> List(x, y), false -> List(z, t))

scala> rev(d)
res0: Map[Seq[String], Boolean] = Map(List(x, y) -> true, List(z, t) -> false)

Type parameters can also be set explicitly when using the function rev. If the type
parameters are chosen incorrectly, the program will not compile:
scala> rev[Boolean, Seq[String]](d)
res1: Map[Seq[String],Boolean] = Map(List(x, y) -> true, List(z, t) -> false)

scala> rev[Int, Double](d)


<console>:14: error: type mismatch;
found : Map[Boolean,Seq[String]]
required: Map[Int,Double]
rev[Int, Double](d)
^

Example 2.1.5.10 Given a sequence words: Seq[String] of some “words”, compute
a sequence of type Seq[(Seq[String], Int)], where each inner sequence should
contain all the words having the same length, paired with the integer value
showing that length. The resulting sequence must be ordered by increasing length of
words. So, the input Seq("the", "food", "is", "good") should produce:
Seq((Seq("is"), 2), (Seq("the"), 3), (Seq("food", "good"), 4))

Solution Begin by grouping the words by length. The library method groupBy
takes a function that computes a “grouping key” from each element of a sequence.
To group by word length (computed via the method length), we write:
words.groupBy { word => word.length }

or, more concisely, words.groupBy(_.length). The result of this expression is a
dictionary that maps each length to the list of words having that length:
scala> words.groupBy(_.length)
res0: Map[Int,Seq[String]] = Map(2 -> List(is), 4 -> List(food, good), 3 ->
List(the))

This is close to what we need. If we convert this dictionary to a sequence, we will
get a list of pairs:
scala> words.groupBy(_.length).toSeq
res1: Seq[(Int, Seq[String])] = ArrayBuffer((2,List(is)), (4,List(food, good)),
(3,List(the)))

It remains to swap the length and the list of words and to sort the result by in-
creasing length. We can do this in any order: first sort, then swap; or first swap,
then sort. The final code is:
words
.groupBy(_.length)
.toSeq
.sortBy { case (len, words) => len }
.map { case (len, words) => (words, len) }

This can be written somewhat shorter if we use the code _._1 (equivalent to x =>
x._1) for selecting the first parts from pairs and swap for swapping the two elements
of a pair:
words.groupBy(_.length).toSeq.sortBy(_._1).map(_.swap)

However, the program may now be harder to read and to modify.

2.1.6 Reasoning about type parameters in collections


In Example 2.1.5.10 we have applied a chain of operations to a sequence. Let us
add comments showing the type of the intermediate result after each operation:
words // Seq[String]
.groupBy(_.length) // Map[Int, Seq[String]]
.toSeq // Seq[ (Int, Seq[String]) ]
.sortBy { case (len, words) => len } // Seq[ (Int, Seq[String]) ]
.map { case (len, words) => (words, len) } // Seq[ (Seq[String], Int) ]

In computations like this, the Scala compiler verifies at each step that the oper-
ations are applied to values of the correct types. Writing down the intermediate
types will help us write correct code.
For instance, sortBy is defined for sequences but not for dictionaries, so it would
be a type error to apply sortBy to a dictionary without first converting it to a se-
quence using toSeq. The type of the intermediate result after toSeq is Seq[ (Int,
Seq[String]) ], and the sortBy operation is applied to that sequence. So, the se-
quence element matched by { case (len, words) => len } is a tuple having the type
(Int, Seq[String]). Then the pattern variables len and words must have types Int
and Seq[String] respectively.
If we visualize how the type of the sequence should change at every step, we
can more quickly understand how to implement the required task. Begin by writ-
ing down the intermediate types that would be needed during the computation:
words: Seq[String] // After groupBy() by word length, will have type:
Map[Int, Seq[String]] // To sort by word length, convert to a sequence:
Seq[ (Int, Seq[String]) ] // Sort by the `Int` value; type is unchanged:
Seq[ (Int, Seq[String]) ] // It remains to swap the parts of the tuples:
Seq[ (Seq[String], Int) ] // We are done.

Having written down these types, we are better assured that the computation can
be done correctly. Writing the code becomes straightforward, since we are guided
by the already known types of the intermediate results:
words.groupBy(_.length).toSeq.sortBy(_._1).map(_.swap)
This example illustrates the main benefits of reasoning about types: it gives
direct guidance about how to organize the computation, together with a greater
confidence about code correctness.
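
One way to let the compiler check this reasoning is to give each intermediate result a name and an explicit type annotation. The following is a sketch of that idea applied to Example 2.1.5.10; the intermediate names `grouped`, `pairs`, `sorted`, and `result` are chosen here for illustration.

```scala
val words: Seq[String] = Seq("the", "food", "is", "good")

// Each annotation states the type we expect at that step; a mismatch is a compile error.
val grouped: Map[Int, Seq[String]] = words.groupBy(_.length)
val pairs: Seq[(Int, Seq[String])] = grouped.toSeq
val sorted: Seq[(Int, Seq[String])] = pairs.sortBy(_._1)
val result: Seq[(Seq[String], Int)] = sorted.map(_.swap)
```

If a step is written incorrectly (say, `toSeq` is omitted), the compiler reports a type mismatch at the line where the annotation no longer fits.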

2.1.7 Exercises: Tuples and collections


Exercise 2.1.7.1 Find all integer pairs 𝑖, 𝑗 where 0 ≤ 𝑖 ≤ 9 and 0 ≤ 𝑗 ≤ 9 and
𝑖 + 4 ∗ 𝑗 > 𝑖 ∗ 𝑗.
Hint: use flatMap and filter.
Exercise 2.1.7.2 Find all integer triples 𝑖, 𝑗, 𝑘 where 0 ≤ 𝑖 ≤ 9, 0 ≤ 𝑗 ≤ 9, 0 ≤ 𝑘 ≤ 9,
and 𝑖 + 4 ∗ 𝑗 + 9 ∗ 𝑘 > 𝑖 ∗ 𝑗 ∗ 𝑘.
Exercise 2.1.7.3 Given two sequences p: Seq[String] and q: Seq[Boolean] of equal
length, compute a Seq[String] with those elements of p for which the correspond-
ing element of q is true.
Hint: use zip, map, filter.
Exercise 2.1.7.4 Convert a Seq[Int] into a Seq[(Int, Boolean)] where the Boolean
value is true if an Int value is followed by a larger value. For example, the input
Seq(1, 3, 2, 4) must be converted into Seq((1,true),(3,false),(2,true),(4,false)).
The last value (here, 4) has no following value and is always paired with false.
Exercise 2.1.7.5 Given p: Seq[String] and q: Seq[Int] of equal length, compute a
Seq[String] that contains the strings from p ordered according to the correspond-
ing numbers from q. For example, if p = Seq("a", "b", "c") and q = Seq(10, -1, 5)
then the result must be Seq("b", "c", "a").
Exercise 2.1.7.6 Write the solution of Exercise 2.1.7.5 as a function with type pa-
rameter A instead of the fixed type String. The type signature and a sample test:
def reorder[A](p: Seq[A], q: Seq[Int]): Seq[A] = ??? // In Scala, ??? means "not yet implemented".

scala> reorder(Seq(6.0,2.0,8.0,4.0), Seq(20,10,40,30)) // Test with A = Double.


res0: Seq[Double] = List(2.0, 6.0, 4.0, 8.0)

Exercise 2.1.7.7 Given p: Seq[String] and q: Seq[Int] of equal length and assum-
ing that values in q do not repeat, compute a Map[Int, String] mapping numbers
from q to the corresponding strings from p.
Exercise 2.1.7.8 Write the solution of Exercise 2.1.7.7 as a function with type pa-
rameters P and Q instead of the fixed types Int and String. The function’s argu-
ments should be of types Seq[Q] and Seq[P], and the return type should be Map[P,
Q]. Run some tests using types P = Double and Q = Set[Boolean].

Exercise 2.1.7.9 Given a Seq[(String, Int)] showing a list of purchased items
(where item names may repeat), compute a Map[String, Int] showing the total
counts. So, for the input:
Seq(("apple", 2), ("pear", 3), ("apple", 5), ("lemon", 2), ("apple", 3))

the output must be: Map("apple" -> 10, "pear" -> 3, "lemon" -> 2).
Hint: use groupBy, map, sum.

Exercise 2.1.7.10 Given a Seq[Seq[Int]], compute a new Seq[Seq[Int]] where each
new inner sequence contains the 3 largest elements from the corresponding old
inner sequence, sorted in descending order (or fewer than 3 elements if the old
inner sequence is shorter). So, for the input:
Seq(Seq(0, 50, 5, 10, 30), Seq(10, 100), Seq(1, 2, 200, 20))

the output must be:


Seq(Seq(50, 30, 10), Seq(100, 10), Seq(200, 20, 2))

Hint: use map, sortBy, take.

Exercise 2.1.7.11 (a) Given two sets, p: Set[Int] and q: Set[Int], compute a set
of type Set[(Int, Int)] as the Cartesian product of the sets p and q. This is the set
of all pairs (x, y) where x is an element from p and y is an element from q.
(b) Implement this computation as a function with type parameters I, J instead
of Int. The required type signature and a sample test:
def cartesian[I, J](p: Set[I], q: Set[J]): Set[(I, J)] = ???

scala> cartesian(Set("a", "b"), Set(10, 20))


res0: Set[(String, Int)] = Set((a,10), (a,20), (b,10), (b,20))

Hint: use flatMap and map on sets.

Exercise 2.1.7.12 Given a Seq[Map[Person, Amount]], showing the amounts various
people paid on each day, compute a Map[Person, Seq[Amount]], showing the
sequence of payments for each person. Assume that Person and Amount are type
parameters. The required type signature and a sample test:
def payments[Person, Amount](data: Seq[Map[Person, Amount]]): Map[Person,
Seq[Amount]] = ???
// On day 1, Tarski paid 10 and Gödel paid 20. On day 2, Gentzen paid 50, etc.
scala> payments(Seq(Map("Tarski" -> 10, "Gödel" -> 20), Map("Gentzen" -> 50),
Map("Tarski" -> 50, "Church" -> 100), Map("Banach" -> 15, "Gentzen" -> 35)))
res0: Map[String, Seq[Int]] = Map(Gentzen -> List(50, 35), Church -> List(100),
Banach -> List(15), Tarski -> List(10, 50), Gödel -> List(20))

Hint: use flatMap, groupBy, map on dictionaries.


2.2 Converting a sequence into a single value


Until this point, we have been working with sequences using methods such as map
and zip. These techniques are powerful but still insufficient for certain tasks.
A simple computation that is impossible to do using map is obtaining the sum of
a sequence of numbers. The standard library method sum already does this; but we
cannot re-implement sum ourselves by using map, zip, or filter. These operations
always compute new sequences, while we need to compute a single value (the sum
of all elements) from a sequence.
We have seen a few library methods such as count, length, and max that compute
a single value from a sequence; but we still cannot implement sum using these
methods. What we need is a more general way of converting a sequence to a
single value, such that we could ourselves implement sum, count, max, and other
similar computations.
Another task not easily solved with map, sum, etc., is to compute a floating-point
number from a given sequence of decimal digits (including a “dot” character):
def digitsToDouble(ds: Seq[Char]): Double = ???

scala> digitsToDouble(Seq('2', '0', '4', '.', '5'))


res0: Double = 204.5

Note that the same task for integer numbers (instead of floating-point numbers)
can be implemented via length, map, sum, and zip:
def digitsToInt(ds: Seq[Int]): Int = {
  val n = ds.length
  // Compute a sequence of powers of 10, e.g., [1000, 100, 10, 1].
  val powers: Seq[Int] = (0 to n - 1).map(k => math.pow(10, n - 1 - k).toInt)
  // Sum the powers of 10 with coefficients from `ds`.
  (ds zip powers).map { case (d, p) => d * p }.sum
}

scala> digitsToInt(Seq(2, 4, 0, 5))


res0: Int = 2405

For this task, the required computation can be written as the formula:

$$ r = \sum_{k=0}^{n-1} d_k * 10^{n-1-k} . $$

The sequence of powers of 10 can be computed separately and “zipped” with the
sequence of digits 𝑑 𝑘 . However, for floating-point numbers, the sequence of pow-
ers of 10 depends on the position of the “dot” character. Methods such as map or zip
cannot compute a sequence whose next elements depend on previous elements
and the dependence is described by some custom function.

2.2.1 Inductive definitions of aggregation functions


Mathematical induction is a general way of expressing the dependence of next
values on previously computed values. To define a function from a sequence to
a single value (e.g., an aggregation function f: Seq[Int] => Int) via mathematical
induction, we need to specify two computations:

• The base case of the induction: We need to specify what value the function
f returns for an empty sequence, Seq(). The standard method isEmpty can be
used to detect empty sequences. In case the function f is defined only for
non-empty sequences, we need to specify what the function f returns for a
one-element sequence such as Seq(x), with any x.

• The inductive step: Assuming that the function f is already computed for
some sequence xs (the inductive assumption), how to compute the function
f for a sequence with one more element x? The sequence with one more
element is written as xs :+ x. So, we need to specify how to compute f(xs
:+ x) assuming that f(xs) is already known.

Once these two computations are specified, the function f is defined (and can in
principle be computed) for an arbitrary input sequence.
With this approach, the inductive definition of the method sum looks like this:
The base case is that the sum of an empty sequence is 0. That is, Seq().sum ==
0. The inductive step says that when the result xs.sum is already known for a
sequence xs, and we have a sequence that has one more element x, then the new
result is equal to xs.sum + x. In code, this is (xs :+ x).sum == xs.sum + x.
The inductive definition of the function digitsToInt goes like this: The base case
is an empty sequence of digits, Seq(), and the result is 0. This is a convenient
base case even if we never need to apply digitsToInt to an empty sequence. The
inductive step: If digitsToInt(xs) is already known for a sequence xs of digits, and
we have a sequence xs :+ x with one more digit x, then:
digitsToInt(xs :+ x) == digitsToInt(xs) * 10 + x
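
The inductive laws for sum and digitsToInt can be spot-checked on sample data. The sketch below reuses the zip-based digitsToInt shown earlier in this section; the sample values `xs` and `x` and the names `sumBase`, `sumStep`, `digitsBase`, and `digitsStep` are chosen here for illustration. Checking a few examples does not prove the laws; it only guards against simple mistakes.

```scala
// The zip-based implementation from earlier in this section.
def digitsToInt(ds: Seq[Int]): Int = {
  val n = ds.length
  val powers: Seq[Int] = (0 to n - 1).map(k => math.pow(10, n - 1 - k).toInt)
  (ds zip powers).map { case (d, p) => d * p }.sum
}

val xs: Seq[Int] = Seq(2, 4, 0) // A sample sequence.
val x: Int = 5                  // One more element appended at the end.

val sumBase: Boolean = Seq[Int]().sum == 0                // Base case for sum.
val sumStep: Boolean = (xs :+ x).sum == xs.sum + x        // Inductive step for sum.
val digitsBase: Boolean = digitsToInt(Seq[Int]()) == 0    // Base case for digitsToInt.
val digitsStep: Boolean = digitsToInt(xs :+ x) == digitsToInt(xs) * 10 + x
```

The empty sequence is written as `Seq[Int]()` because `sum` needs to know the element type.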

Let us write inductive definitions for the methods length, max, and count.
The method length Base case: The length of an empty sequence is zero, so we
write: Seq().length == 0.
Inductive step: if xs.length is known then (x +: xs).length == xs.length + 1.
The method max The maximum element of a sequence is undefined for empty
sequences.
Base case: for a one-element sequence, Seq(x).max == x.
Inductive step: if xs.max is known then (x +: xs).max == math.max(x, xs.max).
The method count computes the number of a sequence’s elements satisfying a
predicate p.
Base case: for an empty sequence, Seq().count(p) == 0.
Inductive step: if xs.count(p) is known then (x +: xs).count(p) == xs.count(p)
+ c, where we define c = 1 when p(x) == true and c = 0 otherwise.
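
These three inductive definitions can be compared against the library methods on a small example. The following is a sketch with sample values; the sequence `xs`, the element `x`, and the predicate `p` (which tests for odd numbers) are assumptions made for illustration.

```scala
val xs: Seq[Int] = Seq(3, 1, 4)
val x: Int = 5
val p: Int => Boolean = n => n % 2 == 1 // Sample predicate: "n is odd".

// length
val lengthBase: Boolean = Seq[Int]().length == 0
val lengthStep: Boolean = (x +: xs).length == xs.length + 1

// max (defined only for non-empty sequences)
val maxBase: Boolean = Seq(x).max == x
val maxStep: Boolean = (x +: xs).max == math.max(x, xs.max)

// count
val c: Int = if (p(x)) 1 else 0
val countBase: Boolean = Seq[Int]().count(p) == 0
val countStep: Boolean = (x +: xs).count(p) == xs.count(p) + c
```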
When a function is defined by induction, proving a property of that function
will usually involve a “proof by induction”. As an example, let us prove that (xs
++ ys).length == xs.length + ys.length. We use induction on the length of the
sequence xs. In the base case, we need to prove that the property holds for an empty
sequence xs (and an arbitrary sequence ys). To verify the base case, we write:
(Seq() ++ ys).length == ys.length. In the inductive step of the proof, we assume
that the property already holds for some xs and ys and prove that the property
will then hold for x +: xs instead of xs. To verify that, we use the associativity law
of the concatenation operation (to be proved in Statement 8.5.2.1), which allows
us to write: (x +: xs) ++ ys == x +: (xs ++ ys). Then:
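((x +: xs) ++ ys).length           // Expect to equal (x +: xs).length + ys.length
  == (x +: (xs ++ ys)).length      // Associativity of concatenation.
  == 1 + (xs ++ ys).length         // Inductive definition of length.
  == 1 + (xs.length + ys.length)   // Inductive assumption.
  == (x +: xs).length + ys.length  // Since (x +: xs).length == xs.length + 1.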

In this way, we show that the property holds for x +: xs and ys assuming it holds
for xs and ys.
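
The property just proved, and the associativity law used in the proof, can also be checked on a few sample sequences. This is a sketch only; the test data in `samples` and the names `lengthLawHolds` and `assocHolds` are chosen here for illustration, and a finite check does not replace the proof.

```scala
// A few pairs (xs, ys), including empty sequences.
val samples: Seq[(Seq[Int], Seq[Int])] = Seq(
  (Seq.empty[Int], Seq.empty[Int]),
  (Seq.empty[Int], Seq(1, 2)),
  (Seq(1), Seq(2, 3)),
  (Seq(1, 2, 3), Seq(4))
)

// (xs ++ ys).length == xs.length + ys.length
val lengthLawHolds: Boolean =
  samples.forall { case (xs, ys) => (xs ++ ys).length == xs.length + ys.length }

// (x +: xs) ++ ys == x +: (xs ++ ys), checked with a sample element x = 0.
val x: Int = 0
val assocHolds: Boolean =
  samples.forall { case (xs, ys) => ((x +: xs) ++ ys) == (x +: (xs ++ ys)) }
```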
There are two main ways of translating mathematical induction into code. The
first way is to write a recursive function. The second way is to use a standard li-
brary function, such as foldLeft or reduce. Often it is better to use library functions,
but sometimes the code is more transparent when recursion is explicit. So, let us
consider each of these ways in turn.

2.2.2 Implementing functions by recursion


A recursive function is any function that calls itself somewhere within its own
body. The call to itself is the recursive call. Recursion may be used to implement
functions defined by induction.
When the body of a recursive function is evaluated, it may repeatedly call it-
self with different arguments until a result value can be computed without any
recursive calls. The repeated recursive calls correspond to inductive steps, and
the last call corresponds to the base case of the inductive definition. It is an error
(an infinite loop) if the base case is never reached, as in this example:
scala> def infiniteLoop(x: Int): Int = infiniteLoop(x + 1)
infiniteLoop: (x: Int)Int

scala> infiniteLoop(0) // You will need to press Ctrl-C to stop this.

We translate mathematical induction into code by first writing a condition to
decide whether we have the base case or the inductive step. As an example, let us
define sum by recursion. The base case returns 0, while the inductive step returns
a value computed from the recursive call. Look at this code:
def sum(s: Seq[Int]): Int = if (s.isEmpty) 0 else {
  val x = s.head  // To split s = x +: xs, compute x
  val xs = s.tail // and xs.
  sum(xs) + x     // Call sum(...) recursively.
}

The if/else expression separates the base case from the inductive step. In the
inductive step, it is convenient to split the given sequence s into its first element
x, or the “head” of s, and the remainder (“tail”) sequence xs. So, we split s as s = x
+: xs rather than as s = xs :+ x.¹
For computing the sum of a numerical sequence, the order of summation does
not matter. But the order of operations will matter for many other computational
tasks. We will need to choose whether the inductive step should split the sequence
as s = x +: xs or as s = xs :+ x, depending on the task at hand.
Let us implement digitsToInt according to the inductive definition shown in
Section 2.2.1:
def digitsToInt(s: Seq[Int]): Int = if (s.isEmpty) 0 else {
  val x = s.last            // To split s = xs :+ x, compute x
  val xs = s.init           // and xs.
  digitsToInt(xs) * 10 + x  // Call digitsToInt(...) recursively.
}

In this example, it is important to split the sequence s into xs :+ x and not into x
+: xs. The reason is that digits increase their numerical value from right to left, so
the correct result is computed as digitsToInt(xs) * 10 + x if we split s into xs :+ x.
For that splitting, we use the standard library methods init and last.
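
To see that the choice of splitting matters here, we can compare the code above with a variant that applies the same formula to the split s = x +: xs. The sketch below uses a hypothetical function name, `digitsReversed`, for that variant; it is shown only to illustrate the wrong splitting.

```scala
// Splits s = xs :+ x, as in the text.
def digitsToInt(s: Seq[Int]): Int =
  if (s.isEmpty) 0 else digitsToInt(s.init) * 10 + s.last

// Hypothetical variant: the same formula, but with the split s = x +: xs.
def digitsReversed(s: Seq[Int]): Int =
  if (s.isEmpty) 0 else digitsReversed(s.tail) * 10 + s.head

val correct: Int = digitsToInt(Seq(1, 2, 3))    // The digits are read left to right.
val wrong: Int = digitsReversed(Seq(1, 2, 3))   // The digits come out in reverse order.
```

With the wrong splitting, the first digit is added last and so ends up in the lowest decimal position.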
These examples show how mathematical induction is converted into recursive
code. This approach often works but has two technical problems. The first prob-
lem is that the code will fail due to a stack overflow when the input sequence s is
long enough. In the next subsection, we will see how this problem is solved (at
least in some cases) using tail recursion.
The second problem is that all inductively defined functions will use the same
code for checking the base case and for splitting the sequence s into the subse-
quence xs and the extra element x. This repeated common code can be put into
a library function, and the Scala library provides such functions. We will look at
using them in Section 2.2.4.

¹ It is easier to remember the meaning of x +: xs and xs :+ x if we note that the colon (:) always
points to the collection (xs) and the plus sign (+) to a single element (x) that is being added.

2.2.3 Tail recursion

Recursive code of the kind shown above will fail for large enough sequences. To see why, consider a
recursive implementation of the length method as a function lengthS:

def lengthS(s: Seq[Int]): Int = if (s.isEmpty) 0 else 1 + lengthS(s.tail)

scala> lengthS((1 to 1000).toList)


res0: Int = 1000

scala> val s = (1 to 100000).toList


s: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, ...

scala> lengthS(s)
java.lang.StackOverflowError
at .lengthS(<console>:12)
at .lengthS(<console>:12)
at .lengthS(<console>:12)
...

The problem is not due to insufficient main memory: we are able to compute
and hold in memory the entire sequence s. The problem is with the code of the
function lengthS. This function calls itself inside the expression 1 + lengthS(...).
Let us visualize how the computer evaluates that code:
lengthS(Seq(1, 2, ..., 100000))
= 1 + lengthS(Seq(2, ..., 100000))
= 1 + (1 + lengthS(Seq(3, ..., 100000)))
= ...

The code of lengthS will repeat the inductive step, that is, the “else” part of the
“if/else”, about 100000 times. Each time, the intermediate sub-expression with
nested computations 1 + (1 + (...)) will get larger. That sub-expression needs
to be held somewhere in memory until the function body goes into the base case,
with no more recursive calls. When that happens, the intermediate sub-expression
will contain about 100000 nested function calls still waiting to be evaluated. A
special area of memory called stack memory is dedicated to storing the arguments
for all not-yet-evaluated nested function calls. Due to the way computer memory
is managed, the stack memory has a fixed size and cannot grow automatically. So,
when the intermediate expression becomes large enough, it causes an overflow of
the stack memory and crashes the program.
One way to avoid stack overflows is to use a trick called tail recursion. Using
tail recursion means rewriting the code so that all recursive calls occur at the end
positions (at the “tails”) of the function body. In other words, each recursive call
must itself be the last computation in the function body, rather than placed inside
other computations. Here is an example of tail-recursive code:
def lengthT(s: Seq[Int], res: Int): Int =
if (s.isEmpty) res
else lengthT(s.tail, res + 1)

In this code, one of the branches of the if/else returns a fixed value without doing
any recursive calls, while the other branch returns the result of a recursive call to
lengthT(...).
It is not a problem that the recursive call to lengthT has some sub-expressions
such as res + 1 as its arguments, because all these sub-expressions will be com-
puted before lengthT is recursively called. The recursive call to lengthT is the last
computation performed by this branch of the if/else. A tail-recursive function
can have many if/else or match/case branches, with or without recursive calls; but
all recursive calls must always be the last expressions returned.
The Scala compiler automatically compiles tail-recursive calls into a loop that
needs no additional stack memory, whenever this is possible. Additionally,
Scala has a feature for verifying that a function’s code is tail-recursive: the tailrec
annotation. If a function with a tailrec annotation is not tail-recursive (or is not
recursive at all), the program will not compile.
The code of lengthT with a tailrec annotation looks like this:
import scala.annotation.tailrec

@tailrec def lengthT(s: Seq[Int], res: Int): Int =
  if (s.isEmpty) res
  else lengthT(s.tail, res + 1)

(The import declaration is needed whenever the code uses the tailrec annotation.)
Let us trace the evaluation of this function on an example:
lengthT(Seq(1, 2, 3), 0)
= lengthT(Seq(2, 3), 0 + 1) // = lengthT(Seq(2, 3), 1)
= lengthT(Seq(3), 1 + 1) // = lengthT(Seq(3), 2)
= lengthT(Seq(), 2 + 1) // = lengthT(Seq(), 3)
= 3

All sub-expressions such as 1 + 1 and 2 + 1 are computed before recursive calls to
lengthT. Because of that, sub-expressions do not grow within the stack memory.
This is the main benefit of tail recursion.
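
To confirm the benefit, we can apply lengthT to the same 100000-element list that caused a stack overflow in lengthS. The sketch below repeats the definition of lengthT so that it is self-contained; the names `big` and `n` are chosen here for illustration.

```scala
import scala.annotation.tailrec

@tailrec def lengthT(s: Seq[Int], res: Int): Int =
  if (s.isEmpty) res
  else lengthT(s.tail, res + 1)

val big: List[Int] = (1 to 100000).toList
val n: Int = lengthT(big, 0) // Completes without a stack overflow.
```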
How did we rewrite the code of lengthS into the tail-recursive code of lengthT?
An important difference between lengthS and lengthT is the additional argument
(res), called the accumulator argument. This argument is equal to an intermediate
result of the computation. The next intermediate result (res + 1) is computed and
passed on to the next recursive call via the accumulator argument. In the base
case of the recursion, the function now returns the accumulated result (res) rather
than 0, because at that time the computation is finished.
Rewriting code by adding an accumulator argument to achieve tail recursion is
called the accumulator technique or the “accumulator trick”.
One consequence of using the accumulator trick is that the function lengthT now
always needs a value for the accumulator argument. However, our goal is to
implement a function such as length(s) with just one argument, s: Seq[Int]. We
can define length(s) = lengthT(s, ???) if we supply an initial accumulator value.
The correct initial value for the accumulator is 0, since in the base case (an empty
sequence s) we need to return 0.


It appears useful to define the helper function (lengthT) separately. Then length
will just call lengthT and specify the initial value of the accumulator argument. To
emphasize that lengthT is a helper function that is only used by length to achieve
tail recursion, we define lengthT as a nested function inside the code of length:
def length[A](xs: Seq[A]): Int = {
  @tailrec def lengthT(s: Seq[A], res: Int): Int = {
    if (s.isEmpty) res
    else lengthT(s.tail, res + 1)
  }
  lengthT(xs, 0)
}

When length is implemented like that, users will not be able to call lengthT directly,
because lengthT is only visible within the body of the length function.
Another possibility in Scala is to use a default value for the res argument:
@tailrec def length[A](s: Seq[A], res: Int = 0): Int =
if (s.isEmpty) res
else length(s.tail, res + 1)

Giving a default value for a function argument is the same as defining two func-
tions: one with that argument and one without. For example, the syntax:
def f(x: Int, y: Boolean = false): Int = ... // Function body.

is equivalent to defining two functions with the same name but different numbers
of arguments:
def f(x: Int, y: Boolean): Int = ... // Define the function body here.
def f(x: Int): Int = f(x, false)     // Call the function defined above.

Using a default argument, we can define the tail-recursive helper function and the
main function at once, making the code shorter.
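
A function with a default argument can be called with or without that argument. Here is a small sketch; the function `scale` and its parameter `twice` are hypothetical names chosen only for this illustration.

```scala
// Returns 2 * x when `twice` is true, and x otherwise. By default, `twice` is false.
def scale(x: Int, twice: Boolean = false): Int = if (twice) 2 * x else x

val a: Int = scale(3)               // Uses the default value, twice = false.
val b: Int = scale(3, true)         // Sets the second argument explicitly.
val c: Int = scale(3, twice = true) // The same, with the argument passed by name.
```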
The accumulator trick works in a large number of cases, but it may not be
obvious how to introduce the accumulator argument, what its initial value must be,
and how to define the inductive step for the accumulator. In the example with the
lengthT function, the accumulator trick works because of the special mathematical
property of the expression being computed:

1 + (1 + (1 + (... + 0))) = (((0 + 1) + 1) + ...) + 1 .

This equation follows from the associativity law of addition. So, the computation
can be rearranged to group all additions to the left. During the evaluation, the
accumulator’s value corresponds to a certain number of left-grouped parentheses,
((0 + 1) ...) + 1. In code, it means that intermediate expressions are fully computed
before making recursive calls. So, recursive calls always occur outside all other
sub-expressions — that is, in tail positions. There are no sub-expressions that
need to be stored on the stack until all the recursive calls are complete.
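
The same reasoning applies to sum, because x1 + (x2 + (... + 0)) can be regrouped to the left as ((0 + x1) + x2) + ... by associativity of addition. The following is a sketch of a tail-recursive sum written with the accumulator trick; the helper name `sumT` and the test value `total` are chosen here for illustration.

```scala
import scala.annotation.tailrec

def sum(xs: Seq[Int]): Int = {
  // `res` is the accumulator: the sum of the elements processed so far.
  @tailrec def sumT(s: Seq[Int], res: Int): Int =
    if (s.isEmpty) res
    else sumT(s.tail, res + s.head)
  sumT(xs, 0) // The initial accumulator value is the sum of an empty sequence.
}

val total: Int = sum(List.fill(100000)(1)) // A long list; no stack overflow occurs.
```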
However, not all computations can be rearranged in that way. Even if a code
rearrangement exists, it may not be immediately obvious how to find it.
An example is a tail-recursive version of the function digitsToInt from the pre-
vious subsection, where the sub-expression digitsToInt(xs) * 10 + x was a non-
tail-recursive call. To transform the code into a tail-recursive form, we need to
rearrange the computation:

$$ r = d_{n-1} + 10 * (d_{n-2} + 10 * (d_{n-3} + 10 * (... + 10 * d_0))) , $$

so that the multiplications group to the left. We can do this by rewriting $r$ as:

$$ r = ((d_0 * 10 + d_1) * 10 + ...) * 10 + d_{n-1} . $$

It follows that the digit sequence s must be split into the leftmost digit and the rest,
s == s.head +: s.tail. So, a tail-recursive implementation of the above formula is:
@tailrec def fromDigits(s: Seq[Int], res: Int = 0): Int =
// `res` is the accumulator.
if (s.isEmpty) res
else fromDigits(s.tail, 10 * res + s.head)

scala> fromDigits(Seq(1, 2, 3, 4))


res0: Int = 1234

Despite a similarity between this code and the code of digitsToInt from the previ-
ous subsection, the implementation of fromDigits cannot be directly derived from
the inductive definition of digitsToInt. We need a separate proof that fromDigits(s,
0) computes the same result as digitsToInt(s). This can be proved by using the
following property:
Statement 2.2.3.1 For any s: Seq[Int] and r: Int, the following equation holds:
fromDigits(s, r) == digitsToInt(s) + r * math.pow(10, s.length)

Proof We use induction on the length of s. To shorten the proof, denote se-
quences by [1, 2, 3] instead of Seq(1, 2, 3) and temporarily write 𝑑 (𝑠) instead of
digitsToInt(s) and 𝑓 (𝑠, 𝑟) instead of fromDigitsT(s, r). Then an inductive defini-
tion of 𝑓 (𝑠, 𝑟) is:

𝑓 ([], 𝑟) = 𝑟 , 𝑓 ([𝑥]++𝑠, 𝑟) = 𝑓 (𝑠, 10 ∗ 𝑟 + 𝑥) . (2.1)

Denoting the length of a sequence 𝑠 by |𝑠|, we reformulate Statement 2.2.3.1 as:

𝑓 (𝑠, 𝑟) = 𝑑 (𝑠) + 𝑟 ∗ 10|𝑠| . (2.2)

We prove Eq. (2.2) by induction. For the base case 𝑠 = [], we have 𝑓 ( [] , 𝑟) = 𝑟
and 𝑑 ([]) + 𝑟 ∗ 100 = 𝑟 since 𝑑 ( []) = 0 and |𝑠| = 0. The resulting equality 𝑟 = 𝑟
proves the base case.
55
2 Mathematical formulas as code. II. Mathematical induction

To prove the inductive step, we assume that Eq. (2.2) holds for a given sequence
$s$ (and any $r$). Then write the inductive step. We use the symbol $\overset{?}{=}$ to
denote equations we still need to prove:

$f([x]++s, r) \overset{?}{=} d([x]++s) + r * 10^{|s|+1}$ .   (2.3)

We will transform the left-hand side and the right-hand side separately, hoping to
obtain the same expression. The left-hand side of Eq. (2.3) is:

$f([x]++s, r)$
  use Eq. (2.1) :  $= f(s, 10 * r + x)$
  use Eq. (2.2) :  $= d(s) + (10 * r + x) * 10^{|s|}$ .
The right-hand side of Eq. (2.3) contains $d([x]++s)$, which we now need to rewrite.
Assuming that $d(s)$ correctly calculates a number from its digits, we use a property
of decimal notation: a digit $x$ in front of $n$ other digits has the value $x * 10^n$.
This property can be formulated as an equation:

$d([x]++s) = x * 10^{|s|} + d(s)$ .   (2.4)

So, the right-hand side of Eq. (2.3) can be rewritten as:

$d([x]++s) + r * 10^{|s|+1}$
  use Eq. (2.4) :  $= x * 10^{|s|} + d(s) + r * 10^{|s|+1}$
  factor out $10^{|s|}$ :  $= d(s) + (10 * r + x) * 10^{|s|}$ .
We have successfully transformed both sides of Eq. (2.3) to the same expression.
We have not yet proved that the function $d$ satisfies the property in Eq. (2.4).
That proof also uses induction. Begin by writing the code of $d$ in a short notation:

$d([]) = 0$ ,   $d(s++[y]) = d(s) * 10 + y$ .   (2.5)

The base case is Eq. (2.4) with $s = []$. Both sides are equal to $x$: by Eq. (2.5), the
left-hand side is $d([x]++[]) = d([]++[x]) = d([]) * 10 + x = x$, and the right-hand
side is $x * 10^0 + d([]) = x$.
The inductive step assumes Eq. (2.4) for a given $x$ and a given sequence $s$, and
needs to prove that for any $y$, the same property holds with $s++[y]$ instead of $s$:

$d([x]++s++[y]) \overset{?}{=} x * 10^{|s|+1} + d(s++[y])$ .   (2.6)

The left-hand side of Eq. (2.6) is transformed into its right-hand side like this:

$d([x]++s++[y])$
  use Eq. (2.5) :  $= d([x]++s) * 10 + y$
  use Eq. (2.4) :  $= (x * 10^{|s|} + d(s)) * 10 + y$
  expand parentheses :  $= x * 10^{|s|+1} + d(s) * 10 + y$
  use Eq. (2.5) :  $= x * 10^{|s|+1} + d(s++[y])$ .
This establishes Eq. (2.6) and concludes the proof.
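As an informal sanity check of Statement 2.2.3.1 (not a replacement for the proof), the equation can be tested on a few inputs. The sketch below assumes that digitsToInt has the non-tail-recursive form suggested by the sub-expression digitsToInt(xs) * 10 + x quoted above; the helper pow10 and the sample values are illustrative assumptions.

```scala
import scala.annotation.tailrec

// Assumed shape of `digitsToInt` from the previous subsection (not tail-recursive).
def digitsToInt(s: Seq[Int]): Int =
  if (s.isEmpty) 0 else digitsToInt(s.init) * 10 + s.last

@tailrec def fromDigits(s: Seq[Int], res: Int = 0): Int =
  if (s.isEmpty) res else fromDigits(s.tail, 10 * res + s.head)

// Hypothetical helper: an integer power of 10 (math.pow would return a Double).
def pow10(n: Int): Int = Seq.fill(n)(10).product

val samples: Seq[Seq[Int]] = Seq(Seq(), Seq(7), Seq(1, 2, 3), Seq(0, 4, 2))

// Check fromDigits(s, r) == digitsToInt(s) + r * 10^|s| for each sample and a few values of r.
val statementHolds: Boolean = samples.forall { s =>
  (0 to 5).forall { r => fromDigits(s, r) == digitsToInt(s) + r * pow10(s.length) }
}
```

Using an integer power instead of math.pow keeps both sides of the comparison in the type Int, so the check does not depend on floating-point rounding.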

2.2.4 Implementing general aggregation (foldLeft)


As a rule, an aggregation computes a single value from a sequence of values. In
general, the type of the result may be different from the type of sequence elements.
To describe that general situation, we introduce type parameters A and B, so that
the input sequence is of type Seq[A] and the aggregated value is of type B. Then an
inductive definition of any aggregation function f: Seq[A] => B looks like this:

• (Base case.) For an empty sequence, we have f(Seq()) = b0, where b0: B is a
given value.

• (Inductive step.) Assuming that b = f(xs) is already computed, we define
f(xs :+ x) = g(x, b), where g is a given function, g: (A, B) => B.

The code for f is written using recursion and the methods init and last:
def f[A, B](s: Seq[A]): B =
  if (s.isEmpty) b0
  else g(s.last, f(s.init))

We can now refactor this code into a generic utility function, by turning b0 and g
into parameters. A possible implementation is:
def f[A, B](s: Seq[A], b: B, g: (A, B) => B): B =
  if (s.isEmpty) b
  else g(s.last, f(s.init, b, g))

However, this implementation is not tail-recursive. Applying f to a sequence of,
say, three elements, Seq(x, y, z), will create an intermediate expression g(z, g(y,
g(x, b))). This expression will grow with the length of s, so the recursion depth
grows as well, and long sequences may cause a stack overflow.
To rearrange the computation into a tail-recursive form, we need to start the base
case at the innermost call g(x, b), then compute g(y, g(x, b)) and continue. In
other words, we need to traverse the sequence starting from its leftmost element
x, rather than starting from the right. So, instead of splitting the sequence s into
s.init :+ s.last as we did in the code of f, we need to split s into s.head +: s.tail.
Let us also exchange the order of the arguments of g, in order to be more consistent
with the way this code is implemented in the Scala library. The resulting code is
tail-recursive:
@tailrec def leftFold[A, B](s: Seq[A], b: B, g: (B, A) => B): B =
  if (s.isEmpty) b
  else leftFold(s.tail, g(b, s.head), g)

We call this function a “left fold” because it aggregates (or “folds”) the sequence
starting from the leftmost element.
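To see the effect of tail recursion, leftFold can be applied to a long list. The following is a minimal sketch: the list length and the choice of a Long accumulator are illustrative assumptions, and the parameter types of the updater are written out explicitly so that the example does not rely on type inference.

```scala
import scala.annotation.tailrec

@tailrec def leftFold[A, B](s: Seq[A], b: B, g: (B, A) => B): B =
  if (s.isEmpty) b
  else leftFold(s.tail, g(b, s.head), g)

// A long input: the recursion depth does not grow because the call is in tail position.
val longList: List[Int] = (1 to 100000).toList

// The accumulator has type Long while the elements have type Int.
val total: Long = leftFold[Int, Long](longList, 0L, (b: Long, x: Int) => b + x)
```

The sum of the integers from 1 to 100000 does not fit into an Int, which is why the accumulator type here differs from the element type.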
In this way, we have defined a general method of computing any inductively
defined aggregation function on a sequence. The function leftFold implements
the logic of aggregation defined via mathematical induction. Using leftFold, we
can write concise implementations of methods such as sum, max, and many other
aggregation functions. The method leftFold already contains all the code neces-
sary to set up the base case and the inductive step. The programmer just needs to
specify the expressions for the initial value b and for the updater function g.
As a first example, let us use leftFold for implementing the sum method:
def sum(s: Seq[Int]): Int = leftFold(s, 0, (x, y) => x + y )

To understand in detail how leftFold works, let us trace the evaluation of this
function when applied to Seq(1, 2, 3):
sum(Seq(1, 2, 3)) == leftFold(Seq(1, 2, 3), 0, g)
    // Here, g = (x, y) => x + y, so g(x, y) = x + y.
  == leftFold(Seq(2, 3), g(0, 1), g)  // g(0, 1) = 1.
  == leftFold(Seq(2, 3), 1, g)        // Now expand the code of `leftFold`.
  == leftFold(Seq(3), g(1, 2), g)     // g(1, 2) = 3; expand the code.
  == leftFold(Seq(), g(3, 3), g)      // g(3, 3) = 6; expand the code.
  == 6

The second argument of leftFold is the accumulator argument. The initial value of
the accumulator is specified when first calling leftFold. At each iteration, the new
accumulator value is computed by calling the updater function g, which uses the
previous accumulator value and the value of the next sequence element. To visu-
alize the process of recursive evaluation, it is convenient to write a table showing
the sequence elements and the accumulator values as they are updated:

Current element x    Old accumulator value    New accumulator value

        1                      0                        1
        2                      1                        3
        3                      3                        6

We implemented leftFold only as an illustration. Scala’s library has a method
called foldLeft implementing the same logic using a slightly different type signature.
To see this difference, compare the implementation of sum using our leftFold
function and using the standard foldLeft method:
def sum(s: Seq[Int]): Int = leftFold(s, 0, (x, y) => x + y )

def sum(s: Seq[Int]): Int = s.foldLeft(0) { (x, y) => x + y }

The syntax of foldLeft makes it more convenient to use a nameless function as
the updater argument of foldLeft, since curly braces separate that argument from
others. We will use the standard foldLeft method from now on.
In general, the type of the accumulator value can be different from the type of
the sequence elements. An example is an implementation of count:
def count[A](s: Seq[A], p: A => Boolean): Int =
  s.foldLeft(0) { (x, y) => x + (if (p(y)) 1 else 0) }


The accumulator has type Int, while the sequence elements can have an arbitrary
type, parameterized by A. The foldLeft method works in the same way for all
types of accumulators and all types of sequence elements.
Since foldLeft is tail-recursive, stack overflows will not occur even with long
sequences. The method foldLeft is available in the Scala library for all collections,
including dictionaries and sets.
It is important to gain experience using the foldLeft method. The Scala library
contains several other methods similar to foldLeft, such as foldRight, fold, and
reduce. In the following sections, we will mostly focus on foldLeft because the
other fold-like operations are similar.
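The difference between these fold-like methods is easiest to see with an operation that is not commutative, such as subtraction. The snippet below is a small illustrative sketch; the sample sequence is an arbitrary choice.

```scala
val xs = Seq(1, 2, 3)

// foldLeft groups to the left: ((0 - 1) - 2) - 3
val left: Int = xs.foldLeft(0) { (acc, x) => acc - x }

// foldRight groups to the right: 1 - (2 - (3 - 0)); note the argument order (element, accumulator).
val right: Int = xs.foldRight(0) { (x, acc) => x - acc }

// reduce needs no initial value but requires a non-empty sequence.
val largest: Int = xs.reduce { (a, b) => math.max(a, b) }
```

Here left evaluates to -6 and right to 2, which shows that the two folds traverse the sequence in opposite directions.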

2.2.5 Examples: Using foldLeft


Example 2.2.5.1 Use foldLeft for implementing the max function for integer se-
quences. Return the special value Int.MinValue for empty sequences.
Solution Begin by writing an inductive formulation of the max function for
sequences. Base case: For an empty sequence, return Int.MinValue. Inductive step:
If max is already computed on a sequence xs, say max(xs) = b, the value of max on a
sequence xs :+ x is the maximum of b and x. So, the code is:
def max(s: Seq[Int]): Int =
  s.foldLeft(Int.MinValue) { (b, x) => if (b > x) b else x }

If we are sure that the function will never be called on empty sequences, we can
implement max in a simpler way by using the reduce method:
def max(s: Seq[Int]): Int = s.reduce { (x, y) => if (y > x) y else x }
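If empty sequences cannot be ruled out, one possible alternative (a sketch, not the approach taken in this example) is the library method reduceOption, which returns None for an empty sequence instead of throwing an exception. The name maxOption is a hypothetical choice.

```scala
import scala.util.Try

// Returns None for an empty sequence, Some(maximum) otherwise.
def maxOption(s: Seq[Int]): Option[Int] =
  s.reduceOption { (x, y) => if (y > x) y else x }

// For comparison: plain `reduce` throws an exception on an empty sequence.
val reduceOnEmptyFails: Boolean =
  Try(Seq.empty[Int].reduce { (x, y) => if (y > x) y else x }).isFailure
```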

Example 2.2.5.2 For a given non-empty sequence xs: Seq[Double], compute the
minimum, the maximum, and the mean as a tuple $(x_{\min}, x_{\max}, x_{\text{mean}})$. The
sequence should be traversed only once; i.e., the entire code must be xs.foldLeft(...),
using foldLeft only once.
Solution Without the requirement of using a single traversal, we would write:
(xs.min, xs.max, xs.sum / xs.length)

However, this code traverses xs at least three times, since each of the aggregations
xs.min, xs.max, and xs.sum iterates over xs. We need to combine the four inductive
definitions of min, max, sum, and length into a single inductive definition of some
function. What is the type of that function’s return value? We need to accumulate
intermediate values of all four numbers (min, max, sum, and length) in a tuple. So,
the required type of the accumulator is (Double, Double, Double, Int). To avoid
repeating a long type expression, we can define a type alias for it, say, D4:
scala> type D4 = (Double, Double, Double, Int)
defined type alias D4


The updater updates each of the four numbers according to the definitions of their
inductive steps:
def update(p: D4, x: Double): D4 = p match { case (min, max, sum, length) =>
  (math.min(x, min), math.max(x, max), x + sum, length + 1)
}

Now we can write the code of the required function:
def f(xs: Seq[Double]): (Double, Double, Double) = {
  val init: D4 = (Double.PositiveInfinity, Double.NegativeInfinity, 0.0, 0)
  val (min, max, sum, length) = xs.foldLeft(init)(update)
  (min, max, sum / length)
}

scala> f(Seq(1.0, 1.5, 2.0, 2.5, 3.0))
res0: (Double, Double, Double) = (1.0,3.0,2.0)

Example 2.2.5.3 Implement the map method for sequences by using foldLeft. The
input sequence should be of type Seq[A] and the output sequence of type Seq[B],
where A and B are type parameters. The required type signature of the function
and a sample test:
def map[A, B](xs: Seq[A])(f: A => B): Seq[B] = ???

scala> map(List(1, 2, 3)) { x => x * 10 }
res0: Seq[Int] = List(10, 20, 30)

Solution The required code should build a new sequence by applying the
function f to each element. How can we build a new sequence using foldLeft?
The evaluation of foldLeft consists of iterating over the input sequence and accu-
mulating some result value, which is updated at each iteration. Since the result of
a foldLeft is always equal to the last computed accumulator value, it follows that
the new sequence should be that accumulator value. So, we need to update the
accumulator by appending the value f(x), where x is the current element of the
input sequence:
def map[A, B](xs: Seq[A])(f: A => B): Seq[B] =
  xs.foldLeft(Seq[B]()) { (acc, x) => acc :+ f(x) }
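Appending to a List via :+ takes time proportional to the length of the list, so the code above may become slow on long inputs when the accumulator is a List. A common workaround, sketched here under the assumption that a List accumulator is acceptable, is to prepend each new element and reverse the result once at the end. The name mapViaPrepend is hypothetical.

```scala
def mapViaPrepend[A, B](xs: Seq[A])(f: A => B): Seq[B] =
  // Prepending to a List is a constant-time operation; the result is built in reverse order.
  xs.foldLeft(List.empty[B]) { (acc, x) => f(x) :: acc }.reverse
```

Both versions produce the same output; only the cost of updating the accumulator differs.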

Example 2.2.5.4 Implement the function digitsToInt using foldLeft.
Solution The inductive definition of digitsToInt is directly translated into code:
def digitsToInt(d: Seq[Int]): Int =
  d.foldLeft(0){ (n, x) => n * 10 + x }

Example 2.2.5.5 Implement the function digitsToDouble using foldLeft. The argu-
ment is of type Seq[Char]. As a test, digitsToDouble(Seq('3','4','.','2','5')) must
evaluate to 34.25. Assume that all input characters are either digits or a dot (so,
negative numbers are not supported).

Solution The evaluation of a foldLeft on a sequence of digits will visit the
sequence from left to right. The updating function should work as in digitsToInt
until a dot character is found. After that, we need to change the updating function.
So, we need to remember whether a dot character has been seen. The only way
for foldLeft to “remember” any data is to hold that data in the accumulator value.
We can choose the type of the accumulator according to our needs. So, for this
task we can choose the accumulator to be a tuple that contains, for instance, the
floating-point result constructed so far and a Boolean flag showing whether we
have already seen the dot character.
Let us consider how the evaluation of digitsToDouble(Seq('3', '4', '.', '2',
'5')) should go. We can write a table showing the intermediate result at each
iteration. This will hopefully help us figure out what the accumulator and the
updater function 𝑔(...) must be:

Current digit $c$    Previous result $n$    New result $n' = g(n, c)$

     '3'                   0.0                    3.0
     '4'                   3.0                    34.0
     '.'                   34.0                   34.0
     '2'                   34.0                   34.2
     '5'                   34.2                   34.25

While the dot character was not yet seen, the updater function multiplies the
previous result by 10 and adds the current digit. After the dot character, the
updater function must add to the previous result the current digit divided by a factor
that represents increasing powers of 10. In other words, the update computation
$n' = g(n, c)$ must be defined by:

$$g(n, c) = \begin{cases} n * 10 + c & \text{if the digit is before the dot} \\ n + c / f & \text{if after the dot, where } f = 10, 100, 1000, ... \text{ for each new digit} \end{cases}$$

The updater function $g$ has only two arguments: the current digit and the previous
accumulator value. So, the changing factor $f$ must be part of the accumulator
value, and must be multiplied by 10 at each digit after the dot. If the factor $f$ is not
a part of the accumulator value, the function $g$ will not have enough information
for computing the next accumulator value correctly. So, the updater computation
must be $n' = g(n, c, f)$, not $n' = g(n, c)$.
For this reason, we choose the accumulator type as a tuple (Double, Boolean,
Double) where the first number is the result 𝑛 computed so far, the Boolean flag
indicates whether the dot was already seen, and the third number is 𝑓 , that is,
the power of 10 by which the current digit will be divided if the dot was already
seen. Initially, the accumulator tuple will be equal to (0.0, false, 10.0). Then the
updater function is implemented like this:

def update(acc: (Double, Boolean, Double), c: Char): (Double, Boolean, Double) =
  acc match { case (num, flag, factor) =>
    if (c == '.') (num, true, factor) // Set flag to `true` after seeing a dot.
    else {
      val digit = c.asDigit // Convert a character to decimal digit.
      if (flag) (num + digit / factor, flag, factor * 10) // After the dot.
      else (num * 10 + digit, flag, factor) // Before the dot.
    }
  }

Now we can implement digitsToDouble like this:
def digitsToDouble(d: Seq[Char]): Double = {
  val initAcc = (0.0, false, 10.0)
  val (num, _, _) = d.foldLeft(initAcc)(update)
  num
}

scala> digitsToDouble(Seq('3', '4', '.', '2', '5'))
res0: Double = 34.25

The result of calling d.foldLeft is a tuple (num, flag, factor), in which only the first
part, num, is needed. In Scala’s pattern matching syntax, the underscore (_) denotes
pattern variables whose values are not needed in the code. We could get the first
part using the accessor method ._1, but the code will be more readable if we show
all parts of the tuple (num, _, _).
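As a possible variation (a sketch, not the implementation used in this example), the three parts of the accumulator can be given names by using a case class instead of a tuple. The names Acc, seenDot, and digitsToDouble2 are hypothetical.

```scala
// Hypothetical named accumulator, replacing the tuple (Double, Boolean, Double).
final case class Acc(num: Double, seenDot: Boolean, factor: Double)

def digitsToDouble2(d: Seq[Char]): Double =
  d.foldLeft(Acc(0.0, seenDot = false, factor = 10.0)) { (acc, c) =>
    if (c == '.') acc.copy(seenDot = true) // Remember that the dot was seen.
    else if (acc.seenDot) // After the dot: add digit / factor and increase the factor.
      acc.copy(num = acc.num + c.asDigit / acc.factor, factor = acc.factor * 10)
    else acc.copy(num = acc.num * 10 + c.asDigit) // Before the dot.
  }.num
```

The logic is the same as in the tuple-based code; the named fields only make it easier to see which part of the accumulator is updated at each step.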
Example 2.2.5.6 Implement a function toPairs that converts a sequence of type
Seq[A] to a sequence of pairs, Seq[(A, A)], by putting together the adjacent ele-
ments pairwise. If the initial sequence has an odd number of elements, a given
default value of type A is used to fill the last pair. The required type signature and
an example test:
def toPairs[A](xs: Seq[A], default: A): Seq[(A, A)] = ???

scala> toPairs(Seq(1, 2, 3, 4, 5, 6), -1)
res0: Seq[(Int, Int)] = List((1,2), (3,4), (5,6))

scala> toPairs(Seq("a", "b", "c"), "<nothing>")
res1: Seq[(String, String)] = List((a,b), (c,<nothing>))

Solution We need to accumulate a sequence of pairs, and each pair needs two
values. However, we iterate over values in the input sequence one by one. So,
a new pair can be made only once every two iterations. The accumulator needs
to hold the information about the current iteration being even or odd. For
odd-numbered iterations, the accumulator also needs to store the previous element
that is still waiting for its pair. Therefore, we choose the type of the accumulator
to be a tuple (Seq[(A, A)], Seq[A]). The first sequence is the intermediate
result, and the second sequence is the “holdover”: it holds the previous element
for odd-numbered iterations and is empty for even-numbered iterations. Initially,
both sequences in the accumulator are empty. An example evaluation is:

Current element x    Previous accumulator         Next accumulator

"a"                  (Seq(), Seq())               (Seq(), Seq("a"))
"b"                  (Seq(), Seq("a"))            (Seq(("a","b")), Seq())
"c"                  (Seq(("a","b")), Seq())      (Seq(("a","b")), Seq("c"))

Now it becomes clear how to implement the updater function:
type Acc = (Seq[(A, A)], Seq[A]) // Type alias, for brevity.
def updater(acc: Acc, x: A): Acc = acc match {
  case (result, Seq()) => (result, Seq(x))
  case (result, Seq(prev)) => (result :+ ((prev, x)), Seq())
}

We will call foldLeft with this updater and then perform some post-processing to
make sure we create the last pair in case the last iteration is odd-numbered, i.e.,
when the “holdover” is not empty after foldLeft is finished. In this implementa-
tion, we use pattern matching to decide whether a sequence is empty:
def toPairs[A](xs: Seq[A], default: A): Seq[(A, A)] = {
  type Acc = (Seq[(A, A)], Seq[A]) // Type alias, for brevity.
  def init: Acc = (Seq(), Seq())
  def updater(acc: Acc, x: A): Acc = acc match {
    case (result, Seq()) => (result, Seq(x))
    case (result, Seq(prev)) => (result :+ ((prev, x)), Seq())
  }
  val (result, holdover) = xs.foldLeft(init)(updater)
  holdover match { // May need to append the last element to the result.
    case Seq() => result
    case Seq(x) => result :+ ((x, default))
  }
}

This code shows examples of partial functions that are applied safely. One of these
partial functions is used in this sub-expression:
holdover match {
  case Seq() => ...
  case Seq(a) => ...
}

This code works when holdover is empty or has length 1 but fails for longer se-
quences. In the implementation of toPairs, the value of holdover will always be a
sequence of length at most 1, so it is safe to use this partial function.
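For cross-checking the result, the same pairing can be sketched with the standard library method grouped, which splits a sequence into chunks of a given size. This is an alternative that does not use foldLeft; the name toPairsViaGrouped is hypothetical.

```scala
def toPairsViaGrouped[A](xs: Seq[A], default: A): Seq[(A, A)] =
  xs.grouped(2).map {
    case Seq(a, b) => (a, b)
    case Seq(a)    => (a, default) // Only the last group can have a single element.
  }.toList
```

The pattern match here is again a partial function applied safely, because grouped(2) only produces groups of length 1 or 2.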

2.2.6 Exercises: Using foldLeft


Exercise 2.2.6.1 Implement a function fromPairs that performs the inverse trans-
formation to the toPairs function defined in Example 2.2.5.6. The required type
signature and a sample test are:
def fromPairs[A](xs: Seq[(A, A)]): Seq[A] = ???

scala> fromPairs(Seq((1, 2), (3, 4)))
res0: Seq[Int] = List(1, 2, 3, 4)

Hint: This can be done with foldLeft or with flatMap.


Exercise 2.2.6.2 Implement the flatten method for sequences by using foldLeft.
The required type signature and a sample test are:
def flatten[A](xxs: Seq[Seq[A]]): Seq[A] = ???

scala> flatten(Seq(Seq(1, 2, 3), Seq(), Seq(4)))
res0: Seq[Int] = List(1, 2, 3, 4)

Exercise 2.2.6.3 Use foldLeft to implement the zipWithIndex method for sequences.
The required type signature and a sample test:
def zipWithIndex[A](xs: Seq[A]): Seq[(A, Int)] = ???

scala> zipWithIndex(Seq("a", "b", "c", "d"))
res0: Seq[(String, Int)] = List((a,0), (b,1), (c,2), (d,3))

Exercise 2.2.6.4 Use foldLeft to implement a function filterMap that combines
map and filter for sequences. The predicate is applied to the elements of the initial
sequence, and values that pass the predicate are mapped. The required type
signature and a sample test:
def filterMap[A, B](xs: Seq[A])(pred: A => Boolean)(f: A => B): Seq[B] = ???

scala> filterMap(Seq(1, 2, 3, 4)) { x => x > 2 } { x => x * 10 }
res0: Seq[Int] = List(30, 40)

Exercise 2.2.6.5 Split a sequence into subsequences (“batches”) of length at most
$n$. The required type signature and a sample test:
def byLength[A](xs: Seq[A], maxLength: Int): Seq[Seq[A]] = ???

scala> byLength(Seq("a", "b", "c", "d"), 2)
res0: Seq[Seq[String]] = List(List(a, b), List(c, d))

scala> byLength(Seq(1, 2, 3, 4, 5, 6, 7), 3)
res1: Seq[Seq[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7))

Exercise 2.2.6.6 Split a sequence into batches by “weight” computed via a given
function. The total weight of items in any batch should not be larger than a given
maximum weight. The required type signature and a sample test:

def byWeight[A](xs: Seq[A], maxW: Double)(w: A => Double): Seq[Seq[A]] = ???

scala> byWeight((1 to 10).toList, 5.75){ x => math.sqrt(x) }
res0: Seq[Seq[Int]] = List(List(1, 2, 3), List(4, 5), List(6, 7), List(8), List(9), List(10))

Exercise 2.2.6.7 Use foldLeft to implement a groupBy function. The type signature
and a test:
def groupBy[A, K](xs: Seq[A])(by: A => K): Map[K, Seq[A]] = ???

scala> groupBy(Seq(1, 2, 3, 4, 5)){ x => x % 2 }
res0: Map[Int, Seq[Int]] = Map(1 -> List(1, 3, 5), 0 -> List(2, 4))

Hints: The accumulator should be of type Map[K, Seq[A]]. Use the methods
updated and getOrElse to work with dictionaries. The method getOrElse fetches a
value from a dictionary by key but returns a default value if the key is not in the
dictionary:
scala> Map("a" -> 1, "b" -> 2).getOrElse("a", 300)
res0: Int = 1

scala> Map("a" -> 1, "b" -> 2).getOrElse("c", 300)
res1: Int = 300

The method updated produces a new dictionary that contains a new value for the
given key, whether or not that key already exists in the dictionary:
scala> Map("a" -> 1, "b" -> 2).updated("c", 300) // Key is new.
res0: Map[String,Int] = Map(a -> 1, b -> 2, c -> 300)

scala> Map("a" -> 1, "b" -> 2).updated("a", 400) // Key already exists.
res1: Map[String,Int] = Map(a -> 400, b -> 2)
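The methods getOrElse and updated are often used together inside a foldLeft whose accumulator is a dictionary. As an illustration on a different task (so as not to give away the exercise), the following sketch counts how many times each character occurs in a string; the function name countChars is hypothetical.

```scala
def countChars(s: String): Map[Char, Int] =
  s.foldLeft(Map.empty[Char, Int]) { (counts, c) =>
    // Fetch the current count (0 if the character is new) and store the incremented value.
    counts.updated(c, counts.getOrElse(c, 0) + 1)
  }
```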

2.3 Generating a sequence from a single value


An aggregation converts (“folds”) a sequence into a single value; the opposite op-
eration (“unfolding”) builds a new sequence from a single value and other needed
information. An example is computing the decimal digits of a given integer:
def digitsOf(x: Int): Seq[Int] = ???

scala> digitsOf(2405)
res0: Seq[Int] = List(2, 4, 0, 5)

We cannot implement digitsOf using map, zip, or foldLeft, because these methods
work only if we already have a sequence; but the function digitsOf needs to create
a new sequence. We could create a sequence via the expression (1 to n) if the
required length of the sequence were known in advance. However, the function
digitsOf must produce a sequence whose length is determined by a condition that
we cannot easily evaluate in advance.
A general “unfolding” operation needs to build a sequence whose length is not
determined in advance. This kind of sequence is called a stream. The elements of
a stream are computed only when necessary (unlike the elements of List or Array,
which are all computed in advance). The unfolding operation will compute next
elements on demand; this creates a stream. We can then apply takeWhile to the
stream, in order to stop it when a certain condition holds. Finally, if required, the
truncated stream may be converted to a list or another type of sequence. In this
way, we can generate a sequence of initially unknown length according to any
given requirements.
The Scala library has a general stream-producing function Stream.iterate.² This
function has two arguments, the initial value and a function that computes the
next value from the previous one:
scala> Stream.iterate(2) { x => x + 10 }
res0: Stream[Int] = Stream(2, ?)

The stream is ready to start computing the next elements of the sequence (so far,
only the first element, 2, has been computed). In order to see the next elements,
we need to stop the stream at a finite size and then convert the result to a list:
scala> Stream.iterate(2) { x => x + 10 }.take(6).toList
res1: List[Int] = List(2, 12, 22, 32, 42, 52)

If we try to evaluate toList on a stream without first limiting its size via take or
takeWhile, the program will keep producing more elements until it runs out of
memory and crashes.
Streams have methods such as map, filter, and flatMap similar to sequences. For
instance, the method drop skips a given number of initial elements:
scala> Seq(10, 20, 30, 40, 50).drop(3)
res2: Seq[Int] = List(40, 50)

scala> Stream.iterate(2) { x => x + 10 }.drop(3)
res3: Stream[Int] = Stream(32, ?)

This example shows that in order to evaluate drop(3), the stream had to compute
its elements up to 32 (but the subsequent elements are still not computed).
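In Scala 2.13 and later, the same computations can be written with LazyList, which provides the same iterate, take, and drop methods. The snippet below is a minimal sketch of the examples above rewritten for LazyList; the value names are illustrative.

```scala
// LazyList is available in Scala 2.13 and later.
val lazyNumbers: LazyList[Int] = LazyList.iterate(2)(x => x + 10)

// Limit the size before converting to a list.
val firstSix: List[Int] = lazyNumbers.take(6).toList

// Skip three elements; only the elements up to this one need to be computed.
val afterDrop: Int = lazyNumbers.drop(3).head
```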
To figure out the code for digitsOf, we first write this function as a mathematical
formula. To compute the digits of, say, $n = 2405$, we need to divide $n$ repeatedly
by 10, getting a sequence $n_k$ of intermediate numbers ($n_0 = 2405$, $n_1 = 240$, ...) and
the corresponding sequence of last digits, $n_k$ % 10 (in this example: 5, 0, ...). The
sequence $n_k$ is defined using mathematical induction:

• Base case: $n_0 = n$, where $n$ is a given initial integer.

• Inductive step: $n_{k+1} = \lfloor n_k / 10 \rfloor$ for $k = 0, 1, 2, ...$

Here $\lfloor n_k / 10 \rfloor$ is the mathematical notation for the integer division by 10. Let us
tabulate the evaluation of the sequence $n_k$ for $n = 2405$:

$k$        =   0      1     2    3   4   5   6
$n_k$      =   2405   240   24   2   0   0   0
$n_k$ % 10 =   5      0     4    2   0   0   0

² In Scala 2.13 and later (including Scala 3), the Stream class is deprecated in favor of LazyList.

The numbers $n_k$ remain zero for all $k \geq 4$. It is clear that the useful part of
the sequence is before it becomes all zeros. In this example, the sequence $n_k$ needs
to be stopped at $k = 4$, keeping only the elements with $k < 4$. The sequence of digits
then becomes [5, 0, 4, 2], and we need to reverse it to obtain [2, 4, 0, 5]. For reversing
a sequence, the Scala library has the standard method reverse. So, a complete
implementation for digitsOf is:
def digitsOf(n: Int): Seq[Int] =
  if (n == 0) Seq(0) else { // n == 0 is a special case.
    Stream.iterate(n) { nk => nk / 10 }
      .takeWhile { nk => nk != 0 }
      .map { nk => nk % 10 }
      .toList.reverse
  }

We can shorten the code by using the syntax (_ % 10) instead of { nk => nk % 10 }:
def digitsOf(n: Int): Seq[Int] =
  if (n == 0) Seq(0) else { // n == 0 is a special case.
    Stream.iterate(n)(_ / 10)
      .takeWhile(_ != 0)
      .map(_ % 10)
      .toList.reverse
  }


The type signature of the method Stream.iterate can be written as:
def iterate[A](init: A)(next: A => A): Stream[A]

This shows a close correspondence to a definition by mathematical induction. The
base case is the first value (init) and the inductive step is a function (next) that
computes the next element from the previous one. It is a general way of creating
sequences whose length is not determined in advance.
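In Scala 2.13 and later, the collection companions also provide an unfold method that combines the steps of iterating, stopping, and mapping. As a sketch of an alternative formulation (not the approach used in the text), digitsOf could be written as follows; the name digitsOfUnfold is hypothetical, and the input is assumed to be non-negative, as in the code above.

```scala
// Requires Scala 2.13 or later, where `unfold` is available on collection companions.
def digitsOfUnfold(n: Int): Seq[Int] =
  if (n == 0) Seq(0) // n == 0 is a special case, as before.
  else List.unfold(n) { nk =>
    if (nk == 0) None               // Stop: no more digits.
    else Some((nk % 10, nk / 10))   // Emit the last digit; continue with nk / 10.
  }.reverse
```

The function passed to unfold returns either None (stop) or a pair of the next output element and the next state, so the stopping condition is part of the inductive step.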

2.4 Transforming a sequence into another sequence


We have seen methods such as map and zip that transform sequences into sequences.
However, these methods cannot express a general transformation where
the elements of the new sequence are defined by induction and depend on previous
elements. An example of this kind is computing the partial sums of a given
sequence $x_i$, say $b_k = \sum_{i=0}^{k-1} x_i$. This formula defines $b_0 = 0$, $b_1 = x_0$, $b_2 = x_0 + x_1$,
$b_3 = x_0 + x_1 + x_2$, etc. A definition via mathematical induction may be written like
this:
• Base case: $b_0 = 0$.
• Inductive step: Given $b_k$, we define $b_{k+1} = b_k + x_k$ for $k = 0, 1, 2, ...$
The Scala library method scanLeft implements a general sequence-to-sequence
transformation defined in this way. The code implementing the partial sums is:
def partialSums(xs: Seq[Int]): Seq[Int] = xs.scanLeft(0){ (x, y) => x + y }

scala> partialSums(Seq(1, 2, 3, 4))
res0: Seq[Int] = List(0, 1, 3, 6, 10)
The first argument of scanLeft is the base case, and the second argument is an
updater function describing the inductive step.
In general, the type of elements of the second sequence may be different from that
of the first sequence. The updater function takes the previous element of the second
sequence and the current element of the first sequence, and returns the next element of
the second sequence. Note that the result of scanLeft is one element longer than
the original sequence, because the base case provides an initial value.
Until now, we have seen that foldLeft is sufficient to re-implement almost every
method that works on sequences, such as map, filter, or flatten. Let us show, as
an illustration, how to implement the method scanLeft via foldLeft. In the imple-
mentation, the accumulator contains the previous element of the second sequence
together with a growing fragment of that sequence, which is updated as we iterate
over the first sequence. The code is:
1 def scanLeft[A, B](xs: Seq[A])(b0: B)(next: (B, A) => B): Seq[B] = {
2   val init: (B, Seq[B]) = (b0, Seq(b0))
3   val (_, result) = xs.foldLeft(init) {
4     case ((b, seq), x) =>
5       val newB = next(b, x)
6       (newB, seq :+ newB)
7   }
8   result
9 }
To implement the (nameless) updater function for foldLeft in lines 4–6, we used a
Scala feature that makes it easier to define functions with several arguments con-
taining tuples. In our case, the updater function in foldLeft has two arguments:
the first is a tuple (B, Seq[B]), the second is a value of type A. Although the pat-
tern expression case ((b, seq), x) => ... appears to match a nested tuple, it is just
a special syntax. In reality, this expression matches the two arguments of the up-
dater function and, at the same time, destructures the tuple argument as (b, seq).
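To check that this implementation agrees with the library method, both can be applied to the same input. The sketch below repeats the code above under the hypothetical name scanLeftViaFold (to avoid confusion with the library's scanLeft) and uses an example where the two sequences have different element types.

```scala
def scanLeftViaFold[A, B](xs: Seq[A])(b0: B)(next: (B, A) => B): Seq[B] = {
  val init: (B, Seq[B]) = (b0, Seq(b0))
  val (_, result) = xs.foldLeft(init) {
    case ((b, seq), x) =>
      val newB = next(b, x)
      (newB, seq :+ newB)
  }
  result
}

// The first sequence has elements of type String, the second of type Int.
val words = Seq("a", "bc", "def")
val viaFold: Seq[Int] = scanLeftViaFold(words)(0) { (n, w) => n + w.length }
val viaLibrary: Seq[Int] = words.scanLeft(0) { (n, w) => n + w.length }
```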

Definition by induction                        Scala code example

$f([]) = b$ ;  $f(s++[x]) = g(f(s), x)$        f(xs) = xs.foldLeft(b)(g)
$f([]) = b$ ;  $f([x]++s) = g(x, f(s))$        f(xs) = xs.foldRight(b)(g)
$x_0 = b$ ;  $x_{k+1} = g(x_k)$                xs = Stream.iterate(b)(g)
$y_0 = b$ ;  $y_{k+1} = g(y_k, x_k)$           ys = xs.scanLeft(b)(g)

Table 2.1: Implementing mathematical induction.

2.5 Summary
We have seen a number of ways for translating mathematical induction into Scala
code. What problems can we solve now?

• Compute mathematical expressions involving arbitrary recursion.

• Use the accumulator trick to enforce tail recursion.

• Implement functions with type parameters.

• Use arbitrary inductive (i.e., recursive) formulas to:

  – convert sequences to single values (aggregation or “folding”);
  – create new sequences from single values (“unfolding”);
  – transform existing sequences into new sequences.

Table 2.1 shows Scala code implementing those tasks. Iterative calculations are
implemented by translating mathematical induction directly into code. In the
functional programming paradigm, the programmer does not need to write loops
or use array indices. Instead, the programmer reasons about sequences as mathe-
matical values: “Starting from this value, we get that sequence, then transform it
into that other sequence,” etc. This is a powerful way of working with sequences,
dictionaries, and sets. Many kinds of programming errors (such as using an incor-
rect array index) are avoided from the outset, and the code is shorter and easier to
read than code written via loops.
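The four rows of Table 2.1 can be illustrated by one small computation each. The following is a sketch with arbitrarily chosen updater functions; it uses LazyList (Scala 2.13 and later) in place of Stream.

```scala
val xs = Seq(1, 2, 3)

// Row 1: f(s ++ [x]) = g(f(s), x), here with g(b, x) = b * 10 + x.
val folded: Int = xs.foldLeft(0) { (b, x) => b * 10 + x }

// Row 2: f([x] ++ s) = g(x, f(s)), here with g(x, b) = x + b * 10.
val foldedRight: Int = xs.foldRight(0) { (x, b) => x + b * 10 }

// Row 3: x_0 = b, x_{k+1} = g(x_k), here with g(x) = x * 2.
val powersOfTwo: List[Int] = LazyList.iterate(1)(x => x * 2).take(5).toList

// Row 4: y_0 = b, y_{k+1} = g(y_k, x_k), here with g(y, x) = y + x.
val partialSums: Seq[Int] = xs.scanLeft(0) { (y, x) => y + x }
```

With these choices, foldLeft reads the digits from left to right and produces 123, while foldRight starts from the rightmost element and produces 321.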
What problems cannot be solved with these tools? There is no automatic recipe
for converting an arbitrary function into a tail-recursive one. The accumulator
trick does not always work! In some cases, it is impossible to implement tail
recursion in a given recursive computation. An example of such a computation is
the “merge-sort” algorithm where the function body must contain two recursive
calls within a single expression. (It is impossible to rewrite two recursive calls as
one tail call.)
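To make the structure of such a computation concrete, here is a minimal sketch of merge-sort for lists of integers. This code is an illustration and is not taken from the text; the helper merge is written in a simple non-tail-recursive style for brevity.

```scala
// Merge two sorted lists into one sorted list (itself not tail-recursive, for brevity).
def merge(a: List[Int], b: List[Int]): List[Int] = (a, b) match {
  case (Nil, _) => b
  case (_, Nil) => a
  case (x :: xs, y :: ys) =>
    if (x <= y) x :: merge(xs, b) else y :: merge(a, ys)
}

def mergeSort(s: List[Int]): List[Int] =
  if (s.length <= 1) s
  else {
    val (left, right) = s.splitAt(s.length / 2)
    // Two recursive calls inside one expression: neither is in tail position.
    merge(mergeSort(left), mergeSort(right))
  }
```

The results of both recursive calls are needed before merge can run, so neither call can be the last operation performed by mergeSort.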

What if our recursive code cannot be transformed into tail-recursive code via
the accumulator trick, but the recursion depth is so large that stack overflows
occur? There exist special techniques (e.g., “continuations” and “trampolines”)
that convert non-tail-recursive code into code that runs without stack overflows.
Those techniques are beyond the scope of this chapter.
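For orientation only, here is a minimal sketch of a trampoline. It assumes the standard library module scala.util.control.TailCalls; the function sumTo is a hypothetical example chosen for this illustration:

```scala
import scala.util.control.TailCalls._

// Trampolined sum 1 + 2 + ... + n. The addition is performed after the
// recursive call returns, so the computation is not tail-recursive.
def sumTo(n: Long): TailRec[Long] =
  if (n == 0) done(0L)
  else tailcall(sumTo(n - 1)).map(_ + n)

// `result` runs the suspended steps in a loop instead of nesting stack frames.
val s: Long = sumTo(100000L).result
```

The recursive calls are stored as values of type TailRec on the heap, so the recursion depth is no longer limited by the size of the stack.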

2.5.1 Examples
Example 2.5.1.1 Compute the smallest 𝑛 such that 𝑓(𝑓(𝑓(...𝑓(1)...))) ≥ 1000, where
the function 𝑓 is applied 𝑛 times. Test with 𝑓(𝑥) = 2𝑥 + 1.
Solution Define a stream of values [1, 𝑓 (1), 𝑓 ( 𝑓 (1)), ...] and use takeWhile to
stop the stream when the values reach 1000. The number 𝑛 is then found as the
length of the resulting sequence:
scala> Stream.iterate(1)(x => 2 * x + 1).takeWhile(x => x < 1000).toList
res0: List[Int] = List(1, 3, 7, 15, 31, 63, 127, 255, 511)

scala> Stream.iterate(1)(x => 2 * x + 1).takeWhile(x => x < 1000).length
res1: Int = 9
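The same computation can be packaged as a function of the starting value, the bound, and 𝑓. This is a sketch; the name stepsToReach is hypothetical, and the code assumes that the iterated values eventually reach the bound (otherwise it does not terminate):

```scala
// The smallest n such that applying f to `init` n times gives a value >= bound.
// The values f^0(init), ..., f^(n-1)(init) are exactly those below `bound`,
// so n equals the number of elements kept by takeWhile.
def stepsToReach(init: Int, bound: Int)(f: Int => Int): Int =
  Stream.iterate(init)(f).takeWhile(_ < bound).length

val n = stepsToReach(1, 1000)(x => 2 * x + 1)
```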

Example 2.5.1.2 (a) For a given Stream[Int], compute the stream of the largest
values seen so far.
(b) Compute the stream of 𝑘 largest values seen so far (𝑘 is a given integer
parameter).
Solution We cannot use max or sort the entire stream, since the length of the
stream is not known in advance. So, we need to use scanLeft, which will build the
output stream one element at a time.
(a) Maintain the largest value seen so far in the accumulator of the scanLeft:
def maxSoFar(xs: Stream[Int]): Stream[Int] =
  xs.scanLeft(xs.head) { (max, x) => math.max(max, x) }.drop(1)

We use drop(1) to remove the initial value (xs.head) because it is not useful for our
result but is always produced by scanLeft.
To test this function, let us define a stream whose values go up and down:
val s = Stream.iterate(0)(x => 1 - 2 * x)

scala> s.take(10).toList
res0: List[Int] = List(0, 1, -1, 3, -5, 11, -21, 43, -85, 171)

scala> maxSoFar(s).take(10).toList
res1: List[Int] = List(0, 1, 1, 3, 3, 11, 11, 43, 43, 171)

(b) We again use scanLeft, where now the accumulator needs to keep the largest
𝑘 values seen so far. There are two ways of maintaining this accumulator: First, to
have a sequence of 𝑘 values that we sort and truncate each time. Second, to use a
data structure such as a priority queue that automatically keeps values sorted and
its length bounded. For the purposes of this example, let us avoid using special
data structures:
def maxKSoFar(xs: Stream[Int], k: Int): Stream[Seq[Int]] = {
  // The initial value of the accumulator is an empty Seq() of type Seq[Int].
  xs.scanLeft(Seq[Int]()) { (seq, x) =>
    // Sort in descending order, and take the first k values.
    (seq :+ x).sorted.reverse.take(k)
  }.drop(1) // Skip the undesired first value.
}

scala> maxKSoFar(s, 3).take(10).toList
res2: List[Seq[Int]] = List(List(0), List(1, 0), List(1, 0, -1), List(3, 1, 0),
  List(3, 1, 0), List(11, 3, 1), List(11, 3, 1), List(43, 11, 3), List(43,
  11, 3), List(171, 43, 11))
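Since the accumulator is already sorted at every step, a possible variation of the first approach is to insert each new value at its sorted position instead of sorting again. The following is a sketch of that idea (the name maxKSoFarInsert is hypothetical); it is not the priority-queue approach mentioned above:

```scala
def maxKSoFarInsert(xs: Stream[Int], k: Int): Stream[Seq[Int]] =
  xs.scanLeft(Seq[Int]()) { (seq, x) =>
    // `seq` is sorted in descending order; split it at the position of `x`.
    val (larger, rest) = seq.span(_ > x)
    (larger ++ (x +: rest)).take(k)
  }.drop(1) // Skip the initial empty accumulator.

val sample = Stream.iterate(0)(x => 1 - 2 * x) // 0, 1, -1, 3, -5, 11, ...
val firstSix = maxKSoFarInsert(sample, 3).take(6).toList
```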

Example 2.5.1.3 Find the last element of a non-empty sequence. (Hint: use
reduce.)
Solution This function is available in the Scala library as the standard method
last on sequences. Here we need to re-implement it using reduce. Begin by writing
an inductive definition:
• (Base case.) last(Seq(x)) == x.
• (Inductive step.) last(x +: xs) == last(xs) assuming xs is non-empty.
The reduce method implements an inductive aggregation similarly to foldLeft,
except that for reduce the base case always returns x for a 1-element sequence
Seq(x). This is exactly what we need here, so the inductive definition is directly
translated into code, with the updater function 𝑔(𝑥, 𝑦) = 𝑦:
def last[A](xs: Seq[A]): A = xs.reduce { (x, y) => y }
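A short usage sketch follows. Because reduce has no base case for an empty sequence, last fails on empty input. The variant lastOption is a hypothetical addition that uses the standard method reduceOption (the Option type is explained in Section 3.2.3):

```scala
def last[A](xs: Seq[A]): A = xs.reduce { (x, y) => y }

// Returns None instead of throwing an exception when `xs` is empty.
def lastOption[A](xs: Seq[A]): Option[A] = xs.reduceOption { (x, y) => y }

val a = last(Seq(1, 2, 3))
val b = lastOption(Seq[Int]())
```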

Example 2.5.1.4 (a) Count the occurrences of each distinct word in a string:
def countWords(s: String): Map[String, Int] = ???

scala> countWords("a quick a quick a brown a fox")
res0: Map[String, Int] = Map(a -> 4, quick -> 2, brown -> 1, fox -> 1)

(b) Count the occurrences of each distinct element in a sequence of type Seq[A].
Solution (a) We split the string into an array of words via s.split(" ") and
apply a foldLeft to that array, since the computation is a kind of aggregation over
the array of words. The accumulator of the aggregation will be a dictionary of
word counts for all the words seen so far:
def countWords(s: String): Map[String, Int] = {
  val init: Map[String, Int] = Map()
  s.split(" ").foldLeft(init) { (dict, word) =>
    val newCount = dict.getOrElse(word, 0) + 1
    dict.updated(word, newCount)
  }
}


An alternative, shorter implementation of the same function is:


def countWords(s: String): Map[String, Int] =
  s.split(" ").groupBy(w => w).map { case (w, xs) => (w, xs.length) }

The groupBy creates a dictionary in one function call rather than one entry at a time.
But the resulting dictionary contains word lists instead of word counts, so we use
map to compute the length of each word list:
scala> "a a b b b c".split(" ").groupBy(w => w)
res0: Map[String,Array[String]] = Map(b -> Array(b, b, b), a -> Array(a, a), c
-> Array(c))

scala> res0.map { case (w, xs) => (w, xs.length) }
res1: Map[String,Int] = Map(b -> 3, a -> 2, c -> 1)

(b) The main code of countWords does not depend on the fact that words are
of type String. It will work in the same way for any other type of keys for the
dictionary. So, we keep the same code (except for renaming word to x) and replace
String by a type parameter A in the type signature:
def countValues[A](xs: Seq[A]): Map[A, Int] =
  xs.foldLeft(Map[A, Int]()) { (dict, x) =>
    val newCount = dict.getOrElse(x, 0) + 1
    dict.updated(x, newCount)
  }

scala> countValues(Seq(100, 100, 200, 100, 200, 200, 100))
res0: Map[Int,Int] = Map(100 -> 4, 200 -> 3)

Example 2.5.1.5 (a) Implement the binary search algorithm for a sorted sequence
xs: Seq[Int] as a function returning the index of the requested value goal (assume
that xs always contains goal):
@tailrec def binSearch(xs: Seq[Int], goal: Int): Int = ???

scala> binSearch(Seq(1, 3, 5, 7), 5)
res0: Int = 2

(b) Implement binSearch using Stream.iterate without explicit recursion.


Solution (a) The binary search algorithm splits the array into two halves and
may continue the search recursively in one of the halves. We need to write the
solution as a tail-recursive function with an additional accumulator argument.
So, we expect that the code should look like this:
@tailrec def binSearch(xs: Seq[Int], goal: Int, acc: _ = ???): Int = {
  if (???) acc // This condition must decide whether we are finished.
  else {
    // Determine which half of the sequence contains `goal`.
    // Then update the accumulator accordingly.
    val newAcc = ???
    binSearch(xs, goal, newAcc) // Tail-recursive call.
  }
}

We will first figure out the type and the initial value of the accumulator, then
implement the updater.
The information required for the recursive call must show the segment of the
sequence where the target number is present. That segment is defined by two
indices 𝑖, 𝑗 representing the left and the right bounds of the sub-sequence, such
that the target element is 𝑥_𝑛 with 𝑥_𝑖 ≤ 𝑥_𝑛 ≤ 𝑥_{𝑗−1}. It follows that the accumulator
should be a pair of two integers (𝑖, 𝑗). The initial value of the accumulator is the
pair (0, 𝑁), where 𝑁 is the length of the entire sequence. The search is finished
when 𝑖 + 1 = 𝑗. For convenience, we introduce two accumulator values (left and
right) for 𝑖 and 𝑗:
import scala.annotation.tailrec

@tailrec def binSearch(xs: Seq[Int], goal: Int)(left: Int = 0, right: Int = xs.length): Int = {
  // Check whether `goal` is at one of the boundaries.
  if (right - left <= 1 || xs(left) == goal) left
  else {
    val middle = (left + right) / 2
    // Determine which half of the array contains `goal`.
    // Update the accumulator accordingly.
    val (newLeft, newRight) =
      if (goal < xs(middle)) (left, middle)
      else (middle, right)
    binSearch(xs, goal)(newLeft, newRight) // Tail-recursive call.
  }
}

scala> binSearch(0 to 10, 3)() // Default accumulator values.
res0: Int = 3

Here we used a feature of Scala that allows us to set xs.length as a default value
for the argument right of binSearch. This works because right is in a different argu-
ment list from xs. Default values in an argument list may depend on arguments
in a previous argument list. However, this code:
def binSearch(xs: Seq[Int], goal: Int, left: Int = 0, right: Int = xs.length)

will generate an error. Arguments in the same argument list cannot depend on
each other. (The error will say not found: value xs.)
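The rule about default values and argument lists can be seen in isolation in the following sketch. The function firstN is a hypothetical example, not part of the binary search code:

```scala
// The default value of `n` refers to `xs`, which is in the previous argument list.
def firstN(xs: Seq[Int])(n: Int = xs.length / 2): Seq[Int] = xs.take(n)

val withDefault = firstN(Seq(1, 2, 3, 4))()  // Uses the default value n = 2.
val withGiven   = firstN(Seq(1, 2, 3, 4))(3) // Uses the given value n = 3.
```

The empty parentheses are still required when the default value is used, as in the call binSearch(0 to 10, 3)().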
(b) We can visualize the binary search as a procedure that generates a stream of
progressively tighter bounds for the location of goal. The initial bounds are (0,
xs.length), and the final bounds are (k, k + 1) for some k. We can generate the
sequence of bounds using Stream.iterate and stop the sequence when the bounds
become sufficiently tight. To detect that, we use the find method:
def binSearch(xs: Seq[Int], goal: Int): Int = {
  type Acc = (Int, Int)
  val init: Acc = (0, xs.length)

  val updater: Acc => Acc = { case (left, right) =>
    if (right - left <= 1 || xs(left) == goal) (left, left + 1)
    else {
      val middle = (left + right) / 2
      // Determine which half of the array contains `goal`.
      // Update the accumulator accordingly.
      if (goal < xs(middle)) (left, middle)
      else (middle, right)
    }
  }

  Stream.iterate(init)(updater)
    .find { case (x, y) => y - x <= 1 } // Find an element with tight bounds.
    .get._1 // Take the `left` bound from that.
}

In this code, recursion is delegated to Stream.iterate and is cleanly separated from
the “business logic” (i.e., from specific computations needed in the base case, the
inductive step, and the post-processing).
Example 2.5.1.6 For a given positive n: Int, compute the sequence [𝑠_0, 𝑠_1, 𝑠_2, ...]
defined by 𝑠_0 = 𝑛 and 𝑠_𝑘 = 𝑆𝐷(𝑠_{𝑘−1}) for 𝑘 > 0, where 𝑆𝐷(𝑥) is the sum of the
decimal digits of the integer 𝑥, e.g., 𝑆𝐷(123) = 6. Stop the sequence 𝑠_𝑖 when the
numbers begin repeating. For example, 𝑆𝐷(99) = 18, 𝑆𝐷(18) = 9, 𝑆𝐷(9) = 9. So,
for 𝑛 = 99, the sequence 𝑠_𝑖 must be computed as [99, 18, 9].
Hint: use Stream.iterate and scanLeft.
Solution We need to implement a function sdSeq having the type signature:
def sdSeq(n: Int): Seq[Int]

First, we need to implement 𝑆𝐷(𝑥). The sum of digits is obtained similarly to
Section 2.3:
def SD(n: Int): Int =
  Stream.iterate(n)(_ / 10).takeWhile(_ != 0).map(_ % 10).sum

Let us compute the sequence [𝑠_0, 𝑠_1, 𝑠_2, ...] by repeatedly applying SD to some number,
say, 99:
scala> Stream.iterate(99)(SD).take(10).toList
res1: List[Int] = List(99, 18, 9, 9, 9, 9, 9, 9, 9, 9)

We need to stop the stream when the values start to repeat, keeping the first re-
peated value. In the example above, we need to stop the stream after the value 9
(but include that value). One solution is to transform the stream via scanLeft into a
stream of pairs of consecutive values, so that it becomes easier to detect repetition:
scala> Stream.iterate(99)(SD).scanLeft((0,0)) { case ((prev, x), next) => (x,
next) }.take(8).toList
res2: List[(Int, Int)] = List((0,0), (0,99), (99,18), (18,9), (9,9), (9,9),
(9,9), (9,9))

scala> res2.drop(1).takeWhile { case (x, y) => x != y }
res3: List[(Int, Int)] = List((0,99), (99,18), (18,9))

This looks right; it remains to remove the first parts of the tuples:
def sdSeq(n: Int): Seq[Int] = Stream.iterate(n)(SD) // Stream[Int]
  .scanLeft((0,0)) { case ((prev, x), next) => (x, next) } // Stream[(Int, Int)]
  .drop(1).takeWhile { case (x, y) => x != y } // Stream[(Int, Int)]
  .map(_._2) // Stream[Int]
  .toList // List[Int]

scala> sdSeq(99)
res3: Seq[Int] = List(99, 18, 9)
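The stream of pairs of consecutive values can also be obtained by zipping the stream with its own tail, which avoids the artificial initial pair (0, 0). The following sketch (with the hypothetical name sdSeqZip) is an alternative formulation that does not follow the hint about scanLeft:

```scala
def SD(n: Int): Int =
  Stream.iterate(n)(_ / 10).takeWhile(_ != 0).map(_ % 10).sum

def sdSeqZip(n: Int): Seq[Int] = {
  val s = Stream.iterate(n)(SD)
  // Pairs of consecutive values (s_k, s_{k+1}); stop at the first repetition.
  val rest = s.zip(s.tail).takeWhile { case (x, y) => x != y }.map(_._2)
  (n +: rest).toList // The initial value `n` is always part of the result.
}

val r = sdSeqZip(99)
```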

Example 2.5.1.7 Implement a function unfold with the type signature:


def unfold[A](init: A)(next: A => Option[A]): Stream[A]

The function should create a stream of values of type A with the initial value init.
Next elements are computed from previous ones via the function next until it re-
turns None. (The type Option is explained in Section 3.2.3.) An example test:
scala> unfold(0) { x => if (x > 5) None else Some(x + 2) }
res0: Stream[Int] = Stream(0, ?)

scala> res0.toList
res1: List[Int] = List(0, 2, 4, 6)

Solution We can formulate the task as an inductive definition of a stream.
If next(init) == None, the stream will have just one value (init). This is the base
case of the induction. Otherwise, next(init) == Some(x) yields a new value x. So,
we need to continue to “unfold” the stream with x instead of init. (This is the
inductive step.) To create streams with given values, we use the Scala library
method Stream.cons. It constructs a stream from a head value and a tail stream:
def unfold[A](init: A)(next: A => Option[A]): Stream[A] = next(init) match {
  case None => Stream(init) // A stream having a single value `init`.
  case Some(x) => Stream.cons(init, unfold(x)(next)) // `init` and then the tail of the stream.
}
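Two further usage sketches for unfold follow (the definition is repeated so that the code is self-contained; the sample computations are illustrative assumptions). The second call never returns None, which is safe because the tail of the stream is computed only on demand:

```scala
def unfold[A](init: A)(next: A => Option[A]): Stream[A] = next(init) match {
  case None    => Stream(init)
  case Some(x) => Stream.cons(init, unfold(x)(next))
}

// Repeatedly halve a number (integer division) until reaching 1.
val halves = unfold(100) { x => if (x <= 1) None else Some(x / 2) }.toList

// A never-ending stream: only the requested elements are computed.
val evens = unfold(0)(x => Some(x + 2)).take(4).toList
```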

Example 2.5.1.8 For a given stream [𝑠_0, 𝑠_1, 𝑠_2, ...] of type Stream[T], compute the
“half-speed” stream ℎ = [𝑠_0, 𝑠_0, 𝑠_1, 𝑠_1, 𝑠_2, 𝑠_2, ...]. The half-speed sequence ℎ is
defined as ℎ_{2𝑘} = ℎ_{2𝑘+1} = 𝑠_𝑘 for 𝑘 = 0, 1, 2, ...
Solution We use map to replace each element 𝑠_𝑖 by a sequence containing two
copies of 𝑠_𝑖. Let us try this on a sample sequence:
scala> Seq(1, 2, 3).map( x => Seq(x, x))
res0: Seq[Seq[Int]] = List(List(1, 1), List(2, 2), List(3, 3))

The result is almost what we need, except we need to flatten the nested list:
scala> Seq(1, 2, 3).map( x => Seq(x, x)).flatten
res1: Seq[Int] = List(1, 1, 2, 2, 3, 3)

The composition of map and flatten is flatMap, so the final code is:
def halfSpeed[T](str: Stream[T]): Stream[T] = str.flatMap(x => Seq(x, x))

scala> halfSpeed(Seq(1, 2, 3).toStream)
res2: Stream[Int] = Stream(1, ?)

scala> halfSpeed(Seq(1, 2, 3).toStream).toList
res3: List[Int] = List(1, 1, 2, 2, 3, 3)

Example 2.5.1.9 (The loop detection problem.) Stop a given stream [𝑠_0, 𝑠_1, 𝑠_2, ...]
at a place 𝑘 where the sequence repeats itself; that is, an element 𝑠_𝑘 equals some
earlier element 𝑠_𝑖 with 𝑖 < 𝑘.
Solution The trick is to create a half-speed sequence ℎ_𝑖 out of 𝑠_𝑖 and then find
an index 𝑘 > 0 such that ℎ_𝑘 = 𝑠_𝑘. (The condition 𝑘 > 0 is needed because we
will always have ℎ_0 = 𝑠_0.) If we find such an index 𝑘, it would mean that either
𝑠_𝑘 = 𝑠_{𝑘/2} or 𝑠_𝑘 = 𝑠_{(𝑘−1)/2}; in either case, we will have found an element 𝑠_𝑘 that
equals an earlier element.
As an example, for an input sequence 𝑠 = [1, 3, 5, 7, 9, 3, 5, 7, 9, ...] we obtain the
half-speed sequence ℎ = [1, 1, 3, 3, 5, 5, 7, 7, 9, 9, 3, 3, ...]. Looking for an index 𝑘 > 0
such that ℎ_𝑘 = 𝑠_𝑘, we find that 𝑠_7 = ℎ_7 = 7. The element 𝑠_7 indeed repeats an earlier
element (although 𝑠_7 is not the first such repetition).
There are in principle two ways of finding an index 𝑘 > 0 such that ℎ_𝑘 = 𝑠_𝑘:
First, to iterate over a list of indices 𝑘 = 1, 2, ... and evaluate the condition ℎ_𝑘 = 𝑠_𝑘
as a function of 𝑘. Second, to build a sequence of pairs (ℎ_𝑖, 𝑠_𝑖) and use takeWhile to
stop at the required index. In the present case, we cannot use the first way because
we do not have a fixed set of indices to iterate over. Also, the condition ℎ_𝑘 = 𝑠_𝑘
cannot be directly evaluated as a function of 𝑘 because 𝑠 and ℎ are streams that
compute elements on demand, not lists whose elements are computed in advance
and ready for use.
So, the code must iterate over a stream of pairs (ℎ_𝑖, 𝑠_𝑖):
def stopRepeats[T](str: Stream[T]): Stream[T] = {
  val halfSpeed = str.flatMap(x => Seq(x, x))
  val result = halfSpeed.zip(str) // Stream[(T, T)]
    .drop(1) // Enforce the condition k > 0.
    .takeWhile { case (h, s) => h != s } // Stream[(T, T)]
    .map(_._2) // Stream[T]
  str.head +: result // Prepend the first element that was dropped.
}

scala> stopRepeats(Seq(1, 3, 5, 7, 9, 3, 5, 7, 9).toStream).toList
res0: List[Int] = List(1, 3, 5, 7, 9, 3, 5)
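The same function also works on a stream that never ends by itself. As a sketch, take the periodic stream 0, 1, 2, 3, 4, 0, 1, ... (an input chosen only for illustration; the definition of stopRepeats is repeated so that the code is self-contained). Here 𝑠_𝑘 = 𝑘 mod 5, and the first index 𝑘 > 0 with ℎ_𝑘 = 𝑠_𝑘 is 𝑘 = 9, so the result consists of the elements 𝑠_0, ..., 𝑠_8:

```scala
def stopRepeats[T](str: Stream[T]): Stream[T] = {
  val halfSpeed = str.flatMap(x => Seq(x, x))
  val result = halfSpeed.zip(str)
    .drop(1)                             // Enforce the condition k > 0.
    .takeWhile { case (h, s) => h != s } // Stop at the first k with h_k == s_k.
    .map(_._2)
  str.head +: result
}

val periodic = Stream.iterate(0)(x => (x + 1) % 5) // 0, 1, 2, 3, 4, 0, 1, ...
val stopped = stopRepeats(periodic).toList
```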

Example 2.5.1.10 Reverse each word in a string but keep the order of words:
def revWords(s: String): String = ???

scala> revWords("A quick brown fox")
res0: String = A kciuq nworb xof
Solution The standard method split converts a string into an array of words:
scala> "pa re ci vo mu".split(" ")
res0: Array[String] = Array(pa, re, ci, vo, mu)
Each word is reversed with reverse; the resulting array is concatenated into a
string with mkString:
def revWords(s: String): String = s.split(" ").map(_.reverse).mkString(" ")

Example 2.5.1.11 Remove adjacent repeated characters from a string:


def noDups(s: String): String = ???

scala> noDups("abbcdeeeeefddgggggh")
res0: String = abcdefdgh
Solution A string is automatically converted into a sequence of characters
when we use methods such as map or zip on it. So, we can use s.zip(s.tail) to
get a sequence of pairs (𝑠_𝑘, 𝑠_{𝑘+1}) where 𝑠_𝑘 is the 𝑘-th character of the string 𝑠. A
filter will then remove elements 𝑠_𝑘 for which 𝑠_{𝑘+1} = 𝑠_𝑘:
scala> val s = "abbcd"
s: String = abbcd

scala> s.zip(s.tail).filter { case (sk, skPlus1) => sk != skPlus1 }
res0: IndexedSeq[(Char, Char)] = Vector((a,b), (b,c), (c,d))
It remains to convert this sequence of pairs into the string "abcd". One way of
doing this is to project the sequence of pairs onto the second parts of the pairs:
scala> res0.map(_._2).mkString
res1: String = bcd
We just need to add the first character, 'a'. The resulting code is:
def noDups(s: String): String = if (s == "") "" else {
  val pairs = s.zip(s.tail).filter { case (x, y) => x != y }
  // Use `s.head` rather than `pairs.head._1`: `pairs` is empty for inputs such as "a" or "aaa".
  s.head +: pairs.map(_._2).mkString
}
The method +: prepends an element to a sequence, so x +: xs is equivalent to
Seq(x) ++ xs.
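The same task can also be expressed as a single aggregation. The following sketch (with the hypothetical name noDupsFold) uses foldLeft with a string accumulator and needs no special treatment of the first character:

```scala
def noDupsFold(s: String): String =
  s.foldLeft("") { (acc, c) =>
    // Skip `c` if it repeats the last character that was kept.
    if (acc.nonEmpty && acc.last == c) acc else acc + c
  }

val cleaned = noDupsFold("abbcdeeeeefddgggggh")
```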
Example 2.5.1.12 For a given sequence of type Seq[A], find the longest subse-
quence that does not contain any adjacent duplicate values.
def longestNoDups[A](xs: Seq[A]): Seq[A] = ???

scala> longestNoDups(Seq(1, 2, 2, 5, 4, 4, 4, 8, 2, 3, 3))
res0: Seq[Int] = List(4, 8, 2, 3)

Solution This is a “dynamic programming” problem. Many such problems are
solved with a single foldLeft. The accumulator represents the current state of the
dynamic programming solution, and the state is updated with each new element
of the input sequence.
We first need to determine the type of the accumulator value. The task is to
find the longest subsequence without adjacent duplicates. So, the accumulator
should represent the longest subsequence found so far, as well as any required
extra information about other subsequences that might grow as we iterate over
the elements of xs. What is that extra information in our case?
Imagine creating the set of all subsequences that have no adjacent duplicates.
For the input sequence [1, 2, 2, 5, 4, 4, 4, 8, 2, 3, 3], this set of all subsequences will be
{[1, 2] , [2, 5, 4] , [4, 8, 2, 3]}. We can build this set incrementally in the accumulator
value of a foldLeft. To visualize how this set would be built, consider the partial
result after seeing the first 8 elements of the input sequence, [1, 2, 2, 5, 4, 4, 4, 8].
The partial set of non-repeating subsequences is {[1, 2] , [2, 5, 4] , [4, 8]}. When we
see the next element, 2, we will update that partial set to {[1, 2] , [2, 5, 4] , [4, 8, 2]}.
It is now clear that the subsequence [1, 2] has no chance of being the longest sub-
sequence, since [2, 5, 4] is already longer. However, we do not yet know whether
[2, 5, 4] or [4, 8, 2] is the winner, because the subsequence [4, 8, 2] could still grow
and become the longest one (and it does become [4, 8, 2, 3] later). At this point,
we need to keep both of these two subsequences in the accumulator, but we may
already discard [1, 2].
We have deduced that the accumulator needs to keep only two sequences: the
first sequence is already terminated and will not grow, the second sequence ends
with the current element and may yet grow. The initial value of the accumulator
is empty. The first subsequence is discarded when it becomes shorter than the
second. The code can be written now:
def longestNoDups[A](xs: Seq[A]): Seq[A] = {
  val init: (Seq[A], Seq[A]) = (Seq(), Seq())
  val (first, last) = xs.foldLeft(init) { case ((first, current), x) =>
    // If `current` is empty, `x` is not considered to be repeated.
    val xWasRepeated = current != Seq() && current.last == x
    val firstIsLongerThanCurrent = first.length > current.length
    // Compute the new pair `(first, current)`.
    // Keep `first` only if it is longer; otherwise replace it by `current`.
    val newFirst = if (firstIsLongerThanCurrent) first else current
    // Append `x` to `current` if `x` is not repeated.
    val newCurrent = if (xWasRepeated) Seq(x) else current :+ x
    (newFirst, newCurrent)
  }
  // Return the longer of the two subsequences; prefer `first`.
  if (first.length >= last.length) first else last
}

2.5.2 Exercises
Exercise 2.5.2.1 Define a function dsq that computes the sum of squared digits of
a given integer; for instance, dsq(123) = 14 (see Example 2.5.1.6). Generalize dsq
to take as an argument a function f: Int => Int replacing the squaring operation.
The required type signature and a sample test:
def digitsFSum(x: Int)(f: Int => Int): Int = ???

scala> digitsFSum(123) { x => x * x }
res0: Int = 14

scala> digitsFSum(123) { x => x * x * x }
res1: Int = 36

Exercise 2.5.2.2 Compute the Collatz sequence 𝑐_𝑖 as a stream defined by:

    𝑐_0 = 𝑛 ;    𝑐_{𝑘+1} = 𝑐_𝑘 / 2        if 𝑐_𝑘 is even,
                 𝑐_{𝑘+1} = 3 ∗ 𝑐_𝑘 + 1    if 𝑐_𝑘 is odd.

Stop the stream when it reaches 1 (as one would expect³ it will).
Exercise 2.5.2.3 For a given integer 𝑛, compute the sum of cubed digits, then the
sum of cubed digits of the result, etc.; stop the resulting sequence when it repeats
itself, and so determine whether it ever reaches 1. (Use Exercise 2.5.2.1.)
def cubes(n: Int): Stream[Int] = ???

scala> cubes(123).take(10).toList
res0: List[Int] = List(123, 36, 243, 99, 1458, 702, 351, 153, 153, 153)

scala> cubes(2).take(10).toList
res1: List[Int] = List(2, 8, 512, 134, 92, 737, 713, 371, 371, 371)

scala> cubes(4).take(10).toList
res2: List[Int] = List(4, 64, 280, 520, 133, 55, 250, 133, 55, 250)

def cubesReach1(n: Int): Boolean = ???

scala> cubesReach1(10)
res3: Boolean = true

scala> cubesReach1(4)
res4: Boolean = false

Exercise 2.5.2.4 For a, b, c of type Set[Int], compute the set of all sets of the form
Set(x, y, z) where x is from a, y from b, and z from c. The required type signature
and a sample test:
3 https://en.wikipedia.org/wiki/Collatz_conjecture

def prod3(a: Set[Int], b: Set[Int], c: Set[Int]): Set[Set[Int]] = ???

scala> prod3(Set(1, 2), Set(3), Set(4, 5))
res0: Set[Set[Int]] = Set(Set(1,3,4), Set(1,3,5), Set(2,3,4), Set(2,3,5))

Hint: use flatMap.


Exercise 2.5.2.5 Same task as in Exercise 2.5.2.4 but using a set of sets. Instead of
just three sets a, b, c, we are given a value of type Set[Set[Int]]. The required type
signature and a sample test:
def prodSet(si: Set[Set[Int]]): Set[Set[Int]] = ???

scala> prodSet(Set(Set(1, 2), Set(3), Set(4, 5), Set(6)))
res0: Set[Set[Int]] = Set(Set(1,3,4,6),Set(1,3,5,6),Set(2,3,4,6),Set(2,3,5,6))

Hint: use foldLeft and flatMap.


Exercise 2.5.2.6 In a sorted integer array where no values are repeated, find all
pairs of values whose sum equals a given number 𝑛. Use tail recursion. A type
signature and a sample test:
def pairs(goal: Int, xs: Array[Int]): Set[(Int, Int)] = ???

scala> pairs(10, Array(1, 2, 4, 5, 6, 8))
res0: Set[(Int, Int)] = Set((2,8), (4,6), (5,5))

Exercise 2.5.2.7 Reverse a sentence’s word order, but keep the words unchanged:
def revSentence(s: String): String = ???

scala> revSentence("A quick brown fox") // Words are separated by one space.
res0: String = "fox brown quick A"

Exercise 2.5.2.8 (a) Reverse an integer’s digits (see Example 2.5.1.6) as shown:
def revDigits(n: Int): Int = ???

scala> revDigits(12345)
res0: Int = 54321

(b) A palindrome integer is an integer number n such that revDigits(n) == n. Write
a predicate function of type Int => Boolean that checks whether a given positive
integer is a palindrome.
Exercise 2.5.2.9 Define a function findPalindrome: Long => Long performing the
following computation: First define f(n) = revDigits(n) + n for a given integer n,
where the function revDigits was defined in Exercise 2.5.2.8. If f(n) is a palindrome
integer, findPalindrome returns that integer. Otherwise, it keeps applying the same
transformation and computes f(n), f(f(n)), ..., until a palindrome integer is even-
tually found (this is mathematically guaranteed). A sample test:
scala> findPalindrome(10101)
res0: Long = 10101

scala> findPalindrome(123)
res0: Long = 444

scala> findPalindrome(83951)
res1: Long = 869363968

Exercise 2.5.2.10 Transform a given sequence xs: Seq[Int] into a sequence of type
Seq[(Int, Int)] of pairs that skip one neighbor. Implement this transformation as
a function skip1 with a type parameter A instead of the type Int. The required type
signature and a sample test:
def skip1[A](xs: Seq[A]): Seq[(A, A)] = ???

scala> skip1(List(1, 2, 3, 4, 5))
res0: Seq[(Int, Int)] = List((1,3), (2,4), (3,5))

Exercise 2.5.2.11 (a) For a given integer interval [𝑛1 , 𝑛2 ], find the largest integer
𝑘 ∈ [𝑛1 , 𝑛2 ] such that the decimal representation of 𝑘 does not contain any of the
digits 3, 5, or 7.
(b) For a given integer interval [𝑛1 , 𝑛2 ], find the integer 𝑘 ∈ [𝑛1 , 𝑛2 ] with the
largest sum of decimal digits.
(c) A positive integer 𝑛 is called a perfect number if it is equal to the sum of
its divisors (integers 𝑘 such that 1 ≤ 𝑘 < 𝑛 and 𝑘 divides 𝑛). For example, 6 is a
perfect number because its divisors are 1, 2, and 3, and 1 + 2 + 3 = 6, while 8 is
not a perfect number because its divisors are 1, 2, and 4, and 1 + 2 + 4 = 7 ≠ 8.
Write a function that determines whether a given number 𝑛 is perfect. Determine
all perfect numbers up to one million.
Exercise 2.5.2.12 Transform a sequence by removing adjacent repeated elements
when they are repeated more than 𝑘 times. Repetitions up to 𝑘 times should re-
main unchanged. The required type signature and a sample test:
def removeDups[A](s: Seq[A], k: Int): Seq[A] = ???

scala> removeDups(Seq(1, 1, 1, 1, 5, 2, 2, 5, 5, 5, 5, 5, 1), 3)
res0: Seq[Int] = List(1, 1, 1, 5, 2, 2, 5, 5, 5, 1)

Exercise 2.5.2.13 Implement a function unfold2 with the type signature:


def unfold2[A, B](init: A)(next: A => Option[(A, B)]): Stream[B]

The function should create a stream of values of type B by repeatedly applying the
given function next until it returns None. At each iteration, next should be applied
to the value of type A returned by the previous call to next. An example test:
scala> unfold2(0) { x => if (x > 5) None else Some((x + 2, s"had $x")) }
res0: Stream[String] = Stream(had 0, ?)

scala> res0.toList
res1: List[String] = List(had 0, had 2, had 4)

Exercise 2.5.2.14 (a) Remove repeated elements (whether adjacent or not) from a
sequence of type Seq[A]. (This reproduces the standard library’s method distinct.)
(b) For a sequence of type Seq[A], remove all elements that are repeated (whether
adjacent or not) more than 𝑘 times:
def removeK[A](k: Int, xs: Seq[A]): Seq[A] = ???

scala> removeK(2, Seq("a", "b", "a", "b", "b", "c", "b", "a"))
res0: Seq[String] = List(a, b, a, b, c)

Exercise 2.5.2.15 For a given sequence xs: Seq[Double], find a subsequence that
has the largest sum of values. The sequence xs is not sorted, and its values may
be positive or negative. The required type signature and a sample test:
def maxsub(xs: Seq[Double]): Seq[Double] = ???

scala> maxsub(Seq(1.0, -1.5, 2.0, 3.0, -0.5, 2.0, 1.0, -10.0, 2.0))
res0: Seq[Double] = List(2.0, 3.0, -0.5, 2.0, 1.0)

Hint: use dynamic programming techniques and foldLeft.


Exercise 2.5.2.16 Using tail recursion, find all common integers between two
sorted sequences:
@tailrec def commonInt(xs: Seq[Int], ys: Seq[Int]): Seq[Int] = ???

scala> commonInt(Seq(1, 3, 5, 7), Seq(2, 3, 4, 6, 7, 8))
res0: Seq[Int] = List(3, 7)

2.6 Discussion and further developments


2.6.1 Total and partial functions
Functions can be total or partial. A total function will always compute a result
value, while a partial function may fail to compute its result for certain values of
its arguments.
A simple example of a partial function in Scala is the max method: it only works
for non-empty sequences. Trying to evaluate it on an empty sequence generates
an error (an “exception”):
scala> Seq(1).tail
res0: Seq[Int] = List()
scala> res0.max
java.lang.UnsupportedOperationException: empty.max
at scala.collection.TraversableOnce$class.max(TraversableOnce.scala:229)
at scala.collection.AbstractTraversable.max(Traversable.scala:104)
... 32 elided

This kind of error may crash a program at run time. Unlike the type errors we
saw before, which occur at compilation time (i.e., before the program can start),
run-time errors occur while the program is running and only when an invalid
situation actually happens — say, when some partial function gets an incorrect in-
put. The incorrect input may occur at any time after the program started running,
which may crash the program in the middle of a long computation.
So, it seems clear that we should avoid writing code that generates such errors.
For instance, we will prefer to apply max only to sequences that are known to be
non-empty.
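One way of following this advice is to check for emptiness before calling max. In the following sketch, the function maxOrElse and its fallback argument are hypothetical names introduced for this illustration:

```scala
// A total function: returns `default` when `xs` is empty, so `max` is
// applied only to sequences that are known to be non-empty.
def maxOrElse(xs: Seq[Int], default: Int): Int =
  if (xs.isEmpty) default else xs.max

val m1 = maxOrElse(Seq(3, 1, 2), 0)
val m2 = maxOrElse(Seq(), 0)
```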
Sometimes, a function that uses pattern matching turns out to be a partial func-
tion because its pattern matching code fails on certain input data.
If none of the cases matches in a pattern matching expression, the code will
throw an exception (a MatchError). In functional programming, we usually want to
avoid that situation because reasoning about program correctness becomes hard.
In most cases, programs can be rewritten to avoid the possibility of match errors.
An example of an unsafe pattern matching expression is:
def h(p: (Int, Int)): Int = p match { case (x, 0) => x }

scala> h( (1, 0) )
res0: Int = 1

scala> h( (1, 2) )
scala.MatchError: (1,2) (of class scala.Tuple2$mcII$sp)
at .h(<console>:12)
... 32 elided

Here, the pattern contains a pattern variable x and a constant 0. This pattern only
matches tuples whose second part is equal to 0. If the second part of the tuple is nonzero,
a match error occurs and the program crashes. So, h is a partial function.
Pattern matching errors never happen if we match a tuple of correct size with a
pattern such as (x, y, z), because each pattern variable will always match a value.
So, pattern matching with a pattern such as (x, y, z) is infallible (never fails at
run time) when applied to a tuple with 3 elements.
Another way in which pattern matching can be made infallible is by including
a pattern that matches everything:
p match {
case (x, 0) => ... // This only matches certain tuples.
case _ => ... // This matches everything else.
}

If the first pattern (x, 0) fails to match the value p, the second pattern will be tried
(and will always succeed). The case patterns in a match expression are tried in the
order they are written. So, a match expression may be made infallible by adding a
“match-all” underscore pattern.
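As a sketch, the partial function h shown earlier becomes a total function once a match-all case is added. The name hTotal and the fallback value -1 are arbitrary choices made for this illustration:

```scala
// Partial: fails with a MatchError unless the second part of the tuple is 0.
def h(p: (Int, Int)): Int = p match { case (x, 0) => x }

// Total: the underscore pattern handles all remaining tuples.
def hTotal(p: (Int, Int)): Int = p match {
  case (x, 0) => x
  case _      => -1 // Arbitrary fallback value.
}

val t1 = hTotal((1, 0))
val t2 = hTotal((1, 2))
```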

2.6.2 Scope and shadowing of pattern matching variables


Pattern matching introduces locally scoped variables — that is, variables accessi-
ble only within the right-hand side of the pattern match expression. As an exam-
ple, consider this code:
def f(x: (Int, Int)): Int = x match { case (x, y) => x + y }

scala> f( (2, 4) )
res0: Int = 6

The argument of f is the variable x of a tuple type (Int, Int), but there is also a
pattern variable x in the case expression. The pattern variable x matches the first
part of the tuple and has type Int. Because variables are locally scoped, the pattern
variable x is only defined within the expression x + y. The argument x:(Int,Int)
is a completely different variable that has a different type.
The code works correctly but is confusing to read because of the name clash
between the two quite different variables, both named x. Another negative con-
sequence of the name clash is that the argument x:(Int,Int) is invisible within the
case expression: if we write “x” in that expression, we will get the pattern variable
x:Int. One says that the argument x:(Int,Int) has been shadowed by the pattern
variable x (which is a “bound variable” inside the case expression).
This problem is easy to avoid: we can give the pattern variable another name.
Since the pattern variable is locally scoped, it can be renamed within its scope
without affecting any other code:
def f(x: (Int, Int)): Int = x match { case (a, b) => a + b }

scala> f( (2,4) )
res0: Int = 6
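
A small sketch (with the hypothetical names ShadowDemo and sumPlusFirst) suggests another benefit of renaming: once the pattern variables are called a and b, the argument x is no longer shadowed and remains usable inside the case expression:

```scala
object ShadowDemo {
  // The pattern variables `a` and `b` do not hide the argument `x`.
  def sumPlusFirst(x: (Int, Int)): Int = x match {
    case (a, b) => a + b + x._1 // Here `x` still refers to the whole tuple.
  }
}
```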

2.6.3 Lazy values and sequences. Iterators and streams


We have used streams to create sequences whose length is not known in advance.
An example is a stream containing a sequence of increasing positive integers:
scala> val p = Stream.iterate(1)(_ + 1)
p: Stream[Int] = Stream(1, ?)

At this point, we have not defined a stopping condition for this stream. In some
sense, streams may be seen as “infinite” sequences, although in practice a stream
is always finite because programs cannot run infinitely long. Also, computers
cannot store infinitely many values in memory.
More precisely, streams are “partially computed” rather than “infinite”. The main
difference between arrays and streams is that a stream’s elements are computed on
demand and not all initially available, while an array’s elements are all computed
in advance and are immediately available.
2.6 Discussion and further developments

Generally, there are three possible ways a value could be available:

Availability   Explanation                           Example Scala code

“eager”        computed immediately                  val z = f(123)
“lazy”         computed upon first use and stored    lazy val z = f(123)
“on-call”      computed each time it is needed       def z = f(123)

A lazy value (declared as lazy val in Scala) is computed only when it is needed
in some other expression. Once computed, a lazy value stays in memory and will
not be re-computed.
An “on-call” value is re-computed every time it is used. In Scala, on-call values
are denoted via def declarations as well as via call-by-name function arguments.
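
The three kinds of availability can be observed by counting how many times a function is evaluated. The sketch below is only an illustration: the names AvailabilityDemo, evaluationCounts, and calls are hypothetical, and the mutable counter is used solely to make the evaluation order visible.

```scala
object AvailabilityDemo {
  // Returns the number of evaluations of `f` observed after each step.
  def evaluationCounts(): List[Int] = {
    var calls = 0 // Mutable counter, used only to observe evaluation.
    def f(n: Int): Int = { calls += 1; n * 2 }

    val eager = f(123)          // Eager: `f` is evaluated right here.
    val afterEager = calls

    lazy val lzy = f(123)       // Lazy: nothing is evaluated yet.
    val afterLazyDeclared = calls
    val twice = lzy + lzy       // The first use evaluates `f`; the second use reuses the stored result.
    val afterLazyUsed = calls

    def onCall = f(123)         // On-call: nothing is evaluated yet.
    val again = onCall + onCall // `f` is evaluated once for each use.
    val afterOnCallUsed = calls

    List(afterEager, afterLazyDeclared, afterLazyUsed, afterOnCallUsed)
  }
}
```

In this sketch, the eager value costs one evaluation, the lazy value costs one more evaluation no matter how often it is used, and the on-call value costs one evaluation per use.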
Most collection types in Scala (such as List, Array, Set, and Map) are eager. All
elements of an eager collection are already evaluated.
A stream is a lazy collection. Elements of a stream are computed when first
needed. After that, they remain in memory and will not be computed again:
scala> val str = Stream.iterate(1)(_ + 1)
str: Stream[Int] = Stream(1, ?)

scala> str.take(10).toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> str
res1: Stream[Int] = Stream(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ?)

In many cases, it is not necessary to keep previous values of a sequence in memory.
For example:
scala> (1L to 1000000000L).sum // Compute the sum from 1 to 1 billion.
res0: Long = 500000000500000000

We do not actually need to store a billion numbers in memory if we only want
to compute their sum. Indeed, the computation just shown does not store all the
numbers in memory. The computation will fail if we use a list or a stream:
scala> (1L to 1000000000L).toStream.sum
java.lang.OutOfMemoryError: GC overhead limit exceeded

The code (1L to 1000000000L).sum works because (1 to n) produces a sequence
whose elements are computed whenever needed but do not remain in memory.
This can be seen as a sequence with the “on-call” availability of elements. Sequences
of this sort are called iterators:
scala> 1 to 5
res0: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5)

scala> 1 until 5
res1: scala.collection.immutable.Range = Range(1, 2, 3, 4)

The types Range and Range.Inclusive are defined in the Scala standard library and
are iterators. They behave as collections and support the usual methods (map,
filter, etc.), but they do not store previously computed values in memory.
The view method. Eager collections such as List or Array can be converted to
iterators by using the view method. This is necessary when intermediate collections
consume too much memory when fully evaluated. For example, consider the
computation of Example 2.1.5.7 where we used flatMap to replace each element of
an initial sequence by three new numbers before computing max of the resulting
collection. If instead of three new numbers we wanted to compute three million
new numbers each time, the intermediate collection created by flatMap would re-
quire too much memory, and the computation would crash:
scala> (1 to 10).flatMap(x => 1 to 3000000).max
java.lang.OutOfMemoryError: GC overhead limit exceeded

Even though the range (1 to 10) is an iterator, a subsequent flatMap operation cre-
ates an intermediate collection that is too large for our computer’s memory. We
can use view to avoid this:
scala> (1 to 10).view.flatMap(x => 1 to 3000000).max
res0: Int = 3000000

The choice between using streams and using iterators is dictated by memory
constraints. Except for that, streams and iterators behave similarly to other se-
quences. We may write programs in the map/reduce style, applying standard
methods such as map, filter, etc., to streams and iterators. Mathematical reason-
ing about transforming a sequence is the same, whether the sequence is eager,
lazy, or on-call.
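
As a hedged illustration of the last point (the object name and the particular computation are assumptions made for this sketch), the same chain of map, filter, and sum can be applied to an eager list and to a view, and the two results agree:

```scala
object SameReasoningDemo {
  // Eager: every intermediate list is fully built in memory.
  val viaList: Int = (1 to 100).toList.map(x => x * x).filter(_ % 3 == 1).sum

  // On-call: elements pass through `map` and `filter` one at a time.
  val viaView: Int = (1 to 100).view.map(x => x * x).filter(_ % 3 == 1).sum
}
```

Only the memory usage differs between the two computations; the reasoning about what map and filter do to the sequence is identical.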
The Iterator class. The Scala library class Iterator has methods such as iterate
and others, similarly to Stream. However, Iterator does not behave as a value in
the mathematical sense:
scala> val iter = (1 until 10).toIterator
iter: Iterator[Int] = non-empty iterator

scala> iter.toList // Look at the elements of `iter`.


res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> iter.toList // Look at those elements again...??


res1: List[Int] = List()

scala> iter
res2: Iterator[Int] = empty iterator

Evaluating the expression iter.toList two times produces a different result the
second time. As we see from the Scala output, the value iter has become “empty”
after the first use.

This situation is impossible in mathematics: if 𝑥 is a value, such as 100, and 𝑓 is
a function, such as 𝑓(𝑥) = √𝑥, then 𝑓(𝑥) will be the same, 𝑓(100) = √100 = 10, no
matter how many times we compute 𝑓 (𝑥). For instance, we can compute 𝑓 (𝑥) +
𝑓 (𝑥) = 20 and obtain the correct result. We could also set 𝑦 = 𝑓 (𝑥) and compute
𝑦 + 𝑦 = 20, with the same result. This property is called referential transparency
or functional purity of the function 𝑓 .
When we set 𝑥 = 100 and compute 𝑓 (𝑥) + 𝑓 (𝑥), the number 100 does not “become
empty” after the first use; its value remains the same. This behavior is what we
expect values to have. So, we say that integers “are values” in the mathematical
sense. Alternatively, one says that numbers are immutable, i.e., cannot be modi-
fied. (What would it mean to “modify” the number 10?)
In programming, a type has value-like behavior if a computation applied to
it always gives the same result. Usually, this means that the type contains im-
mutable data, and the computation is referentially transparent. We can see that
Scala’s Range is immutable and behaves as a value:
scala> val x = 1 until 10
x: scala.collection.immutable.Range = Range(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> x.toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> x.toList
res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

Collections such as List, Map, or Stream are immutable. Some elements of a Stream
may not be evaluated yet, but this does not affect its value-like behavior:
scala> val str = (1 until 10).toStream
str: scala.collection.immutable.Stream[Int] = Stream(1, ?)

scala> str.toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> str.toList
res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

The view method produces iterators that do have value-like behavior:


scala> val v = (1 until 10).view
v: scala.collection.SeqView[Int,IndexedSeq[Int]] = SeqView(...)

scala> v.toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> v.toList
res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

Due to the lack of value-like behavior, programs written using Iterator do not
obey the usual rules of mathematical reasoning. This makes it easy to write wrong
code that looks correct.


To illustrate the problem, let us re-implement Example 2.5.1.9 by keeping the
same code but using Iterator instead of Stream:
def stopRepeatsBad[T](iter: Iterator[T]): Iterator[T] = {
  val halfSpeed = iter.flatMap(x => Seq(x, x))
  halfSpeed.zip(iter) // Do not prepend the first element. It won't help.
    .drop(1).takeWhile { case (h, s) => h != s }
    .map(_._2)
}

scala> stopRepeatsBad(Seq(1, 3, 5, 7, 9, 3, 5, 7, 9).toIterator).toList


res0: List[Int] = List(5, 9, 3, 7, 9)

The result [5, 9, 3, 7, 9] is incorrect, but not in an obvious way: the sequence was
stopped at a repetition, as we wanted, but some of the elements of the given se-
quence are missing (while other elements are present). It is difficult to debug a
program that produces partially correct numbers.
The error in this code occurs in the expression halfSpeed.zip(iter) due to the
fact that halfSpeed was itself defined via iter. The result is that iter is used twice
in this code, which leads to errors. Creating an Iterator and using it twice in the
same expression can give wrong results or even fail with an exception:
scala> val s = (1 until 10).toIterator
s: Iterator[Int] = non-empty iterator

scala> val t = s.zip(s).toList


java.util.NoSuchElementException: next on empty iterator

It is surprising and counter-intuitive that a variable (here, s) cannot be used twice;
we expect that the code s.zip(s) would just “zip” a given sequence s with itself.
But Scala’s Iterator class is mutable: it gets modified during its use. This breaks
the value-based reasoning about code.
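
One way to avoid this pitfall, sketched here under the assumption that the data fits in memory (the names ZipWithSelf and zipWithSelf are hypothetical), is to consume the iterator exactly once into an immutable collection and to work with that collection afterwards:

```scala
object ZipWithSelf {
  // Consume the iterator exactly once; the resulting List behaves as a value.
  def zipWithSelf[T](iter: Iterator[T]): List[(T, T)] = {
    val xs = iter.toList // The only place where `iter` is used.
    xs.zip(xs)           // A List can safely be used any number of times.
  }
}
```

This trades memory for predictability: the list stores all elements, but the code can again be understood by value-based reasoning.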
An Iterator can be converted to a Stream using the toStream method. This restores
the value-based reasoning because streams behave as values:
scala> val iter = (1 until 10).toIterator
iter: Iterator[Int] = non-empty iterator

scala> val str = iter.toStream


str: Stream[Int] = Stream(1, ?)

scala> str.toList
res0: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> str.toList
res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)

scala> str.zip(str).toList
res2: List[(Int, Int)] = List((1,1), (2,2), (3,3), (4,4), (5,5), (6,6), (7,7), (8,8), (9,9))


Instead of Iterator, we can use Stream and view when lazy or on-call collections
are required. Scala 2.13 and later versions deprecate Stream in favor of LazyList, which is a
lazily evaluated (and possibly infinite) stream. Libraries such as scalaz and fs2
also provide streams with correct value-like behavior.
The mutable behavior of Iterator is an example of a “side effect”. A function has
a side effect if the function’s code performs some external action in addition to
computing a result value. Examples of side effects are: modifying a value stored
in memory; starting and stopping processes or threads; reading or writing files;
printing; sending or receiving data over a network; showing images on a display;
playing or recording sounds; getting photos or videos from a digital camera.
Code that performs side effects does not behave as a value. Evaluating such
code twice will perform the side effect twice, which is not the same as just re-using
the result value twice. A function with a side effect may return different values
each time it is called, even when the same arguments are given to the function.
(For example, a digital camera will typically return a different image each time.)
Pure functions are those that contain no code with side effects. A pure function
will always return the same result value when applied to the same arguments. So,
pure functions behave similarly to functions that are used in mathematics.
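
The contrast can be sketched in a few lines of code. The names PurityDemo, impureNext, and pureNext are hypothetical, and the hidden counter stands in for any external state that a side effect might read or modify:

```scala
object PurityDemo {
  private var counter = 0 // Hidden mutable state.

  // Impure: modifies `counter`, so equal arguments may give different results.
  def impureNext(x: Int): Int = { counter += 1; x + counter }

  // Pure: the result depends only on the argument.
  def pureNext(x: Int): Int = x + 1
}
```

Calling impureNext(10) twice gives two different numbers, so the expression impureNext(10) cannot be replaced by its previously computed result. With pureNext(10), such a replacement is always valid.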
This book focuses on pure functions and on mathematical reasoning about them.
Statements such as “the map method cannot implement sum because it can only apply
element-wise transformations to sequences” are correct only if the code is restricted to
pure functions without side effects. Otherwise we would write code like this:
def sum(xs: Seq[Int]): Int = {
  var result: Int = 0 // A mutable variable.
  xs.map { x => result += x } // Side effect: mutation.
  result
}
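
For contrast, here is a sketch of a pure version of the same computation. It assumes the standard foldLeft method and uses the hypothetical object name PureSum; no variable is mutated, and the accumulated value is passed along explicitly:

```scala
object PureSum {
  // The running total is an argument of the folding function, not a mutable variable.
  def sum(xs: Seq[Int]): Int = xs.foldLeft(0)((total, x) => total + x)
}
```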

3 The logic of types. I. Disjunctive types
Disjunctive types describe values that belong to a disjoint set of alternatives.
To see how Scala implements disjunctive types, we need to begin by looking at
“case classes”.

3.1 Scala’s “case classes”


3.1.1 Tuple types with names
It is often helpful to use names for the different parts of a tuple. Suppose that some
program represents the size and the color of socks with the tuple type (Double,
String). What if the same tuple type (Double, String) is used in another place in
the program to mean the amount paid and the payee? A programmer could mix
the two values by mistake, and it would be hard to find out why the program
incorrectly computes, say, the total amount paid:
def totalAmountPaid(ps: Seq[(Double, String)]): Double = ps.map(_._1).sum
val x = (10.5, "white") // Sock size and color.
val y = (25.0, "restaurant") // The amount paid and the payee.

scala> totalAmountPaid(Seq(x, y)) // Nonsense.


res0: Double = 35.5

We would prevent this kind of mistake if we could use two different types, with
names such as MySock and Payment, for the two kinds of data. There are three basic
ways of defining a new named type in Scala: using a type alias, using a class (or
“trait”), and using an opaque type.
Opaque types (hiding a type under a new name) are a feature of Scala 3. An opaque type
can be seen as a case class with a single field but without the cost of memory allocation.
Here, we will focus on type aliases and case classes.
A type alias is an alternative name for an existing (already defined) type. We
could use type aliases in our example to add clarity to the code:
type MySockTuple = (Double, String)
type PaymentTuple = (Double, String)

scala> val s: MySockTuple = (10.5, "white")
s: MySockTuple = (10.5,white)

scala> val p: PaymentTuple = (25.0, "restaurant")


p: PaymentTuple = (25.0,restaurant)
But type aliases do not prevent mix-up errors:
scala> totalAmountPaid(Seq(s, p)) // Nonsense again.
res1: Double = 35.5
Scala’s case classes can be seen as “tuples with names”. A case class is equivalent
to a tuple type that has a name designating the type and a separate name for each
part of the case class. This is how we might define case classes for the example
with socks and payments:
case class MySock(size: Double, color: String)
case class Payment(amount: Double, name: String)

scala> val sock = MySock(10.5, "white")


sock: MySock = MySock(10.5,white)

scala> val paid = Payment(25.0, "restaurant")


paid: Payment = Payment(25.0,restaurant)
This code defines new types named MySock and Payment. Values of type MySock
are written as MySock(10.5, "white"), which is similar to writing the tuple (10.5,
"white") except for adding the name MySock in front of the tuple.
To access the parts of a case class, we use the part names:
scala> sock.size
res2: Double = 10.5

scala> paid.amount
res3: Double = 25.0
The mix-up error is now a type error detected by the compiler:
def totalAmountPaid(ps: Seq[Payment]): Double = ps.map(_.amount).sum

scala> totalAmountPaid(Seq(paid, paid))


res4: Double = 50.0

scala> totalAmountPaid(Seq(sock, paid))


<console>:19: error: type mismatch;
found : MySock
required: Payment
totalAmountPaid(Seq(sock, paid))
^
A function whose argument is of type MySock cannot be applied to an argument
of type Payment. Case classes with different names are different types, even if they
contain the same parts.
It is important that type errors are detected at compile time. Compiled programs
can run only if all types match. This prevents a broad class of run-time
errors that occur due to wrong types.
Just as tuples can have any number of parts, case classes can have any number
of parts, but the part names must be distinct, for example:
case class Person(firstName: String, lastName: String, age: Int)

scala> val noether = Person("Emmy", "Noether", 137)


noether: Person = Person(Emmy,Noether,137)

scala> noether.firstName
res5: String = Emmy

scala> noether.age
res6: Int = 137

This data type carries the same information as a tuple (String, String, Int). How-
ever, the declaration of a case class Person gives the programmer several features
that make working with the tuple’s data more convenient and less error-prone.
Some (or all) part names may be specified when creating a case class value:
scala> val poincaré = Person(firstName = "Henri", lastName = "Poincaré", 165)
poincaré: Person = Person(Henri,Poincaré,165)

It is a type error to use wrong types with a case class:


scala> val p = Person(140, "Einstein", "Albert")
<console>:13: error: type mismatch;
found : Int(140)
required: String
val p = Person(140, "Einstein", "Albert")
^
<console>:13: error: type mismatch;
found : String("Albert")
required: Int
val p = Person(140, "Einstein", "Albert")
^

This error is due to an incorrect order of parts when creating a case class value.
However, parts can be specified in any order when using part names:
scala> val p = Person(age = 137, lastName = "Noether", firstName = "Emmy")
p: Person = Person(Emmy,Noether,137)

A part of a case class can have the type of another case class, creating a type similar
to a nested tuple:
case class BagOfSocks(sock: MySock, count: Int)
val bag = BagOfSocks(MySock(10.5, "white"), 6)

scala> bag.sock.size
res7: Double = 10.5


3.1.2 Case classes with type parameters


Case classes can be defined with type parameters. As an example, consider an
extension of MySock where, in addition to the size and color, an “extended sock”
holds another value. We could define several specialized case classes:
case class MySockInt(size: Double, color: String, value: Int)
case class MySockBoolean(size: Double, color: String, value: Boolean)

but it is better to define a single parameterized case class:


case class MySockX[A](size: Double, color: String, value: A)

This case class can accommodate every type A. We may now create values of
MySockX containing a value of any given type, say Int:
scala> val s = MySockX(10.5, "white", 123)
s: MySockX[Int] = MySockX(10.5,white,123)

Because the value 123 has type Int, the type parameter A in MySockX[A] was auto-
matically set to the type Int. The result has type MySockX[Int]. The programmer
does not need to specify that type explicitly.
Each time we create a value of type MySockX, a specific type will have to be used
instead of the type parameter A. If we want to be explicit, we may write the type
parameter like this:
scala> val s = MySockX[String](10.5, "white", "last pair")
s: MySockX[String] = MySockX(10.5,white,last pair)

We can write parametric code working with MySockX[A], that is, keeping the type
parameter A in the code. For example, a function that checks whether a sock of
type MySockX[A] fits the author’s foot can be written as:
def fits[A](sock: MySockX[A]): Boolean = sock.size >= 10.5 && sock.size <= 11

This function is defined for all types A at once, because its code works in the same
way regardless of what A is. Scala will set the type parameter A automatically
when we apply fits to an argument:
scala> fits(MySockX(10.5, "blue", List(1, 2, 3))) // Using MySockX[List[Int]].
res0: Boolean = true

This code forces the type parameter A to be List[Int], and so we may omit the
type parameter of fits. When types become more complicated, it may be helpful
to write out some type parameters. The compiler can detect a mismatch between
the type parameter A = List[Int] used in the “sock” value and the type parameter
A = Int in the function fits:
scala> fits[Int](MySockX(10.5, "blue", List(1, 2, 3)))
<console>:15: error: type mismatch;
found : List[Int]
required: Int
fits[Int](MySockX(10.5, "blue", List(1, 2, 3)))


Case classes may have several type parameters, and the types of the parts may
use these type parameters. Here is an artificial example of a case class using type
parameters in different ways:
case class Complicated[A, B, C, D](x: (A, A), y: (B, Int) => A, z: C => C)

This case class contains parts of different types that use the type parameters A, B,
C in tuples and functions. The type parameter D is not used at all; this is allowed
(and occasionally useful).
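
To make this concrete, here is a sketch of how a value of such a type could be created. The choice of String, Double, Int, and Unit for the type parameters, as well as the names ComplicatedDemo and c, are assumptions made only for illustration. Because D does not occur in any part, it cannot be inferred from the arguments, so the sketch writes out all type parameters:

```scala
object ComplicatedDemo {
  final case class Complicated[A, B, C, D](x: (A, A), y: (B, Int) => A, z: C => C)

  // A = String, B = Double, C = Int, D = Unit (D is unused by the parts).
  val c: Complicated[String, Double, Int, Unit] =
    Complicated[String, Double, Int, Unit](
      x = ("left", "right"),     // A tuple of type (A, A).
      y = (b, n) => s"$b-$n",    // A function of type (B, Int) => A.
      z = k => k + 1             // A function of type C => C.
    )
}
```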
A type with type parameters, such as MySockX or Complicated, is called a type con-
structor. A type constructor “constructs” a new type, such as MySockX[Int], from a
given type parameter Int. Values of type MySockX cannot be created without setting
the type parameter. So, it is important to distinguish the type constructor, such as
MySockX, from a type that can have values, such as MySockX[Int].

3.1.3 Tuples with one part and with zero parts


Let us compare tuples and case classes more systematically.
Parts of a case class are accessed with a dot syntax, for example sock.color. Parts
of a tuple are accessed with the accessors such as x._1. This syntax is the same as
that for a case class whose parts have names _1, _2, etc. So, it appears that tu-
ple parts do have names in Scala, although those names are always automatically
chosen as _1, _2, etc. Tuple types are also automatically named in Scala as Tuple2,
Tuple3, etc., and they are parameterized, since each part of the tuple may be of any
chosen type. A tuple type expression such as (Int, String) is just a special syntax
for the parameterized type Tuple2[Int, String]. One could define the tuple types
as case classes like this:
case class Tuple2[A, B](_1: A, _2: B)
case class Tuple3[A, B, C](_1: A, _2: B, _3: C) // And so on with Tuple4, Tuple5...

However, these types are already defined in the Scala library.


Proceeding systematically, we ask whether tuple types can have just one part or
even no parts. Indeed, Scala defines Tuple1[A] (which is rarely used in practice) as
a tuple with a single part.
The tuple with zero parts also exists and is called Unit (instead of “Tuple0”). The
syntax for the value of the Unit type is an empty tuple, denoted by () in Scala. It is
clear that the value () is the only possible value of the Unit type. The name “unit”
reminds us of that.
At first sight, the Unit type — an empty tuple that carries no data — may appear
to be useless. It turns out, however, that the Unit type is important in functional
programming. It is used as a type guaranteed to have only a single distinct value.
This book will show many examples of using Unit.

Case classes may have one part or zero parts, similarly to the one-part and zero-
part tuples:
case class B(z: Int) // Tuple with one part.
case class C() // Tuple with no parts.

The following table shows the correspondence between tuples and case classes:

Tuples                               Case classes

(123, "xyz"): Tuple2[Int, String]    case class A(x: Int, y: String)
Tuple1(123): Tuple1[Int]             case class B(z: Int)
(): Unit                             case class C()

Scala has an alternative syntax for empty case classes:


case object C // Similar to `case class C()`.

There are two main differences between case class C() and case object C:

• A case object cannot have type parameters, while we could define a case
class C[X, Y, Z]() with type parameters X, Y, Z, etc.

• A case object is allocated in memory only once, while new values of a case
class C() will be allocated in memory each time C() is evaluated.

Other than that, case class C() and case object C have the same meaning: a named
tuple with zero parts, which we may also view as a “named Unit” type. This book
will not use case objects because case classes are sufficient.
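
The second difference can be observed with reference equality (the eq method). The following sketch uses the hypothetical names EmptyCaseDemo, C, and D; it is meant as an illustration rather than as code from this book:

```scala
object EmptyCaseDemo {
  final case class C() // An empty case class.
  case object D        // A case object.

  val equalValues: Boolean = C() == C()   // Case class values are compared by their parts.
  val sameInstance: Boolean = C() eq C()  // Each C() allocates a new object in memory.
  val sameObject: Boolean = D eq D        // D always refers to the same single object.
}
```

Here equalValues and sameObject are true, while sameInstance is false because the two occurrences of C() create two separate objects.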

3.1.4 Pattern matching for case classes


Scala performs pattern matching in two situations:

• destructuring definition: val 𝑝𝑎𝑡𝑡𝑒𝑟𝑛 = ...

• case expression: case 𝑝𝑎𝑡𝑡𝑒𝑟𝑛 => ...

In both situations, case classes can be used as patterns. The following code is an
example of a destructuring definition with case classes:
case class MySock(size: Double, color: String)
case class BagOfSocks(sock: MySock, count: Int)

def printBag(bag: BagOfSocks): String = {
  val BagOfSocks(MySock(size, color), count) = bag // Destructure the `bag`.
  s"bag has $count $color socks of size $size"
}

val bag = BagOfSocks(MySock(10.5, "white"), 6)

scala> printBag(bag)
res0: String = bag has 6 white socks of size 10.5

A case expression can match a value, extract some pattern variables, and com-
pute a result:
def fits(bag: BagOfSocks): Boolean = bag match {
  case BagOfSocks(MySock(size, _), _) => (size >= 10.5 && size <= 11.0)
}

In the code of this function, the value of bag is matched against the pattern ex-
pression BagOfSocks(MySock(size, _), _). This pattern will define size as a pattern
variable of type Double and assign the corresponding part of the case class to that
variable. For example, the value BagOfSocks(MySock(10.5, "white"), 6) matched
against BagOfSocks(MySock(size, _), _) assigns 10.5 to size. The symbols “_” mean
that we just ignore other parts of the case classes and do not create any pattern
variables for them (because we do not need them in this code).
The syntax for pattern matching for case classes is similar to the syntax for pat-
tern matching for tuples, except for the presence of names of the case classes. For
example, by removing the case class names from the pattern:
case BagOfSocks(MySock(size, _), _) => ...

we obtain a nested tuple pattern:


case ((size, _), _) => ...

that could be used for values of type ((Double, String), Int). So, within pattern
matching expressions, case classes behave as tuple types with added names.
Scala’s “case classes” got their name from their use in case expressions. It is
usually more convenient to use case expressions with case classes than to use de-
structuring definitions.

3.2 Disjunctive types


3.2.1 Motivation and first examples
In many situations, it is useful to have several different shapes of data within the
same type. As a first example, suppose we are looking for real roots of a quadratic
equation 𝑥² + 𝑏𝑥 + 𝑐 = 0. There are three cases: no real roots, one real root, and two
real roots. It is convenient to have a type that represents “real roots of a quadratic
equation”; call it RootsOfQ. Inside that type, we distinguish between the three cases,
but outside it looks like a single type.
Another example is the binary search algorithm that looks for an integer 𝑥 in
a sorted array. Either the algorithm finds the location of 𝑥 in the array, or it
determines that the array does not contain 𝑥. It is convenient if the algorithm could
return a value of a single type (say, SearchResult) that represents either an index at
which 𝑥 is found, or the absence of an index.
More generally, we may have computations that either return a result or generate
an error and fail to produce a result. It is then convenient to return a value of a
single type (say, Result) that represents either a correct result or an error message.
In certain computer games, one has different types of “rooms”, each room hav-
ing certain properties depending on its type. Some rooms are dangerous because
of monsters, other rooms contain useful objects, certain rooms allow you to fin-
ish the game, and so on. We want to represent all the different kinds of rooms
uniformly as a type Room. A value of type Room should automatically describe the
room’s relevant properties in each case.
In all these situations, data comes in several mutually exclusive shapes. This
sort of data can be represented by a single type if that type is able to describe a
mutually exclusive set of cases:
• RootsOfQ must be either the empty tuple (), or a Double value, or a tuple of
type (Double, Double)
• SearchResult must be either an Int value or the empty tuple ()
• Result must be either an Int value or a String error message
We see that the empty tuple, i.e., the Unit type, is natural to use in these situations.
It is also helpful to assign names to each of the cases:
• RootsOfQ is “no roots” with value (), or “one root” with value Double, or “two
roots” with value (Double, Double)
• SearchResult is “index” with an Int value, or “not found” with value ()
• Result is “value” of type Int or “error message” of type String
Scala’s case classes provide exactly what we need here: named tuples with
zero, one, two, or more parts. So, it is natural to use case classes instead of tuples:
• RootsOfQ is a value of the form NoRoots(), or of the form OneRoot(x: Double), or
of the form TwoRoots(x: Double, y: Double)
• SearchResult is a value of the form Index(x: Int) or of the form NotFound()
• Result is a value of the form Value(x: Int) or Error(message: String)
Our three examples are now described as types that allow us to select one case
class out of a given set. It remains to see how Scala defines such types. For in-
stance, the definition of RootsOfQ needs to indicate that the case classes NoRoots,
OneRoot, and TwoRoots are the only possibilities allowed by the type RootsOfQ. The
Scala syntax for that definition looks like this:

sealed trait RootsOfQ


final case class NoRoots() extends RootsOfQ
final case class OneRoot(x: Double) extends RootsOfQ
final case class TwoRoots(x: Double, y: Double) extends RootsOfQ

In the definition of SearchResult, we have two cases:


sealed trait SearchResult
final case class Index(x: Int) extends SearchResult
final case class NotFound() extends SearchResult

The definition of the Result type is parameterized, so that we can describe results
of any type (while error messages are always of type String):
sealed trait Result[A]
final case class Value[A](x: A) extends Result[A]
final case class Error[A](message: String) extends Result[A]

The “sealed trait / final case class” syntax defines a type that represents a
choice of one case class from a fixed set of case classes. This kind of type is called
a disjunctive type (or a co-product type) in this book. The keywords final and
sealed tell the Scala compiler that the given set of case classes within a disjunctive
type is fixed and unchangeable.
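
As a sketch of how a function could return such a type, the code below repeats the definition of SearchResult (to keep the example self-contained) and adds a hypothetical function find. For brevity, it uses a simple linear search via the standard indexOf method instead of a binary search; only the shape of the returned value matters here:

```scala
object SearchDemo {
  sealed trait SearchResult
  final case class Index(x: Int) extends SearchResult
  final case class NotFound() extends SearchResult

  // Linear search, standing in for binary search to keep the sketch short.
  def find(xs: Array[Int], x: Int): SearchResult = {
    val i = xs.indexOf(x) // Returns -1 when `x` is absent.
    if (i >= 0) Index(i) else NotFound()
  }
}
```

The caller always receives a value of the single type SearchResult, whether or not the element was found.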

3.2.2 Examples: Pattern matching for disjunctive types


Our first examples of disjunctive types are RootsOfQ, SearchResult, and Result[A]
defined in the previous section. We will now look at the Scala syntax for working
with disjunctive types.
Consider the disjunctive type RootsOfQ with three parts (the case classes NoRoots,
OneRoot, TwoRoots). The only way of creating a value of type RootsOfQ is to create
a value of one of these case classes. This is done by writing expressions such as
NoRoots(), OneRoot(2.0), or TwoRoots(1.0, -1.0). Scala will accept these expressions
as having the type RootsOfQ:
scala> val x: RootsOfQ = OneRoot(2.0)
x: RootsOfQ = OneRoot(2.0)

How can we use a given value, say, x: RootsOfQ? Disjunctive types fit well with
pattern matching. In Chapter 2, we used pattern matching with syntax such as
{ case (x, y) => ... }. To use pattern matching with disjunctive types, we write
several case patterns because we need to detect several possible cases of the dis-
junctive type:
def print(r: RootsOfQ): String = r match {
  case NoRoots()      => "no real roots"
  case OneRoot(r)     => s"one real root: $r"
  case TwoRoots(x, y) => s"real roots: ($x, $y)"
}

scala> print(x)
res0: String = one real root: 2.0

Each case pattern will introduce its own pattern variables, such as r, x, y in the
code above. Each pattern variable is defined only within the local scope, that is,
within the scope of its case expression. It is impossible to make a mistake where
we, say, refer to the variable r within the code that handles the case of two roots.
If the code only needs to work with a subset of cases, we can match all other
cases with an underscore character (as in case _):
scala> x match {
case OneRoot(r) => s"one real root: $r"
case _ => "have something else"
}
res1: String = one real root: 2.0

The match/case expression represents a choice over possible values of a given type.
Note the similarity with this code:
def f(x: Int): Int = x match {
case 0 => println(s"error: must be nonzero"); -1
case 1 => println(s"error: must be greater than 1"); -1
case _ => x
}

The values 0 and 1 are some possible values of type Int, just as OneRoot(4.0) is
a possible value of type RootsOfQ. When used with disjunctive types, match/case
expressions will usually cover the complete list of possibilities. If the list of cases
is incomplete, the Scala compiler will print a warning:
scala> def g(x: RootsOfQ): String = x match {
case OneRoot(r) => s"one real root: $r"
}
<console>:14: warning: match may not be exhaustive.
It would fail on the following inputs: NoRoots(), TwoRoots(_, _)

This code defines a partial function g that can be applied only to values of the form
OneRoot(...) and will fail (throwing an exception) for other values.
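A minimal sketch of what “fail” means at run time, assuming the RootsOfQ definition from Section 3.2.1 (repeated here so that the snippet is self-contained). The function name describeSingle is hypothetical, and the @unchecked annotation is used only to suppress the compiler warning:

```scala
// Assumed definition of RootsOfQ (Section 3.2.1), repeated for self-containment.
sealed trait RootsOfQ
final case class NoRoots() extends RootsOfQ
final case class OneRoot(x: Double) extends RootsOfQ
final case class TwoRoots(x: Double, y: Double) extends RootsOfQ

// Hypothetical partial function: only the OneRoot case is handled.
def describeSingle(r: RootsOfQ): String = (r: @unchecked) match {
  case OneRoot(x) => s"one real root: $x"
}

val handled: String = describeSingle(OneRoot(2.0))

// Applying the function to an unhandled case throws scala.MatchError.
val unhandledFails: Boolean =
  try { describeSingle(NoRoots()); false }
  catch { case _: MatchError => true }
```

Covering all cases in the match expression removes both the warning and the possibility of this run-time failure.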
Let us look at more examples of using the disjunctive types we just defined.
Example 3.2.2.1 Given a sequence of quadratic equations, compute a sequence
containing their real roots as values of type RootsOfQ.
Solution Define a case class representing a quadratic equation 𝑥² + 𝑏𝑥 + 𝑐 = 0:
case class QEqu(b: Double, c: Double)

The following function computes the real roots of an equation and returns them
as a value of type RootsOfQ:
def solve(quadraticEqu: QEqu): RootsOfQ = {
  val QEqu(b, c) = quadraticEqu // Destructure QEqu.
  val d = b * b / 4 - c
  if (d > 0) {
    val s = math.sqrt(d)
    TwoRoots(- b / 2 - s, - b / 2 + s)
  } else if (d == 0.0) OneRoot(- b / 2)
  else NoRoots()
}
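The expression for d in solve can be checked by completing the square. This short derivation is added for illustration; the symbol d matches the variable d in the code:

```latex
x^2 + bx + c = 0
\;\Longleftrightarrow\;
\left(x + \frac{b}{2}\right)^2 = \frac{b^2}{4} - c \equiv d ,
\qquad
x = -\frac{b}{2} \pm \sqrt{d} \quad (d \ge 0).
```

So there are two real roots when d > 0, a single root −b/2 when d = 0, and no real roots when d < 0.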

Test the solve function:
scala> solve(QEqu(1, 1))
res1: RootsOfQ = NoRoots()

scala> solve(QEqu(1, -1))
res2: RootsOfQ = TwoRoots(-1.618033988749895,0.6180339887498949)

scala> solve(QEqu(6, 9))
res3: RootsOfQ = OneRoot(-3.0)

We can now implement the function findRoots:
def findRoots(equs: Seq[QEqu]): Seq[RootsOfQ] = equs.map(solve)

If the function solve will not be used often, we may want to write it inline as a
nameless function:
def findRoots(equs: Seq[QEqu]): Seq[RootsOfQ] = equs.map { case QEqu(b, c) =>
(b * b / 4 - c) match {
case d if d > 0 =>
val s = math.sqrt(d)
TwoRoots(- b / 2 - s, - b / 2 + s)
case 0.0 => OneRoot(- b / 2)
case _ => NoRoots()
}
}

This code depends on some features of Scala syntax. We can use the function ex-
pression { case QEqu(b, c) => ... } directly as the argument of map, destructuring
QEqu at the same time. The if/else expression is replaced by an “embedded” if
within a case expression, which is easier to read.
Test the final code:
scala> findRoots(Seq(QEqu(1, 1), QEqu(2, 1)))
res4: Seq[RootsOfQ] = List(NoRoots(), OneRoot(-1.0))

Example 3.2.2.2 Given a sequence of values of type RootsOfQ, compute a sequence
containing only the single roots. Example test:
def singleRoots(rs: Seq[RootsOfQ]): Seq[Double] = ???

scala> singleRoots(Seq(TwoRoots(-1, 1), OneRoot(3.0), OneRoot(1.0), NoRoots()))
res5: Seq[Double] = List(3.0, 1.0)

Solution We apply filter and map to the sequence of roots:
def singleRoots(rs: Seq[RootsOfQ]): Seq[Double] =
  rs.filter {
    case OneRoot(x) => true
    case _ => false
  }.map { case OneRoot(x) => x }

In the map operation, we need to cover only the one-root case because the two other
possibilities have been excluded (“filtered out”) by the preceding filter operation.
We can implement the same function by using the standard library’s collect
method that performs the filtering and mapping operation in one step:
def singleRoots(rs: Seq[RootsOfQ]): Seq[Double]
= rs.collect { case OneRoot(x) => x }
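The argument of collect may also contain guards, so that extra filtering conditions are checked in the same step. A sketch under the same assumed RootsOfQ definition; the function names positiveSingleRoots and positiveSingleRoots2 are hypothetical:

```scala
// Assumed definition of RootsOfQ (Section 3.2.1), repeated for self-containment.
sealed trait RootsOfQ
final case class NoRoots() extends RootsOfQ
final case class OneRoot(x: Double) extends RootsOfQ
final case class TwoRoots(x: Double, y: Double) extends RootsOfQ

// Keep only the single roots that are positive: filtering and mapping in one step.
def positiveSingleRoots(rs: Seq[RootsOfQ]): Seq[Double] =
  rs.collect { case OneRoot(x) if x > 0 => x }

// The same computation written with flatMap, returning an empty sequence
// for every rejected case.
def positiveSingleRoots2(rs: Seq[RootsOfQ]): Seq[Double] =
  rs.flatMap {
    case OneRoot(x) if x > 0 => Seq(x)
    case _                   => Seq.empty
  }

val sample: Seq[RootsOfQ] = Seq(TwoRoots(-1, 1), OneRoot(3.0), OneRoot(-1.0), NoRoots())
```

Unlike the filter/map version, neither function contains a pattern match that could fail on an unexpected case.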

Example 3.2.2.3 Implement binary search returning a SearchResult. Modify the
implementation from Example 2.5.1.5(b) to return a NotFound value when needed.
Solution The code from Example 2.5.1.5(b) will return some index even if the
given number is not present in the array:
scala> binSearch(Array(1, 3, 5, 7), goal = 5)
res6: Int = 2

scala> binSearch(Array(1, 3, 5, 7), goal = 4)
res7: Int = 1

In that case, the array’s element at the computed index will not be equal to goal.
We should return NotFound() in that case. We use a match/case expression for the
new logic:
def safeBinSearch(xs: Seq[Int], goal: Int): SearchResult =
binSearch(xs, goal) match {
case n if xs(n) == goal => Index(n)
case _ => NotFound()
}

scala> safeBinSearch(Array(1, 3, 5, 7), 5)
res8: SearchResult = Index(2)

scala> safeBinSearch(Array(1, 3, 5, 7), 4)
res9: SearchResult = NotFound()
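For experimenting with this example in isolation, here is a self-contained sketch. The function binSearch below is an assumed stand-in for the code of Example 2.5.1.5(b), not a copy of it, and the SearchResult definition is repeated from Section 3.2.1. The check for an empty sequence is an added precaution that is not part of the solution above:

```scala
import scala.annotation.tailrec

// Assumed definition of SearchResult (Section 3.2.1).
sealed trait SearchResult
final case class Index(i: Int) extends SearchResult
final case class NotFound() extends SearchResult

// Assumed stand-in for binSearch: returns some index into a sorted sequence,
// even if `goal` is absent.
def binSearch(xs: Seq[Int], goal: Int): Int = {
  @tailrec def go(left: Int, right: Int): Int = // Search among indices left until right.
    if (right - left <= 1) left
    else {
      val mid = (left + right) / 2
      if (goal < xs(mid)) go(left, mid) else go(mid, right)
    }
  go(0, xs.length)
}

def safeBinSearch(xs: Seq[Int], goal: Int): SearchResult =
  if (xs.isEmpty) NotFound() // Avoid evaluating xs(0) on an empty sequence.
  else binSearch(xs, goal) match {
    case n if xs(n) == goal => Index(n)
    case _                  => NotFound()
  }
```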

Example 3.2.2.4 Use the disjunctive type Result[Int] to implement “safe arith-
metic”, where a division by zero or a square root of a negative number gives an
error message. Define arithmetic operations directly for values of type Result[Int].
Abandon further computations on any error.
Solution Begin by implementing the (integer-valued) square root as a func-
tion from Result[Int] to Result[Int]:
def sqrt(r: Result[Int]): Result[Int] = r match {
  case Value(x) if x >= 0 => Value(math.sqrt(x).toInt)
  case Value(x) => Error(s"error: sqrt($x)")
  case Error(m) => Error(m) // Keep the error message.
}

The square root is computed only if we have the Value(x) case, and only if 𝑥 ≥ 0. If
the argument r was already an Error case, we keep the error message and perform
no further computations.
To implement the addition operation, we need a bit more work:
def add(rx: Result[Int], ry: Result[Int]): Result[Int] = (rx, ry) match {
case (Value(x), Value(y)) => Value(x + y)
case (Error(m), _) => Error(m) // Keep the first error message.
case (_, Error(m)) => Error(m) // Keep the second error message.
}

This code illustrates nested patterns that match the tuple (rx, ry) against various
possibilities. When written in this way, the code is clearer than code written with
nested if/else expressions.
Implementing the multiplication operation results in almost the same code:
def mul(rx: Result[Int], ry: Result[Int]): Result[Int] = (rx, ry) match {
case (Value(x), Value(y)) => Value(x * y)
case (Error(m), _) => Error(m)
case (_, Error(m)) => Error(m)
}

To avoid repetition, we may define a general function (map2) that “maps” binary
operations on integers to operations on Result[Int] types:
def map2(rx: Result[Int], ry: Result[Int])(op: (Int, Int) => Int): Result[Int] =
(rx, ry) match {
case (Value(x), Value(y)) => Value(op(x, y))
case (Error(m), _) => Error(m)
case (_, Error(m)) => Error(m)
}

Now we can easily “map” any binary operation on integers to a binary operation
on Result[Int], assuming that the operation itself never generates an error:
def sub(rx: Result[Int], ry: Result[Int]): Result[Int] =
map2(rx, ry) { (x, y) => x - y }

Custom code is still needed for operations that may generate errors:
def div(rx: Result[Int], ry: Result[Int]): Result[Int] = (rx, ry) match {
case (Value(x), Value(y)) if y != 0 => Value(x / y)
case (Value(x), Value(y)) => Error(s"error: $x / $y")
case (Error(m), _) => Error(m)
case (_, Error(m)) => Error(m)
}

We can now test the “safe arithmetic” on simple calculations. Let us see what
happens after an error:
scala> add(Value(1), Value(2))
res10: Result[Int] = Value(3)

scala> div(add(Value(1), Value(2)), Value(0))
res11: Result[Int] = Error(error: 3 / 0)

Let us check that all further computations are abandoned once an error occurs.
Indeed, the following example shows that the error message for 20 + 1/0 never
mentions 20:
scala> add(Value(20), div(Value(1), Value(0)))
res12: Result[Int] = Error(error: 1 / 0)

scala> add(sqrt(Value(-1)), Value(10))
res13: Result[Int] = Error(error: sqrt(-1))
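The function map2 does not depend on the values being integers. As a sketch, it can be given type parameters so that it works with Result[A] for any types. The definition of Result below is assumed to match the one from Section 3.2.1 and is repeated so that the snippet is self-contained:

```scala
// Assumed definition of Result (Section 3.2.1).
sealed trait Result[A]
final case class Value[A](x: A) extends Result[A]
final case class Error[A](message: String) extends Result[A]

// A generalization of map2 to arbitrary types A, B, C.
def map2[A, B, C](ra: Result[A], rb: Result[B])(op: (A, B) => C): Result[C] =
  (ra, rb) match {
    case (Value(a), Value(b)) => Value(op(a, b))
    case (Error(m), _)        => Error(m) // Keep the first error message.
    case (_, Error(m))        => Error(m) // Keep the second error message.
  }

val sum: Result[Int] = map2(Value(1), Value(2))(_ + _)
val failed: Result[Int] = map2(Error[Int]("bad"), Value(2))(_ + _)
val repeated: Result[String] = map2(Value(2), Value("ab"))((n, s) => s * n)
```

The error-propagation logic is written once and reused for any binary operation that cannot itself fail.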

3.2.3 Standard disjunctive types: Option, Either, Try


The Scala library defines the disjunctive types Option, Either, and Try. These types
are used often in Scala programs.
The Option type is a disjunctive type with two cases: the empty tuple and a one-
element tuple. The names of the two case classes are None and Some. If the Option
type were not already defined in the Scala library, we could define it by:
sealed trait Option[+T] // The annotation `+T` will be explained in Chapter 6.
final case object None extends Option[Nothing]
final case class Some[T](t: T) extends Option[T]

This code is similar to the type SearchResult defined in Section 3.2.1, except that
Option has a type parameter instead of a fixed type Int. Another difference is the
use of a case object instead of an empty case class, such as None(). Since Scala’s
case objects cannot have type parameters, the type parameter in the definition of
None must be set to the special type Nothing, which is a type with no values, also
called the void type (not to be confused with Java or C’s void keyword!). The
special type annotation +T makes None usable as a value of type Option[T] for any
type T; see Section 6.1.8 for more details.
An alternative (implemented, e.g., in the scalaz library) is to define the empty
option value as:
final case class None[T]() extends Option[T]

In that implementation, the empty option None[T]() has a type parameter.


The Scala library’s decision to define None without a type parameter means that
None can be reused as a value of type Option[A] for any type A:
scala> val y: Option[Int] = None
y: Option[Int] = None

scala> val z: Option[String] = None
z: Option[String] = None
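The same design can be tried out on a small custom type. This is a sketch with hypothetical names (Opt, Empty, Full, orZero) chosen to avoid clashing with the standard library:

```scala
// A custom option-like type; the annotation +T makes Opt covariant in T.
sealed trait Opt[+T]
case object Empty extends Opt[Nothing]
final case class Full[T](t: T) extends Opt[T]

// The single value Empty is accepted wherever Opt[Int] or Opt[String] is expected.
val a: Opt[Int] = Empty
val b: Opt[String] = Empty

def orZero(x: Opt[Int]): Int = x match {
  case Empty   => 0
  case Full(n) => n
}
```

Without the + in Opt[+T], the two val definitions would not compile, because Opt[Nothing] would not be accepted as Opt[Int] or Opt[String].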


Typically, Option is used in situations where a value may be either present or
missing, especially when a missing value is not an error. The missing-value case is
represented by None, while Some(x) represents a value x that is present.
Example 3.2.3.1 Information about “subscribers” must include a name and an
email address, but a telephone number is optional. To represent this information,
we define a case class like this:
case class Subscriber(name: String, email: String, phone: Option[Long])

What if we represent the missing telephone number by a special value such as
-1 and use the simpler type Long instead of Option[Long]? The disadvantage is
that we would need to remember to check for the special value -1 in all functions
that take the telephone number as an argument. Looking at a function such as
sendSMS(phone: Long) at a different place in the code, a programmer might forget
that the telephone number is actually optional. In contrast, the type signature
sendSMS(phone: Option[Long]) unambiguously indicates that the telephone number
might be missing and helps the programmer to remember to handle both cases.
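As an illustration, here is a sketch of a function that must decide how to contact a subscriber. The function contactMethod and the sample data are hypothetical; the point is that pattern matching on Option[Long] makes it difficult to overlook the missing-number case:

```scala
final case class Subscriber(name: String, email: String, phone: Option[Long])

// Hypothetical function: chooses how to contact a subscriber.
def contactMethod(s: Subscriber): String = s.phone match {
  case Some(number) => s"SMS to $number"
  case None         => s"email to ${s.email}"
}

val withPhone = Subscriber("Alice", "alice@example.com", Some(18004151212L))
val withoutPhone = Subscriber("Bob", "bob@example.com", None)
```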
Pattern-matching code involving Option can handle the two cases like this:
def getDigits(phone: Option[Long]): Option[Seq[Long]] = phone match {
case None => None // Have no digits, so return `None`.
case Some(number) => Some(digitsOf(number))
} // The function `digitsOf` was defined in Section 2.3.

At the two sides of “case None => None”, the value None has different types, namely
Option[Long] and Option[Seq[Long]]. Since these types are declared in the type sig-
nature of the function getDigits, the Scala compiler is able to figure out the types
of all expressions in the match/case construction. So, pattern-matching code can be
written without explicit type annotations such as (None: Option[Long]).
If we now need to compute the number of digits, we can write:
def numberOfDigits(phone: Option[Long]): Option[Long] = getDigits(phone) match {
case None => None
case Some(digits) => Some(digits.length)
}

These examples perform a computation when an Option value is non-empty, and
leave it empty otherwise. This code pattern is used often. To avoid repeating the
code, we can implement this code pattern as a function that takes the computation
as an argument f:
def doComputation(x: Option[Long], f: Long => Long): Option[Long] = x match {
case None => None
case Some(i) => Some(f(i))
}

It is then natural to generalize this function to arbitrary types using type param-
eters instead of a fixed type Long. The resulting function is usually called fmap in
functional programming libraries:
def fmap[A, B](f: A => B): Option[A] => Option[B] = {
  case None => None
  case Some(a) => Some(f(a))
}

scala> fmap(digitsOf)(Some(4096))
res0: Option[Seq[Long]] = Some(List(4, 0, 9, 6))

scala> fmap(digitsOf)(None)
res1: Option[Seq[Long]] = None

We say that the fmap operation lifts a given function f of type A => B to a new
function of type Option[A] => Option[B].
It is important to keep in mind that the code case Some(a) => Some(f(a)) changes
the type of the option value. On the left side of the arrow, the type is Option[A],
while on the right side it is Option[B]. The Scala compiler knows this from the
given type signature of fmap, so an explicit type parameter, which we could write
as Some[B](f(a)), is not needed.
The Scala library implements an equivalent function as a method of the Option
class, with the syntax x.map(f) rather than fmap(f)(x). We can concisely rewrite the
previous code using these methods:
def getDigits(phone: Option[Long]): Option[Seq[Long]] = phone.map(digitsOf)
def numberOfDigits(phone: Option[Long]): Option[Long] =
phone.map(digitsOf).map(_.length)

We see that the map operation for the Option type is analogous to the map operation
for sequences.
The similarity between Option[A] and Seq[A] is clearer if we view Option[A] as
a special kind of “sequence” whose length is restricted to be either 0 or 1. So,
Option[A] can have all the operations of Seq[A] except operations such as concat
that may grow the sequence beyond length 1. The standard operations defined
on Option include map, filter, zip, forall, exists, flatMap, and foldLeft.
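A short sketch of the “sequence of length 0 or 1” view, using only standard Option methods. The value names are chosen for this illustration:

```scala
val present: Option[Int] = Some(3)
val missing: Option[Int] = None

val doubled  = present.map(_ * 2)    // Some(6)
val filtered = present.filter(_ > 5) // None: the predicate does not hold.
val anyBig   = present.exists(_ > 2) // true
val allBig   = missing.forall(_ > 2) // true: holds vacuously for an empty Option.
val asList   = present.toList        // A one-element list.
val noList   = missing.toList        // An empty list.
```

Each result is what the same operation would give on a sequence with one element (for present) or with no elements (for missing).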
Example 3.2.3.2 Given a phone number as Option[Long], extract the country code
if it is present. The result must be again of type Option[Long]. Assume that the
country code is the digits in front of a 10-digit phone number; for the phone num-
ber 18004151212, the country code is 1.
Solution If the phone number is a positive integer 𝑛, we may compute the
country code simply as n / 10000000000L. However, if the result of that division is
zero, we should return an empty Option (i.e., the value None) rather than 0:
def countryCode(phone: Option[Long]): Option[Long] = phone match {
case None => None
case Some(n) =>
val countryCode = n / 10000000000L
if (countryCode != 0L) Some(countryCode) else None
}


Notice that we have reimplemented the code pattern similar to map, namely “if None
then return None, else return Some(...)”. So, we may try to rewrite the code as:
def countryCode(phone: Option[Long]): Option[Long] = phone.map { n =>
val countryCode = n / 10000000000L
if (countryCode != 0L) Some(countryCode) else None
} // Type error: the result is Option[Option[Long]], not Option[Long].
This code does not compile: we are returning an Option[Long] within a function
lifted via map, so the resulting type is Option[Option[Long]]. Use flatten to convert
Option[Option[Long]] to the required type Option[Long]:
def countryCode(phone: Option[Long]): Option[Long] = phone.map { n =>
val countryCode = n / 10000000000L
if (countryCode != 0L) Some(countryCode) else None
}.flatten // Types are correct now.
Since the flatten follows a map, rewrite the code using flatMap:
def countryCode(phone: Option[Long]): Option[Long] = phone.flatMap { n =>
val countryCode = n / 10000000000L
if (countryCode != 0L) Some(countryCode) else None
}
Another way of implementing this example is to notice the code pattern “if con-
dition does not hold, return None, otherwise keep the value”. For an Option type,
this is equivalent to the filter operation (recall that filter returns an empty se-
quence if the predicate never holds). The code is:
def countryCode(phone: Option[Long]): Option[Long] =
  phone.map(_ / 10000000000L).filter(_ != 0L)

scala> countryCode(Some(18004151212L))
res0: Option[Long] = Some(1)

scala> countryCode(Some(8004151212L))
res1: Option[Long] = None

Example 3.2.3.3 Add a new requirement to Example 3.2.3.2: if the country code
is not present, return the default country code 1.
Solution This is an often used code pattern: “if empty, substitute a default
value”. The Scala library has the method getOrElse for this purpose:
scala> Some(100).getOrElse(1)
res2: Int = 100

scala> None.getOrElse(1)
res3: Int = 1
So, we can implement the new requirement as:
scala> countryCode(Some(8004151212L)).getOrElse(1L)
res4: Long = 1
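The two steps can be packaged into a single function. This is a sketch; the name countryCodeOrDefault and its default parameter are hypothetical additions:

```scala
def countryCode(phone: Option[Long]): Option[Long] =
  phone.map(_ / 10000000000L).filter(_ != 0L)

// Hypothetical helper: returns the default code when no country code is present.
def countryCodeOrDefault(phone: Option[Long], default: Long = 1L): Long =
  countryCode(phone).getOrElse(default)
```

Note that this function also returns the default value when the phone number itself is missing (None), since countryCode(None) is None.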


Using Option with collections Several Scala library methods return an Option as
a result. Examples are find, headOption, and lift for sequences, as well as get for
dictionaries.
The find method returns the first element satisfying a predicate:
scala> (1 to 10).find(_ > 5)
res0: Option[Int] = Some(6)

scala> (1 to 10).find(_ > 100) // No element is > 100.
res1: Option[Int] = None

The lift method returns the element of a sequence at a given index:
scala> (10 to 100).lift(0)
res2: Option[Int] = Some(10)

scala> (10 to 100).lift(1000) // No element at index 1000.
res3: Option[Int] = None

The headOption method returns the first element of a sequence, unless the se-
quence is empty. This is equivalent to lift(0):
scala> Seq(1, 2, 3).headOption
res4: Option[Int] = Some(1)

scala> Seq(1, 2, 3).filter(_ > 10).headOption
res5: Option[Int] = None

Applying .find(p) computes the same result as .filter(p).headOption, but .find(p)
may be faster because it stops at the first element satisfying the predicate.
The get method for a dictionary checks whether the given key is present in the
dictionary. If so, get returns the value wrapped in Some(). Otherwise, it returns
None:
scala> Map(10 -> "a", 20 -> "b").get(10)
res6: Option[String] = Some(a)

scala> Map(10 -> "a", 20 -> "b").get(30)
res7: Option[String] = None

The get method is a safe by-key access to dictionaries, unlike the direct access that
may fail with an exception:
scala> Map(10 -> "a", 20 -> "b")(10)
res8: String = a

scala> Map(10 -> "a", 20 -> "b")(30)
java.util.NoSuchElementException: key not found: 30

Similarly, lift is a safe by-index access to collections, unlike the direct access that
may fail with an exception:
scala> Seq(10, 20, 30)(0)
res9: Int = 10

scala> Seq(10, 20, 30)(5)
java.lang.IndexOutOfBoundsException: 5
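Since both get and lift return Option values, they can be combined with flatMap into a lookup that never throws. A sketch with hypothetical data (phoneBook) and a hypothetical helper (nthPhone):

```scala
// Hypothetical data: each name is mapped to a sequence of phone numbers.
val phoneBook: Map[String, Seq[Long]] = Map(
  "alice" -> Seq(18004151212L, 18004151213L),
  "bob"   -> Seq.empty
)

// Returns the n-th phone number for a name, or None if either the name
// or the index is absent.
def nthPhone(name: String, n: Int): Option[Long] =
  phoneBook.get(name).flatMap(_.lift(n))
```

The result is None both for an unknown name and for an index that is out of range, so the calling code needs to handle only one “missing” case.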

The Either type The standard disjunctive type Either[A, B] has two type param-
eters and is often used for computations that report errors. By convention, the first
type (A) is the type of error, and the second type (B) is the type of the (non-error)
result. The names of the two cases are Left and Right. A possible definition of
Either may be written as:
sealed trait Either[A, B]
final case class Left[A, B](value: A) extends Either[A, B]
final case class Right[A, B](value: B) extends Either[A, B]

By convention, a value Left(x) represents an error, and a value Right(y) represents
a valid result.
As an example, the following function substitutes a default value and logs the
error information:
def logError(x: Either[String, Int], default: Int): Int = x match {
case Left(error) => println(s"Got error: $error"); default
case Right(res) => res
}

To test:
scala> logError(Right(123), -1)
res1: Int = 123

scala> logError(Left("bad result"), -1)
Got error: bad result
res2: Int = -1

Why use Either instead of Option for computations that may fail? When a miss-
ing result is an error, we will usually need to know the reason why the result is
unavailable. The Either type may provide detailed information about such errors,
which Option cannot do. An Option type is mostly used in cases where the absence
of a result is not an error.
The Either type generalizes the type Result defined in Section 3.2.1 to an ar-
bitrary error type instead of String. We have seen its usage in Example 3.2.2.4,
where the code pattern was “if value is present, do a computation, otherwise keep
the error”. This code pattern is implemented by the map method of Either:
1 scala> Right(1).map(_ + 1)
2 res0: Either[Nothing, Int] = Right(2)
3
4 scala> Left[String, Int]("error").map(_ + 1)
5 res1: Either[String, Int] = Left("error")

The type Nothing was filled in by the Scala compiler because we did not specify
the first type parameter of Right in line 1.


The methods flatMap, fold, and getOrElse are also defined for Either, with the
same convention that a Left value represents an error.
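As a sketch of how these methods combine, the “safe arithmetic” of Example 3.2.2.4 can be rewritten with Either[String, Int] in place of Result[Int]. The function names safeDiv, safeSqrt, and compute are hypothetical:

```scala
def safeDiv(x: Int, y: Int): Either[String, Int] =
  if (y != 0) Right(x / y) else Left(s"error: $x / $y")

def safeSqrt(x: Int): Either[String, Int] =
  if (x >= 0) Right(math.sqrt(x.toDouble).toInt) else Left(s"error: sqrt($x)")

// Computes sqrt(x / y) + 1, abandoning further steps after the first error.
def compute(x: Int, y: Int): Either[String, Int] =
  safeDiv(x, y).flatMap(safeSqrt).map(_ + 1)
```

Here flatMap is used for a step that may itself fail, while map is used for a step that cannot fail.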
Exceptions and the Try type When computations fail for any reason, Scala gen-
erates an exception instead of returning a value. An exception means that the
evaluation of some expression was stopped without returning a result.
As an example, exceptions are generated when the available memory is too
small to store the resulting data (as we saw in Section 2.6.3), or if a stack over-
flow occurs during the computation (see Section 2.2.3). Exceptions may also oc-
cur due to programmer’s errors: when a pattern matching operation fails, when a
requested key does not exist in a dictionary, or when the head operation is applied
to an empty list.
Motivated by these examples, we may distinguish “planned” and “unplanned”
exceptions.
A planned exception is generated by programmer’s code via the throw syntax:
scala> throw new Exception("This is a test... this is only a test.")
java.lang.Exception: This is a test... this is only a test.
The Scala library contains a throw operation in various places, such as in the code
for applying the head method to an empty sequence, as well as in other situations
where exceptions are generated due to programmer’s errors. These exceptions are
generated deliberately and in well-defined situations. Although these exceptions
indicate errors, these errors are anticipated in advance and so may be handled by
the programmer.
For example, many Java libraries will generate exceptions when function ar-
guments have unexpected values, when a network operation takes too long or a
network connection is unexpectedly broken, when a file is not found or cannot
be read due to access permissions, and in other situations. All those exceptions
are “planned” because they are generated explicitly by library code such as throw
new FileNotFoundException(...). The programmer’s code is expected to catch those
exceptions, to handle the errors, and to continue running the program.
An unplanned exception is generated by the Java runtime system when critical
errors occur, such as a stack overflow or an out-of-memory error. It is rare that a
programmer writes val y = f(x) while expecting that an out-of-memory exception
will likely occur at that point.¹ An unplanned exception indicates a serious
problem with memory or another critically important resource, such as the operating
system’s threads or file handles. Such problems usually cannot be fixed and will
prevent the program from running any further. It is reasonable that the program
should abruptly stop (or “crash”, as programmers say) after such an error.

¹ Just once in the author’s experience, an out-of-memory exception had to be anticipated in an
Android app as something that regularly happens during normal usage of the app.

The use of planned exceptions assumes that the programmer will write code
to handle each exception. This assumption makes it significantly harder to write
programs correctly: it is hard to figure out and to keep in mind all the possible
exceptions that a given library function may “throw” in its code (and in the code of all
other libraries being used). Instead of using exceptions for indicating errors, Scala
programmers can write functions that return a disjunctive type, such as Either,
describing both a correct result and an error condition. Users of these functions
will have to do pattern matching on the result values. This helps programmers to
avoid forgetting to handle an error situation that the code is likely to encounter.
Nevertheless, programmers will often need to use Java or Scala libraries that
throw exceptions. To help write code for these situations, the Scala library provides
a disjunctive type called Try. The type Try[A] is equivalent to Either[Throwable, A],
where Throwable is the general type of all exceptions (i.e., values to which a throw
operation can be applied). The two parts of the disjunctive type Try[A] are called
Failure[A] and Success[A] (instead of Left[Throwable, A] and Right[Throwable, A] in the
Either type). The expression Try(expr) will catch all “planned” exceptions
thrown while the expression expr is evaluated.²
If the evaluation of expr succeeds and returns a value x: A, the value of Try(expr)
will be Success(x). Otherwise it will be Failure(t), where t: Throwable is a value
containing details about the exception. Here is an example of using Try:
import scala.util.{Try, Success, Failure}

scala> val p = Try("xyz".toInt)
p: Try[Int] = Failure(java.lang.NumberFormatException: For input string: "xyz")

scala> val q = Try("0002".toInt)
q: Try[Int] = Success(2)

The code Try("xyz".toInt) does not generate any exceptions and will not crash the
program. Any computation that may throw a planned exception can be enclosed in
a Try(), and the exception will be caught and encapsulated within the disjunctive
type as a Failure(...) value.
The methods map, filter, flatMap, and fold are defined for the Try class, with
conventions similar to those of the Either type. One additional feature of Try is to
catch exceptions generated by the function arguments of map, filter, flatMap, and
other standard methods:
scala> val y = q.map(y => throw new Exception("ouch"))
y: Try[Nothing] = Failure(java.lang.Exception: ouch)

scala> val z = q.filter(y => throw new Exception("huh"))
z: Try[Int] = Failure(java.lang.Exception: huh)

In this example, the values y and z were computed successfully even though
exceptions were thrown while the function arguments of map and filter were
evaluated. Further code can use pattern matching on the values y and z and
examine those exceptions. However, it is important that these exceptions were
caught and the program did not crash, meaning that further code is able to run.

² But Try() will not catch “fatal” exceptions, mainly subclasses of java.lang.VirtualMachineError
(such as OutOfMemoryError and StackOverflowError) and of java.lang.LinkageError. Those exceptions
are intended to represent unplanned, serious error situations.
While the standard types Try and Either will cover many use cases, program-
mers can also define custom disjunctive types in order to represent all the antic-
ipated failures or errors in the business logic of a particular application. Repre-
senting all errors in the types helps assure that the program will not crash because
of an exception that we forgot to handle or did not even know about.
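A sketch of typical code that consumes Try values: pattern matching on the two cases, chaining computations, and converting to Either. The function names parseInt and describeTry are hypothetical; the methods flatMap, map, and toEither belong to the standard Try class:

```scala
import scala.util.{Try, Success, Failure}

// Hypothetical helper: parsing that reports failure as a value.
def parseInt(s: String): Try[Int] = Try(s.trim.toInt)

def describeTry(t: Try[Int]): String = t match {
  case Success(n) => s"parsed $n"
  case Failure(e) => s"failed: ${e.getClass.getSimpleName}"
}

// Chain two computations; the first failure (if any) is kept.
val total: Try[Int] = parseInt("40").flatMap(a => parseInt("2").map(b => a + b))

// Convert to Either when other code expects that type.
val asEither: Either[Throwable, Int] = parseInt("xyz").toEither
```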

3.3 Lists and trees as recursive disjunctive types


Consider this code defining a disjunctive type NInt:
sealed trait NInt
final case class N1(x: Int) extends NInt
final case class N2(n: NInt) extends NInt

The type NInt has two disjunctive parts: N1 and N2. But the case class N2 contains a
value of type NInt as if the type NInt were already defined.
A type whose definition uses that same type is called a recursive type. The type
NInt is an example of a recursive disjunctive type.
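To get a feeling for this type, here is a sketch that creates some values of type NInt and processes them with recursive functions. The function names depth and innermost are hypothetical:

```scala
sealed trait NInt
final case class N1(x: Int) extends NInt
final case class N2(n: NInt) extends NInt

val a: NInt = N1(5)
val b: NInt = N2(N2(N1(5))) // An N1 value wrapped twice in N2.

// Counts how many times N2 wraps the innermost N1.
def depth(n: NInt): Int = n match {
  case N1(_)     => 0
  case N2(inner) => 1 + depth(inner)
}

// Extracts the integer stored in the innermost N1.
def innermost(n: NInt): Int = n match {
  case N1(x)     => x
  case N2(inner) => innermost(inner)
}
```

Every value of type NInt must eventually contain an N1, because N1 is the only case class that can be created without already having an NInt value.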
We might imagine defining a disjunctive type X whose parts recursively refer to
the same type X (and/or to each other) in complicated ways. What kind of data
would be represented by such a type X, and in what situations would X be useful?
For instance, the simple definition:
final case class Bad(x: Bad)

is useless since we cannot create a value of type Bad unless we already have a value
x of type Bad. This is an example of an infinite loop in type recursion. We will
never be able to create values of type Bad, which means that the type Bad is “void”
(has no values, like the special Scala type Nothing).
Section 8.5.1 will derive precise conditions under which a recursive type is not
void. For now, we will look at the recursive disjunctive types that are used most
often: lists and trees.

3.3.1 The recursive type List


A list of values of type A is either empty, or has one value of type A, or two values
of type A, etc. We can visualize the type List[A] as a disjunctive type defined by:
sealed trait List[A]
final case class List0[A]() extends List[A]
final case class List1[A](x: A) extends List[A]
final case class List2[A](x1: A, x2: A) extends List[A]
??? // Need an infinitely long definition.


However, this definition is not practical: we cannot define a separate case class
for each possible length. Instead, we define the type List[A] via mathematical
induction on the length of the list:
• Base case: empty list, case class List0[A]().
• Inductive step: given a list of a previously defined length, say Listₙ₋₁, define
a new case class Listₙ describing a list with one more element of type A. So,
we could define Listₙ = (A, Listₙ₋₁).
Let us try to write this inductive definition as code:
sealed trait ListI[A] // Inductive definition of a list.
final case class List0[A]() extends ListI[A]
final case class List1[A](x: A, next: List0[A]) extends ListI[A]
final case class List2[A](x: A, next: List1[A]) extends ListI[A]
??? // Still need an infinitely long definition.
To avoid writing an infinitely long type definition, we use a trick. Note that the
definitions of List1, List2, etc., have a similar form (while List0 is not similar). To
replace the definitions List1, List2, etc., by a single definition ListN, we write the
type ListI[A] inside the case class ListN:
sealed trait ListI[A] // Inductive definition of a list.
final case class List0[A]() extends ListI[A]
final case class ListN[A](x: A, next: ListI[A]) extends ListI[A]
The type definition has become recursive. For this trick to work, it is important to
use ListI[A] and not ListN[A] inside the case class ListN[A]. Otherwise, we would
get an infinite loop in type recursion (similarly to case class Bad shown before).
Since we obtained the definition of type ListI[A] via a trick, let us verify that the
code actually defines the disjunctive type we wanted.
To create a value of type ListI[A], we must use one of the two available case
classes. Using the first case class, we may create a value List0(). Since this empty
case class does not contain any values of type A, it effectively represents an empty
list (the base case of the induction). Using the second case class, we may create a
value ListN(x, next) where x is of type A and next is an already constructed value
of type ListI[A]. This represents the inductive step because the case class ListN is a
named tuple containing A and ListI[A]. Now, the same consideration recursively
applies to constructing the value next, which must be either an empty list or a
pair containing a value of type A and another list. The assumption that the value
next: ListI[A] is already constructed is equivalent to the inductive assumption
that we already have a list of a previously defined length. So, we have verified
that ListI[A] implements the inductive definition shown above.
Examples of values of type ListI are the empty list List0(), a one-element list
ListN(x, List0()), and a two-element list ListN(x, ListN(y, List0())).
To illustrate writing pattern-matching code using this type, let us implement
the method headOption:
def headOption[A]: ListI[A] => Option[A] = {
  case List0() => None
  case ListN(x, next) => Some(x)
}
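Functions that need to look at the entire list follow the inductive definition: one case for the empty list and one recursive case. A sketch with two hypothetical functions, length and toScalaList, that are not tail-recursive and so are suitable only for short lists:

```scala
sealed trait ListI[A] // Inductive definition of a list.
final case class List0[A]() extends ListI[A]
final case class ListN[A](x: A, next: ListI[A]) extends ListI[A]

def length[A](xs: ListI[A]): Int = xs match {
  case List0()        => 0                // Base case.
  case ListN(_, next) => 1 + length(next) // Inductive step.
}

// Converts to the standard library's List type.
def toScalaList[A](xs: ListI[A]): List[A] = xs match {
  case List0()        => Nil
  case ListN(x, next) => x :: toScalaList(next)
}

val two: ListI[Int] = ListN(1, ListN(2, List0()))
```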

The Scala library defines the type List[A] in a different way:
sealed trait List[+A] // Covariant in A, so that Nil can be used as a List[A] for any A.
final case object Nil extends List[Nothing]
final case class ::[A](head: A, tail: List[A]) extends List[A]

Because “operator-like” case class names, such as ::, support the infix syntax, we
may write expressions such as head :: tail instead of ::(head, tail). This syntax
can be also used in pattern matching on List values, with code that looks like this:
def headOption[A]: List[A] => Option[A] = {
case Nil => None
case head :: tail => Some(head)
}

Examples of values created using Scala’s standard List type are the empty list Nil,
a one-element list x :: Nil, and a two-element list x :: y :: Nil. The same syntax
x :: y :: Nil is used both for creating values of type List and for pattern matching
on such values.
The Scala library also defines the helper function List(), so that List() is the
same as Nil and List(1, 2, 3) is the same as 1 :: 2 :: 3 :: Nil. Lists are easier to
read in the syntax List(1, 2, 3). Pattern matching may also use that syntax:
val x: List[Int] = List(1, 2, 3)

x match {
case List(a) => ...
case List(a, b, c) => ...
case _ => ...
}

3.3.2 Tail recursion with List


Because the List type is defined by induction, it is straightforward to implement
iterative computations with the List type using recursion.
A first example is the map function. We use reasoning by induction in order to
figure out the implementation of map. The required type signature is:
def map[A, B](xs: List[A])(f: A => B): List[B] = ???

The base case is an empty list, and we return again an empty list:
def map[A, B](xs: List[A])(f: A => B): List[B] = xs match {
case Nil => Nil
...


In the inductive step, we have a pair (head, tail) in the case class ::, with head: A
and tail: List[A]. The pair can be pattern-matched with the syntax head :: tail.
The map function should apply the argument f to the head value, which will give
the first element of the resulting list. The remaining elements are computed by the
induction assumption, i.e., by a recursive call to map:
def map[A, B](xs: List[A])(f: A => B): List[B] = xs match {
  case Nil          => Nil
  case head :: tail => f(head) :: map(tail)(f)
}

While this implementation is straightforward and concise, it is not tail-recursive.
For large enough lists, the nested recursive calls will cause a stack overflow.
To resolve this issue, let us implement a tail-recursive foldLeft, because many
methods can be expressed via foldLeft. The required type signature is:
def foldLeft[A, R](xs: List[A])(init: R)(f: (R, A) => R): R = ???

Reasoning by induction, we start with the base case xs == Nil, where the only
possibility is to return the value init:
def foldLeft[A, R](xs: List[A])(init: R)(f: (R, A) => R): R = xs match {
case Nil => init
...

The inductive step for foldLeft says that, given the values head: A and tail: List[A],
we need to apply the updater function to the previous accumulator value. That
value is init. So, we apply foldLeft recursively to the tail of the list once we have
the updated accumulator value:
@tailrec def foldLeft[A, R](xs: List[A])(init: R)(f: (R, A) => R): R =
xs match {
case Nil => init
case head :: tail =>
val newInit = f(init, head) // Update the accumulator.
foldLeft(tail)(newInit)(f) // Recursive call to `foldLeft`.
}

This implementation is tail-recursive because the recursive call to foldLeft is the
last expression returned in a case branch.
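To illustrate why foldLeft is a convenient building block, here is a minimal sketch that expresses two other aggregations through it. The helper names sumInts and count are hypothetical. The definition of foldLeft repeats the one above in a more compact form.

```scala
import scala.annotation.tailrec

// Same logic as the `foldLeft` shown above, with the updated accumulator inlined.
@tailrec def foldLeft[A, R](xs: List[A])(init: R)(f: (R, A) => R): R = xs match {
  case Nil          => init
  case head :: tail => foldLeft(tail)(f(init, head))(f)
}

// Hypothetical helper: add up all integers in a list.
def sumInts(xs: List[Int]): Int = foldLeft(xs)(0)(_ + _)

// Hypothetical helper: count the elements, ignoring their values.
def count[A](xs: List[A]): Int = foldLeft(xs)(0)((n, _) => n + 1)

// A list long enough that a non-tail-recursive traversal could overflow the stack.
val big: List[Int] = List.fill(1000000)(1)
```

Because the recursive call is in tail position, sumInts(big) and count(big) run in constant stack space.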
Another example is a function for reversing a list. The Scala library defines the
reverse method for this task, but we will show an implementation using foldLeft.
The updater function prepends an element to a previous list:
def reverse[A](xs: List[A]): List[A] =
xs.foldLeft(Nil: List[A])((prev, x) => x :: prev)

scala> reverse(List(1, 2, 3))


res0: List[Int] = List(3, 2, 1)

Without the explicit type annotation (Nil: List[A]), the Scala compiler will decide
that Nil has type List[Nothing], and the types will not match later in the code. In
Scala, the initial value for foldLeft often needs an explicit type annotation.
The reverse function can be used to obtain a tail-recursive implementation of
map for List. The idea is to first use foldLeft to accumulate transformed elements:
scala> Seq(1, 2, 3).foldLeft(Nil:List[Int])((prev, x) => (x * x) :: prev)
res0: List[Int] = List(9, 4, 1)

To obtain the correct result (List(1, 4, 9)), it remains to apply reverse:


def map[A, B](xs: List[A])(f: A => B): List[B] =
xs.foldLeft(Nil: List[B])((prev, x) => f(x) :: prev).reverse

scala> map(List(1, 2, 3))(x => x * x)


res2: List[Int] = List(1, 4, 9)

This achieves stack safety at the cost of traversing the list twice. (This code is
shown only as an example. The Scala library implements List’s map using mutable
variables to improve performance.)
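The same accumulate-then-reverse technique applies to other list operations. As a minimal sketch, here is a stack-safe filter written in the same style. This is only an illustration and not how the Scala library implements filter.

```scala
// Hypothetical re-implementation of `filter` via `foldLeft` and `reverse`.
def filter[A](xs: List[A])(p: A => Boolean): List[A] =
  xs.foldLeft(Nil: List[A]) { (prev, x) =>
    if (p(x)) x :: prev else prev   // Keep `x` only when the predicate holds.
  }.reverse                         // Restore the original order of elements.
```

As with map, the accumulator collects elements in reverse order, so a final reverse is needed.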
Example 3.3.2.1 A definition of the non-empty list is similar to List except that
the empty-list case is replaced by a 1-element case:
sealed trait NEL[A]
final case class Last[A](head: A) extends NEL[A]
final case class More[A](head: A, tail: NEL[A]) extends NEL[A]

Values of a non-empty list look like this:


scala> val xs: NEL[Int] = More(1, More(2, Last(3))) // [1, 2, 3]
xs: NEL[Int] = More(1,More(2,Last(3)))

scala> val ys: NEL[String] = Last("abc") // One element, ["abc"].


ys: NEL[String] = Last(abc)

To create non-empty lists more easily, we implement a conversion function toNEL
from an ordinary list. To guarantee that a non-empty list can be created, we give
toNEL two arguments:
def toNEL[A](x: A, rest: List[A]): NEL[A] = rest match {
case Nil => Last(x)
case y :: tail => More(x, toNEL(y, tail))
} // Not tail-recursive: `toNEL()` is used inside `More(...)`.

To test:
scala> toNEL(1, List()) // Result = [1].
res0: NEL[Int] = Last(1)

scala> toNEL(1, List(2, 3)) // Result = [1, 2, 3].


res1: NEL[Int] = More(1,More(2,Last(3)))

The head method is safe for non-empty lists, unlike head for an ordinary List:
def head[A]: NEL[A] => A = {
  case Last(x)    => x
  case More(x, _) => x
}

We can also implement a tail-recursive foldLeft function for non-empty lists:


@tailrec def foldLeft[A, R](n: NEL[A])(init: R)(f: (R, A) => R): R = n match {
case Last(x) => f(init, x)
case More(x, tail) => foldLeft(tail)(f(init, x))(f)
}

scala> foldLeft(More(1, More(2, Last(3))))(0)(_ + _)


res2: Int = 6
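As a minimal sketch of how this foldLeft can be used, here are two helper functions for non-empty lists. The names lengthNEL and maxNEL are hypothetical. The type NEL and the function foldLeft are repeated from the text so that the code is self-contained.

```scala
import scala.annotation.tailrec

sealed trait NEL[A]
final case class Last[A](head: A) extends NEL[A]
final case class More[A](head: A, tail: NEL[A]) extends NEL[A]

@tailrec def foldLeft[A, R](n: NEL[A])(init: R)(f: (R, A) => R): R = n match {
  case Last(x)       => f(init, x)
  case More(x, tail) => foldLeft(tail)(f(init, x))(f)
}

// Hypothetical helper: the number of elements (always at least 1).
def lengthNEL[A](n: NEL[A]): Int = foldLeft(n)(0)((count, _) => count + 1)

// Hypothetical helper: the maximum is always defined because the list is non-empty.
def maxNEL(n: NEL[Int]): Int = n match {
  case Last(x)       => x
  case More(x, tail) => foldLeft(tail)(x)((m, y) => math.max(m, y))
}

val sample: NEL[Int] = More(1, More(5, Last(3)))
```

Unlike max on an ordinary List, maxNEL needs neither an Option result nor a default value.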

Example 3.3.2.2 Use foldLeft to implement a reverse function for the type NEL.
The required type signature and a sample test:
def reverse[A]: NEL[A] => NEL[A] = ???

scala> reverse(toNEL(10, List(20, 30))) // The result must be [30, 20, 10].
res3: NEL[Int] = More(30,More(20,Last(10)))

Solution We will use foldLeft to build up the reversed list as the accumulator
value. It remains to choose the initial value of the accumulator and the updater
function. We have already seen the code for reversing the ordinary list via the
foldLeft method:
def reverse[A](xs: List[A]): List[A] =
  xs.foldLeft(Nil: List[A])((prev, x) => x :: prev)

However, we cannot reuse the same code for non-empty lists by writing More(x,
prev) instead of x :: prev, because the foldLeft operation works with non-empty
lists differently. Since a non-empty list cannot be empty, the initial accumulator
value must already contain at least one element, while the updater function is
still applied to every element of the list. If we use the head of the list as the
initial value, the code works incorrectly:
def reverse[A](xs: NEL[A]): NEL[A] =
foldLeft(xs)(Last(head(xs)): NEL[A])((prev,x) => More(x, prev))

scala> reverse(toNEL(10, List(20, 30))) // The result is [30, 20, 10, 10].
res4: NEL[Int] = More(30,More(20,More(10,Last(10))))

The last element, 10, should not have been repeated. It was repeated because the
initial accumulator value already contained the head element 10 of the original
list. However, we cannot set the initial accumulator value to an empty list, since
a value of type NEL[A] must be non-empty. It seems that we need to handle the
case of a one-element list separately. So, we begin by matching on the argument
of reverse, and apply foldLeft only when the list is longer than 1 element:
def reverse[A]: NEL[A] => NEL[A] = {
case Last(x) => Last(x) // `reverse` is a no-op.
case More(x, tail) => // Use foldLeft on `tail`.
foldLeft(tail)(Last(x): NEL[A])((prev, x) => More(x, prev))
}


scala> reverse(toNEL(10, List(20, 30))) // The result is [30, 20, 10].


res5: NEL[Int] = More(30,More(20,Last(10)))

Exercise 3.3.2.3 Implement a function toList that converts a non-empty list into
an ordinary Scala List. The required type signature and a sample test:
def toList[A](nel: NEL[A]): List[A] = ???

scala> toList(More(1, More(2, Last(3)))) // This is [1, 2, 3].


res6: List[Int] = List(1, 2, 3)

Exercise 3.3.2.4 Implement a map function for the type NEL. Type signature and a
sample test:
def mapNEL[A,B](xs: NEL[A])(f: A => B): NEL[B] = ???

scala> mapNEL[Int, Int](toNEL(10, List(20, 30)))(_ + 5)


res7: NEL[Int] = More(15,More(25,Last(35)))

Exercise 3.3.2.5 Implement a function that concatenates two non-empty lists:


def concatNEL[A](xs: NEL[A], ys: NEL[A]): NEL[A] = ???

scala> concatNEL(More(1, More(2, Last(3))), More(4, Last(5)))


res8: NEL[Int] = More(1,More(2,More(3,More(4,Last(5)))))

Exercise 3.3.2.6 Implement flatten for non-empty lists:


def flattenNEL[A](xs: NEL[NEL[A]]): NEL[A] = ???

scala> flattenNEL(More(More(1, Last(2)), More(More(3, Last(4)), Last(More(5, Last(6))))))
res9: NEL[Int] = More(1,More(2,More(3,More(4,More(5,Last(6))))))

3.3.3 Binary trees


We will consider four kinds of trees defined as recursive disjunctive types: binary
trees, rose trees, perfect-shaped trees, and abstract syntax trees.
[Figure: two binary trees. The first has leaves 𝑎1, 𝑎2, 𝑎3; the second has leaves 𝑎1, 𝑎2, 𝑎3, 𝑎4, 𝑎5.]
These diagrams (where 𝑎1, ..., 𝑎5 are some values of type A) are examples of
binary trees with leaves of type A.
An inductive definition says that a binary tree is either a leaf with a value of
type A or a branch containing two previously defined binary trees. Translating this
definition into code, we get:
sealed trait Tree2[A]
final case class Leaf[A](a: A) extends Tree2[A]
final case class Branch[A](x: Tree2[A], y: Tree2[A]) extends Tree2[A]


Here are some examples of code expressions and the corresponding trees:
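Branch(Branch(Leaf("a1"), Leaf("a2")), Leaf("a3"))
[Figure: a tree whose left branch holds the leaves 𝑎1, 𝑎2 and whose right branch is the leaf 𝑎3.]

Branch(Branch(Leaf("a1"), Branch(Leaf("a2"), Leaf("a3"))),
  Branch(Leaf("a4"), Leaf("a5")))
[Figure: a tree whose left branch holds the leaf 𝑎1 and a sub-branch with leaves 𝑎2, 𝑎3, and whose right branch holds the leaves 𝑎4, 𝑎5.]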
Recursive functions on trees are translated into concise code. For instance, the
function foldLeft for trees of type Tree2 is implemented as:
def foldLeft[A, R](t: Tree2[A])(init: R)(f: (R, A) => R): R = t match {
  case Leaf(a) => f(init, a)
  case Branch(t1, t2) =>
    val r1 = foldLeft(t1)(init)(f) // Fold the left branch and obtain the result `r1`.
    foldLeft(t2)(r1)(f)            // Using `r1` as the `init` value, fold the right branch.
}

Note that this function cannot be made tail-recursive using the accumulator trick,
because foldLeft needs to call itself twice in the Branch case.
To verify that foldLeft works as intended, let us run a simple test:
val t: Tree2[String] = Branch(Branch(Leaf("a1"), Leaf("a2")), Leaf("a3"))

scala> foldLeft(t)("")(_ + " " + _)


res0: String = " a1 a2 a3"
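Other recursive functions on Tree2 follow the same pattern of one case per part of the disjunctive type. The following minimal sketch shows a map function for binary trees and a function that collects all leaf values via foldLeft. The names mapTree and leaves are hypothetical. The type and foldLeft are repeated from the text so that the code is self-contained.

```scala
sealed trait Tree2[A]
final case class Leaf[A](a: A) extends Tree2[A]
final case class Branch[A](x: Tree2[A], y: Tree2[A]) extends Tree2[A]

def foldLeft[A, R](t: Tree2[A])(init: R)(f: (R, A) => R): R = t match {
  case Leaf(a)        => f(init, a)
  case Branch(t1, t2) => foldLeft(t2)(foldLeft(t1)(init)(f))(f)
}

// Hypothetical helper: apply `f` to every leaf, keeping the shape of the tree.
def mapTree[A, B](t: Tree2[A])(f: A => B): Tree2[B] = t match {
  case Leaf(a)      => Leaf(f(a))
  case Branch(l, r) => Branch(mapTree(l)(f), mapTree(r)(f))
}

// Hypothetical helper: collect the leaf values from left to right.
def leaves[A](t: Tree2[A]): List[A] =
  foldLeft(t)(Nil: List[A])((acc, a) => a :: acc).reverse

val t: Tree2[String] = Branch(Branch(Leaf("a1"), Leaf("a2")), Leaf("a3"))
```

Like foldLeft, the function mapTree calls itself twice in the Branch case and so is not tail-recursive.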

3.3.4 Rose trees


A rose tree is similar to the binary tree except the branches contain a non-empty
list of trees. Because of that, a rose tree can fork into arbitrarily many branches
at each node, rather than always into two branches as the binary tree does.
[Figure: two examples of rose trees. The first has leaves 𝑎1, ..., 𝑎5; the second has leaves 𝑎1, ..., 𝑎6.]
A possible definition of a data type for the rose tree is:
sealed trait TreeN[A]
final case class Leaf[A](a: A) extends TreeN[A]
final case class Branch[A](ts: NEL[TreeN[A]]) extends TreeN[A]

Since we used a non-empty list NEL, a Branch() value is guaranteed to have at least
one branch. If we used an ordinary List instead, we could (by mistake) create a
tree with empty branches.
Exercise 3.3.4.1 Define the function foldLeft for a rose tree of type TreeN[A] shown
above. Assume that a foldLeft function is already available for the type NEL. The
required type signature and a test:

def foldLeft[A, R](t: TreeN[A])(init: R)(f: (R, A) => R): R = ???

scala> foldLeft(Branch(More(Leaf(1), More(Leaf(2), Last(Leaf(3))))))(0)(_ + _)


res0: Int = 6

3.3.5 Perfect-shaped trees


Binary trees and rose trees may choose to branch or not to branch at any given
node, resulting in structures that may have different branching depths at different
nodes.
[Figure: a binary tree with leaves 𝑎1, 𝑎2, 𝑎3, 𝑎4 located at different depths.]
A perfect-shaped tree always branches in the same way at every node until a
chosen total depth. As an example, consider a tree where all nodes at depth 0 and
1 always branch into two, while all nodes at depth 2 are leaves (do not branch).
[Figure: a binary tree of total depth 2 with four leaves 𝑎1, 𝑎2, 𝑎3, 𝑎4.]
The branching number is fixed for a given type of a perfect-shaped tree; in this
example, the branching number is 2, so it is a perfect-shaped binary tree.
How can we define a data type representing a perfect-shaped binary tree? We
need a type that is either a single value, or a pair of values, or a pair of pairs, etc.
Begin with the non-recursive (but, of course, impractical) definition:
sealed trait PTree[A]
final case class Leaf[A](x: A) extends PTree[A]
final case class Branch1[A](xs: (A, A)) extends PTree[A]
final case class Branch2[A](xs: ((A, A),(A, A))) extends PTree[A]
??? // Need an infinitely long definition.

The case Branch1 describes a perfect-shaped tree with total depth 1, the case Branch2
has total depth 2, and so on. The non-trivial step is to notice that each case class
Branch𝑛 uses the previous case class’s data structure with the type parameter set to
(A, A) instead of A. So, we can rewrite the above definition as:
sealed trait PTree[A]
final case class Leaf[A](x: A) extends PTree[A]
final case class Branch1[A](xs: Leaf[(A, A)]) extends PTree[A]
final case class Branch2[A](xs: Branch1[(A, A)]) extends PTree[A]
??? // Need an infinitely long definition.

We can now apply the type recursion trick: replace the type Branch𝑛−1 [(A, A)] in
the definition of Branch𝑛 by the recursively used type PTree[(A, A)]. Now we can
define a perfect-shaped binary tree:
sealed trait PTree[A]
final case class Leaf[A](x: A) extends PTree[A]
final case class Branch[A](xs: PTree[(A, A)]) extends PTree[A]


Since we used some tricks to figure out the definition of PTree[A], let us verify
that this definition actually describes the recursive disjunctive type we wanted.
The only way to create a structure of type PTree[A] is to create a Leaf[A] or a
Branch[A]. A value of type Leaf[A] is itself a perfect-shaped tree. It remains to
consider the case of Branch[A]. Creating a Branch[A] requires a previously created
PTree with values of type (A, A) instead of A. By the inductive assumption, the
previously created PTree[A] would have the correct shape. Now, it is clear that if
we replace the type parameter A by the pair (A, A), a perfect-shaped tree with
leaves 𝑎1, 𝑎2, 𝑎3, 𝑎4 is replaced by a tree with leaves 𝑎1′, 𝑎1″, 𝑎2′, 𝑎2″, 𝑎3′, 𝑎3″, 𝑎4′, 𝑎4″
(each leaf value 𝑎𝑖 became a pair 𝑎𝑖′, 𝑎𝑖″). That tree is again perfect-shaped but is
one level deeper. We see that PTree[A] is a correct definition of a perfect-shaped
binary tree.
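To see the definition at work, here is a minimal sketch that creates perfect-shaped trees of depths 0, 1, and 2. The value names depth0, depth1, and depth2 are chosen only for this illustration.

```scala
sealed trait PTree[A]
final case class Leaf[A](x: A) extends PTree[A]
final case class Branch[A](xs: PTree[(A, A)]) extends PTree[A]

// Depth 0: a single value.
val depth0: PTree[Int] = Leaf(1)

// Depth 1: one `Branch` wrapping a leaf that holds a pair.
val depth1: PTree[Int] = Branch(Leaf((1, 2)))

// Depth 2: two `Branch` wrappers around a leaf that holds a pair of pairs.
val depth2: PTree[Int] = Branch(Branch(Leaf(((1, 2), (3, 4)))))

// An unbalanced shape is rejected by the type checker, so this line is commented out:
// val bad: PTree[Int] = Branch(Leaf((1, (2, 3))))
```

Each Branch wrapper doubles the number of values stored in the innermost Leaf, so the number of Branch wrappers equals the depth of the tree.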
Example 3.3.5.1 Define a (non-tail-recursive) map function for a perfect-shaped
binary tree. The required type signature and a test:
def map[A, B](t: PTree[A])(f: A => B): PTree[B] = ???

scala> map(Branch(Branch(Leaf(((1, 2), (3, 4))))))(_ * 10)


res0: PTree[Int] = Branch(Branch(Leaf(((10,20),(30,40)))))

Solution Begin by pattern matching on the tree:


def map[A, B](t: PTree[A])(f: A => B): PTree[B] = t match {
case Leaf(x) => ???
case Branch(xs) => ???
}

In the base case, we have no choice but to return Leaf(f(x)):


def map[A, B](t: PTree[A])(f: A => B): PTree[B] = t match {
case Leaf(x) => Leaf(f(x))
case Branch(xs) => ???
}

In the inductive step, we are given a previous tree value xs: PTree[(A, A)]. It is
clear that we need to apply map recursively to xs. Let us try:
def map[A, B](t: PTree[A])(f: A => B): PTree[B] = t match {
case Leaf(x) => Leaf(f(x))
case Branch(xs) => Branch(map(xs)(f)) // Type error!
}

Here, the function f has an incorrect type for the call map(xs)(f). Since xs has type PTree[(A,
A)], the recursive call map(xs)(f) requires f to be of type ((A, A)) => (B, B) instead
of A => B. So, we need to provide a function of the correct type instead of f. A
function of type ((A, A)) => (B, B) will be obtained out of f: A => B if we apply
f to each part of the tuple (A, A). The code for that function is { case (x, y) =>
(f(x), f(y)) }. Therefore, we can implement map as:


def map[A, B](t: PTree[A])(f: A => B): PTree[B] = t match {
  case Leaf(x)    => Leaf(f(x))
  case Branch(xs) => Branch(map(xs){ case (x, y) => (f(x), f(y)) })
}

This code is not tail-recursive since it calls map inside an expression.
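The same idea (a recursive call that works with the type parameter (A, A) instead of A) applies to other functions on PTree. As a minimal sketch, the hypothetical function leftmost returns the first leaf value of a perfect-shaped tree. The type PTree and the function map are repeated from the text so that the code is self-contained.

```scala
sealed trait PTree[A]
final case class Leaf[A](x: A) extends PTree[A]
final case class Branch[A](xs: PTree[(A, A)]) extends PTree[A]

def map[A, B](t: PTree[A])(f: A => B): PTree[B] = t match {
  case Leaf(x)    => Leaf(f(x))
  case Branch(xs) => Branch(map(xs) { case (x, y) => (f(x), f(y)) })
}

// Hypothetical helper: the recursive call returns a pair of type (A, A),
// and we take the first part of that pair. Not tail-recursive.
def leftmost[A](t: PTree[A]): A = t match {
  case Leaf(x)    => x
  case Branch(xs) => leftmost(xs)._1
}

val t2: PTree[Int] = Branch(Branch(Leaf(((1, 2), (3, 4)))))
```

In both map and leftmost, the recursive call is made with a type parameter that differs from the type parameter of the enclosing call. This is why the functions need explicitly written type signatures.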


Exercise 3.3.5.2 Using tail recursion, compute the depth of a perfect-shaped
binary tree of type PTree. (A PTree of depth 𝑛 has 2^𝑛 leaf values.) The required type
signature and a test:
@tailrec def depth[A](t: PTree[A]): Int = ???

scala> depth(Branch(Branch(Leaf((("a","b"),("c","d"))))))
res2: Int = 2

Exercise 3.3.5.3 Define a tail-recursive function foldLeft for a perfect-shaped
binary tree. The required type signature and a test:
@tailrec def foldLeft[A, R](t: PTree[A])(init: R)(f: (R, A) => R): R = ???

scala> foldLeft(Branch(Branch(Leaf(((1, 2), (3, 4))))))(0)(_ + _)


res0: Int = 10

scala> foldLeft(Branch(Branch(Leaf((("a", "b"), ("c", "d"))))))("")(_ + _)


res1: String = abcd

3.3.6 Abstract syntax trees


Expressions in formal languages are represented by abstract syntax trees. An ab-
stract syntax tree (or AST for short) is defined as either a leaf of one of the avail-
able leaf types, or a branch of one of the available branch types. All the available
leaf and branch types must be specified as part of the definition of an AST. In other
words, one must specify the data carried by leaves and branches, as well as the
branching numbers.
To illustrate how ASTs are used, let us rewrite Example 3.2.2.4 via an AST.
We view Example 3.2.2.4 as a small sub-language that deals with “safe integers”
and supports the “safe arithmetic” operations Sqrt, Add, Mul, and Div. Example
calculations in this sub-language are √16 ∗ (1 + 2) = 12; 20 + 1/0 = error; and
10 + √(−1) = error.
We can implement this sub-language in two stages. The first stage will create
a data structure (an AST) that represents an unevaluated expression in the sub-
language. The second stage will evaluate that AST to obtain either a number or
an error message.
A straightforward way of defining the data structure for the AST is to use a dis-
junctive type whose parts describe all the possible operations of the sub-language.
We will need one case class for each of Sqrt, Add, Mul, and Div. An additional oper-
ation, Num, will lift ordinary integers into “safe integers”. So, we define the disjunc-
tive type (Arith) for the “safe arithmetic” sub-language as:
sealed trait Arith
final case class Num(x: Int) extends Arith
final case class Sqrt(x: Arith) extends Arith
final case class Add(x: Arith, y: Arith) extends Arith
final case class Mul(x: Arith, y: Arith) extends Arith
final case class Div(x: Arith, y: Arith) extends Arith

A value of type Arith is either a Num(x) for some integer x, or an Add(x, y) where x
and y are previously defined Arith expressions, or another operation.
This type definition is similar to the binary tree type if we rename Leaf to Num
and Branch to Add:
sealed trait Tree
final case class Leaf(x: Int) extends Tree
final case class Branch(x: Tree, y: Tree) extends Tree

However, the Arith type is a tree that supports four different types of branches,
some with branching number 1 and others with branching number 2.
This example illustrates the structure of an AST: it is a tree of a specific shape,
with leaves and branches chosen from a specified set of allowed possibilities. In
the “safe arithmetic” example, we have a single allowed type of leaf (Num) and four
allowed types of branches (Sqrt, Add, Mul, and Div).
This completes the first stage of implementing the sub-language. We may now
use the disjunctive type Arith to create expressions in the sub-language. For
example, √16 ∗ (1 + 2) is represented by:
scala> val x: Arith = Mul(Sqrt(Num(16)), Add(Num(1), Num(2)))
x: Arith = Mul(Sqrt(Num(16)),Add(Num(1),Num(2)))

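We can visualize x as an abstract syntax tree:
[Figure: the root node Mul has two children, Sqrt and Add. The node Sqrt has the child Num 16. The node Add has the children Num 1 and Num 2.]

The expressions 20 + 1/0 and 10 ∗ √(−1) are represented by: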
scala> val y: Arith = Add(Num(20), Div(Num(1), Num(0)))
y: Arith = Add(Num(20),Div(Num(1),Num(0)))

scala> val z: Arith = Mul(Num(10), Sqrt(Num(-1)))


z: Arith = Mul(Num(10),Sqrt(Num(-1)))

As we see, the expressions x, y, and z remain unevaluated; each of them is a data
structure that encodes a tree of operations of the sub-language. These operations
will be evaluated at the second stage of implementing the sub-language.
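Since a value of type Arith is just data, it can be processed in ways other than evaluation. As a minimal sketch, the hypothetical function show (not part of the example above) converts an expression tree into a string:

```scala
sealed trait Arith
final case class Num(x: Int) extends Arith
final case class Sqrt(x: Arith) extends Arith
final case class Add(x: Arith, y: Arith) extends Arith
final case class Mul(x: Arith, y: Arith) extends Arith
final case class Div(x: Arith, y: Arith) extends Arith

// Hypothetical pretty-printer: one case for each part of the disjunctive type.
def show(e: Arith): String = e match {
  case Num(x)    => x.toString
  case Sqrt(x)   => s"sqrt(${show(x)})"
  case Add(x, y) => s"(${show(x)} + ${show(y)})"
  case Mul(x, y) => s"(${show(x)} * ${show(y)})"
  case Div(x, y) => s"(${show(x)} / ${show(y)})"
}

val x: Arith = Mul(Sqrt(Num(16)), Add(Num(1), Num(2)))
```

For the value x defined above, show(x) returns the string "(sqrt(16) * (1 + 2))". No arithmetic is performed and no errors can occur, because show only walks the tree.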
To evaluate expressions in the “safe arithmetic”, we can implement a function
with type signature run: Arith => Either[String, Int]. That function plays the role
of an interpreter or “runner” for programs written in the sub-language. The run-
ner will walk through the expression tree and execute all the operations, taking
care of possible errors.
To implement run, we need to define the required arithmetic operations on the
type Either[String, Int]. For instance, we need to be able to add or multiply val-
ues of that type. Instead of custom code from Example 3.2.2.4, we can use the
standard map and flatMap methods defined on Either. For example, addition and
multiplication of two “safe integers” is implemented as:
def add(x: Either[String, Int], y: Either[String, Int]):
Either[String, Int] = x.flatMap { r1 => y.map(r2 => r1 + r2) }
def mul(x: Either[String, Int], y: Either[String, Int]):
Either[String, Int] = x.flatMap { r1 => y.map(r2 => r1 * r2) }

The code for the “safe division” is:


def div(x: Either[String, Int], y: Either[String, Int]): Either[String, Int] =
x.flatMap { r1 => y.flatMap { r2 =>
if (r2 == 0) Left(s"error: $r1 / $r2") else Right(r1 / r2)
}
}
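The nested flatMap and map calls used in add and mul can also be written with Scala's for / yield syntax, which the compiler translates into the same method calls. The following minimal sketch assumes Scala 2.12 or later, where Either has the methods map and flatMap. The type alias SafeInt and the name addFor are hypothetical.

```scala
type SafeInt = Either[String, Int] // Hypothetical alias, introduced only for brevity.

// The version from the text, using explicit `flatMap` and `map`.
def add(x: SafeInt, y: SafeInt): SafeInt =
  x.flatMap { r1 => y.map(r2 => r1 + r2) }

// The same computation written with `for` / `yield`.
def addFor(x: SafeInt, y: SafeInt): SafeInt = for {
  r1 <- x
  r2 <- y
} yield r1 + r2
```

In both versions, the first error encountered (going from left to right) is the one that is returned.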

With this code, we can implement the runner as a recursive function:


def run: Arith => Either[String, Int] = {
case Num(x) => Right(x)
case Sqrt(x) => run(x).flatMap { r =>
if (r < 0) Left(s"error: sqrt($r)") else Right(math.sqrt(r).toInt)
}
case Add(x, y) => add(run(x), run(y))
case Mul(x, y) => mul(run(x), run(y))
case Div(x, y) => div(run(x), run(y))
}

Test it with the values x, y, z defined previously:


scala> run(x)
res0: Either[String, Int] = Right(12)

scala> run(y)
res1: Either[String, Int] = Left("error: 1 / 0")

scala> run(z)
res2: Either[String, Int] = Left("error: sqrt(-1)")

3.4 Summary
What problems can we solve now?

• Represent values from a disjoint domain by a custom disjunctive type.


• Use disjunctive types instead of exceptions to indicate failures.
• Use standard disjunctive types Option, Try, Either and their methods.
• Define recursive disjunctive types (such as lists and trees) and implement
recursive functions that work with them.
The following examples and exercises illustrate these tasks.

3.4.1 Examples
Example 3.4.1.1 Define a disjunctive type DayOfWeek representing the seven days
of a week.
Solution Since each day carries no information except the day’s name, we can
use empty case classes and represent the day’s name via the name of the case class:
sealed trait DayOfWeek
final case class Sunday() extends DayOfWeek
final case class Monday() extends DayOfWeek
final case class Tuesday() extends DayOfWeek
final case class Wednesday() extends DayOfWeek
final case class Thursday() extends DayOfWeek
final case class Friday() extends DayOfWeek
final case class Saturday() extends DayOfWeek
This data type is analogous to an enumeration type in C or C++:
typedef enum { Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday }
DayOfWeek;

Example 3.4.1.2 Modify DayOfWeek so that on Fridays the values additionally rep-
resent names of restaurants and amounts paid, and on Saturdays a wake-up time.
Solution For the days where additional information is given, we use non-
empty case classes:
sealed trait DayOfWeekX
final case class Sunday() extends DayOfWeekX
final case class Monday() extends DayOfWeekX
final case class Tuesday() extends DayOfWeekX
final case class Wednesday() extends DayOfWeekX
final case class Thursday() extends DayOfWeekX
final case class Friday(restaurant: String, amount: Int) extends DayOfWeekX
final case class Saturday(wakeUpAt: java.time.LocalTime) extends DayOfWeekX
This data type is no longer equivalent to an enumeration type.
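To illustrate how the extra data can be used, here is a minimal sketch of pattern matching on DayOfWeekX. The function names amountPaid and describe are hypothetical.

```scala
sealed trait DayOfWeekX
final case class Sunday() extends DayOfWeekX
final case class Monday() extends DayOfWeekX
final case class Tuesday() extends DayOfWeekX
final case class Wednesday() extends DayOfWeekX
final case class Thursday() extends DayOfWeekX
final case class Friday(restaurant: String, amount: Int) extends DayOfWeekX
final case class Saturday(wakeUpAt: java.time.LocalTime) extends DayOfWeekX

// Hypothetical helper: only Fridays carry an amount; all other days count as zero.
def amountPaid(day: DayOfWeekX): Int = day match {
  case Friday(_, amount) => amount
  case _                 => 0
}

// Hypothetical helper: use the extra data where it is available.
def describe(day: DayOfWeekX): String = day match {
  case Friday(restaurant, amount) => s"Friday: paid $amount at $restaurant"
  case Saturday(wakeUpAt)         => s"Saturday: wake up at $wakeUpAt"
  case other                      => other.toString
}
```

The pattern variables restaurant, amount, and wakeUpAt are available only in the case branches where the corresponding data exists.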
Example 3.4.1.3 Define a disjunctive type that describes the real roots of the
equation 𝑎𝑥² + 𝑏𝑥 + 𝑐 = 0, where 𝑎, 𝑏, 𝑐 are arbitrary real numbers. Write a
function that returns a value of that type and solves a given equation of the form
𝑎𝑥² + 𝑏𝑥 + 𝑐 = 0.
Solution Begin by solving the equation and enumerating all the possible cases.
It may happen that 𝑎 = 𝑏 = 𝑐 = 0, and then all 𝑥 are roots. If 𝑎 = 𝑏 = 0 but 𝑐 ≠ 0,
the equation is 𝑐 = 0, which has no roots. If 𝑎 = 0 but 𝑏 ≠ 0, the equation becomes
𝑏𝑥 + 𝑐 = 0, having a single root. If 𝑎 ≠ 0 and 𝑏² > 4𝑎𝑐, we have two distinct real
roots. If 𝑎 ≠ 0 and 𝑏² = 4𝑎𝑐, we have one real root. If 𝑏² < 4𝑎𝑐, we have no real
roots. The resulting type definition can be written as:
sealed trait RootsOfQ2
final case class AllRoots() extends RootsOfQ2
final case class ConstNoRoots() extends RootsOfQ2
final case class Linear(x: Double) extends RootsOfQ2
final case class NoRealRoots() extends RootsOfQ2
final case class OneRootQ(x: Double) extends RootsOfQ2
final case class TwoRootsQ(x: Double, y: Double) extends RootsOfQ2

This disjunctive type contains six parts: three parts are empty tuples and two
parts are single-element tuples; but this is not a useless redundancy. We would
lose information if we reused Linear for the two cases (𝑎 = 0, 𝑏 ≠ 0) and (𝑎 ≠ 0,
𝑏² = 4𝑎𝑐), or if we reused a single empty case class for the different cases that
have no roots (ConstNoRoots and NoRealRoots).
To solve a given equation, we need to decide which part of the disjunctive type
to return. The code is:
def solveQ2(a: Double, b: Double, c: Double) : RootsOfQ2 = (a, b, c) match {
  case (0.0, 0.0, 0.0) => AllRoots()
  case (0.0, 0.0, _)   => ConstNoRoots()   // The equation is `c = 0` with nonzero `c`.
  case (0.0, _, _)     => Linear(-c / b)
  case _ =>            // We match here only if `a` is nonzero.
    val d = b * b - 4 * a * c
    val p = - b / (2.0 * a)
    if (d < 0.0) NoRealRoots()
    else if (d == 0.0) OneRootQ(p)
    else {
      val s = math.sqrt(d) / (2.0 * a)
      TwoRootsQ(p - s, p + s)
    }
}

Let us test this code with various input parameters:


scala> solveQ2(1, 1, 1)
res0: RootsOfQ2 = NoRealRoots()

scala> solveQ2(1, 0, -4)


res1: RootsOfQ2 = TwoRootsQ(-2.0, 2.0)
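The degenerate cases can be checked in the same way. The following minimal sketch repeats the type and the function so that it is self-contained, and assumes that the case 𝑎 = 𝑏 = 0, 𝑐 ≠ 0 is meant to return ConstNoRoots(), as the type definition suggests. The value name checks is hypothetical.

```scala
sealed trait RootsOfQ2
final case class AllRoots() extends RootsOfQ2
final case class ConstNoRoots() extends RootsOfQ2
final case class Linear(x: Double) extends RootsOfQ2
final case class NoRealRoots() extends RootsOfQ2
final case class OneRootQ(x: Double) extends RootsOfQ2
final case class TwoRootsQ(x: Double, y: Double) extends RootsOfQ2

def solveQ2(a: Double, b: Double, c: Double): RootsOfQ2 = (a, b, c) match {
  case (0.0, 0.0, 0.0) => AllRoots()
  case (0.0, 0.0, _)   => ConstNoRoots() // Assumption: the constant case with nonzero `c`.
  case (0.0, _, _)     => Linear(-c / b)
  case _ =>
    val d = b * b - 4 * a * c
    val p = -b / (2.0 * a)
    if (d < 0.0) NoRealRoots()
    else if (d == 0.0) OneRootQ(p)
    else {
      val s = math.sqrt(d) / (2.0 * a)
      TwoRootsQ(p - s, p + s)
    }
}

// One input for each part of the disjunctive type, paired with the expected result.
val checks: Seq[(RootsOfQ2, RootsOfQ2)] = Seq(
  solveQ2(0, 0, 0)  -> AllRoots(),
  solveQ2(0, 0, 5)  -> ConstNoRoots(),
  solveQ2(0, 2, -4) -> Linear(2.0),
  solveQ2(1, 1, 1)  -> NoRealRoots(),
  solveQ2(1, -2, 1) -> OneRootQ(1.0),
  solveQ2(1, 0, -4) -> TwoRootsQ(-2.0, 2.0)
)
```

The inputs are chosen so that all intermediate results are exactly representable as Double values. In general, the comparison d == 0.0 is sensitive to rounding errors.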

Example 3.4.1.4 Define a function rootAverage that computes the average value of
all real roots of a general quadratic equation, where the set of roots is represented
by the type RootsOfQ2 defined in Example 3.4.1.3. The required type signature is:
val rootAverage: RootsOfQ2 => Option[Double] = ???


Return None if the average is undefined (no roots or all values are roots).
Solution The average is defined only in cases Linear, OneRootQ, and TwoRootsQ.
In all other cases, we must return None. We implement this via pattern matching:
val rootAverage: RootsOfQ2 => Option[Double] = roots => roots match {
case Linear(x) => Some(x)
case OneRootQ(x) => Some(x)
case TwoRootsQ(x, y) => Some((x + y) * 0.5)
case _ => None
}

We do not need to enumerate all other cases since the underscore (_) matches
everything that the previous cases did not match.
In Scala, the often-used code pattern x => x match { case ... => ... } can be
shortened to just the nameless function { case ... => ... }. Then the code is:
val rootAverage: RootsOfQ2 => Option[Double] = {
case Linear(x) => Some(x)
case OneRootQ(x) => Some(x)
case TwoRootsQ(x, y) => Some((x + y) * 0.5)
case _ => None
}

Test it:
scala> Seq(NoRealRoots(), OneRootQ(1.0), TwoRootsQ(1.0, 2.0),
AllRoots()).map(rootAverage)
res0: Seq[Option[Double]] = List(None, Some(1.0), Some(1.5), None)

Example 3.4.1.5 Generate 100 quadratic equations 𝑥² + 𝑏𝑥 + 𝑐 = 0 with random
coefficients 𝑏, 𝑐 (uniformly distributed between −1 and 1) and compute the mean
of the largest real roots from all these equations.
Solution Use the type QEqu and the solve function from Example 3.2.2.1. A
sequence of equations with random coefficients is created by applying the method
Seq.fill:
def random(): Double = scala.util.Random.nextDouble() * 2 - 1
val coeffs: Seq[QEqu] = Seq.fill(100)(QEqu(random(), random()))

Now we can use the solve function to compute all roots:


val solutions: Seq[RootsOfQ] = coeffs.map(solve)

For each set of roots, compute the largest root:


scala> val largest: Seq[Option[Double]] = solutions.map {
case OneRoot(x) => Some(x)
case TwoRoots(x, y) => Some(math.max(x, y))
case _ => None
}
largest: Seq[Option[Double]] = List(None, Some(0.9346072365885472),
Some(1.1356234869160806), Some(0.9453181931646322),
Some(1.1595052441078866), None, Some(0.5762252742788)...


It remains to remove the None values and to compute the mean of the resulting
sequence. The Scala library defines the flatten method that removes Nones and
transforms Seq[Option[A]] into Seq[A]:
scala> largest.flatten
res0: Seq[Double] = List(0.9346072365885472, 1.1356234869160806,
0.9453181931646322, 1.1595052441078866, 0.5762252742788...

Now compute the mean of the last sequence. Since the flatten operation is pre-
ceded by map, we can replace it by a flatMap. The final code is:
val largest = Seq.fill(100)(QEqu(random(), random()))
.map(solve)
.flatMap {
case OneRoot(x) => Some(x)
case TwoRoots(x, y) => Some(math.max(x, y))
case _ => None
}

scala> largest.sum / largest.size


res1: Double = 0.7682649774589514
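The replacement of map followed by flatten with a single flatMap can be checked on a small deterministic example. In this minimal sketch, the function safeSqrt and the value names are hypothetical and stand in for the solve function and the random data used above.

```scala
// Hypothetical stand-in for a computation that may have no result.
def safeSqrt(x: Int): Option[Double] =
  if (x < 0) None else Some(math.sqrt(x.toDouble))

val inputs: Seq[Int] = Seq(-1, 4, 9, -16)

// First `map`, then `flatten` to remove the `None` values.
val viaFlatten: Seq[Double] = inputs.map(x => safeSqrt(x)).flatten

// The same result with a single `flatMap`.
val viaFlatMap: Seq[Double] = inputs.flatMap(x => safeSqrt(x))

// The mean of the remaining values.
val mean: Double = viaFlatMap.sum / viaFlatMap.size
```

Both sequences contain the values 2.0 and 3.0, so the mean is 2.5.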

Example 3.4.1.6 Implement a function with type signature:


def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = ???

The function should preserve information as much as possible.


Solution Begin by pattern matching on the argument:
1 def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = {
2 case None => ???
3 case Some(eab: Either[A, B]) => ???
4 }

In line 3, we wrote the type annotation eab: Either[A, B] only for clarity. It is not
required here since the Scala compiler can deduce the type of the pattern variable
eab from the fact that we are matching a value of type Option[Either[A, B]].
In the scope of line 2, we need to return a value of type Either[A, Option[B]]. A
value of that type must be either a Left(x) for some x: A, or a Right(y) for some y:
Option[B], where y must be either None or Some(z) with a z: B. However, in our case
the code is of the form case None => ???, and we cannot produce any values x: A
or z: B since A and B are arbitrary, unknown types. The only remaining possibility
is to return Right(y) with y = None, and so the code must be:
case None => Right(None) // No other choice here.

In the next scope, we can perform pattern matching on the value eab:
case Some(eab: Either[A, B]) => eab match {
case Left(a) => ???
case Right(b) => ???
}


It remains to figure out what expressions to write in each case. In the case
Left(a) => ???, we have a value of type A, and we need to compute a value of type
Either[A, Option[B]]. We use the same argument as before: The return value must
be Left(x) for some x: A, or Right(y) for some y: Option[B]. At this point, we have
a value of type A but no values of type B. So, we have two possibilities: to return
Left(a) or to return Right(None). If we decide to return Left(a), the code is:
1 def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = {
2 case None => Right(None) // No other choice here.
3 case Some(eab) => eab match {
4 case Left(a) => Left(a) // Could return Right(None) here.
5 case Right(b) => ???
6 }
7 }

Should we return Left(a) or Right(None) in line 4? Both choices will satisfy the
required return type Either[A, Option[B]]. However, if we return Right(None) in
that line, we will ignore the given value a: A, losing information. So, we return
Left(a) in line 4.
Similarly, we find in line 5 that we may return Right(None) or Right(Some(b)).
Both choices will have the required return type (Either[A, Option[B]]), but the first
choice ignores the given value b: B. To preserve information, we need to make the
second choice:
1 def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = {
2 case None => Right(None)
3 case Some(eab) => eab match {
4 case Left(a) => Left(a)
5 case Right(b) => Right(Some(b))
6 }
7 }

We can now refactor this code into a somewhat more readable form by using
nested patterns:
def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = {
case None => Right(None)
case Some(Left(a)) => Left(a)
case Some(Right(b)) => Right(Some(b))
}
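As a quick sanity check, the following sketch applies f1 to each of the three possible shapes of its argument. The type arguments Int and String are arbitrary choices made only for illustration. They are written out explicitly because f1 is declared without an argument list, and the compiler may otherwise fail to infer them.

```scala
def f1[A, B]: Option[Either[A, B]] => Either[A, Option[B]] = {
  case None           => Right(None)
  case Some(Left(a))  => Left(a)
  case Some(Right(b)) => Right(Some(b))
}

// Sample inputs; Int and String are illustrative assumptions.
val r1 = f1[Int, String](None)                // Right(None)
val r2 = f1[Int, String](Some(Left(123)))     // Left(123)
val r3 = f1[Int, String](Some(Right("abc")))  // Right(Some("abc"))
```

The values 123 and "abc" both appear in the results, so neither case discards the data it was given.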

Example 3.4.1.7 Implement a function with the type signature:


def f2[A, B]: (Option[A], Option[B]) => Option[(A, B)] = ???

The function should preserve information as much as possible.


Solution Begin by pattern matching on the argument:
1 def f2[A, B]: (Option[A], Option[B]) => Option[(A, B)] = {
2 case (Some(a), Some(b)) => ???

In line 2, we have values a: A and b: B, and we need to compute a value of type
Option[(A, B)]. A value of that type is either None or Some((x, y)) where we would
need to choose some x: A and y: B. Since A and B are arbitrary types, we cannot
produce new values x and y from scratch. The only way of obtaining x and y is
to set x = a and y = b. So, our choices are to return Some((a, b)) or None. We reject
returning None since that would unnecessarily lose information. Thus, we continue
writing code as:
1 def f2[A, B]: (Option[A], Option[B]) => Option[(A, B)] = {
2 case (Some(a), Some(b)) => Some((a, b))
3 case (Some(a), None) => ???

In line 3, we have a value a: A but no values of type B. As the type B is arbitrary,
we cannot produce any values of type B to return a value of the form Some((x, y)).
So, None is the only computable value of type Option[(A, B)] in line 3. We continue
to write the code:
1 def f2[A, B]: (Option[A], Option[B]) => Option[(A, B)] = {
2 case (Some(a), Some(b)) => Some((a, b))
3 case (Some(a), None) => None // No other choice here.
4 case (None, Some(b)) => ???
5 case (None, None) => ???
6 }

In lines 4–5, we find that there is no choice other than returning None. So, we can
simplify the code:
def f2[A, B]: (Option[A], Option[B]) => Option[(A, B)] = {
case (Some(a), Some(b)) => Some((a, b))
case _ => None // No other choice here.
}
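The behavior of the final version of f2 can be checked on a few sample arguments. This is only a sketch: the type arguments Int and String are illustrative assumptions and are written explicitly to avoid relying on type inference.

```scala
def f2[A, B]: (Option[A], Option[B]) => Option[(A, B)] = {
  case (Some(a), Some(b)) => Some((a, b))
  case _                  => None
}

// Only the first call has both values available, so only it can return Some(...).
val both     = f2[Int, String](Some(1), Some("a"))  // Some((1, "a"))
val leftOnly = f2[Int, String](Some(1), None)       // None
val neither  = f2[Int, String](None, None)          // None
```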

3.4.2 Exercises
Exercise 3.4.2.1 Define a disjunctive type CellState representing the visual state
of one cell in the “Minesweeper”3 game: A cell can be closed (showing nothing),
or show a bomb, or be open and show the number of bombs in neighbor cells.
Exercise 3.4.2.2 In the context of the “Minesweeper” game (Exercise 3.4.2.1), count
the total number of cells with zero neighbor bombs shown by implementing a
function with type signature Seq[Seq[CellState]] => Int.
Exercise 3.4.2.3 Define a disjunctive type RootOfLinear representing all possibili-
ties for the solution of the equation 𝑎𝑥 + 𝑏 = 0 for arbitrary real 𝑎, 𝑏. (The possibil-
ities are: no roots; one root; all 𝑥 are roots.) Implement the solution as a function
solve1 with type signature:
def solve1: ((Double, Double)) => RootOfLinear = ???

3 https://en.wikipedia.org/wiki/Minesweeper_(video_game)


Exercise 3.4.2.4 Given a Seq[(Double, Double)] containing pairs (𝑎, 𝑏) of coefficients
of 𝑎𝑥 + 𝑏 = 0, compute a Seq[Double] containing the roots of that equation
when a unique root exists. Use the type RootOfLinear and the function solve1 de-
fined in Exercise 3.4.2.3.
Exercise 3.4.2.5 Use the function rootAverage from Example 3.4.1.4 to compute
the average root of 100 equations whose coefficients (𝑎, 𝑏, 𝑐) are chosen randomly
between 0 and 1. Ignore equations for which the average root value is undefined.
Exercise 3.4.2.6 The case class Subscriber was defined in Example 3.2.3.1. Given
a Seq[Subscriber], compute the sequence of email addresses for all subscribers that
did not provide a phone number.
Exercise 3.4.2.7 In this exercise, a “procedure” is a function of type Unit => Unit.
An example of a procedure is { _ => println("hello") }. Define a disjunctive type
Proc for an abstract syntax tree representing three operations on procedures: 1)
Func[A](f), to create a procedure from a function f of type Unit => A, where A is a
type parameter. (Note that the type Proc does not have any type parameters.) 2)
Sequ(p1, p2), to execute two procedures sequentially. 3) Para(p1, p2), to execute
two procedures in parallel. Then implement a “runner” that converts a Proc into a
Future[Unit], running the computations either sequentially or in parallel as appro-
priate. Test with this code:
sealed trait Proc
final case class Func[A](???) extends Proc // And so on.

def runner: Proc => Future[Unit] = ???


val proc1: Proc = Func{_ => Thread.sleep(200); println("hello1")}
val proc2: Proc = Func{_ => Thread.sleep(400); println("hello2")}

scala> runner(Sequ(Para(proc2, proc1), proc2))


hello1
hello2
hello2

Exercise 3.4.2.8 Use pattern matching to implement functions with given type
signatures, preserving information as much as possible:
def f1[A, B]: Option[(A, B)] => (Option[A], Option[B]) = ???
def f2[A, B]: Either[A, B] => (Option[A], Option[B]) = ???
def f3[A, B, C]: Either[A, Either[B, C]] => Either[Either[A, B], C] = ???

Exercise 3.4.2.9 Define a parameterized type EvenList[A] representing a list of
values of type A that must have an even length (zero, two, four, etc.). Implement
foldLeft and map for EvenList.
Exercise 3.4.2.10 The standard type List[A] requires all its values to have the
same type A. Define a parameterized type ListX[A] representing a data structure
in the form of a non-empty list where the first value has type A, the second value
has type Option[A], the third has type Option[Option[A]], and so on. Using a wrong type
at a given place (say, Option[Option[A]] instead of Option[A] for the second value
in the list) should cause a type error. Implement (not necessarily tail-recursive)
functions map and foldLeft for ListX. The type signatures:
def map[A, B](lx: ListX[A])(f: A => B): ListX[B] = ???
def foldLeft[A, R](lx: ListX[A])(init: R)(f: (R, A) => R): R = ???

Figure 3.1: The disjoint domain represented by the type RootsOfQ. (The figure shows a single point labeled NoRoots(), a line with coordinate x labeled OneRoot(x), and a plane with coordinates x, y labeled TwoRoots(x, y).)

3.5 Discussion and further developments


3.5.1 Disjunctive types as mathematical sets
To understand the properties of disjunctive types from the mathematical point of
view, consider a function whose argument is a disjunctive type, such as:
def isDoubleRoot(r: RootsOfQ) = ...

The type RootsOfQ represents the set of admissible values of the argument r, that
is, the mathematical domain of the function isDoubleRoot. What kind of domain is
that? The set of real roots of a quadratic equation 𝑥² + 𝑏𝑥 + 𝑐 = 0 can be empty, or it
can contain a single real number 𝑥, or a pair of real numbers (𝑥, 𝑦). Geometrically,
a number 𝑥 is pictured as a point on a line (a one-dimensional space), and a pair
of numbers (𝑥, 𝑦) is pictured as a point in a Cartesian plane (a two-dimensional
space). The no-roots case corresponds to a zero-dimensional space, which can be
pictured as a single point (see Figure 3.1). The point, the line, and the plane do not
intersect (i.e., have no common points). Together, they form the set of the possible
roots of the quadratic equation 𝑥² + 𝑏𝑥 + 𝑐 = 0.
In the mathematical notation, a one-dimensional real space is denoted by ℝ (or ℝ¹), a
two-dimensional space by ℝ², and a zero-dimensional space by ℝ⁰. At first, we
may think that the mathematical representation of the type RootsOfQ is a union of
the three sets, ℝ⁰ ∪ ℝ¹ ∪ ℝ². But an ordinary union of sets would not always work
correctly because we need to distinguish the parts of the union unambiguously,
even if some parts have the same type. For instance, the disjunctive type shown
in Example 3.4.1.3 cannot be correctly represented by the mathematical union:
ℝ⁰ ∪ ℝ⁰ ∪ ℝ¹ ∪ ℝ⁰ ∪ ℝ¹ ∪ ℝ²,
because ℝ⁰ ∪ ℝ⁰ = ℝ⁰ and ℝ¹ ∪ ℝ¹ = ℝ¹, so:
ℝ⁰ ∪ ℝ⁰ ∪ ℝ¹ ∪ ℝ⁰ ∪ ℝ¹ ∪ ℝ² = ℝ⁰ ∪ ℝ¹ ∪ ℝ².
For instance, this representation has no distinction between the cases Linear(x)
and OneRootQ(x).
In Scala code, each part of a disjunctive type must be distinguished by a unique
name such as NoRoots, OneRoot, and TwoRoots. To represent this mathematically,
we need to attach a distinct label to each part of the union. Labels are sym-
bols without any special meaning, and we can assume that labels are names of
Scala case classes. Parts of the union are then represented by sets of pairs such as
{(OneRoot, 𝑥) | 𝑥 ∈ ℝ¹}. Then the domain RootsOfQ is expressed as:
RootsOfQ = {(NoRoots, 𝑢) | 𝑢 ∈ ℝ⁰} ∪ {(OneRoot, 𝑥) | 𝑥 ∈ ℝ¹} ∪ {(TwoRoots, (𝑥, 𝑦)) | (𝑥, 𝑦) ∈ ℝ²}.
This is an ordinary union of mathematical sets, but each of the sets has a unique
label, so no two values from different parts of the union could possibly be equal.
This kind of set is called a labeled union (also a tagged union or a disjoint union).
Each element of a labeled union is a pair of the form (label, data), where the
label uniquely identifies the part of the union, and the data can have any chosen
type such as ℝ¹. If we use labeled unions, we cannot confuse different parts of
the union even if their data have the same type, because labels are required to be
distinct.
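The difference between an ordinary union and a labeled union can be illustrated with Scala's standard types. In this sketch, Set[Double] stands in for a set of real numbers, and the standard constructors Left and Right of Either play the role of labels; the sample numbers are arbitrary.

```scala
// An ordinary union of two sets merges equal elements: 1.0 is kept only once.
val plainUnion: Set[Double] = Set(1.0, 2.0) ++ Set(1.0, 3.0)   // 3 elements

// With labels attached, equal data coming from different parts stays distinct.
val fromFirst: Either[Double, Double]  = Left(1.0)
val fromSecond: Either[Double, Double] = Right(1.0)

val labeledUnion: Set[Either[Double, Double]] =
  Set(1.0, 2.0).map(x => (Left(x): Either[Double, Double])) ++
    Set(1.0, 3.0).map(x => (Right(x): Either[Double, Double]))  // 4 elements
```

The values Left(1.0) and Right(1.0) carry the same data but are not equal, which is exactly the property required of a labeled union.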
Labeled unions are not often used in mathematics, but they are needed in soft-
ware engineering because real-life data is often described by sets having several
disjoint parts.
Named Unit types At first sight, it may seem strange that the zero-dimensional
space is represented by a set containing one point. Why should we not use an
empty set (rather than a set with one element) to represent the case where the
equation has no real roots? The reason is that we are required to represent not
only the values of the roots but also the information about the existence of the
roots. The case with no real roots needs to be represented by some value of type
RootsOfQ. That value cannot be missing, which would happen if we used an empty
set to represent the no-roots case. It is natural to use the named empty tuple
NoRoots() to represent that case, just as we used a named 2-tuple TwoRoots(x, y) to
represent the case of two roots.

Consider the value 𝑢 used in the mathematical set {(NoRoots, 𝑢) | 𝑢 ∈ ℝ⁰}. Since ℝ⁰
consists of a single point, there is only one possible value of 𝑢. Similarly, the Unit
type in Scala has only one distinct value, written as (). A case class with no parts,
such as NoRoots, has only one distinct value, written as NoRoots(). The Scala value
NoRoots() is fully analogous to the single element (NoRoots, 𝑢) of that set.
So, case classes with no parts are similar to Unit except for an added name. For
instance, NoRoots() can be regarded as the Unit value () with name NoRoots. For this
reason, this book calls them “named unit” types.
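A small sketch of this property: any two values of a case class with no parts compare as equal, so the type effectively has a single value, just like Unit. The standalone definition of NoRoots below is an assumption made to keep the example self-contained.

```scala
final case class NoRoots()   // A "named unit": a case class with no parts.

val u: Unit = ()             // The only value of type Unit.

val n1 = NoRoots()
val n2 = NoRoots()
val sameNamed = n1 == n2     // true: there is no data that could differ
```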

3.5.2 Disjunctive types in other programming languages


Disjunctive types and pattern matching turn out to be among the defining features
of FP languages. Languages that were not designed for functional programming
do not support these features, while OCaml, Haskell, F#, Scala, Swift, and Rust
support disjunctive types and pattern matching as part of the language design.
It is remarkable that named tuple types (also called “structs” or “records”) are
provided in almost every programming language, while disjunctive types are al-
most never present except in languages designed for the FP paradigm.4
The union types in C and C++ are not disjunctive types because it is not possible
to determine which part of the union is represented by a given value. A union
declaration in C looks like this:
typedef union { int x; double y; long z; } i_d_l;

This type does not include any labels telling us which of the values is present.
Without a label, we (and the compiler) will not know whether a given value of
type i_d_l represents an int, a double, or a long. This will lead to errors that are
hard to detect.
Programming languages of the C family (C, C++, Objective-C, Java) support
enumeration (enum) types, which are a limited form of disjunctive types, and a
switch operation, which is a limited form of pattern matching. An enum type decla-
ration in Java looks like this:
enum Color { RED, GREEN, BLUE; }

In Scala, this is equivalent to a disjunctive type containing three empty tuples:


sealed trait Color
final case class RED() extends Color
final case class GREEN() extends Color
final case class BLUE() extends Color
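With this definition, a match expression plays the role of Java's switch. The following sketch repeats the definition of Color so that it is self-contained; the function name hex and the color codes are hypothetical and serve only as an illustration.

```scala
sealed trait Color
final case class RED() extends Color
final case class GREEN() extends Color
final case class BLUE() extends Color

// One `case` per part of the disjunctive type, like one `case` label per enum constant.
def hex(c: Color): String = c match {
  case RED()   => "#FF0000"
  case GREEN() => "#00FF00"
  case BLUE()  => "#0000FF"
}
```

Because Color is a sealed trait, the compiler can check that no case has been forgotten.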

4 The programming languages Ada and Pascal support disjunctive types but no other FP features.

If we add extra data to the enum types, allowing the tuples to be non-empty, and
extend the switch expression to be able to handle the extra data, we will recover
the full functionality of disjunctive types. A definition of RootsOfQ could then look
like this:
enum RootsOfQ { // This is not valid in Java!
NoRoots(), OneRoot(Double x), TwoRoots(Double x, Double y);
}

Scala 3 has a shorter syntax for disjunctive types5 that resembles Java’s “enum”:
enum RootsOfQ {
case NoRoots
case OneRoot(x: Double)
case TwoRoots(x: Double, y: Double)
}

For comparison, the syntax for a disjunctive type equivalent to RootsOfQ in OCaml
and Haskell is:
(* OCaml *)
type roots_of_q = NoRoots | OneRoot of float | TwoRoots of float * float

-- Haskell
data RootsOfQ = NoRoots | OneRoot Double | TwoRoots (Double, Double)

This is more concise than the Scala syntax. When reasoning about disjunctive
types, it is inconvenient to write out long type definitions. Chapter 5 will intro-
duce a mathematical notation designed for efficient reasoning about types. That
notation is even more concise than the syntax of Haskell or OCaml.

3.5.3 Disjunctions and conjunctions in formal logic


In logic, a proposition is a logical formula that could be true or false. A disjunc-
tion of propositions 𝐴, 𝐵, 𝐶 is denoted by 𝐴 ∨ 𝐵 ∨ 𝐶 and is true if and only if at
least one of 𝐴, 𝐵, 𝐶 is true. A conjunction of 𝐴, 𝐵, 𝐶 is denoted by 𝐴 ∧ 𝐵 ∧ 𝐶 and is
true if and only if all of the propositions 𝐴, 𝐵, 𝐶 are true.
There is a connection between disjunctive data types and logical disjunctions
of propositions. A value of the disjunctive data type RootsOfQ can be constructed
only if we have one of the values NoRoots(), OneRoot(x), or TwoRoots(x, y). Let us
now rewrite the previous sentence as a logical formula. Denote by CH(𝐴) the
logical proposition “we Can Have a value of type A here”, where by “here” we
mean a particular scope in a program. So, the proposition “the code can compute
a value of type RootsOfQ” is denoted by CH(RootsOfQ). We can then write the
above sentence about RootsOfQ as a logical formula:

CH(RootsOfQ) = CH(NoRoots) ∨ CH(OneRoot) ∨ CH(TwoRoots) .    (3.1)

5 https://dotty.epfl.ch/docs/reference/enums/adts.html

There is a similar connection between logical conjunctions and tuple types. Consider
the named tuple (i.e., a case class) TwoRoots(x: Double, y: Double). We can
have a value of type TwoRoots only if we have two values of type Double. Rewriting
this sentence as a logical formula, we get:

CH(TwoRoots) = CH(Double) ∧ CH(Double) .

Formal logic admits the simplification:

CH(Double) ∧ CH(Double) = CH(Double) .

However, no such simplification will be available in the general case, e.g.:


case class Data3(x: Int, y: String, z: Double)
For this type, we will have the formula:

CH(Data3) = CH(Int) ∧ CH(String) ∧ CH(Double) .    (3.2)

We find that tuples are related to logical conjunctions in the same way as dis-
junctive types are related to logical disjunctions. This is the main reason for choos-
ing the name “disjunctive types”.6
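The two correspondences can be seen directly in code. In this sketch, the sealed-trait definition of RootsOfQ is assumed to match the one used earlier in the chapter, and the sample values are arbitrary.

```scala
sealed trait RootsOfQ
final case class NoRoots() extends RootsOfQ
final case class OneRoot(x: Double) extends RootsOfQ
final case class TwoRoots(x: Double, y: Double) extends RootsOfQ

final case class Data3(x: Int, y: String, z: Double)

// Conjunction: an Int AND a String AND a Double are all needed to build a Data3.
val d: Data3 = Data3(1, "abc", 2.5)

// Disjunction: a NoRoots OR a OneRoot OR a TwoRoots is enough to have a RootsOfQ.
val q1: RootsOfQ = NoRoots()
val q2: RootsOfQ = OneRoot(1.0)
val q3: RootsOfQ = TwoRoots(1.0, 2.0)
```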
The correspondence between disjunctions, conjunctions, and data types is explained
in more detail in Chapter 5. For now, we note that the operations of conjunction
and disjunction are not sufficient to produce all possible logical expressions.
To obtain a complete logic, it is also necessary to have the logical implication
𝐴 → 𝐵 (“if 𝐴 is true then 𝐵 is true”). It turns out that the implication 𝐴 → 𝐵
is related to the function type A => B in the same way as the disjunction operation
is related to disjunctive types and the conjunction to tuples. In Chapter 4, we will
study function types in depth.
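As a hedged preview of that relationship: if a scope has a value of type A and a function of type A => B, it can compute a value of type B, which mirrors the logical rule "from 𝐴 and 𝐴 → 𝐵, conclude 𝐵". The function name modusPonens below is hypothetical and is used only for this illustration.

```scala
// Having a value of type A and a function A => B lets us compute a value of type B.
def modusPonens[A, B](a: A, f: A => B): B = f(a)

// Sample use with A = String and B = Int.
val n: Int = modusPonens[String, Int]("abc", s => s.length)   // 3
```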

6 Disjunctive types are also called sum types, co-product types, variants, and tagged unions. This
book uses the terms “disjunctive types” and “co-product types” interchangeably.

4 The logic of types. II. Curried functions
4.1 Functions that return functions
4.1.1 Motivation and first examples
Consider the task of preparing a logger function that prints messages with a con-
figurable prefix.
A simple logger function can be a value of type String => Unit, such as:
val logger: String => Unit = { message => println(s"INFO: $message") }

scala> logger("hello world")


INFO: hello world
This function prints any given message with the logging prefix "INFO".
The standard library function println(...) always returns a Unit value after
printing its arguments. As we already know, there is only a single value of type
Unit, and that value is denoted by (). To see that println returns Unit, run this code:
scala> val x = println(123)
123
x: Unit = ()
The task is to make the logging prefix configurable. A simple solution is to
implement a function logWith that takes a prefix as an argument and returns a
new logger containing that prefix. Note that the function logWith returns a new
function, i.e., a new value of type String => Unit:
def logWith(prefix: String): (String => Unit) = {
message => println(s"$prefix: $message")
}
The body of logWith consists of a nameless function message => println(...), which
is a value of type String => Unit. This value will be returned when we evaluate
logWith("...").
We can now use logWith to create some logger functions:
scala> val info = logWith("INFO")
info: String => Unit = <function1>

scala> val warn = logWith("WARN")
warn: String => Unit = <function1>

The created loggers are then usable as ordinary functions:


scala> info("hello")
INFO: hello

scala> warn("goodbye")
WARN: goodbye

The values info and warn can be used by any code that needs a logging function.
It is important that the prefix is “baked into” functions created by logWith. A
logger such as warn will always print messages with the prefix "WARN", and the
prefix cannot be changed any more. This is because the value prefix is treated as
a local constant within the body of the nameless function computed and returned
by logWith. For instance, the body of the function warn is equivalent to:
{ val prefix = "WARN"; (message => println(s"$prefix: $message")) }

Whenever a new function is created using logWith(prefix), the (immutable) reference
to prefix is stored within the body of the newly created function. This
is a general feature of nameless functions: the function’s body captures references
to all the outer-scope values it uses. One sometimes says that the function’s
body “closes over” those values; for this reason, nameless functions are also called
“closures”.
However, nameless functions do not copy values from outer scopes. Those val-
ues are captured by reference. This distinction is important in Scala as it supports
mutable values (as well as classes that encapsulate mutable values).
Here is an example of a function body capturing references to variables:
var c: Int = 10 // Mutable variable!
val f: Int => Int = {
val p = 10
val q = 20
x => p + q * x + c
}

The body of the function f is equivalent to { x => 10 + 20 * x + c }. The values
p = 10 and q = 20 are local constants captured in the function’s body. However,
the value c is captured by reference. If we change c, the behavior of f will also
change:
scala> f(10)
res0: Int = 220

scala> c = 1000
c: Int = 1000

scala> f(10)
res1: Int = 1210

A captured reference to a mutable external variable c makes the function f itself
mutable, even though f was defined as a val. We will avoid such code in this book
and instead use immutable values.
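One possible way of avoiding the mutable variable, shown here only as a sketch, is to pass c as an ordinary argument of a function that returns a function. The name makeF is hypothetical.

```scala
// The value c is now an immutable argument instead of a captured `var`.
def makeF(c: Int): Int => Int = {
  val p = 10
  val q = 20
  x => p + q * x + c
}

val f10   = makeF(10)
val f1000 = makeF(1000)

val a = f10(10)      // 10 + 20 * 10 + 10   = 220
val b = f1000(10)    // 10 + 20 * 10 + 1000 = 1210
val again = f10(10)  // still 220: f10 cannot change after it is created
```

A different value of c requires creating a new function, so the behavior of an existing function stays fixed.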

4.1.2 Curried and uncurried functions


Reasoning mathematically about the following code:
val info = logWith("INFO")
info("hello")

we would expect that info is the same value as logWith("INFO"), and so the code
info("hello") should have the same effect as the code logWith("INFO")("hello").
This is indeed so:
scala> logWith("INFO")("hello")
INFO: hello

The syntax logWith("INFO")("hello") looks like the function logWith applied to two
arguments. Yet, logWith was defined as a function with a single argument of
type String. This is not a contradiction because logWith("INFO") returns a func-
tion that accepts an additional argument. So, expressions logWith("INFO") and
logWith("INFO")("hello") are both valid. In this sense, we are allowed to apply
logWith to one argument at a time.
A function that can be applied to arguments in this way is called a curried
function.
While a curried function can be applied to one argument at a time, an uncurried
function must be applied to all arguments at once, e.g.:
def prefixLog(prefix: String, message: String): Unit = println(s"$prefix: $message")

The type of the curried function logWith is String => (String => Unit). By Scala’s
syntax conventions, the function arrow (=>) groups to the right. So, the parentheses
in the type expression String => (String => Unit) are not needed. The function’s
type can be written as String => String => Unit.
The type String => String => Unit is different from (String => String) => Unit,
which is the type of a function returning Unit and having a single argument of
type String => String.
When an argument’s type is a function type, e.g., String => String, it must be
enclosed in parentheses, as in (String => String) => Unit.
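The difference between the two ways of placing parentheses can be seen in a short sketch. To make the results visible, it uses the result type String instead of Unit; the names addPrefix and applyToHello are hypothetical.

```scala
// Type String => (String => String): takes a String and returns a function.
val addPrefix: String => String => String = prefix => message => s"$prefix: $message"

// Type (String => String) => String: takes a function and returns a String.
val applyToHello: (String => String) => String = f => f("hello")

val s1 = addPrefix("INFO")("hello")       // "INFO: hello"
val s2 = applyToHello(addPrefix("WARN"))  // "WARN: hello"
val s3 = applyToHello(_.toUpperCase)      // "HELLO"
```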
In general, a curried function takes an argument and returns another function
that again takes an argument and returns another function, and so on, until fi-
nally a non-function type is returned. So, the type signature of a curried function
generally looks like A => B => C => ... => R => S, where A, B, ..., R are the curried
arguments and S is the “final” result type.

For example, in the type expression A => B => C => D the types A, B, C are the
types of curried arguments, and D is the final result type. It takes time to get used
to reading this kind of syntax.
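It may help to apply such a function one argument at a time and to annotate the type of each intermediate result. In this sketch, all four types are Int and the name volume is hypothetical.

```scala
// A curried function with three curried arguments and the final result type Int.
val volume: Int => Int => Int => Int = a => b => c => a * b * c

val step1: Int => Int => Int = volume(2)  // first curried argument applied
val step2: Int => Int        = step1(3)   // second curried argument applied
val result: Int              = step2(4)   // final, non-function result: 24
```

Each application removes the leftmost type from the type expression.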
In Scala, functions defined with multiple argument lists (enclosed in multiple
pairs of parentheses) are curried functions. We have seen examples of curried
functions before:
def map[A, B](xs: Seq[A])(f: A => B): Seq[B]
def fmap[A, B](f: A => B)(xs: Option[A]): Option[B]
def foldLeft[A, R](xs: Seq[A])(init: R)(update: (R, A) => R): R

The type signatures of these functions can be also written equivalently without
argument names, although this is less convenient in practical coding:
def map[A, B]: Seq[A] => (A => B) => Seq[B]
def fmap[A, B]: (A => B) => Option[A] => Option[B]
def foldLeft[A, R]: Seq[A] => R => ((R, A) => R) => R

Curried arguments of a function type, such as (A => B), need parentheses.


To summarize, a curried function such as logWith can be defined in three equiv-
alent ways in Scala:
1 def logWith1(prefix: String)(message: String): Unit = println(s"$prefix: $message")
2 def logWith2(prefix: String): String => Unit = { message => println(s"$prefix: $message") }
3 def logWith3: String => String => Unit = { prefix => message => println(s"$prefix: $message") }

For clarity, we will sometimes enclose nameless functions in parentheses or curly
braces.
Line 3 above shows that the arrow symbols => group to the right within the code
of nameless functions. So, x => y => expr means {x => {y => expr}}, a nameless
function taking an argument x and returning a nameless function that takes an
argument y and returns an expression expr. This syntax convention is helpful since
the code x => y => z visually corresponds to the curried function’s type signature
A => B => C, which uses the same syntax convention. Also, the syntax (x => y) =>
z could not possibly work for a nameless function because matching a function
against the pattern x => y makes no sense. If we matched a function such as {
t => t + 20 } against the pattern x => y by setting x = t and y = t + 20, we would
have no value for the bound variable t. (What would be the integer value of y?)
So, x => (y => z) is the only sensible way of adding parentheses to x => y => z.
Although the code (x => y) => z is invalid, the type expression (A => B) => C is
valid. We may write a nameless function of type (A => B) => C as f => expr where
f: A => B is the argument and expr the body.
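For instance, a nameless function of type (Int => Int) => Int can be written as follows. The name atTen is hypothetical, and the functions passed to it are arbitrary samples.

```scala
// The argument f is itself a function; the body applies f to the number 10.
val atTen: (Int => Int) => Int = f => f(10)

val r1 = atTen(t => t + 20)  // 30
val r2 = atTen(t => t * t)   // 100
```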


4.1.3 Equivalence of curried and uncurried functions


We defined the curried function logWith in order to be able to create logger func-
tions such as info and warn. However, some curried functions, such as foldLeft,
are almost always applied to all possible arguments. A curried function applied
to all its possible arguments is equivalent to an uncurried function that takes all
those arguments at once. Let us look at this equivalence in more detail.
Consider a curried function with type signature Int => Int => Int. This func-
tion takes an integer and returns an (uncurried) function taking an integer and
returning an integer. An example of such a curried function is:
def f1(x: Int): Int => Int = { y => x - y }

The function takes an integer x and returns the expression y => x - y, which is
a function of type Int => Int. The code of f1 can be written equivalently as:
val f1: Int => Int => Int = { x => y => x - y }

Let us rewrite f1 as a function that takes its two arguments at once:


def f2(x: Int, y: Int): Int = x - y

The function f2 has type signature (Int, Int) => Int. Calling f1 and f2 requires
different syntax:
scala> f1(20)(4)
res0: Int = 16

scala> f2(20, 4)
res1: Int = 16

The main difference is that f2 must be applied at once to both arguments, while
f1 could be applied to just the first argument (20). Applying a curried function to
some but not all possible arguments is called a partial application. The result of
evaluating f1(20) is a function that can be later applied to another argument:
scala> val r1 = f1(20)
r1: Int => Int = <function1>

scala> r1(4)
res2: Int = 16

Applying a curried function to all possible arguments is called a full application.
A full application returns a value that is not of a function type. So, it cannot
be applied to more arguments.
To partially apply an uncurried function, we can use the underscore (_) symbol:
1 scala> val r2: Int => Int = f2(20, _)
2 r2: Int => Int = <function1>
3
4 scala> r2(4)
5 res3: Int = 16


(The type annotation Int => Int is required in line 1.) This code creates a func-
tion r2 by applying f2 to the first argument but not to the second. Then r2 is the
same function as r1 defined above; i.e., r2 returns the same values for the same
arguments as r1. A more verbose syntax for a partial application is:
scala> val r3: Int => Int = { x => f2(20, x) } // Same as r2 above.
r3: Int => Int = <function1>

scala> r3(4)
res4: Int = 16

We can see that a curried function, such as f1, is better adapted for partial ap-
plication than f2, because the syntax is shorter. However, the types of functions f1
and f2 are equivalent: for any f1 of type Int => Int => Int we can reconstruct f2
of type (Int, Int) => Int and vice versa, without loss of information:
def f2new(x: Int, y: Int): Int = f1(x)(y) // f2new is equal to f2
def f1new: Int => Int => Int = { x => y => f2(x, y) } // f1new is equal to f1

It is clear that the function f1new computes the same results as f1, and that the
function f2new computes the same results as f2. The equivalence of the functions
f1 and f2 is not equality — these functions are different; but each of them can be re-
constructed from the other. The one-to-one correspondence between all functions
of type Int => Int => Int and all functions of type (Int, Int) => Int is what we call
the “equivalence of types”.
More generally, a curried function has a type signature of the form A => B => C
=> ... => R => S, where A, B, C, ..., S are some types. A function with this type signa-
ture is equivalent to an uncurried function with type signature (A,B,C,...,R) => S.
The uncurried function takes all arguments at once, while the curried function
takes one argument at a time. Other than that, these two functions compute the
same results given the same arguments.
We have seen how a curried function can be converted to an equivalent un-
curried one, and vice versa. The Scala library defines the methods curried and
uncurried that convert between these forms of functions. To convert between f2
and f1:
scala> val f1c = (f2 _).curried
f1c: Int => (Int => Int) = <function1>

scala> val f2u = Function.uncurried(f1c)
f2u: (Int, Int) => Int = <function2>

The syntax (f2 _) is needed in Scala 2 to convert methods to function values. Recall
that Scala has two ways of defining a function: one as a method (defined using
def), another as a function value (defined using val). The extra underscore is
unnecessary in Scala 3.
The methods curried and uncurried are quick to implement (see Section 4.2.1
below). These functions are called the currying and uncurrying transformations.
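To give an idea of how short such implementations can be, here is a minimal sketch for functions of two arguments. The names curry2 and uncurry2 are hypothetical, and this is not the standard library's code; here f2 is defined as a function value so that no method-to-function conversion is needed.

```scala
// Hypothetical helpers for the two-argument case only.
def curry2[A, B, C](f: (A, B) => C): A => B => C = x => y => f(x, y)
def uncurry2[A, B, C](g: A => B => C): (A, B) => C = (x, y) => g(x)(y)

val f2: (Int, Int) => Int = (x, y) => x - y

val f1c: Int => Int => Int = curry2(f2)     // curried form of f2
val f2u: (Int, Int) => Int = uncurry2(f1c)  // back to the uncurried form
```

Both helpers only rearrange how the arguments are passed, so f1c(20)(4) and f2u(20, 4) compute the same value as f2(20, 4).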

4.2 Fully parametric functions


Scala code may declare functions with type parameters, which are set only when
the function is applied to specific arguments. Examples of such functions are map
and filter, written as:
def map[A, B](xs: Seq[A])(f: A => B): Seq[B]
def filter[A](xs: Seq[A])(p: A => Boolean): Seq[A]

Such functions can be applied to arguments of different types without changing
the function’s code. It is better to write a single function with type parameters
instead of writing several functions with repeated code but working with different
types.
When we apply the function map as map(xs)(f) to a specific value xs of type, say,
Seq[Int], and a specific function f of type, say, Int => String, the Scala compiler
will automatically set the type parameters A = Int and B = String in the code of
map. We may also set type parameters explicitly and write, for example, map[Int,
String](xs)(f). This syntax shows a certain similarity between type parameters
such as Int, String and “value parameters” (arguments) xs and f. Setting type
parameters, e.g., map[Int, String], means substituting A = Int, B = String into the
type signature of the function, similarly to how setting value parameters means
substituting specific values into the function body.
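The two ways of setting type parameters can be compared in a short sketch. The standalone function map below is an assumption made only to keep the example self-contained (it delegates to the standard map method of Seq), and the sample values are arbitrary.

```scala
// A hypothetical standalone `map` with the type signature shown above.
def map[A, B](xs: Seq[A])(f: A => B): Seq[B] = xs.map(f)

val xs: Seq[Int] = Seq(1, 2, 3)
val f: Int => String = n => "n" + n

val inferred = map(xs)(f)               // the compiler sets A = Int, B = String
val explicit = map[Int, String](xs)(f)  // the same type parameters, written out
```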
In the functions map and filter as just shown, some types are parameters while
others are specific types, such as Seq and Boolean. It is sometimes possible to re-
place all specific types in the type signature of a function by type parameters. The
result is a “fully parametric” function.
We call a function fully parametric if its arguments have types described by
type parameters, and the code of the function does not use any information about
its argument types, other than assuming that those types correctly match the type
signature. In addition to type parameters, a fully parametric function may use
the Unit type, tuple types, disjunctive types, and function types. Fully parametric
functions may not use any library-defined types such as Int or String.
What kind of functions are fully parametric? To build up intuition, let us com-
pare the following two functions that have the same type signature:
def cos_sin(p: (Double, Double)): (Double, Double) = p match {
  case (x, y) =>
    val r = math.sqrt(x * x + y * y)
    (x / r, y / r) // Return cos and sin of the angle, or `NaN` when undefined.
}

def swap(p: (Double, Double)): (Double, Double) = p match {
  case (x, y) => (y, x)
}

We can introduce type parameters into the type signature of swap to make it fully
parametric:
def swap[A, B](p: (A, B)): (B, A) = p match {
  case (x, y) => (y, x)
}

Converting swap into a fully parametric function is possible because the operation
of swapping the parts of a tuple (A, B) works in the same way for all types A, B. No
changes were made in the body of the function. The specialized version of swap
working on (Double, Double) can be obtained from the fully parametric version of
swap if we set the type parameters as A = Double, B = Double.
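
A short sketch of this specialization, using the fully parametric swap from the text (the value names swapDouble, r1, r2, r3 are illustrative):

```scala
// The fully parametric `swap` from the text.
def swap[A, B](p: (A, B)): (B, A) = p match {
  case (x, y) => (y, x)
}

// Setting A = Double, B = Double recovers the specialized version.
val swapDouble: ((Double, Double)) => (Double, Double) = p => swap[Double, Double](p)

val r1 = swapDouble((1.5, 2.5))   // (2.5, 1.5)
val r2 = swap((1, "abc"))         // A = Int, B = String are inferred: ("abc", 1)
val r3 = swap(swap((true, 'x')))  // Swapping twice returns the original pair: (true, 'x')
```

The same function body serves all three calls; only the type parameters differ.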
In contrast, the function cos_sin performs a computation that is specific to the
type Double. That computation cannot be generalized to an arbitrary type param-
eter A instead of the type Double. For instance, the code of cos_sin uses the function
math.sqrt, which is defined only for the type Double.
To generalize cos_sin to a fully parametric function that works with a type pa-
rameter A, we would need to replace all computations specific to the type Double
by new arguments working with the type parameter A. For example, we could in-
troduce two new arguments (named, say, distance and ratio) and replace cos_sin
by the fully parametric function cos_sin_parametric:
def cos_sin_parametric[A](
  p: (A, A),
  distance: (A, A) => A,
  ratio: (A, A) => A
): (A, A) = p match {
  case (x, y) =>
    val r = distance(x, y)
    (ratio(x, r), ratio(y, r))
}
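
To see that nothing is lost by this generalization, we may recover the original cos_sin by setting A = Double and passing the Double-specific computations as arguments. This is a sketch; the sample input (3.0, 4.0) is an arbitrary choice:

```scala
def cos_sin_parametric[A](p: (A, A), distance: (A, A) => A, ratio: (A, A) => A): (A, A) =
  p match {
    case (x, y) =>
      val r = distance(x, y)
      (ratio(x, r), ratio(y, r))
  }

// All knowledge about the type Double is now contained in the two function arguments.
def cos_sin(p: (Double, Double)): (Double, Double) =
  cos_sin_parametric[Double](p, (x, y) => math.sqrt(x * x + y * y), (a, b) => a / b)

val result = cos_sin((3.0, 4.0))  // Here r = 5.0, so the result is (0.6, 0.8).
```

The price of full parametricity is a longer argument list: the caller must now supply the operations that the function body previously took from the math library.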

A fully parametric function has all its arguments typed with type parameters or
with some combinations of type parameters, i.e., type expressions such as (A, B)
or X => Either[X, Y].
The swap operation for pairs is already defined in the Scala library:
scala> (1, "abc").swap
res0: (String, Int) = (abc,1)

If needed, other swapping functions can be implemented for tuples with more
elements, e.g.:
def swapAC[A, B, C]: ((A, B, C)) => (C, B, A) = { case (x, y, z) => (z, y, x) }

The Scala syntax requires double parentheses around tuple types of arguments but
not around the tuple type of a function’s result. So, the function cos_sin may be
written as a value like this:
val cos_sin: ((Double, Double)) => (Double, Double) = ...

Further examples of fully parametric functions are the identity function, the
const function, the function composition methods, and the currying / uncurrying
transformations.

The identity function is available in the Scala library as identity[T]:


def identity[T]: T => T = (t => t)

In the mathematical notation, we write the identity function as “id” for brevity.
The function available in the Scala library as Function.const[C, X] takes an argu-
ment c of type C and returns a new function that always returns c:
def const[C, X](c: C): X => C = (_ => c)

The syntax _ => c is used to emphasize that the new returned function ignores its
argument. One-argument functions that ignore their argument are called constant
functions.
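
The following sketch shows both functions in use. The definitions are copied from the text but renamed to id and const so that they do not clash with the library versions; the value names are illustrative:

```scala
// Definitions as in the text, under the hypothetical names `id` and `const`.
def id[T]: T => T = t => t
def const[C, X](c: C): X => C = _ => c

// A constant function of type String => Int that always returns 5.
val always5: String => Int = const[Int, String](5)

val a = id[Int](42)               // 42
val b = always5("ignored")        // 5
val c = always5("also ignored")   // 5: the argument does not affect the result

// The standard library versions behave in the same way:
val d = identity(42)                  // 42
val e = Function.const(5)("ignored")  // 5
```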

4.2.1 Function composition


Consider two functions f: Int => Double and g: Double => String. We can apply f
to an integer argument x and get a result f(x) of type Double. Applying g to that
result gives a String value g(f(x)). The transformation from an x of type Int to a
final String value g(f(x)) can be viewed as a new function of type Int => String.
That new function is called the forward composition of the two functions f and g.
In Scala, the forward composition of f and g is written as f andThen g:
val f: Int => Double = (x => 5.67 + x)
val g: Double => String = (x => f"Result x = $x%3.2f")

scala> val h = f andThen g // h(x) is defined as g(f(x)).
h: Int => String = <function1>

scala> h(40)
res36: String = Result x = 45.67

The Scala compiler derives the type of h automatically as Int => String.
This book denotes the forward composition by the symbol # (which can be read
as “before”). We define 𝑓 # 𝑔 (read as “𝑓 before 𝑔”) by:

$$ f \,\#\, g \overset{\text{def}}{=} x \to g(f(x)) . \tag{4.1} $$

The symbol $\overset{\text{def}}{=}$ means “is defined as” or “is equal by definition to”.
We may implement the forward composition as a fully parametric function:
def andThen[X, Y, Z](f: X => Y)(g: Y => Z): X => Z = { x => g(f(x)) }

This type signature requires the types of the function arguments to match in a
certain way, or else the composition is undefined (and the code would produce
a type error). The method andThen is an example of a function that both returns a
new function and takes other functions as arguments.
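
A brief usage sketch of this andThen function, compared with the built-in method. The functions f and g here are illustrative and differ from the ones used above:

```scala
def andThen[X, Y, Z](f: X => Y)(g: Y => Z): X => Z = { x => g(f(x)) }

val f: Int => Double = x => x / 2.0
val g: Double => String = d => "half = " + d

val h1: Int => String = andThen(f)(g)  // X = Int, Y = Double, Z = String are inferred.
val h2: Int => String = f andThen g    // The built-in method of Function1.

val r1 = h1(5)  // "half = 2.5"
val r2 = h2(5)  // "half = 2.5"

// Composing in the wrong order, andThen(g)(f), does not compile:
// g returns a String while f expects an Int.
```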
The backward composition of two functions 𝑓 and 𝑔 works in the opposite
order: first 𝑔 is applied and then 𝑓 . This operation is denoted by the symbol ◦
(pronounced “after”):
$$ f \circ g \overset{\text{def}}{=} x \to f(g(x)) . \tag{4.2} $$
In Scala, the backward composition is called compose and used as f compose g. This
method may be implemented as a fully parametric function:
def compose[X, Y, Z](f: Y => X)(g: Z => Y): Z => X = { z => f(g(z)) }
We have already seen the methods curried and uncurried from the Scala library.
As an illustration, here is the code for the uncurrying transformation (converting
curried functions to uncurried):
def uncurry[A, B, R](f: A => B => R): ((A, B)) => R = { case (a, b) => f(a)(b) }
These examples show that fully parametric functions perform operations so
general that they work in the same way for all types. Some arguments of fully
parametric functions may have complicated types such as A => B => R, which are
type expressions built up from type parameters. But fully parametric functions
do not use values of specific types such as Int or String.
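
The functions compose and uncurry shown above may be exercised as follows. This is a sketch; the functions subtract, inc, and show are illustrative choices:

```scala
def compose[X, Y, Z](f: Y => X)(g: Z => Y): Z => X = { z => f(g(z)) }
def uncurry[A, B, R](f: A => B => R): ((A, B)) => R = { case (a, b) => f(a)(b) }

val subtract: Int => Int => Int = x => y => x - y        // A curried function.
val subtractPair: ((Int, Int)) => Int = uncurry(subtract) // Its uncurried version.

val r1 = subtract(20)(4)        // 16
val r2 = subtractPair((20, 4))  // 16

val inc: Int => Int = x => x + 1
val show: Int => String = n => "n=" + n

val viaCompose: Int => String = compose(show)(inc)  // First `inc`, then `show`.
val r3 = viaCompose(1)            // "n=2"
val r4 = show.compose(inc)(1)     // "n=2", using the library method
```

In each case, the type parameters are determined entirely by the types of the arguments; the function bodies never inspect the values.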
Functions with type parameters are often called “generic”. This book uses the
term “fully parametric” to designate a certain restricted kind of generic functions.

4.2.2 Laws of function composition


The operations of function composition, introduced in Section 4.2.1, have three
important properties or “laws”:

• The two identity laws: the composition of any function 𝑓 with an identity
function (identity[A]) will give again the function 𝑓 .

• The associativity law: the consecutive composition of three functions 𝑓 , 𝑔, ℎ
does not depend on the order in which the pairs are composed.

These laws hold equally for the forward and the backward composition, since
those are just syntactic variants of the same operation. Let us write these laws
rigorously as equations and prove them.
Proofs with forward composition The composition of the identity function with
an arbitrary function 𝑓 on the left is written as 𝑓 # id. The composition with the
function 𝑓 on the right is written as id # 𝑓 . In both cases, the result must be equal
to the function 𝑓 . The resulting two laws are:

left identity law of function composition : id # 𝑓 = 𝑓 ,


right identity law of function composition : 𝑓 # id = 𝑓 .

To prove that these laws hold, we need to show that the functions at both sides
of the laws give the same result when applied to an arbitrary value 𝑥. Let us first
clarify how the type parameters must be set for all types to match consistently.

The laws must hold for an arbitrary function 𝑓 . Assume that 𝑓 has the type
signature 𝐴 → 𝐵, where 𝐴 and 𝐵 are arbitrary types (type parameters). Consider
the left identity law. The function (id # 𝑓 ) is, by definition (4.1), a function that
takes an argument 𝑥, applies id to that 𝑥, and then applies 𝑓 to the result:

id# 𝑓 = (𝑥 → 𝑓 (id (𝑥))) .

If 𝑓 has type 𝐴 → 𝐵, its argument must be of type 𝐴, or else the types will not
match. Therefore, the identity function must have type 𝐴 → 𝐴, and the argu-
ment 𝑥 must have type 𝐴. With these choices of the type parameters, the function
(𝑥 → 𝑓 (id(𝑥))) will have type 𝐴 → 𝐵. This type matches the right-hand side of
the law, which is just 𝑓 . We add type annotations to the code as superscripts:

$$ \text{id}^{:A\to A} \,\#\, f^{:A\to B} = \big( x^{:A} \to f(\text{id}(x)) \big)^{:A\to B} . $$

In the Scala syntax, this formula may be written as:


identity[A] andThen (f: A => B) == { x: A => f(identity(x)) }: A => B
We will follow the convention where type parameters are single uppercase letters,
as is common in Scala code (although this convention is not enforced by the
Scala compiler). The colon symbol (:) in the superscript $x^{:A}$ means a type
annotation, as in Scala code x:A. Superscripts without a colon, such as
$\text{id}^{A}$, denote type parameters, as in Scala code identity[A]. Since the
function identity[A] has type A => A, we can write $\text{id}^{A}$ or equivalently
(but more verbosely) $\text{id}^{:A\to A}$ to denote that function.
Now we can prove the law. By definition of the identity function, we have
id (𝑥) = 𝑥, and so:

id# 𝑓 = (𝑥 → 𝑓 (id (𝑥))) = (𝑥 → 𝑓 (𝑥)) = 𝑓 .

The last step works since 𝑥 → 𝑓 (𝑥) is a function that takes an argument 𝑥 and
applies 𝑓 to that argument. This is the same function as 𝑓 . We say that 𝑥 → 𝑓 (𝑥)
is an expanded form of the function 𝑓 .
We turn to the right identity law, 𝑓 # id = 𝑓 . Write out the left-hand side:

𝑓 # id = (𝑥 → id ( 𝑓 (𝑥))) .

To check that the types match, assume that 𝑓 :𝐴→𝐵 . Then 𝑥 must have type 𝐴, and
the identity function must have type 𝐵 → 𝐵. The result of id ( 𝑓 (𝑥)) will also have
type 𝐵. With these choices of type parameters, all types match:

$$ f^{:A\to B} \,\#\, \text{id}^{:B\to B} = \big( x^{:A} \to \text{id}(f(x)) \big)^{:A\to B} . $$

Since id ( 𝑓 (𝑥)) = 𝑓 (𝑥), we find:

𝑓 # id = (𝑥 → 𝑓 (𝑥)) = 𝑓 .

In this way, we have demonstrated that both identity laws hold.


The associativity law is written as an equation like this:
associativity law of function composition : ( 𝑓 # 𝑔) # ℎ = 𝑓 # (𝑔 # ℎ) . (4.3)
Let us verify that the types match here. The types of the functions 𝑓 , 𝑔, and ℎ
must be such that all the function compositions match. If 𝑓 has type 𝐴 → 𝐵 for
some type parameters 𝐴 and 𝐵, then the argument of 𝑔 must be of type 𝐵. So, we
must have 𝑔 :𝐵→𝐶 , where 𝐶 is another type parameter. The composition 𝑓 # 𝑔 has
type 𝐴 → 𝐶, so ℎ must have type 𝐶 → 𝐷 for some type 𝐷. Assuming the types as
𝑓 :𝐴→𝐵 , 𝑔 :𝐵→𝐶 , and ℎ:𝐶→𝐷 , we find that the types in all the compositions 𝑓 # 𝑔, 𝑔 # ℎ,
( 𝑓 # 𝑔) # ℎ, and 𝑓 # (𝑔 # ℎ) match. We can rewrite Eq. (4.3) with type annotations:
( 𝑓 :𝐴→𝐵 # 𝑔 :𝐵→𝐶 ) # ℎ:𝐶→𝐷 = 𝑓 :𝐴→𝐵 # (𝑔 :𝐵→𝐶 # ℎ:𝐶→𝐷 ) . (4.4)
After checking the types, we are ready to verify the associativity law. Note
that both sides of the law (4.4) are functions of type 𝐴 → 𝐷. To prove that two
functions are equal means to prove that they return the same results when applied
to the same arguments. So, let us apply both sides of Eq. (4.4) to an arbitrary value
𝑥 :𝐴 . Using definition (4.1) for the forward composition, we find:
(( 𝑓 # 𝑔) # ℎ) (𝑥) = ℎ (( 𝑓 # 𝑔) (𝑥)) = ℎ(𝑔( 𝑓 (𝑥))) ,
( 𝑓 # (𝑔 # ℎ)) (𝑥) = (𝑔 # ℎ) ( 𝑓 (𝑥)) = ℎ(𝑔( 𝑓 (𝑥))) .
Both sides of the law are equal when applied to an arbitrary value 𝑥. This con-
cludes the proof.
Because of the associativity law, we do not need parentheses when writing the
expression 𝑓 # 𝑔 # ℎ. The function ( 𝑓 # 𝑔) # ℎ is equal to the function 𝑓 # (𝑔 # ℎ).
In the proof, we have omitted the type annotations since we already checked
that all types match. Checking the types beforehand allows us to write shorter
derivations.
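
The laws can also be spot-checked in Scala. Since Scala cannot compare two function values with ==, the sketch below compares their results on a few sample arguments. The helper agree, the functions f, g, h, and the sample values are illustrative assumptions. Such a check illustrates the laws on chosen inputs but does not replace the symbolic proof.

```scala
def id[A]: A => A = a => a

val f: Int => Double = x => x / 2.0
val g: Double => String = d => "value " + d
val h: String => Int = s => s.length

val samples: Seq[Int] = Seq(-3, 0, 1, 10, 1000)

// Hypothetical helper: do two functions give equal results on all sample arguments?
def agree[A, B](p: A => B, q: A => B)(xs: Seq[A]): Boolean =
  xs.forall(x => p(x) == q(x))

val leftIdentity  = agree(id[Int].andThen(f), f)(samples)     // id # f = f
val rightIdentity = agree(f.andThen(id[Double]), f)(samples)  // f # id = f
val associativity =
  agree(f.andThen(g).andThen(h), f.andThen(g.andThen(h)))(samples)  // (f # g) # h = f # (g # h)
```

All three values are true. Note how the type parameter of id is set to Int in the left identity law and to Double in the right identity law, in agreement with the type annotations derived above.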
Proofs with backward composition This book prefers to use the forward com-
position 𝑓 # 𝑔 rather than the backward composition 𝑔 ◦ 𝑓 . If desired, all equations
can be converted from one notation to the other by reversing the order of compo-
sitions:
$$ f \,\#\, g \overset{\text{def}}{=} g \circ f $$
for any functions 𝑓 :𝐴→𝐵 and 𝑔 :𝐵→𝐶 . Let us see how to prove the composition laws
in the backward notation. We will just need to reverse the order of function com-
positions in the proofs above.
The left identity and right identity laws are:
𝑓 ◦ id = 𝑓 , id ◦ 𝑓 = 𝑓 .
To match the types, we need to choose the type parameters as:

$$ f^{:A\to B} \circ \text{id}^{:A\to A} = f^{:A\to B} , \qquad \text{id}^{:B\to B} \circ f^{:A\to B} = f^{:A\to B} . $$


We now apply both sides of the laws to an arbitrary value 𝑥 :𝐴 . For the left identity
law, we find:

use definition (4.2) : 𝑓 ◦ id = (𝑥 → 𝑓 (id (𝑥))) = (𝑥 → 𝑓 (𝑥)) = 𝑓 .

Similarly for the right identity law:

id ◦ 𝑓 = (𝑥 → id ( 𝑓 (𝑥))) = (𝑥 → 𝑓 (𝑥)) = 𝑓 .

The associativity law,


ℎ ◦ (𝑔 ◦ 𝑓 ) = (ℎ ◦ 𝑔) ◦ 𝑓 ,
is proved by applying both sides to an arbitrary value 𝑥 of a suitable type:

(ℎ ◦ (𝑔 ◦ 𝑓 )) (𝑥) = ℎ ((𝑔 ◦ 𝑓 ) (𝑥)) = ℎ (𝑔 ( 𝑓 (𝑥))) ,


((ℎ ◦ 𝑔) ◦ 𝑓 ) (𝑥) = (ℎ ◦ 𝑔) ( 𝑓 (𝑥)) = ℎ (𝑔 ( 𝑓 (𝑥))) .

The types are checked by assuming that 𝑓 has the type 𝑓 :𝐴→𝐵 . The types in 𝑔 ◦ 𝑓
match only when 𝑔 :𝐵→𝐶 , and then 𝑔 ◦ 𝑓 is of type 𝐴 → 𝐶. The type of ℎ must be
ℎ:𝐶→𝐷 for the types in ℎ ◦ (𝑔 ◦ 𝑓 ) to match. We can write the associativity law with
type annotations as:

ℎ:𝐶→𝐷 ◦ (𝑔 :𝐵→𝐶 ◦ 𝑓 :𝐴→𝐵 ) = (ℎ:𝐶→𝐷 ◦ 𝑔 :𝐵→𝐶 ) ◦ 𝑓 :𝐴→𝐵 . (4.5)

The associativity law allows us to omit parentheses in the expression ℎ ◦ 𝑔 ◦ 𝑓 .
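
In Scala, the backward notation corresponds to the method compose. The following sketch (with illustrative functions f, g, h and sample values) checks on a few arguments that the backward compositions agree with the forward one:

```scala
val f: Int => Double = x => x / 2.0
val g: Double => String = d => "value " + d
val h: String => Int = s => s.length

val forward: Int => Int   = f.andThen(g).andThen(h)   // f # g # h
val backward: Int => Int  = h.compose(g).compose(f)   // (h ∘ g) ∘ f
val regrouped: Int => Int = h.compose(g.compose(f))   // h ∘ (g ∘ f)

val xs = Seq(-3, 0, 1, 10, 1000)

// All three functions give the same results on the sample arguments.
val allAgree = xs.forall(x => forward(x) == backward(x) && backward(x) == regrouped(x))
```

The functions appear in the opposite order in the two notations, while the order of their evaluation (first f, then g, then h) is the same.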


The length of calculations is the same in the forward and the backward notation.
One difference is that types of function compositions are more visually clear in
the forward notation: it is harder to check that types match in Eq. (4.5) than in
Eq. (4.4). To make the backward notation easier to work with, one could write
the function types in reverse as, e.g., 𝑔 :𝐶←𝐵 ◦ 𝑓 :𝐵←𝐴 . (This is done in the book
“Program design by calculation” by J. N. Oliveira, where the backward composition
is used exclusively; see http://www4.di.uminho.pt/~jno/ps/pdbc.pdf)

4.2.3 Example: A function that is not fully parametric

Fully parametric functions do not make any decisions based on the actual types
of arguments. As an example of code that is not fully parametric, consider the
following “fake identity” function:
def fid[A]: A => A = {
  case x: Int => (x - 1).asInstanceOf[A] // Special code for A = Int.
  case x      => x                       // Standard code for all other types A.
}
This function’s type signature is the same as that of identity[A], and its behavior
is the same for all types A except for A = Int:

scala> fid("abc")
res0: String = abc

scala> fid(true)
res1: Boolean = true

scala> fid(0)
res2: Int = -1

While Scala allows us to write this kind of code, the result is confusing: the type
signature A => A does not indicate a special behavior with A = Int. In any case, fid
is not a fully parametric function.

Let us see whether the identity laws of function composition hold when using
fid[A] instead of the correct function identity[A]. To see that, we compose fid with
a simple function f_1 defined by:
def f_1: Int => Int = { x => x + 1 }

The composition (f_1 andThen fid) has type Int => Int. Since f_1 has type Int =>
Int, Scala will automatically set the type parameter A = Int in fid[A]:
scala> def f_2 = f_1 andThen fid // 𝑓2 = 𝑓1 # fid
f_2: Int => Int

By the identity law, we should have 𝑓2 = 𝑓1 # id = 𝑓1 . But we can check that f_1 and
f_2 are not equal:
scala> f_1(0)
res3: Int = 1

scala> f_2(0)
res4: Int = 0

It is important that we are able to detect that fid is not a fully parametric func-
tion by checking whether some equation holds, without looking at the code of
fid. In this book, we will always formulate any desired properties through equa-
tions or “laws”. To verify that a law holds, we will perform symbolic calcula-
tions similar to the proofs in Section 4.2.2. These calculations are symbolic in
the sense that we are manipulating symbols (such as 𝑥, 𝑓 , 𝑔, ℎ) without substi-
tuting any specific values for these symbols but only using some general rules
and properties. This is similar to symbolic calculations in mathematics, such as
$(x - y)(x^2 + xy + y^2) = x^3 - y^3$. In the next section, we will get more experience
with symbolic calculations relevant to functional programming.
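
The detection of fid through a law can be sketched in code as follows. The helper rightIdentityHolds and the sample arguments are hypothetical; fid and f_1 are as defined above. The helper tests the right identity law on sample arguments without looking at the code of the candidate identity function:

```scala
// `fid` is copied from the text; `id` is the true identity function.
def fid[A]: A => A = {
  case x: Int => (x - 1).asInstanceOf[A]
  case x      => x
}
def id[A]: A => A = a => a

// Hypothetical helper: checks the law f # i = f on the given sample arguments.
def rightIdentityHolds[A, B](f: A => B, i: B => B)(samples: Seq[A]): Boolean =
  samples.forall(x => f.andThen(i)(x) == f(x))

val f_1: Int => Int = x => x + 1

val okForId  = rightIdentityHolds(f_1, id[Int])(Seq(0, 1, 2))   // true
val okForFid = rightIdentityHolds(f_1, fid[Int])(Seq(0, 1, 2))  // false
```

A failed check is conclusive: the law does not hold for fid. A passed check on a few samples is only evidence; to establish a law for all arguments, we need a symbolic derivation.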

4.3 Symbolic calculations with nameless functions


4.3.1 Calculations with curried functions
In mathematics, functions are evaluated by substituting their argument values
into their body. Each sub-expression is then evaluated and its result substituted
into the larger expression.
Nameless functions are evaluated in the same way. For example, applying the
nameless function 𝑥 → 𝑥 + 10 to an integer 2, we substitute 2 instead of 𝑥 in “𝑥 + 10”
and get the sub-expression “2 + 10”. Then we evaluate that sub-expression to 12.
The computation is written like this:

(𝑥 → 𝑥 + 10)(2) = 2 + 10 = 12 .

To run this computation in Scala, we need to add a type annotation to the nameless
function as in (𝑥 :Int → 𝑥 + 10)(2). The code is:
scala> ((x: Int) => x + 10)(2)
res0: Int = 12

Curried function calls such as 𝑓 (𝑥)(𝑦) or (𝑥 → expr(𝑥))(𝑦)(𝑧) may look unfamiliar
and confusing. We need to get some experience working with them.
Consider the expression (x => y => x - y)(20)(4), and begin with the curried
argument 20. Applying a nameless function of the form (x => ...) to 20 means
substituting x = 20 into the body of the function. After that substitution, we obtain
the expression y => 20 - y, which is again a nameless function. Applying that
function to the remaining argument (4) means substituting y = 4 into the body of
y => 20 - y. We get the expression 20 - 4, which equals 16. Test in Scala:
scala> ((x: Int) => (y: Int) => x - y)(20)(4)
res1: Int = 16

Applying a curried function such as x => y => z => expr(x,y,z) to three curried
arguments 10, 20, and 30 means substituting x = 10, y = 20, and z = 30 into the
expression expr(x,y,z).
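
A sketch of this substitution performed one argument at a time. The function body x * y + z is an arbitrary illustrative choice for expr(x,y,z):

```scala
val curried3: Int => Int => Int => Int = x => y => z => x * y + z

val step1: Int => Int => Int = curried3(10)  // Substituted x = 10: y => z => 10 * y + z
val step2: Int => Int        = step1(20)     // Substituted y = 20: z => 10 * 20 + z
val step3: Int               = step2(30)     // Substituted z = 30: 10 * 20 + 30 = 230

val direct: Int = curried3(10)(20)(30)       // The same computation in one expression: 230
```

Each intermediate result is itself a function value that can be stored and applied later.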
This calculation is made easier by the convention that f(g)(h) means first apply-
ing f to g and then applying the result to h. In other words, function application
groups to the left: f(g)(h) = (f(g))(h). It would be confusing if function applica-
tion grouped to the right and f(g)(h) meant first applying g to h and then applying
f to the result. If that were the syntax convention, it would be harder to reason
about applying a curried function to its arguments.
We see that the right grouping of the function arrow => is well adapted to the left
grouping of function applications. All functional languages follow these syntactic
conventions.
To make calculations shorter, we will write code in a mathematical notation
rather than in the Scala syntax. Type annotations are written with a colon in the
superscript. For example, 𝑥 :Int → 𝑥 + 10 is the code notation corresponding to the
Scala expression (x: Int) => x + 10.
The symbolic evaluation of the Scala code ((x: Int) => (y: Int) => x - y)(20)(4)
can be written as:
(𝑥 :Int → 𝑦 :Int → 𝑥 − 𝑦)(20) (4)
apply function and substitute 𝑥 = 20 : = (𝑦 :Int → 20 − 𝑦)(4)
apply function and substitute 𝑦 = 4 : = 20 − 4 = 16 .
In the above step-by-step calculation, the colored underlines and comments at left
are added for clarity. A colored underline indicates a sub-expression that is going
to be rewritten at the next step.
Here we performed calculations by substituting an argument into a function at
each step. A compiled Scala program is evaluated in a similar way at run time.
Nameless functions are values and can be used as part of larger expressions, just
as any other values. For instance, nameless functions can be arguments of other
functions (nameless or not). Here is an example of applying a nameless function
𝑓 → 𝑓 (9) to the nameless function 𝑥 → 𝑥% 4:
( 𝑓 → 𝑓 (9)) (𝑥 → 𝑥% 4)
substitute 𝑓 = (𝑥 → 𝑥% 4) : = (𝑥 → 𝑥% 4)(9)
substitute 𝑥 = 9 : = 9% 4 = 1 .
In the nameless function 𝑓 → 𝑓 (9), the argument 𝑓 has to be itself a function,
otherwise the expression 𝑓 (9) would make no sense. The argument 𝑥 of 𝑓 (𝑥)
must be an integer, or else we would not be able to compute 𝑥% 4. The result of
computing 𝑓 (9) is 1, an integer. We conclude that 𝑓 must have type Int → Int, or
else the types do not match.
To verify this result in Scala, we need to specify a type annotation for 𝑓 :
scala> ((f: Int => Int) => f(9))(x => x % 4)
res2: Int = 1
No type annotation is needed for 𝑥 → 𝑥% 4 because the Scala compiler already
knows the type of 𝑓 and figures out that 𝑥 in 𝑥 → 𝑥% 4 must have type Int.
Let us summarize the syntax conventions for curried nameless functions:
• Function expressions group everything to the right: 𝑥 → 𝑦 → 𝑧 → 𝑒 means
𝑥 → (𝑦 → (𝑧 → 𝑒)).

• Function calls group everything to the left: 𝑓 (𝑥)(𝑦)(𝑧) means (( 𝑓 (𝑥))(𝑦))(𝑧).
The expression 𝑓 (𝑥) is a new function that is applied to 𝑦, giving again a
new function that is finally applied to 𝑧.
• Function applications group stronger than infix operations, so 𝑓 (𝑥) + 𝑦 means
( 𝑓 (𝑥)) + 𝑦, as usual in mathematics, and not 𝑓 (𝑥 + 𝑦).
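
The three conventions can be observed directly in Scala. In this sketch, the functions f and g are illustrative:

```scala
val f: Int => Int => Int => Int = x => y => z => x - y - z

// The function arrow groups to the right: the same type with explicit parentheses.
val fExplicit: Int => (Int => (Int => Int)) = f

val a = f(10)(3)(2)        // 5
val b = ((f(10))(3))(2)    // 5: function calls group to the left

val g: Int => Int = x => x * 2
val c = g(3) + 4           // 10: means (g(3)) + 4
val d = g(3 + 4)           // 14: a different expression
```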

Here are some more examples of performing function applications symbolically.
Types are omitted for brevity; every non-function value is of type Int:

(𝑥 → 𝑥 ∗ 2) (10) = 10 ∗ 2 = 20 .
( 𝑝 → 𝑧 → 𝑧 ∗ 𝑝) (𝑡) = (𝑧 → 𝑧 ∗ 𝑡) .
( 𝑝 → 𝑧 → 𝑧 ∗ 𝑝) (𝑡)(4) = (𝑧 → 𝑧 ∗ 𝑡)(4) = 4 ∗ 𝑡 .

Some results of these computations are integer values such as 20; other results are
nameless functions such as 𝑧 → 𝑧 ∗ 𝑡. Verify this in Scala (setting 𝑡 = 10):
scala> ((x: Int) => x * 2)(10)
res3: Int = 20

scala> ((p: Int) => (z: Int) => z * p)(10)
res4: Int => Int = <function1>

scala> ((p: Int) => (z: Int) => z * p)(10)(4)
res5: Int = 40
In the following examples, some arguments are themselves functions. Consider
an expression that uses the nameless function (𝑔 → 𝑔(2)) as an argument:

( 𝑓 → 𝑝 → 𝑓 ( 𝑝)) (𝑔 → 𝑔(2)) (4.6)


substitute 𝑓 = (𝑔 → 𝑔(2)) : = 𝑝 → (𝑔 → 𝑔(2)) ( 𝑝)
substitute 𝑔 = 𝑝 : = 𝑝 → 𝑝(2) . (4.7)

The final result, 𝑝 → 𝑝(2), cannot be simplified any more.


The function 𝑝 → 𝑝(2) applies its argument (𝑝) to the value 2. A possible value
for 𝑝 is the function 𝑥 → 𝑥 + 4. Let us apply expression (4.6) to 𝑥 → 𝑥 + 4:

( 𝑓 → 𝑝 → 𝑓 ( 𝑝)) (𝑔 → 𝑔(2)) (𝑥 → 𝑥 + 4)
use Eq. (4.7) : = ( 𝑝 → 𝑝(2)) (𝑥 → 𝑥 + 4)
substitute 𝑝 = (𝑥 → 𝑥 + 4) : = (𝑥 → 𝑥 + 4) (2)
substitute 𝑥 = 2 : = 2+4 = 6 .

To verify this calculation in Scala, we need to add appropriate type annotations
for 𝑓 and 𝑝. To figure out the types, we reason like this:
We know that the function 𝑓 → 𝑝 → 𝑓 ( 𝑝) is being applied to the arguments
𝑓 = (𝑔 → 𝑔(2)) and 𝑝 = (𝑥 → 𝑥 + 4). So, the argument 𝑓 in 𝑓 → 𝑝 → 𝑓 ( 𝑝) must be
a function that takes 𝑝 as an argument.
The variable 𝑥 in 𝑥 → 𝑥 + 4 must be of type Int. So, the type of the expression
𝑥 → 𝑥 + 4 is Int → Int, and the type of the argument 𝑝 must be the same. We write
𝑝 :Int→Int .
Finally, we need to make sure that the types match in the function 𝑓 → 𝑝 →
𝑓 ( 𝑝). Types match in 𝑓 ( 𝑝) if the type of 𝑓 ’s argument is the same as the type of
𝑝, which is Int → Int. So, 𝑓 ’s type must be (Int → Int) → 𝐴 for some type 𝐴.
Since in our example 𝑓 = (𝑔 → 𝑔(2)), types match only if 𝑔 has type Int → Int.
But then 𝑔(2) has type Int, and so we must have 𝐴 = Int. Thus, the type of 𝑓 is
(Int → Int) → Int. We know enough to write the Scala code now:
scala> ((f: (Int => Int) => Int) => (p: Int => Int) => f(p))(g => g(2))(x => x + 4)
res6: Int = 6
Type annotations for 𝑔 and 𝑥 may be omitted: Scala’s compiler can figure out
those types from the given types of 𝑓 and 𝑝. However, extra type annotations often
make code clearer.

4.3.2 Examples: Deriving a function’s type from its code


Checking that the types match is an important part of the functional programming
paradigm, both in the practice of writing code and in theoretical derivations of
laws for various functions. For instance, in the derivations of the composition
laws (Section 4.2.2), we were able to deduce the possible type parameters for 𝑓 , 𝑔,
and ℎ in the expression 𝑓 # 𝑔 # ℎ. This worked because the composition operation
andThen (denoted by the symbol # ) is fully parametric. Given a fully parametric
function, one can derive the most general type signature that matches the body
of that function. The same type-deriving procedure may also help in converting a
given function to a fully parametric form.
Let us look at some examples of doing this.
Example 4.3.2.1 The functions const and id were defined in Section 4.2.1. What
is the value const(id) and what is its type? Determine the most general type pa-
rameters in the expression const(id).
Solution We need to treat the functions const and id as values, since our goal
is to apply const to id. Write the code of these functions in a short notation:

$$ \text{const}^{C,X} \overset{\text{def}}{=} c^{:C} \to \_^{:X} \to c , \qquad \text{id}^{A} \overset{\text{def}}{=} a^{:A} \to a . $$

The types will match in the expression const(id) only if the argument of the func-
tion const has the same type as the type of id. Since const is a curried function,
we need to look at its first curried argument, which is of type 𝐶. The type of id is
𝐴 → 𝐴, where 𝐴 is (so far) an arbitrary type. So, the type parameter 𝐶 in const𝐶,𝑋
must be equal to 𝐴 → 𝐴:
𝐶=𝐴→𝐴 .
The type parameter 𝑋 in const𝐶,𝑋 is not constrained, so we keep it as 𝑋. The result
of applying const to id is of type 𝑋 → 𝐶, which equals 𝑋 → 𝐴 → 𝐴. In this way,
we find:
$$ \text{const}^{A\to A,\,X} (\text{id}^{A}) : X \to A \to A . $$
The types 𝐴 and 𝑋 remain arbitrary. The type 𝑋 → 𝐴 → 𝐴 is the most general type
for the expression const(id) because we have not made any assumptions about the
types except requiring that all functions must be always applied to arguments of
the correct types.
To compute the value of const(id), it remains to substitute the code of const and
id. Since we already checked the types, we may omit all type annotations:

const (id)
definition of const : = (𝑐 → 𝑥 → 𝑐)(id)
apply function, substitute 𝑐 = id : = 𝑥 → id
definition of id : =𝑥→𝑎→𝑎 .
The function (𝑥 → 𝑎 → 𝑎) takes an argument 𝑥 :𝑋 and returns the identity func-
tion 𝑎 :𝐴 → 𝑎. It is clear that the argument 𝑥 is ignored by this function. So, we can
rewrite it equivalently as:
const (id) = _:𝑋 → 𝑎 :𝐴 → 𝑎 .
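
This result may be checked in Scala. The sketch below uses the definitions of const and id from the text (under names that do not clash with the library); the name constId and the test values are illustrative:

```scala
def const[C, X](c: C): X => C = _ => c
def id[A]: A => A = a => a

// `const` applied to `id`, with the type parameter C = A => A and with X left arbitrary.
def constId[A, X]: X => A => A = const[A => A, X](id[A])

val r1 = constId[Int, String]("ignored")(42)   // 42
val r2 = constId[Boolean, Double](3.14)(true)  // true
```

The first argument is ignored, and the second argument is returned unchanged, as expected for the function _ → 𝑎 → 𝑎. The compiler accepts the definition of constId only because the inferred type 𝑋 → 𝐴 → 𝐴 matches the declared type.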
Example 4.3.2.2 Implement a function twice that takes a function f: Int => Int
as its argument and returns a function that applies f twice. For instance, if the
function f is { x => x + 3 }, the result of twice(f) should be equal to the function x
=> x + 6. Test this with the expression twice(x => x + 3)(10). After implementing
the function twice, generalize it to a fully parametric function.
Solution According to the requirements, the function twice must return a new
function of type Int => Int. So, the type signature of twice is:
def twice(f: Int => Int): Int => Int = ???
Since twice(f) must be a new function with an integer argument, we begin the
code of twice by writing a new nameless function { (x: Int) => ... }:
def twice(f: Int => Int): Int => Int = { (x: Int) => ??? }
The new function must apply f twice to its argument, that is, it must return
f(f(x)). We can finish the implementation now:
def twice(f: Int => Int): Int => Int = { x => f(f(x)) }
The type annotation (x: Int) can be omitted. Let us verify that twice(x => x+3)(10)
equals 10 + 6:
scala> val g = twice(x => x + 3) // Expect g to be equal to the function { x => x + 6 }.
g: Int => Int = <function1>

scala> g(10) // Expect twice(x => x + 3)(10) to be equal to (x => x + 6)(10) = 16.
res0: Int = 16
To transform twice into a fully parametric function means replacing its type sig-
nature by a fully parameterized type signature while keeping the function body
unchanged:
def twice[A, B, ...](f: ...): ... = { x => f(f(x)) }


To determine the type signature and the possible type parameters 𝐴, 𝐵, ..., we
need to determine the most general type that matches the function body. The
function body is the expression 𝑥 → 𝑓 ( 𝑓 (𝑥)). Assume that 𝑥 has type 𝐴; for types
to match in the sub-expression 𝑓 (𝑥), we need 𝑓 to have type 𝐴 → 𝐵 for some
type 𝐵. The sub-expression 𝑓 (𝑥) will then have type 𝐵. For types to match in
𝑓 ( 𝑓 (𝑥)), the argument of 𝑓 must have type 𝐵; but we already assumed 𝑓 :𝐴→𝐵 .
This is consistent only if 𝐴 = 𝐵. In this way, 𝑥 :𝐴 implies 𝑓 :𝐴→𝐴 , and the expression
𝑥 → 𝑓 ( 𝑓 (𝑥)) has type 𝐴 → 𝐴. We can now write the type signature of twice:
def twice[A](f: A => A): A => A = { x => f(f(x)) }
This fully parametric function can be written in the code notation as:

$$ \text{twice}^{A} \overset{\text{def}}{=} f^{:A\to A} \to x^{:A} \to f(f(x)) = f^{:A\to A} \to f \,\#\, f . \tag{4.8} $$

The procedure of deriving the most general type for a given code is called type
inference. In Example 4.3.2.2, the presence of the type parameter 𝐴 and the type
signature ( 𝐴 → 𝐴) → 𝐴 → 𝐴 have been “inferred” from the code 𝑓 → 𝑥 →
𝑓 ( 𝑓 (𝑥)).
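
Once generalized, the same code of twice works with any type A. A short sketch with illustrative arguments (the name twiceViaAndThen is hypothetical):

```scala
def twice[A](f: A => A): A => A = { x => f(f(x)) }

val r1 = twice[Int](x => x + 3)(10)            // 16
val r2 = twice[String](s => s + "!")("hi")     // "hi!!"
val r3 = twice[List[Int]](xs => 0 :: xs)(Nil)  // List(0, 0)

// The equivalent definition via forward composition, as in Eq. (4.8):
def twiceViaAndThen[A](f: A => A): A => A = f andThen f

val r4 = twiceViaAndThen[Int](x => x + 3)(10)  // 16
```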
Example 4.3.2.3 Consider the fully parametric function twice defined in Exam-
ple 4.3.2.2. What is the most general type of twice(twice), and what computation
does it perform? Test your answer on the expression twice(twice)(x => x + 3)(10).
What are the type parameters in that expression?
Solution Note that twice(twice) means that the function twice is used as its own
argument, i.e., this is twice(f) with f = twice. We begin by assuming unknown
type parameters as twice[A](twice[B]). The function twice[A] of type ( 𝐴 → 𝐴) →
𝐴 → 𝐴 can be applied to the argument twice[B] only if twice[B] has type 𝐴 → 𝐴.
But twice[B] is of type (𝐵 → 𝐵) → 𝐵 → 𝐵. The symbol → groups to the right, so
we have:
(𝐵 → 𝐵) → 𝐵 → 𝐵 = (𝐵 → 𝐵) → (𝐵 → 𝐵) .
This can match with 𝐴 → 𝐴 only if we set 𝐴 = (𝐵 → 𝐵). So, the most general type
of twice(twice) is:

$$ \text{twice}^{B\to B} (\text{twice}^{B}) : (B \to B) \to B \to B . \tag{4.9} $$

After checking that types match, we may omit types from further calculations.
Example 4.3.2.2 defined twice with the def syntax. To use twice as an argument in
the expression twice(twice), it is convenient to define twice as a value, val twice
= ... However, the function twice needs type parameters, and Scala 2 does not
directly support val definitions with type parameters. Scala 3 supports type pa-
rameters appearing together with arguments in a nameless function:
val twice = [A] => (f: A => A) => (x: A) => f(f(x)) // Valid only in Scala 3.

Keeping this in mind, we use the definition of twice from Eq. (4.8): twice ( 𝑓 ) =
𝑓 # 𝑓 , which omits the curried argument 𝑥 :𝐴 and makes the calculation shorter.
Substituting that into twice(twice), we find:

twice (twice) = twice # twice
expand function composition : = 𝑓 → twice (twice ( 𝑓 ))
definition of twice ( 𝑓 ) : = 𝑓 → twice ( 𝑓 # 𝑓 )
definition of twice : = 𝑓 → 𝑓 # 𝑓 # 𝑓 # 𝑓 .

This clearly shows that twice(twice) is a function applying its (function-typed)
argument four times.
The types in twice(twice)(x => x + 3) follow from Eq. (4.9): since x => x + 3 has
type Int => Int, types will match only if we set 𝐵 = Int. The result is twice[Int =>
Int](twice[Int]). To test, we need to write at least one type parameter in the code,
or else Scala cannot correctly infer the types in twice(twice):
scala> twice(twice[Int])(x => x + 3)(10) // Or write `twice[Int => Int](twice)(x => x + 3)(10)`.
res0: Int = 22

This confirms that twice(twice)(x => x + 3) equals the function x => x + 12.
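
The count of four applications is easy to see with 𝐵 = String, where each application leaves a visible trace. In this sketch, both type parameters are written explicitly, and the name four is illustrative:

```scala
def twice[A](f: A => A): A => A = { x => f(f(x)) }

// twice(twice) with the type parameters A = String => String and B = String.
val four: (String => String) => String => String =
  twice[String => String](twice[String])

val r1 = four(s => s + "a")("")   // "aaaa": the argument function was applied 4 times

// The same expression with B = Int, as in the text:
val r2 = twice[Int => Int](twice[Int])(x => x + 3)(10)  // 10 + 4 * 3 = 22
```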
Example 4.3.2.4 (a) Infer a general type signature with type parameter(s) for the
given function p:
def p[...]:... = { f => f(2) }

(b) Could we choose the type parameters in the expression p(p) such that the types
match?
Solution (a) In the nameless function 𝑓 → 𝑓 (2), the argument 𝑓 must be itself
a function with an argument of type Int, otherwise the sub-expression 𝑓 (2) is ill-
typed. So, types will match if 𝑓 has type Int → Int or Int → String or similar. The
most general case is when 𝑓 has type Int → 𝐴, where 𝐴 is an arbitrary type (i.e.,
a type parameter); then the value 𝑓 (2) has type 𝐴. Since the nameless function
𝑓 → 𝑓 (2) has an argument 𝑓 of type Int → 𝐴 and a result 𝑓 (2) of type 𝐴, we find
that the type of 𝑝 must be (Int → 𝐴) → 𝐴. With this type assignment, all types
match. The type parameter 𝐴 remains undetermined and is added to the type
signature of the function p. The code is:
def p[A]: (Int => A) => A = { f => f(2) }
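
A usage sketch showing that the type parameter A of p may be set to different types, including a function type. The argument functions are illustrative:

```scala
def p[A]: (Int => A) => A = { f => f(2) }

val r1 = p[Int](x => x + 4)        // 2 + 4 = 6
val r2 = p[String](n => "ab" * n)  // "abab"

// With A = Int => Int, the result of p is itself a function (y => 2 * y).
val r3 = p[Int => Int](x => y => x * y)(5)  // 10
```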

(b) The expression p(p) applies p to itself, just as twice(twice) did in Exam-
ple 4.3.2.3. Begin by writing p(p) with unknown type parameters: p[A](p[B]). Then
try to choose A and B so that the types match in that expression. Does the type of
p[B], which is (Int => B) => B, match the type of the argument of p[A], which is
Int => A, with some choice of A and B? A function type P => Q matches X => Y only
if P = X and Q = Y. So, (Int => B) => B can match Int => A only if Int => B matches
Int and if B = A. But it is impossible for Int => B to match Int, no matter how we
choose B.

Notation              Scala syntax         Comments
𝑥 :𝐴                  x: A                 a value or an argument of type A
𝑓 :𝐴→𝐵                f: A => B            a function of type A => B
𝑥 :Int → 𝑥 + 1        (x: Int) => x + 1    a nameless function of type Int => Int
𝑓 𝐴,𝐵 ≝ ...           def f[A, B] = ...    a function with type parameters
id 𝐴 , also id:𝐴→𝐴    identity[A]          the standard “identity” function
𝐴 → 𝐵 → 𝐶             A => B => C          the type of a curried function
𝑓 # 𝑔                 f andThen g          forward composition of functions
𝑔 ◦ 𝑓                 g compose f          backward composition of functions

Table 4.1: Some notation for symbolic reasoning about code.

We conclude that the expression p[A](p[B]) has a problem: for any choice of A
and B, some type will be mismatched. One says that the expression p(p) is not
well-typed. Such expressions contain a type error and are rejected by the Scala
compiler. 
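
As a quick illustration of part (a), the sketch below applies p with two different choices of the type parameter A. The value names fromInt and fromString are hypothetical. The ill-typed self-application from part (b) is shown only as a comment, since it cannot be compiled:

```scala
def p[A]: (Int => A) => A = { f => f(2) }

val fromInt: Int = p[Int](x => x + 1)                 // Here A = Int.
val fromString: String = p[String](x => "value " + x) // Here A = String.

// The self-application is rejected by the compiler for any choice of A and B,
// so it is left commented out:
// val bad = p[A](p[B])
```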
In the examples seen so far, we inferred the most general type of a code ex-
pression simply by trying to make all function types match the types of their
arguments. The Damas-Hindley-Milner algorithm2 performs type inference (or
determines that there is a type error) for any code containing functions, tuples,
and disjunctive types.

4.4 Summary
Table 4.1 shows the notations introduced in this chapter.
What can we do using this chapter’s techniques?
• Make functions that return new functions and/or take functions as argu-
ments.
• Simplify expressions symbolically when functions are applied to arguments.
• Derive a general type for a given code expression (perform type inference).
• Convert functions to a fully parametric form when possible.
The following examples and exercises illustrate these techniques further.
2 https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system#Algorithm_W


4.4.1 Examples
Example 4.4.1.1 Implement a function that applies a given function 𝑓 repeatedly
to an initial value 𝑥0 , until a given function cond returns true:
def converge[X](f: X => X, x0: X, cond: X => Boolean): X = ???

Solution We call find on a lazily evaluated stream that keeps applying f; the search
stops at the first value for which the condition is true:
def converge[X](f: X => X, x0: X, cond: X => Boolean): X =
Stream.iterate(x0)(f) // Type is Stream[X].
.find(cond) // Type is Option[X].
.get // Type is X.

The method get is a partial function that can be applied only to non-empty Option
values. It is safe to call get here, because the stream is unbounded and, if the
condition cond never becomes true, the program will run out of memory (since
Stream.iterate keeps all computed values in memory) or the user will run out of
patience. So, _.find(cond) can never return an empty Option value. Of course, it is
not satisfactory that the program crashes when the sequence does not converge.
Exercise 4.4.2.2 will implement a safer version of this function by limiting the
allowed number of iterations.
A tail-recursive implementation that works in constant memory is:
import scala.annotation.tailrec // Needed for the @tailrec annotation.
@tailrec def converge[X](f: X => X, x0: X, cond: X => Boolean): X =
  if (cond(x0)) x0 else converge(f, f(x0), cond)

To test this code, compute an approximation to $\sqrt{q}$ by Newton’s method with the
iteration function $f(x) = \frac{1}{2}\left(x + \frac{q}{x}\right)$. We iterate 𝑓 (𝑥) starting with 𝑥₀ = 𝑞/2 until a
given precision is obtained:
def approx_sqrt(q: Double, precision: Double): Double = {
def cond(x: Double): Boolean = math.abs(x * x - q) <= precision
def iterate_sqrt(x: Double): Double = 0.5 * (x + q / x)
converge(iterate_sqrt, q / 2, cond)
}

Newton’s method for $\sqrt{q}$ is guaranteed to converge when 𝑞 ≥ 0. Test it:
scala> approx_sqrt(25, 1.0e-8)
res0: Double = 5.000000000016778
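
In Scala 2.13 and later, Stream is deprecated in favor of LazyList. A minimal sketch of the same find-based approach with LazyList follows. The names convergeLazy, approxSqrtLazy, and root are hypothetical and are chosen only to avoid clashing with the definitions above:

```scala
// Same idea as the Stream-based `converge`, written with LazyList (Scala 2.13+).
def convergeLazy[X](f: X => X, x0: X, cond: X => Boolean): X =
  LazyList.iterate(x0)(f) // Type is LazyList[X], evaluated on demand.
    .find(cond)           // Type is Option[X].
    .get                  // Type is X. Does not return normally if `cond` never holds.

// Newton's iteration for the square root, as in approx_sqrt above.
def approxSqrtLazy(q: Double, precision: Double): Double =
  convergeLazy[Double](x => 0.5 * (x + q / x), q / 2, x => math.abs(x * x - q) <= precision)

val root: Double = approxSqrtLazy(25, 1.0e-8)
```

The result should be close to 5, as with the Stream-based and tail-recursive versions.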

Example 4.4.1.2 Using both def and val, define a Scala function that takes an
integer x and returns a function that adds x to its argument.
Solution Let us first write down the required type signature. The function
must take an integer argument x: Int, and the return value must be a function of
type Int => Int:
def add_x(x: Int): Int => Int = ???

We are required to return a function that adds x to its argument. Let us call that
argument z, to avoid confusion with x. So, we are required to return the function
{ z => z + x }. Since functions are values, we return a new function by writing a
nameless function expression:
def add_x(x: Int): Int => Int = { z => z + x }

To implement the same function by using a val, we first convert the type signature
of add_x to the equivalent curried type Int → Int → Int. Now we can write the
Scala code of a function add_x_v:
val add_x_v: Int => Int => Int = { x => z => z + x }

The function add_x_v is equal to add_x except for using the val syntax instead of def.
We do not need to write the type of the arguments x and z since we already wrote
the type Int → Int → Int of add_x_v.
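
A short usage sketch shows that both definitions behave the same way when applied to curried arguments. The names add5, viaDef, and viaVal are introduced only for this illustration:

```scala
def add_x(x: Int): Int => Int = { z => z + x }
val add_x_v: Int => Int => Int = { x => z => z + x }

val add5: Int => Int = add_x(5) // A new function that adds 5 to its argument.
val viaDef: Int = add5(10)
val viaVal: Int = add_x_v(5)(10)
```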
Example 4.4.1.3 Using def and val, implement a curried function prime_f that
takes a function 𝑓 and an integer 𝑥, and returns true when 𝑓 (𝑥) is prime. Use the
function isPrime from Section 1.1.2.
Solution First, determine the required type signature of prime_f. The value
𝑓 (𝑥) must have type Int, or else we cannot check whether it is prime. So, 𝑓 must
have type Int → Int. Since prime_f should be a curried function, we need to put
each argument into its own set of parentheses:
def prime_f(f: Int => Int)(x: Int): Boolean = ???

To implement prime_f, we need to return the result of isPrime applied to f(x). A
simple solution is:
def prime_f(f: Int => Int)(x: Int): Boolean = isPrime(f(x))

To implement the same function using val, rewrite its type signature as:
val prime_f: (Int => Int) => Int => Boolean = ???

(The parentheses around Int => Int are mandatory as Int => Int => Int => Boolean
would be a completely different type.) The implementation is:
val prime_f: (Int => Int) => Int => Boolean = { f => x => isPrime(f(x)) }

The code isPrime(f(x)) is a forward composition of the functions f and isPrime, so
we can write:
val prime_f: (Int => Int) => Int => Boolean = (f => f andThen isPrime)

A nameless function of the form f => f.something is equivalent to a shorter Scala
syntax (_.something). We finally rewrite the code of prime_f as:
val prime_f: (Int => Int) => Int => Boolean = (_ andThen isPrime)
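
To try these definitions, one needs some implementation of isPrime. The sketch below uses a simple stand-in written as a function value; this is an assumption, and the code in Section 1.1.2 may differ. The name prime_f_v is used here for the val-based version only so that both versions can coexist in one snippet:

```scala
// Stand-in for `isPrime` from Section 1.1.2 (an assumption; the book's code may differ).
val isPrime: Int => Boolean = { n =>
  n >= 2 && (2 to n - 1).takeWhile(k => k * k <= n).forall(k => n % k != 0)
}

def prime_f(f: Int => Int)(x: Int): Boolean = isPrime(f(x))
val prime_f_v: (Int => Int) => Int => Boolean = (_ andThen isPrime)

val check5: Boolean = prime_f(x => x * x + 1)(2)    // Is 2 * 2 + 1 = 5 prime?
val check10: Boolean = prime_f_v(x => x * x + 1)(3) // Is 3 * 3 + 1 = 10 prime?
```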

Example 4.4.1.4 Implement a function choice(x, p, f, g) that takes a value 𝑥, a
predicate 𝑝, and two functions 𝑓 and 𝑔. The return value must be 𝑓 (𝑥) if 𝑝(𝑥)
returns true; otherwise the return value must be 𝑔(𝑥). Infer the most general type
for this function.
Solution The code of this function must be:
def choice[...](x, p, f, g) = if (p(x)) f(x) else g(x)

To infer the most general type for this code, begin by assuming that 𝑥 has type 𝐴,
where 𝐴 is a type parameter. Then the predicate 𝑝 must have type A => Boolean.
Since 𝑝 is an arbitrary predicate, the value 𝑝(𝑥) will be sometimes true and some-
times false. So, choice(x, p, f, g) will sometimes compute 𝑓 (𝑥) and sometimes
𝑔(𝑥). It follows that type 𝐴 must be the argument type of both 𝑓 and 𝑔, which
means that the most general types so far are 𝑓 :𝐴→𝐵 and 𝑔 :𝐴→𝐶 , yielding the type
signature:
choice(𝑥 :𝐴 , 𝑝 :𝐴→Boolean , 𝑓 :𝐴→𝐵 , 𝑔 :𝐴→𝐶 ) .
What could be the return type of choice(x, p, f, g)? If 𝑝(𝑥) returns true, the
function choice returns 𝑓 (𝑥), which is of type 𝐵. Otherwise, choice returns 𝑔(𝑥),
which is of type 𝐶. However, the type signature of choice must be fixed in advance
(at compile time) and cannot depend on the value 𝑝(𝑥) computed at run time. So,
the types of 𝑓 (𝑥) and of 𝑔(𝑥) must be the same, 𝐵 = 𝐶. The type signature of choice
will thus have only two type parameters, 𝐴 and 𝐵:
def choice[A, B](x: A, p: A => Boolean, f: A => B, g: A => B): B =
if (p(x)) f(x) else g(x)
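
A usage sketch with A = Int and B = String follows. The type parameters are written out explicitly to keep type inference simple; the value names small and large are hypothetical:

```scala
def choice[A, B](x: A, p: A => Boolean, f: A => B, g: A => B): B =
  if (p(x)) f(x) else g(x)

// Both branches must return the same type B (here, String).
val small: String = choice[Int, String](3, _ < 10, n => s"$n is small", n => s"$n is large")
val large: String = choice[Int, String](30, _ < 10, n => s"$n is small", n => s"$n is large")
```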

Example 4.4.1.5 Infer the most general type for the fully parametric function:
def q[...]: ... = { f => g => g(f) }
What types are inferred for the expressions q(q) and q(q(q))?
Solution To begin, assume 𝑓 :𝐴 with a type parameter 𝐴. In the sub-expression
𝑔 → 𝑔( 𝑓 ), the curried argument 𝑔 must itself be a function, because it is being
applied to 𝑓 as 𝑔( 𝑓 ). So, we assign types as 𝑓 :𝐴 → 𝑔 :𝐴→𝐵 → 𝑔( 𝑓 ), where 𝐴 and
𝐵 are type parameters. Then the final returned value 𝑔( 𝑓 ) has type 𝐵. Since there
are no other constraints on the types, the types 𝐴 and 𝐵 remain arbitrary, so we
add them to the type signature:
def q[A, B]: A => (A => B) => B = { f => g => g(f) }
To match types in the expression q(q), we first assume arbitrary type parame-
ters and write q[A, B](q[C, D]). We need to introduce new type parameters 𝐶, 𝐷
because those type parameters may need to be set differently from 𝐴, 𝐵 when we
try to match the types in the expression q(q).
The type of the first curried argument of q[A, B], which is 𝐴, must match the
entire type of q[C, D], which is 𝐶 → (𝐶 → 𝐷) → 𝐷. So, we must choose 𝐴 as:
𝐴 = 𝐶 → (𝐶 → 𝐷) → 𝐷 .
The type of q(q) becomes:
𝑞 𝐴,𝐵 (𝑞𝐶,𝐷 ) : ((𝐶 → (𝐶 → 𝐷) → 𝐷) → 𝐵) → 𝐵 ,
where 𝐴 = 𝐶 → (𝐶 → 𝐷) → 𝐷 .


There are no other constraints on the type parameters 𝐵, 𝐶, 𝐷.


We use this result to infer the most general type for q(q(q)). Denote 𝑟 ≝ 𝑞(𝑞)
for brevity; then, as we just found, 𝑟 has type ((𝐶 → (𝐶 → 𝐷) → 𝐷) → 𝐵) → 𝐵.
To infer types in the expression q(r), we introduce new type parameters 𝐸, 𝐹 and
write q[E, F](r). The type of the argument of q[E, F] is 𝐸, and this must be the
same as the type of 𝑟. This gives the constraint:
𝐸 = ((𝐶 → (𝐶 → 𝐷) → 𝐷) → 𝐵) → 𝐵 .
Other than that, the type parameters are arbitrary. The type of the expression
q(q(q)) is (𝐸 → 𝐹) → 𝐹. We conclude that the most general type of q(q(q)) is:

𝑞 𝐸,𝐹 (𝑞 𝐴,𝐵 (𝑞𝐶,𝐷 )) : ((((𝐶 → (𝐶 → 𝐷) → 𝐷) → 𝐵) → 𝐵) → 𝐹) → 𝐹 ,


where 𝐴 = 𝐶 → (𝐶 → 𝐷) → 𝐷
and 𝐸 = ((𝐶 → (𝐶 → 𝐷) → 𝐷) → 𝐵) → 𝐵 .
It is clear from this derivation that expressions such as q(q(q(q))), q(q(q(q(q)))),
etc., are well-typed.
Let us test these results in Scala, renaming the type parameters for clarity to A,
B, C, D:
scala> def qq[A, B, C]: ((A => (A => B) => B) => C) => C = q(q)
qq: [A, B, C]=> ((A => ((A => B) => B)) => C) => C

scala> def qqq[A, B, C, D]: ((((A => (A => B) => B) => C) => C) => D) => D =
q(q(q))
qqq: [A, B, C, D]=> ((((A => ((A => B) => B)) => C) => C) => D) => D
We did not need to write any type parameters within the expressions q(q) and
q(q(q)) because the full type signature was declared for each of these expressions.
Since the Scala compiler did not print any error messages, we are assured that the
types match correctly.
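
The type assignment derived above can also be written out explicitly, which avoids relying on type inference. This is a sketch under the assumption that q is defined as in this example; the value name result and the test function are hypothetical:

```scala
def q[A, B]: A => (A => B) => B = { f => g => g(f) }

// q(q) with all type parameters written out: the first type parameter of the
// outer q is set to the entire type of the inner q.
def qq[A, B, C]: ((A => (A => B) => B) => C) => C =
  q[A => (A => B) => B, C](q[A, B])

// Apply qq to a function k that expects q[Int, String] as its argument.
val result: String = qq[Int, String, String](k => k(5)(n => "got " + n))
```

Here qq passes q[Int, String] to the function k, which then applies it to 5 and to a function of type Int => String.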
Example 4.4.1.6 For the following expressions, infer the most general types or
show that the expression is not well-typed with simple types:
(a) 𝑓 → 𝑓 ( 𝑓 ) .
(b) 𝑓 → 𝑓 (ℎ → ℎ( 𝑓 )) .
(c) 𝑓 → 𝑔 → 𝑓 (ℎ → ℎ(𝑔)) .
By “simple types” we mean that 𝑓 , 𝑔, ℎ cannot have their own type parameters.
Solution (a) The type of 𝑓 is unknown, so we begin by assigning an arbitrary
type 𝐴 to it. Types now need to match in the expression 𝑓 ( 𝑓 ) with 𝑓 :𝐴 . So, the
type 𝐴 must be a function type whose argument is again of type 𝐴. We can write
that function type as 𝐴 → 𝐵, where 𝐵 is another arbitrary type. Now, types match
only if 𝐴 and 𝐴 → 𝐵 are the same type. But there are no simple types 𝐴 and 𝐵 such
that 𝐴 = 𝐴 → 𝐵. So, the expression 𝑓 → 𝑓 ( 𝑓 ) is not well-typed.
This conclusion holds only because we do not allow the function 𝑓 to have
its own type parameters. Otherwise, the expression 𝑓 ( 𝑓 ) could be well-typed.

See, for instance, Example 4.3.2.3 showing that the expression twice(twice) is well-
typed.
(b) Begin by assigning type parameters as 𝑓 :𝐴 and ℎ:𝐵 , where 𝐴 and 𝐵 are un-
known. To match types in ℎ( 𝑓 ), the type of ℎ must be a function type with an
argument of type 𝐴. So, we must have 𝐵 = 𝐴 → 𝐶, where 𝐶 is unknown. Then
ℎ( 𝑓 ) has type 𝐶, and ℎ → ℎ( 𝑓 ) has type ( 𝐴 → 𝐶) → 𝐶. This is the type of an
argument of 𝑓 , so 𝐴 = (( 𝐴 → 𝐶) → 𝐶) → 𝐷, where 𝐷 is unknown. But we cannot
have a simple type 𝐴 that satisfies the type equation 𝐴 = (( 𝐴 → 𝐶) → 𝐶) → 𝐷.
We conclude that the expression 𝑓 → 𝑓 (ℎ → ℎ( 𝑓 )) is not well-typed.
(c) Begin by assigning type parameters as 𝑓 :𝐴 , 𝑔 :𝐵 , ℎ:𝐶 . To match types in ℎ(𝑔),
we must have 𝐶 = 𝐵 → 𝐷. Then ℎ → ℎ(𝑔) has type 𝐶 → 𝐷, and that must be the
type of 𝑓 ’s argument. So, we must have:

𝐴 = (𝐶 → 𝐷) → 𝐸 = ((𝐵 → 𝐷) → 𝐷) → 𝐸 .

There are no other restrictions. We have found the most general type:

𝑓 :((𝐵→𝐷)→𝐷)→𝐸 → 𝑔 :𝐵 → 𝑓 (ℎ:𝐵→𝐷 → ℎ(𝑔)) : (((𝐵 → 𝐷) → 𝐷) → 𝐸) → 𝐵 → 𝐸 .

The type parameters 𝐵, 𝐷, 𝐸 remain arbitrary.
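
The inferred type for part (c) can be checked by giving it to the Scala compiler as a declared type signature. This is a minimal sketch; the names exprC and len are hypothetical:

```scala
// Expression (c) with the inferred types written out.
def exprC[B, D, E]: (((B => D) => D) => E) => B => E =
  f => g => f(h => h(g))

// Example with B = Int, D = String, E = Int.
val len: Int = exprC[Int, String, Int](k => k(_.toString).length)(12345)
```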


Example 4.4.1.7 Infer types in the code expression:

( 𝑓 → 𝑔 → 𝑔( 𝑓 )) ( 𝑓 → 𝑔 → 𝑔( 𝑓 )) ( 𝑓 → 𝑓 (10)) ,

and simplify the code through symbolic calculations.


Solution The given expression is a curried function 𝑓 → 𝑔 → 𝑔( 𝑓 ) applied to
two curried arguments. The plan is to consider each of these sub-expressions in
turn, assigning types for them using type parameters, and then to figure out how
to set the type parameters so that all types match.
Begin by renaming the shadowed variables ( 𝑓 and 𝑔) to remove shadowing:

( 𝑓 → 𝑔 → 𝑔( 𝑓 )) (𝑥 → 𝑦 → 𝑦(𝑥)) (ℎ → ℎ(10)) . (4.10)

As we have seen in Example 4.4.1.5, the sub-expression 𝑓 → 𝑔 → 𝑔( 𝑓 ) is typed
as 𝑓 :𝐴 → 𝑔 :𝐴→𝐵 → 𝑔( 𝑓 ), where 𝐴 and 𝐵 are some type parameters. The
sub-expression 𝑥 → 𝑦 → 𝑦(𝑥) is the same function as 𝑓 → 𝑔 → 𝑔( 𝑓 ) but with possibly
different type parameters, say, 𝑥 :𝐶 → 𝑦 :𝐶→𝐷 → 𝑦(𝑥). The types 𝐴, 𝐵, 𝐶, 𝐷 are so
far unknown.
Finally, the variable ℎ in the sub-expression ℎ → ℎ(10) must have type Int → 𝐸,
where 𝐸 is another type parameter. So, the sub-expression ℎ → ℎ(10) is a function
of type (Int → 𝐸) → 𝐸.
The types must match in the entire expression (4.10):

( 𝑓 :𝐴 → 𝑔 :𝐴→𝐵 → 𝑔( 𝑓 ))(𝑥 :𝐶 → 𝑦 :𝐶→𝐷 → 𝑦(𝑥))(ℎ:Int→𝐸 → ℎ(10)) . (4.11)



It follows that 𝑓 must have the same type as 𝑥 → 𝑦 → 𝑦(𝑥), while 𝑔 must have the
same type as ℎ → ℎ(10). The type of 𝑔, which we know as 𝐴 → 𝐵, will match the
type of ℎ → ℎ(10), which we know as (Int → 𝐸) → 𝐸, only if 𝐴 = (Int → 𝐸) and
𝐵 = 𝐸. It follows that 𝑓 has type Int → 𝐸. At the same time, the type of 𝑓 must
match the type of 𝑥 → 𝑦 → 𝑦(𝑥), which is 𝐶 → (𝐶 → 𝐷) → 𝐷. This can work only
if 𝐶 = Int and 𝐸 = (𝐶 → 𝐷) → 𝐷 = (Int → 𝐷) → 𝐷.
In this way, we have found all the relationships between the type parameters 𝐴,
𝐵, 𝐶, 𝐷, 𝐸 in Eq. (4.11). The type 𝐷 remains arbitrary, while the type parameters
𝐴, 𝐵, 𝐶, 𝐸 are expressed as:

𝐴 = Int → (Int → 𝐷) → 𝐷 , (4.12)


𝐵 = 𝐸 = (Int → 𝐷) → 𝐷 , (4.13)
𝐶 = Int .

The entire expression in Eq. (4.11) is a full application of a curried function, and
thus has the same type as the “final” result expression 𝑔( 𝑓 ), which has type 𝐵. So,
the entire expression in Eq. (4.11) has type 𝐵 = (Int → 𝐷) → 𝐷.
Having established that types match, we can now omit the type annotations
and rewrite the code:

( 𝑓 → 𝑔 → 𝑔( 𝑓 )) (𝑥 → 𝑦 → 𝑦(𝑥)) (ℎ → ℎ(10))
substitute 𝑓 = 𝑥 → 𝑦 → 𝑦(𝑥) :  = (𝑔 → 𝑔(𝑥 → 𝑦 → 𝑦(𝑥))) (ℎ → ℎ(10))
substitute 𝑔 = ℎ → ℎ(10) :  = (ℎ → ℎ(10)) (𝑥 → 𝑦 → 𝑦(𝑥))
substitute ℎ = 𝑥 → 𝑦 → 𝑦(𝑥) :  = (𝑥 → 𝑦 → 𝑦(𝑥)) (10)
substitute 𝑥 = 10 :  = 𝑦 → 𝑦(10) .

The type of this expression is (Int → 𝐷) → 𝐷 with a type parameter 𝐷. Since the
argument 𝑦 is an arbitrary function, we cannot simplify either 𝑦(10) or 𝑦 → 𝑦(10)
any further. So, the final simplified form of Eq. (4.10) is 𝑦 :Int→𝐷 → 𝑦(10).
To test this, we first define the function 𝑓 → 𝑔 → 𝑔( 𝑓 ) as in Example 4.4.1.5:
def q[A, B]: A => (A => B) => B = { f => g => g(f) }

We also define the function ℎ → ℎ(10) with a general type (Int → 𝐸) → 𝐸:


def r[E]: (Int => E) => E = { h => h(10) }

To help Scala evaluate Eq. (4.11), we need to set the type parameters for the first q
function as q[A, B] where 𝐴 and 𝐵 are given by Eqs. (4.12)–(4.13):
scala> def s[D] = q[Int => (Int => D) => D, (Int => D) => D](q)(r)
s: [D]=> (Int => D) => D

To verify that the function 𝑠 𝐷 indeed equals 𝑦 :Int→𝐷 → 𝑦(10), we apply 𝑠 𝐷 to some
functions of type Int → 𝐷, say, with 𝐷 = Boolean or 𝐷 = Int:
scala> s(_ > 0) // Set D = Boolean and evaluate (10 > 0).
res6: Boolean = true

scala> s(_ + 20) // Set D = Int and evaluate (10 + 20).
res7: Int = 30

Example 4.4.1.8 Compute (𝑥 → 𝑦 → 𝑥(𝑥(𝑦))) # ( 𝑝 → 𝑝(2)) # (𝑧 → 𝑧 + 3) symbolically
and infer types.
Solution The forward composition 𝑓 # 𝑔 substitutes the body of 𝑓 into the argu-
ment of 𝑔:
substitute 𝑦 = 𝑓 (𝑥) : (𝑥 → 𝑓 (𝑥)) # (𝑦 → 𝑔(𝑦)) = (𝑥 → 𝑔( 𝑓 (𝑥))) .
Here, we substituted 𝑓 (𝑥) instead of 𝑦 in 𝑔(𝑦) and obtained 𝑔( 𝑓 (𝑥)). This shows
how to compute the forward compositions left to right:
(𝑥 → 𝑦 → 𝑥(𝑥(𝑦))) # ( 𝑝 → 𝑝(2)) = 𝑥 → (𝑦 → 𝑥(𝑥(𝑦)))(2) = 𝑥 → 𝑥(𝑥(2)) .
(𝑥 → 𝑥(𝑥(2))) # (𝑧 → 𝑧 + 3) = 𝑥 → 𝑥(𝑥(2)) + 3 .
Computing the pairwise combinations in another order, we get the same result:
first compute : ( 𝑝 → 𝑝(2)) # (𝑧 → 𝑧 + 3) = 𝑝 → 𝑝(2) + 3 .
then compute : (𝑥 → 𝑦 → 𝑥(𝑥(𝑦))) # ( 𝑝 → 𝑝(2) + 3)
= 𝑥 → (𝑦 → 𝑥(𝑥(𝑦))) (2) + 3
= 𝑥 → 𝑥(𝑥(2)) + 3 .
This is to be expected due to the associativity law (4.3). Types are inferred as:
(𝑥 → 𝑦 → 𝑥(𝑥(𝑦))) :(Int→Int)→(Int→Int) # ( 𝑝 → 𝑝(2)) :(Int→Int)→Int # (𝑧 → 𝑧 + 3) :Int→Int .
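
The symbolic result can be compared with a direct computation in Scala. In this sketch, the three functions are given the types inferred above, and the names a, b, c, composed, simplified, viaComposition, and viaSimplified are introduced only for illustration:

```scala
val a: (Int => Int) => (Int => Int) = x => y => x(x(y))
val b: (Int => Int) => Int = p => p(2)
val c: Int => Int = z => z + 3

val composed: (Int => Int) => Int = a andThen b andThen c
val simplified: (Int => Int) => Int = x => x(x(2)) + 3 // Result of the symbolic calculation.

val viaComposition: Int = composed(_ * 10)
val viaSimplified: Int = simplified(_ * 10)
```

Agreement on a few test functions does not prove that the two functions are equal, but it is a useful sanity check of the derivation.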
Example 4.4.1.9 We are given a function 𝑞 :𝐴→𝐴 , and we only know that for any
𝑓 :𝐴→𝐴 the law 𝑓 # 𝑞 = 𝑞 # 𝑓 holds (i.e., 𝑞 commutes with every function). Show that
𝑞 must be an identity function.
Solution Since the law must hold for any 𝑓 , we may choose 𝑓 at will. Let us
fix a value 𝑧 :𝐴 and choose 𝑓 ≝ (_ → 𝑧), that is, a constant function returning 𝑧.
Applying both sides of the law 𝑓 # 𝑞 = 𝑞 # 𝑓 to an arbitrary 𝑥 :𝐴 , we get:
( 𝑓 # 𝑞)(𝑥) = 𝑞( 𝑓 (𝑥)) = 𝑞(𝑧) , (𝑞 # 𝑓 )(𝑥) = 𝑓 (𝑞(𝑥)) = 𝑧 .
It follows that 𝑞(𝑧) = 𝑧 for any chosen 𝑧 :𝐴 . In other words, 𝑞 is an identity function
(id:𝐴→𝐴 ).

4.4.2 Exercises
Exercise 4.4.2.1 Revise the function from Exercise 1.6.2.4, making it a curried
function and replacing the hard-coded number 100 by a curried first argument.
The type signature should become Int => List[List[Int]] => List[List[Int]].

Exercise 4.4.2.2 Implement the function converge from Example 4.4.1.1 as a curried
function with an additional argument to set the maximum number of iterations,
returning Option[X] as the final result type. The new version of converge
should return None if the convergence condition is not satisfied after the given
maximum number of iterations. The type signature and an example test:
@tailrec def convergeN[X](cond: X => Boolean)(x0: X)(maxIter: Int)(f: X => X): Option[X] = ???

scala> convergeN[Int](_ < 0)(0)(10)(_ + 1) // This does not converge.
res0: Option[Int] = None

scala> convergeN[Double]{ x => math.abs(x * x - 25) < 1e-8 }(1.0)(10) { x => 0.5 * (x + 25 / x ) }
res1: Option[Double] = Some(5.000000000053722)

Exercise 4.4.2.3 Implement a fully parametric, information-preserving, curried
function that recovers from an error using a given function argument. The type
signature and an example test:
def recover[E, A]: Option[Either[E, A]] => (E => A) => Option[A] = ???

scala> recover(Some(Left("error"))) { _ => 123 }
res0: Option[Int] = Some(123)

Exercise 4.4.2.4 For id and const as defined above, what are the types of id(id),
id(id)(id), id(id(id)), id(const), and const(const)? Simplify these code expressions
by symbolic calculations.
Exercise 4.4.2.5 For the function twice from Example 4.3.2.2, show that the function
twice(twice(f)) is the same as twice(twice)(f) for any f: Int => Int.
Exercise 4.4.2.6 For the function twice from Example 4.3.2.2, infer the most general
type for the function twice(twice(twice)). What does that function do? Test
your answer on an example.
Exercise 4.4.2.7 Define a function thrice similarly to twice except it should apply
a given function 3 times. What does the function thrice(thrice(thrice)) do?
Exercise 4.4.2.8 Define a function ence similarly to twice except it should apply a
given function 𝑛 times, where 𝑛 is an additional curried argument.
Exercise 4.4.2.9 Define a fully parametric function flip(f) that swaps arguments
for any given uncurried function f having two arguments. To test:
def f(x: Int, y: Int) = x - y // Expect f(10, 2) == 8.
val g = flip(f) // Now expect g(2, 10) == 8.

scala> assert( f (10, 2) == 8 && g(2, 10) == 8 )

Exercise 4.4.2.10 Write a function curry2 converting a function of type (A, A) => A
into an equivalent curried function of type A => A => A.
Exercise 4.4.2.11 Apply the function (𝑥 → _ → 𝑥) to the value (𝑧 → 𝑧(𝑞)) where
𝑞 :𝑄 is a given value of type 𝑄. Infer types in these expressions.
Exercise 4.4.2.12 Infer types in the following expressions and test in Scala:
(a) 𝑝 → 𝑞 → 𝑟 → 𝑝(𝑞(𝑟)) .
(b) 𝑝 → 𝑞 → 𝑞(𝑟 → 𝑝) .
(c) 𝑝 → 𝑞 → 𝑞(𝑟 → 𝑞( 𝑝)) .
(d) 𝑝 → 𝑞 → 𝑞(𝑟 → 𝑝(𝑞)) .
(e) 𝑝 → 𝑞 → 𝑝(𝑟 → 𝑟 (𝑞)) .
(f) 𝑝 → 𝑞 → 𝑞(𝑟 → 𝑟 ( 𝑝(𝑞))) .
Exercise 4.4.2.13 Show that the following expressions cannot be well-typed with
simple types (see Example 4.4.1.6 for reference):
(a) 𝑝 → 𝑞 → 𝑝(𝑞)( 𝑝(𝑞)) .
(b) 𝑝 → 𝑞 → 𝑞(𝑟 → 𝑝(𝑞(𝑟))) .
Exercise 4.4.2.14 Infer types and simplify the following code expressions by sym-
bolic calculations:
(a) 𝑞 → (𝑥 → 𝑦 → 𝑧 → 𝑥(𝑧)(𝑦(𝑧))) (𝑎 → 𝑎) (𝑏 → 𝑏(𝑞)) .
(b) ( 𝑓 → 𝑔 → ℎ → 𝑓 (𝑔(ℎ))) (𝑥 → 𝑥) .
(c) (𝑥 → 𝑦 → 𝑥(𝑦)) (𝑥 → 𝑦 → 𝑥) .
(d) (𝑥 → 𝑦 → 𝑥(𝑦)) (𝑥 → 𝑦 → 𝑦) .
(e) 𝑥 → ( 𝑓 → 𝑦 → 𝑓 (𝑦)(𝑥)) (𝑧 → _ → 𝑧) .
(f) 𝑧 → (𝑥 → 𝑦 → 𝑥) (𝑥 → 𝑥(𝑧)) (𝑦 → 𝑦(𝑧)) .
Exercise 4.4.2.15 Infer types and simplify the following code expressions by sym-
bolic calculations:
(a) (𝑧 → 𝑧 + 1) # (𝑥 → 𝑦 → 𝑥/𝑦) # ( 𝑝 → 𝑝(2)) .
(b) ( 𝑝 → 𝑞 → 𝑝 + 𝑞 + 1) # ( 𝑓 → 𝑓 # 𝑓 ) # (𝑥 → 𝑥(1)) .
Exercise 4.4.2.16 In the following statements, the types 𝐴 and 𝐵 are fixed, and
functions are not assumed to be fully parametric in 𝐴 or 𝐵.
(a) Given a function ℎ:𝐴→𝐵 that satisfies the law 𝑓 :𝐴→𝐴 # ℎ:𝐴→𝐵 = ℎ:𝐴→𝐵 for any
𝑓 :𝐴→𝐴 , prove that the function ℎ must ignore its argument and return a fixed value
of type 𝐵.
(b) We are given two functions 𝑔 :𝐴→𝐴 and ℎ:𝐵→𝐵 . We only know that 𝑔 and ℎ
satisfy the law 𝑓 :𝐴→𝐵 # ℎ:𝐵→𝐵 = 𝑔 :𝐴→𝐴 # 𝑓 :𝐴→𝐵 for any function 𝑓 :𝐴→𝐵 . Prove that
both 𝑔 and ℎ must be equal to identity functions of suitable types: 𝑔 :𝐴→𝐴 = id 𝐴
and ℎ:𝐵→𝐵 = id𝐵 .
Hint: choose 𝑓 to be a suitable constant function and substitute 𝑓 into the given
laws.

4.5 Discussion and further developments


4.5.1 Higher-order functions
The order of a function is the number of function arrows (=>) contained in the type
signature of that function. If a function’s type signature contains more than one
arrow, the function is called a higher-order function. Higher-order functions take
functions as arguments and/or return functions.
The methods andThen, compose, curried, and uncurried are examples of higher-
order functions that take other functions as arguments and return new functions.
The following examples illustrate the concept of a function’s “order”. Consider
the code:
def f1(x: Int): Int = x + 10

The function f1 has type signature Int => Int and order 1, so it is not a higher-order
function.
def f2(x: Int): Int => Int = (z => z + x)

The function f2 has type signature Int => Int => Int and is a higher-order function
of order 2.
def f3(g: Int => Int): Int = g(123)

The function f3 has type signature (Int => Int) => Int and is a higher-order func-
tion of order 2.
Note that f2 is a higher-order function only because its return value is of a func-
tion type. An equivalent computation can be performed by an uncurried function
that is not higher-order:
scala> def f2u(x: Int, z: Int): Int = z + x // Type signature (Int, Int) => Int

Unlike f2, the function f3 cannot be converted to a first-order function because
f3 has an argument of a function type. Converting to an uncurried form cannot
eliminate such arguments.
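
The conversion between curried and uncurried forms can be sketched with the standard library's curried method and Function.uncurried. In this sketch, the functions are written as function values rather than as def methods, and the names f2FromUncurried and f2uFromCurried are hypothetical:

```scala
val f2: Int => Int => Int = x => z => z + x  // Curried, order 2.
val f2u: (Int, Int) => Int = (x, z) => z + x // Uncurried equivalent of f2.

val f2FromUncurried: Int => Int => Int = f2u.curried
val f2uFromCurried: (Int, Int) => Int = Function.uncurried(f2)

val f3: (Int => Int) => Int = g => g(123)    // The function-typed argument remains.
```

No analogous conversion removes the arrow inside the argument type of f3.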

4.5.2 Name shadowing and the scope of bound variables


Bound variables are introduced in nameless functions whenever an argument is
defined. For example, in the nameless function 𝑥 → 𝑦 → 𝑥 + 𝑦, the bound variables
are the curried arguments 𝑥 and 𝑦. The variable 𝑦 is only defined within the scope
(𝑦 → 𝑥 + 𝑦) of the inner function; the variable 𝑥 is defined within the entire scope
of 𝑥 → 𝑦 → 𝑥 + 𝑦.
Another way of introducing bound variables in Scala is to write a val or a def
within curly braces:
val x = {
  val y = 10 // Bound variable `y`.
  y + y * y
} // Same as `val x = 10 + 10 * 10`.
A bound variable is invisible outside the scope that defines it. So, it is easy to
rename a bound variable: no outside code could possibly use it and depend on its
value.
However, outside code may define a variable that (by chance) has the same
name as a bound variable inside the scope. Consider an example from calculus
where a function 𝑓 is defined via an integral:
$$ f(x) = \int_0^x \frac{dx}{1+x} . $$

Here, two bound variables named 𝑥 are defined in two scopes: one in the scope
of 𝑓 , another in the scope of the nameless function $x \to \frac{1}{1+x}$. The convention in
mathematics is to treat these two 𝑥’s as two completely different variables that just
happen to have the same name. In sub-expressions where both of these bound
variables are visible, priority is given to the bound variable defined in the smaller
inner scope. The outer definition of 𝑥 is then shadowed (hidden) by the inner
definition of 𝑥. For this reason, evaluating 𝑓 (10) will give:

$$ f(10) = \int_0^{10} \frac{dx}{1+x} = \log_e(11) \approx 2.398 , $$

rather than $\int_0^{10} \frac{dx}{1+10} = \frac{10}{11}$. The outer definition 𝑥 = 10 is shadowed within the
expression $\frac{1}{1+x}$ by the definition of 𝑥 in the smaller local scope of $x \to \frac{1}{1+x}$.
Since this is the standard mathematical convention, the same convention is
adopted in functional programming. A variable defined in a function scope (i.e.,
a bound variable) will shadow any outside definitions of a variable with the same
name.
Name shadowing is not advisable in practical programming, because it usually
decreases the clarity of code and so invites errors. Consider the nameless function:
𝑥→𝑥→𝑥 ,
and let us decipher this confusing syntax. The symbol → groups to the right, so
𝑥 → 𝑥 → 𝑥 is the same as 𝑥 → (𝑥 → 𝑥). It is a function that takes 𝑥 and returns
𝑥 → 𝑥. Since the argument 𝑥 in (𝑥 → 𝑥) may be renamed to y without changing
the function, we can rewrite the code to:
𝑥 → (𝑦 → 𝑦) .
Having removed name shadowing, we can more easily understand this code and
reason about it. For instance, it becomes clear that this function ignores its argu-
ment 𝑥 and always returns the same value (the identity function 𝑦 → 𝑦). So, we
can rewrite (𝑥 → 𝑥 → 𝑥) as (_ → 𝑦 → 𝑦), which is clearer.
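
The same renaming can be checked in Scala, where nested nameless functions are allowed to shadow argument names. This is a small sketch with Int chosen arbitrarily as the type; the names shadowed, renamed, fromShadowed, and fromRenamed are hypothetical:

```scala
val shadowed: Int => Int => Int = x => x => x // The inner x shadows the outer x.
val renamed: Int => Int => Int = _ => y => y  // The same function without shadowing.

val fromShadowed: Int = shadowed(1)(2)
val fromRenamed: Int = renamed(1)(2)
```

Both functions ignore the first argument and return the second one.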

4.5.3 Operator syntax for function applications


In mathematics, function applications are sometimes written without parentheses,
for instance: cos 𝑥 or log 𝑧. Commonly used formulas such as 2 sin 𝑥 cos 𝑥
imply parentheses as 2 · (sin (𝑥)) · (cos (𝑥)). This convention allows us to treat certain
functions as “operators” that are written without parentheses, similar to the
operators of summation, $\sum_k f(k)$, or differentiation, $\frac{d}{dx} f(x)$.
Some programming languages (such as OCaml, Haskell, and F#) have adopted
this “operator syntax”, making parentheses optional for all function arguments.
In those languages, f x means the same as f(x).3 Parentheses are still used, for
example, in expressions such as f(g x). For curried functions, function applica-
tions group to the left, so f x y z means ((f x) y) z. Function applications group
stronger than infix operations, so f x + y means (f x) + y, following the conven-
tion used in mathematics where “cos 𝑥 + 𝑦” groups “cos 𝑥” stronger than the infix
“+” operation.
This book does not use the “operator syntax” when reasoning about code. Scala
does not support the parentheses-free operator syntax; parentheses are needed
around each curried argument (or a curried list of arguments).
In programming language theory, curried functions are “simpler” because they
always have a single argument (but may return a function that will consume fur-
ther arguments). From the point of view of programming practice, curried func-
tions are often harder to read and to write.
In the operator syntax, a curried function f is applied to curried arguments as,
e.g., f 20 4. This departs further from the mathematical tradition and requires
some getting used to. If the two arguments are more complicated than just 20 and
4, the resulting expression may become harder to read, compared with the syntax
where commas are used to separate the arguments. (Consider, for instance, the
expression f (g 10) (h 20) + 30.) To improve readability of code, programmers
may prefer to define names for complicated expressions and then use those names
as curried arguments.
In Scala, the choice of whether to use curried or uncurried function signatures
is mostly a matter of syntactic convenience. Most Scala code seems to be written
with uncurried functions.
One of the syntactic features of Scala is the ability to give a curried argument us-
ing the curly brace syntax. Compare the two definitions of the function summation
described in Section 1.7.4:
def summation1(a: Int, b: Int, g: Int => Double): Double = (a to b).map(g).sum

def summation2(a: Int, b: Int)(g: Int => Double): Double = (a to b).map(g).sum

3 The operator syntax has a long history in programming. It is used in Unix shell commands, for
example cp file1 file2, and also in the language Tcl. In LISP-like languages, function applications
are enclosed in parentheses but the arguments are space-separated, for example (f 10 20).

These functions are applied to arguments like this:
scala> summation1(1, 10, { x => x * x * x + 2 * x })
res0: Double = 3135.0

scala> summation2(1, 10) { x => x * x * x + 2 * x }
res1: Double = 3135.0

The code that calls summation2 is easier to read because the curried argument is
syntactically separated from the rest of the code by curly braces. This is especially
useful when the curried argument is itself a function with a complicated body,
since Scala’s curly braces syntax allows function bodies to contain local definitions
(val or def) of new bound variables.
Another feature of Scala is the “dotless” method syntax: for example, xs map f is
equivalent to xs.map(f) and f andThen g is equivalent to f.andThen(g). The “dotless”
syntax is available only for infix methods, such as map, defined on specific types
such as Seq. In Scala 3, the “dotless” syntax is enabled for methods declared with
the infix modifier (infix def). Do not confuse Scala’s “dotless” method syntax with the operator
syntax used in Haskell and other languages.
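
A small sketch comparing the two spellings follows. The names xs, f, g, dotted, and dotless are introduced only for this illustration:

```scala
val xs: List[Int] = List(1, 2, 3)
val f: Int => Int = _ + 1
val g: Int => Int = _ * 2

val dotted: List[Int] = xs.map(f.andThen(g))  // Ordinary method-call syntax.
val dotless: List[Int] = xs map (f andThen g) // The same calls in "dotless" syntax.
```

Note that parentheses are still required around the argument (f andThen g); the "dotless" syntax only removes the dot and the parentheses of a single method call.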

4.5.4 Deriving a function’s code from its type


We have seen how the procedure of type inference derives the type signature
from a function’s code. A well-known algorithm for type inference is the Damas-
Hindley-Milner algorithm,4 with a Scala implementation available.5
It is remarkable that one can sometimes perform “code inference”: derive a func-
tion’s code from the function’s type signature. We will now look at some examples
of this.
Consider a fully parametric function that performs partial applications for arbi-
trary other functions. A possible type signature is:
def pa[A, B, C](x: A)(f: (A, B) => C): B => C = ???

How can we implement pa? Since pa(x)(f) must return a function of type B => C,
we have no choice other than to begin writing a nameless function in the code:
def pa[A, B, C](x: A)(f: (A, B) => C): B => C = { y: B =>
??? // Need to compute a value of type C in this scope.
}

In the inner scope, we need to compute a value of type C, and we have values x: A,
y: B, and f: (A, B) => C. How can we compute a value of type C? If we knew that
C = Int when pa(x)(f) is applied, we could have simply selected a fixed integer
value, say, 1, as the value of type C. If we knew that C = String, we could have
selected a fixed string, say, "hello", as the value of type C. But a fully parametric
function cannot use any knowledge of the types of its actual arguments.

4 See https://en.wikipedia.org/wiki/Hindley%E2%80%93Milner_type_system
5 See http://dysphoria.net/2009/06/28/hindley-milner-type-inference-in-scala/

So, a fully parametric function cannot produce a value of an arbitrary type C
from scratch. The only way of producing a value of type C is by applying the
function f to arguments of types A and B. Since the types A and B are arbitrary, we
cannot obtain any values of these types other than x: A and y: B. So, the only way
of getting a value of type C is to compute f(x, y). Thus, the body of pa must be:
def pa[A, B, C](x: A)(f: (A, B) => C): B => C = { y => f(x, y) }
In this way, we have unambiguously derived the body of this function from its type
signature, by assuming that the function must be fully parametric.
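To see the derived code at work, here is a small usage sketch. The function repeat and the object name are hypothetical and serve only as sample data.

```scala
object PaDemo {
  def pa[A, B, C](x: A)(f: (A, B) => C): B => C = { y => f(x, y) }

  // A sample two-argument function: repeats the string s, n times.
  val repeat: (Int, String) => String = (n, s) => s * n

  // Partial application fixes the first argument at 3.
  val thrice: String => String = pa(3)(repeat)
}
```

Here PaDemo.thrice("ab") evaluates to "ababab", the same as repeat(3, "ab").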
Another example is the operation of forward composition 𝑓 # 𝑔 viewed as a fully
parametric function with type signature:
def before[A, B, C](f: A => B, g: B => C): A => C = ???
To implement before, we need to create a nameless function of type A => C:
def before[A, B, C](f: A => B, g: B => C): A => C = { (x: A) =>
  ??? // Need to compute a value of type C in this scope.
}
In the inner scope, we need to compute a value of type 𝐶 from the values 𝑥 :𝐴 ,
𝑓 :𝐴→𝐵 , and 𝑔 :𝐵→𝐶 . Since the type 𝐶 is arbitrary, the only way of obtaining a value
of type 𝐶 is by applying 𝑔 to an argument of type 𝐵. In turn, the only way of
obtaining a value of type 𝐵 is to apply 𝑓 to an argument of type 𝐴. Finally, we
have only one value of type 𝐴, namely 𝑥 :𝐴 . So, the only way of obtaining the
required result is to compute 𝑔( 𝑓 (𝑥)).
We have derived the body of the function from its type signature:
def before[A, B, C](f: A => B, g: B => C): A => C = { x => g(f(x)) }
Chapter 5 will show how code can be derived from type signatures for a wide
range of fully parametric functions.
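A short usage sketch of before is shown next, together with a comparison with the standard library method andThen. The sample functions length and isEven are assumptions made for this illustration.

```scala
object BeforeDemo {
  def before[A, B, C](f: A => B, g: B => C): A => C = { x => g(f(x)) }

  val length: String => Int = _.length
  val isEven: Int => Boolean = _ % 2 == 0

  // First apply `length`, then apply `isEven` to the result.
  val hasEvenLength: String => Boolean = before(length, isEven)

  // The standard library's `andThen` computes the same composition.
  val sameViaAndThen: String => Boolean = length andThen isEven
}
```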

5 The logic of types. III. The
Curry-Howard correspondence
Fully parametric functions (introduced in Section 4.2) perform operations so gen-
eral that their code works in the same way for all types. An example of a fully
parametric function is:
def before[A, B, C](f: A => B, g: B => C): A => C = { x => g(f(x)) }

We have seen in Section 4.5.4 that for certain functions of this kind one can de-
rive the code unambiguously from the type signature. There exists a mathematical
theory (called the Curry-Howard correspondence) that gives precise conditions
for the possibility of deriving a function’s code from its type. There is also a sys-
tematic derivation algorithm that either produces the function’s code or proves
that the given type signature cannot be implemented. This chapter describes the
main results and applications of that theory to functional programming.

5.1 Values computed by fully parametric functions


5.1.1 Motivation and outlook
Consider the following sketch of a fully parametric function’s Scala code:
def f[A, B, ...]: ... = {
val x: Either[A, B] = ... // Some expression here.
...
}

If this program compiles without type errors, it means that the types match and,
in particular, that the function f is able to compute a value x of type Either[A, B].
It is sometimes impossible to compute a value of a certain type in fully parametric
code. For example, the fully parametric function fmap shown in Example 3.2.3.1
cannot compute a value of type A:
def fmap[A, B](f: A => B): Option[A] => Option[B] = {
val x: A = ??? // Cannot compute x here!
...
}

The reason is that no fully parametric code can compute values of type A “from
scratch”, that is, without using any previously given value of type A and without
applying a previously given function that returns a value of type A. In fmap, no
values of type A are given as arguments; the given function f: A => B returns values
of type B and not A. The code of fmap must perform pattern matching on an
argument of type Option[A]:
def fmap[A, B](f: A => B)(pa: Option[A]): Option[B] = pa match {
case None =>
val x: A = ??? // Cannot compute x here!
...
case Some(a) =>
val x: A = a // Can compute x in this scope.
...
}

Since the case None has no values of type A, we are unable to compute a value x in
that scope (as long as fmap remains a fully parametric function).
“Being able” to compute x: A means that, if needed, the code should be able to
return x as a result value. This requires computing x in all cases, not just within
one part (case ...) of a pattern-matching expression. For that, one would need to
implement the following type signature via fully parametric code:
def bad[A, B](f: A => B)(pa: Option[A]): A = ??? // Cannot implement.

So, the question “can we compute a value of type A within a fully parametric
function with arguments of type B and C” is equivalent to the question “can we
implement a fully parametric function of type (B, C) => A”. From now on, we will
focus on the latter kind of questions.
Here are some other examples where no fully parametric code can implement a
given type signature:
def bad2[A, B](f: A => B): A = ???

def bad3[A, B, C](p: A => Either[B, C]): Either[A => B, A => C] = ???

The problem with bad2 is that no data of type A is given, while the given function
f returns values of type B, not A.
The problem with bad3 is that it needs to hard-code the decision of whether to
return the Left or the Right part of Either. That decision cannot depend on the
function p because one cannot pattern-match on a function, and because bad3 does
not receive any data of type A and so cannot call p. Suppose bad3 is hard-coded
to always return a Left(f) with some f: A => B. It is then necessary to compute f
from p, but that is impossible: the given function p may return either Left(b) or
Right(c) for different values of its argument (of type A). This data is insufficient to
create a function of type A => B. Similarly, bad3 is not able to return Right(f) with
some f: A => C.
Could we try to switch between functions of type A => B and A => C depending
on a given value of type A? This idea means that we are working with a different
type signature, which has an additional argument of type A. That type signature
can be implemented, for instance, by this Scala code:


def q[A, B, C](g: A => Either[B, C])(a: A): Either[A => B, A => C] =
g(a) match {
case Left(b) => Left(_ => b)
case Right(c) => Right(_ => c)
}

But q does not have the required type signature of bad3.
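The following sketch illustrates what q computes. The sample function sign and the object name are hypothetical. The function returned inside Left ignores its argument, because the choice between Left and Right was already fixed by the value a given to q.

```scala
object QDemo {
  def q[A, B, C](g: A => Either[B, C])(a: A): Either[A => B, A => C] =
    g(a) match {
      case Left(b)  => Left((_: A) => b)
      case Right(c) => Right((_: A) => c)
    }

  // Sample function: Left for positive numbers, Right otherwise.
  val sign: Int => Either[String, Boolean] =
    n => if (n > 0) Left("positive") else Right(false)

  // q(sign)(5) is a Left; we extract the function stored in it, if any.
  val fromFive: Option[Int => String] = q(sign)(5).swap.toOption
}
```

Applying the extracted function to -1 still gives "positive", although sign(-1) is Right(false). So, the result of q(sign)(5) does not faithfully represent the function sign.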


In all these examples, we see that a type signature cannot be implemented
because the information given in a function’s arguments is in some way insuffi-
cient for computing the result value.
The type signature inverse to that of bad3 is:
def good3[A, B, C](q: Either[A => B, A => C]): A => Either[B, C] = ???

This type signature can be implemented:


def good3[A, B, C](q: Either[A => B, A => C]): A => Either[B, C] = q match {
case Left(k) => { a => Left(k(a)) }
case Right(k) => { a => Right(k(a)) }
}
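A brief usage sketch of good3 follows. The sample values asString and isZero are assumptions chosen for this illustration.

```scala
object Good3Demo {
  def good3[A, B, C](q: Either[A => B, A => C]): A => Either[B, C] = q match {
    case Left(k)  => { a => Left(k(a)) }
    case Right(k) => { a => Right(k(a)) }
  }

  val asString: Either[Int => String, Int => Boolean] = Left((n: Int) => n.toString)
  val isZero: Either[Int => String, Int => Boolean] = Right((n: Int) => n == 0)

  val r1: Either[String, Boolean] = good3(asString)(42)
  val r2: Either[String, Boolean] = good3(isZero)(42)
}
```

Here r1 is a Left and r2 is a Right: the decision between Left and Right is taken from the argument q and does not depend on the value of type A.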

So, when working with fully parametric code and looking at some type sig-
nature of a function, we may ask the question — is that type signature imple-
mentable, and if so, can we derive the code by just “following the types”?
It is remarkable that this question makes sense at all. When working with non-
FP languages, the notion of fully parametric functions is usually not relevant, and
implementations cannot be derived from types. But in functional programming,
fully parametric functions are used often. It is then important for the programmer
to know whether a given fully parametric type signature can be implemented,
and if so, to be able to derive the code.
Can we prove rigorously that the functions bad, bad2, bad3 cannot be imple-
mented by any fully parametric code? Or, perhaps, we are mistaken and a clever
trick could produce some code for those type signatures?
So far, we only saw informal arguments about whether values of certain types
can be computed. To make those arguments rigorous, we need to translate state-
ments such as “a fully parametric function before can compute a value of type
C => A” into mathematical formulas with rules for proving them true or false.
The first step towards a rigorous mathematical formulation is to choose a precise
notation. In Section 3.5.3, we denoted by CH ( 𝐴) the proposition “we Can
Have a value of type 𝐴 within a fully parametric function”. When writing the
code of that function, we may use the function’s arguments, which might have
types, say, 𝑋, 𝑌 , ..., 𝑍. So, we are interested in proving statements like this:

a fully parametric function can compute a value of type 𝐴
using given arguments of types 𝑋, 𝑌 , ..., 𝑍 . (5.1)


Here 𝑋, 𝑌 , ..., 𝑍, 𝐴 may be either type parameters or more complicated type ex-
pressions, such as 𝐵 → 𝐶 or (𝐶 → 𝐷) → 𝐸, built from some type parameters.
If arguments of types 𝑋, 𝑌 , ..., 𝑍 are given, it means we “already have” val-
ues of those types, i.e., the propositions CH (𝑋), CH (𝑌 ), ..., CH (𝑍) will be true.
So, proposition (5.1) is equivalent to “CH ( 𝐴) is true assuming CH (𝑋), CH (𝑌 ),
..., CH (𝑍) are true”. In mathematical logic, a statement of this form is called a
sequent and is denoted using the symbol ⊢ (called the “turnstile”):

sequent : CH (𝑋), CH (𝑌 ), ..., CH (𝑍) ⊢ CH ( 𝐴) . (5.2)

The assumptions CH (𝑋), CH (𝑌 ), ..., CH (𝑍) are called premises and the proposi-
tion CH ( 𝐴) is called the goal of the sequent.
Sequents provide a notation for questions about implementability of fully para-
metric functions. Since our goal is to answer such questions rigorously, we will
need to be able to prove sequents of the form (5.2). The following sequents corre-
spond to the type signatures we just saw:

fmap for Option : CH ( A => B) ⊢ CH ( Option[A] => Option[B])
the function before : CH ( A => B), CH ( B => C) ⊢ CH ( A => C)
the function bad : CH ( A => B), CH ( Option[A]) ⊢ CH ( A)
the function bad2 : CH ( A => B) ⊢ CH ( A)

So far, we only saw informal arguments towards proving the first two sequents
and disproving the last two. We will now develop tools for proving sequents
rigorously.
In formal logic, sequents are proved by starting with certain axioms and fol-
lowing certain derivation rules. Different choices of axioms and derivation rules
will give different logics. We need to discover the correct logic for reasoning about
sequents with CH -propositions. To find that logic’s complete set of axioms and
derivation rules, we will systematically examine all the types and code construc-
tions that are possible in a fully parametric function. The resulting logic is known
under the name “constructive propositional logic”. That logic’s axioms and deriva-
tion rules directly correspond to programming language constructions allowed
by fully parametric code. For that reason, constructive propositional logic gives
correct answers about implementable and non-implementable type signatures of
fully parametric functions.
We will then be able to borrow the results and methods available in the mathe-
matical literature. The main result is an algorithm (called LJT) for finding a proof
for a given sequent in the constructive propositional logic. If a proof is found,
the algorithm also provides the code of a function that has the type signature cor-
responding to the sequent. If a proof is not found, it means that the given type
signature cannot be implemented by fully parametric code.

5.1.2 Type notation for standard type constructions


In the following sections, we will be reasoning about sequents of the form:

CH (𝑋), CH (𝑌 ), ..., CH (𝑍) ⊢ CH ( 𝐴)

that represent type signatures of fully parametric functions. It will be convenient
to denote the set of all premises by the symbol Γ. We will then write just Γ ⊢
CH ( 𝐴) instead of CH (𝑋), CH (𝑌 ), ..., CH (𝑍) ⊢ CH ( 𝐴).
A special type notation explained in this section will help us write type expres-
sions more concisely. (See Appendix A on page 1157 for a full summary of the
type notation.)
There exist six standard type constructions supported by all functional lan-
guages: primitive types, including the Unit type and the void type (Nothing), tu-
ples (also called product types), disjunctive types (also called co-product types),
function types, parameterized types, and recursive types.
We now define a shorter notation for those types. This notation is often found
in the literature on functional programming languages, except we are using super-
scripts to denote type parameters.

Description           Scala syntax        Type notation

Unit type             Unit                1
Void type             Nothing             0
Built-in types        Int, String, ...    Int, String, ...
Tuple type            (A, B)              𝐴 × 𝐵
Disjunctive type      Either[A, B]        𝐴 + 𝐵
Function type         A => B              𝐴 → 𝐵
Parameterized types   def f[A]: F[A]      𝑓 𝐴 : 𝐹 𝐴 or 𝑓 : ∀𝐴. 𝐹 𝐴

Tuples and case classes with more than two parts are denoted by 𝐴 × 𝐵 × 𝐶 or
𝐴 × 𝐵 × 𝐶 × 𝐷, etc. For example, the Scala definition:
case class Person(firstName: String, lastName: String, age: Int)

is written in the type notation as String × String × Int.


The Scala type Either[A, B] is written as 𝐴 + 𝐵, and disjunctive types with more
parts are written as 𝐴 + 𝐵 + 𝐶 and so on. As another example, the Scala definition:
sealed trait RootsOfQ
final case class NoRoots() extends RootsOfQ
final case class OneRoot(x: Double) extends RootsOfQ
final case class TwoRoots(x: Double, y: Double) extends RootsOfQ


is translated to the type notation as:

RootsOfQ ≝ 1 + Double + Double × Double .

The type notation is significantly shorter because it omits all case class names and
part names from the Scala type definitions.
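One way to see that the type notation keeps all the data while dropping the names is to convert RootsOfQ to and from a nameless type. The sketch below assumes a particular nesting of Either to encode the three-part sum. The type alias Nameless and the two conversion functions are hypothetical names introduced for this illustration.

```scala
object RootsNotationDemo {
  sealed trait RootsOfQ
  final case class NoRoots() extends RootsOfQ
  final case class OneRoot(x: Double) extends RootsOfQ
  final case class TwoRoots(x: Double, y: Double) extends RootsOfQ

  // A nameless encoding of 1 + Double + Double × Double.
  type Nameless = Either[Unit, Either[Double, (Double, Double)]]

  def toNameless(r: RootsOfQ): Nameless = r match {
    case NoRoots()      => Left(())
    case OneRoot(x)     => Right(Left(x))
    case TwoRoots(x, y) => Right(Right((x, y)))
  }

  def fromNameless(n: Nameless): RootsOfQ = n match {
    case Left(())             => NoRoots()
    case Right(Left(x))       => OneRoot(x)
    case Right(Right((x, y))) => TwoRoots(x, y)
  }
}
```

Converting a value with toNameless and then with fromNameless gives back the original value, so no information is lost when the names are omitted.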
To clarify our notation for parameterized types, consider this code:
def f[A, B]: A => (A => B) => B = { x => g => g(x) }
The type notation for the type signature of f may be written as:

𝑓 𝐴,𝐵 : 𝐴 → ( 𝐴 → 𝐵) → 𝐵 , or equivalently : 𝑓 : ∀( 𝐴, 𝐵). 𝐴 → ( 𝐴 → 𝐵) → 𝐵 .

The type quantifier ∀( 𝐴, 𝐵) (reads “for all 𝐴 and 𝐵”) indicates that 𝑓 can be used
with all types 𝐴 and 𝐵.
In Scala, type expressions can be named, and those names (called type aliases)
can be used to make code shorter. Type aliases may also contain type parameters.
Defining and using a type alias for the type signature of the function f looks like
this:
type F[A, B] = A => (A => B) => B
def f[A, B]: F[A, B] = { x => g => g(x) }
This is written in the type notation by placing all type parameters into super-
scripts:

𝐹 𝐴,𝐵 ≝ 𝐴 → ( 𝐴 → 𝐵) → 𝐵 ,
𝑓 𝐴,𝐵 : 𝐹 𝐴,𝐵 ≝ 𝑥 :𝐴 → 𝑔 :𝐴→𝐵 → 𝑔(𝑥) ,

or equivalently (although less readably) as:

𝑓 : ∀( 𝐴, 𝐵). 𝐹 𝐴,𝐵 ≝ 𝐴,𝐵 → 𝑥 :𝐴 → 𝑔 :𝐴→𝐵 → 𝑔(𝑥) .

In Scala 3, the function f can be written as a value (val) via this syntax:
val f: [A, B] => A => (A => B) => B = { // Valid only in Scala 3.
[A, B] => (x: A) => (g: A => B) => g(x)
}
This syntax closely corresponds to the code notation 𝐴,𝐵 → 𝑥 :𝐴 → 𝑔 :𝐴→𝐵 → 𝑔(𝑥).
The precedence of operators in the type notation is chosen in order to write
fewer parentheses in some frequently used type expressions. The rules of prece-
dence are:

• The type product operator (×) groups stronger than the disjunctive operator
(+), so that type expressions such as 𝐴 + 𝐵 × 𝐶 have the same operator prece-
dence as in arithmetic. That is, 𝐴 + 𝐵 × 𝐶 means 𝐴 + (𝐵 × 𝐶). This convention
makes type expressions easier to read.

• The function type arrow (→) groups weaker than the operators + and ×,
so that often-used types such as 𝐴 → 1 + 𝐵 (representing A => Option[B]) or
𝐴 × 𝐵 → 𝐶 (representing ((A, B)) => C) can be written without any paren-
theses. Type expressions such as ( 𝐴 → 𝐵) × 𝐶 will require parentheses but
are needed less often.

• The type quantifiers group weaker than all other operators, so we can write
types such as ∀𝐴. 𝐴 → 𝐴 → 𝐴 without parentheses. This is helpful because
type quantifiers are most often placed at the top level of a type expression.
When that is not the case, parentheses are necessary. An example is the type
expression (∀𝐴. 𝐴 → 𝐴 → 𝐴) → 1 + 1.

5.1.3 Rules for writing CH -propositions


In Section 3.5.3 we saw examples of reasoning about CH -propositions for case
classes and for disjunctive types. This reasoning needs to be extended systemat-
ically to all type constructions that fully parametric programs may use. Then we
will be able to express CH -propositions with arbitrary type expressions, such as
CH ( Either[(A, A), Option[B] => Either[(A, B), C]]), in terms of CH -propositions
for simple type parameters: CH ( 𝐴), CH (𝐵), etc.
We will now derive the rules for writing CH -propositions for each of the stan-
dard type constructions except recursive types.
1a) Rule for the Unit type The Unit type has only a single value (), an “empty
tuple”. This value can be always computed as it does not need any previous data:
def f[...]: ... = {
  ...
  val x: Unit = () // We can always compute a `Unit` value.
  ...
}

So, the proposition CH ( Unit) is always true. In the type notation, the Unit type is
denoted by 1. We may write the rule as CH (1) = 𝑇𝑟𝑢𝑒.
Named unit types also have a single value that is always possible to compute.
For example:
final case class N1()

defines a named unit type. We can compute a value of type N1 without using any
previously given values:
val x: N1 = N1()

So, the proposition CH ( N1) is always true. In the type notation, named unit types
are also denoted by 1, same as the Unit type itself.
1b) Rule for the void type The Scala type Nothing has no values, so the propo-
sition CH ( Nothing) is always false. The type Nothing is denoted by 0 in the type
notation. So, the rule is CH (0) = 𝐹𝑎𝑙𝑠𝑒.

1c) Rule for primitive types For a specific primitive (or library-defined) type
such as Int or String, the corresponding CH -proposition is always true because
we may always create a constant value of that type, e.g.:
def f[...]: ... = {
  ...
  val x: String = "abc" // We can always compute a `String` value.
  ...
}

So, the rule for primitive types is the same as that for the Unit type. For example,
CH (String) = 𝑇𝑟𝑢𝑒.
2) Rule for tuple types To compute a value of a tuple type (A, B) requires com-
puting a value of type A and a value of type B. This is expressed by the logical
formula CH ( (A, B)) = CH ( 𝐴) ∧ CH (𝐵). A similar formula holds for case classes,
as Eq. (3.2) shows. In the type notation, the tuple (A, B) is written as 𝐴 × 𝐵, and
tuples with more parts are written similarly. So, we write the rule for tuples as:

CH ( 𝐴 × 𝐵 × ... × 𝐶) = CH ( 𝐴) ∧ CH (𝐵) ∧ ... ∧ CH (𝐶) .

3) Rule for disjunctive types A disjunctive type may consist of several cases.
Having a value of a disjunctive type means to have a value of (at least) one of
those cases. An example of translating this relationship into a formula was shown
by Eq. (3.1). For the standard disjunctive type Either[A, B], we have the logical
formula CH ( Either[A, B]) = CH ( 𝐴) ∨ CH (𝐵). In the type notation, disjunctive
types with more than two parts are written similarly as 𝐴 + 𝐵 + ... + 𝐶. So, the rule
for disjunctive types is written as:

CH ( 𝐴 + 𝐵 + ... + 𝐶) = CH ( 𝐴) ∨ CH (𝐵) ∨ ... ∨ CH (𝐶) .
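The rule for disjunctive types can be illustrated by a small sketch: having a value of either part is enough to produce a value of the disjunctive type, while code that uses such a value must handle every case. The names fromLeft, fromRight, and describe are hypothetical.

```scala
object DisjunctionDemo {
  // If CH(A) is true then CH(A + B) is true.
  def fromLeft[A, B](a: A): Either[A, B] = Left(a)

  // If CH(B) is true then CH(A + B) is true.
  def fromRight[A, B](b: B): Either[A, B] = Right(b)

  // Using a value of type Int + String requires handling both cases.
  def describe(e: Either[Int, String]): String = e match {
    case Left(n)  => s"number $n"
    case Right(s) => s"text $s"
  }
}
```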

4) Rule for function types Consider now a function type such as A => B. This
type is written in the type notation as 𝐴 → 𝐵. To compute a value of that type, we
need to write code like this:
val f: A => B = { (a: A) =>
??? // Compute a value of type B in this scope.
}

The inner scope of this function needs to compute a value of type 𝐵, and the given
value a: A may be used for that. So, CH ( 𝐴 → 𝐵) is true if and only if we are able
to compute a value of type 𝐵 when we are given a value of type 𝐴. To translate
this statement into the language of logical propositions, we need to use the logical
implication, CH ( 𝐴) ⇒ CH (𝐵), which means that CH (𝐵) can be proved if we
already have a proof of CH ( 𝐴). So, the rule for function types is:

CH ( 𝐴 → 𝐵) = CH ( 𝐴) ⇒ CH (𝐵) .
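A sketch of how this implication appears in code: a given value of type A => B converts any available value of type A into a value of type B, and the body of a function may use its argument to compute the result. The names applyTo and pairWith are assumptions made for this illustration.

```scala
object ImplicationDemo {
  // Given values of types A and A => B, we can compute a value of type B.
  def applyTo[A, B](a: A, f: A => B): B = f(a)

  // The inner scope uses the given value `a` to compute the result.
  def pairWith[A, B](b: B): A => (A, B) = { (a: A) => (a, b) }
}
```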

5) Rule for parameterized types Consider this function with type parameters:
def f[A, B]: A => (A => B) => B = { x => g => g(x) }

Being able to define the body of such a function is the same as being able to com-
pute a value of type A => (A => B) => B for all possible Scala types A and B. In the
notation of logic, this is written as:

CH (∀( 𝐴, 𝐵). 𝐴 → ( 𝐴 → 𝐵) → 𝐵) ,

and is equivalent to:

∀( 𝐴, 𝐵). CH ( 𝐴 → ( 𝐴 → 𝐵) → 𝐵) .

The symbol ∀ means “for all” and is called the universal quantifier in logic. We
read ∀𝐴. CH (𝐹 𝐴 ) as the proposition “for all types A, we can compute a value of
type F[A]”. Here, F[A] can be any type expression that depends on A (or even a
type expression that does not depend on A).
So, the rule for parameterized types of the form ∀𝐴. 𝐹 𝐴 is:

CH (∀𝐴. 𝐹 𝐴 ) = ∀𝐴. CH (𝐹 𝐴 ) .

The rules just shown will allow us to express CH -propositions for complicated
types via CH -propositions for type parameters. Then any type signature can be
rewritten as a sequent that contains CH -propositions only for the individual type
parameters.
In this way, we find a correspondence between a fully parametric type signa-
ture and a logical sequent that expresses the statement “the type signature can be
implemented”. This is the first part of the Curry-Howard correspondence.
Table 5.1 summarizes the type notation and shows how to translate it into logic
formulas with CH -propositions. Apart from recursive types (which we do not
consider in this chapter), Table 5.1 lists all type constructions that may be used in
the code of a fully parametric function.

5.1.4 Examples: Type notation


From now on, we will prefer to write types in the type notation rather than in
the Scala syntax. The type notation allows us to write nameless type expressions
and makes the structure of complicated types clearer than in the Scala syntax.
Names are, of course, helpful for reminding programmers of the meaning of data
represented by case classes and disjunctive types. However, writing names for
every part of every type does not help in reasoning about the properties of types.
Once the programmer has finished deriving the necessary types and verifying
their properties, the type notation can be straightforwardly translated into Scala
code. Let us get some experience doing that.

Scala syntax                    Type notation       CH -proposition

Unit                            1                   CH (1) = 𝑇𝑟𝑢𝑒
Nothing                         0                   CH (0) = 𝐹𝑎𝑙𝑠𝑒
Int, String, ...                Int, String, ...    CH (Int) = 𝑇𝑟𝑢𝑒
(A, B)                          𝐴 × 𝐵               CH ( 𝐴) ∧ CH (𝐵)
Either[A, B]                    𝐴 + 𝐵               CH ( 𝐴) ∨ CH (𝐵)
A => B                          𝐴 → 𝐵               CH ( 𝐴) ⇒ CH (𝐵)
A (type parameter)              𝐴                   CH ( 𝐴)
def f[A]: F[A]                  𝑓 𝐴 : 𝐹 𝐴           ∀𝐴. CH (𝐹 𝐴 )
val f: [A] => F[A] (Scala 3)    𝑓 : ∀𝐴. 𝐹 𝐴         ∀𝐴. CH (𝐹 𝐴 )

Table 5.1: The correspondence between types and CH -propositions.

Example 5.1.4.1 Define a function delta taking an argument x and returning the
pair (x, x). Derive the most general type for this function. Write the type signa-
ture of delta in the type notation, and translate it into a CH -proposition. Simplify
the CH -proposition if possible.
Solution Begin by writing the code of the function:
def delta(x: ...) = (x, x)

To derive the most general type for delta, first assume x: A, where A is a type
parameter; then the tuple (x, x) has type (A, A). We do not see any constraints on
the type parameter A. So, A represents an arbitrary type and needs to be added to
the type signature of delta:
def delta[A](x: A): (A, A) = (x, x)

We find that the most general type of delta is A => (A, A). We also note that delta
seems to be the only way of implementing a fully parametric function with type
signature A => (A, A).
We will use the letter Δ for the function delta. In the type notation, the type
signature of Δ is:
Δ𝐴 : 𝐴 → 𝐴 × 𝐴 .
So, the proposition CH (Δ) (meaning “the function Δ can be implemented”) is:

CH (Δ) = ∀𝐴. CH ( 𝐴 → 𝐴 × 𝐴) .

In the type expression 𝐴 → 𝐴 × 𝐴, the product symbol (×) binds stronger than the
function arrow (→), so the parentheses in 𝐴 → ( 𝐴 × 𝐴) may be omitted.

Using the rules for transforming CH -propositions, we rewrite:

CH ( 𝐴 → 𝐴 × 𝐴)
rule for function types : = CH ( 𝐴) ⇒ CH ( 𝐴 × 𝐴)
rule for tuple types : = CH ( 𝐴) ⇒ (CH ( 𝐴) ∧ CH ( 𝐴)) .

Thus the proposition CH (Δ) is equivalent to:

CH (Δ) = ∀𝐴. CH ( 𝐴) ⇒ (CH ( 𝐴) ∧ CH ( 𝐴)) .

It is intuitively clear that the proposition CH (Δ) is true: it just says that if
CH ( 𝐴) is true then the conjunction “CH ( 𝐴) and CH ( 𝐴)” is also true. The point of writing CH (Δ) in
a mathematical notation is to prepare for proving that proposition rigorously.
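As a quick check, the same code of delta can be applied to values of different types. The object name and sample values are chosen only for this sketch.

```scala
object DeltaDemo {
  def delta[A](x: A): (A, A) = (x, x)

  val p1: (Int, Int) = delta(1)          // Here A = Int.
  val p2: (String, String) = delta("a")  // Here A = String.
}
```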
Example 5.1.4.2 The standard types Either[A, B] and Option[A] are written in the
type notation as:

Either 𝐴,𝐵 ≝ 𝐴 + 𝐵 , Opt 𝐴 ≝ 1 + 𝐴 .

The type Either[A, B] is written as 𝐴 + 𝐵 by definition of the disjunctive type notation
(+). The type Option[A] has two disjoint cases, None and Some[A]. The case
class None is a “named Unit” and is denoted by 1. The case class Some[A] contains
a single value of type 𝐴. So, the type notation for Option[A] is 1 + 𝐴. We will also
sometimes write Opt 𝐴 to denote Option[A].
Example 5.1.4.3 The Scala definition of the disjunctive type UserAction:
sealed trait UserAction
final case class SetName(first: String, last: String) extends UserAction
final case class SetEmail(email: String) extends UserAction
final case class SetUserId(id: Long) extends UserAction
is written in the type notation as:

UserAction ≝ String × String + String + Long . (5.3)

The type operation × groups stronger than +, as in arithmetic. To derive the type
notation (5.3), we first drop all names from case classes and get three nameless
tuples (String, String), (String), and (Long). Each of these tuples is then converted
into a product using the operator ×, and all products are “summed” in the type
notation using the operator +.
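The three parts of the type notation (5.3) correspond to the three cases that a pattern-matching expression must handle. A minimal sketch is shown next. The function describe is a hypothetical helper, not part of the definition of UserAction.

```scala
object UserActionDemo {
  sealed trait UserAction
  final case class SetName(first: String, last: String) extends UserAction
  final case class SetEmail(email: String) extends UserAction
  final case class SetUserId(id: Long) extends UserAction

  // One `case` per part of String × String + String + Long.
  def describe(action: UserAction): String = action match {
    case SetName(first, last) => s"name: $first $last"  // String × String
    case SetEmail(email)      => s"email: $email"       // String
    case SetUserId(id)        => s"id: $id"             // Long
  }
}
```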
Example 5.1.4.4 The parameterized disjunctive type Either3 is a generalization
of Either:
sealed trait Either3[A, B, C]
final case class Left[A, B, C](x: A) extends Either3[A, B, C]
final case class Middle[A, B, C](x: B) extends Either3[A, B, C]
final case class Right[A, B, C](x: C) extends Either3[A, B, C]
This disjunctive type is written in the type notation as Either3 𝐴,𝐵,𝐶 ≝ 𝐴 + 𝐵 + 𝐶.

Example 5.1.4.5 Define a Scala type constructor F corresponding to the type no-
tation:
𝐹 𝐴 ≝ 1 + Int × 𝐴 × 𝐴 + Int × (Int → 𝐴) .

Solution The formula for 𝐹 𝐴 defines a disjunctive type F[A] with three parts.
To implement F[A] in Scala, we need to choose names for each of the disjoint parts,
which will become case classes. For the purposes of this example, let us choose
names F1, F2, and F3. Each of these case classes needs to have the same type pa-
rameter A. So, we begin writing the code as:
sealed trait F[A]
final case class F1[A](...) extends F[A]
final case class F2[A](...) extends F[A]
final case class F3[A](...) extends F[A]

Each of these case classes represents one part of the disjunctive type: F1 represents
1, F2 represents Int × 𝐴 × 𝐴, and F3 represents Int × (Int → 𝐴). It remains to choose
names and define the case classes:
sealed trait F[A]
final case class F1[A]() extends F[A] // Named unit type.
final case class F2[A](n: Int, x1: A, x2: A) extends F[A]
final case class F3[A](n: Int, f: Int => A) extends F[A]

The names n, x1, x2, and f are chosen arbitrarily.
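To check that the definition is usable, we may create a value for each part of the disjunctive type. The sample values and the helper function values below are assumptions made for this sketch.

```scala
object FDemo {
  sealed trait F[A]
  final case class F1[A]() extends F[A] // Named unit type.
  final case class F2[A](n: Int, x1: A, x2: A) extends F[A]
  final case class F3[A](n: Int, f: Int => A) extends F[A]

  val v1: F[String] = F1()                 // The part 1.
  val v2: F[String] = F2(1, "a", "b")      // The part Int × A × A.
  val v3: F[String] = F3(2, i => "x" * i)  // The part Int × (Int → A).

  // Collects the values of type A; in the F3 case, applies f to the stored n.
  def values[A](fa: F[A]): List[A] = fa match {
    case F1()          => Nil
    case F2(_, x1, x2) => List(x1, x2)
    case F3(n, f)      => List(f(n))
  }
}
```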


Example 5.1.4.6 Write the type signature of the following function in the type
notation:
def fmap[A, B](f: A => B): Option[A] => Option[B]

Solution This is a curried function, so we first rewrite the type signature as:
def fmap[A, B]: (A => B) => Option[A] => Option[B]

The type notation for Option[A] is 1 + 𝐴. Now we can write the type signature of
fmap as:

fmap 𝐴,𝐵 : ( 𝐴 → 𝐵) → 1 + 𝐴 → 1 + 𝐵 ,
or equivalently : fmap : ∀( 𝐴, 𝐵). ( 𝐴 → 𝐵) → 1 + 𝐴 → 1 + 𝐵 .

We do not put parentheses around 1 + 𝐴 and 1 + 𝐵 because the function arrow (→)
groups weaker than the other type operations. But parentheses around ( 𝐴 → 𝐵)
are required.
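For reference, one possible implementation of this type signature is sketched next. The code has one case for each part of the type 1 + 𝐴. The sample value lengthOpt is hypothetical.

```scala
object FmapDemo {
  def fmap[A, B](f: A => B): Option[A] => Option[B] = {
    case None    => None        // The part 1 of the argument type 1 + A.
    case Some(a) => Some(f(a))  // The part A of the argument type 1 + A.
  }

  val lengthOpt: Option[String] => Option[Int] = fmap((s: String) => s.length)
}
```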
We will usually prefer to write type parameters in superscripts rather than un-
der type quantifiers. For example, we will prefer to write the type signature of an
identity function as id 𝐴 : 𝐴 → 𝐴 rather than as id : ∀𝐴. 𝐴 → 𝐴.

5.1.5 Exercises: Type notation


Exercise 5.1.5.1 Define a Scala disjunctive type Q[T, A] corresponding to this type
notation:
𝑄 𝑇,𝐴 ≝ 1 + 𝑇 × 𝐴 + Int × (𝑇 → 𝑇) + String × 𝐴 .
Exercise 5.1.5.2 Convert the type Either[(A, Int), Either[(A, Char), (A, Float)]]
from Scala syntax to the type notation.
Exercise 5.1.5.3 Define a Scala type Opt2[A, B] written in the type notation as
1 + 𝐴 + 𝐵.
Exercise 5.1.5.4 Write a Scala type signature for this fully parametric function:

𝑓 𝐴,𝐵,𝐶 : 1 + 𝐴 + 𝐵 + 𝐶 → ( 𝐴 → 1 + 𝐵) → 1 + 𝐵 + 𝐶 .

Implement this function, preserving information as much as possible.

5.2 The logic of CH -propositions


So far, we were able to convert statements such as “a fully parametric function can
compute values of type 𝐴” into logical propositions that we called CH -propositions.
The next step is to determine the proof rules suitable for rigorous reasoning about
CH -propositions. Those rules will be rules of a formal logic.
This section is an extended voyage into certain aspects of formal logic that are
needed for obtaining proofs of CH -propositions and deriving program code from
those proofs. While that theory (known as the Curry-Howard correspondence)
is important as a technique of reasoning about types in functional programs, the
material of Section 5.2 is used only for the discussions in Section 5.5. An exception
is Section 5.2.2 that introduces the short notation for fully parametric code. That
notation will be further developed and used throughout the book. Other than
that, the rest of the book does not depend on the Curry-Howard correspondence.

5.2.1 Motivation and first examples


Formal logic uses axioms and derivation rules for proving that certain formulas
are true or false. We will use Greek letters (𝛼, 𝛽, etc.) to denote propositions.
We will often need logical formulas that talk about properties of arbitrary propo-
sitions. This is denoted by the universal quantifier symbol (∀), which means “for
all”. The universal quantifier will be usually located in front of the formula, e.g.:

∀(𝛼, 𝛽). (𝛼 ⇒ 𝛽) ⇒ 𝛼 ⇒ 𝛼 .

The symbol ⇒ denotes implication: 𝛼 ⇒ 𝛽 means that if 𝛼 is proved true then 𝛽
will be proved true.

Formulas whose propositions are all universally quantified correspond to type
signatures that are made entirely from type parameters. For instance, the formula
shown above corresponds to the following type signature:
def f[A, B]: (A => B) => A => A
The universal quantifier ∀(𝛼, 𝛽) corresponds to the fact that the function f works
with any choice of types A and B.
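This type signature can be implemented. One possible implementation, shown here as a sketch, ignores the first argument and returns the second argument unchanged:

```scala
object FormulaDemo {
  // Ignore the argument of type A => B and return the given value of type A.
  def f[A, B]: (A => B) => A => A = { _ => x => x }

  val r: Int = f[Int, String](_.toString)(5) // Here A = Int and B = String.
}
```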
A simple example of a true logical formula is “if 𝛼 is true then 𝛼 is true” (any
proposition 𝛼 follows from itself):
∀𝛼. 𝛼 ⇒ 𝛼 . (5.4)
If the proposition 𝛼 is a CH -proposition, that is, if 𝛼 ≝ CH ( 𝐴) for some type 𝐴,
we obtain from Eq. (5.4) the formula:
∀𝐴. CH ( 𝐴) ⇒ CH ( 𝐴) . (5.5)
We expect true CH -propositions to correspond to types that can be computed in a
fully parametric function. Let us see if this example fits our expectations. We can
rewrite Eq. (5.5) as:
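$$
\begin{aligned}
 & \forall A.\ \mathcal{CH}(A) \Rightarrow \mathcal{CH}(A) \\
\text{rule for function types}: \quad & = \forall A.\ \mathcal{CH}(A \to A) \\
\text{rule for parameterized types}: \quad & = \mathcal{CH}(\forall A.\ A \to A) .
\end{aligned}
$$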
The last line shows a CH -proposition that corresponds to the type ∀𝐴. 𝐴 → 𝐴.
Translating this type notation into a Scala type signature, we get:
def f[A]: A => A
This type signature can be implemented by an identity function:
def f[A]: A => A = { x => x }
This example shows a true CH -proposition that corresponds to a type signature
of a function f, and we see that f can be implemented in code.
While the correctness of the formula ∀𝛼. 𝛼 ⇒ 𝛼 may be self-evident, the point
of using formal logic is to have a set of axioms and proof rules that allow us to
verify all true formulas systematically, without guessing or testing. What axioms
and proof rules are suitable for proving CH -propositions?
A set of axioms and proof rules defines a formal logic. Mathematicians have
studied many different logics that are useful for solving different problems. We
are now looking for a specific formal logic that gives correct answers when
reasoning about CH-propositions.¹
¹ For an overview and more details about that logic and the necessary proof
techniques, see the book by R. Bornat, “Proof and disproof in formal logic: an
introduction for programmers”. An early draft version of that book is available at
https://homepages.phonecoop.coop/randj/richard/books/ProofandDisproof.pdf


5.2.2 Short notation for fully parametric code


To derive the suitable logical axioms and proof rules systematically, we need to
examine what could make a sequent with CH -propositions true.
A sequent CH(𝐴) ⊢ CH(𝑋) is true when a value of type 𝑋 can be computed
by fully parametric code that may only use a given value of type 𝐴. To describe
all possible ways of computing a value of type 𝑋, we need to enumerate all pos-
sible ways of writing code in a fully parametric function. The requirement of full
parametricity means that we are not allowed to use any specific types such as Int
or String, any concrete values such as 123 or "hello", or any library functions that
work with specific (non-parametric) types. We are only allowed to work with val-
ues of unknown types described by the given type parameters. However, we are
permitted to use fully parametric types such as Either[A, B] or Option[A].
In fact, we can enumerate all the allowed constructions that may be used by
fully parametric code. There are eight code constructions as illustrated here:
def f[A, B, ...](a: A, b: B): X = {   // Any given type signature.
  val x1: Unit = ()                   // 1) Use a value of type Unit.
  val x2: A = a                       // 2) Use a given argument.
  val x3 = { (x: A) => b }            // 3) Create a function.
  val x4: B = x3(x2)                  // 4) Use a function.
  val x5: (A, B) = (a, b)             // 5) Create a tuple.
  val x6: B = x5._2                   // 6) Use a tuple.
  val x7: Either[A, B] = Right(x6)    // 7) Create values of a disjunctive type.
  val x8 = x7 match { ... }           // 8) Use values of a disjunctive type.
}  // 9) Call f itself recursively. Not included here because recursion is
   //    not supported by CH-propositions.

The proposition CH(𝑋) is true if we can create a sequence of computed values
such as x1, x2, ..., xN, each using one of these eight code constructs, with xN having
type 𝑋.
So, each of the eight code constructs will give a proof rule in the logic.
It is important that there are only a finite number of allowed code constructions.
This defines rigorously the concept of “fully parametric code” and allows us to
prove CH -propositions.
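The schematic listing above can be turned into a small compilable function. The following is a minimal sketch, assuming a hypothetical name `eight` and a concrete result type chosen only so that every construction is used at least once:

```scala
// A fully parametric function that exercises each of the eight constructions.
// The name `eight` and its result type are assumptions made for this illustration.
def eight[A, B](a: A, b: B): Either[A, (A, B)] = {
  val x1: Unit = ()                        // 1) Use a value of type Unit (not needed for the result).
  val x2: A = a                            // 2) Use a given argument.
  val x3: A => B = { (x: A) => b }         // 3) Create a function.
  val x4: B = x3(x2)                       // 4) Use a function.
  val x5: (A, B) = (x2, x4)                // 5) Create a tuple.
  val x6: B = x5._2                        // 6) Use a tuple.
  val x7: Either[A, B] = Right(x6)         // 7) Create a value of a disjunctive type.
  val x8: Either[A, (A, B)] = x7 match {   // 8) Use a value of a disjunctive type.
    case Left(y)  => Left(y)
    case Right(z) => Right((x2, z))
  }
  x8
}
```

The body never mentions a specific type or a concrete value other than `()`, so the same code works for any choice of `A` and `B`.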
The ninth code construction (the recursive call) is also a valid construction in
fully parametric code. We will be using that construction in later chapters of this
book. However, that construction is not supported by propositional logic, so we
cannot map recursive code to a proof rule for CH -propositions.
When writing code and reasoning symbolically about code, the syntax of Scala
is too verbose. We have started introducing a shorter code notation in Chapter 4,
and we will develop it in more detail in this chapter. This code notation will be
used systematically in later chapters of this book.
In the code notation, we denote values of the Unit type by the symbol 1. There
will be no confusion with the integer value 1 because the latter is rarely used in
symbolic reasoning.
We denote functions by expressions of the form 𝑥 :𝐴 → 𝑓 (𝑥). We will use two
equivalent notations for applying a function to an argument: 𝑓 (𝑥) and 𝑥 ⊲ 𝑓 (the
pipe notation).
Tuples are denoted using the product symbol (×). For example, a pair (a, b) is
written in the code notation as 𝑎 × 𝑏. A tuple (a, b, c) is written as 𝑎 × 𝑏 × 𝑐, etc.
In this book, all type parameters are capitalized and all values are written in
lowercase. This makes it clear that, say, 𝐴 → 𝐴 and 𝐴 × 𝐵 are type expressions
while 𝑥 → 𝑥 and 𝑎 × 𝑏 are values.
Functions that pattern-match on a tuple are denoted by 𝑎 × 𝑏 → ... (that is, the
argument is written in a tuple form). For example, the function denoted in Scala
by _._1 is written as 𝑎 × 𝑏 → 𝑎.
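To connect the tuple notation with Scala code, here is a short sketch. The names `first` and `swap` are hypothetical; the method `pipe` comes from `scala.util.chaining` (available in Scala 2.13 and later) and plays the role of the pipe notation 𝑥 ⊲ 𝑓.

```scala
import scala.util.chaining._

// a × b → a : take the first part of a pair (equivalent to _._1).
def first[A, B]: ((A, B)) => A = { case (a, b) => a }

// a × b → b × a : swap the two parts of a pair.
def swap[A, B]: ((A, B)) => (B, A) = { case (a, b) => (b, a) }

// The two equivalent notations for function application: f(x) and x ⊲ f.
val applied: (String, Int) = swap[Int, String]((1, "a"))
val piped: (String, Int)   = (1, "a").pipe(swap[Int, String])
```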
Disjunctive types are treated specially in the code notation. As a motivating
example, consider the standard type Either[A, B]. In the type notation, it is written
as 𝐴 + 𝐵. Values of that type can be of the form Left(x) or Right(y). Those values
are written as 𝑥 + 0 and 0 + 𝑦 in the code notation.
To motivate this unconventional notation, consider the type inferred by Scala
for a value Left(123):
scala> Left(123)
res0: Left[Int, Nothing] = Left(123)
The inferred type is Left[Int, Nothing], which is written as Int + 0 in the type no-
tation. As the void type (0) has no values, it makes sense that any value of type
Int + 0 must be of the form Left(x) with an integer x. More generally, a value of
type 𝐴 + 0 is always of the form Left(x) with some x: A, and a value of type 0 + 𝐵
must be of the form Right(y) with some y: B. So, the code notation writes 𝑥 :𝐴 + 0
for values of type 𝐴 + 0 and 0 + 𝑦 :𝐵 for values of type 0 + 𝐵.
The type notation 𝐴 + 0 and 0 + 𝐵 for the Left and the Right parts of the dis-
junctive type Either[A, B] agrees with the behavior of the Scala compiler, which
will infer the types Either[A, Nothing] and Either[Nothing, B] for the correspond-
ing code:
def toLeft[A, B]: A => Either[A, B] = x => Left(x)
def toRight[A, B]: B => Either[A, B] = y => Right(y)

scala> toLeft(123)
res0: Either[Int, Nothing] = Left(123)

scala> toRight("abc")
res1: Either[Nothing, String] = Right("abc")
We can write the functions toLeft and toRight in the code notation as:
$$ \text{toLeft}^{A,B} \triangleq x^{:A} \to x + \mathbb{0}^{:B} , \qquad \text{toRight}^{A,B} \triangleq y^{:B} \to \mathbb{0}^{:A} + y . $$
The code notation shows values of disjunctive types without using Scala class
names such as Either, Right, and Left. This shortens the writing and speeds up
reasoning about code.

In the notation 𝟘:𝐴 + 𝑦, we use the symbol 𝟘 rather than an ordinary zero (0), to
avoid suggesting that 0 is a value of type 𝟘. The void type 𝟘 has no values, unlike
the Unit type, 𝟙, which has a value denoted by 1 in the code notation.
Type annotations such as 0:𝐴 are helpful to remind ourselves about the type
parameter 𝐴 used, e.g., by the disjunctive value 0:𝐴 + 𝑦 :𝐵 in the body of toRight[A,
B]. Without that type annotation, 0 + 𝑦 :𝐵 needs to be interpreted as a value of type
Either[A, B], where the type parameter 𝐴 must be determined by matching with
the types of other expressions. When it is clear what types are being used, we may
omit type annotations and write simply 0 + 𝑦 instead of 0:𝐴 + 𝑦 :𝐵 .
The type notation for pattern matching is also unconventional because it uses
“function matrices” (matrices whose elements are functions). To motivate that, we
view a match/case expression as a set of functions that map parts of a disjunctive
type into parts of another disjunctive type. Consider this example code:
1 def f: Either[Int, String] => Either[String, Int] = {
2 case Left(x) => Right(10 * x)
3 case Right(y) => Left("a" + y + "b")
4 }
If we ignore the type names (Left and Right), we will see that line 2 is similar to the
function x => 10 * x of type Int => Int, while line 3 is similar to the function y =>
"a" + y + "b" of type String => String. These functions become matrix elements in
the “function matrix” for f:
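$$
f^{:\text{Int}+\text{String}\to\text{String}+\text{Int}} \triangleq
\begin{array}{c||cc}
 & \text{String} & \text{Int} \\
\hline\hline
\text{Int} & \mathbb{0} & x \to 10 * x \\
\text{String} & y \to \text{"a"} + y + \text{"b"} & \mathbb{0}
\end{array}
\quad .
$$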
The rows of the matrix correspond to the case rows in the Scala code. There is one
row for each part of the disjunctive type of the input argument. The columns of
the matrix correspond to the parts of the disjunctive type of the output. The matrix
element in the first row and second column is the function 𝑥 → 10 ∗ 𝑥 that corre-
sponds to line 2 in the Scala code. The result type for that case is Right[Nothing,
Int], which is written as 0 + Int in the type notation. The function 𝑥 → 10 ∗ 𝑥 is
written in the second column to indicate that the result type is 0 + Int. The matrix
element in the first row and the first column is written as 0 because no value of
the type Left is returned in that case.
The matrix element in the second row and first column is the function 𝑦 →
"a" + 𝑦 + "b" that corresponds to line 3 in the Scala code. The result type for that
case is String + 0. The other matrix element in the second row is written as 0,
according to the return type String + 0.
In this way, we translate all lines of the match/case expression into a code matrix.
In each row of the matrix, there can be only one element that is not 0.
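The row-by-row reading of the matrix can be mirrored in Scala. In this sketch (the names `row1`, `row2`, and `fromRows` are assumptions made for illustration), each row becomes a function from one part of the input type to the whole output type, and the standard method `Either.fold` selects the row:

```scala
// Row 1 (input part Int): the only non-void element is x → 10 * x, in the Int column.
val row1: Int => Either[String, Int] = x => Right(10 * x)

// Row 2 (input part String): the only non-void element is y → "a" + y + "b", in the String column.
val row2: String => Either[String, Int] = y => Left("a" + y + "b")

// The whole matrix: choose the row according to the part of the disjunctive input.
def fromRows: Either[Int, String] => Either[String, Int] = _.fold(row1, row2)
```

Under these assumptions, `fromRows` computes the same results as the function `f` defined with `match`/`case` above.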
It turns out that the matrix notation is well adapted to computing forward
compositions of functions that operate on disjunctive types. We will see many examples
of such computations later in this book. In this chapter, we will use code matrices
in Example 5.3.2.4, Example 5.3.4.6, and some others.

5.2.3 The rules of proof for CH -propositions


Reasoning about proof rules begins by translating the eight standard code
constructs into sequents. A proof of a sequent, e.g., CH(𝐴) ⊢ CH(𝑋), will consist
of applying some of those proof rules. We will then combine the code constructs
corresponding to each rule and obtain some code that computes a value of type 𝑋
using an argument of type 𝐴.
Conversely, any fully parametric (and non-recursive) code computing a value
of type 𝑋 must be a combination of some of the eight code constructs. That code
combination can be translated into a combination of logic rules, which will pro-
duce a proof of the proposition CH (𝑋).
In this way, we will get a correspondence between fully parametric programs
and proofs of sequents. This is the second part of the Curry-Howard correspon-
dence.
In the following text, we will need to write CH-propositions such as Eq. (5.1) as
sequents such as Eq. (5.2). As we have seen, CH-propositions involving complicated
type expressions can always be rewritten via CH-propositions for individual
type parameters (CH(𝐴), CH(𝐵), etc.). So, we will only need sequents involving
such CH-propositions. For brevity, we denote 𝛼 ≝ CH(𝐴), 𝛽 ≝ CH(𝐵), etc.
We will use the letter Γ to stand for a set of premises and write a shorter formula
Γ ⊢ 𝛼 instead of the sequent (5.2).
With these notations, we list the rules for proving CH -propositions and the
corresponding code:
1) The Unit value At any place in the code, we may write the expression () of
type Unit. This expression corresponds to a proof of the proposition CH(𝟙) with
any set Γ of premises (even with an empty set of premises). So, the sequent
Γ ⊢ CH(𝟙) is always true. The code corresponding to the proof of this sequent is an
expression that creates a value of the Unit type:

$$ \text{Proof}\,\big(\Gamma \vdash \mathcal{CH}(\mathbb{1})\big) = 1 , $$

where we denoted by 1 the value ().
In formal logic, a sequent that is always true, such as our Γ ⊢ CH(𝟙), is called
an axiom and is written in the following notation:

$$ \frac{}{\Gamma \vdash \mathcal{CH}(\mathbb{1})} \quad (\text{create unit}) . $$

The “fraction with a label” represents a proof rule. The denominator of the fraction
is the target sequent that we need to prove. The numerator of the fraction can have
zero or more other sequents that need to be proved before the target sequent can
be proved. In this case, the set of previous sequents is empty: the target sequent is
an axiom and so requires no previous sequents for its proof. The label “create unit”
is an arbitrary name used to refer to the rule.
2) Use a given value At any place within the code of a fully parametric function,
we may use one of the function’s arguments, say 𝑥 :𝐴 . If some argument has type
𝐴, it means that we already have a value of type 𝐴. So, the corresponding
proposition, 𝛼 ≝ CH(𝐴), belongs to the set of premises of the sequent we are trying to
prove. To indicate this, we write the set of premises as “Γ, 𝛼”. The code construct
x computes a value of type 𝐴, i.e., shows that “𝛼 is true with premises Γ, 𝛼”. That
proposition is the meaning of the sequent Γ, 𝛼 ⊢ 𝛼. The proof code for that sequent
is an expression that just returns the value 𝑥:

$$ \text{Proof}\,(\Gamma, \alpha \vdash \alpha)_{\text{given } x^{:A}} = x . $$

Here, the subscript “given 𝑥 :𝐴 ” indicates that the value 𝑥 :𝐴 must come from the
premises. In this case, the set of premises is Γ, 𝛼 and so the proposition 𝛼 must
have been already proved. The proof of 𝛼 will give a value 𝑥 :𝐴 .
Actually, the premises for the sequent Γ, 𝛼 ⊢ 𝛼 may give us not only a value
𝑥 :𝐴 but also some other values of other types. We may collectively denote those
values by 𝑝 :Γ . But the proof of the sequent Γ, 𝛼 ⊢ 𝛼 does not need to use 𝑝. To
show that explicitly, we may write:

$$ \text{Proof}\,(\Gamma, \alpha \vdash \alpha)_{\text{given } p^{:\Gamma},\, x^{:A}} = x . $$

The sequent Γ, 𝛼 ⊢ 𝛼 is an axiom since its proof requires no previous sequents;
a value of type 𝐴 is already given in the premises. We denote this axiom by:

$$ \frac{}{\Gamma, \alpha \vdash \alpha} \quad (\text{use value}) . $$
3) Create a function At any place in the code, we may compute a nameless
function of type, say, 𝐴 → 𝐵, by writing (x: A) => expr as long as a value expr of type
𝐵 can be computed in the inner scope of the function. The code for expr is also
required to be fully parametric; it may use x and/or other values visible in that
scope. So, we now need to answer the question of whether a fully parametric
function can compute a value of type 𝐵, given an argument of type 𝐴 as well as
all other arguments previously given to the parent function. This question is
answered by a sequent whose premises contain one more proposition, CH(𝐴), in
addition to all previously available premises. Translating this into the language of
CH-propositions, we find that we will prove the sequent:

$$ \Gamma \vdash \mathcal{CH}(A \to B) \;=\; \Gamma \vdash \mathcal{CH}(A) \Rightarrow \mathcal{CH}(B) \;=\; \Gamma \vdash \alpha \Rightarrow \beta $$

if we can prove the sequent Γ, CH(𝐴) ⊢ CH(𝐵) = Γ, 𝛼 ⊢ 𝛽. In the notation of
formal logic, this is a derivation rule (rather than an axiom) and is written as:

$$ \frac{\Gamma, \alpha \vdash \beta}{\Gamma \vdash \alpha \Rightarrow \beta} \quad (\text{create function}) . $$

The turnstile symbol (⊢) groups weaker than other operators. So, we can write
sequents such as (Γ, 𝛼) ⊢ (𝛽 ⇒ 𝛾) with fewer parentheses: Γ, 𝛼 ⊢ 𝛽 ⇒ 𝛾.
What code corresponds to the “create function” rule? The proof of Γ ⊢ 𝛼 ⇒ 𝛽
depends on a proof of another sequent. So, the corresponding code must be a
function that takes a proof of the previous sequent as an argument and returns a
proof of the new sequent. We call that function a proof transformer.
By the CH correspondence, a proof of a sequent corresponds to a code expression
of the type given by the goal of the sequent. That expression may use arguments
of types corresponding to the premises of the sequent. So, a proof of the
sequent Γ, 𝛼 ⊢ 𝛽 is an expression exprB of type 𝐵 that may use a given value of
type 𝐴 as well as any other arguments given previously. Then we can write the
proof code for the sequent Γ ⊢ 𝛼 ⇒ 𝛽 as the nameless function (x: A) => exprB.
This function has type 𝐴 → 𝐵 and requires us to already have a suitable expression
exprB. This exactly corresponds to the proof rule “create function”. That rule’s
proof transformer is:

$$ \text{Proof}\,(\Gamma \vdash \alpha \Rightarrow \beta)_{\text{given } p^{:\Gamma}} = x^{:A} \to \text{Proof}\,(\Gamma, \alpha \vdash \beta)_{\text{given } p^{:\Gamma},\, x^{:A}} . $$
 

4) Use a function At any place in the code, we may apply an already defined
function of type 𝐴 → 𝐵 to an already computed value of type 𝐴. The result will
be a value of type 𝐵. This corresponds to assuming CH(𝐴 → 𝐵) and CH(𝐴), and
then deriving CH(𝐵). The notation for this proof rule is:

$$ \frac{\Gamma \vdash \alpha \qquad \Gamma \vdash \alpha \Rightarrow \beta}{\Gamma \vdash \beta} \quad (\text{use function}) . $$

The code corresponding to this proof rule takes previously computed values x: A
and f: A => B, and writes the expression f(x). This can be written as a function
application:

$$ \text{Proof}\,(\Gamma \vdash \beta) = \text{Proof}\,(\Gamma \vdash \alpha \Rightarrow \beta)\,\big(\text{Proof}\,(\Gamma \vdash \alpha)\big) . $$

Here we omitted the subscripts “given 𝑝 :Γ ” for brevity, since all sequents have the
same premises Γ.
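Rules 2, 3, and 4 already suffice to derive function composition. The sketch below (with the hypothetical name `andThenCH`) implements the sequent CH(𝐴 → 𝐵), CH(𝐵 → 𝐶) ⊢ CH(𝐴 → 𝐶); the comments indicate which rule is believed to correspond to each part of the expression.

```scala
// Premises: f (type A => B) and g (type B => C). Goal: a value of type A => C.
def andThenCH[A, B, C](f: A => B, g: B => C): A => C =
  (x: A) =>   // "create function": adds the premise CH(A), available as the value x
    g(        // "use function" with g, found in the premises via "use value"
      f(x)    // "use function" with f, and "use value" for x
    )
```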
5) Create a tuple If we have already computed some values a: A and b: B, we
may write the expression (a, b) and so compute a value of the tuple type (A, B).
The proof rule is:

$$ \frac{\Gamma \vdash \alpha \qquad \Gamma \vdash \beta}{\Gamma \vdash \alpha \wedge \beta} \quad (\text{create tuple}) . $$

In the code notation, 𝑎 × 𝑏 means the pair (a, b), so we can write the code as:

$$ \text{Proof}\,(\Gamma \vdash \alpha \wedge \beta) = \text{Proof}\,(\Gamma \vdash \alpha) \times \text{Proof}\,(\Gamma \vdash \beta) . $$

This rule describes creating a tuple of 2 values. A larger tuple, such as (w, x,
y, z), can be expressed via nested pairs, e.g., as (w, (x, (y, z))). So, it suffices
to have a derivation rule for creating pairs. That rule allows us to derive the
rules for creating all larger tuples, without having to define separate rules for, say,
Γ ⊢ 𝛼 ∧ 𝛽 ∧ 𝛾.

6) Use a tuple If we already have a value t: (A, B) of a tuple type 𝐴 × 𝐵, we can
extract one of the parts of the tuple and obtain a value of type A or a value of type
B. The code is t._1 and t._2 respectively, and the corresponding proof rules are:

$$ \frac{\Gamma \vdash \alpha \wedge \beta}{\Gamma \vdash \alpha} \quad (\text{use tuple-1}) , \qquad \frac{\Gamma \vdash \alpha \wedge \beta}{\Gamma \vdash \beta} \quad (\text{use tuple-2}) . $$

The proof code can be written as:

$$ \text{Proof}\,(\Gamma \vdash \alpha) = \pi_1\big(\text{Proof}\,(\Gamma \vdash \alpha \wedge \beta)\big) , \qquad \text{Proof}\,(\Gamma \vdash \beta) = \pi_2\big(\text{Proof}\,(\Gamma \vdash \alpha \wedge \beta)\big) , $$

where we introduced the notation 𝜋1 and 𝜋2 to mean the Scala code _._1 and _._2.
Since all tuples can be expressed through pairs, it is sufficient to have proof
rules for pairs.
7) Create a disjunctive value The type Either[A, B] corresponding to the
disjunction 𝛼 ∨ 𝛽 can be used to define any other disjunctive type; e.g., a disjunctive type
with three parts can be expressed as Either[A, Either[B, C]]. So, it suffices to have
proof rules for a disjunction of two propositions.
There are two ways of creating a value of the type Either[A, B]: the code
expressions are Left(x: A) and Right(y: B). The values x: A or y: B must have been
computed previously (and correspond to previously proved sequents). So, the
sequent proof rules are:

$$ \frac{\Gamma \vdash \alpha}{\Gamma \vdash \alpha \vee \beta} \quad (\text{create Left}) , \qquad \frac{\Gamma \vdash \beta}{\Gamma \vdash \alpha \vee \beta} \quad (\text{create Right}) . $$

The corresponding proof transformers can be written using the case class names
Left and Right as:

$$ \text{Proof}\,(\Gamma \vdash \alpha \vee \beta) = \text{Left}\,\big(\text{Proof}\,(\Gamma \vdash \alpha)\big) , $$

$$ \text{Proof}\,(\Gamma \vdash \alpha \vee \beta) = \text{Right}\,\big(\text{Proof}\,(\Gamma \vdash \beta)\big) . $$

8) Use a disjunctive value Pattern matching is the basic way of using a value of
type Either[A, B]:
val result: C = (e: Either[A, B]) match {
  case Left(x: A)  => expr1(x)
  case Right(y: B) => expr2(y)
}

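Here, expr1(x) is an expression of type C computed using x: A and any previously
computed values. Similarly, expr2(y) is computed using y: B and previous values.
The values used in computation correspond to the premises of a sequent.
So, expr1(x) represents a proof of a sequent with an additional premise of type A.
Denoting 𝛾 ≝ CH(𝐶), we write that sequent as: Γ, 𝛼 ⊢ 𝛾. Similarly, expr2(y) is a
proof of the sequent Γ, 𝛽 ⊢ 𝛾. We can compute result only if we can compute e,
expr1, and expr2. So, the proof rule is:

$$ \frac{\Gamma \vdash \alpha \vee \beta \qquad \Gamma, \alpha \vdash \gamma \qquad \Gamma, \beta \vdash \gamma}{\Gamma \vdash \gamma} \quad (\text{use Either}) . $$

The corresponding code can be written as:

$$
\text{Proof}\,(\Gamma \vdash \gamma) = \text{Proof}\,(\Gamma \vdash \alpha \vee \beta)\ \text{match}\
\begin{cases}
\text{case } a^{:A} \to \text{Proof}\,(\Gamma, \alpha \vdash \gamma)_{\text{given } a} \\
\text{case } b^{:B} \to \text{Proof}\,(\Gamma, \beta \vdash \gamma)_{\text{given } b}
\end{cases}
\quad .
$$

We found eight proof rules shown in Table 5.2. These rules define the
intuitionistic propositional logic, also called constructive propositional logic. We will
call this logic “constructive” for short.

Table 5.2: Proof rules for the constructive logic.

axioms:

$$ \frac{}{\Gamma \vdash \mathcal{CH}(\mathbb{1})} \quad (\text{create unit}) \qquad\qquad \frac{}{\Gamma, \alpha \vdash \alpha} \quad (\text{use value}) $$

derivation rules:

$$ \frac{\Gamma, \alpha \vdash \beta}{\Gamma \vdash \alpha \Rightarrow \beta} \quad (\text{create function}) $$

$$ \frac{\Gamma \vdash \alpha \qquad \Gamma \vdash \alpha \Rightarrow \beta}{\Gamma \vdash \beta} \quad (\text{use function}) $$

$$ \frac{\Gamma \vdash \alpha \qquad \Gamma \vdash \beta}{\Gamma \vdash \alpha \wedge \beta} \quad (\text{create tuple}) $$

$$ \frac{\Gamma \vdash \alpha \wedge \beta}{\Gamma \vdash \alpha} \quad (\text{use tuple-1}) \qquad\qquad \frac{\Gamma \vdash \alpha \wedge \beta}{\Gamma \vdash \beta} \quad (\text{use tuple-2}) $$

$$ \frac{\Gamma \vdash \alpha}{\Gamma \vdash \alpha \vee \beta} \quad (\text{create Left}) \qquad\qquad \frac{\Gamma \vdash \beta}{\Gamma \vdash \alpha \vee \beta} \quad (\text{create Right}) $$

$$ \frac{\Gamma \vdash \alpha \vee \beta \qquad \Gamma, \alpha \vdash \gamma \qquad \Gamma, \beta \vdash \gamma}{\Gamma \vdash \gamma} \quad (\text{use Either}) $$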
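To see the tuple and disjunction rules working together, consider the following sketch. The function name `distribute` and its type signature are assumptions chosen for illustration; the comments name the rule that each line is believed to correspond to.

```scala
// Goal: CH(A × (B + C)) ⇒ CH(A × B + A × C), i.e. the formula α ∧ (β ∨ γ) ⇒ (α ∧ β) ∨ (α ∧ γ).
def distribute[A, B, C]: ((A, Either[B, C])) => Either[(A, B), (A, C)] = { p => // create function
  val a: A = p._1                    // use tuple-1
  val e: Either[B, C] = p._2         // use tuple-2
  e match {                          // use Either
    case Left(b)  => Left((a, b))    // create tuple, then create Left
    case Right(c) => Right((a, c))   // create tuple, then create Right
  }
}
```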

5.2.4 Examples: Deriving code from proofs of CH-propositions

Using the proof rules of Table 5.2, we can (in principle) derive code from type
signatures of fully parametric functions. We will now show two simple examples
and perform the derivations step by step.
Example 5.2.4.1 Derive the code for the type signature:
def d[X]: X => (X, X)


Solution First, we formulate the task as proving the proposition “For any type
𝑋, we can have a value of type 𝑋 → 𝑋 × 𝑋”. This corresponds to the proposition
CH(∀𝑋. 𝑋 → 𝑋 × 𝑋). That proposition will be the goal of a sequent. The function
has no arguments, so there are no premises for the sequent. We denote an empty
set of premises by the symbol ∅. So, the sequent is written as:
∅ ⊢ CH(∀𝑋. 𝑋 → 𝑋 × 𝑋) .
We denote 𝜒 ≝ CH(𝑋) and rewrite this sequent using the rules of Table 5.1. The
result is a sequent involving just 𝜒:
∀𝜒. ∅ ⊢ 𝜒 ⇒ 𝜒 ∧ 𝜒 .
Next, we look for a proof of this sequent. For brevity, we will omit the quantifier
∀𝜒 since it will be present in front of every sequent.
We search through the proof rules in Table 5.2, looking for “denominators” that
match our current sequent. If we find such a rule, we will apply that rule to our
sequent. Then we will need to prove the sequents in the rule’s “numerator”.
Beginning with ∅ ⊢ 𝜒 ⇒ 𝜒 ∧ 𝜒, we find a match with the rule “create function”:

$$ \frac{\Gamma, \alpha \vdash \beta}{\Gamma \vdash \alpha \Rightarrow \beta} \quad (\text{create function}) $$

The denominator of that rule is Γ ⊢ 𝛼 ⇒ 𝛽. This pattern will match our sequent
(∅ ⊢ 𝜒 ⇒ 𝜒 ∧ 𝜒) if we set Γ = ∅, 𝛼 = 𝜒, and 𝛽 = 𝜒 ∧ 𝜒. So, we are allowed to apply
the rule “create function” with these assignments.
After these assignments, the rule “create function” becomes:

$$ \frac{\emptyset, \chi \vdash \chi \wedge \chi}{\emptyset \vdash \chi \Rightarrow \chi \wedge \chi} . $$

Now the rule says: we will prove the denominator (∅ ⊢ 𝜒 ⇒ 𝜒 ∧ 𝜒) if we first
prove the numerator (∅, 𝜒 ⊢ 𝜒 ∧ 𝜒).
The set of premises ∅, 𝜒 is the union of an empty set and the set having a single
premise 𝜒. So, we can write the last sequent also as 𝜒 ⊢ 𝜒 ∧ 𝜒 if we like.
To prove that sequent, we again look for a rule whose denominator matches our
sequent. That rule is “create tuple”:

$$ \frac{\Gamma \vdash \alpha \qquad \Gamma \vdash \beta}{\Gamma \vdash \alpha \wedge \beta} \quad (\text{create tuple}) $$

The denominator (Γ ⊢ 𝛼 ∧ 𝛽) will match our sequent (𝜒 ⊢ 𝜒 ∧ 𝜒) if we assign Γ = 𝜒,
𝛼 = 𝜒, 𝛽 = 𝜒. With these assignments, the rule says that we need to prove two
sequents (Γ ⊢ 𝛼 and Γ ⊢ 𝛽), which are in fact the same sequent (𝜒 ⊢ 𝜒).
To prove that sequent, we apply the axiom “use value”:

$$ \frac{}{\Gamma, \alpha \vdash \alpha} . $$

The denominator of that axiom matches ∅, 𝜒 ⊢ 𝜒 if we set Γ = ∅ and 𝛼 = 𝜒. Then
the axiom “use value” becomes:

$$ \frac{}{\emptyset, \chi \vdash \chi} . $$

This axiom says that the sequent ∅, 𝜒 ⊢ 𝜒 (or equivalently 𝜒 ⊢ 𝜒) is already true
with nothing more needed to prove. So, the proof is finished.
We may visualize the proof as a tree shown in Figure 5.1. The tree starts with
the initial sequent and applies rules that require us to prove other sequents. The
tree stops with axioms in leaf positions.

∅ ⊢ 𝜒 ⇒ 𝜒 ∧ 𝜒
└─ rule “create function”
   𝜒 ⊢ 𝜒 ∧ 𝜒
   └─ rule “create tuple”
      ├─ 𝜒 ⊢ 𝜒   (axiom “use value”)
      └─ 𝜒 ⊢ 𝜒   (axiom “use value”)

Figure 5.1: Proof tree for the sequent ∅ ⊢ 𝜒 ⇒ 𝜒 ∧ 𝜒.
Now we need to extract code from the proof. We begin with the leaves of the
tree and go back towards the top of the proof.
The axiom “use value” has the proof code 𝑥, where 𝑥 is given in the premises:

$$ \text{Proof}\,(\Gamma, \alpha \vdash \alpha)_{\text{given } x^{:A}} = x . $$

The proof uses this axiom twice with 𝛼 = 𝜒. Recall that 𝜒 denotes CH(𝑋), and so
the value 𝑥 must have type 𝑋. So, we write:

$$ \text{Proof}\,(\chi \vdash \chi)_{\text{given } x^{:X}} = x . $$

The previous rule used by the proof was “create tuple”. Its proof code is:

$$ \text{Proof}\,(\Gamma \vdash \alpha \wedge \beta) = \text{Proof}\,(\Gamma \vdash \alpha) \times \text{Proof}\,(\Gamma \vdash \beta) . $$

That rule was used with Γ = 𝜒, 𝛼 = 𝜒, and 𝛽 = 𝜒. So, the proof code becomes:

$$
\begin{aligned}
\text{Proof}\,(\chi \vdash \chi \wedge \chi)_{\text{given } x^{:X}} &= \text{Proof}\,(\chi \vdash \chi)_{\text{given } x^{:X}} \times \text{Proof}\,(\chi \vdash \chi)_{\text{given } x^{:X}} \\
&= x \times x .
\end{aligned}
$$

Finally, the first rule “create function” has the proof code:

$$ \text{Proof}\,(\Gamma \vdash \alpha \Rightarrow \beta) = x^{:A} \to \text{Proof}\,(\Gamma, \alpha \vdash \beta)_{\text{given } x^{:A}} . $$

That rule was used with Γ = ∅, 𝐴 = 𝑋, 𝛼 = 𝜒, and 𝛽 = 𝜒 ∧ 𝜒. Using these
assignments, we obtain the code:

$$ \text{Proof}\,(\emptyset \vdash \chi \Rightarrow \chi \wedge \chi) = x^{:X} \to \text{Proof}\,(\chi \vdash \chi \wedge \chi)_{\text{given } x^{:X}} = x^{:X} \to x \times x . $$

In Scala, this code is:


def d[X]: X => (X, X) = (x: X) => (x, x)
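The step-by-step extraction of code from the proof tree in Figure 5.1 can be replayed mechanically. The following is only a sketch under a modeling assumption that is not part of the text: a sequent Γ ⊢ 𝛼 is represented by a Scala function of type `G => A`, a sequent Γ, 𝛼 ⊢ 𝛽 by a function of type `((G, A)) => B`, and the empty set of premises by `Unit`. All the names below (`useValue`, `createFunction`, `dFromProof`, and so on) are hypothetical.

```scala
// Axioms: proofs that need no previously proved sequents.
def createUnit[G]: G => Unit = _ => ()
def useValue[G, A]: ((G, A)) => A = { case (_, x) => x }

// Derivation rules: proof transformers from the "numerator" proofs to the "denominator" proof.
def createFunction[G, A, B](body: ((G, A)) => B): G => A => B = p => x => body((p, x))
def useFunction[G, A, B](arg: G => A, fn: G => A => B): G => B = p => fn(p)(arg(p))
def createTuple[G, A, B](pa: G => A, pb: G => B): G => (A, B) = p => (pa(p), pb(p))
def useTuple1[G, A, B](pab: G => (A, B)): G => A = p => pab(p)._1
def useTuple2[G, A, B](pab: G => (A, B)): G => B = p => pab(p)._2
def createLeft[G, A, B](pa: G => A): G => Either[A, B] = p => Left(pa(p))
def createRight[G, A, B](pb: G => B): G => Either[A, B] = p => Right(pb(p))
def useEither[G, A, B, C](e: G => Either[A, B], l: ((G, A)) => C, r: ((G, B)) => C): G => C =
  p => e(p) match {
    case Left(a)  => l((p, a))
    case Right(b) => r((p, b))
  }

// The proof tree of Figure 5.1: "create function", then "create tuple",
// then "use value" twice. The final (()) supplies the empty set of premises.
def dFromProof[X]: X => (X, X) =
  createFunction[Unit, X, (X, X)](
    createTuple(useValue[Unit, X], useValue[Unit, X])
  )(())
```

Under these assumptions, `dFromProof` behaves in the same way as the hand-derived function `d`.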

Example 5.2.4.2 Derive the code for the type signature:


def s[A, B]: ((A => A) => B) => B
Solution The task is to compute a value of type ((𝐴 → 𝐴) → 𝐵) → 𝐵 for
arbitrary types 𝐴, 𝐵 without any arguments. This is written as the sequent:

∅ ⊢ CH(∀(𝐴, 𝐵). ((𝐴 → 𝐴) → 𝐵) → 𝐵) .

Denote 𝛼 ≝ CH(𝐴) and 𝛽 ≝ CH(𝐵), and rewrite the sequent using the rules of
Table 5.1 to obtain a logic formula that involves just 𝛼 and 𝛽:

∀(𝛼, 𝛽). ∅ ⊢ ((𝛼 ⇒ 𝛼) ⇒ 𝛽) ⇒ 𝛽 . (5.6)

The next step is to prove the sequent (5.6). For brevity, we will omit the
quantifier ∀(𝛼, 𝛽) since it will be present in front of every sequent.
Begin by looking for a proof rule whose “denominator” has a sequent similar to
Eq. (5.6), i.e., has an implication (𝑝 ⇒ 𝑞) in the goal. We have only one rule with
the “denominator” of the form Γ ⊢ 𝑝 ⇒ 𝑞. That rule is “create function”, which we
will rewrite for clarity as:

$$ \frac{\Gamma, p \vdash q}{\Gamma \vdash p \Rightarrow q} \quad (\text{create function}) $$

To match the denominator, we use this rule with the assignments Γ = ∅,
𝑝 ≝ (𝛼 ⇒ 𝛼) ⇒ 𝛽 and 𝑞 ≝ 𝛽:

$$ \frac{(\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \beta}{\emptyset \vdash ((\alpha \Rightarrow \alpha) \Rightarrow \beta) \Rightarrow \beta} \quad (\text{create function}) $$

The rule’s numerator now requires us to prove the sequent (𝛼 ⇒ 𝛼) ⇒ 𝛽 ⊢ 𝛽.
We may write that sequent as 𝛾 ⊢ 𝛽, where we defined 𝛾 ≝ (𝛼 ⇒ 𝛼) ⇒ 𝛽 for
brevity.
So, the next step is to prove the sequent 𝛾 ⊢ 𝛽. The premise (𝛾) contains an
implication. But there is no proof rule whose denominator has a premise in the
form of an implication (𝑝 ⇒ 𝑞). Instead, we have the rule “use function” whose
denominator contains an arbitrary sequent:

$$ \frac{\Gamma \vdash p \qquad \Gamma \vdash p \Rightarrow q}{\Gamma \vdash q} \quad (\text{use function}) $$

This rule’s denominator matches 𝛾 ⊢ 𝛽 if we set Γ = 𝛾 and 𝑞 = 𝛽. But it is not clear
how to choose 𝑝. After some trial and error, one finds that the proof will work if
we set 𝑝 = (𝛼 ⇒ 𝛼):

$$ \frac{\gamma \vdash \alpha \Rightarrow \alpha \qquad \gamma \vdash (\alpha \Rightarrow \alpha) \Rightarrow \beta}{\gamma \vdash \beta} \quad (\text{use function}) $$

This rule’s numerator now requires us to prove two new sequents: 𝛾 ⊢ 𝛼 ⇒ 𝛼
and 𝛾 ⊢ (𝛼 ⇒ 𝛼) ⇒ 𝛽. To prove the first of these sequents, apply the rule
“create function” like this:

$$ \frac{\gamma, \alpha \vdash \alpha}{\gamma \vdash \alpha \Rightarrow \alpha} \quad (\text{create function}) $$

The sequent in the numerator 𝛾, 𝛼 ⊢ 𝛼 is proved directly by the axiom “use value”.
The sequent 𝛾 ⊢ (𝛼 ⇒ 𝛼) ⇒ 𝛽 is the same as 𝛾 ⊢ 𝛾 and is also proved by the axiom
“use value”.
The proof of the sequent (5.6) is now complete and can be drawn as a tree (see
Figure 5.2). The next step is to convert that proof to Scala code.
To do that, we combine the code expressions that correspond to each of the
proof rules we used. We need to retrace the proof backwards, starting from the
leaves of the tree and going towards the root. We will then combine the
corresponding Proof (...) code expressions.
Begin with the left-most leaf: “use value”. That rule gives the code 𝑥 :𝐴 :

$$ \text{Proof}\,(\gamma, \alpha \vdash \alpha)_{\text{given } x^{:A}} = x^{:A} . $$

Here “given 𝑥 :𝐴 ” means that 𝑥 :𝐴 must be a proof of the premise 𝛼 in the sequent
𝛾, 𝛼 ⊢ 𝛼 (recall that 𝛼 denotes CH(𝐴), and so 𝑥 has type 𝐴). We need to use the
same 𝑥 :𝐴 when we write the code for the previous rule, “create function”:

$$ \text{Proof}\,(\gamma \vdash \alpha \Rightarrow \alpha) = x^{:A} \to \text{Proof}\,(\gamma, \alpha \vdash \alpha)_{\text{given } x^{:A}} = (x^{:A} \to x) . $$

Note that in this code we are able to use a value 𝑥 of type 𝐴 even though no
such value is given as an argument of our function s[A, B]. The reason is that the
sequent 𝛾, 𝛼 ⊢ 𝛼 has an extra premise 𝛼 added to the set of premises at this step of
the proof. Once we are finished with this step, we again will not have any values
of type 𝐴 available. In the code, this corresponds to the local scoping of the bound
value 𝑥 :𝐴 in the function 𝑥 :𝐴 → 𝑥.

∅ ⊢ ((𝛼 ⇒ 𝛼) ⇒ 𝛽) ⇒ 𝛽
└─ rule “create function”
   (𝛼 ⇒ 𝛼) ⇒ 𝛽 ⊢ 𝛽
   └─ rule “use function”
      ├─ (𝛼 ⇒ 𝛼) ⇒ 𝛽 ⊢ 𝛼 ⇒ 𝛼
      │  └─ rule “create function”
      │     (𝛼 ⇒ 𝛼) ⇒ 𝛽, 𝛼 ⊢ 𝛼
      │     └─ axiom “use value”
      └─ (𝛼 ⇒ 𝛼) ⇒ 𝛽 ⊢ (𝛼 ⇒ 𝛼) ⇒ 𝛽
         └─ axiom “use value”

Figure 5.2: Proof tree for sequent (5.6).

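We continue tracing the proof tree bottom-up. The right-most leaf “use value”
corresponds to the code 𝑓 :(𝐴→𝐴)→𝐵 , where 𝑓 is the code corresponding to the
premise 𝛾 = (𝛼 ⇒ 𝛼) ⇒ 𝛽. So, we can write:

$$ \text{Proof}\,(\gamma \vdash \gamma)_{\text{given } f^{:(A \to A) \to B}} = f^{:(A \to A) \to B} . $$

The previous rule (“use function”) combines the two preceding proofs:

$$
\begin{aligned}
& \text{Proof}\,((\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \beta)_{\text{given } f^{:(A \to A) \to B}} \\
&\quad = \text{Proof}\,(\gamma \vdash \gamma)\,\big(\text{Proof}\,(\gamma \vdash \alpha \Rightarrow \alpha)\big)_{\text{given } f^{:(A \to A) \to B}} \\
&\quad = f(x^{:A} \to x) .
\end{aligned}
$$

Going further backwards, we find that the rule applied before “use function” was
“create function”. We need to provide the same 𝑓 :(𝐴→𝐴)→𝐵 as in the premise above,
and so we obtain the code:

$$
\begin{aligned}
& \text{Proof}\,(\emptyset \vdash ((\alpha \Rightarrow \alpha) \Rightarrow \beta) \Rightarrow \beta) \\
&\quad = f^{:(A \to A) \to B} \to \text{Proof}\,((\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \beta)_{\text{given } f^{:(A \to A) \to B}} \\
&\quad = f^{:(A \to A) \to B} \to f(x^{:A} \to x) .
\end{aligned}
$$

This is the final code expression that implements the type ((𝐴 → 𝐴) → 𝐵) → 𝐵.
In this way, we have systematically derived the code from the type signature of a
function. That code can be written in Scala as: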
def s[A, B]: ((A => A) => B) => B = { (f: (A => A) => B) => f(x => x) }
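As a quick sanity check of the derived code, here is a minimal usage sketch. The argument `f` must have the type `(A => A) => B` as required by the type signature; the concrete types and the value name `result` are chosen only for illustration.

```scala
// The derived code: the argument f has type (A => A) => B,
// and the only function of type A => A that s can pass to f is the identity.
def s[A, B]: ((A => A) => B) => B = { f => f(x => x) }

// Usage with A = Int and B = String: the caller's function receives the identity as k.
val result: String = s[Int, String](k => "k(10) = " + k(10))
```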

We found the proof tree in Figure 5.2 by combining various proof rules that
match our sequents. But we had to guess how to apply the “use function” rule: it
was not obvious how to assign the rule’s variable 𝑝. If we somehow find a proof
tree for a sequent, we can derive the corresponding code (perform “code inference”
from type). As we have seen, choosing the proof rules from Table 5.2 requires
guessing or trying different possibilities.
In other words, the rules in Table 5.2 do not provide an algorithm for finding a
proof tree automatically. It turns out that one can replace the rules in Table 5.2 by
a different but equivalent set of derivation rules that do give an algorithm (called
the “LJT algorithm”, see Section 5.2.5 below). That algorithm either finds that the
given formula cannot be proved, or it finds a proof and infers code that has the
given type signature.
The library curryhoward² implements the LJT algorithm. Here are some examples
of using this library for code inference. We will run the ammonite³ shell to load the
library more easily.
As a non-trivial (but artificial) example, consider the type signature:
∀( 𝐴, 𝐵). (((( 𝐴 → 𝐵) → 𝐴) → 𝐴) → 𝐵) → 𝐵 .
It is not obvious whether a function with this type signature exists. The LJT algo-
rithm can figure that out and derive the code automatically. The library does this
via the method implement:
@ import $ivy.`io.chymyst::curryhoward:0.3.8`, io.chymyst.ch._

@ def f[A, B]: ((((A => B) => A) => A) => B) => B = implement
defined function f

@ println(f.lambdaTerm.prettyPrint)
a => a (b => b (c => a (d => c)))
The code 𝑎 → 𝑎 (𝑏 → 𝑏 (𝑐 → 𝑎 (𝑑 → 𝑐))) was produced automatically for the
function f. The function f has been compiled and is ready to be used in any sub-
sequent code.
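The printed lambda-term can also be typed in by hand and checked by the ordinary Scala compiler, without the library. This is a sketch; the test argument `arg` and its concrete types are assumptions made for illustration.

```scala
// The lambda-term reported above, written as ordinary Scala code.
// The compiler accepts it, which confirms that the term has the required type.
def f[A, B]: ((((A => B) => A) => A) => B) => B =
  a => a(b => b(c => a(d => c)))

// A hypothetical argument for testing, with A = Int and B = String.
val arg: (((Int => String) => Int) => Int) => String = h => "n=" + h(_ => 7)
val out: String = f[Int, String](arg)
```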
A compile-time error occurs when no fully parametric function has the given
type signature:
@ def g[A, B]: ((A => B) => A) => A = implement
cmd3.sc:1: type ((A => B) => A) => A cannot be implemented
The logical formula corresponding to this type signature is:
∀(𝛼, 𝛽). ((𝛼 ⇒ 𝛽) ⇒ 𝛼) ⇒ 𝛼 . (5.7)
² https://github.com/Chymyst/curryhoward
³ http://ammonite.io/#Ammonite-Shell


This formula is known as Peirce’s law⁴ and gives an example showing that the
logic of types in functional programming languages is not Boolean (other examples
are shown in Sections 5.2.6 and 5.5.4). Peirce’s law is true in Boolean logic but
does not hold in the constructive logic, i.e., it cannot be derived using the proof
rules in Table 5.2. If we try to implement g[A, B] with the type signature shown
above via fully parametric code, we will fail to write code that compiles without
type errors. This is because no such code exists, not because we are insufficiently
clever. The LJT algorithm can prove that the given type signature cannot
be implemented. The curryhoward library will then print an error message, and
compilation will fail.
As another example, let us verify that the type signature from Section 5.1.1 is
not implementable:
@ def bad2[A, B, C](g: A => Either[B, C]): Either[A => B, A => C] = implement
cmd4.sc:1: type (A => Either[B, C]) => Either[A => B, A => C] cannot be implemented

The LJT algorithm will sometimes find several inequivalent proofs of the same
logic formula. In that case, each of the different proofs will be automatically trans-
lated into code. The curryhoward library uses heuristics to try finding the code that
has the least information loss. In many cases, the heuristics will select the imple-
mentation that is most useful to the programmer.
The rules of constructive logic and the LJT algorithm define rigorously what
it means to write code “guided by the types”. However, in order to use the LJT
algorithm well, a programmer needs to learn how to infer code from types by
hand. We will practice doing that throughout the book.

5.2.5 The LJT algorithm


The LJT algorithm solves an important problem: namely, that the logic rules in
Table 5.7 do not provide an algorithm for finding a proof for a given sequent. In
the previous section, we saw an example showing that searching for a proof of a
sequent via Table 5.7 sometimes requires guessing. To illustrate this difficulty on
another example, let us try proving the sequent:

𝐴, 𝐵 ∨ 𝐶 ⊢ (𝐴 ∧ 𝐵) ∨ 𝐶 .

We expect that this sequent is provable because we can write the corresponding
Scala code:
def f[A, B, C](a: A): Either[B, C] => Either[(A, B), C] = {
case Left(b) => Left((a, b))
case Right(c) => Right(c)
}

[4] https://en.wikipedia.org/wiki/Peirce%27s_law


How can we obtain a proof of this sequent via Table 5.7? We could potentially
apply the rules “create Left”, “create Right”, “use Either”, and “use function”. But we
will get stuck at the next step, no matter what rule we choose. Let us see why:
To apply “create Left”, we first need to prove the sequent 𝐴, 𝐵 ∨ 𝐶 ⊢ 𝐴 ∧ 𝐵. But
this sequent cannot be proved: we do not necessarily have values of both types
𝐴 and 𝐵 if we are only given values of type 𝐴 and of type Either[B, C]. To apply
“create Right”, we need to prove the sequent 𝐴, 𝐵 ∨ 𝐶 ⊢ 𝐶. Again, we find that this
sequent cannot be proved. The next choice is the rule “use Either” that matches
any goal of the sequent as the proposition 𝛾. But we are then required to choose
two new propositions (𝛼 and 𝛽) such that we can prove 𝐴, 𝐵 ∨ 𝐶 ⊢ 𝛼 ∨ 𝛽 as well
as 𝐴, 𝐵 ∨ 𝐶, 𝛼 ⊢ (𝐴 ∧ 𝐵) ∨ 𝐶 and 𝐴, 𝐵 ∨ 𝐶, 𝛽 ⊢ (𝐴 ∧ 𝐵) ∨ 𝐶. It is not clear how we
should choose 𝛼 and 𝛽 in order to make progress with the proof. The remaining
rule, “use function”, similarly requires us to choose a new proposition 𝛼 such that
we can prove 𝐴, 𝐵 ∨ 𝐶 ⊢ 𝛼 and 𝐴, 𝐵 ∨ 𝐶 ⊢ 𝛼 ⇒ ((𝐴 ∧ 𝐵) ∨ 𝐶). The rules give us no
guidance for choosing 𝛼 appropriately.
The rules in Table 5.2 are not helpful for proof search because the rules “use
function” and “use Either” require us to choose new unknown propositions and to
prove sequents more complicated than the ones we had before.
For instance, the rule “use function” gives a proof of Γ ⊢ 𝛽 only if we first choose
some other proposition 𝛼 and prove the sequents Γ ⊢ 𝛼 and Γ ⊢ 𝛼 ⇒ 𝛽. The rule
does not tell us how to choose the proposition 𝛼 correctly. We need to guess the
correct 𝛼 by trial and error. Even after choosing 𝛼 in some way, we will have to
prove a more complicated sequent (Γ ⊢ 𝛼 ⇒ 𝛽). It is not guaranteed that we are
getting closer to finding the proof of the initial sequent (Γ ⊢ 𝛽).
It is far from obvious how to overcome that difficulty. Mathematicians have
studied the constructive logic for more than 60 years, trying to replace the rules
in Table 5.7 by a different but equivalent set of derivation rules that require no
guessing when looking for a proof. The first partial success came in 1935 with an
algorithm called “LJ” [5]. The LJ algorithm works in many cases but still has a
significant problem: one of its derivation rules may be applied infinitely many times,
leading to an infinite loop. So, the LJ algorithm is not guaranteed to terminate
without some heuristics for avoiding infinite loops. This problem is solved by a
modification of the LJ algorithm, called LJT, first formulated in 1992 [6].
We will begin with the LJ algorithm. Although that algorithm does not guaran-
tee termination, it is simpler to understand and to apply by hand. Then we will
show how to modify the LJ algorithm in order to obtain the always-terminating
LJT algorithm.
The LJ algorithm

Figure 5.3 shows the LJ algorithm’s axioms and derivation
rules. Each rule says that the bottom sequent will be proved if proofs are given for
[5] See https://en.wikipedia.org/wiki/Sequent_calculus#Overview
[6] An often cited paper by R. Dyckhoff is https://philpapers.org/rec/DYCCSC. For the history
of that research, see https://research-repository.st-andrews.ac.uk/handle/10023/8824


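$$\frac{}{\Gamma, X \vdash X}\ (\text{Id}) \qquad \frac{}{\Gamma \vdash \top}\ (\text{True})$$

$$\frac{\Gamma, A \Rightarrow B \vdash A \qquad \Gamma, B \vdash C}{\Gamma, A \Rightarrow B \vdash C}\ (\text{Left}\Rightarrow) \qquad \frac{\Gamma, A \vdash B}{\Gamma \vdash A \Rightarrow B}\ (\text{Right}\Rightarrow)$$

$$\frac{\Gamma, A_i \vdash C}{\Gamma, A_1 \wedge A_2 \vdash C}\ (\text{Left}\wedge_i) \qquad \frac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \wedge B}\ (\text{Right}\wedge)$$

$$\frac{\Gamma, A \vdash C \qquad \Gamma, B \vdash C}{\Gamma, A \vee B \vdash C}\ (\text{Left}\vee) \qquad \frac{\Gamma \vdash A_i}{\Gamma \vdash A_1 \vee A_2}\ (\text{Right}\vee_i)$$

Figure 5.3: Axioms and derivation rules of the LJ algorithm. Each of the rules
“(Left∧𝑖)” and “(Right∨𝑖)” has two versions, with 𝑖 = 1 or 𝑖 = 2.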

sequent(s) at the top. For each possible sub-expression (conjunction 𝑋 ∧ 𝑌, disjunction
𝑋 ∨ 𝑌, and implication 𝑋 ⇒ 𝑌) there is one rule where that sub-expression is
a premise (at “left”) and one rule where that sub-expression is the goal (at “right”).
Those sub-expressions are shown in red in Figure 5.3 to help us look for a proof.
To find out which rules apply, we match some part of the sequent with a red
sub-expression.
It turns out that the rules in Figure 5.3 are equivalent to the rules in Table 5.2. The
proof is beyond the scope of this book. We only remark that this equivalence is
far from obvious. To prove it, one needs to demonstrate that any sequent derived
through the first set of rules is also derivable through the second set, and vice
versa.
To illustrate the LJ algorithm, let us prove the sequent (5.6). Denote that sequent
by 𝑆0 :
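$$S_0 \overset{\text{def}}{=} \emptyset \vdash ((\alpha \Rightarrow \alpha) \Rightarrow \beta) \Rightarrow \beta .$$
Since the goal of 𝑆0 contains an implication, we use the rule “(Right ⇒)” and get a
sequent 𝑆1:
$$S_1 \overset{\text{def}}{=} (\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \beta .$$
Now the implication is in the premise, so we use the rule “(Left ⇒)” and get two
new sequents:

$$S_2 \overset{\text{def}}{=} (\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \alpha \Rightarrow \alpha , \qquad S_3 \overset{\text{def}}{=} \beta \vdash \beta .$$

Sequent 𝑆3 follows from the “(Id)” axiom, so it remains to prove 𝑆2. Since 𝑆2 contains
an implication both as a premise and as the goal, we may apply either the
rule “(Left ⇒)” or the rule “(Right ⇒)”. We choose to apply “(Left ⇒)” and get two
new sequents:

$$S_4 \overset{\text{def}}{=} (\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \alpha \Rightarrow \alpha , \qquad S_5 \overset{\text{def}}{=} \beta \vdash \alpha \Rightarrow \alpha .$$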

Notice that 𝑆4 = 𝑆2 . So, our proof search is getting into an infinite loop trying to
prove the same sequent 𝑆2 over and over again. We can prove 𝑆5 but this will not
help us break the loop.
Once we recognize the problem, we backtrack to the point where we chose to
apply “(Left ⇒)” to 𝑆2 . That was a bad choice, so let us instead apply “(Right ⇒)”
to 𝑆2 . This yields a new sequent 𝑆6 :

$$S_6 \overset{\text{def}}{=} (\alpha \Rightarrow \alpha) \Rightarrow \beta, \alpha \vdash \alpha .$$

This sequent follows from the “(Id)” axiom. There are no more sequents to prove,
so the proof of 𝑆0 is finished. It can be drawn as a proof tree like this:

                                        ┌── 𝑆3 ──▶ (Id)
── 𝑆0 ──▶ (Right ⇒) ── 𝑆1 ──▶ (Left ⇒) ─┤
                                        └── 𝑆2 ──▶ (Right ⇒) ── 𝑆6 ──▶ (Id)

The nodes of the proof tree are axioms or derivation rules, and the edges are in-
termediate sequents required by the rules. Some rule nodes branch into several
sequents because some rules require more than one new sequent to be proved.
The leaves of the tree are axioms that do not require proving any further sequents.
Extracting code from proofs

According to the Curry-Howard correspondence,
a sequent of the form CH(𝐴), CH(𝐵), ..., CH(𝐶) ⊢ CH(𝑋) represents the task of
writing a fully parametric code expression of type 𝑋 that uses some given values
of types 𝐴, 𝐵, ..., 𝐶. The sequent is true (i.e., can be proved) if that code expression
can be found. So, the code serves as an “evidence of proof” for the sequent.
In the previous subsection, we have found a proof of the sequent 𝑆0, which
represents the task of writing a fully parametric function with type signature
((𝐴 → 𝐴) → 𝐵) → 𝐵. Let us now see how we can extract the code of that function
from the proof of the sequent 𝑆0.
We start from the leaves of the proof tree and move step by step towards the
initial sequent. At each step, we shorten the proof tree by replacing some sequent
by its corresponding evidence-of-proof code. Eventually we will replace the initial
sequent by its corresponding code. Let us see how this procedure works for the
proof tree of the sequent 𝑆0 shown in the previous section.
Since the leaves are axioms, let us write the code corresponding to each axiom
of LJ:

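$$(\text{Id}): \quad \frac{}{\Gamma, X \vdash X} \qquad \text{Proof}\,(\Gamma, X \vdash X)_{\text{given } p^{:\Gamma},\, x^{:X}} = x \ ;$$

$$(\text{True}): \quad \frac{}{\Gamma \vdash \top} \qquad \text{Proof}\,(\Gamma \vdash \top)_{\text{given } p^{:\Gamma}} = 1 \ .$$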
Here we denote explicitly the values (such as 𝑝 and 𝑥) given as premises to the
sequent. The notation 𝑝 :Γ means all values given in the set of premises Γ. Below

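$$(\text{Id}) \quad \frac{}{\Gamma, A \vdash A} \qquad \text{Proof}\,(\Gamma, A \vdash A)_{\text{given } p^{:\Gamma},\, x^{:A}} = x$$

$$(\text{True}) \quad \frac{}{\Gamma \vdash \top} \qquad \text{Proof}\,(\Gamma \vdash \top)_{\text{given } p^{:\Gamma}} = 1$$

$$(\text{Left}\Rightarrow) \quad \frac{\Gamma, A \Rightarrow B \vdash A \qquad \Gamma, B \vdash C}{\Gamma, A \Rightarrow B \vdash C} \qquad \text{Proof}\,(\Gamma, A \Rightarrow B \vdash C)_{\text{given } p^{:\Gamma},\, q^{:A \to B}} = \text{Proof}\,(\Gamma, B \vdash C)_{\text{given } p,\, b^{:B}}$$
$$\text{where} \quad b^{:B} \overset{\text{def}}{=} q\big(\text{Proof}\,(\Gamma, A \Rightarrow B \vdash A)_{\text{given } p,\, q}\big)$$

$$(\text{Right}\Rightarrow) \quad \frac{\Gamma, A \vdash B}{\Gamma \vdash A \Rightarrow B} \qquad \text{Proof}\,(\Gamma \vdash A \Rightarrow B)_{\text{given } p^{:\Gamma}} = x^{:A} \to \text{Proof}\,(\Gamma, A \vdash B)_{\text{given } p^{:\Gamma},\, x^{:A}}$$

$$(\text{Left}\wedge_1) \quad \frac{\Gamma, A \vdash C}{\Gamma, A \wedge B \vdash C} \qquad \text{Proof}\,(\Gamma, A \wedge B \vdash C)_{\text{given } p^{:\Gamma},\, (a^{:A} \times b^{:B})} = \text{Proof}\,(\Gamma, A \vdash C)_{\text{given } p^{:\Gamma},\, a^{:A}}$$

$$(\text{Left}\wedge_2) \quad \frac{\Gamma, B \vdash C}{\Gamma, A \wedge B \vdash C} \qquad \text{Proof}\,(\Gamma, A \wedge B \vdash C)_{\text{given } p^{:\Gamma},\, (a^{:A} \times b^{:B})} = \text{Proof}\,(\Gamma, B \vdash C)_{\text{given } p^{:\Gamma},\, b^{:B}}$$

$$(\text{Right}\wedge) \quad \frac{\Gamma \vdash A \qquad \Gamma \vdash B}{\Gamma \vdash A \wedge B} \qquad \text{Proof}\,(\Gamma \vdash A \wedge B)_{\text{given } p^{:\Gamma}} = \text{Proof}\,(\Gamma \vdash A)_{\text{given } p^{:\Gamma}} \times \text{Proof}\,(\Gamma \vdash B)_{\text{given } p^{:\Gamma}}$$

$$(\text{Left}\vee) \quad \frac{\Gamma, A \vdash C \qquad \Gamma, B \vdash C}{\Gamma, A \vee B \vdash C} \qquad \text{Proof}\,(\Gamma, A \vee B \vdash C)_{\text{given } p^{:\Gamma},\, q^{:A + B}} = q \triangleright \begin{array}{|c||c|} \hline & C \\ \hline A & x^{:A} \to \text{Proof}\,(\Gamma, A \vdash C)_{\text{given } p,\, x} \\ B & y^{:B} \to \text{Proof}\,(\Gamma, B \vdash C)_{\text{given } p,\, y} \\ \hline \end{array}$$

$$(\text{Right}\vee_1) \quad \frac{\Gamma \vdash A}{\Gamma \vdash A \vee B} \qquad \text{Proof}\,(\Gamma \vdash A \vee B)_{\text{given } p^{:\Gamma}} = \text{Proof}\,(\Gamma \vdash A) + 0^{:B}$$

$$(\text{Right}\vee_2) \quad \frac{\Gamma \vdash B}{\Gamma \vdash A \vee B} \qquad \text{Proof}\,(\Gamma \vdash A \vee B)_{\text{given } p^{:\Gamma}} = 0^{:A} + \text{Proof}\,(\Gamma \vdash B)$$

Figure 5.4: Proof transformers for the rules of the LJ algorithm.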


we will assume that the propositions 𝛼 and 𝛽 correspond to types 𝐴 and 𝐵; that
is, 𝛼 ≝ CH(𝐴) and 𝛽 ≝ CH(𝐵).
The leaves in the proof tree for 𝑆0 are the “(Id)” axioms used to prove the
sequents 𝑆3 and 𝑆6. Let us write the code that serves as the “evidence of proof” for
these sequents. For brevity, we denote 𝛾 ≝ (𝛼 ⇒ 𝛼) ⇒ 𝛽 and 𝐶 ≝ (𝐴 → 𝐴) → 𝐵,
so that 𝛾 = CH(𝐶). Then we can write:
$$S_3 \overset{\text{def}}{=} \beta \vdash \beta , \qquad \text{Proof}\,(S_3)_{\text{given } y^{:B}} = y ,$$
$$S_6 \overset{\text{def}}{=} \gamma, \alpha \vdash \alpha , \qquad \text{Proof}\,(S_6)_{\text{given } q^{:C},\, x^{:A}} = x .$$
Note that the proof of 𝑆6 does not use the first given value 𝑞 of type 𝐶 (corresponding to
the premise 𝛾).
We now shorten the proof tree by replacing the sequents 𝑆3 and 𝑆6 by their
“evidence of proof”:
                                        ┌──▶ (𝑦) given 𝑦:𝐵
── 𝑆0 ──▶ (Right ⇒) ── 𝑆1 ──▶ (Left ⇒) ─┤
                                        └── 𝑆2 ──▶ (Right ⇒) ──▶ (𝑥) given 𝑞:𝐶, 𝑥:𝐴
The next step is to consider the proof of 𝑆2 , which is found by applying the rule
“(Right ⇒)”. This rule promises to give a proof of 𝑆2 if we have a proof of 𝑆6 .
In order to extract code from that rule, we can write a function that transforms a
proof of 𝑆6 into a proof of 𝑆2 . That function is the proof transformer corresponding
to the rule “(Right ⇒)”. That rule and its transformer are defined as:
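$$(\text{Right}\Rightarrow): \quad \frac{\Gamma, A \vdash B}{\Gamma \vdash A \Rightarrow B} \qquad \text{Proof}\,(\Gamma \vdash A \Rightarrow B)_{\text{given } p^{:\Gamma}} = x^{:A} \to \text{Proof}\,(\Gamma, A \vdash B)_{\text{given } p^{:\Gamma},\, x^{:A}} .$$
Applying the proof transformer to the known proof of 𝑆6, we obtain a proof of 𝑆2:
$$\text{Proof}\,(S_2)_{\text{given } q^{:C}} = x^{:A} \to \text{Proof}\,(S_6)_{\text{given } q^{:C},\, x^{:A}} = (x^{:A} \to x)_{\text{given } q^{:C}} .$$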
The proof tree can be now shortened to:
                                        ┌──▶ (𝑦) given 𝑦:𝐵
── 𝑆0 ──▶ (Right ⇒) ── 𝑆1 ──▶ (Left ⇒) ─┤
                                        └──▶ (𝑥:𝐴 → 𝑥) given 𝑞:𝐶

The next step is to get the proof of 𝑆1 obtained by applying the rule “(Left ⇒)”.
That rule requires two previous sequents, so its transformer is a function of two
previously obtained proofs:
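$$(\text{Left}\Rightarrow): \quad \frac{\Gamma, A \Rightarrow B \vdash A \qquad \Gamma, B \vdash C}{\Gamma, A \Rightarrow B \vdash C}$$
$$\text{Proof}\,(\Gamma, A \Rightarrow B \vdash C)_{\text{given } p^{:\Gamma},\, q^{:A \to B}} = \text{Proof}\,(\Gamma, B \vdash C)_{\text{given } p^{:\Gamma},\, b^{:B}}$$
$$\text{where} \quad b^{:B} \overset{\text{def}}{=} q\big(\text{Proof}\,(\Gamma, A \Rightarrow B \vdash A)_{\text{given } p^{:\Gamma},\, q^{:A \to B}}\big) .$$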


In the proof tree shown above, we obtain a proof of 𝑆1 by applying that proof
transformer to the proofs of 𝑆2 and 𝑆3 :
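$$\text{Proof}\,(S_1)_{\text{given } q^{:C}} = \text{Proof}\,(S_3)_{\text{given } b^{:B}} \quad \text{where} \quad b^{:B} \overset{\text{def}}{=} q\big(\text{Proof}\,(S_2)\big)_{\text{given } q^{:C}}$$
$$= b \quad \text{where} \quad b^{:B} \overset{\text{def}}{=} q(x^{:A} \to x)_{\text{given } q^{:C}} \quad = q(x^{:A} \to x)_{\text{given } q^{:C}} .$$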
Substituting this proof into the proof tree, we shorten the tree to:
── 𝑆0 ──▶ (Right ⇒) ──▶ 𝑞(𝑥:𝐴 → 𝑥) given 𝑞:𝐶
It remains to obtain the proof of 𝑆0 by applying the proof transformer of the rule
“(Right ⇒)”:
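$$\text{Proof}\,(S_0) = \text{Proof}\,\big(\emptyset \vdash ((\alpha \Rightarrow \alpha) \Rightarrow \beta) \Rightarrow \beta\big)$$
$$= q^{:(A \to A) \to B} \to \text{Proof}\,(S_1)_{\text{given } q^{:C}} = q^{:(A \to A) \to B} \to q(x^{:A} \to x) .$$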
The proof tree is now shortened to just the code 𝑞 :( 𝐴→𝐴)→𝐵 → 𝑞(𝑥 :𝐴 → 𝑥), which
has type (( 𝐴 → 𝐴) → 𝐵) → 𝐵. So, that code is an evidence of proof for 𝑆0 . In
this way, we have derived the code of a fully parametric function from its type
signature.
Figure 5.4 shows the proof transformers for all the rules of the LJ algorithm.
Apart from the special rule “(Left ⇒)”, all other rules have proof transformers us-
ing just one of the code constructions (“create function”, “create tuple”, “use tuple”,
etc.) allowed within fully parametric code.
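The extraction steps for 𝑆0 can be replayed in Scala. The following sketch rests on an assumption that is not part of the text: a proof of a sequent with premises of type P and goal X is modeled as a function P => X, with several premises packed into nested tuples and the empty set of premises modeled as Unit. The names rightImp, leftImp, viaTransformers, and direct are hypothetical.

```scala
// Assumption of this sketch: a proof of "premises ⊢ goal" is a function from the
// premises (type P) to the goal. This encoding is illustrative, not the book's own.

// Transformer for (Right ⇒): from a proof of Γ, A ⊢ B build a proof of Γ ⊢ A ⇒ B.
def rightImp[P, A, B](proof: ((P, A)) => B): P => A => B =
  p => a => proof((p, a))

// Transformer for (Left ⇒): from proofs of Γ, A ⇒ B ⊢ A and Γ, B ⊢ C
// build a proof of Γ, A ⇒ B ⊢ C.
def leftImp[P, A, B, C](proofA: ((P, A => B)) => A, proofC: ((P, B)) => C): ((P, A => B)) => C = {
  case (p, q) => proofC((p, q(proofA((p, q)))))
}

// Replaying the proof tree of S0 from the leaves towards the root.
def viaTransformers[A, B]: Unit => ((A => A) => B) => B = {
  type Q = (A => A) => B
  val s6: (((Unit, Q), A)) => A = _._2                            // (Id)
  val s3: ((Unit, B)) => B = _._2                                 // (Id)
  val s2: ((Unit, Q)) => A => A = rightImp[(Unit, Q), A, A](s6)   // (Right ⇒)
  val s1: ((Unit, Q)) => B = leftImp[Unit, A => A, B, B](s2, s3)  // (Left ⇒)
  rightImp[Unit, Q, B](s1)                                        // (Right ⇒)
}

// The simplified code derived in the text: q => q(x => x).
def direct[A, B]: ((A => A) => B) => B = q => q(x => x)
```

Applying viaTransformers[Int, String](()) and direct[Int, String] to the same argument, such as k => k(3).toString, gives the same result, since the transformer-built function reduces to the direct one.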
The LJT algorithm

As we have seen, the LJ algorithm enters a loop if the rule
“(Left ⇒)” gives a sequent we already had at a previous step. That rule requires
us to prove two new sequents:
$$\frac{\Gamma, A \Rightarrow B \vdash A \qquad \Gamma, B \vdash C}{\Gamma, A \Rightarrow B \vdash C}\ (\text{Left}\Rightarrow) .$$
A sign of trouble is that the first of these sequents (Γ, 𝐴 ⇒ 𝐵 ⊢ 𝐴) does not have
a simpler form than the initial sequent (Γ, 𝐴 ⇒ 𝐵 ⊢ 𝐶). So, it is not clear that we
are getting closer to completing the proof. If 𝐴 = 𝐶, the new sequent will simply
repeat the initial sequent, immediately creating a loop.
In some cases, a repeated sequent will occur after more than one step. It is not
easy to formulate rigorous conditions for stopping the loop or for avoiding the
rule “(Left ⇒)”.
The LJT algorithm solves this problem by removing the rule “(Left ⇒)” from the
LJ algorithm. Instead, four new rules are introduced. Each of these rules contains
a different pattern instead of 𝐴 in the premise 𝐴 ⇒ 𝐶:
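$$\frac{\Gamma, A, B \vdash D}{\Gamma, A, A \Rightarrow B \vdash D}\ (\text{Left}\Rightarrow_A)\ \ (A \text{ is atomic}) \qquad \frac{\Gamma, A \Rightarrow B \Rightarrow C \vdash D}{\Gamma, (A \wedge B) \Rightarrow C \vdash D}\ (\text{Left}\Rightarrow_\wedge)$$

$$\frac{\Gamma, B \Rightarrow C \vdash A \Rightarrow B \qquad \Gamma, C \vdash D}{\Gamma, (A \Rightarrow B) \Rightarrow C \vdash D}\ (\text{Left}\Rightarrow_\Rightarrow) \qquad \frac{\Gamma, A \Rightarrow C, B \Rightarrow C \vdash D}{\Gamma, (A \vee B) \Rightarrow C \vdash D}\ (\text{Left}\Rightarrow_\vee)$$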


The rule “Left ⇒ 𝐴 ” applies only if the implication starts with an “atomic” type
expression, i.e., a single type parameter or a unit type. In all other cases, the
implication must start with a conjunction, a disjunction, or an implication, which
means that one of the three remaining rules will apply.
The LJT algorithm retains all the rules in Figure 5.4 except the rule “(Left ⇒)”,
which is replaced by the four new rules. It is far from obvious that the new rules
are equivalent to the old ones. It took mathematicians several decades to come up
with the LJT rules and to prove their validity. This book will rely on that result
and will not attempt to prove it.
The proof transformers for the new rules are shown in Figure 5.5. Figures 5.4–5.5
define the set of proof transformers sufficient for using the LJT algorithm in practice.
The curryhoward library [7] implements those proof transformers.
The most complicated of the new rules is the rule “(Left ⇒⇒ )”. It is far from
obvious why the rule Left ⇒⇒ is useful or even correct. This rule is based on a
non-trivial logic identity:

(( 𝐴 → 𝐵) → 𝐶) → ( 𝐴 → 𝐵) ⇐⇒ (𝐵 → 𝐶) → ( 𝐴 → 𝐵) .

Consider the type of a function that converts the premise (𝐴 → 𝐵) → 𝐶 of the
rule’s bottom sequent into the premise 𝐵 → 𝐶 of its first top sequent:

((𝐴 → 𝐵) → 𝐶) → 𝐵 → 𝐶 .

A function with that type can be written as:

$$f = k^{:(A \to B) \to C} \to b^{:B} \to k(\_^{:A} \to b) .$$

The function 𝑓 occurs in the proof transformer for the rule Left ⇒⇒ (shown below
in Figure 5.5). Note that this 𝑓 applies 𝑘 to a function (_ → 𝑏) that ignores its
argument. We expect to be able to simplify the resulting expression at the place
where (_ → 𝑏) is applied to some argument expression, which we can then ignore.
For this reason, applying the transformer for the rule Left ⇒⇒ results in evidence-
of-proof code that is longer than the code obtained via LJ’s rule transformers. The
code obtained via the LJT algorithm needs to be simplified symbolically.
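The function 𝑓 and the two directions of the logic identity can be written in Scala. This is a minimal sketch; the names toLeft and toRight are hypothetical, and the code only illustrates that both directions are implementable, not how the curryhoward library derives them.

```scala
// The function f from the text: it converts a value of type (A => B) => C
// into a value of type B => C by passing a constant function to k.
def f[A, B, C]: ((A => B) => C) => B => C = k => b => k(_ => b)

// One direction of the identity, written with the help of f:
// from (B => C) => (A => B) we obtain ((A => B) => C) => (A => B).
def toLeft[A, B, C](h: (B => C) => (A => B)): ((A => B) => C) => (A => B) =
  k => h(f[A, B, C](k))

// The opposite direction:
// from ((A => B) => C) => (A => B) we obtain (B => C) => (A => B).
def toRight[A, B, C](h: ((A => B) => C) => (A => B)): (B => C) => (A => B) =
  g => a => h(k => g(k(a)))(a)
```

The type arguments of f are written out explicitly in toLeft because f has no value parameters from which Scala could infer them.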
As an example of using the LJT algorithm, we again prove the sequent from the
previous section: 𝑆0 = ∅ ⊢ ((𝛼 ⇒ 𝛼) ⇒ 𝛽) ⇒ 𝛽. At each step, only one LJT rule
applies to each sequent. The initial part of the proof tree looks like this:

                                                                     ┌── 𝛽 ⊢ 𝛽 ──▶
── ∅ ⊢ ((𝛼⇒𝛼)⇒𝛽)⇒𝛽 ──▶ (Right ⇒) ── (𝛼⇒𝛼)⇒𝛽 ⊢ 𝛽 ──▶ (Left ⇒⇒) ─┤
                                                                     └── 𝛼⇒𝛽 ⊢ 𝛼⇒𝛼 ──▶

The proofs for the sequents 𝛽 ⊢ 𝛽 and 𝛼 ⇒ 𝛽 ⊢ 𝛼 ⇒ 𝛼 are the same as before:

$$\text{Proof}\,(\beta \vdash \beta)_{\text{given } y^{:B}} = y , \qquad \text{Proof}\,(\alpha \Rightarrow \beta \vdash \alpha \Rightarrow \alpha)_{\text{given } r^{:A \to B}} = x^{:A} \to x .$$


[7] See https://github.com/Chymyst/curryhoward


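$$(\text{Left}\Rightarrow_A) \quad \frac{\Gamma, A, B \vdash D}{\Gamma, A, A \Rightarrow B \vdash D} \qquad \text{Proof}\,(\Gamma, A, A \Rightarrow B \vdash D)_{\text{given } p^{:\Gamma},\, x^{:A},\, q^{:A \to B}} = \text{Proof}\,(\Gamma, A, B \vdash D)_{\text{given } p,\, x,\, q(x)}$$

$$(\text{Left}\Rightarrow_\wedge) \quad \frac{\Gamma, A \Rightarrow B \Rightarrow C \vdash D}{\Gamma, (A \wedge B) \Rightarrow C \vdash D} \qquad \text{Proof}\,(\Gamma, (A \wedge B) \Rightarrow C \vdash D)_{\text{given } p^{:\Gamma},\, q^{:A \times B \to C}} = \text{Proof}\,(\Gamma, A \Rightarrow B \Rightarrow C \vdash D)_{\text{given } p,\, (a^{:A} \to b^{:B} \to q(a \times b))}$$

$$(\text{Left}\Rightarrow_\vee) \quad \frac{\Gamma, A \Rightarrow C, B \Rightarrow C \vdash D}{\Gamma, (A \vee B) \Rightarrow C \vdash D} \qquad \text{Proof}\,(\Gamma, (A \vee B) \Rightarrow C \vdash D)_{\text{given } p^{:\Gamma},\, q^{:A + B \to C}} = \text{Proof}\,(\Gamma, A \Rightarrow C, B \Rightarrow C \vdash D)_{\text{given } p,\, r,\, s}$$
$$\text{where} \quad r \overset{\text{def}}{=} a^{:A} \to q(a + 0) \quad \text{and} \quad s \overset{\text{def}}{=} b^{:B} \to q(0 + b)$$

$$(\text{Left}\Rightarrow_\Rightarrow) \quad \frac{\Gamma, B \Rightarrow C \vdash A \Rightarrow B \qquad \Gamma, C \vdash D}{\Gamma, (A \Rightarrow B) \Rightarrow C \vdash D} \qquad \text{Proof}\,(\Gamma, (A \Rightarrow B) \Rightarrow C \vdash D)_{\text{given } p^{:\Gamma},\, q^{:(A \to B) \to C}} = \text{Proof}\,(\Gamma, C \vdash D)_{\text{given } p,\, c}$$
$$\text{where} \quad c^{:C} \overset{\text{def}}{=} q\big(\text{Proof}\,(\Gamma, B \Rightarrow C \vdash A \Rightarrow B)_{\text{given } p,\, r}\big) \quad \text{and} \quad r^{:B \to C} \overset{\text{def}}{=} b^{:B} \to q(\_^{:A} \to b)$$

Figure 5.5: Proof transformers for the four new rules of the LJT algorithm.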

Substituting these proofs into the proof transformer of the rule “(Left ⇒⇒ )” pro-
duces this code:

$$\text{Proof}\,((\alpha \Rightarrow \alpha) \Rightarrow \beta \vdash \beta)_{\text{given } q^{:(A \to A) \to B}} = q\big(\text{Proof}\,(\alpha \Rightarrow \beta \vdash \alpha \Rightarrow \alpha)_{\text{given } r^{:A \to B}}\big)$$
$$\text{where} \quad r^{:A \to B} = a^{:A} \to q(\_^{:A} \to a)$$
$$= q(x^{:A} \to x) .$$

The proof of 𝛼 ⇒ 𝛽 ⊢ 𝛼 ⇒ 𝛼 does not actually use the intermediate value 𝑟 of type 𝐴 → 𝐵
provided by the proof transformer. As a symbolic simplification step, we may
simply omit the code of 𝑟. The curryhoward library always performs symbolic
simplification after applying the LJT algorithm.
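The effect of this simplification can be seen by writing both forms in Scala. This is only an illustrative sketch; the names raw and simplified are hypothetical, and the code does not reproduce the library's actual intermediate output.

```scala
// Unsimplified form: the helper `r` mirrors the value supplied by the (Left ⇒⇒) transformer.
def raw[A, B]: ((A => A) => B) => B = { q =>
  val r: A => B = a => q(_ => a) // Defined but never used below.
  q(x => x)
}

// Simplified form: the unused `r` is omitted.
def simplified[A, B]: ((A => A) => B) => B = q => q(x => x)
```

Both functions return the same values for the same arguments, since r does not contribute to the result.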
The reason the LJT algorithm terminates is that each rule replaces a given
sequent by one or more sequents with simpler premises or goals [8]. This guarantees
that the proof search will terminate either with a complete proof or with a sequent
[8] The paper http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.35.2618 shows
that the LJT algorithm terminates by giving an explicit decreasing measure on sequents.


to which no more rules apply. An example of such a “dead-end” sequent is 𝛼 ⊢ 𝛽
where 𝛼 and 𝛽 are different, unrelated propositions. When no more rules apply,
the LJT algorithm concludes that the initial sequent cannot be proved.
To prove that there is no proof, one needs to use methods that are beyond the
scope of this book. An introduction to the required techniques is in the already
mentioned book “Proof and Disproof in Formal Logic” by R. Bornat.
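To make the proof search concrete, here is a small sketch of a provability checker built on the four new “Left ⇒” rules. It is not the curryhoward implementation and it only answers yes or no, without extracting code. Several choices are assumptions of this sketch: the data type Formula and the function provable are hypothetical names; a conjunction in the premises is replaced by both of its parts at once (a common variant of the two “(Left∧𝑖)” rules); and a premise of the form True ⇒ 𝐵 is treated as 𝐵.

```scala
// Propositional formulas: atoms, True, conjunction, disjunction, implication.
sealed trait Formula
final case class Atom(name: String) extends Formula
case object Top extends Formula
final case class And(a: Formula, b: Formula) extends Formula
final case class Or(a: Formula, b: Formula) extends Formula
final case class Imp(a: Formula, b: Formula) extends Formula

// Returns true if the sequent `premises ⊢ goal` is provable by the rules used in this sketch.
def provable(premises: List[Formula], goal: Formula): Boolean = {
  // Every premise paired with the list of the remaining premises.
  val picks: List[(Formula, List[Formula])] =
    premises.indices.toList.map(i => (premises(i), premises.take(i) ++ premises.drop(i + 1)))
  goal match {
    case Top       => true                                            // (True)
    case Imp(a, b) => provable(a :: premises, b)                      // (Right ⇒)
    case And(a, b) => provable(premises, a) && provable(premises, b)  // (Right ∧)
    case _ =>
      // Rules that never lose provability are applied first, without backtracking.
      val safeStep: Option[Boolean] = picks.collectFirst {
        case (Top, rest)       => provable(rest, goal)
        case (And(a, b), rest) => provable(a :: b :: rest, goal)
        case (Or(a, b), rest)  => provable(a :: rest, goal) && provable(b :: rest, goal)
        case (Imp(Top, b), rest) => provable(b :: rest, goal)
        case (Imp(x @ Atom(_), b), rest) if rest.contains(x) => provable(b :: rest, goal)
        case (Imp(And(a, b), c), rest) => provable(Imp(a, Imp(b, c)) :: rest, goal)
        case (Imp(Or(a, b), c), rest)  => provable(Imp(a, c) :: Imp(b, c) :: rest, goal)
      }
      safeStep.getOrElse {
        // The remaining rules may fail, so each alternative is tried in turn.
        val byId = premises.contains(goal)                            // (Id)
        def byRightOr = goal match {                                  // (Right ∨)
          case Or(a, b) => provable(premises, a) || provable(premises, b)
          case _        => false
        }
        def byLeftImpImp = picks.exists {                             // (Left ⇒⇒)
          case (Imp(Imp(a, b), c), rest) =>
            provable(Imp(b, c) :: rest, Imp(a, b)) && provable(c :: rest, goal)
          case _ => false
        }
        byId || byRightOr || byLeftImpImp
      }
  }
}

val (a, b) = (Atom("a"), Atom("b"))
val s0Provable     = provable(Nil, Imp(Imp(Imp(a, a), b), b)) // ((α ⇒ α) ⇒ β) ⇒ β
val peirceProvable = provable(Nil, Imp(Imp(Imp(a, b), a), a)) // Peirce’s law (5.7)
```

With these definitions, the sequent 𝑆0 is reported as provable and Peirce’s law as unprovable, in agreement with the text.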

5.2.6 Failure of Boolean logic in reasoning about CH-propositions

Programmers are familiar with the Boolean logic whose operations are written
in Scala as x && y (conjunction), x || y (disjunction), and !x (negation). However,
it turns out that the Boolean logic does not always produce correct conclusions
when reasoning about CH -propositions and implementable type signatures. For
correct reasoning about those questions, one needs to use the constructive logic.
Let us nevertheless briefly look at how Boolean logic would handle that rea-
soning. In the Boolean logic, each proposition (𝛼, 𝛽, ...) is either 𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒.
The operations are 𝛼 ∧ 𝛽 (conjunction), 𝛼 ∨ 𝛽 (disjunction), and ¬𝛼 (negation). The
implication (⇒) is defined through other operations by:

$$(\alpha \Rightarrow \beta) \overset{\text{def}}{=} ((\neg\alpha) \vee \beta) . \tag{5.8}$$

To verify whether a formula is true in the Boolean logic, we can substitute either
𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒 into every variable and check if the formula has the value 𝑇𝑟𝑢𝑒 in
all possible cases. The result can be arranged into a truth table. The formula is
true if all values in its truth table are 𝑇𝑟𝑢𝑒.
Disjunction, conjunction, negation, and implication operations are described by
this truth table:

𝛼     | 𝛽     | 𝛼 ∨ 𝛽 | 𝛼 ∧ 𝛽 | ¬𝛼    | 𝛼 ⇒ 𝛽
𝑇𝑟𝑢𝑒  | 𝑇𝑟𝑢𝑒  | 𝑇𝑟𝑢𝑒  | 𝑇𝑟𝑢𝑒  | 𝐹𝑎𝑙𝑠𝑒 | 𝑇𝑟𝑢𝑒
𝑇𝑟𝑢𝑒  | 𝐹𝑎𝑙𝑠𝑒 | 𝑇𝑟𝑢𝑒  | 𝐹𝑎𝑙𝑠𝑒 | 𝐹𝑎𝑙𝑠𝑒 | 𝐹𝑎𝑙𝑠𝑒
𝐹𝑎𝑙𝑠𝑒 | 𝑇𝑟𝑢𝑒  | 𝑇𝑟𝑢𝑒  | 𝐹𝑎𝑙𝑠𝑒 | 𝑇𝑟𝑢𝑒  | 𝑇𝑟𝑢𝑒
𝐹𝑎𝑙𝑠𝑒 | 𝐹𝑎𝑙𝑠𝑒 | 𝐹𝑎𝑙𝑠𝑒 | 𝐹𝑎𝑙𝑠𝑒 | 𝑇𝑟𝑢𝑒  | 𝑇𝑟𝑢𝑒

Using this table, we find that the formula 𝛼 ⇒ 𝛼 has the value 𝑇𝑟𝑢𝑒 in all cases,
whether 𝛼 itself is 𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒. This check is sufficient to show that ∀𝛼. 𝛼 ⇒ 𝛼
is true in Boolean logic.
Here is the truth table for the formulas ∀(𝛼, 𝛽). (𝛼 ∧ 𝛽) ⇒ 𝛼 and ∀(𝛼, 𝛽). 𝛼 ⇒
(𝛼 ∧ 𝛽). The first formula is true since all values in its column are 𝑇𝑟𝑢𝑒, while the
second formula is not true since one value in the last column is 𝐹𝑎𝑙𝑠𝑒:

𝛼     | 𝛽     | 𝛼 ∧ 𝛽 | (𝛼 ∧ 𝛽) ⇒ 𝛼 | 𝛼 ⇒ (𝛼 ∧ 𝛽)
𝑇𝑟𝑢𝑒  | 𝑇𝑟𝑢𝑒  | 𝑇𝑟𝑢𝑒  | 𝑇𝑟𝑢𝑒        | 𝑇𝑟𝑢𝑒
𝑇𝑟𝑢𝑒  | 𝐹𝑎𝑙𝑠𝑒 | 𝐹𝑎𝑙𝑠𝑒 | 𝑇𝑟𝑢𝑒        | 𝐹𝑎𝑙𝑠𝑒
𝐹𝑎𝑙𝑠𝑒 | 𝑇𝑟𝑢𝑒  | 𝐹𝑎𝑙𝑠𝑒 | 𝑇𝑟𝑢𝑒        | 𝑇𝑟𝑢𝑒
𝐹𝑎𝑙𝑠𝑒 | 𝐹𝑎𝑙𝑠𝑒 | 𝐹𝑎𝑙𝑠𝑒 | 𝑇𝑟𝑢𝑒        | 𝑇𝑟𝑢𝑒
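The truth-table check can also be carried out by a short program. The following sketch enumerates all rows for formulas with two variables; the names implies, isTautology2, and the three result values are hypothetical and chosen only for illustration.

```scala
// Boolean implication, following Eq. (5.8): (α ⇒ β) = (¬α) ∨ β.
def implies(a: Boolean, b: Boolean): Boolean = !a || b

val bools = List(true, false)

// A formula with two variables is true in Boolean logic
// if it evaluates to true in all 4 rows of its truth table.
def isTautology2(formula: (Boolean, Boolean) => Boolean): Boolean =
  bools.forall(a => bools.forall(b => formula(a, b)))

val firstFormula  = isTautology2((a, b) => implies(a && b, a)) // (α ∧ β) ⇒ α
val secondFormula = isTautology2((a, b) => implies(a, a && b)) // α ⇒ (α ∧ β)
val peirce        = isTautology2((a, b) => implies(implies(implies(a, b), a), a)) // Eq. (5.7)
```

The first formula and Peirce’s law pass this check, while the second formula fails at the row where 𝛼 is 𝑇𝑟𝑢𝑒 and 𝛽 is 𝐹𝑎𝑙𝑠𝑒.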

Table 5.3 shows more examples of logical formulas that are true in Boolean logic.
Each formula is first written in terms of CH-propositions (we denote 𝛼 ≝ CH(𝐴)
and 𝛽 ≝ CH(𝐵) for brevity) and then as a Scala type signature of a function. So,
all these type signatures can be implemented.

Logic formula          | Type formula             | Scala code
∀𝛼. 𝛼 ⇒ 𝛼              | ∀𝐴. 𝐴 → 𝐴                | def id[A](x: A): A = x
∀𝛼. 𝛼 ⇒ 𝑇𝑟𝑢𝑒           | ∀𝐴. 𝐴 → 1                | def toUnit[A](x: A): Unit = ()
∀(𝛼, 𝛽). 𝛼 ⇒ (𝛼 ∨ 𝛽)   | ∀(𝐴, 𝐵). 𝐴 → 𝐴 + 𝐵       | def f[A, B](x: A): Either[A, B] = Left(x)
∀(𝛼, 𝛽). (𝛼 ∧ 𝛽) ⇒ 𝛼   | ∀(𝐴, 𝐵). 𝐴 × 𝐵 → 𝐴       | def f[A, B](p: (A, B)): A = p._1
∀(𝛼, 𝛽). 𝛼 ⇒ (𝛽 ⇒ 𝛼)   | ∀(𝐴, 𝐵). 𝐴 → (𝐵 → 𝐴)     | def f[A, B](x: A): B => A = (_ => x)

Table 5.3: Examples of logical formulas that are true theorems in Boolean logic.

Table 5.4 shows some examples of formulas that are not true in Boolean logic.
Translated into type formulas and then into Scala, these formulas yield type sig-
natures that cannot be implemented by fully parametric functions.

Logic formula          | Type formula             | Scala type signature
∀𝛼. 𝑇𝑟𝑢𝑒 ⇒ 𝛼           | ∀𝐴. 1 → 𝐴                | def f[A](x: Unit): A
∀(𝛼, 𝛽). (𝛼 ∨ 𝛽) ⇒ 𝛼   | ∀(𝐴, 𝐵). 𝐴 + 𝐵 → 𝐴       | def f[A, B](x: Either[A, B]): A
∀(𝛼, 𝛽). 𝛼 ⇒ (𝛼 ∧ 𝛽)   | ∀(𝐴, 𝐵). 𝐴 → 𝐴 × 𝐵       | def f[A, B](p: A): (A, B)
∀(𝛼, 𝛽). (𝛼 ⇒ 𝛽) ⇒ 𝛼   | ∀(𝐴, 𝐵). (𝐴 → 𝐵) → 𝐴     | def f[A, B](x: A => B): A

Table 5.4: Examples of logical formulas that are not true in Boolean logic.

At first sight, it may appear from these examples that whenever a formula is
true in Boolean logic, the corresponding type signature can be implemented in
code, and vice versa. However, this is incorrect: the rules of Boolean logic are not
fully suitable for reasoning about types in a functional language. False Boolean
formulas do correspond to unimplementable type signatures. But not all true
Boolean formulas correspond to implementable function types. An example is
the function bad3 shown in Section 5.1.1:
def bad3[A, B, C](g: A => Either[B, C]): Either[A => B, A => C] = ???
In the type notation, this type signature is written as:

∀(𝐴, 𝐵, 𝐶). (𝐴 → 𝐵 + 𝐶) → (𝐴 → 𝐵) + (𝐴 → 𝐶) . (5.9)
The function bad3 cannot be implemented via fully parametric code, as we al-
ready discussed in Section 5.1.1. Now, the type signature (5.9) gives this CH -
proposition:

∀(𝛼, 𝛽, 𝛾). (𝛼 ⇒ (𝛽 ∨ 𝛾)) ⇒ ((𝛼 ⇒ 𝛽) ∨ (𝛼 ⇒ 𝛾)) , (5.10)


where we denoted 𝛼 ≝ CH(𝐴), 𝛽 ≝ CH(𝐵), and 𝛾 ≝ CH(𝐶).

It turns out that this formula is true in Boolean logic. To prove this, we need to
show that Eq. (5.10) is equal to 𝑇𝑟𝑢𝑒 for any Boolean values of the variables 𝛼, 𝛽,
𝛾. One way is to rewrite the expression (5.10) using the rules of Boolean logic:

𝛼 ⇒ (𝛽 ∨ 𝛾)
definition of ⇒ via Eq. (5.8) : = (¬𝛼) ∨ 𝛽 ∨ 𝛾 ,
(𝛼 ⇒ 𝛽) ∨ (𝛼 ⇒ 𝛾)
definition of ⇒ via Eq. (5.8) : = (¬𝛼) ∨ 𝛽 ∨ (¬𝛼) ∨ 𝛾
property 𝑥 ∨ 𝑥 = 𝑥 in Boolean logic : = (¬𝛼) ∨ 𝛽 ∨ 𝛾 .

So, 𝛼 ⇒ (𝛽 ∨ 𝛾) is in fact equal to (𝛼 ⇒ 𝛽) ∨ (𝛼 ⇒ 𝛾) in Boolean logic. Since an
implication between two equal values is always 𝑇𝑟𝑢𝑒, Eq. (5.10) is true in Boolean logic.


Let us also give a proof by truth-value reasoning. The only possibility for an
implication 𝑋 ⇒ 𝑌 to be 𝐹𝑎𝑙𝑠𝑒 is when 𝑋 = 𝑇𝑟𝑢𝑒 and 𝑌 = 𝐹𝑎𝑙𝑠𝑒. So, Eq. (5.10)
can be 𝐹𝑎𝑙𝑠𝑒 only if (𝛼 ⇒ (𝛽 ∨ 𝛾)) = 𝑇𝑟𝑢𝑒 and (𝛼 ⇒ 𝛽) ∨ (𝛼 ⇒ 𝛾) = 𝐹𝑎𝑙𝑠𝑒. A
disjunction can be false only when both parts are false. So, we must have both
(𝛼 ⇒ 𝛽) = 𝐹𝑎𝑙𝑠𝑒 and (𝛼 ⇒ 𝛾) = 𝐹𝑎𝑙𝑠𝑒. This is only possible if 𝛼 = 𝑇𝑟𝑢𝑒 and
𝛽 = 𝛾 = 𝐹𝑎𝑙𝑠𝑒. But, with these value assignments, we find (𝛼 ⇒ (𝛽 ∨ 𝛾)) = 𝐹𝑎𝑙𝑠𝑒
rather than 𝑇𝑟𝑢𝑒 as we assumed. It follows that we cannot ever make Eq. (5.10)
equal to 𝐹𝑎𝑙𝑠𝑒. So, Eq. (5.10) is true in Boolean logic.
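Both arguments can be cross-checked by enumerating all eight assignments of truth values. The sketch below is only a verification aid; the names imp, lhs, rhs, sidesAgree, and eq510Holds are hypothetical.

```scala
// Boolean implication as defined by Eq. (5.8).
def imp(x: Boolean, y: Boolean): Boolean = !x || y

val values = List(true, false)

// All 8 assignments of truth values to (α, β, γ).
val rows: List[(Boolean, Boolean, Boolean)] =
  for { a <- values; b <- values; c <- values } yield (a, b, c)

def lhs(a: Boolean, b: Boolean, c: Boolean): Boolean = imp(a, b || c)         // α ⇒ (β ∨ γ)
def rhs(a: Boolean, b: Boolean, c: Boolean): Boolean = imp(a, b) || imp(a, c) // (α ⇒ β) ∨ (α ⇒ γ)

// The two sides have equal values in every row.
val sidesAgree: Boolean = rows.forall { case (a, b, c) => lhs(a, b, c) == rhs(a, b, c) }

// Eq. (5.10) evaluates to true in every row.
val eq510Holds: Boolean = rows.forall { case (a, b, c) => imp(lhs(a, b, c), rhs(a, b, c)) }
```

This check concerns Boolean logic only; it says nothing about whether the type signature (5.9) can be implemented.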

5.3 Equivalence of types


We found a correspondence between types, code, logical propositions, and proofs,
which is known as the Curry-Howard correspondence. An example of the CH
correspondence is that a proof of the logical proposition:

∀(𝛼, 𝛽). 𝛼 ⇒ (𝛽 ⇒ 𝛼) (5.11)

corresponds to the code of the following function:



def f[A, B]: A => (B => A) = { x => _ => x }

With the CH correspondence in mind, we may say that the existence of the code
x => _ => x with the type 𝐴 → (𝐵 → 𝐴) “is” a proof of the logical formula (5.11),
because it shows how to compute a value of type ∀( 𝐴, 𝐵). 𝐴 → 𝐵 → 𝐴.
The Curry-Howard correspondence maps logic formulas such as (𝛼 ∨ 𝛽) ∧ 𝛾 into
type expressions such as ( 𝐴 + 𝐵) × 𝐶. We have seen that types behave similarly to
logic formulas in one respect: A logic formula is a true theorem of constructive
logic when the corresponding type signature can be implemented as a fully para-
metric function, and vice versa.
It turns out that the similarity ends here. In other respects, type expressions
behave as arithmetic expressions and not as logic formulas. For this reason, the
type notation used in this book denotes disjunctive types by 𝐴 + 𝐵 and tuples by
𝐴 × 𝐵, which is designed to remind us of arithmetic expressions (such as 1 + 2 and
2 × 3) rather than of logical formulas (such as 𝐴 ∨ 𝐵 and 𝐴 ∧ 𝐵).
An important use of the type notation is for writing equations with types. Can
we use the arithmetic intuition for writing type equations such as:

( 𝐴 + 𝐵) × 𝐶 = 𝐴 × 𝐶 + 𝐵 × 𝐶 ? (5.12)

In this section, we will learn how to check whether one type expression is equiv-
alent to another.

5.3.1 Logical identity does not correspond to type equivalence


The CH correspondence maps Eq. (5.12) into the logic formula:

∀( 𝐴, 𝐵, 𝐶). ( 𝐴 ∨ 𝐵) ∧ 𝐶 = ( 𝐴 ∧ 𝐶) ∨ (𝐵 ∧ 𝐶) . (5.13)

This formula is a well-known distributive law [9] valid in Boolean logic as well as in
the constructive logic. Since a logical equation 𝑃 = 𝑄 means 𝑃 ⇒ 𝑄 and 𝑄 ⇒ 𝑃,
the distributive law (5.13) means that the following two formulas hold:

∀( 𝐴, 𝐵, 𝐶). ( 𝐴 ∨ 𝐵) ∧ 𝐶 ⇒ ( 𝐴 ∧ 𝐶) ∨ (𝐵 ∧ 𝐶) , (5.14)
∀( 𝐴, 𝐵, 𝐶). ( 𝐴 ∧ 𝐶) ∨ (𝐵 ∧ 𝐶) ⇒ ( 𝐴 ∨ 𝐵) ∧ 𝐶 . (5.15)

The CH correspondence maps these logical formulas to fully parametric functions
with types:
def f1[A, B, C]: ((Either[A, B], C)) => Either[(A, C), (B, C)] = ???
def f2[A, B, C]: Either[(A, C), (B, C)] => (Either[A, B], C) = ???

[9] See https://en.wikipedia.org/wiki/Distributive_property#Rule_of_replacement


In the type notation, these type signatures are written as:

$$f_1^{A,B,C} : (A + B) \times C \to A \times C + B \times C ,$$
$$f_2^{A,B,C} : A \times C + B \times C \to (A + B) \times C .$$

Since the two logical formulas (5.14)–(5.15) are true theorems in constructive logic,
we expect to be able to implement the functions f1 and f2. We could use the
proof rules of the LJT algorithm to obtain proofs of Eqs. (5.14)–(5.15) and to derive
implementations of f1 and f2. Instead, let us exercise our intuition and write the
Scala code directly.
To implement f1, we need to perform pattern matching on the argument:
def f1[A, B, C]: ((Either[A, B], C)) => Either[(A, C), (B, C)] = {
case (Left(a), c) => Left((a, c)) // No other choice here.
case (Right(b), c) => Right((b, c)) // No other choice here.
}
In both cases, we have only one possible expression of the correct type.
Similarly, the implementation of f2 leaves us no choices:
def f2[A, B, C]: Either[(A, C), (B, C)] => (Either[A, B], C) = {
case Left((a, c)) => (Left(a), c) // No other choice here.
case Right((b, c)) => (Right(b), c) // No other choice here.
}
The code of f1 and f2 never discards any given values; in other words, these
functions appear to preserve information. We can formulate this property rig-
orously as a requirement that an arbitrary value x: (Either[A, B], C) be mapped
by f1 to some value y: Either[(A, C), (B, C)] and then mapped by f2 back to the
same value x. Similarly, any value y of type Either[(A, C), (B, C)] should be trans-
formed by f2 and then by f1 back to the same value y.
Let us write those conditions as equations:

$$\forall x^{:(A+B) \times C}.\ f_2(f_1(x)) = x , \qquad \forall y^{:A \times C + B \times C}.\ f_1(f_2(y)) = y .$$

If these equations hold, it means that all the information in a value 𝑥 of type
(𝐴 + 𝐵) × 𝐶 is completely preserved inside the value 𝑦 ≝ 𝑓1(𝑥); the original value
𝑥 can be recovered as 𝑥 = 𝑓2(𝑦). Then the function 𝑓1 is the inverse of 𝑓2. Conversely,
all the information in a value 𝑦 of type 𝐴 × 𝐶 + 𝐵 × 𝐶 is preserved inside 𝑥 ≝ 𝑓2(𝑦)
and can be recovered by applying 𝑓1. Since the values 𝑥 and 𝑦 are arbitrary,
it will follow that the data types themselves, (𝐴 + 𝐵) × 𝐶 and 𝐴 × 𝐶 + 𝐵 × 𝐶, carry
equivalent information. Such types are called equivalent or isomorphic.
Generally, we say that types 𝑃 and 𝑄 are equivalent or isomorphic (denoted
𝑃 ≅ 𝑄) when there exist functions 𝑓1 : 𝑃 → 𝑄 and 𝑓2 : 𝑄 → 𝑃 that are inverses of
each other. We can write that using the notation (𝑓1 # 𝑓2)(𝑥) ≝ 𝑓2(𝑓1(𝑥)) as:

𝑓1 # 𝑓2 = id , 𝑓2 # 𝑓1 = id .

(In Scala, the forward composition 𝑓1 # 𝑓2 is the function f1 andThen f2. We omit
type annotations since we already checked that the types match.) If these condi-
tions hold, there is a one-to-one correspondence between values of types 𝑃 and 𝑄.
In other words, the data types 𝑃 and 𝑄 carry equivalent information.
To verify that the Scala functions f1 and f2 defined above are inverses of each
other, we first check if 𝑓1 # 𝑓2 = id. Applying 𝑓1 # 𝑓2 means to apply 𝑓1 and then
to apply 𝑓2 to the result. Begin by applying 𝑓1 to an arbitrary value 𝑥 :( 𝐴+𝐵)×𝐶 . A
value 𝑥 of that type can be in only one of the two disjoint cases: a tuple (Left(a),
c) or a tuple (Right(b), c), for some values a:A, b:B, and c:C. The Scala code of f1
maps these tuples to Left((a, c)) and to Right((b, c)) respectively; we can see this
directly from the code of f1. We then apply 𝑓2 to those values, which maps them
back to a tuple (Left(a), c) or to a tuple (Right(b), c) respectively, according to
the code of f2. These tuples are exactly the value 𝑥 we started with. So, applying
𝑓1 # 𝑓2 to an arbitrary 𝑥 of type (𝐴 + 𝐵) × 𝐶 returns that value 𝑥. This is the same
as saying that 𝑓1 # 𝑓2 = id.
To check whether 𝑓2 # 𝑓1 = id, we apply 𝑓2 to an arbitrary value 𝑦 :𝐴×𝐶+𝐵×𝐶 , which
must be one of the two disjoint cases, Left((a, c)) or Right((b, c)). The code of
f2 maps these two cases into tuples (Left(a), c) and (Right(b), c) respectively.
Then we apply f1 and map these tuples back to Left((a, c)) and Right((b, c))
respectively. It follows that applying 𝑓2 and then 𝑓1 will always return the initial
value. As a formula, this is written as 𝑓2 # 𝑓1 = id.
By looking at the code of f1 and f2, we can directly observe that these functions
are inverses of each other: the tuple pattern (Left(a), c) is mapped to Left((a,
c)), and the pattern (Right(b), c) to Right((b, c)), or vice versa. It is visually clear
that no information is lost and that the original values are returned by function
compositions 𝑓1 # 𝑓2 or 𝑓2 # 𝑓1 .
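
The argument above can also be spot-checked by running the code. The following is a minimal sketch, assuming that f1 and f2 are written exactly as described in the text (the tuple pattern (Left(a), c) goes to Left((a, c)), and so on). The object name, the sample values, and the choice A = Int, B = String, C = Boolean are illustrative assumptions. A check on sample values does not replace the proof; it only confirms the reasoning on concrete data.

```scala
object DistributiveIsoCheck {
  // Restated from the description in the text: f1 moves `c` inside the disjunction.
  def f1[A, B, C]: ((Either[A, B], C)) => Either[(A, C), (B, C)] = {
    case (Left(a), c)  => Left((a, c))
    case (Right(b), c) => Right((b, c))
  }
  // f2 moves `c` back out of the disjunction.
  def f2[A, B, C]: Either[(A, C), (B, C)] => (Either[A, B], C) = {
    case Left((a, c))  => (Left(a), c)
    case Right((b, c)) => (Right(b), c)
  }

  // Sample values (an arbitrary choice, for illustration only).
  val xs: List[(Either[Int, String], Boolean)] =
    List((Left(1), true), (Right("b"), false))
  val ys: List[Either[(Int, Boolean), (String, Boolean)]] =
    List(Left((1, true)), Right(("b", false)))

  val forward  = f1[Int, String, Boolean] andThen f2[Int, String, Boolean]
  val backward = f2[Int, String, Boolean] andThen f1[Int, String, Boolean]

  // Both compositions return their argument unchanged on all samples.
  val forwardIsIdentity: Boolean  = xs.forall(x => forward(x) == x)
  val backwardIsIdentity: Boolean = ys.forall(y => backward(y) == y)
}
```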
We find that the logical identity (5.13) leads to an equivalence of the correspond-
ing types:
(𝐴 + 𝐵) × 𝐶 ≅ 𝐴 × 𝐶 + 𝐵 × 𝐶 . (5.16)
To get Eq. (5.16) from Eq. (5.13), we need to convert a logical formula to an arith-
metic expression by replacing the disjunction operations ∨ by + and the conjunc-
tions ∧ by × everywhere.
As another example of a logical identity, consider the associativity law for con-
junction:
(𝛼 ∧ 𝛽) ∧ 𝛾 = 𝛼 ∧ (𝛽 ∧ 𝛾) . (5.17)
The corresponding types are ( 𝐴 × 𝐵) × 𝐶 and 𝐴 × (𝐵 × 𝐶); in Scala, ((A, B), C) and
(A, (B, C)). We can define functions that convert between these types without
information loss:
def f3[A, B, C]: (((A, B), C)) => (A, (B, C)) = { case ((a, b), c) => (a, (b, c)) }
def f4[A, B, C]: ((A, (B, C))) => ((A, B), C) = { case (a, (b, c)) => ((a, b), c) }


By applying these functions to arbitrary values of types ((A, B), C) and (A, (B,
C)), it is easy to see that the functions f3 and f4 are inverses of each other. This
is also directly visible in the code: the nested tuple pattern ((a, b), c) is mapped
to the pattern (a, (b, c)) and back. So, the types ( 𝐴 × 𝐵) × 𝐶 and 𝐴 × (𝐵 × 𝐶) are
equivalent. We will often write 𝐴 × 𝐵 × 𝐶 without parentheses.
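
As a quick sanity check, the two conversions can be applied to sample values. This is a sketch under assumptions: the object name and the sample data are illustrative. Note that a Scala function taking a single tuple argument needs an extra pair of parentheses around the argument type; without them, `(A, (B, C)) => ...` would denote a function of two arguments.

```scala
object AssociativityIsoCheck {
  // One-argument functions whose argument is a nested tuple.
  def f3[A, B, C]: (((A, B), C)) => (A, (B, C)) = { case ((a, b), c) => (a, (b, c)) }
  def f4[A, B, C]: ((A, (B, C))) => ((A, B), C) = { case (a, (b, c)) => ((a, b), c) }

  // Sample values with A = Int, B = String, C = Boolean (illustrative choice).
  val x: ((Int, String), Boolean) = ((1, "b"), true)
  val y: (Int, (String, Boolean)) = (1, ("b", true))

  // Converting there and back returns the initial value.
  val thereAndBack: ((Int, String), Boolean) =
    f4[Int, String, Boolean](f3[Int, String, Boolean](x))
  val backAndThere: (Int, (String, Boolean)) =
    f3[Int, String, Boolean](f4[Int, String, Boolean](y))
}
```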
Does a logical identity always correspond to an equivalence of types? This turns
out to be not so. A simple example of a logical identity that does not correspond
to a type equivalence is:
𝑇𝑟𝑢𝑒 ∨ 𝛼 = 𝑇𝑟𝑢𝑒 . (5.18)
Since the CH correspondence maps the logical constant 𝑇𝑟𝑢𝑒 into the unit type 1,
the type equivalence corresponding to Eq. (5.18) is 1 + 𝐴 ≅ 1. The type denoted by
1 + 𝐴 means Option[A] in Scala, so the corresponding equivalence is Option[A] ≅ Unit.
Intuitively, this type equivalence should not hold: an Option[A] may carry a value
of type A, which cannot possibly be stored in a value of type Unit. We can verify
this intuition rigorously by proving that any fully parametric functions with type
signatures 𝑔1 : 1 + 𝐴 → 1 and 𝑔2 : 1 → 1 + 𝐴 will not satisfy 𝑔1 # 𝑔2 = id. To verify
this, we note that 𝑔2 : 1 → 1 + 𝐴 must have this type signature:
def g2[A]: Unit => Option[A] = ???

This function must always return None, since a fully parametric function cannot
produce values of an arbitrary type A from scratch. Therefore, 𝑔1 # 𝑔2 is also a func-
tion that always returns None. The function 𝑔1 # 𝑔2 has type signature 1 + 𝐴 → 1 + 𝐴
or, in Scala syntax, Option[A] => Option[A], and is not equal to the identity function,
because the identity function does not always return None.
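
The loss of information can be observed directly on a sample value. In the sketch below, g1 and g2 are the implementations forced by the type signatures as discussed above; the object name and the sample value Some(123) are assumptions made for illustration.

```scala
object OptionIsNotUnit {
  // A fully parametric g1 can only discard its argument and return ().
  def g1[A]: Option[A] => Unit = { _ => () }
  // A fully parametric g2 cannot produce a value of type A, so it returns None.
  def g2[A]: Unit => Option[A] = { _ => None }

  val original: Option[Int]  = Some(123)
  // Applying g1 and then g2 loses the value 123.
  val roundTrip: Option[Int] = (g1[Int] andThen g2[Int])(original)
}
```

Here `roundTrip` evaluates to None, which differs from `original`, so the composition of g1 and g2 is not the identity function.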
Another example of a logical identity that does not correspond to a type equiv-
alence is the distributive law:

∀( 𝐴, 𝐵, 𝐶). ( 𝐴 ∧ 𝐵) ∨ 𝐶 = ( 𝐴 ∨ 𝐶) ∧ (𝐵 ∨ 𝐶) , (5.19)

which is “dual” to the law (5.13), i.e., it is obtained from Eq. (5.13) by swapping all
conjunctions (∧) with disjunctions (∨). In logic, a dual formula to an identity is
also an identity. The CH correspondence maps Eq. (5.19) into the type equation:

∀(𝐴, 𝐵, 𝐶). (𝐴 × 𝐵) + 𝐶 ≟ (𝐴 + 𝐶) × (𝐵 + 𝐶) . (5.20)

However, the types 𝐴 × 𝐵 + 𝐶 and (𝐴 + 𝐶) × (𝐵 + 𝐶) are not equivalent. To see why,
look at the possible code of a function 𝑔3 : (𝐴 + 𝐶) × (𝐵 + 𝐶) → 𝐴 × 𝐵 + 𝐶:
1 def g3[A, B, C]: ((Either[A, C], Either[B, C])) => Either[(A, B), C] = {
2 case (Left(a), Left(b)) => Left((a, b)) // No other choice.
3 case (Left(a), Right(c)) => Right(c) // No other choice.
4 case (Right(c), Left(b)) => Right(c) // No other choice.
5 case (Right(c1), Right(c2)) => Right(c1) // Must discard c1 or c2 here!
6 } // May return Right(c2) instead of Right(c1) in the last line.


In line 5, we have a choice of returning Right(c1) or Right(c2). Whichever we
choose, we will lose information because we will have discarded one of the given
values c1, c2. After evaluating 𝑔3 , we will not be able to restore both c1 and c2, no
matter what code we write. So, the composition 𝑔3 # 𝑔4 with any 𝑔4 cannot be equal
to the identity function. The type equation (5.20) is incorrect.
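
The information loss can be made concrete by exhibiting two different arguments that 𝑔3 maps to the same result. The following sketch restates 𝑔3 with the choice Right(c1) in the last case; the object name, the type alias, and the sample values are assumptions made for illustration.

```scala
object DualDistributiveCheck {
  def g3[A, B, C]: ((Either[A, C], Either[B, C])) => Either[(A, B), C] = {
    case (Left(a), Left(b))    => Left((a, b))
    case (Left(_), Right(c))   => Right(c)
    case (Right(c), Left(_))   => Right(c)
    case (Right(c1), Right(_)) => Right(c1) // The second value of type C is discarded.
  }

  // Sample types A = Int, B = Boolean, C = String (illustrative choice).
  type In = (Either[Int, String], Either[Boolean, String])
  val x1: In = (Right("c"), Right("d1"))
  val x2: In = (Right("c"), Right("d2"))

  val distinctInputs: Boolean = x1 != x2
  // Both inputs are mapped to Right("c"), so no function g4 can recover x1 and x2.
  val sameOutput: Boolean =
    g3[Int, Boolean, String](x1) == g3[Int, Boolean, String](x2)
}
```

Since two distinct inputs give equal outputs, 𝑔3 has no inverse, whichever 𝑔4 we try.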
We find that a logical identity CH (𝑃) = CH (𝑄) guarantees, via the CH corre-
spondence, that we can implement some fully parametric functions of types 𝑃 → 𝑄
and 𝑄 → 𝑃. However, it is not guaranteed that these functions are inverses of each
other, i.e., that the type conversions 𝑃 → 𝑄 or 𝑄 → 𝑃 have no information loss.
So, the type equivalence 𝑃 ≅ 𝑄 does not automatically follow from the logical
identity CH (𝑃) = CH (𝑄).
The CH correspondence means that for true propositions CH (𝑋) we can com-
pute some value 𝑥 of type 𝑋. However, the CH correspondence does not guarantee
that the computed value 𝑥 :𝑋 will satisfy any additional properties or laws.

5.3.2 Arithmetic identity corresponds to type equivalence


Looking at the examples of equivalent types, we notice that correct type equiv-
alences correspond to arithmetical identities rather than logical identities. For in-
stance, the logical identity in Eq. (5.13) leads to the type equivalence (5.16), which
looks like a standard identity of arithmetic, such as:

(1 + 10) × 20 = 1 × 20 + 10 × 20 .

The logical identity in Eq. (5.19), which does not yield a type equivalence, leads
to an incorrect arithmetic equation (5.20), e.g., (1 × 10) + 20 ≠ (1 + 20) × (10 + 20).
Similarly, the associativity law (5.17) leads to a type equivalence and to the arith-
metic identity:
(𝑎 × 𝑏) × 𝑐 = 𝑎 × (𝑏 × 𝑐) .
The logical identity in Eq. (5.18), which does not yield a type equivalence, leads
to an incorrect arithmetic statement (“1 + 𝑎 = 1 for all 𝑎”).
Table 5.5 summarizes these and other examples of logical identities and the
corresponding type equivalences. In all rows, quantifiers such as ∀𝛼 or ∀( 𝐴, 𝐵)
are implied as necessary.
Because the type notation is similar to the ordinary arithmetic notation, it is easy
to translate a possible type equivalence into an arithmetic equation. In all cases,
valid arithmetic identities correspond to type equivalences, and failures to obtain
a type equivalence correspond to incorrect arithmetic identities. With regard to
type equivalence, types such as 𝐴 + 𝐵 and 𝐴 × 𝐵 behave similarly to arithmetic
expressions such as 10 + 20 and 10 × 20 and not similarly to logical formulas such
as 𝛼 ∨ 𝛽 and 𝛼 ∧ 𝛽.
Logical identity                           Type equivalence (if it holds)

𝑇𝑟𝑢𝑒 ∨ 𝛼 = 𝑇𝑟𝑢𝑒                            1 + 𝐴 ≇ 1
𝑇𝑟𝑢𝑒 ∧ 𝛼 = 𝛼                               1 × 𝐴 ≅ 𝐴
𝐹𝑎𝑙𝑠𝑒 ∨ 𝛼 = 𝛼                              0 + 𝐴 ≅ 𝐴
𝐹𝑎𝑙𝑠𝑒 ∧ 𝛼 = 𝐹𝑎𝑙𝑠𝑒                          0 × 𝐴 ≅ 0
𝛼 ∨ 𝛽 = 𝛽 ∨ 𝛼                              𝐴 + 𝐵 ≅ 𝐵 + 𝐴
𝛼 ∧ 𝛽 = 𝛽 ∧ 𝛼                              𝐴 × 𝐵 ≅ 𝐵 × 𝐴
(𝛼 ∨ 𝛽) ∨ 𝛾 = 𝛼 ∨ (𝛽 ∨ 𝛾)                  (𝐴 + 𝐵) + 𝐶 ≅ 𝐴 + (𝐵 + 𝐶)
(𝛼 ∧ 𝛽) ∧ 𝛾 = 𝛼 ∧ (𝛽 ∧ 𝛾)                  (𝐴 × 𝐵) × 𝐶 ≅ 𝐴 × (𝐵 × 𝐶)
(𝛼 ∨ 𝛽) ∧ 𝛾 = (𝛼 ∧ 𝛾) ∨ (𝛽 ∧ 𝛾)            (𝐴 + 𝐵) × 𝐶 ≅ 𝐴 × 𝐶 + 𝐵 × 𝐶
(𝛼 ∧ 𝛽) ∨ 𝛾 = (𝛼 ∨ 𝛾) ∧ (𝛽 ∨ 𝛾)            (𝐴 × 𝐵) + 𝐶 ≇ (𝐴 + 𝐶) × (𝐵 + 𝐶)

Table 5.5: Logic identities with disjunction and conjunction, and the possible type
equivalences.

We already verified the first line and the last three lines of Table 5.5. Other
identities are verified in a similar way. Let us begin with lines 3 and 4 of Table 5.5,
which involve the proposition 𝐹𝑎𝑙𝑠𝑒 and the corresponding void type 0 (Scala’s
Nothing). Reasoning about the void type needs a special technique that we will
now develop while verifying the type isomorphisms 0 × 𝐴 ≅ 0 and 0 + 𝐴 ≅ 𝐴.
Example 5.3.2.1 Verify the type equivalence 0 × 𝐴 ≅ 0.
Solution Recall that the type notation 0 × 𝐴 represents the Scala tuple type
(Nothing, A). To demonstrate that the type (Nothing, A) is equivalent to the type
Nothing, we need to show that the type (Nothing, A) has no values. Indeed, how
could we create a value of type, say, (Nothing, Int)? We would need to fill both
parts of the tuple. We have values of type Int, but we can never get a value of type
Nothing. So, regardless of the type A, it is impossible to create any values of type
(Nothing, A). In other words, the set of values of the type (Nothing, A) is empty.
But that is the definition of the void type Nothing. The types (Nothing, A) (denoted
by 0 × 𝐴) and Nothing (denoted by 0) are both void and therefore equivalent.
Example 5.3.2.2 Verify the type equivalence 0 + 𝐴 ≅ 𝐴.
Solution The type notation 0 + 𝐴 corresponds to the Scala type Either[Nothing,
A]. We need to show that any value of that type can be mapped without loss of
information to a value of type A, and vice versa. This means implementing func-
tions 𝑓1 : 0 + 𝐴 → 𝐴 and 𝑓2 : 𝐴 → 0 + 𝐴 such that 𝑓1 # 𝑓2 = id and 𝑓2 # 𝑓1 = id.
The argument of 𝑓1 is of type Either[Nothing, A]. How can we create a value of
that type? Our only choices are to create a Left(x) with x: Nothing, or to create a
Right(y) with y: A. However, we cannot create a value x of type Nothing because
the type Nothing has no values. We cannot create a Left(x). The only remaining

possibility is to create a Right(y) with some value y of type A. So, any values of
type 0 + 𝐴 must be of the form Right(y), and we can extract that y to obtain a value
of type A:
def f1[A]: Either[Nothing, A] => A = {
  case Right(y) => y
  // No need for `case Left(x) => ...` since no `x` can ever be given as `Left(x)`.
}

For the same reason, there is only one implementation of the function f2:
def f2[A]: A => Either[Nothing, A] = { y => Right(y) }

It is clear from the code that the functions f1 and f2 are inverses of each other.
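
The round trips can be checked on sample values. The sketch below is a variation on the code above and makes one assumption that is not in the text: the match in f1 includes a `case Left(x) => x` branch. That branch can never run, but it type-checks because a value of type Nothing conforms to any type, and it keeps the match exhaustive. The object name and the sample values are illustrative.

```scala
object VoidSumCheck {
  def f1[A]: Either[Nothing, A] => A = {
    case Right(y) => y
    case Left(x)  => x // Never reached: there are no values `x` of type Nothing.
  }
  def f2[A]: A => Either[Nothing, A] = { y => Right(y) }

  val samples: List[Int] = List(-1, 0, 42)

  // f2 followed by f1 returns the initial value of type A.
  val roundTripOnA: Boolean = samples.forall(y => f1[Int](f2[Int](y)) == y)
  // f1 followed by f2 returns the initial value of type Either[Nothing, A].
  val roundTripOnSum: Boolean =
    samples.map(f2[Int]).forall(e => f2[Int](f1[Int](e)) == e)
}
```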
Example 5.3.2.3 Verify the type equivalence 𝐴 × 1 ≅ 𝐴.
Solution The corresponding Scala types are the tuple (A, Unit) and the type
A. We need to implement functions 𝑓1 : ∀𝐴. 𝐴 × 1 → 𝐴 and 𝑓2 : ∀𝐴. 𝐴 → 𝐴 × 1
and to demonstrate that they are inverses of each other. The Scala code for these
functions is:
def f1[A]: ((A, Unit)) => A = { case (a, ()) => a }
def f2[A]: A => (A, Unit) = { a => (a, ()) }

Let us first write a proof by reasoning directly with Scala code:


(f1 andThen f2)((a, ())) == f2(f1((a, ()))) == f2(a) == (a, ())
(f2 andThen f1)(a) == f1(f2(a)) == f1((a, ())) == a

Now let us write a proof in the code notation. The codes of 𝑓1 and 𝑓2 are:

$$ f_1 \overset{\text{def}}{=} a^{:A} \times 1 \to a \;, \qquad f_2 \overset{\text{def}}{=} a^{:A} \to a \times 1 \;, $$

where we denoted by 1 the value () of the Unit type. We find:

$$ (f_1 \,\#\, f_2)(a^{:A} \times 1) = f_2(f_1(a \times 1)) = f_2(a) = a \times 1 \;, $$
$$ (f_2 \,\#\, f_1)(a^{:A}) = f_1(f_2(a)) = f_1(a \times 1) = a \;. $$

This shows that both compositions are identity functions. Another way of writ-
ing the proof is by computing the function compositions symbolically, without
applying to a value 𝑎 :𝐴 :

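$$ f_1 \,\#\, f_2 = (a \times 1 \to a) \,\#\, (a \to a \times 1) = (a \times 1 \to a \times 1) = \mathrm{id}^{A \times 1} \;, $$
$$ f_2 \,\#\, f_1 = (a \to a \times 1) \,\#\, (a \times 1 \to a) = (a \to a) = \mathrm{id}^{A} \;. $$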

Example 5.3.2.4 Verify the type equivalence 𝐴 + 𝐵 ≅ 𝐵 + 𝐴.
Solution The corresponding Scala types are Either[A, B] and Either[B, A]. Use
pattern matching to implement the functions required for the type equivalence:

def f1[A, B]: Either[A, B] => Either[B, A] = {
  case Left(a) => Right(a) // No other choice here.
  case Right(b) => Left(b) // No other choice here.
}
def f2[A, B]: Either[B, A] => Either[A, B] = f1[B, A]

The functions f1 and f2 are implemented by code that can be derived unambigu-
ously from the type signatures. For instance, the line case Left(a) => ... is re-
quired to return a value of type Either[B, A] by using a given value a: A. The only
way of doing that is by returning Right(a).
It is clear from the code that the functions f1 and f2 are inverses of each other. To
verify that rigorously, we need to show that f1 andThen f2 is equal to an identity
function. The function f1 andThen f2 applies f2 to the result of f1. The code of
f1 contains two case ... lines, each returning a result. So, we need to apply f2
separately in each line. Evaluate the code symbolically:
(f1 andThen f2) == {
case Left(a) => f2(Right(a))
case Right(b) => f2(Left(b))
} == {
case Left(a) => Left(a)
case Right(b) => Right(b)
}

The result is a function of type Either[A, B] => Either[A, B] that does not change
its argument; so, it is equal to the identity function.
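
The symbolic evaluation can be cross-checked by running the code on sample values. The sketch below also compares f1 with the `swap` method of the standard library's Either, which performs the same exchange of Left and Right. The object name and the sample values are assumptions made for illustration.

```scala
object SwapCheck {
  def f1[A, B]: Either[A, B] => Either[B, A] = {
    case Left(a)  => Right(a)
    case Right(b) => Left(b)
  }
  def f2[A, B]: Either[B, A] => Either[A, B] = f1[B, A]

  // Sample values with A = Int, B = String (illustrative choice).
  val samples: List[Either[Int, String]] = List(Left(1), Right("b"))

  // Applying f1 and then f2 returns the initial value.
  val roundTrip: Boolean =
    samples.forall(x => f2[Int, String](f1[Int, String](x)) == x)
  // f1 gives the same results as the library method `swap`.
  val agreesWithSwap: Boolean =
    samples.forall(x => f1[Int, String](x) == x.swap)
}
```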
Let us now write the function f1 in the code notation and perform the same
derivation. We will also develop a useful notation for functions operating on dis-
junctive types.
The pattern matching construction in the Scala code of f1 is similar to a pair of
functions with types A => Either[B, A] and B => Either[B, A]. One of these func-
tions is applied depending on whether the argument of f1 has type 𝐴 + 0 or 0 + 𝐵.
So, we may write the code of f1 as:
$$ f_1 \overset{\text{def}}{=} x^{:A+B} \to \begin{cases} \text{if } x = a^{:A} + 0^{:B} : & 0^{:B} + a^{:A} \\ \text{if } x = 0^{:A} + b^{:B} : & b^{:B} + 0^{:A} \end{cases} $$

Since both the argument and the result of 𝑓1 are disjunctive types with 2 parts
each, the code notation represents 𝑓1 as a 2 × 2 matrix that maps the input parts to
the output parts:
def f1[A, B]: Either[A, B] => Either[B, A] = {
case Left(a) => Right(a)
case Right(b) => Left(b)
}


$$ f_1 \overset{\text{def}}{=} \begin{array}{c||cc} & B & A \\ \hline A & 0 & a^{:A} \to a \\ B & b^{:B} \to b & 0 \end{array} \;. $$
The matrix element in row 𝐴 and column 𝐴 is a function 𝑎 :𝐴 → 𝑎 of type 𝐴 → 𝐴
that corresponds to the line case Left(a) => Right(a) in the Scala code. The matrix
element in row 𝐴 and column 𝐵 is written as 0 because no value of that type is
returned. The matrix row 𝐵 contains just the function 𝑏 :𝐵 → 𝑏 in the first column.
In the second column, row 𝐵 contains a 0.
The code of 𝑓2 is written similarly. Let us rename arguments for clarity:
def f2[A, B]: Either[B, A] => Either[A, B] = {
case Left(y) => Right(y)
case Right(x) => Left(x)
}

$$ f_2 \overset{\text{def}}{=} \begin{array}{c||cc} & A & B \\ \hline B & 0 & y^{:B} \to y \\ A & x^{:A} \to x & 0 \end{array} \;. $$
The forward composition 𝑓1 # 𝑓2 is computed by the standard rules of row-by-column
matrix multiplication.¹⁰ Any terms containing 0 are omitted, and the
remaining functions are composed:

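$$ \begin{aligned}
f_1 \,\#\, f_2 &= \begin{array}{c||cc} & B & A \\ \hline A & 0 & a^{:A} \to a \\ B & b^{:B} \to b & 0 \end{array} \;\#\; \begin{array}{c||cc} & A & B \\ \hline B & 0 & y^{:B} \to y \\ A & x^{:A} \to x & 0 \end{array} \\
\text{matrix composition}: \quad &= \begin{array}{c||cc} & A & B \\ \hline A & (a^{:A} \to a) \,\#\, (x^{:A} \to x) & 0 \\ B & 0 & (b^{:B} \to b) \,\#\, (y^{:B} \to y) \end{array} \\
\text{function composition}: \quad &= \begin{array}{c||cc} & A & B \\ \hline A & \mathrm{id} & 0 \\ B & 0 & \mathrm{id} \end{array} = \mathrm{id}^{:A+B \to A+B} \;.
\end{aligned} $$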

¹⁰ https://en.wikipedia.org/wiki/Matrix_multiplication

Several features of the matrix notation are helpful in such calculations. The
parts of the code of 𝑓1 are automatically composed with the corresponding parts
of the code of 𝑓2. To check that the types match in the function composition, we
just need to compare the types in the output row (𝐵, 𝐴) of 𝑓1 with the input
column (𝐵, 𝐴) of 𝑓2. Once we verified that all types match, we may omit the type
annotations and write the same derivation more concisely as:

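$$ \begin{aligned}
f_1 \,\#\, f_2 &= \begin{Vmatrix} 0 & a^{:A} \to a \\ b^{:B} \to b & 0 \end{Vmatrix} \;\#\; \begin{Vmatrix} 0 & y^{:B} \to y \\ x^{:A} \to x & 0 \end{Vmatrix} \\
\text{matrix composition}: \quad &= \begin{Vmatrix} (a^{:A} \to a) \,\#\, (x^{:A} \to x) & 0 \\ 0 & (b^{:B} \to b) \,\#\, (y^{:B} \to y) \end{Vmatrix} \\
\text{function composition}: \quad &= \begin{Vmatrix} \mathrm{id} & 0 \\ 0 & \mathrm{id} \end{Vmatrix} = \mathrm{id} \;.
\end{aligned} $$

The identity function is represented by this diagonal matrix:

$$ \begin{Vmatrix} \mathrm{id} & 0 \\ 0 & \mathrm{id} \end{Vmatrix} \;. $$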
Exercise 5.3.2.5 Verify the type equivalence 𝐴 × 𝐵 ≅ 𝐵 × 𝐴.
Exercise 5.3.2.6 Verify the type equivalence (𝐴 + 𝐵) + 𝐶 ≅ 𝐴 + (𝐵 + 𝐶). Together
with the equivalence (𝐴 × 𝐵) × 𝐶 ≅ 𝐴 × (𝐵 × 𝐶) proved in Section 5.3.1, this allows
us to write 𝐴 + 𝐵 + 𝐶 and 𝐴 × 𝐵 × 𝐶 without any parentheses.
Exercise 5.3.2.7 Verify the type equivalence:

(𝐴 + 𝐵) × (𝐴 + 𝐵) ≅ 𝐴 × 𝐴 + 2 × 𝐴 × 𝐵 + 𝐵 × 𝐵 ,

where 2 denotes the Boolean type (which may be defined as 2 ≝ 1 + 1).

5.3.3 Type cardinalities and type equivalence


To understand why type equivalences are related to arithmetic identities, consider
the question of how many different values a given type can have.
For a given type 𝐴, let us denote by | 𝐴| the number of distinct values of type 𝐴.
The number | 𝐴| is called the cardinality of type 𝐴. This is the same as the number
of elements in the set of all values of type 𝐴.
Begin by counting the number of distinct values for simple types. We find that
the Unit type has only one distinct value; the type Nothing has zero values; the
Boolean type has two distinct values (true and false); and the type Int has 2³²
distinct values.
It is more difficult to count the number of distinct values in a type such as String,
which is equivalent to a list of unknown length, List[Char]. In practice, each com-
puter’s memory is limited, so there will exist a maximum length for values of type

String. So, the total number of possible different strings will be finite, depending
on the computer. Similarly, the set of all possible values of type List[Int] will be a
finite set.
But this introduces an arbitrary limit on the total size of data, which is incon-
venient for reasoning about programs. For instance, string concatenation and list
concatenation will become partial functions; those operations will fail when the
total size of the result is larger than the memory limit. It is more convenient to
imagine a computer with an infinite array of memory locations. On that com-
puter, each program is still only allowed to use a finite amount of memory, but
that amount is not limited in advance. Then all basic operations on data types be-
come total functions; for example, the concatenation of any two strings is always
well-defined.
In the model of an infinite computer, the set of all possible strings will be a
countably infinite set consisting of all possible character sequences (where char-
acters come from a finite set). Similarly, the set of all possible values of type
List[Int] will be a countably infinite set. This makes it difficult to reason about
the total number of values of a given type. So, for the purposes of this section, we
will limit our consideration to types 𝐴, 𝐵, ..., that have finite cardinalities.
The next step is to consider the cardinality of types such as 𝐴 × 𝐵 and 𝐴 + 𝐵.
If the types 𝐴 and 𝐵 have cardinalities | 𝐴| and |𝐵|, it follows that the set of all
distinct pairs (A, B) has | 𝐴| × |𝐵| elements. So, the cardinality of the type 𝐴 × 𝐵
is equal to the (arithmetic) product of the cardinalities of 𝐴 and 𝐵. The set of all
pairs, denoted in mathematics by:

{(𝑎, 𝑏) | 𝑎 ∈ 𝐴, 𝑏 ∈ 𝐵} ,

is called the Cartesian product of sets 𝐴 and 𝐵, and is denoted by 𝐴 × 𝐵. For
this reason, the tuple type is also called the product type. Accordingly, the type
notation adopts the symbol × for the product type.
The set of all distinct values of the type 𝐴 + 𝐵, i.e., of the Scala type Either[A, B],
is a labeled union of the set of values of the form Left(a) and the set of values of
the form Right(b). It is clear that the cardinalities of these two sets are equal to | 𝐴|
and |𝐵| respectively. So, the cardinality of the type Either[A, B] is equal to | 𝐴| + |𝐵|.
For this reason, disjunctive types are also called sum types, and the type notation
adopts the symbol + for these types.
We can write our conclusions as:

| 𝐴 × 𝐵| = | 𝐴| × |𝐵| ,
| 𝐴 + 𝐵| = | 𝐴| + |𝐵| .

The type notation, 𝐴 × 𝐵 for (A,B) and 𝐴 + 𝐵 for Either[A, B], translates directly
into type cardinalities.
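
For small finite types, these two formulas can be confirmed by listing all values explicitly. The following sketch uses 𝐴 = Boolean (2 values) and 𝐵 = Option[Boolean] (3 values); the object name and the choice of types are assumptions made for illustration.

```scala
object CardinalityCheck {
  type S = Either[Boolean, Option[Boolean]]

  // All values of type A = Boolean and of type B = Option[Boolean].
  val bools: List[Boolean] = List(true, false)
  val opts: List[Option[Boolean]] = List(None, Some(true), Some(false))

  // All values of the product type A × B: every pair (a, b).
  val pairs: List[(Boolean, Option[Boolean])] =
    for { a <- bools; b <- opts } yield (a, b)

  // All values of the sum type A + B: every Left(a) and every Right(b).
  val sums: List[S] =
    bools.map(a => Left(a): S) ++ opts.map(b => Right(b): S)

  val productCardinality: Int = pairs.distinct.size // |A| × |B|
  val sumCardinality: Int     = sums.distinct.size  // |A| + |B|
}
```

With these types, the product has 2 × 3 = 6 values and the sum has 2 + 3 = 5 values.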
The last step is to notice that two types can be equivalent, 𝑃 ≅ 𝑄, only if their
cardinalities are equal, |𝑃| = |𝑄|. When the cardinalities are not equal, |𝑃| ≠ |𝑄|, it
will be impossible to have a one-to-one correspondence between the sets of values
of type 𝑃 and values of type 𝑄. So, it will be impossible to convert values from
type 𝑃 to type 𝑄 and back without loss of information.
We conclude that types are equivalent when a logical identity and an arithmetic
identity hold.
The presence of both identities does not automatically guarantee a useful type
equivalence. The fact that information in one type can be identically stored in
another type does not necessarily mean that it is helpful to do so in a given appli-
cation.
For example, the types Option[Option[A]] and Either[Boolean, A] are equivalent
because both types contain 2 + | 𝐴| distinct values. The short notation for these
types is 1 + 1 + 𝐴 and 2 + 𝐴 respectively. The type Boolean is denoted by 2 since it
has only two distinct values.
One could write code for converting between these types without loss of infor-
mation:
def f1[A]: Option[Option[A]] => Either[Boolean, A] = {
case None => Left(false) // Or maybe Left(true)?
case Some(None) => Left(true)
case Some(Some(x)) => Right(x)
}

def f2[A]: Either[Boolean, A] => Option[Option[A]] = {
  case Left(false) => None
  case Left(true) => Some(None)
  case Right(x) => Some(Some(x))
}

The presence of an arbitrary choice in this code is a warning sign. In f1, we could
map None to Left(false) or to Left(true) and adjust the rest of the code accord-
ingly. The type equivalence holds with either choice. So, these types are equiva-
lent, but there is no natural choice of the conversion functions f1 and f2 because
the meaning of those data types will be application-dependent. We call this type
equivalence accidental.
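
To illustrate that the choice is arbitrary, the sketch below implements both conventions for 𝐴 = Boolean and checks on all values that each one is an isomorphism. The names to1, from1, to2, from2, the object name, and the choice of 𝐴 are assumptions made for illustration.

```scala
object AccidentalIsoCheck {
  type P = Option[Option[Boolean]]
  type Q = Either[Boolean, Boolean]

  // Convention 1 (as in the text): None <-> Left(false), Some(None) <-> Left(true).
  val to1: P => Q = {
    case None          => Left(false)
    case Some(None)    => Left(true)
    case Some(Some(x)) => Right(x)
  }
  val from1: Q => P = {
    case Left(false) => None
    case Left(true)  => Some(None)
    case Right(x)    => Some(Some(x))
  }

  // Convention 2: the two Boolean labels are exchanged.
  val to2: P => Q = {
    case None          => Left(true)
    case Some(None)    => Left(false)
    case Some(Some(x)) => Right(x)
  }
  val from2: Q => P = {
    case Left(true)  => None
    case Left(false) => Some(None)
    case Right(x)    => Some(Some(x))
  }

  // All 2 + |A| = 4 values of each type when A = Boolean.
  val allP: List[P] = List(None, Some(None), Some(Some(false)), Some(Some(true)))
  val allQ: List[Q] = List(Left(false), Left(true), Right(false), Right(true))

  val convention1IsIso: Boolean =
    allP.forall(p => from1(to1(p)) == p) && allQ.forall(q => to1(from1(q)) == q)
  val convention2IsIso: Boolean =
    allP.forall(p => from2(to2(p)) == p) && allQ.forall(q => to2(from2(q)) == q)
  // The two conventions are different functions.
  val conventionsDiffer: Boolean = allP.exists(p => to1(p) != to2(p))
}
```

Both conventions pass the round-trip checks, and nothing in the types themselves favors one over the other.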
Example 5.3.3.1 Are the types Option[A] and Either[Unit, A] equivalent? Check
whether the corresponding logic identity and arithmetic identity hold.
Solution Begin by writing the given types in the type notation: Option[A] is writ-
ten as 1 + 𝐴, and Either[Unit, A] is written also as 1 + 𝐴. The notation already
indicates that the types are equivalent. But let us verify explicitly that the type
notation is not misleading us here.
To establish the type equivalence, we need to implement two functions:
def f1[A]: Option[A] => Either[Unit, A] = ???
def f2[A]: Either[Unit, A] => Option[A] = ???

These functions must satisfy 𝑓1 # 𝑓2 = id and 𝑓2 # 𝑓1 = id. It is straightforward to
implement f1 and f2:
def f1[A]: Option[A] => Either[Unit, A] = {
  case None => Left(())
  case Some(x) => Right(x)
}
def f2[A]: Either[Unit, A] => Option[A] = {
case Left(()) => None
case Right(x) => Some(x)
}

The code clearly shows that f1 and f2 are inverses of each other. This verifies the
type equivalence.
The logic identity is 𝑇𝑟𝑢𝑒 ∨ 𝐴 = 𝑇𝑟𝑢𝑒 ∨ 𝐴 and holds trivially. It remains to
check the arithmetic identity, which relates the cardinalities of types Option[A] and
Either[Unit, A]. Assume that the cardinality of type A is | 𝐴|. Any possible value of
type Option[A] must be either None or Some(x), where x is a value of type A. So, the
number of distinct values of type Option[A] is 1 + | 𝐴|. All possible values of type
Either[Unit, A] are of the form Left(()) or Right(x), where x is a value of type A. So,
the cardinality of type Either[Unit, A] is 1 + | 𝐴|. We see that the arithmetic identity
holds: the types Option[A] and Either[Unit, A] have equally many distinct values.
This example shows that the type notation is helpful for reasoning about type
equivalences. The answer was found immediately when we wrote the type nota-
tion (1 + 𝐴) for the given types.
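
The same conversions can also be expressed through standard library methods. The sketch below assumes Scala 2.12 or later, where Option has the method `toRight` and Either has the method `toOption`; the object name and the sample values are illustrative.

```scala
object OptionEitherLibraryCheck {
  // None becomes Left(()), and Some(x) becomes Right(x).
  def f1[A]: Option[A] => Either[Unit, A] = { opt => opt.toRight(()) }
  // Left(()) becomes None, and Right(x) becomes Some(x).
  def f2[A]: Either[Unit, A] => Option[A] = { e => e.toOption }

  val options: List[Option[Int]] = List(None, Some(1))
  val eithers: List[Either[Unit, Int]] = List(Left(()), Right(1))

  val roundTripOnOption: Boolean = options.forall(o => f2[Int](f1[Int](o)) == o)
  val roundTripOnEither: Boolean = eithers.forall(e => f1[Int](f2[Int](e)) == e)
}
```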

5.3.4 Type equivalence involving function types


Until now, we have looked at product types and disjunctive types. We now turn
to type expressions involving function types.
Consider two types 𝐴 and 𝐵 having known cardinalities | 𝐴| and |𝐵|. How many
distinct values does the function type 𝐴 → 𝐵 have? A function f: A => B needs to
select a value of type 𝐵 for each possible value of type 𝐴. Therefore, the number
of different functions f: A => B is |𝐵|^|𝐴|.
Here |𝐵|^|𝐴| denotes the numeric exponent, that is, |𝐵| to the power |𝐴|. We use
the numeric exponent notation (𝑎^𝑏) only when computing with numbers. When
denoting types and code, this book uses superscripts for type parameters and type
annotations.
For the types 𝐴 = 𝐵 = Int, we have |𝐴| = |𝐵| = 2³², and the exponential formula
gives:

$$ |A \to B| = (2^{32})^{(2^{32})} = 2^{32 \times 2^{32}} = 2^{2^{37}} \approx 10^{4.1 \times 10^{10}} \;. $$

This number greatly exceeds the number of atoms in the observable Universe.¹¹
However, almost all of those functions will map integers to integers in extremely
complicated (and practically useless) ways. The code of those functions will be
much larger than the available memory of a realistic computer. So, the number
of practically implementable functions of type 𝐴 → 𝐵 can be much smaller
than |𝐵|^|𝐴|. Since the code of a function is a list of bytes that needs to fit into the
computer’s memory, the number of implementable functions is no larger than the
number of possible byte lists.

¹¹ Estimated in https://www.universetoday.com/36302/atoms-in-the-universe/amp/ to be
between 10⁷⁸ and 10⁸².

Logical identity (if holds)              Type equivalence                      Arithmetic identity

(𝑇𝑟𝑢𝑒 ⇒ 𝛼) = 𝛼                           1 → 𝐴 ≅ 𝐴                             𝑎^1 = 𝑎
(𝐹𝑎𝑙𝑠𝑒 ⇒ 𝛼) = 𝑇𝑟𝑢𝑒                       0 → 𝐴 ≅ 1                             𝑎^0 = 1
(𝛼 ⇒ 𝑇𝑟𝑢𝑒) = 𝑇𝑟𝑢𝑒                        𝐴 → 1 ≅ 1                             1^𝑎 = 1
(𝛼 ⇒ 𝐹𝑎𝑙𝑠𝑒) ≠ 𝐹𝑎𝑙𝑠𝑒                      𝐴 → 0 ≇ 0                             0^𝑎 ≠ 0
(𝛼 ∨ 𝛽) ⇒ 𝛾 = (𝛼 ⇒ 𝛾) ∧ (𝛽 ⇒ 𝛾)          𝐴 + 𝐵 → 𝐶 ≅ (𝐴 → 𝐶) × (𝐵 → 𝐶)         𝑐^(𝑎+𝑏) = 𝑐^𝑎 × 𝑐^𝑏
(𝛼 ∧ 𝛽) ⇒ 𝛾 = 𝛼 ⇒ (𝛽 ⇒ 𝛾)                𝐴 × 𝐵 → 𝐶 ≅ 𝐴 → 𝐵 → 𝐶                 𝑐^(𝑎×𝑏) = (𝑐^𝑏)^𝑎
𝛼 ⇒ (𝛽 ∧ 𝛾) = (𝛼 ⇒ 𝛽) ∧ (𝛼 ⇒ 𝛾)          𝐴 → 𝐵 × 𝐶 ≅ (𝐴 → 𝐵) × (𝐴 → 𝐶)         (𝑏 × 𝑐)^𝑎 = 𝑏^𝑎 × 𝑐^𝑎

Table 5.6: Logical identities with implication, and the corresponding type equivalences
and arithmetic identities.
Nevertheless, the formula |𝐵|^|𝐴| is useful since it shows the number of distinct
functions that are possible in principle, on an imaginary computer with infinite
memory (although we still need to limit our consideration to types 𝐴, 𝐵 with finite
cardinalities |𝐴|, |𝐵|). When types 𝐴 and 𝐵 have only a small number of distinct
values (for example, with 𝐴 = Option[Boolean] and 𝐵 = Either[Boolean, Boolean]),
the formula |𝐵|^|𝐴| gives an exact and practically relevant answer.
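
For such small types, the count can be confirmed by enumeration. In the sketch below, each function from Option[Boolean] to Either[Boolean, Boolean] is represented by its table of results (a Map with one entry per argument value). This representation, the object name, and the value names are assumptions made for illustration; two functions are considered equal when their tables are equal.

```scala
object FunctionCountCheck {
  type A = Option[Boolean]
  type B = Either[Boolean, Boolean]

  // All values of type A (3 values) and of type B (4 values).
  val as: List[A] = List(None, Some(false), Some(true))
  val bs: List[B] = List(Left(false), Left(true), Right(false), Right(true))

  // Build all tables: for each argument value `a`, choose any result value `b`.
  val tables: List[Map[A, B]] =
    as.foldLeft(List(Map.empty[A, B])) { (acc, a) =>
      for { table <- acc; b <- bs } yield table + (a -> b)
    }

  // The number of distinct functions is |B| to the power |A|.
  val count: Int = tables.distinct.size
}
```

The enumeration yields 4³ = 64 distinct tables, in agreement with the formula.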
Let us now look for logic identities and arithmetic identities involving function
types. Table 5.6 lists the available identities and the corresponding type equivalences.
(In the last column, we defined 𝑎 ≝ |𝐴|, 𝑏 ≝ |𝐵|, and 𝑐 ≝ |𝐶| for brevity.)
It is notable that no logic identity is available for the formula 𝛼 ⇒ (𝛽 ∨ 𝛾), and
correspondingly no type equivalence is available for the type expression 𝐴 →
𝐵 + 𝐶 (although there is an identity for 𝐴 → 𝐵 × 𝐶). Reasoning about types of
the form 𝐴 → 𝐵 + 𝐶 is more complicated because those types usually cannot be
rewritten as simpler types.
We will now prove some of the type identities in Table 5.6.
Example 5.3.4.1 Verify the type equivalence 1 → 𝐴 ≅ 𝐴.
Solution Recall that the type notation 1 → 𝐴 means the Scala function type
Unit => A. There is only one value of type Unit. The choice of a function of type
Unit => A is the same as the choice of a value of type A. So, the type 1 → 𝐴 has | 𝐴|
distinct values, and the arithmetic identity holds.

To verify the type equivalence explicitly, we need to implement two functions:
def f1[A]: (Unit => A) => A = ???
def f2[A]: A => Unit => A = ???
The first function needs to produce a value of type A, given an argument of the
function type Unit => A. The only possibility is to apply that function to the value
of type Unit. We produce that value as ():
def f1[A]: (Unit => A) => A = (h: Unit => A) => h(())
Implementing f2 is straightforward. We can just discard the Unit argument:
def f2[A]: A => Unit => A = (x: A) => _ => x
It remains to show that the functions f1 and f2 are inverses of each other. Let us
show the proof using Scala code and then using the code notation.
Writing Scala code, compute f1(f2(x)) for an arbitrary x: A. Use the code of f1
and f2 and get:
f1(f2(x)) == f1(_ => x) == (_ => x)(()) == x
Now compute f2(f1(h)) for arbitrary h: Unit => A.
f2(f1(h)) == f2(h(())) == { _ => h(()) }
How can we show that the function {_ => h(())} is equal to h? Whenever we apply
equal functions to equal arguments, they return equal results. In our case, the
argument of h is of type Unit, so we only need to verify that the result of applying
h to the value () is the same as the result of applying {_ => h(())} to (). In other
words, we need to apply both sides to an additional argument ():
f2(f1(h))(()) == { _ => h(()) } (()) == h(())
This completes the proof.
For comparison, let us show the same proof in the code notation. The functions
𝑓1 and 𝑓2 are:

$$ f_1 \overset{\text{def}}{=} h^{:1 \to A} \to h(1) \;, \qquad f_2 \overset{\text{def}}{=} x^{:A} \to 1 \to x \;. $$

Now write the function compositions in both directions:

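$$ \begin{aligned}
\text{expect to equal } \mathrm{id}: \quad f_1 \,\#\, f_2 &= (h^{:1 \to A} \to h(1)) \,\#\, (x^{:A} \to 1 \to x) \\
\text{compute composition}: \quad &= h^{:1 \to A} \to 1 \to h(1) \\
\text{note that } 1 \to h(1) \text{ is the same as } h: \quad &= (h^{:1 \to A} \to h) = \mathrm{id} \;. \\
\text{expect to equal } \mathrm{id}: \quad f_2 \,\#\, f_1 &= (x^{:A} \to 1 \to x) \,\#\, (h^{:1 \to A} \to h(1)) \\
\text{compute composition}: \quad &= x^{:A} \to (1 \to x)(1) \\
\text{apply function}: \quad &= (x^{:A} \to x) = \mathrm{id} \;.
\end{aligned} $$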


The type 1 → 𝐴 is equivalent to the type 𝐴 in the sense of carrying the same
information, but these types are not exactly the same. An important difference
between these types is that a value of type 𝐴 is available immediately, while a
value of type 1 → 𝐴 is a function that still needs to be applied to an argument
(of type 1) before a value of type 𝐴 is obtained. The type 1 → 𝐴 may represent
an “on-call” value of type 𝐴. That value is computed on demand every time it is
requested. (See Section 2.6.3 for more details about “on-call” values.)
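
The difference between a value of type 𝐴 and an "on-call" value of type 1 → 𝐴 can be observed with a function that has a side effect. The sketch below is illustrative only: the counter `evaluations`, the value 42, and the object name are assumptions, and the mutable variable is used solely to make the repeated evaluation visible.

```scala
object OnCallDemo {
  // Counts how many times the body of `onCall` has been evaluated.
  var evaluations: Int = 0

  // A value of type Unit => Int: nothing is computed until the function is applied.
  val onCall: Unit => Int = { _ => evaluations += 1; 42 }

  // Each request re-runs the computation.
  val first: Int  = onCall(())
  val second: Int = onCall(())
}
```

After both requests, the counter equals 2, while both results are the same value 42. A plain value of type Int would have been computed only once.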
The void type 0 needs special reasoning, as the next examples show:
Example 5.3.4.2 Verify the type equivalence 0 → 𝐴 ≅ 1.
Solution To verify that a type 𝑋 is equivalent to the Unit type, we need to show
that there is only one distinct value of type 𝑋. So, let us find out how many values
the type 0 → 𝐴 has. Consider a value of that type, which is a function 𝑓 :0→𝐴 from
the type 0 to a type 𝐴. Since there exist no values of type 0, the function 𝑓 will
never be applied to any arguments and so does not need to compute any actual
values of type 𝐴. So, 𝑓 is a function whose body may be “empty”. At least, 𝑓 ’s
body does not need to contain any expressions of type 𝐴. In Scala, such a function
can be written as:
def absurd[A]: Nothing => A = { ??? }

This code will compile without type errors. An equivalent code is:
def absurd[A]: Nothing => A = { x => ??? }

The symbol ??? is defined in the Scala library and represents code that is “not
implemented”. Trying to evaluate this symbol will produce an error:
scala> val x = ???
scala.NotImplementedError: an implementation is missing

Since the function absurd can never be applied to an argument, this error will never
happen. So, one can pretend that the result value (which will never be computed)
has any required type, e.g., the type 𝐴. In this way, the compiler will accept the
definition of absurd.
Let us now verify that there exists only one distinct function of type 0 → 𝐴. Take
any two functions of that type, 𝑓 :0→𝐴 and 𝑔 :0→𝐴 . Are they different? The only way
of showing that 𝑓 and 𝑔 are different is by finding a value 𝑥 such that 𝑓 (𝑥) ≠ 𝑔(𝑥).
But then 𝑥 would be of type 0, and there are no values of type 0. So, we will never
be able to find the required value 𝑥. It follows that any two functions 𝑓 and 𝑔 of
type 0 → 𝐴 are equal, 𝑓 = 𝑔. In other words, there exists only one distinct value
of type 0 → 𝐴. Since the cardinality of the type 0 → 𝐴 is 1, we obtain the type
equivalence 0 → 𝐴 ≅ 1.
Example 5.3.4.3 Show that 𝐴 → 0 ≇ 0 and 𝐴 → 0 ≇ 1, where 𝐴 is an arbitrary
unknown type.
Solution To prove that two types are not equivalent, it is sufficient to show
that their cardinalities are different. Let us determine how the cardinality of the
type 𝐴 → 0 depends on the cardinality of 𝐴. We note that a function of type, say,
Int → 0 is impossible to implement. (If we had such a function 𝑓 :Int→0 , we could
evaluate, say, 𝑥 ≝ 𝑓 (123) and obtain a value 𝑥 of type 0, which is impossible by
definition of the type 0.) It follows that |Int → 0| = 0. However, Example 5.3.4.2
shows that 0 → 0 has cardinality 1. So, we find that | 𝐴 → 0| = 1 if the type 𝐴 is
itself 0 but | 𝐴 → 0| = 0 for all other types 𝐴. We conclude that the type 𝐴 → 0 is
not equivalent to 0 or 1 when 𝐴 is an unknown type. The type 𝐴 → 0 is void for
non-void types 𝐴, and vice versa.
Example 5.3.4.4 Verify the type equivalence 𝐴 → 1 ≅ 1.
Solution There is only one fully parametric function that returns 1:
def f[A]: A => Unit = { _ => () }
The function 𝑓 must return the fixed value () of type Unit. The argument of type
𝐴 is of no use for that. So, the code of 𝑓 must discard its argument and return the
value (). In the code notation, this function is written as:

\[ f^{:A \to 1} \triangleq (\_ \to 1) \quad . \]

We can show that there exists only one distinct function of type 𝐴 → 1 (that is, the
type 𝐴 → 1 has cardinality 1). Assume that 𝑓 and 𝑔 are two such functions, and
try to find a value 𝑥 :𝐴 such that 𝑓 (𝑥) ≠ 𝑔(𝑥). We cannot find any such 𝑥 because
𝑓 (𝑥) = 1 and 𝑔(𝑥) = 1 for all 𝑥. So, any two functions 𝑓 and 𝑔 of type 𝐴 → 1 must
be equal to each other. The cardinality of the type 𝐴 → 1 is 1.
Any type having cardinality 1 is equivalent to the Unit type (1). So, 𝐴 → 1 ≅ 1.
Example 5.3.4.5 Denote by _:𝐴 → 𝐵 the type of constant functions of type 𝐴 → 𝐵
(functions that ignore their argument). Show that the type _:𝐴 → 𝐵 is equivalent
to the type 𝐵, as long as 𝐴 ≠ 0.
Solution An isomorphism between the types 𝐵 and _:𝐴 → 𝐵 is given by the
two functions:

\[ f_1 : B \to (\_^{:A} \to B) \quad , \qquad f_1 \triangleq b \to \_ \to b \quad ; \]
\[ f_2 : (\_^{:A} \to B) \to B \quad , \qquad f_2 \triangleq k^{:\_ \to B} \to k(x^{:A}) \quad , \]

where 𝑥 is any value of type 𝐴. That value exists since the type 𝐴 is not void. The
function 𝑓2 does not depend on the choice of 𝑥 because 𝑘 is a constant function, so
𝑘 (𝑥) is the same for all 𝑥. In other words, the function 𝑘 satisfies 𝑘 = (_ → 𝑘 (𝑥))
with any chosen 𝑥. To prove that 𝑓1 and 𝑓2 are inverses:

𝑓1 # 𝑓2 = (𝑏 → _ → 𝑏) # (𝑘 → 𝑘 (𝑥)) = 𝑏 → (_ → 𝑏)(𝑥) = (𝑏 → 𝑏) = id ,
𝑓2 # 𝑓1 = (𝑘 → 𝑘 (𝑥)) # (𝑏 → _ → 𝑏) = 𝑘 → _ → 𝑘 (𝑥) = 𝑘 → 𝑘 = id .
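
The two directions of this isomorphism can be sketched in Scala as follows. This is only an illustration: the names toConst and fromConst are not from the text, and the witness value of type A is passed explicitly because Scala cannot produce one automatically.

```scala
// f1: wrap a value b into a constant function that ignores its argument.
def toConst[A, B]: B => (A => B) = b => _ => b

// f2: recover b by applying the constant function to some value x of type A.
// The witness x must be supplied by the caller; it exists only if A is not void.
def fromConst[A, B](x: A): (A => B) => B = k => k(x)

val k: Int => String = toConst[Int, String]("abc")
val roundTrip: String = fromConst[Int, String](0)(k)
```

Any other witness (say, 42 instead of 0) gives the same result, which reflects the fact that 𝑓2 does not depend on the choice of 𝑥.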

Example 5.3.4.6 Verify the following type equivalence:

𝐴 + 𝐵 → 𝐶 ≅ ( 𝐴 → 𝐶) × (𝐵 → 𝐶) .
Solution Begin by implementing two functions with type signatures:


def f1[A, B, C]: (Either[A, B] => C) => (A => C, B => C) = ???
def f2[A, B, C]: ((A => C, B => C)) => Either[A, B] => C = ???
The code can be derived unambiguously from the type signatures. For the first
function, we need to produce a pair of functions of type (A => C, B => C). Can we
produce the first part of that pair? Computing a function of type A => C means
that we need to produce a value of type C given an arbitrary value a: A. The
available data is a function of type Either[A, B] => C called, say, h. We can apply
that function to Left(a) and obtain a value of type C as required. So, a function
of type A => C is computed as a => h(Left(a)). We can produce a function of type
B => C similarly. The code is:
def f1[A, B, C]: (Either[A, B] => C) => (A => C, B => C) =
(h: Either[A, B] => C) => (a => h(Left(a)), b => h(Right(b)))
We write this function in the code notation like this:

\[ f_1 : (A + B \to C) \to (A \to C) \times (B \to C) \quad , \]
\[ f_1 \triangleq h^{:A+B \to C} \to \big( a^{:A} \to h(a + 0^{:B}) \big) \times \big( b^{:B} \to h(0^{:A} + b) \big) \quad . \]

For the function f2, we need to apply pattern matching to both curried arguments
and then return a value of type C. This can be achieved in only one way:
def f2[A, B, C]: ((A => C, B => C)) => Either[A, B] => C = {
  case (f, g) => {
    case Left(a)  => f(a)
    case Right(b) => g(b)
  }
}
We write this function in the code notation like this:

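\[ f_2 : (A \to C) \times (B \to C) \to A + B \to C \quad , \]
\[ f_2 \triangleq f^{:A \to C} \times g^{:B \to C} \to \begin{array}{|c||c|} & C \\ \hline A & a \to f(a) \\ B & b \to g(b) \end{array} \quad . \]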

The matrix in the last line has only one column because the result type, 𝐶, is not
known to be a disjunctive type. We may also simplify the functions, e.g., replace
𝑎 → 𝑓 (𝑎) by just 𝑓 , and write:

\[ f_2 \triangleq f^{:A \to C} \times g^{:B \to C} \to \begin{array}{|c||c|} & C \\ \hline A & f \\ B & g \end{array} \quad . \]

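It remains to verify that 𝑓1 # 𝑓2 = id and 𝑓2 # 𝑓1 = id. To compute 𝑓1 # 𝑓2 , we write
(omitting types):

\begin{align*}
f_1 \,\#\, f_2 &= \big( h \to (a \to h(a + 0)) \times (b \to h(0 + b)) \big) \,\#\, \Big( f \times g \to \begin{array}{||c|} f \\ g \end{array} \Big) \\
\text{compute composition} : \quad &= h \to \begin{array}{||c|} a \to h(a + 0) \\ b \to h(0 + b) \end{array} \quad .
\end{align*}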

To proceed, we need to simplify the expressions ℎ(𝑎 + 0) and ℎ(0 + 𝑏). We rewrite
the argument ℎ (an arbitrary function of type 𝐴 + 𝐵 → 𝐶) in the matrix notation:

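\[ h \triangleq \begin{array}{|c||c|} & C \\ \hline A & a \to p(a) \\ B & b \to q(b) \end{array} = \begin{array}{|c||c|} & C \\ \hline A & p \\ B & q \end{array} \quad , \]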

where 𝑝 :𝐴→𝐶 and 𝑞 :𝐵→𝐶 are new arbitrary functions. Since we already checked the
types, we can omit all type annotations and write ℎ as:

\[ h \triangleq \begin{array}{||c|} p \\ q \end{array} \quad . \]

To evaluate expressions such as ℎ(𝑎 + 0) and ℎ(0 + 𝑏), we need to use one of the
rows of this matrix. The correct row will be selected automatically by the rules of
matrix multiplication if we place a row vector to the left of the matrix and use the
convention of omitting terms containing 0:

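\[ \begin{array}{|cc|} a & 0 \end{array} \triangleleft \begin{array}{||c|} p \\ q \end{array} = a \triangleleft p \quad , \qquad \begin{array}{|cc|} 0 & b \end{array} \triangleleft \begin{array}{||c|} p \\ q \end{array} = b \triangleleft q \quad . \]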

Here we used the symbol ⊲ to separate an argument from a function when the
argument is written to the left of the function. The symbol ⊲ (pronounced “pipe”)
is defined by 𝑥 ⊲ 𝑓 ≝ 𝑓 (𝑥). In Scala, this operation is available as x.pipe(f) as of
Scala 2.13, after importing scala.util.chaining._.
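
A short Scala illustration of the pipe operation applied to a function on a disjunctive type. The function h below is a made-up example whose two cases play the roles of 𝑝 and 𝑞:

```scala
import scala.util.chaining._ // Provides the `pipe` extension method (Scala 2.13+).

// A function of type A + B => C with A = Int, B = String, C = Int.
val h: Either[Int, String] => Int = {
  case Left(a)  => a + 1    // Plays the role of p.
  case Right(b) => b.length // Plays the role of q.
}

val r1: Int = (Left(10): Either[Int, String]).pipe(h)     // Same as h(Left(10)).
val r2: Int = (Right("abc"): Either[Int, String]).pipe(h) // Same as h(Right("abc")).
```

Only one of the two cases is used for each argument, which corresponds to selecting one row of the matrix.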
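We can write values of disjunctive types, such as 𝑎 + 0, as row vectors such as $\begin{array}{|cc|} a & 0 \end{array}$:

\[ h(a + 0) = (a + 0) \triangleleft h = \begin{array}{|cc|} a & 0 \end{array} \triangleleft h \quad . \tag{5.21} \]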

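With these notations, we compute further. Omit all terms applying 0 or applying
something to 0:

\begin{align*}
h(a + 0) &= \begin{array}{|cc|} a & 0 \end{array} \triangleleft h = \begin{array}{|cc|} a & 0 \end{array} \triangleleft \begin{array}{||c|} p \\ q \end{array} = a \triangleleft p = p(a) \quad , \\
h(0 + b) &= \begin{array}{|cc|} 0 & b \end{array} \triangleleft h = \begin{array}{|cc|} 0 & b \end{array} \triangleleft \begin{array}{||c|} p \\ q \end{array} = b \triangleleft q = q(b) \quad .
\end{align*}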
Now we can complete the proof of 𝑓1 # 𝑓2 = id:

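\begin{align*}
f_1 \,\#\, f_2 &= h \to \begin{array}{||c|} a \to h(a + 0) \\ b \to h(0 + b) \end{array} \\
\text{previous equations} : \quad &= \begin{array}{||c|} p \\ q \end{array} \to \begin{array}{||c|} a \to p(a) \\ b \to q(b) \end{array} \\
\text{simplify functions} : \quad &= \Big( \begin{array}{||c|} p \\ q \end{array} \to \begin{array}{||c|} p \\ q \end{array} \Big) = \text{id} \quad .
\end{align*}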

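To prove that 𝑓2 # 𝑓1 = id, use the notation (5.21):

\begin{align*}
f_2 \,\#\, f_1 &= \Big( f \times g \to \begin{array}{||c|} f \\ g \end{array} \Big) \,\#\, \big( h \to (a \to (a + 0) \triangleleft h) \times (b \to (0 + b) \triangleleft h) \big) \\
\text{composition} : \quad &= f \times g \to \Big( a \to \begin{array}{|cc|} a & 0 \end{array} \triangleleft \begin{array}{||c|} f \\ g \end{array} \Big) \times \Big( b \to \begin{array}{|cc|} 0 & b \end{array} \triangleleft \begin{array}{||c|} f \\ g \end{array} \Big) \\
\text{apply functions} : \quad &= f \times g \to (a \to a \triangleleft f) \times (b \to b \triangleleft g) \\
\text{definition of } \triangleleft : \quad &= f \times g \to (a \to f(a)) \times (b \to g(b)) \\
\text{simplify functions} : \quad &= (f \times g \to f \times g) = \text{id} \quad .
\end{align*}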
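In this way, we have proved that 𝑓1 and 𝑓2 are mutual inverses. The proofs
appear long because we took time to motivate and introduce new notation for
applying matrices to row vectors. Once this notation is understood, a proof of
𝑓1 # 𝑓2 = id can be written as:

\begin{align*}
f_1 \,\#\, f_2 &= \big( h \to (a \to (a + 0) \triangleleft h) \times (b \to (0 + b) \triangleleft h) \big) \,\#\, \Big( f \times g \to \begin{array}{||c|} f \\ g \end{array} \Big) \\
\text{composition} : \quad &= h \to \begin{array}{||c|} a \to \begin{array}{|cc|} a & 0 \end{array} \triangleleft h \\ b \to \begin{array}{|cc|} 0 & b \end{array} \triangleleft h \end{array} = \begin{array}{||c|} p \\ q \end{array} \to \begin{array}{||c|} a \to \begin{array}{|cc|} a & 0 \end{array} \triangleleft \begin{array}{||c|} p \\ q \end{array} \\ b \to \begin{array}{|cc|} 0 & b \end{array} \triangleleft \begin{array}{||c|} p \\ q \end{array} \end{array} \\
\text{apply functions} : \quad &= \begin{array}{||c|} p \\ q \end{array} \to \begin{array}{||c|} a \to a \triangleleft p \\ b \to b \triangleleft q \end{array} = \begin{array}{||c|} p \\ q \end{array} \to \begin{array}{||c|} p \\ q \end{array} = \text{id} \quad .
\end{align*}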

Proofs in the code notation are shorter than in Scala syntax because certain names
and keywords (such as Left, Right, case, match, etc.) are omitted. From now on, we
will prefer to use the code notation in proofs, keeping in mind that one can always
convert the code notation to Scala.
The function arrow (→) binds weaker than the pipe operation (⊲), so the formula
𝑥 → 𝑦 ⊲ 𝑧 means 𝑥 → (𝑦 ⊲ 𝑧). We will review the pipe notation more systematically
in Chapter 7.
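
The symbolic proof of Example 5.3.4.6 can be complemented by a quick numerical check in Scala. The sketch below is not a proof: functions cannot be compared for equality directly, so it only compares results on a few sample arguments. The names split, join, and the sample function h are illustrative.

```scala
// f1 of Example 5.3.4.6: split a function on Either into a pair of functions.
def split[A, B, C]: (Either[A, B] => C) => (A => C, B => C) =
  h => (a => h(Left(a)), b => h(Right(b)))

// f2 of Example 5.3.4.6: join a pair of functions into a function on Either.
def join[A, B, C]: ((A => C, B => C)) => Either[A, B] => C = {
  case (f, g) => {
    case Left(a)  => f(a)
    case Right(b) => g(b)
  }
}

val h: Either[Int, String] => Int = {
  case Left(a)  => a * 2
  case Right(b) => b.length
}

// Apply f1 and then f2; the result should behave like the original h.
val h2: Either[Int, String] => Int = join[Int, String, Int](split[Int, String, Int](h))

val samples: List[Either[Int, String]] = List(Left(3), Right("abcd"))
val sameOnSamples: Boolean = samples.forall(x => h2(x) == h(x))
```

The type parameters are written explicitly because Scala cannot infer them for methods that have no argument lists.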
Example 5.3.4.7 Verify the type equivalence:

𝐴 × 𝐵 → 𝐶 ≅ 𝐴 → 𝐵 → 𝐶 .

Solution Begin by implementing the two functions:


def f1[A, B, C]: (((A, B)) => C) => A => B => C = ???
def f2[A, B, C]: (A => B => C) => ((A, B)) => C = ???

The Scala code can be derived from the type signatures unambiguously:
def f1[A,B,C]: (((A, B)) => C) => A => B => C = g => a => b => g((a, b))
def f2[A,B,C]: (A => B => C) => ((A, B)) => C = h => { case (a, b) => h(a)(b) }

Write these functions in the code notation:

\[ f_1 = g^{:A \times B \to C} \to a^{:A} \to b^{:B} \to g(a \times b) \quad , \]
\[ f_2 = h^{:A \to B \to C} \to (a \times b)^{:A \times B} \to h(a)(b) \quad . \]

We denote by (𝑎 × 𝑏) :𝐴×𝐵 an argument of type (A, B) with pattern matching implied.
This notation allows us to write shorter code formulas where tuples are
destructured.
Compute the function composition 𝑓1 # 𝑓2 and show that it is equal to an identity
function:

expect to equal id : 𝑓1 # 𝑓2 = (𝑔 → 𝑎 → 𝑏 → 𝑔(𝑎 × 𝑏)) # (ℎ → 𝑎 × 𝑏 → ℎ(𝑎)(𝑏))


composition : = 𝑔 → 𝑎 × 𝑏 → 𝑔(𝑎 × 𝑏)
simplify function : = (𝑔 → 𝑔) = id .

Compute the function composition 𝑓2 # 𝑓1 and show that it is equal to an identity
function:

expect to equal id : 𝑓2 # 𝑓1 = (ℎ → 𝑎 × 𝑏 → ℎ(𝑎)(𝑏)) # (𝑔 → 𝑎 → 𝑏 → 𝑔(𝑎 × 𝑏))


composition : = ℎ → 𝑎 → 𝑏 → ℎ(𝑎)(𝑏)
simplify 𝑏 → ℎ(𝑎)(𝑏) : = ℎ → 𝑎 → ℎ(𝑎)
simplify 𝑎 → ℎ(𝑎) to ℎ : = (ℎ → ℎ) = id .
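
As an informal illustration of Example 5.3.4.7, the following Scala sketch applies both conversions to a sample function and checks that the round trip gives the same results. The names curry, uncurry, and the sample function g are illustrative.

```scala
// f1 of Example 5.3.4.7: convert a function on pairs into a curried function.
def curry[A, B, C]: (((A, B)) => C) => A => B => C = g => a => b => g((a, b))

// f2 of Example 5.3.4.7: convert a curried function into a function on pairs.
def uncurry[A, B, C]: (A => B => C) => ((A, B)) => C = h => { case (a, b) => h(a)(b) }

// A sample function: repeat the string s a total of n times.
val g: ((Int, String)) => String = { case (n, s) => s * n }

val curried: Int => String => String = curry[Int, String, String](g)
val back: ((Int, String)) => String = uncurry[Int, String, String](curried)
```

Here curried(2)("ab") and back((2, "ab")) both evaluate to the same string as g((2, "ab")).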

5.4 Summary
What tasks can we perform now?
• Convert a fully parametric type signature into a logical formula and:
– Decide whether the type signature can be implemented in code.
– If possible, derive the code using the CH correspondence.
• Use the type notation (Table 5.1) for reasoning about types to:
– Decide type equivalence using the rules in Tables 5.5–5.6.
– Simplify type expressions before writing code.
• Use the matrix notation and the pipe notation to write code that works on
disjunctive types.
What tasks cannot be performed with these tools?
• Automatically generate code for recursive functions. The CH correspondence
is based on propositional logic, which cannot describe recursion. Accord-
ingly, recursion is absent from the eight code constructions of Section 5.2.2.
Recursive functions need to be coded by hand.
• Automatically generate code satisfying a property (e.g., isomorphism). We
may generate some code, but the CH correspondence does not guarantee
that properties will hold. We need to verify the required properties manu-
ally, after deriving the code.
• Express complicated conditions (e.g., “array is sorted”) in a type signature.
This can be done using dependent types (i.e., types that directly depend on
values in some way). This is an advanced technique beyond the scope of this
book. Programming languages such as Coq, Agda, and Idris fully support
dependent types, while Scala has only limited support.
• Generate code using type constructors with known methods (e.g., the map
method).
As an example of using type constructors with known methods, consider this type
signature:
def q[A, B]: Array[A] => (A => Option[B]) => Array[Option[B]]
Can we generate the code of this function from its type signature? We know that
the Scala library defines a map method on the Array class. So, an implementation of
q is:
def q[A, B]: Array[A] => (A => Option[B]) => Array[Option[B]] = { arr => f =>
  arr.map(f) }

However, it is hard to create an algorithm that can derive this implementation au-
tomatically from the type signature of q via the Curry-Howard correspondence.
The algorithm would have to convert the type signature of q into this logical formula:

\[ \mathcal{CH}(\text{Array}^{A}) \Rightarrow \mathcal{CH}(A \to \text{Opt}^{B}) \Rightarrow \mathcal{CH}(\text{Array}^{\text{Opt}^{B}}) \quad . \tag{5.22} \]
To derive an implementation, the algorithm would need to use the available map
method for Array. That method has the type signature:

\[ \text{map} : \forall(A, B).\ \text{Array}^{A} \to (A \to B) \to \text{Array}^{B} \quad . \]

To derive the CH -proposition (5.22), the algorithm will need to assume that the
CH -proposition:

\[ \mathcal{CH}\big( \forall(A, B).\ \text{Array}^{A} \to (A \to B) \to \text{Array}^{B} \big) \tag{5.23} \]

already holds. In other words, Eq. (5.23) must be one of the premises of a sequent.
Reasoning about premises such as Eq. (5.23) requires first-order logic — a logic
whose proof rules can handle quantified types such as ∀( 𝐴, 𝐵) inside premises.
However, first-order logic is undecidable: no algorithm can find a proof (or verify
the absence of a proof) in all cases.
The constructive propositional logic with the rules listed in Table 5.2 is decid-
able, i.e., it has an algorithm that either finds a proof or disproves any given for-
mula. However, that logic cannot handle type constructors such as Array. It also
cannot handle premises containing type quantifiers such as ∀( 𝐴, 𝐵), because all
the available logic rules have the quantifiers placed outside the premises.
So, code for functions such as q can only be derived by trial and error, informed
by intuition. This book will help programmers to acquire the necessary intuition
and technique.

5.4.1 Examples
Example 5.4.1.1 Find the cardinality of the type P = Option[Option[Boolean] =>
Boolean]. Write P in the type notation.
Solution Begin with the type Option[Boolean], which can be either None or Some(x)
with some value x: Boolean. Because the type Boolean has 2 possible values, the
type Option[Boolean] has 3 possible values:

\[ |\text{Opt}^{\text{Boolean}}| = |1 + \text{Boolean}| = 1 + |\text{Boolean}| = 1 + 2 = 3 \quad . \]

In the type notation, Boolean is denoted by the symbol 2 and Option[Boolean] by
1 + 2. So, the type notation 1 + 2 is consistent with the cardinality 3 of that type:

|1 + Boolean| = |1 + 2| = 1 + 2 = 3 .
The function type Option[Boolean] => Boolean is denoted by 1 + 2 → 2. Compute
its cardinality as:

\[ |\text{Opt}^{\text{Boolean}} \to \text{Boolean}| = |1 + 2 \to 2| = |2|^{|1+2|} = 2^{3} = 8 \quad . \]

Finally, we write P in the type notation as 𝑃 = 1 + (1 + 2 → 2) and find:

|𝑃| = |1 + (1 + 2 → 2)| = 1 + |1 + 2 → 2| = 1 + 8 = 9 .
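
The count can be cross-checked by enumerating the values in Scala. This is only a sketch: it represents each function of type Option[Boolean] => Boolean by its table of three outputs, and all names are illustrative.

```scala
val bools: List[Boolean] = List(false, true)

// All 3 values of type Option[Boolean].
val optBools: List[Option[Boolean]] = None :: bools.map(b => Option(b))

// A function Option[Boolean] => Boolean is fully determined by its 3 outputs.
val functions: List[Option[Boolean] => Boolean] =
  for {
    r0 <- bools
    r1 <- bools
    r2 <- bools
  } yield {
    val table: Map[Option[Boolean], Boolean] = optBools.zip(List(r0, r1, r2)).toMap
    (x: Option[Boolean]) => table(x)
  }

// Values of type P: either None, or Some(f) for one of the functions f.
val valuesOfP: List[Option[Option[Boolean] => Boolean]] =
  None :: functions.map(f => Option(f))

// Check that the enumerated functions are pairwise different.
val distinctTables: Int = functions.map(f => optBools.map(f)).distinct.size
```

The sizes of optBools, functions, and valuesOfP correspond to the cardinalities 3, 8, and 9 computed above.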

Example 5.4.1.2 Implement a Scala type P[A] given by this type notation:

\[ P^{A} \triangleq 1 + A + \text{Int} \times A + (\text{String} \to A) \quad . \]

Solution To translate type notation into Scala code, begin by defining the dis-
junctive types as case classes, choosing class names for convenience. In this case,
𝑃 𝐴 is a disjunctive type with four parts, so we need four case classes:
sealed trait P[A]
final case class P1[A](???) extends P[A]
final case class P2[A](???) extends P[A]
final case class P3[A](???) extends P[A]
final case class P4[A](???) extends P[A]

Each of the case classes represents one part of the disjunctive type. Now we write
the contents for each of the case classes, in order to implement the data in each of
the disjunctive parts:
sealed trait P[A]
final case class P1[A]() extends P[A]
final case class P2[A](x: A) extends P[A]
final case class P3[A](n: Int, x: A) extends P[A]
final case class P4[A](f: String => A) extends P[A]

Example 5.4.1.3 Find an equivalent disjunctive type for the type P = (Either[A,
B], Either[C, D]).
Solution Begin by writing the given type in the type notation. The tuple be-
comes a product type, and Either becomes a disjunctive type:

\[ P \triangleq (A + B) \times (C + D) \quad . \]

By the usual rules of arithmetic, we expand brackets and obtain an equivalent
type:

\[ P \cong A \times C + A \times D + B \times C + B \times D \quad . \]

This is a disjunctive type with 4 parts.
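
A possible Scala encoding of this equivalence is sketched below. The four-part disjunctive type is represented here by nested Either values, which is one choice among several (a sealed trait with four case classes would work equally well). The names Prod, Sum, expand, and collapse are illustrative.

```scala
type Prod[A, B, C, D] = (Either[A, B], Either[C, D])
type Sum[A, B, C, D]  = Either[(A, C), Either[(A, D), Either[(B, C), (B, D)]]]

// Expand the product of two disjunctions into a disjunction with 4 parts.
def expand[A, B, C, D]: Prod[A, B, C, D] => Sum[A, B, C, D] = {
  case (Left(a), Left(c))   => Left((a, c))
  case (Left(a), Right(d))  => Right(Left((a, d)))
  case (Right(b), Left(c))  => Right(Right(Left((b, c))))
  case (Right(b), Right(d)) => Right(Right(Right((b, d))))
}

// The inverse conversion.
def collapse[A, B, C, D]: Sum[A, B, C, D] => Prod[A, B, C, D] = {
  case Left((a, c))                => (Left(a), Left(c))
  case Right(Left((a, d)))         => (Left(a), Right(d))
  case Right(Right(Left((b, c))))  => (Right(b), Left(c))
  case Right(Right(Right((b, d)))) => (Right(b), Right(d))
}

val p: Prod[Int, String, Boolean, Double] = (Right("x"), Left(true))
val s: Sum[Int, String, Boolean, Double]  = expand[Int, String, Boolean, Double](p)
val p2: Prod[Int, String, Boolean, Double] = collapse[Int, String, Boolean, Double](s)
```

Each of the four cases carries exactly the same data on both sides, so no information is lost in either direction.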
Example 5.4.1.4 Show that the following type equivalences do not hold: 𝐴 + 𝐴 ≅
𝐴 and 𝐴 × 𝐴 ≅ 𝐴, although the corresponding logical identities hold.
Solution The arithmetic equalities do not hold, 𝐴 + 𝐴 ≠ 𝐴 and 𝐴 × 𝐴 ≠ 𝐴.
This already indicates that the types are not equivalent. To build further intuition,
consider that a value of type 𝐴 + 𝐴 (in Scala, Either[A, A]) is a Left(a) or a Right(a)
for some a:A. In the code notation, it is either 𝑎 :𝐴 + 0 or 0 + 𝑎 :𝐴 . So, a value of type
𝐴 + 𝐴 contains a value of type 𝐴 with the additional information about whether
it is the first or the second part of the disjunctive type. We cannot represent that
information in a single value of type 𝐴.
Similarly, a value of type 𝐴 × 𝐴 contains two (possibly different) values of type
𝐴, which cannot be represented by a single value of type 𝐴 without loss of infor-
mation.
However, the corresponding logical identities 𝛼 ∨ 𝛼 = 𝛼 and 𝛼 ∧ 𝛼 = 𝛼 hold. To
see that, we could derive the four formulas:
𝛼∨𝛼 ⇒ 𝛼 , 𝛼 ⇒ 𝛼∨𝛼 , 𝛼∧𝛼 ⇒ 𝛼 , 𝛼 ⇒ 𝛼∧𝛼 ,
using the proof rules of Section 5.2.3. Alternatively, we may use the CH corre-
spondence and show that the type signatures:
∀𝐴. 𝐴 + 𝐴 → 𝐴 , ∀𝐴. 𝐴 → 𝐴 + 𝐴 , ∀𝐴. 𝐴 × 𝐴 → 𝐴 , ∀𝐴. 𝐴 → 𝐴 × 𝐴
can be implemented via fully parametric functions. For a programmer, it is easier
to write code than to guess the correct sequence of proof rules. For the first pair
of type signatures, we find:
def f1[A]: Either[A, A] => A = {
case Left(a) => a // No other choice here.
case Right(a) => a // No other choice here.
}
def f2[A]: A => Either[A, A] = { a => Left(a) } // Can be also Right(a).
The presence of an arbitrary choice, to return Left(a) or Right(a), is a warning sign
showing that additional information is required to create a value of type Either[A,
A]. This is precisely the information present in the type 𝐴 + 𝐴 but missing in the
type 𝐴.
The code notation for these functions is:
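\[ f_1 \triangleq \begin{array}{|c||c|} & A \\ \hline A & a^{:A} \to a \\ A & a^{:A} \to a \end{array} = \begin{array}{|c||c|} & A \\ \hline A & \text{id} \\ A & \text{id} \end{array} \quad , \]
\[ f_2 \triangleq a^{:A} \to a + 0^{:A} = \begin{array}{|c||cc|} & A & A \\ \hline A & a^{:A} \to a & 0 \end{array} = \begin{array}{|c||cc|} & A & A \\ \hline A & \text{id} & 0 \end{array} \quad . \]

The composition of these functions is not equal to identity:

\[ f_1 \,\#\, f_2 = \begin{array}{||c|} \text{id} \\ \text{id} \end{array} \,\#\, \begin{array}{||cc|} \text{id} & 0 \end{array} = \begin{array}{||cc|} \text{id} & 0 \\ \text{id} & 0 \end{array} \quad , \quad \text{while we have} \quad \text{id}^{:A+A \to A+A} = \begin{array}{||cc|} \text{id} & 0 \\ 0 & \text{id} \end{array} \quad . \]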
For the second pair of type signatures, the code is:


def f1[A]: ((A, A)) => A = { case (a1, a2) => a1 } // Could be also `a2`.
def f2[A]: A => (A, A) = { a => (a, a) } // No other choice here.
It is clear that the first function loses information when it returns a1 and discards
a2 (or vice versa).
The code notation for the functions f1 and f2 is:

\[ f_1 \triangleq a_1^{:A} \times a_2^{:A} \to a_1 = \pi_1^{:A \times A \to A} \quad , \qquad f_2 \triangleq a^{:A} \to a \times a = \Delta^{:A \to A \times A} \quad . \]

Computing the compositions of these functions, we find that 𝑓2 # 𝑓1 = id while
𝑓1 # 𝑓2 ≠ id:

𝑓1 # 𝑓2 = (𝑎 1 × 𝑎 2 → 𝑎 1 ) # (𝑎 → 𝑎 × 𝑎)
= (𝑎 1 × 𝑎 2 → 𝑎 1 × 𝑎 1 ) ≠ id = (𝑎 1 × 𝑎 2 → 𝑎 1 × 𝑎 2 ) .

We have implemented all four type signatures as fully parametric functions,
which shows that the corresponding logical formulas are all true (i.e., can be derived
using the proof rules). However, the functions 𝑓1 and 𝑓2 are not inverses of
each other. So, the type equivalences do not hold.
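
The loss of information can be observed directly on sample values. In the following Scala sketch, the names merge, inject, first, and dup are illustrative stand-ins for the two pairs of functions 𝑓1, 𝑓2 of this example:

```scala
// First pair: A + A => A and A => A + A.
def merge[A]: Either[A, A] => A = {
  case Left(a)  => a
  case Right(a) => a
}
def inject[A]: A => Either[A, A] = a => Left(a)

val original: Either[Int, Int] = Right(1)
// Apply merge and then inject: the information "it was a Right" is lost.
val sumRoundTrip: Either[Int, Int] = inject[Int](merge[Int](original))

// Second pair: A × A => A and A => A × A.
def first[A]: ((A, A)) => A = { case (a1, _) => a1 }
def dup[A]: A => (A, A) = a => (a, a)

// Apply first and then dup: the second part of the pair is lost.
val pairRoundTrip: (Int, Int) = dup[Int](first[Int]((1, 2)))
```

In both cases, the round trip returns a value different from the original one, so these functions cannot be inverses of each other.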
Example 5.4.1.5 Show that (( 𝐴 ∧ 𝐵) ⇒ 𝐶) ≠ ( 𝐴 ⇒ 𝐶) ∨ (𝐵 ⇒ 𝐶) in the construc-
tive logic, but the equality holds in Boolean logic. This is another example where
the Boolean reasoning fails to give correct answers about implementability of type
signatures.
Solution Begin by rewriting the logical equality as two implications:

( 𝐴 ∧ 𝐵 ⇒ 𝐶) ⇒ ( 𝐴 ⇒ 𝐶) ∨ (𝐵 ⇒ 𝐶)
and (( 𝐴 ⇒ 𝐶) ∨ (𝐵 ⇒ 𝐶)) ⇒ (( 𝐴 ∧ 𝐵) ⇒ 𝐶) .

It is sufficient to show that one of these implications is incorrect. Rather than
looking for a proof tree in the constructive logic (which would be difficult, since
we need to demonstrate that no proof exists), let us use the CH correspondence.
According to the CH correspondence, an equivalent task is to implement fully
parametric functions with the type signatures:

( 𝐴 × 𝐵 → 𝐶) → ( 𝐴 → 𝐶) + (𝐵 → 𝐶) and ( 𝐴 → 𝐶) + (𝐵 → 𝐶) → 𝐴 × 𝐵 → 𝐶 .

For the first type signature, the Scala code is:


def f1[A, B, C]: (((A, B)) => C) => Either[A => C, B => C] = { k => ??? }
We are required to return either a Left(g) with g: A => C, or a Right(h) with h: B =>
C. The only given data is a function k of type 𝐴 × 𝐵 → 𝐶, so the decision of whether
to return a Left or a Right must be hard-coded in the function f1 independently of
k. Can we produce a function g of type A => C? Given a value of type A, we would
need to return a value of type C. The only way to obtain a value of type C is by
applying k to some arguments. But to apply k, we need a value of type B, which


we do not have. So, we cannot produce a g: A => C. Similarly, we cannot produce
a function h of type B => C.
We repeat the same argument in the type notation. Obtaining a value of type
( 𝐴 → 𝐶) + (𝐵 → 𝐶) means to compute either 𝑔 :𝐴→𝐶 + 0 or 0 + ℎ:𝐵→𝐶 . This decision
must be hard-coded since the only data is a function 𝑘 :𝐴×𝐵→𝐶 . We can compute
𝑔 :𝐴→𝐶 only by partially applying 𝑘 :𝐴×𝐵→𝐶 to a value of type 𝐵. However, we have
no values of type 𝐵. Similarly, we cannot compute a value ℎ:𝐵→𝐶 .
The inverse type signature can be implemented:
def f2[A, B, C]: Either[A => C, B => C] => ((A, B)) => C = {
case Left(g) => { case (a, b) => g(a) }
case Right(h) => { case (a, b) => h(b) }
}

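\[ f_2 \triangleq \begin{array}{|c||c|} & A \times B \to C \\ \hline A \to C & g^{:A \to C} \to a \times b \to g(a) \\ B \to C & h^{:B \to C} \to a \times b \to h(b) \end{array} \quad . \]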
Let us now show that the logical identity:

((𝛼 ∧ 𝛽) ⇒ 𝛾) = ((𝛼 ⇒ 𝛾) ∨ (𝛽 ⇒ 𝛾)) (5.24)

holds in Boolean logic. A straightforward calculation is to simplify the Boolean
expression using Eq. (5.8), which only holds in Boolean logic (but not in the constructive
logic). We find:

left-hand side of Eq. (5.24) : (𝛼 ∧ 𝛽) ⇒ 𝛾


use Eq. (5.8) : = ¬(𝛼 ∧ 𝛽) ∨ 𝛾
use de Morgan’s law : = ¬𝛼 ∨ ¬𝛽 ∨ 𝛾 .
right-hand side of Eq. (5.24) : (𝛼 ⇒ 𝛾) ∨ (𝛽 ⇒ 𝛾)
use Eq. (5.8) : = ¬𝛼 ∨ 𝛾 ∨ ¬𝛽 ∨ 𝛾
use identity 𝛾 ∨ 𝛾 = 𝛾 : = ¬𝛼 ∨ ¬𝛽 ∨ 𝛾 .

Both sides of Eq. (5.24) are equal to the same formula, ¬𝛼 ∨ ¬𝛽 ∨ 𝛾, so the identity
holds.
This calculation does not work in the constructive logic because its proof rules
can derive neither the Boolean formula (5.8) nor the law of de Morgan, ¬(𝛼 ∧ 𝛽) =
(¬𝛼 ∨ ¬𝛽).
Another way of proving the Boolean identity (5.24) is to enumerate all possible
truth values for the variables 𝛼, 𝛽, and 𝛾. The left-hand side, (𝛼 ∧ 𝛽) ⇒ 𝛾, can be
𝐹𝑎𝑙𝑠𝑒 only if 𝛼 ∧ 𝛽 = 𝑇𝑟𝑢𝑒 (that is, both 𝛼 and 𝛽 are 𝑇𝑟𝑢𝑒) and 𝛾 = 𝐹𝑎𝑙𝑠𝑒. For all
other truth values of 𝛼, 𝛽, and 𝛾, the formula (𝛼 ∧ 𝛽) ⇒ 𝛾 is 𝑇𝑟𝑢𝑒. Let us determine
when the right-hand side, (𝛼 ⇒ 𝛾) ∨ (𝛽 ⇒ 𝛾), can be 𝐹𝑎𝑙𝑠𝑒. This can happen only
if both parts of the disjunction are 𝐹𝑎𝑙𝑠𝑒. That means 𝛼 = 𝑇𝑟𝑢𝑒, 𝛽 = 𝑇𝑟𝑢𝑒, and
𝛾 = 𝐹𝑎𝑙𝑠𝑒. So, the two sides of the identity (5.24) are both 𝑇𝑟𝑢𝑒 or both 𝐹𝑎𝑙𝑠𝑒
with any choice of truth values of 𝛼, 𝛽, and 𝛾. In Boolean logic, this is sufficient to
prove the identity (5.24).
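
The enumeration of truth values can also be carried out mechanically. The following Scala sketch checks Eq. (5.24) for all 8 combinations of truth values; the helper function implies is an assumption of this sketch and encodes the Boolean formula 𝑝 ⇒ 𝑞 = ¬𝑝 ∨ 𝑞.

```scala
// Boolean implication: p ⇒ q is the same as (not p) or q.
def implies(p: Boolean, q: Boolean): Boolean = !p || q

val bools: List[Boolean] = List(false, true)

// Compare both sides of Eq. (5.24) for every choice of truth values.
val checks: List[Boolean] =
  for {
    a <- bools
    b <- bools
    c <- bools
  } yield implies(a && b, c) == (implies(a, c) || implies(b, c))

val identityHolds: Boolean = checks.forall(identity)
```

Such a check is valid only for Boolean logic. It says nothing about whether the formula can be derived in the constructive logic.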
The following example shows how to use the formulas from Tables 5.5–5.6 to
derive the type equivalence of complicated type expressions without need for
proofs.
Example 5.4.1.6 Use known formulas to verify the type equivalences without
direct proofs:
(a) 𝐴 × ( 𝐴 + 1) × ( 𝐴 + 1 + 1) ≅ 𝐴 × (1 + 1 + 𝐴 × (1 + 1 + 1 + 𝐴)).
(b) 1 + 𝐴 + 𝐵 → 1 × 𝐵 ≅ (𝐵 → 𝐵) × ( 𝐴 → 𝐵) × 𝐵.
Solution (a) We can expand brackets in type expressions as in arithmetic:

𝐴 × ( 𝐴 + 1) ≅ 𝐴 × 𝐴 + 𝐴 × 1 ≅ 𝐴 × 𝐴 + 𝐴 ,
𝐴 × ( 𝐴 + 1) × ( 𝐴 + 1 + 1) ≅ ( 𝐴 × 𝐴 + 𝐴) × ( 𝐴 + 1 + 1)
≅ 𝐴 × 𝐴 × 𝐴 + 𝐴 × 𝐴 + 𝐴 × 𝐴 × (1 + 1) + 𝐴 × (1 + 1)
≅ 𝐴 × 𝐴 × 𝐴 + 𝐴 × 𝐴 × (1 + 1 + 1) + 𝐴 × (1 + 1) .

The result looks like a polynomial in 𝐴, which we can now rearrange into the
required form:

𝐴 × 𝐴 × 𝐴 + 𝐴 × 𝐴 × (1 + 1 + 1) + 𝐴 × (1 + 1) ≅ 𝐴 × (1 + 1 + 𝐴 × (1 + 1 + 1 + 𝐴)) .

(b) Keep in mind that the conventions of the type notation make the function
arrow (→) group weaker than other type operations. So, the type expression 1 +
𝐴 + 𝐵 → 1 × 𝐵 means a function from 1 + 𝐴 + 𝐵 to 1 × 𝐵.
Begin by using the equivalence 1 × 𝐵 ≅ 𝐵 to obtain 1 + 𝐴 + 𝐵 → 𝐵. Now we use
another rule:
𝐴 + 𝐵 → 𝐶 ≅ ( 𝐴 → 𝐶) × (𝐵 → 𝐶)
and derive the equivalence:

1 + 𝐴 + 𝐵 → 𝐵 ≅ (1 → 𝐵) × ( 𝐴 → 𝐵) × (𝐵 → 𝐵) .

Finally, we note that 1 → 𝐵 ≅ 𝐵 and that the type product is commutative, so we
can rearrange the last type expression into the required form:

𝐵 × ( 𝐴 → 𝐵) × (𝐵 → 𝐵) ≅ (𝐵 → 𝐵) × ( 𝐴 → 𝐵) × 𝐵 .

We obtain the required type expression: (𝐵 → 𝐵) × ( 𝐴 → 𝐵) × 𝐵.


Example 5.4.1.7 Denote $\text{Reader}^{E,A} \triangleq E \to A$ and implement fully parametric
functions with types $A \to \text{Reader}^{E,A}$ and $\text{Reader}^{E,A} \to (A \to B) \to \text{Reader}^{E,B}$.
Solution Begin by defining a type alias for the type $\text{Reader}^{E,A}$:
type Reader[E, A] = E => A
The first type signature has only one implementation:
def p[E, A]: A => Reader[E, A] = { x => _ => x }
We must discard the argument of type 𝐸; we cannot use it for computing a value
of type A.
The second type signature has three type parameters. It is the curried version
of the function map:
def map[E, A, B]: Reader[E, A] => (A => B) => Reader[E, B] = ???
Expanding the type alias, we see that the two curried arguments are functions of
types 𝐸 → 𝐴 and 𝐴 → 𝐵. The forward composition of these functions is a function
of type 𝐸 → 𝐵, or Reader𝐸,𝐵 , which is exactly what we are required to return. So,
the code can be written as:
def map[E, A, B]: (E => A) => (A => B) => E => B = { r => f => r andThen f }
If we did not notice this shortcut, we would reason differently: We are required
to compute a value of type 𝐵 given three curried arguments 𝑟 :𝐸→𝐴 , 𝑓 :𝐴→𝐵 , and 𝑒 :𝐸 .
Write this requirement as:
\[ \text{map} \triangleq r^{:E \to A} \to f^{:A \to B} \to e^{:E} \to \text{???}^{:B} \quad . \]
The symbol ???:𝐵 denotes a typed hole. It stands for a value that we are still
figuring out how to compute, but whose type is already known. Typed holes are
supported in Scala by an experimental compiler plugin.12 The plugin will print
the known information about the typed hole.
To fill the typed hole ???:𝐵 , we need a value of type 𝐵. Since no arguments have
type 𝐵, the only way of getting a value of type 𝐵 is to apply 𝑓 :𝐴→𝐵 to some value
of type 𝐴. So, we write:
\[ \text{map} \triangleq r^{:E \to A} \to f^{:A \to B} \to e^{:E} \to f(\text{???}^{:A}) \quad . \]
The only way of getting an 𝐴 is to apply 𝑟 to some value of type 𝐸:
\[ \text{map} \triangleq r^{:E \to A} \to f^{:A \to B} \to e^{:E} \to f(r(\text{???}^{:E})) \quad . \]
We have exactly one value of type 𝐸, namely 𝑒 :𝐸 . So, the code must be:
\[ \text{map}^{E,A,B} \triangleq r^{:E \to A} \to f^{:A \to B} \to e^{:E} \to f(r(e)) \quad . \]
Translate this to the Scala syntax:
def map[E, A, B]: (E => A) => (A => B) => E => B = { r => f => e => f(r(e)) }
We may now notice that the expression 𝑒 → 𝑓 (𝑟 (𝑒)) is a function composition 𝑟 # 𝑓
applied to 𝑒, and simplify the code accordingly.
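
A small usage sketch for these functions follows. The case class Config and the values getRetries and describe are hypothetical and serve only to show how a Reader value is transformed by map:

```scala
type Reader[E, A] = E => A

def pure[E, A]: A => Reader[E, A] = x => _ => x
def map[E, A, B]: Reader[E, A] => (A => B) => Reader[E, B] = r => f => r andThen f

// A hypothetical environment type, used only for this illustration.
final case class Config(name: String, retries: Int)

// A Reader that extracts one field from the environment.
val getRetries: Reader[Config, Int] = _.retries

// Transform the result of the Reader without touching the environment.
val describe: Reader[Config, String] =
  map[Config, Int, String](getRetries)(n => s"retries = $n")

val result: String = describe(Config("demo", 3))
val constant: Int = pure[Config, Int](7)(Config("other", 0)) // Ignores the environment.
```

The value of type E is supplied only at the very end, when the final function is applied.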
12 See https://github.com/cb372/scala-typed-holes


Example 5.4.1.8 Show that one cannot implement the type signature Reader[A,
T] => (A => B) => Reader[B, T] by a fully parametric function.
Solution Expand the type signature and try implementing this function:
def m[A, B, T] : (A => T) => (A => B) => B => T = { r => f => b => ??? }
Given values 𝑟 :𝐴→𝑇 , 𝑓 :𝐴→𝐵 , and 𝑏 :𝐵 , we need to compute a value of type 𝑇:
\[ m = r^{:A \to T} \to f^{:A \to B} \to b^{:B} \to \text{???}^{:T} \quad . \]
The only way of getting a value of type 𝑇 is to apply 𝑟 to some value of type 𝐴:
\[ m = r^{:A \to T} \to f^{:A \to B} \to b^{:B} \to r(\text{???}^{:A}) \quad . \]
However, we do not have any values of type 𝐴. We have a function 𝑓 :𝐴→𝐵 that
consumes values of type 𝐴, and we cannot use 𝑓 to produce any values of type
𝐴. So, it seems that we are unable to fill the typed hole ???:𝐴 and implement the
function m.
In order to verify that m is unimplementable, we need to prove that the logical
formula:
∀(𝛼, 𝛽, 𝜏). (𝛼 ⇒ 𝜏) ⇒ (𝛼 ⇒ 𝛽) ⇒ (𝛽 ⇒ 𝜏) (5.25)
is not true in the constructive logic. We could use the curryhoward library for that:
@ def m[A, B, T] : (A => T) => (A => B) => B => T = implement
cmd1.sc:1: type (A => T) => (A => B) => B => T cannot be implemented
def m[A, B, T] : (A => T) => (A => B) => B => T = implement
^
Compilation Failed
Another way is to check whether this formula is true in Boolean logic. A formula
that holds in constructive logic will always hold in Boolean logic, because all rules
shown in Section 5.2.3 preserve Boolean truth values (see Section 5.5.4 for a proof).
It follows that any formula that fails to hold in Boolean logic will also not hold in
constructive logic.
It is relatively easy to check whether a given Boolean formula is always equal
to 𝑇𝑟𝑢𝑒. Simplifying Eq. (5.25) with the rules of Boolean logic, we find:
(𝛼 ⇒ 𝜏) ⇒ (𝛼 ⇒ 𝛽) ⇒ (𝛽 ⇒ 𝜏)
use Eq. (5.8) : = ¬(𝛼 ⇒ 𝜏) ∨ ¬(𝛼 ⇒ 𝛽) ∨ (𝛽 ⇒ 𝜏)
use Eq. (5.8) : = ¬(¬𝛼 ∨ 𝜏) ∨ ¬(¬𝛼 ∨ 𝛽) ∨ (¬𝛽 ∨ 𝜏)
use de Morgan’s law : = (𝛼 ∧ ¬𝜏) ∨ (𝛼 ∧ ¬𝛽) ∨ ¬𝛽 ∨ 𝜏
use identity ( 𝑝 ∧ 𝑞) ∨ 𝑞 = 𝑞 : = (𝛼 ∧ ¬𝜏) ∨ ¬𝛽 ∨ 𝜏
use identity ( 𝑝 ∧ ¬𝑞) ∨ 𝑞 = 𝑝 ∨ 𝑞 : = 𝛼 ∨ ¬𝛽 ∨ 𝜏 .
This formula is not identically 𝑇𝑟𝑢𝑒: it is 𝐹𝑎𝑙𝑠𝑒 when 𝛼 = 𝜏 = 𝐹𝑎𝑙𝑠𝑒 and 𝛽 = 𝑇𝑟𝑢𝑒.
So, Eq. (5.25) is not true in Boolean logic, therefore it is also not true in constructive
logic. By the CH correspondence, we conclude that the type signature of m cannot
be implemented by a fully parametric function.
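
The Boolean check can be automated by enumerating all truth values. The following Scala sketch searches for assignments that make Eq. (5.25) false; the helper function implies is an assumption of this sketch and encodes the Boolean formula 𝑝 ⇒ 𝑞 = ¬𝑝 ∨ 𝑞.

```scala
// Boolean implication: p ⇒ q is the same as (not p) or q.
def implies(p: Boolean, q: Boolean): Boolean = !p || q

val bools: List[Boolean] = List(false, true)

// Collect all (α, β, τ) for which (α ⇒ τ) ⇒ ((α ⇒ β) ⇒ (β ⇒ τ)) is False.
val counterexamples: List[(Boolean, Boolean, Boolean)] =
  for {
    a <- bools
    b <- bools
    t <- bools
    if !implies(implies(a, t), implies(implies(a, b), implies(b, t)))
  } yield (a, b, t)
```

The only counterexample found is 𝛼 = 𝐹𝑎𝑙𝑠𝑒, 𝛽 = 𝑇𝑟𝑢𝑒, 𝜏 = 𝐹𝑎𝑙𝑠𝑒, in agreement with the simplified formula 𝛼 ∨ ¬𝛽 ∨ 𝜏.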
Example 5.4.1.9 Define the type constructor $P^{A} \triangleq 1 + A + A$ and implement map
for it, with the type signature $\text{map}^{A,B} : P^{A} \to (A \to B) \to P^{B}$. To check that map
preserves information, verify the law map(p)(x => x) == p for all p: P[A].
Solution It is implied that map should be fully parametric and information-preserving.
Begin by defining a Scala type constructor for the notation $P^{A} \triangleq 1 + A + A$:
sealed trait P[A]
final case class P1[A]() extends P[A]
final case class P2[A](x: A) extends P[A]
final case class P3[A](x: A) extends P[A]
Now we can write code to implement the required type signature. Each time we
have several choices of an implementation, we will choose to preserve informa-
tion as much as possible.
def map[A, B]: P[A] => (A => B) => P[B] =
p => f => p match {
case P1() => P1() // No other choice.
case P2(x) => ???
case P3(x) => ???
}
In the case P2(x), we are required to produce a value of type $P^{B}$ from a value
𝑥 :𝐴 and a function 𝑓 :𝐴→𝐵 . Since $P^{B}$ is a disjunctive type with three parts, we can
produce a value of type $P^{B}$ in three different ways: P1(), P2(...), and P3(...). If we
return P1(), we will lose the information about the value x. If we return P3(...),
we will preserve the information about x but lose the information that the input
value was a P2 rather than a P3. By returning P2(...) in that scope, we preserve the
entire input information.
The value under P2(...) must be of type 𝐵, and the only way of getting a value
of type 𝐵 is to apply 𝑓 to 𝑥. So, we return P2(f(x)).
Similarly, in the case P3(x), we should return P3(f(x)). The final code of map is:
def map[A, B]: P[A] => (A => B) => P[B] = p => f => p match {
case P1() => P1() // No other choice here.
case P2(x) => P2(f(x)) // Preserve information.
case P3(x) => P3(f(x)) // Preserve information.
}
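To verify the given law, we first write a matrix notation for map:

\[ \text{map}^{A,B} \triangleq p^{:1+A+A} \to f^{:A \to B} \to p \triangleleft \begin{array}{|c||ccc|} & 1 & B & B \\ \hline 1 & \text{id} & 0 & 0 \\ A & 0 & f & 0 \\ A & 0 & 0 & f \end{array} \quad . \]

The required law is written as an equation map ( 𝑝) (id) = 𝑝, called the identity
law. Substituting the code notation for map, we verify the law:

\begin{align*}
\text{expect to equal } p : \quad & \text{map}\,(p)\,(\text{id}) \\
\text{apply map()() to arguments} : \quad &= p \triangleleft \begin{array}{||ccc|} \text{id} & 0 & 0 \\ 0 & \text{id} & 0 \\ 0 & 0 & \text{id} \end{array} \\
\text{identity function in matrix notation} : \quad &= p \triangleleft \text{id} \\
\triangleleft\text{-notation} : \quad &= \text{id}\,(p) = p \quad .
\end{align*}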
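
The identity law can also be spot-checked in Scala on sample values. The sketch below is not a proof. It also includes a deliberately information-losing variant, called badMap here (a name not from the text), to show that the law detects the loss of information:

```scala
sealed trait P[A]
final case class P1[A]() extends P[A]
final case class P2[A](x: A) extends P[A]
final case class P3[A](x: A) extends P[A]

def map[A, B]: P[A] => (A => B) => P[B] = p => f => p match {
  case P1()  => P1()
  case P2(x) => P2(f(x))
  case P3(x) => P3(f(x))
}

// Same type signature, but P2 is converted to P3, which loses information.
def badMap[A, B]: P[A] => (A => B) => P[B] = p => f => p match {
  case P1()  => P1()
  case P2(x) => P3(f(x))
  case P3(x) => P3(f(x))
}

val samples: List[P[Int]] = List(P1(), P2(10), P3(20))
val lawHoldsForMap: Boolean    = samples.forall(p => map[Int, Int](p)(x => x) == p)
val lawHoldsForBadMap: Boolean = samples.forall(p => badMap[Int, Int](p)(x => x) == p)
```

With these definitions, the law holds on all samples for map but fails for badMap on the sample P2(10).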

Example 5.4.1.10 Implement map and flatMap for Either[L, R], applied to the type
parameter L.
Solution For a type constructor, say, 𝑃, the standard type signatures for map
and flatMap are:

\[ \text{map} : P^{A} \to (A \to B) \to P^{B} \quad , \qquad \text{flatMap} : P^{A} \to (A \to P^{B}) \to P^{B} \quad . \]

If a type constructor has more than one type parameter, e.g., $P^{A,S,T}$, one can define
the functions map and flatMap applied to a chosen type parameter. For example,
when applied to the type parameter 𝐴, the type signatures are:

\[ \text{map} : P^{A,S,T} \to (A \to B) \to P^{B,S,T} \quad , \]
\[ \text{flatMap} : P^{A,S,T} \to (A \to P^{B,S,T}) \to P^{B,S,T} \quad . \]

Being “applied to the type parameter 𝐴” means that the other type parameters 𝑆, 𝑇
in $P^{A,S,T}$ remain fixed while the type parameter 𝐴 is replaced by 𝐵 in the type
signatures of map and flatMap.
For the type Either[L, R] (in the type notation, 𝐿 + 𝑅), we keep the type parameter
𝑅 fixed while 𝐿 is replaced by 𝑀. So, we obtain the type signatures:

map : 𝐿 + 𝑅 → (𝐿 → 𝑀) → 𝑀 + 𝑅 ,
flatMap : 𝐿 + 𝑅 → (𝐿 → 𝑀 + 𝑅) → 𝑀 + 𝑅 .

Implementing these functions is straightforward:


def map[L,M,R]: Either[L, R] => (L => M) => Either[M, R] = e => f => e match {
case Left(x) => Left(f(x))
case Right(y) => Right(y)
}

def flatMap[L, M, R]: Either[L, R] => (L => Either[M, R]) => Either[M, R] = e
=> f => e match {
case Left(x) => f(x)
case Right(y) => Right(y)
}

The code notation for these functions is:

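\[ \text{map} \triangleq e^{:L+R} \to f^{:L \to M} \to e \triangleleft \begin{array}{|c||cc|} & M & R \\ \hline L & f & 0 \\ R & 0 & \text{id} \end{array} \quad , \]
\[ \text{flatMap} \triangleq e^{:L+R} \to f^{:L \to M+R} \to e \triangleleft \begin{array}{|c||c|} & M + R \\ \hline L & f \\ R & y^{:R} \to 0^{:M} + y \end{array} \quad . \]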

Note that the code matrix for flatMap cannot be split into the 𝑀 and 𝑅 columns
because we do not know in advance which part of the disjunctive type 𝑀 + 𝑅 will
be returned when we evaluate 𝑓 (𝑥 :𝐿 ).
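
A usage sketch for these two functions is shown below. The names mapLeft and flatMapLeft are illustrative renamings of map and flatMap from this example, and the function parse is a made-up example. Note that in this setup the Left part holds the value being transformed, while the Right part is passed through unchanged.

```scala
def mapLeft[L, M, R]: Either[L, R] => (L => M) => Either[M, R] = e => f => e match {
  case Left(x)  => Left(f(x))
  case Right(y) => Right(y)
}

def flatMapLeft[L, M, R]: Either[L, R] => (L => Either[M, R]) => Either[M, R] =
  e => f => e match {
    case Left(x)  => f(x)
    case Right(y) => Right(y)
  }

// Parse a string into an integer (Left) or report a message (Right).
val parse: String => Either[Int, String] =
  s => s.toIntOption.toLeft(s"not a number: $s")

val r1: Either[Int, String] = flatMapLeft[String, Int, String](Left("12"))(parse)
val r2: Either[Int, String] = mapLeft[Int, Int, String](r1)(_ + 1)
val r3: Either[Int, String] = flatMapLeft[String, Int, String](Left("x"))(parse)
val r4: Either[Int, String] = flatMapLeft[String, Int, String](Right("early"))(parse)
```

The values r3 and r4 show the two ways in which flatMapLeft can produce a Right: either the function f returns it, or the input was already a Right.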
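Example 5.4.1.11 Define a type constructor $\text{State}^{S,A} \equiv S \to A \times S$ and implement
the functions:
(a) $\text{pure}^{S,A} : A \to \text{State}^{S,A}$.
(b) $\text{map}^{S,A,B} : \text{State}^{S,A} \to (A \to B) \to \text{State}^{S,B}$.
(c) $\text{flatMap}^{S,A,B} : \text{State}^{S,A} \to (A \to \text{State}^{S,B}) \to \text{State}^{S,B}$.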
Solution It is assumed that all functions must be fully parametric and pre-
serve as much information as possible. We define the type alias:
type State[S, A] = S => (A, S)
(a) The type signature is 𝐴 → 𝑆 → 𝐴 × 𝑆, and there is only one implementation:
def pure[S, A]: A => State[S, A] = a => s => (a, s)
In the code notation, this is written as:

\[ \text{pu}^{S,A} \triangleq a^{:A} \to s^{:S} \to a \times s \quad . \]

(b) The type signature is:

\[ \text{map}^{S,A,B} : (S \to A \times S) \to (A \to B) \to S \to B \times S \quad . \]

Begin writing a Scala implementation:


def map[S, A, B]: State[S, A] => (A => B) => State[S, B] = { t => f => s => ???
}
We need to compute a value of 𝐵 × 𝑆 from the curried arguments 𝑡 :𝑆→𝐴×𝑆 , 𝑓 :𝐴→𝐵 ,
and 𝑠 :𝑆 . We begin writing the code of map using a typed hole:

$$\text{map} \overset{\text{def}}{=} t^{:S \to A \times S} \to f^{:A \to B} \to s^{:S} \to \text{???}^{:B} \times \text{???}^{:S} .$$

The only way of getting a value of type 𝐵 is by applying 𝑓 to a value of type 𝐴:

$$\text{map} \overset{\text{def}}{=} t^{:S \to A \times S} \to f^{:A \to B} \to s^{:S} \to f(\text{???}^{:A}) \times \text{???}^{:S} .$$
5 The logic of types. III. The Curry-Howard correspondence

The only possibility of filling the typed hole ???:𝐴 is to apply 𝑡 to a value of type
𝑆. We already have such a value, 𝑠 :𝑆 . Computing 𝑡 (𝑠) yields a pair of type 𝐴 × 𝑆,
from which we may take the first part (of type 𝐴) to fill the typed hole ???:𝐴 . The
second part of the pair is a value of type 𝑆 that we may use to fill the second typed
hole, ???:𝑆 . So, the Scala code is:
1 def map[S, A, B]: State[S, A] => (A => B) => State[S, B] = {
2 t => f => s =>
3 val (a, s2) = t(s)
4 (f(a), s2) // We could also return `(f(a), s)` here.
5 }
Why not return the original value s in the tuple 𝐵 × 𝑆, instead of the new value
s2? The reason is that we would like to preserve information as much as possible.
If we return (f(a), s) in line 4, we will have discarded the computed value s2,
which is a loss of information.
To write the code notation for map, we need to destructure the pair that 𝑡 (𝑠)
returns. We can write explicit destructuring code like this:
$$\text{map} \overset{\text{def}}{=} t^{:S \to A \times S} \to f^{:A \to B} \to s^{:S} \to (a^{:A} \times s_2^{:S} \to f(a) \times s_2)(t(s)) .$$
If we temporarily denote by $q$ the following destructuring function:
$$q \overset{\text{def}}{=} (a^{:A} \times s_2^{:S} \to f(a) \times s_2) ,$$
we will notice that the expression $s \to q(t(s))$ is a function composition applied
to $s$. So, we rewrite $s \to q(t(s))$ as the composition $t ⨟ q$ and obtain shorter code:
$$\text{map} \overset{\text{def}}{=} t^{:S \to A \times S} \to f^{:A \to B} \to t ⨟ (a^{:A} \times s^{:S} \to f(a) \times s) .$$
Shorter formulas are often easier to reason about in derivations, although not nec-
essarily easier to read when converted to program code.
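
To make the information-preservation argument concrete, here is a minimal sketch that compares the `map` derived above with the variant that returns the original state. The names `StateMapExample`, `mapLossy`, and `next` are assumptions introduced only for this illustration.

```scala
object StateMapExample {
  type State[S, A] = S => (A, S)

  // Information-preserving `map`, as derived in the text.
  def map[S, A, B]: State[S, A] => (A => B) => State[S, B] = { t => f => s =>
    val (a, s2) = t(s)
    (f(a), s2)
  }

  // The alternative that discards the updated state `s2` and returns `s` instead.
  def mapLossy[S, A, B]: State[S, A] => (A => B) => State[S, B] = { t => f => s =>
    val (a, _) = t(s)
    (f(a), s)
  }

  // Hypothetical state transition: return the current counter and increment it.
  val next: State[Int, Int] = s => (s, s + 1)

  // Both versions produce the same value of type B but differ in the final state.
  val good: (String, Int)  = map[Int, Int, String](next)(a => s"id-$a")(10)
  val lossy: (String, Int) = mapLossy[Int, Int, String](next)(a => s"id-$a")(10)
}
```

With `mapLossy`, the increment performed by `next` is silently lost, which is why the text prefers returning `s2`.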
(c) The required type signature is:
flatMap𝑆,𝐴,𝐵 : (𝑆 → 𝐴 × 𝑆) → ( 𝐴 → 𝑆 → 𝐵 × 𝑆) → 𝑆 → 𝐵 × 𝑆 .
We perform code reasoning with typed holes:
$$\text{flatMap} \overset{\text{def}}{=} t^{:S \to A \times S} \to f^{:A \to S \to B \times S} \to s^{:S} \to \text{???}^{:B \times S} .$$
To fill $\text{???}^{:B \times S}$, we need to apply $f$ to some arguments, since $f$ is the only function
that returns any values of type $B$. Applying $f$ to two values will yield a value of
type $B \times S$, just as we need:
$$\text{flatMap} \overset{\text{def}}{=} t^{:S \to A \times S} \to f^{:A \to S \to B \times S} \to s^{:S} \to f(\text{???}^{:A})(\text{???}^{:S}) .$$
To fill the new typed holes, we need to apply $t$ to an argument of type $S$. We have
only one given value $s^{:S}$ of type $S$, so we must compute $t(s)$ and destructure it:
$$\text{flatMap} \overset{\text{def}}{=} t^{:S \to A \times S} \to f^{:A \to S \to B \times S} \to s^{:S} \to (a \times s_2 \to f(a)(s_2))(t(s)) .$$
Translating this notation into Scala code, we obtain:
def flatMap[S, A, B]: State[S, A] => (A => State[S, B]) => State[S, B] = {
  t => f => s =>
    val (a, s2) = t(s)
    f(a)(s2) // We could also return `f(a)(s)` here, but that would lose information.
}
In order to preserve information, we choose not to discard the computed value s2.
The code notation for this flatMap can be simplified to:

$$\text{flatMap} \overset{\text{def}}{=} t^{:S \to A \times S} \to f^{:A \to S \to B \times S} \to t ⨟ (a \times s \to f(a)(s)) .$$
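
The following sketch shows one way this `flatMap` could be used to chain two state transitions so that the second step sees the state produced by the first. The names `StateFlatMapExample`, `next`, and `twoSteps` are hypothetical and serve only as an illustration.

```scala
object StateFlatMapExample {
  type State[S, A] = S => (A, S)

  def flatMap[S, A, B]: State[S, A] => (A => State[S, B]) => State[S, B] = { t => f => s =>
    val (a, s2) = t(s)
    f(a)(s2) // Pass the updated state `s2` to the next step.
  }

  // Hypothetical state transition: return the current counter and increment it.
  val next: State[Int, Int] = s => (s, s + 1)

  // Run `next` twice and collect both results into a pair.
  val twoSteps: State[Int, (Int, Int)] =
    flatMap[Int, Int, (Int, Int)](next) { a => s =>
      val (b, s3) = next(s)
      ((a, b), s3)
    }
}
```

Had `flatMap` returned `f(a)(s)`, the second call to `next` would start again from the original counter and both elements of the pair would be equal.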

5.4.2 Exercises
Exercise 5.4.2.1 Find the cardinality of the following Scala type:
type P = Option[Boolean => Option[Boolean]]
Show that P is equivalent to Option[Boolean] => Boolean, but the equivalence is ac-
cidental and not “natural”.
Exercise 5.4.2.2 Verify the type equivalences 𝐴 + 𝐴 ≅ 2 × 𝐴 and 𝐴 × 𝐴 ≅ 2 → 𝐴,
where 2 denotes the Boolean type.
Exercise 5.4.2.3 Show that 𝛼 ⇒ (𝛽 ∨ 𝛾) ≠ (𝛼 ⇒ 𝛽) ∧ (𝛼 ⇒ 𝛾) in constructive and
Boolean logic.
Exercise 5.4.2.4 Verify the type equivalence ( 𝐴 → 𝐵 × 𝐶) ≅ ( 𝐴 → 𝐵) × ( 𝐴 → 𝐶)
with full proofs.
Exercise 5.4.2.5 Use known rules to verify the type equivalences without need
for proofs:
(a) ( 𝐴 + 𝐵) × ( 𝐴 → 𝐵) ≅ 𝐴 × ( 𝐴 → 𝐵) + (1 + 𝐴 → 𝐵) .
(b) ( 𝐴 × (1 + 𝐴) → 𝐵) ≅ ( 𝐴 → 𝐵) × ( 𝐴 → 𝐴 → 𝐵) .
(c) 𝐴 → (1 + 𝐵) → 𝐶 × 𝐷 ≅ ( 𝐴 → 𝐶) × ( 𝐴 → 𝐷) × ( 𝐴 × 𝐵 → 𝐶) × ( 𝐴 × 𝐵 → 𝐷) .
Exercise 5.4.2.6 Write the type notation for Either[(A, Int), Either[(A, Char),
(A, Float)]]. Transform this type into an equivalent type of the form 𝐴 × (...).
Exercise 5.4.2.7 Define a type OptE𝑇,𝐴 def = 1 + 𝑇 + 𝐴 and implement information-
preserving map and flatMap for it, applied to the type parameter 𝐴. Get the same
result using the equivalent type (1 + 𝐴) + 𝑇, i.e., Either[Option[A], T]. The required
type signatures are:

map 𝐴,𝐵,𝑇 : OptE𝑇,𝐴 → ( 𝐴 → 𝐵) → OptE𝑇,𝐵 ,


flatMap 𝐴,𝐵,𝑇 : OptE𝑇,𝐴 → ( 𝐴 → OptE𝑇,𝐵 ) → OptE𝑇,𝐵 .

Exercise 5.4.2.8 Implement the map function for the type constructor P from Ex-
ample 5.4.1.2. The required type signature is 𝑃 𝐴 → ( 𝐴 → 𝐵) → 𝑃 𝐵 . Preserve
information as much as possible.
Exercise 5.4.2.9 For the type constructor 𝑄 defined in Exercise 5.1.5.1, define the
map function, preserving information as much as possible:

map𝑇,𝐴,𝐵 : 𝑄𝑇,𝐴 → ( 𝐴 → 𝐵) → 𝑄𝑇,𝐵 .

Exercise 5.4.2.10 Define a recursive type constructor Tr3 as
$\text{Tr3}^A \overset{\text{def}}{=} 1 + A \times A \times A \times \text{Tr3}^A$
and implement the map function for it, with the standard type signature:

$$\text{map}^{A,B} : \text{Tr3}^A \to (A \to B) \to \text{Tr3}^B .$$


Exercise 5.4.2.11 Implement fully parametric, information-preserving functions
with the following type signatures:
(a) ( 𝐴 → 𝐵 → 𝐶) → 𝐴 → 𝐵 → 𝐶 .
(b) ( 𝐴 → 𝐶) → (𝐵 → 𝐷) → 𝐴 + 𝐵 → 𝐶 + 𝐷 .
(c) ( 𝐴 → 𝐶) → (𝐵 → 𝐷) → 𝐴 × 𝐵 → 𝐶 × 𝐷 .
(d) (( 𝐴 → 𝐴) → 𝐴) → 𝐴 .
(e) (( 𝐴 → 𝐵) → 𝐶) → 𝐵 → 𝐶 .
(f) (( 𝐴 → 𝐵) → 𝐴) → ( 𝐴 → 𝐵) → 𝐵 .
(g) ( 𝐴 → 𝐵 + 𝐶) → (𝐵 → 𝐶) → 𝐴 → 𝐶 .
(h) ( 𝐴 + 𝐵 → 𝐶) → (𝐵 → 𝐶) → 𝐴 → 𝐶 .
(i) Reader𝐸,𝐴 → ( 𝐴 → Reader𝐸,𝐵 ) → Reader𝐸,𝐵 .
(j) Reader𝐸,𝐴 × Reader𝐸,𝐵 → ( 𝐴 × 𝐵 → 𝐶) → Reader𝐸,𝐶 .
(k) State𝑆,𝐴 → (𝑆 × 𝐴 → 𝐵) → State𝑆,𝐵 .
(l) 𝐴 + 𝑍 → 𝐵 + 𝑍 → ( 𝐴 → 𝐵 → 𝐶) → 𝐶 + 𝑍 .
(m) 𝑃 + 𝐴 × 𝐴 → ( 𝐴 → 𝐵) → (𝑃 → 𝐴 + 𝑄) → 𝑄 + 𝐵 × 𝐵 .
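Exercise 5.4.2.12 Denote $\text{Cont}^{R,T} \overset{\text{def}}{=} (T \to R) \to R$ and implement the functions:
(a) $\text{map}^{R,T,U} : \text{Cont}^{R,T} \to (T \to U) \to \text{Cont}^{R,U}$ .
(b) $\text{flatMap}^{R,T,U} : \text{Cont}^{R,T} \to (T \to \text{Cont}^{R,U}) \to \text{Cont}^{R,U}$ .
Exercise 5.4.2.13 Denote $\text{Sel}^{Z,T} \overset{\text{def}}{=} (T \to Z) \to T$ and implement the functions:
(a) $\text{map}^{Z,A,B} : \text{Sel}^{Z,A} \to (A \to B) \to \text{Sel}^{Z,B}$ .
(b) $\text{flatMap}^{Z,A,B} : \text{Sel}^{Z,A} \to (A \to \text{Sel}^{Z,B}) \to \text{Sel}^{Z,B}$ .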

5.5 Discussion and further developments


5.5.1 Using the Curry-Howard correspondence for writing
code
This chapter focuses on two practically important reasoning tasks: checking if a
type signature can be implemented as a fully parametric function, and determin-
ing whether two types are equivalent. For the first task, we use the CH correspon-
dence to map type expressions into formulas in the constructive logic and then
apply the proof rules of that logic. For the second task, we map type expressions
into arithmetic formulas and apply the ordinary rules of arithmetic.
Although tools such as the curryhoward library can sometimes derive code from
types, it is beneficial if a programmer is able to derive an implementation by hand
or to determine that an implementation is impossible. For instance, the program-
mer should recognize that the type signature:
def f[A, B]: A => (A => B) => B

has only one fully parametric implementation, while the following two type sig-
natures have none:
def g[A, B]: A => (B => A) => B
def h[A, B]: ((A => B) => A) => B

Exercises in this chapter help to build up the required technique and intuition. The
two main guidelines for code derivation are: “values of parametric types cannot
be constructed from scratch” and “one must hard-code the decision to return a
chosen part of a disjunctive type when no other disjunctive value is given”. These
guidelines can be justified by referring to the rules of proof in Table 5.2. Sequents
producing a value of type 𝐴 can be proved only if there is a premise containing
𝐴 or a function that returns a value of type 𝐴.13 One can derive a disjunction
without hard-coding only if one already has a disjunction in the premises (and
then the rule “use Either” could apply).
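
As a short illustration of the first guideline, the signature `f` shown above can be implemented in only one way: the sole source of a value of type `B` is the given function, and the sole available argument for it is the given value of type `A`. The wrapper object `ParametricExample` and the sample call are assumptions made for this sketch.

```scala
object ParametricExample {
  // Apply the given function to the given value; no other fully parametric code fits this type.
  def f[A, B]: A => (A => B) => B = a => g => g(a)

  // For `g` and `h` from the text, there is no function that returns B
  // and no value of type B in the arguments, so no implementation exists.

  // Hypothetical usage with concrete types.
  val result: String = f[Int, String](42)(n => "value " + n)
}
```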
Throughout this chapter, we require all code to be fully parametric. This is
because the CH correspondence gives useful, non-trivial results only for param-
eterized types and fully parametric code. For concrete, non-parameterized types
(Int, String, etc.), one can always produce some values even with no previous data.
So, the propositions CH (Int) or CH (String) are always true.
Consider the function (x: Int) => x + 1. Its type signature, Int => Int, may be
implemented by many other functions, such as x => x - 1, x => x * 2, etc. So, the
type signature Int => Int is insufficient to specify the code of the function, and
deriving code from that type is not a meaningful task. Only a fully parametric
type signature, such as 𝐴 → ( 𝐴 → 𝐵) → 𝐵, could give enough information for de-
riving the function’s code. Additionally, we must require the code of functions to
be fully parametric. Otherwise we will be unable to reason about code derivation
from type signatures.
Validity of a CH -proposition CH (𝑇) means that we can implement some value
of the given type 𝑇. But this does not give any information about the properties
of that value, such as whether it satisfies any laws. This is why type equivalence
(which requires the laws of isomorphisms) is not determined by an equivalence
of logical formulas.
13 This is proved rigorously by R. Dyckhoff as the “Theorem” in section 6 (“Goal-directed pruning”),
see https://research-repository.st-andrews.ac.uk/handle/10023/8824

It is useful for programmers to be able to transform type expressions to equivalent
simpler types before starting to write code. The type notation introduced
in this book is designed to help programmers to recognize patterns in type
expressions and to reason about them more easily. We have shown that a type
equivalence corresponds to each standard arithmetic identity such as (𝑎 + 𝑏) + 𝑐 =
𝑎 + (𝑏 + 𝑐), (𝑎 × 𝑏) × 𝑐 = 𝑎 × (𝑏 × 𝑐), 1 × 𝑎 = 𝑎, (𝑎 + 𝑏) × 𝑐 = 𝑎 × 𝑐 + 𝑏 × 𝑐, and so on.
Because of this, we are allowed to transform and simplify types as if they were
arithmetic expressions, e.g., to rewrite:
1 × ( 𝐴 + 𝐵) × 𝐶 + 𝐷 ≅ 𝐷 + 𝐴 × 𝐶 + 𝐵 × 𝐶 .
The type notation makes this reasoning more intuitive.
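
The equivalence in this example can be checked in code by writing the two conversion functions. The following is a sketch under the assumption that the left-hand type is encoded as `Either[(Unit, Either[A, B], C), D]` and the right-hand type as `Either[D, Either[(A, C), (B, C)]]`; the names `Lhs`, `Rhs`, `to`, and `from` are introduced only for this illustration.

```scala
object TypeArithmeticExample {
  // 1 × (A + B) × C + D
  type Lhs[A, B, C, D] = Either[(Unit, Either[A, B], C), D]
  // D + A × C + B × C
  type Rhs[A, B, C, D] = Either[D, Either[(A, C), (B, C)]]

  def to[A, B, C, D]: Lhs[A, B, C, D] => Rhs[A, B, C, D] = {
    case Left((_, Left(a), c))  => Right(Left((a, c)))
    case Left((_, Right(b), c)) => Right(Right((b, c)))
    case Right(d)               => Left(d)
  }

  def from[A, B, C, D]: Rhs[A, B, C, D] => Lhs[A, B, C, D] = {
    case Left(d)              => Right(d)
    case Right(Left((a, c)))  => Left(((), Left(a), c))
    case Right(Right((b, c))) => Left(((), Right(b), c))
  }

  // Hypothetical sample value used to try a round trip.
  val sample: Lhs[Int, String, Boolean, Double] = Left(((), Right("b"), true))
  val converted: Rhs[Int, String, Boolean, Double] = to[Int, String, Boolean, Double](sample)
  val roundTrip: Lhs[Int, String, Boolean, Double] = from[Int, String, Boolean, Double](converted)
}
```

A round trip on one sample value does not prove the isomorphism laws, but it shows that neither direction discards any part of the data.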
These results apply to all type expressions built up using product types, disjunc-
tive types (also called “sum” types because they correspond to arithmetic sums),
and function types (also called “exponential” types because they correspond to
arithmetic exponentials). Type expressions that contain only products and sum
types are called polynomial.14 Type expressions that also contain function types
are called exponential-polynomial. We focus on exponential-polynomial types
because they are sufficient for almost all design patterns used in functional pro-
gramming.
There are no types corresponding to subtraction or division, so arithmetic equa-
tions such as:
$$(1 - t) \times (1 + t) = 1 - t \times t , \quad \text{and} \quad \frac{t + t \times t}{t} = 1 + t ,$$
do not directly yield any type equivalences. However, consider this well-known
formula:
$$\frac{1}{1 - t} = 1 + t + t^2 + t^3 + ... + t^n + ... .$$
At first sight, this formula appears to involve subtraction, division, and an infinite
series, and so cannot be directly translated into a type equivalence. However, the
formula can be rewritten as:
$$\frac{1}{1 - t} = L(t) \quad \text{where} \quad L(t) \overset{\text{def}}{=} 1 + t + t^2 + t^3 + ... + t^n \times L(t) . \tag{5.26}$$
The definition of $L(t)$ is finite and only contains additions and multiplications. So,
Eq. (5.26) can be translated into a type equivalence:
$$L^A \cong 1 + A + A \times A + A \times A \times A + ... + \underbrace{A \times ... \times A}_{n \text{ times}} \times L^A . \tag{5.27}$$
This type formula (with $n = 1$) is equivalent to a recursive definition of the type
constructor List:
$$\text{List}^A \overset{\text{def}}{=} 1 + A \times \text{List}^A .$$
The type equivalence (5.27) suggests that we may view the recursive type List
heuristically as an “infinite disjunction” describing lists of zero, one, two, etc., ele-
ments.
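
The recursive definition of List can be written in Scala as a disjunctive type with two cases. This is a minimal sketch; the names `MyList`, `Empty`, and `Cons` are hypothetical and chosen to avoid a clash with the standard library's `List`.

```scala
object ListTypeExample {
  // List^A = 1 + A × List^A: either an empty list, or a head together with a tail.
  sealed trait MyList[+A]
  case object Empty extends MyList[Nothing]                            // The part "1".
  final case class Cons[A](head: A, tail: MyList[A]) extends MyList[A] // The part "A × List^A".

  // Count the elements by unrolling the recursive definition one step at a time.
  def length[A](xs: MyList[A]): Int = xs match {
    case Empty         => 0
    case Cons(_, tail) => 1 + length(tail)
  }

  // A list of two elements corresponds to the part A × A of the "infinite disjunction".
  val twoElements: MyList[Int] = Cons(1, Cons(2, Empty))
}
```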
14 These types are often called “algebraic data types” but this book prefers the more precise term
“polynomial types”.

5.5.2 Implications for designing new programming languages


Today’s functional programming practice assumes, at the minimum, that pro-
grammers will use the six standard type constructions (Section 5.1.2) and the eight
standard code constructions (Section 5.2.2). These constructions are foundational
in the sense that they are used to express all design patterns of functional pro-
gramming. A language that does not directly support all of those constructions
cannot be considered a functional programming language.
A remarkable result of the CH correspondence is that the type system of any
given programming language (functional or not) is mapped into a certain logic,
i.e., a system of logical operations and proof rules. A logical operation will cor-
respond to each of the type constructions available in the programming language.
A proof rule will correspond to each of the available code constructions. Programming
languages that support all the standard type and code constructions (for
instance, OCaml, Haskell, F#, Scala, Swift, and Rust) are mapped into the constructive
logic with all standard logical operations available (𝑇𝑟𝑢𝑒, 𝐹𝑎𝑙𝑠𝑒, disjunction,
conjunction, and implication).
Languages such as C, C++, Java, C#, Go are mapped into logics that do not have
the disjunction operation or the constants 𝑇𝑟𝑢𝑒 and 𝐹𝑎𝑙𝑠𝑒. In other words, these
languages are mapped into incomplete logics where some true formulas cannot be
proved. Incompleteness of the logic of types will make a programming language
unable to express certain computations, e.g., directly handle data that belongs to
a disjoint domain.
Languages that do not enforce type checking (e.g., Python or JavaScript) are
mapped to inconsistent logics where any proposition can be proved — even propo-
sitions normally considered 𝐹𝑎𝑙𝑠𝑒. The CH correspondence will map such absurd
proofs to code that appears to compute a certain value (since the CH -proposition
was proved to be 𝑇𝑟𝑢𝑒) although that value is not actually available. In practice,
such code will crash because of a value that has a wrong type or is “null” (a pointer
to an invalid memory location). Those errors cannot happen in a programming
language whose logic of types is consistent and whose compiler checks all types
at compile time.
So, the CH correspondence gives a mathematically justified procedure for de-
signing new programming languages. The procedure has the following steps:

• Choose a formal logic that is complete and free of inconsistencies.

• For each logical operation, provide a type construction in the language.

• For each axiom and proof rule of the logic, provide a code construction in
the language.

Mathematicians have studied different logics, such as modal logic, temporal logic,
or linear logic. Compared with the constructive logic, those other logics have
some additional type operations. For instance, modal logic adds the operations
“necessarily” and “possibly”, and temporal logic adds the operation “until”. For
each logic, mathematicians have determined the minimal complete sets of oper-
ations, axioms, and proof rules that do not lead to inconsistency. Programming
language designers can use this mathematical knowledge by choosing a logic and
translating it into a minimal “core” of a programming language. Code in that
language will be guaranteed never to crash as long as all types match. This math-
ematical guarantee (known as type safety) is a powerful help for programmers
since it automatically prevents a large number of coding errors. So, programmers
will benefit if they use languages designed using the CH correspondence.
Practically useful programming languages will of course need more features
than the minimal set of mathematically necessary features derived from a chosen
logic. Language designers need to make sure that all added features are consistent
with the core language.
At present, it is still not fully understood how a practical programming lan-
guage could use, say, modal or linear logic as its logic of types. Experience sug-
gests that, at least, the operations of the plain constructive logic should be avail-
able. So, it appears that the six type constructions and the eight code constructions
will remain available in all future languages of functional programming.
It is possible to apply the FP paradigm while writing code in any program-
ming language. However, some languages lack certain features that make FP tech-
niques easier to use in practice. For example, in a language such as C++ or Java,
one can easily use the map/reduce operations but not disjunctive types. More ad-
vanced FP constructions (such as typeclasses) are impractical in those languages:
the required code becomes too hard to read and to write without errors, which
negates the advantages of rigorous reasoning about functional programs.
Some programming languages, such as Haskell and OCaml, were designed
specifically for advanced use and exploration of the FP paradigm. Other lan-
guages, such as F#, Scala, Swift, and Rust, have different design goals but still
support enough FP features to be considered FP languages. This book uses Scala,
but the same constructions may be implemented in other FP languages in a simi-
lar way. Differences between OCaml, Haskell, F#, Scala, Swift, Rust, and other FP
languages do not play a significant role at the level of detail needed in this book.

5.5.3 Practical uses of the void type (Scala’s Nothing)


The void type15 (Scala’s Nothing) corresponds to the logical constant 𝐹𝑎𝑙𝑠𝑒. (The
proposition “the code can compute a value of the void type” is always false.) The void
type is used in some theoretical proofs but has few practical uses. One use case
is for a branch of a match/case expression that throws an exception instead of
returning a value. In this sense, returning a value of the void type corresponds to
a crash in the program. So, a throw expression is defined as if it returns a value
of type Nothing. We can then pretend to convert that “value” (which will never be
actually returned) into a value of any other type. Example 5.3.4.2 shows how to
write a function absurd[A] of type Nothing => A.

15 The “void” type is a type with no values. It is not the same as the void keyword in Java or C
that denotes functions returning “no value”. Those functions are equivalent to Scala functions
returning the (unique) value of Unit type.
To see how this trick is used, consider this code defining a value x:
val x: Double = if (t >= 0.0) math.sqrt(t) else throw new Exception("error")

The else branch does not return a value, but x is declared to have type Double. For
this code to type-check, both branches must return values of the same type. So,
the compiler needs to pretend that the else branch also returns a value of type
Double. The compiler first assigns the type Nothing to the expression throw ... and
then automatically uses the conversion Nothing => Double to convert that type to
Double. In this way, types will match in the definition of the value x.
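
The same mechanism can be observed with a user-defined function whose declared result type is `Nothing`. In the sketch below, the names `NothingExample`, `fail`, `safeSqrt`, and `describe` are assumptions made for illustration. The point is that one expression of type `Nothing` is accepted both where a `Double` is expected and where a `String` is expected.

```scala
object NothingExample {
  // The result type Nothing documents that this function never returns a value.
  def fail(message: String): Nothing = throw new Exception(message)

  // `fail(...)` has type Nothing, which the compiler accepts in place of Double.
  def safeSqrt(t: Double): Double =
    if (t >= 0.0) math.sqrt(t) else fail("negative argument")

  // The same expression is also accepted in place of String.
  def describe(t: Double): String =
    if (t >= 0.0) "non-negative" else fail("negative argument")
}
```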
This book does not discuss exceptions in much detail. The functional program-
ming paradigm does not use exceptions because their presence prevents mathe-
matical reasoning about code.
As another example of using the void type, suppose an external library imple-
ments a function:
def parallel_run[E, A, B](f: A => Either[E, B]): Either[E, B] = ???

We may imagine that parallel_run(f) performs some parallel computations using
a given function 𝑓. In general, functions 𝑓 of type 𝐴 → 𝐸 + 𝐵 may return an error of type 𝐸
or a result of type 𝐵. Suppose we know that a particular function 𝑓 never fails to
compute its result. To express that knowledge in code, we may explicitly set the
type parameter 𝐸 to the void type Nothing when applying parallel_run:
parallel_run[Nothing, A, B](f) // Types match only when values f(a) are always of the form Right(b).

Returning an error is now impossible (the type Nothing has no values). If the func-
tion parallel_run is fully parametric, it will work in the same way with all types 𝐸,
including 𝐸 = 0. The code implements our intention via type parameters, giving
a compile-time guarantee of correct results.
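
One consequence of setting the error type to `Nothing` is that a result of type `Either[Nothing, B]` can be unpacked by a total function. The sketch below does not implement `parallel_run`; the names `NoErrorExample`, `getResult`, and `double` are hypothetical and only illustrate how such a result could be consumed.

```scala
object NoErrorExample {
  // With E = Nothing, the Left case cannot occur, so extracting the result always succeeds.
  def getResult[B]: Either[Nothing, B] => B = {
    case Right(b) => b
    case Left(n)  => n // Never reached: `n` has type Nothing, which conforms to B.
  }

  // Hypothetical computation that never fails.
  def double(x: Int): Either[Nothing, Int] = Right(x * 2)

  val result: Int = getResult[Int](double(21))
}
```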
So far, none of our examples involved the logical negation operation. It is de-
fined as:
$$\neg\alpha \overset{\text{def}}{=} (\alpha \Rightarrow \mathit{False}) .$$

Its practical use in functional programming is as limited as that of 𝐹𝑎𝑙𝑠𝑒 and the
void type. (The type corresponding to 𝛼 ⇒ 𝐹𝑎𝑙𝑠𝑒 is the function type 𝐴 → 0 or, in
Scala, A => Nothing.) However, logical negation plays an important role in Boolean
logic.
5.5.4 Relationship between Boolean logic and constructive logic
We have seen that some true theorems of Boolean logic are not true in construc-
tive logic. Here are some more examples: the Boolean identities ¬ (¬𝛼) = 𝛼 and
(𝛼 ⇒ 𝛽) = (¬𝛼 ∨ 𝛽).
Example 5.5.4.1 Show that the Boolean identity ¬ (¬𝛼) = 𝛼 does not hold in
constructive logic by arguing that there is no fully parametric function with the
type signature (( 𝐴 → 0) → 0) → 𝐴, or in Scala:
def bad4[A](f: (A => Nothing) => Nothing): A

Solution The negation ¬𝛼 is translated by the Curry-Howard correspondence
to the type 𝐴 → 0 (in Scala, A => Nothing). If the Boolean identity ¬ (¬𝛼) = 𝛼 were
also true in the constructive logic, the formula ¬(¬𝛼) ⇒ 𝛼 would be true. The CH
correspondence then says that we would be able to implement a fully parametric
function of type (( 𝐴 → 0) → 0) → 𝐴. Can we do that? The result type of bad4
is an arbitrary, unknown type 𝐴, so we cannot produce values of type 𝐴 from
scratch. We also do not have any given arguments of type 𝐴 or given functions of
type 𝑋 → 𝐴 for some 𝑋. The only remaining possibility of implementing bad4 is by
applying the argument f to a value of type A => Nothing. The resulting value would
be of type Nothing, and we would then use the function absurd from Example 5.3.4.2
to obtain a value of type A. It remains to get a value of type A => Nothing from
scratch. But we cannot do that: Example 5.3.4.3 shows that the type A => Nothing
is itself void unless A is void. So, we cannot compute a value of type A => Nothing
via fully parametric code that works in the same way for all types A. We conclude
that bad4 cannot be implemented. □
Example 5.5.4.2 Show that the Boolean identity (𝛼 ⇒ 𝛽) = ¬𝛼 ∨ 𝛽 does not hold
in constructive logic by arguing that there is no fully parametric function with the
type signature ( 𝐴 → 𝐵) → ( 𝐴 → 0) + 𝐵, or in Scala:
def bad5[A, B](f: A => B): Either[A => Nothing, B]

Solution The result type of bad5 is a disjunctive type, but the argument type is
not. The code of bad5 cannot pattern-match on its argument f to make a decision
about returning a Left or a Right. So, bad5 must hard-code that decision and either
always return a value of type $(A \to 0) + 0^{:B}$, or always return a value of type
$0^{:A \to 0} + B$. A value of type 𝐵 cannot be computed from scratch because the type 𝐵
is unknown. The only way of getting a value of type 𝐵 would be by computing
f(x) for some x: A, but no values of type 𝐴 are available. The remaining possibility
is to return a value of type 𝐴 → 0. But the type 𝐴 → 0 is void unless 𝐴 = 0 (see
Example 5.3.4.3). As the type 𝐴 is unknown, we cannot write code that works in
the same way for all types 𝐴 and produces a value of type 𝐴 → 0. So, the function
bad5 cannot be implemented via fully parametric code. □
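
As a complementary observation that is not part of the two examples above, the opposite directions of both Boolean identities do have fully parametric implementations. The sketch below shows them; the names `ConverseDirections`, `doubleNegation`, `toFunction`, and `constant` are assumptions made for this illustration.

```scala
object ConverseDirections {
  // α ⇒ ¬(¬α): the type A => ((A => Nothing) => Nothing).
  def doubleNegation[A]: A => (A => Nothing) => Nothing = a => k => k(a)

  // (¬α ∨ β) ⇒ (α ⇒ β): here the argument is a disjunction, so pattern matching is possible.
  def toFunction[A, B]: Either[A => Nothing, B] => A => B = {
    case Left(notA) => a => notA(a) // `notA(a)` has type Nothing, which conforms to B.
    case Right(b)   => _ => b
  }

  // Hypothetical usage: a Right value gives a constant function.
  val constant: Int => String = toFunction[Int, String](Right("fixed"))
}
```

This asymmetry is what the arguments for bad4 and bad5 rely on: code can consume a given disjunction or a given value, but it cannot produce one from nothing.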
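Nevertheless, as we will now show, any theorem of constructive logic is also a
theorem of Boolean logic. The reason is that all eight rules of constructive logic
(Section 5.2.3) also hold in Boolean logic.

$$\begin{array}{lll}
\textbf{Constructive logic} & & \textbf{Boolean logic} \\
\Gamma \vdash \mathcal{CH}(1) & \text{(create unit)} & \neg\Gamma \vee \mathit{True} = \mathit{True} \\
\Gamma, \alpha \vdash \alpha & \text{(use value)} & \neg\Gamma \vee \neg\alpha \vee \alpha = \mathit{True} \\
\dfrac{\Gamma, \alpha \vdash \beta}{\Gamma \vdash \alpha \Rightarrow \beta} & \text{(create function)} & (\neg\Gamma \vee \neg\alpha \vee \beta) = (\neg\Gamma \vee (\alpha \Rightarrow \beta)) \\
\dfrac{\Gamma \vdash \alpha \quad \Gamma \vdash \alpha \Rightarrow \beta}{\Gamma \vdash \beta} & \text{(use function)} & ((\neg\Gamma \vee \alpha) \wedge (\neg\Gamma \vee (\alpha \Rightarrow \beta))) \Rightarrow (\neg\Gamma \vee \beta) \\
\dfrac{\Gamma \vdash \alpha \quad \Gamma \vdash \beta}{\Gamma \vdash \alpha \wedge \beta} & \text{(create tuple)} & (\neg\Gamma \vee \alpha) \wedge (\neg\Gamma \vee \beta) = (\neg\Gamma \vee (\alpha \wedge \beta)) \\
\dfrac{\Gamma \vdash \alpha \wedge \beta}{\Gamma \vdash \alpha} & \text{(use tuple-1)} & (\neg\Gamma \vee (\alpha \wedge \beta)) \Rightarrow (\neg\Gamma \vee \alpha) \\
\dfrac{\Gamma \vdash \alpha}{\Gamma \vdash \alpha \vee \beta} & \text{(create Left)} & (\neg\Gamma \vee \alpha) \Rightarrow (\neg\Gamma \vee (\alpha \vee \beta)) \\
\dfrac{\Gamma \vdash \alpha \vee \beta \quad \Gamma, \alpha \vdash \gamma \quad \Gamma, \beta \vdash \gamma}{\Gamma \vdash \gamma} & \text{(use Either)} & ((\neg\Gamma \vee \alpha \vee \beta) \wedge (\neg\Gamma \vee \neg\alpha \vee \gamma) \wedge (\neg\Gamma \vee \neg\beta \vee \gamma)) \Rightarrow (\neg\Gamma \vee \gamma)
\end{array}$$

Table 5.7: Proof rules of constructive logic are true also in the Boolean logic.
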
To verify that a formula is true in Boolean logic, it is sufficient to check that
the value of the formula is 𝑇𝑟𝑢𝑒 for all possible truth values (𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒) of its
variables. A sequent such as 𝛼, 𝛽 ⊢ 𝛾 is true in Boolean logic if and only if 𝛾 = 𝑇𝑟𝑢𝑒
under the assumption that 𝛼 = 𝛽 = 𝑇𝑟𝑢𝑒. So, the sequent 𝛼, 𝛽 ⊢ 𝛾 is translated into
the Boolean formula:
𝛼, 𝛽 ⊢ 𝛾 = ((𝛼 ∧ 𝛽) ⇒ 𝛾) = (¬𝛼 ∨ ¬𝛽 ∨ 𝛾) .
Table 5.7 translates all proof rules of Section 5.2.3 into Boolean formulas. The first
two lines are axioms, while the subsequent lines are Boolean theorems that can be
verified by calculation.
To simplify the calculations, note that all terms in the formulas contain the op-
eration (¬Γ ∨ ...) corresponding to the context Γ. Now, if Γ is 𝐹𝑎𝑙𝑠𝑒, the entire
formula becomes automatically 𝑇𝑟𝑢𝑒, and there is nothing else to check. So, it
remains to verify the formula in case Γ = 𝑇𝑟𝑢𝑒, and then we can simply omit all
instances of ¬Γ in the formulas. Let us show the Boolean derivations for the rules
“use function” and “use Either”; other formulas are checked in a similar way:
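$$\begin{aligned}
\text{formula “use function”} : \quad & (\alpha \wedge (\alpha \Rightarrow \beta)) \Rightarrow \beta \\
\text{use Eq. (5.8)} : \quad & = \neg(\alpha \wedge (\neg\alpha \vee \beta)) \vee \beta \\
\text{de Morgan’s laws} : \quad & = \neg\alpha \vee (\alpha \wedge \neg\beta) \vee \beta \\
\text{identity } p \vee (\neg p \wedge q) = p \vee q \text{ with } p = \neg\alpha \text{ and } q = \neg\beta : \quad & = \neg\alpha \vee \neg\beta \vee \beta \\
\text{axiom “use value”} : \quad & = \mathit{True} .
\end{aligned}$$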

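$$\begin{aligned}
\text{formula “use Either”} : \quad & ((\alpha \vee \beta) \wedge (\alpha \Rightarrow \gamma) \wedge (\beta \Rightarrow \gamma)) \Rightarrow \gamma \\
\text{use Eq. (5.8)} : \quad & = \neg((\alpha \vee \beta) \wedge (\neg\alpha \vee \gamma) \wedge (\neg\beta \vee \gamma)) \vee \gamma \\
\text{de Morgan’s laws} : \quad & = (\neg\alpha \wedge \neg\beta) \vee (\alpha \wedge \neg\gamma) \vee (\beta \wedge \neg\gamma) \vee \gamma \\
\text{identity } p \vee (\neg p \wedge q) = p \vee q : \quad & = (\neg\alpha \wedge \neg\beta) \vee \alpha \vee \beta \vee \gamma \\
\text{identity } p \vee (\neg p \wedge q) = p \vee q : \quad & = \neg\alpha \vee \alpha \vee \beta \vee \gamma \\
\text{axiom “use value”} : \quad & = \mathit{True} .
\end{aligned}$$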
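
The two derivations above can also be confirmed by brute force, since a Boolean formula with two or three variables has only 4 or 8 truth assignments. The following sketch checks the simplified formulas (with the context Γ omitted, as in the derivations). The names `TruthTableExample`, `implies`, `isTautology2`, and `isTautology3` are assumptions made for illustration.

```scala
object TruthTableExample {
  // Boolean implication: (p ⇒ q) = (¬p ∨ q).
  def implies(p: Boolean, q: Boolean): Boolean = !p || q

  private val bools = List(true, false)

  // Evaluate a formula on every truth assignment of its variables.
  def isTautology2(formula: (Boolean, Boolean) => Boolean): Boolean =
    bools.forall(a => bools.forall(b => formula(a, b)))

  def isTautology3(formula: (Boolean, Boolean, Boolean) => Boolean): Boolean =
    bools.forall(a => bools.forall(b => bools.forall(c => formula(a, b, c))))

  // "use function": (α ∧ (α ⇒ β)) ⇒ β
  val useFunction: Boolean =
    isTautology2((a, b) => implies(a && implies(a, b), b))

  // "use Either": ((α ∨ β) ∧ (α ⇒ γ) ∧ (β ⇒ γ)) ⇒ γ
  val useEither: Boolean =
    isTautology3((a, b, c) => implies((a || b) && implies(a, c) && implies(b, c), c))
}
```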

Since each proof rule of the constructive logic is translated into a true formula in
Boolean logic, it follows that a proof tree in the constructive logic will be translated
into a tree of Boolean formulas that have value 𝑇𝑟𝑢𝑒 for each axiom or proof rule.
So, a proof tree for a sequent such as ∅ ⊢ 𝑓 (𝛼, 𝛽, 𝛾) is translated into a tree of
Boolean implications that look like this:

𝑇𝑟𝑢𝑒 = (𝑇𝑟𝑢𝑒) ⇒ (𝑇𝑟𝑢𝑒) ⇒ ... ⇒ 𝑓 (𝛼, 𝛽, 𝛾) .

Since (𝑇𝑟𝑢𝑒 ⇒ 𝑥) = 𝑥 for any 𝑥, the Boolean formula 𝑓 (𝛼, 𝛽, 𝛾) will be proved 𝑇𝑟𝑢𝑒.
To see how this works in practice, consider the proof tree shown in Figure 5.2
(page 199). Each step in that proof is made via an axiom or via a derivation rule.
Denoting for brevity $\gamma \overset{\text{def}}{=} (\alpha \Rightarrow \alpha) \Rightarrow \beta$, let us translate each of those axioms and
rules into Boolean formulas that (as we know) all have value 𝑇𝑟𝑢𝑒:

“use value” : 𝑇𝑟𝑢𝑒 = (𝛾 ∧ 𝛼) ⇒ 𝛼 .


“create function” : 𝑇𝑟𝑢𝑒 = ((𝛾 ∧ 𝛼) ⇒ 𝛼) ⇒ (𝛾 ⇒ (𝛼 ⇒ 𝛼))
= (𝑇𝑟𝑢𝑒) ⇒ (𝛾 ⇒ (𝛼 ⇒ 𝛼)) .
“use value” : 𝑇𝑟𝑢𝑒 = (𝛾 ⇒ ((𝛼 ⇒ 𝛼) ⇒ 𝛽)) .
“use function” : 𝑇𝑟𝑢𝑒 = (𝛾 ⇒ (𝛼 ⇒ 𝛼)) ∧ (𝛾 ⇒ ((𝛼 ⇒ 𝛼) ⇒ 𝛽)) ⇒ (𝛾 ⇒ 𝛽)
= (𝑇𝑟𝑢𝑒) ∧ (𝑇𝑟𝑢𝑒) ⇒ (𝛾 ⇒ 𝛽) .
“create function” : 𝑇𝑟𝑢𝑒 = (𝛾 ⇒ 𝛽) ⇒ (𝛾 ⇒ 𝛽) = (𝑇𝑟𝑢𝑒) ⇒ (𝛾 ⇒ 𝛽) .

These Boolean implications ultimately show that 𝛾 ⇒ 𝛽 is 𝑇𝑟𝑢𝑒.


It is easier to check Boolean truth tables than to find a proof tree in constructive
logic (or to establish that no proof tree exists). If we find that a formula is not true
in Boolean logic, we know it is also not true in constructive logic. This gives us
a quick way of proving that some type signatures are not implementable as fully
parametric functions. However, if a formula is true in Boolean logic, it does not
follow that the formula is also true in the constructive logic.
In addition to formulas shown in Table 5.4 (Section 5.2.1), here are three more
examples of formulas that are not true in Boolean logic:

∀𝛼. 𝛼 , ∀(𝛼, 𝛽). 𝛼 ⇒ 𝛽 , ∀(𝛼, 𝛽). (𝛼 ⇒ 𝛽) ⇒ 𝛽 .

These formulas are also not true in the constructive logic.
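
To show that a formula is not true in Boolean logic, it is enough to find one truth assignment that makes it false. The sketch below searches for such an assignment for the three formulas just listed. The names `CounterexampleFinder` and `counterexample` are hypothetical.

```scala
object CounterexampleFinder {
  def implies(p: Boolean, q: Boolean): Boolean = !p || q

  // Return the first truth assignment (if any) for which the formula is false.
  def counterexample(formula: (Boolean, Boolean) => Boolean): Option[(Boolean, Boolean)] = {
    val assignments = for {
      a <- List(true, false)
      b <- List(true, false)
    } yield (a, b)
    assignments.find { case (a, b) => !formula(a, b) }
  }

  val forAlpha: Option[(Boolean, Boolean)] = counterexample((a, _) => a)              // ∀α. α
  val forImplication: Option[(Boolean, Boolean)] = counterexample((a, b) => implies(a, b)) // ∀(α, β). α ⇒ β
  val forThird: Option[(Boolean, Boolean)] =
    counterexample((a, b) => implies(implies(a, b), b))                               // ∀(α, β). (α ⇒ β) ⇒ β

  // For comparison: α ⇒ α is true in Boolean logic, so no counterexample exists.
  val forIdentity: Option[(Boolean, Boolean)] = counterexample((a, _) => implies(a, a))
}
```

A counterexample found this way shows that the corresponding type signature cannot be implemented by fully parametric code. The absence of a counterexample, as the text notes, proves nothing about constructive logic.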


5.5.5 The constructive logic and the law of excluded middle


Computations in the Boolean logic are often performed using truth tables. It is
perhaps surprising that the proof rules of the constructive logic are not equivalent
to checking whether some propositions are 𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒 via a truth table. A
general form of this statement was proved by K. Gödel in 1932.16 In this sense,
constructive logic does not imply that every proposition is either 𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒.
This is not intuitive and requires getting used to. Reasoning in the constructive
logic must use the axioms and derivation rules directly, instead of truth tables.
The Boolean logic can use truth tables because every Boolean proposition may
be assumed in advance to be either 𝑇𝑟𝑢𝑒 or 𝐹𝑎𝑙𝑠𝑒. This can be written as the
formula ∀𝛼. (¬𝛼 ∨ 𝛼 = 𝑇𝑟𝑢𝑒). Table 5.7 uses the Boolean identity (𝛼 ⇒ 𝛽) = (¬𝛼 ∨
𝛽), which does not hold in the constructive logic, to translate the constructive
axiom “use value” into the Boolean axiom ¬𝛼 ∨ 𝛼 = 𝑇𝑟𝑢𝑒.
The formula ∀𝛼. ¬𝛼 ∨ 𝛼 = 𝑇𝑟𝑢𝑒 is known as the law of excluded middle.17 It is
remarkable that the constructive logic does not have the law of excluded middle. It
is neither an axiom nor a derived theorem in constructive logic.
To see why, translate the constructive logic formula ∀𝛼. ¬𝛼 ∨ 𝛼 into a type. The
negation operation (¬𝛼) is defined as the implication 𝛼 ⇒ 𝐹𝑎𝑙𝑠𝑒. So, the formula
∀𝛼. ¬𝛼 ∨ 𝛼 corresponds to the type ∀𝐴. ( 𝐴 → 0) + 𝐴. Can we compute a value of
this type via fully parametric code? For that, we need to compute either a value of
type 𝐴 → 0 or a value of type 𝐴. This decision needs to be made independently of
𝐴, because the code of a fully parametric function must operate in the same way
for all types. We cannot compute a value of type 𝐴 from scratch, as the type 𝐴 is
unknown. So, the only remaining possibility is to compute a value of type 𝐴 → 0.
As we have seen in Example 5.3.4.3, a value of type 𝐴 → 0 exists only if the type
𝐴 is itself 0. But we do not know in advance whether 𝐴 = 0. Since there are no
values of type 0, and the type parameter 𝐴 could be, say, Int, we cannot compute
a value of type 𝐴 → 0.
Is it really impossible to implement a value of the type ( 𝐴 → 0) + 𝐴? We could
reason like this: the type 𝐴 is either void or not void. If 𝐴 is void then ( 𝐴 → 0) ≅ 1
is not void (as Example 5.3.4.3 shows). So, one of the types in the disjunction
( 𝐴 → 0) + 𝐴 should be non-void and have values that we can compute.
While this argument is true, it does not help implementing a value of type
( 𝐴 → 0) + 𝐴 via fully parametric code. It is not enough to know that one of the
two values “should exist”. We need to know which of the two values exists, and we
need to write code that computes that value. That code may not decide what to
do depending on whether the type 𝐴 is void, because the code must work in the
same way for all types 𝐴 (void or not). As we have seen, that code is impossible
to write.
16 See plato.stanford.edu/entries/intuitionistic-logic-development/
17 See https://en.wikipedia.org/wiki/Law_of_excluded_middle

In Boolean logic, one may prove that a value “should exist” by showing that
the non-existence of a value is contradictory in some way. However, any prac-
tically useful program needs to “construct” (i.e., to compute) actual values. The
“constructive” logic got its name from this requirement. So, it is the constructive
logic (not the Boolean logic) that provides correct reasoning about the types of
values computable by fully parametric functional programs.
If we drop the requirement of full parametricity, we could implement the law of
excluded middle. Special features of Scala (reflection, type tags, and type casts)
allow programmers to compare types as values and to determine what type was
given to a type parameter when a function is applied:
import scala.reflect.runtime.universe._

// Convert the type parameter T into a special value.
def getType[T: TypeTag]: Type = weakTypeOf[T]

// Compare types A and B.
def equalTypes[A: TypeTag, B: TypeTag]: Boolean = getType[A] =:= getType[B]

// excludedMiddle has type ∀𝐴. ( 𝐴 → 0) + 𝐴, written here as Either[A, A => Nothing]
// (the two parts of the disjunction are swapped).
def excludedMiddle[A: TypeTag]: Either[A, A => Nothing] =
  if (equalTypes[A, Nothing]) Right((identity _).asInstanceOf[A => Nothing]) // Return id : 0 → 0.
  else if (equalTypes[A, Int]) Left(123.asInstanceOf[A]) // Produce some value of type Int.
  else if (equalTypes[A, Boolean]) Left(true.asInstanceOf[A]) // Produce some value of type Boolean.
  else ??? // Write more definitions to support all other Scala types.

scala> excludedMiddle[Int]
res0: Either[Int,Int => Nothing] = Left(123)

scala> excludedMiddle[Nothing]
res1: Either[Nothing,Nothing => Nothing] = Right(<function1>)
In this code, we check whether 𝐴 = 0. If so, we can implement 𝐴 → 0 as an
identity function of type 0 → 0. Otherwise, we know that 𝐴 is one of the existing
Scala types (Int, Boolean, etc.), which are not void and have values that we can
simply write down one by one in the subsequent code.
Explicit type casts, such as 123.asInstanceOf[A], are needed because the Scala
compiler cannot know that A is Int in the scope where we return Left(123). Without
a type cast, the compiler will not accept 123 as a value of type A in that scope.
The method asInstanceOf is dangerous because the code x.asInstanceOf[T] dis-
ables the type checking for the value x. This tells the Scala compiler to believe that
x has type T even when the type T is inconsistent with the actually given code of x.
The resulting programs compile but may give unexpected results or crash. These
errors would have been prevented if we did not disable the type checking. In this
book, we will avoid writing such code whenever possible.
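The following sketch shows one way in which a type cast may go wrong. The names UnsafeCastDemo, notReallyStrings, and laterFailure are made up for this illustration. The cast itself raises no error, because the element type of a List is not checked at run time. The error appears only later, when an element is actually used as a String:

```scala
import scala.util.Try

object UnsafeCastDemo {
  // This compiles and runs: the compiler is told to believe that the list contains strings.
  val notReallyStrings: List[String] = List(1, 2, 3).asInstanceOf[List[String]]

  // The failure happens here, far away from the cast, when an Int is used as a String.
  val laterFailure: Try[Int] = Try(notReallyStrings.head.length)
}
```

Without the cast, the compiler would reject the first definition with a type error and the run-time failure could not occur.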

Essay: Towards functional data
engineering with Scala
Data engineering is among the highest-demand¹ novel occupations in the IT world
today. Data engineers create software pipelines that process large volumes of
data efficiently. Why did the Scala programming language emerge as a premier
tool² for crafting the foundational data engineering technologies such as Spark or
Akka? Why is Scala in high demand³ within the world of big data?
There are reasons to believe that the choice of Scala was not accidental.

Data is math
Humanity has been working with data at least since Babylonian tax tables⁴ and
the ancient Chinese number books.⁵ Mathematics summarizes several millennia’s
worth of data processing experience in a few fundamental tenets:

• Data is immutable (because true facts are immutable).

• Values of different types (population count, land area, distance, price, location,
time, growth percentage, etc.) need to be handled separately. For example,
it is an error to add a distance to a population count.

• Data processing should be performed according to mathematical formulas.
True mathematical formulas are immutable and always give the same results
from the same input data.
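These tenets can be sketched in Scala as follows. The types Distance and Population and the function totalDistance are hypothetical names used only for this illustration:

```scala
object TypedQuantities {
  // Immutable data: the fields of a case class cannot be reassigned.
  final case class Distance(km: Double) {
    def +(other: Distance): Distance = Distance(km + other.km)
  }
  final case class Population(count: Long) {
    def +(other: Population): Population = Population(count + other.count)
  }

  // A formula: the same input always gives the same result.
  def totalDistance(legs: List[Distance]): Distance =
    legs.foldLeft(Distance(0.0))(_ + _)

  val trip: Distance = totalDistance(List(Distance(1.5), Distance(2.5)))

  // Different types are kept separate. The compiler rejects the next line (type mismatch):
  // val nonsense = Distance(1.5) + Population(1000L)
}
```

The error of adding a distance to a population count is detected before the program runs.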

Violating these tenets produces nonsense (see Fig. 5.1 for a real-life illustration).
The power of the principles of mathematics extends over all epochs and all
cultures; math is the same in San Francisco, in Rio de Janeiro, in Kuala Lumpur,
and in Pyongyang (Fig. 5.2).
1 http://archive.is/mK59h
2 https://tinyurl.com/4wwsedrz
3 https://techcrunch.com/2016/06/14/scala-is-the-new-golden-child/
4 https://www.nytimes.com/2017/08/29/science/trigonometry-babylonian-tablet.html
5 https://quatr.us/china/science/chinamath.htm


Figure 5.1: Mixing incompatible data types produces nonsensical results.

Functional programming is math


The functional programming paradigm is based on mathematical principles: val-
ues are immutable, data processing is coded through formula-like expressions,
and each type of data is required to match correctly during the computations. The
type-checking process automatically prevents programmers from making many
kinds of coding errors. In addition, programming languages such as Scala and
Haskell have features intended for building powerful abstractions and domain-
specific languages. This power of abstraction is not accidental. Mathematics is
the ultimate art of building formal abstractions, and math-based functional pro-
gramming languages capitalize on several millennia of mathematical experience.
A prominent example of how mathematics informs the design of programming
languages is the connection between constructive logic⁶ and the programming
language’s type system, called the Curry-Howard (CH) correspondence. The
main idea of the CH correspondence is to think of programs as mathematical for-
mulas that compute a value of a certain type 𝐴. The CH correspondence is be-
tween programs and logical propositions: To any program that computes a value
of type 𝐴, there corresponds a proposition stating that “a value of type 𝐴 can be
computed”.
This may sound rather theoretical so far. To see the real value of the CH corre-
spondence, recall that formal logic has operations “and”, “or”, and “implies”. For
any two propositions 𝐴, 𝐵, we can construct the propositions “𝐴 and 𝐵”, “𝐴 or
𝐵”, “𝐴 implies 𝐵”. These three logical operations are foundational; without one of
them, the logic is incomplete (cannot derive some theorems).
6 https://en.wikipedia.org/wiki/Intuitionistic_logic


Figure 5.2: The Pyongyang method of error-free software engineering.

A programming language obeys the CH correspondence with the logic if for
any types 𝐴, 𝐵, the language also contains composite types corresponding to the
logical formulas “𝐴 or 𝐵”, “𝐴 and 𝐵”, “𝐴 implies 𝐵”. In Scala, these composite
types are Either[A, B], the tuple (A, B), and the function type A => B. All modern
functional languages such as OCaml, Haskell, Scala, F#, Swift, and Rust support
these type constructions and obey the CH correspondence. Having a complete
logic in a language’s type system enables declarative domain-driven code design.⁷
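As a small illustration, here is how some simple logical statements may be written as Scala types using Either, tuples, and function types. The names of the functions are chosen for this sketch. Each function is fully parametric, and its code may be viewed as a proof of the corresponding proposition:

```scala
object CurryHowardExamples {
  // "(A and B) implies A": take the first part of a tuple.
  def andImpliesFirst[A, B]: ((A, B)) => A = { case (a, _) => a }

  // "A implies (A or B)": put the value into the Left part of Either.
  def intoOr[A, B]: A => Either[A, B] = a => Left(a)

  // "((A implies B) and A) implies B": apply the function to the value.
  def modusPonens[A, B]: ((A => B, A)) => B = { case (f, a) => f(a) }

  // If "A implies C" and "B implies C" then "(A or B) implies C".
  def orElim[A, B, C](f: A => C, g: B => C): Either[A, B] => C = {
    case Left(a)  => f(a)
    case Right(b) => g(b)
  }
}
```

For example, CurryHowardExamples.modusPonens[Int, Int](((x: Int) => x + 1, 2)) evaluates to 3.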
It is interesting to note that most older programming languages (C/C++, Java,
JavaScript, Python) do not support some of these composite types. In other words,
those programming languages have type systems based on an incomplete logic.
As a result, users of those languages have to implement burdensome workarounds
that make for error-prone code. Failure to follow mathematical principles has real
costs (Figure 5.2).

The power of abstraction


Early adopters of Scala, such as Netflix, LinkedIn, and Twitter, were implement-
ing what was then called “big data engineering”. The required software needs to
be highly concurrent, distributed, and resilient to failure. Those software compa-
nies used Scala as their main implementation language and reaped the benefits of
functional programming.
What makes Scala suitable for big data tasks? The only reliable way of man-
aging massively concurrent code is to use sufficiently high-level abstractions that
make application code declarative. The two most important such abstractions are
the “resilient distributed dataset” (RDD) of Apache Spark and the “reactive stream”
used in systems such as Kafka, Akka Streams, and Apache Flink. While these abstractions
are certainly implementable in Java or Python, a fully declarative and
type-safe usage is possible only in a programming language with a sophisticated
type system. Among the currently available mature functional languages, only
Scala and Haskell are technically adequate for that task, due to their support for
typeclasses and higher-order types. The early adopters of Scala were able to benefit
from the powerful abstractions Scala supports. In this way, Scala enabled those
businesses to engineer and to scale up their massively concurrent computations.
It remains to see why Scala (and not, say, OCaml or Haskell) became the lingua
franca of big data.
7 https://fsharpforfunandprofit.com/ddd/

Scala is Java on math


The recently invented general-purpose functional programming languages may
be divided into “academic” (OCaml, Haskell) and “industrial” (F#, Scala, Swift).
The “academic” languages are clean-room implementations of well-researched
mathematical principles of programming language design (the CH correspon-
dence being one such principle). These languages are not limited by require-
ments of compatibility with any existing platforms or libraries. Because of this, the
“academic” languages have been designed and used for pursuing various mathematical
ideas to their logical conclusion.⁸ At the same time, software practitioners
struggle to adopt these programming languages due to a steep learning curve, a
lack of enterprise-grade libraries and tool support, and immature package man-
agement.
The languages from the “industrial” group are based on existing and mature
software ecosystems: F# on .NET, Scala on JVM, and Swift on the MacOS/iOS
platform. One of the important design requirements for those languages is 100%
binary compatibility with their “parent” platform’s languages (F# with C#, Scala
with Java, and Swift with Objective-C). Because of this, developers can immedi-
ately take advantage of the existing tooling, package management, and industry-
strength libraries, while slowly ramping up the idiomatic usage of new language
features. However, the same compatibility requirements dictate certain limita-
tions in the languages, making their design less than fully satisfactory from the
functional programming viewpoint.
It is now easy to see why the adoption rate of the “industrial” group of languages
is much higher⁹ than that of the “academic” languages. The transition to the functional
paradigm is also smoother for software developers because F#, Scala, and
Swift seamlessly support the familiar object-oriented programming paradigm. At
the same time, those new “industrial” functional languages still have logically
complete type systems, which gives developers an important benefit of type-safe
domain modeling.
8 OCaml has recursive and polymorphic product and co-product types that can be freely combined
with object-oriented types. Haskell removes all side effects from the language and supports
type-level functions of arbitrarily high order.
9 https://www.tiobe.com/tiobe-index/, archived in 2019 at http://archive.is/RsNH8
Nevertheless, the type systems of those languages are not equally powerful.
For instance, F# and Swift are similar to OCaml in many ways but omit OCaml’s
parameterized modules and some other features. Of all the mentioned languages,
only Scala and Haskell directly support typeclasses and higher-order functions on
types, which are helpful for expressing abstractions such as automatically paral-
lelized data sets or asynchronous data streams.
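To give an idea of what these features look like, here is a minimal sketch of a typeclass whose type parameter is itself a type constructor. The names Functor and double are chosen for this example (similar definitions are found in several Scala libraries, but the code below does not depend on any of them):

```scala
object FunctorSketch {
  // F[_] is a type constructor used as a type parameter (a "higher-order type").
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  object Functor {
    // Typeclass instances for two unrelated type constructors.
    implicit val listFunctor: Functor[List] = new Functor[List] {
      def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
    }
    implicit val optionFunctor: Functor[Option] = new Functor[Option] {
      def map[A, B](fa: Option[A])(f: A => B): Option[B] = fa.map(f)
    }
  }

  // Written once; works for any F that has a Functor instance.
  def double[F[_]](fa: F[Int])(implicit ev: Functor[F]): F[Int] = ev.map(fa)(_ * 2)
}
```

The function double does not know whether its argument is a list, an optional value, or some other container type. A parallelized data set or a data stream could be supported by adding another typeclass instance, without changing the code of double.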
To see the impact of these advanced features, consider LINQ, a domain-specific
language for database queries on .NET, implemented in C# and F# through a spe-
cial built-in syntax supported by Microsoft’s compilers. Analogous functionality
is provided in Scala as a library, without any need to modify the Scala compiler, by
several open-source projects such as Slick and Quill. Similar libraries exist for
Haskell, but not for languages with less powerful type systems.
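The following toy example sketches how a query language can be defined as a library. It is not the API of Slick or Quill: the type Query and the data types User and Order are assumptions made for this illustration, and the "queries" run on in-memory lists instead of a database. The point is that Scala's for/yield syntax works with any type that defines the methods map, flatMap, and withFilter:

```scala
object QuerySketch {
  // A toy "table" holding its rows in memory.
  final case class Query[A](rows: List[A]) {
    def map[B](f: A => B): Query[B] = Query(rows.map(f))
    def flatMap[B](f: A => Query[B]): Query[B] = Query(rows.flatMap(a => f(a).rows))
    def withFilter(p: A => Boolean): Query[A] = Query(rows.filter(p))
  }

  final case class User(id: Int, name: String)
  final case class Order(userId: Int, amount: Double)

  val users: Query[User] = Query(List(User(1, "Ann"), User(2, "Bob")))
  val orders: Query[Order] = Query(List(Order(1, 10.0), Order(2, 250.0), Order(1, 120.0)))

  // The compiler translates this into calls of flatMap, withFilter, and map on Query.
  val bigSpenders: Query[(String, Double)] = for {
    u <- users
    o <- orders
    if o.userId == u.id && o.amount > 100.0
  } yield (u.name, o.amount)
}
```

The query syntax comes from the ordinary methods of the Query class and requires no special support in the compiler.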

Summary
Only Scala has all of the features required for industrial-grade functional pro-
gramming:

1. Functional collections in the standard library.

2. A sophisticated type system with support for typeclasses and higher-order
types.

3. Seamless compatibility with a mature software ecosystem (JVM).

Based on this assessment, it appears that Scala is a good choice of an implementation
language for big data engineering.

List of Tables
1.1 Translating mathematics into code.
1.2 Nameless functions in various programming languages.
2.1 Implementing mathematical induction.
4.1 Some notation for symbolic reasoning about code.
5.1 The correspondence between types and CH-propositions.
5.2 Proof rules for the constructive logic.
5.3 Examples of logical formulas that are true theorems in Boolean logic.
5.4 Examples of logical formulas that are not true in Boolean logic.
5.5 Logic identities with disjunction and conjunction, and the possible type equivalences.
5.6 Logical identities with implication, and the corresponding type equivalences and arithmetic identities.
5.7 Proof rules of constructive logic are true also in the Boolean logic.

List of Figures
3.1 The disjoint domain represented by the type RootsOfQ.
5.1 Proof tree for the sequent ∅ ⊢ 𝜒 ⇒ 𝜒 ∧ 𝜒.
5.2 Proof tree for sequent (5.6).
5.3 Axioms and derivation rules of the LJ algorithm. Each of the rules “(Left∧𝑖)” and “(Right∨𝑖)” have two versions, with 𝑖 = 1 or 𝑖 = 2.
5.4 Proof transformers for the rules of the LJ algorithm.
5.5 Proof transformers for the four new rules of the LJT algorithm.
5.1 Mixing incompatible data types produces nonsensical results.
5.2 The Pyongyang method of error-free software engineering.

