Typed Lambda Calculus / Calculus of Constructions
Helmut Brandl
(firstname dot lastname at gmx dot net)
Version 1.03
Abstract
In this text we describe the calculus of constructions as one of
the most interesting typed lambda calculi. It is a sweet spot in
the design space of typed lambda calculi because it can express an
immense set of computable functions and it can express a large set of
logical propositions including their proofs.
The paper is written in a textbook style. The needed concepts
are introduced step by step and the proofs are laid out in sufficient
detail for newcomers to the subject. Readers who are familiar
with basic concepts of lambda calculus and computer science can get
a good understanding of typed lambda calculus.
For comments, questions or error reports, feel free to open an issue
at https://github.com/hbr/Lambda-Calculus
Contents
1 Introduction
2 Basic Mathematics
2.1 Sets and Relations
2.2 Logic
2.3 Inductively Defined Sets
2.4 Induction Proofs
2.5 Inductively Defined Relations
2.6 Term Grammar
2.7 Recursive Functions
3 The Calculus
3.1 Sorts
3.2 Terms
3.3 Free Variables
3.4 Contexts
3.5 Substitution
3.6 Beta Reduction
3.7 Beta Equivalence
4 Confluence
4.1 Overview
4.2 Diamonds and Confluence
4.3 Parallel Reduction Relation
4.4 Equivalent Terms
4.5 Uniqueness of Normal Forms
4.6 Equivalent Binders
4.7 Binders are not Equivalent to Variables or Sorts
5 Typing
5.1 Typing Relation
5.2 Basic Definitions
5.3 Start Lemma
5.4 Thinning Lemma
5.5 Generation Lemmata
5.6 Substitution Lemma
5.7 Type of Types
5.8 Subject Reduction
5.9 Type Uniqueness
5.10 Kinds
6.11 Soundness Theorem
6.12 Strong Normalization Proof
6.13 Logical Consistency
7 Bibliography
1 Introduction
Computation What is a computer? What comes into your mind
if you think of computers? Your laptop, your smartphone? A super-
computer? ... Would you say that a human being is a computer?
Looking 100 years back, computers had not yet been invented, at
least not modern electronic computers. However, computers existed.
A computer was a man or a woman who carried out computations.
The term computer in the sense of one who computes has been
in use since the early 17th century. NASA and its predecessor
NACA hired men and women to carry out computations.
A computer had no authority. He or she had to carry out computations
following some fixed rules.
Carrying out computations according to fixed rules has been present
in the history of mankind for more than 2000 years. Think of
adding and multiplying numbers, using Gaussian elimination to solve
a system of linear equations, using Newton’s method to find the root
of a nonlinear function etc.
The notion of computing i.e. carrying out some steps according to
fixed rules had been intuitively clear. Whenever somebody said that
he had found a method to calculate something and had written down
a recipe to do it, everyone could check the method by trying to carry
out the computation according to the written rules. If there are no
ambiguities, then the method can be regarded as a general recipe or
an algorithm.
• Alan Turing’s automatic machine (today called Turing machine)
• Alonzo Church’s lambda calculus
It can be shown that in this space of computable functions there
are problems which cannot be decided. E.g. the halting problem is
undecidable. There is no function which takes a valid program and
its input as input and returns true if the program terminates on the
input and false if the program does not terminate on the input.
Having a clear and formal definition of computability, many prob-
lems have been proved to be unsolvable by computation.
In this paper we look only into lambda calculus, because lambda
calculus is not only a universal model of computation like the other
two, there is much more in it.
Let’s see what this more is. There is something unsatisfactory in
lambda calculus which led to significant improvements.
isomorphism. The Curry-Howard isomorphism connects two seem-
ingly unrelated formalisms: computations and proof systems.
The types in computations are connected to propositions in logic
via the Curry-Howard isomorphism. The terms of a certain type are
proofs of the corresponding proposition. A proof of an implication
A ⇒ B is a function (i.e. a computation) mapping a proof of A i.e.
a term of type A to a proof of B i.e. a term of type B. The identity
function is a proof of A ⇒ A.
A proof of A ∧ (A ⇒ B) ⇒ B, the modus ponens rule, is nearly
trivial in this computation analogy. It is a function taking two arguments:
an object a of type A and a function f of type A ⇒ B.
It returns an object of type B by just applying f to a.
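The two proofs just described can be written down as terms. The following is a minimal sketch in Lean 4, a proof assistant based on a variant of the calculus of constructions. The names identity and modusPonens are chosen here for illustration and do not come from the text.

```lean
-- The identity function is a proof of A ⇒ A.
theorem identity (A : Prop) : A → A :=
  fun a => a

-- Modus ponens: from a proof a of A and a proof f of A ⇒ B
-- we obtain a proof of B by applying f to a.
theorem modusPonens (A B : Prop) : A ∧ (A → B) → B :=
  fun ⟨a, f⟩ => f a
```

In both cases the proof term is an ordinary function, which is the content of the Curry-Howard view sketched above.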
However if we want lambda calculus to be a model for logic and
proof systems, then termination becomes crucial. Proofs must be
finite. An infinite proof makes no sense. Nobody can check the cor-
rectness of an infinite proof.
In the computational world the definition of a function directly
in terms of itself e.g. f (x : A) : B := f (x) is welltyped but useless.
Calling f with an object a of type A ends up in an infinite recursion.
The counterpart of this nonterminating function f in logic is a
proof of A ⇒ B which uses a proof of A ⇒ B. This circular logic
is not allowed. Proofs which use circular logic are not wellformed.
Anything can be proved by using circular logic.
and arbitrary functions formed over the type variables { U → U ,
U → V , U → (V → W ), . . . }. The computational power of
simply typed lambda calculus is fairly limited. But it is already
a model for the implicational fragment of natural deduction.
2. Polymorphic functions (System F ): In the simply typed lambda
calculus it is not possible to express the identity function which
works for arbitrary types. Each type needs its own identity function.
Girard's System F allows types as arguments for functions.
Now it is possible to express a polymorphic identity function. It
is a function receiving two arguments: a type and a term of this
type. The body of the function just returns the second argument.
This addition of polymorphic functions makes System F substantially
more powerful than simply typed lambda calculus. Functions
operating on booleans, natural numbers, lists of values,
pairs, trees etc. can be expressed in the calculus. As a logic it
can express a proof system for second order intuitionistic predicate
calculus.
3. Polymorphic types i.e. computing types (System Fω ): System
F already allows polymorphic functions which operate on lists
of a certain element type, pairs of two elements of two different
types, trees of certain element types etc.
However a list of integers and a list of booleans are different
types. In system F it is not possible to define functions which
take type arguments and return type results. E.g. it is not
possible to define a function L which takes an element type A as
argument and returns the type LA of a list where the elements
are of type A.
System Fω adds this possibility to compute types. This adds the
necessity to add types of types. It is necessary to express how
many arguments a type function takes. The arguments might
be types or type functions. The Haskell programming language
which is based to some extent on system Fω adds kinds. The
kind ∗ is the type of a type. The kind ∗ → ∗ is the type of a
type function which takes a type argument and returns a type. The
kind (∗ → ∗) → ∗ takes a type function and returns a type etc.
4. Dependent types (calculus of constructions): We need another
dimension to express interesting logical propositions. We want
to express the proposition that a certain number is a prime num-
ber. Propositions are types in the Curry-Howard isomorphism,
therefore this proposition is a type. However a number is cer-
tainly a computational object. In order to express the predicate
is prime we need functions which map a number to a proposition
i.e. a type.
Now we can express the proposition which states that all ob-
jects x of a certain type A have a certain property P i.e. the
proposition ∀(x : A).P x. P x is a type which depends on the
computational object x. Therefore it is called a dependent type.
A proof of the proposition ∀(x : A).P x is a function which takes
an object x of type A and returns an object of type P x. In the
calculus of constructions we express the proposition ∀(x : A).P x
as the type $\Pi x^{A}.\, P\,x$.
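The steps of this progression can be illustrated with a small Lean 4 sketch. Lean is used here only as a convenient notation; the names polyId and Pair are hypothetical and chosen for this example.

```lean
-- System F style: the polymorphic identity takes a type and a term of that type.
def polyId (A : Type) (a : A) : A := a

-- System Fω style: a type function which takes a type and returns a type.
def Pair (A : Type) : Type := A × A

-- Dependent types: a proof of ∀ (x : A), P x is a function which maps
-- each object x of type A to a proof of P x.
example (A : Type) (P : A → Prop) (h : ∀ x : A, P x) (a : A) : P a :=
  h a
```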
The calculus of constructions has enormous computational power
and is very expressive as a logic and proof system. Therefore it is a
sweet spot in the design space of typed lambda calculi.
fines the welltyped terms. Terms in the calculus formed according
to section 3 (The Calculus) are wellformed, but not necessarily
welltyped. In order to be welltyped they have to be valid terms
in the typing relation.
• Section 6 (Proof of Strong Normalization): This section contains
the proof that all welltyped terms in the calculus of constructions
are strongly normalizing, i.e. every welltyped term reduces in
a finite number of computation steps to a normal form which is
the result of the computation.
The proof of strong normalization is rather involved and requires
a lot of machinery to go through. All needed concepts and theo-
rems are explained in detail. Therefore this chapter is the longest
in the paper.
Strong normalization implies consistency when regarded as a
model of a proof system for logic. Consistency in logic means
that it is free of contradictions. A type system is consistent as a
logic if there are types which are uninhabited in the empty con-
text. In the Curry-Howard isomorphism an uninhabited type
corresponds to a proposition which is impossible to prove.
The proof of consistency is the last subsection in this chapter.
2 Basic Mathematics
2.1 Sets and Relations
It is assumed that the reader is familiar with the following concepts:
1. Basic set notation: Elements of a set a ∈ A, Subsets A ⊆ B
defined as a ∈ A implies a ∈ B.
2. Endorelations: An (endo-) relation r over the set A is a set of
pairs (a, b) (i.e. r ⊆ A × A) where a and b are drawn from the
set A. We write (a, b) ∈ r more pictorially as $a \xrightarrow{r} b$.
3. n-ary Relations: An endorelation is a binary relation where the
domain and the codomain are the same. An n-ary relation over
the domains $A_1, A_2, \ldots, A_n$ is a subset of the cartesian product
$A_1 \times A_2 \times \ldots \times A_n$. In this paper we use a ternary relation
as the typing relation.
4. An intuitive understanding of the reflexive transitive closure r∗
of a relation r, the reflexivity, symmetry and transitivity of a
relation and similar concepts.
5. Logic: The logical connectives of conjunction a ∧ b, disjunction
a ∨ b, implication a ⇒ b, existential quantification ∃x.p(x) and universal
quantification ∀x.p(x). We use the symbol ⊥ for falsity i.e. a
logical statement which can never be proved (a contradiction).
6. An intuitive understanding of mathematical functions from the
set A to the set B. I.e. A → B is the set of functions mapping
each element of the set A to a unique element of the set B.
2.2 Logic
We often have to state that some premises $p_1, p_2, \ldots, p_n$ have a certain
conclusion c
\[ p_1 \land p_2 \land \ldots \land p_n \Rightarrow c \]
In many cases this notation with logical connectives is clumsy and
not very readable. We use the rule notation
\[ \frac{p_1 \quad p_2 \quad \ldots \quad p_n}{c} \]
to express the same fact.
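If some variables in logical statements are not quantified, then
universal quantification is assumed.
\[ \frac{a \in A}{a \in B} \quad \text{means} \quad \forall a.\ \frac{a \in A}{a \in B} \]
Assume that a logical statement has more than one variable. The
universal quantification can be expressed at the highest level or pushed
down to the first appearance of the variable. Moreover logical conjunction
is commutative. Therefore the following interpretations are
equivalent. The rule
\[ \frac{a \in A \quad b \in A}{a \xrightarrow{r} b} \]
means any of
\[
\forall a b.\ \frac{a \in A \quad b \in A}{a \xrightarrow{r} b}
\qquad
\forall a.\ \frac{a \in A}{\forall b.\ \dfrac{b \in A}{a \xrightarrow{r} b}}
\qquad
\forall b.\ \frac{b \in A}{\forall a.\ \dfrac{a \in A}{a \xrightarrow{r} b}}
\]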
All three settings of the universal quantifiers are logically equiva-
lent.
We say that E is the smallest set which satisfies the above rules.
Whenever there is an element m ∈ E it can be in the set only if it
is in the set because of one of the two rules which define the set. 0 is
in E because of the first rule. 2 is in E because of the second rule where
n = 0 and 0 is in the set. 4 is in the set because of the second rule
where n = 2 and 2 is in the set.
A number m ∈ E can be arbitrarily big, but it can be in the set
only because of finitely many applications of the rules which define
the set.
\[ \frac{n \in E}{p(n)} \]
Proof: By induction on n ∈ E.
1.
\[ \begin{array}{c|c} 0 \in E & p(0) \end{array} \]
p(0) is valid because . . .
2.
\[ \begin{array}{c|c} n \in E & p(n) \\ \hline n + 2 \in E & p(n + 2) \end{array} \]
In order to prove p(n + 2) in the lower right corner
we assume n ∈ E, p(n), n + 2 ∈ E. p(n + 2) is valid
because . . .
For each rule in the inductive definition of the set we write a matrix.
The left part of the matrix is just the rule. The goal is always in
the lower right corner of the matrix. For each recursive premise (i.e.
a premise stating that some other element is already in the set) we get
an induction hypothesis.
In order to prove the goal in the lower right corner we can assume
all other statements in the matrix.
This schema helps to lay out all premises and all induction hypotheses
precisely.
As an example we prove that for all even numbers n ∈ E there
exists a number m such that n = 2m using the above schema.
Theorem: For all even numbers n ∈ E there exists a num-
ber m with n = 2m.
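\[ \frac{n \in E}{\exists m.\ n = 2m} \]
Proof: By induction on n ∈ E.
1.
\[ \begin{array}{c|c} 0 \in E & \exists m.\ 0 = 2m \end{array} \]
m = 0 satisfies the goal.
2.
\[ \begin{array}{c|c} n \in E & \exists i.\ n = 2i \\ \hline n + 2 \in E & \exists m.\ n + 2 = 2m \end{array} \]
By the induction hypothesis we get a number i which
satisfies n = 2i. Therefore n + 2 = 2i + 2 = 2(i + 1).
The number m = i + 1 satisfies the goal in the lower
right corner.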
Note that the name of the bound variable in existential and universal
quantification is arbitrary. It is good practice to choose names
which do not interfere. In the case for the recursive rule we have chosen
the different names i and m in the existential quantification. This
makes the proof more readable for humans.
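The same induction proof can be checked mechanically. The following Lean 4 sketch encodes the set E as an inductive predicate with the two rules used above (0 ∈ E, and n ∈ E implies n + 2 ∈ E). The names E, zero, step and E_double are chosen for this example, and the sketch assumes a Lean 4 version which provides the omega tactic for linear arithmetic.

```lean
-- The set E as an inductive predicate with the two defining rules.
inductive E : Nat → Prop
  | zero : E 0
  | step (n : Nat) : E n → E (n + 2)

-- Every n ∈ E can be written as 2 * m.
theorem E_double (n : Nat) (h : E n) : ∃ m, n = 2 * m := by
  induction h with
  | zero => exact ⟨0, rfl⟩            -- m = 0 satisfies the goal
  | step n _ ih =>
    cases ih with
    | intro i hi =>                    -- induction hypothesis: n = 2 * i
      exact ⟨i + 1, by omega⟩          -- n + 2 = 2 * (i + 1)
```

The two branches of the induction correspond one to one to the two matrices of the proof above.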
2.5 Inductively Defined Relations
An (endo-) relation r over the set A is just a subset of the cartesian
product A × A. I.e. a relation is just a set which can be defined
inductively. Since inductively defined relations are used extensively
in the paper we explain inductive definitions and induction proofs on
relations as well.
Note that the following is not restricted to binary endorelations
over a carrier set A. It is possible to define n-ary relations over different
carrier sets as subsets of the cartesian product $A_1 \times A_2 \times \ldots \times A_n$
as well. E.g. the typing relation in the calculus of constructions (see
section 5) is a ternary relation.
As an example we define the reflexive transitive closure of a relation
r inductively.
The reflexive transitive closure r∗ of the relation r is defined
by the rules
1.
\[ a \xrightarrow{r^*} a \]
2.
\[ \frac{a \xrightarrow{r^*} b \quad b \xrightarrow{r} c}{a \xrightarrow{r^*} c} \]
It should be intuitively clear that r∗ is reflexive and transitive.
The first rule guarantees reflexivity and the second rule guarantees
transitivity. While reflexivity is expressed explicitly in the first rule,
transitivity is expressed only indirectly in the second rule. Therefore we
need a proof of transitivity.
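Theorem: The reflexive transitive closure r∗ of the relation
r is transitive.
\[ \frac{a \xrightarrow{r^*} b \quad b \xrightarrow{r^*} c}{a \xrightarrow{r^*} c} \]
Proof: Assume $a \xrightarrow{r^*} b$ and do induction on $b \xrightarrow{r^*} c$.
1.
\[ \begin{array}{c|c} b \xrightarrow{r^*} b & a \xrightarrow{r^*} b \end{array} \]
The goal $a \xrightarrow{r^*} b$ is trivial because it is an assumption.
2.
\[
\begin{array}{c|c}
b \xrightarrow{r^*} c_0 & a \xrightarrow{r^*} c_0 \\
c_0 \xrightarrow{r} c & \\ \hline
b \xrightarrow{r^*} c & a \xrightarrow{r^*} c
\end{array}
\]
By the induction hypothesis we have $a \xrightarrow{r^*} c_0$. Together
with the second premise $c_0 \xrightarrow{r} c$ and the second rule in
the definition of r∗ we get the goal in the lower right
corner.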
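For a finite carrier set the inductive definition of r∗ can be executed directly: start with the pairs demanded by the first rule and apply the second rule until nothing new is added. The following Python sketch does this and then checks transitivity on a small example. The function names rt_closure and is_transitive and the example relation are assumptions made for illustration.

```python
def rt_closure(carrier, r):
    """Smallest relation which satisfies the two rules defining r*."""
    closure = {(a, a) for a in carrier}          # rule 1: a ->* a
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):             # premise a ->* b
            for (b2, c) in r:                    # premise b -> c
                if b2 == b and (a, c) not in closure:
                    closure.add((a, c))          # rule 2: a ->* c
                    changed = True
    return closure


def is_transitive(rel):
    """Check a -> b and b -> c imply a -> c for all pairs in rel."""
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)


carrier = {1, 2, 3, 4}
r = {(1, 2), (2, 3), (3, 4)}
r_star = rt_closure(carrier, r)
print(sorted(r_star))
print(is_transitive(r_star))
```

The check is of course only a test on one finite example; the induction proof above establishes transitivity for every relation.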
Theorem: All natural numbers n have the property p(n).
Proof: By induction on the structure of n:
1. p(0): The goal p(0) is valid because . . .
2. p(n′ ): By the induction hypothesis p(n) is valid. This
implies p(n′ ) because . . .
Note that such a proof corresponds one to one to a proof on in-
ductive sets shown above.
Furthermore note that a grammar rule in the term grammar might
be recursive in more than one subterm. In that case we get more than
one induction hypothesis.
3 The Calculus
In this section we introduce the terms of the calculus of constructions
and the way to do computations with them.
Lambda terms in untyped lambda calculus are either variables
x, y, z, . . ., abstractions λx.e or applications ab.
The calculus of constructions is a typed lambda calculus where
terms and types are expressed within the same syntax. Since all well-
typed terms must have a type we need types of types. The types of
types in the calculus of constructions are the two sorts P and U.
Since the calculus is typed, all variables must have a type. Therefore
all variables in binders like abstractions carry their types, as in
$\lambda x^{A}.\, e$. In order to assign types to free variables, contexts
are introduced.
Furthermore it is necessary to express functions A → B from one
type A to another type B. The calculus of constructions includes
dependent types, i.e. the result type B of a function might depend
on the argument. In order to express this, a product type of the
form $\Pi x^{A}.\, B$ is used which describes the type of a function from an
argument of type A to the result of type B which might depend on
the argument.
After introducing the terms of the calculus of constructions we
define free variables, substitution, beta reduction, beta equivalence and
prove certain theorems which state some interesting properties about
these definitions.
In this section we do not yet define what it means for a term to be
welltyped. This is done in section 5 (Typing).
3.1 Sorts
In the calculus of constructions terms and types are in the same syn-
tactic category. All welldefined terms have types and therefore types
must have types as well. In order to have types of types we start with
the introduction of sorts which are the types of types.
Definition 3.1. Sorts: There are the two sorts P and U in the cal-
culus of constructions.
Sorts or universes are the types of types. Sorts are usually abbre-
viated by the variable s.
In many texts about typed lambda calculus, e.g. Barendregt1993 [1]
and Geuvers1994 [2], the symbol ∗ is used instead of P and the symbol
□ is used instead of U. This has found its way into the Haskell
programming language where ∗, ∗ → ∗, (∗ → ∗) → ∗, . . . are kinds.
We use the symbol P in order to emphasize the Curry-Howard
correspondence of propositions as types. Many interesting types T in
the calculus of constructions have type P i.e. T : P. Within the
Curry-Howard correspondence these types are propositions.
The symbol U is just a higher universe than P. A more general
lambda calculus, the extended calculus of constructions, has an infinite
hierarchy of universes $P, U_0, U_1, \ldots$ where P is the impredicative
universe and the $U_i$ are the predicative universes.
In the calculus of constructions the term P has type U and the
term U is not welltyped. In the extended calculus of constructions
the term $U_i$ has type $U_{i+1}$ i.e. all terms of the form $U_i$ for any i are
welltyped.
The decision to use the symbols P and U instead of ∗ and □ is just
a matter of taste.
3.2 Terms
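Definition 3.2. The terms are defined by the following grammar where s
ranges over sorts, x ranges over some countably infinite set of variables
and t ranges over terms.
\[
\begin{array}{rcll}
t & ::= & s & \text{sorts} \\
  & \mid & x & \text{variable} \\
  & \mid & \Pi x^{t}.\, t & \text{product} \\
  & \mid & \lambda x^{t}.\, t & \text{abstraction} \\
  & \mid & t\, t & \text{application}
\end{array}
\]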
A variable which is not free is called a bound variable. E.g. if x is
a free variable in the term e, it is no longer free in $\lambda x^{A}.\, e$. Therefore
we call $\lambda x^{A}.\, e$ a binder, because it makes the variable x bound. The
same applies to the term $\Pi x^{A}.\, B$ where the variable x is bound, but
can appear free in B.
Note that the binders $\lambda x^{A}.\, e$ and $\Pi x^{A}.\, B$ make the variable x bound
only in the subterms e and B, but not in the subterm A.
It is possible to rename bound variables within a term. The renaming
of a bound variable does not change the term. We consider two
terms which differ only in the names of bound variables as identical.
Examples of some identical terms:
\[ \lambda x^{P}.\, x = \lambda y^{P}.\, y \]
\[ \Pi x^{z}.\, x = \Pi y^{z}.\, y \]
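To make the notions of free variables and of identity up to renaming concrete, here is a small Python sketch. The encoding of terms as nested tuples and the helper names (var, lam, pi, app, free_vars, alpha_eq) are assumptions of this sketch and not part of the calculus. Note how a binder removes its variable only from the free variables of the body, not from those of the type annotation.

```python
# Terms as nested tuples (an encoding chosen for this sketch):
#   ("sort", s), ("var", x), ("pi", x, A, B), ("lam", x, A, e), ("app", f, a)
P = ("sort", "P")

def var(x): return ("var", x)
def lam(x, ty, body): return ("lam", x, ty, body)
def pi(x, ty, body): return ("pi", x, ty, body)
def app(f, a): return ("app", f, a)


def free_vars(t):
    """Set of the names of the free variables of the term t."""
    tag = t[0]
    if tag == "sort":
        return set()
    if tag == "var":
        return {t[1]}
    if tag == "app":
        return free_vars(t[1]) | free_vars(t[2])
    _, x, ty, body = t           # binder: x is bound in body, but not in ty
    return free_vars(ty) | (free_vars(body) - {x})


def alpha_eq(t, u, bt=(), bu=()):
    """Equality up to renaming of bound variables.
    bt and bu list the binder names in scope, innermost first."""
    if t[0] != u[0]:
        return False
    if t[0] == "sort":
        return t[1] == u[1]
    if t[0] == "var":
        it = bt.index(t[1]) if t[1] in bt else None
        iu = bu.index(u[1]) if u[1] in bu else None
        if it is None and iu is None:
            return t[1] == u[1]          # both free: names must agree
        return it == iu                  # both bound by the same binder
    if t[0] == "app":
        return alpha_eq(t[1], u[1], bt, bu) and alpha_eq(t[2], u[2], bt, bu)
    # binder: the type lives outside the binder, the body inside
    return (alpha_eq(t[2], u[2], bt, bu)
            and alpha_eq(t[3], u[3], (t[1],) + bt, (u[1],) + bu))


print(free_vars(lam("x", var("A"), app(var("x"), var("y")))))
print(alpha_eq(lam("x", P, var("x")), lam("y", P, var("y"))))
```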
3.4 Contexts
All bound variables get their types by their corresponding binders.
For free variables we need types as well. In order to assign types to
free variables we define contexts.
\[
\begin{array}{rcll}
\Gamma & := & [] & \text{empty context} \\
       & \mid & \Gamma, x^{A} & \text{one more variable } x \text{ with its type } A
\end{array}
\]
3.5 Substitution
Definition 3.5. Substitution: The term t[y := u] is the term t where
the term u is substituted for all free occurrences of the variable y. It
is defined as a recursive function which iterates over all subterms until
variables or sorts are encountered.
\[
\begin{array}{rcll}
s[y := u] & := & s \\
x[y := u] & := & \begin{cases} u & y = x \\ x & y \ne x \end{cases} \\
(a\, b)[y := u] & := & a[y := u]\; b[y := u] \\
(\lambda x^{A}.\, e)[y := u] & := & \lambda x^{A[y := u]}.\, e[y := u] & y \ne x,\ x \notin \mathrm{FV}(u) \\
(\Pi x^{A}.\, B)[y := u] & := & \Pi x^{A[y := u]}.\, B[y := u] & y \ne x,\ x \notin \mathrm{FV}(u)
\end{array}
\]
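The side conditions y ≠ x and x ∉ FV(u) can always be satisfied by renaming the bound variable first. The following Python sketch of substitution makes this renaming explicit. It is only an illustration: the tuple encoding of terms, the function names and the scheme for generating fresh names (appending a number to the old name) are assumptions of this sketch.

```python
import itertools

# Terms: ("sort", s), ("var", x), ("pi", x, A, B), ("lam", x, A, e), ("app", f, a)

def free_vars(t):
    tag = t[0]
    if tag == "sort":
        return set()
    if tag == "var":
        return {t[1]}
    if tag == "app":
        return free_vars(t[1]) | free_vars(t[2])
    _, x, ty, body = t
    return free_vars(ty) | (free_vars(body) - {x})


def subst(t, y, u):
    """The term t[y := u]."""
    tag = t[0]
    if tag == "sort":
        return t
    if tag == "var":
        return u if t[1] == y else t
    if tag == "app":
        return ("app", subst(t[1], y, u), subst(t[2], y, u))
    _, x, ty, body = t
    ty = subst(ty, y, u)                 # x is not bound in the type
    if x == y:
        return (tag, x, ty, body)        # y does not occur free in the body
    if x in free_vars(u):
        # Rename the bound variable so that x is not in FV(u).
        avoid = free_vars(u) | free_vars(body) | {y}
        fresh = next(n for n in (f"{x}{i}" for i in itertools.count())
                     if n not in avoid)
        body = subst(body, x, ("var", fresh))
        x = fresh
    return (tag, x, ty, subst(body, y, u))


P = ("sort", "P")
# (λx^P. y)[y := x] must not capture the substituted x.
result = subst(("lam", "x", P, ("var", "y")), "y", ("var", "x"))
print(result)
```

In the example the binder is renamed before the substitution, so the substituted variable x stays free in the result.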
The formal proof goes by induction on the structure of t.
1. If t is a sort then the equality is trivial because a sort does not
have any variables.
2. If t is an application, an abstraction or a product, then the goal
follows immediately from the induction hypotheses.
3. The only interesting case is when t is a variable. Let’s call the
variable z. Then we have to distinguish the cases that z is x or
z is y or z is different from x and y.
(a) Case z = x:
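\[
\begin{array}{rcl}
x[y := b][x := a] & = & x[x := a] \\
                  & = & a
\end{array}
\]
\[
(\lambda x^{A}.\, (\lambda y^{B}.\, t)\, b)\, a
\xrightarrow{\beta} (\lambda y^{B}.\, t[x := a])\, b[x := a]
\xrightarrow{\beta} t[x := a][y := b[x := a]]
\]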
3.6 Beta Reduction
Like in untyped lambda calculus, computation is done via beta reduction.
A beta redex has the form $(\lambda x^{A}.\, e)\, a$ which reduces to the reduct
e[x := a]. Beta reduction can be done in any subterm of a term.
The intuitive meaning of beta reduction is quite clear. The term
$\lambda x^{A}.\, e$ is a function with a formal argument x of type A. The body
e is the implementation of this function which can use the variable
x. As with any programming language which supports functions, the
name of the formal argument is irrelevant. An arbitrary name can be
chosen and the variable is only used internally and is not visible to
the outside world. We can apply the function to an argument a i.e.
form the term $(\lambda x^{A}.\, e)\, a$. In order to compute the result we use the
implementation e and substitute the actual argument a for the formal
argument x i.e. we form e[x := a].
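Definition 3.8. Beta reduction is a binary relation $a \xrightarrow{\beta} b$ where the
term a reduces to the term b. It is defined by the rules
1. Redex:
\[ (\lambda x^{A}.\, e)\, a \xrightarrow{\beta} e[x := a] \]
2. Reduce function:
\[ \frac{f \xrightarrow{\beta} g}{f\, a \xrightarrow{\beta} g\, a} \]
3. Reduce argument:
\[ \frac{a \xrightarrow{\beta} b}{f\, a \xrightarrow{\beta} f\, b} \]
4. Reduce abstraction argument type:
\[ \frac{A \xrightarrow{\beta} B}{\lambda x^{A}.\, e \xrightarrow{\beta} \lambda x^{B}.\, e} \]
5. Reduce abstraction inner term:
\[ \frac{e \xrightarrow{\beta} f}{\lambda x^{A}.\, e \xrightarrow{\beta} \lambda x^{A}.\, f} \]
6. Reduce product argument type:
\[ \frac{A \xrightarrow{\beta} B}{\Pi x^{A}.\, C \xrightarrow{\beta} \Pi x^{B}.\, C} \]
7. Reduce product result type:
\[ \frac{B \xrightarrow{\beta} C}{\Pi x^{A}.\, B \xrightarrow{\beta} \Pi x^{A}.\, C} \]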
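Since beta reduction is a relation and not a function, a term can have several one-step reducts. The following Python sketch enumerates all of them by following the seven rules. The tuple encoding of terms, the function names and the fresh-name scheme inside subst are assumptions made for this illustration.

```python
import itertools

# Terms: ("sort", s), ("var", x), ("pi", x, A, B), ("lam", x, A, e), ("app", f, a)

def free_vars(t):
    tag = t[0]
    if tag == "sort":
        return set()
    if tag == "var":
        return {t[1]}
    if tag == "app":
        return free_vars(t[1]) | free_vars(t[2])
    _, x, ty, body = t
    return free_vars(ty) | (free_vars(body) - {x})


def subst(t, y, u):
    """Capture avoiding substitution t[y := u]."""
    tag = t[0]
    if tag == "sort":
        return t
    if tag == "var":
        return u if t[1] == y else t
    if tag == "app":
        return ("app", subst(t[1], y, u), subst(t[2], y, u))
    _, x, ty, body = t
    ty = subst(ty, y, u)
    if x == y:
        return (tag, x, ty, body)
    if x in free_vars(u):                # rename the bound variable first
        avoid = free_vars(u) | free_vars(body) | {y}
        fresh = next(n for n in (f"{x}{i}" for i in itertools.count())
                     if n not in avoid)
        body, x = subst(body, x, ("var", fresh)), fresh
    return (tag, x, ty, subst(body, y, u))


def reducts(t):
    """All terms b with t -> b in one beta reduction step."""
    tag, out = t[0], []
    if tag == "app":
        f, a = t[1], t[2]
        if f[0] == "lam":                                     # rule 1: redex
            out.append(subst(f[3], f[1], a))
        out += [("app", g, a) for g in reducts(f)]            # rule 2
        out += [("app", f, b) for b in reducts(a)]            # rule 3
    elif tag in ("lam", "pi"):
        _, x, ty, body = t
        out += [(tag, x, ty2, body) for ty2 in reducts(ty)]   # rules 4 and 6
        out += [(tag, x, ty, b2) for b2 in reducts(body)]     # rules 5 and 7
    return out                           # sorts and variables do not reduce


A, z = ("var", "A"), ("var", "z")
id_x = ("lam", "x", A, ("var", "x"))
inner = ("app", ("lam", "y", A, ("var", "y")), z)
term = ("app", id_x, inner)              # (λx^A. x) ((λy^A. y) z)
for r in reducts(term):
    print(r)
```

The example term has two reducts: one from contracting the outer redex and one from reducing inside the argument.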
Proof. All rules except the redex rule reduce to something which cannot
syntactically be a sort. Therefore the term has to be a redex which
in general has the form $(\lambda x^{A}.\, e)\, a$. The redex reduces to e[x := a] which
by the substitution to sort lemma 3.6 proves the goal.
Theorem 3.10. Substitute Reduction: A reduction remains valid if we
do the same substitution before and after the reduction.
\[ \frac{t \xrightarrow{\beta} u}{t[y := v] \xrightarrow{\beta} u[y := v]} \]
(Note that iterated application of this lemma results in the more general
statement $t \xrightarrow{\beta^*} u \Rightarrow t[y := v] \xrightarrow{\beta^*} u[y := v]$.)
Proof. Proof by induction on $t \xrightarrow{\beta} u$.
1. Redex: We have to prove the goal
\[ ((\lambda x^{A}.\, e)\, a)[y := v] \xrightarrow{\beta} e[x := a][y := v] \]
2. Reduce function: We have to prove the goal
\[
\begin{array}{c|c}
f \xrightarrow{\beta} g & f[y := v] \xrightarrow{\beta} g[y := v] \\ \hline
f\, a \xrightarrow{\beta} g\, a & (f\, a)[y := v] \xrightarrow{\beta} (g\, a)[y := v]
\end{array}
\]
The validity of the final goal in the lower right corner can be
seen by the following reasoning
\[
\begin{array}{rcl}
(f\, a)[y := v] & = & f[y := v]\; a[y := v] \\
                & \xrightarrow{\beta} & g[y := v]\; a[y := v] \\
                & = & (g\, a)[y := v]
\end{array}
\]
3. Other rules: All other rules follow the same pattern as the proof
of the rule reduce function.
5. Application: Same as abstraction.
Proof. The proofs for product and abstraction follow the same pat-
tern. Therefore we prove only the preservation of products.
Assume $\Pi x^{A}.\, B \xrightarrow{\beta} t$ and do induction on it. Only the two product
rules are syntactically possible. Each one guarantees one alternative
of the goal.
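1. Reflexive:
\[ a \overset{\beta}{\sim} a \]
2. Forward:
\[ \frac{a \overset{\beta}{\sim} b \quad b \xrightarrow{\beta} c}{a \overset{\beta}{\sim} c} \]
3. Backward:
\[ \frac{a \overset{\beta}{\sim} b \quad c \xrightarrow{\beta} b}{a \overset{\beta}{\sim} c} \]
In other words the beta equivalence relation $\overset{\beta}{\sim}$ is the smallest
equivalence relation which contains beta reduction $\xrightarrow{\beta}$.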
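Proof. Assume $a \overset{\beta}{\sim} b$. We prove the goal by induction on $b \overset{\beta}{\sim} c$.
1. Reflexive: Trivial.
2. Forward:
\[
\begin{array}{c|c}
b \overset{\beta}{\sim} c & a \overset{\beta}{\sim} c \\
c \xrightarrow{\beta} d & \\ \hline
b \overset{\beta}{\sim} d & a \overset{\beta}{\sim} d
\end{array}
\]
The goal in the lower right corner is proved by the induction
hypothesis and applying the forward rule.
3. Backward:
\[
\begin{array}{c|c}
b \overset{\beta}{\sim} c & a \overset{\beta}{\sim} c \\
d \xrightarrow{\beta} c & \\ \hline
b \overset{\beta}{\sim} d & a \overset{\beta}{\sim} d
\end{array}
\]
The goal in the lower right corner is proved by the induction
hypothesis and applying the backward rule.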
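Theorem 3.15. Beta equivalence is symmetric.
\[ \frac{a \overset{\beta}{\sim} b}{b \overset{\beta}{\sim} a} \]
Proof. By induction on $a \overset{\beta}{\sim} b$.
1. Reflexive: Trivial.
2. Forward:
\[
\begin{array}{c|c}
a \overset{\beta}{\sim} b_0 & b_0 \overset{\beta}{\sim} a \\
b_0 \xrightarrow{\beta} b & \\ \hline
a \overset{\beta}{\sim} b & b \overset{\beta}{\sim} a
\end{array}
\]
First we use the reflexive rule to derive $b \overset{\beta}{\sim} b$ and then the second
premise $b_0 \xrightarrow{\beta} b$ and the backward rule to derive $b \overset{\beta}{\sim} b_0$.
Then we use the induction hypothesis $b_0 \overset{\beta}{\sim} a$ and the transitivity
of beta equivalence 3.14 to derive the goal $b \overset{\beta}{\sim} a$.
3. Backward:
\[
\begin{array}{c|c}
a \overset{\beta}{\sim} b_0 & b_0 \overset{\beta}{\sim} a \\
b \xrightarrow{\beta} b_0 & \\ \hline
a \overset{\beta}{\sim} b & b \overset{\beta}{\sim} a
\end{array}
\]
Similar reasoning as with the forward rule. The second premise
implies $b \overset{\beta}{\sim} b_0$ and the induction hypothesis and transitivity
imply the goal.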
4 Confluence
4.1 Overview
We want to be able to use the calculus of constructions for computa-
tion. The basic computation step is beta reduction. A computation
in the calculus ends if no more reduction is possible i.e. if the term
has been reduced to a normal form.
However beta reduction is ambiguous. There might be more than
one possibility to make a beta reduction.
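Example 4.1.
\[
\begin{array}{ccc}
 & (\lambda x.\, f\, x\, x)((\lambda y.\, y)\, z) & \\
\swarrow_{\beta} & & \searrow_{\beta} \\
f\, ((\lambda y.\, y)\, z)\, ((\lambda y.\, y)\, z) & & (\lambda x.\, f\, x\, x)\, z \\
\downarrow_{\beta} & & \\
f\, z\, ((\lambda y.\, y)\, z) & & \\
\searrow_{\beta} & & \swarrow_{\beta} \\
 & f\, z\, z &
\end{array}
\]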
In the left path we reduce first the outer redex. This reduction du-
plicates the redex in the argument. Therefore we need three reduction
steps to reach the final term f zz.
In the right path we reduce first the redex in the argument of the
application and then the outer redex. Therefore we need only two
reduction steps to reach the final term f zz.
In that specific example we have shown that both paths end up
in the same result. However we have to prove that this is always the
case. Otherwise the calculus of constructions would be useless as a
calculus.
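The example can be replayed mechanically by enumerating every reduction path. The following Python sketch uses untyped terms, as the example does, and a naive substitution without renaming of bound variables. This shortcut is an assumption which is only safe here because no bound name occurs free in a substituted argument. All function names are chosen for this illustration.

```python
# Untyped terms: ("var", x), ("lam", x, body), ("app", f, a)

def subst(t, x, u):
    """Naive t[x := u]; no renaming, so bound names must not occur free in u."""
    if t[0] == "var":
        return u if t[1] == x else t
    if t[0] == "lam":
        return t if t[1] == x else ("lam", t[1], subst(t[2], x, u))
    return ("app", subst(t[1], x, u), subst(t[2], x, u))


def reducts(t):
    """All one-step beta reducts of t."""
    out = []
    if t[0] == "lam":
        out += [("lam", t[1], b) for b in reducts(t[2])]
    elif t[0] == "app":
        fn, arg = t[1], t[2]
        if fn[0] == "lam":
            out.append(subst(fn[2], fn[1], arg))      # contract the redex
        out += [("app", g, arg) for g in reducts(fn)]
        out += [("app", fn, b) for b in reducts(arg)]
    return out


def paths(t):
    """All maximal reduction paths from t (assumes every path terminates)."""
    rs = reducts(t)
    if not rs:
        return [[t]]
    return [[t] + p for r in rs for p in paths(r)]


f, z = ("var", "f"), ("var", "z")
x, y = ("var", "x"), ("var", "y")
arg = ("app", ("lam", "y", y), z)                           # (λy.y) z
start = ("app", ("lam", "x", ("app", ("app", f, x), x)), arg)

all_paths = paths(start)
results = {p[-1] for p in all_paths}
steps = sorted({len(p) - 1 for p in all_paths})
print(results)   # the set of final terms
print(steps)     # the different path lengths
```

For this term every path ends in the same normal form, and the paths differ only in length. This is an observation about one example, not a proof of the general statement.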
It turns out that confluence is a key property in the proof of unique-
ness of results. We define a relation as confluent if going an arbitrary
number of steps in two directions there are always continuations of the
paths to join them. We want to prove confluence of beta reduction.
Before proving confluence of beta reduction we define the conflu-
ence of a relation by defining first a diamond property of a relation.
The diamond property of a relation is something like a one step con-
fluence. Then we show that the confluence of a relation can be proved
by finding a diamond between it and its reflexive transitive closure.
In a next step we define a parallel beta reduction and prove that it is
a diamond between beta reduction and the reflexive transitive closure
of beta reduction. This proves the confluence of beta reduction.
Having the confluence of beta reduction it is easy to prove that
beta equivalent terms always have a common reduct, normal forms
are unique and some other interesting properties of binders.
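(a) a = b: Trivial since $a \xrightarrow{s^*} a$ is valid by definition.
(b)
\[
\begin{array}{c|c}
a \xrightarrow{r^*} b & a \xrightarrow{s^*} b \\
b \xrightarrow{r} c & \\ \hline
a \xrightarrow{r^*} c & a \xrightarrow{s^*} c
\end{array}
\]
In order to prove the final goal in the lower right corner we
start from the induction hypothesis $a \xrightarrow{s^*} b$.
Since r ⊆ s we have $b \xrightarrow{s} c$ by definition of ⊆ and the second
premise.
Then by definition of the reflexive transitive closure we conclude
the final goal.
2. s∗ ⊆ r∗ : We have to prove the goal
\[ \frac{a \xrightarrow{s^*} b}{a \xrightarrow{r^*} b} \]
We use induction on $a \xrightarrow{s^*} b$.
(a) a = b: Trivial since $a \xrightarrow{r^*} a$ is valid by definition.
(b)
\[
\begin{array}{c|c}
a \xrightarrow{s^*} b & a \xrightarrow{r^*} b \\
b \xrightarrow{s} c & \\ \hline
a \xrightarrow{s^*} c & a \xrightarrow{r^*} c
\end{array}
\]
Start with the induction hypothesis $a \xrightarrow{r^*} b$.
Since s ⊆ r∗ is given we can infer $b \xrightarrow{r^*} c$ from the second premise.
r∗ is transitive and therefore $a \xrightarrow{r^*} c$ must be valid.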
is valid.
Proof. By induction on $a \xrightarrow{r^*} b$.
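1. In the reflexive case we have a = b. We choose d = c which
satisfies the required properties.
2.
\[
\begin{array}{c|c}
a \xrightarrow{r^*} b &
\forall d.\
\begin{array}{ccc}
a & \xrightarrow{r^*} & b \\
\downarrow{\scriptstyle r} & & \downarrow{\scriptstyle r} \\
d & \xrightarrow{r^*} & \exists e
\end{array} \\
b \xrightarrow{r} c & \\ \hline
a \xrightarrow{r^*} c &
\forall d.\
\begin{array}{ccc}
a & \xrightarrow{r^*} & c \\
\downarrow{\scriptstyle r} & & \downarrow{\scriptstyle r} \\
d & \xrightarrow{r^*} & \exists f
\end{array}
\end{array}
\]
To prove the goal in the lower right corner we assume $a \xrightarrow{r} d$ and
try to find some f with the required properties.
From the induction hypothesis we find some e which satisfies
$b \xrightarrow{r} e$ and $d \xrightarrow{r^*} e$.
Since r is a diamond there exists some f with
\[
\begin{array}{ccc}
b & \xrightarrow{r} & c \\
\downarrow{\scriptstyle r} & & \downarrow{\scriptstyle r} \\
e & \xrightarrow{r} & \exists f
\end{array}
\]
By glueing the boxes together we see that f satisfies the required
properties.
\[
\begin{array}{ccccc}
a & \xrightarrow{r^*} & b & \xrightarrow{r} & c \\
\downarrow{\scriptstyle r} & & \downarrow{\scriptstyle r} & & \downarrow{\scriptstyle r} \\
d & \xrightarrow{r^*} & e & \xrightarrow{r} & f
\end{array}
\]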
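Proof. We assume that r is a diamond and $a \xrightarrow{r^*} b$ and do induction
on $a \xrightarrow{r^*} c$.
1. In the reflexive case we have a = c. We use d = b which satisfies
the required properties.
2.
\[
\begin{array}{c|c}
a \xrightarrow{r^*} c &
\begin{array}{ccc}
a & \xrightarrow{r^*} & b \\
\downarrow{\scriptstyle r^*} & & \downarrow{\scriptstyle r^*} \\
c & \xrightarrow{r^*} & \exists e
\end{array} \\
c \xrightarrow{r} d & \\ \hline
a \xrightarrow{r^*} d &
\begin{array}{ccc}
a & \xrightarrow{r^*} & b \\
\downarrow{\scriptstyle r^*} & & \downarrow{\scriptstyle r^*} \\
d & \xrightarrow{r^*} & \exists f
\end{array}
\end{array}
\]
We have to prove the goal in the lower right corner i.e. find some
f with the required properties.
From the induction hypothesis we find some e. Using the previous
lemma 4.5 we find some f such that we can glue the boxes
together in order to see that f satisfies the required properties.
\[
\begin{array}{ccc}
a & \xrightarrow{r^*} & b \\
\downarrow{\scriptstyle r^*} & & \downarrow{\scriptstyle r^*} \\
c & \xrightarrow{r^*} & e \\
\downarrow{\scriptstyle r} & & \downarrow{\scriptstyle r} \\
d & \xrightarrow{r^*} & f
\end{array}
\]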
\[ \frac{r \subseteq s \subseteq r^* \quad s \text{ is a diamond}}{r \text{ is confluent}} \]
Proof. Assume r ⊆ s ⊆ r∗ and s is a diamond.
By 4.4 we know that both reflexive transitive closures are the same
i.e. r∗ = s∗ .
From 4.6 we conclude that s∗ is a diamond. This implies that r∗ is
a diamond as well and proves the fact that r is confluent by definition
of confluence.
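On a finite relation the statement can be tested directly. The following Python sketch assumes that a relation is a diamond if for all a → b and a → c there is some d with b → d and c → d, and that r is confluent if r∗ is a diamond, as used in the proofs above. The function names and the example relations are chosen for illustration.

```python
def rt_closure(carrier, r):
    """Reflexive transitive closure of r over a finite carrier set."""
    closure = {(a, a) for a in carrier}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (b2, c) in r:
                if b2 == b and (a, c) not in closure:
                    closure.add((a, c))
                    changed = True
    return closure


def is_diamond(rel):
    """a -> b and a -> c imply that some d with b -> d and c -> d exists."""
    succ = {}
    for (a, b) in rel:
        succ.setdefault(a, set()).add(b)
    return all(succ.get(b, set()) & succ.get(c, set())
               for a in succ for b in succ[a] for c in succ[a])


def is_confluent(carrier, r):
    return is_diamond(rt_closure(carrier, r))


carrier = {"a", "b", "c", "d"}
r = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
s = r | {(v, v) for v in carrier}        # r ⊆ s ⊆ r*
r_star = rt_closure(carrier, r)

print(is_diamond(r))                 # r itself is not a diamond
print(is_diamond(s))                 # but s is
print(is_confluent(carrier, r))      # hence r is confluent

fork = {("a", "b"), ("a", "c")}      # b and c can never be joined
print(is_confluent(carrier, fork))
```

The relation r is not a diamond because the element d has no successor, yet the intermediate relation s is one. This mirrors the role which parallel beta reduction plays for beta reduction.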
4.3 Parallel Reduction Relation
Looking at the example 4.1 it can be seen that beta reduction is not
a diamond. Let’s analyze the example a little bit to see why beta
reduction is not a diamond. Assume there is a redex like
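\[ (\lambda x^{A}.\, e)\, a \]
where the subterm a contains redexes as well i.e. there is some b with
$a \xrightarrow{\beta} b$. Then the following two reduction paths are possible.
\[ (\lambda x^{A}.\, e)\, a \xrightarrow{\beta} (\lambda x^{A}.\, e)\, b \xrightarrow{\beta} e[x := b] \]
\[ (\lambda x^{A}.\, e)\, a \xrightarrow{\beta} e[x := a] \xrightarrow{\beta^*} e[x := b] \]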
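2. Redex
\[ \frac{a \xrightarrow{\beta_p} b \quad e \xrightarrow{\beta_p} f}{(\lambda x^{A}.\, e)\, a \xrightarrow{\beta_p} f[x := b]} \]
3. Product
\[ \frac{A \xrightarrow{\beta_p} C \quad B \xrightarrow{\beta_p} D}{\Pi x^{A}.\, B \xrightarrow{\beta_p} \Pi x^{C}.\, D} \]
4. Abstraction
\[ \frac{A \xrightarrow{\beta_p} B \quad e \xrightarrow{\beta_p} f}{\lambda x^{A}.\, e \xrightarrow{\beta_p} \lambda x^{B}.\, f} \]
5. Application
\[ \frac{a \xrightarrow{\beta_p} c \quad b \xrightarrow{\beta_p} d}{a\, b \xrightarrow{\beta_p} c\, d} \]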
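One particular parallel step contracts every redex which is visible in a term at once. The following Python sketch computes this step. It assumes that the first rule of the definition is reflexivity, as used in the proofs of this section, so that sorts and variables reduce to themselves. The tuple encoding, the function names and the placeholder type T are assumptions of this sketch.

```python
import itertools

# Terms: ("sort", s), ("var", x), ("pi", x, A, B), ("lam", x, A, e), ("app", f, a)

def free_vars(t):
    tag = t[0]
    if tag == "sort":
        return set()
    if tag == "var":
        return {t[1]}
    if tag == "app":
        return free_vars(t[1]) | free_vars(t[2])
    _, x, ty, body = t
    return free_vars(ty) | (free_vars(body) - {x})


def subst(t, y, u):
    """Capture avoiding substitution t[y := u]."""
    tag = t[0]
    if tag == "sort":
        return t
    if tag == "var":
        return u if t[1] == y else t
    if tag == "app":
        return ("app", subst(t[1], y, u), subst(t[2], y, u))
    _, x, ty, body = t
    ty = subst(ty, y, u)
    if x == y:
        return (tag, x, ty, body)
    if x in free_vars(u):
        avoid = free_vars(u) | free_vars(body) | {y}
        fresh = next(n for n in (f"{x}{i}" for i in itertools.count())
                     if n not in avoid)
        body, x = subst(body, x, ("var", fresh)), fresh
    return (tag, x, ty, subst(body, y, u))


def par_max(t):
    """One parallel step which contracts all redexes visible in t."""
    tag = t[0]
    if tag in ("sort", "var"):
        return t                                        # reflexive
    if tag == "app":
        fn, arg = t[1], t[2]
        if fn[0] == "lam":                              # redex rule
            return subst(par_max(fn[3]), fn[1], par_max(arg))
        return ("app", par_max(fn), par_max(arg))       # application rule
    _, x, ty, body = t
    return (tag, x, par_max(ty), par_max(body))         # product, abstraction


T = ("var", "T")                     # placeholder type for the binders
f, x, y, z = (("var", n) for n in "fxyz")
start = ("app", ("lam", "x", T, ("app", ("app", f, x), x)),
         ("app", ("lam", "y", T, y), z))
result = par_max(start)
print(result)
```

Applied to the term of example 4.1 (with type annotations added), a single parallel step reaches the final term, because the redex in the argument is contracted before the argument is duplicated.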
Lemma 4.9. Parallel beta reduction is a superset of beta reduction
Proof. This fact is trivial, because all rules of beta reduction are con-
tained as special cases within the rules of parallel beta reduction.
Proof. All rules of parallel beta reduction are satisfied by the transitive
closure of beta reduction. Since an inductive relation defined by some
rules is the smallest relation which satisfies the rules, it is evident that
parallel beta reduction must be smaller than the transitive closure of
beta reduction.
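The rules of parallel beta reduction can be turned into a small enumeration of all parallel reducts of a term. The following Python sketch is an illustration, not the author's implementation: the tuple encoding of terms and the function names subst and par are assumptions of this example, and substitution is naive (no capture avoidance).

```python
def subst(t, x, s):
    """Naive substitution t[x := s]; assumes no variable capture can occur."""
    tag = t[0]
    if tag == "sort":
        return t
    if tag == "var":
        return s if t[1] == x else t
    if tag == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    _, y, ty, body = t                      # ("lam" | "pi", y, type, body)
    ty2 = subst(ty, x, s)
    return (tag, y, ty2, body if y == x else subst(body, x, s))

def par(t):
    """Set of all u such that t reduces to u in one parallel beta step."""
    out = {t}                                        # reflexive rule
    tag = t[0]
    if tag == "app":
        f, a = t[1], t[2]
        for c in par(f):                             # application rule
            for d in par(a):
                out.add(("app", c, d))
        if f[0] == "lam":                            # redex rule
            _, x, _ty, e = f
            for g in par(e):
                for b in par(a):
                    out.add(subst(g, x, b))
    elif tag in ("lam", "pi"):                       # abstraction and product rule
        _, x, ty, body = t
        for ty2 in par(ty):
            for body2 in par(body):
                out.add((tag, x, ty2, body2))
    return out

A = ("var", "A")
z = ("var", "z")
idz = ("app", ("lam", "y", A, ("var", "y")), z)
dup = ("lam", "x", A, ("app", ("var", "x"), ("var", "x")))
t = ("app", dup, idz)
```

For the example term t every pair of parallel reducts has a common parallel reduct, and the term idz idz reaches z z in a single parallel step although it needs two ordinary beta steps.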
1. a is a sort: Trivial, because substitution does not change a sort
and parallel beta reduction is reflexive.
2. a is a variable, let’s say y: In the case x = y the goal is implied by
the premise. In the case x ̸= y the goal is implied by reflexivity.
3. a is the product $\Pi y^B.C$: We have to prove the goal
$(\Pi y^B.C)[x := t] \xrightarrow{\beta_p} (\Pi y^B.C)[x := u]$ from the
premise $t \xrightarrow{\beta_p} u$ and the induction hypotheses
$B[x := t] \xrightarrow{\beta_p} B[x := u]$ and
$C[x := t] \xrightarrow{\beta_p} C[x := u]$.
The validity of the goal can be seen from the following derivation.
\[
\begin{array}{rcll}
(\Pi y^B.C)[x := t] & = & \Pi y^{B[x:=t]}.\,C[x := t] & \text{definition of substitution} \\
 & \xrightarrow{\beta_p} & \Pi y^{B[x:=u]}.\,C[x := u] & \text{induction hypothesis} \\
 & = & (\Pi y^B.C)[x := u] & \text{definition of substitution}
\end{array}
\]
The validity of the goal in the lower right corner can be seen
from the following reasoning:
\[
\begin{array}{rcll}
((\lambda y^A.e)\,a)[x := t] & = & (\lambda y^{A[x:=t]}.\,e[x := t])\,a[x := t] & \text{definition of substitution} \\
 & \xrightarrow{\beta_p} & f[x := u][y := b[x := u]] & \text{induction hypotheses} \\
 & = & f[y := b][x := u] & \text{double substitution 3.7}
\end{array}
\]
3. Product, abstraction and application: Same reasoning as with
redex. Lemma 3.7 is not needed in these cases.
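Lemma 4.13. The product and abstraction are preserved under
parallel reduction.
1.
\[
\frac{\lambda x^A.e \xrightarrow{\beta_p} c}
     {\exists B f.\; c = \lambda x^B.f \;\land\; A \xrightarrow{\beta_p} B \;\land\; e \xrightarrow{\beta_p} f}
\]
2.
\[
\frac{\Pi x^A.B \xrightarrow{\beta_p} c}
     {\exists C D.\; c = \Pi x^C.D \;\land\; A \xrightarrow{\beta_p} C \;\land\; B \xrightarrow{\beta_p} D}
\]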
Proof. By induction on the premise. In both cases only one rule is syn-
tactically possible which guarantees the existence of the corresponding
terms.
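Proof. By induction on $a \xrightarrow{\beta_p} b$. Note that we keep the
variable c universally quantified in the following.
1. Reflexive
\[
\begin{array}{ll}
\hline
a \xrightarrow{\beta_p} a & \forall c.\;
\begin{array}{ccc}
a & \xrightarrow{\beta_p} & a \\
\downarrow \beta_p & & \downarrow \beta_p \\
c & \xrightarrow{\beta_p} & \exists d
\end{array}
\end{array}
\]
To prove the goal on the right side we assume
$a \xrightarrow{\beta_p} c$. Then we use c for d which satisfies the
required property of d trivially.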
2. Redex
βp
e → f
βp
e→f ∀g. ↓βb ↓βb
βp
g → ∃h
βp
a → b
βp
a→b ∀c. ↓βb ↓βb
βp
c → ∃d
βp
(λxA .e)a → f [x := b]
βp
(λxA .e)a → f [x := b] ∀k. ↓ βb ↓βb
βp
k → ∃n
βp
To prove the goal in the lower right corner we assume (λxA .e)a →
k and do a case split on the construction of this relation.
(a) Reflexive: In that case k = (λxA .e)a. We use n = f [x := b]
which has the required property.
(b) Redex: In that case k = g[x := c] for some g and c with the
βp βp
properties e → g and a → c.
βp βp
We have to find a term n with f [x := b] → n∧g[x := c] → n.
The term
n = h[x := d]
with the terms h and d which exist by the induction hy-
potheses. It is easy to see that the properties
βp
f [x := b] → h[x := d]
βp
g[x := c] → h[x := d]
βp
We have to find a term n which satisfies (λxB .g)c → n and
βp
f [x := b] → n.
We use the term
n = h[x := d]
with the terms h and d which exist by the induction hy-
potheses satisfying
βp
f → h
βp
g → h
βp
b → d
βp
c → d
Therefore the goals
βp
(λxB .g)c → h[x := d]
βp
f [x := b] → h[x := d]
are satisfied
3. Product
βp
A → C
βp
A→C ∀E. ↓βb ↓βb
βp
E → ∃H
βp
B → D
βp
B→D ∀F. ↓βb ↓βb
βp
F → ∃J
βp
ΠxA .B → ΠxC .D
βp
ΠxA .B → ΠxC .D ∀EF. ↓βb ↓βb
βp
ΠxE .F → ∃n
In order to prove the goal in the lower right corner we assume
$\Pi x^A.B \xrightarrow{\beta_p} \Pi x^E.F$.
Because of lemma 4.13 which says that parallel reduction
preserves products we have chosen the more specific $\Pi x^E.F$ which
satisfies $A \xrightarrow{\beta_p} E$ and $B \xrightarrow{\beta_p} F$
instead of a more general term.
From the induction hypotheses we conclude the existence of the
terms H and J such that
\[ n = \Pi x^H.J \]
satisfies the required properties.
4. Abstraction: Same reasoning as with product.
5. Application
βp
a → c
βp
a→c ∀e. ↓βb ↓βb
βp
e → ∃g
βp
b → d
βp
b→d ∀f. ↓βb ↓βb
βp
f → ∃h
βp
ab → cd
βp
ab → cd ∀k. ↓βb ↓βb
βp
k → ∃n
βp
To prove the goal in the lower right corner we assume ab → k
and do a case split on the construction of this relation.
(a) Reflexive: In that case k = ab. We use n = cd which satisfies
the required properties.
(b) Redex: In this case ab has to be a redex, let's say
$(\lambda x^A.m)\,b$. Therefore and because of lemma 4.13
$ab \xrightarrow{\beta_p} cd$ becomes
$(\lambda x^A.m)\,b \xrightarrow{\beta_p} (\lambda x^B.o)\,d$ for some B
and o with $A \xrightarrow{\beta_p} B$ and $m \xrightarrow{\beta_p} o$.
k has to be the reduct $p[x := f]$ for some p, f with
$m \xrightarrow{\beta_p} p$ and $b \xrightarrow{\beta_p} f$.
We have to find some term n which satisfies
\[ (\lambda x^B.o)\,d \xrightarrow{\beta_p} n \qquad p[x := f] \xrightarrow{\beta_p} n \]
From the first induction hypothesis and lemma 4.13 we obtain
some q with $o \xrightarrow{\beta_p} q$ and $p \xrightarrow{\beta_p} q$.
From the second induction hypothesis we conclude the existence
of some h with $d \xrightarrow{\beta_p} h$ and $f \xrightarrow{\beta_p} h$.
Therefore the term
\[ n = q[x := h] \]
satisfies the requirement.
(c) Product: This case is syntactically impossible because ab
cannot be a product.
(d) Abstraction: This case is syntactically impossible because
ab cannot be an abstraction.
(e) Application: In that case k = ef for some terms e, f which
satisfy $a \xrightarrow{\beta_p} e$ and $b \xrightarrow{\beta_p} f$. We
have to find some term n which satisfies $cd \xrightarrow{\beta_p} n$
and $ef \xrightarrow{\beta_p} n$.
By the induction hypotheses there exist some terms g and
h satisfying $c \xrightarrow{\beta_p} g$, $e \xrightarrow{\beta_p} g$,
$d \xrightarrow{\beta_p} h$ and $f \xrightarrow{\beta_p} h$ such that
the term
\[ n = gh \]
satisfies the requirement.
Proof. With the parallel beta reduction we have found a relation which
is
1. a diamond (4.14)
2. between beta reduction and its reflexive transitive closure (4.9,
4.10)
I.e. with parallel beta reduction we have found a diamond between
beta reduction and its reflexive transitive closure which implies by 4.7
that beta reduction is confluent.
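Theorem 4.16. Equivalent terms have a common reduct.
\[
\frac{a \overset{\beta}{\sim} b}
     {\exists c.\; a \xrightarrow{\beta^*} c \;\land\; b \xrightarrow{\beta^*} c}
\]
Proof. By induction on $a \overset{\beta}{\sim} b$.
1. Reflexive: In that case a = b. We use c = a which satisfies the
required properties trivially.
2. Forward:
\[
\begin{array}{ll}
a \overset{\beta}{\sim} b & \exists d.\; a \xrightarrow{\beta^*} d \;\land\; b \xrightarrow{\beta^*} d \\
b \xrightarrow{\beta} c & \\
\hline
a \overset{\beta}{\sim} c & \exists e.\; a \xrightarrow{\beta^*} e \;\land\; c \xrightarrow{\beta^*} e
\end{array}
\]
In order to prove the goal in the lower right corner we construct
terms d and e which satisfy the following diagram and therefore
e satisfies the required properties.
\[
\begin{array}{ccccc}
a & \overset{\beta}{\sim} & b & \xrightarrow{\beta} & c \\
 & \searrow \beta^* & \downarrow \beta^* & & \downarrow \beta^* \\
 & & d & \xrightarrow{\beta^*} & e
\end{array}
\]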
4.5 Uniqueness of Normal Forms
Theorem 4.17. The normal form of a lambda term is unique.
\[
\frac{t \xrightarrow{\beta^*} u \qquad t \xrightarrow{\beta^*} v \qquad u \text{ and } v \text{ are in normal form}}
     {u = v}
\]
Proof. We only prove the theorem for products (the proof for abstrac-
tions is practically the same).
Since both products are beta equivalent they have by 4.16 a common
reduct. By 3.12 products (and abstractions) are preserved under
beta reduction. Therefore the common reduct must have the form
$\Pi x^E.F$ with $A \xrightarrow{\beta^*} E$, $B \xrightarrow{\beta^*} F$,
$C \xrightarrow{\beta^*} E$ and $D \xrightarrow{\beta^*} F$.
Since reduction implies beta equivalence and beta equivalence is
transitive and symmetric we get $A \overset{\beta}{\sim} C$ and
$B \overset{\beta}{\sim} D$.
because the proofs of the other variants follow the same pattern.
Assume $\Pi x^A.B \overset{\beta}{\sim} s$. Both terms must have a common
reduct by 4.16 which must be the sort s since a sort is already in
normal form. I.e. we get $\Pi x^A.B \xrightarrow{\beta^*} s$.
Since reduction preserves the form of binders by 3.12
$\Pi x^A.B \xrightarrow{\beta^*} s$ is not possible and we get the
desired contradiction.
5 Typing
This section describes welltyped terms and basic properties of well-
typed terms. It is based on Henk Barendregt’s paper Lambda Calculi
with Types [1].
What is needed to state that a term t is welltyped? First we have
to notice that the term t might contain free variables. As opposed to
bound variables, free variables don’t have a type expressed within a
term. Therefore we need a context Γ which assigns types to the free
variables in the term t. Furthermore we need a type T . We write the
statement The term t has type T in the context Γ as
Γ⊢t:T
product, abstraction, application). E.g. each welltyped abstrac-
tion λxA .e has a type of the form ΠxA .B which is beta equivalent
to T .
The generation lemmata describe important properties which are
used in the proofs of many other theorems.
2. Type of types: If T is the type of a welltyped term in a certain
context, then T is either the sort U or T is welltyped in the same
context and its type is a sort (i.e. either P or U).
This theorem justifies the introduction of sorts as types of types.
3. Subject reduction: Reduction (i.e. computation) does not change
the type of a term. Or in other words: If a term has a certain
type and we compute the term, then we really get an object of
that type. I.e. computation does what it promises to do.
Subject reduction is valid in all typed programming languages.
If you define a function with a certain result type, then the ac-
tual execution of that function returns an object of that type
(provided that the computation terminates).
4. Uniqueness of types: If a term is welltyped in a certain context,
then all its possible types are beta equivalent. I.e. each welltyped
term has a unique type modulo beta equivalence. I.e. we can
regard the equivalence class as the unique type of a term.
Together with subject reduction this theorem guarantees that
beta equivalent welltyped terms have the same unique type mod-
ulo beta equivalence.
In the last subsection of this section we introduce kinds. Since we
have the two sorts P and U and sorts are the types of types, there are
two types of types: types of type P and types of type U. The types of
type U are called kinds. Kinds have the special property that they are
recognizable by pure syntactic analysis of the term which represents
the type.
Semantically kinds are the types of n-ary type functions where
n = 0 is allowed as a corner case. Therefore if we know that the type
of a term is a kind, then we know that the term will return a type, if
applied to sufficient arguments (of the correct type of course).
The kinds play an important role in the proof of strong normaliza-
tion i.e. in the proof of consistency of the calculus of constructions.
5.1 Typing Relation
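Definition 5.1. The ternary typing relation Γ ⊢ t : T, which says that
in the context Γ the term t has type T, is defined inductively by the
rules
1. Introduction rules:
(a) Axiom:
\[ [] \vdash P : U \]
(b) Variable:
\[
\frac{\Gamma \vdash A : s \qquad x \notin \Gamma}
     {\Gamma, x^A \vdash x : A}
\]
(c) Product:
\[
\frac{\Gamma \vdash A : s_1 \qquad \Gamma, x^A \vdash B : s_2}
     {\Gamma \vdash \Pi x^A.B : s_2}
\]
(d) Abstraction:
\[
\frac{\Gamma \vdash \Pi x^A.B : s \qquad \Gamma, x^A \vdash e : B}
     {\Gamma \vdash (\lambda x^A.e) : \Pi x^A.B}
\]
(e) Application:
\[
\frac{\Gamma \vdash f : \Pi x^A.B \qquad \Gamma \vdash a : A}
     {\Gamma \vdash f\,a : B[x := a]}
\]
2. Structural rules:
(a) Weaken:
\[
\frac{\Gamma \vdash t : T \qquad \Gamma \vdash A : s \qquad x \notin \Gamma}
     {\Gamma, x^A \vdash t : T}
\]
(b) Type equivalence (here U is a term metavariable, not the sort U):
\[
\frac{\Gamma \vdash t : T \qquad \Gamma \vdash U : s \qquad T \overset{\beta}{\sim} U}
     {\Gamma \vdash t : U}
\]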
Remarks:
• There is no introduction rule for U. Therefore there can never
be a valid type of U i.e. the term U is not welltyped. Its only
purpose is to be the type of some types like P and it can only
appear in the type position of a typing judgement Γ ⊢ t : T .
• The variable introduction rule requires a welltyped type A for
the introduced variable (i.e. A : s for some sort s). Since U is
not welltyped, it is impossible for a variable to have type U.
• The product type ΠxA .B is the type of functions mapping ob-
jects of type A to objects of type B where the result type B
might contain the variable x. The introduction rule for prod-
ucts requires that both A and B are welltyped types. Therefore
neither of them can be U. Therefore a function cannot receive
arguments of type U nor return results of type U.
It is possible to compute with types but it is impossible to
compute with sorts (or kinds in general) as arguments and results.
This is an important difference to the extended calculus of
constructions which allows variables of type $U_i$ because $U_i$ has
type $U_{i+1}$. The extended calculus of constructions can compute
with kinds, the calculus of constructions cannot.
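The typing rules and the remarks can be explored with a small type checker. The following Python sketch is an assumption-laden illustration and not part of the calculus as defined above: it reads the introduction rules algorithmically, absorbs the two structural rules into a context lookup and a comparison of normal forms, compares terms syntactically (no alpha conversion) and uses naive substitution without capture avoidance. The names subst, nf, type_of and sort_of are hypothetical.

```python
P, U = ("sort", "P"), ("sort", "U")

def subst(t, x, s):
    """Naive substitution t[x := s]; assumes no variable capture can occur."""
    tag = t[0]
    if tag == "sort":
        return t
    if tag == "var":
        return s if t[1] == x else t
    if tag == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    _, y, ty, body = t
    ty2 = subst(ty, x, s)
    return (tag, y, ty2, body if y == x else subst(body, x, s))

def nf(t):
    """Beta normal form (may loop on terms which are not welltyped)."""
    tag = t[0]
    if tag in ("sort", "var"):
        return t
    if tag == "app":
        f = nf(t[1])
        if f[0] == "lam":
            return nf(subst(f[3], f[1], t[2]))
        return ("app", f, nf(t[2]))
    _, x, ty, body = t
    return (tag, x, nf(ty), nf(body))

def sort_of(ctx, t):
    """Type of t, which is required to be a sort."""
    s = nf(type_of(ctx, t))
    if s[0] != "sort":
        raise TypeError("expected a type")
    return s

def type_of(ctx, t):
    """Type of t in the context ctx, a list of (name, type) pairs."""
    tag = t[0]
    if tag == "sort":
        if t == P:
            return U                          # axiom
        raise TypeError("U is not welltyped")
    if tag == "var":
        for name, ty in reversed(ctx):        # variable rule plus weakening
            if name == t[1]:
                return ty
        raise TypeError("unbound variable " + t[1])
    if tag == "pi":
        _, x, a, b = t
        sort_of(ctx, a)
        return sort_of(ctx + [(x, a)], b)     # product rule
    if tag == "lam":
        _, x, a, e = t
        sort_of(ctx, a)
        b = type_of(ctx + [(x, a)], e)
        pi = ("pi", x, a, b)
        sort_of(ctx, pi)                      # premise: the product has a sort
        return pi
    _, f, a = t                               # application rule
    fty = nf(type_of(ctx, f))
    if fty[0] != "pi":
        raise TypeError("not a function")
    if nf(type_of(ctx, a)) != nf(fty[2]):     # type equivalence via normal forms
        raise TypeError("argument type mismatch")
    return subst(fty[3], fty[1], a)

# Polymorphic identity: lambda A^P. lambda x^A. x
polyid = ("lam", "A", P, ("lam", "x", ("var", "A"), ("var", "x")))
ctx = [("N", P), ("n", ("var", "N"))]
t = ("app", ("app", polyid, ("var", "N")), ("var", "n"))
```

In this sketch the sort U is rejected as a term, the polymorphic identity gets the expected product type, and the term t keeps its type N after it has been reduced to n.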
10. A term is a proposition or a proper type if it is a valid type of
sort P.
11. A term is a kind if it is a valid type of sort U.
Γ⊢P:U
∧
∀xA ∈ Γ ⇒ Γ ⊢ x : A
by induction on Γ ⊢ t : T .
For all rules except the axiom, the variable rule and the weakening
rule the goal is an immediate consequence of the induction hypothesis
for the same context. The axiom, the variable and the weakening rule
are treated separately.
1. Axiom [] ⊢ P : U: The first part is trivially valid. The second
part is vacuously valid, because the empty context does not have
any variables.
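2. Variable:
\[
\begin{array}{ll}
\Gamma \vdash B : s & \Gamma \vdash P : U \;\land\; (\forall x^A \in \Gamma \Rightarrow \Gamma \vdash x : A) \\
y \notin \Gamma & \\
\hline
\Gamma, y^B \vdash y : B & \Gamma, y^B \vdash P : U \;\land\; (\forall x^A \in (\Gamma, y^B) \Rightarrow \Gamma, y^B \vdash x : A)
\end{array}
\]
The first part of the goal in the lower right corner is a
consequence of the induction hypothesis and the weakening rule.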
For the second part we have to distinguish two cases:
• xA ∈ Γ: In that case the second part of the goal is a conse-
quence of the second part of the induction hypothesis and
the weakening rule.
• xA = y B : In that case the second part of the goal is identical
with the lower left corner.
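3. Weakening:
\[
\begin{array}{ll}
\Gamma \vdash t : T & \\
\Gamma \vdash B : s & \Gamma \vdash P : U \;\land\; (\forall x^A \in \Gamma \Rightarrow \Gamma \vdash x : A) \\
y \notin \Gamma & \\
\hline
\Gamma, y^B \vdash t : T & \Gamma, y^B \vdash P : U \;\land\; (\forall x^A \in (\Gamma, y^B) \Rightarrow \Gamma, y^B \vdash x : A)
\end{array}
\]
The reasoning is nearly the same as with the variable case, except
that the second part of the goal for $x^A = y^B$ is proved by the
variable introduction rule by using Γ ⊢ B : s and y ∉ Γ.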
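3. Product:
\[
\begin{array}{ll}
\Gamma \vdash A : s_1 & \forall \Delta.\; \dfrac{\Delta \text{ valid} \qquad \Gamma \subseteq \Delta}{\Delta \vdash A : s_1} \\[2ex]
\Gamma, x^A \vdash B : s_2 & \forall \Delta'.\; \dfrac{\Delta' \text{ valid} \qquad \Gamma, x^A \subseteq \Delta'}{\Delta' \vdash B : s_2} \\[2ex]
\hline
\Gamma \vdash \Pi x^A.B : s_2 & \forall \Delta.\; \dfrac{\Delta \text{ valid} \qquad \Gamma \subseteq \Delta}{\Delta \vdash \Pi x^A.B : s_2}
\end{array}
\]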
We assume a valid context ∆ which is a superset of Γ. By the
two induction hypotheses and the application introduction rule
we infer the goal.
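6. Weaken:
\[
\begin{array}{ll}
\Gamma \vdash t : T & \forall \Delta.\; \dfrac{\Delta \text{ valid} \qquad \Gamma \subseteq \Delta}{\Delta \vdash t : T} \\[2ex]
\Gamma \vdash A : s & \\
x \notin \Gamma & \\
\hline
\Gamma, x^A \vdash t : T & \forall \Delta.\; \dfrac{\Delta \text{ valid} \qquad \Gamma, x^A \subseteq \Delta}{\Delta \vdash t : T}
\end{array}
\]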
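1. Sort:
\[
\frac{\Gamma \vdash s : T}
     {s = P \;\land\; T \overset{\beta}{\sim} U}
\]
2. Variable:
\[
\frac{\Gamma \vdash x : T}
     {\exists A, s.\; \Gamma \vdash A : s \;\land\; x^A \in \Gamma \;\land\; A \overset{\beta}{\sim} T}
\]
3. Product:
\[
\frac{\Gamma \vdash \Pi x^A.B : T}
     {\exists s_1, s_2.\; \Gamma \vdash A : s_1 \;\land\; \Gamma, x^A \vdash B : s_2 \;\land\; s_2 \overset{\beta}{\sim} T}
\]
4. Abstraction:
\[
\frac{\Gamma \vdash \lambda x^A.e : T}
     {\exists B, s.\; \Gamma \vdash \Pi x^A.B : s \;\land\; \Gamma, x^A \vdash e : B \;\land\; \Pi x^A.B \overset{\beta}{\sim} T}
\]
5. Application:
\[
\frac{\Gamma \vdash f\,a : T}
     {\exists A, B.\; \Gamma \vdash f : \Pi x^A.B \;\land\; \Gamma \vdash a : A \;\land\; B[x := a] \overset{\beta}{\sim} T}
\]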
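Only the introduction rule for products is syntactically possible.
\[
\begin{array}{ll}
\Gamma \vdash A : s_1 & \\
\Gamma, x^A \vdash B : s_2 & \\
\hline
\Gamma \vdash \Pi x^A.B : s_2 & \exists s_a, s_b.\; \Gamma \vdash A : s_a \;\land\; \Gamma, x^A \vdash B : s_b \;\land\; s_2 \overset{\beta}{\sim} s_b
\end{array}
\]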
(b) Type equivalence:
Γ ⊢ A : s1
A
Γ ⊢ ΠxA .B : T ∃s1 , s2 . Γ, x ⊢ B : s2
β
T ∼ sb
Γ⊢U :s
β
T ∼U
Γ ⊢ A : sa
A
Γ ⊢ ΠxA .B : U ∃sa , sb . Γ, x ⊢ B : sb
β
U ∼ sb
Proof. By 5.6 all types in a context are welltyped and by 5.7 the term
U is not welltyped. Therefore U cannot be the type of a variable in a
valid context.
Proof. We prove this corollary by proving that any direct subterm of
a welltyped term is welltyped by induction on the structure of the
welltyped term.
For each possible form of a welltyped term the generation lemma 5.5
for this form states the existence of a context and a type of each direct
subterm, i.e. the direct subterms are welltyped.
Repeating the argument proves that all indirect subterms of a well-
typed subterm are welltyped as well.
Corollary 5.10. U cannot be a subterm of a welltyped term. This
implies that terms like λxU .e, λxA .U, ΠxU .B or ΠxA .U cannot be
welltyped, because they have U as a subterm.
Proof. Assume that U is a subterm of a welltyped term. Then by the
corollary 5.9 U must be welltyped. This contradicts corollary 5.7.
The final goal follows from the induction hypotheses and the
product introduction rule.
4. Abstraction:
′
Γ, xA , ∆ ⊢ Πy B .C : s Γ, ∆′ ⊢ Πy B .C ′ : s
′
A B
Γ, x , ∆, y ⊢ e : C Γ, ∆′ , y B ⊢ e′ : C ′
′ ′
Γ, xA , ∆ ⊢ λy B .e : Πy B .C Γ, ∆′ ⊢ λy B .e′ : Πy B .C ′
Same reasoning as above.
5. Application:
′
Γ, xA , ∆ ⊢ f : Πy B .C Γ, ∆′ ⊢ f ′ : Πy B .C ′
Γ, xA , ∆ ⊢ b : B Γ, ∆′ ⊢ b′ : B ′
Γ, x , ∆ ⊢ f b : C[y := b] Γ, ∆′ ⊢ f ′ b′ : (C[y := b])′
A
5.7 Type of Types
Theorem 5.12. A term in the type position of a typing judgement is
either U or it is a valid type of some sort.
\[
\frac{\Gamma \vdash t : T}
     {T = U \;\lor\; \exists s.\; \Gamma \vdash T : s}
\]
Proof. By induction on Γ ⊢ t : T :
1. Sort: Trivial
2. Variable: Trivial by the premise of the variable introduction rule.
3. Product: Easy, because there are only two sorts. If the sort is
P then by the start lemma 5.3 Γ ⊢ P : U is valid in any valid
context. If the sort is U then the goal is trivial.
4. Abstraction:
Γ ⊢ ΠxA .B : s0
Γ, xA ⊢ e : B
Γ ⊢ λxA .e : ΠxA .B ∃s.Γ ⊢ ΠxA .B : s
Take s = s0 .
5. Application:
Γ ⊢ f : ΠxA .B ∃s0 .Γ ⊢ ΠxA .B : s0
Γ⊢a:A
Γ ⊢ f a : B[x := a] ∃s.Γ ⊢ B[x := a] : s
5.8 Subject Reduction
Theorem 5.13. Subject reduction lemma: Reduction of a term does
not change its type.
\[
\frac{\Gamma \vdash t : T \qquad t \xrightarrow{\beta} u}
     {\Gamma \vdash u : T}
\]
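Proof. In order to prove the subject reduction lemma we prove the
more general lemma
\[
\frac{\Gamma \vdash t : T}
     {\left(\forall u.\; \dfrac{t \xrightarrow{\beta} u}{\Gamma \vdash u : T}\right)
      \;\land\;
      \left(\forall \Delta.\; \dfrac{\Gamma \xrightarrow{\beta} \Delta}{\Delta \vdash t : T}\right)}
\]
where $\Gamma \xrightarrow{\beta} \Delta$ means that ∆ is Γ with one of
the variable types replaced by a reduced type.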
We prove the more general lemma by induction on Γ ⊢ t : T .
1. Introduction rules:
(a) Sort: If Γ is empty and t is a sort, then the goal is vacuously
true because neither the empty context nor a sort can reduce
to anything (they are in normal form).
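(b) Variable:
\[
\begin{array}{ll}
\Gamma \vdash A : s &
\left(\forall B.\; \dfrac{A \xrightarrow{\beta} B}{\Gamma \vdash B : s}\right)
\;\land\;
\left(\forall \Delta_0.\; \dfrac{\Gamma \xrightarrow{\beta} \Delta_0}{\Delta_0 \vdash A : s}\right) \\[2ex]
\hline
\Gamma, x^A \vdash x : A &
\left(\forall u.\; \dfrac{x \xrightarrow{\beta} u}{\Gamma, x^A \vdash u : A}\right)
\;\land\;
\left(\forall \Delta.\; \dfrac{\Gamma, x^A \xrightarrow{\beta} \Delta}{\Delta \vdash x : A}\right)
\end{array}
\]
The left part of the goal in the lower right corner is vacuously
true because a variable is in normal form and there is no
term to which it reduces.
For the right part we assume $\Gamma, x^A \xrightarrow{\beta} \Delta$.
There are two possibilities:
i. $\Delta = \Delta_0, x^A$ where $\Gamma \xrightarrow{\beta} \Delta_0$
for some $\Delta_0$:
In that case we get $\Delta_0 \vdash A : s$ from the induction
hypothesis which implies the goal $\Delta_0, x^A \vdash x : A$.
ii. $\Delta = \Gamma, x^B$ where $A \xrightarrow{\beta} B$:
In that case we get Γ ⊢ B : s from the induction hypothesis
which implies $\Gamma, x^B \vdash x : B$ by the variable
introduction rule. Weakening Γ ⊢ A : s gives
$\Gamma, x^B \vdash A : s$, and since A and B are beta equivalent
the type equivalence rule yields the goal
$\Gamma, x^B \vdash x : A$.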
(c) Product:
! !
β β
Γ ⊢ A : s1 ∀C. A → C ∨ ∀∆. Γ → ∆
Γ ⊢ C : s1 ! ∆ ⊢ A : s1 !
β A →β ′
Γ, xA ⊢ B : s2 ∀D. B → D Γ,
∨ ∀∆′ . ′ x ∆
Γ, xA ⊢ D : s2 ∆ ⊢ B : s2
! !
A β β
Γ ⊢ ΠxA .B : s2 ∀t. Πx .B → t ∧ ∀∆. Γ→∆
Γ ⊢ t : s2 ∆ ⊢ ΠxA .B : s2
β
i. Left part: We assume ΠxA .B → t.
Since products are preserved under reduction (lemma 3.12)
β
we have either t = ΠxC .B where A → C for some C or
β
t = ΠxA .D where B → D for some D.
In both cases we can derive from the induction hypothe-
ses either Γ ⊢ C : s1 or Γ, xA ⊢ D : s2 . Therefore
Γ ⊢ ΠxC .B : s2 or Γ ⊢ ΠxA .D : s2 is valid trivially.
β
ii. Right part: Assume Γ → ∆. From the first induction
hypothesis we get ∆ ⊢ A : s1 . From the second induc-
tion hypothesis we get ∆, xA ⊢ B : s2 where we use
∆′ = ∆, xA . These facts imply ∆ ⊢ ΠxA .B : s2 .
(d) Abstraction: Same reasoning as with product.
(e) Application:
! !
β β
A
Γ ⊢ f : Πx .B ∀g. f → g ∧ ∀∆. Γ → ∆
Γ ⊢ g : Πx!A .B ∆ ⊢ f!: ΠxA .B
β β
Γ⊢a:A ∀b. a→b ∧ ∀∆. Γ → ∆
Γ⊢b:A ∆⊢a:A
! !
β β
Γ ⊢ f a : B[x := a] ∀t. fa → t ∧ ∀∆. Γ→∆
Γ ⊢ t : B[x := a] ∆ ⊢ f a : B[x := a]
β
i. Left part: Assume f a → t. We have three cases to
consider.
β β
A. f a → ga where f → g:
From the first induction hypothesis we get that g has
the same type as f and therefore ga : B[x := a] is
valid.
B. $f a \xrightarrow{\beta} f b$ where $a \xrightarrow{\beta} b$:
From the second induction hypothesis we get Γ ⊢ b : A and
therefore Γ ⊢ f b : B[x := b].
Since B[x := a] is a valid type and because of lemma 3.10
we have $B[x := a] \xrightarrow{\beta} B[x := b]$ and therefore
$B[x := b] \overset{\beta}{\sim} B[x := a]$. The type equivalence
rule lets us derive Γ ⊢ f b : B[x := a] which is identical to the
final goal.
C. $(\lambda x^A.e)\,a \xrightarrow{\beta} e[x := a]$:
In that case we have to prove
\[
\frac{\Gamma \vdash \lambda x^A.e : \Pi x^A.B \qquad \Gamma \vdash a : A}
     {\Gamma \vdash e[x := a] : B[x := a]}
\]
We assume Γ ⊢ λxA .e : ΠxA .B and Γ ⊢ a : A and
prove the final goal.
Since ΠxA .B is not U the type of types lemma 5.12
states the existence of a sort s such that Γ ⊢ ΠxA .B :
s is valid and then by the generation lemma 5.5 for
products we get the existence of some sB such that
Γ, xA ⊢ B : sB is valid which implies by the substitu-
tion lemma 5.11 Γ ⊢ B[x := a] : sB .
According to the generation lemma 5.5 for abstractions
there are some $B_0$ and $s_0$ with
\[
\Gamma \vdash \Pi x^A.B_0 : s_0 \qquad
\Gamma, x^A \vdash e : B_0 \qquad
\Pi x^A.B_0 \overset{\beta}{\sim} \Pi x^A.B
\]
The second one together with Γ ⊢ a : A and the substitution
lemma 5.11 gives us $\Gamma \vdash e[x := a] : B_0[x := a]$.
The third one with the Church Rosser theorem 4.16
gives $B \overset{\beta}{\sim} B_0$ which by the theorem 3.16 results
in $B[x := a] \overset{\beta}{\sim} B_0[x := a]$.
Finally we can convert $\Gamma \vdash e[x := a] : B_0[x := a]$,
$\Gamma \vdash B[x := a] : s_B$ and
$B[x := a] \overset{\beta}{\sim} B_0[x := a]$ via
the type equivalence rule into the final goal.
ii. Right part: Immediate consequence of the induction
hypotheses.
2. Structural rules:
(a) Weaken:
! !
β β
Γ⊢t:T ∀u. t→u ∧ ∀∆0 . Γ → ∆0
Γ⊢u:T ! ∆0 ⊢ t : T !
β β
Γ⊢A:s ∀B. A → B ∧ ∀∆0 . Γ → ∆0
Γ⊢B:s ∆0 ⊢ A : s
x∈
/Γ ! !
β β
Γ, xA ⊢ t : T ∀u. t→u ∧ ∀∆. Γ, xA →
∆
Γ, xA ⊢ u : T ∆⊢t:T
i. Left part: Assume $t \xrightarrow{\beta} u$. From the first
induction hypothesis we get Γ ⊢ u : T which derives the final goal
$\Gamma, x^A \vdash u : T$ by applying the weakening rule with
Γ ⊢ A : s and x ∉ Γ.
β
ii. Right part: We assume Γ, xA → ∆ and have to distin-
guish two cases.
β
A. ∆ = (∆0 , xA ) ∧ Γ → ∆0 :
In that case we get ∆0 ⊢ t : T and ∆0 ⊢ A : s from
the induction hypotheses which imply the final goal
∆0 , xA ⊢ t : T by application of the weakening rule.
β
B. ∆ = (Γ, xB ) ∧ A → B:
In that case we get Γ ⊢ B : s from the second induc-
tion hypothesis which implies the final goal Γ, xB ⊢
t : T by application of the weakening rule.
(b) Type equivalence:
! !
β β
Γ⊢t:T ∀u. t → u ∧ ∀∆. Γ → ∆
Γ⊢u:T ! ∆⊢t:T
β
Γ ⊢ U : s ... ∧ ∀∆. Γ → ∆
∆⊢U :s
β
T ∼U ! !
β β
Γ⊢t:U ∀u. t → u ∧ ∀∆. Γ → ∆
Γ⊢u:U ∆⊢t:U
β
i. Left part: Assume t → u. We get Γ ⊢ u : T by the
first induction hypothesis. The final goal Γ ⊢ u : U is
obtained by applying the type equivalence rule.
β
ii. Right part: Assume Γ → ∆. From the first induction
hypothesis we derive ∆ ⊢ t : T and from the second
induction hypothesis we derive ∆ ⊢ U : s. The final goal
∆ ⊢ t : U is obtained by applying the type equivalence
rule.
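\[
\frac{\Gamma \vdash A : T \qquad A \overset{\beta}{\sim} U}
     {\bot}
\]
\[
\frac{\Gamma \vdash t : T \qquad \Gamma \vdash t : U}
     {T \overset{\beta}{\sim} U}
\]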
Theorem 5.17. All types of beta equivalent terms are beta equivalent.
\[
\frac{t \overset{\beta}{\sim} u \qquad \Gamma \vdash t : T \qquad \Gamma \vdash u : U}
     {T \overset{\beta}{\sim} U}
\]
Proof. $t \overset{\beta}{\sim} u$ implies by the Church Rosser
Theorem 4.16 the existence of a term v such that
$t \xrightarrow{\beta^*} v$ and $u \xrightarrow{\beta^*} v$ are valid.
By repeated application of the subject reduction theorem we get
Γ ⊢ v : T and Γ ⊢ v : U.
This implies by the previous theorem 5.16 the validity of
$T \overset{\beta}{\sim} U$.
Proof. By 5.17 we get $T \overset{\beta}{\sim} U$ and by the type of
types theorem 5.12 both are either U or types of some sort.
The mixed cases are not possible since a welltyped term cannot be
beta equivalent to U by 5.15.
It remains to prove that both being welltyped implies that they
are types of the same sort.
Since both T and U are beta equivalent, their sorts have to be beta
equivalent as well by 5.17. The mixed case that one sort is P and the
other is U is not possible, because P is welltyped (it has type U in
any valid context) and a welltyped term cannot be beta equivalent to
U by 5.15. Therefore both sorts are either P or U.
5.10 Kinds
In the section 5.7 we saw that any term T in the type position of a
typing judgement Γ ⊢ t : T is either U or is a welltyped term whose
type is a sort. This justifies the statement that sorts are the types of
types.
However the calculus of constructions has dependent types i.e.
there are type valued functions F which when applied to arguments
return a type i.e. F T a might be a type where T represents a type
and a represents some other welltyped term e.g. a natural number.
In that case F has a type of the form $\Pi X^{P}\, y^{A}.\, P$. We call
such terms F type functions and their types kinds. A kind is a product
where the final type is a sort. We include in our definition of kinds
products with arity zero i.e. sorts. Type functions with arity zero
are types.
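where s ranges over sorts, K ranges over kinds and A ranges over
terms which are not kinds.
It is also possible to use the equivalent definition: The set
$\mathcal{K}$ of kinds is an inductive set defined by the rules
1.
\[ s \in \mathcal{K} \]
2.
\[
\frac{A \notin \mathcal{K} \qquad K \in \mathcal{K}}
     {\Pi x^A.K \in \mathcal{K}}
\]
3.
\[
\frac{A \in \mathcal{K} \qquad K \in \mathcal{K}}
     {\Pi x^A.K \in \mathcal{K}}
\]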
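The inductive rules describe a purely syntactic test. The following Python sketch illustrates this under the assumption that terms are encoded as nested tuples; the encoding and the function name is_kind are hypothetical and chosen only for this example. Rules 2 and 3 together say that the argument type A of a product is irrelevant, so only the final type of a product is inspected.

```python
P, U = ("sort", "P"), ("sort", "U")

def is_kind(t):
    """Syntactic kind test: a sort, or a product whose final type is a sort."""
    if t[0] == "sort":
        return True                 # rule 1: every sort is a kind
    if t[0] == "pi":
        _, _x, _a, k = t
        return is_kind(k)           # rules 2 and 3: any argument type is allowed
    return False                    # variables, abstractions, applications

A = ("var", "A")
X = ("var", "X")

list_kind = ("pi", "X", P, P)                     # Pi X^P. P
dep_kind = ("pi", "X", P, ("pi", "y", A, P))      # Pi X^P y^A. P
poly_type = ("pi", "X", P, ("pi", "x", X, X))     # Pi X^P. Pi x^X. X
```

The first two example terms are kinds, while the type of the polymorphic identity function ends in the variable X and is therefore not a kind.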
Theorem 5.20. Every term K which has type U is a syntactical kind.
\[
\frac{\Gamma \vdash K : U}
     {K \in \mathcal{K}}
\]
Proof. By induction on Γ ⊢ K : U.
1. Sort:
\[ [] \vdash P : U \]
Trivial, because $P \in \mathcal{K}$.
2. Variable: The variable rule deriving Γ, xU ⊢ x : U is not appli-
cable, because it would require Γ ⊢ U : s which contradicts the
first generation lemma 5.5.
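3. Product:
\[
\begin{array}{ll}
\Gamma \vdash A : s & \\
\Gamma, x^A \vdash B : U & B \in \mathcal{K} \\
\hline
\Gamma \vdash \Pi x^A.B : U & \Pi x^A.B \in \mathcal{K}
\end{array}
\]
The goal in the lower right corner is a consequence of the
induction hypothesis.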
4. Abstraction:
The abstraction rule deriving Γ ⊢ λxA .e : ΠxA .B is syntactically
not possible, because ΠxA .B ̸= U.
5. Application: In order to apply the rule
Γ ⊢ f : ΠxA .B
Γ⊢a:A
Γ ⊢ f a : B[x := a]
\[
\frac{\Gamma \vdash T : P}
     {T \notin \mathcal{K}}
\]
Proof. Assume Γ ⊢ T : P and by way of contradiction
$T \in \mathcal{K}$.
This implies by theorem 5.23 (proved below) Γ ⊢ T : U and by
theorem 5.16 $P \overset{\beta}{\sim} U$ which is contradictory.
Corollary 5.22. Every welltyped type which is not a kind has type P.
\[
\frac{T \notin \mathcal{K} \qquad \Gamma \vdash T : s}
     {\Gamma \vdash T : P}
\]
Proof. Assume $T \notin \mathcal{K}$ and Γ ⊢ T : s.
There are only two sorts, therefore either s = P or s = U has to
be valid. The first case proves the goal. The second case implies by
the theorem 5.20 $T \in \mathcal{K}$ which contradicts the assumption
$T \notin \mathcal{K}$.
Theorem 5.23. Every welltyped syntactical kind K has type U.
\[
\frac{K \in \mathcal{K} \qquad \Gamma \vdash K : T}
     {\Gamma \vdash K : U}
\]
Proof. By induction on the structure of K
1. Sort s: The validity of Γ ⊢ s : T implies by the generation
lemma 5.5 for sorts s = P and $T \overset{\beta}{\sim} U$. By the type
of types lemma 5.12 we know that T is either U or it is welltyped and
by the lemma 5.15 we know that no welltyped term can be beta
equivalent to U. This implies T = U and therefore the goal.
2. $\Pi x^A.K$: We assume $\Gamma \vdash \Pi x^A.K : T$ and have to
prove $\Gamma \vdash \Pi x^A.K : U$. The generation lemma 5.5 for
products guarantees the existence of a sort s with
$\Gamma, x^A \vdash K : s$. This together with the induction
hypothesis implies $\Gamma, x^A \vdash K : U$. By the introduction
rule for products we get $\Gamma \vdash \Pi x^A.K : U$.
6 Proof of Strong Normalization
In this section we prove the strong normalization of the calculus of
constructions i.e. we prove that all welltyped terms have no infinite
reduction sequence which implies that all welltyped terms can be re-
duced to a normal form. The presented proof is based on the proof
of Herman Geuvers in his paper A short and flexible proof of strong
normalization for the calculus of constructions [2].
The article of Herman Geuvers is a scientific paper which assumes
good knowledge of type theory. Here we present the proof in a
textbook style which makes the proof more accessible to readers who
are not experts in type theory.
In the last subsection of this chapter we use the proof of strong
normalization to prove that the calculus of constructions is consistent
as a logic i.e. that it is impossible to derive a contradiction in the
calculus of constructions.
The proof of strong normalization is not trivial and spans nearly
the whole chapter. Why is the proof nontrivial?
A naive attempt to prove the strong normalization would be to
prove
\[
\frac{\Gamma \vdash t : T}
     {t \in SN}
\]
by induction on Γ ⊢ t : T (where SN is the set of all strongly
normalizing terms).
This attempt passes a lot of rules of the typing relation. But
it fails at the rule for application. In the rule for application we
have to prove the strong normalization of f a by using the induction
hypotheses that the function term f and the argument term a are
strongly normalizing. The impossibility of this proof step can be seen
by looking at a counterexample.
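The term $\lambda x^A.xx$ is strongly normalizing because it is in
normal form. If we apply the term to any argument we get the reduction
\[ (\lambda x^A.xx)\,a \xrightarrow{\beta} aa \]
If we apply the term to itself we get
\[ (\lambda x^A.xx)(\lambda x^A.xx) \xrightarrow{\beta} (\lambda x^A.xx)(\lambda x^A.xx) \]
i.e. the application reduces to itself and therefore has an infinite
reduction sequence, although both the function term and the argument
term are strongly normalizing.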
The problem with this naive approach is that the goal t ∈ SN
gives induction hypotheses which are too weak. The set of strongly
normalizing terms SN is not connected to the type T of the term t.
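The self application counterexample can be checked mechanically. The following Python sketch is an illustration under stated assumptions: the tuple encoding of terms and the helpers subst and step are hypothetical, and substitution is naive (no capture avoidance, which is harmless here because the substituted term is closed apart from the type annotation A).

```python
def subst(t, x, s):
    """Naive substitution t[x := s]; assumes no variable capture can occur."""
    tag = t[0]
    if tag == "sort":
        return t
    if tag == "var":
        return s if t[1] == x else t
    if tag == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    _, y, ty, body = t
    ty2 = subst(ty, x, s)
    return (tag, y, ty2, body if y == x else subst(body, x, s))

def step(t):
    """List of all one step beta reducts of t."""
    out = []
    tag = t[0]
    if tag == "app":
        f, a = t[1], t[2]
        if f[0] == "lam":
            out.append(subst(f[3], f[1], a))
        out += [("app", g, a) for g in step(f)]
        out += [("app", f, b) for b in step(a)]
    elif tag in ("lam", "pi"):
        _, x, ty, body = t
        out += [(tag, x, ty2, body) for ty2 in step(ty)]
        out += [(tag, x, ty, body2) for body2 in step(body)]
    return out

A = ("var", "A")
delta = ("lam", "x", A, ("app", ("var", "x"), ("var", "x")))   # lambda x^A. x x
omega = ("app", delta, delta)                                   # self application
```

In this encoding delta has no reducts, i.e. it is in normal form, while the only reduct of omega is omega itself.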
• Type Interpretation
It is better to find a subset of the strongly normalizing terms
[[T ]] ⊆ SN which reflects a set of strongly normalizing terms
which can represent terms of type T . We call [[T ]] an interpreta-
tion of the type T . The interpretation of a type shall be chosen
in a way such that
\[
\frac{f \in [\![\Pi x^A.B]\!] \qquad a \in [\![A]\!]}
     {f\,a \in [\![B[x := a]]\!]}
\]
This lambda function space solves the problem of proving
$f\,a \in [\![B]\!]$ from the induction hypotheses
$f \in [\![A]\!] \xrightarrow{\lambda} [\![B]\!]$ and $a \in [\![A]\!]$.
In the previous paragraph we have ignored the detail that the
variable x might be contained in the type B within the product
$\Pi x^A.B$ and that each different value of the variable x might
generate a different interpretation $[\![B]\!]$ of the type B. In the
detailed proof we will see that this fact is important when the
type A of the variable x is a kind. In that case we have to form
the lambda function space
\[ [\![A]\!] \xrightarrow{\lambda} \bigcap_x [\![B]\!] \]
where for the purposes of this overview we understand that
$\bigcap_x [\![B]\!]$ is the intersection of all possible
interpretations $[\![B]\!]$ for all possible values of x. In the
detailed proof we give a precise definition of this lambda function
space. Here we get a first hint why the closure conditions in the
definition of saturated sets are important. They guarantee that any
intersection of saturated sets is a saturated set.
• Type Functions
It is not sufficient to have interpretations [[A]] of types A. E.g.
a term which represents list L in the calculus of constructions is
not a type. It is a type function of type ΠX P .P. We have to find
an interpretation for L as well. But its interpretation cannot be
a saturated set. Its interpretation [[L]] has to be a mathematical
function which maps saturated sets to saturated sets.
In that case the term LT which represents the type of a list
of elements of type T where [[T ]] ∈ SAT and [[LT ]] ∈ SAT are
valid. Therefore the interpretation of L must be a mathematical
function which maps saturated sets to saturated sets i.e. [[L]] ∈
SAT → SAT. Note that SAT → SAT is a set of mathematical
functions.
Since type functions can be arbitrarily nested, interpretations
for type functions can be drawn e.g from SAT for types i.e.
0-ary type functions, SAT → SAT for unary type functions,
(SAT → SAT) → SAT for unary type functions having a unary
type function as argument, SAT → SAT → SAT for binary type
functions (e.g. pairs), ... to arbitrary depth.
• Models and Context Interpretations
In the chapter Typing 5 we have shown that kinds are the types
of n-ary type functions where the corner case n = 0 is allowed.
Based on this it is possible to define a function ν which maps
any kind K to its appropriate set of models ν(K) which is the
set of possible interpretations of terms of type K.
Kinds have the advantage that they can be recognized by pure
syntactical analysis. This makes the definition of the function ν
easy.
By looking at the types in a context Γ it is decidable whether
the type A of a variable x is a kind or not. A context
interpretation $\xi = [x_1^{M_1}, \ldots, x_n^{M_n}]$ is a sequence of
models for all variables in the context whose type is a kind. We say
that ξ is a valid interpretation of a valid context Γ which we denote
by ξ ⊨ Γ if ξ associates to each variable in the context which is a
type function a model from the corresponding set ν(K) where K is
the type of the variable (which has to be a kind).
• Type Interpretation Function
Based on a valid context interpretation ξ ⊨ Γ it is possible to
define a function [[F ]]ξΓ which maps each welltyped type function
F (i.e. not only the variables which are type functions) into a
type interpretation which is in the set ν(K) where K is the type
of F .
In the definition of the type interpretation function it is crucial to
map all types to saturated sets i.e. that all types are interpreted
by saturated sets i.e. sets of strongly normalizing terms.
The definition of the type interpretation function is nontrivial
because it has to be shown in each case that the return value is
in the correct set.
Furthermore it has to be proved that the type interpretation
function returns for equivalent types the same interpretation.
This is important because beta equivalent types are practically
the same types. Therefore it is not allowed that they have different
interpretations. In order to prove this fact we have to show
that substitutions which are the basis of beta reduction and beta
equality are treated in a consistent manner. This proof is
nontrivial as well.
• Term Interpretations and Context Models
In addition to type interpretations ξ we add term interpreta-
tions ρ. Term interpretations or better variable interpretations
are a mapping from variables to terms which are an element of
the type interpretation of the corresponding type i.e. strongly
normalizing terms.
The term interpretation (|t|)ρ replaces in the term t all free occur-
rences of a variable by its corresponding variable interpretation.
A context model ρξ ⊨ Γ is a variable interpretation ρ and a type
variable interpretation ξ which are consistent.
The additional complexity of term interpretations $(|t|)_\rho$ is necessary
to be able to enter binders like $\lambda x^A.e$ and $\Pi x^A.B$ and get
sufficiently strong induction hypotheses.
• Soundness Theorem
The whole machinery culminates in the proof of the soundness
theorem, which states that for all context models with $\rho\xi \models \Gamma$ and
all welltyped terms $t$ with $\Gamma \vdash t : T$ we can prove $(|t|)_\rho \in [\![T]\!]_{\xi\Gamma}$.
From the soundness theorem we easily conclude that all welltyped
terms are strongly normalizing, because all type interpretations of
types are saturated sets (i.e. sets of strongly normalizing terms),
there is a canonical interpretation of type variables, and the
identity function is a possible term interpretation (base terms
include the variables and are contained in all saturated sets).
Definition 6.1. The set NF of terms in normal form is the set of all
terms which do not reduce (i.e. which do not have any redex).
$$\frac{\forall b.\left[\dfrac{a \overset{\beta}{\to} b}{\bot}\right]}{a \in \mathrm{NF}}$$
inductively by the rule
$$\frac{\forall b.\left[\dfrac{a \overset{\beta}{\to} b}{b \in \mathrm{SN}}\right]}{a \in \mathrm{SN}}$$
In words: A term a is strongly normalizing if all its reducts b are
strongly normalizing.
We can view the strongly normalizing terms more intuitively in
the following manner.
• A term which is already in normal form, i.e. which has no reducts, is
strongly normalizing. In that case all its reducts (there are none)
are vacuously strongly normalizing. I.e. terms in normal form
form the 0-th generation of strongly normalizing terms.
• The 1st generation of strongly normalizing terms are terms which
reduce only to terms in normal form.
• The (n+1)th generation of strongly normalizing terms are terms
which reduce only to terms of the nth generation of strongly
normalizing terms.
• ...
This intuitive definition describes the nth generation as the terms
which are guaranteed to reach a normal form in at most n steps. The
strongly normalizing terms are the terms which are guaranteed to reach
a normal form in at most n steps for some natural number n.
With this intuitive definition we could prove properties of strongly
normalizing terms by doing induction on the maximal number of steps
n needed to reduce to normal form.
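The generations described above can be made concrete with a small sketch. It is not part of the calculus: it assumes untyped lambda terms in de Bruijn notation (type annotations are omitted), and the names `shift`, `subst`, `reducts`, `generation` and the `fuel` bound are hypothetical choices made only for illustration. The function `generation` returns the smallest n such that a term belongs to the nth generation, or `None` if the bound is exhausted before all reduction paths have ended.

```python
# Hypothetical term encoding with de Bruijn indices (an assumption of this sketch):
#   ("var", i) | ("lam", body) | ("app", fun, arg)

def shift(t, d, cutoff=0):
    """Add d to every variable index of t which is >= cutoff."""
    tag = t[0]
    if tag == "var":
        return ("var", t[1] + d) if t[1] >= cutoff else t
    if tag == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, j, s):
    """Replace the variable with index j in t by the term s."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == j else t
    if tag == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))

def reducts(t):
    """All terms b such that t reduces to b in one beta step."""
    tag = t[0]
    out = []
    if tag == "lam":
        out += [("lam", b) for b in reducts(t[1])]
    elif tag == "app":
        fun, arg = t[1], t[2]
        if fun[0] == "lam":  # the application itself is a redex
            out.append(shift(subst(fun[1], 0, shift(arg, 1)), -1))
        out += [("app", g, arg) for g in reducts(fun)]
        out += [("app", fun, b) for b in reducts(arg)]
    return out

def generation(t, fuel=50):
    """0 for normal forms, otherwise 1 + the maximal generation of the reducts.

    Returns None if some reduction path is longer than the fuel allows.
    """
    if fuel == 0:
        return None
    gens = [generation(r, fuel - 1) for r in reducts(t)]
    if any(g is None for g in gens):
        return None
    return 1 + max(gens) if gens else 0
```

With `I = ("lam", ("var", 0))` the call `generation(("app", I, I))` yields the first generation. For a term with an infinite reduction path the function can only report that the fuel has run out; it cannot decide that the term is not strongly normalizing.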
However the formal definition is better suited for induction
proofs. In order to prove that some strongly normalizing term a
has some property p(a) one can assume that all its reducts have
this property. I.e. we can use an induction scheme similar to rule
based induction.
$$\frac{\forall b.\left[\dfrac{a \overset{\beta}{\to} b}{b \in \mathrm{SN}}\right]}{a \in \mathrm{SN}}
\qquad
\frac{\forall b.\left[\dfrac{a \overset{\beta}{\to} b}{p(b)}\right]}{p(a)}$$
In order to prove the goal p(a) in the lower right corner we can
assume all the other statements especially the induction hypothesis in
the upper right corner.
Theorem 6.3. All subterms of a strongly normalizing term are strongly
normalizing.
$$\frac{t \in \mathrm{SN}}{\forall a b.\left[\dfrac{t = ab}{a \in \mathrm{SN}}\right]}$$
$$\frac{e \in \mathrm{SN} \quad e[x := a] \in \mathrm{SN} \quad a \in \mathrm{SN} \quad A \in \mathrm{SN}}{(\lambda x^A.e)a \in \mathrm{SN}}$$
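• Induction on $e \in \mathrm{SN}$:
$$\frac{\forall f.\left[\dfrac{e \overset{\beta}{\to} f}{f \in \mathrm{SN}}\right]}{e \in \mathrm{SN}}
\qquad
\frac{\forall f a A.\left[\dfrac{e \overset{\beta}{\to} f \quad f[x := a] \in \mathrm{SN} \quad a \in \mathrm{SN} \quad A \in \mathrm{SN}}{(\lambda x^A.f)a \in \mathrm{SN}}\right]}{\forall a A.\left[\dfrac{e[x := a] \in \mathrm{SN} \quad a \in \mathrm{SN} \quad A \in \mathrm{SN}}{(\lambda x^A.e)a \in \mathrm{SN}}\right]}$$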
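• Induction on $A \in \mathrm{SN}$:
$$\frac{\forall B.\left[\dfrac{A \overset{\beta}{\to} B}{B \in \mathrm{SN}}\right]}{A \in \mathrm{SN}}
\qquad
\frac{\forall B.\left[\dfrac{A \overset{\beta}{\to} B}{(\lambda x^B.e)a \in \mathrm{SN}}\right]}{(\lambda x^A.e)a \in \mathrm{SN}}$$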
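• I.e. we have to prove the goal $(\lambda x^A.e)a \in \mathrm{SN}$ under the following
assumptions $e \in \mathrm{SN}$, $e[x := a] \in \mathrm{SN}$, $a \in \mathrm{SN}$, $A \in \mathrm{SN}$ and the
following induction hypotheses:
1. $\forall f a A.\left[\dfrac{e \overset{\beta}{\to} f \quad f[x := a] \in \mathrm{SN} \quad a \in \mathrm{SN} \quad A \in \mathrm{SN}}{(\lambda x^A.f)a \in \mathrm{SN}}\right]$
2. $\forall b A.\left[\dfrac{a \overset{\beta}{\to} b \quad A \in \mathrm{SN}}{(\lambda x^A.e)b \in \mathrm{SN}}\right]$
3. $\forall B.\left[\dfrac{A \overset{\beta}{\to} B}{(\lambda x^B.e)a \in \mathrm{SN}}\right]$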
In order to prove the goal $(\lambda x^A.e)a \in \mathrm{SN}$ we have to prove
$$\forall c.\left[\dfrac{(\lambda x^A.e)a \overset{\beta}{\to} c}{c \in \mathrm{SN}}\right]$$
We assume $(\lambda x^A.e)a \overset{\beta}{\to} c$. By definition of reduction an application
can reduce only if the application is a redex (which is the case), if
the function term reduces or if the argument reduces. Since the
function term $\lambda x^A.e$ is an abstraction, it reduces only if the
argument type $A$ reduces or the body $e$ reduces. I.e. we have 4 cases
to consider:
1. $(\lambda x^A.e)a \overset{\beta}{\to} e[x := a]$: Trivial. The goal $e[x := a] \in \mathrm{SN}$ is valid by
assumption.
2. $(\lambda x^A.e)a \overset{\beta}{\to} (\lambda x^A.f)a$, where $e \overset{\beta}{\to} f$: In that case we have to
prove the goal
$$(\lambda x^A.f)a \in \mathrm{SN}$$
by using $e \overset{\beta}{\to} f$ and the above assumptions. We can use the
induction hypothesis 1 to prove the goal. The only thing missing
is a proof of $f[x := a] \in \mathrm{SN}$. From the theorem 3.10 we infer
$e[x := a] \overset{\beta}{\to} f[x := a]$. Since $e[x := a] \in \mathrm{SN}$, all its reducts must
be strongly normalizing by definition of strong normalization.
This completes the proof for that case.
3. $(\lambda x^A.e)a \overset{\beta}{\to} (\lambda x^B.e)a$, where $A \overset{\beta}{\to} B$: In that case we have to
prove the goal
$$(\lambda x^B.e)a \in \mathrm{SN}$$
By using $A \overset{\beta}{\to} B$ and the induction hypothesis 3 we prove the
goal.
4. $(\lambda x^A.e)a \overset{\beta}{\to} (\lambda x^A.e)b$, where $a \overset{\beta}{\to} b$: In that case we have to
prove the goal
$$(\lambda x^A.e)b \in \mathrm{SN}$$
By using $a \overset{\beta}{\to} b$ and the induction hypothesis 2 we prove the
goal.
6.2 Base Terms
Definition 6.5. The set of base terms BT is the set of all variables
applied to zero or more strongly normalizing arguments, i.e. terms of
the form $x a_1 \ldots a_n$ where $a_i \in \mathrm{SN}$ for all $i$. We formally define the set
BT by the rules
1. $x \in \mathrm{BT}$ where $x$ ranges over variables.
2. $\dfrac{a \in \mathrm{BT} \quad b \in \mathrm{SN}}{ab \in \mathrm{BT}}$
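The two rules translate directly into a recursive check. The following is only a sketch under assumptions: the tuple encoding of terms is hypothetical, type annotations are omitted, and since membership in SN is not computed here, the predicate `is_sn` is a parameter. Its default `is_normal` is a conservative stand-in which accepts only terms in normal form (which are strongly normalizing), so it rejects some genuine base terms.

```python
# Hypothetical encoding: ("var", name) | ("lam", name, body) | ("app", fun, arg)

def is_normal(t):
    """True if t contains no redex (conservative stand-in for membership in SN)."""
    tag = t[0]
    if tag == "var":
        return True
    if tag == "lam":
        return is_normal(t[2])
    fun, arg = t[1], t[2]
    return fun[0] != "lam" and is_normal(fun) and is_normal(arg)

def is_base_term(t, is_sn=is_normal):
    """Rule 1: every variable is a base term.
    Rule 2: ab is a base term if a is a base term and b satisfies is_sn."""
    if t[0] == "var":
        return True
    if t[0] == "app":
        return is_base_term(t[1], is_sn) and is_sn(t[2])
    return False
```

An abstraction is never a base term, and an application is one only if its head is a variable.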
Theorem 6.6. All base terms are strongly normalizing.
$$\frac{a \in \mathrm{BT}}{a \in \mathrm{SN}}$$
2. $\dfrac{a \overset{\beta}{\to}_k b}{ac \overset{\beta}{\to}_k bc}$
I.e. a key reduction reduces only a leftmost redex in an application.
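Key reduction can be sketched as a partial function on terms. The sketch assumes that the first rule of the definition is the redex rule $(\lambda x^A.e)a \overset{\beta}{\to}_k e[x := a]$, as used in the proof of lemma 6.8. The de Bruijn encoding without type annotations and the names `shift`, `subst` and `key_reduct` are hypothetical and serve only as illustration.

```python
# Hypothetical term encoding with de Bruijn indices (an assumption of this sketch):
#   ("var", i) | ("lam", body) | ("app", fun, arg)

def shift(t, d, cutoff=0):
    """Add d to every variable index of t which is >= cutoff."""
    tag = t[0]
    if tag == "var":
        return ("var", t[1] + d) if t[1] >= cutoff else t
    if tag == "lam":
        return ("lam", shift(t[1], d, cutoff + 1))
    return ("app", shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, j, s):
    """Replace the variable with index j in t by the term s."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == j else t
    if tag == "lam":
        return ("lam", subst(t[1], j + 1, shift(s, 1)))
    return ("app", subst(t[1], j, s), subst(t[2], j, s))

def key_reduct(t):
    """Return the term b with t ->k b, or None if t has no key redex."""
    if t[0] != "app":
        return None
    fun, arg = t[1], t[2]
    if fun[0] == "lam":
        # Rule 1: the application itself is the redex.
        return shift(subst(fun[1], 0, shift(arg, 1)), -1)
    # Rule 2: a ->k b implies ac ->k bc. The argument is never touched.
    reduced = key_reduct(fun)
    return None if reduced is None else ("app", reduced, arg)
```

A redex inside an argument or below a binder is not a key redex, so `key_reduct` returns `None` for terms whose head is a variable and for abstractions.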
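Lemma 6.8. Let $a \overset{\beta}{\to}_k b$ and $a \overset{\beta}{\to} a_1$ where $a_1 \neq b$. Then there exists
a term $b_1$ with $a_1 \overset{\beta}{\to}_k b_1$ and $b \overset{\beta*}{\to} b_1$.
$$\begin{array}{ccc}
a & \overset{\beta}{\to}_k & b \\
\downarrow{\scriptstyle\beta} & & \downarrow{\scriptstyle\beta*} \\
\underbrace{a_1}_{\neq b} & \overset{\beta}{\to}_k & \exists b_1
\end{array}$$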
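Proof. By induction on $a \overset{\beta}{\to}_k b$:
1. $(\lambda x^A.e)a \overset{\beta}{\to}_k e[x := a]$:
Assume $(\lambda x^A.e)a \overset{\beta}{\to} a_1$ and $a_1 \neq e[x := a]$. The goal is to find $b_1$
such that $a_1 \overset{\beta}{\to}_k b_1$ and $e[x := a] \overset{\beta*}{\to} b_1$.
There are 4 cases to consider:
(a) $(\lambda x^A.e)a \overset{\beta}{\to} e[x := a]$:
This case is contradictory because $e[x := a] \neq e[x := a]$ is
not satisfiable.
(b) $(\lambda x^A.e)a \overset{\beta}{\to} (\lambda x^{A_1}.e)a$ where $A \overset{\beta}{\to} A_1$:
In that case we have $(\lambda x^{A_1}.e)a \overset{\beta}{\to}_k e[x := a]$ and we choose $b_1 = e[x := a]$.
(c) $(\lambda x^A.e)a \overset{\beta}{\to} (\lambda x^A.f)a$ where $e \overset{\beta}{\to} f$:
In that case we have $(\lambda x^A.f)a \overset{\beta}{\to}_k f[x := a]$ and we choose
$b_1 = f[x := a]$. This is possible because $e[x := a] \overset{\beta*}{\to} f[x := a]$
by theorem 3.10.
(d) $(\lambda x^A.e)a \overset{\beta}{\to} (\lambda x^A.e)b$ where $a \overset{\beta}{\to} b$:
In that case we have $(\lambda x^A.e)b \overset{\beta}{\to}_k e[x := b]$ and we choose
$b_1 = e[x := b]$. This is possible because $e[x := a] \overset{\beta*}{\to} e[x := b]$
by theorem 3.11.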
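2.
$$\frac{a \overset{\beta}{\to}_k b \qquad \forall a_1.\left[\dfrac{a \overset{\beta}{\to} a_1}{\exists b_1.\ a_1 \overset{\beta}{\to}_k b_1 \land b \overset{\beta*}{\to} b_1}\right]}{ac \overset{\beta}{\to}_k bc \qquad \forall d.\left[\dfrac{ac \overset{\beta}{\to} d}{\exists d_1.\ d \overset{\beta}{\to}_k d_1 \land bc \overset{\beta*}{\to} d_1}\right]}$$
We prove the goal in the lower right corner by assuming $ac \overset{\beta}{\to} d$
and finding some $d_1$ with the required properties.
Since $a \overset{\beta}{\to}_k b$ we know that $a$ is not an abstraction. Therefore we
have two cases to consider:
(a) $ac \overset{\beta}{\to} a_1 c$ where $a \overset{\beta}{\to} a_1$:
From the induction hypothesis we obtain the existence of
$b_1$ such that $a_1 \overset{\beta}{\to}_k b_1$ and $b \overset{\beta*}{\to} b_1$. We choose $d_1 = b_1 c$ which
satisfies $a_1 c \overset{\beta}{\to}_k b_1 c$ and $bc \overset{\beta*}{\to} b_1 c$.
(b) $ac \overset{\beta}{\to} ac_1$ where $c \overset{\beta}{\to} c_1$:
We choose $d_1 = bc_1$ which satisfies $ac_1 \overset{\beta}{\to}_k bc_1$ (because of $a \overset{\beta}{\to}_k b$)
and $bc \overset{\beta*}{\to} bc_1$.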
Theorem 6.9. Let $a \overset{\beta}{\to}_k b$ and let $a$ and $bc$ be strongly normalizing.
Then $ac$ is strongly normalizing as well.
$$\frac{a \overset{\beta}{\to}_k b \quad a, bc \in \mathrm{SN}}{ac \in \mathrm{SN}}$$
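$$a_1 c \in \mathrm{SN}$$
$$a \in \mathrm{SN} \qquad \forall b c.\left[\dfrac{a \overset{\beta}{\to}_k b \quad c \in \mathrm{SN} \quad bc \in \mathrm{SN}}{ac \in \mathrm{SN}}\right]$$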
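2. Assume $a \overset{\beta}{\to}_k b$.
3. Induction on $c \in \mathrm{SN}$:
$$\frac{\forall c_1.\left[\dfrac{c \overset{\beta}{\to} c_1}{c_1 \in \mathrm{SN}}\right]}{c \in \mathrm{SN}}
\qquad
\frac{\forall c_1.\left[\dfrac{c \overset{\beta}{\to} c_1 \quad bc_1 \in \mathrm{SN}}{ac_1 \in \mathrm{SN}}\right]}{\left[\dfrac{bc \in \mathrm{SN}}{ac \in \mathrm{SN}}\right]}$$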
We prove the goal in the lower right corner by assuming $bc \in \mathrm{SN}$
and prove the final goal $ac \in \mathrm{SN}$.
4. In order to prove $ac \in \mathrm{SN}$ we assume $ac \overset{\beta}{\to} d$ and prove $d \in \mathrm{SN}$
for all $d$.
Since $a \overset{\beta}{\to}_k b$ we know that $a$ is not an abstraction. Therefore
there are only three cases to consider.
(a) $ac \overset{\beta}{\to} bc$ where $a \overset{\beta}{\to} b$: In that case we have to prove the
goal
$$d = bc \in \mathrm{SN}$$
This is trivial since $bc \in \mathrm{SN}$ is an assumption.
(b) $ac \overset{\beta}{\to} a_1 c$ where $a \overset{\beta}{\to} a_1$ and $a_1 \neq b$: We have to prove the
goal
$$d = a_1 c \in \mathrm{SN}$$
Since we have $a \overset{\beta}{\to}_k b$, by lemma 6.8 there exists a $b_1$ such that
$a_1 \overset{\beta}{\to}_k b_1$ and $b \overset{\beta*}{\to} b_1$. Because of $bc \in \mathrm{SN}$ we have $b_1 c \in \mathrm{SN}$
as well. Therefore all premises of the induction hypothesis
of step 1 are satisfied and we get $a_1 c \in \mathrm{SN}$ from it.
(c) $ac \overset{\beta}{\to} ac_1$ where $c \overset{\beta}{\to} c_1$: We have to prove the goal
$$d = ac_1 \in \mathrm{SN}$$
2. All strongly normalizing terms which key reduce to a term in the
saturated set are in the saturated set as well:
$$\frac{a \in \mathrm{SN} \quad b \in S \quad a \overset{\beta}{\to}_k b}{a \in S}$$
Remark:
For those interested in lattice theory. The set of all subsets
of the set of strongly normalizing terms (i.e. the powerset
of SN) is a complete lattice with intersection and union as
the meet and join operations. The subset relation induces
a partial order.
The function which maps any set of strongly normalizing
terms into a saturated set (i.e. which adds all base terms
and all strongly normalizing key redexes) is monotonic,
increasing and idempotent. I.e. it is a closure map. The
saturated sets are the fixpoints of that function.
The fixpoints of such a map form in general a complete
lattice which is closed with respect to intersection. The
saturated sets are closed with respect to nonempty unions
as well, because each closure rule has at most one premise
which refers to the set.
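The closure properties of the remark can be checked exhaustively on a finite toy model. The model is purely an assumption for illustration: the numbers 0 to 5 stand in for strongly normalizing terms, `BASE` stands in for the base terms and the table `KEY` for the key reduction relation. None of these names occur in the text.

```python
from itertools import combinations

UNIVERSE = range(6)            # toy stand-in for the set SN
BASE = frozenset({0})          # toy stand-in for the base terms
KEY = {1: 0, 2: 1, 4: 3}       # a: b means "a key reduces to b" in the toy model

def saturate(s):
    """Smallest superset of s which contains BASE and is closed under key expansion."""
    result = set(s) | BASE
    changed = True
    while changed:
        changed = False
        for a, b in KEY.items():
            if b in result and a not in result:
                result.add(a)
                changed = True
    return frozenset(result)

def all_subsets():
    """All subsets of the toy universe, i.e. the powerset lattice."""
    items = list(UNIVERSE)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_saturated(s):
    """A set is saturated iff it is a fixpoint of the closure map."""
    return saturate(s) == frozenset(s)
```

In this model `saturate` is increasing, monotonic and idempotent, and the intersection and the union of two saturated sets are saturated again. The empty set is not saturated, because every saturated set contains the base terms.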
$$\frac{A \in \mathrm{SAT} \quad B \in \mathrm{SAT}}{A \overset{\lambda}{\to} B \in \mathrm{SAT}}$$
Proof. We have to prove three things:
1. All terms in $A \overset{\lambda}{\to} B$ are strongly normalizing: Assume $f \in A \overset{\lambda}{\to} B$.
Then by definition $fa \in B \subseteq \mathrm{SN}$ for all $a \in A$. Therefore
$fa$ is strongly normalizing. Since all subterms of a strongly
normalizing term are strongly normalizing as well (theorem 6.3),
$f$ is strongly normalizing.
2. $A \overset{\lambda}{\to} B$ contains all base terms:
We have to prove two things according to the definition of base
terms:
(a) All variables are in $A \overset{\lambda}{\to} B$:
Since $B$ is saturated, it contains all terms of the form $xa$
where $a \in A$, because $xa$ is a base term. Therefore
$x \in A \overset{\lambda}{\to} B$ for all variables $x$.
(b) If $c \in A \overset{\lambda}{\to} B$ where $c$ is a base term, then $cd \in A \overset{\lambda}{\to} B$ for
all strongly normalizing terms $d$:
Since $B$ is saturated it contains all terms of the form $cda$
for $a \in A \subseteq \mathrm{SN}$, because $cda$ is a base term. Therefore by
definition of the lambda function space $cd \in A \overset{\lambda}{\to} B$.
3. $A \overset{\lambda}{\to} B$ contains all strongly normalizing key redexes which
reduce to a term in it:
Assume $d \in A \overset{\lambda}{\to} B$, $c \overset{\beta}{\to}_k d$ and $c \in \mathrm{SN}$. We have to prove that
$c \in A \overset{\lambda}{\to} B$.
By definition of key reduction we have $ca \overset{\beta}{\to}_k da$ for all $a \in A$ and
by definition of the lambda function space $da \in B \subseteq \mathrm{SN}$.
Since $B$ is saturated, it contains $ca$ provided that $ca$ is strongly
normalizing. $ca$ is strongly normalizing by theorem 6.9 since
$c \overset{\beta}{\to}_k d$, $c \in \mathrm{SN}$ and $da \in \mathrm{SN}$.
Therefore by definition of the lambda function space $c \in A \overset{\lambda}{\to} B$.
$$\frac{K \in \mathcal{K} \quad \Gamma \vdash F : K \quad \Gamma \vdash F : T}{T \in \mathcal{K} \land \nu(K) = \nu(T)}$$
Lemma 6.16. The model set of a welltyped kind is the same as the
model set of any beta equivalent welltyped type.
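$$\frac{K \in \mathcal{K} \quad \Gamma \vdash K : U \quad \Gamma \vdash T : s_T \quad K \overset{\beta}{\sim} T}{T \in \mathcal{K} \land \nu(K) = \nu(T)}$$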
Theorem 6.17. If a type function F is welltyped (i.e. its type is a
kind), then the model sets of all its possible types are the same.
$$\frac{K \in \mathcal{K} \quad \Gamma \vdash F : K \quad \Gamma \vdash F : T}{T \in \mathcal{K} \land \nu(K) = \nu(T)}$$
Proof. From 5.16 we infer $K \overset{\beta}{\sim} T$ and from 5.18 we infer that either
both are U or both are types of the same sort.
If both are U, then $\nu(K) = \nu(T)$ is valid trivially.
Assume both are types of the same sort, i.e. we have $\Gamma \vdash K : s$ and
$\Gamma \vdash T : s$. Since $K$ is a syntactical kind, $s = U$ is valid.
Therefore the assumptions of lemma 6.16 are valid and we infer
the goal by applying the lemma.
As a next step we prove the fact that the model set of a kind is
not affected by a welltyped substitution.
$$\frac{K \in \mathcal{K} \quad \Gamma, x^A \vdash K : U \quad \Gamma \vdash a : A}{\nu(K) = \nu(K[x := a])}$$
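Proof. By induction on $K \in \mathcal{K}$.
1. $s \in \mathcal{K}$: Trivial
2.
$$\frac{B \notin \mathcal{K} \qquad K \in \mathcal{K} \qquad \forall \Delta'.\left[\dfrac{\Gamma, x^A, \Delta' \vdash K : U}{\nu(K) = \nu(K[x := a])}\right]}{\Pi y^B.K \in \mathcal{K} \qquad \forall \Delta.\left[\dfrac{\Gamma, x^A, \Delta \vdash \Pi y^B.K : U}{\nu(\Pi y^B.K) = \nu((\Pi y^B.K)[x := a])}\right]}$$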
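From the generation lemma 5.5 for products we postulate the
existence of $s_B$ and $s_K$ such that
$$\begin{array}{l}
\Gamma, x^A, \Delta \vdash B : s_B \\
\Gamma, x^A, \Delta, y^B \vdash K : s_K \\
s_K \overset{\beta}{\sim} U
\end{array}$$
$$\Gamma, x^A, \Delta[x := a] \vdash B[x := a] : P$$
$$\begin{array}{l}
\Gamma, x^A, \Delta \vdash B : s_B \\
\Gamma, x^A, \Delta, y^B \vdash K : s_K \\
s_K \overset{\beta}{\sim} U
\end{array}$$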
This implies $s_B = U$ and $s_K = U$ because of $B, K \in \mathcal{K}$ and
theorem 5.23.
By using $\Delta' = \Delta, y^B$ the preconditions of both induction
hypotheses are satisfied and we derive the facts
$$\frac{\xi \models \Gamma \quad \Gamma \vdash K : U \quad F \notin \Gamma \quad M \in \nu(K)}{\xi, F^M \models \Gamma, F^K}$$
Theorem 6.20. For every valid context Γ there exists a unique canonical
context interpretation $\xi^c(\Gamma)$ with $\xi^c(\Gamma) \models \Gamma$.
Proof. We construct the canonical context interpretation recursively.
$$\begin{array}{lll}
\xi^c([]) & := [] & \\
\xi^c(\Gamma, x^A) & := \xi^c(\Gamma) & A \notin \mathcal{K} \\
\xi^c(\Gamma, x^A) & := \xi^c(\Gamma), x^{\nu^c(A)} & A \in \mathcal{K}
\end{array}$$
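Definition 6.21. The type interpretation function $[\![F]\!]_{\xi\Gamma}$ must satisfy
the following specification.
$$\frac{\xi \models \Gamma \quad \Gamma \vdash F : K \quad K \in \mathcal{K}}{[\![F]\!]_{\xi\Gamma} \in \nu(K)}
\quad\land\quad
\frac{\xi \models \Gamma}{[\![U]\!]_{\xi\Gamma} \in \mathrm{SAT}}$$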
This definition satisfies the specification 6.21. There are two
cases possible.
If s = P then it is welltyped and its type is U and we have
SN ∈ SAT and SAT = ν(U).
If s = U then the specification is trivially satisfied.
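2. Variable:
$$[\![x]\!]_{\xi\Gamma} := M$$
where $x^M \in \xi$.
The precondition $\xi \models \Gamma$ is possible only if there is a type $A$ such
that $x^A \in \Gamma$ and $\Gamma \vdash A : U$ is valid. By the start lemma 5.3
$\Gamma \vdash x : A$ is valid and since ξ is a valid context interpretation for
the context Γ we get $M \in \nu(A)$.
3. Product:
$$[\![\Pi x^A.B]\!]_{\xi\Gamma} := [\![A]\!]_{\xi\Gamma} \overset{\lambda}{\to} I_B$$
where
$$I_B := \begin{cases}
[\![B]\!]_{\xi(\Gamma, x^A)} & \text{if } A \notin \mathcal{K} \\
\bigcap_{M \in \nu(A)} [\![B]\!]_{(\xi, x^M)(\Gamma, x^A)} & \text{if } A \in \mathcal{K}
\end{cases}$$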
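(a) $[\![A]\!]_{\xi\Gamma} \in \mathrm{SAT}$: The preconditions $\xi \models \Gamma$, $\Gamma \vdash A : s_A$ and $s_A \in \mathcal{K}$
of the type interpretation function are satisfied. Therefore
we can conclude the goal.
(b) $I_B \in \mathrm{SAT}$: We have to distinguish two cases:
i. $A \notin \mathcal{K}$: In that case we have $\xi \models \Gamma, x^A$, i.e. the
preconditions for the type interpretation function are satisfied
and we get $[\![B]\!]_{\xi(\Gamma, x^A)} \in \mathrm{SAT}$.
ii. $A \in \mathcal{K}$: By theorem 6.11 it is sufficient to prove
$[\![B]\!]_{(\xi, x^M)(\Gamma, x^A)} \in \mathrm{SAT}$ for all $M \in \nu(A)$.
Assume $M \in \nu(A)$. Then $\xi, x^M \models \Gamma, x^A$, i.e. the
preconditions of the type interpretation function are satisfied
and we infer the goal.
4. Abstraction:
$$[\![\lambda x^A.e]\!]_{\xi\Gamma} := \begin{cases}
[\![e]\!]_{\xi(\Gamma, x^A)} & \text{if } A \notin \mathcal{K} \\
M \mapsto [\![e]\!]_{(\xi, x^M)(\Gamma, x^A)} & \text{if } A \in \mathcal{K},\ M \in \nu(A)
\end{cases}$$
$$[\![e]\!]_{\xi(\Gamma, x^A)} \in \nu(B)$$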
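where $M \in \nu(A)$. The function argument is in the correct
domain. Because of $\xi, x^M \models \Gamma, x^A$ the preconditions of
$[\![e]\!]_{(\xi, x^M)(\Gamma, x^A)}$ are satisfied and the specification of the
type interpretation function guarantees that the function maps
its argument to a value in the correct range.
5. Application:
$$[\![F a]\!]_{\xi\Gamma} := \begin{cases}
[\![F]\!]_{\xi\Gamma} & \text{if } \Gamma \vdash a : A \text{ for some } A \notin \mathcal{K} \\
[\![F]\!]_{\xi\Gamma}([\![a]\!]_{\xi\Gamma}) & \text{if } \Gamma \vdash a : A \text{ for some } A \in \mathcal{K}
\end{cases}$$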
• Using the type of types lemma 5.12 we can derive the existence
of some sort $s$ such that $\Gamma \vdash \Pi x^A.B : s$ is valid. This
implies by the generation lemma 5.5 for products the existence
of the sorts $s_A$ and $s_B$ such that $\Gamma \vdash A : s_A$,
$\Gamma, x^A \vdash B : s_B$ and $s \overset{\beta}{\sim} s_B$ are valid (i.e. $s = s_B$). Furthermore by
the substitution theorem 5.11 we get $\Gamma \vdash B[x := a] : s$.
This implies that $K$ and $B[x := a]$ are welltyped and therefore
cannot be U. Since $K$ is a kind, $s = U$ must be valid.
I.e. we get
$$\begin{array}{l}
\Gamma \vdash A : s_A \\
\Gamma, x^A \vdash B : U \\
\Gamma \vdash \Pi x^A.B : U
\end{array}$$
Because of theorem 5.20 we have $B \in \mathcal{K}$ and $\Pi x^A.B \in \mathcal{K}$,
i.e. $F$ is a type function.
• Since $F$ is a type function, the preconditions for the type
interpretation are satisfied and we get
$$[\![a]\!]_{\xi\Gamma} \in \nu(A)$$
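Proof. We distinguish two cases:
• $A \in \mathcal{K}$: Assume $\xi \models \Gamma$ and $\Gamma \vdash a : A$ and prove the more general
lemma:
$$\forall K \Delta \eta.\left[\dfrac{\Gamma, x^A, \Delta \vdash F : K \quad \xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta \models \Gamma, x^A, \Delta \quad K \in \mathcal{K}}{[\![F']\!]_{(\xi, \eta)(\Gamma, \Delta')} = [\![F]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta)(\Gamma, x^A, \Delta)}}\right]$$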
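(b) $\forall s_D \Delta_D \eta_D.\left[\dfrac{\Gamma, x^A, \Delta_D \vdash D : s_D \quad \xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta_D \models \Gamma, x^A, \Delta_D \quad s_D \in \mathcal{K}}{[\![D']\!]_{(\xi, \eta_D)(\Gamma, \Delta_D')} = [\![D]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta_D)(\Gamma, x^A, \Delta_D)}}\right]$
We prove the goal below the line by assuming all statements
above the line and use the equivalence
$$\begin{array}{lll}
[\![(\Pi y^C.D)']\!]_{(\xi, \eta)(\Gamma, \Delta')} & = I_{C'} \overset{\lambda}{\to} I_{D'} & \\
& = I_C \overset{\lambda}{\to} I_D & \text{see below} \\
& = [\![\Pi y^C.D]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta)(\Gamma, x^A, \Delta)} &
\end{array}$$
where
$$\begin{array}{lll}
I_{C'} & = [\![C']\!]_{(\xi, \eta)(\Gamma, \Delta')} & \\
I_C & = [\![C]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta)(\Gamma, x^A, \Delta)} & \\
I_{D'} & = \bigcap_{N \in \nu(C)} [\![D']\!]_{(\xi, \eta, y^N)(\Gamma, \Delta', y^{C'})} & C \in \mathcal{K} \\
I_D & = \bigcap_{N \in \nu(C)} [\![D]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta, y^N)(\Gamma, x^A, \Delta, y^C)} & C \in \mathcal{K} \\
I_{D'} & = [\![D']\!]_{(\xi, \eta)(\Gamma, \Delta', y^{C'})} & C \notin \mathcal{K} \\
I_D & = [\![D]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta)(\Gamma, x^A, \Delta, y^C)} & C \notin \mathcal{K}
\end{array}$$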
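(b) $\forall K_e \Delta_e \eta_e.\left[\dfrac{\Gamma, x^A, \Delta_e \vdash e : K_e \quad \xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta_e \models \Gamma, x^A, \Delta_e \quad K_e \in \mathcal{K}}{[\![e']\!]_{(\xi, \eta_e)(\Gamma, \Delta_e')} = [\![e]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta_e)(\Gamma, x^A, \Delta_e)}}\right]$
We prove the goal below the line by assuming all statements
above the line and distinguish two cases:
(a) $C \notin \mathcal{K}$:
$$\begin{array}{ll}
[\![(\lambda y^C.e)']\!]_{(\xi, \eta)(\Gamma, \Delta')} & = [\![e']\!]_{(\xi, \eta)(\Gamma, \Delta')} \\
& = [\![e]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta)(\Gamma, y^C, \Delta)} \\
& = [\![\lambda y^C.e]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta)(\Gamma, \Delta)}
\end{array}$$
We assume all statements above the line and prove the final
goal under the line.
From the generation lemma 5.5 for applications we postulate
the existence of $B$ and $C$ such that
$$\begin{array}{l}
\Gamma, x^A, \Delta \vdash G : \Pi y^B.C \\
\Gamma, x^A, \Delta \vdash b : B \\
C[y := b] \overset{\beta}{\sim} K
\end{array}$$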
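are valid. Since $C[x := a]$ is a kind, $C$ and $\Pi y^B.C$ are kinds
as well. Therefore we get the induction hypothesis for $G$
$$\forall K_G \Delta \eta.\left[\dfrac{\Gamma, x^A, \Delta \vdash G : K_G \quad \xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta \models \Gamma, x^A, \Delta \quad K_G \in \mathcal{K}}{[\![G']\!]_{(\xi, \eta)(\Gamma, \Delta')} = [\![G]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}}, \eta)(\Gamma, x^A, \Delta)}}\right]$$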
therefore a is not a type function. The variable case is even
simpler, because the variable cannot be x (the type of x is not a
kind).
Theorem 6.25. Equivalent type functions have the same type inter-
pretation
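$$\frac{F \overset{\beta}{\sim} G \quad \xi \models \Gamma \quad \Gamma \vdash F : K \quad \Gamma \vdash G : s \quad K \in \mathcal{K}}{[\![F]\!]_{\xi\Gamma} = [\![G]\!]_{\xi\Gamma}}$$
Proof. By induction on $F \overset{\beta}{\sim} G$. The reflexive case is trivial. The
forward and the backward cases can be proved by the corresponding
induction hypothesis and theorem 6.24.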
Theorem 6.26. The type interpretation of a type is a saturated set.
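$$\frac{\xi \models \Gamma \quad \Gamma \vdash T : s}{[\![T]\!]_{\xi\Gamma} \in \mathrm{SAT}}$$
Proof. The type interpretation function satisfies the specification
$$[\![T]\!]_{\xi\Gamma} \in \nu(s) = \mathrm{SAT}$$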
A term interpretation is just a parallel substitution of the free vari-
ables in a term. In the following we use only variable interpretations
which contain all variables of a context and apply it only to terms
which are welltyped in a context.
$$\begin{array}{l}
\Gamma = [x_1^{A_1}, \ldots, x_n^{A_n}] \\
\rho = [x_1^{t_1}, \ldots, x_n^{t_n}]
\end{array}$$
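Read as a parallel substitution, a term interpretation can be sketched as follows. The sketch rests on assumptions: terms are encoded as tuples with named variables, binders carry their type annotation, and bound variables are renamed to avoid capture. The names `free_vars`, `fresh` and `interpret` are hypothetical, and the dictionary `rho` plays the role of the variable interpretation ρ.

```python
import itertools

# Hypothetical encoding:
#   ("var", x) | ("app", f, a) | ("lam", x, A, e) | ("pi", x, A, B)

def free_vars(t):
    """Set of the free variables of t."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "app":
        return free_vars(t[1]) | free_vars(t[2])
    # Binder: x is bound in the body but not in the type annotation.
    return free_vars(t[2]) | (free_vars(t[3]) - {t[1]})

def fresh(avoid):
    """A variable name which does not occur in the set avoid."""
    for i in itertools.count():
        name = f"v{i}"
        if name not in avoid:
            return name

def interpret(t, rho):
    """Replace all free variables of t simultaneously according to rho."""
    tag = t[0]
    if tag == "var":
        return rho.get(t[1], t)
    if tag == "app":
        return ("app", interpret(t[1], rho), interpret(t[2], rho))
    binder, x, ty, body = t
    # Below the binder x stands for itself, i.e. rho is extended by x -> x.
    inner = {y: s for y, s in rho.items() if y != x and y in free_vars(body)}
    incoming = set()
    for s in inner.values():
        incoming |= free_vars(s)
    if x in incoming:
        # Rename the bound variable so that no substituted term is captured.
        z = fresh(incoming | free_vars(body) | set(inner))
        body = interpret(body, {x: ("var", z)})
        x = z
    return (binder, x, interpret(ty, rho), interpret(body, inner))
```

The substitution is parallel: with `rho = {"x": ("var", "y"), "y": ("var", "x")}` the two variables are swapped instead of being identified. The identity interpretation, which maps each variable to itself, leaves every term unchanged.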
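$$\frac{\Gamma \vdash t : T \quad \rho\xi \models \Gamma}{(|t|)_\rho \in [\![T]\!]_{\xi\Gamma}}$$
$$\frac{\Gamma \vdash s : T \quad \rho\xi \models \Gamma}{(|s|)_\rho \in [\![T]\!]_{\xi\Gamma}}$$
$$\begin{array}{l}
s = P \\
T \overset{\beta}{\sim} U
\end{array}$$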
and prove the goal by
and using the fact that type interpretation respects beta equiv-
alence 6.25
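2. Variable: We have to prove the goal
$$\frac{\Gamma \vdash x : T \quad \rho\xi \models \Gamma}{(|x|)_\rho \in [\![T]\!]_{\xi\Gamma}}$$
$$\frac{\Gamma \vdash \Pi x^A.B : T \quad \rho\xi \models \Gamma}{(|\Pi x^A.B|)_\rho \in [\![T]\!]_{\xi\Gamma}}$$
$$\begin{array}{l}
\Gamma \vdash A : s_a \\
\Gamma, x^A \vdash B : s_b \\
T \overset{\beta}{\sim} s_b
\end{array}$$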
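(b) $\forall \Gamma_b T_b \rho_b \xi_b.\left[\dfrac{\Gamma_b \vdash B : T_b \quad \rho_b \xi_b \models \Gamma_b}{(|B|)_{\rho_b} \in [\![T_b]\!]_{\xi_b \Gamma_b}}\right]$
The final goal is $(|\Pi x^A.B|)_\rho \in [\![T]\!]_{\xi\Gamma} = [\![s_b]\!]_{\xi\Gamma} = \mathrm{SN}$ which
requires the subgoals
$$\begin{array}{l}
(|A|)_\rho \in \mathrm{SN} \\
(|B|)_{\rho, x^x} \in \mathrm{SN}
\end{array}$$
$$\frac{\Gamma \vdash \lambda x^A.e : T \quad \rho\xi \models \Gamma}{(|\lambda x^A.e|)_\rho \in [\![T]\!]_{\xi\Gamma}}$$
$$\begin{array}{l}
\Gamma \vdash \Pi x^A.B : s \\
\Gamma, x^A \vdash e : B \\
T \overset{\beta}{\sim} \Pi x^A.B
\end{array}$$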
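We have to prove the final goal
$$(|\lambda x^A.e|)_\rho \in [\![T]\!]_{\xi\Gamma} = [\![\Pi x^A.B]\!]_{\xi\Gamma} = [\![A]\!]_{\xi\Gamma} \overset{\lambda}{\to} I_B$$
where
$$I_B = \begin{cases}
[\![B]\!]_{\xi(\Gamma, x^A)} & A \notin \mathcal{K} \\
\bigcap_{M \in \nu(A)} [\![B]\!]_{(\xi, x^M)(\Gamma, x^A)} & A \in \mathcal{K}
\end{cases}$$
By definition of $\overset{\lambda}{\to}$ we have to prove
$$(|\lambda x^A.e|)_\rho\, a \in I_B$$
$$(|e|)_{\rho, x^a} \in I_B$$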
(c) $(|e|)_{\rho, x^x}[x := a] = (|e|)_{\rho, x^a} \in \mathrm{SN}$: We have already inferred
$(|e|)_{\rho, x^a} \in I_B$ from the second induction hypothesis. Since $I_B$
is either a type interpretation of a type or an intersection of
type interpretations of a type, and saturated sets are closed
with respect to intersection, $I_B$ is a saturated set which by
definition contains only strongly normalizing terms.
(d) $(|e|)_{\rho, x^x} \in \mathrm{SN}$: Because $[\![A]\!]_{\xi\Gamma}$ and type interpretations of
types are saturated and saturated sets contain all base terms,
we have $x \in [\![A]\!]_{\xi\Gamma}$. Therefore $(\rho, x^x)\xi_e \models \Gamma, x^A$ which
implies the goal.
5. Application: We have to prove the goal
$$\Gamma \vdash fa : T \qquad \rho\xi \models \Gamma$$
$$\begin{array}{l}
\Gamma \vdash f : \Pi x^A.B \\
\Gamma \vdash a : A \\
T \overset{\beta}{\sim} B[x := a]
\end{array}$$
By using the theorem 6.23, the theorem 6.25 and $T \overset{\beta}{\sim} B[x := a]$
we have to prove the goal
$$(|f|)_\rho\, (|a|)_\rho \in [\![T]\!]_{\xi\Gamma} = [\![B[x := a]]\!]_{\xi\Gamma} = \begin{cases}
[\![B]\!]_{\xi(\Gamma, x^A)} & A \notin \mathcal{K} \\
[\![B]\!]_{(\xi, x^{[\![a]\!]_{\xi\Gamma}})(\Gamma, x^A)} & A \in \mathcal{K}
\end{cases}$$
(a) $A \notin \mathcal{K}$: The goal follows immediately from the induction
hypotheses and the definition of $\overset{\lambda}{\to}$.
(b) $A \in \mathcal{K}$: From the induction hypotheses and the definition
of $\overset{\lambda}{\to}$ we get
$$\forall M.\left[\dfrac{M \in \nu(A)}{(|f|)_\rho\, (|a|)_\rho \in [\![B]\!]_{(\xi, x^M)(\Gamma, x^A)}}\right]$$
$$[\![a]\!]_{\xi\Gamma} \in \nu(A)$$
$$\frac{\Gamma \vdash t : T}{t \in \mathrm{SN}}$$
$$\rho\xi \models \Gamma$$
where
$$\rho = [x_1^{x_1}, \ldots, x_n^{x_n}]$$
Then we have by the soundness theorem 6.30
which proves the goal because type interpretations of types are
saturated sets (theorem 6.26) and saturated sets contain only strongly
normalizing terms by definition.
6.13 Logical Consistency
Logical consistency of the calculus of constructions means that it is not
possible to prove contradictions in the calculus. Since a contradiction
implies everything, we can state logical consistency as: It is not
possible to prove every proposition.
Via the Curry-Howard correspondence we interpret types as propositions
and terms of a type as proofs of the proposition which corresponds
to that type. In the calculus of constructions terms of the
type $\Pi X^P.X$ are functions which map every type $X$ into a proof of
that type.
Therefore in the calculus of constructions logical consistency means
that there does not exist a term $t$ of type $\Pi X^P.X$ in the empty context.
It is important to use the empty context here because a context
$\Gamma = [x_1^{A_1}, \ldots, x_n^{A_n}]$ is a sequence of assumptions and it is perfectly
possible to make contradictory assumptions, e.g. having $f^{\Pi X^P.X}$ as
one assumption. In such a context $f$ is a term of type $\Pi X^P.X$.
In the previous section we have proved that all welltyped terms
in the calculus of constructions are strongly normalizing. I.e. all
welltyped terms can be reduced to their normal form. Therefore it is
sufficient to prove that in the empty context there exists no term in
normal form which has the type $\Pi X^P.X$.
Lemma 6.32. All welltyped terms in normal form are either sorts,
products, abstractions or base terms.
If f is a base term, then by definition of base terms f a is a base
term as well, because a ∈ NF ⊆ SN.
$$\frac{[] \vdash t : \Pi X^P.X \quad t \in \mathrm{NF}}{\bot}$$
$$[X^P] \vdash e : X$$
5. $e$ has the form $y a_1 \ldots a_n$ with $n > 0$: We get $y = X$ as well for
the same reason. Since $y$ is in a function position, its type must
be beta equivalent to some product $\Pi y^B.C$. However its type is
P, which cannot be beta equivalent to a product.
7 Bibliography
References
[1] Henk Barendregt. Lambda calculi with types. Handbook of Logic
in Computer Science, II, 1993.
[2] Herman Geuvers. A short and flexible proof of strong normaliza-
tion for the calculus of constructions. Computing Science Report,
9450, 1994.