
Typed Lambda Calculus / Calculus of Constructions
Helmut Brandl
(firstname dot lastname at gmx dot net)

Version 1.03

Abstract
In this text we describe the calculus of constructions as one of the most interesting typed lambda calculi. It is a sweet spot in the design space of typed lambda calculi because it can express an immense set of computable functions and a large set of logical propositions including their proofs.
The paper is written in a textbook style. The needed concepts are introduced step by step and the proofs are laid out in sufficient detail for newcomers to the subject. Readers who are familiar with basic concepts of lambda calculus and computer science can get a good understanding of typed lambda calculus.
For comments, questions and error reports feel free to open an issue at https://github.com/hbr/Lambda-Calculus

Contents

1 Introduction

2 Basic Mathematics
  2.1 Sets and Relations
  2.2 Logic
  2.3 Inductively Defined Sets
  2.4 Induction Proofs
  2.5 Inductively Defined Relations
  2.6 Term Grammar
  2.7 Recursive Functions

3 The Calculus
  3.1 Sorts
  3.2 Terms
  3.3 Free Variables
  3.4 Contexts
  3.5 Substitution
  3.6 Beta Reduction
  3.7 Beta Equivalence

4 Confluence
  4.1 Overview
  4.2 Diamonds and Confluence
  4.3 Parallel Reduction Relation
  4.4 Equivalent Terms
  4.5 Uniqueness of Normal Forms
  4.6 Equivalent Binders
  4.7 Binders are not Equivalent to Variables or Sorts

5 Typing
  5.1 Typing Relation
  5.2 Basic Definitions
  5.3 Start Lemma
  5.4 Thinning Lemma
  5.5 Generation Lemmata
  5.6 Substitution Lemma
  5.7 Type of Types
  5.8 Subject Reduction
  5.9 Type Uniqueness
  5.10 Kinds

6 Proof of Strong Normalization
  6.1 Strong Normalization
  6.2 Base Terms
  6.3 Key Reduction
  6.4 Saturated Sets
  6.5 Lambda Function Space
  6.6 Model Set
  6.7 Context Interpretation
  6.8 Type Interpretation
  6.9 Term Interpretation
  6.10 Context Model
  6.11 Soundness Theorem
  6.12 Strong Normalization Proof
  6.13 Logical Consistency

7 Bibliography
1 Introduction
Computation What is a computer? What comes into your mind if you think of computers? Your laptop, your smartphone? A supercomputer? ... Would you say that a human being is a computer?
Looking 100 years back, computers had not yet been invented, at least not modern electronic computers. However, computers existed. A computer was a man or a woman who carried out computations. The term computer in the meaning of one who computes had been in use since the early 17th century. NASA and its predecessor NACA hired men and women to carry out computations.
A computer had no authority. He/she had to carry out computations following some fixed rules.
Carrying out computations according to fixed rules has been present throughout the history of mankind for more than 2000 years. Think of adding and multiplying numbers, using Gaussian elimination to solve a system of linear equations, using Newton’s method to find the root of a nonlinear function, etc.
The notion of computing, i.e. carrying out some steps according to fixed rules, had been intuitively clear. Whenever somebody said that he had found a method to calculate something and had written down a recipe to do it, everyone could check the method by trying to carry out the computation according to the written rules. If there are no ambiguities, then the method can be regarded as a general recipe or an algorithm.

Computability In the early 20th century the question came up: What is computable? David Hilbert, the famous German mathematician, challenged the world with his statement that everything which can be formalized in mathematics must be decidable/computable, that there is nothing undecidable or uncomputable. If a computation or decision procedure had not yet been found, then we have to try harder until we find the general method.
Looking at the question Is there something uncomputable / undecidable? an intuitive understanding of computation is no longer sufficient. A precise formal definition of computation becomes necessary. At nearly the same time, three famous mathematicians came up with formal definitions of computability which could be proved to be equivalent:
• Kurt Gödel’s recursive functions
• Alan Turing’s automatic machine (today called the Turing machine)
• Alonzo Church’s lambda calculus
It can be shown that in this space of computable functions there are problems which cannot be decided. E.g. the halting problem is undecidable: there is no function which takes a valid program and its input as input and returns true if the program terminates on the input and false if the program does not terminate on the input.
Having a clear and formal definition of computability, many problems have been proved to be unsolvable by computation.
In this paper we look only at lambda calculus, because lambda calculus is not merely a universal model of computation like the other two; there is much more in it.
Let’s see what this more is. There is something unsatisfactory in lambda calculus which led to significant improvements.

Typing Since lambda calculus, or more specifically the untyped lambda calculus, is a universal model of computation, it is possible to do all possible computations, at least theoretically, in lambda calculus. Define boolean values and functions, define natural numbers and functions on natural numbers, define pairs of values, lists of values, trees of values and all functions on such data. There is no limit.
However it is possible to express terms which are completely useless.
• You can feed a string to a function which expects a natural number as argument.
• You can write expressions which implement a non-terminating computation.
Modern programming languages solve the first problem by adding a type to each argument of a function and a type to the result of a function. The second problem is usually not addressed in modern programming languages. In nearly all mainstream programming languages infinite loops are possible.

Computation and Logic Alonzo Church added types to his lambda calculus. But his main intention was not to avoid silly expressions. He wanted typed lambda calculus to be a model for logic. This first attempt only described propositions in the lambda calculus. More sophisticated typed lambda calculi laid the basis for the Curry-Howard isomorphism. The Curry-Howard isomorphism connects two seemingly unrelated formalisms: computations and proof systems.
The types in computations are connected to propositions in logic via the Curry-Howard isomorphism. The terms of a certain type are proofs of the corresponding proposition. A proof of an implication A ⇒ B is a function (i.e. a computation) mapping a proof of A, i.e. a term of type A, to a proof of B, i.e. a term of type B. The identity function is a proof of A ⇒ A.
A proof of A ∧ (A ⇒ B) ⇒ B, the modus ponens rule, is nearly trivial in this computational analogy. It is a function taking two arguments: an object a of type A and a function f of type A ⇒ B, and it returns an object of type B by just applying f to a.
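Under this reading, both proofs can be written directly as programs. A minimal sketch in Haskell, where a pair plays the role of the conjunction and the function type plays the role of the implication (the names are chosen for illustration):

    -- A proof of A => A: the identity function.
    identity :: a -> a
    identity x = x

    -- A proof of A /\ (A => B) => B, i.e. modus ponens: take a proof x
    -- of A and a proof f of A => B, and apply f to x.
    modusPonens :: (a, a -> b) -> b
    modusPonens (x, f) = f x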
However if we want lambda calculus to be a model for logic and proof systems, then termination becomes crucial. Proofs must be finite. An infinite proof makes no sense. Nobody can check the correctness of an infinite proof.
In the computational world the definition of a function directly in terms of itself, e.g. f(x : A) : B := f(x), is welltyped but useless. Calling f with an object a of type A ends up in an infinite recursion.
The counterpart of this nonterminating function f in logic is a proof of A ⇒ B which uses a proof of A ⇒ B. This circular logic is not allowed. Proofs which use circular logic are not wellformed. Anything can be proved by using circular logic.

Typed Lambda Calculi The untyped lambda calculus can be extended to many forms of typed lambda calculus which serve both as a model of computation and a model of logic, i.e. they avoid silly terms and they guarantee termination.
We come from untyped lambda calculus to typed lambda calculus by adding type annotations. Type annotations are necessary only to check that terms are welltyped. After type checking, types can be thrown away. This is called type erasure. We distinguish between computational objects and types (i.e. non computational objects).
In this paper we treat the Calculus of Constructions as a typed lambda calculus which is a good model of computation and logic at the same time. The way from untyped lambda calculus to the calculus of constructions can be seen as having four steps:
1. Simply typed lambda calculus: The types have no structure. A type is a type variable. We have type variables {U, V, W, . . .} and arbitrary function types formed over the type variables {U → U, U → V, U → (V → W), . . .}. The computational power of simply typed lambda calculus is fairly limited. But it is already a model for the implicational fragment of natural deduction.
2. Polymorphic functions (System F): In the simply typed lambda calculus it is not possible to express the identity function which works for arbitrary types. Each type needs its own identity function. Girard’s System F allows types as arguments for functions. Now it is possible to express a polymorphic identity function. It is a function receiving two arguments: a type and a term of this type. The body of the function just returns the second argument (see the sketch after this list). This addition of polymorphic functions makes System F substantially more powerful than simply typed lambda calculus. Functions operating on booleans, natural numbers, lists of values, pairs, trees etc. can be expressed in the calculus. As a logic it can express a proof system for second order intuitionistic predicate calculus.
3. Polymorphic types i.e. computing types (System Fω): System F already allows polymorphic functions which operate on lists of a certain element type, pairs of two elements of two different types, trees of certain element types etc.
However a list of integers and a list of booleans are different types. In System F it is not possible to define functions which take type arguments and return type results. E.g. it is not possible to define a function L which takes an element type A as argument and returns the type L A of a list where the elements are of type A.
System Fω adds this possibility to compute types. This adds the necessity to add types of types. It is necessary to express how many arguments a type function takes. The arguments might be types or type functions. The Haskell programming language, which is based to some extent on System Fω, adds kinds. The kind ∗ is the type of a type. The kind ∗ → ∗ is the type of a type function which takes a type argument and returns a type. The kind (∗ → ∗) → ∗ is the type of a type function which takes a type function and returns a type, etc.
4. Dependent types (calculus of constructions): We need another dimension to express interesting logical propositions. We want to express the proposition that a certain number is a prime number. Propositions are types in the Curry-Howard isomorphism, therefore this proposition is a type. However a number is certainly a computational object. In order to express the predicate is prime we need functions which map a number to a proposition, i.e. to a type.
Now we can express the proposition which states that all objects x of a certain type A have a certain property P, i.e. the proposition ∀(x : A). P x. Here P x is a type which depends on the computational object x. Therefore it is called a dependent type. A proof of the proposition ∀(x : A). P x is a function which takes an object x of type A and returns an object of type P x. In the calculus of constructions we express the proposition ∀(x : A). P x as the type Πx^A. P x.
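Steps 2 and 3 can be illustrated in Haskell, which has polymorphic functions and kinds; step 4, dependent types, goes beyond Haskell’s core type system. A minimal sketch with names of my choosing:

    {-# LANGUAGE RankNTypes #-}

    -- Step 2 (System F): one identity function for all types. The type
    -- argument comes first; in Haskell it is passed implicitly.
    polyId :: forall a. a -> a
    polyId x = x

    -- Step 3 (System F omega): a function from types to types. The type
    -- constructor List has kind * -> *: applied to an element type a it
    -- returns the type List a.
    data List a = Nil | Cons a (List a)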
The calculus of constructions has enormous computational power
and is very expressive as a logic and proof system. Therefore it is a
sweet spot in the design space of typed lambda calculi.

This Paper In this paper we introduce the calculus of constructions.
• Section Basic Mathematics 2 describes the basic mathematics needed to understand the following chapters. It is important to understand the logical notation used and the form of the induction proofs. In type theory, induction proofs on inductively defined sets and relations are used extensively. Therefore a special layout has been chosen to present such proofs and to set out the induction hypotheses clearly.
• In section The Calculus 3 the calculus of constructions is explained. It defines the terms of the calculus, the contexts to give types to free variables, the basic computation steps and term equivalence.
• Section Confluence 4 proves an important property of the calculus: uniqueness of results. The computation steps in the calculus are nondeterministic. At each state of the computation which is not a final state, different steps might be possible. The property of confluence guarantees that all computation paths starting from a term can be joined and that the final result (if it exists) is unique.
• Section Typing 5 introduces the typing relation and proves important properties about this relation. The typing relation defines the welltyped terms. Terms in the calculus formed according to section The Calculus 3 are wellformed, but not necessarily welltyped. In order to be welltyped they have to be valid terms in the typing relation.
• Section Proof of Strong Normalization 6: This section contains the proof that all welltyped terms in the calculus of constructions are strongly normalizing, i.e. every welltyped term reduces in a finite number of computation steps to a normal form which is the result of the computation.
The proof of strong normalization is rather involved and requires a lot of machinery to go through. All needed concepts and theorems are explained in detail. Therefore this chapter is the longest in the paper.
Strong normalization implies consistency when the calculus is regarded as a model of a proof system for logic. Consistency in logic means that the logic is free of contradictions. A type system is consistent as a logic if there are types which are uninhabited in the empty context. In the Curry-Howard isomorphism an uninhabited type corresponds to a proposition which is impossible to prove.
The proof of consistency is the last subsection in this chapter.

2 Basic Mathematics
2.1 Sets and Relations
It is assumed that the reader is familiar with the following concepts:
1. Basic set notation: elements of a set a ∈ A; subsets A ⊆ B, defined as a ∈ A implies a ∈ B.
2. Endorelations: An (endo-) relation r over the set A is a set of pairs (a, b) (i.e. r ⊆ A × A) where a and b are drawn from the set A. We write (a, b) ∈ r more pictorially as a →r b.
3. n-ary relations: An endorelation is a binary relation where the domain and the codomain are the same. An n-ary relation over the domains A1, A2, . . . , An is a subset of the cartesian product A1 × A2 × . . . × An. In this paper we use a ternary relation as the typing relation.
4. An intuitive understanding of the reflexive transitive closure r∗ of a relation r, the reflexivity, symmetry and transitivity of a relation, and similar concepts.
5. Logic: The logical connectives of conjunction a ∧ b, disjunction a ∨ b, implication a ⇒ b, and existential ∃x. p(x) and universal quantification ∀x. p(x). We use the symbol ⊥ for falsity, i.e. a logical statement which can never be proved (a contradiction).
6. An intuitive understanding of mathematical functions from the set A to the set B. I.e. A → B is the set of functions mapping each element of the set A to a unique element of the set B.

2.2 Logic
We often have to state that some premises p1, p2, . . . , pn imply a certain conclusion c

    p1 ∧ p2 ∧ . . . ∧ pn ⇒ c

In many cases this notation with logical connectives is clumsy and not very readable. We use the rule notation

    p1
    p2
    ...
    pn
    ──────
    c

to express the same fact.
If some variables in logical statements are not quantified, then universal quantification is assumed.

    a ∈ A                       ( a ∈ A )
    ─────       means       ∀a. ( ───── )
    a ∈ B                       ( a ∈ B )

Assume that a logical statement has more than one variable. The universal quantification can be expressed at the highest level or pushed down to the first appearance of the variable. Moreover logical conjunction is commutative. Therefore the rule

    a ∈ A
    b ∈ A
    ──────
    a →r b

can be read in the following ways:

    ∀ a b. (a ∈ A ∧ b ∈ A ⇒ a →r b)
    ∀ a. (a ∈ A ⇒ ∀ b. (b ∈ A ⇒ a →r b))
    ∀ b. (b ∈ A ⇒ ∀ a. (a ∈ A ⇒ a →r b))

All three settings of the universal quantifiers are logically equivalent.

2.3 Inductively Defined Sets

Sets can be defined inductively by rules. E.g. we can define the set of even numbers by the rules
1. 0 is an even number.
2. If n is an even number, then n + 2 is an even number as well.
We write this more formally:
The set of even numbers E is defined by the rules:

    1.  ─────
        0 ∈ E

    2.  n ∈ E
        ─────────
        n + 2 ∈ E

We say that E is the smallest set which satisfies the above rules.
Whenever there is an element m ∈ E, it can be in the set only because of one of the two rules which define the set. 0 is in E because of the first rule. 2 is in E because of the second rule where n = 0 and 0 is in the set. 4 is in the set because of the second rule where n = 2 and 2 is in the set.
A number m ∈ E can be arbitrarily big, but it can be in the set only because of finitely many applications of the rules which define the set.
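The fact that an element is in E only because of finitely many rule applications can be made concrete: a derivation is a finite tree with one node per rule application. A minimal Haskell sketch (the type and the names are illustrative, not from the text):

    -- A value of type Even is a finite derivation that some number is
    -- even: one constructor per rule of the inductive definition.
    data Even = EvenZero        -- rule 1: 0 is in E
              | EvenPlus2 Even  -- rule 2: if n is in E, then n + 2 is in E

    -- Recover the number a derivation talks about by walking the
    -- finitely many rule applications.
    toInt :: Even -> Int
    toInt EvenZero      = 0
    toInt (EvenPlus2 d) = toInt d + 2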

2.4 Induction Proofs

With inductively defined sets we can do induction proofs. Let’s say that we want to prove that all numbers n in E satisfy a certain property p(n), i.e. we want to prove

    n ∈ E
    ─────
    p(n)

We prove this statement by induction on n ∈ E. Similar to induction proofs on natural numbers we have to prove
1. 0 satisfies p, i.e. p(0).
2. Every n that satisfies p (i.e. p(n)) must imply that n + 2 satisfies p as well (i.e. p(n + 2)).
Because the second rule is recursive we get an induction hypothesis.
We write all induction proofs on an inductively defined set in the following manner:

Theorem: All even numbers n ∈ E satisfy p(n).

    n ∈ E
    ─────
    p(n)

Proof: By induction on n ∈ E.

    1.  ─────     ────
        0 ∈ E     p(0)

        p(0) is valid because . . .

    2.  n ∈ E         p(n)
        ─────────     ────────
        n + 2 ∈ E     p(n + 2)

        In order to prove p(n + 2) in the lower right corner we
        assume n ∈ E, p(n) and n + 2 ∈ E. p(n + 2) is valid
        because . . .

For each rule in the inductive definition of the set we write a matrix. The left part of the matrix is just the rule. The goal is always in the lower right corner of the matrix. For each recursive premise (i.e. the premise that some other element is already in the set) we get an induction hypothesis.
In order to prove the goal in the lower right corner we can assume all other statements in the matrix.
This schema helps to lay out all premises and all induction hypotheses precisely.
As an example we prove, using the above schema, that for all even numbers n ∈ E there exists a number m such that n = 2m.

Theorem: For all even numbers n ∈ E there exists a number m with n = 2m.

    n ∈ E
    ──────────
    ∃m. n = 2m

Proof: By induction on n ∈ E.

    1.  ─────     ──────────
        0 ∈ E     ∃m. 0 = 2m

        m = 0 satisfies the goal.

    2.  n ∈ E         ∃i. n = 2i
        ─────────     ──────────────
        n + 2 ∈ E     ∃m. n + 2 = 2m

        By the induction hypothesis we get a number i which
        satisfies n = 2i. Therefore n + 2 = 2i + 2 = 2(i + 1).
        The number m = i + 1 satisfies the goal in the lower
        right corner.

Note that the name of the bound variable in existential and universal quantification is arbitrary. It is good practice to choose names which do not interfere. In the case for the recursive rule we have chosen the different names i and m in the existential quantification. This makes the proof more readable for humans.
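The induction proof is constructive: from a derivation of n ∈ E it computes the witness m with n = 2m. Reusing the illustrative Even type from the sketch above, the recursion mirrors the two cases of the proof:

    -- The witness m with n = 2m, by recursion on the derivation:
    -- case 1 returns m = 0, case 2 returns i + 1 where i is the
    -- witness obtained from the induction hypothesis.
    half :: Even -> Int
    half EvenZero      = 0
    half (EvenPlus2 d) = half d + 1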

2.5 Inductively Defined Relations

An (endo-) relation r over the set A is just a subset of the cartesian product A × A. I.e. a relation is just a set which can be defined inductively. Since inductively defined relations are used extensively in the paper we explain inductive definitions and induction proofs on relations as well.
Note that the following is not restricted to binary endorelations over a carrier set A. It is possible to define n-ary relations over different carrier sets as subsets of the cartesian product A1 × A2 × . . . × An as well. E.g. the typing relation in the calculus of constructions (see section 5) is a ternary relation.
As an example we define the reflexive transitive closure of a relation r inductively.

The reflexive transitive closure r∗ of the relation r is defined by the rules

    1.  ────────
        a →r∗ a

    2.  a →r∗ b
        b →r c
        ────────
        a →r∗ c

It should be intuitively clear that r∗ is reflexive and transitive. The first rule guarantees reflexivity and the second rule guarantees transitivity. While reflexivity is expressed explicitly in the first rule, transitivity is expressed only indirectly in the second rule; therefore we need a proof of transitivity.
Theorem: The reflexive transitive closure r∗ of the relation r is transitive.

    a →r∗ b
    b →r∗ c
    ────────
    a →r∗ c

Proof: Assume a →r∗ b and do induction on b →r∗ c.

    1.  ────────     ────────
        b →r∗ b      a →r∗ b

        The goal a →r∗ b is trivial because it is an assumption.

    2.  b →r∗ c0        a →r∗ c0
        c0 →r c
        ────────        ────────
        b →r∗ c         a →r∗ c

        By the induction hypothesis we have a →r∗ c0. Together
        with the second premise c0 →r c and the second rule in
        the definition of r∗ we get the goal in the lower right
        corner.

2.6 Term Grammar

It is possible to define inductive sets by a grammar. E.g. we can define the set of natural numbers N by the grammar

    n ::= 0      the number zero
        | n′     the successor of n

where n ranges over natural numbers and 0 is a constant.
This grammar defines the natural numbers {0, 0′, 0′′, . . .} as strings over the alphabet {0, ′}.
A definition with a grammar is just a special form of an inductively defined set.
The set of natural numbers N is defined by the rules

    1.  ─────
        0 ∈ N

    2.  n ∈ N
        ──────
        n′ ∈ N

If we want to prove that all natural numbers n satisfy a certain property p, we have to prove that 0 satisfies the property (i.e. p(0)) and that if n satisfies the property then n′ satisfies the property as well, i.e. p(n) ⇒ p(n′).
This is the classical induction principle on natural numbers. Note that this is just a special case of the more general induction proof scheme for inductively defined sets.
An induction proof over terms defined by a term grammar has the
following form:

Theorem: All natural numbers n have the property p(n).
Proof: By induction on the structure of n:
1. p(0): The goal p(0) is valid because . . .
2. p(n′): By the induction hypothesis p(n) is valid. This implies p(n′) because . . .
Note that such a proof corresponds one to one to a proof on inductive sets as shown above.
Furthermore note that a grammar rule in the term grammar might be recursive in more than one subterm. In that case we get more than one induction hypothesis.

2.7 Recursive Functions

In the previous section we have shown that for each definition of a set with a term grammar there is a corresponding inductive definition. However with terms defined via a term grammar we can define recursive functions.
We use the natural numbers defined via a term grammar and define the recursive function p(n, m) which computes the sum of the numbers n and m.

    p(n, 0)  := n
    p(n, m′) := p(n, m)′

The term grammar tells us that we can construct a natural number by using the number 0 and applying the successor function ′ to it zero or more times. The recursive function deconstructs the string generated by the term grammar. Since every term has a finite length the recursion is guaranteed to terminate.
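The grammar and the recursive function translate directly into a functional language. A minimal Haskell sketch (constructor names are my choice):

    -- The term grammar  n ::= 0 | n'  as a data type.
    data Nat = Zero | Succ Nat

    -- The recursive sum p(n, m), one equation per grammar alternative.
    -- The recursion on the second argument terminates because every
    -- term generated by the grammar has finite length.
    plus :: Nat -> Nat -> Nat
    plus n Zero     = n
    plus n (Succ m) = Succ (plus n m)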

3 The Calculus
In this section we introduce the terms of the calculus of constructions
and the way to do computations with them.
Lambda terms in untyped lambda calculus are either variables
x, y, z, . . ., abstractions λx.e or applications ab.
The calculus of constructions is a typed lambda calculus where terms and types are expressed within the same syntax. Since all welltyped terms must have a type, we need types of types. The types of types in the calculus of constructions are the two sorts P and U.
Since the calculus is typed, all variables must have a type. Therefore all variables in binders like abstractions carry types, as in λx^A.e. In order to assign types to free variables, contexts are introduced.
Furthermore it is necessary to express functions A → B from one type A to another type B. The calculus of constructions includes dependent types, i.e. the result type B of a function might depend on the argument. In order to express this, a product type of the form Πx^A.B is used, which describes the type of a function from an argument of type A to a result of type B which might depend on the argument.
After introducing the terms of the calculus of constructions we define free variables, substitution, beta reduction and beta equivalence, and prove certain theorems which state some interesting properties about these definitions.
In this section we do not yet define what it means for a term to be welltyped. This is done in the chapter Typing 5.

3.1 Sorts

In the calculus of constructions terms and types are in the same syntactic category. All welldefined terms have types and therefore types must have types as well. In order to have types of types we start with the introduction of sorts, which are the types of types.

Definition 3.1. Sorts: There are the two sorts P and U in the calculus of constructions.
Sorts or universes are the types of types. Sorts are usually abbreviated by the variable s.

In many texts about typed lambda calculus, e.g. Barendregt 1993 [1] and Geuvers 1994 [2], the symbol ∗ is used instead of P and the symbol □ is used instead of U. This has found its way into the Haskell programming language where ∗, ∗ → ∗, (∗ → ∗) → ∗, . . . are kinds.
We use the symbol P in order to emphasize the Curry-Howard correspondence of propositions as types. Many interesting types T in the calculus of constructions have type P, i.e. T : P. Within the Curry-Howard correspondence these types are propositions.
The symbol U is just a higher universe than P. A more general lambda calculus, the extended calculus of constructions, has an infinite hierarchy of universes P, U0, U1, . . . where P is the impredicative universe and the Ui are the predicative universes.
In the calculus of constructions the term P has type U and the term U is not welltyped. In the extended calculus of constructions the term Ui has type Ui+1, i.e. all terms of the form Ui for any i are welltyped.
The decision to use the symbols P and U instead of ∗ and □ is just a matter of taste.

3.2 Terms

Definition 3.2. The terms are defined by the following grammar, where s ranges over sorts, x ranges over some countably infinite set of variables and t ranges over terms.

    t ::= s        sorts
        | x        variables
        | Πx^t.t   products
        | λx^t.t   abstractions
        | t t      applications
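The grammar can be transcribed one to one into a data type. The following Haskell sketch fixes a representation which the later sketches in this text reuse (all names are illustrative):

    -- Terms of the calculus of constructions, one constructor per
    -- grammar alternative; variables are represented by strings.
    data Sort = P | U
      deriving (Eq, Show)

    data Term
      = Srt Sort              -- sorts
      | Var String            -- variables
      | Pi  String Term Term  -- product      Πx^A.B
      | Lam String Term Term  -- abstraction  λx^A.e
      | App Term Term         -- application  a b
      deriving (Eq, Show)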

3.3 Free Variables

Definition 3.3. The set of free variables FV(t) of a term t is defined by the function

    FV(s)      := ∅
    FV(x)      := {x}
    FV(a b)    := FV(a) ∪ FV(b)
    FV(λx^A.e) := FV(A) ∪ (FV(e) − {x})
    FV(Πx^A.B) := FV(A) ∪ (FV(B) − {x})

where s ranges over sorts, x ranges over variables and a, b, e, A and B range over arbitrary terms.
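Over the illustrative Term type from above, the definition becomes the following recursive function (a sketch using Data.Set):

    import qualified Data.Set as Set
    import Data.Set (Set)

    -- Free variables following Definition 3.3: a binder removes its
    -- variable from the free variables of the body, but not from the
    -- free variables of the type annotation.
    freeVars :: Term -> Set String
    freeVars (Srt _)     = Set.empty
    freeVars (Var x)     = Set.singleton x
    freeVars (App a b)   = freeVars a `Set.union` freeVars b
    freeVars (Lam x a e) = freeVars a `Set.union` Set.delete x (freeVars e)
    freeVars (Pi  x a b) = freeVars a `Set.union` Set.delete x (freeVars b)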

A variable which is not free is called a bound variable. E.g. if x is a free variable in the term e, it is no longer free in λx^A.e. Therefore we call λx^A.e a binder, because it makes the variable x bound. The same applies to the term Πx^A.B where the variable x is bound, but can appear free in B.
Note that the binders λx^A.e and Πx^A.B make the variable x bound only in the subterms e and B, but not in the subterm A.
It is possible to rename bound variables within a term. The renaming of a bound variable does not change the term. We consider two terms which only differ in the names of bound variables as identical.
Examples of some identical terms:

    λx^P.x = λy^P.y
    Πx^z.x = Πy^z.y

3.4 Contexts

All bound variables get their types from their corresponding binders. For free variables we need types as well. In order to assign types to free variables we define contexts.

Definition 3.4. A context is a sequence of variables and their corresponding types. Contexts are usually abbreviated by upper case greek letters and types are terms which are usually abbreviated by uppercase letters.
Contexts are defined by the grammar

    Γ ::= []        empty context
        | Γ, x^A    one more variable x with its type A
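In the illustrative Haskell transcription a context is simply a list of typed variables, grown at the end; a sketch:

    -- A context Γ: variables paired with their types, newest last.
    type Context = [(String, Term)]

    -- The type of a free variable, preferring the newest binding.
    lookupVar :: String -> Context -> Maybe Term
    lookupVar x gamma = lookup x (reverse gamma)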

3.5 Substitution

Definition 3.5. Substitution: The term t[y := u] is the term t where the term u is substituted for all free occurrences of the variable y. It is defined as a recursive function which iterates over all subterms until variables or sorts are encountered.

    s[y := u]        := s
    x[y := u]        := u                        if y = x
    x[y := u]        := x                        if y ≠ x
    (a b)[y := u]    := a[y := u] b[y := u]
    (λx^A.e)[y := u] := λx^A[y:=u].e[y := u]     where y ≠ x, x ∉ FV(u)
    (Πx^A.B)[y := u] := Πx^A[y:=u].B[y := u]     where y ≠ x, x ∉ FV(u)

Remark: The conditions y ≠ x and x ∉ FV(u) are not a restriction because the bound variable x can always be renamed to another variable which is different from y and does not occur free in the term u.
Remark: Many authors in the literature use t[u/y] as a notation for t[y := u]. Both notations mean the same thing.
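A sketch of the substitution function over the illustrative Term type. It relies on the side condition of Definition 3.5, i.e. it assumes that bound variables are already renamed apart from y and from the free variables of u; a full implementation would rename first:

    -- t[y := u], assuming no variable capture can occur (bound
    -- variables are renamed apart, see the remark in Definition 3.5).
    subst :: String -> Term -> Term -> Term
    subst y u = go
      where
        go (Srt s)             = Srt s
        go (Var x) | x == y    = u
                   | otherwise = Var x
        go (App a b)           = App (go a) (go b)
        go (Lam x a e)         = Lam x (go a) (go e)
        go (Pi  x a b)         = Pi  x (go a) (go b)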
Lemma 3.6. Substitution to sort lemma: If the result of a substitution is a sort, then either the term in which the substitution occurs is the sort, or the term is the variable to be replaced and the substitution term is the sort.

    a[x := b] = s
    ─────────────────────────
    a = s ∨ (a = x ∧ b = s)

Proof. From the definition of the substitution it is evident that only the first two cases can result in a sort. All other cases cannot syntactically result in a sort. The first two cases of the definition prove exactly the goal.

Lemma 3.7. Double substitution lemma: If we change the order of two subsequent substitutions, then it is necessary to introduce a correction term.

    x ≠ y
    y ∉ FV(a)
    ─────────────────────────────────────────────
    t[y := b][x := a] = t[x := a][y := b[x := a]]

Proof. The intuitive proof is quite evident. If we substitute b for the variable y in the term t and then substitute a for the variable x in the result, then in the second substitution we replace not only all variables x originally contained in t but all occurrences of x in b as well.
If we do the substitution [x := a] on t first, then all occurrences of x in b have not yet been substituted. Therefore the correction term b[x := a] is necessary if the order is changed.

The formal proof goes by induction on the structure of t.
1. If t is a sort then the equality is trivial because a sort does not have any variables.
2. If t is an application, an abstraction or a product, then the goal follows immediately from the induction hypotheses.
3. The only interesting case is when t is a variable. Let’s call the variable z. Then we have to distinguish the cases that z is x, z is y, or z is different from x and y.
(a) Case z = x:

    x[y := b][x := a]         = x[x := a]
                              = a

    x[x := a][y := b[x := a]] = a[y := b[x := a]]
                              = a                    since y ∉ FV(a)

(b) Case z = y:

    y[y := b][x := a]         = b[x := a]

    y[x := a][y := b[x := a]] = y[y := b[x := a]]
                              = b[x := a]

(c) Case z ≠ x ∧ z ≠ y:

    z[y := b][x := a]         = z

    z[x := a][y := b[x := a]] = z

The validity of the double substitution lemma 3.7 is needed to make sure that beta reduction (see the definition in the next section) is confluent. E.g. if we have the term (λx^A.(λy^B.t)b)a then we can decide whether we reduce first the inner redex and then the outer redex or the other way round. Because of confluence both possibilities shall have the same result.

    (λx^A.(λy^B.t)b)a  →β  (λx^A.t[y := b])a
                       →β  t[y := b][x := a]

    (λx^A.(λy^B.t)b)a  →β  (λy^B.t[x := a])b[x := a]
                       →β  t[x := a][y := b[x := a]]

3.6 Beta Reduction

Like in untyped lambda calculus, computation is done via beta reduction. A beta redex has the form (λx^A.e)a which reduces to the reduct e[x := a]. Beta reduction can be done in any subterm of a term.
The intuitive meaning of beta reduction is quite clear. The term λx^A.e is a function with a formal argument x of type A. The body e is the implementation of this function which can use the variable x. As with any programming language which supports functions, the name of the formal argument is irrelevant. An arbitrary name can be chosen; the variable is only used internally and is not visible to the outside world. We can apply the function to an argument a, i.e. form the term (λx^A.e)a. In order to compute the result we use the implementation e and substitute the actual argument a for the formal argument x, i.e. we form e[x := a].
Definition 3.8. Beta reduction is a binary relation a →β b where the term a reduces to the term b. It is defined by the rules

    1. Redex:
       ────────────────────────
       (λx^A.e)a →β e[x := a]

    2. Reduce function:
       f →β g
       ────────────
       f a →β g a

    3. Reduce argument:
       a →β b
       ────────────
       f a →β f b

    4. Reduce abstraction argument type:
       A →β B
       ────────────────────
       λx^A.e →β λx^B.e

    5. Reduce abstraction inner term:
       e →β f
       ────────────────────
       λx^A.e →β λx^A.f

    6. Reduce product argument type:
       A →β B
       ────────────────────
       Πx^A.C →β Πx^B.C

    7. Reduce product result type:
       B →β C
       ────────────────────
       Πx^A.B →β Πx^A.C

Note that beta reduction is not deterministic. There might be several possibilities to reduce an application, a product or an abstraction. And there might be more ambiguous subterms contained. In section Confluence 4 we prove that the choice of the redex does not affect the final result.
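The nondeterminism can be made visible by enumerating all one-step reducts of a term. A sketch over the illustrative Term type and the subst function from above; each alternative corresponds to one rule of Definition 3.8:

    -- All terms reachable from the input in exactly one beta step. The
    -- list has more than one element exactly when the choice of the
    -- redex is ambiguous.
    step :: Term -> [Term]
    step (App f u) =
         [subst x u e | Lam x _ e <- [f]]   -- rule 1: contract the redex
      ++ [App f' u | f' <- step f]          -- rule 2: reduce the function
      ++ [App f u' | u' <- step u]          -- rule 3: reduce the argument
    step (Lam x a e) =
         [Lam x a' e | a' <- step a]        -- rule 4: argument type
      ++ [Lam x a e' | e' <- step e]        -- rule 5: inner term
    step (Pi x a b) =
         [Pi x a' b | a' <- step a]         -- rule 6: argument type
      ++ [Pi x a b' | b' <- step b]         -- rule 7: result type
    step _ = []                             -- sorts and variables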
Lemma 3.9. Reduction to sort lemma: A term which reduces to a sort must be a redex where either the sort is the body of the abstraction or the abstraction is the identity function and the argument is the sort.

    t →β s
    ───────────────────────────────────────────
    ∃ A a. t = (λx^A.s)a  ∨  ∃ A. t = (λx^A.x)s

Proof. All rules except the redex rule reduce to something which cannot syntactically be a sort. Therefore the term has to be a redex which in general has the form (λx^A.e)a. The redex reduces to e[x := a] which by the substitution to sort lemma 3.6 proves the goal.
Theorem 3.10. Substitute reduction: A reduction remains valid if we do the same substitution before and after the reduction.

    t →β u
    ─────────────────────────
    t[y := v] →β u[y := v]

(Note that iterated application of this lemma results in the more general statement t →β∗ u ⇒ t[y := v] →β∗ u[y := v].)

Proof. By induction on t →β u.
1. Redex: We have to prove the goal

    ((λx^A.e)a)[y := v] →β e[x := a][y := v]

We can see this by the sequence

    ((λx^A.e)a)[y := v] = (λx^A[y:=v].e[y := v]) a[y := v]
                        →β e[y := v][x := a[y := v]]
                        = e[x := a][y := v]         by the double substitution lemma 3.7

2. Reduce function: We have to prove the goal

    f →β g        f[y := v] →β g[y := v]
    ──────────    ──────────────────────────────
    f a →β g a    (f a)[y := v] →β (g a)[y := v]

The validity of the final goal in the lower right corner can be seen by the following reasoning:

    (f a)[y := v] = f[y := v] a[y := v]
                  →β g[y := v] a[y := v]
                  = (g a)[y := v]

3. Other rules: All other rules follow the same pattern as the proof of the rule reduce function.

Theorem 3.11. Reduction in the substitution term: A reduction in the substitution term makes the corresponding substituted terms reduce in zero or more steps as well.

    a →β b
    ─────────────────────────
    t[x := a] →β∗ t[x := b]

Proof. Intuitively it might be clear that the substituted term reduces in zero or more steps, because the variable x might be contained in the term t zero or more times.
We prove the statement by induction on the structure of t.
1. Sort: Trivial.
2. Variable: We have to distinguish the cases that the variable is the same as x or is different from x. In both cases the goal is trivial.
3. Abstraction: We have to prove the goal

    (λy^B.e)[x := a] →β∗ (λy^B.e)[x := b]

The goal is a consequence of the induction hypotheses B[x := a] →β∗ B[x := b] and e[x := a] →β∗ e[x := b].
4. Product: Same as abstraction.
5. Application: Same as abstraction.

Lemma 3.12. Products and abstractions are preserved under beta reduction.

    1.  Πx^A.B →β t
        ─────────────────────────────────────────────────────
        (∃C. A →β C ∧ t = Πx^C.B) ∨ (∃D. B →β D ∧ t = Πx^A.D)

    2.  λx^A.e →β t
        ─────────────────────────────────────────────────────
        (∃B. A →β B ∧ t = λx^B.e) ∨ (∃f. e →β f ∧ t = λx^A.f)

Proof. The proofs for product and abstraction follow the same pattern. Therefore we prove only the preservation of products.
Assume Πx^A.B →β t and do induction on it. Only the two product rules are syntactically possible. Each one guarantees one alternative of the goal.

3.7 Beta Equivalence

In the arithmetic of numbers the terms 3 + 4, 7 and 2 + 5 are equivalent. Syntactically the terms are different, however we regard the terms as equivalent because they represent the same value. In arithmetic the equivalence is so strong that we even consider the terms to be equal and write 3 + 4 = 2 + 5 = 7.
In lambda calculus we only mark two terms as equal if they are syntactically the same terms except for irrelevant renamings of bound variables. In the literature of lambda calculus this equality is usually called α-equivalence. We say that α-equivalent terms represent the same term.
In lambda calculus we call terms which represent the same value β-equivalent terms. E.g. the terms (λx^A.e)a and e[x := a] are called (beta-) equivalent because they represent the same value: the former before the computation step and the latter after the computation step.
Beta equivalence is just the reflexive, symmetric and transitive closure of the beta reduction.
Definition 3.13. We define beta equivalence as a binary relation a ∼β b between the terms a and b inductively by the rules

    1. Reflexive:
       ────────
       a ∼β a

    2. Forward:
       a ∼β b
       b →β c
       ────────
       a ∼β c

    3. Backward:
       a ∼β b
       c →β b
       ────────
       a ∼β c

In other words the beta equivalence relation ∼β is the smallest equivalence relation which contains beta reduction →β.

Theorem 3.14. Beta equivalence is transitive.

    a ∼β b
    b ∼β c
    ────────
    a ∼β c

Proof. Assume a ∼β b. We prove the goal by induction on b ∼β c.
1. Reflexive: Trivial.
2. Forward:

    b ∼β c        a ∼β c
    c →β d
    ────────      ────────
    b ∼β d        a ∼β d

The goal in the lower right corner is proved by the induction hypothesis and applying the forward rule.
3. Backward:

    b ∼β c        a ∼β c
    d →β c
    ────────      ────────
    b ∼β d        a ∼β d

The goal in the lower right corner is proved by the induction hypothesis and applying the backward rule.

Theorem 3.15. Beta equivalence is symmetric.

    a ∼β b
    ────────
    b ∼β a

Proof. By induction on a ∼β b.
1. Reflexive: Trivial.
2. Forward:

    a ∼β b0       b0 ∼β a
    b0 →β b
    ────────      ────────
    a ∼β b        b ∼β a

First we use the reflexive rule to derive b ∼β b and then the second premise b0 →β b and the backward rule to derive b ∼β b0. Then we use the induction hypothesis b0 ∼β a and the transitivity of beta equivalence 3.14 to derive the goal b ∼β a.
3. Backward:

    a ∼β b0       b0 ∼β a
    b →β b0
    ────────      ────────
    a ∼β b        b ∼β a

Similar reasoning as with the forward rule. The second premise implies b ∼β b0, and the induction hypothesis and transitivity imply the goal.

Theorem 3.16. The same substitution applied to beta equivalent terms results in beta equivalent terms.

    t ∼β u
    ─────────────────────────
    t[x := a] ∼β u[x := a]

Proof. By induction on t ∼β u:
The reflexive case is trivial, because t and u are identical.
For the other two cases the goal is a consequence of the induction hypothesis and the transitivity of beta equivalence 3.14.

4 Confluence
4.1 Overview
We want to be able to use the calculus of constructions for computation. The basic computation step is beta reduction. A computation in the calculus ends if no more reduction is possible, i.e. if the term has been reduced to a normal form.
However beta reduction is ambiguous. There might be more than one possibility to make a beta reduction.

Example 4.1.

                  (λx.f x x)((λy.y)z)
                   ↙β            ↘β
    f ((λy.y)z)((λy.y)z)        (λx.f x x)z
          ↓β
    f z((λy.y)z)
                   ↘β            ↙β
                       f z z

In the left path we reduce first the outer redex. This reduction duplicates the redex in the argument. Therefore we need three reduction steps to reach the final term f z z.
In the right path we reduce first the redex in the argument of the application and then the outer redex. Therefore we need only two reduction steps to reach the final term f z z.
In that specific example we have shown that both paths end up
in the same result. However we have to prove that this is always the
case. Otherwise the calculus of constructions would be useless as a
calculus.
It turns out that confluence is a key property in the proof of uniqueness of results. We define a relation as confluent if, going an arbitrary number of steps in two directions, there are always continuations of the paths which join them. We want to prove confluence of beta reduction.
Before proving confluence of beta reduction we define the confluence of a relation by first defining a diamond property of a relation. The diamond property of a relation is something like a one-step confluence. Then we show that the confluence of a relation can be proved by finding a diamond between it and its reflexive transitive closure.

In a next step we define a parallel beta reduction and prove that it is a diamond lying between beta reduction and the reflexive transitive closure of beta reduction. This proves the confluence of beta reduction.
Having the confluence of beta reduction it is easy to prove that beta equivalent terms always have a common reduct, that normal forms are unique, and some other interesting properties of binders.

4.2 Diamonds and Confluence

Definition 4.2. Diamond property: A relation r has the diamond property (or is a diamond) if a →r b and a →r c imply the existence of some d such that b →r d and c →r d are valid.

    a →r b
    a →r c
    ─────────────────────────
    ∃d. b →r d ∧ c →r d

We can state the diamond property of a relation r more pictorially as

    a   →r   b
    ↓r       ↓r
    c   →r   ∃d

which looks like a diamond if tilted by 45° clockwise.

Definition 4.3. Confluence: A relation r is confluent if its reflexive transitive closure r∗ is a diamond.
Theorem 4.4. Let r and s be two relations. If s is between r and its reflexive transitive closure, i.e. r ⊆ s ⊆ r∗, then both closures are the same, i.e. r∗ = s∗.

Proof. Note that r ⊆ s is defined by

    ∀ a b. (a →r b ⇒ a →s b)

Assume r ⊆ s ⊆ r∗. In order to prove r∗ = s∗ we have to prove r∗ ⊆ s∗ and s∗ ⊆ r∗.
1. r∗ ⊆ s∗: We have to prove the goal

    a →r∗ b
    ────────
    a →s∗ b

We use induction on a →r∗ b.
(a) a = b: Trivial since a →s∗ a is valid by definition.
(b)

    a →r∗ b       a →s∗ b
    b →r c
    ────────      ────────
    a →r∗ c       a →s∗ c

In order to prove the final goal in the lower right corner we start from the induction hypothesis a →s∗ b. Since r ⊆ s we have b →s c by definition of ⊆ and the second premise. Then by definition of the reflexive transitive closure we conclude the final goal.
2. s∗ ⊆ r∗: We have to prove the goal

    a →s∗ b
    ────────
    a →r∗ b

We use induction on a →s∗ b.
(a) a = b: Trivial since a →r∗ a is valid by definition.
(b)

    a →s∗ b       a →r∗ b
    b →s c
    ────────      ────────
    a →s∗ c       a →r∗ c

Start with the induction hypothesis a →r∗ b. Since s ⊆ r∗ is given we can infer b →r∗ c from the second premise. r∗ is transitive and therefore a →r∗ c must be valid.

Lemma 4.5. If r is a diamond then

    a   →r∗   b
    ↓r        ↓r
    c   →r∗   ∃d

is valid, i.e. if a →r∗ b and a →r c then there exists a d with b →r d and c →r∗ d.

Proof. By induction on a →r∗ b.
1. In the reflexive case we have a = b. We choose d = c which satisfies the required properties.
2. In the recursive case we have the premises a →r∗ b and b →r c and the induction hypothesis: for all d with a →r d there exists an e with b →r e and d →r∗ e. We have to prove that for all d with a →r d there exists an f with c →r f and d →r∗ f.
To prove this goal we assume a →r d and try to find some f with the required properties. From the induction hypothesis we find some e which satisfies b →r e and d →r∗ e. Since r is a diamond there exists some f with c →r f and e →r f. By glueing the boxes together we see that f satisfies the required properties.

    a   →r∗   b   →r   c
    ↓r        ↓r       ↓r
    d   →r∗   e   →r   f

Lemma 4.6. If r is a diamond, then r∗ is a diamond as well.

    a   →r∗   b
    ↓r∗       ↓r∗
    c   →r∗   ∃d

Proof. We assume that r is a diamond and a →r∗ b, and do induction on a →r∗ c.
1. In the reflexive case we have a = c. We use d = b which satisfies the required properties.
2. In the recursive case we have the premises a →r∗ c and c →r d and the induction hypothesis: there exists an e with b →r∗ e and c →r∗ e. We have to prove that there exists an f with b →r∗ f and d →r∗ f.
From the induction hypothesis we find some e. Using the previous lemma 4.5 we find some f such that we can glue the boxes together in order to see that f satisfies the required properties.

    a    →r∗   b
    ↓r∗        ↓r∗
    c    →r∗   e
    ↓r         ↓r
    d    →r∗   f

Now we can prove the basic theorem of this subsection:

Theorem 4.7. In order to prove the confluence of a relation it is sufficient to find a diamond between it and its reflexive transitive closure.

    r ⊆ s ⊆ r∗
    s is a diamond
    ──────────────
    r is confluent

Proof. Assume r ⊆ s ⊆ r∗ and that s is a diamond.
By 4.4 we know that both reflexive transitive closures are the same, i.e. r∗ = s∗.
From 4.6 we conclude that s∗ is a diamond. This implies that r∗ is a diamond as well and proves the fact that r is confluent by definition of confluence.
4.3 Parallel Reduction Relation

Looking at the example 4.1 it can be seen that beta reduction is not a diamond. Let’s analyze the example a little bit to see why beta reduction is not a diamond. Assume there is a redex like

    (λx^A.e)a

where the subterm a contains redexes as well, i.e. there is some b with a →β b. Then the following two reduction paths are possible.

    (λx^A.e)a  →β  (λx^A.e)b  →β   e[x := b]
    (λx^A.e)a  →β  e[x := a]  →β∗  e[x := b]

In the first path we reach the term e[x := b] in two reduction steps. In the second path we can reach the term e[x := b] as well. But the needed number of steps depends on how often the variable x is present in the term e. If x is contained in the term e n times, we need n + 1 reduction steps in total because we have to reduce the term a in e[x := a] to the term b n times. If beta reduction were a diamond, it would be required that n is always 1, which is not the case in general.
In order to prove the confluence of beta reduction we have to find a diamond between beta reduction and its reflexive transitive closure. Let’s call this relation βp. This relation is like beta reduction β but must allow more reduction steps to remedy the situation that in a redex (λx^A.e)a the variable x might be contained zero or more times in the subterm e.
• In order to remedy the situation that the variable x is not contained in the subterm e we make the relation βp reflexive.
• In order to remedy the situation that the variable x is contained two or more times in the subterm e we allow that βp reductions can happen in any subterm of a term. Therefore we call βp parallel beta reduction, because the reduction can happen in the subterms in parallel.
Definition 4.8. The parallel reduction relation a →βp b is defined inductively by the rules

    1. Reflexive:
       ─────────
       a →βp a

    2. Redex:
       a →βp b
       e →βp f
       ─────────────────────────
       (λx^A.e)a →βp f[x := b]

    3. Product:
       A →βp C
       B →βp D
       ──────────────────────
       Πx^A.B →βp Πx^C.D

    4. Abstraction:
       A →βp B
       e →βp f
       ──────────────────────
       λx^A.e →βp λx^B.f

    5. Application:
       a →βp c
       b →βp d
       ──────────────
       a b →βp c d
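Like single-step reduction, the relation can be enumerated over the illustrative Term type, reusing subst; one case per rule of Definition 4.8 (a sketch; duplicates in the result list are harmless):

    -- All terms reachable from the input by one parallel beta step.
    -- Rule 1 contributes the term itself; the structural rules reduce
    -- all subterms simultaneously; rule 2 contracts a redex while
    -- reducing its parts in parallel.
    parStep :: Term -> [Term]
    parStep t = t : more t                                      -- rule 1
      where
        more (Pi x a b)  =
          [Pi x a' b' | a' <- parStep a, b' <- parStep b]       -- rule 3
        more (Lam x a e) =
          [Lam x a' e' | a' <- parStep a, e' <- parStep e]      -- rule 4
        more (App f u)   =
          [App f' u' | f' <- parStep f, u' <- parStep u]        -- rule 5
          ++ [subst x u' e' | Lam x _ e <- [f],
                              e' <- parStep e, u' <- parStep u] -- rule 2
        more _           = []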
Lemma 4.9. Parallel beta reduction is a superset of beta reduction.

Proof. This fact is trivial, because all rules of beta reduction are contained as special cases within the rules of parallel beta reduction.

Lemma 4.10. Parallel beta reduction is a subset of the reflexive transitive closure of beta reduction.

Proof. All rules of parallel beta reduction are satisfied by the reflexive transitive closure of beta reduction. Since an inductive relation defined by some rules is the smallest relation which satisfies the rules, it is evident that parallel beta reduction must be smaller than the reflexive transitive closure of beta reduction.

Lemma 4.11. Basic compatibility of parallel reduction and substitution:

    t →βp u
    ──────────────────────────
    a[x := t] →βp a[x := u]

Proof. By induction on the structure of a.
1. a is a sort: Trivial, because substitution does not change a sort and parallel beta reduction is reflexive.
2. a is a variable, let’s say y: In the case x = y the goal is implied by the premise. In the case x ≠ y the goal is implied by reflexivity.
3. a is the product Πy^B.C: We have to prove the goal (Πy^B.C)[x := t] →βp (Πy^B.C)[x := u] from the premise t →βp u and the induction hypotheses B[x := t] →βp B[x := u] and C[x := t] →βp C[x := u]. The validity of the goal can be seen from the following derivation.

    (Πy^B.C)[x := t] = Πy^B[x:=t].C[x := t]       definition of substitution
                     →βp Πy^B[x:=u].C[x := u]     induction hypotheses
                     = (Πy^B.C)[x := u]           definition of substitution

4. a is the abstraction λy^B.e: Same reasoning as with the product.
5. a is the application f a: Same reasoning as with the product.

Lemma 4.12. Full compatibility of parallel reduction and substitution:

    a →βp b
    t →βp u
    ──────────────────────────
    a[x := t] →βp b[x := u]

Proof. By induction on a →βp b.
1. Reflexive: In that case we have a = b. The goal is an immediate consequence of lemma 4.11.
2. Redex:

    a →βp b                      a[x := t] →βp b[x := u]
    e →βp f                      e[x := t] →βp f[x := u]
    ────────────────────────     ─────────────────────────────────────────
    (λy^A.e)a →βp f[y := b]      ((λy^A.e)a)[x := t] →βp f[y := b][x := u]

The validity of the goal in the lower right corner can be seen from the following reasoning:

    ((λy^A.e)a)[x := t] = (λy^A[x:=t].e[x := t]) a[x := t]    definition of substitution
                        →βp f[x := u][y := b[x := u]]         induction hypotheses
                        = f[y := b][x := u]                   double substitution 3.7

3. Product, abstraction and application: Same reasoning as with the redex. The lemma 3.7 is not needed in these cases.

Lemma 4.13. Products and abstractions are preserved under parallel reduction.

    1.  λx^A.e →βp c
        ─────────────────────────────────────────
        ∃ B f. c = λx^B.f ∧ A →βp B ∧ e →βp f

    2.  Πx^A.B →βp c
        ─────────────────────────────────────────
        ∃ C D. c = Πx^C.D ∧ A →βp C ∧ B →βp D

Proof. By induction on the premise. In both cases only one rule is syntactically possible, which guarantees the existence of the corresponding terms.

Theorem 4.14. Parallel reduction is a diamond.

    a   →βp   b
    ↓βp       ↓βp
    c   →βp   ∃d

Proof. By induction on a →βp b. Note that in the following we keep the variable c universally quantified.

1. Reflexive: Here b = a. To prove the goal we assume a →βp c. Then we use c for d, which satisfies the required properties trivially.

2. Redex: Here we have the premises a →βp b and e →βp f with the conclusion (λx^A.e)a →βp f[x := b], and the induction hypotheses: for all g with e →βp g there exists an h with f →βp h and g →βp h, and for all c with a →βp c there exists a d with b →βp d and c →βp d.
To prove the goal we assume (λx^A.e)a →βp k and do a case split on the construction of this relation.
(a) Reflexive: In that case k = (λx^A.e)a. We use n = f[x := b] which has the required property.
(b) Redex: In that case k = g[x := c] for some g and c with the properties e →βp g and a →βp c. We have to find a term n with f[x := b] →βp n and g[x := c] →βp n. We use the term

    n = h[x := d]

with the terms h and d which exist by the induction hypotheses. It is easy to see that the properties

    f[x := b] →βp h[x := d]
    g[x := c] →βp h[x := d]

are satisfied because of the induction hypotheses and lemma 4.12.
(c) Product: This case is syntactically impossible because (λx^A.e)a cannot be a product.
(d) Abstraction: This case is syntactically impossible because (λx^A.e)a cannot be an abstraction.
(e) Application: In that case k = (λx^B.g)c for some B, g and c with A →βp B, e →βp g and a →βp c. Because of lemma 4.13 we have chosen the more specific term λx^B.g instead of a more general term.
We have to find a term n which satisfies (λx^B.g)c →βp n and f[x := b] →βp n. We use the term

    n = h[x := d]

with the terms h and d which exist by the induction hypotheses, satisfying

    f →βp h
    g →βp h
    b →βp d
    c →βp d

Therefore the goals

    (λx^B.g)c →βp h[x := d]
    f[x := b] →βp h[x := d]

are satisfied.

3. Product: Here we have the premises A →βp C and B →βp D with the conclusion Πx^A.B →βp Πx^C.D, and the induction hypotheses: for all E with A →βp E there exists an H with C →βp H and E →βp H, and for all F with B →βp F there exists a J with D →βp J and F →βp J.
In order to prove the goal we assume Πx^A.B →βp Πx^E.F. Because of lemma 4.13, which says that parallel reduction preserves products, we have chosen the more specific Πx^E.F, which satisfies A →βp E and B →βp F, instead of a more general term.
From the induction hypotheses we conclude the existence of the terms H and J such that

    n = Πx^H.J

satisfies the required properties.

4. Abstraction: Same reasoning as with the product.

5. Application: Here we have the premises a →βp c and b →βp d with the conclusion a b →βp c d, and the induction hypotheses: for all e with a →βp e there exists a g with c →βp g and e →βp g, and for all f with b →βp f there exists an h with d →βp h and f →βp h.
To prove the goal we assume a b →βp k and do a case split on the construction of this relation.
(a) Reflexive: In that case k = a b. We use n = c d which satisfies the required properties.
(b) Redex: In this case a b has to be a redex, let’s say (λx^A.m)b. Therefore, and because of lemma 4.13, a b →βp c d becomes (λx^A.m)b →βp (λx^B.o)d for some B and o with A →βp B and m →βp o. The term k has to be the reduct p[x := f] for some p and f with m →βp p and b →βp f.
We have to find some term n which satisfies

    (λx^B.o)d →βp n
    p[x := f] →βp n

From the first induction hypothesis and lemma 4.13 we postulate the existence of q with o →βp q and p →βp q. From the second induction hypothesis we conclude the existence of some h with d →βp h and f →βp h. Therefore the term

    n = q[x := h]

satisfies the requirement.
(c) Product: This case is syntactically impossible because a b cannot be a product.
(d) Abstraction: This case is syntactically impossible because a b cannot be an abstraction.
(e) Application: In that case k = e f for some terms e and f which satisfy a →βp e and b →βp f. We have to find some term n which satisfies c d →βp n and e f →βp n. By the induction hypotheses there exist some terms g and h satisfying c →βp g, e →βp g, d →βp h and f →βp h, such that the term

    n = g h

satisfies the requirement.

Theorem 4.15. Church-Rosser theorem: Beta reduction is confluent.

Proof. With the parallel beta reduction we have found a relation which is
1. a diamond (4.14),
2. between beta reduction and its reflexive transitive closure (4.9, 4.10).
I.e. with parallel beta reduction we have found a diamond between beta reduction and its reflexive transitive closure, which implies by 4.7 that beta reduction is confluent.

4.4 Equivalent Terms

One of the most important consequences of the Church-Rosser theorem (i.e. the confluence of beta reduction) is the fact that beta equivalent terms have a common reduct. I.e. for two beta equivalent terms a and b there exists always a term c to which both reduce.

    a    ∼β    b
     ↘β∗      ↙β∗
        ∃c
40
Theorem 4.16. Equivalent terms have a common reduct.

   a ∼β b
   ∃c. a →β∗ c ∧ b →β∗ c
Proof. By induction on a ∼β b.
1. Reflexive: In that case a = b. We use c = a which satisfies the
required properties trivially.
2. Forward:

   a ∼β b    ∃d. a →β∗ d ∧ b →β∗ d
   b →β c
   a ∼β c    ∃e. a →β∗ e ∧ c →β∗ e

In order to prove the goal in the lower right corner we construct
terms d and e which satisfy the following diagram and therefore
e satisfies the required properties.

   a ∼β b →β c
   ↘β∗ ↓β∗ ↓β∗
     d →β∗ e

d exists by the induction hypothesis and e exists by confluence 4.15.
3. Backward:

   a ∼β b    ∃d. a →β∗ d ∧ b →β∗ d
   c →β b
   a ∼β c    ∃d. a →β∗ d ∧ c →β∗ d

In order to prove the goal in the lower right corner we construct a
term d which satisfies the following diagram and therefore the
required properties.

   a ∼β b ←β c
   ↘β∗ ↓β∗ ↙β∗
       d

The term d exists by the induction hypothesis. ∎
4.5 Uniqueness of Normal Forms
Theorem 4.17. The normal form of a lambda term is unique.

   t →β∗ u
   t →β∗ v
   u and v are in normal form
   u = v

Proof. u and v are beta equivalent. According to theorem 4.16
both have to reduce to a common reduct, say w. Since both are in
normal form, the common reduct has to be reached in zero reduction
steps (no reduction step is possible in normal form) which is possible
only if u and v are the same term. ∎
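The following Haskell fragment is a minimal illustrative sketch of this
situation (the data type and all function names are our own, not part
of the formal development): it implements one normal-order beta step
and a fuel-bounded normalizer. It uses a naive substitution which is
only adequate when bound variables are kept distinct from the free
variables of inserted terms.

    -- minimal sketch; hypothetical names, naive (capture-unsafe) substitution
    data Sort = Prop | Univ deriving (Eq, Show)

    data Term
      = Srt Sort              -- P or U
      | Var String            -- x
      | Pi  String Term Term  -- Pi x^A . B
      | Lam String Term Term  -- lambda x^A . e
      | App Term Term         -- f a
      deriving (Eq, Show)

    -- subst x a t computes t[x := a]
    subst :: String -> Term -> Term -> Term
    subst x a t = case t of
      Srt _ -> t
      Var y -> if x == y then a else t
      Pi  y tA tB -> Pi  y (subst x a tA) (if x == y then tB else subst x a tB)
      Lam y tA e  -> Lam y (subst x a tA) (if x == y then e  else subst x a e)
      App f b     -> App (subst x a f) (subst x a b)

    -- one normal-order (leftmost outermost) beta step, if any
    step :: Term -> Maybe Term
    step (App (Lam x _ e) a) = Just (subst x a e)
    step (App f a) = case step f of
      Just f' -> Just (App f' a)
      Nothing -> App f <$> step a
    step (Lam x tA e) = case step tA of
      Just tA' -> Just (Lam x tA' e)
      Nothing  -> Lam x tA <$> step e
    step (Pi x tA tB) = case step tA of
      Just tA' -> Just (Pi x tA' tB)
      Nothing  -> Pi x tA <$> step tB
    step _ = Nothing

    -- reduce with fuel; a 'Just' result is in normal form
    normalize :: Int -> Term -> Maybe Term
    normalize fuel t
      | fuel < 0  = Nothing
      | otherwise = case step t of
          Nothing -> Just t
          Just u  -> normalize (fuel - 1) u

By theorem 4.17 any other reduction strategy that terminates must
deliver the same normal form as this one.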
4.6 Equivalent Binders

Theorem 4.18. In beta equivalent binders, i.e. terms of the form
λxA.e or ΠxA.B, corresponding subterms are beta equivalent.

   ΠxA.B ∼β ΠxC.D          λxA.e ∼β λxB.f
   A ∼β C ∧ B ∼β D         A ∼β B ∧ e ∼β f

Proof. We only prove the theorem for products (the proof for
abstractions is practically the same).
Since both products are beta equivalent they have by 4.16 a common
reduct. By 3.12 products (and abstractions) are preserved under
beta reduction. Therefore the common reduct must have the form
ΠxE.F with A →β∗ E, B →β∗ F, C →β∗ E and D →β∗ F.
Since reduction implies beta equivalence and beta equivalence is
transitive and symmetric we get A ∼β C and B ∼β D. ∎
4.7 Binders are not Equivalent to Variables or Sorts

Theorem 4.19. Binders (products or abstractions) cannot be beta
equivalent to variables or sorts.

Proof. We only prove the impossibility of

   ΠxA.B ∼β s

because the proofs of the other variants follow the same pattern.
Assume ΠxA.B ∼β s. Both terms must have a common reduct
by 4.16 which must be the sort s since a sort is already in normal
form. I.e. we get ΠxA.B →β∗ s.
Since reduction preserves the form of binders by 3.12, ΠxA.B →β∗ s
is not possible and we get the desired contradiction. ∎
5 Typing
This section describes welltyped terms and basic properties of well-
typed terms. It is based on Henk Barendregt’s paper Lambda Calculi
with Types [1].
What is needed to state that a term t is welltyped? First we have
to notice that the term t might contain free variables. As opposed to
bound variables, free variables don’t have a type expressed within a
term. Therefore we need a context Γ which assigns types to the free
variables in the term t. Furthermore we need a type T . We write the
statement "the term t has type T in the context Γ" as

Γ⊢t:T

We need a typing relation which is a subset of

Contexts × Terms × Terms

In this section we define the typing relation inductively.


Since terms are either sorts, variables, products, abstractions or
applications, the typing relation has one introduction rule for each
possible form of a term which states how to introduce a welltyped
term of that form.
Furthermore there are two structural rules which state
• If we add a welltyped variable to a context, then every welltyped
term of the context remains welltyped in the augmented context
and keeps the same type.
• If a term t has type T and the type U is beta equivalent to T
and a welltyped type, then the term t has type U in the same
context.
The last rule states that beta equivalent types are really equivalent
in their role as types. This allows to do computations (i.e. beta
reductions) in types without changing the meaning of a type.
Besides the definition of the typing relation the main part of this
section is to prove important properties of welltyped terms. The most
important nontrivial theorems/lemmas are:
1. Generation Lemmata: If a term t is welltyped (i.e. it has a type
T in a certain context Γ), then for each possible form of the term
(sort, variable, product, abstraction, application) there is an
equivalent type of a certain form. E.g. each welltyped abstraction
λxA.e has a type of the form ΠxA.B which is beta equivalent
to T.
The generation lemmata describe important properties which are
used in the proofs on many other theorems.
2. Type of types: If T is the type of a welltyped term in a certain
context, then T is either the sort U or T is welltyped in the same
context and its type is a sort (i.e. either P or U).
This theorem justifies the introduction of sorts as types of types.
3. Subject reduction: Reduction (i.e. computation) does not change
the type of a term. Or in other words: If a term has a certain
type and we compute the term, then we really get an object of
that type. I.e. computation does what it promises to do.
Subject reduction is valid in all typed programming languages.
If you define a function with a certain result type, then the ac-
tual execution of that function returns an object of that type
(provided that the computation terminates).
4. Uniqueness of types: If a term is welltyped in a certain context,
then all its possible types are beta equivalent. I.e. each welltyped
term has a unique type modulo beta equivalence. I.e. we can
regard the equivalence class as the unique type of a term.
Together with subject reduction this theorem guarantees that
beta equivalent welltyped terms have the same unique type mod-
ulo beta equivalence.
In the last subsection of this section we introduce kinds. Since we
have the two sorts P and U and sorts are the types of types, there are
two types of types: types of type P and types of type U. The types of
type U are called kinds. Kinds have the special property that they are
recognizable by pure syntactic analysis of the term which represents
the type.
Semantically kinds are the types of n-ary type functions where
n = 0 is allowed as a corner case. Therefore if we know that the type
of a term is a kind, then we know that the term will return a type, if
applied to sufficient arguments (of the correct type of course).
The kinds play an important role in the proof of strong normaliza-
tion i.e. in the proof of consistency of the calculus of constructions.
5.1 Typing Relation
Definition 5.1. The ternary typing relation Γ ⊢ t : T which says that
in the context Γ the term t has type T is defined inductively by the
rules
1. Introduction rules:
(a) Axiom:
[] ⊢ P : U
(b) Variable:
Γ⊢A:s
x ∉ Γ
Γ, xA ⊢ x : A
(c) Product:
Γ ⊢ A : s1
Γ, xA ⊢ B : s2
Γ ⊢ ΠxA .B : s2
(d) Abstraction:
Γ ⊢ ΠxA .B : s
Γ, xA ⊢ e : B
Γ ⊢ (λxA .e) : ΠxA .B
(e) Application:
Γ ⊢ f : ΠxA .B
Γ⊢a:A
Γ ⊢ f a : B[x := a]
2. Structural rules:
(a) Weaken:
Γ⊢t:T
Γ⊢A:s
x ∉ Γ
Γ, xA ⊢ t : T
(b) Type equivalence:
   Γ ⊢ t : T
   Γ ⊢ U : s
   T ∼β U
   Γ ⊢ t : U
Remarks:
• There is no introduction rule for U. Therefore there can never
be a valid typing judgement for U, i.e. the term U is not welltyped.
Its only purpose is to be the type of some types like P and it can only
appear in the type position of a typing judgement Γ ⊢ t : T.
• The variable introduction rule requires a welltyped type A for
the introduced variable (i.e. A : s for some sort s). Since U is
not welltyped, it is impossible for a variable to have type U.
• The product type ΠxA .B is the type of functions mapping ob-
jects of type A to objects of type B where the result type B
might contain the variable x. The introduction rule for prod-
ucts requires that both A and B are welltyped types. Therefore
neither of them can be U. Therefore a function cannot receive
arguments of type U nor return results of type U.
It is possible to compute with types but it is impossible to com-
pute with sorts (or kinds in general) as arguments and results.
This is an important difference to the extended calculus of con-
structions which allows variables of type Ui because Ui has type
Ui+1 . The extended calculus of constructions can compute with
kinds, the calculus of constructions cannot.
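As a worked example of the introduction rules, the type of the
polymorphic identity function λAP.λxA.x can be derived in the empty
context as follows (each line cites the rule and the lines it uses):
1. [] ⊢ P : U (axiom)
2. AP ⊢ A : P (variable, 1)
3. AP, xA ⊢ x : A (variable, 2)
4. AP, xA ⊢ A : P (weaken, 2, 2)
5. AP ⊢ ΠxA.A : P (product, 2, 4)
6. AP ⊢ λxA.x : ΠxA.A (abstraction, 5, 3)
7. [] ⊢ ΠAP.ΠxA.A : P (product, 1, 5)
8. [] ⊢ λAP.λxA.x : ΠAP.ΠxA.A (abstraction, 7, 6)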
5.2 Basic Definitions
Definition 5.2. Basic Definitions:
1. Γ ⊢ t : T : U is defined as Γ ⊢ t : T and Γ ⊢ T : U .
2. Γ ⊢ ∆ for the contexts Γ and ∆ is defined as ∀xA ∈ ∆ ⇒ Γ ⊢ x :
A.
3. Γ is a valid context if Γ ⊢ t : T is valid for some terms t and T .
4. A term t is welltyped in a context Γ if Γ ⊢ t : T is valid for some
term T .
5. A term t is welltyped if there is a context in which it is welltyped.
6. A term is valid in a context if it is either welltyped in the context
or it is U.
7. A term is valid if there is a context in which it is valid.
8. A term T is a valid type of sort s in a context Γ if Γ ⊢ T : s is
valid.
9. A term is a valid type of some sort if it is a valid type of this
sort in some context.
10. A term is a proposition or a proper type if it is a valid type of
sort P.
11. A term is a kind if it is a valid type of sort U.
5.3 Start Lemma
Theorem 5.3. For any valid context Γ the following two typing judge-
ments are valid:
Γ⊢P:U
xA ∈ Γ ⇒ Γ ⊢ x : A
Proof. A context Γ is valid by definition if Γ ⊢ t : T is valid for some
terms t and T . We prove

Γ⊢P:U

∀xA ∈ Γ ⇒ Γ ⊢ x : A

by induction on Γ ⊢ t : T .
For all rules except the axiom, the variable rule and the weakening
rule the goal is an immediate consequence of the induction hypothesis
for the same context. The axiom, the variable and the weakening rule
are treated separately.
1. Axiom [] ⊢ P : U: The first part is trivially valid. The second
part is vacuously valid, because the empty context does not have
any variables.
2. Variable:
   Γ ⊢ B : s        Γ ⊢ P : U ∧ (∀xA ∈ Γ ⇒ Γ ⊢ x : A)
   y ∉ Γ
   Γ, yB ⊢ y : B    Γ, yB ⊢ P : U ∧ (∀xA ∈ (Γ, yB) ⇒ Γ, yB ⊢ x : A)

The first part of the goal in the lower right corner is a consequence
of the induction hypothesis and the weakening rule.
For the second part we have to distinguish two cases:
• xA ∈ Γ: In that case the second part of the goal is a conse-
quence of the second part of the induction hypothesis and
the weakening rule.
• xA = y B : In that case the second part of the goal is identical
with the lower left corner.
3. Weakening:
   Γ ⊢ t : T
   Γ ⊢ B : s        Γ ⊢ P : U ∧ (∀xA ∈ Γ ⇒ Γ ⊢ x : A)
   y ∉ Γ
   Γ, yB ⊢ t : T    Γ, yB ⊢ P : U ∧ (∀xA ∈ (Γ, yB) ⇒ Γ, yB ⊢ x : A)

The reasoning is nearly the same as with the variable case, except
that the second part of the goal for xA = yB is proved by the
variable introduction rule by using Γ ⊢ B : s and y ∉ Γ. ∎
5.4 Thinning Lemma
Theorem 5.4. Thinning Lemma: Let ∆ be a valid context with
Γ ⊆ ∆. Then
Γ⊢t:T
∆⊢t:T
Proof. By induction on Γ ⊢ t : T .
1. Sort:
   [] ⊢ P : U
Since ∆ is a valid context we get ∆ ⊢ P : U by the start
lemma 5.3.
2. Variable:
   Γ ⊢ A : s
   x ∉ Γ
   Γ, xA ⊢ x : A    ∀∆. ∆ valid ∧ Γ, xA ⊆ ∆ ⇒ ∆ ⊢ x : A

In order to prove the goal in the lower right corner we assume a
valid context ∆ which is a superset of Γ, xA. Because of that it
has an entry xA. By the start lemma 5.3 we infer the goal.
3. Product:
   Γ ⊢ A : s1       ∀∆. ∆ valid ∧ Γ ⊆ ∆ ⇒ ∆ ⊢ A : s1
   Γ, xA ⊢ B : s2   ∀∆′. ∆′ valid ∧ Γ, xA ⊆ ∆′ ⇒ ∆′ ⊢ B : s2
   Γ ⊢ ΠxA.B : s2   ∀∆. ∆ valid ∧ Γ ⊆ ∆ ⇒ ∆ ⊢ ΠxA.B : s2

In order to prove the goal in the lower right corner we assume a
valid context ∆ which is a superset of Γ and derive the following
facts:
(a) ∆ ⊢ A : s1 : This is an immediate consequence of the first
induction hypothesis.
(b) ∆, xA ⊢ B : s2 : We can assume x ∉ ∆. Otherwise we
rename the variable such that the condition is satisfied. The
context ∆, xA is a valid context because of ∆ ⊢ A : s1
and the variable introduction rule. By the second induction
hypothesis we get the subgoal.
(c) ∆ ⊢ ΠxA.B : s2 : This fact can be derived from the previous
two subgoals and the product introduction rule.
The last fact proves the goal.
4. Abstraction:
Similar reasoning as in product
5. Application:
   Γ ⊢ f : ΠxA.B        ∀∆. ∆ valid ∧ Γ ⊆ ∆ ⇒ ∆ ⊢ f : ΠxA.B
   Γ ⊢ a : A            ∀∆. ∆ valid ∧ Γ ⊆ ∆ ⇒ ∆ ⊢ a : A
   Γ ⊢ f a : B[x := a]  ∀∆. ∆ valid ∧ Γ ⊆ ∆ ⇒ ∆ ⊢ f a : B[x := a]

We assume a valid context ∆ which is a superset of Γ. By the
two induction hypotheses and the application introduction rule
we infer the goal.
6. Weaken:
   Γ ⊢ t : T       ∀∆. ∆ valid ∧ Γ ⊆ ∆ ⇒ ∆ ⊢ t : T
   Γ ⊢ A : s
   x ∉ Γ
   Γ, xA ⊢ t : T   ∀∆. ∆ valid ∧ Γ, xA ⊆ ∆ ⇒ ∆ ⊢ t : T

Let's assume a valid context ∆ which is a superset of Γ, xA.
This implies that it is a superset of Γ as well. The goal follows
immediately from the first induction hypothesis.
7. Type equivalence:
   Γ ⊢ t : T    ∀∆. ∆ valid ∧ Γ ⊆ ∆ ⇒ ∆ ⊢ t : T
   Γ ⊢ U : s    ∀∆. ∆ valid ∧ Γ ⊆ ∆ ⇒ ∆ ⊢ U : s
   T ∼β U
   Γ ⊢ t : U    ∀∆. ∆ valid ∧ Γ ⊆ ∆ ⇒ ∆ ⊢ t : U

Let's assume a valid context ∆ which is a superset of Γ. From
the two induction hypotheses we get ∆ ⊢ t : T and ∆ ⊢ U : s.
We conclude the final goal by applying the type equivalence rule. ∎
5.5 Generation Lemmata
Theorem 5.5. The following generation lemmata are valid:

1. Sort:
   Γ ⊢ s : T
   s = P ∧ T ∼β U
2. Variable:
   Γ ⊢ x : T
   ∃A, s. Γ ⊢ A : s ∧ xA ∈ Γ ∧ A ∼β T
3. Product:
   Γ ⊢ ΠxA.B : T
   ∃s1, s2. Γ ⊢ A : s1 ∧ Γ, xA ⊢ B : s2 ∧ s2 ∼β T
4. Abstraction:
   Γ ⊢ λxA.e : T
   ∃B, s. Γ ⊢ ΠxA.B : s ∧ Γ, xA ⊢ e : B ∧ ΠxA.B ∼β T
5. Application:
   Γ ⊢ f a : T
   ∃A, B. Γ ⊢ f : ΠxA.B ∧ Γ ⊢ a : A ∧ B[x := a] ∼β T
Proof. The premise in all lemmata is the validity of Γ ⊢ t : T where t
is either a sort, a variable, a product, an abstraction or an application.
The goal in all lemmata is that some other terms exist and the type
T is beta equivalent (or identical) to some other type. All lemmata
can be proved by induction on Γ ⊢ t : T.
Note that the beta equivalence X ∼β T in all lemmata is essential.
Otherwise the proof for the type equivalence rule cannot be passed.
1. Introduction rules: For all generation lemmata only one intro-
duction rule is syntactically possible. The corresponding intro-
duction rule proves the goal trivially. As an example we prove
the generation lemma for products.
Only the introduction rule for products is syntactically possible.

   Γ ⊢ A : s1
   Γ, xA ⊢ B : s2
   Γ ⊢ ΠxA.B : s2    ∃sa, sb. Γ ⊢ A : sa ∧ Γ, xA ⊢ B : sb ∧ s2 ∼β sb

The goal in the lower right corner is proved by using sa = s1 and
sb = s2.
2. Structural rules: For each of the generation lemmata and each
structural rule the proof is straightforward. Here we prove the
generation lemma for products as an example.
(a) Weaken:
   Γ ⊢ ΠxA.B : T      ∃s1, s2. Γ ⊢ A : s1 ∧ Γ, xA ⊢ B : s2 ∧ T ∼β s2
   Γ ⊢ C : s
   z ∉ Γ
   Γ, zC ⊢ ΠxA.B : T  ∃sa, sb. Γ, zC ⊢ A : sa ∧ Γ, zC, xA ⊢ B : sb ∧ T ∼β sb

By the induction hypothesis there are some s1 and s2 which
satisfy the above conditions. In order to prove the goal in
the lower right corner we use sa = s1 and sb = s2 and have
to prove the following three subgoals:
i. Γ, zC ⊢ A : s1 : Since Γ, zC is a valid context with
Γ ⊆ Γ, zC we can prove the subgoal by the thinning
lemma 5.4.
ii. Γ, zC, xA ⊢ B : s2 : The previous subgoal and the variable
introduction rule ensure that Γ, zC, xA is a valid context
with Γ, xA ⊆ Γ, zC, xA. By the thinning lemma 5.4
and the induction hypothesis we prove the current subgoal.
iii. T ∼β s2 : This subgoal is identical to the third proposition
of the induction hypothesis.
(b) Type equivalence:
   Γ ⊢ ΠxA.B : T    ∃s1, s2. Γ ⊢ A : s1 ∧ Γ, xA ⊢ B : s2 ∧ T ∼β s2
   Γ ⊢ U : s
   T ∼β U
   Γ ⊢ ΠxA.B : U    ∃sa, sb. Γ ⊢ A : sa ∧ Γ, xA ⊢ B : sb ∧ U ∼β sb

By the induction hypothesis there exist some s1 and s2
which satisfy the above propositions, especially T ∼β s2. To
prove the goal in the lower right corner we use sa = s1 and
sb = s2 and use the fact T ∼β U together with the symmetry
and transitivity of beta equivalence. ∎
Corollary 5.6. All types in a valid context are welltyped.

Proof. According to the start lemma 5.3 Γ ⊢ x : A is valid for each
xA ∈ Γ. By the generation lemma 5.5 for variables there is a sort s
with Γ ⊢ A : s. Therefore A is welltyped. ∎

Corollary 5.7. The term U is not welltyped.

Proof. Let's assume it is welltyped. Then by definition there exists a
context Γ and a term T such that Γ ⊢ U : T is valid. By the generation
lemma 5.5 for sorts we get U = P which is not possible. ∎

Corollary 5.8. A valid context has no variable with the type U.

Proof. By 5.6 all types in a context are welltyped and by 5.7 the term
U is not welltyped. Therefore U cannot be the type of a variable in a
valid context. ∎

Corollary 5.9. All subterms of a welltyped term are welltyped.
Proof. We prove this corollary by proving that any direct subterm of
a welltyped term is welltyped by induction on the structure of the
welltyped term.
For each possible form of a welltyped term the generation lemma 5.5
for this form states the existence of a context and a type of each direct
subterm, i.e. the direct subterms are welltyped.
Repeating the argument proves that all indirect subterms of a
welltyped term are welltyped as well. ∎
Corollary 5.10. U cannot be a subterm of a welltyped term. This
implies that terms like λxU .e, λxA .U, ΠxU .B or ΠxA .U cannot be
welltyped, because they have U as a subterm.
Proof. Assume that U is a subterm of a welltyped term. Then by the
corollary 5.9 U must be welltyped. This contradicts corollary 5.7.
5.6 Substitution Lemma
Theorem 5.11. Substitution (cut) theorem.
Γ⊢a:A
Γ, xA , ∆ ⊢ t : T
Γ, ∆[x := a] ⊢ t[x := a] : T [x := a]
Proof. We assume Γ ⊢ a : A and prove this theorem by induction on
Γ, xA , ∆ ⊢ t : T .
In order to express the proof more compactly we introduce the
abbreviation t′ := t[x := a].
1. Axiom: This case is syntactically impossible, because the context
is not empty.
2. Variable: We have to prove the goal
   Γ, xA, ∆ ⊢ B : s        Γ, ∆′ ⊢ B′ : s
   Γ, xA, ∆, yB ⊢ y : B    Γ, ∆′, yB′ ⊢ y : B′
The final goal in the lower right corner follows directly from the
induction hypothesis and the variable introduction rule.
3. Product:
   Γ, xA, ∆ ⊢ B : s1        Γ, ∆′ ⊢ B′ : s1
   Γ, xA, ∆, yB ⊢ C : s2    Γ, ∆′, yB′ ⊢ C′ : s2
   Γ, xA, ∆ ⊢ ΠyB.C : s2    Γ, ∆′ ⊢ ΠyB′.C′ : s2
The final goal follows from the induction hypotheses and the
product introduction rule.
4. Abstraction:
   Γ, xA, ∆ ⊢ ΠyB.C : s         Γ, ∆′ ⊢ ΠyB′.C′ : s
   Γ, xA, ∆, yB ⊢ e : C         Γ, ∆′, yB′ ⊢ e′ : C′
   Γ, xA, ∆ ⊢ λyB.e : ΠyB.C     Γ, ∆′ ⊢ λyB′.e′ : ΠyB′.C′
Same reasoning as above.
5. Application:
   Γ, xA, ∆ ⊢ f : ΠyB.C        Γ, ∆′ ⊢ f′ : ΠyB′.C′
   Γ, xA, ∆ ⊢ b : B            Γ, ∆′ ⊢ b′ : B′
   Γ, xA, ∆ ⊢ f b : C[y := b]  Γ, ∆′ ⊢ f′b′ : (C[y := b])′

From the induction hypotheses and the application introduction
rule we conclude

   Γ, ∆′ ⊢ f′b′ : C′[y := b′]

and get the final goal by observing

   C′[y := b′] = (C[y := b])′

by using the double substitution lemma 3.7.
6. Structural rules:
(a) Weakening:
   Γ, xA, ∆ ⊢ t : T        Γ, ∆′ ⊢ t′ : T′
   Γ, xA, ∆ ⊢ B : s        Γ, ∆′ ⊢ B′ : s′
   Γ, xA, ∆, yB ⊢ t : T    Γ, ∆′, yB′ ⊢ t′ : T′
The goal in the lower right corner can be proved by the
induction hypotheses and the weakening rule.
(b) Type equivalence:
   Γ, xA, ∆ ⊢ t : T    Γ, ∆′ ⊢ t′ : T′
   Γ, xA, ∆ ⊢ U : s    Γ, ∆′ ⊢ U′ : s′
   T ∼β U
   Γ, xA, ∆ ⊢ t : U    Γ, ∆′ ⊢ t′ : U′
From theorem 3.16 we conclude that T′ and U′ are beta
equivalent. The goal in the lower right corner is an immediate
consequence of the induction hypotheses and the type
equivalence rule. ∎
5.7 Type of Types
Theorem 5.12. A term in the type position of a typing judgement is
either U or it is a valid type of some sort.

   Γ ⊢ t : T
   T = U ∨ ∃s. Γ ⊢ T : s
Proof. By induction on Γ ⊢ t : T :
1. Sort: Trivial
2. Variable: Trivial by the premise of the variable introduction rule.
3. Product: Easy, because there are only two sorts. If the sort is
P then by the start lemma 5.3 Γ ⊢ P : U is valid in any valid
context. If the sort is U then the goal is trivial.
4. Abstraction:
   Γ ⊢ ΠxA.B : s0
   Γ, xA ⊢ e : B
   Γ ⊢ λxA.e : ΠxA.B    ∃s. Γ ⊢ ΠxA.B : s
Take s = s0.
5. Application:
   Γ ⊢ f : ΠxA.B        ∃s0. Γ ⊢ ΠxA.B : s0
   Γ ⊢ a : A
   Γ ⊢ f a : B[x := a]  ∃s. Γ ⊢ B[x := a] : s

Remark: Since U ≠ ΠxA.B only the second alternative of the
induction hypothesis is interesting.
From the induction hypothesis and the generation lemma 5.5
for products we conclude the existence of a sort s2 such that
Γ, xA ⊢ B : s2 is valid.
By applying the substitution lemma 5.11 we get Γ ⊢ B[x := a] :
s2 which proves the goal.
6. Weaken: The goal is easy to prove by the induction hypothesis
and the weakening rule.
7. Type equivalence: The goal is an immediate consequence of one
of the premises of the type equivalence rule. ∎
5.8 Subject Reduction
Theorem 5.13. Subject reduction lemma: Reduction of a term does
not change its type.

   Γ ⊢ t : T
   t →β u
   Γ ⊢ u : T
Proof. In order to prove the subject reduction lemma we prove the
more general lemma

   Γ ⊢ t : T
   (∀u. t →β u ⇒ Γ ⊢ u : T) ∧ (∀∆. Γ →β ∆ ⇒ ∆ ⊢ t : T)

where Γ →β ∆ means that ∆ is Γ with one of the variable types replaced
by a reduced type.
We prove the more general lemma by induction on Γ ⊢ t : T .
1. Introduction rules:
(a) Sort: If Γ is empty and t is a sort, then the goal is vacuously
true because neither the empty context nor a sort can reduce
to anything (they are in normal form).
(b) Variable:
   Γ ⊢ A : s       (∀B. A →β B ⇒ Γ ⊢ B : s) ∧ (∀∆0. Γ →β ∆0 ⇒ ∆0 ⊢ A : s)
   Γ, xA ⊢ x : A   (∀u. x →β u ⇒ Γ, xA ⊢ u : A) ∧ (∀∆. Γ, xA →β ∆ ⇒ ∆ ⊢ x : A)

The left part of the goal in the lower right corner is vacuously
true because a variable is in normal form and there is no
term to which it reduces.
For the right part we assume Γ, xA →β ∆. There are two
possibilities:
i. ∆ = ∆0, xA where Γ →β ∆0 for some ∆0:
In that case we get ∆0 ⊢ A : s from the induction
hypothesis which implies the goal ∆0, xA ⊢ x : A.
ii. ∆ = Γ, xB where A →β B:
In that case we get Γ ⊢ B : s from the induction
hypothesis which implies the goal Γ, xB ⊢ x : B.
(c) Product:
   Γ ⊢ A : s1      (∀C. A →β C ⇒ Γ ⊢ C : s1) ∧ (∀∆. Γ →β ∆ ⇒ ∆ ⊢ A : s1)
   Γ, xA ⊢ B : s2  (∀D. B →β D ⇒ Γ, xA ⊢ D : s2) ∧ (∀∆′. Γ, xA →β ∆′ ⇒ ∆′ ⊢ B : s2)
   Γ ⊢ ΠxA.B : s2  (∀t. ΠxA.B →β t ⇒ Γ ⊢ t : s2) ∧ (∀∆. Γ →β ∆ ⇒ ∆ ⊢ ΠxA.B : s2)

i. Left part: We assume ΠxA.B →β t.
Since products are preserved under reduction (lemma 3.12)
we have either t = ΠxC.B where A →β C for some C or
t = ΠxA.D where B →β D for some D.
In both cases we can derive from the induction hypotheses
either Γ ⊢ C : s1 or Γ, xA ⊢ D : s2. Therefore
Γ ⊢ ΠxC.B : s2 or Γ ⊢ ΠxA.D : s2 is valid by the product
introduction rule.
ii. Right part: Assume Γ →β ∆. From the first induction
hypothesis we get ∆ ⊢ A : s1. From the second induction
hypothesis we get ∆, xA ⊢ B : s2 where we use
∆′ = ∆, xA. These facts imply ∆ ⊢ ΠxA.B : s2.
(d) Abstraction: Same reasoning as with product.
(e) Application:
   Γ ⊢ f : ΠxA.B        (∀g. f →β g ⇒ Γ ⊢ g : ΠxA.B) ∧ (∀∆. Γ →β ∆ ⇒ ∆ ⊢ f : ΠxA.B)
   Γ ⊢ a : A            (∀b. a →β b ⇒ Γ ⊢ b : A) ∧ (∀∆. Γ →β ∆ ⇒ ∆ ⊢ a : A)
   Γ ⊢ f a : B[x := a]  (∀t. f a →β t ⇒ Γ ⊢ t : B[x := a]) ∧ (∀∆. Γ →β ∆ ⇒ ∆ ⊢ f a : B[x := a])

i. Left part: Assume f a →β t. We have three cases to
consider.
A. f a →β ga where f →β g:
From the first induction hypothesis we get that g has
the same type as f and therefore Γ ⊢ ga : B[x := a] is
valid.
B. f a →β f b where a →β b:
From the second induction hypothesis we get Γ ⊢ b :
A and therefore Γ ⊢ f b : B[x := b].
Because of lemma 3.10 we have B[x := a] →β B[x := b]
and therefore B[x := a] ∼β B[x := b]. Since B[x := a]
is a valid type, the type equivalence rule lets us derive
Γ ⊢ f b : B[x := a] which is identical to the final goal.
C. (λxA.e)a →β e[x := a]:
In that case we have to prove

   Γ ⊢ λxA.e : ΠxA.B
   Γ ⊢ a : A
   Γ ⊢ e[x := a] : B[x := a]

We assume Γ ⊢ λxA.e : ΠxA.B and Γ ⊢ a : A and
prove the final goal.
Since ΠxA.B is not U the type of types lemma 5.12
states the existence of a sort s such that Γ ⊢ ΠxA.B :
s is valid and then by the generation lemma 5.5 for
products we get the existence of some sB such that
Γ, xA ⊢ B : sB is valid which implies by the substitution
lemma 5.11 Γ ⊢ B[x := a] : sB.
According to the generation lemma 5.5 for abstractions
there are some B0 and s0 with

   Γ ⊢ ΠxA.B0 : s0
   Γ, xA ⊢ e : B0
   ΠxA.B0 ∼β ΠxA.B

The second one together with Γ ⊢ a : A and the
substitution lemma 5.11 gives us Γ ⊢ e[x := a] : B0[x := a].
The third one gives by the equivalent binders theorem 4.18
B ∼β B0 which by theorem 3.16 results in
B[x := a] ∼β B0[x := a].
Finally we can convert Γ ⊢ e[x := a] : B0[x := a],
Γ ⊢ B[x := a] : sB and B[x := a] ∼β B0[x := a] via
the type equivalence rule into the final goal.
ii. Right part: Immediate consequence of the induction
hypotheses.
2. Structural rules:
(a) Weaken:
   Γ ⊢ t : T       (∀u. t →β u ⇒ Γ ⊢ u : T) ∧ (∀∆0. Γ →β ∆0 ⇒ ∆0 ⊢ t : T)
   Γ ⊢ A : s       (∀B. A →β B ⇒ Γ ⊢ B : s) ∧ (∀∆0. Γ →β ∆0 ⇒ ∆0 ⊢ A : s)
   x ∉ Γ
   Γ, xA ⊢ t : T   (∀u. t →β u ⇒ Γ, xA ⊢ u : T) ∧ (∀∆. Γ, xA →β ∆ ⇒ ∆ ⊢ t : T)

i. Left part: Assume t →β u. From the first induction
hypothesis we get Γ ⊢ u : T which derives the final goal
by applying the weakening rule.
ii. Right part: We assume Γ, xA →β ∆ and have to
distinguish two cases.
A. ∆ = (∆0, xA) ∧ Γ →β ∆0:
In that case we get ∆0 ⊢ t : T and ∆0 ⊢ A : s from
the induction hypotheses which imply the final goal
∆0, xA ⊢ t : T by application of the weakening rule.
B. ∆ = (Γ, xB) ∧ A →β B:
In that case we get Γ ⊢ B : s from the second induction
hypothesis which implies the final goal Γ, xB ⊢
t : T by application of the weakening rule.
(b) Type equivalence:
   Γ ⊢ t : T    (∀u. t →β u ⇒ Γ ⊢ u : T) ∧ (∀∆. Γ →β ∆ ⇒ ∆ ⊢ t : T)
   Γ ⊢ U : s    (. . . ∧ ∀∆. Γ →β ∆ ⇒ ∆ ⊢ U : s)
   T ∼β U
   Γ ⊢ t : U    (∀u. t →β u ⇒ Γ ⊢ u : U) ∧ (∀∆. Γ →β ∆ ⇒ ∆ ⊢ t : U)

i. Left part: Assume t →β u. We get Γ ⊢ u : T by the
first induction hypothesis. The final goal Γ ⊢ u : U is
obtained by applying the type equivalence rule.
ii. Right part: Assume Γ →β ∆. From the first induction
hypothesis we derive ∆ ⊢ t : T and from the second
induction hypothesis we derive ∆ ⊢ U : s. The final goal
∆ ⊢ t : U is obtained by applying the type equivalence
rule. ∎
Corollary 5.14. No welltyped term can reduce to U, i.e. A →β∗ U
and Γ ⊢ A : T cannot both be valid.

Proof. Assume Γ ⊢ A : T and A →β∗ U. By the subject reduction
theorem 5.13 (applied zero or more times) we get Γ ⊢ U : T which
contradicts the fact that U is not welltyped (5.7). ∎
Corollary 5.15. No welltyped term can be beta equivalent to U, i.e.
Γ ⊢ A : T and A ∼β U cannot both be valid.

Proof. Assume Γ ⊢ A : T and A ∼β U. Since A and U are beta
equivalent, by 4.16 they must have a common reduct and since U
cannot reduce to anything (it is in normal form) the common reduct
must be U.
Because of the previous theorem 5.14 A cannot reduce to U and
we get the desired contradiction. ∎
5.9 Type Uniqueness

Theorem 5.16. All types of a term are beta equivalent.

   Γ ⊢ t : T
   Γ ⊢ t : U
   T ∼β U

Proof. By induction on the structure of t using the generation
lemmata 5.5. The generation lemma for each form of t guarantees the
existence of a term V for which V ∼β T and V ∼β U is valid. By
symmetry and transitivity of beta equivalence T ∼β U is implied. ∎
Theorem 5.17. All types of beta equivalent terms are beta equivalent.

   t ∼β u
   Γ ⊢ t : T
   Γ ⊢ u : U
   T ∼β U

Proof. t ∼β u implies by the Church Rosser theorem 4.16 the existence
of a term v such that t →β∗ v and u →β∗ v are valid.
By repeated application of the subject reduction theorem we get
Γ ⊢ v : T and Γ ⊢ v : U.
This implies by the previous theorem 5.16 the validity of T ∼β U. ∎
Corollary 5.18. Two types T and U of beta equivalent terms t and
u are either both U or they are types of the same sort.

   t ∼β u
   Γ ⊢ t : T
   Γ ⊢ u : U
   T = U = U ∨ ∃s. Γ ⊢ T : s ∧ Γ ⊢ U : s

Proof. By 5.17 we get T ∼β U and by the type of types theorem 5.12
both are either U or types of some sort.
The mixed cases are not possible since a welltyped term cannot be
beta equivalent to U by 5.15.
It remains to prove that both being welltyped implies that they
are types of the same sort.
Since both T and U are beta equivalent, their sorts have to be beta
equivalent as well by 5.17. The mixed case that one sort is P and the
other is U is not possible, because P is welltyped (it has type U in
any valid context) and a welltyped term cannot be beta equivalent to
U by 5.15. Therefore the sorts are equal: both are P or both are U. ∎
5.10 Kinds
In the section 5.7 we saw that any term T in the type position of a
typing judgement Γ ⊢ t : T is either U or is a welltyped term whose
type is a sort. This justifies the statement that sorts are the types of
types.
However the calculus of constructions has dependent types i.e.
there are type valued functions F which when applied to arguments
return a type i.e. F T a might be a type where T represents a type
and a represents some other welltyped term e.g. a natural number.
In that case F has a type of the form ΠXP.ΠyA.P. We call such terms
F type functions and their types kinds. A kind is a product where
the final type is a sort. We include in our definition of kinds products
with arity zero i.e. sorts. Type functions with arity zero are types.

Definition 5.19. The set of syntactical kinds K is defined as the set
of terms generated by the grammar

K ::= s | ΠxA .K | ΠxK .K

where s ranges over sorts, K ranges over kinds and A ranges over
terms which are not kinds.
It is also possible to use the equivalent definition: The set K is an
inductive set defined by the rules
1.
   s ∈ K
2.
   A ∉ K
   K ∈ K
   ΠxA.K ∈ K
3.
   A ∈ K
   K ∈ K
   ΠxA.K ∈ K
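The syntactic recognizability of kinds can be made concrete with a
small sketch. The following Haskell fragment (our own hypothetical
names, not part of the formal development) checks membership in K;
both product rules collapse into a single clause because only the final
codomain decides membership.

    data Sort = Prop | Univ deriving (Eq, Show)

    data Term
      = Srt Sort
      | Var String
      | Pi  String Term Term   -- Pi x^A . B
      | Lam String Term Term   -- lambda x^A . e
      | App Term Term
      deriving (Eq, Show)

    -- A term is a syntactical kind iff stripping the leading products
    -- ends in a sort, mirroring K ::= s | Pi x^A.K | Pi x^K.K.
    isKind :: Term -> Bool
    isKind (Srt _)    = True
    isKind (Pi _ _ b) = isKind b
    isKind _          = False

For example isKind (Pi "X" (Srt Prop) (Srt Prop)) yields True,
matching ΠXP.P ∈ K.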
Theorem 5.20. Every term K which has type U is a syntactical kind

Γ⊢K:U
K∈K

Proof. By induction on Γ ⊢ K : U.
1. Sort:
   [] ⊢ P : U
Trivial, because P ∈ K.
2. Variable: The variable rule deriving Γ, xU ⊢ x : U is not appli-
cable, because it would require Γ ⊢ U : s which contradicts the
first generation lemma 5.5.
3. Product:
   Γ ⊢ A : s
   Γ, xA ⊢ B : U    B ∈ K
   Γ ⊢ ΠxA.B : U    ΠxA.B ∈ K
The goal in the lower right corner is a consequence of the
induction hypothesis.
4. Abstraction:
The abstraction rule deriving Γ ⊢ λxA.e : ΠxA.B is syntactically
not possible, because ΠxA.B ≠ U.
5. Application: In order to apply the rule

   Γ ⊢ f : ΠxA.B
   Γ ⊢ a : A
   Γ ⊢ f a : B[x := a]

the validity of B[x := a] = U would be necessary. This would
require either B = U or B = x ∧ a = U by lemma 3.6 and both
cases lead to a contradiction.
• B = U: The generation lemma 5.5 for a product requires
the existence of a sort s with Γ, xA ⊢ U : s which requires
by the generation lemma for sorts that U = P which is not
possible.
• B = x ∧ a = U: The generation lemma for sorts would
require in that case U = P which is not possible.
6. Weaken: Trivial, because the goal is immediately implied by the
induction hypothesis.
7. Type equivalence: Γ ⊢ K : U cannot be derived by the type
equivalence rule because the premise Γ ⊢ U : s is not satisfiable
(U is not welltyped by 5.7).
Corollary 5.21. Every type T of sort P is not a kind.

   Γ ⊢ T : P
   T ∉ K
Proof. Assume Γ ⊢ T : P and by way of contradiction T ∈ K.
This implies by theorem 5.23 (below) Γ ⊢ T : U and by
theorem 5.16 P ∼β U which is contradictory. ∎
Corollary 5.22. Every welltyped type which is not a kind has type P.

   T ∉ K
   Γ ⊢ T : s
   Γ ⊢ T : P

Proof. Assume T ∉ K and Γ ⊢ T : s.
There are only two sorts, therefore either s = P or s = U has to
be valid. The first case proves the goal. The second case implies by
theorem 5.20 T ∈ K which contradicts the assumption T ∉ K. ∎
Theorem 5.23. Every welltyped syntactical kind K has type U.

   K ∈ K
   Γ ⊢ K : T
   Γ ⊢ K : U

Proof. By induction on the structure of K.
1. Sort s: The validity of Γ ⊢ s : T implies by the generation
lemma 5.5 for sorts s = P and T ∼β U. By the type of types
lemma 5.12 we know that T is either U or it is welltyped and
by lemma 5.15 we know that no welltyped term can be beta
equivalent to U. This implies T = U and therefore the goal.
2. ΠxA.K: We assume Γ ⊢ ΠxA.K : T and have to prove
Γ ⊢ ΠxA.K : U. The generation lemma 5.5 for products guarantees
the existence of a sort s with Γ, xA ⊢ K : s. This together with
the induction hypothesis implies Γ, xA ⊢ K : U. By the
introduction rule for products we get Γ ⊢ ΠxA.K : U. ∎
Theorem 5.24. Every syntactical kind in the type position of a typing
judgement either is U or has type U.

   K ∈ K
   Γ ⊢ A : K
   K = U ∨ Γ ⊢ K : U

Proof. By 5.12 either K = U or Γ ⊢ K : s must be valid for some sort
s. In the first case the goal is trivial. In the second case the goal can
be inferred from theorem 5.23. ∎
6 Proof of Strong Normalization
In this section we prove the strong normalization of the calculus of
constructions i.e. we prove that all welltyped terms have no infinite
reduction sequence which implies that all welltyped terms can be re-
duced to a normal form. The presented proof is based on the proof
of Herman Geuvers in his paper A short and flexible proof of strong
normalization for the calculus of constructions [2].
The article of Herman Geuvers is a scientific paper which assumes
some good knowledge of type theory. Here we present the proof in a
textbook style which makes the proof more accessible to readers who
are not experts in type theory.
In the last subsection of this chapter we use the proof of strong
normalization to prove that the calculus of constructions is consistent
as a logic i.e. that it is impossible to derive a contradiction in the
calculus of constructions.
The proof of strong normalization is not trivial and spans nearly
the whole chapter. Why is the proof nontrivial?
A naive attempt to prove the strong normalization would be to
prove
Γ⊢t:T
t ∈ SN
by induction on Γ ⊢ t : T (where SN is the set of all strongly normal-
izing terms).
This attempt passes a lot of rules of the typing relation. But
it fails at the rule for application. In the rule for application we
have to prove the strong normalization of f a by using the induction
hypotheses that the function term f and the argument term a are
strongly normalizing. The impossibility of this proof step can be seen
by looking at a counterexample.
The term λxA .xx is strongly normalizing because it is in normal
form. If we apply the term to any argument we get the reduction
   (λxA.xx)a →β aa

If we apply the term to itself we get

   (λxA.xx)(λxA.xx) →β (λxA.xx)(λxA.xx)
It is obvious that no normal form can be reached because the reduction
steps can be carried out forever.
The problem with this naive approach is that the goal t ∈ SN
gives induction hypotheses which are too weak. The set of strongly
normalizing terms SN is not connected to the type T of the term t.
• Type Interpretation
It is better to find a subset of the strongly normalizing terms
[[T ]] ⊆ SN which reflects a set of strongly normalizing terms
which can represent terms of type T . We call [[T ]] an interpreta-
tion of the type T . The interpretation of a type shall be chosen
in a way such that

f ∈ [[ΠxA .B]]
a ∈ [[A]]
f a ∈ [[B[x := a]]]

is valid where ΠxA.B is the type of f and A is the type of a.
• Saturated Sets
In the proof on strong normalization we use saturated sets as
interpretations of types. Saturated sets are sets of strongly nor-
malizing terms which are closed in the sense that they contain all
base terms (strongly normalizing terms of the form xa1 a2 . . . an )
and that they contain all strongly normalizing terms which re-
duce by beta reduction of a redex in the leftmost position of the
term to a term in the set (key reductions).
These closure conditions are important in the proof. The set of
saturated set will be denoted by SAT. Note that SAT is a set of
sets of strongly normalizing terms.
• Lambda Function Space
Having a type interpretation [[A]] for the type A and a type
interpretation [[B]] it is possible to form a saturated set

   [[A]] →λ [[B]]

which is the set of lambda terms f for which, whenever f is applied
to a term a ∈ [[A]], the application f a ∈ [[B]] is valid. Note
that [[A]] →λ [[B]] is a set of strongly normalizing lambda terms
and not a relation. We call such a set of strongly normalizing
terms a lambda function space. We use sets in the form of a
lambda function space as type interpretations of types of the
form ΠxA.B.
This lambda function space solves the problem of proving f a ∈
[[B]] from the induction hypotheses f ∈ [[A]] →λ [[B]] and a ∈ [[A]].
In the previous paragraph we have ignored the detail that the
variable x might be contained in the type B within the product
ΠxA.B and that each different value of the variable x might
generate a different interpretation [[B]] of the type B. In the
detailed proof we will see that this fact is important when the
type A of the variable x is a kind. In that case we have to form
the lambda function space

   [[A]] →λ ⋂x [[B]]

where for the purposes of this overview we understand that
⋂x [[B]] is the intersection of all possible interpretations [[B]] for
all possible values of x. In the detailed proof we give a precise
definition of this lambda function space. Here we get a first hint
why the closure conditions in the definition of saturated sets are
important. They guarantee that any intersection of saturated
sets is a saturated set.
• Type Functions
It is not sufficient to have interpretations [[A]] of types A. E.g.
a term L which represents lists in the calculus of constructions is
not a type. It is a type function of type ΠXP.P. We have to find
an interpretation for L as well. But its interpretation cannot be
a saturated set. Its interpretation [[L]] has to be a mathematical
function which maps saturated sets to saturated sets.
In that case the term L T represents the type of a list of elements
of type T, where [[T]] ∈ SAT and [[L T]] ∈ SAT are valid. Therefore
the interpretation of L must be a mathematical function which
maps saturated sets to saturated sets i.e. [[L]] ∈ SAT → SAT.
Note that SAT → SAT is a set of mathematical functions.
Since type functions can be arbitrarily nested, interpretations
for type functions can be drawn e.g. from SAT for types i.e.
0-ary type functions, SAT → SAT for unary type functions,
(SAT → SAT) → SAT for unary type functions having a unary
type function as argument, SAT → SAT → SAT for binary type
functions (e.g. pairs), ... to arbitrary depth.
• Models and Context Interpretations
In the chapter Typing 5 we have shown that kinds are the types
of n-ary type functions where the corner case n = 0 is allowed.
Based on this it is possible to define a function ν which maps
any kind K to its appropriate set of models ν(K) which is the
set of possible interpretations of terms of type K.
Kinds have the advantage that they can be recognized by pure
syntactical analysis. This makes the definition of the function ν
easy.
By looking at the types in a context Γ it is decidable whether
the type A of a variable x is a kind or not. A context interpretation
ξ = [x1M1 , . . . , xnMn ] is a sequence of models for all variables
in the context whose type is a kind. We say that ξ is a valid
interpretation of a valid context Γ which we denote by ξ ⊨ Γ
if ξ associates to each variable in the context which is a type
function a model from the corresponding set ν(K) where K is
the type of the variable (which has to be a kind).
• Type Interpretation Function
Based on a valid context interpretation ξ ⊨ Γ it is possible to
define a function [[F ]]ξΓ which maps each welltyped type function
F (i.e. not only the variables which are type functions) into a
type interpretation which is in the set ν(K) where K is the type
of F .
In the definition of the type interpretation function it is crucial
that all types are interpreted by saturated sets, i.e. sets of
strongly normalizing terms.
The definition of the type interpretation function is nontrivial
because it has to be shown in each case that the return value is
in the correct set.
Furthermore it has to be proved that the type interpretation
function returns for equivalent types the same interpretation.
This is important because beta equivalent types are practically
the same types. Therefore it is not allowed that they have differ-
ent interpretations. In order to prove this fact we have to show
that substitutions which are the basis of beta reduction and beta
equality are treated in a consistent manner. This proof is
nontrivial as well.
• Term Interpretations and Context Models
In addition to type interpretations ξ we add term interpreta-
tions ρ. Term interpretations or better variable interpretations
are a mapping from variables to terms which are an element of
the type interpretation of the corresponding type i.e. strongly
normalizing terms.
The term interpretation (|t|)ρ replaces in the term t all free occur-
rences of a variable by its corresponding variable interpretation.
A context model ρξ ⊨ Γ is a variable interpretation ρ and a type
variable interpretation ξ which are consistent.
The additional complexity of term interpretations (|t|)ρ is neces-
sary to be able to enter binders like λxA.e and ΠxA.B and get
sufficiently strong induction hypotheses.
• Soundness Theorem
The whole machinery culminates in the proof of the soundness
theorem which states that for all context models with ρξ ⊨ Γ and
all welltyped term t with Γ ⊢ t : T we can prove (|t|)ρ ∈ [[T ]]ξΓ .
Since all type interpretations of types are saturated sets i.e. sets
of strongly normalizing terms and there is a canonical interpre-
tation of type variables and the term interpretation which is the
identity function is possible (base terms include variables and
are in all saturated sets) we easily conclude that all welltyped
terms are strongly normalizing.
6.1 Strong Normalization
We start the proof of strong normalization with a definition of the set
of normal forms NF and the set of strongly normalizing terms SN and
prove some properties based on these definitions.

Definition 6.1. The set NF of terms in normal form is the set of all
terms which do not reduce (i.e. which do not have any redex):

   ∀b. a →β b ⇒ ⊥
   a ∈ NF

Definition 6.2. The set SN of strongly normalizing terms is defined
inductively by the rule

   ∀b. a →β b ⇒ b ∈ SN
   a ∈ SN
In words: A term a is strongly normalizing if all its reducts b are
strongly normalizing.
We can view the strongly normalizing terms more intuitively in
the following manner.
• A term which is already in normal form i.e. it has no reducts is
strongly normalizing. In that case all its reducts (there are none)
are vacuously strongly normalizing. I.e. terms in normal form
form the 0-th generation of strongly normalizing terms.
• The 1st generation of strongly normalizing terms are terms which
reduce only to terms in normal form.
• The (n+1)th generation of strongly normalizing terms are terms
which reduce only to terms of the nth generation of strongly
normalizing terms.
• ...
This intuitive definition defines the terms which are guaranteed
to reduce in at most n steps to terms in normal form. The strongly
normalizing terms are terms which are guaranteed to reduce in at most n steps
to terms in normal form for some natural number n.
With this intuitive definition we could prove properties of strongly
normalizing terms by doing induction on the maximal number of steps
n needed to reduce to normal form.
However the formal definition is better suited for doing induction
proofs. In order to prove that some term a which is strongly normal-
izing has some property p(a) one can assume that all its reducts have
this property. I.e. we can use an induction scheme similar to rule
based induction.
" # " #
β β
∀b. a → b ∀b. a → b
b ∈ SN p(b)
a ∈ SN p(a)
In order to prove the goal p(a) in the lower right corner we can
assume all the other statements especially the induction hypothesis in
the upper right corner.
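The generation picture can be made concrete with a small sketch
(Haskell, our own hypothetical names; strong normalization is of
course undecidable in general, so the check is bounded by a maximal
reduction depth):

    data Sort = Prop | Univ deriving (Eq, Show)

    data Term
      = Srt Sort
      | Var String
      | Pi  String Term Term
      | Lam String Term Term
      | App Term Term
      deriving (Eq, Show)

    -- naive substitution, adequate only if bound variables avoid capture
    subst :: String -> Term -> Term -> Term
    subst x a t = case t of
      Var y | x == y -> a
      Pi  y tA tB    -> Pi  y (subst x a tA) (if x == y then tB else subst x a tB)
      Lam y tA e     -> Lam y (subst x a tA) (if x == y then e  else subst x a e)
      App f b        -> App (subst x a f) (subst x a b)
      _              -> t

    -- all one-step beta reducts of a term
    reducts :: Term -> [Term]
    reducts t = case t of
      App (Lam x tA e) a ->
        subst x a e
          : [App f' a | f' <- reducts (Lam x tA e)]
         ++ [App (Lam x tA e) a' | a' <- reducts a]
      App f a     -> [App f' a | f' <- reducts f] ++ [App f a' | a' <- reducts a]
      Lam x tA e  -> [Lam x tA' e | tA' <- reducts tA] ++ [Lam x tA e' | e' <- reducts e]
      Pi  x tA tB -> [Pi x tA' tB | tA' <- reducts tA] ++ [Pi x tA tB' | tB' <- reducts tB]
      _           -> []

    -- 'isSNUpTo n t': every reduction sequence from t ends within n steps,
    -- i.e. t belongs to the n-th generation described above
    isSNUpTo :: Int -> Term -> Bool
    isSNUpTo n t = case reducts t of
      [] -> True
      rs -> n > 0 && all (isSNUpTo (n - 1)) rs

For instance, with omega = Lam "x" (Srt Prop) (App (Var "x") (Var "x"))
the self-application App omega omega fails isSNUpTo n for every n,
matching the counterexample above.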
Theorem 6.3. All subterms of a strongly normalizing term are
strongly normalizing.

Proof. Sorts and variables don't have subterms. A product ΠxA.B
has the subterms A and B, an abstraction λxA.e has the subterms A
and e and an application ab has the subterms a and b. All proofs that
subterms of strongly normalizing terms are strongly normalizing follow
the same pattern. Here we prove

   ab ∈ SN
   a ∈ SN

which we transform into the equivalent statement

   t ∈ SN
   ∀a b. t = ab ⇒ a ∈ SN

and prove by induction on t ∈ SN:

   ∀u. t →β u ⇒ u ∈ SN    ∀u a1 b. t →β u ∧ u = a1b ⇒ a1 ∈ SN
   t ∈ SN                 ∀a b. t = ab ⇒ a ∈ SN

In order to prove the goal in the lower right corner we assume t = ab
and a →β a1 and prove a1 ∈ SN.
From a →β a1 we infer ab →β a1b and get the final goal from the
induction hypothesis by using u = a1b. ∎
Theorem 6.4. A redex is strongly normalizing if its reduct and all
its subexpressions are strongly normalizing.

   e ∈ SN
   e[x := a] ∈ SN
   a ∈ SN
   A ∈ SN
   (λxA.e)a ∈ SN

Proof. We do induction on e ∈ SN, a ∈ SN and A ∈ SN to get good
induction hypotheses and then prove the final goal.
• Induction on e ∈ SN: in order to prove

   ∀a A. e[x := a] ∈ SN ∧ a ∈ SN ∧ A ∈ SN ⇒ (λxA.e)a ∈ SN

we can use the induction hypothesis 1 below.
• We assume e[x := a] ∈ SN and do induction on a ∈ SN: in order
to prove

   ∀A. A ∈ SN ⇒ (λxA.e)a ∈ SN

we can use the induction hypothesis 2 below.
• Induction on A ∈ SN: in order to prove (λxA.e)a ∈ SN we can
use the induction hypothesis 3 below.
• I.e. we have to prove the goal (λxA.e)a ∈ SN under the
assumptions e ∈ SN, e[x := a] ∈ SN, a ∈ SN, A ∈ SN and the
following induction hypotheses:
1. ∀f a A. e →β f ∧ f[x := a] ∈ SN ∧ a ∈ SN ∧ A ∈ SN ⇒ (λxA.f)a ∈ SN
2. ∀b A. a →β b ∧ A ∈ SN ⇒ (λxA.e)b ∈ SN
3. ∀B. A →β B ⇒ (λxB.e)a ∈ SN
In order to prove the goal (λxA.e)a ∈ SN we have to prove

   ∀c. (λxA.e)a →β c ⇒ c ∈ SN
We assume (λxA.e)a →β c. An application can reduce by definition
of reduction only if the application is a redex (which is the case) or
if the function term reduces or if the argument reduces. Since the
function term λxA.e is an abstraction it reduces either if the type of
the argument A reduces or the body e reduces. I.e. we have 4 cases
to consider:
1. (λxA.e)a →β e[x := a]: Trivial. The goal e[x := a] ∈ SN is valid
by assumption.
2. (λxA.e)a →β (λxA.f)a, where e →β f: In that case we have to
prove the goal

   (λxA.f)a ∈ SN

by using e →β f and the above assumptions. We can use the
induction hypothesis 1 to prove the goal. The only thing missing
is a proof of f[x := a] ∈ SN. From theorem 3.10 we infer
e[x := a] →β f[x := a]. Since e[x := a] ∈ SN, all its reducts must
be strongly normalizing by definition of strong normalization.
This completes the proof for that case.
3. (λxA.e)a →β (λxB.e)a, where A →β B: In that case we have to
prove the goal

   (λxB.e)a ∈ SN

By using A →β B and the induction hypothesis 3 we prove the
goal.
4. (λxA.e)a →β (λxA.e)b, where a →β b: In that case we have to
prove the goal

   (λxA.e)b ∈ SN

By using a →β b and the induction hypothesis 2 we prove the
goal. ∎
6.2 Base Terms
Definition 6.5. The set of base terms BT is the set of all variables
applied to zero or more strongly normalizing arguments i.e. terms of
the form xa1 . . . an where ai ∈ SN for all i. We formally define the set
BT by the rules
1. x ∈ BT where x ranges over variables.
2.
   a ∈ BT
   b ∈ SN
   ab ∈ BT
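For example, for a variable x and strongly normalizing terms a and b
the terms x, xa and xab are base terms, whereas (λxA.e)a is not a
base term because it does not begin with a variable.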
Theorem 6.6. All base terms are strongly normalizing.

   a ∈ BT
   a ∈ SN

Proof. Because a base term begins with a variable followed by zero or
more strongly normalizing arguments, it has no redex whose reduction
can change the structure. I.e. only the arguments can reduce and they
are by definition strongly normalizing. ∎
6.3 Key Reduction
Definition 6.7. The key reduction a →βk b is a relation defined by
the rules
1.
   (λxA.e)a →βk e[x := a]
2.
   a →βk b
   ac →βk bc

I.e. a key reduction reduces only a leftmost redex in an application.
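For example, with r = (λxA.x)y we have rz →βk yz and rzw →βk yzw,
but zr has no key reduction because the leftmost position of the
application is the variable z and not a redex.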
Lemma 6.8. Let a →βk b and a →β a1 where a1 ≠ b. Then there
exists a term b1 with a1 →βk b1 and b →β∗ b1.

   a   →βk   b
   ↓β        ↓β∗
   a1  →βk  ∃b1      (a1 ≠ b)
Proof. By induction on a →βk b:
1. (λxA.e)a →βk e[x := a]:
Assume (λxA.e)a →β a1 and a1 ≠ e[x := a]. The goal is to find b1
such that a1 →βk b1 and e[x := a] →β∗ b1.
There are 4 cases to consider:
(a) (λxA.e)a →β e[x := a]:
This case is contradictory because e[x := a] ≠ e[x := a] is
not satisfiable.
(b) (λxA.e)a →β (λxA1.e)a where A →β A1: In that case we have
(λxA1.e)a →βk e[x := a] and we choose b1 = e[x := a].
(c) (λxA.e)a →β (λxA.f)a where e →β f:
In that case we have (λxA.f)a →βk f[x := a] and we choose
b1 = f[x := a]. This is possible because e[x := a] →β∗ f[x := a]
by theorem 3.10.
(d) (λxA.e)a →β (λxA.e)b where a →β b:
In that case we have (λxA.e)b →βk e[x := b] and we choose
b1 = e[x := b]. This is possible because e[x := a] →β∗ e[x := b]
by theorem 3.11.
2.
   a →βk b      ∀a1. a →β a1 ∧ a1 ≠ b ⇒ ∃b1. a1 →βk b1 ∧ b →β∗ b1
   ac →βk bc    ∀d. ac →β d ∧ d ≠ bc ⇒ ∃d1. d →βk d1 ∧ bc →β∗ d1

We prove the goal in the lower right corner by assuming ac →β d
and find some d1 with the required properties.
Since a →βk b we know that a is not an abstraction. Therefore we
have two cases to consider:
(a) ac →β a1c where a →β a1:
From the induction hypothesis we postulate the existence of
b1 such that a1 →βk b1 and b →β∗ b1. We choose d1 = b1c
which satisfies a1c →βk b1c and bc →β∗ b1c.
(b) ac →β ac1 where c →β c1:
We choose d1 = bc1 which satisfies ac1 →βk bc1 and
bc →β∗ bc1. ∎
Theorem 6.9. Let a →βk b where a and bc are strongly normalizing.
Then ac is strongly normalizing as well.

   a →βk b
   a, bc ∈ SN
   ac ∈ SN
Proof. Since bc ∈ SN implies c ∈ SN we can prove the equivalent
theorem

   a ∈ SN
   a →βk b
   c ∈ SN
   bc ∈ SN
   ac ∈ SN

This modified theorem has the advantage that we can do induction on
c ∈ SN in the course of the proof.
1. Induction on a ∈ SN:

   ∀a1. a →β a1 ⇒ a1 ∈ SN    ∀a1 b1 c. a →β a1 ∧ a1 →βk b1 ∧ c ∈ SN ∧ b1c ∈ SN ⇒ a1c ∈ SN
   a ∈ SN                    ∀b c. a →βk b ∧ c ∈ SN ∧ bc ∈ SN ⇒ ac ∈ SN

2. Assume a →βk b.
3. Induction on c ∈ SN:

   ∀c1. c →β c1 ⇒ c1 ∈ SN    ∀c1. c →β c1 ∧ bc1 ∈ SN ⇒ ac1 ∈ SN
   c ∈ SN                    bc ∈ SN ⇒ ac ∈ SN

We prove the goal in the lower right corner by assuming bc ∈ SN
and proving the final goal ac ∈ SN.
4. In order to prove ac ∈ SN we assume ac →β d and prove d ∈ SN
for all d.
Since a →βk b we know that a is not an abstraction. Therefore
there are only three cases to consider.
(a) ac →β bc where a →β b: In that case we have to prove the
goal

   d = bc ∈ SN

This is trivial since bc ∈ SN is an assumption.
(b) ac →β a1c where a →β a1 and a1 ≠ b: We have to prove the
goal

   d = a1c ∈ SN

Since we have a →βk b, by lemma 6.8 there exists a b1 such
that a1 →βk b1 and b →β∗ b1. Because of bc ∈ SN we have
b1c ∈ SN as well. Therefore all premises of the induction
hypothesis of step 1 are satisfied and we get a1c ∈ SN from it.
(c) ac →β ac1 where c →β c1: We have to prove the goal

   d = ac1 ∈ SN

Because of bc ∈ SN we have bc1 ∈ SN as well. Therefore
the preconditions of the induction hypothesis of step 3 are
satisfied and the goal is an immediate consequence of the
induction hypothesis. ∎
6.4 Saturated Sets
Definition 6.10. A saturated set S is a set of strongly normalizing
terms (i.e. S ⊆ SN) which is closed under the rules
1. All base terms are in a saturated set:
   b ∈ BT
   b ∈ S
2. All strongly normalizing terms which key-reduce to a term in the
saturated set are in the saturated set as well:
   a ∈ SN
   b ∈ S
   a →βk b
   a ∈ S

The set of all saturated sets is abbreviated by SAT.

Theorem 6.11. An arbitrary intersection of saturated sets is a
saturated set.

   C ⊆ SAT
   ⋂C ∈ SAT

Proof. We have to prove three things:
1. All terms in ⋂C are strongly normalizing:
Since all sets in C contain only strongly normalizing terms, the
intersection contains only strongly normalizing terms as well.
Note the corner case ⋂∅ = SN since SN is the base set.
2. ⋂C contains all base terms:
Since all sets in C contain all base terms, the intersection of all
sets in C contains all base terms as well.
3. ⋂C contains all strongly normalizing key redexes which reduce
to a term in it:
Assume that the term b is in ⋂C. Then by definition of
intersection b ∈ S for all S ∈ C. Since all S ∈ C are saturated, any
strongly normalizing term a with a →βk b is in all sets S as well.
Therefore a has to be in the intersection ⋂C. ∎
Remark:
For those interested in lattice theory: the set of all subsets
of the set of strongly normalizing terms (i.e. the powerset
of SN) is a complete lattice with intersection and union as
the meet and join operations. The subset relation induces
a partial order.
The function which maps any set of strongly normalizing
terms into a saturated set (i.e. which adds all base terms
and all strongly normalizing key redexes) is monotonic, in-
creasing and idempotent. I.e. it is a closure map. The
saturated sets are fixpoints of that function.
The fixpoints of such a map form in general a complete lat-
tice which is closed with respect to intersection and union.
6.5 Lambda Function Space
Definition 6.12. If A and B are sets of lambda terms we define the
lambda function space A →λ B as the set of lambda terms f such that
whenever a is in A then f a is in B.

   A →λ B := {f | ∀a. a ∈ A ⇒ f a ∈ B}
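As a small example: for every saturated set S and every strongly
normalizing type A the identity λxA.x is an element of S →λ S. For
any a ∈ S the application (λxA.x)a is strongly normalizing by
theorem 6.4 and key-reduces to a ∈ S, hence (λxA.x)a ∈ S by the
second closure rule of saturated sets.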

Theorem 6.13. The lambda function space between saturated sets is
a saturated set.

   A ∈ SAT
   B ∈ SAT
   A →λ B ∈ SAT
Proof. We have to prove three things:
1. All terms in A →λ B are strongly normalizing: Assume f ∈
A →λ B. Then by definition f a ∈ B ⊆ SN for all a ∈ A. Therefore
f a is strongly normalizing. Since all subterms of a strongly
normalizing term are strongly normalizing as well (6.3), f is
strongly normalizing.
2. A →λ B contains all base terms:
We have to prove two things according to the definition of base
terms:
(a) All variables are in A →λ B:
Since B is saturated, it contains all terms of the form xa
where a ∈ A, because xa is a base term. Therefore x ∈
A →λ B for all variables x.
(b) If c ∈ A →λ B where c is a base term, then cd ∈ A →λ B for
all strongly normalizing terms d:
Since B is saturated it contains all terms of the form cda
for a ∈ A ⊆ SN, because cda is a base term. Therefore by
definition of the lambda function space cd ∈ A →λ B.
3. A →λ B contains all strongly normalizing key redexes which
reduce to a term in it:
Assume d ∈ A →λ B, c →βk d and c ∈ SN. We have to prove that
c ∈ A →λ B.
By definition of key reduction we have ca →βk da for all a ∈ A and
by definition of the lambda function space da ∈ B ⊆ SN.
Since B is saturated, it contains ca provided that ca is strongly
normalizing. ca is strongly normalizing by theorem 6.9 since
ca →βk da and da ∈ SN.
Therefore by definition of the lambda function space c ∈ A →λ B. ∎
6.6 Model Set
Definition 6.14. Model Set: For K ∈ K we define the model set ν(K)
by

   ν(s)       := SAT              s is a sort
   ν(ΠxA.B)   := ν(A) → ν(B)      A, B ∈ K
   ν(ΠxA.B)   := ν(B)             A ∉ K ∧ B ∈ K

Definition 6.15. Canonical Model: For K ∈ K we define the canonical
model νc(K) by

   νc(s)      := SN               s is a sort
   νc(ΠxA.B)  := • ↦ νc(B)        A, B ∈ K
   νc(ΠxA.B)  := νc(B)            A ∉ K ∧ B ∈ K

where • ↦ v is the constant function which maps any argument to
the value v.
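For example, for the kind ΠXP.P of the list type function L from the
overview we get ν(ΠXP.P) = ν(P) → ν(P) = SAT → SAT and
νc(ΠXP.P) = • ↦ SN, while for a kind ΠxA.P whose argument type A
is not a kind the last clause yields ν(ΠxA.P) = ν(P) = SAT.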

The following property of ν is convenient:
   K ∈ K
   Γ ⊢ F : K
   Γ ⊢ F : T
   T ∈ K ∧ ν(K) = ν(T)

In order to prove the property we first prove a similar lemma for
kinds.
Lemma 6.16. The model set of a welltyped kind is the same as the
model set of any beta equivalent welltyped type.

   K ∈ K
   Γ ⊢ K : U
   Γ ⊢ T : sT
   K ∼β T
   T ∈ K ∧ ν(K) = ν(T)

Proof. By induction on the structure of K.
General observation: Since K and T are beta equivalent we get
from theorem 5.18 sT = U and by 5.20 that T is a syntactical kind.
This is valid for all induction hypotheses.
For the induction proof we distinguish three cases:
1. K = s: In that case K has to be P, otherwise it would not be
welltyped.
Using the general observation above we conclude that T is a
syntactical kind. P is the only possible syntactical kind beta
equivalent to P. Therefore T = P which implies the goal ν(T) =
ν(P) trivially.
2. K = ΠxK1.K2: Since T is a syntactical kind beta equivalent to
K it must have the form of a product (a sort is not possible).
I.e. T = ΠxA.B for some types A and B.
By theorem 4.18 we get the equivalences K1 ∼β A and K2 ∼β B.
From both induction hypotheses we conclude A, B ∈ K,
ν(K1) = ν(A) and ν(K2) = ν(B). Therefore

   ν(K) = ν(K1) → ν(K2) = ν(A) → ν(B) = ν(T).

3. K = ΠxA.K2 (where A ∉ K): By the same reasoning as above
we get T = ΠxAT.B for some types AT and B with A ∼β AT and
K2 ∼β B.
From the induction hypothesis we get ν(K2) = ν(B).
Since both A and AT are valid types and A is not a syntactical
kind, we conclude by using 5.18 that AT cannot be a syntactical
kind either.
Therefore we get

   ν(K) = ν(K2) = ν(B) = ν(T). ∎
Theorem 6.17. If a type function F is welltyped (i.e. its type is a
kind), then the model sets of all its possible types are the same.

   K ∈ K
   Γ ⊢ F : K
   Γ ⊢ F : T
   T ∈ K ∧ ν(K) = ν(T)

Proof. From 5.16 we infer T ∼β K and from 5.18 we infer that either
both are U or both are types of the same sort.
If both are U, then ν(K) = ν(T) is valid trivially.
Assume both are types of the same sort i.e. we have Γ ⊢ K : s and
Γ ⊢ T : s. Since K is a syntactical kind, s = U is valid.
Therefore the assumptions of lemma 6.16 are valid and we infer
the goal by applying the lemma. ∎
As a next step we prove the fact that the model set of a kind is
not affected by a welltyped substitution.

Theorem 6.18. A type correct variable substitution does not affect
the model set of a kind.

    K ∈ K
    Γ, x^A ⊢ K : U
    Γ ⊢ a : A
    ----------------------
    ν(K) = ν(K[x := a])
Proof. By induction on K ∈ K.

1. s ∈ K: Trivial.

2. Πy^B.K ∈ K where B ∉ K and K ∈ K:
The induction hypothesis for K is

    ∀∆′.  Γ, x^A, ∆′ ⊢ K : U
          ---------------------------
          ν(K) = ν(K[x := a])

and we have to show

    ∀∆.   Γ, x^A, ∆ ⊢ Πy^B.K : U
          -----------------------------------------
          ν(Πy^B.K) = ν((Πy^B.K)[x := a])

We assume Γ, x^A, ∆ ⊢ Πy^B.K : U and derive the goal ν(Πy^B.K) =
ν((Πy^B.K)[x := a]).
From the generation lemma 5.5 for products we obtain sB and sK
such that

    Γ, x^A, ∆ ⊢ B : sB
    Γ, x^A, ∆, y^B ⊢ K : sK
    sK ∼β U

This implies sB = P (because B ∉ K and corollary 5.22) and
sK = U.
Applying the substitution lemma 5.11 we get

    Γ, ∆[x := a] ⊢ B[x := a] : P

which by theorem 5.23 implies B[x := a] ∉ K.
We use ∆′ = ∆, y^B and derive ν(K) = ν(K[x := a]) from the
induction hypothesis.
Therefore by the equalities

    ν(Πy^B.K) = ν(K)
              = ν(K[x := a])
              = ν(Πy^{B[x := a]}.K[x := a])
              = ν((Πy^B.K)[x := a])

we derive the desired goal.
3. Πy^B.K ∈ K where B ∈ K and K ∈ K:
Now we have two induction hypotheses

    ∀∆′.  Γ, x^A, ∆′ ⊢ B : U
          ---------------------------
          ν(B) = ν(B[x := a])

    ∀∆′.  Γ, x^A, ∆′ ⊢ K : U
          ---------------------------
          ν(K) = ν(K[x := a])

and we have to show

    ∀∆.   Γ, x^A, ∆ ⊢ Πy^B.K : U
          -----------------------------------------
          ν(Πy^B.K) = ν((Πy^B.K)[x := a])

We assume Γ, x^A, ∆ ⊢ Πy^B.K : U and derive the goal ν(Πy^B.K) =
ν((Πy^B.K)[x := a]).
From the generation lemma 5.5 for products we obtain sB and sK
such that

    Γ, x^A, ∆ ⊢ B : sB
    Γ, x^A, ∆, y^B ⊢ K : sK
    sK ∼β U

This implies sB = U and sK = U because of B, K ∈ K and
theorem 5.23.
By using ∆′ = ∆ for B and ∆′ = ∆, y^B for K the preconditions of
both induction hypotheses are satisfied and we derive the facts

    ν(B) = ν(B[x := a])
    ν(K) = ν(K[x := a])

Therefore by the equalities

    ν(Πy^B.K) = ν(B) → ν(K)
              = ν(B[x := a]) → ν(K[x := a])
              = ν(Πy^{B[x := a]}.K[x := a])
              = ν((Πy^B.K)[x := a])

we derive the desired goal. ∎
6.7 Context Interpretation
Definition 6.19. Context Interpretation: We call ξ = [x1^{M1}, x2^{M2}, . . . , xn^{Mn}]
an interpretation of a context Γ if and only if it satisfies the relation
ξ ⊨ Γ defined by the rules

1. Empty context

       [] ⊨ []

2. Term variable

       ξ ⊨ Γ
       Γ ⊢ A : P
       x ∉ Γ
       --------------
       ξ ⊨ Γ, x^A

3. Type (function) variable

       ξ ⊨ Γ
       Γ ⊢ K : U
       F ∉ Γ
       M ∈ ν(K)
       -------------------
       ξ, F^M ⊨ Γ, F^K
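For example, [X^{SN}] ⊨ [X^P] holds by rules 1 and 3, since [] ⊢ P : U,
ν(P) = SAT and SN ∈ SAT. A term variable x^A with A ∉ K, in contrast,
extends the context but, by rule 2, adds no entry to the interpretation.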
Theorem 6.20. For every valid context Γ there exists a unique canon-
ical context interpretation ξ^c(Γ) with ξ^c(Γ) ⊨ Γ.

Proof. We construct the canonical context interpretation recursively.

    ξ^c([])       :=  []
    ξ^c(Γ, x^A)   :=  ξ^c(Γ)                if A ∉ K
    ξ^c(Γ, x^A)   :=  ξ^c(Γ), x^{ν^c(A)}    if A ∈ K

The side condition M ∈ ν(K) of definition 6.19 is satisfied because
ν^c(A) ∈ ν(A) holds for every welltyped kind A by an easy induction
on the structure of A. ∎
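Continuing the illustrative Haskell sketch from above (the Ctx and
Interp encodings are again our own assumptions, with the innermost
context entry first), ξ^c is a direct recursion over the context,
keeping an entry exactly for the variables whose type is a kind:

    -- A context entry x^A, carrying 'Just k' exactly when A is the kind k.
    type Ctx    = [(String, Maybe Kind)]
    type Interp = [(String, CModel)]

    -- Canonical context interpretation ξ^c (theorem 6.20).
    xiC :: Ctx -> Interp
    xiC []                   = []
    xiC ((_, Nothing) : ctx) = xiC ctx                 -- A ∉ K: skipped
    xiC ((x, Just k)  : ctx) = (x, nuc k) : xiC ctx    -- A ∈ K: x ↦ ν^c(k)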
6.8 Type Interpretation
The goal of this section is to find a function which maps any welltyped
type function F into a value [[F]]ξΓ ∈ ν(K) for every type K it can have
in a certain context Γ, and which maps U into a value [[U]]ξΓ ∈ SAT.
The function is based on a context interpretation ξ which assigns to
each variable x whose type A in the context is a kind a model which is
an element of ν(A).
Definition 6.21. The type interpretation function [[F]]ξΓ must satisfy
the following specification:

    ξ ⊨ Γ
    Γ ⊢ F : K
    K ∈ K
    -----------------
    [[F]]ξΓ ∈ ν(K)

    ξ ⊨ Γ
    ----------------
    [[U]]ξΓ ∈ SAT

I.e. it has the preconditions

• The context Γ has to be valid and ξ is a context interpretation
of it (i.e. ξ ⊨ Γ).
• F is either U or it is a welltyped type function (i.e. there exists
some kind K ∈ K such that Γ ⊢ F : K is valid).

In the case that F is a welltyped type function, the type interpretation
has to be an element of ν(K) for all possible types K of F.
If F is U then the type interpretation must be a saturated set.
Definition 6.22. The type interpretation function [[F]]ξΓ is defined
recursively on the structure of F.
Note that by theorem 6.17 we can choose any type of a welltyped
type function to prove the satisfaction of the specification.

1. Sort:

       [[s]]ξΓ := SN
This definition satisfies the specification 6.21. There are two
possible cases.
If s = P then it is welltyped, its type is U, and we have
SN ∈ SAT and SAT = ν(U).
If s = U then the specification is trivially satisfied.

2. Variable:

       [[x]]ξΓ := M    where x^M ∈ ξ

The precondition ξ ⊨ Γ is possible only if there is a type A such
that x^A ∈ Γ and Γ ⊢ A : U is valid. By the start lemma 5.3
Γ ⊢ x : A is valid and since ξ is a valid context interpretation for
the context Γ we get M ∈ ν(A).

3. Product:

       [[Πx^A.B]]ξΓ := [[A]]ξΓ →λ IB

   where

       IB := [[B]]ξ(Γ,x^A)                          if A ∉ K
       IB := ⋂_{M ∈ ν(A)} [[B]](ξ,x^M)(Γ,x^A)       if A ∈ K

Since a product is not U we have to prove the left part of the
specification

    ξ ⊨ Γ
    Γ ⊢ Πx^A.B : K
    K ∈ K
    ------------------------
    [[A]]ξΓ →λ IB ∈ ν(K)

where IB is the type interpretation of B (see above).
From the generation lemma 5.5 for products we obtain the sorts
sA and sB such that Γ ⊢ A : sA and Γ, x^A ⊢ B : sB are valid and
sB is beta equivalent to K.
Since products cannot be beta equivalent to sorts by 4.19, K must
be a sort, and the only sort beta equivalent to sB is sB itself.
Therefore we have K = sB and ν(K) = SAT.
In order to prove that [[A]]ξΓ →λ IB is a saturated set, by theorem
6.13 it is sufficient to prove that [[A]]ξΓ and IB are saturated sets.

(a) [[A]]ξΓ ∈ SAT: The preconditions ξ ⊨ Γ, Γ ⊢ A : sA and sA ∈ K
of the type interpretation function are satisfied. Therefore
we can conclude the goal.
(b) IB ∈ SAT: We have to distinguish two cases:
i. A ∉ K: In that case we have ξ ⊨ Γ, x^A, i.e. the precon-
ditions for the type interpretation function are satisfied
and we get [[B]]ξ(Γ,x^A) ∈ SAT.
ii. A ∈ K: By theorem 6.11 it is sufficient to prove
[[B]](ξ,x^M)(Γ,x^A) ∈ SAT for all M ∈ ν(A).
Assume M ∈ ν(A). Then ξ, x^M ⊨ Γ, x^A, i.e. the precon-
ditions of the type interpretation function are satisfied
and we infer the goal.
4. Abstraction:

       [[λx^A.e]]ξΓ := [[e]]ξ(Γ,x^A)                            if A ∉ K
       [[λx^A.e]]ξΓ := M ↦ [[e]](ξ,x^M)(Γ,x^A),  M ∈ ν(A)       if A ∈ K

Since an abstraction is not U we have to prove the left part of
the specification

    ξ ⊨ Γ
    Γ ⊢ λx^A.e : K
    K ∈ K
    ------------------------
    [[λx^A.e]]ξΓ ∈ ν(K)

By the generation lemma 5.5 for abstractions we obtain B and s
such that Γ ⊢ Πx^A.B : s, Γ, x^A ⊢ e : B and K ∼β Πx^A.B are
satisfied.
By lemma 6.17 we get Πx^A.B ∈ K and ν(K) = ν(Πx^A.B).
In order to prove the goal [[λx^A.e]]ξΓ ∈ ν(Πx^A.B) we distinguish
two cases:

(a) A ∉ K: In that case the goal is

        [[e]]ξ(Γ,x^A) ∈ ν(B)

    Since A is not a kind we have ξ ⊨ Γ, x^A and therefore the
    preconditions for [[e]]ξ(Γ,x^A) are satisfied and the specification
    of the type interpretation function guarantees the goal.

(b) A ∈ K: In that case the goal is

        M ↦ [[e]](ξ,x^M)(Γ,x^A) ∈ ν(A) → ν(B)

    where M ∈ ν(A). The function argument is in the correct
    domain. Because of ξ, x^M ⊨ Γ, x^A the preconditions of
    [[e]](ξ,x^M)(Γ,x^A) are satisfied and the specification of the type
    interpretation function guarantees that the function maps its
    argument to a value in the correct range.
5. Application:

       [[Fa]]ξΓ := [[F]]ξΓ              if Γ ⊢ a : A for some A ∉ K
       [[Fa]]ξΓ := [[F]]ξΓ([[a]]ξΓ)     if Γ ⊢ a : A for some A ∈ K

   where Γ ⊢ a : A for some A.

Since an application is not U we have to prove the left part of
the specification

    ξ ⊨ Γ
    Γ ⊢ Fa : K
    K ∈ K
    -----------------
    [[Fa]]ξΓ ∈ ν(K)

• By the generation lemma 5.5 for applications we obtain A and B
such that

    Γ ⊢ F : Πx^A.B
    Γ ⊢ a : A
    K ∼β B[x := a]

are valid. By lemma 6.17 we get B[x := a] ∈ K and
ν(K) = ν(B[x := a]), i.e. we have to prove the goal

    [[Fa]]ξΓ ∈ ν(B[x := a])

• Using the type of types lemma 5.12 we can derive the existence
of some sort s such that Γ ⊢ Πx^A.B : s is valid. This implies by
the generation lemma 5.5 for products the existence of the sorts
sA and sB such that Γ ⊢ A : sA, Γ, x^A ⊢ B : sB and s ∼β sB are
valid (i.e. s = sB). Furthermore by the substitution theorem 5.11
we get Γ ⊢ B[x := a] : s.
This implies that K and B[x := a] are welltyped and therefore
cannot be U. Since K is a kind, s = U must be valid. I.e. we get

    Γ ⊢ A : sA
    Γ, x^A ⊢ B : U
    Γ ⊢ Πx^A.B : U

Because of theorem 5.20 we have B ∈ K and Πx^A.B ∈ K,
i.e. F is a type function.

• Since F is a type function, the preconditions for the type
interpretation are satisfied and we get

    [[F]]ξΓ ∈ ν(Πx^A.B)

• We distinguish two cases:

(a) A ∉ K: In that case we have to prove the goal

        [[F]]ξΓ ∈ ν(B[x := a])

    We prove the goal by using the equalities

        ν(B[x := a]) = ν(B)          by 6.18
                     = ν(Πx^A.B)     by definition of ν

(b) A ∈ K: In that case the preconditions of the type interpreta-
    tion function for a are satisfied and we get

        [[a]]ξΓ ∈ ν(A)

    Furthermore we have [[F]]ξΓ ∈ ν(A) → ν(B) and therefore
    [[F]]ξΓ([[a]]ξΓ) is a valid function application with

        [[F]]ξΓ([[a]]ξΓ) ∈ ν(B)

    Since typesafe substitution does not change the model set,
    we get by using 6.18 ν(B) = ν(B[x := a]), which proves the
    goal

        [[F]]ξΓ([[a]]ξΓ) ∈ ν(B[x := a])
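As a small worked example (using only the defining equations above),
consider the identity type function λX^P.X with ξ = [] and Γ = [].
Since P ∈ K, the abstraction case yields

    [[λX^P.X]]ξΓ = M ↦ [[X]](ξ,X^M)(Γ,X^P) = M ↦ M    for M ∈ ν(P) = SAT

i.e. the identity function on SAT, which is indeed an element of
ν(ΠX^P.P) = SAT → SAT, as the specification demands for the type
ΠX^P.P of λX^P.X.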
Theorem 6.23. Type interpretation treats substitution consistently:

    ξ ⊨ Γ
    Γ ⊢ a : A
    Γ, x^A ⊢ F : K
    K ∈ K
    --------------------------------------------------
    [[F[x := a]]]ξΓ = [[F]]ξ(Γ,x^A)                    if A ∉ K
    [[F[x := a]]]ξΓ = [[F]](ξ,x^{[[a]]ξΓ})(Γ,x^A)      if A ∈ K

Note: Due to the substitution lemma 5.11 we get Γ ⊢ F[x := a] :
K[x := a]. A substituted kind remains a kind (easy induction on
the structure of the kind). Therefore the preconditions for the type
interpretation function are satisfied for the substituted term as well.
Proof. We distinguish two cases:

• A ∈ K: Assume ξ ⊨ Γ and Γ ⊢ a : A and prove the more general
lemma

    ∀K∆η.  Γ, x^A, ∆ ⊢ F : K
           ξ, x^{[[a]]ξΓ}, η ⊨ Γ, x^A, ∆
           K ∈ K
           --------------------------------------------------
           [[F′]](ξ,η)(Γ,∆′) = [[F]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

where Y′ is an abbreviation for Y[x := a].

Note: Since the model set function ν respects substitution (lemma
6.18), the fact ξ, x^{[[a]]ξΓ}, η ⊨ Γ, x^A, ∆ implies the fact
ξ, η ⊨ Γ, ∆′.

We prove this lemma by induction on the structure of F.

1. Sort: Trivial, because a substituted sort remains the same sort
and the type interpretation of a sort is always SN.

2. Variable y: We distinguish two cases:

(a) y = x: In that case we prove the goal by the equalities

        [[x′]](ξ,η)(Γ,∆′) = [[a]](ξ,η)(Γ,∆′)
                          = [[a]]ξΓ
                          = [[x]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

(b) y ≠ x: In that case we prove the goal by the equalities

        [[y′]](ξ,η)(Γ,∆′) = [[y]](ξ,η)(Γ,∆′)
                          = [[y]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

    Since y ≠ x and its type is a kind, the type interpretation
    by definition depends only on ξ and η.

3. Product Πy^C.D: We have to prove the goal

    ∀K∆η.  Γ, x^A, ∆ ⊢ Πy^C.D : K
           ξ, x^{[[a]]ξΓ}, η ⊨ Γ, x^A, ∆
           K ∈ K
           ------------------------------------------------------------
           [[(Πy^C.D)′]](ξ,η)(Γ,∆′) = [[Πy^C.D]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

We can use the two induction hypotheses:

(a) ∀sC∆η.  Γ, x^A, ∆ ⊢ C : sC
            ξ, x^{[[a]]ξΓ}, η ⊨ Γ, x^A, ∆
            sC ∈ K
            --------------------------------------------------
            [[C′]](ξ,η)(Γ,∆′) = [[C]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

(b) ∀sD∆DηD.  Γ, x^A, ∆D ⊢ D : sD
              ξ, x^{[[a]]ξΓ}, ηD ⊨ Γ, x^A, ∆D
              sD ∈ K
              -----------------------------------------------------
              [[D′]](ξ,ηD)(Γ,∆D′) = [[D]](ξ,x^{[[a]]ξΓ},ηD)(Γ,x^A,∆D)

We prove the goal below the line by assuming all statements above
the line and use the equalities

    [[(Πy^C.D)′]](ξ,η)(Γ,∆′) = IC′ →λ ID′
                             = IC →λ ID        (see below)
                             = [[Πy^C.D]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

where

    IC′ = [[C′]](ξ,η)(Γ,∆′)
    IC  = [[C]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)
    ID′ = ⋂_{N ∈ ν(C′)} [[D′]](ξ,η,y^N)(Γ,∆′,y^{C′})                if C ∈ K
    ID  = ⋂_{N ∈ ν(C)} [[D]](ξ,x^{[[a]]ξΓ},η,y^N)(Γ,x^A,∆,y^C)      if C ∈ K
    ID′ = [[D′]](ξ,η)(Γ,∆′,y^{C′})                                  if C ∉ K
    ID  = [[D]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆,y^C)                       if C ∉ K

We still have to prove IC′ = IC and ID′ = ID.
IC′ = IC follows immediately from the first induction hypothesis.
In order to prove ID′ = ID we have to distinguish two cases:

(a) C ∉ K: Consequence of the second induction hypothesis by
using ηD = η and ∆D = ∆, y^C.
(b) C ∈ K: Consequence of the second induction hypothesis by
using ηD = η, y^N and ∆D = ∆, y^C. Note that both intersections
range over the same sets because ν(C′) = ν(C) by 6.18.

4. Abstraction λy^C.e: We have to prove the goal

    ∀K∆η.  Γ, x^A, ∆ ⊢ λy^C.e : K
           ξ, x^{[[a]]ξΓ}, η ⊨ Γ, x^A, ∆
           K ∈ K
           ------------------------------------------------------------
           [[(λy^C.e)′]](ξ,η)(Γ,∆′) = [[λy^C.e]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

We can use the two induction hypotheses:

(a) ∀sC∆η.  Γ, x^A, ∆ ⊢ C : sC
            ξ, x^{[[a]]ξΓ}, η ⊨ Γ, x^A, ∆
            sC ∈ K
            --------------------------------------------------
            [[C′]](ξ,η)(Γ,∆′) = [[C]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

(b) ∀Ke∆eηe.  Γ, x^A, ∆e ⊢ e : Ke
              ξ, x^{[[a]]ξΓ}, ηe ⊨ Γ, x^A, ∆e
              Ke ∈ K
              -----------------------------------------------------
              [[e′]](ξ,ηe)(Γ,∆e′) = [[e]](ξ,x^{[[a]]ξΓ},ηe)(Γ,x^A,∆e)

We prove the goal below the line by assuming all statements above
the line and distinguish two cases:

(a) C ∉ K:

        [[(λy^C.e)′]](ξ,η)(Γ,∆′) = [[e′]](ξ,η)(Γ,∆′,y^{C′})
                                 = [[e]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆,y^C)
                                 = [[λy^C.e]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

    In these equalities we used the definition of the type
    interpretation for abstractions and the second induction
    hypothesis with ηe = η and ∆e = ∆, y^C.

(b) C ∈ K:

        [[(λy^C.e)′]](ξ,η)(Γ,∆′) = N ↦ [[e′]](ξ,η,y^N)(Γ,∆′,y^{C′}),  N ∈ ν(C′)
                                 = N ↦ [[e]](ξ,x^{[[a]]ξΓ},η,y^N)(Γ,x^A,∆,y^C),  N ∈ ν(C)
                                 = [[λy^C.e]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

    In these equalities we used the definition of the type
    interpretation for abstractions, ν(C′) = ν(C) by 6.18, and the
    second induction hypothesis with ηe = η, y^N and ∆e = ∆, y^C.

5. Application Gb: We have to prove the goal

    ∀K∆η.  Γ, x^A, ∆ ⊢ Gb : K
           ξ, x^{[[a]]ξΓ}, η ⊨ Γ, x^A, ∆
           K ∈ K
           ----------------------------------------------------
           [[(Gb)′]](ξ,η)(Γ,∆′) = [[Gb]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

We assume all statements above the line and prove the final goal
below the line.
From the generation lemma 5.5 for applications we obtain B and C
such that

    Γ, x^A, ∆ ⊢ G : Πy^B.C
    Γ, x^A, ∆ ⊢ b : B
    C[y := b] ∼β K

are valid. Since C[y := b] is a kind, C and Πy^B.C are kinds as
well. Therefore we get the induction hypothesis for G

    ∀KG∆η.  Γ, x^A, ∆ ⊢ G : KG
            ξ, x^{[[a]]ξΓ}, η ⊨ Γ, x^A, ∆
            KG ∈ K
            --------------------------------------------------
            [[G′]](ξ,η)(Γ,∆′) = [[G]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

We distinguish two cases:

(a) B ∉ K:

        [[(Gb)′]](ξ,η)(Γ,∆′) = [[G′]](ξ,η)(Γ,∆′)
                             = [[G]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)
                             = [[Gb]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

    where we used the definition of the type interpretation for
    applications and the induction hypothesis for G.

(b) B ∈ K: In that case we get an additional induction hypothesis
    for b:

        ∀B∆η.  Γ, x^A, ∆ ⊢ b : B
               ξ, x^{[[a]]ξΓ}, η ⊨ Γ, x^A, ∆
               B ∈ K
               --------------------------------------------------
               [[b′]](ξ,η)(Γ,∆′) = [[b]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

    and prove the goal by the equalities

        [[(Gb)′]](ξ,η)(Γ,∆′) = [[G′]](ξ,η)(Γ,∆′)([[b′]](ξ,η)(Γ,∆′))
                             = [[G]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)([[b]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆))
                             = [[Gb]](ξ,x^{[[a]]ξΓ},η)(Γ,x^A,∆)

    where we used the definition of the type interpretation for
    applications and both induction hypotheses.

• A ∉ K: Assume ξ ⊨ Γ and Γ ⊢ a : A and prove the more general
lemma

    ∀K∆η.  Γ, x^A, ∆ ⊢ F : K
           ξ, η ⊨ Γ, x^A, ∆
           K ∈ K
           ------------------------------------------
           [[F′]](ξ,η)(Γ,∆′) = [[F]](ξ,η)(Γ,x^A,∆)

The proof is practically the same as the proof for the case A ∈ K
except that x^{[[a]]ξΓ} is never needed, because A is not a kind and
therefore a is not a type function. The variable case is even simpler,
because the variable cannot be x (the type of x is not a kind). ∎
Theorem 6.24. Type interpretation respects reduction:

    F →β G
    ξ ⊨ Γ
    Γ ⊢ F : K
    K ∈ K
    -------------------
    [[F]]ξΓ = [[G]]ξΓ

Proof. By induction on F →β G. The only interesting case is the case
of a redex. In all the other cases the reduction does not change the
toplevel structure of the term and the goal can be derived from the
induction hypotheses and the definition of the type interpretation.
Therefore here we prove only the redex case, i.e. we prove

    (λx^A.e)a →β e[x := a]
    ξ ⊨ Γ
    Γ ⊢ (λx^A.e)a : K
    K ∈ K
    ------------------------------------
    [[(λx^A.e)a]]ξΓ = [[e[x := a]]]ξΓ

We distinguish two cases:

1. A ∉ K: We prove the goal by the equalities

       [[(λx^A.e)a]]ξΓ = [[e]]ξ(Γ,x^A)
                       = [[e[x := a]]]ξΓ

   We have used the definition of the type interpretation for
   applications and abstractions and theorem 6.23.

2. A ∈ K: We prove the goal by the equalities

       [[(λx^A.e)a]]ξΓ = (M ↦ [[e]](ξ,x^M)(Γ,x^A))([[a]]ξΓ)    where M ranges over ν(A)
                       = [[e]](ξ,x^{[[a]]ξΓ})(Γ,x^A)
                       = [[e[x := a]]]ξΓ

   We have used the definition of the type interpretation for
   applications and abstractions and theorem 6.23. ∎
Theorem 6.25. Equivalent type functions have the same type inter-
pretation:

    F ∼β G
    ξ ⊨ Γ
    Γ ⊢ F : K
    Γ ⊢ G : s
    K ∈ K
    -------------------
    [[F]]ξΓ = [[G]]ξΓ

Proof. By induction on F ∼β G. The reflexive case is trivial. The
forward and the backward cases can be proved by the corresponding
induction hypothesis and theorem 6.24. ∎

Theorem 6.26. The type interpretation of a type is a saturated set.

    ξ ⊨ Γ
    Γ ⊢ T : s
    -----------------
    [[T]]ξΓ ∈ SAT

Proof. The type interpretation function satisfies the specification:

    [[T]]ξΓ ∈ ν(s) = SAT  ∎
6.9 Term Interpretation
Definition 6.27. A variable interpretation ρ is a list of variables
which associates to each variable a term. No duplicate variables are
allowed.

    ρ = [x1^{t1}, x2^{t2}, . . . , xn^{tn}]

Definition 6.28. A term interpretation (|u|)ρ replaces each free vari-
able x in the term u with the term t when x^t ∈ ρ, and otherwise
leaves the variable unchanged.

    (|s|)ρ        :=  s
    (|x|)ρ        :=  t                          if x^t ∈ ρ
    (|x|)ρ        :=  x                          otherwise
    (|Πx^A.B|)ρ   :=  Πx^{(|A|)ρ}.(|B|)ρ,x^x
    (|λx^A.e|)ρ   :=  λx^{(|A|)ρ}.(|e|)ρ,x^x
    (|ab|)ρ       :=  (|a|)ρ (|b|)ρ

Here the extension ρ, x^x maps the bound variable to itself and
shadows any outer entry for x.

A term interpretation is just a parallel substitution of the free
variables in a term. In the following we use only variable
interpretations which contain all variables of a context and apply
them only to terms which are welltyped in that context.
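As a further illustrative sketch (again with our own hypothetical Term
encoding, where binders carry the variable name, its type annotation
and the body), the term interpretation is an ordinary recursive
traversal; extending ρ with (x, Var x) at a binder makes the inner x
shadow any outer entry:

    data Term
      = Srt String                -- a sort
      | Var String                -- a variable
      | Prod String Term Term     -- Πx^A.B
      | Lam  String Term Term     -- λx^A.e
      | App  Term Term            -- a b
      deriving Show

    type Rho = [(String, Term)]   -- earlier entries shadow later ones

    interp :: Rho -> Term -> Term
    interp _   t@(Srt _)    = t
    interp rho t@(Var x)    = maybe t id (lookup x rho)
    interp rho (Prod x a b) = Prod x (interp rho a) (interp ((x, Var x) : rho) b)
    interp rho (Lam x a e)  = Lam  x (interp rho a) (interp ((x, Var x) : rho) e)
    interp rho (App a b)    = App (interp rho a) (interp rho b)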
6.10 Context Model
Definition 6.29. Context Model: We call a variable interpretation ρ
together with a context interpretation ξ (i.e. ξ ⊨ Γ) a model of the
context Γ, written

    ρξ ⊨ Γ

when ρ and Γ have the form

    Γ = [x1^{A1}, . . . , xn^{An}]
    ρ = [x1^{t1}, . . . , xn^{tn}]

where ti ∈ [[Ai]]ξΓ for all i ∈ {1, . . . , n}.
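For instance, for Γ = [X^P, x^X] we can take ξ = [X^{SN}] together
with ρ = [X^X, x^x]: the variable X is an element of [[P]]ξΓ = SN, and
x is an element of [[X]]ξΓ = SN, because both interpretations are
saturated sets and saturated sets contain all variables (as base
terms). This identity-like model is exactly the one used in the strong
normalization proof below.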
6.11 Soundness Theorem
Theorem 6.30. Let ρξ be a model of the context Γ and Γ ⊢ t : T
a valid typing judgement. Then the term interpretation (|t|)ρ is an
element of the type interpretation [[T]]ξΓ.

    Γ ⊢ t : T
    ρξ ⊨ Γ
    ------------------
    (|t|)ρ ∈ [[T]]ξΓ
Proof. By induction on the structure of t.

1. Sort: We have to prove the goal

       Γ ⊢ s : T
       ρξ ⊨ Γ
       ------------------
       (|s|)ρ ∈ [[T]]ξΓ

   From the generation lemma 5.5 for sorts we get

       s = P
       T ∼β U

   and prove the goal by

       (|P|)ρ = P ∈ SN = [[U]]ξΓ = [[T]]ξΓ

   using the fact that the type interpretation respects beta
   equivalence (6.25).
2. Variable: We have to prove the goal

       Γ ⊢ x : T
       ρξ ⊨ Γ
       ------------------
       (|x|)ρ ∈ [[T]]ξΓ

   From the generation lemma 5.5 for variables we obtain A and s
   with

       Γ ⊢ A : s
       x^A ∈ Γ
       T ∼β A

   and prove the goal by

       (|x|)ρ = t ∈ [[A]]ξΓ = [[T]]ξΓ

   where x^t ∈ ρ and we used the definition of ρξ ⊨ Γ and the fact
   that the type interpretation respects beta equivalence (6.25).
3. Product: We have to prove the goal

       Γ ⊢ Πx^A.B : T
       ρξ ⊨ Γ
       ------------------------
       (|Πx^A.B|)ρ ∈ [[T]]ξΓ

   From the generation lemma 5.5 for products we obtain the sorts
   sa and sb with

       Γ ⊢ A : sa
       Γ, x^A ⊢ B : sb
       T ∼β sb

   We can use the induction hypotheses

   (a) ∀ΓTaρξ.  Γ ⊢ A : Ta
                ρξ ⊨ Γ
                --------------------
                (|A|)ρ ∈ [[Ta]]ξΓ

   (b) ∀ΓbTbρbξb.  Γb ⊢ B : Tb
                   ρbξb ⊨ Γb
                   ------------------------
                   (|B|)ρb ∈ [[Tb]]ξbΓb

   The final goal is (|Πx^A.B|)ρ ∈ [[T]]ξΓ = [[sb]]ξΓ = SN, which
   requires the subgoals

       (|A|)ρ ∈ SN
       (|B|)ρ,x^x ∈ SN

   The first subgoal is proved by the first induction hypothesis by
   using Ta = sa.
   The second subgoal is proved by the second induction hypothesis by
   using

       Γb = Γ, x^A
       Tb = sb
       ρb = ρ, x^x
       ξb = ξ                  if A ∉ K
       ξb = ξ, x^{ν^c(A)}      if A ∈ K
4. Abstraction: We have to prove the goal

       Γ ⊢ λx^A.e : T
       ρξ ⊨ Γ
       ------------------------
       (|λx^A.e|)ρ ∈ [[T]]ξΓ

   From the generation lemma 5.5 for abstractions we obtain the
   type B and the sort s with

       Γ ⊢ Πx^A.B : s
       Γ, x^A ⊢ e : B
       T ∼β Πx^A.B

   We can use the induction hypotheses

   (a) ∀ΓTaρξ.  Γ ⊢ A : Ta
                ρξ ⊨ Γ
                --------------------
                (|A|)ρ ∈ [[Ta]]ξΓ

   (b) ∀ΓeTeρeξe.  Γe ⊢ e : Te
                   ρeξe ⊨ Γe
                   ------------------------
                   (|e|)ρe ∈ [[Te]]ξeΓe

   We have to prove the final goal

       (|λx^A.e|)ρ ∈ [[T]]ξΓ = [[Πx^A.B]]ξΓ = [[A]]ξΓ →λ IB

   where

       IB = [[B]]ξ(Γ,x^A)                          if A ∉ K
       IB = ⋂_{M ∈ ν(A)} [[B]](ξ,x^M)(Γ,x^A)       if A ∈ K

   By definition of →λ we have to prove

       (|λx^A.e|)ρ a ∈ IB

   for all a ∈ [[A]]ξΓ.
   From the second induction hypothesis we infer

       (|e|)ρ,x^a ∈ [[B]]ξe(Γ,x^A)

   with

       ξe = ξ          if A ∉ K
       ξe = ξ, x^M     if A ∈ K, M ∈ ν(A)

   because (ρ, x^a)ξe ⊨ Γ, x^A, since a ∈ [[A]]ξΓ.
   This is true for all M ∈ ν(A), if A ∈ K. Therefore we infer from
   the second induction hypothesis

       (|e|)ρ,x^a ∈ IB

   for all a ∈ [[A]]ξΓ.
   Since IB is a saturated set it includes all strongly normalizing
   terms t with t →k (|e|)ρ,x^a. By definition of key reduction and of
   the term interpretation we have

       (|λx^A.e|)ρ a = (λx^{(|A|)ρ}.(|e|)ρ,x^x) a
                    →k (|e|)ρ,x^x [x := a]
                     = (|e|)ρ,x^a

   Therefore it remains to prove that (λx^{(|A|)ρ}.(|e|)ρ,x^x) a is
   strongly normalizing. This can be proved by theorem 6.4 provided
   that

   (a) (|A|)ρ ∈ SN: This can be inferred from the first induction
       hypothesis, because (|A|)ρ ∈ [[Ta]]ξΓ and the type of A must
       be a sort, whose type interpretation is the set of strongly
       normalizing terms.
   (b) a ∈ SN: Since a ∈ [[A]]ξΓ and A is a type, we infer from
       theorem 6.26 that a is in a saturated set, which by definition
       contains only strongly normalizing terms.
   (c) (|e|)ρ,x^x [x := a] = (|e|)ρ,x^a ∈ SN: We have already inferred
       from the second induction hypothesis (|e|)ρ,x^a ∈ IB. Since IB
       is either a type interpretation of a type or an intersection of
       type interpretations of a type, and saturated sets are closed
       with respect to intersection, IB is a saturated set which by
       definition contains only strongly normalizing terms.
   (d) (|e|)ρ,x^x ∈ SN: Because [[A]]ξΓ and type interpretations of
       types are saturated and saturated sets contain all base terms,
       we have x ∈ [[A]]ξΓ. Therefore (ρ, x^x)ξe ⊨ Γ, x^A, which
       implies the goal.
5. Application: We have to prove the goal

       (|fa|)ρ ∈ [[T]]ξΓ

   under the assumptions

       Γ ⊢ fa : T
       ρξ ⊨ Γ

   From the generation lemma 5.5 for applications we obtain the
   types A and B such that

       Γ ⊢ f : Πx^A.B
       Γ ⊢ a : A
       T ∼β B[x := a]

   Furthermore we have the following induction hypotheses available:

       (|f|)ρ ∈ [[A]]ξΓ →λ IB
       (|a|)ρ ∈ [[A]]ξΓ

   where

       IB = [[B]]ξ(Γ,x^A)                          if A ∉ K
       IB = ⋂_{M ∈ ν(A)} [[B]](ξ,x^M)(Γ,x^A)       if A ∈ K

   By using theorem 6.23, theorem 6.25 and T ∼β B[x := a] we have
   to prove the goal

       (|f|)ρ (|a|)ρ ∈ [[T]]ξΓ = [[B[x := a]]]ξΓ = [[B]]ξ(Γ,x^A)                  if A ∉ K
       (|f|)ρ (|a|)ρ ∈ [[T]]ξΓ = [[B[x := a]]]ξΓ = [[B]](ξ,x^{[[a]]ξΓ})(Γ,x^A)    if A ∈ K

   We distinguish two cases:

   (a) A ∉ K: The goal follows immediately from the induction
       hypotheses and the definition of →λ.
   (b) A ∈ K: From the induction hypotheses and the definition of →λ
       we get

           ∀M.  M ∈ ν(A)
                ------------------------------------------
                (|f|)ρ (|a|)ρ ∈ [[B]](ξ,x^M)(Γ,x^A)

       and from the specification of the type interpretation we get

           [[a]]ξΓ ∈ ν(A)

       which finally proves the goal. ∎
6.12 Strong Normalization Proof
Theorem 6.31. All welltyped terms are strongly normalizing.

    Γ ⊢ t : T
    ------------
    t ∈ SN

Proof. Assume Γ ⊢ t : T, i.e. Γ = [x1^{A1}, . . . , xn^{An}] is a valid
context.
By theorem 6.20, ξ := ξ^c(Γ) is a valid interpretation for the context
Γ with ξ ⊨ Γ.
We can form a context model

    ρξ ⊨ Γ

with the identity-like variable interpretation

    ρ = [x1^{x1}, . . . , xn^{xn}]

because each [[Ai]]ξΓ is a saturated set (6.26) and saturated sets
contain all variables.
Then we have by the soundness theorem 6.30

    t = (|t|)ρ ∈ [[T]]ξΓ

which proves the goal because type interpretations of types are
saturated sets (6.26) and saturated sets contain only strongly
normalizing terms by definition. ∎
6.13 Logical Consistency
Logical consistency of the calculus of constructions means that it is
not possible to prove contradictions in the calculus. Since a
contradiction implies everything, we can state logical consistency as:
it is not possible to prove every proposition.
Via the Curry-Howard correspondence we interpret types as propositions
and terms of a type as proofs of the proposition which corresponds to
that type. In the calculus of constructions terms of the type ΠX^P.X
are functions which map every type X into a proof of that type.
Therefore in the calculus of constructions logical consistency means
that there does not exist a term t of type ΠX^P.X in the empty
context.
It is important to use the empty context here because a context
Γ = [x1^{A1}, . . . , xn^{An}] is a sequence of assumptions and it is
perfectly possible to make contradictory assumptions, e.g. having
f^{ΠX^P.X} as one assumption. In such a context f is a term of type
ΠX^P.X.
In the previous section we have proved that all welltyped terms in the
calculus of constructions are strongly normalizing, i.e. all welltyped
terms can be reduced to their normal form. Therefore it is sufficient
to prove that in the empty context there exists no term in normal form
which has the type ΠX^P.X.
Lemma 6.32. All welltyped terms in normal form are either sorts,
products, abstractions or base terms.

Proof. We prove the goal by induction on the structure of t.

1. For sorts, variables, products and abstractions there is nothing
to do, because they fall trivially into one of the alternatives.

2. It remains to prove that a welltyped application in normal form
is a base term. Assume Γ ⊢ fa : T and fa ∈ NF.
By the generation lemma 5.5 for applications we obtain the types A
and B such that Γ ⊢ f : Πx^A.B.
The induction hypothesis states that f falls into one of the four
categories.
However, f can be neither a sort nor a product, because neither of
them can have a type equivalent to Πx^A.B. f cannot be an
abstraction either, otherwise fa would not be in normal form.
Therefore f can only be a base term.
If f is a base term, then by definition of base terms fa is a base
term as well, because a ∈ NF ⊆ SN. ∎
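Purely as an illustration of this classification, a classifier over
the hypothetical Term encoding from section 6.9 could look as follows;
it assumes its input is welltyped and in normal form:

    -- Categories of lemma 6.32.
    data Category = CSort | CProd | CAbs | CBase
      deriving Show

    -- Classify a welltyped term in normal form (Nothing would indicate
    -- a key redex, which cannot occur in a normal form).
    classify :: Term -> Maybe Category
    classify (Srt _)      = Just CSort
    classify (Prod _ _ _) = Just CProd
    classify (Lam _ _ _)  = Just CAbs
    classify (Var _)      = Just CBase
    classify (App f _)    = case classify f of
      Just CBase -> Just CBase   -- head spine ends in a variable: base term
      _          -> Nothing      -- head is an abstraction: a redex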
Theorem 6.33. There is no welltyped term in normal form which has
the type ΠX^P.X in the empty context, i.e. there is no term t with

    [] ⊢ t : ΠX^P.X
    t ∈ NF

Proof. We assume t ∈ NF and [] ⊢ t : ΠX^P.X and derive a
contradiction.

1. By the previous lemma 6.32, t has to be either a sort, a product,
an abstraction or a base term. t can be neither a sort nor a
product, because they cannot have a type beta equivalent to a
product (the only valid types of both are sorts). It cannot be a
base term, because a base term has a free variable in the head
position and we are in the empty context (only closed terms).
Therefore t must be an abstraction, say λX^B.e (we are free to name
the bound variable X).

2. By the generation lemma 5.5 for abstractions we obtain a type C
such that [X^B] ⊢ e : C and ΠX^P.X ∼β ΠX^B.C, which implies
B ∼β P and X ∼β C. Therefore we have

    [X^P] ⊢ e : X

and e must be in normal form.

3. e can be neither a sort nor a product nor an abstraction, because
the type of the first two must be a sort and the type of the last
must be a product, and neither a sort nor a product can be beta
equivalent to the variable X.
Therefore e must be a base term. There are two possibilities:

4. e is a variable, say y:
In that case we have y = X, because X is the only free variable
available. By the generation lemma 5.5 for variables we get
P ∼β X, which is not possible.

5. e has the form y a1 . . . an with n > 0: We get y = X for the
same reason. Since y is in a function position its type must be
beta equivalent to some product Πz^B.C. However its type is P,
which cannot be beta equivalent to a product. ∎