Theory of Computation
Theory of Computation
The Calculus
of Computation
Decision Procedures
with Applications to Verification
With 60 Figures
123
Authors
Aaron R. Bradley
Zohar Manna
Gates Building, Room 481
Stanford University
Stanford, CA 94305
USA
[email protected]
[email protected]
This work is subject to copyright. All rights are reserved, whether the whole or part of the material
is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broad-
casting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of
this publication or parts thereof is permitted only under the provisions of the German Copyright Law
of September 9, 1965, in its current version, and permission for use must always be obtained from
Springer. Violations are liable for prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2007
The use of general descriptive names, registered names, trademarks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant pro-
tective laws and regulations and therefore free for general use.
Typesetting by the authors
Production: LE-TEX Jelonek, Schmidt & Vöckler GbR, Leipzig
Cover design: KünkelLopka Werbeagentur, Heidelberg
Printed on acid-free paper 45/3180/YL - 5 4 3 2 1 0
To my wife,
Sarah
A.R.B.
To my grandchildren,
Itai
Maya
Ori
Z.M.
Preface
Content
The book has two parts. Part I, Foundations, presents first-order logic, induc-
tion, and program verification. The methods are general. For example, Chap-
ter 2 presents a complete proof system for first-order logic, while Chapter 5
describes a relatively complete verification methodology. Part II, Algorithmic
Reasoning, focuses on specialized algorithms for reasoning about fragments of
first-order logic and for deducing facts about programs. Part II trades gener-
ality for decidability and efficiency.
VIII Preface
The first three chapters of Part I introduce first-order logic. Chapters 1 and
2 begin our presentation with a review of propositional and predicate logic.
Much of the material will be familiar to the reader who previously studied
logic. However, Chapter 3 on first-order theories will be new to many readers.
It axiomatically defines the various first-order theories and fragments that we
study and apply throughout the rest of the book. Chapter 4 reviews induction,
introducing some forms of induction that may be new to the reader. Induction
provides the mathematical basis for analyzing program correctness.
Chapter 5 turns to the primary motivating application of computational
logic in this book, the task of verifying programs. It discusses specification, in
which the programmer formalizes in logic the (sometimes surprisingly vague)
understanding that he has about what functions should do; partial correctness,
which requires proving that a program or function meets a given specification
if it halts; and total correctness, which requires proving additionally that a pro-
gram or function always halts. The presentation uses the simple programming
language pi and is supported by the verifying compiler πVC (see The πVC
System, below, for more information on πVC). Chapter 6 suggests strategies
for applying the verification methodology.
Part II on Algorithmic Reasoning begins in Chapter 7 with quantifier-
elimination methods for limited integer and rational arithmetic. It describes
an algorithm for reducing a quantified formula in integer or rational arithmetic
to an equivalent formula without quantifiers.
Chapter 8 begins a sequence of chapters on decision procedures for
quantifier-free and other fragments of theories. These fragments of first-order
theories are interesting for three reasons. First, they are sometimes decidable
when the full theory is not (see Chapters 9, 10, and 11). Second, they are
sometimes efficiently decidable when the full theory is not (compare Chapters
7 and 8). Finally, they are often useful; for example, proving the verification
conditions that arise in the examples of Chapters 5 and 6 requires just the
fragments of theories studied in Chapters 8–11. The simplex method for linear
programming is presented in Chapter 8 as a decision procedure for deciding
satisfiability in rational and real arithmetic without multiplication.
Chapters 9 and 11 turn to decision procedures for non-arithmetical theo-
ries. Chapter 9 discusses the classic congruence closure algorithm for equality
with uninterpreted functions and extends it to reason about data structures
like lists, trees, and arrays. These decision procedures are for quantifier-free
fragments only. Chapter 11 presents decision procedures for larger fragments
of theories that formalize array-like data structures.
Decision procedures are most useful when they are combined. For example,
in program verification one must reason about arithmetic and data structures
simultaneously. Chapter 10 presents the Nelson-Oppen method for combining
decision procedures for quantifier-free fragments. The decision procedures of
Chapters 8, 9, and 11 are all combinable using the Nelson-Oppen method.
Chapter 12 presents a methodology for constructing invariant generation
procedures. These procedures reason inductively about programs to aid in
Preface IX
1–4
5,6 7 8 9
12 10
11
Teaching
This book can be used in various ways and taught at multiple levels. Figure
0.1 presents a dependency graph for the chapters. There are two main tracks:
the verification track, which focuses on Chapters 1–4, 5, 6, and 12; and the
decision procedures track, which focuses on Chapters 1–4 and 7–11. Within
the decision procedures track, the reader can focus on the quantifier-free de-
cision procedures track, which skips Chapters 7 and 11. The reader interested
in quickly obtaining an understanding of modern combination decision proce-
dures would prefer this final track.
We have annotated several sections with a ⋆ to indicate that they provide
additional depth that is unnecessary for understanding subsequent material.
Additionally, all proofs may be skipped without preventing a general under-
standing of the material.
Each chapter ends with a set of exercises. Some require just a mechanical
understanding of the material, while others require a conceptual understand-
ing or ask the reader to think beyond what is presented in the book. These
latter exercises are annotated with a ⋆ . For certain audiences, additional exer-
cises might include implementing decision procedures or invariant generation
procedures and exploring certain topics in greater depth (see Chapter 13).
In our courses, we assign program verification exercises from Chapters 5
and 6 throughout the term to give students time to develop this important
skill. Learning to verify programs is about as difficult for students as learning
X Preface
to program in the first place. Specifying and verifying programs also strength-
ens the students’ facility with logic.
Bibliographic Remarks
Acknowledgments
This material is based upon work supported by the National Science Foun-
dation under Grant Nos. CSR-0615449 and CNS-0411363 and by Navy/ONR
contract N00014-03-1-0939. Any opinions, findings, and conclusions or rec-
ommendations expressed in this material are those of the authors and do
not necessarily reflect the views of the National Science Foundation or the
Navy/ONR. The first author received additional support from a Sang Samuel
Wang Stanford Graduate Fellowship.
We thank the following people for their comments throughout the writ-
ing of this book: Miquel Bertran, Andrew Bradley, Susan Bradley, Chang-
Seo Park, Caryn Sedloff, Henny Sipma, Matteo Slanina, Sarah Solter, Fabio
Somenzi, Tomás Uribe, the students of CS156, and Alfred Hofmann and the
reviewers and editors at Springer. Their suggestions helped us to improve
the presentation substantially. Remaining errors and shortcomings are our
responsibility.
Part I Foundations
1 Propositional Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3 Satisfiability and Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3.1 Truth Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.2 Semantic Arguments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4 Equivalence and Implication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.6 Normal Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7 Decision Procedures for Satisfiability . . . . . . . . . . . . . . . . . . . . . . . 21
1.7.1 Simple Decision Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.7.2 Reconsidering the Truth-Table Method . . . . . . . . . . . . . . . 22
1.7.3 Conversion to an Equisatisfiable Formula in CNF . . . . . . 24
1.7.4 The Resolution Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.7.5 DPLL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2 First-Order Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.1 Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3 Satisfiability and Validity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.4 Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.4.1 Safe Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.4.2 Schema Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.5 Normal Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.6 Decidability and Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.6.1 Satisfiability as a Formal Language . . . . . . . . . . . . . . . . . . 53
XII Contents
2.6.2 Decidability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.6.3 ⋆ Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.7 ⋆ Meta-Theorems of First-Order Logic . . . . . . . . . . . . . . . . . . . . . 56
2.7.1 Simplifying the Language of FOL . . . . . . . . . . . . . . . . . . . . 57
2.7.2 Semantic Argument Proof Rules . . . . . . . . . . . . . . . . . . . . . 58
2.7.3 Soundness and Completeness . . . . . . . . . . . . . . . . . . . . . . . 58
2.7.4 Additional Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3 First-Order Theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.1 First-Order Theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.2 Equality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.3 Natural Numbers and Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.3.1 Peano Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.3.2 Presburger Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.3.3 Theory of Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.4 Rationals and Reals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.4.1 Theory of Reals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.4.2 Theory of Rationals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.5 Recursive Data Structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.6 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.7 ⋆ Survey of Decidability and Complexity . . . . . . . . . . . . . . . . . . . 90
3.8 Combination Theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4 Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.1 Stepwise Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.2 Complete Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.3 Well-Founded Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.4 Structural Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
Bibliographic Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
11 Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
11.1 Arrays with Uninterpreted Indices . . . . . . . . . . . . . . . . . . . . . . . . . 292
11.1.1 Array Property Fragment . . . . . . . . . . . . . . . . . . . . . . . . . . 292
11.1.2 Decision Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
11.2 Integer-Indexed Arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Contents XV
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Part I
Foundations
Everything is vague to a degree you do not realize till you have tried
to make it precise.
— Bertrand Russell
Philosophy of Logical Atomism, 1918
Modern design and implementation of software and hardware systems lacks
precision. Design documents written in a natural language admit misinterpre-
tation. Informal arguments about why a system works miss crucial weaknesses.
The resulting systems are fragile. Part I of this book presents an alternative
approach to system design and implementation based on using a formal lan-
guage to specify and reason about software systems.
Chapters 1 and 2 introduce the (first-order) predicate calculus. Chapter 1
presents the propositional calculus, and Chapter 2 presents the full predicate
calculus. A central task is determining whether formulae of the calculus are
valid. Chapter 3 formalizes common data types of software in the predicate
calculus. It also introduces the concepts of decidability and complexity of
deciding validity of formulae.
The final three chapters of Part I discuss applications of the predicate cal-
culus. Chapter 4 formalizes mathematical induction in the predicate calculus,
in the process introducing several forms of induction that may be new to the
reader. Chapters 5 and 6 then apply the predicate calculus and mathemat-
ical induction to the specification and verification of software. Specification
consists of asserting facts about software. Verification applies mathematical
induction to prove that each assertion evaluates to true when program con-
trol reaches it; and to prove that program control eventually reaches specific
program locations.
Part I thus provides the mathematical foundations for precise engineering.
Part II will investigate algorithmic aspects of applying these foundations.
1
Propositional Logic
This chapter and the next introduce the calculus that will be the basis for
studying computation in this book. In this chapter, we cover propositional
logic (PL); in the next chapter, we build on the presentation to define first-
order logic (FOL). PL and FOL are also known as propositional calculus
and predicate calculus, respectively, because they are calculi for reasoning
about propositions (“the sky is blue”, “this comment references itself”) and
predicates (“x is blue”, “y references z”), respectively. Propositions are either
true or false, while predicates evaluate to true or false depending on the values
given to their parameters (x, y, and z).
Just as differential calculus has a set of symbols, a set of rules, and a
mapping to reality that provides its meaning, propositional logic has its own
symbols, rules of inference, and meaning. Sections 1.1 and 1.2 introduce the
syntax and semantics (meaning) of PL formulae. Then Section 1.3 discusses
two concepts that are fundamental throughout this book, satisfiability (Is
this formula ever true?) and validity (Is this formula always true?), and the
rules for computing whether a PL formula is satisfiable or valid. Rules for
manipulating PL formulae, some of which preserve satisfiability and validity,
are discussed in Section 1.5 and applied in Section 1.6.
1.1 Syntax
In this section, we introduce the syntax of PL. The syntax of a logical lan-
guage consists of a set of symbols and rules for combining them to form
“sentences” (in this case, formulae) of the language.
The basic elements of PL are the truth symbols ⊤ (“true”) and ⊥
(“false”) and the propositional variables, usually denoted by P , Q, R,
P1 , P2 , . . .. A countably infinite set of propositional variable symbols exists.
Logical connectives, also called Boolean connectives, provide the expres-
sive power of PL. A formula is simply ⊤, ⊥, or a propositional variable P ; or
the application of one of the following connectives to formulae F , F1 , or F2 :
• ¬F : negation, pronounced “not”;
• F1 ∧ F2 : conjunction, pronounced “and”;
• F1 ∨ F2 : disjunction, pronounced “or”;
• F1 → F2 : implication, pronounced “implies”;
• F1 ↔ F2 : iff, pronounced “if and only if”.
Each connective has an arity (the number of arguments that it takes): nega-
tion is unary (it takes one argument), while the other connectives are binary
(they take two arguments). The left and right arguments of → are called the
antecedent and consequent, respectively.
Some common terminology is useful. An atom is a truth symbol ⊤, ⊥ or
propositional variable P , Q, . . .. A literal is an atom α or its negation ¬α. A
formula is a literal or the application of a logical connective to a formula or
formulae.
1.1 Syntax 5
F : (P ∧ Q) → (P ∨ ¬Q) .
F , P ∨ ¬Q , ¬Q , P ∧Q , P , Q.
F ′ : P ∧ Q → P ∨ ¬Q .
Also,
P1 ∧ ¬P2 ∧ ⊤ ∨ ¬P1 ∧ P2
stands for
Finally,
P1 → P2 → P3
abbreviates
P1 → (P2 → P3 ) .
6 1 Propositional Logic
1.2 Semantics
So far, we have considered the syntax of PL. The semantics of a logic provides
its meaning. What exactly is meaning? In PL, meaning is given by the truth
values true and false, where true 6= false. Our objective is to define how to
give meaning to formulae.
The first step in defining the semantics of PL is to provide a mechanism
for evaluating the propositional variables. An interpretation I assigns to
every propositional variable exactly one truth value. For example,
I : {P 7→ true, Q 7→ false, . . .}
F ¬F
0 1
1 0
F1 F2 F1 ∧ F2 F1 ∨ F2 F1 → F2 F1 ↔ F2
0 0 0 0 1 1
0 1 0 1 1 0
1 0 0 1 0 0
1 1 1 1 1 1
F : P ∧ Q → P ∨ ¬Q
I : {P 7→ true, Q 7→ false} .
P Q ¬Q P ∧ Q P ∨ ¬Q F
1 0 1 0 1 1
The top row is given by the subformulae of F . I provides values for the first
two columns; then the semantics of PL provide the values for the remainder
of the table. Hence, F evaluates to true under I.
I |= ⊤
I 6|= ⊥
Under any interpretation I, ⊤ has value true, and ⊥ has value false. Next,
define the truth value of propositional variables:
P has value true iff the interpretation I assigns P to have value true.
Since an interpretation assigns a truth value to every propositional vari-
able, I assigns false to P when I does not assign true to P . Thus, we can
instead define the truth values of propositional variables as follows:
Since true 6= false, both definitions yield the same (unique) truth values.
Having completed the base cases of our inductive definition, we turn to
the inductive step. Assume that formulae F , F1 , and F2 have truth values.
From these formulae, evaluate the semantics of more complex formulae:
I |= ¬F iff I 6|= F
I |= F1 ∧ F2 iff I |= F1 and I |= F2
I |= F1 ∨ F2 iff I |= F1 or I |= F2
I |= F1 → F2 iff, if I |= F1 then I |= F2
I |= F1 ↔ F2 iff I |= F1 and I |= F2 , or I 6|= F1 and I 6|= F2
8 1 Propositional Logic
The formula F1 → F2 has truth value true under I when either F1 is false
or F2 is true. It is false only when F1 is true and F2 is false. Our inductive
definition of the semantics of PL is complete.
F : P ∧ Q → P ∨ ¬Q
I : {P 7→ true, Q 7→ false} .
I |= F1 → F2 iff, if I |= F1 then I |= F2
the implication F1 → F2 has value true when I 6|= F1 . Thus, line 5 is unnec-
essary for establishing the truth value of F .
F : P ∧ Q → P ∨ ¬Q .
Is it valid? Construct a table in which the first row is a list of the subformulae
of F ordered according to the subformula ordering. Fill columns of proposi-
tional variables with all possible combinations of truth values. Then apply the
semantics of PL to fill the rest of the table:
P QP ∧Q ¬Q P ∨ ¬Q F
0 0 0 1 1 1
0 1 0 0 0 1
1 0 0 1 1 1
1 1 1 0 1 1
The final column, which represents the truth value of F under the possible
interpretations, is filled entirely with true. F is valid.
F : P ∨Q → P ∧Q .
Because the second and third rows show that F can be false, F is invalid.
10 1 Propositional Logic
I |= ¬F I 6|= ¬F
I 6|= F I |= F
I |= F ∧ G I 6|= F ∧ G
I |= F I 6|= F | I 6|= G
I |= G
I |= F ∨ G I |6 = F ∨ G
I |= F | I |= G I 6|= F
I 6|= G
I |= F → G I |6 = F → G
I 6|= F | I |= G I |= F
I 6|= G
1.3 Satisfiability and Validity 11
F : (P → Q) ∧ (Q → R) → (P → R)
1. I 6|= F assumption
2. I |= (P → Q) ∧ (Q → R) by 1 and semantics of →
3. I 6|= P →R by 1 and semantics of →
4. I |= P by 3 and semantics of →
5. I 6|= R by 3 and semantics of →
6. I |= P →Q by 2 and semantics of ∧
7. I |= Q→R by 2 and semantics of ∧
Now there are two more cases from 7. In the first case,
and I 6|= G for some formula G. Otherwise, the branch is open. A semantic
argument is finished when no more proof rules are applicable. It is a proof
of the validity of F if every branch is closed; otherwise, each open branch
describes a falsifying interpretation of F .
While the given proof rules are (theoretically) sufficient, derived proof
rules can make proofs more concise.
Example 1.9. The derived rule of modus ponens simplifies the proof of
Example 1.8. The rule is the following:
I |= F
I |= F → G
I |= G
F : (P → Q) ∧ (Q → R) → (P → R) .
¬F : ¬(P ∨ Q → P ∧ Q)
is satisfiable:
P QP ∨Q P ∧Q F ¬F
0 0 0 0 1 0
0 1 1 0 0 1
1 0 1 0 0 1
1 1 1 1 1 0
G : ¬(P ∨ Q → P ∧ Q)
F : P ∨Q → P ∧Q
1. I 6|= P ∨ Q → P ∧ Q assumption
2. I |= P ∨ Q by 1 and semantics of →
3. I 6|= P ∧ Q by 1 and semantics of →
I : {P 7→ true, Q 7→ false}
P ⇔ ¬¬P ,
we prove that
P ↔ ¬¬P
P → Q ⇔ ¬P ∨ Q ,
we prove that
F : P → Q ↔ ¬P ∨ Q
R ∧ (¬R ∨ P ) ⇒ P ,
we prove that
16 1 Propositional Logic
F : R ∧ (¬R ∨ P ) → P
is valid via a semantic argument. Suppose F is not valid; then there exists an
interpretation I such that I 6|= F :
1. I 6|= F assumption
2. I |= R ∧ (¬R ∨ P ) by 1 and semantics of →
3. I 6|= P by 1 and semantics of →
4. I |= R by 2 and semantics of ∧
5. I |= ¬R ∨ P by 2 and semantics of ∧
1.5 Substitution
Substitution is a syntactic operation on formulae with significant semantic
consequences. It allows us to prove the validity of entire sets of formulae via
formula templates. It is also an essential tool for manipulating formulae
throughout the text.
A substitution σ is a mapping from formulae to formulae:
σ : {F1 7→ G1 , . . . , Fn 7→ Gn } .
domain(σ) : {F1 , . . . , Fn } ,
range(σ) : {G1 , . . . , Gn } .
F : P ∧ Q → P ∨ ¬Q
and substitution
σ : {P 7→ R, P ∧ Q 7→ P → Q} .
Then
F σ : (P → Q) → R ∨ ¬Q ,
F σ 6= R ∧ Q → R ∨ ¬Q
by our convention.
F [F1 , . . . , Fn ]σ : F [G1 , . . . , Gn ] .
σ : {F1 7→ G1 , . . . , Fn 7→ Gn }
σ : {P → Q 7→ ¬P ∨ Q}
to
18 1 Propositional Logic
F : (P → Q) → R .
F σ : (¬P ∨ Q) → R
is equivalent to F .
F : (P → Q) ↔ (¬P ∨ Q)
σ1 σ2 : {P 7→ R, P ∧ Q 7→ P → Q}{P 7→ S, S 7→ Q}
as follows:
{P 7→ Rσ2 , P ∧ Q 7→ (P → Q)σ2 , S 7→ Q}
= {P 7→ R, P ∧ Q 7→ S → Q, S 7→ Q}
¬¬F1 ⇔ F1
¬⊤ ⇔ ⊥
¬⊥ ⇔ ⊤
¬(F1 ∧ F2 ) ⇔ ¬F1 ∨ ¬F2
¬(F1 ∨ F2 ) ⇔ ¬F1 ∧ ¬F2
F1 → F2 ⇔ ¬F1 ∨ F2
F1 ↔ F2 ⇔ (F1 → F2 ) ∧ (F2 → F1 )
F1 → F2 ⇔ ¬F1 ∨ F2 (1.1)
to produce
Proposition 1.17 implies that the result is valid. Then construct substitution
F ′ : (Q1 ∧ Q2 ) ∨ (R1 ∨ R2 ) .
F ′′ : (Q1 ∨ R1 ∨ R2 ) ∧ (Q2 ∨ R1 ∨ R2 ) ,
Section 1.3 introduced the truth-table and semantic argument methods for
determining the satisfiability of PL formulae. In this section, we study al-
gorithms for deciding satisfiability (see Section 2.6 for a formal discussion of
decidability). A decision procedure for satisfiability of PL formulae reports,
after some finite amount of computation, whether a given PL formula F is
satisfiable.
In the naive decision procedure based on the truth-table method, the entire
table is constructed. Actually, only one row need be considered at a time, mak-
ing for a space efficient procedure. This idea is implemented in the following
recursive algorithm for deciding the satisfiability of a PL formula F :
The notation “let rec sat F =” declares sat as a recursive function that
takes one argument, a formula F . The notation “let P = choose vars(F ) in”
means that P ’s value in the subsequent text is the variable returned by the
choose function. When applying the substitutions F {P 7→ ⊤} or F {P 7→ ⊥},
the template equivalences of Exercise 1.2 should be applied to simplify the
result. Then the comparisons F = ⊤ and F = ⊥ can be implemented as
purely syntactic operations.
At each recursive step, if F is not yet ⊤ or ⊥, a variable is chosen on which
to branch. Each possibility for P is attempted if necessary. This algorithm
returns true immediately upon finding a satisfying interpretation. Otherwise,
if F is unsatisfiable, it eventually returns ⊥. sat may save branching on certain
variables by simplifying intermediate formulae.
Example 1.23. Consider the formula
F : (P → Q) ∧ P ∧ ¬Q .
To compute sat F , choose a variable, say P , and recurse on the first case,
F {P 7→ ⊤} : (⊤ → Q) ∧ ⊤ ∧ ¬Q ,
which simplifies to
1.7 Decision Procedures for Satisfiability 23
F
P 7→ ⊤ P 7→ ⊥ F
F1 : Q ∧ ¬Q ⊥ P 7→ ⊤ P 7→ ⊥
Q 7→ ⊤ Q 7→ ⊥ ⊥ ⊤
⊥ ⊥
(a) (b)
F1 : Q ∧ ¬Q .
F1 {Q 7→ ⊤} and F1 {Q 7→ ⊥} .
F {P 7→ ⊥} : (⊥ → Q) ∧ ⊥ ∧ ¬Q ,
which simplifies to ⊥. Thus, this branch also ends without finding a satisfying
interpretation. Thus, F is unsatisfiable.
The run of sat on F is visualized in Figure 1.1(a).
F : (P → Q) ∧ ¬P .
To compute sat F , choose a variable, say P , and recurse on the first case,
F {P 7→ ⊤} : (⊤ → Q) ∧ ¬⊤ ,
F {P 7→ ⊥} : (⊥ → Q) ∧ ¬⊥
I : {P 7→ false, Q 7→ true} .
→ Rep(F )
P Q ∧ Rep(P ∧ ¬R)
P ¬ Rep(¬R)
Fig. 1.2. Parse tree of F : P ∨Q → ¬(P ∧¬R) with representatives for subformulae
The next two decision procedures operate on PL formulae in CNF. The trans-
formation suggested in Section 1.6 produces an equivalent formula that can be
exponentially larger than the original formula: consider converting a formula
in DNF into CNF. However, to decide the satisfiability of F , we need only
examine a formula F ′ such that F and F ′ are equisatisfiable. F and F ′ are
equisatisfiable when F is satisfiable iff F ′ is satisfiable.
We define a method for converting PL formula F to equisatisfiable PL
formula F ′ in CNF that is at most a constant factor larger than F . The main
idea is to introduce new propositional variables to represent the subformulae
of F . The constructed formula F ′ includes extra clauses that assert that these
new variables are equivalent to the subformulae that they represent.
Figure 1.2 visualizes the idea of the procedure. Each node of the “parse
tree” of F represents a subformula G of F . With each node G is associated a
representative propositional variable Rep(G). In the constructed formula F ′ ,
each representative Rep(G) is asserted to be equivalent to the subformula G
that it represents in such a way that the conjunction of all such assertions is
in CNF. Finally, the representative Rep(F ) of F is asserted to be true.
To obtain a small formula in CNF, each assertion of equivalence between
Rep(G) and G refers at most to the children of G in the parse tree. How is this
possible when a subformula may be arbitrarily large? The main trick is to refer
to the representatives of G’s children rather than the children themselves.
Let the “representative” function Rep : PL → V ∪{⊤, ⊥} map PL formulae
to propositional variables V, ⊤, or ⊥. In the general case, it is intended to
map a formula F to its representative propositional variable PF such that the
truth value of PF is the same as that of F . In other words, PF provides a
compact way of referring to F .
Let the “encoding” function En : PL → PL map PL formulae to PL formu-
lae. En is intended to map a PL formula F to a PL formula F ′ in CNF that
asserts that F ’s representative, PF , is equivalent to F : “Rep(F ) ↔ F ”.
1.7 Decision Procedures for Satisfiability 25
As the base cases for defining Rep and En, define their behavior on ⊤, ⊥,
and propositional variables P :
Rep(⊤) = ⊤ En(⊤) = ⊤
Rep(⊥) = ⊥ En(⊥) = ⊤
Rep(P ) = P En(P ) = ⊤
The representative of ⊤ is ⊤ itself, and the representative of ⊥ is ⊥ itself.
Thus, Rep(⊤) ↔ ⊤ and Rep(⊥) ↔ ⊥ are both trivially valid, so En(⊤) and
En(⊥) are both ⊤. Finally, the representative of a propositional variable P is
P itself; and again, Rep(P ) ↔ P is trivially valid so that En(P ) is ⊤.
For the inductive case, F is a formula other than an atom, so define its
representative as a unique propositional variable PF :
Rep(F ) = PF .
together assert
P → Rep(F1 ) ∧ Rep(F2 )
Rep(F1 ) ∧ Rep(F2 ) → P .
En(F1 ∨ F2 ) =
let P = Rep(F1 ∨ F2 ) in
(¬P ∨ Rep(F1 ) ∨ Rep(F2 )) ∧ (¬Rep(F1 ) ∨ P ) ∧ (¬Rep(F2 ) ∨ P )
En(F1 → F2 ) =
let P = Rep(F1 → F2 ) in
(¬P ∨ ¬Rep(F1 ) ∨ Rep(F2 )) ∧ (Rep(F1 ) ∨ P ) ∧ (¬Rep(F2 ) ∨ P )
En(F1 ↔ F2 ) =
let P = Rep(F1 ↔ F2 ) in
(¬P ∨ ¬Rep(F1 ) ∨ Rep(F2 )) ∧ (¬P ∨ Rep(F1 ) ∨ ¬Rep(F2 ))
∧ (P ∨ ¬Rep(F1 ) ∨ ¬Rep(F2 )) ∧ (P ∨ Rep(F1 ) ∨ Rep(F2 ))
Having defined En, let us construct the full CNF formula that is equisat-
isfiable to F . If SF is the set of all subformulae of F (including F itself),
then
^
F ′ : Rep(F ) ∧ En(G)
G∈SF
F : (Q1 ∧ Q2 ) ∨ (R1 ∧ R2 ) ,
SF : {Q1 , Q2 , Q1 ∧ Q2 , R1 , R2 , R1 ∧ R2 , F }
and compute
En(Q1 ) = ⊤
En(Q2 ) = ⊤
En(Q1 ∧ Q2 ) = (¬P(Q1 ∧Q2 ) ∨ Q1 ) ∧ (¬P(Q1 ∧Q2 ) ∨ Q2 )
∧ (¬Q1 ∨ ¬Q2 ∨ P(Q1 ∧Q2 ) )
1.7 Decision Procedures for Satisfiability 27
En(R1 ) = ⊤
En(R2 ) = ⊤
En(R1 ∧ R2 ) = (¬P(R1 ∧R2 ) ∨ R1 ) ∧ (¬P(R1 ∧R2 ) ∨ R2 )
∧ (¬R1 ∨ ¬R2 ∨ P(R1 ∧R2 ) )
En(F ) = (¬P(F ) ∨ P(Q1 ∧Q2 ) ∨ P(R1 ∧R2 ) )
∧ (¬P(Q1 ∧Q2 ) ∨ P(F ) )
∧ (¬P(R1 ∧R2 ) ∨ P(F ) )
Then
^
F ′ : P(F ) ∧ En(G)
G∈SF
C1 [P ] C2 [¬P ]
C1 [⊥] ∨ C2 [⊥]
From the two clauses of the premise, deduce the new clause, called the resol-
vent.
If ever ⊥ is deduced via resolution, F must be unsatisfiable since F ∧ ⊥ is
unsatisfiable. Otherwise, if every possible resolution produces a clause that is
already known, then F must be satisfiable.
F : (¬P ∨ Q) ∧ P ∧ ¬Q .
From resolution
(¬P ∨ Q) P
,
Q
28 1 Propositional Logic
construct
F1 : (¬P ∨ Q) ∧ P ∧ ¬Q ∧ Q .
From resolution
¬Q Q
,
⊥
F : (¬P ∨ Q) ∧ ¬Q .
(¬P ∨ Q) ¬Q
¬P
yields
F1 : (¬P ∨ Q) ∧ ¬Q ∧ ¬P .
I : {P 7→ false, Q 7→ false}
is a satisfying interpretation. A CNF formula that does not contain the clause
⊥ and to which no more resolutions can be applied represents all possible
satisfying interpretations.
1.7.5 DPLL
Modern satisfiability procedures for propositional logic are based on the Davis-
Putnam-Logemann-Loveland algorithm (DPLL), which combines the space-
efficient procedure of Section 1.7.2 with a restricted form of resolution. We
review in this section the basic algorithm. Much research in the past decade
has advanced the state-of-the-art considerably.
Like the resolution procedure, DPLL operates on PL formulae in CNF.
But again, as the procedure decides satisfiability, we can apply the conversion
procedure of Section 1.7.3 to produce a small equisatisfiable CNF formula.
As in the procedure sat, DPLL attempts to construct an interpretation of
F ; failing to do so, it reports that the given formula is unsatisfiable. Rather
than relying solely on enumerating possibilities, however, DPLL applies a
restricted form of resolution to gain some deductive power. The process of
applying this restricted resolution as much as possible is called Boolean con-
straint propagation (BCP).
1.7 Decision Procedures for Satisfiability 29
ℓ C[¬ℓ]
.
C[⊥]
Unlike with full resolution, the literals of the resolvent are a subset of the
literals of the second clause. Hence, the resolvent replaces the second clause.
F : (P ) ∧ (¬P ∨ Q) ∧ (R ∨ ¬Q ∨ S) ,
P (¬P ∨ Q)
Q
produces
F ′ : (Q) ∧ (R ∨ ¬Q ∨ S) .
Q R ∨ ¬Q ∨ S
R∨S
produces
F ′′ : (R ∨ S) ,
F
Q 7→ ⊤ Q 7→ ⊥
(R) ∧ (¬R) ∧ (P ∨ ¬R) (¬P ∨ R)
R 7→ ⊤
R (¬R)
⊥ ¬P
P 7→ ⊥
⊥ I : {P 7→ false, Q 7→ false, R 7→ true}
F : (P ) ∧ (¬P ∨ Q) ∧ (R ∨ ¬Q ∨ S) .
On the first level of recursion, dpll recognizes the unit clause (P ) and applies
the BCP steps from Example 1.28, resulting in the formula
F ′′ : R ∨ S .
{P 7→ true, Q 7→ true} .
is a satisfying interpretation of F .
Branching was not required in this example.
F {Q 7→ ⊥} : (¬P ∨ R) .
1.8 Summary
This chapter introduces propositional logic (PL). It covers:
• Its syntax. How one constructs a PL formula. Propositional variables,
atoms, literals, logical connectives.
• Its semantics. What a PL formula means. Truth values true and false.
Interpretations. Truth-table definition, inductive definition.
• Satisfiability and validity. Whether a PL formula evaluates to true under
any or all interpretations. Duality of satisfiability and validity, truth-table
method, semantic argument method.
• Equivalence and implication. Whether two formulae always evaluate to the
same truth value under every interpretation. Whether under any interpre-
tation, if one formula evaluates to true, the other also evaluates to true.
Reduction to validity.
• Substitution, which is a tool for manipulating formulae and making general
claims. Substitution of equivalent formulae. Valid templates.
• Normal forms. A normal form is a set of syntactically restricted formulae
such that every PL formula is equivalent to some member of the set.
• Decision procedures for satisfiability. Truth-table method, sat, resolution
procedure, dpll. Transformation to equisatisfiable CNF formula.
32 1 Propositional Logic
Bibliographic Remarks
Exercises
1.1 (PL validity & satisfiability). For each of the following PL formulae,
identify whether it is valid or not. If it is valid, prove it with a truth table or
semantic argument; otherwise, identify a falsifying interpretation. Recall our
conventions for operator precedence and associativity from Section 1.1.
(a) P ∧ Q → P → Q
(b) (P → Q) ∨ P ∧ ¬Q
(c) (P → Q → R) → P → R
(d) (P → Q ∨ R) → P → R
(e) ¬(P ∧ Q) → R → ¬R → Q
(f) P ∧ Q ∨ ¬P ∨ (¬Q → ¬P )
(g) (P → Q → R) → ¬R → ¬Q → ¬P
(h) (¬R → ¬Q → ¬P ) → P → Q → R
(g) F ∨ ⊤ ⇔ ⊤
(h) F ∨ ⊥ ⇔ F
(i) F ∨ F ⇔ F
(j) F → ⊤ ⇔ ⊤
(k) F → ⊥ ⇔ ¬F
(l) ⊤ → F ⇔ F
(m) ⊥ → F ⇔ ⊤
(n) ⊤ ↔ F ⇔ F
(o) ⊥ ↔ F ⇔ ¬F
(p) ¬(F1 ∧ F2 ) ⇔ ¬F1 ∨ ¬F2
(q) ¬(F1 ∨ F2 ) ⇔ ¬F1 ∧ ¬F2
(r) F1 → F2 ⇔ ¬F1 ∨ F2
(s) F1 → F2 ⇔ ¬F2 → ¬F1
(t) ¬(F1 → F2 ) ⇔ F1 ∧ ¬F2
(u) (F1 ∨ F2 ) ∧ F3 ⇔ (F1 ∧ F3 ) ∨ (F2 ∧ F3 )
(v) (F1 ∧ F2 ) ∨ F3 ⇔ (F1 ∨ F3 ) ∧ (F2 ∨ F3 )
(w) (F1 → F3 ) ∧ (F2 → F3 ) ⇔ F1 ∨ F2 → F3
(x) (F1 → F2 ) ∧ (F1 → F3 ) ⇔ F1 → F2 ∧ F3
(y) F1 → F2 → F3 ⇔ F1 ∧ F2 → F3
(z) (F1 ↔ F2 ) ∧ (F2 ↔ F3 ) ⇒ (F1 ↔ F3 )
F1 F2 F1 ∧F2
0 0 1
0 1 1
1 0 1
1 1 0
1.5 (Normal forms). Convert the following PL formulae to NNF, DNF, and
CNF via the transformations of Section 1.6.
(a) ¬(P → Q)
(b) ¬(¬(P ∧ Q) → ¬R)
(c) (Q ∧ R → (P ∨ ¬Q)) ∧ (P ∨ R)
(d) ¬(Q → R) ∧ P ∧ (Q ∨ ¬(P ∧ R))
34 1 Propositional Logic
2.1 Syntax
All formulae of PL evaluate to true or false. FOL is not so simple. In FOL,
terms evaluate to values other than truth values such as integers, people, or
cards of a deck. However, we are getting ahead of ourselves: just as in PL,
36 2 First-Order Logic
the syntax of FOL is independent of its meaning. The most basic terms are
variables x, y, z, x1 , x2 , . . . and constants a, b, c, a1 , a2 , . . ..
More complicated terms are constructed using functions. An n-ary func-
tion f takes n terms as arguments. Notationally, we represent generic FOL
functions by symbols f , g, h, f1 , f2 , . . .. A constant can also be viewed as a
0-ary function.
Example 2.3. In
∀x. p(f (x), x) → (∃y. p(f (g(x, y)), g(x, y))) ∧ q(x, f (x)) ,
| {z }
G
| {z }
F
the scope of x is F , and the scope of y is G. This formula is read: “for all x, if
p(f (x), x) then there exists a y such that p(f (g(x, y)), g(x, y)) and q(x, f (x))”.
Example 2.4. In
x only occurs bound, while y appears both free (in the antecedent) and bound
(in the consequent). Thus, free(F ) = {y} and bound(F ) = {x, y}.
∀x1 . . . . ∀xn . F ,
∃x1 . . . . ∃xn . F .
• and the subterms of f (t1 , . . . , tn ) are the term itself and the subterms of
t1 , . . . , tn .
The strict subterms of a term excludes the term itself.
Example 2.5. In
F , p(f (x), y) → ∀y. p(f (x), y) , ∀y. p(f (x), y) , p(f (x), y) .
• Fido is a dog. Furrball is a cat. Fido has fewer days than does Furrball.
• The length of one side of a triangle is less than the sum of the lengths of
the other two sides.
2.2 Semantics
Having defined the syntax of FOL, we now define its semantics. Formulae
of FOL evaluate to the truth values true and false as in PL. However, terms
of FOL formulae evaluate to values from a specified domain. We extend the
concept of interpretations to this more complex setting and then define the
semantics of FOL in terms of interpretations.
First, we define a FOL interpretation I. The domain DI of an interpre-
tation I is a nonempty set of values or objects, such as integers, real numbers,
dogs, people, or merely abstract objects. |DI | denotes the cardinality, or
size, of DI . Domains can be finite, such as the 52 cards of a deck of cards;
countably infinite, such as the integers; or uncountably infinite, such as the
reals. But all domains are nonempty.
The assignment αI of interpretation I maps constant, function, and pred-
icate symbols to elements, functions, and predicates over DI . It also maps
variables to elements of DI :
• each variable symbol x is assigned a value xI from DI ;
• each n-ary function symbol f is assigned an n-ary function
fI : DIn → DI
contains the binary function symbols + and −, the binary predicate symbol >,
and the variables x, y, and z. Again, +, −, and > are just symbols: we choose
these names to provide intuition for the intended meaning of the formulae.
We could just as easily have written
DI = Z = {. . . , −2, −1, 0, 1, 2, . . .} .
40 2 First-Order Logic
The elision reminds us that, as always, αI provides values for the countably
infinitely many other constant, function, and predicate symbols. Usually, we
do not write the elision.
I |= ⊤
I 6|= ⊥
Next, consider more complicated atoms. αI gives meaning αI [x], αI [c], and
αI [f ] to variables x, constants c, and functions f . Evaluate arbitrary terms
recursively:
Then
I |= ¬F iff I 6|= F
I |= F1 ∧ F2 iff I |= F1 and I |= F2
I |= F1 ∨ F2 iff I |= F1 or I |= F2
I |= F1 → F2 iff, if I |= F1 then I |= F2
I |= F1 ↔ F2 iff I |= F1 and I |= F2 , or I 6|= F1 and I 6|= F2
2.2 Semantics 41
In the first line, basic reasoning about the interpretation I reveals that f and
g always disagree. The second line follows from the first by the semantics of
existential quantification.
42 2 First-Order Logic
The restriction in the latter two rules corresponds to our intuition: if all we
know is that ∃x. F , then we certainly do not know which value in particular
satisfies F . Hence, we choose a new value v that does not appear previously in
the proof: it was never introduced before by a quantification rule. Moreover,
αI does not already assign it to some constant, αI [a], or to some function
application, αI [f (t1 , . . . , tn )].
Notice the similarity between the first two and between the final two rules.
The first two rules handle a case that is universal in character. Consider the
second rule: if there does not exist an x such that F , then for all values, F
does not hold. The final two rules are existential in character.
Lastly, the contradiction rule is modified for the FOL case.
• A contradiction exists if two variants of the original interpretation I dis-
agree on the truth value of an n-ary predicate p for a given tuple of domain
values.
J : I ⊳ · · · |= p(s1 , . . . , sn )
K : I ⊳ · · · 6|= p(t1 , . . . , tn ) for i ∈ {1, . . . , n}, αJ [si ] = αK [ti ]
I |= ⊥
The intuition behind the contradiction rule is the following. The variants J
and K are constructed only through the rules for quantification. Hence, the
truth value of p on the given tuple of domain values is already established
by I. Therefore, the disagreement between J and K on the truth value of p
indicates a problem with I.
None of these rules cause branching, but several of the rules for the logical
connectives do. Thus, a proof in general is a tree. A branch is closed if it
contains a contradiction according to the (first-order) contradiction rule; it is
open otherwise. All branches are closed in a finished proof of a valid formula.
We exhibit the proof method through several examples.
Example 2.10. We prove that
1. I 6|= F assumption
2. I |= ∀x. p(x) 1 and semantics of →
3. I 6|= ∀y. p(y) 1 and semantics of →
4. I ⊳ {y 7→ v} 6|= p(y) 3 and semantics of ∀, for some v ∈ DI
5. I ⊳ {x 7→ v} |= p(x) 2 and semantics of ∀
Lines 2 and 3 state the case in which line 1 holds: the antecedent and conse-
quent of F are respectively true and false under I. Line 4 states that because
of 3, there must be a value v ∈ DI such that I ⊳ {y 7→ v} 6|= p(y). Line 5 uses
this same value v and the semantics of ∀ with 2 to derive a contradiction:
under I, p(v) is false by 4 and true by 5. Thus, F is valid.
44 2 First-Order Logic
Choose
DI = {0, 1}
and
We use a common notation for defining relations: pI (a, b) is true iff (a, b) ∈ pI .
Here, pI (0, 0) is true, and pI (1, 0) is false.
Both ∀x. p(x, x) and ¬(∃x. ∀y. p(x, y)) evaluate to true under I, so
2.4 Substitution
Substitution for FOL is more complex than substitution for PL because of
quantification. We introduce two types of substitution in this section with
the goal of generalizing Propositions 1.15 and 1.17 to the FOL setting. As in
PL, substitution allows us to consider the validity of entire sets of formulae
simultaneously.
46 2 First-Order Logic
Vσ consists of the free variables of all formulae Fi and Gi of the domain and
range of σ. Compute the safe substitution F σ of formula F as follows:
1. For each quantified variable x in F such that x ∈ Vσ , rename x to a fresh
variable to produce F ′ .
2. Compute F ′ σ.
and substitution
Then
1. x ∈ Vσ , so rename bound occurrences in F :
in which the quantified variable has a different name than any free variable
of F or the substitution
48 2 First-Order Logic
σ : {F1 7→ G1 , . . . , Fn 7→ Gn }
H : (∀x. F ) ↔ (¬∃x. ¬F )
is valid. It can act as a formula schema. First, rewrite the formula using
placeholders:
H : (∀x. F ) ↔ (¬∃x. ¬F ) .
H does not have any side conditions. Next, to prove the validity of
2.4 Substitution 49
consider the two directions. First, assume that I 6|= (∀x. F1 ∧F2 ) → (∀x. F1 ) ∧
(∀x. F2 ):
1. I |= ∀x. F1 ∧ F2 assumption
2. I |6 = (∀x. F1 ) ∧ (∀x. F2 ) assumption
3. I |= (∃x. ¬F1 ) ∨ (∃x. ¬F2 ) 2, ¬
9a. I ⊳ {x 7→ v} |= ¬F1 4, ∨
If we disregard the side condition, then H is an invalid formula schema as, for
example,
σ : {F 7→ p(x)} ,
1. I |= ∀x. F assumption
2. I 6|= F assumption
3. I |= F 1, ∀, since x 6∈ free(F )
4. I |= ⊥ 2, 3
Second,
F1 → F2 ⇔ ¬F1 ∨ F2 .
and the final formula, which is in NNF, follows from De Morgan’s Law.
Q1 x1 . . . . Qn xn . F [x1 , . . . , xn ] ,
F4 : Q1 x1 . . . . Qn xn . F3 ,
1. Write F in NNF:
is not equivalent to F .
A FOL formula is in CNF (DNF) if it is in PNF and its main quantifier-
free subformula is in CNF (DNF). CNF and DNF equivalents are obtained by
transforming formula F into PNF formula F ′ and then applying the relevant
procedure of Section 1.6 to the main quantifier-free subformula of F ′ .
2.6.2 Decidability
⋆
2.6.3 Complexity
∀n ≥ n0 . g(n) ≤ cf (n) .
O(f (n)) denotes the set of all functions of at most order f (n). Similarly,
Ω(f (n)) denotes the set of all function of at least order f (n): a function g(n)
is of at least order f (n) if there exist a scalar c ≥ 0 and an an integer n0 ≥ 0
such that
∀n ≥ n0 . g(n) ≥ cf (n) .
56 2 First-Order Logic
Finally, Θ(f (n)) = Ω(f (n)) ∩ O(f (n)) denotes the set of all functions of
precisely order f (n).
Example 2.28.
• 3n2 + n ∈ O(n2 )
• 3n2 + n ∈ Ω(n2 )
• 3n2 + n ∈ Θ(n2 )
1 2 2
• 99 n + n ∈ Ω(n )
2 n
• 3n + n ∈ O(2 )
• 3n2 + n ∈ Ω(n)
• 3n2 + n 6∈ Ω(2n )
• 3n2 + n 6∈ Θ(2n )
• 2n ∈ Ω(n3 )
• 2n 6∈ O(n3 )
A decision problem has time complexity O(f (n)) if there exists a decision
algorithm P for the problem and a function g(n) ∈ O(f (n)) such that P
runs in time at most g(n) on input of size n. A decision problem has time
complexity Ω(f (n)) if there exists a function g(n) ∈ Ω(f (n)) such that all
decision algorithms P for the problem run in time at least g(n) on input of
size n. Finally, a decision problem has time complexity Θ(f (n)) if it has time
complexities Ω(f (n)) and O(f (n)).
Example 2.29. The algorithm sat for deciding PL satisfiability runs in time
Θ(2n ), where n is the number of variables in the input formula, because each
level of recursion branches. Hence, the problem of PL satisfiability has time
complexity O(2n ).
⋆
2.7 Meta-Theorems of First-Order Logic
We prove that the semantic argument method for FOL is sound and, given
a proper strategy of applying the proof rules, complete. A proof method is
sound if every formula that has a proof according to the method is valid. A
proof method is complete if every formula that is valid has a proof according
to the method. That the semantic argument method is sound means that a
closed semantic argument for I 6|= F proves the validity of F ; and that the
semantic argument method is complete means that every valid formula F
of FOL has a closed semantic argument proving its validity. Because there
exists a complete proof method for FOL, FOL is a complete logic: every valid
formula of FOL has a proof of its validity.
The second half of this section is devoted to proving two classic theorems
that we apply in Chapter 10.
⋆
2.7 Meta-Theorems of First-Order Logic 57
In preparation for the proofs, we simplify the language of FOL without los-
ing expressiveness. Exercises 1.3 and 4.6 show that we have many redundant
logical connectives. We choose to use only the logical constant ⊤ and the con-
nectives ¬ and ∧, from which the others can be constructed. Additionally, we
need only one quantifier since ∃x. F is equivalent to ¬∀x. ¬F . We choose ∀.
A second simplification is more involved. The goal is to remove constant
and function symbols from the language by using predicate symbols instead.
Given a formula F , let S be the set of function symbols appearing in it.
Associate with each n-ary function symbol f of S a new (n + 1)-ary predicate
pf . Then for each occurrence of a function f in a literal L of F
L[f (t1 , . . . , tn )] ,
After all replacements, the resulting formula G does not contain any function
symbols.
The next step ensures that the new predicate pf describes a function f : it
associates with each tuple of domain values v1 , . . . , vn precisely one value v.
For each introduced predicate pf , construct the formula
(∀x. x = x)
E: ∧ (∀x, y. x = y → y = x)
∧ (∀x, y, z. x = y ∧ y = z → x = z)
F ′ is valid iff F is valid. Moreover, F ′ does not contain any function symbols.
For the special case of constant symbols, it is simpler to replace F [a] with
F ′ : ∀x. F [x].
For the remainder of this section, we consider a version of FOL with only
the logical constant ⊤, the connectives ¬ and ∧, the quantifier ∀, and predicate
symbols. It is equivalent in expressive power to the richer language studied
earlier in the chapter.
Lemma 2.31. Each open branch of a finished semantic argument for I 6|= F
defines a falsifying interpretation of F .
Remark 2.33. We defined the procedure with a fixed countably infinite do-
main in mind and then proved that an open branch of a finished semantic
argument corresponds to at least one falsifying interpretation. Therefore, we
have proved an additional fact: every satisfiable FOL formula is satisfied by an
interpretation with a countable domain. This result is Löwenheim’s Theo-
rem.
Proof. Suppose that F is valid, yet no semantic argument proof exists. Then
a finished semantic argument constructed according to our procedure has an
open branch. By Lemma 2.31, this branch describes a falsifying interpretation
of F , a contradiction. Hence, all branches of a finished semantic argument must
in fact be closed (and thus finite). By König’s Lemma, the semantic argument
itself has finite size.
Remark 2.36. This proof proves an additional fact that extends Löwenheim’s
Theorem: every simultaneously satisfiable countable set of FOL formulae is
simultaneously satisfied by an interpretation with a countable domain. This
result is the Löwenheim-Skolem Theorem.
the rules of 2.7.2 can be translated into a proof using the new rules. Then we
prove the Craig Interpolation Lemma using these new proof rules.
One trick that will prove convenient is the following. Associate a fresh
variable xi with each domain value vi introduced during the proof. Whenever
a variant I ⊳ {x 7→ vi } is used, rename x to the variable xi corresponding to
the value vi in both the variant interpretation and the formula. This renaming
does not affect the soundness of the proof, but it makes contradictions more
obvious.
The new rules are the following:
• For handling double negation:
I |= ¬¬F I |6 = ¬¬F
I |= F I 6|= F
I |= F ∧ G I 6|= F ∧ G
I |= F I 6|= F | I 6|= G
I |= G
and
I |= ¬(F ∧ G) I |6 = ¬(F ∧ G)
I |= ¬F | I |= ¬G I 6|= ¬F
I 6|= ¬G
and
I 6|= ∀x. F I |= ¬∀x. F for a fresh v ∈ DI
I ⊳ {x 7→ v} 6|= F I ⊳ {x →
7 v} |= ¬F
J : I ⊳ · · · |= p(x1 , . . . , xn )
K : I ⊳ · · · 6|= p(x1 , . . . , xn )
I |= ⊥
and
J : I ⊳ · · · |= p(x1 , . . . , xn ) J : I ⊳ · · · |6 = p(x1 , . . . , xn )
K : I ⊳ · · · |= ¬p(x1 , . . . , xn ) K : I ⊳ · · · |6 = ¬p(x1 , . . . , xn )
I |= ⊥ I |= ⊥
⋆
2.7 Meta-Theorems of First-Order Logic 63
The important characteristic (for proving the interpolation lemma) of this set
of proof rules is that premises and deductions agree on the use of |= or 6|=,
except in the contradiction rules. In contrast, the negation rules of Section
2.7.2 do not have this property. We obtained this property by folding each
negation rule into every other rule.
Before proving the interpolation lemma, let us prove that the new semantic
argument proof system based on these rules is sound and complete. Soundness
is fairly obvious; for completeness, we briefly describe how to map a proof from
the system of Section 2.7.2 to a proof using these rules.
Lemma 2.37. Every proof in the proof system of Section 2.7.2 has a corre-
sponding proof in the new proof system.
Proof. In constructing the new proof, ignore any use of the negation rules
of Section 2.7.2, instead choosing from the (doubled) set of conjunction and
quantification rules depending on whether a ¬ is at the root of the parse tree
of a formula. Use the new negation rules to remove double negations when
necessary. For deriving a contradiction, one of the three cases represented by
the contradiction rules must occur when a contradiction occurs in the original
proof.
1. I |= F assumption
2. I |6 = G assumption
Notice that with the new set of proof rules, only |= rules will be applied
to deductions stemming from line 1, while only 6|= rules will be applied to
those stemming from line 2. The three contradiction rules correspond to three
possible situations: a contradiction between I |= F and I 6|= G (F → G is
valid), within I |= F itself (F is unsatisfiable), and within I 6|= G itself (G is
valid).
The procedure runs backwards through a proof. It associates with each line
L of the proof a set of positive formulae U and a set of negative formulae V .
U consists of formulae on lines from which L descends (including itself) that
are satisfied by their interpretation (lines of the form K |= F1 ). V consists of
formulae on lines from which L descends (including itself) that are falsified
by their interpretation (lines of the form K 6|= F2 ). Define L’s characteristic
formula as
64 2 First-Order Logic
^ _
U → V ,
and the predicates and free variables of X appear in both U and V . The
interpolant of line 2 of the proof is the interpolant H that we seek.
Let us begin with the end of a branch, L : I |= ⊥. It must have been
deduced via a contradiction. If the first contradiction rule produced L, then
its characteristic formula is of the form
where the variable renaming trick ensures that the arguments to p are syn-
tactically the same. Its parent has characteristic formula
and both have interpolant p(x1 , . . . , xn ). If the second contradiction rule pro-
duced L, then its characteristic formula is of the form
Both have interpolant ⊥ (¬⊤ in the restricted language). Similarly, if the third
contradiction rule produced L, then the interpolant is ⊤.
Consider lines derived via the conjunction rules. Suppose L : I |= F is
deduced from I |= F ∧ G. Then the characteristic formulae of L and its parent
are
respectively. If L has interpolant X, then so does its parent. The case is similar
for a line L : I 6|= ¬F deduced from I 6|= ¬(F ∧ G).
For the next conjunction rule, suppose that L : I 6|= F is deduced on one
branch from I 6|= F ∧ G. Then L is at a fork in the proof and has sibling line
L′ : I 6|= G. The characteristic formulae of L, L′ , and their parent are
∀ ∗ . ∀z. X → V ∨ ∀x.F ∨ F
is equivalent to
∀ ∗ . X → V ∨ ∀x.F ∨ ∀z. F
and thus to
∀ ∗ . X → V ∨ ∀x.F .
respectively. Clearly,
U ∧ ∀x. F ∧ F ⇒ X
implies that
U ∧ ∀x. F ⇒ X .
U ∧ ∀x. F ⇒ ∀z. X
66 2 First-Order Logic
and
∀z. X ⇒ V because ∀z. X ⇒ X and X ⇒ V .
For the final case, suppose that L : I ⊳ {z 7→ v} 6|= ¬F is deduced from
I 6|= ¬∀x. F . The characteristic formula of L is
{U } → {V, ¬∀x. F, F } .
Then X is the interpolant of the parent M unless z is free in U but not free
in V . In the latter case, the interpolant is ∃z. X (¬∀z. ¬X in the restricted
language). The reasoning is similar to the previous case, completing the proof.
2.8 Summary
Building on the presentation of PL in Chapter 1, this chapter introduces first-
order logic (FOL). It covers:
• Its syntax. How one constructs a FOL formula. Variables, terms, function
symbols, predicate symbols, atoms, literals, logical connectives, quantifiers.
• Its semantics. What a FOL formula means. Truth values true and false.
Interpretations: domain and assignments. Difference between a function
(predicate) symbol and a function (predicate) over a domain.
• Satisfiability and validity. Whether a FOL formula evaluates to true under
any or all interpretations. Semantic argument method.
• Substitution, which is a tool for manipulating formulae and making general
claims. Safe and schema substitutions. Substitution of equivalent formulae.
Valid schemata.
• Normal forms. A normal form is a set of syntactically restricted formulae
such that every FOL formula is equivalent to some member of the set.
• A review of decidability and complexity theory, which provides the concepts
necessary for discussing decidability and complexity questions in logic.
• Meta-theorems. Semantic argument method is sound and complete. Com-
pactness Theorem. Craig Interpolation Lemma.
The results of Section 2.7 are the groundwork for our theoretical treatment
of the Nelson-Oppen combination method in Chapter 10.
FOL is the most general logic that is discussed in this book. Its applications
include software and hardware design and analysis, knowledge representation,
and complexity and decidability theory.
FOL is a complete logic: every valid FOL formula has a proof in the se-
mantic argument method. However, validity is undecidable. Many applications
benefit from complete automation, which is impossible when considering all
of FOL. Therefore, Chapter 3 introduces first-order theories, which formal-
ize interesting structures, such as integers, rationals, lists, stacks, and arrays.
Part II of this book explores algorithms for reasoning within these theories.
Bibliographic Remarks
For a complete and concise presentation of propositional and first-order logic,
see Smullyan’s text First-Order Logic [87]. The semantic argument method
is similar to Smullyan’s tableau method. Also, the proofs of completeness of
the semantic argument method, the Compactness Theorem, and the Craig
Interpolation Lemma are inspired by Smullyan’s presentation.
The history of the development of mathematical logic is rich. For an
overview, see [98] and related articles in The Stanford Encyclopedia of Phi-
losophy. We mention in particular Hilbert’s program of the 1920s — see, for
example, [38] — to find a consistent and complete axiomatization of arith-
metic. Gödel's two incompleteness theorems proved that such a goal is impossible. The first incompleteness theorem, which Gödel presented in a lecture in September, 1930, and then in [36], states that any consistent axiomatization of arithmetic admits true sentences that are not provable within the theory. The second,
which Gödel had proved by October, 1930, states that a theory such as Peano
arithmetic cannot prove its own consistency unless it is itself inconsistent.
Earlier, Gödel proved that first-order logic is complete [35]: every theorem
has a proof. However, Church — and, independently, Turing — proved that
satisfiability in first-order logic is undecidable [13]. Thus, while every theorem
of first-order logic has a finite proof, invalid formulae need not have a finite
proof of their invalidity.
For an introduction to formal languages, decidability, and complexity the-
ory, see [85, 72, 41].
Exercises
2.1 (English and FOL). Encode the following English sentences into FOL.
(a) Some days are longer than others.
(b) In all the world, there is but one place that I call home.
(c) My mother’s mother is my grandmother.
(d) The intersection of two convex sets is convex.
2.2 (FOL validity & satisfiability). For each of the following FOL formu-
lae, identify whether it is valid or not. If it is valid, prove it with a semantic
argument; otherwise, identify a falsifying interpretation.
(a) (∀x, y. p(x, y) → p(y, x)) → ∀z. p(z, z)
(b) ∀x, y. p(x, y) → p(y, x) → ∀z. p(z, z)
(c) (∃x. p(x)) → ∀y. p(y)
(d) (∀x. p(x)) → ∃y. p(y)
(e) ∃x, y. (p(x, y) → (p(y, x) → ∀z. p(z, z)))
2.3 (Semantic argument). Use the semantic argument method to prove the
following formula schemata.
2.4 (Normal forms). Put the following formulae into prenex normal form.
(a) (∀x. ∃y. p(x, y)) → ∀x. p(x, x)
(b) ∃z. (∀x. ∃y. p(x, y)) → ∀x. p(x, z)
(c) ∀w. ¬(∃x, y. ∀z. p(x, z) → q(y, z)) ∧ ∃z. p(w, z)
3.2 Equality
The theory of equality TE is the simplest first-order theory. Its signature
ΣE : {=, a, b, c, . . . , f, g, h, . . . , p, q, r, . . .}
consists of
• = (equality), a binary predicate;
• and all constant, function, and predicate symbols.
Equality = is an interpreted predicate symbol: its meaning is defined via
the axioms of TE . The other constant, function, and predicate symbols are
uninterpreted except as they relate to equality. The axioms of TE are the
following:
1. ∀x. x = x (reflexivity)
2. ∀x, y. x = y → y = x (symmetry)
3. ∀x, y, z. x = y ∧ y = z → x = z (transitivity)
4. for each positive integer n and n-ary function symbol f ,
∀x, y. x1 = y1 ∧ · · · ∧ xn = yn → f (x) = f (y)   (function congruence)
5. for each positive integer n and n-ary predicate symbol p,
∀x, y. x1 = y1 ∧ · · · ∧ xn = yn → (p(x) ↔ p(y))   (predicate congruence)
The notation x stands for the list of variables x1 , . . . , xn . Axioms (function con-
gruence) and (predicate congruence) are actually axiom schemata. An axiom
schema stands for a set of axioms, each an instantiation of the parameters (f
and p in (function congruence) and (predicate congruence), respectively). For
example, for binary function symbol f2 , (function congruence) instantiates to the following axiom:
∀x1 , x2 , y1 , y2 . x1 = y1 ∧ x2 = y2 → f2 (x1 , x2 ) = f2 (y1 , y2 ) .
These schemata assert that a function (predicate) symbol produces the same value (truth value) for a given set of argument values, however those values are named. They assert that = is a congruence relation.
TE is just as undecidable as full FOL because it allows all constant, func-
tion, and predicate symbols. In particular, any FOL formula F can be en-
coded as a ΣE -formula F ′ simply by replacing occurrences of the symbol =
with a fresh symbol. Since = does not occur in this transformed formula F ′ ,
the axioms of TE are irrelevant; hence, F ′ is TE -satisfiable iff F ′ is first-order
satisfiable.
However, the quantifier-free fragment of TE is both interesting and effi-
ciently decidable, as we show in Chapter 9.
Example 3.1. Without quantifiers, free variables and constants play the same role. In the formula
F : a = b ∧ b = c → g(f (a), b) = g(f (c), a) ,
a, b, and c may be viewed as either free variables or constants. To prove that F is TE -valid, assume otherwise and derive a contradiction:
1. I ⊭ F   assumption
2. I |= a = b ∧ b = c   1, →
3. I ⊭ g(f (a), b) = g(f (c), a)   1, →
4. I |= a = b   2, ∧
5. I |= b = c   2, ∧
6. I |= a = c   4, 5, (transitivity)
7. I |= f (a) = f (c)   6, (function congruence)
8. I |= b = a   4, (symmetry)
9. I |= g(f (a), b) = g(f (c), a)   7, 8, (function congruence)
10. I |= ⊥   3, 9
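Chapter 9 develops the congruence closure algorithm that decides such quantifier-free ΣE -formulae. As a preview, here is a minimal congruence-closure sketch in Python; it is an illustration only, under our own term encoding (terms as nested tuples) and our own helper names, not the book's procedure verbatim:

# Terms are nested tuples: a = ('a',), f(a) = ('f', ('a',)), etc.
class UnionFind:
    def __init__(self):
        self.parent = {}
    def find(self, t):
        self.parent.setdefault(t, t)
        while self.parent[t] != t:
            self.parent[t] = self.parent[self.parent[t]]  # path halving
            t = self.parent[t]
        return t
    def union(self, s, t):
        self.parent[self.find(s)] = self.find(t)

def subterms(t, acc):
    acc.add(t)
    for arg in t[1:]:
        subterms(arg, acc)

def congruent(uf, s, t):
    # Same symbol, same arity, pairwise-equivalent arguments.
    return (s[0] == t[0] and len(s) == len(t) and
            all(uf.find(p) == uf.find(q) for p, q in zip(s[1:], t[1:])))

def contradicts(equalities, disequality):
    terms = set()
    for s, t in equalities + [disequality]:
        subterms(s, terms)
        subterms(t, terms)
    uf = UnionFind()
    for s, t in equalities:
        uf.union(s, t)
    changed = True
    while changed:  # propagate (function congruence) to a fixpoint
        changed = False
        for s in terms:
            for t in terms:
                if uf.find(s) != uf.find(t) and congruent(uf, s, t):
                    uf.union(s, t)
                    changed = True
    s, t = disequality
    return uf.find(s) == uf.find(t)

a, b, c = ('a',), ('b',), ('c',)
f_a, f_c = ('f', a), ('f', c)
# a = b ∧ b = c ∧ g(f(a), b) ≠ g(f(c), a) is unsatisfiable, so F is valid:
print(contradicts([(a, b), (b, c)], (('g', f_a, b), ('g', f_c, a))))  # True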
3.3 Natural Numbers and Integers
The theory of Peano arithmetic TPA has signature
ΣPA : {0, 1, +, ·, =} ,
where
• 0 and 1 are constants;
• + (addition) and · (multiplication) are binary functions;
• and = (equality) is a binary predicate.
Its axioms are the following:
1. ∀x. ¬(x + 1 = 0) (zero)
2. ∀x, y. x + 1 = y + 1 → x = y (successor)
3. F [0] ∧ (∀x. F [x] → F [x + 1]) → ∀x. F [x] (induction)
4. ∀x. x + 0 = x (plus zero)
5. ∀x, y. x + (y + 1) = (x + y) + 1 (plus successor)
6. ∀x. x · 0 = 0 (times zero)
7. ∀x, y. x · (y + 1) = x · y + x (times successor)
These axioms concisely define addition, multiplication, and equality over nat-
ural numbers. Informally, axioms (zero), (plus zero), and (times zero) define
0 as we understand it: it is the minimal element of the natural numbers; it
is the identity for addition (x + 0 = x); and under multiplication, it maps
any number to 0 (x · 0 = 0). Axioms (zero), (successor), (plus zero), and (plus
successor) define addition. Axioms (times zero) and (times successor) define
multiplication: in particular, (times successor) defines multiplication in terms
of addition.
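To see these recursive definitions in action, one can transliterate them into a toy evaluator. The following Python sketch is our illustration (the names plus and times are ours); it computes purely by the recursions (plus zero), (plus successor), (times zero), and (times successor):

def plus(x, y):
    # x + 0 = x;  x + (y + 1) = (x + y) + 1
    return x if y == 0 else plus(x, y - 1) + 1

def times(x, y):
    # x · 0 = 0;  x · (y + 1) = x · y + x
    return 0 if y == 0 else plus(times(x, y - 1), x)

assert plus(3, 5) == 8 and times(3, 5) == 15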
(induction) is an axiom schema: it stands for the set of axioms obtained
by substituting for F each ΣPA -formula that has precisely one free variable.
It asserts that every TPA -interpretation I obeys induction: if I satisfies F [0]
and ∀x. F [x] → F [x + 1], then I also satisfies ∀x. F [x].
For convenience, we usually do not write the “·” for multiplication. For
example, we write xy rather than x · y.
For example, the formula 3x + 5 = 2y can be written as
x + x + x + 1 + 1 + 1 + 1 + 1 = y + y
or as
(1 + 1 + 1) · x + 1 + 1 + 1 + 1 + 1 = (1 + 1) · y .
Example 3.4. Rather than augmenting TPA with axioms defining inequality
>, we can transform formulae with inequality into formulae over the restricted
signature ΣPA . Write
3x + 5 > 2y   as   ∃z. z ≠ 0 ∧ 3x + 5 = 2y + z ,
where z ≠ 0 abbreviates ¬(z = 0). The latter formula is a ΣPA -formula. Weak
inequality can be similarly transformed. Write
3x + 5 ≥ 2y as ∃z. 3x + 5 = 2y + z .
For example, the formula
∃x, y, z. x ≠ 0 ∧ y ≠ 0 ∧ z ≠ 0 ∧ xx + yy = zz
is TPA -valid. It asserts that there exists a triple of positive integers fulfilling the Pythagorean Theorem. The set of formulae
{∀x, y, z. x ≠ 0 ∧ y ≠ 0 ∧ z ≠ 0 → xⁿ + yⁿ ≠ zⁿ : n > 2 ∧ n ∈ Z}
encodes Fermat's Last Theorem: each member is TPA -valid iff the theorem holds for that exponent.
The theory of Presburger arithmetic TN has signature
ΣN : {0, 1, +, =} ,
where
• 0 and 1 are constants;
• + (addition) is a binary function;
• and = (equality) is a binary predicate.
Its axioms are a subset of the axioms of TPA :
1. ∀x. ¬(x + 1 = 0) (zero)
2. ∀x, y. x + 1 = y + 1 → x = y (successor)
3. F [0] ∧ (∀x. F [x] → F [x + 1]) → ∀x. F [x] (induction)
4. ∀x. x + 0 = x (plus zero)
5. ∀x, y. x + (y + 1) = (x + y) + 1 (plus successor)
Again, (induction) is an axiom schema standing for the set of axioms obtained
by replacing F with each ΣN -formula that has precisely one free variable.
The intended interpretations of TN have domain N and are such that
• αI [0] is 0N ∈ N;
• αI [1] is 1N ∈ N;
• αI [+] is +N , addition over N;
• αI [=] is =N , equality over N.
How does one reason about all integers, Z = {. . . , −2, −1, 0, 1, 2, . . .}? Such
formulae can be encoded as ΣN -formulae.
F1 : ∀wp , wn , xp , xn . ∃yp , yn , zp , zn .
        (xp − xn ) + 2(yp − yn ) − (zp − zn ) − 13 > −3(wp − wn ) + 5 ,
in which each integer variable x is represented by the difference xp − xn of two natural-number variables. Rewriting so that subtraction disappears yields
F2 : ∀wp , wn , xp , xn . ∃yp , yn , zp , zn .
        xp + 2yp + zn + 3wp > xn + 2yn + zp + 13 + 3wn + 5 .
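To gain confidence in this rewriting, one can compare the integer original against the ΣN version on concrete values. The following Python sketch is our illustration; splitting each integer v into the pair (max(v, 0), max(−v, 0)) is one convenient choice of representation:

import random

def original(w, x, y, z):          # over Z
    return x + 2*y - z - 13 > -3*w + 5

def encoded(wp, wn, xp, xn, yp, yn, zp, zn):   # over N
    return xp + 2*yp + zn + 3*wp > xn + 2*yn + zp + 13 + 3*wn + 5

for _ in range(1000):
    vals = [random.randint(-50, 50) for _ in range(4)]
    w, x, y, z = vals
    (wp, wn), (xp, xn), (yp, yn), (zp, zn) = \
        [(max(v, 0), max(-v, 0)) for v in vals]
    assert original(w, x, y, z) == encoded(wp, wn, xp, xn, yp, yn, zp, zn)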
Presburger showed in 1929 that TN is decidable. Therefore, the “theory
of (negative and positive) integers” that we loosely constructed above is also
decidable via the syntactic rewriting of formulae into ΣN -formulae. Rather
than using this cumbersome rewriting, however, we next study a theory of
integers.
The theory of integers TZ has signature
ΣZ : {. . . , −2, −1, 0, 1, 2, . . . , −3·, −2·, 2·, 3·, . . . , +, −, =, >} ,
where
• . . . , −2, −1, 0, 1, 2, . . . are constants, intended to be assigned the obvious
corresponding values in the intended domain of integers Z;
• . . . , −3·, −2·, 2·, 3·, . . . are unary functions, intended to represent con-
stant coefficients (e.g., 2 · x, abbreviated 2x);
• + and − are binary functions, intended to represent the obvious corre-
sponding functions over Z;
• = and > are binary predicates, intended to represent the obvious corre-
sponding predicates over Z.
Since Example 3.7 shows that ΣZ -formulae can be reduced to ΣN -formulae, we
do not axiomatize TZ . TZ is merely a convenient representation for reasoning
about addition over all integers.
∀x. ∃y. x = y + 1 ,
∀x. x ≥ 0 → ∃y. y ≥ 0 ∧ x = y + 1 ,
To prove that the ΣZ -formula
F : ∀x, y, z. x > z ∧ y ≥ 0 → x + y > z
is TZ -valid, assume otherwise and derive a contradiction:
1. I ⊭ F   assumption
2. I1 : I ⊳ {x ↦ v1 } ⊳ {y ↦ v2 } ⊳ {z ↦ v3 }
       ⊭ x > z ∧ y ≥ 0 → x + y > z   1, ∀
3. I1 |= x > z ∧ y ≥ 0   2, →
4. I1 ⊭ x + y > z   2, →
5. I1 |= ¬(x + y > z)   4, ¬
6. I1 |= ⊥   3, 5, TZ
Therefore, F is TZ -valid.
To prove that the ΣZ -formula
F : ∀x, y. x > 0 ∧ (x = 2y ∨ x = 2y + 1) → x − y > 0
is TZ -valid, assume otherwise:
1. I ⊭ F   assumption
2. I1 : I ⊳ {x ↦ v1 } ⊳ {y ↦ v2 }
       ⊭ x > 0 ∧ (x = 2y ∨ x = 2y + 1) → x − y > 0   1, ∀
3. I1 |= x > 0 ∧ (x = 2y ∨ x = 2y + 1)   2, →
4. I1 |= x > 0   3, ∧
5. I1 |= x = 2y ∨ x = 2y + 1   3, ∧
6. I1 ⊭ x − y > 0   2, →
7. I1 |= ¬(x − y > 0)   6, ¬
8a. I1 |= x = 2y   5, ∨
We collect the formulae of lines 4, 7, and 8a, apply the variant interpretation I1 , and query the theory TZ : are there integers v1 , v2 such that
v1 > 0 ∧ ¬(v1 − v2 > 0) ∧ v1 = 2v2 ,
which simplifies to
v2 > 0 ∧ v2 ≤ 0 ?
No integers satisfy both literals, so this case is contradictory:
9a. I1 |= ⊥   4, 7, 8a, TZ
8b. I1 |= x = 2y + 1   5, ∨
We similarly query TZ with the formulae of lines 4, 7, and 8b: are there integers v1 , v2 such that
v1 > 0 ∧ ¬(v1 − v2 > 0) ∧ v1 = 2v2 + 1 ,
which simplifies to
2v2 + 1 > 0 ∧ v2 + 1 ≤ 0 ?
The first literal holds only when v2 > −1, while the second holds only when
v2 ≤ −1, a contradiction. This reasoning is summarized by
9b. I1 |= ⊥ 4, 7, 8b, TZ
Thus, F is TZ -valid.
3.4 Rationals and Reals
Consider the formulae
F : ∃x. 2x = 7
and
G : ∃x. x² = 2 .
F is satisfiable over the rationals but not over the integers, while G is satisfiable over the reals but not over the rationals.
The theory of reals TR has signature
ΣR : {0, 1, +, −, ·, =, ≥} ,
where
• 0 and 1 are constants;
• + (addition) and · (multiplication) are binary functions;
• − (negation) is a unary function;
• and = (equality) and ≥ (weak inequality) are binary predicates.
TR has the most complex axiomatization of the theories that we study. We
group axioms by their mathematical content.
First are the axioms of an abelian group. An abelian group is a structure
with additive identity 0, associative and commutative addition +, additive
inverse −, and equality =. The qualifier “abelian” simply means that addition
is commutative. The axioms are the following:
1. ∀x, y, z. (x + y) + z = x + (y + z) (+ associativity)
2. ∀x. x + 0 = x (+ identity)
3. ∀x. x + (−x) = 0 (+ inverse)
4. ∀x, y. x + y = y + x (+ commutativity)
The first three axioms are the axioms of a group.
Second are the additional axioms of a ring. A ring is an abelian group with
a multiplicative identity 1 and associative multiplication · that distributes over
addition. For convenience, we usually shorten x · y to xy.
1. ∀x, y, z. (xy)z = x(yz) (· associativity)
2. ∀x. 1x = x (· left identity)
3. ∀x. x1 = x (· right identity)
4. ∀x, y, z. x(y + z) = xy + xz (left distributivity)
5. ∀x, y, z. (x + y)z = xz + yz (right distributivity)
Both left and right identity and distributivity axioms are required since · is
not commutative (yet). It is made so in the next set of axioms.
Third are the additional axioms of a field. In a field, · is commutative;
the additive and multiplicative identities are different; and the multiplicative
inverse of a non-0 value exists (e.g., ½ is the multiplicative inverse of 2).
1. ∀x, y. xy = yx (· commutativity)
2. 0 ≠ 1 (separate identities)
3. ∀x. x ≠ 0 → ∃y. xy = 1 (· inverse)
The axiom (· commutativity) makes the (· right identity) and (right distributivity)
axioms redundant.
Fourth are the additional axioms characterizing ≥ as a total order.
1. ∀x, y. x ≥ y ∧ y ≥ x → x = y (antisymmetry)
2. ∀x, y, z. x ≥ y ∧ y ≥ z → x ≥ z (transitivity)
3. ∀x, y. x ≥ y ∨ y ≥ x (totality)
Finally are the additional axioms of a real closed field.
1. ∀x, y, z. x ≥ y → x + z ≥ y + z (+ ordered)
2. ∀x, y. x ≥ 0 ∧ y ≥ 0 → xy ≥ 0 (· ordered)
3. ∀x. ∃y. x = y² ∨ x = −y² (square-root)
4. for each odd integer n, an axiom asserting that every monic polynomial of degree n has a root:
∀x1 , . . . , xn . ∃y. yⁿ + x1 yⁿ⁻¹ + · · · + xn = 0
For which values of a, b, and c is the ΣR -formula
F : ∃x. ax² + bx + c = 0
satisfiable? That is, what are the conditions on a, b, and c such that a quadratic polynomial has a real root? Recall that the discriminant must be nonnegative:
F′ : b² − 4ac ≥ 0 .
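Quantifier elimination is what justifies replacing the ∃x by the quantifier-free F′. A numeric spot-check in Python, assuming a ≠ 0 so that the formula is genuinely quadratic (our illustration, not a decision procedure):

import random

for _ in range(1000):
    a = random.choice([v for v in range(-5, 6) if v != 0])
    b, c = random.randint(-5, 5), random.randint(-5, 5)
    if b*b - 4*a*c >= 0:                         # F': discriminant condition
        x = (-b + (b*b - 4*a*c) ** 0.5) / (2*a)  # quadratic formula
        assert abs(a*x*x + b*x + c) < 1e-6       # F: a real root exists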
Tarski proved that TR was decidable in the 1930s, although the Second
World War prevented his publishing the result until 1956. Collins proposed the
more efficient technique of cylindrical algebraic decomposition (CAD) in
1975. Unfortunately, even the most efficient decision procedures for TR have
prohibitively high time complexity: CAD runs in time proportional to
2^(2^(k|F|)) ,
for some constant k and for |F | the length of ΣR -formula F .
Given the high complexity of deciding TR -validity (and the high intellectual
complexity of Tarski’s and subsequent decision procedures for TR ), we turn to
a simpler theory without multiplication, the theory of rationals TQ . It has
signature
ΣQ : {0, 1, +, −, =, ≥} ,
where
• 0 and 1 are constants;
• + (addition) is a binary function;
• − (negation) is a unary function;
• and = (equality) and ≥ (weak inequality) are binary predicates.
Its axioms are the following:
1. ∀x, y. x ≥ y ∧ y ≥ x → x = y (antisymmetry)
2. ∀x, y, z. x ≥ y ∧ y ≥ z → x ≥ z (transitivity)
3. ∀x, y. x ≥ y ∨ y ≥ x (totality)
4. ∀x, y, z. (x + y) + z = x + (y + z) (+ associativity)
5. ∀x. x + 0 = x (+ identity)
6. ∀x. x + (−x) = 0 (+ inverse)
7. ∀x, y. x + y = y + x (+ commutativity)
8. ∀x, y, z. x ≥ y → x + z ≥ y + z (+ ordered)
Additionally, for each positive integer n, TQ has the axiom
∀x. nx = 0 → x = 0   (torsion-free) ,
where nx abbreviates the n-fold sum x + · · · + x. For example, we write the n = 3 instance of (torsion-free)
as the ΣQ -formula
x + x + x = 0 → x = 0 .
3.5 Recursive Data Structures
Theory of Lists
We first focus on the theory of LISP-like lists, Tcons , which has signature
Σcons : {cons, car, cdr, atom, =} ,
where
• cons is a binary function, called the constructor: cons(a, b) represents the
list constructed by concatenating a to b;
• car is a unary function, called the left projector: car(cons(a, b)) = a;
• cdr is a unary function, called the right projector: cdr(cons(a, b)) = b;
• atom is a unary predicate: atom(x) is true iff x is a single-element list;
• and = (equality) is a binary predicate.
car and cdr are historical names abbreviating “contents of address register”
and “contents of decrement register”, respectively. In the intended interpre-
tations, atoms are individual elements, while lists are multiple elements as-
sembled together via cons. For example, cons(a, cons(b, c)) is a list of three
elements, while a for which atom(a) holds is an atom. car and cdr are func-
tions for accessing parts of lists. For example, car(cons(a, cons(b, c))) returns
the head a of the list; cdr(cons(a, cons(b, c))) returns the tail cons(b, c) of the
list; and cdr(cdr(cons(a, cons(b, c)))) returns c.
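The intended interpretations are easy to mimic with nested pairs. The following Python sketch is our illustration (atoms as strings, cons as pairing); it is a model of the theory, not part of its axiomatization:

def cons(a, b): return (a, b)
def car(x): return x[0]
def cdr(x): return x[1]
def atom(x): return not isinstance(x, tuple)

x = cons('a', cons('b', 'c'))
assert car(x) == 'a'
assert cdr(x) == cons('b', 'c')
assert cdr(cdr(x)) == 'c'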
The axioms of Tcons are the following:
1. the axioms of (reflexivity), (symmetry), and (transitivity) of TE
2. instantiations of the (function congruence) axiom schema for cons, car, and cdr:
∀x1 , x2 , y1 , y2 . x1 = x2 ∧ y1 = y2 → cons(x1 , y1 ) = cons(x2 , y2 )
∀x, y. x = y → car(x) = car(y)
∀x, y. x = y → cdr(x) = cdr(y)
3. ∀x, y. car(cons(x, y)) = x   (left projection)
4. ∀x, y. cdr(cons(x, y)) = y   (right projection)
5. ∀x. ¬atom(x) → cons(car(x), cdr(x)) = x   (construction)
6. ∀x, y. ¬atom(cons(x, y))   (atom)
A variation on this theory in which data structures are acyclic has been studied. Acyclicity makes sense for stacks, but not necessarily for lists and other data structures. Consider the theory of acyclic LISP-like lists, T⁺cons . Its axioms include those of Tcons and the following axiom schema:
∀x. car(x) ≠ x
∀x. cdr(x) ≠ x
∀x. car(car(x)) ≠ x
∀x. car(cdr(x)) ≠ x
∀x. cdr(car(x)) ≠ x
...
T⁺cons is decidable, but Tcons is not. However, the quantifier-free fragments of these theories are efficiently decidable.
The axioms of Tcons leave the behavior of car and cdr on atoms unspecified.
Adding the axiom
3.6 Arrays
Arrays are another common data structure in programming. They are similar
to the uninterpreted functions of TE except that they can be modified. The
theory of arrays TA describes the basic characteristic of an array: if value v
is written to position i of array a, then subsequently reading from position i
of a should return v. Because logic is static, modified arrays are represented
functionally, as in functional programming.
The theory of arrays TA has signature
ΣA : {·[·], ·⟨· ⊳ ·⟩, =} ,
where a[i] (the read of array a at index i) is a binary function; a⟨i ⊳ v⟩ (the write of value v to index i of array a) is a ternary function; and = (equality) is a binary predicate, defined only over array elements. Its axioms include the (reflexivity), (symmetry), and (transitivity) axioms of TE and the following:
1. ∀a, i, j. i = j → a[i] = a[j]   (array congruence)
2. ∀a, v, i, j. i = j → a⟨i ⊳ v⟩[j] = v   (read-over-write 1)
3. ∀a, v, i, j. i ≠ j → a⟨i ⊳ v⟩[j] = a[j]   (read-over-write 2)
The formula
F : a[i] = e → a⟨i ⊳ e⟩ = a
is not TA -valid, although our intuition suggests that it should be. The problem
is that the interaction between = and the read and write functions is not
captured in the axioms of TA . In other words, equality between arrays, not
just between elements, is undefined.
Instead of F , we write
F′ : a[i] = e → ∀j. a⟨i ⊳ e⟩[j] = a[j] ,
which is TA -valid.
1. I ⊭ F′   assumption
2. I |= a[i] = e   1, →
3. I ⊭ ∀j. a⟨i ⊳ e⟩[j] = a[j]   1, →
4. I1 : I ⊳ {j ↦ j} ⊭ a⟨i ⊳ e⟩[j] = a[j]   3, ∀, for some j ∈ DI
5. I1 |= a⟨i ⊳ e⟩[j] ≠ a[j]   4, ¬
6. I1 |= i = j   5, (read-over-write 2)
7. I1 |= a[i] = a[j]   6, (array congruence)
8. I1 |= a⟨i ⊳ e⟩[j] = e   6, (read-over-write 1)
9. I1 |= a⟨i ⊳ e⟩[j] = a[j]   2, 7, 8, (transitivity)
10. I1 |= ⊥   4, 9
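The functional representation of arrays is easy to picture in code. In this Python sketch (our illustration), an array is a function from indices to values; a write builds a new function, and the two asserts mirror the read-over-write axioms:

def write(a, i, v):
    # a⟨i ⊳ v⟩: agrees with a everywhere except at index i
    return lambda j: v if j == i else a(j)

a = lambda j: 0           # an array holding 0 at every index
b = write(a, 3, 7)        # b = a⟨3 ⊳ 7⟩
assert b(3) == 7          # (read-over-write 1)
assert b(5) == a(5)       # (read-over-write 2)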
The theory of arrays with extensionality, TA= , augments TA with the axiom
∀a, b. (∀i. a[i] = b[i]) → a = b   (extensionality) ,
which defines equality between whole arrays in terms of their elements. Recall that an implication can also be applied in its contrapositive form:
F1 → F2 ⇔ ¬F2 → ¬F1 .
To prove that
F : a[i] = e → a⟨i ⊳ e⟩ = a
is TA= -valid, assume otherwise: there is a TA= -interpretation I such that I ⊭ F:
1. I ⊭ F   assumption
2. I |= a[i] = e   1, →
3. I ⊭ a⟨i ⊳ e⟩ = a   1, →
4. I |= a⟨i ⊳ e⟩ ≠ a   3, ¬
5. I |= ¬(∀j. a⟨i ⊳ e⟩[j] = a[j])   4, (extensionality)
6. I ⊭ ∀j. a⟨i ⊳ e⟩[j] = a[j]   5, ¬
From line 6, the proof concludes as in lines 4–10 of the previous argument, deriving I |= ⊥.
⋆ 3.7 Survey of Decidability and Complexity
F : a = b → a[i] ≥ b[i]
1. I ⊭ F   assumption
2. I |= a = b   1, →
3. I ⊭ a[i] ≥ b[i]   1, →
4. I |= ¬(a[i] ≥ b[i])   3, ¬
5. I |= a[i] = b[i]   2, TA= (extensionality)
6. I |= ⊥   4, 5, TA= ∪ TZ
For example, the (ΣE ∪ ΣQ )-formula
f (f (x) − f (y)) ≠ f (z) ∧ x ≤ y ∧ y + z ≤ x ∧ 0 ≤ z
is (TE ∪ TQ )-unsatisfiable. In particular, the final three literals imply that z = 0 and x = y, so that f (x) = f (y). But then the first literal asserts that f (0) ≠ f (0), since both f (x) − f (y) and z equal 0.
Finally, the (ΣE ∪ ΣZ )-formula
3.9 Summary
Important data types in software and hardware models include integers; ratio-
nals; recursive data structures like records, lists, stacks, and trees; and arrays.
This chapter introduces first-order theories that formalize these data types.
It covers:
• First-order theories. Formalizations of structures and operations into
first-order logic: signatures, axioms. Fragments of theories, in particular
quantifier-free fragments. Interpretations, satisfiability, validity.
• Specific theories:
– Equality defines the binary predicate = as a congruence relation. Sat-
isfiability in the quantifier-free fragment is efficiently decidable, and
the decision procedure is the basis for decision procedures for data
structures (see Chapter 9).
– Integer arithmetic. Satisfiability in integer arithmetic without multipli-
cation is decidable.
– Rational and real arithmetic. Satisfiability in real arithmetic with mul-
tiplication is decidable with high complexity. Satisfiability in ratio-
nal arithmetic without multiplication is efficiently decidable. Rational
arithmetic without multiplication is indistinguishable from real arith-
metic without multiplication.
– Recursive data structures include records, lists, stacks, and queues.
Satisfiability in the quantifier-free fragment is efficiently decidable.
Bibliographic Remarks
The undecidability of validity in FOL [13] motivated the subsequent study of
first-order theories and fragments. In 1929, Presburger proved that satisfiabil-
ity in arithmetic without multiplication is decidable [73]. Tarski showed in the
1930s that real arithmetic is decidable even with multiplication, although the
Second World War delayed the publication of this result [90]. The axiomati-
zation of recursive data structures that we study is from work by Nelson and
Oppen [66]. Oppen studied a variation in which structures are acyclic [69].
The axiomatization of arrays, in particular the read-over-write axioms, is due
to McCarthy [59]. The Nelson-Oppen combination method is based on work
by Nelson and Oppen in the late 1970s and early 1980s [65].
Exercises
3.1 (Semantic argument in TE ). Use the semantic method to argue the
validity of the following ΣE -formulae, or identify a counterexample (a falsifying
TE -interpretation).
(a) f (x, y) = f (y, x) → f (a, y) = f (y, a)
(b) f (g(x)) = g(f (x)) ∧ f (g(f (y))) = x ∧ f (y) = x → g(f (x)) = x
(c) f (f (f (a))) = f (f (a)) ∧ f (f (f (f (a)))) = a → f (a) = a
(d) f (f (f (a))) = f (a) ∧ f (f (a)) = a → f (a) = a
(e) p(x) ∧ f (f (x)) = x ∧ f (f (f (x))) = x → p(f (x))
(b) x ≤ y ∧ z = x − 1 → z≤y
(c) 3x = 2 → x ≤ 0
(d) 1 ≤ x ∧ x ≤ 2 → x=1 ∨ x=2
(e) 1 ≤ x ∧ x + y ≤ 3 ∧ 1≤y → x=1 ∨ x=2
(f) 0 ≤ x ∧ 0 ≤ x + y ∧ x + y ≤ 1 ∧ (y ≤ −2 ∨ 2 ≤ y) → 0 ≤ −1
3.3 (Semantic argument in TQ ). Use the semantic method to argue the va-
lidity of the following ΣQ -formulae, or identify a counterexample (a falsifying
TQ -interpretation).
(a) 3x = 2 → x ≤ 0
(b) 0 ≤ x + 2y ∧ 2x + y ≤ 1
(c) 1 ≤ x ∧ x ≤ 2 → x = 1 ∨ x = 2
3.4 (Semantic argument in Tcons ). Use the semantic method to argue the
validity of the following Σcons -formulae, or identify a counterexample (a falsi-
fying Tcons -interpretation).
(a) car(x) = y ∧ cdr(x) = z → x = cons(y, z)
(b) ¬atom(x) ∧ car(x) = y ∧ cdr(x) = z → x = cons(y, z)
3.5 (Semantic argument in TA ). Use the semantic method to argue the va-
lidity of the following ΣA -formulae, or identify a counterexample (a falsifying
TA -interpretation).
(a) a⟨i ⊳ e⟩[j] = e → i = j
(b) a⟨i ⊳ e⟩[j] = e → a[j] = e
(c) a⟨i ⊳ e⟩[j] = e → i = j ∨ a[j] = e
(d) a⟨i ⊳ e⟩⟨j ⊳ f ⟩[k] = g ∧ j ≠ k ∧ i = j → a[k] = g
4.1 Stepwise Induction
Arithmetic
Recall from Chapter 3 that the theory of Peano arithmetic TPA formalizes
arithmetic over the natural numbers. Its axioms include an instance of the (induction) axiom schema
F [0] ∧ (∀x. F [x] → F [x + 1]) → ∀x. F [x]
for each ΣPA -formula F [x] with only one free variable x. This axiom schema
says that to prove ∀x. F [x] — that is, F [x] is TPA -valid for all natural numbers
x — it is sufficient to do the following:
• For the base case, prove that F [0] is TPA -valid.
• For the inductive step, assume as the inductive hypothesis that for
some arbitrary natural number n, F [n] is TPA -valid. Then prove that F [n + 1] is TPA -valid under this assumption.
These two steps comprise the stepwise induction principle for Peano (and
Presburger) arithmetic.
Example 4.1. Consider the theory T⁺PA obtained from augmenting TPA with
the following axioms:
• ∀x. x^0 = 1 (exp. zero)
• ∀x, y. x^(y+1) = x^y · x (exp. successor)
• ∀x, z. exp3 (x, 0, z) = z (exp3 zero)
• ∀x, y, z. exp3 (x, y + 1, z) = exp3 (x, y, x · z) (exp3 successor)
The first two axioms define exponentiation x^y, while the latter two axioms define a ternary function exp3 (x, y, z).
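Operationally, exp3 is a tail-recursive, accumulator-passing version of exponentiation. A direct Python transliteration of the four axioms (our illustration; the function names are ours):

def exp(x, y):
    return 1 if y == 0 else exp(x, y - 1) * x        # (exp. zero/successor)

def exp3(x, y, z):
    return z if y == 0 else exp3(x, y - 1, x * z)    # (exp3 zero/successor)

assert all(exp3(x, y, 1) == exp(x, y)
           for x in range(1, 6) for y in range(6))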
Let us prove that the following formula is T⁺PA -valid:
∀x, y. exp3 (x, y, 1) = x^y .
Lists
We can define stepwise induction over recursive data structures such as lists
(see Chapters 3 and 9). Consider the theory of lists Tcons . Stepwise induction
in Tcons is defined according to the following schema:
(∀ atom u. F [u]) ∧ (∀u, v. F [v] → F [cons(u, v)]) → ∀x. F [x]
for Σcons -formulae F [x] with only one free variable x. The notation ∀ atom u. F [u]
abbreviates ∀u. atom(u) → F [u]. In other words, to prove ∀x. F [x] — that is,
F [x] is Tcons -valid for all lists x — it is sufficient to do the following:
• For the base case, prove that F [u] is Tcons -valid for an arbitrary atom u.
• For the inductive step, assume as the inductive hypothesis that for
some arbitrary list v, F [v] is valid. Then prove that for arbitrary list u,
F [cons(u, v)] is Tcons -valid under this assumption.
These steps comprise the stepwise induction principle for lists.
Example 4.2. Consider the theory T⁺cons obtained from augmenting Tcons with
the following axioms:
• ∀ atom u. ∀v. concat (u, v) = cons(u, v) (concat. atom)
• ∀u, v, x. concat (cons(u, v), x) = cons(u, concat (v, x)) (concat. list)
• ∀ atom u. rvs(u) = u (reverse atom)
• ∀x, y. rvs(concat (x, y)) = concat (rvs(y), rvs(x)) (reverse list)
• ∀ atom u. flat(u) (flat atom)
• ∀u, v. flat (cons(u, v)) ↔ atom(u) ∧ flat (v) (flat list)
The first two axioms define the concat function, which concatenates two lists
together. For example,
The next two axioms define the rvs function, which reverses a list. For exam-
ple,
Note, however, that rvs is undefined on lists like cons(cons(a, b), c), for
cons(cons(a, b), c) cannot result from concatenating two lists together. There-
fore, the final two axioms define the flat predicate, which evaluates to ⊤ on
a list iff every element is an atom. For example, cons(a, cons(b, c)) is flat , but
cons(cons(a, b), c) is not because the first element of the list is itself a list.
Let us prove that the following formula is T⁺cons -valid:
∀x. flat (x) → rvs(rvs(x)) = x .
by (flat list) and assumption. Therefore, (4.9) holds since its antecedent is ⊥.
If atom(u), then we have that
rvs(rvs(cons(u, v)))
= rvs(rvs(concat (u, v))) (concat. atom)
= rvs(concat (rvs(v), rvs(u))) (reverse list)
= concat(rvs(rvs(u)), rvs(rvs(v))) (reverse list)
= concat(u, rvs(rvs(v))) (reverse atom)
= concat(u, v) IH (4.8), since flat (v)
= cons(u, v) (concat. atom)
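These definitions run directly as code. The following Python sketch is our illustration, with atoms as strings and cons as pairing; rvs below follows the axioms and is therefore only meaningful on flat lists, as noted above:

def cons(a, b): return (a, b)
def atom(x): return not isinstance(x, tuple)

def concat(u, v):
    return cons(u, v) if atom(u) else cons(u[0], concat(u[1], v))

def rvs(u):
    if atom(u):
        return u                   # (reverse atom)
    h, t = u                       # for flat lists, cons(h, t) = concat(h, t)
    return concat(rvs(t), rvs(h))  # (reverse list)

def flat(u):
    return True if atom(u) else atom(u[0]) and flat(u[1])

x = cons('a', cons('b', 'c'))
assert flat(x) and rvs(rvs(x)) == x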
4.2 Complete Induction
Complete induction for Peano arithmetic is captured by the schema
(∀x. (∀x′. x′ < x → F [x′]) → F [x]) → ∀x. F [x]
for ΣPA -formulae F [x] with only one free variable x. In other words, to prove
∀x. F [x] — that is, F [x] is TPA -valid for all natural numbers x — it is sufficient
to follow the complete induction principle:
• Assume as the inductive hypothesis that for arbitrary natural number
n and for every natural number n′ such that n′ < n, F [n′ ] is TPA -valid.
Then prove that F [n] is TPA -valid.
It appears that we are missing a base case. In practice, a case analysis usually
requires at least one base case. In other words, the base case is implicit in
the structure of complete induction. For example, for n = 0, the inductive
hypothesis does not provide any information — there does not exist a natural
number n′ < 0. Hence, F [0] must be shown separately without assistance from
the inductive hypothesis.
Example 4.3. Consider another augmented version of Peano arithmetic, T∗PA ,
that defines integer division. It has the usual axioms of TPA plus the following:
• ∀x, y. x<y → quot(x, y) = 0 (quotient less)
• ∀x, y. y>0 → quot(x + y, y) = quot (x, y) + 1 (quotient successor)
• ∀x, y. x<y → rem(x, y) = x (remainder less)
• ∀x, y. y>0 → rem(x + y, y) = rem(x, y) (remainder successor)
These axioms define functions for computing integer quotients quot(x, y) and
remainders rem(x, y). For example, quot (5, 3) = 1 and rem(5, 3) = 2. We
prove two properties, which the reader may recall from grade school, about
these functions. First, we prove that the remainder is always less than the divisor:
∀x, y. y > 0 → rem(x, y) < y ;   (4.10)
second, that the quotient and remainder indeed decompose x:
∀x, y. y > 0 → x = y · quot (x, y) + rem(x, y) .   (4.11)
For property (4.10), the axiom (remainder successor) suggests applying complete induction on x. Thus, for the inductive hypothesis, assume that for arbitrary natural number x,
∀x′. x′ < x → ∀y. y > 0 → rem(x′, y) < y .   (4.13)
Consider arbitrary natural number y such that y > 0.
If x < y, then
rem(x, y) = x (remainder less)
<y by assumption x < y
as desired.
If ¬(x < y), then there is a natural number n, n < x, such that x = n + y.
Compute
rem(x, y) = rem(n + y, y)   x = n + y
          = rem(n, y)   (remainder successor)
          < y   IH (4.13), x′ ↦ n, since n < x
finishing the proof of this property.
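Both functions run as ordinary recursive programs, and the two properties can be spot-checked exhaustively on small inputs. A Python transliteration (our illustration):

def quot(x, y):
    return 0 if x < y else quot(x - y, y) + 1   # (quotient less/successor)

def rem(x, y):
    return x if x < y else rem(x - y, y)        # (remainder less/successor)

for x in range(100):
    for y in range(1, 20):
        assert rem(x, y) < y                      # property (4.10)
        assert x == y * quot(x, y) + rem(x, y)    # property (4.11)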
For property (4.11), (remainder successor) again suggests that we apply
complete induction on x to prove
G[x] : ∀y. y > 0 → x = y · quot (x, y) + rem(x, y) . (4.14)
Thus, for the inductive hypothesis, assume that for arbitrary natural number
x,
∀x′ . x′ < x → ∀y. y > 0 → x′ = y · quot(x′ , y) + rem(x′ , y) . (4.15)
| {z }
G[x′ ]
4.3 Well-Founded Induction
A binary relation ≻ over a set S is well-founded iff there is no infinite sequence of elements of S decreasing according to ≻:
s1 ≻ s2 ≻ s3 ≻ · · · .
Example 4.4. The relation < is well-founded over the natural numbers. Any
sequence of natural numbers decreasing according to < is finite:
However, the relation < is not well-founded over the rationals. Consider the infinite decreasing sequence
1 > 1/2 > 1/3 > 1/4 > · · · ,
that is, the sequence si = 1/(i + 1) for i ≥ 0.
Example 4.5. Consider the theory Tᴾᴬcons , which includes the axioms of Tcons and TPA and the following axioms:
• ∀ atom u, v. u ⪯c v ↔ u = v   (⪯c (1))
• ∀ atom u. ∀v. ¬atom(v) → ¬(v ⪯c u)   (⪯c (2))
• ∀ atom u. ∀v, w. u ⪯c cons(v, w) ↔ u = v ∨ u ⪯c w   (⪯c (3))
• ∀u1 , v1 , u2 , v2 . cons(u1 , v1 ) ⪯c cons(u2 , v2 )
      ↔ (u1 = u2 ∧ v1 ⪯c v2 ) ∨ cons(u1 , v1 ) ⪯c v2   (⪯c (4))
• ∀x, y. x ≺c y ↔ x ⪯c y ∧ x ≠ y   (≺c )
• ∀ atom u. |u| = 1 (length atom)
• ∀u, v. |cons(u, v)| = 1 + |v| (length list)
The first four axioms define the sublist relation ⪯c : x ⪯c y holds iff x is a (not necessarily strict) sublist of y. The next axiom defines the strict sublist relation: x ≺c y iff x is a strict sublist of y. The final two axioms define the
length function, which returns the number of elements in a list.
The strict sublist relation ≺c is well-founded on the set of all lists. One can
prove that the number of sublists of a list is finite; and that its set of strict
sublists is a superset of the set of strict sublists of any of its sublists. Hence,
there cannot be an infinite sequence of lists descending according to ≺c .
The well-founded induction principle for a theory T with well-founded relation ≺ is the schema
(∀x. (∀x′. x′ ≺ x → F [x′]) → F [x]) → ∀x. F [x]
for Σ-formulae F [x] with only one free variable x. In other words, to prove the
T -validity of ∀x. F [x], it is sufficient to follow the well-founded induction
principle:
• Assume as the inductive hypothesis that for arbitrary element n and
for every element n′ such that n′ ≺ n, F [n′ ] is T -valid. Then prove that
F [n] is T -valid.
Complete induction in TPA of Section 4.2 is a specific instance of well-founded
induction that uses the well-founded relation <.
A theory of lists augmented with the first five axioms of Example 4.5 has
well-founded induction in which the well-founded relation is ≺c .
as desired. Exercise 4.2 asks the reader to prove formally that ∀u, v. v ≺c
cons(u, v).
This property is also easily proved using stepwise induction.
Well-founded induction extends to tuples. Given sets S1 , . . . , Sm , each with an associated well-founded relation ≺i , the lexicographic relation ≺ over tuples of the set
S = S1 × · · · × Sm
is defined by (s1 , . . . , sm ) ≺ (t1 , . . . , tm ) iff for some i, si ≺i ti and sj = tj for all j < i; it is also well-founded. The corresponding induction principle is the schema
(∀n. (∀n′. n′ ≺ n → F [n′]) → F [n]) → ∀x. F [x]
for Σ-formula F [x] with only free variables x = {x1 , . . . , xm }. Notice that the
form of this induction principle is the same as well-founded induction. The
only difference is that we are considering tuples n = (n1 , . . . , nm ) rather than
single elements n.
Example 4.7. Consider the following puzzle. You have a bag of red, yellow,
and blue chips. If only one chip remains in the bag, you take it out. Otherwise,
you remove two chips at random:
1. If one of the two removed chips is red, you do not put any chips in the
bag.
2. If both of the removed chips are yellow, you put one yellow chip and five
blue chips in the bag.
3. If one of the chips is blue and the other is not red, you put ten red chips
in the bag.
These cases cover all possibilities for the two chips. Does this process always
halt?
We prove the following property: for all bags of chips, you can execute
the choose-and-replace process only a finite number of times before the bag is
empty. Let the triple
(y, b, r)
represent the current state of the bag: y yellow, b blue, and r red chips. Such a tuple is in the set of triples of natural numbers S : N³. Let <3 be the natural lexicographic extension of <
to such triples. For example,
(11, 13, 3) ≮3 (11, 9, 104)   but   (11, 9, 104) <3 (11, 13, 3) .
We prove that for arbitrary bag state (y, b, r) represented by the triple of
natural numbers y, b, and r, only a finite number of steps remain.
For the base cases, consider when the bag has no chips (state (0, 0, 0)) or
only one chip (one of states (1, 0, 0), (0, 1, 0), or (0, 0, 1)). In the first case, you
are done; in the second set of cases, only one step remains.
Assume for the inductive hypothesis that for any bag state (y ′ , b′ , r′ ) such
that
(y ′ , b′ , r′ ) <3 (y, b, r) ,
only a finite number of steps remain. Now remove two chips from the current
bag, represented by state (y, b, r). Consider the three possible cases:
1. If one of the two removed chips is red, you do not put any chips in the bag.
Then the new bag state is (y − 1, b, r − 1), (y, b − 1, r − 1), or (y, b, r − 2).
Each is less than (y, b, r) by <3 .
2. If both of the removed chips are yellow, you put one yellow chip and five
blue chips in the bag. Then the new bag state is (y − 1, b + 5, r), which is
less than (y, b, r) by <3 .
3. If one of the chips is blue and the other is not red, you put ten red chips in
the bag. Then the new bag state is (y − 1, b − 1, r + 10) or (y, b − 2, r + 10).
Each is less than (y, b, r) by <3 .
In all cases, we can apply the inductive hypothesis to deduce that only a finite
number of steps remain from the next state. Since only one step of the process
is required to get to the next state, there are only a finite number of steps
remaining from the current state (y, b, r). Hence, the process always halts.
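The lexicographic argument guarantees termination no matter how the random draws fall. The following Python simulation (our illustration; the initial bag (5, 5, 5) is arbitrary) watches the process halt:

import random

def step(y, b, r):
    if y + b + r <= 1:
        return (0, 0, 0)                     # take out the last chip
    chips = random.sample(['y'] * y + ['b'] * b + ['r'] * r, 2)
    y -= chips.count('y'); b -= chips.count('b'); r -= chips.count('r')
    if 'r' in chips:
        return (y, b, r)                     # case 1: nothing returned
    if chips == ['y', 'y']:
        return (y + 1, b + 5, r)             # case 2: one yellow, five blue
    return (y, b, r + 10)                    # case 3: ten red chips

state, steps = (5, 5, 5), 0
while state != (0, 0, 0):
    state, steps = step(*state), steps + 1
print("halted after", steps, "steps")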
Now consider arbitrary lists x and y. Consider two cases: either atom(x)
or ¬atom(x).
If atom(x), then
x ⪯c y ⇔ ⊥
The disjunction suggests two possibilities. Consider the first disjunct. Because
v1 ≺c cons(u1 , v1 ) = x, we have that
The Ackermann function ack is defined by the axioms
• ∀y. ack (0, y) = y + 1   (ack left zero)
• ∀x. ack (x + 1, 0) = ack (x, 1)   (ack right zero)
• ∀x, y. ack (x + 1, y + 1) = ack (x, ack (x + 1, y))   (ack successor)
It grows extremely quickly:
• ack (2, 2) = 7
• ack (3, 3) = 61
• ack (4, 4) = 2^(2^(2^(2^16))) − 3
One might expect that proving properties about the Ackermann function
would be difficult.
However, lexicographic well-founded induction allows us to reason about
certain properties of the function. Define <2 as the natural lexicographic ex-
tension of < to pairs of natural numbers. Now consider input arguments to
ack and the resulting arguments in recursive calls:
• (ack left zero) does not involve a recursive call.
• In (ack right zero), (x + 1, 0) >2 (x, 1).
• In (ack successor),
– (x + 1, y + 1) >2 (x + 1, y), and
– (x + 1, y + 1) >2 (x, ack (x + 1, y)).
As the arguments decrease according to <2 with each level of recursion, we
conclude that the computation of ack (x, y) halts for every x and y. In Chap-
ter 5, we show that finding well-founded relations is a general technique for
showing that functions always halt.
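A memoized Python transliteration (our illustration) makes the definition concrete; the comments record why each recursive call's arguments are smaller under <2:

from functools import lru_cache

@lru_cache(maxsize=None)
def ack(x, y):
    if x == 0:
        return y + 1                    # (ack left zero): no recursion
    if y == 0:
        return ack(x - 1, 1)            # (x - 1, 1) <2 (x, 0)
    return ack(x - 1, ack(x, y - 1))    # both (x, y - 1) and
                                        # (x - 1, ...) are <2 (x, y)

assert ack(2, 2) == 7 and ack(3, 3) == 61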
Additionally, we can induct over the execution of ack to prove properties
of the ack function itself. Let us prove that
ack (x, y) > y
for arbitrary natural numbers x and y. For the inductive hypothesis, assume that
ack (x′, y′) > y′   for every pair (x′, y′) such that (x′, y′) <2 (x, y) .
If x = 0, then ack (0, y) = y + 1 > y by (ack left zero). If x > 0 and y = 0, then the inductive hypothesis implies that
ack (x − 1, 1) > 1 .
Therefore, we have
ack (x, 0) = ack (x − 1, 1) > 1 > 0 .
Furthermore, when x > 0 and y > 0, the inductive hypothesis gives ack (x, y − 1) > y − 1, so that
ack (x, y) = ack (x − 1, ack (x, y − 1)) > ack (x, y − 1) > y − 1 ;
4.4 Structural Induction
Example 4.10. Exercise 1.3 asks the reader to prove that certain logical
connectives are redundant in the presence of others. Formally, the exercise
is asking the reader to prove the following claim: Every propositional formula
F is equivalent to a propositional formula F ′ constructed with only the logical
connectives ⊤, ∧, and ¬.
There are three base cases to consider:
• The formula ⊤ can be represented directly as ⊤.
• The formula ⊥ is equivalent to ¬⊤.
• Any propositional variable P can be represented directly as P .
For the inductive step, consider formulae G, G1 , and G2 , and assume as the inductive hypothesis that each is equivalent to formulae G′, G′1 , and G′2 , respectively, which are constructed only from the connectives ⊤, ∧, and ¬ (and propositional variables, of course). We show that each possible formula that can be constructed from G, G1 , and G2 with only one logical connective is equivalent to another constructed with only ⊤, ∧, and ¬:
• ¬G is equivalent to ¬G′ from the inductive hypothesis.
• By considering the truth table in which the four possible valuations of
G1 and G2 are considered, one can establish that G1 ∨ G2 is equivalent
to ¬(¬G′1 ∧ ¬G′2 ). By the inductive hypothesis, the latter formula is con-
structed only from propositional variables, ⊤, ∧, and ¬.
• By similar reasoning, G1 → G2 is equivalent to ¬(G′1 ∧¬G′2 ), which satisfies
the claim.
• Similar reasoning handles G1 ↔ G2 as well.
Hence, the claim is proved.
Note that the main argument is essentially similar to the answer that the
reader might have provided in answering Exercise 1.3. Structural induction
merely provides the basis for lifting the truth-table argument to a general
statement about propositional formulae.
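The induction translates directly into a recursive program over formula syntax. In this Python sketch (our illustration; formulae as nested tuples, variables as strings, with 'T' and 'F' standing for ⊤ and ⊥), each case of the proof becomes a case of the function:

def reduce_conn(f):
    if f == 'T':
        return 'T'                              # base case ⊤
    if f == 'F':
        return ('not', 'T')                     # base case ⊥ ≡ ¬⊤
    if isinstance(f, str):
        return f                                # base case: a variable
    op = f[0]
    if op == 'not':
        return ('not', reduce_conn(f[1]))
    g1, g2 = reduce_conn(f[1]), reduce_conn(f[2])
    if op == 'and':
        return ('and', g1, g2)
    if op == 'or':                  # G1 ∨ G2 ≡ ¬(¬G1 ∧ ¬G2)
        return ('not', ('and', ('not', g1), ('not', g2)))
    if op == 'implies':             # G1 → G2 ≡ ¬(G1 ∧ ¬G2)
        return ('not', ('and', g1, ('not', g2)))
    if op == 'iff':                 # G1 ↔ G2 ≡ (G1 → G2) ∧ (G2 → G1)
        return ('and', reduce_conn(('implies', f[1], f[2])),
                       reduce_conn(('implies', f[2], f[1])))

print(reduce_conn(('or', 'P', ('implies', 'Q', 'F'))))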
Example 4.11. This example relies on several basic concepts of set theory;
however, even the reader unfamiliar with set theory can understand the ap-
plication of structural induction without understanding the actual claim.
Consider ΣQ -formulae F [x1 , . . . , xn ] in which the only predicate is ≤,
the only logical connectives are ∨ and ∧, and the only quantifier is ∀. We
G ab : G ∧ bx ≤ a ∧ a ≤ bx .
4.5 Summary
Bibliographic Remarks
The induction proofs in Examples 4.1, 4.3, and 4.9 are taken from the text of
Manna and Waldinger [55].
Blaise Pascal (1623–1662) and Jacob Bernoulli (1654–1705) are recognized
as having formalized stepwise and complete induction, respectively. Less for-
mal versions of induction appear in texts by Francesco Maurolico (1494–1575);
Rabbi Levi Ben Gershon (1288–1344), who recognized induction as a distinct
form of mathematical proof; Abu Bekr ibn Muhammad ibn al-Husayn Al-
Karaji (953–1029); and Abu Kamil Shuja Ibn Aslam Ibn Mohammad Ibn
Shaji (850–930) [97]. Some historians claim that Euclid may have applied
induction informally.
Exercises
4.1 (T⁺cons ). Prove the following in T⁺cons :
(a) ∀u, v. flat(u) ∧ flat (v) → flat (concat (u, v))
(b) ∀u. flat(u) → flat (rvs(u))
4.2 (Tᴾᴬcons ). Prove or disprove the following in Tᴾᴬcons :
(a) ∀u. u ⪯c u
(b) ∀u, v, w. cons(u, v) ⪯c w → v ⪯c w
(c) ∀u, v. v ≺c cons(u, v)
4.3 (T⁺cons ∪ Tᴾᴬcons ). Prove the following in T⁺cons ∪ Tᴾᴬcons :
(a) ∀u, v. |concat (u, v)| = |u| + |v|
(b) ∀u. flat(u) → |rvs(u)| = |u|
When examining the detail of the algorithm, it seems probable that the
proof will be helpful in explaining not only what is happening but why.
— Tony Hoare
An Axiomatic Basis for Computer Programming, 1969
We are finally ready to apply FOL and induction to a real problem: spec-
ifying and proving properties of programs. In this chapter, we develop the
three foundational methods that underlie all verification and program analy-
sis techniques. In the next chapter, we discuss strategies for applying them.
First, specification is the precise statement of properties that a program
should exhibit. The language of FOL offers precision. The remaining task
is to develop a scheme for embedding FOL statements into program text
as program annotations. We focus on two forms of properties. Partial
correctness properties, or safety properties, assert that certain states —
typically, error states — cannot ever occur during the execution of a program.
An important subset of this form of property is the partial correctness of
programs: if a program halts, then its output satisfies some relation with
its input. Total correctness properties, or progress properties, assert that
certain states are eventually reached during program execution. Section 5.1
presents specification in the context of a simple programming language, pi.
The next foundational method is the inductive assertion method for
proving partial correctness properties. The inductive assertion method is
based on the mathematical induction of Chapter 4. To prove that every state
during the execution of a program satisfies FOL formula F , prove as the base
case that F holds at the beginning of execution; assume as the inductive hy-
pothesis that F currently holds (at some point during the execution); and
prove as the inductive step that F holds after one more step of the program.
Section 5.2 discusses the mechanics for reducing a program with a partial cor-
rectness specification to this inductive argument. The challenge in applying
this method is to discover additional annotations to make the induction go
through. Chapter 6 discusses strategies for finding the extra information.
@pre ⊤
@post ⊤
bool LinearSearch(int[] a, int ℓ, int u, int e) {
for @ ⊤
(int i := ℓ; i ≤ u; i := i + 1) {
if (a[i] = e) return true;
}
return false;
}
pi differs in two main ways from standard imperative languages: its data types do not include pointer or reference types; and it does
not allow global variables, although it does have global constants (see Exercise
6.5). After reading this chapter and Chapter 12, the interested reader should
consult the wide literature on program analysis to learn how the techniques of
these chapters extend to reasoning about standard programming languages.
Example 5.1. Figure 5.1 lists the function LinearSearch, which searches the
range [ℓ, u] of an array a of integers for a value e. It returns true iff the given
array contains the value between the lower bound ℓ and upper bound u. It
behaves correctly only if 0 ≤ ℓ and u < |a|; otherwise, the array a is accessed
outside of its domain [0, |a| − 1]. |a| denotes the length of array a.
Observe that most of the syntax is similar to C. For example, the for loop
sets i to be ℓ initially and then executes the body of the loop and increments i
by 1 as long as i ≤ u. Also, an integer array has type int[], which is constructed
from base type int. One syntactic difference occurs in assignment, which is
written := to distinguish it from the equality predicate =. We use = as the
equality predicate, rather than ==, to correspond to the standard equality
predicate of FOL. Finally, unlike C, pi has type bool and constants true and
false.
Notice the lines beginning with @. They are program annotations, which
we discuss in detail in the next section.
In LinearSearch, a, ℓ, u, and e are the formal parameters (also, param-
eters) of the function. If LinearSearch is called as LinearSearch(b, 0, |b| − 1, v),
then b, 0, |b| − 1, and v are the arguments.
Example 5.2. Figure 5.2 lists the recursive function BinarySearch, which
searches a range [ℓ, u] of a sorted (weakly increasing: a[i] ≤ a[j] if i ≤ j)
array a of integers for a value e. Like LinearSearch, it returns true iff the
@pre ⊤
@post ⊤
bool BinarySearch(int[] a, int ℓ, int u, int e) {
if (ℓ > u) return false;
else {
int m := (ℓ + u) div 2;
if (a[m] = e) return true;
else if (a[m] < e) return BinarySearch(a, m + 1, u, e);
else return BinarySearch(a, ℓ, m − 1, e);
}
}
given array contains the value in the range [ℓ, u]. It behaves correctly only if
0 ≤ ℓ and u < |a|.
One level of recursion operates as follows. If the lower bound ℓ of the range
is greater than the upper bound u, then the (empty) subarray cannot contain
e, so it returns false. Otherwise, it examines the middle element a[m] of the
subarray: if it is e, then the subarray clearly contains e; otherwise, it recurses on the upper half [m + 1, u] if a[m] < e and on the lower half [ℓ, m − 1] if a[m] > e.
pi syntactically distinguishes between integer division and real division: for int variables a and b, write a div b instead of a/b. Integer division is defined as follows:
a div b = ⌊a/b⌋ .
That is, a div b is equal to the greatest integer less than or equal to a/b (the floor of a/b).
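As a caution, many languages truncate integer division toward zero instead of flooring. Python's // operator happens to match the floor definition above, which this small sketch (our illustration) confirms:

import math

for a in range(-10, 11):
    for b in (1, 2, 3, -1, -2):
        assert a // b == math.floor(a / b)   # pi's div is the floor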
Example 5.3. Figure 5.3 lists the function BubbleSort, which sorts an integer
array. It works by “bubbling” the largest element of the left unsorted region of
the array toward the sorted region on the right; this element then becomes the
left element of the sorted region, enlarging the region by one cell. In Figure
5.4, for example, the first line shows an array in which the rightmost boxed
cells comprise the sorted region and the other cells comprise the unsorted
region. In the final line, the sorted region has been expanded by one cell.
Figure 5.4 lists a portion of a sample execution trace. The right two cells
(5, 6) of the array have already been sorted. In the trace, the inner loop
moves the largest element 4 of the unsorted region to the right to join the
sorted region, which is indicated by the dotted rectangle. In the first two
steps, a[j] ≤ a[j + 1] (2 ≤ 3 and 3 ≤ 4), so the values of cell j and j + 1
are not swapped in either case. In the subsequent two steps, a[j] > a[j + 1]
(4 > 1 and 4 > 2), causing a swap at each step. In the fifth step, the inner
loop’s guard i < j no longer holds, so the inner loop exits and the outer
@pre ⊤
@post ⊤
int[] BubbleSort(int[] a0 ) {
int[] a := a0 ;
for @ ⊤
(int i := |a| − 1; i > 0; i := i − 1) {
for @ ⊤
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
return a;
}
[Fig. 5.4 shows the trace as successive rows of the array 2 3 4 1 2 5 6 with markers under the cells indexed by j and i: j advances from 0 toward i, the values 4 and 1 and then 4 and 2 are swapped, and the trace ends as 2 3 1 2 4 5 6 with the sorted region grown by one cell.]
loop decrements i by 1. The sorted region has been expanded by one cell, as
indicated by the final dotted rectangle. The last step shows the beginning of
the next round of the inner loop.
Because pi does not have pointer or reference types, all data are passed by
value, including arrays and structures. If BubbleSort were missing the return
typedef struct qs {
int pivot;
int[] array;
} qs;
statement, then calling it would not have any discernible effect on the calling
context. Additionally, pi does not allow updates to parameters, so BubbleSort
assigns a0 to a fresh variable a in the first line. This artificial requirement
makes reasoning about functions easier: in annotations (see Section 5.1.2)
throughout the function, one can always reference the input.
In this book, our example programs manipulate arrays rather than re-
cursive data structures. The reason is that we can express more interesting
properties about arrays in the fragment of the theory of arrays studied in
Chapter 11 than we can about lists in the fragment of the theory of recursive
data structures studied in Chapter 9. This bias is a reflection of the structure
and content of this book, not of what is theoretically possible.
However, we sometimes use records, a basic recursive data type, to allow
a function to return multiple values. The following example illustrates such a
record type, which is used in the program QuickSort (see Section 6.2).
Example 5.4. The structure qs of Figure 5.5 is a record with two fields: the
pivot field of type int and the array field of type array. If x is a variable of
type qs, then x.pivot returns the value in its pivot field; also, x.array[i] := v
assigns v to position i of x’s array field.
Function Specifications
The function specification of a function is a pair of annotations. The func-
tion precondition is a formula F whose free variables include only the for-
mal parameters. It specifies what should be true upon entering the function
— or, in other words, under what inputs the function is expected to work. The
function postcondition is a formula G whose free variables include only the
formal parameters and the special variable rv representing the return value
of the function. The postcondition relates the function’s output (the return
value rv ) to its input (the parameters).
@pre ⊤
@post rv ↔ ∃i. 0 ≤ ℓ ≤ i ≤ u < |a| ∧ a[i] = e
bool LinearSearch(int[] a, int ℓ, int u, int e) {
if (ℓ < 0 ∨ u ≥ |a|) return false;
for @ ⊤
(int i := ℓ; i ≤ u; i := i + 1) {
if (a[i] = e) return true;
}
return false;
}
0 ≤ ℓ ≤ i ≤ u < |a|
abbreviates
0 ≤ ℓ ∧ ℓ ≤ i ∧ i ≤ u ∧ u < |a| .
@pre ⊤
@post sorted(rv , 0, |rv | − 1)
int[] BubbleSort(int[] a0 ) {
int[] a := a0 ;
for @ ⊤
(int i := |a| − 1; i > 0; i := i − 1) {
for @ ⊤
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
return a;
}
Example 5.7. Figure 5.8 lists BinarySearch with its specification. As ex-
pected, its postcondition is identical to the postcondition of LinearSearch.
However, its precondition also states that the array a is sorted.
The sorted predicate is defined in the combined theory of integers and arrays, TZ ∪ TA :
sorted(a, ℓ, u) ⇔ ∀i, j. ℓ ≤ i ≤ j ≤ u → a[i] ≤ a[j] .
Example 5.8. Figure 5.9 lists BubbleSort with its specification. Given any
array, the returned array is sorted. Of course, other properties are desirable
and could be specified as well. For example, the returned array rv should be
a permutation of the original array a0 (see Exercise 6.5).
Section 5.2 presents a method for proving that a function satisfies its
partial correctness specification: if the function precondition is satisfied and
the function halts, then the function postcondition holds upon return. Section
5.3 discusses a method for proving that, additionally, the function always halts.
Loop Invariants
Each for loop and while loop has an attendant annotation called the loop
invariant. A while loop
while
@F
(hcondition i) {
hbody i
}
says to apply the hbodyi as long as hcondition i holds. The assertion F must
hold at the beginning of every iteration. It is evaluated before the hconditioni
is evaluated, so it must hold even on the final iteration when hcondition i is
false. Therefore, on entering the hbody i of the loop,
F ∧ hcondition i
must hold, and on exiting the loop,
F ∧ ¬hcondition i
must hold.
To consider a for loop, translate the loop
for
@F
(hinitialize i; hconditioni; hincrement i) {
hbody i
}
into the equivalent loop
hinitializei;
while
@F
(hcondition i) {
hbody i
hincrement i
}
F must hold after the hinitializei statement has been evaluated and, on each
iteration, before the hcondition i is evaluated.
Example 5.9. Figure 5.10 lists LinearSearch with a nontrivial loop invariant
at L. It asserts that whenever control reaches L, the loop index is at least ℓ
and that a[j] ≠ e for previously examined indices j.
Section 5.2 shows that loop invariants are crucial for constructing an in-
ductive argument that a function obeys its specification.
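One way to internalize what the invariant promises is to check it dynamically. In this Python rendering of LinearSearch (our illustration; the for loop is unrolled into a while so that the assert runs even on the final test, when the guard fails):

def linear_search(a, lo, hi, e):
    i = lo
    while True:
        # invariant @L: ℓ ≤ i ∧ ∀j. ℓ ≤ j < i → a[j] ≠ e
        assert lo <= i and all(a[j] != e for j in range(lo, i))
        if not i <= hi:          # guard fails: invariant still held
            break
        if a[i] == e:
            return True
        i += 1
    return False

assert linear_search([2, 3, 5, 7], 0, 3, 5) is True
assert linear_search([2, 3, 5, 7], 0, 3, 4) is False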
Assertions
For example, if before the statement
i := i + k;
the programmer thinks that k is positive, then the programmer can add an
assertion stating that supposition:
@ k > 0;
i := i + k;
@pre ⊤
@post ⊤
bool LinearSearch(int[] a, int ℓ, int u, int e) {
for @ ⊤
(int i := ℓ; i ≤ u; i := i + 1) {
@ 0 ≤ i < |a|;
if (a[i] = e) return true;
}
return false;
}
@pre ⊤
@post ⊤
bool BinarySearch(int[] a, int ℓ, int u, int e) {
if (ℓ > u) return false;
else {
@ 2 ≠ 0;
int m := (ℓ + u) div 2;
@ 0 ≤ m < |a|;
if (a[m] = e) return true;
else {
@ 0 ≤ m < |a|;
if (a[m] < e) return BinarySearch(a, m + 1, u, e);
else return BinarySearch(a, ℓ, m − 1, e);
}
}
}
Example 5.11. Figure 5.12 lists BinarySearch with runtime assertions. The first assertion protects the division: it asserts that 2 ≠ 0, which clearly holds.
The next two assertions protect the array reads.
@pre ⊤
@post ⊤
int[] BubbleSort(int[] a0 ) {
int[] a := a0 ;
for @ ⊤
(int i := |a| − 1; i > 0; i := i − 1) {
for @ ⊤
(int j := 0; j < i; j := j + 1) {
@ 0 ≤ j < |a|;
@ 0 ≤ j + 1 < |a|;
if (a[j] > a[j + 1]) {
@ 0 ≤ j < |a|;
int t := a[j];
@ 0 ≤ j < |a|;
@ 0 ≤ j + 1 < |a|;
a[j] := a[j + 1];
@ 0 ≤ j + 1 < |a|;
a[j + 1] := t;
}
}
}
return a;
}
@pre ⊤
@post ⊤
int[] BubbleSort(int[] a0 ) {
int[] a := a0 ;
for @ ⊤
(int i := |a| − 1; i > 0; i := i − 1) {
for @ ⊤
(int j := 0; j < i; j := j + 1) {
@ 0 ≤ j < |a| ∧ 0 ≤ j + 1 < |a|;
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
return a;
}
and loop invariants and concentrate on the task of generating the correspond-
ing verification conditions. In practice, this task is performed by a verifying
compiler. Chapter 6 discusses strategies for constructing specifications and
loop invariants.
The second basic path begins at the loop invariant at L, passes the loop
guard i ≤ u, passes the guard a[i] = e of the if statement, executes the
return (of true), and ends at the postcondition:
(2)
@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] ≠ e)
assume i ≤ u;
assume a[i] = e;
rv := true;
@post rv ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e
This path exhibits two new aspects of basic paths. First, return statements
become assignments to the special variable rv representing the return value.
Second, guards arising in program statements (in for loop guards, while
loop guards, or if statements) become assume statements in basic paths.
An assume statement assume c in a basic path means that the remainder of
the basic path is executed only if the condition c holds at assume c. Each
guard with condition c results in two assumptions: the guard holds (c) or
it does not hold (¬c). Therefore, each guard produces two paths with the
same prefix up to the guard. They diverge on the assumption: one basic path
has the statement assume c, and the other has the statement assume ¬c.
These assumptions and the control structure of the program determine the
construction of the remainder of the basic paths.
For example, the third path has the same prefix as (2) but makes the
opposite assumption at the if statement guard: it assumes a[i] ≠ e rather than a[i] = e. Therefore, this path loops back around to the loop invariant:
(3)
@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] ≠ e)
assume i ≤ u;
assume a[i] ≠ e;
i := i + 1;
@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] ≠ e)
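The verification condition induced by this path states that if the invariant and both assumptions hold, the invariant holds again after i := i + 1. As a stand-in for the decision procedures of Part II, the following Python sketch (our illustration; e fixed to the value 5 and arrays drawn from a tiny domain) brute-force checks that implication:

from itertools import product

def inv(a, lo, i):   # @L with e = 5
    return lo <= i and all(a[j] != 5 for j in range(lo, i))

for a in product([1, 5], repeat=3):
    for lo, i, u in product(range(3), repeat=3):
        if inv(a, lo, i) and i <= u and a[i] != 5:
            assert inv(a, lo, i + 1)     # invariant re-established
print("path (3) preserves the invariant on all samples")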
The final basic path has the same prefix as (2) and (3) but makes the
opposite assumption at the for loop guard: it assumes i > u rather than
i ≤ u. Therefore, this path exits the loop and returns false:
[Figure: visualization of the basic paths of LinearSearch: path (1) leads from @pre to L; path (3) loops from L back to L; paths (2) and (4) lead from L to @post.]
(4)
@L : ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] ≠ e)
assume i > u;
rv := false;
@post rv ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e
Example 5.14. Figure 5.17 lists BubbleSort with loop invariants. The outer
loop invariant at L1 asserts that
• i is in the range [−1, |a| − 1] (if |a| = 0, then i is initially −1);
• a is sorted in the range [i, |a| − 1];
• and a is partitioned such that each element in the range [0, i] is at most
(less than or equal to) each element in the range [i + 1, |a| − 1].
Its inner loop invariant at L2 asserts that
• i is in the range [1, |a| − 1], and j is in the range [0, i];
• a is sorted in the range [i, |a| − 1] as in the outer loop;
• a is partitioned as in the outer loop;
• and a is also partitioned such that each element in the range [0, j − 1] is
at most a[j].
The partitioned predicate is defined in the theory TZ ∪ TA :
partitioned(a, ℓ1 , u1 , ℓ2 , u2 )
⇔ ∀i, j. ℓ1 ≤ i ≤ u1 < ℓ2 ≤ j ≤ u2 → a[i] ≤ a[j] .
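Both predicates are directly executable, which is useful for testing candidate annotations on concrete traces before attempting proofs. A Python sketch (our illustration; sorted_rng is named to avoid shadowing Python's built-in sorted):

def sorted_rng(a, lo, hi):
    return all(a[i] <= a[j] for i in range(lo, hi + 1)
                            for j in range(i, hi + 1))

def partitioned(a, l1, u1, l2, u2):
    if not u1 < l2:
        return True    # antecedent ℓ1 ≤ i ≤ u1 < ℓ2 ≤ j ≤ u2 is vacuous
    return all(a[i] <= a[j] for i in range(l1, u1 + 1)
                            for j in range(l2, u2 + 1))

assert sorted_rng([1, 2, 2, 5], 0, 3)
assert partitioned([0, 1, 9, 9], 0, 1, 2, 3)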
@pre ⊤
@post sorted(rv , 0, |rv | − 1)
int[] BubbleSort(int[] a0 ) {
int[] a := a0 ;
for
@L1 : −1 ≤ i < |a|
      ∧ partitioned(a, 0, i, i + 1, |a| − 1)
      ∧ sorted(a, i, |a| − 1)
(int i := |a| − 1; i > 0; i := i − 1) {
for
@L2 : 1 ≤ i < |a| ∧ 0 ≤ j ≤ i
      ∧ partitioned(a, 0, i, i + 1, |a| − 1)
      ∧ partitioned(a, 0, j − 1, j, j)
      ∧ sorted(a, i, |a| − 1)
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
return a;
}
Performing a depth-first exploration, the first basic path starts at the pre-
condition and ends at the outer loop invariant at L1 :
(1)
@pre ⊤;
a := a0 ;
i := |a| − 1;
@L1 : −1 ≤ i < |a| ∧ partitioned(a, 0, i, i + 1, |a| − 1) ∧ sorted(a, i, |a| − 1)
The second basic path starts at L1 and ends at the inner loop invariant at L2
(recall that the annotation is checked after the loop initialization j := 0):
(2)
@L1 : −1 ≤ i < |a| ∧ partitioned(a, 0, i, i + 1, |a| − 1) ∧ sorted(a, i, |a| − 1)
assume i > 0;
j := 0;
@L2 : 1 ≤ i < |a| ∧ 0 ≤ j ≤ i ∧ partitioned(a, 0, i, i + 1, |a| − 1)
      ∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)
The third and fourth basic paths follow the inner loop, each handling one
assumption on the guard a[j] > a[j + 1] of the if statement:
(3)
@L2 : 1 ≤ i < |a| ∧ 0 ≤ j ≤ i ∧ partitioned(a, 0, i, i + 1, |a| − 1)
      ∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)
assume j < i;
assume a[j] > a[j + 1];
t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
j := j + 1;
@L2 : 1 ≤ i < |a| ∧ 0 ≤ j ≤ i ∧ partitioned(a, 0, i, i + 1, |a| − 1)
      ∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)
(4)
@L2 : 1 ≤ i < |a| ∧ 0 ≤ j ≤ i ∧ partitioned(a, 0, i, i + 1, |a| − 1)
      ∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)
assume j < i;
assume a[j] ≤ a[j + 1];
j := j + 1;
@L2 : 1 ≤ i < |a| ∧ 0 ≤ j ≤ i ∧ partitioned(a, 0, i, i + 1, |a| − 1)
      ∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)
The fifth basic path starts at L2 , exits the inner loop, and decrements i on its
way to L1 :
(5)
@L2 : 1 ≤ i < |a| ∧ 0 ≤ j ≤ i ∧ partitioned(a, 0, i, i + 1, |a| − 1)
      ∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)
assume j ≥ i;
i := i − 1;
@L1 : −1 ≤ i < |a| ∧ partitioned(a, 0, i, i + 1, |a| − 1) ∧ sorted(a, i, |a| − 1)
The final basic path starts at L1 , exits the outer loop, and then exits the
function, returning the (presumably sorted) array a:
(6)
@L1 : −1 ≤ i < |a| ∧ partitioned(a, 0, i, i + 1, |a| − 1) ∧ sorted(a, i, |a| − 1)
assume i ≤ 0;
rv := a;
@post sorted(rv , 0, |rv| − 1)
[Figure: the basic paths of BubbleSort visualized as a graph over @pre, L1 , L2 , and @post.]
@pre ⊤
@post ⊤
int[] BubbleSort(int[] a0 ) {
int[] a := a0 ;
for
@L1 : −1 ≤ i < |a|
(int i := |a| − 1; i > 0; i := i − 1) {
for
@L2 : 0 < i < |a| ∧ 0 ≤ j ≤ i
(int j := 0; j < i; j := j + 1) {
@L3 : 0 ≤ j < |a| ∧ 0 ≤ j + 1 < |a|;
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
return a;
}
Basic paths are constructed as before from the precondition, postcondition, and loop invariants. These basic paths ignore the runtime assertion at
L3 . Then one additional basic path ends at the runtime assertion:
(7)
@L2 : 0 < i < |a| ∧ 0 ≤ j ≤ i
assume j < i;
@L3 : 0 ≤ j < |a| ∧ 0 ≤ j + 1 < |a|
Example 5.17. Figure 5.8 lists BinarySearch with its function specification.
BinarySearch contains two (recursive) function calls. In Figure 5.20, each func-
tion call is protected by a function call assertion at R1 and R2 . Each asser-
tion is constructed by applying a substitution to BinarySearch's precondition
F : 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u) .
The first function call is BinarySearch(a, m + 1, u, e), so the function call as-
sertion at R1 is F σ1 , where
σ1 : {a ↦ a, ℓ ↦ m + 1, u ↦ u, e ↦ e} .
(2)
@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)
assume ℓ ≤ u;
m := (ℓ + u) div 2;
assume a[m] = e;
rv := true;
@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e
(3)
@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)
assume ℓ ≤ u;
m := (ℓ + u) div 2;
assume a[m] ≠ e;
assume a[m] < e;
@R1 : 0 ≤ m + 1 ∧ u < |a| ∧ sorted(a, m + 1, u)
(5)
@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)
assume ℓ ≤ u;
m := (ℓ + u) div 2;
assume a[m] ≠ e;
assume a[m] ≥ e;
@R2 : 0 ≤ ℓ ∧ m − 1 < |a| ∧ sorted(a, ℓ, m − 1)
Because BinarySearch lacks loops, each basic path starts at the function pre-
condition.
It remains to consider paths (4) and (6), which pass through the recursive
function calls and end at the postcondition. Paths (3) and (5) end in the
function call assertions at R1 and R2 protecting these function calls. Since
they assert that the called BinarySearch’s precondition holds, we can assume
that the returned values obey the postcondition of BinarySearch in each of
the calling contexts. Therefore, we can use the function postcondition as a
summary of the function call:
(4)
@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)
assume ℓ ≤ u;
m := (ℓ + u) div 2;
assume a[m] ≠ e;
assume a[m] < e;
assume v1 ↔ ∃i. m + 1 ≤ i ≤ u ∧ a[i] = e;
rv := v1 ;
@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e
The line
rv := BinarySearch(a, m + 1, u, e);
captures the recursive call in this basic path.
Next, given that the precondition holds (from path (3)), assume that the
postcondition holds. Therefore, summarize the function call with a relation
based on BinarySearch’s postcondition,
assume G[a, m + 1, u, e, v1 ];
rv := v1 ;
These are the penultimate lines of (4). Hence, (4) replaces the function call
BinarySearch(a, m + 1, u, e) with a summary based on the function postcon-
dition. Now reasoning about the basic path does not require reasoning about
all of BinarySearch at once.
[Figure: the basic paths of BinarySearch visualized as a graph over @pre, R1 , R2 , and @post.]
Construct the final basic path for the function call BinarySearch(a, ℓ, m −
1, e) similarly:
(6)
@pre 0 ≤ ℓ ∧ u < |a| ∧ sorted(a, ℓ, u)
assume ℓ ≤ u;
m := (ℓ + u) div 2;
assume a[m] ≠ e;
assume a[m] ≥ e;
assume v2 ↔ ∃i. ℓ ≤ i ≤ m − 1 ∧ a[i] = e;
rv := v2 ;
@post rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e
Finally, in basic paths that pass through the function call, replace the function
call by an assumption and assignment constructed from the postcondition,
where v is a fresh variable:
...
assume G[e1 , . . . , en , v];
w := v;
...
Note that rv need not have type bool as in BinarySearch. For example, for
a function with prototype
@pre ⊤
@post rv ≥ x
int g(int x)
the statement
w := g(n + 1);
is summarized in basic paths as follows:
assume v ≥ n + 1;
w := v;
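This summarization is purely syntactic, so it is easy to mechanize. The following is a small sketch in Python (using sympy; the representation of statements and the helper names are ours, not πVC's) that applies the rule to the call w := g(n + 1) above:

import sympy as sp

n, w = sp.symbols('n w')

def g_post(x, rv):          # @post rv >= x for: int g(int x)
    return rv >= x

def summarize_call(post, args, target):
    v = sp.Symbol('v')      # fresh variable standing for the returned value
    return [('assume', post(*args, v)), ('assign', target, v)]

print(summarize_call(g_post, [n + 1], w))
# [('assume', v >= n + 1), ('assign', w, v)]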
[Figure: a state s in the wp(F, S) region whose S-successor s′ lies in the F region.]
The weakest precondition wp(F, S) has the defining characteristic that if
s |= wp(F, S)
and executing statement S on s results in state s′ , then
s′ |= F .
This situation is visualized in Figure 12.1(a). The region labeled F is the set
of states that satisfy F ; similarly, the region labeled wp(F, S) is the set of
states that satisfy wp(F, S). Every state s on which executing statement S
leads to a state s′ in the F region must be in the wp(F, S) region.
Define the weakest precondition for the two statement types of basic paths
introduced in Section 5.2.1:
• Assumption: What must hold before statement assume c is executed to
ensure that F holds afterward? If c → F holds before, then satisfying c in
assume c guarantees that F holds afterward:
wp(F, assume c) ⇔ c → F
• Assignment : What must hold before statement v := e is executed to ensure
that F [v] holds afterward? If F [e] holds before, then assigning e to v with
v := e makes F [v] hold afterward:
wp(F [v], v := e) ⇔ F [e]
For a sequence of statements S1 ; . . . ; Sn , define
wp(F, S1 ; . . . ; Sn ) ⇔ wp(wp(F, Sn ), S1 ; . . . ; Sn−1 ) .
The weakest precondition moves a formula backward over a sequence of state-
ments: for F to hold after executing S1 ; . . . ; Sn , wp(F, S1 ; . . . ; Sn ) must hold
before executing the statements. Because basic paths have only assumption
and assignment statements, the definition of wp is complete.
Then the verification condition of basic path
@F
S1 ;
⋮
Sn ;
@G
is
F → wp(G, S1 ; . . . ; Sn ) .
Its validity implies that when F holds before the statements of the path are
executed, then G holds afterward. Traditionally, this verification condition is
denoted by the Hoare triple
{F }S1 ; . . . ; Sn {G} .
Example 5.20. Consider the basic path
(1)
@x≥0
x := x + 1;
@x≥1
The VC is
x ≥ 0 → wp(x ≥ 1, x := x + 1) ,
so compute
wp(x ≥ 1, x := x + 1)
⇔ (x ≥ 1){x ↦ x + 1}
⇔ x + 1 ≥ 1
⇔ x ≥ 0 .
The VC is thus
x ≥ 0 → x ≥ 0 ,
which is trivially TZ -valid.
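These mechanics are easy to automate. The following sketch (ours, not the book's πVC implementation; it assumes sympy and represents formulas as sympy expressions) computes wp over a basic path and reproduces this computation:

import sympy as sp

x = sp.Symbol('x')

def wp(F, stmt):
    kind = stmt[0]
    if kind == 'assume':             # wp(F, assume c) = c -> F
        return sp.Implies(stmt[1], F)
    if kind == 'assign':             # wp(F[v], v := e) = F[e]
        _, v, e = stmt
        return F.subs(v, e)
    raise ValueError("basic paths contain only assume and assignment")

def wp_seq(F, stmts):                # wp(F, S1;...;Sn), applied right to left
    for s in reversed(stmts):
        F = wp(F, s)
    return F

# Example 5.20: wp(x >= 1, x := x + 1) is x + 1 >= 1, i.e., x >= 0.
pre = wp_seq(x >= 1, [('assign', x, x + 1)])
print(pre, sp.simplify(pre))         # x + 1 >= 1, x >= 0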
The VC is
F → wp(G, S1 ; S2 ; S3 ) ,
so compute
wp(G, S1 ; S2 ; S3 )
⇔ wp(wp(rv ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e, rv := true), S1 ; S2 )
⇔ wp(true ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e, S1 ; S2 )
⇔ wp(∃j. ℓ ≤ j ≤ u ∧ a[j] = e, S1 ; S2 )
⇔ wp(wp(∃j. ℓ ≤ j ≤ u ∧ a[j] = e, assume a[i] = e), S1 )
⇔ wp(a[i] = e → ∃j. ℓ ≤ j ≤ u ∧ a[j] = e, S1 )
⇔ wp(a[i] = e → ∃j. ℓ ≤ j ≤ u ∧ a[j] = e, assume i ≤ u)
⇔ i ≤ u → (a[i] = e → ∃j. ℓ ≤ j ≤ u ∧ a[j] = e)
or, equivalently,
ℓ ≤ i ∧ (∀j. ℓ ≤ j < i → a[j] ≠ e) ∧ i ≤ u ∧ a[i] ≠ e
→ ℓ ≤ i + 1 ∧ (∀j. ℓ ≤ j < i + 1 → a[j] ≠ e) ,
which is (TZ ∪ TA )-valid.
The VC is
F → wp(G, S1 ; S2 ; S3 ; S4 ) ,
so compute
wp(G, S1 ; S2 ; S3 ; S4 )
⇔ wp(wp(G, rv := true), S1 ; S2 ; S3 )
⇔ wp(G{rv ↦ true}, S1 ; S2 ; S3 )
⇔ wp(wp(G{rv ↦ true}, assume a[m] = e), S1 ; S2 )
⇔ wp(a[m] = e → G{rv ↦ true}, S1 ; S2 )
⇔ wp(wp(a[m] = e → G{rv ↦ true}, m := (ℓ + u) div 2), S1 )
⇔ wp((a[m] = e → G{rv ↦ true}){m ↦ (ℓ + u) div 2}, S1 )
⇔ wp((a[m] = e → G{rv ↦ true}){m ↦ (ℓ + u) div 2}, assume ℓ ≤ u)
⇔ ℓ ≤ u → (a[m] = e → G{rv ↦ true}){m ↦ (ℓ + u) div 2}
The VC is
F → wp(G, S1 ; S2 ; S3 ; S4 ; S5 ; S6 )
so compute
wp(G, S1 ; S2 ; S3 ; S4 ; S5 ; S6 )
⇔ wp(wp(G, j := j + 1), S1 ; S2 ; S3 ; S4 ; S5 )
⇔ wp(G{j ↦ j + 1}, S1 ; S2 ; S3 ; S4 ; S5 )
⇔ wp(wp(G{j ↦ j + 1}, a[j + 1] := t), S1 ; S2 ; S3 ; S4 )
⇔ wp(G{j ↦ j + 1}{a ↦ a⟨j + 1 ⊳ t⟩}, S1 ; S2 ; S3 ; S4 )
Continuing backward through the remaining statements a[j] := a[j + 1] and t := a[j] yields the composed substitution σ : {j ↦ j + 1, a ↦ a⟨j ⊳ a[j + 1]⟩⟨j + 1 ⊳ a[j]⟩}; applying σ to G produces Gσ. Recall that |a⟨i ⊳ v⟩| = |a|:
in other words, modifying an array's elements does not change its size. Under
this axiom, Gσ is equivalent to
1 ≤ i < |a| ∧ 0 ≤ j + 1 ≤ i
∧ partitioned(a⟨j ⊳ a[j + 1]⟩⟨j + 1 ⊳ a[j]⟩, 0, i, i + 1, |a| − 1)
∧ partitioned(a⟨j ⊳ a[j + 1]⟩⟨j + 1 ⊳ a[j]⟩, 0, j, j + 1, j + 1)
∧ sorted(a⟨j ⊳ a[j + 1]⟩⟨j + 1 ⊳ a[j]⟩, i, |a| − 1)
Thus, the VC is
1 ≤ i < |a| ∧ 0 ≤ j ≤ i
∧ partitioned(a, 0, i, i + 1, |a| − 1)
∧ partitioned(a, 0, j − 1, j, j)
∧ sorted(a, i, |a| − 1)
∧ j < i ∧ a[j] > a[j + 1]
→ 1 ≤ i < |a| ∧ 0 ≤ j + 1 ≤ i
  ∧ partitioned(a⟨j ⊳ a[j + 1]⟩⟨j + 1 ⊳ a[j]⟩, 0, i, i + 1, |a| − 1)
  ∧ partitioned(a⟨j ⊳ a[j + 1]⟩⟨j + 1 ⊳ a[j]⟩, 0, j, j + 1, j + 1)
  ∧ sorted(a⟨j ⊳ a[j + 1]⟩⟨j + 1 ⊳ a[j]⟩, i, |a| − 1) .
A computation of an annotated program P is a sequence of states
s0 , s1 , s2 , . . .
such that s0 satisfies P 's precondition and each successive state follows from
its predecessor by executing P . If for every basic path
@Li : F
S1 ;
⋮
Sn ;
@Lj : G
of P the verification condition
{F }S1 ; . . . ; Sn {G}
is valid (in the appropriate theory), then the annotations are P -inductive;
in particular, they are P -invariant.
5.3 Total Correctness
Total correctness requires proving partial correctness and that the function
always halts on input satisfying its precondition. We focus now on the latter task.
Proving function termination is based on well-founded relations (see Chap-
ter 4). Choose a set S with a well-founded relation ≺. Then find a function δ
mapping program states to S such that δ decreases according to ≺ along every
basic path. Since ≺ is well-founded, no computation can run forever; otherwise,
its states would map to an infinite decreasing sequence in S. The function δ
is called a ranking function.
Example 5.26. Figure 5.23 lists BubbleSort with ranking annotations. It con-
tains one new type of annotation: ↓ (i + 1, i + 1) and ↓ (i + 1, i − j) assert that
the functions (i + 1, i + 1) and (i + 1, i − j), respectively, are ranking functions.
These functions map states of BubbleSort onto pairs of natural numbers S : N2
with well-founded relation <2 . Intuitively, we have captured two separate ar-
guments. The outer loop eventually finishes because i decreases to 0; hence
i + 1 decreases as well. Why do we use i + 1 rather than i? When |a| = 0, the
initial assignment to i is −1. While i + 1 is always nonnegative, i is not; and
recall that we want to map into the natural numbers. The inner loop halts
because j increases to i; hence i − j decreases to 0. Therefore, our intuition
tells us that i + 1 is important for the outer loop, and i − j is important for
the inner loop.
Placing these two functions i + 1 and i − j together as a pair (i + 1, i − j)
provides the annotation for the inner loop. We expect i + 1 to remain constant
while the inner loop executes and decreases i − j. For the annotation of
the outer loop, we note that i + 1 > i − j = i − 0 = i on entry to the inner
loop, so that (i + 1, i + 1) >2 (i + 1, i − j).
The loop annotations assert that the ranking functions (i + 1, i + 1) and
(i + 1, i − j) map program states to pairs of natural numbers. Hence, we need
to prove that the loop annotations are inductive using the inductive assertion method.
@pre ⊤
@post ⊤
int[] BubbleSort(int[] a0 ) {
int[] a := a0 ;
for
@L1 : i + 1 ≥ 0
↓ (i + 1, i + 1)
(int i := |a| − 1; i > 0; i := i − 1) {
for
@L2 : i + 1 ≥ 0 ∧ i − j ≥ 0
↓ (i + 1, i − j)
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
return a;
}
We leave this step to the reader. It only remains to prove that the
functions decrease along each basic path.
The relevant basic paths are the following:
(1)
@L1 : i + 1 ≥ 0
↓L1 : (i + 1, i + 1)
assume i > 0;
j := 0;
↓L2 : (i + 1, i − j)
(2)
@L2 : i + 1 ≥ 0 ∧ i − j ≥ 0
↓L2 : (i + 1, i − j)
assume j < i;
assume a[j] > a[j + 1];
t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
j := j + 1;
↓L2 : (i + 1, i − j)
(3)
@L2 : i + 1 ≥ 0 ∧ i − j ≥ 0
↓L2 : (i + 1, i − j)
assume j < i;
assume a[j] ≤ a[j + 1];
j := j + 1;
↓L2 : (i + 1, i − j)
(4)
@L2 : i + 1 ≥ 0 ∧ i − j ≥ 0
↓L2 : (i + 1, i − j)
assume j ≥ i;
i := i − 1;
↓L1 : (i + 1, i + 1)
The paths entering and exiting the outer loop at L1 are not relevant for the
termination argument. The entering path does not begin with a ranking func-
tion annotation, so there is nothing to prove. The exiting path leads to the
return statement.
For termination purposes, paths (2) and (3) can be treated the same:
@L2 : i + 1 ≥ 0 ∧ i − j ≥ 0
↓L2 : (i + 1, i − j)
assume j < i;
···
j := j + 1;
↓L2 : (i + 1, i − j)
The excluded statements do not impact the value of the ranking functions.
Verification Conditions
Each basic path relevant to termination has the form
@ F
↓ δ[x]
S1 ;
⋮
Sk ;
↓ κ[x]
Its verification condition states that if F holds at the start of the path, then
the value of κ after executing S1 ; . . . ; Sk is ≺-less than the value of δ at the
start of the path.
Example 5.27. Let us return to the proof of Example 5.26 that BubbleSort
halts. Path (1) induces the following verification condition:
i + 1 ≥ 0 ∧ i > 0 → (i + 1, i − 0) <2 (i + 1, i + 1) ,
which is valid. Paths (2) and (3) induce the verification condition:
i + 1 ≥ 0 ∧ i − j ≥ 0 ∧ j < i → (i + 1, i − (j + 1)) <2 (i + 1, i − j) ,
which is also valid. Path (4) induces the verification condition:
i + 1 ≥ 0 ∧ i − j ≥ 0 ∧ j ≥ i → ((i − 1) + 1, (i − 1) + 1) <2 (i + 1, i − j) ,
which is also valid. Hence, BubbleSort always halts. Combined with the proof
of the sortedness property, we can now say that BubbleSort is totally correct
with respect to its specification: it always halts and returns a sorted array.
Let us work through the construction of the final verification condition
for basic path (4). First, replace i and j with i0 and j0 , respectively, in the
function annotating L2 : (i0 + 1, i0 − j0 ). Then move the ranking relation
backward across the statements i := i − 1 and assume j ≥ i, yielding
j ≥ i → (i, i) <2 (i + 1, i − j) .
Conjoining the loop annotation at L2 as an antecedent produces the final VC
i + 1 ≥ 0 ∧ i − j ≥ 0 ∧ j ≥ i → (i, i) <2 (i + 1, i − j) .
In this proof, the loop annotations (other than the ranking functions) do
not have any bearing on the termination argument. Their purpose is only to
prove that the given functions map to the natural numbers.
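These termination VCs are simple arithmetic facts that an SMT solver can discharge. The following sketch (assuming the z3-solver Python package; the lexicographic encoding of <2 is ours, not part of the book's tooling) checks the VCs of Example 5.27:

from z3 import Ints, Implies, And, Or, prove

i, j = Ints('i j')

def lex_lt(a, b):
    # (a1, a2) <2 (b1, b2) in the lexicographic extension of <
    return Or(a[0] < b[0], And(a[0] == b[0], a[1] < b[1]))

# Path (1): enter the inner loop with j = 0.
prove(Implies(And(i + 1 >= 0, i > 0),
              lex_lt((i + 1, i - 0), (i + 1, i + 1))))

# Paths (2), (3): one inner-loop iteration.
prove(Implies(And(i + 1 >= 0, i - j >= 0, j < i),
              lex_lt((i + 1, i - (j + 1)), (i + 1, i - j))))

# Path (4): exit the inner loop and decrement i.
prove(Implies(And(i + 1 >= 0, i - j >= 0, j >= i),
              lex_lt(((i - 1) + 1, (i - 1) + 1), (i + 1, i - j))))

Each call prints "proved", confirming the hand arguments above.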
@pre u − ℓ + 1 ≥ 0
@post ⊤
↓ u−ℓ+1
bool BinarySearch(int[] a, int ℓ, int u, int e) {
if (ℓ > u) return false;
else {
int m := (ℓ + u) div 2;
if (a[m] = e) return true;
else if (a[m] < e) return BinarySearch(a, m + 1, u, e);
else return BinarySearch(a, ℓ, m − 1, e);
}
}
(1)
@pre u − ℓ + 1 ≥ 0
↓ u−ℓ+1
assume ℓ ≤ u;
m := (ℓ + u) div 2;
assume a[m] ≠ e;
assume a[m] < e;
↓ u − (m + 1) + 1
(2)
@pre u − ℓ + 1 ≥ 0
↓ u−ℓ+1
assume ℓ ≤ u;
m := (ℓ + u) div 2;
assume a[m] ≠ e;
assume a[m] ≥ e;
↓ (m − 1) − ℓ + 1
Two other basic paths exist from function entry to the first two return state-
ments; however, as the recursion ends at each, they are irrelevant to the ter-
mination argument.
The basic paths induce two verification conditions. Before examining them,
notice that the assume statements about a[m] are irrelevant to the termination
argument. Now, the first VC is
u − ℓ + 1 ≥ 0 ∧ ℓ ≤ u ∧ ···
→ u − (((ℓ + u) div 2) + 1) + 1 < u − ℓ + 1 ,
which is TZ -valid. The VC
u − ℓ + 1 ≥ 0 ∧ ℓ ≤ u ∧ ···
→ (((ℓ + u) div 2) − 1) − ℓ + 1 < u − ℓ + 1
for the second basic path is also TZ -valid, so BinarySearch halts on all input
in which ℓ is initially at most u + 1.
Section 6.2 provides an alternative to using the awkward ranking function
u − ℓ + 1. Additionally, the argument proves termination on all input.
5.4 Summary
This chapter introduces the specification and verification of sequential pro-
grams. It covers:
• The programming language pi.
Bibliographic Remarks
Formally proving program correctness has been a subject of active research for
five decades. McCarthy argues in [59, 58] for a “mathematical science of com-
putation”. Floyd [34] and Hoare [39] introduce the main concepts for proving
property invariance and termination. In particular, they develop Floyd-Hoare
logic. Manna describes a verification style similar to ours [52]. The weakest
precondition predicate transformer was first formalized by Dijkstra [28].
King describes in his thesis [50] the idea of a verifying compiler, which
generates and proves during compilation the verification conditions that arise
from program annotations. See [27] for a discussion of the Extended Static
Checker, a verifying compiler for Java.
@pre p(a0 )
@post sorted(rv , 0, |rv | − 1)
int[] InsertionSort(int[] a0 ) {
int[] a := a0 ;
for
@ r1 (a, a0 , i, j)
(int i := 1; i < |a|; i := i + 1) {
int t := a[i];
for
@ r2 (a, a0 , i, j)
(int j := i − 1; j ≥ 0; j := j − 1) {
if (a[j] ≤ t) break;
a[j + 1] := a[j];
}
a[j + 1] := t;
}
return a;
}
Exercises
5.1 (Basic paths). For each of the following functions, replace each @pre ⊤
with a fresh predicate p over the function parameters, each @post ⊤ with a
fresh predicate q over rv and the function parameters, and each @ ⊤ with a
fresh predicate r over the function variables. As an example, see Figure 5.25
for the replacements for part (a). Then list the basic paths.
(a) InsertionSort of Figure 6.8.
(b) merge of Figure 6.9.
(c) ms of Figure 6.9.
5.3 (Verification condition generation). Generate the VCs for the follow-
ing basic paths:
(1)
@ x > 0;
x := x − k;
assume k ≤ 1;
@ x ≥ 0;
(2)
@ ⊤;
assume k ≤ x;
x := x − k;
@ x ≥ 0;
(3)
@ ⊤;
x := x − k;
assume k ≤ x;
@ x ≥ 0;
(4)
@ k ≥ 0;
x := x − k;
assume k ≤ x;
@ x ≥ 0;
(5)
@ y ≥ 0;
x := x + 1;
assume x > 0;
y := y + x;
@ x + 2y ≥ 3;
requires it. Of course, simple assertions, such as that a program is free of run-
time errors, can be generated automatically.
Writing the loop invariants also requires human ingenuity. A certain level
of human intervention is acceptable: the programmer ought to know certain
facts about her/his code. Loop invariants often capture insights into how the
code works and what it accomplishes. Developing the implementation and an-
notations simultaneously results in more robust systems. Finally, annotations
formally document code, facilitating better development in team projects.
However, Section 5.2.5 points out a fundamental limitation of the inductive
assertion method of program verification: loop invariants must be inductive
for the corresponding verification conditions to be valid, not just invariant.
Consequently, the programmer can assert many facts that are indeed invariant;
yet if the annotations are not inductive, the facts cannot be proved.
Much research addresses automatic (inductive) invariant discovery. For ex-
ample, algorithms exist for discovering linear and polynomial relations among
integer and real variables. Such invariants can, for example, provide loop in-
dex bounds, prove the lack of division by 0, or prove that an index into an
array is within bounds. Other methods exist for discovering the “shape” of
memory in programming languages with pointers, allowing, for example, the
partially automated analysis of linked lists. One of the most important roles
of automatic invariant discovery is strengthening the programmer’s annota-
tions into inductive annotations. Chapter 12 introduces invariant generation
procedures. However, no set of algorithms will ever fully replace humans in
writing verified software.
In this section, we suggest structured techniques for developing inductive
annotations to prove partial correctness. We emphasize that the methods are
just heuristics: human ingenuity is still the most important ingredient in form-
ing proofs.
To begin a proof, include basic facts in loop invariants. Basic facts include loop
index ranges and other “obvious” facts. To be inductive, complex assertions
usually require these basic facts. We illustrate the development of basic facts
through several examples.
Example 6.1. Consider the loop of LinearSearch (see also Figure 5.1):
for
@L : ⊤
(int i := ℓ; i ≤ u; i := i + 1) {
if (a[i] = e) return true;
}
Based on the initialization of i, the loop guard, and that i is only modified by
being incremented in the loop update, we know that at L,
ℓ≤i≤u+1 .
Notice the upper bound. It is a common mistake to forget that on the final
iteration, the loop guard is not true. Our basic annotation of the loop is the
following:
for
@L : ℓ ≤ i ≤ u + 1
(int i := ℓ; i ≤ u; i := i + 1) {
if (a[i] = e) return true;
}
Example 6.2. Consider the loops of BubbleSort (see also Figure 5.3):
for
@L1 : ⊤
(int i := |a| − 1; i > 0; i := i − 1) {
for
@L2 : ⊤
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
At L1 , the range of i is
−1 ≤ i < |a| .
Why −1? If |a| = 0, then |a| − 1 = −1 so that i is initially −1. Keep in mind
that “corner cases” like this one are just as important as normal cases (and
perhaps even more important when considering correctness: corner cases are
often the source of bugs). In the inner loop, the range of i is more restricted:
0 < i < |a|. The inner loop index j additionally satisfies
0 ≤ j ≤ i .
The annotated loops are then the following:
for
@L1 : −1 ≤ i < |a|
(int i := |a| − 1; i > 0; i := i − 1) {
for
@L2 : 0 < i < |a| ∧ 0 ≤ j ≤ i
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
Note that the loops modify just the elements of a, not a itself. Therefore,
we could add the annotation
|a| = |a0 |
to both loop invariants, supporting the postcondition conjunct
|rv | = |a0 | .
For the property that we address (sorted(rv , 0, |rv | − 1)), this annotation is
not useful.
Basic facts provide a foundation for more interesting information. The pre-
condition method (also called the “backward substitution” or “backward
propagation” method) is a strategy for developing more interesting informa-
tion in a structured way. Again, we emphasize that the method is a heuristic,
not an algorithm: it provides some guidance for the human rather than re-
placing the human’s intuition and ingenuity.
The precondition method consists of the following steps:
1. Identify a fact F that is known at one location L in the function (@L : F )
but that is not supported by annotations earlier in the function.
2. Repeat:
a) Compute the weakest preconditions of F backward through the func-
tion, ending at loop invariants or at the beginning of the function.
b) At each new annotation location L′ , generalize the new facts to a new
formula F ′ (@L′ : F ′ ).
We illustrate the technique through examples.
Example 6.3. Consider the loop of LinearSearch (see also Figure 5.1), anno-
tated with basic facts:
for
@L : ℓ ≤ i ≤ u + 1
(int i := ℓ; i ≤ u; i := i + 1) {
if (a[i] = e) return true;
}
return false;
The function's postcondition asserts that
rv ↔ ∃i. ℓ ≤ i ≤ u ∧ a[i] = e .
Consider basic path (4) of Example 5.13 but with the current loop invariant
substituted for the first assertion:
(4)
@L : F1 : ℓ ≤ i ≤ u + 1
S1 : assume i > u;
S2 : rv := false;
@post F2 : rv ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e
Note that we continue to number basic paths as they were numbered in Ex-
ample 5.13. The VC
ℓ ≤ i ≤ u + 1 ∧ i > u → (false ↔ ∃j. ℓ ≤ j ≤ u ∧ a[j] = e)
is not (TZ ∪ TA )-valid. Essentially, the antecedent does not assert anything
useful about the content of a. Write the consequent as
F : ∀j. ℓ ≤ j ≤ u → a[j] ≠ e
wp(F2 , S1 ; S2 )
⇔ wp(wp(F2 , rv := false), S1 )
⇔ wp(F2 {rv ↦ false}, S1 )
⇔ wp(F2 {rv ↦ false}, assume i > u)
⇔ i > u → F2 {rv ↦ false}
⇔ i > u → ∀j. ℓ ≤ j ≤ u → a[j] ≠ e
Generalize this new fact to the candidate invariant
G : i > u → ∀j. ℓ ≤ j ≤ u → a[j] ≠ e
and propagate it backward through the loop path. Then
wp(G, S1 ; S2 ; S3 )
⇔ wp(wp(G, i := i + 1), S1 ; S2 )
⇔ wp(G{i ↦ i + 1}, S1 ; S2 )
⇔ wp(wp(G{i ↦ i + 1}, assume a[i] ≠ e), S1 )
⇔ wp(a[i] ≠ e → G{i ↦ i + 1}, S1 )
⇔ wp(a[i] ≠ e → G{i ↦ i + 1}, assume i ≤ u)
⇔ i ≤ u → a[i] ≠ e → G{i ↦ i + 1}
⇔ i ≤ u ∧ a[i] ≠ e ∧ i + 1 > u → ∀j. ℓ ≤ j ≤ u → a[j] ≠ e
⇔ i = u ∧ a[u] ≠ e → ∀j. ℓ ≤ j ≤ u → a[j] ≠ e
⇔ i = u ∧ a[u] ≠ e → ∀j. ℓ ≤ j ≤ u − 1 → a[j] ≠ e
⇔ i = u ∧ a[u] ≠ e → ∀j. ℓ ≤ j ≤ i − 1 → a[j] ≠ e
To obtain the second-to-last line from the third-to-last, note that the an-
tecedent already asserts that a[u] ≠ e; hence, its occurrence as the case j = u
of ∀j. ℓ ≤ j ≤ u · · · is redundant. The final line is realized by applying the
equality i = u to the upper bound on j. As we suspected, it seems that the
right bound on j should be related to the progress of i, rather than being fixed
to u. This observation from computing the weakest precondition matches our
intuition. One trick to generalize assertions is to replace fixed terms (bounds,
indices, etc.) with terms that evolve according to the loop counter.
Thus, we settle on the formula
G′ : ∀j. ℓ ≤ j < i → a[j] ≠ e .
That is, all previously checked entries of a do not equal e. We add this assertion
to the loop invariant:
for
@L : ℓ ≤ i ≤ u + 1 ∧ (∀j. ℓ ≤ j < i → a[j] ≠ e)
(int i := ℓ; i ≤ u; i := i + 1) {
if (a[i] = e) return true;
}
The result is similar to the annotation in Figure 5.15. Generating and checking
the corresponding VCs reveals that the annotations are inductive.
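Such checks can also be discharged mechanically. A sketch (assuming the z3-solver Python package; our encoding models the array a as an integer-indexed array) of the VC for basic path (4) under the strengthened invariant:

from z3 import Ints, Array, IntSort, ForAll, Exists, Implies, And, Not, prove

i, j, l, u, e = Ints('i j l u e')
a = Array('a', IntSort(), IntSort())

inv = And(l <= i, i <= u + 1,
          ForAll(j, Implies(And(l <= j, j < i), a[j] != e)))
# With rv := false, the postcondition reduces to the negated existential.
post = Not(Exists(j, And(l <= j, j <= u, a[j] == e)))

prove(Implies(And(inv, i > u), post))   # prints "proved"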
Example 6.4. Consider the version of BinarySearch of Figure 5.12 that con-
tains runtime assertions but has only a trivial function specification ⊤. Using
the precondition method, we infer a function precondition that makes the an-
notations inductive. Contexts that call BinarySearch are then forced to obey
this function precondition, guaranteeing a lack of runtime errors.
Consider the path from function entry to the assertion protecting the array
access:
(·)
@pre H : ?
S1 : assume ℓ ≤ u;
S2 : m := (ℓ + u) div 2;
@ F : 0 ≤ m < |a|
Compute
wp(F, S1 ; S2 )
⇔ wp(wp(F, m := (ℓ + u) div 2), S1 )
⇔ wp(F {m ↦ (ℓ + u) div 2}, S1 )
⇔ wp(F {m ↦ (ℓ + u) div 2}, assume ℓ ≤ u)
⇔ ℓ ≤ u → F {m ↦ (ℓ + u) div 2}
⇔ ℓ ≤ u → 0 ≤ (ℓ + u) div 2 < |a|
⇐ 0 ≤ ℓ ∧ u < |a|
The final line implies the penultimate line, for if 0 ≤ ℓ ∧ u < |a| and ℓ ≤ u,
then both 0 ≤ ℓ < |a| and 0 ≤ u < |a|; hence, their mean is also in the range
[0, |a| − 1]. Therefore, it is guaranteed that
0 ≤ ℓ ∧ u < |a| → (ℓ ≤ u → 0 ≤ (ℓ + u) div 2 < |a|)
is TZ -valid.
The formula 0 ≤ ℓ ∧ u < |a| appears as the function precondition in
Figure 6.1. The annotations are inductive, proving that the runtime assertion
0 ≤ m < |a| holds in every execution of BinarySearch in which the precondition
0 ≤ ℓ ∧ u < |a| is satisfied.
Example 6.5. Consider the following code fragment of BubbleSort (see also
Figure 5.3)
for
@L1 : −1 ≤ i < |a|
(int i := |a| − 1; i > 0; i := i − 1) {
for
@L2 : 0 < i < |a| ∧ 0 ≤ j ≤ i
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
return a;
and its postcondition
F : sorted(rv , 0, |rv| − 1) .
Consider the path
(6)
@L1 : G : ?
S1 : assume i ≤ 0;
S2 : rv := a;
@post F : sorted(rv , 0, |rv| − 1)
Computing the weakest precondition of F backward through this path produces
F ′ : i ≤ 0 → sorted(a, 0, |a| − 1) .
The index i begins at |a| − 1 and decrements down to 0. Therefore, recalling
the trick to replace fixed terms (bounds, indices, etc.) with terms that evolve
according to the loop counter suggests the following generalization of F ′ :
G : sorted(a, i, |a| − 1) .
G trivially holds upon entering the outer loop; moreover, it follows from the
behavior of i that progress is made by working down the array. The outer loop
invariant L1 should include G. Thus, we have
@L1 : −1 ≤ i < |a| ∧ sorted(a, i, |a| − 1)
so far.
Propagate G via wp to the inner loop along the path from the exit of the
inner loop L2 to the top of the outer loop L1 :
(5)
@L2 : H : ?
S1 : assume j ≥ i;
S2 : i := i − 1;
@L1 : G : sorted(a, i, |a| − 1)
Computing the weakest precondition of G backward through this path produces
H ′ : j ≥ i → sorted(a, i − 1, |a| − 1) ,
which states that when the inner loop has finished, the range [i − 1, |a| − 1] is
sorted. Immediately generalizing H ′ to
H ′′ : sorted(a, i − 1, |a| − 1)
is too strong. For suppose H ′′ were to annotate the inner loop at L2 , and
consider the path
(2)
@L1 : G : sorted(a, i, |a| − 1)
S1 : assume i > 0;
S2 : j := 0;
@L2 : H ′′ : sorted(a, i − 1, |a| − 1)
Computing the corresponding verification condition reveals that it is not
valid: H ′′ asserts sortedness of a range that definitely holds only when the
inner loop has finished. Therefore,
we generalize H ′ to the weaker assertion H : sorted(a, i, |a| − 1), which claims
that a smaller subrange of a is sorted.
At this point, we have annotated the loops of BubbleSort as follows:
for
@L1 : −1 ≤ i < |a| ∧ sorted(a, i, |a| − 1)
(int i := |a| − 1; i > 0; i := i − 1) {
for
@L2 : 0 < i < |a| ∧ 0 ≤ j ≤ i ∧ sorted(a, i, |a| − 1)
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
The resulting VCs are not valid. Further annotations require some insight on
our part, which leads us to the next section.
6.1.3 A Strategy
Example 6.6. We resume our analysis of BubbleSort from Example 6.5. Some
cogitation (and observation of sample traces; see Figure 5.4) suggests that
BubbleSort exhibits the following behavior: the inner loop propagates the
largest value of the unsorted region to the right side of the unsorted region,
thus expanding the sorted region. At every iteration, j is the index of the
largest value found so far. In other words, all values in the range [0, j − 1] are
at most a[j]:
F : partitioned(a, 0, j − 1, j, j) .
Adding F to the inner loop annotation yields the following:
for
@L1 : −1 ≤ i < |a| ∧ sorted(a, i, |a| − 1)
(int i := |a| − 1; i > 0; i := i − 1) {
for
@L2 : 0 < i < |a| ∧ 0 ≤ j ≤ i
      ∧ partitioned(a, 0, j − 1, j, j) ∧ sorted(a, i, |a| − 1)
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
The outer loop invariant should also assert that a is partitioned:
G : partitioned(a, 0, i, i + 1, |a| − 1) .
Propagating G backward through path (5) produces
wp(G, assume j ≥ i; i := i − 1)
⇔ j ≥ i → partitioned(a, 0, i − 1, i, |a| − 1) ,
which, as before, cannot be generalized to
partitioned(a, 0, i − 1, i, |a| − 1) .
Find the strongest formula H that can annotate the inner loop such that the
VC for the path from L1 into the inner loop is valid. In other words, seek a
formula H annotating the inner
loop that is supported by the annotation G of the outer loop. The strongest
such formula is G itself.
These new annotations result in Figure 5.17.
6.2 Extended Example: QuickSort
typedef struct qs {
int pivot;
int[] array;
} qs;
@pre ⊤
@post sorted(rv , 0, |rv | − 1)
int[] QuickSort(int[] a) {
return qsort(a, 0, |a| − 1);
}
@pre ⊤
@post ⊤
int[] qsort(int[] a0 , int ℓ, int u) {
int[] a := a0 ;
if (ℓ ≥ u) return a;
else {
qs p := partition(a, ℓ, u);
a := p.array;
a := qsort(a, ℓ, p.pivot − 1);
a := qsort(a, p.pivot + 1, u);
return a;
}
}
qsort operates on a local copy a of the input array a0 because qsort modifies
the array (recall that pi does not allow parameters
to be modified). The qs data structure holds the two data that the partition
function, listed in Figure 6.3, returns: the pivot index pivot and the partitioned
array array.
One level of recursion of qsort works as follows. If ℓ ≥ u, then the trivial
range [ℓ, u] of a0 is already sorted. Otherwise, partition chooses a pivot index
pi ∈ [ℓ, u], remembering the pivot value a[pi] as pv. It then swaps cells pi and
u of a so that the randomly chosen pivot now appears on the right side of the
[ℓ, u] subarray. random has the following prototype:
@pre ℓ ≤ u
@post ℓ ≤ rv ≤ u
int random(int ℓ, int u);
The for loop of partition partitions a such that all elements at most pv
are on the left and all elements greater than pv are on the right. Within the
loop, j < u, so that the pivot value pv, stored in a[u], remains untouched.
When the loop finishes, if i < u − 1, then the value a[i + 1] is the first value
greater than pv; otherwise, all elements of a are at most pv. Finally, partition
@pre ⊤
@post ⊤
qs partition(int[] a0 , int ℓ, int u) {
int[] a := a0 ;
int pi := random(ℓ, u);
int pv := a[pi];
a[pi] := a[u];
a[u] := pv;
int i := ℓ − 1;
for @ ⊤
(int j := ℓ; j < u; j := j + 1) {
if (a[j] ≤ pv) {
i := i + 1;
t := a[i];
a[i] := a[j];
a[j] := t;
}
}
t := a[i + 1];
a[i + 1] := a[u];
a[u] := t;
return
{ pivot = i + 1;
  array = a;
};
}
swaps the pivot value a[u] with a[i + 1] so that a is partitioned as follows in
the range [ℓ, u]: cells to the left of i + 1 have value at most pv; a[i + 1] = pv;
and cells to the right of i + 1 have value greater than pv. It returns the pivot
index i + 1 and the partitioned array a via an instance of the qs data type.
Finally, qsort recursively sorts the subarrays to the left and to the right of
the pivot index.
Figure 6.4 presents a sample trace. In the first line, partition chooses the
second cell as the pivot and swaps it with cell u. The subsequent six lines follow
the partition’s loop as it partitions elements according to pv. The penultimate
line shows the swap that brings the pivot element into the pivot position. The
final line shows the state of the array when it is returned to qsort. qsort calls
itself recursively on the two indicated subarrays. We encourage the reader to
understand QuickSort and the sample trace before reading further.
[Figure 6.4: a sample trace of partition and qsort on the array 0 3 2 6 5. partition chooses the second cell (value 3) as the pivot and swaps it with cell u; its loop then partitions the remaining elements around the pivot value; the penultimate step swaps the pivot into position, yielding 0 2 3 6 5, with subarrays 0 2 and 6 5 left for the recursive calls.]
Define the predicate
beq(a, b, k1 , k2 ) ⇔ ∀i. k1 ≤ i ≤ k2 → a[i] = b[i]
in the theory TZ ∪ TA . It asserts that two arrays are equal in the index range
[k1 , k2 ].
For qsort, the returned array is sorted in the range of interest:
sorted(rv , ℓ, u) .
The annotations for partition vary slightly because of its return type. The
returned pivot index lies in the given range,
ℓ ≤ rv .pivot ≤ u ,
or, as a partition,
partitioned(rv .array, ℓ, rv .pivot − 1, rv .pivot, rv .pivot)
∧ partitioned(rv .array, rv .pivot, rv .pivot, rv .pivot + 1, u) ,
which does not capture the strict inequality but is more convenient for rea-
soning. For partition, we have thus specified the following:
Let us step back a moment. Essentially, we have specified that qsort does
not modify the array outside of the range [ℓ, u]. Regarding the subarray given
by [ℓ, u], all we have asserted is that it is sorted in the returned array. Focus
on the recursive calls to qsort: is knowing that the ranges [ℓ, p.pivot − 1] and
[p.pivot+ 1, u] of a are sorted enough to conclude that the range [ℓ, u] is sorted
when a is returned? In other words, is the VC corresponding to the following
basic path valid? The basic path follows the path from the precondition to
the second return statement, using the function call abstraction introduced
in Section 5.2.1 to abstract away function calls:
(·)
@pre 0 ≤ ℓ ∧ u < |a0 |
a := a0 ;
assume ℓ < u;
assume |v1 .array| = |a| ∧ beq(v1 .array, a, 0, ℓ − 1)
       ∧ beq(v1 .array, a, u + 1, |a| − 1)
       ∧ ℓ ≤ v1 .pivot ≤ u
       ∧ partitioned(v1 .array, ℓ, v1 .pivot − 1, v1 .pivot, v1 .pivot)
       ∧ partitioned(v1 .array, v1 .pivot, v1 .pivot, v1 .pivot + 1, u);
p := v1 ;
a := p.array;
assume |v2 | = |a| ∧ beq(v2 , a, 0, ℓ − 1) ∧ beq(v2 , a, p.pivot, |a| − 1)
       ∧ sorted(v2 , ℓ, p.pivot − 1);
a := v2 ;
assume |v3 | = |a| ∧ beq(v3 , a, 0, p.pivot) ∧ beq(v3 , a, u + 1, |a| − 1)
       ∧ sorted(v3 , p.pivot + 1, u);
a := v3 ;
rv := a;
@ |rv | = |a0 | ∧ beq(rv , a0 , 0, ℓ − 1) ∧ beq(rv , a0 , u + 1, |a0 | − 1)
  ∧ sorted(rv , ℓ, u)
(a permuted array contains the same elements as the original array but pos-
sibly in a different order). However, reasoning about permutations presents
a problem. A straightforward formalization of permutation is not possible in
FOL, instead requiring second-order logic. We could assert that the output
is a weak permutation of the input: all values occurring in the input array
occur in the output array but possibly with a varying number of occurrences.
Formally,
wperm(a, b) ⇔ ∀e. (∃i. 0 ≤ i < |a| ∧ a[i] = e) ↔ (∃i. 0 ≤ i < |b| ∧ b[i] = e) .   (6.1)
That is, partition preserves this partitioning even as it manipulates the ele-
ments in the range [ℓ, u]. Indeed, partition itself imposes the necessary parti-
tioning for the next level of recursion, which we already observed earlier as
the final partitioned assertions of the function postcondition.
The annotations of QuickSort and qsort are inductive. Exercise 6.3 asks
the reader to finish the proof by annotating the for loop of partition so that
the annotations of partition are also inductive.
The loop of partition clearly halts: its index j increases to u, so u − j serves
as a ranking function:
for
@L1 : ℓ ≤ j ∧ j ≤ u
↓ δ1 : u − j
(int j := ℓ; j < u; j := j + 1)
Proving that the recursion of qsort always halts is superficially more dif-
ficult. The argument that we would like to make is that u − ℓ decreases on
each recursive call, which requires proving that the pivot value returned by
partition lies within the range [ℓ, u].
Observe, however, that u − ℓ may be negative when qsort is called with
ℓ > u. But in this case, ℓ = u + 1, for either |a0 | = 0, and qsort was called
from QuickSort; or p.pivot = ℓ or p.pivot = u, and qsort was called recursively.
More generally, we can establish that u − ℓ + 1 ≥ 0 is an invariant of qsort.
Hence, δ2 : u − ℓ + 1 is our proposed ranking function that maps the program
states to N with well-founded relation <.
Figure 6.5 formalizes the arguments that δ1 and δ2 are ranking functions.
Notice that bounds on i are proved as loop invariants at L1 . These bounds
imply that rv .pivot lies within the range [ℓ, u] as required.
One trick that would avoid reasoning about the case in which ℓ > u is to
cut the recursion at a point within qsort rather than at function entry. Figure
6.6 provides an alternate argument in which the ranking function labels the
@pre u − ℓ + 1 ≥ 0
@post ⊤
↓ δ2 : u − ℓ + 1
int[] qsort(int[] a0 , int ℓ, int u) {
int[] a := a0 ;
if (ℓ ≥ u) return a;
else {
qs p := partition(a, ℓ, u);
a := p.array;
a := qsort(a, ℓ, p.pivot − 1);
a := qsort(a, p.pivot + 1, u);
return a;
}
}
@pre ℓ ≤ u
@post ℓ ≤ rv .pivot ∧ rv .pivot ≤ u
qs partition(int[] a0 , int ℓ, int u) {
⋮
int i := ℓ − 1;
for
@L1 : ℓ ≤ j ∧ j ≤ u ∧ ℓ − 1 ≤ i ∧ i < j
↓ δ1 : u − j
(int j := ℓ; j < u; j := j + 1) {
⋮
}
⋮
return
{ pivot = i + 1;
  array = a;
};
}
else branch in qsort. The first branch terminates the recursion. partition is
annotated as in Figure 6.5.
6.3 Summary
This chapter presents strategies for specifying and proving the correctness of
sequential programs. It covers:
• Strategies for proving partial correctness. The need for strengthening an-
notations. Basic facts; the precondition method.
@pre ⊤
@post ⊤
int[] qsort(int[] a0 , int ℓ, int u) {
int[] a := a0 ;
if (ℓ ≥ u) return a;
else {
↓ δ3 : u − ℓ
qs p := partition(a, ℓ, u);
a := p.array;
a := qsort(a, ℓ, p.pivot − 1);
a := qsort(a, p.pivot + 1, u);
return a;
}
}
@pre ⊤
@post ∀i. 0 ≤ i < |rv | → rv [i] ≥ 0
int[] abs(int[] a0 ) {
int[] a := a0 ;
for @ ⊤
(int i := 0; i < |a|; i := i + 1) {
if (a[i] < 0) {
a[i] := − a[i];
}
}
return a;
}
Bibliographic Remarks
QuickSort was discovered by Tony Hoare, who also proposed specifying and
verifying programs using FOL [39].
Exercises
6.1 (Absolute value). Prove the partial correctness of abs in Figure 6.7.
That is, annotate the function; list basic paths and verification conditions;
and argue that the VCs are valid.
@pre ⊤
@post sorted(rv , 0, |rv | − 1)
int[] InsertionSort(int[] a0 ) {
int[] a := a0 ;
for @ ⊤
(int i := 1; i < |a|; i := i + 1) {
int t := a[i];
for @ ⊤
(int j := i − 1; j ≥ 0; j := j − 1) {
if (a[j] ≤ t) break;
a[j + 1] := a[j];
}
a[j + 1] := t;
}
return a;
}
6.4 (MergeSort). Prove the partial correctness of MergeSort. See Figure 6.9.
The function merge uses the pi keyword new, which allocates an array of the
specified size. Therefore, it is known after the allocation to buf that |buf | =
u − ℓ + 1.
First, deduce the function specifications for ms and merge by focusing
on MergeSort and ms. Prove MergeSort and ms correct with respect to these
annotations. Then analyze merge.
Since MergeSort is fairly long, you need not list basic paths and VCs. Just
present MergeSort with its inductive annotations.
Compared to the full definition (6.1) of weak permutation, wperm does not
have a universally quantified variable e; instead, it uses a given expression e,
in this case the global constant e.
(a) Argue that the annotations of Figure 6.10 are inductive. That is, list the
VCs and argue their validity.
(b) Argue that the annotations imply that BubbleSort actually satisfies the
weak permutation property. That is, prove that the validity of the VC
wperm(a, a0 , e) ∧ a′ = . . . → wperm(a′ , a0 , e)
(c) Can this approximation be used to prove the weak permutation prop-
erty of
(i) InsertionSort (Figure 6.8)?
(ii) MergeSort (Figure 6.9)?
(iii) QuickSort (Section 6.2)?
If so, prove it. If not, explain why not.
6.7 (Sets with sorted arrays). Implement an API for manipulating sets.
The underlying data structure of the implementation is sorted arrays.
(a) Prove the correctness of the union function of Figure 6.12 by adding in-
ductive annotations.
(b) Implement, specify, and prove the correctness of an intersection func-
tion, which takes two sorted arrays a0 and b0 and returns the intersection
of the sets they represent as a sorted set.
(c) Implement, specify, and prove the correctness of a subset function,
which takes two sorted arrays a0 and b0 and returns true iff the first set,
represented by a0 , is a subset of the second set, represented by b0 .
6.8 (QuickSort halts). Provide the basic paths and verification conditions for
the proof of Section 6.2.2 that QuickSort always halts.
6.9 (Intuitive ranking functions). Following the proof that the recursion
of qsort halts, move the location of the ranking function annotations in the
following functions to produce more intuitive arguments:
(a) BinarySearch, Figure 5.2
(b) BubbleSort, Figure 5.3
(c) InsertionSort, Figure 6.8
@pre ⊤
@post sorted(rv , 0, |rv | − 1)
int[] MergeSort(int[] a) {
return ms(a, 0, |a| − 1);
}
@pre ⊤
@post ⊤
int[] ms(int[] a0 , int ℓ, int u) {
int[] a := a0 ;
if (ℓ ≥ u) return a;
else {
int m := (ℓ + u) div 2;
a := ms(a, ℓ, m);
a := ms(a, m + 1, u);
a := merge(a, ℓ, m, u);
return a;
}
}
@pre ⊤
@post ⊤
int[] merge(int[] a0 , int ℓ, int m, int u) {
int[] a := a0 , buf := new int[u − ℓ + 1];
int i := ℓ, j := m + 1;
for @ ⊤
(int k := 0; k < |buf |; k := k + 1) {
if (i > m) {
buf [k] := a[j];
j := j + 1;
} else if (j > u) {
buf [k] := a[i];
i := i + 1;
} else if (a[i] ≤ a[j]) {
buf [k] := a[i];
i := i + 1;
} else {
buf [k] := a[j];
j := j + 1;
}
}
for @ ⊤
(k := 0; k < |buf |; k := k + 1) {
a[ℓ + k] := buf [k];
}
return a;
}
define int e = ?;
@pre ⊤
@post wperm(a, a0 , e)
int[] BubbleSort(int[] a0 ) {
int[] a := a0 ;
for
@L1 : −1 ≤ i < |a| ∧ wperm(a, a0 , e)
(int i := |a| − 1; i > 0; i := i − 1) {
for
@L2 : 0 ≤ j < i ∧ i < |a| ∧ wperm(a, a0 , e)
(int j := 0; j < i; j := j + 1) {
if (a[j] > a[j + 1]) {
int t := a[j];
a[j] := a[j + 1];
a[j + 1] := t;
}
}
}
return a;
}
define int e = ?;
@pre ⊤
@post (∃i. 0 ≤ i < |rv | ∧ rv [i] = e)
      ↔ (∃i. 0 ≤ i < |a0 | ∧ a0 [i] = e) ∨ (∃i. 0 ≤ i < |b0 | ∧ b0 [i] = e)
int[] union(int[] a0 , int[] b0 ) {
int[] u := new int[|a0 | + |b0 |];
int j := 0;
for @ ⊤
(int i = 0; i < |a0 |; i := i + 1) {
u[j] := a0 [i];
j := j + 1;
}
for @ ⊤
(int i = 0; i < |b0 |; i := i + 1) {
u[j] := b0 [i];
j := j + 1;
}
return u;
}
Algorithmic Reasoning
Quantifier elimination (QE) is the main technique that underlies the al-
gorithms of this chapter. As the name suggests, the idea is to eliminate quan-
tifiers of a formula F until only a quantifier-free formula G that is equivalent
to F remains.
Formally, a theory T admits quantifier elimination if there is an algo-
rithm that, given Σ-formula F , returns a quantifier-free Σ-formula G that is
T -equivalent to F . Then T is decidable if satisfiability in the quantifier-free
fragment of T is decidable.
Consider the ΣQ -formula
F : ∃x. 2x = y ,
which expresses the set of rationals y that can be halved. Intuitively, all ra-
tionals can be halved, so a quantifier-free TQ -equivalent formula is
G: ⊤,
which expresses the set of all rationals. Also, G states that F is valid.
Now consider the same formula interpreted in TZ :
F : ∃x. 2x = y ,
which expresses the set of integers y that can be halved (to produce another
integer). Intuitively, only even integers can be halved. Can you think of a
quantifier-free TZ -equivalent formula to F ? In fact, no such formula exists.
Later, we introduce an augmented theory of integers that contains a countably
infinite number of divisibility predicates. For example, in this extended theory,
an equivalent formula to F is
G: 2|y ,
which expresses the set of even integers: integers that are divisible by 2.
7.1.2 A Simplification
The innermost quantified formula is now ∀y. F2 [x, y]; rewriting, we have
∃x. ¬(∃y. ¬F2 [x, y]) .
Applying the QE algorithm to the existential subformula ∃y. ¬F2 [x, y] pro-
duces F3 [x]. We now have
∃x. ¬F3 [x] ,
where ¬F3 [x] is quantifier-free, so the QE algorithm applies once more. Hence,
it suffices to define a QE algorithm that handles only existential quantification
of quantifier-free formulae.
Lemma 7.4 explains why TZ itself does not admit QE. Consider a quantifier-
free ΣZ -formula F with the single free variable y, and let
S : {n ∈ Z : F {y ↦ n} is TZ -valid} .
Either S ∩ Z+ or Z+ \ S is finite.
Z+ is the set of positive integers; ∩ and \ are set intersection and com-
plement, respectively. The lemma says that every quantifier-free ΣZ -formula
with only one free variable represents a set of integers S such that either the
subset of positive integers in S has finite cardinality or the set of positive
integers not in S has finite cardinality. Exercise 7.1 asks the reader to prove
this lemma.
Consider again the case of ∃x. 2x = y, and let S be the set of integers
satisfying the formula, namely the even integers. Since both the set of positive
even integers S ∩Z+ and the set of positive odd integers Z+ \S are infinite, the
set of even integers cannot be represented in TZ by a quantifier-free formula
according to the lemma. Therefore, there is no quantifier-free ΣZ -formula that
is TZ -equivalent to ∃x. 2x = y, and thus TZ does not admit QE.
To circumvent this problem, we augment the theory TZ with an infinite
but countable number of unary divisibility predicates
k| · for k ∈ Z+ ;
that is, a predicate exists for each positive integer k. The intended interpre-
tation of k | x is that it holds iff k divides x without any remainder. For
example,
¬(2 | x) ∧ 4 | x
is not satisfiable.
Step 1
Put F [x] in NNF. The output ∃x. F1 [x] is T̂Z -equivalent to ∃x. F [x] and is
such that F1 is a positive Boolean combination (only ∧ and ∨) of literals.
Step 2
Replace literals according to the following T̂Z -equivalences, applied from left
to right:
s = t ⇔ s < t + 1 ∧ t < s + 1
¬(s = t) ⇔ s < t ∨ t < s
¬(s < t) ⇔ t < s + 1
where s, t are Σ̂Z -terms and k ∈ Z+ ; divisibility literals k | s and their nega-
tions ¬(k | s) are left unchanged.
Example 7.5 applies these T̂Z -equivalences to rewrite equalities and negated
literals into the strict-inequality and divisibility forms above.
Step 3
Collect terms containing x so that literals have the form
hx < t , t < hx , k | hx + t , or ¬(k | hx + t) ,
where t is a term that does not contain x and h, k ∈ Z+ . The output is the
formula ∃x. F3 [x], which is T̂Z -equivalent to ∃x. F [x].
Example 7.6. Collecting terms in
x + x + y < z + 3z + 2y − 4x
produces the T̂Z -equivalent formula
6x < 4z + y .
Step 4
Let
δ ′ = lcm{h : h is a coefficient of x in F3 [x]} ,
where lcm returns the least common multiple of the set. Multiply atoms in
F3 [x] by constants so that δ ′ is the coefficient of x everywhere:
hx < t ⇔ δ ′ x < h′ t where h′ h = δ ′
t < hx ⇔ h′ t < δ ′ x where h′ h = δ ′
k | hx + t ⇔ h′ k | δ ′ x + h′ t where h′ h = δ ′
¬(k | hx + t) ⇔ ¬(h′ k | δ ′ x + h′ t) where h′ h = δ ′
Notice the abuse of notation: h′ k | · is a different (unary) predicate than k | ·.
This rewriting results in formula F3′ in which all occurrences of x occur in
terms δ ′ x. Replace δ ′ x terms with a fresh variable x′ to form
F3′′ : F3′ {δ ′ x ↦ x′ } .
Finally, construct
∃x′ . F4 [x′ ] , where F4 [x′ ] : F3′′ [x′ ] ∧ δ ′ | x′ .
The conjoined divisibility literal restricts x′ to the multiples of δ ′ , as δ ′ x is.
Step 5
Construct the left infinite projection F−∞ [x′ ] from F4 [x′ ] by replacing
(A) literals x′ < a by ⊤
and
(B) literals b < x′ by ⊥ .
The idea is that very small numbers (the left side of the "number line") satisfy
(A) literals but not (B) literals.
Let
δ = lcm { h of (C) literals h | x′ + c , k of (D) literals ¬(k | x′ + d) } ,
and let B be the set of b terms appearing in (B) literals. Construct
F5 : ⋁_{j=1}^{δ} F−∞ [j] ∨ ⋁_{j=1}^{δ} ⋁_{b∈B} F4 [b + j] .
The output F5 is T̂Z -equivalent to ∃x. F [x].
Step 5 is the trickiest part of the procedure, so let us focus on this step.
The first major disjunct of F5 contains only divisibility literals. It asserts that
an infinite number of small numbers n satisfy F4 [n]. For if there exists one
number n that satisfies the Boolean combination of divisibility literals in F−∞ ,
then every n − λδ, for λ ∈ Z+ , also satisfies F−∞ .
The second major disjunct asserts that there is a least n ∈ Z that satisfies
F4 [n]. This least n is determined by the b terms of the (B) literals.
More formally, consider the following periodicity property of the divisibility
predicates: if k | δ, then for any integers n and λ,
k | n iff k | n + λδ .
Hence, the divisibility literals of F4 cannot distinguish between values that
differ by a multiple of δ.
In the running example,
δ ′ = lcm{2, 3, 5} = 30 .
Replacing 30x with fresh x′ and conjoining a divides atom completes Step 4.
For Step 5, the left infinite projection is
F−∞ [x′ ] : ⊤ ∧ ⊥ ∧ 24 | x′ + 6 ∧ 30 | x′ ,
which simplifies to ⊥. Moreover, δ = lcm{24, 30} = 120 and
B = {10y − 10} ,
so constructing F5 and simplifying yields
F5 : ⋁_{j=1}^{120} ( 10y + j < 15z + 100 ∧ 0 < j ∧ 24 | 10y + j − 4 ∧ 30 | 10y + j − 10 ) .
Example 7.8. Consider again the formula defining the set of even integers:
∃x. 2x = y ,
with body F [x] : 2x = y.
Then
δ ′ = lcm{2, 2} = 2 and δ = lcm{2} = 2 ,
and
B = {y − 1} ,
so
F5 : ⋁_{j=1}^{2} ( y − 1 < y − 1 + j ∧ y − 1 + j < y + 1 ∧ 2 | y − 1 + j ) .
Simplifying, we find
F5 : ⋁_{j=1}^{2} ( 0 < j ∧ j < 2 ∧ 2 | y + j − 1 ) ,
and then
F5 : 2 | y ,
so ∃x. 2x = y is T̂Z -equivalent to the quantifier-free formula 2 | y.
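As a quick empirical check of this result (not a proof — simply testing the claimed equivalence over a finite range of integers; the helper below is ours):

# Check that "exists x. 2x = y" agrees with "2 | y" for sampled y.
def exists_half(y, bound=200):
    return any(2 * x == y for x in range(-bound, bound + 1))

assert all(exists_half(y) == (y % 2 == 0) for y in range(-100, 101))
print("exists x. 2x = y coincides with 2 | y on the sampled range")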
At Step 4 of the next example,
δ ′ = lcm{3, 7} = 21 ,
and we replace 21x by x′ . Then
F−∞ [x′ ] : (⊤ ∨ ⊥) ∧ 42 | x′ ∧ 21 | x′ ,
or, simplifying,
F−∞ [x′ ] : 42 | x′ ∧ 21 | x′ .
Finally,
δ = lcm{21, 42} = 42
and
B = {39} ,
so
F5 : ⋁_{j=1}^{42} ( 42 | j ∧ 21 | j )
   ∨ ⋁_{j=1}^{42} ( (39 + j < 63 ∨ 39 < 39 + j) ∧ 42 | 39 + j ∧ 21 | 39 + j ) .
Since 42 | 42 and 21 | 42, the left main disjunct simplifies to ⊤, so that ∃x. F [x]
is T̂Z -equivalent to ⊤. Thus, F is T̂Z -valid.
Theorem 7.10 (Correct). Given Σ̂Z -formula ∃x. F [x] in which F is quantifier-
free, Cooper's method returns a T̂Z -equivalent quantifier-free formula.
Proof. The transformations of the first four steps produce formula F4 . By
inspection, we assert that ∃x. F [x] and ∃x. F4 [x] are T̂Z -equivalent. It remains
to show that F5 and ∃x. F4 [x] are T̂Z -equivalent.
Suppose first that an interpretation I satisfies F5 . If one of the second set
of disjuncts is true, then for some b∗ ∈ B and j ∗ ∈ [1, δ],
I ⊳ {x ↦ b∗ + j ∗ } |= F4 [x] ,
so that
I |= ∃x. F4 [x] .
Otherwise, one of the first set of disjuncts is true, so for some j ∗ ∈ [1, δ],
I ⊳ {x ↦ j ∗ } |= F−∞ [x]. By construction of F−∞ , there is some λ > 0 such
that I ⊳ {x ↦ j ∗ − λδ} |= F4 [x]. That is, there is some j ∗ − λδ that is so small
that the inequality literals of F4 evaluate under I ⊳ {x ↦ j ∗ − λδ} exactly as
in the construction of F−∞ . Thus, I |= ∃x. F4 [x] in this case as well.
For the other direction, assume that I |= ∃x. F4 [x]. Thus, some n ∈ Z
exists such that I ⊳ {x ↦ n} |= F4 [x]. If for some b∗ ∈ B and j ∗ ∈ [1, δ],
n = b∗ + j ∗ , then the second set of disjuncts of F5 is satisfied. Otherwise,
the periodicity property allows shifting n by multiples of δ into [1, δ] while
preserving F−∞ , so the first set of disjuncts is satisfied. In either case, I |= F5 .
The construction in Step 5 was biased to the left. We can just as easily define
a right elimination. Construct the right infinite projection F+∞ [x′ ] from
F4 [x′ ] by replacing
(A) literals x′ < a by ⊥
and
(B) literals b < x′ by ⊤ .
The idea is that very large numbers (the right side of the "number line")
satisfy (B) literals but not (A) literals.
Then define δ as before, but now define A as the set of a terms appearing
in (A) literals. Construct
F5 : ⋁_{j=1}^{δ} F+∞ [−j] ∨ ⋁_{j=1}^{δ} ⋁_{a∈A} F4 [a − j] .
When a formula has, for example, two (A) literals but only one (B) literal,
choose the left infinite projection to produce fewer disjuncts.
G1 : ∃x1 . · · · ∃xn−1 .
( ⋁_{j=1}^{δ} F−∞ [x1 , . . . , xn−1 , j] ∨ ⋁_{j=1}^{δ} ⋁_{b∈B} F4 [x1 , . . . , xn−1 , b + j] ) .
At Step 3,
δ ′ = lcm{1, 13} = 13 ,
producing
∃y. ∃x. 13x < −26 ∧ 13 − 65y < 13x ∧ 1 + y < 13x
and then, replacing 13x by x′ and conjoining the divides atom,
∃y. ∃x′ . x′ < −26 ∧ 13 − 65y < x′ ∧ 1 + y < x′ ∧ 13 | x′ .
With δ = lcm{13} = 13, A = {−26}, and B = {13 − 65y, 1 + y}, choose the
right elimination to form:
∃y. ⋁_{j=1}^{13} ( −26 − j < −26 ∧ 13 − 65y < −26 − j
                 ∧ 1 + y < −26 − j ∧ 13 | −26 − j ) .
⋆ 7.2.5 Solving Divides Constraints
When the input formula has no free variables, the output F5 is
also without free variables. Expanding this formula by attempting every
sible combination of values for j1 , . . . , jn produces δ1 × δ2 × · · · × δn disjuncts.
This naive expansion is prohibitively expensive on even small problems.
Notice, however, that Step 4 introduces many divisibility literals as con-
juncts. F5 has the form
F5 : ⋁_{j1 =1}^{δ1} · · · ⋁_{jn =1}^{δn} ( F ′′ ∧ ⋀_i ki | ti [j1 , . . . , jn ] ) ,
where the ti are terms containing only constants and the ji iterators. Cooper
realized that the conjuncts
D : ⋀_i ki | ti [j1 , . . . , jn ]
can be solved for their satisfying assignments to j1 , . . . , jn directly, avoiding
the naive enumeration of all combinations.
The following theorem provides a means for reducing the number of literals
that contain some ji . It applies Euclid’s algorithm for computing the greatest
common divisor (GCD) d of two integers m and n. Euclid’s algorithm also
returns two integers p and q such that pm + qn = d.
Theorem 7.13. Consider two divisibility constraints
F : m | ax + b ∧ n | αx + β ,
where m, n ∈ Z+ , a, α ∈ Z \ {0}, and b, β are terms not containing x. Let
d, p, q = gcd(an, αm) be such that d is the GCD of an and αm, and p and q
obey pan + qαm = d. Then F is satisfiable iff
G : mn | dx + bpn + βqm ∧ d | αb − aβ
is satisfiable.
While both of the literals of F contain x, only one of the literals of G
contains x. Therefore, we can apply this theorem to triangulate a set S of
divisibility constraints. Let ≺ be a linear ordering of j1 , . . . , jn . S is in trian-
gular form if for each ji , at most one constraint of S contains ji as the least
(according to ≺) free variable.
The triangularization algorithm proceeds iteratively. On one iteration, per-
form the following steps:
1. Choose from S two constraints
m | aji + b and n | αji + β
such that there is no jk ≺ ji that occurs in at least two divisibility con-
straints of S.
2. Apply Theorem 7.13 to produce the new constraints
mn | dji + bpn + βqm and d | αb − aβ .
Replace the original constraints with these constraints in S.
Consider the divides constraints
13 | −j − 26 ∧ 65 | 39 + j + k .
Fix the variable order j ≺ k. According to this order, only one constraint
should have an occurrence of j, so apply Theorem 7.13:
m | aj + b n | αj + β
13 | −j + −26 ∧ 65 | j + k + 39
Compute
d, p, q = gcd(an, αm) pan + qαm = d
so that
13, 0, 1 = gcd(−65, 13) 0(−65) + 1(13) = 13
and construct
mn | dj + bpn + βqm d | αb − aβ
845 | 13j + (−26)(0)(65) + (k + 39)13 ∧ 13 | −26 − (−1)(k + 39)
or, simplifying,
845 | 13j + 13k + 507 ∧ 13 | k + 13 .
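Theorem 7.13 relies on the extended Euclidean algorithm to produce d, p, and q. A standard sketch (the variable names follow the theorem; this is our illustration, not the book's code):

# Extended Euclidean algorithm: returns d, p, q with
# d = gcd(m, n) and p*m + q*n = d.
def ext_gcd(m, n):
    if n == 0:
        return m, 1, 0
    d, p, q = ext_gcd(n, m % n)
    # d = p*n + q*(m % n) and m % n = m - (m//n)*n, so
    # d = q*m + (p - (m//n)*q)*n.
    return d, q, p - (m // n) * q

# Reproduce the computation above: d, p, q = gcd(an, alpha*m) with
# a = -1, n = 65, alpha = 1, m = 13.
print(ext_gcd(-1 * 65, 1 * 13))   # (13, 0, 1): 0*(-65) + 1*13 = 13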
To solve a triangular system S of divides constraints, consider the variables
in ≺-order. The constraint in which ji is the least free variable has the form
F : m | aji + b .
1. Compute d, p, q = gcd(a, m).
2. Find the solutions for ji within the range [1, δi ]: solutions exist iff d | b.
3. If F is unsatisfiable, return.
4. Otherwise, instantiate the remaining constraints of S with each solution
to ji within the range [1, δi ], and recursively solve.
Consider again the constraints
13 | −j − 26 ∧ 65 | 39 + j + k .
The system is already in triangular form for the variable order k ≺ j. To solve
the first constraint
m | aj + b d, p, q = gcd(a, m)
compute
13 | −j + −26    1, −1, 0 = gcd(−1, 13) .
Since 1 | −26, solutions exist; the unique solution within [1, 13] is j = 13.
Instantiating j = 13 in the second constraint yields 65 | 13 + 39 + k, that is,
65 | k + 52. To solve this constraint
m | ak + b    d, p, q = gcd(a, m)
compute
65 | k + 52 1, 1, 0 = gcd(1, 65) .
What if the formula has free variables or some quantifier alternation? For
example, consider a formula
∀y1 . · · · ∀ym . ∃x1 . · · · ∃xn . F [x1 , . . . , xn , y1 , . . . , ym ] .
Eliminating the inner existential block produces the T̂Z -equivalent formula
G′ : ∀y1 . · · · ∀ym . ⋁_{j1 =1}^{δ1} · · · ⋁_{jn =1}^{δn} F ′ [j1 , . . . , jn , y1 , . . . , ym ] .
ΣQ : {0, 1, +, −, =, ≥} ,
where
• 0 and 1 are constants;
• + is a binary function;
• − is a unary function;
• and = and ≥ are binary predicates.
To be consistent with our presentation of Cooper's method, we switch from
weak inequality ≥ to strict inequality >. Of course, they are interchangeable:
s ≥ t ⇔ s > t ∨ s = t , and s > t ⇔ ¬(t ≥ s) .
Step 1
Put F [x] in NNF. The output ∃x. F1 [x] is TQ -equivalent to ∃x. F [x] and is
such that F1 is a positive Boolean combination (only ∧ and ∨) of literals.
Step 2
Replace negated literals according to the following TQ -equivalences, applied
from left to right:
¬(s < t) ⇔ t < s ∨ s = t
¬(s = t) ⇔ s < t ∨ t < s
The output ∃x. F2 [x] is TQ -equivalent to ∃x. F [x] and does not contain any
negations.
Step 3
Solve for x in each atom of F2 [x]: for example, replace the atom
t < cx , with c > 0 ,
by t/c < x. Afterward, every atom containing x has the form (A) x < a,
(B) b < x, or (C) x = c, where a, b, and c are terms that do not contain x.
Step 4
Construct the left infinite projection F−∞ [x] from F3 [x] by replacing
(A) atoms x < a by ⊤ ,
(B) atoms b < x by ⊥ ,
and
(C) atoms x = c by ⊥ ;
and the right infinite projection F+∞ [x] from F3 [x] by replacing
(A) atoms x < a by ⊥ ,
(B) atoms b < x by ⊤ ,
and
(C) atoms x = c by ⊥ .
[Fig. 7.2. Satisfying points: (a) the midpoint (b + a)/2 between atoms b < x and x < a; (b) the point c = (c + c)/2 given by an equality atom x = c.]
The left (right) infinite projection captures the case when small (large) n ∈ Q
satisfy F3 [n].
Let S be the set of a, b, and c terms from the (A), (B), and (C) atoms.
Construct the final output
_
s+t
F4 : F−∞ ∨ F+∞ ∨ F3 ,
2
s,t∈S
Consider again the formula
∃x. 2x = y ,
with body F [x] : 2x = y. Step 3 solves for x, producing the (C) atom x = y/2,
so that S = {y/2}. Both F−∞ and F+∞ are ⊥, while F3 [(y/2 + y/2)/2] :
2(y/2) = y simplifies to ⊤. Hence F4 : ⊤, and ∃x. F [x] is TQ -valid: every
rational can be halved.
In the next example, F3 : x < 3 ∧ x > 13/7 and S = {3, 13/7}. Since x < 3
is an (A) atom and x > 13/7 is a (B) atom, both F−∞ and F+∞ simplify to
⊥, leaving
F4 : ⋁_{s,t∈S} ( (s + t)/2 < 3 ∧ (s + t)/2 > 13/7 ) .
(s + t)/2 takes on three expressions: 3, 13/7, and (13/7 + 3)/2. The first two
expressions arise when s and t are the same terms. F3 [3] and F3 [13/7] both
simplify to ⊥ since the inequalities are strict; however,
F3 [ (13/7 + 3)/2 ] : (13/7 + 3)/2 < 3 ∧ (13/7 + 3)/2 > 13/7
simplifies to ⊤. Thus, F4 : ⊤ is TQ -equivalent to ∃x. F [x], so ∃x. F [x] is
TQ -valid.
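Step 4's finite midpoint test is easy to carry out mechanically. A sketch for this example (the atoms x < 3 and x > 13/7 are hard-coded; exact rational arithmetic via the fractions module avoids rounding):

from fractions import Fraction

def F3(x):                      # x < 3 and x > 13/7
    return x < 3 and x > Fraction(13, 7)

S = [Fraction(3), Fraction(13, 7)]
midpoints = {(s + t) / 2 for s in S for t in S}
print(any(F3(m) for m in midpoints))   # True: (13/7 + 3)/2 satisfies F3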
Example 7.19. Consider the ΣQ -formula
G : ∀x. x < y .
To eliminate x, consider the subformula F [x] : ¬(x < y) of
G′ : ¬(∃x. ¬(x < y)) .
Step 2 rewrites ∃x. F [x] as
∃x. y < x ∨ y = x .
The literals are already in solved form for x in Step 3. Then
F−∞ : ⊥ ∨ ⊥ and F+∞ : ⊤ ∨ ⊥
simplify to ⊥ and ⊤, respectively. Since F+∞ is ⊤, we need not consider the
rest of Step 4, but instead declare that ∃x. F [x] is TQ -equivalent to F4 : ⊤.
Then G′ is ¬⊤, so that G is TQ -equivalent to ⊥.
⋆ 7.4 Complexity
Fischer and Rabin proved the following lower bounds. The length n of a
formula is the number of symbols.
Theorem 7.22 (TQ Lower Bound). There is a fixed constant c > 0 such
that for all sufficiently large n, there is a ΣQ -formula of length n that requires
at least 2cn steps to decide its validity.
Closing the gap between the lower and upper bounds would require an-
swering long-standing open questions in complexity theory.
7.5 Summary
Quantifier elimination is a standard technique for reasoning about theories
in which satisfiability is decidable even with arbitrary quantification. This
chapter presents the technique in the context of arithmetic over integers and
over rationals or reals. It covers:
Bibliographic Remarks
Presburger proves that arithmetic over the natural numbers without multi-
plication TN is decidable [73]. Cooper presents the version of the quantifier-
elimination procedure for TZ that we describe [19]. Fischer and Rabin provide
the lower bound on the complexity of the decision problem for TZ [33], while
Oppen analyzes Cooper’s procedure to obtain an upper bound [68].
Ferrante and Rackoff describe the quantifier-elimination procedure that
we present and the lower and upper complexity bounds on the problem [32].
Exercises
7.1 (TZ does not admit QE). Prove Lemma 7.4. Hint : Apply structural
induction; the base cases involve comparisons between ay and c, for constants
a and c.
7.2 (QE for T̂Z ). Apply quantifier elimination to the following ΣZ -formulae.
(a) ∀y. 3 < x + 2y ∨ 2x + y < 3
(b) ∃y. 3 < x + 2y ∨ 2x + y < 3
G : ∀x1 , . . . , xn . F [x1 , . . . , xn ] ,
¬G : ∃x1 , . . . , xn . ¬F [x1 , . . . , xn ] ,
F1 ∨ · · · ∨ Fk
We define basic concepts and notation of linear algebra, covering only what is
required for understanding the remainder of the chapter. We refer the reader
interested in learning more about linear algebra to relevant texts in Biblio-
graphic Remarks.
The n × n identity matrix I satisfies
IA = AI = A
for any n × n-matrix A. Finally, the unit vector ei is the vector in which
the ith element is 1 and all other elements are 0. Again, the sizes of I and ei
depend on their context.
Linear Equations
A vector space is a set of vectors that is closed under addition and scaling
of vectors: if v1 , . . . , vk ∈ S are vectors in vector space S, then also
λ1 v1 + · · · + λk vk ∈ S
for any scalars λ1 , . . . , λk .
A system of linear equations has the form
F : Ax = b ,
for an m × n-matrix A of coefficients, an n-vector x of unknowns, and an m-vector b of constants.
Now solve the final row, representing the equation x3 = −6, for x3, yielding
x3 = −6. Substituting into the second equation yields −x2 − 6 = −3, or
x2 = −3. Substituting the solutions for x2 and x3 into the first equation
yields 3x1 − 3 − 12 = 6, or x1 = 7. Hence, the solution is x = [7 −3 −6]^T.
From the last row, x4 = −3. We cannot solve for x3 because there is not a row
in which the x3 column has the first non-zero element; therefore, x3 can take
on any value. To solve the second row, −x2 + x3 − x4 = 0, for x2 , replace x4
with its value −3 and let x3 be any value: −x2 + x3 + 3 = 0. Then x2 = 3 + x3 .
Substituting for x2 in the first equation, solve 3x1 + (3 + x3) + 2x3 = 6 for x1 :
x1 = 1 − x3 . Solutions thus lie on the line described by
x = [1 − x3   3 + x3   x3   −3]^T
as x3 ranges over all values.
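The elimination-and-substitution process just illustrated is straightforward to mechanize. The following OCaml sketch (our illustration, not code from the text) solves Ax = b for a square, nonsingular A by forward elimination with partial pivoting followed by back substitution; it mutates its arguments:

let solve (a : float array array) (b : float array) : float array =
  let n = Array.length b in
  (* forward elimination with partial pivoting *)
  for k = 0 to n - 1 do
    (* select the row with the largest pivot in column k and swap it up *)
    let p = ref k in
    for i = k + 1 to n - 1 do
      if abs_float a.(i).(k) > abs_float a.(!p).(k) then p := i
    done;
    let row = a.(k) in a.(k) <- a.(!p); a.(!p) <- row;
    let bk = b.(k) in b.(k) <- b.(!p); b.(!p) <- bk;
    (* eliminate column k below the pivot *)
    for i = k + 1 to n - 1 do
      let m = a.(i).(k) /. a.(k).(k) in
      for j = k to n - 1 do
        a.(i).(j) <- a.(i).(j) -. m *. a.(k).(j)
      done;
      b.(i) <- b.(i) -. m *. b.(k)
    done
  done;
  (* back substitution *)
  let x = Array.make n 0.0 in
  for i = n - 1 downto 0 do
    let s = ref b.(i) in
    for j = i + 1 to n - 1 do
      s := !s -. a.(i).(j) *. x.(j)
    done;
    x.(i) <- !s /. a.(i).(i)
  done;
  x

For the triangular system of the first example above, solve [|[|3.;1.;2.|]; [|0.;-1.;1.|]; [|0.;0.;1.|]|] [|6.;-3.;-6.|] returns [|7.;-3.;-6.|], matching the hand computation.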
To compute the kth column y of A−1, solve Ay = ek directly by
Gaussian elimination for y, rather than computing all of A−1 and extracting
the kth column.
solve
    [3 1 2] [x1]   [0]
    [1 0 1] [x2] = [1] ,
    [2 2 1] [x3]   [0]
that is, Ax = e2.
2. Apply forward elimination (for example, replace the second row by 3·(row 2) − (row 1), replace the third row by 3·(row 3) − 2·(row 1), and then clear the second column below its pivot), yielding the triangular augmented system
    [3  1  2 | 0]
    [0 −1  1 | 3]
    [0  0  1 | 4] .
Back substitution then yields x = [−3 1 4]^T, which is the second column of A−1.
Turning from equations to inequalities, consider constraints of the form
G : Ax ≤ b ,
Linear Programs
The linear optimization problem, or linear program,
max c^T x
subject to
Ax ≤ b
max (x − z1) + (y − z2)
subject to
x ≥ 0 ∧ y ≥ 0 ∧ z1 ≥ 0 ∧ z2 ≥ 0
∧ x + y ≤ 3 ∧ x − z1 ≤ 2 ∧ y − z2 ≤ 2 .
One vertex of the constraints is v = [2 1 0 0]^T. Why is it a vertex? Consider
the submatrix A0 of A consisting of rows 3, 4, 5, and 6; and the subvector b0
of b consisting of the same rows. A0 is invertible. Additionally, A0 v = b0:
    [0 0 −1  0] [2]   [0]
    [0 0  0 −1] [1]   [0]
    [1 1  0  0] [0] = [3] .
    [1 0 −1  0] [0]   [2]
max{c^T x : Ax ≤ b} = min{y^T b : y ≥ 0 ∧ y^T A = c^T} ,
if Ax ≤ b is satisfiable.
Consider proving an implication of the form
Ax ≤ b ⇒ c^T x ≤ δ .
For any y ≥ 0, multiplying Ax ≤ b on the left by y^T yields
Ax ≤ b ⇒ y^T Ax ≤ y^T b .
Hence, to prove
Ax ≤ b ⇒ c^T x ≤ δ ,
it suffices to find y ≥ 0 such that
y^T A = c^T and y^T b = δ .
That is,
Ax ≤ b ⇒ c^T x = (y^T A)x ≤ y^T b = δ .
But we want to find a minimal δ such that the implication holds, not just
to prove it for a fixed δ. Thus, choose y ≥ 0 such that y^T A = c^T and such
that y^T b is minimized. This equivalence between maximizing c^T x and
minimizing y^T b is precisely the one claimed by Theorem 8.6.
We refer the reader to the texts cited in Bibliographic Remarks for a
proof of this theorem.
TQ -Satisfiability
with both weak and strict inequalities. Equalities can be written as two in-
equalities. F is TQ -equivalent to the ΣQ -formula
F′ : ⋀_{i=1}^{m} (ai1 x1 + · · · + ain xn ≤ bi)
   ∧ ⋀_{i=1}^{ℓ} (αi1 x1 + · · · + αin xn + xn+1 ≤ βi)
   ∧ xn+1 > 0
with only weak inequalities except for xn+1 > 0. To decide the TQ -satisfiability
of F ′ , and thus of F , pose and solve the following linear program:
max xn+1
subject to
⋀_{i=1}^{m} ai1 x1 + · · · + ain xn ≤ bi
⋀_{i=1}^{ℓ} αi1 x1 + · · · + αin xn + xn+1 ≤ βi .
F′, and thus F, is TQ-satisfiable iff the maximum value of xn+1 is positive (or the problem is unbounded).
If F contains no strict inequalities (ℓ = 0), only satisfiability is in question; pose instead the trivial objective
max 1
subject to
⋀_{i=1}^{m} ai1 x1 + · · · + ain xn ≤ bi ,
whose optimum is 1 iff the constraints are satisfiable.
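As a tiny illustration of the reduction (ours, not from the text): the strict ΣQ-formula x > 0 ∧ x < 1 becomes
F′ : −x + x2 ≤ 0 ∧ x + x2 ≤ 1 ∧ x2 > 0 ,
and the linear program max x2 subject to −x + x2 ≤ 0 ∧ x + x2 ≤ 1 attains optimum 1/2 at x = 1/2. The optimum is positive, so the original formula is TQ-satisfiable.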
M : max c^T x
subject to
G : Ax ≤ b
The simplex method solves the linear program in two main steps. In the
first step, it obtains an initial vertex v 1 of Ax ≤ b. In the second step, it
iteratively traverses the vertices of Ax ≤ b, beginning at v 1 , in search of the
vertex that maximizes the objective function. On each iteration of the second
step, it determines whether the current vertex v i has an objective value at
least as great as those of the vertices adjacent to v i. If not, it moves to one of
the adjacent vertices with a greater objective value. If so, it halts and reports
v i as the optimum point with value c^T v i.
v i is a local optimum since its adjacent vertices have no greater objective
values. But because the space defined by Ax ≤ b is convex, v i is also the
global optimum: its objective value is the highest attained by any point that
satisfies the constraints.
How does the simplex method find the initial vertex v 1 ? In the first step,
it constructs a new linear program
M0 : max c′^T x
subject to
G0 : A′ x ≤ b′
8.4.1 From M to M0
To find the initial vertex of M , the simplex method constructs and solves a
new linear program M0 . To that end, reformulate the constraints G : Ax ≤ b
of M so that they have the form
x ≥ 0, Ax ≤ b
D2 has only one row, so z = [z]. According to (8.1), pose the following opti-
mization problem:
max  [1 −1 1 −1] [x1 x2 y1 y2]^T − z
subject to
     x1 , x2 , y1 , y2 , z ≥ 0
     [−1 1 1 −1] [x1 x2 y1 y2]^T ≤ 1
     [1 −1 1 −1] [x1 x2 y1 y2]^T − z ≤ 1 .
F is TQ-satisfiable iff the optimum is vG = 1^T g2 = 1.
We know that the point [x1 x2 y1 y2 z]^T = [0 0 0 0 0]^T is a vertex. It
satisfies all constraints and has defining constraints x1 , x2 , y1 , y2 , z ≥ 0.
The standard form of a linear program is
max c^T y
subject to
Ay ≤ b .
Let
y = [x z]^T ,
the vector stacking x above z. Hence, M0 of (8.1) in standard form is written

M0 : max 1^T [D2 −I] y    (8.2)
subject to
    [ −I      ]       [ 0  ]
    [      −I ] [x]   [ 0  ]
    [ D1      ] [z] ≤ [ g1 ]
    [ D2   −I ]       [ g2 ]

where blank regions of the matrices are filled with 0s; the objective row 1^T [D2 −I] is c^T, the constraint matrix is A, and the right-hand side is b.
Construction of u
To begin the ith iteration, we use the vertex v i to construct a vector u such
that u^T A = c^T. If u ≥ 0 then the Duality Theorem (Theorem 8.6) implies
that v i is optimal, as we discuss below, and the process terminates. However,
in all but the final iteration, u ≱ 0: at least one row of u is negative.
To construct u, choose one of v i's sets of defining constraints: an n × n
nonsingular submatrix Ai of A with corresponding rows bi of b such that
Ai v i = bi . (8.4)
Let R be the indices of the rows of A in Ai . Such a subset of constraints exists
because v i is a vertex.
[Figure 8.2: the feasible region of the example, the initial vertex v1, and the direction of increasing objective c^T x.]
Ai^T ui = c , (8.5)
then extend ui with 0s at the rows of A outside R to form u, so that
u^T A = c^T . (8.6)
max [−1 1] x
subject to
    [−1  0]       [0]
    [ 0 −1] x  ≤  [0] ,
    [ 2  1]       [2]
where c^T = [−1 1], the constraint matrix is A, and the right-hand side is b,
for which we know that v1 = [0 0]^T is a vertex.
The problem and initial vertex are visualized in Figure 8.2. The solid lines
represent the constraints of the problem, and the set of satisfying points cor-
responds to the interior of the triangle. The dashed line indicates c^T x; the
arrow points in the direction of increasing value.
Given vertex v1 = [0 0]^T, the first two constraints are the defining con-
straints of v1, so choose R = [1; 2]:
A1 = [−1 0; 0 −1] and b1 = [0; 0] .
Solving A1^T u1 = c of (8.5) yields u1 = [1 −1]^T. Then
u = [1 −1 0]^T ,
where the first two elements are from u1. Check that this u satisfies u^T A = c^T
of (8.6) as desired.
Example 8.11. Continuing from Examples 8.8 and 8.9, choose the first five
rows of A and b (R = [1; 2; 3; 4; 5]) since
    [−1  0  0  0  0] [0]   [0]
    [ 0 −1  0  0  0] [0]   [0]
    [ 0  0 −1  0  0] [0] = [0] ,
    [ 0  0  0 −1  0] [0]   [0]
    [ 0  0  0  0 −1] [0]   [0]
i.e., A1 v1 = b1. Solving A1^T u1 = c yields
u1^T = [−1 1 −1 1 1] .
Then
u = [−1 1 −1 1 1 0 0]^T ,
where the first five elements are from u1. Check that this u satisfies u^T A = c^T
of (8.6) as desired.
Case 1: u ≥ 0
We prove that in this case, v i is actually the optimal point with optimal value
c^T v i. The crux of the argument is the Duality Theorem (Theorem 8.6).
From equation (8.6), we have
c^T v i = u^T Av i .
We claim next that
u^T Av i = u^T b . (8.7)
First, from (8.4),
Ai v i = bi implies ui^T Ai v i = ui^T bi ,
so that equation (8.7) holds at rows R. For rows j ∉ R, we know that uj = 0
by construction, so that both
(u^T Av i)j = 0 and (u^T b)j = 0 ,
proving equation (8.7). Reasoning further,
u^T b ≥ min{y^T b : y ≥ 0 ∧ y^T A = c^T}
since u is a member of the set by (8.6) and the case u ≥ 0. By duality (Theorem
8.6),
min{y^T b : y ≥ 0 ∧ y^T A = c^T} = max{c^T x : Ax ≤ b} .
In summary, we have by (8.6), (8.7), and Theorem 8.6,
c^T v i = u^T Av i = u^T b ≥ min{y^T b : y ≥ 0 ∧ y^T A = c^T} = max{c^T x : Ax ≤ b} ,
which proves that v i is actually the optimal point with optimal value c^T v i.
Figure 8.3 illustrates this case: the vertex v i maximizes c^T x. The dashed
line illustrates the objective function c^T x; moving upward relative to it in-
creases its value. In this illustration, c^T x cannot be increased without leaving
the region defined by the constraints.
Case 2: u ≱ 0
In this case, v i is not the optimal point. Thus, we need to move along an
edge y to an adjacent vertex to increase the value of the objective function.
In moving to an adjacent vertex, we swap one of the defining constraints of
v i for another constraint to form the defining constraints of v i+1 .
In this second case, there exists some uk < 0. Let k be the lowest index
of u such that uk < 0 (it must be one of the indices of R since for all other
indices ℓ, uℓ = 0). Let k ′ be the index of the row of ui and Ai corresponding
to row k of u and A.
[Figure 8.3: the vertex v i is optimal: the objective c^T x cannot increase without leaving Ax ≤ b.]
Construction of y
Having fixed the indices k (row k of A and b) and k ′ (the corresponding row
k ′ of Ai and bi ) of the offending constraint, we seek a direction along which
to travel away from vertex v i and, in particular, away from the plane that
defines the kth constraint.
Define y to be the k′th column of −Ai^{−1}. To find y, solve
Ai y = −ek′ (8.8)
(where, recall, ek′ is the k ′ th unit vector, which consists of 0s except in position
k ′ ) for y. Thus,
aℓ y = 0 for every row aℓ of Ai except the k ′ th row
and
ak′ y = −1 for the k ′ th row ak′ of Ai .
The vector y provides the direction along which to move to the next vertex.
For all rows aℓ but the k′th row of Ai, aℓ y = 0 implies that moving in direction
y stays on the boundary of those constraints (i.e., both aℓ v i = bℓ and, in the next
step, aℓ v i+1 = bℓ), so aℓ will be a row in Ai+1. However, moving along y moves
inward from the boundary of the k ′ th constraint because ak′ y = −1. This
change is desirable, as this constraint is keeping u from being nonnegative.
Example 8.12. Let us examine
u1 = [1 −1]^T and u = [1 −1 0]^T
of Example 8.10. Since the second row of u is −1, we are in Case 2 with k = 2,
corresponding to row k′ = 2 of u1. Let y be the second column of −A1^{−1}: solve
    [−1  0]       [ 0]
    [ 0 −1] y  =  [−1] ,
i.e., A1 y = −e2,
[Figure 8.4: moving from v1 along direction y to the next vertex v2.]
for y, yielding y = [0 1]^T.
This y is visualized in Figure 8.4 by the dark solid arrow that points up
from v 1 . The vertical and horizontal lines are the defining constraints of v 1 ; in
moving in the direction y, we keep the vertical constraint for the next vertex
v 2 but drop the horizontal constraint. The diagonal constraint will become
the second of v 2 ’s defining constraints.
Example 8.13 continues Example 8.11: the first row of u is negative, so k = 1 and k′ = 1; solve A1 y = −e1
for y, yielding y = [1 0 0 0 0]^T.
Again we have two cases to consider: either the optimum is bounded (Case
2(a)) or it is unbounded (Case 2(b)).
Case 2(a): the optimum is bounded. In this case, we move along the edge y to a better vertex v i+1, according to
the objective function c^T x. However, there is a set of rows of A with indices
S such that for ℓ ∈ S, aℓ y > 0. For these constraints, moving in the direction
y actually moves toward leaving the satisfying region. These constraints limit
how far in direction y we can move. For example, in Figure 8.4, the diagonal
constraint limits how far we can move in direction y.
Construction of λi , v i+1
Choose λi to be the largest value such that
A(v i + λi y) ≤ b . (8.9)
At this largest λi, some constraint ℓ ∈ S becomes tight,
aℓ (v i + λi y) = bℓ , (8.10)
while every row m of A still satisfies
am (v i + λi y) ≤ bm . (8.11)
Set
v i+1 =def v i + λi y . (8.12)
Finally, we construct the defining constraints of v i+1 for the next iteration.
Construct submatrix Ai+1 of A from Ai : replace row ak′ of Ai with row aℓ of
A. Choose the corresponding rows of b for bi+1 . Row ℓ and the rows carried
over from this iteration comprise the defining constraints of v i+1 .
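The pieces of an iteration can be assembled into code. The following OCaml sketch (our illustration, not the book's implementation) performs one iteration; it assumes the Gaussian-elimination routine solve sketched in the linear-equations discussion, dense float arrays for A, b, c, and an int array r holding the indices R of the current defining constraints:

let dot u v =
  let s = ref 0.0 in
  Array.iteri (fun i ui -> s := !s +. ui *. v.(i)) u;
  !s

let transpose m =
  let rows = Array.length m and cols = Array.length m.(0) in
  Array.init cols (fun j -> Array.init rows (fun i -> m.(i).(j)))

(* one simplex iteration at vertex v with defining-constraint indices r *)
let simplex_step a b c v r =
  let n = Array.length c in
  let ai = Array.init n (fun i -> Array.copy a.(r.(i))) in
  (* (8.5): solve Ai^T ui = c *)
  let ui = solve (transpose ai) (Array.copy c) in
  (* lowest k' with ui_{k'} < 0; if none, u >= 0 and v is optimal (Case 1) *)
  let k' = ref (-1) in
  for i = n - 1 downto 0 do
    if ui.(i) < 0.0 then k' := i
  done;
  if !k' < 0 then `Optimal v
  else begin
    (* (8.8): the direction y solves Ai y = -e_{k'} *)
    let e = Array.make n 0.0 in
    e.(!k') <- -1.0;
    let y = solve (Array.map Array.copy ai) e in
    (* largest lambda with A(v + lambda y) <= b; record the limiting row l *)
    let lambda = ref infinity and l = ref (-1) in
    Array.iteri
      (fun i row ->
        let ay = dot row y in
        if ay > 1e-9 then begin
          let bound = (b.(i) -. dot row v) /. ay in
          if bound < !lambda then begin lambda := bound; l := i end
        end)
      a;
    if !l < 0 then `Unbounded y                       (* Case 2(b) *)
    else begin                                        (* Case 2(a) *)
      r.(!k') <- !l;                                  (* swap defining constraint *)
      `Step (Array.mapi (fun i vi -> vi +. !lambda *. y.(i)) v)
    end
  end

Iterating simplex_step from an initial vertex until it returns `Optimal or `Unbounded realizes the greedy traversal described above.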
A(v 1 + λ1 y) ≤ b ,
specifically
    [−1  0]  ( [0]      [0] )    [0]
    [ 0 −1]  ( [0] + λ1 [1] ) ≤  [0] ,
    [ 2  1]                      [2]
so the largest such λ1 is 2 (limited by the third row), yielding v2 = v1 + 2y = [0 2]^T.
This vertex is visualized in Figure 8.4. Choosing R = [1; 3] and replacing the
second row of A1 and b1 with the third row of Ax ≤ b yields
A2 = [−1 0; 2 1] and b2 = [0; 2] .
In Figure 8.4, the rows of A2 and b2 correspond to the vertical and diagonal
constraints, respectively, which are the defining constraints of v 2 .
In the next iteration, solving A2^T u2 = c yields u2 = [3 1]^T. Adding 0s for
rows not in R produces
u = [3 0 1]^T .
Since u ≥ 0, Case 1 applies: v2 = [0 2]^T is the optimum, with value c^T v2 = 2.
In general, the move strictly improves the objective:
c^T v i+1 = c^T (v i + λi y)
         = c^T v i + λi c^T y
         = c^T v i + λi (−uk)
         > c^T v i ,
since c^T y = ui^T Ai y = −uk > 0 (recall ui^T = c^T Ai^{−1} and uk < 0).
Case 2(b): if instead no constraint limits movement along y (S is empty), then Ay ≤ 0, so that for every λ ≥ 0,
A(v i + λy) = Av i + λAy ≤ b ,
since Av i ≤ b and λAy ≤ 0. Every point v i + λy thus satisfies the constraints, and
[Figure 8.5: (a) moving along y from v i to the adjacent vertex v i+1; (b) the unbounded case: the ray from v i along y stays within Ax ≤ b.]
c^T (v i + λy) = c^T v i + λ c^T y
             = c^T v i + λ u^T Ay
             = c^T v i − λ uk , with uk < 0,
which grows with λ (recall c^T = u^T A, and Ay = [0 · · · 0 −1 0 · · · 0]^T by
selection of y). Thus, the maximum is unbounded. Figure 8.5(b) illustrates
this case: all points along the ray labeled y satisfy the constraints, and moving
along the ray increases c^T x without bound.
Example 8.15. Continuing with y = [1 0 0 0 0]^T from Example 8.13, com-
pute Ay:
    [−1  0  0  0  0]       [−1]
    [ 0 −1  0  0  0] [1]   [ 0]
    [ 0  0 −1  0  0] [0]   [ 0]
    [ 0  0  0 −1  0] [0] = [ 0]
    [ 0  0  0  0 −1] [0]   [ 0]
    [−1  1  1 −1  0]       [−1]
    [ 1 −1  1 −1 −1]       [ 1]
to find that S = [7] since a7 y = 1 > 0. Thus, according to (8.10), examine the
seventh row of the constraints and choose the greatest λ1 such that
    [1 −1 1 −1 −1] (v1 + λ1 y) = λ1 = 1 ,
i.e., a7 (v1 + λ1 y) = b7, so λ1 = 1 and v2 = v1 + y = [1 0 0 0 0]^T. Then
c^T v1 = 0 < c^T v2 = 1 .
F : x + y ≥ 1 ∧ x − y ≥ −1 ,
the assignment
x = x1 − x2 = 1 − 0 = 1 and y = y1 − y2 = 0 − 0 = 0 ,
recovered from the nonnegative variables x1, x2, y1, y2, satisfies F.
Construction of M0
Because x and y are already constrained to be nonnegative, we do not need
to introduce new x1 , x2 , y1 , y2 . Rewrite the final three literals of F as two sets
of constraints:
[1 1] [x; y] ≤ [3]   and   [1 0; 0 1] [x; y] ≥ [2; 2] ,
where D1 = [1 1], g1 = [3], D2 = [1 0; 0 1], and g2 = [2; 2].
Iteration 1
The initial vertex is v1 = [0 0 0 0]^T with defining constraints R = [1; 2; 3; 4],
A1 = −I, and b1 = 0 (for example, the fourth row of [A1 | b1] is [0 0 0 −1 | 0]).
Solving A1^T u1 = c yields u1 = −c = [−1 −1 1 1]^T, so u ≱ 0, with k = 1 and
k′ = 1. Solving A1 y = −e1 yields y = [1 0 0 0]^T. Choose the largest λ1 such that
A(v 1 + λ1 y) ≤ b .
Focusing on the fifth and sixth rows of A (since S = [5; 6]), choose the largest
λ1 such that
    [1 1  0 0] (v1 + λ1 y) ≤ [3]     (rows 5, 6 of A and b)
    [1 0 −1 0]                [2]
with v1 = [0 0 0 0]^T and y = [1 0 0 0]^T; that is, λ1 ≤ 3 and λ1 ≤ 2. Hence
λ1 = 2, the sixth constraint becomes tight (ℓ = 6), and v2 = v1 + 2y = [2 0 0 0]^T.
For the next iteration, replace the first row of A1 (since k ′ = 1) with the sixth
row of A (since ℓ = 6) to produce
         [ 1  0 −1  0 ]            [ 2 ]
    A2 = [ 0 −1  0  0 ]   and b2 = [ 0 ] .
         [ 0  0 −1  0 ]            [ 0 ]
         [ 0  0  0 −1 ]            [ 0 ]
c^T v1 = 0 < 2 = c^T v2 .
Iteration 2
Now R = [6; 2; 3; 4]. Solve A2^T u2 = c to yield u2 = [1 −1 0 1]^T for rows
[6; 2; 3; 4]. Then filling in 0s for the other rows of A produces
u = [0 −1 0 1 0 1 0]^T ,
whose nonzero entries correspond to rows 2, 4, and 6 of A. The second component
of u is −1 (the lowest such index), so k = 2 and k′ = 2. Solving A2 y = −e2 yields
y = [0 1 0 0]^T; the largest feasible step is λ2 = 1 (the fifth constraint, x + y ≤ 3,
becomes tight, so ℓ = 5), and v3 = v2 + y = [2 1 0 0]^T.
Replace the second row of A2 (since k ′ = 2) with the fifth row of A (since
ℓ = 5) to produce
         [ 1  0 −1  0 ]            [ 2 ]
    A3 = [ 1  1  0  0 ]   and b3 = [ 3 ] .
         [ 0  0 −1  0 ]            [ 0 ]
         [ 0  0  0 −1 ]            [ 0 ]
c^T v1 = 0 < c^T v2 = 2 < c^T v3 = 3 .
Iteration 3
Now R = [6; 5; 3; 4]. Solve A3^T u3 = c, yielding u3 = [0 1 1 1]^T. Because
u3 ≥ 0, we are in Case 1: v3 is the optimum with objective value
c^T v3 = [1 1 −1 −1] [2 1 0 0]^T = 3 .
Since the optimum 3 is less than vG = 1^T g2 = 4, F is TQ-unsatisfiable.
⋆ 8.4.3 Complexity
8.5 Summary
This chapter covers linear programming and the simplex method for solving
linear programs. It covers:
• How decision procedures that reason only about conjunctive formulae ex-
tend to arbitrary Boolean structure by converting to DNF. Exercise 8.1
explores a more effective way of converting to DNF.
• A review of linear algebra.
• Linear programs, which are optimization problems with linear constraints
and linear objective functions. Application to TQ -satisfiability.
• The simplex method. Finding an initial point via a new linear program.
Greedy search along vertices.
The structure of the simplex method is markedly different from the struc-
ture of the quantifier elimination procedures of Chapter 7. It focuses on the
structure of the set of interpretations of the given formula, rather than on the
formula itself. Exercises 8.3, 8.4, and 8.5 explore these sets, which describe
polyhedra.
Bibliographic Remarks
Exercises
8.1 (⋆ Conjunctive quantifier-free formulae). Converting an arbitrary
quantifier-free Σ-formula to DNF and then applying a decision procedure to
each disjunct can be prohibitively expensive. In practice, SAT solvers (decision
procedures for propositional logic, such as DPLL) are used to extend a decision
procedure for conjunctive quantifier-free Σ-formula to arbitrary quantifier-free
Σ-formula.
(a) Show that the DNF of a formula F can be exponentially larger than F .
(b) Describe a procedure that, using a SAT solver, extracts conjunctive Σ-
formulae from a quantifier-free Σ-formula F . Using this procedure, each
discovered conjunctive formula’s T -satisfiability will be decided. If it is
T -satisfiable, then F is T -satisfiable, so the procedure finishes; otherwise,
the procedure finds another conjunctive formula.
(c) The proposed procedure is really no more efficient than simply converting
to DNF. This part explores an optimization. An unsatisfiable core of T -
unsatisfiable conjunctive Σ-formula G is the conjunction H of a subset of
literals of G such that (1) H is also T -unsatisfiable, and (2) the conjunction of
each strict subset of literals of H is T -satisfiable. Improve your procedure
from the previous part to use a function UnsatCoreT (G) that returns a
T -unsatisfiable core of a T -unsatisfiable conjunctive Σ-formula G.
(d) Given a decision procedure DPT for conjunctive quantifier-free Σ-formula,
describe a procedure UnsatCore(G) for computing an unsatisfiable core of
G that takes no more than a number of DPT calls linear in the number
of literals of G. Note that G can have multiple unsatisfiable cores; your
procedure need only return one.
Show that hull(P) is convex. Conclude that if Ax ≤ εb for each [x^T ε]^T ∈ P,
then hull(P) ⊆ {x : Ax ≤ b}.
The congruence closure algorithm is the basis for the other decision proce-
dures of this chapter as well. It is extended in Section 9.4 to decide satisfiability
in the quantifier-free fragment of the theory of recursive data structures TRDS ,
and in particular in the theory of lists Tcons . Finally, it is applied in Section 9.5
to decide satisfiability in the quantifier-free fragment of the theory of arrays
TA .
The quantifier-free fragment of TE and its satisfiability decision procedure
play a central role in combining theories that share the equality predicate. We
discuss the combination of theories in Chapter 10.
ΣE : {=, a, b, c, . . . , f, g, h, . . . , p, q, r, . . .} ,
consists of
• =, a binary predicate;
• and all constant, function, and predicate symbols.
As in every other theory, ΣE -formulae are constructed from symbols of the
signature, variables, logical connectives, and quantifiers.
The equality predicate = is interpreted, or given meaning, via the axioms
of TE . The axioms
1. ∀x. x = x (reflexivity)
2. ∀x, y. x = y → y = x (symmetry)
3. ∀x, y, z. x = y ∧ y = z → x = z (transitivity)
define = to be an equivalence relation. These axioms give = the expected
meaning of equality on pairs of variable terms.
However, they do not provide the full meaning for = in the context of
function terms, such as in f (x) = f (g(y, z)). The following axiom schema
stands for an infinite but countable set of axioms:
4. for each positive integer n and n-ary function symbol f ,
∀x, y. (⋀_{i=1}^{n} xi = yi) → f(x) = f(y)    (function congruence)
For example, two instances of this axiom schema are the following:
∀x, y. x = y → f(x) = f(y)
and
∀x1, x2, y1, y2. x1 = y1 ∧ x2 = y2 → g(x1, x2) = g(y1, y2) .
Then
x = g(y, z) → f(x) = f(g(y, z))
is TE -valid by the first instance. Alternately,
x = g(y, z) ∧ f(x) ≠ f(g(y, z))
is TE-unsatisfiable, where t1 ≠ t2 abbreviates ¬(t1 = t2). This axiom schema
makes = a congruence relation.
Finally, observe that the logical operator ↔ should behave on predicate
formulae similarly to the way = behaves on function terms. For example, our
intuition asserts that
x = y → (p(x) ↔ p(y)) (9.1)
should be TE -valid. In Chapter 3, we list a fifth axiom schema:
5. for each positive integer n and n-ary predicate symbol p,
∀x, y. (⋀_{i=1}^{n} xi = yi) → (p(x) ↔ p(y))    (predicate congruence)
x = y → (p(x) ↔ p(y)) ,
Similarly, transform
into
9.2.1 Relations
Binary Relations
Example 9.2. Consider the set Z of integers and the equivalence relation ≡2
such that m ≡2 n iff m and n have the same parity:
m, n ∈ Z are related iff they are both even or both odd. The equivalence class
of 3 under ≡2 is
[3]≡2 = {n ∈ Z : n is odd} .
A partition P of S is a set of nonempty subsets of S that is total,
⋃_{S′∈P} S′ = S ,
and disjoint,
∀S1, S2 ∈ P. S1 ≠ S2 → S1 ∩ S2 = ∅ .
The quotient S/R of S by the equivalence (congruence) relation R is a par-
tition of S: it is a set of equivalence (congruence) classes
S/R = {[s]R : s ∈ S} .
Example 9.3. The quotient Z/ ≡2 is a partition: it is the set of equivalence
classes
{{n ∈ Z : n is odd}, {n ∈ Z : n is even}} .
Just as an equivalence relation R induces a partition S/R of S, a given
partition P of S induces an equivalence relation over S. Specifically, s1 Rs2 iff
for some S ′ ∈ P , both s1 , s2 ∈ S ′ .
Relation Refinements
Consider two binary relations R1 and R2 over set S. R1 is a refinement of
R2 , or R1 ≺ R2 , if
∀s1 , s2 ∈ S. s1 R1 s2 → s1 R2 s2 .
We also say that R1 refines R2 . Viewing the relations as sets of pairs, R1 ⊆
R2 .
Example 9.4. For S = {a, b}, R1 : {aR1 b} ≺ R2 : {aR2 b, bR2 b}.
Viewing the relations as sets of pairs, R1 ≺ R2 iff R1 ⊆ R2 .
Example 9.5. Consider set S, the relation
R1 : {sR1 s : s ∈ S}
induced by the partition
P1 : {{s} : s ∈ S} ,
and the relation
R2 : {sR2 t : s, t ∈ S}
induced by the partition
P2 : {S} .
Then R1 ≺ R2 .
Closures
The equivalence closure RE of the binary relation R over S is the equiva-
lence relation such that
• R refines RE : R ≺ RE ;
• for all other equivalence relations R′ such that R ≺ R′ , either R′ = RE or
RE ≺ R′ .
That is, RE is the “smallest” equivalence relation that “covers” R.
Example 9.7. If S = {a, b, c, d} and
R = {aRb, bRc, dRd} ,
then
• aRb, bRc, dRd ∈ RE since R ⊆ RE ;
• aRa, bRb, cRc ∈ RE by reflexivity;
• bRa, cRb ∈ RE by symmetry;
• aRc ∈ RE by transitivity;
• cRa ∈ RE by symmetry.
Hence,
RE = {aRb, bRa, aRa, bRb, bRc, cRb, cRc, aRc, cRa, dRd} .
The congruence closure RC of R is the “smallest” congruence relation
that “covers” R. Shortly, we shall illustrate the congruence closure of a term
set.
Given a set of equalities
{s1 = t1 , . . . , sm = tm } ,
the algorithm constructs a congruence relation ∼ over the subterm set such that
∼ |= s1 = t1 ∧ · · · ∧ sm = tm .
Begin with the initial partition
{{s} : s ∈ SF }
in which each term of SF is its own congruence class. Then, for each i ∈
{1, . . . , m}, impose si = ti by merging the congruence classes
Construct the following initial partition by letting each member of the subterm
set SF be its own class:
to form partition
From the union {a, f³(a)}, deduce the following congruence propagations:
and
F : f(x) = f(y) ∧ x ≠ y .
The union {f (x), f (y)} does not yield any new congruences, so the final par-
tition is
Does
is a DAG.
node
node i returns the node n with id i. Thus, (node i).id = i. For example, in
Figure 9.1(b), (node 2).find = 3.
find
The find function returns the representative of a node’s equivalence class. It
follows find edges until it finds a self-loop:
let rec find i =
let n = node i in
if n.find = i then i else find n.find
Example 9.13. In the DAG of Figure 9.1(b), find 2 is 3. find follows the
find edge of 2 to 3; then it recognizes the self-loop and thus returns 3.
union
The union function returns the union of two equivalence classes, given two
node identities i1 and i2 . It first finds the representatives n1 and n2 of i1 ’s
and i2 ’s equivalence classes, respectively. Next, it sets n1 ’s find to n2 ’s repre-
sentative, which is the identity of n2 itself. Now n2 represents the new larger
equivalence class.
Finally, it combines the congruence closure parents, storing the new set
in n2 ’s ccpar field because n2 is the representative of the union equivalence
class. This last step is not strictly part of the union-find algorithm (which,
recall, computes equivalence classes); rather, it is intended for when we use
the union-find algorithm to compute congruence classes. In code,
let union i1 i2 =
let n1 = node (find i1 ) in
let n2 = node (find i2 ) in
n1 .find ← n2 .find;
n2 .ccpar ← n1 .ccpar ∪ n2 .ccpar;
n1 .ccpar ← ∅
ccpar
The simple function ccpar i returns the parents of all nodes in i’s congruence
class:
let ccpar i =
(node (find i)).ccpar
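For concreteness, here is one way (our illustration, not the book's πVC code) to realize the node representation and these operations in plain OCaml, storing the DAG in a hash table keyed by node id:

type node = {
  id : int;
  mutable find : int;          (* edge toward the class representative *)
  mutable ccpar : int list;    (* congruence-closure parents, if representative *)
  fn : string;                 (* the labeling function symbol *)
  args : int list;             (* ids of the argument nodes *)
}

let dag : (int, node) Hashtbl.t = Hashtbl.create 16
let node i = Hashtbl.find dag i

let rec find i =
  let n = node i in
  if n.find = i then i else find n.find

let union i1 i2 =
  let n1 = node (find i1) in
  let n2 = node (find i2) in
  n1.find <- n2.find;
  n2.ccpar <- n1.ccpar @ n2.ccpar;   (* list append may keep duplicates; a set suffices *)
  n1.ccpar <- []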
We are ready to build the congruence closure algorithm using the basic oper-
ations. First we define the congruent function to check whether two nodes
that are not in the same congruence class are in fact congruent. Then we
define the merge function to merge two congruence classes and to propagate
the effects of new congruences recursively.
congruent
let congruent i1 i2 =
let n1 = node i1 in
let n2 = node i2 in
n1 .fn = n2 .fn
∧ |n1 .args| = |n2 .args|
∧ ∀i ∈ {1, . . . , |n1 .args|}. find n1 .args[i] = find n2 .args[i]
Example 9.15. Consider the DAG of Figure 9.1(b). Are nodes 1 and 2 con-
gruent? congruent notes that
• their fn fields are both f : n1 .fn = n2 .fn = f ;
• their numbers of arguments are both 2;
• their left arguments f (a, b) and a are both congruent to 3:
n1 .args = [2; 4], n2 .args = [3; 4], and find 2 = find 3 = 3;
merge
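Following that description, merge can be sketched as follows (our rendering, in the style of the functions above): merge the classes of i1 and i2, then recursively merge any pairs of parents that the union makes congruent.

let rec merge i1 i2 =
  if find i1 ≠ find i2 then begin
    let p1 = ccpar i1 in
    let p2 = ccpar i2 in
    union i1 i2;
    (* propagate: parents that have become congruent must also be merged *)
    foreach t1 , t2 ∈ p1 × p2 do
      if find t1 ≠ find t2 ∧ congruent t1 t2 then merge t1 t2
    done
  end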
Given ΣE -formula
F : s1 = t1 ∧ · · · ∧ sm = tm ∧ sm+1 ≠ tm+1 ∧ · · · ∧ sn ≠ tn
in which each term is its own congruence class. The DAG of Figure 9.1(a)
represents this initial partition.
According to the literal f(a, b) = a, merge 2 3. find 2 ≠ find 3, so let
and thus the congruence relation in which a, f (a, b), and f (f (a, b), b) are
congruent. Does
which induces the initial partition and DAG shown in Figure 9.2(a). Accord-
ing to the literal f (f (f (a))) = a, merge 3 0. On this initial merge
Additionally, the labels of both 4 and 1 are f , and their arguments, 3 and 0
respectively, are congruent after union 3 0. Thus, recursively merge 4 1. After
union 4 1, their parents 5 and 2 are congruent, so merge 5 2. The recursion
finishes after union 5 2 since P5 = ∅, resulting in the DAG of Figure 9.2(b).
⋆ 9.3.5 Complexity
Let e be the number of edges and n be the number of nodes in the initial
DAG.
Theorem 9.19 (Complexity). The congruence closure algorithm runs in
time O(e2 ) for O(n) merges.
However, Downey, Sethi, and Tarjan described an algorithm with O(e log e)
average running time for O(n) merges. Computing TE -satisfiability is inex-
pensive.
Recursive data structures include records, lists, trees, stacks, and queues. The
theory TRDS can model records, lists, trees, and stacks, but not queues: whereas
a particular list has a single representation, queues — in which only order
matters — do not. In this section, we discuss the theory of lists Tcons for
ease of exposition. Both its axiomatization and the decision procedure for the
quantifier-free fragment rely on our discussion of TE .
Recall that the signature of Tcons is
Σcons : {cons, car, cdr, atom, =} ,
where
• cons is a binary function, called the constructor; cons(a, b) represents the
list constructed by prepending a to b;
• car is a unary function, called the left projector: car(cons(a, b)) = a;
• cdr is a unary function, called the right projector: cdr(cons(a, b)) = b;
• atom is a unary predicate;
• and = is a binary predicate.
Its axioms are the following:
1. the axioms of (reflexivity), (symmetry), and (transitivity) of TE
2. instantiations of the (function congruence) axiom schema for cons, car, and
cdr:
∀x1 , x2 , y1 , y2 . x1 = x2 ∧ y1 = y2 → cons(x1 , y1 ) = cons(x2 , y2 )
[Figure 9.3: by the projection axioms, car and cdr nodes over a cons(x, y) node merge with x and y, respectively.]
F : s1 = t1 ∧ · · · ∧ sm = tm ∧ sm+1 ≠ tm+1 ∧ · · · ∧ sn ≠ tn
∧ atom(u1) ∧ · · · ∧ atom(uℓ)
in which si , ti , and ui are Tcons -terms. To decide its Tcons -satisfiability, perform
the following steps:
1. Construct the initial DAG for the subterm set SF .
2. For each node n such that n.fn = cons,
• add car(n) to the DAG and merge car(n) n.args[1];
• add cdr(n) to the DAG and merge cdr(n) n.args[2]
by the (left projection) and (right projection) axioms. See Figure 9.3.
3. For i ∈ {1, . . . , m}, merge si ti .
4. For i ∈ {m + 1, . . . , n}, if find si = find ti , return unsatisfiable.
5. For i ∈ {1, . . . , ℓ} if ∃v. find v = find ui ∧ v.fn = cons, return unsatis-
fiable by axiom (atom).
6. Otherwise, return satisfiable.
Steps 1, 3, 4, and 6 are identical to Steps 1-4 of the decision procedure for TE .
Because of their similarity, it is simple to combine the two theories.
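Step 2 can be made explicit in the style of the earlier DAG operations. In the following sketch (ours; add_node is an assumed helper that allocates a node with the given label and argument ids and returns its id):

foreach node n with n.fn = cons do
  let c = add_node car [n.id] in    (* a car(n) node *)
  merge c n.args[1];
  let d = add_node cdr [n.id] in    (* a cdr(n) node *)
  merge d n.args[2]
done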
[Figure 9.4: (a) the initial DAG of F′; (b) the DAG after Step 2 adds projection nodes.]
The first two and final two literals imply that u1 = u2 and v1 = v2 so that
again x = y. The remaining reasoning is as for F .
Let us apply the decision procedure to F ′ . The initial DAG of F ′ is dis-
played in Figure 9.4(a). Figure 9.4(b) displays the DAG after Step 2.
According to the literals car(x) = car(y) and cdr(x) = cdr(y), compute
merge car(x) car(y) and merge cdr(x) cdr(y) ,
which add the two dashed arrows on the top of Figure 9.5(a). Then according
to literal x = cons(u1 , v1 ),
merge x cons(u1 , v1 ) ,
which adds the dashed arrow from x to cons in Figure 9.5(a). Consequently,
car(x) and car(cons(u1 , v1 )) become congruent. Since
the find of car(y) is set to point to u1 during the subsequent union, resulting
in the left dotted arrow of Figure 9.5(a). Similarly, cdr(x) and cdr(cons(u1 , v1 ))
become congruent, with similar effects (the right dotted arrow of Figure
[Figure 9.5: (a) merges arising from the literals of F′ (dashed) and deduced merges (dotted); (b) the final DAG.]
9.5(a)). The state of the DAG after these merges is shown in Figure 9.5(a).
Dashed lines indicate merges that arise directly from the literals of F ′ ; dotted
lines indicate deduced merges.
Next, according to the literal y = cons(u2 , v2 ),
merge y cons(u2 , v2 ) ,
resulting in the new dashed line from y to cons(u2 , v2 ) in Figure 9.5(b). This
merge produces two new congruences:
Trace through the actions of these merges to understand the addition of the
two bottom dotted arrows from u1 to u2 and from v1 to v2 in Figure 9.5(b).
During the computation, x and y become congruent, so that also
f(x) = f(y) ;
merge f (x) f (y) produces the final dotted edge from f (x) to f (y). Figure
9.5(b) displays the final DAG.
Does this DAG model F? No: find f(x) and find f(y) are both f(y), so
that f(x) ∼ f(y); however, F asserts that f(x) ≠ f(y). F is thus (Tcons ∪ TE)-
unsatisfiable.
9.5 Arrays
The signature of TA is
ΣA : {·[·], ·⟨· ⊳ ·⟩, =} ,
where
• a[i] is a binary function: a[i] represents the value of array a at position i;
• a⟨i ⊳ v⟩ is a ternary function: a⟨i ⊳ v⟩ represents the modified array a in
which position i has value v;
• and = is a binary predicate.
The axioms of TA are the following:
1. the axioms of (reflexivity), (symmetry), and (transitivity) of TE
2. ∀a, i, j. i = j → a[i] = a[j] (array congruence)
3. ∀a, v, i, j. i = j → a⟨i ⊳ v⟩[j] = v (read-over-write 1)
4. ∀a, v, i, j. i ≠ j → a⟨i ⊳ v⟩[j] = a[j] (read-over-write 2)
We consider TA -satisfiability in the quantifier-free fragment of TA . As usual,
we consider only conjunctive ΣA -formulae since conversion to DNF extends
the decision procedure to arbitrary quantifier-free ΣA -formulae.
The decision procedure for TA -satisfiability of quantifier-free ΣA -formula
F is based on a reduction to TE -satisfiability via applications of the (read-over-
write) axioms. Intuitively, if F does not contain any write terms, then the read
terms can be viewed as uninterpreted function terms. Otherwise, any write
term must occur in the context of a read — as a read-over-write term a⟨i ⊳ v⟩[j]
— since arrays themselves cannot be asserted to be equal or not equal. In this
case, the (read-over-write) axioms can be applied to deconstruct the read-
over-write terms. In detail, to decide the TA -satisfiability of F , perform the
following recursive steps.
Step 1
If F does not contain any write terms a⟨i ⊳ v⟩, perform the following steps:
1. Associate each array variable a with a fresh function symbol fa , and re-
place each read term a[i] with fa (i).
2. Decide and return the TE -satisfiability of the resulting formula.
Step 2
Select some read-over-write term a⟨i ⊳ v⟩[j] (recall that a may itself be a write
term), and split on two cases:
(a) According to (read-over-write 1), replace
F[a⟨i ⊳ v⟩[j]] with F1 : F[v] ∧ i = j ,
and recurse on F1. If F1 is found to be TA-satisfiable, return satisfiable.
(b) According to (read-over-write 2), replace
F[a⟨i ⊳ v⟩[j]] with F2 : F[a[j]] ∧ i ≠ j ,
and recurse on F2. If F2 is found to be TA-satisfiable, return satisfiable.
If both F1 and F2 are found to be TA -unsatisfiable, return unsatisfiable.
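As a small illustration (ours; compare Exercise 9.8(a)), consider F : a⟨i ⊳ e⟩[j] = e ∧ i ≠ j. Case (a) yields F1 : e = e ∧ i ≠ j ∧ i = j, which is TA-unsatisfiable; case (b) yields F2 : a[j] = e ∧ i ≠ j ∧ i ≠ j, which Step 1 reduces to the TE-satisfiable formula fa(j) = e ∧ i ≠ j. Hence F is TA-satisfiable.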
The power of disjunction manifested in Steps 2(a) and 2(b) of the decision
procedure results in this complexity.
vP ≠ v¬P
(¬P ∨ Q ∨ ¬R) .
9.6 Summary
This chapter presents the congruence closure algorithm and applications to
deciding satisfiability in the quantifier-free fragments of TE , Tcons , and TA . It
covers:
• The congruence closure algorithm at an abstract level. Relations, equiva-
lence relations, congruence relations. Partitions; equivalence and congru-
ence classes. Closures.
Bibliographic Remarks
The quantifier-free fragment of TE was first proved decidable by Ackermann
in 1954 [1] and later studied by various teams in the late 1970s. Shostak [83],
Nelson and Oppen [66], and Downey, Sethi, and Tarjan [29] present alternate
solutions to the problem. We discuss the method of Nelson and Oppen [66].
Oppen presents a theory of acyclic recursive data structures [69, 71]. The
decision problem in the quantifier-free fragment of this theory is decidable in
linear time, and the full theory is decidable. Our presentation is based on the
work of Nelson and Oppen [66] for possibly-cyclic data structures.
McCarthy proposes the axiomatization of arrays based on read-over-write
[58]. James King implemented the decision procedure for the quantifier-free
fragment as part of his thesis work [50].
Exercises
9.1 (DP for TE ). Apply the decision procedure for TE to the following ΣE -
formulae. Provide a level of detail as in Example 9.10.
(a) f(x, y) = f(y, x) ∧ f(a, y) ≠ f(y, a)
(b) f(g(x)) = g(f(x)) ∧ f(g(f(y))) = x ∧ f(y) = x ∧ g(f(x)) ≠ x
(c) f(f(f(a))) = f(f(a)) ∧ f(f(f(f(a)))) = a ∧ f(a) ≠ a
(d) f(f(f(a))) = f(a) ∧ f(f(a)) = a ∧ f(a) ≠ a
(e) p(x) ∧ f(f(x)) = x ∧ f(f(f(x))) = x ∧ ¬p(f(x))
9.3 (⋆ Undecidable fragment). Show that allowing even one quantifier al-
ternation (i.e., ∃x1 , . . . , xk . ∀y1 , . . . , yn . F [x, y]) makes satisfiability in TE un-
decidable.
9.4 (⋆ DAG). Describe a data structure and algorithm for constructing the
initial DAG in the congruence closure procedure. It should run in time ap-
proximately linear in the size of the formula.
9.6 (DP for Tcons ). Apply the decision procedure for Tcons to the following
Tcons -formulae. Provide a level of detail as in Example 9.20.
(a) car(x) = y ∧ cdr(x) = z ∧ x ≠ cons(y, z)
(b) ¬atom(x) ∧ car(x) = y ∧ cdr(x) = z ∧ x ≠ cons(y, z)
¹ Suggested by a typo in [66].
9.8 (DP for quantifier-free TA ). Apply the decision procedure for quantifier-
free TA to the following ΣA -formulae.
(a) a⟨i ⊳ e⟩[j] = e ∧ i ≠ j
(b) a⟨i ⊳ e⟩[j] = e ∧ a[j] ≠ e
(c) a⟨i ⊳ e⟩[j] = e ∧ i ≠ j ∧ a[j] ≠ e
(d) a⟨i ⊳ e⟩⟨j ⊳ f⟩[k] = g ∧ j ≠ k ∧ i = j ∧ a[k] ≠ g
(e) i1 = j ∧ a[j] = v1 ∧ a⟨i1 ⊳ v1⟩⟨i2 ⊳ v2⟩[j] ≠ a[j]
10
Combining Decision Procedures
Σ1 ∩ Σ2 = {=} .
Σ2 : {a, b, =} ,
The signatures of TE and TZ only share =. Also, both theories are stably
infinite. Hence, the N-O combination of the decision procedures for TE and TZ
decides the (TE ∪ TZ )-satisfiability of F .
Intuitively, F is (TE ∪ TZ )-unsatisfiable. For the first two literals imply
x = 1 ∨ x = 2 so that f (x) = f (1) ∨ f (x) = f (2). Yet the last two literals
contradict this conclusion.
F [f (t1 , . . . , t, . . . , tn )] =⇒ F [f (t1 , . . . , w, . . . , tn )] ∧ w = t
F [p(t1 , . . . , t, . . . , tn )] =⇒ F [p(t1 , . . . , w, . . . , tn )] ∧ w = t
F [s = t] =⇒ F [w = t] ∧ w = s
1 ≤ x, x ≤ 2, w1 = 1, and w2 = 2
FZ : 1 ≤ x ∧ x ≤ 2 ∧ w1 = 1 ∧ w2 = 2
w1 = x + y ∧ w1 = f (x) .
α(V, E) : ⋀_{u E v} u = v ∧ ⋀_{¬(u E v)} u ≠ v , for u, v ∈ V ,
which asserts that variables related by E are equal and that variables unre-
lated by E are not equal. The formula F is (T1 ∪ T2 )-satisfiable iff there exists
an equivalence relation E of V such that
• F1 ∧ α(V, E) is T1 -satisfiable, and
• F2 ∧ α(V, E) is T2 -satisfiable.
Otherwise, F is (T1 ∪ T2 )-unsatisfiable.
Example 10.7. Consider (ΣE ∪ ΣZ )-formula
F : 1 ≤ x ∧ x ≤ 2 ∧ f(x) ≠ f(1) ∧ f(x) ≠ f(2) .
Phase 1 separates this formula into the ΣZ -formula
FZ : 1 ≤ x ∧ x ≤ 2 ∧ w1 = 1 ∧ w2 = 2
and the ΣE -formula
FE : f(x) ≠ f(w1) ∧ f(x) ≠ f(w2) ,
with
V = shared(FZ , FE ) = {x, w1 , w2 } .
There are 5 equivalence relations to consider, which we list by stating the
partitions:
FZ : w1 + w2 = z ,
with
V = shared(Fcons , FZ ) = {z, w1 , w2 } .
The arrangement
α(V, E) : z ≠ w1 ∧ z ≠ w2 ∧ w1 ≠ w2
satisfies both Fcons and FZ : Fcons ∧α(V, E) is Tcons -satisfiable, and FZ ∧α(V, E)
is TZ -satisfiable. Hence, F is (Tcons ∪ TZ )-satisfiable.
into ΣZ -formula
FZ : w1 = x + y ∧ x ≤ y + z ∧ x + z ≤ y ∧ y = 1 ∧ w2 = 2
and ΣE -formula
Then
V = shared(FZ , FE ) = {x, w1 , w2 } .
x ≠ w1 ∧ x ≠ w2 ∧ w1 = w2 ,
so F is (TE ∪ TZ )-satisfiable.
then
Example 10.10. The theory of integers TZ is not convex. For consider the
quantifier-free conjunctive ΣZ-formula
F : 1 ≤ z ∧ z ≤ 2 ∧ u = 1 ∧ v = 2 .
Then
F ⇒ z=u ∨ z=v ,
but neither
F ⇒ z = u nor F ⇒ z=v .
Example 10.11. The theory of arrays TA is not convex. For consider the
quantifier-free conjunctive ΣA -formula
F : a⟨i ⊳ v⟩[j] = v .
Then
F ⇒ i = j ∨ a[j] = v ,
but neither
F ⇒ i = j nor F ⇒ a[j] = v .
Example 10.12. ⋆ The theory of rationals TQ is convex, as it is convex in a
geometric sense (see Chapter 8).
Each equality ui = vi of the disjunction G of (10.1) is geometrically convex,
but G itself is not. Consider, for example,
H: x=y ∨ x=z .
from Sx=y and Sx=z, respectively, such that neither is in their intersection
Sx=y=z (i.e., v1 ≠ u and v2 ≠ w). Then for any λ ∈ (0, 1), the point
Fi ∧ E ⇒ u = v .
The central manager then propagates this new equality to the other decision
procedure.
If Tj is not convex, Pj discovers a new disjunction of equalities S when
Fj ∧ E ⇒ ⋁_{ui=vi ∈ S} (ui = vi) ,
for shared variables ui and vi . In this case, the central manager must split the
disjunction and search along multiple branches. Each branch assumes one of
the disjuncts. The search along a branch ends either when a full arrangement
is discovered (so the original formula is (T1 ∪ T2 )-satisfiable; see below) or
when all sub-branches end in contradiction (Ti -unsatisfiability for some i).
In the latter case, the central manager tries another branch. If no branches
remain to try, then the central manager declares the original formula to be
(T1 ∪ T2 )-unsatisfiable.
If at some point, neither P1 nor P2 finds a new equality (or a disjunction
of equalities in the non-convex case), then the central manager concludes that
the given formula is (T1 ∪ T2 )-satisfiable. For if E is the set of all learned
equalities, S is the set of all possible remaining equalities, and
F1 ∧ E ⇏ ⋁_{ui=vi ∈ S} (ui = vi) and F2 ∧ E ⇏ ⋁_{ui=vi ∈ S} (ui = vi)
(which must hold when no new disjunctions of equalities are discovered), then
F1 ∧ E ∧ ⋀_{ui=vi ∈ S} (ui ≠ vi) and F2 ∧ E ∧ ⋀_{ui=vi ∈ S} (ui ≠ vi)
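To make the propagation loop concrete, here is a minimal OCaml sketch for two convex theories (our illustration; the oracle functions stand in for the decision procedures P1 and P2, and the representation of shared equalities as string pairs is an assumption):

type eq = string * string      (* an equality u = v between shared variables *)

type report =
  | Contradiction              (* Fi ∧ E is Ti-unsatisfiable *)
  | Implied of eq              (* Fi ∧ E ⇒ u = v for a new shared equality *)
  | NothingNew

(* returns true iff F1 ∧ F2 is satisfiable in the union theory *)
let rec nelson_oppen (p1 : eq list -> report)
                     (p2 : eq list -> report)
                     (e : eq list) : bool =
  match p1 e, p2 e with
  | Contradiction, _ | _, Contradiction -> false
  | Implied eq, _ | _, Implied eq -> nelson_oppen p1 p2 (eq :: e)
  | NothingNew, NothingNew -> true

For non-convex theories, the Implied case would instead carry a disjunction, and the loop would branch on each disjunct as described above.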
F is (TE ∪TQ )-unsatisfiable: the final three literals imply that z = 0 and x = y,
so that f (x) = f (y). But then from the first literal, f (0) 6= f (0) since both
f (x) − f (y) and z equal 0.
Phase 1 separates F into two formulae. According to transformation 1, it
replaces f (x) by u, f (y) by v, and u − v by w, resulting in ΣE -formula
and ΣQ -formula
FQ : x ≤ y ∧ y + z ≤ x ∧ 0 ≤ z ∧ w = u − v ,
with
V = shared(FE , FQ ) = {x, y, z, u, v, w} .
Recall that TE and TQ are convex theories. The decision procedure PQ for
TQ discovers
FQ ⇒ x = y
from x ≤ y ∧ y + z ≤ x ∧ 0 ≤ z, so
E1 : x = y .
FE ∧ E1 ⇒ u = v ,
yielding
E2 : x = y ∧ u = v .
But then
FQ ∧ E2 ⇒ z = w
E3 : x = y ∧ u = v ∧ z = w
{}
FQ |= x = y
{x = y}
FE ∧ x = y |= u = v
{x = y, u = v}
FQ ∧ u = v |= z = w
{x = y, u = v, z = w}
FE ∧ z = w |= ⊥
⊥
FE ∧ E3 ⇒ ⊥ ;
FZ : 1 ≤ x ∧ x ≤ 2 ∧ w1 = 1 ∧ w2 = 2
with
V = shared(FZ , FE ) = {x, w1 , w2 } .
FZ ⇒ x = w1 ∨ x = w2 ,
{}
  branch x = w1: {x = w1}; FE ∧ x = w1 |= ⊥, so this branch closes with ⊥
  branch x = w2: {x = w2}; FE ∧ x = w2 |= ⊥, so this branch closes with ⊥
The split (⋆) is justified by FZ |= x = w1 ∨ x = w2.
On the first branch, assume E1a : x = w1. Then
FE ∧ E1a ⇒ ⊥ ,
so try the second branch, E1b : x = w2. Again
FE ∧ E1b ⇒ ⊥ ,
so the original formula is (TE ∪ TZ)-unsatisfiable.
FZ : 1 ≤ x ∧ x ≤ 3 ∧ w1 = 1 ∧ w2 = 2 ∧ w3 = 3
with
V = shared(FZ , FE ) = {x, w1 , w2 , w3 } .
FZ ⇒ x = w1 ∨ x = w2 ∨ x = w3 .
On the first branch, assume E1a : x = w1. PE finds that
FE ∧ E1a ⇒ ⊥ ,
so try the second branch, E1b : x = w2. Now
FZ ∧ E1b ⇏ x = w1 ∨ x = w3 ∨ w1 = w2 ∨ w1 = w3 ∨ w2 = w3
and
FE ∧ E1b ⇏ x = w1 ∨ x = w3 ∨ w1 = w2 ∨ w1 = w3 ∨ w2 = w3 ;
hence
FZ ∧ E1b ∧ x ≠ w1 ∧ x ≠ w3 ∧ w1 ≠ w2 ∧ w1 ≠ w3 ∧ w2 ≠ w3
is TZ-satisfiable, and
FE ∧ E1b ∧ x ≠ w1 ∧ x ≠ w3 ∧ w1 ≠ w2 ∧ w1 ≠ w3 ∧ w2 ≠ w3
is TE-satisfiable, so the original formula is (TE ∪ TZ)-satisfiable.
Fj ∧ E ⇒ ui = vi .
{}
  branch x = w1: {x = w1}; FE ∧ x = w1 |= ⊥, so this branch closes with ⊥
  branch x = w2: {x = w2}; no contradiction arises
  branch x = w3: {x = w3}; FE ∧ x = w3 |= ⊥, so this branch closes with ⊥
The split (⋆) is justified by FZ |= x = w1 ∨ x = w2 ∨ x = w3.
⋆ 10.4 Correctness of the Nelson-Oppen Method
In this section, we prove the correctness of the Nelson-Oppen combination
method. We reason at the level of arrangements, which is more suited to the
nondeterministic version of the method. However, Section 10.3 shows how to
construct an arrangement in the deterministic version, as well, so the following
proof can be extended to the deterministic version. We also focus on the second
phase of the nondeterministic procedure, which chooses an arrangement if
one exists. We thus assume that the variable abstraction phase is correct: it
S1 ∧ F1 ⇒ H ′ and S2 ∧ H ′ ⇒ ¬F2 .
Gi : G{x 7→ yi } .
G′ : G0 ∨ G1 ∨ · · · ∨ Gn .
For G′ asserts that x is either equal to some free variable yi or not. Because
we consider only interpretations with infinite domains, it is always possible
for x not to equal any yi .
By Section 7.1, we have a weak quantifier elimination procedure over the
pure equality fragment of T . It is weak because equivalence is only guaranteed
to hold on infinite interpretations.
F : x ≠ y ∧ (∀z. z = x ∨ z = y) .
Consider the existential subformula
G : ∃z. z ≠ x ∧ z ≠ y ,
for which
G0 : ¬⊥ ∧ ¬⊥ ⇔ ⊤
and
Gx : x ≠ x ∧ x ≠ y ⇔ ⊥    Gy : y ≠ x ∧ y ≠ y ⇔ ⊥ .
Then
G′ : G0 ∨ Gx ∨ Gy ⇔ ⊤ ,
so F reduces to
x ≠ y ∧ ¬(⊤) ⇔ ⊥ .
S1 ∧ F1 ⇒∗ H and S2 ∧ H ⇒∗ ¬F2 .
S1 ∧ F1 ∧ K and S2 ∧ F2 ∧ K
S2 ∧ H ⇒∗ ¬F2 ,
or, rearranging,
S2 ∧ F2 ⇒∗ ¬H .
S2 ∧ F2 ⇒∗ ¬K .
⋆ 10.5 Complexity
Assume that T1 and T2 are stably infinite theories such that Σ1 ∩ Σ2 = {=}.
Also, they have decision procedures P1 and P2 for their respective conjunctive
quantifier-free fragments.
10.6 Summary
Combining decision procedures in a general and efficient manner is crucial
for most applications. This chapter covers the Nelson-Oppen combination
method, in particular:
• The nondeterministic Nelson-Oppen method. Three requirements: the the-
ories only share =; the theories are stably infinite; and the considered for-
mula is quantifier-free. Variable abstraction, separation into theory-specific
formulae. Shared variables, equivalence relations over shared variables, ar-
rangements.
• The deterministic Nelson-Oppen method. Convex theories. Equality prop-
agation.
• Correctness of the Nelson-Oppen method, which follows from the Craig
Interpolation Lemma of Chapter 2.
• Complexity. When the individual decision procedures are convex and run
in polynomial time, the combination procedure runs in polynomial time.
The Nelson-Oppen combination method provides a general means of reasoning
simultaneously about the theories studied in this book using the individual
decision procedures. Being able to reason in union theories is crucial. For
example, almost all of the verification conditions of Chapters 5 and 6 are
expressed in multiple signatures.
Bibliographic Remarks
Nelson and Oppen describe the Nelson-Oppen combination method [65]. Their
original proof of correctness was flawed; Oppen presents a corrected proof in
[70], and Nelson presents a corrected proof in [64]. Oppen also proves in [70]
the complexity results that we state. Tinelli and Harandi present an alter-
nate proof of correctness in [92]. Our correctness proof derives from that of
Nelson and Oppen. See [56] for another presentation of the method and its
correctness.
Another general combination method that has received much attention is
that of Shostak [84]. See the work of Ruess and Shankar [78] for a correct
presentation of the method.
Exercises
10.1 (DP for combinations). For each of the following formulae, identify
the combination of theories in which it lies. To avoid ambiguity, prefer TZ to
TQ . Then apply the N-O method using the appropriate decision procedures.
Use either the nondeterministic or deterministic version. Provide a level of
detail as in the examples of the chapter.
10.5 (⋆ Convex theories). Prove that the following theories are convex:
(a) TE
(b) Tcons
10.6 (⋆ Complexity). Prove the complexity results about the N-O method.
(a) Theorem 10.21.
(b) Theorem 10.22.
11
Arrays
Hashtables are another important data type. They are similar to arrays
with uninterpreted indices in that their indices, or keys, can only be compared
via equality. However, hashtables allow two new interesting operations: first, a
key/value pair can be removed; and second, a hashtable’s domain — its set of
keys that it maps to values — can be read. Section 11.3 formalizes reasoning
about hashtables in the theory TH and then presents a decision procedure
for the hashtable property fragment of TH . The procedure operates by
transforming ΣH -formulae to ΣA -formulae in the array property fragment such
that the original formula is TH -satisfiable iff the constructed formula is TA -
satisfiable.
the first conjunct asserts that a⟨i ⊳ v⟩ and a are equal. This formula is TA-
unsatisfiable.
in which i is a list of variables, and F [i] and G[i] are the index guard and
the value constraint, respectively. The index guard F [i] is any ΣA -formula
that is syntactically constructed according to the following grammar:
Example 11.4. Reasoning about arrays is most useful when we can say some-
thing interesting about their elements. Suppose array elements are interpreted
in some theory T with signature Σ. Then we can assert that all elements of
an array have some property F [x], where F is a quantifier-free Σ-formula:
∀i. F [a[i]]; or that all but a finite number of elements have some property
F [x]:
∀i. (⋀_{k=1}^{n} i ≠ jk) → F[a[i]] .
The idea of the decision procedure for the array property fragment is to reduce
universal quantification to finite conjunction. It constructs a finite set of index
terms such that examining only these positions of the arrays is sufficient to
decide satisfiability.
Example 11.5. Consider the formula
F : a⟨i ⊳ v⟩ = a ∧ a[i] ≠ v ,
in which the equality between arrays expands to
∀j. a⟨i ⊳ v⟩[j] = a[j] .
Instantiating the quantifier at index i and applying (read-over-write 1) yields
v = a[i] ∧ a[i] ≠ v ,
which is unsatisfiable.
Step 1
Put F in NNF.
Step 2
F[a⟨i ⊳ v⟩]
F[a′] ∧ a′[i] = v ∧ (∀j. j ≠ i → a[j] = a′[j])    for fresh a′ (write)
Rules should be read from top to bottom. For example, this rule states that
given a formula F containing an occurrence of a write term a⟨i ⊳ v⟩, substitute
every occurrence of a⟨i ⊳ v⟩ with a fresh variable a′ and conjoin several new
conjuncts.
This step deconstructs write terms in a straightforward manner, essen-
tially encoding the (read-over-write) axioms into the new formula. After an
application of the rule, the resulting formula contains one fewer write term
than the given formula.
Step 3
F[∃i. G[i]]
F[G[j]]    for fresh j (exists)
Existential quantification can arise during Step 1 if the given formula has a
negated array property.
Step 4
I = {λ}
  ∪ {t : ·[t] ∈ F3 such that t is not a universally quantified variable}
  ∪ {t : t occurs as an evar in the parsing of index guards}
Recall that evar is any constant or unquantified variable. This index set is the
finite set of indices that need to be examined. It includes all terms t that occur
in some read a[t] anywhere in F (unless it is a universally quantified variable)
and all terms t that are compared to a universally quantified variable in some
index guard. λ is a fresh constant that represents all other index positions
that are not explicitly in I.
Step 5
Apply the following rule exhaustively to remove universal quantification:
H[∀i. F[i] → G[i]]
H[⋀_{i∈I^n} (F[i] → G[i])]    (forall)
where n is the size of the list of quantified variables i. This is the key step.
It replaces universal quantification with finite conjunction over the index set.
The notation i ∈ I^n means that the variables i range over all n-tuples of
terms in I.
Step 6
Construct
F6 : F5 ∧ ⋀_{i ∈ I∖{λ}} λ ≠ i .
The new conjuncts assert that the variable λ introduced in Step 4 is indeed
unique: it does not equal any other index mentioned in F5.
Step 7
Decide the TA -satisfiability of F6 using the decision procedure for the quantifier-
free fragment.
Suppose array elements are interpreted in some theory T with signature Σ.
For deciding the (TA ∪ T )-satisfiability of an array property (ΣA ∪ Σ)-formula,
use a combination decision procedure for the quantifier-free fragment of TA ∪T
in Step 7. Thus, this procedure is a decision procedure precisely when the
quantifier-free fragment of TA ∪ T is decidable. Chapter 10 discusses deciding
satisfiability in combinations of quantifier-free fragments of theories.
in which the index guard is i ≠ ℓ and the value constraint is a[i] = b[i]. It is
already in NNF. According to Step 2, rewrite F as
Expanding produces
F5′ : a′[k] = b[k] ∧ b[k] ≠ v ∧ a[k] = v ∧ (λ ≠ ℓ → a[λ] = b[λ])
   ∧ (k ≠ ℓ → a[k] = b[k]) ∧ (ℓ ≠ ℓ → a[ℓ] = b[ℓ])
   ∧ a′[ℓ] = v ∧ (λ ≠ ℓ → a[λ] = a′[λ])
   ∧ (k ≠ ℓ → a[k] = a′[k]) ∧ (ℓ ≠ ℓ → a[ℓ] = a′[ℓ]) .
Simplifying produces
F5′′ : a′[k] = b[k] ∧ b[k] ≠ v ∧ a[k] = v ∧ (λ ≠ ℓ → a[λ] = b[λ])
   ∧ (k ≠ ℓ → a[k] = b[k])
   ∧ a′[ℓ] = v ∧ (λ ≠ ℓ → a[λ] = a′[λ])
   ∧ (k ≠ ℓ → a[k] = a′[k]) .
Step 6 distinguishes λ from other members of I:
F6 : a′[k] = b[k] ∧ b[k] ≠ v ∧ a[k] = v ∧ (λ ≠ ℓ → a[λ] = b[λ])
   ∧ (k ≠ ℓ → a[k] = b[k])
   ∧ a′[ℓ] = v ∧ (λ ≠ ℓ → a[λ] = a′[λ])
   ∧ (k ≠ ℓ → a[k] = a′[k])
   ∧ λ ≠ k ∧ λ ≠ ℓ .
Simplifying, we have
F6′ : a′[k] = b[k] ∧ b[k] ≠ v ∧ a[k] = v
   ∧ a[λ] = b[λ] ∧ (k ≠ ℓ → a[k] = b[k])
   ∧ a′[ℓ] = v ∧ a[λ] = a′[λ] ∧ (k ≠ ℓ → a[k] = a′[k])
   ∧ λ ≠ k ∧ λ ≠ ℓ .
There are two cases to consider. If k = ℓ, then a′[ℓ] = v and a′[k] = b[k]
imply b[k] = v, yet b[k] ≠ v. If k ≠ ℓ, then a[k] = v and a[k] = b[k] imply
b[k] = v, but again b[k] ≠ v. Hence, F6′ is TA-unsatisfiable, indicating that F
is TA-unsatisfiable.
Verify that the array decision procedure of Section 9.5 reaches the same
conclusion for F6′ .
projI : ΣA -terms → I .
Define J to be like I except for its arrays. Under J, let a[i] = a[projI (i)].
Technically, we are specifying how αJ assigns values to terms of F and the
array read function ·[·]; however, we can think in terms of arrays.
To prove that J |= F , we focus on a particular subformula ∀i. F [i] → G[i].
Assume that
I |= ⋀_{i∈I^n} (F[i] → G[i]) ;
then also
J |= ⋀_{i∈I^n} (F[i] → G[i]) (11.1)
J ⊳ {i 7→ v} |= F [i] → G[i]
where i is a list of integer variables, and F [i] and G[i] are the index guard and
the value constraint, respectively. The form of an index guard is constrained
according to the following grammar:
where uvar is any universally quantified integer variable, and evar is any
existentially quantified or free integer variable.
The form of a value constraint is also constrained. Any occurrence of a
quantified index variable i must be as a read into an array, a[i], for array term
a. Array reads may not be nested; e.g., a[b[i]] is not allowed. Section 11.4
explains the need for these restrictions.
The array property fragment of TAZ then consists of formulae that are
Boolean combinations of quantifier-free ΣAZ -formulae and array properties.
Example 11.9. As in the basic arrays of Section 11.1, reasoning about arrays
is most useful when we can say something interesting about their elements.
Suppose array elements are interpreted in some theory T with signature Σ.
Now that both indices and elements can be interpreted in theories, we list
several interesting forms of properties and their definitions for various element
theories.
• Array equality a = b in TA: ∀i. a[i] = b[i].
Step 1
Put F in NNF.
Step 2
Apply the following rule exhaustively to remove writes:
F[a⟨i ⊳ e⟩]
F[a′] ∧ a′[i] = e ∧ (∀j. j ≠ i → a[j] = a′[j])    for fresh a′ (write)
To meet the syntactic requirements on an index guard, rewrite the third con-
junct as
∀j. j ≤ i − 1 ∨ i + 1 ≤ j → a[j] = a′ [j] .
Step 3
Apply the following rule exhaustively to remove existential quantification:
F[∃i. G[i]]
F[G[j]]    for fresh j (exists)
Existential quantification can arise during Step 1 if the given formula has a
negated array property.
Step 4
Step 5
Step 6
Using this projection function, the remainder of the proof closely follows the
proof of Theorem 11.7. Exercise 11.4 asks the reader to finish the proof.
11.3 Hashtables
Hashtables are a common data structure in modern programs. In this section,
we describe a theory for hashtables TH and provide a reduction of the hashtable
property fragment of ΣH -formulae into the array property fragment of TA .
The signature of TH is the following:
ΣH : {put(·, ·, ·), remove(·, ·), get(·, ·), · ∈ keys(·), =} ,
where
• put(h, k, v) is the hashtable that is modified from h by mapping key k to
value v.
• remove(h, k) is the hashtable that is modified from h by unmapping the
key k.
• get(h, k) is the value mapped by key k, which is undetermined if h does
not map k to any value.
• k ∈ keys(h) is true iff h maps the key k.
k ∈ keys(h) is merely convenient notation for a binary predicate. However, we
will exploit this notation in the following useful operations:
• Key sets keys(h) can be unioned (k1 ∪ k2), intersected (k1 ∩ k2), and
complemented (k̄).
• The predicate init(h) is true iff h does not map any key.
Each is definable using the basic signature.
The axioms of TH are the following:
• ∀x. x = x (reflexivity)
• ∀x, y. x = y → y = x (symmetry)
• ∀x, y, z. x = y ∧ y = z → x = z (transitivity)
• ∀h, j, k. j = k → get(h, j) = get(h, k) (hashtable congruence)
• ∀h, j, k, v. j = k → get(put(h, k, v), j) = v
(read-over-put 1)
• ∀h, k, v. ∀j ∈ keys(h). j 6= k → get(put(h, k, v), j) = get(h, j)
(read-over-put 2)
• ∀h, k. ∀j ∈ keys(h). j 6= k → get(remove(h, k), j) = get(h, j)
(read-over-remove)
• ∀h, k, v. k ∈ keys(put(h, k, v)) (keys-put)
• ∀h, k. k ∉ keys(remove(h, k)) (keys-remove)
Notice the similarity between the first six axioms of TH and those of TA . Key
sets complicate the (read-over-put 2) axiom compared to the (read-over-write
2) axiom, while keys sets and key removal require three additional axioms. In
particular, reading a hashtable with an unmapped key is undefined.
where F [k] is the key guard, and G[k] is the value constraint. Key guards
are defined exactly as index guards of the array property fragment of TA : they
are positive Boolean combinations of equalities between universally quantified
keys; and equalities and disequalities between universally quantified keys k
and other key terms. Value constraints can use universally quantified keys
k in hashtable reads get(h, k) and in key set membership checks. Finally, a
hashtable property does not contain any init literals.
ΣH -formulae that are Boolean combinations of quantifier-free ΣH -formulae
and hashtable properties comprise the hashtable property fragment of TH .
Example 11.12. Consider the following hashtable property formula:
F : ∀k ∈ keys(h). get(h, k) ≥ 0 .
The bounded quantification ∀k ∈ keys(h) abbreviates guarded quantification:
∀k. k ∈ keys(h) → get(h, k) ≥ 0 .
@L1 : F
assume v ≥ 0;
put(h, s, v);
@L2 : F
The key set keys(h) provides a mechanism for reasoning about the incremental
modification of hashtables.
Step 1
Construct F ∧ • ≠ ◦, for fresh constants • and ◦.
Step 2
for fresh variable h′. In the second rule, h = h′⟨k ⊳ h[k]⟩ expresses that position
k of h′ is undetermined since the mapping from k is being removed. Recall
that equality a = b between arrays is defined by ∀i. a[i] = b[i].
Step 3
Step 4
F1 : F ∧ • ≠ ◦ .
The following theorem states that extending the array property fragments in
natural ways produces fragments for which satisfiability is undecidable.
Theorem 11.16. Consider the following extensions to the array property
fragment of TAZ (TA , where appropriate):
• Permit an additional quantifier alternation.
• Permit nested reads (e.g., a1 [a2 [i]], where i is universally quantified).
• Permit array reads by a universally quantified variable in the index guard.
• Permit general Presburger arithmetic expressions over universally quanti-
fied index variables (even just addition of 1: i + 1) in the index guard or
in the value constraint.
• Permit strict comparison < between universally quantified variables.
• Augment the theory with a predicate expressing that one array is a permu-
tation of another.
For each resulting fragment, there exists an element theory T such that sat-
isfiability in the array property fragment of TAZ ∪ T (TA ∪ T ) is decidable, yet
satisfiability in the resulting fragment of TAZ ∪ T (TA ∪ T ) is undecidable.
Bibliographic Remarks refers the interested reader to texts that contain
the proof of this theorem.
11.5 Summary
This chapter presents several decision procedures for reasoning about array-
like data structures with some quantification. It covers:
• The array property fragment of TA , which allows expressing properties
of arrays themselves, rather than just their elements. Elements may be
interpreted in some theory.
• The array property fragment of TAZ , which allows expressing properties of
arrays and subarrays with indices interpreted within TZ .
• Quantifier instantiation as a basis for decision procedures, which allows
the direct application of decision procedures for quantifier-free fragments.
• Hashtables. The decision procedure rewrites a ΣH -formula into a ΣA -
formula in which each hashtable is represented by two arrays. Reductions
of this form extend decision procedures to reason about theories similar
to the originally targeted theory.
Reasoning about data structures is crucial for considering the correctness
of programs. The decision procedures of Chapter 9 for reasoning about re-
cursive data structures and arrays without quantifiers provide a means of
accessing elements of data structures. Additionally, equality in TRDS extends
to data structures. The decision procedures for the array property fragments
of TA and TAZ facilitate reasoning about whole arrays or segments of arrays, not just
individual elements.
Bibliographic Remarks
Theories of arrays have been studied for over four decades. McCarthy proposes
the axiomatization based on read-over-write in [58]. James King implemented
the decision procedure for the quantifier-free fragment as part of his thesis
work [50]. Several authors discuss the quantifier-free fragment of a theory
with predicates useful for reasoning about sorting algorithms [57, 43, 89].
Suzuki and Jefferson present a permutation predicate in a more restricted
fragment [89]. The approximation to reasoning about the weak permutation
predicate of Exercise 6.5 captures a similar fragment, though for a weaker form
of permutation. Stump, Barrett, Dill, and Levitt describe a decision procedure
for the quantifier-free fragment of an extensional theory of arrays [88]. Bradley,
Manna, and Sipma [9] and Bradley [6] explore the array property fragment,
including the proof of Theorem 11.16, which is the basis for the presentation of
this chapter.
Exercises
11.1 (DP for array property fragment of TA ). Apply the decision pro-
cedure for the array property fragment of TA to the following TA -formulae.
(a) ∀i. a⟨k ◁ e⟩[i] ≠ e
(b) a[k] = b[k] ∧ ∀i. a[i] ≠ b[i]
(c) a[k] ≠ b[k] ∧ ∀i. a[i] = b[i]
11.2 (DP for array property fragment of TAZ ). Apply the decision pro-
cedure for the array property fragment of TAZ to the following ΣAZ -formulae.
(a) sorted(a, ℓ, u) ∧ a[ℓ] > a[u]
(b) sorted(a, ℓ, u) ∧ e ≤ a[ℓ] ∧ ¬sorted(a⟨ℓ − 1 ◁ e⟩, ℓ − 1, u)
11.3 (DP for array property fragment of TAZ ). Apply the decision pro-
cedure for the array property fragment of TAZ to the following ΣAZ -formula:
[Figure: panel (a) depicts wp(F, S): executing S from a state s satisfying wp(F, S) yields a state s′ in F . Panel (b) depicts sp(F, S): each state s in sp(F, S) is reached by executing S from some state s0 in F .]
The weakest precondition wp(F, S) has the defining characteristic that if state s is such that
s |= wp(F, S)
and executing S on s results in state s′ , then
s′ |= F .
Dually, the strongest postcondition sp(F, S) has the defining characteristic that if state s is such that
s |= sp(F, S) ,
then there exists a state s0 such that executing S on s0 results in state s and
s0 |= F .
For assumption statements,
sp(F, assume c) ⇔ c ∧ F ,
for if program control makes it past the statement, then c must hold.
Unlike in the case of wp, there is no simple definition of sp on assignments; instead, it introduces a quantifier:
sp(F [v], v := e[v]) ⇔ ∃v0 . v = e[v0 ] ∧ F [v0 ] .
Let s0 and s be the states before and after executing the assignment, respectively. v0 represents the value of v in state s0 . Every variable other than v maintains its value from s0 in s. Then v = e[v0 ] asserts that the value of v in state s is equal to the value of e in state s0 . F [v0 ] asserts that s0 |= F . Overall, sp(F, v := e) describes the states that can be obtained by executing v := e from F -states, states that satisfy F .
Finally, define sp inductively on a sequence of statements S1 ; . . . ; Sn :
sp(F, S1 ; . . . ; Sn ) ⇔ sp(sp(F, S1 ), S2 ; . . . ; Sn ) .
Example 12.1. Compute
sp(i ≥ n, i := i + k)
⇔ ∃i0 . i = i0 + k ∧ i0 ≥ n
⇔ i − k ≥ n ,
since i0 = i − k.
Compute
sp(i ≥ n, assume k ≥ 0; i := i + k)
⇔ sp(sp(i ≥ n, assume k ≥ 0), i := i + k)
⇔ sp(k ≥ 0 ∧ i ≥ n, i := i + k)
⇔ ∃i0 . i = i0 + k ∧ k ≥ 0 ∧ i0 ≥ n
⇔ k ≥ 0 ∧ i − k ≥ n
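Such existential eliminations can be mechanized when the assigned expression is invertible in the assigned variable. The following sketch — ours, assuming the sympy library is available, with the helper name sp_assign of our own choosing — reproduces the first computation:

import sympy

i, k, n, i0 = sympy.symbols("i k n i0", real=True)

def sp_assign(F0, e0, v, v0):
    """sp for v := e when e is invertible in v: eliminate the existential
    in (exists v0. v = e[v0] and F[v0]) by solving v = e[v0] for v0."""
    (sol,) = sympy.solve(sympy.Eq(v, e0), v0)
    return F0.subs(v0, sol)

# sp(i >= n, i := i + k): rename i to i0 in F, then eliminate i0.
post = sp_assign(sympy.Ge(i, n).subs(i, i0), i0 + k, i, i0)
print(post)   # i - k >= n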
Example 12.2. Let us prove that
sp(wp(F, S), S) ⇒ F ⇒ wp(sp(F, S), S) ;
that is, the strongest postcondition of the weakest precondition of F on state-
ment S implies F , which implies the weakest precondition of the strongest
postcondition of F on S. We prove the first implication and leave the second
implication as Exercise 12.1.
Suppose that S is the statement assume c. Then
sp(wp(F, assume c), assume c)
⇔ sp(c → F, assume c)
⇔ c ∧ (c → F )
⇔ c ∧ F
⇒ F
Now suppose that S is an assignment statement v := e. Then
sp(wp(F [v], v := e[v]), v := e[v])
⇔ sp(F [e[v]], v := e[v])
⇔ ∃v0 . v = e[v0 ] ∧ F [e[v0 ]]
⇔ ∃v0 . v = e[v0 ] ∧ F [v]
⇒ F
Recall the definition of a verification condition in terms of wp:
{F }S1 ; . . . ; Sn {G} : F ⇒ wp(G, S1 ; . . . ; Sn ) .
We can similarly define a verification condition in terms of sp:
{F }S1 ; . . . ; Sn {G} : sp(F, S1 ; . . . ; Sn ) ⇒ G .
Typically, we prefer working with the weakest precondition because of its
syntactic handling of assignment statements. However, in the remainder of
this chapter, we shall see the value of the strongest postcondition.
⋆ 12.1.2 General Definitions of wp and sp
Section 12.1.1 defines the wp and sp predicate transformers for our simple
language of assumption (assume c) and assignment (v := e) statements. This
section defines these predicate transformers more generally.
Describe program statements with FOL formulae over the program vari-
ables x, the program counter pc, and the primed variables x′ and pc′ .
The program counter ranges over the locations of the program. The primed
variables represent the values of the corresponding unprimed variables in the
next state. For example, the statement
Li : assume c;
Lj :
is described by the transition relation
ρ1 : pc = Li ∧ pc′ = Lj ∧ c ∧ pres(x) , (12.1)
and the assignment statement
Li : xi := e;
Lj :
by the transition relation
ρ2 : pc = Li ∧ pc′ = Lj ∧ x′i = e ∧ pres(x \ {xi }) , (12.2)
where pres(V ), for a set of variables V , abbreviates the assertion that each
variable in V remains unchanged:
⋀v∈V v′ = v .
Formulae (12.1) and (12.2) are called transition relations. The expressive-
ness of FOL allows many more constructs to be encoded as transition relations
than can be encoded in either pi or the simple language of assumptions and
assignments.
Let us consider the weakest precondition and the strongest postcon-
dition in this general context. For convenience, let y be all the variables of
the program including the program counter pc. The weakest precondition of
F over transition relation ρ[y, y′ ] is given by
wp(F, ρ) : ∀y′ . ρ[y, y′ ] → F [y′ ] .
Notice that free(wp(F, ρ)) = y. A satisfying y represents a state from which all
ρ-successors, which are states y ′ that ρ relates to y, are F -states. Technically,
such a state might not have any successors at all; for example, ρ could describe
a guard that is false in the state described by y.
Consider the transition relation (12.1) corresponding to assume c:
wp(F, pc = Li ∧ pc′ = Lj ∧ c ∧ x′ = x)
⇔ ∀pc′ , x′ . pc = Li ∧ pc′ = Lj ∧ c ∧ x′ = x → F [pc′ , x′ ]
⇔ pc = Li ∧ c → F [Lj , x] .
Disregarding the program counter reveals the original definition:
wp(F, assume c) ⇔ c → F .
Similarly, the strongest postcondition of F over ρ[y, y′ ] is given by
sp(F, ρ) : ∃y0 . ρ[y0 , y] ∧ F [y0 ] .
Notice that free(sp(F, ρ)) = y. It describes all states y that have some
ρ-predecessor y0 that is an F -state; or, in other words, it describes all ρ-
successors of F -states.
Exercise 12.2 asks the reader to specialize this definition of sp to the case
of assumption and assignment statements. Exercise 12.3 asks the reader to
reproduce the arguments of Example 12.2 and Exercise 12.1 in this more
general setting.
12.1.3 Forward Propagation
An assertion map
µ : L → FOL
associates an assertion with each location in a set L of program locations, inducing annotated basic paths of the form
Li : @ µ(Li );
Si ;
...
Sj ;
Lj : @ µ(Lj )
Initially, set µ(L0 ) to the function’s precondition and µ(L) ⇔ ⊥ for every other location. This configuration represents entering the function with some values satisfying its precondition.
Maintain a set S ⊆ L of locations that still need processing. Initially, let
S = {L0 }. Terminate when S = ∅.
Suppose that we are on iteration i, having constructed the intermediate
assertion map µ. Choose some location Lj ∈ S to process, and remove it from
S. For each basic path
Lj : @ µ(Lj )
Sj ;
...
Sk ;
Lk : @ µ(Lk )
starting at Lj , check whether
sp(µ(Lj ), Sj ; . . . ; Sk ) ⇒ µ(Lk ) . (12.4)
If (12.4) does not hold, weaken µ(Lk ) to µ(Lk ) ∨ sp(µ(Lj ), Sj ; . . . ; Sk ) and add Lk to S. The disjunction preserves all states already known to reach Lk ,
while the check in the inner loop and the emptiness of S upon termination guarantee that
sp(µ(Lj ), Sj ; . . . ; Sk ) ⇒ µ(Lk )
holds for every basic path, so that the final map µ is inductive.
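A minimal sketch of this worklist procedure in Python (all names are ours; sp, implies, Or, and paths_from are opaque helpers assumed to be supplied):

# sp(F, path) computes the strongest postcondition across a basic path,
# implies(F, G) checks validity of F => G, Or builds a disjunction, and
# paths_from(L) yields (path, successor location) pairs.

def forward_propagate(L0, pre, locations, paths_from, sp, implies, Or, bottom):
    mu = {L: bottom for L in locations}   # assertion map; bottom = unreachable
    mu[L0] = pre
    work = {L0}                           # set S of locations to process
    while work:
        Lj = work.pop()
        for path, Lk in paths_from(Lj):
            post = sp(mu[Lj], path)
            if not implies(post, mu[Lk]):    # check (12.4)
                mu[Lk] = Or(mu[Lk], post)    # weaken mu(Lk)
                work.add(Lk)                 # Lk must be reprocessed
    return mu  # on termination, mu is an inductive assertion map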
12.1.4 Abstraction
Two issues prevent applying forward propagation directly. First, the validity check
sp(µ(Lj ), Sj ; . . . ; Sk ) ⇒ µ(Lk )
of the inner loop is undecidable for FOL. Second, even if this check were
decidable — for example, if we restricted µ(Lk ) to be in a decidable theory or
fragment — the while loop itself may not terminate, as the following example
shows.
Example 12.4. Consider the following loop with integer variables i and n:
@L0 : i = 0 ∧ n ≥ 0;
while
@L1 : ?
(i < n) {
i := i + 1;
}
Its two basic paths are
(1)
@L0 : i = 0 ∧ n ≥ 0;
@L1 : ?;
and
(2)
@L1 : ?;
assume i < n;
i := i + 1;
@L1 : ?;
Initially, µ(L1 ) ⇔ ⊥. Propagating across path (1) makes the update
µ(L1 ) := µ(L1 ) ∨ (i = 0 ∧ n ≥ 0) ,
so that
µ(L1 ) ⇔ i = 0 ∧ n ≥ 0 .
Currently µ(L1 ) ⇔ i = 0 ∧ n ≥ 0. Propagating across path (2) yields
F : i = 1 ∧ n > 0 ,
so the implication
F ⇒ µ(L1 ) , that is, i = 1 ∧ n > 0 ⇒ i = 0 ∧ n ≥ 0 ,
is invalid,
and µ(L1 ) is weakened to
µ(L1 ) ⇔ (i = 0 ∧ n ≥ 0) ∨ (i = 1 ∧ n > 0) ,
where the second disjunct is F .
After k iterations,
µ(L1 ) ⇔ (i = 0 ∧ n ≥ 0) ∨ (i = 1 ∧ n ≥ 1) ∨ · · · ∨ (i = k ∧ n ≥ k) .
Because the implication
i = k ∧ n ≥ k
⇓
(i = 0 ∧ n ≥ 0) ∨ (i = 1 ∧ n ≥ 1) ∨ · · · ∨ (i = k − 1 ∧ n ≥ k − 1)
is invalid for every k, the procedure never terminates.
Yet the assertion
0 ≤ i ≤ n
summarizes the growing disjunction and is inductive at L1 ; discovering such assertions is the task of abstraction.
In the first step, we choose the form of state sets that the abstract interpreta-
tion manipulates. The abstract domain D is a syntactic class of Σ-formulae
of some theory T ; each member Σ-formula represents a particular set of states
(those that satisfy it). In Section 12.2, the interval abstract domain DI
consists of conjunctions of ΣQ -literals of the forms
c≤v and v ≤ c ,
for constant c and program variable v. In Section 12.3, we fix Karr’s abstract
domain DK to consist of conjunctions of ΣQ -literals of the form
c0 + c1 x1 + · · · + cn xn = 0 ,
In the second step, we define an abstraction function
νD : FOL → D
mapping each FOL formula to an element of D that over-approximates it. For example, for
F : i = 0 ∧ n ≥ 0 ,
νDI (F ) : 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n and νDK (F ) : i = 0 .
In general, νD must satisfy
F ⇒ νD (F ) .
In the third step, we define an abstract strongest postcondition spD . Recall that
sp(F, assume c) ⇔ c ∧ F .
Since the conjunction c ∧ F need not lie in D, spD relies on an abstract conjunction ⊓D such that
F1 ∧ F2 ⇒ F1 ⊓D F2 and F1 ⊓D F2 ∈ D .
In the fourth step, we define an abstract disjunction ⊔D : when a newly computed assertion G fails the inductiveness check
G ⇒ µ(L) ,
the map is weakened using ⊔D , which must satisfy
F1 ∨ F2 ⇒ F1 ⊔D F2 and F1 ⊔D F2 ∈ D
for F1 , F2 ∈ D.
Unlike conjunction, exact disjunction is usually not represented in the domain D. We examine specific instances in Sections 12.2 and 12.3.
In the fifth step, we require a procedure for deciding the implications
sp(µ(Lj ), Sj ; . . . ; Sk ) ⇒ µ(Lk )
that arise during propagation. In the sixth step, we define a widening operator
▽D : D × D → D
such that
F1 ∨ F2 ⇒ F1 ▽D F2 .
Additionally, for every infinite sequence of elements F1 , F2 , F3 , . . . of D such that
Fi ⇒ Fi+1 ,
the widened sequence defined by G1 = F1 and Gi+1 = Gi ▽D Fi+1 eventually stabilizes: for some i,
Gi ⇔ Gi+1 .
That is, the sequence Gi converges even if the sequence Fi does not converge.
Intuitively, the widening operator “guesses” an over-approximation to the
limit of a sequence of formulae. Of course, a widening operator could always
return ⊤; however, better widening operators make more precise guesses.
A proper strategy of applying widening guarantees that the forward prop-
agation procedure terminates.
Figure 12.3 applies the operators defined in these six steps in the abstract
forward propagation algorithm. Given a function’s precondition Fpre and
a cutset L of its locations, AbstractForwardPropagate returns the in-
ductive map µ. When Widen() determines that widening should be applied,
µ(Lk ) is updated to be the widening of µ(Lk ) and µ(Lk )⊔D F . A proper defini-
tion of Widen() ensures that the procedure terminates. For example, a simple
strategy is that after some predetermined number of iterations, Widen() al-
ways evaluates to true.
Subsequent sections examine instances of this framework.
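In outline, the abstract algorithm mirrors the concrete worklist of Section 12.1.3 with the abstract operators swapped in. A sketch under our own interface names, with Widen() realized as the simple counting strategy just described:

def abstract_forward_propagate(L0, F_pre, cutset, paths_from,
                               nu, sp_D, join, widen, implies,
                               bottom, widen_after=3):
    # nu abstracts a FOL formula into D, sp_D is the abstract strongest
    # postcondition, join is the abstract disjunction, widen the widening
    # operator, and implies decides implication within D.
    mu = {L: bottom for L in cutset}
    mu[L0] = nu(F_pre)
    visits = {L: 0 for L in cutset}     # drives the simple Widen() strategy
    work = {L0}
    while work:
        Lj = work.pop()
        for path, Lk in paths_from(Lj):
            F = sp_D(mu[Lj], path)
            if implies(F, mu[Lk]):      # G => mu(L): already covered
                continue
            visits[Lk] += 1
            if visits[Lk] > widen_after:             # Widen() says yes
                mu[Lk] = widen(mu[Lk], join(mu[Lk], F))
            else:
                mu[Lk] = join(mu[Lk], F)
            work.add(Lk)
    return mu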
12.2 Interval Analysis
In interval analysis, the assertion computed at each location is a conjunction of literals of the forms
v ≤ c and c ≤ v ,
for program variables v and constants c. Consider again the loop of Example 12.4:
@L0 : i = 0 ∧ n ≥ 0;
while
@L1 : ?
(i < n) {
i := i + 1;
}
In the first step, the interval abstract domain DI consists of conjunctions of literals of the forms
c ≤ v and v ≤ c ,
equivalently, of constraints v ∈ [ℓ, u] with bounds drawn from Q ∪ {−∞, ∞}. Arithmetic extends to the infinite bounds as expected:
−∞ + c = −∞ , ∞ + c = ∞ , c · −∞ = −∞ , and c · ∞ = ∞
for c ≥ 0, and
−∞ + c = −∞ , ∞ + c = ∞ , c · −∞ = ∞ , and c · ∞ = −∞
for c < 0.
The empty interval is represented canonically by
[∞, −∞] .
Several operations on intervals are required:
• Interval intersection:
[ℓ1 , u1 ] ⊓ [ℓ2 , u2 ] = [∞, −∞] if max(ℓ1 , ℓ2 ) > min(u1 , u2 ) ,
and [max(ℓ1 , ℓ2 ), min(u1 , u2 )] otherwise.
Intersection is exact: the computed interval represents the set that is the
set intersection of the two sets represented by the given intervals.
• Interval union:
[ℓ1 , u1 ] ⊔ [ℓ2 , u2 ] = [min(ℓ1 , ℓ2 ), max(u1 , u2 )] .
The result is called the interval hull. It over-approximates the true union:
the computed interval represents a set that may include more elements
than the set union of the two sets represented by the given intervals.
For predicates, comparison is possibilistic: the predicate
[ℓ1 , u1 ] ≤ [ℓ2 , u2 ] holds if ℓ1 ≤ u2 ,
since then some value of the first interval is at most some value of the second. An affine expression
c0 + c1 x1 + · · · + cn xn
evaluates over variable intervals xi ∈ [ℓi , ui ] to the interval
c0 + c1 [ℓ1 , u1 ] + · · · + cn [ℓn , un ] .
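These operations are directly implementable. The following sketch — ours; Python floats supply ±∞ — realizes the meet, the hull, and affine evaluation over intervals:

import math

# Intervals are pairs (lo, hi) over Q ∪ {-inf, inf}, with (inf, -inf)
# as the canonical empty interval.

EMPTY = (math.inf, -math.inf)

def meet(a, b):
    """Exact intersection of two intervals."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return EMPTY if lo > hi else (lo, hi)

def hull(a, b):
    """Interval union: the hull over-approximates the true union."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def scale(c, a):
    """c · [lo, hi] for a rational constant c."""
    if c == 0:
        return (0, 0)              # avoid 0 · inf, which is undefined
    lo, hi = c * a[0], c * a[1]
    return (lo, hi) if c > 0 else (hi, lo)

def add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def affine(c0, terms):
    """Evaluate c0 + c1·x1 + ... + cn·xn over interval values."""
    acc = (c0, c0)
    for c, x in terms:
        acc = add(acc, scale(c, x))
    return acc

def le(a, b):
    """[l1, u1] <= [l2, u2] possibly holds iff l1 <= u2."""
    return a[0] <= b[1]

For instance, affine(0, [(1, (5, math.inf)), (2, (0, math.inf))]) evaluates j + 2n with j ∈ [5, ∞] and n ∈ [0, ∞] to (5, inf), for which le against (4, 4) fails — the refutation appearing in Example 12.7 below.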
In the second step, νDI maps interval literals c ≤ v and v ≤ c to themselves, equalities v = c to c ≤ v ∧ v ≤ c, and all other literals to ⊤, distributing over conjunction.
This definition of νDI is not as precise as it could be. For consider the case
in which we know G ∈ DI ; then it is possible to evaluate the truth-value of,
for example,
H : c0 + c1 x1 + · · · + cn xn ≤ 0 ,
even when n > 1. Define νDI (H, G) to equal the interval evaluation (either ⊤ or
⊥) of H in the context of G. For all other literals F , define νDI (F, G) = νDI (F ).
When F has several literals, it is sometimes more precise to compute
νDI (F, G ∧ νDI (F )) ,
which uses as much information from F as possible.
Example 12.7. Consider
F : i = 0 ∧ n ≥ 0 ∧ H , where H : j + 2n ≤ 4 .
Then
νDI (F ) = 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n .
Note that νDI (H) = ⊤. Now, given
G: 5≤j ,
compute G ∧ νDI (F ):
G′ : 5 ≤ j ∧ 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n .
Then compute
νDI (H, G′ ) = ⊥
since in the context G′ ,
j + 2n = [5, ∞] + 2[0, ∞] = [5, ∞] ≰ [4, 4] .
Hence,
νDI (F, G ∧ νDI (F )) = ⊥ .
Compare this result to the weaker νDI (F ).
Simply computing νDI (F, G) yields
νDI (F, G) : 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n ,
since in the context G,
j + 2n = [5, ∞] + 2[−∞, ∞] = [−∞, ∞] ≤ [4, 4] .
In the third step, spDI handles assumption statements by abstracting the guard in the context of F . For the loop guard i < n, the context F asserts that i ∈ [0, 0] and n ∈ [0, ∞], resulting in a comparison of intervals in the final step of the computation.
For assignments v := e in which e is an affine expression, let [ℓ, u] be the interval to which e evaluates in the context F , and let G be F with all literals over v removed. Then
spDI (F, v := e) ⇔ ℓ ≤ v ∧ v ≤ u ∧ G .
When e is not an affine expression,
spDI (F, v := e) ⇔ G .
For example, for F : 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n, the expression i + 1 evaluates to [1, 1] and
G : 0 ≤ n .
Then
spDI (F, i := i + 1) ⇔ 1 ≤ i ∧ i ≤ 1 ∧ 0 ≤ n .
Define F1 ⊔DI F2 as follows. For each variable x, let F1 assert that x ∈ [ℓ1 , u1 ]
and F2 assert that x ∈ [ℓ2 , u2 ]. Then F1 ⊔DI F2 asserts that x is in the interval
hull of [ℓ1 , u1 ] and [ℓ2 , u2 ]: x ∈ [ℓ1 , u1 ] ⊔ [ℓ2 , u2 ]. The interval hull is defined in
Step 1.
Interval analysis does not naturally terminate, as the following example illus-
trates.
Example 12.10. Consider again the loop
@L0 : i = 0 ∧ n ≥ 0;
while
@L1 : ?
(i < n) {
i := i + 1;
}
Its two basic paths are
(1)
@L0 : i = 0 ∧ n ≥ 0;
@L1 : ?;
and
(2)
@L1 : ?;
assume i < n;
i := i + 1;
@L1 : ?;
Initially,
µ(L0 ) ⇔ νDI (i = 0 ∧ n ≥ 0) ⇔ 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n ,
and propagating across path (1) sets µ(L1 ) to the same assertion. Currently µ(L1 ) ⇔ 0 ≤ i ∧ i ≤ 0 ∧ 0 ≤ n, so the implication
spDI (µ(L1 ), assume i < n; i := i + 1) ⇒ µ(L1 )
is invalid, and the interval union yields
µ(L1 ) ⇔ 0 ≤ i ∧ i ≤ 1 ∧ 0 ≤ n .
After the kth iteration,
µ(L1 ) ⇔ 0 ≤ i ∧ i ≤ k ∧ 0 ≤ n .
Example 12.11. On the kth iteration (for some small k, say, k = 3) of the
analysis in Example 12.10, compute
µ(L1 ) := µ(L1 ) ▽DI (µ(L1 ) ⊔DI spDI (µ(L1 ), assume i < n; i := i + 1)) .
That is,
(0 ≤ i ∧ i ≤ k − 1 ∧ 0 ≤ n) ▽DI (0 ≤ i ∧ i ≤ k ∧ 0 ≤ n)
⇔ 0 ≤ i ∧ 0 ≤ n ,
so that
µ(L1 ) ⇔ 0 ≤ i ∧ 0 ≤ n .
While this new µ(L1 ) does not imply the previous one,
0 ≤ i ∧ 0 ≤ n ⇏ 0 ≤ i ∧ i ≤ k − 1 ∧ 0 ≤ n ,
one more iteration yields the same µ(L1 ), finishing the analysis. Thus,
0≤i ∧ 0≤n
is an inductive assertion at L1 .
Unfortunately, the interval abstract domain is incapable of representing
the more interesting invariant i ≤ n.
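One common realization of ▽DI — an assumption on our part, though consistent with the computation in Example 12.11 — drops any bound that is not stable between iterations:

import math

def widen(old, new):
    """A standard interval widening (one common choice for the operator
    ▽_DI): any bound not stable from old to new is pushed to ±inf."""
    lo = old[0] if new[0] >= old[0] else -math.inf
    hi = old[1] if new[1] <= old[1] else math.inf
    return (lo, hi)

# The interval of i at L1 in Example 12.10 on iteration k = 3:
prev = (0, 2)              # 0 <= i ∧ i <= k-1
curr = (0, 3)              # after the hull with sp_DI: 0 <= i ∧ i <= k
print(widen(prev, curr))   # (0, inf), i.e., 0 <= i: the upper bound is dropped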
12.3 Karr’s Analysis
Karr’s analysis discovers assertions that are conjunctions of literals of the form
c0 + c1 x1 + · · · + cn xn = 0 ,
for ci ∈ Z and program variables xi . Such assertions are called affine asser-
tions. They are useful for tracking the relationship among program variables
and loop counters. Karr’s analysis can be implemented efficiently, with run-
ning time polynomial in the program size.
In this section, we present a simplified version of the analysis that Michael
Karr originally proposed. In particular, our analysis ignores guards of loops
and if statements. We use the notation and concepts from Section 8.2.
Example 12.12. Consider the loop
@L0 : ⊤;
i := 0;
j := 0;
k := 0;
while
@L1 : ?
(∗) {
k := k + 1;
if (∗) i := i + 1;
else j := j + 1;
}
The guard ∗ denotes nondeterministic choice: either branch can be taken.
Karr’s analysis discovers the inductive invariant i + j = k at L1 .
In the first step, the abstract domain DK consists of conjunctions of literals of the form
c0 + c1 x1 + · · · + cn xn = 0 ,
which define affine spaces. An affine space is a point, a line, a plane, etc. An affine space can be specified by a set of equations
Ax = b , abbreviating ⋀i (ai1 x1 + · · · + ain xn = bi ) ,
or by a finite set of vertices, of which the space is the set of affine combinations: Σj λj vj with Σj λj = 1. For example, the affine combination of two distinct points is a line passing through both. These two representations are the constraint representation and the vertex representation, respectively.
For example, the plane
i + j = k
over variables (i, j, k) has vertex representation {(0, 0, 0)ᵀ , (1, 0, 1)ᵀ , (0, 1, 1)ᵀ }.
The vertex representation is best suited for the version of Karr’s analysis
that we present. Recall, though, that the abstract domain is really the set of
ΣQ -formulae that are conjunctions of literals of the form
c0 + c1 x1 + · · · + cn xn = 0 .
The second step is trivial, as this version of Karr’s analysis does not use information
from annotations or assumption statements. Hence,
νDK (F ) = ⊤ .
In the third step, let
spDK (F, assume c) ⇔ F
for any F and c. That is, ignore assumption statements. Ignoring assump-
tion statements is not a terrible loss in precision: at best, only affine guards
c : Ax = b could be interpreted within DK . Such guards are uncommon in
practice.
Consider assignment xk := e, where e is an affine expression
e0 + e1 x1 + · · · + en xn .
On a state viewed as a vector x of variable values, the assignment acts as the affine transformation
x ↦ M x + c ,
where the row of M holding the coefficients e1 , . . . , en of e is the kth row, corresponding to xk , and the rest of the matrix is the identity matrix; the vector c holds e0 in its kth entry and 0 elsewhere. Abbreviate this transformation with the notation [[xk := e]].
Now consider an affine space F represented by a set of vertices VF . To compute the effect of applying the assignment xk := e, apply [[xk := e]] to each vertex: the transformed affine space is represented by the vertex set
{ [[xk := e]] v : v ∈ VF } .
For example, let V consist of the single vertex (1, 0, 1)ᵀ over variables (i, j, k). Then
[[i := 2i + j + 3]] V = [ 2 1 0 ; 0 1 0 ; 0 0 1 ] (1, 0, 1)ᵀ + (3, 0, 0)ᵀ = (5, 0, 1)ᵀ .
A different treatment is required when e is not an affine expression. In the new affine space, xk can have any value: replace each vertex v ∈ VF by two vertices, one with its kth entry set to 0 and one with its kth entry set to 1, so that affine combinations realize every value of xk . Exercise 12.5 asks the reader to prove this claim. Applied to V = {(5, 0, 1)ᵀ }, this construction yields the vertices (0, 0, 1)ᵀ and (1, 0, 1)ᵀ . The final set of vertices represents the set of states in which j = 0, k = 1, and i is any value.
In the fifth step, implication between affine spaces is decided on the vertex representation. Let V and W be the vertex sets representing F1 and F2 , respectively. Then F1 ⇒ F2 iff for all v ∈ V , v ∈ affine(W ), where affine(W ) denotes the set of affine combinations of W . Each membership query reduces to solving a system of linear equations, which can be done efficiently using algorithms such as Gaussian elimination.
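A sketch of this membership test, assuming numpy is available and substituting a least-squares solve for hand-rolled Gaussian elimination (helper name ours):

import numpy as np

def in_affine_hull(v, W, tol=1e-9):
    """Decide v ∈ affine(W): is v an affine combination of the vertices W?"""
    A = np.vstack([np.array(W, dtype=float).T,   # vertices as columns
                   np.ones(len(W))])             # plus the row: sum of λ_j = 1
    b = np.append(np.array(v, dtype=float), 1.0)
    lam, *_ = np.linalg.lstsq(A, b, rcond=None)  # least-squares solve
    return np.allclose(A @ lam, b, atol=tol)     # exact solution exists?

# (1,1,2) = (-1)(0,0,0) + (1)(1,0,1) + (1)(0,1,1), with coefficients
# summing to 1, so it lies in the affine hull:
print(in_affine_hull([1, 1, 2], [[0, 0, 0], [1, 0, 1], [0, 1, 1]]))  # True

This reproduces, for example, the redundancy check applied to (1, 1, 2)ᵀ in the computation below.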
Consider again the loop of Example 12.12:
@L0 : ⊤;
i := 0;
j := 0;
k := 0;
while
@L1 : ?
(∗) {
k := k + 1;
if (∗) i := i + 1;
else j := j + 1;
}
Its basic paths are
(1)
@L0 : ⊤;
i := 0;
j := 0;
k := 0;
@L1 : ?;
(2)
@L1 : ?;
k := k + 1;
i := i + 1;
@L1 : ?;
and (3)
@L1 : ?;
k := k + 1;
j := j + 1;
@L1 : ?;
Let τ2 and τ3 denote the vertex transformations induced by paths (2) and (3), respectively.
Next,
µ(L1 ) := µ(L1 ) ⊔DK τ3 µ(L1 )
= { (0, 0, 0)ᵀ , (1, 0, 1)ᵀ } ∪ { τ3 (0, 0, 0)ᵀ , τ3 (1, 0, 1)ᵀ }
= { (0, 0, 0)ᵀ , (1, 0, 1)ᵀ , (0, 1, 1)ᵀ } .
The new vertex (0, 1, 1)ᵀ is obtained from τ3 (0, 0, 0)ᵀ . Note that τ3 is applied to (1, 0, 1)ᵀ as well; however,
τ3 (1, 0, 1)ᵀ = (1, 1, 2)ᵀ = (−1)(0, 0, 0)ᵀ + (1)(1, 0, 1)ᵀ + (1)(0, 1, 1)ᵀ ,
and −1 + 1 + 1 = 1. Hence, (1, 1, 2)ᵀ is redundant.
On the next iteration, we obtain convergence. For
τ2 (0, 0, 0)ᵀ = (1, 0, 1)ᵀ , τ2 (1, 0, 1)ᵀ = (2, 0, 2)ᵀ = (−1)(0, 0, 0)ᵀ + (2)(1, 0, 1)ᵀ ,
and
τ2 (0, 1, 1)ᵀ = (1, 1, 2)ᵀ = (−1)(0, 0, 0)ᵀ + (1)(1, 0, 1)ᵀ + (1)(0, 1, 1)ᵀ ,
and
τ3 (0, 1, 1)ᵀ = (0, 2, 2)ᵀ = (−1)(0, 0, 0)ᵀ + (2)(0, 1, 1)ᵀ ,
so every transformed vertex is an affine combination of the existing vertices.
To recover the constraint representation a1 i + a2 j + a3 k = b, find coefficients satisfied by every vertex; that is, solve
[ 0 0 0 −1 ; 1 0 1 −1 ; 0 1 1 −1 ] (a1 , a2 , a3 , b)ᵀ = 0 .
One solution is (a1 , a2 , a3 , b) = (1, 1, −1, 0), yielding the inductive invariant
i + j = k
at L1 .
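The whole propagation can be replayed numerically. The following sketch — ours, assuming numpy, with in_affine_hull repeated from the previous sketch — reproduces the convergence of the example:

import numpy as np

def in_affine_hull(v, W, tol=1e-9):              # as in the previous sketch
    A = np.vstack([np.array(W, dtype=float).T, np.ones(len(W))])
    b = np.append(np.array(v, dtype=float), 1.0)
    lam, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.allclose(A @ lam, b, atol=tol)

def assign(k, coeffs, c0, n=3):
    """[[x_k := e]] for affine e = c0 + coeffs · x: returns (M, c)."""
    M = np.eye(n)
    M[k, :] = coeffs
    c = np.zeros(n)
    c[k] = c0
    return M, c

def compose(*steps):
    """Chain assignment transformations into one vertex map."""
    def tau(v):
        for M, c in steps:
            v = M @ v + c
        return v
    return tau

# Paths (2) and (3) over the variable order (i, j, k):
tau2 = compose(assign(2, [0, 0, 1], 1), assign(0, [1, 0, 0], 1))  # k+=1; i+=1
tau3 = compose(assign(2, [0, 0, 1], 1), assign(1, [0, 1, 0], 1))  # k+=1; j+=1

V = [np.zeros(3)]                           # entry state: i = j = k = 0
while True:
    images = [t(v) for t in (tau2, tau3) for v in V]
    fresh = [w for w in images if not in_affine_hull(w, V)]
    if not fresh:
        break                               # all images redundant: converged
    V.append(fresh[0])

print(np.array(V))  # rows (0,0,0), (1,0,1), (0,1,1): the plane i + j = k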
⋆ 12.4 Standard Notation and Concepts
Our presentation of abstract interpretation differs markedly from the stan-
dard presentation in the literature. To facilitate the reader’s foray into the
literature, we discuss here the standard notation and concepts and relate them to our presentation. The main idea is to describe an abstract interpretation
in terms of a set of operations over two lattices.
Lattices
A partially ordered set (S, ⪯), also called a poset, is a set S equipped with a partial order ⪯, which is a binary relation that is
• reflexive: ∀s ∈ S. s ⪯ s;
• antisymmetric: ∀s1 , s2 . s1 ⪯ s2 ∧ s2 ⪯ s1 → s1 = s2 ;
• transitive: ∀s1 , s2 , s3 ∈ S. s1 ⪯ s2 ∧ s2 ⪯ s3 → s1 ⪯ s3 .
A lattice (S, ⊔, ⊓) is a set equipped with join ⊔ and meet ⊓ operators
that are
• commutative:
– ∀s1 , s2 . s1 ⊔ s2 = s2 ⊔ s1 ,
– ∀s1 , s2 . s1 ⊓ s2 = s2 ⊓ s1 ;
• associative:
– ∀s1 , s2 , s3 . s1 ⊔ (s2 ⊔ s3 ) = (s1 ⊔ s2 ) ⊔ s3 ,
– ∀s1 , s2 , s3 . s1 ⊓ (s2 ⊓ s3 ) = (s1 ⊓ s2 ) ⊓ s3 ;
• idempotent:
– ∀s. s ⊔ s = s,
– ∀s. s ⊓ s = s.
[Figure: two Hasse diagrams (a) and (b) of a lattice over elements s1 , . . . , s8 , illustrating the join s3 ⊔ s4 , the meet s3 ⊓ s4 , and the extremal elements ⊔S and ⊓S.]
Additionally, they satisfy the absorption laws:
• ∀s1 , s2 . s1 ⊔ (s1 ⊓ s2 ) = s1 ;
• ∀s1 , s2 . s1 ⊓ (s1 ⊔ s2 ) = s1 .
One can define a partial order ⪯ on S:
∀s1 , s2 . s1 ⪯ s2 ↔ s1 = s1 ⊓ s2 ,
or equivalently
∀s1 , s2 . s1 ⪯ s2 ↔ s2 = s1 ⊔ s2 .
Abstract Interpretation
An important complete lattice for our purposes is the lattice defined by the
sets of a program P ’s possible states S: the join is set union, the meet is set
intersection, and the partial order is set containment. The greatest element
is the set of all states; the least element is the empty set. The lattice is thus
represented by (2S , ∪, ∩), where 2S is the powerset of S. Call this lattice CP .
Treat P as a function on S: P (s) is the successor state of s during execu-
tion. Define the strongest postcondition on subsets S ′ of states S and program
P:
sp(S ′ , P ) =def {P (s) : s ∈ S ′ } ,
An abstract domain A is related to the concrete domain by an abstraction function and a concretization function,
α : 2S → A and γ : A → 2S ,
and an abstract operator F̄P is a valid abstraction of the concrete operator FP if
FP (γ(a)) ⊆ γ(F̄P (a)) for all a ∈ A ;
that is, if the set of program states represented by abstract element F̄P (a) is a superset of the application of FP to the set of states represented by abstract element a. One possible abstraction is given by applying FP to the concretization of a and then abstracting the result:
concretization of a and then abstracting the result:
F̄P (a) =def α(FP (γ(a))) .
In our presentation, we take as our concrete set not 2S but rather FOL rep-
resentations of sets of states. The concrete lattice is thus (FOL, ∨, ∧) with
partial order ⇒. This lattice is not complete: there need not be a finite first-
order representation of the conjunction of an infinite number of formulae. But
not surprisingly, there need not be a finite representation of a set of infinite
cardinality, either, so the completeness of CP is not of practical value.
Our abstract domains are given by syntactic restrictions on the form of
FOL formulae. The abstraction function is νD , and the concretization function
is just the identity. spD is a valid abstraction of sp, and both are monotone
in their respective lattices.
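A toy finite instance makes the α/γ round trip concrete (entirely our illustration):

# Concrete states are the integers 0..7, the concrete lattice is their
# powerset, and the abstract domain A is the set of intervals over 0..7.

P = lambda s: (s + 1) % 8                 # one concrete execution step

def sp(states):                           # sp(S', P) = {P(s) : s in S'}
    return {P(s) for s in states}

def alpha(states):                        # abstraction: tightest interval
    return (min(states), max(states)) if states else None

def gamma(a):                             # concretization of an interval
    return set() if a is None else set(range(a[0], a[1] + 1))

def sp_abs(a):                            # the abstraction alpha . sp . gamma
    return alpha(sp(gamma(a)))

a = (6, 7)
assert sp(gamma(a)) <= gamma(sp_abs(a))   # soundness: sp(gamma(a)) is covered
print(sp_abs(a))                          # (0, 7): wraparound forces coarseness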
12.5 Summary
This chapter describes a methodology for developing algorithms to reason
about program correctness. It covers:
• Invariant generation in a general setting. The forward propagation al-
gorithm based on the strongest postcondition. The need for abstraction.
Issues: decidability and convergence. Abstract interpretations.
Bibliographic Remarks
We present a simplified version of the abstract interpretation framework of
Cousot and Cousot, who also describe a version of the interval domain that
we present [20].
Karr developed his analysis a year before the abstract interpretation frame-
work was presented [46]. For background on linear algebra, see [42]. Our pre-
sentation of Karr’s analysis is based on that of Müller-Olm and Seidl [63].
Many other domains of abstract interpretation have been studied. The
most widely known is the domain of polyhedra, which Cousot and Halbwachs
describe in [21]. Exercise 12.4 explores the octagon domain of Miné [61]. As an
example of a non-numerical domain, see the work of Sagiv, Reps, and Wilhelm
on shape analysis [96].
Exercises
12.1 (wp and sp). Prove the second implication of Example 12.2; that is,
prove that
F ⇒ wp(sp(F, S), S) .
12.2 (General sp). Compute sp(F, ρ1 ) and sp(F, ρ2 ) for transition relations
ρ1 and ρ2 of (12.1) and (12.2), respectively. Show that disregarding pc reveals
the original definition of sp.
12.3 (General wp and sp). For the general definitions of wp and sp, prove that
sp(wp(F, ρ), ρ) ⇒ F ⇒ wp(sp(F, ρ), ρ) .
12.4 (Octagon domain). The octagon abstract domain consists of conjunctions of literals of the forms
c ≤ v1 + v2 , v1 + v2 ≤ c , c ≤ v1 − v2 , and v1 − v2 ≤ c .
Apply it to the loop of Example 12.4. Because i and n are integer variables, the loop guard i < n is equivalent to i ≤ n − 1.
13
Further Reading
Do not seek to follow in the footsteps of the men of old; seek what they
sought.
— Matsuo Basho
Kyoroku Ribetsu no Kotoba, 1693
In this book we have presented a classical method of specifying and verify-
ing sequential programs (Chapters 5 and 6) based on first-order logic (Chap-
ters 1–3) and induction (Chapter 4). We then focused on algorithms for au-
tomating the application of this method: decision procedures for reasoning
about verification conditions (Chapters 7–11), and invariant generation pro-
cedures for deducing inductive facts about programs (Chapter 12). This ma-
terial is fundamental to all modern research in verification. In this chapter,
we indicate topics for further reading and research.
First-Order Logic
Other texts on first-order logic include [87, 31, 55]. Smullyan [87], on which
the presentation of Section 2.7 is partly based, concisely presents the main
results in first-order logic. Enderton [31] provides a comprehensive discussion
of theories of arithmetic and Gödel’s first incompleteness theorem. Manna and
Waldinger [55] explore additional first-order theories.
Decision Procedures
Static Analysis
Static analysis is one of the most active areas of research in verification. Clas-
sically, static analyses of the form presented in Chapter 12 have been studied
in two areas: compiler development and research [62] and verification [20].
Important areas of current research include fast numerical analyses for
discovering numerical relations among program variables [80], precise alias
and shape analyses for discovering how a program manipulates memory [96],
and predicate abstraction and refinement [5, 37, 16, 24].
Static analyses of the form in Chapter 12 solve a set of implications for
a fixpoint (an inductive assertion map is a fixpoint) by forward propagation.
Other methods exist for finding fixpoints, including constraint-based static
analysis [2, 17].
Static analyses also address total correctness by proving that loops and
functions halt [18, 7, 8]. Their structure is different than the analyses of Chap-
ter 12 as they seek ranking functions.
Concurrent Programs
Temporal Logic
Model Checking
A finite-state model checker [15, 74] is an algorithm that checks whether finite-
state systems such as hardware circuits satisfy given temporal properties.
An explicit-state model checker manipulates sets of actual states as vectors
of bits. A symbolic model checker uses a formulaic representation to represent
sets of states, just as we use FOL to represent possibly infinite sets of states
in Chapters 5 and 6. The first symbolic model checker was for CTL [12]; it
represents sets of states with Reduced Ordered Binary Decision Diagrams
(ROBDDs, or just BDDs) [10, 11].
LTL model checking is based on manipulating automata over infinite
strings [95]. A rich literature exists on such automata; see [91] for an in-
troduction.
Predicate abstraction and refinement [5, 37, 16, 24] has allowed model
checkers to be applied to software and represents one of the many intersections
between areas of research (model checking and static analysis in this case).
Clarke, Grumberg, and Peled discuss model checking in detail [14].
References
37. S. Graf and H. Saidi. Construction of abstract state graphs with PVS. In
Computer Aided Verification, volume 1254 of LNCS, pages 72–83. Springer-
Verlag, 1997.
38. D. Hilbert. Die Grundlagen der Mathematik. Abhandlungen aus dem Seminar
der Hamburgischen Universität, 6:65–85, 1928.
39. C. A. R. Hoare. An axiomatic basis for computer programming. Communica-
tions of the ACM, 12(10):576–580, October 1969.
40. W. Hodges. A Shorter Model Theory. Cambridge University Press, 1997.
41. J. E. Hopcroft, R. Motwani, and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 3rd edition, 2006.
42. R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press,
1985.
43. J. Jaffar. Presburger arithmetic with array segments. Information Processing
Letters, 12(2):79–82, 1981.
44. L. Kantorovich. Mathematical methods of organizing and planning production.
Management Science, 6:366–422, 1960. In Russian, Leningrad University, 1939.
45. N. Karmarkar. A new polynomial-time algorithm for linear programming.
Combinatorica, 4:373–395, 1984.
46. M. Karr. Affine relationships among variables of a program. Acta Informatica,
6:133–151, 1976.
47. M. Kaufmann, P. Manolios, and J. S. Moore. Computer-Aided Reasoning: An
Approach. Kluwer Academic Publishers, 2000.
48. M. Kaufmann, J. S. Moore, S. Ray, and E. Reeber. Integrating external deduc-
tion tools with ACL2. In Workshop on the Implementation of Logics, volume
212, pages 7–26, 2006.
49. L. G. Khachian. A polynomial algorithm in linear programming. Soviet Math.
Dokl., 20:191–194, 1979. In Russian, Dokl. Akad. Nauk SSSR 244, 1093–1096,
1979.
50. J. King. A Program Verifier. PhD thesis, Carnegie Mellon University, Septem-
ber 1969.
51. IEEE Symposium on Logic in Computer Science. http://www2.informatik.hu-berlin.de/lics.
52. Z. Manna. Mathematical Theory of Computation. McGraw-Hill, 1974. Also
Dover, 2004.
53. Z. Manna and A. Pnueli. The Temporal Logic of Reactive and Concurrent
Systems: Specification. Springer-Verlag, 1991.
54. Z. Manna and A. Pnueli. Temporal Verification of Reactive Systems: Safety.
Springer-Verlag, 1995.
55. Z. Manna and R. Waldinger. The Deductive Foundations of Computer Pro-
gramming. Addison-Wesley, 1993.
56. Z. Manna and C. G. Zarba. Combining decision procedures. In Formal Methods
at the Cross Roads: From Panacea to Foundational Support, volume 2757 of
LNCS, pages 381–422. Springer-Verlag, 2003.
57. P. Mateti. A decision procedure for the correctness of a class of programs.
Journal of the ACM, 28(2), 1981.
58. J. McCarthy. Towards a mathematical science of computation. In International
Federation for Information Processing, pages 21–28, 1962.
59. J. McCarthy. A basis for a mathematical theory of computation. Computer
Programming and Formal Systems, 1963.
subformula 5, 37
  strict 5, 37
subformula ordering 8
subset [program] 175, 176
substitution 16, 46
  composition 18
  domain 16
  range 16
  renaming 46
  safe 47
  schema 48
  variable 17
subterm 37
  set 247
  strict 38
subterm set 247
successor [axiom] 73, 75
successor location 318
supremum see join
symbolic execution 317
symmetric 245
symmetry [axiom] 71, 242, 305
syntax 4
target (edge) 251
term 35
theory 69
  axioms 69
  complete 70
  consistent 70
  convex 276
  decidable 70
  equivalent 70
  formula 69
  fragment 70
    conjunctive 90
    quantifier-free 70
  has equality 284
  interpretation 70
    intended 74, 75, 77
  satisfiable 70
  signature 69
  stably infinite 270
  valid 70
theory of
  abelian group 80, 83
  arrays 87, 291
  equality 71, 241
  field 80
  group 80
    torsion-free 83
  hashtables 291
  integer-indexed arrays 300
  integers 73, 76, 183
  lists 84
  ordered field 81
  Peano arithmetic 73, 96
  Presburger arithmetic 73, 75
  rationals 79, 82, 183, 207
  real closed field 81
  reals 79, 80
  recursive data structures 84
  ring 80
  total order 81, 83
theory of rationals 79
theory of reals 79
times successor [axiom] 73
times zero [axiom] 73
torsion-free 83
torsion-free [axiom] 83
total (partition) 246
total correctness 113, 143
total order, theory of 81, 83
totality [axiom] 81, 82
transition relation 315
transitive 245, 341
transitivity [axiom] 71, 81, 82, 242, 305
transpose 209
triangular form (Cooper’s method) 197
triangular form (matrix) 211
truth symbols 4
truth table 6
truth values 6
truth-table method 9
Turing machine 54
Turing-decidable see decidable
Turing-recognizable see semi-decidable
two [axiom] 270
unary 4
undecidable 54
union [function] 254
union [program] 175, 178, 179
union-find algorithm 254
unit clause 29
unit resolution 29