Logic and Proof
Lawrence C Paulson
Computer Laboratory
University of Cambridge
Copyright © 2014 by Lawrence C. Paulson
Contents

1 Introduction and Learning Guide
2 Propositional Logic
4 First-order Logic
11 Modal Logics
12 Tableaux-Based Methods
1 Introduction and Learning Guide

This course gives a brief introduction to logic, including the resolution method of theorem-proving and its relation to the programming language Prolog. Formal logic is used for specifying and verifying computer systems and (sometimes) for representing knowledge in Artificial Intelligence programs.

The course should help you to understand the Prolog language, and its treatment of logic should be helpful for understanding other theoretical courses. It also describes a variety of techniques and data structures used in automated theorem provers. Understanding the various deductive methods is a crucial part of the course, but in addition to this purely mechanical view of logic, you should try to acquire an intuitive feel for logical reasoning.

The most suitable course text is this book:

    Michael Huth and Mark Ryan, Logic in Computer Science: Modelling and Reasoning about Systems, 2nd edition (CUP, 2004)

It covers most aspects of this course with the exception of resolution theorem proving. It includes material (symbolic model checking) that could be useful later.

The following book may be a useful supplement to Huth and Ryan. It covers resolution, as well as much else relevant to Logic and Proof.

    Mordechai Ben-Ari, Mathematical Logic for Computer Science, 2nd edition (Springer, 2001)

The following book provides a different perspective on modal logic, and it develops propositional logic carefully. Finally available in paperback.

    Sally Popkorn, First Steps in Modal Logic (CUP, 2008)

There are numerous exercises in these notes, and they are suitable for supervision purposes. Most old examination questions for Foundations of Logic Programming (the former name of this course) are still relevant. As of 2013/14, Herbrand's theorem has been somewhat deprecated: still mentioned, but no longer in detail. Some unification theory has also been removed. These changes created space for a new lecture on Decision Procedures and SMT Solvers.

• 2014 Paper 5 Q5: proof methods for propositional logic: BDDs and DPLL
• 2014 Paper 6 Q6: decision procedure; resolution with selection
• 2013 Paper 5 Q5: DPLL, sequent or tableau calculus
• 2013 Paper 6 Q6: resolution problems
• 2012 Paper 5 Q5: (Herbrand: deprecated)
• 2012 Paper 6 Q6: sequent or tableau calculus, modal logic, BDDs
• 2011 Paper 5 Q5: resolution, linear resolution, BDDs
• 2011 Paper 6 Q6: unification, modal logic
• 2010 Paper 5 Q5: BDDs and models
• 2010 Paper 6 Q6: sequent or tableau calculus, DPLL. Note: the formula in the first part of this question should be (∃x P(x) → Q) → ∀x (P(x) → Q)
• 2008 Paper 3 Q6: BDDs, DPLL, sequent calculus
• 2008 Paper 4 Q5: proving or disproving first-order formulas, resolution
• 2009 Paper 6 Q7: modal logic (Lect. 11)
• 2009 Paper 6 Q8: resolution, tableau calculi
• 2007 Paper 5 Q9: propositional methods, resolution, modal logic
• 2007 Paper 6 Q9: proving or disproving first-order formulas
• 2006 Paper 5 Q9: proof and disproof in FOL and modal logic
• 2006 Paper 6 Q9: BDDs, Herbrand models, resolution (Lect. 6–8)
• 2005 Paper 5 Q9: resolution (Lect. 6, 8, 9)
• 2005 Paper 6 Q9: DPLL, BDDs, tableaux (Lect. 6, 10, 12)
• 2004 Paper 5 Q9: semantics and proof in FOL (Lect. 4, 5)
• 2004 Paper 6 Q9: ten true or false questions
• 2003 Paper 5 Q9: BDDs; clause-based proof methods (Lect. 6, 10)
• 2003 Paper 6 Q9: sequent calculus (Lect. 5)
• 2002 Paper 5 Q11: semantics of propositional and first-order logic (Lect. 2, 4)
• 2002 Paper 6 Q11: resolution; proof systems
• 2001 Paper 5 Q11: satisfaction relation; logical equivalences
• 2001 Paper 6 Q11: clause-based proof methods; ordered ternary decision diagrams (Lect. 6, 10)
• 2000 Paper 5 Q11: tautology checking; propositional sequent calculus (Lect. 2, 3, 10)
• 2000 Paper 6 Q11: unification and resolution (Lect. 8, 9)
• 1999 Paper 5 Q10: Prolog resolution versus general resolution
• 1999 Paper 6 Q10: Herbrand models and clause form
• 1998 Paper 5 Q10: BDDs, sequent calculus, etc. (Lect. 3, 10)
• 1998 Paper 6 Q10: modal logic (Lect. 11); resolution
• 1997 Paper 5 Q10: first-order logic (Lect. 4)
• 1997 Paper 6 Q10: sequent rules for quantifiers (Lect. 5)
• 1996 Paper 5 Q10: sequent calculus (Lect. 3, 5, 10)
• 1996 Paper 6 Q10: DPLL versus Resolution (Lect. 9)
• 1995 Paper 5 Q9: BDDs (Lect. 10)
• 1995 Paper 6 Q9: outline logics; sequent calculus (Lect. 3, 5, 11)
• 1994 Paper 5 Q9: resolution versus Prolog (Lect. 8)
• 1994 Paper 6 Q9: Herbrand models (Lect. 7)
• 1994 Paper 6 Q9: most general unifiers and resolution (Lect. 9)
• 1993 Paper 3 Q3: resolution and Prolog (Lect. 8)

Acknowledgements. Chloë Brown, Jonathan Davies and Reuben Thomas pointed out numerous errors in these notes. David Richerby and Ross Younger made detailed suggestions. Thanks also to Darren Foong, Thomas Forster, Simon Frankau, Adam Hall, Ximin Luo, Steve Payne, Tom Puverle, Max Spencer, Ben Thorner, Tjark Weber and John Wickerson.
2 PROPOSITIONAL LOGIC 2
• The formulas P and P ∧ (P → Q) are satisfiable: they are both true under the interpretation that maps P and Q to 1. But they are not valid: they are both false under the interpretation that maps P and Q to 0.

• If A is a valid formula then ¬A is unsatisfiable.

• This set of formulas is unsatisfiable: {P, Q, ¬P ∨ ¬Q}.

Show we can make H₂CO₃ given supplies of HCl, NaOH, O₂, and C.[2] Chang and Lee formalize the supplies of chemicals as four axioms and prove that H₂CO₃ logically follows. The idea is to formalize each compound as a propositional symbol and express the reactions as implications:

    HCl ∧ NaOH → NaCl ∧ H₂O
    C ∧ O₂ → CO₂
    CO₂ ∧ H₂O → H₂CO₃

Note that this involves an ideal model of chemistry. What if the reactions can be inhibited by the presence of other chemicals? Proofs about the real world always depend upon general assumptions. It is essential to bear these in mind when relying on such a proof.

2.4 Equivalences

Note that A ↔ B and A ≃ B are different kinds of assertions. The formula A ↔ B refers to some fixed interpretation, while the metalanguage statement A ≃ B refers to all interpretations. On the other hand, ⊨ A ↔ B means the same thing as A ≃ B. Both are metalanguage statements, and A ≃ B is equivalent to saying that the formula A ↔ B is a tautology.

Similarly, A → B and A ⊨ B are different kinds of assertions, while ⊨ A → B and A ⊨ B mean the same thing. The formula A → B is a tautology if and only if A ⊨ B.

Here is a listing of some of the more basic equivalences of propositional logic. They provide one means of reasoning about propositions, namely by transforming one proposition into an equivalent one. They are also needed to convert propositions into various normal forms.

idempotency laws

    A ∧ A ≃ A
    A ∨ A ≃ A

de Morgan laws

    ¬(A ∧ B) ≃ ¬A ∨ ¬B
    ¬(A ∨ B) ≃ ¬A ∧ ¬B

other negation laws

    ¬(A → B) ≃ A ∧ ¬B
    ¬(A ↔ B) ≃ (¬A) ↔ B ≃ A ↔ (¬B)

laws for eliminating certain connectives

    A ↔ B ≃ (A → B) ∧ (B → A)
    ¬A ≃ A → f
    A → B ≃ ¬A ∨ B

simplification laws

    A ∧ f ≃ f
    A ∧ t ≃ A
    A ∨ f ≃ A
    A ∨ t ≃ t
    ¬¬A ≃ A
    A ∨ ¬A ≃ t
    A ∧ ¬A ≃ f

Propositional logic enjoys a principle of duality: for every equivalence A ≃ B there is another equivalence A′ ≃ B′, derived by exchanging ∧ with ∨ and t with f. Before applying this rule, remove all occurrences of → and ↔, since they implicitly involve ∧ and ∨.

[2] Chang and Lee, page 21, as amended by Ross Younger, who knew more about Chemistry!
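Each of these equivalences can be checked mechanically by enumerating every interpretation of the propositional letters involved. Here is one possible sketch in Python; the encoding of formulas as nested tuples is an illustrative choice, not anything fixed by these notes.

```python
from itertools import product

# A formula is a nested tuple: ('var', name), ('not', A), ('and', A, B),
# ('or', A, B), ('imp', A, B) or ('iff', A, B).
# An interpretation maps propositional letters to 1 or 0.
def holds(f, interp):
    tag = f[0]
    if tag == 'var': return interp[f[1]]
    if tag == 'not': return 1 - holds(f[1], interp)
    if tag == 'and': return min(holds(f[1], interp), holds(f[2], interp))
    if tag == 'or':  return max(holds(f[1], interp), holds(f[2], interp))
    if tag == 'imp': return max(1 - holds(f[1], interp), holds(f[2], interp))
    if tag == 'iff': return 1 if holds(f[1], interp) == holds(f[2], interp) else 0
    raise ValueError(tag)

def equivalent(f, g, letters):
    """f and g are equivalent iff they agree under every interpretation."""
    return all(
        holds(f, dict(zip(letters, vals))) == holds(g, dict(zip(letters, vals)))
        for vals in product((0, 1), repeat=len(letters)))

A, B = ('var', 'A'), ('var', 'B')
# de Morgan law: not(A and B) is equivalent to (not A) or (not B)
assert equivalent(('not', ('and', A, B)), ('or', ('not', A), ('not', B)), 'AB')
# negation law: not(A imp B) is equivalent to A and (not B)
assert equivalent(('not', ('imp', A, B)), ('and', A, ('not', B)), 'AB')
```

Enumeration takes 2ⁿ interpretations for n letters, which is fine for the small formulas in these notes.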
2.5 Normal forms

The language of propositional logic has much redundancy: many of the connectives can be defined in terms of others. By repeatedly applying certain equivalences, we can transform a formula into a normal form. A typical normal form eliminates certain connectives and uses others in a restricted manner. The restricted structure makes the formula easy to process, although the normal form may be much larger than the original formula, and unreadable.

Definition 2 (Normal Forms)

• A formula is in Conjunctive Normal Form (CNF) if it has the form A₁ ∧ · · · ∧ Aₘ, where each Aᵢ is a disjunction of one or more literals.

• A formula is in Disjunctive Normal Form (DNF) if it has the form A₁ ∨ · · · ∨ Aₘ, where each Aᵢ is a conjunction of one or more literals.

An atomic formula like P is in all the normal forms NNF, CNF, and DNF. The formula

    (P ∨ Q) ∧ (¬P ∨ S) ∧ (R ∨ P)

is in CNF. Unlike in some hardware applications, the disjuncts in a CNF formula do not have to mention all the variables. On the contrary, they should be as simple as possible. Simplifying the formula

    (P ∨ Q) ∧ (¬P ∨ Q) ∧ (R ∨ S)

to Q ∧ (R ∨ S) counts as an improvement, because it will make our proof procedures run faster. For examples of DNF formulas, exchange ∧ and ∨ in the examples above. As with CNF, there is no need to mention all combinations of variables.

NNF can reveal the underlying nature of a formula. For example, converting ¬(A → B) to NNF yields A ∧ ¬B. This reveals that the original formula was effectively a conjunction. Every formula in CNF or DNF is also in NNF, but the NNF formula ((¬P ∧ Q) ∨ R) ∧ P is in neither CNF nor DNF.

Step 1. Eliminate ↔ and → by repeatedly replacing by the laws for eliminating these connectives, given above.

Step 2. Push negations in until they apply only to atoms, repeatedly replacing by the equivalences

    ¬¬A ≃ A
    ¬(A ∧ B) ≃ ¬A ∨ ¬B
    ¬(A ∨ B) ≃ ¬A ∧ ¬B

At this point, the formula is in Negation Normal Form.

Step 3. To obtain CNF, push disjunctions in until they apply only to literals. Repeatedly replace by the equivalences

    A ∨ (B ∧ C) ≃ (A ∨ B) ∧ (A ∨ C)
    (B ∧ C) ∨ A ≃ (B ∨ A) ∧ (C ∨ A)

Note also that

    (A ∧ B) ∨ (C ∧ D) ≃ (A ∨ C) ∧ (A ∨ D) ∧ (B ∨ C) ∧ (B ∨ D).

Use this equivalence when you can, to save writing.

Step 4. Simplify the resulting CNF by deleting any disjunction that contains both P and ¬P, since it is equivalent to t. Also delete any conjunct that includes another conjunct (meaning, every literal in the latter is also present in the former). This is correct because (A ∨ B) ∧ A ≃ A. Finally, two disjunctions of the form P ∨ A and ¬P ∨ A can be replaced by A, thanks to the equivalence

    (P ∨ A) ∧ (¬P ∨ A) ≃ A.

This simplification is related to the resolution rule, which we shall study later.

Since ∨ is commutative, a conjunct of the form A ∨ B could denote any possible way of arranging the literals into two parts. This includes A ∨ f, since one of those parts may be empty and the empty disjunction is false. So in the last simplification above, two conjuncts of the form P and ¬P can be replaced by f.

Steps 3′ and 4′. To obtain DNF, apply instead the other distributive law:

    A ∧ (B ∨ C) ≃ (A ∧ B) ∨ (A ∧ C)
    (B ∨ C) ∧ A ≃ (B ∧ A) ∨ (C ∧ A)

Exactly the same simplifications can be performed for DNF as for CNF, exchanging the roles of ∧ and ∨.
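Steps 1 and 2 are easy to mechanise. The following sketch in Python uses an illustrative tuple encoding of formulas (the encoding, and the function names, are assumptions made for this example only).

```python
# Formulas: ('var', name), ('not', A), ('and', A, B), ('or', A, B), ('imp', A, B)

def elim_imp(f):
    """Step 1: eliminate ->, using the law  A -> B  ~  (not A) or B."""
    tag = f[0]
    if tag == 'var': return f
    if tag == 'not': return ('not', elim_imp(f[1]))
    if tag == 'imp': return ('or', ('not', elim_imp(f[1])), elim_imp(f[2]))
    return (tag, elim_imp(f[1]), elim_imp(f[2]))

def nnf(f):
    """Step 2: push negations in until they apply only to atoms.
    Assumes Step 1 has already removed every occurrence of ->."""
    tag = f[0]
    if tag == 'var': return f
    if tag == 'not':
        g = f[1]
        if g[0] == 'var': return f                 # a literal: nothing to do
        if g[0] == 'not': return nnf(g[1])         # double negation
        if g[0] == 'and':                          # de Morgan: not(A and B)
            return ('or', nnf(('not', g[1])), nnf(('not', g[2])))
        if g[0] == 'or':                           # de Morgan: not(A or B)
            return ('and', nnf(('not', g[1])), nnf(('not', g[2])))
    return (tag, nnf(f[1]), nnf(f[2]))

P, Q = ('var', 'P'), ('var', 'Q')
# not(P -> Q) becomes P and (not Q), as in the NNF discussion above
assert nnf(elim_imp(('not', ('imp', P, Q)))) == ('and', P, ('not', Q))
```

Steps 3 and 4 (distribution and simplification) can be added in the same recursive style.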
interpretation I that falsifies every Lⱼ, and therefore falsifies Aᵢ. Define I such that, for every propositional letter P,

    I(P) = 0 if Lⱼ is P for some j
    I(P) = 1 if Lⱼ is ¬P for some j

This definition is legitimate because there cannot exist literals Lⱼ and Lₖ such that Lⱼ is ¬Lₖ; if there did, then simplification would have deleted the disjunction Aᵢ.

The powerful BDD method is based on similar ideas, but uses an if-then-else data structure, an ordering on the propositional letters, and some standard algorithmic techniques (such as hashing) to gain efficiency.

Example 1 Start with

    P ∨ Q → Q ∨ R.

Step 1, eliminate →, gives

    ¬(P ∨ Q) ∨ (Q ∨ R).

Step 2, push negations in, gives

    (¬P ∧ ¬Q) ∨ (Q ∨ R).

Step 3, push disjunctions in, gives

    (¬P ∨ Q ∨ R) ∧ (¬Q ∨ Q ∨ R).

Simplifying yields (¬P ∨ Q ∨ R) ∧ t and then

    ¬P ∨ Q ∨ R.

The interpretation P ↦ 1, Q ↦ 0, R ↦ 0 falsifies this formula, which is equivalent to the original formula. So the original formula is not valid.

Example 2 Start with

    P ∧ Q → Q ∧ P

Step 1, eliminate →, gives

    ¬(P ∧ Q) ∨ (Q ∧ P)

Step 2, push negations in, gives

    (¬P ∨ ¬Q) ∨ (Q ∧ P)

Step 3, push disjunctions in, gives

    (¬P ∨ ¬Q ∨ Q) ∧ (¬P ∨ ¬Q ∨ P)

Simplifying yields t ∧ t, which is t. Both conjuncts are valid since they contain a formula and its negation. Thus P ∧ Q → Q ∧ P is valid.

Example 3 Peirce's law is another example. Start with

    ((P → Q) → P) → P

Step 1, eliminate →, gives

    ¬(¬(¬P ∨ Q) ∨ P) ∨ P

Step 2, push negations in, gives

    (¬¬(¬P ∨ Q) ∧ ¬P) ∨ P
    ((¬P ∨ Q) ∧ ¬P) ∨ P

Step 3, push disjunctions in, gives

    (¬P ∨ Q ∨ P) ∧ (¬P ∨ P)

Simplifying again yields t. Thus Peirce's law is valid.

There is a dual method of refuting A (proving inconsistency). To refute A, reduce it to DNF, say A₁ ∨ · · · ∨ Aₘ. If A is inconsistent then so is each Aᵢ. Suppose Aᵢ is L₁ ∧ · · · ∧ Lₙ, where the Lⱼ are literals. If there is some literal L′ such that the Lⱼ include both L′ and ¬L′, then Aᵢ is inconsistent. If not then there is an interpretation that verifies every Lⱼ, and therefore Aᵢ.

To prove A, we can use the DNF method to refute ¬A. The steps are exactly the same as the CNF method because the extra negation swaps every ∨ and ∧. Gilmore implemented a theorem prover based upon this method in 1960.

Exercise 1 Is the formula P → ¬P satisfiable, or valid?

Exercise 2 Verify the de Morgan and distributive laws using truth tables.

Exercise 3 Each of the following formulas is satisfiable but not valid. Exhibit an interpretation that makes the formula true and another that makes the formula false.

    P → Q            P ∨ Q → P ∧ Q
    ¬(P ∨ Q ∨ R)     ¬(P ∧ Q) ∧ ¬(Q ∨ R) ∧ (P ∨ R)

Exercise 4 Convert each of the following propositional formulas into Conjunctive Normal Form and also into Disjunctive Normal Form. For each formula, state whether it is valid, satisfiable, or unsatisfiable; justify each answer.

    (P → Q) ∧ (Q → P)
    ((P ∧ Q) ∨ R) ∧ ¬(P ∨ R)
    ¬(P ∨ Q ∨ R) ∨ ((P ∧ Q) ∨ R)

Exercise 5 Using ML, define datatypes for representing propositions and interpretations. Write a function to test whether or not a proposition holds under an interpretation (both supplied as arguments). Write a function to convert a proposition to Negation Normal Form.

3 Proof Systems for Propositional Logic

We can verify any tautology by checking all possible interpretations, using the truth tables. This is a semantic approach, since it appeals to the meanings of the connectives. The syntactic approach is formal proof: generating theorems, or reducing a conjecture to a known theorem, by applying syntactic transformations of some sort. We have already seen a proof method based on CNF. Most proof methods are based on axioms and inference rules.
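Checking all possible interpretations is straightforward to implement. The following Python sketch (the tuple encoding of formulas is an illustrative assumption) confirms the verdicts reached in Examples 1 to 3 above.

```python
from itertools import product

# Formulas: ('var', name), ('not', A), ('and', A, B), ('or', A, B), ('imp', A, B)
def holds(f, interp):
    tag = f[0]
    if tag == 'var': return interp[f[1]]
    if tag == 'not': return not holds(f[1], interp)
    if tag == 'and': return holds(f[1], interp) and holds(f[2], interp)
    if tag == 'or':  return holds(f[1], interp) or holds(f[2], interp)
    if tag == 'imp': return (not holds(f[1], interp)) or holds(f[2], interp)
    raise ValueError(tag)

def tautology(f, letters):
    """True iff f holds under every interpretation of the given letters."""
    return all(holds(f, dict(zip(letters, vals)))
               for vals in product((False, True), repeat=len(letters)))

P, Q, R = ('var', 'P'), ('var', 'Q'), ('var', 'R')
peirce = ('imp', ('imp', ('imp', P, Q), P), P)
assert tautology(peirce, 'PQ')                                    # Example 3
assert tautology(('imp', ('and', P, Q), ('and', Q, P)), 'PQ')     # Example 2
assert not tautology(('imp', ('or', P, Q), ('or', Q, R)), 'PQR')  # Example 1
```

This is the semantic approach in miniature; the proof systems below take the syntactic route instead.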
What about efficiency? Deciding whether a propositional formula is satisfiable is an NP-complete problem (Aho, Hopcroft and Ullman 1974, pages 377–383). Thus all approaches are likely to be exponential in the length of the formula. Technologies such as BDDs and SAT solvers, which can decide huge problems in propositional logic, are all the more stunning because their success was wholly unexpected. But even they require a "well-behaved" input formula, and are exponential in the worst case.

3.1 A Hilbert-style proof system

Here is a simple proof system for propositional logic. There are countless similar systems. They are often called Hilbert systems after the logician David Hilbert, although they existed before him.

This proof system provides rules for implication only. The other logical connectives are not taken as primitive. They are instead defined in terms of implication:

    ¬A ≝ A → f
    A ∨ B ≝ ¬A → B
    A ∧ B ≝ ¬(¬A ∨ ¬B)

Obviously, these definitions apply only when we are discussing this proof system!

Note that A → (B → A) is a tautology. Call it Axiom K. Also,

    (A → (B → C)) → ((A → B) → (A → C))

is a tautology. Call it Axiom S. The Double-Negation Law ¬¬A → A is a tautology. Call it Axiom DN.

These axioms are more properly called axiom schemes, since we assume all instances of them that can be obtained by substituting formulas for A, B and C. For example, Axiom K is really an infinite set of formulas.

Whenever A → B and A are both valid, it follows that B is valid. We write this as the inference rule

    A → B    A
    ──────────
        B

This rule is traditionally called Modus Ponens. Together with Axioms K, S, and DN and the definitions, it suffices to prove all tautologies of classical propositional logic.[3] However, this formalization of propositional logic is inconvenient to use. For example, try proving A → A!

A variant of this proof system replaces the Double-Negation Law by the Contrapositive Law:

    (¬B → ¬A) → (A → B)

Another formalization of propositional logic consists of the Modus Ponens rule plus the following axioms:

    A ∨ A → A
    B → A ∨ B
    A ∨ B → B ∨ A
    (B → C) → (A ∨ B → A ∨ C)

Here A ∧ B and A → B are defined in terms of ¬ and ∨.

Where do truth tables fit into all this? Truth tables define the semantics, while proof systems define what is sometimes called the proof theory. A proof system must respect the truth tables. Above all, we expect the proof system to be sound: every theorem it generates must be a tautology. For this to hold, every axiom must be a tautology and every inference rule must yield a tautology when it is applied to tautologies.

The converse property is completeness: the proof system can generate every tautology. Completeness is harder to achieve and show. There are complete proof systems even for first-order logic. (And Gödel's incompleteness theorem uses the word "completeness" with a different technical meaning.)

3.2 Gentzen's Natural Deduction Systems

Natural proof systems do exist. Natural deduction, devised by Gerhard Gentzen, is based upon three principles:

1. Proof takes place within a varying context of assumptions.

2. Each logical connective is defined independently of the others. (This is possible because item 1 eliminates the need for tricky uses of implication.)

3. Each connective is defined by introduction and elimination rules.

For example, the introduction rule for ∧ describes how to deduce A ∧ B:

    A    B
    ────── (∧i)
    A ∧ B

The elimination rules for ∧ describe what to deduce from A ∧ B:

    A ∧ B             A ∧ B
    ───── (∧e1)       ───── (∧e2)
      A                 B

The elimination rule for → says what to deduce from A → B. It is just Modus Ponens:

    A → B    A
    ────────── (→e)
        B

The introduction rule for → says that A → B is proved by assuming A and deriving B:

    [A]
     ⋮
     B
    ───── (→i)
    A → B

For simple proofs, this notion of assumption is pretty intuitive. Here is a proof of the formula A ∧ B → A:

    [A ∧ B]
    ─────── (∧e1)
       A
    ───────── (→i)
    A ∧ B → A

The key point is that rule (→i) discharges its assumption: the assumption could be used to prove A from A ∧ B, but is no longer available once we conclude A ∧ B → A.

[3] If the Double-Negation Law is omitted, we get so-called intuitionistic logic. This axiom system is motivated by a philosophy of constructive mathematics. In a precise sense, it is connected with advanced topics including type theory and the combinators S and K in the λ-calculus.
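The challenge "try proving A → A!" can be met in five steps using only Axioms K and S and Modus Ponens. Here is a tiny proof checker for the system, sketched in Python (the encoding of formulas and the helper names are illustrative assumptions, not part of the notes):

```python
# Formulas: ('var', v) or ('imp', A, B).
def imp(a, b):
    return ('imp', a, b)

def axiom_K(a, b):
    return imp(a, imp(b, a))                                   # A -> (B -> A)

def axiom_S(a, b, c):                                          # Axiom S scheme
    return imp(imp(a, imp(b, c)), imp(imp(a, b), imp(a, c)))

def modus_ponens(ab, a):
    """From A -> B and A, conclude B; reject non-matching premises."""
    assert ab[0] == 'imp' and ab[1] == a, "premises do not match"
    return ab[2]

# The classic derivation of A -> A:
A, B = ('var', 'A'), ('var', 'B')
s1 = axiom_S(A, imp(B, A), A)  # (A->((B->A)->A)) -> ((A->(B->A)) -> (A->A))
s2 = axiom_K(A, imp(B, A))     # A -> ((B->A) -> A)
s3 = modus_ponens(s1, s2)      # (A -> (B->A)) -> (A -> A)
s4 = axiom_K(A, B)             # A -> (B->A)
s5 = modus_ponens(s3, s4)      # A -> A
assert s5 == imp(A, A)
```

Because the axioms are schemes, `axiom_K` and `axiom_S` take arbitrary formulas as arguments, mirroring substitution into the schemes.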
The introduction rules for ∨ are straightforward:

      A                 B
    ───── (∨i1)       ───── (∨i2)
    A ∨ B             A ∨ B

The elimination rule says that to show some C from A ∨ B there are two cases to consider, one assuming A and one assuming B:

             [A]    [B]
              ⋮      ⋮
    A ∨ B     C      C
    ─────────────────── (∨e)
            C

The scope of assumptions can get confusing in complex proofs. Let us switch attention to the sequent calculus, which is similar in spirit but easier to use.

3.3 The sequent calculus

The sequent calculus resembles natural deduction, but it makes the set of assumptions explicit. Thus, it is more concrete.

A sequent has the form Γ ⇒ Δ, where Γ and Δ are finite sets of formulas.[4] These sets may be empty. The sequent

    A₁, …, Aₘ ⇒ B₁, …, Bₙ

is true if A₁ ∧ · · · ∧ Aₘ implies B₁ ∨ · · · ∨ Bₙ. Sequent rules still have the same logical meaning, namely that the premises (above the line) imply the conclusion (below the line). Instead of matching a rule's premises with facts that we know, we match its conclusion with the formula we want to prove. That way, the form of the desired theorem controls the proof search. The rule (→r) is typical:

    A, Γ ⇒ Δ, B
    ──────────── (→r)
    Γ ⇒ Δ, A → B

Working backwards, this rule breaks down some implication on the right side of a sequent; Γ and Δ stand for the sets of formulas that are unaffected by the inference. The analogue of the pair (∨i1) and (∨i2) is the single rule

    Γ ⇒ Δ, A, B
    ──────────── (∨r)
    Γ ⇒ Δ, A ∨ B

This breaks down some disjunction on the right side, replacing it by both disjuncts. Thus, the sequent calculus is a kind of multiple-conclusion logic. Figure 1 summarises the rules.

    [Figure 1: the rules of the sequent calculus. Basic sequent: A, Γ ⇒ A, Δ; negation, conjunction, disjunction and implication rules for the left and right sides.]

Let us prove that the rule (∨l) is sound. We must show that if both premises are valid, then so is the conclusion. For contradiction, assume that the conclusion, A ∨ B, Γ ⇒ Δ, is not valid. Then there exists an interpretation I under which the left side is true while the right side is false; in particular, A ∨ B and Γ are true while Δ is false. Since A ∨ B is true under interpretation I, either A is true or B is. In the former case, A, Γ ⇒ Δ is false; in the latter case, B, Γ ⇒ Δ is false. Either case contradicts the assumption that the premises are valid.

The distributive law A ∨ (B ∧ C) ≃ (A ∨ B) ∧ (A ∨ C) is proved (one direction at least) as follows:

                            B, C ⇒ A, B
                           ───────────── (∧l)
    A ⇒ A, B               B ∧ C ⇒ A, B
    ──────────────────────────────────── (∨l)
            A ∨ (B ∧ C) ⇒ A, B
            ─────────────────── (∨r)
            A ∨ (B ∧ C) ⇒ A ∨ B                    similar
    ─────────────────────────────────────────────────────── (∧r)
            A ∨ (B ∧ C) ⇒ (A ∨ B) ∧ (A ∨ C)

The second, omitted proof tree proves A ∨ (B ∧ C) ⇒ A ∨ C similarly.

Finally, here is a failed proof of the invalid formula A ∨ B → B ∨ C:

    A ⇒ B, C    B ⇒ B, C
    ───────────────────── (∨l)
        A ∨ B ⇒ B, C
        ──────────── (∨r)
       A ∨ B ⇒ B ∨ C
      ─────────────── (→r)
      ⇒ A ∨ B → B ∨ C

The cut rule

    Γ ⇒ Δ, A    A, Γ ⇒ Δ
    ───────────────────── (cut)
           Γ ⇒ Δ

allows lemmas to be used. The cut-elimination theorem states that this rule is not required: all uses of it can be removed from any proof, but the proof could get exponentially larger. This special case of cut may be easier to understand. We prove lemma A from Γ and use A and Γ together to reach the conclusion B.

    Γ ⇒ B, A    A, Γ ⇒ B
    ─────────────────────
           Γ ⇒ B

Since Γ contains as much information as A, it is natural to expect that such lemmas should not be necessary, but the cut-elimination theorem is hard to prove.

Note On the course website, there is a simple theorem prover called folderol.ML. It can prove easy first-order theorems using the sequent calculus, and outputs a summary of each proof. The file begins with very basic instructions describing how to run it. The file testsuite.ML contains further instructions and numerous examples.

Exercise 6 Prove the following sequents:

3.5 Further Sequent Calculus Rules

Exercise 7 Prove the following sequents:
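Sequents like these can also be proved mechanically. Every propositional sequent rule is invertible, so backward proof search may apply any applicable rule and never needs to backtrack. A sketch in Python (the tuple encoding of formulas, and the use of Python sets for Γ and Δ, are illustrative choices, not the notes' own notation):

```python
# Formulas: ('var', v), ('not', A), ('and', A, B), ('or', A, B), ('imp', A, B).
# A sequent Gamma => Delta is a pair of sets of formulas.
def prove(gamma, delta):
    """Backward proof search in the propositional sequent calculus."""
    if gamma & delta:
        return True                          # basic sequent: A, Gamma => A, Delta
    for f in gamma:                          # try to break down a left formula
        tag = f[0]
        if tag == 'var':
            continue
        rest = gamma - {f}
        if tag == 'not': return prove(rest, delta | {f[1]})
        if tag == 'and': return prove(rest | {f[1], f[2]}, delta)
        if tag == 'or':  return (prove(rest | {f[1]}, delta) and
                                 prove(rest | {f[2]}, delta))
        if tag == 'imp': return (prove(rest, delta | {f[1]}) and
                                 prove(rest | {f[2]}, delta))
    for f in delta:                          # try to break down a right formula
        tag = f[0]
        if tag == 'var':
            continue
        rest = delta - {f}
        if tag == 'not': return prove(gamma | {f[1]}, rest)
        if tag == 'and': return (prove(gamma, rest | {f[1]}) and
                                 prove(gamma, rest | {f[2]}))
        if tag == 'or':  return prove(gamma, rest | {f[1], f[2]})
        if tag == 'imp': return prove(gamma | {f[1]}, rest | {f[2]})
    return False                             # only atoms remain, none shared

P, Q, R = ('var', 'P'), ('var', 'Q'), ('var', 'R')
# one direction of the distributive law, proved by hand above:
assert prove({('or', P, ('and', Q, R))},
             {('and', ('or', P, Q), ('or', P, R))})
# the failed proof: A or B => B or C really is unprovable
assert not prove({('or', P, Q)}, {('or', Q, R)})
```

This is the propositional core of what a prover like folderol.ML does; quantifier rules require considerably more care.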
Definition 3 The terms t, u, … of a first-order language are defined recursively as follows:

• A variable is a term.

• A constant symbol is a term.

• If t₁, …, tₙ are terms and f is an n-place function symbol then f(t₁, …, tₙ) is a term.

Definition 4 The formulas A, B, … of a first-order language are defined recursively as follows:

• If t₁, …, tₙ are terms and P is an n-place predicate symbol then P(t₁, …, tₙ) is a formula (called an atomic formula).

• If A and B are formulas then ¬A, A ∧ B, A ∨ B, A → B, A ↔ B are also formulas.

• If x is a variable and A is a formula then ∀x A and ∃x A are also formulas.

Brackets are used in the conventional way for grouping. Terms and formulas are tree-like data structures, not strings.

The quantifiers ∀x A and ∃x A bind tighter than the binary connectives; thus ∀x A ∧ B is equivalent to (∀x A) ∧ B. Frequently, you will see an alternative quantifier syntax, ∀x . A and ∃x . B, which binds more weakly than the binary connectives: ∀x . A ∧ B is equivalent to ∀x (A ∧ B). The dot is the give-away; look out for it!

Nested quantifications such as ∀x ∀y A are abbreviated to ∀x y A.

Example 4 A language for arithmetic might have the constant symbols 0, 1, 2, …, and function symbols +, −, ×, /, and the predicate symbols =, <, >, …. We may informally adopt an infix notation for the function and predicate symbols. Terms include 0 and (x + 3) − y; formulas include y = 0 and x + y < y + z.

4.2 Examples of first-order statements

Here are some sample formulas with a rough English translation. English is easier to understand but is too ambiguous for long derivations.

All professors are brilliant:

    ∀x (professor(x) → brilliant(x))

The income of any banker is greater than the income of any bedder:

    ∀x y (banker(x) ∧ bedder(y) → income(x) > income(y))

Note that > is a 2-place relation symbol. The infix notation is simply a convention.

Every student has a supervisor:

Every student's tutor is a member of the student's College:

Formalising the notion of tutor as a function incorporates the assumption that every student has exactly one tutor.

A mathematical example: there exist infinitely many Pythagorean triples:

    ∀n ∃i j k (i > n ∧ i² + j² = k²)

Here the superscript 2 refers to the squaring function. Equality (=) is just another relation symbol (satisfying suitable axioms) but there are many special techniques for it.

First-order logic requires a non-empty domain: thus ∀x P(x) implies ∃x P(x). If the domain could be empty, even ∃x t could fail to hold. Note also that ∀x ∃y y² = x is true if the domain is the complex numbers, and is false if the domain is the integers or reals. We determine properties of the domain by asserting the set of statements it must satisfy.

There are many other forms of logic. Many-sorted first-order logic assigns types to each variable, function symbol and predicate symbol, with straightforward type checking; types are called sorts and denote non-empty domains. Second-order logic allows quantification over functions and predicates. It can express mathematical induction by

    ∀P [P(0) ∧ ∀k (P(k) → P(k + 1)) → ∀n P(n)],

using quantification over the unary predicate P. In second-order logic, these functions and predicates must themselves be first-order, taking no functions or predicates as arguments. Higher-order logic allows unrestricted quantification over functions and predicates of any order. The list of logics could be continued indefinitely.

4.3 Formal semantics of first-order logic

Let us rigorously define the meaning of formulas. An interpretation of a language maps its function symbols to actual functions, and its relation symbols to actual relations. For example, the predicate symbol "student" could be mapped to the set of all students currently enrolled at the University.

Definition 5 Let L be a first-order language. An interpretation I of L is a pair (D, I). Here D is a nonempty set, the domain or universe. The operation I maps symbols to individuals, functions or sets:

• if c is a constant symbol (of L) then I[c] ∈ D

• if f is an n-place function symbol then I[f] ∈ Dⁿ → D (which means I[f] is an n-place function on D)

• if P is an n-place relation symbol then I[P] ∈ Dⁿ → {1, 0} (equivalently, I[P] ⊆ Dⁿ, which means I[P] is an n-place relation on D)
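Definitions 3 and 4 translate directly into tree-like data structures, as the remark above suggests. A small Python sketch (the tuple encoding is an illustrative choice; Exercise 5 asks for the analogous ML datatypes in the propositional case):

```python
# Terms:    ('var', x) or ('fun', f, [t1, ..., tn]); constants are 0-place functions.
# Formulas: ('pred', P, [t1, ..., tn]), ('not', A), ('and', A, B), ('or', A, B),
#           ('imp', A, B), ('iff', A, B), ('all', x, A), ('ex', x, A).

# The term (x + 3) - y of Example 4, with the infix symbols written prefix:
t = ('fun', '-', [('fun', '+', [('var', 'x'), ('fun', '3', [])]),
                  ('var', 'y')])

# The formula "all professors are brilliant" from Section 4.2:
f = ('all', 'x',
     ('imp', ('pred', 'professor', [('var', 'x')]),
             ('pred', 'brilliant', [('var', 'x')])))

def fun_symbols(t):
    """Collect the function symbols of a term: it is a tree, not a string."""
    if t[0] == 'var':
        return set()
    syms = {t[1]}
    for u in t[2]:
        syms |= fun_symbols(u)
    return syms

assert fun_symbols(t) == {'-', '+', '3'}
```

Because terms are trees, recursive functions over them follow the shape of Definition 3 exactly, one case per clause.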
One of the benefits of higher-order logic is that relations are a special case of functions, and formulas are simply boolean-valued terms.

An interpretation does not say anything about variables. An environment or valuation can be used to represent the values of variables.

Definition 6 A valuation V of L over D is a function from the variables of L into D. Write IV[t] for the value of t with respect to I and V, defined by

    IV[x] ≝ V(x)    if x is a variable
    IV[c] ≝ I[c]
    IV[f(t₁, …, tₙ)] ≝ I[f](IV[t₁], …, IV[tₙ])

Write V{a/x} for the valuation that maps x to a and is otherwise the same as V. Typically, we modify a valuation one variable at a time. This is a semantic analogue of substitution for the variable x.

4.4 What is truth?

We can define truth in first-order logic. This formidable definition formalizes the intuitive meanings of the connectives. Thus it almost looks like a tautology. It effectively specifies each connective by English descriptions. Valuations help specify the meanings of quantifiers. Alfred Tarski first defined truth in this manner.

Definition 7 Let A be a formula. Then for an interpretation I = (D, I) write ⊨I,V A to mean that A is true in I under V. This is defined by cases on the construction of the formula A:

    ⊨I,V P(t₁, …, tₙ) is defined to hold if I[P](IV[t₁], …, IV[tₙ]) = 1 (that is, the actual relation I[P] is true for the given values)

    ⊨I,V t = u if IV[t] equals IV[u] (if = is a predicate symbol of the language, then we insist that it really denotes equality)

    ⊨I,V ¬B if ⊨I,V B does not hold

    ⊨I,V B ∧ C if ⊨I,V B and ⊨I,V C

    ⊨I,V B ∨ C if ⊨I,V B or ⊨I,V C

    ⊨I,V B → C if ⊨I,V B does not hold or ⊨I,V C holds

    ⊨I,V B ↔ C if ⊨I,V B and ⊨I,V C both hold or neither hold

    ⊨I,V ∃x B if there exists some m ∈ D such that ⊨I,V{m/x} B holds (that is, B holds when x has the value m)

    ⊨I,V ∀x B if ⊨I,V{m/x} B holds for all m ∈ D

The cases for ∧, ∨, → and ↔ follow the propositional truth tables.

Write ⊨I A provided ⊨I,V A for all V. Clearly, if A is closed (contains no free variables) then its truth is independent of the valuation. The definitions of valid, satisfiable, etc. carry over almost verbatim from Sect. 2.2.

Definition 8 Let A be a formula having no free variables.

• An interpretation I satisfies a formula A if ⊨I A holds.

• A set S of formulas is valid if every interpretation of S satisfies every formula in S.

• A set S of formulas is satisfiable (or consistent) if there is some interpretation that satisfies every formula in S.

• A set S of formulas is unsatisfiable (or inconsistent) if it is not satisfiable. (Each interpretation falsifies some formula of S.)

• A model of a set S of formulas is an interpretation that satisfies every formula in S. We also consider models that satisfy a single formula.

Unlike in propositional logic, models can be infinite and there can be an infinite number of models. There is no chance of proving validity by checking all models. We must rely on proof.

Example 5 The formula P(a) ∧ ¬P(b) is satisfiable. Consider the interpretation with D = {London, Paris} and I defined by

    I[a] = Paris
    I[b] = London
    I[P] = {Paris}

On the other hand, ∀x y (P(x) ∧ ¬P(y)) is unsatisfiable because it requires P(x) to be both true and false for all x. Also unsatisfiable is P(x) ∧ ¬P(y): its free variables are taken to be universally quantified, so it is equivalent to ∀x y (P(x) ∧ ¬P(y)).

The formula (∃x P(x)) → P(c) holds in the interpretation (D, I) where D = {0, 1}, I[P] = {0}, and I[c] = 0. (Thus P(x) means "x equals 0" and c denotes 0.) If we modify this interpretation by making I[c] = 1 then the formula no longer holds. Thus it is satisfiable but not valid.

The formula (∀x P(x)) → (∀x P(f(x))) is valid, for let (D, I) be an interpretation. If ∀x P(x) holds in this interpretation then P(x) holds for all x ∈ D, thus I[P] = D. The symbol f denotes some actual function I[f] ∈ D → D. Since I[P] = D and I[f](x) ∈ D for all x ∈ D, formula ∀x P(f(x)) holds.

The formula ∀x y x = y is satisfiable but not valid; it is true in every domain that consists of exactly one element. (The empty domain is not allowed in first-order logic.)

Example 6 Let L be the first-order language consisting of the constant 0 and the (infix) 2-place function symbol +. An interpretation I of this language is any non-empty domain D together with values I[0] and I[+], with I[0] ∈ D and I[+] ∈ D × D → D. In the language L we may express the following axioms:

    x + 0 = x
    0 + x = x
    (x + y) + z = x + (y + z)
(Remember, free variables in effect are universally quantified, by the definition of ⊨_I A.) One model of these axioms is the set of natural numbers, provided we give 0 and + the obvious meanings. But the axioms have many other models.⁵ Below, let A be some set.

1. The set of all strings (in ML, say), letting 0 denote the empty string and + string concatenation.

2. The set of all subsets of A, letting 0 denote the empty set and + union.

3. The set of functions in A → A, letting 0 denote the identity function and + composition.

⁵ Models of these axioms are called monoids.

Exercise 8 To test your understanding of quantifiers, consider the following formulas: everybody loves somebody vs there is somebody that everybody loves:

∀x ∃y loves(x, y) (1)
∃y ∀x loves(x, y) (2)

Does (1) imply (2)? Does (2) imply (1)? Consider both the informal meaning and the formal semantics defined above.

Exercise 9 Describe a formula that is true in precisely those domains that contain at least m elements. (We say it characterises those domains.) Describe a formula that characterises the domains containing fewer than m elements.

Exercise 10 Let ≈ be a 2-place predicate symbol, which we write using infix notation as x ≈ y instead of ≈(x, y). Consider the axioms

∀x (x ≈ x) (1)
∀x y (x ≈ y → y ≈ x) (2)
∀x y z (x ≈ y ∧ y ≈ z → x ≈ z) (3)

Let the universe be the set of natural numbers, N = {0, 1, 2, . . .}. Which axioms hold if I[≈] is

1. the empty relation, ∅?
2. the universal relation, {(x, y) | x, y ∈ N}?
3. the equality relation, {(x, x) | x ∈ N}?
4. the relation {(x, y) | x, y ∈ N ∧ x + y is even}?
5. the relation {(x, y) | x, y ∈ N ∧ x + y = 100}?
6. the relation {(x, y) | x, y ∈ N ∧ x ≤ y}?

Exercise 11 Taking = and R as 2-place relation symbols, consider the following axioms:

∀x ¬R(x, x) (1)
∀x y ¬(R(x, y) ∧ R(y, x)) (2)
∀x y z (R(x, y) ∧ R(y, z) → R(x, z)) (3)
∀x y (R(x, y) ∨ (x = y) ∨ R(y, x)) (4)
∀x z (R(x, z) → ∃y (R(x, y) ∧ R(y, z))) (5)

Exhibit two interpretations that satisfy axioms 1–5. Exhibit two interpretations that satisfy axioms 1–4 and falsify axiom 5. Exhibit two interpretations that satisfy axioms 1–3 and falsify axioms 4 and 5. Consider only interpretations that make = denote the equality relation. (This exercise asks whether you can make the connection between the axioms and typical mathematical objects satisfying them. A start is to say that R(x, y) means x < y, but on what domain?)

5 Formal Reasoning in First-Order Logic

This section reviews some syntactic notations: free variables versus bound variables and substitution. It lists some of the main equivalences for quantifiers. Finally it describes and illustrates the quantifier rules of the sequent calculus.

5.1 Free vs bound variables

The notion of bound variable occurs widely in mathematics: consider the role of x in ∫ f(x) dx and the role of k in lim_{k→∞} a_k. Similar concepts occur in the λ-calculus. In first-order logic, variables are bound by quantifiers (rather than by λ).

Definition 9 An occurrence of a variable x in a formula is bound if it is contained within a subformula of the form ∀x A or ∃x A.
An occurrence of the form ∀x or ∃x is called the binding occurrence of x.
An occurrence of a variable is free if it is not bound.
A closed formula is one that contains no free variables.
A ground term, formula, etc. is one that contains no variables at all.

In ∀x ∃y R(x, y, z), the variables x and y are bound while z is free.

In (∃x P(x)) ∧ Q(x), the occurrence of x just after P is bound, while that just after Q is free. Thus x has both free and bound occurrences. Such situations can be avoided by renaming bound variables, for example obtaining (∃y P(y)) ∧ Q(x). Renaming can also ensure that all bound variables in a formula are distinct. The renaming of bound variables is sometimes called α-conversion.

Example 7 Renaming bound variables in a formula preserves its meaning, if done properly. Consider the following renamings of ∀x ∃y R(x, y, z):

∀u ∃y R(u, y, z)    OK
∀x ∃w R(x, w, z)    OK
∀u ∃y R(x, y, z)    not done consistently
∀y ∃y R(y, y, z)    clash with bound variable y
∀z ∃y R(z, y, z)    clash with free variable z

5.2 Substitution

If A is a formula, t is a term, and x is a variable, then A[t/x] is the formula obtained by substituting t for x throughout A. The substitution only affects the free occurrences of x. Pronounce A[t/x] as "A with t for x". We also use u[t/x] for substitution in a term u and C[t/x] for substitution in a clause C (clauses are described in Sect. 6 below).
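Substitution into quantified formulas needs care: a naive A[t/x] can capture variables of t under a quantifier. The sketch below is not from the notes; it assumes a tuple encoding of formulas and renames a bound variable before substituting, as when y + 1 replaces x in ∃y x = y.

```python
import itertools

# Formulas: ("=", t1, t2) atomic; ("ex", x, A); ("all", x, A).
# Terms: names (strings, covering variables and constants) or tuples ("+", t, u).

def term_vars(t):
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(a) for a in t[1:]))

def subst_term(t, new, x):
    if isinstance(t, str):
        return new if t == x else t
    return (t[0],) + tuple(subst_term(a, new, x) for a in t[1:])

def subst(A, new, x):
    """A[new/x], renaming bound variables to avoid capture."""
    op = A[0]
    if op in ("all", "ex"):
        _, y, B = A
        if y == x:                      # x is bound here: no free occurrences
            return A
        if y in term_vars(new):         # the bound y would capture a variable of `new`
            fresh = next(v for v in ("z%d" % i for i in itertools.count())
                         if v != x and v not in term_vars(new))
            B = subst(B, fresh, y)      # rename the bound y to a fresh name
            y = fresh
        return (op, y, subst(B, new, x))
    return (op,) + tuple(subst_term(t, new, x) for t in A[1:])

# ∃y x = y, substituting y + 1 for x: the bound y must be renamed first.
A = ("ex", "y", ("=", "x", "y"))
result = subst(A, ("+", "y", "1"), "x")
print(result)   # ('ex', 'z0', ('=', ('+', 'y', '1'), 'z0'))
```

The naive substitution would have produced ∃y (y + 1 = y), which is false in the usual models; the renamed result ∃z (y + 1 = z) is true in all of them.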
Substitution is only sensible provided all bound variables in A are distinct from all variables in t. This can be achieved by renaming the bound variables in A. For example, if ∀x A holds then A[t/x] is true for all t; the formula holds when we drop the ∀x and replace x by any term. But ∀x ∃y x = y is true in all models, while ∃y y + 1 = y is not. We may not replace x by y + 1, since the free occurrence of y in y + 1 gets captured by the ∃y. First we must rename the bound y, getting say ∀x ∃z x = z; now we may replace x by y + 1, getting ∃z y + 1 = z. This formula is true in all models, regardless of the meaning of the symbols + and 1.

5.3 Equivalences involving quantifiers

These equivalences are useful for transforming and simplifying quantified formulas. They can be used to convert formulas into prenex normal form, where all quantifiers are at the front, or conversely to move quantifiers into the smallest possible scope.

pulling quantifiers through negation (infinitary de Morgan laws)

¬(∀x A) ≃ ∃x ¬A
¬(∃x A) ≃ ∀x ¬A

pulling quantifiers through conjunction and disjunction (provided x is not free in B)

(∀x A) ∧ B ≃ ∀x (A ∧ B)
(∀x A) ∨ B ≃ ∀x (A ∨ B)
(∃x A) ∧ B ≃ ∃x (A ∧ B)
(∃x A) ∨ B ≃ ∃x (A ∨ B)

Two substitution laws do not involve quantifiers explicitly, but let us use x = t to replace x by t in a restricted context:

(x = t ∧ A) ≃ (x = t ∧ A[t/x])
(x = t → A) ≃ (x = t → A[t/x])

Many first-order formulas have easy proofs using equivalences:

∃x (x = a ∧ P(x)) ≃ ∃x (x = a ∧ P(a))
                  ≃ ∃x (x = a) ∧ P(a)
                  ≃ P(a)

The following formula is quite hard to prove using the sequent calculus, but using equivalences it is simple:

∃z (P(z) → P(a) ∧ P(b)) ≃ ∀z P(z) → P(a) ∧ P(b)
                        ≃ ∀z P(z) ∧ P(a) ∧ P(b) → P(a) ∧ P(b)
                        ≃ t

If you are asked to prove a formula, but no particular formal system (such as the sequent calculus) has been specified, then you may use any convincing argument. Using equivalences as above can shorten the proof considerably. Also, take advantage of symmetries; in proving A ∧ B ≃ B ∧ A, it obviously suffices to prove A ∧ B ⊨ B ∧ A.
Example 9 This proof concerns part of the law for pulling universal quantifiers out of conjunctions. Rule (∀l) just discards the quantifier, since it instantiates the bound variable x with the free variable x.

P(x), Q ⇒ P(x)
─────────────────────── (∧l)
P(x) ∧ Q ⇒ P(x)
─────────────────────── (∀l)
∀x (P(x) ∧ Q) ⇒ P(x)
─────────────────────── (∀r)
∀x (P(x) ∧ Q) ⇒ ∀x P(x)

Example 10 The sequent ∀x (A → B) ⇒ A → ∀x B is valid provided x is not free in A. That condition is required for the application of (∀r) below:

A ⇒ A, B    A, B ⇒ B
─────────────────────── (→l)
A, A → B ⇒ B
─────────────────────── (∀l)
A, ∀x (A → B) ⇒ B
─────────────────────── (∀r)
A, ∀x (A → B) ⇒ ∀x B
─────────────────────── (→r)
∀x (A → B) ⇒ A → ∀x B

What if the condition fails to hold? Let A and B both be the formula x = 0. Then ∀x (x = 0 → x = 0) is valid, but x = 0 → ∀x (x = 0) is not valid (it fails under any valuation that sets x to 0).

Note. The proof on the slides of

∀x (P → Q(x)) ⇒ P → ∀y Q(y)

is essentially the same as the proof above, but uses different variable names so that you can see how a quantified formula like ∀x (P → Q(x)) is instantiated to produce P → Q(y). The proof given above is also correct: the variable names are identical, the instantiation is trivial and ∀x (A → B) simply produces A → B. In our example, B may be any formula possibly containing the variable x; the proof on the slides uses the specific formula Q(x).

Exercise 12 Verify the following equivalences by appealing to the truth definition for first-order logic:

¬(∃x P(x)) ≃ ∀x ¬P(x)
(∀x P(x)) ∧ R ≃ ∀x (P(x) ∧ R)
(∃x P(x)) ∨ (∃x Q(x)) ≃ ∃x (P(x) ∨ Q(x))

Exercise 13 Explain why the following are not equivalences. Are they implications? In which direction?

(∀x A) ∨ (∀x B) ≄ ∀x (A ∨ B)
(∃x A) ∧ (∃x B) ≄ ∃x (A ∧ B)

Exercise 14 Prove ¬∀y [(Q(a) ∨ Q(b)) ∧ ¬Q(y)] using equivalences, and then formally using the sequent calculus.

Exercise 15 Prove the following sequents. Note that the last one requires two uses of the (∀l) rule!

(∀x P(x)) ∧ (∀x Q(x)) ⇒ ∀y (P(y) ∧ Q(y))
∀x (P(x) ∧ Q(x)) ⇒ (∀y P(y)) ∧ (∀y Q(y))
∀x [P(x) → P(f(x))], P(a) ⇒ P(f(f(x)))

Exercise 16 Prove ∀x [P(x) ∨ P(a)] ≃ P(a).

5.5 Sequent rules for existential quantifiers

Here are the sequent rules for ∃:

A, Γ ⇒ Δ
─────────── (∃l)
∃x A, Γ ⇒ Δ

Γ ⇒ Δ, A[t/x]
─────────── (∃r)
Γ ⇒ Δ, ∃x A

Rule (∃l) holds provided x is not free in the conclusion, that is, not free in the formulas of Γ or Δ. These rules are strictly dual to the ∀-rules; any example involving ∀ can easily be transformed into one involving ∃ and having a proof of precisely the same form. For example, the sequent ∀x P(x) ⇒ ∀y P(f(y)) can be transformed into ∃y P(f(y)) ⇒ ∃x P(x).

If you have a choice, apply rules that have provisos, namely (∃l) and (∀r), before applying the other quantifier rules as you work upwards. The other rules introduce terms and therefore new variables to the sequent, which could prevent you from applying (∃l) and (∀r) later.

Example 11 Figure 2 presents half of the ∃ distributive law. Rule (∃r) just discards the quantifier, instantiating the bound variable x with the free variable x. In the general case, it can instantiate the bound variable with any term. The restriction on the sequent rules, namely "x is not free in the conclusion", can be confusing when you are building a sequent proof working backwards. One simple way to avoid problems is always to rename a quantified variable if the same variable appears free in the sequent. For example, when you see the sequent P(x), ∃x Q(x) ⇒ Δ, replace it immediately by P(x), ∃y Q(y) ⇒ Δ.

Example 12 The sequent

∃x P(x) ∧ ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))

is not valid: the value of x that makes P(x) true could differ from the value of x that makes Q(x) true. This comes out clearly in the proof attempt, where we are not allowed to apply (∃l) twice with the same variable name, x. As soon as we are forced to rename the second variable to y, it becomes obvious that the two values could differ. Turning to the right side of the sequent, no application of (∃r) can lead to a proof. We have nothing to instantiate x with:

P(x), Q(y) ⇒ P(x) ∧ Q(x)
──────────────────────────────────── (∃r)
P(x), Q(y) ⇒ ∃x (P(x) ∧ Q(x))
──────────────────────────────────── (∃l)
P(x), ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))
──────────────────────────────────── (∃l)
∃x P(x), ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))
──────────────────────────────────── (∧l)
∃x P(x) ∧ ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))

Exercise 17 Prove the following using the sequent calculus. The last one is difficult and requires two uses of (∃r).

P(a) ∨ ∃x P(f(x)) ⇒ ∃y P(y)
∃x (P(x) ∨ Q(x)) ⇒ (∃y P(y)) ∨ (∃y Q(y))
⇒ ∃z (P(z) → P(a) ∧ P(b))

6 Clause Methods for Propositional Logic

This section discusses two proof methods in the context of propositional logic.
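Both clause methods operate on conjunctive normal form. As an illustrative sketch (the tuple encoding is an assumption, and ↔ is omitted for brevity), the standard conversion eliminates →, pushes negations in, and distributes ∨ over ∧, with literals represented as (name, polarity) pairs:

```python
def elim_imp(A):
    """Step 1: eliminate → using A → B = ¬A ∨ B."""
    op = A[0]
    if op == "imp":
        return ("or", ("not", elim_imp(A[1])), elim_imp(A[2]))
    if op in ("and", "or"):
        return (op, elim_imp(A[1]), elim_imp(A[2]))
    if op == "not":
        return ("not", elim_imp(A[1]))
    return A                                   # ("var", name)

def nnf(A):
    """Step 2: push negations in (de Morgan), cancelling double negations."""
    if A[0] == "not":
        B = A[1]
        if B[0] == "not":
            return nnf(B[1])
        if B[0] == "and":
            return ("or", nnf(("not", B[1])), nnf(("not", B[2])))
        if B[0] == "or":
            return ("and", nnf(("not", B[1])), nnf(("not", B[2])))
        return A                               # negated variable
    if A[0] in ("and", "or"):
        return (A[0], nnf(A[1]), nnf(A[2]))
    return A

def clauses(A):
    """Step 3: distribute ∨ over ∧, returning a set of clauses (frozensets)."""
    if A[0] == "and":
        return clauses(A[1]) | clauses(A[2])
    if A[0] == "or":
        return {c | d for c in clauses(A[1]) for d in clauses(A[2])}
    if A[0] == "not":
        return {frozenset([(A[1][1], False)])}
    return {frozenset([(A[1], True)])}

# Negating P ∨ Q → Q ∨ R gives the clauses {P, Q}, {¬Q}, {¬R}.
P, Q, R = ("var", "P"), ("var", "Q"), ("var", "R")
A = ("not", ("imp", ("or", P, Q), ("or", Q, R)))
cs = clauses(nnf(elim_imp(A)))
print(cs)
```

The output matches the three clauses used in Example 13 below.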
K1, · · · , Km → L1, · · · , Ln

and when n = 1 we have the familiar Prolog clauses, also known as definite or Horn clauses.

Example 13 DPLL can show that a formula is not a theorem. Consider the formula P ∨ Q → Q ∨ R. After negating this and converting to CNF, we obtain the three clauses
{P, Q}, {¬Q} and {¬R}. The DPLL method terminates rapidly:

{P, Q} {¬Q} {¬R}    initial clauses
{P} {¬R}            unit ¬Q
{¬R}                unit P (also pure)
                    unit ¬R (also pure)

All clauses have been deleted, so execution terminates. The clauses are satisfiable by P ↦ 1, Q ↦ 0, R ↦ 0. This interpretation falsifies P ∨ Q → Q ∨ R.

Example 14 Here is an example of a case split. Consider the clause set

{¬Q, R} {¬R, P} {¬R, Q} {¬P, Q, R} {P, Q} {¬P, ¬Q}.

There are no unit clauses or pure literals, so we arbitrarily select P for case splitting:

{¬Q, R} {¬R, Q} {Q, R} {¬Q}    if P is true
{¬R} {R}                       unit ¬Q
{}                             unit R

{¬Q, R} {¬R} {¬R, Q} {Q}       if P is false
{¬Q} {Q}                       unit ¬R
{}                             unit ¬Q

The empty clause is written {} above to make the pattern of execution clearer; traditionally, however, the empty clause is written □. When we encounter a contradiction, we abandon the current case and consider any remaining cases. If all cases are contradictory, then the original set of clauses is inconsistent. If they arose from some negated formula ¬A, then A is a theorem.

You might find it instructive to download MiniSat, which is a very concise open-source SAT solver. It is coded in C++. These days, SAT solvers are largely superseded by SMT solvers, which also handle arithmetic, arrays, bit vectors, etc.

6.3 Introduction to resolution

Resolution combines two clauses containing complementary literals. It is essentially the following rule of inference:

B ∨ A    ¬B ∨ C
───────────────
A ∨ C

A special case of resolution is when A and C are empty:

B    ¬B
───────
f

This detects contradictions.

Resolution works with disjunctions. The aim is to prove a contradiction, refuting a formula. Here is the method for proving a formula A:

1. Translate ¬A into CNF as A1 ∧ · · · ∧ Am.
2. Break this into a set of clauses: A1, . . ., Am.
3. Repeatedly apply the resolution rule to the clauses, producing new clauses. These are all consequences of ¬A.
4. If a contradiction is reached, we have refuted ¬A.

In set notation the resolution rule is

{B, A1, . . . , Am}    {¬B, C1, . . . , Cn}
──────────────────────────────────────────
{A1, . . . , Am, C1, . . . , Cn}

Resolution takes two clauses and creates a new one. A collection of clauses is maintained; the two clauses are chosen from the collection according to some strategy, and the new clause is added to it. If m = 0 or n = 0 then the new clause will be smaller than one of the parent clauses; if m = n = 0 then the new clause will be empty. If the empty clause is generated, resolution terminates successfully: we have found a contradiction!

6.4 Examples of ground resolution

Let us try to prove

P ∧ Q → Q ∧ P

Convert its negation to CNF:

¬(P ∧ Q → Q ∧ P)

We can combine steps 1 (eliminate →) and 2 (push negations in) using the law ¬(A → B) ≃ A ∧ ¬B:

(P ∧ Q) ∧ ¬(Q ∧ P)
(P ∧ Q) ∧ (¬Q ∨ ¬P)

{P, Q} {¬P, Q}    {P, ¬Q} {¬P, ¬Q}
      {Q}               {¬Q}
              □

Note that the tree contains {Q} and {¬Q} rather than {Q, Q} and {¬Q, ¬Q}. If we forget to suppress repeated literals, we can get stuck. Resolving {Q, Q} and {¬Q, ¬Q} (keeping repetitions) gives {Q, ¬Q}, a tautology. Tautologies are useless. Resolving this one with the other clauses leads nowhere. Try it.

These examples could mislead. Must a proof use each clause exactly once? No! A clause may be used repeatedly, and many problems contain redundant clauses. Here is an example:

{¬P, R} {P}    {¬Q, R} (unused)
  {R} {¬R}
    □

Redundant clauses can make the theorem-prover flounder; this is a challenge facing the field.

6.5 A proof using a set of assumptions

In this example we assume

H → M ∨ N    M → K ∧ P    N → L ∧ P

and prove H → P. It turns out that we can generate clauses separately from the assumptions (taken positively) and the conclusion (negated). If we call the assumptions A1, . . ., Ak and the conclusion B, then the desired theorem is

(A1 ∧ · · · ∧ Ak) → B

Try negating this and converting to CNF. Using the law ¬(A → B) ≃ A ∧ ¬B, the negation converts in one step to

A1 ∧ · · · ∧ Ak ∧ ¬B

Since the entire formula is a conjunction, we can separately convert A1, . . ., Ak, and ¬B to clause form and pool the clauses together. A tree for the resolution proof is

{H} {¬H, M, N}
  {M, N} {¬M, P}
    {N, P} {¬N, P}
      {P} {¬P}
        □

The clauses were not tried at random. Here are some points of proof strategy.

Ignoring irrelevance. Clauses {¬M, K} and {¬N, L} lead nowhere, so they were not tried. Resolving with one of these would make a clause containing K or L. There is no way of getting rid of either literal, for no clause contains ¬K or ¬L. So this is not a way to obtain the empty clause. (K and L are pure literals.)

Working from the goal. In each resolution step, at least one clause involves the negated conclusion (possibly via earlier resolution steps). We do not attempt to find a contradiction in the assumptions alone, provided (as is often the case) we know them to be consistent: any contradiction must involve the negated conclusion. This strategy is called set of support. Although largely obsolete, it's very useful when working problems by hand.

Linear resolution. The proof has a linear structure: each resolvent becomes the parent clause for the next resolution step. Furthermore, the other parent clause is always one of the original set of clauses. This simple structure is very efficient because only the last resolvent needs to be saved. It is similar to the execution strategy of Prolog.

{P, Q} {¬P, Q} {P, ¬Q} {¬P, ¬Q}.

Exercise 19 Use resolution to prove

(A → B ∨ C) → [(A → B) ∨ (A → C)].

Exercise 20 Explain in more detail the conversion into clauses for the example of §6.5.
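The saturation procedure of §6.3, on ground clauses, fits in a few lines. This is an illustrative sketch, not the notes' code: no subsumption, no set-of-support strategy, just blind saturation until the empty clause appears or nothing new can be derived. It refutes the §6.5 clause set:

```python
from itertools import combinations

def negate(lit):
    return (lit[0], not lit[1])

def resolve(c, d):
    """All resolvents of clauses c, d (frozensets of literals like ("P", True))."""
    out = set()
    for lit in c:
        if negate(lit) in d:
            out.add((c - {lit}) | (d - {negate(lit)}))
    return out

def refute(clause_set):
    """Saturate under resolution; True iff the empty clause is derived."""
    cls = set(clause_set)
    while True:
        new = set()
        for c, d in combinations(cls, 2):
            for r in resolve(c, d):
                if not r:
                    return True            # empty clause: contradiction found
                if r not in cls:
                    new.add(r)
        if not new:
            return False                   # saturated without contradiction
        cls |= new

# §6.5: assumptions H → M∨N, M → K∧P, N → L∧P, plus negated conclusion of H → P.
cs = [frozenset(s) for s in (
    {("H", False), ("M", True), ("N", True)},
    {("M", False), ("K", True)}, {("M", False), ("P", True)},
    {("N", False), ("L", True)}, {("N", False), ("P", True)},
    {("H", True)}, {("P", False)})]
print(refute(cs))   # True: H → P follows from the assumptions
```

Because resolvents are built as sets, repeated literals are suppressed automatically, exactly the point made about {Q, Q} and {¬Q, ¬Q} in §6.4.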
7 Skolem Functions, Herbrand's Theorem and Unification
Exercise 21 Prove Peirce's law, ((P → Q) → P) → P, using resolution.

Exercise 22 Use resolution (showing the steps of converting the formula into clauses) to prove these two formulas:

(Q → R) ∧ (R → P ∧ Q) ∧ (P → Q ∨ R) → (P ↔ Q)
(P ∧ Q → R) ∧ (P ∨ Q ∨ R) → ((P ↔ Q) → R)

Exercise 23 Prove that (P ∧ Q) → (R ∧ S) follows from P → R and R ∧ P → S using linear resolution.

Exercise 24 Convert these axioms to clauses, showing all steps. Then prove Winterstorm → Miserable by resolution:

Rain ∧ (Windy ∨ ¬Umbrella) → Wet
Winterstorm → Storm ∧ Cold
Wet ∧ Cold → Miserable
Storm → Rain ∧ Windy

7.1 Removing quantifiers: Skolem form

Skolemisation replaces every existentially bound variable by a Skolem constant or function. This transformation does not preserve the meaning of a formula. However, it does preserve inconsistency, which is the critical property: resolution works by detecting contradictions.

Take a formula in negation normal form. Starting from the outside, follow the nesting structure of the quantifiers. If the formula contains an existential quantifier, then the series of quantifiers must have the form

∀x1 · · · ∀x2 · · · ∀xk · · · ∃y A

where A is a formula, k ≥ 0, and ∃y is the leftmost existential quantifier. Choose a k-place function symbol f not present in A (that is, a new function symbol). Delete the ∃y and replace all other occurrences of y by f(x1, x2, . . . , xk). The result is another formula:

∀x1 · · · ∀x2 · · · ∀xk · · · A[f(x1, x2, . . . , xk)/y]

If some existential quantifier is not enclosed in any universal quantifiers, then the formula contains simply ∃y A as a subformula. Then this quantifier is deleted and occurrences of y are replaced by a new constant symbol c. The resulting subformula is A[c/y].

Then repeatedly eliminate all remaining existential quantifiers as above. The new symbols are called Skolem functions (or Skolem constants).

After Skolemization, the formula has only universal quantifiers. The next step is to throw the remaining quantifiers away. This step is correct because we are converting to clause form, and a clause implicitly includes universal quantifiers over all of its free variables.

We are almost back to the propositional case, except the formula typically contains terms. We shall have to handle constants, function symbols, and variables.

Prenex normal form, where all quantifiers are moved to the front of the formula, would make things easier to follow. However, increasing the scope of the quantifiers prior to Skolemization makes proofs much more difficult. Pushing quantifiers in as far as possible, instead of pulling them out, yields a better set of clauses.

Examples of Skolemization

For simplicity, we start with prenex normal form.

Example 15 Start with

∃x ∀y ∃z R(x, y, z)

Eliminate the ∃x using the Skolem constant a, then the ∃z using the 1-place Skolem function f:

∀y R(a, y, f(y))

Finally, drop the ∀y and convert the remaining formula to a clause:

{R(a, y, f(y))}

Example 16 Start with

∃u ∀v ∃w ∃x ∀y ∃z ((P(h(u, v)) ∨ Q(w)) ∧ R(x, h(y, z)))

Eliminate the ∃u using the Skolem constant c:

∀v ∃w ∃x ∀y ∃z ((P(h(c, v)) ∨ Q(w)) ∧ R(x, h(y, z)))

Eliminate the ∃w using the 1-place Skolem function f:

∀v ∃x ∀y ∃z ((P(h(c, v)) ∨ Q(f(v))) ∧ R(x, h(y, z)))

Eliminate the ∃x using the 1-place Skolem function g:

∀v ∀y ∃z ((P(h(c, v)) ∨ Q(f(v))) ∧ R(g(v), h(y, z)))

Eliminate the ∃z using the 2-place Skolem function j (the function h is already used!):

∀v ∀y ((P(h(c, v)) ∨ Q(f(v))) ∧ R(g(v), h(y, j(v, y))))

Dropping the universal quantifiers yields a set of clauses:

{P(h(c, v)), Q(f(v))}    {R(g(v), h(y, j(v, y)))}

Each clause is implicitly enclosed by universal quantifiers over each of its variables. So the occurrences of the variable v in the two clauses are independent of each other.
Let’s try this example again, first pushing quantifiers in Thus, H consists of all the terms that can be written using
to the smallest possible scopes: only the constants and function symbols present in S. There
are no variables: the elements of H are ground terms. For-
∃u ∀v P(h(u, v)) ∨ ∃w Q(w) ∧ ∃x ∀y ∃z R(x, h(y, z)) mally, H turns out to satisfy the recursive equation
Theorem 15 (Herbrand) A set S of clauses is unsatisfi- Example 19 The substitution θ = [h(y)/x, b/y] says to
able if and only if there is a finite unsatisfiable set S 0 of replace x by h(y) and y by b. The replacements occur si-
ground instances of clauses of S. multaneously; it does not have the effect of replacing x by
h(b). Applying this substitution gives
The theorem is valuable because the new set S 0 expresses
f (x, g(u), y)θ = f (h(y), g(u), b)
the inconsistency in a finite way. However, it only tells us
0 0
that S exists; it does not tell us how to derive S . So how R(h(x), z)θ = R(h(h(y)), z)
do we generate useful ground instances of clauses? One {P(x), ¬Q(y)}θ = {P(h(y)), ¬Q(b)}
answer, outlined below, is unification.
Definition 17 If φ and θ are substitutions then so is their
composition φ ◦ θ , which satisfies
Example 18 To demonstrate the Skolem-Gödel-Herbrand
t (φ ◦ θ ) = (tφ)θ for all terms t
theorem, consider proving the formula
This set is inconsistent. Here is a finite set of ground in- 7.5 Unifiers
stances of clauses in S:
Definition 18 A substitution θ is a unifier of terms t1 and
{P(a)} {P(b)} {¬P(a), Q(a)} t2 if t1 θ = t2 θ . More generally, θ is a unifier of terms t1 ,
t2 , . . ., tm if t1 θ = t2 θ = · · · = tm θ. The term t1 θ is the
{¬P(b), Q(b)} {¬Q(a), ¬Q(b)}.
common instance.
This set reflects the intuitive proof of the theorem. We obvi- Two terms can only be unified if they have similar struc-
ously have P(a) and P(b); using ∀y [P(y) → Q(y)] with ture apart from variables. The terms f (x) and h(y, z) are
a and b, we obtain Q(a) and Q(b). If we can automate this clearly non-unifiable since no substitution can do anything
procedure, then we can generate such proofs automatically. about the differing function symbols. It is easy to see that
θ unifies f (t1 , . . . , tn ) and f (u 1 , . . . , u n ) if and only if θ
unifies ti and u i for all i = 1, . . . , n.
7.4 Unification
Unification is the operation of finding a common instance Example 21 The substitution [3/x, g(3)/y] unifies the
of two or more terms. Consider a few examples. The terms terms g(g(x)) and g(y). The common instance is g(g(3)).
f (x, b) and f (a, y) have the common instance f (a, b), re- These terms have many other unifiers, such as these:
placing x by a and y by b. The terms f (x, x) and f (a, b) unifying substitution common instance
have no common instance, assuming that a and b are dis- [ f (u)/x, g( f (u))/y] g(g( f (u)))
tinct constants. The terms f (x, x) and f (y, g(y)) have no [z/x, g(z)/y] g(g(z))
common instance, since there is no way that x can have the [g(x)/y] g(g(x))
form y and g(y) at the same time — unless we admit the
infinite term g(g(g(· · · ))). Note that g(g(3)) and g(g( f (u))) are both instances of
Only variables may be replaced by other terms. Con- g(g(x)). Thus g(g(x)) is more general than g(g(3)) and
stants are not affected (they remain constant!). Instances of g(g( f (u))). Certainly g(g(3)) seems to be arbitrary —
the term f (t, u) must have the form f (t 0 , u 0 ), where t 0 is an neither of the original terms mentions 3! Also important:
instance of t and u 0 is an instance of u. g(g(x)) is as general as g(g(z)), despite the different vari-
able names. Let us formalize these intuitions.
Definition 16 A substitution is a finite set of replacements Definition 19 The substitution θ is more general than φ if
φ = θ ◦ σ for some substitution σ .
[t1 /x1 , . . . , tk /xk ]
Example 22 Recall the unifiers of g(g(x)) and g(y). The
where x1 , . . ., xk are distinct variables such that ti 6 = xi for unifier [g(x)/y] is more general than the others listed, for
all i = 1, . . . , k. We use Greek letters φ, θ , σ to stand for
substitutions. [3/x, g(3)/y] = [g(x)/y] ◦ [3/x]
[ f (u)/x, g( f (u))/y] = [g(x)/y] ◦ [ f (u)/x]
A substitution θ = [t1 /x1 , . . . , tk /xk ] defines a func- [z/x, g(z)/y] = [g(x)/y] ◦ [z/x]
tion from the variables {x1 , . . . , xk } to terms. Postfix no-
[g(x)/y] = [g(x)/y] ◦ []
tation is usual for applying a substitution; thus, for exam-
ple, xi θ = ti . Substitution on terms, literals and clauses is The last line above illustrates that every substitution θ is
defined recursively in the obvious way: more general than itself because θ = θ ◦ [].
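Unification with the occurs check can be sketched as follows. This is an illustrative implementation, not the notes' algorithm: substitutions are dicts, variables are plain strings, and constants are encoded as 1-tuples such as ("a",) to distinguish them from variables (all of these encodings are assumptions).

```python
def resolve(t, theta):
    """Follow variable bindings in theta."""
    while isinstance(t, str) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    """The occurs check: does variable v occur in t under theta?"""
    t = resolve(t, theta)
    if t == v:
        return True
    return not isinstance(t, str) and any(occurs(v, a, theta) for a in t[1:])

def unify(t, u, theta=None):
    """Return a most general unifier extending theta, or None."""
    if theta is None:
        theta = {}
    t, u = resolve(t, theta), resolve(u, theta)
    if t == u:
        return theta
    if isinstance(t, str):
        return None if occurs(t, u, theta) else {**theta, t: u}
    if isinstance(u, str):
        return None if occurs(u, t, theta) else {**theta, u: t}
    if t[0] != u[0] or len(t) != len(u):
        return None                      # different function symbols
    for a, b in zip(t[1:], u[1:]):
        theta = unify(a, b, theta)
        if theta is None:
            return None
    return theta

# f(x, b) and f(a, y) unify, giving the common instance f(a, b).
print(unify(("f", "x", ("b",)), ("f", ("a",), "y")))   # {'x': ('a',), 'y': ('b',)}
# f(x, x) and f(y, g(y)) fail the occurs check: no unifier.
print(unify(("f", "x", "x"), ("f", "y", ("g", "y"))))  # None
```

On g(g(x)) and g(y) this returns the most general unifier [g(x)/y] of Example 21, binding y to ("g", "x") rather than to any particular instance.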
In the diagrams, the lines indicate variable replacements:

j(w, a, h(w))    j(f(x, y), x, y)
    a/x
    f(a, y)/w
    h(f(a, y))/y ???

Implementation remarks

To unify terms t1, t2, . . ., tm for m > 2, compute a unifier θ of t1 and t2, then recursively compute a unifier σ of the terms t2θ, . . ., tmθ. The overall unifier is then θ ◦ σ. If any unification fails then the set is not unifiable.

A real implementation does not need to compose substitutions. Most represent variables by pointers and effect the substitution [t/x] by updating pointer x to t. The compositions are cumulative, so this works. However, if unification fails at some point, the pointer assignments must be undone! The algorithm sketched here can take exponential time in unusual cases. Faster algorithms exist, but they are more complex and seldom adopted.

Prolog systems, for the sake of efficiency, omit the occurs check. This can result in circular data structures and looping. It is unsound for theorem proving.

7.7 Examples of theorem proving

These two examples are fundamental. They illustrate how the occurs check enforces correct quantifier reasoning.

Example 30 Consider a proof of

(∃y ∀x R(x, y)) → (∀x ∃y R(x, y)).

For simplicity, produce clauses separately for the antecedent and for the negation of the consequent.

• The antecedent is ∃y ∀x R(x, y); replacing y by the Skolem constant a yields the clause {R(x, a)}.

• In ¬(∀x ∃y R(x, y)), pushing in the negation produces ∃x ∀y ¬R(x, y). Replacing x by the Skolem constant b yields the clause {¬R(b, y)}.

Unifying R(x, a) with R(b, y) detects the contradiction R(b, a) ∧ ¬R(b, a).

Now consider attempting to prove the converse, (∀x ∃y R(x, y)) → (∃y ∀x R(x, y)).

• The antecedent is ∀x ∃y R(x, y); replacing y by the Skolem function f yields the clause {R(x, f(x))}.

• The negation of the consequent is ¬(∃y ∀x R(x, y)), which becomes ∀y ∃x ¬R(x, y). Replacing x by the Skolem function g yields the clause {¬R(g(y), y)}.

Observe that R(x, f(x)) and R(g(y), y) are not unifiable because of the occurs check. And so it should be, because the original formula is not a theorem! The best way to demonstrate that a formula is not a theorem is to exhibit a counterexample. Here are two:

• The domain is the set of all people who have ever lived. The relation R(x, y) holds if x loves y. The function f(x) denotes the mother of x, and {R(x, f(x))} holds because everybody loves their mother. The function g(x) denotes the landlord of x, and {¬R(g(y), y)} holds.

• The domain is the set of integers. The relation R(x, y) holds whenever x = y. The function f is defined by f(x) = x and {R(x, f(x))} holds. The function g is defined by g(x) = x + 1 and so {¬R(g(y), y)} holds.

Exercise 25 Consider a first-order language with 0 and 1 as constant symbols, with − as a 1-place function symbol and + as a 2-place function symbol, and with < as a 2-place predicate symbol.

(a) Describe the Herbrand Universe for this language.

(b) The language can be interpreted by taking the integers for the universe and giving 0, 1, −, +, and < their usual meanings over the integers. What do those symbols denote in the corresponding Herbrand model?

Exercise 26 For each of the following pairs of terms, give a most general unifier or explain why none exists. Do not rename variables prior to performing the unification.

f(g(x), z)       f(y, h(y))
j(x, y, z)       j(f(y, y), f(z, z), f(a, a))
j(x, z, x)       j(y, f(y), z)
j(f(x), y, a)    j(y, z, z)
j(g(x), a, y)    j(z, x, f(z, z))

8 First-Order Resolution and Prolog

By means of unification, we can extend resolution to first-order logic. As a special case we obtain Prolog. Other theorem provers are also based on unification. Other applications include polymorphic type checking.

A number of resolution theorem provers can be downloaded for free from the Internet. Some of the main ones include Vampire (http://www.vprover.org), SPASS (http://www.spass-prover.org) and E (http://www.eprover.org). It might be instructive to download one of them and experiment with it.

As before, the first clause contains B and other literals, while the second clause contains ¬D and other literals. The substitution σ is a unifier of B and D (almost always a most
general unifier). This substitution is applied to all remaining literals, producing the conclusion.

The variables in one clause are renamed before resolution to prevent clashes with the variables in the other clause. Renaming is sound because the scope of each variable is its clause. Resolution is sound because it takes an instance of each clause (the instances are valid, because the clauses are universally valid) and then applies the propositional resolution rule, which is sound. For example, the two clauses

{P(x)} and {¬P(g(x))}

yield the empty clause in a single resolution step. This works by renaming variables, say x to y in the second clause, and unifying P(x) with P(g(y)). Forgetting to rename variables is fatal, because P(x) cannot be unified with P(g(x)).

Factoring is necessary here. Applying the factoring rule to each of these clauses yields two additional clauses:

{¬P(a, a)}    {P(a, a)}

These are complementary unit clauses, so resolution yields the empty clause. This proof is trivial!

As a general rule, if there are no unit clauses to begin with, then factoring will be necessary. Otherwise, resolution steps will continue to yield clauses that have at least two literals. The only exception is when there are repeated literals, as in the following example.

Example 33 Let us prove ∃x [P → Q(x)] ∧ ∃x [Q(x) → P] → ∃x [P ↔ Q(x)]. The clauses are

{P, ¬Q(b)} {P, Q(x)} {¬P, ¬Q(x)} {¬P, Q(a)}
A definite or program clause is one of the form {B, ¬A1, . . . , ¬Am}, containing exactly one positive literal. It is logically equivalent to (A1 ∧ · · · ∧ Am) → B. Prolog's notation is

B ← A1, . . . , Am.

If m = 0 then the clause is simply written as B and is sometimes called a fact.

A negative or goal clause is one of the form

{¬A1, . . . , ¬Am}

Prolog permits just one of these; it represents the list of unsolved goals. Prolog's notation is

← A1, . . . , Am.

A Prolog database consists of definite clauses. Observe that definite clauses cannot express negative assertions, since they must contain a positive literal. From a mathematical point of view, they have little expressive power; every set of definite clauses is consistent! Even so, definite clauses are a natural notation for many problems.

8.5 Prolog computations

A Prolog computation takes a database of definite clauses together with one goal clause. It repeatedly resolves the goal clause with some definite clause to produce a new goal clause. If resolution produces the empty goal clause, then execution succeeds.

Here is a diagram of a Prolog computation step:

definite clause:    B ← A1, . . . , An
goal clause:        ← B1, . . . , Bm
                    (unify B with B1)
new goal clause:    ← A1, . . . , An, B2, . . . , Bm

This is a linear resolution (§6). Two program clauses are never resolved with each other. The result of each resolution step becomes the next goal clause; the previous goal clause is discarded after use.

Prolog resolution is efficient, compared with general resolution, because it involves less search and storage. General resolution must consider all possible pairs of clauses; it adds their resolvents to the existing set of clauses; it spends a great deal of effort getting rid of subsumed (redundant) clauses and probably useless clauses. Prolog always resolves some program clause with the goal clause. Because goal clauses do not accumulate, Prolog requires little storage. Prolog never uses factoring and does not even remove repeated literals from a clause.

Given the clauses P ← P and P ← with the goal ← P, resolving the goal against P ← P produces a new goal clause, which happens to be identical to the original one. Prolog never notices the repeated goal clause, so it repeats the same useless resolution over and over again. Depth-first search means that at every choice point, such as between using P ← P and P ←, Prolog will explore every avenue arising from its first choice before considering the second choice. Obviously, the second choice would prove the goal trivially, but Prolog never notices this.

8.6 Example of Prolog execution

Here are axioms about the English succession: how y can become King after x.

∀x ∀y (oldestson(y, x) ∧ king(x) → king(y))
∀x ∀y (defeat(y, x) ∧ king(x) → king(y))
king(richardIII)
defeat(henryVII, richardIII)
oldestson(henryVIII, henryVII)

The goal is to prove king(henryVIII). Now here is the same problem in the form of definite clauses:

{¬oldestson(y, x), ¬king(x), king(y)}
{¬defeat(y, x), ¬king(x), king(y)}
{king(richardIII)}
{defeat(henryVII, richardIII)}
{oldestson(henryVIII, henryVII)}

The goal clause is

{¬king(henryVIII)}.

Figure 3 shows the execution. The subscripts in the clauses are to rename the variables.

Note how crude this formalization is. It says nothing about the passage of time, about births and deaths, about not having two kings at once. The oldest son of Henry VII, Arthur, died aged 15, leaving the future Henry VIII as the oldest surviving son. All formal models must omit some real-world details: reality is overwhelmingly complex.

The Frame Problem in Artificial Intelligence reveals another limitation of logic. Consider writing an axiom system to describe a robot's possible actions. We might include an axiom to state that if the robot lifts an object at time t, then it will be holding the object at time t + 1. But we also need to assert that the positions of everything else remain the same as before. Then we must consider the possibility that the object is a table and has other things on top of it. Separation Logic, a variant of Hoare logic, was invented to solve the frame problem, especially for reasoning about
Prolog has a fixed, deterministic execution strategy. The linked data structures.
program is regarded as a list of clauses, not a set; the clauses Prolog is a powerful and useful programming language,
are tried strictly in order. With a clause, the literals are also but it is seldom logic. Most Prolog programs rely on special
regarded as a list. The literals in the goal clause are proved predicates that affect execution but have no logical mean-
strictly from left to right. The goal clause’s first literal is ing. There is a huge gap between the theory and practice of
replaced by the literals from the unifying program clause, logic programming.
preserving their order. Exercise 27 Convert the following formula into clauses,
Prolog’s search strategy is depth-first. To illustrate what showing your working. Then present two resolution proofs
this means, suppose that the goal clause is simply ← P different from the one shown in Example 33 above.
and that the program clauses are P ← P and P ← . Pro-
log will resolve P ← P with ← P to obtain a new goal ∃x [P → Q(x)] ∧ ∃x [Q(x) → P] → ∃x [P ↔ Q(x)]
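The execution strategy of §8.5 can be sketched as a tiny interpreter. This is our own illustration, not part of the notes: the succession rules have been instantiated by hand to ground atoms, so unification degenerates to syntactic equality; clauses are tried strictly in order and goals are proved left to right, depth-first.

```python
# Toy SLD-resolution interpreter for the succession example of S8.6.
# Clauses are (head, body) pairs over ground atoms, so "unification"
# is just string equality (an assumption made to keep the sketch short).
program = [
    ("king(henryVIII)", ["oldestson(henryVIII,henryVII)", "king(henryVII)"]),
    ("king(henryVII)", ["defeat(henryVII,richardIII)", "king(richardIII)"]),
    ("king(richardIII)", []),
    ("defeat(henryVII,richardIII)", []),
    ("oldestson(henryVIII,henryVII)", []),
]

def solve(goals):
    """Prove the list of goals left to right, trying clauses in order."""
    if not goals:                        # empty goal clause: success
        return True
    first, rest = goals[0], goals[1:]
    for head, body in program:           # depth-first over clause choices
        if head == first and solve(body + rest):
            return True
    return False

print(solve(["king(henryVIII)"]))        # True
```

A clause such as P ← P placed before P ← would send this interpreter into an infinite recursion, exactly the depth-first pathology described above.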
9 Decision Procedures and SMT Solvers

[Figure 3 (fragment): the Prolog execution resolves the goal against {¬defeat(y2, x2), ¬k(x2), k(y2)}, yielding the new goal clause {¬k(henryVII)}.]

Exercise 32 Find a refutation from the following set of clauses using resolution and factoring.

∀x (P ∨ Q(x)) → (P ∨ ∀x Q(x))

∃x y (R(x, y) → ∀z w R(z, w))
A number of decidable subcases of first-order logic were identified in the first half of the 20th century. One of the more interesting decision problems is Presburger arithmetic: the first-order theory of the natural numbers with addition (and subtraction, and multiplication by constants). There is an algorithm to determine whether a given sentence in Presburger arithmetic is valid or not.

Real numbers behave differently from the natural numbers (where m < n ⇐⇒ m + 1 ≤ n) and require their own algorithms. Once again, the only allowed operations are addition, subtraction and constant multiplication. Such a language is called linear arithmetic. The validity of linear arithmetic formulas over the reals is also decidable.

Even unrestricted arithmetic (with multiplication) is decidable for the reals. Unfortunately, the algorithms are too complicated and expensive for widespread use. Even Euclidean geometry can be reduced to problems on the reals, and is therefore decidable. Practical decision procedures exist for simple data structures such as arrays and lists.

9.2 Fourier-Motzkin variable elimination

Fourier-Motzkin variable elimination is a classic decision procedure for real (or rational) linear arithmetic. It dates from 1826 and is very inefficient in general, but relatively easy to understand. In the general case, it deals with conjunctions of linear constraints over the reals or rationals:

    ai1 x1 + · · · + ain xn ≤ bi    (for i = 1, . . . , m)        (6)

It works by eliminating variables in succession. Eventually either a contradiction or a trivial constraint will remain.

The key idea is a technique known as quantifier elimination (QE). We have already seen Skolemization, which removes quantifiers from a formula, but that technique does not preserve the formula's meaning. QE replaces a formula with an equivalent but quantifier-free formula. It is only possible for specific theories, and is generally very expensive.

For the reals, existential quantifiers can be eliminated as follows:

    ∃x (a1 ≤ x ∧ · · · ∧ am ≤ x ∧ x ≤ b1 ∧ · · · ∧ x ≤ bn)  ⇐⇒  ai ≤ bj  for all i and j

A system of constraints has many lower bounds, {ai} (i = 1, . . . , m), and many upper bounds. To eliminate the variable xn from a system of the form (6), solve each constraint involving xn for it: this yields xn ≤ βi/ain when ain > 0 and −xn ≤ −βi/ain when ain < 0. The first case yields an upper bound for xn while the second case yields a lower bound. Now every pair of constraints i and i′ involving opposite signs can be added, writing the lower bound case as −xn ≤ −βi′/ai′n, to obtain

    0 ≤ βi/ain − βi′/ai′n.

Consider the following small set of constraints:

    x ≤ y    x ≤ z    −x + y + 2z ≤ 0    −z ≤ −1

Let's work through the algorithm very informally. The first two constraints give upper bounds for x, while the third constraint gives a lower bound, and can be rewritten as −x ≤ −y − 2z. Adding it to x ≤ y yields 0 ≤ −2z, which can be rewritten as z ≤ 0. Doing the same thing with x ≤ z yields y + z ≤ 0, which can be rewritten as z ≤ −y. This leaves us with a new set of constraints, where we have eliminated the variable x:

    z ≤ 0    z ≤ −y    −z ≤ −1

Now we have two separate upper bounds for z, as well as a lower bound, because we know z ≥ 1. There are again two possible combinations of a lower bound with an upper bound, and we derive 0 ≤ −1 and 0 ≤ −y − 1. Because 0 ≤ −1 is contradictory, Fourier-Motzkin variable elimination has refuted the original set of constraints.

Many other decision procedures exist, frequently for more restrictive problem domains, aiming for greater efficiency and better integration with other reasoning tools. Difference arithmetic is an example: arithmetic constraints are restricted to the form x − y ≤ c, where c is an integer constant. Satisfiability of a set of difference arithmetic constraints can be determined very quickly by constructing a graph and invoking the Bellman-Ford algorithm to look for a cycle representing the contradiction 0 ≤ −1. In the opposite direction, harder decision problems can handle more advanced applications but require much more computer power.

9.3 Other decidable theories

One of the most dramatic examples of quantifier elimination concerns the domain of real polynomial arithmetic:
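The elimination step just described can be coded directly. The following is our own toy illustration (exact rational arithmetic via Fraction; only non-strict constraints, as in the notes; a variable bounded on one side only has its constraints dropped, the observation behind Exercise 34):

```python
from fractions import Fraction
from itertools import product

# A constraint (coeffs, b) stands for coeffs[0]*x0 + ... + coeffs[n-1]*x(n-1) <= b.

def eliminate(constraints, x):
    """Return an equisatisfiable system without variable index x."""
    uppers, lowers, rest = [], [], []
    for coeffs, b in constraints:
        a = coeffs[x]
        if a > 0:                                   # x <= (b - other terms)/a
            uppers.append(([c / a for c in coeffs], b / a))
        elif a < 0:                                 # -x <= -(...): a lower bound
            lowers.append(([c / -a for c in coeffs], b / -a))
        else:
            rest.append((coeffs, b))
    # adding an upper and a lower bound cancels x; one-sided bounds are
    # simply dropped, since x can always be chosen to satisfy them
    for (cu, bu), (cl, bl) in product(uppers, lowers):
        rest.append(([u + l for u, l in zip(cu, cl)], bu + bl))
    return rest

def satisfiable(constraints, nvars):
    for x in range(nvars):
        constraints = eliminate(constraints, x)
    return all(b >= 0 for _, b in constraints)      # each survivor reads 0 <= b

F = Fraction
# the worked example: x <= y, x <= z, -x + y + 2z <= 0, -z <= -1 (x,y,z = 0,1,2)
system = [([F(1), F(-1), F(0)], F(0)),
          ([F(1), F(0), F(-1)], F(0)),
          ([F(-1), F(1), F(2)], F(0)),
          ([F(0), F(0), F(-1)], F(-1))]
print(satisfiable(system, 3))                       # False: 0 <= -1 is derived
```

Eliminating x, then y, then z reproduces the contradiction 0 ≤ −1 obtained by hand above.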
There exist decision procedures for arrays, at least for the trivial theory in which lk is lookup and up is update:

    lk(up(a, i, v), j) = v          if i = j
    lk(up(a, i, v), j) = lk(a, j)   if i ≠ j

The theory of lists with head, tail, cons is also decidable. Combinations of decidable theories remain decidable under certain circumstances, e.g., the theory of arrays with linear arithmetic subscripts. The seminal publication, still cited today, is Nelson and Oppen [1980].

9.4 Satisfiability modulo theories

Many decision procedures operate on existentially quantified conjunctions of inequalities. An arbitrary formula can be solved by translating it into disjunctive normal form (as opposed to the more usual conjunctive normal form) and by eliminating universal quantifiers in favour of negated existential quantifiers. However, these transformations typically cause exponential growth and may need to be repeated as each variable is eliminated.

Satisfiability modulo theories (SMT) is an extension of DPLL to make use of decision procedures, extending their scope while avoiding the problems mentioned above. The idea is that DPLL handles the logical part of the problem while delegating reasoning about particular theories to the relevant decision procedures.

We extend the language of propositional satisfiability to include atomic formulas belonging to our decidable theory (or theories). For the time being, these atomic formulas are not interpreted, so a literal such as a < 0 is wholly unrelated to a > 0. But the decision procedures are invoked during the execution of DPLL; if we have already asserted a < 0, then the attempt to assert a > 0 will be rejected by the decision procedure, causing backtracking. Information can be fed back from the decision procedure to DPLL in the form of a new clause, such as ¬(a < 0) ∨ ¬(a > 0).

[Remark: the Fourier-Motzkin decision procedure eliminates variables, but all decision procedures in actual use can deal with constants such as a as well, and satisfiability-preserving transformations exist between formulas involving constants and those involving quantified variables.]

9.5 SMT example

Let's consider an example. Suppose we start with the following four clauses. Note that a, b, c are constants: variables are not possible with this sort of proof procedure.

    {c = 0, 2a < b}    {b < a}    {3a > 2, a < 0}    {c ≠ 0, ¬(b < a)}

Unit propagation using b < a yields three clauses:

    {c = 0, 2a < b}    {3a > 2, a < 0}    {c ≠ 0}

Unit propagation using c ≠ 0 yields two clauses:

    {2a < b}    {3a > 2, a < 0}

Unit propagation using 2a < b yields just this:

    {3a > 2, a < 0}

Now a case split on the literal 3a > 2 returns a "model":

    b < a ∧ c ≠ 0 ∧ 2a < b ∧ 3a > 2.

But the decision procedure finds these contradictory and returns a new clause:

    {¬(b < a), ¬(2a < b), ¬(3a > 2)}

Finally, we get a true model:

    b < a ∧ c ≠ 0 ∧ 2a < b ∧ a < 0.

Case splitting operates as usual for DPLL. But note that pure literal elimination would make no sense here, as there are connections between literals (consider a < 0 and a > 0 again) that are not visible at the propositional level.

9.6 Final remarks

We have seen here the concepts of over-approximation and counterexample-driven refinement, which are frequently used to extend SAT solving to richer domains than propositional logic. By over-approximation we mean that every model of the original problem assigns truth values to the enriched "propositional letters" (such as a > 0), yielding a model of the propositional clauses obtained by ignoring the underlying meaning of the propositional atoms. As above, any claimed model is then somehow checked against the richer problem domain, and the propositional model is then iteratively refined. But if the propositional clauses are inconsistent, then so is the original problem.

SMT solvers are the focus of great interest at the moment, and have largely superseded SAT solvers (which they incorporate and generalise). One of the most popular SMT solvers is Z3, a product of Microsoft Research but free for non-commercial use. Others include Yices and CVC4. They are applied to a wide range of problems, including hardware and software verification, program analysis, symbolic software execution, and hybrid systems verification.

Exercise 34 In Fourier-Motzkin variable elimination, any variable not bounded both above and below is deleted from the problem. For example, given the set of constraints

    3x ≥ y    x ≥ 0    y ≥ z    z ≤ 1    z ≥ 0

the variables x and then y can be removed (with their constraints), reducing the problem to z ≤ 1 ∧ z ≥ 0. Explain how this happens and why it is correct.

Exercise 35 Apply Fourier-Motzkin variable elimination to the set of constraints

    x ≥ z    y ≥ 2z    z ≥ 0    x + y ≤ z.

Exercise 36 Apply Fourier-Motzkin variable elimination to the set of constraints

    x ≤ 2y    x ≤ y + 3    z ≤ x    0 ≤ z    y ≤ 4x.

Exercise 37 Apply the SMT algorithm sketched above to the following set of clauses:

    {c = 0, c > 0}    {a ≠ b}    {c < 0, a = b}
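The DPLL-plus-theory loop of §9.4 can be caricatured in a few lines. Everything below is our own toy setup, not the notes' example verbatim: atoms are strict bounds on a single rational unknown a, the "SAT solver" is a naive enumeration of propositional models, and each theory conflict is fed back as a blocking clause, in the spirit of ¬(a < 0) ∨ ¬(a > 0) above.

```python
from itertools import product
from fractions import Fraction

# atom 1: a > 2/3 (i.e. 3a > 2); atom 2: a < 0; atom 3: a < 1
ATOMS = {1: ("lo", Fraction(2, 3)),
         2: ("hi", Fraction(0)),
         3: ("hi", Fraction(1))}

def theory_consistent(asserted):
    """Bounds on a are consistent iff every lower bound < every upper bound."""
    lows = [q for kind, q in asserted if kind == "lo"]
    highs = [q for kind, q in asserted if kind == "hi"]
    return not lows or not highs or max(lows) < min(highs)

def prop_model(clauses, atoms):
    """Naive SAT: return some assignment satisfying all clauses, else None."""
    for bits in product([True, False], repeat=len(atoms)):
        model = dict(zip(atoms, bits))
        if all(any(model[l] if l > 0 else not model[-l] for l in clause)
               for clause in clauses):
            return model
    return None

def smt_solve(clauses):
    atoms = sorted({abs(l) for clause in clauses for l in clause})
    learned = []
    while True:
        model = prop_model(clauses + learned, atoms)
        if model is None:
            return None                       # propositionally inconsistent
        if theory_consistent([ATOMS[v] for v in atoms if model[v]]):
            return model                      # a true model
        # block the spurious model: some asserted atom must flip
        learned.append([-v for v in atoms if model[v]])

print(smt_solve([[1], [2, 3]]))   # a model with 2/3 < a < 1
print(smt_solve([[1], [2]]))      # None: a > 2/3 and a < 0 clash
```

Real SMT solvers interleave theory checks with unit propagation instead of waiting for a complete propositional model, but the over-approximate-then-refine loop is the same.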
10 Binary Decision Diagrams

A binary decision tree represents the truth table of a propositional formula by binary decisions, namely if-then-else expressions over the propositional letters. (In the relevant literature, propositional letters are called variables.) Unfortunately, a decision tree may contain much redundancy. A binary decision diagram is a directed acyclic graph, sharing identical subtrees. An ordered binary decision diagram is based upon giving an ordering < to the variables: they must be tested in order. Further refinements ensure that each propositional formula is mapped to a unique diagram, for a given ordering. We get a compact and canonical representation of the truth table of any formula.

The acronym BDD for binary decision diagram is well-established in the literature. However, many earlier papers use OBDD or even ROBDD (for "reduced ordered binary decision diagram") synonymously.

A BDD must satisfy the following conditions:

• ordering: if P is tested before Q, then P < Q (thus in particular, P cannot be tested more than once on a single path)

• uniqueness: identical subgraphs are stored only once (to do this efficiently, hash each node by its variable and pointer fields)

• irredundancy: no test leads to identical subgraphs in the 1 and 0 cases (thanks to uniqueness, redundant tests can be detected by comparing pointers)

For a given variable ordering, the BDD representation of each formula is unique: BDDs are a canonical form. Canonical forms usually lead to good algorithms — for a start, you can test whether two things are equivalent by comparing their canonical forms.

The BDD of a tautology is 1. Similarly, that of any inconsistent formula is 0. To check whether two formulas are logically equivalent, convert both to BDDs and then — thanks to uniqueness — simply compare the pointers.

A recursive algorithm converts a formula to a BDD. All the logical connectives can be handled directly, including → and ↔. (Exclusive-or is also used, especially in hardware examples.) The expensive transformation of A ↔ B into (A → B) ∧ (B → A) is unnecessary.

Here is how to convert a conjunction A ∧ A′ to a BDD. In this algorithm, X P Y is a decision node that tests the variable P, with a true-link to X and a false-link to Y. In other words, X P Y is the BDD equivalent of the decision "if P then X else Y".

1. Recursively convert A and A′ to BDDs Z and Z′.

2. Check for trivial cases. If Z = Z′ (pointer comparison) then the result is Z; if either operand is 0, then the result is 0; if either operand is 1, then the result is the other operand.

3. In the general case, let Z = X P Y and Z′ = X′ P′ Y′. There are three possibilities:

(a) If P = P′ then build the BDD (X ∧ X′) P (Y ∧ Y′) recursively. This means convert X ∧ X′ and Y ∧ Y′ to BDDs U and U′, then construct a new decision node from P to them. Do the usual simplifications. If U = U′ then the resulting BDD for the conjunction is U.

(b) If P < P′ then build the BDD (X ∧ Z′) P (Y ∧ Z′). When building BDDs on paper, it is easier to pretend that the second decision node also starts with P: assume that it has the redundant decision Z′ P Z′ and proceed as in (a).

(c) If P > P′, the approach is analogous to (b).

Other connectives, even ⊕, are treated similarly, differing only in the base cases. The negation of the BDD X P Y is (¬X) P (¬Y). In essence we copy the BDD, and when we reach the leaves, exchange 1 and 0. The BDD of Z → f is the same as the BDD of ¬Z.

During this processing, the same input (consisting of a connective and two BDDs) may be transformed into a BDD repeatedly. Efficient implementations therefore have an additional hash table, which associates inputs to the corresponding BDDs. The result of every transformation is stored in the hash table so that it does not have to be computed again.

Example 34 We apply the BDD Canonicalisation Algorithm to P ∨ Q → Q ∨ R. First, we make tiny BDDs for P and Q. Then, we combine them using ∨ to make a small BDD for P ∨ Q:

[Diagram: the one-node BDDs for P and Q, and the BDD for P ∨ Q, whose test on P has a true-link to 1 and a false-link to the node testing Q.]

The BDD for Q ∨ R has a similar construction, so we omit it. We combine the two small BDDs using →, then simplify (removing a redundant test on Q) to obtain the final BDD.

[Diagram: combining the BDDs for P ∨ Q and Q ∨ R under →; in the final BDD, P's false-link goes to 1 and its true-link goes to a test on Q, whose false-link tests R.]

The new construction is shown in grey. In both of these examples, it appears over the rightmost formula because its variables come later in the ordering.

The final diagram indicates that the original formula is always true except if P is true while Q and R are false. When you have such a simple BDD, you can easily check that it is correct. For example, this BDD suggests the formula evaluates to 1 when P is false, and indeed we find that the formula simplifies to Q → Q ∨ R, which simplifies further to 1.

Huth and Ryan [2004] present a readable introduction to BDDs. A classic but more formidable source of information is Bryant [1992].
Exercise 38 Compute the BDD for each of the following formulas, taking the variables as alphabetically ordered:

    P ∧ Q → Q ∧ P        P ∨ Q → P ∧ Q
    ¬(P ∨ Q) ∨ P         ¬(P ∧ Q) ↔ (P ∨ R)

Exercise 39 Verify these equivalences using BDDs:

    (P ∧ Q) ∧ R ≃ P ∧ (Q ∧ R)
    (P ∨ Q) ∨ R ≃ P ∨ (Q ∨ R)
    P ∨ (Q ∧ R) ≃ (P ∨ Q) ∧ (P ∨ R)
    P ∧ (Q ∨ R) ≃ (P ∧ Q) ∨ (P ∧ R)

Exercise 40 Verify these equivalences using BDDs:

    ¬(P ∧ Q) ≃ ¬P ∨ ¬Q
    (P ↔ Q) ↔ R ≃ P ↔ (Q ↔ R)
    (P ∨ Q) → R ≃ (P → R) ∧ (Q → R)

11 Modal Logics

Modal logic allows us to reason about statements being "necessary" or "possible". Some variants are effectively about time (temporal logic), where a statement might hold "henceforth" or "eventually".

There are many forms of modal logic. Each one is based upon two parameters:

• W is the set of possible worlds (machine states, future times, . . . )

• R is the accessibility relation between worlds (state transitions, flow of time, . . . )

The pair (W, R) is called a modal frame.

The two modal operators, or modalities, are 2 and 3:

• 2A means A is necessarily true

• 3A means A is possibly true

Here "necessarily true" means "true in all worlds accessible from the present one". The modalities are related by the law ¬3A ≃ 2¬A; in words, "it is not possible that A is true" is equivalent to "A is necessarily false".

Complex modalities are made up of strings of the modal operators, such as 22A, 23A, 32A, etc. Typically many of these are equivalent to others; in S4, an important modal logic, 22A is equivalent to 2A.

11.1 Semantics of propositional modal logic

Here are some basic definitions, with respect to a particular frame (W, R):

An interpretation I maps the propositional letters to subsets of W. For each letter P, the set I(P) consists of those worlds in which P is regarded as true.

If w ∈ W and A is a modal formula, then w ⊩ A means A is true in world w. This relation is defined as follows:

    w ⊩ P      ⇐⇒  w ∈ I(P)
    w ⊩ 2A     ⇐⇒  v ⊩ A for all v such that R(w, v)
    w ⊩ 3A     ⇐⇒  v ⊩ A for some v such that R(w, v)
    w ⊩ A ∨ B  ⇐⇒  w ⊩ A or w ⊩ B
    w ⊩ A ∧ B  ⇐⇒  w ⊩ A and w ⊩ B
    w ⊩ ¬A     ⇐⇒  w ⊩ A does not hold

This definition of truth is more complex than we have seen previously (§2.2), because of the extra parameters W and R. We shall not consider quantifiers at all; they really complicate matters, especially if the universe is allowed to vary from one world to the next.

For a particular frame (W, R), further relations can be defined in terms of w ⊩ A:

    ⊨W,R,I A  means  w ⊩ A for all w under interpretation I
    ⊨W,R A    means  w ⊩ A for all w and all I

Now ⊨ A means ⊨W,R A for all frames. We say that A is universally valid. In particular, all tautologies of propositional logic are universally valid.

Typically we make further assumptions on the accessibility relation. We may assume, for example, that R is transitive, and consider whether a formula holds under all such frames. More formulas become universally valid if we restrict the accessibility relation, as they exclude some modal frames from consideration. The purpose of such assumptions is to better model the task at hand. For instance, to model the passage of time, we might want R to be reflexive and transitive; we could even make it a linear ordering, though branching-time temporal logic is popular.

11.2 Hilbert-style modal proof systems

Start with any proof system for propositional logic. Then add the distribution axiom

    2(A → B) → (2A → 2B)

and the necessitation rule: from A infer 2A.

There are no axioms or inference rules for 3. The modality is viewed simply as an abbreviation:

    3A  =def  ¬2¬A

The distribution axiom clearly holds in our semantics. The propositional connectives obey their usual truth tables in each world. If A holds in all worlds, and A → B holds in all worlds, then B holds in all worlds. Thus if 2A and 2(A → B) hold then so does 2B, and that is the essence of the distribution axiom.

The necessitation rule states that all theorems are necessarily true. In more detail, if A can be proved, then it holds in all worlds; therefore 2A is also true.

The modal logic that results from adding the distribution axiom and necessitation rule is called K. It is a pure modal logic, from which others are obtained by adding further axioms. Each axiom corresponds to a property that is assumed to hold of all accessibility relations. Here are just a few of the main ones:

    T    2A → A        (reflexive)
    4    2A → 22A      (transitive)
    B    A → 23A       (symmetric)
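The semantics just given can be executed directly. The following sketch is our own encoding (formulas as nested tuples, a finite frame given as a dictionary); it evaluates w ⊩ A:

```python
# Kripke semantics of S11.1: a frame is a set of worlds with accessibility R,
# an interpretation I maps each letter to the set of worlds where it holds,
# and box/diamond quantify over the accessible worlds.
def holds(w, A, R, I):
    op = A[0]
    if op == "atom": return w in I[A[1]]
    if op == "not":  return not holds(w, A[1], R, I)
    if op == "and":  return holds(w, A[1], R, I) and holds(w, A[2], R, I)
    if op == "or":   return holds(w, A[1], R, I) or holds(w, A[2], R, I)
    if op == "box":  return all(holds(v, A[1], R, I) for v in R.get(w, ()))
    if op == "dia":  return any(holds(v, A[1], R, I) for v in R.get(w, ()))
    raise ValueError("unknown connective: " + op)

def box(A): return ("box", A)
def dia(A): return ("dia", A)

# From world 0, time splits: A holds forever in world 1, B forever in world 2.
R = {0: [1, 2], 1: [1], 2: [2]}
I = {"A": {1}, "B": {2}}
A, B = ("atom", "A"), ("atom", "B")

print(holds(0, dia(box(A)), R, I),              # True:  32A
      holds(0, dia(box(B)), R, I),              # True:  32B
      holds(0, dia(box(("and", A, B))), R, I))  # False: 32(A ∧ B)
```

On this frame 32A and 32B hold at world 0 while 32(A ∧ B) fails, which is precisely the counterexample of Figure 4.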
The "time" described by S4 allows multiple futures, which can be confusing. For example, 32A intuitively means "eventually A will be true forever". You might expect 32A and 32B to imply 32(A ∧ B), since eventually A and B should both have become true. However, this property fails because time can split, with A becoming true in one branch and B in the other (Fig. 4).

[Figure 4: Counterexample to 32A ∧ 32B → 32(A ∧ B) — time splits into one future where A becomes true forever and another where B does.]

Note in particular that 232A is stronger than 32A, and means "in all futures, eventually A will be true forever".

The sequent calculus for S4 extends the usual sequent rules for propositional logic with additional ones for 2 and 3. Four rules are required because the modalities may occur on either the left or right side of a sequent. Here is an example proof:

    A ⇒ A    B ⇒ B
    ---------------- (→l)
    A → B, A ⇒ B
    ---------------- (2l)
    A → B, 2A ⇒ B
    ------------------ (2l)
    2(A → B), 2A ⇒ B
    ------------------- (2r)
    2(A → B), 2A ⇒ 2B

Intuitively, why is this sequent true? We assume 2(A → B): from now on, if A holds then so does B. We assume 2A: from now on, A holds. Obviously we can conclude that B will hold from now on, which we write formally as 2B.

The order in which you apply rules is important. Working backwards, you must first apply rule (2r). This rule discards non-2 formulas, but there aren't any. If you first apply (2l), removing the boxes from the left side, then you will get stuck:

    now what?
    ⇒ B
    ------------------- (2r)
    A → B, A ⇒ 2B
    ------------------- (2l)
    A → B, 2A ⇒ 2B
    ------------------- (2l)
    2(A → B), 2A ⇒ 2B
Applying (2r) before (2l) is analogous to applying (∀r) before (∀l). The analogy holds because 2A has an implicit universal quantifier: for all accessible worlds.

The following two proofs establish the modal equivalence 2323A ≃ 23A. Strings of modalities, like 2323 and 23, are called operator strings. So the pair of results establish an operator string equivalence. The validity of this particular equivalence is not hard to see. Recall that 23A means that A holds infinitely often. So 2323A means that 23A holds infinitely often — but that can only mean that A holds infinitely often, which is the meaning of 23A.

Now, let us prove the equivalence. Here is the first half of the proof. As usual we apply (2r) before (2l). Dually, and analogously to the treatment of the ∃ rules, we apply (3l) before (3r):

    3A ⇒ 3A
    -------------- (2l)
    23A ⇒ 3A
    -------------- (3l)
    323A ⇒ 3A
    --------------- (2l)
    2323A ⇒ 3A
    ---------------- (2r)
    2323A ⇒ 23A

The opposite entailment is easy to prove:

    23A ⇒ 23A
    ----------------- (3r)
    23A ⇒ 323A
    ------------------ (2r)
    23A ⇒ 2323A

Logic S4 enjoys many operator string equivalences, including 22A ≃ 2A. And for every operator string equivalence, its dual (obtained by exchanging 2 with 3) also holds. In particular, 33A ≃ 3A and 3232A ≃ 32A hold. So we only need to consider operator strings in which the boxes and diamonds alternate, and whose length does not exceed three. The distinct S4 operator strings are therefore 2, 3, 23, 32, 232 and 323.

Finally, here are two attempted proofs that fail — because their conclusions are not theorems! The modal sequent A ⇒ 23A states that if A holds now then it necessarily holds again: from each accessible world, another world is accessible in which A holds. This formula is valid if the accessibility relation is symmetric; then one could simply return to the original world. The formula is therefore a theorem of S5 modal logic, but not S4.

    ⇒ A
    ----------- (3r)
    ⇒ 3A
    ----------- (2r)
    A ⇒ 23A

Here, the modal sequent 3A, 3B ⇒ 3(A ∧ B) states that if A holds in some accessible world, and B holds in some accessible world, then both A and B hold in some accessible world. It is a fallacy because those two worlds need not coincide. The (3l) rule prevents us from removing the diamonds from both 3A and 3B; if we choose one we must discard the other:

    B ⇒ A ∧ B
    ---------------------- (3r)
    B ⇒ 3(A ∧ B)
    ---------------------- (3l)
    3A, 3B ⇒ 3(A ∧ B)

The topmost sequent may give us a hint as to why the conclusion fails. Here we are in a world in which B holds, and we are trying to show A ∧ B, but there is no reason why A should hold in that world.

The sequent 32A, 32B ⇒ 32(A ∧ B) is not valid because A and B can become true in different futures. However, the sequents 32A, 232B ⇒ 32(A ∧ B) and 232A, 232B ⇒ 232(A ∧ B) are both valid.

Exercise 41 Why does the dual of an operator string equivalence also hold?

Exercise 42 Prove the sequents 3(A ∨ B) ⇒ 3A, 3B and 3A ∨ 3B ⇒ 3(A ∨ B), thus proving the equivalence 3(A ∨ B) ≃ 3A ∨ 3B.

Exercise 43 Prove the sequent 3(A → B), 2A ⇒ 3B.

Exercise 44 Prove the equivalence 2(A ∧ B) ≃ 2A ∧ 2B.

Exercise 45 Prove 232A, 232B ⇒ 232(A ∧ B).

12 Tableaux-Based Methods

There is a lot of redundancy among the connectives ¬, ∧, ∨, →, ↔, ∀, ∃. We could get away with using only three of them (two if we allowed exclusive-or), but we use the full set for readability. There is also a lot of redundancy in the sequent calculus, because it was designed to model human reasoning, not to be as small as possible.

One approach to removing redundancy results in the resolution method. Clause notation replaces the connectives, and there is only one inference rule. A less radical approach still removes much of the redundancy, while preserving much of the natural structure of formulas. The resulting formalism, known as the tableau calculus, is often adopted by proof theorists because of its logical simplicity. Adding unification produces yet another formalism known as free-variable tableaux; this form is particularly amenable to implementation. Both formalisms use proof by contradiction.

12.1 Simplifying the sequent calculus

The usual formalisation of first-order logic involves seven connectives, or nine in the case of modal logic. For each connective the sequent calculus has a left and a right rule. So, apart from the structural rules (basic sequent and cut), there are 14 rules, or 18 for modal logic.

Suppose we allow only formulas in negation normal form. This immediately disposes of the connectives → and ↔. Really ¬ is discarded also, as it is allowed only on propositional letters. So only four connectives remain, six for modal logic.

The greatest simplicity gain comes in the sequent rules. The only sequent rules that move formulas from one side to the other (across the ⇒ symbol) are the rules for the connectives that we have just discarded. Half of the sequent rules can be discarded too. It makes little difference whether we discard the left-side rules or the right-side rules.

Let us discard the right-side rules. The resulting system allows sequents of the form A ⇒ . It is a form of refutation system (proof by contradiction), since the formula A has
the same meaning as the sequent ¬A ⇒ . Moreover, a basic sequent has the form of a contradiction. We have created a new formal system, known as the tableau calculus.

    ¬A, A, Γ ⇒  (basic)

    ¬A, Γ ⇒    A, Γ ⇒
    ------------------ (cut)
    Γ ⇒

    A, B, Γ ⇒
    ----------- (∧l)
    A ∧ B, Γ ⇒

    A, Γ ⇒    B, Γ ⇒
    ------------------ (∨l)
    A ∨ B, Γ ⇒

    A[t/x], Γ ⇒
    ------------ (∀l)
    ∀x A, Γ ⇒

    A, Γ ⇒
    ---------- (∃l)
    ∃x A, Γ ⇒

Rule (∃l) has the usual proviso: it holds provided x is not free in the conclusion!

We can extend the system to S4 modal logic by adding just two further rules, one for 2 and one for 3:

    A, Γ ⇒
    -------- (2l)
    2A, Γ ⇒

    A, Γ∗ ⇒
    -------- (3l)
    3A, Γ ⇒

As previously, Γ∗ is defined to erase all non-2 formulas:

    Γ∗  =def  {2B | 2B ∈ Γ}

Then allow unification to instantiate variables with terms. This should occur when trying to solve any goal containing two formulas, ¬A and B. Try to unify A with B, producing a basic sequent. Instantiating a variable updates the entire proof tree.

Up until now, we have treated rule (∃l) in backward proofs as creating a fresh variable. That will no longer do: we now allow variables to become instantiated by terms. To eliminate this problem, we do not include (∃l) in the free-variable tableau calculus; instead we Skolemize the formula. All existential quantifiers disappear, so we can discard rule (∃l). This version of the tableau method is known as the free-variable tableau calculus.

Warning: if you wish to use unification, you absolutely must also use Skolemization. If you use unification without Skolemization, then you are trying to use two formalisms at the same time and your proofs will be nonsense! This is because unification is likely to introduce variable occurrences in places where they are forbidden by the side condition of the existential rule.

The Skolemised version of ∀y ∃z Q(y, z) ∧ ∃x P(x) is ∀y Q(y, f(y)) ∧ P(a). The subformula ∃x P(x) goes to P(a) and not to P(g(y)) because it is outside the scope of the ∀y.

The formula ∀x P(x) ∧ ¬P(a) is obviously inconsistent. Here is its refutation in the free-variable tableau calculus:

    P(y), ¬P(a) ⇒         (basic, instantiating y ↦ a)
    ------------------ (∀l)
    ∀x P(x), ¬P(a) ⇒
    ------------------- (∧l)
    ∀x P(x) ∧ ¬P(a) ⇒

A failed proof is always illuminating. Let us try to prove the invalid formula

    ∀x [P(x) ∨ Q(x)] → [∀x P(x) ∨ ∀x Q(x)].

Negation and conversion to NNF gives ∃x ¬P(x) ∧ ∃x ¬Q(x) ∧ ∀x [P(x) ∨ Q(x)]. Skolemization gives ¬P(a) ∧ ¬Q(b) ∧ ∀x [P(x) ∨ Q(x)]. The proof fails because a and b are distinct constants. It is impossible to instantiate y to both simultaneously. The following proof omits the initial (∧l) steps.

    y ↦ a                        y ↦ b???
    ¬P(a), ¬Q(b), P(y) ⇒         ¬P(a), ¬Q(b), Q(y) ⇒
    --------------------------------------------------- (∨l)
    ¬P(a), ¬Q(b), P(y) ∨ Q(y) ⇒
    --------------------------------------------------- (∀l)
    ¬P(a), ¬Q(b), ∀x [P(x) ∨ Q(x)] ⇒

12.4 Tableaux-based theorem provers

A tableau represents a partial proof as a set of branches of formulas. Each formula on a branch is expanded until this is no longer possible (and the proof fails) or until the proof succeeds.

Expanding a conjunction A ∧ B on a branch replaces it by the two conjuncts, A and B. Expanding a disjunction A ∨ B splits the branch in two, with one branch containing A and the other branch B. Expanding the quantification ∀x A extends the branch by a formula of the form A[t/x]. If a branch contains both A and ¬A then it is said to be closed. When all branches are closed, the proof has succeeded.

A tableau can be viewed as a compact, graph-based representation of a set of sequents. The branch operations described above correspond to our sequent rules in an obvious way.

Quite a few theorem provers have been based upon free-variable tableaux. The simplest is due to Beckert and Posegga [1994] and is called leanTAP. The entire program appears below! Its deductive system is similar to the reduced sequent calculus we have just studied. It relies on some Prolog tricks, and is certainly not pure Prolog code. It demonstrates just how simple a theorem prover can be. leanTAP does not outperform big resolution systems. But it quickly proves some fairly hard theorems.

    prove((A,B),UnExp,Lits,FreeV,VarLim) :- !,
        prove(A,[B|UnExp],Lits,FreeV,VarLim).
    prove((A;B),UnExp,Lits,FreeV,VarLim) :- !,
        prove(A,UnExp,Lits,FreeV,VarLim),
        prove(B,UnExp,Lits,FreeV,VarLim).
    prove(all(X,Fml),UnExp,Lits,FreeV,VarLim) :- !,
        \+ length(FreeV,VarLim),
        copy_term((X,Fml,FreeV),(X1,Fml1,FreeV)),
        append(UnExp,[all(X,Fml)],UnExp1),
        prove(Fml1,UnExp1,Lits,[X1|FreeV],VarLim).
    prove(Lit,_,[L|Lits],_,_) :-
        (Lit = -Neg; -Lit = Neg) ->
        (unify(Neg,L); prove(Lit,[],Lits,_,_)).
    prove(Lit,[Next|UnExp],Lits,FreeV,VarLim) :-
        prove(Next,UnExp,[Lit|Lits],FreeV,VarLim).

The first clause handles conjunctions, the second disjunctions, the third universal quantification. The fourth clause handles literals, including negation. The fifth clause brings in the next formula to be analyzed. You are not expected to memorize this program or to understand how it works in detail.

Exercise 46 Use the free variable tableau calculus to prove these formulas:

    (∃y ∀x R(x, y)) → (∀x ∃y R(x, y))
    (P(a, b) ∨ ∃z P(z, z)) → ∃x y P(x, y)
    (∃x P(x) → Q) → ∀x (P(x) → Q)

References

B. Beckert and J. Posegga. leanTAP: Lean, tableau-based theorem proving. In A. Bundy, editor, Automated Deduction — CADE-12 International Conference, LNAI 814, pages 793–797. Springer, 1994.

R. E. Bryant. Symbolic boolean manipulation with ordered binary-decision diagrams. Computing Surveys, 24(3):293–318, Sept. 1992.

M. Huth and M. Ryan. Logic in Computer Science: Modelling and Reasoning about Systems. Cambridge University Press, 2nd edition, 2004.

G. Nelson and D. C. Oppen. Fast decision procedures based on congruence closure. J. ACM, 27(2):356–364, 1980. doi: 10.1145/322186.322198.