Logic and Proof
Lawrence C Paulson
Computer Laboratory
University of Cambridge
Copyright © 2014 by Lawrence C. Paulson
Contents

1 Introduction and Learning Guide
2 Propositional Logic
4 First-order Logic
11 Modal Logics
12 Tableaux-Based Methods
1 Introduction and Learning Guide

This course gives a brief introduction to logic, including the resolution method of theorem-proving and its relation to the programming language Prolog. Formal logic is used for specifying and verifying computer systems and (sometimes) for representing knowledge in Artificial Intelligence programs.

The course should help you to understand the Prolog language, and its treatment of logic should be helpful for understanding other theoretical courses. It also describes a variety of techniques and data structures used in automated theorem provers. Understanding the various deductive methods is a crucial part of the course, but in addition to this purely mechanical view of logic, you should try to acquire an intuitive feel for logical reasoning.

The most suitable course text is this book:

    Michael Huth and Mark Ryan, Logic in Computer Science: Modelling and Reasoning about Systems, 2nd edition (CUP, 2004)

It covers most aspects of this course with the exception of resolution theorem proving. It includes material (symbolic model checking) that could be useful later.

The following book may be a useful supplement to Huth and Ryan. It covers resolution, as well as much else relevant to Logic and Proof.

    Mordechai Ben-Ari, Mathematical Logic for Computer Science, 2nd edition (Springer, 2001)

The following book provides a different perspective on modal logic, and it develops propositional logic carefully. Finally available in paperback.

    Sally Popkorn, First Steps in Modal Logic (CUP, 2008)

There are numerous exercises in these notes, and they are suitable for supervision purposes. Most old examination questions for Foundations of Logic Programming (the former name of this course) are still relevant. As of 2013/14, Herbrand's theorem has been somewhat deprecated: still mentioned, but no longer in detail. Some unification theory has also been removed. These changes created space for a new lecture on Decision Procedures and SMT Solvers.

• 2014 Paper 5 Q5: proof methods for propositional logic: BDDs and DPLL
• 2014 Paper 6 Q6: decision procedure; resolution with selection
• 2013 Paper 5 Q5: DPLL, sequent or tableau calculus
• 2013 Paper 6 Q6: resolution problems
• 2012 Paper 5 Q5: (Herbrand: deprecated)
• 2012 Paper 6 Q6: sequent or tableau calculus, modal logic, BDDs
• 2011 Paper 5 Q5: resolution, linear resolution, BDDs
• 2011 Paper 6 Q6: unification, modal logic
• 2010 Paper 5 Q5: BDDs and models
• 2010 Paper 6 Q6: sequent or tableau calculus, DPLL. Note: the formula in the first part of this question should be (∃x P(x) → Q) → ∀x (P(x) → Q)
• 2008 Paper 3 Q6: BDDs, DPLL, sequent calculus
• 2008 Paper 4 Q5: proving or disproving first-order formulas, resolution
• 2009 Paper 6 Q7: modal logic (Lect. 11)
• 2009 Paper 6 Q8: resolution, tableau calculi
• 2007 Paper 5 Q9: propositional methods, resolution, modal logic
• 2007 Paper 6 Q9: proving or disproving first-order formulas
• 2006 Paper 5 Q9: proof and disproof in FOL and modal logic
• 2006 Paper 6 Q9: BDDs, Herbrand models, resolution (Lect. 6–8)
• 2005 Paper 5 Q9: resolution (Lect. 6, 8, 9)
• 2005 Paper 6 Q9: DPLL, BDDs, tableaux (Lect. 6, 10, 12)
• 2004 Paper 5 Q9: semantics and proof in FOL (Lect. 4, 5)
• 2004 Paper 6 Q9: ten true or false questions
• 2003 Paper 5 Q9: BDDs; clause-based proof methods (Lect. 6, 10)
• 2003 Paper 6 Q9: sequent calculus (Lect. 5)
• 2002 Paper 5 Q11: semantics of propositional and first-order logic (Lect. 2, 4)
• 2002 Paper 6 Q11: resolution; proof systems
• 2001 Paper 5 Q11: satisfaction relation; logical equivalences
• 2001 Paper 6 Q11: clause-based proof methods; ordered ternary decision diagrams (Lect. 6, 10)
• 2000 Paper 5 Q11: tautology checking; propositional sequent calculus (Lect. 2, 3, 10)
• 2000 Paper 6 Q11: unification and resolution (Lect. 8, 9)
• 1999 Paper 5 Q10: Prolog resolution versus general resolution
• 1999 Paper 6 Q10: Herbrand models and clause form
• 1998 Paper 5 Q10: BDDs, sequent calculus, etc. (Lect. 3, 10)
• 1998 Paper 6 Q10: modal logic (Lect. 11); resolution
• 1997 Paper 5 Q10: first-order logic (Lect. 4)
• 1997 Paper 6 Q10: sequent rules for quantifiers (Lect. 5)
• 1996 Paper 5 Q10: sequent calculus (Lect. 3, 5, 10)
• 1996 Paper 6 Q10: DPLL versus Resolution (Lect. 9)
• 1995 Paper 5 Q9: BDDs (Lect. 10)
• 1995 Paper 6 Q9: outline logics; sequent calculus (Lect. 3, 5, 11)
• 1994 Paper 5 Q9: resolution versus Prolog (Lect. 8)
• 1994 Paper 6 Q9: Herbrand models (Lect. 7)
• 1994 Paper 6 Q9: most general unifiers and resolution (Lect. 9)
• 1993 Paper 3 Q3: resolution and Prolog (Lect. 8)

Acknowledgements. Chloë Brown, Jonathan Davies and Reuben Thomas pointed out numerous errors in these notes. David Richerby and Ross Younger made detailed suggestions. Thanks also to Darren Foong, Thomas Forster, Simon Frankau, Adam Hall, Ximin Luo, Steve Payne, Tom Puverle, Max Spencer, Ben Thorner, Tjark Weber and John Wickerson.
2 PROPOSITIONAL LOGIC 2
• The formulas P and P ∧ (P → Q) are satisfiable: they are both true under the interpretation that maps P and Q to 1. But they are not valid: they are both false under the interpretation that maps P and Q to 0.

• If A is a valid formula then ¬A is unsatisfiable.

• This set of formulas is unsatisfiable: {P, Q, ¬P ∨ ¬Q}.

Show we can make H₂CO₃ given supplies of HCl, NaOH, O₂, and C.[2] Chang and Lee formalize the supplies of chemicals as four axioms and prove that H₂CO₃ logically follows. The idea is to formalize each compound as a propositional symbol and express the reactions as implications:

    HCl ∧ NaOH → NaCl ∧ H₂O
    C ∧ O₂ → CO₂
    CO₂ ∧ H₂O → H₂CO₃

Note that this involves an ideal model of chemistry. What if the reactions can be inhibited by the presence of other chemicals? Proofs about the real world always depend upon general assumptions. It is essential to bear these in mind when relying on such a proof.

2.4 Equivalences

Note that A ↔ B and A ≃ B are different kinds of assertions. The formula A ↔ B refers to some fixed interpretation, while the metalanguage statement A ≃ B refers to all interpretations. On the other hand, ⊨ A ↔ B means the same thing as A ≃ B. Both are metalanguage statements, and A ≃ B is equivalent to saying that the formula A ↔ B is a tautology.

Similarly, A → B and A ⊨ B are different kinds of assertions, while ⊨ A → B and A ⊨ B mean the same thing. The formula A → B is a tautology if and only if A ⊨ B.

Here is a listing of some of the more basic equivalences of propositional logic. They provide one means of reasoning about propositions, namely by transforming one proposition into an equivalent one. They are also needed to convert propositions into various normal forms.

idempotency laws

    A ∧ A ≃ A
    A ∨ A ≃ A

de Morgan laws

    ¬(A ∧ B) ≃ ¬A ∨ ¬B
    ¬(A ∨ B) ≃ ¬A ∧ ¬B

other negation laws

    ¬(A → B) ≃ A ∧ ¬B
    ¬(A ↔ B) ≃ (¬A) ↔ B ≃ A ↔ (¬B)

laws for eliminating certain connectives

    A ↔ B ≃ (A → B) ∧ (B → A)
    ¬A ≃ A → f
    A → B ≃ ¬A ∨ B

simplification laws

    A ∧ f ≃ f
    A ∧ t ≃ A
    A ∨ f ≃ A
    A ∨ t ≃ t
    ¬¬A ≃ A
    A ∨ ¬A ≃ t
    A ∧ ¬A ≃ f

Propositional logic enjoys a principle of duality: for every equivalence A ≃ B there is another equivalence A′ ≃ B′, derived by exchanging ∧ with ∨ and t with f. Before applying this rule, remove all occurrences of → and ↔, since they implicitly involve ∧ and ∨.

[2] Chang and Lee, page 21, as amended by Ross Younger, who knew more about Chemistry!
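Each of these equivalences can be checked mechanically by enumerating every interpretation of the propositional letters involved. Here is one possible sketch in Python; the encoding of formulas as nested tuples is an illustrative choice, not anything fixed by these notes.

```python
from itertools import product

# A formula is a nested tuple: ('var', name), ('not', A), ('and', A, B),
# ('or', A, B), ('imp', A, B) or ('iff', A, B).
# An interpretation maps propositional letters to 1 or 0.
def holds(f, interp):
    tag = f[0]
    if tag == 'var': return interp[f[1]]
    if tag == 'not': return 1 - holds(f[1], interp)
    if tag == 'and': return min(holds(f[1], interp), holds(f[2], interp))
    if tag == 'or':  return max(holds(f[1], interp), holds(f[2], interp))
    if tag == 'imp': return max(1 - holds(f[1], interp), holds(f[2], interp))
    if tag == 'iff': return 1 if holds(f[1], interp) == holds(f[2], interp) else 0
    raise ValueError(tag)

def equivalent(f, g, letters):
    """f and g are equivalent iff they agree under every interpretation."""
    return all(
        holds(f, dict(zip(letters, vals))) == holds(g, dict(zip(letters, vals)))
        for vals in product((0, 1), repeat=len(letters)))

A, B = ('var', 'A'), ('var', 'B')
# de Morgan law: not(A and B) is equivalent to (not A) or (not B)
assert equivalent(('not', ('and', A, B)), ('or', ('not', A), ('not', B)), 'AB')
# negation law: not(A imp B) is equivalent to A and (not B)
assert equivalent(('not', ('imp', A, B)), ('and', A, ('not', B)), 'AB')
```

Enumeration takes 2ⁿ interpretations for n letters, which is fine for the small formulas in these notes.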
2.5 Normal forms

The language of propositional logic has much redundancy: many of the connectives can be defined in terms of others. By repeatedly applying certain equivalences, we can transform a formula into a normal form. A typical normal form eliminates certain connectives and uses others in a restricted manner. The restricted structure makes the formula easy to process, although the normal form may be much larger than the original formula, and unreadable.

Definition 2 (Normal Forms)

• A formula is in Conjunctive Normal Form (CNF) if it has the form A₁ ∧ · · · ∧ Aₘ, where each Aᵢ is a disjunction of one or more literals.

• A formula is in Disjunctive Normal Form (DNF) if it has the form A₁ ∨ · · · ∨ Aₘ, where each Aᵢ is a conjunction of one or more literals.

An atomic formula like P is in all the normal forms NNF, CNF, and DNF. The formula

    (P ∨ Q) ∧ (¬P ∨ S) ∧ (R ∨ P)

is in CNF. Unlike in some hardware applications, the disjuncts in a CNF formula do not have to mention all the variables. On the contrary, they should be as simple as possible. Simplifying the formula

    (P ∨ Q) ∧ (¬P ∨ Q) ∧ (R ∨ S)

to Q ∧ (R ∨ S) counts as an improvement, because it will make our proof procedures run faster. For examples of DNF formulas, exchange ∧ and ∨ in the examples above. As with CNF, there is no need to mention all combinations of variables.

NNF can reveal the underlying nature of a formula. For example, converting ¬(A → B) to NNF yields A ∧ ¬B. This reveals that the original formula was effectively a conjunction. Every formula in CNF or DNF is also in NNF, but the NNF formula ((¬P ∧ Q) ∨ R) ∧ P is in neither CNF nor DNF.

Step 1. Eliminate ↔ and → by repeatedly replacing by the laws for eliminating these connectives, given above.

Step 2. Push negations in until they apply only to atoms, repeatedly replacing by the equivalences

    ¬¬A ≃ A
    ¬(A ∧ B) ≃ ¬A ∨ ¬B
    ¬(A ∨ B) ≃ ¬A ∧ ¬B

At this point, the formula is in Negation Normal Form.

Step 3. To obtain CNF, push disjunctions in until they apply only to literals. Repeatedly replace by the equivalences

    A ∨ (B ∧ C) ≃ (A ∨ B) ∧ (A ∨ C)
    (B ∧ C) ∨ A ≃ (B ∨ A) ∧ (C ∨ A)

Note also that

    (A ∧ B) ∨ (C ∧ D) ≃ (A ∨ C) ∧ (A ∨ D) ∧ (B ∨ C) ∧ (B ∨ D).

Use this equivalence when you can, to save writing.

Step 4. Simplify the resulting CNF by deleting any disjunction that contains both P and ¬P, since it is equivalent to t. Also delete any conjunct that includes another conjunct (meaning, every literal in the latter is also present in the former). This is correct because (A ∨ B) ∧ A ≃ A. Finally, two disjunctions of the form P ∨ A and ¬P ∨ A can be replaced by A, thanks to the equivalence

    (P ∨ A) ∧ (¬P ∨ A) ≃ A.

This simplification is related to the resolution rule, which we shall study later.

Since ∨ is commutative, a conjunct of the form A ∨ B could denote any possible way of arranging the literals into two parts. This includes A ∨ f, since one of those parts may be empty and the empty disjunction is false. So in the last simplification above, two conjuncts of the form P and ¬P can be replaced by f.

Steps 3′ and 4′. To obtain DNF, apply instead the other distributive law:

    A ∧ (B ∨ C) ≃ (A ∧ B) ∨ (A ∧ C)
    (B ∨ C) ∧ A ≃ (B ∧ A) ∨ (C ∧ A)

Exactly the same simplifications can be performed for DNF as for CNF, exchanging the roles of ∧ and ∨.
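Steps 1 and 2 are easy to mechanise. The following sketch in Python uses an illustrative tuple encoding of formulas (the encoding, and the function names, are assumptions made for this example only).

```python
# Formulas: ('var', name), ('not', A), ('and', A, B), ('or', A, B), ('imp', A, B)

def elim_imp(f):
    """Step 1: eliminate ->, using the law  A -> B  ~  (not A) or B."""
    tag = f[0]
    if tag == 'var': return f
    if tag == 'not': return ('not', elim_imp(f[1]))
    if tag == 'imp': return ('or', ('not', elim_imp(f[1])), elim_imp(f[2]))
    return (tag, elim_imp(f[1]), elim_imp(f[2]))

def nnf(f):
    """Step 2: push negations in until they apply only to atoms.
    Assumes Step 1 has already removed every occurrence of ->."""
    tag = f[0]
    if tag == 'var': return f
    if tag == 'not':
        g = f[1]
        if g[0] == 'var': return f                 # a literal: nothing to do
        if g[0] == 'not': return nnf(g[1])         # double negation
        if g[0] == 'and':                          # de Morgan: not(A and B)
            return ('or', nnf(('not', g[1])), nnf(('not', g[2])))
        if g[0] == 'or':                           # de Morgan: not(A or B)
            return ('and', nnf(('not', g[1])), nnf(('not', g[2])))
    return (tag, nnf(f[1]), nnf(f[2]))

P, Q = ('var', 'P'), ('var', 'Q')
# not(P -> Q) becomes P and (not Q), as in the NNF discussion above
assert nnf(elim_imp(('not', ('imp', P, Q)))) == ('and', P, ('not', Q))
```

Steps 3 and 4 (distribution and simplification) can be added in the same recursive style.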
interpretation I that falsifies every Lⱼ, and therefore falsifies Aᵢ. Define I such that, for every propositional letter P,

    I(P) = 0 if Lⱼ is P for some j
    I(P) = 1 if Lⱼ is ¬P for some j

This definition is legitimate because there cannot exist literals Lⱼ and Lₖ such that Lⱼ is ¬Lₖ; if there did, then simplification would have deleted the disjunction Aᵢ.

The powerful BDD method is based on similar ideas, but uses an if-then-else data structure, an ordering on the propositional letters, and some standard algorithmic techniques (such as hashing) to gain efficiency.

Example 1 Start with

    P ∨ Q → Q ∨ R.

Step 1, eliminate →, gives

    ¬(P ∨ Q) ∨ (Q ∨ R).

Step 2, push negations in, gives

    (¬P ∧ ¬Q) ∨ (Q ∨ R).

Step 3, push disjunctions in, gives

    (¬P ∨ Q ∨ R) ∧ (¬Q ∨ Q ∨ R).

Simplifying yields (¬P ∨ Q ∨ R) ∧ t and then

    ¬P ∨ Q ∨ R.

The interpretation P ↦ 1, Q ↦ 0, R ↦ 0 falsifies this formula, which is equivalent to the original formula. So the original formula is not valid.

Example 2 Start with

    P ∧ Q → Q ∧ P

Step 1, eliminate →, gives

    ¬(P ∧ Q) ∨ (Q ∧ P)

Step 2, push negations in, gives

    (¬P ∨ ¬Q) ∨ (Q ∧ P)

Step 3, push disjunctions in, gives

    (¬P ∨ ¬Q ∨ Q) ∧ (¬P ∨ ¬Q ∨ P)

Simplifying yields t ∧ t, which is t. Both conjuncts are valid since they contain a formula and its negation. Thus P ∧ Q → Q ∧ P is valid.

Example 3 Peirce's law is another example. Start with

    ((P → Q) → P) → P

Step 1, eliminate →, gives

    ¬(¬(¬P ∨ Q) ∨ P) ∨ P

Step 2, push negations in, gives

    (¬¬(¬P ∨ Q) ∧ ¬P) ∨ P
    ((¬P ∨ Q) ∧ ¬P) ∨ P

Step 3, push disjunctions in, gives

    (¬P ∨ Q ∨ P) ∧ (¬P ∨ P)

Simplifying again yields t. Thus Peirce's law is valid.

There is a dual method of refuting A (proving inconsistency). To refute A, reduce it to DNF, say A₁ ∨ · · · ∨ Aₘ. If A is inconsistent then so is each Aᵢ. Suppose Aᵢ is L₁ ∧ · · · ∧ Lₙ, where the Lⱼ are literals. If there is some literal L′ such that the Lⱼ include both L′ and ¬L′, then Aᵢ is inconsistent. If not then there is an interpretation that verifies every Lⱼ, and therefore Aᵢ.

To prove A, we can use the DNF method to refute ¬A. The steps are exactly the same as the CNF method because the extra negation swaps every ∨ and ∧. Gilmore implemented a theorem prover based upon this method in 1960.

Exercise 1 Is the formula P → ¬P satisfiable, or valid?

Exercise 2 Verify the de Morgan and distributive laws using truth tables.

Exercise 3 Each of the following formulas is satisfiable but not valid. Exhibit an interpretation that makes the formula true and another that makes the formula false.

    P → Q            P ∨ Q → P ∧ Q
    ¬(P ∨ Q ∨ R)     ¬(P ∧ Q) ∧ ¬(Q ∨ R) ∧ (P ∨ R)

Exercise 4 Convert each of the following propositional formulas into Conjunctive Normal Form and also into Disjunctive Normal Form. For each formula, state whether it is valid, satisfiable, or unsatisfiable; justify each answer.

    (P → Q) ∧ (Q → P)
    ((P ∧ Q) ∨ R) ∧ ¬(P ∨ R)
    ¬(P ∨ Q ∨ R) ∨ ((P ∧ Q) ∨ R)

Exercise 5 Using ML, define datatypes for representing propositions and interpretations. Write a function to test whether or not a proposition holds under an interpretation (both supplied as arguments). Write a function to convert a proposition to Negation Normal Form.

3 Proof Systems for Propositional Logic

We can verify any tautology by checking all possible interpretations, using the truth tables. This is a semantic approach, since it appeals to the meanings of the connectives. The syntactic approach is formal proof: generating theorems, or reducing a conjecture to a known theorem, by applying syntactic transformations of some sort. We have already seen a proof method based on CNF. Most proof methods are based on axioms and inference rules.
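Checking all possible interpretations is straightforward to implement. The following Python sketch (the tuple encoding of formulas is an illustrative assumption) confirms the verdicts reached in Examples 1 to 3 above.

```python
from itertools import product

# Formulas: ('var', name), ('not', A), ('and', A, B), ('or', A, B), ('imp', A, B)
def holds(f, interp):
    tag = f[0]
    if tag == 'var': return interp[f[1]]
    if tag == 'not': return not holds(f[1], interp)
    if tag == 'and': return holds(f[1], interp) and holds(f[2], interp)
    if tag == 'or':  return holds(f[1], interp) or holds(f[2], interp)
    if tag == 'imp': return (not holds(f[1], interp)) or holds(f[2], interp)
    raise ValueError(tag)

def tautology(f, letters):
    """True iff f holds under every interpretation of the given letters."""
    return all(holds(f, dict(zip(letters, vals)))
               for vals in product((False, True), repeat=len(letters)))

P, Q, R = ('var', 'P'), ('var', 'Q'), ('var', 'R')
peirce = ('imp', ('imp', ('imp', P, Q), P), P)
assert tautology(peirce, 'PQ')                                    # Example 3
assert tautology(('imp', ('and', P, Q), ('and', Q, P)), 'PQ')     # Example 2
assert not tautology(('imp', ('or', P, Q), ('or', Q, R)), 'PQR')  # Example 1
```

This is the semantic approach in miniature; the proof systems below take the syntactic route instead.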
What about efficiency? Deciding whether a propositional formula is satisfiable is an NP-complete problem (Aho, Hopcroft and Ullman 1974, pages 377–383). Thus all approaches are likely to be exponential in the length of the formula. Technologies such as BDDs and SAT solvers, which can decide huge problems in propositional logic, are all the more stunning because their success was wholly unexpected. But even they require a "well-behaved" input formula, and are exponential in the worst case.

3.1 A Hilbert-style proof system

Here is a simple proof system for propositional logic. There are countless similar systems. They are often called Hilbert systems after the logician David Hilbert, although they existed before him.

This proof system provides rules for implication only. The other logical connectives are not taken as primitive. They are instead defined in terms of implication:

    ¬A ≝ A → f
    A ∨ B ≝ ¬A → B
    A ∧ B ≝ ¬(¬A ∨ ¬B)

Obviously, these definitions apply only when we are discussing this proof system!

Note that A → (B → A) is a tautology. Call it Axiom K. Also,

    (A → (B → C)) → ((A → B) → (A → C))

is a tautology. Call it Axiom S. The Double-Negation Law ¬¬A → A is a tautology. Call it Axiom DN.

These axioms are more properly called axiom schemes, since we assume all instances of them that can be obtained by substituting formulas for A, B and C. For example, Axiom K is really an infinite set of formulas.

Whenever A → B and A are both valid, it follows that B is valid. We write this as the inference rule

    A → B    A
    ──────────
        B

This rule is traditionally called Modus Ponens. Together with Axioms K, S, and DN and the definitions, it suffices to prove all tautologies of classical propositional logic.[3] However, this formalization of propositional logic is inconvenient to use. For example, try proving A → A!

A variant of this proof system replaces the Double-Negation Law by the Contrapositive Law:

    (¬B → ¬A) → (A → B)

Another formalization of propositional logic consists of the Modus Ponens rule plus the following axioms:

    A ∨ A → A
    B → A ∨ B
    A ∨ B → B ∨ A
    (B → C) → (A ∨ B → A ∨ C)

Here A ∧ B and A → B are defined in terms of ¬ and ∨.

Where do truth tables fit into all this? Truth tables define the semantics, while proof systems define what is sometimes called the proof theory. A proof system must respect the truth tables. Above all, we expect the proof system to be sound: every theorem it generates must be a tautology. For this to hold, every axiom must be a tautology and every inference rule must yield a tautology when it is applied to tautologies.

The converse property is completeness: the proof system can generate every tautology. Completeness is harder to achieve and show. There are complete proof systems even for first-order logic. (And Gödel's incompleteness theorem uses the word "completeness" with a different technical meaning.)

3.2 Gentzen's Natural Deduction Systems

Natural proof systems do exist. Natural deduction, devised by Gerhard Gentzen, is based upon three principles:

1. Proof takes place within a varying context of assumptions.

2. Each logical connective is defined independently of the others. (This is possible because item 1 eliminates the need for tricky uses of implication.)

3. Each connective is defined by introduction and elimination rules.

For example, the introduction rule for ∧ describes how to deduce A ∧ B:

    A    B
    ────── (∧i)
    A ∧ B

The elimination rules for ∧ describe what to deduce from A ∧ B:

    A ∧ B             A ∧ B
    ───── (∧e1)       ───── (∧e2)
      A                 B

The elimination rule for → says what to deduce from A → B. It is just Modus Ponens:

    A → B    A
    ────────── (→e)
        B

The introduction rule for → says that A → B is proved by assuming A and deriving B:

    [A]
     ⋮
     B
    ───── (→i)
    A → B

For simple proofs, this notion of assumption is pretty intuitive. Here is a proof of the formula A ∧ B → A:

    [A ∧ B]
    ─────── (∧e1)
       A
    ───────── (→i)
    A ∧ B → A

The key point is that rule (→i) discharges its assumption: the assumption could be used to prove A from A ∧ B, but is no longer available once we conclude A ∧ B → A.

[3] If the Double-Negation Law is omitted, we get so-called intuitionistic logic. This axiom system is motivated by a philosophy of constructive mathematics. In a precise sense, it is connected with advanced topics including type theory and the combinators S and K in the λ-calculus.
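The challenge "try proving A → A!" can be met in five steps using only Axioms K and S and Modus Ponens. Here is a tiny proof checker for the system, sketched in Python (the encoding of formulas and the helper names are illustrative assumptions, not part of the notes):

```python
# Formulas: ('var', v) or ('imp', A, B).
def imp(a, b):
    return ('imp', a, b)

def axiom_K(a, b):
    return imp(a, imp(b, a))                                   # A -> (B -> A)

def axiom_S(a, b, c):                                          # Axiom S scheme
    return imp(imp(a, imp(b, c)), imp(imp(a, b), imp(a, c)))

def modus_ponens(ab, a):
    """From A -> B and A, conclude B; reject non-matching premises."""
    assert ab[0] == 'imp' and ab[1] == a, "premises do not match"
    return ab[2]

# The classic derivation of A -> A:
A, B = ('var', 'A'), ('var', 'B')
s1 = axiom_S(A, imp(B, A), A)  # (A->((B->A)->A)) -> ((A->(B->A)) -> (A->A))
s2 = axiom_K(A, imp(B, A))     # A -> ((B->A) -> A)
s3 = modus_ponens(s1, s2)      # (A -> (B->A)) -> (A -> A)
s4 = axiom_K(A, B)             # A -> (B->A)
s5 = modus_ponens(s3, s4)      # A -> A
assert s5 == imp(A, A)
```

Because the axioms are schemes, `axiom_K` and `axiom_S` take arbitrary formulas as arguments, mirroring substitution into the schemes.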
The introduction rules for ∨ are straightforward:

      A                 B
    ───── (∨i1)       ───── (∨i2)
    A ∨ B             A ∨ B

The elimination rule says that to show some C from A ∨ B there are two cases to consider, one assuming A and one assuming B:

             [A]    [B]
              ⋮      ⋮
    A ∨ B     C      C
    ─────────────────── (∨e)
            C

The scope of assumptions can get confusing in complex proofs. Let us switch attention to the sequent calculus, which is similar in spirit but easier to use.

3.3 The sequent calculus

The sequent calculus resembles natural deduction, but it makes the set of assumptions explicit. Thus, it is more concrete.

A sequent has the form Γ ⇒ Δ, where Γ and Δ are finite sets of formulas.[4] These sets may be empty. The sequent

    A₁, …, Aₘ ⇒ B₁, …, Bₙ

is true if A₁ ∧ · · · ∧ Aₘ implies B₁ ∨ · · · ∨ Bₙ. Sequent rules still have the same logical meaning, namely that the premises (above the line) imply the conclusion (below the line). Instead of matching a rule's premises with facts that we know, we match its conclusion with the formula we want to prove. That way, the form of the desired theorem controls the proof search. The rule (→r) is typical:

    A, Γ ⇒ Δ, B
    ──────────── (→r)
    Γ ⇒ Δ, A → B

Working backwards, this rule breaks down some implication on the right side of a sequent; Γ and Δ stand for the sets of formulas that are unaffected by the inference. The analogue of the pair (∨i1) and (∨i2) is the single rule

    Γ ⇒ Δ, A, B
    ──────────── (∨r)
    Γ ⇒ Δ, A ∨ B

This breaks down some disjunction on the right side, replacing it by both disjuncts. Thus, the sequent calculus is a kind of multiple-conclusion logic. Figure 1 summarises the rules.

    [Figure 1: the rules of the sequent calculus. Basic sequent: A, Γ ⇒ A, Δ; negation, conjunction, disjunction and implication rules for the left and right sides.]

Let us prove that the rule (∨l) is sound. We must show that if both premises are valid, then so is the conclusion. For contradiction, assume that the conclusion, A ∨ B, Γ ⇒ Δ, is not valid. Then there exists an interpretation I under which the left side is true while the right side is false; in particular, A ∨ B and Γ are true while Δ is false. Since A ∨ B is true under interpretation I, either A is true or B is. In the former case, A, Γ ⇒ Δ is false; in the latter case, B, Γ ⇒ Δ is false. Either case contradicts the assumption that the premises are valid.

The distributive law A ∨ (B ∧ C) ≃ (A ∨ B) ∧ (A ∨ C) is proved (one direction at least) as follows:

                            B, C ⇒ A, B
                           ───────────── (∧l)
    A ⇒ A, B               B ∧ C ⇒ A, B
    ──────────────────────────────────── (∨l)
            A ∨ (B ∧ C) ⇒ A, B
            ─────────────────── (∨r)
            A ∨ (B ∧ C) ⇒ A ∨ B                    similar
    ─────────────────────────────────────────────────────── (∧r)
            A ∨ (B ∧ C) ⇒ (A ∨ B) ∧ (A ∨ C)

The second, omitted proof tree proves A ∨ (B ∧ C) ⇒ A ∨ C similarly.

Finally, here is a failed proof of the invalid formula A ∨ B → B ∨ C:

    A ⇒ B, C    B ⇒ B, C
    ───────────────────── (∨l)
        A ∨ B ⇒ B, C
        ──────────── (∨r)
       A ∨ B ⇒ B ∨ C
      ─────────────── (→r)
      ⇒ A ∨ B → B ∨ C

The cut rule

    Γ ⇒ Δ, A    A, Γ ⇒ Δ
    ───────────────────── (cut)
           Γ ⇒ Δ

allows lemmas to be used. The cut-elimination theorem states that this rule is not required: all uses of it can be removed from any proof, but the proof could get exponentially larger. This special case of cut may be easier to understand. We prove lemma A from Γ and use A and Γ together to reach the conclusion B.

    Γ ⇒ B, A    A, Γ ⇒ B
    ─────────────────────
           Γ ⇒ B

Since Γ contains as much information as A, it is natural to expect that such lemmas should not be necessary, but the cut-elimination theorem is hard to prove.

Note On the course website, there is a simple theorem prover called folderol.ML. It can prove easy first-order theorems using the sequent calculus, and outputs a summary of each proof. The file begins with very basic instructions describing how to run it. The file testsuite.ML contains further instructions and numerous examples.

Exercise 6 Prove the following sequents:

3.5 Further Sequent Calculus Rules

Exercise 7 Prove the following sequents:
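Sequents like these can also be proved mechanically. Every propositional sequent rule is invertible, so backward proof search may apply any applicable rule and never needs to backtrack. A sketch in Python (the tuple encoding of formulas, and the use of Python sets for Γ and Δ, are illustrative choices, not the notes' own notation):

```python
# Formulas: ('var', v), ('not', A), ('and', A, B), ('or', A, B), ('imp', A, B).
# A sequent Gamma => Delta is a pair of sets of formulas.
def prove(gamma, delta):
    """Backward proof search in the propositional sequent calculus."""
    if gamma & delta:
        return True                          # basic sequent: A, Gamma => A, Delta
    for f in gamma:                          # try to break down a left formula
        tag = f[0]
        if tag == 'var':
            continue
        rest = gamma - {f}
        if tag == 'not': return prove(rest, delta | {f[1]})
        if tag == 'and': return prove(rest | {f[1], f[2]}, delta)
        if tag == 'or':  return (prove(rest | {f[1]}, delta) and
                                 prove(rest | {f[2]}, delta))
        if tag == 'imp': return (prove(rest, delta | {f[1]}) and
                                 prove(rest | {f[2]}, delta))
    for f in delta:                          # try to break down a right formula
        tag = f[0]
        if tag == 'var':
            continue
        rest = delta - {f}
        if tag == 'not': return prove(gamma | {f[1]}, rest)
        if tag == 'and': return (prove(gamma, rest | {f[1]}) and
                                 prove(gamma, rest | {f[2]}))
        if tag == 'or':  return prove(gamma, rest | {f[1], f[2]})
        if tag == 'imp': return prove(gamma | {f[1]}, rest | {f[2]})
    return False                             # only atoms remain, none shared

P, Q, R = ('var', 'P'), ('var', 'Q'), ('var', 'R')
# one direction of the distributive law, proved by hand above:
assert prove({('or', P, ('and', Q, R))},
             {('and', ('or', P, Q), ('or', P, R))})
# the failed proof: A or B => B or C really is unprovable
assert not prove({('or', P, Q)}, {('or', Q, R)})
```

This is the propositional core of what a prover like folderol.ML does; quantifier rules require considerably more care.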
Definition 3 The terms t, u, … of a first-order language are defined recursively as follows:

• A variable is a term.

• A constant symbol is a term.

• If t₁, …, tₙ are terms and f is an n-place function symbol then f(t₁, …, tₙ) is a term.

Definition 4 The formulas A, B, … of a first-order language are defined recursively as follows:

• If t₁, …, tₙ are terms and P is an n-place predicate symbol then P(t₁, …, tₙ) is a formula (called an atomic formula).

• If A and B are formulas then ¬A, A ∧ B, A ∨ B, A → B, A ↔ B are also formulas.

• If x is a variable and A is a formula then ∀x A and ∃x A are also formulas.

Brackets are used in the conventional way for grouping. Terms and formulas are tree-like data structures, not strings.

The quantifiers ∀x A and ∃x A bind tighter than the binary connectives; thus ∀x A ∧ B is equivalent to (∀x A) ∧ B. Frequently, you will see an alternative quantifier syntax, ∀x . A and ∃x . B, which binds more weakly than the binary connectives: ∀x . A ∧ B is equivalent to ∀x (A ∧ B). The dot is the give-away; look out for it!

Nested quantifications such as ∀x ∀y A are abbreviated to ∀x y A.

Example 4 A language for arithmetic might have the constant symbols 0, 1, 2, …, and function symbols +, −, ×, /, and the predicate symbols =, <, >, …. We may informally adopt an infix notation for the function and predicate symbols. Terms include 0 and (x + 3) − y; formulas include y = 0 and x + y < y + z.

4.2 Examples of first-order statements

Here are some sample formulas with a rough English translation. English is easier to understand but is too ambiguous for long derivations.

All professors are brilliant:

    ∀x (professor(x) → brilliant(x))

The income of any banker is greater than the income of any bedder:

    ∀x y (banker(x) ∧ bedder(y) → income(x) > income(y))

Note that > is a 2-place relation symbol. The infix notation is simply a convention.

Every student has a supervisor:

Every student's tutor is a member of the student's College:

Formalising the notion of tutor as a function incorporates the assumption that every student has exactly one tutor.

A mathematical example: there exist infinitely many Pythagorean triples:

    ∀n ∃i j k (i > n ∧ i² + j² = k²)

Here the superscript 2 refers to the squaring function. Equality (=) is just another relation symbol (satisfying suitable axioms) but there are many special techniques for it.

First-order logic requires a non-empty domain: thus ∀x P(x) implies ∃x P(x). If the domain could be empty, even ∃x t could fail to hold. Note also that ∀x ∃y y² = x is true if the domain is the complex numbers, and is false if the domain is the integers or reals. We determine properties of the domain by asserting the set of statements it must satisfy.

There are many other forms of logic. Many-sorted first-order logic assigns types to each variable, function symbol and predicate symbol, with straightforward type checking; types are called sorts and denote non-empty domains. Second-order logic allows quantification over functions and predicates. It can express mathematical induction by

    ∀P [P(0) ∧ ∀k (P(k) → P(k + 1)) → ∀n P(n)],

using quantification over the unary predicate P. In second-order logic, these functions and predicates must themselves be first-order, taking no functions or predicates as arguments. Higher-order logic allows unrestricted quantification over functions and predicates of any order. The list of logics could be continued indefinitely.

4.3 Formal semantics of first-order logic

Let us rigorously define the meaning of formulas. An interpretation of a language maps its function symbols to actual functions, and its relation symbols to actual relations. For example, the predicate symbol "student" could be mapped to the set of all students currently enrolled at the University.

Definition 5 Let L be a first-order language. An interpretation I of L is a pair (D, I). Here D is a nonempty set, the domain or universe. The operation I maps symbols to individuals, functions or sets:

• if c is a constant symbol (of L) then I[c] ∈ D

• if f is an n-place function symbol then I[f] ∈ Dⁿ → D (which means I[f] is an n-place function on D)

• if P is an n-place relation symbol then I[P] ∈ Dⁿ → {1, 0} (equivalently, I[P] ⊆ Dⁿ, which means I[P] is an n-place relation on D)
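Definitions 3 and 4 translate directly into tree-like data structures, as the remark above suggests. A small Python sketch (the tuple encoding is an illustrative choice; Exercise 5 asks for the analogous ML datatypes in the propositional case):

```python
# Terms:    ('var', x) or ('fun', f, [t1, ..., tn]); constants are 0-place functions.
# Formulas: ('pred', P, [t1, ..., tn]), ('not', A), ('and', A, B), ('or', A, B),
#           ('imp', A, B), ('iff', A, B), ('all', x, A), ('ex', x, A).

# The term (x + 3) - y of Example 4, with the infix symbols written prefix:
t = ('fun', '-', [('fun', '+', [('var', 'x'), ('fun', '3', [])]),
                  ('var', 'y')])

# The formula "all professors are brilliant" from Section 4.2:
f = ('all', 'x',
     ('imp', ('pred', 'professor', [('var', 'x')]),
             ('pred', 'brilliant', [('var', 'x')])))

def fun_symbols(t):
    """Collect the function symbols of a term: it is a tree, not a string."""
    if t[0] == 'var':
        return set()
    syms = {t[1]}
    for u in t[2]:
        syms |= fun_symbols(u)
    return syms

assert fun_symbols(t) == {'-', '+', '3'}
```

Because terms are trees, recursive functions over them follow the shape of Definition 3 exactly, one case per clause.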
One of the benefits of higher-order logic is that relations are a special case of functions, and formulas are simply boolean-valued terms.

An interpretation does not say anything about variables. An environment or valuation can be used to represent the values of variables.

Definition 6 A valuation V of L over D is a function from the variables of L into D. Write IV[t] for the value of t with respect to I and V, defined by

    IV[x] ≝ V(x)    if x is a variable
    IV[c] ≝ I[c]
    IV[f(t₁, …, tₙ)] ≝ I[f](IV[t₁], …, IV[tₙ])

Write V{a/x} for the valuation that maps x to a and is otherwise the same as V. Typically, we modify a valuation one variable at a time. This is a semantic analogue of substitution for the variable x.

4.4 What is truth?

We can define truth in first-order logic. This formidable definition formalizes the intuitive meanings of the connectives. Thus it almost looks like a tautology. It effectively specifies each connective by English descriptions. Valuations help specify the meanings of quantifiers. Alfred Tarski first defined truth in this manner.

Definition 7 Let A be a formula. Then for an interpretation I = (D, I) write ⊨I,V A to mean that A is true in I under V. This is defined by cases on the construction of the formula A:

    ⊨I,V P(t₁, …, tₙ) is defined to hold if I[P](IV[t₁], …, IV[tₙ]) = 1 (that is, the actual relation I[P] is true for the given values)

    ⊨I,V t = u if IV[t] equals IV[u] (if = is a predicate symbol of the language, then we insist that it really denotes equality)

    ⊨I,V ¬B if ⊨I,V B does not hold

    ⊨I,V B ∧ C if ⊨I,V B and ⊨I,V C

    ⊨I,V B ∨ C if ⊨I,V B or ⊨I,V C

    ⊨I,V B → C if ⊨I,V B does not hold or ⊨I,V C holds

    ⊨I,V B ↔ C if ⊨I,V B and ⊨I,V C both hold or neither hold

    ⊨I,V ∃x B if there exists some m ∈ D such that ⊨I,V{m/x} B holds (that is, B holds when x has the value m)

    ⊨I,V ∀x B if ⊨I,V{m/x} B holds for all m ∈ D

The cases for ∧, ∨, → and ↔ follow the propositional truth tables.

Write ⊨I A provided ⊨I,V A for all V. Clearly, if A is closed (contains no free variables) then its truth is independent of the valuation. The definitions of valid, satisfiable, etc. carry over almost verbatim from Sect. 2.2.

Definition 8 Let A be a formula having no free variables.

• An interpretation I satisfies a formula A if ⊨I A holds.

• A set S of formulas is valid if every interpretation of S satisfies every formula in S.

• A set S of formulas is satisfiable (or consistent) if there is some interpretation that satisfies every formula in S.

• A set S of formulas is unsatisfiable (or inconsistent) if it is not satisfiable. (Each interpretation falsifies some formula of S.)

• A model of a set S of formulas is an interpretation that satisfies every formula in S. We also consider models that satisfy a single formula.

Unlike in propositional logic, models can be infinite and there can be an infinite number of models. There is no chance of proving validity by checking all models. We must rely on proof.

Example 5 The formula P(a) ∧ ¬P(b) is satisfiable. Consider the interpretation with D = {London, Paris} and I defined by

    I[a] = Paris
    I[b] = London
    I[P] = {Paris}

On the other hand, ∀x y (P(x) ∧ ¬P(y)) is unsatisfiable because it requires P(x) to be both true and false for all x. Also unsatisfiable is P(x) ∧ ¬P(y): its free variables are taken to be universally quantified, so it is equivalent to ∀x y (P(x) ∧ ¬P(y)).

The formula (∃x P(x)) → P(c) holds in the interpretation (D, I) where D = {0, 1}, I[P] = {0}, and I[c] = 0. (Thus P(x) means "x equals 0" and c denotes 0.) If we modify this interpretation by making I[c] = 1 then the formula no longer holds. Thus it is satisfiable but not valid.

The formula (∀x P(x)) → (∀x P(f(x))) is valid, for let (D, I) be an interpretation. If ∀x P(x) holds in this interpretation then P(x) holds for all x ∈ D, thus I[P] = D. The symbol f denotes some actual function I[f] ∈ D → D. Since I[P] = D and I[f](x) ∈ D for all x ∈ D, formula ∀x P(f(x)) holds.

The formula ∀x y x = y is satisfiable but not valid; it is true in every domain that consists of exactly one element. (The empty domain is not allowed in first-order logic.)

Example 6 Let L be the first-order language consisting of the constant 0 and the (infix) 2-place function symbol +. An interpretation I of this language is any non-empty domain D together with values I[0] and I[+], with I[0] ∈ D and I[+] ∈ D × D → D. In the language L we may express the following axioms:

    x + 0 = x
    0 + x = x
    (x + y) + z = x + (y + z)
(Remember, free variables in effect are universally quantified, by the definition of ⊨_I A.) One model of these axioms is the set of natural numbers, provided we give 0 and + the obvious meanings. But the axioms have many other models.⁵ Below, let A be some set.

1. The set of all strings (in ML, say), letting 0 denote the empty string and + string concatenation.

2. The set of all subsets of A, letting 0 denote the empty set and + union.

3. The set of functions in A → A, letting 0 denote the identity function and + composition.

⁵ Models of these axioms are called monoids.

Exercise 8 To test your understanding of quantifiers, consider the following formulas: everybody loves somebody vs there is somebody that everybody loves:

∀x ∃y loves(x, y) (1)
∃y ∀x loves(x, y) (2)

Does (1) imply (2)? Does (2) imply (1)? Consider both the informal meaning and the formal semantics defined above.

Exercise 9 Describe a formula that is true in precisely those domains that contain at least m elements. (We say it characterises those domains.) Describe a formula that characterises the domains containing fewer than m elements.

Exercise 10 Let ≈ be a 2-place predicate symbol, which we write using infix notation as x ≈ y instead of ≈(x, y). Consider the axioms

∀x (x ≈ x) (1)
∀x y (x ≈ y → y ≈ x) (2)
∀x y z (x ≈ y ∧ y ≈ z → x ≈ z) (3)

Let the universe be the set of natural numbers, N = {0, 1, 2, . . .}. Which axioms hold if I[≈] is

1. the empty relation, ∅?
2. the universal relation, {(x, y) | x, y ∈ N}?
3. the equality relation, {(x, x) | x ∈ N}?
4. the relation {(x, y) | x, y ∈ N ∧ x + y is even}?
5. the relation {(x, y) | x, y ∈ N ∧ x + y = 100}?
6. the relation {(x, y) | x, y ∈ N ∧ x ≤ y}?

Exercise 11 Taking = and R as 2-place relation symbols, consider the following axioms:

∀x ¬R(x, x) (1)
∀x y ¬(R(x, y) ∧ R(y, x)) (2)
∀x y z (R(x, y) ∧ R(y, z) → R(x, z)) (3)
∀x y (R(x, y) ∨ (x = y) ∨ R(y, x)) (4)
∀x z (R(x, z) → ∃y (R(x, y) ∧ R(y, z))) (5)

Exhibit two interpretations that satisfy axioms 1–5. Exhibit two interpretations that satisfy axioms 1–4 and falsify axiom 5. Exhibit two interpretations that satisfy axioms 1–3 and falsify axioms 4 and 5. Consider only interpretations that make = denote the equality relation. (This exercise asks whether you can make the connection between the axioms and typical mathematical objects satisfying them. A start is to say that R(x, y) means x < y, but on what domain?)

5 Formal Reasoning in First-Order Logic

This section reviews some syntactic notations: free variables versus bound variables and substitution. It lists some of the main equivalences for quantifiers. Finally it describes and illustrates the quantifier rules of the sequent calculus.

5.1 Free vs bound variables

The notion of bound variable occurs widely in mathematics: consider the role of x in ∫ f(x) dx and the role of k in lim_{k→∞} a_k. Similar concepts occur in the λ-calculus. In first-order logic, variables are bound by quantifiers (rather than by λ).

Definition 9 An occurrence of a variable x in a formula is bound if it is contained within a subformula of the form ∀x A or ∃x A.
An occurrence of the form ∀x or ∃x is called the binding occurrence of x.
An occurrence of a variable is free if it is not bound.
A closed formula is one that contains no free variables.
A ground term, formula, etc. is one that contains no variables at all.

In ∀x ∃y R(x, y, z), the variables x and y are bound while z is free.

In (∃x P(x)) ∧ Q(x), the occurrence of x just after P is bound, while that just after Q is free. Thus x has both free and bound occurrences. Such situations can be avoided by renaming bound variables, for example obtaining (∃y P(y)) ∧ Q(x). Renaming can also ensure that all bound variables in a formula are distinct. The renaming of bound variables is sometimes called α-conversion.

Example 7 Renaming bound variables in a formula preserves its meaning, if done properly. Consider the following renamings of ∀x ∃y R(x, y, z):

∀u ∃y R(u, y, z)    OK
∀x ∃w R(x, w, z)    OK
∀u ∃y R(x, y, z)    not done consistently
∀y ∃y R(y, y, z)    clash with bound variable y
∀z ∃y R(z, y, z)    clash with free variable z

5.2 Substitution

If A is a formula, t is a term, and x is a variable, then A[t/x] is the formula obtained by substituting t for x throughout A. The substitution only affects the free occurrences of x. Pronounce A[t/x] as "A with t for x". We also use u[t/x] for substitution in a term u and C[t/x] for substitution in a clause C (clauses are described in Sect. 6 below).
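Substitution into quantified formulas needs care: a naive A[t/x] can capture variables of t under a quantifier. The sketch below is not from the notes; it assumes a tuple encoding of formulas and renames a bound variable before substituting, as when y + 1 replaces x in ∃y x = y.

```python
import itertools

# Formulas: ("=", t1, t2) atomic; ("ex", x, A); ("all", x, A).
# Terms: names (strings, covering variables and constants) or tuples ("+", t, u).

def term_vars(t):
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(a) for a in t[1:]))

def subst_term(t, new, x):
    if isinstance(t, str):
        return new if t == x else t
    return (t[0],) + tuple(subst_term(a, new, x) for a in t[1:])

def subst(A, new, x):
    """A[new/x], renaming bound variables to avoid capture."""
    op = A[0]
    if op in ("all", "ex"):
        _, y, B = A
        if y == x:                      # x is bound here: no free occurrences
            return A
        if y in term_vars(new):         # the bound y would capture a variable of `new`
            fresh = next(v for v in ("z%d" % i for i in itertools.count())
                         if v != x and v not in term_vars(new))
            B = subst(B, fresh, y)      # rename the bound y to a fresh name
            y = fresh
        return (op, y, subst(B, new, x))
    return (op,) + tuple(subst_term(t, new, x) for t in A[1:])

# ∃y x = y, substituting y + 1 for x: the bound y must be renamed first.
A = ("ex", "y", ("=", "x", "y"))
result = subst(A, ("+", "y", "1"), "x")
print(result)   # ('ex', 'z0', ('=', ('+', 'y', '1'), 'z0'))
```

The naive substitution would have produced ∃y (y + 1 = y), which is false in the usual models; the renamed result ∃z (y + 1 = z) is true in all of them.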
Substitution is only sensible provided all bound variables in A are distinct from all variables in t. This can be achieved by renaming the bound variables in A. For example, if ∀x A holds then A[t/x] is true for all t; the formula holds when we drop the ∀x and replace x by any term. But ∀x ∃y x = y is true in all models, while ∃y y + 1 = y is not. We may not replace x by y + 1, since the free occurrence of y in y + 1 gets captured by the ∃y. First we must rename the bound y, getting say ∀x ∃z x = z; now we may replace x by y + 1, getting ∃z y + 1 = z. This formula is true in all models, regardless of the meaning of the symbols + and 1.

5.3 Equivalences involving quantifiers

These equivalences are useful for transforming and simplifying quantified formulas. They can be used to convert formulas into prenex normal form, where all quantifiers are at the front, or conversely to move quantifiers into the smallest possible scope.

pulling quantifiers through negation (infinitary de Morgan laws)

¬(∀x A) ≃ ∃x ¬A
¬(∃x A) ≃ ∀x ¬A

pulling quantifiers through conjunction and disjunction (provided x is not free in B)

(∀x A) ∧ B ≃ ∀x (A ∧ B)
(∀x A) ∨ B ≃ ∀x (A ∨ B)
(∃x A) ∧ B ≃ ∃x (A ∧ B)
(∃x A) ∨ B ≃ ∃x (A ∨ B)

Two substitution laws do not involve quantifiers explicitly, but let us use x = t to replace x by t in a restricted context:

(x = t ∧ A) ≃ (x = t ∧ A[t/x])
(x = t → A) ≃ (x = t → A[t/x])

Many first-order formulas have easy proofs using equivalences:

∃x (x = a ∧ P(x)) ≃ ∃x (x = a ∧ P(a))
                  ≃ ∃x (x = a) ∧ P(a)
                  ≃ P(a)

The following formula is quite hard to prove using the sequent calculus, but using equivalences it is simple:

∃z (P(z) → P(a) ∧ P(b)) ≃ ∀z P(z) → P(a) ∧ P(b)
                        ≃ ∀z P(z) ∧ P(a) ∧ P(b) → P(a) ∧ P(b)
                        ≃ t

If you are asked to prove a formula, but no particular formal system (such as the sequent calculus) has been specified, then you may use any convincing argument. Using equivalences as above can shorten the proof considerably. Also, take advantage of symmetries; in proving A ∧ B ≃ B ∧ A, it obviously suffices to prove A ∧ B ⊨ B ∧ A.
Example 9 This proof concerns part of the law for pulling universal quantifiers out of conjunctions. Rule (∀l) just discards the quantifier, since it instantiates the bound variable x with the free variable x.

P(x), Q ⇒ P(x)
─────────────────────── (∧l)
P(x) ∧ Q ⇒ P(x)
─────────────────────── (∀l)
∀x (P(x) ∧ Q) ⇒ P(x)
─────────────────────── (∀r)
∀x (P(x) ∧ Q) ⇒ ∀x P(x)

Example 10 The sequent ∀x (A → B) ⇒ A → ∀x B is valid provided x is not free in A. That condition is required for the application of (∀r) below:

A ⇒ A, B    A, B ⇒ B
─────────────────────── (→l)
A, A → B ⇒ B
─────────────────────── (∀l)
A, ∀x (A → B) ⇒ B
─────────────────────── (∀r)
A, ∀x (A → B) ⇒ ∀x B
─────────────────────── (→r)
∀x (A → B) ⇒ A → ∀x B

What if the condition fails to hold? Let A and B both be the formula x = 0. Then ∀x (x = 0 → x = 0) is valid, but x = 0 → ∀x (x = 0) is not valid (it fails under any valuation that sets x to 0).

Note. The proof on the slides of

∀x (P → Q(x)) ⇒ P → ∀y Q(y)

is essentially the same as the proof above, but uses different variable names so that you can see how a quantified formula like ∀x (P → Q(x)) is instantiated to produce P → Q(y). The proof given above is also correct: the variable names are identical, the instantiation is trivial and ∀x (A → B) simply produces A → B. In our example, B may be any formula possibly containing the variable x; the proof on the slides uses the specific formula Q(x).

Exercise 12 Verify the following equivalences by appealing to the truth definition for first-order logic:

¬(∃x P(x)) ≃ ∀x ¬P(x)
(∀x P(x)) ∧ R ≃ ∀x (P(x) ∧ R)
(∃x P(x)) ∨ (∃x Q(x)) ≃ ∃x (P(x) ∨ Q(x))

Exercise 13 Explain why the following are not equivalences. Are they implications? In which direction?

(∀x A) ∨ (∀x B) ≄ ∀x (A ∨ B)
(∃x A) ∧ (∃x B) ≄ ∃x (A ∧ B)

Exercise 14 Prove ¬∀y [(Q(a) ∨ Q(b)) ∧ ¬Q(y)] using equivalences, and then formally using the sequent calculus.

Exercise 15 Prove the following sequents. Note that the last one requires two uses of the (∀l) rule!

(∀x P(x)) ∧ (∀x Q(x)) ⇒ ∀y (P(y) ∧ Q(y))
∀x (P(x) ∧ Q(x)) ⇒ (∀y P(y)) ∧ (∀y Q(y))
∀x [P(x) → P(f(x))], P(a) ⇒ P(f(f(x)))

Exercise 16 Prove ∀x [P(x) ∨ P(a)] ≃ P(a).

5.5 Sequent rules for existential quantifiers

Here are the sequent rules for ∃:

A, Γ ⇒ Δ
─────────── (∃l)
∃x A, Γ ⇒ Δ

Γ ⇒ Δ, A[t/x]
─────────── (∃r)
Γ ⇒ Δ, ∃x A

Rule (∃l) holds provided x is not free in the conclusion, that is, not free in the formulas of Γ or Δ. These rules are strictly dual to the ∀-rules; any example involving ∀ can easily be transformed into one involving ∃ and having a proof of precisely the same form. For example, the sequent ∀x P(x) ⇒ ∀y P(f(y)) can be transformed into ∃y P(f(y)) ⇒ ∃x P(x).

If you have a choice, apply rules that have provisos, namely (∃l) and (∀r), before applying the other quantifier rules as you work upwards. The other rules introduce terms and therefore new variables to the sequent, which could prevent you from applying (∃l) and (∀r) later.

Example 11 Figure 2 presents half of the ∃ distributive law. Rule (∃r) just discards the quantifier, instantiating the bound variable x with the free variable x. In the general case, it can instantiate the bound variable with any term. The restriction on the sequent rules, namely "x is not free in the conclusion", can be confusing when you are building a sequent proof working backwards. One simple way to avoid problems is always to rename a quantified variable if the same variable appears free in the sequent. For example, when you see the sequent P(x), ∃x Q(x) ⇒ Δ, replace it immediately by P(x), ∃y Q(y) ⇒ Δ.

Example 12 The sequent

∃x P(x) ∧ ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))

is not valid: the value of x that makes P(x) true could differ from the value of x that makes Q(x) true. This comes out clearly in the proof attempt, where we are not allowed to apply (∃l) twice with the same variable name, x. As soon as we are forced to rename the second variable to y, it becomes obvious that the two values could differ. Turning to the right side of the sequent, no application of (∃r) can lead to a proof. We have nothing to instantiate x with:

P(x), Q(y) ⇒ P(x) ∧ Q(x)
──────────────────────────────────── (∃r)
P(x), Q(y) ⇒ ∃x (P(x) ∧ Q(x))
──────────────────────────────────── (∃l)
P(x), ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))
──────────────────────────────────── (∃l)
∃x P(x), ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))
──────────────────────────────────── (∧l)
∃x P(x) ∧ ∃x Q(x) ⇒ ∃x (P(x) ∧ Q(x))

Exercise 17 Prove the following using the sequent calculus. The last one is difficult and requires two uses of (∃r).

P(a) ∨ ∃x P(f(x)) ⇒ ∃y P(y)
∃x (P(x) ∨ Q(x)) ⇒ (∃y P(y)) ∨ (∃y Q(y))
⇒ ∃z (P(z) → P(a) ∧ P(b))

6 Clause Methods for Propositional Logic

This section discusses two proof methods in the context of propositional logic.
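Both clause methods operate on conjunctive normal form. As an illustrative sketch (the tuple encoding is an assumption, and ↔ is omitted for brevity), the standard conversion eliminates →, pushes negations in, and distributes ∨ over ∧, with literals represented as (name, polarity) pairs:

```python
def elim_imp(A):
    """Step 1: eliminate → using A → B = ¬A ∨ B."""
    op = A[0]
    if op == "imp":
        return ("or", ("not", elim_imp(A[1])), elim_imp(A[2]))
    if op in ("and", "or"):
        return (op, elim_imp(A[1]), elim_imp(A[2]))
    if op == "not":
        return ("not", elim_imp(A[1]))
    return A                                   # ("var", name)

def nnf(A):
    """Step 2: push negations in (de Morgan), cancelling double negations."""
    if A[0] == "not":
        B = A[1]
        if B[0] == "not":
            return nnf(B[1])
        if B[0] == "and":
            return ("or", nnf(("not", B[1])), nnf(("not", B[2])))
        if B[0] == "or":
            return ("and", nnf(("not", B[1])), nnf(("not", B[2])))
        return A                               # negated variable
    if A[0] in ("and", "or"):
        return (A[0], nnf(A[1]), nnf(A[2]))
    return A

def clauses(A):
    """Step 3: distribute ∨ over ∧, returning a set of clauses (frozensets)."""
    if A[0] == "and":
        return clauses(A[1]) | clauses(A[2])
    if A[0] == "or":
        return {c | d for c in clauses(A[1]) for d in clauses(A[2])}
    if A[0] == "not":
        return {frozenset([(A[1][1], False)])}
    return {frozenset([(A[1], True)])}

# Negating P ∨ Q → Q ∨ R gives the clauses {P, Q}, {¬Q}, {¬R}.
P, Q, R = ("var", "P"), ("var", "Q"), ("var", "R")
A = ("not", ("imp", ("or", P, Q), ("or", Q, R)))
cs = clauses(nnf(elim_imp(A)))
print(cs)
```

The output matches the three clauses used in Example 13 below.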
K1, · · · , Km → L1, · · · , Ln

and when n = 1 we have the familiar Prolog clauses, also known as definite or Horn clauses.

Example 13 DPLL can show that a formula is not a theorem. Consider the formula P ∨ Q → Q ∨ R. After negating this and converting to CNF, we obtain the three clauses
{P, Q}, {¬Q} and {¬R}. The DPLL method terminates rapidly:

{P, Q} {¬Q} {¬R}    initial clauses
{P} {¬R}            unit ¬Q
{¬R}                unit P (also pure)
                    unit ¬R (also pure)

All clauses have been deleted, so execution terminates. The clauses are satisfiable by P ↦ 1, Q ↦ 0, R ↦ 0. This interpretation falsifies P ∨ Q → Q ∨ R.

Example 14 Here is an example of a case split. Consider the clause set

{¬Q, R} {¬R, P} {¬R, Q} {¬P, Q, R} {P, Q} {¬P, ¬Q}.

There are no unit clauses or pure literals, so we arbitrarily select P for case splitting:

{¬Q, R} {¬R, Q} {Q, R} {¬Q}    if P is true
{¬R} {R}                       unit ¬Q
{}                             unit R

{¬Q, R} {¬R} {¬R, Q} {Q}       if P is false
{¬Q} {Q}                       unit ¬R
{}                             unit ¬Q

The empty clause is written {} above to make the pattern of execution clearer; traditionally, however, the empty clause is written □. When we encounter a contradiction, we abandon the current case and consider any remaining cases. If all cases are contradictory, then the original set of clauses is inconsistent. If they arose from some negated formula ¬A, then A is a theorem.

You might find it instructive to download MiniSat, which is a very concise open-source SAT solver. It is coded in C++. These days, SAT solvers are largely superseded by SMT solvers, which also handle arithmetic, arrays, bit vectors, etc.

6.3 Introduction to resolution

Resolution combines two clauses containing complementary literals. It is essentially the following rule of inference:

B ∨ A    ¬B ∨ C
───────────────
A ∨ C

A special case of resolution is when A and C are empty:

B    ¬B
───────
f

This detects contradictions.

Resolution works with disjunctions. The aim is to prove a contradiction, refuting a formula. Here is the method for proving a formula A:

1. Translate ¬A into CNF as A1 ∧ · · · ∧ Am.
2. Break this into a set of clauses: A1, . . ., Am.
3. Repeatedly apply the resolution rule to the clauses, producing new clauses. These are all consequences of ¬A.
4. If a contradiction is reached, we have refuted ¬A.

In set notation the resolution rule is

{B, A1, . . . , Am}    {¬B, C1, . . . , Cn}
──────────────────────────────────────────
{A1, . . . , Am, C1, . . . , Cn}

Resolution takes two clauses and creates a new one. A collection of clauses is maintained; the two clauses are chosen from the collection according to some strategy, and the new clause is added to it. If m = 0 or n = 0 then the new clause will be smaller than one of the parent clauses; if m = n = 0 then the new clause will be empty. If the empty clause is generated, resolution terminates successfully: we have found a contradiction!

6.4 Examples of ground resolution

Let us try to prove

P ∧ Q → Q ∧ P

Convert its negation to CNF:

¬(P ∧ Q → Q ∧ P)

We can combine steps 1 (eliminate →) and 2 (push negations in) using the law ¬(A → B) ≃ A ∧ ¬B:

(P ∧ Q) ∧ ¬(Q ∧ P)
(P ∧ Q) ∧ (¬Q ∨ ¬P)

{P, Q} {¬P, Q}    {P, ¬Q} {¬P, ¬Q}
      {Q}               {¬Q}
              □

Note that the tree contains {Q} and {¬Q} rather than {Q, Q} and {¬Q, ¬Q}. If we forget to suppress repeated literals, we can get stuck. Resolving {Q, Q} and {¬Q, ¬Q} (keeping repetitions) gives {Q, ¬Q}, a tautology. Tautologies are useless. Resolving this one with the other clauses leads nowhere. Try it.

These examples could mislead. Must a proof use each clause exactly once? No! A clause may be used repeatedly, and many problems contain redundant clauses. Here is an example:

{¬P, R} {P}    {¬Q, R} (unused)
  {R} {¬R}
    □

Redundant clauses can make the theorem-prover flounder; this is a challenge facing the field.

6.5 A proof using a set of assumptions

In this example we assume

H → M ∨ N    M → K ∧ P    N → L ∧ P

and prove H → P. It turns out that we can generate clauses separately from the assumptions (taken positively) and the conclusion (negated). If we call the assumptions A1, . . ., Ak and the conclusion B, then the desired theorem is

(A1 ∧ · · · ∧ Ak) → B

Try negating this and converting to CNF. Using the law ¬(A → B) ≃ A ∧ ¬B, the negation converts in one step to

A1 ∧ · · · ∧ Ak ∧ ¬B

Since the entire formula is a conjunction, we can separately convert A1, . . ., Ak, and ¬B to clause form and pool the clauses together. A tree for the resolution proof is

{H} {¬H, M, N}
  {M, N} {¬M, P}
    {N, P} {¬N, P}
      {P} {¬P}
        □

The clauses were not tried at random. Here are some points of proof strategy.

Ignoring irrelevance. Clauses {¬M, K} and {¬N, L} lead nowhere, so they were not tried. Resolving with one of these would make a clause containing K or L. There is no way of getting rid of either literal, for no clause contains ¬K or ¬L. So this is not a way to obtain the empty clause. (K and L are pure literals.)

Working from the goal. In each resolution step, at least one clause involves the negated conclusion (possibly via earlier resolution steps). We do not attempt to find a contradiction in the assumptions alone, provided (as is often the case) we know them to be consistent: any contradiction must involve the negated conclusion. This strategy is called set of support. Although largely obsolete, it's very useful when working problems by hand.

Linear resolution. The proof has a linear structure: each resolvent becomes the parent clause for the next resolution step. Furthermore, the other parent clause is always one of the original set of clauses. This simple structure is very efficient because only the last resolvent needs to be saved. It is similar to the execution strategy of Prolog.

{P, Q} {¬P, Q} {P, ¬Q} {¬P, ¬Q}.

Exercise 19 Use resolution to prove

(A → B ∨ C) → [(A → B) ∨ (A → C)].

Exercise 20 Explain in more detail the conversion into clauses for the example of §6.5.
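The saturation procedure of §6.3, on ground clauses, fits in a few lines. This is an illustrative sketch, not the notes' code: no subsumption, no set-of-support strategy, just blind saturation until the empty clause appears or nothing new can be derived. It refutes the §6.5 clause set:

```python
from itertools import combinations

def negate(lit):
    return (lit[0], not lit[1])

def resolve(c, d):
    """All resolvents of clauses c, d (frozensets of literals like ("P", True))."""
    out = set()
    for lit in c:
        if negate(lit) in d:
            out.add((c - {lit}) | (d - {negate(lit)}))
    return out

def refute(clause_set):
    """Saturate under resolution; True iff the empty clause is derived."""
    cls = set(clause_set)
    while True:
        new = set()
        for c, d in combinations(cls, 2):
            for r in resolve(c, d):
                if not r:
                    return True            # empty clause: contradiction found
                if r not in cls:
                    new.add(r)
        if not new:
            return False                   # saturated without contradiction
        cls |= new

# §6.5: assumptions H → M∨N, M → K∧P, N → L∧P, plus negated conclusion of H → P.
cs = [frozenset(s) for s in (
    {("H", False), ("M", True), ("N", True)},
    {("M", False), ("K", True)}, {("M", False), ("P", True)},
    {("N", False), ("L", True)}, {("N", False), ("P", True)},
    {("H", True)}, {("P", False)})]
print(refute(cs))   # True: H → P follows from the assumptions
```

Because resolvents are built as sets, repeated literals are suppressed automatically, exactly the point made about {Q, Q} and {¬Q, ¬Q} in §6.4.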
7 Skolem Functions, Herbrand's Theorem and Unification
Exercise 21 Prove Peirce's law, ((P → Q) → P) → P, using resolution.

Exercise 22 Use resolution (showing the steps of converting the formula into clauses) to prove these two formulas:

(Q → R) ∧ (R → P ∧ Q) ∧ (P → Q ∨ R) → (P ↔ Q)
(P ∧ Q → R) ∧ (P ∨ Q ∨ R) → ((P ↔ Q) → R)

Exercise 23 Prove that (P ∧ Q) → (R ∧ S) follows from P → R and R ∧ P → S using linear resolution.

Exercise 24 Convert these axioms to clauses, showing all steps. Then prove Winterstorm → Miserable by resolution:

Rain ∧ (Windy ∨ ¬Umbrella) → Wet
Winterstorm → Storm ∧ Cold
Wet ∧ Cold → Miserable
Storm → Rain ∧ Windy

7.1 Removing quantifiers: Skolem form

Skolemisation replaces every existentially bound variable by a Skolem constant or function. This transformation does not preserve the meaning of a formula. However, it does preserve inconsistency, which is the critical property: resolution works by detecting contradictions.

Take a formula in negation normal form. Starting from the outside, follow the nesting structure of the quantifiers. If the formula contains an existential quantifier, then the series of quantifiers must have the form

∀x1 · · · ∀x2 · · · ∀xk · · · ∃y A

where A is a formula, k ≥ 0, and ∃y is the leftmost existential quantifier. Choose a k-place function symbol f not present in A (that is, a new function symbol). Delete the ∃y and replace all other occurrences of y by f(x1, x2, . . . , xk). The result is another formula:

∀x1 · · · ∀x2 · · · ∀xk · · · A[f(x1, x2, . . . , xk)/y]

If some existential quantifier is not enclosed in any universal quantifiers, then the formula contains simply ∃y A as a subformula. Then this quantifier is deleted and occurrences of y are replaced by a new constant symbol c. The resulting subformula is A[c/y].

Then repeatedly eliminate all remaining existential quantifiers as above. The new symbols are called Skolem functions (or Skolem constants).

After Skolemization, the formula has only universal quantifiers. The next step is to throw the remaining quantifiers away. This step is correct because we are converting to clause form, and a clause implicitly includes universal quantifiers over all of its free variables.

We are almost back to the propositional case, except the formula typically contains terms. We shall have to handle constants, function symbols, and variables.

Prenex normal form, where all quantifiers are moved to the front of the formula, would make things easier to follow. However, increasing the scope of the quantifiers prior to Skolemization makes proofs much more difficult. Pushing quantifiers in as far as possible, instead of pulling them out, yields a better set of clauses.

Examples of Skolemization

For simplicity, we start with prenex normal form.

Example 15 Start with

∃x ∀y ∃z R(x, y, z)

Eliminate the ∃x using the Skolem constant a, then the ∃z using the 1-place Skolem function f:

∀y R(a, y, f(y))

Finally, drop the ∀y and convert the remaining formula to a clause:

{R(a, y, f(y))}

Example 16 Start with

∃u ∀v ∃w ∃x ∀y ∃z ((P(h(u, v)) ∨ Q(w)) ∧ R(x, h(y, z)))

Eliminate the ∃u using the Skolem constant c:

∀v ∃w ∃x ∀y ∃z ((P(h(c, v)) ∨ Q(w)) ∧ R(x, h(y, z)))

Eliminate the ∃w using the 1-place Skolem function f:

∀v ∃x ∀y ∃z ((P(h(c, v)) ∨ Q(f(v))) ∧ R(x, h(y, z)))

Eliminate the ∃x using the 1-place Skolem function g:

∀v ∀y ∃z ((P(h(c, v)) ∨ Q(f(v))) ∧ R(g(v), h(y, z)))

Eliminate the ∃z using the 2-place Skolem function j (the function h is already used!):

∀v ∀y ((P(h(c, v)) ∨ Q(f(v))) ∧ R(g(v), h(y, j(v, y))))

Dropping the universal quantifiers yields a set of clauses:

{P(h(c, v)), Q(f(v))}    {R(g(v), h(y, j(v, y)))}

Each clause is implicitly enclosed by universal quantifiers over each of its variables. So the occurrences of the variable v in the two clauses are independent of each other.
Let’s try this example again, first pushing quantifiers in Thus, H consists of all the terms that can be written using
to the smallest possible scopes: only the constants and function symbols present in S. There
are no variables: the elements of H are ground terms. For-
∃u ∀v P(h(u, v)) ∨ ∃w Q(w) ∧ ∃x ∀y ∃z R(x, h(y, z)) mally, H turns out to satisfy the recursive equation
Theorem 15 (Herbrand) A set S of clauses is unsatisfi- Example 19 The substitution θ = [h(y)/x, b/y] says to
able if and only if there is a finite unsatisfiable set S 0 of replace x by h(y) and y by b. The replacements occur si-
ground instances of clauses of S. multaneously; it does not have the effect of replacing x by
h(b). Applying this substitution gives
The theorem is valuable because the new set S 0 expresses
f (x, g(u), y)θ = f (h(y), g(u), b)
the inconsistency in a finite way. However, it only tells us
0 0
that S exists; it does not tell us how to derive S . So how R(h(x), z)θ = R(h(h(y)), z)
do we generate useful ground instances of clauses? One {P(x), ¬Q(y)}θ = {P(h(y)), ¬Q(b)}
answer, outlined below, is unification.
Definition 17 If φ and θ are substitutions then so is their
composition φ ◦ θ , which satisfies
Example 18 To demonstrate the Skolem-Gödel-Herbrand
t (φ ◦ θ ) = (tφ)θ for all terms t
theorem, consider proving the formula
This set is inconsistent. Here is a finite set of ground in- 7.5 Unifiers
stances of clauses in S:
Definition 18 A substitution θ is a unifier of terms t1 and
{P(a)} {P(b)} {¬P(a), Q(a)} t2 if t1 θ = t2 θ . More generally, θ is a unifier of terms t1 ,
t2 , . . ., tm if t1 θ = t2 θ = · · · = tm θ. The term t1 θ is the
{¬P(b), Q(b)} {¬Q(a), ¬Q(b)}.
common instance.
This set reflects the intuitive proof of the theorem. We obvi- Two terms can only be unified if they have similar struc-
ously have P(a) and P(b); using ∀y [P(y) → Q(y)] with ture apart from variables. The terms f (x) and h(y, z) are
a and b, we obtain Q(a) and Q(b). If we can automate this clearly non-unifiable since no substitution can do anything
procedure, then we can generate such proofs automatically. about the differing function symbols. It is easy to see that
θ unifies f (t1 , . . . , tn ) and f (u 1 , . . . , u n ) if and only if θ
unifies ti and u i for all i = 1, . . . , n.
7.4 Unification
Unification is the operation of finding a common instance Example 21 The substitution [3/x, g(3)/y] unifies the
of two or more terms. Consider a few examples. The terms terms g(g(x)) and g(y). The common instance is g(g(3)).
f (x, b) and f (a, y) have the common instance f (a, b), re- These terms have many other unifiers, such as these:
placing x by a and y by b. The terms f (x, x) and f (a, b) unifying substitution common instance
have no common instance, assuming that a and b are dis- [ f (u)/x, g( f (u))/y] g(g( f (u)))
tinct constants. The terms f (x, x) and f (y, g(y)) have no [z/x, g(z)/y] g(g(z))
common instance, since there is no way that x can have the [g(x)/y] g(g(x))
form y and g(y) at the same time — unless we admit the
infinite term g(g(g(· · · ))). Note that g(g(3)) and g(g( f (u))) are both instances of
Only variables may be replaced by other terms. Con- g(g(x)). Thus g(g(x)) is more general than g(g(3)) and
stants are not affected (they remain constant!). Instances of g(g( f (u))). Certainly g(g(3)) seems to be arbitrary —
the term f (t, u) must have the form f (t 0 , u 0 ), where t 0 is an neither of the original terms mentions 3! Also important:
instance of t and u 0 is an instance of u. g(g(x)) is as general as g(g(z)), despite the different vari-
able names. Let us formalize these intuitions.
Definition 16 A substitution is a finite set of replacements Definition 19 The substitution θ is more general than φ if
φ = θ ◦ σ for some substitution σ .
[t1 /x1 , . . . , tk /xk ]
Example 22 Recall the unifiers of g(g(x)) and g(y). The
where x1 , . . ., xk are distinct variables such that ti 6 = xi for unifier [g(x)/y] is more general than the others listed, for
all i = 1, . . . , k. We use Greek letters φ, θ , σ to stand for
substitutions. [3/x, g(3)/y] = [g(x)/y] ◦ [3/x]
[ f (u)/x, g( f (u))/y] = [g(x)/y] ◦ [ f (u)/x]
A substitution θ = [t1 /x1 , . . . , tk /xk ] defines a func- [z/x, g(z)/y] = [g(x)/y] ◦ [z/x]
tion from the variables {x1 , . . . , xk } to terms. Postfix no-
[g(x)/y] = [g(x)/y] ◦ []
tation is usual for applying a substitution; thus, for exam-
ple, xi θ = ti . Substitution on terms, literals and clauses is The last line above illustrates that every substitution θ is
defined recursively in the obvious way: more general than itself because θ = θ ◦ [].
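Unification with the occurs check can be sketched as follows. This is an illustrative implementation, not the notes' algorithm: substitutions are dicts, variables are plain strings, and constants are encoded as 1-tuples such as ("a",) to distinguish them from variables (all of these encodings are assumptions).

```python
def resolve(t, theta):
    """Follow variable bindings in theta."""
    while isinstance(t, str) and t in theta:
        t = theta[t]
    return t

def occurs(v, t, theta):
    """The occurs check: does variable v occur in t under theta?"""
    t = resolve(t, theta)
    if t == v:
        return True
    return not isinstance(t, str) and any(occurs(v, a, theta) for a in t[1:])

def unify(t, u, theta=None):
    """Return a most general unifier extending theta, or None."""
    if theta is None:
        theta = {}
    t, u = resolve(t, theta), resolve(u, theta)
    if t == u:
        return theta
    if isinstance(t, str):
        return None if occurs(t, u, theta) else {**theta, t: u}
    if isinstance(u, str):
        return None if occurs(u, t, theta) else {**theta, u: t}
    if t[0] != u[0] or len(t) != len(u):
        return None                      # different function symbols
    for a, b in zip(t[1:], u[1:]):
        theta = unify(a, b, theta)
        if theta is None:
            return None
    return theta

# f(x, b) and f(a, y) unify, giving the common instance f(a, b).
print(unify(("f", "x", ("b",)), ("f", ("a",), "y")))   # {'x': ('a',), 'y': ('b',)}
# f(x, x) and f(y, g(y)) fail the occurs check: no unifier.
print(unify(("f", "x", "x"), ("f", "y", ("g", "y"))))  # None
```

On g(g(x)) and g(y) this returns the most general unifier [g(x)/y] of Example 21, binding y to ("g", "x") rather than to any particular instance.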
In the diagrams, the lines indicate variable replacements:

j(w, a, h(w))    j(f(x, y), x, y)
    a/x
    f(a, y)/w
    h(f(a, y))/y ???

Implementation remarks

To unify terms t1, t2, . . ., tm for m > 2, compute a unifier θ of t1 and t2, then recursively compute a unifier σ of the terms t2θ, . . ., tmθ. The overall unifier is then θ ◦ σ. If any unification fails then the set is not unifiable.

A real implementation does not need to compose substitutions. Most represent variables by pointers and effect the substitution [t/x] by updating pointer x to t. The compositions are cumulative, so this works. However, if unification fails at some point, the pointer assignments must be undone! The algorithm sketched here can take exponential time in unusual cases. Faster algorithms exist, but they are more complex and seldom adopted.

Prolog systems, for the sake of efficiency, omit the occurs check. This can result in circular data structures and looping. It is unsound for theorem proving.

7.7 Examples of theorem proving

These two examples are fundamental. They illustrate how the occurs check enforces correct quantifier reasoning.

Example 30 Consider a proof of

(∃y ∀x R(x, y)) → (∀x ∃y R(x, y)).

For simplicity, produce clauses separately for the antecedent and for the negation of the consequent.

• The antecedent is ∃y ∀x R(x, y); replacing y by the Skolem constant a yields the clause {R(x, a)}.

• In ¬(∀x ∃y R(x, y)), pushing in the negation produces ∃x ∀y ¬R(x, y). Replacing x by the Skolem constant b yields the clause {¬R(b, y)}.

Unifying R(x, a) with R(b, y) detects the contradiction R(b, a) ∧ ¬R(b, a).

Now consider attempting to prove the converse, (∀x ∃y R(x, y)) → (∃y ∀x R(x, y)).

• The antecedent is ∀x ∃y R(x, y); replacing y by the Skolem function f yields the clause {R(x, f(x))}.

• The negation of the consequent is ¬(∃y ∀x R(x, y)), which becomes ∀y ∃x ¬R(x, y). Replacing x by the Skolem function g yields the clause {¬R(g(y), y)}.

Observe that R(x, f(x)) and R(g(y), y) are not unifiable because of the occurs check. And so it should be, because the original formula is not a theorem! The best way to demonstrate that a formula is not a theorem is to exhibit a counterexample. Here are two:

• The domain is the set of all people who have ever lived. The relation R(x, y) holds if x loves y. The function f(x) denotes the mother of x, and {R(x, f(x))} holds because everybody loves their mother. The function g(x) denotes the landlord of x, and {¬R(g(y), y)} holds.

• The domain is the set of integers. The relation R(x, y) holds whenever x = y. The function f is defined by f(x) = x and {R(x, f(x))} holds. The function g is defined by g(x) = x + 1 and so {¬R(g(y), y)} holds.

Exercise 25 Consider a first-order language with 0 and 1 as constant symbols, with − as a 1-place function symbol and + as a 2-place function symbol, and with < as a 2-place predicate symbol.

(a) Describe the Herbrand Universe for this language.

(b) The language can be interpreted by taking the integers for the universe and giving 0, 1, −, +, and < their usual meanings over the integers. What do those symbols denote in the corresponding Herbrand model?

Exercise 26 For each of the following pairs of terms, give a most general unifier or explain why none exists. Do not rename variables prior to performing the unification.

f(g(x), z)       f(y, h(y))
j(x, y, z)       j(f(y, y), f(z, z), f(a, a))
j(x, z, x)       j(y, f(y), z)
j(f(x), y, a)    j(y, z, z)
j(g(x), a, y)    j(z, x, f(z, z))

8 First-Order Resolution and Prolog

By means of unification, we can extend resolution to first-order logic. As a special case we obtain Prolog. Other theorem provers are also based on unification. Other applications include polymorphic type checking.

A number of resolution theorem provers can be downloaded for free from the Internet. Some of the main ones include Vampire (http://www.vprover.org), SPASS (http://www.spass-prover.org) and E (http://www.eprover.org). It might be instructive to download one of them and experiment with it.

As before, the first clause contains B and other literals, while the second clause contains ¬D and other literals. The substitution σ is a unifier of B and D (almost always a most
general unifier). This substitution is applied to all remaining literals, producing the conclusion.

The variables in one clause are renamed before resolution to prevent clashes with the variables in the other clause. Renaming is sound because the scope of each variable is its clause. Resolution is sound because it takes an instance of each clause (the instances are valid, because the clauses are universally valid) and then applies the propositional resolution rule, which is sound. For example, the two clauses

{P(x)} and {¬P(g(x))}

yield the empty clause in a single resolution step. This works by renaming variables, say x to y in the second clause, and unifying P(x) with P(g(y)). Forgetting to rename variables is fatal, because P(x) cannot be unified with P(g(x)).

Factoring is necessary here. Applying the factoring rule to each of these clauses yields two additional clauses:

{¬P(a, a)}    {P(a, a)}

These are complementary unit clauses, so resolution yields the empty clause. This proof is trivial!

As a general rule, if there are no unit clauses to begin with, then factoring will be necessary. Otherwise, resolution steps will continue to yield clauses that have at least two literals. The only exception is when there are repeated literals, as in the following example.

Example 33 Let us prove ∃x [P → Q(x)] ∧ ∃x [Q(x) → P] → ∃x [P ↔ Q(x)]. The clauses are

{P, ¬Q(b)} {P, Q(x)} {¬P, ¬Q(x)} {¬P, Q(a)}
A definite or program clause is one of the form {B, ¬A1, . . . , ¬Am}, containing exactly one positive literal. It is logically equivalent to (A1 ∧ · · · ∧ Am) → B. Prolog's notation is

B ← A1, . . . , Am.

If m = 0 then the clause is simply written as B and is sometimes called a fact.

A negative or goal clause is one of the form

{¬A1, . . . , ¬Am}

Prolog permits just one of these; it represents the list of unsolved goals. Prolog's notation is

← A1, . . . , Am.

A Prolog database consists of definite clauses. Observe that definite clauses cannot express negative assertions, since they must contain a positive literal. From a mathematical point of view, they have little expressive power; every set of definite clauses is consistent! Even so, definite clauses are a natural notation for many problems.

8.5 Prolog computations

A Prolog computation takes a database of definite clauses together with one goal clause. It repeatedly resolves the goal clause with some definite clause to produce a new goal clause. If resolution produces the empty goal clause, then execution succeeds.

Here is a diagram of a Prolog computation step:

definite clause:    B ← A1, . . . , An
goal clause:        ← B1, . . . , Bm
                    (unify B with B1)
new goal clause:    ← A1, . . . , An, B2, . . . , Bm

This is a linear resolution (§6). Two program clauses are never resolved with each other. The result of each resolution step becomes the next goal clause; the previous goal clause is discarded after use.

Prolog resolution is efficient, compared with general resolution, because it involves less search and storage. General resolution must consider all possible pairs of clauses; it adds their resolvents to the existing set of clauses; it spends a great deal of effort getting rid of subsumed (redundant) clauses and probably useless clauses. Prolog always resolves some program clause with the goal clause. Because goal clauses do not accumulate, Prolog requires little storage. Prolog never uses factoring and does not even remove repeated literals from a clause.

Given the clauses P ← P and P ← with the goal ← P, resolving the goal against P ← P produces a new goal clause, which happens to be identical to the original one. Prolog never notices the repeated goal clause, so it repeats the same useless resolution over and over again. Depth-first search means that at every choice point, such as between using P ← P and P ←, Prolog will explore every avenue arising from its first choice before considering the second choice. Obviously, the second choice would prove the goal trivially, but Prolog never notices this.

8.6 Example of Prolog execution

Here are axioms about the English succession: how y can become King after x.

∀x ∀y (oldestson(y, x) ∧ king(x) → king(y))
∀x ∀y (defeat(y, x) ∧ king(x) → king(y))
king(richardIII)
defeat(henryVII, richardIII)
oldestson(henryVIII, henryVII)

The goal is to prove king(henryVIII). Now here is the same problem in the form of definite clauses:

{¬oldestson(y, x), ¬king(x), king(y)}
{¬defeat(y, x), ¬king(x), king(y)}
{king(richardIII)}
{defeat(henryVII, richardIII)}
{oldestson(henryVIII, henryVII)}

The goal clause is

{¬king(henryVIII)}.

Figure 3 shows the execution. The subscripts in the clauses are to rename the variables.

Note how crude this formalization is. It says nothing about the passage of time, about births and deaths, about not having two kings at once. The oldest son of Henry VII, Arthur, died aged 15, leaving the future Henry VIII as the oldest surviving son. All formal models must omit some real-world details: reality is overwhelmingly complex.

The Frame Problem in Artificial Intelligence reveals another limitation of logic. Consider writing an axiom system to describe a robot's possible actions. We might include an axiom to state that if the robot lifts an object at time t, then it will be holding the object at time t + 1. But we also need to assert that the positions of everything else remain the same as before. Then we must consider the possibility that the object is a table and has other things on top of it. Separation Logic, a variant of Hoare logic, was invented to solve the frame problem, especially for reasoning about
Prolog has a fixed, deterministic execution strategy. The linked data structures.
program is regarded as a list of clauses, not a set; the clauses Prolog is a powerful and useful programming language,
are tried strictly in order. With a clause, the literals are also but it is seldom logic. Most Prolog programs rely on special
regarded as a list. The literals in the goal clause are proved predicates that affect execution but have no logical mean-
strictly from left to right. The goal clause’s first literal is ing. There is a huge gap between the theory and practice of
replaced by the literals from the unifying program clause, logic programming.
preserving their order. Exercise 27 Convert the following formula into clauses,
Prolog’s search strategy is depth-first. To illustrate what showing your working. Then present two resolution proofs
this means, suppose that the goal clause is simply ← P different from the one shown in Example 33 above.
and that the program clauses are P ← P and P ← . Pro-
log will resolve P ← P with ← P to obtain a new goal ∃x [P → Q(x)] ∧ ∃x [Q(x) → P] → ∃x [P ↔ Q(x)]
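The execution strategy of §8.5 can be sketched as a tiny interpreter. This is our own illustration, not part of the notes: the succession rules have been instantiated by hand to ground atoms, so unification degenerates to syntactic equality; clauses are tried strictly in order and goals are proved left to right, depth-first.

```python
# Toy SLD-resolution interpreter for the succession example of S8.6.
# Clauses are (head, body) pairs over ground atoms, so "unification"
# is just string equality (an assumption made to keep the sketch short).
program = [
    ("king(henryVIII)", ["oldestson(henryVIII,henryVII)", "king(henryVII)"]),
    ("king(henryVII)", ["defeat(henryVII,richardIII)", "king(richardIII)"]),
    ("king(richardIII)", []),
    ("defeat(henryVII,richardIII)", []),
    ("oldestson(henryVIII,henryVII)", []),
]

def solve(goals):
    """Prove the list of goals left to right, trying clauses in order."""
    if not goals:                        # empty goal clause: success
        return True
    first, rest = goals[0], goals[1:]
    for head, body in program:           # depth-first over clause choices
        if head == first and solve(body + rest):
            return True
    return False

print(solve(["king(henryVIII)"]))        # True
```

A clause such as P ← P placed before P ← would send this interpreter into an infinite recursion, exactly the depth-first pathology described above.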
9 Decision Procedures and SMT Solvers

[Figure 3 (fragment): the Prolog execution resolves the goal against {¬defeat(y2, x2), ¬k(x2), k(y2)}, yielding the new goal clause {¬k(henryVII)}.]

Exercise 32 Find a refutation from the following set of clauses using resolution and factoring.

∀x (P ∨ Q(x)) → (P ∨ ∀x Q(x))

∃x y (R(x, y) → ∀z w R(z, w))
A number of decidable subcases of first-order logic were identified in the first half of the 20th century. One of the more interesting decision problems is Presburger arithmetic: the first-order theory of the natural numbers with addition (and subtraction, and multiplication by constants). There is an algorithm to determine whether a given sentence in Presburger arithmetic is valid or not.

Real numbers behave differently from the natural numbers (where m < n ⇐⇒ m + 1 ≤ n) and require their own algorithms. Once again, the only allowed operations are addition, subtraction and constant multiplication. Such a language is called linear arithmetic. The validity of linear arithmetic formulas over the reals is also decidable.

Even unrestricted arithmetic (with multiplication) is decidable for the reals. Unfortunately, the algorithms are too complicated and expensive for widespread use. Even Euclidean geometry can be reduced to problems on the reals, and is therefore decidable. Practical decision procedures exist for simple data structures such as arrays and lists.

9.2 Fourier-Motzkin variable elimination

Fourier-Motzkin variable elimination is a classic decision procedure for real (or rational) linear arithmetic. It dates from 1826 and is very inefficient in general, but relatively easy to understand. In the general case, it deals with conjunctions of linear constraints over the reals or rationals:

    ai1 x1 + · · · + ain xn ≤ bi    (for i = 1, . . . , m)        (6)

It works by eliminating variables in succession. Eventually either a contradiction or a trivial constraint will remain.

The key idea is a technique known as quantifier elimination (QE). We have already seen Skolemization, which removes quantifiers from a formula, but that technique does not preserve the formula's meaning. QE replaces a formula with an equivalent but quantifier-free formula. It is only possible for specific theories, and is generally very expensive.

For the reals, existential quantifiers can be eliminated as follows:

    ∃x (a1 ≤ x ∧ · · · ∧ am ≤ x ∧ x ≤ b1 ∧ · · · ∧ x ≤ bn)  ⇐⇒  ai ≤ bj  for all i and j

A system of constraints has many lower bounds, {ai} (i = 1, . . . , m), and many upper bounds. To eliminate the variable xn from a system of the form (6), solve each constraint involving xn for it: this yields xn ≤ βi/ain when ain > 0 and −xn ≤ −βi/ain when ain < 0. The first case yields an upper bound for xn while the second case yields a lower bound. Now every pair of constraints i and i′ involving opposite signs can be added, writing the lower bound case as −xn ≤ −βi′/ai′n, to obtain

    0 ≤ βi/ain − βi′/ai′n.

Consider the following small set of constraints:

    x ≤ y    x ≤ z    −x + y + 2z ≤ 0    −z ≤ −1

Let's work through the algorithm very informally. The first two constraints give upper bounds for x, while the third constraint gives a lower bound, and can be rewritten as −x ≤ −y − 2z. Adding it to x ≤ y yields 0 ≤ −2z, which can be rewritten as z ≤ 0. Doing the same thing with x ≤ z yields y + z ≤ 0, which can be rewritten as z ≤ −y. This leaves us with a new set of constraints, where we have eliminated the variable x:

    z ≤ 0    z ≤ −y    −z ≤ −1

Now we have two separate upper bounds for z, as well as a lower bound, because we know z ≥ 1. There are again two possible combinations of a lower bound with an upper bound, and we derive 0 ≤ −1 and 0 ≤ −y − 1. Because 0 ≤ −1 is contradictory, Fourier-Motzkin variable elimination has refuted the original set of constraints.

Many other decision procedures exist, frequently for more restrictive problem domains, aiming for greater efficiency and better integration with other reasoning tools. Difference arithmetic is an example: arithmetic constraints are restricted to the form x − y ≤ c, where c is an integer constant. Satisfiability of a set of difference arithmetic constraints can be determined very quickly by constructing a graph and invoking the Bellman-Ford algorithm to look for a cycle representing the contradiction 0 ≤ −1. In the opposite direction, harder decision problems can handle more advanced applications but require much more computer power.

9.3 Other decidable theories

One of the most dramatic examples of quantifier elimination concerns the domain of real polynomial arithmetic:
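The elimination step just described can be coded directly. The following is our own toy illustration (exact rational arithmetic via Fraction; only non-strict constraints, as in the notes; a variable bounded on one side only has its constraints dropped, the observation behind Exercise 34):

```python
from fractions import Fraction
from itertools import product

# A constraint (coeffs, b) stands for coeffs[0]*x0 + ... + coeffs[n-1]*x(n-1) <= b.

def eliminate(constraints, x):
    """Return an equisatisfiable system without variable index x."""
    uppers, lowers, rest = [], [], []
    for coeffs, b in constraints:
        a = coeffs[x]
        if a > 0:                                   # x <= (b - other terms)/a
            uppers.append(([c / a for c in coeffs], b / a))
        elif a < 0:                                 # -x <= -(...): a lower bound
            lowers.append(([c / -a for c in coeffs], b / -a))
        else:
            rest.append((coeffs, b))
    # adding an upper and a lower bound cancels x; one-sided bounds are
    # simply dropped, since x can always be chosen to satisfy them
    for (cu, bu), (cl, bl) in product(uppers, lowers):
        rest.append(([u + l for u, l in zip(cu, cl)], bu + bl))
    return rest

def satisfiable(constraints, nvars):
    for x in range(nvars):
        constraints = eliminate(constraints, x)
    return all(b >= 0 for _, b in constraints)      # each survivor reads 0 <= b

F = Fraction
# the worked example: x <= y, x <= z, -x + y + 2z <= 0, -z <= -1 (x,y,z = 0,1,2)
system = [([F(1), F(-1), F(0)], F(0)),
          ([F(1), F(0), F(-1)], F(0)),
          ([F(-1), F(1), F(2)], F(0)),
          ([F(0), F(0), F(-1)], F(-1))]
print(satisfiable(system, 3))                       # False: 0 <= -1 is derived
```

Eliminating x, then y, then z reproduces the contradiction 0 ≤ −1 obtained by hand above.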
There exist decision procedures for arrays, at least for the trivial theory in which lk is lookup and up is update:

    lk(up(a, i, v), j) = v          if i = j
    lk(up(a, i, v), j) = lk(a, j)   if i ≠ j

The theory of lists with head, tail, cons is also decidable. Combinations of decidable theories remain decidable under certain circumstances, e.g., the theory of arrays with linear arithmetic subscripts. The seminal publication, still cited today, is Nelson and Oppen [1980].

9.4 Satisfiability modulo theories

Many decision procedures operate on existentially quantified conjunctions of inequalities. An arbitrary formula can be solved by translating it into disjunctive normal form (as opposed to the more usual conjunctive normal form) and by eliminating universal quantifiers in favour of negated existential quantifiers. However, these transformations typically cause exponential growth and may need to be repeated as each variable is eliminated.

Satisfiability modulo theories (SMT) is an extension of DPLL to make use of decision procedures, extending their scope while avoiding the problems mentioned above. The idea is that DPLL handles the logical part of the problem while delegating reasoning about particular theories to the relevant decision procedures.

We extend the language of propositional satisfiability to include atomic formulas belonging to our decidable theory (or theories). For the time being, these atomic formulas are not interpreted, so a literal such as a < 0 is wholly unrelated to a > 0. But the decision procedures are invoked during the execution of DPLL; if we have already asserted a < 0, then the attempt to assert a > 0 will be rejected by the decision procedure, causing backtracking. Information can be fed back from the decision procedure to DPLL in the form of a new clause, such as ¬(a < 0) ∨ ¬(a > 0).

[Remark: the Fourier-Motzkin decision procedure eliminates variables, but all decision procedures in actual use can deal with constants such as a as well, and satisfiability-preserving transformations exist between formulas involving constants and those involving quantified variables.]

9.5 SMT example

Let's consider an example. Suppose we start with the following four clauses. Note that a, b, c are constants: variables are not possible with this sort of proof procedure.

    {c = 0, 2a < b}    {b < a}    {3a > 2, a < 0}    {c ≠ 0, ¬(b < a)}

Unit propagation using b < a yields three clauses:

    {c = 0, 2a < b}    {3a > 2, a < 0}    {c ≠ 0}

Unit propagation using c ≠ 0 yields two clauses:

    {2a < b}    {3a > 2, a < 0}

Unit propagation using 2a < b yields just this:

    {3a > 2, a < 0}

Now a case split on the literal 3a > 2 returns a "model":

    b < a ∧ c ≠ 0 ∧ 2a < b ∧ 3a > 2.

But the decision procedure finds these contradictory and returns a new clause:

    {¬(b < a), ¬(2a < b), ¬(3a > 2)}

Finally, we get a true model:

    b < a ∧ c ≠ 0 ∧ 2a < b ∧ a < 0.

Case splitting operates as usual for DPLL. But note that pure literal elimination would make no sense here, as there are connections between literals (consider a < 0 and a > 0 again) that are not visible at the propositional level.

9.6 Final remarks

We have seen here the concepts of over-approximation and counterexample-driven refinement, which are frequently used to extend SAT solving to richer domains than propositional logic. By over-approximation we mean that every model of the original problem assigns truth values to the enriched "propositional letters" (such as a > 0), yielding a model of the propositional clauses obtained by ignoring the underlying meaning of the propositional atoms. As above, any claimed model is then somehow checked against the richer problem domain, and the propositional model is then iteratively refined. But if the propositional clauses are inconsistent, then so is the original problem.

SMT solvers are the focus of great interest at the moment, and have largely superseded SAT solvers (which they incorporate and generalise). One of the most popular SMT solvers is Z3, a product of Microsoft Research but free for non-commercial use. Others include Yices and CVC4. They are applied to a wide range of problems, including hardware and software verification, program analysis, symbolic software execution, and hybrid systems verification.

Exercise 34 In Fourier-Motzkin variable elimination, any variable not bounded both above and below is deleted from the problem. For example, given the set of constraints

    3x ≥ y    x ≥ 0    y ≥ z    z ≤ 1    z ≥ 0

the variables x and then y can be removed (with their constraints), reducing the problem to z ≤ 1 ∧ z ≥ 0. Explain how this happens and why it is correct.

Exercise 35 Apply Fourier-Motzkin variable elimination to the set of constraints

    x ≥ z    y ≥ 2z    z ≥ 0    x + y ≤ z.

Exercise 36 Apply Fourier-Motzkin variable elimination to the set of constraints

    x ≤ 2y    x ≤ y + 3    z ≤ x    0 ≤ z    y ≤ 4x.

Exercise 37 Apply the SMT algorithm sketched above to the following set of clauses:

    {c = 0, c > 0}    {a ≠ b}    {c < 0, a = b}
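The DPLL-plus-theory loop of §9.4 can be caricatured in a few lines. Everything below is our own toy setup, not the notes' example verbatim: atoms are strict bounds on a single rational unknown a, the "SAT solver" is a naive enumeration of propositional models, and each theory conflict is fed back as a blocking clause, in the spirit of ¬(a < 0) ∨ ¬(a > 0) above.

```python
from itertools import product
from fractions import Fraction

# atom 1: a > 2/3 (i.e. 3a > 2); atom 2: a < 0; atom 3: a < 1
ATOMS = {1: ("lo", Fraction(2, 3)),
         2: ("hi", Fraction(0)),
         3: ("hi", Fraction(1))}

def theory_consistent(asserted):
    """Bounds on a are consistent iff every lower bound < every upper bound."""
    lows = [q for kind, q in asserted if kind == "lo"]
    highs = [q for kind, q in asserted if kind == "hi"]
    return not lows or not highs or max(lows) < min(highs)

def prop_model(clauses, atoms):
    """Naive SAT: return some assignment satisfying all clauses, else None."""
    for bits in product([True, False], repeat=len(atoms)):
        model = dict(zip(atoms, bits))
        if all(any(model[l] if l > 0 else not model[-l] for l in clause)
               for clause in clauses):
            return model
    return None

def smt_solve(clauses):
    atoms = sorted({abs(l) for clause in clauses for l in clause})
    learned = []
    while True:
        model = prop_model(clauses + learned, atoms)
        if model is None:
            return None                       # propositionally inconsistent
        if theory_consistent([ATOMS[v] for v in atoms if model[v]]):
            return model                      # a true model
        # block the spurious model: some asserted atom must flip
        learned.append([-v for v in atoms if model[v]])

print(smt_solve([[1], [2, 3]]))   # a model with 2/3 < a < 1
print(smt_solve([[1], [2]]))      # None: a > 2/3 and a < 0 clash
```

Real SMT solvers interleave theory checks with unit propagation instead of waiting for a complete propositional model, but the over-approximate-then-refine loop is the same.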
10 Binary Decision Diagrams

A binary decision tree represents the truth table of a propositional formula by binary decisions, namely if-then-else expressions over the propositional letters. (In the relevant literature, propositional letters are called variables.) Unfortunately, a decision tree may contain much redundancy. A binary decision diagram is a directed acyclic graph, sharing identical subtrees. An ordered binary decision diagram is based upon giving an ordering < to the variables: they must be tested in order. Further refinements ensure that each propositional formula is mapped to a unique diagram, for a given ordering. We get a compact and canonical representation of the truth table of any formula.

The acronym BDD for binary decision diagram is well-established in the literature. However, many earlier papers use OBDD or even ROBDD (for "reduced ordered binary decision diagram") synonymously.

A BDD must satisfy the following conditions:

• ordering: if P is tested before Q, then P < Q (thus in particular, P cannot be tested more than once on a single path)

• uniqueness: identical subgraphs are stored only once (to do this efficiently, hash each node by its variable and pointer fields)

• irredundancy: no test leads to identical subgraphs in the 1 and 0 cases (thanks to uniqueness, redundant tests can be detected by comparing pointers)

For a given variable ordering, the BDD representation of each formula is unique: BDDs are a canonical form. Canonical forms usually lead to good algorithms — for a start, you can test whether two things are equivalent by comparing their canonical forms.

The BDD of a tautology is 1. Similarly, that of any inconsistent formula is 0. To check whether two formulas are logically equivalent, convert both to BDDs and then — thanks to uniqueness — simply compare the pointers.

A recursive algorithm converts a formula to a BDD. All the logical connectives can be handled directly, including → and ↔. (Exclusive-or is also used, especially in hardware examples.) The expensive transformation of A ↔ B into (A → B) ∧ (B → A) is unnecessary.

Here is how to convert a conjunction A ∧ A′ to a BDD. In this algorithm, X P Y is a decision node that tests the variable P, with a true-link to X and a false-link to Y. In other words, X P Y is the BDD equivalent of the decision "if P then X else Y".

1. Recursively convert A and A′ to BDDs Z and Z′.

2. Check for trivial cases. If Z = Z′ (pointer comparison) then the result is Z; if either operand is 0, then the result is 0; if either operand is 1, then the result is the other operand.

3. In the general case, let Z = X P Y and Z′ = X′ P′ Y′. There are three possibilities:

(a) If P = P′ then build the BDD (X ∧ X′) P (Y ∧ Y′) recursively. This means convert X ∧ X′ and Y ∧ Y′ to BDDs U and U′, then construct a new decision node from P to them. Do the usual simplifications. If U = U′ then the resulting BDD for the conjunction is U.

(b) If P < P′ then build the BDD (X ∧ Z′) P (Y ∧ Z′). When building BDDs on paper, it is easier to pretend that the second decision node also starts with P: assume that it has the redundant decision Z′ P Z′ and proceed as in (a).

(c) If P > P′, the approach is analogous to (b).

Other connectives, even ⊕, are treated similarly, differing only in the base cases. The negation of the BDD X P Y is (¬X) P (¬Y). In essence we copy the BDD, and when we reach the leaves, exchange 1 and 0. The BDD of Z → f is the same as the BDD of ¬Z.

During this processing, the same input (consisting of a connective and two BDDs) may be transformed into a BDD repeatedly. Efficient implementations therefore have an additional hash table, which associates inputs to the corresponding BDDs. The result of every transformation is stored in the hash table so that it does not have to be computed again.

Example 34 We apply the BDD Canonicalisation Algorithm to P ∨ Q → Q ∨ R. First, we make tiny BDDs for P and Q. Then, we combine them using ∨ to make a small BDD for P ∨ Q:

[Diagram: the one-node BDDs for P and Q, and the BDD for P ∨ Q, whose test on P has a true-link to 1 and a false-link to the node testing Q.]

The BDD for Q ∨ R has a similar construction, so we omit it. We combine the two small BDDs using →, then simplify (removing a redundant test on Q) to obtain the final BDD.

[Diagram: combining the BDDs for P ∨ Q and Q ∨ R under →; in the final BDD, P's false-link goes to 1 and its true-link goes to a test on Q, whose false-link tests R.]

The new construction is shown in grey. In both of these examples, it appears over the rightmost formula because its variables come later in the ordering.

The final diagram indicates that the original formula is always true except if P is true while Q and R are false. When you have such a simple BDD, you can easily check that it is correct. For example, this BDD suggests the formula evaluates to 1 when P is false, and indeed we find that the formula simplifies to Q → Q ∨ R, which simplifies further to 1.

Huth and Ryan [2004] present a readable introduction to BDDs. A classic but more formidable source of information is Bryant [1992].
Exercise 38 Compute the BDD for each of the following formulas, taking the variables as alphabetically ordered:

    P ∧ Q → Q ∧ P        P ∨ Q → P ∧ Q
    ¬(P ∨ Q) ∨ P         ¬(P ∧ Q) ↔ (P ∨ R)

Exercise 39 Verify these equivalences using BDDs:

    (P ∧ Q) ∧ R ≃ P ∧ (Q ∧ R)
    (P ∨ Q) ∨ R ≃ P ∨ (Q ∨ R)
    P ∨ (Q ∧ R) ≃ (P ∨ Q) ∧ (P ∨ R)
    P ∧ (Q ∨ R) ≃ (P ∧ Q) ∨ (P ∧ R)

Exercise 40 Verify these equivalences using BDDs:

    ¬(P ∧ Q) ≃ ¬P ∨ ¬Q
    (P ↔ Q) ↔ R ≃ P ↔ (Q ↔ R)
    (P ∨ Q) → R ≃ (P → R) ∧ (Q → R)

11 Modal Logics

Modal logic allows us to reason about statements being "necessary" or "possible". Some variants are effectively about time (temporal logic), where a statement might hold "henceforth" or "eventually".

There are many forms of modal logic. Each one is based upon two parameters:

• W is the set of possible worlds (machine states, future times, . . . )

• R is the accessibility relation between worlds (state transitions, flow of time, . . . )

The pair (W, R) is called a modal frame.

The two modal operators, or modalities, are 2 and 3:

• 2A means A is necessarily true

• 3A means A is possibly true

Here "necessarily true" means "true in all worlds accessible from the present one". The modalities are related by the law ¬3A ≃ 2¬A; in words, "it is not possible that A is true" is equivalent to "A is necessarily false".

Complex modalities are made up of strings of the modal operators, such as 22A, 23A, 32A, etc. Typically many of these are equivalent to others; in S4, an important modal logic, 22A is equivalent to 2A.

11.1 Semantics of propositional modal logic

Here are some basic definitions, with respect to a particular frame (W, R):

An interpretation I maps the propositional letters to subsets of W. For each letter P, the set I(P) consists of those worlds in which P is regarded as true.

If w ∈ W and A is a modal formula, then w ⊩ A means A is true in world w. This relation is defined as follows:

    w ⊩ P      ⇐⇒  w ∈ I(P)
    w ⊩ 2A     ⇐⇒  v ⊩ A for all v such that R(w, v)
    w ⊩ 3A     ⇐⇒  v ⊩ A for some v such that R(w, v)
    w ⊩ A ∨ B  ⇐⇒  w ⊩ A or w ⊩ B
    w ⊩ A ∧ B  ⇐⇒  w ⊩ A and w ⊩ B
    w ⊩ ¬A     ⇐⇒  w ⊩ A does not hold

This definition of truth is more complex than we have seen previously (§2.2), because of the extra parameters W and R. We shall not consider quantifiers at all; they really complicate matters, especially if the universe is allowed to vary from one world to the next.

For a particular frame (W, R), further relations can be defined in terms of w ⊩ A:

    ⊨W,R,I A  means  w ⊩ A for all w under interpretation I
    ⊨W,R A    means  w ⊩ A for all w and all I

Now ⊨ A means ⊨W,R A for all frames. We say that A is universally valid. In particular, all tautologies of propositional logic are universally valid.

Typically we make further assumptions on the accessibility relation. We may assume, for example, that R is transitive, and consider whether a formula holds under all such frames. More formulas become universally valid if we restrict the accessibility relation, as they exclude some modal frames from consideration. The purpose of such assumptions is to better model the task at hand. For instance, to model the passage of time, we might want R to be reflexive and transitive; we could even make it a linear ordering, though branching-time temporal logic is popular.

11.2 Hilbert-style modal proof systems

Start with any proof system for propositional logic. Then add the distribution axiom

    2(A → B) → (2A → 2B)

and the necessitation rule: from A infer 2A.

There are no axioms or inference rules for 3. The modality is viewed simply as an abbreviation:

    3A  =def  ¬2¬A

The distribution axiom clearly holds in our semantics. The propositional connectives obey their usual truth tables in each world. If A holds in all worlds, and A → B holds in all worlds, then B holds in all worlds. Thus if 2A and 2(A → B) hold then so does 2B, and that is the essence of the distribution axiom.

The necessitation rule states that all theorems are necessarily true. In more detail, if A can be proved, then it holds in all worlds; therefore 2A is also true.

The modal logic that results from adding the distribution axiom and necessitation rule is called K. It is a pure modal logic, from which others are obtained by adding further axioms. Each axiom corresponds to a property that is assumed to hold of all accessibility relations. Here are just a few of the main ones:

    T    2A → A        (reflexive)
    4    2A → 22A      (transitive)
    B    A → 23A       (symmetric)
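The semantics just given can be executed directly. The following sketch is our own encoding (formulas as nested tuples, a finite frame given as a dictionary); it evaluates w ⊩ A:

```python
# Kripke semantics of S11.1: a frame is a set of worlds with accessibility R,
# an interpretation I maps each letter to the set of worlds where it holds,
# and box/diamond quantify over the accessible worlds.
def holds(w, A, R, I):
    op = A[0]
    if op == "atom": return w in I[A[1]]
    if op == "not":  return not holds(w, A[1], R, I)
    if op == "and":  return holds(w, A[1], R, I) and holds(w, A[2], R, I)
    if op == "or":   return holds(w, A[1], R, I) or holds(w, A[2], R, I)
    if op == "box":  return all(holds(v, A[1], R, I) for v in R.get(w, ()))
    if op == "dia":  return any(holds(v, A[1], R, I) for v in R.get(w, ()))
    raise ValueError("unknown connective: " + op)

def box(A): return ("box", A)
def dia(A): return ("dia", A)

# From world 0, time splits: A holds forever in world 1, B forever in world 2.
R = {0: [1, 2], 1: [1], 2: [2]}
I = {"A": {1}, "B": {2}}
A, B = ("atom", "A"), ("atom", "B")

print(holds(0, dia(box(A)), R, I),              # True:  32A
      holds(0, dia(box(B)), R, I),              # True:  32B
      holds(0, dia(box(("and", A, B))), R, I))  # False: 32(A ∧ B)
```

On this frame 32A and 32B hold at world 0 while 32(A ∧ B) fails, which is precisely the counterexample of Figure 4.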
The "time" described by S4 allows multiple futures, which can be confusing. For example, 32A intuitively means "eventually A will be true forever". You might expect 32A and 32B to imply 32(A ∧ B), since eventually A and B should both have become true. However, this property fails because time can split, with A becoming true in one branch and B in the other (Fig. 4).

[Figure 4: Counterexample to 32A ∧ 32B → 32(A ∧ B) — time splits into one future where A becomes true forever and another where B does.]

Note in particular that 232A is stronger than 32A, and means "in all futures, eventually A will be true forever".

The sequent calculus for S4 extends the usual sequent rules for propositional logic with additional ones for 2 and 3. Four rules are required because the modalities may occur on either the left or right side of a sequent. Here is an example proof:

    A ⇒ A    B ⇒ B
    ---------------- (→l)
    A → B, A ⇒ B
    ---------------- (2l)
    A → B, 2A ⇒ B
    ------------------ (2l)
    2(A → B), 2A ⇒ B
    ------------------- (2r)
    2(A → B), 2A ⇒ 2B

Intuitively, why is this sequent true? We assume 2(A → B): from now on, if A holds then so does B. We assume 2A: from now on, A holds. Obviously we can conclude that B will hold from now on, which we write formally as 2B.

The order in which you apply rules is important. Working backwards, you must first apply rule (2r). This rule discards non-2 formulas, but there aren't any. If you first apply (2l), removing the boxes from the left side, then you will get stuck:

    now what?
    ⇒ B
    ------------------- (2r)
    A → B, A ⇒ 2B
    ------------------- (2l)
    A → B, 2A ⇒ 2B
    ------------------- (2l)
    2(A → B), 2A ⇒ 2B
Applying (2r) before (2l) is analogous to applying (∀r) before (∀l). The analogy holds because 2A has an implicit universal quantifier: for all accessible worlds.

The following two proofs establish the modal equivalence 2323A ≃ 23A. Strings of modalities, like 2323 and 23, are called operator strings. So the pair of results establish an operator string equivalence. The validity of this particular equivalence is not hard to see. Recall that 23A means that A holds infinitely often. So 2323A means that 23A holds infinitely often — but that can only mean that A holds infinitely often, which is the meaning of 23A.

Now, let us prove the equivalence. Here is the first half of the proof. As usual we apply (2r) before (2l). Dually, and analogously to the treatment of the ∃ rules, we apply (3l) before (3r):

    3A ⇒ 3A
    -------------- (2l)
    23A ⇒ 3A
    -------------- (3l)
    323A ⇒ 3A
    --------------- (2l)
    2323A ⇒ 3A
    ---------------- (2r)
    2323A ⇒ 23A

The opposite entailment is easy to prove:

    23A ⇒ 23A
    ----------------- (3r)
    23A ⇒ 323A
    ------------------ (2r)
    23A ⇒ 2323A

Logic S4 enjoys many operator string equivalences, including 22A ≃ 2A. And for every operator string equivalence, its dual (obtained by exchanging 2 with 3) also holds. In particular, 33A ≃ 3A and 3232A ≃ 32A hold. So we only need to consider operator strings in which the boxes and diamonds alternate, and whose length does not exceed three. The distinct S4 operator strings are therefore 2, 3, 23, 32, 232 and 323.

Finally, here are two attempted proofs that fail — because their conclusions are not theorems! The modal sequent A ⇒ 23A states that if A holds now then it necessarily holds again: from each accessible world, another world is accessible in which A holds. This formula is valid if the accessibility relation is symmetric; then one could simply return to the original world. The formula is therefore a theorem of S5 modal logic, but not S4.

    ⇒ A
    ----------- (3r)
    ⇒ 3A
    ----------- (2r)
    A ⇒ 23A

Here, the modal sequent 3A, 3B ⇒ 3(A ∧ B) states that if A holds in some accessible world, and B holds in some accessible world, then both A and B hold in some accessible world. It is a fallacy because those two worlds need not coincide. The (3l) rule prevents us from removing the diamonds from both 3A and 3B; if we choose one we must discard the other:

    B ⇒ A ∧ B
    ---------------------- (3r)
    B ⇒ 3(A ∧ B)
    ---------------------- (3l)
    3A, 3B ⇒ 3(A ∧ B)

The topmost sequent may give us a hint as to why the conclusion fails. Here we are in a world in which B holds, and we are trying to show A ∧ B, but there is no reason why A should hold in that world.

The sequent 32A, 32B ⇒ 32(A ∧ B) is not valid because A and B can become true in different futures. However, the sequents 32A, 232B ⇒ 32(A ∧ B) and 232A, 232B ⇒ 232(A ∧ B) are both valid.

Exercise 41 Why does the dual of an operator string equivalence also hold?

Exercise 42 Prove the sequents 3(A ∨ B) ⇒ 3A, 3B and 3A ∨ 3B ⇒ 3(A ∨ B), thus proving the equivalence 3(A ∨ B) ≃ 3A ∨ 3B.

Exercise 43 Prove the sequent 3(A → B), 2A ⇒ 3B.

Exercise 44 Prove the equivalence 2(A ∧ B) ≃ 2A ∧ 2B.

Exercise 45 Prove 232A, 232B ⇒ 232(A ∧ B).

12 Tableaux-Based Methods

There is a lot of redundancy among the connectives ¬, ∧, ∨, →, ↔, ∀, ∃. We could get away with using only three of them (two if we allowed exclusive-or), but we use the full set for readability. There is also a lot of redundancy in the sequent calculus, because it was designed to model human reasoning, not to be as small as possible.

One approach to removing redundancy results in the resolution method. Clause notation replaces the connectives, and there is only one inference rule. A less radical approach still removes much of the redundancy, while preserving much of the natural structure of formulas. The resulting formalism, known as the tableau calculus, is often adopted by proof theorists because of its logical simplicity. Adding unification produces yet another formalism known as free-variable tableaux; this form is particularly amenable to implementation. Both formalisms use proof by contradiction.

12.1 Simplifying the sequent calculus

The usual formalisation of first-order logic involves seven connectives, or nine in the case of modal logic. For each connective the sequent calculus has a left and a right rule. So, apart from the structural rules (basic sequent and cut), there are 14 rules, or 18 for modal logic.

Suppose we allow only formulas in negation normal form. This immediately disposes of the connectives → and ↔. Really ¬ is discarded also, as it is allowed only on propositional letters. So only four connectives remain, six for modal logic.

The greatest simplicity gain comes in the sequent rules. The only sequent rules that move formulas from one side to the other (across the ⇒ symbol) are the rules for the connectives that we have just discarded. Half of the sequent rules can be discarded too. It makes little difference whether we discard the left-side rules or the right-side rules.

Let us discard the right-side rules. The resulting system allows sequents of the form A ⇒ . It is a form of refutation system (proof by contradiction), since the formula A has
the same meaning as the sequent ¬A ⇒ . Moreover, a basic sequent has the form of a contradiction. We have created a new formal system, known as the tableau calculus.

    ¬A, A, Γ ⇒  (basic)

    ¬A, Γ ⇒    A, Γ ⇒
    ------------------ (cut)
    Γ ⇒

    A, B, Γ ⇒
    ----------- (∧l)
    A ∧ B, Γ ⇒

    A, Γ ⇒    B, Γ ⇒
    ------------------ (∨l)
    A ∨ B, Γ ⇒

    A[t/x], Γ ⇒
    ------------ (∀l)
    ∀x A, Γ ⇒

    A, Γ ⇒
    ---------- (∃l)
    ∃x A, Γ ⇒

Rule (∃l) has the usual proviso: it holds provided x is not free in the conclusion!

We can extend the system to S4 modal logic by adding just two further rules, one for 2 and one for 3:

    A, Γ ⇒
    -------- (2l)
    2A, Γ ⇒

    A, Γ∗ ⇒
    -------- (3l)
    3A, Γ ⇒

As previously, Γ∗ is defined to erase all non-2 formulas:

    Γ∗  =def  {2B | 2B ∈ Γ}

Then allow unification to instantiate variables with terms. This should occur when trying to solve any goal containing two formulas, ¬A and B. Try to unify A with B, producing a basic sequent. Instantiating a variable updates the entire proof tree.

Up until now, we have treated rule (∃l) in backward proofs as creating a fresh variable. That will no longer do: we now allow variables to become instantiated by terms. To eliminate this problem, we do not include (∃l) in the free-variable tableau calculus; instead we Skolemize the formula. All existential quantifiers disappear, so we can discard rule (∃l). This version of the tableau method is known as the free-variable tableau calculus.

Warning: if you wish to use unification, you absolutely must also use Skolemization. If you use unification without Skolemization, then you are trying to use two formalisms at the same time and your proofs will be nonsense! This is because unification is likely to introduce variable occurrences in places where they are forbidden by the side condition of the existential rule.

The Skolemised version of ∀y ∃z Q(y, z) ∧ ∃x P(x) is ∀y Q(y, f(y)) ∧ P(a). The subformula ∃x P(x) goes to P(a) and not to P(g(y)) because it is outside the scope of the ∀y.

The formula ∀x P(x) ∧ ¬P(a) is obviously inconsistent. Here is its refutation in the free-variable tableau calculus:

    P(y), ¬P(a) ⇒         (basic, instantiating y ↦ a)
    ------------------ (∀l)
    ∀x P(x), ¬P(a) ⇒
    ------------------- (∧l)
    ∀x P(x) ∧ ¬P(a) ⇒

A failed proof is always illuminating. Let us try to prove the invalid formula

    ∀x [P(x) ∨ Q(x)] → [∀x P(x) ∨ ∀x Q(x)].

Negation and conversion to NNF gives ∃x ¬P(x) ∧ ∃x ¬Q(x) ∧ ∀x [P(x) ∨ Q(x)]. Skolemization gives ¬P(a) ∧ ¬Q(b) ∧ ∀x [P(x) ∨ Q(x)]. The proof fails because a and b are distinct constants. It is impossible to instantiate y to both simultaneously. The following proof omits the initial (∧l) steps.

    y ↦ a                        y ↦ b???
    ¬P(a), ¬Q(b), P(y) ⇒         ¬P(a), ¬Q(b), Q(y) ⇒
    --------------------------------------------------- (∨l)
    ¬P(a), ¬Q(b), P(y) ∨ Q(y) ⇒
    --------------------------------------------------- (∀l)
    ¬P(a), ¬Q(b), ∀x [P(x) ∨ Q(x)] ⇒

12.4 Tableaux-based theorem provers

A tableau represents a partial proof as a set of branches of formulas. Each formula on a branch is expanded until this is no longer possible (and the proof fails) or until the proof succeeds.

Expanding a conjunction A ∧ B on a branch replaces it by the two conjuncts, A and B. Expanding a disjunction A ∨ B splits the branch in two, with one branch containing A and the other branch B. Expanding the quantification ∀x A extends the branch by a formula of the form A[t/x]. If a branch contains both A and ¬A then it is said to be closed. When all branches are closed, the proof has succeeded.

A tableau can be viewed as a compact, graph-based representation of a set of sequents. The branch operations described above correspond to our sequent rules in an obvious way.

Quite a few theorem provers have been based upon free-variable tableaux. The simplest is due to Beckert and Posegga [1994] and is called leanTAP. The entire program appears below! Its deductive system is similar to the reduced sequent calculus we have just studied. It relies on some Prolog tricks, and is certainly not pure Prolog code. It demonstrates just how simple a theorem prover can be. leanTAP does not outperform big resolution systems. But it quickly proves some fairly hard theorems.

    prove((A,B),UnExp,Lits,FreeV,VarLim) :- !,
        prove(A,[B|UnExp],Lits,FreeV,VarLim).
    prove((A;B),UnExp,Lits,FreeV,VarLim) :- !,
        prove(A,UnExp,Lits,FreeV,VarLim),
        prove(B,UnExp,Lits,FreeV,VarLim).
    prove(all(X,Fml),UnExp,Lits,FreeV,VarLim) :- !,
        \+ length(FreeV,VarLim),
        copy_term((X,Fml,FreeV),(X1,Fml1,FreeV)),
        append(UnExp,[all(X,Fml)],UnExp1),
        prove(Fml1,UnExp1,Lits,[X1|FreeV],VarLim).
    prove(Lit,_,[L|Lits],_,_) :-
        (Lit = -Neg; -Lit = Neg) ->
        (unify(Neg,L); prove(Lit,[],Lits,_,_)).
    prove(Lit,[Next|UnExp],Lits,FreeV,VarLim) :-
        prove(Next,UnExp,[Lit|Lits],FreeV,VarLim).

The first clause handles conjunctions, the second disjunctions, the third universal quantification. The fourth clause handles literals, including negation. The fifth clause brings in the next formula to be analyzed. You are not expected to memorize this program or to understand how it works in detail.

Exercise 46 Use the free variable tableau calculus to prove these formulas:

    (∃y ∀x R(x, y)) → (∀x ∃y R(x, y))
    (P(a, b) ∨ ∃z P(z, z)) → ∃x y P(x, y)
    (∃x P(x) → Q) → ∀x (P(x) → Q)

References

B. Beckert and J. Posegga. leanTAP: Lean, tableau-based theorem proving. In A. Bundy, editor, Automated Deduction — CADE-12 International Conference, LNAI 814, pages 793–797. Springer, 1994.

R. E. Bryant. Symbolic boolean manipulation with ordered binary-decision diagrams. Computing Surveys, 24(3):293–318, Sept. 1992.

M. Huth and M. Ryan. Logic in Computer Science: Modelling and Reasoning about Systems. Cambridge University Press, 2nd edition, 2004.

G. Nelson and D. C. Oppen. Fast decision procedures based on congruence closure. J. ACM, 27(2):356–364, 1980. doi: 10.1145/322186.322198.