Proposition Al 2

The document discusses propositional approaches to first-order theorem proving, highlighting the historical context and contributions of key figures in AI such as Martin Davis and Alan Robinson. It explores the differences between human and computer theorem proving, the challenges faced by current theorem provers, and potential applications in various fields. The document also covers the structure of propositional calculus, resolution methods, and the significance of Horn clauses in logic-based AI systems.

Uploaded by

Akila

Propositional Approaches to
First-Order Theorem Proving
David A. Plaisted
UNC Chapel Hill
May 2004
History of AI
 Early emphasis on general
methods
 Newell Shaw Simon GPS
 Robinson 1965 resolution
 Cordell Green question answering
 Shift to specialized techniques
 Feigenbaum Expert Systems
 Is logic a suitable basis for AI?
04/09/25
Approaches to AI
 Weak vs. strong methods in AI
 Declarative vs. procedural knowledge
 My interest: general logic-based approaches

Aristotle on Deduction
A deduction is speech (logos)
in which, certain things having
been supposed, something
different from those supposed
results of necessity because of
their being so. (Prior Analytics
I.2, 24b18-20)
Proof
 Proof is the idol before whom the
pure mathematician tortures
himself.
-- Sir Arthur Eddington
 You may prove anything by
figures. --Thomas Carlyle
 What is now proved was once
only imagined. -- William Blake
Proof
 You cannot demonstrate
an emotion or prove an
aspiration. -- John Morley
 Prove all things; hold
fast that which is good. --
Bible, I Thessalonians
Logic

 No, no, you're not thinking;
you're just being logical. --
Niels Bohr
 Logic is one thing and
commonsense another. --
Elbert Hubbard, The Note
Book, 1927
Theorem Proving
 Potentially a key technology for AI
 Brittleness problem for expert
systems
 An unsolved problem
 Weak versus strong methods
 Problems with resolution
 Impact on entire field
 Importance of space versus time
Theorem Proving on a
Computer
 Speed and accuracy of computers
 People get tired and make
mistakes
 How do people prove theorems?
Potential applications
 Hardware verification
 Software verification
 AI and expert systems
 Robots
 Deductive Databases
 Semantic web and query
answering
 Mathematics research
 Education
Current theorem provers
 Largely syntactic
 Resolution or ME (tableau) based
 First-order provers are often
poor on non-Horn clauses
 Rarely can solve hard problems
 Human interaction needed for
hard problems

How do humans prove
theorems?
 Semantics
 Case analysis
 Sequential search through space
of possible structures
 Focus on the theorem
People versus computers
 In a few areas computers are faster
 Propositional calculus
 Equational logic

 Geometry
 More to come in the future
 In general people are much better. Why?
 Humans use semantics
 Computers use syntax in most cases
The future

 Will provers soon be much more
powerful than they are now?
 Will they ever be much more
powerful than humans?
Organization of the talk
 History of ATP
 Contributions of Martin Davis
 Contributions of Alan Robinson
 Achievements of Provers

 Propositional Calculus
 Propositional Resolution
 Horn Clauses
 Davis and Putnam’s Method
 The Satisfiability Threshold
 Propositional Calculus (continued)
 Performance Obtained
 Applications

 Semantics in Theorem Proving


 First Order Logic
 Clause form and Herbrand’s
theorem
 Criteria for evaluating provers
 Resolution
 Otter
 Model elimination
 Matings
 Propositional approaches to first
order logic
 Clause Linking
 Disconnection Calculus
 Disconnection Calculus Theorem
Prover
 First-Order DPLL Method
 Replacement Rules
 Definitions
 OSHL with semantics
 Comments on CADE system
competition
David Hilbert

 Hilbert’s goal was to mechanize
mathematics. “Hilbert’s Program.”
 Goedel showed that this is
impossible.
 Automatic theorem proving tries to
mechanize what can be mechanized.
Martin Davis
 Theorem Proving on Computers
 Davis and Putnam’s Method
 Clause Form Refutational
Theorem Proving
 Foreshadowing of Resolution
Alan Robinson

 Resolution in First-Order Logic


 Unification in a Clause Form
Refutational Prover
 Many non-resolution methods are
still in this tradition
 First reasonably powerful
theorem prover for first-order
logic
Achievements of Provers
 Robbins Problem Solution
 Proof of Cantor’s Theorem
 Hardware Verification
 Prolog
 Constraints
 Quasigroup existence and
nonexistence
 Equivalential calculus axiom systems
 Euclidean and non-Euclidean
geometry
Achievements of Provers
 Verification of communication
networks
 Basketball scheduling
 Planning
 RRTP and description logic
Propositional Calculus
 Formulae are composed of
Boolean variables p,q,r, … and
Boolean connectives:
 ∧ (conjunction, “and”)
 ∨ (disjunction, “or”)
 ¬ (negation, “not”)
 → (implication, “if then”)
 ↔ (equivalence, “if and only if”)
 Example formula
 p ∧ q → p
 Interpretation:
 “It is raining” and “It is Tuesday”
implies “It is raining.”
 Another interpretation:
 “All birds are green” and “All fish are
purple” implies “All birds are green.”
 Both interpretations make the
formula true.
 The formula is valid (true in all
interpretations).
 Another example formula:
 p ∧ q → ¬p
 Interpretation:
 2=2 ∧ 3=3 → 2 ≠ 2
 Another interpretation:
 2=2 ∧ 3 ≠ 3 → 2 ≠ 2
 The first interpretation makes
the formula false.
 The second makes it true.
 The formula is not valid.
Truth Tables
Truth Table for Conjunctions
1st Conjunct   2nd Conjunct   Statement
A              B              A ∧ B
true           true           true
true           false          false
false          true           false
false          false          false

Truth Table for Disjunctions
1st Disjunct   2nd Disjunct   Statement
A              B              A ∨ B
true           true           true
true           false          true
false          true           true
false          false          false

Truth Table for Conditionals
Antecedent     Consequent     Statement
A              B              A → B
true           true           true
true           false          false
false          true           true
false          false          true

Truth Table for Equivalences
Antecedent     Consequent     Statement
A              B              A ↔ B
true           true           true
true           false          false
false          true           false
false          false          true

Truth Table for Negations
Statement      Negation
A              ¬A
true           false
false          true
 Interpretations assign meanings to
symbols.
 In Boolean logic interpretations
assign truth values (true, false) to the
symbols.
 An interpretation in Boolean logic is
called a valuation.
 Thus a valuation I is an assignment of
truth values (true or false) to each
variable in a formula
A valid formula
P   Q   P ∧ Q   P ∧ Q → P
T   T   T       T
T   F   F       T
F   T   F       T
F   F   F       T

A satisfiable invalid formula
P   Q   P ∧ Q   (P ∧ Q) ∨ Q
T   T   T       T
T   F   F       F
F   T   F       T
F   F   F       F

 An unsatisfiable formula: P ∧ ¬P
P   ¬P   P ∧ ¬P
T   F    F
F   T    F
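The three tables above can be reproduced mechanically by trying every valuation. The following is a minimal sketch, assuming the three slide formulas read P ∧ Q → P, (P ∧ Q) ∨ Q and P ∧ ¬P; the names `classify` and `implies` are illustrative and not part of the talk.

```python
from itertools import product

def implies(a, b):
    """Material implication: false only when a is true and b is false."""
    return (not a) or b

def classify(formula, variables):
    """Classify a formula by evaluating it under every valuation.

    `formula` maps a valuation (dict from variable name to bool) to a bool.
    """
    results = [formula(dict(zip(variables, values)))
               for values in product([True, False], repeat=len(variables))]
    if all(results):
        return "valid"
    return "satisfiable" if any(results) else "unsatisfiable"

valid_example = classify(lambda v: implies(v["P"] and v["Q"], v["P"]), ["P", "Q"])
invalid_example = classify(lambda v: (v["P"] and v["Q"]) or v["Q"], ["P", "Q"])
unsat_example = classify(lambda v: v["P"] and not v["P"], ["P"])
```

The loop visits 2^n valuations for n variables, which is the exponential cost that motivates the other methods discussed in the talk.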
Testing Validity
 Using truth tables is exponential
 Resolution
 Davis and Putnam’s Method
 Local Search Methods
Conjunctive Normal
Form
 Any propositional formula can be
put into conjunctive normal form
(clause form).
 Example:
 (p ∨ q ∨ ¬r) ∧ (p ∨ ¬r) ∧ (q ∨ r)
 Represent as sets:
 {p, q, ¬r}, {p, ¬r}, {q, r}
 Each set is one clause.
Conjunctive Normal
Form
 A formula in conjunctive normal
form is unsatisfiable if for every
interpretation I, there is a clause
C that is false in I.
 A formula in cnf is satisfiable if
there is an interpretation I that
makes all clauses true.
 Binary Resolution Step
 For any two clauses C1 and C2, if there is
a literal L1 in C1 that is complementary
to a literal L2 in C2, then delete L1 and L2
from C1 and C2 respectively, and
construct the disjunction of the
remaining literals. The constructed
clause is a resolvent of C1 and C2.
 Examples of Resolution Step
 C1 = a ∨ b, C2 = ¬b ∨ c
 Complementary literals: b, ¬b
 Resolvent: a ∨ c
 C1 = a ∨ ¬b ∨ c, C2 = b ∨ d
 Complementary literals: ¬b, b
 Resolvent: a ∨ c ∨ d

 Resolution in Propositional
Logic
Formula          Clause form
1. a ← b ∧ c     a ∨ ¬b ∨ ¬c
2. b             b
3. c ← d ∧ e     c ∨ ¬d ∨ ¬e
4. e ∨ f         e ∨ f
5. d ∧ ¬f        d
                 ¬f
 Resolution in Propositional Logic
(continued)
 First, the goal to be proved, a, is negated
and added to the clause set.
 The derivation of □ indicates that the
database of clauses is inconsistent.
 Derivation:
¬a with a ∨ ¬b ∨ ¬c gives ¬b ∨ ¬c
¬b ∨ ¬c with b gives ¬c
¬c with c ∨ ¬d ∨ ¬e gives ¬d ∨ ¬e
¬d ∨ ¬e with e ∨ f gives f ∨ ¬d
f ∨ ¬d with d gives f
f with ¬f gives □
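The refutation above can be replayed with a naive saturation loop. This is only a sketch under stated assumptions: literals are strings with "~" marking negation, clauses are frozensets, and the clause set is read as a ∨ ¬b ∨ ¬c, b, c ∨ ¬d ∨ ¬e, e ∨ f, d, ¬f. The names `negate`, `resolvents` and `refutes` are hypothetical, and the loop uses none of the refinements a real prover would.

```python
from itertools import combinations

def negate(lit):
    """Complement of a literal written as 'x' or '~x'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All binary resolvents of two clauses (frozensets of literals)."""
    return {(c1 - {lit}) | (c2 - {negate(lit)})
            for lit in c1 if negate(lit) in c2}

def refutes(clauses):
    """Saturate under binary resolution; True if the empty clause appears."""
    known = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            for r in resolvents(c1, c2):
                if not r:
                    return True          # empty clause: inconsistent
                new.add(r)
        if new <= known:
            return False                 # nothing new: no refutation exists
        known |= new

kb = {frozenset(c) for c in (
    {"a", "~b", "~c"}, {"b"}, {"c", "~d", "~e"}, {"e", "f"}, {"d"}, {"~f"})}
goal_proved = refutes(kb | {frozenset({"~a"})})
```

Because every generated clause is kept, the sketch also shows why resolution tends to use a large amount of space.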

Horn clauses
 At most one positive literal
 Basis of Prolog
 Satisfiability can be tested in linear
time
 Resolution is fast for Horn clauses
 Resolution is very slow for non Horn
clauses
 Horn clauses: ¬p ∨ ¬q ∨ r, ¬p ∨ ¬q ∨ ¬r, r
 Non Horn clause: ¬p ∨ q ∨ r
 Hard problems are usually non-Horn
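One way to see why Horn clauses are easy is forward chaining: derive every atom that is forced to be true, then check whether an all-negative clause is violated. The sketch below assumes each Horn clause is given as a (body, head) pair standing for ¬b1 ∨ … ∨ ¬bn ∨ head, with head None for an all-negative clause. The function name is illustrative, and this simple loop is quadratic rather than the linear-time procedure the slide refers to.

```python
def horn_satisfiable(clauses):
    """Decide satisfiability of Horn clauses by forward chaining.

    clauses: list of (body, head); body is a set of atoms, head is an
    atom or None (None marks a clause with no positive literal).
    """
    true_atoms = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if body <= true_atoms:
                if head is None:
                    return False         # all-negative clause is falsified
                if head not in true_atoms:
                    true_atoms.add(head)
                    changed = True
    return True                          # true_atoms is a least model

# Assumed reading of the slide's Horn clauses: ~p v ~q v r, ~p v ~q v ~r, r
slide_clauses = [({"p", "q"}, "r"), ({"p", "q", "r"}, None), (set(), "r")]
slide_result = horn_satisfiable(slide_clauses)
```

Adding the facts p and q to the same set makes the all-negative clause fire, so the extended set is unsatisfiable.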
DPLL (Davis and Putnam’s
Method)
(Purity rule omitted)
1. If no clauses in KB, return T (Satisfiable)
2. If a clause in KB is empty (FALSE), return F
(Unsatisfiable)
3. If KB has a unit clause C with prop. p, then
return DPLL(KB,p←polarity(p,C))
4. Choose an uninstantiated variable p
5. If DPLL(KB, p←TRUE) returns T, return T
6. If DPLL(KB, p←FALSE) returns T, return T
7. Return F
DPLL Example
{p, r}, {¬p, ¬q, r}, {p, ¬r}
Branch p=T: {T, r}, {¬T, ¬q, r}, {T, ¬r}
SIMPLIFY: {¬q, r}
Branch p=F: {F, r}, {¬F, ¬q, r}, {F, ¬r}
SIMPLIFY: {r}, {¬r}
SIMPLIFY: {}
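The numbered steps of the procedure can be sketched in a few lines. This is a minimal illustration, not the data structures used by real solvers: it assumes clauses are frozensets of nonzero integers (a negative integer is a negated variable), omits the purity rule as the slide does, and picks the branching variable arbitrarily. The names `simplify` and `dpll` are hypothetical.

```python
def simplify(clauses, lit):
    """Make lit true: drop satisfied clauses, delete its complement elsewhere."""
    return [c - {-lit} for c in clauses if lit not in c]

def dpll(clauses):
    """Return True if the clause list is satisfiable, False otherwise."""
    if not clauses:
        return True                              # step 1: no clauses left
    if any(len(c) == 0 for c in clauses):
        return False                             # step 2: empty clause
    for c in clauses:
        if len(c) == 1:                          # step 3: unit clause
            (lit,) = c
            return dpll(simplify(clauses, lit))
    var = abs(next(iter(clauses[0])))            # step 4: pick a variable
    return (dpll(simplify(clauses, var))         # step 5: try var = TRUE
            or dpll(simplify(clauses, -var)))    # steps 6-7: try var = FALSE

# Assumed reading of the example with p=1, q=2, r=3:
# {p, r}, {~p, ~q, r}, {p, ~r}
example = [frozenset({1, 3}), frozenset({-1, -2, 3}), frozenset({1, -3})]
example_satisfiable = dpll(example)
```

Each recursive call only carries the simplified clause list, which is one reason the search needs little space compared with saturation-based resolution.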
DPLL Viewed Abstractly
 The call DPLL(KB, p←TRUE) is
testing interpretations where p is
TRUE
 The call DPLL(KB, p←FALSE) is
testing interpretations where p is
FALSE
 In this way, interpretations are
examined in a sequential manner
 For each interpretation, a reason is
found that the formula is false in it
 Such a sequential search of
interpretations is very fast
DPLL (Davis and Putnam’s
method), continued
 DPLL does a backtracking search
for a model of the formula
 DPLL is much faster than
propositional resolution for non-
Horn clauses
 Very fast data structures developed
 Popular for hardware verification
 Local search can be much faster but
is incomplete
 “Systematic methods can now
routinely solve verification
problems with thousands or tens
of thousands of variables, while
local search methods can solve
hard random 3SAT problems with
millions of variables.”
 (from a conference
announcement)
NP Complete but Easy
 How can the satisfiability problem
be so easy when it is NP complete?
 If there are many clauses the proof
is likely to be short and can be
found quickly
 If there are few clauses there are
likely to be many interpretations
and one is likely to be found quickly
 The hard problems are in the middle
at the “satisfiability threshold”
First Order Logic
 Formulae may contain Boolean
connectives and also variables x,
y, z, …, predicates P,Q,R, …,
function symbols f,g,h, …, and
quantifiers ∀ and ∃ meaning “for
all” and “there exists.”
 Example: ∀x(P(x) → ∃yQ(f(x),y))
Individual Constants
 Formulae can also contain
constant symbols like a,b,c which
can be regarded as functions of
no arguments.
 Example: ∀x(P(x) → Q(x,c))
 Consider the formula ∃y∀xP(x,y) →
∀x∃yP(x,y). Let the domain be the
set of people, and let P(x,y) be “x
loves y”.
 The formula then is interpreted as
“if there exists y such that for all x, x
loves y, then for all x, there exists y
such that x loves y.” In other words,
if there is someone that everyone
loves, then everyone loves someone.
 The formula is true under this
interpretation.
 In fact this formula is true under all
interpretations, and is a valid formula.
 Consider this formula: ∀x∃yP(x,y) →
∃y∀xP(x,y). Under the same
interpretation, this formula becomes “If
for all x, there exists y such that x loves
y, then there exists y such that for all x,
x loves y.”
 In other words, if everyone loves
someone, then there is someone that
everyone loves.
 This formula is false under this
interpretation and is not a valid formula.
Clauses
 An atom is a predicate symbol followed
by arguments, as in P(a, f(x)).
 A literal is an atom or its negation, as in
¬P(a,f(x)).
 A clause is a disjunction of literals,
often written as a set.
 Example: {p(x), ¬p(f(x))} for p(x) ∨
¬p(f(x))
 A conjunction of clauses is also written
as a set, as in {C1, C2, C3} signifying C1 ∧
C2 ∧ C3.
Substitutions
 A substitution θ is an assignment
of terms to variables.
 If C is a clause then Cθ is C with
the substitution applied uniformly.
 Thus {P(x)}{x ↦ f(a)} is {P(f(a))}.
 Cθ is called an instance of C. If Cθ
has no variables, it is called a
ground instance of C.
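Applying a substitution is a simple recursive walk over a term. The sketch below assumes one particular encoding that is not from the talk: a variable is a plain string, and an atom, function application or constant is a tuple whose first element is the symbol (so the constant a is `("a",)`). The names `apply_subst` and `is_ground` are illustrative.

```python
def apply_subst(term, theta):
    """Apply substitution theta (dict from variable to term) uniformly."""
    if isinstance(term, str):                    # a variable
        return theta.get(term, term)
    symbol, *args = term                         # symbol applied to arguments
    return (symbol, *[apply_subst(a, theta) for a in args])

def is_ground(term):
    """A term is ground when it contains no variables."""
    if isinstance(term, str):
        return False
    return all(is_ground(a) for a in term[1:])

# The slide's example: {P(x)} with x replaced by f(a) is {P(f(a))}
clause = [("P", "x")]
theta = {"x": ("f", ("a",))}
instance = [apply_subst(lit, theta) for lit in clause]
```

Here `instance` is a ground instance of `clause` because no variable remains after the substitution is applied.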
Semantics
 Gelernter 1959 Geometry Theorem
Prover
 Adapt semantics to clause form:
 An interpretation (semantics) I is an
assignment of truth values to
literals so that I assigns opposite
truth values to L and ¬L for atoms L.
 The literals L and ¬L are said to be
complementary.
Semantics
 We write I ⊨ C (I satisfies C) to
indicate that semantics I makes the
clause C true.
 If C is a ground clause then I satisfies
C if I satisfies at least one of its
literals.
 Otherwise I satisfies C if I satisfies all
ground instances D of C. (Herbrand
interpretations.)
 If I does not satisfy C then we say I
falsifies C.
Example Semantics
 Specify I by interpreting symbols
 Interpret predicate p(x,y) as x = y
 Interpret function f(x,y) as x + y
 Interpret a as 1, b as 2, c as 3
 Then p(f(a,b),c) interprets to TRUE
but p(a,b) interprets to FALSE
 Thus I satisfies p(f(a,b),c) but I
falsifies p(a,b)
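The example semantics can be written down directly as a table of meanings plus a recursive evaluator. In this sketch the encoding is an assumption: a constant is a string and any other expression is a tuple `(symbol, arg1, ...)`; `INTERP` and `evaluate` are hypothetical names.

```python
# Meanings of the symbols, as given on the slide.
INTERP = {
    "p": lambda x, y: x == y,   # predicate p(x,y) means x = y
    "f": lambda x, y: x + y,    # function f(x,y) means x + y
    "a": 1, "b": 2, "c": 3,     # constants
}

def evaluate(expr):
    """Evaluate a ground atom or term under INTERP."""
    if isinstance(expr, str):                    # a constant
        return INTERP[expr]
    symbol, *args = expr
    return INTERP[symbol](*[evaluate(arg) for arg in args])

first = evaluate(("p", ("f", "a", "b"), "c"))    # 1 + 2 = 3
second = evaluate(("p", "a", "b"))               # 1 = 2
```

With this interpretation `first` is true and `second` is false, matching the statement that I satisfies p(f(a,b),c) and falsifies p(a,b).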
Obtaining Semantics
 Humans using mathematical
knowledge
 Automatic methods (finite
models)
 Trivial semantics
Herbrand’s Theorem
 A set S of clauses is unsatisfiable if
and only if there is a finite unsatisfiable
set T of ground instances of S.
 The basis of uniform proof
procedures.
 Example: S = {{p(a)}, {¬p(x),
p(f(x))}, {¬p(f(f(a)))}}
 T = {{p(a)}, {¬p(a), p(f(a))},
{¬p(f(a)), p(f(f(a)))}, {¬p(f(f(a)))}}
 Here {¬p(x), p(f(x))} contributes the two
ground instances with x = a and x = f(a);
{p(a)} and {¬p(f(f(a)))} are already ground.
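Herbrand's theorem suggests a simple (and very inefficient) proof procedure: generate ground instances level by level and test each finite set propositionally. The sketch below hard-codes the example, assuming S consists of p(a), ¬p(x) ∨ p(f(x)) and ¬p(f(f(a))); ground atoms are plain strings, a literal is a (sign, atom) pair, and all function names are illustrative.

```python
from itertools import product

def ground_terms(depth):
    """The ground terms a, f(a), f(f(a)), ... up to the given nesting depth."""
    terms, t = [], "a"
    for _ in range(depth + 1):
        terms.append(t)
        t = f"f({t})"
    return terms

def ground_instances(depth):
    """Ground instances of S, instantiating x with terms up to `depth`."""
    T = [{(True, "p(a)")}, {(False, "p(f(f(a)))")}]
    for t in ground_terms(depth):
        T.append({(False, f"p({t})"), (True, f"p(f({t}))")})
    return T

def unsatisfiable(clauses):
    """Brute-force check: no valuation of the ground atoms satisfies all clauses."""
    atoms = sorted({atom for c in clauses for _, atom in c})
    for values in product([True, False], repeat=len(atoms)):
        val = dict(zip(atoms, values))
        if all(any(val[atom] == sign for sign, atom in c) for c in clauses):
            return False
    return True

level0 = unsatisfiable(ground_instances(0))      # only x = a
level1 = unsatisfiable(ground_instances(1))      # x = a and x = f(a)
```

Instantiating x with a alone leaves a satisfiable set, while adding the instance for x = f(a) gives the unsatisfiable set T shown on the slide.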
Criteria to evaluate
provers
 Don’t know versus don’t care
nondeterminism
 Clauses generated by need or
possibility
 Instantiation by unification or by
semantics or neither
 Clauses selected by semantics
 Goal sensitivity
 Space versus time
Resolution Principle
 Steps for resolution refutation proofs
 Put the premises or axioms into clause
form.
 Add the negation of what is to be proved, in
clause form, to the set of axioms.
 Resolve these clauses together, producing
new clauses that logically follow from them.
 Produce a contradiction by generating the
empty clause.
 This is possible if and only if the theorem is
valid. (Completeness)
 Prove that “Fido will die.” from the
statements “Fido is a dog.”,
“All dogs are animals.”
and “All animals will die.”
 Changing premises to predicates
 (∀X) (dog(X) → animal(X))
 dog(fido)
 Modus Ponens and {fido/X}
 animal(fido)
 (∀Y) (animal(Y) → die(Y))
 Modus Ponens and {fido/Y}
 die(fido)
 Equivalent Reasoning by Resolution
 Convert predicates to clause form
Predicate form                    Clause form
1. (∀X) (dog(X) → animal(X))      ¬dog(X) ∨ animal(X)
2. dog(fido)                      dog(fido)
3. (∀Y) (animal(Y) → die(Y))      ¬animal(Y) ∨ die(Y)
 Negate the conclusion
4. ¬die(fido)                     ¬die(fido)
 Equivalent Reasoning by
Resolution (continued)
¬dog(X) ∨ animal(X) with ¬animal(Y) ∨ die(Y), under {Y/X}, gives ¬dog(Y) ∨ die(Y)
¬dog(Y) ∨ die(Y) with dog(fido), under {fido/Y}, gives die(fido)
die(fido) with ¬die(fido) gives □
Resolution proof for the “dead dog” problem


 Skolemization
 Skolem constant
 (∃X)(dog(X)) may be replaced by dog(fido)
where the name fido is picked from the
domain of definition of X to represent that
individual X.
 Skolem function
 If the predicate has more than one
argument and the existentially quantified
variable is within the scope of universally
quantified variables, the existential variable
must be a function of those other variables.
 (∀X)(∃Y)(mother(X,Y)) becomes
(∀X)mother(X,m(X))
 (∀X)(∀Y)(∃Z)(∀W)(foo(X,Y,Z,W)) becomes
(∀X)(∀Y)(∀W)(foo(X,Y,f(X,Y),W))
 Resolution on the predicate calculus
 A literal and its negation in parent
clauses produce a resolvent only if they
unify under some substitution θ. θ is
then applied to the resolvent before
adding it to the clause set.
 C1 = ¬dog(X) ∨ animal(X)
C2 = ¬animal(Y) ∨ die(Y)
Resolvent: ¬dog(Y) ∨ die(Y) {Y/X}
 C1 = p(X) ∨ q(f(X)), C2 = ¬q(Y) ∨ r(g(Y))
Resolvent: p(X) ∨ r(g(f(X))) {f(X)/Y}
 “Lucky student”
1. Anyone passing his history exams
and winning the lottery is happy
 ∀X (pass(X,history) ∧ win(X,lottery) →
happy(X))
2. Anyone who studies or is lucky can
pass all his exams.
 ∀X ∀Y (study(X) ∨ lucky(X) → pass(X,Y))
3. John did not study but he is lucky
 ¬study(john) ∧ lucky(john)
4. Anyone who is lucky wins the
lottery.
 ∀X (lucky(X) → win(X,lottery))
 Clause forms of “Lucky student”
1. ¬pass(X,history) ∨ ¬win(X,lottery) ∨
happy(X)
2. ¬study(Y) ∨ pass(Y,Z)
¬lucky(W) ∨ pass(W,V)
3. ¬study(john)
lucky(john)
4. ¬lucky(V) ∨ win(V,lottery)
5. Negate the conclusion “John is
happy”
¬happy(john)
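 Resolution refutation for the
“Lucky Student” problem
1. ¬pass(X,history) ∨ ¬win(X,lottery) ∨ happy(X) with win(U,lottery) ∨ ¬lucky(U), under {U/X}, gives ¬pass(U,history) ∨ happy(U) ∨ ¬lucky(U)
2. ¬pass(U,history) ∨ happy(U) ∨ ¬lucky(U) with ¬happy(john), under {john/U}, gives ¬pass(john,history) ∨ ¬lucky(john)
3. ¬pass(john,history) ∨ ¬lucky(john) with lucky(john), under {}, gives ¬pass(john,history)
4. ¬pass(john,history) with ¬lucky(V) ∨ pass(V,W), under {john/V, history/W}, gives ¬lucky(john)
5. ¬lucky(john) with lucky(john), under {}, gives □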
Evaluating resolution
 Clauses generated by possibility
(bad)
 Don’t care nondeterminism (good)
 Unification based (good?)
 No semantics (bad)
 Uses a large amount of space (bad)
 Often not goal sensitive (bad)
Refinements
 Many refinements of resolution
have been developed in an attempt
to improve its performance
 Set of support
 Hyper resolution
 Ancestry filter form
 Unit preference
 …
Otter
PROBLEM        SEC    CLAUSES   KEPT
LCL064-1.in    0.14   1080844   8604
LCL064-2.in    0.00   9448      1954
LCL065-1.in    0.00   2992      653
LCL066-1.in    0.00   1452      306
LCL067-1.in    0.14   492984    9283
LCL068-1.in    0.29   569577    9593
LCL069-1.in    0.00   3577      288
LCL070-1.in    0.14   427166    8840
LCL071-1.in    0.29   449389    8941
LCL072-1.in    0.00   161139    6280
Hyper Linking
 Separates instantiation and
inference
 Given S, selects clauses C and D in S
and literals L in C and M in D, and
generates instances C’ and D’ so that
L’ and M’ are complementary. Then
C’ and D’ are added to S.
 Periodically S is tested for
unsatisfiability using DPLL.
Hyper Linking
Problem   Input Clauses   OTTER (sec)   Hyper Linking
Ph5       45              38606.76      1.8
Ph9       297             >24 hrs       2266.6
Latinsq   16              >24 hrs       56.4
Salt      44              1523.82       28.0
Zebra     128             >24 hrs       866.2
 Eliminating Duplication with the
Hyper-Linking Strategy, Shie-Jue
Lee and David A. Plaisted,
Journal of Automated Reasoning
9 (1992) 25-42.
Later propositional
strategies
 Billon’s disconnection calculus,
derived from hyper-linking
 Disconnection calculus theorem
prover (DCTP), derived from
Billon’s work
 FDPLL
Performance of DCTP on
TPTP, 2003
 First in EPS and EPR (largely
propositional)
 Third in FNE (first-order, no
equality) solving same number as
best provers
 Fourth in FOF and FEQ (all first-
order formulae, and formulae with
equality)
 Not tuned to 50 categories!
Definition Detection

Problem   OSHL Time   Otter Time   Otter Clauses
P1        0.3         0.03         51
P2        2.3         1000+        41867
P3        11.25       1000+        27656
P4        1.35        1000+        105244
P5        2.0         1000+        54660
 Replacement Rules with
Definition Detection, David A.
Plaisted and Yunshan Zhu, in
Caferra and Salzer, eds.,
Automated Deduction in
Classical and Non-Classical
Logics, LNAI 1761 (1998) 80-94.
Structure of OSHL
 Goal sensitivity if semantics chosen
properly
 Choose initial semantics to satisfy axioms
 Use of natural semantics
 For group theory problems, can specify a
group
 Sequential search through possible
interpretations
 Thus similar to Davis and Putnam’s method
 Propositional Efficiency

 Constructs a semantic tree


Ordered Semantic Hyperlinking (Oshl)

 Reduce first-order logic problem to
propositional problem
 Imports propositional efficiency into
first-order logic
 The algorithm
 Imposes an ordering on clauses
 Progresses by generating instances and
refining interpretations

Interpretations: I0, I1, I2, I3, …
Instances: D0, D1, D2, … until T is unsatisfiable
OSHL
 I0 is specified by the user
 Di is chosen so that Ii falsifies Di
 Di is an instance of a clause in S
 Ii is chosen so that Ii satisfies Dj for
all j < i
 Let Ti be {D0,D1, …, Di-1}.
 Ii falsifies Di but satisfies Ti
 When Ti is unsatisfiable OSHL stops
and reports that S is unsatisfiable.


Rules of OSHL
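(C1,C2, …, Cn), D minimal contradict I
--------------------------------------
(C1,C2, …, Cn,D)

(C1,C2, …, Cn), Cn not needed
--------------------------------------
(C1,C2, …, Cn-1,D)

(C1,C2, …, Cn,D), max resolution possible
--------------------------------------
(C1,C2, …, Cn-1,res(Cn,D,L))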
Example
()
({-p1,-p2,-p3})
({-p1,-p2,-p3},{-p4,-p5,-p6})
({…},{…},{-p7})
({…},{…},{-p7},{p3,p7})
({…},{-p4,-p5,-p6},{p3})
({-p1,-p2,-p3},{p3})
({-p1,-p2})
Number of Clauses
Generated
Problem     #clauses, Otter   Oshl+semantics
GRP005-1    57                3
GRP006-1    62                7
GRP007-1    85                22
GRP018-1    266               16
GRP019-1    267               15
GRP020-1    265               18
GRP021-1    264               19
GRP023-1    79                22
GRP032-3    83                14
GRP034-3    141               30
GRP034-4    222               6
GRP042-2    21                15
GRP043-2    80                81
GRP136-1    0                 8
GRP137-1    0                 8
Engineering Issue
 OSHL generates about 10 clauses
per second
 Otter generates more than a
million clauses per second
 A factor of 100,000 in
engineering!
 Need to look at search space
sizes rather than times
Evaluating OSHL
 Clauses generated by need (good)
 Don’t care nondeterminism (good)
 Instantiates using semantics (good)
 Goal sensitive (good)
 Space efficient (good)
 No unification (bad?)
 Need for more engineering
 TPTP library by Geoff Sutcliffe &
Christian Suttner
 Thousands of problems for theorem
provers
 Used to benchmark first order theorem
provers
 Contains 6973 theorems at present

 CASC competition by Sutcliffe et al.


 Every year: who has the fastest/most
accurate first order theorem prover on
the planet?
 Uses blind test from the TPTP library
 Current champion: Vampire
 By Voronkov and Riazanov in Manchester
CADE System
Competition
 The issue of 50 categories
 The 300 seconds issue
Summary
 Efficiency of DPLL
 First-Order Theorem Proving
 Resolution
 Propositional Approaches
 Clause Linking
 DCTP and the CADE Competition
 Semantics
 OSHL
