Foundations of LP
Marin Mircea
April 2021
Contents
1 Notions specific to Logic Programming
2 Unification
2.1 Unifiers
2.2 The Martelli-Montanari algorithm
3 SLD-resolution
3.1 Selection rules
3.1.1 Successful SLD-trees
3.1.2 Finitely failed SLD-trees
3.2 Concluding remarks
4 Programming in Prolog
4.1 SLDNF-resolution
4.2 Concluding remarks
1 Notions specific to Logic Programming
• An atom is a formula p(t1, …, tn) where p is a predicate symbol with
arity n and t1, …, tn are terms.
• A literal is either an atom or the negation of an atom. A literal is positive
if it is an atom, and negative otherwise.
• A query is a formula $\bigwedge_{i=1}^{p} A_i$
where A1, …, Ap are atoms.
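• A clause is a formula H ← B
where H is an atom and $B = \bigwedge_{i=1}^{p} B_i$ is a (possibly empty) conjunction of
atoms. H is called the head, and B is the body of the clause. In Prolog, a
clause with a non-empty body is written
H :- B1, …, Bp.
and a clause with an empty body, called a fact, is written
H.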
• A query $\bigwedge_{i=1}^{p} A_i$ is written after the ?- prompt of Prolog, as follows:
?- A1, …, Ap.
where the variables are recognized because they start with an uppercase
letter, and all of them are existentially quantified.
Prolog programs are saved in text files with extension .pl.
For example, the following program contains some facts about parent-child
relations and clauses that define the relations parent and siblings:
father(john,jack).
father(john,bob).
mother(mary,jack).
mother(ana,ray).
parent(X,Y) :- father(X,Y).
parent(X,Y) :- mother(X,Y).
siblings(X,Y) :- parent(Z,X),parent(Z,Y),X \= Y.
Note the declarative interpretation of the last clause: “for all X, Y, Z: X and Y are
siblings if Z is a parent of X, Z is a parent of Y, and X and Y are different.”
Although all variables in a clause are universally quantified, it is often more
natural to read variables in the conditions that are not in the conclusion as
existentially quantified with the body of the rule as their scope. For example,
the following interpretation is equivalent to the previous one: “X and Y are
siblings if there is a Z who is a parent of both X and Y, and X and Y are different.”
A possible query could be
?- parent(X,jack).
X = john ;
X = mary.
with the declarative reading “Prove that there exists X which is the parent
of jack.” Prolog uses a proof method called SLDNF-resolution that proves
queries in a constructive way. “Constructive” means here that the values of the
variables are in effect computed. In the previous example, the answer is not a
simple "yes" but a constructive one, like “X = john” or “X = mary”.
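The constructive nature of answers can be tried out directly. The sketch below repeats the family program and adds one helper, parents_of/2, which is a hypothetical name introduced here (it is not part of the original program) and simply collects all computed answers of a query with the standard findall/3.

```prolog
% Family program from the text.
father(john, jack).
father(john, bob).
mother(mary, jack).
mother(ana, ray).

parent(X, Y) :- father(X, Y).
parent(X, Y) :- mother(X, Y).

siblings(X, Y) :- parent(Z, X), parent(Z, Y), X \= Y.

% parents_of(+Child, -Parents): hypothetical helper that gathers every
% computed answer X of the query parent(X, Child) into a list.
parents_of(Child, Parents) :-
    findall(X, parent(X, Child), Parents).
```

With this file loaded, ?- parents_of(jack, Ps). gives Ps = [john, mary], the same two answers as above, and ?- siblings(jack, Y). has Y = bob as its only answer.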
Logic programs and queries. A logic program is the knowledge base that
represents what we know about the world, and a query is a formula that
we want to prove from a program in a constructive way.
SLD-resolution: a proof method built upon unification that allows us to prove
queries from a program in a constructive way.
Semantics, which gives a meaning to logic programs and queries. The relationships
between the proof method of SLD-resolution and the semantics (of programs and
queries) are clarified by the so-called soundness and completeness properties
of SLD-resolution.
2 Unification
In logic programming, variables represent unknown values, like in mathematics.
The values assigned to variables are terms, and the assignment of terms to
variables is by means of substitutions called most general unifiers. The
process of computing most general unifiers is called unification.
From now on we consider terms from T(ℱ, 𝒳) and write Var(t) for the set of
variables occurring in a term t. If Var(t) = ∅ we say that t is a ground term.
A substitution is a mapping θ from a finite set of variables Dom(θ) ⊆ 𝒳
to terms, which assigns to each variable X ∈ Dom(θ) a term t different from
X. In this lecture, we write a substitution θ as {X1 → t1, …, Xn → tn} where
• X1, …, Xn are the different variables in its domain Dom(θ),
• for every 1 ≤ i ≤ n, ti is the term θ(Xi). By definition, ti ≠ Xi.
Every pair Xi → ti is called a binding of θ, and we say that Xi is bound to
ti. We denote by Range(θ) the set of terms {t1, …, tn}, by Ran(θ) the set of
variables with occurrences in t1, …, tn, and let Var(θ) = Dom(θ) ∪ Ran(θ).
When n = 0, θ becomes the empty mapping, which is called the empty substitution
and is denoted by ε.
Consider a substitution θ = {X1 → t1, …, Xn → tn}. If all terms t1, …, tn
are ground then θ is called a ground substitution, and if all t1, …, tn are variables
then θ is called a pure variable substitution. When Dom(θ) = Range(θ), θ
is a bijective mapping from its domain to itself, and is called a renaming. For
example, the substitutions ε and {X → Y, Y → Z, Z → X} are renamings, but
{X → Y, Y → Z} is not a renaming.
Substitutions can be applied to terms. The result of applying a substitution
θ to a term t, written as tθ, is the result of simultaneous replacement of
each occurrence in t of a variable from Dom(θ) by the corresponding term in
Range(θ). Formally, this operation is defined by induction on the structure of
the term t:
• if X ∈ 𝒳 then Xθ = θ(X) if X ∈ Dom(θ), and Xθ = X otherwise;
• if t = f(t1, …, tn) then tθ = f(t1θ, …, tnθ).
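The inductive definition of tθ can be mirrored by a short Prolog predicate. This is only a sketch under stated assumptions: a substitution is represented as a list of Var-Term pairs, and the names apply_subst/3, apply_all/3 and binding/3 are hypothetical helpers chosen for this illustration.

```prolog
% apply_subst(+Term, +Subst, -Result)
% Subst is a list of Var-Term pairs standing for {X1 → t1, ..., Xn → tn}.
apply_subst(T, Subst, R) :-
    var(T), !,
    (   binding(T, Subst, U)
    ->  R = U                      % X in Dom(θ): Xθ = θ(X)
    ;   R = T                      % otherwise Xθ = X
    ).
apply_subst(T, Subst, R) :-
    T =.. [F|Args],                % f(t1, ..., tn); constants have n = 0
    apply_all(Args, Subst, NewArgs),
    R =.. [F|NewArgs].

% apply_all(+Terms, +Subst, -Results): apply Subst to each argument.
apply_all([], _, []).
apply_all([T|Ts], Subst, [R|Rs]) :-
    apply_subst(T, Subst, R),
    apply_all(Ts, Subst, Rs).

% binding(+Var, +Subst, -Term): look Var up by identity (==),
% not by unification, so that no variable gets bound by the search.
binding(X, [Y-T|_], T) :- X == Y, !.
binding(X, [_|Subst], T) :- binding(X, Subst, T).
```

For example, ?- apply_subst(f(X, g(Y), Z), [X-a, Y-h(X)], R). yields R = f(a, g(h(X)), Z). The X inside h(X) is left alone, which is what "simultaneous replacement" means: the bindings are not applied to each other's results.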
X1 → t1 η, . . . , Xn → tn η, Y1 → s1 , . . . , Ym → sm
1. For every renaming θ there exists exactly one substitution θ⁻¹ such that
θθ⁻¹ = θ⁻¹θ = ε. Moreover, θ⁻¹ is also a renaming.
2. If θη = ε then θ and η are renamings.
Variants have the following remarkable properties:
2.1 Unifiers
Informally, unification is the process of making terms identical by means of
certain substitutions. For example, the terms f(a, Y, Z) and f(X, b, Z) can be
made identical by applying to them the substitution {X → a, Y → b}: both
sides then become f(a, b, Z). But the substitution {X → a, Y → b, Z → a} also
makes these two terms identical. Such substitutions are called unifiers. The
first unifier is preferable because it is more general: the second one is a special
case of the first one, since {X → a, Y → b, Z → a} = {X → a, Y → b}{Z → a}.
The following definition clarifies this difference.
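Let s, t ∈ T(ℱ, 𝒳). A substitution θ is called
• a unifier of s and t if sθ = tθ,
• a most general unifier (mgu) of s and t if it is a unifier of s and t and for
every unifier η of s and t there exists a substitution γ such that η = θγ,
• a strong mgu of s and t if it is a unifier of s and t and η = θη for every
unifier η of s and t.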
Intuitively, an mgu is a substitution which makes two terms equal but which
does it in a “most general way”, without unnecessary bindings. So θ is an mgu
if every other unifier η is of the form η = θγ for some γ. An mgu θ is strong
if for every unifier η, the substitution γ for which η = θγ holds can always be
chosen to be η itself.
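The difference between a unifier and a most general one can be observed in Prolog itself. The sketch below uses the built-in =/2 to unify the two example terms and the ISO built-in subsumes_term/2 to compare generality; the predicate names common_instance/1 and more_general/2 are hypothetical and only serve this illustration.

```prolog
% common_instance(-T): unify f(a, Y, Z) and f(X, b, Z) with =/2.
% T is the term that both sides become.
common_instance(T) :-
    T = f(a, _Y, Z),
    T = f(_X, b, Z).

% more_general(+General, +Special): Special is an instance of General,
% that is, Special is General with some substitution applied.
more_general(General, Special) :-
    subsumes_term(General, Special).
```

The query ?- common_instance(T). answers with T = f(a, b, Z') for some unbound variable Z', which corresponds to the unifier {X → a, Y → b}. The query ?- more_general(f(a, b, Z), f(a, b, a)). succeeds, while ?- more_general(f(a, b, a), f(a, b, Z)). fails, in line with the remark that the unifier that also binds Z is a special case.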
The problem of deciding whether two terms are unifiable is called the unification
problem. We solve this problem by providing an algorithm that terminates
with failure if the terms are not unifiable and that otherwise produces
a strong mgu. In general, two terms may fail to unify for two reasons:
1. Two terms starting with different function symbols cannot unify. For
example, f(g(X, a), X) and f(g(X, b), b) are not unifiable because the
corresponding subterms a and b start with different function symbols.
2. A variable X cannot be unified with a term t ≠ X with X ∈ Var(t).
For example, g(X, a) and g(f(X), a) are not unifiable because the corresponding
subterms X and f(X) cannot be unified for this reason.
In the literature, this reason is called occur-check failure.
Each possibility can occur at some “inner level” of the considered two terms.
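Both reasons for failure can be checked with the ISO built-in unify_with_occurs_check/2. The wrapper below is a small sketch; the name unifiable_oc/2 is a hypothetical helper, not a library predicate.

```prolog
% unifiable_oc(+S, +T): S and T are unifiable when the occur-check
% is enforced. The double negation tests unifiability without
% leaving any variable bindings behind.
unifiable_oc(S, T) :-
    \+ \+ unify_with_occurs_check(S, T).
```

With this definition, ?- unifiable_oc(f(g(X, a), X), f(g(X, b), b)). fails because of the clash between a and b, and ?- unifiable_oc(g(X, a), g(f(X), a)). fails because of the occur-check, while ?- unifiable_oc(f(a, Y, Z), f(X, b, Z)). succeeds. Note that the ordinary =/2 of most Prolog systems omits the occur-check by default for efficiency, so it may behave differently on the second pair.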
1) f(s1, …, sn) = f(t1, …, tn).
Replace it with the equations s1 = t1, …, sn = tn.
2) f(s1, …, sn) = g(t1, …, tm) with f ≠ g.
Halt with failure.
3) X = X.
Delete the equation.
4) t = X where t ∉ 𝒳.
Replace it with X = t.
5) X = t where X ∉ Var(t) and X occurs elsewhere.
Perform the substitution {X → t} on all other equations.
6) X = t where X ∈ Var(t) and X ≠ t.
Halt with failure.
• If the algorithm terminates with failure, the terms s and t are not unifiable.
• Otherwise, the algorithm stops with a system of equations of the form
{X1 = t1, …, Xn = tn} where the variables Xi are distinct and none of
them occurs in a term tj.
In this case, {X1 → t1, …, Xn → tn} is a strong mgu of s and t.
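The rule-based algorithm can be sketched in Prolog over a list of equations. This is an illustration under assumptions, not the formulation of [MM82]: object-level variables are represented by Prolog variables, so binding a variable with =/2 plays the role of performing {X → t} on all other equations, and the decomposition rule (f(s1, …, sn) = f(t1, …, tn) becomes s1 = t1, …, sn = tn) and the occur-check failure rule of the standard formulation are written out explicitly. The names mm_unify/2, solve/1, occurs_in/2 and pair_up/3 are hypothetical.

```prolog
:- use_module(library(lists)).

% mm_unify(+S, +T): succeeds iff S and T are unifiable, binding their
% variables according to an mgu; fails otherwise.
mm_unify(S, T) :-
    solve([S = T]).

% solve(+Equations): process the list of equations one at a time.
solve([]).
solve([S = T|Es]) :-
    (   var(S), var(T), S == T      % rule 3: X = X, delete the equation
    ->  solve(Es)
    ;   nonvar(S), var(T)           % rule 4: t = X becomes X = t
    ->  solve([T = S|Es])
    ;   var(S)                      % X = t with t different from X
    ->  \+ occurs_in(S, T),         % occur-check: fail if X occurs in t
        S = T,                      % rule 5: binding X applies {X → t} everywhere
        solve(Es)
    ;   S =.. [F|Ss],               % both sides are non-variable terms
        T =.. [G|Ts],
        F == G,                     % rule 2: different symbols, fail
        pair_up(Ss, Ts, New),       % decomposition; fails if arities differ
        append(New, Es, Es1),
        solve(Es1)
    ).

% occurs_in(+X, +T): the variable X occurs in the term T.
occurs_in(X, T) :-
    term_variables(T, Vs),
    member(V, Vs),
    V == X, !.

% pair_up(+Ss, +Ts, -Equations): argument-wise equations Si = Ti.
pair_up([], [], []).
pair_up([S|Ss], [T|Ts], [S = T|Es]) :-
    pair_up(Ss, Ts, Es).
```

For instance, ?- mm_unify(f(a, Y, Z), f(X, b, Z)). succeeds with X = a and Y = b and leaves Z unbound, whereas ?- mm_unify(g(X, a), g(f(X), a)). fails on the occur-check.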
Examples:
1. f(g(X, a), X) and f(g(X, b), b) are not unifiable because the algorithm
eventually reaches the equation a = b and halts with failure by rule 2.
3 SLD-resolution
The computation process within the logic programming framework can be explained
as follows. A program P can be viewed as a set of axioms and a query
Q as a request to find an instance Qθ of it which follows from P. A successful
computation yields such a θ and can be viewed as a proof of Qθ from P.
The computation is a sequence of basic steps. Each basic step selects an
atom Ai in the current query $\bigwedge_{i=1}^{p} A_i$ and a clause H ← B in the program
P. If Ai unifies with H, then the next query is obtained by replacing Ai by
the clause body B and by applying to the outcome an mgu of Ai and H. The
computation terminates successfully when the empty query is produced. θ is
then the composition of the mgus used.
Thus logic programs compute through a combination of two mechanisms:
replacement and unification. To understand better various fine points of
this computation, let’s introduce a few auxiliary notions.
Let P be a program, $Q = \bigwedge_{A \in \mathcal{A}} A$ a non-empty query, H ← B a
variant of c ∈ P such that Var(H ← B) ∩ Var(Q) = ∅, and θ an mgu of H
and $A_0 \in \mathcal{A}$.
The query
$$Q' = \Bigl(B \wedge \bigwedge_{A \in \mathcal{A} - \{A_0\}} A\Bigr)\theta$$
is called a resolvent of Q and c with respect to A0.
theorem proving that works backwards, from conclusion to hypotheses from which it can
follow.
Instantiation of the query and the clause by an mgu of the selected atom and
the head of the clause.
Replacement of the instance of the selected atom by the instance of the body
of the clause.
sum(X,0,X) ← true. % c1
sum(X,s(Y),s(Z)) ← sum(X,Y,Z). % c2
The query sum(s(s(0)),s(s(0)),Z) asks for the value of Z which is the sum
of numerals s(s(0)) and s(s(0)). The SLD-derivation of this query is
$$
\begin{aligned}
\mathit{sum}(s(s(0)), s(s(0)), Z) &\Rightarrow_{\theta_1, c_2} \mathit{sum}(s(s(0)), s(0), Z_1) \\
&\Rightarrow_{\theta_2, c_2} \mathit{sum}(s(s(0)), 0, Z_2) \\
&\Rightarrow_{\theta_3, c_1} \mathit{true}
\end{aligned}
$$
with the mgus
θ1 = {X1 → s(s(0)), Y1 → s(0), Z → s(Z1)},
θ2 = {X2 → s(s(0)), Y2 → 0, Z1 → s(Z2)},
θ3 = {X3 → s(s(0)), Z2 → s(s(0))},
where the variants used in these steps are sum(X1, s(Y1), s(Z1)) ← sum(X1, Y1, Z1),
sum(X2, s(Y2), s(Z2)) ← sum(X2, Y2, Z2), and sum(X3, 0, X3) ← true.
The corresponding computed answer substitution is
(θ1θ2θ3)|{Z} = {Z → s(s(s(s(0))))}.
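The derivation can be replayed with a small meta-interpreter, which makes the two mechanisms of instantiation and replacement visible in code. This is a sketch for definite programs only, assuming the leftmost selection rule; prove/1 is a hypothetical name, and the dynamic declaration is there so that the standard clause/2 is allowed to inspect the clauses of sum/3.

```prolog
:- dynamic sum/3.          % lets clause/2 inspect the program clauses

sum(X, 0, X).                          % c1
sum(X, s(Y), s(Z)) :- sum(X, Y, Z).    % c2

% prove(+Query): SLD-resolution with the leftmost selection rule.
prove(true) :- !.                      % empty query: success
prove((A, Rest)) :- !,                 % conjunction: treat the leftmost atom first
    prove(A),
    prove(Rest).
prove(A) :-
    clause(A, Body),                   % fresh variant of a clause whose head unifies with A
    prove(Body).                       % continue with the instantiated body
```

The query ?- prove(sum(s(s(0)), s(s(0)), Z)). returns Z = s(s(s(s(0)))), the computed answer obtained above. Each call to clause/2 works on a renamed copy of a program clause, which corresponds to choosing a variant that shares no variables with the query.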
• Its branches are SLD-derivations of P ∪ {Q} via R.
• Every node Q with selected atom A has exactly one descendant for every
clause c from P which is applicable to A. This descendant is a resolvent
of Q and c with respect to A.
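In the two trees below, each indented line gives the clause and the mgu of one
derivation step, followed by the resulting query.

SLD-tree for the query path(X,c) via the leftmost selection rule:
path(X,c)
  c1, {X1 → X, Z1 → c}: arc(X,Y) ∧ path(Y,c)
    c3, {X → b, Y → c}: path(c,c)
      c1, {X2 → c, Z2 → c}: arc(c,Y2) ∧ path(Y2,c)
        fail
      c2, {X2 → c}: □
  c2, {X → c, X1 → c}: □

SLD-tree for the query path(X,c) via the rightmost selection rule:
path(X,c)
  c1, {X1 → X, Z1 → c}: arc(X,Y) ∧ path(Y,c)
    c1, {X2 → Y, Z2 → c}: arc(X,Y) ∧ arc(Y,Y2) ∧ path(Y2,c)
      c1: ... (infinite subtree)
      c2, {X3 → c, Y2 → c}: arc(X,Y) ∧ arc(Y,c)
        c3, {Y → b}: arc(X,b)
          fail
    c2, {Y → c}: arc(X,c)
      c3, {X → b}: □
  c2, {X → c, X1 → c}: □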
where □ represents the empty clause (true). The first tree is finite while the
second one is infinite. Both trees are successful and contain the same computed
answer substitutions: {X → b} and {X → c}.
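The two computed answers can be reproduced in Prolog, which uses the leftmost selection rule. The program below is an assumption reconstructed from the clause labels c1, c2 and c3 in the trees; answers/1 is a hypothetical helper that collects the answers with findall/3.

```prolog
path(X, Z) :- arc(X, Y), path(Y, Z).   % c1
path(X, X).                            % c2
arc(b, c).                             % c3

% answers(-Xs): all computed answers X for the query path(X, c),
% in the order in which Prolog finds them.
answers(Xs) :-
    findall(X, path(X, c), Xs).
```

The query ?- answers(Xs). gives Xs = [b, c]: the answer {X → b} is found first because the branch through c1 is explored before the branch through c2.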
If the tree is infinite, the computation of the whole tree does not terminate. If
the tree is computed incrementally, by breadth-first traversal, then
3. We decide P ⊢ ∃X1 … ∃Xn. Q as soon as we produce a node for □. In
this case, we can also report P ⊢ Qθ for the computed answer θ of the
SLD-derivation to that node.
Note that, if the tree is infinite, we can decide P ⊢ ∃X1 … ∃Xn. Q only when
a node for □ is produced. If such a node is never produced, the computation of
the tree runs forever and we cannot decide anything.
Breadth-first is a fair traversal strategy because it traverses all nodes of a
tree, even if the tree is infinite. By contrast, depth-first is an unfair traversal
strategy because it does not traverse all nodes of some infinite trees. For example,
the SLD-tree via rightmost atom selection shown earlier
has the shape
o
  o
    o
      o ... (infinite branch)
      o
        o
    o
      o
  o
Breadth-first will generate all nodes of the tree in the order depicted below:
1
  2
    4
      6 ... (infinite branch)
      7
        9
    5
      8
  3
and depth-first will generate only the leftmost branch of the tree (the nodes
marked o are never generated):
1
  2
    3
      4 ... (infinite branch)
      o
        o
    o
      o
  o
Implementation of SLD-derivations in Prolog
The incremental computation of an SLD-tree by breadth-first traversal is fair but
memory-consuming. For this reason, most languages for logic programming,
including Prolog, construct an SLD-tree incrementally, by depth-first traversal,
which consumes much less memory. Since this traversal strategy is unfair, it may
never produce successful derivations (and computed answers) which exist. In
contrast, breadth-first traversal guarantees that all successful derivations (and
computed answers) are eventually produced.
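The contrast between the two traversal strategies can be made concrete with a breadth-first meta-interpreter. This is a minimal sketch for definite programs, not a description of how any Prolog system works: the names bf_prove/1, bf/2 and body_atoms/2 are hypothetical, and nat/1 is a made-up example program whose recursive clause comes first, so that Prolog's own depth-first search would never return from the query nat(N).

```prolog
:- use_module(library(lists)).
:- dynamic nat/1.

% Hypothetical example program: the recursive clause comes first.
nat(s(X)) :- nat(X).
nat(0).

% bf_prove(?Goal): enumerate the answers of Goal breadth-first.
% Each queue entry Instance-Atoms pairs an instance of the original
% goal with the atoms that remain to be proved for it.
bf_prove(Goal) :-
    bf([Goal-[Goal]], Goal).

bf([Instance-Atoms|Queue], Goal) :-
    (   Atoms == []
    ->  (   Goal = Instance            % empty query: report this answer
        ;   bf(Queue, Goal)            % on backtracking, keep searching
        )
    ;   Atoms = [A|As],                % leftmost selection rule
        findall(Instance-NewAtoms,
                ( clause(A, Body),
                  body_atoms(Body, Bs),
                  append(Bs, As, NewAtoms) ),
                Children),             % one resolvent per applicable clause
        append(Queue, Children, Queue1),   % enqueue at the back: breadth-first
        bf(Queue1, Goal)
    ).

% body_atoms(+Body, -Atoms): flatten a clause body into a list of atoms.
body_atoms(true, []) :- !.
body_atoms((A, B), Atoms) :- !,
    body_atoms(A, As),
    body_atoms(B, Bs),
    append(As, Bs, Atoms).
body_atoms(A, [A]).
```

The query ?- bf_prove(nat(N)). produces N = 0 first and further answers on backtracking, because every node of a level is expanded before the next level. The price is the queue, which holds a whole level of the tree at once; this is the memory cost mentioned above.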
4 Programming in Prolog
Prolog is the most popular language used for logic programming. It is designed
to answer queries using a predefined and predictable search strategy
called SLDNF-resolution.
4.1 SLDNF-resolution
SLDNF [AB94] is an extension of SLD-resolution to deal with negation as failure.
This means that Prolog can work with
• general clauses, which are formulas $H \leftarrow \bigwedge_{i=1}^{p} L_i$ where H is an atom and
the Li are literals.
• general queries, which are formulas $\bigwedge_{i=1}^{p} L_i$ where the Li are literals.
The simplified syntax of Prolog can be used to write such clauses and queries too;
the only extension is that we write a literal ¬p(t1, …, tn) as not(p(t1, …, tn)).
For example, we can write the Prolog program
bird(olaf). % c1
penguin(olaf). % c2
fly(X) :- bird(X), not(abnormal(X)). % c3
abnormal(X) :- penguin(X). % c4
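To see negation as failure at work, the program can be loaded and queried. The sketch below writes the negative literal with the ISO operator \+ instead of not/1 and adds one extra fact, bird(tweety), which is a hypothetical addition for this illustration and not part of the program above.

```prolog
bird(olaf).                            % c1
bird(tweety).                          % hypothetical extra fact, not in the text
penguin(olaf).                         % c2
fly(X) :- bird(X), \+ abnormal(X).     % c3, with \+ in place of not/1
abnormal(X) :- penguin(X).             % c4
```

The query ?- fly(olaf). fails, because abnormal(olaf) can be proved and so its negation fails. The query ?- fly(tweety). succeeds, because every attempt to prove abnormal(tweety) fails finitely. The query ?- fly(X). therefore has X = tweety as its only answer.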
1. If θ is an mgu of H and an atom $A_0 \in \mathcal{L}$, then the query
$$Q' = \Bigl(B \wedge \bigwedge_{L \in \mathcal{L} - \{A_0\}} L\Bigr)\theta$$
The query Q = member(1, [1, 2, 1]) is successful because P_member ∪ {Q} has the
following successful tree starting from Q:
member(1,[1,2,1])
  mc1, {X1 → 1}: □
  mc2, {X1 → 1, T1 → [2, 1]}: member(1,[2,1])
    mc2, {X2 → 1, T2 → [1]}: member(1,[1])
      mc1, {X3 → 1}: □
      mc2, {X3 → 1, T3 → []}: member(1,[])
        fail
In fact, Prolog returns true immediately after the generation of the partial
SLDNF-tree
member(1,[1,2,1])
  mc1, {X1 → 1}: □
in response to the query
?- member(1,[1,2,1]).
true.
and will generate the rest of the tree only if the user decides to do so by pressing
;. Because Prolog can find a successful tree of Q, the query ¬member(1, [1, 2, 1])
fails, and Prolog returns false. The corresponding SLDNF-tree is
¬member(1,[1,2,1])
  fail
and the Prolog session is
?- not(member(1,[1,2,1])).
false.
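The behaviour described here can be checked with a user-defined membership predicate. The two clauses below are an assumption about the program P_member, reconstructed from the labels mc1 and mc2 in the tree; the predicate is called my_member/2 to avoid a clash with the library predicate member/2, and successes/1 is a hypothetical helper.

```prolog
% Assumed clauses of P_member (labels taken from the tree).
my_member(X, [X|_]).                       % mc1
my_member(X, [_|T]) :- my_member(X, T).    % mc2

% successes(-N): number of successful derivations of my_member(1, [1,2,1]).
successes(N) :-
    findall(x, my_member(1, [1, 2, 1]), Xs),
    length(Xs, N).
```

The query ?- successes(N). gives N = 2, one for each □ leaf of the full tree, and ?- \+ my_member(1, [1, 2, 1]). fails, since one successful derivation is enough to refute the negated query.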
non-monotonic reasoning. SLDNF-resolution with general clauses and queries
is adequate for AI.
Sources of nondeterminism
The search for solutions of a query Q with respect to a program P is based on
the construction of an SLDNF-tree for P ∪ {Q}. We noticed that there are four
sources of nondeterminism in the computation of a successful SLDNF-derivation
$$Q \Rightarrow_{\theta_1, c_1} Q_1 \cdots \Rightarrow_{\theta_{n-1}, c_{n-1}} Q_{n-1} \Rightarrow_{\theta_n, c_n} \square$$
that produces a computed answer $\theta = (\theta_1 \cdots \theta_n)|_{Var(Q)}$:
(A) choice of the selected atom in the considered query,
(B) choice of the program clause applicable to the selected atom,
(C) choice of the renaming of the program clause used,
(D) choice of the mgu.
(C) and (D) are don’t care sources of nondeterminism: we are free to choose
any mgu and any standardized apart variant of a program clause, because these
choices don’t affect the computation of a complete set of answers for a query.
(B) is a don’t know source of nondeterminism. This means that, if we want
to find all answers of a query, we must consider all program clauses applicable
to the selected atom. That is, we must generate all nodes of the SLDNF-tree.
We can do so with a fair tree traversal strategy, like breadth-first, but Prolog
is based on depth-first traversal, where nodes are generated in the top-down
order of the applicable rules in the program. Therefore, the computation of
some answers may depend on the order in which the clauses are written in the
program. A simple heuristic is:
• Write the program clauses for base cases before the recursive program
clauses.
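The effect of clause order on a depth-first search can be seen on a small example. The sketch below defines the same relation twice, once with the recursive clause first and once with the base case first; the predicate names path_a/2, path_b/2, answers_a/1 and answers_b/1 are hypothetical and chosen only for this comparison.

```prolog
arc(b, c).

% Recursive clause first.
path_a(X, Z) :- arc(X, Y), path_a(Y, Z).
path_a(X, X).

% Base case first, as the heuristic recommends.
path_b(X, X).
path_b(X, Z) :- arc(X, Y), path_b(Y, Z).

% Collect the answers to path(X, c) in the order Prolog finds them.
answers_a(Xs) :- findall(X, path_a(X, c), Xs).
answers_b(Xs) :- findall(X, path_b(X, c), Xs).
```

Here ?- answers_a(Xs). gives Xs = [b, c] and ?- answers_b(Xs). gives Xs = [c, b]. Both searches terminate on this program, so only the order of the answers changes. When the branch through the recursive clause is infinite, the same ordering decides whether the base-case answer is reached at all.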
(A) is a don’t care source of nondeterminism for positive programs and queries,
but is relevant in Prolog, which deals with negation as failure, because some
selection rules guarantee the generation of a finitely failed SLDNF-tree and
others do not. To illustrate, consider the query Q = ¬path(d, c) with respect
to the logic program P below:
path(X,Z) ← arc(X,Y)∧path(Y,Z). % c1
path(X,X) ← true. % c2
arc(b,c) ← true. % c3
P ∪ {Q} has a successful SLDNF-tree via the leftmost selection rule because
path(d, c) has a finitely failed SLDNF-tree via the leftmost selection rule, namely
path(d,c)
  c1, {X1 → d, Z1 → c}: arc(d,Y1) ∧ path(Y1,c)
    fail
but P ∪ {Q} has an infinite unsuccessful SLDNF-tree via the rightmost selection
rule because path(d, c) has an infinite unsuccessful SLDNF-tree via the
rightmost selection rule, namely
path(d,c)
  c1, {X1 → d, Z1 → c}: arc(d,Y1) ∧ path(Y1,c)
    c1, {X2 → Y1, Z2 → c}: arc(d,Y1) ∧ arc(Y1,Y2) ∧ path(Y2,c)
      c1: ... (infinite branch)
      c2, {Y2 → c}: arc(d,Y1) ∧ arc(Y1,c)
        c3, {Y1 → b}: arc(d,b)
          fail
    c2, {Y1 → c}: arc(d,c)
      fail
For (A), Prolog implements the leftmost selection rule. A simple heuristic to
avoid the generation of infinite unsuccessful SLD-trees is:
• Avoid left-recursion, by not placing recursive calls first
in the body of clauses.
To appreciate this heuristic, see what happens if we change program P to be
path(X,Z) ← path(X,Y)∧arc(Y,Z). % c1 is left-recursive
path(X,X) ← true. % c2
arc(b,c) ← true. % c3
and we use the leftmost selection rule of Prolog for the query path(d,c).
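One way to observe the difference without waiting for a stack overflow is to run both versions under a recursion bound. The sketch below assumes SWI-Prolog, whose call_with_depth_limit/3 reports depth_limit_exceeded when the bound was hit; the predicate names path_r/2, path_l/2 and bounded/2, as well as the bound of 1000 frames, are arbitrary choices made for this illustration.

```prolog
arc(b, c).

% Right-recursive version (the original program P).
path_r(X, Z) :- arc(X, Y), path_r(Y, Z).
path_r(X, X).

% Left-recursive version (the modified program).
path_l(X, Z) :- path_l(X, Y), arc(Y, Z).
path_l(X, X).

% bounded(+Goal, -Outcome): run Goal with a recursion bound of 1000
% frames and classify what happened.
bounded(Goal, Outcome) :-
    (   call_with_depth_limit(Goal, 1000, Result)
    ->  (   Result == depth_limit_exceeded
        ->  Outcome = gave_up          % the search ran into the bound
        ;   Outcome = proved
        )
    ;   Outcome = finitely_failed      % Goal failed within the bound
    ).
```

Under these assumptions, ?- bounded(path_r(d, c), O). should report finitely_failed, matching the finitely failed tree shown earlier, while ?- bounded(path_l(d, c), O). is expected to report gave_up: with the leftmost selection rule the left-recursive clause keeps calling path_l(d, Y) with a fresh Y, so the branch never ends on its own.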
References
[AB94] K.R. Apt and R.N. Bol. Logic programming and negation: A survey.
Journal of Logic Programming, 19/20:9–71, 1994.
[Kow74] R.A. Kowalski. Predicate logic as a programming language.
Information Processing, pages 569–574, 1974.
[Kow14] R. Kowalski. Computational logic. In D. Gabbay and J. Woods,
editors, Computational Logic, History of Logic, pages 523–569. Elsevier,
2014.
[MM82] A. Martelli and U. Montanari. An efficient unification algorithm.
ACM Transactions on Programming Languages and Systems, 4:258–282, 1982.
[Rob65] J.A. Robinson. A machine-oriented logic based on the resolution
principle. Journal of the ACM, 12(1):23–41, 1965.