Lecture 9: Foundations of Logic Programming

Marin Mircea
April 2021

Contents
1 Notions specific to Logic Programming

2 Unification
2.1 Unifiers
2.2 The Martelli-Montanari algorithm

3 SLD-resolution
3.1 Selection rules
3.1.1 Successful SLD-trees
3.1.2 Finitely failed SLD-trees
3.2 Concluding remarks

4 Programming in Prolog
4.1 SLDNF-resolution
4.2 Concluding remarks

1 Notions specific to Logic Programming
• An atom is a formula p(t1 , . . . , tn ) where p is a predicate symbol with
arity n and t1 , . . . , tn are terms.
• A literal is either an atom or the negation of an atom. A literal is positive
if it is an atom, and negative otherwise.
• A query is a formula A1 ∧ · · · ∧ Ap where A1 , . . . , Ap are atoms.
• A clause is a formula H ← B where H is an atom and B = B1 ∧ · · · ∧ Bp is a
(possibly empty) conjunction of atoms. H is called the head, and B is the
body of the clause.

• A program is a finite set of clauses.

Programs and queries have two interpretations: declarative and procedural.


• A query A1 ∧ · · · ∧ Ap with variables X1 , . . . , Xn has the declarative interpre-
tation “There exist X1 , . . . , Xn such that A1 and . . . and Ap hold” and
the procedural interpretation “Find the values of X1 , . . . , Xn for which
the formulas A1 , . . . , Ap hold simultaneously.”

• A clause H ← B with variables X1 , . . . , Xn has the declarative interpreta-
tion “for all X1 , . . . , Xn , H holds if B holds” and the procedural interpre-
tation “To prove that H holds, it is sufficient to prove that B holds.”
We say that queries and clauses of this kind are positive because all their
components are atoms, which are positive literals. Later on, we will see that
Prolog can work with more general kinds of queries and clauses.
Prolog uses a simplified syntax to write programs and queries:
• A clause H ← B1 ∧ · · · ∧ Bp is written

H:-B1 , . . . , Bp .

The variables are recognized because they start with an uppercase letter,
and all of them are universally quantified. Clauses H ← B with an empty
body (p = 0) are called facts and are written

H.

• A query A1 ∧ · · · ∧ Ap is written after the ?- prompt of Prolog, as follows:

?- A1 , . . . , Ap .

where the variables are recognized because they start with an uppercase
letter, and all of them are existentially quantified.
Prolog programs are saved in text files with extension .pl.
For example, the following program contains some facts about parent-child
relations and clauses that define the relations

parent(X,Y) with intended reading “X is the parent of Y”, and


siblings(X,Y) with intended reading “X and Y are siblings”.

father(john,jack).
father(john,bob).
mother(mary,jack).
mother(ana,ray).
parent(X,Y) :- father(X,Y).
parent(X,Y) :- mother(X,Y).
siblings(X,Y) :- parent(Z,X),parent(Z,Y),X \= Y.

Note the declarative interpretation of the last clause: “for all X, Y, Z: X and Y are
siblings if Z is the parent of X, Z is the parent of Y, and X and Y are different.”
Although all variables in a clause are universally quantified, it is often more
natural to read the variables that occur in the conditions but not in the conclusion
as existentially quantified, with the body of the rule as their scope. For example,
the following interpretation is equivalent to the previous one: “X and Y are
siblings if there is a Z who is a parent of both X and Y, and X and Y are different.”
A possible query could be
?- parent(X,jack).
X = john ;
X = mary.

with the declarative reading “Prove that there exists X which is the parent
of jack.” Prolog uses a proof method called SLDNF-resolution that proves
queries in a constructive way. “Constructive” means here that the values of the
variables are in effect computed. In the previous example, the answer is not a
simple "yes" but a constructive one, like “X = john” or “X = mary”.

The goal of this lecture is to explain the computational mechanism of Prolog
for the kinds of programs and queries defined before.
The theoretical foundations of logic programming rely on the following concepts:

Unification: the basic mechanism which assigns values to variables in logic
programming.
Logic programs and queries. A logic program is the knowledge base that
represents what we know about the world, and a query is a formula that
we want to prove from a program in a constructive way.
SLD-resolution: a proof method built upon unification that allows us to prove
queries from a program in a constructive way.
Semantics, which gives a meaning to logic programs and queries. The relation-
ships between the proof method of SLD-resolution and the semantics (of
programs and queries) are clarified by the so-called soundness and com-
pleteness properties of SLD-resolution.

2 Unification
In logic programming, variables represent unknown values, like in mathematics.
The values assigned to variables are terms, and the assignment of terms to
variables is by means of substitutions called most general unifiers. The
process of computing most general unifiers is called unification.

Historical note. Unification was defined by J.A. Robinson [Rob65] in
the context of automated theorem proving. Its use for computing is due to
R. Kowalski [Kow74].

From now on we consider terms from T (F, X ) and write V ar(t) for the set of
variables occurring in a term t. If V ar(t) = ∅ we say that t is a ground term.
A substitution is a mapping θ from a finite set of variables Dom(θ) ⊆ X
to terms, which assigns to each variable X ∈ Dom(θ) a term t different from
X. In this lecture, we write a substitution θ as {X1 → t1 , . . . , Xn → tn } where
• X1 , . . . , Xn are the different variables in its domain Dom(θ),
• for every 1 ≤ i ≤ n, ti is the term θ(Xi ). By definition, ti ≠ Xi .
Every pair Xi → ti is called a binding of θ, and we say that Xi is bound to
ti . We denote by Range(θ) the set of terms {t1 , . . . , tn }, by Ran(θ) the set of
variables with occurrences in t1 , . . . , tn , and let V ar(θ) = Dom(θ) ∪ Ran(θ).
When n = 0, θ becomes the empty mapping, which is called the empty substitution
and is denoted by ε.
Consider a substitution θ = {X1 → t1 , . . . , Xn → tn }. If all terms t1 , . . . , tn
are ground then θ is called ground substitution, and if all t1 , . . . , tn are variables
then θ is called a pure variable substitution. When Dom(θ) = Range(θ) then θ
is a bijective mapping from its domain to itself, and is called a renaming. For
example, the substitutions ε and {X → Y, Y → Z, Z → X} are renamings, but
{X → Y, Y → Z} is not a renaming.
Substitutions can be applied to terms. The result of applying a substitu-
tion θ to a term t, written as tθ, is the result of simultaneous replacement of
each occurrence in t of a variable from Dom(θ) by the corresponding term in

Range(θ). Formally, this operation is defined by induction on the structure of
the term t:

• if X ∈ X then Xθ = θ(X) if X ∈ Dom(θ), and Xθ = X otherwise.

• f (t1 , . . . , tn )θ = f (t1 θ, . . . , tn θ).


In particular, if t is a constant c then tθ = cθ = c = t.

The term tθ is called an instance of t. An instance is called ground if it has


no variables. If θ is a renaming, then sθ is a variant of s. Finally, a term s is
called more general than another term t if t is an instance of s. For example,
1) f (Y, X) is a variant of f (X, Y ) since f (Y, X) = f (X, Y ){X → Y, Y → X}.

2) f (X, Z) is a variant of f (X, Y ) since f (X, Z) = f (X, Y ){Y → Z, Z → Y }.


Note that the binding Z → Y had to be added to make the substitution
a renaming.
3) f (X, X) is not a variant of f (X, Y ).
Next, we define the composition of substitutions θ and η written as θη, as follows:

θη(X) = (Xθ)η for all X ∈ X .

In other words, θη assigns to a variable X the term obtained by applying


the substitution η to the term Xθ. Clearly, Dom(θη) ⊆ Dom(θ) ∪ Dom(η).

A constructive definition of composition of substitutions

The definition of composition is equivalent with the following, which is


easier to use in computations: If θ = {X1 → t1 , . . . , Xn → tn } and η =
{Y1 → s1 , . . . , Ym → sm } then θη is the result of the following computation:
1. Remove from the sequence

X1 → t1 η, . . . , Xn → tn η, Y1 → s1 , . . . , Ym → sm

the bindings Xi → ti η for which ti η = Xi and the bindings Yj → sj


for which Yj ∈ {X1 , . . . , Xn }
2. Form from the resulting sequence of bindings a substitution.

For example, if θ = {U → Z, X → 3, Y → f (X, 1)} and η = {X → 4, Z → U }
then θη = {X → 3, Y → f (4, 1), Z → U }.
Substitution composition is associative: If θ, η, γ are substitutions and s is
a term, then (θη)γ = θ(ηγ) and (sθ)η = s(θη).
Of special interest are the renamings and the variants. Renamings have the
following remarkable properties:

1. For every renaming θ there exists exactly one substitution θ⁻¹ such that
θθ⁻¹ = θ⁻¹θ = ε. Moreover, θ⁻¹ is also a renaming.
2. If θη = ε then θ and η are renamings.
Variants have the following remarkable properties:

1. s is a variant of t if and only if t is a variant of s.


2. If s is a variant of t then there is a renaming θ such that s = tθ and
V ar(θ) ⊆ V ar(s) ∪ V ar(t).
For example, the renaming θ = {X → Y, Y → Z, Z → X} has the inverse
θ⁻¹ = {X → Z, Y → X, Z → Y }. The pure variable substitution η = {X → Y }
is not a renaming, and there is no substitution η⁻¹ such that ηη⁻¹ = η⁻¹η = ε.

2.1 Unifiers
Informally, unification is the process of making terms identical by means of
certain substitutions. For example, the terms f (a, Y, Z) and f (X, b, Z) can be
made identical by applying to them the substitution {X → a, Y → b}: both
sides then become f (a, b, Z). But the substitution {X → a, Y → b, Z → a} also
makes these two terms identical. Such substitutions are called unifiers. The
first unifier is preferable because it is more general — the second one is a special
case of the first one. The following definition clarifies this difference.

Let θ and τ be substitutions. We say that θ is more general than τ if


we have τ = θη for some η.

For example, {X → a, Y → b} is more general than {X → a, Y → b, Z → a}


because {X → a, Y → b, Z → a} = {X → a, Y → b}{Z → a}.
From now on, we write θ ≤ η if θ is more general than η. Note that ‘≤’ is

1. reflexive: θ ≤ θ because θ = θε,

2. transitive: if θ1 ≤ θ2 and θ2 ≤ θ3 then there exist η1 , η2 such that θ2 = θ1 η1
and θ3 = θ2 η2 . Thus θ1 ≤ θ3 because θ3 = (θ1 η1 )η2 = θ1 (η1 η2 ),

but is not antisymmetric because, for example, θ = {X → Y, Y → X} has the
following properties: θ ≠ ε, ε ≤ θ because θ = εθ, and θ ≤ ε because ε = θθ. It
can be shown that the following lemma holds.

Renaming Lemma. θ ≤ η and η ≤ θ if and only if η = θγ for some
renaming γ with V ar(γ) ⊆ V ar(θ) ∪ V ar(η).

The following are the key notions of this section.

Let s, t ∈ T (F, X ). A substitution θ is called

• a unifier of s and t if sθ = tθ. If a unifier of s and t exists we say
that s and t are unifiable.
• a most general unifier of s and t (mgu) if it is a unifier of s and
t, and it is more general than all unifiers of s and t.

• strong mgu of s and t if for all unifiers η of s and t we have η = θη.

Intuitively, an mgu is a substitution which makes two terms equal but which
does it in a “most general way”, without unnecessary bindings. So θ is an mgu
if every other unifier η is of the form η = θγ for some γ. An mgu θ is strong
if, for every unifier η, the substitution γ for which η = θγ holds can always be
chosen to be η itself.
The problem of deciding whether two terms are unifiable is called the uni-
fication problem. We solve this problem by providing an algorithm that ter-
minates with failure if the terms are not unifiable and that otherwise produces
a strong mgu. In general, two terms may be not unifiable for two reasons:
1. Two terms starting with different function symbols cannot unify. For
example, f (g(X, a), X) and f (g(X, b), b) are not unifiable because the cor-
responding subterms a and b start with different function symbols.
2. A variable X cannot be unified with a term t ≠ X such that X ∈ V ar(t).
For example, g(X, a) and g(f (X), a) are not unifiable because the corre-
sponding subterms X and f (X) cannot be unified for this reason.
In the literature, this reason is called occur-check failure.
Each possibility can occur at some “inner level” of the considered two terms.
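
As an aside, most Prolog systems omit the occur check in ordinary unification
for efficiency, while the ISO built-in unify_with_occurs_check/2 performs
unification with the occur check. A small illustration (assuming SWI-Prolog;
the exact output format varies between systems):

?- X = f(X).
X = f(X).          % no occur check: succeeds, building a cyclic term

?- unify_with_occurs_check(X, f(X)).
false.             % occur check enabled: fails, as in the discussion above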

2.2 The Martelli-Montanari algorithm


This algorithm was proposed by Martelli and Montanari [MM82] to solve an
apparently more general problem:
Given: a set of term equations S = {s1 = t1 , . . . , sn = tn }.
Compute: an mgu of this set, that is, a substitution θ such that
1. si θ = ti θ for all 1 ≤ i ≤ n. Such a substitution θ is called a unifier of
the system S.
2. θ is more general than all other unifiers of S.
The algorithm works by nondeterministically choosing an equation of a form
below from the set of equations, and performing the associated action:
1) f (s1 , . . . , sn ) = f (t1 , . . . , tn ).
Replace it by the equations s1 = t1 , . . . , sn = tn .

2) f (s1 , . . . , sn ) = g(t1 , . . . , tm ) with f ≠ g.
Halt with failure.
3) X = X.
Delete the equation.

4) t = X where t ∉ X .
Replace it with X = t.
5) X = t where X ∉ V ar(t) and X occurs elsewhere.
Perform the substitution {X → t} on all other equations.

6) X = t where X ∈ V ar(t) and X ≠ t.


Halt with failure.
To use this algorithm for unifying two terms s, t we activate it with the singleton
set {s = t}. It is well known that the algorithm always terminates. Moreover,

• If the algorithm terminates with failure, the terms s and t are not unifiable.
• Otherwise, the algorithm stops with a system of equations of the form
{X1 = t1 , . . . , Xn = tn } where the Xi s are distinct variables and none of
them occurs in a term tj .
In this case, {X1 → t1 , . . . , Xn → tn } is a strong mgu of s and t.

Examples:
1. f (g(X, a), X) and f (g(X, b), b) are not unifiable because

{f (g(X, a), X) = f (g(X, b), b)} ⇒ {g(X, a) = g(X, b), X = b} ⇒


{X = X, a = b, X = b} ⇒ failure.

2. {X → g(a), Y → b, Z → h(g(a))} is a strong mgu of k(Z, f (X, b, Z)) and


k(h(X), f (g(a), Y, Z)) because

{k(Z, f (X, b, Z)) = k(h(X), f (g(a), Y, Z))} ⇒


{Z = h(X), f (X, b, Z) = f (g(a), Y, Z)} ⇒
{Z = h(X), X = g(a), b = Y, Z = Z} ⇒ {Z = h(X), X = g(a), b = Y } ⇒
{Z = h(X), X = g(a), Y = b} ⇒ {Z = h(g(a)), X = g(a), Y = b}.
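
The same mgu can be observed directly in Prolog, whose built-in =/2 unifies its
two arguments. A sketch of the interaction (the order in which bindings are
printed may differ between systems):

?- k(Z, f(X, b, Z)) = k(h(X), f(g(a), Y, Z)).
X = g(a),
Y = b,
Z = h(g(a)).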

We conclude this section by mentioning two important properties of strong


mgus:
1. Any strong mgu θ is idempotent, that is, θθ = θ.
2. In general, a substitution θ is idempotent iff Dom(θ) ∩ Ran(θ) = ∅.

3 SLD-resolution
The computation process within the logic programming framework can be ex-
plained as follows. A program P can be viewed as a set of axioms and a query
Q as a request to find an instance Qθ of it which follows from P . A successful
computation yields such a θ and can be viewed as a proof of Qθ from P .
The computation is a sequence of basic steps. Each basic step selects an
atom Ai in the current query A1 ∧ · · · ∧ Ap and a clause H ← B in the program
P . If Ai unifies with H, then the next query is obtained by replacing Ai by
the clause body B and by applying to the outcome an mgu of Ai and H. The
computation terminates successfully when the empty query is produced. θ is
then the composition of the mgus used.
Thus logic programs compute through a combination of two mechanisms:
replacement and unification. To understand better various fine points of
this computation, let’s introduce a few auxiliary notions.
Let P be a program, Q = ⋀_{A∈A} A a non-empty query, H ← B a variant
of c ∈ P such that V ar(H ← B) ∩ V ar(Q) = ∅, and θ an mgu of H and
A0 ∈ A.

The query
Q′ = (B ∧ ⋀_{A∈A−{A0 }} A)θ

is called an SLD-resolvent of Q and c w.r.t. A0 , with mgu θ, and we
write Q ⇒θ,c Q′ to express this fact. A0 is called the selected atom, c
is called the input clause, and this relation is called an SLD-derivation
step.

An SLD-derivation step can also be presented in the form of a rule of deduction:

    ⋀_{A∈A} A        H ← B
    ――――――――――――――――――――――――
    (B ∧ ⋀_{A∈A−{A0 }} A)θ

where H ← B is a variant of a clause c ∈ P such that V ar(H ← B) ∩ V ar(Q) =
∅, and θ is an mgu of H and A0 ∈ A. This rule of deduction is called SLD-
resolution. It was introduced by R.A. Kowalski in logic programming [Kow74]
to derive conclusions by means of backward reasoning.1
Thus a resolvent of a non-empty query and a clause is obtained by the
following successive steps:

Selection of an atom A in the query.


Renaming, if necessary, of the clause chosen from the program.
1 Backward reasoning, or backward chaining, is an inference method used in automated

theorem proving that works backwards, from conclusion to hypotheses from which it can
follow.

Instantiation of the query and the clause by an mgu of the selected atom and
the head of the clause.
Replacement of the instance of the selected atom by the instance of the body
of the clause.

By iterating SLD-derivation steps we obtain an SLD-derivation. Formally, if P
is a program and Q0 a query, then an SLD-derivation of P ∪ {Q0 } is a maximal
sequence
Q0 ⇒θ1 ,c1 Q1 · · · Qn ⇒θn+1 ,cn+1 Qn+1 · · ·
of SLD-derivation steps, and each step satisfies the following additional technical
condition:
Standardization apart: each variant Hi ← Bi of an input clause ci has variables
which did not occur in previous SLD-derivation steps.
Intuitively, this means that at each step of an SLD-derivation the variables
of the input clauses should be fresh.
A few more definitions will be helpful.
1. A clause is called applicable to an atom if a variant of its head unifies
with the atom.
2. The length of an SLD-derivation is the number of SLD-derivation steps
used in it. So an SLD-derivation of length 0 consists of a single query Q
such that either Q is empty or no clause of the program is applicable to
its selected atom.

3. A derivation ξ := Q0 ⇒θ1 Q1 · · · ⇒θn Qn is either

successful, if Qn is the empty conjunction (written true), or
failed, if Qn is not the empty conjunction and no clause of P is applicable
to the selected atom of Qn .

If the derivation is successful then the restriction (θ1 · · · θn )|V ar(Q) of θ1 · · · θn


to the variables of Q is called a computed answer substitution (c.a.s. in
short) of Q and Qθ1 · · · θn is called a computed instance of Q.
For example, let’s consider terms built out of the constant 0 (“zero”) by means
of the unary function symbol s (“successor”). We call such terms numerals.
The following program defines the predicate sum(X,Y,Z) which holds if Z is the
sum of numerals X and Y:

sum(X,0,X) ← true. % c1
sum(X,s(Y),s(Z)) ← sum(X,Y,Z). % c2
The query sum(s(s(0)),s(s(0)),Z) asks for the value of Z which is the sum
of numerals s(s(0)) and s(s(0)). The SLD-derivation of this query is

sum(s(s(0)), s(s(0)), Z) ⇒θ1 ={X1 →s(s(0)),Y1 →s(0),Z→s(Z1 )},c2 sum(s(s(0)), s(0), Z1 )
⇒θ2 ={X2 →s(s(0)),Y2 →0,Z1 →s(Z2 )},c2 sum(s(s(0)), 0, Z2 )
⇒θ3 ={X3 →s(s(0)),Z2 →s(s(0))},c1 true.
where the variants used in these steps are sum(X1 , s(Y1 ), s(Z1 )) ← sum(X1 , Y1 , Z1 ),
sum(X2 , s(Y2 ), s(Z2 )) ← sum(X2 , Y2 , Z2 ), and sum(X3 , 0, X3 ) ← true.
The corresponding computed answer substitution is

θ1 θ2 θ3 |{Z} = {Z → s(s(s(s(0))))}.
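
Written in Prolog syntax, the same program and query behave as sketched
below (assuming the clauses are loaded into a standard Prolog system):

sum(X,0,X).                        % c1
sum(X,s(Y),s(Z)) :- sum(X,Y,Z).    % c2

?- sum(s(s(0)), s(s(0)), Z).
Z = s(s(s(s(0)))).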

3.1 Selection rules


According to the definition of an SLD-derivation, the following four choices are
made in each SLD-derivation step:
(A) choice of the selected atom in the considered query,
(B) choice of the program clause applicable to the selected atom,
(C) choice of the renaming of the program clause used,
(D) choice of the mgu.
Here we discuss the impact of choice (A): the selection of an atom in a query.

Let IN IT be the set of initial fragments of SLD-derivations in which the


last query is non-empty. A selection rule is a function R which, when
applied to an element of IN IT , yields an occurrence of an atom in its last
query.

Given a selection rule R, we say that an SLD-derivation ξ is via R if all choices


of the selected atoms in ξ are performed according to R. That is, for each initial
fragment ξ′ of ξ ending with a non-empty query Q, R(ξ′ ) is the selected atom
of Q.

Remark. Every SLD-derivation is via a selection rule. The name SLD


comes from Selection rule driven Linear resolution for Definite clauses.

Examples of selection rules could be:


1. always choose the leftmost atom. This selection rule is used by the imple-
mentations of Prolog.
2. always choose the rightmost atom.
3. choose the leftmost atom at even SLD-derivation steps, and the last atom
at odd SLD-derivation steps.
When searching for all computed answers of a query Q, we construct all SLD-
derivations via a selection rule R, with the aim of generating the empty query.
All these derivations form a tree-like search space, called SLD-tree.
An SLD-tree for P ∪ {Q} via a selection rule R is a tree such that

• Its branches are SLD-derivations of P ∪ {Q} via R.
• Every node Q with selected atom A has exactly one descendant for every
clause c from P which is applicable to A. This descendant is a resolvent
of Q and c with respect to A.

We call an SLD-tree successful if it contains the empty query. We call an


SLD-tree finitely failed if it is finite and not successful.
To illustrate, consider the logic program P below
path(X,Z) ← arc(X,Y)∧path(Y,Z). % c1
path(X,X) ← true. % c2
arc(b,c) ← true. % c3
and the query Q = path(X,c). A possible interpretation of the relations arc
and path defined by this program is as follows: arc(X,Y) holds if there is an arc
from X to Y, and path(X,Y) holds if there is a path from X to Y. Two SLD-trees
for P ∪ {Q} are illustrated below:

SLD-tree via the leftmost atom selection rule:

path(X,c)
├─ c1, {X1 → X, Z1 → c}: arc(X,Y)∧path(Y,c)
│   └─ c3, {X → b, Y → c}: path(c,c)
│       ├─ c1, {X2 → c, Z2 → c}: arc(c,Y2 )∧path(Y2 ,c)   — fail
│       └─ c2, {X2 → c}: □
└─ c2, {X → c, X1 → c}: □

SLD-tree via the rightmost atom selection rule:

path(X,c)
├─ c1, {X1 → X, Z1 → c}: arc(X,Y)∧path(Y,c)
│   ├─ c1, {X2 → Y, Z2 → c}: arc(X,Y)∧arc(Y,Y2 )∧path(Y2 ,c)
│   │   ├─ c1: . . . (infinite subtree)
│   │   └─ c2, {X3 → c, Y2 → c}: arc(X,Y)∧arc(Y,c)
│   │       └─ c3, {Y → b}: arc(X,b)   — fail
│   └─ c2, {Y → c}: arc(X,c)
│       └─ c3, {X → b}: □
└─ c2, {X → c, X1 → c}: □

where □ represents the empty query (true). The first tree is finite while the
second one is infinite. Both trees are successful and contain the same computed
answer substitutions: {X → b} and {X → c}.
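
Since Prolog uses the leftmost selection rule and tries clauses top-down, the
first tree corresponds to what Prolog actually does. In Prolog syntax the program
and a typical session look roughly as follows (the answer order reflects the
clause order):

path(X,Z) :- arc(X,Y), path(Y,Z).   % c1
path(X,X).                          % c2
arc(b,c).                           % c3

?- path(X,c).
X = b ;
X = c ;
false.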

3.1.1 Successful SLD-trees


If an SLD-tree for P ∪ {Q} is successful, then all SLD-trees of P ∪ {Q} are
successful.

Moreover, if θ is an answer substitution obtained with an SLD-derivation ξ1 of


length n for a selection rule R1 , then the same θ can be obtained with any other
selection rule R2 , by constructing another SLD-derivation ξ2 of length n for R2 .
For example, the SLD-derivation via leftmost atom selection
ξleft := path(X,c) ⇒{X1 →X,Z1 →c},c1 arc(X,Y)∧path(Y,c)
⇒{X→b,Y→c},c3 path(c,c) ⇒{X2 →c},c2 □
has length 3 and computed answer {X → b}. The SLD-derivation via rightmost
atom selection corresponding to ξleft is
ξright := path(X,c) ⇒{X1 →X,Z1 →c},c1 arc(X,Y)∧path(Y,c)
⇒{Y→c},c2 arc(X,c) ⇒{X→b},c3 □
which also has length 3 and computed answer {X → b}.

3.1.2 Finitely failed SLD-trees


If an SLD-tree is not successful, it is either finitely failed or infinite. Unfortu-
nately, there are queries Q such that P ∪ {Q} has both a finitely failed SLD-tree
and an infinite tree which is unsuccessful. For example, if P is the previous
program and Q is the query path(a,b), then the SLD-tree of P ∪ {Q} via left-
most selection rule is finitely failed, and the SLD-tree of P ∪ {Q} via rightmost
selection rule is infinite and unsuccessful.

3.2 Concluding remarks


Suppose Q is a query with variables X1 , . . . , Xn . We write P ⊢ ∃X1 · · · ∃Xn .Q
if the formula
∃X1 · · · ∃Xn .Q
follows from a program P , and wish to decide if the relation P ⊢ ∃X1 · · · ∃Xn .Q
holds or not. Moreover, if P ⊢ ∃X1 · · · ∃Xn .Q holds, we want to find the sub-
stitutions θ such that P ⊢ Qθ holds.
To solve this problem, we compute an SLD-tree of P ∪ {Q}. If the tree is
finite, then the computation terminates and:
1. If the tree is finitely failed, then P ⊬ ∃X1 · · · ∃Xn .Q, and therefore there
are no substitutions θ such that P ⊢ Qθ.
2. If the tree is successful, then P ⊢ ∃X1 · · · ∃Xn .Q. Moreover, we can
extract from the tree the computed answers θ such that P ⊢ Qθ.

If the tree is infinite, the computation of the whole tree does not terminate. If
the tree is computed incrementally, by breadth-first traversal, then
3. We decide P ⊢ ∃X1 · · · ∃Xn .Q as soon as we produce a node for □. In
this case, we can also report P ⊢ Qθ for the computed answer θ by the
SLD-derivation to that node.
Note that, if the tree is infinite, we can decide P ⊢ ∃X1 · · · ∃Xn .Q only when
a node for □ is produced. If such a node is never produced, the computation of
the tree runs forever and we can not decide anything.
Breadth-first is a fair traversal strategy because it traverses all nodes of a
tree, even if the tree is infinite. By contrast, depth-first is an unfair traversal
strategy because it does not traverse all nodes of some infinite trees. For exam-
ple, the SLD-tree via rightmost atom selection shown above has a single infinite
branch, which is its leftmost branch.

[Diagram omitted: the shape of this tree, with its infinite leftmost branch.]

Breadth-first will generate all nodes of the tree level by level, so every node is
generated after finitely many steps:

[Diagram omitted: the nodes of the tree numbered in breadth-first order.]

and depth-first will generate only the leftmost branch of the tree, so the nodes
of the other branches are never generated:

[Diagram omitted: depth-first descending forever into the infinite leftmost
branch.]

Implementation of SLD-derivations in Prolog
The incremental computation of an SLD-tree by breadth-first traversal is fair but
memory-consuming. For this reason, most languages for logic programming,
including Prolog, construct an SLD-tree incrementally, by depth-first traversal,
which consumes much less memory. Since this traversal strategy is unfair, it may
never produce successful derivations (and computed answers) which exist. In
contrast, breadth-first traversal guarantees that all successful derivations (and
computed answers) are eventually produced.
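
A minimal illustration of this unfairness (not from the notes; assuming a
standard Prolog system): with the program below, a successful derivation of p
exists via the second clause, but Prolog's depth-first strategy keeps choosing
the first clause and never finds it.

p :- p.     % tried first, leads to an infinite branch
p.          % a successful derivation exists via this fact

?- p.
% loops forever (typically ending in a stack overflow), although p is provable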

4 Programming in Prolog
Prolog is the most popular language used for logic programming. It is de-
signed to answer queries using a predefined and predictable search strategy
called SLDNF-resolution.

4.1 SLDNF-resolution
SLDNF [AB94] is an extension of SLD-resolution to deal with negation as failure.
This means that Prolog can work with
• general clauses, which are formulas H ← L1 ∧ · · · ∧ Lp where H is an atom
and L1 , . . . , Lp are literals.
• general queries, which are formulas L1 ∧ · · · ∧ Lp where L1 , . . . , Lp are literals.
The simplified syntax of Prolog can be used to write such clauses and queries too;
the only extension is that we write a literal ¬p(t1 , . . . , tn ) as not(p(t1 , . . . , tn )).
For example, we can write the Prolog program

bird(olaf). % c1
penguin(olaf). % c2
fly(X) :- bird(X), not(abnormal(X)). % c3
abnormal(X) :- penguin(X). % c4

to encode the set of clauses {bird(olaf), penguin(olaf), fly(X) ← bird(X) ∧ ¬abnormal(X),
abnormal(X) ← penguin(X)}, which represents the following knowledge:
• Olaf is a bird,
• Olaf is a penguin,
• Every X can fly if it is a bird and it is not abnormal,
• Every X is abnormal if it is a penguin.
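
Loaded into Prolog, this program behaves as sketched below: the negated
condition blocks the conclusion for Olaf, because abnormal(olaf) succeeds.

?- abnormal(olaf).
true.

?- not(abnormal(olaf)).
false.

?- fly(olaf).
false.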
If P is a program, Q = ⋀_{L∈L} L is a non-empty query, and H ← B a variant of
a clause c ∈ P such that V ar(H ← B) ∩ V ar(Q) = ∅, then

1. If θ is an mgu of H and an atom A0 ∈ L, then the query

Q′ = (B ∧ ⋀_{L∈L−{A0 }} L)θ

is an SLDNF-resolvent of Q and c w.r.t. A0 , with mgu θ, and we write
Q ⇒θ,c Q′ to express this fact.
2. If ¬A0 ∈ L is a ground literal and there is a finitely failed SLDNF-tree of
P ∪ {A0 }, then Q′ = ⋀_{L∈L−{¬A0 }} L is an SLDNF-resolvent of Q, and
we write Q ⇒ Q′ to express this fact.
This means that, when such a variable-free literal ¬A0 is selected, a subproof
(or subcomputation) is attempted to determine whether A0 can be proved from
P . The selected subgoal ¬A0 succeeds if this subproof fails finitely (that is, if
P ∪ {A0 } has a finitely failed SLDNF-tree), and it fails if the subproof succeeds.
Example 1. ¬fly(olaf) succeeds because P ∪ {fly(olaf)} has the finitely
failed SLDNF-tree
fly(olaf)
└─ c3, {X1 → olaf}: bird(olaf)∧¬abnormal(olaf)
    └─ c1: ¬abnormal(olaf)   — fail

The subgoal ¬abnormal(olaf) fails because the subgoal abnormal(olaf) suc-
ceeds:

abnormal(olaf)
└─ c4: penguin(olaf)
    └─ c2: □


Example 2. List membership is a built-in Prolog relation, defined by the


program Pmember
% base case: X is member of any list that starts with X
member(X,[X|_]). % mc1
% recursive case: X is member of any list whose tail contains X
member(X,[_|T]) :- member(X,T). % mc2

The query Q = member(1, [1, 2, 1]) is successful because Pmember ∪ {Q} has the
following successful tree starting from Q:

member(1,[1,2,1])
├─ mc1, {X1 → 1}: □
└─ mc2, {X1 → 1, T1 → [2, 1]}: member(1,[2,1])
    └─ mc2, {X2 → 1, T2 → [1]}: member(1,[1])
        ├─ mc1, {X3 → 1}: □
        └─ mc2, {X3 → 1, T3 → []}: member(1,[])   — fail

In fact, Prolog returns true immediately after generating only the root of this
tree and its first child □ (the branch via mc1 with {X1 → 1}):

?- member(1,[1,2,1]).
true.

and will generate the rest of the tree only if the user decides to do so by pressing
; . Because Prolog can find a successful tree of Q, the query ¬member(1, [1, 2, 1])
fails, and Prolog returns false:

?- not(member(1,[1,2,1])).
false.
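
As a further illustration (not from the notes), member/2 can also be used to
enumerate the elements of a list through backtracking:

?- member(X, [1,2,1]).
X = 1 ;
X = 2 ;
X = 1.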

4.2 Concluding remarks


Closed-world assumption and Open-world assumption
Prolog programming is based on the closed-world assumption (CWA),
which presumes that a statement is true if and only if it can be deduced from
the knowledge (=program) we have. The opposite of the closed-world assump-
tion is the open-world assumption (OWA), stating that lack of knowledge
does not imply falsity. For example, if our knowledge is the logic program
P = {bird(olaf)} then bird(tweety) is

• false, in reasoning systems based on the closed world assumption (includ-


ing Prolog), because it can not be deduced from P .
• unknown, in reasoning systems based on the open world assumption.
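
In Prolog this behaviour can be observed directly (a sketch, assuming only the
single fact below is loaded):

bird(olaf).

?- bird(tweety).
false.

?- not(bird(tweety)).
true.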
Reasoning systems based on the closed-world assumption are non-monotonic:
this means that, if we extend our knowledge P with new knowledge to P ′ , then
some statements which were previously true may become false. For example,
¬bird(tweety) is true with respect to the program P = {bird(olaf)} and false
with respect to the extended program P ′ = {bird(olaf), bird(tweety)}.
According to Kowalski [Kow14], programming with positive clauses and
queries using SLD-resolution is sufficient for database applications, but not ad-
equate for Artificial Intelligence, most importantly because it fails to capture

non-monotonic reasoning. SLDNF-resolution with general clauses and queries
is adequate for AI.

Sources of nondeterminism
The search for solutions of a query Q with respect to a program P is based on
the construction of an SLDNF-tree for P ∪ {Q}. We noticed that there are four
sources of nondeterminism in the computation of a successful SLDNF-derivation
Q ⇒θ1 ,c1 Q1 · · · ⇒θn−1 ,cn−1 Qn−1 ⇒θn ,cn □
that produces a computed answer θ = (θ1 · · · θn )|V ar(Q) :
(A) choice of the selected atom in the considered query,
(B) choice of the program clause applicable to the selected atom,
(C) choice of the renaming of the program clause used,
(D) choice of the mgu.
(C) and (D) are don’t care sources of nondeterminism: we are free to choose
any mgu and any standardized apart variant of a program clause, because these
choices don’t affect the computation of a complete set of answers for a query.
(B) is a don’t know source of nondeterminism. This means that, if we want
to find all answers of a query, we must consider all program clauses applicable
to the selected atom. That is, we must generate all nodes of the SLDNF-tree.
We can do so with a fair tree traversal strategy, like breadth-first, but Prolog
is based on depth-first traversal, where nodes are generated in the top-down
order of the applicable rules in the program. Therefore, the computation of
some answers may depend on the order in which program clauses are written in
a program. A simple heuristic is:
• Write the program clauses for base cases before the recursive program
clauses.
(A) is a don’t care source of nondeterminism for positive programs and queries,
but is relevant in Prolog, which deals with negation as failure, because some
selection rules guarantee the generation of a finitely failed SLDNF-tree and
others do not. To illustrate, consider the query Q = ¬path(d, c) with respect
to the logic program P below:
path(X,Z) ← arc(X,Y)∧path(Y,Z). % c1
path(X,X) ← true. % c2
arc(b,c) ← true. % c3
P ∪ {Q} has a successful SLDNF-tree via the leftmost selection rule because
path(d, c) has a finitely failed SLDNF-tree via the leftmost selection rule, namely

path(d,c)
└─ c1, {X1 → d, Z1 → c}: arc(d, Y1 ) ∧ path(Y1 , c)   — fail

but P ∪ {Q} has an infinite unsuccessful SLDNF-tree via the rightmost selec-
tion rule because path(d, c) has an infinite unsuccessful SLDNF-tree via the
rightmost selection rule, namely
path(d,c)
└─ c1, {X1 → d, Z1 → c}: arc(d, Y1 ) ∧ path(Y1 , c)
    ├─ c1, {X2 → Y1 , Z2 → c}: arc(d, Y1 ) ∧ arc(Y1 , Y2 ) ∧ path(Y2 , c)
    │   ├─ c1: . . . (infinite subtree)
    │   └─ c2, {Y2 → c}: arc(d, Y1 ) ∧ arc(Y1 , c)
    │       └─ c3, {Y1 → b}: arc(d,b)   — fail
    └─ c2, {Y1 → c}: arc(d, c)   — fail

For (A), Prolog implements the leftmost selection rule. A simple heuristic to
avoid the generation of infinite unsuccessful SLD-trees is:
• Avoid left-recursion, by not placing recursive calls first in the body of
clauses.
To appreciate this heuristic, see what happens if we change program P to be
path(X,Z) ← path(X,Y)∧arc(Y,Z). % c1 is left-recursive
path(X,X) ← true. % c2
arc(b,c) ← true. % c3
and we use the leftmost selection rule of Prolog for the query path(d,c).
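
A sketch of what happens (assuming a standard Prolog system): the leftmost
atom of the left-recursive clause is again a path goal, so the depth-first search
descends forever without ever reaching the facts.

path(X,Z) :- path(X,Y), arc(Y,Z).   % left-recursive version of c1
path(X,X).                          % c2
arc(b,c).                           % c3

?- path(d,c).
% does not terminate: Prolog keeps resolving the leftmost path goal
% with the first clause (typically ending in a stack overflow)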

References
[AB94] K.R. Apt and R.N. Bol. Logic programming and negation: A survey.
Journal of Logic Programming, 19–20:9–71, 1994.
[Kow74] R.A. Kowalski. Predicate logic as a programming language. Informa-
tion Processing, pages 569–574, 1974.
[Kow14] R. Kowalski. Computational logic. In D. Gabbay and J. Woods, edi-
tors, Computational Logic, History of Logic, pages 523–569. Elsevier,
2014.
[MM82] A. Martelli and U. Montanari. An efficient unification algorithm.
ACM Transactions on Programming Languages and Systems, 4:258–
282, 1982.
[Rob65] J.A. Robinson. A machine-oriented logic based on the resolution prin-
ciple. Journal of the ACM, 12(1):23–41, 1965.
