CIS587: LOGIC, PART 3
Proof as Search
Some Proof Strategies
SLD-Resolution
Pure Prolog
Prolog
Forward-Chaining and Backward-Chaining
Some Review Thoughts
Proof as Search
Up to now we have exhibited proofs as if they were found miraculously.
We gave formulae and showed proofs of the intended results. We did not
exhibit how the proofs were derived.
We now examine how proofs can actually be found. In so doing we stress the
close ties between theorem proving and search.
A General Proof Procedure
We use binary resolution [we represent clauses as sets] and we represent the
proof as a tree, the Proof Tree. In this tree nodes represent
clauses.
We start with two disjoint sets of clauses INITIAL and OTHERS.
1. We create a node START and introduce a hyperarc from START to
new nodes, each representing a distinct element of INITIAL. We put
all these new nodes into a set OPEN. These nodes are
called AND Nodes.
2. If OPEN is empty, we terminate. Otherwise
we remove an element N from OPEN using a selection function SELECT.
3. If N is an AND node, we connect N with a single hyperarc
to new nodes N1 ... Nm, one for each literal in the clause C represented at N.
These nodes are also labeled with C and are called OR Nodes.
All of these nodes are placed into OPEN.
[NOTE 1: In the case that C consists
of a single literal, we can just say that N is now an OR node.]
[NOTE 2: One may decide not to create all the nodes N1 .. Nm in a single
action and instead to choose one of the literals of C and to
create a node Ni to represent C with this choice of literal. Ni is
inserted into OPEN.
N also goes back into OPEN if not all of its literals have been considered.
The rule used to choose the literal in C is called a Selection
Rule]
Repeat from 2.
4. If N is an OR node, say, corresponding to the ith literal of a clause C,
we consider all the possible ways of applying binary
resolution between C and clauses from the set OTHERS,
resolving on the ith literal of C.
Let D1 .. Dp be the resulting clauses. We represent each of these clauses Dj
by an AND node N(Dj) as in 1. above. We put an arc from N to N(Dj).
We set OPEN to NEXT-OPEN(OPEN, C, {D1, .., Dp}).
We set OTHERS to NEXT-OTHERS(OTHERS, C, {D1, .., Dp}).
If the proof tree now contains, starting at START, a hyperpath (i.e. a path
that may include hyperarcs) whose leaves all have label {}, we terminate with
success.
[NOTE 3: One may decide, instead of resolving C on its ith literal
with all possible elements of OTHERS, to select one compatible element of OTHERS
and to resolve C with it, putting this resolvent and C back into OPEN.
We would call a rule for selecting an element of OTHERS
a Search Rule.]
Repeat from 2.
In this proof procedure we have left undetermined:
- The sets INITIAL and OTHERS of clauses
- The function SELECT by which we choose which node to expand
- The function NEXT-OPEN for computing the next value of OPEN
- The function NEXT-OTHERS for computing the next value of OTHERS
There is no guarantee that for any choice of INITIAL, OTHERS, SELECT,
NEXT-OPEN and NEXT-OTHERS
the resulting theorem prover will be "complete", i.e. that
everything that is provable will be provable with this theorem prover.
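To make the basic operation concrete, here is a minimal sketch of a single binary
resolution step written in Prolog (the language discussed later in these notes),
assuming propositional clauses represented as lists of literals, with a negated
literal L written as n(L); the name resolve/3 and this representation are choices
made only for this sketch, not part of the procedure above.

    % resolve(+C1, +C2, -R): R is a binary resolvent of clauses C1 and C2,
    % obtained by resolving on a literal of C1 whose complement occurs in C2.
    resolve(C1, C2, R) :-
        select(L, C1, Rest1),        % choose a literal L of C1
        complement(L, NotL),
        select(NotL, C2, Rest2),     % find its complement in C2
        append(Rest1, Rest2, R0),
        sort(R0, R).                 % clauses as sets: remove duplicate literals

    complement(n(L), L) :- !.
    complement(L, n(L)).

    % Example: ?- resolve([n(p), n(r)], [p], R).   gives R = [n(r)]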
Example 1
Suppose that we want to prove (we already encountered this problem in the
Logic 2 notes) that
1. NOT P(x) OR NOT R(x)
is inconsistent with the set of clauses:
2. NOT S(x) OR H(x)
3. NOT S(x) OR R(x)
4. S(b)
5. P(b)
The following are possible selections for the indeterminates:
- INITIAL: {1.}, that is, it consists of the clauses representing the negation of the goal.
- OTHERS: {2. 3. 4. 5.}, that is, it consists of the non-logical axioms.
- SELECT: We use OPEN as a FIFO queue, i.e. we do breadth-first search.
- NEXT-OPEN: It sets NEXT-OPEN(OPEN, C, {D1, .., Dp}) to the union of OPEN,
{C}, and {D1, .., Dp}.
- NEXT-OTHERS: It leaves OTHERS unchanged
The Proof Tree is then as follows (all the outgoing arcs of an AND node are
assumed to form a single hyperarc):
START
|
|
NOT P(x) OR NOT R(x)
/---------------\
/ \
NOT P(x) OR NOT R(x) NOT P(x) OR NOT R(x)
| |
| |
NOT R(B) NOT P(x) OR NOT S(x)
| /----------- \
| / \
NOT S(B) NOT S(B) NOT P(B)
| | |
| | |
{} {} {}
At an OR node we apply resolution, on the selected literal of the current
clause, with all the compatible elements of OTHERS.
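For instance, once the variable x is instantiated to b, the left branch of the tree
above corresponds to the following queries to the resolve/3 sketch given earlier
(lower-case p, r, s, b here stand for the ground atoms over P, R, S and the constant b):

    % ?- resolve([n(p(b)), n(r(b))], [p(b)], R1).       R1 = [n(r(b))]
    % ?- resolve([n(r(b))], [n(s(b)), r(b)], R2).       R2 = [n(s(b))]
    % ?- resolve([n(s(b))], [s(b)], R3).                R3 = []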
Example 2
The following example uses Horn clauses in propositional logic.
I use the notation that is common in Prolog: we represent the
implication:
A1 AND A2 AND .. AND An IMPLIES A
as
A < = A1, A2, .. , An
Our problem is:
- A ; this is what we want to prove
- A < = B, C
- A < = D
- B < = D, E
- B < = F
- C < =
- C < = D, F
- D < =
- F < =
We now partially represent the proof tree. We do not apply breadth-first search
because we want to see how great the impact of the choice of SELECT can be.
A
/ \
/ \
/ \
B, C D
/------\ |
/ \ |
/ \ {}
B,C B,C
/ \ | \
/ \ | \
C,D,E C,F B B,D,F
............................................
You can keep on expanding the tree and see how depth-first search would
generate a large tree, while breadth-first search rapidly derives
D from A, and {} from D.
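For comparison, here is the same set of Horn clauses written as a program in actual
Prolog syntax (lower-case names for the propositional symbols, ':-' for '< ='),
together with what Prolog's own depth-first, top-to-bottom strategy does with the goal A:

    a :- b, c.
    a :- d.
    b :- d, e.
    b :- f.
    c.
    c :- d, f.
    d.
    f.

    % ?- a.
    % Prolog tries a :- b, c first; b :- d, e fails (there is no clause
    % for e), b :- f then succeeds, and so does c, so the goal succeeds
    % through the first clause, while the shorter proof through a :- d
    % is never used.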
In other circumstances other strategies would be appropriate as
we see below.
Some Proof Strategies
Since the early sixties people have looked for strategies that simplify
the search problem in theorem proving. Here are some of the
strategies that have been suggested.
Unit Preference
When we apply resolution, if one of the premises is a unit clause (i.e. it has
a single literal), the resolvent has one literal fewer than the larger
of the premises, thus getting closer to the desired goal {}. It therefore appears
desirable to use resolution steps in which one of the premises is a unit clause.
This unit preference is applied both when selecting from the OPEN set (i.e.
at the leaves of the tree) and when, at an OR node, we select an element of OTHERS
to resolve with.
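As a small illustration, a SELECT function implementing this preference over OPEN
might be sketched in Prolog as follows (clauses again as lists of literals;
select_clause/3 is just an illustrative name):

    % select_clause(+Open, -Clause, -Rest): prefer a unit clause from
    % OPEN if there is one, otherwise take the first clause.
    select_clause(Open, Clause, Rest) :-
        select(Clause, Open, Rest),
        Clause = [_],                % a unit clause has exactly one literal
        !.
    select_clause([Clause|Rest], Clause, Rest).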
Set of Support Strategy
When we use the Set of Support Strategy we have:
- NEXT-OPEN(OPEN, C, {D1, .., Dp}) is the union of OPEN, {C}, and {D1,..,Dp}
- NEXT-OTHERS(OTHERS, C, {D1,..,Dp}) is OTHERS.
In simple words, the set of support strategy uses the OPEN set
as its set of support. Each
application of resolution has as a premise an element
of the set of support
and adds that premise and the resolvent to the set of support.
Usually the INITIAL set (i.e. the initial set of support) consists of all the
clauses obtained by negating the intended "theorem".
The set of support strategy is complete if
- The OTHERS set of clauses is satisfiable. That is the case when we
are given a satisfiable set of non-logical axioms and we use it as OTHERS.
Input Resolution
In Input Resolution:
- INITIAL consists of the union of the negation of the "theorem"
and of the set of non-logical axioms.
- OTHERS usually is the set of non-logical axioms.
- NEXT-OPEN(OPEN, C, {D1,..,Dp}) is the union of OPEN and {C}, that is, OPEN
does not change in successive iterations
- NEXT-OTHERS(OTHERS, C, {D1,..,Dp}) is the union of OTHERS, {C}, and
{D1,..,Dp}.
In other words, in each resolution one of the premises is one of the original
clauses.
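The two strategies differ only in these bookkeeping functions. A Prolog sketch, with
the sets represented as lists and duplicates ignored for brevity (the predicate names
below are chosen here just for illustration):

    % Set of Support: the used clause and the resolvents go back into
    % OPEN (the set of support); OTHERS never changes.
    sos_next_open(Open, C, Resolvents, Next) :- append([C|Resolvents], Open, Next).
    sos_next_others(Others, _C, _Resolvents, Others).

    % Input Resolution: OPEN does not change (C is simply put back);
    % the used clause and the resolvents are added to OTHERS.
    input_next_open(Open, C, _Resolvents, [C|Open]).
    input_next_others(Others, C, Resolvents, Next) :- append([C|Resolvents], Others, Next).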
In general Input Resolution is incomplete, as can be seen from
the following unsatisfiable set of clauses (from Nilsson), from which we are unable to derive
FALSE using Input Resolution:
- Q(u) OR P(A)
- NOT Q(w) OR P(w)
- NOT Q(x) OR NOT P(x)
- Q(y) OR NOT P(y)
[You can use this set of clauses as both OPEN and OTHERS.]
To see why no Input Resolution refutation exists, note that the empty clause can only
be obtained by resolving two complementary unit clauses, while in Input Resolution one
premise of every step is one of the clauses above, each of which has two literals.
Input resolution is complete
in the case that all the clauses are Horn clauses.
Linear Resolution
Linear Resolution, also known as Ancestry-Filtered Form Strategy,
is a generalization of Input Resolution. The generalization is that now
in each resolution one of the premises is either one of the original clauses or
an ancestor, in the proof tree, of the other premise.
Linear Resolution is complete.
SLD-Resolution
Selected Linear Resolution for Definite Clauses, or simply SLD-Resolution,
is the basis for a proof procedure for theorems that are in the form of
conjunctions of
positive literals (this conjunction is called a Goal)
given non-logical axioms that are definite clauses.
More precisely:
- A Goal is a conjunction of positive literals
- A Selection Rule is a method for selecting one of the
literals of a goal (the first literal, or the last literal, etc.)
- In SLD-Resolution, given a goal G = A1, .., An, a definite clause
C: A < = B1, .., Bm, and a subgoal Ai selected from G using a Selection
Rule S, where Ai and A are unifiable with MGU s, the
Resolvent of G and C under S is
(A1, .., Ai-1, B1, .., Bm, Ai+1, .., An).s
- An SLD-Derivation of a goal G given a selection
rule S and a set P of definite clauses, is a sequence of triples
(Gi, Ci, si), for i from 0 to k, where
- G0 is G
- each Ci is obtained from a clause in P by replacing all of its
variables with new variables
- Gi+1 is the SLD-Resolvent of Gi and Ci by use of S and with MGU si.
- A Refutation of a goal G given definite clauses P and
selection rule S, is a finite SLD-derivation of G given S and P whose last
goal is the null clause.
If s1, .., sk are the MGUs used in the refutation, then s = s1.s2. ... .sk is a
substitution that, restricted to the variables of G, makes G true whenever
the clauses of P are true.
- The goal G succeeds for a given selection rule S
and set of definite clauses P if it has a refutation for P and S; otherwise
it Fails.
Theorem: SLD-Resolution is Sound and Complete for conjunctive
positive goals and definite clauses.
An important consequence of this theorem is that it remains true no matter
which selection rule we use to select literals in goals. Thus we can select
literals as we please, for instance left-to-right. Another important
aspect is that the substitution s = s1.s2. ... .sk gives us a method
for finding the individuals that satisfy the goal in the structures
that satisfy the clauses in P.
Nothing has been said in SLD-Resolution about what rule should be
used to select the clause of P to resolve with the current literal of the
current goal (such a rule is called a Search rule).
Example (from Van Le)
Suppose that we are given the following clauses:
- WOMAN(MOTHER(v)) ; Every mother is a woman
- GOOD(HUSBAND(ANN)) ; The husband of Ann is good
- LIKES(MOTHER(z), z) < = GOOD(z) ; a mother likes her good children, i.e. if z is good then the mother of z likes z
and we want to find a woman who likes someone's husband.
The goal can be stated as:
- WOMAN(x),LIKES(x,HUSBAND(y))
[NOTE: the variables x and y are implicitly existentially quantified.]
The SLD-Derivation is:
((WOMAN(x),LIKES(x,HUSBAND(y))), WOMAN(MOTHER(v)), [MOTHER(v)/x])
((LIKES(MOTHER(v),HUSBAND(y))), LIKES(MOTHER(z), z) < = GOOD(z),
[HUSBAND(y)/v, HUSBAND(y)/z])
((GOOD(HUSBAND(y))), GOOD(HUSBAND(ANN)), [ANN/y])
{}
and the substitution is:
[MOTHER(HUSBAND(ANN))/x, ANN/y]
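In Prolog syntax (lower case for constants and function symbols, upper case for
variables) the same example, as a sketch, would be:

    woman(mother(V)).
    good(husband(ann)).
    likes(mother(Z), Z) :- good(Z).

    % ?- woman(X), likes(X, husband(Y)).
    % X = mother(husband(ann)),
    % Y = ann.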
Pure Prolog
We are now ready to deal with (pure) Prolog, the major Logic Programming
Language. It is obtained from a variation of SLD-Resolution on Horn clauses,
with the following rules and conventions:
- The Selection Rule is to select the leftmost literal in the goal.
- The Search Rule is to consider the clauses
in the order they appear in the current list of clauses,
from top to bottom.
- Negation as Failure, that is, Prolog assumes that (NOT L) is proven if
it is unable to prove L
- A variable can be unified with any term, but two distinct non-variable terms cannot
in general be made equal. For example, we can say that x=A and x=F(B), but we cannot say that A=F(B).
- Resolvents are added to the bottom of the list of available clauses.
These rules make for very rapid processing. Unfortunately:
The Pure Prolog Inference Procedure is Sound but not Complete
This can be seen by example. Given
- P(A,B)
- P(C,B)
- P(y,x) < = P(x,y)
- P(x,z) < = P(x,y),P(y,z)
we are unable to derive in Prolog that P(A,C), because we get caught in an
ever-deepening depth-first search.
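Written as an actual Prolog program (lower case for the constants), the same
clauses and query look as follows:

    p(a, b).
    p(c, b).
    p(Y, X) :- p(X, Y).
    p(X, Z) :- p(X, Y), p(Y, Z).

    % ?- p(a, c).
    % The first two clauses do not match, and the third keeps turning
    % the goal p(a, c) into p(c, a) and back into p(a, c), forever:
    % the depth-first search never returns, although P(A,C) does
    % follow logically from the clauses.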
The derivation mechanism of Pure Prolog has a very simple form that
can be described by the following flow chart due to Colmerauer:
Interpreter for Pure Prolog
Notational conventions:
- G is the initial goal; Gi is the current goal at step i
- h(X) and t(X) are, respectively, the first element and the rest of X
  (so h(C) is the head of clause C and t(C) is its body)
- Clause(K) is the K-th clause of the program; Max is the number of clauses
- COPY renames the variables of a clause with fresh variables
- Ki is the index of the clause currently being tried against Gi
- Si is the MGU of h(Gi) and h(C), or FAIL if they do not unify
START
|
v
i := O; Gi :=G; Ki := 0;
|
v No
+------------------------>Gi = {}? ------------------>+
^ | |
| |Yes |
| v |
Gi+1 := (t(C),t(Gi)).Si; OUTPUT ANSWER |
i := i+1; | |
Ki := 0; v No |
^ +----->i = 0?--------> i := i-1 |
| ^ | | |
| | | Yes | |
| | v | |
| | END | |
| | Yes v |
| +<-----Ki=Max? <---------+ |
| ^ | |
| | v No v
Si = FAIL?---------------->+ +------------------>Ki := Ki + 1;
^ |
| v
+<--------------- C := COPY(Clause(Ki)); <----------+
Si := MGU(h(Gi), h(C));
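The same control regime can also be written, as a sketch, as a Prolog meta-interpreter
(solve/1 is an illustrative name; clause/2 is the built-in that retrieves the clauses of
the program in top-to-bottom order, and in some systems the program's predicates must be
declared dynamic for clause/2 to access them):

    % solve(+Goal): a vanilla meta-interpreter for pure Prolog.
    % Leftmost literal first (the Selection Rule), clauses tried in
    % program order (the Search Rule), depth-first search obtained
    % from Prolog's own backtracking.
    solve(true) :- !.
    solve((A, B)) :- !, solve(A), solve(B).
    solve(A) :- clause(A, Body), solve(Body).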
Prolog
Real Prolog systems differ from pure Prolog for a number of reasons, many of which
have to do with the ability, in Prolog, to modify the control (search) strategy so
as to achieve efficient programs. In fact there is a dictum due to Kowalski:
Logic + Control = Algorithm
But the reason
that is important to us now is that Prolog uses a unification procedure which does not
enforce the Occur Check. This has the unfortunate consequence that, while
Prolog may give rise to efficient programs,
Prolog is not Sound
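A small illustration (the behaviour of the first query depends on the Prolog system;
unify_with_occurs_check/2 is the standard predicate that does perform the check, and
the little program below is only an example made up to show the problem):

    % ?- X = f(X).
    %       "succeeds", binding X to the infinite (cyclic) term f(f(f(...)))
    % ?- unify_with_occurs_check(X, f(X)).
    %       false.

    % A program that becomes unsound without the occur check:
    p(Y, f(Y)).
    test :- p(X, X).
    % ?- test.
    %       succeeds, although test is not a logical consequence of
    %       the single clause p(Y, f(Y)).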
Forward-Chaining and Backward-Chaining
We have managed to talk for a long time about theorem proving without
ever introducing two important concepts,
Forward Chaining and Backward Chaining.
- Forward-Chaining takes place when we add or assert
a new formula to the set of formulae
that we already know and we infer consequences
of this new formula. For example,
if we already know two formulae (say, two non-logical axioms) and we derive from
them a new formula by resolution, we say that the resolvent is the result of a
Forward-Chaining control strategy. Forward-Chaining often needs some termination condition
so that it does not keep on generating more and more, possibly irrelevant, consequences.
[In first-order logic we know that we can infer an infinity of formulae from any
given formula, for example A, A AND A, A AND A AND A, ...]
- Backward-Chaining takes place when we ask a question and try to answer it by
deriving subgoals from which, once they are answered, we can answer the original
question. In terms of search, it is like starting at the goal state with the question,
and moving backward from it towards the initially
given formulae. Prolog uses backward-chaining.
The set-of-support strategy, and most other proof strategies, use mostly backward-chaining
(forward-chaining takes place when we resolve from the non-logical axioms or from
formulae not derived from the goal clauses).
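As an illustration, a minimal propositional forward chainer can be sketched in Prolog;
fact/1 and rule/2 (with rule(Head, Body), Body a list of atoms) are representation
choices made only for this sketch, and forall/2 is as provided by most Prolog systems.

    :- dynamic fact/1.

    % forward: fire any rule whose premises are all known facts and
    % whose conclusion is new, assert the conclusion, and repeat until
    % nothing more can be added (the termination condition mentioned above).
    forward :-
        rule(Head, Body),
        forall(member(B, Body), fact(B)),
        \+ fact(Head),
        assertz(fact(Head)),
        !,
        forward.
    forward.

    % Example: with  fact(f).  fact(c).  rule(b, [f]).  rule(a, [b, c]).
    % the query  ?- forward, fact(a).  derives first b and then a.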
Some Review Thoughts
Logic has provided a good avenue for examining how we can represent knowledge and how
we can reason with it. We have first established a representation language, First-Order
Logic, and understood the fundamental semantic concepts of Satisfiability, Truth, and
Logical Consequence. Then we have shown how we can use Deductions, Theorems, Consistency
to express through symbol manipulation many of the semantic concepts, and express
Reasoning. We understood
fundamental concepts like Soundness and Completeness for a logical system and showed that
Resolution was the only thing we needed to achieve both soundness and completeness.
Once soundness and completeness are established, it becomes a scramble to come up
with proof procedures that are efficient. Restrictions are introduced on the form of
clauses that can be used (Horn clauses are particularly appealing). Restrictions
are introduced in the way that proofs can be built. In the search for efficiency
at times Completeness is given up, as in pure Prolog, and at times even Soundness
is abandoned, as in Prolog.
Now that we have a "standard" way to represent knowledge and to reason in First Order Logic,
- we can see how this machinery can be used to solve common problems,
- we can examine alternative approaches to knowledge representation and reasoning, and
- we can see the limitations of these approaches, and in particular, of first-order logic.
One thing we have not said about first order logic:
- First-Order Logic is Monotonic.
This means that if we can derive a formula A from a set of formulae F, then,
if G is a set of formulae that contains F, we can also derive A from G.
Do you think that is how it happens in life?
ingargiola@cis.temple.edu