CIS587: LOGIC, PART 1
Introduction
Syntax
Substitutions
Unification
Semantics
Herbrand Universe
Reduction to Clausal Form
Deduction
Soundness, Completeness, Consistency, Satisfiability,
Introduction
Our interest is about:
- Perceiving, that is, how we acquire information
from our environment,
- Knowledge Representation,
that is, how we represent our understanding of the world,
- Reasoning, that is, how we infer the implications
of what we know and of the choices we have, and
- Acting, that is, how we choose what we want to do
and carry it out.
Logic is a crucial tool in Knowledge Representation and Reasoning. In the
following we deal with Logic in a fairly formal manner [but, sorry, no proofs
here and essentially no examples].
The treatment of Logic in Rich and Knight, Chapter 5, is less complete than
desirable. I highly recommend, for the basic ideas and definitions, to look
at the course
cs157: Logic and Automated Reasoning taught at Stanford by professor
Genesereth and at his book Logical Foundations of Artificial
Intelligence
. You may also look at the lecture notes available for
Logic in PAIL.
Both PAIL
and
FLAIR
have modules on various aspects of Logic and Theorem Proving. For the
utterly devoted to Theorem Proving there is a very powerful tool called
Otter.
Logic is a formal system and it is not trivial to covert the statements that
we use in normal speech into statements in logic. Here are two examples of
english statements that you may wish to convert into a logic language:
(Barwise and Etchemendy) If the unicorn is mythical then it is
immortal, but if it is not mythical then it is a mortal mammal. If the uniforn
is either immortal or a mammal, then it is horned. The unicorn is magical if
it is horned.
(Rich and Knight)
- Marcus was a man
- Marcus was a Pompeian
- All Pompeian were Romans
- Caesar was a ruler
- All Romans were either loyal to Caesar or hated him
- Everyone is loyal to someone
- People only try to assassinate rulers they are not loyal to
- Marcus tried to assassinate Caesar.
In the following we introduce First Order Logic (FOL), and
just mention that it is a generalization of Propositional Logic
.
Syntax
Let us first introduce the symbols, or alphabet, being used. Beware that
there are all sorts of slightly different ways to define FOL. Also, beware that
I have problems in HTML in writing down some of the logical symbols.
Alphabet
- Logical Symbols: These are symbols that have a standard meaning,
like: AND, OR, NOT, ALL, EXISTS, IMPLIES, IFF, FALSE, =.
- Non-Logical Symbols: divided in:
- Constants:
- Predicates: 1-ary, 2-ary, .., n-ary. These are usually just identifiers.
- Functions: 0-ary, 1-ary, 2-ary, .., n-ary. These are usually just
identifiers. 0-ary functions are also called individual constants.
Where predicates return true or false, functions can return any value.
- Variables: Usually an identifier.
One needs to be able to distinguish the identifiers used for predicates,
functions, and variables by using some appropriate convention, for example,
capitals for function and predicate symbols and lower cases for variables.
Terms
A Term is either an individual constant (a 0-ary function), or a variable, or
an n-ary function applied to n terms: F(t1 t2 ..tn)
[We will use both the notation F(t1 t2 ..tn) and the notation (F t1 t2 .. tn)]
Atomic Formulae
An Atomic Formula is either FALSE or an n-ary predicate applied to n terms:
P(t1 t2 .. tn). In the case that "=" is a logical symbol in the language,
(t1 = t2), where t1 and t2 are terms, is an atomic formula.
Literals
A Literal is either an atomic formula (a Positive Literal),
or the negation of an atomic formula (a Negative Literal).
A Ground Literal is a variable-free literal.
Clauses
A Clause is a disjunction of literals. A Ground Clause is
a variable-free clause. A Horn Clause is a clause with at most
one positive literal. A Definite Clause is a Horn Clause
with exactly one positive Literal.
Notice that implications are equivalent to Horn or Definite clauses:
- (A IMPLIES B) is equivalent to ( (NOT A) OR B)
- (A AND B IMPLIES FALSE) is equivalent to ((NOT A) OR (NOT B)).
Formulae
A Formula is either:
- an atomic formula, or
- a Negation, i.e. the NOT of a formula, or
- a Conjunctive Formula, i.e. the AND of formulae, or
- a Disjunctive Formula, i.e. the OR of formulae, or
- an Implication, that is a formula of the form
(formula1 IMPLIES formula2), or
- an Equivalence, that is a formula of the form
(formula1 IFF formula2), or
- a Universaly Quantified Formula, that is a formula
of the form (ALL variable formula). We say that occurrences of variable
are bound in formula [we should be more precise]. Or
- a Existentially Quantified Formula, that is a formula
of the form (EXISTS variable formula). We say that occurrences of variable
are bound in formula [we should be more precise].
An occurrence of a variable in a formula that is not bound, is said to be
free.
A formula where all occurrences of variables are bound is called a
closed formula, one where all variables are free is called
an open formula.
A formula that is the disjunction of clauses is said to be in
Clausal Form. We shall see that there is a sense in
which every formula is equivalent to a clausal form.
Often it is convenient to refer to terms and formulae with a single name.
Form or Expression is used to this end.
Substitutions
- Given a term s, the result [substitution instance] of
substituting a term t in s for a variable x, s[t/x], is:
- t, if s is the variable x
- y, if s is the variable y different from x
- F(s1[t/x] s2[t/x] .. sn[t/x]), if s is F(s1 s2 .. sn).
- Given a formula A, the result (substitution instance) of
substituting a term t in A for a variable x, A[t/x], is:
- FALSE, if A is FALSE,
- P(t1[t/x] t2[t/x] .. tn[t/x]), if A is P(t1 t2 .. tn),
- (B[t/x] AND C[t/x]) if A is (B AND C), and similarly for the other
connectives,
- (ALL x B) if A is (ALL x B), (similarly for EXISTS),
- (ALL y B[t/x]), if A is (ALL y B) and y is different from x (similarly for
EXISTS).
The substitution [t/x] can be seen as a map from
terms to terms and from formulae to formulae. We can define similarly
[t1/x1 t2/x2 .. tn/xn], where t1 t2 .. tn are terms and x1 x2 .. xn are
variables, as a map, the [simultaneous] substitution of x1 by t1, x2
by t2, .., of xn by tn. [If all the terms t1 .. tn are variables,
the substitution is called an alphabetic variant, and if they are
ground terms, it is called a ground substitution.]
Note that a simultaneous substitution is not
the same as a sequential substitution. For example, as we will see in the
next section,
P(x)[z/x u/z] is P(z), but P(x)[z/x][u/z] is P(u)
Unification
- Given two substitutions S = [t1/x1 .. tn/xn] and V = [u1/y1 .. um/ym],
the composition of S and V, S . V, is the substitution
obtained by:
- Applying V to t1 .. tn [the operation on substitutions with just this
property is called concatenation],
and
- adding any pair uj/yj such that yj is not in {x1 .. xn}.
For example: [G(x y)/z].[A/x B/y C/w D/z] is [G(A B)/z A/x B/y C/w].
Composition is an operation that is associative and non commutative
- A set of forms f1 .. fn is unifiable iff there is
a substitution S such that f1.S = f2.S = .. = fn.S. We then say that S is
a unifier of the set.
For example {P(x F(y) B) P(x F(B) B)} is unified by [A/x B/y] and also unified
by [B/y].
- A Most General Unifier (MGU) of a set of forms f1 .. fn
is a substitution S that unifies this set and such that for any other
substitution T that unifies the set there is a substitution V such that
S.V = T. The result of applying the MGU to the forms is called a
Most General Instance (MGI). Here are some examples:
FORMULAE MGU MGI
--------------------------------------------------------------
(P x), (P A) [A/x] (P A)
--------------------------------------------------------------
(P (F x) y (G y)), [x/y x/z] (P (F x) x (G x))
(P (F x) z (G x))
--------------------------------------------------------------
(F x (G y)), [(G u)/x y/z] (F (G u) (G y))
(F (G u) (G z))
--------------------------------------------------------------
(F x (G y)), Not Unifiable
(F (G u) (H z))
--------------------------------------------------------------
(F x (G x) x), Not Unifiable
(F (G u) (G (G z)) z)
--------------------------------------------------------------
This last example is interesting: we first find that (G u) should replace
x, then that (G z) should replace x; finally that x and z are equivalent.
So we need x->(G z) and x->z to be both true. This would be possible only if
z and (G z)
were equivalent. That cannot happen for a finite term. To recognize cases
such as this that do not allow unification [we cannot replace z by
(G z) since z occurs in (G z)], we need what is called an Occur Test
. Most Prolog implementation use Unification extensively but do not
do the occur test for efficiency reasons.
The determination of Most General Unifier is done by the
Unification Algorithm.
Here is the pseudo code for it, and
here is the corresponding Lisp code.
Semantics
Before we can continue in the "syntactic" domain with concepts like Inference
Rules and Proofs, we need to clarify the Semantics, or meaning, of First Order
Logic.
An L-Structure or Conceptualization for
a language L is a structure M= (U,I), where:
- U is a non-empty set, called the Domain, or
Carrier, or Universe of Discourse of M, and
- I is an Interpretation that associates to each
n-ary function symbol F of L a map
I(F): UxU..xU -> U
and to each n-ary predicate symbol P of L a subset of UxU..xU.
The set of functions (predicates) so introduced form the Functional
Basis (Relational Basis) of the conceptualization.
Given a language L and a conceptualization (U,I), an
Assignment is a map from the variables of L to U.
An X-Variant of an assignment s is an assignment that
is identical to s everywhere except at x where it differs.
Given a conceptualization M=(U,I) and an assignment s it is easy to
extend s to map each term t of L to an individual s(t) in U by using
induction on the structure of the term.
Then
- M satisfies a formula A under s iff
- A is atomic, say P(t1 .. tn), and (s(t1) ..s(tn)) is in I(P).
- A is (NOT B) and M does not satisfy B under s.
- A is (B OR C) and M satisfies B under s, or M
satisfies C under s. [Similarly for all other connectives.]
- A is (ALL x B) and M satisfies B under all x-variants of s.
- A is (EXISTS x B) and M satisfies B under some x-variants of s.
- Formula A is satisfiable in M iff there is an assignment
s such that M satisfies A under s.
- Formula A is satisfiable iff there is an L-structure
M such that A is satisfiable in M.
- Formula A is valid or logically true in M iff M satisfies
A under any s. We then say that M is a model of A.
- Formula A is Valid or Logically True iff for any L-structure
M and any assignment s, M satisfies A under s.
Some of these definitions can be made relative to a set of formulae GAMMA:
- Formula A is a Logical Consequence of GAMMA in M iff
M satisfies A under any s that also satisfies all the formulae in GAMMA.
- Formula A is a Logical Consequence of GAMMA iff
for any L-structure M, A is a logical consequence of GAMMA in M.
At times instead of "A is a logical consequence of GAMMA" we say
"GAMMA entails A".
We say that formulae A and B are (logically) equivalent iff
A is a logical consequence of {B} and B is a logical consequence of {A}.
EXAMPLE 1: A Block World
Here we look at a problem and see how to represent it in a language.
We consider a simple world of blocks as described by the following figures:
+--+
|a |
+--+
|e |
+--+ +--+
|a | |c |
+--+ +--+ +--+
|b | |d | ======> |d |
+--+ +--+ +--+
|c | |e | |b |
--------------- --------------------
We see two possible states of the world. On the left is the current
state, on the right a desired new state. A robot is available to do the
transformation. To describe these worlds we can use a structure with
domain U = {a b c d e}, and with predicates {ON, ABOVE, CLEAR, TABLE}
with the following meaning:
- ON: (ON x y) iff x is immediately above y.
The interpretation of ON in the left world is {(a b) (b c) (d e)}, and
in the right world is {(a e) (e c) (c d) (d b)}.
- ABOVE: (ABOVE x y) iff x is above y.
The interpretation of ABOVE [in the left world] is {(a b) (b c) (a c) (d e)}
and in the right world is {(a e) (a c) (a d) (a b) (e c) (e d) (e b) (c d)
(c b) (d b)}
- CLEAR: (CLEAR x) iff x does not have anything above it.
The interpretation of CLEAR [in the left world] is {a d} and in the right
world is {a}
- TABLE: (TABLE x) iff x rests directly on the table.
The interpretation of TABLE [in the left world] is {c e} and in the right
world id {b}.
Examples of formulae true in the block world [both in the left and in the right
state] are [these formulae are
known as Non-Logical Axioms]:
- (ON x y) IMPLIES (ABOVE x y)
- ((ON x y) AND (ABOVE y z)) IMPLIES (ABOVE x z)
- (ABOVE x y) IMPLIES (NOT (ABOVE y x))
- (CLEAR x) IFF (NOT (EXISTS y (ON y x)))
- (TABLE x) IFF (NOT (EXISTS y (ON x y)))
Note that there are things that we cannot say about the block world with the
current functional and predicate basis unless we use equality.
Namely, we cannot say as we would like that a block
can be ON at most one other block. For that we need to say that if x is ON y
and x is ON z then y is z. That is, we need to use a logic with equality.
Not all formulae that are true on the left world are true on the right world
and viceversa. For example, a formula true in the left world but not in the
right world is (ON a b).
Assertions about the left and right world
can be in contradiction. For example (ABOVE b c) is true on left, (ABOVE c b)
is true on right and together they contradict the non-logical axioms.
This means that the theory that we have developed for talking about the block
world can talk of only one world at a time. To talk about two worlds
simultaneously we would need something like the Situation Calculus that we
will study later.
Herbrand Universe
It is a good exercise to determine for given formulae if they are
satisfied/valid on specific L-structures, and to determine, if they exist,
models for them. A good starting point in this task, and useful for a number
of other reasons, is the Herbrand Universe for this set of
formulae. Say that {F01 .. F0n} are the individual constants in the formulae
[if there are no such constants, then introduce one, say, F0]. Say that
{F1 .. Fm} are all the non 0-ary function symbols occurring in the formulae.
Then the set of (constant) terms obtained starting from the individual
constants using the non 0-ary functions, is called the Herbrand Universe for
these formulae.
For example, given the formula (P x A) OR (Q y), its Herbrand Universe is
just {A}.
Given the formulae (P x (F y)) OR (Q A), its Herbrand Universe is
{A (F A) (F (F A)) (F (F (F A))) ...}.
Reduction to Clausal Form
In the following we give an algorithm for deriving from a formula
an equivalent clausal form through a series of truth preserving
transformations. You can find this algorithm
here and the corresponding Lisp code here.
We can state an (unproven by us) theorem:
Theorem: Every formula is equivalent to a clausal form
We can thus, when we want, restrict our attention only to such forms.
Deduction
An Inference Rule is a rule for obtaining a new formula
[the consequence]
from a set of given formulae [the premises].
A most famous inference rule is Modus Ponens:
{A, NOT A OR B}
---------------
B
For example:
{Sam is tall,
if Sam is tall then Sam is unhappy}
------------------------------------
Sam is unhappy
When we introduce inference rules we want them to be Sound,
that is, we want the consequence of the rule to be a logical consequence
of the premises of the rule.
Modus Ponens is sound. But the following rule, called Abduction
, is not:
{B, NOT A OR B}
--------------
A
is not. For example:
John is wet
If it is raining then John is wet
---------------------------------
It is raining
gives us a conclusion that is usually, but not always true [John takes a
shower even when it is not raining].
A Logic or Deductive System is a language,
plus a set of inference rules, plus a set of logical axioms [formulae that are
valid].
A Deduction or Proof or Derivation in a deductive system D,
given a set of formulae GAMMA, is a a sequence of formulae
B1 B2 .. Bn such that:
- for all i from 1 to n, Bi is either a logical axiom of D, or an element
of GAMMA, or is obtained from a subset of {B1 B2 .. Bi-1} by using
an inference rule of D.
In this case we say that Bn is Derived from GAMMA in D, and
in the case that GAMMA is empty, we say that Bn is a Theorem
of D.
Soundness, Completeness, Consistency, Satisfiability
A Logic D is Sound iff for all sets of formulae
GAMMA and any formula A:
- if A is derived from GAMMA in D, then A is a logical consequence of GAMMA
A Logic D is Complete iff for all sets of formulae GAMMA and
any formula A:
- If A is a logical consequence of GAMMA, then A can be derived from GAMMA
in D.
A Logic D is Refutation Complete iff for all sets of
formulae GAMMA and any formula A:
- If A is a logical consequence of GAMMA, then the union of GAMMA and
(NON A) is inconsistent
Note that if a Logic is Refutation Complete then we can enumerate all the
logical consequences of GAMMA and, for any formula A, we can reduce the
question if A is or not a logical consequence of GAMMA to the question:
the union of GAMMA and NOT A is or not consistent.
We will work with logics that are both Sound and Complete, or at least Sound
and Refutation Complete.
A Theory T consists of a logic and of a set of Non-logical
axioms.
For convenience, we may refer, when not ambiguous, to the logic of T, or the
non-logical axioms of T, just as T.
The common situation is that we have in mind a well defined "world" or
set of worlds. For example we may know about the natural numbers and
the arithmetic operations and relations. Or we may think of the block world.
We choose a language
to talk about these worlds. We introduce function and predicate
symbols as it is appropriate. We then introduce formulae, called
Non-Logical Axioms, to characterize the things that are true
in the worlds of interest to us. We choose a logic, hopefully sound
and (refutation) complete, to derive new facts about the worlds
from the non-logical axioms.
A Theorem in a theory T is a formula A that can be derived
in the logic of T from the non-logical axioms of T.
A Theory T is Consistent iff there is no formula A such
that both A and NOT A are theorems of T; it is Inconsistent
otherwise.
If a theory T is inconsistent, then, for essentially any logic, any formula
is a theorem of T. [Since T is inconsistent, there is a formula
A such that both A and NOT A are theorems of T. It is hard to imagine a
logic where from A and (NOT A) we cannot infer FALSE, and from FALSE
we cannot infer any formula. We will say that a logic that is at least
this powerful is Adeguate.]
A Theory T is Unsatisfiable if there is no structure where
all the non-logical axioms of T are valid. Otherwise it is
Satisfiable.
Given a Theory T, a formula A is a Logical Consequence of T
if it is a logical consequence of the non logical axioms of T.
Theorem: If the logic we are using is sound then:
- If a theory T is satisfiable then T is consistent
- If the logic used is also adequate then if T is consistent then T is
satisfiable
- If a theory T is satisfiable and by adding to T the non-logical axiom
(NOT A) we get a theory that is not satisfiable Then A is a logical
consequence of T.
- If a theory T is satisfiable and by adding the formula (NOT A) to T
we get a theory that is inconsistent, then A is a logical consequence of T.
ingargiola@cis.temple.edu