CIS587: LOGIC, PART 1

Introduction
Syntax
Substitutions
Unification
Semantics
Herbrand Universe
Reduction to Clausal Form
Deduction
Soundness, Completeness, Consistency, Satisfiability,

Introduction

Our interest is about:

Perceiving, that is, how we acquire information from our environment,
Knowledge Representation, that is, how we represent our understanding of the world,
Reasoning, that is, how we infer the implications of what we know and of the choices we have, and
Acting, that is, how we choose what we want to do and carry it out.

Logic is a crucial tool in Knowledge Representation and Reasoning. In the following we deal with Logic in a fairly formal manner [but, sorry, no proofs here and essentially no examples].

The treatment of Logic in Rich and Knight, Chapter 5, is less complete than desirable. I highly recommend, for the basic ideas and definitions, to look at the course cs157: Logic and Automated Reasoning taught at Stanford by professor Genesereth and at his book Logical Foundations of Artificial Intelligence . You may also look at the lecture notes available for Logic in PAIL. Both PAIL and FLAIR have modules on various aspects of Logic and Theorem Proving. For the utterly devoted to Theorem Proving there is a very powerful tool called Otter.

Logic is a formal system and it is not trivial to covert the statements that we use in normal speech into statements in logic. Here are two examples of english statements that you may wish to convert into a logic language:

(Barwise and Etchemendy) If the unicorn is mythical then it is immortal, but if it is not mythical then it is a mortal mammal. If the uniforn is either immortal or a mammal, then it is horned. The unicorn is magical if it is horned.

(Rich and Knight)

Marcus was a man
Marcus was a Pompeian
All Pompeian were Romans
Caesar was a ruler
All Romans were either loyal to Caesar or hated him
Everyone is loyal to someone
People only try to assassinate rulers they are not loyal to
Marcus tried to assassinate Caesar.

In the following we introduce First Order Logic (FOL), and just mention that it is a generalization of Propositional Logic .

Syntax

Let us first introduce the symbols, or alphabet, being used. Beware that there are all sorts of slightly different ways to define FOL. Also, beware that I have problems in HTML in writing down some of the logical symbols.

Alphabet

Logical Symbols: These are symbols that have a standard meaning, like: AND, OR, NOT, ALL, EXISTS, IMPLIES, IFF, FALSE, =.
Non-Logical Symbols: divided in:
- Constants:
  - Predicates: 1-ary, 2-ary, .., n-ary. These are usually just identifiers.
  - Functions: 0-ary, 1-ary, 2-ary, .., n-ary. These are usually just identifiers. 0-ary functions are also called individual constants.
  Where predicates return true or false, functions can return any value.
- Variables: Usually an identifier.
One needs to be able to distinguish the identifiers used for predicates, functions, and variables by using some appropriate convention, for example, capitals for function and predicate symbols and lower cases for variables.

Terms

A Term is either an individual constant (a 0-ary function), or a variable, or an n-ary function applied to n terms: F(t1 t2 ..tn)
[We will use both the notation F(t1 t2 ..tn) and the notation (F t1 t2 .. tn)]

Atomic Formulae

An Atomic Formula is either FALSE or an n-ary predicate applied to n terms: P(t1 t2 .. tn). In the case that "=" is a logical symbol in the language, (t1 = t2), where t1 and t2 are terms, is an atomic formula.

Literals

A Literal is either an atomic formula (a Positive Literal), or the negation of an atomic formula (a Negative Literal). A Ground Literal is a variable-free literal.

Clauses

A Clause is a disjunction of literals. A Ground Clause is a variable-free clause. A Horn Clause is a clause with at most one positive literal. A Definite Clause is a Horn Clause with exactly one positive Literal.
Notice that implications are equivalent to Horn or Definite clauses:

(A IMPLIES B) is equivalent to ( (NOT A) OR B)

(A AND B IMPLIES FALSE) is equivalent to ((NOT A) OR (NOT B)).

Formulae

A Formula is either:

an atomic formula, or
a Negation, i.e. the NOT of a formula, or
a Conjunctive Formula, i.e. the AND of formulae, or
a Disjunctive Formula, i.e. the OR of formulae, or
an Implication, that is a formula of the form (formula1 IMPLIES formula2), or
an Equivalence, that is a formula of the form (formula1 IFF formula2), or
a Universaly Quantified Formula, that is a formula of the form (ALL variable formula). We say that occurrences of variable are bound in formula [we should be more precise]. Or
a Existentially Quantified Formula, that is a formula of the form (EXISTS variable formula). We say that occurrences of variable are bound in formula [we should be more precise].

An occurrence of a variable in a formula that is not bound, is said to be free.
A formula where all occurrences of variables are bound is called a closed formula, one where all variables are free is called an open formula.

A formula that is the disjunction of clauses is said to be in Clausal Form. We shall see that there is a sense in which every formula is equivalent to a clausal form.

Often it is convenient to refer to terms and formulae with a single name. Form or Expression is used to this end.

Substitutions

Given a term s, the result [substitution instance] of substituting a term t in s for a variable x, s[t/x], is:
- t, if s is the variable x
- y, if s is the variable y different from x
- F(s1[t/x] s2[t/x] .. sn[t/x]), if s is F(s1 s2 .. sn).
Given a formula A, the result (substitution instance) of substituting a term t in A for a variable x, A[t/x], is:
- FALSE, if A is FALSE,
- P(t1[t/x] t2[t/x] .. tn[t/x]), if A is P(t1 t2 .. tn),
- (B[t/x] AND C[t/x]) if A is (B AND C), and similarly for the other connectives,
- (ALL x B) if A is (ALL x B), (similarly for EXISTS),
- (ALL y B[t/x]), if A is (ALL y B) and y is different from x (similarly for EXISTS).

The substitution [t/x] can be seen as a map from terms to terms and from formulae to formulae. We can define similarly [t1/x1 t2/x2 .. tn/xn], where t1 t2 .. tn are terms and x1 x2 .. xn are variables, as a map, the [simultaneous] substitution of x1 by t1, x2 by t2, .., of xn by tn. [If all the terms t1 .. tn are variables, the substitution is called an alphabetic variant, and if they are ground terms, it is called a ground substitution.] Note that a simultaneous substitution is not the same as a sequential substitution. For example, as we will see in the next section,

P(x)[z/x u/z] is P(z), but P(x)[z/x][u/z] is P(u)

Unification

Given two substitutions S = [t1/x1 .. tn/xn] and V = [u1/y1 .. um/ym], the composition of S and V, S . V, is the substitution obtained by:
- Applying V to t1 .. tn [the operation on substitutions with just this property is called concatenation], and
- adding any pair uj/yj such that yj is not in {x1 .. xn}.
For example: [G(x y)/z].[A/x B/y C/w D/z] is [G(A B)/z A/x B/y C/w].
Composition is an operation that is associative and non commutative
A set of forms f1 .. fn is unifiable iff there is a substitution S such that f1.S = f2.S = .. = fn.S. We then say that S is a unifier of the set.
For example {P(x F(y) B) P(x F(B) B)} is unified by [A/x B/y] and also unified by [B/y].
A Most General Unifier (MGU) of a set of forms f1 .. fn is a substitution S that unifies this set and such that for any other substitution T that unifies the set there is a substitution V such that S.V = T. The result of applying the MGU to the forms is called a Most General Instance (MGI). Here are some examples:
```
FORMULAE		MGU		MGI
--------------------------------------------------------------
(P x), (P A)		[A/x]		(P A)
--------------------------------------------------------------
(P (F x) y (G y)),	[x/y x/z]	(P (F x) x (G x))
(P (F x) z (G x))
--------------------------------------------------------------
(F x (G y)), 		[(G u)/x y/z]	(F (G u) (G y))
(F (G u) (G z))
--------------------------------------------------------------
(F x (G y)), 		Not Unifiable
(F (G u) (H z))
--------------------------------------------------------------
(F x (G x) x),		Not Unifiable
(F (G u) (G (G z)) z)
--------------------------------------------------------------
```
This last example is interesting: we first find that (G u) should replace x, then that (G z) should replace x; finally that x and z are equivalent. So we need x->(G z) and x->z to be both true. This would be possible only if z and (G z) were equivalent. That cannot happen for a finite term. To recognize cases such as this that do not allow unification [we cannot replace z by (G z) since z occurs in (G z)], we need what is called an Occur Test . Most Prolog implementation use Unification extensively but do not do the occur test for efficiency reasons.

The determination of Most General Unifier is done by the Unification Algorithm. Here is the pseudo code for it, and here is the corresponding Lisp code.

Semantics

Before we can continue in the "syntactic" domain with concepts like Inference Rules and Proofs, we need to clarify the Semantics, or meaning, of First Order Logic.

An L-Structure or Conceptualization for a language L is a structure M= (U,I), where:

U is a non-empty set, called the Domain, or Carrier, or Universe of Discourse of M, and
I is an Interpretation that associates to each n-ary function symbol F of L a map
I(F): UxU..xU -> U
and to each n-ary predicate symbol P of L a subset of UxU..xU.

The set of functions (predicates) so introduced form the Functional Basis (Relational Basis) of the conceptualization.

Given a language L and a conceptualization (U,I), an Assignment is a map from the variables of L to U.
An X-Variant of an assignment s is an assignment that is identical to s everywhere except at x where it differs.

Given a conceptualization M=(U,I) and an assignment s it is easy to extend s to map each term t of L to an individual s(t) in U by using induction on the structure of the term.
Then

M satisfies a formula A under s iff
- A is atomic, say P(t1 .. tn), and (s(t1) ..s(tn)) is in I(P).
- A is (NOT B) and M does not satisfy B under s.
- A is (B OR C) and M satisfies B under s, or M satisfies C under s. [Similarly for all other connectives.]
- A is (ALL x B) and M satisfies B under all x-variants of s.
- A is (EXISTS x B) and M satisfies B under some x-variants of s.
Formula A is satisfiable in M iff there is an assignment s such that M satisfies A under s.
Formula A is satisfiable iff there is an L-structure M such that A is satisfiable in M.
Formula A is valid or logically true in M iff M satisfies A under any s. We then say that M is a model of A.
Formula A is Valid or Logically True iff for any L-structure M and any assignment s, M satisfies A under s.

Some of these definitions can be made relative to a set of formulae GAMMA:

Formula A is a Logical Consequence of GAMMA in M iff M satisfies A under any s that also satisfies all the formulae in GAMMA.
Formula A is a Logical Consequence of GAMMA iff for any L-structure M, A is a logical consequence of GAMMA in M. At times instead of "A is a logical consequence of GAMMA" we say "GAMMA entails A".

We say that formulae A and B are (logically) equivalent iff A is a logical consequence of {B} and B is a logical consequence of {A}.

EXAMPLE 1: A Block World

Here we look at a problem and see how to represent it in a language. We consider a simple world of blocks as described by the following figures:

					+--+
					|a |
					+--+
					|e |
	+--+				+--+
	|a |				|c |
	+--+	+--+			+--+
	|b |	|d |	======>		|d |
	+--+	+--+			+--+
	|c |	|e |			|b |
      ---------------		--------------------

We see two possible states of the world. On the left is the current state, on the right a desired new state. A robot is available to do the transformation. To describe these worlds we can use a structure with domain U = {a b c d e}, and with predicates {ON, ABOVE, CLEAR, TABLE} with the following meaning:

ON: (ON x y) iff x is immediately above y.
The interpretation of ON in the left world is {(a b) (b c) (d e)}, and in the right world is {(a e) (e c) (c d) (d b)}.
ABOVE: (ABOVE x y) iff x is above y.
The interpretation of ABOVE [in the left world] is {(a b) (b c) (a c) (d e)} and in the right world is {(a e) (a c) (a d) (a b) (e c) (e d) (e b) (c d) (c b) (d b)}
CLEAR: (CLEAR x) iff x does not have anything above it.
The interpretation of CLEAR [in the left world] is {a d} and in the right world is {a}
TABLE: (TABLE x) iff x rests directly on the table.
The interpretation of TABLE [in the left world] is {c e} and in the right world id {b}.

Examples of formulae true in the block world [both in the left and in the right state] are [these formulae are known as Non-Logical Axioms]:

(ON x y) IMPLIES (ABOVE x y)
((ON x y) AND (ABOVE y z)) IMPLIES (ABOVE x z)
(ABOVE x y) IMPLIES (NOT (ABOVE y x))
(CLEAR x) IFF (NOT (EXISTS y (ON y x)))
(TABLE x) IFF (NOT (EXISTS y (ON x y)))

Note that there are things that we cannot say about the block world with the current functional and predicate basis unless we use equality. Namely, we cannot say as we would like that a block can be ON at most one other block. For that we need to say that if x is ON y and x is ON z then y is z. That is, we need to use a logic with equality.

Not all formulae that are true on the left world are true on the right world and viceversa. For example, a formula true in the left world but not in the right world is (ON a b).
Assertions about the left and right world can be in contradiction. For example (ABOVE b c) is true on left, (ABOVE c b) is true on right and together they contradict the non-logical axioms. This means that the theory that we have developed for talking about the block world can talk of only one world at a time. To talk about two worlds simultaneously we would need something like the Situation Calculus that we will study later.

Herbrand Universe

It is a good exercise to determine for given formulae if they are satisfied/valid on specific L-structures, and to determine, if they exist, models for them. A good starting point in this task, and useful for a number of other reasons, is the Herbrand Universe for this set of formulae. Say that {F01 .. F0n} are the individual constants in the formulae [if there are no such constants, then introduce one, say, F0]. Say that {F1 .. Fm} are all the non 0-ary function symbols occurring in the formulae. Then the set of (constant) terms obtained starting from the individual constants using the non 0-ary functions, is called the Herbrand Universe for these formulae.

For example, given the formula (P x A) OR (Q y), its Herbrand Universe is just {A}.
Given the formulae (P x (F y)) OR (Q A), its Herbrand Universe is {A (F A) (F (F A)) (F (F (F A))) ...}.

Reduction to Clausal Form

In the following we give an algorithm for deriving from a formula an equivalent clausal form through a series of truth preserving transformations. You can find this algorithm here and the corresponding Lisp code here.
We can state an (unproven by us) theorem:

Theorem: Every formula is equivalent to a clausal form

We can thus, when we want, restrict our attention only to such forms.

Deduction

An Inference Rule is a rule for obtaining a new formula [the consequence] from a set of given formulae [the premises].

A most famous inference rule is Modus Ponens:

                {A, NOT A OR B}
		---------------
                       B
For example:
		{Sam is tall,
	         if Sam is tall then Sam is unhappy}
		------------------------------------
		  Sam is unhappy

When we introduce inference rules we want them to be Sound, that is, we want the consequence of the rule to be a logical consequence of the premises of the rule.
Modus Ponens is sound. But the following rule, called Abduction , is not:

		{B, NOT A OR B}
		--------------
		      A

is not. For example:

		John is wet
		If it is raining then John is wet
		---------------------------------
		It is raining

gives us a conclusion that is usually, but not always true [John takes a shower even when it is not raining].

A Logic or Deductive System is a language, plus a set of inference rules, plus a set of logical axioms [formulae that are valid].

A Deduction or Proof or Derivation in a deductive system D, given a set of formulae GAMMA, is a a sequence of formulae B1 B2 .. Bn such that:

for all i from 1 to n, Bi is either a logical axiom of D, or an element of GAMMA, or is obtained from a subset of {B1 B2 .. Bi-1} by using an inference rule of D.

In this case we say that Bn is Derived from GAMMA in D, and in the case that GAMMA is empty, we say that Bn is a Theorem of D.

Soundness, Completeness, Consistency, Satisfiability

A Logic D is Sound iff for all sets of formulae GAMMA and any formula A:

if A is derived from GAMMA in D, then A is a logical consequence of GAMMA

A Logic D is Complete iff for all sets of formulae GAMMA and any formula A:

If A is a logical consequence of GAMMA, then A can be derived from GAMMA in D.

A Logic D is Refutation Complete iff for all sets of formulae GAMMA and any formula A:

If A is a logical consequence of GAMMA, then the union of GAMMA and (NON A) is inconsistent

Note that if a Logic is Refutation Complete then we can enumerate all the logical consequences of GAMMA and, for any formula A, we can reduce the question if A is or not a logical consequence of GAMMA to the question: the union of GAMMA and NOT A is or not consistent.

We will work with logics that are both Sound and Complete, or at least Sound and Refutation Complete.

A Theory T consists of a logic and of a set of Non-logical axioms.
For convenience, we may refer, when not ambiguous, to the logic of T, or the non-logical axioms of T, just as T.

The common situation is that we have in mind a well defined "world" or set of worlds. For example we may know about the natural numbers and the arithmetic operations and relations. Or we may think of the block world. We choose a language to talk about these worlds. We introduce function and predicate symbols as it is appropriate. We then introduce formulae, called Non-Logical Axioms, to characterize the things that are true in the worlds of interest to us. We choose a logic, hopefully sound and (refutation) complete, to derive new facts about the worlds from the non-logical axioms.

A Theorem in a theory T is a formula A that can be derived in the logic of T from the non-logical axioms of T.

A Theory T is Consistent iff there is no formula A such that both A and NOT A are theorems of T; it is Inconsistent otherwise.
If a theory T is inconsistent, then, for essentially any logic, any formula is a theorem of T. [Since T is inconsistent, there is a formula A such that both A and NOT A are theorems of T. It is hard to imagine a logic where from A and (NOT A) we cannot infer FALSE, and from FALSE we cannot infer any formula. We will say that a logic that is at least this powerful is Adeguate.]

A Theory T is Unsatisfiable if there is no structure where all the non-logical axioms of T are valid. Otherwise it is Satisfiable.

Given a Theory T, a formula A is a Logical Consequence of T if it is a logical consequence of the non logical axioms of T.

Theorem: If the logic we are using is sound then:

If a theory T is satisfiable then T is consistent
If the logic used is also adequate then if T is consistent then T is satisfiable
If a theory T is satisfiable and by adding to T the non-logical axiom (NOT A) we get a theory that is not satisfiable Then A is a logical consequence of T.
If a theory T is satisfiable and by adding the formula (NOT A) to T we get a theory that is inconsistent, then A is a logical consequence of T.

ingargiola@cis.temple.edu