A General Theory of Intelligence     Chapter 3. Inference System

Section 3.2. Types of inference systems

Inference, reasoning, and logic

In this book, "inference" and "reasoning" are used as synonyms to indicate the working process of information systems that can be described as "deriving new beliefs from existing beliefs", step by step. In this process, the beliefs can be represented as sentences in a certain language. A typical action of the system produces new beliefs from some existing ones, and a typical goal of the system is specified as a desired result of inference.

Before the computer was invented, the human mind was the only known system that had a general reasoning capability. The study of human reasoning was traditionally carried out in two carefully separated domains: the normative study of how people should reason (logic), and the descriptive study of how people actually reason (psychology).

In this book, the aim of the research is not to accurately duplicate human behaviors (including reasoning behaviors), but to build a general model according to the principle of "adaptation with insufficient knowledge and resources". What we are interested in are the normative, or logical, models of reasoning, though descriptive models still provide us with important inspirations and clues on how the normative models can be implemented or approximated.

The first major logic was presented in Aristotle's Organon, in the form of syllogisms. This logic dominated the field for two thousand years, until the coming of mathematical logic, with First-Order Predicate Logic (FOPL) at its core, from the work of Frege, Whitehead, and Russell, which has dominated the field ever since. Though many other logics have been proposed with various motivations and have reached various extents of success, FOPL is still taken as the standard form of logic.

Each logic typically specifies a language (with its grammar and semantics) to represent the system's beliefs, and a set of rules to represent the allowed premise-conclusion relation in each inference step. The suitability of the language and the validity of the rules are justified according to the desired application of the logic, as well as the assumption about its application environment. Consequently, for different purposes and different situations, what inference steps are considered as "valid" may also be different. In the following, we are going to analyze and compare a few typical types of logic.

Besides its logical part (i.e., language and rules), each inference system also has a control part, which is responsible for the storage and access of the premises and conclusions, as well as the selection of the premises and the inference rule at each step. The design and evaluation of the control part also depend on the purpose and situation of the reasoning system, as we will see from the following discussion.
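To make the division of labor concrete, the following Python sketch separates the two parts: a priority queue plays the role of the control part (storage of tasks, and selection of which belief to process next), while a stand-in function plays the role of the logical part. The rule, the priorities, and all names here are hypothetical, chosen only to illustrate the separation.

```python
import heapq

def toy_rule(belief, memory):
    """A stand-in for the logical part: derives one marked
    consequence (belief') from each unmarked belief."""
    return [belief + "'"] if not belief.endswith("'") else []

def control_loop(tasks, rule, steps=3):
    """A stand-in for the control part: tasks are (priority, belief)
    pairs; the highest-priority belief is selected at each step."""
    queue = [(-priority, belief) for priority, belief in tasks]
    heapq.heapify(queue)                          # storage and access
    memory = set()
    for _ in range(steps):
        if not queue:
            break
        _, belief = heapq.heappop(queue)          # selection of a premise
        memory.add(belief)
        for conclusion in rule(belief, memory):   # one inference step
            # store the conclusion back, with an arbitrary lower priority
            heapq.heappush(queue, (-0.5, conclusion))
    return memory

print(control_loop([(1.0, "p"), (0.9, "q")], toy_rule))
```

Note that swapping in a different `rule` changes the logic without touching the control loop, and vice versa, which is the point of the separation.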

Pure-axiomatic systems

The study of logic, to a large extent, came from the longing for certainty and reliability in our beliefs, or knowledge, as descriptions of our environment as we know it. Intuitively, everyone understands that some beliefs are more reliable than others, and so are some ways of producing new beliefs from existing ones. To put this understanding in its extreme form, we all want to know how to obtain truth.

In the study of logic, the pursuit of truth directly leads to the classical notion of validity in inference: a valid (or sound) inference rule always derives true conclusions from true premises.

Of course, for the above definition of validity to be meaningful, we need to define "true" or "truth" first. Intuitively speaking, a statement is true if and only if it corresponds to a fact, that is, a description of the world as it is. This idea is formally represented in model-theoretic semantics. According to this approach, if we want to describe the world in a language L, we should first assume there is a meta-language ML that can describe the relevant part of the world as it is, in terms of the existing objects, as well as their properties and relations. Such a description is called a model. Now each word in L gets its meaning by referring to an object (or property, or relation) in the model, and a statement is true if and only if the situation it describes matches a fact in the model.
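As a minimal illustration of this idea, the sketch below represents a model as a set of objects plus the facts (relations) that hold among them, and decides a statement's truth by matching it against those facts. The model, its objects, and the relation names are all hypothetical; only the matching mechanism is the point.

```python
# A toy model: the objects that exist, and the facts that hold.
# A statement in the object language is true iff it matches a fact.
model = {
    "objects": {"socrates", "plato"},
    "relations": {
        "Human": {("socrates",), ("plato",)},
        "TeacherOf": {("socrates", "plato")},
    },
}

def is_true(statement, model):
    """A statement (predicate, arg1, ...) is true iff its argument
    tuple appears among the model's facts for that predicate."""
    predicate, *args = statement
    return tuple(args) in model["relations"].get(predicate, set())

print(is_true(("Human", "socrates"), model))               # True
print(is_true(("TeacherOf", "plato", "socrates"), model))  # False
```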

Now we can build an axiomatic reasoning system, which consists of a set of axioms (or postulates) that are true in the given model, plus a set of inference rules that are valid in the above sense. Consequently, all the conclusions derived by the system are true (in the model). Furthermore, we usually assume the system has enough storage space to accomplish inference processes of any finite length, and that there is an algorithm guiding the inference process, in the sense that for any given statement, an inference process will be carried out in finite time to decide whether the statement is true or not.
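Under strong simplifying assumptions, such a system can be sketched in a few lines: here beliefs are either atomic propositions or implications written "p -> q", modus ponens is the only inference rule, and derivation runs to closure, so membership in the closure decides a query in finite time. The encoding is illustrative, not a real theorem prover.

```python
def derive(axioms):
    """Return the closure of the axioms under modus ponens."""
    beliefs = set(axioms)
    changed = True
    while changed:
        changed = False
        for b in list(beliefs):
            if "->" in b:
                premise, conclusion = (s.strip() for s in b.split("->"))
                if premise in beliefs and conclusion not in beliefs:
                    # a valid step: true premises yield a true conclusion
                    beliefs.add(conclusion)
                    changed = True
    return beliefs

axioms = {"p", "p -> q", "q -> r"}
print("r" in derive(axioms))  # True: the system decides that 'r' holds
```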

The technical name of such a system is "decidable axiomatic system", and in this book it is called a "pure-axiomatic system", for a reason to be explained later. This kind of reasoning system has been dreamed of by many thinkers and researchers, including Leibniz, Boole, Hilbert, and many others. For a given domain, such a system would provide the final solution to all problems.

Clearly, this kind of system can only be designed and built under certain conditions. For a given domain and problems to be solved in the domain, the following conditions must be satisfied:

When all the above conditions are satisfied, we can say that the system has sufficient knowledge and resources with respect to the problems to be solved, where the knowledge indicates the axioms, inference rules, and the process-control algorithm, while the resources indicate the time and space needed.

Since the above requirements are very difficult to satisfy, pure-axiomatic systems can only be built in limited mathematical domains.

Semi-axiomatic systems

What if the system cannot get sufficient knowledge and resources for the problems in the domain? Of course, by definition in this situation the system cannot find perfect solutions for the problems, but there are still different alternatives to be evaluated.

One simple option is to say "I don't know" to the problems for which no perfect solution can be guaranteed. The trouble with this option is that very often the system has to deal with certain problems, and doing nothing can lead to a worse consequence than what an imperfect solution would cause. There are many such examples.

When a reasoning system has to work with some kind of insufficiency in its knowledge and resources, a common and natural strategy is to revise and/or extend the "pure-axiomatic" approach, by allowing certain aspects of the system to be imperfect, while remaining as close to the classical (pure-axiomatic) systems as possible. Many "non-classical logics" belong to this category, as well as various types of AI systems in the fields of "commonsense reasoning", "uncertain reasoning", and "machine learning". The following is an incomplete list:

In this book, these systems are called "semi-axiomatic systems", since they partially reject the sufficiency assumption (in knowledge and resources) of the axiomatic systems, while partially accepting it. These systems have made many theoretical and practical achievements, though they cannot simply be bundled together to make a general-purpose reasoning system that works in all kinds of situations, because each of them is usually based on some very special assumptions about the environment and the system itself.

Non-axiomatic systems

As explained in Chapter 2, intelligent systems are adaptive, and can work with insufficient knowledge and resources. When such a system is designed in the framework of an inference system, all the assumptions of a pure-axiomatic system are violated, and consequently, none of the major components of a classical reasoning system can be used here.

Conventional wisdom suggests taking the semi-axiomatic approach, revising or extending one aspect of the classical system at a time, so as to gradually accommodate the insufficiency of knowledge and resources. After all, scientific methods usually solve a large problem by isolating its components, then solving them one by one.

However, the situation we are facing is special. Not only can the semi-axiomatic systems not be easily integrated, but, from a theoretical point of view, all the issues they address separately are caused by a common reason, namely the insufficiency of knowledge and resources. Therefore, these issues should be handled together, on the foundation of a new notion of validity. Actually, many problems in the semi-axiomatic systems are caused by the conflict between the different notions of validity accepted in different aspects of the system. Therefore, addressing all the related issues in a consistent manner may turn out to be an easier approach than addressing them separately.

Therefore, in the following we are going to explore a novel type of inference system, called a non-axiomatic system, in which the insufficiency of knowledge and resources is assumed completely. As explained in Chapter 2, intelligence is justified as adaptation. Applied to the situation of an inference system, this means that in a non-axiomatic system it is impossible to obtain absolute truth, or even to know the distance between the system's beliefs and the absolute truth. What the system has is its experience, so it will use this experience, not a model, as the reference when defining meaning and truth. Similarly, an inference rule is valid in the sense that the truth value it assigns to its conclusion properly measures the evidential support provided by the premises. The memory and control structure of a non-axiomatic system should allow the system to work with its available knowledge and resources efficiently, even though they are insufficient to provide perfect solutions to the problems.
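As a rough sketch of what "truth measured by evidential support" can mean, the following follows the experience-grounded truth value used in NARS (Wang, 2006): a belief's truth value is a pair computed from the amounts of positive and total evidence collected in the system's experience. The constant k and the example numbers below are illustrative.

```python
def truth_value(w_plus, w, k=1):
    """Return (frequency, confidence) for a belief, given w_plus pieces
    of positive evidence out of w total pieces of evidence.
    frequency  : the proportion of positive evidence so far;
    confidence : how stable that proportion is against future evidence."""
    frequency = w_plus / w
    confidence = w / (w + k)
    return frequency, confidence

# "Swans are white", after observing 9 white swans among 10 swans:
print(truth_value(9, 10))  # frequency 0.9, confidence 10/11
```

Unlike a model-theoretic truth value, this pair is revisable: new evidence changes the frequency and raises the confidence, without either value ever reaching an absolute 1.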

Such a non-axiomatic system is not only desirable, but also feasible. Actually, such a formal model, NARS (for Non-Axiomatic Reasoning System), has been designed and implemented as a computer system. The most comprehensive description of this model is [Wang, 2006]. In this book, we only describe and discuss NARS at the conceptual level, without going into technical details.

In the following sections, the major parts of NARS will be introduced. The purpose here is not to describe this formal model, but to use it as an example to describe an intelligent inference system in general.