A General Theory of Intelligence     Chapter 3. Inference System

Section 3.1. Formalization of information system

The need for formalization

In Chapter 2, a working definition of intelligent system was described and justified, which allows us to distinguish intelligent systems from non-intelligent (instinctive) systems according to the relationship between a system's experience and its behavior. However, that description of intelligent systems is not detailed enough for us to study their internal structure, especially for the needs of AI.

To further specify goals, operations, and beliefs within an information system, we need to extend the vocabulary of information systems introduced in Chapter 1. As we have seen, that vocabulary allows us to describe a system abstractly, so as to capture the similarity between animals and machines. Now we need to describe the system formally, or symbolically.

In this book, "formalization" or "symbolization" is used to indicate a special form of abstraction, in which some terms in a natural language (English) are replaced by special symbols that has no innate meaning.

Formalization is a technique commonly used in mathematics, and it has two major advantages compared to descriptions in natural languages: the resulting description is more accurate, since each symbol has an explicitly specified meaning and usage, and it is easier to process mechanically, since the symbols can be manipulated according to explicit rules.

However, formalization has its limitations. A description of an information system cannot be fully formal without losing its relation to the system being described. Also, formal descriptions are not necessarily more correct than informal ones, because formalization is just a way to express ideas, rather than to justify them. Finally, formal descriptions are hard to understand. For readability, I actually try to avoid formalization in this book whenever possible.

On the other hand, formalization is absolutely necessary in the research of AI and CogSci. There is a profound literature in philosophy on the nature of human thinking, but because of the ambiguity of natural languages, each philosophical conclusion typically has several very different interpretations, and therefore ends up providing little concrete guidance for research and development. This is why it is argued in the Preface that a theory, like the one proposed in this book, should be formalized first, before it is implemented in a computer system.

Currently there are three major frameworks for formalizing an information system. In the following, they will be discussed one by one.

Dynamical system

A system, including an information system as a special case, can be measured along multiple dimensions, corresponding to the relevant properties of the system. Consequently, the system's state at a given moment can be represented as a point in a multi-dimensional space, and the state changes of the system can be described as the trajectory of the point moving in the space, often by differential equations or other transformation functions.
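To make the state-space picture concrete, here is a minimal sketch (my own illustration, not part of the original text) of a discrete-time dynamical system in Python: the state is a point in a two-dimensional space, and a transformation function, chosen arbitrarily here, maps each state to the next one, producing a trajectory.

    # A minimal discrete-time dynamical system: the state is a point in a
    # two-dimensional space, and a transformation function yields the next state.
    def step(state):
        x, y = state
        # A simple linear transformation, chosen only for illustration; a real
        # model would use equations describing the measured properties.
        return (0.9 * x + 0.1 * y, 0.1 * x + 0.9 * y)

    state = (1.0, 0.0)            # initial state
    trajectory = [state]
    for _ in range(10):
        state = step(state)       # the point moves along a trajectory in state space
        trajectory.append(state)
    print(trajectory[-1])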

This representation framework is often called "dynamical system representation" or "state space representation". This tradition started in physics, where the systems under description are physical objects whose properties are measured.

When the system under discussion is abstractly described, the dynamical system framework is still applicable, though the dimensions of the state space no longer correspond to physical properties, but to input, output, and state variables that represent the system's properties abstractly. This is basically how a system is described in General System Theory and Cybernetics, as well as in the study of complex and nonlinear systems. In AI and CogSci research, this type of representation framework is widely used in fields like neural networks and robotics.

In the information-system vocabulary used in this book, the goals of a dynamical system are the special states that the system tends to move into or stay in (often called attractors), the operations are whatever the system can do to cause state changes, and the beliefs are the relations between the goals and the operations.

A simple example of such a system is a room-temperature controller. Here the goal of the system is to keep the temperature of a room within a certain range. Whenever the temperature is outside that range, an operation is executed (starting the heating or cooling mechanism) to bring it back.
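As a rough sketch (again my own illustration, with invented names such as target_range and step), the controller can be written as a one-dimensional dynamical system whose goal is a region of the state space that the operations pull the state back into:

    # A room-temperature controller as a one-dimensional dynamical system.
    # The goal is the region target_range (an attractor); heating and cooling
    # are the operations that move the state back into it.
    target_range = (20.0, 22.0)                  # degrees Celsius, illustrative values

    def step(temperature, drift):
        temperature += drift                     # influence of the environment
        if temperature < target_range[0]:
            temperature += 0.5                   # operation: heating
        elif temperature > target_range[1]:
            temperature -= 0.5                   # operation: cooling
        return temperature

    t = 18.0
    for drift in [0.1, -0.3, 0.2, 0.0, 0.4, -0.1]:
        t = step(t, drift)
    print(round(t, 2))                           # the state is driven toward the goal range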

In an artificial neural network, the beliefs of the system are represented mainly by the topological structure of the network (which is fixed) and the weights (or strengths) of the links. The system's operation is its learning algorithm, which modifies the weights until the network converges to a stable function that maps the input variables to the output variables.
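As a hedged sketch of this process (the network, data, and learning rate are invented for illustration, not taken from the book), a single-layer network trained with the delta rule shows the division of labor: the fixed topology plus the adjustable weights encode the beliefs, and the learning rule is the operation that modifies them until the input-output mapping stabilizes.

    # A single-layer network trained with the delta rule (illustrative only).
    # Beliefs: the fixed topology together with the adjustable connection weights.
    # Operation: the learning rule that modifies the weights after each example.
    weights = [0.0, 0.0]
    bias = 0.0
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]   # logical OR

    def output(x):
        s = sum(w * xi for w, xi in zip(weights, x)) + bias
        return 1 if s > 0 else 0

    for _ in range(20):                       # repeat until the mapping stabilizes
        for x, target in data:
            error = target - output(x)
            for i in range(len(weights)):     # adjust each weight along the error
                weights[i] += 0.1 * error * x[i]
            bias += 0.1 * error

    print([output(x) for x, _ in data])       # prints [0, 1, 1, 1] after convergence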

There are several major issues in developing general-purpose intelligent systems within this framework.

First, this framework suggests a "flat" representation, in the sense that all the components of the system (goals, operations, and beliefs) are represented in terms of the same dimensions. Which properties of the environment, or of the system, should be selected as dimensions? For systems developed to solve one type of problem, this is usually easy to decide, but for a general-purpose system, the number of dimensions can be huge, even though for each problem only a very small percentage of them is relevant.

An obvious alternative is to divide the system into many subsystems, each with its own state space, where the dimensions are chosen to naturally represent one type of problem. However, it is unclear how to divide the overall function of intelligence into subsystems, and how to integrate these subsystems.

When we describe a complex object or process, we usually prefer a hierarchically structured representation, which, unlike a flat representation, does not always depend on the same set of primitives (dimensions). Though in principle the dynamical framework allows hierarchically structured representations, current work usually focuses on specific problems, and it is still unclear how to support such a representation in general. In particular, if some dimensions are no longer used directly to measure properties of the environment, but instead to encode more complicated information, with a point (or a set of points) serving as a symbol for something else, then nearby points are not necessarily similar in meaning, and the trajectory of state changes may become discontinuous when the operations manipulate the symbols directly. Consequently, the representation may lose its intuitive appeal.

Another major issue is the representation of goals. As discussed in the previous chapter, an intelligent system usually pursues multiple goals at the same time. There is little work showing how to handle this, or even how to represent the situation in a state space. Whatever the solution is, it is neither a balance or compromise among multiple attractors nor a matter of reaching them one after another.

Computational system

At the current time, the most widely used framework for an information system is that of the computational system, which originated in mathematics and flourishes in computer science. In this framework, each state of an information system is indicated by the data in its memory, coded as binary strings. The goals of the system are specified by functions, or computations, that map input data into output data. The system's operations are machine instructions that manipulate the data, and thereby change the internal state of the system. The system's beliefs are programs that organize the instructions into proper sequences to carry out the computations.
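A small sketch may make this reading concrete (the function and names are invented here, not the book's): the goal is the function to be computed, the operations are the elementary instructions, and the belief is the program that arranges those instructions into a fixed sequence.

    # A computational-system reading of an information system (illustration only).
    # Goal: the function to be computed (here, the sum of a list of numbers).
    # Operations: elementary instructions that manipulate the data in memory.
    # Belief: the program, a predetermined sequence of those instructions.
    def sum_program(numbers):
        accumulator = 0                       # memory holding the intermediate state
        for n in numbers:                     # instruction: fetch the next datum
            accumulator = accumulator + n     # instruction: add it to the accumulator
        return accumulator                    # output data: the computed result

    print(sum_program([3, 1, 4, 1, 5]))       # the same input always yields 14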

Though this framework has a shorter history than the dynamical system framework, it has become dominant in AI research, mainly because of its natural and close relationship with the computer, the common platform of information systems. Conceptually speaking, the difference between these two frameworks is similar to that between digital systems and analog systems, in that the latter are described using continuous numbers, while the former use discrete symbols.

The representation in this framework is hierarchically structured by nature. Since a data structure can be addressed by name, an object or event can be described at multiple levels of granularity, rather than being restricted to a fixed set of primitives, as in the dynamical system framework. This is clearly an advantage when a general-purpose system is analyzed or designed.

However, when systems are designed within this framework, they end up with strong instinct but weak intelligence, as analyzed in Section 2.2. This is the case because the designers usually treat the system's problem-solving process as a computation carried out by a predetermined program, and therefore leave little room for learning and adaptation.

Inferential system

The inferential system (or reasoning system) framework comes mainly from the study of logic. Generally speaking, every inferential system has the following major components: a formal language in which the system's knowledge is represented, a set of inference rules defined on that language, and a memory and control mechanism that determines how the rules are applied to the stored sentences. The first two components, the language and the rules, constitute a logic, while the last component supports the implementation of the logic in a computer.

In such a system, goals and beliefs are represented as sentences in the language, and the operations of the system include the inference rules that derive new goals and beliefs from existing ones, as well as certain changes the system can cause within itself and in the outside environment.
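As a minimal sketch of this representation (the sentences and their string encoding are my own, not the book's notation), beliefs and goals can be stored as sentences, and an inference rule such as modus ponens derives new sentences from existing ones:

    # Beliefs and goals are sentences; an inference rule derives new sentences.
    # Modus ponens ("from A and A -> B, derive B") applied to string-coded sentences.
    beliefs = {"rainy", "rainy -> wet-ground"}
    goal = "wet-ground"

    def modus_ponens(beliefs):
        derived = set()
        for sentence in beliefs:
            if "->" in sentence:
                antecedent, consequent = [s.strip() for s in sentence.split("->")]
                if antecedent in beliefs:
                    derived.add(consequent)
        return derived

    beliefs |= modus_ponens(beliefs)
    print(goal in beliefs)                    # True: the goal is reached by a derived belief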

Though an inferential system is often implemented in a computer, conceptually speaking this framework is different from the computational system framework. In a computational system, what a piece of data means depends mainly on the programs that process it, while the sentences in an inferential system have meaning provided by the semantics of the language. Also, inference rules are not merely data-manipulating instructions, because each rule needs to be justified according to the semantics of the language, too. As a result, even though each rule is usually implemented as a program, a typical problem-solving process does not have to be determined in advance. Since each rule is justified individually, the rules can be flexibly assembled into inference steps. By contrast, a random sequence of instructions in a computational system usually has no useful overall functionality, so useful sequences need to be programmed in advance.
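The contrast can be illustrated with a small forward-chaining loop (the rule format and facts are invented here): each rule is justified on its own, and the loop applies whichever rule happens to match, so the sequence of inference steps is assembled at run time rather than fixed in advance.

    # Individually justified rules, applied whenever their premises match, so the
    # inference steps are assembled at run time instead of being programmed as a
    # fixed sequence. Each pair reads "premise => conclusion".
    rules = [("rainy", "wet-ground"),
             ("wet-ground", "slippery"),
             ("slippery", "drive-slowly")]
    facts = {"rainy"}

    changed = True
    while changed:                            # keep applying any applicable rule
        changed = False
        for premise, conclusion in rules:
            if premise in facts and conclusion not in facts:
                facts.add(conclusion)         # one justified inference step
                changed = True

    print(sorted(facts))                      # the derivation was not fixed in advance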

The problem of developing AI systems in this framework comes mainly from the fact that the existing logics are either immature or not designed for the purpose of AI. In particular, the study of modern logic has been dominated by mathematical logic, which was designed to provide a solid logical foundation for mathematics. Attempts have been made to extend or revise it for AI, though the resulting systems have been widely criticized for their rigidity and brittleness. Furthermore, many human behaviors are usually judged to be irrational, which leads many people to conclude that intelligence does not follow any logic.

Formalization of intelligent system

In principle, it is hard to say which of the above three frameworks is the best way to formalize an information system. Actually, it can even be argued that they have equivalent expressive and processing power. With a large number of dimensions, it is always possible to represent a computational system or an inferential system in a state space, so as to describe it as a dynamical system. Similarly, it is possible to design an inferential system with enough sentences to describe a given dynamical system or computational system. Finally, since dynamical systems and inferential systems can be implemented in computers (with a certain accuracy), they can be turned into computational systems. Therefore, in principle, an information system can be described in any of the three frameworks.

However, for a given information system, the naturalness and ease with which it can be specified in each of the three can be very different. What is easy to do in one framework can be very difficult in another, even if it can be done in principle. Therefore, it still makes sense to say that one framework is better than the others for a given problem.

Now the problem we are interested in is how to formalize intelligent systems, as specified in Chapter 2. As discussed above, each framework meets its own challenges in doing so.

If each framework has its strengths and weaknesses, why not use them together in a hybrid system? Of course, this is once again possible in principle, but difficult in practice. Even though a hybrid system has the advantage of combining the strengths of different approaches, it also has its own trouble in maintaining the consistency and coherence of the system as a whole, as well as the overhead of providing interfaces between the heterogeneous frameworks within it.

A major, and probably controversial, conclusion proposed in this book is: the most appropriate way to formalize an intelligent system is to describe it as an inferential system. It is more natural and easier to describe the system's goals, operations, and beliefs as sentences in a language than as points or trajectories in a multi-dimensional space, and it is more flexible to describe the working process as an inference process assembled from justifiable steps at the current moment than as the execution of a prearranged program. As for the problems with the existing logics, they just indicate the need to develop a new logic for intelligence, which will be discussed further in the following sections of this chapter.