To further specify goals, operations, and beliefs within an information system, we need to extend the vocabulary of information systems introduced in Chapter 1. As we have seen, that vocabulary allows us to describe a system abstractly, so as to capture the similarity between animals and machines. Now we need to describe the system formally, or symbolically.
In this book, "formalization" or "symbolization" indicates a special form of abstraction, in which some terms of a natural language (English) are replaced by special symbols that have no innate meaning.
Formalization is a technique commonly used in mathematics, and it has two major advantages over descriptions in natural languages:
On the other hand, formalization is absolutely necessary in the research of AI and CogSci. There is a profound literature in philosophy on the nature of human thinking, but because of the ambiguity of natural languages, every philosophical conclusion typically admits several very different interpretations, and therefore ends up giving little concrete guidance to research and development. This is why it is argued in the Preface that a theory, like the one proposed in this book, should be formalized first, before it is implemented in a computer system.
Currently there are three major frameworks for formalizing an information system. In the following, they will be discussed one by one.
This representation framework is often called "dynamical system representation" or "state space representation". This tradition started in physics, where the systems under description are various objects whose physical properties are measured.
When the system under discussion is abstractly described, the dynamical system framework is still applicable, though the dimensions of the state space no longer correspond to physical properties, but to input, output, and state variables that represent the system's properties abstractly. This is basically how a system is described in General System Theory and Cybernetics, as well as in the study of complex and nonlinear systems. In AI and CogSci research, this type of representation framework is widely used in fields like neural networks and robotics.
In the information system vocabulary used in this book, the goals of a dynamical system are the special states that the system tends to move into or stay in (often called attractors), the operations are whatever the system can do to cause state changes, and the beliefs are the relations between the goals and the operations.
A simple example of such a system is a room-temperature controller. Here the goal of the system is to keep the temperature of a room in a certain range. Whenever the temperature is not in the range, an operation is executed (to start the heating or cooling mechanism) to bring it back.
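As a concrete illustration, the following is a minimal sketch of such a controller in Python. The state space is one-dimensional (the room temperature), the goal is the interval that acts as an attractor, and the operations are the heating and cooling steps. All names and numeric values here are illustrative assumptions, not part of the original example.

    # A thermostat as a dynamical system: the goal range [19, 23]
    # is an attractor, and heating/cooling are the operations that
    # cause state changes. All values are illustrative.
    def step(temperature: float) -> float:
        """One state transition of the controller."""
        if temperature < 19.0:        # below the goal range
            return temperature + 1.0  # operation: heating
        if temperature > 23.0:        # above the goal range
            return temperature - 1.0  # operation: cooling
        return temperature            # inside the attractor: stay

    temp = 30.0
    for _ in range(10):
        temp = step(temp)
    print(temp)  # the trajectory has converged into the goal range

Starting from any initial temperature, the trajectory of states moves toward the goal range and then stays there, which is exactly what makes the range an attractor.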
In an artificial neural network, the beliefs of the system are represented mainly by the topological structure of the network (which is fixed) and the weights (or strengths) of the links. The system's operation is its learning algorithm, which modifies the weights until the network converges to a stable function that maps the input variables to the output variables.
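To make this concrete, here is a minimal sketch assuming a single linear unit trained by the delta rule, one of the simplest learning algorithms; the training data and learning rate are made up for illustration. The repeated weight updates are the system's operations, and the converged weights embody its beliefs about the input-output mapping.

    # A single linear unit trained by the delta rule. The fixed
    # "topology" is the two input links; learning only changes the
    # weights. Data, rate, and iteration count are illustrative.
    weights = [0.0, 0.0]
    data = [((1.0, 0.0), 1.0), ((0.0, 1.0), -1.0)]  # (input, target) pairs
    rate = 0.1

    for _ in range(100):  # repeated learning operations
        for inputs, target in data:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = target - output
            # the operation: nudge each weight to reduce the error
            weights = [w + rate * error * x
                       for w, x in zip(weights, inputs)]

    print(weights)  # near [1.0, -1.0]: the learned stable mapping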
There are several major issues in developing general-purpose intelligent systems within this framework.
First, this framework suggests a "flat" representation, in the sense that all the components of the system (goals, operations, and beliefs) are represented in terms of the same dimensions. Which properties of the environment, or of the system, should be selected as dimensions? For systems developed to solve one type of problem, this is usually easy to decide, but for a general-purpose system the number of dimensions can be huge, even though for each problem only a very small percentage of them is relevant.
An obvious alternative is to divide the system into many subsystems, each with its own state space, where the dimensions are chosen to naturally represent one type of problem. However, it is unclear how to divide the overall function of intelligence into subsystems, and how to integrate these subsystems.
When we describe a complex object or process, we usually prefer a hierarchically structured representation, which, unlike a flat representation, does not always depend on the same set of primitives (dimensions) in the description. Though in principle the dynamical system framework allows hierarchically structured representations, current work usually focuses on special problems, and it is still unclear how to support such a representation in general. In particular, some dimensions may no longer be used directly to measure properties of the environment, but instead code more complicated information, by using a point (or a set of points) as a symbol for something else. In that case, nearby points are not necessarily similar in meaning, and the state-change trajectory may become discontinuous if the operations directly manipulate the symbols. Consequently, the representation may lose its intuitive appeal.
Another major issue is the representation of goals. As discussed in the previous chapter, an intelligent system usually pursues multiple goals at the same time. There is little work showing how to do this, or even how to represent the situation in a state space; whatever the solution is, it is neither a balance or compromise among multiple attractors, nor achieving them one after another.
Though this framework has a shorter history than the dynamical system framework, it has become dominant in AI research, mainly because of its natural and close relationship with the computer, the common platform of information systems. Conceptually speaking, the difference between these two frameworks is similar to that between digital systems and analog systems, in that the latter are described using continuous quantities, while the former are described using discrete symbols.
The representation in this framework is hierarchically structured by nature. Since a data structure can be addressed by name, an object or event can be described at multiple levels of granularity, rather than being restricted to a fixed set of primitives, as in the dynamical system framework. This is clearly an advantage when a general-purpose system is analyzed or designed.
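For instance, the following sketch (in Python, with a made-up structure) shows how a single event can be described at several levels of granularity, each part reached by name rather than by a fixed set of dimensions.

    # A hierarchically structured description of one event. The
    # particular fields and values are illustrative only.
    event = {
        "type": "meeting",
        "time": {"date": "2004-05-01", "start": "09:00", "end": "10:30"},
        "participants": [
            {"name": "Alice", "role": "speaker"},
            {"name": "Bob", "role": "listener"},
        ],
    }

    # A coarse description ...
    print(event["type"])
    # ... or a finer one, reached by following names downward.
    print(event["participants"][0]["name"])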
However, when systems are designed within this framework, they end up with strong instinct but weak intelligence, as analyzed in Section 2.2. This is the case because the designers usually treat the system's problem-solving process as a computation carried out by a predetermined program, and therefore leave little room for learning and adaptation.
In such a system, goals and beliefs are represented as sentences in the language, and the operations of the system include the inference rules that derive new goals and beliefs from existing ones, as well as certain changes the system can cause within itself and in the outside environment.
Though an inferential system is often implemented in a computer, conceptually speaking this framework is different from the computational system framework. In a computational system, what a piece of data means mainly depends on the programs that process it, while the sentences in an inferential system have meanings provided by the semantics of the language. Also, inference rules are not merely data-manipulating instructions, because each rule needs to be justified according to the semantics of the language, too. As a result, even though each rule is usually implemented as a program, a typical problem-solving process does not have to be determined in advance: since each rule is justified individually, the rules can be flexibly assembled into inference steps. By contrast, an arbitrary sequence of instructions in a computational system usually has no useful overall functionality, so the sequence must be programmed in advance.
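As a toy illustration, the following sketch (in Python, with placeholder sentences) repeatedly applies a single rule in the style of modus ponens. Which premises the rule is applied to is decided at run time, rather than being fixed in one prearranged program; the specific sentences are invented for the example.

    # Beliefs and goals are sentences; one inference rule (modus
    # ponens) is applied step by step until the goal is derived.
    beliefs = {"bird(tweety)"}
    rules = [("bird(tweety)", "fly(tweety)")]  # "if P then Q" beliefs
    goal = "fly(tweety)"

    derived = True
    while derived and goal not in beliefs:
        derived = False
        for premise, conclusion in rules:
            if premise in beliefs and conclusion not in beliefs:
                beliefs.add(conclusion)  # one justified inference step
                derived = True

    print(goal in beliefs)  # True: the goal is derived, not programmed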
The problem of developing AI systems in this framework mainly comes from the fact that the existing logics are either immature or not designed for the purposes of AI. In particular, the study of modern logic has been dominated by mathematical logic, which was designed to provide a solid logical foundation for mathematics. Attempts have been made to extend or revise it for AI, though the resulting systems have been widely criticized for their rigidity and brittleness. Furthermore, since many human behaviors are judged as irrational by such logics, many people conclude that intelligence does not follow any logic.
However, for a given information system, the naturalness and ease with which it can be specified in each of the three frameworks can be very different. What is easy to do in one framework can be very difficult in another, even if it can be done in principle. Therefore, it still makes sense to say that one framework is better than the others for a given problem.
Now the problem we are interested in is to formalize intelligent systems, as specified in Chapter 2. As discussed above, each framework meets its own challenges in doing so.
If each framework has its strengths and weaknesses, why not use them together in a hybrid system? Of course, this is once again possible in principle, but difficult in practice. Even though a hybrid system has the advantage of combining the strengths of different approaches, it also has trouble keeping the system as a whole consistent and coherent, as well as the overhead of providing interfaces between the heterogeneous frameworks within it.
A major, and probably controversial, conclusion proposed in this book is that the most appropriate way to formalize an intelligent system is to describe it as an inferential system. It is more natural and easier to describe the system's goals, operations, and beliefs as sentences in a language than as points or trajectories in a multi-dimensional space, and it is more flexible to describe the working process as an inference process assembled from steps justified at the current moment than as the execution of a prearranged program. As for the problems with the existing logics, they just indicate the need to develop a new logic for intelligence, which will be discussed further in the following sections of this chapter.