A General Theory of Intelligence     Chapter 2. Intelligent System

Section 2.3. Intelligence under restriction

What the restriction is

Compared with other popular definitions of intelligence, probably the most unusual part of mine is the requirement that an intelligent system must be able to work under the restriction of "insufficient knowledge and resources".

Concretely, this requirement is further extended into three properties an intelligent system must have: it must be finite, in that its information-processing capacity is bounded by its hardware at any given moment; it must be open, in that it accepts any new knowledge or task whose form it can recognize, even when the content conflicts with what it already has or goes beyond what it can currently handle; and it must work in real time, in that new tasks may show up at any moment, each with its own time requirement.

The purpose of these requirements is to stress the need for an intelligent system to work in a realistic situation, rather than a highly idealized one. After all, every concrete information system has finite hardware at any given moment, which can only hold a certain amount of data and process it at a certain speed; the system has no full control over what can and cannot happen in the environment, nor over when a new goal will show up, or with what time requirement.

Treatment of the restriction

Even though the above restriction sounds realistic, it is usually not fully considered in current AI research.

Few people deny the finite nature of a computer system, but theoretical models (such as the Turing Machine) often ignore this restriction, or take "finite" to mean "as large as needed". Even in situations where the capacity restriction must be respected, the system often requires human intervention, rather than handling it directly. For example, when you write a file to a full disk, it is you, not the computer, who decides how to make space for it.
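To make the contrast concrete, below is a minimal sketch in Python, purely illustrative; the class name, the fixed capacity, and the priority scheme are my own assumptions for the example, not part of the theory in this book. It shows a bounded store that respects the capacity restriction by itself, deciding what to forget rather than asking a human to make space:

import heapq


class BoundedMemory:
    """Stores at most `capacity` items; evicts the lowest-priority item when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []    # min-heap of (priority, key), may contain stale entries
        self._items = {}   # key -> (priority, value)

    def store(self, key, value, priority):
        evicted = None
        if key not in self._items and len(self._items) >= self.capacity:
            # The system itself decides what to forget: the lowest-priority item.
            while True:
                prio, old_key = heapq.heappop(self._heap)
                if old_key in self._items and self._items[old_key][0] == prio:
                    evicted = old_key
                    del self._items[old_key]
                    break
        self._items[key] = (priority, value)
        heapq.heappush(self._heap, (priority, key))
        return evicted


memory = BoundedMemory(capacity=2)
memory.store("a", "first fact", priority=0.9)
memory.store("b", "second fact", priority=0.2)
print(memory.store("c", "third fact", priority=0.7))  # "b" is evicted automatically

The point is not the particular eviction policy, but that the decision of what to give up is made by the system itself, under its own resource bounds, rather than delegated to a human operator.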

Of course, no system can be "open" to all forms of information, because what can be recognized as "information" is limited by the system's input devices, among other things. The "openness" in the current discussion is about the content, not the form, of the information that the system can accept. In existing AI systems, there are usually some types of information that can be recognized (by form), but not accepted (by content). For example, most systems cannot accept conflicting information, or tasks that are beyond the system's current problem-solving capability.
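As a rough illustration of this kind of openness (again in Python; the names and the simple evidence-counting scheme are assumptions made only for the example, and are not the mechanism developed later in this book), a system may treat a conflicting report as negative evidence and revise its degree of belief, rather than refusing the input:

class BeliefTable:
    def __init__(self):
        # statement -> [positive evidence count, total evidence count]
        self._evidence = {}

    def accept(self, statement, is_positive):
        pos, total = self._evidence.get(statement, [0, 0])
        self._evidence[statement] = [pos + (1 if is_positive else 0), total + 1]

    def frequency(self, statement):
        pos, total = self._evidence.get(statement, [0, 0])
        return pos / total if total else None


beliefs = BeliefTable()
beliefs.accept("ravens are black", True)
beliefs.accept("ravens are black", True)
beliefs.accept("ravens are black", False)   # conflicting report accepted, not refused
print(beliefs.frequency("ravens are black"))  # belief weakened to 2/3, but kept

Here the conflicting report weakens the belief instead of being rejected as "invalid input".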

Most of the existing AI systems do not work in real time, in the sense that it is the system, not the environment, that decides when to accept new information, and how long to process it. If it takes an algorithm 3 seconds to solve a problem, then the user usually gets nothing after 2 seconds, and gets nothing more by waiting 4 additional seconds. After all, the running time of an algorithm is a function of the problem instance, and cannot be changed by factors that are not explicitly included in the problem specification.

Therefore, in general, though every AI system has limited, or bounded, knowledge and resources, they are still sufficient with respect to the goals the system attempts to achieve. For goals beyond that scope, the system does not even try.

Since none of the above issues is new, there are various approaches addressing one aspect or another of the insufficiency of knowledge and resources. For example, some systems use a "forgetting" mechanism to manage the shortage of memory space; various "non-monotonic" logics have been proposed to let the system work with incomplete knowledge while remaining open to new evidence; there are "anytime" algorithms that allow the environment, rather than the algorithm itself, to decide when to stop the incremental improvement of the results.
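As a sketch of the last idea (in Python; the task of estimating pi by random sampling and the deadline interface are assumptions made only for the illustration), an anytime procedure keeps refining its current best answer and lets the caller, standing in for the environment, decide when time is up:

import random
import time


def estimate_pi_anytime(deadline, inside=0, total=0):
    """Refine an estimate of pi until `deadline` (a time.monotonic() value) passes.

    Passing the previous counts back in lets a later call continue the refinement
    instead of starting over.
    """
    while time.monotonic() < deadline:
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
        total += 1
    estimate = 4.0 * inside / total if total else None
    return estimate, inside, total


# The environment grants 0.1 second now; it may grant more time later.
estimate, inside, total = estimate_pi_anytime(time.monotonic() + 0.1)
print("best estimate so far:", estimate, "after", total, "samples")

If the environment later grants more time, the same counts can be passed back in and the refinement simply continues; at no point does the caller face the all-or-nothing behavior described above.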

Even so, few systems have fully assumed the insufficiency of knowledge and resources in their design. Instead, they assume the system's knowledge and resources are insufficient in certain respects, while still sufficient in the others.

Idealization and approximation

To treat the knowledge/resources insufficiency only partially is often justified as a necessary idealization. Since the realistic situation is too complicated, the scientific approach is to handle the problems one at a time, then combine the solutions. Though this approach is often proper, it is not a good idea if the perceived "problems" are actually different aspects of a single underlying problem. In the later parts of the book, I will repeatedly reveal relations among these "problems" to show that they are indeed closely entangled with each other, so they are better taken as one problem.

Another objection to taking the knowledge/resources insufficiency as a defining feature of intelligence comes from the position that "intelligence" means "optimization", for which the insufficiency is undesired. A representative opinion can be found in Legg and Hutter (2007), where they give the following working definition of intelligence:

Intelligence measures an agent’s ability to achieve goals in a wide range of environments.
When comparing it with other working definitions, they say:
We consider the addition of resource limitations to the definition of intelligence to be either superfluous, or wrong.
My working definition is listed as an example of the "wrong" case. Their argument is:
... if limited resources are not a fundamental restriction, for example a new model of computation was discovered that was vastly more powerful than the current model, then it would be odd to claim that the unbelievably powerful machines that would then result were not intelligent. Normally we do not judge the intelligence of something relative to the resources it uses. For example, if a rat had human level learning and problem solving abilities, we would not think of the rat as being more intelligent than a human due to the fact that its brain was much smaller.
There are several problems in this argument, which I will address below. In the same article, they introduce their "universal optimal learning agent" AIXI, which is described in detail in Hutter (2005). Among other things, AIXI assumes unlimited resources. Though this assumption makes the model impractical, it is proposed as a theoretical model, with the hope that all practical systems can be evaluated according to how closely they approximate AIXI.

It is true that every theory, including the one described in this book, is an idealization in which many details are omitted. However, if a theory has been "idealized" so far away from the original situation that it ignores some of its defining features, it is no longer very relevant to that situation. At least to me, the resource restriction is such a defining feature of intelligence.

No matter which one better fits the label "intelligence", an "information system with limited resources" and an "information system with unlimited resources" have fundamentally different properties, and the former does not necessarily converge to the latter as it gets more and more resources.

Different rationality

The above analysis does not apply only to the resource restriction. In general, different assumptions about knowledge and resources lead to different models of rationality. Each of them can be justified as "the right thing to do" for a system in the corresponding situation, and can be applied to situations where the assumptions are roughly satisfied.

In a situation where the system can be assumed to have sufficient knowledge and resources (with respect to its goals), the "right thing" to do is to follow computer science, as summarized by Marr. Here "sufficient knowledge" means the designer, or the system itself, can specify the goal as a computation, and its achieving process as an algorithm; "sufficient resources" means the algorithm can be implemented on existing computers with the available resource supply. In this situation we do not really need a new label "intelligence" to be selectively attached to some computations, algorithms, and implementations, because the label adds little value to the work.

This is probably what caused the so-called "AI Effect" (see the topic on the AAAI website): as soon as an "AI" problem is solved, it is no longer considered an "AI" problem. Many AI researchers have complained that, because of this bias, the field has not received the recognition and credit it deserves from the outside world.

To me this is only fair. Before a problem is solved, no one has sufficient knowledge and resources for it, so its solution needs intelligence. When it is "solved" in the traditional sense, it means the computer system now has sufficient knowledge and resources for it, so "intelligence" is no longer needed, and "computation" is enough. A really intelligent computer system should be able to "solve" a problem without having it specified as a computation and treated with an algorithm. Only in this way can a problem be solved while remaining an AI problem. How to actually do this will be explained in the following chapters.

Even before going into the details, we can say that according to the theory introduced in this book, AI is not a subfield of Computer Science (CS); instead, their theoretical assumptions are fundamentally different. CS assumes sufficient knowledge and resources, and therefore is about instinctive systems; AI assumes insufficient knowledge and resources, and therefore is about intelligent systems. They have different usages and applications, and neither is "better" than the other in general. Of course, there is some overlap between the two, and they use each other, but they are still different paradigms, and many previous failures in AI can be explained as being too close to CS in their theoretical assumptions.