A General Theory of Intelligence Chapter 3. Inference System

Section 3.3. NARS: language

Function of language

As stated in Chapter 1, what an information system does is to take actions to achieve goals; as stated in Chapter 2, in an intelligent system, or adaptive system in general, the selection of action is based on the system's previous experience, that is, the system usually takes the action that is most likely to achieve the current goal, according to the system's past experience.

What makes the situation more complicated than in instinctive systems is that the system needs to deal with goals for which the best action sequence is unknown at the moment, and has to be found by the system itself. To do this according to experience means to remember the "history" of each action, such as its preconditions, causes, and consequences. Since the new situations are usually not identical to the old situations, and the system cannot afford the resources to remember all the details, the knowledge about each action must be represented in a general form, which is not only compact, and therefore more efficient, but also ignores irrelevant details, so that the individual aspects of a new situation can be recognized as the same as those that happened in the past.

To make the above happen, the system needs to systematically represent its goals and actions, as well as its beliefs that relate them to one another. This representation should allow the description of a situation to happen at different levels, each with its own granularity and details, to satisfy different requirements on the accuracy and complexity of the description. A language is nothing but this "systematic way of representation".

Concretely speaking, in an inference system a language may serve two major functions:

Within the system, a language is used to represent the beliefs, goals, and actions of the system, and therefore can be used to describe the system's working process.
Between the system and other systems in the environment, a language is used to communicate with other systems, and therefore can be used to describe the system's experience and behavior.

These two functions are clearly related to each other, and therefore in NARS, a single language, Narsese, is used for both representation and communication, with some minor differences between the two purposes. The definition of Narsese includes its grammar and semantics, which specify the form and content of the language, respectively.

Semantics of Narsese

Semantics is the study of "meaning" and "truth". In an inference system, the semantic theory plays two major roles. When the system is designed, the semantic theory guides and justifies the design of the inference rules and control mechanism by making the notion of "validity" concrete; when the system runs in an environment, the semantic theory makes the language understandable to other systems, by grounding the components of the language in the environment.

As mentioned in Section 3.2, traditional inference systems are based on model-theoretic semantics, where the "meaning" of a word is the entity it denotes in the model, and the "truth-value" of a statement indicates whether it matches with a fact in the model.

Under the assumption of insufficient knowledge and resources, NARS cannot assume the existence of a "model", as a complete and consistent description of the environment, and specify meaning and truth-value accordingly. Instead, the system's knowledge about the environment all comes from its experience. Consequently, the system's semantics has to be "experience-grounded", that is, the meaning of every word and the truth-value of every statement in NARS are all defined with respect to the given experience of the system, at least in principle.

This experience-grounded semantics is fundamentally different from model-theoretic semantics in several major aspects:

The system's beliefs are not a "world model" that attempts to describe the world as it is. Instead, they are a summary of the system's experience, that is, the record of the interaction between the system and its environment, as far as the system's resources allow it to be recorded.
The meaning of a term (or call it symbol or word) in Narsese, to the system, is determined by its role in the experience, that is, how it has been related to other terms in the past.
The truth-value of a statement in Narsese, to the system, is determined by its relationship with the experience, that is, how it has been supported or refuted by other statements in the past.
Defined in this way, meaning and truth-value are subjective, in the sense that they change as the system's experience develops in time, though they are by no means arbitrary, but are determined by the system's experience and experience-processing capability.
While a semantic model is described in the meta-language, and is not necessarily accessible to the system, the experience is always the system's own experience, and accessible to the system. As a result, the syntax and semantics of a sentence in Narsese are no longer independent of each other, as they are in traditional systems.

We will revisit these semantic properties of NARS repeatedly in the following.

Term-oriented language

The design of Narsese grammar is to a large extent driven by the needs of the above semantics.

In mathematical logic, the design of the formal language is strongly influenced by mathematical languages, and has deliberately kept a distance from natural languages, to avoid ambiguity and other forms of uncertainty. For example, the language used in FOPL consists of constants and variables to represent outside objects, as well as predicates to represent properties of, and relations among, the objects. Each simple proposition represents a statement about a property of an object, or a relation among some objects. Complicated information is represented using logical operations (such as "and", "or", and "not") to connect simple propositions into compound propositions.

The precondition for the usage of such a language is that the world has already been clearly categorized into objects, properties, and relations, where each category has a well-defined boundary and determined criteria, specified independently from the system that lives and works in the world. Furthermore, the state of affairs in the world is fully expressible in the language, so what the system needs to do is simply to identify the true statements among all possible statements.

For a system whose language is used to represent not the world as it is but what the system has experienced, the situation is fundamentally different. Here the basic unit of representation is not the individual object, but the primitive and atomic components of the system's experience, that is, its "percepts" and "acts". From them, complex components are built hierarchically to represent the structures or patterns that appear repeatedly in experience, and some of them intuitively correspond to what we usually call "object", "property", or "relation". However, since all of them are summaries of experience, not names of pre-existing entities, their application boundaries are fuzzy, relatively defined, and context-sensitive.

To deal with this situation, Narsese is designed to be a "term logic" (also called "categorical logic"), where a basic statement has the form of "subject-copula-predicate". In this structure, the subject and the predicate are both terms, and the "copula" links the former to the latter.

A term can be simple, as a string of characters from a certain alphabet, or compound, as a structure composed of other (simpler) terms by a term-operator from a certain set. The basic type of copula is called inheritance in Narsese, and written as "→". As a result, the simplest statement form in Narsese is like "s → p". To make the description easier, in the following examples English words are used as terms, though it is not claimed that the meaning of such a word in NARS is exactly the same as what it means to a native English speaker, but only related to it. At least under this condition we can use the Narsese statement "bird → animal", and say that it intuitively means "Bird is a kind of animal" in English.

As the above example shows, the statement specifies the relation between the two terms as that "bird is a specialization of animal", or equivalently, that "animal is a generalization of bird", where each term represents an abstraction of certain aspects of the system's experience, and the statement says that one can be treated as the other in certain ways.

As said above, the "meaning" of a term to the system is determined by the role it plays in the experience of the system, so now we can put the definition in a more concrete form: the meaning of a term T is determined by its specializations (i.e., all the x that satisfy "x → T") and its generalizations (i.e., all the y that satisfy "T → y"). Furthermore, since this "inheritance" relation is clearly transitive, it can be proved that "S → P" if and only if all the specializations of S are also specializations of P, or, equivalently, if all the generalizations of P are also generalizations of S.

In the technical writings on NARS, the above "specializations" and "generalizations" are called "extension" and "intension", respectively.
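The definition of meaning above can be illustrated with a small sketch. It assumes an idealized experience made only of binary (fully true) inheritance statements, and it treats inheritance as reflexive as well as transitive; reflexivity is an assumption added here, not something stated in the text. The names `closure`, `extension`, `intension`, and the sample terms are hypothetical.

```python
from itertools import product


def closure(statements):
    """Reflexive-transitive closure of a set of (subject, predicate) pairs."""
    terms = {t for pair in statements for t in pair}
    # Assumption: every term inherits from itself.
    known = set(statements) | {(t, t) for t in terms}
    changed = True
    while changed:
        changed = False
        # Chain "a -> b" and "b -> d" into "a -> d" until nothing new appears.
        for (a, b), (c, d) in product(list(known), repeat=2):
            if b == c and (a, d) not in known:
                known.add((a, d))
                changed = True
    return known


def extension(term, known):
    """All specializations x such that 'x -> term' is known."""
    return {s for (s, p) in known if p == term}


def intension(term, known):
    """All generalizations y such that 'term -> y' is known."""
    return {p for (s, p) in known if s == term}


experience = {("robin", "bird"), ("bird", "animal"), ("animal", "organism")}
known = closure(experience)

# "bird -> animal" holds, so every specialization of bird is one of animal,
# and every generalization of animal is one of bird.
print(extension("bird", known) <= extension("animal", known))
print(intension("animal", known) <= intension("bird", known))
```

In this toy experience both subset checks hold, which matches the "if and only if" claim for the statement "bird → animal".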

Evidence and truth-value

Since truth-value in NARS measures the relationship between a statement and the system's experience, there is no absolute truth, under the "openness" assumption accepted by this system. The future is not accurately predictable, so any new experience is possible.

However, this does not mean that every statement is equally justified. In an adaptive system, all justification is a matter of degree. Here a numerical measurement is preferred, not because it is more accurate, but because it is compact and general. Though it may be more informative to remember all the relevant details about the confirmation/disconfirmation of a statement in the past, the system usually cannot afford the resources for that, and doing so would also make the comparison between competing conclusions very complicated. For this reason, NARS uses numerical measurements for truth-value, even though the numbers do not provide all the relevant information about the "history" of a statement.

Since statement "S → P" is equivalent to "all the specializations of S are also specializations of P, and all the generalizations of P are also generalizations of S", it can be seen as a summary of many cases in the system's experience. Therefore, if a case is consistent with this summary, it can be considered as a piece of positive evidence, otherwise it is negative evidence. For example, for statement "bird → animal", its positive evidence includes the shared specializations and generalizations of terms bird and animal, while its negative evidence includes the specializations of bird that are not specializations of animal, as well as the generalizations of animal that are not generalizations of bird.
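As a rough sketch of this definition, the evidence for "S → P" can be counted from the specializations and generalizations of the two terms. The sketch assumes that each term found counts as one unit of evidence, and the function name `count_evidence` and the sample sets are invented for illustration only.

```python
def count_evidence(ext_s, int_s, ext_p, int_p):
    """Return (positive, negative) evidence amounts for 'S -> P'.

    ext_* are sets of specializations, int_* are sets of generalizations.
    Assumption: every term counts as exactly one unit of evidence.
    """
    # Positive: shared specializations and shared generalizations.
    positive = len(ext_s & ext_p) + len(int_s & int_p)
    # Negative: specializations of S missing from P,
    # and generalizations of P missing from S.
    negative = len(ext_s - ext_p) + len(int_p - int_s)
    return positive, negative


# Hypothetical experience about the terms "bird" and "animal".
ext_bird = {"robin", "penguin", "toy_bird"}
int_bird = {"organism", "flyer"}
ext_animal = {"robin", "penguin", "dog"}
int_animal = {"organism", "mortal"}

w_plus, w_minus = count_evidence(ext_bird, int_bird, ext_animal, int_animal)
print(w_plus, w_minus)  # 3 2
```

Note that "dog" and "flyer" count as neither positive nor negative evidence here: a specialization of animal that is not a bird, or a generalization of bird that animal lacks, says nothing about "bird → animal" under this definition.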

Let the available amounts of positive and negative evidence for a statement be written as w⁺ and w⁻, respectively, so that the total amount of evidence is w = w⁺ + w⁻. The frequency of the statement is f = w⁺/w, and the confidence of the statement is c = w/(w+k), where k is a positive constant. For the current discussion, we can take k = 1. The truth-value of the statement is the pair <f, c>.

Therefore, frequency is the proportion of positive evidence among total current evidence; confidence is the proportion of current evidence among all evidence after new evidence of amount k arrives. While frequency indicates the extent to which the statement is consistent with the system's experience, confidence indicates the extent to which the frequency can be modified by future evidence. These two measurements are independent of each other, in the sense that from one, the other cannot be determined, or even bounded, except in trivial cases.
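The two formulas can be written out directly. The following is a minimal sketch, with the names `TruthValue` and `truth_value` chosen here for illustration; rejecting the case of no evidence is a choice of this sketch, since f is undefined when the total is zero.

```python
from typing import NamedTuple


class TruthValue(NamedTuple):
    frequency: float
    confidence: float


def truth_value(w_plus, w_minus, k=1.0):
    """Compute <f, c> from amounts of positive and negative evidence."""
    w = w_plus + w_minus
    if w <= 0:
        # f = w_plus / w is undefined without evidence (choice of this sketch).
        raise ValueError("at least some evidence is required")
    return TruthValue(frequency=w_plus / w, confidence=w / (w + k))


# Same frequency, different confidence: the two measurements are independent.
print(truth_value(1, 0))  # frequency 1.0, confidence 0.5
print(truth_value(9, 0))  # frequency 1.0, confidence 0.9
```

Both statements have only positive evidence, so their frequency is the same, but the second rests on nine times as much evidence and is therefore less sensitive to what arrives next.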

The confidence measurement is the major factor that distinguishes the NARS approach to uncertainty processing from other approaches, such as Bayesian networks and fuzzy logic. Since each belief in NARS is based on finite evidence, its confidence value can indefinitely approach the upper bound 1, but never reach it. Consequently, "absolute truth" is represented by truth-value <1, 1>, a limit of the actual truth-values in the system.
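The upper bound can be checked numerically. The short sketch below takes c = w/(w+k) with k = 1 and uses exact fractions rather than floating point, because for very large w a float would round to 1.0 and hide the gap. The function name `confidence` is chosen for illustration.

```python
from fractions import Fraction


def confidence(w, k=1):
    """Exact confidence c = w / (w + k) for integer evidence amounts."""
    return Fraction(w, w + k)


for w in (1, 10, 100, 10**6):
    c = confidence(w)
    # The gap 1 - c equals k / (w + k): it shrinks but stays positive.
    print(w, c, 1 - c, c < 1)
```

Since 1 - c = k/(w+k) is positive for any finite w and positive k, the value <1, 1> is only reached in the limit.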

Compound terms

Term logic is traditionally considered less powerful than predicate logic, mainly because many statements cannot be put into the "subject-copula-predicate" format. In Narsese, this problem is solved by allowing compound terms and multiple types of copulas. In the following, they are informally introduced. For a formal and detailed discussion of them, see [Wang, 2006].

Unlike simple terms bird and animal that have no internal structure, a compound term consists of one or more component terms that are bundled together by a term operator.

A term can be a set, specified by listing its instances or properties, such as "Boston, Philadelphia, and Pittsburgh" and "red and round".
A term can be specified as intersection or difference of other terms in specialization or generalization, such as "green light" and "unfinished story".
A term can be specified as a relation, or by its relation with other terms, as in "Mary and Ann are sisters" and "Mary is Ann's sister".
A term can be a statement, as in "Tom knows that Mary and Ann are sisters".
A term can be a compound statement, formed from simpler statements, such as "Tom knows that Mary and Ann are sisters, and both of them are his friends."
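One way to picture the list above is as a recursive data structure, in which a compound term holds an operator and a tuple of component terms. This is only an illustrative sketch: the class names and the operator labels such as "extensional_set" and "relation" are hypothetical, and they are not the actual Narsese term operators, which are defined in [Wang, 2006].

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    """A simple term with no internal structure."""
    name: str


@dataclass(frozen=True)
class Compound:
    """Component terms bundled together by a term operator."""
    operator: str  # hypothetical label, not real Narsese syntax
    components: Tuple["Term", ...]


Term = Union[Atom, Compound]


def atoms(term):
    """Collect the names of all simple terms inside a term."""
    if isinstance(term, Atom):
        return {term.name}
    names = set()
    for component in term.components:
        names |= atoms(component)
    return names


cities = Compound("extensional_set",
                  (Atom("Boston"), Atom("Philadelphia"), Atom("Pittsburgh")))
sisters = Compound("relation", (Atom("sister"), Atom("Mary"), Atom("Ann")))
# A compound can contain another compound: "Tom knows that ...".
tom_knows = Compound("relation", (Atom("know"), Atom("Tom"), sisters))

print(sorted(atoms(tom_knows)))  # ['Ann', 'Mary', 'Tom', 'know', 'sister']
```

Because compounds are themselves terms, they can appear as the subject or predicate of a statement, which is what lets richer content fit the "subject-copula-predicate" format.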

Besides inheritance, there are three other types of copula used between a subject term and a predicate term:

Similarity is symmetric inheritance, meaning "is the same as".
Implication is between two statements, meaning "if ... then ...".
Equivalence is also between two statements, meaning "if and only if".

Each of the above copulas has its own criteria for evidence, like inheritance. Consequently, the statements they form are all true to various degrees, measured by frequency and confidence, in the same way as the inheritance statements. Intuitively, a copula indicates a certain form of "substitutability" between the two terms, that is, one "can be used as" the other in a certain way.
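The four copulas and the "subject-copula-predicate" form can be sketched as a small data type. The names below are hypothetical and no Narsese symbols are claimed. Treating equivalence as symmetric is an assumption of this sketch, read off from its "if and only if" gloss; the text states symmetry only for similarity.

```python
from dataclasses import dataclass
from enum import Enum


class Copula(Enum):
    INHERITANCE = "inheritance"
    SIMILARITY = "similarity"
    IMPLICATION = "implication"
    EQUIVALENCE = "equivalence"


# Assumption: swapping the two terms does not change these statements.
SYMMETRIC = {Copula.SIMILARITY, Copula.EQUIVALENCE}


@dataclass(frozen=True)
class Statement:
    subject: object
    copula: Copula
    predicate: object

    def same_content(self, other):
        """True if both statements say the same thing, up to symmetry."""
        if self.copula is not other.copula:
            return False
        pair = (self.subject, self.predicate)
        if pair == (other.subject, other.predicate):
            return True
        return (self.copula in SYMMETRIC
                and pair == (other.predicate, other.subject))


a = Statement("bird", Copula.INHERITANCE, "animal")
b = Statement("animal", Copula.INHERITANCE, "bird")
print(a.same_content(b))  # False: inheritance has a direction
```

Reversing an inheritance statement gives a different claim, while reversing a similarity statement does not.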

An event is a statement with a relative temporal attribute, specified with respect to other events. An example is "Tom met Mary, then they became friends". In Narsese, the two basic temporal relationships are "subsequent" and "simultaneous".

An operation is an event that corresponds to the system's action, such as "to move forward" in a robot. It is a statement with a procedural interpretation.

Expressive power of Narsese

There are three types of sentences in Narsese:

Judgment: a statement with a truth-value. It is a task for the system to accommodate it, as new knowledge, into the system's beliefs.
Question: a statement, or a statement template with variables in it. It is a task for the system to evaluate its truth-value ("yes-no" question) or to instantiate the variables ("wh-question").
Goal: an event, which may have variables in it. It is a task for the system to realize, that is, to make the event happen by executing certain operations.
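The three sentence types can be sketched as three record types, each paired with the kind of task it poses to the system. This is a simplified illustration: statements are kept as plain strings, and the class names, the `task_for` helper, and the "?x" variable notation are assumptions of this sketch rather than Narsese syntax.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Judgment:
    statement: str
    truth: Tuple[float, float]  # (frequency, confidence)


@dataclass(frozen=True)
class Question:
    statement: str  # may contain variables, written here as "?x"


@dataclass(frozen=True)
class Goal:
    event: str  # may also contain variables


Sentence = Union[Judgment, Question, Goal]


def task_for(sentence):
    """Describe the task that a sentence poses to the system."""
    if isinstance(sentence, Judgment):
        return "accommodate as new knowledge into the beliefs"
    if isinstance(sentence, Question):
        if "?" in sentence.statement:
            return "instantiate the variables"
        return "evaluate the truth-value"
    if isinstance(sentence, Goal):
        return "realize the event by executing operations"
    raise TypeError(f"not a sentence: {sentence!r}")


print(task_for(Judgment("bird -> animal", (1.0, 0.9))))
print(task_for(Question("?x -> animal")))
print(task_for(Goal("door is open")))
```

The dispatch mirrors the list above: a judgment is absorbed, a question is answered, and a goal is pursued.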

In summary, this is how Narsese is used for communication and representation:

At the interface, the system's experience is a stream of (input) sentences, each of which is a task for the system to process; the system's behavior is also a stream of (output) sentences, each of which is a task for other systems in the environment to process.
Within the system, there are three types of data items expressed in Narsese: tasks (as described above), beliefs (judgments summarizing previous experience), and operations (built-in executable statements).

Though Narsese is specified by a formal grammar, its semantics makes the language more similar to natural languages than traditional formal languages, such as First-Order Predicate Calculus, are.

Many practical problems are most efficiently represented in special formats, such as matrices in mathematics and trees in data structures. Narsese allows these representations to be embedded in it, using relations and compound terms. In this way, Narsese is a common meta-language of various special representation formats and languages.