Artificial General Intelligence
— A gentle introduction
Pei Wang
[This page contains up-to-date information about the field of Artificial General Intelligence (AGI), collected and organized according to my judgment, though efforts are made to avoid personal biases.]
From AI to AGI
AI: in different directions, and through seasonal cycles
Artificial Intelligence (AI) started with "thinking machine" or
"human-comparable intelligence" as the ultimate goal, as documented by
the following literature:
In the past, there were some ambitious projects aiming at this goal, though they all failed. The best-known examples include the following ones:
Partly due to the recognized difficulty of the problem, in the
1970s-1980s mainstream AI gradually moved away from general-purpose
intelligent systems, and turned to domain-specific problems and
special-purpose solutions, though there are opposite attitudes
toward this change:
Consequently, the field currently called "AI" consists of many
loosely related subfields without a common foundation or framework,
and suffers from an identity crisis:
- External recognition: As soon as a problem is solved, it is no longer considered as requiring "intelligence", so the AI community rarely gets credit.
- Internal fragmentation: The subfields of AI become less and less associated with one another, even though their problems are closely related.
A new spring
Roughly in the period of 2004 to 2007, calls for research on
general-purpose systems returned, both inside and outside mainstream
AI.
Anniversaries are a good time to review the big picture of the
field. In the following collections and events, many
well-established AI researchers raised the topic of
general-purpose and human-level intelligence:
More or less coincidentally, from outside mainstream AI, there were several books with bold titles and novel technical approaches to produce
intelligence as a whole in computers:
- Eric Baum, What is Thought?, 2004
- Jeff Hawkins, On Intelligence, 2004
- Marcus Hutter, Universal Artificial Intelligence, 2005
- Pei Wang, Rigid Flexibility: The Logic of Intelligence, 2006 [The manuscript was finished in 2003.]
- Ben Goertzel & Cassio Pennachin (Editors), Artificial General Intelligence, 2007 [The manuscript was finished in 2003.]
There were also several less technical but more influential books,
with the same optimism on the possibility of building
general-purpose AI:
- Ray Kurzweil, The Singularity Is Near: When Humans Transcend Biology, 2005
- Marvin Minsky, The Emotion Machine: Commonsense Thinking, Artificial Intelligence, and the Future of the Human Mind, 2006
- Ben Goertzel, The Hidden Pattern: A Patternist Philosophy of Mind, 2006
- J. Storrs Hall, Beyond AI: Creating the Conscience of the Machine, 2007
So after several decades, "general-purpose system", "integrated AI",
and "human-level AI" became less taboo (though still far from popular)
topics, as shown by several related meetings:
It's summer again
Since 2008, several research communities have emerged, with similar focuses and overlapping participants:
More research books have been published:
- Joscha Bach, Principles of Synthetic Intelligence PSI: An Architecture of Motivated Cognition, 2009
- John Laird, The Soar Cognitive Architecture, 2012
- Pei Wang and Ben Goertzel (Editors), Theoretical Foundations of Artificial General Intelligence, 2012
- Pei Wang, Non-Axiomatic Logic: A Model of Intelligent Reasoning, 2013
- Ben Goertzel et al., Engineering General Intelligence, Part 1 and Part 2, 2014
In mainstream AI, deep learning has made impressive progress in recent years, which has once again raised many people's hopes for "human-level" AI. The claim "The Turing Test has been passed" and the success of AlphaGo in the board game Go renewed the discussion on what "artificial intelligence" is really about, and how to reach it. There is still no consensus, and the opinions are not even converging.
Several large companies have labeled their results as "steps towards AGI", and their approaches are either extensions of deep learning or integrations of existing AI techniques. This approach is exemplified by large language models like GPT-4, which its creator claims to be "a significant step towards AGI".
Partly triggered by this recent progress, more and more people consider AGI, or whatever it is called, to be really possible. As a consequence, its risk and safety have become a hot topic:
AGI Basics
The most general questions every AGI researcher needs to answer include:
- What is AGI, accurately specified?
- Is it possible to build the AGI as specified?
- If AGI is possible, what is the most plausible way to achieve it?
- Even if we know how to achieve AGI, should we really do it?
[My own answers to these questions are here.]
The major answers given in the field of AGI are summarized below.
What is AGI
Roughly speaking, Artificial General Intelligence (AGI) research has the following features:
- Stressing the general-purpose nature of intelligence,
- Taking a holistic or integrative viewpoint on intelligence,
- Believing the time has come to build an AI that is comparable to human intelligence.
Therefore, "AGI" is closer to the original meaning of "AI", while very different from the current mainstream "AI research", which focuses on domain-specific and problem-specific methods. "AGI" is similar or related to notions like "strong AI", "human-level AI", "complete AI", "thinking machine", "cognitive computing", and some others. Here is an explanation about the selection of the term "AGI".
A complete work of AGI should consist of
- A theory of intelligence (described in a human language),
- A model of the theory (described in a symbolic/mathematical language),
- A computer implementation of the model (realized in software/hardware).
Even though there is a vague consensus on the objective of reproducing "intelligence" as a whole in computers, the current AGI projects are not aimed at exactly the same goal. Though every AGI approach gets its inspiration from the same source, that is, human intelligence, here "intelligence" is understood in several senses. Consequently, AGI projects attempt to duplicate human intelligence at different levels of abstraction:
- Structure
Rationale: Intelligence is produced by the human brain. Therefore, to build an intelligent computer means to simulate the brain structure as faithfully as possible.
Background: Neuroscience, biology, etc.
Examples: HTM, Vicarious
Challenge: There may be biological details that are neither possible nor necessary to be reproduced in AI systems.
- Behavior
Rationale: Intelligence is displayed in how human beings behave. Therefore, the goal should be to make a computer behave exactly like a human.
Background: Psychology, linguistics, etc.
Examples: Turing Test, ChatGPT
Challenge: There may be psychological or social factors that are neither possible nor necessary to be reproduced in AI systems.
- Capability
Rationale: Intelligence is evaluated by problem-solving capability. Therefore, an intelligent system should be able to solve certain practical problems that are currently solvable only by humans.
Background: Computer application guided by domain knowledge
Examples: IBM Watson, AlphaGo
Challenge: There are no defining problems of intelligence, and the special-purpose solutions lack generality and flexibility.
- Function
Rationale: Intelligence is associated with a collection of cognitive functions, such as perceiving, reasoning, learning, acting, communicating, problem solving, etc. Therefore, the goal is to reproduce these functions in computers.
Background: Computer science
Examples: Mainstream AI textbooks, Soar
Challenge: The AI techniques developed so far are highly fragmented and rigid, and it is hard for them to work together.
- Principle
Rationale: Intelligence is a form of rationality or optimality. Therefore, an intelligent system should always "do the right thing" according to certain general principles.
Background: Logic, mathematics, etc.
Examples: AIXI, NARS
Challenge: There are too many aspects in intelligence and cognition to be explained and reproduced by a simple theory.
From top to bottom, these objectives correspond to descriptions of human intelligence at increasingly general levels, each with the aim of reproducing that description in computer systems. Since different descriptions have different granularity and scope, the above objectives are related, but still very different, and do not subsume each other. The best way to achieve one is usually not a good choice for the others. [A more detailed discussion of this issue can be found here.]
Not only has the "I" in AGI been understood in different ways; even the "G" has been interpreted differently, as referring to AI systems that
- Can solve all problems — though no AGI researcher has taken this position, such a 'strawman' target has been used by some people to claim the impossibility of AGI,
- Can solve all human-solvable problems — this is basically why the Large Language Models (LLMs) are considered as AGI,
- Can solve all computable problems — this is roughly why models like AIXI are considered as AGI,
- Can try to solve all representable problems — this is roughly why models like NARS are considered as AGI.
These "AGIs" are after different goals, as clearly shown in their roadmaps:
Because of this diversity in research objectives, the achievements of LLMs have not dominated the current AGI research (as shown in the annual AGI conferences and the Journal of AGI), though to many people outside this research community, "AGI" means "LLM".
Limitations and objections
Since the idea of AI or "thinking machine" appeared, there have been various objections against its possibility. Some people claim to have proved that AGI, or whatever it is called, is theoretically impossible, due to certain fundamental limitations of computers.
Many researchers have argued against these objections. Classical arguments can be found in the following works:
Obviously, all AGI researchers believe that AGI can be achieved (though they have different interpretations of the term). In the introductory chapter of the AGI 2006 Workshop Proceedings, Ben Goertzel and I responded to the following common doubts and objections to this research:
- AGI is impossible.
- There is no such thing as general intelligence.
- General-purpose systems are not as good as special-purpose ones.
- AGI is already included in the current AI.
- It is too early to work on AGI.
- AGI is nothing but hype.
- AGI research is not fruitful.
- AGI is dangerous.
Some of the doubts about the possibility of AGI come from misconceptions about what AGI attempts to achieve or what computers can do. The previous subsection has clarified the former issue, while an analysis of the latter issue can be found here.
Strategies and techniques
On one hand, the ultimate goal of AGI is to reproduce intelligence as a whole, while on the other hand, engineering practice must be step-by-step. To resolve this dilemma, three overall strategies have been proposed:
- Hybrid
Approach: To develop individual functions first (using different theories and techniques), then to connect them together.
Argument: (AA)AI: More than the Sum of Its Parts, Ronald Brachman
Difficulty: Compatibility of the theories and techniques
- Integrated
Approach: To design an architecture first, then to design its modules (using various techniques) accordingly.
Argument: Cognitive Synergy: A Universal Principle for Feasible General Intelligence?, Ben Goertzel
Difficulty: Isolation, specification, and coordination of the functions
- Unified
Approach: Using a single technique to start from a core system, then to extend and augment it incrementally.
Argument: Toward a Unified Artificial Intelligence, Pei Wang
Difficulty: Versatility and extensibility of the core technique
Obviously, the selection of development strategy partially depends on the selection of the research objective.
At the current time, the major techniques used in AGI projects include, though are not limited to:
- logic
- probability theory
- production system
- graph theory
- knowledge base
- learning algorithm
- neural network
- evolutionary computation
- robotics
- multi-agent system
Though each of these techniques is also explored in mainstream AI, using it in a general-purpose system leads to very different design decisions in technical details.
The ethics of AGI
Even if we have found out how to achieve AGI, it does not necessarily mean we really want to do it. Like all major scientific discoveries and technical breakthroughs, AGI has the potential to revolutionize our life and even the fate of the human species, either in a desired way or an undesired way — or, as things usually go, a mixture of the two.
AGI researchers are aware of their responsibility on this topic, though most of them think that, according to the currently available evidence, progress in AGI research will benefit the human species, rather than destroy it. Discussions on how to make AGI "safe" have existed in AGI meetings since the very beginning. Sample discussions include
Of course, many crucial problems remain open, but to find their solutions, the research of AGI should be sped up, rather than slowed down. Once again, some widespread concerns and fears about AGI are based on misconceptions about the nature of AGI.
Representative AGI Projects
The following projects are selected to represent the current AGI research, as for each of them, it can be said that
- It is clearly oriented to AGI (that is why IBM's Watson and DeepMind's AlphaGo are not included)
- It is still very active (that is why Pollock's OSCAR and Brooks' Cog are no longer included)
- It has ample publications on technical details (that is why many recent AGI projects are not included yet, except GPT-4, which is used to represent various deep learning projects toward AGI)
The projects are listed in alphabetical order. Each project name is linked to the project website, where the following quotations are extracted. The focus of the quotations is on the research goal (the 1st question) and technical path (the 3rd question). Two publications on the project are selected, usually one brief introduction and one detailed description.
ACT-R [An Integrated Theory of the Mind; The Atomic Components of Thought]
ACT-R is a cognitive architecture: a theory for simulating and
understanding human cognition. Researchers working on ACT-R
strive to understand how people organize knowledge and produce
intelligent behavior. As the research continues, ACT-R evolves
ever closer into a system which can perform the full range of
human cognitive tasks: capturing in great detail the way we
perceive, think about, and act on the world.
On the exterior, ACT-R looks like a programming language;
however, its constructs reflect assumptions about human
cognition. These assumptions are based on numerous facts
derived from psychology experiments. Like a programming
language, ACT-R is a framework: for different tasks (e.g.,
Tower of Hanoi, memory for text or for list of words, language
comprehension, communication, aircraft controlling),
researchers create models (aka programs) that are written in
ACT-R and that, beside incorporating the ACT-R's view of
cognition, add their own assumptions about the particular
task. These assumptions can be tested by comparing the results
of the model with the results of people doing the same tasks.
ACT-R is a hybrid cognitive
architecture. Its symbolic structure is a production system;
the subsymbolic structure is represented by a set of massively
parallel processes that can be summarized by a number of
mathematical equations. The subsymbolic equations control many
of the symbolic processes. For instance, if several
productions match the state of the buffers, a subsymbolic
utility equation estimates the relative cost and benefit
associated with each production and decides to select for
execution the production with the highest utility. Similarly,
whether (or how fast) a fact can be retrieved from declarative
memory depends on subsymbolic retrieval equations, which take
into account the context and the history of usage of that
fact. Subsymbolic mechanisms are also responsible for most
learning processes in ACT-R.
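The conflict-resolution step described in the quotation can be illustrated with a minimal Python sketch. This is not ACT-R code and does not reproduce its utility equations: the `Production` class, the `select_production` function, the buffer contents, and the utility numbers are all hypothetical, chosen only to show symbolic matching followed by a subsymbolic choice of the highest-utility production.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Buffers = Dict[str, str]  # hypothetical stand-in for the contents of the buffers


@dataclass
class Production:
    name: str
    matches: Callable[[Buffers], bool]  # symbolic condition on the buffers
    utility: float                      # assumed, fixed utility estimate


def select_production(productions: List[Production],
                      buffers: Buffers) -> Optional[Production]:
    """Return the matching production with the highest utility, or None."""
    candidates = [p for p in productions if p.matches(buffers)]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.utility)


# Illustrative productions; names and utilities are made up.
productions = [
    Production("retrieve-fact", lambda b: b.get("goal") == "add", utility=2.5),
    Production("count-up", lambda b: b.get("goal") == "add", utility=1.0),
    Production("stop", lambda b: b.get("goal") == "done", utility=5.0),
]

chosen = select_production(productions, {"goal": "add"})
print(chosen.name)  # retrieve-fact
```

Two productions match the goal "add", and the one with the larger utility is selected. The "stop" production has the highest utility overall but is ignored because its condition does not match.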
AERA [Anytime Bounded Rationality; Autocatalytic Endogenous Reflective Architecture]
AERA is a cognitive architecture - and a blueprint - for constructing agents with high levels of operational autonomy, starting from only a small amount of designer-specified code – a seed. Using a value-driven dynamic priority scheduling to control the parallel execution of a vast number of lines of reasoning, the system accumulates increasingly useful models of its experience, resulting in recursive self-improvement that can be autonomously sustained after the machine leaves the lab, within the boundaries imposed by its designers.
AERA demonstrates domain-independent self-supervised cumulative learning of complex tasks. Unlike contemporary AI systems, AERA-based agents excel at handling novelty - situations, information, data, tasks - that their programmers could not anticipate. It is the only implementable / implemented system in existence for achieving bounded recursive self-improvement.
AERA-based agents learn cumulatively from experience by interacting with the world and generating compositional causal-relational micro-models of its experience. Using non-axiomatic abduction and deduction, it constantly predicts how to achieve its active goals and what the future may hold, generating a flexible opportunistically-interruptable plan for action.
AIXI [ Universal Algorithmic Intelligence: A mathematical top->down approach; Universal Artificial Intelligence]
An important observation is that most, if not all known facets of intelligence can be formulated as goal driven or, more precisely, as maximizing some utility function.
Sequential decision theory formally solves the problem of rational agents in uncertain worlds if the true environmental prior probability distribution is known. Solomonoff's theory of universal induction formally solves the problem of sequence prediction for unknown prior distribution. We combine both ideas and get a parameter-free theory of universal Artificial Intelligence. We give strong arguments that the resulting AIXI model is the most intelligent unbiased agent possible.
The major drawback of the AIXI model is that it is uncomputable, ... which makes an implementation impossible. To overcome this problem, we constructed a modified model AIXItl, which is still effectively more intelligent than any other time t and length l bounded algorithm.
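For orientation, the combination of sequential decision theory and universal induction is commonly written as a single expectimax expression. The form below follows the notation usually found in Hutter's publications, not the quotation above, and should be checked against them: $a$, $o$, and $r$ denote actions, observations, and rewards, $k$ is the current cycle, $m$ the horizon, $U$ a universal Turing machine, and $\ell(q)$ the length of program $q$.

```latex
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
       \bigl[ r_k + \cdots + r_m \bigr]
       \sum_{q \,:\, U(q,\, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

The innermost sum weights every program consistent with the interaction history by $2^{-\ell(q)}$, which is where Solomonoff's universal prior enters. The sum ranges over all such programs, which is the source of the uncomputability mentioned above.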
Cyc [ Cyc: A Large-Scale Investment in Knowledge Infrastructure; Building Large Knowledge-Based Systems]
Vast amounts of commonsense knowledge, representing human consensus reality, would need to be encoded to produce a general AI system. In order to mimic human reasoning, Cyc would require background knowledge regarding science, society and culture, climate and weather, money and financial systems, health care, history, politics, and many other domains of human experience. The Cyc Project team expected to encode at least a million facts spanning these and many other topic areas.
The Cyc knowledge base (KB) is a formalized representation of a vast quantity of fundamental human knowledge: facts, rules of thumb, and heuristics for reasoning about the objects and events of everyday life. The medium of representation is the formal language CycL. The KB consists of terms -- which constitute the vocabulary of CycL -- and assertions which relate those terms. These assertions include both simple ground assertions and rules.
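The distinction between ground assertions and rules can be made concrete with a toy Python sketch. It is not CycL and does not use any Cyc API. The predicate names `isa` and `genls` are borrowed for illustration, while the tuple encoding, the two rules, and the `forward_chain` function are assumptions of this sketch.

```python
from typing import Set, Tuple

Assertion = Tuple[str, str, str]  # (predicate, term, term)

# Ground assertions: statements about specific terms.
facts: Set[Assertion] = {
    ("isa", "Fido", "Dog"),
    ("genls", "Dog", "Mammal"),
    ("genls", "Mammal", "Animal"),
}


def apply_rules(kb: Set[Assertion]) -> Set[Assertion]:
    """Apply two illustrative rules once over all pairs of assertions."""
    derived: Set[Assertion] = set()
    for (p1, a, b) in kb:
        for (p2, c, d) in kb:
            if p2 != "genls" or b != c:
                continue
            if p1 == "isa":      # an instance of a class is an instance of its superclass
                derived.add(("isa", a, d))
            elif p1 == "genls":  # the subclass relation is transitive
                derived.add(("genls", a, d))
    return derived


def forward_chain(kb: Set[Assertion]) -> Set[Assertion]:
    """Repeat rule application until no new assertion is derived."""
    kb = set(kb)
    while True:
        new = apply_rules(kb) - kb
        if not new:
            return kb
        kb |= new


closure = forward_chain(facts)
print(("isa", "Fido", "Animal") in closure)  # True
```

Starting from three ground assertions, the rules derive three more, including the fact that Fido is an animal, which was never stated directly.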
GPT-4 [ GPT-4 Technical Report; Sparks of Artificial General Intelligence]
We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks.
The combination of the generality of GPT-4's capabilities, with numerous abilities spanning a broad swath of domains, and its performance on a wide spectrum of tasks at or beyond human-level, makes us comfortable with saying that GPT-4 is a significant step towards AGI.
HTM [ Hierarchical Temporal Memory; On Intelligence]
At the core of every Grok model is the Cortical Learning Algorithm (CLA), a detailed and realistic model of a layer of cells in the neocortex. Contrary to popular belief, the neocortex is not a computing system, it is a memory system. When you are born, the neocortex has structure but virtually no knowledge. You learn about the world by building models of the world from streams of sensory input. From these models, we make predictions, detect anomalies, and take actions.
In other words, the brain can best be described as a predictive modeling system that turns predictions into actions. Three key operating principles of the neocortex are described below: sparse distributed representations, sequence memory, and on-line learning.
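Of the three principles named above, sparse distributed representations are the easiest to illustrate in a few lines. The sketch below is not Numenta code; the vector size, the number of active bits, and the helper names `random_sdr` and `overlap` are assumptions made for illustration. It shows why sparse binary codes tolerate noise: a corrupted copy still shares most of its active bits with the original, whereas two unrelated codes share very few.

```python
import random
from typing import FrozenSet


def random_sdr(size: int, active: int, rng: random.Random) -> FrozenSet[int]:
    """Represent a sparse binary vector by the set of its active bit indices."""
    return frozenset(rng.sample(range(size), active))


def overlap(a: FrozenSet[int], b: FrozenSet[int]) -> int:
    """Number of active bits shared by two representations."""
    return len(a & b)


rng = random.Random(0)
SIZE, ACTIVE = 2048, 40  # illustrative values: about 2% of the bits are active

x = random_sdr(SIZE, ACTIVE, rng)
y = random_sdr(SIZE, ACTIVE, rng)  # an unrelated representation

# Noisy copy of x: keep 30 of its active bits and add 10 bits that x lacks.
kept = set(rng.sample(sorted(x), ACTIVE - 10))
extra = set(rng.sample(sorted(set(range(SIZE)) - x), 10))
noisy_x = frozenset(kept | extra)

print(overlap(x, noisy_x))  # 30, by construction
print(overlap(x, y))        # typically close to zero for unrelated codes
```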
LIDA [The LIDA Architecture; LIDA Tutorial]
Implementing and fleshing out
a number of psychological and neuroscience theories of
cognition, the LIDA conceptual model aims at being a cognitive
"theory of everything." With modules or processes for
perception, working memory, episodic memories,
"consciousness," procedural memory, action selection,
perceptual learning, episodic learning, deliberation,
volition, and non-routine problem solving, the LIDA model is
ideally suited to provide a working ontology that would allow
for the discussion, design, and comparison of AGI systems. The
LIDA technology is based on the LIDA cognitive cycle, a sort
of "cognitive atom." The more elementary cognitive modules
play a role in each cognitive cycle. Higher-level processes
are performed over multiple cycles.
The LIDA architecture
represents perceptual entities, objects, categories,
relations, etc., using nodes and links .... These serve as
perceptual symbols acting as the common currency for
information throughout the various modules of the LIDA
architecture.
MicroPsi [The MicroPsi Agent Architecture; Principles of Synthetic Intelligence]
The MicroPsi agent architecture describes the interaction of emotion,
motivation and cognition of situated agents, mainly based on the Psi
theory of Dietrich Dörner.
The Psi theory addresses emotion, perception, representation and bounded
rationality, but being formulated within psychology, has had relatively
little impact on the discussion of agents within computer science.
MicroPsi is a formulation of the original theory in a more abstract and
formal way, at the same time enhancing it with additional concepts for
memory, building of ontological categories and attention.
The agent framework uses semantic networks, called node nets, that are a
unified representation for control structures, plans, sensory and
action schemas, Bayesian networks and neural nets. Thus it is possible
to set up different kinds of agents on the same framework.
NARS [Intelligence: From Definition to Design; Rigid Flexibility: The Logic of Intelligence]
What makes NARS different from conventional reasoning systems is its ability to learn from its experience and to work with insufficient knowledge and resources. NARS attempts to uniformly explain and reproduce many cognitive facilities, including reasoning, learning, planning, etc, so as to provide a unified theory, model, and system for AI as a whole. The ultimate goal of this research is to build a thinking machine.
The development of NARS takes an incremental approach consisting of four major stages. At each stage, the logic is extended to give the system a more expressive language, a richer semantics, and a larger set of inference rules; the memory and control mechanism are then adjusted accordingly to support the new logic.
In NARS the notion of "reasoning" is extended to represent a system's ability to predict the future according to the past, and to satisfy the unlimited resources demands using the limited resources supply, by flexibly combining justifiable micro steps into macro behaviors in a domain-independent manner.
OpenCog [The General Theory of General Intelligence: A Pragmatic Patternist Perspective; Engineering General Intelligence, Part 1 and Part 2]
OpenCog, as a software framework, aims to provide research scientists and software developers with a common platform to build and share artificial intelligence programs. The long-term goal of OpenCog is acceleration of the development of beneficial AGI.
OpenCogPrime is a specific AGI design being constructed within the OpenCog framework. It comes with a fairly detailed, comprehensive design covering all aspects of intelligence. The hypothesis is that if this design is fully implemented and tested on a reasonably-sized distributed network, the result will be an AGI system with general intelligence at the human level and ultimately beyond.
While an OpenCogPrime based AGI system could do a lot of things, we are initially focusing on using OpenCogPrime to control simple virtual agents in virtual worlds. We are also experimenting with using it to control a Nao humanoid robot. See http://novamente.net/example for some illustrative videos.
Sigma [Lessons from Mapping Sigma onto the Standard Model of the Mind; The Sigma Cognitive Architecture and System]
The goal of this effort is to develop a sufficiently efficient,
functionally elegant, generically cognitive, grand unified, cognitive
architecture in support of virtual humans (and hopefully intelligent
agents/robots – and even a new form of unified theory of human cognition
– as well).
Our focus is on the development of the Sigma (∑) architecture, which explores the graphical architecture hypothesis
that progress at this point depends on blending what has been learned
from over three decades worth of independent development of cognitive
architectures and graphical models, a broadly applicable state-of-the-art formalism for constructing intelligent mechanisms. The result is a hybrid (discrete+continuous) mixed
(symbolic+probabilistic) approach that has yielded initial results
across memory and learning, problem solving and decision making, mental
imagery and perception, speech and natural language, and emotion and
attention.
SNePS [The GLAIR Cognitive Architecture; SNePS Tutorial]
The long term goal of the SNePS Research Group is to understand the
nature of intelligent cognitive processes by developing and
experimenting with computational cognitive agents that are
able to use and understand natural language, reason, act, and
solve problems in a wide variety of domains.
The SNePS knowledge representation, reasoning, and
acting system has several features that facilitate
metacognition in SNePS-based agents. The most prominent is the
fact that propositions are represented in SNePS as terms
rather than as logical sentences. The effect is that
propositions can occur as arguments of propositions, acts, and
policies without limit, and without leaving first-order logic.
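The idea that propositions are terms, and can therefore appear as arguments of other propositions, can be sketched in Python. This is not SNePS syntax or its API; the `Atom` and `Prop` classes, the relation names, and the `depth` helper are hypothetical and serve only to show the nesting.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    """An individual term, such as a person or an object."""
    name: str


@dataclass(frozen=True)
class Prop:
    """A proposition that is itself a term, so it can be an argument of another."""
    relation: str
    args: Tuple["Term", ...]


Term = Union[Atom, Prop]


def depth(term: Term) -> int:
    """Nesting depth: 0 for an atom, 1 + the deepest argument for a proposition."""
    if isinstance(term, Atom):
        return 0
    return 1 + max((depth(a) for a in term.args), default=0)


john, mary, tweety = Atom("John"), Atom("Mary"), Atom("Tweety")

flies = Prop("Flies", (tweety,))
believes = Prop("Believes", (mary, flies))   # a proposition about a proposition
knows = Prop("Knows", (john, believes))      # nesting can continue without limit

print(depth(knows))  # 3
```

Because `Prop` is just another kind of `Term`, no special mechanism is needed for statements about statements, which is the property the quotation connects to metacognition.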
Soar [A Gentle Introduction to Soar; The Soar Cognitive Architecture]
The ultimate in intelligence would be complete rationality which would imply the ability to use all available knowledge for every task that the system encounters. Unfortunately, the complexity of retrieving relevant knowledge puts this goal out of reach as the body of knowledge increases, the tasks are made more diverse, and the requirements in system response time more stringent. The best that can be obtained currently is an approximation of complete rationality. The design of Soar can be seen as an investigation of one such approximation.
For many years, a secondary principle has been that the number of distinct architectural mechanisms should be minimized. Through Soar 8, there has been a single framework for all tasks and subtasks (problem spaces), a single representation of permanent knowledge (productions), a single representation of temporary knowledge (objects with attributes and values), a single mechanism for generating goals (automatic subgoaling), and a single learning mechanism (chunking). We have revisited this assumption as we attempt to ensure that all available knowledge can be captured at runtime without disrupting task performance. This is leading to multiple learning mechanisms (chunking, reinforcement learning, episodic learning, and semantic learning), and multiple representations of long-term knowledge (productions for procedural knowledge, semantic memory, and episodic memory).
Two additional principles that guide the design of Soar are functionality and performance. Functionality involves ensuring that Soar has all of the primitive capabilities necessary to realize the complete suite of cognitive capabilities used by humans, including, but not limited to reactive decision making, situational awareness, deliberate reasoning and comprehension, planning, and all forms of learning. Performance involves ensuring that there are computationally efficient algorithms for performing the primitive operations in Soar, from retrieving knowledge from long-term memories, to making decisions, to acquiring and storing new knowledge.
A rough classification
The above AGI projects are roughly classified in the following
table, according to the type of their answers to the previously
listed 1st question (on research goal) and 3rd question (on
technical path).
goal \ path | hybrid | integrated            | unified
principle   |        |                       | AERA, AIXI, NARS
function    |        | OpenCog, Sigma, Soar  | SNePS
capability  |        |                       | Cyc
behavior    |        | ACT-R, LIDA, MicroPsi | GPT-4
structure   |        |                       | HTM
Since this classification is made at a high level, projects in the same entry of the table are still quite different in the details of their research goals and technical paths.
In summary, the current AGI projects are based on very different theories and techniques.
AGI Literatures and Resources
AGI collections: