# Will ChatGPT and its successors be Artificial General Intelligence?

Author: Pei Wang (**[pei.wang@temple.edu](mailto:pei.wang@temple.edu)**)

Translated by Tangrui Li (**[tuo90515@temple.edu](mailto:tuo90515@temple.edu)**)

Edited by Christian Hahm (**[christian.hahm@temple.edu](mailto:christian.hahm@temple.edu)**)

-----

⚠ This article was first published [here](https://mp.weixin.qq.com/s/j5xPMjrwTLconbUS4MTc9A) on Fanpu (返朴, WeChat ID: fanpu2019) in Chinese; this is a translated version.

**Fanpu Editorial**: *After ChatGPT swept society with its performance, OpenAI released GPT-4 on March 14, claiming that it can solve complex problems more accurately. Can such models become the artificial general intelligence (AGI) that we have been dreaming of? This article analyzes, on theoretical grounds, some deep limitations of ChatGPT and several common misunderstandings about it. Although the discussion is aimed at ChatGPT, the conclusions apply to other large language models as well.*

------

ChatGPT has received much praise, and much controversy, since it was released. Most of us are surprised by its huge amount of knowledge and fluent language skills, but some still assert that it is not useful because of the "serious nonsense" [editor's note: AI dialogue which is stated confidently and verbosely, and may sound correct on the surface, but lacks logical consistency upon further investigation] it spouts from time to time. As for its future prospects, some people are optimistic about improvements in the efficiency of mental work, while others pessimistically predict a reduction of employment opportunities. This article does not intend to discuss these side issues, and focuses on a single topic: Will this kind of system become so-called "artificial general intelligence"?

## What is ChatGPT?

ChatGPT belongs to the family of "Large Language Models" (LLMs). An LLM's direct goal is to capture the statistical regularities of human language use. Its construction mainly includes two stages:

(1) First, language materials provided by the Internet and other sources are used to train a giant artificial neural network that directly summarizes the behavior of language users at the level of wording and phrasing. The simplest case is to count how often one word appears after another, such as how many times the next word is "apple" after "red" (a minimal sketch of such counting follows these two stages). Since the vocabulary of a language is finite, such statistics are possible in principle (whether they are practically achievable is another matter), but given the massive amount of calculation required, no human learns a language this way. Likewise, given the first few words of a sentence, its probable continuations can be estimated, just as when you type the beginning of a word on your smartphone and it is completed automatically according to your habits. It is just that the data (and computing resources) used by ChatGPT are far beyond your imagination, which makes you think it is intelligent. This explains the abundance of its content and the fluency of its language: it responds with what speakers of the language are most likely to say in that context.

(2) On the basis of the above training, ChatGPT went through an additional reinforcement learning process, in which human trainers provided it with many typical questions and then rewarded or punished its replies, so as to make its behavior conform to human requirements. This explains why it gives responses that deviate from the statistical data on certain questions, especially when the statistical results could cause trouble or do not fit its position as a program.
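To make stage (1) concrete, here is a minimal sketch of the kind of next-word statistics described above. It is a deliberately tiny stand-in: real LLMs train neural networks to predict tokens from long contexts rather than tabulating raw word pairs, and the toy corpus is invented purely for illustration.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for the web-scale text an LLM is trained on.
corpus = "the red apple fell . she ate the red apple . the red car stopped".split()

# Count how often each word follows each other word (bigram statistics).
followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def next_word_distribution(prev):
    """Estimate P(next word | previous word) from the raw counts."""
    counts = followers[prev]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

# After "red", "apple" was seen twice and "car" once:
print(next_word_distribution("red"))  # {'apple': 0.666..., 'car': 0.333...}
```

Completing text then amounts to repeatedly picking a likely follower; ChatGPT differs in scale and architecture, not in the basic goal of modeling which word is likely to come next.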
Although the above training covers many scenarios, there will always be problems beyond it. That is, when neither the language materials nor the subsequent training yields a significant statistical conclusion, ChatGPT (or the artificial neural network technology underlying it) responds with the "closest" answer, where "closeness" is measured statistically. When training examples of "correct" answers are insufficient, the proportion of answers that are not meaningfully relevant increases, and the credibility of the response becomes very questionable: the answer is "statistically close" rather than "close in meaning" (as in human communication), which leads to the serious nonsense mentioned earlier. (A small illustration of this contrast is given at the end of this section.)

Although the above introduction is simple, some fundamental limitations of the system can already be seen. Since the training material cannot exhaust all practical situations, and a solution based on statistical (rather than semantic) similarity may not be reliable, especially in rare cases, the system is probably not useful for finding solutions to problems beyond the current human consensus. This technology can summarize and express existing human knowledge, but it has few effective ways to extend that knowledge. And since the goal of such models is to "reproduce the average behavior of an average human being", it is natural for a problem to be buried in platitudes, even when the training materials DO contain insights about it.

It is also worth noting that, for the same reason, saying "ChatGPT thinks ..." actually means "people think ...": ChatGPT has no "personal opinions". Many users are keen to ask for ChatGPT's views on various value-laden issues, and then interpret the responses as "reflecting the values of artificial intelligence systems", which is a misunderstanding of large language models.
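As the promised illustration of "statistically close" versus "close in meaning": the sketch below scores candidate answers by bag-of-words cosine similarity, one crude statistical notion of closeness (real models use learned vector representations instead, and the sentences here are invented for illustration).

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity: a purely statistical 'closeness'."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(n * n for n in va.values()))
            * math.sqrt(sum(n * n for n in vb.values())))
    return dot / norm if norm else 0.0

question = "can penguins fly"
candidates = [
    "penguins can fly very well",     # statistically closest, factually wrong
    "no they are flightless birds",   # correct, but shares no words
]
for answer in candidates:
    print(f"{cosine_similarity(question, answer):.2f}  {answer}")
# 0.77  penguins can fly very well
# 0.00  no they are flightless birds
```

An answer can thus be ranked "closest" while being exactly wrong; when relevant training examples are scarce, closeness of this statistical kind is all such a system has to go on.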
## What is AGI?

There is not much consensus on how to identify AI systems and distinguish them from non-AI systems, but this does not mean the systems we call AI cannot be concretely defined and classified. This is discussed in detail in reference [1] and briefly in reference [2]. The main conclusion is that "AI" refers to the reproduction of a certain aspect (but not all aspects) of human intelligence. For different researchers, this aspect is one of the following:

1. Structure: to interpret AI as systems based on the structure of the human brain.
2. Behavior: to interpret AI as systems which act and behave exactly like a human being.
3. Capability: to interpret AI as systems which solve the difficult problems that we think only humans can solve.
4. Function: to interpret AI as systems equipped with the various cognitive functions of human beings, e.g., learning, reasoning, perceiving, moving, communicating.
5. Principle: to interpret AI as systems which follow a unified cognitive mechanism, such as the logic of human intelligence.

My opinion is that all five have their theoretical and practical importance, but they are not the same as each other, nor do they include each other, although they are all called "artificial intelligence" for historical reasons. These, then, are the existing understandings of AI; but how could AI become *general*, i.e., become artificial general intelligence?

When AI was first introduced, research aimed at developing computer systems comparable to human intelligence as a whole, as is apparent in Turing's paper (ref. [3]) and other early literature. However, the efforts to build general-purpose systems were repeatedly frustrated, leading most AI researchers to turn to specific problems; research on general-purpose intelligence came to be regarded as a dead end, and the initial ambition was dismissed as a daydream. About 20 years ago, some AI researchers (including myself) who disagreed with this trend began to come together and chose the name "Artificial General Intelligence" (AGI) as a new banner. A major consideration at the time was that the "g-factor" ("g" for "general") and the corresponding "general intelligence" were already well-known concepts in psychology (in the study of the Intelligence Quotient, IQ), so the artificial counterpart could naturally be called AGI. Compared with other candidate names, such as "Real AI", "Strong AI", and "Human-Level AI", this name is more accurate. After the name was settled, the first AGI anthology was published in 2007 (ref. [4]), and the annual AGI conference and the AGI journal began in the following years, marking the departure from mainstream AI.

With the rise of deep learning, many large companies began to describe their technological progress as "a major step towards AGI". Due to their huge influence, the concept of AGI has become closely associated with deep learning in the public's impression. Some people even think that since deep learning can be used in many fields to solve different problems, it already is AGI. The misunderstanding here is to confuse a "general-purpose technology" with a "general-purpose system". Deep learning can indeed be regarded as a general-purpose technology, but a computer system developed with this technology can usually perform only one task (e.g., playing Go, image classification, translation) or a limited number of related tasks, so it is a special-purpose system. A general-purpose system cannot be one that only does a specific thing, no matter how well it does it.

So here comes the question: does "general purpose" have any meaning besides being the opposite of "special purpose"? In current research, the understandings of a generally intelligent system mainly include the following:

1. It can solve all problems.
2. It can solve all problems that humans can solve.
3. It can solve all problems under a certain definition (e.g., all problems representable by a Turing machine).
4. It tries to solve (but is not guaranteed to solve) every problem it perceives.

Of these four understandings, I am not aware of any AGI researcher who takes the first as the goal, and it has been repeatedly rejected [5, 6]: in science and technology, "general purpose" can only be relative, conditional, or limited, as in "universal Turing machine", "general-purpose computer", and "general intelligence" (in psychology), so no system can solve literally "all" problems. Apart from the first, the second, third, and fourth understandings can all be regarded as reasonable (whether they can actually be realized is another matter). Combining these 3 understandings of generality with the 5 understandings of AI (structure, behavior, ...) gives 15 versions of "AGI". Though not all of these branches are currently active, research has appeared in every one of them.
## What is the relationship between ChatGPT and AGI?

Given the above, ChatGPT is not "all of AGI". Under the second understanding of generality combined with the second understanding of AI, i.e., "AI should behave like a human being and solve all problems that humans can solve", ChatGPT can be said to be relevant to AGI. As an example of AGI research under other understandings, take my own project NARS (Non-Axiomatic Reasoning System) [7]: it attempts to use learned knowledge and limited resources to solve every problem it perceives, and thus falls under the fourth understanding of generality and the fifth understanding of AI. NARS may use a large language model like ChatGPT as one of its knowledge sources and language interfaces, but without fully trusting the information obtained, just as we do not fully trust what we find when searching the Internet, let alone relying on the model to carry out the system's core reasoning and learning. The purpose of this article is not to introduce NARS; materials are openly available for interested readers.

There are two separate branches of AGI research: one, pursued by large companies, is roughly based on deep learning, while the other is still exploring non-mainstream approaches. The latter was formed before the emergence of deep learning and has never recognized deep learning as its core technology. The reasons are far beyond what this article can cover; I only want to show that not all AGI researchers believe deep learning (including ChatGPT and other artificial neural networks) is the best way to achieve AGI. It can hardly be said that they hold this opinion out of arrogance, jealousy, or even ignorance: before deep learning became popular around 2012, much related work had already been presented at the AGI conference, and most of the scholars there held a negative attitude towards it. Conversely, deep learning researchers seldom spend time on other AGI research, given its lack of notable achievements so far.

## Pros and cons of behaviorism

Now some readers may ask: if the route represented by ChatGPT is not the only, or even the best, way to realize AGI, why does it seem the closest to success? This has to do with the characteristics of the "behavioral criterion" (the goal of "acting like a human"). I discussed the Turing test and the ELIZA effect in reference [8] seven years ago. Some statements there are clearly no longer suited to the current situation, but I still agree with the other points in that article, so here I only add some complements.

Among the five understandings of AI, "behavior" and "capability" are the most intuitive, and therefore the easiest for the public to accept. Take AlphaGo: naturally, people agree that it is intelligent; it even beat the world champion, how could it not be? But these intuitive judgments have their own problems, and an obvious one is anthropocentrism. As pointed out in [8], although "speaking like a human" can serve as a sufficient condition for "intelligence", it is certainly not a necessary condition (Turing saw this in reference [3], but did not expand on it); otherwise, the universe could contain no intelligence other than ours.

When evaluating chatbots, the ELIZA effect cannot be ignored. ELIZA was a well-known chatbot of the 1960s, famous not because it was technologically advanced (it relied on pre-made templates and canned routines for conversation), but because it was deceptive: many people mistakenly believed the program was intelligent.
Hence the "ELIZA effect" refers to people being misled by the behavior of a computer into believing that it possesses intelligent abilities that it does not have. The comment on ChatGPT that "if it appears to understand what you say, then it indeed understands you" is a typical form of this effect, and similar judgments include claims that ChatGPT "has emotions", "can perform logical reasoning", and even "has a certain level of self-awareness".

From the perspective of cognitive science, the ELIZA effect is not hard to explain. When we observe a novel phenomenon, we usually try to explain and understand it with our most familiar concepts. This is similar to the assimilation proposed by Piaget (Jean Piaget, psychologist, 1896-1980), and is related to the abductive reasoning proposed by Peirce (Charles Sanders Peirce, logician, 1839-1914). Uninformed observers will inevitably explain ChatGPT's workings by analogy to human behavior. For large language models the effect is especially prominent, since our judgments about whether other people have various cognitive capabilities (e.g., understanding, reasoning, emotion, consciousness) are usually made through dialogue. Therefore, when a system shows the ability to converse with you, you will naturally assume that it has the human-like cognitive functions behind such conversation, which it in fact does not have.

Some readers may ask: "How do you know it DOES NOT have those capabilities?" The full answer is long, but here I will briefly analyze only the claimed "capability of logical reasoning". ChatGPT does show good logical reasoning in many examples, but in other cases it is obviously confused, and the difference tends to track the amount of relevant training data. Yet logical reasoning is based on the structure or pattern of knowledge, independent of its content: for example, "A is C" can be deduced from "A is B" and "B is C", and this has nothing to do with which specific concepts A, B, and C represent (see the sketch below). Therefore, ChatGPT's uneven performance exposes that it is not really doing logical reasoning, but only imitating human verbal behavior. We can even say that ChatGPT "does not solve problems, but summarizes people's solutions to problems". In a sense, this can indeed be called a "universal" method of problem solving, much like "not solving the problem, but 'solving' the person who raised it".
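To show what "reasoning based on structure, independent of content" means, here is a minimal sketch (a toy program written for this article, not NARS or any deployed reasoner) that applies the deduction rule above purely by pattern, so its behavior is identical no matter which symbols are plugged in:

```python
def deduce(facts: set[tuple[str, str]]) -> set[tuple[str, str]]:
    """Close a set of (A, B) facts, each read as "A is B", under the rule:
    from "A is B" and "B is C", conclude "A is C"."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(facts):
            for b2, c in list(facts):
                if b == b2 and (a, c) not in facts:
                    # The rule matches only the pattern of the premises;
                    # it never inspects what a, b, or c actually mean.
                    facts.add((a, c))
                    changed = True
    return facts

closure = deduce({("Tweety", "bird"), ("bird", "animal")})
print(("Tweety", "animal") in closure)  # True: derived purely from the pattern
```

Replace the three symbols with any others, familiar or nonsensical, and the same conclusion is drawn; a system whose reasoning quality varies with how often the content appeared in its training data is doing something else.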
## My conclusion

I think large language models do have great theoretical and practical value, but an LLM is fundamentally different from a model of intelligence, since intelligence cannot be achieved merely by imitating human verbal behavior, even when that behavior is supplemented with sensory-motor data, as in PaLM-E and GPT-4. Even if we consider only the processing of language, a language model treats language itself as the object to be simulated, while a model of intelligence treats language as a communication tool and a source of knowledge. Specifically, the primary goal of a large language model is to speak like a person, while the goal of an intelligent system is to complete the current communication task according to its own needs, and only under that premise does it consider using language the way ordinary people do. The resulting difference is that an intelligent system does not necessarily complete a sentence the way most people would, but expresses its own point of view.

The views expressed in this article are obviously inconsistent with the currently prevailing comments on ChatGPT, but interested readers may wish to pose the title of this article as a question to ChatGPT or other large language models, and then compare their responses with this article to see which is more likely to have been generated by an intelligent system.

## References

[1] "[On Defining Artificial Intelligence](https://sciendo.com/downloadpdf/journals/jagi/10/2/article-p1.pdf)", Pei Wang, Journal of Artificial General Intelligence, 10(2):1-37, 2019

[2] "[当你谈论人工智能时,到底在谈论什么? (When you are talking about artificial intelligence, what exactly are you talking about?)](http://mp.weixin.qq.com/s?__biz=MzAwMzc2MTA4Ng==&mid=2247488863&idx=1&sn=fe1958c0155f02507c1e11590a9b2921&chksm=9b37638eac40ea988e5e9614f4ef442f060af6f352515568f6066eac0be752c2df6038eab62a&scene=21#wechat_redirect)", Pei Wang (王培), 《赛先生》, 2015-08-06

[3] "Computing Machinery and Intelligence", Alan Turing, Mind, 59(236):433-460, 1950

[4] [Artificial General Intelligence](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=782aceec18dd97923ea8d1eb93c326133ba980c2), Goertzel, B. and Pennachin, C. (eds.), Springer, New York, 2007

[5] "[Introduction: Aspects of artificial general intelligence](https://cis.temple.edu/~pwang/Publication/AGI_Aspects.pdf)", Wang, P. and Goertzel, B., in Advances in Artificial General Intelligence, pp. 1-16, Goertzel, B. and Wang, P. (eds.), IOS Press, Amsterdam, 2007

[6] "[计算机不是只会“计算”,图灵机也不是一台“机器” (Computers do not only "compute", and a Turing machine is not a machine)](https://mp.weixin.qq.com/s?__biz=Mzg2MTUyODU2NA==&mid=2247496451&idx=1&sn=5c5fa083aab6466fb25234020fc999c8&scene=21#wechat_redirect)", Pei Wang (王培), 《返朴》, 2020-06-02

[7] [Non-Axiomatic Reasoning System: Exploring the Essence of Intelligence](https://cis.temple.edu/~wangp/Publication/thesis.pdf), Pei Wang, Ph.D. Dissertation, Indiana University, 1995

[8] "[图灵测试是人工智能的标准吗? (Is the Turing test a good benchmark of artificial intelligence?)](http://mp.weixin.qq.com/s?__biz=MzAwMzc2MTA4Ng==&mid=2247488581&idx=1&sn=ba0a479ba9b926e1ee543810c8e3d61e&chksm=9b376294ac40eb822f5ef7bf8ce5f11beb4b429f6d10f91c21bacdfaee73cdac0965c1692ca7&scene=21#wechat_redirect)", Pei Wang (王培), 《赛先生》, 2016-05-23