Brain in a Vat: On Missing Pieces Towards Artificial General
Intelligence in Large Language Models
- URL: http://arxiv.org/abs/2307.03762v1
- Date: Fri, 7 Jul 2023 13:58:16 GMT
- Title: Brain in a Vat: On Missing Pieces Towards Artificial General
Intelligence in Large Language Models
- Authors: Yuxi Ma, Chi Zhang, Song-Chun Zhu
- Abstract summary: We propose four characteristics of generally intelligent agents.
We argue that active engagement with objects in the real world delivers more robust signals for forming conceptual representations.
We conclude by outlining promising future research directions in the field of artificial general intelligence.
- Score: 83.63242931107638
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this perspective paper, we first comprehensively review existing
evaluations of Large Language Models (LLMs) using both standardized tests and
ability-oriented benchmarks. We pinpoint several problems with current
evaluation methods that tend to overstate the capabilities of LLMs. We then
articulate what artificial general intelligence should encompass beyond the
capabilities of LLMs. We propose four characteristics of generally intelligent
agents: 1) they can perform unlimited tasks; 2) they can generate new tasks
within a context; 3) they operate based on a value system that underpins task
generation; and 4) they have a world model reflecting reality, which shapes
their interaction with the world. Building on this viewpoint, we highlight the
missing pieces in artificial general intelligence, that is, the unity of
knowing and acting. We argue that active engagement with objects in the real
world delivers more robust signals for forming conceptual representations.
Additionally, knowledge acquisition isn't solely reliant on passive input but
requires repeated trials and errors. We conclude by outlining promising future
research directions in the field of artificial general intelligence.
Related papers
- Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning [53.45295657891099]
This paper proposes Visual-O1, a multi-modal multi-turn chain-of-thought reasoning framework.
It simulates human multi-modal multi-turn reasoning, providing instantial experience for highly intelligent models.
Our work highlights the potential of artificial intelligence to work like humans in real-world scenarios with uncertainty and ambiguity.
arXiv Detail & Related papers (2024-10-04T11:18:41Z) - How to Measure the Intelligence of Large Language Models? [0.24578723416255752]
We argue that the intelligence of language models should not only be assessed by task-specific statistical metrics.
We show that the choice of metrics has already been shown to dramatically influence assessments on potential intelligence emergence.
arXiv Detail & Related papers (2024-07-30T13:53:48Z) - Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI [129.08019405056262]
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial Intelligence (AGI)
MLMs andWMs have attracted significant attention due to their remarkable perception, interaction, and reasoning capabilities.
In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI.
arXiv Detail & Related papers (2024-07-09T14:14:47Z) - WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence.
WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z) - Can large language models understand uncommon meanings of common words? [30.527834781076546]
Large language models (LLMs) have shown significant advancements across diverse natural language understanding (NLU) tasks.
Yet, lacking widely acknowledged testing mechanisms, answering whether LLMs are parrots or genuinely comprehend the world' remains unclear.
This paper presents innovative construction of a Lexical Semantic dataset with novel evaluation metrics.
arXiv Detail & Related papers (2024-05-09T12:58:22Z) - A Survey on Robotics with Foundation Models: toward Embodied AI [30.999414445286757]
Recent advances in computer vision, natural language processing, and multi-modality learning have shown that the foundation models have superhuman capabilities for specific tasks.
This survey aims to provide a comprehensive and up-to-date overview of foundation models in robotics, focusing on autonomous manipulation and encompassing high-level planning and low-level control.
arXiv Detail & Related papers (2024-02-04T07:55:01Z) - MacGyver: Are Large Language Models Creative Problem Solvers? [87.70522322728581]
We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting.
We create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems.
We present our collection to both LLMs and humans to compare and contrast their problem-solving abilities.
arXiv Detail & Related papers (2023-11-16T08:52:27Z) - A Sentence is Worth a Thousand Pictures: Can Large Language Models Understand Hum4n L4ngu4ge and the W0rld behind W0rds? [2.7342737448775534]
Large Language Models (LLMs) have been linked to claims about human-like linguistic performance.
We analyze the contribution of LLMs as theoretically informative representations of a target cognitive system.
We evaluate the models' ability to see the bigger picture, through top-down feedback from higher levels of processing.
arXiv Detail & Related papers (2023-07-26T18:58:53Z) - WenLan 2.0: Make AI Imagine via a Multimodal Foundation Model [74.4875156387271]
We develop a novel foundation model pre-trained with huge multimodal (visual and textual) data.
We show that state-of-the-art results can be obtained on a wide range of downstream tasks.
arXiv Detail & Related papers (2021-10-27T12:25:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.