Sparks of Artificial General Intelligence: Early experiments with GPT-4
- URL: http://arxiv.org/abs/2303.12712v5
- Date: Thu, 13 Apr 2023 20:41:31 GMT
- Title: Sparks of Artificial General Intelligence: Early experiments with GPT-4
- Authors: S\'ebastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes
Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott
Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang
- Abstract summary: GPT-4, developed by OpenAI, was trained using an unprecedented scale of compute and data.
We demonstrate that GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more.
We believe GPT-4 could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system.
- Score: 66.1188263570629
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial intelligence (AI) researchers have been developing and refining
large language models (LLMs) that exhibit remarkable capabilities across a
variety of domains and tasks, challenging our understanding of learning and
cognition. The latest model developed by OpenAI, GPT-4, was trained using an
unprecedented scale of compute and data. In this paper, we report on our
investigation of an early version of GPT-4, when it was still in active
development by OpenAI. We contend that (this early version of) GPT-4 is part of
a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that
exhibit more general intelligence than previous AI models. We discuss the
rising capabilities and implications of these models. We demonstrate that,
beyond its mastery of language, GPT-4 can solve novel and difficult tasks that
span mathematics, coding, vision, medicine, law, psychology and more, without
needing any special prompting. Moreover, in all of these tasks, GPT-4's
performance is strikingly close to human-level performance, and often vastly
surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4's
capabilities, we believe that it could reasonably be viewed as an early (yet
still incomplete) version of an artificial general intelligence (AGI) system.
In our exploration of GPT-4, we put special emphasis on discovering its
limitations, and we discuss the challenges ahead for advancing towards deeper
and more comprehensive versions of AGI, including the possible need for
pursuing a new paradigm that moves beyond next-word prediction. We conclude
with reflections on societal influences of the recent technological leap and
future research directions.
Related papers
- Evaluating Large Language Models on the GMAT: Implications for the
Future of Business Education [0.13654846342364302]
This study introduces the first benchmark to assess the performance of seven major Large Language Models (LLMs)
Our analysis shows that most LLMs outperform human candidates, with GPT-4 Turbo not only outperforming the other models but also surpassing the average scores of graduate students at top business schools.
While AI's promise in education, assessment, and tutoring is clear, challenges remain.
arXiv Detail & Related papers (2024-01-02T03:54:50Z) - GPT4Vis: What Can GPT-4 Do for Zero-shot Visual Recognition? [82.40761196684524]
This paper centers on the evaluation of GPT-4's linguistic and visual capabilities in zero-shot visual recognition tasks.
We conduct extensive experiments to evaluate GPT-4's performance across images, videos, and point clouds.
Our findings show that GPT-4, enhanced with rich linguistic descriptions, significantly improves zero-shot recognition.
arXiv Detail & Related papers (2023-11-27T11:29:10Z) - MathVista: Evaluating Mathematical Reasoning of Foundation Models in
Visual Contexts [170.01089233942594]
MathVista is a benchmark designed to combine challenges from diverse mathematical and visual tasks.
The best-performing GPT-4V model achieves an overall accuracy of 49.9%, substantially outperforming Bard, the second-best performer, by 15.1%.
GPT-4V still falls short of human performance by 10.4%, as it often struggles to understand complex figures and perform rigorous reasoning.
arXiv Detail & Related papers (2023-10-03T17:57:24Z) - The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) [121.42924593374127]
We analyze the latest model, GPT-4V, to deepen the understanding of LMMs.
GPT-4V's unprecedented ability in processing arbitrarily interleaved multimodal inputs makes it a powerful multimodal generalist system.
GPT-4V's unique capability of understanding visual markers drawn on input images can give rise to new human-computer interaction methods.
arXiv Detail & Related papers (2023-09-29T17:34:51Z) - Generative AI in Mafia-like Game Simulation [2.44755919161855]
The study aimed to showcase the model's potential in understanding, decision-making, and interaction during game scenarios.
The findings suggest that while GPT-4 exhibits promising advancements over earlier models, there remains potential for further development.
arXiv Detail & Related papers (2023-09-20T22:38:34Z) - Gpt-4: A Review on Advancements and Opportunities in Natural Language
Processing [0.0]
Generative Pre-trained Transformer 4 (GPT-4) is the fourth-generation language model in the GPT series, developed by OpenAI.
GPT-4 has a larger model size (more than one trillion), better multilingual capabilities, improved contextual understanding, and reasoning capabilities than GPT-3.
Some of the potential applications of GPT-4 include chatbots, personal assistants, language translation, text summarization, and question-answering.
arXiv Detail & Related papers (2023-05-04T22:46:43Z) - One Small Step for Generative AI, One Giant Leap for AGI: A Complete
Survey on ChatGPT in AIGC Era [95.2284704286191]
GPT-4 (a.k.a. ChatGPT plus) is one small step for generative AI (GAI) but one giant leap for artificial general intelligence (AGI)
Since its official release in November 2022, ChatGPT has quickly attracted numerous users with extensive media coverage.
This work is the first to survey ChatGPT with a comprehensive review of its underlying technology, applications, and challenges.
arXiv Detail & Related papers (2023-04-04T06:22:09Z) - A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to
GPT-5 All You Need? [112.12974778019304]
generative AI (AIGC, a.k.a AI-generated content) has made headlines everywhere because of its ability to analyze and create text, images, and beyond.
In the era of AI transitioning from pure analysis to creation, it is worth noting that ChatGPT, with its most recent language model GPT-4, is just a tool out of numerous AIGC tasks.
This work focuses on the technological development of various AIGC tasks based on their output type, including text, images, videos, 3D content, etc.
arXiv Detail & Related papers (2023-03-21T10:09:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.