The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain
- URL: http://arxiv.org/abs/2305.07141v1
- Date: Thu, 11 May 2023 21:06:39 GMT
- Authors: Arseny Moskvichev, Victor Vikram Odouard, and Melanie Mitchell
- Abstract summary: We describe an in-depth evaluation benchmark for the Abstraction and Reasoning Corpus (ARC)
In particular, we describe ConceptARC, a new, publicly available benchmark in the ARC domain.
We report results on testing humans on this benchmark as well as three machine solvers.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to form and abstract concepts is key to human intelligence, but
such abilities remain lacking in state-of-the-art AI systems. There has been
substantial research on conceptual abstraction in AI, particularly using
idealized domains such as Raven's Progressive Matrices and Bongard problems,
but even when AI systems succeed on such problems, the systems are rarely
evaluated in depth to see if they have actually grasped the concepts they are
meant to capture.
In this paper we describe an in-depth evaluation benchmark for the
Abstraction and Reasoning Corpus (ARC), a collection of few-shot abstraction
and analogy problems developed by Chollet [2019]. In particular, we describe
ConceptARC, a new, publicly available benchmark in the ARC domain that
systematically assesses abstraction and generalization abilities on a number of
basic spatial and semantic concepts. ConceptARC differs from the original ARC
dataset in that it is specifically organized around "concept groups" -- sets of
problems that focus on specific concepts and that vary in complexity and
level of abstraction. We report results on testing humans on this benchmark as
well as three machine solvers: the top two programs from a 2021 ARC competition
and OpenAI's GPT-4. Our results show that humans substantially outperform the
machine solvers on this benchmark, showing abilities to abstract and generalize
concepts that are not yet captured by AI systems. We believe that this
benchmark will spur improvements in the development of AI systems for
conceptual abstraction and in the effective evaluation of such systems.
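To make the task format concrete: each ARC (and ConceptARC) task presents a few input/output grid pairs demonstrating a hidden transformation, and the solver must apply that transformation to a new test input. The sketch below is illustrative only; the helper names and the "reflection" concept-group task are hypothetical, though the grids-of-integers representation matches the published ARC JSON format.

```python
# Minimal sketch of an ARC-style few-shot task and a brute-force solver.
# Grids are lists of rows of small integers (colors); a task supplies a few
# demonstration ("train") pairs plus a held-out test pair.

def solve_task(task, candidate_rules):
    """Return the first rule that reproduces every training pair, or None."""
    for rule in candidate_rules:
        if all(rule(inp) == out for inp, out in task["train"]):
            return rule
    return None

# A toy task from a hypothetical "top-bottom reflection" concept group.
task = {
    "train": [
        ([[1, 0], [0, 0]], [[0, 0], [1, 0]]),
        ([[0, 2], [0, 0]], [[0, 0], [0, 2]]),
    ],
    "test": ([[3, 0], [0, 0]], [[0, 0], [3, 0]]),
}

candidate_rules = [
    lambda g: [row[::-1] for row in g],  # mirror left-right
    lambda g: g[::-1],                   # mirror top-bottom
]

rule = solve_task(task, candidate_rules)
test_in, test_out = task["test"]
assert rule is not None and rule(test_in) == test_out
```

A concept group in ConceptARC's sense would contain many such tasks instantiating the same underlying concept (here, reflection) at different complexities, so that success on the group, rather than on a single task, is the evidence of abstraction.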
Related papers
- A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) refers to the process of improving the visual quality of images while removing degradations such as noise, blur, and weather effects.
Traditional IR methods typically target specific types of degradation, which limits their effectiveness in real-world scenarios with complex distortions.
The all-in-one image restoration (AiOIR) paradigm has emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z)
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
- Contrastive Sparse Autoencoders for Interpreting Planning of Chess-Playing Agents [0.0]
We propose contrastive sparse autoencoders (CSAE) for studying pairs of game trajectories.
Using CSAE, we are able to extract and interpret concepts that are meaningful to the chess-agent plans.
arXiv Detail & Related papers (2024-06-06T12:57:31Z)
- Artificial General Intelligence (AGI)-Native Wireless Systems: A Journey Beyond 6G [58.440115433585824]
Building future wireless systems that support services like digital twins (DTs) is challenging to achieve through advances in conventional technologies like meta-surfaces.
While artificial intelligence (AI)-native networks promise to overcome some limitations of wireless technologies, developments still rely on AI tools like neural networks.
This paper revisits the concept of AI-native wireless systems, equipping them with the common sense necessary to transform them into artificial general intelligence (AGI)-native systems.
arXiv Detail & Related papers (2024-04-29T04:51:05Z)
- Abstract Visual Reasoning Enabled by Language [8.627180519837657]
We propose a general learning-based framework for solving ARC.
It is centered on transforming tasks from the vision to the language domain.
This composition of language and vision allows for pre-trained models to be leveraged at each stage.
arXiv Detail & Related papers (2023-03-07T17:52:46Z)
- Core and Periphery as Closed-System Precepts for Engineering General Intelligence [62.997667081978825]
It is unclear if an AI system's inputs will be independent of its outputs, and, therefore, if AI systems can be treated as traditional components.
This paper posits that engineering general intelligence requires new general systems precepts, termed the core and periphery.
arXiv Detail & Related papers (2022-08-04T18:20:25Z)
- Evaluating Understanding on Conceptual Abstraction Benchmarks [0.0]
A long-held objective in AI is to build systems that understand concepts in a humanlike way.
We argue that understanding a concept requires the ability to use it in varied contexts.
Our concept-based approach to evaluation reveals information about AI systems that conventional test sets would have left hidden.
arXiv Detail & Related papers (2022-06-28T17:52:46Z)
- Conceptual Modeling and Artificial Intelligence: Mutual Benefits from Complementary Worlds [0.0]
We are interested in tackling the intersection of the two thus-far mostly isolated disciplines of CM and AI.
The workshop embraces the assumption that manifold mutual benefits can be realized by i) investigating what Conceptual Modeling (CM) can contribute to AI, and ii) the other way around.
arXiv Detail & Related papers (2021-10-16T18:42:09Z)
- Abstraction and Analogy-Making in Artificial Intelligence [0.0]
No current AI system comes close to the capability of forming humanlike abstractions or analogies.
This paper reviews the advantages and limitations of several approaches toward this goal, including symbolic methods, deep learning, and probabilistic program induction.
arXiv Detail & Related papers (2021-02-22T00:12:48Z)
- Towards an Interface Description Template for AI-enabled Systems [77.34726150561087]
Reuse is a common system architecture approach that seeks to instantiate a system architecture with existing components.
There is currently no framework that guides the selection of the information needed to assess a component's portability to a system other than the one for which it was originally purposed.
We present ongoing work on establishing an interface description template that captures the main information of an AI-enabled component.
arXiv Detail & Related papers (2020-07-13T20:30:26Z)
- A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental-model" of any AI system so that the interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)