Evaluating Understanding on Conceptual Abstraction Benchmarks
- URL: http://arxiv.org/abs/2206.14187v1
- Date: Tue, 28 Jun 2022 17:52:46 GMT
- Title: Evaluating Understanding on Conceptual Abstraction Benchmarks
- Authors: Victor Vikram Odouard and Melanie Mitchell
- Abstract summary: A long-held objective in AI is to build systems that understand concepts in a humanlike way.
We argue that understanding a concept requires the ability to use it in varied contexts.
Our concept-based approach to evaluation reveals information about AI systems that conventional test sets would have left hidden.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A long-held objective in AI is to build systems that understand concepts in a
humanlike way. Setting aside the difficulty of building such a system, even
trying to evaluate one is a challenge, due to present-day AI's relative opacity
and its proclivity for finding shortcut solutions. This is exacerbated by
humans' tendency to anthropomorphize, assuming that a system that can recognize
one instance of a concept must also understand other instances, as a human
would. In this paper, we argue that understanding a concept requires the
ability to use it in varied contexts. Accordingly, we propose systematic
evaluations centered around concepts, by probing a system's ability to use a
given concept in many different instantiations. We present case studies of such
evaluations in two domains -- RAVEN (inspired by Raven's Progressive
Matrices) and the Abstraction and Reasoning Corpus (ARC) -- that have been used
to develop and assess abstraction abilities in AI systems. Our concept-based
approach to evaluation reveals information about AI systems that conventional
test sets would have left hidden.
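The evaluation strategy described in the abstract lends itself to a simple per-concept scoring loop: group test items by the concept they instantiate and report accuracy for each concept across its variations, rather than only in aggregate. The sketch below is illustrative only and is not the paper's code; the `solver` callable and the `concept`/`inputs`/`target` task fields are hypothetical names, since RAVEN and ARC each define their own task formats.

```python
from collections import defaultdict

def concept_based_evaluation(solver, tasks):
    """Score a solver per concept rather than only in aggregate.

    `solver` and the task fields (`concept`, `inputs`, `target`) are
    hypothetical names used for illustration; the actual benchmarks
    discussed in the paper (RAVEN, ARC) define their own formats.
    """
    per_concept = defaultdict(list)
    for task in tasks:
        prediction = solver(task["inputs"])  # one attempt on one instantiation of a concept
        per_concept[task["concept"]].append(prediction == task["target"])

    # Accuracy per concept across its varied instantiations.
    return {concept: sum(outcomes) / len(outcomes)
            for concept, outcomes in per_concept.items()}

# Aggregate accuracy can hide concept-level failures: a solver that
# handles every "same/different" instantiation but fails every
# "counting" instantiation may still look competent on a mixed test set.
```

Reporting the per-concept breakdown, rather than a single score, is what lets this kind of evaluation surface shortcut solutions that a conventional test set would leave hidden.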
Related papers
- Imagining and building wise machines: The centrality of AI metacognition [78.76893632793497]
We argue that shortcomings stem from one overarching failure: AI systems lack wisdom.
While AI research has focused on task-level strategies, metacognition is underdeveloped in AI systems.
We propose that integrating metacognitive capabilities into AI systems is crucial for enhancing their robustness, explainability, cooperation, and safety.
arXiv Detail & Related papers (2024-11-04T18:10:10Z)
- Combining AI Control Systems and Human Decision Support via Robustness and Criticality [53.10194953873209]
We extend a methodology for adversarial explanations (AE) to state-of-the-art reinforcement learning frameworks.
We show that the learned AI control system demonstrates robustness against adversarial tampering.
In a training / learning framework, this technology can improve both the AI's decisions and explanations through human interaction.
arXiv Detail & Related papers (2024-07-03T15:38:57Z)
- Concept Induction using LLMs: a user experiment for assessment [1.1982127665424676]
This study explores the potential of a Large Language Model (LLM) to generate high-level concepts that are meaningful as explanations for humans.
We compare the concepts generated by the LLM with two other methods: concepts generated by humans and the ECII concept induction system.
Our findings indicate that while human-generated explanations remain superior, concepts derived from GPT-4 are more comprehensible to humans compared to those generated by ECII.
arXiv Detail & Related papers (2024-04-18T03:22:02Z)
- The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain [0.0]
We describe an in-depth evaluation benchmark for the Abstraction and Reasoning Corpus (ARC).
In particular, we describe ConceptARC, a new, publicly available benchmark in the ARC domain.
We report results on testing humans on this benchmark as well as three machine solvers.
arXiv Detail & Related papers (2023-05-11T21:06:39Z)
- Towards Human Cognition Level-based Experiment Design for Counterfactual Explanations (XAI) [68.8204255655161]
The emphasis of XAI research appears to have turned to a more pragmatic explanation approach for better understanding.
An extensive area where cognitive science research may substantially influence XAI advancements is evaluating user knowledge and feedback.
We propose a framework to experiment with generating and evaluating the explanations on the grounds of different cognitive levels of understanding.
arXiv Detail & Related papers (2022-10-31T19:20:22Z)
- A Human-Centric Assessment Framework for AI [11.065260433086024]
There is no agreed standard on how explainable AI systems should be assessed.
Inspired by the Turing test, we introduce a human-centric assessment framework.
This setup can serve as a framework for a wide range of human-centric AI system assessments.
arXiv Detail & Related papers (2022-05-25T12:59:13Z)
- Abstraction and Analogy-Making in Artificial Intelligence [0.0]
No current AI system is anywhere close to having the capability to form humanlike abstractions or analogies.
This paper reviews the advantages and limitations of several approaches toward this goal, including symbolic methods, deep learning, and probabilistic program induction.
arXiv Detail & Related papers (2021-02-22T00:12:48Z)
- Thinking Fast and Slow in AI [38.8581204791644]
This paper proposes a research direction to advance AI which draws inspiration from cognitive theories of human decision making.
The premise is that if we gain insights about the causes of some human capabilities that are still lacking in AI, we may obtain similar capabilities in an AI system.
arXiv Detail & Related papers (2020-10-12T20:10:05Z)
- Bongard-LOGO: A New Benchmark for Human-Level Concept Learning and Reasoning [78.13740873213223]
Bongard problems (BPs) were introduced as an inspirational challenge for visual cognition in intelligent systems.
We propose a new benchmark Bongard-LOGO for human-level concept learning and reasoning.
arXiv Detail & Related papers (2020-10-02T03:19:46Z)
- Machine Common Sense [77.34726150561087]
Machine common sense remains a broad, potentially unbounded problem in artificial intelligence (AI).
This article deals with aspects of modeling commonsense reasoning, focusing on the domain of interpersonal interactions.
arXiv Detail & Related papers (2020-06-15T13:59:47Z)
- A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental-model" of any AI system so that the interaction with the user can provide information on demand and be closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)