Related papers: Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA

Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA

URL: http://arxiv.org/abs/2410.06524v1
Date: Wed, 9 Oct 2024 03:53:26 GMT
Title: Do great minds think alike? Investigating Human-AI Complementarity in Question Answering with CAIMIRA
Authors: Maharshi Gor, Hal Daumé III, Tianyi Zhou, Jordan Boyd-Graber,
Abstract summary: Humans outperform AI systems in knowledge-grounded abductive and conceptual reasoning. State-of-the-art LLMs like GPT-4 and LLaMA show superior performance on targeted information retrieval.
Score: 43.116608441891096
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advancements of large language models (LLMs) have led to claims of AI surpassing humans in natural language processing (NLP) tasks such as textual understanding and reasoning. This work investigates these assertions by introducing CAIMIRA, a novel framework rooted in item response theory (IRT) that enables quantitative assessment and comparison of problem-solving abilities of question-answering (QA) agents: humans and AI systems. Through analysis of over 300,000 responses from ~70 AI systems and 155 humans across thousands of quiz questions, CAIMIRA uncovers distinct proficiency patterns in knowledge domains and reasoning skills. Humans outperform AI systems in knowledge-grounded abductive and conceptual reasoning, while state-of-the-art LLMs like GPT-4 and LLaMA show superior performance on targeted information retrieval and fact-based reasoning, particularly when information gaps are well-defined and addressable through pattern matching or data retrieval. These findings highlight the need for future QA tasks to focus on questions that challenge not only higher-order reasoning and scientific thinking, but also demand nuanced linguistic interpretation and cross-contextual knowledge application, helping advance AI developments that better emulate or complement human cognitive abilities in real-world problem-solving.

Related papers

Beyond Statistical Learning: Exact Learning Is Essential for General Intelligence [59.07578850674114]
Sound deductive reasoning is an indisputably desirable aspect of general intelligence.<n>It is well-documented that even the most advanced frontier systems regularly and consistently falter on easily-solvable reasoning tasks.<n>We argue that their unsound behavior is a consequence of the statistical learning approach powering their development.
arXiv Detail & Related papers (2025-06-30T14:37:50Z)
Imagining and building wise machines: The centrality of AI metacognition [78.76893632793497]
We argue that shortcomings stem from one overarching failure: AI systems lack wisdom. While AI research has focused on task-level strategies, metacognition is underdeveloped in AI systems. We propose that integrating metacognitive capabilities into AI systems is crucial for enhancing their robustness, explainability, cooperation, and safety.
arXiv Detail & Related papers (2024-11-04T18:10:10Z)
Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents [55.63497537202751]
Article explores the convergence of connectionist and symbolic artificial intelligence (AI) Traditionally, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic. Recent advancements in large language models (LLMs) highlight the potential of connectionist architectures in handling human language as a form of symbols.
arXiv Detail & Related papers (2024-07-11T14:00:53Z)
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI [73.75520820608232]
We introduce OlympicArena, which includes 11,163 bilingual problems across both text-only and interleaved text-image modalities. These challenges encompass a wide range of disciplines spanning seven fields and 62 international Olympic competitions, rigorously examined for data leakage. Our evaluations reveal that even advanced models like GPT-4o only achieve a 39.97% overall accuracy, illustrating current AI limitations in complex reasoning and multimodal integration.
arXiv Detail & Related papers (2024-06-18T16:20:53Z)
Cognition is All You Need -- The Next Layer of AI Above Large Language Models [0.0]
We present Cognitive AI, a framework for neurosymbolic cognition outside of large language models. We propose that Cognitive AI is a necessary precursor for the evolution of the forms of AI, such as AGI, and specifically claim that AGI cannot be achieved by probabilistic approaches on their own. We conclude with a discussion of the implications for large language models, adoption cycles in AI, and commercial Cognitive AI development.
arXiv Detail & Related papers (2024-03-04T16:11:57Z)
Enabling High-Level Machine Reasoning with Cognitive Neuro-Symbolic Systems [67.01132165581667]
We propose to enable high-level reasoning in AI systems by integrating cognitive architectures with external neuro-symbolic components. We illustrate a hybrid framework centered on ACT-R and we discuss the role of generative models in recent and future applications.
arXiv Detail & Related papers (2023-11-13T21:20:17Z)
MAILS -- Meta AI Literacy Scale: Development and Testing of an AI Literacy Questionnaire Based on Well-Founded Competency Models and Psychological Change- and Meta-Competencies [6.368014180870025]
The questionnaire should be modular (i.e., including different facets that can be used independently of each other) to be flexibly applicable in professional life. We derived 60 items to represent different facets of AI Literacy according to Ng and colleagues conceptualisation of AI literacy. Additional 12 items to represent psychological competencies such as problem solving, learning, and emotion regulation in regard to AI.
arXiv Detail & Related papers (2023-02-18T12:35:55Z)
The Role of AI in Drug Discovery: Challenges, Opportunities, and Strategies [97.5153823429076]
The benefits, challenges and drawbacks of AI in this field are reviewed. The use of data augmentation, explainable AI, and the integration of AI with traditional experimental methods are also discussed.
arXiv Detail & Related papers (2022-12-08T23:23:39Z)
Deep Algorithmic Question Answering: Towards a Compositionally Hybrid AI for Algorithmic Reasoning [0.0]
We argue that the challenge of algorithmic reasoning in question answering can be effectively tackled with a "systems" approach to AI. We propose an approach to algorithm reasoning for QA, Deep Algorithmic Question Answering.
arXiv Detail & Related papers (2021-09-16T14:28:18Z)
Explainable Artificial Intelligence Approaches: A Survey [0.22940141855172028]
Lack of explainability of a decision from an Artificial Intelligence based "black box" system/model is a key stumbling block for adopting AI in high stakes applications. We demonstrate popular Explainable Artificial Intelligence (XAI) methods with a mutual case study/task. We analyze for competitive advantages from multiple perspectives. We recommend paths towards responsible or human-centered AI using XAI as a medium.
arXiv Detail & Related papers (2021-01-23T06:15:34Z)
Problems in AI research and how the SP System may help to solve them [0.0]
This paper describes problems in AI research and how the SP System may help to solve them. Most of the problems are described by leading researchers in AI in interviews with science writer Martin Ford.
arXiv Detail & Related papers (2020-09-02T11:33:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.