Premise-based Multimodal Reasoning: A Human-like Cognitive Process
- URL: http://arxiv.org/abs/2105.07122v1
- Date: Sat, 15 May 2021 03:25:42 GMT
- Title: Premise-based Multimodal Reasoning: A Human-like Cognitive Process
- Authors: Qingxiu Dong, Ziwei Qin, Heming Xia, Tian Feng, Shoujie Tong, Haoran
Meng, Lin Xu, Tianyu Liu, Zhifang Sui, Weidong Zhan, Sujian Li and Zhongyu
Wei
- Abstract summary: "Premise-based Multimodal Reasoning" (PMR) requires participating models to reason after establishing a profound understanding of background information.
We believe that the proposed PMR would contribute to and shed light on human-like in-depth reasoning.
- Score: 28.38581274528838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reasoning is one of the major challenges of human-like AI and has
recently attracted intensive attention from natural language processing (NLP)
researchers. However, cross-modal reasoning needs further research. For
cross-modal reasoning, we observe that most methods fall into shallow feature
matching without in-depth human-like reasoning. This is because existing
cross-modal tasks pose questions directly about an image. However, human
reasoning in real scenes often takes place under specific background
information, a process studied by the ABC theory in social psychology. We
propose a shared
task named "Premise-based Multimodal Reasoning" (PMR), which requires
participating models to reason after establishing a profound understanding of
background information. We believe that the proposed PMR would contribute to
and shed light on human-like in-depth reasoning.
Related papers
- A Survey on Latent Reasoning [100.54120559169735]
Large Language Models (LLMs) have demonstrated impressive reasoning capabilities.
Chain-of-thought (CoT) reasoning, which verbalizes intermediate steps, limits the model's expressive bandwidth.
Latent reasoning tackles this bottleneck by performing multi-step inference entirely in the model's continuous hidden state.
arXiv Detail & Related papers (2025-07-08T17:29:07Z) - Human-Aligned Bench: Fine-Grained Assessment of Reasoning Ability in MLLMs vs. Humans [9.315735862658244]
We propose Human-Aligned Bench, a benchmark for aligning multimodal reasoning with human performance.
We collect 9,794 multimodal questions that rely solely on contextual reasoning, including bilingual (Chinese and English) multimodal questions and pure text-based questions.
Extensive experiments on Human-Aligned Bench reveal notable gaps between the multimodal reasoning performance of current MLLMs and that of humans.
arXiv Detail & Related papers (2025-05-16T11:41:19Z) - DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs [54.4857963044859]
We propose DialogueReason, a reasoning paradigm that uncovers the lost roles in monologue-style reasoning models.
Our work consists of an analysis of monologue reasoning patterns and the development of a dialogue-based reasoning approach.
arXiv Detail & Related papers (2025-05-11T16:39:58Z) - The Essence of Contextual Understanding in Theory of Mind: A Study on Question Answering with Story Characters [67.61587661660852]
Theory-of-Mind (ToM) allows humans to understand and interpret the mental states of others.
In this paper, we verify the importance of comprehensive contextual understanding of personal backgrounds in ToM.
We introduce CharToM benchmark, comprising 1,035 ToM questions based on characters from classic novels.
arXiv Detail & Related papers (2025-01-03T09:04:45Z) - COLD: Causal reasOning in cLosed Daily activities [7.782872276680731]
We propose the COLD (Causal reasOning in cLosed Daily activities) framework.
It is built upon human understanding of daily real-world activities to reason about the causal nature of events.
We show that the proposed framework facilitates the creation of an enormous number of causal queries.
arXiv Detail & Related papers (2024-11-29T06:37:13Z) - Through the Theory of Mind's Eye: Reading Minds with Multimodal Video Large Language Models [52.894048516550065]
We develop a pipeline for multimodal ToM reasoning using video and text.
We also enable explicit ToM reasoning by retrieving key frames for answering a ToM question.
arXiv Detail & Related papers (2024-06-19T18:24:31Z) - Quriosity: Analyzing Human Questioning Behavior and Causal Inquiry through Curiosity-Driven Queries [91.70689724416698]
We present Quriosity, a collection of 13.5K naturally occurring questions from three diverse sources.
Our analysis reveals a significant presence of causal questions (up to 42%) in the dataset.
arXiv Detail & Related papers (2024-05-30T17:55:28Z) - PHAnToM: Persona-based Prompting Has An Effect on Theory-of-Mind Reasoning in Large Language Models [25.657579792829743]
We empirically evaluate how role-playing prompting influences Theory-of-Mind (ToM) reasoning capabilities.
We propose that, beyond the inherent variance in the complexity of reasoning tasks, performance differences arise from socially motivated prompting differences.
arXiv Detail & Related papers (2024-03-04T17:34:34Z) - Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs [60.244412212130264]
Causal-Consistency Chain-of-Thought harnesses multi-agent collaboration to bolster the faithfulness and causality of foundation models.
Our framework demonstrates significant superiority over state-of-the-art methods through extensive and comprehensive evaluations.
arXiv Detail & Related papers (2023-08-23T04:59:21Z) - Neural Amortized Inference for Nested Multi-agent Reasoning [54.39127942041582]
We propose a novel approach to bridge the gap between human-like inference capabilities and computational limitations.
We evaluate our method in two challenging multi-agent interaction domains.
arXiv Detail & Related papers (2023-08-21T22:40:36Z) - In-Context Analogical Reasoning with Pre-Trained Language Models [10.344428417489237]
We explore the use of intuitive language-based abstractions to support analogy in AI systems.
Specifically, we apply large pre-trained language models (PLMs) to visual Raven's Progressive Matrices (RPM).
We find that PLMs exhibit a striking capacity for zero-shot relational reasoning, exceeding human performance and nearing supervised vision-based methods.
arXiv Detail & Related papers (2023-05-28T04:22:26Z) - ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational
Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge for modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z) - Faithful Reasoning Using Large Language Models [12.132449274592668]
We show how LMs can be made to perform faithful multi-step reasoning via a process whose causal structure mirrors the underlying logical structure of the problem.
Our approach works by chaining together reasoning steps, where each step results from calls to two fine-tuned LMs.
We demonstrate the effectiveness of our model on multi-step logical deduction and scientific question-answering, showing that it outperforms baselines on final answer accuracy.
arXiv Detail & Related papers (2022-08-30T13:44:41Z) - Humans are not Boltzmann Distributions: Challenges and Opportunities for
Modelling Human Feedback and Interaction in Reinforcement Learning [13.64577704565643]
We argue that these models are too simplistic and that RL researchers need to develop more realistic human models to design and evaluate their algorithms.
This paper calls for research from different disciplines to address key questions about how humans provide feedback to AIs and how we can build more robust human-in-the-loop RL systems.
arXiv Detail & Related papers (2022-06-27T13:58:51Z) - Prompting Contrastive Explanations for Commonsense Reasoning Tasks [74.7346558082693]
Large pretrained language models (PLMs) can achieve near-human performance on commonsense reasoning tasks.
We show how to use these same models to generate human-interpretable evidence.
arXiv Detail & Related papers (2021-06-12T17:06:13Z) - Machine Common Sense [77.34726150561087]
Machine common sense remains a broad, potentially unbounded problem in artificial intelligence (AI).
This article addresses aspects of modeling commonsense reasoning, focusing on the domain of interpersonal interactions.
arXiv Detail & Related papers (2020-06-15T13:59:47Z)