ECHo: A Visio-Linguistic Dataset for Event Causality Inference via
Human-Centric Reasoning
- URL: http://arxiv.org/abs/2305.14740v2
- Date: Mon, 23 Oct 2023 10:35:30 GMT
- Title: ECHo: A Visio-Linguistic Dataset for Event Causality Inference via
Human-Centric Reasoning
- Authors: Yuxi Xie and Guanzhen Li and Min-Yen Kan
- Abstract summary: ECHo is a dataset of event causality inference grounded in visio-linguistic social scenarios.
We propose a unified Chain-of-Thought (CoT) framework to assess the reasoning capability of current AI systems.
- Score: 22.951360187153156
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce ECHo (Event Causality Inference via Human-Centric Reasoning), a
diagnostic dataset of event causality inference grounded in visio-linguistic
social scenarios. ECHo employs real-world human-centric deductive information
building on a television crime drama. ECHo requires the Theory-of-Mind (ToM)
ability to understand and reason about social interactions based on multimodal
information. Using ECHo, we propose a unified Chain-of-Thought (CoT) framework
to assess the reasoning capability of current AI systems. Our ToM-enhanced CoT
pipeline accommodates various large foundation models in both zero-shot and
few-shot visio-linguistic reasoning. We use this framework to scrutinize recent
large foundation models such as InstructGPT and MiniGPT-4 on three diagnostic
human-centric tasks. Further analysis demonstrates ECHo as a challenging
dataset to expose imperfections and inconsistencies in reasoning. Our data and
code are publicly available at https://github.com/YuxiXie/ECHo.
Related papers
- ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life
Videos [53.92440577914417]
ACQUIRED consists of 3.9K annotated videos, encompassing a wide range of event types and incorporating both first and third-person viewpoints.
Each video is annotated with questions that span three distinct dimensions of reasoning, including physical, social, and temporal.
We benchmark our dataset against several state-of-the-art language-only and multimodal models, and the experimental results demonstrate a significant performance gap.
arXiv Detail & Related papers (2023-11-02T22:17:03Z)
- Harnessing Collective Intelligence Under a Lack of Cultural Consensus [0.1813006808606333]
Cultural Consensus Theory (CCT) provides a statistical framework for detecting and characterizing divergent consensus beliefs.
We extend CCT with a latent construct that maps between pretrained deep neural network embeddings of entities and the consensus beliefs regarding those entities among one or more subsets of respondents.
We find that iDLC-CCT better predicts the degree of consensus, generalizes well to out-of-sample entities, and is effective even with sparse data.
arXiv Detail & Related papers (2023-09-18T14:05:04Z)
- Towards Fair and Explainable AI using a Human-Centered AI Approach [5.888646114353372]
We present 5 research projects that aim to enhance explainability and fairness in classification systems and word embeddings.
The first project explores the utility/downsides of introducing local model explanations as interfaces for machine teachers.
The second project presents D-BIAS, a causality-based human-in-the-loop visual tool for identifying and mitigating social biases in datasets.
The third project presents WordBias, a visual interactive tool that helps audit pre-trained static word embeddings for biases against groups.
The fourth project presents DramatVis Personae, a visual analytics tool that helps identify social
arXiv Detail & Related papers (2023-06-12T21:08:55Z)
- A Study of Situational Reasoning for Traffic Understanding [63.45021731775964]
We devise three novel text-based tasks for situational reasoning in the traffic domain.
We adopt four knowledge-enhanced methods that have shown generalization capability across language reasoning tasks in prior work.
We provide in-depth analyses of model performance on data partitions and examine model predictions categorically.
arXiv Detail & Related papers (2023-06-05T01:01:12Z)
- EnDex: Evaluation of Dialogue Engagingness at Scale [30.15445159524315]
We propose EnDex, the first human-reaction based model to evaluate dialogue engagingness.
We will release the code, an off-the-shelf EnDex model, and a large-scale dataset upon paper publication.
arXiv Detail & Related papers (2022-10-22T06:09:43Z)
- ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational
Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses a great challenge in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z)
- Neural Causal Models for Counterfactual Identification and Estimation [62.30444687707919]
We study the evaluation of counterfactual statements through neural models.
First, we show that neural causal models (NCMs) are expressive enough.
Second, we develop an algorithm for simultaneously identifying and estimating counterfactual distributions.
arXiv Detail & Related papers (2022-09-30T18:29:09Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective
Study [117.08431221482638]
Lack of interpretability, robustness, and out-of-distribution generalization are becoming major challenges for existing visual models.
Inspired by the strong inference ability of human-level agents, recent years have witnessed great effort in developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
- Modeling and Reasoning in Event Calculus using Goal-Directed Constraint
Answer Set Programming [8.677108656718824]
Event Calculus (EC) is a family of formalisms that model commonsense reasoning with a sound, logical basis.
Previous attempts to mechanize reasoning using EC faced difficulties in the treatment of continuous change in dense domains.
We show how EC scenarios can be naturally and directly encoded in s(CASP) and how it enables deductive and abductive reasoning tasks.
arXiv Detail & Related papers (2021-06-28T10:43:25Z)
- A Minimalist Dataset for Systematic Generalization of Perception,
Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts.
In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images.
We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.