REX: Reasoning-aware and Grounded Explanation
- URL: http://arxiv.org/abs/2203.06107v1
- Date: Fri, 11 Mar 2022 17:28:42 GMT
- Title: REX: Reasoning-aware and Grounded Explanation
- Authors: Shi Chen and Qi Zhao
- Abstract summary: We develop a new type of multi-modal explanation that explains the decisions by traversing the reasoning process and grounding keywords in the images.
Second, we identify the critical need to tightly couple important components across the visual and textual modalities for explaining the decisions.
Third, we propose a novel explanation generation method that explicitly models the pairwise correspondence between words and regions of interest.
- Score: 30.392986232906107
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effectiveness and interpretability are two essential properties for
trustworthy AI systems. Most recent studies in visual reasoning are dedicated
to improving the accuracy of predicted answers, and less attention is paid to
explaining the rationales behind the decisions. As a result, they commonly take
advantage of spurious biases instead of actually reasoning on the
visual-textual data, and have yet to develop the capability to explain their
decision making by considering key information from both modalities. This paper
aims to close the gap from three distinct perspectives: first, we define a new
type of multi-modal explanation that explains the decisions by progressively
traversing the reasoning process and grounding keywords in the images. We
develop a functional program to sequentially execute different reasoning steps
and construct a new dataset with 1,040,830 multi-modal explanations. Second, we
identify the critical need to tightly couple important components across the
visual and textual modalities for explaining the decisions, and propose a novel
explanation generation method that explicitly models the pairwise
correspondence between words and regions of interest. It improves the visual
grounding capability by a considerable margin, resulting in enhanced
interpretability and reasoning performance. Finally, with our new data and
method, we perform extensive analyses to study the effectiveness of our
explanation under different settings, including multi-task learning and
transfer learning. Our code and data are available at
https://github.com/szzexpoi/rex.
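To make the idea of pairwise word-region correspondence concrete, here is a minimal sketch of a cross-attention grounding layer. The module name, dimensions, and tensor shapes are illustrative assumptions, not the authors' implementation; the actual REX code is in the repository linked above.

```python
# Illustrative sketch (not the authors' code): modeling pairwise
# correspondence between explanation words and detected image regions
# with a simple cross-attention layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordRegionGrounding(nn.Module):
    def __init__(self, word_dim=300, region_dim=2048, hidden_dim=512):
        super().__init__()
        self.word_proj = nn.Linear(word_dim, hidden_dim)      # project word embeddings
        self.region_proj = nn.Linear(region_dim, hidden_dim)  # project region features

    def forward(self, words, regions):
        # words:   (batch, num_words, word_dim), e.g. explanation tokens
        # regions: (batch, num_regions, region_dim), e.g. detector ROI features
        q = self.word_proj(words)                  # (B, W, H)
        k = self.region_proj(regions)              # (B, R, H)
        scores = torch.bmm(q, k.transpose(1, 2))   # pairwise word-region scores (B, W, R)
        attn = F.softmax(scores / q.size(-1) ** 0.5, dim=-1)
        grounded = torch.bmm(attn, k)              # region-aware word features (B, W, H)
        return attn, grounded

# Usage: the attention map links each generated keyword to regions of
# interest, which is what makes the explanation visually grounded.
model = WordRegionGrounding()
attn, grounded = model(torch.randn(2, 12, 300), torch.randn(2, 36, 2048))
print(attn.shape, grounded.shape)  # torch.Size([2, 12, 36]) torch.Size([2, 12, 512])
```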
Related papers
- Explainability for Machine Learning Models: From Data Adaptability to User Perception [0.8702432681310401]
This thesis explores the generation of local explanations for already deployed machine learning models.
It aims to identify optimal conditions for producing meaningful explanations considering both data and user requirements.
arXiv Detail & Related papers (2024-02-16T18:44:37Z)
- Visual Commonsense based Heterogeneous Graph Contrastive Learning [79.22206720896664]
We propose a heterogeneous graph contrastive learning method to better perform the visual reasoning task.
Our method is designed in a plug-and-play way, so that it can be quickly and easily combined with a wide range of representative methods.
arXiv Detail & Related papers (2023-11-11T12:01:18Z)
- See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning [60.43585179885355]
We propose a novel framework named Interactive Prompting Visual Reasoner (IPVR) for few-shot knowledge-based visual reasoning.
IPVR contains three stages: see, think, and confirm.
We conduct experiments on a range of knowledge-based visual reasoning datasets.
arXiv Detail & Related papers (2023-01-12T18:59:50Z)
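As a rough illustration of the see-think-confirm loop described in the entry above, here is a hedged sketch. Every helper below (see, think, confirm) is a trivial placeholder standing in for a vision model, a language model, and a verification step; none of them is the paper's actual interface.

```python
# Hedged sketch of a see-think-confirm loop; all helpers are placeholders.
def see(image, feedback=None):
    # Placeholder for a vision model producing textual evidence.
    return f"evidence from {image}" + (f" (focus: {feedback})" if feedback else "")

def think(question, evidence):
    # Placeholder for an LLM producing an answer and a rationale.
    return "candidate answer", f"because {evidence} relates to '{question}'"

def confirm(image, rationale):
    # Placeholder for cross-checking the rationale against the image.
    return True, None  # (supported, feedback for the next round)

def answer_with_reasoning(image, question, max_rounds=3):
    feedback = None
    for _ in range(max_rounds):
        evidence = see(image, feedback)                   # see
        answer, rationale = think(question, evidence)     # think
        supported, feedback = confirm(image, rationale)   # confirm
        if supported:
            return answer, rationale
    return answer, rationale  # fall back to the last candidate

print(answer_with_reasoning("kitchen.jpg", "What is the person holding?"))
```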
- Complementary Explanations for Effective In-Context Learning [77.83124315634386]
Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts.
This work aims to better understand the mechanisms by which explanations are used for in-context learning.
arXiv Detail & Related papers (2022-11-25T04:40:47Z)
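The entry above concerns how explanations inside prompts support in-context learning. One common way to assemble such prompts is sketched below; the exemplars and template are invented for illustration and are not taken from the paper.

```python
# Illustrative only: including explanations in few-shot prompts.
def build_prompt(exemplars, query):
    """Each exemplar is a (question, explanation, answer) triple."""
    parts = []
    for question, explanation, answer in exemplars:
        parts.append(f"Q: {question}\nExplanation: {explanation}\nA: {answer}\n")
    parts.append(f"Q: {query}\nExplanation:")
    return "\n".join(parts)

exemplars = [
    ("Is 17 a prime number?",
     "17 has no divisors other than 1 and itself.",
     "Yes"),
    ("Is 21 a prime number?",
     "21 is divisible by 3 and 7.",
     "No"),
]
print(build_prompt(exemplars, "Is 29 a prime number?"))
```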
- Textual Explanations and Critiques in Recommendation Systems [8.406549970145846]
The dissertation focuses on two fundamental challenges of addressing this need.
The first involves explanation generation in a scalable and data-driven manner.
The second challenge consists in making explanations actionable, and we refer to it as critiquing.
arXiv Detail & Related papers (2022-05-15T11:59:23Z)
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often misinterpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
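For the saliency-perception entry above, the following sketch illustrates the general idea of correcting displayed saliency with an inverse perception model. The power-curve perception function is an assumed stand-in for illustration, not the estimate from the paper.

```python
# Rough sketch: if humans perceive displayed saliency s as perceive(s),
# display the value whose perceived magnitude matches the intended one.
def perceive(s, gamma=0.6):
    # Hypothetical perception curve: small saliencies are over-perceived.
    return s ** gamma

def adjust(intended, gamma=0.6):
    # Invert the perception curve so that perceive(adjusted) == intended.
    return intended ** (1.0 / gamma)

saliencies = [0.1, 0.4, 0.9]
adjusted = [adjust(s) for s in saliencies]
print([round(perceive(a), 2) for a in adjusted])  # ~[0.1, 0.4, 0.9]
```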
- A First Look: Towards Explainable TextVQA Models via Visual and Textual Explanations [3.7638008383533856]
We propose MTXNet, an end-to-end trainable multimodal architecture to generate multimodal explanations.
We show that training with multimodal explanations surpasses unimodal baselines by up to 7% in CIDEr scores and 2% in IoU.
We also describe a real-world e-commerce application for using the generated multimodal explanations.
arXiv Detail & Related papers (2021-04-29T00:36:17Z)
- Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z)
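The contrastive-explanation entry above projects model representations to a latent space. A heavily simplified version of that idea is to keep only the component of a representation along the direction separating the predicted label from a foil label; the sketch below does exactly that and is a generic illustration, not the paper's method.

```python
# Simplified contrastive projection: keep only the part of a hidden
# representation that discriminates the fact label from a foil label.
import numpy as np

def contrastive_projection(h, w_fact, w_foil):
    d = w_fact - w_foil          # direction separating fact from foil
    d = d / np.linalg.norm(d)
    return np.dot(h, d) * d      # component of h along that direction

rng = np.random.default_rng(0)
h = rng.normal(size=8)           # hidden representation of an input
W = rng.normal(size=(3, 8))      # rows = linear-classifier weight vectors
fact, foil = 0, 2
h_contrastive = contrastive_projection(h, W[fact], W[foil])

# The contrastive logit gap is preserved by the projected representation.
gap_full = (W[fact] - W[foil]) @ h
gap_proj = (W[fact] - W[foil]) @ h_contrastive
print(round(gap_full, 4), round(gap_proj, 4))  # equal up to floating error
```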
- This is not the Texture you are looking for! Introducing Novel Counterfactual Explanations for Non-Experts using Generative Adversarial Learning [59.17685450892182]
Counterfactual explanation systems try to enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our results show that our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z)
- Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection [21.02924712220406]
We build hierarchical explanations by detecting feature interactions.
Such explanations visualize how words and phrases are combined at different levels of the hierarchy.
Experiments show the effectiveness of the proposed method in providing explanations both faithful to models and interpretable to humans.
arXiv Detail & Related papers (2020-04-04T20:56:37Z)
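The last entry builds hierarchical explanations by detecting feature interactions. The sketch below shows one simple, occlusion-based way to score the interaction between adjacent spans and greedily merge the strongest pair; the scoring rule and toy model are assumptions for illustration, not the paper's detection method.

```python
# Simplified hierarchy construction from occlusion-based interaction scores.
def interaction(predict, tokens, span_a, span_b):
    def drop(spans):
        kept = [t for i, t in enumerate(tokens)
                if not any(lo <= i < hi for lo, hi in spans)]
        return predict(kept)
    full = predict(tokens)
    joint = full - drop([span_a, span_b])
    separate = (full - drop([span_a])) + (full - drop([span_b]))
    return abs(joint - separate)   # large gap => the spans interact

def build_hierarchy(predict, tokens):
    spans = [(i, i + 1) for i in range(len(tokens))]
    merges = []
    while len(spans) > 1:
        # Merge the adjacent pair with the strongest interaction.
        i = max(range(len(spans) - 1),
                key=lambda j: interaction(predict, tokens, spans[j], spans[j + 1]))
        merged = (spans[i][0], spans[i + 1][1])
        merges.append(merged)
        spans[i:i + 2] = [merged]
    return merges

# Toy model: the phrase "not good" flips the score, so those two tokens
# should be merged early in the hierarchy.
def toy_predict(tokens):
    text = " ".join(tokens)
    return -1.0 if "not good" in text else (1.0 if "good" in tokens else 0.0)

print(build_hierarchy(toy_predict, ["the", "movie", "was", "not", "good"]))
```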
This list is automatically generated from the titles and abstracts of the papers on this site.