Benchmarking Faithfulness: Towards Accurate Natural Language
Explanations in Vision-Language Tasks
- URL: http://arxiv.org/abs/2304.08174v1
- Date: Mon, 3 Apr 2023 08:24:10 GMT
- Title: Benchmarking Faithfulness: Towards Accurate Natural Language
Explanations in Vision-Language Tasks
- Authors: Jakob Ambsdorf
- Abstract summary: Natural language explanations (NLEs) promise to enable the communication of a model's decision-making in an easily intelligible way.
While current models successfully generate convincing explanations, it is an open question how well the NLEs actually represent the reasoning process of the models.
We propose three faithfulness metrics: Attribution-Similarity, NLE-Sufficiency, and NLE-Comprehensiveness.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With deep neural models increasingly permeating our daily lives comes a need
for transparent and comprehensible explanations of their decision-making.
However, most explanation methods that have been developed so far are not
intuitively understandable for lay users. In contrast, natural language
explanations (NLEs) promise to enable the communication of a model's
decision-making in an easily intelligible way. While current models
successfully generate convincing explanations, it is an open question how well
the NLEs actually represent the reasoning process of the models - a property
called faithfulness. Although the development of metrics to measure
faithfulness is crucial to designing more faithful models, current metrics are
either not applicable to NLEs or are not designed to compare different model
architectures across multiple modalities.
Building on prior research on faithfulness measures and based on a detailed
rationale, we address this issue by proposing three faithfulness metrics:
Attribution-Similarity, NLE-Sufficiency, and NLE-Comprehensiveness. The
efficacy of the metrics is evaluated on the VQA-X and e-SNLI-VE datasets of the
e-ViL benchmark for vision-language NLE generation by systematically applying
modifications, for which we expect changes in the measured explanation
faithfulness, to the performant e-UG model. We show on the e-SNLI-VE dataset
that the
removal of redundant inputs to the explanation-generation module of e-UG
successively increases the model's faithfulness on the linguistic modality as
measured by Attribution-Similarity. Further, our analysis demonstrates that
NLE-Sufficiency and -Comprehensiveness are not necessarily correlated to
Attribution-Similarity, and we discuss how the two metrics can be utilized to
gain further insights into the explanation generation process.
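For intuition, the sketch below shows one way metrics of this kind could be operationalized: an attribution-similarity score that correlates input attributions for the answer with those for the generated NLE, plus ERASER-style sufficiency and comprehensiveness scores based on confidence differences. The function names, the use of Pearson correlation, and the confidence-difference formulation are illustrative assumptions, not the paper's exact definitions.

```python
# Minimal sketch (not the paper's exact formulation) of faithfulness-style
# metrics. Assumes attributions have already been computed for both the task
# prediction and the generated NLE, and that model confidences are available
# for the full input, a rationale-only input, and a rationale-removed input,
# as in ERASER-style evaluation.
import numpy as np

def attribution_similarity(task_attr: np.ndarray, nle_attr: np.ndarray) -> float:
    # Pearson correlation between input attributions for the answer and for
    # the NLE; higher similarity suggests the explanation relies on the same
    # evidence as the prediction.
    return float(np.corrcoef(task_attr.ravel(), nle_attr.ravel())[0, 1])

def nle_sufficiency(conf_full: float, conf_rationale_only: float) -> float:
    # Confidence drop when the model sees only the inputs the NLE points to;
    # a small drop means the explanation captures sufficient evidence.
    return conf_full - conf_rationale_only

def nle_comprehensiveness(conf_full: float, conf_rationale_removed: float) -> float:
    # Confidence drop when the inputs the NLE points to are removed;
    # a large drop means the explanation covers the evidence comprehensively.
    return conf_full - conf_rationale_removed
```

In this reading, Attribution-Similarity compares where the task and explanation modules look, while the sufficiency/comprehensiveness pair probes how the prediction reacts to keeping or removing the evidence the explanation names.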
Related papers
- Unconditional Truthfulness: Learning Conditional Dependency for Uncertainty Quantification of Large Language Models [96.43562963756975]
We train a regression model whose target variable is the gap between the conditional and the unconditional generation confidence.
We use this learned conditional dependency model to modulate the uncertainty of the current generation step based on the uncertainty of the previous step.
arXiv Detail & Related papers (2024-08-20T09:42:26Z) - Verbalized Probabilistic Graphical Modeling with Large Language Models [8.961720262676195]
This work introduces a novel Bayesian prompting approach that facilitates training-free Bayesian inference with large language models.
Our results indicate that the model effectively enhances confidence elicitation and text generation quality, demonstrating its potential to improve AI language understanding systems.
arXiv Detail & Related papers (2024-06-08T16:35:31Z) - Improving Open Information Extraction with Large Language Models: A
Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims to extract structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
arXiv Detail & Related papers (2023-09-07T01:35:24Z) - KNOW How to Make Up Your Mind! Adversarially Detecting and Alleviating
Inconsistencies in Natural Language Explanations [52.33256203018764]
We leverage external knowledge bases to significantly improve on an existing adversarial attack for detecting inconsistent NLEs.
We show that models with higher NLE quality do not necessarily generate fewer inconsistencies.
arXiv Detail & Related papers (2023-06-05T15:51:58Z) - Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z) - Faithfulness Tests for Natural Language Explanations [87.01093277918599]
Explanations of neural models aim to reveal a model's decision-making process for its predictions.
Recent work shows that current methods giving explanations such as saliency maps or counterfactuals can be misleading.
This work explores the challenging question of evaluating the faithfulness of natural language explanations.
arXiv Detail & Related papers (2023-05-29T11:40:37Z) - Post Hoc Explanations of Language Models Can Improve Language Models [43.2109029463221]
We present a novel framework, Amplifying Model Performance by Leveraging In-Context Learning with Post Hoc Explanations (AMPLIFY).
We leverage post hoc explanation methods which output attribution scores (explanations) capturing the influence of each of the input features on model predictions.
Our framework, AMPLIFY, leads to prediction accuracy improvements of about 10-25% over a wide range of tasks.
arXiv Detail & Related papers (2023-05-19T04:46:04Z) - Explaining Language Models' Predictions with High-Impact Concepts [11.47612457613113]
We propose a complete framework for extending concept-based interpretability methods to NLP.
We optimize for features whose existence causes the output predictions to change substantially.
Our method achieves superior results on predictive impact, usability, and faithfulness compared to the baselines.
arXiv Detail & Related papers (2023-05-03T14:48:27Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)