Does Faithfulness Conflict with Plausibility? An Empirical Study in Explainable AI across NLP Tasks
- URL: http://arxiv.org/abs/2404.00140v1
- Date: Fri, 29 Mar 2024 20:28:42 GMT
- Title: Does Faithfulness Conflict with Plausibility? An Empirical Study in Explainable AI across NLP Tasks
- Authors: Xiaolei Lu, Jianghong Ma
- Abstract summary: We show that the traditional perturbation-based methods Shapley value and LIME can attain greater faithfulness and plausibility.
Our findings suggest that rather than optimizing for one dimension at the expense of the other, we could seek to optimize explainability algorithms with dual objectives.
- Score: 9.979726030996051
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explainability algorithms aimed at interpreting decision-making AI systems usually balance two critical dimensions: 1) faithfulness, where explanations accurately reflect the model's inference process, and 2) plausibility, where explanations are consistent with the interpretations of domain experts. However, the question arises: do faithfulness and plausibility inherently conflict? In this study, through a comprehensive quantitative comparison between the explanations from the selected explainability methods and expert-level interpretations across three NLP tasks (sentiment analysis, intent detection, and topic labeling), we demonstrate that the traditional perturbation-based methods Shapley value and LIME can attain greater faithfulness and plausibility. Our findings suggest that rather than optimizing for one dimension at the expense of the other, we could seek to optimize explainability algorithms with dual objectives to achieve high levels of accuracy and user accessibility in their explanations.
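The abstract compares perturbation-based attributions (Shapley value, LIME) with expert-level interpretations along both dimensions. As a rough illustration of what such a comparison can look like in practice, here is a minimal sketch: it explains a toy scikit-learn sentiment classifier with LIME, scores plausibility as token overlap with a hypothetical expert rationale, and approximates faithfulness with a deletion-based probability drop. The toy corpus, the rationale, and the specific metrics are illustrative assumptions, not the paper's actual experimental protocol.

```python
# Minimal sketch (not the paper's exact protocol): a toy text classifier is
# explained with LIME, then scored for plausibility (overlap with a
# hypothetical expert rationale) and for a faithfulness proxy (probability
# drop after deleting the top-attributed tokens). Dataset, rationale, and
# metric choices here are illustrative assumptions only.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer

# Tiny toy sentiment corpus (illustrative only).
texts = ["the movie was wonderful and moving",
         "a dull boring and painful film",
         "great acting and a wonderful script",
         "boring plot and painful dialogue"]
labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
instance = "the acting was wonderful but the plot felt boring"
exp = explainer.explain_instance(instance, model.predict_proba, num_features=4)
attribution = exp.as_list()  # [(token, weight), ...]
top_tokens = {tok for tok, _ in attribution}

# Plausibility: overlap (IoU) between LIME's top tokens and a hypothetical
# expert rationale for the same instance.
expert_rationale = {"wonderful", "boring"}
plausibility = len(top_tokens & expert_rationale) / len(top_tokens | expert_rationale)

# Faithfulness proxy (comprehensiveness-style): how much the predicted
# probability drops when the top-attributed tokens are deleted.
pred_class = int(np.argmax(model.predict_proba([instance])[0]))
ablated = " ".join(w for w in instance.split() if w not in top_tokens)
drop = (model.predict_proba([instance])[0][pred_class]
        - model.predict_proba([ablated])[0][pred_class])

print(f"plausibility (IoU with expert rationale): {plausibility:.2f}")
print(f"faithfulness proxy (probability drop):    {drop:.2f}")
```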
Related papers
- Independence Constrained Disentangled Representation Learning from Epistemological Perspective [13.51102815877287]
Disentangled Representation Learning aims to improve the explainability of deep learning methods by training a data encoder that identifies semantically meaningful latent variables in the data generation process.
There is no consensus regarding the objective of disentangled representation learning.
We propose a novel method for disentangled representation learning by employing an integration of mutual information constraint and independence constraint.
arXiv Detail & Related papers (2024-09-04T13:00:59Z)
- Can you trust your explanations? A robustness test for feature attribution methods [42.36530107262305]
The field of Explainable AI (XAI) has seen a rapid growth but the usage of its techniques has at times led to unexpected results.
We show how leveraging the manifold hypothesis and ensemble approaches can support an in-depth analysis of robustness.
arXiv Detail & Related papers (2024-06-20T14:17:57Z)
- Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs [58.620269228776294]
We propose a task-agnostic framework for resolving ambiguity by asking users clarifying questions.
We evaluate systems across three NLP applications: question answering, machine translation and natural language inference.
We find that intent-sim is robust, demonstrating improvements across a wide range of NLP tasks and LMs.
arXiv Detail & Related papers (2023-11-16T00:18:50Z)
- From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning [66.98861219674039]
Heuristic-Analytic Reasoning (HAR) strategies drastically improve the coherence of rationalizations for model decisions.
Our findings suggest that human-like reasoning strategies can effectively improve the coherence and reliability of PLM reasoning.
arXiv Detail & Related papers (2023-10-24T19:46:04Z)
- Explaining Explainability: Towards Deeper Actionable Insights into Deep Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
arXiv Detail & Related papers (2023-06-14T23:24:01Z)
- Beyond Model Interpretability: On the Faithfulness and Adversarial Robustness of Contrastive Textual Explanations [2.543865489517869]
This work motivates textual counterfactuals by laying the groundwork for a novel evaluation scheme inspired by the faithfulness of explanations.
Experiments on sentiment analysis data show that the connectedness of counterfactuals to their original counterparts is not obvious in either model.
arXiv Detail & Related papers (2022-10-17T09:50:02Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- The Unreliability of Explanations in Few-Shot In-Context Learning [50.77996380021221]
We focus on two NLP tasks that involve reasoning over text, namely question answering and natural language inference.
We show that explanations judged as good by humans (those that are logically consistent with the input) usually indicate more accurate predictions.
We present a framework for calibrating model predictions based on the reliability of the explanations.
arXiv Detail & Related papers (2022-05-06T17:57:58Z)
- AR-LSAT: Investigating Analytical Reasoning of Text [57.1542673852013]
We study the challenge of analytical reasoning of text and introduce a new dataset consisting of questions from the Law School Admission Test from 1991 to 2016.
We analyze the knowledge understanding and reasoning abilities required to do well on this task.
arXiv Detail & Related papers (2021-04-14T02:53:32Z)
- Explaining Black-Box Algorithms Using Probabilistic Contrastive Counterfactuals [7.727206277914709]
We propose a principled causality-based approach for explaining black-box decision-making systems.
We show how such counterfactuals can quantify the direct and indirect influences of a variable on decisions made by an algorithm.
We show how such counterfactuals can provide actionable recourse for individuals negatively affected by the algorithm's decision.
arXiv Detail & Related papers (2021-03-22T16:20:21Z)