Towards Better Model Understanding with Path-Sufficient Explanations
- URL: http://arxiv.org/abs/2109.06181v1
- Date: Mon, 13 Sep 2021 16:06:10 GMT
- Title: Towards Better Model Understanding with Path-Sufficient Explanations
- Authors: Ronny Luss, Amit Dhurandhar
- Abstract summary: Path-Sufficient Explanations Method (PSEM) outputs a sequence of sufficient explanations of strictly decreasing size for a given input.
PSEM can be thought of as tracing the local boundary of the model in a smooth manner, thus providing better intuition about the local model behavior for the specific input.
A user study demonstrates the strength of the method in communicating the local behavior, where (many) users are able to correctly determine the prediction made by a model.
- Score: 11.517059323883444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature based local attribution methods are amongst the most prevalent in
explainable artificial intelligence (XAI) literature. Going beyond standard
correlation, recently, methods have been proposed that highlight what should be
minimally sufficient to justify the classification of an input (viz. pertinent
positives). While minimal sufficiency is an attractive property, the resulting
explanations are often too sparse for a human to understand and evaluate the
local behavior of the model, thus making it difficult to judge its overall
quality. To overcome these limitations, we propose a novel method called
Path-Sufficient Explanations Method (PSEM) that outputs a sequence of
sufficient explanations for a given input of strictly decreasing size (or
value) -- from original input to a minimally sufficient explanation -- which
can be thought to trace the local boundary of the model in a smooth manner,
thus providing better intuition about the local model behavior for the specific
input. We validate these claims, both qualitatively and quantitatively, with
experiments that show the benefit of PSEM across all three modalities (image,
tabular and text). A user study depicts the strength of the method in
communicating the local behavior, where (many) users are able to correctly
determine the prediction made by a model.
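The idea of a path of sufficient explanations can be illustrated with a small sketch. This is not the authors' PSEM algorithm, which the abstract does not spell out; it swaps in a simple greedy feature-removal heuristic on a toy linear classifier. The names `toy_scores`, `masked`, `predict`, and `sufficient_path`, the weights, and the zero baseline for removed features are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def toy_scores(x: Sequence[float]) -> List[float]:
    """Hypothetical two-class linear model: returns [score_class0, score_class1]."""
    w = [2.0, 1.0, -0.5, 0.2]  # illustrative weights, not from the paper
    b = -0.9
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return [-z, z]


def masked(x: Sequence[float], keep: Sequence[int]) -> List[float]:
    """Keep only the features in `keep`; set the rest to a zero baseline (assumption)."""
    kept = set(keep)
    return [v if i in kept else 0.0 for i, v in enumerate(x)]


def predict(scores: Callable[[Sequence[float]], List[float]], x: Sequence[float]) -> int:
    """Index of the highest-scoring class."""
    s = scores(x)
    return max(range(len(s)), key=s.__getitem__)


def sufficient_path(
    scores: Callable[[Sequence[float]], List[float]], x: Sequence[float]
) -> List[List[int]]:
    """Greedy sketch: feature subsets of strictly decreasing size that all
    preserve the model's original prediction, from the full input down to a
    subset from which no single feature can be dropped."""
    target = predict(scores, x)
    keep = list(range(len(x)))
    path = [keep[:]]
    while True:
        best, best_score = None, float("-inf")
        for i in keep:
            trial = [j for j in keep if j != i]
            xm = masked(x, trial)
            # A candidate is valid only if the prediction is unchanged.
            if predict(scores, xm) == target:
                s = scores(xm)[target]
                if s > best_score:
                    best, best_score = trial, s
        if best is None:  # every single removal flips the prediction
            return path
        keep = best
        path.append(keep[:])


x = [1.0, 1.0, 1.0, 1.0]
path = sufficient_path(toy_scores, x)
for subset in path:
    print(subset, predict(toy_scores, masked(x, subset)))
```

Each entry in `path` is a set of retained feature indices; reading the sequence from first to last shows which features the toy model can lose, and in what order, before its prediction changes.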
Related papers
- Explaining the Unexplained: Revealing Hidden Correlations for Better Interpretability [1.8274323268621635]
Real Explainer (RealExp) is an interpretability method that decouples the Shapley Value into individual feature importance and feature correlation importance.
RealExp enhances interpretability by precisely quantifying both individual feature contributions and their interactions.
arXiv Detail & Related papers (2024-12-02T10:50:50Z) - Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We study how well large language models (LLMs) explain their generations through rationales.
We show that prompting-based methods are less "faithful" than attribution-based explanations.
arXiv Detail & Related papers (2024-06-28T20:06:30Z) - Towards a Unified Framework for Evaluating Explanations [0.6138671548064356]
We argue that explanations serve as mediators between models and stakeholders, whether for intrinsically interpretable models or opaque black-box models.
We illustrate these criteria, as well as specific evaluation methods, using examples from an ongoing study of an interpretable neural network for predicting a particular learner behavior.
arXiv Detail & Related papers (2024-05-22T21:49:28Z) - Log Probabilities Are a Reliable Estimate of Semantic Plausibility in Base and Instruction-Tuned Language Models [50.15455336684986]
We evaluate the effectiveness of LogProbs and basic prompting to measure semantic plausibility.
We find that LogProbs offers a more reliable measure of semantic plausibility than direct zero-shot prompting.
We conclude that, even in the era of prompt-based evaluations, LogProbs constitute a useful metric of semantic plausibility.
arXiv Detail & Related papers (2024-03-21T22:08:44Z) - On the stability, correctness and plausibility of visual explanation methods based on feature importance [0.0]
We study the articulation between the stability, correctness and plausibility of explanations based on feature importance for image classifiers.
We show that the existing metrics for evaluating these properties do not always agree, raising the issue of what constitutes a good evaluation metric for explanations.
arXiv Detail & Related papers (2023-10-25T08:59:21Z) - MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z) - A Survey on the Robustness of Feature Importance and Counterfactual Explanations [12.599872913953238]
We present a survey of the works that analysed the robustness of two classes of local explanations.
The survey aims to unify existing definitions of robustness, introduces a taxonomy to classify different robustness approaches, and discusses some interesting results.
arXiv Detail & Related papers (2021-10-30T22:48:04Z) - Logic Constraints to Feature Importances [17.234442722611803]
The "black box" nature of AI models often limits their reliable application in high-stakes fields such as diagnostic techniques and autonomous guidance.
Recent works have shown that an adequate level of interpretability could enforce the more general concept of model trustworthiness.
The basic idea of this paper is to exploit the human prior knowledge of the features' importance for a specific task, in order to coherently aid the phase of the model's fitting.
arXiv Detail & Related papers (2021-10-13T09:28:38Z) - Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for feature based explanations by robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z) - Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.