Towards Better Model Understanding with Path-Sufficient Explanations
- URL: http://arxiv.org/abs/2109.06181v1
- Date: Mon, 13 Sep 2021 16:06:10 GMT
- Title: Towards Better Model Understanding with Path-Sufficient Explanations
- Authors: Ronny Luss, Amit Dhurandhar
- Abstract summary: The Path-Sufficient Explanations Method (PSEM) outputs a sequence of sufficient explanations of strictly decreasing size for a given input.
PSEM can be thought of as tracing the local boundary of the model in a smooth manner, thus providing better intuition about the local model behavior for the specific input.
A user study demonstrates the strength of the method in communicating local behavior, with many users able to correctly determine the prediction made by the model.
- Score: 11.517059323883444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature-based local attribution methods are amongst the most prevalent in
the explainable artificial intelligence (XAI) literature. Going beyond standard
correlation, methods have recently been proposed that highlight what should be
minimally sufficient to justify the classification of an input (viz. pertinent
positives). While minimal sufficiency is an attractive property, the resulting
explanations are often too sparse for a human to understand and evaluate the
local behavior of the model, thus making it difficult to judge its overall
quality. To overcome these limitations, we propose a novel method called
Path-Sufficient Explanations Method (PSEM) that outputs a sequence of
sufficient explanations for a given input of strictly decreasing size (or
value) -- from original input to a minimally sufficient explanation -- which
can be thought to trace the local boundary of the model in a smooth manner,
thus providing better intuition about the local model behavior for the specific
input. We validate these claims, both qualitatively and quantitatively, with
experiments that show the benefit of PSEM across all three modalities (image,
tabular and text). A user study depicts the strength of the method in
communicating the local behavior, where (many) users are able to correctly
determine the prediction made by a model.
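To make the idea concrete, below is a minimal sketch of one way a path of increasingly sparse sufficient explanations could be produced for a tabular input: greedily zero out features while the model's predicted class is preserved. This is an illustrative approximation under assumed choices (a `predict_fn` returning class probabilities, a zero baseline for removed features, and a greedy one-feature-at-a-time removal order), not the paper's actual optimization.

```python
import numpy as np

def path_sufficient_explanations(x, predict_fn, n_levels=4):
    """Illustrative sketch: build a path of feature masks of strictly
    decreasing size, each of which still yields the original prediction
    when removed features are replaced by a zero baseline (an assumed
    baseline choice, not necessarily the paper's)."""
    x = np.asarray(x, dtype=float)
    target = np.argmax(predict_fn(x))      # class to preserve along the path
    keep = np.ones_like(x, dtype=bool)     # start from the full input
    path = [keep.copy()]

    for _ in range(n_levels):
        best_drop, best_conf = None, -np.inf
        # try dropping each currently kept feature; take the drop that
        # preserves the target class with the highest confidence
        for i in np.flatnonzero(keep):
            trial = keep.copy()
            trial[i] = False
            probs = predict_fn(np.where(trial, x, 0.0))
            if np.argmax(probs) == target and probs[target] > best_conf:
                best_drop, best_conf = i, probs[target]
        if best_drop is None:              # no single drop keeps the prediction
            break
        keep[best_drop] = False
        path.append(keep.copy())           # next, strictly smaller explanation
    return path                            # from full input toward a minimal one
```

Each successive mask in the returned path keeps fewer features yet still yields the original class, giving a coarse-to-fine view of what the model relies on locally; PSEM enforces this sufficiency structure through its own formulation rather than a greedy heuristic like the one above.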
Related papers
- MASALA: Model-Agnostic Surrogate Explanations by Locality Adaptation [3.587367153279351]
Existing local Explainable AI (XAI) methods select a region of the input space in the vicinity of a given input instance, for which they approximate the behaviour of a model using a simpler and more interpretable surrogate model.
We propose a novel method, MASALA, for generating explanations, which automatically determines the appropriate local region of impactful model behaviour for each individual instance being explained.
arXiv Detail & Related papers (2024-08-19T15:26:45Z) - Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We study how well large language models (LLMs) explain their generations through rationales.
We show that prompting-based methods are less "faithful" than attribution-based explanations.
arXiv Detail & Related papers (2024-06-28T20:06:30Z) - Understanding prompt engineering may not require rethinking
generalization [56.38207873589642]
We show that the discrete nature of prompts, combined with a PAC-Bayes prior given by a language model, results in generalization bounds that are remarkably tight by the standards of the literature.
This work provides a possible justification for the widespread practice of prompt engineering.
arXiv Detail & Related papers (2023-10-06T00:52:48Z) - Sampling Based On Natural Image Statistics Improves Local Surrogate
Explainers [111.31448606885672]
Surrogate explainers are a popular post-hoc interpretability method to further understand how a model arrives at a prediction.
We propose two approaches to incorporate natural image statistics into surrogate explainers, namely (1) altering the method for sampling the local neighbourhood and (2) using perceptual metrics to convey some of the properties of the distribution of natural images.
arXiv Detail & Related papers (2022-08-08T08:10:13Z) - MACE: An Efficient Model-Agnostic Framework for Counterfactual
Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness of MACE, showing better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - Locally Invariant Explanations: Towards Stable and Unidirectional
Explanations through Local Invariant Learning [15.886405745163234]
We propose a model agnostic local explanation method inspired by the invariant risk minimization principle.
Our algorithm is simple and efficient to train, and can ascertain stable input features for local decisions of a black-box without access to side information.
arXiv Detail & Related papers (2022-01-28T14:29:25Z) - Evaluation of Local Model-Agnostic Explanations Using Ground Truth [4.278336455989584]
Explanation techniques are commonly evaluated using human-grounded methods.
We propose a functionally-grounded evaluation procedure for local model-agnostic explanation techniques.
arXiv Detail & Related papers (2021-06-04T13:47:31Z) - Search Methods for Sufficient, Socially-Aligned Feature Importance
Explanations with In-Distribution Counterfactuals [72.00815192668193]
Feature importance (FI) estimates are a popular form of explanation, and they are commonly created and evaluated by computing the change in model confidence caused by removing certain input features at test time.
We study several under-explored dimensions of FI-based explanations, providing conceptual and empirical improvements for this form of explanation.
arXiv Detail & Related papers (2021-06-01T20:36:48Z) - Building Reliable Explanations of Unreliable Neural Networks: Locally
Smoothing Perspective of Model Interpretation [0.0]
We present a novel method for reliably explaining the predictions of neural networks.
Our method is built on the assumption of a smooth landscape in the loss function of the model prediction.
arXiv Detail & Related papers (2021-03-26T08:52:11Z) - Goal-directed Generation of Discrete Structures with Conditional
Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z) - What Do You See? Evaluation of Explainable Artificial Intelligence (XAI)
Interpretability through Neural Backdoors [15.211935029680879]
Explainable AI (XAI) methods have been proposed to interpret how a deep neural network arrives at its predictions.
Current evaluation approaches either require subjective input from humans or incur high computation cost with automated evaluation.
We propose backdoor trigger patterns--hidden malicious functionalities that cause misclassification--to automate the evaluation of saliency explanations.
arXiv Detail & Related papers (2020-09-22T15:53:19Z)