Framework for Evaluating Faithfulness of Local Explanations
- URL: http://arxiv.org/abs/2202.00734v1
- Date: Tue, 1 Feb 2022 20:14:06 GMT
- Title: Framework for Evaluating Faithfulness of Local Explanations
- Authors: Sanjoy Dasgupta, Nave Frost, Michal Moshkovitz
- Abstract summary: We study the faithfulness of an explanation system to the underlying prediction model.
For a variety of existing explanation systems, such as anchors, we analytically study these quantities.
We provide estimators and sample complexity bounds for empirically determining the faithfulness of black-box explanation systems.
- Score: 21.648639081403754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the faithfulness of an explanation system to the underlying
prediction model. We show that this can be captured by two properties,
consistency and sufficiency, and introduce quantitative measures of the extent
to which these hold. Interestingly, these measures depend on the test-time data
distribution. For a variety of existing explanation systems, such as anchors,
we analytically study these quantities. We also provide estimators and sample
complexity bounds for empirically determining the faithfulness of black-box
explanation systems. Finally, we experimentally validate the new properties and
estimators.
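The abstract names the two properties but does not define them, so the following is only an illustrative sketch under assumed definitions, not the paper's estimators. It assumes that consistency at a point x is the probability that another point with the same explanation receives the same prediction, and that sufficiency at x is the probability that another point covered by x's explanation receives the same prediction. Both are averaged over an i.i.d. sample from the test-time distribution. The names `predict`, `explain`, and `covers` are hypothetical callables supplied by the user.

```python
from collections import Counter, defaultdict


def estimate_consistency(samples, predict, explain):
    """Plug-in estimate of (assumed) global consistency.

    For each sample x, take the fraction of *other* samples with the same
    explanation that also share x's prediction, then average over samples
    that have at least one such peer. Explanations must be hashable.
    """
    groups = defaultdict(list)
    for x in samples:
        groups[explain(x)].append(predict(x))

    local_scores = []
    for preds in groups.values():
        n = len(preds)
        if n < 2:
            continue  # no peer with the same explanation in this sample
        counts = Counter(preds)
        for p in preds:
            local_scores.append((counts[p] - 1) / (n - 1))
    return sum(local_scores) / len(local_scores) if local_scores else float("nan")


def estimate_sufficiency(samples, predict, explain, covers):
    """Plug-in estimate of (assumed) global sufficiency.

    covers(explanation, x) is a hypothetical predicate that says whether an
    explanation applies to the point x. Runs in O(n^2) over the sample.
    """
    preds = [predict(x) for x in samples]
    local_scores = []
    for i, x in enumerate(samples):
        e = explain(x)
        peers = [j for j, y in enumerate(samples) if j != i and covers(e, y)]
        if not peers:
            continue  # explanation covers no other sampled point
        agree = sum(preds[j] == preds[i] for j in peers)
        local_scores.append(agree / len(peers))
    return sum(local_scores) / len(local_scores) if local_scores else float("nan")


# Toy example: a threshold model explained by coarse buckets / rules.
samples = list(range(10))
predict = lambda x: int(x >= 5)

bucket_explain = lambda x: x // 4                      # buckets {0..3}, {4..7}, {8, 9}
rule_explain = lambda x: ("ge", 4) if x >= 4 else ("lt", 4)
rule_covers = lambda e, x: (x >= e[1]) if e[0] == "ge" else (x < e[1])

consistency = estimate_consistency(samples, predict, bucket_explain)
sufficiency = estimate_sufficiency(samples, predict, rule_explain, rule_covers)
print(consistency, sufficiency)
```

In this toy setup both estimates come out at about 0.8, because the point x = 4 shares a bucket (or rule) with points that the model labels differently. The values depend on the sample, which matches the abstract's remark that the measures depend on the test-time data distribution.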
Related papers
- Rethinking Distance Metrics for Counterfactual Explainability [53.436414009687]
We investigate a framing for counterfactual generation methods that considers counterfactuals not as independent draws from a region around the reference, but as jointly sampled with the reference from the underlying data distribution.
We derive a distance metric, tailored for counterfactual similarity that can be applied to a broad range of settings.
arXiv Detail & Related papers (2024-10-18T15:06:50Z)
- On the stability, correctness and plausibility of visual explanation methods based on feature importance [0.0]
We study the articulation between the stability, correctness and plausibility of explanations based on feature importance for image classifiers.
We show that the existing metrics for evaluating these properties do not always agree, raising the issue of what constitutes a good evaluation metric for explanations.
arXiv Detail & Related papers (2023-10-25T08:59:21Z)
- On the Properties and Estimation of Pointwise Mutual Information Profiles [49.877314063833296]
The pointwise mutual information profile, or simply profile, is the distribution of pointwise mutual information for a given pair of random variables.
We introduce a novel family of distributions, Bend and Mix Models, for which the profile can be accurately estimated using Monte Carlo methods.
arXiv Detail & Related papers (2023-10-16T10:02:24Z)
- On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z)
- Data-Driven Observability Analysis for Nonlinear Stochastic Systems [5.4511976387114895]
Distinguishability and observability are key properties of dynamical systems.
We show that both concepts are equivalent for a class of systems that includes linear systems.
We propose a statistical test to determine a threshold above which two states can be considered distinguishable with high confidence.
arXiv Detail & Related papers (2023-02-23T12:51:03Z)
- TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
Estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
- Uncertainty Quantification of Surrogate Explanations: an Ordinal Consensus Approach [1.3750624267664155]
We produce estimates of the uncertainty of a given explanation by measuring the consensus amongst a set of diverse bootstrapped surrogate explainers.
We empirically illustrate the properties of this approach through experiments on state-of-the-art Convolutional Neural Network ensembles.
arXiv Detail & Related papers (2021-11-17T13:55:58Z)
- Estimating informativeness of samples with Smooth Unique Information [108.25192785062367]
We measure how much a sample informs the final weights and how much it informs the function computed by the weights.
We give efficient approximations of these quantities using a linearized network.
We apply these measures to several problems, such as dataset summarization.
arXiv Detail & Related papers (2021-01-17T10:29:29Z)
- Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End [17.226134854746267]
We present a method to generate feature attribution explanations from a set of counterfactual examples.
We show how counterfactual examples can be used to evaluate the goodness of an attribution-based explanation in terms of its necessity and sufficiency.
arXiv Detail & Related papers (2020-11-10T05:41:43Z)
- Explaining predictive models with mixed features using Shapley values and conditional inference trees [1.8065361710947976]
Shapley values stand out as a sound method to explain predictions from any type of machine learning model.
We propose a method to explain mixed dependent features by modeling the dependence structure of the features using conditional inference trees.
arXiv Detail & Related papers (2020-07-02T11:25:45Z)
- Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for feature-based explanations through robustness analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z)