Guarantee Regions for Local Explanations
- URL: http://arxiv.org/abs/2402.12737v1
- Date: Tue, 20 Feb 2024 06:04:44 GMT
- Title: Guarantee Regions for Local Explanations
- Authors: Marton Havasi, Sonali Parbhoo, Finale Doshi-Velez
- Abstract summary: We propose an anchor-based algorithm for identifying regions in which local explanations are guaranteed to be correct.
Our method produces an interpretable feature-aligned box where the prediction of the local surrogate model is guaranteed to match the predictive model.
- Score: 29.429229877959663
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Interpretability methods that utilise local surrogate models (e.g. LIME) are
very good at describing the behaviour of the predictive model at a point of
interest, but they are not guaranteed to extrapolate to the local region
surrounding the point. However, overfitting to the local curvature of the
predictive model and malicious tampering can significantly limit extrapolation.
We propose an anchor-based algorithm for identifying regions in which local
explanations are guaranteed to be correct by explicitly describing those
intervals along which the input features can be trusted. Our method produces an
interpretable feature-aligned box where the prediction of the local surrogate
model is guaranteed to match the predictive model. We demonstrate that our
algorithm can be used to find explanations with larger guarantee regions that
better cover the data manifold compared to existing baselines. We also show how
our method can identify misleading local explanations with significantly poorer
guarantee regions.
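The idea of a feature-aligned box in which a local surrogate agrees with the predictive model can be illustrated with a small sketch. This is not the paper's anchor-based algorithm and it gives no formal guarantee: it only estimates agreement inside a given box by uniform sampling. The names `box_agreement`, `model`, `surrogate`, the tolerance `tol`, and the toy functions are all assumptions made up for illustration.

```python
import random


def box_agreement(model, surrogate, lower, upper, tol=1e-2, n_samples=2000, seed=0):
    """Estimate the fraction of points in the axis-aligned box [lower, upper]
    where the surrogate matches the model to within `tol`.

    Illustrative Monte Carlo check only; it cannot certify agreement everywhere.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Draw one point uniformly from the box, one coordinate per feature.
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        if abs(model(x) - surrogate(x)) <= tol:
            hits += 1
    return hits / n_samples


def model(x):
    # Hypothetical predictive model: curved in x[0], linear in x[1].
    return x[0] ** 2 + 0.5 * x[1]


def surrogate(x):
    # Hypothetical local linear surrogate: first-order expansion of `model`
    # around the point of interest (1, 0). Its error is (x[0] - 1) ** 2.
    return 1.0 + 2.0 * (x[0] - 1.0) + 0.5 * x[1]


# Narrow interval on the curved feature, wide interval on the linear one.
small = box_agreement(model, surrogate, [0.95, -1.0], [1.05, 1.0])
# Wide interval on the curved feature: the surrogate stops extrapolating.
large = box_agreement(model, surrogate, [0.0, -1.0], [2.0, 1.0])

print(f"agreement in small box: {small:.3f}")
print(f"agreement in large box: {large:.3f}")
```

In this toy setting the surrogate can be trusted over a wide interval of the second feature but only a narrow interval of the first, which is the kind of per-feature statement an interpretable box is meant to convey.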
Related papers
- Conditionally valid Probabilistic Conformal Prediction [57.80927226809277]
We develop a new method for creating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution.
We demonstrate the effectiveness of our approach through extensive simulations, showing that it outperforms existing methods in terms of conditional coverage.
arXiv Detail & Related papers (2024-07-01T20:44:48Z)
- GLIME: General, Stable and Local LIME Explanation [11.002828804775392]
Local Interpretable Model-agnostic Explanations (LIME) is a widely adopted method for understanding model behaviors.
We introduce GLIME, an enhanced framework extending LIME and unifying several prior methods.
By employing a local and unbiased sampling distribution, GLIME generates explanations with higher local fidelity compared to LIME.
arXiv Detail & Related papers (2023-11-27T11:17:20Z)
- Semi-Supervised Crowd Counting with Contextual Modeling: Facilitating Holistic Understanding of Crowd Scenes [19.987151025364067]
This paper presents a new semi-supervised method for training a reliable crowd counting model.
We foster the model's intrinsic 'subitizing' capability, which allows it to accurately estimate the count in regions.
Our method achieves the state-of-the-art performance, surpassing previous approaches by a large margin on challenging benchmarks.
arXiv Detail & Related papers (2023-10-16T12:42:43Z)
- Numerically assisted determination of local models in network scenarios [55.2480439325792]
We develop a numerical tool for finding explicit local models that reproduce a given statistical behaviour.
We provide conjectures for the critical visibilities of the Greenberger-Horne-Zeilinger (GHZ) and W distributions.
The developed codes and documentation are publicly available at281.com/mariofilho/localmodels.
arXiv Detail & Related papers (2023-03-17T13:24:04Z)
- RbX: Region-based explanations of prediction models [69.3939291118954]
Region-based explanations (RbX) is a model-agnostic method to generate local explanations of scalar outputs from a black-box prediction model.
RbX is guaranteed to satisfy a "sparsity axiom," which requires that features which do not enter into the prediction model are assigned zero importance.
arXiv Detail & Related papers (2022-10-17T03:38:06Z)
- Change Detection for Local Explainability in Evolving Data Streams [72.4816340552763]
Local feature attribution methods have become a popular technique for post-hoc and model-agnostic explanations.
It is often unclear how local attributions behave in realistic, constantly evolving settings such as streaming and online applications.
We present CDLEEDS, a flexible and model-agnostic framework for detecting local change and concept drift.
arXiv Detail & Related papers (2022-09-06T18:38:34Z)
- Sampling Based On Natural Image Statistics Improves Local Surrogate Explainers [111.31448606885672]
Surrogate explainers are a popular post-hoc interpretability method to further understand how a model arrives at a prediction.
We propose two approaches to do so, namely (1) altering the method for sampling the local neighbourhood and (2) using perceptual metrics to convey some of the properties of the distribution of natural images.
arXiv Detail & Related papers (2022-08-08T08:10:13Z)
- Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning [15.886405745163234]
We propose a model-agnostic local explanation method inspired by the invariant risk minimization principle.
Our algorithm is simple and efficient to train, and can ascertain stable input features for local decisions of a black-box without access to side information.
arXiv Detail & Related papers (2022-01-28T14:29:25Z)
- XPROAX-Local explanations for text classification with progressive neighborhood approximation [13.312630052709766]
We propose a progressive approximation of the neighborhood using counterfactual instances as initial landmarks.
We then refine counterfactuals and generate factuals in the neighborhood of the input instance to be explained.
Our experiments on real-world datasets demonstrate that our method outperforms the competitors in terms of usefulness and stability.
arXiv Detail & Related papers (2021-09-30T11:01:07Z)
- MD-split+: Practical Local Conformal Inference in High Dimensions [0.5439020425819]
MD-split+ is a practical local conformal approach that creates X partitions based on localized model performance.
We discuss how our local partitions philosophically align with expected behavior from an unattainable conditional conformal inference approach.
arXiv Detail & Related papers (2021-07-07T15:19:16Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.