MASALA: Model-Agnostic Surrogate Explanations by Locality Adaptation
- URL: http://arxiv.org/abs/2408.10085v1
- Date: Mon, 19 Aug 2024 15:26:45 GMT
- Title: MASALA: Model-Agnostic Surrogate Explanations by Locality Adaptation
- Authors: Saif Anwar, Nathan Griffiths, Abhir Bhalerao, Thomas Popham
- Abstract summary: Existing local Explainable AI (XAI) methods select a region of the input space in the vicinity of a given input instance, for which they approximate the behaviour of a model using a simpler and more interpretable surrogate model.
We propose a novel method, MASALA, for generating explanations, which automatically determines the appropriate local region of impactful model behaviour for each individual instance being explained.
- Score: 3.587367153279351
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Existing local Explainable AI (XAI) methods, such as LIME, select a region of the input space in the vicinity of a given input instance, for which they approximate the behaviour of a model using a simpler and more interpretable surrogate model. The size of this region is often controlled by a user-defined locality hyperparameter. In this paper, we demonstrate the difficulties associated with defining a suitable locality size to capture impactful model behaviour, as well as the inadequacy of using a single locality size to explain all predictions. We propose a novel method, MASALA, for generating explanations, which automatically determines the appropriate local region of impactful model behaviour for each individual instance being explained. MASALA approximates the local behaviour used by a complex model to make a prediction by fitting a linear surrogate model to a set of points which experience similar model behaviour. These points are found by clustering the input space into regions of linear behavioural trends exhibited by the model. We compare the fidelity and consistency of explanations generated by our method with existing local XAI methods, namely LIME and CHILLI. Experiments on the PHM08 and MIDAS datasets show that our method produces more faithful and consistent explanations than existing methods, without the need to define any sensitive locality hyperparameters.
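Read as a recipe, the abstract suggests: cluster the input space into regions where the model behaves similarly, then fit a linear surrogate to the region containing the instance. The Python sketch below is only an illustration of that shape; the joint input/output k-means step and the fixed cluster count are simplifying assumptions standing in for MASALA's automatic grouping of points into linear behavioural trends, and `predict_fn`, `X`, and `x` are placeholder names for the black-box model, a background sample, and the instance being explained.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

def local_surrogate_explanation(predict_fn, X, x, n_clusters=8, seed=0):
    """Illustrative sketch: cluster the input space into regions of similar
    model behaviour, then fit a linear surrogate to the region containing x."""
    y = np.asarray(predict_fn(X), dtype=float)   # black-box predictions

    # Cluster inputs jointly with the model's output so points in the same
    # cluster exhibit broadly similar behaviour. (MASALA groups points by
    # linear behavioural trends; this joint clustering is only a stand-in,
    # and the number of clusters is fixed here rather than determined
    # automatically.)
    Z = np.column_stack([X, y])
    mu, sd = Z.mean(axis=0), Z.std(axis=0) + 1e-9
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict((Z - mu) / sd)

    # The local region is the cluster containing the query instance, so no
    # user-defined locality hyperparameter is needed.
    fx = float(np.ravel(predict_fn(np.atleast_2d(x)))[0])
    zx = (np.concatenate([np.ravel(x), [fx]]) - mu) / sd
    region = labels == km.predict(zx.reshape(1, -1))[0]

    # Coefficients of the interpretable linear surrogate act as the explanation.
    surrogate = LinearRegression().fit(X[region], y[region])
    return surrogate.coef_, surrogate.intercept_
```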
Related papers
- Guarantee Regions for Local Explanations [29.429229877959663]
We propose an anchor-based algorithm for identifying regions in which local explanations are guaranteed to be correct.
Our method produces an interpretable feature-aligned box where the prediction of the local surrogate model is guaranteed to match the predictive model.
arXiv Detail & Related papers (2024-02-20T06:04:44Z)
- GLIME: General, Stable and Local LIME Explanation [11.002828804775392]
Local Interpretable Model-agnostic Explanations (LIME) is a widely adopted method for understanding model behaviors.
We introduce GLIME, an enhanced framework extending LIME and unifying several prior methods.
By employing a local and unbiased sampling distribution, GLIME generates explanations with higher local fidelity compared to LIME.
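The ingredient highlighted here is the sampling distribution. Below is a minimal sketch of an instance-centred (local, unbiased) sampler feeding a ridge surrogate; it is not the GLIME framework itself, which unifies several samplers and prior LIME variants, and `predict_fn` and the Gaussian width `sigma` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

def instance_centred_explanation(predict_fn, x, sigma=0.3, n_samples=1000, seed=0):
    # x: 1-D feature vector of the instance being explained.
    rng = np.random.default_rng(seed)

    # Sample perturbations from a distribution centred on the instance itself
    # (local and unbiased), rather than from training-set statistics as in
    # the standard LIME sampler.
    X_local = x + sigma * rng.standard_normal((n_samples, x.shape[0]))

    # Fit a linear surrogate to the black-box outputs on this neighbourhood;
    # its coefficients serve as the local explanation.
    surrogate = Ridge(alpha=1.0).fit(X_local, predict_fn(X_local))
    return surrogate.coef_
```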
arXiv Detail & Related papers (2023-11-27T11:17:20Z)
- Reconstructing Spatiotemporal Data with C-VAEs [49.1574468325115]
Continuous representations of moving regions are commonly used for spatiotemporal data.
In this work, we explore the capabilities of Conditional Variational Autoencoder (C-VAE) models to generate realistic representations of regions' evolution.
arXiv Detail & Related papers (2023-07-12T15:34:10Z)
- RbX: Region-based explanations of prediction models [69.3939291118954]
Region-based explanations (RbX) is a model-agnostic method to generate local explanations of scalar outputs from a black-box prediction model.
RbX is guaranteed to satisfy a "sparsity axiom," which requires that features which do not enter into the prediction model are assigned zero importance.
arXiv Detail & Related papers (2022-10-17T03:38:06Z)
- Change Detection for Local Explainability in Evolving Data Streams [72.4816340552763]
Local feature attribution methods have become a popular technique for post-hoc and model-agnostic explanations.
It is often unclear how local attributions behave in realistic, constantly evolving settings such as streaming and online applications.
We present CDLEEDS, a flexible and model-agnostic framework for detecting local change and concept drift.
arXiv Detail & Related papers (2022-09-06T18:38:34Z)
- Distilling Model Failures as Directions in Latent Space [87.30726685335098]
We present a scalable method for automatically distilling a model's failure modes.
We harness linear classifiers to identify consistent error patterns, and induce a natural representation of these failure modes as directions within the feature space.
We demonstrate that this framework allows us to discover and automatically caption challenging subpopulations within the training dataset, and intervene to improve the model's performance on these subpopulations.
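A minimal sketch of the idea as summarised above, assuming precomputed embeddings, model predictions, and labels: a linear classifier separating the model's errors from its correct predictions yields a direction in feature space, and projections onto it rank examples by how strongly they express that failure mode. The paper's full pipeline, including automatic captioning of the subpopulations, goes beyond this.

```python
import numpy as np
from sklearn.svm import LinearSVC

def failure_direction(embeddings, preds, labels):
    # 1 = the model got this example wrong, 0 = correct.
    # (Assumes the model makes at least some errors on this set.)
    errors = (preds != labels).astype(int)

    # A linear classifier separating incorrect from correct examples in the
    # feature space; its weight vector is a direction along which failures
    # concentrate.
    clf = LinearSVC(C=1.0).fit(embeddings, errors)
    direction = clf.coef_[0] / np.linalg.norm(clf.coef_[0])

    # Projecting examples onto the direction ranks them by how strongly they
    # express the failure mode, surfacing a challenging subpopulation.
    scores = embeddings @ direction
    return direction, np.argsort(-scores)
```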
arXiv Detail & Related papers (2022-06-29T16:35:24Z)
- Contrastive Neighborhood Alignment [81.65103777329874]
We present Contrastive Neighborhood Alignment (CNA), a manifold learning approach to maintain the topology of learned features.
The target model aims to mimic the local structure of the source representation space using a contrastive loss.
CNA is illustrated in three scenarios: manifold learning, where the model maintains the local topology of the original data in a dimension-reduced space; model distillation, where a small student model is trained to mimic a larger teacher; and legacy model update, where an older model is replaced by a more powerful one.
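A minimal sketch, assuming a batch of paired source and target features, of how a neighbourhood-level contrastive objective of this kind could look: each anchor's nearest neighbour in the frozen source space acts as the positive in an InfoNCE-style loss over target-space similarities. The exact CNA loss (choice of neighbours, temperature, negatives) may differ.

```python
import torch
import torch.nn.functional as F

def neighbourhood_contrastive_loss(source_feats, target_feats, tau=0.1):
    n = source_feats.shape[0]
    eye = torch.eye(n, dtype=torch.bool, device=source_feats.device)

    # Nearest neighbour of each anchor in the (frozen) source representation
    # space, found from cosine similarities with self-matches excluded.
    with torch.no_grad():
        s = F.normalize(source_feats, dim=1)
        positives = (s @ s.t()).masked_fill(eye, float("-inf")).argmax(dim=1)

    # Similarities between the target model's representations, self excluded.
    t = F.normalize(target_feats, dim=1)
    logits = ((t @ t.t()) / tau).masked_fill(eye, float("-inf"))

    # Cross-entropy with the source-space neighbour as the positive pulls it
    # closer than every other point in the batch, so the target space mimics
    # the local structure of the source space.
    return F.cross_entropy(logits, positives)
```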
arXiv Detail & Related papers (2022-01-06T04:58:31Z)
- Towards Better Model Understanding with Path-Sufficient Explanations [11.517059323883444]
The Path-Sufficient Explanations Method (PSEM) produces a sequence of sufficient explanations of strictly decreasing size for a given input.
PSEM can be thought of as tracing the local boundary of the model in a smooth manner, providing better intuition about the local model behavior for the specific input.
A user study demonstrates the method's strength in communicating the local behavior, with (many) users able to correctly determine the prediction made by the model.
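As a loose illustration only (PSEM's actual construction is more principled), the sketch below greedily masks features with a hypothetical `baseline` input and records each shrinking subset that still approximately preserves the prediction, yielding a path of progressively smaller sufficient explanations.

```python
import numpy as np

def sufficiency_path(predict_fn, x, baseline, tol=1e-3):
    # Greedy illustration: repeatedly drop the feature whose masking changes
    # the prediction least, keeping each smaller subset that stays sufficient.
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    keep = list(range(x.shape[0]))
    target = predict_fn(x.reshape(1, -1))[0]
    path = [list(keep)]                       # largest explanation first

    while len(keep) > 1:
        candidates = []
        for i in keep:
            kept = [j for j in keep if j != i]
            masked = baseline.copy()          # all other features masked
            masked[kept] = x[kept]
            delta = abs(predict_fn(masked.reshape(1, -1))[0] - target)
            candidates.append((delta, i))
        delta, feature = min(candidates)
        if delta > tol:                       # no smaller subset is sufficient
            break
        keep.remove(feature)
        path.append(list(keep))               # strictly decreasing size
    return path
```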
arXiv Detail & Related papers (2021-09-13T16:06:10Z)
- MD-split+: Practical Local Conformal Inference in High Dimensions [0.5439020425819]
MD-split+ is a practical local conformal approach that creates partitions of the feature space X based on localized model performance.
We discuss how our local partitions philosophically align with expected behavior from an unattainable conditional conformal inference approach.
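A minimal sketch of partition-wise split conformal intervals under simple assumptions: the partition here is a plain k-means clustering of the calibration inputs, whereas MD-split+ derives its partitions from localized model performance; `predict_fn` and the held-out calibration split are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def partitioned_conformal_intervals(predict_fn, X_cal, y_cal, X_test,
                                    alpha=0.1, n_groups=5, seed=0):
    # Partition the input space; MD-split+ builds its partitions from
    # localized model performance, while this sketch simply clusters X.
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=seed).fit(X_cal)
    cal_groups, test_groups = km.labels_, km.predict(X_test)

    # Per-partition conformal quantile of absolute calibration residuals.
    resid = np.abs(y_cal - predict_fn(X_cal))
    q = np.empty(n_groups)
    for g in range(n_groups):
        r = np.sort(resid[cal_groups == g])
        k = int(np.ceil((len(r) + 1) * (1 - alpha))) - 1
        q[g] = r[min(k, len(r) - 1)]

    # Locally calibrated prediction intervals for the test points.
    preds = predict_fn(X_test)
    return preds - q[test_groups], preds + q[test_groups]
```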
arXiv Detail & Related papers (2021-07-07T15:19:16Z)
- VAE-LIME: Deep Generative Model Based Approach for Local Data-Driven Model Interpretability Applied to the Ironmaking Industry [70.10343492784465]
It is necessary to expose to the process engineer not only the model's predictions but also their interpretability.
Model-agnostic local interpretability solutions based on LIME have recently emerged to improve the original method.
We present in this paper a novel approach, VAE-LIME, for local interpretability of data-driven models forecasting the temperature of the hot metal produced by a blast furnace.
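A rough sketch of the general recipe the name suggests: sample a neighbourhood with a generative model, then fit a LIME-style weighted linear surrogate. Here `encode` and `decode` stand in for a VAE already trained on the process data, and the Gaussian latent perturbation and RBF weighting are assumptions rather than the paper's exact choices.

```python
import numpy as np
from sklearn.linear_model import Ridge

def vae_lime_explanation(predict_fn, encode, decode, x, n_samples=500,
                         latent_sigma=0.5, kernel_width=1.0, seed=0):
    rng = np.random.default_rng(seed)

    # Perturb the instance in the VAE's latent space and decode, so the
    # sampled neighbourhood stays on the data manifold instead of being
    # drawn feature-by-feature as in vanilla LIME.
    z = encode(x.reshape(1, -1))
    X_local = decode(z + latent_sigma * rng.standard_normal((n_samples, z.shape[1])))

    # Weight samples by proximity to the instance and fit a weighted linear
    # surrogate to the black-box predictions on the neighbourhood.
    weights = np.exp(-np.sum((X_local - x) ** 2, axis=1) / kernel_width ** 2)
    surrogate = Ridge(alpha=1.0).fit(X_local, predict_fn(X_local), sample_weight=weights)
    return surrogate.coef_
```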
arXiv Detail & Related papers (2020-07-15T07:07:07Z)
- Explaining Predictions by Approximating the Local Decision Boundary [3.60160227126201]
We present a new procedure for local decision boundary approximation (DBA).
We train a variational autoencoder to learn a Euclidean latent space of encoded data representations.
We exploit attribute annotations to map the latent space to attributes that are meaningful to the user.
arXiv Detail & Related papers (2020-06-14T19:12:42Z)