Related papers: ExplainReduce: Summarising local explanations via proxies

URL: http://arxiv.org/abs/2502.10311v1
Date: Fri, 14 Feb 2025 17:14:02 GMT
Title: ExplainReduce: Summarising local explanations via proxies
Authors: Lauri Seppäläinen, Mudong Guo, Kai Puolamäki
Abstract summary: An often-used model-agnostic approach to XAI involves using simple models as local approximations to produce so-called local explanations. This paper shows how a large set of local explanations can be reduced to a small "proxy set" of simple models, which can act as a generative global explanation.
Score: 2.3185929089334594
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Most commonly used non-linear machine learning methods are closed-box models, uninterpretable to humans. The field of explainable artificial intelligence (XAI) aims to develop tools to examine the inner workings of these closed boxes. An often-used model-agnostic approach to XAI involves using simple models as local approximations to produce so-called local explanations; examples of this approach include LIME, SHAP, and SLISEMAP. This paper shows how a large set of local explanations can be reduced to a small "proxy set" of simple models, which can act as a generative global explanation. This reduction procedure, ExplainReduce, can be formulated as an optimisation problem and approximated efficiently using greedy heuristics.
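The abstract says the reduction can be posed as an optimisation problem and approximated with greedy heuristics, but it does not spell out the objective. The sketch below is one possible reading, not the authors' implementation: it assumes a hypothetical loss matrix `loss[i][j]` (the loss of local model `i` on data item `j`) and a hypothetical tolerance `epsilon`, and it greedily picks up to `k` models that maximise the number of items explained within that tolerance. The function name `greedy_proxy_set` and the toy numbers are illustrative only.

```python
def greedy_proxy_set(loss, k, epsilon):
    """Greedily pick up to k local models (rows of `loss`) that maximise coverage.

    Assumption: loss[i][j] is the loss of local model i on data item j, and an
    item counts as covered when some chosen model has loss <= epsilon on it.
    """
    n_models = len(loss)
    n_items = len(loss[0]) if n_models else 0
    # For every candidate model, the set of items it explains well enough.
    covers = [
        {j for j in range(n_items) if loss[i][j] <= epsilon}
        for i in range(n_models)
    ]
    chosen, covered = [], set()
    for _ in range(min(k, n_models)):
        # Choose the unused model that adds the most newly covered items.
        best = max(
            (i for i in range(n_models) if i not in chosen),
            key=lambda i: len(covers[i] - covered),
        )
        if not covers[best] - covered:
            break  # no remaining model improves coverage
        chosen.append(best)
        covered |= covers[best]
    return chosen, covered


# Toy data: 4 local models evaluated on 5 items (made-up values).
loss = [
    [0.1, 0.2, 0.9, 0.8, 0.7],
    [0.9, 0.8, 0.1, 0.2, 0.9],
    [0.1, 0.9, 0.9, 0.9, 0.1],
    [0.9, 0.9, 0.9, 0.1, 0.9],
]
proxies, covered = greedy_proxy_set(loss, k=2, epsilon=0.3)
print(proxies, sorted(covered))
```

With these toy values, two proxies cover four of the five items. Greedy selection is a natural fit for coverage-style objectives, though the paper may use different objectives or tie-breaking.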

Related papers

Minimizing False-Positive Attributions in Explanations of Non-Linear Models [5.186535458271726]
Suppressor variables can influence model predictions without being dependent on the target outcome. These variables may cause false-positive feature attributions, undermining the utility of explanations. We introduce PatternLocal, a novel XAI technique that addresses this gap.
arXiv Detail & Related papers (2025-05-16T13:06:12Z)
MASALA: Model-Agnostic Surrogate Explanations by Locality Adaptation [3.587367153279351]
Existing local Explainable AI (XAI) methods select a region of the input space in the vicinity of a given input instance, for which they approximate the behaviour of a model using a simpler and more interpretable surrogate model. We propose a novel method, MASALA, for generating explanations, which automatically determines the appropriate local region of impactful model behaviour for each individual instance being explained.
arXiv Detail & Related papers (2024-08-19T15:26:45Z)
Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners. We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting. Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions. Our findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z)
Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models. We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z)
Understanding Post-hoc Explainers: The Case of Anchors [6.681943980068051]
We present a theoretical analysis of a rule-based interpretability method that highlights a small set of words to explain a text classifier's decision. After formalizing its algorithm and providing useful insights, we demonstrate mathematically that Anchors produces meaningful results.
arXiv Detail & Related papers (2023-03-15T17:56:34Z)
Local Interpretable Model Agnostic Shap Explanations for machine learning models [0.0]
We propose a methodology that we define as Local Interpretable Model Agnostic Shap Explanations (LIMASE). This proposed technique uses Shapley values under the LIME paradigm to achieve the following: (a) explain the prediction of any model by using a locally faithful and interpretable decision tree model on which the Tree Explainer is used to calculate the Shapley values and give visually interpretable explanations.
arXiv Detail & Related papers (2022-10-10T10:07:27Z)
An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system. Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches. This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z)
MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE). In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity. Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
Locally Invariant Explanations: Towards Stable and Unidirectional Explanations through Local Invariant Learning [15.886405745163234]
We propose a model agnostic local explanation method inspired by the invariant risk minimization principle. Our algorithm is simple and efficient to train, and can ascertain stable input features for local decisions of a black-box without access to side information.
arXiv Detail & Related papers (2022-01-28T14:29:25Z)
MeLIME: Meaningful Local Explanation for Machine Learning Models [2.819725769698229]
We show that our approach, MeLIME, produces more meaningful explanations compared to other techniques over different ML models. MeLIME generalizes the LIME method, allowing more flexible perturbation sampling and the use of different local interpretable models.
arXiv Detail & Related papers (2020-09-12T16:06:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.