Related papers: Selective Explanations

Selective Explanations

URL: http://arxiv.org/abs/2405.19562v1
Date: Wed, 29 May 2024 23:08:31 GMT
Title: Selective Explanations
Authors: Lucas Monteiro Paes, Dennis Wei, Flavio P. Calmon,
Abstract summary: A machine learning model is trained to predict feature attribution scores with only one inference. Despite their efficiency, amortized explainers can produce inaccurate predictions and misleading explanations. We propose selective explanations, a novel feature attribution method that detects when amortized explainers generate low-quality explanations.
Score: 14.312717332216073
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Feature attribution methods explain black-box machine learning (ML) models by assigning importance scores to input features. These methods can be computationally expensive for large ML models. To address this challenge, there has been increasing efforts to develop amortized explainers, where a machine learning model is trained to predict feature attribution scores with only one inference. Despite their efficiency, amortized explainers can produce inaccurate predictions and misleading explanations. In this paper, we propose selective explanations, a novel feature attribution method that (i) detects when amortized explainers generate low-quality explanations and (ii) improves these explanations using a technique called explanations with initial guess. Our selective explanation method allows practitioners to specify the fraction of samples that receive explanations with initial guess, offering a principled way to bridge the gap between amortized explainers and their high-quality counterparts.

Related papers

Are We Merely Justifying Results ex Post Facto? Quantifying Explanatory Inversion in Post-Hoc Model Explanations [87.68633031231924]
Post-hoc explanation methods provide interpretation by attributing predictions to input features. Do these explanations unintentionally reverse the natural relationship between inputs and outputs? We propose Inversion Quantification (IQ), a framework that quantifies the degree to which explanations rely on outputs and deviate from faithful input-output relationships.
arXiv Detail & Related papers (2025-04-11T19:00:12Z)
Logic-based Explanations for Linear Support Vector Classifiers with Reject Option [0.0]
Support Vector (SVC) is a well-known Machine Learning (ML) model for linear classification problems. We propose a logic-based approach with formal guarantees on the correctness and minimality of explanations for linear SVCs with reject option.
arXiv Detail & Related papers (2024-03-24T15:14:44Z)
Causal Generative Explainers using Counterfactual Inference: A Case Study on the Morpho-MNIST Dataset [5.458813674116228]
We present a generative counterfactual inference approach to study the influence of visual features as well as causal factors. We employ visual explanation methods from OmnixAI open source toolkit to compare them with our proposed methods. This finding suggests that our methods are well-suited for generating highly interpretable counterfactual explanations on causal datasets.
arXiv Detail & Related papers (2024-01-21T04:07:48Z)
Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models. We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z)
Streamlining models with explanations in the learning loop [0.0]
Several explainable AI methods allow a Machine Learning user to get insights on the classification process of a black-box model. We exploit this information to design a feature engineering phase, where we combine explanations with feature values.
arXiv Detail & Related papers (2023-02-15T16:08:32Z)
Explaining Reject Options of Learning Vector Quantization Classifiers [6.125017875330933]
We propose to use counterfactual explanations for explaining rejects in machine learning models. We investigate how to efficiently compute counterfactual explanations of different reject options for an important class of models.
arXiv Detail & Related papers (2022-02-15T08:16:10Z)
Diagnostics-Guided Explanation Generation [32.97930902104502]
Explanations shed light on a machine learning model's rationales and can aid in identifying deficiencies in its reasoning process. We show how to optimise for several diagnostic properties when training a model to generate sentence-level explanations.
arXiv Detail & Related papers (2021-09-08T16:27:52Z)
Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction. We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss. Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
Evaluating Explanations: How much do explanations from the teacher aid students? [103.05037537415811]
We formalize the value of explanations using a student-teacher paradigm that measures the extent to which explanations improve student models in learning. Unlike many prior proposals to evaluate explanations, our approach cannot be easily gamed, enabling principled, scalable, and automatic evaluation of attributions.
arXiv Detail & Related papers (2020-12-01T23:40:21Z)
The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models. We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
Model extraction from counterfactual explanations [68.8204255655161]
We show how an adversary can leverage the information provided by counterfactual explanations to build high-fidelity and high-accuracy model extraction attacks. Our attack enables the adversary to build a faithful copy of a target model by accessing its counterfactual explanations.
arXiv Detail & Related papers (2020-09-03T19:02:55Z)
SCOUT: Self-aware Discriminant Counterfactual Explanations [78.79534272979305]
The problem of counterfactual visual explanations is considered. A new family of discriminant explanations is introduced. The resulting counterfactual explanations are optimization free and thus much faster than previous methods.
arXiv Detail & Related papers (2020-04-16T17:05:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.