Feature Removal Is a Unifying Principle for Model Explanation Methods
- URL: http://arxiv.org/abs/2011.03623v2
- Date: Mon, 22 Aug 2022 23:49:30 GMT
- Title: Feature Removal Is a Unifying Principle for Model Explanation Methods
- Authors: Ian Covert, Scott Lundberg, Su-In Lee
- Abstract summary: We examine the literature and find that many methods are based on a shared principle of explaining by removing.
We develop a framework for removal-based explanations that characterizes each method along three dimensions.
Our framework unifies 26 existing methods, including several of the most widely used approaches.
- Score: 14.50261153230204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Researchers have proposed a wide variety of model explanation approaches, but
it remains unclear how most methods are related or when one method is
preferable to another. We examine the literature and find that many methods are
based on a shared principle of explaining by removing - essentially, measuring
the impact of removing sets of features from a model. These methods vary in
several respects, so we develop a framework for removal-based explanations that
characterizes each method along three dimensions: 1) how the method removes
features, 2) what model behavior the method explains, and 3) how the method
summarizes each feature's influence. Our framework unifies 26 existing methods,
including several of the most widely used approaches (SHAP, LIME, Meaningful
Perturbations, permutation tests). Exposing the fundamental similarities
between these methods empowers users to reason about which tools to use, and
suggests promising directions for ongoing model explainability research.
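The explaining-by-removing principle above can be illustrated with a minimal leave-one-out sketch: replace "removed" features with a baseline value, re-evaluate the model, and summarize each feature's influence as the resulting change in output. The `model`, `x`, and `baseline` names below are hypothetical stand-ins, and real removal-based methods (SHAP, LIME, Meaningful Perturbations) make different choices along the paper's three dimensions.

```python
import numpy as np

def remove_and_score(model, x, baseline, subset):
    """Keep only the features in `subset`; replace the rest with baseline values."""
    x_masked = baseline.copy()
    idx = list(subset)
    x_masked[idx] = x[idx]
    return model(x_masked)

def leave_one_out_importance(model, x, baseline):
    """Summarize each feature's influence as the output drop when it alone is removed."""
    n = len(x)
    full = remove_and_score(model, x, baseline, range(n))
    scores = []
    for i in range(n):
        kept = [j for j in range(n) if j != i]
        scores.append(full - remove_and_score(model, x, baseline, kept))
    return np.array(scores)

# Hypothetical black box: a simple linear function, so importances are recoverable by hand.
model = lambda x: 3.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]
x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)  # here, "removal" means replacing a feature with zero

print(leave_one_out_importance(model, x, baseline))  # → [3. 1. 0.]
```

Swapping the masking rule (e.g. marginalizing over a data distribution instead of a fixed baseline), the explained behavior, or the summary step (e.g. Shapley values instead of leave-one-out) recovers different methods in the framework.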
Related papers
- Using Interpretation Methods for Model Enhancement [44.29399911722625]
We propose a framework of utilizing interpretation methods and gold rationales to enhance models.
Our framework is very general in the sense that it can incorporate various interpretation methods.
Experimental results show that our framework is effective especially in low-resource settings.
arXiv Detail & Related papers (2024-04-02T16:10:29Z)
- An Axiomatic Approach to Model-Agnostic Concept Explanations [67.84000759813435]
We propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivity, and similarity.
We then establish connections with previous concept explanation methods, offering insight into their varying semantic meanings.
arXiv Detail & Related papers (2024-01-12T20:53:35Z)
- counterfactuals: An R Package for Counterfactual Explanation Methods [9.505961054570523]
We introduce the counterfactuals R package, which provides a modular and unified interface for counterfactual explanation methods.
We implement three existing counterfactual explanation methods and propose some optional methodological extensions.
We show how to integrate additional counterfactual explanation methods into the package.
arXiv Detail & Related papers (2023-04-13T14:29:15Z)
- Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post hoc Explanations [16.678003262147346]
We show that popular explanation methods are instances of the local function approximation (LFA) framework.
We set forth a guiding principle based on the function approximation perspective, considering a method to be effective if it recovers the underlying model.
We empirically validate our theoretical results using various real world datasets, model classes, and prediction tasks.
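The local function approximation (LFA) idea summarized above can be sketched as a LIME-style procedure: sample perturbations around an input, weight them by proximity, and fit a linear surrogate to the black box's outputs. This is a simplified sketch, not the paper's implementation; the function name and kernel choice are assumptions for illustration.

```python
import numpy as np

def local_linear_explanation(model, x, n_samples=500, sigma=0.5, seed=0):
    """Fit a proximity-weighted linear model to a black box around the point x.

    Returns the local linear coefficients, one attribution per feature.
    """
    rng = np.random.default_rng(seed)
    X = x + rng.normal(scale=sigma, size=(n_samples, len(x)))   # local perturbations
    y = np.array([model(z) for z in X])                          # black-box outputs
    w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * sigma ** 2)) # proximity kernel
    Xb = np.hstack([X, np.ones((n_samples, 1))])                 # add intercept column
    sqrt_w = np.sqrt(w)
    # Weighted least squares via rescaled ordinary least squares.
    coef, *_ = np.linalg.lstsq(Xb * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef[:-1]  # drop the intercept; keep per-feature attributions

# Hypothetical black box: exactly linear, so the local fit should recover [2, -1].
coef = local_linear_explanation(lambda z: 2 * z[0] - z[1], np.array([0.0, 0.0]))
```

On a genuinely nonlinear model, the recovered coefficients would instead approximate the local behavior near `x`, which is the sense in which the LFA framework judges a method "effective" when it recovers the underlying model.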
arXiv Detail & Related papers (2022-06-02T19:09:30Z)
- A Survey on Deep Semi-supervised Learning [51.26862262550445]
We first present a taxonomy for deep semi-supervised learning that categorizes existing methods.
We then offer a detailed comparison of these methods in terms of the type of losses, contributions, and architecture differences.
arXiv Detail & Related papers (2021-02-28T16:22:58Z)
- Explaining by Removing: A Unified Framework for Model Explanation [14.50261153230204]
Removal-based explanations are based on the principle of simulating feature removal to quantify each feature's influence.
We develop a framework that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence.
This newly understood class of explanation methods has rich connections that we examine using tools that have been largely overlooked by the explainability literature.
arXiv Detail & Related papers (2020-11-21T00:47:48Z)
- Learning explanations that are hard to vary [75.30552491694066]
We show that averaging across examples can favor memorization and 'patchwork' solutions that sew together different strategies.
We then propose and experimentally validate a simple alternative algorithm based on a logical AND.
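The logical-AND idea in this summary can be sketched as a sign-agreement mask over per-environment gradients: a gradient component is kept only when its sign is unanimous across environments, otherwise it is zeroed. This is a minimal sketch of the concept, assuming a hypothetical `and_mask_gradients` helper; the paper's actual algorithm operates during training with an agreement threshold.

```python
import numpy as np

def and_mask_gradients(env_grads):
    """Average gradients across environments, but zero out any component whose
    sign does not agree across *all* environments (a logical AND on signs)."""
    grads = np.stack(env_grads)                           # shape: (n_envs, n_params)
    signs = np.sign(grads)
    agree = np.abs(signs.sum(axis=0)) == len(env_grads)   # unanimous sign agreement
    return grads.mean(axis=0) * agree

# Two hypothetical environments: the first component's sign agrees, the second conflicts.
g1 = np.array([0.5, -1.0])
g2 = np.array([0.3, 2.0])
masked = and_mask_gradients([g1, g2])  # first component kept, second zeroed
```

Gradient directions shared by all environments are the "explanations that are hard to vary"; conflicting directions, which would let the model patch together environment-specific strategies, are suppressed.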
arXiv Detail & Related papers (2020-09-01T10:17:48Z)
- Evaluating the Disentanglement of Deep Generative Models through Manifold Topology [66.06153115971732]
We present a method for quantifying disentanglement that only uses the generative model.
We empirically evaluate several state-of-the-art models across multiple datasets.
arXiv Detail & Related papers (2020-06-05T20:54:11Z)
- There and Back Again: Revisiting Backpropagation Saliency Methods [87.40330595283969]
Saliency methods seek to explain the predictions of a model by producing an importance map across each input sample.
A popular class of such methods is based on backpropagating a signal and analyzing the resulting gradient.
We propose a single framework under which several such methods can be unified.
arXiv Detail & Related papers (2020-04-06T17:58:08Z)
- Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods.
The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences arising from its use.