Feature Removal Is a Unifying Principle for Model Explanation Methods
- URL: http://arxiv.org/abs/2011.03623v2
- Date: Mon, 22 Aug 2022 23:49:30 GMT
- Title: Feature Removal Is a Unifying Principle for Model Explanation Methods
- Authors: Ian Covert, Scott Lundberg, Su-In Lee
- Abstract summary: We examine the literature and find that many methods are based on a shared principle of explaining by removing.
We develop a framework for removal-based explanations that characterizes each method along three dimensions.
Our framework unifies 26 existing methods, including several of the most widely used approaches.
- Score: 14.50261153230204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Researchers have proposed a wide variety of model explanation approaches, but
it remains unclear how most methods are related or when one method is
preferable to another. We examine the literature and find that many methods are
based on a shared principle of explaining by removing - essentially, measuring
the impact of removing sets of features from a model. These methods vary in
several respects, so we develop a framework for removal-based explanations that
characterizes each method along three dimensions: 1) how the method removes
features, 2) what model behavior the method explains, and 3) how the method
summarizes each feature's influence. Our framework unifies 26 existing methods,
including several of the most widely used approaches (SHAP, LIME, Meaningful
Perturbations, permutation tests). Exposing the fundamental similarities
between these methods empowers users to reason about which tools to use, and
suggests promising directions for ongoing model explainability research.
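The explaining-by-removing principle above can be illustrated with a minimal leave-one-out sketch: replace "removed" features with a baseline value, re-evaluate the model, and summarize each feature's influence as the resulting change in output. The `model`, `x`, and `baseline` names below are hypothetical stand-ins, and real removal-based methods (SHAP, LIME, Meaningful Perturbations) make different choices along the paper's three dimensions.

```python
import numpy as np

def remove_and_score(model, x, baseline, subset):
    """Keep only the features in `subset`; replace the rest with baseline values."""
    x_masked = baseline.copy()
    idx = list(subset)
    x_masked[idx] = x[idx]
    return model(x_masked)

def leave_one_out_importance(model, x, baseline):
    """Summarize each feature's influence as the output drop when it alone is removed."""
    n = len(x)
    full = remove_and_score(model, x, baseline, range(n))
    scores = []
    for i in range(n):
        kept = [j for j in range(n) if j != i]
        scores.append(full - remove_and_score(model, x, baseline, kept))
    return np.array(scores)

# Hypothetical black box: a simple linear function, so importances are recoverable by hand.
model = lambda x: 3.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]
x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)  # here, "removal" means replacing a feature with zero

print(leave_one_out_importance(model, x, baseline))  # → [3. 1. 0.]
```

Swapping the masking rule (e.g. marginalizing over a data distribution instead of a fixed baseline), the explained behavior, or the summary step (e.g. Shapley values instead of leave-one-out) recovers different methods in the framework.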
Related papers
- Using Interpretation Methods for Model Enhancement [44.29399911722625]
We propose a framework of utilizing interpretation methods and gold rationales to enhance models.
Our framework is very general in the sense that it can incorporate various interpretation methods.
Experimental results show that our framework is effective especially in low-resource settings.
arXiv Detail & Related papers (2024-04-02T16:10:29Z)
- An Axiomatic Approach to Model-Agnostic Concept Explanations [67.84000759813435]
We propose an approach to concept explanations that satisfy three natural axioms: linearity, recursivity, and similarity.
We then establish connections with previous concept explanation methods, offering insight into their varying semantic meanings.
arXiv Detail & Related papers (2024-01-12T20:53:35Z)
- counterfactuals: An R Package for Counterfactual Explanation Methods [9.505961054570523]
We introduce the counterfactuals R package, which provides a modular and unified interface for counterfactual explanation methods.
We implement three existing counterfactual explanation methods and propose some optional methodological extensions.
We show how to integrate additional counterfactual explanation methods into the package.
arXiv Detail & Related papers (2023-04-13T14:29:15Z)
- Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post hoc Explanations [16.678003262147346]
We show that popular explanation methods are instances of the local function approximation (LFA) framework.
We set forth a guiding principle based on the function approximation perspective, considering a method to be effective if it recovers the underlying model.
We empirically validate our theoretical results using various real world datasets, model classes, and prediction tasks.
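The local function approximation (LFA) idea summarized above can be sketched as a LIME-style procedure: sample perturbations around an input, weight them by proximity, and fit a linear surrogate to the black box's outputs. This is a simplified sketch, not the paper's implementation; the function name and kernel choice are assumptions for illustration.

```python
import numpy as np

def local_linear_explanation(model, x, n_samples=500, sigma=0.5, seed=0):
    """Fit a proximity-weighted linear model to a black box around the point x.

    Returns the local linear coefficients, one attribution per feature.
    """
    rng = np.random.default_rng(seed)
    X = x + rng.normal(scale=sigma, size=(n_samples, len(x)))   # local perturbations
    y = np.array([model(z) for z in X])                          # black-box outputs
    w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2 * sigma ** 2)) # proximity kernel
    Xb = np.hstack([X, np.ones((n_samples, 1))])                 # add intercept column
    sqrt_w = np.sqrt(w)
    # Weighted least squares via rescaled ordinary least squares.
    coef, *_ = np.linalg.lstsq(Xb * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef[:-1]  # drop the intercept; keep per-feature attributions

# Hypothetical black box: exactly linear, so the local fit should recover [2, -1].
coef = local_linear_explanation(lambda z: 2 * z[0] - z[1], np.array([0.0, 0.0]))
```

On a genuinely nonlinear model, the recovered coefficients would instead approximate the local behavior near `x`, which is the sense in which the LFA framework judges a method "effective" when it recovers the underlying model.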
arXiv Detail & Related papers (2022-06-02T19:09:30Z)
- A Survey on Deep Semi-supervised Learning [51.26862262550445]
We first present a taxonomy for deep semi-supervised learning that categorizes existing methods.
We then offer a detailed comparison of these methods in terms of the type of losses, contributions, and architecture differences.
arXiv Detail & Related papers (2021-02-28T16:22:58Z)
- Explaining by Removing: A Unified Framework for Model Explanation [14.50261153230204]
Removal-based explanations are based on the principle of simulating feature removal to quantify each feature's influence.
We develop a framework that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence.
This newly understood class of explanation methods has rich connections that we examine using tools that have been largely overlooked by the explainability literature.
arXiv Detail & Related papers (2020-11-21T00:47:48Z)
- Learning explanations that are hard to vary [75.30552491694066]
We show that averaging across examples can favor memorization and 'patchwork' solutions that sew together different strategies.
We then propose and experimentally validate a simple alternative algorithm based on a logical AND.
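The logical-AND idea in this summary can be sketched as a sign-agreement mask over per-environment gradients: a gradient component is kept only when its sign is unanimous across environments, otherwise it is zeroed. This is a minimal sketch of the concept, assuming a hypothetical `and_mask_gradients` helper; the paper's actual algorithm operates during training with an agreement threshold.

```python
import numpy as np

def and_mask_gradients(env_grads):
    """Average gradients across environments, but zero out any component whose
    sign does not agree across *all* environments (a logical AND on signs)."""
    grads = np.stack(env_grads)                           # shape: (n_envs, n_params)
    signs = np.sign(grads)
    agree = np.abs(signs.sum(axis=0)) == len(env_grads)   # unanimous sign agreement
    return grads.mean(axis=0) * agree

# Two hypothetical environments: the first component's sign agrees, the second conflicts.
g1 = np.array([0.5, -1.0])
g2 = np.array([0.3, 2.0])
masked = and_mask_gradients([g1, g2])  # first component kept, second zeroed
```

Gradient directions shared by all environments are the "explanations that are hard to vary"; conflicting directions, which would let the model patch together environment-specific strategies, are suppressed.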
arXiv Detail & Related papers (2020-09-01T10:17:48Z)
- Evaluating the Disentanglement of Deep Generative Models through Manifold Topology [66.06153115971732]
We present a method for quantifying disentanglement that only uses the generative model.
We empirically evaluate several state-of-the-art models across multiple datasets.
arXiv Detail & Related papers (2020-06-05T20:54:11Z)
- There and Back Again: Revisiting Backpropagation Saliency Methods [87.40330595283969]
Saliency methods seek to explain the predictions of a model by producing an importance map across each input sample.
A popular class of such methods is based on backpropagating a signal and analyzing the resulting gradient.
We propose a single framework under which several such methods can be unified.
arXiv Detail & Related papers (2020-04-06T17:58:08Z)
- Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods.
The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences arising from its use.