There and Back Again: Revisiting Backpropagation Saliency Methods
- URL: http://arxiv.org/abs/2004.02866v1
- Date: Mon, 6 Apr 2020 17:58:08 GMT
- Title: There and Back Again: Revisiting Backpropagation Saliency Methods
- Authors: Sylvestre-Alvise Rebuffi, Ruth Fong, Xu Ji, Andrea Vedaldi
- Abstract summary: Saliency methods seek to explain the predictions of a model by producing an importance map across each input sample.
A popular class of such methods is based on backpropagating a signal and analyzing the resulting gradient.
We propose a single framework under which several such methods can be unified.
- Score: 87.40330595283969
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Saliency methods seek to explain the predictions of a model by producing an
importance map across each input sample. A popular class of such methods is
based on backpropagating a signal and analyzing the resulting gradient. Despite
much research on these methods, relatively little work has been done to clarify
the differences between them or the desiderata they should satisfy. Thus, there
is a need to rigorously understand the
relationships between different methods as well as their failure modes. In this
work, we conduct a thorough analysis of backpropagation-based saliency methods
and propose a single framework under which several such methods can be unified.
As a result of our study, we make three additional contributions. First, we use
our framework to propose NormGrad, a novel saliency method based on the spatial
contribution of gradients of convolutional weights. Second, we combine saliency
maps at different layers to test the ability of saliency methods to extract
complementary information at different network levels (e.g., trading off spatial
resolution and distinctiveness) and we explain why some methods fail at
specific layers (e.g., Grad-CAM anywhere besides the last convolutional layer).
Third, we introduce a class-sensitivity metric and a meta-learning-inspired
paradigm, applicable to any saliency method, for improving sensitivity to the
output class being explained.
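The abstract's description of NormGrad suggests a compact implementation: at a chosen convolutional layer, each spatial location's contribution to the weight gradient is the outer product of the incoming activation a(u) and the backpropagated gradient g(u), and the Frobenius norm of that rank-one outer product factorizes as ||g(u)|| * ||a(u)||. The PyTorch sketch below is a hypothetical rendering based only on this description, not the authors' code; the model, layer choice, and input are illustrative placeholders.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Hypothetical NormGrad-style saliency, reconstructed from the abstract only:
# score each location u by the norm of its contribution to the conv-weight
# gradient, ||g(u) a(u)^T||_F, which factorizes as ||g(u)|| * ||a(u)||.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
layer = model.layer4[-1].conv3  # assumed choice: a 1x1 conv, so the input
                                # and output spatial grids coincide

store = {}
layer.register_forward_hook(lambda m, inp, out: store.update(a=inp[0].detach()))
layer.register_full_backward_hook(lambda m, gi, go: store.update(g=go[0].detach()))

x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image batch
logits = model(x)
logits[0, logits[0].argmax()].backward()  # explain the top predicted class

# Channel-wise norms at each spatial location, upsampled to input size.
saliency = store["a"].norm(dim=1) * store["g"].norm(dim=1)  # (1, H, W)
saliency = F.interpolate(saliency.unsqueeze(1), size=x.shape[-2:],
                         mode="bilinear", align_corners=False).squeeze(1)
saliency = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
```

Reusing the same hooks at several layers, and upsampling each map to a common resolution before combining them, would be one simple way to probe the layer-combination experiments the abstract mentions.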
Related papers
- Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models [63.1637853118899]
We propose the first unsupervised and learning-based method to identify interpretable directions in h-space of pre-trained diffusion models.
We employ a shift control module that works on h-space of pre-trained diffusion models to manipulate a sample into a shifted version of itself.
By jointly optimizing them, the model will spontaneously discover disentangled and interpretable directions.
arXiv Detail & Related papers (2023-10-15T18:44:30Z)
- Better Understanding Differences in Attribution Methods via Systematic Evaluations [57.35035463793008]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We use these evaluation schemes to study strengths and shortcomings of some widely used attribution methods over a wide range of models.
arXiv Detail & Related papers (2023-03-21T14:24:58Z)
- Geometrically Guided Integrated Gradients [0.3867363075280543]
We introduce an interpretability method called "geometrically-guided integrated gradients".
Our method explores the model's dynamic behavior from multiple scaled versions of the input and captures the best possible attribution for each input.
We also propose a "model perturbation" sanity check to complement the traditionally used "model randomization" test.
arXiv Detail & Related papers (2022-06-13T05:05:43Z)
- Towards Better Understanding Attribution Methods [77.1487219861185]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions.
We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods.
We also propose a post-processing smoothing step that significantly improves the performance of some attribution methods.
arXiv Detail & Related papers (2022-05-20T20:50:17Z)
- A Survey on Deep Semi-supervised Learning [51.26862262550445]
We first present a taxonomy for deep semi-supervised learning that categorizes existing methods.
We then offer a detailed comparison of these methods in terms of loss types, contributions, and architectural differences.
arXiv Detail & Related papers (2021-02-28T16:22:58Z)
- Multi-head Knowledge Distillation for Model Compression [65.58705111863814]
We propose a simple-to-implement method that uses auxiliary classifiers at intermediate layers to match features.
We show that the proposed method outperforms prior relevant approaches presented in the literature.
arXiv Detail & Related papers (2020-12-05T00:49:14Z)
- Explaining by Removing: A Unified Framework for Model Explanation [14.50261153230204]
Removal-based explanations are based on the principle of simulating feature removal to quantify each feature's influence.
We develop a framework that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence.
This newly understood class of explanation methods has rich connections that we examine using tools that have been largely overlooked by the explainability literature.
arXiv Detail & Related papers (2020-11-21T00:47:48Z)
- Feature Removal Is a Unifying Principle for Model Explanation Methods [14.50261153230204]
We examine the literature and find that many methods are based on a shared principle of explaining by removing.
We develop a framework for removal-based explanations that characterizes each method along three dimensions.
Our framework unifies 26 existing methods, including several of the most widely used approaches.
arXiv Detail & Related papers (2020-11-06T22:37:55Z)
- DANCE: Enhancing saliency maps using decoys [35.46266461621123]
We propose a framework that improves the robustness of saliency methods by following a two-step procedure.
First, we introduce a perturbation mechanism that subtly varies the input sample without changing its intermediate representations.
Second, we compute saliency maps for the perturbed samples and propose a new method to aggregate them (a rough sketch of this two-step loop follows this list).
arXiv Detail & Related papers (2020-02-03T01:21:48Z)
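As a rough illustration of DANCE's two-step procedure summarized above, the sketch below averages saliency maps over several perturbed copies of an input. Generating decoys that leave intermediate representations unchanged is the paper's actual contribution; plain Gaussian noise and mean aggregation are placeholders here, and saliency_fn stands for any map-producing method, such as the NormGrad sketch earlier on this page.

```python
import torch

def aggregate_saliency(x, saliency_fn, n_perturb=8, sigma=0.05):
    """Average saliency maps over perturbed copies of x.

    Placeholder for DANCE's procedure: the paper builds decoys that keep
    intermediate representations fixed; plain Gaussian noise is used here
    only as a stand-in perturbation, and the mean as a stand-in aggregator.
    """
    maps = []
    for _ in range(n_perturb):
        x_pert = x + sigma * torch.randn_like(x)  # placeholder perturbation
        maps.append(saliency_fn(x_pert))          # e.g. a (1, H, W) map
    return torch.stack(maps).mean(dim=0)
```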
This list is automatically generated from the titles and abstracts of the papers on this site.