Rethinking Positive Aggregation and Propagation of Gradients in
Gradient-based Saliency Methods
- URL: http://arxiv.org/abs/2012.00362v1
- Date: Tue, 1 Dec 2020 09:38:54 GMT
- Title: Rethinking Positive Aggregation and Propagation of Gradients in
Gradient-based Saliency Methods
- Authors: Ashkan Khakzar, Soroosh Baselizadeh, Nassir Navab
- Abstract summary: Saliency methods interpret the prediction of a neural network by showing the importance of input elements for that prediction.
We empirically show that two approaches for handling the gradient information, namely positive aggregation and positive propagation, break these methods.
- Score: 47.999621481852266
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Saliency methods interpret the prediction of a neural network by showing the
importance of input elements for that prediction. A popular family of saliency
methods utilizes gradient information. In this work, we empirically show that two approaches for handling the gradient information, namely positive aggregation and positive propagation, break these methods. Though these methods reflect visually salient information in the input, they no longer explain the model prediction: the generated saliency maps are insensitive both to the predicted output and to model parameter randomization.
Specifically, for methods that aggregate the gradients of a chosen layer, such as GradCAM++ and FullGrad, exclusively aggregating positive gradients is
detrimental. We further support this by proposing several variants of
aggregation methods with positive handling of gradient information. For methods
that backpropagate gradient information such as LRP, RectGrad, and Guided
Backpropagation, we show the destructive effect of exclusively propagating
positive gradient information.
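To make the two operations concrete, here is a minimal, hypothetical sketch in PyTorch (the tensor names and the function signature are mine, not the paper's) of a GradCAM-style layer attribution; the `positive_only` switch implements the positive aggregation the paper shows to be harmful:

```python
import torch

def layer_saliency(acts: torch.Tensor, grads: torch.Tensor,
                   positive_only: bool) -> torch.Tensor:
    """Aggregate layer activations weighted by their gradients.

    acts, grads: (C, H, W) activations of a chosen layer and the
    gradients of the target logit w.r.t. those activations.
    """
    if positive_only:
        grads = grads.clamp(min=0)  # keep only positive gradient information
    weights = grads.mean(dim=(1, 2))                   # one weight per channel
    saliency = (weights[:, None, None] * acts).sum(dim=0)
    return torch.relu(saliency)
```

Positive propagation can be sketched the same way. Guided Backpropagation, for instance, zeroes out negative gradient signals at every ReLU on the backward pass; a minimal hook version (again illustrative, not the authors' code):

```python
import torch
import torch.nn as nn

def guided_relu_hook(module, grad_input, grad_output):
    # Guided Backpropagation rule: let a gradient pass only where it is
    # positive (on top of ReLU's own masking by the forward activation).
    return tuple(g.clamp(min=0) if g is not None else None for g in grad_input)

def register_guided_backprop(model: nn.Module):
    handles = []
    for m in model.modules():
        if isinstance(m, nn.ReLU):
            handles.append(m.register_full_backward_hook(guided_relu_hook))
    return handles  # call h.remove() on each handle to restore normal backprop
```

The paper's empirical finding is that maps produced under such positive-only rules still look visually plausible but fail basic sanity checks: they barely change when the target class or the model parameters are changed.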
Related papers
- A Learning Paradigm for Interpretable Gradients [9.074325843851726]
We present a novel training approach to improve the quality of gradients for interpretability.
We find that the resulting gradient is qualitatively less noisy and quantitatively improves the interpretability properties of different networks.
arXiv Detail & Related papers (2024-04-23T13:32:29Z)
- Neural Gradient Learning and Optimization for Oriented Point Normal Estimation [53.611206368815125]
We propose a deep learning approach to learn gradient vectors with consistent orientation from 3D point clouds for normal estimation.
We learn an angular distance field based on local plane geometry to refine the coarse gradient vectors.
Our method efficiently conducts global gradient approximation while achieving better accuracy and generalization ability for local feature description.
arXiv Detail & Related papers (2023-09-17T08:35:11Z)
- Generalizing Backpropagation for Gradient-Based Interpretability [103.2998254573497]
We show that the gradient of a model is a special case of a more general formulation using semirings.
This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics.
arXiv Detail & Related papers (2023-07-06T15:19:53Z)
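The semiring view can be illustrated on a toy computation graph. The snippet below is entirely illustrative (the edge list and names are mine, not the paper's API): reverse accumulation with the (+, x) semiring recovers the ordinary gradient, while a (max, x)-style semiring over derivative magnitudes scores the single most influential path.

```python
from typing import Callable, Dict, List, Tuple

# Tiny graph: x -> a = 2x, x -> b = -3x, y = a + b.
# Each edge stores the local derivative d(dst)/d(src).
edges: List[Tuple[str, str, float]] = [
    ("x", "a", 2.0), ("x", "b", -3.0), ("a", "y", 1.0), ("b", "y", 1.0),
]
topo_order = ["x", "a", "b", "y"]  # forward topological order

def semiring_backprop(plus: Callable[[float, float], float],
                      times: Callable[[float, float], float],
                      one: float, zero: float) -> float:
    """Aggregate, over all paths x -> y, the semiring product of
    local derivatives along each path."""
    value: Dict[str, float] = {n: zero for n in topo_order}
    value["y"] = one
    for node in reversed(topo_order):          # reverse topological order
        for src, dst, d in edges:
            if src == node:
                value[src] = plus(value[src], times(d, value[dst]))
    return value["x"]

# (+, x): ordinary backpropagation, the true gradient dy/dx = -1.0.
print(semiring_backprop(lambda a, b: a + b, lambda a, b: a * b, 1.0, 0.0))
# (max, x) over magnitudes: weight of the most influential path = 3.0.
# (0.0 is a valid "zero" here only because the magnitudes are nonnegative.)
print(semiring_backprop(max, lambda a, b: abs(a) * abs(b), 1.0, 0.0))
```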
- Clip21: Error Feedback for Gradient Clipping [8.979288425347702]
We design Clip21, the first provably effective and practically useful error feedback mechanism for gradient clipping in distributed methods.
Our method converges faster in practice than competing methods.
arXiv Detail & Related papers (2023-05-30T10:41:42Z)
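For context, the generic error-feedback pattern for clipped communication looks roughly as follows. This is a hedged sketch of the general idea, not the exact Clip21 update; the clipping threshold and array shapes are placeholders.

```python
import numpy as np

def clip(v: np.ndarray, tau: float) -> np.ndarray:
    """Scale v down to norm tau if it exceeds tau."""
    norm = np.linalg.norm(v)
    return v if norm <= tau else (tau / norm) * v

def worker_step(e: np.ndarray, g: np.ndarray, tau: float):
    """One round of error-feedback clipping on a single worker.

    e: residual memory carried between rounds, g: fresh local gradient.
    Returns the clipped message to transmit and the updated residual.
    """
    msg = clip(e + g, tau)    # clip memory + gradient, not the raw gradient
    return msg, e + g - msg   # keep what was clipped away for the next round
```

The server averages the workers' messages; because each worker retains its clipping residual, the bias introduced by clipping is corrected over subsequent rounds rather than accumulating.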
- Natural Gradient Methods: Perspectives, Efficient-Scalable Approximations, and Analysis [0.0]
Natural Gradient Descent is a second-order optimization method motivated by information geometry.
It makes use of the Fisher Information Matrix instead of the typically used Hessian.
Being a second-order method, it is infeasible to apply directly to problems with a huge number of parameters and data.
arXiv Detail & Related papers (2023-03-06T04:03:56Z)
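The basic update is theta <- theta - lr * F^{-1} grad, with F the Fisher Information Matrix. A small sketch using the empirical Fisher built from per-example gradients (the damping constant and shapes are illustrative assumptions, and the dense O(d^2) Fisher below is exactly what makes the method infeasible at scale):

```python
import numpy as np

def natural_gradient_step(theta: np.ndarray, grad: np.ndarray,
                          per_example_grads: np.ndarray,
                          lr: float = 0.1, damping: float = 1e-3) -> np.ndarray:
    """One natural-gradient step preconditioned by the empirical Fisher.

    per_example_grads: (n, d) matrix of per-example gradients.
    """
    n, d = per_example_grads.shape
    fisher = per_example_grads.T @ per_example_grads / n
    fisher += damping * np.eye(d)        # damping keeps F invertible
    return theta - lr * np.linalg.solve(fisher, grad)
```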
- Geometrically Guided Integrated Gradients [0.3867363075280543]
We introduce an interpretability method called "geometrically-guided integrated gradients".
Our method explores the model's dynamic behavior from multiple scaled versions of the input and captures the best possible attribution for each input.
We also propose a "model perturbation" sanity check to complement the traditionally used "model randomization" test.
arXiv Detail & Related papers (2022-06-13T05:05:43Z)
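As a reference point, plain Integrated Gradients (the method this paper builds on, not its geometric variant) can be written in a few lines; this sketch assumes `model` maps a single input tensor to a logits vector:

```python
import torch

def integrated_gradients(model, x: torch.Tensor, baseline: torch.Tensor,
                         target: int, steps: int = 50) -> torch.Tensor:
    """Riemann-sum approximation of IG along the path baseline -> x."""
    total = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).requires_grad_(True)
        score = model(point)[target]          # logit of the target class
        grad, = torch.autograd.grad(score, point)
        total += grad
    return (x - baseline) * total / steps     # completeness up to discretization
```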
- Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
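The classic baseline this line of work improves on is the straight-through estimator, which treats the non-differentiable sign function as identity (within a clipping window) on the backward pass; a minimal PyTorch version:

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """sign(x) on the forward pass; straight-through gradient on backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass the gradient through only where |x| <= 1, zero elsewhere.
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)

# Usage: binary = BinarizeSTE.apply(activations)
```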
- There and Back Again: Revisiting Backpropagation Saliency Methods [87.40330595283969]
Saliency methods seek to explain the predictions of a model by producing an importance map across each input sample.
A popular class of such methods is based on backpropagating a signal and analyzing the resulting gradient.
We propose a single framework under which several such methods can be unified.
arXiv Detail & Related papers (2020-04-06T17:58:08Z)
- DANCE: Enhancing saliency maps using decoys [35.46266461621123]
We propose a framework that improves the robustness of saliency methods by following a two-step procedure.
First, we introduce a perturbation mechanism that subtly varies the input sample without changing its intermediate representations.
Second, we compute saliency maps for perturbed samples and propose a new method to aggregate saliency maps.
arXiv Detail & Related papers (2020-02-03T01:21:48Z)
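The two-step recipe (perturb, then aggregate) can be sketched generically. The version below uses plain Gaussian noise as the perturbation, SmoothGrad-style; DANCE's actual decoys are crafted so that intermediate representations stay unchanged, and `model` and `target` are placeholders:

```python
import torch

def aggregated_saliency(model, x: torch.Tensor, target: int,
                        n: int = 8, sigma: float = 0.05) -> torch.Tensor:
    """Average gradient saliency over n perturbed copies of the input."""
    maps = []
    for _ in range(n):
        perturbed = (x + sigma * torch.randn_like(x)).requires_grad_(True)
        score = model(perturbed)[target]
        grad, = torch.autograd.grad(score, perturbed)
        maps.append(grad.abs())             # per-sample saliency map
    return torch.stack(maps).mean(dim=0)    # aggregation step
```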
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.