A Learning Paradigm for Interpretable Gradients
- URL: http://arxiv.org/abs/2404.15024v1
- Date: Tue, 23 Apr 2024 13:32:29 GMT
- Title: A Learning Paradigm for Interpretable Gradients
- Authors: Felipe Torres Figueroa, Hanwei Zhang, Ronan Sicre, Yannis Avrithis, Stephane Ayache
- Abstract summary: We present a novel training approach to improve the quality of gradients for interpretability.
We find that the resulting gradient is qualitatively less noisy and quantitatively improves the interpretability properties of different networks.
- Score: 9.074325843851726
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper studies interpretability of convolutional networks by means of saliency maps. Most approaches based on Class Activation Maps (CAM) combine information from fully connected layers and gradient through variants of backpropagation. However, it is well understood that gradients are noisy and alternatives like guided backpropagation have been proposed to obtain better visualization at inference. In this work, we present a novel training approach to improve the quality of gradients for interpretability. In particular, we introduce a regularization loss such that the gradient with respect to the input image obtained by standard backpropagation is similar to the gradient obtained by guided backpropagation. We find that the resulting gradient is qualitatively less noisy and improves quantitatively the interpretability properties of different networks, using several interpretability methods.
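The regularization idea in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the tiny one-hidden-layer ReLU network, the function names (`forward`, `input_gradient`, `gradient_regularizer`), and the choice of mean squared difference as the similarity measure are all assumptions made for illustration, since this page does not state which error function the paper uses.

```python
# Sketch only: compares the standard input gradient with the
# guided-backpropagation gradient for a toy network y = w2 . relu(W1 x).
# All names and the squared-error measure are hypothetical choices.

def forward(W1, w2, x):
    """Return hidden pre-activations z and the scalar output y."""
    z = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    y = sum(v * max(0.0, zj) for v, zj in zip(w2, z))
    return z, y


def input_gradient(W1, w2, x, guided=False):
    """Gradient of y with respect to x.

    Standard backpropagation masks the signal where the forward
    pre-activation was not positive. With guided=True, negative backward
    signals are additionally zeroed at the ReLU.
    """
    z, _ = forward(W1, w2, x)
    grad_z = []
    for g, zj in zip(w2, z):           # dy/dh_j is w2[j]
        g = g if zj > 0 else 0.0       # standard ReLU backward rule
        if guided:
            g = max(0.0, g)            # guided rule: keep positive signal only
        grad_z.append(g)
    # Backward through the linear layer: dx_i = sum_j W1[j][i] * grad_z[j]
    return [sum(W1[j][i] * grad_z[j] for j in range(len(W1)))
            for i in range(len(x))]


def gradient_regularizer(W1, w2, x):
    """Mean squared difference between the two input gradients (assumed measure)."""
    g_std = input_gradient(W1, w2, x)
    g_gbp = input_gradient(W1, w2, x, guided=True)
    return sum((a - b) ** 2 for a, b in zip(g_std, g_gbp)) / len(x)


if __name__ == "__main__":
    W1 = [[1.0, -1.0], [0.5, 2.0]]
    w2 = [1.0, -1.0]
    x = [2.0, 1.0]
    print(input_gradient(W1, w2, x))               # standard gradient
    print(input_gradient(W1, w2, x, guided=True))  # guided gradient
    print(gradient_regularizer(W1, w2, x))
```

In an actual training setup, a term like this would presumably be added to the classification loss with a weighting coefficient and differentiated with respect to the network weights, which requires differentiating through the gradient computation itself. The sketch only evaluates the term for fixed weights.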
Related papers
- Rethinking the Principle of Gradient Smooth Methods in Model Explanation [2.6819730646697972]
Gradient Smoothing is an efficient approach to reducing noise in gradient-based model explanation methods.
We propose an adaptive gradient smoothing method, AdaptGrad, based on these insights.
arXiv Detail & Related papers (2024-10-10T08:24:27Z)
- Expected Grad-CAM: Towards gradient faithfulness [7.2203673761998495]
Gradient-weighted CAM approaches still rely on vanilla gradients.
Our work proposes a gradient-weighted CAM augmentation that tackles the saturation and sensitivity problem.
arXiv Detail & Related papers (2024-06-03T12:40:30Z)
- How to guess a gradient [68.98681202222664]
We show that gradients are more structured than previously thought.
Exploiting this structure can significantly improve gradient-free optimization schemes.
We highlight new challenges in overcoming the large gap between optimizing with exact gradients and guessing the gradients.
arXiv Detail & Related papers (2023-12-07T21:40:44Z)
- Can Forward Gradient Match Backpropagation? [2.875726839945885]
Forward Gradients have been shown to be usable for neural network training.
We propose to strongly bias our gradient guesses in directions that are much more promising, such as feedback obtained from small, local auxiliary networks.
We find that using gradients obtained from a local loss as a candidate direction drastically improves on random noise in Forward Gradient methods.
arXiv Detail & Related papers (2023-06-12T08:53:41Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Rethinking Positive Aggregation and Propagation of Gradients in Gradient-based Saliency Methods [47.999621481852266]
Saliency methods interpret the prediction of a neural network by showing the importance of input elements for that prediction.
We empirically show that two approaches for handling the gradient information, namely positive aggregation and positive propagation, break these methods.
arXiv Detail & Related papers (2020-12-01T09:38:54Z)
- Channel-Directed Gradients for Optimization of Convolutional Neural Networks [50.34913837546743]
We introduce optimization methods for convolutional neural networks that can be used to improve existing gradient-based optimization in terms of generalization error.
We show that defining the gradients along the output channel direction leads to a performance boost, while other directions can be detrimental.
arXiv Detail & Related papers (2020-08-25T00:44:09Z)
- Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution [70.78655569298923]
Integrated Gradients as an attribution method for deep neural network models offers simple implementability.
It suffers from noisy explanations, which makes them harder to interpret.
The SmoothGrad technique is proposed to solve the noisiness issue and smoothen the attribution maps of any gradient-based attribution method.
arXiv Detail & Related papers (2020-04-22T10:43:19Z)
- Disentangling Adaptive Gradient Methods from Learning Rates [65.0397050979662]
We take a deeper look at how adaptive gradient methods interact with the learning rate schedule.
We introduce a "grafting" experiment which decouples an update's magnitude from its direction.
We present some empirical and theoretical retrospectives on the generalization of adaptive gradient methods.
arXiv Detail & Related papers (2020-02-26T21:42:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.