Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training
- URL: http://arxiv.org/abs/2404.04647v1
- Date: Sat, 6 Apr 2024 14:49:36 GMT
- Title: Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training
- Authors: Shizhan Gong, Qi Dou, Farzan Farnia
- Abstract summary: Gradient-based saliency maps often lack desired structures such as sparsity and connectedness when applied to real-world computer vision models.
A frequently used approach to inducing sparsity structures into gradient-based saliency maps is to alter the simple gradient scheme using sparsification or norm-based regularization.
In this work, we propose to apply adversarial training as an in-processing scheme to train neural networks with structured simple gradient maps.
- Score: 18.876749156797935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient-based saliency maps have been widely used to explain the decisions of deep neural network classifiers. However, standard gradient-based interpretation maps, including the simple gradient and integrated gradient algorithms, often lack desired structures such as sparsity and connectedness when applied to real-world computer vision models. A frequently used approach to inducing sparsity in gradient-based saliency maps is to alter the simple gradient scheme through sparsification or norm-based regularization. A drawback of such post-processing methods is that they often lose significant fidelity to the original simple gradient map. In this work, we propose to apply adversarial training as an in-processing scheme to train neural networks with structured simple gradient maps. We show a duality relation between the regularized norms of the adversarial perturbations and of the gradient-based maps, based on which we design adversarial training loss functions that promote sparsity and group-sparsity in simple gradient maps. We present several numerical results showing the influence of the proposed norm-based adversarial training methods on the standard gradient-based maps of common neural network architectures on benchmark image datasets.
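The abstract's duality relation suggests that training against norm-bounded input perturbations implicitly regularizes a dual norm of the simple gradient map (for example, ℓ∞-bounded perturbations pair with an ℓ1-type, sparsity-promoting penalty). The sketch below is only a rough illustration of that idea, not the authors' exact objective; it assumes PyTorch, and `model`, `x`, `y`, `optimizer`, and `epsilon` are placeholder names for image inputs scaled to [0, 1].

```python
# Illustrative sketch only, not the paper's exact training objective.
import torch
import torch.nn.functional as F

def simple_gradient_map(model, x, target):
    """Simple gradient saliency: derivative of the class score w.r.t. the input.
    Assumes x has a batch dimension with a single image."""
    x = x.clone().requires_grad_(True)
    score = model(x)[0, target]
    return torch.autograd.grad(score, x)[0]

def linf_adv_training_step(model, x, y, optimizer, epsilon=4 / 255):
    """One FGSM-style training step with an l_inf-bounded perturbation.
    Per the abstract's duality argument, training against such perturbations
    is meant to regularize an l_1-like norm of the simple gradient map and
    hence push it toward sparsity."""
    x_req = x.clone().requires_grad_(True)
    loss_clean = F.cross_entropy(model(x_req), y)
    grad_x = torch.autograd.grad(loss_clean, x_req)[0]
    # Worst case of the linearized loss over the l_inf ball: epsilon * sign(grad).
    x_adv = (x + epsilon * grad_x.sign()).clamp(0.0, 1.0).detach()

    optimizer.zero_grad()
    loss_adv = F.cross_entropy(model(x_adv), y)
    loss_adv.backward()
    optimizer.step()
    return loss_adv.item()
```

A group-sparsity variant would, per the abstract, replace the ℓ∞ ball with a group-structured perturbation norm dual to a group-sparsity penalty on the gradient map.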
Related papers
- A Learning Paradigm for Interpretable Gradients [9.074325843851726]
We present a novel training approach to improve the quality of gradients for interpretability.
We find that the resulting gradient is qualitatively less noisy and quantitatively improves the interpretability of different networks.
arXiv Detail & Related papers (2024-04-23T13:32:29Z)
- Neural Gradient Regularizer [150.85797800807524]
We propose a neural gradient regularizer (NGR) that expresses the gradient map as the output of a neural network.
NGR is applicable to various image types and different image processing tasks, functioning in a zero-shot learning fashion.
arXiv Detail & Related papers (2023-08-31T10:19:23Z)
- Can Forward Gradient Match Backpropagation? [2.875726839945885]
Forward gradients have been shown to be usable for neural network training.
We propose to strongly bias the gradient guesses toward more promising directions, such as those informed by feedback from small, local auxiliary networks.
We find that using gradients obtained from a local loss as a candidate direction drastically improves on random noise in Forward Gradient methods.
arXiv Detail & Related papers (2023-06-12T08:53:41Z)
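As background for the forward-gradient entry above: the baseline forward gradient samples a random tangent direction, obtains its directional derivative by forward-mode automatic differentiation, and uses the direction scaled by that derivative as an unbiased gradient guess. The minimal sketch below assumes PyTorch 2.x's `torch.func.jvp` and does not implement the paper's contribution of biasing the guess with small local auxiliary networks.

```python
import torch

def forward_gradient(loss_fn, w):
    """Baseline forward-gradient estimate: sample a random tangent v, get the
    directional derivative d_v loss by forward-mode AD, and return
    (d_v loss) * v, an unbiased guess of the true gradient."""
    v = torch.randn_like(w)
    loss, dir_deriv = torch.func.jvp(loss_fn, (w,), (v,))
    return loss, dir_deriv * v

# Toy check: for loss(w) = sum(w^2) the true gradient is 2 * w.
w = torch.tensor([1.0, -2.0, 3.0])
loss, g_hat = forward_gradient(lambda w: (w * w).sum(), w)
```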
- Interpretation of Neural Networks is Susceptible to Universal Adversarial Perturbations [9.054540533394926]
We show the existence of a Universal Perturbation for Interpretation (UPI) for standard image datasets.
We propose a gradient-based optimization method as well as a principal component analysis (PCA)-based approach to compute a UPI which can effectively alter a neural network's gradient-based interpretation on different samples.
arXiv Detail & Related papers (2022-11-30T15:55:40Z)
- Gradient-Based Adversarial and Out-of-Distribution Detection [15.510581400494207]
We introduce confounding labels in gradient generation to probe the effective expressivity of neural networks.
We show that our gradient-based approach allows for capturing the anomaly in inputs based on the effective expressivity of the models.
arXiv Detail & Related papers (2022-06-16T15:50:41Z)
- Entangled Residual Mappings [59.02488598557491]
We introduce entangled residual mappings to generalize the structure of the residual connections.
An entangled residual mapping replaces the identity skip connections with specialized entangled mappings.
We show that while entangled mappings can preserve the iterative refinement of features across various deep models, they influence the representation learning process in convolutional networks.
arXiv Detail & Related papers (2022-06-02T19:36:03Z)
- Generalization Error Analysis of Neural networks with Gradient Based Regularization [2.7286395031146062]
We study gradient-based regularization methods for neural networks.
We introduce a general framework to analyze the generalization error of regularized networks.
We conduct experiments on image classification tasks to show that gradient-based methods can significantly improve generalization.
arXiv Detail & Related papers (2021-07-06T07:54:36Z)
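One common instance of the gradient-based regularization family studied above is an input-gradient (double backpropagation) penalty. The sketch below is a hedged illustration of that generic idea rather than the specific regularizers analyzed in the paper; `model`, `x`, `y`, and `lam` are placeholders.

```python
import torch
import torch.nn.functional as F

def gradient_penalty_loss(model, x, y, lam=0.01):
    """Cross-entropy plus a squared input-gradient penalty (double
    backpropagation). create_graph=True keeps the graph of the gradient
    computation so the penalty term can itself be backpropagated through."""
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    grad_x, = torch.autograd.grad(ce, x, create_graph=True)
    return ce + lam * grad_x.pow(2).sum()
```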
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- Backward Gradient Normalization in Deep Neural Networks [68.8204255655161]
We introduce a new technique for gradient normalization during neural network training.
The gradients are rescaled during the backward pass using normalization layers introduced at certain points within the network architecture.
Results on tests with very deep neural networks show that the new technique effectively controls the gradient norm.
arXiv Detail & Related papers (2021-06-17T13:24:43Z)
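As a hedged illustration of the backward gradient normalization idea above (not necessarily the paper's exact rescaling rule): an identity layer whose backward pass normalizes the incoming gradient to unit ℓ2 norm, which can be inserted at chosen points in a network.

```python
import torch

class NormalizeGrad(torch.autograd.Function):
    """Identity in the forward pass; rescales the gradient flowing backward
    through this point to unit L2 norm."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output / (grad_output.norm() + 1e-12)

class GradNormLayer(torch.nn.Module):
    """Drop-in module that can be placed between existing layers."""
    def forward(self, x):
        return NormalizeGrad.apply(x)
```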
- Channel-Directed Gradients for Optimization of Convolutional Neural Networks [50.34913837546743]
We introduce optimization methods for convolutional neural networks that can be used to improve existing gradient-based optimization in terms of generalization error.
We show that defining the gradients along the output channel direction leads to a performance boost, while other directions can be detrimental.
arXiv Detail & Related papers (2020-08-25T00:44:09Z)
- Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution [70.78655569298923]
Integrated Gradients, as an attribution method for deep neural network models, is simple to implement.
However, it suffers from noisy explanations, which hampers interpretability.
The SmoothGrad technique is proposed to address this noisiness and smooth the attribution maps of any gradient-based attribution method.
arXiv Detail & Related papers (2020-04-22T10:43:19Z)
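For reference, the SmoothGrad technique mentioned in the last entry averages simple gradients over Gaussian-perturbed copies of the input; a minimal sketch follows, where the noise level and sample count are illustrative defaults rather than values from the paper.

```python
import torch

def smoothgrad(model, x, target, n_samples=25, sigma=0.15):
    """SmoothGrad: average the simple gradient over Gaussian-perturbed copies
    of the input. Assumes a single image with a batch dimension and an input
    x that does not itself require gradients."""
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x + sigma * torch.randn_like(x)).detach().requires_grad_(True)
        score = model(noisy)[0, target]
        grads += torch.autograd.grad(score, noisy)[0]
    return grads / n_samples
```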