Forward Learning for Gradient-based Black-box Saliency Map Generation
- URL: http://arxiv.org/abs/2403.15603v2
- Date: Tue, 2 Jul 2024 16:05:48 GMT
- Title: Forward Learning for Gradient-based Black-box Saliency Map Generation
- Authors: Zeliang Zhang, Mingqian Feng, Jinyang Jiang, Rongyi Zhu, Yijie Peng, Chenliang Xu
- Abstract summary: We introduce a novel framework for estimating gradients in black-box settings and generating saliency maps to interpret model decisions.
We employ the likelihood ratio method to estimate output-to-input gradients and utilize them for saliency map generation.
Experiments in black-box settings validate the effectiveness of our method, demonstrating accurate gradient estimation and explainability of generated saliency maps.
- Score: 25.636185607767988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradient-based saliency maps are widely used to explain deep neural network decisions. However, as models become deeper and more black-box, as in closed-source APIs like ChatGPT, computing gradients becomes challenging, hindering conventional explanation methods. In this work, we introduce a novel unified framework for estimating gradients in black-box settings and generating saliency maps to interpret model decisions. We employ the likelihood ratio method to estimate output-to-input gradients and utilize them for saliency map generation. Additionally, we propose blockwise computation techniques to enhance estimation accuracy. Extensive experiments in black-box settings validate the effectiveness of our method, demonstrating accurate gradient estimation and explainability of generated saliency maps. Furthermore, we showcase the scalability of our approach by applying it to explain GPT-Vision, revealing the continued relevance of gradient-based explanation methods in the era of large, closed-source, and black-box models.
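As a rough illustration of the likelihood-ratio idea, the sketch below estimates an input gradient of a black-box function from forward queries only. The quadratic test function, noise scale, and sample count are illustrative choices, not the paper's settings (the paper additionally uses blockwise computation for accuracy):

```python
import numpy as np

def lr_gradient_estimate(f, x, sigma=0.05, n_samples=20000, seed=0):
    """Likelihood-ratio (score-function) gradient estimate.

    Queries the black-box f on Gaussian perturbations of x and weights
    the change in output by the score z / sigma; no backpropagation needed.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples,) + x.shape)
    fx = f(x)  # baseline value, subtracted to reduce variance
    outputs = np.array([f(x + sigma * zi) for zi in z])
    weights = (outputs - fx).reshape((-1,) + (1,) * x.ndim)
    return (weights * z).mean(axis=0) / sigma

# Sanity check on f(v) = sum(v^2), whose true gradient is 2v
x = np.array([1.0, -2.0])
grad = lr_gradient_estimate(lambda v: float(np.sum(v ** 2)), x)
saliency = np.abs(grad)  # saliency map = magnitude of the estimated gradient
```

The estimator only ever calls `f` forward, which is what makes it applicable to closed-source APIs; the variance of the estimate shrinks with more samples.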
Related papers
- Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation [71.21346469382821]
We introduce collaborative black-box tuning (CBBT) for both textual prompt optimization and output feature adaptation for black-box models.
CBBT is extensively evaluated on eleven downstream benchmarks and achieves remarkable improvements compared to existing black-box VL adaptation methods.
arXiv Detail & Related papers (2023-12-26T06:31:28Z)
- On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box [9.368325306722321]
This paper presents a gradient-estimation-based explanation approach that produces gradient-like explanations through only query-level access.
The proposed approach satisfies a set of fundamental properties for attribution methods, which are rigorously proven mathematically, ensuring the quality of its explanations.
In addition to the theoretical analysis, with a focus on image data, the experimental results empirically demonstrate the superiority of the proposed method over state-of-the-art black-box methods and its competitive performance compared to methods with full access.
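As a baseline illustration of query-level gradient estimation (not this paper's specific algorithm), a central-difference estimator recovers the gradient from forward queries alone; the test function is a hypothetical stand-in for a black-box model:

```python
import numpy as np

def query_gradient(f, x, eps=1e-4):
    """Central-difference gradient estimate using only queries to f."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = eps                         # perturb one coordinate
        g.flat[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

g = query_gradient(lambda v: float(np.sum(v ** 2)), np.array([1.0, -2.0]))
```

The cost is two queries per input dimension, which is why sampling-based estimators are preferred for high-dimensional inputs such as images.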
arXiv Detail & Related papers (2023-08-18T08:24:57Z)
- Abs-CAM: A Gradient Optimization Interpretable Approach for Explanation of Convolutional Neural Networks [7.71412567705588]
Class activation mapping-based method has been widely used to interpret the internal decisions of models in computer vision tasks.
We propose an Absolute value Class Activation Mapping-based (Abs-CAM) method, which optimizes the gradients derived from backpropagation.
The framework of Abs-CAM is divided into two phases: generating an initial saliency map and generating the final saliency map.
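A minimal numpy sketch of the CAM-style weighting, with toy tensors standing in for real activations and backpropagated gradients; taking absolute values before channel pooling is the Abs-CAM modification described above, while the rest follows the common Grad-CAM recipe:

```python
import numpy as np

def cam_map(activations, gradients, use_abs=False):
    """Grad-CAM style saliency; Abs-CAM takes |gradients| before pooling."""
    g = np.abs(gradients) if use_abs else gradients
    weights = g.mean(axis=(1, 2))                    # GAP over spatial dims
    m = np.tensordot(weights, activations, axes=1)   # weighted channel sum
    m = np.maximum(m, 0)                             # ReLU
    return m / (m.max() + 1e-8)                      # normalize to [0, 1]

# Toy channel whose gradients cancel under plain averaging
acts = np.ones((1, 2, 2))
grads = np.array([[[1.0, -1.0], [1.0, -1.0]]])
plain = cam_map(acts, grads)                  # channel weight 0 -> blank map
abs_map = cam_map(acts, grads, use_abs=True)  # |grads| keeps the channel
```

The toy example shows the motivation: mixed-sign gradients can average to zero and erase a channel's contribution, which the absolute value avoids.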
arXiv Detail & Related papers (2022-07-08T02:06:46Z)
- Geometrically Guided Integrated Gradients [0.3867363075280543]
We introduce an interpretability method called "geometrically guided integrated gradients".
Our method explores the model's dynamic behavior from multiple scaled versions of the input and captures the best possible attribution for each input.
We also propose a "model perturbation" sanity check to complement the traditionally used "model randomization" test.
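For context, plain Integrated Gradients (the base method being refined here) averages the gradient over scaled versions of the input along a straight path from a baseline; the sketch below passes an analytic `grad_f` for illustration and is not the geometrically guided variant:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline=None, steps=50):
    """Average grad_f along the path baseline -> x, scaled by (x - baseline)."""
    if baseline is None:
        baseline = np.zeros_like(x)
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint quadrature in [0, 1]
    path = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * path.mean(axis=0)

# For f(v) = sum(v^2) with a zero baseline, the attribution is exactly x^2
x = np.array([3.0, -1.0])
ig = integrated_gradients(lambda v: 2 * v, x)
```

A useful sanity check is the completeness axiom: the attributions sum to f(x) - f(baseline).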
arXiv Detail & Related papers (2022-06-13T05:05:43Z)
- Query-Efficient Black-box Adversarial Attacks Guided by a Transfer-based Prior [50.393092185611536]
We consider the black-box adversarial setting, where the adversary needs to craft adversarial examples without access to the gradients of a target model.
Previous methods attempted to approximate the true gradient either by using the transfer gradient of a surrogate white-box model or based on the feedback of model queries.
We propose two prior-guided random gradient-free (PRGF) algorithms based on biased sampling and gradient averaging.
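A simplified sketch of the prior-guided idea (not the exact PRGF algorithm): bias random search directions toward a surrogate gradient `prior` and average finite differences along them. The mixing weight `lam`, step `mu`, and query budget are illustrative assumptions:

```python
import numpy as np

def prior_guided_rgf(f, x, prior, n_queries=20, mu=1e-3, lam=0.5, seed=0):
    """Random gradient-free estimate with biased sampling and averaging."""
    rng = np.random.default_rng(seed)
    p = prior / (np.linalg.norm(prior) + 1e-12)
    fx = f(x)
    grad = np.zeros_like(x)
    for _ in range(n_queries):
        u = rng.standard_normal(x.shape)
        u /= np.linalg.norm(u)
        d = lam * p + (1 - lam) * u       # biased sampling toward the prior
        d /= np.linalg.norm(d)
        fd = (f(x + mu * d) - fx) / mu    # forward finite difference
        grad += fd * d                    # gradient averaging
    return grad / n_queries

# Sanity check: with a good prior, the estimate aligns with the true gradient
x = np.array([1.0, -2.0])
true_grad = 2 * x
est = prior_guided_rgf(lambda v: float(np.sum(v ** 2)), x, prior=true_grad)
```

In the real PRGF setting the prior comes from a surrogate white-box model rather than the true gradient; here the true gradient is used only to verify alignment.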
arXiv Detail & Related papers (2022-03-13T04:06:27Z)
- On Training Implicit Models [75.20173180996501]
We propose a novel gradient estimate for implicit models, named phantom gradient, that forgoes the costly computation of the exact gradient.
Experiments on large-scale tasks demonstrate that these lightweight phantom gradients significantly accelerate the backward passes in training implicit models by roughly 1.7 times.
arXiv Detail & Related papers (2021-11-09T14:40:24Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- Enhancing Deep Neural Network Saliency Visualizations with Gradual Extrapolation [0.0]
We propose an enhancement technique for Class Activation Mapping methods such as Grad-CAM or Excitation Backpropagation.
Our idea, called Gradual Extrapolation, can supplement any method that generates a heatmap by sharpening the output.
arXiv Detail & Related papers (2021-04-11T07:39:35Z)
- Rethinking Positive Aggregation and Propagation of Gradients in Gradient-based Saliency Methods [47.999621481852266]
Saliency methods interpret the prediction of a neural network by showing the importance of input elements for that prediction.
We empirically show that two approaches for handling the gradient information, namely positive aggregation and positive propagation, break these methods.
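A tiny numeric illustration of the claim, with made-up gradients: discarding negative gradients before aggregation erases a feature's negative evidence and changes which inputs look important:

```python
import numpy as np

# Toy gradient-times-input attribution with mixed-sign gradients
grads = np.array([0.8, -0.8, 0.1])
inputs = np.ones(3)

attr_raw = grads * inputs                  # keeps negative evidence
attr_pos = np.maximum(grads, 0) * inputs   # positive-only handling

# The second feature's strong negative contribution vanishes entirely
# under positive-only handling, so the two attributions disagree.
```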
arXiv Detail & Related papers (2020-12-01T09:38:54Z)
- Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution [70.78655569298923]
Integrated Gradients, as an attribution method for deep neural network models, is simple to implement.
However, it suffers from noisy explanations, which hampers interpretability.
The SmoothGrad technique is proposed to address this noisiness and smoothen the attribution maps of any gradient-based attribution method.
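For reference, the SmoothGrad idea mentioned above can be sketched in a few lines: average the gradient over noisy copies of the input. Here `grad_f` is an analytic gradient standing in for backpropagation, and the noise level and sample count are illustrative:

```python
import numpy as np

def smoothgrad(grad_f, x, sigma=0.1, n_samples=50, seed=0):
    """SmoothGrad: average gradients over Gaussian-perturbed inputs."""
    rng = np.random.default_rng(seed)
    grads = [grad_f(x + sigma * rng.standard_normal(x.shape))
             for _ in range(n_samples)]
    return np.mean(grads, axis=0)

# For a linear gradient field the noise averages out to the clean gradient
x = np.array([1.0, -2.0])
sg = smoothgrad(lambda v: 2 * v, x)
```

The same averaging wraps around any gradient-based attribution, which is why it composes with Integrated Gradients and its variants.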
arXiv Detail & Related papers (2020-04-22T10:43:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.