Sampling Matters in Explanations: Towards Trustworthy Attribution Analysis Building Block in Visual Models through Maximizing Explanation Certainty
- URL: http://arxiv.org/abs/2506.19442v2
- Date: Wed, 25 Jun 2025 11:18:04 GMT
- Title: Sampling Matters in Explanations: Towards Trustworthy Attribution Analysis Building Block in Visual Models through Maximizing Explanation Certainty
- Authors: Róisín Luo, James McDermott, Colm O'Riordan
- Abstract summary: Building trustworthy attribution analysis requires settling the sample-distribution misalignment problem. We present a semi-optimal sampling approach that suppresses features from inputs. Our approach is effective and yields more satisfactory explanations than state-of-the-art baselines.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image attribution analysis seeks to highlight the feature representations learned by visual models such that the highlighted feature maps reflect the pixel-wise importance of inputs. Gradient integration is a building block in attribution analysis: it integrates the gradients from multiple derived samples to highlight the semantic features relevant to inferences. This building block is often combined with other information from visual models, such as activation or attention maps, to form the ultimate explanations. Yet our theoretical analysis demonstrates that the extent to which the sample distribution in gradient integration aligns with the natural image distribution gives a lower bound on explanation certainty. Prior works add noise to images to derive samples, and these noise distributions can lead to low explanation certainty. Counter-intuitively, our experiment shows that extra information can saturate neural networks. Building trustworthy attribution analysis therefore requires settling the sample-distribution misalignment problem. Instead of adding extra information to input images, we present a semi-optimal sampling approach that suppresses features from the inputs. The sample distribution induced by suppressing features is approximately identical to the distribution of natural images. Our extensive quantitative evaluation on the large-scale ImageNet dataset confirms that our approach is effective and yields more satisfactory explanations than state-of-the-art baselines across all experimental models.
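The abstract describes the sampling idea only at a high level. Below is a minimal sketch of gradient integration with feature-suppression sampling, assuming PyTorch; the Gaussian-blur suppression operator, the blur schedule, and the stand-in classifier are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: gradient integration over feature-suppressed samples.
# Assumption (not from the paper's code): features are "suppressed" by
# progressively stronger Gaussian blur, and a tiny CNN stands in for the
# visual model under analysis.
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF

model = nn.Sequential(  # stand-in classifier; replace with the model under study
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10),
)
model.eval()

def attribution_by_suppression(x: torch.Tensor, target: int, steps: int = 8) -> torch.Tensor:
    """Average input gradients over samples with increasingly suppressed features."""
    grads = torch.zeros_like(x)
    for k in range(steps):
        # Suppress features instead of adding noise: heavier blur removes
        # more high-frequency detail while the sample stays image-like.
        sigma = 0.1 + 2.0 * k / max(steps - 1, 1)
        sample = TF.gaussian_blur(x, kernel_size=9, sigma=sigma).requires_grad_(True)
        score = model(sample)[0, target]
        score.backward()
        grads += sample.grad
    return (grads / steps).abs().sum(dim=1)  # aggregate channels into a saliency map

x = torch.rand(1, 3, 64, 64)          # placeholder "natural" image
saliency = attribution_by_suppression(x, target=3)
print(saliency.shape)                  # -> torch.Size([1, 64, 64])
```

The design point the sketch illustrates: each derived sample is an image with detail removed rather than an image with noise added, so the sample set stays close to the natural image distribution that, per the paper, lower-bounds explanation certainty.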
Related papers
- Elucidating the representation of images within an unconditional diffusion model denoiser [10.853652149844999]
Generative diffusion models learn probability densities over diverse image datasets by estimating the score with a neural network trained to remove noise. Here, we examine a UNet trained for denoising on the ImageNet dataset to better understand its internal representation and computation of the score. We show that the middle block of the UNet decomposes individual images into sparse subsets of active channels, and that the vector of spatial averages of these channels can provide a nonlinear representation of the underlying clean images.
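As a toy illustration of the channel-average readout described in this entry, the sketch below hooks a middle block and spatially averages its channels; the `ToyUNet` stand-in and the `mid_block` attribute name are assumptions, since real denoisers are far larger and name their blocks differently.

```python
# Toy sketch: read out the middle-block channel averages of a denoising UNet.
import torch
import torch.nn as nn

class ToyUNet(nn.Module):  # stand-in; a real denoiser would be far larger
    def __init__(self):
        super().__init__()
        self.down = nn.Conv2d(3, 16, 3, padding=1)
        self.mid_block = nn.Conv2d(16, 32, 3, padding=1)
        self.up = nn.Conv2d(32, 3, 3, padding=1)
    def forward(self, x):
        return self.up(torch.relu(self.mid_block(torch.relu(self.down(x)))))

unet = ToyUNet().eval()
feats = {}
handle = unet.mid_block.register_forward_hook(lambda m, i, o: feats.update(mid=o))
with torch.no_grad():
    unet(torch.rand(1, 3, 32, 32))     # one denoising forward pass
handle.remove()

# (B, C, H, W) -> (B, C): one scalar per channel; sparse, per the paper,
# and usable as a nonlinear representation of the underlying clean image.
channel_means = feats["mid"].mean(dim=(2, 3))
print(channel_means.shape)             # -> torch.Size([1, 32])
```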
arXiv Detail & Related papers (2025-06-02T17:33:34Z) - DiffusionPID: Interpreting Diffusion via Partial Information Decomposition [24.83767778658948]
We apply information-theoretic principles to decompose the input text prompt into its elementary components.
We analyze how individual tokens and their interactions shape the generated image.
We show that PID is a potent tool for evaluating and diagnosing text-to-image diffusion models.
arXiv Detail & Related papers (2024-06-07T18:17:17Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instilling task-specific information into the score function to steer sample generation toward desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
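For intuition, here is a minimal 1-D sketch of guidance on a two-component Gaussian mixture: the guided score adds a weighted gradient of the class log-likelihood to the mixture score. The parameters, guidance weight, and Langevin loop are illustrative, not taken from the paper.

```python
# 1-D illustration of guidance on a two-component Gaussian mixture:
# the guided score steers samples toward the chosen component.
import numpy as np

mu, sigma, pi = np.array([-2.0, 2.0]), 1.0, np.array([0.5, 0.5])

def responsibilities(x):
    logp = -(x - mu) ** 2 / (2 * sigma**2) + np.log(pi)
    logp -= logp.max()
    r = np.exp(logp)
    return r / r.sum()

def score(x):                      # d/dx log p(x) for the mixture
    r = responsibilities(x)
    return np.sum(r * (mu - x) / sigma**2)

def guided_score(x, k, w=2.0):     # score + w * d/dx log p(y=k | x)
    grad_loglik_k = (mu[k] - x) / sigma**2
    return score(x) + w * (grad_loglik_k - score(x))

# Langevin-style ascent toward component k=1 under guidance
# (noise scaled down for a cleaner illustration).
x, step = 0.0, 0.05
for _ in range(200):
    x += step * guided_score(x, k=1) + np.sqrt(2 * step) * 0.1 * np.random.randn()
print(round(x, 2))   # typically lands near mu[1] = 2.0
```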
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Intrinsic Image Diffusion for Indoor Single-view Material Estimation [55.276815106443976]
We present Intrinsic Image Diffusion, a generative model for appearance decomposition of indoor scenes.
Given a single input view, we sample multiple possible material explanations represented as albedo, roughness, and metallic maps.
Our method produces significantly sharper, more consistent, and more detailed materials, outperforming state-of-the-art methods by 1.5 dB in PSNR and by a 45% better FID score on albedo prediction.
arXiv Detail & Related papers (2023-12-19T15:56:19Z) - Rethinking interpretation: Input-agnostic saliency mapping of deep visual classifiers [28.28834523468462]
Saliency methods provide post-hoc model interpretation by attributing input features to the model outputs.
We show that input-specific saliency mapping is intrinsically susceptible to misleading feature attribution.
We introduce a new perspective of input-agnostic saliency mapping that computationally estimates the high-level features attributed by the model to its outputs.
arXiv Detail & Related papers (2023-03-31T06:58:45Z) - Person Image Synthesis via Denoising Diffusion Model [116.34633988927429]
We show how denoising diffusion models can be applied for high-fidelity person image synthesis.
Our results on two large-scale benchmarks and a user study demonstrate the photorealism of our proposed approach under challenging scenarios.
arXiv Detail & Related papers (2022-11-22T18:59:50Z) - Out of Sight, Out of Mind: A Source-View-Wise Feature Aggregation for Multi-View Image-Based Rendering [26.866141260616793]
We propose a source-view-wise feature aggregation method, which enables us to find the consensus across views in a robust way.
We validate the proposed method on various benchmark datasets, including synthetic and real image scenes.
arXiv Detail & Related papers (2022-06-10T07:06:05Z) - Causal Transportability for Visual Recognition [70.13627281087325]
We show that standard classifiers fail because the association between images and labels is not transportable across settings.
We then show that the causal effect, which severs all sources of confounding, remains invariant across domains.
This motivates us to develop an algorithm to estimate the causal effect for image classification.
arXiv Detail & Related papers (2022-04-26T15:02:11Z) - Out-of-distribution Generalization via Partial Feature Decorrelation [72.96261704851683]
We present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model.
The experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.
arXiv Detail & Related papers (2020-07-30T05:48:48Z) - Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution [70.78655569298923]
Integrated Gradients, as an attribution method for deep neural network models, is simple to implement.
However, it suffers from noisy explanations, which hinders interpretability.
The SmoothGrad technique is proposed to address this noisiness and smooth the attribution maps of any gradient-based attribution method.
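A compact sketch of Integrated Gradients combined with SmoothGrad-style averaging, as described in this entry; the tiny network, the black-image baseline, and the noise level are illustrative assumptions.

```python
# Integrated Gradients with SmoothGrad averaging over noisy copies.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Conv2d(3, 4, 3, padding=1), nn.Flatten(), nn.Linear(4 * 32 * 32, 5))
net.eval()

def integrated_gradients(x, target, steps=16):
    baseline = torch.zeros_like(x)                # black-image baseline
    total = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).requires_grad_(True)
        net(point)[0, target].backward()
        total += point.grad
    return (x - baseline) * total / steps         # Riemann approximation of the path integral

def smooth_ig(x, target, n=8, noise_std=0.15):
    # SmoothGrad: average attributions over noisy copies to suppress noise artifacts.
    maps = [integrated_gradients(x + noise_std * torch.randn_like(x), target) for _ in range(n)]
    return torch.stack(maps).mean(dim=0)

x = torch.rand(1, 3, 32, 32)
attr = smooth_ig(x, target=2)
print(attr.shape)    # -> torch.Size([1, 3, 32, 32])
```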
arXiv Detail & Related papers (2020-04-22T10:43:19Z) - DANCE: Enhancing saliency maps using decoys [35.46266461621123]
We propose a framework that improves the robustness of saliency methods by following a two-step procedure.
First, we introduce a perturbation mechanism that subtly varies the input sample without changing its intermediate representations.
Second, we compute saliency maps for perturbed samples and propose a new method to aggregate saliency maps.
arXiv Detail & Related papers (2020-02-03T01:21:48Z)
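A rough sketch of the two-step procedure described in the DANCE entry above, under simplifying assumptions: decoys are crafted by keeping an early representation close to the original, and the maps are aggregated by a simple mean rather than the paper's own aggregation rule.

```python
# Two-step sketch: (1) craft small "decoy" perturbations that barely move an
# intermediate representation, (2) aggregate saliency maps over the decoys.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
head = nn.Sequential(nn.Flatten(), nn.Linear(8 * 16 * 16, 5))
model = nn.Sequential(backbone, head).eval()

def make_decoy(x, lr=0.01, iters=20):
    delta = (0.05 * torch.randn_like(x)).requires_grad_(True)
    ref = backbone(x).detach()
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        # keep the intermediate representation close to the original
        loss = ((backbone(x + delta) - ref) ** 2).mean()
        loss.backward()
        opt.step()
    return (x + delta).detach()

def saliency(x, target):
    x = x.clone().requires_grad_(True)
    model(x)[0, target].backward()
    return x.grad.abs()

x = torch.rand(1, 3, 16, 16)
maps = torch.stack([saliency(make_decoy(x), target=1) for _ in range(6)])
robust_map = maps.mean(dim=0)        # aggregate over decoys (the paper proposes its own rule)
print(robust_map.shape)              # -> torch.Size([1, 3, 16, 16])
```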