Interpreting Global Perturbation Robustness of Image Models using Axiomatic Spectral Importance Decomposition
- URL: http://arxiv.org/abs/2408.01139v3
- Date: Mon, 23 Jun 2025 13:00:34 GMT
- Title: Interpreting Global Perturbation Robustness of Image Models using Axiomatic Spectral Importance Decomposition
- Authors: Róisín Luo, James McDermott, Colm O'Riordan
- Abstract summary: Perturbation robustness evaluates the vulnerabilities of models, arising from a variety of perturbations, such as data corruptions and adversarial attacks. We present a model-agnostic, global mechanistic interpretability method to interpret the perturbation robustness of image models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Perturbation robustness evaluates the vulnerabilities of models, arising from a variety of perturbations, such as data corruptions and adversarial attacks. Understanding the mechanisms of perturbation robustness is critical for global interpretability. We present a model-agnostic, global mechanistic interpretability method to interpret the perturbation robustness of image models. This research is motivated by two key aspects. First, previous global interpretability works, in tandem with robustness benchmarks, e.g. mean corruption error (mCE), are not designed to directly interpret the mechanisms of perturbation robustness within image models. Second, we notice that the spectral signal-to-noise ratios (SNR) of perturbed natural images exponentially decay over frequency. This power-law-like decay implies that low-frequency signals are generally more robust than high-frequency signals -- yet high classification accuracy cannot be achieved by low-frequency signals alone. By applying Shapley value theory, our method axiomatically quantifies the predictive powers of robust features and non-robust features within an information theory framework. Our method, dubbed \textbf{I-ASIDE} (\textbf{I}mage \textbf{A}xiomatic \textbf{S}pectral \textbf{I}mportance \textbf{D}ecomposition \textbf{E}xplanation), provides a unique insight into model robustness mechanisms. We conduct extensive experiments over a variety of vision models pre-trained on ImageNet to show that \textbf{I-ASIDE} can not only \textbf{measure} the perturbation robustness but also \textbf{provide interpretations} of its mechanisms.
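The two ingredients the abstract appeals to -- a spectral signal-to-noise ratio that decays with radial frequency, and a decomposition of an image into frequency bands whose predictive contributions can then be attributed (in I-ASIDE, via Shapley values) -- can be sketched with plain FFT machinery. The sketch below is a minimal illustration under those assumptions; the band count, binning, and synthetic image are illustrative choices, not the authors' I-ASIDE implementation.

```python
# Minimal sketch: (1) radially averaged spectral SNR of a perturbed image,
# (2) splitting an image into additive radial frequency bands -- the kind of
# feature partition a Shapley-value analysis can attribute importance to.
# Band count and binning are illustrative assumptions, not the paper's code.
import numpy as np

def radial_power_spectrum(img, n_bins=32):
    """Radially averaged power spectrum of a 2D grayscale image."""
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2)
    edges = np.linspace(0, r.max() + 1e-9, n_bins + 1)
    idx = np.digitize(r.ravel(), edges) - 1
    p = power.ravel()
    return np.array([p[idx == i].mean() if np.any(idx == i) else 0.0 for i in range(n_bins)])

def spectral_snr(clean, noise, n_bins=32):
    """Per-band SNR: signal power divided by noise power in each radial frequency bin."""
    return radial_power_spectrum(clean, n_bins) / (radial_power_spectrum(noise, n_bins) + 1e-12)

def split_into_bands(img, n_bands=8):
    """Decompose an image into additive radial frequency bands via FFT masks."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2)
    edges = np.linspace(0, r.max() + 1e-9, n_bands + 1)
    return [np.real(np.fft.ifft2(np.fft.ifftshift(f * ((r >= lo) & (r < hi)))))
            for lo, hi in zip(edges[:-1], edges[1:])]   # bands sum back to ~the original image

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal((64, 64)).cumsum(0).cumsum(1)  # smooth, low-frequency-heavy "image"
    noise = 0.5 * rng.standard_normal((64, 64))                # white (flat-spectrum) corruption
    snr = spectral_snr(clean, noise)
    print(snr[:4], "...", snr[-4:])   # SNR falls off sharply toward high frequencies
```

On such a smooth synthetic image corrupted by white noise, the per-band SNR drops by orders of magnitude from the lowest to the highest bins, mirroring the power-law-like decay the paper reports for perturbed natural images.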
Related papers
- Transformers Learn Faster with Semantic Focus [57.97235825738412]
We study sparse transformers in terms of learnability and generalization. We find that input-dependent sparse attention models appear to converge faster and generalize better than standard attention models.
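One common way to make attention sparsity depend on the input is to keep only the top-k attention scores per query and mask the rest. The sketch below illustrates that mechanism under this assumption; it is not necessarily the exact construction analysed in the paper.

```python
# Minimal sketch of input-dependent sparse attention via per-query top-k masking.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, k_keep=8):
    """q, k, v: (batch, heads, seq, dim). Attend only to the k_keep highest-scoring keys per query."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5       # (B, H, S, S)
    topk = scores.topk(min(k_keep, scores.shape[-1]), dim=-1)
    masked = torch.full_like(scores, float("-inf"))
    masked.scatter_(-1, topk.indices, topk.values)              # keep top-k scores, drop the rest
    return F.softmax(masked, dim=-1) @ v

if __name__ == "__main__":
    B, H, S, D = 2, 4, 16, 32
    q, k, v = (torch.randn(B, H, S, D) for _ in range(3))
    print(topk_sparse_attention(q, k, v, k_keep=4).shape)       # torch.Size([2, 4, 16, 32])
```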
arXiv Detail & Related papers (2025-06-17T01:19:28Z) - Adversarial Transferability in Deep Denoising Models: Theoretical Insights and Robustness Enhancement via Out-of-Distribution Typical Set Sampling [6.189440665620872]
Deep learning-based image denoising models demonstrate remarkable performance, but their lack of robustness analysis remains a significant concern. A major issue is that these models are susceptible to adversarial attacks, where small, carefully crafted perturbations to input data can cause them to fail. We propose a novel adversarial defense method: the Out-of-Distribution Typical Set Sampling Training strategy.
arXiv Detail & Related papers (2024-12-08T13:47:57Z) - Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics [11.919291977879801]
Inverse problems describe the process of estimating the causal factors from a set of measurements or data.
Diffusion models have shown promise as potent generative tools for solving inverse problems.
arXiv Detail & Related papers (2024-06-19T15:55:12Z) - Extreme Miscalibration and the Illusion of Adversarial Robustness [66.29268991629085]
Adversarial Training is often used to increase model robustness.
We show that this observed gain in robustness is an illusion of robustness (IOR).
We urge the NLP community to incorporate test-time temperature scaling into their robustness evaluations.
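Test-time temperature scaling itself is a standard recipe: fit a single temperature on held-out logits by minimising negative log-likelihood, then divide logits by it before evaluating robustness. A minimal sketch with synthetic, deliberately overconfident logits:

```python
# Minimal sketch of temperature scaling; the synthetic logits are illustrative only.
import torch
import torch.nn.functional as F

def fit_temperature(logits, labels, steps=200):
    """logits: (N, C) held-out logits; labels: (N,) class indices. Returns the fitted T."""
    log_t = torch.zeros(1, requires_grad=True)           # optimise log T so that T stays positive
    opt = torch.optim.Adam([log_t], lr=0.05)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(logits / log_t.exp(), labels).backward()
        opt.step()
    return float(log_t.exp())

if __name__ == "__main__":
    torch.manual_seed(0)
    labels = torch.randint(0, 10, (512,))
    # Predictions are extremely confident but only ~70% correct -> miscalibrated.
    predicted = torch.where(torch.rand(512) < 0.7, labels, torch.randint(0, 10, (512,)))
    logits = torch.randn(512, 10)
    logits[torch.arange(512), predicted] += 10.0
    print(f"fitted temperature: {fit_temperature(logits, labels):.2f}")   # T > 1 flags overconfidence
```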
arXiv Detail & Related papers (2024-02-27T13:49:12Z) - RobustMQ: Benchmarking Robustness of Quantized Models [54.15661421492865]
Quantization is an essential technique for deploying deep neural networks (DNNs) on devices with limited resources.
We thoroughly evaluated the robustness of quantized models against various noises (adversarial attacks, natural corruptions, and systematic noises) on ImageNet.
Our research contributes to advancing the robust quantization of models and their deployment in real-world scenarios.
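A corruption-robustness check of this kind can be sketched by evaluating a float model and a quantized copy under Gaussian noise of increasing severity. The tiny untrained MLP and random data below are stand-ins to keep the sketch self-contained; they are not the benchmark's ImageNet models or corruption suite, and an untrained model only exercises the harness.

```python
# Minimal sketch of comparing float vs. quantized accuracy under additive Gaussian noise.
import torch
import torch.nn as nn

def accuracy_under_noise(model, x, y, sigmas=(0.0, 0.1, 0.2, 0.4)):
    model.eval()
    with torch.no_grad():
        return {s: (model(x + s * torch.randn_like(x)).argmax(dim=1) == y).float().mean().item()
                for s in sigmas}

if __name__ == "__main__":
    torch.manual_seed(0)
    x, y = torch.randn(1024, 64), torch.randint(0, 10, (1024,))
    float_model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
    # Dynamic quantization replaces Linear weights with int8.
    quant_model = torch.quantization.quantize_dynamic(float_model, {nn.Linear}, dtype=torch.qint8)
    print("float:    ", accuracy_under_noise(float_model, x, y))
    print("quantized:", accuracy_under_noise(quant_model, x, y))
```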
arXiv Detail & Related papers (2023-08-04T14:37:12Z) - Semantic Image Attack for Visual Model Diagnosis [80.36063332820568]
In practice, metric analysis on a specific train and test dataset does not guarantee reliable or fair ML models.
This paper proposes Semantic Image Attack (SIA), a method based on the adversarial attack that provides semantic adversarial images.
arXiv Detail & Related papers (2023-03-23T03:13:04Z) - Beyond the Universal Law of Robustness: Sharper Laws for Random Features
and Neural Tangent Kernels [14.186776881154127]
This paper focuses on empirical risk minimization in two settings, namely, random features and the neural tangent kernel (NTK).
We prove that, for random features, the model is not robust for any degree of over-parameterization, even when the necessary condition coming from the universal law of robustness is satisfied.
Our results are corroborated by numerical evidence on both synthetic and standard prototypical datasets.
arXiv Detail & Related papers (2023-02-03T09:58:31Z) - Robustness via Uncertainty-aware Cycle Consistency [44.34422859532988]
Unpaired image-to-image translation refers to learning inter-image-domain mapping without corresponding image pairs.
Existing methods learn deterministic mappings without explicitly modelling the robustness to outliers or predictive uncertainty.
We propose a novel probabilistic method based on Uncertainty-aware Generalized Adaptive Cycle Consistency (UGAC).
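One way to model predictive uncertainty in such a reconstruction term is a per-pixel generalized Gaussian negative log-likelihood, with the network predicting a scale and a shape map alongside the output. The sketch below illustrates that idea; the exact parameterisation UGAC uses may differ.

```python
# Sketch of an uncertainty-aware reconstruction loss: per-pixel generalized Gaussian NLL.
import torch

def generalized_gaussian_nll(x, mu, alpha, beta, eps=1e-6):
    """NLL of x under GGD(mu, alpha, beta); alpha (scale) and beta (shape) > 0, same shape as x."""
    alpha = alpha.clamp_min(eps)
    beta = beta.clamp_min(eps)
    nll = (x - mu).abs().div(alpha).pow(beta) - torch.log(beta) \
          + torch.log(2 * alpha) + torch.lgamma(1.0 / beta)
    return nll.mean()

if __name__ == "__main__":
    x = torch.rand(2, 3, 32, 32)                # target image
    mu = x + 0.1 * torch.randn_like(x)          # predicted reconstruction
    alpha = torch.full_like(x, 0.1)             # predicted per-pixel scale (uncertainty)
    beta = torch.full_like(x, 1.0)              # beta = 1 recovers a Laplace-style (L1-like) penalty
    print(generalized_gaussian_nll(x, mu, alpha, beta).item())
```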
arXiv Detail & Related papers (2021-10-24T15:33:21Z) - Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z) - Uncertainty-aware Generalized Adaptive CycleGAN [44.34422859532988]
Unpaired image-to-image translation refers to learning inter-image-domain mapping in an unsupervised manner.
Existing methods often learn deterministic mappings without explicitly modelling the robustness to outliers or predictive uncertainty.
We propose a novel probabilistic method called Uncertainty-aware Generalized Adaptive Cycle Consistency (UGAC).
arXiv Detail & Related papers (2021-02-23T15:22:35Z) - Understanding Double Descent Requires a Fine-Grained Bias-Variance
Decomposition [34.235007566913396]
We describe an interpretable, symmetric decomposition of the variance into terms associated with the labels.
We find that the bias decreases monotonically with the network width, but the variance terms exhibit non-monotonic behavior.
We also analyze the strikingly rich phenomenology that arises.
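For reference, the coarse Monte Carlo bias-variance decomposition over model width looks like the sketch below; the paper's contribution is the finer, symmetric split of the variance term by source of randomness, which this sketch does not reproduce.

```python
# Minimal sketch: estimate bias^2 and variance across retrainings for several widths.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_test = np.linspace(-1, 1, 50).reshape(-1, 1)
y_test = np.sin(3 * x_test).ravel()                     # noise-free target on the test grid

for width in (2, 8, 64):
    preds = []
    for seed in range(20):                              # resample data and re-initialise each trial
        x = rng.uniform(-1, 1, (100, 1))
        y = np.sin(3 * x).ravel() + 0.1 * rng.standard_normal(100)
        model = MLPRegressor(hidden_layer_sizes=(width,), max_iter=2000, random_state=seed).fit(x, y)
        preds.append(model.predict(x_test))
    preds = np.stack(preds)                             # (trials, test points)
    bias2 = ((preds.mean(axis=0) - y_test) ** 2).mean()
    variance = preds.var(axis=0).mean()
    print(f"width={width:3d}  bias^2={bias2:.4f}  variance={variance:.4f}")
```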
arXiv Detail & Related papers (2020-11-04T21:04:02Z) - Understanding Neural Abstractive Summarization Models via Uncertainty [54.37665950633147]
Seq2seq abstractive summarization models generate text in a free-form manner.
We study the entropy, or uncertainty, of the model's token-level predictions.
We show that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly.
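Token-level entropy is straightforward to compute from any decoder's per-step logits; the random logits below merely stand in for real model outputs.

```python
# Minimal sketch: Shannon entropy of a model's token-level predictive distributions.
import torch
import torch.nn.functional as F

def token_entropy(logits):
    """logits: (seq_len, vocab_size) -> per-token entropy in nats, shape (seq_len,)."""
    log_p = F.log_softmax(logits, dim=-1)
    return -(log_p.exp() * log_p).sum(dim=-1)

if __name__ == "__main__":
    torch.manual_seed(0)
    logits = torch.randn(12, 32000)        # stand-in for per-step decoder logits
    print(token_entropy(logits))           # high entropy marks uncertain generation steps
```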
arXiv Detail & Related papers (2020-10-15T16:57:27Z) - Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the precise role of its stochasticity in that success is still unclear.
We show that multiplicative noise commonly arises in the parameters of discrete stochastic optimizers due to variance, and that it leads to heavy-tailed behaviour.
A detailed analysis is conducted describing the key factors involved, including step size and data, with similar results exhibited on state-of-the-art neural network models.
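The mechanism can be illustrated by simulating a linear recurrence with multiplicative noise (a Kesten-type recursion); the constants below are illustrative, not taken from the paper.

```python
# Minimal sketch: multiplicative noise in x_{t+1} = a_t * x_t + b_t yields heavy tails,
# while purely additive noise stays Gaussian-like.
import numpy as np

rng = np.random.default_rng(0)
n_chains, n_steps = 10_000, 2_000

def simulate(mult_std):
    x = np.zeros(n_chains)
    for _ in range(n_steps):
        a = 0.9 + mult_std * rng.standard_normal(n_chains)   # multiplicative (step-size/curvature-like) noise
        b = 0.1 * rng.standard_normal(n_chains)              # additive (gradient-like) noise
        x = a * x + b
    return x

def excess_kurtosis(x):
    return ((x - x.mean()) ** 4).mean() / x.var() ** 2 - 3.0  # ~0 for Gaussian, large for heavy tails

print(f"additive noise only:       excess kurtosis = {excess_kurtosis(simulate(0.0)):.1f}")
print(f"with multiplicative noise: excess kurtosis = {excess_kurtosis(simulate(0.5)):.1f}")
```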
arXiv Detail & Related papers (2020-06-11T09:58:01Z) - Understanding the Intrinsic Robustness of Image Distributions using
Conditional Generative Models [87.00072607024026]
We study the intrinsic robustness of two common image benchmarks under $\ell_2$ perturbations.
We show the existence of a large gap between the robustness limits implied by our theory and the adversarial robustness achieved by current state-of-the-art robust models.
arXiv Detail & Related papers (2020-03-01T01:45:04Z)