Adversarial Transferability in Deep Denoising Models: Theoretical Insights and Robustness Enhancement via Out-of-Distribution Typical Set Sampling
- URL: http://arxiv.org/abs/2412.05943v1
- Date: Sun, 08 Dec 2024 13:47:57 GMT
- Title: Adversarial Transferability in Deep Denoising Models: Theoretical Insights and Robustness Enhancement via Out-of-Distribution Typical Set Sampling
- Authors: Jie Ning, Jiebao Sun, Shengzhu Shi, Zhichang Guo, Yao Li, Hongwei Li, Boying Wu
- Abstract summary: Deep learning-based image denoising models demonstrate remarkable performance, but their lack of robustness analysis remains a significant concern. A major issue is that these models are susceptible to adversarial attacks, where small, carefully crafted perturbations to input data can cause them to fail. We propose a novel adversarial defense method: the Out-of-Distribution Typical Set Sampling Training strategy.
- Score: 6.189440665620872
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning-based image denoising models demonstrate remarkable performance, but their lack of robustness analysis remains a significant concern. A major issue is that these models are susceptible to adversarial attacks, where small, carefully crafted perturbations to input data can cause them to fail. Surprisingly, perturbations specifically crafted for one model can easily transfer across various models, including CNNs, Transformers, unfolding models, and plug-and-play models, leading to failures in those models as well. Such high adversarial transferability is not observed in classification models. We analyze the possible underlying reasons behind the high adversarial transferability through a series of hypotheses and validation experiments. By characterizing the manifolds of Gaussian noise and adversarial perturbations using the concept of typical set and the asymptotic equipartition property, we prove that adversarial samples deviate slightly from the typical set of the original input distribution, causing the models to fail. Based on these insights, we propose a novel adversarial defense method: the Out-of-Distribution Typical Set Sampling Training strategy (TS). TS not only significantly enhances the model's robustness but also marginally improves denoising performance compared to the original model.
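A concrete way to read the typical-set argument in the abstract: by the asymptotic equipartition property, i.i.d. Gaussian noise is (weakly) typical when its empirical negative log-density rate stays within a small epsilon of the differential entropy, which for N(0, sigma^2) reduces to the squared noise norm concentrating near n*sigma^2. The sketch below is a minimal illustration of that check, not the authors' released code, plus one hypothetical reading of an out-of-distribution typical-set sampler: rescale Gaussian noise so its norm sits just outside the typical shell. The names `in_typical_set` and `sample_ts_noise` and the margin `delta` are assumptions made for illustration; the paper's exact TS training procedure is not specified in the abstract.

```python
# Hedged sketch (illustrative assumptions, not the paper's code): the typical-set
# view of Gaussian noise and a hypothetical "out-of-distribution typical set" sampler.
import numpy as np

def neg_log_density_rate(x, sigma):
    """Empirical -log p(x)/n for i.i.d. N(0, sigma^2) noise, in nats per dimension."""
    n = x.size
    return 0.5 * np.log(2 * np.pi * sigma**2) + (x**2).sum() / (2 * n * sigma**2)

def in_typical_set(x, sigma, eps=0.02):
    """AEP check: x is weakly typical iff -log p(x)/n is within eps of the
    differential entropy rate h = 0.5*log(2*pi*e*sigma^2)."""
    h = 0.5 * np.log(2 * np.pi * np.e * sigma**2)
    return abs(neg_log_density_rate(x, sigma) - h) <= eps

def sample_ts_noise(shape, sigma, delta=0.05, rng=None):
    """Hypothetical TS-style sampler: draw Gaussian noise, then rescale its norm
    to sit just OUTSIDE the typical shell ||x|| ~ sqrt(n)*sigma by a factor 1 +/- delta."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(shape)
    target = np.sqrt(z.size) * sigma * (1.0 + rng.choice([-delta, delta]))
    return z * (target / np.linalg.norm(z))

if __name__ == "__main__":
    sigma, shape = 25.0 / 255.0, (3, 64, 64)
    gauss = sigma * np.random.default_rng(0).standard_normal(shape)
    ood = sample_ts_noise(shape, sigma, delta=0.05)
    print("Gaussian noise typical?", in_typical_set(gauss, sigma))  # typically True
    print("Rescaled noise typical?", in_typical_set(ood, sigma))    # False: just off the shell
```

In a TS-style training loop, such slightly off-shell noise would presumably be mixed into the usual Gaussian noise augmentation, which is one plausible way to make a denoiser insensitive to small deviations from the typical set of its input distribution.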
Related papers
- Towards Model Resistant to Transferable Adversarial Examples via Trigger Activation [95.3977252782181]
Adversarial examples, characterized by imperceptible perturbations, pose significant threats to deep neural networks by misleading their predictions.
We introduce a novel training paradigm aimed at enhancing robustness against transferable adversarial examples (TAEs) in a more efficient and effective way.
arXiv Detail & Related papers (2025-04-20T09:07:10Z)
- Spatial Reasoning with Denoising Models [49.83744014336816]
We introduce a framework to perform reasoning over sets of continuous variables via denoising generative models.
We demonstrate, for the first time, that the order of generation can be successfully predicted by the denoising network itself.
arXiv Detail & Related papers (2025-02-28T14:08:30Z)
- Diffusing DeBias: Synthetic Bias Amplification for Model Debiasing [15.214861534330236]
We introduce Diffusing DeBias (DDB) as a plug-in for common methods of unsupervised model debiasing.
Specifically, our approach adopts conditional diffusion models to generate synthetic bias-aligned images.
By tackling the fundamental issue of bias-conflicting training samples in learning auxiliary models, our proposed method beats the current state of the art on multiple benchmark datasets.
arXiv Detail & Related papers (2025-02-13T18:17:03Z)
- A Robust Adversarial Ensemble with Causal (Feature Interaction) Interpretations for Image Classification [9.945272787814941]
We present a deep ensemble model that combines discriminative features with generative models to achieve both high accuracy and adversarial robustness.
Our approach integrates a bottom-level pre-trained discriminative network for feature extraction with a top-level generative classification network that models adversarial input distributions.
arXiv Detail & Related papers (2024-12-28T05:06:20Z)
- Unmasking Bias in Diffusion Model Training [40.90066994983719]
Denoising diffusion models have emerged as a dominant approach for image generation.
They still suffer from slow convergence in training and color shift issues in sampling.
In this paper, we identify that these obstacles can be largely attributed to bias and suboptimality inherent in the default training paradigm.
arXiv Detail & Related papers (2023-10-12T16:04:41Z)
- Rethinking Model Ensemble in Transfer-based Adversarial Attacks [46.82830479910875]
An effective strategy for improving transferability is to attack an ensemble of models (a minimal sketch of this ensemble-averaging baseline appears after this list).
Previous works simply average the outputs of different models.
We propose a Common Weakness Attack (CWA) to generate more transferable adversarial examples.
arXiv Detail & Related papers (2023-03-16T06:37:16Z)
- Removing Structured Noise with Diffusion Models [14.187153638386379]
We show that the powerful paradigm of posterior sampling with diffusion models can be extended to include rich, structured noise models.
We demonstrate strong performance gains across various inverse problems with structured noise, outperforming competitive baselines.
This opens up new opportunities and relevant practical applications of diffusion modeling for inverse problems in the context of non-Gaussian measurement models.
arXiv Detail & Related papers (2023-01-20T23:42:25Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate that there is no single model that works best in all cases.
By choosing an appropriate bias model, we can obtain better robustness than baselines with a more sophisticated model design.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
- Consistent Counterfactuals for Deep Models [25.1271020453651]
Counterfactual examples are used to explain predictions of machine learning models in key areas such as finance and medical diagnosis.
This paper studies the consistency of model prediction on counterfactual examples in deep networks under small changes to initial training conditions.
arXiv Detail & Related papers (2021-10-06T23:48:55Z)
- Harnessing Perceptual Adversarial Patches for Crowd Counting [92.79051296850405]
Crowd counting is vulnerable to adversarial examples in the physical world.
This paper proposes the Perceptual Adversarial Patch (PAP) generation framework to learn the shared perceptual features between models.
arXiv Detail & Related papers (2021-09-16T13:51:39Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model against a previously proposed model based on an ensemble of simpler neural networks that detect firearms via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Evaluating Neural Machine Comprehension Model Robustness to Noisy Inputs and Adversarial Attacks [9.36331571226256]
We evaluate machine comprehension models' robustness to noise and adversarial attacks by performing novel perturbations at the character, word, and sentence level.
We develop a model to predict model errors during adversarial attacks.
arXiv Detail & Related papers (2020-05-01T03:05:43Z)
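Sketch referenced from the "Rethinking Model Ensemble in Transfer-based Adversarial Attacks" entry above: the plain loss-averaging ensemble baseline that the CWA paper improves upon, adapted here to denoisers. Everything below (the `ensemble_pgd_attack` name, the toy convolutional stand-ins, the epsilon and step settings) is an illustrative assumption under PyTorch, not code from any of the listed papers.

```python
# Hedged sketch: PGD against the AVERAGE loss of several denoisers, the simple
# output/loss-averaging baseline (not the CWA method itself).
import torch

def ensemble_pgd_attack(models, noisy, clean, eps=4/255, alpha=1/255, steps=10):
    """Craft one L_inf-bounded perturbation that degrades the mean MSE of the ensemble."""
    adv = noisy.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        # Average the denoising loss across the ensemble (simple loss averaging).
        loss = torch.stack(
            [torch.nn.functional.mse_loss(m(adv), clean) for m in models]
        ).mean()
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()               # ascend: make denoising worse
            adv = noisy + (adv - noisy).clamp(-eps, eps)  # project into the L_inf ball
            adv = adv.clamp(0.0, 1.0).detach()
    return adv

if __name__ == "__main__":
    # Toy stand-ins for denoisers; real CNN/Transformer/unfolding models would go here.
    models = [torch.nn.Sequential(torch.nn.Conv2d(3, 3, 3, padding=1)).eval() for _ in range(3)]
    clean = torch.rand(1, 3, 32, 32)
    noisy = (clean + 25/255 * torch.randn_like(clean)).clamp(0, 1)
    adv = ensemble_pgd_attack(models, noisy, clean)
    print("max perturbation:", (adv - noisy).abs().max().item())
```

The only point of the design is that the gradient is taken through the averaged loss of the whole ensemble, so the resulting perturbation is not tailored to any single model and is more likely to transfer.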
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.