DDAD: A Two-pronged Adversarial Defense Based on Distributional Discrepancy
- URL: http://arxiv.org/abs/2503.02169v1
- Date: Tue, 04 Mar 2025 01:16:21 GMT
- Title: DDAD: A Two-pronged Adversarial Defense Based on Distributional Discrepancy
- Authors: Jiacheng Zhang, Benjamin I. P. Rubinstein, Jingfeng Zhang, Feng Liu
- Abstract summary: Statistical adversarial data detection (SADD) detects whether an upcoming batch contains adversarial examples (AEs). In this paper, we show that minimizing distributional discrepancy can help reduce the expected loss on AEs. We propose a two-pronged adversarial defense method, named Distributional-Discrepancy-based Adversarial Defense (DDAD).
- Score: 30.502354813427523
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Statistical adversarial data detection (SADD) detects whether an upcoming batch contains adversarial examples (AEs) by measuring the distributional discrepancies between clean examples (CEs) and AEs. In this paper, we reveal the potential strength of SADD-based methods by theoretically showing that minimizing distributional discrepancy can help reduce the expected loss on AEs. Nevertheless, despite these advantages, SADD-based methods have a potential limitation: they discard inputs that are detected as AEs, leading to the loss of clean information within those inputs. To address this limitation, we propose a two-pronged adversarial defense method, named Distributional-Discrepancy-based Adversarial Defense (DDAD). In the training phase, DDAD first optimizes the test power of the maximum mean discrepancy (MMD) to derive MMD-OPT, and then trains a denoiser by minimizing the MMD-OPT between CEs and AEs. In the inference phase, DDAD first leverages MMD-OPT to differentiate CEs and AEs, and then applies a two-pronged process: (1) directly feeding the detected CEs into the classifier, and (2) removing noise from the detected AEs by the distributional-discrepancy-based denoiser. Extensive experiments show that DDAD outperforms current state-of-the-art (SOTA) defense methods by notably improving clean and robust accuracy on CIFAR-10 and ImageNet-1K against adaptive white-box attacks.
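As a rough illustration of the two-pronged inference described in the abstract, the sketch below routes a batch through a kernel two-sample test and denoises only batches flagged as adversarial. The `rbf_mmd2` estimator, the `ddad_inference` helper, and the fixed `threshold` are hypothetical simplifications: the paper uses an optimized MMD-OPT statistic and a trained distributional-discrepancy-based denoiser, not a plain RBF-kernel MMD.

```python
import numpy as np

def rbf_mmd2(x, y, sigma=1.0):
    """Biased estimate of squared MMD between batches x and y (RBF kernel)."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def ddad_inference(batch, clean_ref, classifier, denoiser, threshold):
    """Two-pronged routing: classify detected-clean batches directly,
    denoise detected-adversarial batches before classification."""
    if rbf_mmd2(batch, clean_ref) > threshold:  # batch flagged as adversarial
        batch = denoiser(batch)
    return classifier(batch)
```

A batch whose discrepancy from the clean reference stays below the threshold is fed straight to the classifier, so clean information is never discarded, which is the limitation of detect-and-reject SADD methods that DDAD addresses.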
Related papers
- Improved Diffusion-based Generative Model with Better Adversarial Robustness [65.38540020916432]
Diffusion Probabilistic Models (DPMs) have achieved significant success in generative tasks. During the denoising process, the input data distributions differ between the training and inference stages.
arXiv Detail & Related papers (2025-02-24T12:29:16Z) - Beyond Perceptual Distances: Rethinking Disparity Assessment for Out-of-Distribution Detection with Diffusion Models [28.96695036746856]
Out-of-Distribution (OoD) detection aims to determine whether a given sample is from the training distribution of the classifier-under-protection.
DM-based methods bring fresh insights to the field, yet remain under-explored.
Our work has demonstrated state-of-the-art detection performances among DM-based methods in extensive experiments.
arXiv Detail & Related papers (2024-09-16T08:50:47Z) - Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders [101.42201747763178]
Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled.
Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method.
arXiv Detail & Related papers (2024-05-02T16:49:25Z) - Contrastive Bi-Projector for Unsupervised Domain Adaption [0.0]
This paper proposes a novel unsupervised domain adaptation (UDA) method based on a contrastive bi-projector (CBP).
The method, called CBPUDA here, effectively encourages the feature extractors (FEs) to reduce the generation of ambiguous features for classification and domain adaptation.
Experimental results show that CBPUDA outperforms the conventional UDA methods considered in this paper on both UDA and fine-grained UDA tasks.
arXiv Detail & Related papers (2023-08-14T09:06:21Z) - Hard Adversarial Example Mining for Improving Robust Fairness [18.02943802341582]
Adversarial training (AT) is widely considered the state-of-the-art technique for improving the robustness of deep neural networks (DNNs) against adversarial examples (AEs).
Recent studies have revealed that adversarially trained models are prone to unfairness problems, restricting their applicability.
To alleviate this problem, we propose HAM, a straightforward yet effective framework based on adaptive Hard Adversarial example Mining.
arXiv Detail & Related papers (2023-08-03T15:33:24Z) - Continual Detection Transformer for Incremental Object Detection [154.8345288298059]
Incremental object detection (IOD) aims to train an object detector in phases, each with annotations for new object categories.
As in other incremental settings, IOD is subject to catastrophic forgetting, which is often addressed by techniques such as knowledge distillation (KD) and exemplar replay (ER).
We propose a new method for transformer-based IOD which enables effective usage of KD and ER in this context.
arXiv Detail & Related papers (2023-04-06T14:38:40Z) - MSS-PAE: Saving Autoencoder-based Outlier Detection from Unexpected Reconstruction [25.60381244912307]
AutoEncoders (AEs) are commonly used for machine learning tasks due to their intrinsic learning ability.
AE-based methods face the issue of overconfident decisions and unexpected reconstruction results of outliers, limiting their performance in Outlier Detection (OD).
The proposed methods have the potential to advance AE's development in OD.
arXiv Detail & Related papers (2023-04-03T04:01:29Z) - ADDMU: Detection of Far-Boundary Adversarial Examples with Data and
Model Uncertainty Estimation [125.52743832477404]
Adversarial Examples Detection (AED) is a crucial defense technique against adversarial attacks.
We propose a new technique, ADDMU, which combines two types of uncertainty estimation for both regular and FB adversarial example detection.
Our new method outperforms previous methods by 3.6 and 6.0 AUC points under each scenario.
arXiv Detail & Related papers (2022-10-22T09:11:12Z) - ADPS: Asymmetric Distillation Post-Segmentation for Image Anomaly
Detection [75.68023968735523]
Knowledge Distillation-based Anomaly Detection (KDAD) methods rely on the teacher-student paradigm to detect and segment anomalous regions.
We propose an innovative approach called Asymmetric Distillation Post-Segmentation (ADPS)
Our ADPS employs an asymmetric distillation paradigm that takes distinct forms of the same image as the input of the teacher-student networks.
We show that ADPS significantly improves the Average Precision (AP) metric by 9% and 20% on the MVTec AD and KolektorSDD2 datasets, respectively.
arXiv Detail & Related papers (2022-10-19T12:04:47Z) - Be Your Own Neighborhood: Detecting Adversarial Example by the
Neighborhood Relations Built on Self-Supervised Learning [64.78972193105443]
This paper presents a novel AE detection framework for trustworthy predictions.
It performs detection by distinguishing an AE's abnormal relation with its augmented versions.
An off-the-shelf Self-Supervised Learning (SSL) model is used to extract the representation and predict the label.
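The neighborhood-relation idea above can be sketched minimally: embed the input and several augmented copies, then flag the input when its representation drifts away from its neighbors. The `neighborhood_detect` helper, the cosine-similarity criterion, and the fixed `sim_threshold` below are illustrative assumptions, not the paper's exact detector.

```python
import numpy as np

def neighborhood_detect(x, augment, embed, n_aug=8, sim_threshold=0.9):
    """Flag x as adversarial when its embedding disagrees with those of its
    augmented neighbors (a sketch of the SSL neighborhood-relation idea)."""
    z = embed(x)
    sims = []
    for _ in range(n_aug):
        za = embed(augment(x))
        cos = z @ za / (np.linalg.norm(z) * np.linalg.norm(za) + 1e-12)
        sims.append(cos)
    # Clean inputs keep a stable relation to their neighbors; AEs tend not to.
    return float(np.mean(sims)) < sim_threshold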
arXiv Detail & Related papers (2022-08-31T08:18:44Z) - Hierarchical Distribution-Aware Testing of Deep Learning [13.254093944540438]
Deep Learning (DL) is increasingly used in safety-critical applications, raising concerns about its reliability.
DL suffers from a well-known problem of lacking robustness when faced with adversarial perturbations known as Adversarial Examples (AEs).
We propose a new robustness testing approach for detecting AEs that considers both the feature level distribution and the pixel level distribution.
arXiv Detail & Related papers (2022-05-17T19:13:55Z) - Scale-Equivalent Distillation for Semi-Supervised Object Detection [57.59525453301374]
Recent Semi-Supervised Object Detection (SS-OD) methods are mainly based on self-training, generating hard pseudo-labels by a teacher model on unlabeled data as supervisory signals.
We analyze the challenges these methods meet with the empirical experiment results.
We introduce a novel approach, Scale-Equivalent Distillation (SED), which is a simple yet effective end-to-end knowledge distillation framework robust to large object size variance and class imbalance.
arXiv Detail & Related papers (2022-03-23T07:33:37Z) - Improved Certified Defenses against Data Poisoning with (Deterministic)
Finite Aggregation [122.83280749890078]
We propose an improved certified defense against general poisoning attacks, namely Finite Aggregation.
In contrast to DPA, which directly splits the training set into disjoint subsets, our method first splits the training set into smaller disjoint subsets.
We offer an alternative view of our method, bridging the designs of deterministic and aggregation-based certified defenses.
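The partition-and-aggregate recipe that Finite Aggregation refines (and that DPA instantiates directly) can be sketched as follows; the hashing scheme and the toy vote-gap certificate are illustrative simplifications, not the paper's exact construction.

```python
import hashlib
from collections import Counter

def partition(dataset, k):
    """Deterministically hash each example into one of k disjoint subsets,
    as in partition-based certified defenses against poisoning."""
    subsets = [[] for _ in range(k)]
    for x, y in dataset:
        h = int(hashlib.md5(repr(x).encode()).hexdigest(), 16) % k
        subsets[h].append((x, y))
    return subsets

def aggregate_predict(models, x):
    """Majority vote over base classifiers; the gap between the top two vote
    counts bounds how many poisoned training examples can flip the prediction."""
    votes = Counter(m(x) for m in models)
    (top, n1), *rest = votes.most_common()
    n2 = rest[0][1] if rest else 0
    certified_radius = (n1 - n2) // 2  # poisoning one subset flips at most one vote
    return top, certified_radius
```

Because each training example lands in exactly one subset, a poisoned example can corrupt at most one base classifier; Finite Aggregation's contribution is to split more finely first and then reassemble the smaller subsets into aggregates, tightening this certificate.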
arXiv Detail & Related papers (2022-02-05T20:08:58Z) - Mitigating the Mutual Error Amplification for Semi-Supervised Object
Detection [92.52505195585925]
We propose a Cross Teaching (CT) method, aiming to mitigate the mutual error amplification by introducing a rectification mechanism of pseudo labels.
In contrast to existing mutual teaching methods that directly treat predictions from other detectors as pseudo labels, we propose the Label Rectification Module (LRM).
arXiv Detail & Related papers (2022-01-26T03:34:57Z) - MixDefense: A Defense-in-Depth Framework for Adversarial Example
Detection Based on Statistical and Semantic Analysis [14.313178290347293]
We propose a multilayer defense-in-depth framework for AE detection, namely MixDefense.
We leverage the "noise" features extracted from the inputs to discover the statistical difference between natural images and tampered ones for AE detection.
We show that the proposed MixDefense solution outperforms the existing AE detection techniques by a considerable margin.
arXiv Detail & Related papers (2021-04-20T15:57:07Z) - Anomaly Detection with Convolutional Autoencoders for Fingerprint
Presentation Attack Detection [11.879849130630406]
Presentation attack detection (PAD) methods are used to determine whether samples stem from a bona fide subject or from a presentation attack instrument (PAI).
We propose a new PAD technique based on autoencoders (AEs) trained only on bona fide samples (i.e. one-class) captured in the short wave infrared domain.
arXiv Detail & Related papers (2020-08-18T15:33:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.