Towards Deep Learning Models Resistant to Transfer-based Adversarial Attacks via Data-centric Robust Learning
- URL: http://arxiv.org/abs/2310.09891v1
- Date: Sun, 15 Oct 2023 17:20:42 GMT
- Title: Towards Deep Learning Models Resistant to Transfer-based Adversarial Attacks via Data-centric Robust Learning
- Authors: Yulong Yang, Chenhao Lin, Xiang Ji, Qiwei Tian, Qian Li, Hongshan Yang, Zhibo Wang, Chao Shen
- Abstract summary: Adversarial training (AT) is recognized as the strongest defense against white-box attacks.
We name this new defense paradigm Data-centric Robust Learning (DRL).
- Score: 16.53553150596255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer-based adversarial attacks pose a severe threat to real-world deep
learning systems since they do not require access to target models. Adversarial
training (AT), which is recognized as the strongest defense against white-box
attacks, also provides high robustness against (black-box) transfer-based
attacks. However, AT suffers from heavy computational overhead since it
optimizes the adversarial examples during the whole training process. In this
paper, we demonstrate that such heavy optimization is not necessary for AT
against transfer-based attacks. Instead, a one-shot adversarial augmentation
prior to training is sufficient, and we name this new defense paradigm
Data-centric Robust Learning (DRL). Our experimental results show that DRL
outperforms widely-used AT techniques (e.g., PGD-AT, TRADES, EAT, and FAT) in
terms of black-box robustness and even surpasses the top-1 defense on
RobustBench when combined with diverse data augmentations and loss
regularizations. We also identify other benefits of DRL, such as improved
model generalization and robust fairness.
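Since DRL's core claim is that a single adversarial-augmentation pass before training can replace AT's per-step inner maximization, a minimal sketch helps make the pipeline concrete. The PyTorch sketch below is an interpretation of the paradigm under stated assumptions: an L-infinity PGD attack against a surrogate model, with illustrative epsilon and step settings rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import TensorDataset

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-infinity PGD; used here only once, before training."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

def build_drl_dataset(surrogate, clean_loader):
    """One-shot augmentation: pair every clean batch with adversarial copies."""
    xs, ys = [], []
    surrogate.eval()
    for x, y in clean_loader:
        xs += [x, pgd_attack(surrogate, x, y)]
        ys += [y, y]
    return TensorDataset(torch.cat(xs), torch.cat(ys))
```

The returned dataset is then fed to an ordinary training loop; unlike PGD-AT, no adversarial example is optimized during training itself.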
Related papers
- Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z)
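To make "continuous attacks" concrete: instead of searching over discrete token substitutions, the attack ascends the loss directly in embedding space. The toy sketch below illustrates only that mechanism; TinyTextClassifier, the perturbation bound, and the step size are hypothetical stand-ins, not C-AdvUL's actual models or losses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyTextClassifier(nn.Module):
    """Hypothetical stand-in model with an inspectable embedding layer."""
    def __init__(self, vocab=1000, dim=32, classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, classes)

    def forward_from_embeddings(self, e):
        return self.head(e.mean(dim=1))  # mean-pool tokens, then classify

model = TinyTextClassifier()
tokens = torch.randint(0, 1000, (4, 16))   # batch of token ids
labels = torch.randint(0, 2, (4,))
e = model.emb(tokens).detach()

delta = torch.zeros_like(e, requires_grad=True)
for _ in range(5):                          # continuous ascent in embedding space
    loss = F.cross_entropy(model.forward_from_embeddings(e + delta), labels)
    grad, = torch.autograd.grad(loss, delta)
    with torch.no_grad():
        delta += 0.1 * grad / (grad.norm() + 1e-12)  # normalized ascent step
        delta.clamp_(-0.5, 0.5)                      # keep the perturbation small
adv_embeddings = (e + delta).detach()
```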
- Adaptive Batch Normalization Networks for Adversarial Robustness [33.14617293166724]
Adversarial Training (AT) has been a standard foundation of modern adversarial defense approaches.
We propose adaptive Batch Normalization Network (ABNN), inspired by the recent advances in test-time domain adaptation.
ABNN consistently improves adversarial robustness against both digital and physically realizable attacks.
arXiv Detail & Related papers (2024-05-20T00:58:53Z)
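A minimal sketch of the test-time adaptation idea behind ABNN, assuming the generic trick of re-estimating BatchNorm statistics on the incoming (possibly attacked) test batch; ABNN's actual architecture and adaptation procedure may differ.

```python
import torch
import torch.nn as nn

def adapt_bn_to_test_batch(model, test_batch):
    """Replace stored BatchNorm statistics with test-batch statistics."""
    model.train()                       # BN uses batch statistics in train mode
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.reset_running_stats()     # discard source-domain statistics
            m.momentum = None           # accumulate a cumulative moving average
    with torch.no_grad():
        model(test_batch)               # forward pass re-estimates the statistics
    model.eval()
    return model
```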
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
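For intuition on the reweighting: when instance weights are regularized by a KL divergence toward the uniform distribution, the optimal weights take a softmax form over per-example losses. The sketch below shows that generic closed form with an assumed temperature tau; the paper's doubly robust objective is more involved.

```python
import torch.nn.functional as F

def reweighted_adversarial_loss(logits_adv, labels, tau=1.0):
    """Weight each example by a softmax of its own robust loss (generic form)."""
    per_example = F.cross_entropy(logits_adv, labels, reduction="none")
    weights = F.softmax(per_example.detach() / tau, dim=0)  # KL-regularized optimum
    return (weights * per_example).sum()
```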
- DODEM: DOuble DEfense Mechanism Against Adversarial Attacks Towards Secure Industrial Internet of Things Analytics [8.697883716452385]
We propose a double defense mechanism to detect and mitigate adversarial attacks in I-IoT environments.
We first detect if there is an adversarial attack on a given sample using novelty detection algorithms.
If an attack is detected, adversarial retraining provides a more robust model, while standard training is applied to regular samples.
arXiv Detail & Related papers (2023-01-23T22:10:40Z)
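A rough sketch of that detect-then-route flow, assuming scikit-learn's IsolationForest as the novelty detector and placeholder standard_model/robust_model objects exposing a predict method; the paper evaluates several detection algorithms.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
clean_features = rng.normal(size=(500, 16))   # placeholder clean training features
detector = IsolationForest(contamination=0.05, random_state=0).fit(clean_features)

def route_and_predict(x, standard_model, robust_model):
    """Send suspected adversarial samples to the adversarially retrained model."""
    flags = detector.predict(x)               # -1 = novelty, i.e. possible attack
    preds = np.empty(len(x), dtype=int)
    for i, flag in enumerate(flags):
        chosen = robust_model if flag == -1 else standard_model
        preds[i] = chosen.predict(x[i:i + 1])[0]
    return preds
```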
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), mitigates the impact of adversarial perturbations via min-max robust training.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
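A skeletal view of the idea: each worker crafts adversarial examples for its local shard while DistributedDataParallel averages gradients across machines, so the effective adversarial batch grows with the number of workers. The pgd_attack helper and launch boilerplate are assumptions, not the paper's exact recipe.

```python
import torch.distributed as dist
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP

def adversarial_train_epoch(model, loader, optimizer, pgd_attack):
    model.train()
    for x, y in loader:                          # loader uses a DistributedSampler
        x_adv = pgd_attack(model.module, x, y)   # local inner maximization
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()  # DDP all-reduces gradients
        optimizer.step()

# Typical setup (one process per GPU), e.g. under torchrun:
# dist.init_process_group("nccl"); model = DDP(model.cuda(), device_ids=[rank])
```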
- Fabricated Flips: Poisoning Federated Learning without Data [9.060263645085564]
Attacks on Federated Learning (FL) can severely reduce the quality of the generated models.
We propose a data-free untargeted attack (DFA) that synthesizes malicious data to craft adversarial models.
DFA achieves a similar or even higher attack success rate than state-of-the-art untargeted attacks.
arXiv Detail & Related papers (2022-02-07T20:38:28Z)
- Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses [82.3052187788609]
Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks.
Recent works show generalization improvement with adversarial samples under novel threat models.
We propose a novel threat model called Joint Space Threat Model (JSTM).
Under JSTM, we develop novel adversarial attacks and defenses.
arXiv Detail & Related papers (2021-12-12T21:08:14Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
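To illustrate a learned attack optimizer: a small coordinate-wise LSTM can replace PGD's fixed sign rule, mapping per-pixel gradients to bounded update directions. The sketch below shows only the attack loop with an already-trained optimizer; MAMA's meta-training procedure is omitted, and all shapes and sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedOptimizer(nn.Module):
    """Coordinate-wise LSTM: every pixel shares the same tiny recurrent cell."""
    def __init__(self, hidden=20):
        super().__init__()
        self.cell = nn.LSTMCell(1, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, grad, state):
        g = grad.reshape(-1, 1)           # treat each coordinate as a batch item
        h, c = self.cell(g, state)
        return self.out(h).reshape(grad.shape), (h, c)

def learned_attack(model, opt_rnn, x, y, eps=8/255, alpha=2/255, steps=10):
    delta = torch.zeros_like(x, requires_grad=True)
    state = None
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        update, state = opt_rnn(grad, state)
        delta = (delta + alpha * torch.tanh(update)).clamp(-eps, eps)  # learned step
        delta = delta.detach().requires_grad_(True)
    return (x + delta).detach()
```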
- Lagrangian Objective Function Leads to Improved Unforeseen Attack Generalization in Adversarial Training [0.0]
Adversarial training (AT) has been shown to be effective at producing models robust to the attack used during training, but such robustness often fails to generalize to unseen attacks.
We propose a simple modification to AT that mitigates this issue.
We show that our attack is faster than other attack schemes designed for unseen attack generalization.
arXiv Detail & Related papers (2021-03-29T07:23:46Z)
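One reading of the "Lagrangian objective": fold the perturbation-size constraint into the attack objective as a penalty, maximizing loss(x + delta) - lam * ||delta||^2 with no hard epsilon projection. The sketch below implements that generic penalty form; lam, the step size, and the squared-L2 penalty are assumptions rather than the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def lagrangian_attack(model, x, y, lam=1.0, lr=0.05, steps=20):
    """Penalty-form attack: maximize loss minus lam * squared perturbation norm."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        obj = F.cross_entropy(model(x + delta), y) - lam * delta.pow(2).sum()
        grad, = torch.autograd.grad(obj, delta)
        with torch.no_grad():
            delta += lr * grad               # unconstrained gradient ascent
    return (x + delta).clamp(0, 1).detach()
```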
- Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses [59.58128343334556]
We introduce a relaxation term into the standard loss that finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes function mapping of the clean image to guide the generation of adversaries.
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
arXiv Detail & Related papers (2020-11-30T16:39:39Z)
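A hedged sketch of the guided-attack idea: combine a margin term with an L2 relaxation on the difference between the adversarial and clean outputs, which supplies informative gradients early on and is decayed over iterations. The margin form and decay schedule below are assumptions based on the summary, not GAMA's exact formulation.

```python
import torch
import torch.nn.functional as F

def guided_margin_attack(model, x, y, eps=8/255, alpha=2/255, steps=20, w0=10.0):
    p_clean = F.softmax(model(x), dim=1).detach()   # guidance from the clean image
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for t in range(steps):
        p = F.softmax(model((x + delta).clamp(0, 1)), dim=1)
        correct = p.gather(1, y[:, None]).squeeze(1)
        wrong = p.masked_fill(F.one_hot(y, p.size(1)).bool(), -1.0).max(dim=1).values
        w = w0 * (1 - t / steps)                    # decaying relaxation weight
        obj = (wrong - correct).sum() + w * ((p - p_clean) ** 2).sum()
        grad, = torch.autograd.grad(obj, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()
```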
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.