Target Training Does Adversarial Training Without Adversarial Samples
- URL: http://arxiv.org/abs/2102.04836v1
- Date: Tue, 9 Feb 2021 14:17:57 GMT
- Title: Target Training Does Adversarial Training Without Adversarial Samples
- Authors: Blerta Lindqvist
- Abstract summary: Based on the minimization at the core of adversarial attacks, adversarial samples are not optimal for steering attack convergence.
Target Training eliminates the need to generate adversarial samples for training against all attacks that minimize perturbation.
Using adversarial samples against attacks that do not minimize perturbation, Target Training exceeds current best defense ($69.1$%) with $76.4$% against CW-L$_2$($\kappa=40$) in CIFAR10.
- Score: 0.10152838128195464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural network classifiers are vulnerable to misclassification of adversarial
samples, for which the current best defense trains classifiers with adversarial
samples. However, adversarial samples are not optimal for steering attack
convergence, based on the minimization at the core of adversarial attacks. The
minimization perturbation term can be minimized towards $0$ by replacing
adversarial samples in training with duplicated original samples, labeled
differently only for training. Using only original samples, Target Training
eliminates the need to generate adversarial samples for training against all
attacks that minimize perturbation. In low-capacity classifiers and without
using adversarial samples, Target Training exceeds both default CIFAR10
accuracy ($84.3$%) and current best defense accuracy (below $25$%) with $84.8$%
against CW-L$_2$($\kappa=0$) attack, and $86.6$% against DeepFool. Using
adversarial samples against attacks that do not minimize perturbation, Target
Training exceeds current best defense ($69.1$%) with $76.4$% against
CW-L$_2$($\kappa=40$) in CIFAR10.
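The abstract's central construction is to replace adversarial samples with duplicated original samples that are relabeled only for training. Below is a minimal sketch of one plausible reading of that data construction, assuming a 10-class problem whose duplicates are mapped to 10 extra "target" classes; the class count, label mapping, and inference-time folding are illustrative assumptions, not the paper's exact recipe.

```python
# Hypothetical sketch: duplicate every original sample and relabel the copy into
# an extra "target" class, so a k-class problem is trained as a 2k-class problem
# without generating any adversarial samples. The mapping y -> y + k is an
# assumption for illustration, not the paper's exact construction.
import numpy as np

def target_training_dataset(x, y, num_classes=10):
    """Return originals plus relabeled duplicates (no adversarial samples)."""
    x_dup = x.copy()                          # duplicated original samples
    y_dup = y + num_classes                   # labeled differently, only for training
    x_train = np.concatenate([x, x_dup], axis=0)
    y_train = np.concatenate([y, y_dup], axis=0)
    return x_train, y_train                   # train a 2 * num_classes-way classifier

# At inference time, a prediction in one of the extra classes can be folded back
# to the original label space: predicted_label = argmax(logits) % num_classes
```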
Related papers
- Fast Propagation is Better: Accelerating Single-Step Adversarial
Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
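For context on the overhead this paper attacks, the sketch below shows a generic single-step (FGSM-style) adversarial training step, where crafting each adversarial batch costs an extra forward and backward pass; the subnetwork-sampling speedup itself is not reproduced, and the model, loss, and epsilon are assumptions.

```python
# Generic single-step (FGSM-style) adversarial training step, shown only to make
# the extra forward/backward cost of example generation explicit. Model, loss, and
# epsilon are illustrative assumptions; the paper's subnetwork sampling is not shown.
import torch
import torch.nn.functional as F

def fgsm_at_step(model, optimizer, x, y, eps=8 / 255):
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)            # extra forward pass
    grad = torch.autograd.grad(loss, x_adv)[0]         # extra backward pass
    x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()

    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()        # training pass on the adversarial batch
    optimizer.step()
```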
arXiv Detail & Related papers (2023-10-24T01:36:20Z) - Robust Few-shot Learning Without Using any Adversarial Samples [19.34427461937382]
A few efforts have been made to combine the few-shot problem with the robustness objective using sophisticated Meta-Learning techniques.
We propose a simple but effective alternative that does not require any adversarial samples.
Inspired by the cognitive decision-making process in humans, we enforce high-level feature matching between the base class data and their corresponding low-frequency samples.
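A hedged sketch of what matching high-level features between an image and its low-frequency counterpart could look like is given below; the FFT low-pass filter, cosine-similarity loss, and generic feature extractor are assumptions rather than the paper's exact construction.

```python
# Illustrative sketch: build a low-frequency copy of each image with an FFT
# low-pass filter and align the high-level features of the two versions.
# Filter radius, loss form, and feature extractor are assumptions.
import torch
import torch.nn.functional as F

def low_frequency(x, keep=0.25):
    """Keep only the lowest spatial frequencies of a batch of images (N, C, H, W)."""
    freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    h, w = x.shape[-2:]
    mask = torch.zeros_like(freq.real)
    ch, cw, rh, rw = h // 2, w // 2, int(h * keep / 2), int(w * keep / 2)
    mask[..., ch - rh:ch + rh, cw - rw:cw + rw] = 1.0
    return torch.fft.ifft2(torch.fft.ifftshift(freq * mask, dim=(-2, -1))).real

def feature_match_loss(feature_extractor, x):
    f_orig = feature_extractor(x)             # high-level features of the original
    f_low = feature_extractor(low_frequency(x))
    return 1.0 - F.cosine_similarity(f_orig, f_low, dim=1).mean()
```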
arXiv Detail & Related papers (2022-11-03T05:58:26Z) - Sampling Attacks on Meta Reinforcement Learning: A Minimax Formulation
and Complexity Analysis [20.11993437283895]
This paper provides a game-theoretical underpinning for understanding this type of security risk.
We define the sampling attack model as a Stackelberg game between the attacker and the agent, which yields a minimax formulation.
We observe that a minor effort of the attacker can significantly deteriorate the learning performance.
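One generic way to write such a Stackelberg sampling attack as a minimax-type problem is sketched below; the symbols (attacker strategy a with cost budget B, agent parameters theta, trajectory loss l) are illustrative and do not follow the paper's notation.

```latex
% Attacker (leader) commits to a sampling-corruption strategy a within budget B;
% the learning agent (follower) best-responds with parameters theta.
\max_{a \in \mathcal{A},\; c(a) \le B} \;\; \min_{\theta} \;\;
\mathbb{E}_{\tau \sim p_a(\cdot \mid \theta)} \big[ \ell(\theta; \tau) \big]
```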
arXiv Detail & Related papers (2022-07-29T21:29:29Z) - Distributed Adversarial Training to Robustify Deep Neural Networks at
Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
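The sketch below shows the generic shape of data-parallel adversarial training, where each worker crafts adversarial examples for its own shard and gradient averaging happens inside DistributedDataParallel; the PGD settings, model wrapping, and process-group setup are assumptions, not the proposed framework.

```python
# Schematic data-parallel adversarial training: each worker runs PGD on its own
# shard, and DistributedDataParallel averages gradients across machines.
# PGD settings and setup are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP

def pgd(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        grad = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)[0]
        x_adv = torch.min(torch.max(x_adv + alpha * grad.sign(), x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()

def wrap_for_ddp(model, rank):
    # assumes torch.distributed.init_process_group(...) has already been called
    return DDP(model.to(rank), device_ids=[rank])

def train_step(ddp_model, optimizer, x, y):
    x_adv = pgd(ddp_model, x, y)                        # local adversarial examples
    optimizer.zero_grad()
    F.cross_entropy(ddp_model(x_adv), y).backward()     # gradients all-reduced by DDP
    optimizer.step()
```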
arXiv Detail & Related papers (2022-06-13T15:39:43Z) - DAD: Data-free Adversarial Defense at Test Time [21.741026088202126]
Deep models are highly susceptible to adversarial attacks.
Privacy has become an important concern, restricting access to only trained models but not the training data.
We propose a completely novel problem of 'test-time adversarial defense in absence of training data and even their statistics'.
arXiv Detail & Related papers (2022-04-04T15:16:13Z) - Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack [96.50202709922698]
A practical evaluation method should be convenient (i.e., parameter-free), efficient (i.e., fewer iterations) and reliable.
We propose a parameter-free Adaptive Auto Attack (A$^3$) evaluation method which addresses the efficiency and reliability in a test-time-training fashion.
arXiv Detail & Related papers (2022-03-10T04:53:54Z) - A BIC based Mixture Model Defense against Data Poisoning Attacks on
Classifiers [24.53226962899903]
Data Poisoning (DP) is an effective attack that causes trained classifiers to misclassify their inputs.
We propose a novel mixture model defense against DP attacks.
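As a rough illustration of the statistical tool named in the title, the snippet below selects per-class Gaussian mixture sizes by the Bayesian Information Criterion (BIC) and flags unusually small components; it is a plausible building block under stated assumptions, not the paper's actual defense.

```python
# Generic BIC-driven mixture-model selection: fit Gaussian mixtures of increasing
# size to one class's features and keep the fit with the lowest BIC. Flagging
# small-weight components as suspicious is an assumption for illustration only.
from sklearn.mixture import GaussianMixture

def fit_by_bic(x_class, max_components=5, seed=0):
    """Return the Gaussian mixture whose component count minimizes BIC."""
    fits = [GaussianMixture(n_components=k, random_state=seed).fit(x_class)
            for k in range(1, max_components + 1)]
    return min(fits, key=lambda gm: gm.bic(x_class))

# Example: gm = fit_by_bic(features_of_one_class)
#          suspicious_components = gm.weights_ < 0.05
```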
arXiv Detail & Related papers (2021-05-28T01:06:09Z) - Universal Adversarial Training with Class-Wise Perturbations [78.05383266222285]
Adversarial training is the most widely used method for defending against adversarial attacks.
In this work, we find that a universal adversarial perturbation (UAP) does not attack all classes equally.
We improve state-of-the-art universal adversarial training (UAT) by proposing to utilize class-wise UAPs during adversarial training.
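A sketch of what class-wise UAPs during adversarial training might look like is given below: one universal perturbation is kept per class, applied by label, and updated with a sign-gradient ascent step under an L-infinity budget; the update rule, step size, and budget are assumptions, not the paper's exact algorithm.

```python
# Illustrative class-wise universal adversarial perturbations: one shared
# perturbation per class, applied by label and updated by sign-gradient ascent
# inside an L_inf ball. Hyperparameters are assumptions.
import torch
import torch.nn.functional as F

class ClassWiseUAP:
    def __init__(self, num_classes, shape, eps=8 / 255, step=1 / 255):
        self.delta = torch.zeros(num_classes, *shape)    # one perturbation per class
        self.eps, self.step = eps, step

    def perturb(self, x, y):
        return (x + self.delta[y]).clamp(0, 1)           # apply each sample's class UAP

    def update(self, model, x, y):
        delta = self.delta[y].detach().requires_grad_(True)
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        grad = torch.autograd.grad(loss, delta)[0]
        for c in y.unique():                             # ascend and project, class by class
            g = grad[y == c].mean(dim=0)
            self.delta[c] = (self.delta[c] + self.step * g.sign()).clamp(-self.eps, self.eps)
```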
arXiv Detail & Related papers (2021-04-07T09:05:49Z) - Composite Adversarial Attacks [57.293211764569996]
An adversarial attack is a technique for deceiving Machine Learning (ML) models.
In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching the best combination of attack algorithms.
CAA beats 10 top attackers on 11 diverse defenses with less elapsed time.
arXiv Detail & Related papers (2020-12-10T03:21:16Z) - Tricking Adversarial Attacks To Fail [0.05076419064097732]
Our white-box defense tricks untargeted attacks into becoming attacks targeted at designated target classes.
Our Target Training defense tricks the minimization at the core of untargeted, gradient-based adversarial attacks.
arXiv Detail & Related papers (2020-06-08T12:22:07Z) - Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to the unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
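The schematic objective below captures the distributional idea: the inner maximization is over a distribution of perturbations inside the epsilon-ball rather than a single worst-case point, with an entropy term encouraging diverse adversarial examples; the notation is a paraphrase, not the paper's exact formulation.

```latex
% Entropy-regularized inner maximization over a perturbation distribution p(delta)
% supported on the L_inf epsilon-ball; lambda trades off loss and diversity.
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
\left[ \max_{p(\delta):\, \|\delta\|_\infty \le \epsilon}
  \mathbb{E}_{\delta \sim p} \big[ \mathcal{L}\big(f_\theta(x + \delta),\, y\big) \big]
  + \lambda \, \mathcal{H}(p) \right]
```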
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.