Lagrangian Objective Function Leads to Improved Unforeseen Attack
Generalization in Adversarial Training
- URL: http://arxiv.org/abs/2103.15385v1
- Date: Mon, 29 Mar 2021 07:23:46 GMT
- Title: Lagrangian Objective Function Leads to Improved Unforeseen Attack
Generalization in Adversarial Training
- Authors: Mohammad Azizmalayeri, Mohammad Hossein Rohban
- Abstract summary: Adversarial training (AT) has been shown to be effective at producing models that are robust to the attack used during training.
We propose a simple modification to AT that mitigates overfitting to the training attack scheme.
We show that our attack is faster than other attack schemes that are designed for unseen attack generalization.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent improvements in deep learning models and their practical applications
have raised concerns about the robustness of these models against adversarial
examples. Adversarial training (AT) has been shown to be effective at
producing models that are robust to the attack used during training. However,
such models usually fail against other attacks, i.e., they overfit to the
training attack scheme. In this paper, we propose a simple modification to AT
that mitigates this issue. More specifically, we minimize the perturbation
$\ell_p$ norm while maximizing the classification loss in the Lagrangian form.
We argue that crafting adversarial examples based on this scheme results in
enhanced attack generalization in the learned model. We compare our final
model's robust accuracy against attacks that were not used during training
with that of closely related state-of-the-art AT methods. This comparison
demonstrates that our average robust accuracy against unseen attacks is 5.9%
higher on the CIFAR-10 dataset and 3.2% higher on the ImageNet-100 dataset
than that of the corresponding
state-of-the-art methods. We also demonstrate that our attack is faster than
other attack schemes that are designed for unseen attack generalization, and
conclude that it is feasible for large-scale datasets.
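To make this concrete, below is a minimal PyTorch sketch of an attack with
such a Lagrangian objective: it ascends on the classification loss minus a
weighted perturbation norm, $CE(f(x+\delta), y) - \lambda\|\delta\|_p$. The
names `model`, `lam`, `step_size`, and `n_steps` are illustrative assumptions,
not the authors' implementation.

```python
# Minimal sketch (assumed hyperparameters): maximize the classification
# loss while penalizing the perturbation's l_p norm, in Lagrangian form.
import torch
import torch.nn.functional as F

def lagrangian_attack(model, x, y, lam=1.0, step_size=0.01, n_steps=10, p=2):
    # small random start avoids an undefined norm gradient at delta = 0
    delta = (1e-3 * torch.randn_like(x)).requires_grad_(True)
    for _ in range(n_steps):
        loss = F.cross_entropy(model(x + delta), y)
        penalty = delta.flatten(1).norm(p=p, dim=1).mean()
        objective = loss - lam * penalty  # Lagrangian form
        objective.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()  # gradient-ascent step
            delta.grad.zero_()
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep a valid image
    return (x + delta).detach()
```

In AT, each training batch would then be crafted with a call like
`lagrangian_attack(model, x, y)`. Because the norm is penalized rather than
fixed to a single $\epsilon$-ball, the perturbations are not tied to one
attack scheme, which is the intuition behind the improved unseen-attack
generalization.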
Related papers
- Learn from the Past: A Proxy Guided Adversarial Defense Framework with
Self Distillation Regularization [53.04697800214848]
Adversarial Training (AT) is pivotal in fortifying the robustness of deep learning models.
AT methods, which rely on direct iterative updates for the target model's defense, frequently encounter obstacles such as unstable training and catastrophic overfitting.
We present a general proxy-guided defense framework, LAST (Learn from the Past).
arXiv Detail & Related papers (2023-10-19T13:13:41Z)
- OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks [17.584752814352502]
Evasion Attacks (EA) are used to test the robustness of trained neural networks by distorting input data.
We introduce a self-supervised, computationally economical method for generating adversarial examples.
Our experiments consistently demonstrate the method is effective across various models, unseen data categories, and even defended models.
arXiv Detail & Related papers (2023-10-05T17:34:47Z)
- Transferable Attack for Semantic Segmentation [59.17710830038692]
We study adversarial attacks on semantic segmentation models, and observe that the adversarial examples generated from a source model fail to attack the target models.
We propose an ensemble attack for semantic segmentation to achieve more effective attacks with higher transferability.
arXiv Detail & Related papers (2023-07-31T11:05:55Z)
- MORA: Improving Ensemble Robustness Evaluation with Model-Reweighing Attack [26.37741124166643]
Adversarial attacks can deceive neural networks by adding tiny perturbations to their input data.
We show that adversarial attack strategies cannot reliably evaluate ensemble defenses, sizeably overestimating their robustness.
We introduce MORA, a model-reweighing attack that steers adversarial example synthesis by reweighing the importance of sub-model gradients; a rough sketch follows this entry.
arXiv Detail & Related papers (2022-11-15T09:45:32Z)
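As a rough illustration of the reweighing idea above, the sketch below weights
each sub-model's gradient by how far that sub-model is from being fooled. The
softmax weighting, `alpha`, and `eps` are assumptions, not MORA's exact scheme.

```python
# Hedged sketch of a model-reweighing attack step in the spirit of MORA;
# the softmax weighting, alpha, and eps are illustrative assumptions.
import torch
import torch.nn.functional as F

def reweighed_ensemble_step(submodels, x, y, delta, alpha=0.01, eps=8 / 255):
    losses = torch.stack([F.cross_entropy(m(x + delta), y) for m in submodels])
    # sub-models that are not yet fooled (low loss) get higher weight
    weights = F.softmax(-losses.detach(), dim=0)
    grad = torch.autograd.grad((weights * losses).sum(), delta)[0]
    with torch.no_grad():
        new_delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
    return new_delta.requires_grad_(True)

# usage: delta = torch.zeros_like(x, requires_grad=True), then iterate the step
```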
- Fast Adversarial Training with Adaptive Step Size [62.37203478589929]
We study catastrophic overfitting in fast adversarial training from the perspective of training instances.
We propose a simple but effective method, Adversarial Training with Adaptive Step size (ATAS).
ATAS learns an instance-wise adaptive step size that is inversely proportional to its gradient norm; a single-step sketch follows this entry.
arXiv Detail & Related papers (2022-06-06T08:20:07Z)
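A hedged, single-step sketch of the instance-wise idea above: scale each
example's step inversely to its input-gradient norm. `gamma` and `eps` are
illustrative assumptions, and the actual method learns the step size during
training rather than from a single gradient.

```python
# Hedged single-step sketch in the spirit of ATAS: the step size for
# each instance is inversely proportional to its input-gradient norm.
import torch
import torch.nn.functional as F

def adaptive_step_fgsm(model, x, y, eps=8 / 255, gamma=0.05):
    delta = torch.zeros_like(x, requires_grad=True)
    loss = F.cross_entropy(model(x + delta), y)
    grad = torch.autograd.grad(loss, delta)[0]
    g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12)
    step = (gamma / g_norm).view(-1, *([1] * (x.dim() - 1)))  # per instance
    return (x + (step * grad.sign()).clamp(-eps, eps)).detach()
```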
- Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses [82.3052187788609]
Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks.
Recent works show generalization improvement with adversarial samples under novel threat models.
We propose a novel threat model called the Joint Space Threat Model (JSTM).
Under JSTM, we develop novel adversarial attacks and defenses.
arXiv Detail & Related papers (2021-12-12T21:08:14Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Optimal Transport as a Defense Against Adversarial Attacks [4.6193503399184275]
Adversarial attacks can find a human-imperceptible perturbation for a given image that will mislead a trained model.
Previous work aimed to align original and adversarial image representations in the same way as domain adaptation to improve robustness.
We propose to use a loss between distributions that faithfully reflects the ground distance.
This leads to SAT (Sinkhorn Adversarial Training), a more robust defense against adversarial attacks; a minimal sketch follows this entry.
arXiv Detail & Related papers (2021-02-05T13:24:36Z)
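A minimal sketch of the distribution-alignment idea above, using a Sinkhorn
divergence from the `geomloss` package. The encoder/classifier split and
`lambda_align` are assumptions, not the paper's exact training objective.

```python
# Hedged sketch in the spirit of Sinkhorn Adversarial Training: align
# clean and adversarial embedding distributions with a Sinkhorn
# divergence, which respects the ground distance between embeddings.
import torch
import torch.nn.functional as F
from geomloss import SamplesLoss  # pip install geomloss

sinkhorn = SamplesLoss(loss="sinkhorn", p=2, blur=0.05)

def sat_style_loss(encoder, classifier, x_clean, x_adv, y, lambda_align=1.0):
    z_clean, z_adv = encoder(x_clean), encoder(x_adv)  # (N, D) embeddings
    ce = F.cross_entropy(classifier(z_adv), y)
    return ce + lambda_align * sinkhorn(z_clean, z_adv)
```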
- Untargeted, Targeted and Universal Adversarial Attacks and Defenses on Time Series [0.0]
We have performed untargeted, targeted and universal adversarial attacks on UCR time series datasets.
Our results show that deep learning based time series classification models are vulnerable to these attacks.
We also show that universal adversarial attacks have a good generalization property, as they need only a fraction of the training data.
arXiv Detail & Related papers (2021-01-13T13:00:51Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.