On the Convergence and Robustness of Adversarial Training
- URL: http://arxiv.org/abs/2112.08304v1
- Date: Wed, 15 Dec 2021 17:54:08 GMT
- Title: On the Convergence and Robustness of Adversarial Training
- Authors: Yisen Wang, Xingjun Ma, James Bailey, Jinfeng Yi, Bowen Zhou, Quanquan
Gu
- Abstract summary: Adversarial training with Projected Gradient Descent (PGD) is amongst the most effective.
We propose a \textit{dynamic} training strategy to increase the convergence quality of the generated adversarial examples.
Our theoretical and empirical results show the effectiveness of the proposed method.
- Score: 134.25999006326916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Improving the robustness of deep neural networks (DNNs) to adversarial
examples is an important yet challenging problem for secure deep learning.
Across existing defense techniques, adversarial training with Projected
Gradient Descent (PGD) is amongst the most effective. Adversarial training
solves a min-max optimization problem, with the \textit{inner maximization}
generating adversarial examples by maximizing the classification loss, and the
\textit{outer minimization} finding model parameters by minimizing the loss on
adversarial examples generated from the inner maximization. A criterion that
measures how well the inner maximization is solved is therefore crucial for
adversarial training. In this paper, we propose such a criterion, namely
First-Order Stationary Condition for constrained optimization (FOSC), to
quantitatively evaluate the convergence quality of adversarial examples found
in the inner maximization. With FOSC, we find that to ensure better robustness,
it is essential to use adversarial examples with better convergence quality at
the \textit{later stages} of training. Yet at the early stages, high
convergence quality adversarial examples are not necessary and may even lead to
poor robustness. Based on these observations, we propose a \textit{dynamic}
training strategy to gradually increase the convergence quality of the
generated adversarial examples, which significantly improves the robustness of
adversarial training. Our theoretical and empirical results show the
effectiveness of the proposed method.
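The inner maximization, the FOSC criterion, and the dynamic schedule can be illustrated with a short sketch. The PyTorch-style code below is a minimal illustration, not the authors' released implementation: the helper names `pgd_attack`, `fosc`, and `train_epoch` are assumptions, the $\ell_\infty$ FOSC form $c(x_k) = \epsilon \|\nabla f(x_k)\|_1 - \langle x_k - x_0, \nabla f(x_k)\rangle$ follows the paper's first-order stationarity criterion, and the linearly decreasing `fosc_target` schedule is an assumed stand-in for the paper's dynamic strategy.

```python
# Minimal sketch of PGD adversarial training with an FOSC-style convergence
# measure for the inner maximization. Assumes image inputs scaled to [0, 1].
import torch
import torch.nn.functional as F


def pgd_attack(model, x0, y, eps, alpha, steps):
    """Inner maximization: l_inf PGD ascent on the classification loss."""
    x = (x0 + torch.empty_like(x0).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x.requires_grad_(True)
        loss = F.cross_entropy(model(x), y)
        grad = torch.autograd.grad(loss, x)[0]
        x = x.detach() + alpha * grad.sign()
        # Project back onto the l_inf ball around x0 and the valid input range.
        x = torch.min(torch.max(x, x0 - eps), x0 + eps).clamp(0, 1).detach()
    return x


def fosc(model, x0, x_adv, y, eps):
    """Per-example FOSC value: lower means better-converged inner maximization."""
    x_adv = x_adv.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y, reduction="sum")
    grad = torch.autograd.grad(loss, x_adv)[0]
    g = grad.flatten(1)
    d = (x_adv - x0).flatten(1)
    return eps * g.abs().sum(1) - (d * g).sum(1)


def train_epoch(model, loader, optimizer, epoch, total_epochs,
                eps=8 / 255, alpha=2 / 255, max_steps=10, c_max=0.5):
    # Dynamic criterion (assumed linear schedule): demand higher convergence
    # quality, i.e. a smaller FOSC target, as training progresses.
    fosc_target = c_max * max(0.0, 1.0 - epoch / (0.75 * total_epochs))
    for x0, y in loader:
        x_adv = pgd_attack(model, x0, y, eps, alpha, max_steps)
        c = fosc(model, x0, x_adv, y, eps).mean()
        # In a full implementation the PGD loop would stop once the
        # per-example FOSC drops below fosc_target; here we only monitor it.
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```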
Related papers
- Focus on Hiders: Exploring Hidden Threats for Enhancing Adversarial
Training [20.1991376813843]
We propose a generalized adversarial training algorithm called Hider-Focused Adversarial Training (HFAT).
HFAT combines the optimization directions of standard adversarial training and the prevention of hiders.
We demonstrate the effectiveness of our method based on extensive experiments.
arXiv Detail & Related papers (2023-12-12T08:41:18Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training has proven to be the most effective strategy that injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z) - Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce a new concept of adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z) - Distributed Statistical Min-Max Learning in the Presence of Byzantine
Agents [34.46660729815201]
We consider a multi-agent min-max learning problem, and focus on the emerging challenge of contending with Byzantine adversarial agents.
Our main contribution is to provide a crisp analysis of the proposed robust extra-gradient algorithm for smooth convex-concave and smooth strongly convex-strongly concave functions.
Our rates are near-optimal, and reveal both the effect of adversarial corruption and the benefit of collaboration among the non-faulty agents.
arXiv Detail & Related papers (2022-04-07T03:36:28Z) - Adversarial Robustness with Semi-Infinite Constrained Learning [177.42714838799924]
The vulnerability of deep learning to input perturbations has raised serious questions about its use in safety-critical domains.
We propose a hybrid Langevin Monte Carlo training approach to mitigate this issue.
We show that our approach can mitigate the trade-off between state-of-the-art performance and robustness.
arXiv Detail & Related papers (2021-10-29T13:30:42Z) - Robust Deep Learning as Optimal Control: Insights and Convergence
Guarantees [19.28405674700399]
Injecting adversarial examples during training is a popular defense mechanism against adversarial attacks.
By interpreting the min-max problem as an optimal control problem, it has been shown that one can exploit the compositional structure of neural networks.
We provide the first convergence analysis of this adversarial training algorithm by combining techniques from robust optimal control and inexact methods in optimization.
arXiv Detail & Related papers (2020-05-01T21:26:38Z) - Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)