Revisiting and Advancing Fast Adversarial Training Through The Lens of
Bi-Level Optimization
- URL: http://arxiv.org/abs/2112.12376v2
- Date: Fri, 24 Dec 2021 04:06:41 GMT
- Title: Revisiting and Advancing Fast Adversarial Training Through The Lens of
Bi-Level Optimization
- Authors: Yihua Zhang, Guanhua Zhang, Prashant Khanduri, Mingyi Hong, Shiyu
Chang, Sijia Liu
- Abstract summary: We propose a new tractable bi-level optimization problem, and design and analyze a new set of algorithms termed Fast Bi-level AT (FAST-BAT).
FAST-BAT is capable of defending sign-based projected gradient descent (PGD) attacks without calling any gradient sign method or explicit robust regularization.
- Score: 60.72410937614299
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial training (AT) has become a widely recognized defense mechanism to
improve the robustness of deep neural networks against adversarial attacks. It
solves a min-max optimization problem, where the minimizer (i.e., defender)
seeks a robust model to minimize the worst-case training loss in the presence
of adversarial examples crafted by the maximizer (i.e., attacker). However, the
min-max nature makes AT computationally intensive and thus difficult to scale.
Meanwhile, the FAST-AT algorithm, and in fact many recent algorithms that
improve AT, simplify the min-max based AT by replacing its maximization step
with the simple one-shot gradient sign based attack generation step. Although
easy to implement, FAST-AT lacks theoretical guarantees, and its practical
performance can be unsatisfactory, suffering from catastrophic overfitting (a
sudden collapse of robust accuracy) when training with strong adversaries.
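For reference, the AT objective and the one-shot step that FAST-AT substitutes for the inner maximization can be written as follows (a sketch in standard notation; the loss $\ell$, radius $\epsilon$, step size $\alpha$, and initialization $\delta_0$ are generic placeholders, with the exact choices given in the respective papers):
  AT:      $\min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}} \big[ \max_{\|\delta\|_\infty \le \epsilon} \ell(\theta; x+\delta, y) \big]$
  FAST-AT: $\delta = \Pi_{\|\delta\|_\infty \le \epsilon}\big( \delta_0 + \alpha \cdot \mathrm{sign}(\nabla_{\delta}\, \ell(\theta; x+\delta_0, y)) \big)$
That is, FAST-AT takes a single gradient-sign step from $\delta_0$ and projects onto the $\ell_\infty$ ball, in place of an iterative inner maximization.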
In this paper, we propose to design FAST-AT from the perspective of bi-level
optimization (BLO). We first make the key observation that the most
commonly-used algorithmic specification of FAST-AT is equivalent to using some
gradient descent-type algorithm to solve a bi-level problem involving a sign
operation. However, the discrete nature of the sign operation makes it
difficult to understand the algorithm performance. Based on the above
observation, we propose a new tractable bi-level optimization problem, design
and analyze a new set of algorithms termed Fast Bi-level AT (FAST-BAT).
FAST-BAT is capable of defending sign-based projected gradient descent (PGD)
attacks without calling any gradient sign method and explicit robust
regularization. Furthermore, we empirically show that our method outperforms
state-of-the-art FAST-AT baselines, by achieving superior model robustness
without inducing robustness catastrophic overfitting, or suffering from any
loss of standard accuracy.
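One generic way to write the bi-level reformulation sketched above (notation only; the paper's tractable lower-level objective, its constraint set $\mathcal{C}$, and the resulting FAST-BAT algorithms are specified in the full text) is:
  $\min_{\theta}\; \mathbb{E}_{(x,y)} \big[ \ell_{\mathrm{tr}}(\theta;\, x + \delta^{*}(\theta),\, y) \big] \quad \text{s.t.} \quad \delta^{*}(\theta) \in \arg\min_{\delta \in \mathcal{C}}\, g(\theta, \delta)$
Here the upper level trains the model on examples perturbed by the lower-level solution $\delta^{*}(\theta)$, and the lower-level objective $g$ is smooth (no sign operation), so the dependence of $\delta^{*}(\theta)$ on $\theta$ can be taken into account when computing the upper-level gradient.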
Related papers
- Rethinking PGD Attack: Is Sign Function Necessary? [131.6894310945647]
We present a theoretical analysis of how such sign-based update algorithm influences step-wise attack performance.
We propose a new raw gradient descent (RGD) algorithm that eliminates the use of the sign operation (see the sketch after this list).
The effectiveness of the proposed RGD algorithm has been demonstrated extensively in experiments.
arXiv Detail & Related papers (2023-12-03T02:26:58Z)
- Fast Adversarial Training with Adaptive Step Size [62.37203478589929]
We study the catastrophic overfitting phenomenon from the perspective of training instances.
We propose a simple but effective method, Adversarial Training with Adaptive Step size (ATAS)
ATAS learns an instance-wise adaptive step size that is inversely proportional to the instance's gradient norm (see the sketch after this list).
arXiv Detail & Related papers (2022-06-06T08:20:07Z)
- Robust recovery for stochastic block models [16.74630355427558]
We develop an efficient algorithm for weak recovery in a robust version of the block model.
Our results show that there is no price of robustness in the block model.
arXiv Detail & Related papers (2021-11-16T15:43:00Z)
- Boosting Fast Adversarial Training with Learnable Adversarial Initialization [79.90495058040537]
Adversarial training (AT) has been demonstrated to be effective in improving model robustness by leveraging adversarial examples for training.
To boost training efficiency, fast gradient sign method (FGSM) is adopted in fast AT methods by calculating gradient only once.
arXiv Detail & Related papers (2021-10-11T05:37:00Z)
- Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path [29.289190242826688]
We introduce a generic template for developing regret minimization algorithms in the Stochastic Shortest Path (SSP) model.
We develop two new algorithms: the first is model-free and minimax optimal under strictly positive costs.
Both algorithms admit highly sparse updates, making them computationally more efficient than all existing algorithms.
arXiv Detail & Related papers (2021-06-15T19:15:17Z)
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels.
Recent efforts combine this sparsity constraint with an additional l_infty bound on the perturbation magnitudes.
We propose a homotopy algorithm to jointly tackle the sparsity and the perturbation bound in one unified framework.
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
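The contrast drawn in the "Rethinking PGD Attack" entry above, between sign-based PGD updates and raw-gradient updates, can be illustrated with a minimal sketch (the function names and the clamp-based projection are illustrative assumptions, not the cited paper's exact algorithm):

import torch

def pgd_sign_step(delta, grad, alpha, eps):
    # Standard sign-based PGD step: move along the sign of the loss gradient,
    # then project back onto the l_inf ball of radius eps by clamping.
    return (delta + alpha * grad.sign()).clamp(-eps, eps)

def raw_gradient_step(delta, grad, alpha, eps):
    # Raw-gradient variant in the spirit of RGD: step along the gradient itself,
    # with no sign; the actual RGD scaling and parametrization follow the paper.
    return (delta + alpha * grad).clamp(-eps, eps)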
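Likewise, the instance-wise adaptive step size described in the ATAS entry can be sketched as follows (the scaling constant c and the numerical floor are illustrative assumptions; the exact rule is defined in that paper):

import torch

def adaptive_step_sizes(grad, base_alpha, c=1.0):
    # One step size per training instance, inversely proportional to that
    # instance's gradient norm (larger gradient norm -> smaller step).
    per_example_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12)
    return base_alpha * c / per_example_norm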
This list is automatically generated from the titles and abstracts of the papers in this site.