Revisiting and Advancing Fast Adversarial Training Through The Lens of
Bi-Level Optimization
- URL: http://arxiv.org/abs/2112.12376v2
- Date: Fri, 24 Dec 2021 04:06:41 GMT
- Title: Revisiting and Advancing Fast Adversarial Training Through The Lens of
Bi-Level Optimization
- Authors: Yihua Zhang, Guanhua Zhang, Prashant Khanduri, Mingyi Hong, Shiyu
Chang, Sijia Liu
- Abstract summary: We propose a new tractable bi-level optimization problem, and design and analyze a new set of algorithms termed Fast Bi-level AT (FAST-BAT).
FAST-BAT is capable of defending sign-based projected gradient descent (PGD) attacks without calling any gradient sign method or explicit robust regularization.
- Score: 60.72410937614299
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial training (AT) has become a widely recognized defense mechanism to
improve the robustness of deep neural networks against adversarial attacks. It
solves a min-max optimization problem, where the minimizer (i.e., defender)
seeks a robust model to minimize the worst-case training loss in the presence
of adversarial examples crafted by the maximizer (i.e., attacker). However, the
min-max nature makes AT computationally intensive and thus difficult to scale.
Meanwhile, the FAST-AT algorithm, and in fact many recent algorithms that
improve AT, simplify the min-max based AT by replacing its maximization step
with the simple one-shot gradient sign based attack generation step. Although
easy to implement, FAST-AT lacks theoretical guarantees, and its practical
performance can be unsatisfactory, suffering from catastrophic overfitting (a
sudden collapse of robust accuracy) when training with strong adversaries.
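For reference, the AT objective and the one-shot step that FAST-AT substitutes for the inner maximization can be written as follows (a sketch in standard notation; the loss $\ell$, radius $\epsilon$, step size $\alpha$, and initialization $\delta_0$ are generic placeholders, with the exact choices given in the respective papers):
  AT:      $\min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}} \big[ \max_{\|\delta\|_\infty \le \epsilon} \ell(\theta; x+\delta, y) \big]$
  FAST-AT: $\delta = \Pi_{\|\delta\|_\infty \le \epsilon}\big( \delta_0 + \alpha \cdot \mathrm{sign}(\nabla_{\delta}\, \ell(\theta; x+\delta_0, y)) \big)$
That is, FAST-AT takes a single gradient-sign step from $\delta_0$ and projects onto the $\ell_\infty$ ball, in place of an iterative inner maximization.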
In this paper, we propose to design FAST-AT from the perspective of bi-level
optimization (BLO). We first make the key observation that the most
commonly-used algorithmic specification of FAST-AT is equivalent to using some
gradient descent-type algorithm to solve a bi-level problem involving a sign
operation. However, the discrete nature of the sign operation makes it
difficult to understand the algorithm performance. Based on the above
observation, we propose a new tractable bi-level optimization problem, design
and analyze a new set of algorithms termed Fast Bi-level AT (FAST-BAT).
FAST-BAT is capable of defending sign-based projected gradient descent (PGD)
attacks without calling any gradient sign method and explicit robust
regularization. Furthermore, we empirically show that our method outperforms
state-of-the-art FAST-AT baselines, by achieving superior model robustness
without inducing robustness catastrophic overfitting, or suffering from any
loss of standard accuracy.
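One generic way to write the bi-level reformulation sketched above (notation only; the paper's tractable lower-level objective, its constraint set $\mathcal{C}$, and the resulting FAST-BAT algorithms are specified in the full text) is:
  $\min_{\theta}\; \mathbb{E}_{(x,y)} \big[ \ell_{\mathrm{tr}}(\theta;\, x + \delta^{*}(\theta),\, y) \big] \quad \text{s.t.} \quad \delta^{*}(\theta) \in \arg\min_{\delta \in \mathcal{C}}\, g(\theta, \delta)$
Here the upper level trains the model on examples perturbed by the lower-level solution $\delta^{*}(\theta)$, and the lower-level objective $g$ is smooth (no sign operation), so the dependence of $\delta^{*}(\theta)$ on $\theta$ can be taken into account when computing the upper-level gradient.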
Related papers
- Rethinking PGD Attack: Is Sign Function Necessary? [131.6894310945647]
We present a theoretical analysis of how such sign-based update algorithm influences step-wise attack performance.
We propose a new raw gradient descent (RGD) algorithm that eliminates the use of the sign operation (see the sketch after this list).
The effectiveness of the proposed RGD algorithm has been demonstrated extensively in experiments.
arXiv Detail & Related papers (2023-12-03T02:26:58Z)
- Fast Adversarial Training with Adaptive Step Size [62.37203478589929]
We study the catastrophic overfitting phenomenon from the perspective of training instances.
We propose a simple but effective method, Adversarial Training with Adaptive Step size (ATAS)
ATAS learns an instance-wise adaptive step size that is inversely proportional to the instance's gradient norm (see the sketch after this list).
arXiv Detail & Related papers (2022-06-06T08:20:07Z)
- Robust recovery for stochastic block models [16.74630355427558]
We develop an efficient algorithm for weak recovery in a robust version of the block model.
Our results show that there is no price of robustness in the block model.
arXiv Detail & Related papers (2021-11-16T15:43:00Z)
- Boosting Fast Adversarial Training with Learnable Adversarial Initialization [79.90495058040537]
Adversarial training (AT) has been demonstrated to be effective in improving model robustness by leveraging adversarial examples for training.
To boost training efficiency, fast gradient sign method (FGSM) is adopted in fast AT methods by calculating gradient only once.
arXiv Detail & Related papers (2021-10-11T05:37:00Z)
- Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path [29.289190242826688]
We introduce a generic template for developing regret minimization algorithms in the Stochastic Shortest Path (SSP) model.
We develop two new algorithms: the first is model-free and minimax optimal under strictly positive costs.
Both algorithms admit highly sparse updates, making them computationally more efficient than all existing algorithms.
arXiv Detail & Related papers (2021-06-15T19:15:17Z)
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels.
Recent efforts combine this sparsity constraint with an additional l_infty bound on the perturbation magnitudes.
We propose a homotopy algorithm to jointly tackle the sparsity and the perturbation bound in one unified framework.
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
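The contrast drawn in the "Rethinking PGD Attack" entry above, between sign-based PGD updates and raw-gradient updates, can be illustrated with a minimal sketch (the function names and the clamp-based projection are illustrative assumptions, not the cited paper's exact algorithm):

import torch

def pgd_sign_step(delta, grad, alpha, eps):
    # Standard sign-based PGD step: move along the sign of the loss gradient,
    # then project back onto the l_inf ball of radius eps by clamping.
    return (delta + alpha * grad.sign()).clamp(-eps, eps)

def raw_gradient_step(delta, grad, alpha, eps):
    # Raw-gradient variant in the spirit of RGD: step along the gradient itself,
    # with no sign; the actual RGD scaling and parametrization follow the paper.
    return (delta + alpha * grad).clamp(-eps, eps)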
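Likewise, the instance-wise adaptive step size described in the ATAS entry can be sketched as follows (the scaling constant c and the numerical floor are illustrative assumptions; the exact rule is defined in that paper):

import torch

def adaptive_step_sizes(grad, base_alpha, c=1.0):
    # One step size per training instance, inversely proportional to that
    # instance's gradient norm (larger gradient norm -> smaller step).
    per_example_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12)
    return base_alpha * c / per_example_norm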
This list is automatically generated from the titles and abstracts of the papers in this site.