Understanding Frank-Wolfe Adversarial Training
- URL: http://arxiv.org/abs/2012.12368v2
- Date: Mon, 8 Feb 2021 17:38:24 GMT
- Title: Understanding Frank-Wolfe Adversarial Training
- Authors: Theodoros Tsiligkaridis, Jay Roberts
- Abstract summary: Adversarial Training (AT) is a technique that approximately solves a robust optimization problem to minimize the worst-case loss.
A Frank-Wolfe adversarial training approach is presented and is shown to provide a level of robustness competitive with PGD-AT.
- Score: 1.2183405753834557
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks are easily fooled by small perturbations known as
adversarial attacks. Adversarial Training (AT) is a technique that
approximately solves a robust optimization problem to minimize the worst-case
loss and is widely regarded as the most effective defense against such attacks.
While projected gradient descent (PGD) has received most attention for
approximately solving the inner maximization of AT, Frank-Wolfe (FW)
optimization is projection-free and can be adapted to any $\ell_p$ norm. A
Frank-Wolfe adversarial training approach is presented and is shown to provide
a level of robustness competitive with PGD-AT for a variety of architectures,
attacks, and datasets. Exploiting a representation of the FW attack we are able
to derive the geometric insight that: The larger the $\ell_2$ norm of an
$\ell_\infty$ attack is, the less loss gradient variation there is. It is then
experimentally demonstrated that $\ell_\infty$ attacks against robust models
achieve near the maximal possible $\ell_2$ distortion, providing a new lens
into the specific type of regularization that AT bestows. Using FW optimization
in conjunction with robust models, we are able to generate sparse
human-interpretable counterfactual explanations without relying on expensive
$\ell_1$ projections.
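To make the projection-free inner step concrete, note that over the $\ell_\infty$ ball of radius $\epsilon$ around a clean input $x$, the Frank-Wolfe linear maximization oracle reduces to a sign step. A hedged sketch of the update (with a generic step-size schedule $\gamma_t$, which may differ from the schedule used in the paper) is

$$ v_t = x + \epsilon\,\mathrm{sign}(g_t), \qquad x_{t+1} = (1-\gamma_t)\,x_t + \gamma_t\, v_t, \qquad g_t = \nabla_x \mathcal{L}(x_t, y), $$

so the final perturbation is a weighted sum of scaled sign vectors, $x_T - x = \epsilon \sum_t c_t\,\mathrm{sign}(g_t)$ with $c_t \ge 0$ and $\sum_t c_t \le 1$. Its $\ell_2$ norm approaches the maximal value $\epsilon\sqrt{d}$ precisely when the gradient signs agree across iterations, which is one way to read the geometric insight stated above.

The same update is easy to sketch in code. The snippet below is a minimal illustration, not the authors' implementation: the classifier `model`, the cross-entropy loss, the $2/(t+2)$ step size, and the final clamp to $[0,1]$ pixel values are all assumptions.

```python
import torch
import torch.nn.functional as F

def fw_linf_attack(model, x, y, eps=8 / 255, steps=20):
    """Illustrative Frank-Wolfe attack over the l_inf ball of radius eps around x."""
    x_adv = x.clone().detach()
    for t in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        # Linear maximization oracle for the l_inf ball: a pure sign step,
        # so no projection is needed at any point.
        v = x + eps * grad.sign()
        # Classic 2/(t+2) Frank-Wolfe step size; the iterate stays feasible
        # because it is a convex combination of points inside the ball.
        gamma = 2.0 / (t + 2.0)
        x_adv = ((1.0 - gamma) * x_adv + gamma * v).detach()
    # Clamp to the valid pixel range (assumes inputs scaled to [0, 1]).
    return x_adv.clamp(0.0, 1.0)
```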
Related papers
- Deep Adversarial Defense Against Multilevel-Lp Attacks [5.604868766260297]
This paper introduces a computationally efficient multilevel $\ell_p$ defense, called the Efficient Robust Mode Connectivity (EMRC) method.
Similar to analytical continuation approaches used in continuous optimization, the method blends two $p$-specific adversarially optimal models.
We present experiments demonstrating that our approach performs better on various attacks as compared to AT-$\ell_\infty$, E-AT, and MSD.
arXiv Detail & Related papers (2024-07-12T13:30:00Z) - Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks generate significant interest for black-box applications.
Existing works essentially directly optimize the single-level objective w.r.t. surrogate model.
We propose a bilevel optimization paradigm, which explicitly reforms the nested relationship between the Upper-Level (UL) pseudo-victim attacker and the Lower-Level (LL) surrogate attacker.
arXiv Detail & Related papers (2024-06-04T07:45:27Z) - $σ$-zero: Gradient-based Optimization of $\ell_0$-norm Adversarial Examples [14.17412770504598]
We show that $\ell_0$-norm constraints can be used to craft sparse input perturbations.
We propose a novel $\ell_0$-norm attack called $\sigma$-zero.
It outperforms all competing adversarial attacks in terms of success, size, and efficiency.
arXiv Detail & Related papers (2024-02-02T20:08:11Z) - Towards Alternative Techniques for Improving Adversarial Robustness:
Analysis of Adversarial Training at a Spectrum of Perturbations [5.18694590238069]
Adversarial training (AT) and its variants have spearheaded progress in improving neural network robustness to adversarial perturbations.
We focus on models trained on a spectrum of $\epsilon$ values.
We identify alternative improvements to AT that otherwise would not have been apparent at a single $\epsilon$.
arXiv Detail & Related papers (2022-06-13T22:01:21Z) - Towards Compositional Adversarial Robustness: Generalizing Adversarial
Training to Composite Semantic Perturbations [70.05004034081377]
We first propose a novel method for generating composite adversarial examples.
Our method can find the optimal attack composition by utilizing component-wise projected gradient descent.
We then propose generalized adversarial training (GAT) to extend model robustness from the $\ell_p$-ball to composite semantic perturbations.
arXiv Detail & Related papers (2022-02-09T02:41:56Z) - Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels.
Recent efforts combine this sparsity constraint with an additional $\ell_\infty$ bound on the perturbation magnitudes.
We propose a homotopy algorithm that jointly tackles the sparsity and perturbation-magnitude constraints in one unified framework.
arXiv Detail & Related papers (2021-06-10T20:11:36Z) - PDPGD: Primal-Dual Proximal Gradient Descent Adversarial Attack [92.94132883915876]
State-of-the-art deep neural networks are sensitive to small input perturbations.
Many defence methods have been proposed that attempt to improve robustness to adversarial noise.
However, evaluating adversarial robustness has proven to be extremely challenging.
arXiv Detail & Related papers (2021-06-03T01:45:48Z) - Patch-wise++ Perturbation for Adversarial Targeted Attacks [132.58673733817838]
We propose a patch-wise iterative method (PIM) aimed at crafting adversarial examples with high transferability.
Specifically, we introduce an amplification factor to the step size in each iteration, and one pixel's overall gradient overflowing the $\epsilon$-constraint is properly assigned to its surrounding regions (a rough sketch of this step appears after this list).
Compared with the current state-of-the-art attack methods, we significantly improve the success rate by 35.9% for defense models and 32.7% for normally trained models.
arXiv Detail & Related papers (2020-12-31T08:40:42Z) - Towards Defending Multiple $\ell_p$-norm Bounded Adversarial
Perturbations via Gated Batch Normalization [120.99395850108422]
Existing adversarial defenses typically improve model robustness against individual specific perturbations.
Some recent methods improve model robustness against adversarial attacks in multiple $\ell_p$ balls, but their performance against each perturbation type is still far from satisfactory.
We propose Gated Batch Normalization (GBN) to adversarially train a perturbation-invariant predictor for defending against multiple $\ell_p$-bounded adversarial perturbations.
arXiv Detail & Related papers (2020-12-03T02:26:01Z) - Stronger and Faster Wasserstein Adversarial Attacks [25.54761631515683]
Deep models are vulnerable to "small, imperceptible" perturbations known as adversarial attacks.
We develop an exact yet efficient projection operator to enable a stronger projected gradient attack.
We also show that the Frank-Wolfe method equipped with a suitable linear minimization oracle works extremely fast under Wasserstein constraints.
arXiv Detail & Related papers (2020-08-06T21:36:12Z)
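As a closing illustration of the patch-wise step summarized in the Patch-wise++ entry above (see the pointer in that entry), here is a heavily hedged sketch. It is not the authors' method: the amplification factor `beta`, the uniform averaging kernel standing in for their project kernel, and the final re-clipping are assumptions made purely for illustration.

```python
import torch
import torch.nn.functional as F

def patchwise_step(x_adv, x, grad, eps, alpha, beta=10.0, kernel_size=3):
    """One illustrative patch-wise style update (not the Patch-wise++ paper's code)."""
    # Amplified sign step; beta plays the role of the amplification factor.
    candidate = x_adv + beta * alpha * grad.sign()
    # Part of the update that stays inside the eps-ball around the clean input x.
    clipped = torch.max(torch.min(candidate, x + eps), x - eps)
    # The portion that overflowed the eps-constraint ...
    overflow = candidate - clipped
    # ... is redistributed to neighbouring pixels with a uniform depthwise kernel
    # (a plain average is an assumption; the paper uses a dedicated project kernel).
    c = x.shape[1]
    kernel = torch.ones(c, 1, kernel_size, kernel_size,
                        device=x.device, dtype=x.dtype) / (kernel_size ** 2)
    spread = F.conv2d(overflow, kernel, padding=kernel_size // 2, groups=c)
    # Re-clip so the final iterate respects the eps-constraint.
    return torch.max(torch.min(clipped + spread, x + eps), x - eps)
```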