Toward Adversarial Robustness via Semi-supervised Robust Training
- URL: http://arxiv.org/abs/2003.06974v3
- Date: Tue, 16 Jun 2020 01:12:53 GMT
- Title: Toward Adversarial Robustness via Semi-supervised Robust Training
- Authors: Yiming Li, Baoyuan Wu, Yan Feng, Yanbo Fan, Yong Jiang, Zhifeng Li,
Shutao Xia
- Abstract summary: Adversarial examples have been shown to be a severe threat to deep neural networks (DNNs).
We propose a novel defense method, robust training (RT), which jointly minimizes two separate risks ($R_{stand}$ and $R_{rob}$).
- Score: 93.36310070269643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples have been shown to be a severe threat to deep neural
networks (DNNs). One of the most effective adversarial defense methods is
adversarial training (AT) through minimizing the adversarial risk $R_{adv}$,
which encourages both the benign example $x$ and its adversarially perturbed
neighborhoods within the $\ell_{p}$-ball to be predicted as the ground-truth
label. In this work, we propose a novel defense method, robust training
(RT), which jointly minimizes two separate risks ($R_{stand}$ and $R_{rob}$),
defined with respect to the benign example and its neighborhoods, respectively.
The motivation is to explicitly and jointly enhance the accuracy and the
adversarial robustness. We prove that $R_{adv}$ is upper-bounded by $R_{stand}
+ R_{rob}$, which implies that RT has a similar effect to AT. Intuitively,
minimizing the standard risk enforces the benign example to be correctly
predicted, and the robust risk minimization encourages the predictions of the
neighbor examples to be consistent with the prediction of the benign example.
Besides, since $R_{rob}$ is independent of the ground-truth label, RT can be
naturally extended to the semi-supervised mode ($i.e.$, SRT) to further
enhance the adversarial robustness. Moreover, we extend the $\ell_{p}$-bounded
neighborhood to a general case, which covers different types of perturbations,
such as the pixel-wise ($i.e.$, $x + \delta$) or the spatial perturbation
($i.e.$, $AX + b$). Extensive experiments on benchmark datasets not only
verify the superiority of the proposed SRT method over state-of-the-art methods
for defending against pixel-wise or spatial perturbations separately, but also
demonstrate its robustness to both perturbations simultaneously. The code for
reproducing main results is available at
\url{https://github.com/THUYimingLi/Semi-supervised_Robust_Training}.
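To make the joint objective concrete, below is a minimal PyTorch-style sketch of $R_{stand} + R_{rob}$, assuming a KL-divergence consistency term for the robust risk and a PGD search over the pixel-wise $\ell_{\infty}$ neighborhood; the function names, hyperparameters, and the KL instantiation are illustrative assumptions and may differ from the released implementation linked above.

```python
import torch
import torch.nn.functional as F


def rt_loss(model, x_labeled, y, x_unlabeled=None,
            eps=8 / 255, alpha=2 / 255, steps=10, lam=1.0):
    """Sketch of the joint objective R_stand + lam * R_rob.

    R_stand is cross-entropy on benign labeled examples. R_rob asks that
    predictions on perturbed neighbours stay consistent with the prediction
    on the benign example; it needs no labels, so unlabeled data can be
    folded in (the semi-supervised variant, SRT). Inputs assumed in [0, 1].
    """
    # R_stand: standard risk on the benign labeled examples.
    r_stand = F.cross_entropy(model(x_labeled), y)

    # Pool labeled and (optionally) unlabeled inputs for the robust risk.
    x_all = x_labeled if x_unlabeled is None else torch.cat([x_labeled, x_unlabeled], dim=0)
    with torch.no_grad():
        p_benign = F.softmax(model(x_all), dim=1)  # reference predictions

    # Pixel-wise neighbourhood x + delta inside an l_inf ball, searched by
    # PGD to (approximately) maximise the consistency loss.
    delta = torch.empty_like(x_all).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        log_p_adv = F.log_softmax(model((x_all + delta).clamp(0, 1)), dim=1)
        inner = F.kl_div(log_p_adv, p_benign, reduction="batchmean")
        grad, = torch.autograd.grad(inner, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)

    # R_rob: KL divergence between predictions on the worst-case neighbour
    # found above and the benign predictions.
    log_p_adv = F.log_softmax(model((x_all + delta.detach()).clamp(0, 1)), dim=1)
    r_rob = F.kl_div(log_p_adv, p_benign, reduction="batchmean")

    return r_stand + lam * r_rob
```

Because the consistency term never uses the label $y$, the same $R_{rob}$ can be evaluated on unlabeled inputs, which is what enables the semi-supervised extension (SRT); the spatial neighborhood ($AX + b$) could be handled analogously by optimizing affine parameters (e.g., via grid sampling) instead of $\delta$.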
Related papers
- $σ$-zero: Gradient-based Optimization of $\ell_0$-norm Adversarial Examples [14.17412770504598]
We show that $\ell_\infty$-norm constraints can be used to craft input perturbations.
We propose a novel $\ell_0$-norm attack called $\sigma$-zero.
It outperforms all competing adversarial attacks in terms of success rate, perturbation size, and efficiency.
arXiv Detail & Related papers (2024-02-02T20:08:11Z) - Adaptive Smoothness-weighted Adversarial Training for Multiple
Perturbations with Its Stability Analysis [39.90487314421744]
Adversarial Training (AT) has been demonstrated to be one of the most effective methods against adversarial examples.
Adversarial training for multiple perturbations (ATMP) is proposed to generalize adversarial robustness over different perturbation types.
We develop the stability-based excess risk bounds and propose adaptive-weighted adversarial training for multiple perturbations.
arXiv Detail & Related papers (2022-10-02T15:42:34Z) - Adversarially Robust Learning with Tolerance [8.658596218544774]
We study the problem of tolerant adversarial PAC learning with respect to metric perturbation sets.
We show that a variant of the natural perturb-and-smooth algorithm PAC learns any hypothesis class $\mathcal{H}$ with VC dimension $v$ in the $\gamma$-tolerant adversarial setting.
We additionally propose an alternative learning method which yields sample bounds with only linear dependence on the doubling dimension.
arXiv Detail & Related papers (2022-03-02T03:50:16Z) - Towards Compositional Adversarial Robustness: Generalizing Adversarial
Training to Composite Semantic Perturbations [70.05004034081377]
We first propose a novel method for generating composite adversarial examples.
Our method can find the optimal attack composition by utilizing component-wise projected gradient descent.
We then propose generalized adversarial training (GAT) to extend model robustness from the $\ell_p$-ball to composite semantic perturbations.
arXiv Detail & Related papers (2022-02-09T02:41:56Z) - Linear Contextual Bandits with Adversarial Corruptions [91.38793800392108]
We study the linear contextual bandit problem in the presence of adversarial corruption.
We present a variance-aware algorithm that is adaptive to the level of adversarial contamination $C$.
arXiv Detail & Related papers (2021-10-25T02:53:24Z) - PDPGD: Primal-Dual Proximal Gradient Descent Adversarial Attack [92.94132883915876]
State-of-the-art deep neural networks are sensitive to small input perturbations.
Many defence methods have been proposed that attempt to improve robustness to adversarial noise.
evaluating adversarial robustness has proven to be extremely challenging.
arXiv Detail & Related papers (2021-06-03T01:45:48Z) - Towards Defending Multiple $\ell_p$-norm Bounded Adversarial
Perturbations via Gated Batch Normalization [120.99395850108422]
Existing adversarial defenses typically improve model robustness against individual specific perturbations.
Some recent methods improve model robustness against adversarial attacks in multiple $\ell_p$ balls, but their performance against each perturbation type is still far from satisfactory.
We propose Gated Batch Normalization (GBN) to adversarially train a perturbation-invariant predictor for defending against multiple $\ell_p$-bounded adversarial perturbations.
arXiv Detail & Related papers (2020-12-03T02:26:01Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian
Classification [54.22421582955454]
We provide the first result of the optimal minimax guarantees for the excess risk for adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z) - Are L2 adversarial examples intrinsically different? [14.77179227968466]
We unravel the properties that can intrinsically differentiate adversarial examples and normal inputs through theoretical analysis.
We achieve a recovered classification accuracy of up to 99% on MNIST, 89% on CIFAR, and 87% on ImageNet subsets against $L_2$ attacks.
arXiv Detail & Related papers (2020-02-28T03:42:52Z)