Learning to Generate Noise for Multi-Attack Robustness
- URL: http://arxiv.org/abs/2006.12135v4
- Date: Thu, 24 Jun 2021 18:41:57 GMT
- Title: Learning to Generate Noise for Multi-Attack Robustness
- Authors: Divyam Madaan, Jinwoo Shin, Sung Ju Hwang
- Abstract summary: Adversarial learning has emerged as one of the successful techniques to circumvent the susceptibility of existing methods to adversarial perturbations.
In safety-critical applications, this renders these methods inadequate, as an attacker can adopt diverse adversaries to deceive the system.
We propose a novel meta-learning framework that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks.
- Score: 126.23656251512762
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial learning has emerged as one of the successful techniques to
circumvent the susceptibility of existing methods to adversarial
perturbations. However, the majority of existing defense methods are tailored
to defend against a single category of adversarial perturbation (e.g.
$\ell_\infty$-attack). In safety-critical applications, this renders these
methods inadequate, as the attacker can adopt diverse adversaries to deceive the
system. Moreover, training on multiple perturbations simultaneously
significantly increases the computational overhead during training. To address
these challenges, we propose a novel meta-learning framework that explicitly
learns to generate noise to improve the model's robustness against multiple
types of attacks. Its key component is the Meta Noise Generator (MNG), which outputs
optimal noise to stochastically perturb a given sample such that it helps
lower the error on diverse adversarial perturbations. By utilizing samples
generated by MNG, we train a model by enforcing the label consistency across
multiple perturbations. We validate the robustness of models trained by our
scheme on various datasets and against a wide variety of perturbations,
demonstrating that it significantly outperforms the baselines across multiple
perturbations with a marginal computational cost.
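To make the training recipe in the abstract concrete, here is a minimal, illustrative PyTorch sketch of the described scheme: a noise generator perturbs each sample, adversarial views are created from the perturbed sample, and a label-consistency term ties all views to the clean prediction. The architectures, the choice of FGSM attacks, and the joint (rather than meta-level) update of the generator are placeholder assumptions, not the authors' implementation.

```python
# Minimal, illustrative sketch of the scheme in the abstract (NOT the authors' code):
# a noise generator perturbs inputs, the classifier is trained on adversarial views of
# those inputs, and a consistency loss ties clean, noisy, and adversarial predictions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseGenerator(nn.Module):
    """Maps a sample to a bounded additive noise pattern of the same shape."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, dim))
    def forward(self, x):
        return 0.1 * torch.tanh(self.net(x))

def fgsm(model, x, y, eps):
    """One-step L_inf attack used here as a stand-in for 'diverse perturbations'."""
    x = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).detach()

def train_step(model, mng, x, y, opt_model, opt_mng, eps_list=(0.03, 0.1)):
    x_noisy = x + mng(x)                                  # generator-perturbed sample
    views = [x, x_noisy] + [fgsm(model, x_noisy, y, e) for e in eps_list]
    logits = [model(v) for v in views]
    ce = sum(F.cross_entropy(l, y) for l in logits) / len(logits)
    # label-consistency term: every view should predict like the clean view
    p_clean = F.log_softmax(logits[0].detach(), dim=1)
    cons = sum(F.kl_div(F.log_softmax(l, dim=1), p_clean,
                        reduction="batchmean", log_target=True) for l in logits[1:])
    loss = ce + cons
    opt_model.zero_grad()
    opt_mng.zero_grad()
    loss.backward()
    opt_model.step()
    opt_mng.step()                                        # the paper meta-learns MNG; joint update here
    return loss.item()

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
mng = NoiseGenerator(32)
opt_m = torch.optim.SGD(model.parameters(), lr=0.1)
opt_g = torch.optim.Adam(mng.parameters(), lr=1e-3)
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
print(train_step(model, mng, x, y, opt_m, opt_g))
```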
Related papers
- DAT: Improving Adversarial Robustness via Generative Amplitude Mix-up in Frequency Domain [23.678658814438855]
Adversarial training (AT) was developed to protect deep neural networks (DNNs) from adversarial attacks.
Recent studies show that adversarial attacks disproportionately impact the patterns within the phase of the sample's frequency spectrum.
We propose an optimized Adversarial Amplitude Generator (AAG) to achieve a better tradeoff between improving the model's robustness and retaining phase patterns.
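The summary suggests mixing the amplitude spectra of two samples while retaining the phase of the original; a generic NumPy sketch of that frequency-domain operation follows (the paper's AAG is a learned generator, whereas `lam` here is a fixed mixing weight chosen for illustration).

```python
# Generic amplitude mix-up in the frequency domain (illustrative only).
import numpy as np

def amplitude_mixup(x, x_ref, lam=0.5):
    """Mix the amplitude spectra of x and x_ref while keeping x's phase."""
    fx, fref = np.fft.fft2(x), np.fft.fft2(x_ref)
    amp = lam * np.abs(fx) + (1 - lam) * np.abs(fref)   # mixed amplitude
    phase = np.angle(fx)                                # phase of the original sample
    mixed = amp * np.exp(1j * phase)
    return np.real(np.fft.ifft2(mixed))

img = np.random.rand(32, 32)
ref = np.random.rand(32, 32)
aug = amplitude_mixup(img, ref, lam=0.7)
print(aug.shape)
```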
arXiv Detail & Related papers (2024-10-16T07:18:36Z)
- Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails.
We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses.
C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
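"Continuous attacks" typically means perturbing token embeddings rather than discrete tokens; below is a toy, hedged sketch of such an embedding-space attack with a few signed-gradient steps (the tiny model, pooling, and bounds are assumptions, not the paper's setup).

```python
# Toy "continuous attack": perturb the token-embedding matrix with a few
# signed-gradient steps inside an eps-ball, instead of editing discrete tokens.
import torch
import torch.nn as nn
import torch.nn.functional as F

embed = nn.Embedding(1000, 16)
head = nn.Linear(16, 2)                          # stand-in for an LLM + safety classifier

def continuous_attack(token_ids, target, eps=0.05, alpha=0.01, steps=10):
    e = embed(token_ids).detach()
    delta = torch.zeros_like(e, requires_grad=True)
    for _ in range(steps):
        logits = head((e + delta).mean(dim=1))   # pooled representation
        loss = F.cross_entropy(logits, target)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return e + delta.detach()                    # adversarial embeddings for training

ids = torch.randint(0, 1000, (4, 12))
adv_embeddings = continuous_attack(ids, torch.zeros(4, dtype=torch.long))
print(adv_embeddings.shape)
```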
arXiv Detail & Related papers (2024-05-24T14:20:09Z)
- Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
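As a rough, hedged illustration of framing an attack as a sequence of decisions over perturbation granularities, the toy greedy loop below tries character-, word-, and phrase-level edits at each step and keeps the highest-scoring one (the paper's formulation and scoring differ; everything here is a placeholder).

```python
# Toy greedy version of a "sequential decision" attack over several granularities.
# The edit generators and the scoring function are placeholders.
def word_edits(text):   return [text.replace(w, w.upper(), 1) for w in text.split()[:3]]
def char_edits(text):   return [text[:i] + "*" + text[i + 1:] for i in range(min(3, len(text)))]
def phrase_edits(text): return [text + " irrelevant trailing phrase"]

def greedy_attack(text, score_fn, steps=3):
    for _ in range(steps):
        candidates = word_edits(text) + char_edits(text) + phrase_edits(text)
        best = max(candidates, key=score_fn)
        if score_fn(best) <= score_fn(text):
            break
        text = best
    return text

# dummy score standing in for "drop in the ranking model's relevance estimate"
toy_score = lambda t: t.count("*") + sum(w.isupper() for w in t.split())
print(greedy_attack("adversarial attacks on ranking models", toy_score))
```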
arXiv Detail & Related papers (2024-04-02T02:08:29Z)
- Robust Deep Learning Models Against Semantic-Preserving Adversarial Attack [3.7264705684737893]
Deep learning models can be fooled by small $l_p$-norm adversarial perturbations as well as by natural, attribute-level perturbations.
We propose a novel attack mechanism named Semantic-Preserving Adversarial (SPA) attack, which can then be used to enhance adversarial training.
arXiv Detail & Related papers (2023-04-08T08:28:36Z)
- Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may rather stem from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
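A bare-bones illustration of "autoencoder-based hybrid discriminative-generative training" is a single encoder feeding both a classification head and a reconstruction decoder; the sketch below shows only that skeleton and omits the paper's nuisance variables and information-bottleneck terms.

```python
# Bare-bones hybrid discriminative-generative objective: one encoder feeds both a
# classifier (discriminative) and a decoder (generative/reconstruction).
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridAE(nn.Module):
    def __init__(self, dim=32, latent=8, classes=10):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, dim))
        self.cls = nn.Linear(latent, classes)
    def forward(self, x):
        z = self.enc(x)
        return self.cls(z), self.dec(z)

model = HybridAE()
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
logits, recon = model(x)
loss = F.cross_entropy(logits, y) + 0.5 * F.mse_loss(recon, x)   # weight is arbitrary
loss.backward()
print(float(loss))
```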
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
- Threat Model-Agnostic Adversarial Defense using Diffusion Models [14.603209216642034]
Deep Neural Networks (DNNs) are highly sensitive to imperceptible malicious perturbations, known as adversarial attacks.
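The summary gives no details of the method, but the title points to using diffusion models as a defense; a common recipe of that kind is input purification, sketched below with a placeholder denoiser (this is an assumption about the general approach, not the paper's algorithm).

```python
# Sketch of diffusion-based input purification: partially noise the input as in the
# forward diffusion process, then denoise it before classification. `denoise` is a
# placeholder; a real defense would run a trained reverse-diffusion sampler.
import torch

def purify(x, denoise, t=0.25):
    """Forward-noise x with mixing weight t, then denoise it back."""
    noised = (1 - t) ** 0.5 * x + t ** 0.5 * torch.randn_like(x)
    return denoise(noised)

denoise = lambda z: z.clamp(0, 1)           # stand-in for a pretrained diffusion model
x_adv = torch.rand(1, 3, 32, 32)            # adversarially perturbed image (dummy)
x_clean_est = purify(x_adv, denoise)
# a classifier would then be evaluated on the purified input x_clean_est
print(x_clean_est.shape)
```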
arXiv Detail & Related papers (2022-07-17T06:50:48Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analysis on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
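A toy rendering of momentum-based sampling of a sequence of adversarial examples is sketched below; the leapfrog-style loop and the L-infinity projection are illustrative, and the paper's accumulated-momentum corrections and accept/reject steps are omitted.

```python
# Toy momentum/leapfrog-style sampler that produces a *sequence* of adversarial
# examples around an input, loosely in the spirit of HMC-based attacks.
import torch
import torch.nn as nn
import torch.nn.functional as F

def hmc_like_attack(model, x, y, eps=0.03, step=0.01, leapfrog=5, samples=3):
    advs, momentum = [], torch.zeros_like(x)
    x_cur = x.clone()
    for _ in range(samples):
        momentum = 0.9 * momentum + 0.1 * torch.randn_like(x)   # refresh momentum
        for _ in range(leapfrog):
            x_cur.requires_grad_(True)
            loss = F.cross_entropy(model(x_cur), y)
            grad, = torch.autograd.grad(loss, x_cur)
            momentum = momentum + step * grad                   # "kinetic" update
            x_cur = (x_cur + step * momentum.sign()).detach()
            x_cur = x + (x_cur - x).clamp(-eps, eps)            # project to the eps-ball
        advs.append(x_cur.clone())
    return advs

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
x, y = torch.rand(4, 3, 8, 8), torch.randint(0, 10, (4,))
print(len(hmc_like_attack(model, x, y)))
```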
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
- DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles [20.46399318111058]
Adversarial attacks can mislead CNN models with small perturbations, which can effectively transfer between different models trained on the same dataset.
We propose DVERGE, which isolates the adversarial vulnerability in each sub-model by distilling non-robust features.
The novel diversity metric and training procedure enable DVERGE to achieve higher robustness against transfer attacks.
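As this summary describes it, each sub-model is trained on examples that carry another sub-model's non-robust features; the compressed sketch below distills one sub-model's features of a target image into the pixels of a source image and trains the other sub-model to still predict the source label (layer choice and attack settings are simplified assumptions).

```python
# Compressed sketch of DVERGE-style vulnerability diversification (simplified):
# distill sub-model B's features of one image into the pixels of another, then train
# sub-model A to ignore those "non-robust features" and predict the visual label.
import torch
import torch.nn as nn
import torch.nn.functional as F

def feature_distill(feat_j, x_src, x_tgt, eps=0.07, steps=10, lr=0.02):
    """Find z close to x_src whose features under feat_j match those of x_tgt."""
    z = x_src.clone()
    target_feat = feat_j(x_tgt).detach()
    for _ in range(steps):
        z.requires_grad_(True)
        loss = F.mse_loss(feat_j(z), target_feat)
        grad, = torch.autograd.grad(loss, z)
        z = (z - lr * grad.sign()).detach()
        z = x_src + (z - x_src).clamp(-eps, eps)
    return z

# two tiny sub-models; the first Linear acts as the "feature layer" for distillation
sub_a = nn.Sequential(nn.Flatten(), nn.Linear(48, 32), nn.ReLU(), nn.Linear(32, 10))
sub_b = nn.Sequential(nn.Flatten(), nn.Linear(48, 32), nn.ReLU(), nn.Linear(32, 10))
feat_b = nn.Sequential(*list(sub_b.children())[:2])     # sub_b's feature extractor

x_src, y_src = torch.rand(8, 3, 4, 4), torch.randint(0, 10, (8,))
x_tgt = torch.rand(8, 3, 4, 4)

# distilled inputs look like x_src but carry sub_b's non-robust features of x_tgt;
# training sub_a to still predict y_src pushes its vulnerabilities away from sub_b's
x_distilled = feature_distill(feat_b, x_src, x_tgt)
loss = F.cross_entropy(sub_a(x_distilled), y_src)
loss.backward()
print(float(loss))
```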
arXiv Detail & Related papers (2020-09-30T14:57:35Z)
- Learning to Learn from Mistakes: Robust Optimization for Adversarial Noise [1.976652238476722]
We train a meta-optimizer which learns to robustly optimize a model using adversarial examples and is able to transfer the knowledge learned to new models.
Experimental results show the meta-optimizer is consistent across different architectures and data sets, suggesting it is possible to automatically patch adversarial vulnerabilities.
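A toy flavor of a learned ("meta") optimizer is a small network that maps a model's gradients on adversarial examples to parameter updates; the sketch below shows one such inner step (the actual meta-optimizer architecture and its meta-training across models are not reproduced here).

```python
# Toy learned optimizer: a small network maps per-parameter gradients (computed on
# adversarial examples) to parameter updates. Only one inner step is shown.
import torch
import torch.nn as nn
import torch.nn.functional as F

meta_opt = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))  # grad -> update

def fgsm(model, x, y, eps=0.05):
    x = x.detach().clone().requires_grad_(True)
    grad, = torch.autograd.grad(F.cross_entropy(model(x), y), x)
    return (x + eps * grad.sign()).detach()

def learned_update(model, x, y):
    """One optimization step produced by the meta-optimizer on adversarial data."""
    x_adv = fgsm(model, x, y)
    loss = F.cross_entropy(model(x_adv), y)
    grads = torch.autograd.grad(loss, model.parameters())
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p -= 0.1 * meta_opt(g.reshape(-1, 1)).reshape(p.shape)  # predicted update
    return loss.item()

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
print(learned_update(model, x, y))
```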
arXiv Detail & Related papers (2020-08-12T11:44:01Z)
- Regularizers for Single-step Adversarial Training [49.65499307547198]
We propose three types of regularizers that help to learn robust models using single-step adversarial training methods.
Regularizers mitigate the effect of gradient masking by harnessing properties that differentiate a robust model from a pseudo-robust one.
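One widely cited symptom of a gradient-masked ("pseudo-robust") model is that its single-step adversarial loss stops tracking its multi-step adversarial loss; the sketch below builds a regularizer on that gap as an illustrative choice, not necessarily one of the paper's exact regularizers.

```python
# Illustrative single-step adversarial training with a regularizer that penalizes the
# gap between the FGSM (single-step) loss and a multi-step PGD loss on the same batch.
import torch
import torch.nn as nn
import torch.nn.functional as F

def attack(model, x, y, eps=0.03, steps=1, alpha=None):
    alpha = alpha or eps / max(steps, 1)
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad, = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)
    return x_adv

def regularized_single_step_loss(model, x, y, lam=1.0):
    loss_fgsm = F.cross_entropy(model(attack(model, x, y, steps=1)), y)
    loss_pgd = F.cross_entropy(model(attack(model, x, y, steps=5)), y).detach()
    return loss_fgsm + lam * (loss_pgd - loss_fgsm).abs()   # keep single-step loss honest

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
loss = regularized_single_step_loss(model, x, y)
loss.backward()
print(float(loss))
```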
arXiv Detail & Related papers (2020-02-03T09:21:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.