Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial
Robustness
- URL: http://arxiv.org/abs/2006.13726v4
- Date: Wed, 29 Mar 2023 13:57:28 GMT
- Title: Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial
Robustness
- Authors: Xingjun Ma, Linxi Jiang, Hanxun Huang, Zejia Weng, James Bailey,
Yu-Gang Jiang
- Abstract summary: In this paper, we identify a more subtle situation called Imbalanced Gradients that can also cause overestimated adversarial robustness.
The phenomenon of imbalanced gradients occurs when the gradient of one term of the margin loss dominates and pushes the attack towards a suboptimal direction.
We propose a Margin Decomposition (MD) attack that decomposes a margin loss into individual terms and then explores the attackability of these terms separately.
- Score: 75.30116479840619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Evaluating the robustness of a defense model is a challenging task in
adversarial robustness research. Obfuscated gradients have previously been
found to exist in many defense methods and cause a false signal of robustness.
In this paper, we identify a more subtle situation called Imbalanced Gradients
that can also cause overestimated adversarial robustness. The phenomenon of
imbalanced gradients occurs when the gradient of one term of the margin loss
dominates and pushes the attack towards a suboptimal direction. To exploit
imbalanced gradients, we formulate a Margin Decomposition (MD) attack that
decomposes a margin loss into individual terms and then explores the
attackability of these terms separately via a two-stage process. We also
propose a multi-targeted and ensemble version of our MD attack. By
investigating 24 defense models proposed since 2018, we find that 11 models are
susceptible to a certain degree of imbalanced gradients and our MD attack can
decrease their robustness evaluated by the best standalone baseline attack by
more than 1%. We also provide an in-depth investigation into the likely causes of
imbalanced gradients and effective countermeasures. Our code is available at
https://github.com/HanxunH/MDAttack.
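To make the decomposition concrete, here is a minimal PyTorch sketch of the idea (not the authors' implementation; see the repository above for the actual MD, multi-targeted, and ensemble attacks). It assumes the usual logit-space margin loss max_{i != y} Z_i(x) - Z_y(x), splits it into its two terms, spends the first half of the iterations ascending a single term, and then switches to the full margin; the step size, perturbation budget, and stage split are illustrative choices.

```python
import torch

def margin_terms(logits, y):
    """Split the logit-space margin loss into its two terms:
    the true-class logit Z_y(x) and the largest non-true logit max_{i != y} Z_i(x).
    The margin loss an attack maximizes is (other_logit - true_logit)."""
    true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    masked = logits.clone()
    masked.scatter_(1, y.unsqueeze(1), float('-inf'))   # hide the true class
    other_logit = masked.max(dim=1).values
    return true_logit, other_logit

def md_style_attack(model, x, y, eps=8/255, alpha=2/255, steps=40):
    """Two-stage sketch: stage 1 ascends a single term of the margin loss
    (here -Z_y) so that a dominating gradient from the other term cannot
    stall the attack; stage 2 switches to the full margin loss."""
    x_adv = x.clone().detach()
    for step in range(steps):
        x_adv.requires_grad_(True)
        true_logit, other_logit = margin_terms(model(x_adv), y)
        if step < steps // 2:                      # stage 1: single term only
            loss = (-true_logit).sum()
        else:                                      # stage 2: full margin loss
            loss = (other_logit - true_logit).sum()
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                      # ascend the loss
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)    # project to L_inf ball
            x_adv = x_adv.clamp(0.0, 1.0)                            # keep valid pixel range
    return x_adv.detach()
```

In spirit, comparing how well the single-term stages attack a model against how well the plain margin loss does is also a rough probe for imbalanced gradients: if one term alone succeeds noticeably more often, the combined gradient is being dominated by the other term.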
Related papers
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against
Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- DiffAttack: Evasion Attacks Against Diffusion-Based Adversarial Purification [63.65630243675792]
Diffusion-based purification defenses leverage diffusion models to remove crafted perturbations of adversarial examples.
Recent studies show that even advanced attacks cannot break such defenses effectively.
We propose a unified framework DiffAttack to perform effective and efficient attacks against diffusion-based purification defenses.
arXiv Detail & Related papers (2023-10-27T15:17:50Z)
- RECESS Vaccine for Federated Learning: Proactive Defense Against Model Poisoning Attacks [20.55681622921858]
Model poisoning attacks greatly jeopardize the application of federated learning (FL).
In this work, we propose a novel proactive defense named RECESS against model poisoning attacks.
Unlike previous methods that score each iteration, RECESS considers clients' performance correlation across multiple iterations to estimate the trust score.
arXiv Detail & Related papers (2023-10-09T06:09:01Z)
- A Closer Look at the Adversarial Robustness of Deep Equilibrium Models [25.787638780625514]
We develop approaches to estimate the intermediate gradients of DEQs and integrate them into the attacking pipelines.
Our approaches facilitate fully white-box evaluations and lead to effective adversarial defense for DEQs.
arXiv Detail & Related papers (2023-06-02T10:40:30Z)
- IDEA: Invariant Defense for Graph Adversarial Robustness [60.0126873387533]
We propose an Invariant causal DEfense method against adversarial Attacks (IDEA).
We derive node-based and structure-based invariance objectives from an information-theoretic perspective.
Experiments demonstrate that IDEA attains state-of-the-art defense performance under all five attacks on all five datasets.
arXiv Detail & Related papers (2023-05-25T07:16:00Z)
- Guidance Through Surrogate: Towards a Generic Diagnostic Attack [101.36906370355435]
We develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA).
Our modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size.
More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
arXiv Detail & Related papers (2022-12-30T18:45:23Z)
- Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD attack, overcoming failures due to suboptimal step sizes and problems with the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness (a usage sketch of this ensemble follows this list).
arXiv Detail & Related papers (2020-03-03T18:15:55Z)
- On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping [36.41173109033075]
Machine learning algorithms are vulnerable to data poisoning attacks.
We study the feasibility of an attack-agnostic defense relying on artifacts common to all poisoning attacks.
arXiv Detail & Related papers (2020-02-26T14:04:16Z)
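For reference, the "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks" entry above describes the ensemble released as AutoAttack, a common baseline for the kind of robustness evaluation discussed in the main paper. Below is a minimal usage sketch, assuming the publicly available autoattack package and a PyTorch classifier that returns logits; the toy model, input shapes, epsilon, and batch size are illustrative stand-ins.

```python
import torch
import torch.nn as nn
from autoattack import AutoAttack  # https://github.com/fra31/auto-attack

# Toy stand-in for a defense model; AutoAttack expects a callable returning logits.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).eval()

x_test = torch.rand(8, 3, 32, 32)        # images scaled to [0, 1]
y_test = torch.randint(0, 10, (8,))      # integer class labels

# The 'standard' version chains APGD-CE, APGD-T, FAB-T and Square attacks.
adversary = AutoAttack(model, norm='Linf', eps=8 / 255, version='standard')
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=8)
```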