Mitigating Advanced Adversarial Attacks with More Advanced Gradient
Obfuscation Techniques
- URL: http://arxiv.org/abs/2005.13712v1
- Date: Wed, 27 May 2020 23:42:25 GMT
- Title: Mitigating Advanced Adversarial Attacks with More Advanced Gradient
Obfuscation Techniques
- Authors: Han Qiu, Yi Zeng, Qinkai Zheng, Tianwei Zhang, Meikang Qiu, Gerard
Memmi
- Abstract summary: Deep Neural Networks (DNNs) are well known to be vulnerable to Adversarial Examples (AEs).
Recently, advanced gradient-based attack techniques were proposed.
In this paper, we take a steady step towards mitigating those advanced gradient-based attacks.
- Score: 13.972753012322126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are well known to be vulnerable to Adversarial
Examples (AEs). Considerable effort has been spent fueling the arms race between
attackers and defenders. Recently, advanced gradient-based attack techniques were
proposed (e.g., BPDA and EOT), which have defeated a considerable number of
existing defense methods. To date, there is still no satisfactory solution that
can effectively and efficiently defend against those attacks.
In this paper, we take a steady step towards mitigating those advanced
gradient-based attacks with two major contributions. First, we perform an
in-depth analysis of the root causes of those attacks and propose four
properties that can break their fundamental assumptions. Second, we identify a
set of operations that can meet those properties. By integrating these
operations, we design two preprocessing functions that can invalidate these
powerful attacks. Extensive evaluations indicate that our solutions can
effectively mitigate all existing standard and advanced attack techniques, and
outperform 11 state-of-the-art defense solutions published in top-tier
conferences over the past two years. The defender can employ our solutions to
keep the attack success rate below 7% for the strongest attacks, even when the
adversary has spent dozens of GPU hours.
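
For readers unfamiliar with BPDA (Backward Pass Differentiable Approximation) and EOT (Expectation over Transformation), the sketch below illustrates how such attacks circumvent preprocessing-based defenses: BPDA replaces the gradient of a non-differentiable preprocessing step with an identity approximation, and EOT averages gradients over the defense's randomness. This is a minimal, hedged illustration only: the preprocessing shown is a generic placeholder rather than either of the two functions proposed in the paper, and the model, hyperparameters, and PyTorch usage are assumptions.

```python
import torch
import torch.nn.functional as F

def preprocess(x: torch.Tensor) -> torch.Tensor:
    """Placeholder randomized, non-differentiable preprocessing g(x).
    NOT one of the paper's two functions; used only to illustrate the attack."""
    noise = 0.03 * torch.randn_like(x)                             # randomness the defense relies on
    return torch.round(torch.clamp(x + noise, 0, 1) * 255) / 255   # quantization breaks gradients

class BPDAWrapper(torch.autograd.Function):
    """BPDA: run g(x) in the forward pass, approximate dg/dx by the identity backward."""
    @staticmethod
    def forward(ctx, x):
        return preprocess(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output                                  # identity gradient approximation

def bpda_eot_attack(model, x, y, eps=8/255, alpha=2/255, steps=50, eot_samples=10):
    """PGD-style loop using BPDA for the preprocessing and EOT over its randomness."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.zeros_like(x_adv)
        for _ in range(eot_samples):                        # EOT: average over random transforms
            logits = model(BPDAWrapper.apply(x_adv))
            loss = F.cross_entropy(logits, y)
            grad = grad + torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()             # ascend the averaged gradient
            x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # project into the L_inf eps-ball
            x_adv = torch.clamp(x_adv, 0, 1)
    return x_adv.detach()
```

In this framing, BPDA assumes the preprocessing is close enough to the identity for its gradient to be approximated, and EOT assumes the defense's randomness can be averaged away; the properties proposed in the paper are aimed at breaking such assumptions.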
Related papers
- Defense Against Prompt Injection Attack by Leveraging Attack Techniques [66.65466992544728]
Large language models (LLMs) have achieved remarkable performance across various natural language processing (NLP) tasks.
As LLMs continue to evolve, new vulnerabilities, especially prompt injection attacks, arise.
Recent attack methods leverage LLMs' instruction-following abilities and their inability to distinguish instructions injected into the data content.
arXiv Detail & Related papers (2024-11-01T09:14:21Z)
- Can Go AIs be adversarially robust? [4.466856575755327]
We study whether adding natural countermeasures can achieve robustness in Go.
We find that though some of these defenses protect against previously discovered attacks, none withstand freshly trained adversaries.
Our results suggest that building robust AI systems is challenging even with extremely superhuman systems in some of the most tractable settings.
arXiv Detail & Related papers (2024-06-18T17:57:49Z)
- IDEA: Invariant Defense for Graph Adversarial Robustness [60.0126873387533]
We propose an Invariant causal DEfense method against adversarial Attacks (IDEA).
We derive node-based and structure-based invariance objectives from an information-theoretic perspective.
Experiments demonstrate that IDEA attains state-of-the-art defense performance under all five attacks on all five datasets.
arXiv Detail & Related papers (2023-05-25T07:16:00Z)
- Guidance Through Surrogate: Towards a Generic Diagnostic Attack [101.36906370355435]
We develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA).
Our modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size.
More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
arXiv Detail & Related papers (2022-12-30T18:45:23Z)
- LAFEAT: Piercing Through Adversarial Defenses with Latent Features [15.189068478164337]
We show that latent features in certain "robust" models are surprisingly susceptible to adversarial attacks.
We introduce a unified $\ell_\infty$-norm white-box attack algorithm which harnesses latent features in its gradient descent steps, namely LAFEAT.
arXiv Detail & Related papers (2021-04-19T13:22:20Z)
- Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss that finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes the function mapping of the clean image to guide the generation of adversaries; an illustrative sketch of this idea appears after this list.
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
arXiv Detail & Related papers (2020-11-30T16:39:39Z)
- FDA3: Federated Defense Against Adversarial Attacks for Cloud-Based IIoT Applications [11.178342219720298]
Adversarial attacks are increasingly emerging to fool Deep Neural Networks (DNNs) used by Industrial IoT (IIoT) applications.
We present an effective federated defense approach named FDA3 that can aggregate defense knowledge against adversarial examples from different sources.
Our proposed cloud-based architecture enables the sharing of defense capabilities against different attacks among IIoT devices.
arXiv Detail & Related papers (2020-06-28T15:17:15Z)
- RayS: A Ray Searching Method for Hard-label Adversarial Attack [99.72117609513589]
We present the Ray Searching attack (RayS), which greatly improves both the effectiveness and efficiency of hard-label attacks.
RayS attack can also be used as a sanity check for possible "falsely robust" models.
arXiv Detail & Related papers (2020-06-23T07:01:50Z)
- Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning [30.46580767540506]
We introduce two novel adversarial attack techniques to stealthily and efficiently attack Deep Reinforcement Learning agents.
The first technique is the critical point attack: the adversary builds a model to predict future environmental states and the agent's actions, assesses the damage of each possible attack strategy, and selects the optimal one; a sketch of this selection loop appears after this list.
The second technique is the antagonist attack: the adversary automatically learns a domain-agnostic model to discover the critical moments for attacking the agent in an episode.
arXiv Detail & Related papers (2020-05-14T16:06:38Z)
- Deflecting Adversarial Attacks [94.85315681223702]
We present a new approach towards ending this cycle where we "deflect" adversarial attacks by causing the attacker to produce an input that resembles the attack's target class.
We first propose a stronger defense based on Capsule Networks that combines three detection mechanisms to achieve state-of-the-art detection performance.
arXiv Detail & Related papers (2020-02-18T06:59:13Z)
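
As referenced in the GAMA entry above, the following is a minimal sketch of a PGD-style attack whose loss adds a relaxation (guidance) term computed from the model's output on the clean image. The exact margin and relaxation terms, the decay schedule of the weighting factor, and all hyperparameters are assumptions for illustration and do not reproduce the paper's formulation.

```python
import torch
import torch.nn.functional as F

def guided_margin_attack(model, x, y, eps=8/255, alpha=2/255, steps=100, lam_init=10.0):
    """PGD-style attack with a guidance (relaxation) term based on the clean output.
    Illustrative only; loss terms and schedule are assumptions."""
    with torch.no_grad():
        p_clean = F.softmax(model(x), dim=1)                # function mapping of the clean image
    x_adv = (x + eps * (2 * torch.rand_like(x) - 1)).clamp(0, 1).detach()
    num_classes = p_clean.size(1)
    for t in range(steps):
        lam = lam_init * max(0.0, 1.0 - 2.0 * t / steps)    # decay the relaxation weight to zero
        x_adv.requires_grad_(True)
        p_adv = F.softmax(model(x_adv), dim=1)
        true_mask = F.one_hot(y, num_classes).bool()
        true_prob = p_adv[true_mask]                        # probability of the true class
        other_max = p_adv.masked_fill(true_mask, -1.0).max(dim=1).values  # best wrong class
        margin = other_max - true_prob                      # maximize: favor some wrong class
        guidance = ((p_adv - p_clean) ** 2).sum(dim=1)      # push output away from clean mapping
        loss = (margin + lam * guidance).sum()
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()             # ascend the guided loss
            x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # stay inside the eps-ball around x
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

The intuition, per the summary above, is that the guidance term supplies useful gradient directions early in the optimization, even where the margin term alone is flat.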
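
As referenced in the entry on stealthy and efficient attacks against deep reinforcement learning, the sketch below illustrates the strategy-selection idea behind a critical point attack: roll each candidate attack strategy forward on a learned prediction model, score the predicted damage, and keep the best one. The prediction model, victim policy, damage metric, and perturbation function are hypothetical callables, not the paper's implementation.

```python
import copy

def select_attack_strategy(predict_next, agent_policy, perturb, damage,
                           state, candidate_strategies, horizon=10):
    """Critical-point-style strategy selection (illustrative only).

    predict_next(state, action) -> predicted next state   (adversary's learned model)
    agent_policy(state) -> action                          (victim agent)
    perturb(state, strategy) -> perturbed state            (applies the attack)
    damage(trajectory) -> float                            (how bad the outcome is for the agent)
    Each strategy is assumed to be a set of time steps at which to perturb observations.
    """
    best_strategy, best_damage = None, float("-inf")
    for strategy in candidate_strategies:
        s = copy.deepcopy(state)
        trajectory = []
        for t in range(horizon):
            observed = perturb(s, strategy) if t in strategy else s  # attack only critical steps
            action = agent_policy(observed)                          # victim acts on what it sees
            s = predict_next(s, action)                              # roll the prediction forward
            trajectory.append((s, action))
        d = damage(trajectory)
        if d > best_damage:                                          # keep the most damaging plan
            best_strategy, best_damage = strategy, d
    return best_strategy, best_damage
```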
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.