Mitigating Advanced Adversarial Attacks with More Advanced Gradient
Obfuscation Techniques
- URL: http://arxiv.org/abs/2005.13712v1
- Date: Wed, 27 May 2020 23:42:25 GMT
- Title: Mitigating Advanced Adversarial Attacks with More Advanced Gradient
Obfuscation Techniques
- Authors: Han Qiu, Yi Zeng, Qinkai Zheng, Tianwei Zhang, Meikang Qiu, Gerard
Memmi
- Abstract summary: Deep Neural Networks (DNNs) are well known to be vulnerable to Adversarial Examples (AEs).
Recently, advanced gradient-based attack techniques were proposed.
In this paper, we take a steady step towards mitigating those advanced gradient-based attacks.
- Score: 13.972753012322126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are well known to be vulnerable to Adversarial
Examples (AEs). Considerable effort has been spent fueling the arms race between
attackers and defenders. Recently, advanced gradient-based attack techniques were
proposed (e.g., BPDA and EOT), which have defeated a considerable number of
existing defense methods. To date, there is still no satisfactory solution that
can effectively and efficiently defend against those attacks.
In this paper, we take a steady step towards mitigating those advanced
gradient-based attacks with two major contributions. First, we perform an
in-depth analysis of the root causes of those attacks and propose four
properties that can break their fundamental assumptions. Second, we identify a
set of operations that can meet those properties. By integrating these
operations, we design two preprocessing functions that can invalidate these
powerful attacks. Extensive evaluations indicate that our solutions can
effectively mitigate all existing standard and advanced attack techniques, and
outperform 11 state-of-the-art defense solutions published in top-tier
conferences over the past two years. The defender can employ our solutions to
keep the attack success rate below 7% for the strongest attacks, even when the
adversary has spent dozens of GPU hours.
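
For readers unfamiliar with BPDA (Backward Pass Differentiable Approximation) and EOT (Expectation over Transformation), the sketch below illustrates how such attacks circumvent preprocessing-based defenses: BPDA replaces the gradient of a non-differentiable preprocessing step with an identity approximation, and EOT averages gradients over the defense's randomness. This is a minimal, hedged illustration only: the preprocessing shown is a generic placeholder rather than either of the two functions proposed in the paper, and the model, hyperparameters, and PyTorch usage are assumptions.

```python
import torch
import torch.nn.functional as F

def preprocess(x: torch.Tensor) -> torch.Tensor:
    """Placeholder randomized, non-differentiable preprocessing g(x).
    NOT one of the paper's two functions; used only to illustrate the attack."""
    noise = 0.03 * torch.randn_like(x)                             # randomness the defense relies on
    return torch.round(torch.clamp(x + noise, 0, 1) * 255) / 255   # quantization breaks gradients

class BPDAWrapper(torch.autograd.Function):
    """BPDA: run g(x) in the forward pass, approximate dg/dx by the identity backward."""
    @staticmethod
    def forward(ctx, x):
        return preprocess(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output                                  # identity gradient approximation

def bpda_eot_attack(model, x, y, eps=8/255, alpha=2/255, steps=50, eot_samples=10):
    """PGD-style loop using BPDA for the preprocessing and EOT over its randomness."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.zeros_like(x_adv)
        for _ in range(eot_samples):                        # EOT: average over random transforms
            logits = model(BPDAWrapper.apply(x_adv))
            loss = F.cross_entropy(logits, y)
            grad = grad + torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()             # ascend the averaged gradient
            x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # project into the L_inf eps-ball
            x_adv = torch.clamp(x_adv, 0, 1)
    return x_adv.detach()
```

In this framing, BPDA assumes the preprocessing is close enough to the identity for its gradient to be approximated, and EOT assumes the defense's randomness can be averaged away; the properties proposed in the paper are aimed at breaking such assumptions.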
Related papers
- Defense Against Prompt Injection Attack by Leveraging Attack Techniques [66.65466992544728]
Large language models (LLMs) have achieved remarkable performance across various natural language processing (NLP) tasks.
As LLMs continue to evolve, new vulnerabilities, especially prompt injection attacks, arise.
Recent attack methods leverage LLMs' instruction-following abilities and their inability to distinguish instructions injected into the data content.
arXiv Detail & Related papers (2024-11-01T09:14:21Z)
- Can Go AIs be adversarially robust? [4.466856575755327]
We study whether adding natural countermeasures can achieve robustness in Go.
We find that though some of these defenses protect against previously discovered attacks, none withstand freshly trained adversaries.
Our results suggest that building robust AI systems is challenging even with extremely superhuman systems in some of the most tractable settings.
arXiv Detail & Related papers (2024-06-18T17:57:49Z)
- IDEA: Invariant Defense for Graph Adversarial Robustness [60.0126873387533]
We propose an Invariant causal DEfense method against adversarial Attacks (IDEA).
We derive node-based and structure-based invariance objectives from an information-theoretic perspective.
Experiments demonstrate that IDEA attains state-of-the-art defense performance under all five attacks on all five datasets.
arXiv Detail & Related papers (2023-05-25T07:16:00Z)
- Guidance Through Surrogate: Towards a Generic Diagnostic Attack [101.36906370355435]
We develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA).
Our modified attack does not require random restarts, a large number of attack iterations, or a search for an optimal step size.
More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
arXiv Detail & Related papers (2022-12-30T18:45:23Z)
- LAFEAT: Piercing Through Adversarial Defenses with Latent Features [15.189068478164337]
We show that latent features in certain "robust" models are surprisingly susceptible to adversarial attacks.
We introduce a unified $\ell_\infty$-norm white-box attack algorithm which harnesses latent features in its gradient descent steps, namely LAFEAT.
arXiv Detail & Related papers (2021-04-19T13:22:20Z)
- Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss that finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes the function mapping of the clean image to guide the generation of adversaries; an illustrative sketch of this idea appears after this list.
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
arXiv Detail & Related papers (2020-11-30T16:39:39Z)
- FDA3: Federated Defense Against Adversarial Attacks for Cloud-Based IIoT Applications [11.178342219720298]
Adversarial attacks are increasingly emerging to fool Deep Neural Networks (DNNs) used by Industrial IoT (IIoT) applications.
We present an effective federated defense approach named FDA3 that can aggregate defense knowledge against adversarial examples from different sources.
Our proposed cloud-based architecture enables the sharing of defense capabilities against different attacks among IIoT devices.
arXiv Detail & Related papers (2020-06-28T15:17:15Z)
- RayS: A Ray Searching Method for Hard-label Adversarial Attack [99.72117609513589]
We present the Ray Searching attack (RayS), which greatly improves both the effectiveness and efficiency of hard-label attacks.
RayS attack can also be used as a sanity check for possible "falsely robust" models.
arXiv Detail & Related papers (2020-06-23T07:01:50Z)
- Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning [30.46580767540506]
We introduce two novel adversarial attack techniques to stealthily and efficiently attack Deep Reinforcement Learning agents.
The first technique is the critical point attack: the adversary builds a model to predict future environmental states and the agent's actions, assesses the damage of each possible attack strategy, and selects the optimal one; a sketch of this selection loop appears after this list.
The second technique is the antagonist attack: the adversary automatically learns a domain-agnostic model to discover the critical moments for attacking the agent in an episode.
arXiv Detail & Related papers (2020-05-14T16:06:38Z)
- Deflecting Adversarial Attacks [94.85315681223702]
We present a new approach towards ending this cycle where we "deflect" adversarial attacks by causing the attacker to produce an input that resembles the attack's target class.
We first propose a stronger defense based on Capsule Networks that combines three detection mechanisms to achieve state-of-the-art detection performance.
arXiv Detail & Related papers (2020-02-18T06:59:13Z)
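
As referenced in the GAMA entry above, the following is a minimal sketch of a PGD-style attack whose loss adds a relaxation (guidance) term computed from the model's output on the clean image. The exact margin and relaxation terms, the decay schedule of the weighting factor, and all hyperparameters are assumptions for illustration and do not reproduce the paper's formulation.

```python
import torch
import torch.nn.functional as F

def guided_margin_attack(model, x, y, eps=8/255, alpha=2/255, steps=100, lam_init=10.0):
    """PGD-style attack with a guidance (relaxation) term based on the clean output.
    Illustrative only; loss terms and schedule are assumptions."""
    with torch.no_grad():
        p_clean = F.softmax(model(x), dim=1)                # function mapping of the clean image
    x_adv = (x + eps * (2 * torch.rand_like(x) - 1)).clamp(0, 1).detach()
    num_classes = p_clean.size(1)
    for t in range(steps):
        lam = lam_init * max(0.0, 1.0 - 2.0 * t / steps)    # decay the relaxation weight to zero
        x_adv.requires_grad_(True)
        p_adv = F.softmax(model(x_adv), dim=1)
        true_mask = F.one_hot(y, num_classes).bool()
        true_prob = p_adv[true_mask]                        # probability of the true class
        other_max = p_adv.masked_fill(true_mask, -1.0).max(dim=1).values  # best wrong class
        margin = other_max - true_prob                      # maximize: favor some wrong class
        guidance = ((p_adv - p_clean) ** 2).sum(dim=1)      # push output away from clean mapping
        loss = (margin + lam * guidance).sum()
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()             # ascend the guided loss
            x_adv = x + torch.clamp(x_adv - x, -eps, eps)   # stay inside the eps-ball around x
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

The intuition, per the summary above, is that the guidance term supplies useful gradient directions early in the optimization, even where the margin term alone is flat.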
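
As referenced in the entry on stealthy and efficient attacks against deep reinforcement learning, the sketch below illustrates the strategy-selection idea behind a critical point attack: roll each candidate attack strategy forward on a learned prediction model, score the predicted damage, and keep the best one. The prediction model, victim policy, damage metric, and perturbation function are hypothetical callables, not the paper's implementation.

```python
import copy

def select_attack_strategy(predict_next, agent_policy, perturb, damage,
                           state, candidate_strategies, horizon=10):
    """Critical-point-style strategy selection (illustrative only).

    predict_next(state, action) -> predicted next state   (adversary's learned model)
    agent_policy(state) -> action                          (victim agent)
    perturb(state, strategy) -> perturbed state            (applies the attack)
    damage(trajectory) -> float                            (how bad the outcome is for the agent)
    Each strategy is assumed to be a set of time steps at which to perturb observations.
    """
    best_strategy, best_damage = None, float("-inf")
    for strategy in candidate_strategies:
        s = copy.deepcopy(state)
        trajectory = []
        for t in range(horizon):
            observed = perturb(s, strategy) if t in strategy else s  # attack only critical steps
            action = agent_policy(observed)                          # victim acts on what it sees
            s = predict_next(s, action)                              # roll the prediction forward
            trajectory.append((s, action))
        d = damage(trajectory)
        if d > best_damage:                                          # keep the most damaging plan
            best_strategy, best_damage = strategy, d
    return best_strategy, best_damage
```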
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.