Mitigating Gradient-based Adversarial Attacks via Denoising and
Compression
- URL: http://arxiv.org/abs/2104.01494v1
- Date: Sat, 3 Apr 2021 22:57:01 GMT
- Title: Mitigating Gradient-based Adversarial Attacks via Denoising and
Compression
- Authors: Rehana Mahfuz, Rajeev Sahay, Aly El Gamal
- Abstract summary: Gradient-based adversarial attacks on deep neural networks pose a serious threat.
They can be deployed by adding imperceptible perturbations to the test data of any network.
Denoising and dimensionality reduction are two distinct methods that have been investigated to combat such attacks.
- Score: 7.305019142196582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradient-based adversarial attacks on deep neural networks pose a serious
threat, since they can be deployed by adding imperceptible perturbations to the
test data of any network, and the risk they introduce cannot be assessed
through the network's original training performance. Denoising and
dimensionality reduction are two distinct methods that have been independently
investigated to combat such attacks. While denoising offers the ability to
tailor the defense to the specific nature of the attack, dimensionality
reduction offers the advantage of potentially removing previously unseen
perturbations, along with reducing the training time of the network being
defended. We propose strategies to combine the advantages of these two defense
mechanisms. First, we propose the cascaded defense, which involves denoising
followed by dimensionality reduction. To reduce the training time of the
defense for a small trade-off in performance, we propose the hidden layer
defense, which involves feeding the output of the encoder of a denoising
autoencoder into the network. Further, we discuss how adaptive attacks against
these defenses could become significantly weaker when an alternative defense is
used, or when no defense is used. In this light, we propose a new metric to
evaluate a defense which measures the sensitivity of the adaptive attack to
modifications in the defense. Finally, we present a guideline for building an
ordered repertoire of defenses, a.k.a. a defense infrastructure, that adjusts
to limited computational resources in the presence of uncertainty about the attack
strategy.
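The following is a minimal, illustrative sketch (not the authors' code) of how the two defenses described in the abstract could be wired together in PyTorch: the cascaded defense passes the input through a denoising autoencoder and then a dimensionality-reducing encoder before classification, while the hidden layer defense feeds the denoising autoencoder's hidden representation directly into the classifier. All module names, layer sizes, and the flattened 784-dimensional input are assumptions made for illustration only.
```python
# Hypothetical sketch of the cascaded and hidden-layer defenses;
# architecture details are assumed, not taken from the paper.
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    def __init__(self, in_dim=784, hidden_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden_dim, in_dim), nn.Sigmoid())

    def forward(self, x):
        # Trained separately to map noisy/perturbed inputs back to clean ones.
        return self.decoder(self.encoder(x))

class ReducingEncoder(nn.Module):
    """Dimensionality reduction stage: only the encoder half of a second autoencoder."""
    def __init__(self, in_dim=784, reduced_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, reduced_dim), nn.ReLU())

    def forward(self, x):
        return self.encoder(x)

class Classifier(nn.Module):
    def __init__(self, in_dim, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, num_classes))

    def forward(self, x):
        return self.net(x)

dae = DenoisingAutoencoder()
reducer = ReducingEncoder()

# Cascaded defense: denoise, then reduce dimensionality, then classify.
cascaded_classifier = Classifier(in_dim=64)
def cascaded_defense(x):
    return cascaded_classifier(reducer(dae(x)))

# Hidden layer defense: feed the DAE encoder's hidden representation straight
# into the classifier, skipping reconstruction and the separate reduction
# stage (cheaper to train, at a small cost in performance).
hidden_classifier = Classifier(in_dim=128)
def hidden_layer_defense(x):
    return hidden_classifier(dae.encoder(x))

if __name__ == "__main__":
    x = torch.rand(8, 784)  # batch of flattened test inputs
    print(cascaded_defense(x).shape, hidden_layer_defense(x).shape)
```
Under these assumptions, the classifier in each pipeline is trained on the defended representation, so at test time an adversarial input is first denoised and/or compressed before it ever reaches the network being protected.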
Related papers
- Towards Efficient Transferable Preemptive Adversarial Defense [13.252842556505174]
Deep learning technology has become untrustworthy because of its sensitivity to perturbations.
We devise a proactive strategy of "attacking" the media before they are attacked by a third party.
This strategy, dubbed Fast Preemption, provides an efficient transferable preemptive defense.
arXiv Detail & Related papers (2024-07-22T10:23:44Z) - A Hybrid Training-time and Run-time Defense Against Adversarial Attacks in Modulation Classification [35.061430235135155]
We present a defense mechanism based on both training-time and run-time defense techniques for protecting machine learning-based radio signal (modulation) classification against adversarial attacks.
Considering a white-box scenario and real datasets, we demonstrate that our proposed techniques outperform existing state-of-the-art technologies.
arXiv Detail & Related papers (2024-07-09T12:28:38Z) - MPAT: Building Robust Deep Neural Networks against Textual Adversarial
Attacks [4.208423642716679]
We propose a malicious perturbation-based adversarial training method (MPAT) for building robust deep neural networks against adversarial attacks.
Specifically, we construct a multi-level malicious example generation strategy to generate adversarial examples with malicious perturbations.
We employ a novel training objective function to ensure achieving the defense goal without compromising the performance on the original task.
arXiv Detail & Related papers (2024-02-29T01:49:18Z) - BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning [85.2564206440109]
This paper reveals the threat that, in this practical scenario, backdoor attacks can remain effective even after defenses are applied.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the
Age of AI-NIDS [70.60975663021952]
We study blackbox adversarial attacks on network classifiers.
We argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions.
We show that a continual learning approach is required to study attacker-defender dynamics.
arXiv Detail & Related papers (2021-11-23T23:42:16Z) - Practical Defences Against Model Inversion Attacks for Split Neural
Networks [5.66430335973956]
We describe a threat model under which a split network-based federated learning system is susceptible to a model inversion attack by a malicious computational server.
We propose a simple additive noise method to defend against model inversion, finding that the method can significantly reduce attack efficacy at an acceptable accuracy trade-off on MNIST.
arXiv Detail & Related papers (2021-04-12T18:12:17Z) - Sparse Coding Frontend for Robust Neural Networks [11.36192454455449]
Deep Neural Networks are known to be vulnerable to small, adversarially crafted, perturbations.
Current defense methods against these adversarial attacks are variants of adversarial training.
In this paper, we introduce a radically different defense based on sparse coding of clean images.
arXiv Detail & Related papers (2021-04-12T11:14:32Z) - Guided Adversarial Attack for Evaluating and Enhancing Adversarial
Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss that finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes function mapping of the clean image to guide the generation of adversaries.
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
arXiv Detail & Related papers (2020-11-30T16:39:39Z) - Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z) - Robust Tracking against Adversarial Attacks [69.59717023941126]
We first attempt to generate adversarial examples on top of video sequences to improve the tracking robustness against adversarial attacks.
We apply the proposed adversarial attack and defense approaches to state-of-the-art deep tracking algorithms.
arXiv Detail & Related papers (2020-07-20T08:05:55Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)