Theoretical Study of Random Noise Defense against Query-Based Black-Box
Attacks
- URL: http://arxiv.org/abs/2104.11470v1
- Date: Fri, 23 Apr 2021 08:39:41 GMT
- Title: Theoretical Study of Random Noise Defense against Query-Based Black-Box
Attacks
- Authors: Zeyu Qin, Yanbo Fan, Hongyuan Zha, Baoyuan Wu
- Abstract summary: In this work, we study a simple but promising defense technique, dubbed Random Noise Defense (RND), against query-based black-box attacks.
It is lightweight and can be directly combined with any off-the-shelf models and other defense strategies.
In this work, we present solid theoretical analyses to demonstrate that the defense effect of RND against the query-based black-box attack and the corresponding adaptive attack heavily depends on the magnitude ratio.
- Score: 72.8152874114382
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Query-based black-box attacks, which require no knowledge of the
attacked models or datasets, pose serious threats to machine learning models in
many real applications. In this work, we study a simple but promising defense
technique, dubbed Random Noise Defense (RND), against query-based black-box
attacks; RND adds proper Gaussian noise to each query.
It is lightweight and can be directly combined with any off-the-shelf models
and other defense strategies. However, the theoretical guarantee of random
noise defense is missing, and the actual effectiveness of this defense is not
yet fully understood. In this work, we present solid theoretical analyses to
demonstrate that the defense effect of RND against the query-based black-box
attack and the corresponding adaptive attack heavily depends on the magnitude
ratio between the random noise added by the defender (i.e., RND) and the random
noise added by the attacker for gradient estimation. Extensive experiments on
CIFAR-10 and ImageNet verify our theoretical studies. Based on RND, we also
propose a stronger defense method that combines RND with Gaussian augmentation
training (RND-GT) and achieves better defense performance.
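As a rough illustration of the two noise scales the abstract refers to, the following sketch (assuming a PyTorch classifier `model` and a standard loss `loss_fn`; the names `rnd_forward`, `nu`, and `mu` are illustrative, not taken from the paper) shows RND adding Gaussian noise of scale `nu` to every query, while a score-based attacker estimates gradients by finite differences with its own sampling noise of scale `mu`:

```python
import torch

def rnd_forward(model, x, nu=0.02):
    # Random Noise Defense: perturb every incoming query with Gaussian noise of scale nu.
    return model(x + nu * torch.randn_like(x))

def estimate_gradient(model, loss_fn, x, y, mu=0.01, nu=0.02, num_samples=20):
    # Attacker-side random gradient-free (finite-difference) estimate of the loss gradient.
    # Every query passes through the defended model, so the estimate mixes the attacker's
    # smoothing scale mu with the defender's noise scale nu.
    grad = torch.zeros_like(x)
    base_loss = loss_fn(rnd_forward(model, x, nu), y)
    for _ in range(num_samples):
        u = torch.randn_like(x)                                   # random search direction
        pert_loss = loss_fn(rnd_forward(model, x + mu * u, nu), y)
        grad += (pert_loss - base_loss) / mu * u                  # directional finite difference
    return grad / num_samples
```

Intuitively, when the ratio `nu / mu` is small the finite-difference estimate stays informative, while a larger ratio lets the defender's noise dominate the difference term and degrade the estimate; this is the magnitude-ratio effect the paper's analysis formalizes.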
Related papers
- Noise as a Double-Edged Sword: Reinforcement Learning Exploits Randomized Defenses in Neural Networks [1.788784870849724]
This study investigates the potential for noise-based defenses to inadvertently aid evasion attacks in certain scenarios.
In some cases, noise-based defenses can inadvertently create an adversarial training loop beneficial to the RL attacker.
It challenges the assumption that randomness universally enhances defense against evasion attacks.
arXiv Detail & Related papers (2024-10-31T12:22:19Z)
- From Attack to Defense: Insights into Deep Learning Security Measures in Black-Box Settings [1.8006345220416338]
Adversarial samples pose a serious threat: they can cause a model to misbehave and compromise the performance of applications that rely on it.
Addressing the robustness of Deep Learning models has become crucial to understanding and defending against adversarial attacks.
Our research focuses on black-box attacks such as SimBA, HopSkipJump, MGAAttack, and boundary attacks, as well as preprocessor-based defensive mechanisms.
arXiv Detail & Related papers (2024-05-03T09:40:47Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks [23.010308600769545]
Deep neural networks are vulnerable to adversarial examples: samples close to the original image that nevertheless make the model misclassify.
We propose a simple and lightweight defense against black-box attacks by adding random noise to hidden features at intermediate layers of the model at inference time (a sketch of this idea appears after this list).
Our method effectively enhances the model's resilience against both score-based and decision-based black-box attacks.
arXiv Detail & Related papers (2023-10-01T03:53:23Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations over model predictions that harm the benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Small Input Noise is Enough to Defend Against Query-based Black-box Attacks [23.712389625037442]
In this paper, we show how Small Noise Defense (SND) can defend against query-based black-box attacks.
Even a small additive input noise can neutralize most query-based attacks.
Despite its strong defense ability, SND nearly maintains the original clean accuracy and computational speed.
arXiv Detail & Related papers (2021-01-13T01:45:59Z)
- MAD-VAE: Manifold Awareness Defense Variational Autoencoder [0.0]
We introduce several methods to improve the robustness of defense models.
Extensive experiments on the MNIST dataset demonstrate the effectiveness of our algorithms.
We also discuss the applicability of existing adversarial latent space attacks as they may have a significant flaw.
arXiv Detail & Related papers (2020-10-31T09:04:25Z)
- Attack Agnostic Adversarial Defense via Visual Imperceptible Bound [70.72413095698961]
This research aims to design a defense model that is robust within a certain bound against both seen and unseen adversarial attacks.
The proposed defense model is evaluated on the MNIST, CIFAR-10, and Tiny ImageNet databases.
The proposed algorithm is attack agnostic, i.e. it does not require any knowledge of the attack algorithm.
arXiv Detail & Related papers (2020-10-25T23:14:26Z)
- RayS: A Ray Searching Method for Hard-label Adversarial Attack [99.72117609513589]
We present the Ray Searching attack (RayS), which greatly improves the hard-label attack effectiveness as well as efficiency.
RayS attack can also be used as a sanity check for possible "falsely robust" models.
arXiv Detail & Related papers (2020-06-23T07:01:50Z)
- Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning [71.17774313301753]
We explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks.
Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples.
arXiv Detail & Related papers (2020-06-05T03:03:06Z)
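For contrast with input-level RND, the "Understanding the Robustness of Randomized Feature Defense" entry above perturbs hidden features rather than the raw input. A minimal sketch of that idea using a PyTorch forward hook (the layer choice, noise scale, and helper name are illustrative assumptions, not the cited authors' code):

```python
import torch
from torchvision import models

def feature_noise_hook(sigma=0.05):
    # Returns a forward hook that adds Gaussian noise to an intermediate feature map
    # at inference time; a non-None return value replaces the module's output.
    def hook(module, inputs, output):
        return output + sigma * torch.randn_like(output)
    return hook

model = models.resnet18(weights=None).eval()
# Perturb the features produced by one intermediate block; which layer to hook and how
# much noise to add are hyperparameters, shown here with arbitrary example values.
model.layer3.register_forward_hook(feature_noise_hook(sigma=0.05))

with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))  # every query now sees noisy hidden features
```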