Counter-Samples: A Stateless Strategy to Neutralize Black Box Adversarial Attacks
- URL: http://arxiv.org/abs/2403.10562v1
- Date: Thu, 14 Mar 2024 10:59:54 GMT
- Title: Counter-Samples: A Stateless Strategy to Neutralize Black Box Adversarial Attacks
- Authors: Roey Bokobza, Yisroel Mirsky,
- Abstract summary: Our paper presents a novel defence against black box attacks, where attackers use the victim model as an oracle to craft their adversarial examples.
Unlike traditional preprocessing defences that rely on sanitizing input samples, our strategy counters the attack process itself.
We demonstrate that our approach is remarkably effective against state-of-the-art black box attacks and outperforms existing defences for both the CIFAR-10 and ImageNet datasets.
- Score: 2.9815109163161204
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Our paper presents a novel defence against black box attacks, where attackers use the victim model as an oracle to craft their adversarial examples. Unlike traditional preprocessing defences that rely on sanitizing input samples, our stateless strategy counters the attack process itself. For every query we evaluate a counter-sample instead, where the counter-sample is the original sample optimized against the attacker's objective. By countering every black box query with a targeted white box optimization, our strategy effectively introduces an asymmetry to the game to the defender's advantage. This defence not only effectively misleads the attacker's search for an adversarial example, it also preserves the model's accuracy on legitimate inputs and is generic to multiple types of attacks. We demonstrate that our approach is remarkably effective against state-of-the-art black box attacks and outperforms existing defences for both the CIFAR-10 and ImageNet datasets. Additionally, we also show that the proposed defence is robust against strong adversaries as well.
Related papers
- Improving behavior based authentication against adversarial attack using XAI [3.340314613771868]
We propose an eXplainable AI (XAI) based defense strategy against adversarial attacks in such scenarios.
A feature selector, trained with our method, can be used as a filter in front of the original authenticator.
We demonstrate that our XAI based defense strategy is effective against adversarial attacks and outperforms other defense strategies.
arXiv Detail & Related papers (2024-02-26T09:29:05Z) - The Best Defense is a Good Offense: Adversarial Augmentation against
Adversarial Attacks [91.56314751983133]
$A5$ is a framework to craft a defensive perturbation to guarantee that any attack towards the input in hand will fail.
We show effective on-the-fly defensive augmentation with a robustifier network that ignores the ground truth label.
We also show how to apply $A5$ to create certifiably robust physical objects.
arXiv Detail & Related papers (2023-05-23T16:07:58Z) - Stateful Defenses for Machine Learning Models Are Not Yet Secure Against
Black-box Attacks [28.93464970650329]
We show that stateful defense models (SDMs) are highly vulnerable to a new class of adaptive black-box attacks.
We propose a novel adaptive black-box attack strategy called Oracle-guided Adaptive Rejection Sampling (OARS)
We show how to apply the strategy to enhance six common black-box attacks to be more effective against current class of SDMs.
arXiv Detail & Related papers (2023-03-11T02:10:21Z) - Planning for Attacker Entrapment in Adversarial Settings [16.085007590604327]
We propose a framework to generate a defense strategy against an attacker who is working in an environment where a defender can operate without the attacker's knowledge.
Our problem formulation allows us to capture it as a much simpler infinite horizon discounted MDP, in which the optimal policy for the MDP gives the defender's strategy against the actions of the attacker.
arXiv Detail & Related papers (2023-03-01T21:08:27Z) - Adversarial Defense via Image Denoising with Chaotic Encryption [65.48888274263756]
We propose a novel defense that assumes everything but a private key will be made available to the attacker.
Our framework uses an image denoising procedure coupled with encryption via a discretized Baker map.
arXiv Detail & Related papers (2022-03-19T10:25:02Z) - Parallel Rectangle Flip Attack: A Query-based Black-box Attack against
Object Detection [89.08832589750003]
We propose a Parallel Rectangle Flip Attack (PRFA) via random search to avoid sub-optimal detection near the attacked region.
Our method can effectively and efficiently attack various popular object detectors, including anchor-based and anchor-free, and generate transferable adversarial examples.
arXiv Detail & Related papers (2022-01-22T06:00:17Z) - Output Randomization: A Novel Defense for both White-box and Black-box
Adversarial Models [8.189696720657247]
Adversarial examples pose a threat to deep neural network models in a variety of scenarios.
We explore the use of output randomization as a defense against attacks in both the black box and white box models.
arXiv Detail & Related papers (2021-07-08T12:27:19Z) - Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised
Learning [71.17774313301753]
We explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks.
Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples.
arXiv Detail & Related papers (2020-06-05T03:03:06Z) - Harnessing adversarial examples with a surprisingly simple defense [47.64219291655723]
I introduce a very simple method to defend against adversarial examples.
The basic idea is to raise the slope of the ReLU function at the test time.
Experiments over MNIST and CIFAR-10 datasets demonstrate the effectiveness of the proposed defense.
arXiv Detail & Related papers (2020-04-26T03:09:42Z) - Deflecting Adversarial Attacks [94.85315681223702]
We present a new approach towards ending this cycle where we "deflect" adversarial attacks by causing the attacker to produce an input that resembles the attack's target class.
We first propose a stronger defense based on Capsule Networks that combines three detection mechanisms to achieve state-of-the-art detection performance.
arXiv Detail & Related papers (2020-02-18T06:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.