Closing the Gap: Achieving Better Accuracy-Robustness Tradeoffs against Query-Based Attacks
- URL: http://arxiv.org/abs/2312.10132v2
- Date: Thu, 21 Mar 2024 15:42:06 GMT
- Title: Closing the Gap: Achieving Better Accuracy-Robustness Tradeoffs against Query-Based Attacks
- Authors: Pascal Zimmer, Sébastien Andreina, Giorgia Azzurra Marson, Ghassan Karame
- Abstract summary: We show how to efficiently establish, at test-time, a solid tradeoff between robustness and accuracy when mitigating query-based attacks.
Our approach is independent of training and supported by theory.
- Score: 1.54994260281059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although promising, existing defenses against query-based attacks share a common limitation: they offer increased robustness against attacks at the price of a considerable accuracy drop on clean samples. In this work, we show how to efficiently establish, at test-time, a solid tradeoff between robustness and accuracy when mitigating query-based attacks. Given that these attacks necessarily explore low-confidence regions, our insight is that activating dedicated defenses, such as random noise defense and random image transformations, only for low-confidence inputs is sufficient to prevent them. Our approach is independent of training and supported by theory. We verify the effectiveness of our approach for various existing defenses by conducting extensive experiments on CIFAR-10, CIFAR-100, and ImageNet. Our results confirm that our proposal can indeed enhance these defenses by providing better tradeoffs between robustness and accuracy when compared to state-of-the-art approaches while being completely training-free.
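The gating idea described in the abstract can be sketched in a few lines (an illustrative sketch with hypothetical names and threshold values, not the authors' implementation): check the model's confidence on each query, answer directly when confidence is high, and activate a randomized defense such as random noise only for low-confidence inputs, where query-based attacks necessarily operate.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def defended_predict(x, model, tau=0.7, sigma=0.1):
    """Confidence-gated defense (sketch): plain prediction for high-confidence
    inputs, randomized defense only for low-confidence ones."""
    probs = softmax(model(x))
    if probs.max() >= tau:
        # High confidence: answer directly, so clean accuracy is untouched.
        return int(probs.argmax())
    # Low confidence: add random input noise before answering
    # (random noise defense), which disrupts attack queries.
    noisy = x + rng.normal(0.0, sigma, size=x.shape)
    return int(softmax(model(noisy)).argmax())

# Toy linear "model" with 3 classes, purely for demonstration.
W = np.array([[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]])
model = lambda x: W @ x

confident_input = np.array([5.0, 0.0])  # clearly class 0
print(defended_predict(confident_input, model))  # → 0
```

Because the gate only fires on low-confidence queries, clean samples (which are mostly classified confidently) keep their accuracy, while the attacker's exploratory queries hit the randomized path.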
Related papers
- Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting attackers unauthorized control at test time.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks.
arXiv Detail & Related papers (2023-10-08T18:57:36Z)
- Hindering Adversarial Attacks with Multiple Encrypted Patch Embeddings [13.604830818397629]
We propose a new key-based defense focusing on both efficiency and robustness.
We build upon the previous defense with two major improvements: (1) efficient training and (2) optional randomization.
Experiments were carried out on the ImageNet dataset, and the proposed defense was evaluated against an arsenal of state-of-the-art attacks.
arXiv Detail & Related papers (2023-09-04T14:08:34Z)
- Randomness in ML Defenses Helps Persistent Attackers and Hinders Evaluators [49.52538232104449]
It is becoming increasingly imperative to design robust ML defenses.
Recent work has found that many defenses that initially resist state-of-the-art attacks can be broken by an adaptive adversary.
We take steps to simplify the design of defenses and argue that white-box defenses should eschew randomness when possible.
arXiv Detail & Related papers (2023-02-27T01:33:31Z) - FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated
Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients)
We propose a trigger reverse-engineering based defense and show that our method achieves improved performance with guaranteed robustness.
Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z) - Increasing Confidence in Adversarial Robustness Evaluations [53.2174171468716]
We propose a test to identify weak attacks and thus weak defense evaluations.
Our test slightly modifies a neural network to guarantee the existence of an adversarial example for every sample.
For eleven out of thirteen previously-published defenses, the original evaluation of the defense fails our test, while stronger attacks that break these defenses pass it.
arXiv Detail & Related papers (2022-06-28T13:28:13Z) - Adversarial Training with Rectified Rejection [114.83821848791206]
We propose to use true confidence (T-Con) as a certainty oracle, and learn to predict T-Con by rectifying confidence.
We prove that under mild conditions, a rectified confidence (R-Con) rejector and a confidence rejector can be coupled to distinguish any wrongly classified input from correctly classified ones.
arXiv Detail & Related papers (2021-05-31T08:24:53Z) - Automated Discovery of Adaptive Attacks on Adversarial Defenses [14.633898825111826]
We present a framework that automatically discovers an effective attack on a given model with an unknown defense.
We show it outperforms AutoAttack, the current state-of-the-art tool for reliable evaluation of adversarial defenses.
arXiv Detail & Related papers (2021-02-23T18:43:24Z) - Reliable evaluation of adversarial robustness with an ensemble of
diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD attack that overcome failures due to suboptimal step sizes and problems with the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
arXiv Detail & Related papers (2020-03-03T18:15:55Z)
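For context on the attack family extended above, PGD iterates a signed-gradient ascent step followed by projection onto the allowed perturbation set. A minimal L-infinity step can be sketched as follows (illustrative only, with made-up parameter values; the paper's extensions adapt the step size automatically):

```python
import numpy as np

def pgd_step(x, grad, x_orig, alpha=0.01, eps=0.03):
    """One L-inf PGD step (sketch): ascend the loss along the gradient sign,
    then project back into the eps-ball around the original input and the
    valid pixel range [0, 1]."""
    x_new = x + alpha * np.sign(grad)
    x_new = np.clip(x_new, x_orig - eps, x_orig + eps)
    return np.clip(x_new, 0.0, 1.0)

x_orig = np.array([0.5, 0.5])
grad = np.array([1.0, -1.0])       # loss gradient w.r.t. the input (toy values)
x_adv = pgd_step(x_orig, grad, x_orig)
print(x_adv)  # → [0.51 0.49]
```

Repeating this step with a fixed alpha is exactly where vanilla PGD can fail; the ensemble above combines step-size-adaptive variants with complementary attacks to make evaluations parameter-free.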
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.