Blacklight: Scalable Defense for Neural Networks against Query-Based
Black-Box Attacks
- URL: http://arxiv.org/abs/2006.14042v3
- Date: Thu, 9 Jun 2022 05:11:53 GMT
- Title: Blacklight: Scalable Defense for Neural Networks against Query-Based
Black-Box Attacks
- Authors: Huiying Li, Shawn Shan, Emily Wenger, Jiayun Zhang, Haitao Zheng, Ben
Y. Zhao
- Abstract summary: We propose Blacklight, a new defense against query-based black-box adversarial attacks.
Blacklight detects query-based black-box attacks by detecting highly similar queries.
We evaluate Blacklight against eight state-of-the-art attacks, across a variety of models and image classification tasks.
- Score: 34.04323550970413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning systems are known to be vulnerable to adversarial examples. In
particular, query-based black-box attacks do not require knowledge of the deep
learning model, but can compute adversarial examples over the network by
submitting queries and inspecting returns. Recent work largely improves the
efficiency of those attacks, demonstrating their practicality on today's
ML-as-a-service platforms.
We propose Blacklight, a new defense against query-based black-box
adversarial attacks. The fundamental insight driving our design is that, to
compute adversarial examples, these attacks perform iterative optimization over
the network, producing image queries highly similar in the input space.
Blacklight detects query-based black-box attacks by detecting highly similar
queries, using an efficient similarity engine operating on probabilistic
content fingerprints. We evaluate Blacklight against eight state-of-the-art
attacks, across a variety of models and image classification tasks. Blacklight
identifies them all, often after only a handful of queries. By rejecting all
detected queries, Blacklight prevents any attack from completing, even when
attackers persist in submitting queries after account bans or query rejections.
Blacklight is also robust against several powerful countermeasures, including
an optimal black-box attack that approximates white-box attacks in efficiency.
Finally, we illustrate how Blacklight generalizes to other domains like text
classification.
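The abstract's core mechanism, flagging repeated near-duplicate queries via probabilistic content fingerprints, can be sketched roughly as follows. The snippet below is a minimal illustration under stated assumptions, not the paper's implementation: pixel values are assumed to lie in [0, 1], the window size, stride, quantization step, fingerprint size, and match threshold are placeholder values, and `fingerprint` / `SimilarQueryDetector` are hypothetical names.

```python
import hashlib
from collections import defaultdict

import numpy as np


def fingerprint(image, window=20, stride=10, quant=50, top_k=50):
    # Quantize pixels so small perturbations fall into the same bucket, then
    # hash overlapping 1-D windows of the flattened image and keep the top_k
    # lexicographically smallest hashes as a compact probabilistic fingerprint.
    # (All parameter values here are illustrative, not the paper's settings.)
    q = (np.asarray(image, dtype=np.float64) * 255 // quant).astype(np.uint8)
    flat = q.reshape(-1)
    hashes = set()
    for start in range(0, max(1, flat.size - window + 1), stride):
        h = hashlib.sha256(flat[start:start + window].tobytes()).hexdigest()
        hashes.add(h)
    return set(sorted(hashes)[:top_k])


class SimilarQueryDetector:
    # Flags a query whose fingerprint shares at least `threshold` hashes with
    # any previously stored query fingerprint -- the behaviour the abstract
    # relies on, since query-based attacks submit many near-identical images.

    def __init__(self, threshold=25):
        self.threshold = threshold
        self.index = defaultdict(set)  # hash value -> ids of queries containing it
        self.next_id = 0

    def check_and_store(self, image):
        fp = fingerprint(image)
        overlap = defaultdict(int)
        for h in fp:
            for qid in self.index[h]:
                overlap[qid] += 1
        is_attack = any(count >= self.threshold for count in overlap.values())
        qid = self.next_id
        self.next_id += 1
        for h in fp:
            self.index[h].add(qid)
        return is_attack
```

A deployment of this kind of pipeline would call `check_and_store` on every incoming query and reject the query (or ban the account) once it returns True; benign queries rarely share many window hashes, while the iterative optimization of a query-based attack produces images whose windows hash to largely the same values.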
Related papers
- BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack [22.408968332454062]
We study the unique, less well understood problem of generating sparse adversarial samples simply by observing the score-based replies to model queries.
We develop BruSLeAttack, a new, faster (more query-efficient) algorithm for this problem.
Our work facilitates faster evaluation of model vulnerabilities and raises our vigilance on the safety, security and reliability of deployed systems.
arXiv Detail & Related papers (2024-04-08T08:59:26Z) - Certifiable Black-Box Attacks with Randomized Adversarial Examples: Breaking Defenses with Provable Confidence [34.35162562625252]
Black-box adversarial attacks have demonstrated strong potential to compromise machine learning models.
We study a new paradigm of black-box attacks with provable guarantees.
This new black-box attack unveils significant vulnerabilities of machine learning models.
arXiv Detail & Related papers (2023-04-10T01:12:09Z) - Stateful Defenses for Machine Learning Models Are Not Yet Secure Against
Black-box Attacks [28.93464970650329]
We show that stateful defense models (SDMs) are highly vulnerable to a new class of adaptive black-box attacks.
We propose a novel adaptive black-box attack strategy called Oracle-guided Adaptive Rejection Sampling (OARS).
We show how to apply this strategy to make six common black-box attacks more effective against the current class of SDMs.
arXiv Detail & Related papers (2023-03-11T02:10:21Z) - Query Efficient Cross-Dataset Transferable Black-Box Attack on Action
Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses these shortcomings by generating perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z) - Distributed Black-box Attack: Do Not Overestimate Black-box Attacks [4.764637544913963]
Black-box adversarial attacks can fool image classifiers into misclassifying images without requiring access to model structure and weights.
Recent studies have reported attack success rates of over 95% with less than 1,000 queries.
This paper applies black-box attacks directly to cloud APIs rather than to local models.
arXiv Detail & Related papers (2022-10-28T19:14:03Z) - Towards Lightweight Black-Box Attacks against Deep Neural Networks [70.9865892636123]
We argue that black-box attacks can pose practical threats even when only a few test samples are available.
As only a few samples are required, we refer to these attacks as lightweight black-box attacks.
We propose Error TransFormer (ETF) for lightweight attacks to mitigate the approximation error.
arXiv Detail & Related papers (2022-09-29T14:43:03Z) - Parallel Rectangle Flip Attack: A Query-based Black-box Attack against
Object Detection [89.08832589750003]
We propose a Parallel Rectangle Flip Attack (PRFA) via random search to avoid sub-optimal detection near the attacked region.
Our method can effectively and efficiently attack various popular object detectors, including anchor-based and anchor-free, and generate transferable adversarial examples.
arXiv Detail & Related papers (2022-01-22T06:00:17Z) - Local Black-box Adversarial Attacks: A Query Efficient Approach [64.98246858117476]
Adversarial attacks have threatened the application of deep neural networks in security-sensitive scenarios.
We propose a novel framework to perturb the discriminative areas of clean examples only within limited queries in black-box attacks.
We conduct extensive experiments to show that our framework significantly improves query efficiency in black-box attacks while maintaining a high attack success rate.
arXiv Detail & Related papers (2021-01-04T15:32:16Z) - Improving Query Efficiency of Black-box Adversarial Attack [75.71530208862319]
We propose a Neural Process-based black-box adversarial attack (NP-Attack).
NP-Attack greatly decreases the query count in the black-box setting.
arXiv Detail & Related papers (2020-09-24T06:22:56Z) - Spanning Attack: Reinforce Black-box Attacks with Unlabeled Data [96.92837098305898]
Black-box attacks aim to craft adversarial perturbations by querying input-output pairs of machine learning models.
Black-box attacks often suffer from the issue of query inefficiency due to the high dimensionality of the input space.
We propose a novel technique called the spanning attack, which constrains adversarial perturbations in a low-dimensional subspace via spanning an auxiliary unlabeled dataset.
arXiv Detail & Related papers (2020-05-11T05:57:15Z)
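The spanning attack in the last entry hinges on one idea: restrict the perturbation to the low-dimensional subspace spanned by a handful of auxiliary unlabeled images, so each black-box query explores only a few coefficients instead of the full input dimension. Below is a rough, hedged sketch of that idea wrapped around plain random search; `query_loss` stands in for whatever black-box scoring oracle the attacker has, and the basis size, step count, noise scale, and L-infinity budget are illustrative placeholders rather than the paper's settings.

```python
import numpy as np


def build_spanning_basis(aux_images, k=32):
    # Orthonormal basis for the subspace spanned by up to k auxiliary unlabeled
    # images (flattened, same shape as the target input). A real attack may
    # center or whiten these first; this is a simplified illustration.
    A = np.stack([img.reshape(-1) for img in aux_images[:k]], axis=1)  # (D, k)
    Q, _ = np.linalg.qr(A)                                             # (D, k)
    return Q


def spanning_random_search(query_loss, x, basis, eps=8 / 255, steps=500, sigma=0.05):
    # Query-efficient random search restricted to the spanned subspace:
    # every candidate perturbation is basis @ c, so the search happens over
    # k coefficients instead of the full pixel dimension.
    k = basis.shape[1]
    c = np.zeros(k)
    best = query_loss(x)                     # one query to the black-box model
    for _ in range(steps):
        cand = c + sigma * np.random.randn(k)
        delta = (basis @ cand).reshape(x.shape)
        delta = np.clip(delta, -eps, eps)    # keep an L_inf perturbation budget
        loss = query_loss(np.clip(x + delta, 0.0, 1.0))
        if loss > best:                      # maximize loss (untargeted attack)
            best, c = loss, cand
    final = np.clip((basis @ c).reshape(x.shape), -eps, eps)
    return np.clip(x + final, 0.0, 1.0)
```

The same subspace constraint could, in principle, wrap other query-based attacks (gradient estimation, square-style random search); the point of the sketch is only that the number of coordinates being searched drops from the pixel count to the number of auxiliary images.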
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.