Preprocessors Matter! Realistic Decision-Based Attacks on Machine
Learning Systems
- URL: http://arxiv.org/abs/2210.03297v2
- Date: Thu, 20 Jul 2023 19:28:22 GMT
- Title: Preprocessors Matter! Realistic Decision-Based Attacks on Machine
Learning Systems
- Authors: Chawin Sitawarin, Florian Tramèr, Nicholas Carlini
- Abstract summary: Decision-based attacks construct adversarial examples against a machine learning (ML) model by making only hard-label queries.
We develop techniques to (i) reverse-engineer the preprocessor and then (ii) use this extracted information to attack the end-to-end system.
Our preprocessor extraction method requires only a few hundred queries, and our preprocessor-aware attacks recover the same efficacy as when attacking the model alone.
- Score: 56.64374584117259
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Decision-based attacks construct adversarial examples against a machine
learning (ML) model by making only hard-label queries. These attacks have
mainly been applied directly to standalone neural networks. However, in
practice, ML models are just one component of a larger learning system. We find
that by adding a single preprocessor in front of a classifier, state-of-the-art
query-based attacks are up to 7$\times$ less effective at attacking a
prediction pipeline than at attacking the model alone. We explain this
discrepancy by the fact that most preprocessors introduce some notion of
invariance to the input space. Hence, attacks that are unaware of this
invariance inevitably waste a large number of queries to re-discover or
overcome it. We, therefore, develop techniques to (i) reverse-engineer the
preprocessor and then (ii) use this extracted information to attack the
end-to-end system. Our preprocessor extraction method requires only a few
hundred queries, and our preprocessor-aware attacks recover the same efficacy
as when attacking the model alone. The code can be found at
https://github.com/google-research/preprocessor-aware-black-box-attack.
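To make the invariance argument concrete, here is a minimal sketch, assuming a toy pipeline (2x2 average pooling plus 8-bit quantization in front of a stand-in linear classifier; all sizes and names are illustrative, not the paper's setup), of how a preprocessor silently absorbs perturbations that a preprocessor-unaware attacker still pays queries for, and how crafting the perturbation in the preprocessed space avoids the waste.

```python
# Illustrative toy only (not the paper's code): a preprocessor in front of a
# classifier creates an invariant set that hides attack progress.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 32 * 32))                # stand-in linear "classifier"

def preprocess(x):
    """Average-pool 64x64 -> 32x32, then quantize to 8-bit (1/255) steps."""
    pooled = x.reshape(32, 2, 32, 2).mean(axis=(1, 3))
    return np.round(pooled * 255) / 255

def hard_label(x):
    """The only interface a decision-based attacker gets to query."""
    return int(np.argmax(W @ preprocess(x).reshape(-1)))

x = rng.uniform(0.1, 0.9, size=(64, 64))

# Preprocessor-unaware query: a fairly large checkerboard perturbation cancels
# inside every 2x2 pooling cell, so the model sees the identical quantized
# input and the query reveals nothing.
delta = 0.05 * np.tile(np.array([[1.0, -1.0], [-1.0, 1.0]]), (32, 32))
print("invisible to the model:",
      np.array_equal(preprocess(x + delta), preprocess(x)))

# Preprocessor-aware query: perturb in the preprocessed space (one full
# quantization step on one pooled pixel) and map it back by upsampling, so the
# query actually changes what the model sees.
delta_low = np.zeros((32, 32))
delta_low[0, 0] = 1 / 255
x_aware = x + np.kron(delta_low, np.ones((2, 2)))
print("visible to the model:",
      not np.array_equal(preprocess(x_aware), preprocess(x)))
```

The checkerboard query returns exactly the label of the unperturbed input because the model never sees the perturbation; a decision-based attack probing at full resolution burns queries on this invariant set, a toy version of the effect behind the up to 7$\times$ degradation reported in the abstract.

The extraction step can be pictured as guess-and-check: hypothesize a preprocessor, craft perturbations that the hypothesized preprocessor would discard, and test near a decision boundary whether they ever change the hard label. The sketch below runs this idea on the same toy pipeline; it is a drastically simplified stand-in for the paper's extraction procedure, and the candidate set, query budget, and helper names are assumptions made for illustration.

```python
# Drastically simplified guess-and-check sketch of preprocessor extraction
# from hard-label queries only; toy assumptions throughout, not the paper's
# actual algorithm.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(10, 32 * 32))

def preprocess(x):                      # hidden from the attacker
    pooled = x.reshape(32, 2, 32, 2).mean(axis=(1, 3))
    return np.round(pooled * 255) / 255

def hard_label(x):                      # the attacker's only access
    return int(np.argmax(W @ preprocess(x).reshape(-1)))

def boundary_point():
    """Binary-search along a line between two differently labeled inputs."""
    while True:
        x0, x1 = rng.uniform(0.2, 0.8, (2, 64, 64))
        if hard_label(x0) != hard_label(x1):
            break
    lo, hi, l0 = 0.0, 1.0, hard_label(x0)
    for _ in range(30):
        mid = (lo + hi) / 2
        if hard_label((1 - mid) * x0 + mid * x1) == l0:
            lo = mid
        else:
            hi = mid
    return (1 - lo) * x0 + lo * x1       # just on the l0 side of the boundary

def invariant_perturbation(s, eps=0.05):
    """A perturbation that factor-s average pooling would discard entirely."""
    tile = rng.normal(size=(s, s))
    tile -= tile.mean()                  # zero mean within each s x s block
    return eps * np.tile(tile, (64 // s, 64 // s))

x_b = boundary_point()
base = hard_label(x_b)
for s in (2, 4, 8):                      # candidate pooling factors
    flips = 0
    for _ in range(20):
        d = invariant_perturbation(s)
        flips += hard_label(x_b + d) != base
        flips += hard_label(x_b - d) != base
    verdict = "consistent" if flips == 0 else "rejected"
    print(f"hypothesis: {s}x{s} average pooling -> {verdict} ({flips} label flips)")
```

With this toy pipeline only the true 2x2 hypothesis should come back consistent; in the paper's setting the analogous hypotheses cover realistic preprocessing operations, at the few-hundred-query cost quoted in the abstract.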
Related papers
- Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
arXiv Detail & Related papers (2023-10-25T03:30:42Z)
- DeltaBound Attack: Efficient decision-based attack in low queries regime [0.4061135251278187]
Deep neural networks and other machine learning systems are vulnerable to adversarial attacks.
We propose a novel, powerful attack in the hard-label setting with $\ell$-norm bounded perturbations.
We find that the DeltaBound attack performs as well as, and sometimes better than, current state-of-the-art attacks.
arXiv Detail & Related papers (2022-10-01T14:45:18Z)
- Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
arXiv Detail & Related papers (2022-07-25T03:24:58Z)
- Query Efficient Decision Based Sparse Attacks Against Black-Box Deep Learning Models [9.93052896330371]
We develop an evolution-based algorithm, SparseEvo, for the problem and evaluate it against both convolutional deep neural networks and vision transformers.
SparseEvo requires significantly fewer model queries than the state-of-the-art sparse attack Pointwise for both untargeted and targeted attacks.
Importantly, the query-efficient SparseEvo, along with decision-based attacks in general, raises new questions regarding the safety of deployed systems.
arXiv Detail & Related papers (2022-01-31T21:10:47Z)
- Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes [5.865029600972316]
Quantization is a technique that transforms the parameter representation of a neural network from floating-point numbers into lower-precision ones (a minimal quantization example is sketched after this list).
We propose a new training framework to implement adversarial quantization outcomes.
We show that a single compromised model defeats multiple quantization schemes.
arXiv Detail & Related papers (2021-10-26T10:09:49Z)
- Multi-concept adversarial attacks [13.538643599990785]
Test time attacks targeting a single ML model often neglect their impact on other ML models.
We develop novel attack techniques that can simultaneously attack one set of ML models while preserving the accuracy of the other.
arXiv Detail & Related papers (2021-10-19T22:14:19Z)
- Attribution of Gradient Based Adversarial Attacks for Reverse Engineering of Deceptions [16.23543028393521]
We present two techniques that support automated identification and attribution of adversarial ML attack toolchains.
To the best of our knowledge, this is the first approach to attribute gradient-based adversarial attacks and estimate their parameters.
arXiv Detail & Related papers (2021-03-19T19:55:00Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
We formulate the attack as a binary integer programming (BIP) problem and, using a recent technique from integer programming, equivalently reformulate it as a continuous optimization problem (a generic form of this reformulation is sketched after this list).
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- Composite Adversarial Attacks [57.293211764569996]
Adversarial attacks are techniques for deceiving Machine Learning (ML) models.
In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching for the best combination of attack algorithms.
CAA beats 10 top attackers on 11 diverse defenses with less elapsed time.
arXiv Detail & Related papers (2020-12-10T03:21:16Z)
- On Adversarial Examples and Stealth Attacks in Artificial Intelligence Systems [62.997667081978825]
We present a formal framework for assessing and analyzing two classes of malevolent action towards generic Artificial Intelligence (AI) systems.
The first class involves adversarial examples and concerns the introduction of small perturbations of the input data that cause misclassification.
The second class, introduced here for the first time and named stealth attacks, involves small perturbations to the AI system itself.
arXiv Detail & Related papers (2020-04-09T10:56:53Z)
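For the Qu-ANTI-zation entry above, here is a minimal sketch of plain symmetric uniform weight quantization (not that paper's training framework; the function name, bit width, and tensor shape are illustrative assumptions) showing the float-to-low-precision mapping and the rounding gap that quantization artifacts come from.

```python
# Minimal illustration of uniform weight quantization: float32 weights mapped
# to k-bit integers and back. Not the Qu-ANTI-zation framework, just the basic
# transformation the entry above refers to.
import numpy as np

def quantize_dequantize(w, bits=8):
    """Symmetric per-tensor uniform quantization of a weight array."""
    qmax = 2 ** (bits - 1) - 1                  # e.g. 127 for 8-bit
    scale = np.abs(w).max() / qmax              # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int32)
    return q * scale                            # what the quantized model computes with

w = np.random.default_rng(0).normal(scale=0.1, size=(256, 256)).astype(np.float32)
w_q = quantize_dequantize(w, bits=8)
print("max rounding error:", np.abs(w - w_q).max())   # bounded by ~scale / 2
# This small, structured gap between w and w_q is the kind of artifact a
# compromised model can be trained to exploit, so that its behavior changes
# only after quantization is applied.
```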
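The "Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits" entry above mentions equivalently reformulating a binary integer programming (BIP) problem as a continuous one. A standard equivalence that enables such reformulations is the box-plus-sphere rewriting of binary constraints shown below; whether it matches that paper's exact construction is an assumption, but the identity itself is standard.

```latex
% Binary constraints rewritten exactly as continuous ones (box + sphere);
% treating this as the reformulation used in the cited paper is an assumption.
\[
  \{0,1\}^n \;=\; [0,1]^n \,\cap\,
  \Big\{\, \mathbf{v}\in\mathbb{R}^n \;:\;
  \big\|\mathbf{v}-\tfrac{1}{2}\mathbf{1}\big\|_2^2 = \tfrac{n}{4} \,\Big\},
\]
% so a bit-selection problem  min over b in {0,1}^n of L(b)  becomes
\[
  \min_{\mathbf{b}\in[0,1]^n} \; L(\mathbf{b})
  \quad\text{s.t.}\quad
  \big\|\mathbf{b}-\tfrac{1}{2}\mathbf{1}\big\|_2^2 = \tfrac{n}{4},
\]
% a continuous program that can be handled with ADMM or penalty methods.
```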
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.