Preprocessors Matter! Realistic Decision-Based Attacks on Machine
Learning Systems
- URL: http://arxiv.org/abs/2210.03297v2
- Date: Thu, 20 Jul 2023 19:28:22 GMT
- Title: Preprocessors Matter! Realistic Decision-Based Attacks on Machine
Learning Systems
- Authors: Chawin Sitawarin, Florian Tramèr, Nicholas Carlini
- Abstract summary: Decision-based attacks construct adversarial examples against a machine learning (ML) model by making only hard-label queries.
We develop techniques to (i) reverse-engineer the preprocessor and then (ii) use this extracted information to attack the end-to-end system.
Our preprocessor extraction method requires only a few hundred queries, and our preprocessor-aware attacks recover the same efficacy as when attacking the model alone.
- Score: 56.64374584117259
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Decision-based attacks construct adversarial examples against a machine
learning (ML) model by making only hard-label queries. These attacks have
mainly been applied directly to standalone neural networks. However, in
practice, ML models are just one component of a larger learning system. We find
that by adding a single preprocessor in front of a classifier, state-of-the-art
query-based attacks are up to 7$\times$ less effective at attacking a
prediction pipeline than at attacking the model alone. We explain this
discrepancy by the fact that most preprocessors introduce some notion of
invariance to the input space. Hence, attacks that are unaware of this
invariance inevitably waste a large number of queries to re-discover or
overcome it. We, therefore, develop techniques to (i) reverse-engineer the
preprocessor and then (ii) use this extracted information to attack the
end-to-end system. Our preprocessor extraction method requires only a few
hundred queries, and our preprocessor-aware attacks recover the same efficacy
as when attacking the model alone. The code can be found at
https://github.com/google-research/preprocessor-aware-black-box-attack.
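To make the invariance argument concrete, here is a minimal sketch, assuming a toy pipeline (2x2 average pooling plus 8-bit quantization in front of a stand-in linear classifier; all sizes and names are illustrative, not the paper's setup), of how a preprocessor silently absorbs perturbations that a preprocessor-unaware attacker still pays queries for, and how crafting the perturbation in the preprocessed space avoids the waste.

```python
# Illustrative toy only (not the paper's code): a preprocessor in front of a
# classifier creates an invariant set that hides attack progress.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(10, 32 * 32))                # stand-in linear "classifier"

def preprocess(x):
    """Average-pool 64x64 -> 32x32, then quantize to 8-bit (1/255) steps."""
    pooled = x.reshape(32, 2, 32, 2).mean(axis=(1, 3))
    return np.round(pooled * 255) / 255

def hard_label(x):
    """The only interface a decision-based attacker gets to query."""
    return int(np.argmax(W @ preprocess(x).reshape(-1)))

x = rng.uniform(0.1, 0.9, size=(64, 64))

# Preprocessor-unaware query: a fairly large checkerboard perturbation cancels
# inside every 2x2 pooling cell, so the model sees the identical quantized
# input and the query reveals nothing.
delta = 0.05 * np.tile(np.array([[1.0, -1.0], [-1.0, 1.0]]), (32, 32))
print("invisible to the model:",
      np.array_equal(preprocess(x + delta), preprocess(x)))

# Preprocessor-aware query: perturb in the preprocessed space (one full
# quantization step on one pooled pixel) and map it back by upsampling, so the
# query actually changes what the model sees.
delta_low = np.zeros((32, 32))
delta_low[0, 0] = 1 / 255
x_aware = x + np.kron(delta_low, np.ones((2, 2)))
print("visible to the model:",
      not np.array_equal(preprocess(x_aware), preprocess(x)))
```

The checkerboard query returns exactly the label of the unperturbed input because the model never sees the perturbation; a decision-based attack probing at full resolution burns queries on this invariant set, a toy version of the effect behind the up to 7$\times$ degradation reported in the abstract.

The extraction step can be pictured as guess-and-check: hypothesize a preprocessor, craft perturbations that the hypothesized preprocessor would discard, and test near a decision boundary whether they ever change the hard label. The sketch below runs this idea on the same toy pipeline; it is a drastically simplified stand-in for the paper's extraction procedure, and the candidate set, query budget, and helper names are assumptions made for illustration.

```python
# Drastically simplified guess-and-check sketch of preprocessor extraction
# from hard-label queries only; toy assumptions throughout, not the paper's
# actual algorithm.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(10, 32 * 32))

def preprocess(x):                      # hidden from the attacker
    pooled = x.reshape(32, 2, 32, 2).mean(axis=(1, 3))
    return np.round(pooled * 255) / 255

def hard_label(x):                      # the attacker's only access
    return int(np.argmax(W @ preprocess(x).reshape(-1)))

def boundary_point():
    """Binary-search along a line between two differently labeled inputs."""
    while True:
        x0, x1 = rng.uniform(0.2, 0.8, (2, 64, 64))
        if hard_label(x0) != hard_label(x1):
            break
    lo, hi, l0 = 0.0, 1.0, hard_label(x0)
    for _ in range(30):
        mid = (lo + hi) / 2
        if hard_label((1 - mid) * x0 + mid * x1) == l0:
            lo = mid
        else:
            hi = mid
    return (1 - lo) * x0 + lo * x1       # just on the l0 side of the boundary

def invariant_perturbation(s, eps=0.05):
    """A perturbation that factor-s average pooling would discard entirely."""
    tile = rng.normal(size=(s, s))
    tile -= tile.mean()                  # zero mean within each s x s block
    return eps * np.tile(tile, (64 // s, 64 // s))

x_b = boundary_point()
base = hard_label(x_b)
for s in (2, 4, 8):                      # candidate pooling factors
    flips = 0
    for _ in range(20):
        d = invariant_perturbation(s)
        flips += hard_label(x_b + d) != base
        flips += hard_label(x_b - d) != base
    verdict = "consistent" if flips == 0 else "rejected"
    print(f"hypothesis: {s}x{s} average pooling -> {verdict} ({flips} label flips)")
```

With this toy pipeline only the true 2x2 hypothesis should come back consistent; in the paper's setting the analogous hypotheses cover realistic preprocessing operations, at the few-hundred-query cost quoted in the abstract.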
Related papers
- Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
arXiv Detail & Related papers (2023-10-25T03:30:42Z)
- DeltaBound Attack: Efficient decision-based attack in low queries regime [0.4061135251278187]
Deep neural networks and other machine learning systems are vulnerable to adversarial attacks.
We propose a novel, powerful attack in the hard-label setting with $\ell$-norm bounded perturbations.
We find that the DeltaBound attack performs as well as, and sometimes better than, current state-of-the-art attacks.
arXiv Detail & Related papers (2022-10-01T14:45:18Z)
- Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA).
arXiv Detail & Related papers (2022-07-25T03:24:58Z)
- Query Efficient Decision Based Sparse Attacks Against Black-Box Deep Learning Models [9.93052896330371]
We develop an evolution-based algorithm, SparseEvo, for the problem and evaluate it against both convolutional deep neural networks and vision transformers.
SparseEvo requires significantly fewer model queries than the state-of-the-art sparse attack Pointwise for both untargeted and targeted attacks.
Importantly, the query-efficient SparseEvo, along with decision-based attacks in general, raises new questions regarding the safety of deployed systems.
arXiv Detail & Related papers (2022-01-31T21:10:47Z)
- Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes [5.865029600972316]
Quantization is a technique that transforms the parameter representation of a neural network from floating-point numbers into lower-precision ones (a minimal quantization example is sketched after this list).
We propose a new training framework to implement adversarial quantization outcomes.
We show that a single compromised model defeats multiple quantization schemes.
arXiv Detail & Related papers (2021-10-26T10:09:49Z)
- Multi-concept adversarial attacks [13.538643599990785]
Test time attacks targeting a single ML model often neglect their impact on other ML models.
We develop novel attack techniques that can simultaneously attack one set of ML models while preserving the accuracy of the other.
arXiv Detail & Related papers (2021-10-19T22:14:19Z)
- Attribution of Gradient Based Adversarial Attacks for Reverse Engineering of Deceptions [16.23543028393521]
We present two techniques that support automated identification and attribution of adversarial ML attack toolchains.
To the best of our knowledge, this is the first approach to attribute gradient-based adversarial attacks and estimate their parameters.
arXiv Detail & Related papers (2021-03-19T19:55:00Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
We formulate the attack as a binary integer programming (BIP) problem and, using a recent technique from integer programming, equivalently reformulate it as a continuous optimization problem (a generic form of this reformulation is sketched after this list).
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- Composite Adversarial Attacks [57.293211764569996]
Adversarial attacks are techniques for deceiving Machine Learning (ML) models.
In this paper, a new procedure called Composite Adversarial Attack (CAA) is proposed for automatically searching for the best combination of attack algorithms.
CAA beats 10 top attackers on 11 diverse defenses with less elapsed time.
arXiv Detail & Related papers (2020-12-10T03:21:16Z)
- On Adversarial Examples and Stealth Attacks in Artificial Intelligence Systems [62.997667081978825]
We present a formal framework for assessing and analyzing two classes of malevolent action towards generic Artificial Intelligence (AI) systems.
The first class involves adversarial examples and concerns the introduction of small perturbations of the input data that cause misclassification.
The second class, introduced here for the first time and named stealth attacks, involves small perturbations to the AI system itself.
arXiv Detail & Related papers (2020-04-09T10:56:53Z)
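For the Qu-ANTI-zation entry above, here is a minimal sketch of plain symmetric uniform weight quantization (not that paper's training framework; the function name, bit width, and tensor shape are illustrative assumptions) showing the float-to-low-precision mapping and the rounding gap that quantization artifacts come from.

```python
# Minimal illustration of uniform weight quantization: float32 weights mapped
# to k-bit integers and back. Not the Qu-ANTI-zation framework, just the basic
# transformation the entry above refers to.
import numpy as np

def quantize_dequantize(w, bits=8):
    """Symmetric per-tensor uniform quantization of a weight array."""
    qmax = 2 ** (bits - 1) - 1                  # e.g. 127 for 8-bit
    scale = np.abs(w).max() / qmax              # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int32)
    return q * scale                            # what the quantized model computes with

w = np.random.default_rng(0).normal(scale=0.1, size=(256, 256)).astype(np.float32)
w_q = quantize_dequantize(w, bits=8)
print("max rounding error:", np.abs(w - w_q).max())   # bounded by ~scale / 2
# This small, structured gap between w and w_q is the kind of artifact a
# compromised model can be trained to exploit, so that its behavior changes
# only after quantization is applied.
```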
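The "Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits" entry above mentions equivalently reformulating a binary integer programming (BIP) problem as a continuous one. A standard equivalence that enables such reformulations is the box-plus-sphere rewriting of binary constraints shown below; whether it matches that paper's exact construction is an assumption, but the identity itself is standard.

```latex
% Binary constraints rewritten exactly as continuous ones (box + sphere);
% treating this as the reformulation used in the cited paper is an assumption.
\[
  \{0,1\}^n \;=\; [0,1]^n \,\cap\,
  \Big\{\, \mathbf{v}\in\mathbb{R}^n \;:\;
  \big\|\mathbf{v}-\tfrac{1}{2}\mathbf{1}\big\|_2^2 = \tfrac{n}{4} \,\Big\},
\]
% so a bit-selection problem  min over b in {0,1}^n of L(b)  becomes
\[
  \min_{\mathbf{b}\in[0,1]^n} \; L(\mathbf{b})
  \quad\text{s.t.}\quad
  \big\|\mathbf{b}-\tfrac{1}{2}\mathbf{1}\big\|_2^2 = \tfrac{n}{4},
\]
% a continuous program that can be handled with ADMM or penalty methods.
```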
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.