I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences
- URL: http://arxiv.org/abs/2206.08451v2
- Date: Tue, 6 Jun 2023 09:52:41 GMT
- Title: I Know What You Trained Last Summer: A Survey on Stealing Machine Learning Models and Defences
- Authors: Daryna Oliynyk, Rudolf Mayer, Andreas Rauber
- Abstract summary: We study model stealing attacks, assessing their performance and exploring corresponding defence techniques in different settings.
We propose a taxonomy for attack and defence approaches, and provide guidelines on how to select the right attack or defence based on the goal and available resources.
- Score: 0.1031296820074812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine Learning-as-a-Service (MLaaS) has become a widespread paradigm,
making even the most complex machine learning models available for clients via
e.g. a pay-per-query principle. This allows users to avoid time-consuming
processes of data collection, hyperparameter tuning, and model training.
However, by giving their customers access to the (predictions of their) models,
MLaaS providers endanger their intellectual property, such as sensitive
training data, optimised hyperparameters, or learned model parameters.
Adversaries can create a copy of the model with (almost) identical behavior
using the prediction labels only. While many variants of this attack have
been described, only scattered defence strategies have been proposed,
addressing isolated threats. This raises the necessity for a thorough
systematisation of the field of model stealing, to arrive at a comprehensive
understanding of why these attacks are successful, and how they could be
holistically defended against. We address this by categorising and comparing
model stealing attacks, assessing their performance, and exploring
corresponding defence techniques in different settings. We propose a taxonomy
for attack and defence approaches, and provide guidelines on how to select the
right attack or defence strategy based on the goal and available resources.
Finally, we analyse which defences are rendered less effective by current
attack strategies.
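To make the threat model concrete, the label-only extraction attack described in the abstract can be sketched in a few lines: the adversary queries the victim on unlabeled inputs, keeps only the returned class labels, and trains a surrogate on these pseudo-labels. The snippet below is an illustrative toy, not code from the survey; a locally trained random forest stands in for the remote pay-per-query API, and all model and size choices are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Stand-in for the victim: in a real MLaaS setting this would be a remote,
# pay-per-query API returning only predicted labels.
X, y = make_classification(n_samples=20_000, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
victim = RandomForestClassifier(n_estimators=100, random_state=0)
victim.fit(X[:10_000], y[:10_000])

# Attacker side: an unlabeled query pool, no access to the victim's training data.
query_pool, holdout = X[10_000:18_000], X[18_000:]

# 1. Spend the query budget: ask the victim for hard labels only.
stolen_labels = victim.predict(query_pool)

# 2. Train a surrogate on the (input, stolen label) pairs.
surrogate = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300, random_state=0)
surrogate.fit(query_pool, stolen_labels)

# 3. Fidelity: how often the surrogate agrees with the victim on unseen inputs.
agreement = (surrogate.predict(holdout) == victim.predict(holdout)).mean()
print(f"surrogate/victim agreement: {agreement:.2%}")
```

In practice the attacker must also decide where to spend the query budget, e.g. on random natural samples, synthetic inputs, or actively selected points, which is one of the dimensions along which such attacks differ.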
Related papers
- Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
arXiv Detail & Related papers (2023-10-25T03:30:42Z)
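Taking only the one-sentence summary above at face value, GRO's goal can be condensed into a joint objective in which the defender keeps the protected recommender accurate while degrading any surrogate distilled from its outputs. The notation and the trade-off weight lambda are illustrative assumptions, not taken from the paper:

```latex
\min_{\theta}\; \mathcal{L}\big(f_{\theta}\big) \;-\; \lambda\,\mathcal{L}\big(g_{\phi}\big),
\qquad \lambda > 0,
```

where f_theta is the protected target model, g_phi is a surrogate an attacker would fit to f_theta's ranked outputs, and lambda balances the target's own accuracy against the surrogate's degradation.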
- Towards Attack-tolerant Federated Learning via Critical Parameter Analysis [85.41873993551332]
Federated learning systems are susceptible to poisoning attacks when malicious clients send false updates to the central server.
This paper proposes a new defense strategy, FedCPA (Federated learning with Critical Parameter Analysis).
Our attack-tolerant aggregation method is based on the observation that benign local models have similar sets of top-k and bottom-k critical parameters, whereas poisoned local models do not.
arXiv Detail & Related papers (2023-08-18T05:37:55Z)
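The FedCPA observation above, that benign updates agree on which parameters matter most and least while poisoned updates do not, can be turned into a toy similarity check such as the one below. This is a hedged illustration of the idea, not FedCPA's actual aggregation rule; the choice of k, the use of raw parameter values, and the Jaccard-style scoring are all assumptions.

```python
import numpy as np

def topk_bottomk_indices(params, k):
    """Index sets of the k largest and k smallest entries of a flattened
    parameter (or parameter-update) vector."""
    order = np.argsort(params)
    return set(order[-k:]), set(order[:k])

def overlap_score(params_a, params_b, k=100):
    """Jaccard-style overlap of two clients' critical-parameter sets."""
    top_a, bot_a = topk_bottomk_indices(params_a, k)
    top_b, bot_b = topk_bottomk_indices(params_b, k)
    top = len(top_a & top_b) / len(top_a | top_b)
    bot = len(bot_a & bot_b) / len(bot_a | bot_b)
    return 0.5 * (top + bot)

def client_scores(client_params, k=100):
    """Average pairwise overlap per client; unusually low scores flag
    updates whose critical parameters disagree with the rest."""
    n = len(client_params)
    scores = np.zeros(n)
    for i in range(n):
        scores[i] = np.mean([overlap_score(client_params[i], client_params[j], k)
                             for j in range(n) if j != i])
    return scores
```

A server could, for example, down-weight or drop updates whose score falls below a threshold before averaging; the paper's actual aggregation is more involved.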
- FedDefender: Client-Side Attack-Tolerant Federated Learning [60.576073964874]
Federated learning enables learning from decentralized data sources without compromising privacy.
It is vulnerable to model poisoning attacks, where malicious clients interfere with the training process.
We propose a new defense mechanism that focuses on the client-side, called FedDefender, to help benign clients train robust local models.
arXiv Detail & Related papers (2023-07-18T08:00:41Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
We propose MESAS, the first defense robust against strong adaptive adversaries; it is effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Target Model Agnostic Adversarial Attacks with Query Budgets on Language Understanding Models [14.738950386902518]
We propose a target-model-agnostic adversarial attack method with a high degree of attack transferability across the attacked models.
Our empirical studies show that our method generates highly transferable adversarial sentences under the restriction of limited query budgets.
arXiv Detail & Related papers (2021-06-13T17:18:19Z)
- Adversarial Poisoning Attacks and Defense for General Multi-Class Models Based On Synthetic Reduced Nearest Neighbors [14.968442560499753]
State-of-the-art machine learning models are vulnerable to data poisoning attacks.
This paper first proposes a novel model-free label-flipping attack based on the multi-modality of the data.
Second, a novel defense technique based on the Synthetic Reduced Nearest Neighbor (SRNN) model is proposed.
arXiv Detail & Related papers (2021-02-11T06:55:40Z)
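The entry above gives no algorithmic detail beyond "model-free label flipping", so the snippet below only shows the generic shape of such a poisoning attack: a budgeted fraction of training labels is reassigned before training. It deliberately does not implement the paper's multi-modality-based selection of which labels to flip, nor the SRNN defence.

```python
import numpy as np

def flip_labels(y, flip_fraction=0.1, n_classes=None, seed=None):
    """Generic label-flipping poisoning: reassign a budgeted fraction of
    training labels to a different, randomly chosen class."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y).copy()
    n_classes = n_classes if n_classes is not None else int(y.max()) + 1
    victims = rng.choice(len(y), size=int(flip_fraction * len(y)), replace=False)
    for i in victims:
        candidates = [c for c in range(n_classes) if c != y[i]]
        y[i] = rng.choice(candidates)
    return y
```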
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, re-usable software framework, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
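ML-Doctor itself is a modular framework covering all four attacks; as a flavour of what one module measures, the sketch below runs the simplest membership-inference baseline, guessing "training member" whenever the model is highly confident. The threshold attack is a textbook baseline, not ML-Doctor's implementation, and `model` is assumed to expose an sklearn-style `predict_proba`.

```python
import numpy as np

def confidence_membership_inference(model, x_members, x_nonmembers, threshold=0.9):
    """Baseline membership inference: inputs on which the model is highly
    confident are guessed to belong to its training set."""
    def guess_member(x):
        return model.predict_proba(x).max(axis=1) >= threshold

    tpr = guess_member(x_members).mean()      # members correctly flagged
    fpr = guess_member(x_nonmembers).mean()   # non-members wrongly flagged
    return {"tpr": float(tpr), "fpr": float(fpr), "advantage": float(tpr - fpr)}
```

An advantage near zero suggests little membership leakage at this threshold; sweeping the threshold yields a full ROC curve.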
- Adversarial Attack Attribution: Discovering Attributable Signals in Adversarial ML Attacks [0.7883722807601676]
Even production systems, such as self-driving cars and ML-as-a-service offerings, are susceptible to adversarial inputs.
Can perturbed inputs be attributed to the methods used to generate the attack?
We introduce the concept of adversarial attack attribution and create a simple supervised learning experimental framework to examine the feasibility of discovering attributable signals in adversarial attacks.
arXiv Detail & Related papers (2021-01-08T08:16:41Z)
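The attribution question above reduces to a supervised learning problem: label each adversarial example with the attack that produced it and test whether a classifier recovers that label better than chance. The sketch below is one possible framing of that experiment; the model, features, and split are arbitrary choices, not the authors' framework.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def attribution_accuracy(perturbed_inputs, attack_labels):
    """Train a classifier to predict which attack generated each adversarial
    example; accuracy well above chance indicates an attributable signal."""
    X = np.asarray(perturbed_inputs).reshape(len(perturbed_inputs), -1)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, attack_labels, test_size=0.3, random_state=0, stratify=attack_labels)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)
```

In practice the perturbation itself (adversarial input minus clean input) is often a more informative feature than the raw perturbed input.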
- Omni: Automated Ensemble with Unexpected Models against Adversarial Evasion Attack [35.0689225703137]
A machine learning-based security detection model is susceptible to adversarial evasion attacks.
We propose an approach called Omni to explore methods that create an ensemble of "unexpected models".
In studies with five types of adversarial evasion attacks, we show Omni is a promising approach as a defense strategy.
arXiv Detail & Related papers (2020-11-23T20:02:40Z)
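Reading "unexpected models" as models whose hyperparameters differ from whatever the attacker assumes about the deployed detector, a minimal version of the idea is to train an ensemble with randomized hyperparameters and take a majority vote. The ranges and the choice of random forests below are assumptions made for illustration, not Omni's construction.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_unexpected_ensemble(X, y, n_models=5, seed=None):
    """Train an ensemble whose members use randomized hyperparameters, so an
    evasion attack tuned against one assumed configuration transfers less well."""
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_models):
        clf = RandomForestClassifier(
            n_estimators=int(rng.integers(50, 400)),
            max_depth=int(rng.integers(4, 30)),
            max_features=str(rng.choice(["sqrt", "log2"])),
            random_state=int(rng.integers(0, 2**31 - 1)),
        )
        members.append(clf.fit(X, y))
    return members

def ensemble_predict(members, X):
    """Majority vote over the members (assumes non-negative integer labels)."""
    votes = np.stack([m.predict(X) for m in members])   # shape: (n_models, n_samples)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```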
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacks aim to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Improving Robustness to Model Inversion Attacks via Mutual Information Regularization [12.079281416410227]
This paper studies defense mechanisms against model inversion (MI) attacks.
MI is a type of privacy attack aimed at inferring information about the training data distribution given access to a target machine learning model.
We propose the Mutual Information Regularization based Defense (MID) against MI attacks.
arXiv Detail & Related papers (2020-09-11T06:02:44Z)
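The MID summary above can be written as a single regularised training objective; the weight lambda is an illustrative assumption, and in practice the mutual information term must be approximated (the paper's estimator is not reproduced here):

```latex
\min_{\theta}\; \mathbb{E}_{(x,y)}\!\left[\mathcal{L}\big(f_{\theta}(x),\, y\big)\right]
\;+\; \lambda\, I\big(X;\, f_{\theta}(X)\big), \qquad \lambda > 0
```

Shrinking I(X; f_theta(X)) limits how strongly the released predictions depend on the inputs, which is the dependency a model-inversion adversary exploits to reconstruct sensitive training information.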
This list is automatically generated from the titles and abstracts of the papers on this site.