Efficient Defense Against Model Stealing Attacks on Convolutional Neural Networks
- URL: http://arxiv.org/abs/2309.01838v2
- Date: Mon, 11 Sep 2023 14:09:53 GMT
- Title: Efficient Defense Against Model Stealing Attacks on Convolutional Neural Networks
- Authors: Kacem Khaled, Mouna Dhaouadi, Felipe Gohring de Magalhães and Gabriela Nicolescu
- Abstract summary: Model stealing attacks can lead to intellectual property theft and other security and privacy risks.
Current state-of-the-art defenses against model stealing attacks suggest adding perturbations to the prediction probabilities.
We propose a simple yet effective and efficient defense alternative.
- Score: 0.548924822963045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model stealing attacks have become a serious concern for deep learning
models, where an attacker can steal a trained model by querying its black-box
API. This can lead to intellectual property theft and other security and
privacy risks. The current state-of-the-art defenses against model stealing
attacks suggest adding perturbations to the prediction probabilities. However,
they suffer from heavy computation and make impractical assumptions about
the adversary, and they often require training auxiliary models. This can be
time-consuming and resource-intensive, which hinders the deployment of these
defenses in real-world applications. In this paper, we propose a simple yet
effective and efficient defense alternative. We introduce a heuristic approach
to perturb the output probabilities. The proposed defense can be easily
integrated into models without additional training. We show that our defense is
effective against three state-of-the-art stealing attacks. We
evaluate our approach on large and quantized (i.e., compressed) Convolutional
Neural Networks (CNNs) trained on several vision datasets. Our technique
outperforms state-of-the-art defenses with $37\times$ faster inference,
without requiring any additional model and with low impact on the
model's performance. We validate that our defense is also effective for
quantized CNNs targeting edge devices.
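The abstract does not spell out the heuristic here, but the general shape of a perturbation-based defense of this kind can be illustrated in a few lines: perturb the probability vector returned by the prediction API, with no auxiliary model and no retraining, while keeping the top-1 label intact so benign accuracy is largely preserved. The sketch below is only an illustration under these assumptions, not the paper's actual heuristic; the function name, the log-normal noise model, and the noise_scale parameter are hypothetical.

    import numpy as np

    def perturb_probabilities(probs, noise_scale=0.1, rng=None):
        """Illustrative sketch (not the paper's exact heuristic): perturb a
        softmax output before it leaves the black-box API, keeping the
        top-1 class unchanged so the predicted label is preserved."""
        rng = rng or np.random.default_rng()
        top1 = int(np.argmax(probs))

        # Multiplicative log-normal noise on every class probability.
        noisy = probs * np.exp(rng.normal(0.0, noise_scale, size=probs.shape))
        noisy /= noisy.sum()  # renormalize to a valid distribution

        # If the noise flipped the argmax, swap it back so benign top-1
        # accuracy is unaffected.
        j = int(np.argmax(noisy))
        if j != top1:
            noisy[top1], noisy[j] = noisy[j], noisy[top1]
        return noisy

    # Example: a defended API would apply this to the CNN's softmax output
    # for every query before returning the probabilities to the client.
    print(perturb_probabilities(np.array([0.70, 0.20, 0.10])))

In such a scheme, the noise scale governs the trade-off between how much useful signal a stealing adversary can distill from the returned probabilities and how far those probabilities drift from the model's true outputs for benign users.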
Related papers
- Versatile Defense Against Adversarial Attacks on Image Recognition [2.9980620769521513]
Defending against adversarial attacks in a real-life setting can be compared to the way antivirus software works.
It appears that a defense method based on image-to-image translation may be capable of this.
The trained model has successfully improved the classification accuracy from nearly zero to an average of 86%.
arXiv Detail & Related papers (2024-03-13T01:48:01Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
Instead of adding perturbations to model predictions, which harms benign accuracy, InI trains models to produce uninformative outputs for stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
Their defense, MESAS, is the first to be robust against strong adaptive adversaries, remains effective in real-world data scenarios, and adds an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection [16.88718696087103]
We present a new approach to model stealing defenses called gradient redirection.
At the core of our approach is a provably optimal, efficient algorithm for steering an adversary's training updates in a targeted manner.
Combined with improvements to surrogate networks and a novel coordinated defense strategy, our gradient redirection defense, called GRAD$^2$, achieves small utility trade-offs and low computational overhead.
arXiv Detail & Related papers (2022-06-28T17:04:49Z)
- The Feasibility and Inevitability of Stealth Attacks [63.14766152741211]
We study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence systems.
In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself.
arXiv Detail & Related papers (2021-06-26T10:50:07Z)
- LAFEAT: Piercing Through Adversarial Defenses with Latent Features [15.189068478164337]
We show that latent features in certain "robust" models are surprisingly susceptible to adversarial attacks.
We introduce LAFEAT, a unified $\ell_\infty$-norm white-box attack algorithm that harnesses latent features in its gradient descent steps.
arXiv Detail & Related papers (2021-04-19T13:22:20Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
- Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning [71.17774313301753]
We explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks.
Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples.
arXiv Detail & Related papers (2020-06-05T03:03:06Z)
- An Analysis of Adversarial Attacks and Defenses on Autonomous Driving Models [15.007794089091616]
Convolutional neural networks (CNNs) are a key component in autonomous driving.
Previous work shows CNN-based classification models are vulnerable to adversarial attacks.
This paper presents an in-depth analysis of five adversarial attacks and four defense methods on three driving models.
arXiv Detail & Related papers (2020-02-06T09:49:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.