Efficient Defense Against Model Stealing Attacks on Convolutional Neural Networks
- URL: http://arxiv.org/abs/2309.01838v2
- Date: Mon, 11 Sep 2023 14:09:53 GMT
- Title: Efficient Defense Against Model Stealing Attacks on Convolutional Neural Networks
- Authors: Kacem Khaled, Mouna Dhaouadi, Felipe Gohring de Magalhães and Gabriela Nicolescu
- Abstract summary: Model stealing attacks can lead to intellectual property theft and other security and privacy risks.
Current state-of-the-art defenses against model stealing attacks suggest adding perturbations to the prediction probabilities.
We propose a simple yet effective and efficient defense alternative.
- Score: 0.548924822963045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model stealing attacks have become a serious concern for deep learning
models, where an attacker can steal a trained model by querying its black-box
API. This can lead to intellectual property theft and other security and
privacy risks. The current state-of-the-art defenses against model stealing
attacks suggest adding perturbations to the prediction probabilities. However,
they suffer from heavy computation and make impractical assumptions about
the adversary, and they often require training auxiliary models. This can be
time-consuming and resource-intensive, which hinders the deployment of these
defenses in real-world applications. In this paper, we propose a simple yet
effective and efficient defense alternative. We introduce a heuristic approach
to perturb the output probabilities. The proposed defense can be easily
integrated into models without additional training. We show that our defense is
effective against three state-of-the-art stealing attacks. We
evaluate our approach on large and quantized (i.e., compressed) Convolutional
Neural Networks (CNNs) trained on several vision datasets. Our technique
outperforms state-of-the-art defenses with $37\times$ faster inference,
without requiring any additional model and with low impact on the
model's performance. We validate that our defense is also effective for
quantized CNNs targeting edge devices.
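The abstract does not spell out the heuristic here, but the general shape of a perturbation-based defense of this kind can be illustrated in a few lines: perturb the probability vector returned by the prediction API, with no auxiliary model and no retraining, while keeping the top-1 label intact so benign accuracy is largely preserved. The sketch below is only an illustration under these assumptions, not the paper's actual heuristic; the function name, the log-normal noise model, and the noise_scale parameter are hypothetical.

    import numpy as np

    def perturb_probabilities(probs, noise_scale=0.1, rng=None):
        """Illustrative sketch (not the paper's exact heuristic): perturb a
        softmax output before it leaves the black-box API, keeping the
        top-1 class unchanged so the predicted label is preserved."""
        rng = rng or np.random.default_rng()
        top1 = int(np.argmax(probs))

        # Multiplicative log-normal noise on every class probability.
        noisy = probs * np.exp(rng.normal(0.0, noise_scale, size=probs.shape))
        noisy /= noisy.sum()  # renormalize to a valid distribution

        # If the noise flipped the argmax, swap it back so benign top-1
        # accuracy is unaffected.
        j = int(np.argmax(noisy))
        if j != top1:
            noisy[top1], noisy[j] = noisy[j], noisy[top1]
        return noisy

    # Example: a defended API would apply this to the CNN's softmax output
    # for every query before returning the probabilities to the client.
    print(perturb_probabilities(np.array([0.70, 0.20, 0.10])))

In such a scheme, the noise scale governs the trade-off between how much useful signal a stealing adversary can distill from the returned probabilities and how far those probabilities drift from the model's true outputs for benign users.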
Related papers
- Versatile Defense Against Adversarial Attacks on Image Recognition [2.9980620769521513]
Defending against adversarial attacks in a real-life setting can be compared to the way antivirus software works.
It appears that a defense method based on image-to-image translation may be capable of this.
The trained model has successfully improved the classification accuracy from nearly zero to an average of 86%.
arXiv Detail & Related papers (2024-03-13T01:48:01Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
Instead of adding perturbations to model predictions, which harms benign accuracy, InI trains models to produce uninformative outputs for stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
Their defense, MESAS, is the first to be robust against strong adaptive adversaries, remains effective in real-world data scenarios, and adds an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection [16.88718696087103]
We present a new approach to model stealing defenses called gradient redirection.
At the core of our approach is a provably optimal, efficient algorithm for steering an adversary's training updates in a targeted manner.
Combined with improvements to surrogate networks and a novel coordinated defense strategy, our gradient redirection defense, called GRAD$^2$, achieves small utility trade-offs and low computational overhead.
arXiv Detail & Related papers (2022-06-28T17:04:49Z)
- The Feasibility and Inevitability of Stealth Attacks [63.14766152741211]
We study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence systems.
In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself.
arXiv Detail & Related papers (2021-06-26T10:50:07Z)
- LAFEAT: Piercing Through Adversarial Defenses with Latent Features [15.189068478164337]
We show that latent features in certain "robust" models are surprisingly susceptible to adversarial attacks.
We introduce LAFEAT, a unified $\ell_\infty$-norm white-box attack algorithm that harnesses latent features in its gradient descent steps.
arXiv Detail & Related papers (2021-04-19T13:22:20Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Online Alternate Generator against Adversarial Attacks [144.45529828523408]
Deep learning models are notoriously sensitive to adversarial examples which are synthesized by adding quasi-perceptible noises on real images.
We propose a portable defense method, online alternate generator, which does not need to access or modify the parameters of the target networks.
The proposed method works by online synthesizing another image from scratch for an input image, instead of removing or destroying adversarial noises.
arXiv Detail & Related papers (2020-09-17T07:11:16Z)
- Defense for Black-box Attacks on Anti-spoofing Models by Self-Supervised Learning [71.17774313301753]
We explore the robustness of self-supervised learned high-level representations by using them in the defense against adversarial attacks.
Experimental results on the ASVspoof 2019 dataset demonstrate that high-level representations extracted by Mockingjay can prevent the transferability of adversarial examples.
arXiv Detail & Related papers (2020-06-05T03:03:06Z)
- An Analysis of Adversarial Attacks and Defenses on Autonomous Driving Models [15.007794089091616]
Convolutional neural networks (CNNs) are a key component in autonomous driving.
Previous work shows CNN-based classification models are vulnerable to adversarial attacks.
This paper presents an in-depth analysis of five adversarial attacks and four defense methods on three driving models.
arXiv Detail & Related papers (2020-02-06T09:49:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.