Publishing Efficient On-device Models Increases Adversarial
Vulnerability
- URL: http://arxiv.org/abs/2212.13700v1
- Date: Wed, 28 Dec 2022 05:05:58 GMT
- Title: Publishing Efficient On-device Models Increases Adversarial
Vulnerability
- Authors: Sanghyun Hong, Nicholas Carlini, Alexey Kurakin
- Abstract summary: In this paper, we study the security considerations of publishing on-device variants of large-scale models.
We first show that an adversary can exploit on-device models to make attacking the large models easier.
We then show that the vulnerability increases as the similarity between a full-scale model and its efficient variant increases.
- Score: 58.6975494957865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent increases in the computational demands of deep neural networks (DNNs)
have sparked interest in efficient deep learning mechanisms, e.g., quantization
or pruning. These mechanisms enable the construction of small, efficient
versions of commercial-scale models with comparable accuracy, accelerating their
deployment to resource-constrained devices.
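As a concrete, hedged illustration of how such an on-device sibling can be derived, the snippet below applies PyTorch's post-training dynamic quantization to a pretrained classifier. The choice of ResNet-50 and the use of torchvision are assumptions made for the example, not the setup used in the paper.

```python
# Illustrative sketch only: derive an "on-device" variant of a full-scale model
# via post-training dynamic quantization. The model choice and torchvision usage
# are assumptions for the example, not the paper's exact pipeline.
import torch
import torchvision.models as models

# Full-scale ("commercial-scale") model.
full_model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2).eval()

# Efficient sibling: quantize the Linear layers to int8 (pruning or
# quantization-aware training would be handled analogously).
on_device_model = torch.ao.quantization.quantize_dynamic(
    full_model, {torch.nn.Linear}, dtype=torch.qint8
)

# The two siblings make near-identical predictions on the same input, which is
# exactly the kind of similarity the paper's attack exploits.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    print(full_model(x).argmax(dim=1), on_device_model(x).argmax(dim=1))
```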
In this paper, we study the security considerations of publishing on-device
variants of large-scale models. We first show that an adversary can exploit
on-device models to make attacking the large models easier. In evaluations
across 19 DNNs, by exploiting the published on-device models as a transfer
prior, the adversarial vulnerability of the original commercial-scale models
increases by up to 100x. We then show that the vulnerability grows as the
similarity between a full-scale model and its efficient counterpart increases. Based on
these insights, we propose a defense, similarity-unpairing, that fine-tunes
on-device models with the objective of reducing this similarity. We evaluated
our defense on all 19 DNNs and found that it reduces transferability by up to 90%
and increases the number of queries an attacker requires by a factor of 10-100x. Our results
suggest that further research is needed on the security (or even privacy)
threats caused by publishing those efficient siblings.
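To make the abstract's threat model concrete, here is a minimal sketch of a transfer-prior attack: adversarial examples are crafted with standard L-infinity PGD against the published on-device model, whose weights the adversary holds in full, and then replayed against the full-scale model. The PGD hyperparameters, helper names, and the plain transfer measurement are illustrative assumptions; the paper's actual attack additionally uses the on-device model as a prior to cut down black-box query counts, which this sketch does not reproduce.

```python
# Hedged sketch (not the paper's exact attack): use the public on-device model
# as a white-box surrogate, craft PGD adversarial examples on it, and measure
# how often they transfer to the full-scale target. All names and
# hyperparameters below are illustrative assumptions.
import torch
import torch.nn.functional as F

def pgd_attack(surrogate, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Craft L-infinity PGD adversarial examples against a white-box surrogate."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(surrogate(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)             # keep pixels in a valid range
    return x_adv.detach()

@torch.no_grad()
def transfer_success_rate(target, x_adv, y):
    """Fraction of surrogate-crafted examples that also fool the target model."""
    return (target(x_adv).argmax(dim=1) != y).float().mean().item()

# Hypothetical usage: `surrogate` is a float copy of the published on-device
# weights (an attacker can always dequantize them), `full_model` is the
# commercial-scale target, and (x, y) is a batch of correctly labeled images.
# x_adv = pgd_attack(surrogate, x, y)
# print("transfer success:", transfer_success_rate(full_model, x_adv, y))
```

The similarity-unpairing defense described above would then fine-tune the on-device model specifically to reduce this similarity, which is what drives the transfer success rate back down.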
Related papers
- Disarming Steganography Attacks Inside Neural Network Models [4.750077838548593]
We propose a zero-trust prevention strategy based on AI model attack disarm and reconstruction.
We demonstrate a 100% prevention rate, while the Qint8 and K-LRBP methods introduce only a minimal decrease in model accuracy.
arXiv Detail & Related papers (2023-09-06T15:18:35Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations to model predictions, which harms benign accuracy, we train models to produce uninformative outputs in response to stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Partially Oblivious Neural Network Inference [4.843820624525483]
We show that for neural network models, like CNNs, some information leakage can be acceptable.
We experimentally demonstrate that in a CIFAR-10 network we can leak up to 80% of the model's weights with practically no security impact.
arXiv Detail & Related papers (2022-10-27T05:39:36Z)
- RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN [28.94653593443991]
Recently, backdoor attacks have become an emerging threat to the security of deep neural network (DNN) models.
In this paper, we propose to study and develop a Robust and Imperceptible Backdoor Attack against Compact DNN models (RIBAC).
arXiv Detail & Related papers (2022-08-22T21:27:09Z)
- Adversarial Robustness Assessment of NeuroEvolution Approaches [1.237556184089774]
We evaluate the robustness of models found by two NeuroEvolution approaches on the CIFAR-10 image classification task.
Our results show that when the evolved models are attacked with iterative methods, their accuracy usually drops to, or close to, zero.
Some of these techniques can exacerbate the perturbations added to the original inputs, potentially harming robustness.
arXiv Detail & Related papers (2022-07-12T10:40:19Z)
- Interpolated Joint Space Adversarial Training for Robust and Generalizable Defenses [82.3052187788609]
Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks.
Recent works show generalization improvement with adversarial samples under novel threat models.
We propose a novel threat model called Joint Space Threat Model (JSTM)
Under JSTM, we develop novel adversarial attacks and defenses.
arXiv Detail & Related papers (2021-12-12T21:08:14Z)
- Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients.
However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training.
We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z)
- "What's in the box?!": Deflecting Adversarial Attacks by Randomly Deploying Adversarially-Disjoint Models [71.91835408379602]
Adversarial examples have long been considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
arXiv Detail & Related papers (2021-02-09T20:07:13Z)
- Defence against adversarial attacks using classical and quantum-enhanced Boltzmann machines [64.62510681492994]
Generative models attempt to learn the distribution underlying a dataset, making them inherently more robust to small perturbations.
We find improvements ranging from 5% to 72% against attacks with Boltzmann machines on the MNIST dataset.
arXiv Detail & Related papers (2020-12-21T19:00:03Z)
- EMPIR: Ensembles of Mixed Precision Deep Networks for Increased Robustness against Adversarial Attacks [18.241639570479563]
Deep Neural Networks (DNNs) are vulnerable to adversarial attacks in which small input perturbations can produce catastrophic misclassifications.
We propose EMPIR, ensembles of quantized DNN models with different numerical precisions, as a new approach to increase robustness against adversarial attacks (see the sketch after this list).
Our results indicate that EMPIR boosts the average adversarial accuracies by 42.6%, 15.2% and 10.5% for the DNN models trained on the MNIST, CIFAR-10 and ImageNet datasets respectively.
arXiv Detail & Related papers (2020-04-21T17:17:09Z)
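The EMPIR entry above only names the idea, so here is a minimal hedged sketch of a mixed-precision ensemble: average the softmax outputs of a full-precision model and a quantized sibling and take the argmax. The plain averaging rule and the reuse of the models from the earlier sketches are illustrative assumptions, not EMPIR's exact combination mechanism.

```python
# Hedged sketch of a mixed-precision ensemble in the spirit of EMPIR: combine
# models of different numerical precisions by averaging their class
# probabilities. The averaging rule is an illustrative assumption, not EMPIR's
# exact combination mechanism.
import torch
import torch.nn.functional as F

@torch.no_grad()
def mixed_precision_predict(models, x):
    """Average softmax outputs over models of different precisions, then argmax."""
    probs = torch.stack([F.softmax(m(x), dim=1) for m in models])
    return probs.mean(dim=0).argmax(dim=1)

# Hypothetical usage, reusing the full-precision and quantized models from the
# sketches above:
# y_hat = mixed_precision_predict([full_model, on_device_model], x)
```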
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.