Towards Data-Free Model Stealing in a Hard Label Setting
- URL: http://arxiv.org/abs/2204.11022v1
- Date: Sat, 23 Apr 2022 08:44:51 GMT
- Title: Towards Data-Free Model Stealing in a Hard Label Setting
- Authors: Sunandini Sanyal, Sravanti Addepalli, R. Venkatesh Babu
- Abstract summary: We show that it is possible to steal Machine Learning models by accessing only top-1 predictions.
We propose a novel GAN-based framework that trains the student and generator in tandem to steal the model.
- Score: 41.92884427579068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models deployed as a service (MLaaS) are susceptible to
model stealing attacks, where an adversary attempts to steal the model within a
restricted access framework. While existing attacks demonstrate near-perfect
clone-model performance using softmax predictions of the classification
network, most of the APIs allow access to only the top-1 labels. In this work,
we show that it is indeed possible to steal Machine Learning models by
accessing only top-1 predictions (Hard Label setting) as well, without access
to model gradients (Black-Box setting) or even the training dataset (Data-Free
setting) within a low query budget. We propose a novel GAN-based framework that
trains the student and generator in tandem to steal the model effectively while
overcoming the challenge of the hard label setting by utilizing gradients of
the clone network as a proxy to the victim's gradients. We propose to overcome
the large query costs associated with a typical Data-Free setting by utilizing
publicly available (potentially unrelated) datasets as a weak image prior. We
additionally show that even in the absence of such data, it is possible to
achieve state-of-the-art results within a low query budget using synthetically
crafted samples. We are the first to demonstrate the scalability of Model
Stealing in a restricted access setting on a 100 class dataset as well.
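Below is a minimal sketch (PyTorch, not the authors' released code) of the tandem generator-student loop the abstract describes: because the victim's gradients are unavailable in the black-box, hard-label setting, the generator is updated by back-propagating through the clone (student) network as a proxy, while the student is fit to the victim's top-1 labels on the generated samples. The entropy-maximization objective, function names, and hyperparameters are illustrative assumptions; the weak image prior from proxy or synthetic data mentioned in the abstract is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def victim_top1(victim: nn.Module, images: torch.Tensor) -> torch.Tensor:
    """Black-box oracle stand-in: only the top-1 (hard) label crosses the API boundary."""
    with torch.no_grad():
        return victim(images).argmax(dim=1)

def steal_step(generator, student, victim, g_opt, s_opt,
               batch_size=128, z_dim=100, device="cpu"):
    # 1) Generator step: the victim's gradients are unavailable, so
    #    back-propagate through the clone instead, pushing the generator
    #    toward samples on which the student is uncertain.
    z = torch.randn(batch_size, z_dim, device=device)
    fake = generator(z)
    probs = F.softmax(student(fake), dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    g_loss = -entropy                      # maximize clone uncertainty (illustrative loss)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    # 2) Student step: spend the query budget here, fitting the clone to the
    #    victim's hard labels on freshly generated samples (one query each).
    with torch.no_grad():
        fake = generator(torch.randn(batch_size, z_dim, device=device))
    labels = victim_top1(victim, fake)
    s_loss = F.cross_entropy(student(fake), labels)
    s_opt.zero_grad(); s_loss.backward(); s_opt.step()
    return g_loss.item(), s_loss.item()
```

The sketch only shows where queries are spent (the student step) and where clone gradients substitute for victim gradients (the generator step); the paper's use of publicly available or synthetically crafted data as a weak image prior is not reproduced here.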
Related papers
- Data-Free Hard-Label Robustness Stealing Attack [67.41281050467889]
We introduce a novel Data-Free Hard-Label Robustness Stealing (DFHL-RS) attack in this paper.
It enables the stealing of both model accuracy and robustness by simply querying hard labels of the target model.
Our method achieves a clean accuracy of 77.86% and a robust accuracy of 39.51% against AutoAttack.
arXiv Detail & Related papers (2023-12-10T16:14:02Z)
- Army of Thieves: Enhancing Black-Box Model Extraction via Ensemble based sample selection [10.513955887214497]
In Model Stealing Attacks (MSA), a machine learning model is queried repeatedly to build a labelled dataset.
In this work, we explore the usage of an ensemble of deep learning models as our thief model.
We achieve a 21% higher adversarial sample transferability than previous work for models trained on the CIFAR-10 dataset.
arXiv Detail & Related papers (2023-11-08T10:31:29Z)
- Beyond Labeling Oracles: What does it mean to steal ML models? [52.63413852460003]
Model extraction attacks are designed to steal trained models with only query access.
We investigate factors influencing the success of model extraction attacks.
Our findings urge the community to redefine the adversarial goals of ME attacks.
arXiv Detail & Related papers (2023-10-03T11:10:21Z)
- Towards Few-Call Model Stealing via Active Self-Paced Knowledge Distillation and Diffusion-Based Image Generation [33.60710287553274]
We propose to copy black-box classification models without access to the original training data, the architecture, or the weights of the model.
We employ a novel active self-paced learning framework to make the most of the proxy data during distillation.
Our empirical results on two data sets confirm the superiority of our framework over two state-of-the-art methods in the few-call model extraction scenario.
arXiv Detail & Related papers (2023-09-29T19:09:27Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples [128.25509832644025]
There is a growing interest in developing unlearnable examples (UEs) against visual privacy leaks on the Internet.
UEs are training samples with invisible but unlearnable noise added, which has been found to prevent unauthorized training of machine learning models.
We present a novel technique called Unlearnable Clusters (UCs) to generate label-agnostic unlearnable examples with cluster-wise perturbations.
arXiv Detail & Related papers (2022-12-31T04:26:25Z)
- Label-Only Model Inversion Attacks via Boundary Repulsion [12.374249336222906]
We introduce an algorithm to invert private training data using only the target model's predicted labels.
Using the example of face recognition, we show that the images reconstructed by BREP-MI successfully reproduce the semantics of the private training data.
arXiv Detail & Related papers (2022-03-03T18:57:57Z)
- RamBoAttack: A Robust Query Efficient Deep Neural Network Decision Exploit [9.93052896330371]
We develop a robust query efficient attack capable of avoiding entrapment in a local minimum and misdirection from noisy gradients.
The RamBoAttack is more robust to the different sample inputs available to an adversary and the targeted class.
arXiv Detail & Related papers (2021-12-10T01:25:24Z)
- Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack [90.6076825117532]
A model stealing attack aims to create a substitute model that replicates the ability of the victim target model.
Most existing methods depend on the full probability outputs of the victim model, which are unavailable in most realistic scenarios.
We propose a novel hard-label model stealing method termed "black-box dissector", which includes a CAM-driven erasing strategy to mine the hidden information in hard labels from the victim model.
arXiv Detail & Related papers (2021-05-03T04:12:31Z)
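As a rough illustration of the CAM-driven erasing idea in the last entry (a hypothetical sketch in PyTorch, not the paper's implementation), one can compute a class activation map with the substitute model, erase the most salient region, and re-query the victim's hard label to check whether the prediction survives. The Grad-CAM variant, the erasing threshold, and the oracle interface below are all assumptions; `feature_layer` is assumed to be a convolutional feature-map module of the substitute.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def grad_cam(substitute: nn.Module, feature_layer: nn.Module,
             image: torch.Tensor, target_class: int) -> torch.Tensor:
    """Grad-CAM computed on the substitute model (the victim stays a black box)."""
    acts, grads = [], []
    h1 = feature_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = feature_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    logits = substitute(image.unsqueeze(0))
    substitute.zero_grad()
    logits[0, target_class].backward()
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)            # GAP over gradients
    cam = F.relu((weights * acts[0]).sum(dim=1, keepdim=True))   # weighted feature maps
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return (cam[0, 0] / cam.max().clamp_min(1e-8)).detach()

def erase_and_requery(victim_top1, substitute, feature_layer,
                      image, label, thresh=0.7):
    # Erase the region the substitute finds most salient, then ask the victim
    # whether its hard label changes; a changed label carries extra signal.
    cam = grad_cam(substitute, feature_layer, image, label)
    erased = image * (cam < thresh).float()
    return victim_top1(erased.unsqueeze(0))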
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.