Towards Data-Free Model Stealing in a Hard Label Setting
- URL: http://arxiv.org/abs/2204.11022v1
- Date: Sat, 23 Apr 2022 08:44:51 GMT
- Title: Towards Data-Free Model Stealing in a Hard Label Setting
- Authors: Sunandini Sanyal, Sravanti Addepalli, R. Venkatesh Babu
- Abstract summary: We show that it is possible to steal Machine Learning models by accessing only top-1 predictions.
We propose a novel GAN-based framework that trains the student and generator in tandem to steal the model.
- Score: 41.92884427579068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models deployed as a service (MLaaS) are susceptible to
model stealing attacks, where an adversary attempts to steal the model within a
restricted access framework. While existing attacks demonstrate near-perfect
clone-model performance using softmax predictions of the classification
network, most of the APIs allow access to only the top-1 labels. In this work,
we show that it is indeed possible to steal Machine Learning models by
accessing only top-1 predictions (Hard Label setting) as well, without access
to model gradients (Black-Box setting) or even the training dataset (Data-Free
setting) within a low query budget. We propose a novel GAN-based framework that
trains the student and generator in tandem to steal the model effectively while
overcoming the challenge of the hard label setting by utilizing gradients of
the clone network as a proxy to the victim's gradients. We propose to overcome
the large query costs associated with a typical Data-Free setting by utilizing
publicly available (potentially unrelated) datasets as a weak image prior. We
additionally show that even in the absence of such data, it is possible to
achieve state-of-the-art results within a low query budget using synthetically
crafted samples. We are the first to demonstrate the scalability of Model
Stealing in a restricted access setting on a 100 class dataset as well.
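Below is a minimal sketch (PyTorch, not the authors' released code) of the tandem generator-student loop the abstract describes: because the victim's gradients are unavailable in the black-box, hard-label setting, the generator is updated by back-propagating through the clone (student) network as a proxy, while the student is fit to the victim's top-1 labels on the generated samples. The entropy-maximization objective, function names, and hyperparameters are illustrative assumptions; the weak image prior from proxy or synthetic data mentioned in the abstract is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def victim_top1(victim: nn.Module, images: torch.Tensor) -> torch.Tensor:
    """Black-box oracle stand-in: only the top-1 (hard) label crosses the API boundary."""
    with torch.no_grad():
        return victim(images).argmax(dim=1)

def steal_step(generator, student, victim, g_opt, s_opt,
               batch_size=128, z_dim=100, device="cpu"):
    # 1) Generator step: the victim's gradients are unavailable, so
    #    back-propagate through the clone instead, pushing the generator
    #    toward samples on which the student is uncertain.
    z = torch.randn(batch_size, z_dim, device=device)
    fake = generator(z)
    probs = F.softmax(student(fake), dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    g_loss = -entropy                      # maximize clone uncertainty (illustrative loss)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

    # 2) Student step: spend the query budget here, fitting the clone to the
    #    victim's hard labels on freshly generated samples (one query each).
    with torch.no_grad():
        fake = generator(torch.randn(batch_size, z_dim, device=device))
    labels = victim_top1(victim, fake)
    s_loss = F.cross_entropy(student(fake), labels)
    s_opt.zero_grad(); s_loss.backward(); s_opt.step()
    return g_loss.item(), s_loss.item()
```

The sketch only shows where queries are spent (the student step) and where clone gradients substitute for victim gradients (the generator step); the paper's use of publicly available or synthetically crafted data as a weak image prior is not reproduced here.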
Related papers
- Data-Free Hard-Label Robustness Stealing Attack [67.41281050467889]
We introduce a novel Data-Free Hard-Label Robustness Stealing (DFHL-RS) attack in this paper.
It enables the stealing of both model accuracy and robustness by simply querying hard labels of the target model.
Our method achieves a clean accuracy of 77.86% and a robust accuracy of 39.51% against AutoAttack.
arXiv Detail & Related papers (2023-12-10T16:14:02Z)
- Army of Thieves: Enhancing Black-Box Model Extraction via Ensemble based sample selection [10.513955887214497]
In Model Stealing Attacks (MSA), a machine learning model is queried repeatedly to build a labelled dataset.
In this work, we explore the usage of an ensemble of deep learning models as our thief model.
We achieve a 21% higher adversarial sample transferability than previous work for models trained on the CIFAR-10 dataset.
arXiv Detail & Related papers (2023-11-08T10:31:29Z)
- Beyond Labeling Oracles: What does it mean to steal ML models? [52.63413852460003]
Model extraction attacks are designed to steal trained models with only query access.
We investigate factors influencing the success of model extraction attacks.
Our findings urge the community to redefine the adversarial goals of ME attacks.
arXiv Detail & Related papers (2023-10-03T11:10:21Z)
- Towards Few-Call Model Stealing via Active Self-Paced Knowledge Distillation and Diffusion-Based Image Generation [33.60710287553274]
We propose to copy black-box classification models without access to the original training data, the architecture, or the weights of the model.
We employ a novel active self-paced learning framework to make the most of the proxy data during distillation.
Our empirical results on two data sets confirm the superiority of our framework over two state-of-the-art methods in the few-call model extraction scenario.
arXiv Detail & Related papers (2023-09-29T19:09:27Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples [128.25509832644025]
There is a growing interest in developing unlearnable examples (UEs) against visual privacy leaks on the Internet.
UEs are training samples with invisible but unlearnable noise added, which has been found to prevent unauthorized training of machine learning models.
We present a novel technique called Unlearnable Clusters (UCs) to generate label-agnostic unlearnable examples with cluster-wise perturbations.
arXiv Detail & Related papers (2022-12-31T04:26:25Z)
- Label-Only Model Inversion Attacks via Boundary Repulsion [12.374249336222906]
We introduce an algorithm to invert private training data using only the target model's predicted labels.
Using the example of face recognition, we show that the images reconstructed by BREP-MI successfully reproduce the semantics of the private training data.
arXiv Detail & Related papers (2022-03-03T18:57:57Z)
- RamBoAttack: A Robust Query Efficient Deep Neural Network Decision Exploit [9.93052896330371]
We develop a robust query efficient attack capable of avoiding entrapment in a local minimum and misdirection from noisy gradients.
The RamBoAttack is more robust to the different sample inputs available to an adversary and the targeted class.
arXiv Detail & Related papers (2021-12-10T01:25:24Z)
- Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack [90.6076825117532]
A model stealing attack aims to create a substitute model that replicates the ability of the victim target model.
Most existing methods depend on the full probability outputs of the victim model, which are unavailable in most realistic scenarios.
We propose a novel hard-label model stealing method termed "black-box dissector", which includes a CAM-driven erasing strategy to mine the hidden information in hard labels from the victim model.
arXiv Detail & Related papers (2021-05-03T04:12:31Z)
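As a rough illustration of the CAM-driven erasing idea in the last entry (a hypothetical sketch in PyTorch, not the paper's implementation), one can compute a class activation map with the substitute model, erase the most salient region, and re-query the victim's hard label to check whether the prediction survives. The Grad-CAM variant, the erasing threshold, and the oracle interface below are all assumptions; `feature_layer` is assumed to be a convolutional feature-map module of the substitute.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def grad_cam(substitute: nn.Module, feature_layer: nn.Module,
             image: torch.Tensor, target_class: int) -> torch.Tensor:
    """Grad-CAM computed on the substitute model (the victim stays a black box)."""
    acts, grads = [], []
    h1 = feature_layer.register_forward_hook(lambda m, i, o: acts.append(o))
    h2 = feature_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    logits = substitute(image.unsqueeze(0))
    substitute.zero_grad()
    logits[0, target_class].backward()
    h1.remove(); h2.remove()
    weights = grads[0].mean(dim=(2, 3), keepdim=True)            # GAP over gradients
    cam = F.relu((weights * acts[0]).sum(dim=1, keepdim=True))   # weighted feature maps
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return (cam[0, 0] / cam.max().clamp_min(1e-8)).detach()

def erase_and_requery(victim_top1, substitute, feature_layer,
                      image, label, thresh=0.7):
    # Erase the region the substitute finds most salient, then ask the victim
    # whether its hard label changes; a changed label carries extra signal.
    cam = grad_cam(substitute, feature_layer, image, label)
    erased = image * (cam < thresh).float()
    return victim_top1(erased.unsqueeze(0))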
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.