Marich: A Query-efficient Distributionally Equivalent Model Extraction
Attack using Public Data
- URL: http://arxiv.org/abs/2302.08466v2
- Date: Wed, 18 Oct 2023 17:08:26 GMT
- Title: Marich: A Query-efficient Distributionally Equivalent Model Extraction
Attack using Public Data
- Authors: Pratik Karmakar and Debabrota Basu
- Abstract summary: Black-box model extraction attacks send a minimal number of queries from a publicly available dataset to a target ML model through a predictive API. We create an informative and distributionally equivalent replica of the target using an active sampling-based query selection algorithm, Marich. Marich extracts models that achieve $\sim 60-95\%$ of the true model's accuracy using only $\sim 1,000 - 8,500$ queries from the publicly available datasets.
- Score: 10.377650972462654
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We study the design of black-box model extraction attacks that send a
minimal number of queries from a publicly available dataset to a target ML model
through a predictive API, with the aim of creating an informative and
distributionally equivalent replica of the target. First, we define
distributionally equivalent and Max-Information model extraction attacks, and
reduce them into a variational optimisation problem. The attacker sequentially
solves this optimisation problem to select the most informative queries that
simultaneously maximise the entropy and reduce the mismatch between the target
and the stolen models. This leads to an active sampling-based query selection
algorithm, Marich, which is model-oblivious. Then, we evaluate Marich on
different text and image data sets, and different models, including CNNs and
BERT. Marich extracts models that achieve $\sim 60-95\%$ of the true model's
accuracy and uses $\sim 1,000 - 8,500$ queries from the publicly available
datasets, which are different from the private training datasets. Models
extracted by Marich yield prediction distributions that are $\sim 2-4\times$
closer to the target's distribution than those of existing active
sampling-based attacks. The extracted models also lead to $84-96\%$ accuracy
under membership inference attacks. Experimental results validate that Marich
is query-efficient, and capable of performing task-accurate, high-fidelity, and
informative model extraction.
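To make the query-selection loop described above concrete, here is a minimal, hedged sketch in Python of an active sampling attack that queries a black-box API with the highest-entropy points from a public pool and retrains a surrogate on the returned labels. The `query_target_api` function, the toy victim model, and all hyperparameters are illustrative assumptions; Marich's actual objective also trades off the mismatch between the target's and the surrogate's predictions, which this sketch approximates with entropy alone.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def entropy(probs, eps=1e-12):
    """Shannon entropy of each row of an (n, num_classes) probability matrix."""
    return -np.sum(probs * np.log(probs + eps), axis=1)

# --- Toy setup (assumption): a hidden "victim" model and a public query pool ---
rng = np.random.default_rng(0)
X_private = rng.normal(size=(2000, 10))
y_private = (X_private[:, 0] + X_private[:, 1] > 0).astype(int)
victim = LogisticRegression().fit(X_private, y_private)  # stands in for the API

def query_target_api(x):
    """Placeholder for the black-box predictive API of the target model."""
    return victim.predict_proba(x)

X_public = rng.normal(size=(5000, 10))  # attacker's public pool (not the private data)
budget, batch = 1000, 50

# Seed the surrogate with a small random batch of queries.
remaining = np.arange(len(X_public))
seed = rng.choice(remaining, batch, replace=False)
remaining = np.setdiff1d(remaining, seed)
X_q = X_public[seed]
y_q = query_target_api(X_q).argmax(axis=1)
surrogate = LogisticRegression(max_iter=1000).fit(X_q, y_q)

while len(X_q) < budget:
    # Score the unqueried pool by the surrogate's predictive entropy
    # (an informativeness proxy for Marich's entropy/mismatch objective).
    scores = entropy(surrogate.predict_proba(X_public[remaining]))
    pick = remaining[np.argsort(-scores)[:batch]]
    remaining = np.setdiff1d(remaining, pick)
    X_q = np.vstack([X_q, X_public[pick]])
    y_q = np.concatenate([y_q, query_target_api(X_public[pick]).argmax(axis=1)])
    surrogate = LogisticRegression(max_iter=1000).fit(X_q, y_q)

agreement = (surrogate.predict(X_public) == victim.predict(X_public)).mean()
print(f"surrogate/victim agreement on the public pool: {agreement:.3f}")
```

In a realistic attack the logistic-regression surrogate would be replaced by the attacker's architecture of choice (e.g., a CNN or a BERT head), and the query would use whichever output the API actually exposes (top label or probability vector).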
Related papers
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - MEAOD: Model Extraction Attack against Object Detectors [45.817537875368956]
Model extraction attacks allow attackers to replicate a substitute model with comparable functionality to the victim model.
We propose an effective attack method called MEAOD for object detection models.
We achieve an extraction performance of over 70% under a 10k query budget.
arXiv Detail & Related papers (2023-12-22T13:28:50Z) - SPD-DDPM: Denoising Diffusion Probabilistic Models in the Symmetric
Positive Definite Space [47.65912121120524]
We propose a novel generative model, termed SPD-DDPM, to handle large-scale data.
Our model is able to estimate $p(X)$ unconditionally and flexibly without giving $y$.
Experimental results on toy data and real taxi data demonstrate that our models effectively fit the data distribution both unconditionally and conditionally.
arXiv Detail & Related papers (2023-12-13T15:08:54Z) - MeaeQ: Mount Model Extraction Attacks with Efficient Queries [6.1106195466129485]
We study model extraction attacks in natural language processing (NLP).
We propose MeaeQ, a straightforward yet effective method to address these issues.
MeaeQ achieves higher functional similarity to the victim model than baselines while requiring fewer queries.
arXiv Detail & Related papers (2023-10-21T16:07:16Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - PL-$k$NN: A Parameterless Nearest Neighbors Classifier [0.24499092754102875]
The $k$-Nearest Neighbors algorithm is one of the most effective and straightforward models employed in numerous problems.
This paper proposes a $k$-Nearest Neighbors classifier that bypasses the need to define the value of $k$.
arXiv Detail & Related papers (2022-09-26T12:52:45Z) - BRIO: Bringing Order to Abstractive Summarization [107.97378285293507]
We propose a novel training paradigm which assumes a non-deterministic distribution over candidate summaries, assigning probability mass according to their quality.
Our method achieves a new state-of-the-art result on the CNN/DailyMail (47.78 ROUGE-1) and XSum (49.07 ROUGE-1) datasets.
arXiv Detail & Related papers (2022-03-31T05:19:38Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold (a minimal sketch follows this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To harness the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z) - MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient
Estimation [14.544507965617582]
Model Stealing (MS) attacks allow an adversary with black-box access to a Machine Learning model to replicate its functionality, compromising the confidentiality of the model.
This paper proposes MAZE -- a data-free model stealing attack using zeroth-order gradient estimation (the estimator is sketched after this list).
In contrast to prior works, MAZE does not require any data and instead creates synthetic data using a generative model.
arXiv Detail & Related papers (2020-05-06T22:26:18Z)
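For the ATC entry above (Leveraging Unlabeled Data to Predict Out-of-Distribution Performance), the thresholding idea is simple enough to sketch. The version below uses negative entropy as the confidence score and a quantile fit for the threshold; the function names and the score choice are illustrative assumptions consistent with the entry's description, not the authors' reference implementation.

```python
import numpy as np

def negative_entropy(probs, eps=1e-12):
    """Confidence score: negative Shannon entropy of each predicted distribution."""
    return np.sum(probs * np.log(probs + eps), axis=1)

def atc_predict_accuracy(source_probs, source_labels, target_probs):
    """Average Thresholded Confidence (ATC), minimally sketched.

    Fit a threshold t on labeled source data so that the fraction of source
    points scoring above t matches the source accuracy, then predict target
    accuracy as the fraction of unlabeled target points scoring above t.
    """
    src_scores = negative_entropy(source_probs)
    src_accuracy = (source_probs.argmax(axis=1) == source_labels).mean()
    # The fraction of source scores above this quantile is ~= source accuracy.
    threshold = np.quantile(src_scores, 1.0 - src_accuracy)
    return (negative_entropy(target_probs) > threshold).mean()
```

By construction the threshold reproduces the source accuracy on source data, so the prediction degrades gracefully when the target distribution matches the source.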
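The zeroth-order gradient estimation that the MAZE entry refers to can also be sketched generically: when only black-box loss evaluations are available, gradients are approximated with random-direction finite differences. The function below is a minimal, assumed illustration of that estimator, not MAZE's training loop (which uses such estimates to update a generator that produces synthetic queries).

```python
import numpy as np

def zeroth_order_gradient(loss_fn, x, num_directions=100, sigma=1e-4):
    """Estimate the gradient of a black-box scalar loss at x using
    random-direction forward differences (Gaussian smoothing estimator)."""
    grad = np.zeros_like(x)
    base = loss_fn(x)
    for _ in range(num_directions):
        u = np.random.randn(*x.shape)  # random probe direction
        grad += (loss_fn(x + sigma * u) - base) / sigma * u
    return grad / num_directions

# Toy check: the analytic gradient of ||x||^2 is 2x.
x = np.array([1.0, -2.0, 0.5])
estimate = zeroth_order_gradient(lambda v: float(np.sum(v ** 2)), x,
                                 num_directions=2000)
print(estimate, 2 * x)  # the estimate should be roughly close to [2, -4, 1]
```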
This list is automatically generated from the titles and abstracts of the papers on this site.