Label-only Model Inversion Attack: The Attack that Requires the Least
Information
- URL: http://arxiv.org/abs/2203.06555v1
- Date: Sun, 13 Mar 2022 03:03:49 GMT
- Title: Label-only Model Inversion Attack: The Attack that Requires the Least
Information
- Authors: Dayong Ye and Tianqing Zhu and Shuai Zhou and Bo Liu and Wanlei Zhou
- Abstract summary: In a model inversion attack, an adversary attempts to reconstruct the data records used to train a target model, using only the model's output.
We have found a model inversion method that can reconstruct the input data records based only on the output labels.
- Score: 14.061083728194378
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a model inversion attack, an adversary attempts to reconstruct the data
records used to train a target model, using only the model's output. Contemporary
model inversion attacks generally rely on either predicted confidence score vectors,
i.e., black-box attacks, or the parameters of a target model, i.e., white-box attacks.
However, in the real world, model owners usually give out only the predicted labels;
the confidence score vectors and model parameters are hidden as a defense mechanism
to prevent such attacks. Unfortunately, we have found a model inversion method that
can reconstruct the input data records based only on the output labels. We believe
this is the attack that requires the least information to succeed and, therefore, has
the best applicability. The key idea is to exploit the error rate of the target model
to compute the median distance from a set of data records to the decision boundary of
the target model. This distance is then used to generate confidence score vectors,
which are adopted to train an attack model to reconstruct the data records. The
experimental results show that highly recognizable data records can be reconstructed
with far less information than existing methods require.
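To make the pipeline described above concrete, the following is a minimal, self-contained Python sketch of a label-only inversion loop under stated assumptions, not the authors' implementation: a scikit-learn classifier stands in for the target and exposes only predicted labels, a small random auxiliary set stands in for attacker-side data, and a noise-probing heuristic replaces the paper's error-rate-based estimate of the median boundary distance. All identifiers (target_label, flip_radius, synth_confidence, attack_model) are hypothetical.

```python
# Rough sketch of a label-only model inversion pipeline -- NOT the authors' exact
# algorithm. Assumptions: label-only query access to the target, an auxiliary
# dataset, and a noise-probing surrogate for the error-rate-based distance estimate.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
NUM_CLASSES, DIM = 3, 8

# Stand-in target model; the attacker is assumed to see only its predicted labels.
X_priv = rng.normal(size=(600, DIM))
y_priv = rng.integers(0, NUM_CLASSES, size=600)
target = LogisticRegression(max_iter=500).fit(X_priv, y_priv)

def target_label(x):
    """Label-only query interface: argmax class only, no confidence scores."""
    return target.predict(np.atleast_2d(x))

# Step 1: estimate a typical (median) distance to the decision boundary.
# The paper derives this from the target's error rate; here we approximate it by
# probing each auxiliary record with growing noise until its label flips.
X_aux = rng.normal(size=(200, DIM))          # assumed attacker-side auxiliary data
labels_aux = target_label(X_aux)
radii = np.linspace(0.05, 3.0, 30)

def flip_radius(x, base_label):
    for r in radii:
        probe = x + r * rng.normal(size=(16, DIM)) / np.sqrt(DIM)
        if np.any(target_label(probe) != base_label):
            return r
    return radii[-1]

dists = np.array([flip_radius(x, y) for x, y in zip(X_aux, labels_aux)])
median_dist = np.median(dists)

# Step 2: convert (label, distance) into a synthetic confidence score vector:
# records far from the boundary get a peaked vector, near records a flat one.
def synth_confidence(label, dist):
    top = 1.0 / NUM_CLASSES + (1.0 - 1.0 / NUM_CLASSES) * min(dist / (2 * median_dist), 1.0)
    conf = np.full(NUM_CLASSES, (1.0 - top) / (NUM_CLASSES - 1))
    conf[label] = top
    return conf

C_aux = np.stack([synth_confidence(y, d) for y, d in zip(labels_aux, dists)])

# Step 3: train the attack (inversion) model mapping confidence vectors -> records.
attack_model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(C_aux, X_aux)

# Reconstruction attempt for class 0 using a confidently-labelled synthetic vector.
probe = synth_confidence(label=0, dist=2 * median_dist)
print("reconstructed record for label 0:", np.round(attack_model.predict(probe[None, :]), 2))
```

In a realistic setting the auxiliary records would come from a distribution similar to the private training data, and the attack model would typically be a deconvolutional generator producing images rather than an MLP over feature vectors; the sketch only illustrates the distance-to-confidence-to-reconstruction flow.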
Related papers
- Beyond Labeling Oracles: What does it mean to steal ML models? [52.63413852460003]
Model extraction attacks are designed to steal trained models with only query access.
We investigate factors influencing the success of model extraction attacks.
Our findings urge the community to redefine the adversarial goals of model extraction (ME) attacks.
arXiv Detail & Related papers (2023-10-03T11:10:21Z)
- Data-Free Model Extraction Attacks in the Context of Object Detection [0.6719751155411076]
A significant number of machine learning models are vulnerable to model extraction attacks.
We propose an adversarial black-box attack that extends model extraction to the regression problem of predicting bounding box coordinates in object detection.
We find that the proposed model extraction method achieves significant results with a reasonable number of queries.
arXiv Detail & Related papers (2023-08-09T06:23:54Z)
- Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model [14.834360664780709]
Model inversion attacks (MIAs) aim to recover private data from inaccessible training sets of deep learning models.
This paper develops a novel MIA method, leveraging a conditional diffusion model (CDM) to recover representative samples under the target label.
Experimental results show that this method can generate samples that are both similar to and accurate for the target label, outperforming the generators of previous approaches.
arXiv Detail & Related papers (2023-07-17T12:14:24Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses the shortcomings of existing attacks by generating perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- Label-Only Model Inversion Attacks via Boundary Repulsion [12.374249336222906]
We introduce an algorithm, BREP-MI, to invert private training data using only the target model's predicted labels.
Using the example of face recognition, we show that the images reconstructed by BREP-MI successfully reproduce the semantics of the private training data.
arXiv Detail & Related papers (2022-03-03T18:57:57Z)
- Probing Model Signal-Awareness via Prediction-Preserving Input Minimization [67.62847721118142]
We evaluate models' ability to capture the correct vulnerability signals to produce their predictions.
We measure the signal awareness of models using a new metric we propose, Signal-aware Recall (SAR).
The results show a sharp drop in the model's Recall from the high 90s to sub-60s with the new metric.
arXiv Detail & Related papers (2020-11-25T20:05:23Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacks aim to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- MAZE: Data-Free Model Stealing Attack Using Zeroth-Order Gradient Estimation [14.544507965617582]
Model Stealing (MS) attacks allow an adversary with black-box access to a Machine Learning model to replicate its functionality, compromising the confidentiality of the model.
This paper proposes MAZE -- a data-free model stealing attack using zeroth-order gradient estimation.
In contrast to prior works, MAZE does not require any data and instead creates synthetic data using a generative model; a rough sketch of the zeroth-order estimation step appears after this list.
arXiv Detail & Related papers (2020-05-06T22:26:18Z)
- Adversarial Imitation Attack [63.76805962712481]
A practical adversarial attack should require as little knowledge of the attacked models as possible.
Current substitute attacks need pre-trained models to generate adversarial examples.
In this study, we propose a novel adversarial imitation attack.
arXiv Detail & Related papers (2020-03-28T10:02:49Z)
- Membership Inference Attacks Against Object Detection Models [1.0467092641687232]
We present the first membership inference attack against black-box object detection models.
We successfully reveal the membership status of sensitive data used to train one-stage and two-stage detection models.
Our results show that object detection models are also vulnerable to inference attacks like other models.
arXiv Detail & Related papers (2020-01-12T23:17:45Z)
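As referenced in the MAZE entry above, the core black-box ingredient there is zeroth-order gradient estimation. The snippet below is a rough sketch of that estimation step alone, under toy stand-ins (victim, clone, disagreement and their parameters are hypothetical), not MAZE itself: it estimates the gradient of a clone-victim disagreement loss from queries only and uses it to steer synthetic inputs toward regions of disagreement.

```python
# Minimal sketch of zeroth-order (finite-difference) gradient estimation, the
# building block named in the MAZE entry -- not MAZE itself. The "victim" is a
# black box returning only output probabilities; all models here are toy stand-ins.
import numpy as np

rng = np.random.default_rng(0)
DIM = 16

def victim(x):
    """Black-box stand-in: the attacker can query outputs but not gradients."""
    w = np.linspace(-1.0, 1.0, DIM)
    z = np.stack([x @ w, -(x @ w)], axis=-1)
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def clone(x, theta):
    """Toy clone model parameterised by theta."""
    z = np.stack([x @ theta, -(x @ theta)], axis=-1)
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def disagreement(x, theta):
    """Scalar loss: how much the clone disagrees with the victim on inputs x."""
    return np.mean((clone(x, theta) - victim(x)) ** 2)

def zo_gradient(f, x, num_dirs=20, mu=1e-2):
    """Estimate grad f(x) via random-direction finite differences (queries only)."""
    grad = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.normal(size=x.shape)
        grad += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return grad / num_dirs

# Steer a synthetic query batch toward points where clone and victim disagree,
# using only black-box queries; the clone would then be trained on victim labels.
theta = rng.normal(size=DIM)
x_syn = rng.normal(size=(8, DIM))
for _ in range(5):
    g = zo_gradient(lambda x: disagreement(x, theta), x_syn)
    x_syn = x_syn + 0.5 * g          # ascent step: maximize disagreement
print("disagreement after search:", round(float(disagreement(x_syn, theta)), 4))
```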
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all information) and is not responsible for any consequences of its use.