Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model
- URL: http://arxiv.org/abs/2307.08424v3
- Date: Wed, 6 Mar 2024 05:00:06 GMT
- Title: Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model
- Authors: Rongke Liu, Dong Wang, Yizhi Ren, Zhen Wang, Kaitian Guo, Qianqian Qin, Xiaolei Liu
- Abstract summary: Model inversion attacks (MIAs) aim to recover private data from inaccessible training sets of deep learning models.
This paper develops a novel MIA method, leveraging a conditional diffusion model (CDM) to recover representative samples under the target label.
Experimental results show that this method can generate similar and accurate samples to the target label, outperforming generators of previous approaches.
- Score: 14.834360664780709
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model inversion attacks (MIAs) aim to recover private data from inaccessible
training sets of deep learning models, posing a privacy threat. MIAs primarily
focus on the white-box scenario where attackers have full access to the model's
structure and parameters. However, practical applications are usually in
black-box scenarios or label-only scenarios, i.e., the attackers can only
obtain the output confidence vectors or labels by accessing the model.
Therefore, the attack models in existing MIAs are difficult to effectively
train with the knowledge of the target model, resulting in sub-optimal attacks.
To the best of our knowledge, we pioneer the research of a powerful and
practical attack model in the label-only scenario.
In this paper, we develop a novel MIA method, leveraging a conditional
diffusion model (CDM) to recover representative samples under the target label
from the training set. Two techniques are introduced: selecting an auxiliary
dataset relevant to the target model task and using predicted labels as
conditions to guide training CDM; and inputting target label, pre-defined
guidance strength, and random noise into the trained attack model to generate
and correct multiple results for final selection. This method is evaluated
using Learned Perceptual Image Patch Similarity as a new metric and as a
judgment basis for deciding the values of hyper-parameters. Experimental
results show that this method can generate similar and accurate samples to the
target label, outperforming generators of previous approaches.
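The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "guidance strength" acts as in classifier-free guidance, which the abstract does not state, and that "correcting" generated results can be approximated by keeping only candidates the target model assigns to the target label. All names (pseudo_label, guided_noise, keep_accepted, toy_target_label) are hypothetical, and the toy classifier stands in for a real label-only target model.

```python
from typing import Callable, List, Sequence, Tuple

Sample = List[float]


def pseudo_label(aux_data: Sequence[Sample],
                 query_label: Callable[[Sample], int]) -> List[Tuple[Sample, int]]:
    """Stage 1: label auxiliary samples using only the target model's hard labels.

    The resulting (sample, label) pairs would serve as training conditions
    for the conditional diffusion model.
    """
    return [(x, query_label(x)) for x in aux_data]


def guided_noise(eps_cond: Sample, eps_uncond: Sample, w: float) -> Sample:
    """Stage 2 (assumed): classifier-free-guidance style mix of noise predictions.

    Computes (1 + w) * eps_cond - w * eps_uncond element-wise, where w is the
    pre-defined guidance strength. With w = 0 this is the conditional prediction.
    """
    return [(1.0 + w) * c - w * u for c, u in zip(eps_cond, eps_uncond)]


def keep_accepted(candidates: Sequence[Sample], target_label: int,
                  query_label: Callable[[Sample], int]) -> List[Sample]:
    """Assumed filtering step: keep candidates the target model maps to target_label."""
    return [x for x in candidates if query_label(x) == target_label]


def toy_target_label(x: Sample) -> int:
    """Hypothetical label-only oracle: returns 1 if the sample sums to a positive value."""
    return 1 if sum(x) > 0 else 0
```

In a real attack, eps_cond and eps_uncond would come from the trained diffusion network at each denoising step, and the surviving candidates would be ranked for final selection by whatever criterion the paper specifies.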
Related papers
- Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
- DTA: Distribution Transform-based Attack for Query-Limited Scenario [11.874670564015789]
In generating adversarial examples, the conventional black-box attack methods rely on sufficient feedback from the to-be-attacked models.
This paper proposes a hard-label attack that simulates an attacked action being permitted to conduct a limited number of queries.
Experiments validate the effectiveness of the proposed idea and the superiority of DTA over the state-of-the-art.
arXiv Detail & Related papers (2023-12-12T13:21:03Z)
- Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration [32.15773300068426]
Membership Inference Attacks aim to infer whether a target data record has been utilized for model training.
We propose a Membership Inference Attack based on Self-calibrated Probabilistic Variation (SPV-MIA).
arXiv Detail & Related papers (2023-11-10T13:55:05Z)
- Universal Semi-supervised Model Adaptation via Collaborative Consistency Training [92.52892510093037]
We introduce a realistic and challenging domain adaptation problem called Universal Semi-supervised Model Adaptation (USMA).
We propose a collaborative consistency training framework that regularizes the prediction consistency between two models.
Experimental results demonstrate the effectiveness of our method on several benchmark datasets.
arXiv Detail & Related papers (2023-07-07T08:19:40Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses these shortcomings by generating perturbations.
Our method achieves 8% and 12% higher deception rates compared to state-of-the-art query-based and transfer-based attacks.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- Label-only Model Inversion Attack: The Attack that Requires the Least Information [14.061083728194378]
In a model inversion attack, an adversary attempts to reconstruct the data records, used to train a target model, using only the model's output.
We have found a model inversion method that can reconstruct the input data records based only on the output labels.
arXiv Detail & Related papers (2022-03-13T03:03:49Z)
- Label-Only Model Inversion Attacks via Boundary Repulsion [12.374249336222906]
We introduce an algorithm to invert private training data using only the target model's predicted labels.
Using the example of face recognition, we show that the images reconstructed by BREP-MI successfully reproduce the semantics of the private training data.
arXiv Detail & Related papers (2022-03-03T18:57:57Z)
- MEGA: Model Stealing via Collaborative Generator-Substitute Networks [4.065949099860426]
Recent data-free model stealing methods are shown effective to extract the knowledge of the target model without using real query examples.
We propose a data-free model stealing framework, MEGA, which is based on collaborative generator-substitute networks.
Our results show that the accuracy of our trained substitute model and the adversarial attack success rate over it can be up to 33% and 40% higher than state-of-the-art data-free black-box attacks.
arXiv Detail & Related papers (2022-01-31T09:34:28Z)
- Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective substitute training that focuses on designing the distribution of data used in the knowledge stealing process.
The combination of these two modules can further boost the consistency of the substitute model and target model, which greatly improves the effectiveness of adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.