Breaking the Black-Box: Confidence-Guided Model Inversion Attack for
Distribution Shift
- URL: http://arxiv.org/abs/2402.18027v1
- Date: Wed, 28 Feb 2024 03:47:17 GMT
- Title: Breaking the Black-Box: Confidence-Guided Model Inversion Attack for
Distribution Shift
- Authors: Xinhao Liu, Yingzhao Jiang, Zetao Lin
- Abstract summary: Model inversion attacks (MIAs) seek to infer the private training data of a target classifier by generating synthetic images that reflect the characteristics of the target class.
Previous studies have relied on full access to the target model, which is not practical in real-world scenarios.
This paper proposes a Confidence-Guided Model Inversion attack method called CG-MI.
- Score: 0.46040036610482665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model inversion attacks (MIAs) seek to infer the private training data of a
target classifier by generating synthetic images that reflect the
characteristics of the target class through querying the model. However, prior
studies have relied on full access to the target model, which is not practical
in real-world scenarios. Additionally, existing black-box MIAs assume that the
image prior and target model follow the same distribution. However, when
confronted with diverse data distribution settings, these methods may result in
suboptimal performance in conducting attacks. To address these limitations,
this paper proposes a Confidence-Guided Model Inversion attack method called CG-MI,
which utilizes the latent space of a pre-trained, publicly available generative
adversarial network (GAN) as prior information together with a gradient-free
optimizer, enabling high-resolution MIAs across different data distributions in a
black-box setting. Our experiments demonstrate that our method significantly
outperforms the SOTA black-box MIA by more than 49% on CelebA and 58% on FaceScrub
under different distribution settings. Furthermore, our method generates
high-quality images comparable to those produced by white-box attacks. Our method
provides a practical and effective solution for black-box model inversion attacks.
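
To make the recipe described in the abstract concrete, the sketch below searches a public GAN's latent space with a gradient-free optimizer, using only the target model's confidence for the attacked class as feedback. This is an illustrative assumption, not the authors' implementation: the generator interface, the confidence oracle, and the simple evolution-strategy optimizer shown here are placeholders for whatever CG-MI actually uses.

```python
import numpy as np

def invert(generate, confidence, latent_dim=512, pop=32, iters=100, sigma=0.5, seed=0):
    """Gradient-free search over a GAN latent space, guided only by black-box confidence.

    generate(z):    decodes a (batch, latent_dim) array of latent codes into images,
                    e.g. a pre-trained, publicly available GAN (assumed interface).
    confidence(x):  queries the target classifier and returns one softmax score per
                    image for the attacked class (the only signal a black-box attacker has).
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros(latent_dim)                                    # centre of the search distribution
    for _ in range(iters):
        z = mu + sigma * rng.standard_normal((pop, latent_dim))  # sample candidate latent codes
        scores = np.asarray(confidence(generate(z)))             # black-box queries only, no gradients
        elite = z[np.argsort(scores)[-max(1, pop // 4):]]        # keep the highest-confidence quartile
        mu = elite.mean(axis=0)                                   # move the distribution toward them
    return generate(mu[None, :])                                  # decode the final estimate

# Toy smoke test with stand-in callables (a real attack plugs in a GAN and the victim's API):
if __name__ == "__main__":
    target = np.ones(512)                                         # pretend "private" feature vector
    recon = invert(generate=lambda z: z,                          # identity "generator" for the demo
                   confidence=lambda x: -np.linalg.norm(x - target, axis=1))
    print("distance to target:", float(np.linalg.norm(recon[0] - target)))
```

Because the loop reads only confidence scores returned by queries, it needs no gradients from the target classifier, which is what makes this style of attack feasible in a black-box setting.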
Related papers
- Learning to Learn Transferable Generative Attack for Person Re-Identification [17.26567195924685]
Existing attacks consider only cross-dataset and cross-model transferability, ignoring the cross-test capability to perturb models trained in different domains.
To rigorously examine the robustness of real-world re-id models, the Meta Transferable Generative Attack (MTGA) method is proposed.
Our MTGA outperforms the SOTA methods by 21.5% and 11.3% in mean mAP drop rate across its two transfer-attack evaluation settings.
arXiv Detail & Related papers (2024-09-06T11:57:17Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment [24.049615035939237]
Model inversion (MI) attack reconstructs the private training data of a target model given its output.
We propose a novel Prediction-to-Image (P2I) method for black-box MI attacks.
Our method improves attack accuracy by 8.5% and reduces the number of queries by 99% on the CelebA dataset.
arXiv Detail & Related papers (2024-07-11T01:58:35Z) - Distributional Black-Box Model Inversion Attack with Multi-Agent Reinforcement Learning [19.200221582814518]
This paper proposes a novel Distributional Black-Box Model Inversion (DBB-MI) attack that constructs a probabilistic latent space for searching for the target private data.
Because the learned latent distribution closely aligns with the target private data in latent space, the recovered samples significantly leak the privacy of the target model's training data.
Experiments on diverse datasets and networks show that DBB-MI outperforms the state of the art in attack accuracy, K-nearest-neighbor feature distance, and peak signal-to-noise ratio.
arXiv Detail & Related papers (2024-04-22T04:18:38Z) - Semantic Adversarial Attacks via Diffusion Models [30.169827029761702]
Semantic adversarial attacks focus on changing semantic attributes of clean examples, such as color, context, and features.
We propose a framework to quickly generate a semantic adversarial attack by leveraging recent diffusion models.
Our approaches achieve an attack success rate of approximately 100% in multiple settings, with a best FID of 36.61.
arXiv Detail & Related papers (2023-09-14T02:57:48Z) - Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion
Model [14.834360664780709]
Model inversion attacks (MIAs) aim to recover private data from the inaccessible training sets of deep learning models.
This paper develops a novel MIA method, leveraging a conditional diffusion model (CDM) to recover representative samples under the target label.
Experimental results show that this method can generate accurate samples that resemble the target label, outperforming the generators of previous approaches.
arXiv Detail & Related papers (2023-07-17T12:14:24Z) - Pseudo Label-Guided Model Inversion Attack via Conditional Generative
Adversarial Network [102.21368201494909]
Model inversion (MI) attacks have raised increasing concerns about privacy.
Recent MI attacks leverage a generative adversarial network (GAN) as an image prior to narrow the search space.
We propose a Pseudo Label-Guided MI (PLG-MI) attack via a conditional GAN (cGAN).
arXiv Detail & Related papers (2023-02-20T07:29:34Z) - Training Meta-Surrogate Model for Transferable Adversarial Attack [98.13178217557193]
We consider adversarial attacks against a black-box model when no queries are allowed.
In this setting, many methods directly attack surrogate models and transfer the obtained adversarial examples to fool the target model.
We show that we can obtain a Meta-Surrogate Model (MSM) such that attacks on this model transfer more easily to other models.
arXiv Detail & Related papers (2021-09-05T03:27:46Z) - Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective on substitute training that focuses on designing the distribution of data used in the knowledge-stealing process.
The combination of these two modules further boosts the consistency between the substitute model and the target model, which greatly improves the effectiveness of the adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z) - Boosting Black-Box Attack with Partially Transferred Conditional
Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z) - Perturbing Across the Feature Hierarchy to Improve Standard and Strict
Blackbox Attack Transferability [100.91186458516941]
We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers.
We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance.
We analyze why the proposed methods outperform existing attack strategies and show an extension of the method in the case when limited queries to the blackbox model are allowed.
arXiv Detail & Related papers (2020-04-29T16:00:13Z)