Related papers: Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment

URL: http://arxiv.org/abs/2407.08127v1
Date: Thu, 11 Jul 2024 01:58:35 GMT
Title: Prediction Exposes Your Face: Black-box Model Inversion via Prediction Alignment
Authors: Yufan Liu, Wanqian Zhang, Dayan Wu, Zheng Lin, Jingzi Gu, Weiping Wang
Abstract summary: A model inversion (MI) attack reconstructs the private training data of a target model given its output. We propose a novel Prediction-to-Image (P2I) method for black-box MI attacks. Compared with RLB-MI, our method improves attack accuracy by 8.5% and reduces the number of queries by 99% on the CelebA dataset.
Score: 24.049615035939237
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A model inversion (MI) attack reconstructs the private training data of a target model given its output, posing a significant threat to deep learning models and data privacy. On one hand, most existing MI methods focus on searching for latent codes to represent the target identity, yet this iterative optimization-based scheme consumes a huge number of queries to the target model, making it unrealistic, especially in the black-box scenario. On the other hand, some training-based methods launch an attack through a single forward inference, but fail to directly learn high-level mappings from prediction vectors to images. Addressing these limitations, we propose a novel Prediction-to-Image (P2I) method for black-box MI attacks. Specifically, we introduce the Prediction Alignment Encoder to map the target model's output prediction into the latent code of StyleGAN. In this way, the prediction vector space can be well aligned with the more disentangled latent space, thus establishing a connection between prediction vectors and semantic facial features. During the attack phase, we further design the Aligned Ensemble Attack scheme to integrate complementary facial attributes of the target identity for better reconstruction. Experimental results show that our method outperforms other SOTA methods, e.g., compared with RLB-MI, our method improves attack accuracy by 8.5% and reduces the number of queries by 99% on the CelebA dataset.
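Illustrative sketch (not from the paper): the abstract describes two ideas, an encoder that maps a prediction vector to a generator latent code, and an ensemble step that combines several such codes for one target identity. The toy NumPy code below shows only the shape of that data flow. Everything in it is an assumption made for illustration: the function names, the two-layer MLP, the random untrained weights, the 512-dimensional latent width, and the confidence-weighted averaging rule. It does not include StyleGAN or a target model, and it should not be read as the authors' implementation.

```python
import numpy as np

LATENT_DIM = 512  # assumed latent width; the abstract does not state it


def init_encoder(num_classes, hidden=256, latent_dim=LATENT_DIM, seed=0):
    """Create random weights for a hypothetical two-layer MLP encoder."""
    rng = np.random.default_rng(seed)
    return {
        "w1": rng.normal(0.0, 0.1, (num_classes, hidden)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, (hidden, latent_dim)),
        "b2": np.zeros(latent_dim),
    }


def encode(params, preds):
    """Map prediction vectors of shape (n, num_classes) to latent codes (n, latent_dim)."""
    preds = np.atleast_2d(preds)
    hidden = np.maximum(preds @ params["w1"] + params["b1"], 0.0)  # ReLU
    return hidden @ params["w2"] + params["b2"]


def ensemble_latent(params, preds, target_class):
    """Average latent codes, weighted by each vector's target-class confidence.

    The weighting rule is an assumption; the abstract only says that
    complementary attributes of the target identity are integrated.
    """
    preds = np.atleast_2d(preds)
    latents = encode(params, preds)
    conf = preds[:, target_class]
    weights = conf / conf.sum()
    return weights @ latents


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Stand-in for black-box outputs: four probability vectors over 10 classes.
    fake_preds = rng.dirichlet(np.ones(10), size=4)
    encoder = init_encoder(num_classes=10)
    latent = ensemble_latent(encoder, fake_preds, target_class=3)
    print(latent.shape)
```

In a real attack the resulting latent code would be fed to a pretrained generator to synthesize an image, and the encoder would be trained rather than randomly initialized. This also suggests where the reported query savings could come from: one forward pass replaces an iterative search over latent codes.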

Related papers

The Surprising Effectiveness of Membership Inference with Simple N-Gram Coverage [71.8564105095189]
We introduce N-Gram Coverage Attack, a membership inference attack that relies solely on text outputs from the target model. We first demonstrate on a diverse set of existing benchmarks that N-Gram Coverage Attack outperforms other black-box methods. We find that more recent models, such as GPT-4o, exhibit increased robustness to membership inference.
arXiv Detail & Related papers (2025-08-13T08:35:16Z)
Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications. Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space. We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z)
Breaking the Black-Box: Confidence-Guided Model Inversion Attack for Distribution Shift [0.46040036610482665]
Model inversion attacks (MIAs) seek to infer the private training data of a target classifier by generating synthetic images that reflect the characteristics of the target class. Previous studies have relied on full access to the target model, which is not practical in real-world scenarios. This paper proposes a Confidence-Guided Model Inversion attack method called CG-MI.
arXiv Detail & Related papers (2024-02-28T03:47:17Z)
Rethinking Model Inversion Attacks With Patch-Wise Reconstruction [7.264378254137811]
Model inversion (MI) attacks aim to infer or reconstruct the training dataset through reverse-engineering from the target model's weights. We propose the Patch-MI method, inspired by a jigsaw puzzle, which offers a novel probabilistic interpretation of MI attacks. We numerically demonstrate that the Patch-MI improves Top 1 attack accuracy by 5%p compared to existing methods.
arXiv Detail & Related papers (2023-12-12T07:52:35Z)
Unstoppable Attack: Label-Only Model Inversion via Conditional Diffusion Model [14.834360664780709]
Model inversion attacks (MIAs) aim to recover private data from inaccessible training sets of deep learning models. This paper develops a novel MIA method, leveraging a conditional diffusion model (CDM) to recover representative samples under the target label. Experimental results show that this method can generate similar and accurate samples to the target label, outperforming generators of previous approaches.
arXiv Detail & Related papers (2023-07-17T12:14:24Z)
Pseudo Label-Guided Model Inversion Attack via Conditional Generative Adversarial Network [102.21368201494909]
Model inversion (MI) attacks have raised increasing concerns about privacy. Recent MI attacks leverage a generative adversarial network (GAN) as an image prior to narrow the search space. We propose the Pseudo Label-Guided MI (PLG-MI) attack via conditional GAN (cGAN).
arXiv Detail & Related papers (2023-02-20T07:29:34Z)
AdvDO: Realistic Adversarial Attacks for Trajectory Prediction [87.96767885419423]
Trajectory prediction is essential for autonomous vehicles to plan correct and safe driving behaviors. We devise an optimization-based adversarial attack framework to generate realistic adversarial trajectories. Our attack can lead an AV to drive off road or collide into other vehicles in simulation.
arXiv Detail & Related papers (2022-09-19T03:34:59Z)
How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective [74.47093382436823]
We address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback? We propose a general notion of defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS). We empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines.
arXiv Detail & Related papers (2022-03-27T03:23:32Z)
Label-Only Model Inversion Attacks via Boundary Repulsion [12.374249336222906]
We introduce an algorithm to invert private training data using only the target model's predicted labels. Using the example of face recognition, we show that the images reconstructed by BREP-MI successfully reproduce the semantics of the private training data.
arXiv Detail & Related papers (2022-03-03T18:57:57Z)
Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes. Our goal is to misclassify a specific sample into a target class without any sample modification. By utilizing the latest technique in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters. We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data. Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)