Is Diffusion Model Safe? Severe Data Leakage via Gradient-Guided Diffusion Model
- URL: http://arxiv.org/abs/2406.09484v1
- Date: Thu, 13 Jun 2024 14:41:47 GMT
- Title: Is Diffusion Model Safe? Severe Data Leakage via Gradient-Guided Diffusion Model
- Authors: Jiayang Meng, Tao Huang, Hong Chen, Cuiping Li
- Abstract summary: Gradient leakage has been identified as a potential source of privacy breaches in modern image processing systems.
We propose an innovative gradient-guided fine-tuning method and introduce a new reconstruction attack that is capable of stealing high-resolution images.
Our attack method significantly outperforms the SOTA attack baselines in terms of both pixel-wise accuracy and time efficiency of image reconstruction.
- Score: 13.66943548640248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Gradient leakage has been identified as a potential source of privacy breaches in modern image processing systems, where the adversary can completely reconstruct the training images from leaked gradients. However, existing methods are restricted to reconstructing low-resolution images, so the data leakage risks of image processing systems are not sufficiently explored. In this paper, by exploiting diffusion models, we propose an innovative gradient-guided fine-tuning method and introduce a new reconstruction attack that is capable of stealing private, high-resolution images from image processing systems through leaked gradients, revealing severe data leakage. Our attack method is easy to implement and requires little prior knowledge. The experimental results indicate that current reconstruction attacks can steal images only up to a resolution of $128 \times 128$ pixels, while our attack method can successfully recover and steal images with resolutions up to $512 \times 512$ pixels. Our attack method significantly outperforms the SOTA attack baselines in terms of both pixel-wise accuracy and time efficiency of image reconstruction. Furthermore, our attack can render differential privacy ineffective to some extent.
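The abstract gives no implementation details, but the core ingredient of gradient-inversion attacks of this kind is a gradient-matching objective: the image produced by the generative model is optimized until it induces the same gradients on the victim model as the leaked ones. The PyTorch sketch below illustrates that generic objective; `diffusion`, `victim_model`, `leaked_grads`, the cosine distance, and the noise-optimization loop are illustrative assumptions, not the authors' method, which instead fine-tunes the diffusion model itself under gradient guidance.

```python
# Minimal sketch of gradient-matching guidance for a generative reconstruction
# attack. All names here (diffusion, victim_model, leaked_grads) are hypothetical
# stand-ins; the paper's actual procedure fine-tunes the diffusion model.
import torch
import torch.nn.functional as F

def gradient_matching_loss(victim_model, candidate, label, leaked_grads):
    """Cosine distance between the gradients induced by `candidate` and the leaked gradients."""
    loss = F.cross_entropy(victim_model(candidate), label)
    grads = torch.autograd.grad(loss, tuple(victim_model.parameters()), create_graph=True)
    num = sum((g * g_ref).sum() for g, g_ref in zip(grads, leaked_grads))
    den = (sum(g.pow(2).sum() for g in grads).sqrt()
           * sum(g.pow(2).sum() for g in leaked_grads).sqrt())
    return 1.0 - num / (den + 1e-12)

def reconstruct(victim_model, diffusion, label, leaked_grads, steps=500, lr=1e-2):
    """Optimize the generator's input so the decoded image reproduces the leaked gradients."""
    noise = torch.randn(1, 3, 512, 512, requires_grad=True)  # assumed 512x512 target resolution
    opt = torch.optim.Adam([noise], lr=lr)
    for _ in range(steps):
        candidate = diffusion(noise)  # hypothetical: generator input -> image
        loss = gradient_matching_loss(victim_model, candidate, label, leaked_grads)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return diffusion(noise).detach()
```

The cosine distance is shown only because it is a common choice in gradient-inversion work; the paper may use a different metric, update rule, or guidance schedule.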
Related papers
- Exploring User-level Gradient Inversion with a Diffusion Prior [17.2657358645072]
We propose a novel gradient inversion attack that applies a denoising diffusion model as a strong image prior to enhance recovery in the large batch setting.
Unlike traditional attacks, which aim to reconstruct individual samples and suffer at large batch and image sizes, our approach instead aims to recover a representative image that captures the sensitive shared semantic information corresponding to the underlying user.
arXiv Detail & Related papers (2024-09-11T14:20:47Z) - Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
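As a rough illustration of perturbing in latent rather than pixel space, the hedged sketch below optimizes a latent perturbation against a face-identity embedder; `encoder`, `decoder`, and `face_id_model` are hypothetical stand-ins for a latent diffusion autoencoder and a recognition model, and the simple clamp stands in for the paper's adaptive strength-based algorithm.

```python
# Hedged sketch of a latent-space adversarial identity perturbation, in the spirit
# of Adv-Diffusion; the names below are illustrative, not the authors' code.
import torch
import torch.nn.functional as F

def latent_identity_attack(image, encoder, decoder, face_id_model,
                           target_embedding, steps=100, lr=5e-3, budget=0.05):
    z = encoder(image).detach()
    delta = torch.zeros_like(z, requires_grad=True)   # perturbation lives in latent space
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        adv_image = decoder(z + delta)
        emb = face_id_model(adv_image)
        # Pull the perceived identity toward the target while keeping the latent
        # change small, so the pixel-space edit stays imperceptible.
        id_loss = 1.0 - F.cosine_similarity(emb, target_embedding, dim=-1).mean()
        reg = delta.pow(2).mean()
        loss = id_loss + 10.0 * reg
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-budget, budget)             # crude strength cap (assumed)
    return decoder(z + delta).detach()
```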
arXiv Detail & Related papers (2023-12-18T15:25:23Z) - IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI [52.90082445349903]
Diffusion-based image generation models can create artistic images that mimic the style of an artist or maliciously edit the original images for fake content.
Several attempts have been made to protect the original images from such unauthorized data usage by adding imperceptible perturbations.
In this work, we introduce a purification perturbation platform, named IMPRESS, to evaluate the effectiveness of imperceptible perturbations as a protective measure.
arXiv Detail & Related papers (2023-10-30T03:33:41Z) - DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models [79.71665540122498]
We propose a method for detecting unauthorized data usage by planting the injected content into the protected dataset.
Specifically, we modify the protected images by adding unique content to them using stealthy image warping functions.
By analyzing whether the model has memorized the injected content, we can detect models that had illegally utilized the unauthorized data.
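The warping functions themselves are not described in this summary; the sketch below uses a low-amplitude sinusoidal displacement, applied via `grid_sample`, purely as an illustrative stand-in for the kind of stealthy, content-agnostic signature that a later memorization test could detect.

```python
# Rough sketch of "stealthy warping": inject a subtle geometric signature into
# protected images. The exact warping functions in DIAGNOSIS are not reproduced
# here; the sinusoidal displacement is an assumed, illustrative choice.
import torch
import torch.nn.functional as F

def stealthy_warp(images, amplitude=0.002, frequency=7.0):
    """Apply a subtle, fixed sinusoidal warp to a batch of images [N, C, H, W]."""
    n = images.shape[0]
    theta = torch.tensor([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                         device=images.device, dtype=images.dtype).repeat(n, 1, 1)
    grid = F.affine_grid(theta, list(images.shape), align_corners=False)  # identity grid
    # Small periodic displacement acting as the injected "unique content".
    dx = amplitude * torch.sin(frequency * torch.pi * grid[..., 1])
    dy = amplitude * torch.sin(frequency * torch.pi * grid[..., 0])
    warped_grid = grid + torch.stack([dx, dy], dim=-1)
    return F.grid_sample(images, warped_grid, mode='bilinear', align_corners=False)
```

At audit time, one would check whether a suspect model's generations reproduce this signature, indicating it was trained on the protected images.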
arXiv Detail & Related papers (2023-07-06T16:27:39Z) - Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation [110.61853418925219]
We build a stronger version of the dataset reconstruction attack and show how it can provably recover the entire training set in the infinite width regime.
We show, both theoretically and empirically, that reconstructed images tend to be "outliers" in the dataset.
These reconstruction attacks can be used for dataset distillation; that is, we can retrain on reconstructed images and obtain high predictive accuracy.
arXiv Detail & Related papers (2023-02-02T21:41:59Z) - An Eye for an Eye: Defending against Gradient-based Attacks with Gradients [24.845539113785552]
Gradient-based adversarial attacks have demonstrated high success rates.
We show that the gradients can also be exploited as a powerful weapon to defend against adversarial attacks.
By using both gradient maps and adversarial images as inputs, we propose a Two-stream Restoration Network (TRN) to restore the adversarial images.
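A minimal sketch of the two-stream idea, assuming a simple convolutional layout (the actual TRN architecture is not reproduced here): one stream encodes the adversarial image, the other the gradient map, and a fusion head predicts a residual correction.

```python
# Illustrative two-stream restoration module in the spirit of TRN; layer sizes
# and the residual design are assumptions, not the paper's architecture.
import torch
import torch.nn as nn

class TwoStreamRestorer(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.image_stream = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.grad_stream = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, adv_image, grad_map):
        # Encode each input separately, concatenate features, predict a residual.
        feats = torch.cat([self.image_stream(adv_image), self.grad_stream(grad_map)], dim=1)
        return adv_image + self.fuse(feats)
```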
arXiv Detail & Related papers (2022-02-02T16:22:28Z) - Hiding Images into Images with Real-world Robustness [21.328984859163956]
We introduce a generative network based method for hiding images into images while assuring high-quality extraction.
An embedding network is sequentially concatenated with an attack layer, a decoupling network, and an image extraction network.
We are the first to robustly hide three secret images.
arXiv Detail & Related papers (2021-10-12T02:20:34Z) - Exploring Structure Consistency for Deep Model Watermarking [122.38456787761497]
The intellectual property (IP) of deep neural networks (DNNs) can be easily "stolen" by surrogate model attacks.
We propose a new watermarking methodology, namely "structure consistency", based on which a new deep structure-aligned model watermarking algorithm is designed.
arXiv Detail & Related papers (2021-08-05T04:27:15Z) - Delving into Deep Image Prior for Adversarial Defense: A Novel Reconstruction-based Defense Framework [34.75025893777763]
This work proposes a novel and effective reconstruction-based defense framework by delving into deep image prior.
The proposed method analyzes and explicitly incorporates the model decision process into our defense.
Experiments demonstrate that the proposed method outperforms existing state-of-the-art reconstruction-based methods both in defending white-box attacks and defense-aware attacks.
arXiv Detail & Related papers (2021-07-31T08:49:17Z) - Underwater Image Restoration via Contrastive Learning and a Real-world Dataset [59.35766392100753]
We present a novel method for underwater image restoration based on an unsupervised image-to-image translation framework.
Our proposed method leverages contrastive learning and generative adversarial networks to maximize the mutual information between raw and restored images.
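As a sketch of the mutual-information idea, an InfoNCE-style contrastive term over paired raw/restored embeddings could look like the following; the paper's actual losses, encoders, and GAN components are not reproduced here.

```python
# Illustrative InfoNCE-style contrastive loss tying restored images to their raw
# counterparts; an assumed stand-in for mutual-information maximization.
import torch
import torch.nn.functional as F

def contrastive_restoration_loss(raw_feats, restored_feats, temperature=0.07):
    """raw_feats, restored_feats: [N, D] embeddings of paired raw/restored images."""
    raw = F.normalize(raw_feats, dim=1)
    restored = F.normalize(restored_feats, dim=1)
    logits = restored @ raw.t() / temperature        # [N, N] similarity matrix
    targets = torch.arange(raw.size(0), device=raw.device)
    # Each restored image should be most similar to its own raw image (positive
    # pair) and dissimilar to the other raw images in the batch (negatives).
    return F.cross_entropy(logits, targets)
```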
arXiv Detail & Related papers (2021-06-20T16:06:26Z) - Analysis and Mitigations of Reverse Engineering Attacks on Local Feature Descriptors [15.973484638972739]
We show under controlled conditions a reverse engineering attack on sparse feature maps and analyze the vulnerability of popular descriptors.
We evaluate potential mitigation techniques that select a subset of descriptors to carefully balance privacy reconstruction risk while preserving image matching accuracy.
arXiv Detail & Related papers (2021-05-09T01:41:36Z)