DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models
- URL: http://arxiv.org/abs/2307.03108v3
- Date: Tue, 9 Apr 2024 16:31:33 GMT
- Title: DIAGNOSIS: Detecting Unauthorized Data Usages in Text-to-image Diffusion Models
- Authors: Zhenting Wang, Chen Chen, Lingjuan Lyu, Dimitris N. Metaxas, Shiqing Ma
- Abstract summary: We propose a method for detecting unauthorized data usage by planting injected memorization into text-to-image diffusion models trained on the protected dataset.
Specifically, we modify the protected images by adding unique content to these images using stealthy image warping functions.
By analyzing whether the model has memorized the injected content, we can detect models that have illegally utilized the unauthorized data.
- Score: 79.71665540122498
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent text-to-image diffusion models have shown surprising performance in generating high-quality images. However, concerns have arisen regarding the unauthorized data usage during the training or fine-tuning process. One example is when a model trainer collects a set of images created by a particular artist and attempts to train a model capable of generating similar images without obtaining permission and giving credit to the artist. To address this issue, we propose a method for detecting such unauthorized data usage by planting the injected memorization into the text-to-image diffusion models trained on the protected dataset. Specifically, we modify the protected images by adding unique content to these images using stealthy image warping functions that are nearly imperceptible to humans but can be captured and memorized by diffusion models. By analyzing whether the model has memorized the injected content (i.e., whether the generated images are processed by the injected post-processing function), we can detect models that have illegally utilized the unauthorized data. Experiments on Stable Diffusion and VQ Diffusion with different model training or fine-tuning methods (i.e., LoRA, DreamBooth, and standard training) demonstrate the effectiveness of our proposed method in detecting unauthorized data usages. Code: https://github.com/ZhentingWang/DIAGNOSIS.
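A minimal sketch of the coating idea described in the abstract, assuming a simple low-amplitude sinusoidal warp; the actual warping function, its strength, and the memorization detector used by DIAGNOSIS are defined in the authors' repository, so the function name and parameters below are illustrative only.

```python
# Illustrative "coating" of a protected image with a subtle warp before release.
# Hypothetical parameters; DIAGNOSIS's real warping function and detector differ.
import numpy as np
from scipy.ndimage import map_coordinates

def coat_image(img: np.ndarray, strength: float = 1.0, period: float = 32.0) -> np.ndarray:
    """Apply a low-amplitude sinusoidal warp (nearly imperceptible) to an HxWxC image."""
    h, w = img.shape[:2]
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Smooth displacement field; small amplitude keeps the change hard to see.
    dx = strength * np.sin(2 * np.pi * yy / period)
    dy = strength * np.cos(2 * np.pi * xx / period)
    warped = np.stack(
        [map_coordinates(img[..., c], [yy + dy, xx + dx], order=1, mode="reflect")
         for c in range(img.shape[2])],
        axis=-1,
    )
    return warped
```

A trainer who fine-tunes on coated images tends to reproduce the warp; detection then asks whether generated samples look "coated" (in the paper, whether generated images appear to have been processed by the injected post-processing function).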
Related papers
- Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models [23.09033991200197]
New personalization techniques have been proposed to customize the pre-trained base models for crafting images with specific themes or styles.
Such a lightweight solution poses a new concern regarding whether the personalized models are trained on unauthorized data.
We introduce SIREN, a novel methodology to proactively trace unauthorized data usage in black-box personalized text-to-image diffusion models.
arXiv Detail & Related papers (2024-10-14T12:29:23Z) - CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion [26.70115339710056]
Contrasting Gradient Inversion for Diffusion Models (CGI-DM) is a novel method featuring vivid visual representations for digital copyright authentication.
We formulate the differences as KL divergence between latent variables of the two models when given the same input image.
Experiments on the WikiArt and Dreambooth datasets demonstrate the high accuracy of CGI-DM in digital copyright authentication.
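The KL-divergence criterion can be written compactly; a sketch under assumed notation (not the paper's exact symbols), with theta the pretrained model and theta' the fine-tuned model.

```latex
% Sketch of the CGI-DM criterion; notation assumed here rather than taken from the paper.
% Given the same input image x, compare the latent distributions induced by the
% pretrained model (\theta) and the fine-tuned model (\theta').
\[
  \Delta(x) \;=\; D_{\mathrm{KL}}\!\left( p_{\theta'}(z \mid x) \,\middle\|\, p_{\theta}(z \mid x) \right)
\]
% A large gap \Delta(x) indicates the fine-tuned model has adapted to x (or images
% close to it), which serves as evidence in the copyright-authentication decision.
```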
arXiv Detail & Related papers (2024-03-17T10:06:38Z) - IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI [52.90082445349903]
Diffusion-based image generation models can create artistic images that mimic an artist's style, or maliciously edit original images to produce fake content.
Several attempts have been made to protect the original images from such unauthorized data usage by adding imperceptible perturbations.
In this work, we introduce a purification perturbation platform, named IMPRESS, to evaluate the effectiveness of imperceptible perturbations as a protective measure.
arXiv Detail & Related papers (2023-10-30T03:33:41Z) - Exposing the Fake: Effective Diffusion-Generated Images Detection [14.646957596560076]
This paper proposes a novel detection method called Stepwise Error for Diffusion-generated Image Detection (SeDID).
SeDID exploits the unique attributes of diffusion models, namely deterministic reverse and deterministic denoising errors.
Our work makes a pivotal contribution to distinguishing diffusion model-generated images, marking a significant step in the domain of artificial intelligence security.
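A rough sketch of the stepwise-error intuition, under assumed DDIM-style notation (the paper's exact definition differs in detail): invert an image deterministically to a timestep, denoise one step back, and measure the round-trip error.

```latex
% Stepwise-error sketch; I_\theta denotes deterministic (DDIM-style) inversion to a
% timestep and D_\theta one deterministic denoising step. Notation assumed here.
\[
  e_t(x_0) \;=\; \bigl\lVert \, I_\theta(x_0,\, t-1) \;-\; D_\theta\bigl( I_\theta(x_0,\, t),\, t \bigr) \, \bigr\rVert
\]
% Images sampled from the diffusion model itself tend to round-trip with smaller
% error than natural images, which is the detection signal exploited at chosen t.
```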
arXiv Detail & Related papers (2023-07-12T16:16:37Z) - Unlearnable Examples for Diffusion Models: Protect Data from Unauthorized Exploitation [25.55296442023984]
We propose a method, Unlearnable Diffusion Perturbation, to safeguard images from unauthorized exploitation.
Such protection is important in real-world scenarios, as it helps safeguard privacy and copyright against exploitation by AI content generation.
arXiv Detail & Related papers (2023-06-02T20:19:19Z) - Discriminative Class Tokens for Text-to-Image Diffusion Models [107.98436819341592]
We propose a non-invasive fine-tuning technique that capitalizes on the expressive potential of free-form text.
Our method is fast compared to prior fine-tuning methods and does not require a collection of in-class images.
We evaluate our method extensively, showing that the generated images (i) are more accurate and of higher quality than those of standard diffusion models, (ii) can be used to augment training data in a low-resource setting, and (iii) reveal information about the data used to train the guiding classifier.
arXiv Detail & Related papers (2023-03-30T05:25:20Z) - DIRE for Diffusion-Generated Image Detection [128.95822613047298]
We propose a novel representation called DIffusion Reconstruction Error (DIRE).
DIRE measures the error between an input image and its reconstruction counterpart by a pre-trained diffusion model.
Generated images can be approximately reconstructed by a pre-trained diffusion model while real images cannot, which provides a hint that DIRE can serve as a bridge to distinguish generated and real images.
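In symbols (a sketch of the definition as summarized above; the symbols I and R are assumed here):

```latex
% DIRE in one line; I is deterministic DDIM inversion, R the reconstruction by a
% pre-trained diffusion model (symbols assumed from the summary above).
\[
  \mathrm{DIRE}(x_0) \;=\; \bigl\lvert\, x_0 \;-\; R\bigl(I(x_0)\bigr) \,\bigr\rvert
\]
% Generated images are approximately reproducible by the model, so their
% reconstruction error tends to be small; real images reconstruct less faithfully.
```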
arXiv Detail & Related papers (2023-03-16T13:15:03Z) - Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method that uses masked images as counterfactual samples to help improve the robustness of the fine-tuned model.
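A minimal illustration of producing masked counterfactual views; the patch size and masking ratio below are hypothetical, and how such views enter fine-tuning is not specified by the summary.

```python
# Illustrative patch masking for counterfactual views; patch size and ratio are
# hypothetical, and the paper's actual masking and refilling strategy differ.
import numpy as np

def mask_patches(img: np.ndarray, patch: int = 16, ratio: float = 0.3, seed: int = 0) -> np.ndarray:
    """Zero out a random subset of non-overlapping patches in an HxWxC image."""
    rng = np.random.default_rng(seed)
    out = img.copy()
    h, w = img.shape[:2]
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            if rng.random() < ratio:
                out[y:y + patch, x:x + patch] = 0
    return out
```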
arXiv Detail & Related papers (2023-03-06T11:51:28Z) - Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models [53.03978584040557]
We study image retrieval frameworks that enable us to compare generated images with training samples and detect when content has been replicated.
Applying our frameworks to diffusion models trained on multiple datasets including Oxford flowers, Celeb-A, ImageNet, and LAION, we discuss how factors such as training set size impact rates of content replication.
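A minimal sketch of a retrieval-style replication check, assuming generic L2-normalized image embeddings; the paper's actual retrieval frameworks, feature extractors, and thresholds differ, so the function name and threshold are illustrative.

```python
# Retrieval-style replication check over precomputed, L2-normalized embeddings.
import numpy as np

def replication_candidates(gen_feats: np.ndarray,
                           train_feats: np.ndarray,
                           threshold: float = 0.95):
    """Flag generated images whose nearest training neighbor is suspiciously similar.

    gen_feats:   (n_gen, d) L2-normalized embeddings of generated images.
    train_feats: (n_train, d) L2-normalized embeddings of training images.
    Returns (generated index, nearest training index, cosine similarity) triples.
    """
    sims = gen_feats @ train_feats.T            # cosine similarity matrix
    nn_sim = sims.max(axis=1)                   # best match per generated image
    nn_idx = sims.argmax(axis=1)
    flagged = np.where(nn_sim >= threshold)[0]  # likely replicated content
    return [(int(i), int(nn_idx[i]), float(nn_sim[i])) for i in flagged]
```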
arXiv Detail & Related papers (2022-12-07T18:58:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.