EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations
- URL: http://arxiv.org/abs/2406.13933v1
- Date: Thu, 20 Jun 2024 02:02:44 GMT
- Title: EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations
- Authors: Jie Ren, Yingqian Cui, Chen Chen, Vikash Sehwag, Yue Xing, Jiliang Tang, Lingjuan Lyu,
- Abstract summary: We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage.
By strategically incorporating the template memorization, EnTruth can trigger the specific behavior in unauthorized models as the evidence of infringement.
Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
- Score: 73.94175015918059
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative models, especially text-to-image diffusion models, have significantly advanced in their ability to generate images, benefiting from enhanced architectures, increased computational power, and large-scale datasets. While the datasets play an important role, their protection has remained as an unsolved issue. Current protection strategies, such as watermarks and membership inference, are either in high poison rate which is detrimental to image quality or suffer from low accuracy and robustness. In this work, we introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage utilizing template memorization. By strategically incorporating the template memorization, EnTruth can trigger the specific behavior in unauthorized models as the evidence of infringement. Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing and offers a pioneering perspective for unauthorized usage detection in generative models. Comprehensive experiments are provided to demonstrate its effectiveness in terms of data-alteration rate, accuracy, robustness and generation quality.
Related papers
- Protect-Your-IP: Scalable Source-Tracing and Attribution against Personalized Generation [19.250673262185767]
We propose a unified approach for image copyright source-tracing and attribution.
We introduce an innovative watermarking-attribution method that blends proactive and passive strategies.
We have conducted experiments using various celebrity portrait series sourced online.
arXiv Detail & Related papers (2024-05-26T15:14:54Z) - DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z) - Generative Models are Self-Watermarked: Declaring Model Authentication
through Re-Generation [17.88043926057354]
verifying data ownership poses formidable challenges, particularly in cases of unauthorized reuse of generated data.
Our work is dedicated to detecting data reuse from even an individual sample.
We propose an explainable verification procedure that attributes data ownership through re-generation, and further amplifies these fingerprints in the generative models through iterative data re-generation.
arXiv Detail & Related papers (2024-02-23T10:48:21Z) - A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models [52.49582606341111]
Copyright law confers creators the exclusive rights to reproduce, distribute, and monetize their creative works.
Recent progress in text-to-image generation has introduced formidable challenges to copyright enforcement.
We introduce a novel pipeline that harmonizes CLIP, ChatGPT, and diffusion models to curate a dataset.
arXiv Detail & Related papers (2024-01-04T11:14:01Z) - Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent
Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space.
Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings.
The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z) - Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion? [21.75921532822961]
We introduce a purification method capable of removing protected perturbations while preserving the original image structure.
Experiments reveal that Stable Diffusion can effectively learn from purified images over all protective methods.
arXiv Detail & Related papers (2023-11-30T07:17:43Z) - IMPRESS: Evaluating the Resilience of Imperceptible Perturbations
Against Unauthorized Data Usage in Diffusion-Based Generative AI [52.90082445349903]
Diffusion-based image generation models can create artistic images that mimic the style of an artist or maliciously edit the original images for fake content.
Several attempts have been made to protect the original images from such unauthorized data usage by adding imperceptible perturbations.
In this work, we introduce a purification perturbation platform, named IMPRESS, to evaluate the effectiveness of imperceptible perturbations as a protective measure.
arXiv Detail & Related papers (2023-10-30T03:33:41Z) - Unlearnable Examples for Diffusion Models: Protect Data from Unauthorized Exploitation [25.55296442023984]
We propose a method, Unlearnable Diffusion Perturbation, to safeguard images from unauthorized exploitation.
This achievement holds significant importance in real-world scenarios, as it contributes to the protection of privacy and copyright against AI-generated content.
arXiv Detail & Related papers (2023-06-02T20:19:19Z) - ConfounderGAN: Protecting Image Data Privacy with Causal Confounder [85.6757153033139]
We propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners.
Experiments are conducted in six image classification datasets, consisting of three natural object datasets and three medical datasets.
arXiv Detail & Related papers (2022-12-04T08:49:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.