Related papers: EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations

EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations

URL: http://arxiv.org/abs/2406.13933v1
Date: Thu, 20 Jun 2024 02:02:44 GMT
Title: EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations
Authors: Jie Ren, Yingqian Cui, Chen Chen, Vikash Sehwag, Yue Xing, Jiliang Tang, Lingjuan Lyu,
Abstract summary: We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage. By strategically incorporating the template memorization, EnTruth can trigger the specific behavior in unauthorized models as the evidence of infringement. Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
Score: 73.94175015918059
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative models, especially text-to-image diffusion models, have significantly advanced in their ability to generate images, benefiting from enhanced architectures, increased computational power, and large-scale datasets. While the datasets play an important role, their protection has remained as an unsolved issue. Current protection strategies, such as watermarks and membership inference, are either in high poison rate which is detrimental to image quality or suffer from low accuracy and robustness. In this work, we introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage utilizing template memorization. By strategically incorporating the template memorization, EnTruth can trigger the specific behavior in unauthorized models as the evidence of infringement. Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing and offers a pioneering perspective for unauthorized usage detection in generative models. Comprehensive experiments are provided to demonstrate its effectiveness in terms of data-alteration rate, accuracy, robustness and generation quality.

Related papers

Harnessing Frequency Spectrum Insights for Image Copyright Protection Against Diffusion Models [26.821064889438777]
We present novel evidence that diffusion-generated images faithfully preserve the statistical properties of their training data. We introduce emphCoprGuard, a robust frequency domain watermarking framework to safeguard against unauthorized image usage.
arXiv Detail & Related papers (2025-03-14T04:27:50Z)
Exploiting Watermark-Based Defense Mechanisms in Text-to-Image Diffusion Models for Unauthorized Data Usage [14.985938758090763]
Text-to-image diffusion models, such as Stable Diffusion, have shown exceptional potential in generating high-quality images. Recent studies highlight concerns over the use of unauthorized data in training these models, which may lead to intellectual property infringement or privacy violations. In this paper, we examine the robustness of various watermark-based protection methods applied to text-to-image models.
arXiv Detail & Related papers (2024-11-22T22:28:19Z)
Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models [23.09033991200197]
New personalization techniques have been proposed to customize the pre-trained base models for crafting images with specific themes or styles. Such a lightweight solution poses a new concern regarding whether the personalized models are trained from unauthorized data. We introduce SIREN, a novel methodology to proactively trace unauthorized data usage in black-box personalized text-to-image diffusion models.
arXiv Detail & Related papers (2024-10-14T12:29:23Z)
Detecting Dataset Abuse in Fine-Tuning Stable Diffusion Models for Text-to-Image Synthesis [3.8809673918404246]
dataset watermarking framework designed to detect unauthorized usage and trace data leaks. We present a dataset watermarking framework designed to detect unauthorized usage and trace data leaks.
arXiv Detail & Related papers (2024-09-27T16:34:48Z)
Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adrial robustness has been conventionally believed as a challenging property to encode for neural networks. We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets. We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability. Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
Generative Models are Self-Watermarked: Declaring Model Authentication through Re-Generation [17.88043926057354]
verifying data ownership poses formidable challenges, particularly in cases of unauthorized reuse of generated data. Our work is dedicated to detecting data reuse from even an individual sample. We propose an explainable verification procedure that attributes data ownership through re-generation, and further amplifies these fingerprints in the generative models through iterative data re-generation.
arXiv Detail & Related papers (2024-02-23T10:48:21Z)
A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models [52.49582606341111]
Copyright law confers creators the exclusive rights to reproduce, distribute, and monetize their creative works. Recent progress in text-to-image generation has introduced formidable challenges to copyright enforcement. We introduce a novel pipeline that harmonizes CLIP, ChatGPT, and diffusion models to curate a dataset.
arXiv Detail & Related papers (2024-01-04T11:14:01Z)
Adv-Diffusion: Imperceptible Adversarial Face Identity Attack via Latent Diffusion Model [61.53213964333474]
We propose a unified framework Adv-Diffusion that can generate imperceptible adversarial identity perturbations in the latent space but not the raw pixel space. Specifically, we propose the identity-sensitive conditioned diffusion generative model to generate semantic perturbations in the surroundings. The designed adaptive strength-based adversarial perturbation algorithm can ensure both attack transferability and stealthiness.
arXiv Detail & Related papers (2023-12-18T15:25:23Z)
IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI [52.90082445349903]
Diffusion-based image generation models can create artistic images that mimic the style of an artist or maliciously edit the original images for fake content. Several attempts have been made to protect the original images from such unauthorized data usage by adding imperceptible perturbations. In this work, we introduce a purification perturbation platform, named IMPRESS, to evaluate the effectiveness of imperceptible perturbations as a protective measure.
arXiv Detail & Related papers (2023-10-30T03:33:41Z)
Unlearnable Examples for Diffusion Models: Protect Data from Unauthorized Exploitation [25.55296442023984]
We propose a method, Unlearnable Diffusion Perturbation, to safeguard images from unauthorized exploitation. This achievement holds significant importance in real-world scenarios, as it contributes to the protection of privacy and copyright against AI-generated content.
arXiv Detail & Related papers (2023-06-02T20:19:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.