Related papers: Finding Dori: Memorization in Text-to-Image Diffusion Models Is Not Local

URL: http://arxiv.org/abs/2507.16880v2
Date: Tue, 14 Oct 2025 06:59:43 GMT
Title: Finding Dori: Memorization in Text-to-Image Diffusion Models Is Not Local
Authors: Antoni Kowalczuk, Dominik Hintersdorf, Lukas Struppek, Kristian Kersting, Adam Dziedzic, Franziska Boenisch
Abstract summary: Recent mitigation efforts have focused on identifying and pruning weights responsible for triggering verbatim training data replication. We challenge this assumption and demonstrate that, even after such pruning, small perturbations to the text embeddings of previously mitigated prompts can re-trigger data replication. Our findings provide new insights into the nature of memorization in text-to-image DMs and inform the development of more reliable mitigations against DM memorization.
Score: 55.33447817350623
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Text-to-image diffusion models (DMs) have achieved remarkable success in image generation. However, concerns about data privacy and intellectual property remain due to their potential to inadvertently memorize and replicate training data. Recent mitigation efforts have focused on identifying and pruning weights responsible for triggering verbatim training data replication, based on the assumption that memorization can be localized. We challenge this assumption and demonstrate that, even after such pruning, small perturbations to the text embeddings of previously mitigated prompts can re-trigger data replication, revealing the fragility of such defenses. Our further analysis then provides multiple indications that memorization is indeed not inherently local: (1) replication triggers for memorized images are distributed throughout text embedding space; (2) embeddings yielding the same replicated image produce divergent model activations; and (3) different pruning methods identify inconsistent sets of memorization-related weights for the same image. Finally, we show that bypassing the locality assumption enables more robust mitigation through adversarial fine-tuning. These findings provide new insights into the nature of memorization in text-to-image DMs and inform the development of more reliable mitigations against DM memorization.

Related papers

You Don't Need All That Attention: Surgical Memorization Mitigation in Text-to-Image Diffusion Models [8.429432661292964]
Generative models have been shown to "memorize" certain training data, leading to the generation of verbatim or near-verbatim images. We introduce Guidance Using Attractive-Repulsive Dynamics (GUARD), a novel framework for memorization mitigation in text-to-image diffusion models. GUARD adjusts the image denoising process to guide the generation away from an original training image and towards one that is distinct from training data.
arXiv Detail & Related papers (2026-02-23T17:20:40Z)
Demystifying Foreground-Background Memorization in Diffusion Models [23.914702151370204]
Diffusion models (DMs) memorize training images and can reproduce near-duplicates during generation. Current detection methods identify verbatim memorization but fail to capture two critical aspects. We propose Foreground Background Memorization (FB-Mem), a novel segmentation-based metric that classifies and quantifies memorized regions within generated images.
arXiv Detail & Related papers (2025-08-16T20:15:16Z)
Active Adversarial Noise Suppression for Image Forgery Localization [56.98050814363447]
We introduce an Adversarial Noise Suppression Module (ANSM) that generates a defensive perturbation to suppress the attack effect of adversarial noise. To the best of our knowledge, this is the first report of adversarial defense in image forgery localization tasks.
arXiv Detail & Related papers (2025-06-15T14:53:27Z)
CopyJudge: Automated Copyright Infringement Identification and Mitigation in Text-to-Image Diffusion Models [58.58208005178676]
We propose CopyJudge, a novel automated infringement identification framework. We employ an abstraction-filtration-comparison test framework to assess the likelihood of infringement. We introduce a general LVLM-based mitigation strategy that automatically optimizes infringing prompts.
arXiv Detail & Related papers (2025-02-21T08:09:07Z)
LoyalDiffusion: A Diffusion Model Guarding Against Data Replication [6.818344768093927]
Diffusion models can replicate training data, particularly when the training data includes confidential information. We propose a replication-aware U-Net architecture that incorporates information transfer blocks into skip connections that are less essential for image quality. Experiments demonstrate that LoyalDiffusion outperforms the state-of-the-art replication mitigation method, achieving a 48.63% reduction in replication while maintaining comparable image quality.
arXiv Detail & Related papers (2024-12-02T04:41:30Z)
Exploring Local Memorization in Diffusion Models via Bright Ending Attention [62.979954692036685]
A "bright ending" (BE) anomaly arises in text-to-image diffusion models prone to memorizing training images. We propose a simple yet effective method to integrate BE into existing frameworks.
arXiv Detail & Related papers (2024-10-29T02:16:01Z)
Detecting, Explaining, and Mitigating Memorization in Diffusion Models [49.438362005962375]
We introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions. Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step. Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization.
arXiv Detail & Related papers (2024-07-31T16:13:29Z)
Embedding Space Selection for Detecting Memorization and Fingerprinting in Generative Models [45.83830252441126]
Generative Adversarial Networks (GANs) and Diffusion Models have become cornerstone technologies, driving innovation in diverse fields from art creation to healthcare. Despite their potential, these models face the significant challenge of data memorization, which poses risks to privacy and the integrity of generated content. We study memorization scores calculated from encoder layer embeddings, which involves measuring distances between samples in the embedding spaces.
arXiv Detail & Related papers (2024-07-30T19:52:49Z)
Iterative Ensemble Training with Anti-Gradient Control for Mitigating Memorization in Diffusion Models [20.550324116099357]
Diffusion models are known for their tremendous ability to generate novel and high-quality samples. Recent approaches for memory mitigation either only focused on the text modality problem in cross-modal generation tasks or utilized data augmentation strategies. We propose a novel training framework for diffusion models from the perspective of visual modality, which is more generic and fundamental for mitigating memorization.
arXiv Detail & Related papers (2024-07-22T02:19:30Z)
EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations [73.94175015918059]
We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage. By strategically incorporating the template memorization, EnTruth can trigger the specific behavior in unauthorized models as the evidence of infringement. Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
arXiv Detail & Related papers (2024-06-20T02:02:44Z)
Memorized Images in Diffusion Models share a Subspace that can be Located and Deleted [15.162296378581853]
Large-scale text-to-image diffusion models excel in generating high-quality images from textual inputs. Concerns arise as research indicates their tendency to memorize and replicate training data. Efforts within the text-to-image community to address memorization explore causes such as data duplication, replicated captions, or trigger tokens.
arXiv Detail & Related papers (2024-06-01T15:47:13Z)
Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention [62.671435607043875]
Research indicates that text-to-image diffusion models replicate images from their training data, raising tremendous concerns about potential copyright infringement and privacy risks. We reveal that during memorization, the cross-attention tends to focus disproportionately on the embeddings of specific tokens. We introduce an innovative approach to detect and mitigate memorization in diffusion models.
arXiv Detail & Related papers (2024-03-17T01:27:00Z)
Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication [16.447035745151428]
Diffusion-based models have revolutionized text-to-image synthesis with their ability to produce high-quality, high-resolution images. These models also raise concerns due to their tendency to replicate exact training samples, posing privacy risks and enabling adversarial attacks. This paper focuses on two distinct and underexplored types of duplication that lead to replication during inference in diffusion-based models.
arXiv Detail & Related papers (2023-12-06T18:54:44Z)
Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective [91.14291142262262]
This work presents a straightforward and fundamental explanation from the data perspective. Our preliminary investigation reveals a strong correlation between the degeneration issue and the presence of repetitions in training data. Our experiments reveal that penalizing the repetitions in training data remains critical even when considering larger model sizes and instruction tuning.
arXiv Detail & Related papers (2023-10-16T09:35:42Z)
Understanding and Mitigating Copying in Diffusion Models [53.03978584040557]
Images generated by diffusion models like Stable Diffusion are increasingly widespread. Recent works and even lawsuits have shown that these models are prone to replicating their training data, unbeknownst to the user.
arXiv Detail & Related papers (2023-05-31T17:58:02Z)
Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning [113.58691755215663]
We develop RetroPrompt to help a model strike a balance between generalization and memorization. In contrast with vanilla prompt learning, RetroPrompt constructs an open-book knowledge-store from training instances. Extensive experiments demonstrate that RetroPrompt can obtain better performance in both few-shot and zero-shot settings.
arXiv Detail & Related papers (2022-05-29T16:07:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.