Demystifying Foreground-Background Memorization in Diffusion Models
- URL: http://arxiv.org/abs/2508.12148v1
- Date: Sat, 16 Aug 2025 20:15:16 GMT
- Title: Demystifying Foreground-Background Memorization in Diffusion Models
- Authors: Jimmy Z. Di, Yiwei Lu, Yaoliang Yu, Gautam Kamath, Adam Dziedzic, Franziska Boenisch,
- Abstract summary: Diffusion models (DMs) memorize training images and can reproduce near-duplicates during generation.<n>Current detection methods identify verbatim memorization but fail to capture two critical aspects.<n>We propose Foreground Background Memorization (FB-Mem), a novel segmentation-based metric that classifies and quantifies memorized regions within generated images.
- Score: 23.914702151370204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models (DMs) memorize training images and can reproduce near-duplicates during generation. Current detection methods identify verbatim memorization but fail to capture two critical aspects: quantifying partial memorization occurring in small image regions, and memorization patterns beyond specific prompt-image pairs. To address these limitations, we propose Foreground Background Memorization (FB-Mem), a novel segmentation-based metric that classifies and quantifies memorized regions within generated images. Our method reveals that memorization is more pervasive than previously understood: (1) individual generations from single prompts may be linked to clusters of similar training images, revealing complex memorization patterns that extend beyond one-to-one correspondences; and (2) existing model-level mitigation methods, such as neuron deactivation and pruning, fail to eliminate local memorization, which persists particularly in foreground regions. Our work establishes an effective framework for measuring memorization in diffusion models, demonstrates the inadequacy of current mitigation approaches, and proposes a stronger mitigation method using a clustering approach.
Related papers
- CAPTAIN: Semantic Feature Injection for Memorization Mitigation in Text-to-Image Diffusion Models [60.610268549138375]
Diffusion models can unintentionally reproduce training examples, raising privacy and copyright concerns.<n>We introduce CAPTAIN, a training-free framework that mitigates memorization by directly modifying latent features during denoising.
arXiv Detail & Related papers (2025-12-11T14:01:47Z) - How Diffusion Models Memorize [26.711679643772623]
diffusion models can memorize training data, raising serious privacy and copyright concerns.<n>We show memorization is driven by the overestimation of training samples during early denoising.
arXiv Detail & Related papers (2025-09-30T03:03:27Z) - Finding Dori: Memorization in Text-to-Image Diffusion Models Is Not Local [55.33447817350623]
Recent mitigation efforts have focused on identifying and pruning weights responsible for triggering verbatim training data replication.<n>We challenge this assumption and demonstrate that, even after such pruning, small perturbations to the text embeddings of previously mitigated prompts can re-trigger data replication.<n>Our findings provide new insights into the nature of memorization in text-to-image DMs and inform the development of more reliable mitigations against DM memorization.
arXiv Detail & Related papers (2025-07-22T15:02:38Z) - A Geometric Framework for Understanding Memorization in Generative Models [11.263296715798374]
Recent work has shown that deep generative models can be capable of memorizing and reproducing training datapoints when deployed.<n>These findings call into question the usability of generative models, especially in light of the legal and privacy risks brought about by memorization.<n>We propose the manifold memorization hypothesis (MMH), a geometric framework which leverages the manifold hypothesis into a clear language in which to reason about memorization.
arXiv Detail & Related papers (2024-10-31T18:09:01Z) - Exploring Local Memorization in Diffusion Models via Bright Ending Attention [62.979954692036685]
"bright ending" (BE) anomaly in text-to-image diffusion models prone to memorizing training images.<n>We propose a simple yet effective method to integrate BE into existing frameworks.
arXiv Detail & Related papers (2024-10-29T02:16:01Z) - Detecting, Explaining, and Mitigating Memorization in Diffusion Models [49.438362005962375]
We introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions.
Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step.
Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization.
arXiv Detail & Related papers (2024-07-31T16:13:29Z) - Memorized Images in Diffusion Models share a Subspace that can be Located and Deleted [15.162296378581853]
Large-scale text-to-image diffusion models excel in generating high-quality images from textual inputs.
Concerns arise as research indicates their tendency to memorize and replicate training data.
Efforts within the text-to-image community to address memorization explore causes such as data duplication, replicated captions, or trigger tokens.
arXiv Detail & Related papers (2024-06-01T15:47:13Z) - Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention [62.671435607043875]
Research indicates that text-to-image diffusion models replicate images from their training data, raising tremendous concerns about potential copyright infringement and privacy risks.<n>We reveal that during memorization, the cross-attention tends to focus disproportionately on the embeddings of specific tokens.<n>We introduce an innovative approach to detect and mitigate memorization in diffusion models.
arXiv Detail & Related papers (2024-03-17T01:27:00Z) - Pin the Memory: Learning to Generalize Semantic Segmentation [68.367763672095]
We present a novel memory-guided domain generalization method for semantic segmentation based on meta-learning framework.
Our method abstracts the conceptual knowledge of semantic classes into categorical memory which is constant beyond the domains.
arXiv Detail & Related papers (2022-04-07T17:34:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.