Related papers: Unconsciously Forget: Mitigating Memorization; Without Knowing What is being Memorized

URL: http://arxiv.org/abs/2512.09687v2
Date: Fri, 12 Dec 2025 15:01:15 GMT
Title: Unconsciously Forget: Mitigating Memorization; Without Knowing What is being Memorized
Authors: Er Jin, Yang Zhang, Yongli Mou, Yanfei Dong, Stefan Decker, Kenji Kawaguchi, Johannes Stegmaier
Abstract summary: Memorizing training data can lead to legal challenges, including copyright infringement, violations of portrait rights, and trademark violations. Our work demonstrates that specific parts of the model are responsible for copyrighted content generation. By applying model pruning, we can effectively suppress the probability of generating copyrighted content without targeting specific concepts.
Score: 41.5028352241977
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in generative models have demonstrated an exceptional ability to produce highly realistic images. However, previous studies show that generated images often resemble the training data, and this problem becomes more severe as the model size increases. Memorizing training data can lead to legal challenges, including copyright infringement, violations of portrait rights, and trademark violations. Existing approaches to mitigating memorization mainly focus on manipulating the denoising sampling process to steer image embeddings away from the memorized embedding space or employ unlearning methods that require training on datasets containing specific sets of memorized concepts. However, existing methods often incur substantial computational overhead during sampling, or focus narrowly on removing one or more groups of target concepts, imposing a significant limitation on their scalability. To understand and mitigate these problems, our work, UniForget, offers a new perspective on understanding the root cause of memorization. Our work demonstrates that specific parts of the model are responsible for copyrighted content generation. By applying model pruning, we can effectively suppress the probability of generating copyrighted content without targeting specific concepts while preserving the general generative capabilities of the model. Additionally, we show that our approach is both orthogonal and complementary to existing unlearning methods, thereby highlighting its potential to improve current unlearning and de-memorization techniques.
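The abstract names model pruning as the mitigation but does not say how weights are selected. As a rough illustration only, the sketch below zeroes the weights with the highest importance scores. The function name `prune_by_score`, the flat-list weight layout, and the idea of a per-weight "memorization score" are all assumptions made for this example; they are not taken from the UniForget paper.

```python
def prune_by_score(weights, scores, fraction):
    """Zero out the `fraction` of weights that have the highest scores.

    Hypothetical helper: `scores` stands in for any per-weight estimate of
    how strongly a weight contributes to memorized outputs. How such scores
    would be computed is not described in the listing above.
    """
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")

    # Number of weights to remove (rounded down).
    k = int(fraction * len(weights))

    # Indices sorted from highest score to lowest; the first k are pruned.
    ranked = sorted(range(len(weights)), key=lambda i: scores[i], reverse=True)
    drop = set(ranked[:k])

    # Return a new list so the caller's weights are left untouched.
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


weights = [0.4, -1.2, 0.7, 2.0]
scores = [0.1, 0.9, 0.2, 0.8]   # made-up values for illustration
pruned = prune_by_score(weights, scores, fraction=0.5)
print(pruned)  # [0.4, 0.0, 0.7, 0.0]
```

Because the selection depends only on the scores and not on a list of target concepts, a scheme of this shape needs no knowledge of which specific items were memorized, which is the property the abstract emphasizes.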

Related papers

From Logits to Latents: Contrastive Representation Shaping for LLM Unlearning [13.726373414710137]
We introduce CLReg, a representation regularizer that identifies forget features while pushing them away from retain features. We provide first theoretical insights that relate representation shaping to entanglement reduction. CLReg decreases forget-retain representation entanglement, which facilitates mainstream unlearning methods without posing extra privacy risks.
arXiv Detail & Related papers (2026-01-29T17:34:37Z)
Beyond Memorization: Gradient Projection Enables Selective Learning in Diffusion Models [3.4064487905075294]
Memorization in large-scale text-to-image diffusion models poses significant security and intellectual property risks. We introduce a Gradient Projection Framework designed to enforce a stringent requirement of concept-level feature exclusion. Our approach establishes a new paradigm for IP-safe and privacy-preserving generative AI.
arXiv Detail & Related papers (2025-12-12T00:50:38Z)
Memory Self-Regeneration: Uncovering Hidden Knowledge in Unlearned Models [1.3654763247057877]
We present considerations regarding the ability of models to forget and recall knowledge. We present the MemoRa strategy, which we consider to be a regenerative approach supporting the effective recovery of previously lost knowledge.
arXiv Detail & Related papers (2025-09-26T19:11:01Z)
Sculpting Memory: Multi-Concept Forgetting in Diffusion Models via Dynamic Mask and Concept-Aware Optimization [20.783312940122297]
Text-to-image (T2I) diffusion models have achieved remarkable success in generating high-quality images from textual prompts. However, their ability to store vast amounts of knowledge raises concerns in scenarios where selective forgetting is necessary. We propose Dynamic Mask coupled with Concept-Aware Loss, a novel unlearning framework designed for multi-concept forgetting.
arXiv Detail & Related papers (2025-04-12T01:38:58Z)
Boosting Alignment for Post-Unlearning Text-to-Image Generative Models [55.82190434534429]
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data. This often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns. We propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives.
arXiv Detail & Related papers (2024-12-09T21:36:10Z)
Rethinking and Defending Protective Perturbation in Personalized Diffusion Models [21.30373461975769]
We study the fine-tuning process of personalized diffusion models (PDMs) through the lens of shortcut learning. PDMs are susceptible to minor adversarial perturbations, leading to significant degradation when fine-tuned on corrupted datasets. We propose a systematic defense framework that includes data purification and contrastive decoupling learning.
arXiv Detail & Related papers (2024-06-27T07:14:14Z)
EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations [73.94175015918059]
We introduce a novel approach, EnTruth, which Enhances Traceability of unauthorized dataset usage. By strategically incorporating the template memorization, EnTruth can trigger the specific behavior in unauthorized models as the evidence of infringement. Our method is the first to investigate the positive application of memorization and use it for copyright protection, which turns a curse into a blessing.
arXiv Detail & Related papers (2024-06-20T02:02:44Z)
Memorized Images in Diffusion Models share a Subspace that can be Located and Deleted [15.162296378581853]
Large-scale text-to-image diffusion models excel in generating high-quality images from textual inputs. Concerns arise as research indicates their tendency to memorize and replicate training data. Efforts within the text-to-image community to address memorization explore causes such as data duplication, replicated captions, or trigger tokens.
arXiv Detail & Related papers (2024-06-01T15:47:13Z)
Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention [62.671435607043875]
Research indicates that text-to-image diffusion models replicate images from their training data, raising tremendous concerns about potential copyright infringement and privacy risks. We reveal that during memorization, the cross-attention tends to focus disproportionately on the embeddings of specific tokens. We introduce an innovative approach to detect and mitigate memorization in diffusion models.
arXiv Detail & Related papers (2024-03-17T01:27:00Z)
A Dataset and Benchmark for Copyright Infringement Unlearning from Text-to-Image Diffusion Models [52.49582606341111]
Copyright law confers creators the exclusive rights to reproduce, distribute, and monetize their creative works. Recent progress in text-to-image generation has introduced formidable challenges to copyright enforcement. We introduce a novel pipeline that harmonizes CLIP, ChatGPT, and diffusion models to curate a dataset.
arXiv Detail & Related papers (2024-01-04T11:14:01Z)
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models [79.50701155336198]
Forget-Me-Not is designed to safely remove specified IDs, objects, or styles from a well-configured text-to-image model in as little as 30 seconds. We demonstrate that Forget-Me-Not can effectively eliminate targeted concepts while maintaining the model's performance on other concepts. It can also be adapted as a lightweight model patch for Stable Diffusion, allowing for concept manipulation and convenient distribution.
arXiv Detail & Related papers (2023-03-30T17:58:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.