Sculpting Memory: Multi-Concept Forgetting in Diffusion Models via Dynamic Mask and Concept-Aware Optimization
- URL: http://arxiv.org/abs/2504.09039v2
- Date: Sat, 28 Jun 2025 04:37:01 GMT
- Title: Sculpting Memory: Multi-Concept Forgetting in Diffusion Models via Dynamic Mask and Concept-Aware Optimization
- Authors: Gen Li, Yang Xiao, Jie Ji, Kaiyuan Deng, Bo Hui, Linke Guo, Xiaolong Ma
- Abstract summary: Text-to-image (T2I) diffusion models have achieved remarkable success in generating high-quality images from textual prompts. However, their ability to store vast amounts of knowledge raises concerns in scenarios where selective forgetting is necessary. We propose Dynamic Mask coupled with Concept-Aware Loss, a novel unlearning framework designed for multi-concept forgetting.
- Score: 20.783312940122297
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image (T2I) diffusion models have achieved remarkable success in generating high-quality images from textual prompts. However, their ability to store vast amounts of knowledge raises concerns in scenarios where selective forgetting is necessary, such as removing copyrighted content, reducing biases, or eliminating harmful concepts. While existing unlearning methods can remove certain concepts, they struggle with multi-concept forgetting due to instability, residual knowledge persistence, and generation quality degradation. To address these challenges, we propose Dynamic Mask coupled with Concept-Aware Loss, a novel unlearning framework designed for multi-concept forgetting in diffusion models. Our Dynamic Mask mechanism adaptively updates gradient masks based on the current optimization state, allowing selective weight modifications that prevent interference with unrelated knowledge. Additionally, our Concept-Aware Loss explicitly guides the unlearning process by enforcing semantic consistency through superclass alignment, while a regularization loss based on knowledge distillation ensures that previously unlearned concepts remain forgotten during sequential unlearning. We conduct extensive experiments to evaluate our approach. Results demonstrate that our method outperforms existing unlearning techniques in forgetting effectiveness, output fidelity, and semantic coherence, particularly in multi-concept scenarios. Our work provides a principled and flexible framework for stable and high-fidelity unlearning in generative models. The code will be released publicly.
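Since the code is not yet released, the following is a minimal PyTorch sketch of how the two ingredients could fit together, with a toy noise predictor standing in for the diffusion UNet. The top-k masking rule, the 5% keep ratio, the use of a frozen snapshot as both superclass anchor and distillation teacher, and all names are illustrative assumptions, not the authors' implementation.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEps(nn.Module):
    """Toy stand-in for a conditional diffusion noise predictor."""
    def __init__(self, d=16):
        super().__init__()
        self.net = nn.Linear(2 * d, d)

    def forward(self, x_t, cond):
        return self.net(torch.cat([x_t, cond], dim=-1))

def dynamic_grad_mask(p, keep_ratio=0.05):
    """One plausible Dynamic Mask: recomputed from the *current* gradients
    at every step, keeping only the largest-magnitude entries so each
    update stays localized and unrelated knowledge is left untouched."""
    g = p.grad.abs().flatten()
    k = max(1, int(keep_ratio * g.numel()))
    thresh = g.kthvalue(g.numel() - k + 1).values
    return (p.grad.abs() >= thresh).float()

model = TinyEps()
teacher = copy.deepcopy(model).eval()   # frozen pre-unlearning snapshot
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

x_t = torch.randn(8, 16)        # noisy latents
c_target = torch.randn(8, 16)   # embedding of the concept to forget
c_super = torch.randn(8, 16)    # embedding of its superclass (e.g. "dog" for a breed)
c_keep = torch.randn(8, 16)     # prompts whose behavior must persist

for step in range(100):
    opt.zero_grad()
    with torch.no_grad():
        eps_super = teacher(x_t, c_super)   # semantic anchor for the forget loss
        eps_keep = teacher(x_t, c_keep)     # distillation target for retention
    # Concept-Aware Loss: pull the target concept's prediction toward the
    # superclass behavior instead of toward arbitrary noise.
    loss_forget = F.mse_loss(model(x_t, c_target), eps_super)
    # KD regularizer: keep unrelated / previously unlearned prompts unchanged.
    loss_keep = F.mse_loss(model(x_t, c_keep), eps_keep)
    (loss_forget + loss_keep).backward()
    # Apply the dynamic mask before the optimizer consumes the gradients.
    for p in model.parameters():
        if p.grad is not None:
            p.grad.mul_(dynamic_grad_mask(p))
    opt.step()
```

Removing the `p.grad.mul_` line recovers plain joint optimization; the point of the mask is that each step only touches the few weights whose gradients currently matter most for forgetting, which is what keeps unrelated knowledge intact.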
Related papers
- Image Can Bring Your Memory Back: A Novel Multi-Modal Guided Attack against Image Generation Model Unlearning [28.15997901023315]
Recall is a novel adversarial framework designed to compromise the robustness of unlearned IGMs. It consistently outperforms existing baselines in terms of adversarial effectiveness, computational efficiency, and semantic fidelity with the original prompt. These findings reveal critical vulnerabilities in current unlearning mechanisms and underscore the need for more robust solutions.
arXiv Detail & Related papers (2025-07-09T02:59:01Z)
- TRACE: Trajectory-Constrained Concept Erasure in Diffusion Models [0.0]
Concept erasure aims to remove or suppress specific concept information in a generative model. Trajectory-Constrained Attentional Concept Erasure (TRACE) is a novel method to erase targeted concepts from diffusion models. TRACE achieves state-of-the-art performance, outperforming recent methods such as ANT, EraseAnything, and MACE in terms of removal efficacy and output quality.
arXiv Detail & Related papers (2025-05-29T10:15:22Z)
- Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models [56.35484513848296]
FADE (Fine-grained Attenuation for Diffusion Erasure) is an adjacency-aware unlearning algorithm for text-to-image generative models. It removes target concepts with minimal impact on correlated concepts, achieving a 12% improvement in retention performance over state-of-the-art methods.
arXiv Detail & Related papers (2025-03-25T15:49:48Z)
- TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models [45.393001061726366]
Recent advances in text-to-image diffusion models enable photorealistic image generation, but they also risk producing malicious content, such as NSFW images.
To mitigate these risks, concept erasure methods have been studied to enable models to unlearn specific concepts.
We propose TRCE, which uses a two-stage concept erasure strategy to achieve an effective trade-off between reliable erasure and knowledge preservation.
arXiv Detail & Related papers (2025-03-10T14:37:53Z)
- Boosting Alignment for Post-Unlearning Text-to-Image Generative Models [55.82190434534429]
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data. However, this often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns. We propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives.
arXiv Detail & Related papers (2024-12-09T21:36:10Z)
- Towards Lifelong Few-Shot Customization of Text-to-Image Diffusion [50.26583654615212]
Lifelong few-shot customization for text-to-image diffusion aims to continually generalize existing models for new tasks with minimal data.
In this study, we identify and categorize the catastrophic forgetting problems into two types: relevant concepts forgetting and previous concepts forgetting.
Unlike existing methods that rely on additional real data or offline replay of original concept data, our approach enables on-the-fly knowledge distillation to retain the previous concepts while learning new ones.
arXiv Detail & Related papers (2024-11-08T12:58:48Z)
- How to Continually Adapt Text-to-Image Diffusion Models for Flexible Customization? [91.49559116493414]
We propose a novel Concept-Incremental text-to-image Diffusion Model (CIDM).
It can resolve catastrophic forgetting and concept neglect to learn new customization tasks in a concept-incremental manner.
Experiments validate that our CIDM surpasses existing custom diffusion models.
arXiv Detail & Related papers (2024-10-23T06:47:29Z)
- Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models [76.39651111467832]
We introduce Reliable and Efficient Concept Erasure (RECE), a novel approach that modifies the model in 3 seconds without necessitating additional fine-tuning.
To mitigate inappropriate content potentially represented by derived embeddings, RECE aligns them with harmless concepts in cross-attention layers.
The derivation and erasure of new representation embeddings are conducted iteratively to achieve a thorough erasure of inappropriate concepts.
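The "without fine-tuning" part is easiest to picture as a UCE-style regularized least-squares edit of a cross-attention projection, solved in closed form. The sketch below follows that reading; RECE's actual derivation of adversarial embeddings is not reproduced here, and all shapes, names, and the ridge term are illustrative assumptions.

```python
import torch

def closed_form_edit(W, src, dst, keep, lam=0.5, eps=1e-2):
    """Remap a cross-attention projection W (d_out x d_in) in closed form:
    each 'src' embedding to erase is steered to produce what its harmless
    'dst' anchor produced under the original W, while 'keep' embeddings are
    regularized to retain their original outputs. The eps terms ridge the
    system so directions spanned by neither set fall back to the original W."""
    d_in = W.shape[1]
    lhs = W @ dst.T @ src + lam * (W @ keep.T @ keep) + eps * W
    rhs = src.T @ src + lam * (keep.T @ keep) + eps * torch.eye(d_in)
    return lhs @ torch.linalg.inv(rhs)

# Toy shapes: CLIP-like 768-d text embeddings, 320-d attention projection.
W = torch.randn(320, 768)
src = torch.randn(4, 768)    # embeddings of concepts to erase
dst = torch.randn(4, 768)    # harmless anchor embeddings
keep = torch.randn(64, 768)  # embeddings whose outputs must not change
W_new = closed_form_edit(W, src, dst, keep)
```

Because the update is a single linear solve rather than a training loop, edits of this family run in seconds, which is consistent with the "3 seconds" claim above.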
arXiv Detail & Related papers (2024-07-17T08:04:28Z)
- Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion [51.931083971448885]
We propose a framework named Human Feedback Inversion (HFI), where human feedback on model-generated images is condensed into textual tokens guiding the mitigation or removal of problematic images.
Our experimental results demonstrate our framework significantly reduces objectionable content generation while preserving image quality, contributing to the ethical deployment of AI in the public sphere.
arXiv Detail & Related papers (2024-07-17T05:21:41Z)
- Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient [20.698305103879232]
We propose a novel concept domain correction framework named DoCo (Domain Correction). By aligning the output domains of sensitive and anchor concepts through adversarial training, our approach ensures comprehensive unlearning of target concepts. We also introduce a concept-preserving gradient surgery technique that mitigates conflicting gradient components, thereby preserving the model's utility while unlearning specific concepts.
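The summary does not spell out the gradient surgery, but PCGrad-style conflict projection is a natural reading: when the unlearning gradient opposes the preservation gradient, drop the conflicting component. A sketch under that assumption (the function name and interface are illustrative):

```python
import torch

def concept_preserving_surgery(g_unlearn: torch.Tensor,
                               g_preserve: torch.Tensor,
                               eps: float = 1e-12) -> torch.Tensor:
    """If the unlearning gradient conflicts with the preservation gradient
    (negative inner product), project out its component along the
    preservation direction so the update cannot erode retained knowledge."""
    dot = torch.dot(g_unlearn.flatten(), g_preserve.flatten())
    if dot < 0:
        g_unlearn = g_unlearn - (dot / (g_preserve.norm() ** 2 + eps)) * g_preserve
    return g_unlearn
```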
arXiv Detail & Related papers (2024-05-24T07:47:36Z)
- Probing Unlearned Diffusion Models: A Transferable Adversarial Attack Perspective [20.263233740360022]
Unlearning methods have been developed to erase concepts from diffusion models.
This paper leverages the transferability of adversarial attacks to probe unlearning robustness in a black-box setting.
Specifically, we employ an adversarial search strategy to find an adversarial embedding that transfers across different unlearned models.
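One simple way to picture such a search: optimize a single embedding against an ensemble of unlearned models so that all of them reconstruct the erased concept, which encourages transfer to unseen unlearned models. The linear stand-in "models", the tensor shapes, and the ensemble objective below are assumptions for illustration, not the paper's attack.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins: each "unlearned model" maps [latent; embedding] -> noise estimate.
models = [nn.Linear(32, 16) for _ in range(3)]
for m in models:
    m.requires_grad_(False)

emb = torch.zeros(16, requires_grad=True)   # the adversarial text embedding
opt = torch.optim.Adam([emb], lr=1e-2)
x_t = torch.randn(8, 16)                    # noisy latents of the erased concept
eps_true = torch.randn(8, 16)               # the noise that was actually added

for _ in range(200):
    opt.zero_grad()
    # Average denoising loss over the ensemble: an embedding that makes every
    # unlearned model reconstruct the concept is more likely to transfer.
    loss = sum(F.mse_loss(m(torch.cat([x_t, emb.expand(8, -1)], dim=-1)), eps_true)
               for m in models) / len(models)
    loss.backward()
    opt.step()
```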
arXiv Detail & Related papers (2024-04-30T09:14:54Z)
- Separable Multi-Concept Erasure from Diffusion Models [52.51972530398691]
We propose a Separable Multi-concept Eraser (SepME) to eliminate unsafe concepts from large-scale diffusion models.
SepME separates the optimizable model weights, making each weight increment correspond to the erasure of a specific concept.
Extensive experiments indicate the efficacy of our approach in eliminating concepts, preserving model performance, and offering flexibility in the erasure or recovery of various concepts.
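One way to read "each weight increment corresponds to a specific concept erasure" is as additive, independently toggleable weight deltas over a frozen base layer; the toy module below sketches that reading (the class and its structure are illustrative, not the authors' code).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeparableErasure(nn.Module):
    """Frozen base weights plus one additive increment per erased concept.
    Training increment i (with the others frozen) erases concept i; flipping
    active[i] to False restores it without touching the other erasures."""
    def __init__(self, base: nn.Linear, num_concepts: int):
        super().__init__()
        self.base = base.requires_grad_(False)
        self.deltas = nn.ParameterList(
            [nn.Parameter(torch.zeros_like(base.weight)) for _ in range(num_concepts)]
        )
        self.active = [True] * num_concepts

    def forward(self, x):
        W = self.base.weight + sum(
            d for d, on in zip(self.deltas, self.active) if on
        )
        return F.linear(x, W, self.base.bias)

layer = SeparableErasure(nn.Linear(768, 320), num_concepts=3)
layer.active[1] = False   # recover concept 1; concepts 0 and 2 stay erased
out = layer(torch.randn(4, 768))
```

Because the increments are kept separate rather than merged into the base weights, erasure and recovery of individual concepts reduce to toggling flags, which matches the flexibility claim above.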
arXiv Detail & Related papers (2024-02-03T11:10:57Z)
- Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings and Nothing Else [75.6806649860538]
We consider a more ambitious goal: natural multi-concept generation using a pre-trained diffusion model.
We observe concept dominance and non-localized contribution that severely degrade multi-concept generation performance.
We design a minimal low-cost solution that overcomes the above issues by tweaking the text embeddings for more realistic multi-concept text-to-image generation.
arXiv Detail & Related papers (2023-10-11T12:05:44Z)
- Implicit Concept Removal of Diffusion Models [92.55152501707995]
Text-to-image (T2I) diffusion models often inadvertently generate unwanted concepts such as watermarks and unsafe images.
We present Geom-Erasing, a novel concept removal method based on geometric-driven control.
arXiv Detail & Related papers (2023-10-09T17:13:10Z)