Zero-Residual Concept Erasure via Progressive Alignment in Text-to-Image Model
- URL: http://arxiv.org/abs/2508.04472v1
- Date: Wed, 06 Aug 2025 14:19:32 GMT
- Title: Zero-Residual Concept Erasure via Progressive Alignment in Text-to-Image Model
- Authors: Hongxu Chen, Zhen Wang, Taoran Mei, Lin Li, Bowei Zhu, Runshi Li, Long Chen,
- Abstract summary: Concept Erasure aims to prevent pretrained text-to-image models from generating content associated with semantic-harmful concepts.<n>Existing methods often result in incomplete erasure due to "non-zero alignment residual"<n>We propose a novel closed-form method ErasePro: it is designed for more complete concept erasure and better preserving overall generative quality.
- Score: 15.636542463543066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Concept Erasure, which aims to prevent pretrained text-to-image models from generating content associated with semantic-harmful concepts (i.e., target concepts), is getting increased attention. State-of-the-art methods formulate this task as an optimization problem: they align all target concepts with semantic-harmless anchor concepts, and apply closed-form solutions to update the model accordingly. While these closed-form methods are efficient, we argue that existing methods have two overlooked limitations: 1) They often result in incomplete erasure due to "non-zero alignment residual", especially when text prompts are relatively complex. 2) They may suffer from generation quality degradation as they always concentrate parameter updates in a few deep layers. To address these issues, we propose a novel closed-form method ErasePro: it is designed for more complete concept erasure and better preserving overall generative quality. Specifically, ErasePro first introduces a strict zero-residual constraint into the optimization objective, ensuring perfect alignment between target and anchor concept features and enabling more complete erasure. Secondly, it employs a progressive, layer-wise update strategy that gradually transfers target concept features to those of the anchor concept from shallow to deep layers. As the depth increases, the required parameter changes diminish, thereby reducing deviations in sensitive deep layers and preserving generative quality. Empirical results across different concept erasure tasks (including instance, art style, and nudity erasure) have demonstrated the effectiveness of our ErasePro.
Related papers
- Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate [10.996274286143244]
Concept erasing has been investigated with the goals of deleting target concepts in diffusion models while preserving other concepts with minimal distortion.<n>We propose a novel framework, dubbed Concept Pinpoint Eraser (CPE), by adding emphnonlinear Residual Attention Gates (ResAGs) that selectively erase (or cut) target concepts.<n>CPE outperforms prior arts by keeping diverse remaining concepts while deleting the target concepts with robustness against attack prompts.
arXiv Detail & Related papers (2025-06-28T08:17:19Z) - SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing [65.82241040239452]
Concept erasing finetunes weights to unlearn undesirable concepts.<n>Existing methods treat unsafe concept as a fixed word and repeatedly erase it.<n>We introduce semantic-augment erasing which transforms concept word erasure into concept domain erasure.
arXiv Detail & Related papers (2025-06-11T03:21:24Z) - One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework [127.07102988701092]
We introduce the first text-image Collaborative Concept Erasing (Co-Erasing) framework.<n>Co-Erasing describes the concept jointly by text prompts and the corresponding undesirable images induced by the prompts.<n>We design a text-guided image concept refinement strategy that directs the model to focus on visual features most relevant to the specified text concept.
arXiv Detail & Related papers (2025-05-16T11:25:50Z) - Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts [12.04985139116705]
We introduce a finetuning framework, dubbed ANT, which guides deNoising Trajectories to avoid unwanted concepts.<n>ANT is built on a key insight: reversing the condition direction of classifier-free guidance during mid-to-late denoising stages.<n>For single-concept erasure, we propose an augmentation-enhanced weight saliency map, enabling more thorough and efficient erasure.<n>For multi-concept erasure, our objective function offers a versatile plug-and-play solution that significantly boosts performance.
arXiv Detail & Related papers (2025-04-17T09:29:30Z) - Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models [56.35484513848296]
FADE (Fine grained Attenuation for Diffusion Erasure) is an adjacency-aware unlearning algorithm for text-to-image generative models.<n>It removes target concepts with minimal impact on correlated concepts, achieving a 12% improvement in retention performance over state-of-the-art methods.
arXiv Detail & Related papers (2025-03-25T15:49:48Z) - SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models [41.284399182295026]
We introduce SPEED, a model editing-based concept erasure approach that leverages null-space constraints for scalable, precise, and efficient erasure.<n> SPEED consistently outperforms existing methods in prior preservation while achieving efficient and high-fidelity concept erasure.
arXiv Detail & Related papers (2025-03-10T14:40:01Z) - DuMo: Dual Encoder Modulation Network for Precise Concept Erasure [75.05165577219425]
We propose our Dual encoder Modulation network (DuMo) which achieves precise erasure of inappropriate target concepts with minimum impairment to non-target concepts.<n>Our method achieves state-of-the-art performance on Explicit Content Erasure, Cartoon Concept Removal and Artistic Style Erasure, clearly outperforming alternative methods.
arXiv Detail & Related papers (2025-01-02T07:47:34Z) - Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models [76.39651111467832]
We introduce Reliable and Efficient Concept Erasure (RECE), a novel approach that modifies the model in 3 seconds without necessitating additional fine-tuning.
To mitigate inappropriate content potentially represented by derived embeddings, RECE aligns them with harmless concepts in cross-attention layers.
The derivation and erasure of new representation embeddings are conducted iteratively to achieve a thorough erasure of inappropriate concepts.
arXiv Detail & Related papers (2024-07-17T08:04:28Z) - All but One: Surgical Concept Erasing with Model Preservation in
Text-to-Image Diffusion Models [22.60023885544265]
Large-scale datasets may contain sexually explicit, copyrighted, or undesirable content, which allows the model to directly generate them.
Fine-tuning algorithms have been developed to tackle concept erasing in diffusion models.
We present a new approach that solves all of these challenges.
arXiv Detail & Related papers (2023-12-20T07:04:33Z) - Implicit Concept Removal of Diffusion Models [92.55152501707995]
Text-to-image (T2I) diffusion models often inadvertently generate unwanted concepts such as watermarks and unsafe images.
We present the Geom-Erasing, a novel concept removal method based on the geometric-driven control.
arXiv Detail & Related papers (2023-10-09T17:13:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.