Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters
- URL: http://arxiv.org/abs/2412.06143v1
- Date: Mon, 09 Dec 2024 01:56:25 GMT
- Title: Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters
- Authors: Yuan Wang, Ouxiang Li, Tingting Mu, Yanbin Hao, Kuien Liu, Xiang Wang, Xiangnan He
- Abstract summary: We propose a precise, fast, and low-cost concept erasure method, called Adaptive Value Decomposer (AdaVD)
AdaVD supports a range of diffusion models and downstream image generation tasks; the code is available on the project page.
- Score: 38.355389084255386
- Abstract: The success of text-to-image generation enabled by diffusion models has created an urgent need to erase unwanted concepts, e.g., copyrighted, offensive, and unsafe ones, from pre-trained models in a precise, timely, and low-cost manner. Concept erasure places a twofold demand: the target concept must be removed precisely during generation (i.e., erasure efficacy), while the impact on non-target content generation must be minimal (i.e., prior preservation). Existing methods are either computationally costly or struggle to balance erasure efficacy and prior preservation. To improve on this, we propose a precise, fast, and low-cost concept erasure method, called Adaptive Value Decomposer (AdaVD), which is training-free. The method is grounded in a classical linear-algebraic orthogonal complement operation, implemented in the value space of each cross-attention layer within the UNet of diffusion models. An effective shift factor adaptively navigates the erasure strength, enhancing prior preservation without sacrificing erasure efficacy. Extensive experimental results show that AdaVD is effective at both single- and multiple-concept erasure, achieving a 2- to 10-fold improvement in prior preservation over the second-best method while attaining the best or near-best erasure efficacy, compared with both training-based and training-free state-of-the-art methods. AdaVD supports a range of diffusion models and downstream image generation tasks; the code is available on the project page: https://github.com/WYuan1001/AdaVD
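The orthogonal complement operation at the core of AdaVD can be illustrated with a short numerical sketch. The snippet below is a minimal, hypothetical illustration rather than the authors' implementation (see the project page for the official code): it removes the component of cross-attention value vectors that lies along a target concept's value direction, with an assumed scalar `alpha` standing in for the paper's adaptive shift factor; the function name `erase_concept_values` is likewise illustrative.

```python
import torch

def erase_concept_values(values: torch.Tensor, target_value: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Project value vectors onto the orthogonal complement of a target concept direction.

    values:       (num_tokens, d) value vectors from a cross-attention layer
    target_value: (d,) value vector associated with the target concept
    alpha:        assumed erasure-strength scalar (a stand-in for the paper's adaptive shift factor)
    """
    u = target_value / target_value.norm()   # unit direction of the target concept
    coeff = values @ u                       # per-token projection coefficients, shape (num_tokens,)
    component = coeff.unsqueeze(-1) * u      # component of each value along the concept direction
    return values - alpha * component        # subtract it: orthogonal-complement projection

# Toy usage with random tensors standing in for real UNet activations.
torch.manual_seed(0)
values = torch.randn(8, 16)    # 8 tokens in a 16-dimensional value space
target = torch.randn(16)       # value vector of the concept to erase
erased = erase_concept_values(values, target, alpha=1.0)
print((erased @ (target / target.norm())).abs().max())  # ~0: no residual component along the concept
```

With `alpha = 1.0` the erased values are exactly orthogonal to the concept direction; smaller values would trade erasure strength for prior preservation, which is the role the adaptive shift factor plays in the paper.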
Related papers
- Fantastic Targets for Concept Erasure in Diffusion Models and Where To Find Them [21.386640828092524]
Concept erasure has emerged as a promising technique for mitigating the risk of harmful content generation in diffusion models.
We propose the Adaptive Guided Erasure (AGE) method, which dynamically selects optimal target concepts tailored to each undesirable concept.
Results show that AGE significantly outperforms state-of-the-art erasure methods on preserving unrelated concepts while maintaining effective erasure performance.
arXiv Detail & Related papers (2025-01-31T08:17:23Z)
- DuMo: Dual Encoder Modulation Network for Precise Concept Erasure [75.05165577219425]
We propose our Dual encoder Modulation network (DuMo) which achieves precise erasure of inappropriate target concepts with minimum impairment to non-target concepts.
Our method achieves state-of-the-art performance on Explicit Content Erasure, Cartoon Concept Removal and Artistic Style Erasure, clearly outperforming alternative methods.
arXiv Detail & Related papers (2025-01-02T07:47:34Z)
- EraseAnything: Enabling Concept Erasure in Rectified Flow Transformers [33.195628798316754]
EraseAnything is the first method specifically developed to address concept erasure within the latest flow-based T2I framework.
We formulate concept erasure as a bi-level optimization problem, employing LoRA-based parameter tuning and an attention map regularizer.
We propose a self-contrastive learning strategy to ensure that removing unwanted concepts does not inadvertently harm performance on unrelated ones.
arXiv Detail & Related papers (2024-12-29T09:42:53Z)
- Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models [76.39651111467832]
We introduce Reliable and Efficient Concept Erasure (RECE), a novel approach that modifies the model in 3 seconds without necessitating additional fine-tuning.
To mitigate inappropriate content potentially represented by derived embeddings, RECE aligns them with harmless concepts in cross-attention layers.
The derivation and erasure of new representation embeddings are conducted iteratively to achieve a thorough erasure of inappropriate concepts.
arXiv Detail & Related papers (2024-07-17T08:04:28Z)
- Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy while using no exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z)
- Erasing Undesirable Influence in Diffusion Models [51.225365010401006]
Diffusion models are highly effective at generating high-quality images but pose risks, such as the unintentional generation of NSFW (not safe for work) content.
In this work, we introduce EraseDiff, an algorithm designed to preserve the utility of the diffusion model on retained data while removing the unwanted information associated with the data to be forgotten.
arXiv Detail & Related papers (2024-01-11T09:30:36Z)
- All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models [22.60023885544265]
Large-scale datasets may contain sexually explicit, copyrighted, or otherwise undesirable content, which allows the model to generate such content directly.
Fine-tuning algorithms have been developed to tackle concept erasing in diffusion models.
We present a new approach that solves all of these challenges.
arXiv Detail & Related papers (2023-12-20T07:04:33Z)
- Fine-tuning can cripple your foundation model; preserving features may be the solution [87.35911633187204]
A fine-tuned model's ability to recognize concepts on other tasks is significantly reduced compared to its pre-trained counterpart.
We propose a new fine-tuning method called LDIFS that, while learning new concepts related to the downstream task, allows a model to preserve its pre-trained knowledge as well.
arXiv Detail & Related papers (2023-08-25T11:49:51Z)