EraseAnything: Enabling Concept Erasure in Rectified Flow Transformers
- URL: http://arxiv.org/abs/2412.20413v2
- Date: Thu, 02 Jan 2025 13:26:55 GMT
- Title: EraseAnything: Enabling Concept Erasure in Rectified Flow Transformers
- Authors: Daiheng Gao, Shilin Lu, Shaw Walters, Wenbo Zhou, Jiaming Chu, Jie Zhang, Bang Zhang, Mengxi Jia, Jian Zhao, Zhaoxin Fan, Weiming Zhang
- Abstract summary: EraseAnything is the first method specifically developed to address concept erasure within the latest flow-based T2I framework. We formulate concept erasure as a bi-level optimization problem, employing LoRA-based parameter tuning and an attention map regularizer. We propose a self-contrastive learning strategy to ensure that removing unwanted concepts does not inadvertently harm performance on unrelated ones.
- Score: 33.195628798316754
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Removing unwanted concepts from large-scale text-to-image (T2I) diffusion models while maintaining their overall generative quality remains an open challenge. This difficulty is especially pronounced in emerging paradigms, such as Stable Diffusion (SD) v3 and Flux, which incorporate flow matching and transformer-based architectures. These advancements limit the transferability of existing concept-erasure techniques that were originally designed for the previous T2I paradigm (e.g., SD v1.4). In this work, we introduce EraseAnything, the first method specifically developed to address concept erasure within the latest flow-based T2I framework. We formulate concept erasure as a bi-level optimization problem, employing LoRA-based parameter tuning and an attention map regularizer to selectively suppress undesirable activations. Furthermore, we propose a self-contrastive learning strategy to ensure that removing unwanted concepts does not inadvertently harm performance on unrelated ones. Experimental results demonstrate that EraseAnything successfully fills the research gap left by earlier methods in this new T2I paradigm, achieving state-of-the-art performance across a wide range of concept erasure tasks.
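To make the bi-level recipe above concrete, here is a minimal, hedged PyTorch sketch of the two ingredients the abstract names: an attention-map regularizer that suppresses activations on target-concept tokens, and a contrastive preservation term that keeps unrelated prompts behaving as before. All shapes and names are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only (not the EraseAnything code). Assumes
# cross-attention maps of shape (batch, heads, image_tokens, text_tokens)
# and a boolean mask marking text tokens of the concept being erased.
import torch
import torch.nn.functional as F

def attention_map_regularizer(attn, target_mask):
    """Penalize attention mass flowing into target-concept text tokens."""
    mask = target_mask[:, None, None, :].float()  # broadcast over heads/pixels
    return (attn * mask).sum(dim=-1).mean()

def preservation_contrastive_loss(feat_edited, feat_frozen, temperature=0.1):
    """InfoNCE-style term pulling the edited model's features for unrelated
    prompts back toward the frozen original model's features."""
    feat_edited = F.normalize(feat_edited, dim=-1)
    feat_frozen = F.normalize(feat_frozen, dim=-1)
    logits = feat_edited @ feat_frozen.t() / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)
```

In a LoRA-based bi-level setup, one would plausibly minimize the erasure loss over the LoRA parameters in the inner loop while terms like these act as outer-level constraints; the exact formulation is the paper's, not this sketch's.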
Related papers
- Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models [56.35484513848296]
FADE (Fine-grained Attenuation for Diffusion Erasure) is an adjacency-aware unlearning algorithm for text-to-image generative models.
It removes target concepts with minimal impact on correlated concepts, achieving a 12% improvement in retention performance over state-of-the-art methods.
arXiv Detail & Related papers (2025-03-25T15:49:48Z)
- Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models [24.15603438969762]
Interpret then Deactivate (ItD) is a novel framework to enable precise concept removal in T2I diffusion models.
ItD uses a sparse autoencoder to interpret each concept as a combination of multiple features.
It can be easily extended to erase multiple concepts without requiring further training.
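As a rough illustration of the sparse-autoencoder idea (my sketch, not the ItD code), erasure reduces to zeroing the latent features the SAE associates with the target concept before decoding; the index set below is hypothetical.

```python
# Illustrative SAE-based concept deactivation (not the ItD implementation).
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model, d_hidden):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x, ablate_idx=None):
        z = torch.relu(self.encoder(x))   # sparse latent code
        if ablate_idx is not None:
            z[..., ablate_idx] = 0.0      # deactivate concept features
        return self.decoder(z)

sae = SparseAutoencoder(d_model=768, d_hidden=8192)
activations = torch.randn(4, 768)         # dummy text-encoder activations
concept_features = [12, 407, 3051]        # hypothetical concept latents
cleaned = sae(activations, ablate_idx=concept_features)
```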
arXiv Detail & Related papers (2025-03-12T14:46:40Z)
- SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models [41.284399182295026]
We introduce SPEED, a model editing-based concept erasure approach that leverages null-space constraints for scalable, precise, and efficient erasure.
SPEED consistently outperforms existing methods in prior preservation while achieving efficient and high-fidelity concept erasure.
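The null-space idea can be sketched generically (this is my NumPy illustration of the general technique, not the SPEED implementation): constrain a weight update so it has no effect on embeddings of concepts that must be preserved.

```python
# Null-space-constrained weight edit: the update is projected so that
# preserved embeddings pass through the layer unchanged.
import numpy as np

def nullspace_projector(K_preserve):
    """K_preserve: (n_preserved, d) embeddings whose outputs must not change.
    Returns P (d, d) with K_preserve @ P == 0 (up to numerical error)."""
    pinv = np.linalg.pinv(K_preserve)            # (d, n_preserved)
    return np.eye(K_preserve.shape[1]) - pinv @ K_preserve

d = 64
W = np.random.randn(d, d)                        # a projection layer to edit
K_preserve = np.random.randn(10, d)              # embeddings to leave intact
delta = np.random.randn(d, d)                    # raw erasure update
delta = delta @ nullspace_projector(K_preserve)  # constrain the update
W_edited = W + delta

# Preserved embeddings map (almost) identically through the edited layer.
assert np.allclose(K_preserve @ W.T, K_preserve @ W_edited.T, atol=1e-6)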
arXiv Detail & Related papers (2025-03-10T14:40:01Z)
- DuMo: Dual Encoder Modulation Network for Precise Concept Erasure [75.05165577219425]
We propose our Dual encoder Modulation network (DuMo), which achieves precise erasure of inappropriate target concepts with minimal impairment to non-target concepts.
Our method achieves state-of-the-art performance on Explicit Content Erasure, Cartoon Concept Removal and Artistic Style Erasure, clearly outperforming alternative methods.
arXiv Detail & Related papers (2025-01-02T07:47:34Z)
- Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models [93.76814568163353]
We propose a novel bilevel optimization framework for pruned diffusion models.
This framework consolidates the fine-tuning and unlearning processes into a unified phase.
It is compatible with various pruning and concept unlearning methods.
arXiv Detail & Related papers (2024-12-19T19:13:18Z)
- Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters [38.355389084255386]
We propose a precise, fast, and low-cost concept erasure method called Adaptive Value Decomposer (AdaVD).
AdaVD excels in both single and multiple concept erasure, showing a 2- to 10-fold improvement in prior preservation.
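The orthogonal-complement idea in value space can be sketched in a few lines (my illustration of the general operation, not the AdaVD code): remove the component of each cross-attention value vector along the target concept's value direction.

```python
# Erasure via orthogonal-complement projection in value space (sketch).
import torch

def project_out(values, concept_value):
    """Remove each value vector's component along the concept direction,
    leaving the orthogonal complement untouched."""
    u = concept_value / concept_value.norm()
    return values - (values @ u)[..., None] * u

values = torch.randn(77, 320)        # per-token cross-attention values
concept_value = torch.randn(320)     # value vector of the target concept
erased = project_out(values, concept_value)
# The erased values carry no component along the concept direction.
u = concept_value / concept_value.norm()
print(torch.allclose(erased @ u, torch.zeros(77), atol=1e-5))
```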
arXiv Detail & Related papers (2024-12-09T01:56:25Z)
- STEREO: Towards Adversarially Robust Concept Erasing from Text-to-Image Generation Models [18.64776777593743]
We propose an approach called STEREO that involves two distinct stages.
The first stage searches thoroughly for strong and diverse adversarial prompts that can regenerate an erased concept from a concept-erased model (CEM).
In the second, robustly-erase-once stage, we introduce an anchor-concept-based compositional objective to robustly erase the target concept in one go.
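A toy sketch of the first stage as summarized above: gradient-based search for an adversarial soft prompt that makes an erased model regenerate the concept. Here `concept_score` is a hypothetical stand-in for any differentiable signal (e.g., a classifier on generated images); this is not the authors' code.

```python
import torch

def concept_score(prompt_embedding):
    # Hypothetical differentiable proxy for "how strongly the erased
    # concept reappears"; replace with a real pipeline in practice.
    target = torch.ones_like(prompt_embedding)
    return -((prompt_embedding - target) ** 2).mean()

adv = torch.zeros(77, 768, requires_grad=True)   # learnable soft prompt
opt = torch.optim.Adam([adv], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = -concept_score(adv)   # minimize the negative = ascend the score
    loss.backward()
    opt.step()
```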
arXiv Detail & Related papers (2024-08-29T17:29:26Z)
- Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models [76.39651111467832]
We introduce Reliable and Efficient Concept Erasure (RECE), a novel approach that modifies the model in 3 seconds without necessitating additional fine-tuning.
To mitigate inappropriate content potentially represented by derived embeddings, RECE aligns them with harmless concepts in cross-attention layers.
The derivation and erasure of new representation embeddings are conducted iteratively to achieve a thorough erasure of inappropriate concepts.
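Fine-tuning-free editing of this kind typically reduces to a closed-form least-squares update of the cross-attention projections. The sketch below shows the well-known UCE/RECE-style form (my generic rendering, not RECE's exact code): remap target text embeddings to harmless anchor outputs while regularizing toward the original weights.

```python
# Closed-form cross-attention edit: W' = argmin sum_i ||W' c_i - v_i||^2
#                                        + lam * ||W' - W||_F^2
import numpy as np

def closed_form_edit(W, C_target, V_anchor, lam=0.1):
    """W: (d_out, d_in) cross-attention K or V projection.
    C_target: (n, d_in) embeddings to erase; V_anchor: (n, d_out) desired
    harmless outputs for them. Returns the edited weight matrix."""
    d_in = W.shape[1]
    A = V_anchor.T @ C_target + lam * W             # (d_out, d_in)
    B = C_target.T @ C_target + lam * np.eye(d_in)  # (d_in, d_in)
    return A @ np.linalg.inv(B)

d_in, d_out, n = 768, 320, 5
W = np.random.randn(d_out, d_in)
C_target = np.random.randn(n, d_in)             # concept embeddings to erase
V_anchor = (W @ np.random.randn(d_in, n)).T     # outputs of harmless anchors
W_new = closed_form_edit(W, C_target, V_anchor)
```

Because the update is a single linear solve, this style of edit runs in seconds, which is consistent with the timing claimed above.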
arXiv Detail & Related papers (2024-07-17T08:04:28Z)
- Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient [20.698305103879232]
We propose a novel concept domain correction framework named DoCo (Domain Correction).
By aligning the output domains of sensitive and anchor concepts through adversarial training, our approach ensures comprehensive unlearning of target concepts.
We also introduce a concept-preserving gradient surgery technique that mitigates conflicting gradient components, thereby preserving the model's utility while unlearning specific concepts.
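The gradient-surgery idea can be illustrated with a generic PCGrad-style projection (not DoCo's exact implementation): when the unlearning gradient conflicts with the preservation gradient, remove its component along the conflicting direction.

```python
import torch

def surgery(g_unlearn, g_preserve):
    """Project out the conflicting component of the unlearning gradient."""
    dot = torch.dot(g_unlearn, g_preserve)
    if dot < 0:  # gradients conflict
        g_unlearn = g_unlearn - dot / g_preserve.norm() ** 2 * g_preserve
    return g_unlearn

g_u = torch.randn(1000)            # flattened unlearning-loss gradient
g_p = torch.randn(1000)            # flattened preservation-loss gradient
g_final = surgery(g_u, g_p) + g_p  # combined, conflict-free update
```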
arXiv Detail & Related papers (2024-05-24T07:47:36Z)
- Editing Massive Concepts in Text-to-Image Diffusion Models [58.620118104364174]
We propose a two-stage method, Editing Massive Concepts In Diffusion Models (EMCID).
The first stage performs memory optimization for each individual concept with dual self-distillation from text alignment loss and diffusion noise prediction loss.
The second stage conducts massive concept editing with multi-layer, closed-form model editing.
arXiv Detail & Related papers (2024-03-20T17:59:57Z)
- All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models [22.60023885544265]
Large-scale datasets may contain sexually explicit, copyrighted, or undesirable content, which allows the model to generate such content directly.
Fine-tuning algorithms have been developed to tackle concept erasing in diffusion models.
We present a new approach that solves all of these challenges.
arXiv Detail & Related papers (2023-12-20T07:04:33Z)
- Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models? [52.238883592674696]
Ring-A-Bell is a model-agnostic red-teaming tool for T2I diffusion models.
It identifies problematic prompts for diffusion models with the corresponding generation of inappropriate content.
Our results show that Ring-A-Bell, by manipulating safe prompting benchmarks, can transform prompts originally regarded as safe into ones that evade existing safety mechanisms.
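One common recipe behind this kind of red-teaming (my paraphrase of the general idea, not the tool itself) is to estimate an empirical concept direction from paired prompts and inject it into a benign prompt embedding to probe the safety mechanism.

```python
import torch

def concept_vector(emb_with, emb_without):
    """Mean embedding difference over prompt pairs that differ only in
    whether they mention the concept."""
    return (emb_with - emb_without).mean(dim=0)

emb_with = torch.randn(32, 768)     # embeddings of concept-bearing prompts
emb_without = torch.randn(32, 768)  # embeddings of matched neutral prompts
c = concept_vector(emb_with, emb_without)
safe_prompt = torch.randn(768)
probe = safe_prompt + 2.0 * c       # strength-scaled concept injection
```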
arXiv Detail & Related papers (2023-10-16T02:11:20Z)
- DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior [70.46245698746874]
We present DiffBIR, a general restoration pipeline that can handle different blind image restoration tasks.
DiffBIR decouples the blind image restoration problem into two stages: 1) degradation removal: removing image-independent content; 2) information regeneration: generating the lost image content.
In the first stage, we use restoration modules to remove degradations and obtain high-fidelity restored results.
For the second stage, we propose IRControlNet that leverages the generative ability of latent diffusion models to generate realistic details.
arXiv Detail & Related papers (2023-08-29T07:11:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.