Related papers: LURE: Latent Space Unblocking for Multi-Concept Reawakening in Diffusion Models

LURE: Latent Space Unblocking for Multi-Concept Reawakening in Diffusion Models

URL: http://arxiv.org/abs/2601.14330v1
Date: Tue, 20 Jan 2026 10:39:11 GMT
Title: LURE: Latent Space Unblocking for Multi-Concept Reawakening in Diffusion Models
Authors: Mengyu Sun, Ziyuan Yang, Andrew Beng Jin Teoh, Junxu Liu, Haibo Hu, Yi Zhang,
Abstract summary: Concept erasure aims to suppress sensitive content in diffusion models.<n>Recent studies show that erased concepts can still be reawakened, revealing vulnerabilities in erasure methods.<n>We model the generation process as an implicit function to enable a comprehensive theoretical analysis of multiple factors.
Score: 24.332916173317113
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Concept erasure aims to suppress sensitive content in diffusion models, but recent studies show that erased concepts can still be reawakened, revealing vulnerabilities in erasure methods. Existing reawakening methods mainly rely on prompt-level optimization to manipulate sampling trajectories, neglecting other generative factors, which limits a comprehensive understanding of the underlying dynamics. In this paper, we model the generation process as an implicit function to enable a comprehensive theoretical analysis of multiple factors, including text conditions, model parameters, and latent states. We theoretically show that perturbing each factor can reawaken erased concepts. Building on this insight, we propose a novel concept reawakening method: Latent space Unblocking for concept REawakening (LURE), which reawakens erased concepts by reconstructing the latent space and guiding the sampling trajectory. Specifically, our semantic re-binding mechanism reconstructs the latent space by aligning denoising predictions with target distributions to reestablish severed text-visual associations. However, in multi-concept scenarios, naive reconstruction can cause gradient conflicts and feature entanglement. To address this, we introduce Gradient Field Orthogonalization, which enforces feature orthogonality to prevent mutual interference. Additionally, our Latent Semantic Identification-Guided Sampling (LSIS) ensures stability of the reawakening process via posterior density verification. Extensive experiments demonstrate that LURE enables simultaneous, high-fidelity reawakening of multiple erased concepts across diverse erasure tasks and methods.

Related papers

Differential Vector Erasure: Unified Training-Free Concept Erasure for Flow Matching Models [49.10620605347065]
We propose Differential Vector Erasure (DVE), a training-free concept erasure method specifically designed for flow matching models.<n>Our key insight is that semantic concepts are implicitly encoded in the directional structure of the velocity field governing the generative flow.<n>During inference, DVE selectively removes concept-specific components by projecting the velocity field onto the differential direction, enabling precise concept suppression without affecting irrelevant semantics.
arXiv Detail & Related papers (2026-02-01T08:05:45Z)
Boosting Fidelity for Pre-Trained-Diffusion-Based Low-Light Image Enhancement via Condition Refinement [63.54516423266521]
Pre-Trained Diffusion-Based (PTDB) methods often sacrifice content fidelity to attain higher perceptual realism.<n>We propose a novel optimization strategy for conditioning in pre-trained diffusion models, enhancing fidelity while preserving realism and aesthetics.<n>Our approach is plug-and-play, seamlessly integrating into existing diffusion networks to provide more effective control.
arXiv Detail & Related papers (2025-10-20T02:40:06Z)
FACE: Faithful Automatic Concept Extraction [4.417419748257645]
FACE (Faithful Automatic Concept Extraction) is a novel framework that augments Non-negative Matrix Factorization (NMF) with a Kullback-Leibler (KL) divergence regularization term to ensure alignment between the model's original and concept-based predictions.<n>We provide theoretical guarantees showing that minimizing the KL divergence bounds the deviation in predictive distributions, thereby promoting faithful local linearity in the learned concept space.
arXiv Detail & Related papers (2025-10-13T17:44:45Z)
Revoking Amnesia: RL-based Trajectory Optimization to Resurrect Erased Concepts in Diffusion Models [38.38751366738881]
Concept erasure techniques have been widely deployed in T2I diffusion models to prevent inappropriate content generation for safety and copyright considerations.<n> established erasure methods exhibit degraded effectiveness, raising questions about their true mechanisms.<n>We propose textbfRevAm, a trajectory optimization framework that resurrects erased concepts by dynamically steering the denoising process.
arXiv Detail & Related papers (2025-09-30T07:46:19Z)
A Survey on Latent Reasoning [100.54120559169735]
Large Language Models (LLMs) have demonstrated impressive reasoning capabilities.<n>CoT reasoning that verbalizes intermediate steps limits the model's expressive bandwidth.<n>Latent reasoning tackles this bottleneck by performing multi-step inference entirely in the model's continuous hidden state.
arXiv Detail & Related papers (2025-07-08T17:29:07Z)
TRACE: Trajectory-Constrained Concept Erasure in Diffusion Models [0.0]
Concept erasure aims to remove or suppress specific concept information in a generative model.<n>Trajectory-Constrained Attentional Concept Erasure (TRACE) is a novel method to erase targeted concepts from diffusion models.<n>TRACE achieves state-of-the-art performance, outperforming recent methods such as ANT, EraseAnything, and MACE in terms of removal efficacy and output quality.
arXiv Detail & Related papers (2025-05-29T10:15:22Z)
When Are Concepts Erased From Diffusion Models? [37.59943248660331]
In concept erasure, a model is modified to selectively prevent it from generating a target concept.<n>We propose two conceptual models for the erasure mechanism in diffusion models.<n>To assess whether a concept has been truly erased from the model, we introduce a comprehensive suite of independent probing techniques.
arXiv Detail & Related papers (2025-05-22T17:59:09Z)
Erased or Dormant? Rethinking Concept Erasure Through Reversibility [6.895055915600732]
We evaluate two representative concept erasure methods, Unified Concept Editing and Erased Stable Diffusion.<n>We show that erased concepts often reemerge with substantial visual fidelity after minimal adaptation.<n>Our findings reveal critical limitations in existing concept erasure approaches.
arXiv Detail & Related papers (2025-05-22T03:26:46Z)
TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models [53.937498564603054]
Recent advances in text-to-image diffusion models enable photorealistic image generation, but they also risk producing malicious content, such as NSFW images.<n>To mitigate risk, concept erasure methods are studied to facilitate the model to unlearn specific concepts.<n>We propose TRCE, using a two-stage concept erasure strategy to achieve an effective trade-off between reliable erasure and knowledge preservation.
arXiv Detail & Related papers (2025-03-10T14:37:53Z)
Rethinking the Vulnerability of Concept Erasure and a New Method [9.044763606650646]
Concept erasure (defense) methods have been developed to "unlearn" specific concepts through post-hoc finetuning.<n>Recent concept restoration (attack) methods have demonstrated that these supposedly erased concepts can be recovered using adversarially crafted prompts.<n>We introduce **RECORD**, a novel coordinate-descent-based restoration algorithm that consistently outperforms existing restoration methods by up to 17.8 times.
arXiv Detail & Related papers (2025-02-24T17:26:01Z)
Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models [76.39651111467832]
We introduce Reliable and Efficient Concept Erasure (RECE), a novel approach that modifies the model in 3 seconds without necessitating additional fine-tuning. To mitigate inappropriate content potentially represented by derived embeddings, RECE aligns them with harmless concepts in cross-attention layers. The derivation and erasure of new representation embeddings are conducted iteratively to achieve a thorough erasure of inappropriate concepts.
arXiv Detail & Related papers (2024-07-17T08:04:28Z)
Implicit Concept Removal of Diffusion Models [92.55152501707995]
Text-to-image (T2I) diffusion models often inadvertently generate unwanted concepts such as watermarks and unsafe images. We present the Geom-Erasing, a novel concept removal method based on the geometric-driven control.
arXiv Detail & Related papers (2023-10-09T17:13:10Z)
Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model. A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations. We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z)

This list is automatically generated from the titles and abstracts of the papers in this site.