Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
- URL: http://arxiv.org/abs/2303.17591v1
- Date: Thu, 30 Mar 2023 17:58:11 GMT
- Title: Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
- Authors: Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi
- Abstract summary: Forget-Me-Not is designed to safely remove specified IDs, objects, or styles from a well-configured text-to-image model in as little as 30 seconds.
We demonstrate that Forget-Me-Not can effectively eliminate targeted concepts while maintaining the model's performance on other concepts.
It can also be adapted as a lightweight model patch for Stable Diffusion, allowing for concept manipulation and convenient distribution.
- Score: 79.50701155336198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The unlearning problem of deep learning models, once primarily an academic concern, has become a prevalent issue in the industry. The significant advances in text-to-image generation techniques have prompted global discussions on privacy, copyright, and safety, as numerous unauthorized personal IDs, content, artistic creations, and potentially harmful materials have been learned by these models and later utilized to generate and distribute uncontrolled content. To address this challenge, we propose Forget-Me-Not, an efficient and low-cost solution designed to safely remove specified IDs, objects, or styles from a well-configured text-to-image model in as little as 30 seconds, without impairing its ability to generate other content. Alongside our method, we introduce the Memorization Score (M-Score) and ConceptBench to measure the models' capacity to generate general concepts, grouped into three primary categories: ID, object, and style. Using M-Score and ConceptBench, we demonstrate that Forget-Me-Not can effectively eliminate targeted concepts while maintaining the model's performance on other concepts. Furthermore, Forget-Me-Not offers two practical extensions: a) removal of potentially harmful or NSFW content, and b) enhancement of model accuracy, inclusion, and diversity through concept correction and disentanglement. It can also be adapted as a lightweight model patch for Stable Diffusion, allowing for concept manipulation and convenient distribution. To encourage future research in this critical area and promote the development of safe and inclusive generative models, we will open-source our code and ConceptBench at https://github.com/SHI-Labs/Forget-Me-Not.
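The abstract mentions that Forget-Me-Not can be shipped as a lightweight model patch for Stable Diffusion. As a rough illustration only, the sketch below applies a hypothetical patch distributed as a state dict of weight deltas over the UNet's cross-attention projections; the file name `forget_me_not_patch.pt`, the delta format, and the example prompt are assumptions made for illustration, not the authors' released interface.

```python
# Hypothetical sketch of applying a Forget-Me-Not-style lightweight patch to
# Stable Diffusion. The patch format (a dict of parameter name -> weight delta)
# is an assumption made for illustration, not the paper's released format.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Assumed patch file: {parameter_name: delta_tensor}, covering only the
# cross-attention projection weights that were fine-tuned to forget the concept.
patch = torch.load("forget_me_not_patch.pt")  # hypothetical file name

unet_state = pipe.unet.state_dict()
for name, delta in patch.items():
    if name in unet_state:
        # Adding a small delta to a handful of weights leaves the rest of the
        # model untouched, which is what keeps unrelated concepts intact.
        unet_state[name] = unet_state[name] + delta.to(unet_state[name].dtype)
pipe.unet.load_state_dict(unet_state)

# After patching, prompts for the forgotten concept should no longer reproduce it,
# while other prompts behave as before.
image = pipe("a portrait photo of the forgotten identity").images[0]
image.save("patched_sample.png")
```

Because only a small set of weights changes, the deltas themselves stay small enough to distribute separately from the base checkpoint, which matches the "convenient distribution" point in the abstract.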
Related papers
- Safety Without Semantic Disruptions: Editing-free Safe Image Generation via Context-preserving Dual Latent Reconstruction [49.60774626839712]
Training multimodal generative models can expose users to harmful, unsafe, controversial, or culturally inappropriate outputs.
We propose a modular, dynamic solution that leverages safety-context embeddings and a dual reconstruction process to generate safer images.
We achieve state-of-the-art results on safe image generation benchmarks, while offering controllable variation of model safety.
arXiv Detail & Related papers (2024-11-21T09:47:13Z)
- Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models [76.39651111467832]
We introduce Reliable and Efficient Concept Erasure (RECE), a novel approach that modifies the model in 3 seconds without necessitating additional fine-tuning.
To mitigate inappropriate content potentially represented by derived embeddings, RECE aligns them with harmless concepts in cross-attention layers.
The derivation and erasure of new representation embeddings are conducted iteratively to achieve a thorough erasure of inappropriate concepts.
arXiv Detail & Related papers (2024-07-17T08:04:28Z)
- Latent Guard: a Safety Framework for Text-to-image Generation [64.49596711025993]
Existing safety measures are based either on text blacklists, which can be easily circumvented, or on harmful-content classification.
We propose Latent Guard, a framework designed to improve safety measures in text-to-image generation.
Inspired by blacklist-based approaches, Latent Guard learns a latent space on top of the T2I model's text encoder in which the presence of harmful concepts can be checked; a conceptual sketch of such a check appears after this list.
arXiv Detail & Related papers (2024-04-11T17:59:52Z)
- Create Your World: Lifelong Text-to-Image Diffusion [75.14353789007902]
We propose the Lifelong text-to-image Diffusion Model (L2DM) to overcome "catastrophic forgetting" of previously encountered concepts.
To counter this forgetting, our L2DM framework devises a task-aware memory enhancement module and an elastic-concept distillation module.
Our model can generate more faithful images across a range of continual text prompts in terms of both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-09-08T16:45:56Z)
- Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models [63.20512617502273]
We propose a method called SDD to prevent problematic content generation in text-to-image diffusion models.
Our method eliminates a much greater proportion of harmful content from the generated images without degrading the overall image quality.
arXiv Detail & Related papers (2023-07-12T07:48:29Z)
- Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models [12.188240438657512]
We derive a technique inspired by continual learning to selectively forget concepts in pretrained deep generative models.
Our method, dubbed Selective Amnesia, enables controllable forgetting where a user can specify how a concept should be forgotten.
arXiv Detail & Related papers (2023-05-17T10:53:58Z)
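Returning to the Latent Guard entry above, the following is a purely conceptual sketch of what a latent-space prompt check could look like: a frozen CLIP text encoder feeding a hypothetical learned projection head, with prompts flagged when they land too close to blocked-concept embeddings. The projection head, the blocked-concept list, and the 0.7 threshold are illustrative assumptions, not the paper's actual components.

```python
# Conceptual sketch (not the Latent Guard implementation): flag a prompt before
# generation if its embedding lies close to any blocked concept in a learned latent space.
import torch
import torch.nn.functional as F
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-base-patch32")
text_encoder.eval()

# Hypothetical projection head mapping encoder features into the safety latent space;
# in practice such a head would be trained, here it stands in as a placeholder.
proj = torch.nn.Linear(512, 128)

def embed(texts):
    tokens = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        feats = text_encoder(**tokens).pooler_output  # [batch, 512] pooled text features
        return F.normalize(proj(feats), dim=-1)       # unit vectors in the latent space

blocked = embed(["nudity", "graphic violence"])        # concepts to screen for
prompt_emb = embed(["a violent battlefield scene"])    # incoming user prompt

# Cosine similarity between the prompt and every blocked concept.
similarity = prompt_emb @ blocked.T
if similarity.max() > 0.7:  # threshold chosen purely for illustration
    print("Prompt rejected before any image is generated.")
else:
    print("Prompt passed the latent-space check.")
```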
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.