Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
- URL: http://arxiv.org/abs/2303.17591v1
- Date: Thu, 30 Mar 2023 17:58:11 GMT
- Title: Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
- Authors: Eric Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi
- Abstract summary: Forget-Me-Not is designed to safely remove specified IDs, objects, or styles from a well-configured text-to-image model in as little as 30 seconds.
We demonstrate that Forget-Me-Not can effectively eliminate targeted concepts while maintaining the model's performance on other concepts.
It can also be adapted as a lightweight model patch for Stable Diffusion, allowing for concept manipulation and convenient distribution.
- Score: 79.50701155336198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The unlearning problem of deep learning models, once primarily an academic
concern, has become a prevalent issue in the industry. The significant advances
in text-to-image generation techniques have prompted global discussions on
privacy, copyright, and safety, as numerous unauthorized personal IDs, content,
artistic creations, and potentially harmful materials have been learned by
these models and later utilized to generate and distribute uncontrolled
content. To address this challenge, we propose Forget-Me-Not, an
efficient and low-cost solution designed to safely remove specified IDs,
objects, or styles from a well-configured text-to-image model in as little as
30 seconds, without impairing its ability to generate other content. Alongside
our method, we introduce the Memorization Score (M-Score) and
ConceptBench to measure the models' capacity to generate general
concepts, grouped into three primary categories: ID, object, and style. Using
M-Score and ConceptBench, we demonstrate that Forget-Me-Not can effectively
eliminate targeted concepts while maintaining the model's performance on other
concepts. Furthermore, Forget-Me-Not offers two practical extensions: a)
removal of potentially harmful or NSFW content, and b) enhancement of model
accuracy, inclusion and diversity through concept correction and
disentanglement. It can also be adapted as a lightweight model patch for
Stable Diffusion, allowing for concept manipulation and convenient
distribution. To encourage future research in this critical area and promote
the development of safe and inclusive generative models, we will open-source
our code and ConceptBench at
https://github.com/SHI-Labs/Forget-Me-Not.
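The abstract does not spell out the forgetting objective, but the idea of erasing a concept from a text-to-image model while leaving other generations intact can be illustrated with a toy cross-attention suppression loss. A minimal sketch in PyTorch, assuming the objective penalises the attention mass placed on the target concept's prompt tokens; the function name, tensor shapes, and loss form are illustrative assumptions, not the paper's exact procedure:

```python
# Illustrative sketch only: a toy loss that penalises the cross-attention
# mass an image representation assigns to the tokens of a concept we want
# the model to forget. Shapes and the loss form are assumptions; the
# paper's actual objective and training loop are not given in the abstract.
import torch

def concept_suppression_loss(queries: torch.Tensor,
                             keys: torch.Tensor,
                             token_ids: torch.Tensor,
                             concept_token_ids: torch.Tensor) -> torch.Tensor:
    """queries: (B, N_img, D) image-side queries from a cross-attention layer
    keys: (B, N_txt, D) text-side keys
    token_ids: (B, N_txt) prompt token ids
    concept_token_ids: (K,) token ids of the concept to forget"""
    scale = queries.shape[-1] ** -0.5
    attn = torch.softmax(queries @ keys.transpose(-1, -2) * scale, dim=-1)  # (B, N_img, N_txt)
    concept_mask = torch.isin(token_ids, concept_token_ids)                 # (B, N_txt)
    # Attention each image token spends on the forgotten concept's tokens.
    concept_attn = (attn * concept_mask.unsqueeze(1)).sum(dim=-1)           # (B, N_img)
    return concept_attn.pow(2).mean()

# Minimal usage with random tensors (no diffusion model required).
B, N_img, N_txt, D = 2, 64, 8, 32
q = torch.randn(B, N_img, D, requires_grad=True)
k = torch.randn(B, N_txt, D)
ids = torch.randint(0, 1000, (B, N_txt))
loss = concept_suppression_loss(q, k, ids, concept_token_ids=ids[0, :2])
loss.backward()  # in practice, gradients would flow into attention projections
```

In practice only a small subset of weights (for example, the cross-attention projections) would be updated for a handful of steps, which would be consistent with the 30-second budget claimed above, though the exact training recipe is not given in the abstract.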
Related papers
- Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models [76.39651111467832]
We introduce Reliable and Efficient Concept Erasure (RECE), a novel approach that modifies the model in 3 seconds without necessitating additional fine-tuning.
To mitigate inappropriate content potentially represented by derived embeddings, RECE aligns them with harmless concepts in cross-attention layers.
The derivation and erasure of new representation embeddings are conducted iteratively to achieve a thorough erasure of inappropriate concepts.
arXiv Detail & Related papers (2024-07-17T08:04:28Z)
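The RECE summary above describes a fine-tuning-free edit that aligns concept embeddings with harmless ones in cross-attention layers. A minimal sketch of one way such a closed-form edit could look, assuming a ridge-regularised least-squares update of a single key/value projection; the formulation and variable names are assumptions, and RECE's iterative derivation of new embeddings is not shown:

```python
# Illustrative sketch only: a closed-form edit of a cross-attention projection
# so that a target concept's text embedding maps to the output a harmless
# anchor embedding would produce. The ridge-regression form and the names
# are assumptions for illustration; they are not RECE's exact equations.
import torch

def edit_projection(W: torch.Tensor,         # (D_out, D_in) key/value projection
                    e_target: torch.Tensor,  # (D_in,) embedding of concept to erase
                    e_anchor: torch.Tensor,  # (D_in,) embedding of harmless concept
                    lam: float = 1e-2) -> torch.Tensor:
    """Return W' minimising ||W' e_target - W e_anchor||^2 + lam * ||W' - W||_F^2."""
    v_star = W @ e_anchor                            # desired output for the erased concept
    A = torch.outer(e_target, e_target) + lam * torch.eye(W.shape[1])  # regularised normal matrix
    b = torch.outer(v_star, e_target) + lam * W
    return b @ torch.linalg.inv(A)

# Toy usage: after the edit, the erased concept is routed close to the anchor's output.
D_out, D_in = 16, 8
W = torch.randn(D_out, D_in)
e_t, e_a = torch.randn(D_in), torch.randn(D_in)
W_new = edit_projection(W, e_t, e_a)
print(torch.allclose(W_new @ e_t, W @ e_a, atol=1e-1))  # close for small lam
```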
- Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion [51.931083971448885]
We propose a framework named Human Feedback Inversion (HFI), where human feedback on model-generated images is condensed into textual tokens guiding the mitigation or removal of problematic images.
Our experimental results demonstrate that our framework significantly reduces objectionable content generation while preserving image quality, contributing to the ethical deployment of AI in the public sphere.
arXiv Detail & Related papers (2024-07-17T05:21:41Z)
- Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models [42.734578139757886]
Diffusion models (DMs) have achieved remarkable success in text-to-image generation, but they also pose safety risks.
The techniques of machine unlearning, also known as concept erasing, have been developed to address these risks.
This work aims to enhance the robustness of concept erasing by integrating the principle of adversarial training (AT) into machine unlearning.
arXiv Detail & Related papers (2024-05-24T05:47:23Z)
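The entry above describes folding adversarial training into unlearning, i.e. erasing a concept against the worst-case prompt embedding that tries to resurrect it. A minimal toy sketch of such a two-level loop, using a generic scorer in place of a diffusion model; all names and the optimisation schedule are illustrative assumptions, not the paper's algorithm:

```python
# Illustrative sketch only: inner loop finds an adversarial embedding that
# re-triggers the erased concept; outer step updates the model so even that
# embedding scores low. "model" is a stand-in scorer (higher = concept more
# present); the real method operates on a diffusion model's denoising loss.
import torch

def adversarial_unlearning_step(model: torch.nn.Module,
                                concept_embedding: torch.Tensor,
                                model_opt: torch.optim.Optimizer,
                                attack_steps: int = 5,
                                attack_lr: float = 0.1) -> torch.Tensor:
    # Inner loop: perturb the embedding to maximise the concept score.
    adv = concept_embedding.clone().detach().requires_grad_(True)
    for _ in range(attack_steps):
        score = model(adv).mean()
        grad, = torch.autograd.grad(score, adv)
        adv = (adv + attack_lr * grad.sign()).detach().requires_grad_(True)
    # Outer step: unlearn against the adversarial embedding.
    model_opt.zero_grad()
    loss = model(adv.detach()).mean()
    loss.backward()
    model_opt.step()
    return loss.detach()

# Toy usage with a linear scorer in place of a diffusion model.
model = torch.nn.Linear(32, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.01)
emb = torch.randn(32)
for _ in range(3):
    adversarial_unlearning_step(model, emb, opt)
```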
- Latent Guard: a Safety Framework for Text-to-image Generation [64.49596711025993]
Existing safety measures are either based on text blacklists, which can be easily circumvented, or on harmful content classification.
We propose Latent Guard, a framework designed to improve safety measures in text-to-image generation.
Inspired by blacklist-based approaches, Latent Guard learns a latent space on top of the T2I model's text encoder, in which the presence of harmful concepts can be checked.
arXiv Detail & Related papers (2024-04-11T17:59:52Z)
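The Latent Guard entry above describes learning a latent space on top of the text encoder in which harmful concepts can be detected. A minimal sketch of such a check, assuming a small projection head and a cosine-similarity threshold against blacklisted concept embeddings; the architecture, dimensions, and threshold are illustrative assumptions, not the paper's design:

```python
# Illustrative sketch only: a projection head over prompt embeddings that
# flags a prompt if it lies close to any blacklisted concept in the learned
# space. Latent Guard's actual architecture and training are not described
# in the summary above.
import torch
import torch.nn.functional as F

class ConceptChecker(torch.nn.Module):
    def __init__(self, text_dim: int = 768, latent_dim: int = 128):
        super().__init__()
        self.proj = torch.nn.Sequential(
            torch.nn.Linear(text_dim, latent_dim),
            torch.nn.ReLU(),
            torch.nn.Linear(latent_dim, latent_dim),
        )

    def is_unsafe(self, prompt_emb: torch.Tensor,
                  concept_embs: torch.Tensor,
                  threshold: float = 0.7) -> torch.Tensor:
        """prompt_emb: (B, text_dim) pooled text-encoder output
        concept_embs: (C, text_dim) embeddings of blacklisted concepts"""
        z_p = F.normalize(self.proj(prompt_emb), dim=-1)    # (B, latent_dim)
        z_c = F.normalize(self.proj(concept_embs), dim=-1)  # (C, latent_dim)
        sims = z_p @ z_c.T                                   # (B, C) cosine similarities
        return sims.max(dim=-1).values > threshold           # flag prompts near any concept

# Toy usage with random embeddings standing in for text-encoder outputs.
checker = ConceptChecker()
prompts = torch.randn(4, 768)
blacklist = torch.randn(10, 768)
print(checker.is_unsafe(prompts, blacklist))
```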
- Create Your World: Lifelong Text-to-Image Diffusion [75.14353789007902]
We propose the Lifelong text-to-image Diffusion Model (L2DM) to overcome "catastrophic forgetting" of previously encountered concepts.
To counter this forgetting, the L2DM framework devises a task-aware memory enhancement module and an elastic-concept distillation module.
Our model generates more faithful images across a range of continual text prompts, in terms of both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-09-08T16:45:56Z)
- Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models [63.20512617502273]
We propose a method called SDD to prevent problematic content generation in text-to-image diffusion models.
Our method eliminates a much greater proportion of harmful content from the generated images without degrading the overall image quality.
arXiv Detail & Related papers (2023-07-12T07:48:29Z)
- Selective Amnesia: A Continual Learning Approach to Forgetting in Deep Generative Models [12.188240438657512]
We derive a technique inspired by continual learning to selectively forget concepts in pretrained deep generative models.
Our method, dubbed Selective Amnesia, enables controllable forgetting where a user can specify how a concept should be forgotten.
arXiv Detail & Related papers (2023-05-17T10:53:58Z)