Selective Amnesia: A Continual Learning Approach to Forgetting in Deep
Generative Models
- URL: http://arxiv.org/abs/2305.10120v2
- Date: Tue, 17 Oct 2023 01:59:27 GMT
- Title: Selective Amnesia: A Continual Learning Approach to Forgetting in Deep
Generative Models
- Authors: Alvin Heng, Harold Soh
- Abstract summary: We derive a technique inspired by continual learning to selectively forget concepts in pretrained deep generative models.
Our method, dubbed Selective Amnesia, enables controllable forgetting where a user can specify how a concept should be forgotten.
- Score: 12.188240438657512
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The recent proliferation of large-scale text-to-image models has led to
growing concerns that such models may be misused to generate harmful,
misleading, and inappropriate content. Motivated by this issue, we derive a
technique inspired by continual learning to selectively forget concepts in
pretrained deep generative models. Our method, dubbed Selective Amnesia,
enables controllable forgetting where a user can specify how a concept should
be forgotten. Selective Amnesia can be applied to conditional variational
likelihood models, which encompass a variety of popular deep generative
frameworks, including variational autoencoders and large-scale text-to-image
diffusion models. Experiments across different models demonstrate that our
approach induces forgetting on a variety of concepts, from entire classes in
standard datasets to celebrity and nudity prompts in text-to-image models. Our
code is publicly available at https://github.com/clear-nus/selective-amnesia.
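
The abstract describes the method only at a high level: a continual-learning-inspired objective that rewrites a chosen concept while preserving the rest of a conditional likelihood model. As a rough, non-authoritative illustration, the sketch below assembles the ingredients such an objective typically involves (a user-chosen surrogate target for the forgotten concept, replay of remembered data, and an EWC-style Fisher-weighted penalty). The names `model.log_likelihood`, the Fisher dictionary, and the weighting `lam` are assumptions for illustration and are not taken from the paper or its repository.

```python
# Hypothetical sketch of a continual-learning-style forgetting objective.
# Not the paper's implementation: model.log_likelihood, the Fisher dict,
# and the loss weighting are illustrative assumptions.
import torch


def selective_forgetting_loss(model, pretrained_params, fisher,
                              forget_batch, replay_batch, lam=1.0):
    """One training step's loss for forgetting a single concept.

    forget_batch: (x_f, c_f) where x_f are samples of a user-chosen surrogate
        (e.g. blurred or unrelated images) paired with the concept to forget.
    replay_batch: (x_r, c_r) samples of concepts that should be preserved.
    """
    x_f, c_f = forget_batch
    x_r, c_r = replay_batch

    # 1) Steer the forgotten concept toward the surrogate distribution.
    loss_forget = -model.log_likelihood(x_f, c_f).mean()

    # 2) Preserve behaviour on remembered concepts via replay.
    loss_replay = -model.log_likelihood(x_r, c_r).mean()

    # 3) EWC-style penalty: stay close to the pretrained weights,
    #    weighted by a (diagonal) Fisher information estimate.
    loss_ewc = torch.zeros(())
    for name, p in model.named_parameters():
        loss_ewc = loss_ewc + (fisher[name] * (p - pretrained_params[name]) ** 2).sum()

    return loss_forget + loss_replay + lam * loss_ewc
```

In this framing, the "controllable forgetting" the abstract mentions corresponds to the user's choice of surrogate data paired with the forgotten concept.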
Related papers
- Avoiding Generative Model Writer's Block With Embedding Nudging [8.3196702956302]
We focus on latent diffusion image generative models and how one can prevent them from generating particular images while still generating similar images with limited overhead.
Our method successfully prevents the generation of memorized training images while maintaining comparable image quality and relevance to the unmodified model.
arXiv Detail & Related papers (2024-08-28T00:07:51Z)
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models [22.60023885544265]
Large-scale training datasets may contain sexually explicit, copyrighted, or otherwise undesirable content, which the model can then generate directly.
Fine-tuning algorithms have been developed to tackle concept erasing in diffusion models.
We present a new approach that solves all of these challenges.
arXiv Detail & Related papers (2023-12-20T07:04:33Z)
- Create Your World: Lifelong Text-to-Image Diffusion [75.14353789007902]
We propose the Lifelong text-to-image Diffusion Model (L2DM) to overcome catastrophic forgetting of previously encountered concepts.
To address catastrophic forgetting, our L2DM framework devises a task-aware memory enhancement module and an elastic-concept distillation module.
Our model generates more faithful images across a range of continual text prompts in terms of both qualitative and quantitative metrics.
arXiv Detail & Related papers (2023-09-08T16:45:56Z)
- Circumventing Concept Erasure Methods For Text-to-Image Generative Models [26.804057000265434]
Text-to-image generative models can produce photo-realistic images for an extremely broad range of concepts.
These models have numerous drawbacks, including their potential to generate images featuring sexually explicit content.
Various methods have been proposed in order to "erase" sensitive concepts from text-to-image models.
arXiv Detail & Related papers (2023-08-03T02:34:01Z)
- Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models [63.20512617502273]
We propose a method called SDD to prevent problematic content generation in text-to-image diffusion models.
Our method eliminates a much greater proportion of harmful content from the generated images without degrading the overall image quality.
arXiv Detail & Related papers (2023-07-12T07:48:29Z)
- Training Diffusion Models with Reinforcement Learning [82.29328477109826]
Diffusion models are trained with an approximation to the log-likelihood objective.
In this paper, we investigate reinforcement learning methods for directly optimizing diffusion models for downstream objectives.
We describe how posing denoising as a multi-step decision-making problem enables a class of policy gradient algorithms (a toy illustration of this framing follows after this list).
arXiv Detail & Related papers (2023-05-22T17:57:41Z)
- Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models [79.50701155336198]
Forget-Me-Not is designed to safely remove specified IDs, objects, or styles from a well-configured text-to-image model in as little as 30 seconds.
We demonstrate that Forget-Me-Not can effectively eliminate targeted concepts while maintaining the model's performance on other concepts.
It can also be adapted as a lightweight model patch for Stable Diffusion, allowing for concept manipulation and convenient distribution.
arXiv Detail & Related papers (2023-03-30T17:58:11Z)
- Freestyle Layout-to-Image Synthesis [42.64485133926378]
In this work, we explore the freestyle capability of the model, i.e., how far it can generate unseen semantics onto a given layout.
Inspired by this, we opt to leverage large-scale pre-trained text-to-image diffusion models to achieve the generation of unseen semantics.
The proposed diffusion network produces realistic and freestyle layout-to-image generation results with diverse text inputs.
arXiv Detail & Related papers (2023-03-25T09:37:41Z) - Ablating Concepts in Text-to-Image Diffusion Models [57.9371041022838]
Large-scale text-to-image diffusion models can generate high-fidelity images with powerful compositional ability.
These models are typically trained on an enormous amount of Internet data, often containing copyrighted material, licensed images, and personal photos.
We propose an efficient method of ablating concepts in the pretrained model, preventing the generation of a target concept.
arXiv Detail & Related papers (2023-03-23T17:59:42Z)
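
The entry above on training diffusion models with reinforcement learning frames each denoising step as an action in a multi-step decision process. The toy below is only a minimal illustration of that framing, not the cited paper's algorithm: each reverse step is treated as a Gaussian action whose log-probability is accumulated, and a REINFORCE-style gradient weights the trajectory by a reward on the final sample. The tiny denoiser, reward, and hyperparameters are all hypothetical.

```python
# Toy REINFORCE-style update over denoising steps (illustrative only).
import torch
import torch.nn as nn


class TinyDenoiser(nn.Module):
    """Minimal stand-in for a denoising network on 2-D points."""
    def __init__(self, dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, dim))

    def forward(self, x, t):
        t_feat = torch.full((x.shape[0], 1), float(t))
        return self.net(torch.cat([x, t_feat], dim=1))  # mean of the next state


def reward(x):
    # Hypothetical downstream reward: prefer final samples near the origin.
    return -x.pow(2).sum(dim=1)


model = TinyDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
num_steps, sigma, batch = 10, 0.1, 32

for _ in range(100):
    x = torch.randn(batch, 2)              # start from noise
    log_prob = torch.zeros(batch)
    for t in reversed(range(num_steps)):
        mean = model(x, t)
        dist = torch.distributions.Normal(mean, sigma)
        x = dist.sample()                  # one "action" per denoising step
        log_prob = log_prob + dist.log_prob(x).sum(dim=1)
    # REINFORCE: weight the trajectory log-probability by the final reward.
    loss = -(reward(x).detach() * log_prob).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```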