EraseDiff: Erasing Data Influence in Diffusion Models
- URL: http://arxiv.org/abs/2401.05779v2
- Date: Mon, 5 Feb 2024 00:32:13 GMT
- Title: EraseDiff: Erasing Data Influence in Diffusion Models
- Authors: Jing Wu, Trung Le, Munawar Hayat, Mehrtash Harandi
- Abstract summary: We introduce an unlearning algorithm for diffusion models.
We show that our algorithm preserves model utility, effectiveness, and efficiency while removing the influence of the forgetting data across widely-used diffusion models.
- Score: 54.95692559939673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we introduce an unlearning algorithm for diffusion models. Our
algorithm equips a diffusion model with a mechanism to mitigate the concerns
related to data memorization. To achieve this, we formulate the unlearning
problem as a constraint optimization problem, aiming to preserve the utility of
the diffusion model on the remaining data and scrub the information associated
with forgetting data by deviating the learnable generative process from the
ground-truth denoising procedure. To solve the resulting problem, we adopt a
first-order method, which offers strong practical performance while remaining
faithful to the diffusion process. Empirically, we demonstrate that our
algorithm can preserve model utility, effectiveness, and efficiency while
removing the influence of the forgetting data, across widely-used diffusion
models and in both conditional and unconditional image generation scenarios.
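As a rough illustration of the first-order scheme described above (my simplification on a toy problem, not the authors' implementation), one can give a "denoiser" two scalar weights, keep the standard denoising loss on the retained data, and steer predictions on the forgetting data toward a deviated target (here, the negated true noise) so the model leaves the ground-truth denoising path for that data:

```python
import numpy as np

# Toy sketch of first-order unlearning (illustration only).
# The "denoiser" predicts noise as eps_hat = theta[0] * x on retained data
# and theta[1] * x on forgetting data. Hypothetical objective:
#   retain loss : ||eps_hat - eps||^2       (match the true noise)
#   forget loss : ||eps_hat - (-eps)||^2    (match a deviated target)
rng = np.random.default_rng(0)
x_r = rng.normal(size=256); eps_r = x_r + 0.1 * rng.normal(size=256)  # retain
x_f = rng.normal(size=256); eps_f = x_f + 0.1 * rng.normal(size=256)  # forget

theta = np.zeros(2)
lam, lr = 0.5, 0.05
for _ in range(300):
    g0 = np.mean(2 * (theta[0] * x_r - eps_r) * x_r)     # fit true noise
    g1 = np.mean(2 * (theta[1] * x_f + eps_f) * x_f)     # fit deviated target
    theta -= lr * np.array([g0, lam * g1])               # one first-order step

retain_loss = np.mean((theta[0] * x_r - eps_r) ** 2)  # small: utility preserved
forget_loss = np.mean((theta[1] * x_f - eps_f) ** 2)  # large vs. true noise: scrubbed
```

After training, the weight serving the retained data recovers the ground-truth denoising relation, while the weight serving the forgetting data has been pushed away from it, which is the qualitative behavior the constrained formulation targets.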
Related papers
- Integrating Amortized Inference with Diffusion Models for Learning Clean Distribution from Corrupted Images [19.957503854446735]
Diffusion models (DMs) have emerged as powerful generative models for solving inverse problems.
FlowDiff is a joint training paradigm that leverages a conditional normalizing flow model to facilitate the training of diffusion models on corrupted data sources.
Our experiment shows that FlowDiff can effectively learn clean distributions across a wide range of corrupted data sources.
arXiv Detail & Related papers (2024-07-15T18:33:20Z)
- An Expectation-Maximization Algorithm for Training Clean Diffusion Models from Corrupted Observations [21.411327264448058]
We propose an expectation-maximization (EM) approach to train diffusion models from corrupted observations.
Our method alternates between reconstructing clean images from corrupted data using a known diffusion model (E-step) and refining diffusion model weights based on these reconstructions (M-step)
This iterative process leads the learned diffusion model to gradually converge to the true clean data distribution.
arXiv Detail & Related papers (2024-07-01T07:00:17Z)
- Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models [63.1637853118899]
We propose the first unsupervised and learning-based method to identify interpretable directions in h-space of pre-trained diffusion models.
We employ a shift control module that works on h-space of pre-trained diffusion models to manipulate a sample into a shifted version of itself.
By jointly optimizing them, the model will spontaneously discover disentangled and interpretable directions.
arXiv Detail & Related papers (2023-10-15T18:44:30Z)
- Solving Inverse Problems with Latent Diffusion Models via Hard Data Consistency [7.671153315762146]
Training diffusion models in the pixel space is both data-intensive and computationally demanding.
Latent diffusion models, which operate in a much lower-dimensional space, offer a solution to these challenges.
We propose ReSample, an algorithm that can solve general inverse problems with pre-trained latent diffusion models.
arXiv Detail & Related papers (2023-07-16T18:42:01Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes the limitations of prior distillation approaches with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
- CamoDiffusion: Camouflaged Object Detection via Conditional Diffusion Models [72.93652777646233]
Camouflaged Object Detection (COD) is a challenging task in computer vision due to the high similarity between camouflaged objects and their surroundings.
We propose a new paradigm that treats COD as a conditional mask-generation task leveraging diffusion models.
Our method, dubbed CamoDiffusion, employs the denoising process of diffusion models to iteratively reduce the noise of the mask.
arXiv Detail & Related papers (2023-05-29T07:49:44Z)
- Two-stage Denoising Diffusion Model for Source Localization in Graph Inverse Problems [19.57064597050846]
Source localization is the inverse problem of graph information dissemination.
We propose a two-stage optimization framework, the source localization denoising diffusion model (SL-Diff).
In extensive experiments, SL-Diff yields excellent prediction results within a reasonable sampling time.
arXiv Detail & Related papers (2023-04-18T09:11:09Z)
- A Cheaper and Better Diffusion Language Model with Soft-Masked Noise [62.719656543880596]
Masked-Diffuse LM is a novel diffusion model for language modeling, inspired by linguistic features of natural language.
Specifically, we design a linguistically informed forward process that corrupts the text through strategic soft-masking to better noise the textual data.
We demonstrate that our Masked-Diffuse LM can achieve better generation quality than the state-of-the-art diffusion models with better efficiency.
arXiv Detail & Related papers (2023-04-10T17:58:42Z)
- Diffusion models for missing value imputation in tabular data [10.599563005836066]
Missing value imputation in machine learning is the task of accurately estimating missing entries in a dataset from the available information.
We propose a diffusion model approach called "Conditional Score-based Diffusion Models for Tabular data" (CSDI_T).
To effectively handle categorical variables and numerical variables simultaneously, we investigate three techniques: one-hot encoding, analog bits encoding, and feature tokenization.
arXiv Detail & Related papers (2022-10-31T08:13:26Z)
- Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders [137.1060633388405]
Diffusion-based generative models learn how to generate the data by inferring a reverse diffusion chain.
We propose a faster and cheaper approach that truncates the forward process, adding noise only up to an intermediate step rather than until the data becomes pure random noise.
We show that the proposed model can be cast as an adversarial auto-encoder empowered by both the diffusion process and a learnable implicit prior.
arXiv Detail & Related papers (2022-02-19T20:18:49Z)
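Of the related papers above, the expectation-maximization approach lends itself to a compact sketch. The following is a rough illustration in a toy Gaussian setting (my simplification, not the paper's implementation): the E-step reconstructs clean values from the corrupted observations under the current model, and the M-step refits the model to those reconstructions.

```python
import numpy as np

# Toy EM for learning a clean distribution from corrupted observations.
# We only observe y = x + n, with clean x ~ N(mu_true, 1) and known noise
# n ~ N(0, sigma_n^2). The "model" is reduced to one parameter mu.
rng = np.random.default_rng(1)
mu_true, sigma_n = 2.0, 0.5
x = mu_true + rng.normal(size=2000)          # clean data (never seen directly)
y = x + sigma_n * rng.normal(size=2000)      # corrupted observations

mu = 0.0                                     # initial model guess
for _ in range(50):
    # E-step: posterior-mean reconstruction of the clean values under the
    # current model N(mu, 1) and the known noise level sigma_n.
    x_hat = (sigma_n**2 * mu + y) / (sigma_n**2 + 1.0)
    # M-step: refit the model to the reconstructions.
    mu = x_hat.mean()
```

Iterating the two steps drives the model parameter toward the clean-data mean, which mirrors the paper's claim that the learned model gradually converges to the true clean distribution.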
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.