Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
- URL: http://arxiv.org/abs/2406.08431v1
- Date: Wed, 12 Jun 2024 17:16:16 GMT
- Title: Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
- Authors: Benjamin Biggs, Arjun Seshadri, Yang Zou, Achin Jain, Aditya Golatkar, Yusheng Xie, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto,
- Abstract summary: We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data.
By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs.
- Score: 90.01635703779183
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Diffusion Soup, a compartmentalization method for Text-to-Image Generation that averages the weights of diffusion models trained on sharded data. By construction, our approach enables training-free continual learning and unlearning with no additional memory or inference costs, since models corresponding to data shards can be added or removed by re-averaging. We show that Diffusion Soup samples from a point in weight space that approximates the geometric mean of the distributions of constituent datasets, which offers anti-memorization guarantees and enables zero-shot style mixing. Empirically, Diffusion Soup outperforms a paragon model trained on the union of all data shards and achieves a 30% improvement in Image Reward (.34 $\to$ .44) on domain sharded data, and a 59% improvement in IR (.37 $\to$ .59) on aesthetic data. In both cases, souping also prevails in TIFA score (respectively, 85.5 $\to$ 86.5 and 85.6 $\to$ 86.8). We demonstrate robust unlearning -- removing any individual domain shard only lowers performance by 1% in IR (.45 $\to$ .44) -- and validate our theoretical insights on anti-memorization using real data. Finally, we showcase Diffusion Soup's ability to blend the distinct styles of models finetuned on different shards, resulting in the zero-shot generation of hybrid styles.
Related papers
- Mitigating Embedding Collapse in Diffusion Models for Categorical Data [52.90687881724333]
We introduce CATDM, a continuous diffusion framework within the embedding space that stabilizes training.
Experiments on benchmarks show that CATDM mitigates embedding collapse, yielding superior results on FFHQ, LSUN Churches, and LSUN Bedrooms.
arXiv Detail & Related papers (2024-10-18T09:12:33Z) - Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $mathbfxsim prm post(mathbfx)propto p(mathbfx)r(mathbfx)$, in a model that consists of a diffusion generative model prior $p(mathbfx)$ and a black-box constraint or function $r(mathbfx)$.
We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from
arXiv Detail & Related papers (2024-05-31T16:18:46Z) - Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z) - MimicDiffusion: Purifying Adversarial Perturbation via Mimicking Clean
Diffusion Model [8.695439655048634]
Diffusion-based adversarial purification focuses on using the diffusion model to generate a clean image against adversarial attacks.
We propose MimicDiffusion, a new diffusion-based adversarial purification technique, that directly approximates the generative process of the diffusion model with the clean image as input.
Experiments on three image datasets demonstrate that MimicDiffusion significantly performs better than the state-of-the-art baselines.
arXiv Detail & Related papers (2023-12-08T02:32:47Z) - Reducing Spatial Fitting Error in Distillation of Denoising Diffusion
Models [13.364271265023953]
Knowledge distillation for diffusion models is an effective method to address this limitation with a shortened sampling process.
We attribute the degradation to the spatial fitting error occurring in the training of both the teacher and student model.
SFERD utilizes attention guidance from the teacher model and a designed semantic gradient predictor to reduce the student's fitting error.
We achieve an FID of 5.31 on CIFAR-10 and 9.39 on ImageNet 64$times$64 with only one step, outperforming existing diffusion methods.
arXiv Detail & Related papers (2023-11-07T09:19:28Z) - Upgrading VAE Training With Unlimited Data Plans Provided by Diffusion
Models [12.542073306638988]
We show that overfitting encoders in VAEs can be effectively mitigated by training on samples from a pre-trained diffusion model.
We analyze generalization performance, amortization gap, and robustness of VAEs trained with our proposed method on three different data sets.
arXiv Detail & Related papers (2023-10-30T15:38:39Z) - Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution [67.9215891673174]
We propose score entropy as a novel loss that naturally extends score matching to discrete spaces.
We test our Score Entropy Discrete Diffusion models on standard language modeling tasks.
arXiv Detail & Related papers (2023-10-25T17:59:12Z) - GSURE-Based Diffusion Model Training with Corrupted Data [35.56267114494076]
We propose a novel training technique for generative diffusion models based only on corrupted data.
We demonstrate our technique on face images as well as Magnetic Resonance Imaging (MRI)
arXiv Detail & Related papers (2023-05-22T15:27:20Z) - Better Diffusion Models Further Improve Adversarial Training [97.44991845907708]
It has been recognized that the data generated by the diffusion probabilistic model (DDPM) improves adversarial training.
This paper gives an affirmative answer by employing the most recent diffusion model which has higher efficiency.
Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data.
arXiv Detail & Related papers (2023-02-09T13:46:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.