Model Collapse in the Self-Consuming Chain of Diffusion Finetuning: A Novel Perspective from Quantitative Trait Modeling
- URL: http://arxiv.org/abs/2407.17493v3
- Date: Fri, 06 Jun 2025 22:43:49 GMT
- Title: Model Collapse in the Self-Consuming Chain of Diffusion Finetuning: A Novel Perspective from Quantitative Trait Modeling
- Authors: Youngseok Yoon, Dainong Hu, Iain Weissburg, Yao Qin, Haewon Jeong
- Abstract summary: This paper examines Chain of Diffusion, where a pretrained text-to-image diffusion model is finetuned on its own generated images. We demonstrate that severe image quality degradation is universal and identify the CFG scale as the key factor driving this model collapse. We propose Reusable Diffusion Finetuning (ReDiFine), a simple yet effective strategy inspired by genetic mutations.
- Score: 10.159932782892865
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Model collapse, the severe degradation of generative models when iteratively trained on their own outputs, has gained significant attention in recent years. This paper examines Chain of Diffusion, where a pretrained text-to-image diffusion model is finetuned on its own generated images. We demonstrate that severe image quality degradation is universal and identify the classifier-free guidance (CFG) scale as the key factor driving this model collapse. Drawing on an analogy between the Chain of Diffusion and biological evolution, we then introduce a novel theoretical analysis based on quantitative trait modeling from statistical genetics. Our theoretical analysis aligns with empirical observations of the generated images in the Chain of Diffusion. Finally, we propose Reusable Diffusion Finetuning (ReDiFine), a simple yet effective strategy inspired by genetic mutations. It operates robustly across various scenarios without requiring any hyperparameter tuning, making it a plug-and-play solution for reusable image generation.
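The self-consuming loop the abstract describes is easy to state in code. Below is a minimal sketch, assuming a Hugging Face `diffusers` text-to-image pipeline; `finetune_lora` and the prompt set are hypothetical placeholders, and only the loop structure and the role of the CFG scale (`guidance_scale`) follow the abstract.

```python
# Minimal sketch of the self-consuming "Chain of Diffusion" loop: sample a
# synthetic dataset from the current model, then finetune the model on its
# own outputs, repeatedly. The CFG scale passed as guidance_scale is the
# collapse factor the paper identifies.
import torch
from diffusers import StableDiffusionPipeline

def finetune_lora(pipe, images, prompts):
    """Placeholder: finetune `pipe` on (images, prompts), e.g. via LoRA.
    A real implementation would run a standard diffusion finetuning loop."""
    return pipe  # no-op stand-in

def chain_of_diffusion(model_id, prompts, generations=5, cfg_scale=7.5):
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")
    for gen in range(generations):
        # 1. Sample a synthetic dataset from the current model.
        images = [pipe(p, guidance_scale=cfg_scale).images[0] for p in prompts]
        # 2. Finetune on the model's own generations (the collapse-inducing step).
        pipe = finetune_lora(pipe, images, prompts)
        print(f"generation {gen}: finetuned on {len(images)} self-generated images")
    return pipe
```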
Related papers
- One-for-More: Continual Diffusion Model for Anomaly Detection [61.12622458367425]
Anomaly detection methods utilize diffusion models to generate or reconstruct normal samples when given arbitrary anomaly images.
Our study found that the diffusion model suffers from severe "faithfulness hallucination" and "catastrophic forgetting".
We propose a continual diffusion model that uses gradient projection to achieve stable continual learning.
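The summary names gradient projection but not its exact form; below is a generic sketch of the technique as it appears in the continual-learning literature: project new-task gradients onto the orthogonal complement of directions important to earlier tasks, so updates do not disturb previously learned behavior. This is an assumption-based illustration, not the paper's exact method.

```python
# Generic gradient projection for continual learning: remove the component
# of the new gradient that lies inside a protected subspace spanned by an
# orthonormal basis of old-task gradient directions.
import torch

def project_gradient(grad: torch.Tensor, basis: torch.Tensor) -> torch.Tensor:
    """grad: flat gradient (d,); basis: orthonormal rows (k, d)."""
    coeffs = basis @ grad            # coordinates inside the protected subspace
    return grad - basis.T @ coeffs   # keep only the orthogonal component

# Toy usage: one protected direction e1; a gradient along e1+e2 keeps only e2.
basis = torch.tensor([[1.0, 0.0]])
grad = torch.tensor([1.0, 1.0])
print(project_gradient(grad, basis))  # tensor([0., 1.])
```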
arXiv Detail & Related papers (2025-02-27T07:47:27Z)
- DeltaDiff: Reality-Driven Diffusion with Anchor Residuals for Faithful SR [10.790771977682763]
We propose DeltaDiff, a novel framework that constrains the diffusion process.
Our method surpasses state-of-the-art models and generates results with better fidelity.
This work establishes a new low-rank constrained paradigm for applying diffusion models to image reconstruction tasks.
arXiv Detail & Related papers (2025-02-18T06:07:14Z)
- Progressive Compression with Universally Quantized Diffusion Models [35.199627388957566]
We explore the potential of diffusion models for progressive coding, resulting in a sequence of bits that can be incrementally transmitted and decoded.
Unlike prior work based on Gaussian diffusion or conditional diffusion models, we propose a new form of diffusion model with uniform noise in the forward process.
We obtain promising first results on image compression, achieving competitive rate-distortion and rate-realism results on a wide range of bit-rates with a single model.
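The uniform forward noise connects to classical universal (dithered) quantization, where subtracting a shared dither after rounding yields an error that is exactly uniform and independent of the signal; this is the standard result that makes a uniform-noise diffusion amenable to progressive coding. A small numerical check (step size and sample count are arbitrary):

```python
# Universal (dithered) quantization: with a dither u ~ U(-delta/2, delta/2)
# shared by encoder and decoder, delta*round((x+u)/delta) - u behaves like
# x plus independent uniform noise.
import numpy as np

rng = np.random.default_rng(0)
delta = 0.5                                       # quantizer step size
x = rng.normal(size=100_000)                      # source samples
u = rng.uniform(-delta / 2, delta / 2, x.shape)   # shared dither

x_hat = delta * np.round((x + u) / delta) - u     # universally quantized x
err = x_hat - x

print(err.min(), err.max())                # ~ (-0.25, 0.25): uniform support
print(abs(np.corrcoef(x, err)[0, 1]))      # ~ 0: error ~ independent of x
```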
arXiv Detail & Related papers (2024-12-14T19:06:01Z)
- An Organism Starts with a Single Pix-Cell: A Neural Cellular Diffusion for High-Resolution Image Synthesis [8.01395073111961]
We introduce a novel family of models termed Generative Cellular Automata (GeCA).
GeCAs are evaluated as an effective augmentation tool for retinal disease classification across two imaging modalities: Fundus and Optical Coherence Tomography (OCT).
In the context of OCT imaging, where data is scarce and the class distribution is inherently skewed, GeCA significantly boosts classification performance across 11 different ophthalmological conditions.
arXiv Detail & Related papers (2024-07-03T11:26:09Z)
- Self-Consuming Generative Models with Curated Data Provably Optimize Human Preferences [20.629333587044012]
We study the impact of data curation on iterated retraining of generative models.
We prove that, if the data is curated according to a reward model, the expected reward of the iterative retraining procedure is maximized.
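A toy sketch of the curated retraining loop: the "generative model" here is just a 1D Gaussian refit each round to the top-reward half of its own samples, so the fitted mean climbs toward the reward optimum, in line with the stated result. All specifics (reward, sample sizes) are illustrative.

```python
# Reward-curated iterative retraining in miniature: sample the current
# model, keep only high-reward samples, refit, repeat. The expected reward
# of the model increases over rounds.
import numpy as np

rng = np.random.default_rng(0)
reward = lambda x: x                    # toy reward: prefer larger values

mu, sigma = 0.0, 1.0
for step in range(10):
    samples = rng.normal(mu, sigma, size=5_000)   # sample the current model
    top = np.argsort(reward(samples))[-2_500:]    # curate: keep the top half
    curated = samples[top]
    mu, sigma = curated.mean(), curated.std()     # "retrain" on curated data
    print(f"round {step}: fitted mean (expected reward) = {mu:.2f}")
```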
arXiv Detail & Related papers (2024-06-12T21:28:28Z)
- Lossy Image Compression with Foundation Diffusion Models [10.407650300093923]
In this work we formulate the removal of quantization error as a denoising task, using diffusion to recover lost information in the transmitted image latent.
Our approach requires running less than 10% of the full generative diffusion process and no architectural changes to the diffusion model.
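One way to read this: treat the quantized latent as a partially noised sample and run only the tail of the reverse process from a matched intermediate timestep. A hedged sketch under that assumption, with a hypothetical epsilon-prediction `denoiser` and a standard DDPM noise schedule:

```python
# Partial reverse diffusion for dequantization: start at an intermediate
# timestep t_start (a small fraction of T) and take deterministic
# DDIM-style steps down to t = 0.
import torch

def dequantize_by_denoising(z_quant, denoiser, alphas_cumprod, t_start):
    """z_quant: quantized latent; alphas_cumprod: (T,) cumulative schedule."""
    z = z_quant
    for t in range(t_start, -1, -1):               # only t_start+1 of T steps
        a_t = alphas_cumprod[t]
        a_prev = alphas_cumprod[t - 1] if t > 0 else torch.tensor(1.0)
        eps = denoiser(z, t)                        # predicted noise
        z0 = (z - (1 - a_t).sqrt() * eps) / a_t.sqrt()      # clean estimate
        z = a_prev.sqrt() * z0 + (1 - a_prev).sqrt() * eps  # DDIM step
    return z
```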
arXiv Detail & Related papers (2024-04-12T16:23:42Z)
- Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control [54.132297393662654]
Diffusion models excel at capturing complex data distributions, such as those of natural images and proteins.
While diffusion models are trained to represent the distribution in the training dataset, we often are more concerned with other properties, such as the aesthetic quality of the generated images.
We present theoretical and empirical evidence that demonstrates our framework is capable of efficiently generating diverse samples with high genuine rewards.
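The title's framing typically corresponds to maximizing expected reward minus a KL penalty to the pretrained model, which is what keeps the samples diverse; a minimal loss sketch under that assumption, with all tensors as placeholders (per-sample rewards and log-densities under both models):

```python
# Entropy-regularized fine-tuning objective: maximize E[reward] while
# penalizing divergence from the pretrained model.
import torch

def finetune_loss(rewards, logp_finetuned, logp_pretrained, beta=0.1):
    # KL(p_finetuned || p_pretrained) estimated from the model's own samples.
    kl_estimate = (logp_finetuned - logp_pretrained).mean()
    return -(rewards.mean() - beta * kl_estimate)   # minimize the negative
```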
arXiv Detail & Related papers (2024-02-23T08:54:42Z)
- Text Diffusion with Reinforced Conditioning [92.17397504834825]
This paper thoroughly analyzes text diffusion models and uncovers two significant limitations: degradation of self-conditioning during training and misalignment between training and sampling.
Motivated by our findings, we propose a novel Text Diffusion model called TREC, which mitigates the degradation with Reinforced Conditioning and the misalignment by Time-Aware Variance Scaling.
arXiv Detail & Related papers (2024-02-19T09:24:02Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
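This claim is easy to reproduce in miniature with scipy's `gaussian_kde`: resampling from a KDE fit to the previous round's samples inflates the variance round after round, while mixing real data back in damps the drift. Sample sizes and mixing ratios below are illustrative.

```python
# Kernel-density self-consuming loop: fit a KDE, resample from it,
# optionally mix real data back in, and refit.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=2_000)

def self_consume(rounds=5, mix=0.0):
    data = real
    for r in range(rounds):
        kde = gaussian_kde(data)
        synthetic = kde.resample(len(real), seed=r)[0]
        n_real = int(mix * len(real))
        data = np.concatenate([real[:n_real], synthetic[n_real:]])
        print(f"round {r}: std = {data.std():.3f}")

self_consume(mix=0.0)   # pure self-consumption: std drifts upward
self_consume(mix=0.5)   # mixed-data training: the drift is damped
```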
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing [49.800746112114375]
We propose a novel post-training quantization method (Progressive and Relaxing) for text-to-image diffusion models.
We are the first to achieve quantization for Stable Diffusion XL while maintaining its performance.
arXiv Detail & Related papers (2023-11-10T09:10:09Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
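A generic sketch of this kind of zero-shot steering (not necessarily the paper's exact formulation): nudge each denoising step with the gradient of a task-specific loss evaluated on the current clean-image estimate. `denoiser` and `task_loss` are hypothetical placeholders.

```python
# Gradient-based steering of an unconditional diffusion model: adjust the
# predicted noise so sampling descends a task loss (e.g., masked-pixel
# consistency for inpainting) on the predicted clean image.
import torch

def steered_step(x, t, denoiser, alphas_cumprod, task_loss, scale=1.0):
    a_t = alphas_cumprod[t]
    x = x.detach().requires_grad_(True)
    eps = denoiser(x, t)
    x0_pred = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()   # clean estimate
    loss = task_loss(x0_pred)            # e.g., ||mask * (x0_pred - y)||^2
    grad = torch.autograd.grad(loss, x)[0]
    # Steer the predicted noise in the direction that reduces the task loss.
    return eps + scale * (1 - a_t).sqrt() * grad
```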
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce the powerful model class of Denoising Diffusion Probabilistic Models (DDPMs) for chirographic data.
Our model, named "ChiroDiff", is non-autoregressive: it learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rates.
arXiv Detail & Related papers (2023-04-07T15:17:48Z)
- Bi-Noising Diffusion: Towards Conditional Diffusion Models with Generative Restoration Priors [64.24948495708337]
We introduce a new method that brings predicted samples to the training data manifold using a pretrained unconditional diffusion model.
We perform comprehensive experiments to demonstrate the effectiveness of our approach on super-resolution, colorization, turbulence removal, and image-deraining tasks.
arXiv Detail & Related papers (2022-12-14T17:26:35Z)
- SinDiffusion: Learning a Diffusion Model from a Single Natural Image [159.4285444680301]
We present SinDiffusion, leveraging denoising diffusion models to capture the internal distribution of patches from a single natural image.
It is based on two core designs. First, SinDiffusion is trained with a single model at a single scale instead of multiple models with progressive growing of scales.
Second, we identify that a patch-level receptive field of the diffusion network is crucial and effective for capturing the image's patch statistics.
arXiv Detail & Related papers (2022-11-22T18:00:03Z)
- Deep Equilibrium Approaches to Diffusion Models [1.4275201654498746]
Diffusion-based generative models are extremely effective in generating high-quality images.
These models typically require long sampling chains to produce high-fidelity images.
We look at diffusion models from a different perspective: that of a deep equilibrium (DEQ) fixed-point model.
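The deep-equilibrium reading replaces unrolling the sampling chain with directly solving for its fixed point. The sketch below uses simple damped iteration in place of a real DEQ solver, with `f` standing in as a hypothetical bundled sampling update.

```python
# Fixed-point view of sampling: solve x* = f(x*) directly rather than
# unrolling a long chain of steps.
import torch

def fixed_point(f, x0, iters=50, damping=0.5, tol=1e-5):
    x = x0
    for i in range(iters):
        x_new = (1 - damping) * x + damping * f(x)   # damped iteration
        if (x_new - x).norm() < tol * max(x.norm(), 1.0):
            return x_new, i                           # converged
        x = x_new
    return x, iters

# Toy usage: the contraction f(x) = 0.5*x + 1 has fixed point x* = 2.
f = lambda x: 0.5 * x + 1.0
x_star, n = fixed_point(f, torch.zeros(1))
print(x_star, n)   # ~tensor([2.]), with the iteration count at convergence
```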
arXiv Detail & Related papers (2022-10-23T22:02:19Z)
- Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models [1.6352599467675781]
We propose a method based on diffusion models to detect and segment anomalies in brain imaging.
Our diffusion models achieve competitive performance compared with autoregressive approaches across a series of experiments with 2D CT and MRI data.
arXiv Detail & Related papers (2022-06-07T17:30:43Z)