Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation
- URL: http://arxiv.org/abs/2412.16906v1
- Date: Sun, 22 Dec 2024 07:48:49 GMT
- Title: Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation
- Authors: Quan Dao, Hao Phung, Trung Dao, Dimitris Metaxas, Anh Tran
- Abstract summary: Flow matching has emerged as a promising framework for training generative models.
We introduce a self-corrected flow distillation method that integrates consistency models and adversarial training.
This work pioneers consistent generation quality in both few-step and one-step sampling.
- Score: 3.8959351616076745
- Abstract: Flow matching has emerged as a promising framework for training generative models, demonstrating impressive empirical performance while offering relative ease of training compared to diffusion-based models. However, this method still requires numerous function evaluations in the sampling process. To address this limitation, we introduce a self-corrected flow distillation method that effectively integrates consistency models and adversarial training within the flow-matching framework. This work pioneers consistent generation quality in both few-step and one-step sampling. Our extensive experiments validate the effectiveness of our method, yielding superior results both quantitatively and qualitatively on CelebA-HQ and zero-shot benchmarks on the COCO dataset. Our implementation is released at https://github.com/VinAIResearch/SCFlow
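As a rough illustration of the recipe the abstract describes (flow-matching distillation with a consistency-style loss plus an adversarial term), here is a minimal PyTorch sketch. Every module, step size, and loss weight is a toy stand-in and an assumption on our part, not the released SCFlow implementation.

```python
import torch
import torch.nn.functional as F

# Toy stand-ins: a frozen pretrained velocity field (teacher), a few-step
# student, and a discriminator. Real models would be U-Nets/DiTs over images.
teacher = torch.nn.Linear(2, 2).requires_grad_(False)
student = torch.nn.Linear(2, 2)
disc = torch.nn.Linear(2, 1)
opt_g = torch.optim.Adam(student.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

for step in range(100):
    x1 = torch.randn(64, 2)            # stand-in for real data
    x0 = torch.randn(64, 2)            # noise
    t = torch.rand(64, 1)
    xt = (1 - t) * x0 + t * x1         # straight flow-matching path

    # Distillation loss: the student's Euler step from x_t should match the
    # frozen teacher's (a crude stand-in for the paper's consistency objective).
    with torch.no_grad():
        target = xt + 0.1 * teacher(xt)
    pred = xt + 0.1 * student(xt)
    loss_consist = F.mse_loss(pred, target)

    # Adversarial term on one-step samples generated from pure noise.
    fake = x0 + student(x0)
    loss_adv = F.softplus(-disc(fake)).mean()

    opt_g.zero_grad()
    (loss_consist + 0.1 * loss_adv).backward()
    opt_g.step()

    # Discriminator sees real data vs. detached fakes.
    opt_d.zero_grad()
    d_loss = F.softplus(-disc(x1)).mean() + F.softplus(disc(fake.detach())).mean()
    d_loss.backward()
    opt_d.step()
```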
Related papers
- Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion [9.8078769718432]
We propose an efficient quantization framework for Stable Diffusion models.
Our approach features a Serial-to-Parallel calibration pipeline that addresses the consistency of both the calibration and inference processes.
Under W4A8 quantization settings, our approach enhances both distribution similarity and visual similarity by 45%-60%.
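For context, W4A8 means 4-bit weights and 8-bit activations. Below is a minimal symmetric fake-quantization sketch; it is a generic illustration and does not reproduce the paper's Serial-to-Parallel calibration pipeline.

```python
import torch

def fake_quant(x: torch.Tensor, n_bits: int) -> torch.Tensor:
    # Symmetric uniform quantization: map to signed n-bit integers and back.
    qmax = 2 ** (n_bits - 1) - 1
    scale = x.abs().max() / qmax
    return torch.clamp(torch.round(x / scale), -qmax, qmax) * scale

w = torch.randn(128, 128)      # a weight matrix
a = torch.randn(16, 128)       # an activation batch
w_q = fake_quant(w, n_bits=4)  # "W4"
a_q = fake_quant(a, n_bits=8)  # "A8"
out = a_q @ w_q.T              # quantized matmul (simulated in float)
print((a @ w.T - out).abs().mean())  # quantization error
```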
arXiv Detail & Related papers (2024-12-09T17:00:20Z)
- FlowTS: Time Series Generation via Rectified Flow [67.41208519939626]
FlowTS is an ODE-based model that leverages rectified flow with straight-line transport in probability space.
For unconditional setting, FlowTS achieves state-of-the-art performance, with context FID scores of 0.019 and 0.011 on Stock and ETTh datasets.
For conditional setting, we have achieved superior performance in solar forecasting.
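A minimal sketch of the rectified-flow objective this summary refers to: sample pairs are connected by straight lines and a network regresses the constant velocity, so a few Euler steps suffice at sampling time. The shapes and toy data are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Rectified flow: x_t = (1 - t) * x0 + t * x1, regress v onto x1 - x0.
v = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.SiLU(), torch.nn.Linear(64, 2))
opt = torch.optim.Adam(v.parameters(), lr=1e-3)

for step in range(200):
    x0 = torch.randn(128, 2)                # noise
    x1 = torch.randn(128, 2) * 0.5 + 2.0    # stand-in for data (e.g. a series window)
    t = torch.rand(128, 1)
    xt = (1 - t) * x0 + t * x1              # straight-line transport path
    loss = F.mse_loss(v(torch.cat([xt, t], dim=1)), x1 - x0)
    opt.zero_grad(); loss.backward(); opt.step()

# Sampling: integrate dx/dt = v(x, t) from t=0 to 1; straight paths mean
# a handful of Euler steps is enough.
x = torch.randn(128, 2)
for i in range(4):
    t = torch.full((128, 1), i / 4)
    x = x + 0.25 * v(torch.cat([x, t], dim=1))
```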
arXiv Detail & Related papers (2024-11-12T03:03:23Z)
- Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation [62.30570286073223]
Diffusion-based text-to-image generation models have demonstrated the ability to produce images aligned with textual descriptions.
We introduce a data-free guided distillation method that enables the efficient distillation of pretrained diffusion models without access to real training data.
By exclusively training with synthetic images generated by its one-step generator, our data-free distillation method rapidly improves FID and CLIP scores, achieving state-of-the-art FID performance while maintaining a competitive CLIP score.
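A loudly simplified sketch of the data-free idea: the only training inputs are the one-step generator's own synthetic samples, with the frozen teacher providing targets. This is a generic stand-in, not the paper's score-identity objective.

```python
import torch
import torch.nn.functional as F

# Frozen pretrained denoiser (teacher) and a one-step generator (student);
# both are toy stand-ins for large text-to-image models.
teacher = torch.nn.Linear(2, 2).requires_grad_(False)
student = torch.nn.Linear(2, 2)
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

for step in range(100):
    z = torch.randn(64, 2)
    x_fake = student(z)                  # synthetic sample from the student
    x_noisy = x_fake + 0.5 * torch.randn_like(x_fake)  # re-noise it
    with torch.no_grad():
        x_denoised = teacher(x_noisy)    # teacher's opinion of the clean sample
    loss = F.mse_loss(x_fake, x_denoised)  # pull the student toward the teacher
    opt.zero_grad(); loss.backward(); opt.step()
```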
arXiv Detail & Related papers (2024-06-03T17:44:11Z)
- Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows [53.31856123113228]
This paper proposes Language Rectified Flow.
Our method is based on the reformulation of the standard probabilistic flow models.
Experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.
arXiv Detail & Related papers (2024-03-25T17:58:22Z)
- One-Step Diffusion Distillation via Deep Equilibrium Models [64.11782639697883]
We introduce a simple yet effective means of distilling diffusion models directly from initial noise to the resulting image.
Our method enables fully offline training with just noise/image pairs from the diffusion model.
We demonstrate that the DEQ architecture is crucial to this capability, as GET matches a 5× larger ViT in terms of FID scores.
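A minimal sketch of the fully offline setup described above: collect (noise, image) pairs from the teacher once, then train the one-step student by plain regression with no further teacher queries. All modules are toy stand-ins.

```python
import torch
import torch.nn.functional as F

# Pretend multi-step teacher sampler (fixed random map for illustration).
W = torch.randn(2, 2)
teacher_sample = lambda z: torch.tanh(z @ W)

# Offline dataset of (noise, image) pairs, collected once up front.
pairs = []
for _ in range(50):
    z = torch.randn(64, 2)
    pairs.append((z, teacher_sample(z)))

student = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.SiLU(), torch.nn.Linear(64, 2))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for epoch in range(10):
    for z, x in pairs:                 # no teacher queries during training
        loss = F.mse_loss(student(z), x)
        opt.zero_grad(); loss.backward(); opt.step()
```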
arXiv Detail & Related papers (2023-12-12T07:28:40Z)
- Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improves sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models to plan generation in the offline reinforcement learning setting, achieving a speedup over diffusion models.
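A sketch of classifier-free-style guidance applied to a flow model, in the spirit of this summary: conditional and unconditional velocity fields are blended while integrating the ODE. The networks and guidance weight here are illustrative assumptions.

```python
import torch

v_cond = torch.nn.Linear(3, 2)    # v(x, t | condition), toy stand-in
v_uncond = torch.nn.Linear(3, 2)  # v(x, t), toy stand-in
w = 2.0                           # guidance strength

x = torch.randn(16, 2)
steps = 8
for i in range(steps):
    t = torch.full((16, 1), i / steps)
    inp = torch.cat([x, t], dim=1)
    # Guided velocity: push along the conditional direction.
    v = v_uncond(inp) + w * (v_cond(inp) - v_uncond(inp))
    x = x + (1.0 / steps) * v     # Euler step of dx/dt = v
```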
arXiv Detail & Related papers (2023-11-22T15:07:59Z)
- Flow Matching in Latent Space [2.9330609943398525]
Flow matching is a framework to train generative models that exhibits impressive empirical performance.
We propose to apply flow matching in the latent spaces of pretrained autoencoders, which offers improved computational efficiency.
Our work stands as a pioneering contribution in the integration of various conditions into flow matching for conditional generation tasks.
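A minimal sketch of latent flow matching as described: encode data with a frozen pretrained autoencoder, learn the flow between noise and latents, and decode at sampling time. The toy encoder/decoder stand in for, e.g., a pretrained VAE.

```python
import torch
import torch.nn.functional as F

enc = torch.nn.Linear(8, 2).requires_grad_(False)   # frozen encoder (stand-in)
dec = torch.nn.Linear(2, 8).requires_grad_(False)   # frozen decoder (stand-in)
v = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.SiLU(), torch.nn.Linear(64, 2))
opt = torch.optim.Adam(v.parameters(), lr=1e-3)

for step in range(200):
    x = torch.randn(64, 8)            # stand-in for images
    with torch.no_grad():
        z1 = enc(x)                   # data latent
    z0 = torch.randn(64, 2)           # noise latent
    t = torch.rand(64, 1)
    zt = (1 - t) * z0 + t * z1
    loss = F.mse_loss(v(torch.cat([zt, t], dim=1)), z1 - z0)  # flow matching in latent space
    opt.zero_grad(); loss.backward(); opt.step()

# Sampling: integrate in latent space, then decode.
z = torch.randn(16, 2)
for i in range(8):
    t = torch.full((16, 1), i / 8)
    z = z + 0.125 * v(torch.cat([z, t], dim=1))
images = dec(z)
```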
arXiv Detail & Related papers (2023-07-17T17:57:56Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes these limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
- Enhancing Text Generation with Cooperative Training [23.971227375706327]
Most prevailing methods train generative and discriminative models in isolation, leaving them unable to adapt to changes in each other.
We introduce a self-consistent learning framework in the text field that trains a discriminator and generator cooperatively in a closed-loop manner.
Our framework is able to mitigate training instabilities such as mode collapse and non-convergence.
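A toy sketch of the closed-loop idea: generator and discriminator are updated alternately within each iteration so each immediately adapts to the other. The text-specific machinery (sequence models, sampling) is omitted; everything here is an illustrative stand-in.

```python
import torch
import torch.nn.functional as F

gen = torch.nn.Linear(2, 2)
disc = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)

for step in range(200):
    real = torch.randn(64, 2) + 3.0           # stand-in for real text features
    fake = gen(torch.randn(64, 2))

    # Discriminator step: tell real from generated samples.
    d_loss = F.binary_cross_entropy_with_logits(disc(real), torch.ones(64, 1)) \
           + F.binary_cross_entropy_with_logits(disc(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: use the freshly updated discriminator's feedback,
    # closing the loop within the same iteration.
    g_loss = F.binary_cross_entropy_with_logits(disc(gen(torch.randn(64, 2))), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```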
arXiv Detail & Related papers (2023-03-16T04:21:19Z)
- Modeling Score Distributions and Continuous Covariates: A Bayesian Approach [8.772459063453285]
We develop a generative model of the match and non-match score distributions over continuous covariates.
We use mixture models to capture arbitrary distributions, together with local basis functions.
Three experiments demonstrate the accuracy and effectiveness of our approach.
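A toy sketch of the core idea: fit mixture models to match and non-match scores jointly with a continuous covariate, then compare likelihoods for a new (score, covariate) pair. The paper's local basis functions and full Bayesian inference are omitted.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
cov = rng.uniform(20, 60, size=500)   # continuous covariate (e.g. subject age)
# Synthetic match/non-match scores whose distributions shift with the covariate.
match = np.column_stack([0.8 - 0.004 * cov + 0.05 * rng.standard_normal(500), cov])
non_match = np.column_stack([0.3 + 0.05 * rng.standard_normal(500), cov])

gm_match = GaussianMixture(n_components=3, random_state=0).fit(match)
gm_non = GaussianMixture(n_components=3, random_state=0).fit(non_match)

query = np.array([[0.6, 30.0]])       # (score, covariate) pair to evaluate
log_lr = gm_match.score_samples(query) - gm_non.score_samples(query)
print("log likelihood ratio:", log_lr[0])  # > 0 favors "match"
```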
arXiv Detail & Related papers (2020-09-21T02:41:20Z)