Guided Flows for Generative Modeling and Decision Making
- URL: http://arxiv.org/abs/2311.13443v2
- Date: Thu, 7 Dec 2023 20:49:03 GMT
- Title: Guided Flows for Generative Modeling and Decision Making
- Authors: Qinqing Zheng, Matt Le, Neta Shaul, Yaron Lipman, Aditya Grover, Ricky
T. Q. Chen
- Abstract summary: We show that Guided Flows significantly improves the sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models for plan generation in the offline reinforcement learning setting, showing a 10x speedup in computation compared to diffusion models.
- Score: 55.42634941614435
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Classifier-free guidance is a key component for enhancing the performance of
conditional generative models across diverse tasks. While it has previously
demonstrated remarkable improvements in sample quality, it has so far been
employed exclusively for diffusion models. In this paper, we integrate
classifier-free guidance into Flow Matching (FM) models, an alternative
simulation-free approach that trains Continuous Normalizing Flows (CNFs) based
on regressing vector fields. We explore the usage of \emph{Guided Flows} for a
variety of downstream applications. We show that Guided Flows significantly
improves the sample quality in conditional image generation and zero-shot
text-to-speech synthesis, boasting state-of-the-art performance. Notably, we
are the first to apply flow models for plan generation in the offline
reinforcement learning setting, showcasing a 10x speedup in computation
compared to diffusion models while maintaining comparable performance.
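The guidance mechanism described in the abstract can be illustrated with a minimal sketch. Classifier-free guidance combines a conditional and an unconditional prediction with a guidance weight, and for flow models the combined vector field is then integrated from noise to data. The function names, the pure-Python representation, and the forward-Euler integrator below are illustrative assumptions, not the paper's implementation; the linear combination follows the standard classifier-free guidance formula, which the paper applies to Flow Matching vector fields.

```python
def guided_vector_field(v_cond, v_uncond, w):
    # Standard classifier-free guidance combination, applied to vector fields:
    #   v_guided = (1 + w) * v_cond - w * v_uncond
    # w = 0 recovers the purely conditional field; larger w strengthens
    # the conditioning signal.
    return [(1.0 + w) * c - w * u for c, u in zip(v_cond, v_uncond)]

def euler_sample(v_fn, x0, n_steps=10):
    # Integrate dx/dt = v(x, t) from t = 0 to t = 1 with forward Euler,
    # starting from a noise sample x0. Real samplers would use a learned
    # network for v_fn and possibly a higher-order ODE solver.
    x, dt = list(x0), 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        v = v_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In practice `v_fn` would call the trained CNF twice per step (once with the condition, once with it dropped) and pass both outputs through `guided_vector_field`; the simulation-free aspect of Flow Matching applies to training, while sampling still integrates the ODE as above.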
Related papers
- Text-to-Image Rectified Flow as Plug-and-Play Priors [52.586838532560755]
Rectified flow is a novel class of generative models that enforces a linear progression from the source to the target distribution.
We show that rectified flow approaches surpass existing methods in terms of generation quality and efficiency, requiring fewer inference steps.
Our method also displays competitive performance in image inversion and editing.
arXiv Detail & Related papers (2024-06-05T14:02:31Z) - PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator [73.80050807279461]
Piecewise Rectified Flow (PeRFlow) is a flow-based method for accelerating diffusion models.
PeRFlow achieves superior performance in a few-step generation.
arXiv Detail & Related papers (2024-05-13T07:10:53Z) - D-Flow: Differentiating through Flows for Controlled Generation [37.80603174399585]
We introduce D-Flow, a framework for controlling the generation process by differentiating through the flow.
We motivate this framework by the key observation that, for Diffusion/FM models trained with Gaussian probability paths, differentiating through the generation process projects the gradient onto the data manifold.
We validate our framework on linear and non-linear controlled generation problems, including image and audio inverse problems and conditional molecule generation, reaching state-of-the-art performance across all tasks.
arXiv Detail & Related papers (2024-02-21T18:56:03Z) - Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation [59.184980778643464]
Fine-tuning Diffusion Models remains an underexplored frontier in generative artificial intelligence (GenAI).
In this paper, we introduce an innovative technique called self-play fine-tuning for diffusion models (SPIN-Diffusion).
Our approach offers an alternative to conventional supervised fine-tuning and RL strategies, significantly improving both model performance and alignment.
arXiv Detail & Related papers (2024-02-15T18:59:18Z) - Attentive Contractive Flow with Lipschitz-constrained Self-Attention [25.84621883831624]
We introduce a novel approach called Attentive Contractive Flow (ACF).
ACF utilizes a special category of flow-based generative models: contractive flows.
We demonstrate that ACF can be introduced into a variety of state-of-the-art flow models in a plug-and-play manner.
arXiv Detail & Related papers (2021-09-24T18:02:49Z) - Distilling the Knowledge from Normalizing Flows [22.578033953780697]
Normalizing flows are a powerful class of generative models demonstrating strong performance in several speech and vision problems.
We propose a simple distillation approach and demonstrate its effectiveness on state-of-the-art conditional flow-based models for image super-resolution and speech synthesis.
arXiv Detail & Related papers (2021-06-24T00:10:22Z) - Generative Flows with Invertible Attentions [135.23766216657745]
We introduce two types of invertible attention mechanisms for generative flow models.
We exploit split-based attention mechanisms to learn the attention weights and input representations on every two splits of flow feature maps.
Our method provides invertible attention modules with tractable Jacobian determinants, enabling seamless integration at any position of a flow-based model.
arXiv Detail & Related papers (2021-06-07T20:43:04Z) - Refining Deep Generative Models via Discriminator Gradient Flow [18.406499703293566]
Discriminator Gradient Flow (DGflow) is a new technique that improves generated samples via the gradient flow of entropy-regularized f-divergences.
We show that DGflow leads to significant improvement in the quality of generated samples for a variety of generative models.
arXiv Detail & Related papers (2020-12-01T19:10:15Z) - Normalizing Flows with Multi-Scale Autoregressive Priors [131.895570212956]
We introduce channel-wise dependencies in their latent space through multi-scale autoregressive priors (mAR).
Our mAR prior for models with split coupling flow layers (mAR-SCF) can better capture dependencies in complex multimodal data.
We show that mAR-SCF allows for improved image generation quality, with gains in FID and Inception scores compared to state-of-the-art flow-based models.
arXiv Detail & Related papers (2020-04-08T09:07:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.