Align Your Flow: Scaling Continuous-Time Flow Map Distillation
- URL: http://arxiv.org/abs/2506.14603v1
- Date: Tue, 17 Jun 2025 15:06:07 GMT
- Title: Align Your Flow: Scaling Continuous-Time Flow Map Distillation
- Authors: Amirmojtaba Sabour, Sanja Fidler, Karsten Kreis
- Abstract summary: Flow maps connect any two noise levels in a single step and remain effective across all step counts. We extensively validate our flow map models, called Align Your Flow, on challenging image generation benchmarks. We show text-to-image flow map models that outperform all existing non-adversarially trained few-step samplers in text-conditioned synthesis.
- Score: 63.927438959502226
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion- and flow-based models have emerged as state-of-the-art generative modeling approaches, but they require many sampling steps. Consistency models can distill these models into efficient one-step generators; however, unlike flow- and diffusion-based methods, their performance inevitably degrades when increasing the number of steps, which we show both analytically and empirically. Flow maps generalize these approaches by connecting any two noise levels in a single step and remain effective across all step counts. In this paper, we introduce two new continuous-time objectives for training flow maps, along with additional novel training techniques, generalizing existing consistency and flow matching objectives. We further demonstrate that autoguidance can improve performance, using a low-quality model for guidance during distillation, and an additional boost can be achieved by adversarial finetuning, with minimal loss in sample diversity. We extensively validate our flow map models, called Align Your Flow, on challenging image generation benchmarks and achieve state-of-the-art few-step generation performance on both ImageNet 64x64 and 512x512, using small and efficient neural networks. Finally, we show text-to-image flow map models that outperform all existing non-adversarially trained few-step samplers in text-conditioned synthesis.
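As a rough illustration of the interface the abstract describes, the sketch below shows few-step sampling with a flow map network that jumps directly between two noise levels. The `flow_map` callable, its signature, and the uniform time grid are assumptions for illustration, not the authors' implementation.

```python
import torch

def flow_map_sample(flow_map, x_init, num_steps=4):
    # Noise levels from t=1 (pure noise) down to t=0 (data);
    # a uniform grid is assumed here for simplicity.
    ts = torch.linspace(1.0, 0.0, num_steps + 1)
    x = x_init
    for s, t in zip(ts[:-1], ts[1:]):
        # A single network call connects noise level s directly to
        # noise level t, so num_steps evaluations yield a sample.
        x = flow_map(x, s, t)
    return x
```

Because the same network handles any (s, t) pair, the same model serves as a one-step generator (num_steps=1) or a multi-step sampler without retraining, which is the property the abstract contrasts with consistency models.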
Related papers
- How to build a consistency model: Learning flow maps via self-distillation [15.520853806024943]
We present a systematic approach for learning flow maps associated with flow and diffusion models. We exploit a relationship between the velocity field underlying a continuous-time flow and the instantaneous rate of change of the flow map. We show how to convert existing distillation schemes into direct training algorithms via self-distillation.
arXiv Detail & Related papers (2025-05-24T18:50:50Z)
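The relationship mentioned above is, in standard notation (ours, not necessarily the paper's), the defining ODE of the two-time flow map $X_{s,t}$: its instantaneous rate of change in the second time argument is the velocity field evaluated along the map. Enforcing this identity on the model's own outputs is what lets distillation schemes become direct, teacher-free training.

```latex
\partial_t X_{s,t}(x) = v_t\bigl(X_{s,t}(x)\bigr),
\qquad X_{s,s}(x) = x .
```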
- Mean Flows for One-step Generative Modeling [64.4997821467102]
We propose a principled and effective framework for one-step generative modeling. A well-defined identity between average and instantaneous velocities is derived and used to guide neural network training. Our method, termed the MeanFlow model, is self-contained and requires no pre-training, distillation, or curriculum learning.
arXiv Detail & Related papers (2025-05-19T17:59:42Z)
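The identity between average and instantaneous velocities can be written as follows (our notation; the paper's parameterization may differ). Defining the average velocity over $[r, t]$ and differentiating the product $(t - r)\,u$ with respect to $t$ gives

```latex
u(z_t, r, t) = \frac{1}{t - r} \int_r^t v(z_\tau, \tau)\, d\tau
\quad\Longrightarrow\quad
u(z_t, r, t) = v(z_t, t) - (t - r)\, \frac{\mathrm{d}}{\mathrm{d}t}\, u(z_t, r, t),
```

which provides a training target for the average-velocity network using only the instantaneous velocity and a derivative of the network itself.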
- ARFlow: Autoregressive Flow with Hybrid Linear Attention [48.707933347079894]
Flow models are effective at progressively generating realistic images, but they struggle to capture long-range dependencies during the generation process. We propose integrating autoregressive modeling into flow models.
arXiv Detail & Related papers (2025-01-27T14:33:27Z)
- One Step Diffusion via Shortcut Models [109.72495454280627]
We introduce shortcut models, a family of generative models that use a single network and training phase to produce high-quality samples. Shortcut models condition the network on the current noise level and also on the desired step size, allowing the model to skip ahead in the generation process. Compared to distillation, shortcut models reduce complexity to a single network and training phase and additionally allow varying step budgets at inference time.
arXiv Detail & Related papers (2024-10-16T13:34:40Z)
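A minimal sketch of the self-consistency idea behind shortcut models: one jump of size 2d should agree with two consecutive jumps of size d. The `model(x, t, d)` interface, the update rule `x + d * s`, and the averaged target are assumptions for illustration; the paper's exact loss may differ.

```python
import torch

def shortcut_self_consistency_loss(model, x, t, d):
    # model(x, t, d) predicts the update direction for a jump of
    # size d starting from noise level t (hypothetical interface).
    with torch.no_grad():
        s1 = model(x, t, d)            # first small step
        x_mid = x + d * s1
        s2 = model(x_mid, t + d, d)    # second small step
        target = (s1 + s2) / 2         # direction of the combined jump
    pred = model(x, t, 2 * d)          # one big step of size 2d
    return torch.mean((pred - target) ** 2)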
- FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner [70.90505084288057]
Flow-based models tend to produce straighter trajectories during the sampling process. We introduce several techniques, including a pseudo corrector and sample-aware compilation, to further reduce inference time. FlowTurbo reaches an FID of 2.12 on ImageNet at 100 ms/img and an FID of 3.93 at 38 ms/img.
arXiv Detail & Related papers (2024-09-26T17:59:51Z)
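One plausible reading of the "pseudo corrector" above: a Heun-style (trapezoidal) update that reuses the cached velocity from the previous step instead of spending an extra model call on a predictor. The sketch below illustrates that general idea only; FlowTurbo's actual scheme may differ.

```python
def pseudo_corrector_step(velocity_fn, x, t, dt, v_prev):
    # One fresh model evaluation per step; the previous step's
    # velocity stands in for the usual predictor evaluation.
    v_new = velocity_fn(x, t)
    x_next = x + dt * 0.5 * (v_prev + v_new)  # trapezoidal update
    return x_next, v_new
```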
- Flow map matching with stochastic interpolants: A mathematical framework for consistency models [15.520853806024943]
Flow Map Matching (FMM) is a principled framework for learning the two-time flow map of an underlying generative model. We show that FMM unifies and extends a broad class of existing approaches for fast sampling.
arXiv Detail & Related papers (2024-06-11T17:41:26Z)
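Two-time flow maps satisfy a composition (semigroup) property, which is the structural fact that consistency-style objectives exploit (standard notation, assumed here rather than taken from the paper):

```latex
X_{t,u} \circ X_{s,t} = X_{s,u},
\qquad X_{s,s} = \mathrm{id},
\qquad s \le t \le u .
```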
- Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis. Notably, we are the first to apply flow models to plan generation in the offline reinforcement learning setting, achieving a substantial speedup compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z)
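Guidance for flows typically takes a classifier-free form, interpolating the unconditional and conditional vector fields with a guidance scale $w$ (our notation; the paper's exact parameterization may differ):

```latex
\tilde{v}_t(x \mid y) = (1 - w)\, v_t(x) + w\, v_t(x \mid y) .
```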