AlphaFlow: Understanding and Improving MeanFlow Models
- URL: http://arxiv.org/abs/2510.20771v1
- Date: Thu, 23 Oct 2025 17:45:06 GMT
- Title: AlphaFlow: Understanding and Improving MeanFlow Models
- Authors: Huijie Zhang, Aliaksandr Siarohin, Willi Menapace, Michael Vasilkovsky, Sergey Tulyakov, Qing Qu, Ivan Skorokhodov
- Abstract summary: We show that the MeanFlow objective naturally decomposes into two parts: trajectory flow matching and trajectory consistency. Motivated by these insights, we introduce $\alpha$-Flow, a broad family of objectives that unifies trajectory flow matching, Shortcut Model, and MeanFlow. When trained from scratch on class-conditional ImageNet-1K 256x256 with vanilla DiT backbones, $\alpha$-Flow consistently outperforms MeanFlow across scales and settings.
- Score: 74.64465762009475
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: MeanFlow has recently emerged as a powerful framework for few-step generative modeling trained from scratch, but its success is not yet fully understood. In this work, we show that the MeanFlow objective naturally decomposes into two parts: trajectory flow matching and trajectory consistency. Through gradient analysis, we find that these terms are strongly negatively correlated, causing optimization conflict and slow convergence. Motivated by these insights, we introduce $\alpha$-Flow, a broad family of objectives that unifies trajectory flow matching, Shortcut Model, and MeanFlow under one formulation. By adopting a curriculum strategy that smoothly anneals from trajectory flow matching to MeanFlow, $\alpha$-Flow disentangles the conflicting objectives, and achieves better convergence. When trained from scratch on class-conditional ImageNet-1K 256x256 with vanilla DiT backbones, $\alpha$-Flow consistently outperforms MeanFlow across scales and settings. Our largest $\alpha$-Flow-XL/2+ model achieves new state-of-the-art results using vanilla DiT backbones, with FID scores of 2.58 (1-NFE) and 2.15 (2-NFE).
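The curriculum strategy described above can be sketched in code. This is a minimal illustration, not the paper's actual implementation: it assumes a simple cosine annealing schedule and a convex blend of the two loss terms the abstract identifies; the function names (`alpha_schedule`, `curriculum_loss`) are hypothetical.

```python
import math

def alpha_schedule(step, total_steps):
    # Cosine anneal from 1 (pure trajectory flow matching)
    # down to 0 (pure MeanFlow-style trajectory consistency).
    progress = min(step / total_steps, 1.0)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

def curriculum_loss(l_traj_fm, l_traj_consistency, alpha):
    # Convex blend of the two terms the paper identifies
    # inside the MeanFlow objective.
    return alpha * l_traj_fm + (1.0 - alpha) * l_traj_consistency
```

The point of the anneal is that early training optimizes only the trajectory-flow-matching term, so the negatively correlated consistency gradient is introduced gradually rather than fought from step one.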
Related papers
- RMFlow: Refined Mean Flow by a Noise-Injection Step for Multimodal Generation [12.979642182577157]
Mean flow (MeanFlow) enables efficient, high-fidelity image generation, yet its single-function-evaluation (1-NFE) generation often cannot yield compelling results. We introduce RMFlow, an efficient multimodal generative model that integrates a coarse 1-NFE MeanFlow transport with a tailored noise-injection refinement step. RMFlow achieves near state-of-the-art results on text-to-image, context-to-molecule, and time-series generation using only 1-NFE, at a computational cost comparable to the baseline MeanFlow.
arXiv Detail & Related papers (2026-01-31T18:27:05Z) - TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows [25.487712175353035]
We propose TwinFlow, a framework for training 1-step generative models. Our method achieves a GenEval score of 0.83 in 1-NFE on text-to-image tasks, and matches the performance of the original 100-NFE model on the GenEval and DPG-Bench benchmarks.
arXiv Detail & Related papers (2025-12-03T07:45:46Z) - Improved Mean Flows: On the Challenges of Fastforward Generative Models [81.10827083963655]
MeanFlow (MF) has recently been established as a framework for one-step generative modeling. Here, we address key challenges in both the training objective and the guidance mechanism. Our reformulation yields a more standard regression problem and improves training stability. Overall, our improved MeanFlow (iMF) method, trained entirely from scratch, achieves an FID of 1.72 with a single function evaluation (1-NFE) on ImageNet 256$\times$256.
arXiv Detail & Related papers (2025-12-01T18:59:49Z) - OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows [59.052955667723985]
We present OneFlow, the first non-autoregressive multimodal model that enables variable-length and concurrent mixed-modal generation. Unlike autoregressive models that enforce a rigid causal ordering between text and image generation, OneFlow combines an insertion-based Edit Flow for discrete text tokens with Flow Matching for image latents.
arXiv Detail & Related papers (2025-10-03T20:40:30Z) - Mean Flows for One-step Generative Modeling [64.4997821467102]
We propose a principled and effective framework for one-step generative modeling. A well-defined identity between average and instantaneous velocities is derived and used to guide neural network training. Our method, termed the MeanFlow model, is self-contained and requires no pre-training, distillation, or curriculum learning.
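The identity between average and instantaneous velocities can be checked numerically on a toy example. The sketch below assumes the MeanFlow identity takes the form $u(r,t) = v(t) - (t-r)\,\frac{d}{dt}u(r,t)$, where $u$ is the average velocity over $[r,t]$ and $v$ the instantaneous velocity; the concrete 1-D flow $v(t) = a\,t$ is chosen purely for illustration.

```python
# Toy 1-D check of the identity between the average velocity u(r, t)
# over [r, t] and the instantaneous velocity v(t):
#     u(r, t) = v(t) - (t - r) * d/dt u(r, t)
# With v(t) = a * t, the position is z(t) = z(r) + a * (t**2 - r**2) / 2,
# so u(r, t) = (z_t - z_r) / (t - r) = a * (t + r) / 2.
a = 3.0
r, t = 0.2, 0.9

v = a * t                    # instantaneous velocity at time t
u = a * (t + r) / 2.0        # average velocity over [r, t]

h = 1e-6                     # finite-difference step for d/dt u (r fixed)
du_dt = (a * (t + h + r) / 2.0 - u) / h

residual = u - (v - (t - r) * du_dt)
```

Here `du_dt` is `a / 2`, so the right-hand side reduces to `a*t - (t - r)*a/2 = a*(t + r)/2`, matching `u`, and `residual` is numerically zero.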
arXiv Detail & Related papers (2025-05-19T17:59:42Z) - Normalizing Flows are Capable Generative Models [48.31226028595099]
TarFlow is a simple and scalable architecture that enables highly performant NF models.<n>It is straightforward to train end-to-end, and capable of directly modeling and generating pixels.<n>TarFlow sets new state-of-the-art results on likelihood estimation for images, beating the previous best methods by a large margin.
arXiv Detail & Related papers (2024-12-09T09:28:06Z) - Variational Flow Matching for Graph Generation [42.3778673162256]
We develop CatFlow, a flow matching method for categorical data.<n>CatFlow is easy to implement, computationally efficient, and achieves strong results on graph generation tasks.
arXiv Detail & Related papers (2024-06-07T11:16:17Z) - Learning GFlowNets from partial episodes for improved convergence and stability [56.99229746004125]
Generative flow networks (GFlowNets) are algorithms for training a sequential sampler of discrete objects under an unnormalized target density.
Existing training objectives for GFlowNets are either local to states or transitions, or propagate a reward signal over an entire sampling trajectory.
Inspired by the TD($\lambda$) algorithm in reinforcement learning, we introduce subtrajectory balance, or SubTB($\lambda$), a GFlowNet training objective that can learn from partial action subsequences of varying lengths.
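By analogy with TD($\lambda$), a natural way to combine losses over subtrajectories of varying lengths is a geometric weighting by length. The sketch below illustrates that aggregation only; the function name `subtb_lambda` and the dict-based interface are hypothetical, and the per-subtrajectory balance losses themselves are taken as given.

```python
def subtb_lambda(sub_losses, lam):
    # sub_losses maps (i, j), with i < j, to the balance loss of the
    # subtrajectory running from state i to state j; each loss is
    # weighted geometrically by its length via lam ** (j - i).
    num = sum(lam ** (j - i) * loss for (i, j), loss in sub_losses.items())
    den = sum(lam ** (j - i) for (i, j) in sub_losses)
    return num / den
```

As `lam` shrinks, short subtrajectories dominate (local, low-variance signal); as `lam` grows toward 1, all lengths contribute equally, approaching a full-trajectory objective.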
arXiv Detail & Related papers (2022-09-26T15:44:24Z) - ShapeFlow: Dynamic Shape Interpreter for TensorFlow [10.59840927423059]
We present ShapeFlow, a dynamic abstract interpreter for TensorFlow that quickly catches shape incompatibility errors.
ShapeFlow constructs a custom shape computational graph, similar to the computational graph used by the programmer.
We evaluate ShapeFlow on 52 programs collected by prior empirical studies to show how fast and accurately it can catch shape incompatibility errors.
arXiv Detail & Related papers (2020-11-26T19:27:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.