Improved Mean Flows: On the Challenges of Fastforward Generative Models
- URL: http://arxiv.org/abs/2512.02012v1
- Date: Mon, 01 Dec 2025 18:59:49 GMT
- Title: Improved Mean Flows: On the Challenges of Fastforward Generative Models
- Authors: Zhengyang Geng, Yiyang Lu, Zongze Wu, Eli Shechtman, J. Zico Kolter, Kaiming He
- Abstract summary: MeanFlow (MF) has recently been established as a framework for one-step generative modeling. Here, we address key challenges in both the training objective and the guidance mechanism. Our reformulation yields a more standard regression problem and improves the training stability. Overall, our $\textbf{improved MeanFlow}$ ($\textbf{iMF}$) method, trained entirely from scratch, achieves $\textbf{1.72}$ FID with a single function evaluation (1-NFE) on ImageNet 256$\times$256.
- Score: 81.10827083963655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: MeanFlow (MF) has recently been established as a framework for one-step generative modeling. However, its ``fastforward'' nature introduces key challenges in both the training objective and the guidance mechanism. First, the original MF's training target depends not only on the underlying ground-truth fields but also on the network itself. To address this issue, we recast the objective as a loss on the instantaneous velocity $v$, re-parameterized by a network that predicts the average velocity $u$. Our reformulation yields a more standard regression problem and improves the training stability. Second, the original MF fixes the classifier-free guidance scale during training, which sacrifices flexibility. We tackle this issue by formulating guidance as explicit conditioning variables, thereby retaining flexibility at test time. The diverse conditions are processed through in-context conditioning, which reduces model size and benefits performance. Overall, our $\textbf{improved MeanFlow}$ ($\textbf{iMF}$) method, trained entirely from scratch, achieves $\textbf{1.72}$ FID with a single function evaluation (1-NFE) on ImageNet 256$\times$256. iMF substantially outperforms prior methods of this kind and closes the gap with multi-step methods while using no distillation. We hope our work will further advance fastforward generative modeling as a stand-alone paradigm.
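For context, the v-space reformulation described in the abstract can be sketched from the identity between average and instantaneous velocities introduced in the original Mean Flows paper (listed under related papers below); the exact loss weighting and stop-gradient placement used by iMF are not given here, so the final regression loss is only an illustrative reconstruction:

$$ u(z_t, r, t) = \frac{1}{t-r}\int_r^t v(z_\tau, \tau)\,\mathrm{d}\tau \quad\Longleftrightarrow\quad v(z_t, t) = u(z_t, r, t) + (t-r)\,\frac{\mathrm{d}}{\mathrm{d}t}\,u(z_t, r, t). $$

With a network $u_\theta$ predicting the average velocity, the implied instantaneous velocity $\hat{v}_\theta(z_t, t) = u_\theta(z_t, r, t) + (t-r)\,\frac{\mathrm{d}}{\mathrm{d}t}\,u_\theta(z_t, r, t)$ can then be regressed onto the ground-truth conditional velocity of the linear path $z_t = (1-t)x + t\varepsilon$,

$$ \mathcal{L}(\theta) = \mathbb{E}_{x,\,\varepsilon,\,r,\,t}\,\big\|\hat{v}_\theta(z_t, t) - (\varepsilon - x)\big\|_2^2, $$

so that, unlike in the original MF objective, the regression target does not depend on the network itself.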
Related papers
- MeanFlow Transformers with Representation Autoencoders [71.45823902973349]
MeanFlow (MF) is a diffusion-motivated generative model that enables efficient few-step generation by learning long jumps directly from noise to data. We develop an efficient training and sampling scheme for MF in the latent space of a Representation Autoencoder (RAE). We achieve a 1-step FID of 2.03, outperforming vanilla MF's 3.43, while reducing sampling GFLOPS by 38% and total training cost by 83% on ImageNet 256.
arXiv Detail & Related papers (2025-11-17T06:17:08Z)
- Modular MeanFlow: Towards Stable and Scalable One-Step Generative Modeling [0.07646713951724012]
One-step generative modeling seeks to generate high-quality data samples in a single function evaluation. In this work, we introduce Modular MeanFlow, a flexible and theoretically grounded approach for learning time-averaged velocity fields.
arXiv Detail & Related papers (2025-08-24T16:00:08Z)
- Flow-Anchored Consistency Models [32.04797599813587]
Continuous-time Consistency Models (CMs) promise efficient few-step generation but face challenges with training instability. We argue this instability stems from a fundamental conflict: by training a network to learn only a shortcut across a probability flow, the model loses its grasp on the instantaneous velocity field that defines the flow. We introduce the Flow-Anchored Consistency Model (FACM), a simple but effective training strategy that uses a Flow Matching task as an anchor for the primary CM shortcut objective.
arXiv Detail & Related papers (2025-07-04T17:56:51Z)
- Intention-Conditioned Flow Occupancy Models [80.42634994902858]
Large-scale pre-training has fundamentally changed how machine learning research is done today. Applying this same framework to reinforcement learning is appealing because it offers compelling avenues for addressing core challenges in RL. Recent advances in generative AI have provided new tools for modeling highly complex distributions.
arXiv Detail & Related papers (2025-06-10T15:27:46Z)
- Mean Flows for One-step Generative Modeling [64.4997821467102]
We propose a principled and effective framework for one-step generative modeling. A well-defined identity between average and instantaneous velocities is derived and used to guide neural network training. Our method, termed the MeanFlow model, is self-contained and requires no pre-training, distillation, or curriculum learning.
arXiv Detail & Related papers (2025-05-19T17:59:42Z)
- Truncated Consistency Models [57.50243901368328]
Training consistency models requires learning to map all intermediate points along PF ODE trajectories to their corresponding endpoints. We empirically find that this training paradigm limits the one-step generation performance of consistency models. We propose a new parameterization of the consistency function and a two-stage training procedure that prevents the truncated-time training from collapsing to a trivial solution.
arXiv Detail & Related papers (2024-10-18T22:38:08Z)
- FINE: Factorizing Knowledge for Initialization of Variable-sized Diffusion Models [35.40065954148091]
FINE is a method based on the Learngene framework for initializing downstream networks by leveraging pre-trained models.
It decomposes pre-trained knowledge into the product of matrices (i.e., $U$, $\Sigma$, and $V$), where $U$ and $V$ are shared across network blocks as ``learngenes''.
It consistently outperforms direct pre-training, particularly for smaller models, achieving state-of-the-art results across variable model sizes.
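A toy sketch of this factorized-initialization idea (shapes, names, and the truncation scheme are illustrative assumptions; the summary does not specify which factors FINE learns or how blocks are matched):

```python
import torch

# Toy sketch (hypothetical shapes/names): initialize variable-sized blocks from one
# pretrained weight via an SVD-style factorization W ~ U @ diag(S) @ Vh, where the
# factors U and Vh play the role of shared "learngenes" and each block keeps only a
# truncated, block-specific set of singular values.
pretrained_W = torch.randn(512, 512)        # stand-in for a pretrained layer weight
U, S, Vh = torch.linalg.svd(pretrained_W)   # U: (512, 512), S: (512,), Vh: (512, 512)

def init_block(d_out: int, d_in: int, rank: int) -> torch.Tensor:
    """Build a (d_out, d_in) weight from the shared factors, truncated to `rank`."""
    return U[:d_out, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :d_in]

small_w = init_block(d_out=256, d_in=256, rank=64)    # smaller downstream block
large_w = init_block(d_out=512, d_in=512, rank=256)   # larger downstream block
print(small_w.shape, large_w.shape)
```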
arXiv Detail & Related papers (2024-09-28T08:57:17Z)
- DEFT: Efficient Fine-Tuning of Diffusion Models by Learning the Generalised $h$-transform [44.29325094229024]
We propose DEFT (Doob's $h$-transform Efficient FineTuning), a new approach for conditional generation that simply fine-tunes a very small network to quickly learn the conditional $h$-transform. On image reconstruction tasks, we achieve speedups of up to 1.6$\times$, while having the best perceptual quality on natural images and reconstruction performance on medical images.
arXiv Detail & Related papers (2024-06-03T20:52:34Z)
- Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space [51.62131362670815]
This paper addresses the problem of ranking pre-trained deep neural networks and screening the most transferable ones for downstream tasks.
It proposes a new transferability metric called Self-challenging Fisher Discriminant Analysis (SFDA).
arXiv Detail & Related papers (2022-07-07T01:33:25Z)