RMFlow: Refined Mean Flow by a Noise-Injection Step for Multimodal Generation
- URL: http://arxiv.org/abs/2602.00849v1
- Date: Sat, 31 Jan 2026 18:27:05 GMT
- Title: RMFlow: Refined Mean Flow by a Noise-Injection Step for Multimodal Generation
- Authors: Yuhao Huang, Shih-Hsin Wang, Andrea L. Bertozzi, Bao Wang
- Abstract summary: Mean flow (MeanFlow) enables efficient, high-fidelity image generation, yet its single-function-evaluation (1-NFE) generation often cannot yield compelling results. We introduce RMFlow, an efficient multimodal generative model that integrates a coarse 1-NFE MeanFlow transport with a tailored noise-injection refinement step. RMFlow achieves near state-of-the-art results on text-to-image, context-to-molecule, and time-series generation using only 1-NFE, at a computational cost comparable to the baseline MeanFlows.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mean flow (MeanFlow) enables efficient, high-fidelity image generation, yet its single-function evaluation (1-NFE) generation often cannot yield compelling results. We address this issue by introducing RMFlow, an efficient multimodal generative model that integrates a coarse 1-NFE MeanFlow transport with a subsequent tailored noise-injection refinement step. RMFlow approximates the average velocity of the flow path using a neural network trained with a new loss function that balances minimizing the Wasserstein distance between probability paths and maximizing sample likelihood. RMFlow achieves near state-of-the-art results on text-to-image, context-to-molecule, and time-series generation using only 1-NFE, at a computational cost comparable to the baseline MeanFlows.
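The abstract describes a two-stage sampler: a single evaluation of a learned average-velocity (MeanFlow) network transports noise to a coarse sample, after which fresh noise is injected and a refinement step is applied. A minimal sketch of that pipeline, assuming hypothetical `mean_flow(z, r, t)` and `refiner(x)` network interfaces and a noise scale `sigma` (none of these names or details come from the paper):

```python
import numpy as np

def rmflow_sample(mean_flow, refiner, z1, sigma=0.1, rng=None):
    """Hypothetical sketch of RMFlow-style two-stage sampling.

    mean_flow(z, r, t): learned average velocity over [r, t] (assumed API)
    refiner(x):         noise-injection refinement network (assumed API)
    z1:                 Gaussian noise sample at time t = 1
    """
    rng = rng or np.random.default_rng(0)
    # Stage 1: coarse 1-NFE MeanFlow transport from t=1 down to t=0,
    # using the displacement z_0 = z_1 - (t - r) * u(z_1, r, t).
    u = mean_flow(z1, 0.0, 1.0)
    x_coarse = z1 - u
    # Stage 2: inject fresh noise, then refine the perturbed sample.
    x_noisy = x_coarse + sigma * rng.standard_normal(z1.shape)
    return refiner(x_noisy)
```

The noise-injection step perturbs the coarse output before refinement; how RMFlow couples the refiner to the injected noise, and how the Wasserstein/likelihood loss is implemented, is specified in the paper itself.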
Related papers
- Trajectory Stitching for Solving Inverse Problems with Flow-Based Models [68.36374645801901]
Flow-based generative models have emerged as powerful priors for solving inverse problems. We propose MS-Flow, which represents the trajectory as a sequence of intermediate latent states rather than a single initial code. We demonstrate the effectiveness of MS-Flow over existing methods on image recovery and inverse problems, including inpainting, super-resolution, and computed tomography.
arXiv Detail & Related papers (2026-02-09T11:36:41Z) - Flow Straighter and Faster: Efficient One-Step Generative Modeling via MeanFlow on Rectified Trajectories [14.36205662558203]
Rectified MeanFlow is a framework that models the mean velocity field along the rectified trajectory using only a single reflow step. Experiments on ImageNet at 64, 256, and 512 resolutions show that Re-MeanFlow consistently outperforms prior one-step flow distillation and Rectified Flow methods in both sample quality and training efficiency.
arXiv Detail & Related papers (2025-11-28T16:50:08Z) - AlphaFlow: Understanding and Improving MeanFlow Models [74.64465762009475]
We show that the MeanFlow objective naturally decomposes into two parts: trajectory flow matching and trajectory consistency. Motivated by these insights, we introduce $\alpha$-Flow, a broad family of objectives that unifies trajectory flow matching, Shortcut Model, and MeanFlow. When trained from scratch on class-conditional ImageNet-1K 256x256 with vanilla DiT backbones, $\alpha$-Flow consistently outperforms MeanFlow across scales and settings.
arXiv Detail & Related papers (2025-10-23T17:45:06Z) - OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows [59.052955667723985]
We present OneFlow, the first non-autoregressive multimodal model that enables variable-length and concurrent mixed-modal generation. Unlike autoregressive models that enforce rigid causal ordering between text and image generation, OneFlow combines an insertion-based Edit Flow for discrete text tokens with Flow Matching for image latents.
arXiv Detail & Related papers (2025-10-03T20:40:30Z) - Towards High-Order Mean Flow Generative Models: Feasibility, Expressivity, and Provably Efficient Criteria [14.37317011636007]
We introduce a theoretical study on Second-Order MeanFlow, a novel extension that incorporates average acceleration fields into the MeanFlow objective. We first establish the feasibility of our approach by proving that the average acceleration satisfies a generalized consistency condition analogous to first-order MeanFlow. We then characterize its expressivity via circuit complexity analysis, showing that the Second-Order MeanFlow sampling process can be implemented by uniform threshold circuits.
arXiv Detail & Related papers (2025-08-09T21:10:58Z) - Mean Flows for One-step Generative Modeling [64.4997821467102]
We propose a principled and effective framework for one-step generative modeling. A well-defined identity between average and instantaneous velocities is derived and used to guide neural network training. Our method, termed the MeanFlow model, is self-contained and requires no pre-training, distillation, or curriculum learning.
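The "identity between average and instantaneous velocities" referenced in this summary is, as presented in the MeanFlow paper, the relation between the average velocity $u$ over an interval $[r, t]$ and the instantaneous velocity $v$ (sketched here from that paper's notation; consult the paper for the precise statement):

```latex
u(z_t, r, t) \triangleq \frac{1}{t - r} \int_r^t v(z_\tau, \tau)\,\mathrm{d}\tau,
\qquad
u(z_t, r, t) = v(z_t, t) - (t - r)\,\frac{\mathrm{d}}{\mathrm{d}t}\, u(z_t, r, t).
```

Differentiating the integral definition with respect to $t$ yields the second equation, which can be enforced as a training target without simulating full trajectories.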
arXiv Detail & Related papers (2025-05-19T17:59:42Z) - Normalizing Flows are Capable Generative Models [48.31226028595099]
TarFlow is a simple and scalable architecture that enables highly performant NF models. It is straightforward to train end-to-end, and capable of directly modeling and generating pixels. TarFlow sets new state-of-the-art results on likelihood estimation for images, beating the previous best methods by a large margin.
arXiv Detail & Related papers (2024-12-09T09:28:06Z) - Boundary-aware Decoupled Flow Networks for Realistic Extreme Rescaling [49.215957313126324]
Recently developed generative methods, including invertible rescaling network (IRN) based and generative adversarial network (GAN) based methods, have demonstrated exceptional performance in image rescaling.
However, IRN-based methods tend to produce over-smoothed results, while GAN-based methods easily generate fake details.
We propose Boundary-aware Decoupled Flow Networks (BDFlow) to generate realistic and visually pleasing results.
arXiv Detail & Related papers (2024-05-05T14:05:33Z) - GMFlow: Learning Optical Flow via Global Matching [124.57850500778277]
We propose a GMFlow framework for learning optical flow estimation.
It consists of three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for global feature matching, and a self-attention layer for flow propagation.
Our new framework outperforms 32-iteration RAFT on the challenging Sintel benchmark.
arXiv Detail & Related papers (2021-11-26T18:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.