RMFlow: Refined Mean Flow by a Noise-Injection Step for Multimodal Generation
- URL: http://arxiv.org/abs/2602.00849v1
- Date: Sat, 31 Jan 2026 18:27:05 GMT
- Title: RMFlow: Refined Mean Flow by a Noise-Injection Step for Multimodal Generation
- Authors: Yuhao Huang, Shih-Hsin Wang, Andrea L. Bertozzi, Bao Wang
- Abstract summary: Mean flow (MeanFlow) enables efficient, high-fidelity image generation, yet its single-function-evaluation (1-NFE) generation often cannot yield compelling results. We introduce RMFlow, an efficient multimodal generative model that integrates a coarse 1-NFE MeanFlow transport with a tailored noise-injection refinement step. RMFlow achieves near state-of-the-art results on text-to-image, context-to-molecule, and time-series generation using only 1-NFE, at a computational cost comparable to the baseline MeanFlows.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mean flow (MeanFlow) enables efficient, high-fidelity image generation, yet its single-function evaluation (1-NFE) generation often cannot yield compelling results. We address this issue by introducing RMFlow, an efficient multimodal generative model that integrates a coarse 1-NFE MeanFlow transport with a subsequent tailored noise-injection refinement step. RMFlow approximates the average velocity of the flow path using a neural network trained with a new loss function that balances minimizing the Wasserstein distance between probability paths and maximizing sample likelihood. RMFlow achieves near state-of-the-art results on text-to-image, context-to-molecule, and time-series generation using only 1-NFE, at a computational cost comparable to the baseline MeanFlows.
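The abstract describes a two-stage sampler: a single evaluation of a learned average-velocity (MeanFlow) network transports noise to a coarse sample, after which fresh noise is injected and a refinement step is applied. A minimal sketch of that pipeline, assuming hypothetical `mean_flow(z, r, t)` and `refiner(x)` network interfaces and a noise scale `sigma` (none of these names or details come from the paper):

```python
import numpy as np

def rmflow_sample(mean_flow, refiner, z1, sigma=0.1, rng=None):
    """Hypothetical sketch of RMFlow-style two-stage sampling.

    mean_flow(z, r, t): learned average velocity over [r, t] (assumed API)
    refiner(x):         noise-injection refinement network (assumed API)
    z1:                 Gaussian noise sample at time t = 1
    """
    rng = rng or np.random.default_rng(0)
    # Stage 1: coarse 1-NFE MeanFlow transport from t=1 down to t=0,
    # using the displacement z_0 = z_1 - (t - r) * u(z_1, r, t).
    u = mean_flow(z1, 0.0, 1.0)
    x_coarse = z1 - u
    # Stage 2: inject fresh noise, then refine the perturbed sample.
    x_noisy = x_coarse + sigma * rng.standard_normal(z1.shape)
    return refiner(x_noisy)
```

The noise-injection step perturbs the coarse output before refinement; how RMFlow couples the refiner to the injected noise, and how the Wasserstein/likelihood loss is implemented, is specified in the paper itself.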
Related papers
- Trajectory Stitching for Solving Inverse Problems with Flow-Based Models [68.36374645801901]
Flow-based generative models have emerged as powerful priors for solving inverse problems. We propose MS-Flow, which represents the trajectory as a sequence of intermediate latent states rather than a single initial code. We demonstrate the effectiveness of MS-Flow over existing methods on image recovery and inverse problems, including inpainting, super-resolution, and computed tomography.
arXiv Detail & Related papers (2026-02-09T11:36:41Z) - Flow Straighter and Faster: Efficient One-Step Generative Modeling via MeanFlow on Rectified Trajectories [14.36205662558203]
Rectified MeanFlow is a framework that models the mean velocity field along the rectified trajectory using only a single reflow step. Experiments on ImageNet at 64, 256, and 512 resolutions show that Re-MeanFlow consistently outperforms prior one-step flow distillation and Rectified Flow methods in both sample quality and training efficiency.
arXiv Detail & Related papers (2025-11-28T16:50:08Z) - AlphaFlow: Understanding and Improving MeanFlow Models [74.64465762009475]
We show that the MeanFlow objective naturally decomposes into two parts: trajectory flow matching and trajectory consistency. Motivated by these insights, we introduce $\alpha$-Flow, a broad family of objectives that unifies trajectory flow matching, Shortcut Model, and MeanFlow. When trained from scratch on class-conditional ImageNet-1K 256x256 with vanilla DiT backbones, $\alpha$-Flow consistently outperforms MeanFlow across scales and settings.
arXiv Detail & Related papers (2025-10-23T17:45:06Z) - OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows [59.052955667723985]
We present OneFlow, the first non-autoregressive multimodal model that enables variable-length and concurrent mixed-modal generation. Unlike autoregressive models that enforce rigid causal ordering between text and image generation, OneFlow combines an insertion-based Edit Flow for discrete text tokens with Flow Matching for image latents.
arXiv Detail & Related papers (2025-10-03T20:40:30Z) - Towards High-Order Mean Flow Generative Models: Feasibility, Expressivity, and Provably Efficient Criteria [14.37317011636007]
We introduce a theoretical study on Second-Order MeanFlow, a novel extension that incorporates average acceleration fields into the MeanFlow objective. We first establish the feasibility of our approach by proving that the average acceleration satisfies a generalized consistency condition analogous to first-order MeanFlow. We then characterize its expressivity via circuit complexity analysis, showing that the Second-Order MeanFlow sampling process can be implemented by uniform threshold circuits.
arXiv Detail & Related papers (2025-08-09T21:10:58Z) - Mean Flows for One-step Generative Modeling [64.4997821467102]
We propose a principled and effective framework for one-step generative modeling. A well-defined identity between average and instantaneous velocities is derived and used to guide neural network training. Our method, termed the MeanFlow model, is self-contained and requires no pre-training, distillation, or curriculum learning.
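The "identity between average and instantaneous velocities" referenced in this summary is, as presented in the MeanFlow paper, the relation between the average velocity $u$ over an interval $[r, t]$ and the instantaneous velocity $v$ (sketched here from that paper's notation; consult the paper for the precise statement):

```latex
u(z_t, r, t) \triangleq \frac{1}{t - r} \int_r^t v(z_\tau, \tau)\,\mathrm{d}\tau,
\qquad
u(z_t, r, t) = v(z_t, t) - (t - r)\,\frac{\mathrm{d}}{\mathrm{d}t}\, u(z_t, r, t).
```

Differentiating the integral definition with respect to $t$ yields the second equation, which can be enforced as a training target without simulating full trajectories.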
arXiv Detail & Related papers (2025-05-19T17:59:42Z) - Normalizing Flows are Capable Generative Models [48.31226028595099]
TarFlow is a simple and scalable architecture that enables highly performant NF models. It is straightforward to train end-to-end, and capable of directly modeling and generating pixels. TarFlow sets new state-of-the-art results on likelihood estimation for images, beating the previous best methods by a large margin.
arXiv Detail & Related papers (2024-12-09T09:28:06Z) - Boundary-aware Decoupled Flow Networks for Realistic Extreme Rescaling [49.215957313126324]
Recently developed generative methods, including invertible rescaling network (IRN) based and generative adversarial network (GAN) based methods, have demonstrated exceptional performance in image rescaling.
However, IRN-based methods tend to produce over-smoothed results, while GAN-based methods easily generate fake details.
We propose Boundary-aware Decoupled Flow Networks (BDFlow) to generate realistic and visually pleasing results.
arXiv Detail & Related papers (2024-05-05T14:05:33Z) - GMFlow: Learning Optical Flow via Global Matching [124.57850500778277]
We propose a GMFlow framework for learning optical flow estimation.
It consists of three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for global feature matching, and a self-attention layer for flow propagation.
Our new framework outperforms 32-iteration RAFT on the challenging Sintel benchmark.
arXiv Detail & Related papers (2021-11-26T18:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.