Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models
- URL: http://arxiv.org/abs/2602.12529v1
- Date: Fri, 13 Feb 2026 02:21:59 GMT
- Title: Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models
- Authors: Bowen Ping, Chengyou Jia, Minnan Luo, Hangwei Qian, Ivor Tsang
- Abstract summary: Flow-Factory is a framework that decouples algorithms, models, and rewards through a modular, registry-based architecture. It empowers researchers to rapidly prototype and scale future innovations with ease. Flow-Factory provides production-ready memory optimization, flexible multi-reward training, and seamless distributed training support.
- Score: 30.65606997113044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning has emerged as a promising paradigm for aligning diffusion and flow-matching models with human preferences, yet practitioners face fragmented codebases, model-specific implementations, and engineering complexity. We introduce Flow-Factory, a unified framework that decouples algorithms, models, and rewards through a modular, registry-based architecture. This design enables seamless integration of new algorithms and architectures, as demonstrated by our support for GRPO, DiffusionNFT, and AWM across Flux, Qwen-Image, and WAN video models. By minimizing implementation overhead, Flow-Factory empowers researchers to rapidly prototype and scale future innovations with ease. Flow-Factory provides production-ready memory optimization, flexible multi-reward training, and seamless distributed training support. The codebase is available at https://github.com/X-GenGroup/Flow-Factory.
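The abstract's central design claim is the registry-based decoupling of algorithms, models, and rewards. The pattern can be illustrated with a minimal sketch; the names below (Registry, ALGORITHMS, GRPOTrainer, the "grpo" key) are hypothetical illustrations, not Flow-Factory's actual API.

```python
# Minimal sketch of a registry-based design that decouples algorithms,
# models, and rewards. All names here are hypothetical illustrations,
# not Flow-Factory's actual API.

class Registry:
    """Maps string keys to classes so components stay decoupled."""

    def __init__(self, name: str):
        self.name = name
        self._store = {}

    def register(self, key: str):
        def wrap(cls):
            self._store[key] = cls
            return cls
        return wrap

    def build(self, key: str, **kwargs):
        if key not in self._store:
            raise KeyError(f"{key!r} is not registered in {self.name}")
        return self._store[key](**kwargs)


ALGORITHMS = Registry("algorithms")
MODELS = Registry("models")
REWARDS = Registry("rewards")


@ALGORITHMS.register("grpo")
class GRPOTrainer:
    """Placeholder trainer; a real one would implement the RL loop."""

    def __init__(self, model, reward, group_size: int = 8):
        self.model, self.reward, self.group_size = model, reward, group_size


# Usage: a new algorithm, model, or reward plugs in by registration alone,
# e.g. trainer = ALGORITHMS.build("grpo", model=..., reward=...)
```

Under such a pattern, the trainer only sees whatever the registries hand it, so adding a new algorithm or backbone is a single registration rather than a change to existing training code.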
Related papers
- Efficient Training of Diffusion Mixture-of-Experts Models: A Practical Recipe [51.26601054313749]
Recent efforts on Diffusion MoE models have primarily focused on developing more sophisticated routing mechanisms. Inspired by the MoE design paradigms established in large language models (LLMs), we identify a set of crucial architectural factors for building effective Diffusion MoE models. We present novel architectures that can be efficiently applied to both latent and pixel-space diffusion frameworks.
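For background on what a "routing mechanism" does, a standard top-k token router, the baseline that such Diffusion MoE work builds on, can be sketched as follows; this is generic background, not the paper's proposed architecture.

```python
import torch
import torch.nn.functional as F

def top_k_route(tokens: torch.Tensor, gate: torch.nn.Linear, k: int = 2):
    """Generic top-k MoE routing (background only, not the paper's method).

    tokens: (num_tokens, d_model); gate maps d_model -> num_experts.
    Each token is dispatched to its k highest-scoring experts, with the
    kept gate probabilities renormalized as mixing weights.
    """
    probs = F.softmax(gate(tokens), dim=-1)     # (num_tokens, num_experts)
    weights, experts = probs.topk(k, dim=-1)    # best k experts per token
    weights = weights / weights.sum(dim=-1, keepdim=True)
    return experts, weights
```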
arXiv Detail & Related papers (2025-12-01T03:52:31Z)
- NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching [64.10695425442164]
We introduce NExT-OMNI, an open-source omnimodal foundation model that achieves unified modeling through discrete flow paradigms. Trained on large-scale interleaved text, image, video, and audio data, NExT-OMNI delivers competitive performance on multimodal generation and understanding benchmarks. To advance further research, we release training details, data protocols, and open-source both the code and model checkpoints.
arXiv Detail & Related papers (2025-10-15T16:25:18Z)
- Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models [49.911784762244814]
TraceRL is a trajectory-aware reinforcement learning framework for diffusion language models (DLMs). We derive a series of state-of-the-art diffusion language models, namely TraDo. TraDo-8B-Instruct achieves relative accuracy improvements of 6.1% over Qwen2.5-7B-Instruct and 51.3% over Llama3.1-8B-Instruct on mathematical reasoning benchmarks.
arXiv Detail & Related papers (2025-09-08T17:58:06Z)
- FedPromo: Federated Lightweight Proxy Models at the Edge Bring New Domains to Foundation Models [16.83959862897466]
Federated Learning (FL) is an established paradigm for training deep learning models on decentralized data. We introduce FedPromo, a novel framework that enables efficient adaptation of large-scale foundation models stored on a central server to new domains encountered only by remote clients.
arXiv Detail & Related papers (2025-08-05T12:00:49Z)
- JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation [36.93638123812204]
We present JanusFlow, a powerful framework that unifies image understanding and generation in a single model. JanusFlow integrates autoregressive language models with rectified flow, a state-of-the-art method in generative modeling.
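Rectified flow itself is a standard objective: the network learns the constant velocity of a straight path between noise and data. A minimal training loss, independent of JanusFlow's specifics (velocity_net is an assumed network taking a noisy sample and a timestep), looks like this:

```python
import torch

def rectified_flow_loss(velocity_net, x1: torch.Tensor) -> torch.Tensor:
    """Standard rectified-flow objective (generic, not JanusFlow-specific).

    x1: a batch of data samples; velocity_net(x_t, t) predicts velocity.
    The target velocity along the straight path x_t = (1-t)*x0 + t*x1
    is simply x1 - x0.
    """
    x0 = torch.randn_like(x1)                      # noise endpoint
    t = torch.rand(x1.shape[0], *([1] * (x1.dim() - 1)), device=x1.device)
    xt = (1 - t) * x0 + t * x1                     # point on the straight path
    v_target = x1 - x0                             # constant along the path
    return torch.mean((velocity_net(xt, t) - v_target) ** 2)
```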
arXiv Detail & Related papers (2024-11-12T17:55:10Z)
- LLM-KT: A Versatile Framework for Knowledge Transfer from Large Language Models to Collaborative Filtering [0.07793154724386657]
We present a flexible framework designed to enhance collaborative filtering (CF) models by seamlessly integrating LLM-generated features.
Our framework injects these features into an intermediate layer of any CF model, allowing the model to reconstruct and leverage the embeddings internally.
Our framework is built for easy integration and modification, providing researchers and developers with a powerful tool for extending CF model capabilities.
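The injection described here amounts to projecting LLM-generated features into a hidden layer of the CF tower; the module below is a generic, hypothetical illustration of that wiring, not the paper's code.

```python
import torch
import torch.nn as nn

class CFWithLLMFeatures(nn.Module):
    """Toy CF tower that injects LLM-generated features mid-network.

    Generic illustration of the idea; module names are hypothetical.
    """

    def __init__(self, d_cf: int = 64, d_llm: int = 768, d_out: int = 1):
        super().__init__()
        self.lower = nn.Linear(d_cf, d_cf)       # layers below the injection point
        self.project = nn.Linear(d_llm, d_cf)    # maps LLM features into CF space
        self.upper = nn.Linear(2 * d_cf, d_out)  # layers above the injection point

    def forward(self, cf_input: torch.Tensor, llm_feat: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.lower(cf_input))
        z = torch.relu(self.project(llm_feat))   # injected intermediate features
        return self.upper(torch.cat([h, z], dim=-1))
```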
arXiv Detail & Related papers (2024-11-01T13:09:30Z)
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently the two mainstream methods for adapting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
- TinyLLaVA Factory: A Modularized Codebase for Small-scale Large Multimodal Models [22.214259364977256]
We present TinyLLaVA Factory, an open-source modular codebase for small-scale large multimodal models (LMMs).
TinyLLaVA Factory modularizes the entire system into interchangeable components, with each component integrating a suite of cutting-edge models and methods.
In addition to allowing users to customize their own LMMs, TinyLLaVA Factory provides popular training recipes to let users pretrain and finetune their models with less coding effort.
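In a factory of interchangeable components, a training recipe typically reduces to a declarative configuration that names each component; the dictionary below illustrates the pattern with hypothetical keys and values, not TinyLLaVA Factory's actual schema.

```python
# Hypothetical recipe illustrating the interchangeable-component pattern;
# keys and values are illustrative, not TinyLLaVA Factory's actual schema.
recipe = {
    "llm": "tinyllama-1.1b",        # small language-model backbone
    "vision_tower": "siglip-base",  # swappable vision encoder
    "connector": "mlp2x_gelu",      # projector between the two modalities
    "training": {"stage": "finetune", "lr": 2e-5, "epochs": 1},
}
# A factory would resolve each key against a registry of implementations,
# so changing one string swaps a component without code changes.
```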
arXiv Detail & Related papers (2024-05-20T05:11:02Z)
- Vertical Federated Learning over Cloud-RAN: Convergence Analysis and System Optimization [82.12796238714589]
We propose a novel cloud radio access network (Cloud-RAN) based vertical FL system to enable fast and accurate model aggregation.
We characterize the convergence behavior of the vertical FL algorithm considering both uplink and downlink transmissions.
We establish a system optimization framework by joint transceiver and fronthaul quantization design, for which successive convex approximation and alternate convex search based system optimization algorithms are developed.
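As background, in vertical FL the parties hold different feature columns of the same samples, and the server fuses their partial embeddings; the sketch below shows only that data flow and ignores the Cloud-RAN transceiver and fronthaul quantization design the paper optimizes.

```python
import torch
import torch.nn as nn

# Minimal vertical-FL forward pass: two parties hold disjoint feature
# columns of the same samples; the server fuses their embeddings.
# Transmission effects (the paper's focus) are ignored here.

party_a = nn.Linear(10, 8)    # party A encodes its 10 feature columns
party_b = nn.Linear(6, 8)     # party B encodes its 6 feature columns
server = nn.Linear(16, 2)     # server head on the fused embedding

x_a, x_b = torch.randn(32, 10), torch.randn(32, 6)        # same 32 samples, split features
fused = torch.cat([party_a(x_a), party_b(x_b)], dim=-1)   # "uplink" of embeddings
logits = server(fused)
```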
arXiv Detail & Related papers (2023-05-04T09:26:03Z)
- FedHM: Efficient Federated Learning for Heterogeneous Models via Low-rank Factorization [16.704006420306353]
A scalable federated learning framework should address heterogeneous clients equipped with different computation and communication capabilities.
This paper proposes FedHM, a novel federated model compression framework that distributes the heterogeneous low-rank models to clients and then aggregates them into a global full-rank model.
Our solution enables the training of heterogeneous local models with varying computational complexities and aggregates them into a single global model.
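The compression step, handing each client rank-r factors matched to its budget and recomposing a full-shape matrix for aggregation, can be sketched with a truncated SVD; this is a generic illustration of the idea, not FedHM's implementation.

```python
import torch

def to_low_rank(weight: torch.Tensor, rank: int):
    """Compress a weight matrix to rank-r factors via truncated SVD.

    Generic illustration of the low-rank idea, not FedHM's code:
    weaker clients receive smaller ranks, and U @ V recovers a
    full-shape (approximate) matrix for server-side aggregation.
    """
    u, s, vh = torch.linalg.svd(weight, full_matrices=False)
    u_r = u[:, :rank] * s[:rank]          # absorb singular values into U
    return u_r, vh[:rank, :]              # factors sent to the client

w = torch.randn(256, 512)
u_r, v_r = to_low_rank(w, rank=32)        # low-rank model for a weak client
w_approx = u_r @ v_r                      # full-shape matrix for aggregation
```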
arXiv Detail & Related papers (2021-11-29T16:11:09Z)
- Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a shared model.
In most current training schemes, the central model is refined by averaging the parameters of the server model with the updated parameters from the client side.
We propose ensemble distillation for model fusion, i.e., training the central classifier on unlabeled data using the outputs of the client models.
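That fusion step is standard knowledge distillation: averaged client predictions on unlabeled data serve as soft targets for the central model. A minimal version, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def distill_step(server_model, client_models, unlabeled_x, T: float = 2.0):
    """One ensemble-distillation step (generic sketch of the idea).

    The averaged client predictions on unlabeled data act as soft
    targets; the server minimizes KL divergence against them.
    """
    with torch.no_grad():
        teacher = torch.stack(
            [F.softmax(m(unlabeled_x) / T, dim=-1) for m in client_models]
        ).mean(dim=0)                                  # ensemble soft targets
    student = F.log_softmax(server_model(unlabeled_x) / T, dim=-1)
    return F.kl_div(student, teacher, reduction="batchmean") * T * T
```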
arXiv Detail & Related papers (2020-06-12T14:49:47Z)