Optimal Control Meets Flow Matching: A Principled Route to Multi-Subject Fidelity
- URL: http://arxiv.org/abs/2510.02315v1
- Date: Thu, 02 Oct 2025 17:59:58 GMT
- Title: Optimal Control Meets Flow Matching: A Principled Route to Multi-Subject Fidelity
- Authors: Eric Tillmann Bill, Enis Simsar, Thomas Hofmann
- Abstract summary: Text-to-image (T2I) models excel on single-entity prompts but struggle with multi-subject descriptions. We introduce the first theoretical framework with a principled, optimizable objective for steering sampling dynamics toward multi-subject fidelity.
- Score: 35.95129874095729
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image (T2I) models excel on single-entity prompts but struggle with multi-subject descriptions, often showing attribute leakage, identity entanglement, and subject omissions. We introduce the first theoretical framework with a principled, optimizable objective for steering sampling dynamics toward multi-subject fidelity. Viewing flow matching (FM) through stochastic optimal control (SOC), we formulate subject disentanglement as control over a trained FM sampler. This yields two architecture-agnostic algorithms: (i) a training-free test-time controller that perturbs the base velocity with a single-pass update, and (ii) Adjoint Matching, a lightweight fine-tuning rule that regresses a control network to a backward adjoint signal while preserving base-model capabilities. The same formulation unifies prior attention heuristics, extends to diffusion models via a flow-diffusion correspondence, and provides the first fine-tuning route explicitly designed for multi-subject fidelity. Empirically, on Stable Diffusion 3.5, FLUX, and Stable Diffusion XL, both algorithms consistently improve multi-subject alignment while maintaining base-model style. Test-time control runs efficiently on commodity GPUs, and fine-tuned controllers trained on limited prompts generalize to unseen ones. We further highlight FOCUS (Flow Optimal Control for Unentangled Subjects), which achieves state-of-the-art multi-subject fidelity across models.
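A minimal 1-D sketch of algorithm (i), the training-free test-time controller: an Euler sampler whose base velocity is perturbed by an additive control term. The base field, control rule, targets, and weights here are illustrative stand-ins, not the paper's actual FOCUS implementation.

```python
def base_velocity(t, x):
    # Toy stand-in for a trained flow-matching velocity field v_theta(t, x):
    # it transports samples toward a single "subject" located at 2.0.
    return 2.0 - x

def control(t, x):
    # Hypothetical test-time control u(t, x): a single-pass perturbation
    # that nudges the trajectory toward a second "subject" at 3.0.
    return 0.5 * (3.0 - x)

def euler_sample(x0, steps=100, controlled=True):
    # Integrate dx/dt = v(t, x) [+ u(t, x)] from t = 0 to t = 1.
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = base_velocity(t, x)
        if controlled:
            v += control(t, x)
        x += v * dt
    return x
```

With the control switched on, the endpoint of the trajectory moves between the two targets; with it switched off, the sampler tracks the base model alone.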
Related papers
- Solving Inverse Problems with Flow-based Models via Model Predictive Control [41.551726534223704]
MPC-Flow is a model predictive control framework that formulates inverse problem solving with flow-based generative models as a sequence of control sub-problems. We show how different algorithmic choices yield a spectrum of guidance algorithms, including regimes that avoid backpropagation through the generative model trajectory.
arXiv Detail & Related papers (2026-01-30T17:59:09Z)
- DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving [65.7087560656003]
Generative diffusion models for end-to-end autonomous driving often suffer from mode collapse. We propose DiffusionDriveV2, which leverages reinforcement learning to constrain low-quality modes and explore superior trajectories. This significantly enhances overall output quality while preserving the inherent multimodality of its core Gaussian Mixture Model.
arXiv Detail & Related papers (2025-12-08T17:29:52Z)
- ScaleWeaver: Weaving Efficient Controllable T2I Generation with Multi-Scale Reference Attention [86.93601565563954]
ScaleWeaver is a framework designed to achieve high-fidelity, controllable generation on top of advanced visual autoregressive (VAR) models. The proposed Reference Attention module discards unnecessary image→condition attention, reducing computational cost. Experiments show that ScaleWeaver delivers high-quality generation and precise control while attaining superior efficiency over diffusion-based methods.
arXiv Detail & Related papers (2025-10-16T17:00:59Z)
- Rectified-CFG++ for Flow Based Models [26.896426878221718]
We present Rectified-CFG++, an adaptive predictor-corrector guidance scheme that couples the deterministic efficiency of rectified flows with a geometry-aware conditioning rule. Experiments on large-scale text-to-image models (FLUX, Stable Diffusion 3/3.5, Lumina) show that Rectified-CFG++ consistently outperforms standard CFG on benchmark datasets.
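For context, the standard classifier-free guidance (CFG) rule that Rectified-CFG++ improves on can be written as a fixed linear combination of velocity fields. The sketch below shows that baseline rule only; the adaptive predictor-corrector and geometry-aware pieces of Rectified-CFG++ are not reproduced here, and the guidance weight is illustrative.

```python
def cfg_velocity(v_cond, v_uncond, t, x, w=3.5):
    # Standard classifier-free guidance on a flow velocity field:
    #   v_guided = v_uncond + w * (v_cond - v_uncond)
    # With w = 1 this reduces to the conditional field, with w = 0 to the
    # unconditional one; w > 1 extrapolates toward the condition.
    vu = v_uncond(t, x)
    vc = v_cond(t, x)
    return vu + w * (vc - vu)
```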
arXiv Detail & Related papers (2025-10-09T00:00:47Z)
- Merge and Guide: Unifying Model Merging and Guided Decoding for Controllable Multi-Objective Generation [49.98025799046136]
We introduce Merge-And-GuidE (MAGE), a two-stage framework that leverages model merging for guided decoding. In Stage 1, MAGE resolves a compatibility problem between the guidance and base models. In Stage 2, we merge explicit and implicit value models into a unified guidance proxy, which then steers the decoding of the base model from Stage 1.
arXiv Detail & Related papers (2025-10-04T11:10:07Z)
- Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models [25.59542599768357]
Diffusion Blend is a novel approach to inference-time multi-preference alignment. We instantiate this approach with two algorithms: DB-MPA for multi-reward alignment and DB-KLA for KL regularization control.
arXiv Detail & Related papers (2025-05-24T06:27:55Z)
- SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs. We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions. With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z)
- Taming Flow Matching with Unbalanced Optimal Transport into Fast Pansharpening [10.23957420290553]
We propose the Optimal Transport Flow Matching (OTFM) framework to achieve one-step, high-quality pansharpening. The OTFM framework enables simulation-free training and single-step inference while maintaining strict adherence to pansharpening constraints.
arXiv Detail & Related papers (2025-03-19T08:10:49Z)
- Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence [19.484676783876306]
Diffusion models have emerged as powerful tools for generative modeling. We propose a stochastic control framework for fine-tuning diffusion models. We show that PI-FT achieves global convergence at a linear rate.
arXiv Detail & Related papers (2024-12-24T04:55:46Z)
- Consistency Flow Matching: Defining Straight Flows with Velocity Consistency [97.28511135503176]
We introduce Consistency Flow Matching (Consistency-FM), a novel FM method that explicitly enforces self-consistency in the velocity field.
Preliminary experiments demonstrate that our Consistency-FM significantly improves training efficiency by converging 4.4x faster than consistency models.
arXiv Detail & Related papers (2024-07-02T16:15:37Z)
- Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models to plan generation in the offline reinforcement learning setting, achieving a speedup compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z)
- Improving and generalizing flow-based generative models with minibatch optimal transport [90.01613198337833]
We introduce the generalized conditional flow matching (CFM) technique for continuous normalizing flows (CNFs).
CFM features a stable regression objective like that used to train the flow in diffusion models but enjoys the efficient inference of deterministic flow models.
A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference.
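The CFM objective described above regresses a velocity model onto a conditional target velocity along straight (optimal transport) paths between paired samples. A minimal 1-D sketch follows; the toy source and data distributions are chosen only for illustration, not taken from the paper.

```python
import random

def ot_cfm_pair(x0, x1, t):
    # Straight-line (OT) conditional path: x_t = (1 - t) * x0 + t * x1,
    # with conditional target velocity u_t(x_t | x0, x1) = x1 - x0.
    xt = (1.0 - t) * x0 + t * x1
    target_v = x1 - x0
    return xt, target_v

def cfm_loss(v_model, n=1000, seed=0):
    # Monte-Carlo estimate of the CFM regression loss (1-D toy version):
    #   E_{t, x0, x1} [ (v_model(t, x_t) - (x1 - x0))^2 ]
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x0 = rng.gauss(0.0, 1.0)   # source (noise) sample
        x1 = rng.gauss(2.0, 0.5)   # toy "data" sample
        t = rng.random()
        xt, target = ot_cfm_pair(x0, x1, t)
        total += (v_model(t, xt) - target) ** 2
    return total / n
```

A model that matches the mean displacement incurs a lower loss than one that ignores it; training would minimize this objective over the model's parameters.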
arXiv Detail & Related papers (2023-02-01T14:47:17Z)
- Comparative analysis of machine learning methods for active flow control [60.53767050487434]
Genetic Programming (GP) and Reinforcement Learning (RL) are gaining popularity in flow control.
This work presents a comparative analysis of the two, benchmarking some of their most representative algorithms against global optimization techniques.
arXiv Detail & Related papers (2022-02-23T18:11:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.