Optimal Control Meets Flow Matching: A Principled Route to Multi-Subject Fidelity
- URL: http://arxiv.org/abs/2510.02315v1
- Date: Thu, 02 Oct 2025 17:59:58 GMT
- Title: Optimal Control Meets Flow Matching: A Principled Route to Multi-Subject Fidelity
- Authors: Eric Tillmann Bill, Enis Simsar, Thomas Hofmann
- Abstract summary: Text-to-image (T2I) models excel on single-entity prompts but struggle with multi-subject descriptions. We introduce the first theoretical framework with a principled, optimizable objective for steering sampling dynamics toward multi-subject fidelity.
- Score: 35.95129874095729
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image (T2I) models excel on single-entity prompts but struggle with multi-subject descriptions, often showing attribute leakage, identity entanglement, and subject omissions. We introduce the first theoretical framework with a principled, optimizable objective for steering sampling dynamics toward multi-subject fidelity. Viewing flow matching (FM) through stochastic optimal control (SOC), we formulate subject disentanglement as control over a trained FM sampler. This yields two architecture-agnostic algorithms: (i) a training-free test-time controller that perturbs the base velocity with a single-pass update, and (ii) Adjoint Matching, a lightweight fine-tuning rule that regresses a control network to a backward adjoint signal while preserving base-model capabilities. The same formulation unifies prior attention heuristics, extends to diffusion models via a flow-diffusion correspondence, and provides the first fine-tuning route explicitly designed for multi-subject fidelity. Empirically, on Stable Diffusion 3.5, FLUX, and Stable Diffusion XL, both algorithms consistently improve multi-subject alignment while maintaining base-model style. Test-time control runs efficiently on commodity GPUs, and fine-tuned controllers trained on limited prompts generalize to unseen ones. We further highlight FOCUS (Flow Optimal Control for Unentangled Subjects), which achieves state-of-the-art multi-subject fidelity across models.
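A minimal 1-D sketch of algorithm (i), the training-free test-time controller: an Euler sampler whose base velocity is perturbed by an additive control term. The base field, control rule, targets, and weights here are illustrative stand-ins, not the paper's actual FOCUS implementation.

```python
def base_velocity(t, x):
    # Toy stand-in for a trained flow-matching velocity field v_theta(t, x):
    # it transports samples toward a single "subject" located at 2.0.
    return 2.0 - x

def control(t, x):
    # Hypothetical test-time control u(t, x): a single-pass perturbation
    # that nudges the trajectory toward a second "subject" at 3.0.
    return 0.5 * (3.0 - x)

def euler_sample(x0, steps=100, controlled=True):
    # Integrate dx/dt = v(t, x) [+ u(t, x)] from t = 0 to t = 1.
    x, dt = x0, 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = base_velocity(t, x)
        if controlled:
            v += control(t, x)
        x += v * dt
    return x
```

With the control switched on, the endpoint of the trajectory moves between the two targets; with it switched off, the sampler tracks the base model alone.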
Related papers
- Solving Inverse Problems with Flow-based Models via Model Predictive Control [41.551726534223704]
MPC-Flow is a model predictive control framework that formulates inverse problem solving with flow-based generative models as a sequence of control sub-problems. We show how different algorithmic choices yield a spectrum of guidance algorithms, including regimes that avoid backpropagation through the generative model trajectory.
arXiv Detail & Related papers (2026-01-30T17:59:09Z)
- DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving [65.7087560656003]
Generative diffusion models for end-to-end autonomous driving often suffer from mode collapse. We propose DiffusionDriveV2, which leverages reinforcement learning to constrain low-quality modes and explore superior trajectories. This significantly enhances overall output quality while preserving the inherent multimodality of its core Gaussian Mixture Model.
arXiv Detail & Related papers (2025-12-08T17:29:52Z)
- ScaleWeaver: Weaving Efficient Controllable T2I Generation with Multi-Scale Reference Attention [86.93601565563954]
ScaleWeaver is a framework designed to achieve high-fidelity, controllable generation on top of advanced visual autoregressive (VAR) models. The proposed Reference Attention module discards unnecessary image→condition attention, reducing computational cost. Experiments show that ScaleWeaver delivers high-quality generation and precise control while attaining superior efficiency over diffusion-based methods.
arXiv Detail & Related papers (2025-10-16T17:00:59Z)
- Rectified-CFG++ for Flow Based Models [26.896426878221718]
We present Rectified-CFG++, an adaptive predictor-corrector guidance scheme that couples the deterministic efficiency of rectified flows with a geometry-aware conditioning rule. Experiments on large-scale text-to-image models (FLUX, Stable Diffusion 3/3.5, Lumina) show that Rectified-CFG++ consistently outperforms standard CFG on benchmark datasets.
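For context, the standard classifier-free guidance (CFG) rule that Rectified-CFG++ improves on can be written as a fixed linear combination of velocity fields. The sketch below shows that baseline rule only; the adaptive predictor-corrector and geometry-aware pieces of Rectified-CFG++ are not reproduced here, and the guidance weight is illustrative.

```python
def cfg_velocity(v_cond, v_uncond, t, x, w=3.5):
    # Standard classifier-free guidance on a flow velocity field:
    #   v_guided = v_uncond + w * (v_cond - v_uncond)
    # With w = 1 this reduces to the conditional field, with w = 0 to the
    # unconditional one; w > 1 extrapolates toward the condition.
    vu = v_uncond(t, x)
    vc = v_cond(t, x)
    return vu + w * (vc - vu)
```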
arXiv Detail & Related papers (2025-10-09T00:00:47Z)
- Merge and Guide: Unifying Model Merging and Guided Decoding for Controllable Multi-Objective Generation [49.98025799046136]
We introduce Merge-And-GuidE (MAGE), a two-stage framework that leverages model merging for guided decoding. In Stage 1, MAGE resolves a compatibility problem between the guidance and base models. In Stage 2, we merge explicit and implicit value models into a unified guidance proxy, which then steers the decoding of the base model from Stage 1.
arXiv Detail & Related papers (2025-10-04T11:10:07Z)
- Diffusion Blend: Inference-Time Multi-Preference Alignment for Diffusion Models [25.59542599768357]
Diffusion Blend is a novel approach to inference-time multi-preference alignment. We instantiate this approach with two algorithms: DB-MPA for multi-reward alignment and DB-KLA for KL regularization control.
arXiv Detail & Related papers (2025-05-24T06:27:55Z)
- SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs. We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions. With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z)
- Taming Flow Matching with Unbalanced Optimal Transport into Fast Pansharpening [10.23957420290553]
We propose the Optimal Transport Flow Matching (OTFM) framework to achieve one-step, high-quality pansharpening. The OTFM framework enables simulation-free training and single-step inference while maintaining strict adherence to pansharpening constraints.
arXiv Detail & Related papers (2025-03-19T08:10:49Z)
- Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence [19.484676783876306]
Diffusion models have emerged as powerful tools for generative modeling. We propose a stochastic control framework for fine-tuning diffusion models. We show that PI-FT achieves global convergence at a linear rate.
arXiv Detail & Related papers (2024-12-24T04:55:46Z)
- Consistency Flow Matching: Defining Straight Flows with Velocity Consistency [97.28511135503176]
We introduce Consistency Flow Matching (Consistency-FM), a novel FM method that explicitly enforces self-consistency in the velocity field.
Preliminary experiments demonstrate that our Consistency-FM significantly improves training efficiency by converging 4.4x faster than consistency models.
arXiv Detail & Related papers (2024-07-02T16:15:37Z)
- Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models to plan generation in the offline reinforcement learning setting, achieving a speedup compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z)
- Improving and generalizing flow-based generative models with minibatch optimal transport [90.01613198337833]
We introduce the generalized conditional flow matching (CFM) technique for continuous normalizing flows (CNFs).
CFM features a stable regression objective like that used to train the flow in diffusion models but enjoys the efficient inference of deterministic flow models.
A variant of our objective is optimal transport CFM (OT-CFM), which creates simpler flows that are more stable to train and lead to faster inference.
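The CFM objective described above regresses a velocity model onto a conditional target velocity along straight (optimal transport) paths between paired samples. A minimal 1-D sketch follows; the toy source and data distributions are chosen only for illustration, not taken from the paper.

```python
import random

def ot_cfm_pair(x0, x1, t):
    # Straight-line (OT) conditional path: x_t = (1 - t) * x0 + t * x1,
    # with conditional target velocity u_t(x_t | x0, x1) = x1 - x0.
    xt = (1.0 - t) * x0 + t * x1
    target_v = x1 - x0
    return xt, target_v

def cfm_loss(v_model, n=1000, seed=0):
    # Monte-Carlo estimate of the CFM regression loss (1-D toy version):
    #   E_{t, x0, x1} [ (v_model(t, x_t) - (x1 - x0))^2 ]
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x0 = rng.gauss(0.0, 1.0)   # source (noise) sample
        x1 = rng.gauss(2.0, 0.5)   # toy "data" sample
        t = rng.random()
        xt, target = ot_cfm_pair(x0, x1, t)
        total += (v_model(t, xt) - target) ** 2
    return total / n
```

A model that matches the mean displacement incurs a lower loss than one that ignores it; training would minimize this objective over the model's parameters.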
arXiv Detail & Related papers (2023-02-01T14:47:17Z)
- Comparative analysis of machine learning methods for active flow control [60.53767050487434]
Genetic Programming (GP) and Reinforcement Learning (RL) are gaining popularity in flow control.
This work presents a comparative analysis of the two, benchmarking some of their most representative algorithms against global optimization techniques.
arXiv Detail & Related papers (2022-02-23T18:11:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.