OSCAR: Orthogonal Stochastic Control for Alignment-Respecting Diversity in Flow Matching
- URL: http://arxiv.org/abs/2510.09060v1
- Date: Fri, 10 Oct 2025 07:07:19 GMT
- Title: OSCAR: Orthogonal Stochastic Control for Alignment-Respecting Diversity in Flow Matching
- Authors: Jingxuan Wu, Zhenglin Wan, Xingrui Yu, Yuzhe Yang, Bo An, Ivor Tsang
- Abstract summary: Flow-based text-to-image models follow deterministic trajectories, forcing users to repeatedly sample to discover diverse modes. We present a training-free, inference-time control mechanism that makes the flow itself diversity-aware.
- Score: 14.664226708184676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Flow-based text-to-image models follow deterministic trajectories, forcing users to repeatedly sample to discover diverse modes, which is a costly and inefficient process. We present a training-free, inference-time control mechanism that makes the flow itself diversity-aware. Our method simultaneously encourages lateral spread among trajectories via a feature-space objective and reintroduces uncertainty through a time-scheduled stochastic perturbation. Crucially, this perturbation is projected to be orthogonal to the generation flow, a geometric constraint that allows it to boost variation without degrading image details or prompt fidelity. Our procedure requires no retraining or modification to the base sampler and is compatible with common flow-matching solvers. Theoretically, our method is shown to monotonically increase a volume surrogate while, due to its geometric constraints, approximately preserving the marginal distribution. This provides a principled explanation for why generation quality is robustly maintained. Empirically, across multiple text-to-image settings under fixed sampling budgets, our method consistently improves diversity metrics such as the Vendi Score and BRISQUE over strong baselines, while upholding image quality and alignment.
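The orthogonality constraint described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`project_orthogonal`, `perturbed_euler_step`), the linear noise schedule, the `sigma_max` parameter, and the convention that sampling runs from t = 0 to t = 1 are all assumptions made for illustration. It only shows the geometric idea of removing the component of a random perturbation that lies along the flow velocity before adding it to an Euler step.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def project_orthogonal(noise, velocity, eps=1e-12):
    """Remove from `noise` its component along `velocity`.

    Returns noise - (<noise, v> / <v, v>) * v. If the velocity is
    (numerically) zero, the noise is returned unchanged.
    """
    vv = dot(velocity, velocity)
    if vv < eps:
        return list(noise)
    coeff = dot(noise, velocity) / vv
    return [n - coeff * v for n, v in zip(noise, velocity)]


def perturbed_euler_step(x, velocity, t, dt, sigma_max=0.1, rng=random):
    """One Euler step along `velocity` plus orthogonal Gaussian noise.

    Hypothetical schedule (an assumption, not taken from the paper):
    the noise scale decays linearly from sigma_max at t = 0 to 0 at t = 1.
    """
    sigma = sigma_max * (1.0 - t)
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    ortho = project_orthogonal(noise, velocity)
    return [
        xi + dt * vi + sigma * math.sqrt(dt) * oi
        for xi, vi, oi in zip(x, velocity, ortho)
    ]
```

Because the projected noise has zero inner product with the velocity, the perturbation moves a sample sideways relative to its trajectory rather than forward or backward along it, which is the intuition the abstract gives for why variation can increase without degrading prompt fidelity.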
Related papers
- Restora-Flow: Mask-Guided Image Restoration with Flow Matching [0.39141750421215127]
Flow matching has emerged as a promising generative approach that addresses the lengthy sampling times associated with state-of-the-art diffusion models. We introduce Restora-Flow, a training-free method that guides flow matching sampling by a degradation mask. We show superior perceptual quality and processing time compared to diffusion and flow matching-based reference methods.
arXiv Detail & Related papers (2025-11-25T10:22:26Z) - Training-Free Generation of Diverse and High-Fidelity Images via Prompt Semantic Space Optimization [50.5332987313297]
We propose Token-Prompt embedding Space Optimization (TPSO), a training-free and model-agnostic module. TPSO introduces learnable parameters to explore underrepresented regions of the token embedding space, reducing the tendency of the model to repeatedly generate samples from strong modes of the learned distribution. In experiments on MS-COCO and three diffusion backbones, TPSO significantly enhances generative diversity, improving baseline performance from 1.10 to 4.18 points, without sacrificing image quality.
arXiv Detail & Related papers (2025-11-25T00:42:09Z) - Robust Posterior Diffusion-based Sampling via Adaptive Guidance Scale [39.27744518020771]
We propose an adaptive likelihood step-size strategy to guide the diffusion process for inverse-problem formulations. The resulting approach, Adaptive Posterior diffusion Sampling (AdaPS), is hyper-free and improves reconstruction quality across diverse imaging tasks.
arXiv Detail & Related papers (2025-11-23T14:37:59Z) - Measurement-Guided Consistency Model Sampling for Inverse Problems [2.217547045999963]
Consistency models enable high-quality generation in a single or only a few steps. We present a modified consistency sampling approach tailored for inverse problem reconstruction.
arXiv Detail & Related papers (2025-10-02T16:53:07Z) - Purrception: Variational Flow Matching for Vector-Quantized Image Generation [79.74708247230218]
Purrception is a variational flow matching approach for vector-quantized image generation. Our method adapts Variational Flow Matching to vector-quantized latents by learning categorical posteriors over codebook indices. This combines the geometric awareness of continuous methods with the discrete supervision of categorical approaches.
arXiv Detail & Related papers (2025-10-01T21:41:30Z) - Diffusion Models for Solving Inverse Problems via Posterior Sampling with Piecewise Guidance [52.705112811734566]
A novel diffusion-based framework is introduced for solving inverse problems using a piecewise guidance scheme. The proposed method is problem-agnostic and readily adaptable to a variety of inverse problems. The framework achieves a reduction in inference time of 25% for inpainting with both random and center masks, and 23% and 24% for 4× and 8× super-resolution tasks.
arXiv Detail & Related papers (2025-07-22T19:35:14Z) - Solving Inverse Problems with FLAIR [59.02385492199431]
Flow-based latent generative models are able to generate images with remarkable quality, even enabling text-to-image generation. We present FLAIR, a novel training-free variational framework that leverages flow-based generative models as a prior for inverse problems. Results on standard imaging benchmarks demonstrate that FLAIR consistently outperforms existing diffusion- and flow-based methods in terms of reconstruction quality and sample diversity.
arXiv Detail & Related papers (2025-06-03T09:29:47Z) - FFHFlow: Diverse and Uncertainty-Aware Dexterous Grasp Generation via Flow Variational Inference [36.02645364048733]
We propose FFHFlow, a flow-based variational framework that generates diverse, robust multi-finger grasps. By exploiting the invertibility and exact likelihoods of flows, FFHFlow introspects shape uncertainty in partial observations. We also integrate a discriminative grasp evaluator with the flow likelihoods, formulating an uncertainty-aware ranking strategy.
arXiv Detail & Related papers (2024-07-21T13:33:08Z) - A Variational Perspective on Solving Inverse Problems with Diffusion Models [101.831766524264]
Inverse tasks can be formulated as inferring a posterior distribution over data.
This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable.
We propose a variational approach that by design seeks to approximate the true posterior distribution.
arXiv Detail & Related papers (2023-05-07T23:00:47Z) - Auto-regressive Image Synthesis with Integrated Quantization [55.51231796778219]
This paper presents a versatile framework for conditional image generation.
It incorporates the inductive bias of CNNs and powerful sequence modeling of auto-regression.
Our method achieves superior diverse image generation performance as compared with the state-of-the-art.
arXiv Detail & Related papers (2022-07-21T22:19:17Z) - Deblurring via Stochastic Refinement [85.42730934561101]
We present an alternative framework for blind deblurring based on conditional diffusion models.
Our method is competitive in terms of distortion metrics such as PSNR.
arXiv Detail & Related papers (2021-12-05T04:36:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.