Three Forms of Stochastic Injection for Improved Distribution-to-Distribution Generative Modeling
- URL: http://arxiv.org/abs/2510.06634v1
- Date: Wed, 08 Oct 2025 04:36:34 GMT
- Title: Three Forms of Stochastic Injection for Improved Distribution-to-Distribution Generative Modeling
- Authors: Shiye Su, Yuhui Zhang, Linqi Zhou, Rajesh Ranganath, Serena Yeung-Levy
- Abstract summary: Flow matching offers a natural framework for modeling transformations between arbitrary data distributions. We propose a simple and computationally efficient method that injects stochasticity into the training process by perturbing source samples and flow interpolants. Our approach also reduces the transport cost between input and generated samples to better highlight the true effect of the transformation.
- Score: 40.63772844645927
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modeling transformations between arbitrary data distributions is a fundamental scientific challenge, arising in applications like drug discovery and evolutionary simulation. While flow matching offers a natural framework for this task, its use has thus far primarily focused on the noise-to-data setting, while its application in the general distribution-to-distribution setting is underexplored. We find that in the latter case, where the source is also a data distribution to be learned from limited samples, standard flow matching fails due to sparse supervision. To address this, we propose a simple and computationally efficient method that injects stochasticity into the training process by perturbing source samples and flow interpolants. On five diverse imaging tasks spanning biology, radiology, and astronomy, our method significantly improves generation quality, outperforming existing baselines by an average of 9 FID points. Our approach also reduces the transport cost between input and generated samples to better highlight the true effect of the transformation, making flow matching a more practical tool for simulating the diverse distribution transformations that arise in science.
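The two perturbations named in the abstract (source samples and flow interpolants) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian noise form, the linear interpolant, the choice to compute the target velocity from the perturbed source, and the names `make_training_example`, `sigma_src`, and `sigma_interp` are all assumptions made for illustration.

```python
import random


def make_training_example(x0, x1, sigma_src=0.1, sigma_interp=0.05, rng=random):
    """Build one (t, x_t, target_velocity) triple for flow-matching training.

    x0, x1: lists of floats (one source sample, one target sample).
    sigma_src, sigma_interp: hypothetical noise scales (assumed, not from the paper).
    rng: any object exposing gauss() and random(), e.g. random.Random(seed).
    """
    # Assumption: perturb the source sample with isotropic Gaussian noise.
    x0_noisy = [a + sigma_src * rng.gauss(0.0, 1.0) for a in x0]

    # Sample a time uniformly in [0, 1).
    t = rng.random()

    # Assumption: linear interpolant between the perturbed source and the target.
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0_noisy, x1)]

    # Assumption: perturb the interpolant with a second, independent Gaussian draw.
    x_t_noisy = [v + sigma_interp * rng.gauss(0.0, 1.0) for v in x_t]

    # Velocity of the linear path, used as the regression target.
    target_velocity = [b - a for a, b in zip(x0_noisy, x1)]

    return t, x_t_noisy, target_velocity
```

In a training loop, a velocity network evaluated at `(x_t_noisy, t)` would typically be regressed onto `target_velocity` with a squared-error loss. Setting both noise scales to zero recovers plain flow matching between paired samples.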
Related papers
- Guiding diffusion models to reconstruct flow fields from sparse data [25.34099672176622]
We introduce a novel sampling method for diffusion models that enables the reconstruction of high-fidelity samples. Our method consistently outperforms other diffusion-based methods in predicting the fluid's structure. This study underscores the remarkable potential of diffusion models in reconstructing flow field data.
arXiv Detail & Related papers (2025-10-22T19:01:50Z)
- Data-to-Energy Stochastic Dynamics [16.394074432826823]
We propose the first general method for modelling Schrödinger bridges when one (or both) distributions are given by their unnormalised densities. Our algorithm relies on a generalisation of the iterative proportional fitting (IPF) procedure to the data-free case, inspired by recent developments in off-policy reinforcement learning. We demonstrate the efficacy of the proposed data-to-energy IPF on synthetic problems, finding that it can successfully learn transports between multimodal distributions.
arXiv Detail & Related papers (2025-09-30T15:03:55Z)
- DeFoG: Discrete Flow Matching for Graph Generation [45.037260759871124]
We introduce DeFoG, a graph generative framework that disentangles sampling from training. We propose novel sampling methods that significantly enhance performance and reduce the required number of refinement steps.
arXiv Detail & Related papers (2024-10-05T18:52:54Z)
- Constrained Diffusion Models via Dual Training [80.03953599062365]
Diffusion processes are prone to generating samples that reflect biases in a training dataset.
We develop constrained diffusion models by imposing diffusion constraints based on desired distributions.
We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off among objective and constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z)
- Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data [2.6499018693213316]
We introduce a novel generative model for the representation of joint probability distributions of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions.
arXiv Detail & Related papers (2024-06-06T21:58:33Z)
- On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z)
- Improved off-policy training of diffusion samplers [93.66433483772055]
We study the problem of training diffusion models to sample from a distribution with an unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods. Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work.
arXiv Detail & Related papers (2024-02-07T18:51:49Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
- Uncertainty quantification and out-of-distribution detection using surjective normalizing flows [46.51077762143714]
We propose a simple approach using surjective normalizing flows to identify out-of-distribution data sets in deep neural network models.
We show that our method can reliably discern out-of-distribution data from in-distribution data.
arXiv Detail & Related papers (2023-11-01T09:08:35Z)
- Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z)
- The Score-Difference Flow for Implicit Generative Modeling [1.1929584800629673]
Implicit generative modeling aims to produce samples of synthetic data matching a target data distribution. Recent work has approached the IGM problem from the perspective of pushing synthetic source data toward the target distribution. We present the score difference between arbitrary target and source distributions as a flow that optimally reduces the Kullback-Leibler divergence between them.
arXiv Detail & Related papers (2023-04-25T15:21:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.