Minibatch Optimal Transport and Perplexity Bound Estimation in Discrete Flow Matching
- URL: http://arxiv.org/abs/2411.00759v2
- Date: Wed, 13 Nov 2024 20:48:26 GMT
- Title: Minibatch Optimal Transport and Perplexity Bound Estimation in Discrete Flow Matching
- Authors: Etrit Haxholli, Yeti Z. Gürbüz, Oğul Can, Eli Waxman
- Abstract summary: Outperforming autoregressive models on categorical data distributions, such as textual data, remains challenging for continuous diffusion and flow models.
We propose a dynamic-optimal-transport-like minimization objective for discrete flows with convex interpolants.
Unlike continuous flows, wherein the instantaneous change of variables enables density estimation, discrete models lack a similar mechanism due to the inherent non-determinism and discontinuity of their paths.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Outperforming autoregressive models on categorical data distributions, such as textual data, remains challenging for continuous diffusion and flow models. Discrete flow matching, a recent framework for modeling categorical data, has shown competitive performance with autoregressive models. Despite its similarities with continuous flow matching, the rectification strategy applied in the continuous version does not directly extend to the discrete one due to the inherent stochasticity of discrete paths. This limitation necessitates exploring alternative methods to minimize state transitions during generation. To address this, we propose a dynamic-optimal-transport-like minimization objective for discrete flows with convex interpolants and derive its equivalent Kantorovich formulation. The latter defines transport cost solely in terms of inter-state similarity and is optimized using a minibatch strategy. Another limitation we address in the discrete flow framework is model evaluation. Unlike continuous flows, wherein the instantaneous change of variables enables density estimation, discrete models lack a similar mechanism due to the inherent non-determinism and discontinuity of their paths. To alleviate this issue, we propose an upper bound on the perplexity of discrete flow models, enabling performance evaluation and comparison with other methods.
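The minibatch strategy described in the abstract can be illustrated with a small sketch (ours, not the paper's implementation): pair noise samples x0 with data samples x1 within a batch so that matched pairs are close under an inter-state cost. Here the cost is Hamming distance between token sequences, and the assignment is solved exhaustively for a tiny batch; practical implementations would use the Hungarian algorithm or Sinkhorn iterations, and the paper's actual similarity cost may differ.

```python
from itertools import permutations

def hamming(a, b):
    """Number of positions where two equal-length token sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def minibatch_ot_pairing(x0, x1):
    """Return x1 reordered to minimize total Hamming cost against x0,
    together with that minimal cost (exhaustive search, tiny batches only)."""
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(len(x1))):
        cost = sum(hamming(x0[i], x1[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return [x1[j] for j in best_perm], best_cost

# Toy batch: two length-3 token sequences over a binary vocabulary.
x0 = [[0, 0, 0], [1, 1, 1]]
x1 = [[1, 1, 0], [0, 0, 1]]
matched, cost = minibatch_ot_pairing(x0, x1)
# The identity pairing costs 4 token mismatches; swapping the pairing
# costs 2, so the matched coupling requires fewer state transitions
# during generation.
```

The intuition is the same as in continuous minibatch OT: couplings that minimize inter-state cost yield paths with fewer token changes to simulate.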
Related papers
- Probabilistic Forecasting via Autoregressive Flow Matching [1.5467259918426441]
FlowTime is a generative model for probabilistic forecasting of time series data.
We decompose the joint distribution of future observations into a sequence of conditional densities, each modeled via a shared flow.
We demonstrate the effectiveness of FlowTime on multiple dynamical systems and real-world forecasting tasks.
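The decomposition FlowTime uses is the standard autoregressive (chain-rule) factorization; in illustrative notation (symbols are ours, not taken from the paper), for an observed history $x_{1:T}$ and forecast horizon $H$:

```latex
p(x_{T+1:T+H} \mid x_{1:T}) \;=\; \prod_{h=1}^{H} p\!\left(x_{T+h} \mid x_{1:T+h-1}\right),
```

with each conditional density in the product modeled by the shared flow.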
arXiv Detail & Related papers (2025-03-13T13:54:24Z)
- Continuous Diffusion Model for Language Modeling [57.396578974401734]
Existing continuous diffusion models for discrete data have limited performance compared to discrete approaches.
We propose a continuous diffusion model for language modeling that incorporates the geometry of the underlying categorical distribution.
arXiv Detail & Related papers (2025-02-17T08:54:29Z)
- Conditional Lagrangian Wasserstein Flow for Time Series Imputation [3.914746375834628]
We propose a novel method for time series imputation called Conditional Lagrangian Wasserstein Flow.
The proposed method leverages the (conditional) optimal transport theory to learn the probability flow in a simulation-free manner.
The experimental results on real-world datasets show that the proposed method achieves competitive performance on time series imputation.
arXiv Detail & Related papers (2024-10-10T02:46:28Z)
- Local Flow Matching Generative Models [19.859984725284896]
Local Flow Matching is a computational framework for density estimation based on flow-based generative models.
LFM employs a simulation-free scheme and incrementally learns a sequence of Flow Matching sub-models.
We demonstrate the improved training efficiency and competitive generative performance of LFM compared to FM.
arXiv Detail & Related papers (2024-10-03T14:53:10Z) - On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z) - Guided Flows for Generative Modeling and Decision Making [55.42634941614435]
We show that Guided Flows significantly improve sample quality in conditional image generation and zero-shot text-to-speech synthesis.
Notably, we are the first to apply flow models to plan generation in the offline reinforcement learning setting, achieving a speedup compared to diffusion models.
arXiv Detail & Related papers (2023-11-22T15:07:59Z) - Stochastic Interpolants: A Unifying Framework for Flows and Diffusions [16.95541777254722]
A class of generative models that unifies flow-based and diffusion-based methods is introduced.
These models extend the framework proposed in Albergo & Vanden-Eijnden (2023), enabling the use of a broad class of continuous-time processes called stochastic interpolants.
These interpolants are built by combining data from the two prescribed densities with an additional latent variable that shapes the bridge in a flexible way.
arXiv Detail & Related papers (2023-03-15T17:43:42Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Manifold Interpolating Optimal-Transport Flows for Trajectory Inference [64.94020639760026]
We present a method called Manifold Interpolating Optimal-Transport Flow (MIOFlow).
MIOFlow learns continuous population dynamics from static snapshot samples taken at sporadic timepoints.
We evaluate our method on simulated data with bifurcations and merges, as well as scRNA-seq data from embryoid body differentiation, and acute myeloid leukemia treatment.
arXiv Detail & Related papers (2022-06-29T22:19:03Z) - Attentive Contractive Flow with Lipschitz-constrained Self-Attention [25.84621883831624]
We introduce a novel approach called Attentive Contractive Flow (ACF).
ACF utilizes a special category of flow-based generative models - contractive flows.
We demonstrate that ACF can be introduced into a variety of state-of-the-art flow models in a plug-and-play manner.
arXiv Detail & Related papers (2021-09-24T18:02:49Z) - Discrete Denoising Flows [87.44537620217673]
We introduce a new discrete flow-based model for categorical random variables: Discrete Denoising Flows (DDFs).
In contrast with other discrete flow-based models, our model can be locally trained without introducing gradient bias.
We show that DDFs outperform Discrete Flows on modeling a toy example, binary MNIST and Cityscapes segmentation maps, measured in log-likelihood.
arXiv Detail & Related papers (2021-07-24T14:47:22Z)
- Comparing Probability Distributions with Conditional Transport [63.11403041984197]
We propose conditional transport (CT) as a new divergence and approximate it with the amortized CT (ACT) cost.
ACT amortizes the computation of its conditional transport plans and comes with unbiased sample gradients that are straightforward to compute.
On a wide variety of benchmark datasets for generative modeling, substituting the default statistical distance of an existing generative adversarial network with ACT consistently improves performance.
arXiv Detail & Related papers (2020-12-28T05:14:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.