Related papers: $α$-Flow: A Unified Framework for Continuous-State Discrete Flow Matching Models

$α$-Flow: A Unified Framework for Continuous-State Discrete Flow Matching Models

URL: http://arxiv.org/abs/2504.10283v1
Date: Mon, 14 Apr 2025 14:51:45 GMT
Title: $α$-Flow: A Unified Framework for Continuous-State Discrete Flow Matching Models
Authors: Chaoran Cheng, Jiahan Li, Jiajun Fan, Ge Liu,
Abstract summary: This work presents a unified framework for Continuous-State Discrete Flow Matching models.<n>We introduce $alpha$-Flow, a family of CS-DFM models that adheres to the canonical $alpha$-geometry of the statistical manifold.<n>We show that the flow matching loss for $alpha$-flow establishes a unified variational bound for the discrete negative log-likelihood.
Score: 8.705749038874137
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent efforts have extended the flow-matching framework to discrete generative modeling. One strand of models directly works with the continuous probabilities instead of discrete tokens, which we colloquially refer to as Continuous-State Discrete Flow Matching (CS-DFM). Existing CS-DFM models differ significantly in their representations and geometric assumptions. This work presents a unified framework for CS-DFM models, under which the existing variants can be understood as operating on different $\alpha$-representations of probabilities. Building upon the theory of information geometry, we introduce $\alpha$-Flow, a family of CS-DFM models that adheres to the canonical $\alpha$-geometry of the statistical manifold, and demonstrate its optimality in minimizing the generalized kinetic energy. Theoretically, we show that the flow matching loss for $\alpha$-flow establishes a unified variational bound for the discrete negative log-likelihood. We comprehensively evaluate different instantiations of $\alpha$-flow on various discrete generation domains to demonstrate their effectiveness in discrete generative modeling, including intermediate values whose geometries have never been explored before. $\alpha$-flow significantly outperforms its discrete-state counterpart in image and protein sequence generation and better captures the entropy in language modeling.

Related papers

Diffusion models for multivariate subsurface generation and efficient probabilistic inversion [0.0]
Diffusion models offer stable training and state-of-the-art performance for deep generative modeling tasks.<n>We introduce a likelihood approximation accounting for the noise-contamination that is inherent in diffusion modeling.<n>Our tests show significantly improved statistical robustness, enhanced sampling of the posterior probability density function.
arXiv Detail & Related papers (2025-07-21T17:10:16Z)
Preconditioned Inexact Stochastic ADMM for Deep Model [35.37705488695026]
This paper develops an algorithm, PISA, which enables scalable parallel computing and supports various second-moment schemes.<n>Grounded in rigorous theoretical guarantees, the algorithm converges under the sole assumption of Lipschitz of the gradient.<n> Comprehensive experimental evaluations for or fine-tuning diverse FMs, including vision models, large language models, reinforcement learning models, generative adversarial networks, and recurrent neural networks, demonstrate its superior numerical performance compared to various state-of-the-art Directions.
arXiv Detail & Related papers (2025-02-15T12:28:51Z)
Categorical Flow Matching on Statistical Manifolds [12.646272756981672]
We introduce a flow-matching framework on the manifold of parameterized probability measures inspired by information geometry. We develop an efficient training and sampling algorithm that overcomes numerical stability with a diffeomorphism between manifold. We manifest that SFM can learn more complex patterns on the statistical manifold where existing models often fail due to strong prior assumptions.
arXiv Detail & Related papers (2024-05-26T05:50:39Z)
Fisher Flow Matching for Generative Modeling over Discrete Data [12.69975914345141]
We introduce Fisher-Flow, a novel flow-matching model for discrete data. Fisher-Flow takes a manifestly geometric perspective by considering categorical distributions over discrete data. We prove that the gradient flow induced by Fisher-Flow is optimal in reducing the forward KL divergence.
arXiv Detail & Related papers (2024-05-23T15:02:11Z)
Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization [17.535229185525353]
We introduce an algorithm leveraging the uniformization of continuous Markov chains, implementing transitions on random time points. Our results align with state-of-the-art achievements for diffusion models in $mathbbRd$ and further underscore the advantages of discrete diffusion models in comparison to the $mathbbRd$ setting.
arXiv Detail & Related papers (2024-02-12T22:26:52Z)
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design [37.634098563033795]
We present a new flow-based model of discrete data that provides the missing link in enabling flow-based generative models. Our key insight is that the discrete equivalent of continuous space flow matching can be realized using Continuous Time Markov Chains. We apply this capability to the task of protein co-design, wherein we learn a model for jointly generating protein structure and sequence.
arXiv Detail & Related papers (2024-02-07T16:15:36Z)
Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models. In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
Distributed Bayesian Learning of Dynamic States [65.7870637855531]
The proposed algorithm is a distributed Bayesian filtering task for finite-state hidden Markov models. It can be used for sequential state estimation, as well as for modeling opinion formation over social networks under dynamic environments.
arXiv Detail & Related papers (2022-12-05T19:40:17Z)
Diffusion models as plug-and-play priors [98.16404662526101]
We consider the problem of inferring high-dimensional data $mathbfx$ in a model that consists of a prior $p(mathbfx)$ and an auxiliary constraint $c(mathbfx,mathbfy)$. The structure of diffusion models allows us to perform approximate inference by iterating differentiation through the fixed denoising network enriched with different amounts of noise.
arXiv Detail & Related papers (2022-06-17T21:11:36Z)
Discrete Denoising Flows [87.44537620217673]
We introduce a new discrete flow-based model for categorical random variables: Discrete Denoising Flows (DDFs) In contrast with other discrete flow-based models, our model can be locally trained without introducing gradient bias. We show that DDFs outperform Discrete Flows on modeling a toy example, binary MNIST and Cityscapes segmentation maps, measured in log-likelihood.
arXiv Detail & Related papers (2021-07-24T14:47:22Z)
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors. We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method. Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult. Many sampling-based methods have been proposed for estimating Evidence Lower Bound (ELBO) We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPN) We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is aweighted the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.