Related papers: Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data

Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data

URL: http://arxiv.org/abs/2406.04527v1
Date: Thu, 6 Jun 2024 21:58:33 GMT
Title: Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data
Authors: Bastian Boll, Daniel Gonzalez-Alvarado, Stefania Petra, Christoph Schnörr,
Abstract summary: We introduce a novel generative model for the representation of joint probability distributions of a possibly large number of discrete random variables. The embedding of the flow via the Segre map in the meta-simplex of all discrete joint distributions ensures that any target distribution can be represented in principle. Our approach has strong motivation from first principles of modeling coupled discrete variables.
Score: 2.6499018693213316
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce a novel generative model for the representation of joint probability distributions of a possibly large number of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions, which also enables to sample efficiently from the target distribution and to assess the likelihood of unseen data points. The embedding of the flow via the Segre map in the meta-simplex of all discrete joint distributions ensures that any target distribution can be represented in principle, whose complexity in practice only depends on the parametrization of the affinity function of the dynamical assignment flow system. Our model can be trained in a simulation-free manner without integration by conditional Riemannian flow matching, using the training data encoded as geodesics in closed-form with respect to the e-connection of information geometry. By projecting high-dimensional flow matching in the meta-simplex of joint distributions to the submanifold of factorizing distributions, our approach has strong motivation from first principles of modeling coupled discrete variables. Numerical experiments devoted to distributions of structured image labelings demonstrate the applicability to large-scale problems, which may include discrete distributions in other application areas. Performance measures show that our approach scales better with the increasing number of classes than recent related work.

Related papers

Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional general score-mismatched diffusion samplers. We show that score mismatches result in an distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z)
Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties. This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
Generative Modeling of Discrete Joint Distributions by E-Geodesic Flow Matching on Assignment Manifolds [0.8594140167290099]
General non-factorizing discrete distributions can be approximated by embedding the submanifold into a the meta-simplex of all joint discrete distributions. Efficient training of the generative model is demonstrated by matching the flow of geodesics of factorizing discrete distributions.
arXiv Detail & Related papers (2024-02-12T17:56:52Z)
Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers. We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art. In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
Uncertainty quantification and out-of-distribution detection using surjective normalizing flows [46.51077762143714]
We propose a simple approach using surjective normalizing flows to identify out-of-distribution data sets in deep neural network models. We show that our method can reliably discern out-of-distribution data from in-distribution data.
arXiv Detail & Related papers (2023-11-01T09:08:35Z)
Probabilistic Matching of Real and Generated Data Statistics in Generative Adversarial Networks [0.6906005491572401]
We propose a method to ensure that the distributions of certain generated data statistics coincide with the respective distributions of the real data. We evaluate the method on a synthetic dataset and a real-world dataset and demonstrate improved performance of our approach.
arXiv Detail & Related papers (2023-06-19T14:03:27Z)
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over distributions' properties, such as parameters, symmetry and modality yield a family of flexible distributions. We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z)
Investigating Shifts in GAN Output-Distributions [5.076419064097734]
We introduce a loop-training scheme for the systematic investigation of observable shifts between the distributions of real training data and GAN generated data. Overall, the combination of these methods allows an explorative investigation of innate limitations of current GAN algorithms.
arXiv Detail & Related papers (2021-12-28T09:16:55Z)
GFlowNet Foundations [66.69854262276391]
Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates in an active learning context. We show a number of additional theoretical properties of GFlowNets.
arXiv Detail & Related papers (2021-11-17T17:59:54Z)
Resampling Base Distributions of Normalizing Flows [0.0]
We introduce a base distribution for normalizing flows based on learned rejection sampling. We develop suitable learning algorithms using both maximizing the log-likelihood and the optimization of the reverse Kullback-Leibler divergence.
arXiv Detail & Related papers (2021-10-29T14:44:44Z)
Personalized Trajectory Prediction via Distribution Discrimination [78.69458579657189]
Trarimiy prediction is confronted with the dilemma to capture the multi-modal nature of future dynamics. We present a distribution discrimination (DisDis) method to predict personalized motion patterns. Our method can be integrated with existing multi-modal predictive models as a plug-and-play module.
arXiv Detail & Related papers (2021-07-29T17:42:12Z)
Goal-oriented adaptive sampling under random field modelling of response probability distributions [0.6445605125467573]
We consider cases where the spatial variation of response distributions does not only concern their mean and/or variance but also other features including for instance shape or uni-modality versus multi-modality. Our contributions build upon a non-parametric Bayesian approach to modelling the thereby induced fields of probability distributions.
arXiv Detail & Related papers (2021-02-15T15:55:23Z)
Achieving Efficiency in Black Box Simulation of Distribution Tails with Self-structuring Importance Samplers [1.6114012813668934]
The paper presents a novel Importance Sampling (IS) scheme for estimating distribution of performance measures modeled with a rich set of tools such as linear programs, integer linear programs, piecewise linear/quadratic objectives, feature maps specified with deep neural networks, etc.
arXiv Detail & Related papers (2021-02-14T03:37:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.