Learning over von Mises-Fisher Distributions via a Wasserstein-like Geometry
- URL: http://arxiv.org/abs/2504.14164v1
- Date: Sat, 19 Apr 2025 03:38:15 GMT
- Title: Learning over von Mises-Fisher Distributions via a Wasserstein-like Geometry
- Authors: Kisung You, Dennis Shung, Mauro Giuffrè,
- Abstract summary: We introduce a geometry-aware distance metric for the family of von Mises-Fisher (vMF) distributions.<n>Motivated by the theory of optimal transport, we propose a Wasserstein-like distance that decomposes the discrepancy between two vMF distributions into two interpretable components.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a novel, geometry-aware distance metric for the family of von Mises-Fisher (vMF) distributions, which are fundamental models for directional data on the unit hypersphere. Although the vMF distribution is widely employed in a variety of probabilistic learning tasks involving spherical data, principled tools for comparing vMF distributions remain limited, primarily due to the intractability of normalization constants and the absence of suitable geometric metrics. Motivated by the theory of optimal transport, we propose a Wasserstein-like distance that decomposes the discrepancy between two vMF distributions into two interpretable components: a geodesic term capturing the angular separation between mean directions, and a variance-like term quantifying differences in concentration parameters. The derivation leverages a Gaussian approximation in the high-concentration regime to yield a tractable, closed-form expression that respects the intrinsic spherical geometry. We show that the proposed distance exhibits desirable theoretical properties and induces a latent geometric structure on the space of non-degenerate vMF distributions. As a primary application, we develop the efficient algorithms for vMF mixture reduction, enabling structure-preserving compression of mixture models in high-dimensional settings. Empirical results on synthetic datasets and real-world high-dimensional embeddings, including biomedical sentence representations and deep visual features, demonstrate the effectiveness of the proposed geometry in distinguishing distributions and supporting interpretable inference. This work expands the statistical toolbox for directional data analysis by introducing a tractable, transport-inspired distance tailored to the geometry of the hypersphere.
Related papers
- Metric Flow Matching for Smooth Interpolations on the Data Manifold [40.24392451848883]
Metric Flow Matching (MFM) is a novel simulation-free framework for conditional flow matching.
We propose MFM as a framework for conditional paths that transform a source distribution into a target distribution.
We test MFM on a suite of challenges including LiDAR navigation, unpaired image translation, and modeling cellular dynamics.
arXiv Detail & Related papers (2024-05-23T16:48:06Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian
Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high dimensional tasks on nontrivial manifold.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z) - Diffusion Models are Minimax Optimal Distribution Estimators [49.47503258639454]
We provide the first rigorous analysis on approximation and generalization abilities of diffusion modeling.
We show that when the true density function belongs to the Besov space and the empirical score matching loss is properly minimized, the generated data distribution achieves the nearly minimax optimal estimation rates.
arXiv Detail & Related papers (2023-03-03T11:31:55Z) - Score Approximation, Estimation and Distribution Recovery of Diffusion
Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z) - Geometric Scattering on Measure Spaces [15.819230791757906]
We introduce a general, unified model for geometric scattering on measure spaces.<n>We consider finite measure spaces that are obtained from randomly sampling an unknown manifold.<n>We propose two methods for constructing a data-driven graph on which the associated graph scattering transform approximates the scattering transform on the underlying manifold.
arXiv Detail & Related papers (2022-08-17T22:40:09Z) - Mean-field Variational Inference via Wasserstein Gradient Flow [8.05603983337769]
Variational inference, such as the mean-field (MF) approximation, requires certain conjugacy structures for efficient computation.
We introduce a general computational framework to implement MFal inference for Bayesian models, with or without latent variables, using the Wasserstein gradient flow (WGF)
We propose a new constraint-free function approximation method using neural networks to numerically realize our algorithm.
arXiv Detail & Related papers (2022-07-17T04:05:32Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Cycle Consistent Probability Divergences Across Different Spaces [38.43511529063335]
Discrepancy measures between probability distributions are at the core of statistical inference and machine learning.
This work proposes a novel unbalanced Monge optimal transport formulation for matching, up to isometries, distributions on different spaces.
arXiv Detail & Related papers (2021-11-22T16:35:58Z) - A Unifying and Canonical Description of Measure-Preserving Diffusions [60.59592461429012]
A complete recipe of measure-preserving diffusions in Euclidean space was recently derived unifying several MCMC algorithms into a single framework.
We develop a geometric theory that improves and generalises this construction to any manifold.
arXiv Detail & Related papers (2021-05-06T17:36:55Z) - AI Giving Back to Statistics? Discovery of the Coordinate System of
Univariate Distributions by Beta Variational Autoencoder [0.0]
The article discusses experiences of training neural networks to classify univariate empirical distributions and to represent them on the two-dimensional latent space forcing disentanglement based on the inputs of cumulative distribution functions (CDF)
The representation on the latent two-dimensional coordinate system can be seen as an additional metadata of the real-world data that disentangles important distribution characteristics, such as shape of the CDF, classification probabilities of underlying theoretical distributions and their parameters, information entropy, and skewness.
arXiv Detail & Related papers (2020-04-06T14:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.