On the Wasserstein Geodesic Principal Component Analysis of probability measures
- URL: http://arxiv.org/abs/2506.04480v1
- Date: Wed, 04 Jun 2025 22:00:43 GMT
- Title: On the Wasserstein Geodesic Principal Component Analysis of probability measures
- Authors: Nina Vesseron, Elsa Cazelles, Alice Le Brigant, Thierry Klein
- Abstract summary: The goal is to identify geodesic curves in the space of probability measures that best capture the modes of variation of the underlying dataset. We first address the case of a collection of Gaussian distributions, and show how to lift the computations to the space of invertible linear maps. For the more general setting of absolutely continuous probability measures, we leverage a novel approach to parameterizing geodesics in Wasserstein space with neural networks.
- Score: 1.2999518604217852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper focuses on Geodesic Principal Component Analysis (GPCA) on a collection of probability distributions using the Otto-Wasserstein geometry. The goal is to identify geodesic curves in the space of probability measures that best capture the modes of variation of the underlying dataset. We first address the case of a collection of Gaussian distributions, and show how to lift the computations to the space of invertible linear maps. For the more general setting of absolutely continuous probability measures, we leverage a novel approach to parameterizing geodesics in Wasserstein space with neural networks. Finally, we compare our approach to classical tangent PCA through various examples and provide illustrations on real-world datasets.
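The Gaussian case the abstract mentions is explicit: the squared Wasserstein-2 distance has the closed-form Bures-Wasserstein expression, and the geodesic is McCann's interpolation of the optimal linear map. The sketch below shows these classical closed forms in numpy/scipy; it is a baseline illustration, not the paper's GPCA implementation.

```python
import numpy as np
from scipy.linalg import sqrtm

def bures_wasserstein_sq(m0, S0, m1, S1):
    """Squared W2 distance between N(m0, S0) and N(m1, S1):
    |m0 - m1|^2 + tr(S0 + S1 - 2 (S0^{1/2} S1 S0^{1/2})^{1/2})."""
    r0 = np.real(sqrtm(S0))
    cross = np.real(sqrtm(r0 @ S1 @ r0))
    return float(np.sum((m0 - m1) ** 2) + np.trace(S0 + S1 - 2.0 * cross))

def gaussian_geodesic(m0, S0, m1, S1, t):
    """McCann interpolation between two Gaussians at time t in [0, 1].

    The optimal map is linear, x -> m1 + A (x - m0), with
    A = S0^{-1/2} (S0^{1/2} S1 S0^{1/2})^{1/2} S0^{-1/2};
    the geodesic pushes N(m0, S0) through (1 - t) Id + t A.
    """
    r0 = np.real(sqrtm(S0))
    r0inv = np.linalg.inv(r0)
    A = r0inv @ np.real(sqrtm(r0 @ S1 @ r0)) @ r0inv
    Tt = (1.0 - t) * np.eye(len(m0)) + t * A   # interpolated linear map
    return (1.0 - t) * m0 + t * m1, Tt @ S0 @ Tt
```

Because the optimal map between Gaussians is linear, the interpolated measure stays Gaussian, which is what makes lifting the computations to linear maps possible.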
Related papers
- Enforcing Latent Euclidean Geometry in Single-Cell VAEs for Manifold Interpolation [79.27003481818413]
We introduce FlatVI, a training framework that regularises the latent manifold of discrete-likelihood variational autoencoders towards Euclidean geometry. By encouraging straight lines in the latent space to approximate geodesics on the decoded single-cell manifold, FlatVI enhances compatibility with downstream approaches.
arXiv Detail & Related papers (2025-07-15T23:08:14Z)
- Riemannian Principal Component Analysis [0.0]
This paper proposes an innovative extension of Principal Component Analysis (PCA) that transcends the traditional assumption of data lying in Euclidean space. We adapt PCA to include local metrics, enabling the incorporation of manifold geometry.
arXiv Detail & Related papers (2025-05-30T21:04:01Z)
- Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices [6.7523635840772505]
Circular and non-flat data distributions are prevalent across diverse domains of data science. A principled approach to accounting for the underlying geometry of such data is pivotal. This work lays the groundwork for extending classical machine learning and statistical methods to more complex and structured data.
arXiv Detail & Related papers (2025-02-03T16:46:46Z)
- Score-based Pullback Riemannian Geometry: Extracting the Data Manifold Geometry using Anisotropic Flows [10.649159213723106]
We propose a framework for data-driven Riemannian geometry that is scalable in both geometry and learning. We show that the proposed framework produces high-quality geodesics passing through the data support. This is the first scalable framework for extracting the complete geometry of the data manifold.
arXiv Detail & Related papers (2024-10-02T18:52:12Z)
- Polynomial Chaos Expansions on Principal Geodesic Grassmannian Submanifolds for Surrogate Modeling and Uncertainty Quantification [0.41709348827585524]
We introduce a manifold learning-based surrogate modeling framework for uncertainty quantification in high-dimensional systems.
We employ Principal Geodesic Analysis on the Grassmann manifold of the response to identify a set of disjoint principal geodesic submanifolds.
Polynomial chaos expansion is then used to construct a mapping between the random input parameters and the projection of the response.
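As a toy illustration of that last step, a one-dimensional polynomial chaos expansion in the probabilists' Hermite basis can be fit by least squares. The response function here is hypothetical and stands in for the projected response; the sketch only shows the mechanics of the fit.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander

rng = np.random.default_rng(0)
xi = rng.standard_normal(200)        # standard Gaussian germ (random input)
y = xi**2 + 0.5 * xi                 # hypothetical scalar response
V = hermevander(xi, 3)               # design matrix: columns He_0(xi), ..., He_3(xi)
coef, *_ = np.linalg.lstsq(V, y, rcond=None)
# Since xi^2 = He_2(xi) + He_0(xi), the fit recovers coef close to [1, 0.5, 1, 0].
```

The Hermite basis is orthogonal under the Gaussian input measure, which is why the recovered coefficients directly decompose the response variance.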
arXiv Detail & Related papers (2024-01-30T02:13:02Z)
- Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios in which the response variable, denoted by $Y$, resides in a manifold, and the covariate, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z)
- Information Entropy Initialized Concrete Autoencoder for Optimal Sensor Placement and Reconstruction of Geophysical Fields [58.720142291102135]
We propose a new approach to the optimal placement of sensors for reconstructing geophysical fields from sparse measurements.
We demonstrate our method on two examples: (a) temperature and (b) salinity fields around the Barents Sea and the Svalbard archipelago.
We find that the obtained optimal sensor locations have a clear physical interpretation and correspond to the boundaries between sea currents.
arXiv Detail & Related papers (2022-06-28T12:43:38Z)
- Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over the distributions' properties, such as parameters, symmetry and modality, yields a family of flexible distributions.
We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z)
- Wasserstein Iterative Networks for Barycenter Estimation [80.23810439485078]
We present an algorithm to approximate the Wasserstein-2 barycenters of continuous measures via a generative model.
Based on the celebrity faces dataset, we construct the Ave, celeba! dataset, which can be used for quantitative evaluation of barycenter algorithms.
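For reference, when the inputs are zero-mean Gaussians the Wasserstein-2 barycenter covariance has a classical fixed-point characterization (Alvarez-Esteban et al.). The sketch below implements that iteration as a baseline; it is not the generative approach of the paper above.

```python
import numpy as np
from scipy.linalg import sqrtm

def gaussian_w2_barycenter_cov(covs, weights, iters=50):
    """Fixed-point iteration for the W2 barycenter covariance of zero-mean
    Gaussians: S <- S^{-1/2} (sum_i w_i (S^{1/2} S_i S^{1/2})^{1/2})^2 S^{-1/2}."""
    S = np.eye(covs[0].shape[0])
    for _ in range(iters):
        r = np.real(sqrtm(S))          # current square root S^{1/2}
        rinv = np.linalg.inv(r)
        M = sum(w * np.real(sqrtm(r @ C @ r)) for w, C in zip(weights, covs))
        S = rinv @ (M @ M) @ rinv      # next iterate
    return S
```

When all covariances commute, the iteration reduces to averaging the matrix square roots, e.g. the barycenter of I and 4I with equal weights has covariance 2.25 I.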
arXiv Detail & Related papers (2022-01-28T16:59:47Z)
- A Unifying and Canonical Description of Measure-Preserving Diffusions [60.59592461429012]
A complete recipe of measure-preserving diffusions in Euclidean space was recently derived, unifying several MCMC algorithms into a single framework.
We develop a geometric theory that improves and generalises this construction to any manifold.
arXiv Detail & Related papers (2021-05-06T17:36:55Z)
- Learning High Dimensional Wasserstein Geodesics [55.086626708837635]
We propose a new formulation and learning strategy for computing the Wasserstein geodesic between two probability distributions in high dimensions.
By applying the method of Lagrange multipliers to the dynamic formulation of the optimal transport (OT) problem, we derive a minimax problem whose saddle point is the Wasserstein geodesic.
We then parametrize the functions by deep neural networks and design a sample-based bidirectional learning algorithm for training.
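A discrete analogue of such a geodesic can be computed exactly for small, equal-size samples: solve the optimal assignment on squared costs, then displacement-interpolate the matched points. The scipy sketch below shows this baseline; it is not the neural minimax approach of the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def empirical_geodesic(X, Y, t):
    """Discrete W2 geodesic between equal-size point clouds X and Y at time t:
    solve the assignment problem on squared Euclidean costs, then
    displacement-interpolate each matched pair."""
    C = cdist(X, Y, metric="sqeuclidean")      # pairwise squared costs
    rows, cols = linear_sum_assignment(C)      # optimal matching
    return (1.0 - t) * X[rows] + t * Y[cols]
```

This exact combinatorial route scales cubically in the sample size, which is precisely the regime where the neural parameterization above becomes attractive.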
arXiv Detail & Related papers (2021-02-05T04:25:28Z)
- Projected Statistical Methods for Distributional Data on the Real Line with the Wasserstein Metric [0.0]
We present a novel class of projected methods to perform statistical analysis on a data set of probability distributions on the real line.
We focus in particular on Principal Component Analysis (PCA) and regression.
Several theoretical properties of the models are investigated and consistency is proven.
arXiv Detail & Related papers (2021-01-22T10:24:49Z)