Related papers: Riemannian AmbientFlow: Towards Simultaneous Manifold Learning and Generative Modeling from Corrupted Data

URL: http://arxiv.org/abs/2601.18728v1
Date: Mon, 26 Jan 2026 17:51:52 GMT
Title: Riemannian AmbientFlow: Towards Simultaneous Manifold Learning and Generative Modeling from Corrupted Data
Authors: Willem Diepeveen, Oscar Leong
Abstract summary: We introduce a framework for learning a probabilistic generative model and the underlying, nonlinear data manifold directly from corrupted observations. We establish theoretical guarantees showing that, under appropriate geometric regularization and measurement conditions, the learned model recovers the underlying data distribution up to a controllable error and yields a smooth, bi-Lipschitz manifold parametrization.
Score: 4.681760167323748
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Modern generative modeling methods have demonstrated strong performance in learning complex data distributions from clean samples. In many scientific and imaging applications, however, clean samples are unavailable, and only noisy or linearly corrupted measurements can be observed. Moreover, latent structures, such as manifold geometries, present in the data are important to extract for further downstream scientific analysis. In this work, we introduce Riemannian AmbientFlow, a framework for simultaneously learning a probabilistic generative model and the underlying, nonlinear data manifold directly from corrupted observations. Building on the variational inference framework of AmbientFlow, our approach incorporates data-driven Riemannian geometry induced by normalizing flows, enabling the extraction of manifold structure through pullback metrics and Riemannian Autoencoders. We establish theoretical guarantees showing that, under appropriate geometric regularization and measurement conditions, the learned model recovers the underlying data distribution up to a controllable error and yields a smooth, bi-Lipschitz manifold parametrization. We further show that the resulting smooth decoder can serve as a principled generative prior for inverse problems with recovery guarantees. We empirically validate our approach on low-dimensional synthetic manifolds and on MNIST.
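
Illustration (not from the paper): the abstract refers to pullback metrics induced by normalizing flows. As a minimal sketch, assuming a Euclidean metric on the latent side, the pullback metric of a diffeomorphism phi at a point x is G(x) = J(x)^T J(x), where J is the Jacobian of phi. The map `phi` below is a hypothetical toy stand-in for a trained flow, and the finite-difference Jacobian is chosen only to keep the example dependency-free; none of the names come from the authors' code.

```python
def phi(x):
    """Toy diffeomorphism of the plane (hypothetical stand-in for a trained flow)."""
    x1, x2 = x
    return (x1, x2 - x1 ** 2)


def jacobian(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x, returned as a list of rows."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        # Column j holds the partial derivatives of every output w.r.t. x_j.
        cols.append([(a - b) / (2 * eps) for a, b in zip(fp, fm)])
    m = len(cols[0])
    return [[cols[j][i] for j in range(n)] for i in range(m)]


def pullback_metric(f, x):
    """Pullback of the Euclidean metric through f at x: G = J^T J."""
    J = jacobian(f, x)
    n = len(x)
    return [
        [sum(J[k][i] * J[k][j] for k in range(len(J))) for j in range(n)]
        for i in range(n)
    ]


# Analytically, J = [[1, 0], [-2*x1, 1]], so G = [[1 + 4*x1**2, -2*x1], [-2*x1, 1]].
G = pullback_metric(phi, (1.0, 0.0))
```

Under this metric, lengths and angles at x are measured after mapping through phi, which is what lets a flow-induced geometry bend straight lines toward the data. In practice the Jacobian would come from automatic differentiation rather than finite differences.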

Related papers

When and how can inexact generative models still sample from the data manifold? [2.4664553878979185]
Despite learning errors in the score function or the drift vector field, the generated samples appear to shift along the support of the data distribution but not away from it. We show that the alignment of the top Lyapunov vectors with the tangent spaces along the boundary of the data manifold leads to robustness.
arXiv Detail & Related papers (2025-08-11T03:24:34Z)
Riemannian Denoising Diffusion Probabilistic Models [7.964790563398277]
We propose RDDPMs for learning distributions on submanifolds of Euclidean space that are level sets of functions. We provide a theoretical analysis of our method in the continuous-time limit. The capability of our method is demonstrated on datasets from previous studies and on new sampled datasets.
arXiv Detail & Related papers (2025-05-07T11:37:16Z)
Categorical Flow Matching on Statistical Manifolds [12.646272756981672]
We introduce a flow-matching framework on the manifold of parameterized probability measures inspired by information geometry. We develop an efficient training and sampling algorithm that overcomes numerical stability issues with a diffeomorphism between manifolds. We show that SFM can learn more complex patterns on the statistical manifold where existing models often fail due to strong prior assumptions.
arXiv Detail & Related papers (2024-05-26T05:50:39Z)
Fisher Flow Matching for Generative Modeling over Discrete Data [12.69975914345141]
We introduce Fisher-Flow, a novel flow-matching model for discrete data. Fisher-Flow takes a manifestly geometric perspective by considering categorical distributions over discrete data. We prove that the gradient flow induced by Fisher-Flow is optimal in reducing the forward KL divergence.
arXiv Detail & Related papers (2024-05-23T15:02:11Z)
Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop. We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models. We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables. The nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning. We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z)
ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions. We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution. Finally, we focus on 3D point clouds for which we utilize the explicit nature of NFs, i.e. surface normals extracted from the gradient of the log-likelihood and the log-likelihood itself.
arXiv Detail & Related papers (2022-08-18T16:07:59Z)
Nonlinear Isometric Manifold Learning for Injective Normalizing Flows [58.720142291102135]
We use isometries to separate manifold learning and density estimation. We also employ autoencoders to design embeddings with explicit inverses that do not distort the probability distribution.
arXiv Detail & Related papers (2022-03-08T08:57:43Z)
Inferring Manifolds From Noisy Data Using Gaussian Processes [17.166283428199634]
Most existing manifold learning algorithms replace the original data with lower dimensional coordinates. This article proposes a new methodology for addressing these problems, allowing interpolation of the estimated manifold between fitted data points.
arXiv Detail & Related papers (2021-10-14T15:50:38Z)
GELATO: Geometrically Enriched Latent Model for Offline Reinforcement Learning [54.291331971813364]
Offline reinforcement learning approaches can be divided into proximal and uncertainty-aware methods. In this work, we demonstrate the benefit of combining the two in a latent variational model. Our proposed metrics measure both the quality of out-of-distribution samples and the discrepancy of examples in the data.
arXiv Detail & Related papers (2021-02-22T19:42:40Z)
MMCGAN: Generative Adversarial Network with Explicit Manifold Prior [78.58159882218378]
We propose to employ explicit manifold learning as a prior to alleviate mode collapse and stabilize training of GANs. Our experiments on both toy data and real datasets show the effectiveness of MMCGAN in alleviating mode collapse, stabilizing training, and improving the quality of generated samples.
arXiv Detail & Related papers (2020-06-18T07:38:54Z)
