Reconstruction of Manifold Distances from Noisy Observations
- URL: http://arxiv.org/abs/2511.13025v1
- Date: Mon, 17 Nov 2025 06:24:31 GMT
- Title: Reconstruction of Manifold Distances from Noisy Observations
- Authors: Charles Fefferman, Jonathan Marty, Kevin Ren
- Abstract summary: We consider the problem of reconstructing the intrinsic geometry of a manifold from noisy pairwise distance observations. We develop a new framework for recovering all distances between points in a sufficiently dense subsample of $M$. We show that a quantitative lower bound on sampling probabilities suffices to modify the cluster construction in the first algorithm and extend all recovery guarantees.
- Score: 0.745554610293091
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We consider the problem of reconstructing the intrinsic geometry of a manifold from noisy pairwise distance observations. Specifically, let $M$ denote a $d$-dimensional manifold of diameter 1 and $\mu$ a probability measure on $M$ that is mutually absolutely continuous with the volume measure. Suppose $X_1,\dots,X_N$ are i.i.d. samples from $\mu$ and we observe noisy-distance random variables $d'(X_j, X_k)$ that are related to the true geodesic distances $d(X_j,X_k)$. With mild assumptions on the distributions and independence of the noisy distances, we develop a new framework for recovering all distances between points in a sufficiently dense subsample of $M$. Our framework improves on previous work, which assumed i.i.d. additive noise with known moments. Our method is based on a new way to estimate $L_2$-norms of certain expectation-functions $f_x(y)=\mathbb{E}\,d'(x,y)$ and use them to build robust clusters centered at points of our sample. Using a new geometric argument, we establish that, under mild geometric assumptions (bounded curvature and positive injectivity radius), these clusters allow one to recover the true distances between points in the sample up to an additive error of $O(\varepsilon \log \varepsilon^{-1})$. We develop two distinct algorithms for producing these clusters. The first achieves a sample complexity $N \asymp \varepsilon^{-2d-2}\log(1/\varepsilon)$ and runtime $o(N^3)$. The second introduces novel geometric ideas that warrant further investigation. In the presence of missing observations, we show that a quantitative lower bound on sampling probabilities suffices to modify the cluster construction in the first algorithm and extend all recovery guarantees. Our main technical result also elucidates which properties of a manifold are necessary for the distance recovery, which suggests further extension of our techniques to a broader class of metric probability spaces.
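The recipe sketched in the abstract (estimate the $L_2$-norms of differences of the expectation-functions $f_x$ from the observed noisy distances, group sample points into robust clusters, then average observations across clusters) can be illustrated with a minimal numerical sketch. The snippet below is only one interpretation of that idea under simplifying assumptions, not the authors' algorithm: the function names, the greedy one-pass clustering rule, and the threshold `eps` are all hypothetical.

```python
import numpy as np

# Hypothetical sketch of the cluster-averaging idea from the abstract.
# D_noisy is the observed N x N matrix of noisy distances d'(X_j, X_k).

def row_l2_gap(D_noisy, j, k):
    """Empirical proxy for the L2 distance between the expectation-functions
    f_{x_j} and f_{x_k}, where f_x(y) = E[d'(x, y)]: compare rows j and k of
    the noisy-distance matrix, averaging over all other sample points."""
    mask = np.ones(D_noisy.shape[0], dtype=bool)
    mask[[j, k]] = False
    return np.sqrt(np.mean((D_noisy[j, mask] - D_noisy[k, mask]) ** 2))

def build_clusters(D_noisy, eps):
    """Greedily group points whose expectation-functions are close in the
    empirical L2 sense; each cluster acts as a robust stand-in for one
    manifold location (threshold eps is an assumption of this sketch)."""
    unassigned = set(range(D_noisy.shape[0]))
    clusters = []
    while unassigned:
        center = unassigned.pop()
        members = [center]
        for j in sorted(unassigned):
            if row_l2_gap(D_noisy, center, j) < eps:
                members.append(j)
                unassigned.discard(j)
        clusters.append(members)
    return clusters

def recover_distance(D_noisy, cluster_a, cluster_b):
    """Denoised estimate of the distance between two cluster centers:
    average the observed noisy distances over all cross-cluster pairs,
    so that (assumed independent) per-pair noise averages out."""
    return float(np.mean(D_noisy[np.ix_(cluster_a, cluster_b)]))
```

The intuition this makes concrete: rows of the noisy matrix approximate the functions $f_x$, nearby rows identify near-duplicate sample points, and averaging independent observations across two clusters suppresses noise. The paper's actual guarantees, including the $O(\varepsilon \log \varepsilon^{-1})$ additive error and the $N \asymp \varepsilon^{-2d-2}\log(1/\varepsilon)$ sample complexity, rest on the geometric arguments in the text, not on this toy construction.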
Related papers
- Sublinear Time Quantum Sensitivity Sampling [57.356528942341534]
We present a unified framework for quantum sensitivity sampling, extending the advantages of quantum computing to a broad class of classical approximation problems.
Our framework provides a streamlined approach for constructing coresets and offers significant runtime improvements in applications such as clustering, regression, and low-rank approximation.
arXiv Detail & Related papers (2025-09-20T20:18:49Z)
- Fundamental Limits of Learning High-dimensional Simplices in Noisy Regimes [0.5892638927736114]
We prove that an algorithm exists that, with high probability, outputs a simplex within $\ell_2$ or total variation (TV) distance at most $\varepsilon$ from the true simplex.
In the noiseless scenario, our lower bound $n \ge \Omega(K/\varepsilon)$ matches known upper bounds up to constant factors.
arXiv Detail & Related papers (2025-06-11T18:35:38Z)
- Near-Optimal Clustering in Mixture of Markov Chains [74.3828414695655]
We study the problem of clustering $T$ trajectories of length $H$, each generated by one of $K$ unknown ergodic Markov chains over a finite state space of size $S$.
We derive an instance-dependent, high-probability lower bound on the clustering error rate, governed by the weighted KL divergence between the transition kernels of the chains.
We then present a novel two-stage clustering algorithm.
arXiv Detail & Related papers (2025-06-02T05:10:40Z)
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Online Learning with Adversaries: A Differential-Inclusion Analysis [52.43460995467893]
We introduce an observation-matrix-based framework for fully asynchronous online Federated Learning with adversaries.
Our main result is that the proposed algorithm almost surely converges to the desired mean $\mu$.
We derive this convergence using a novel differential-inclusion-based two-timescale analysis.
arXiv Detail & Related papers (2023-04-04T04:32:29Z) - Sample Complexity Bounds for Learning High-dimensional Simplices in
Noisy Regimes [5.526935605535376]
We find a sample complexity bound for learning a simplex from noisy samples.
We show that as long as $\mathrm{SNR} \ge \Omega(K^{1/2})$, the sample complexity of the noisy regime has the same order as that of the noiseless case.
arXiv Detail & Related papers (2022-09-09T23:35:25Z) - Non-Parametric Estimation of Manifolds from Noisy Data [1.0152838128195467]
We consider the problem of estimating a $d$-dimensional submanifold of $\mathbb{R}^D$ from a finite set of noisy samples.
We show that the estimation yields rates of convergence of $n^{-\frac{k}{2k+d}}$ for the point estimation and $n^{-\frac{k-1}{2k+d}}$ for the estimation of the tangent space.
arXiv Detail & Related papers (2021-05-11T02:29:33Z) - Consistent regression when oblivious outliers overwhelm [8.873449722727026]
Prior to our work, even for Gaussian $X$, no estimator for $\beta^*$ was known to be consistent in this model.
We show that consistent estimation is possible with nearly linear sample size and inverse-polynomial inlier fraction.
The model studied here also captures heavy-tailed noise distributions that may not even have a first moment.
arXiv Detail & Related papers (2020-09-30T16:21:34Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is …
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP).
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
- A Randomized Algorithm to Reduce the Support of Discrete Measures [79.55586575988292]
Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms.
We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling".
We then study its properties and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $N \gg n$ regime; a toy sketch of the classical recombination step behind the existence statement appears after this list.
arXiv Detail & Related papers (2020-06-02T16:38:36Z)
- Randomized Bregman Coordinate Descent Methods for Non-Lipschitz Optimization [31.474280642125734]
A new randomized Bregman (block) coordinate descent (CD) method is proposed.
We show that the iteration complexity of the proposed method is $O(\epsilon^{-2}n)$ to achieve an $\epsilon$-stationary point, where $n$ is the number of blocks of coordinates.
arXiv Detail & Related papers (2020-01-15T09:57:38Z)
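As noted in the entry on reducing the support of discrete measures above: the existence statement there (a measure on $N$ atoms whose integrals against $n$ functions are matched by a measure supported on at most $n+1$ of those atoms) is classically obtained by a Carathéodory-type recombination argument, sketched below. This is not that paper's "greedy geometric sampling" algorithm; the function name `reduce_support` and the SVD-based choice of null-space direction are illustrative assumptions.

```python
import numpy as np

def reduce_support(weights, F):
    """Thin a discrete measure to at most n + 1 atoms while preserving the
    integrals of n given functions. weights: shape (N,), nonnegative;
    F: shape (n, N) with F[i, j] = f_i(x_j)."""
    w = np.asarray(weights, dtype=float).copy()
    # Append a row of ones so total mass is also preserved.
    A = np.vstack([np.asarray(F, dtype=float), np.ones(w.size)])
    m = A.shape[0]  # n + 1 preserved linear functionals
    while np.count_nonzero(w) > m:
        support = np.flatnonzero(w)
        # More than m active columns must be linearly dependent: take a
        # null-space direction of A restricted to the support.
        _, _, Vt = np.linalg.svd(A[:, support])
        v = Vt[-1]
        if not (v > 1e-12).any():
            v = -v  # ensure a strictly positive entry to step against
        pos = v > 1e-12
        # Largest step keeping all weights nonnegative; at least one
        # weight on the support hits zero, and A @ w is unchanged.
        t = np.min(w[support][pos] / v[pos])
        w[support] = w[support] - t * v
        w[support[np.abs(w[support]) < 1e-12]] = 0.0
    return w

# Example: thin a 1000-atom measure while preserving its first 5 moments.
rng = np.random.default_rng(0)
x = rng.uniform(size=1000)
w0 = rng.dirichlet(np.ones(1000))
F = np.vstack([x ** k for k in range(1, 6)])  # f_k(x) = x^k
w1 = reduce_support(w0, F)
assert np.count_nonzero(w1) <= 6
assert np.allclose(F @ w0, F @ w1)
```

Each pass of the loop removes at least one atom while leaving every preserved integral fixed, so it terminates with at most $n+1$ atoms; per its abstract, the cited paper instead characterizes barycenters geometrically via negative cones and computes the reduced measure by randomized greedy geometric sampling.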