Fermat Distances: Metric Approximation, Spectral Convergence, and
Clustering Algorithms
- URL: http://arxiv.org/abs/2307.05750v1
- Date: Fri, 7 Jul 2023 16:36:00 GMT
- Title: Fermat Distances: Metric Approximation, Spectral Convergence, and
Clustering Algorithms
- Authors: Nicol\'as Garc\'ia Trillos, Anna Little, Daniel McKenzie, James M.
Murphy
- Abstract summary: We prove that Fermat distances converge to their continuum analogues in small neighborhoods with a precise rate that depends on the intrinsic dimensionality of the data.
Our results are then used to prove that discrete graph Laplacians based on discrete, sample-driven Fermat distances converge to corresponding continuum operators.
The perspective afforded by our discrete-to-continuum Fermat distance analysis leads to new clustering algorithms for data.
- Score: 7.07321040534471
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We analyze the convergence properties of Fermat distances, a family of
density-driven metrics defined on Riemannian manifolds with an associated
probability measure. Fermat distances may be defined either on discrete samples
from the underlying measure, in which case they are random, or in the continuum
setting, in which they are induced by geodesics under a density-distorted
Riemannian metric. We prove that discrete, sample-based Fermat distances
converge to their continuum analogues in small neighborhoods with a precise
rate that depends on the intrinsic dimensionality of the data and the parameter
governing the extent of density weighting in Fermat distances. This is done by
leveraging novel geometric and statistical arguments in percolation theory that
allow for non-uniform densities and curved domains. Our results are then used
to prove that discrete graph Laplacians based on discrete, sample-driven Fermat
distances converge to corresponding continuum operators. In particular, we show
the discrete eigenvalues and eigenvectors converge to their continuum analogues
at a dimension-dependent rate, which allows us to interpret the efficacy of
discrete spectral clustering using Fermat distances in terms of the resulting
continuum limit. The perspective afforded by our discrete-to-continuum Fermat
distance analysis leads to new clustering algorithms for data and related
insights into efficient computations associated to density-driven spectral
clustering. Our theoretical analysis is supported with numerical simulations
and experiments on synthetic and real image data.
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Computing distances and means on manifolds with a metric-constrained Eikonal approach [4.266376725904727]
We introduce the metric-constrained Eikonal solver to obtain continuous, differentiable representations of distance functions.
The differentiable nature of these representations allows for the direct computation of globally length-minimising paths on the manifold.
arXiv Detail & Related papers (2024-04-12T18:26:32Z) - Sampling and estimation on manifolds using the Langevin diffusion [45.57801520690309]
Two estimators of linear functionals of $mu_phi $ based on the discretized Markov process are considered.
Error bounds are derived for sampling and estimation using a discretization of an intrinsically defined Langevin diffusion.
arXiv Detail & Related papers (2023-12-22T18:01:11Z) - Time-inhomogeneous diffusion geometry and topology [69.55228523791897]
Diffusion condensation is a time-inhomogeneous process where each step first computes and then applies a diffusion operator to the data.
We theoretically analyze the convergence and evolution of this process from geometric, spectral, and topological perspectives.
Our work gives theoretical insights into the convergence of diffusion condensation, and shows that it provides a link between topological and geometric data analysis.
arXiv Detail & Related papers (2022-03-28T16:06:17Z) - Tangent Space and Dimension Estimation with the Wasserstein Distance [10.118241139691952]
Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space.
We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold.
arXiv Detail & Related papers (2021-10-12T21:02:06Z) - Kernel distance measures for time series, random fields and other
structured data [71.61147615789537]
kdiff is a novel kernel-based measure for estimating distances between instances of structured data.
It accounts for both self and cross similarities across the instances and is defined using a lower quantile of the distance distribution.
Some theoretical results are provided for separability conditions using kdiff as a distance measure for clustering and classification problems.
arXiv Detail & Related papers (2021-09-29T22:54:17Z) - The Connection between Discrete- and Continuous-Time Descriptions of
Gaussian Continuous Processes [60.35125735474386]
We show that discretizations yielding consistent estimators have the property of invariance under coarse-graining'
This result explains why combining differencing schemes for derivatives reconstruction and local-in-time inference approaches does not work for time series analysis of second or higher order differential equations.
arXiv Detail & Related papers (2021-01-16T17:11:02Z) - Intrinsic persistent homology via density-based metric learning [1.0499611180329804]
We prove that the metric space defined by the sample endowed with a computable metric known as sample Fermat distance converges a.s.
The limiting object is the manifold itself endowed with the population Fermat distance, an intrinsic metric that accounts for both the geometry of the manifold and the density that produces the sample.
arXiv Detail & Related papers (2020-12-11T18:54:36Z) - The Convergence Indicator: Improved and completely characterized
parameter bounds for actual convergence of Particle Swarm Optimization [68.8204255655161]
We introduce a new convergence indicator that can be used to calculate whether the particles will finally converge to a single point or diverge.
Using this convergence indicator we provide the actual bounds completely characterizing parameter regions that lead to a converging swarm.
arXiv Detail & Related papers (2020-06-06T19:08:05Z) - Parallelity of mixed quantum ensembles [0.0]
A unifying framework for identifying distance and holonomy for decompositions of density operators is introduced.
A gauge invariant spectral geometric phase for discrete sequences of mixed quantum states is obtained as the phase of the trace of the spectral holonomy.
arXiv Detail & Related papers (2020-01-17T15:06:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.