A Normalized Bottleneck Distance on Persistence Diagrams and Homology
Preservation under Dimension Reduction
- URL: http://arxiv.org/abs/2306.06727v1
- Date: Sun, 11 Jun 2023 17:17:48 GMT
- Title: A Normalized Bottleneck Distance on Persistence Diagrams and Homology
Preservation under Dimension Reduction
- Authors: Bala Krishnamoorthy and Nathan H. May
- Abstract summary: We define a new scale-invariant distance between persistence diagrams termed normalized bottleneck distance, d_N.
We study two popular dimension reduction techniques: Johnson-Lindenstrauss (JL) projections and metric multidimensional scaling (mMDS).
We provide new bounds on how well these dimension reduction techniques preserve homology with respect to d_N.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Persistence diagrams are used as signatures of point cloud data assumed to be
sampled from manifolds, and represent their topology in a compact fashion.
Further, two given clouds of points can be compared by directly comparing their
persistence diagrams using the bottleneck distance, d_B. But one potential
drawback of this pipeline is that point clouds sampled from topologically
similar manifolds can have arbitrarily large d_B values when there is a large
degree of scaling between them. This situation is typical in dimension
reduction frameworks that also aim to preserve topology.
We define a new scale-invariant distance between persistence diagrams termed
normalized bottleneck distance, d_N, and study its properties. In defining d_N,
we also develop a broader framework called metric decomposition for comparing
finite metric spaces of equal cardinality with a bijection. We utilize metric
decomposition to prove a stability result for d_N by deriving an explicit bound
on the distortion of the associated bijective map. We then study two popular
dimension reduction techniques, Johnson-Lindenstrauss (JL) projections and
metric multidimensional scaling (mMDS), and a third class of general
biLipschitz mappings. We provide new bounds on how well these dimension
reduction techniques preserve homology with respect to d_N. For a JL map f that
transforms input X to f(X), we show that d_N(dgm(X), dgm(f(X))) < e, where dgm(X)
is the Vietoris-Rips persistence diagram of X, and 0 < e < 1 is the tolerance
up to which pairwise distances are preserved by f. For mMDS, we present new
bounds for both d_B and d_N between persistence diagrams of X and its
projection in terms of the eigenvalues of the covariance matrix. And for
k-biLipschitz maps, we show that d_N is bounded by the product of (k^2-1)/k and
the ratio of diameters of X and f(X).
Related papers
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Horizon-Free Regret for Linear Markov Decision Processes [92.02082223856479]
A recent line of works showed regret bounds in reinforcement learning can be (nearly) independent of planning horizon.
We give the first horizon-free bound for the popular linear Markov Decision Process (MDP) setting.
In contrast to prior works which explicitly estimate the transition model and compute the inhomogeneous value functions, we directly estimate the value functions and confidence sets.
arXiv Detail & Related papers (2024-03-15T23:50:58Z) - Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high-dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high-dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - High-dimensional limit theorems for SGD: Effective dynamics and critical
scaling [6.950316788263433]
We prove limit theorems for the trajectories of summary statistics of stochastic gradient descent (SGD).
We show a critical scaling regime for the step-size, below which the effective ballistic dynamics matches gradient flow for the population loss.
Near the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate.
arXiv Detail & Related papers (2022-06-08T17:42:18Z) - Entropic Optimal Transport in Random Graphs [8.7314407902481]
In graph analysis, a classic task consists in computing similarity measures between (groups of) nodes.
We show that it is possible to consistently estimate distances between groups of nodes in the latent space.
arXiv Detail & Related papers (2022-01-11T13:52:34Z) - Spatially relaxed inference on high-dimensional linear models [48.989769153211995]
We study the properties of ensembled clustered inference algorithms which combine spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions.
We show that ensembled clustered inference algorithms control the $\delta$-FWER under standard assumptions for $\delta$ equal to the largest cluster diameter.
arXiv Detail & Related papers (2021-06-04T16:37:19Z) - Last iterate convergence of SGD for Least-Squares in the Interpolation
regime [19.05750582096579]
We study the noiseless model in the fundamental least-squares setup.
We assume that an optimum predictor perfectly fits inputs and outputs, $\langle \theta_*, \phi(X) \rangle = Y$, where $\phi(X)$ stands for a possibly infinite-dimensional non-linear feature map.
arXiv Detail & Related papers (2021-02-05T14:02:20Z) - From Geometry to Topology: Inverse Theorems for Distributed Persistence [0.0]
We show that the correct invariant is not the persistence diagram of X, but rather the collection of persistence diagrams of many small subsets.
This invariant, which we call "distributed persistence," is trivially parallelizable and more stable to outliers.
Results are complemented by two synthetic experiments demonstrating the use of distributed persistence in practice.
arXiv Detail & Related papers (2021-01-28T21:36:45Z) - Manifold Learning via Manifold Deflation [105.7418091051558]
Dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
However, many popular methods can fail dramatically, even on simple two-dimensional manifolds.
This paper presents an embedding method based on a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z) - Unsupervised Discretization by Two-dimensional MDL-based Histogram [0.0]
Unsupervised discretization is a crucial step in many knowledge discovery tasks.
We propose an expressive model class that allows for far more flexible partitions of two-dimensional data.
We introduce an algorithm, named PALM, which Partitions each dimension ALternately and then Merges neighboring regions.
arXiv Detail & Related papers (2020-06-02T19:19:49Z)