A Normalized Bottleneck Distance on Persistence Diagrams and Homology
Preservation under Dimension Reduction
- URL: http://arxiv.org/abs/2306.06727v1
- Date: Sun, 11 Jun 2023 17:17:48 GMT
- Title: A Normalized Bottleneck Distance on Persistence Diagrams and Homology
Preservation under Dimension Reduction
- Authors: Bala Krishnamoorthy and Nathan H. May
- Abstract summary: We define a new scale-invariant distance between persistence diagrams termed normalized bottleneck distance, d_N.
We study two popular dimension reduction techniques, Johnson-Lindenstrauss (JL) projections and metric multidimensional scaling (mMDS).
We provide new bounds on how well these dimension reduction techniques preserve homology with respect to d_N.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Persistence diagrams are used as signatures of point cloud data assumed to be
sampled from manifolds, and represent their topology in a compact fashion.
Further, two given clouds of points can be compared by directly comparing their
persistence diagrams using the bottleneck distance, d_B. But one potential
drawback of this pipeline is that point clouds sampled from topologically
similar manifolds can have arbitrarily large d_B values when there is a large
degree of scaling between them. This situation is typical in dimension
reduction frameworks that are also aiming to preserve topology.
We define a new scale-invariant distance between persistence diagrams termed
normalized bottleneck distance, d_N, and study its properties. In defining d_N,
we also develop a broader framework called metric decomposition for comparing
finite metric spaces of equal cardinality with a bijection. We utilize metric
decomposition to prove a stability result for d_N by deriving an explicit bound
on the distortion of the associated bijective map. We then study two popular
dimension reduction techniques, Johnson-Lindenstrauss (JL) projections and
metric multidimensional scaling (mMDS), and a third class of general
biLipschitz mappings. We provide new bounds on how well these dimension
reduction techniques preserve homology with respect to d_N. For a JL map f that
transforms input X to f(X), we show that d_N(dgm(X), dgm(f(X))) < e, where dgm(X)
is the Vietoris-Rips persistence diagram of X, and 0 < e < 1 is the tolerance
up to which pairwise distances are preserved by f. For mMDS, we present new
bounds for both d_B and d_N between persistence diagrams of X and its
projection in terms of the eigenvalues of the covariance matrix. And for
k-biLipschitz maps, we show that d_N is bounded by the product of (k^2-1)/k and
the ratio of diameters of X and f(X).
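To make the pipeline concrete, here is a minimal sketch (assuming numpy, scipy, and the gudhi library are available) of the JL part of the abstract: project a point cloud with a random Gaussian map, build the Vietoris-Rips persistence diagrams of X and f(X), and compare them with the bottleneck distance d_B. The unit-diameter rescaling at the end is only a crude stand-in for d_N, which the paper defines directly on the diagrams; the point cloud, dimensions, and target dimension below are illustrative choices, not the paper's.

```python
import numpy as np
from scipy.spatial.distance import pdist
import gudhi  # https://gudhi.inria.fr, assumed installed

def rips_h1_diagram(points):
    """H1 Vietoris-Rips persistence diagram as (birth, death) pairs."""
    rips = gudhi.RipsComplex(points=points, max_edge_length=pdist(points).max())
    st = rips.create_simplex_tree(max_dimension=2)  # 2-simplices needed for H1
    st.persistence()
    return st.persistence_intervals_in_dimension(1)

rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 150)
circle = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(150, 2))
X = np.pad(circle, ((0, 0), (0, 8)))  # noisy circle embedded in R^10

# JL-style projection: a Gaussian matrix scaled by 1/sqrt(k) preserves
# pairwise distances up to a (1 +/- e) factor with high probability.
k = 4
f = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)
fX = X @ f

d_B = gudhi.bottleneck_distance(rips_h1_diagram(X), rips_h1_diagram(fX))

# Crude normalization: rescale both clouds to unit diameter before comparing.
# This is only a proxy for d_N, not the paper's actual definition.
d_N_proxy = gudhi.bottleneck_distance(
    rips_h1_diagram(X / pdist(X).max()),
    rips_h1_diagram(fX / pdist(fX).max()),
)
print(f"d_B = {d_B:.4f}, unit-diameter proxy = {d_N_proxy:.4f}")
```

For the mMDS side, classical (Torgerson) MDS makes the eigenvalue dependence visible; note the paper analyzes metric MDS, of which this eigendecomposition-based version is the simplest relative.

```python
def classical_mds(points, k):
    """Classical MDS: top-k eigenpairs of the double-centered distance matrix."""
    n = len(points)
    D2 = np.square(points[:, None, :] - points[None, :, :]).sum(axis=-1)
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    B = -0.5 * J @ D2 @ J                 # Gram matrix of centered points
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:k]         # dropped eigenvalues = lost structure
    return V[:, top] * np.sqrt(np.clip(w[top], 0.0, None))

Y = classical_mds(X, 4)                   # 10-dimensional cloud down to R^4
```

In this classical variant, the eigenvalues discarded when truncating to k dimensions measure what the projection loses, which is the role the covariance eigenvalues play in the abstract's mMDS bounds.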
Related papers
- Proper Latent Decomposition [4.266376725904727]
We compute a reduced set of intrinsic coordinates (latent space) to accurately describe a flow with fewer degrees of freedom than the numerical discretization.
With this proposed numerical framework, we propose an algorithm to perform PLD on the manifold.
This work opens opportunities for analyzing autoencoders and latent spaces, nonlinear reduced-order modeling and scientific insights into the structure of high-dimensional data.
arXiv Detail & Related papers (2024-12-01T12:19:08Z)
- Dimension reduction and the gradient flow of relative entropy [0.0]
Dimension reduction, widely used in science, maps high-dimensional data into low-dimensional space.
We investigate a basic mathematical model underlying the techniques of stochastic neighbor embedding (SNE) and its popular variant t-SNE.
The aim is to map high-dimensional points to low dimensions in an optimal way, so that similar points land closer together.
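As a concrete anchor for the objective being described, the sketch below (a minimal numpy illustration, with a fixed bandwidth sigma in place of SNE's usual per-point perplexity calibration) computes the Gaussian conditional affinities p(j|i) that SNE tries to reproduce in the low-dimensional embedding; t-SNE swaps in Student-t affinities on the embedding side.

```python
import numpy as np

def sne_affinities(X, sigma=1.0):
    """Conditional probabilities p(j|i) from Gaussian similarities."""
    D2 = np.square(X[:, None, :] - X[None, :, :]).sum(axis=-1)
    logits = -D2 / (2.0 * sigma**2)
    np.fill_diagonal(logits, -np.inf)            # p(i|i) = 0 by convention
    P = np.exp(logits - logits.max(axis=1, keepdims=True))
    return P / P.sum(axis=1, keepdims=True)      # each row sums to 1
```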
arXiv Detail & Related papers (2024-09-25T14:23:04Z)
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space in which to model functions learned by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high-dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z)
- Nearly $d$-Linear Convergence Bounds for Diffusion Models via Stochastic Localization [40.808942894229325]
We provide the first convergence bounds which are linear in the data dimension.
We show that diffusion models require at most $\tilde{O}(d \log^2(1/\delta) / \varepsilon^2)$ steps to approximate an arbitrary distribution.
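Ignoring the constants and log factors hidden in the $\tilde{O}$, plugging illustrative numbers into this rate gives a feel for the claimed linear-in-$d$ scaling (order-of-magnitude arithmetic only, not the paper's exact bound):

```python
import numpy as np

d, delta, eps = 1_000, 1e-3, 0.1  # illustrative values, not from the paper
steps = d * np.log(1.0 / delta) ** 2 / eps**2
print(f"steps ~ {steps:.2e}")     # ~4.77e+09; doubling d doubles the count
```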
arXiv Detail & Related papers (2023-08-07T16:01:14Z)
- Pseudonorm Approachability and Applications to Regret Minimization [73.54127663296906]
We convert high-dimensional $\ell_\infty$-approachability problems to low-dimensional pseudonorm approachability problems.
We develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for $\ell_2$ and other norms.
arXiv Detail & Related papers (2023-02-03T03:19:14Z)
- Bayesian Hyperbolic Multidimensional Scaling [2.5944208050492183]
We propose a Bayesian approach to multidimensional scaling when the low-dimensional manifold is hyperbolic.
A case-control likelihood approximation allows for efficient sampling from the posterior distribution in larger data settings.
We evaluate the proposed method against state-of-the-art alternatives using simulations, canonical reference datasets, Indian village network data, and human gene expression data.
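For orientation, one standard concrete model of a hyperbolic low-dimensional space is the Poincaré ball; the sketch below computes its geodesic distance, which plays the role Euclidean distance plays in ordinary MDS. This is an assumption made for illustration; the paper's own model and parametrization may differ.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-12):
    """Geodesic distance between points u, v in the open unit Poincare ball."""
    uu, vv = np.dot(u, u), np.dot(v, v)
    duv = np.dot(u - v, u - v)
    denom = max((1.0 - uu) * (1.0 - vv), eps)  # guard points near the boundary
    return np.arccosh(1.0 + 2.0 * duv / denom)
```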
arXiv Detail & Related papers (2022-10-26T23:34:30Z)
- Generalization Bounds for Stochastic Gradient Descent via Localized $\varepsilon$-Covers [16.618918548497223]
We propose a new covering technique localized for the trajectories of SGD.
This localization provides an algorithm-specific complexity measured by the covering number.
We derive these results in various contexts and improve the known state-of-the-art label rates.
arXiv Detail & Related papers (2022-09-19T12:11:07Z)
- High-dimensional limit theorems for SGD: Effective dynamics and critical scaling [6.950316788263433]
We prove limit theorems for the trajectories of summary statistics of stochastic gradient descent (SGD).
We show a critical scaling regime for the step-size, below which the effective ballistic dynamics matches gradient flow for the population loss.
Around the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate.
arXiv Detail & Related papers (2022-06-08T17:42:18Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to manifold hypothesis checking and underlying manifold dimension estimation.
Our geometrical method is a modification for sparse data of a well-known box-counting algorithm for Minkowski dimension calculation.
Experiments on real datasets show that the suggested approach, based on a combination of the two methods, is powerful and effective.
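For reference, the textbook box-counting estimator that the geometrical method modifies for sparse data looks roughly like the minimal numpy sketch below; it is not the authors' algorithm.

```python
import numpy as np

def box_counting_dimension(X, scales=(0.5, 0.25, 0.125, 0.0625)):
    """Slope of log N(eps) vs log(1/eps); N(eps) = number of occupied boxes."""
    span = np.ptp(X, axis=0)
    span[span == 0] = 1.0                          # guard flat coordinates
    X = (X - X.min(axis=0)) / span                 # normalize to the unit cube
    counts = [len(np.unique(np.floor(X / eps), axis=0)) for eps in scales]
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(scales)), np.log(counts), 1)
    return slope

# Sanity check: points filling the unit cube in R^3 give dimension roughly 3.
print(box_counting_dimension(np.random.default_rng(0).random((5000, 3))))
```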
arXiv Detail & Related papers (2021-07-08T15:35:54Z)
- Spatially relaxed inference on high-dimensional linear models [48.989769153211995]
We study the properties of ensembled clustered inference algorithms which combine spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions.
We show that ensembled clustered inference algorithms control the $\delta$-FWER under standard assumptions for $\delta$ equal to the largest cluster diameter.
arXiv Detail & Related papers (2021-06-04T16:37:19Z)