A Class of Topological Pseudodistances for Fast Comparison of
Persistence Diagrams
- URL: http://arxiv.org/abs/2402.14489v1
- Date: Thu, 22 Feb 2024 12:27:35 GMT
- Title: A Class of Topological Pseudodistances for Fast Comparison of
Persistence Diagrams
- Authors: Rolando Kindelan Nu\~nez, Mircea Petrache, Mauricio Cerda, Nancy
Hitschfeld
- Abstract summary: We introduce a class of pseudodistances called Extended Topological Pseudodistances (ETD)
ETDs have tunable complexity, and can approximate Sliced and classical Wasserstein distances at the high-complexity extreme.
We experimentally verify that ETDs outperform PSs in terms of accuracy and outperform Wasserstein and Sliced Wasserstein distances in terms of computational complexity.
- Score: 0.3968603035422276
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Persistence diagrams (PD)s play a central role in topological data analysis,
and are used in an ever increasing variety of applications. The comparison of
PD data requires computing comparison metrics among large sets of PDs, with
metrics which are accurate, theoretically sound, and fast to compute.
Especially for denser multi-dimensional PDs, such comparison metrics are
lacking. While on the one hand, Wasserstein-type distances have high accuracy
and theoretical guarantees, they incur high computational cost. On the other
hand, distances between vectorizations such as Persistence Statistics (PS)s
have lower computational cost, but lack the accuracy guarantees and in general
they are not guaranteed to distinguish PDs (i.e. the two PS vectors of
different PDs may be equal). In this work we introduce a class of
pseudodistances called Extended Topological Pseudodistances (ETD)s, which have
tunable complexity, and can approximate Sliced and classical Wasserstein
distances at the high-complexity extreme, while being computationally lighter
and close to Persistence Statistics at the lower complexity extreme, and thus
allow users to interpolate between the two metrics. We build theoretical
comparisons to show how to fit our new distances at an intermediate level
between persistence vectorizations and Wasserstein distances. We also
experimentally verify that ETDs outperform PSs in terms of accuracy and
outperform Wasserstein and Sliced Wasserstein distances in terms of
computational complexity.
Related papers
- Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$)
We show that $RW_p distances are also real distance metrics defined on the quotient set $mathcalP_p(mathbbRn)/sim$ invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z) - Linearized Wasserstein dimensionality reduction with approximation
guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z) - Statistical, Robustness, and Computational Guarantees for Sliced
Wasserstein Distances [18.9717974398864]
Sliced Wasserstein distances preserve properties of classic Wasserstein distances while being more scalable for computation and estimation in high dimensions.
We quantify this scalability from three key aspects: (i) empirical convergence rates; (ii) robustness to data contamination; and (iii) efficient computational methods.
arXiv Detail & Related papers (2022-10-17T15:04:51Z) - Estimation and Quantization of Expected Persistence Diagrams [0.0]
We study two such summaries, the Persistence Diagram (EPD) and its quantization.
EPD is a measure supported on R 2.
We prove that this estimator is optimal from a minimax standpoint on a large class of models with a parametric rate of convergence.
arXiv Detail & Related papers (2021-05-11T08:12:18Z) - Depth-based pseudo-metrics between probability distributions [1.1470070927586016]
We propose two new pseudo-metrics between continuous probability measures based on data depth and its associated central regions.
In contrast to the Wasserstein distance, the proposed pseudo-metrics do not suffer from the curse of dimensionality.
The regions-based pseudo-metric appears to be robust w.r.t. both outliers and heavy tails.
arXiv Detail & Related papers (2021-03-23T17:33:18Z) - Balancing Geometry and Density: Path Distances on High-Dimensional Data [0.0]
New geometric and computational analyses of power-weighted shortest-path distances (PWSPDs) are presented.
By illuminating the way these metrics balance density and geometry in the underlying data, we discuss how they may be chosen in practice.
arXiv Detail & Related papers (2020-12-17T04:03:15Z) - Two-sample Test using Projected Wasserstein Distance [18.46110328123008]
We develop a projected Wasserstein distance for the two-sample test, a fundamental problem in statistics and machine learning.
A key contribution is to couple optimal projection to find the low dimensional linear mapping to maximize the Wasserstein distance between projected probability distributions.
arXiv Detail & Related papers (2020-10-22T18:08:58Z) - On Projection Robust Optimal Transport: Sample Complexity and Model
Misspecification [101.0377583883137]
Projection robust (PR) OT seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected.
Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances.
Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, by averaging rather than optimizing on subspaces.
arXiv Detail & Related papers (2020-06-22T14:35:33Z) - Augmented Sliced Wasserstein Distances [55.028065567756066]
We propose a new family of distance metrics, called augmented sliced Wasserstein distances (ASWDs)
ASWDs are constructed by first mapping samples to higher-dimensional hypersurfaces parameterized by neural networks.
Numerical results demonstrate that the ASWD significantly outperforms other Wasserstein variants for both synthetic and real-world problems.
arXiv Detail & Related papers (2020-06-15T23:00:08Z) - Projection Robust Wasserstein Distance and Riemannian Optimization [107.93250306339694]
We show that projection robustly solidstein (PRW) is a robust variant of Wasserstein projection (WPP)
This paper provides a first step into the computation of the PRW distance and provides the links between their theory and experiments on and real data.
arXiv Detail & Related papers (2020-06-12T20:40:22Z) - Fast and Robust Comparison of Probability Measures in Heterogeneous
Spaces [62.35667646858558]
We introduce the Anchor Energy (AE) and Anchor Wasserstein (AW) distances, which are respectively the energy and Wasserstein distances instantiated on such representations.
Our main contribution is to propose a sweep line algorithm to compute AE emphexactly in log-quadratic time, where a naive implementation would be cubic.
We show that AE and AW perform well in various experimental settings at a fraction of the computational cost of popular GW approximations.
arXiv Detail & Related papers (2020-02-05T03:09:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.