From Geometry to Topology: Inverse Theorems for Distributed Persistence
- URL: http://arxiv.org/abs/2101.12288v1
- Date: Thu, 28 Jan 2021 21:36:45 GMT
- Title: From Geometry to Topology: Inverse Theorems for Distributed Persistence
- Authors: Elchanan Solomon, Alex Wagner, Paul Bendich
- Abstract summary: We show that the correct invariant is not the persistence diagram of X, but rather the collection of persistence diagrams of many small subsets.
This invariant, which we call "distributed persistence," is trivially parallelizable, more stable to outliers, and has a rich inverse theory.
These results are complemented by two synthetic experiments demonstrating the use of distributed persistence in practice.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: What is the "right" topological invariant of a large point cloud X? Prior
research has focused on estimating the full persistence diagram of X, a
quantity that is very expensive to compute, unstable to outliers, and far from
a sufficient statistic. We therefore propose that the correct invariant is not
the persistence diagram of X, but rather the collection of persistence diagrams
of many small subsets. This invariant, which we call "distributed persistence,"
is trivially parallelizable, more stable to outliers, and has a rich inverse
theory. The map from the space of point clouds (with the quasi-isometry metric)
to the space of distributed persistence invariants (with the
Hausdorff-Bottleneck distance) is a global quasi-isometry. This is a much
stronger property than simply being injective, as it implies that the inverse
of a small neighborhood is a small neighborhood, and is to our knowledge the
only result of its kind in the TDA literature. Moreover, the quasi-isometry
bounds depend on the size of the subsets taken, so that as the size of these
subsets goes from small to large, the invariant interpolates between a purely
geometric one and a topological one. Lastly, we note that our inverse results
do not actually require considering all subsets of a fixed size (an enormous
collection), but a relatively small collection satisfying certain covering
properties that arise with high probability when randomly sampling subsets.
These theoretical results are complemented by two synthetic experiments
demonstrating the use of distributed persistence in practice.
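As a concrete illustration of the pipeline the abstract describes, here is a minimal sketch assuming the gudhi library: sample many small random subsets of a point cloud, compute their persistence diagrams, and compare two such collections with a Hausdorff distance whose underlying metric on diagrams is the bottleneck distance. The subset size k, the number of subsets m, and the use of degree-1 homology are illustrative choices, not values prescribed by the paper.

```python
# Sketch of distributed persistence (illustrative; not the authors' code).
import numpy as np
import gudhi  # assumed dependency; any TDA library with Rips + bottleneck works

def subset_diagrams(points, k=20, m=50, homology_dim=1, seed=0):
    """Degree-`homology_dim` persistence diagrams of m random k-point subsets."""
    rng = np.random.default_rng(seed)
    diagrams = []
    for _ in range(m):  # each iteration is independent, hence trivially parallel
        idx = rng.choice(len(points), size=k, replace=False)
        st = gudhi.RipsComplex(points=points[idx]).create_simplex_tree(
            max_dimension=homology_dim + 1)
        st.compute_persistence()
        diagrams.append(st.persistence_intervals_in_dimension(homology_dim))
    return diagrams

def hausdorff_bottleneck(dgms_x, dgms_y):
    """Hausdorff distance between two diagram collections, with the bottleneck
    distance as the underlying metric on individual diagrams."""
    d = np.array([[gudhi.bottleneck_distance(a, b) for b in dgms_y]
                  for a in dgms_x])
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# Toy usage: a point cloud and a small perturbation of it should be close.
X = np.random.default_rng(1).normal(size=(200, 3))
Y = X + 0.01 * np.random.default_rng(2).normal(size=(200, 3))
print(hausdorff_bottleneck(subset_diagrams(X), subset_diagrams(Y)))
```

The loop over subsets is what makes the invariant trivially parallelizable, and each k-point Rips computation is far cheaper than running persistence on all of X.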
Related papers
- Towards Self-Supervised Covariance Estimation in Deep Heteroscedastic Regression [102.24287051757469]
We study self-supervised covariance estimation in deep heteroscedastic regression.
We derive an upper bound on the 2-Wasserstein distance between normal distributions.
Experiments over a wide range of synthetic and real datasets demonstrate that the proposed 2-Wasserstein bound, coupled with pseudo-label annotations, yields computationally cheaper yet accurate deep heteroscedastic regression.
arXiv Detail & Related papers (2025-02-14T22:37:11Z)
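For context on the entry above: the exact 2-Wasserstein distance between two normal distributions, which the paper bounds from above, has a classical closed form. A quick sketch of that standard formula (not the paper's bound), assuming NumPy and SciPy:

```python
# Closed-form 2-Wasserstein distance between N(mu1, cov1) and N(mu2, cov2):
# W2^2 = ||mu1 - mu2||^2 + tr(cov1 + cov2 - 2 (cov1^{1/2} cov2 cov1^{1/2})^{1/2}).
import numpy as np
from scipy.linalg import sqrtm

def w2_gaussian(mu1, cov1, mu2, cov2):
    s1 = sqrtm(cov1)
    cross = np.real(sqrtm(s1 @ cov2 @ s1))  # real part strips numerical noise
    w2_sq = np.sum((mu1 - mu2) ** 2) + np.trace(cov1 + cov2 - 2 * cross)
    return float(np.sqrt(max(w2_sq, 0.0)))  # clamp tiny negatives from round-off
```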
- Blessing of Dimensionality for Approximating Sobolev Classes on Manifolds [14.183849746284816]
The manifold hypothesis says that natural high-dimensional data is supported on or around a low-dimensional manifold.
Recent success of statistical and learning-based methods empirically supports this hypothesis.
We provide theoretical statistical complexity results, which directly relate to generalization properties.
arXiv Detail & Related papers (2024-08-13T15:56:42Z)
- Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high-dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high-dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z)
- Almost Equivariance via Lie Algebra Convolutions [0.0]
We provide a definition of almost equivariance and give a practical method for encoding it in models.
Specifically, we define Lie algebra convolutions and demonstrate that they offer several benefits over Lie group convolutions.
We prove two existence theorems, one showing the existence of almost isometries within bounded distance of isometries of a manifold, and another showing the converse for Hilbert spaces.
arXiv Detail & Related papers (2023-10-19T21:31:11Z)
- Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariate, denoted by $X$, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z)
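The conformal-inference entry above builds prediction sets on a manifold; as background, the standard Euclidean split-conformal recipe it generalizes looks roughly like this. This is a sketch of the textbook method, not the paper's manifold construction, and `model` stands for any fitted regressor with a `predict` method:

```python
# Split conformal prediction for regression (textbook recipe, shown for context).
import numpy as np

def split_conformal_interval(x_new, model, X_cal, y_cal, alpha=0.1):
    """Prediction interval with marginal coverage >= 1 - alpha,
    valid under exchangeability of calibration and test points."""
    scores = np.abs(y_cal - model.predict(X_cal))  # calibration residuals
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample correction
    q = np.quantile(scores, level)
    pred = model.predict(np.atleast_2d(x_new))[0]
    return pred - q, pred + q
```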
- The Exact Sample Complexity Gain from Invariances for Kernel Regression [37.74032673086741]
In practice, encoding invariances into models improves sample complexity.
We provide minimax optimal rates for kernel ridge regression on compact manifolds.
Our results hold for any smooth compact Lie group action, even groups of positive dimension.
arXiv Detail & Related papers (2023-03-24T20:47:31Z)
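A textbook way to encode invariance under a finite group $G$ into a kernel method, in the spirit of the entry above, is to average the kernel over the group: $K_G(x, y) = |G|^{-1} \sum_{g \in G} K(x, g \cdot y)$. A sketch with an illustrative choice of group (planar rotations by multiples of 90 degrees) and base kernel; neither is taken from the paper:

```python
# Group-averaged (invariant) kernel for kernel ridge regression.
import numpy as np

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((x - y) ** 2))

# Finite cyclic group: planar rotations by 0, 90, 180, 270 degrees.
ROTS = [np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
        for t in (0.0, np.pi / 2, np.pi, 3 * np.pi / 2)]

def invariant_kernel(x, y):
    """K_G(x, y): averaging over G makes the kernel G-invariant in each argument."""
    return float(np.mean([rbf(x, R @ y) for R in ROTS]))
```

Plugging K_G into the usual kernel ridge regression Gram matrix restricts the hypothesis space to G-invariant functions, which is the mechanism behind sample-complexity gains of this kind.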
- Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $\mathcal{F}$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z)
- Coordinate Independent Convolutional Networks -- Isometry and Gauge Equivariant Convolutions on Riemannian Manifolds [70.32518963244466]
A major complication in comparison to flat spaces is that it is unclear in which alignment a convolution kernel should be applied on a manifold.
We argue that the particular choice of coordinatization should not affect a network's inference -- it should be coordinate independent.
A simultaneous demand for coordinate independence and weight sharing is shown to result in a requirement on the network to be equivariant.
arXiv Detail & Related papers (2021-06-10T19:54:19Z)
- Intrinsic Dimension Estimation [92.87600241234344]
We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guarantees.
We then apply our techniques to get new sample complexity bounds for Generative Adversarial Networks (GANs) depending on the intrinsic dimension of the data.
arXiv Detail & Related papers (2021-06-08T00:05:39Z)
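The teaser above does not spell out the paper's estimator. For flavor, a classic baseline is the TwoNN estimator of Facco et al. (2017), which fits the intrinsic dimension from each point's ratio of second to first nearest-neighbor distances; this is a known alternative shown for illustration, not the estimator proposed in the paper:

```python
# TwoNN intrinsic-dimension estimator (Facco et al., 2017); assumes distinct points.
import numpy as np
from scipy.spatial import cKDTree

def twonn_dimension(points):
    """MLE of intrinsic dimension: mu_i = r2_i / r1_i follows a Pareto law
    with exponent d, so d_hat = N / sum(log mu_i)."""
    dists, _ = cKDTree(points).query(points, k=3)  # columns: self, 1st NN, 2nd NN
    mu = dists[:, 2] / dists[:, 1]
    return len(points) / np.sum(np.log(mu))
```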
- Spatially relaxed inference on high-dimensional linear models [48.989769153211995]
We study the properties of ensembled clustered inference algorithms which combine spatially constrained clustering, statistical inference, and ensembling to aggregate several clustered inference solutions.
We show that ensembled clustered inference algorithms control the $\delta$-FWER under standard assumptions for $\delta$ equal to the largest cluster diameter.
arXiv Detail & Related papers (2021-06-04T16:37:19Z)
- Minimal length in an orbit closure as a semiclassical limit [1.6312226592634047]
Invariant theory states that the orbit of a vector $v$ is separated from the origin if and only if some homogeneous invariant is nonzero on $v$.
This connection to optimization has led to efficient algorithms for many problems in invariant theory.
We provide a new and independent elementary proof inspired by the Fourier-analytic proof of the local central limit theorem.
arXiv Detail & Related papers (2020-04-30T15:19:07Z)
- Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective [4.42494528420519]
It is known that any target function is realized in a small neighborhood of any sufficiently wide, randomly connected deep network.
We show that high-dimensional geometry plays a magical role.
arXiv Detail & Related papers (2020-01-20T01:09:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.