A Framework for Fast and Stable Representations of Multiparameter
Persistent Homology Decompositions
- URL: http://arxiv.org/abs/2306.11170v1
- Date: Mon, 19 Jun 2023 21:28:53 GMT
- Title: A Framework for Fast and Stable Representations of Multiparameter
Persistent Homology Decompositions
- Authors: David Loiseaux, Mathieu Carrière, Andrew J. Blumberg
- Abstract summary: We introduce a new general representation framework that leverages recent results on decompositions of multiparameter persistent homology.
We establish theoretical stability guarantees under this framework as well as efficient algorithms for practical computation.
We validate our stability results and algorithms with numerical experiments that demonstrate statistical convergence, prediction accuracy, and fast running times on several real data sets.
- Score: 2.76240219662896
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Topological data analysis (TDA) is an area of data science that focuses on
using invariants from algebraic topology to provide multiscale shape
descriptors for geometric data sets such as point clouds. One of the most
important such descriptors is persistent homology, which encodes the
change in shape as a filtration parameter changes; a typical parameter is the
feature scale. For many data sets, it is useful to simultaneously vary multiple
filtration parameters, for example feature scale and density. While the
theoretical properties of single parameter persistent homology are well
understood, less is known about the multiparameter case. In particular, a
central question is the problem of representing multiparameter persistent
homology by elements of a vector space for integration with standard machine
learning algorithms. Existing approaches to this problem either ignore most of
the multiparameter information to reduce to the one-parameter case or are
heuristic and potentially unstable in the face of noise. In this article, we
introduce a new general representation framework that leverages recent results
on decompositions of multiparameter persistent homology. This framework
is rich in information, fast to compute, and encompasses previous approaches.
Moreover, we establish theoretical stability guarantees under this framework as
well as efficient algorithms for practical computation, making this framework
an applicable and versatile tool for analyzing geometric and point cloud data.
We validate our stability results and algorithms with numerical experiments
that demonstrate statistical convergence, prediction accuracy, and fast running
times on several real data sets.
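As a toy illustration of the one-parameter case described in the abstract: the degree-0 persistent homology of a point cloud under the Vietoris-Rips filtration can be computed with a single union-find pass over the sorted pairwise distances. The sketch below is illustrative only; it is not the paper's representation framework, and the function name and sample data are invented for the example.

```python
import math
from itertools import combinations

def h0_barcode(points):
    """Degree-0 persistence barcode of a point cloud under the
    Vietoris-Rips filtration: every connected component is born at
    scale 0 and dies at the scale where it merges into another."""
    n = len(points)
    # Candidate merge scales are exactly the pairwise distances.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    bars = []
    for scale, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                # two components merge: one bar dies
            parent[rj] = ri
            bars.append((0.0, scale))
    bars.append((0.0, math.inf))    # one component persists forever
    return sorted(bars, key=lambda b: b[1])
```

For two well-separated pairs of points, e.g. `[(0, 0), (0, 1), (5, 0), (5, 1)]`, the barcode contains two short bars dying at scale 1.0 (each pair merging), one bar dying at scale 5.0 (the two clusters merging), and one infinite bar.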
Related papers
- Stable Vectorization of Multiparameter Persistent Homology using Signed
Barcodes as Measures [0.5312303275762102]
We show how the interpretation of signed barcodes leads to natural extensions of vectorization strategies.
The resulting feature vectors are easy to define and to compute, and provably stable.
arXiv Detail & Related papers (2023-06-06T15:45:07Z)
- A Size-Consistent Wave-function Ansatz Built from Statistical Analysis of Orbital Occupations [0.0]
We present a fresh approach to wavefunction parametrization that is size-consistent, rapidly convergent, and numerically robust.
The general utility of this approach is verified by applying it to uncorrelated, weakly-correlated, and strongly-correlated systems.
arXiv Detail & Related papers (2023-04-20T17:30:06Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Euler Characteristic Curves and Profiles: a stable shape invariant for big data problems [3.0023392750520883]
We present efficient algorithms to compute Euler characteristic-based descriptors related to persistent homology.
Euler characteristic curves and profiles enjoy a certain type of stability, which makes them robust tools for data analysis.
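For intuition, the Euler characteristic curve of a Vietoris-Rips filtration (truncated at dimension 2) can be evaluated by simple counting: chi(t) = #vertices - #edges + #triangles present at scale t. The brute-force sketch below is for illustration only and is not the efficient algorithm of the cited paper; the function name is invented for the example.

```python
import math
from itertools import combinations

def euler_characteristic_curve(points, thresholds):
    """Evaluate chi(t) = V - E + T of the Vietoris-Rips complex
    (up to dimension 2) at each threshold t."""
    n = len(points)
    dist = {frozenset(p): math.dist(points[p[0]], points[p[1]])
            for p in combinations(range(n), 2)}
    curve = []
    for t in thresholds:
        edges = sum(1 for d in dist.values() if d <= t)
        # A triangle enters the filtration when its longest edge does.
        triangles = sum(
            1 for a, b, c in combinations(range(n), 3)
            if max(dist[frozenset((a, b))],
                   dist[frozenset((a, c))],
                   dist[frozenset((b, c))]) <= t
        )
        curve.append(n - edges + triangles)
    return curve
```

On the unit square `[(0, 0), (0, 1), (1, 0), (1, 1)]`, the curve starts at 4 (four isolated vertices), drops to 0 once the four side edges appear at scale 1, and reaches 2 once the diagonals and all four triangles are present.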
arXiv Detail & Related papers (2022-12-03T18:37:48Z)
- FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels [82.53569355337586]
This work offers an efficient solution to temporal point processes inference using general parametric kernels with finite support.
The method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG).
Results show that the proposed approach leads to improved estimation of pattern latency compared to the state-of-the-art.
arXiv Detail & Related papers (2022-10-10T12:35:02Z)
- A Causality-Based Learning Approach for Discovering the Underlying Dynamics of Complex Systems from Partial Observations with Stochastic Parameterization [1.2882319878552302]
This paper develops a new iterative learning algorithm for complex turbulent systems with partial observations.
It alternates between identifying model structures, recovering unobserved variables, and estimating parameters.
Numerical experiments show that the new algorithm succeeds in identifying the model structure and providing suitable parameterizations for many complex nonlinear systems.
arXiv Detail & Related papers (2022-08-19T00:35:03Z)
- Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly available for a contest to predict the generalization accuracy of neural network (NN) models.
We identify what amounts to a Simpson's paradox: "scale" metrics perform well overall but poorly on sub-partitions of the data.
We present two novel shape metrics, one data-independent, and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
arXiv Detail & Related papers (2021-06-01T19:19:49Z)
- A Forward Backward Greedy approach for Sparse Multiscale Learning [0.0]
We propose a feature driven Reproducing Kernel Hilbert space (RKHS) for which the associated kernel has a weighted multiscale structure.
For generating approximations in this space, we provide a practical forward-backward algorithm that is shown to greedily construct a set of basis functions having a multiscale structure.
We analyze the performance of the approach on a variety of simulation and real data sets.
arXiv Detail & Related papers (2021-02-14T04:22:52Z)
- Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable models (GLLVMs) generalize such factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- Stable and consistent density-based clustering via multiparameter persistence [77.34726150561087]
We consider the degree-Rips construction from topological data analysis.
We analyze its stability to perturbations of the input data using the correspondence-interleaving distance.
We integrate these methods into a pipeline for density-based clustering, which we call Persistable.
arXiv Detail & Related papers (2020-05-18T19:45:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.