Stability for Inference with Persistent Homology Rank Functions
- URL: http://arxiv.org/abs/2307.02904v2
- Date: Sun, 22 Sep 2024 21:55:19 GMT
- Title: Stability for Inference with Persistent Homology Rank Functions
- Authors: Qiquan Wang, Inés García-Redondo, Pierre Faugère, Gregory Henselman-Petrusek, Anthea Monod
- Abstract summary: We revisit the persistent homology rank function as a tool for statistics and machine learning.
We find that the use of persistent homology captured by rank functions offers a clear improvement over existing non-persistence-based approaches.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Persistent homology barcodes and diagrams are a cornerstone of topological data analysis that capture the "shape" of a wide range of complex data structures, such as point clouds, networks, and functions. However, their use in statistical settings is challenging due to their complex geometric structure. In this paper, we revisit the persistent homology rank function, which is mathematically equivalent to a barcode and persistence diagram, as a tool for statistics and machine learning. Rank functions, being functions, enable the direct application of the statistical theory of functional data analysis (FDA), a domain of statistics adapted for data in the form of functions. A key challenge they present over barcodes in practice, however, is their lack of stability, a property that is crucial to validate their use as a faithful representation of the data and therefore a viable summary statistic. In this paper, we fill this gap by deriving two stability results for persistent homology rank functions under a suitable metric for FDA integration. We then study the performance of rank functions in functional inferential statistics and machine learning on real data applications, in both single and multiparameter persistent homology. We find that the use of persistent homology captured by rank functions offers a clear improvement over existing non-persistence-based approaches.
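To make the equivalence between a barcode and its rank function concrete, here is a minimal sketch, not taken from the paper. It assumes a single-parameter barcode given as half-open intervals [birth, death) and uses the standard definition that the rank at (a, b), with a <= b, counts the bars born at or before a that are still alive after b. The function name `rank_function` and the example barcode are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Tuple

Bar = Tuple[float, float]  # (birth, death), read as the half-open interval [birth, death)


def rank_function(barcode: Iterable[Bar], a: float, b: float) -> int:
    """Count the bars that are born at or before a and die strictly after b.

    Hypothetical helper for illustration; assumes a <= b.
    """
    if a > b:
        raise ValueError("the rank function is only defined for a <= b")
    return sum(1 for birth, death in barcode if birth <= a and b < death)


# Made-up barcode: two finite bars and one bar that never dies.
example_barcode = [(0.0, 2.0), (1.0, 3.0), (0.5, float("inf"))]

r_mid = rank_function(example_barcode, 1.0, 1.5)   # all three bars span [1.0, 1.5]
r_late = rank_function(example_barcode, 0.5, 2.0)  # only the infinite bar qualifies
```

Evaluating such a function on a grid of (a, b) pairs gives an array-valued summary, which is the kind of functional object FDA methods operate on.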
Related papers
- Meta-Statistical Learning: Supervised Learning of Statistical Inference [59.463430294611626]
This work demonstrates that the tools and principles driving the success of large language models (LLMs) can be repurposed to tackle distribution-level tasks.
We propose meta-statistical learning, a framework inspired by multi-instance learning that reformulates statistical inference tasks as supervised learning problems.
arXiv Detail & Related papers (2025-02-17T18:04:39Z) - Functional relevance based on the continuous Shapley value [0.0]
This work focuses on interpretability of predictive models based on functional data.
We propose an interpretability method based on the Shapley value for continuous games.
The method is illustrated through a set of experiments with simulated and real data sets.
arXiv Detail & Related papers (2024-11-27T18:20:00Z) - SymbolFit: Automatic Parametric Modeling with Symbolic Regression [1.2662552408022727]
We introduce SymbolFit, a framework that automates parametric modeling by using symbolic regression to perform a machine-search for functions that fit the data.
Our approach is demonstrated in data analysis applications in high-energy physics experiments at the CERN Large Hadron Collider.
arXiv Detail & Related papers (2024-11-15T00:09:37Z) - ASGNet: Adaptive Semantic Gate Networks for Log-Based Anomaly Diagnosis [6.399472066185473]
We propose adaptive semantic gate networks (ASGNet), which combine statistical features and semantic features to consolidate log text semantic representation.
ASGNet encodes statistical features via a variational encoding module and fuses useful information through a well-designed adaptive semantic threshold mechanism.
arXiv Detail & Related papers (2024-02-19T05:08:44Z) - Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient [65.08966446962845]
Offline reinforcement learning, which aims at optimizing decision-making strategies with historical data, has been extensively applied in real-life applications.
We take a step by considering offline reinforcement learning with differentiable function class approximation (DFA).
Most importantly, we show offline differentiable function approximation is provably efficient by analyzing the pessimistic fitted Q-learning algorithm.
arXiv Detail & Related papers (2022-10-03T07:59:42Z) - Robust Topological Inference in the Presence of Outliers [18.6112824677157]
The distance function to a compact set plays a crucial role in the paradigm of topological data analysis.
Despite its stability to perturbations in the Hausdorff distance, persistent homology is highly sensitive to outliers.
We propose a $\textit{median-of-means}$ variant of the distance function ($\textsf{MoM Dist}$), and establish its statistical properties.
arXiv Detail & Related papers (2022-06-03T19:45:43Z) - Data-Driven Reachability analysis and Support set Estimation with Christoffel Functions [8.183446952097528]
We present algorithms for estimating the forward reachable set of a dynamical system.
The produced estimate is the sublevel set of a function called an empirical inverse Christoffel function.
In addition to reachability analysis, the same approach can be applied to general problems of estimating the support of a random variable.
arXiv Detail & Related papers (2021-12-18T20:25:34Z) - Learning PSD-valued functions using kernel sums-of-squares [94.96262888797257]
We introduce a kernel sum-of-squares model for functions that take values in the PSD cone.
We show that it constitutes a universal approximator of PSD functions, and derive eigenvalue bounds in the case of subsampled equality constraints.
We then apply our results to modeling convex functions, by enforcing a kernel sum-of-squares representation of their Hessian.
arXiv Detail & Related papers (2021-11-22T16:07:50Z) - Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies [88.0813215220342]
Some modalities can more easily contribute to the classification results than others.
We develop a method based on the log-Sobolev inequality, which bounds the functional entropy with the functional-Fisher-information.
On the two challenging multi-modal datasets VQA-CPv2 and SocialIQ, we obtain state-of-the-art results while more uniformly exploiting the modalities.
arXiv Detail & Related papers (2020-10-21T07:40:33Z) - Estimating Structural Target Functions using Machine Learning and Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)