TDAvec: Computing Vector Summaries of Persistence Diagrams for Topological Data Analysis in R and Python
- URL: http://arxiv.org/abs/2411.17340v1
- Date: Tue, 26 Nov 2024 11:34:12 GMT
- Title: TDAvec: Computing Vector Summaries of Persistence Diagrams for Topological Data Analysis in R and Python
- Authors: Aleksei Luchinsky, Umar Islambekov
- Abstract summary: We introduce a new software package designed to streamline the vectorization of persistence diagrams (PDs).
The non-Hilbert nature of the space of PDs poses challenges for their direct use in machine learning applications.
- Abstract: Persistent homology is a widely-used tool in topological data analysis (TDA) for understanding the underlying shape of complex data. By constructing a filtration of simplicial complexes from data points, it captures topological features such as connected components, loops, and voids across multiple scales. These features are encoded in persistence diagrams (PDs), which provide a concise summary of the data's topological structure. However, the non-Hilbert nature of the space of PDs poses challenges for their direct use in machine learning applications. To address this, kernel methods and vectorization techniques have been developed to transform PDs into machine-learning-compatible formats. In this paper, we introduce a new software package designed to streamline the vectorization of PDs, offering an intuitive workflow and advanced functionalities. We demonstrate the necessity of the package through practical examples and provide a detailed discussion on its contributions to applied TDA. Definitions of all vectorization summaries used in the package are included in the appendix.
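The abstract refers to vector summaries computed from persistence diagrams. As a minimal illustration of what such a vectorization looks like, the sketch below evaluates a Betti (persistence) curve on a fixed grid of scale values using plain NumPy; it is not the TDAvec API, and the function name, toy diagram, and grid are assumptions made for the example.

```python
import numpy as np

def betti_curve(pd_pairs, scale_seq):
    """Evaluate the Betti curve of a persistence diagram on a grid of scale values.

    pd_pairs  : array of shape (n, 2) with (birth, death) pairs for one homological dimension.
    scale_seq : 1-D array of increasing scale values at which to evaluate the curve.
    Returns a 1-D vector of length len(scale_seq): the number of features alive at each scale.
    """
    pd_pairs = np.asarray(pd_pairs, dtype=float)
    births, deaths = pd_pairs[:, 0], pd_pairs[:, 1]
    # A feature (b, d) is alive at scale t when b <= t < d.
    alive = (births[None, :] <= scale_seq[:, None]) & (scale_seq[:, None] < deaths[None, :])
    return alive.sum(axis=1).astype(float)

# Toy example: a diagram with three features, evaluated on ten scale values.
diagram = np.array([[0.0, 0.8], [0.2, 0.5], [0.4, 1.0]])
grid = np.linspace(0.0, 1.0, 10)
print(betti_curve(diagram, grid))
```

The resulting fixed-length vector can be fed directly to standard machine-learning models, which is the general workflow that vectorization packages of this kind are meant to streamline.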
Related papers
- IsUMap: Manifold Learning and Data Visualization leveraging Vietoris-Rips filtrations (arXiv, 2024-07-25)
  We present a systematic and detailed construction of a metric representation for locally distorted metric spaces.
  Our approach addresses limitations in existing methods by accommodating non-uniform data distributions and intricate local geometries.
- Discovering symbolic expressions with parallelized tree search (arXiv, 2024-07-05)
  Symbolic regression plays a crucial role in scientific research thanks to its capability of discovering concise and interpretable mathematical expressions from data.
  Existing algorithms have faced a critical bottleneck in accuracy and efficiency for over a decade when handling complex problems.
  We introduce a parallelized tree search (PTS) model to efficiently distill generic mathematical expressions from limited data.
- ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models (arXiv, 2024-05-22)
  We show that the space spanned by the combination of dimensions and attributes is insufficiently sampled by existing training schemes of diffusion generative models.
  We present a simple fix to this problem by constructing processes that fully exploit the structures, hence the name ComboStoc.
- Improving embedding of graphs with missing data by soft manifolds (arXiv, 2023-11-29)
  The reliability of graph embeddings depends on how much the geometry of the continuous space matches the graph structure.
  We introduce a new class of manifolds, called soft manifolds, that can address this issue.
  Using soft manifolds for graph embedding, we can provide continuous spaces to pursue any task in data analysis over complex datasets.
- Higher-order topological kernels via quantum computation (arXiv, 2023-07-14)
  Topological data analysis (TDA) has emerged as a powerful tool for extracting meaningful insights from complex data.
  We propose a quantum approach to defining Betti kernels, which is based on constructing Betti curves with increasing order.
- DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion (arXiv, 2023-01-23)
  We introduce an energy-constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
  We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
  Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
- Learning Implicit Feature Alignment Function for Semantic Segmentation (arXiv, 2022-06-17)
  The Implicit Feature Alignment function (IFA) is inspired by the rapidly expanding topic of implicit neural representations.
  We show that IFA implicitly aligns the feature maps at different levels and is capable of producing segmentation maps at arbitrary resolutions.
  Our method can be combined with various architectures and achieves a state-of-the-art accuracy trade-off on common benchmarks.
- A computationally efficient framework for vector representation of persistence diagrams (arXiv, 2021-09-16)
  We propose a framework to convert a persistence diagram (PD) into a vector in $\mathbb{R}^n$, called a vectorized persistence block (VPB); a simplified sketch of this construction appears after this list.
  Our representation possesses many of the desired properties of vector-based summaries, such as stability with respect to input noise, low computational cost, and flexibility.
- Random Persistence Diagram Generation (arXiv, 2021-04-15)
  Topological data analysis (TDA) studies the shape patterns of data.
  Persistent homology (PH) is a widely used method in TDA that summarizes homological features of data at multiple scales and stores them in persistence diagrams (PDs).
  We propose random persistence diagram generation (RPDG), a method that generates a sequence of random PDs from the ones produced by the data.
- The Interconnectivity Vector: A Finite-Dimensional Vector Representation of Persistent Homology (arXiv, 2020-11-23)
  Persistent homology (PH) is a useful tool to study the underlying structure of a data set.
  Persistence diagrams (PDs) are a concise summary of the information found by studying the PH of a data set.
  We propose a new finite-dimensional vector representation of a PD, called the interconnectivity vector, adapted from the Bag-of-Words (BoW) model.
- A Short Review on Data Modelling for Vector Fields (arXiv, 2020-09-01)
  Machine learning methods have proven highly successful in dealing with a wide variety of data analysis and analytics tasks.
  The recent success of end-to-end modelling schemes using deep neural networks allows the extension to more sophisticated and structured practical data.
  This review article is dedicated to recent computational tools for vector fields, including vector data representations, predictive models of spatial data, and applications in computer vision, signal processing, and empirical sciences.