The Interconnectivity Vector: A Finite-Dimensional Vector Representation
of Persistent Homology
- URL: http://arxiv.org/abs/2011.11579v1
- Date: Mon, 23 Nov 2020 17:43:06 GMT
- Title: The Interconnectivity Vector: A Finite-Dimensional Vector Representation
of Persistent Homology
- Authors: Megan Johnson, Jae-Hun Jung
- Abstract summary: Persistent Homology (PH) is a useful tool to study the underlying structure of a data set.
Persistence Diagrams (PDs) are a concise summary of the information found by studying the PH of a data set.
We propose a new finite-dimensional vector representation of a PD, called the interconnectivity vector, adapted from Bag-of-Words (BoW).
- Score: 2.741266294612776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Persistent Homology (PH) is a useful tool to study the underlying structure
of a data set. Persistence Diagrams (PDs), which are 2D multisets of points,
are a concise summary of the information found by studying the PH of a data
set. However, PDs are difficult to incorporate into a typical machine learning
workflow. To that end, two main methods for representing PDs have been
developed: kernel methods and vectorization methods. In this paper we propose a
new finite-dimensional vector representation of a PD, called the
interconnectivity vector, adapted from Bag-of-Words (BoW). This new representation
is constructed to demonstrate the connections between the homological features
of a data set. This initial definition of the interconnectivity vector proves
to be unstable, but we introduce a stabilized version of the vector and prove
its stability with respect to small perturbations in the inputs. We evaluate
both versions of the presented vectorization on several data sets and show
their high discriminative power.
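As a rough illustration of what a Bag-of-Words style vectorization of a PD can look like, the sketch below maps each (birth, death) point to (birth, persistence) coordinates, assigns it to its nearest codeword, and counts assignments per codeword. This is a generic BoW histogram, not the interconnectivity vector defined in the paper. The function names, the toy diagram, and the hand-picked codebook are all assumptions made for illustration; in practice a codebook would typically be learned from data, for example by clustering.

```python
import math
from collections import Counter


def to_birth_persistence(diagram):
    """Map (birth, death) pairs to (birth, persistence) coordinates."""
    return [(b, d - b) for b, d in diagram]


def nearest_codeword(point, codebook):
    """Return the index of the codeword closest to `point` (Euclidean)."""
    return min(range(len(codebook)), key=lambda i: math.dist(point, codebook[i]))


def bow_vector(diagram, codebook):
    """Count how many diagram points are assigned to each codeword."""
    points = to_birth_persistence(diagram)
    counts = Counter(nearest_codeword(p, codebook) for p in points)
    return [counts.get(i, 0) for i in range(len(codebook))]


# Hypothetical toy persistence diagram: a multiset of (birth, death) pairs.
diagram = [(0.0, 0.1), (0.0, 0.15), (0.2, 1.0), (0.5, 0.6)]

# Hypothetical hand-picked codebook in (birth, persistence) coordinates.
codebook = [(0.0, 0.1), (0.2, 0.8), (0.5, 0.1)]

print(bow_vector(diagram, codebook))  # [2, 1, 1]
```

The output has one entry per codeword, so its length is fixed regardless of how many points the diagram contains. Hard nearest-codeword assignment can flip when a point sits near the boundary between two codewords, which is one reason stability has to be addressed separately for representations of this kind.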
Related papers
- TDAvec: Computing Vector Summaries of Persistence Diagrams for Topological Data Analysis in R and Python [0.6445605125467574]
We introduce a new software package designed to streamline the vectorization of persistence diagrams (PDs).
The non-Hilbert nature of the space of PDs poses challenges for their direct use in machine learning applications.
arXiv Detail & Related papers (2024-11-26T11:34:12Z) - A computationally efficient framework for vector representation of
persistence diagrams [0.0]
We propose a framework to convert a persistence diagram (PD) into a vector in $\mathbb{R}^n$, called a vectorized persistence block (VPB).
Our representation possesses many of the desired properties of vector-based summaries such as stability with respect to input noise, low computational cost and flexibility.
arXiv Detail & Related papers (2021-09-16T22:02:35Z) - Estimation and Quantization of Expected Persistence Diagrams [0.0]
We study two such summaries, the Expected Persistence Diagram (EPD) and its quantization.
EPD is a measure supported on $\mathbb{R}^2$.
We prove that this estimator is optimal from a minimax standpoint on a large class of models with a parametric rate of convergence.
arXiv Detail & Related papers (2021-05-11T08:12:18Z) - Random Persistence Diagram Generation [4.435094091999926]
Topological data analysis (TDA) studies the shape patterns of data.
Persistent homology (PH) is a widely used method in TDA that summarizes homological features of data at multiple scales and stores them in persistence diagrams (PDs).
We propose random persistence diagram generation (RPDG), a method that generates a sequence of random PDs from the ones produced by the data.
arXiv Detail & Related papers (2021-04-15T19:33:01Z) - Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to best explore the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
arXiv Detail & Related papers (2021-03-22T08:11:43Z) - Kernel Two-Dimensional Ridge Regression for Subspace Clustering [45.651770340521786]
We propose a novel subspace clustering method for 2D data.
It directly uses 2D data as inputs such that the learning of representations benefits from inherent structures and relationships of the data.
arXiv Detail & Related papers (2020-11-03T04:52:46Z) - Two-Dimensional Semi-Nonnegative Matrix Factorization for Clustering [50.43424130281065]
We propose a new Semi-Nonnegative Matrix Factorization method for 2-dimensional (2D) data, named TS-NMF.
It overcomes the drawback of existing methods that seriously damage the spatial information of the data by converting 2D data to vectors in a preprocessing step.
arXiv Detail & Related papers (2020-05-19T05:54:14Z) - Holistically-Attracted Wireframe Parsing [123.58263152571952]
This paper presents a fast and parsimonious parsing method to detect a vectorized wireframe in an input image with a single forward pass.
The proposed method is end-to-end trainable, consisting of three components: (i) line segment and junction proposal generation, (ii) line segment and junction matching, and (iii) line segment and junction verification.
arXiv Detail & Related papers (2020-03-03T17:43:57Z) - Semiparametric Nonlinear Bipartite Graph Representation Learning with
Provable Guarantees [106.91654068632882]
We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z) - Learning Bijective Feature Maps for Linear ICA [73.85904548374575]
We show that existing probabilistic deep generative models (DGMs), which are tailor-made for image data, underperform on non-linear ICA tasks.
To address this, we propose a DGM which combines bijective feature maps with a linear ICA model to learn interpretable latent structures for high-dimensional data.
We create models that converge quickly, are easy to train, and achieve better unsupervised latent factor discovery than flow-based models, linear ICA, and Variational Autoencoders on images.
arXiv Detail & Related papers (2020-02-18T17:58:07Z)