Typed Topological Structures Of Datasets
- URL: http://arxiv.org/abs/2508.14008v1
- Date: Tue, 19 Aug 2025 17:14:13 GMT
- Title: Typed Topological Structures Of Datasets
- Authors: Wanjun Hu
- Abstract summary: A dataset $X$ on $R^2$ is a finite topological space. In this article, we develop a special set of types and a related typed topology on a dataset $X$. Such structures provide a platform for new algorithms for problems such as computing the convex hull, holes, clustering, and anomaly detection.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A dataset $X$ on $R^2$ is a finite topological space. Current research on datasets focuses on statistical methods and the algebraic topological method \cite{carlsson}. In \cite{hu}, the concept of a typed topological space was introduced and shown to have potential for studying finite topological spaces, such as a dataset. It is a new method from the general topology perspective. A typed topological space is a topological space whose open sets are assigned types. Topological concepts and methods can be redefined using open sets of certain types. In this article, we develop a special set of types and a related typed topology on a dataset $X$. Using it, we can investigate the inner structure of $X$. In particular, $R^2$ has a natural quotient space in which $X$ is organized into tracks, and each track is split into components. Those components are ordered, and they can be represented by an integer sequence. Components crossing tracks form branches, and the relationship can be well represented by a type of pseudotree (called a typed-II pseudotree). Such structures provide a platform for new algorithms for problems such as computing the convex hull, holes, clustering, and anomaly detection.
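The abstract's track-and-component decomposition can be illustrated with a minimal sketch. This is not the paper's exact construction (the precise types and quotient are not given here); it is one plausible interpretation in which tracks are horizontal bands of assumed width `h`, components are runs of points separated by x-gaps larger than an assumed threshold `g`, and the integer sequence records component sizes track by track.

```python
# Illustrative sketch only (not the paper's construction): organize a finite
# planar dataset into horizontal "tracks", split each track into components
# by gaps along x, and summarize the result as an integer sequence.
# Track width `h` and gap threshold `g` are hypothetical parameters.

def tracks_and_components(points, h=1.0, g=1.5):
    """Group points into tracks (y-bands of width h), then split each
    track into components wherever consecutive x-values differ by more
    than g. Returns {track_index: [list of component point lists]}."""
    tracks = {}
    for x, y in points:
        tracks.setdefault(int(y // h), []).append((x, y))
    result = {}
    for t, pts in sorted(tracks.items()):
        pts.sort()  # order points along x within the track
        comps, current = [], [pts[0]]
        for p in pts[1:]:
            if p[0] - current[-1][0] > g:   # gap -> start a new component
                comps.append(current)
                current = [p]
            else:
                current.append(p)
        comps.append(current)
        result[t] = comps
    return result

def integer_sequence(decomposition):
    """Component sizes, track by track: one possible integer encoding."""
    return [len(c) for t in sorted(decomposition) for c in decomposition[t]]

pts = [(0.0, 0.2), (0.5, 0.3), (4.0, 0.1), (0.2, 1.4), (0.9, 1.6)]
print(integer_sequence(tracks_and_components(pts)))  # → [2, 1, 2]
```

Here track 0 splits into two components (a pair of close points and an isolated one) while track 1 is a single component, so the dataset is summarized by the sequence [2, 1, 2].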
Related papers
- Improving embedding of graphs with missing data by soft manifolds [51.425411400683565]
The reliability of graph embeddings depends on how much the geometry of the continuous space matches the graph structure.
We introduce a new class of manifolds, named soft manifolds, that can address this problem.
Using soft manifold for graph embedding, we can provide continuous spaces to pursue any task in data analysis over complex datasets.
arXiv Detail & Related papers (2023-11-29T12:48:33Z) - Lattice-preserving $\mathcal{ALC}$ ontology embeddings with saturation [50.05281461410368]
An order-preserving embedding method is proposed to generate embeddings of OWL representations.
We show that our method outperforms state-of-the-art embedding methods in several knowledge base completion tasks.
arXiv Detail & Related papers (2023-05-11T22:27:51Z) - An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning [25.25903127886586]
We build universal function approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$.
In particular, we show that the required number of Dirac measures is determined by the structure of $\mathcal{X}$ and $\mathcal{Y}$.
arXiv Detail & Related papers (2023-04-24T16:18:22Z) - Detection-Recovery Gap for Planted Dense Cycles [72.4451045270967]
We consider a model where a dense cycle with expected bandwidth $n\tau$ and edge density $p$ is planted in an Erdős-Rényi graph $G(n,q)$.
We characterize the computational thresholds for the associated detection and recovery problems for the class of low-degree algorithms.
arXiv Detail & Related papers (2023-02-13T22:51:07Z) - Deep Invertible Approximation of Topologically Rich Maps between Manifolds [17.60434807901964]
We show how to design neural networks that allow for stable universal approximation of maps between topologically interesting manifolds.
By exploiting the topological parallels between locally bilipschitz maps, covering spaces, and local homeomorphisms, we find that a novel network of the form $\mathcal{T} \circ p \circ \mathcal{E}$ is a universal approximator of local diffeomorphisms.
We also outline possible extensions of our architecture to address molecular imaging of molecules with symmetries.
arXiv Detail & Related papers (2022-10-02T17:14:43Z) - Geometry Interaction Knowledge Graph Embeddings [153.69745042757066]
We propose Geometry Interaction knowledge graph Embeddings (GIE), which learns spatial structures interactively between the Euclidean, hyperbolic and hyperspherical spaces.
Our proposed GIE can capture a richer set of relational information, model key inference patterns, and enable expressive semantic matching across entities.
arXiv Detail & Related papers (2022-06-24T08:33:43Z) - Dist2Cycle: A Simplicial Neural Network for Homology Localization [66.15805004725809]
Simplicial complexes can be viewed as high dimensional generalizations of graphs that explicitly encode multi-way ordered relations.
We propose a graph convolutional model for learning functions parametrized by the $k$-homological features of simplicial complexes.
arXiv Detail & Related papers (2021-10-28T14:59:41Z) - Turing approximations, toric isometric embeddings & manifold convolutions [0.0]
We define a convolution operator for a manifold of arbitrary topology and dimension.
A result of Alan Turing from 1938 underscores the need for such a toric isometric embedding approach to achieve a global definition of convolution.
arXiv Detail & Related papers (2021-10-05T18:36:16Z) - The decomposition of the higher-order homology embedding constructed from the $k$-Laplacian [5.076419064097734]
The null space of the $k$-th order Laplacian $\mathbf{\mathcal{L}}_k$ encodes the non-trivial topology of a manifold or a network.
We propose an algorithm to factorize the homology embedding into subspaces corresponding to a manifold's simplest topological components.
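The null-space idea above has a well-known special case that is easy to verify: for $k = 0$, the kernel of the graph Laplacian $L = D - A$ has dimension equal to the number of connected components. The example graph and the use of NumPy below are assumptions made for illustration, not taken from the paper.

```python
# Minimal illustration of the k = 0 case: the null space of the graph
# Laplacian L = D - A has one dimension per connected component.
# The example graph (a triangle plus a disjoint edge) is hypothetical.
import numpy as np

# Adjacency matrix with two components: {0,1,2} a triangle, {3,4} an edge.
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)
L = np.diag(A.sum(axis=1)) - A   # graph Laplacian (the k = 0 Hodge Laplacian)

eigvals = np.linalg.eigvalsh(L)  # L is symmetric, so eigvalsh applies
null_dim = int(np.sum(np.abs(eigvals) < 1e-9))
print(null_dim)  # → 2, one null eigenvector per connected component
```

The higher-order Laplacians $\mathcal{L}_k$ generalize this: their null spaces encode $k$-dimensional holes rather than components, which is the structure the factorization algorithm above targets.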
arXiv Detail & Related papers (2021-07-23T00:40:01Z) - Exact and Approximate Hierarchical Clustering Using A* [51.187990314731344]
We introduce a new approach based on A* search for clustering.
We overcome the prohibitively large search space by combining A* with a novel trellis data structure.
We empirically demonstrate that our method achieves substantially higher quality results than baselines for a particle physics use case and other clustering benchmarks.
arXiv Detail & Related papers (2021-04-14T18:15:27Z) - Estimate of the Neural Network Dimension using Algebraic Topology and Lie Theory [0.0]
In this paper we present an approach to determine the smallest possible number of neurons in a layer of a neural network.
We specify the required dimensions precisely, assuming that there is a smooth manifold on or near which the data are located.
We derive the theory and validate it experimentally on toy data sets.
arXiv Detail & Related papers (2020-04-06T14:15:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.