Topological Data Analysis of Database Representations for Information
Retrieval
- URL: http://arxiv.org/abs/2104.01672v1
- Date: Sun, 4 Apr 2021 19:29:47 GMT
- Title: Topological Data Analysis of Database Representations for Information
Retrieval
- Authors: Athanasios Vlontzos, Yueqi Cao, Luca Schmidtke, Bernhard Kainz, and
Anthea Monod
- Abstract summary: Persistent homology provides a rigorous characterization for the database topology.
We show that some commonly used embeddings fail to preserve the connectivity.
We introduce the dilation-invariant bottleneck distance to capture this effect.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Appropriately representing elements in a database so that queries may be
accurately matched is a central task in information retrieval. Recently, this
has been achieved by embedding the graphical structure of the database into a
manifold so that the hierarchy is preserved. Persistent homology provides a
rigorous characterization for the database topology in terms of both its
hierarchy and connectivity structure. We compute persistent homology on a
variety of datasets and show that some commonly used embeddings fail to
preserve the connectivity. Moreover, we show that embeddings which successfully
retain the database topology coincide in persistent homology. We introduce the
dilation-invariant bottleneck distance to capture this effect, which addresses
metric distortion on manifolds. We use it to show that distances between
topology-preserving embeddings of databases are small.
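
As a rough illustration of the connectivity part of persistent homology mentioned in the abstract, the sketch below computes the death times of 0-dimensional classes (connected components) for a small Euclidean point cloud. This is not the authors' code: the function name `h0_death_times`, the toy point cloud, and the convention that a component dies at the full length of the merging edge are all assumptions made for this example. It also shows that scaling the points by a constant scales every death time by the same constant, which is one reason a comparison that is insensitive to dilation can be useful.

```python
from itertools import combinations
from math import dist


def h0_death_times(points):
    """Death times of 0-dimensional classes in a Vietoris-Rips filtration.

    Hypothetical helper for illustration. A component "dies" when the edge
    that merges it into another component enters the filtration; here the
    filtration value of an edge is taken to be its full length.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (Kruskal-style sweep).
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components merge, one class dies
            deaths.append(length)
    return deaths  # the one surviving component never dies and is omitted


cloud = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
dilated = [(2.0 * x, 2.0 * y) for x, y in cloud]

print(h0_death_times(cloud))    # [1.0, 4.0]
print(h0_death_times(dilated))  # [2.0, 8.0]
```

The two outputs differ only by the factor 2 applied to the points, so the two clouds have the same connectivity structure up to scale even though their raw persistence values differ.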
Related papers
- Topograph: An efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation [78.54656076915565]
Topological correctness plays a critical role in many image segmentation tasks.
Most networks are trained using pixel-wise loss functions, such as Dice, neglecting topological accuracy.
We propose a novel, graph-based framework for topologically accurate image segmentation.
arXiv Detail & Related papers (2024-11-05T16:20:14Z)
- A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called Context-Enhanced Feature Alignment (CEFA).
CEFA consists of a feature alignment module and a context enhancement module.
Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z)
- On Characterizing the Evolution of Embedding Space of Neural Networks using Algebraic Topology [9.537910170141467]
We use Betti numbers to study how the topology of the feature embedding space changes as it passes through the layers of a well-trained deep neural network (DNN).
We demonstrate that as depth increases, a topologically complicated dataset is transformed into a simple one, resulting in Betti numbers attaining their lowest possible value.
arXiv Detail & Related papers (2023-11-08T10:45:12Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well-established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
- On topological data analysis for SHM; an introduction to persistent homology [0.0]
The main tool within topological data analysis is persistent homology.
Persistent homology is a representation of how the homological features of the data persist over an interval.
These results allow for topological inference and the ability to deduce features in higher-dimensional data.
arXiv Detail & Related papers (2022-09-12T12:02:39Z)
- Robust Change Detection Based on Neural Descriptor Fields [53.111397800478294]
We develop an object-level online change detection approach that is robust to partially overlapping observations and noisy localization results.
By associating objects via shape code similarity and comparing local object-neighbor spatial layout, our proposed approach demonstrates robustness to low observation overlap and localization noise.
arXiv Detail & Related papers (2022-08-01T17:45:36Z)
- On the complexity of finding set repairs for data-graphs [2.519906683279153]
We study the problem of computing subset and superset repairs for graph databases with data values.
We show that for positive fragments of Reg-GXPath expressions these problems admit a polynomial-time algorithm, while the full expressive power of the language renders them intractable.
arXiv Detail & Related papers (2022-06-15T13:01:26Z)
- Personal Fixations-Based Object Segmentation with Object Localization and Boundary Preservation [60.41628937597989]
We focus on Personal Fixations-based Object Segmentation (PFOS) to address issues in previous studies.
We propose a novel network based on Object Localization and Boundary Preservation (OLBP) to segment the gazed objects.
OLBP is organized in a mixed bottom-up and top-down manner with multiple types of deep supervision.
arXiv Detail & Related papers (2021-01-22T09:20:47Z)
- A Bayesian Hierarchical Score for Structure Learning from Related Data Sets [0.7240563090941907]
We propose a new Bayesian Dirichlet score, which we call Bayesian Hierarchical Dirichlet (BHD).
BHD is based on a hierarchical model that pools information across data sets to learn a single encompassing network structure.
We find that BHD outperforms the Bayesian Dirichlet equivalent uniform (BDeu) score in terms of reconstruction accuracy as measured by the Structural Hamming distance.
arXiv Detail & Related papers (2020-08-04T16:41:05Z)
- On Embeddings in Relational Databases [11.52782249184251]
We address the problem of learning a distributed representation of entities in a relational database using a low-dimensional embedding.
Recent methods for learning such embeddings take a naive approach: they fully denormalize the database by materializing the full join of all tables and representing the result as a knowledge graph.
In this paper we demonstrate a better methodology for learning representations by exploiting the underlying semantics of columns in a table while using the relation joins and the latent inter-row relationships.
arXiv Detail & Related papers (2020-05-13T17:21:27Z)
- Self-Learning with Rectification Strategy for Human Parsing [73.06197841003048]
We propose a trainable graph reasoning method to correct two typical errors in the pseudo-labels.
The reconstructed features have a stronger ability to represent the topology structure of the human body.
Our method outperforms other state-of-the-art methods in supervised human parsing tasks.
arXiv Detail & Related papers (2020-04-17T03:51:30Z)