Tree! I am no Tree! I am a Low Dimensional Hyperbolic Embedding
- URL: http://arxiv.org/abs/2005.03847v4
- Date: Fri, 23 Oct 2020 01:37:28 GMT
- Title: Tree! I am no Tree! I am a Low Dimensional Hyperbolic Embedding
- Authors: Rishi Sonthalia, Anna C. Gilbert
- Abstract summary: We explore a new method for learning hyperbolic representations by taking a metric-first approach.
Rather than determining the low-dimensional hyperbolic embedding directly, we learn a tree structure on the data.
This tree structure can then be used directly to extract hierarchical information or be embedded into a hyperbolic manifold.
- Score: 7.381113319198104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given data, finding a faithful low-dimensional hyperbolic embedding of the
data is a key method by which we can extract hierarchical information or learn
representative geometric features of the data. In this paper, we explore a new
method for learning hyperbolic representations by taking a metric-first
approach. Rather than determining the low-dimensional hyperbolic embedding
directly, we learn a tree structure on the data. This tree structure can then
be used directly to extract hierarchical information, embedded into a
hyperbolic manifold using Sarkar's construction [Sarkar], or used as a
tree approximation of the original metric. To this end, we present a novel fast
algorithm, TreeRep, such that, given a $\delta$-hyperbolic metric (for
any $\delta \geq 0$), the algorithm learns a tree structure that approximates
the original metric. In the case when $\delta = 0$, we show analytically that
TreeRep exactly recovers the original tree structure. We show
empirically that TreeRep is not only many orders of magnitude faster
than previously known algorithms, but also produces metrics with lower average
distortion and higher mean average precision than most previous algorithms for
learning hyperbolic embeddings, extracting hierarchical information, and
approximating metrics via tree metrics.
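To make the input assumption concrete: under Gromov's four-point condition, a finite metric is $\delta$-hyperbolic if, for every quadruple of points, the two largest of the three pairwise distance sums $d(x,y)+d(z,w)$, $d(x,z)+d(y,w)$, $d(x,w)+d(y,z)$ differ by at most $2\delta$, and tree metrics are exactly the $\delta = 0$ case that TreeRep recovers exactly. The sketch below is illustrative only (it is not the paper's TreeRep code, and the function name is hypothetical); it computes this four-point $\delta$ by brute force from a distance matrix.

```python
import itertools
import numpy as np

def four_point_delta(D: np.ndarray) -> float:
    """Gromov four-point hyperbolicity of a finite metric (brute force).

    D is an n x n symmetric distance matrix. For every quadruple
    (x, y, z, w) the three pairwise sums
        d(x,y)+d(z,w), d(x,z)+d(y,w), d(x,w)+d(y,z)
    are formed; the metric is delta-hyperbolic iff the two largest sums
    always differ by at most 2*delta. A tree metric yields exactly 0.
    O(n^4), for illustration only -- not the TreeRep algorithm itself.
    """
    delta = 0.0
    for x, y, z, w in itertools.combinations(range(D.shape[0]), 4):
        s = sorted((D[x, y] + D[z, w], D[x, z] + D[y, w], D[x, w] + D[y, z]))
        delta = max(delta, (s[2] - s[1]) / 2.0)
    return delta

# Example: shortest-path metric of a star tree with edge lengths 1, 2, 3.
D = np.array([[0, 1, 2, 3],
              [1, 0, 3, 4],
              [2, 3, 0, 5],
              [3, 4, 5, 0]], dtype=float)
print(four_point_delta(D))  # 0.0 -- tree metrics are 0-hyperbolic
```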
Related papers
- ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval [64.44265315244579]
We propose a tree-based method for organizing and representing reference documents at various granular levels.
Our method, called ReTreever, jointly learns a routing function per internal node of a binary tree such that query and reference documents are assigned to similar tree branches.
Our evaluations show that ReTreever generally preserves full representation accuracy.
arXiv Detail & Related papers (2025-02-11T21:35:13Z)
- Fitting trees to $\ell_1$-hyperbolic distances [4.220336689294244]
Building trees to represent or to fit distances is a critical component of phylogenetic analysis.
We study the tree fitting problem as that of finding the relation between the hyperbolicity (ultrametricity) vector and the error of tree (ultrametric) embedding.
arXiv Detail & Related papers (2024-09-02T07:38:32Z)
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- On Computing Optimal Tree Ensembles [7.424944196676223]
Random forests and, more generally, (decision-)tree ensembles are widely used methods for classification and regression.
Recent algorithmic advances make it possible to compute decision trees that are optimal with respect to various measures such as size or depth.
We provide two novel algorithms and corresponding lower bounds.
arXiv Detail & Related papers (2023-06-07T13:30:43Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
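Based only on the summary above, that merging rule can be illustrated with a minimal sketch (the function name is hypothetical and the naive O(n^3) loop is not the authors' implementation): clusters are merged greedily by maximum average cross-cluster dot product, and the merge sequence can be read as a binary tree (dendrogram).

```python
import numpy as np

def dot_product_agglomeration(X: np.ndarray):
    """Agglomerative clustering that merges the pair of clusters with the
    maximum average dot product between their points (instead of minimum
    distance or within-cluster variance). Returns the merge history, which
    defines a binary tree over the data points. Naive sketch for illustration.
    """
    clusters = {i: [i] for i in range(len(X))}   # cluster id -> point indices
    merges, next_id = [], len(X)
    while len(clusters) > 1:
        ids = list(clusters)
        best_score, best_pair = -np.inf, None
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                a, b = ids[i], ids[j]
                # average dot product over all cross-cluster pairs of points
                score = np.mean(X[clusters[a]] @ X[clusters[b]].T)
                if score > best_score:
                    best_score, best_pair = score, (a, b)
        a, b = best_pair
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, next_id))
        next_id += 1
    return merges   # sequence of (child_a, child_b, new_cluster_id)

# Tiny usage example on random data
rng = np.random.default_rng(0)
print(dot_product_agglomeration(rng.normal(size=(6, 3))))
```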
- New Linear-time Algorithm for SubTree Kernel Computation based on Root-Weighted Tree Automata [0.0]
We propose a new linear time algorithm based on the concept of weighted tree automata for SubTree kernel computation.
The key idea behind the proposed algorithm is to replace the DAG reduction and node-sorting steps.
Our approach has three major advantages: it is output-sensitive, it is insensitive to the tree type (ordered versus unordered trees), and it is well suited to any incremental tree-kernel-based learning method.
arXiv Detail & Related papers (2023-02-02T13:37:48Z)
- HyperAid: Denoising in hyperbolic spaces for tree-fitting and hierarchical clustering [36.738414547278154]
We propose a new approach to tree-metric denoising (HyperAid) in hyperbolic spaces.
It transforms the original data into data that is "more" tree-like, when evaluated in terms of Gromov's $\delta$-hyperbolicity.
We integrate HyperAid with schemes for enforcing nonnegative edge-weights.
arXiv Detail & Related papers (2022-05-19T17:33:16Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore utilising higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z)
- Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
- Metric Learning for Ordered Labeled Trees with pq-grams [11.284638114256712]
We propose a new metric learning approach for tree-structured data with pq-grams.
The pq-gram distance is a distance for ordered labeled trees, and has much lower computation cost than the tree edit distance.
We empirically show that the proposed approach achieves competitive results with the state-of-the-art edit distance-based methods.
arXiv Detail & Related papers (2020-03-09T08:04:47Z)
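The pq-gram distance referred to in the entry above is not defined in this digest. As a hedged illustration (this follows the commonly used pq-gram profile construction with '*' dummy nodes, not the metric-learning method proposed in that paper; the function names are hypothetical), a compact sketch:

```python
from collections import Counter

def pq_profile(tree, p=2, q=3, stem=None, profile=None):
    """Bag (multiset) of pq-grams of an ordered labeled tree.

    A tree is (label, [children]). Each pq-gram collects the labels of an
    anchor node, its p-1 nearest ancestors, and q consecutive children,
    padded with '*' dummy labels at the borders.
    """
    if profile is None:
        profile = Counter()
    if stem is None:
        stem = ('*',) * p
    label, children = tree
    stem = stem[1:] + (label,)            # shift the anchor label into the stem
    base = ('*',) * q
    if not children:
        profile[stem + base] += 1         # leaf anchor: all-dummy base
    else:
        for child in children:
            base = base[1:] + (child[0],)
            profile[stem + base] += 1
            pq_profile(child, p, q, stem, profile)
        for _ in range(q - 1):            # trailing windows past the last child
            base = base[1:] + ('*',)
            profile[stem + base] += 1
    return profile

def pq_gram_distance(t1, t2, p=2, q=3):
    """Normalized pq-gram distance: 1 - 2|P1 ∩ P2| / (|P1| + |P2|), with the
    intersection taken as a bag intersection; 0 means identical profiles."""
    p1, p2 = pq_profile(t1, p, q), pq_profile(t2, p, q)
    shared = sum((p1 & p2).values())
    return 1.0 - 2.0 * shared / (sum(p1.values()) + sum(p2.values()))

# Two small ordered labeled trees that differ in a single leaf label
t1 = ('a', [('b', []), ('c', [('d', [])])])
t2 = ('a', [('b', []), ('c', [('e', [])])])
print(pq_gram_distance(t1, t2))   # strictly between 0 and 1
```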
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.