Tree! I am no Tree! I am a Low Dimensional Hyperbolic Embedding
- URL: http://arxiv.org/abs/2005.03847v4
- Date: Fri, 23 Oct 2020 01:37:28 GMT
- Title: Tree! I am no Tree! I am a Low Dimensional Hyperbolic Embedding
- Authors: Rishi Sonthalia, Anna C. Gilbert
- Abstract summary: We explore a new method for learning hyperbolic representations by taking a metric-first approach.
Rather than determining the low-dimensional hyperbolic embedding directly, we learn a tree structure on the data.
This tree structure can then be used directly to extract hierarchical information or embedded into a hyperbolic manifold.
- Score: 7.381113319198104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given data, finding a faithful low-dimensional hyperbolic embedding of the
data is a key method by which we can extract hierarchical information or learn
representative geometric features of the data. In this paper, we explore a new
method for learning hyperbolic representations by taking a metric-first
approach. Rather than determining the low-dimensional hyperbolic embedding
directly, we learn a tree structure on the data. This tree structure can then
be used directly to extract hierarchical information, embedded into a
hyperbolic manifold using Sarkar's construction \cite{sarkar}, or used as a
tree approximation of the original metric. To this end, we present a novel fast
algorithm \textsc{TreeRep} such that, given a $\delta$-hyperbolic metric (for
any $\delta \geq 0$), the algorithm learns a tree structure that approximates
the original metric. In the case when $\delta = 0$, we show analytically that
\textsc{TreeRep} exactly recovers the original tree structure. We show
empirically that \textsc{TreeRep} is not only many orders of magnitude faster
than previously known algorithms, but also produces metrics with lower average
distortion and higher mean average precision than most previous algorithms for
learning hyperbolic embeddings, extracting hierarchical information, and
approximating metrics via tree metrics.
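The abstract's central object is a $\delta$-hyperbolic metric: a finite metric is a tree metric exactly when its four-point hyperbolicity $\delta$ is $0$, which is the condition TreeRep exploits. As a minimal illustration (not the TreeRep algorithm itself), the four-point $\delta$ of a finite metric can be computed by brute force:

```python
from itertools import combinations

def gromov_hyperbolicity(d, points):
    """Four-point hyperbolicity delta of a finite metric.

    For every quadruple, sort the three pairwise distance sums
    d(x,y)+d(z,w), d(x,z)+d(y,w), d(x,w)+d(y,z); delta is half the
    largest gap between the two biggest sums over all quadruples.
    A metric is a tree metric iff delta == 0.
    """
    delta = 0.0
    for x, y, z, w in combinations(points, 4):
        s = sorted([d(x, y) + d(z, w),
                    d(x, z) + d(y, w),
                    d(x, w) + d(y, z)])
        delta = max(delta, (s[2] - s[1]) / 2)
    return delta
```

For example, the path metric d(i, j) = |i - j| is a tree metric and yields delta = 0, while the shortest-path metric of a 4-cycle yields delta = 1. This O(n^4) check is only a sanity test; TreeRep's contribution is constructing the approximating tree fast, not measuring delta.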
Related papers
- Fitting trees to $\ell_1$-hyperbolic distances [4.220336689294244]
Building trees to represent or to fit distances is a critical component of phylogenetic analysis.
We study the tree fitting problem as finding the relation between the hyperbolicity (ultrametricity) vector and the error of tree (ultrametric) embedding.
arXiv Detail & Related papers (2024-09-02T07:38:32Z)
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- TreeLearn: A Comprehensive Deep Learning Method for Segmenting Individual Trees from Ground-Based LiDAR Forest Point Clouds [42.87502453001109]
We propose TreeLearn, a deep learning-based approach for tree instance segmentation of forest point clouds.
TreeLearn is trained on already segmented point clouds in a data-driven manner, making it less reliant on predefined features and algorithms.
We trained TreeLearn on forest point clouds of 6665 trees, labeled using the Lidar360 software.
arXiv Detail & Related papers (2023-09-15T15:20:16Z)
- On Computing Optimal Tree Ensembles [7.424944196676223]
Random forests and, more generally, (decision-)tree ensembles are widely used methods for classification and regression.
Recent algorithmic advances make it possible to compute decision trees that are optimal for various measures, such as their size or depth.
We provide two novel algorithms and corresponding lower bounds.
arXiv Detail & Related papers (2023-06-07T13:30:43Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
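The recommended variant above differs from standard agglomerative clustering only in its merge criterion. A minimal sketch of that criterion (merging by maximum average pairwise dot product; an O(n^3) illustration, not the authors' implementation):

```python
import numpy as np

def dot_product_agglomerative(X):
    """Agglomerative clustering that repeatedly merges the pair of
    clusters with the largest average pairwise dot product, instead of
    minimum distance or within-cluster variance.

    Returns the merge sequence as (cluster_a, cluster_b) index lists,
    which encodes the recovered hierarchy.
    """
    clusters = [[i] for i in range(len(X))]
    merges = []
    G = X @ X.T  # Gram matrix of all pairwise dot products
    while len(clusters) > 1:
        best, pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # average dot product between members of the two clusters
                score = G[np.ix_(clusters[a], clusters[b])].mean()
                if score > best:
                    best, pair = score, (a, b)
        a, b = pair
        merges.append((clusters[a], clusters[b]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges
```

For instance, with rows [1, 0], [0.9, 0.1], and [0, 1], the first two points (pointing in nearly the same direction) merge first.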
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
- New Linear-time Algorithm for SubTree Kernel Computation based on Root-Weighted Tree Automata [0.0]
We propose a new linear time algorithm based on the concept of weighted tree automata for SubTree kernel computation.
The key idea behind the proposed algorithm is to replace the DAG reduction and node sorting steps.
Our approach has three major advantages: it is output-sensitive, it is insensitive to the tree type (ordered versus unordered trees), and it is well adapted to any incremental tree-kernel-based learning method.
arXiv Detail & Related papers (2023-02-02T13:37:48Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore to utilise higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- Active-LATHE: An Active Learning Algorithm for Boosting the Error Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z)
- Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
- SGA: A Robust Algorithm for Partial Recovery of Tree-Structured Graphical Models with Noisy Samples [75.32013242448151]
We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise.
Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure.
We propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
arXiv Detail & Related papers (2021-01-22T01:57:35Z)
- Metric Learning for Ordered Labeled Trees with pq-grams [11.284638114256712]
We propose a new metric learning approach for tree-structured data with pq-grams.
The pq-gram distance is a distance for ordered labeled trees, and has much lower computation cost than the tree edit distance.
We empirically show that the proposed approach achieves competitive results with the state-of-the-art edit distance-based methods.
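The pq-gram distance mentioned above compares bags of small labeled subtree shapes (p ancestors, q consecutive children, with dummy padding). A simplified sketch of the pq-gram profile and the resulting distance, following Augsten et al.'s construction (this is an illustration of the underlying distance, not the paper's metric-learning method):

```python
from collections import Counter

DUMMY = "*"

def pq_grams(tree, p=2, q=3):
    """Bag of pq-grams of an ordered labeled tree (label, [children]).

    Each pq-gram is a tuple of p ancestor labels followed by a window
    of q consecutive child labels, padded with dummy nodes '*'.
    """
    grams = Counter()

    def visit(node, anc):
        label, children = node
        anc = (anc + (label,))[-p:]
        anc = (DUMMY,) * (p - len(anc)) + anc
        if children:
            # slide a q-window over the children, padded with q-1 dummies
            kids = ((DUMMY,) * (q - 1)
                    + tuple(c[0] for c in children)
                    + (DUMMY,) * (q - 1))
        else:
            kids = (DUMMY,) * q  # a leaf contributes one all-dummy window
        for i in range(len(kids) - q + 1):
            grams[anc + kids[i:i + q]] += 1
        for c in children:
            visit(c, anc)

    visit(tree, ())
    return grams

def pq_gram_distance(t1, t2, p=2, q=3):
    """Normalized bag distance: 0 for identical trees, at most 1."""
    g1, g2 = pq_grams(t1, p, q), pq_grams(t2, p, q)
    inter = sum((g1 & g2).values())  # bag (multiset) intersection
    total = sum(g1.values()) + sum(g2.values())
    return 1 - 2 * inter / total
```

Because the profile can be built in one traversal, this distance is much cheaper than the tree edit distance, which is the trade-off the summary refers to.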
arXiv Detail & Related papers (2020-03-09T08:04:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.