CO-SNE: Dimensionality Reduction and Visualization for Hyperbolic Data
- URL: http://arxiv.org/abs/2111.15037v1
- Date: Tue, 30 Nov 2021 00:21:47 GMT
- Title: CO-SNE: Dimensionality Reduction and Visualization for Hyperbolic Data
- Authors: Yunhui Guo, Haoran Guo, Stella Yu
- Abstract summary: We propose CO-SNE, extending the Euclidean space visualization tool, t-SNE, to hyperbolic space.
Unlike Euclidean space, hyperbolic space is inhomogeneous: a fixed volume can contain far more points when it lies far from the origin.
We apply CO-SNE to high-dimensional hyperbolic biological data as well as hyperbolic representations learned without supervision.
- Score: 10.618060176686916
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hyperbolic space can embed tree metrics with little distortion, a desirable
property for modeling the hierarchical structures of real-world data and semantics.
While high-dimensional embeddings often lead to better representations, most
hyperbolic models use low-dimensional embeddings, owing to non-trivial
optimization and the lack of a visualization tool for high-dimensional
hyperbolic data.
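For context (this is not in the abstract), two standard Poincaré ball formulas make both claims concrete: distances blow up near the boundary of the ball, and ball volume grows exponentially with radius, matching the exponential growth of the number of nodes in a tree.

```latex
% Standard Poincare-ball facts (context; not stated in the abstract).
% Geodesic distance between points $u, v$ in the open unit ball:
\[
  d_{\mathbb{B}}(u, v) = \operatorname{arcosh}\!\left(
    1 + 2\,\frac{\lVert u - v \rVert^{2}}{(1 - \lVert u \rVert^{2})(1 - \lVert v \rVert^{2})}
  \right).
\]
% Volume of a hyperbolic ball of radius $r$ in $n$ dimensions:
\[
  \operatorname{vol}(B_r) = \Theta\!\left(e^{(n-1)r}\right) \quad (r \to \infty),
\]
% which is why trees, whose node count also grows exponentially with
% depth, embed with little distortion.
```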
We propose CO-SNE, extending the Euclidean space visualization tool, t-SNE,
to hyperbolic space. Like t-SNE, it converts distances between data points to
joint probabilities and tries to minimize the Kullback-Leibler divergence
between the joint probabilities of high-dimensional data $X$ and
low-dimensional embeddings $Y$. However, unlike Euclidean space, hyperbolic
space is inhomogeneous: a fixed volume can contain far more points when it lies
far from the origin. CO-SNE thus uses hyperbolic normal distributions for $X$
and the hyperbolic \underline{C}auchy distribution instead of t-SNE's Student's
t-distribution for $Y$, and it additionally attempts to preserve $X$'s
individual distances to the \underline{O}rigin in $Y$.
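To make the objective concrete, below is a minimal sketch of the loss described above, assuming all points live in the Poincaré ball. The kernel forms, bandwidth `sigma`, and weight `lam` are illustrative assumptions, not the paper's exact definitions.

```python
# A minimal sketch of the CO-SNE objective as described in the abstract.
# The exact hyperbolic normal / Cauchy kernels and the weighting of the
# origin term are assumptions; the paper's definitions may differ.
import numpy as np

def poincare_dist(u, v, eps=1e-9):
    """Geodesic distance in the Poincare ball model."""
    sq = np.sum((u - v) ** 2, axis=-1)
    nu = np.sum(u ** 2, axis=-1)
    nv = np.sum(v ** 2, axis=-1)
    arg = 1.0 + 2.0 * sq / ((1.0 - nu) * (1.0 - nv) + eps)
    return np.arccosh(np.maximum(arg, 1.0))

def pairwise_poincare(Z):
    """All-pairs Poincare distances between rows of Z."""
    return np.stack([poincare_dist(z[None, :], Z) for z in Z])

def co_sne_loss(X, Y, sigma=1.0, lam=1.0):
    """KL(P || Q) plus a distance-to-origin preservation term."""
    n = X.shape[0]
    off = ~np.eye(n, dtype=bool)  # exclude self-affinities

    # P: affinities of high-dimensional data X under a hyperbolic normal
    # kernel (Gaussian in hyperbolic distance), normalized to sum to 1.
    DX = pairwise_poincare(X)
    P = np.where(off, np.exp(-DX ** 2 / (2.0 * sigma ** 2)), 0.0)
    P /= P.sum()

    # Q: affinities of the low-dimensional embedding Y under a heavy-tailed
    # hyperbolic Cauchy kernel (the role t-SNE gives to the Student t).
    DY = pairwise_poincare(Y)
    Q = np.where(off, 1.0 / (1.0 + DY ** 2), 0.0)
    Q /= Q.sum()

    kl = np.sum(P[off] * np.log((P[off] + 1e-12) / (Q[off] + 1e-12)))

    # Preserve each point's hyperbolic distance to the Origin (the "O").
    ox = poincare_dist(X, np.zeros_like(X))
    oy = poincare_dist(Y, np.zeros_like(Y))
    return kl + lam * np.mean((ox - oy) ** 2)
```

In an actual optimizer, $Y$ would be updated by Riemannian gradient steps on this loss while staying inside the ball.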
We apply CO-SNE to high-dimensional hyperbolic biological data as well as
hyperbolic representations learned without supervision. Our results demonstrate
that CO-SNE deflates high-dimensional hyperbolic data into a low-dimensional
space without losing its hyperbolic characteristics, significantly
outperforming popular visualization tools such as PCA, t-SNE, UMAP, and
HoroPCA, the last of which is specifically designed for hyperbolic data.
Related papers
- Hyperbolic Delaunay Geometric Alignment [52.835250875177756]
We propose a similarity score for comparing datasets in a hyperbolic space.
The core idea is counting the edges of the hyperbolic Delaunay graph connecting datapoints across the given sets.
We provide an empirical investigation on synthetic and real-life biological data and demonstrate that HyperDGA outperforms the hyperbolic version of classical distances between sets.
arXiv Detail & Related papers (2024-04-12T17:14:58Z)
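As a rough illustration of the edge-counting idea in the HyperDGA entry above (the paper's exact construction may differ): in the Poincaré disk, the hyperbolic Delaunay complex can be obtained from the Euclidean Delaunay triangulation of the same coordinates by discarding faces whose circumcircles leave the unit disk; the sketch below skips that filtering step for brevity.

```python
# A rough, assumed sketch of a Delaunay cross-set edge score; it is not
# HyperDGA itself, and it skips the circumcircle filtering needed for the
# exact hyperbolic Delaunay complex.
import numpy as np
from scipy.spatial import Delaunay

def cross_set_edge_score(A, B):
    """Fraction of Delaunay edges connecting set A to set B (disk coords)."""
    pts = np.vstack([A, B])
    labels = np.array([0] * len(A) + [1] * len(B))
    tri = Delaunay(pts)

    edges = set()
    for simplex in tri.simplices:  # collect unique edges of each triangle
        for i in range(len(simplex)):
            for j in range(i + 1, len(simplex)):
                a, b = sorted((simplex[i], simplex[j]))
                edges.add((a, b))

    cross = sum(labels[i] != labels[j] for i, j in edges)
    return cross / len(edges)  # higher = the two sets are more intermixed

# Example: two point clouds inside the unit disk.
rng = np.random.default_rng(0)
score = cross_set_edge_score(rng.normal(0.0, 0.15, (50, 2)),
                             rng.normal(0.3, 0.15, (50, 2)))
```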
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
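For reference, the standard squared-loss Gromov-Wasserstein discrepancy underlying the entry above compares two datasets only through their intra-set distance matrices; the paper's exact formulation may differ.

```latex
% Standard Gromov-Wasserstein discrepancy (reference form; the paper's
% variant may differ). $C \in \mathbb{R}^{n \times n}$ and
% $C' \in \mathbb{R}^{m \times m}$ are intra-set distance matrices, and
% $\Pi(\mu, \nu)$ is the set of couplings with marginals $\mu, \nu$:
\[
  \mathrm{GW}(C, C') \;=\; \min_{T \in \Pi(\mu, \nu)}
  \sum_{i,j,k,l} \bigl( C_{ij} - C'_{kl} \bigr)^{2}\, T_{ik}\, T_{jl}.
\]
```

Because a coupling $T$ can match many points to few, choosing $m \ll n$ yields, roughly, a clustering-like reduction, which is the sense in which DR and clustering become special cases.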
- Understanding and Mitigating Hyperbolic Dimensional Collapse in Graph Contrastive Learning [70.0681902472251]
We propose a novel contrastive learning framework to learn high-quality graph embeddings in hyperbolic space.
Specifically, we design an alignment metric that effectively captures hierarchical, data-invariant information.
We show that in hyperbolic space one has to address leaf- and height-level uniformity, which relate to properties of trees.
arXiv Detail & Related papers (2023-10-27T15:31:42Z)
- Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin [49.12496652756007]
We show that the best few-shot results are attained for hyperbolic embeddings at a common hyperbolic radius.
In contrast to prior benchmark results, we demonstrate that better performance can be achieved by a fixed-radius encoder equipped with the Euclidean metric.
arXiv Detail & Related papers (2023-09-18T14:51:46Z)
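A minimal sketch (assumed; not the paper's code) of what a "common hyperbolic radius" means in the Poincaré ball: a point at hyperbolic distance $r$ from the origin has Euclidean norm $\tanh(r/2)$, so encoder features can simply be rescaled onto that sphere.

```python
# Assumed illustration: project feature vectors to a fixed hyperbolic
# radius r in the Poincare ball, using d(0, x) = 2 * artanh(||x||).
import numpy as np

def to_fixed_hyperbolic_radius(Z, r=1.0, eps=1e-9):
    """Rescale rows of Z so each has Poincare distance r to the origin."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / (norms + eps) * np.tanh(r / 2.0)  # target norm tanh(r/2)
```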
- Tight and fast generalization error bound of graph embedding in metric space [54.279425319381374]
We show that graph embedding in non-Euclidean metric space can outperform that in Euclidean space with much smaller training data than the existing bound has suggested.
Our new upper bound is significantly tighter and faster than the existing one, which can be exponential in $R$ and at best $O(\frac{1}{S})$.
arXiv Detail & Related papers (2023-05-13T17:29:18Z)
- FFHR: Fully and Flexible Hyperbolic Representation for Knowledge Graph Completion [45.470475498688344]
Some important operations in hyperbolic space still lack good definitions, making existing methods unable to fully leverage the merits of hyperbolic space.
We develop a \textbf{F}ully and \textbf{F}lexible \textbf{H}yperbolic \textbf{R}epresentation framework (\textbf{FFHR}) that is able to transfer recent Euclidean-based advances to hyperbolic space.
arXiv Detail & Related papers (2023-02-07T14:50:28Z)
- HRCF: Enhancing Collaborative Filtering via Hyperbolic Geometric Regularization [52.369435664689995]
We introduce \textit{Hyperbolic Regularization powered Collaborative Filtering} (HRCF) and design a geometry-aware hyperbolic regularizer.
Specifically, the proposal boosts the optimization procedure via root alignment and an origin-aware penalty.
Our proposal tackles the over-smoothing problem caused by hyperbolic aggregation and also gives the models better discriminative ability.
arXiv Detail & Related papers (2022-04-18T06:11:44Z)
- Nested Hyperbolic Spaces for Dimensionality Reduction and Hyperbolic NN Design [8.250374560598493]
Hyperbolic neural networks have been popular in the recent past due to their ability to represent hierarchical data sets effectively and efficiently.
The challenge in developing these networks lies in the nonlinearity of the embedding space, namely the hyperbolic space.
We present a novel fully hyperbolic neural network which uses the concept of projections (embeddings) followed by an intrinsic aggregation and a nonlinearity, all within the hyperbolic space.
arXiv Detail & Related papers (2021-12-03T03:20:27Z)
- Highly Scalable and Provably Accurate Classification in Poincare Balls [40.82908295137667]
We establish a unified framework for learning scalable and simple hyperbolic linear classifiers with provable performance guarantees.
Our results include a new hyperbolic and second-order perceptron algorithm as well as an efficient and highly accurate convex optimization setup for hyperbolic support vector machine classifiers.
We demonstrate their performance accuracy on synthetic data sets comprising millions of points, as well as on complex real-world data sets such as single-cell RNA-seq expression measurements, CIFAR10, Fashion-MNIST and mini-ImageNet.
arXiv Detail & Related papers (2021-09-08T16:59:39Z)
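As a hedged illustration of what a hyperbolic linear classifier can look like (not the exact perceptron or SVM formulation of the entry above): in the hyperboloid (Lorentz) model, geodesic decision boundaries are cuts of the hyperboloid by Minkowski-orthogonal hyperplanes through the origin, so prediction reduces to the sign of a Lorentzian inner product.

```python
# Assumed sketch of a linear decision function in the hyperboloid model;
# the perceptron update / SVM training from the paper is not shown.
import numpy as np

def minkowski_inner(u, v):
    """Lorentzian inner product: -u0*v0 + u1*v1 + ... + un*vn."""
    return -u[..., 0] * v[..., 0] + np.sum(u[..., 1:] * v[..., 1:], axis=-1)

def lift_to_hyperboloid(x):
    """Lift Euclidean coords x to the hyperboloid x0 = sqrt(1 + ||x||^2)."""
    x0 = np.sqrt(1.0 + np.sum(x ** 2, axis=-1, keepdims=True))
    return np.concatenate([x0, x], axis=-1)

def predict(w, x):
    """Label = side of the geodesic hyperplane {z : <w, z>_L = 0}."""
    return np.sign(minkowski_inner(w, lift_to_hyperboloid(x)))
```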
- Unit Ball Model for Hierarchical Embeddings in Complex Hyperbolic Space [28.349200177632852]
Learning the representation of data with hierarchical structures in hyperbolic space has attracted increasing attention in recent years.
We propose to learn the graph embeddings in the unit ball model of the complex hyperbolic space.
arXiv Detail & Related papers (2021-05-09T16:09:54Z)
- A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general purpose dimensionality reduction.
To train the neural network, we build the neighborhood similarity graph of a data matrix and define the context of its data points.
Experiments on data classification and clustering over eight real datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis tests.
arXiv Detail & Related papers (2021-03-10T23:10:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.