Related papers: Hyperbolic Random Forests

URL: http://arxiv.org/abs/2308.13279v2
Date: Mon, 24 Jun 2024 13:57:01 GMT
Title: Hyperbolic Random Forests
Authors: Lars Doorenbos, Pablo Márquez-Neila, Raphael Sznitman, Pascal Mettes
Abstract summary: We generalize the well-known random forests to hyperbolic space. We do this by redefining the notion of a split using horospheres. We also outline a new method for combining classes based on their lowest common ancestor and a class-balanced version of the large-margin loss.
Score: 15.992363138277442
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hyperbolic space is becoming a popular choice for representing data due to the hierarchical structure - whether implicit or explicit - of many real-world datasets. Along with it comes a need for algorithms capable of solving fundamental tasks, such as classification, in hyperbolic space. Recently, multiple papers have investigated hyperbolic alternatives to hyperplane-based classifiers, such as logistic regression and SVMs. While effective, these approaches struggle with more complex hierarchical data. We, therefore, propose to generalize the well-known random forests to hyperbolic space. We do this by redefining the notion of a split using horospheres. Since finding the globally optimal split is computationally intractable, we find candidate horospheres through a large-margin classifier. To make hyperbolic random forests work on multi-class data and imbalanced experiments, we furthermore outline a new method for combining classes based on their lowest common ancestor and a class-balanced version of the large-margin loss. Experiments on standard and new benchmarks show that our approach outperforms both conventional random forest algorithms and recent hyperbolic classifiers.
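Illustration: the abstract describes splits defined by horospheres. A minimal sketch of what such a split could look like in the Poincaré ball is given below. It assumes the standard Busemann function B_w(x) = log(||w - x||^2 / (1 - ||x||^2)) for an ideal point w on the unit sphere, whose level sets are horospheres. The function names `busemann` and `horosphere_split` and the threshold `b` are hypothetical and are not taken from the paper, and the sketch does not cover the large-margin search for candidate horospheres.

```python
import math


def busemann(x, w):
    """Busemann function on the Poincare ball for an ideal point w (||w|| = 1).

    Assumption: standard textbook form; the paper's exact parametrization
    is not given in the abstract.
    """
    diff_sq = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    norm_sq = sum(xi ** 2 for xi in x)
    return math.log(diff_sq / (1.0 - norm_sq))


def horosphere_split(points, w, b):
    """Partition points by the horosphere {x : B_w(x) = b} (hypothetical helper)."""
    inside, outside = [], []
    for x in points:
        # Points with B_w(x) <= b lie in the closed horoball bounded by the horosphere.
        (inside if busemann(x, w) <= b else outside).append(x)
    return inside, outside


# Toy example in the 2D Poincare disk with ideal point w = (1, 0).
w = (1.0, 0.0)
points = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5)]
inside, outside = horosphere_split(points, w, b=0.0)
```

With b = 0 the horosphere passes through the origin, so points closer to w than the origin (in the Busemann sense) fall on the inside of the split.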

Related papers

An Enhanced Model-based Approach for Short Text Clustering [58.60681789677676]
Short text clustering has become increasingly important with the popularity of social media like Twitter, Google+, and Facebook. Existing methods can be broadly categorized into two paradigms: topic model-based approaches and deep representation learning-based approaches. We propose a collapsed Gibbs Sampling algorithm for the Dirichlet Multinomial Mixture model (GSDMM), which effectively handles the sparsity and high dimensionality of short texts. Based on several aspects of GSDMM that warrant further refinement, we propose an improved approach, GSDMM+, designed to further optimize its performance.
arXiv Detail & Related papers (2025-07-18T10:07:42Z)
Balanced Hyperbolic Embeddings Are Natural Out-of-Distribution Detectors [13.816378672996784]
A hierarchical hyperbolic embedding is preferred for discriminating in- and out-of-distribution samples. We outline a hyperbolic class embedding algorithm that jointly optimizes for hierarchical distortion and balancing between shallow and wide subhierarchies. We show that our hyperbolic embeddings outperform other hyperbolic approaches, beat state-of-the-art contrastive methods, and enable hierarchical out-of-distribution generalization.
arXiv Detail & Related papers (2025-06-11T19:51:06Z)
Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data. The first strategy performs a local neighborhood sampling to reduce the dataset size in each without violating neighborhood relationships. A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture. It can model the feature space more comprehensively and reduce the dominance of head classes. The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
HyperAid: Denoising in hyperbolic spaces for tree-fitting and hierarchical clustering [36.738414547278154]
We propose a new approach to tree-metric denoising (HyperAid) in hyperbolic spaces. It transforms original data into data that is "more" tree-like, when evaluated in terms of Gromov's $\delta$-hyperbolicity. We integrate HyperAid with schemes for enforcing nonnegative edge-weights.
arXiv Detail & Related papers (2022-05-19T17:33:16Z)
Contrastive Multi-view Hyperbolic Hierarchical Clustering [33.050054725595736]
We propose Contrastive Multi-view Hyperbolic Hierarchical Clustering (CMHHC). It consists of three components, i.e., multi-view alignment learning, aligned feature similarity learning, and continuous hyperbolic hierarchical clustering. Experimental results on five real-world datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-05-05T12:56:55Z)
HRCF: Enhancing Collaborative Filtering via Hyperbolic Geometric Regularization [52.369435664689995]
We introduce Hyperbolic Regularization powered Collaborative Filtering (HRCF) and design a geometry-aware hyperbolic regularizer. Specifically, the proposal boosts the optimization procedure via root alignment and an origin-aware penalty. Our proposal is able to tackle the over-smoothing problem caused by hyperbolic aggregation and also gives the models better discriminative ability.
arXiv Detail & Related papers (2022-04-18T06:11:44Z)
X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning. To take the power of both worlds, we propose a novel X-model. X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
Highly Scalable and Provably Accurate Classification in Poincare Balls [40.82908295137667]
We establish a unified framework for learning scalable and simple hyperbolic linear classifiers with provable performance guarantees. Our results include a new hyperbolic and second-order perceptron algorithm as well as an efficient and highly accurate convex optimization setup for hyperbolic support vector machine classifiers. Their accuracy is reported on synthetic data sets comprising millions of points, as well as on complex real-world data sets such as single-cell RNA-seq expression measurements, CIFAR10, Fashion-MNIST and mini-ImageNet.
arXiv Detail & Related papers (2021-09-08T16:59:39Z)
A Fully Hyperbolic Neural Model for Hierarchical Multi-Class Classification [7.8176853587105075]
Hyperbolic spaces offer a mathematically appealing approach for learning hierarchical representations of symbolic data. This work proposes a fully hyperbolic model for multi-class multi-label classification, which performs all operations in hyperbolic space. A thorough analysis sheds light on the impact of each component in the final prediction and showcases its ease of integration with Euclidean layers.
arXiv Detail & Related papers (2020-10-05T14:42:56Z)
Solving Long-tailed Recognition with Deep Realistic Taxonomic Classifier [68.38233199030908]
Long-tail recognition tackles the naturally non-uniformly distributed data found in real-world scenarios. While modern classifiers perform well on populated classes, their performance degrades significantly on tail classes. Deep-RTC is proposed as a new solution to the long-tail problem, combining realism with hierarchical predictions.
arXiv Detail & Related papers (2020-07-20T05:57:42Z)
Robust Large-Margin Learning in Hyperbolic Space [64.42251583239347]
We present the first theoretical guarantees for learning a classifier in hyperbolic rather than Euclidean space. We provide an algorithm to efficiently learn a large-margin hyperplane, relying on the careful injection of adversarial examples. We prove that for hierarchical data that embeds well into hyperbolic space, the low embedding dimension ensures superior guarantees.
arXiv Detail & Related papers (2020-04-11T19:11:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.