Mixed-Curvature Decision Trees and Random Forests
- URL: http://arxiv.org/abs/2406.05227v2
- Date: Thu, 18 Jul 2024 14:11:39 GMT
- Title: Mixed-Curvature Decision Trees and Random Forests
- Authors: Philippe Chlenski, Quentin Chu, Itsik Pe'er
- Abstract summary: We extend decision tree and random forest algorithms to product space manifolds.
Our method provides a simple, expressive approach to classification and regression in product manifolds.
Code for our implementation and experiments is available at https://github.com/pchlenski/embedders.
- Score: 0.6656737591902598
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We extend decision tree and random forest algorithms to product space manifolds: Cartesian products of Euclidean, hyperspherical, and hyperbolic manifolds. Such spaces have extremely expressive geometries capable of representing many arrangements of distances with low metric distortion. To date, all classifiers for product spaces fit a single linear decision boundary, and no regressor has been described. Our method provides a simple, expressive approach to classification and regression in product manifolds. We demonstrate the superior accuracy of our tool compared to Euclidean methods operating in the ambient space or the tangent plane of the manifold across a range of constant-curvature and product manifolds. Code for our implementation and experiments is available at https://github.com/pchlenski/embedders.
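The product-space distance underlying such methods decomposes per factor: the squared distance in a product manifold is the sum of squared distances in its components. Below is a minimal NumPy sketch of this decomposition for a sphere x hyperboloid x Euclidean product; it illustrates the geometry only and is not the authors' implementation (see the linked repository for that).

```python
import numpy as np

def euclidean_dist(x, y):
    """Straight-line distance in a flat (Euclidean) factor."""
    return np.linalg.norm(x - y)

def spherical_dist(x, y):
    """Arc length between unit vectors on a hypersphere."""
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

def hyperbolic_dist(x, y):
    """Geodesic distance in the hyperboloid (Lorentz) model, where points
    satisfy <x, x>_L = -x0^2 + sum_i xi^2 = -1."""
    minkowski = -x[0] * y[0] + np.dot(x[1:], y[1:])
    return np.arccosh(np.clip(-minkowski, 1.0, None))

def product_dist(xs, ys, factor_dists):
    """Squared distances add across the factors of a product manifold."""
    return np.sqrt(sum(d(x, y) ** 2 for d, x, y in zip(factor_dists, xs, ys)))

# Example on S^2 x H^2 x R^2 (each factor point satisfies its constraint):
x = [np.array([1.0, 0.0, 0.0]),                   # on the unit sphere
     np.array([1.0, 0.0, 0.0]),                   # hyperboloid "origin"
     np.array([0.0, 0.0])]
y = [np.array([0.0, 1.0, 0.0]),
     np.array([np.cosh(1.0), np.sinh(1.0), 0.0]),
     np.array([3.0, 4.0])]
print(product_dist(x, y, [spherical_dist, hyperbolic_dist, euclidean_dist]))
```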
Related papers
- Mixed-curvature decision trees and random forests [0.9175022232984708]
Decision trees (DTs) and their random forest (RF) extensions are workhorses of classification and regression in Euclidean spaces.
We extend DT and RF algorithms to product manifolds: Cartesian products of several hyperbolic, hyperspherical, or Euclidean components.
Our novel angular reformulation of DTs respects the geometry of the product manifold, yielding splits that are geodesically convex, maximum-margin, and composable.
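To make the angular idea concrete: rather than thresholding a raw coordinate, a split can threshold the angle of a point's projection onto a two-dimensional coordinate plane, which corresponds to a geodesic boundary through the origin in the ambient model. The following is a hypothetical sketch of such a split predicate, not the paper's code; the names and parametrization are illustrative.

```python
import numpy as np

def angular_split(X, dim_pair, theta):
    """Hypothetical angular split test.

    X        : (n, d) array of ambient coordinates.
    dim_pair : indices (i, j) of the two coordinates spanning the plane.
    theta    : split angle in radians.
    Returns a boolean mask: True for points sent to the left child.
    """
    i, j = dim_pair
    angles = np.arctan2(X[:, j], X[:, i])
    return angles < theta
```

A tree built from such tests would, per the paper, inherit geodesically convex decision regions; this sketch shows only the split predicate itself.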
arXiv Detail & Related papers (2024-10-03T00:48:07Z)
- Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method scales to high-dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high-dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z)
- Gromov-Hausdorff Distances for Comparing Product Manifolds of Model Spaces [21.97865037637575]
We introduce a novel notion of distance between candidate latent geometries using the Gromov-Hausdorff distance from metric geometry.
We propose a graph-based search space that uses the estimated Gromov-Hausdorff distances to find the optimal latent geometry.
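For reference, the Gromov-Hausdorff distance between compact metric spaces $(X, d_X)$ and $(Y, d_Y)$ can be characterized via correspondences $R \subseteq X \times Y$ (relations whose projections cover both spaces):

```latex
d_{\mathrm{GH}}(X, Y) \;=\; \tfrac{1}{2}\,
\inf_{R}\; \sup_{(x, y),\, (x', y') \in R}
\bigl|\, d_X(x, x') - d_Y(y, y') \,\bigr|
```

Intuitively, it is half the smallest achievable metric distortion over all ways of matching the two spaces.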
arXiv Detail & Related papers (2023-09-09T11:17:06Z)
- Knowledge-based Multiple Adaptive Spaces Fusion for Recommendation [35.20583774988951]
We propose a knowledge-based multiple adaptive spaces fusion method for recommendation, namely MCKG.
Unlike existing methods that adopt a single specific manifold, we introduce a unified space compatible with hyperbolic, Euclidean, and spherical spaces.
In addition, we propose a geometry-aware optimization strategy that enables the pull and push processes to benefit from both hyperbolic and spherical spaces.
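A generic sketch of such a pull-push objective is a margin-based ranking loss over manifold distances: observed pairs are pulled together, negatives pushed apart. This is an illustrative form, not MCKG's actual objective.

```python
import numpy as np

def pull_push_loss(d_pos, d_neg, margin=1.0):
    """Margin ranking loss over manifold distances.

    d_pos : distances for observed (user, item) pairs -- pulled together.
    d_neg : distances for negative samples -- pushed apart.
    """
    return np.mean(np.maximum(0.0, margin + d_pos - d_neg))
```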
arXiv Detail & Related papers (2023-08-29T12:11:16Z)
- HRCF: Enhancing Collaborative Filtering via Hyperbolic Geometric Regularization [52.369435664689995]
We introduce Hyperbolic Regularization powered Collaborative Filtering (HRCF) and design a geometry-aware hyperbolic regularizer.
Specifically, the proposal boosts the optimization procedure via root alignment and an origin-aware penalty.
Our proposal tackles the over-smoothing problem caused by hyperbolic aggregation and also gives the models better discriminative ability.
arXiv Detail & Related papers (2022-04-18T06:11:44Z)
- Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to checking the manifold hypothesis and estimating the underlying manifold dimension.
Our geometric method modifies the well-known box-counting algorithm for Minkowski dimension calculation to handle sparse data.
Experiments on real datasets show that the suggested approach, based on combining the two methods, is powerful and effective.
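Box counting estimates the Minkowski dimension from the scaling law $N(\varepsilon) \sim \varepsilon^{-\dim}$: the dimension is the slope of $\log N(\varepsilon)$ against $\log(1/\varepsilon)$. A minimal sketch of the classical algorithm follows; the paper's sparse-data modification is more involved.

```python
import numpy as np

def box_counting_dimension(X, epsilons):
    """Estimate the Minkowski (box-counting) dimension of a point cloud.

    X        : (n, d) array of points.
    epsilons : iterable of box side lengths to test.
    """
    counts = []
    for eps in epsilons:
        # Snap each point to a grid cell of side eps and count occupied cells.
        occupied = np.unique(np.floor(X / eps), axis=0)
        counts.append(len(occupied))
    # The dimension is the slope of log N(eps) against log(1/eps).
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(epsilons)), np.log(counts), 1)
    return slope
```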
arXiv Detail & Related papers (2021-07-08T15:35:54Z)
- Linear Classifiers in Mixed Constant Curvature Spaces [40.82908295137667]
We address the problem of linear classification in a product space form -- a mix of Euclidean, spherical, and hyperbolic spaces.
We prove that linear classifiers in $d$-dimensional constant curvature spaces can shatter exactly $d+1$ points.
We describe a novel perceptron classification algorithm, and establish rigorous convergence results.
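In the hyperboloid model, a "linear" boundary is the zero set of a Minkowski inner product with a weight vector, so a perceptron only needs to swap the inner product it uses. The sketch below transplants the classical update rule as an illustration; the paper's actual algorithm and its convergence analysis are more careful than this.

```python
import numpy as np

def minkowski_inner(u, v):
    """Lorentz inner product <u, v>_L = -u0*v0 + sum_i ui*vi."""
    return -u[0] * v[0] + np.dot(u[1:], v[1:])

def hyperbolic_perceptron(X, y, epochs=100):
    """Perceptron-style learner on hyperboloid-model points.

    X : (n, d+1) points satisfying <x, x>_L = -1; y : labels in {-1, +1}.
    The decision boundary is {x : <w, x>_L = 0}, a geodesic hyperplane.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * minkowski_inner(w, xi) <= 0:  # misclassified
                w = w + yi * xi                   # classical perceptron update
    return w
```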
arXiv Detail & Related papers (2021-02-19T23:29:03Z)
- Switch Spaces: Learning Product Spaces with Sparse Gating [48.591045282317424]
We propose Switch Spaces, a data-driven approach for learning representations in product spaces.
We introduce sparse gating mechanisms that learn to choose, combine and switch spaces.
Experiments on knowledge graph completion and item recommendation show that the proposed switch spaces achieve new state-of-the-art performance.
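Sparse gating of this kind can be pictured as a top-k softmax: score each candidate space for a given input, keep the k best, and renormalize their weights. A toy sketch with illustrative names, not the paper's implementation:

```python
import numpy as np

def topk_gate(scores, k):
    """Sparse mixture weights over candidate spaces (top-k softmax).

    scores : (K,) learned affinity of the input to each candidate space.
    k      : number of spaces kept active.
    """
    idx = np.argsort(scores)[-k:]              # the k highest-scoring spaces
    gates = np.zeros_like(scores, dtype=float)
    shifted = np.exp(scores[idx] - scores[idx].max())
    gates[idx] = shifted / shifted.sum()       # renormalize over the survivors
    return gates

# Example: four candidate spaces, two kept active.
print(topk_gate(np.array([0.1, 2.0, -0.5, 1.3]), k=2))
```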
arXiv Detail & Related papers (2021-02-17T11:06:59Z)
- Quadric hypersurface intersection for manifold learning in feature space [52.83976795260532]
We present a manifold learning technique suitable for moderately high-dimensional data and large datasets.
The manifold is learned from the training data as an intersection of quadric hypersurfaces.
At test time, this manifold can be used to introduce an outlier score for arbitrary new points.
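One plausible reading of such an outlier score is the aggregated residual of the quadric equations $q_i(x) = x^\top A_i x + b_i^\top x + c_i$ at the test point, which is zero exactly on the learned intersection. A hedged sketch, not necessarily the paper's exact score:

```python
import numpy as np

def quadric_residual(x, A, b, c):
    """q(x) = x^T A x + b^T x + c; zero exactly on the quadric."""
    return x @ A @ x + b @ x + c

def outlier_score(x, quadrics):
    """Aggregate absolute residuals over the quadrics whose intersection
    defines the learned manifold; larger means farther from the manifold."""
    return sum(abs(quadric_residual(x, A, b, c)) for A, b, c in quadrics)
```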
arXiv Detail & Related papers (2021-02-11T18:52:08Z)
- Manifold Learning via Manifold Deflation [105.7418091051558]
Dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
However, many popular methods can fail dramatically, even on simple two-dimensional manifolds.
This paper presents an embedding method built on a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z)
- Ultrahyperbolic Representation Learning [13.828165530602224]
In machine learning, data is usually represented in a (flat) Euclidean space, where distances between points are measured along straight lines.
We propose a representation living on a pseudo-Riemannian manifold of constant nonzero curvature.
We provide the necessary learning tools in this geometry and extend gradient-based optimization techniques.
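Concretely, such a space can be realized as a pseudo-hyperboloid: the level set of an indefinite inner product of signature $(p, q)$. Under one common convention (sign conventions vary across papers):

```latex
\mathcal{Q}^{p,q}_{\beta}
  = \bigl\{\, x \in \mathbb{R}^{p+q+1} \;:\; \langle x, x \rangle_q = -\beta \,\bigr\},
\qquad
\langle x, y \rangle_q
  = \sum_{i=1}^{p} x_i y_i \;-\; \sum_{j=p+1}^{p+q+1} x_j y_j
```

Setting $q = 0$ recovers the hyperboloid model of hyperbolic space, while $p = 0$ yields a hypersphere (up to sign).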
arXiv Detail & Related papers (2020-07-01T03:49:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.