From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical
Clustering
- URL: http://arxiv.org/abs/2010.00402v1
- Date: Thu, 1 Oct 2020 13:43:19 GMT
- Title: From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical
Clustering
- Authors: Ines Chami, Albert Gu, Vaggos Chatziafratis and Christopher Ré
- Abstract summary: We present the first continuous relaxation of Dasgupta's discrete optimization problem with provable quality guarantees.
We show that even approximate solutions found with gradient descent have better clustering quality than agglomerative heuristics.
We also highlight the flexibility of HypHC using end-to-end training in a downstream classification task.
- Score: 33.000371053304676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Similarity-based Hierarchical Clustering (HC) is a classical unsupervised
machine learning algorithm that has traditionally been solved with heuristic
algorithms like Average-Linkage. Recently, Dasgupta reframed HC as a discrete
optimization problem by introducing a global cost function measuring the
quality of a given tree. In this work, we provide the first continuous
relaxation of Dasgupta's discrete optimization problem with provable quality
guarantees. The key idea of our method, HypHC, is showing a direct
correspondence from discrete trees to continuous representations (via the
hyperbolic embeddings of their leaf nodes) and back (via a decoding algorithm
that maps leaf embeddings to a dendrogram), allowing us to search the space of
discrete binary trees with continuous optimization. Building on analogies
between trees and hyperbolic space, we derive a continuous analogue for the
notion of lowest common ancestor, which leads to a continuous relaxation of
Dasgupta's discrete objective. We can show that after decoding, the global
minimizer of our continuous relaxation yields a discrete tree with a (1 +
epsilon)-factor approximation for Dasgupta's optimal tree, where epsilon can be
made arbitrarily small and controls optimization challenges. We experimentally
evaluate HypHC on a variety of HC benchmarks and find that even approximate
solutions found with gradient descent have better clustering quality than
agglomerative heuristics or other gradient-based algorithms. Finally, we
highlight the flexibility of HypHC using end-to-end training in a downstream
classification task.
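The correspondence described above can be made concrete. Dasgupta's discrete
cost for a similarity matrix w and a binary tree T is
C(T) = sum_{i,j} w_ij * |leaves(T[i v j])|, where T[i v j] is the subtree rooted
at the lowest common ancestor (LCA) of leaves i and j. HypHC embeds each leaf as
a point in the Poincare disk, uses the point on the geodesic between two leaf
embeddings that lies closest to the origin as a continuous analogue of their
LCA, and relaxes an equivalent triplet form of Dasgupta's cost by replacing the
hard "which pair has the deepest LCA" decision with a softmax over LCA depths.
The sketch below is a minimal NumPy illustration of that idea, not the authors'
released implementation: the grid search along the geodesic, the temperature
tau, the toy similarity matrix W, and the omission of the constant additive
term are simplifications introduced here, and the paper's closed-form LCA depth
and exact objective may differ in detail.

import numpy as np

def poincare_dist(x, y, eps=1e-9):
    # Hyperbolic distance between two points of the open unit (Poincare) ball.
    sq = np.sum((x - y) ** 2)
    denom = (1.0 - np.sum(x ** 2)) * (1.0 - np.sum(y ** 2))
    return np.arccosh(1.0 + 2.0 * sq / max(denom, eps))

def mobius_add(x, y, eps=1e-9):
    # Mobius addition, used to parameterize geodesics in the Poincare ball.
    xy = np.dot(x, y)
    nx, ny = np.sum(x ** 2), np.sum(y ** 2)
    num = (1.0 + 2.0 * xy + ny) * x + (1.0 - nx) * y
    return num / max(1.0 + 2.0 * xy + nx * ny, eps)

def geodesic(x, y, t, eps=1e-9):
    # Point at fraction t in [0, 1] along the geodesic from x to y.
    v = mobius_add(-x, y)
    nv = np.linalg.norm(v)
    if nv < eps:
        return x
    return mobius_add(x, np.tanh(t * np.arctanh(min(nv, 1.0 - eps))) * v / nv)

def lca_depth(x, y, n_steps=64):
    # Continuous analogue of LCA depth: distance from the origin to the closest
    # point on the geodesic between x and y (approximated here by a grid search).
    origin = np.zeros_like(x)
    return min(poincare_dist(origin, geodesic(x, y, t))
               for t in np.linspace(0.0, 1.0, n_steps))

def relaxed_dasgupta_cost(Z, W, tau=0.1):
    # Softmax-based triplet relaxation of a Dasgupta-style cost (constant terms
    # dropped). Z: (n, 2) leaf embeddings inside the unit disk, W: (n, n)
    # pairwise similarities. Placing similar pairs so that their continuous LCA
    # is deepest (farthest from the origin) lowers the cost.
    n = Z.shape[0]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                w = np.array([W[i, j], W[i, k], W[j, k]])
                d = np.array([lca_depth(Z[i], Z[j]),
                              lca_depth(Z[i], Z[k]),
                              lca_depth(Z[j], Z[k])])
                soft = np.exp(d / tau) / np.sum(np.exp(d / tau))
                total += np.sum(w) - np.dot(w, soft)
    return total

# Toy check: items {0, 1} and {2, 3} are similar; an embedding that places the
# similar pairs close together should get a lower relaxed cost than a scrambled one.
W = np.array([[0.0, 1.0, 0.1, 0.1],
              [1.0, 0.0, 0.1, 0.1],
              [0.1, 0.1, 0.0, 1.0],
              [0.1, 0.1, 1.0, 0.0]])
Z_good = 0.7 * np.array([[0.9, 0.1], [0.8, 0.2], [-0.9, -0.1], [-0.8, -0.2]])
Z_bad = 0.7 * np.array([[0.9, 0.1], [-0.9, -0.1], [0.8, 0.2], [-0.8, -0.2]])
print(relaxed_dasgupta_cost(Z_good, W), relaxed_dasgupta_cost(Z_bad, W))

In the paper's setting, the leaf embeddings would be optimized with gradient
descent on the full version of such a relaxed cost (constrained to the Poincare
disk), and a decoding step would then map the optimized embeddings back to a
dendrogram, e.g. by agglomerating pairs in order of decreasing continuous LCA depth.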
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Optimal estimation of Gaussian (poly)trees [25.02920605955238]
We consider both the problem of distribution learning (i.e., in KL distance) and that of structure learning (i.e., exact recovery).
The first approach is based on the Chow-Liu algorithm, and learns an optimal tree-structured distribution efficiently.
The second approach is a modification of the PC algorithm for polytrees that uses partial correlation as a conditional independence tester for constraint-based structure learning.
arXiv Detail & Related papers (2024-02-09T12:58:36Z) - Sample-and-Bound for Non-Convex Optimization [18.30858789210194]
We propose new sampling-based methods for non-convex optimization that adapt Monte Carlo tree search to improve efficiency.
We evaluate the proposed algorithms against competitive baselines on challenging benchmarks.
arXiv Detail & Related papers (2024-01-09T20:45:47Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - EM's Convergence in Gaussian Latent Tree Models [22.987933817370305]
We show that the unique non-trivial stationary point of the population log-likelihood is its global maximum.
We establish that the expectation-maximization algorithm is guaranteed to converge to it in the single latent variable case.
arXiv Detail & Related papers (2022-11-21T23:12:58Z) - Discrete Tree Flows via Tree-Structured Permutations [5.929956715430168]
Discrete flow-based models cannot be straightforwardly optimized with conventional deep learning methods because gradients of discrete functions are undefined or zero.
Our approach seeks to reduce computational burden and remove the need for pseudo-gradients by developing a discrete flow based on decision trees.
arXiv Detail & Related papers (2022-07-04T23:11:04Z) - Tree ensemble kernels for Bayesian optimization with known constraints
over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
arXiv Detail & Related papers (2022-07-02T16:59:37Z) - CITS: Coherent Ising Tree Search Algorithm Towards Solving Combinatorial
Optimization Problems [0.0]
This paper proposes a search algorithm that expands the search space from a Markov chain to a depth-limited tree based on simulated annealing (SA).
At each iteration, the algorithm selects the best near-optimal solution within the feasible search space by exploring along the tree in a look-ahead fashion.
Our results show that, compared with standalone SA and CIM, our high-level tree search strategy is able to provide solutions within fewer epochs for Ising-formulated NP optimization problems.
arXiv Detail & Related papers (2022-03-09T10:07:26Z) - Gradient Boosted Binary Histogram Ensemble for Large-scale Regression [60.16351608335641]
We propose a gradient boosting algorithm for large-scale regression problems called Gradient Boosted Binary Histogram Ensemble (GBBHE), based on binary histogram partition and ensemble learning.
In the experiments, compared with other state-of-the-art algorithms such as gradient boosted regression tree (GBRT), our GBBHE algorithm shows promising performance with less running time on large-scale datasets.
arXiv Detail & Related papers (2021-06-03T17:05:40Z) - Improved Branch and Bound for Neural Network Verification via Lagrangian
Decomposition [161.09660864941603]
We improve the scalability of Branch and Bound (BaB) algorithms for formally proving input-output properties of neural networks.
We present a novel activation-based branching strategy and a BaB framework, named Branch and Dual Network Bound (BaDNB).
BaDNB outperforms previous complete verification systems by a large margin, cutting average verification times by factors up to 50 on adversarial properties.
arXiv Detail & Related papers (2021-04-14T09:22:42Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)