Related papers: Fast unsupervised ground metric learning with tree-Wasserstein distance

Fast unsupervised ground metric learning with tree-Wasserstein distance

URL: http://arxiv.org/abs/2411.07432v1
Date: Mon, 11 Nov 2024 23:21:01 GMT
Title: Fast unsupervised ground metric learning with tree-Wasserstein distance
Authors: Kira M. Düsterwald, Samo Hromadka, Makoto Yamada,
Abstract summary: unsupervised ground metric learning approaches have been introduced. We propose to augment the WSV method by embedding samples and features on trees, on which we compute the tree-Wasserstein distance (TWD) We demonstrate theoretically and empirically that the algorithm converges to a better approximation of the full WSV approach than the best known alternatives, and does so with $mathcalO(n3)$ complexity.
Score: 14.235762519615175
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: The performance of unsupervised methods such as clustering depends on the choice of distance metric between features, or ground metric. Commonly, ground metrics are decided with heuristics or learned via supervised algorithms. However, since many datasets are unlabelled, unsupervised ground metric learning approaches have been introduced. One recent, promising option uses Wasserstein singular vectors (WSV), which emerge when computing optimal transport distances between features and samples simultaneously. While WSV is effective, it has complexity $\mathcal{O}(n^5)$, which is prohibitively expensive in some applications. In this work, we propose to augment the WSV method by embedding samples and features on trees, on which we compute the tree-Wasserstein distance (TWD). We demonstrate theoretically and empirically that the algorithm converges to a better approximation of the full WSV approach than the best known alternatives, and does so with $\mathcal{O}(n^3)$ complexity. In addition, we prove that the initial tree structure can be chosen flexibly, since tree geometry does not constrain the richness of the approximation up to the number of edge weights. This proof suggests a fast, recursive algorithm for computing the tree parameter basis set, which we find crucial to realising the efficiency gains at scale. Finally, we employ the tree-WSV algorithm to several single-cell RNA sequencing genomics datasets, demonstrating its scalability and utility for unsupervised cell-type clustering problems. These results poise unsupervised ground metric learning with TWD as a low-rank approximation of WSV with the potential for widespread low-compute application.

Related papers

Approximate Tree Completion and Learning-Augmented Algorithms for Metric Minimum Spanning Trees [7.2092555743847155]
Finding a minimum spanning tree (MST) for $n$ points in an arbitrary metric space is a fundamental primitive for hierarchical clustering and many other ML tasks. We introduce a framework for metric MSTs that first (1) finds a forest of disconnected components using practicals, and then (2) finds a small weight set of edges to connect disjoint components of the forest into a spanning tree. We prove that optimally solving the second step still takes $Omega(n2)$ time, but we provide a subquadratic 2.62-approximation algorithm.
arXiv Detail & Related papers (2025-02-18T16:13:46Z)
Coupled Hierarchical Structure Learning using Tree-Wasserstein Distance [12.2853783834605]
We introduce a coupled hierarchical structure learning method using tree-Wasserstein distance (TWD) Our method jointly computes TWDs for samples and features, representing their latent hierarchies as trees. We show that this iterative procedure converges and empirically improves the quality of the constructed trees.
arXiv Detail & Related papers (2025-01-07T08:54:42Z)
Optimal estimation of Gaussian (poly)trees [25.02920605955238]
We consider both problems of distribution learning (i.e. in KL distance) and structure learning (i.e. exact recovery) The first approach is based on the Chow-Liu algorithm, and learns an optimal tree-structured distribution efficiently. The second approach is a modification of the PC algorithm for polytrees that uses partial correlation as a conditional independence tester for constraint-based structure learning.
arXiv Detail & Related papers (2024-02-09T12:58:36Z)
Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space. We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size. We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z)
Efficient Graph Field Integrators Meet Point Clouds [59.27295475120132]
We present two new classes of algorithms for efficient field integration on graphs encoding point clouds. The first class, SeparatorFactorization(SF), leverages the bounded genus of point cloud mesh graphs, while the second class, RFDiffusion(RFD), uses popular epsilon-nearest-neighbor graph representations for point clouds.
arXiv Detail & Related papers (2023-02-02T08:33:36Z)
Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (emphi.e., SketchedAMSGrad) utilizing sketching. Our new algorithm achieves a fast convergence rate of $O(frac1sqrtnT + frac1(k/d)2 T)$ with the communication cost of $O(k log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z)
Scalable Differentially Private Clustering via Hierarchically Separated Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d3/2log n)cdot OPT + O(k d2 log2 n / epsilon2)$, where $epsilon$ is the privacy guarantee. Although the worst-case guarantee is worse than that of state of the art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z)
Unbiased and Efficient Sampling of Dependency Trees [0.0]
Most treebanks require that every valid dependency tree has a single edge coming out of the ROOT node. Zmigrod et al. have recently proposed algorithms for sampling with and without replacement from the single-root dependency tree distribution. We show that their fastest algorithm for sampling with replacement, Wilson-RC, is in fact biased.
arXiv Detail & Related papers (2022-05-25T09:57:28Z)
Exact and Approximate Hierarchical Clustering Using A* [51.187990314731344]
We introduce a new approach based on A* search for clustering. We overcome the prohibitively large search space by combining A* with a novel emphtrellis data structure. We empirically demonstrate that our method achieves substantially higher quality results than baselines for a particle physics use case and other clustering benchmarks.
arXiv Detail & Related papers (2021-04-14T18:15:27Z)
Unsupervised Ground Metric Learning using Wasserstein Eigenvectors [0.0]
Key bottleneck is design of a "ground" cost which should be adapted to the task under study. In this paper, we propose for the first time a canonical answer by computing the ground cost as a positive eigenvector of the function mapping a cost to the pairwise OT distances between the inputs. We also introduce a scalable computational method using entropic regularization, which operates a principal component analysis dimensionality reduction.
arXiv Detail & Related papers (2021-02-11T21:32:59Z)
Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models. The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning. We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $tildeO(L5/6 T2/3)$ regret. We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
Learning Binary Decision Trees by Argmin Differentiation [34.9154848754842]
We learn binary decision trees that partition data for some downstream task. We do so by relaxing a mixed-integer program for the discrete parameters. We derive customized algorithms to efficiently compute the forward and backward passes.
arXiv Detail & Related papers (2020-10-09T15:11:28Z)
Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We show that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model. For loss factors, we prove that HSDMPG can attain an $mathcalObig (1/sttnbig)$ which is at the order of excess error on a learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z)
A Practical Index Structure Supporting Fr\'echet Proximity Queries Among Trajectories [1.9335262420787858]
We present a scalable approach for range and $k$ nearest neighbor queries under computationally expensive metrics. Based on clustering for metric indexes, we obtain a dynamic tree structure whose size is linear in the number of trajectories. We analyze the efficiency and effectiveness of our methods with extensive experiments on diverse synthetic and real-world data sets.
arXiv Detail & Related papers (2020-05-28T04:12:43Z)
Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA. We show that it achieves the minimax optimal misclustering rate in the regime $|theta infty$.
arXiv Detail & Related papers (2020-05-21T17:51:30Z)
Explainable $k$-Means and $k$-Medians Clustering [25.513261099927163]
We consider using a small decision tree to partition a data set into clusters, so that clusters can be characterized in a straightforward manner. We show that popular top-down decision tree algorithms may lead to clusterings with arbitrarily large cost. We design an efficient algorithm that produces explainable clusters using a tree with $k$ leaves.
arXiv Detail & Related papers (2020-02-28T04:21:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.