Coupled Hierarchical Structure Learning using Tree-Wasserstein Distance
- URL: http://arxiv.org/abs/2501.03627v1
- Date: Tue, 07 Jan 2025 08:54:42 GMT
- Title: Coupled Hierarchical Structure Learning using Tree-Wasserstein Distance
- Authors: Ya-Wei Eileen Lin, Ronald R. Coifman, Gal Mishne, Ronen Talmon,
- Abstract summary: We introduce a coupled hierarchical structure learning method using tree-Wasserstein distance (TWD)<n>Our method jointly computes TWDs for samples and features, representing their latent hierarchies as trees.<n>We show that this iterative procedure converges and empirically improves the quality of the constructed trees.
- Score: 12.2853783834605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many applications, both data samples and features have underlying hierarchical structures. However, existing methods for learning these latent structures typically focus on either samples or features, ignoring possible coupling between them. In this paper, we introduce a coupled hierarchical structure learning method using tree-Wasserstein distance (TWD). Our method jointly computes TWDs for samples and features, representing their latent hierarchies as trees. We propose an iterative, unsupervised procedure to build these sample and feature trees based on diffusion geometry, hyperbolic geometry, and wavelet filters. We show that this iterative procedure converges and empirically improves the quality of the constructed trees. The method is also computationally efficient and scales well in high-dimensional settings. Our method can be seamlessly integrated with hyperbolic graph convolutional networks (HGCN). We demonstrate that our method outperforms competing approaches in sparse approximation and unsupervised Wasserstein distance learning on several word-document and single-cell RNA-sequencing datasets. In addition, integrating our method into HGCN enhances performance in link prediction and node classification tasks.
Related papers
- Birch SGD: A Tree Graph Framework for Local and Asynchronous SGD Methods [51.54704494242525]
We propose a new unifying framework, Birch SGD, for analyzing and designing distributed SGD methods.<n>Using Birch SGD, we design eight new methods and analyze them alongside previously known ones, with at least six of the new methods shown to have optimal computational time complexity.<n>Our research leads to two key insights: (i) all methods share the same "iteration rate" of $Oleft(frac(R + 1) L Deltavarepsilon + fracsigma2 L Deltavarepsilon2right)$, where $R$
arXiv Detail & Related papers (2025-05-14T08:37:45Z) - Scalable Deep Metric Learning on Attributed Graphs [10.092560681589578]
We develop a graph embedding method, which is based on extending deep metric and unbiased contrastive learning techniques.
Based on a multi-classt loss function, we present two algorithms -- DMT for semi-supervised learning and DMAT-i for the unsupervised case.
arXiv Detail & Related papers (2024-11-20T03:34:31Z) - Fast unsupervised ground metric learning with tree-Wasserstein distance [14.235762519615175]
unsupervised ground metric learning approaches have been introduced.
We propose to augment the WSV method by embedding samples and features on trees, on which we compute the tree-Wasserstein distance (TWD)
We demonstrate theoretically and empirically that the algorithm converges to a better approximation of the full WSV approach than the best known alternatives, and does so with $mathcalO(n3)$ complexity.
arXiv Detail & Related papers (2024-11-11T23:21:01Z) - Tree-Wasserstein Distance for High Dimensional Data with a Latent Feature Hierarchy [12.2853783834605]
We propose a new tree-Wasserstein distance (TWD) for high-dimensional data with two key aspects.<n>First, our TWD is specifically designed for data with a latent feature hierarchy, i.e., the features lie in a hierarchical space.<n>We show that our TWD computed based on data observations provably recovers the TWD defined with the latent feature hierarchy.
arXiv Detail & Related papers (2024-10-28T15:11:23Z) - Revisiting Nearest Neighbor for Tabular Data: A Deep Tabular Baseline Two Decades Later [76.66498833720411]
We introduce a differentiable version of $K$-nearest neighbors (KNN) originally designed to learn a linear projection to capture semantic similarities between instances.
Surprisingly, our implementation of NCA using SGD and without dimensionality reduction already achieves decent performance on tabular data.
We conclude our paper by analyzing the factors behind these improvements, including loss functions, prediction strategies, and deep architectures.
arXiv Detail & Related papers (2024-07-03T16:38:57Z) - Ensemble Quadratic Assignment Network for Graph Matching [52.20001802006391]
Graph matching is a commonly used technique in computer vision and pattern recognition.
Recent data-driven approaches have improved the graph matching accuracy remarkably.
We propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
arXiv Detail & Related papers (2024-03-11T06:34:05Z) - Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z) - Pushing the Efficiency Limit Using Structured Sparse Convolutions [82.31130122200578]
We propose Structured Sparse Convolution (SSC), which leverages the inherent structure in images to reduce the parameters in the convolutional filter.
We show that SSC is a generalization of commonly used layers (depthwise, groupwise and pointwise convolution) in efficient architectures''
Architectures based on SSC achieve state-of-the-art performance compared to baselines on CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet classification benchmarks.
arXiv Detail & Related papers (2022-10-23T18:37:22Z) - Adaptive Convolutional Dictionary Network for CT Metal Artifact
Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z) - Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node
Features [46.052312251801]
We propose a framework for iterating boosting with graph propagation steps.
Our approach is anchored in a principled meta loss function.
Across a variety of non-iid graph datasets, our method achieves comparable or superior performance.
arXiv Detail & Related papers (2021-10-26T04:53:12Z) - Structural Optimization Makes Graph Classification Simpler and Better [5.770986723520119]
We investigate the feasibility of improving graph classification performance while simplifying the model learning process.
Inspired by progress in structural information assessment, we optimize the given data sample from graphs to encoding trees.
We present an implementation of the scheme in a tree kernel and a convolutional network to perform graph classification.
arXiv Detail & Related papers (2021-09-05T08:54:38Z) - R2D2: Recursive Transformer based on Differentiable Tree for
Interpretable Hierarchical Language Modeling [36.61173494449218]
This paper proposes a model based on differentiable CKY style binary trees to emulate the composition process.
We extend the bidirectional language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes.
To scale up our approach, we also introduce an efficient pruned tree induction algorithm to enable encoding in just a linear number of composition steps.
arXiv Detail & Related papers (2021-07-02T11:00:46Z) - Visualizing hierarchies in scRNA-seq data using a density tree-biased
autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z) - Learning the Implicit Semantic Representation on Graph-Structured Data [57.670106959061634]
Existing representation learning methods in graph convolutional networks are mainly designed by describing the neighborhood of each node as a perceptual whole.
We propose a Semantic Graph Convolutional Networks (SGCN) that explores the implicit semantics by learning latent semantic-paths in graphs.
arXiv Detail & Related papers (2021-01-16T16:18:43Z) - Unsupervised Embedding of Hierarchical Structure in Euclidean Space [30.507049058838025]
We consider learning a non-linear embedding of data into Euclidean space as a way to improve the hierarchical clustering produced by agglomerative algorithms.
We show that rescaling the latent space embedding leads to improved results for both dendrogram purity and the Moseley-Wang cost function.
arXiv Detail & Related papers (2020-10-30T03:57:09Z) - Structured Graph Learning for Clustering and Semi-supervised
Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain condition.
arXiv Detail & Related papers (2020-08-31T08:41:20Z) - Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z) - Oblique Predictive Clustering Trees [6.317966126631351]
Predictive clustering trees (PCTs) can be used to solve a variety of predictive modeling tasks, including structured output prediction.
We propose oblique predictive clustering trees, capable of addressing these limitations.
We experimentally evaluate the proposed methods on 60 benchmark datasets for 6 predictive modeling tasks.
arXiv Detail & Related papers (2020-07-27T14:58:23Z) - MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT
Prostate Segmentation via Online Sampling [66.01558025094333]
We propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate.
We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network.
Our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss.
arXiv Detail & Related papers (2020-05-15T10:37:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.