Hierarchical Correlation Clustering and Tree Preserving Embedding
- URL: http://arxiv.org/abs/2002.07756v3
- Date: Wed, 12 Jun 2024 15:43:44 GMT
- Title: Hierarchical Correlation Clustering and Tree Preserving Embedding
- Authors: Morteza Haghir Chehreghani, Mostafa Haghir Chehreghani,
- Abstract summary: We propose a hierarchical correlation clustering method that extends the well-known correlation clustering.
In the following, we study unsupervised representation learning with such hierarchical correlation clustering.
- Score: 3.821323038670061
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a hierarchical correlation clustering method that extends the well-known correlation clustering to produce hierarchical clusters applicable to both positive and negative pairwise dissimilarities. Then, in the following, we study unsupervised representation learning with such hierarchical correlation clustering. For this purpose, we first investigate embedding the respective hierarchy to be used for tree preserving embedding and feature extraction. Thereafter, we study the extension of minimax distance measures to correlation clustering, as another representation learning paradigm. Finally, we demonstrate the performance of our methods on several datasets.
Related papers
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z) - Natural Hierarchical Cluster Analysis by Nearest Neighbors with
Near-Linear Time Complexity [0.0]
We propose a nearest neighbor based clustering algorithm that results in a naturally defined hierarchy of clusters.
In contrast to the agglomerative and divisive hierarchical clustering algorithms, our approach is not dependent on the iterative working of the algorithm.
arXiv Detail & Related papers (2022-03-15T16:03:42Z) - ACTIVE:Augmentation-Free Graph Contrastive Learning for Partial
Multi-View Clustering [52.491074276133325]
We propose an augmentation-free graph contrastive learning framework to solve the problem of partial multi-view clustering.
The proposed approach elevates instance-level contrastive learning and missing data inference to the cluster-level, effectively mitigating the impact of individual missing data on clustering.
arXiv Detail & Related papers (2022-03-01T02:32:25Z) - Path Based Hierarchical Clustering on Knowledge Graphs [1.713291434132985]
We present a novel approach for inducing a hierarchy of subject clusters.
Our method first constructs a tag hierarchy before assigning subjects to clusters on this hierarchy.
We quantitatively demonstrate our method's ability to induce a coherent cluster hierarchy on three real-world datasets.
arXiv Detail & Related papers (2021-09-27T16:42:43Z) - Graph Contrastive Clustering [131.67881457114316]
We propose a novel graph contrastive learning framework, which is then applied to the clustering task and we come up with the Graph Constrastive Clustering(GCC) method.
Specifically, on the one hand, the graph Laplacian based contrastive loss is proposed to learn more discriminative and clustering-friendly features.
On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
arXiv Detail & Related papers (2021-04-03T15:32:49Z) - Unsupervised Embedding of Hierarchical Structure in Euclidean Space [30.507049058838025]
We consider learning a non-linear embedding of data into Euclidean space as a way to improve the hierarchical clustering produced by agglomerative algorithms.
We show that rescaling the latent space embedding leads to improved results for both dendrogram purity and the Moseley-Wang cost function.
arXiv Detail & Related papers (2020-10-30T03:57:09Z) - Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z) - Interactive Steering of Hierarchical Clustering [30.371250297444703]
We present an interactive steering method to visually supervise constrained hierarchical clustering by utilizing both public knowledge (e.g., Wikipedia) and private knowledge from users.
The novelty of our approach includes 1) automatically constructing constraints for hierarchical clustering using knowledge (knowledge-driven) and intrinsic data distribution (data-driven)
To clearly convey the hierarchical clustering results, an uncertainty-aware tree visualization has been developed to enable users to quickly locate the most uncertain sub-hierarchies.
arXiv Detail & Related papers (2020-09-21T05:26:07Z) - Structured Graph Learning for Clustering and Semi-supervised
Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain condition.
arXiv Detail & Related papers (2020-08-31T08:41:20Z) - LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws inspiration from recent semi-supervised learning practice and proposes to combine our clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.