Hierarchical clustering by aggregating representatives in
sub-minimum-spanning-trees
- URL: http://arxiv.org/abs/2111.06968v1
- Date: Thu, 11 Nov 2021 07:36:55 GMT
- Title: Hierarchical clustering by aggregating representatives in
sub-minimum-spanning-trees
- Authors: Wen-Bo Xie, Zhen Liu, Jaideep Srivastava
- Abstract summary: We propose a novel hierarchical clustering algorithm that effectively detects representative points while building the clustering dendrogram.
Our analysis shows the algorithm has O(n log n) time complexity and O(log n) space complexity, so it scales to massive data.
- Score: 5.877624540482919
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the main challenges for hierarchical clustering is how to
appropriately identify the representative points in the lower level of the
cluster tree, which are going to be utilized as the roots in the higher level
of the cluster tree for further aggregation. However, conventional hierarchical
clustering approaches rely on simple heuristics to select the "representative"
points, which may not be sufficiently representative. As a result, the
constructed cluster tree suffers from poor robustness and weak reliability. To
address this issue, we propose a novel hierarchical clustering algorithm in
which, while building the clustering dendrogram, we effectively detect
representative points by scoring the reciprocal nearest data points in each
sub-minimum-spanning-tree. Extensive experiments on UCI datasets show that the
proposed algorithm is more accurate than other benchmarks. Meanwhile, our
analysis shows that the algorithm has O(n log n) time complexity and O(log n)
space complexity, so it scales to massive data with low time and storage
consumption.
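The abstract's core selection step, detecting reciprocal (mutual) nearest data points as representative candidates, can be sketched in a few lines. This is an illustrative toy under our own assumptions, not the authors' implementation: it uses a brute-force neighbor search on raw points and omits both the sub-minimum-spanning-tree partitioning and the scoring step; the function name is hypothetical.

```python
# Toy sketch (not the paper's code): find reciprocal nearest-neighbor
# pairs, i.e. points that are each other's nearest neighbor. The paper
# scores such pairs inside each sub-minimum-spanning-tree to pick
# representatives; here we just detect the pairs by brute force.
from math import dist

def reciprocal_nearest_pairs(points):
    """Return index pairs (i, j), i < j, where points i and j are each
    other's nearest neighbor under Euclidean distance."""
    n = len(points)
    nearest = []
    for i in range(n):
        # Nearest neighbor of point i (excluding itself).
        j = min((k for k in range(n) if k != i),
                key=lambda k: dist(points[i], points[k]))
        nearest.append(j)
    # Keep only mutual pairs, each reported once.
    return [(i, j) for i, j in enumerate(nearest)
            if j > i and nearest[j] == i]

pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(reciprocal_nearest_pairs(pts))  # [(0, 1), (2, 3)]
```

Restricting the search to points within one sub-MST, as the paper does, keeps this step local and cheap; the brute-force version above is O(n^2) and for exposition only.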
Related papers
- Interpretable Sequence Clustering [3.280979689839737]
We propose a method called Interpretable Sequence Clustering Tree (ISCT).
ISCT generates k leaf nodes, corresponding to k clusters, which provides an intuitive explanation on how each cluster is formed.
Experimental results on 14 real-world data sets demonstrate that our proposed method provides an interpretable tree structure.
arXiv Detail & Related papers (2023-09-03T11:31:44Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified in a single framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
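The merge rule this entry recommends, joining the pair of clusters with the largest average pairwise dot product, can be illustrated with a short sketch. This is our own naive O(k^2) pairwise search for exposition, not the paper's code; the function names are hypothetical.

```python
# Illustrative sketch (not the paper's code): one agglomerative merge
# step that joins the two clusters with the maximum average dot product
# between their points, rather than the minimum distance.
def average_dot(a, b):
    """Mean dot product over all cross-cluster point pairs."""
    return sum(sum(x * y for x, y in zip(p, q))
               for p in a for q in b) / (len(a) * len(b))

def merge_once(clusters):
    """Merge the pair of clusters with the largest average dot product."""
    pairs = [(i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))]
    i, j = max(pairs, key=lambda ij: average_dot(clusters[ij[0]],
                                                 clusters[ij[1]]))
    merged = clusters[i] + clusters[j]
    return [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

# Two nearly parallel unit-ish vectors merge before the orthogonal one.
cs = [[(1.0, 0.0)], [(0.9, 0.1)], [(0.0, 1.0)]]
print(merge_once(cs))  # [[(0.0, 1.0)], [(1.0, 0.0), (0.9, 0.1)]]
```

Repeating `merge_once` until one cluster remains yields the dendrogram; the dot-product criterion favors merging clusters whose points share a direction from the origin, which is what lets the tree recover the generative hierarchy in the authors' model.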
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
- Contrastive Hierarchical Clustering [8.068701201341065]
CoHiClust is a Contrastive Hierarchical Clustering model based on deep neural networks.
By employing a self-supervised learning approach, CoHiClust distills the base network into a binary tree without access to any labeled data.
arXiv Detail & Related papers (2023-03-03T07:54:19Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- Natural Hierarchical Cluster Analysis by Nearest Neighbors with Near-Linear Time Complexity [0.0]
We propose a nearest neighbor based clustering algorithm that results in a naturally defined hierarchy of clusters.
In contrast to agglomerative and divisive hierarchical clustering algorithms, our approach does not depend on iterative refinement.
arXiv Detail & Related papers (2022-03-15T16:03:42Z)
- Exact and Approximate Hierarchical Clustering Using A* [51.187990314731344]
We introduce a new approach based on A* search for clustering.
We overcome the prohibitively large search space by combining A* with a novel trellis data structure.
We empirically demonstrate that our method achieves substantially higher quality results than baselines for a particle physics use case and other clustering benchmarks.
arXiv Detail & Related papers (2021-04-14T18:15:27Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences of its use.