High-dimensional Clustering onto Hamiltonian Cycle
- URL: http://arxiv.org/abs/2304.14531v2
- Date: Sat, 17 Jun 2023 14:11:47 GMT
- Title: High-dimensional Clustering onto Hamiltonian Cycle
- Authors: Tianyi Huang, Shenghui Cheng, Stan Z. Li, Zhengjun Zhang
- Abstract summary: Clustering aims to group unlabelled samples based on their similarities.
This paper proposes a new framework called High-dimensional Clustering onto Hamiltonian Cycle.
- Score: 30.17605416237686
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clustering aims to group unlabelled samples based on their similarities. It
has become a significant tool for the analysis of high-dimensional data.
However, most clustering methods merely generate pseudo labels and are thus
unable to simultaneously present the similarities between different clusters
and the outliers. This paper proposes a new framework called High-dimensional
Clustering onto Hamiltonian Cycle (HCHC) to solve these problems. First, HCHC
combines global and local structure in a single objective function for deep
clustering, refining the labels into relative probabilities that capture the
similarities between different clusters while preserving the local structure
within each cluster. Then, the anchors of different
clusters are sorted on the optimal Hamiltonian cycle generated by the cluster
similarities and mapped on the circumference of a circle. Finally, a sample
with a higher probability of a cluster will be mapped closer to the
corresponding anchor. In this way, our framework allows us to appreciate three
aspects visually and simultaneously - clusters (formed by samples with high
probabilities), cluster similarities (represented as circular distances), and
outliers (recognized as dots far away from all clusters). The experiments
illustrate the superiority of HCHC.
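The mapping stage of HCHC is concrete enough to sketch. The snippet below is a minimal illustration, not the authors' released implementation: the inputs `probs` (soft cluster labels) and `cluster_sim` (pairwise cluster similarities) are assumed to come from the deep-clustering step, the optimal Hamiltonian cycle is brute-forced (feasible only for a small number of clusters), and each sample is placed at the probability-weighted combination of the anchor positions on the circle.

```python
from itertools import permutations

import numpy as np

def best_hamiltonian_cycle(sim):
    """Brute-force the anchor ordering whose cycle maximizes the total
    similarity between adjacent clusters (tractable only for small k)."""
    k = sim.shape[0]
    best_order, best_score = None, -np.inf
    for perm in permutations(range(1, k)):  # fix anchor 0 to break symmetry
        order = (0,) + perm
        score = sum(sim[order[i], order[(i + 1) % k]] for i in range(k))
        if score > best_score:
            best_order, best_score = order, score
    return best_order

def hchc_layout(probs, cluster_sim):
    """Place anchors on a circle in cycle order and map each sample to the
    probability-weighted combination of anchor positions."""
    k = cluster_sim.shape[0]
    order = best_hamiltonian_cycle(cluster_sim)
    angles = np.empty(k)
    angles[list(order)] = np.linspace(0.0, 2.0 * np.pi, k, endpoint=False)
    anchors = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (k, 2)
    return probs @ anchors  # confident samples land near their anchor

# Toy usage: 300 samples, 4 clusters.
rng = np.random.default_rng(0)
probs = rng.dirichlet(alpha=[0.3] * 4, size=300)      # stand-in soft labels
cluster_sim = np.array([[1.0, 0.8, 0.2, 0.4],
                        [0.8, 1.0, 0.3, 0.2],
                        [0.2, 0.3, 1.0, 0.7],
                        [0.4, 0.2, 0.7, 1.0]])        # stand-in similarities
xy = hchc_layout(probs, cluster_sim)                  # (300, 2) coordinates
```

Note that under this simplified layout, near-uniform probability rows fall toward the center of the circle; HCHC's actual outlier placement is defined in the paper and not reproduced here.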
Related papers
- Similarity and Dissimilarity Guided Co-association Matrix Construction for Ensemble Clustering [22.280221709474105]
We propose the Similarity and Dissimilarity Guided Co-association matrix (SDGCA) to achieve ensemble clustering.
First, we introduce normalized ensemble entropy to estimate the quality of each cluster, and construct a similarity matrix based on this estimation.
We then employ random walks to explore the high-order proximity of base clusterings and construct a dissimilarity matrix.
arXiv Detail & Related papers (2024-11-01T08:10:28Z)
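As an illustration of the co-association idea, the sketch below builds a quality-weighted co-association matrix from base clusterings. The weighting hook is a placeholder: SDGCA's normalized ensemble entropy and random-walk dissimilarity are not reproduced here.

```python
import numpy as np

def co_association(base_labelings, weights=None):
    """base_labelings: list of length-n integer label arrays (one per base
    clustering). Returns the (n, n) weighted co-cluster frequency matrix."""
    n = len(base_labelings[0])
    weights = np.ones(len(base_labelings)) if weights is None else np.asarray(weights)
    ca = np.zeros((n, n))
    for w, labels in zip(weights, base_labelings):
        labels = np.asarray(labels)
        ca += w * (labels[:, None] == labels[None, :])  # 1 where co-clustered
    return ca / weights.sum()

# Two toy base clusterings of five samples; entropy-based weights would go here.
print(co_association([[0, 0, 1, 1, 2], [0, 0, 0, 1, 1]]))
```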
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality-reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in a self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
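The paper unifies the two steps in one objective; the two-stage stand-in below (spectral graph embedding, then K-means) only illustrates the pipeline it improves upon, using standard scikit-learn APIs.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import SpectralEmbedding

X = np.random.default_rng(1).normal(size=(200, 50))  # toy high-dimensional data
# Embed via the neighborhood graph, then cluster in the low-dimensional space.
Z = SpectralEmbedding(n_components=5, n_neighbors=10).fit_transform(X)
labels = KMeans(n_clusters=4, n_init=10).fit_predict(Z)
```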
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a single framework.
To provide feedback actions, a clustering-oriented reward function is proposed to enhance the cohesion of same-cluster samples and separate different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
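The abstract's reward is defined on learned graph embeddings; the hypothetical scorer below only captures the stated intent (reward cohesion within clusters and separation between them) and is not the paper's formula.

```python
import numpy as np

def clustering_reward(Z, labels):
    """Z: (n, d) embeddings; labels: (n,) integer labels in {0..k-1}.
    Higher when clusters are internally tight and mutually far apart."""
    ks = np.unique(labels)
    centers = np.stack([Z[labels == c].mean(axis=0) for c in ks])
    cohesion = np.mean(np.linalg.norm(Z - centers[labels], axis=1))
    dists = np.linalg.norm(centers[:, None] - centers[None, :], axis=-1)
    separation = dists[np.triu_indices(len(ks), k=1)].mean()
    return separation - cohesion  # feedback signal for choosing the cluster number
```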
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Hybrid Fuzzy-Crisp Clustering Algorithm: Theory and Experiments [0.0]
We propose a hybrid fuzzy-crisp clustering algorithm based on a target function combining linear and quadratic terms of the membership function.
In this algorithm, the membership of a data point to a cluster is automatically set to exactly zero if the data point is "sufficiently" far from the cluster center.
The proposed algorithm is demonstrated to outperform the conventional methods on imbalanced datasets and can be competitive on more balanced datasets.
arXiv Detail & Related papers (2023-03-25T05:27:26Z)
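To make the zero-membership behavior concrete, here is a hypothetical membership rule in the spirit of the hybrid scheme: a hard distance cutoff yields exactly-zero memberships, with fuzzy weights shared among the remaining clusters. The cutoff rule is an assumption, not the paper's derived update.

```python
import numpy as np

def hybrid_membership(x, centers, cutoff=2.0):
    """Membership of point x in each cluster; exactly 0 beyond `cutoff`."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    inv = np.where(d2 < cutoff ** 2, 1.0 / (d2 + 1e-12), 0.0)  # crisp zeroing
    total = inv.sum()
    if total == 0.0:
        return np.zeros(len(centers))  # far from everything: an outlier
    return inv / total                 # fuzzy split over nearby clusters
```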
- Convex Clustering through MM: An Efficient Algorithm to Perform Hierarchical Clustering [1.0589208420411012]
We propose convex clustering through majorization-minimization (CCMM) -- an iterative algorithm that uses cluster fusions and a highly efficient updating scheme.
On a current desktop computer, CCMM efficiently solves convex clustering problems featuring over one million objects in seven-dimensional space.
arXiv Detail & Related papers (2022-11-03T15:07:51Z)
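A bare-bones majorization-minimization step for the standard convex clustering objective, 0.5 * ||U - X||^2 + lam * sum_ij w_ij * ||u_i - u_j||, is sketched below; CCMM's cluster fusions and optimized updating scheme are not reproduced.

```python
import numpy as np

def mm_step(U, X, W, lam, eps=1e-8):
    """One MM update: majorize each pairwise norm at the current iterate,
    then solve the resulting quadratic surrogate in closed form."""
    n = len(X)
    D = np.linalg.norm(U[:, None] - U[None, :], axis=-1) + eps
    C = lam * W / D                     # surrogate coefficients per pair
    np.fill_diagonal(C, 0.0)
    L = np.diag(C.sum(axis=1)) - C      # graph Laplacian of the surrogate
    return np.linalg.solve(np.eye(n) + L, X)  # stationarity: (I + L) U = X

# Iterate mm_step until U stops changing; rows of U that fuse together
# indicate merged clusters, which traces out a clustering hierarchy.
```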
- Enhancing cluster analysis via topological manifold learning [0.3823356975862006]
We show that inferring the topological structure of a dataset before clustering can considerably enhance cluster detection.
We combine the manifold learning method UMAP, for inferring the topological structure, with the density-based clustering method DBSCAN.
arXiv Detail & Related papers (2022-07-01T15:53:39Z)
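This pipeline maps directly onto the standard umap-learn and scikit-learn APIs; the hyperparameters below are placeholders rather than the paper's settings.

```python
import numpy as np
import umap                      # pip install umap-learn
from sklearn.cluster import DBSCAN

X = np.random.default_rng(2).normal(size=(500, 30))      # stand-in dataset
Z = umap.UMAP(n_neighbors=15, n_components=2).fit_transform(X)
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(Z)   # -1 marks outliers
```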
- Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC).
In SCAGC, by leveraging inaccurate clustering labels, a self-supervised contrastive loss is designed for node representation learning.
For out-of-sample (OOS) nodes, SCAGC can directly calculate their clustering labels.
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
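A generic cluster-label contrastive loss (SupCon-style) is sketched below as a stand-in for SCAGC's objective: nodes sharing a pseudo label are treated as positives. The paper's exact loss and graph augmentations differ.

```python
import torch
import torch.nn.functional as F

def cluster_contrastive_loss(z, labels, tau=0.5):
    """z: (n, d) node representations; labels: (n,) pseudo cluster labels."""
    z = F.normalize(z, dim=1)
    sim = z @ z.T / tau
    self_mask = torch.eye(len(z), dtype=torch.bool)
    sim = sim.masked_fill(self_mask, float('-inf'))      # drop self-pairs
    pos = (labels[:, None] == labels[None, :]).float()
    pos = pos.masked_fill(self_mask, 0.0)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    denom = pos.sum(dim=1).clamp(min=1.0)                # singleton-safe
    return -(pos * log_prob).sum(dim=1).div(denom).mean()
```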
- You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data assigned to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z)
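One common way to reparametrize categorical assignment variables for end-to-end training is Gumbel-softmax; this is an assumed mechanism for illustration, as the abstract does not name the estimator.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(8, 4, requires_grad=True)  # per-sample cluster logits
# Differentiable relaxed one-hot assignments: gradients flow through the
# sampling step, so the model trains end-to-end without alternating updates.
assign = F.gumbel_softmax(logits, tau=0.5, hard=False)   # shape (8, 4)
```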
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Spectral Clustering with Smooth Tiny Clusters [14.483043753721256]
We propose a novel clustering algorithm, which considers the smoothness of data for the first time.
Our key idea is to cluster tiny clusters, whose centers constitute smooth graphs.
Although this paper focuses solely on multi-scale situations, the idea of data smoothness can certainly be extended to any clustering algorithm.
arXiv Detail & Related papers (2020-09-10T05:21:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.