UniForCE: The Unimodality Forest Method for Clustering and Estimation of the Number of Clusters
- URL: http://arxiv.org/abs/2312.11323v1
- Date: Mon, 18 Dec 2023 16:19:02 GMT
- Title: UniForCE: The Unimodality Forest Method for Clustering and Estimation of the Number of Clusters
- Authors: Georgios Vardakas, Argyris Kalogeratos, Aristidis Likas
- Abstract summary: We focus on the concept of unimodality and propose a flexible cluster definition called locally unimodal cluster.
A locally unimodal cluster extends for as long as unimodality is locally preserved across pairs of subclusters of the data.
We propose the UniForCE method for locally unimodal clustering.
- Score: 2.4953699842881605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating the number of clusters k while clustering the data is a
challenging task. An incorrect assumption about what constitutes a cluster leads
to a wrong estimate of the number of clusters k and, consequently, to a less
meaningful model fit. In this work, we focus on the concept of unimodality and propose a
flexible cluster definition called locally unimodal cluster. A locally unimodal
cluster extends for as long as unimodality is locally preserved across pairs of
subclusters of the data. Then, we propose the UniForCE method for locally
unimodal clustering. The method starts with an initial overclustering of the
data and relies on the unimodality graph that connects subclusters forming
unimodal pairs. Such pairs are identified using an appropriate statistical
test. UniForCE identifies maximal locally unimodal clusters by computing a
spanning forest in the unimodality graph. Experimental results on both real and
synthetic datasets illustrate that the proposed methodology is particularly
flexible and robust in discovering regular and highly complex cluster shapes.
Most importantly, it automatically provides an adequate estimation of the
number of clusters.
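
The pipeline described in the abstract lends itself to a compact sketch. The snippet below is a minimal illustration, not the authors' implementation: it assumes k-means for the initial overclustering and Hartigan's dip test (via the third-party `diptest` package) applied to each subcluster pair projected onto the line joining their centroids; the statistical test and pairing strategy actually used in the paper may differ.

```python
# Minimal sketch of locally unimodal clustering (assumptions: k-means
# overclustering; dip test on 1-D projections via `pip install diptest`).
import numpy as np
from sklearn.cluster import KMeans
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components
from diptest import diptest

def uniforce_sketch(X, n_subclusters=20, alpha=0.05, seed=0):
    # Step 1: overcluster the data into many small subclusters.
    km = KMeans(n_clusters=n_subclusters, n_init=10, random_state=seed).fit(X)
    sub = km.labels_

    # Step 2: build the unimodality graph. An edge joins two subclusters
    # whose union looks unimodal; here we project the union onto the line
    # through the two centroids and apply the dip test to that 1-D sample.
    adj = np.zeros((n_subclusters, n_subclusters))
    for i in range(n_subclusters):
        for j in range(i + 1, n_subclusters):
            union = X[(sub == i) | (sub == j)]
            d = km.cluster_centers_[j] - km.cluster_centers_[i]
            proj = union @ (d / np.linalg.norm(d))
            _, pval = diptest(proj)
            if pval > alpha:  # unimodality not rejected -> unimodal pair
                adj[i, j] = adj[j, i] = 1

    # Step 3: each tree of a spanning forest of the unimodality graph
    # (i.e., each connected component) is one maximal locally unimodal cluster.
    k_est, comp = connected_components(csr_matrix(adj), directed=False)
    return comp[sub], k_est  # per-point labels, estimated number of clusters
```

In this sketch the estimate k_est emerges from the graph structure rather than being supplied by the user; n_subclusters only needs to be set generously larger than the expected number of clusters.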
Related papers
- Learning Uniform Clusters on Hypersphere for Deep Graph-level Clustering [25.350054742471816]
We propose a novel deep graph-level clustering method called Uniform Deep Graph Clustering (UDGC)
UDGC assigns instances evenly to different clusters and then scatters those clusters on the unit hypersphere, leading to a more uniform cluster-level distribution and less severe cluster collapse.
Our empirical study on eight well-known datasets demonstrates that UDGC significantly outperforms the state-of-the-art models.
arXiv Detail & Related papers (2023-11-23T12:08:20Z)
- Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a single framework.
To provide feedback actions, a clustering-oriented reward function is proposed to enhance the cohesion within the same cluster and the separation between different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- A Computational Theory and Semi-Supervised Algorithm for Clustering [0.0]
A semi-supervised clustering algorithm is presented.
The kernel of the clustering method is Mohammad's anomaly detection algorithm.
Results are presented on synthetic and real-world data sets.
arXiv Detail & Related papers (2023-06-12T09:15:58Z)
- Convex Clustering through MM: An Efficient Algorithm to Perform Hierarchical Clustering [1.0589208420411012]
We propose convex clustering through majorization-minimization (CCMM) -- an iterative algorithm that uses cluster fusions and a highly efficient updating scheme.
With a current desktop computer, CCMM efficiently solves convex clustering problems featuring over one million objects in seven-dimensional space.
arXiv Detail & Related papers (2022-11-03T15:07:51Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- Self-supervised Contrastive Attributed Graph Clustering [110.52694943592974]
We propose a novel attributed graph clustering network, namely Self-supervised Contrastive Attributed Graph Clustering (SCAGC)
In SCAGC, a self-supervised contrastive loss that leverages inaccurate clustering labels is designed for node representation learning.
For out-of-sample (OOS) nodes, SCAGC can directly calculate their clustering labels.
arXiv Detail & Related papers (2021-10-15T03:25:28Z)
- Learning Statistical Representation with Joint Deep Embedded Clustering [2.1267423178232407]
StatDEC is an unsupervised framework for joint statistical representation learning and clustering.
Our experiments show that using these representations, one can considerably improve results on imbalanced image clustering across a variety of image datasets.
arXiv Detail & Related papers (2021-09-11T09:26:52Z)
- Very Compact Clusters with Structural Regularization via Similarity and Connectivity [3.779514860341336]
We propose an end-to-end deep clustering algorithm, Very Compact Clusters (VCC), for general datasets.
Our proposed approach achieves better clustering performance than most state-of-the-art clustering methods.
arXiv Detail & Related papers (2021-06-09T23:22:03Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
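
To make the key observation of the last entry concrete, here is a toy sketch (hypothetical cluster allocations, not data from the paper) of why a posterior similarity matrix (PSM) is positive semi-definite: each MCMC sample contributes a co-clustering indicator matrix that is a sum of outer products of cluster-membership vectors, and an average of such matrices remains PSD, hence usable as a kernel matrix.

```python
# Toy illustration of the posterior similarity matrix (PSM). PSM[i, j] is the
# fraction of MCMC samples in which points i and j are co-clustered.
import numpy as np

rng = np.random.default_rng(0)
n_points, n_samples = 8, 1000
# Hypothetical MCMC output: one cluster-allocation vector per posterior sample.
allocations = rng.integers(0, 3, size=(n_samples, n_points))

# Each term z[:, None] == z[None, :] is a co-clustering indicator matrix,
# i.e. a sum of outer products of membership vectors, and therefore PSD.
psm = np.mean([z[:, None] == z[None, :] for z in allocations], axis=0)

# PSD check: all eigenvalues are non-negative (up to numerical error).
print(np.linalg.eigvalsh(psm).min() >= -1e-12)  # True
```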