Deep Clustering Using the Soft Silhouette Score: Towards Compact and
Well-Separated Clusters
- URL: http://arxiv.org/abs/2402.00608v1
- Date: Thu, 1 Feb 2024 14:02:06 GMT
- Title: Deep Clustering Using the Soft Silhouette Score: Towards Compact and
Well-Separated Clusters
- Authors: Georgios Vardakas, Ioannis Papakostas, Aristidis Likas
- Abstract summary: We propose soft silhouette, a probabilistic formulation of the silhouette coefficient.
We introduce an autoencoder-based deep learning architecture that is suitable for optimizing the soft silhouette objective function.
The proposed deep clustering method has been tested and compared with several well-studied deep clustering methods on various benchmark datasets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised learning has gained prominence in the big data era, offering a
means to extract valuable insights from unlabeled datasets. Deep clustering has
emerged as an important unsupervised category, aiming to exploit the non-linear
mapping capabilities of neural networks in order to enhance clustering
performance. The majority of deep clustering literature focuses on minimizing
the intra-cluster variability in some embedded space while keeping the learned
representation consistent with the original high-dimensional dataset. In this
work, we propose soft silhouette, a probabilistic formulation of the silhouette
coefficient. Soft silhouette rewards compact and distinctly separated
clustering solutions like the conventional silhouette coefficient. When
optimized within a deep clustering framework, soft silhouette guides the
learned representations towards forming compact and well-separated clusters. In
addition, we introduce an autoencoder-based deep learning architecture that is
suitable for optimizing the soft silhouette objective function. The proposed
deep clustering method has been tested and compared with several well-studied
deep clustering methods on various benchmark datasets, yielding very
satisfactory clustering results.
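For intuition, the classical silhouette of a point i is s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from i to the other members of its own cluster and b(i) is the smallest mean distance from i to the members of any other cluster; s(i) near 1 indicates a compact, well-separated assignment. The sketch below shows one plausible differentiable relaxation of this score driven by soft cluster memberships. It is an illustration under stated assumptions only, not the paper's exact objective: the function name, the probability-weighted distances, and the hard masking of each point's most likely cluster are all choices made here for the sketch.

```python
import torch

def soft_silhouette(z, p, eps=1e-8):
    """Differentiable silhouette-style score (sketch, not the paper's exact loss).

    z : (n, d) tensor of embedded points (e.g. autoencoder latent codes).
    p : (n, k) tensor of soft cluster-membership probabilities (rows sum to 1).
    Returns a scalar in [-1, 1]; higher is better, so negate it to get a loss.
    """
    n, _ = p.shape
    dist = torch.cdist(z, z)                   # (n, n) pairwise Euclidean distances
    mass = p.sum(dim=0)                        # (k,) expected cluster sizes
    # d[i, j]: probability-weighted mean distance from point i to cluster j.
    # (For simplicity this includes i's zero self-distance, unlike the hard a(i).)
    d = (dist @ p) / (mass + eps)              # (n, k)
    # a(i): expected within-cluster distance under i's own soft assignment.
    a = (p * d).sum(dim=1)                     # (n,)
    # b(i): smallest mean distance to any cluster other than i's most likely one.
    own = p.argmax(dim=1)
    d_other = d.clone()
    d_other[torch.arange(n), own] = float("inf")
    b = d_other.min(dim=1).values              # (n,)
    s = (b - a) / (torch.maximum(a, b) + eps)  # per-point soft silhouette
    return s.mean()
```

In an autoencoder-based framework such as the one the abstract describes, one would typically combine the negated score with the reconstruction loss, e.g. loss = reconstruction_loss - lam * soft_silhouette(z, p), with the trade-off weight lam treated as a hyperparameter; see the paper for the actual formulation and architecture.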
Related papers
- SHADE: Deep Density-based Clustering
SHADE is the first deep clustering algorithm that incorporates density-connectivity into its loss function.
It supports high-dimensional and large datasets with the expressive power of a deep autoencoder.
It outperforms existing methods in clustering quality, especially on data that contain non-Gaussian clusters.
arXiv Detail & Related papers (2024-10-08T18:03:35Z)
- Self-Supervised Graph Embedding Clustering
The one-step K-means dimensionality-reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in a self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Deep Clustering: A Comprehensive Survey
Clustering analysis plays an indispensable role in machine learning and data mining.
Deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks.
Existing surveys of deep clustering mainly focus on single-view settings and network architectures, ignoring the complex application scenarios of clustering.
arXiv Detail & Related papers (2022-10-09T02:31:32Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- Clustering by Maximizing Mutual Information Across Views
We propose a novel framework for image clustering that incorporates joint representation learning and clustering.
Our method significantly outperforms state-of-the-art single-stage clustering methods across a variety of image datasets.
arXiv Detail & Related papers (2021-07-24T15:36:49Z)
- Very Compact Clusters with Structural Regularization via Similarity and Connectivity
We propose an end-to-end deep clustering algorithm, Very Compact Clusters (VCC), for general datasets.
Our proposed approach achieves better clustering performance than most state-of-the-art clustering methods.
arXiv Detail & Related papers (2021-06-09T23:22:03Z)
- Graph Contrastive Clustering
We propose a novel graph contrastive learning framework, which we apply to the clustering task to obtain the Graph Contrastive Clustering (GCC) method.
Specifically, on the one hand, a graph-Laplacian-based contrastive loss is proposed to learn more discriminative and clustering-friendly features.
On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
arXiv Detail & Related papers (2021-04-03T15:32:49Z)
- Deep adaptive fuzzy clustering for evolutionary unsupervised representation learning
Cluster assignment of large and complex images is a crucial but challenging task in pattern recognition and computer vision.
We present a novel evolutionary unsupervised learning representation model with iterative optimization.
We jointly apply fuzzy clustering to the deep reconstruction model, in which fuzzy membership is utilized to represent the structure of deep cluster assignments.
arXiv Detail & Related papers (2021-03-31T13:58:10Z)
- Scalable Hierarchical Agglomerative Clustering
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Improving k-Means Clustering Performance with Disentangled Internal Representations
We propose a simpler approach of optimizing the entanglement of the learned latent code representation of an autoencoder.
Using our proposed approach, the test clustering accuracy was 96.2% on the MNIST dataset, 85.6% on the Fashion-MNIST dataset, and 79.2% on the EMNIST Balanced dataset, outperforming our baseline models.
arXiv Detail & Related papers (2020-06-05T11:32:34Z)