DivClust: Controlling Diversity in Deep Clustering
- URL: http://arxiv.org/abs/2304.01042v1
- Date: Mon, 3 Apr 2023 14:45:43 GMT
- Title: DivClust: Controlling Diversity in Deep Clustering
- Authors: Ioannis Maniadis Metaxas, Georgios Tzimiropoulos, Ioannis Patras
- Abstract summary: DivClust produces consensus clustering solutions that consistently outperform single-clustering baselines.
Our method effectively controls diversity across frameworks and datasets with very small additional computational cost.
- Score: 47.85350249697335
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Clustering has been a major research topic in the field of machine learning,
one to which Deep Learning has recently been applied with significant success.
However, an aspect of clustering that is not addressed by existing deep
clustering methods is that of efficiently producing multiple, diverse
partitionings for a given dataset. This is particularly important, as a diverse
set of base clusterings is necessary for consensus clustering, which has been
found to produce better and more robust results than relying on a single
clustering. To address this gap, we propose DivClust, a diversity controlling
loss that can be incorporated into existing deep clustering frameworks to
produce multiple clusterings with the desired degree of diversity. We conduct
experiments with multiple datasets and deep clustering frameworks and show
that: a) our method effectively controls diversity across frameworks and
datasets with very small additional computational cost, b) the sets of
clusterings learned by DivClust include solutions that significantly outperform
single-clustering baselines, and c) using an off-the-shelf consensus clustering
algorithm, DivClust produces consensus clustering solutions that consistently
outperform single-clustering baselines, effectively improving the performance
of the base deep clustering framework.
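To make the mechanism concrete, the following is a minimal PyTorch-style sketch of a diversity-controlling loss applied across multiple clustering heads. The similarity measure (greedy cosine matching of cluster-assignment columns) and the names `inter_clustering_similarity`, `diversity_loss`, and `d_target` are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch of a diversity-controlling loss over multiple clustering
# heads, written in PyTorch. The similarity measure and threshold logic are
# illustrative assumptions, not the authors' exact formulation.
import torch
import torch.nn.functional as F

def inter_clustering_similarity(p_a: torch.Tensor, p_b: torch.Tensor) -> torch.Tensor:
    """Similarity between two soft clusterings of the same batch.

    p_a, p_b: (batch, k) soft cluster assignments (rows sum to 1). Each
    column is a cluster-membership vector; every cluster in p_a is matched
    to its most similar cluster in p_b by cosine similarity, and the
    matched similarities are averaged.
    """
    a = F.normalize(p_a.t(), dim=1)  # (k, batch), unit-norm rows
    b = F.normalize(p_b.t(), dim=1)
    sim = a @ b.t()                  # (k, k) pairwise cosine similarities
    return sim.max(dim=1).values.mean()

def diversity_loss(assignments: list, d_target: float) -> torch.Tensor:
    """Penalize head pairs whose similarity exceeds the target bound d_target."""
    loss = torch.zeros((), device=assignments[0].device)
    pairs = 0
    for i in range(len(assignments)):
        for j in range(i + 1, len(assignments)):
            s = inter_clustering_similarity(assignments[i], assignments[j])
            loss = loss + F.relu(s - d_target)  # only excess similarity is penalized
            pairs += 1
    return loss / max(pairs, 1)
```

In practice a term like this would be weighted and added to the base framework's clustering objective, with `d_target` setting the desired degree of inter-clustering diversity.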
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- A3S: A General Active Clustering Method with Pairwise Constraints [66.74627463101837]
A3S features strategic active clustering adjustment on the initial cluster result, which is obtained by an adaptive clustering algorithm.
In extensive experiments across diverse real-world datasets, A3S achieves desired results with significantly fewer human queries.
arXiv Detail & Related papers (2024-07-14T13:37:03Z) - CCFC: Bridging Federated Clustering and Contrastive Learning [9.91610928326645]
We propose a new federated clustering method named cluster-contrastive federated clustering (CCFC).
CCFC shows superior performance in handling device failures from a practical viewpoint.
arXiv Detail & Related papers (2024-01-12T15:26:44Z)
- Stable Cluster Discrimination for Deep Clustering [7.175082696240088]
Deep clustering can optimize representations of instances (i.e., representation learning) and explore the inherent data distribution.
The coupled objective admits a trivial solution in which all instances collapse to uniform features.
In this work, we first show that the prevalent discrimination task in supervised learning is unstable for one-stage clustering.
A novel stable cluster discrimination (SeCu) task is proposed, from which a new hardness-aware clustering criterion is obtained.
arXiv Detail & Related papers (2023-11-24T06:43:26Z)
- Cluster-level Group Representativity Fairness in $k$-means Clustering [3.420467786581458]
Clustering algorithms can generate clusterings in which different groups are disadvantaged within different clusters.
We develop a clustering algorithm, building upon the centroid clustering paradigm pioneered by classical algorithms.
We show that our method significantly enhances cluster-level group representativity fairness with low impact on cluster coherence.
arXiv Detail & Related papers (2022-12-29T22:02:28Z)
- Deep Clustering: A Comprehensive Survey [53.387957674512585]
Clustering analysis plays an indispensable role in machine learning and data mining.
Deep clustering, which can learn clustering-friendly representations using deep neural networks, has been broadly applied in a wide range of clustering tasks.
Existing surveys for deep clustering mainly focus on the single-view fields and the network architectures, ignoring the complex application scenarios of clustering.
arXiv Detail & Related papers (2022-10-09T02:31:32Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over state-of-the-art deep clustering approaches.
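For context on how ensembles of clusterings are typically fused, below is a minimal sketch of co-association consensus clustering, a common off-the-shelf approach; the helper name `consensus_cluster` and the average-linkage choice are illustrative assumptions, not necessarily the ensemble mechanism DeepCluE uses.

```python
# A minimal sketch of co-association consensus clustering: fuse several
# base clusterings by how often pairs of samples end up together.
# Illustrative only; not necessarily DeepCluE's ensemble mechanism.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def consensus_cluster(base_labels, n_clusters):
    """base_labels: (m, n) array of m base clusterings over n samples."""
    base_labels = np.asarray(base_labels)
    m, n = base_labels.shape
    # Co-association matrix: fraction of base clusterings in which
    # samples i and j fall in the same cluster.
    co = np.zeros((n, n))
    for labels in base_labels:
        co += labels[:, None] == labels[None, :]
    co /= m
    dist = 1.0 - co  # turn agreement into a dissimilarity
    np.fill_diagonal(dist, 0.0)
    # Agglomerate on the condensed distance matrix and cut into
    # the requested number of consensus clusters.
    z = linkage(squareform(dist, checks=False), method="average")
    return fcluster(z, t=n_clusters, criterion="maxclust")
```

The main paper's pipeline similarly feeds its diverse base clusterings to an off-the-shelf consensus algorithm, though the abstract does not specify which one.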
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- Clustering to the Fewest Clusters Under Intra-Cluster Dissimilarity Constraints [0.0]
Equiwide clustering relies neither on density nor on a predefined number of expected classes, but on a dissimilarity threshold.
We review and evaluate suitable clustering algorithms to identify trade-offs between the various practical solutions for this clustering problem.
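As a concrete illustration of clustering governed by a dissimilarity threshold rather than a preset number of classes, here is a minimal sketch using complete-linkage agglomerative clustering from SciPy; the helper name `diameter_bounded_clusters` is hypothetical, and this is one candidate baseline for the setting, not the paper's proposed solution, since it does not guarantee the fewest possible clusters.

```python
# A minimal sketch of threshold-driven clustering via complete linkage.
# One candidate baseline for the equiwide setting; illustrative only.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

def diameter_bounded_clusters(points, threshold):
    """Cluster points so no intra-cluster pairwise distance exceeds
    `threshold`; the number of clusters emerges from the data."""
    # Complete linkage merges clusters by their maximum pairwise distance,
    # so cutting the dendrogram at `threshold` bounds every cluster's
    # diameter by `threshold`.
    z = linkage(pdist(np.asarray(points, dtype=float)), method="complete")
    return fcluster(z, t=threshold, criterion="distance")
```

Unlike k-means, the number of clusters here is an output rather than an input, controlled entirely by `threshold`.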
arXiv Detail & Related papers (2021-09-28T12:02:18Z)
- LSEC: Large-scale spectral ensemble clustering [8.545202841051582]
We propose a large-scale spectral ensemble clustering (LSEC) method to strike a good balance between efficiency and effectiveness.
The LSEC method achieves a lower computational complexity than most existing ensemble clustering methods.
arXiv Detail & Related papers (2021-06-18T00:42:03Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)