Deep Clustering with Features from Self-Supervised Pretraining
- URL: http://arxiv.org/abs/2207.13364v1
- Date: Wed, 27 Jul 2022 08:38:45 GMT
- Title: Deep Clustering with Features from Self-Supervised Pretraining
- Authors: Xingzhi Zhou, Nevin L. Zhang
- Abstract summary: A deep clustering model conceptually consists of a feature extractor that maps data points to a latent space, and a clustering head that groups data points into clusters in the latent space.
In the first stage, the feature extractor is trained via self-supervised learning, which enables the preservation of the cluster structures among the data points.
We propose to replace the first stage with another model that is pretrained on a much larger dataset via self-supervised learning.
- Score: 16.023354174462774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A deep clustering model conceptually consists of a feature extractor that
maps data points to a latent space, and a clustering head that groups data
points into clusters in the latent space. Although the two components used to
be trained jointly in an end-to-end fashion, recent works have proved it
beneficial to train them separately in two stages. In the first stage, the
feature extractor is trained via self-supervised learning, which enables the
preservation of the cluster structures among the data points. To preserve the
cluster structures even better, we propose to replace the first stage with
another model that is pretrained on a much larger dataset via self-supervised
learning. The method is simple and might suffer from domain shift. Nonetheless,
we have empirically shown that it can achieve superior clustering performance.
When a vision transformer (ViT) architecture is used for feature extraction,
our method has achieved clustering accuracies of 94.0%, 55.6%, and 97.9% on CIFAR-10,
CIFAR-100, and STL-10, respectively. The corresponding previous state-of-the-art
results are 84.3%, 47.7% and 80.8%. Our code will be available online with the
publication of the paper.
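The two-stage recipe described in the abstract can be illustrated with a minimal sketch: features come from a backbone pretrained on a much larger dataset via self-supervised learning, and a simple clustering head groups them. The DINO ViT-S/16 checkpoint from torch.hub and the k-means head below are illustrative assumptions, not necessarily the exact components used in the paper.

```python
# Minimal sketch of the two-stage approach (illustrative choices, not the
# paper's exact pipeline): a self-supervised pretrained ViT as the feature
# extractor and k-means as the clustering head.
import torch
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import CIFAR10
from sklearn.cluster import KMeans

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stage 1: feature extractor pretrained on a much larger dataset (ImageNet)
# via self-supervised learning; here DINO ViT-S/16 from torch.hub.
backbone = torch.hub.load("facebookresearch/dino:main", "dino_vits16")
backbone = backbone.to(device).eval()

transform = T.Compose([
    T.Resize(224),
    T.ToTensor(),
    T.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
dataset = CIFAR10(root="data", train=True, download=True, transform=transform)
loader = DataLoader(dataset, batch_size=256, num_workers=4)

features = []
with torch.no_grad():
    for images, _ in loader:
        features.append(backbone(images.to(device)).cpu())
features = torch.cat(features).numpy()

# Stage 2: clustering head; plain k-means is the simplest stand-in.
cluster_labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(features)
```

Clustering accuracy would then be computed by matching predicted clusters to ground-truth classes, for example with the Hungarian algorithm.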
Related papers
- End-to-end Learnable Clustering for Intent Learning in Recommendation [54.157784572994316]
We propose a novel intent learning method termed ELCRec.
It unifies behavior representation learning into an End-to-end Learnable Clustering framework.
We deploy this method on the industrial recommendation system with 130 million page views and achieve promising results.
arXiv Detail & Related papers (2024-01-11T15:22:55Z)
- Deep Structure and Attention Aware Subspace Clustering [29.967881186297582]
We propose a novel Deep Structure and Attention aware Subspace Clustering (DSASC)
We use a vision transformer to extract features, and the extracted features are divided into two parts: structure features and content features.
Our method significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-25T01:19:47Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
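As a rough illustration of why such a regularizer helps, one common way to discourage the collapsed solution (an assumption for illustration, not the hard-assignment regularizer proposed in that paper) is to penalize a clustering head whose batch-average assignment concentrates on a single cluster:

```python
# Illustrative anti-collapse regularizer (an assumption, not the paper's
# exact method): encourage confident per-sample assignments while keeping
# the batch-average assignment spread across clusters.
import torch
import torch.nn.functional as F

def clustering_loss(logits: torch.Tensor, balance_weight: float = 1.0) -> torch.Tensor:
    """logits: (batch, n_clusters) scores produced by the clustering head."""
    probs = F.softmax(logits, dim=1)
    # Sharpness: low per-sample entropy pushes assignments toward one-hot.
    sharpness = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
    # Balance: minimizing the negative entropy of the mean assignment keeps
    # cluster usage spread out, blocking the single-cluster collapse.
    mean_probs = probs.mean(dim=0)
    balance = (mean_probs * torch.log(mean_probs + 1e-8)).sum()
    return sharpness + balance_weight * balance
```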
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- Prompt Tuning for Parameter-efficient Medical Image Segmentation [79.09285179181225]
We propose and investigate several contributions to achieve a parameter-efficient but effective adaptation for semantic segmentation on two medical imaging datasets.
We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online generated prototypes.
We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models.
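As a generic illustration of parameter-efficient adaptation (the class below is hypothetical and far simpler than the paper's dense prompt-tuning scheme for segmentation), only a small set of prompt embeddings is trained while the pretrained encoder stays frozen:

```python
# Hypothetical sketch of prompt tuning (not the paper's architecture):
# freeze a pretrained token-based encoder and learn only a few prompt
# embeddings that are prepended to the input sequence.
import torch
import torch.nn as nn

class PromptTunedEncoder(nn.Module):
    def __init__(self, encoder: nn.Module, embed_dim: int, n_prompts: int = 8):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False          # backbone parameters stay frozen
        # The prompt embeddings are the only parameters updated during adaptation.
        self.prompts = nn.Parameter(torch.randn(n_prompts, embed_dim) * 0.02)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, embed_dim) token embeddings
        batch = tokens.size(0)
        prompts = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        return self.encoder(torch.cat([prompts, tokens], dim=1))
```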
arXiv Detail & Related papers (2022-11-16T21:55:05Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in real-world federated systems is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
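A rough sketch of the idea (with details that differ from the actual CCVR algorithm, whose federated aspects are omitted here) is to fit per-class Gaussians to feature statistics, sample virtual representations from them, and recalibrate only the classifier:

```python
# Rough sketch of classifier calibration with virtual representations
# (illustrative; not the exact CCVR procedure).
import numpy as np
from sklearn.linear_model import LogisticRegression

def calibrate_classifier(features: np.ndarray, labels: np.ndarray,
                         n_virtual_per_class: int = 500, seed: int = 0):
    rng = np.random.default_rng(seed)
    virtual_x, virtual_y = [], []
    for c in np.unique(labels):
        class_feats = features[labels == c]
        mean = class_feats.mean(axis=0)
        # Regularize the covariance so sampling stays well conditioned.
        cov = np.cov(class_feats, rowvar=False) + 1e-6 * np.eye(class_feats.shape[1])
        virtual_x.append(rng.multivariate_normal(mean, cov, size=n_virtual_per_class))
        virtual_y.append(np.full(n_virtual_per_class, c))
    # Retrain only the classifier head on the sampled virtual representations.
    return LogisticRegression(max_iter=1000).fit(np.concatenate(virtual_x),
                                                 np.concatenate(virtual_y))
```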
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- Variational Auto Encoder Gradient Clustering [0.0]
Clustering using deep neural network models has been extensively studied in recent years.
This article investigates how probability function gradient ascent can be used to process data in order to achieve better clustering.
We propose a simple yet effective method, based on the DBSCAN clustering algorithm, for determining a suitable number of clusters for the data.
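A minimal version of the DBSCAN-based idea (the gradient-ascent preprocessing described in the entry is omitted) simply counts the clusters DBSCAN discovers in the processed representations:

```python
# Minimal sketch: estimate the number of clusters with DBSCAN
# (the gradient-ascent preprocessing step is not shown).
import numpy as np
from sklearn.cluster import DBSCAN

def estimate_num_clusters(features: np.ndarray, eps: float = 0.5,
                          min_samples: int = 10) -> int:
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(features)
    return len(set(labels) - {-1})   # label -1 marks noise points, not a cluster
```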
arXiv Detail & Related papers (2021-05-11T08:00:36Z)
- Learning Self-Expression Metrics for Scalable and Inductive Subspace Clustering [5.587290026368626]
Subspace clustering has established itself as a state-of-the-art approach to clustering high-dimensional data.
We propose a novel metric learning approach to learn instead a subspace affinity function using a siamese neural network architecture.
Our model benefits from a constant number of parameters and a constant-size memory footprint, allowing it to scale to considerably larger datasets.
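A hypothetical sketch of such a siamese affinity function (architecture assumed for illustration, not taken from the paper) maps a pair of points through a shared encoder and scores how likely they are to lie in the same subspace:

```python
# Hypothetical siamese affinity network: a shared encoder embeds both
# points, and their similarity becomes an affinity score in [0, 1].
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseAffinity(nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, x1: torch.Tensor, x2: torch.Tensor) -> torch.Tensor:
        z1 = F.normalize(self.encoder(x1), dim=-1)
        z2 = F.normalize(self.encoder(x2), dim=-1)
        # Parameter count stays constant in the dataset size, unlike a full
        # self-expression coefficient matrix over all data points.
        return torch.sigmoid((z1 * z2).sum(dim=-1))
```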
arXiv Detail & Related papers (2020-09-27T15:40:12Z)
- Contrastive Clustering [57.71729650297379]
We propose Contrastive Clustering (CC) which explicitly performs the instance- and cluster-level contrastive learning.
In particular, CC achieves an NMI of 0.705 (0.431) on the CIFAR-10 (CIFAR-100) dataset, which is an up to 19% (39%) performance improvement compared with the best baseline.
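The cluster-level half of the idea can be sketched as follows (simplified; the actual CC objective also includes an instance-level contrastive branch): the columns of the soft-assignment matrices from two augmented views serve as cluster representations and are contrasted so that matching clusters agree.

```python
# Simplified cluster-level contrastive loss in the spirit of CC (the full
# method also uses an instance-level contrastive loss).
import torch
import torch.nn.functional as F

def cluster_contrastive_loss(p_a: torch.Tensor, p_b: torch.Tensor,
                             temperature: float = 0.5) -> torch.Tensor:
    """p_a, p_b: (batch, n_clusters) soft assignments for two augmented views."""
    # A cluster's representation is its assignment column over the batch.
    c_a = F.normalize(p_a.t(), dim=1)        # (n_clusters, batch)
    c_b = F.normalize(p_b.t(), dim=1)
    logits = c_a @ c_b.t() / temperature     # pairwise cluster similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    # The same cluster seen under the two views forms the positive pair.
    return F.cross_entropy(logits, targets)
```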
arXiv Detail & Related papers (2020-09-21T08:54:40Z)
- LSD-C: Linearly Separable Deep Clusters [145.89790963544314]
We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws inspiration from recent semi-supervised learning practice and combines the clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
arXiv Detail & Related papers (2020-06-17T17:58:10Z)
- Improving k-Means Clustering Performance with Disentangled Internal Representations [0.0]
We propose a simpler approach of optimizing the entanglement of the learned latent code representation of an autoencoder.
Using our proposed approach, the test clustering accuracy was 96.2% on the MNIST dataset, 85.6% on the Fashion-MNIST dataset, and 79.2% on the EMNIST Balanced dataset, outperforming our baseline models.
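A bare-bones version of such a pipeline (the architecture below is assumed, and the disentanglement objective on the latent code is not shown) trains an autoencoder and then runs k-means on its latent codes:

```python
# Bare-bones autoencoder-plus-k-means pipeline (illustrative architecture;
# the paper's disentanglement objective is omitted).
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class AutoEncoder(nn.Module):
    def __init__(self, in_dim: int = 784, latent_dim: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))

    def forward(self, x: torch.Tensor):
        z = self.encoder(x)
        return self.decoder(z), z

# After training with a reconstruction loss (plus any disentanglement term),
# cluster the latent codes:
#   codes = model.encoder(data).detach().numpy()
#   labels = KMeans(n_clusters=10, n_init=10).fit_predict(codes)
```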
arXiv Detail & Related papers (2020-06-05T11:32:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.