LSD-C: Linearly Separable Deep Clusters
- URL: http://arxiv.org/abs/2006.10039v1
- Date: Wed, 17 Jun 2020 17:58:10 GMT
- Title: LSD-C: Linearly Separable Deep Clusters
- Authors: Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Kai Han, Andrea Vedaldi,
Andrew Zisserman
- Abstract summary: We present LSD-C, a novel method to identify clusters in an unlabeled dataset.
Our method draws inspiration from recent semi-supervised learning practice and proposes to combine our clustering algorithm with self-supervised pretraining and strong data augmentation.
We show that our approach significantly outperforms competitors on popular public image benchmarks including CIFAR 10/100, STL 10 and MNIST, as well as the document classification dataset Reuters 10K.
- Score: 145.89790963544314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present LSD-C, a novel method to identify clusters in an unlabeled
dataset. Our algorithm first establishes pairwise connections in the feature
space between the samples of the minibatch based on a similarity metric. Then
it regroups in clusters the connected samples and enforces a linear separation
between clusters. This is achieved by using the pairwise connections as targets
together with a binary cross-entropy loss on the predictions that the
associated pairs of samples belong to the same cluster. This way, the feature
representation of the network will evolve such that similar samples in this
feature space will belong to the same linearly separated cluster. Our method
draws inspiration from recent semi-supervised learning practice and proposes to
combine our clustering algorithm with self-supervised pretraining and strong
data augmentation. We show that our approach significantly outperforms
competitors on popular public image benchmarks including CIFAR 10/100, STL 10
and MNIST, as well as the document classification dataset Reuters 10K.
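The pairwise objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The cosine-similarity threshold, the use of the dot product of softmax outputs as the same-cluster prediction, and all function names (`pairwise_targets`, `linear_head`, `pairwise_bce_loss`) are assumptions made for illustration; the paper may use a different similarity metric or connection rule.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pairwise_targets(features, threshold=0.9):
    """Assumed connection rule: link two samples if their cosine similarity exceeds a threshold."""
    return [[1.0 if cosine(u, v) > threshold else 0.0 for v in features]
            for u in features]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear_head(x, weights, bias):
    """Linear classifier over features; `weights` holds one row per cluster."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def pairwise_bce_loss(features, weights, bias, threshold=0.9, eps=1e-7):
    """Binary cross-entropy between pairwise targets and same-cluster predictions."""
    targets = pairwise_targets(features, threshold)
    probs = [softmax(linear_head(x, weights, bias)) for x in features]
    n = len(features)
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Assumed prediction: probability that samples i and j share a cluster.
            same = sum(p * q for p, q in zip(probs[i], probs[j]))
            same = min(max(same, eps), 1.0 - eps)
            t = targets[i][j]
            total -= t * math.log(same) + (1.0 - t) * math.log(1.0 - same)
    return total / (n * n)

# Toy minibatch with two visibly distinct groups of features.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
bias = [0.0, 0.0]
separating = [[10.0, 0.0], [0.0, 10.0]]    # head that splits the two groups
uninformative = [[0.0, 0.0], [0.0, 0.0]]   # head that predicts uniform clusters

loss_separating = pairwise_bce_loss(features, separating, bias)
loss_uninformative = pairwise_bce_loss(features, uninformative, bias)
print(loss_separating < loss_uninformative)
```

In this toy setting the loss is lower for the head that linearly separates the two groups, which matches the behaviour the abstract describes: minimizing the pairwise loss pushes connected samples into the same linearly separated cluster.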
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- ClusterNet: A Perception-Based Clustering Model for Scattered Data [16.326062082938215]
Cluster separation in scatterplots is a task that is typically tackled by widely used clustering techniques.
We propose a learning strategy which directly operates on scattered data.
We train ClusterNet, a point-based deep learning model, to reflect human perception of cluster separability.
arXiv Detail & Related papers (2023-04-27T13:41:12Z)
- C3: Cross-instance guided Contrastive Clustering [8.953252452851862]
Clustering is the task of gathering similar data samples into clusters without using any predefined labels.
We propose a novel contrastive clustering method, Cross-instance guided Contrastive Clustering (C3).
Our proposed method can outperform state-of-the-art algorithms on benchmark computer vision datasets.
arXiv Detail & Related papers (2022-11-14T06:28:07Z)
- Efficient Distribution Similarity Identification in Clustered Federated Learning via Principal Angles Between Client Data Subspaces [59.33965805898736]
Clustered learning has been shown to produce promising results by grouping clients into clusters.
Existing FL algorithms are essentially trying to group clients together with similar distributions.
Prior FL algorithms attempt to learn these similarities indirectly during training.
arXiv Detail & Related papers (2022-09-21T17:37:54Z)
- DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z)
- Implicit Sample Extension for Unsupervised Person Re-Identification [97.46045935897608]
Clustering sometimes mixes different true identities together or splits the same identity into two or more sub-clusters.
We propose an Implicit Sample Extension (ISE) method to generate what we call support samples around the cluster boundaries.
Experiments demonstrate that the proposed method is effective and achieves state-of-the-art performance for unsupervised person Re-ID.
arXiv Detail & Related papers (2022-04-14T11:41:48Z)
- Is it all a cluster game? -- Exploring Out-of-Distribution Detection based on Clustering in the Embedding Space [7.856998585396422]
It is essential for safety-critical applications of deep neural networks to determine when new inputs are significantly different from the training distribution.
We study the structure and separation of clusters in the embedding space and find that supervised contrastive learning leads to well-separated clusters.
In our analysis of different training methods, clustering strategies, distance metrics, and thresholding approaches, we observe that there is no clear winner.
arXiv Detail & Related papers (2022-03-16T11:22:23Z)
- Cluster Analysis with Deep Embeddings and Contrastive Learning [0.0]
This work proposes a novel framework for performing image clustering from deep embeddings.
Our approach jointly learns representations and predicts cluster centers in an end-to-end manner.
Our framework performs on par with widely accepted clustering methods and outperforms the state-of-the-art contrastive learning method on the CIFAR-10 dataset.
arXiv Detail & Related papers (2021-09-26T22:18:15Z)
- Learning Statistical Representation with Joint Deep Embedded Clustering [2.1267423178232407]
StatDEC is an unsupervised framework for joint statistical representation learning and clustering.
Our experiments show that using these representations, one can considerably improve results on imbalanced image clustering across a variety of image datasets.
arXiv Detail & Related papers (2021-09-11T09:26:52Z)
- Contrastive Clustering [57.71729650297379]
We propose Contrastive Clustering (CC) which explicitly performs the instance- and cluster-level contrastive learning.
In particular, CC achieves an NMI of 0.705 (0.431) on the CIFAR-10 (CIFAR-100) dataset, which is an up to 19% (39%) performance improvement compared with the best baseline.
arXiv Detail & Related papers (2020-09-21T08:54:40Z)