Progressive Cluster Purification for Unsupervised Feature Learning
- URL: http://arxiv.org/abs/2007.02577v2
- Date: Wed, 15 Jul 2020 17:11:45 GMT
- Title: Progressive Cluster Purification for Unsupervised Feature Learning
- Authors: Yifei Zhang, Chang Liu, Yu Zhou, Wei Wang, Weiping Wang and Qixiang Ye
- Abstract summary: In unsupervised feature learning, sample-specificity-based methods ignore inter-class information.
We propose a novel clustering-based method that excludes class-inconsistent samples during progressive cluster formation.
Our approach, referred to as Progressive Cluster Purification (PCP), implements progressive clustering by gradually reducing the number of clusters during training.
- Score: 48.87365358296371
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In unsupervised feature learning, sample-specificity-based methods ignore
inter-class information, which deteriorates the discriminative capability of
representation models. Clustering-based methods are error-prone when exploring
complete class boundary information, due to the inevitable class-inconsistent
samples in each cluster. In this work, we propose a novel clustering-based
method which, by iteratively excluding class-inconsistent samples during
progressive cluster formation, alleviates the impact of noise samples in a
simple yet effective manner. Our approach, referred to as Progressive Cluster
Purification (PCP), implements progressive clustering by gradually reducing the
number of clusters during training, while cluster sizes continuously expand
in step with the growth of the model's representation capability. A
well-designed cluster purification mechanism further filters noise samples
from each cluster, and the refined clusters then serve as pseudo-labels that
facilitate subsequent feature learning. Experiments on commonly used
benchmarks demonstrate that the proposed PCP improves the baseline method by
significant margins. Our code will be available at
https://github.com/zhangyifei0115/PCP.
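The abstract describes two cooperating mechanisms: a schedule that shrinks the number of clusters as training proceeds, and a purification step that drops likely class-inconsistent samples before the refined clusters are reused as pseudo-labels. The sketch below illustrates one plausible reading of that loop. It assumes k-means as the underlying clusterer and a distance-to-centroid purification rule; every name here (`pcp_epoch`, `keep_ratio`, the halving schedule) is an illustrative assumption, not the authors' released implementation, which lives at the repository above.

```python
import numpy as np
from sklearn.cluster import KMeans

def pcp_epoch(features, n_clusters, keep_ratio=0.8):
    """One illustrative round of progressive cluster purification.

    features:   (N, D) array of embeddings from the current model.
    n_clusters: cluster count for this round; it decreases over training,
                so the surviving clusters grow as the model improves.
    keep_ratio: fraction of each cluster kept after purification (assumed).

    Returns pseudo-labels in [0, n_clusters), with -1 marking samples
    filtered out as likely class-inconsistent noise.
    """
    kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(features)
    labels = kmeans.labels_.copy()
    for c in range(n_clusters):
        members = np.where(labels == c)[0]
        # Assumed purification rule: drop the members farthest from the
        # centroid, treating them as class-inconsistent noise.
        dists = np.linalg.norm(
            features[members] - kmeans.cluster_centers_[c], axis=1)
        n_keep = max(1, int(keep_ratio * len(members)))
        labels[members[np.argsort(dists)[n_keep:]]] = -1
    return labels

# Assumed progressive schedule: halve the cluster count each stage, then
# train the encoder against the purified pseudo-labels (ignoring -1):
#   for k in (1024, 512, 256, 128):
#       feats = encoder(dataset)                      # hypothetical encoder
#       pseudo = pcp_epoch(feats, n_clusters=k)
#       train_step(encoder, dataset, pseudo)          # hypothetical trainer
```

The commented loop captures the abstract's claim that cluster counts shrink while cluster sizes grow: with a fixed dataset, halving `n_clusters` roughly doubles the average cluster size at each stage, and the purified labels drive the next round of feature learning.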
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality-reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in a self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship between real and generated samples.
Second, we design a self-supervised metric learning scheme to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z)
- Deep Embedding Clustering Driven by Sample Stability [16.53706617383543]
We propose a deep embedding clustering algorithm driven by sample stability (DECS).
Specifically, we start by constructing the initial feature space with an autoencoder and then learn cluster-oriented embedding features constrained by sample stability (a two-stage recipe sketched after this entry).
The experimental results on five datasets illustrate that the proposed method achieves superior performance compared to state-of-the-art clustering approaches.
arXiv Detail & Related papers (2024-01-29T09:19:49Z)
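As a minimal reading of the two-stage recipe in the DECS summary above: stage one pretrains an autoencoder to build the initial feature space, and stage two clusters in the latent space. The sample-stability constraint itself is the paper's specific contribution and is not reproduced here; the architecture, dimensions, and function names are all illustrative assumptions.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class AutoEncoder(nn.Module):
    """Stage 1: an autoencoder builds the initial feature space."""
    def __init__(self, in_dim=784, latent_dim=10):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def pretrain(model, data, epochs=50, lr=1e-3):
    """Reconstruction pretraining on a full-batch tensor `data` of shape (N, in_dim)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        _, recon = model(data)
        loss = nn.functional.mse_loss(recon, data)
        opt.zero_grad()
        loss.backward()
        opt.step()

def cluster_latent(model, data, n_clusters):
    """Stage 2: cluster in the latent space. DECS further constrains the
    embedding with its sample-stability criterion, omitted in this sketch."""
    with torch.no_grad():
        z, _ = model(data)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(z.numpy())
```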
- Deep Clustering with Diffused Sampling and Hardness-aware Self-distillation [4.550555443103878]
This paper proposes a novel end-to-end deep clustering method with diffused sampling and hardness-aware self-distillation (HaDis).
Results on five challenging image datasets demonstrate the superior clustering performance of our HaDis method over the state-of-the-art.
arXiv Detail & Related papers (2024-01-25T09:33:49Z)
- Stable Cluster Discrimination for Deep Clustering [7.175082696240088]
Deep clustering can optimize representations of instances (i.e., representation learning) and explore the inherent data distribution.
The coupled objective implies a trivial solution in which all instances collapse to uniform features.
In this work, we first show that the prevalent discrimination task in supervised learning is unstable for one-stage clustering.
A novel stable cluster discrimination (SeCu) task is proposed, and a new hardness-aware clustering criterion is obtained accordingly.
arXiv Detail & Related papers (2023-11-24T06:43:26Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Implicit Sample Extension for Unsupervised Person Re-Identification [97.46045935897608]
Clustering sometimes mixes different true identities together or splits the same identity into two or more sub-clusters.
We propose an Implicit Sample Extension (ISE) method to generate what we call support samples around the cluster boundaries.
Experiments demonstrate that the proposed method is effective and achieves state-of-the-art performance for unsupervised person Re-ID.
arXiv Detail & Related papers (2022-04-14T11:41:48Z)
- Self-Evolutionary Clustering [1.662966122370634]
Most existing deep clustering methods are based on simple distance comparison and are highly dependent on the target distribution generated by a handcrafted nonlinear mapping.
A novel modular Self-Evolutionary Clustering (Self-EvoC) framework is constructed, which boosts the clustering performance by classification in a self-supervised manner.
The framework can efficiently discriminate sample outliers and generate a better target distribution with the assistance of self-supervision.
arXiv Detail & Related papers (2022-02-21T19:38:18Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)