Deep Metric Learning Meets Deep Clustering: A Novel Unsupervised
Approach for Feature Embedding
- URL: http://arxiv.org/abs/2009.04091v1
- Date: Wed, 9 Sep 2020 04:02:04 GMT
- Title: Deep Metric Learning Meets Deep Clustering: A Novel Unsupervised
Approach for Feature Embedding
- Authors: Binh X. Nguyen, Binh D. Nguyen, Gustavo Carneiro, Erman Tjiputra,
Quang D. Tran, Thanh-Toan Do
- Abstract summary: Unsupervised Deep Distance Metric Learning (UDML) aims to learn sample similarities in the embedding space from an unlabeled dataset.
Traditional UDML methods usually use a triplet or pairwise loss, which requires mining positive and negative samples.
This is, however, challenging in an unsupervised setting as the label information is not available.
We propose a new UDML method that overcomes that challenge.
- Score: 32.8693763689033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised Deep Distance Metric Learning (UDML) aims to learn sample
similarities in the embedding space from an unlabeled dataset. Traditional UDML
methods usually use a triplet or pairwise loss, which requires mining positive
and negative samples w.r.t. anchor data points. This is, however,
challenging in an unsupervised setting as the label information is not
available. In this paper, we propose a new UDML method that overcomes that
challenge. In particular, we propose to use a deep clustering loss to learn
centroids, i.e., pseudo labels, that represent semantic classes. During
learning, these centroids are also used to reconstruct the input samples, which
ensures their representativeness: each centroid represents
visually similar samples. Therefore, the centroids give information about
positive (visually similar) and negative (visually dissimilar) samples. Based
on pseudo labels, we propose a novel unsupervised metric loss which enforces
the positive concentration and negative separation of samples in the embedding
space. Experimental results on benchmark datasets show that the proposed
approach outperforms other UDML methods.
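
To make the mechanism concrete, here is a minimal sketch, assuming a PyTorch setup, of how learned centroids can serve as pseudo labels for a metric loss that enforces positive concentration and negative separation. This is not the authors' released implementation; the deep clustering and reconstruction losses that train the centroids are omitted, and names such as pseudo_label_metric_loss are our own.

```python
import torch
import torch.nn.functional as F

def assign_pseudo_labels(embeddings, centroids):
    # Nearest-centroid assignment: each sample's pseudo label is the
    # index of its closest centroid in the embedding space.
    dists = torch.cdist(embeddings, centroids)         # (B, K)
    return dists.argmin(dim=1)                         # (B,)

def pseudo_label_metric_loss(embeddings, centroids, labels, margin=0.5):
    # Positive concentration: pull each sample toward its own centroid.
    pos = (embeddings - centroids[labels]).pow(2).sum(dim=1)
    # Negative separation: hinge that pushes samples with different
    # pseudo labels at least `margin` apart.
    pdist = torch.cdist(embeddings, embeddings)        # (B, B)
    diff = labels.unsqueeze(0) != labels.unsqueeze(1)  # True for negatives
    neg = F.relu(margin - pdist)[diff]
    neg_term = neg.mean() if neg.numel() > 0 else pos.new_zeros(())
    return pos.mean() + neg_term

# Toy usage: in the paper the centroids are trained jointly with a deep
# clustering loss and are also used to reconstruct the input samples.
emb = F.normalize(torch.randn(32, 128), dim=1)
cen = F.normalize(torch.randn(10, 128), dim=1)
labels = assign_pseudo_labels(emb, cen)
loss = pseudo_label_metric_loss(emb, cen, labels)
```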
Related papers
- Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
These findings highlight the value of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)
- Manifold DivideMix: A Semi-Supervised Contrastive Learning Framework for Severe Label Noise [4.90148689564172]
Real-world datasets contain noisy-labeled samples that have no semantic relevance to any class in the dataset.
Most state-of-the-art methods treat in-distribution (ID) noisy-labeled samples as unlabeled data for semi-supervised learning.
We propose incorporating the information from all the training data by leveraging the benefits of self-supervised training.
arXiv Detail & Related papers (2023-08-13T23:33:33Z)
- Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to unlabeled data containing both inliers and outliers (see the sketch below).
arXiv Detail & Related papers (2023-03-21T09:07:15Z)
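
The EDL component above can be illustrated with the standard Dirichlet-based uncertainty used in evidential deep learning. A minimal sketch, not the paper's exact formulation (the adaptive negative optimization and the tailored uncertainty metrics are omitted):

```python
import torch
import torch.nn.functional as F

def edl_uncertainty(logits):
    # Standard evidential deep learning: non-negative evidence per class
    # parameterizes a Dirichlet; total uncertainty is u = K / sum(alpha).
    evidence = F.softplus(logits)           # (B, K), >= 0
    alpha = evidence + 1.0                  # Dirichlet parameters
    strength = alpha.sum(dim=1)             # Dirichlet strength per sample
    K = logits.shape[1]
    u = K / strength                        # in (0, 1]; high u => likely outlier
    probs = alpha / strength.unsqueeze(1)   # expected class probabilities
    return u, probs

# Unlabeled samples with high uncertainty can be treated as outliers
# during self-training, e.g. mask = u > threshold.
u, probs = edl_uncertainty(torch.randn(8, 5))
```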
- CEREAL: Few-Sample Clustering Evaluation [4.569028973407756]
We focus on the underexplored problem of estimating clustering quality with limited labels.
We introduce CEREAL, a comprehensive framework for few-sample clustering evaluation.
Our results show that CEREAL reduces the area under the absolute error curve by up to 57% compared to the best sampling baseline.
arXiv Detail & Related papers (2022-09-30T19:52:41Z)
- Negative Selection by Clustering for Contrastive Learning in Human Activity Recognition [5.351176836203563]
We propose ClusterCLHAR, a new contrastive learning framework that performs negative selection by clustering in Human Activity Recognition (HAR).
Compared with SimCLR, it redefines the negative pairs in the contrastive loss by using unsupervised clustering to generate soft labels that mask other samples of the same cluster, so they are not treated as negative samples (see the sketch below).
We evaluate ClusterCLHAR on three benchmark datasets, USC-HAD, MotionSense, and UCI-HAR, using mean F1-score as the evaluation metric.
arXiv Detail & Related papers (2022-03-23T06:54:16Z)
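
A minimal sketch of the cluster-masked negative selection described above, grafted onto a SimCLR-style NT-Xent loss. This is our own simplified rendering, assuming hard cluster assignments rather than the paper's soft labels:

```python
import torch
import torch.nn.functional as F

def cluster_masked_ntxent(z1, z2, cluster_ids, tau=0.1):
    # z1, z2: (B, D) embeddings of two augmentations of the same batch;
    # cluster_ids: (B,) pseudo labels from an unsupervised clustering step.
    B = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2B, D)
    sim = z @ z.t() / tau                                # (2B, 2B)
    ids = torch.cat([cluster_ids, cluster_ids])          # (2B,)
    idx = torch.arange(2 * B, device=z.device)
    pos_idx = idx.roll(B)                                # i paired with i+B
    # Unlike SimCLR, same-cluster samples are masked out so they are
    # never treated as negatives; the augmented positive is kept.
    mask = ids.unsqueeze(0) == ids.unsqueeze(1)          # same cluster
    mask[idx, pos_idx] = False                           # keep the positives
    sim = sim.masked_fill(mask, float('-inf'))           # also drops self term
    return F.cross_entropy(sim, pos_idx)

loss = cluster_masked_ntxent(torch.randn(16, 64), torch.randn(16, 64),
                             torch.randint(0, 4, (16,)))
```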
- Guided Point Contrastive Learning for Semi-supervised Point Cloud Semantic Segmentation [90.2445084743881]
We present a semi-supervised point cloud semantic segmentation method that adopts unlabeled point clouds in training to boost model performance.
Inspired by the recent contrastive loss in self-supervised tasks, we propose the guided point contrastive loss to enhance the feature representation and model generalization ability.
arXiv Detail & Related papers (2021-10-15T16:38:54Z)
- Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, the two contrastive losses constrain the clustering results of mini-batch samples at both the sample and class levels (see the sketch below).
arXiv Detail & Related papers (2021-03-09T15:15:32Z)
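
A minimal sketch of the two views above, assuming p1 and p2 are the (batch, classes) softmax outputs of a clustering head under two augmentations. This is our own simplified rendering of the idea, not the official DCDC code:

```python
import torch
import torch.nn.functional as F

def info_nce(a, b, tau=0.5):
    # Rows of `a` and `b` are paired positives; all other rows are negatives.
    a, b = F.normalize(a, dim=1), F.normalize(b, dim=1)
    logits = a @ b.t() / tau
    targets = torch.arange(a.shape[0], device=a.device)
    return F.cross_entropy(logits, targets)

def dcdc_loss(p1, p2, tau=0.5):
    # Sample view: a sample's class distribution and its augmented
    # version form a positive pair (contrast over rows).
    sample_view = info_nce(p1, p2, tau)
    # Class view: a class's sample distribution across the batch and its
    # augmented counterpart form a positive pair (contrast over columns).
    class_view = info_nce(p1.t(), p2.t(), tau)
    return sample_view + class_view

p1 = torch.softmax(torch.randn(32, 10), dim=1)
p2 = torch.softmax(torch.randn(32, 10), dim=1)
loss = dcdc_loss(p1, p2)
```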
- Multimodal-Aware Weakly Supervised Metric Learning with Self-weighting Triplet Loss [2.010312620798609]
We propose a novel weakly supervised metric learning algorithm, named MultimoDal Aware weakly supervised Metric Learning (MDaML).
MDaML partitions the data space into several clusters and allocates local cluster centers and a weight for each sample.
Experiments conducted on 13 datasets validate the superiority of the proposed MDaML.
arXiv Detail & Related papers (2021-02-03T07:27:05Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost (see the sketch below).
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
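
A minimal sketch of the per-sample weighting idea above. We assume the attentional weight is a softmax over the scaled per-sample losses, with the temperature's sign selecting between up-weighting hard samples (imbalance) and down-weighting high-loss ones (label noise); the paper's moving-average estimator and convergence analysis are omitted:

```python
import torch

def absgd_step(model, criterion, optimizer, x, y, lam=5.0):
    # Momentum SGD with an attentional per-sample weight: with lam > 0,
    # higher-loss samples get exponentially larger weight; a negative lam
    # instead down-weights likely-noisy high-loss samples.
    losses = criterion(model(x), y)             # (B,) per-sample losses
    with torch.no_grad():
        w = torch.softmax(losses / lam, dim=0)  # attentional weights
    optimizer.zero_grad()
    (w * losses).sum().backward()               # weighted mini-batch loss
    optimizer.step()

# Usage: the criterion must return per-sample losses.
model = torch.nn.Linear(20, 5)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
crit = torch.nn.CrossEntropyLoss(reduction='none')
absgd_step(model, crit, opt, torch.randn(8, 20), torch.randint(0, 5, (8,)))
```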
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method significantly improves performance compared with a supervised baseline learned from labeled data only (see the sketch below).
arXiv Detail & Related papers (2020-07-21T13:27:09Z)
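
The Mask-guided Mean Teacher above builds on the standard mean-teacher recipe; a minimal sketch of the exponential-moving-average teacher update that such frameworks share (the mask guidance and perturbation-sensitive sample mining are specific to the paper and omitted here):

```python
import copy
import torch

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    # Teacher weights are an exponential moving average of the student's;
    # the teacher then produces targets for the unlabeled data.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)

student = torch.nn.Linear(16, 4)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)
# Called after each student optimizer step:
ema_update(teacher, student)
```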
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides (including all of the above) and is not responsible for any consequences of its use.