Cold PAWS: Unsupervised class discovery and addressing the cold-start
problem for semi-supervised learning
- URL: http://arxiv.org/abs/2305.10071v2
- Date: Tue, 6 Jun 2023 07:31:15 GMT
- Title: Cold PAWS: Unsupervised class discovery and addressing the cold-start
problem for semi-supervised learning
- Authors: Evelyn J. Mannix, Howard D. Bondell
- Abstract summary: We propose a novel approach based on well-established self-supervised learning, clustering, and manifold learning techniques.
We test our approach using several publicly available datasets, namely CIFAR10, Imagenette, DeepWeeds, and EuroSAT.
We obtain superior performance for the datasets considered with a much simpler approach compared to other methods in the literature.
- Score: 0.30458514384586394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many machine learning applications, labeling datasets can be an arduous
and time-consuming task. Although research has shown that semi-supervised
learning techniques can achieve high accuracy with very few labels within the
field of computer vision, little attention has been given to how images within
a dataset should be selected for labeling. In this paper, we propose a novel
approach based on well-established self-supervised learning, clustering, and
manifold learning techniques that address this challenge of selecting an
informative image subset to label in the first instance, which is known as the
cold-start or unsupervised selective labelling problem. We test our approach
using several publicly available datasets, namely CIFAR10, Imagenette,
DeepWeeds, and EuroSAT, and observe improved performance with both supervised
and semi-supervised learning strategies when our label selection strategy is
used, in comparison to random sampling. We also obtain superior performance for
the datasets considered with a much simpler approach compared to other methods
in the literature.
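The abstract names the ingredients (self-supervised embeddings, clustering, manifold learning) but does not spell out the selection procedure. The sketch below is therefore only an illustration of the general idea of cluster-based label selection, not the authors' method: it assumes the image embeddings have already been computed by some self-supervised model, runs a plain k-means with one cluster per label in the budget, and picks the image nearest each centroid. The function name `select_images_to_label`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np


def select_images_to_label(embeddings, n_labels, n_iter=50, seed=0):
    """Hypothetical helper: pick one representative index per k-means cluster."""
    rng = np.random.default_rng(seed)
    x = np.asarray(embeddings, dtype=float)
    # Initialise centroids from distinct, randomly chosen data points.
    centroids = x[rng.choice(len(x), size=n_labels, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for k in range(n_labels):
            members = x[assign == k]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[k] = members.mean(axis=0)
    # Recompute distances and take the point closest to each final centroid.
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Duplicates are dropped, so fewer than n_labels indices may be returned.
    return sorted(set(dists.argmin(axis=0).tolist()))


# Toy stand-in for embeddings (assumption): three 2-D blobs of 20 points each.
rng = np.random.default_rng(1)
centres = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
toy = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centres])
chosen = select_images_to_label(toy, n_labels=3)
```

The indices in `chosen` would be the images sent for labeling in place of a random sample. Plain k-means is sensitive to initialisation, so a practical pipeline would likely use a more robust clustering step.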
Related papers
- LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds [62.49198183539889]
We propose a label-efficient semantic segmentation pipeline for outdoor scenes with LiDAR point clouds.
Our method co-designs an efficient labeling process with semi/weakly supervised learning.
Our proposed method is highly competitive even with the fully supervised counterpart trained on 100% of the labels.
arXiv Detail & Related papers (2022-10-14T19:13:36Z)
- Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning [57.436224561482966]
Active learning is a research area that addresses the issues of expensive labeling by selecting the most important samples for labeling.
We introduce a new diversity-based initial dataset selection algorithm to select the most informative set of samples for initial labeling in the active learning setting.
Also, we propose a novel active learning query strategy, which uses diversity-based sampling on consistency-based embeddings.
arXiv Detail & Related papers (2022-07-25T16:11:55Z)
- Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization [88.74813798138466]
Localizing keypoints of an object is a basic visual problem.
Supervised learning of a keypoint localization network often requires a large amount of data.
We propose to automatically select reliable pseudo-labeled samples with a series of dynamic thresholds.
arXiv Detail & Related papers (2022-01-21T09:51:58Z)
- Budget-aware Few-shot Learning via Graph Convolutional Network [56.41899553037247]
This paper tackles the problem of few-shot learning, which aims to learn new visual concepts from a few examples.
A common problem setting in few-shot classification assumes a random sampling strategy for acquiring data labels.
We introduce a new budget-aware few-shot learning problem that aims to learn novel object categories.
arXiv Detail & Related papers (2022-01-07T02:46:35Z)
- Towards General and Efficient Active Learning [20.888364610175987]
Active learning aims to select the most informative samples to exploit limited annotation budgets.
We propose a novel general and efficient active learning (GEAL) method in this paper.
Our method can conduct data selection processes on different datasets with a single-pass inference of the same model.
arXiv Detail & Related papers (2021-12-15T08:35:28Z)
- OpenCoS: Contrastive Semi-supervised Learning for Handling Open-set Unlabeled Data [65.19205979542305]
Unlabeled data may include out-of-class samples in practice.
OpenCoS is a method for handling this realistic semi-supervised learning scenario.
arXiv Detail & Related papers (2021-06-29T06:10:05Z)
- Iterative label cleaning for transductive and semi-supervised few-shot learning [16.627512688664513]
Few-shot learning amounts to learning representations and acquiring knowledge such that novel tasks may be solved with both supervision and data being limited.
We introduce a new algorithm that leverages the manifold structure of the labeled and unlabeled data distribution to predict pseudo-labels.
Our solution surpasses or matches state-of-the-art results on four benchmark datasets.
arXiv Detail & Related papers (2020-12-14T21:54:11Z)
- Semi-supervised Active Learning for Instance Segmentation via Scoring Predictions [25.408505612498423]
We propose a novel and principled semi-supervised active learning framework for instance segmentation.
Specifically, we present an uncertainty sampling strategy named Triplet Scoring Predictions (TSP) that explicitly incorporates sample-ranking cues from classes, bounding boxes, and masks.
Results on medical image datasets demonstrate that the proposed method extracts knowledge from the available data in a meaningful way.
arXiv Detail & Related papers (2020-12-09T02:36:52Z)
- Automatically Discovering and Learning New Visual Categories with Ranking Statistics [145.89790963544314]
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes.
We learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.
arXiv Detail & Related papers (2020-02-13T18:53:32Z)