Reliable Label Bootstrapping for Semi-Supervised Learning
- URL: http://arxiv.org/abs/2007.11866v2
- Date: Thu, 25 Feb 2021 11:11:52 GMT
- Title: Reliable Label Bootstrapping for Semi-Supervised Learning
- Authors: Paul Albert, Diego Ortego, Eric Arazo, Noel E. O'Connor, Kevin
McGuinness
- Abstract summary: ReLaB is an unsupervised preprocessing algorithm which improves the performance of semi-supervised algorithms in extremely low supervision settings.
We show that the selection of the network architecture and the self-supervised algorithm are important factors to achieve successful label propagation.
We reach average error rates of $\boldsymbol{22.34}$ with 1 random labeled sample per class on CIFAR-10 and lower this error to $\boldsymbol{8.46}$ when the labeled sample in each class is highly representative.
- Score: 19.841733658911767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reducing the amount of labels required to train convolutional neural networks
without performance degradation is key to effectively reduce human annotation
efforts. We propose Reliable Label Bootstrapping (ReLaB), an unsupervised
preprocessing algorithm which improves the performance of semi-supervised
algorithms in extremely low supervision settings. Given a dataset with few
labeled samples, we first learn meaningful self-supervised, latent features for
the data. Second, a label propagation algorithm propagates the known labels on
the unsupervised features, effectively labeling the full dataset in an
automatic fashion. Third, we select a subset of correctly labeled (reliable)
samples using a label noise detection algorithm. Finally, we train a
semi-supervised algorithm on the extended subset. We show that the selection of
the network architecture and the self-supervised algorithm are important
factors to achieve successful label propagation and demonstrate that ReLaB
substantially improves semi-supervised learning in scenarios of very limited
supervision on CIFAR-10, CIFAR-100 and mini-ImageNet. We reach average error
rates of $\boldsymbol{22.34}$ with 1 random labeled sample per class on
CIFAR-10 and lower this error to $\boldsymbol{8.46}$ when the labeled sample in
each class is highly representative. Our work is fully reproducible:
https://github.com/PaulAlbert31/ReLaB.
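To make the four-step pipeline concrete, here is a minimal sketch of steps 2 and 3, assuming the self-supervised features of step 1 are already extracted. scikit-learn's LabelSpreading stands in for the paper's label propagation, and a simple per-class confidence ranking stands in for its label-noise detection; all names and hyperparameters are illustrative, not the authors' implementation (see the linked repository for that).

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading

def bootstrap_labels(features, labels, reliable_per_class=100):
    """features: (N, D) self-supervised embeddings (step 1, assumed given).
    labels: (N,) integer array with -1 marking unlabeled samples.
    Returns pseudo-labels for all samples and indices of 'reliable' ones."""
    # Step 2: propagate the few known labels over a kNN graph of the features.
    spreader = LabelSpreading(kernel="knn", n_neighbors=50, alpha=0.2)
    spreader.fit(features, labels)
    pseudo = spreader.transduction_  # hard pseudo-label for every sample
    confidence = spreader.label_distributions_.max(axis=1)

    # Step 3: keep the most confident samples of each class as the extended
    # labeled set (a simple stand-in for the paper's noise-detection step).
    reliable = []
    for c in np.unique(pseudo):
        idx = np.where(pseudo == c)[0]
        ranked = idx[np.argsort(confidence[idx])[::-1]]
        reliable.extend(ranked[:reliable_per_class].tolist())
    # Step 4 (not shown): train any semi-supervised method on this subset.
    return pseudo, np.sort(np.array(reliable))
```

For the 1-label-per-class CIFAR-10 setting above, labels would contain a single non-negative entry per class and -1 everywhere else.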
Related papers
- One-bit Supervision for Image Classification: Problem, Solution, and Beyond [114.95815360508395]
This paper presents one-bit supervision, a novel setting of learning with fewer labels, for image classification.
We propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm.
In multiple benchmarks, the learning efficiency of the proposed approach surpasses that of full-bit, semi-supervised supervision.
arXiv Detail & Related papers (2023-11-26T07:39:00Z)
- Adaptive Anchor Label Propagation for Transductive Few-Shot Learning [18.29463308334406]
Few-shot learning addresses the issue of classifying images using limited labeled data.
We propose a novel algorithm that adapts the feature embeddings of the labeled data by minimizing a differentiable loss function.
Our algorithm outperforms the standard label propagation algorithm by as much as 7% and 2% in the 1-shot and 5-shot settings, respectively.
arXiv Detail & Related papers (2023-10-30T20:29:31Z)
- Manifold DivideMix: A Semi-Supervised Contrastive Learning Framework for Severe Label Noise [4.90148689564172]
Real-world datasets contain noisy label samples that have no semantic relevance to any class in the dataset.
Most state-of-the-art methods leverage in-distribution (ID) labeled noisy samples as unlabeled data for semi-supervised learning.
We propose incorporating the information from all the training data by leveraging the benefits of self-supervised training.
arXiv Detail & Related papers (2023-08-13T23:33:33Z)
- All Points Matter: Entropy-Regularized Distribution Alignment for Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions (see the entropy-regularization sketch after this list).
arXiv Detail & Related papers (2023-05-25T08:19:31Z)
- CTRL: Clustering Training Losses for Label Error Detection [4.49681473359251]
In supervised machine learning, the use of correct labels is extremely important to ensure high accuracy.
We propose a novel framework, called Clustering TRaining Losses (CTRL), for label error detection (see the loss-clustering sketch after this list).
It detects label errors in two steps, based on the observation that models learn clean and noisy labels in different ways.
arXiv Detail & Related papers (2022-08-17T18:09:19Z)
- S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
At the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space (see the neighborhood-consistency sketch after this list).
Our method significantly surpasses previous methods on both CIFAR-10 and CIFAR-100 with artificial noise, as well as on real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
- Boosting Semi-Supervised Face Recognition with Noise Robustness [54.342992887966616]
This paper presents an effective solution to semi-supervised face recognition that is robust to the label noise introduced by auto-labelling.
We develop a semi-supervised face recognition solution, named Noise Robust Learning-Labelling (NRoLL), which is based on the robust training ability empowered by GroupNet (GN).
arXiv Detail & Related papers (2021-05-10T14:43:11Z)
- Coping with Label Shift via Distributionally Robust Optimisation [72.80971421083937]
We propose a model that minimises an objective based on distributionally robust optimisation (DRO).
We then design and analyse a gradient descent-proximal mirror ascent algorithm tailored for large-scale problems to optimise the proposed objective.
arXiv Detail & Related papers (2020-10-23T08:33:04Z)
- Analysis of label noise in graph-based semi-supervised learning [2.4366811507669124]
In machine learning, one must acquire labels to help supervise a model that will be able to generalize to unseen data.
It is often the case that most of our data is unlabeled.
Semi-supervised learning (SSL) alleviates that by making strong assumptions about the relation between the labels and the input data distribution.
arXiv Detail & Related papers (2020-09-27T22:13:20Z)
- Semi-supervised deep learning based on label propagation in a 2D embedded space [117.9296191012968]
Proposed solutions propagate labels from a small set of supervised images to a large set of unsupervised ones to train a deep neural network model.
We present a loop in which a deep neural network (VGG-16) is trained from a set with more correctly labeled samples along iterations.
As the labeled set improves along iterations, it improves the features of the neural network.
arXiv Detail & Related papers (2020-08-02T20:08:54Z)
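To ground a few of the entries above, first a hedged sketch of the loss-clustering idea referenced in the CTRL entry. The premise is that networks fit clean labels earlier than noisy ones, so per-sample training-loss trajectories separate into two clusters and the persistently high-loss cluster flags likely label errors; the paper's two-step procedure is simplified here to a single k-means pass, and all names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def flag_label_errors(loss_history):
    """loss_history: (N, E) array of each sample's training loss over E epochs.
    Returns a boolean mask marking suspected label errors."""
    km = KMeans(n_clusters=2, n_init=10).fit(loss_history)
    # The cluster whose loss trajectories stay high on average is treated
    # as the noisy one.
    noisy_cluster = np.argmax(km.cluster_centers_.mean(axis=1))
    return km.labels_ == noisy_cluster
```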
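Second, the neighborhood-consistency selection from the S3 entry, sketched under the assumption that a sample is kept as clean when its annotated label agrees with a sufficient fraction of its feature-space neighbors; the neighborhood size and agreement threshold below are illustrative, not the paper's values.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def select_clean(features, labels, k=20, agreement=0.5):
    """features: (N, D) embeddings; labels: (N,) annotated labels.
    Returns a boolean mask of samples judged label-consistent (clean)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)      # idx[:, 0] is the sample itself
    neighbor_labels = labels[idx[:, 1:]]  # (N, k) labels of the neighbors
    # Fraction of neighbors that share the sample's own annotated label.
    match = (neighbor_labels == labels[:, None]).mean(axis=1)
    return match >= agreement
```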
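Finally, for the entropy-regularization entry, a generic PyTorch version of such a regularizer: a cross-entropy term pulls predictions toward their pseudo-labels while an entropy penalty discourages uncertain predictions. This is a common pattern, sketched on assumed tensor shapes, not the paper's exact distribution-alignment objective.

```python
import torch
import torch.nn.functional as F

def regularized_pseudo_label_loss(logits, pseudo_labels, beta=0.1):
    """logits: (N, C) model outputs; pseudo_labels: (N,) int64 targets.
    beta weighs the entropy penalty on the prediction distribution."""
    ce = F.cross_entropy(logits, pseudo_labels)
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    return ce + beta * entropy
```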