Boosting Semi-Supervised Face Recognition with Noise Robustness
- URL: http://arxiv.org/abs/2105.04431v1
- Date: Mon, 10 May 2021 14:43:11 GMT
- Title: Boosting Semi-Supervised Face Recognition with Noise Robustness
- Authors: Yuchi Liu, Hailin Shi, Hang Du, Rui Zhu, Jun Wang, Liang Zheng, and
Tao Mei
- Abstract summary: This paper presents an effective solution to semi-supervised face recognition that is robust to the label noise aroused by the auto-labelling.
We develop a semi-supervised face recognition solution, named Noise Robust Learning-Labelling (NRoLL), which is based on the robust training ability empowered by GN.
- Score: 54.342992887966616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although deep face recognition benefits significantly from large-scale
training data, a current bottleneck is the labelling cost. A feasible solution
to this problem is semi-supervised learning, exploiting a small portion of
labelled data and large amounts of unlabelled data. The major challenge,
however, is the accumulated label errors through auto-labelling, compromising
the training. This paper presents an effective solution to semi-supervised face
recognition that is robust to the label noise aroused by the auto-labelling.
Specifically, we introduce a multi-agent method, named GroupNet (GN), to endow
our solution with the ability to identify the wrongly labelled samples and
preserve the clean samples. We show that GN alone achieves the leading accuracy
in traditional supervised face recognition even when the noisy labels take over
50\% of the training data. Further, we develop a semi-supervised face
recognition solution, named Noise Robust Learning-Labelling (NRoLL), which is
based on the robust training ability empowered by GN. It starts with a small
amount of labelled data and consequently conducts high-confidence labelling on
a large amount of unlabelled data to boost further training. The more data is
labelled by NRoLL, the higher confidence is with the label in the dataset. To
evaluate the competitiveness of our method, we run NRoLL with a rough condition
that only one-fifth of the labelled MSCeleb is available and the rest is used
as unlabelled data. On a wide range of benchmarks, our method compares
favorably against the state-of-the-art methods.
Related papers
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and
Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z) - SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised
Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z) - Pseudo-Label Noise Suppression Techniques for Semi-Supervised Semantic
Segmentation [21.163070161951868]
Semi-consuming learning (SSL) can reduce the need for large labelled datasets by incorporating unsupervised data into the training.
Current SSL approaches use an initially supervised trained model to generate predictions for unlabelled images, called pseudo-labels.
We use three mechanisms to control pseudo-label noise and errors.
arXiv Detail & Related papers (2022-10-19T09:46:27Z) - Informative Pseudo-Labeling for Graph Neural Networks with Few Labels [12.83841767562179]
Graph Neural Networks (GNNs) have achieved state-of-the-art results for semi-supervised node classification on graphs.
The challenge of how to effectively learn GNNs with very few labels is still under-explored.
We propose a novel informative pseudo-labeling framework, called InfoGNN, to facilitate learning of GNNs with extremely few labels.
arXiv Detail & Related papers (2022-01-20T01:49:30Z) - S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space.
Our method significantly surpasses previous methods on both CIFARCIFAR100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z) - An Ensemble Noise-Robust K-fold Cross-Validation Selection Method for
Noisy Labels [0.9699640804685629]
Large-scale datasets tend to contain mislabeled samples that can be memorized by deep neural networks (DNNs)
We present Ensemble Noise-robust K-fold Cross-Validation Selection (E-NKCVS) to effectively select clean samples from noisy data.
We evaluate our approach on various image and text classification tasks where the labels have been manually corrupted with different noise ratios.
arXiv Detail & Related papers (2021-07-06T02:14:52Z) - A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels [49.990938653249415]
This research presents a methodology that assigns initial pseudo-labels to unlabeled data which is used as noisy-labeled data, and trains a deep neural network using the noisy-labeled data.
Experimental results demonstrate that the proposed method significantly outperforms the state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-03-08T11:46:02Z) - In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label
Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z) - Self-semi-supervised Learning to Learn from NoisyLabeled Data [3.18577806302116]
It is costly to obtain high-quality human-labeled data, leading to the active research area of training models robust to noisy labels.
In this project, we designed methods to more accurately differentiate clean and noisy labels and borrowed the wisdom of self-semi-supervised learning to train noisy labeled data.
arXiv Detail & Related papers (2020-11-03T02:31:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.