Understanding (Generalized) Label Smoothing when Learning with Noisy Labels
- URL: http://arxiv.org/abs/2106.04149v2
- Date: Wed, 9 Jun 2021 00:52:34 GMT
- Title: Understanding (Generalized) Label Smoothing when Learning with Noisy Labels
- Authors: Jiaheng Wei, Hangyu Liu, Tongliang Liu, Gang Niu and Yang Liu
- Abstract summary: Label smoothing (LS) is an emerging learning paradigm that uses a positively weighted average of both the hard training labels and uniformly distributed soft labels.
We provide an understanding of the properties of generalized label smoothing (GLS) when learning with noisy labels.
- Score: 57.37057235894054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Label smoothing (LS) is an emerging learning paradigm that uses a
positively weighted average of both the hard training labels and uniformly
distributed soft labels. LS has been shown to serve as a regularizer for
training data with hard labels and therefore to improve the generalization of
the model. It was later reported that LS even helps improve robustness when
learning with noisy labels. However, we observe that the advantage of LS
vanishes when we operate in a high label noise regime. Puzzled by this
observation, we proceeded to discover that several learning-with-noisy-labels
solutions proposed in the literature instead relate more closely to negative
label smoothing (NLS), which we define as using a negative weight to combine
the hard and soft labels. We show that NLS functions substantially differently
from LS in the model confidence it achieves. To differentiate the two cases,
we call LS positive label smoothing (PLS), and this paper unifies PLS and NLS
into generalized label smoothing (GLS). We provide an understanding of the
properties of GLS when learning with noisy labels. Among other established
properties, we theoretically show that NLS is more beneficial when the label
noise rates are high. We also provide experimental results to support our
findings.
Related papers
- Learning Label Refinement and Threshold Adjustment for Imbalanced Semi-Supervised Learning [6.904448748214652]
Semi-supervised learning algorithms struggle to perform well when exposed to imbalanced training data.
We introduce SEmi-supervised learning with pseudo-label optimization based on VALidation data (SEVAL).
SEVAL adapts to specific tasks with improved pseudo-label accuracy and ensures pseudo-label correctness on a per-class basis.
arXiv Detail & Related papers (2024-07-07T13:46:22Z)
- Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation [87.17768598044427]
Traditional semi-supervised learning assumes that the feature distributions of labeled and unlabeled data are consistent.
We propose Self-Supervised Feature Adaptation (SSFA), a generic framework for improving SSL performance when labeled and unlabeled data come from different distributions.
Our proposed SSFA is applicable to various pseudo-label-based SSL learners and significantly improves performance in labeled, unlabeled, and even unseen distributions.
arXiv Detail & Related papers (2024-05-31T03:13:45Z)
- InstanT: Semi-supervised Learning with Instance-dependent Thresholds [75.91684890150283]
We propose the study of instance-dependent thresholds, which have the highest degree of freedom compared with existing methods.
We devise a novel instance-dependent threshold function for all unlabeled instances by utilizing their instance-level ambiguity and the instance-dependent error rates of pseudo-labels.
arXiv Detail & Related papers (2023-10-29T05:31:43Z)
- BadLabel: A Robust Perspective on Evaluating and Enhancing Label-noise Learning [113.8799653759137]
We introduce a novel label noise type called BadLabel, which can degrade the performance of existing LNL algorithms by a large margin.
BadLabel is crafted based on the label-flipping attack against standard classification.
We propose a robust LNL method that perturbs the labels in an adversarial manner at each epoch to make the loss values of clean and noisy labels again distinguishable.
arXiv Detail & Related papers (2023-05-28T06:26:23Z)
- CLS: Cross Labeling Supervision for Semi-Supervised Learning [9.929229055862491]
Cross Labeling Supervision (CLS) is a framework that generalizes the typical pseudo-labeling process.
CLS allows the creation of both pseudo and complementary labels to support both positive and negative learning.
arXiv Detail & Related papers (2022-02-17T08:09:40Z)
- L2B: Learning to Bootstrap Robust Models for Combating Label Noise [52.02335367411447]
This paper introduces a simple and effective method named Learning to Bootstrap (L2B).
It enables models to bootstrap themselves using their own predictions without being adversely affected by erroneous pseudo-labels.
It achieves this by dynamically adjusting the importance weight between real observed and generated labels, as well as between different samples through meta-learning.
arXiv Detail & Related papers (2022-02-09T05:57:08Z)
- Demystifying How Self-Supervised Features Improve Training from Noisy Labels [16.281091780103736]
We study why and how self-supervised features help networks resist label noise.
Our result shows that, given a quality encoder pre-trained from SSL, a simple linear layer trained by the cross-entropy loss is theoretically robust to symmetric label noise.
arXiv Detail & Related papers (2021-10-18T05:41:57Z)
- Weakly Supervised Label Smoothing [15.05158252504978]
We study Label Smoothing (LS), a widely used regularization technique, in the context of neural learning to rank (L2R) models.
Inspired by our investigation of LS in the context of neural L2R models, we propose a novel technique called Weakly Supervised Label Smoothing (WSLS).
arXiv Detail & Related papers (2020-12-15T19:36:52Z)
- Regularization via Adaptive Pairwise Label Smoothing [19.252319300590653]
This paper introduces a novel label smoothing technique called Pairwise Label Smoothing (PLS).
Unlike current LS methods, which typically require finding a global smoothing distribution mass through a cross-validation search, PLS automatically learns the distribution mass for each input pair during training.
We empirically show that PLS significantly outperforms LS and the baseline models, achieving up to 30% of relative classification error reduction.
arXiv Detail & Related papers (2020-12-02T22:08:10Z)
- Does label smoothing mitigate label noise? [57.76529645344897]
We show that label smoothing is competitive with loss-correction under label noise.
We show that when distilling models from noisy data, label smoothing of the teacher is beneficial.
arXiv Detail & Related papers (2020-03-05T18:43:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.