DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination
- URL: http://arxiv.org/abs/2208.09884v1
- Date: Sun, 21 Aug 2022 13:38:55 GMT
- Title: DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination
- Authors: Tingting Wu, Xiao Ding, Hao Zhang, Jinglong Gao, Li Du, Bing Qin, Ting
Liu
- Abstract summary: Given data with label noise (i.e., incorrect data), deep neural networks would gradually memorize the label noise and impair model performance.
To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
- Score: 28.599571524763785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Given data with label noise (i.e., incorrect data), deep neural networks
would gradually memorize the label noise and impair model performance. To
relieve this issue, curriculum learning is proposed to improve model
performance and generalization by ordering training samples in a meaningful
(e.g., easy to hard) sequence. Previous work takes incorrect samples as generic
hard ones without discriminating between hard samples (i.e., hard samples in
correct data) and incorrect samples. Indeed, a model should learn from hard
samples to promote generalization rather than overfit to incorrect ones. In
this paper, we address this problem by appending a novel loss function,
DiscrimLoss, on top of the existing task loss. Its main effect is to
automatically and stably estimate the importance of easy samples and difficult
samples (including hard and incorrect samples) at the early stages of training
to improve the model performance. Then, during the following stages,
DiscrimLoss is dedicated to discriminating between hard and incorrect samples
to improve the model generalization. Such a training strategy can be formulated
dynamically in a self-supervised manner, effectively mimicking the main
principle of curriculum learning. Experiments on image classification, image
regression, text sequence regression, and event relation reasoning demonstrate
the versatility and effectiveness of our method, particularly in the presence
of diversified noise levels.
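The abstract does not give the exact form of DiscrimLoss, so the following is only a minimal sketch of the general idea under stated assumptions: each training sample gets a learnable log-confidence that scales its task loss relative to a moving threshold, so that samples whose loss stays above the threshold (likely mislabeled) are down-weighted while samples below it (merely hard) keep their gradient. All names (`ConfidenceWeightedLoss`, `s`, `tau`, `lam`) are illustrative, not from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConfidenceWeightedLoss(nn.Module):
    """Illustrative per-sample confidence weighting on top of a task loss.

    Each training sample i has a learnable log-confidence s_i. The objective
        sigma_i * (L_i - tau) + lam * s_i**2,   with sigma_i = exp(s_i),
    pushes sigma_i down for samples whose loss stays above the moving
    threshold tau (likely mislabeled) and up for samples below it (easy or
    merely hard), while the quadratic term anchors s_i near zero.
    """

    def __init__(self, num_samples: int, lam: float = 0.25, momentum: float = 0.9):
        super().__init__()
        self.s = nn.Parameter(torch.zeros(num_samples))  # log-confidence per sample
        self.register_buffer("tau", torch.zeros(()))     # moving loss threshold
        self.lam = lam
        self.momentum = momentum

    def forward(self, logits, targets, indices):
        # per-sample task loss (classification here; any per-sample loss works)
        task_loss = F.cross_entropy(logits, targets, reduction="none")
        # update the threshold from the current batch without tracking gradients
        self.tau = self.momentum * self.tau + (1.0 - self.momentum) * task_loss.detach().mean()
        s = self.s[indices]
        sigma = torch.exp(s)
        # confidence-scaled loss plus the anchoring regularizer
        return (sigma * (task_loss - self.tau) + self.lam * s ** 2).mean()
```

In use, the confidence parameters would be optimized jointly with the model (e.g., as a second optimizer parameter group), and each batch must carry the dataset indices of its examples.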
Related papers
- Foster Adaptivity and Balance in Learning with Noisy Labels [26.309508654960354]
We propose a novel approach named SED to deal with label noise in a Self-adaptivE and class-balanceD manner.
A mean-teacher model is then employed to correct labels of noisy samples.
We additionally propose a self-adaptive and class-balanced sample re-weighting mechanism to assign different weights to detected noisy samples.
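The summary only names the mechanisms; as a rough, hedged illustration of a mean-teacher label-correction step (not SED's actual algorithm, and omitting its class-balanced re-weighting), one might write:

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Mean-teacher update: teacher weights track an EMA of the student's."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)


@torch.no_grad()
def correct_labels(teacher, inputs, noisy_mask, conf_threshold=0.9):
    """Use the teacher's confident predictions as corrected labels for samples
    flagged as noisy; return pseudo-labels, a correction mask, and per-sample
    weights (teacher confidence for corrected samples, 1.0 otherwise)."""
    probs = F.softmax(teacher(inputs), dim=1)
    conf, pseudo = probs.max(dim=1)
    corrected = noisy_mask & (conf > conf_threshold)
    weights = torch.where(corrected, conf, torch.ones_like(conf))
    return pseudo, corrected, weights
```

Here `noisy_mask` would come from whatever noise-detection rule is in use; the confidence threshold is an illustrative choice.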
arXiv Detail & Related papers (2024-07-03T03:10:24Z)
- Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples exhibit differences in the number of epochs required for them to be consistently and correctly classified.
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
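This observation translates naturally into a per-sample statistic. The sketch below (our own, not the paper's Late Stopping procedure) records the earliest epoch from which each sample has been classified correctly at every epoch since; samples that never sustain such a streak are candidates for being mislabeled.

```python
import numpy as np


class ConsistentCorrectnessTracker:
    """Track, per training sample, the earliest epoch from which the model's
    prediction has remained correct at every epoch since.

    Clean examples tend to reach (and keep) such a streak early; mislabeled
    examples reach it late or never, which can serve as a noise signal.
    """

    def __init__(self, num_samples):
        self.first_consistent = np.full(num_samples, -1, dtype=np.int64)

    def update(self, epoch, indices, preds, labels):
        """Call once per epoch with dataset indices, predictions, and labels."""
        indices = np.asarray(indices)
        correct = np.asarray(preds) == np.asarray(labels)
        starting = correct & (self.first_consistent[indices] == -1)
        self.first_consistent[indices[starting]] = epoch   # streak begins
        self.first_consistent[indices[~correct]] = -1      # streak broken

    def likely_mislabeled(self, current_epoch, patience=5):
        """Samples whose correct streak is shorter than `patience` epochs."""
        stable = (self.first_consistent != -1) & \
                 (current_epoch - self.first_consistent >= patience)
        return ~stable
```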
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
- Self-supervised Training Sample Difficulty Balancing for Local Descriptor Learning [1.309716118537215]
In the case of an imbalance between positive and negative samples, hard negative mining strategies have been shown to help models learn more subtle differences.
However, if overly strict mining strategies are applied, there is a risk of introducing false negative samples.
In this paper, we investigate how to trade off the difficulty of the mined samples in order to obtain and exploit high-quality negative samples.
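For concreteness, here is one hedged way to implement such a trade-off for descriptor batches of matching (anchor, positive) pairs: mine the hardest in-batch negative for each anchor, but skip candidates that are suspiciously close, since those may be unlabeled true matches (false negatives). The threshold `min_neg_dist` and the batch layout are assumptions, not the paper's method.

```python
import torch


def hardest_valid_negative(anchors, positives, min_neg_dist=0.3):
    """anchors, positives: (N, D) L2-normalized descriptors; row i is a match.

    Returns, per anchor, the distance to the hardest in-batch negative whose
    distance is at least `min_neg_dist` (closer candidates are treated as
    possible false negatives and ignored)."""
    dist = torch.cdist(anchors, positives)                      # (N, N) distances
    n = dist.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=dist.device)
    cand = dist.masked_fill(eye, float("inf"))                  # drop the true match
    cand = cand.masked_fill(cand < min_neg_dist, float("inf"))  # drop suspect negatives
    return cand.min(dim=1).values


def triplet_loss(anchors, positives, margin=1.0, min_neg_dist=0.3):
    """Triplet margin loss using the mined negatives above."""
    pos_dist = (anchors - positives).norm(dim=1)
    neg_dist = hardest_valid_negative(anchors, positives, min_neg_dist)
    return torch.relu(margin + pos_dist - neg_dist).mean()
```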
arXiv Detail & Related papers (2023-03-10T18:37:43Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Split-PU: Hardness-aware Training Strategy for Positive-Unlabeled Learning [42.26185670834855]
Positive-Unlabeled (PU) learning aims to learn a model with rare positive samples and abundant unlabeled samples.
This paper focuses on improving the commonly-used nnPU with a novel training pipeline.
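The nnPU risk estimator referenced here is standard (Kiryo et al., 2017); a minimal version with the sigmoid loss looks roughly like the following. The clamp is a simplification of the original, which instead flips the gradient when the negative risk goes below zero, and Split-PU's own pipeline is not reproduced.

```python
import torch


def nnpu_loss(scores_p, scores_u, prior):
    """Non-negative PU risk with the sigmoid loss l(z, y) = sigmoid(-y * z).

    scores_p: raw scores g(x) for labeled positive samples, shape (Np,)
    scores_u: raw scores g(x) for unlabeled samples, shape (Nu,)
    prior:    class prior pi_p = P(y = +1), assumed known or estimated
    """
    loss_p_pos = torch.sigmoid(-scores_p).mean()   # l(g(x), +1) on positives
    loss_p_neg = torch.sigmoid(scores_p).mean()    # l(g(x), -1) on positives
    loss_u_neg = torch.sigmoid(scores_u).mean()    # l(g(x), -1) on unlabeled
    neg_risk = loss_u_neg - prior * loss_p_neg     # estimated negative-class risk
    return prior * loss_p_pos + torch.clamp(neg_risk, min=0.0)
```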
arXiv Detail & Related papers (2022-11-30T05:48:31Z)
- Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data [17.7825114228313]
Corrupted labels and class imbalance are commonly encountered in practically collected training data.
Existing approaches alleviate these issues by adopting a sample re-weighting strategy.
However, biased samples with corrupted labels and of tailed classes commonly co-exist in training data.
arXiv Detail & Related papers (2021-12-30T09:20:07Z)
- Density-Based Dynamic Curriculum Learning for Intent Detection [14.653917644725427]
Our model defines each sample's difficulty level according to the density of its eigenvectors.
We apply a dynamic curriculum learning strategy, which pays distinct attention to samples of various difficulty levels.
Experiments on three open datasets verify that the proposed density-based algorithm can distinguish simple and complex samples significantly.
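Taking the "eigenvectors" to mean the samples' feature representations (our reading, not necessarily the paper's), a simple kNN-density difficulty score could be computed as follows; the normalization and the choice of k are illustrative.

```python
import numpy as np


def density_difficulty(embeddings, k=10):
    """Difficulty score per sample from the local density of its embedding.

    Density is the inverse mean distance to the k nearest neighbours; samples
    in sparse regions are scored as harder. Returns values in [0, 1],
    higher = harder. (Quadratic memory; fine for moderate dataset sizes.)
    """
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))       # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)                 # ignore self-distance
    knn = np.sort(dist, axis=1)[:, :k]             # k nearest neighbour distances
    density = 1.0 / (knn.mean(axis=1) + 1e-12)
    scaled = (density - density.min()) / (density.max() - density.min() + 1e-12)
    return 1.0 - scaled                            # low density -> high difficulty
```

A dynamic curriculum would then start training on low-difficulty samples and gradually admit harder ones as epochs progress.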
arXiv Detail & Related papers (2021-08-24T12:29:26Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency)
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
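As a rough sketch of the two-view consistency idea (simplified; Jo-SRC's exact selection and regularization rules are not reproduced), per-sample scores could be derived with a Jensen-Shannon divergence:

```python
import torch
import torch.nn.functional as F


def js_divergence(p, q):
    """Jensen-Shannon divergence between two categorical distributions (per row)."""
    m = 0.5 * (p + q)
    def kl(a, b):
        return (a * (a.clamp_min(1e-12).log() - b.clamp_min(1e-12).log())).sum(dim=1)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def clean_and_ood_scores(logits_v1, logits_v2, labels, num_classes):
    """Per-sample scores from two augmented views of the same inputs:
    cleanness = agreement of the averaged prediction with the given label,
    ood score = disagreement between the two views' predictions."""
    p1 = F.softmax(logits_v1, dim=1)
    p2 = F.softmax(logits_v2, dim=1)
    one_hot = F.one_hot(labels, num_classes).float()
    cleanness = 1.0 - js_divergence(0.5 * (p1 + p2), one_hot)  # high = likely clean
    ood_score = js_divergence(p1, p2)                          # high = inconsistent views
    return cleanness, ood_score
```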
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
- One for More: Selecting Generalizable Samples for Generalizable ReID Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z)
- Salvage Reusable Samples from Noisy Data for Robust Learning [70.48919625304]
We propose a reusable sample selection and correction approach, termed as CRSSC, for coping with label noise in training deep FG models with web images.
Our key idea is to additionally identify and correct reusable samples, and then leverage them together with clean examples to update the networks.
arXiv Detail & Related papers (2020-08-06T02:07:21Z)
- CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition [79.92240030758575]
We propose a novel Adaptive Curriculum Learning loss (CurricularFace) that embeds the idea of curriculum learning into the loss function.
Our CurricularFace adaptively adjusts the relative importance of easy and hard samples during different training stages.
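A hedged sketch of such an adaptive-curriculum margin head is shown below: the target logit receives an additive angular margin, negative logits that exceed it (hard negatives) are re-scaled by (t + cos θ_j), and t grows as an EMA of the batch's mean target cosine so hard samples gain weight in later stages. Edge-case handling (e.g., θ_y + m wrapping past π) is omitted, and the hyperparameters are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CurricularMarginProduct(nn.Module):
    """Sketch of an adaptive-curriculum margin head in the spirit of CurricularFace."""

    def __init__(self, feat_dim, num_classes, m=0.5, s=64.0, alpha=0.01):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        nn.init.xavier_uniform_(self.weight)
        self.m, self.s, self.alpha = m, s, alpha
        self.register_buffer("t", torch.zeros(()))   # curriculum parameter

    def forward(self, features, labels):
        cos = F.linear(F.normalize(features), F.normalize(self.weight)).clamp(-1.0, 1.0)
        cos_y = cos.gather(1, labels.view(-1, 1))                 # target cosine
        cos_y_m = torch.cos(torch.acos(cos_y) + self.m)           # cos(theta_y + m)
        # update t as an EMA of the batch's mean target cosine
        with torch.no_grad():
            self.t = (1 - self.alpha) * self.t + self.alpha * cos_y.mean()
        hard = cos > cos_y_m                                      # hard negative logits
        cos_mod = torch.where(hard, cos * (self.t + cos), cos)    # emphasize hard negatives
        logits = cos_mod.scatter(1, labels.view(-1, 1), cos_y_m)  # place margined positive
        return self.s * logits                                    # feed to cross-entropy
```

The scaled logits returned here would be passed to a standard cross-entropy loss on the backbone embeddings.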
arXiv Detail & Related papers (2020-04-01T08:43:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.