Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data
- URL: http://arxiv.org/abs/2201.00849v1
- Date: Thu, 30 Dec 2021 09:20:07 GMT
- Title: Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data
- Authors: Shenwang Jiang, Jianan Li, Ying Wang, Bo Huang, Zhang Zhang, Tingfa Xu
- Abstract summary: Corrupted labels and class imbalance are commonly encountered in practically collected training data.
Existing approaches alleviate these issues by adopting a sample re-weighting strategy.
However, biased samples with corrupted labels and samples from tail classes commonly co-exist in training data.
- Score: 17.7825114228313
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Corrupted labels and class imbalance are commonly encountered in practically collected training data, and both easily lead to over-fitting of deep neural networks (DNNs). Existing approaches alleviate these issues with a sample re-weighting strategy, which re-weights each sample through a designed weighting function. However, such a strategy applies only to training data containing a single type of bias. In practice, biased samples with corrupted labels and samples from tail classes commonly co-exist in training data, and how to handle them simultaneously is a key but under-explored problem. In this paper, we find that although these two types of biased samples have similar transient losses, they exhibit distinguishable trends and characteristics in their loss curves, which can provide valuable priors for sample weight assignment. Motivated by this, we delve into the loss curves and propose a novel probe-and-allocate training strategy: in the probing stage, we train the network on the whole biased training data without intervention and record the loss curve of each sample as an additional attribute; in the allocating stage, we feed the resulting attribute to a newly designed curve-perception network, named CurveNet, which learns to identify the bias type of each sample and to assign proper weights adaptively through meta-learning. The slow training speed of meta-learning, however, hinders its wider application. To address this, we propose skip-layer meta optimization (SLMO), which accelerates training by skipping the bottom layers during meta optimization. Extensive experiments on synthetic and real-world data validate the proposed method, which achieves state-of-the-art performance on multiple challenging benchmarks.
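The probe-and-allocate strategy lends itself to a compact illustration. Below is a minimal PyTorch sketch, assuming a data loader that also yields sample indices; the CurveNet MLP architecture, the function names, and the SLMO-style detach are illustrative guesses based on the abstract, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CurveNet(nn.Module):
    """Hypothetical curve-perception network: maps a sample's recorded
    loss curve (one loss value per probing epoch) to a weight in (0, 1)."""

    def __init__(self, num_probe_epochs: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_probe_epochs, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, loss_curves: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.mlp(loss_curves)).squeeze(-1)


def probing_stage(model, loader, optimizer, num_epochs, num_samples, device):
    """Probing: train on the whole biased data without intervention and
    record each sample's loss at every epoch as its loss-curve attribute."""
    curves = torch.zeros(num_samples, num_epochs)
    for epoch in range(num_epochs):
        for x, y, idx in loader:  # loader must also yield sample indices
            x, y = x.to(device), y.to(device)
            per_sample = F.cross_entropy(model(x), y, reduction="none")
            curves[idx, epoch] = per_sample.detach().cpu()
            optimizer.zero_grad()
            per_sample.mean().backward()
            optimizer.step()
    return curves


def allocating_loss(model, curve_net, x, y, idx, curves, device):
    """Allocating: CurveNet reads the recorded curves and weights each
    sample's loss. The meta-update of CurveNet on a small clean set is
    omitted here; see slmo_virtual_step for the speed-up idea."""
    x, y = x.to(device), y.to(device)
    weights = curve_net(curves[idx].to(device))
    per_sample = F.cross_entropy(model(x), y, reduction="none")
    return (weights * per_sample).mean()


def slmo_virtual_step(backbone, head, x, y, weights, lr):
    """A guess at skip-layer meta optimization (SLMO): during the virtual
    update that meta-trains CurveNet, detach the backbone features so the
    second-order graph covers only the top layers, skipping the bottom."""
    feats = backbone(x).detach()  # bottom layers excluded from the meta-graph
    per_sample = F.cross_entropy(head(feats), y, reduction="none")
    loss = (weights * per_sample).mean()
    grads = torch.autograd.grad(loss, tuple(head.parameters()), create_graph=True)
    return [p - lr * g for p, g in zip(head.parameters(), grads)]
```

In a full implementation, the virtual parameters returned by slmo_virtual_step would be evaluated on a small clean validation batch, and that validation loss back-propagated into CurveNet.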
Related papers
- Debiased Sample Selection for Combating Noisy Labels [24.296451733127956]
We propose a noIse-Tolerant Expert Model (ITEM) for debiased learning in sample selection.
Specifically, to mitigate the training bias, we design a robust network architecture that integrates multiple experts.
By training on the mixture of two class-discriminative mini-batches, the model mitigates the effect of the imbalanced training set.
arXiv Detail & Related papers (2024-01-24T10:37:28Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Dynamic Loss For Robust Learning [17.33444812274523]
This work presents a novel meta-learning based dynamic loss that automatically adjusts the objective function along with the training process to robustly learn a classifier from long-tailed noisy data.
Our method achieves state-of-the-art accuracy on multiple real-world and synthetic datasets with various types of data biases, including CIFAR-10/100, Animal-10N, ImageNet-LT, and WebVision.
arXiv Detail & Related papers (2022-11-22T01:48:25Z)
- DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrectly labeled samples), deep neural networks gradually memorize the noisy labels, degrading model performance.
To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
arXiv Detail & Related papers (2022-08-21T13:38:55Z)
- Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification [74.62203971625173]
Imbalanced data pose challenges for deep learning based classification models.
One of the most widely-used approaches for tackling imbalanced data is re-weighting.
We propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view; a simplified sketch of distribution-level re-weighting follows this entry.
arXiv Detail & Related papers (2022-08-05T01:23:54Z)
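As a rough intuition for how a distributional view can yield per-sample weights, here is a self-contained sketch that assumes only predicted logits; the cost function and the one-sided (semi-relaxed) transport used here are simplifications for illustration, not the paper's actual OT formulation.

```python
import torch


def ot_style_weights(logits: torch.Tensor, num_classes: int, eps: float = 0.1):
    """Spread a class-balanced target mass over samples: each class holds
    1/num_classes of the mass and distributes it via a softmax of negative
    transport cost; a sample's weight is the total mass it receives,
    re-scaled to mean 1. Tail classes share their fixed mass among fewer
    samples, so their samples are up-weighted."""
    probs = torch.softmax(logits, dim=1)
    cost = -torch.log(probs + 1e-8)            # (n, c) transport cost
    plan = torch.softmax(-cost / eps, dim=0)   # each column sums to 1
    plan = plan / num_classes                  # each column sums to 1/c
    weights = plan.sum(dim=1)                  # mass received per sample
    return weights * logits.shape[0]           # re-scale to mean 1
```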
- Learning from Data with Noisy Labels Using Temporal Self-Ensemble [11.245833546360386]
Deep neural networks (DNNs) have an enormous capacity to memorize noisy labels.
Current state-of-the-art methods present a co-training scheme that trains dual networks using samples associated with small losses.
We propose a simple yet effective robust training scheme that operates by training only a single network.
arXiv Detail & Related papers (2022-07-21T08:16:31Z)
- CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) addresses distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z)
- CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost; a minimal sketch of this per-sample weighting appears after this entry.
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
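To make the mini-batch weighting concrete, here is a hedged sketch of an ABSGD-style loss; the softmax-of-scaled-losses form and the temperature lam are assumptions drawn from the summary above, not a verified reproduction of the paper's formula.

```python
import torch
import torch.nn.functional as F


def absgd_style_loss(logits: torch.Tensor, targets: torch.Tensor, lam: float = 1.0):
    """Weight each sample's loss by a softmax over the in-batch losses.
    With lam > 0, large-loss (hard/tail) samples are up-weighted; a
    negative lam would instead down-weight them, which is useful under
    label noise. Weights are detached so they act as constants."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    weights = torch.softmax(per_sample.detach() / lam, dim=0)
    return (weights * per_sample).sum()
```

The resulting scalar loss can be fed to any optimizer, including momentum SGD, making the scheme a drop-in replacement for the usual mean cross-entropy.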
This list is automatically generated from the titles and abstracts of the papers in this site.