Debiased Sample Selection for Combating Noisy Labels
- URL: http://arxiv.org/abs/2401.13360v2
- Date: Thu, 25 Jan 2024 04:55:08 GMT
- Title: Debiased Sample Selection for Combating Noisy Labels
- Authors: Qi Wei, Lei Feng, Haobo Wang, Bo An
- Abstract summary: We propose a noIse-Tolerant Expert Model (ITEM) for debiased learning in sample selection.
Specifically, to mitigate the training bias, we design a robust network architecture that integrates multiple experts.
By training on the mixture of two class-discriminative mini-batches, the model mitigates the effect of the imbalanced training set.
- Score: 24.296451733127956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning with noisy labels aims to ensure model generalization given a
label-corrupted training set. The sample selection strategy achieves promising
performance by selecting a label-reliable subset for model training. In this
paper, we empirically reveal that existing sample selection methods suffer from
both data bias and training bias, which manifest in practice as imbalanced selected
subsets and accumulated errors, respectively. However, previous studies handled only
the training bias. To address this limitation, we propose a
noIse-Tolerant Expert Model (ITEM) for debiased learning in sample selection.
Specifically, to mitigate the training bias, we design a robust network
architecture that integrates multiple experts. Compared with the prevailing
double-branch network, our network achieves better selection and prediction
performance by ensembling these experts while training fewer parameters.
Meanwhile, to mitigate the data bias, we propose a mixed sampling
strategy based on two weight-based data samplers. By training on the mixture of
two class-discriminative mini-batches, the model mitigates the effect of the
imbalanced training set while avoiding sparse representations that are easily
caused by sampling strategies. Extensive experiments and analyses demonstrate
the effectiveness of ITEM. Our code is available at https://github.com/1998v7/ITEM.
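The abstract describes the multi-expert architecture only at a high level. As a minimal sketch of that idea (not the authors' implementation; the backbone, head structure, and number of experts below are all assumptions), a single shared backbone can feed several lightweight classifier heads whose outputs are ensembled, keeping the parameter count close to a single network rather than duplicating a whole branch:

```python
import torch
import torch.nn as nn


class MultiExpertClassifier(nn.Module):
    """Sketch of a multi-expert head: one shared backbone, K cheap expert heads.

    This is an illustrative reading of "a robust network architecture that
    integrates multiple experts", not the official ITEM model.
    """

    def __init__(self, backbone: nn.Module, feat_dim: int,
                 num_classes: int, num_experts: int = 3):
        super().__init__()
        self.backbone = backbone  # shared feature extractor
        self.experts = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_experts)]
        )  # lightweight per-expert classifiers

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(x)  # (batch, feat_dim)
        logits = torch.stack([head(feats) for head in self.experts], dim=0)
        return logits.mean(dim=0)  # ensemble the experts by averaging logits
```

Unlike a double-branch design, only the small linear heads are duplicated here, so ensembling K experts adds roughly K x feat_dim x num_classes parameters rather than a second full network.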
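Similarly, the mixed sampling strategy is only summarized in the abstract. The following is a rough sketch under the assumption that each sample already has two kinds of weights available (class-rebalancing weights and clean-selection weights; both names and definitions are hypothetical): each mini-batch is formed by drawing half of its examples from each weight-based sampler and concatenating them.

```python
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler


def mixed_batches(dataset, class_balance_weights, selection_weights,
                  batch_size=128, num_workers=2):
    """Yield mini-batches mixed from two weight-based samplers (illustrative only).

    class_balance_weights: per-sample weights, larger for rare classes (assumed).
    selection_weights:     per-sample weights, larger for likely-clean samples (assumed).
    Assumes dataset[i] returns an (input, label) pair.
    """
    half = batch_size // 2
    balanced_sampler = WeightedRandomSampler(class_balance_weights,
                                             num_samples=len(dataset), replacement=True)
    clean_sampler = WeightedRandomSampler(selection_weights,
                                          num_samples=len(dataset), replacement=True)
    balanced_loader = DataLoader(dataset, batch_size=half, sampler=balanced_sampler,
                                 num_workers=num_workers, drop_last=True)
    clean_loader = DataLoader(dataset, batch_size=half, sampler=clean_sampler,
                              num_workers=num_workers, drop_last=True)
    for (x_bal, y_bal), (x_cln, y_cln) in zip(balanced_loader, clean_loader):
        # Concatenate the two class-discriminative halves into one mixed mini-batch.
        yield torch.cat([x_bal, x_cln], dim=0), torch.cat([y_bal, y_cln], dim=0)
```

Mixing the two halves in one batch, rather than training on the rebalanced sampler alone, is one plausible way to avoid the sparse representations that aggressive re-sampling can cause, as the abstract notes.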
Related papers
- Twice Class Bias Correction for Imbalanced Semi-Supervised Learning [59.90429949214134]
We introduce a novel approach called Twice Class Bias Correction (TCBC).
We estimate the class bias of the model parameters during the training process.
We apply a secondary correction to the model's pseudo-labels for unlabeled samples.
arXiv Detail & Related papers (2023-12-27T15:06:36Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Learning with Noisy Labels Using Collaborative Sample Selection and Contrastive Semi-Supervised Learning [76.00798972439004]
Collaborative Sample Selection (CSS) removes noisy samples from the identified clean set.
We introduce a co-training mechanism with a contrastive loss in semi-supervised learning.
arXiv Detail & Related papers (2023-10-24T05:37:20Z)
- A Robust Classifier Under Missing-Not-At-Random Sample Selection Bias [15.628927478079913]
In statistics, Greene's method formulates this type of sample selection with logistic regression as the prediction model.
We propose BiasCorr, an algorithm that improves on Greene's method by modifying the original training set.
We provide a theoretical guarantee for the improvement of BiasCorr over Greene's method by analyzing its bias.
arXiv Detail & Related papers (2023-05-25T01:39:51Z)
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
- Asymmetric Co-teaching with Multi-view Consensus for Noisy Label Learning [15.690502285538411]
We introduce our noisy-label learning approach, called Asymmetric Co-teaching (AsyCo).
AsyCo produces more consistently divergent results from the co-teaching models.
Experiments on synthetic and real-world noisy-label datasets show that AsyCo improves over current SOTA methods.
arXiv Detail & Related papers (2023-01-01T04:10:03Z)
- Delving into Sample Loss Curve to Embrace Noisy and Imbalanced Data [17.7825114228313]
Corrupted labels and class imbalance are commonly encountered in practically collected training data.
Existing approaches alleviate these issues by adopting a sample re-weighting strategy.
However, samples with corrupted labels and samples from tail classes commonly co-exist in training data.
arXiv Detail & Related papers (2021-12-30T09:20:07Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in its selection of unlabeled data.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
- Robust Fairness-aware Learning Under Sample Selection Bias [17.09665420515772]
We propose a framework for robust and fair learning under sample selection bias.
We develop two algorithms to handle sample selection bias when test data is both available and unavailable.
arXiv Detail & Related papers (2021-05-24T23:23:36Z)
- Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency).
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z)
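The Jo-SRC entry above mentions estimating each sample's likelihood of being clean or out-of-distribution from predictions on two different views. As a hedged sketch of that consistency idea (not necessarily Jo-SRC's exact criterion), one can compare the model's softmax outputs on two augmentations with the given label and with each other:

```python
import torch
import torch.nn.functional as F


def js_divergence(p: torch.Tensor, q: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Per-sample Jensen-Shannon divergence between two batches of probability vectors."""
    m = 0.5 * (p + q)
    kl_pm = (p * ((p + eps) / (m + eps)).log()).sum(dim=-1)
    kl_qm = (q * ((q + eps) / (m + eps)).log()).sum(dim=-1)
    return 0.5 * (kl_pm + kl_qm)


@torch.no_grad()
def two_view_scores(model, x_view1, x_view2, labels, num_classes):
    """Illustrative scores: agreement with the label suggests a clean sample,
    disagreement between the two augmented views suggests an out-of-distribution one."""
    p1 = F.softmax(model(x_view1), dim=-1)
    p2 = F.softmax(model(x_view2), dim=-1)
    one_hot = F.one_hot(labels, num_classes).float()
    clean_score = 1.0 - js_divergence(p1, one_hot)  # higher -> more likely clean
    ood_score = js_divergence(p1, p2)               # higher -> more likely out-of-distribution
    return clean_score, ood_score
```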