DST: Data Selection and joint Training for Learning with Noisy Labels
- URL: http://arxiv.org/abs/2103.00813v1
- Date: Mon, 1 Mar 2021 07:23:58 GMT
- Title: DST: Data Selection and joint Training for Learning with Noisy Labels
- Authors: Yi Wei, Xue Mei, Xin Liu, Pengxiang Xu
- Abstract summary: We propose a Data Selection and joint Training (DST) method to automatically select training samples with accurate annotations.
In each iteration, the correctly labeled and correctly predicted samples are reweighted by the corresponding probabilities from the mixture model.
Experiments on CIFAR-10, CIFAR-100 and Clothing1M demonstrate that DST is comparable or superior to state-of-the-art methods.
- Score: 11.0375827306207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training a deep neural network relies heavily on a large amount of training
data with accurate annotations. To alleviate the cost of obtaining such
annotations, various methods have been proposed to annotate the data
automatically. However, automatically generated annotations inevitably contain
noisy labels. In this paper, we
propose a Data Selection and joint Training (DST) method to automatically
select training samples with accurate annotations. Specifically, DST fits a
mixture model according to the original annotation as well as the predicted
label for each training sample, and the mixture model is utilized to
dynamically divide the training data into a correctly labeled set, a
correctly predicted set, and a wrong set. Then, DST is trained on these
sets in a supervised manner. To counter the confirmation bias problem, we
train two networks alternately, and each network establishes the data division
used to teach the other network. In each iteration, the correctly labeled and
correctly predicted samples are reweighted by the corresponding probabilities
from the mixture model, and a uniform distribution is used to generate the
probabilities
of the wrong samples. Experiments on CIFAR-10, CIFAR-100 and Clothing1M
demonstrate that DST is comparable or superior to state-of-the-art
methods.
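To make the division step concrete, here is a minimal sketch, not the authors' implementation: it assumes, as in related work such as DivideMix, that the mixture model is a two-component Gaussian mixture fitted on per-sample cross-entropy losses, and the thresholds, weighting choices, and the function name divide_samples are illustrative assumptions.

```python
# Minimal sketch of mixture-model-based sample division (assumptions: a
# two-component Gaussian mixture over per-sample losses, DivideMix-style;
# thresholds and names are illustrative, not taken from the paper).
import numpy as np
from sklearn.mixture import GaussianMixture

def divide_samples(losses, pred_probs, clean_thr=0.5, conf_thr=0.9):
    """Split sample indices into correctly labeled / correctly predicted / wrong sets.

    losses:     (N,) cross-entropy of each sample w.r.t. its given (noisy) label
    pred_probs: (N,) max softmax probability of the network's current prediction
    """
    # Fit a two-component 1-D Gaussian mixture on the per-sample losses.
    gmm = GaussianMixture(n_components=2, max_iter=20, reg_covar=5e-4)
    gmm.fit(losses.reshape(-1, 1))
    clean_comp = gmm.means_.argmin()            # low-loss component
    p_clean = gmm.predict_proba(losses.reshape(-1, 1))[:, clean_comp]

    labeled = np.where(p_clean > clean_thr)[0]  # trust the given label
    rest = np.where(p_clean <= clean_thr)[0]
    confident = pred_probs[rest] > conf_thr     # trust the network's prediction
    predicted = rest[confident]
    wrong = rest[~confident]

    # Per-sample weights: mixture posterior for the labeled set, prediction
    # confidence for the predicted set, uniform random weights for the wrong set.
    weights = np.zeros_like(losses, dtype=float)
    weights[labeled] = p_clean[labeled]
    weights[predicted] = pred_probs[predicted]
    weights[wrong] = np.random.uniform(size=wrong.size)
    return labeled, predicted, wrong, weights
```

In the alternating scheme described in the abstract, each of the two networks would produce such a division, and the other network would then be trained on the resulting sets.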
Related papers
- Task-customized Masked AutoEncoder via Mixture of Cluster-conditional
Experts [104.9871176044644]
Masked Autoencoder (MAE) is a prevailing self-supervised learning method that achieves promising results in model pre-training.
We propose a novel MAE-based pre-training paradigm, Mixture of Cluster-conditional Experts (MoCE).
MoCE trains each expert only with semantically relevant images by using cluster-conditional gates.
arXiv Detail & Related papers (2024-02-08T03:46:32Z) - Debiased Sample Selection for Combating Noisy Labels [24.296451733127956]
We propose a noIse-Tolerant Expert Model (ITEM) for debiased learning in sample selection.
Specifically, to mitigate the training bias, we design a robust network architecture that integrates with multiple experts.
By training on the mixture of two class-discriminative mini-batches, the model mitigates the effect of the imbalanced training set.
arXiv Detail & Related papers (2024-01-24T10:37:28Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Efficient Online Data Mixing For Language Model Pre-Training [101.45242332613944]
Existing data selection methods are slow and computationally expensive.
Data mixing, on the other hand, reduces the complexity of data selection by grouping data points together.
We develop an efficient algorithm for Online Data Mixing (ODM) that combines elements from both data selection and data mixing.
arXiv Detail & Related papers (2023-12-05T00:42:35Z) - Too Fine or Too Coarse? The Goldilocks Composition of Data Complexity
for Robust Left-Right Eye-Tracking Classifiers [0.0]
We train machine learning models utilizing a mixed dataset composed of both fine- and coarse-grain data.
For our purposes, finer-grain data refers to data collected using more complex methods, whereas coarser-grain data refers to data collected using simpler methods.
arXiv Detail & Related papers (2022-08-24T23:18:08Z) - Learning from Data with Noisy Labels Using Temporal Self-Ensemble [11.245833546360386]
Deep neural networks (DNNs) have an enormous capacity to memorize noisy labels.
Current state-of-the-art methods present a co-training scheme that trains dual networks using samples associated with small losses.
We propose a simple yet effective robust training scheme that operates by training only a single network.
arXiv Detail & Related papers (2022-07-21T08:16:31Z) - CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) addresses distribution shift by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed Class-Aware Feature Alignment (CAFA), which encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z) - A Data Cartography based MixUp for Pre-trained Language Models [47.90235939359225]
MixUp is a data augmentation strategy where additional samples are generated during training by combining random pairs of training samples and their labels (a sketch of this base operation appears after this list).
We propose TDMixUp, a novel MixUp strategy that leverages Training Dynamics and allows more informative samples to be combined for generating new data samples.
We empirically validate that our method not only achieves competitive performance using a smaller subset of the training data compared with strong baselines, but also yields lower expected calibration error on the pre-trained language model, BERT, on both in-domain and out-of-domain settings in a wide range of NLP tasks.
arXiv Detail & Related papers (2022-05-06T17:59:19Z) - DivideMix: Learning with Noisy Labels as Semi-supervised Learning [111.03364864022261]
We propose DivideMix, a framework for learning with noisy labels.
Experiments on multiple benchmark datasets demonstrate substantial improvements over state-of-the-art methods.
arXiv Detail & Related papers (2020-02-18T06:20:06Z)
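As a point of reference for the TDMixUp entry above, the following is a minimal sketch of the underlying MixUp operation only; the training-dynamics-based sample selection that TDMixUp adds is not shown, and the function name and default alpha are illustrative.

```python
# Minimal sketch of vanilla MixUp: interpolate a batch with a shuffled copy of
# itself (assumes one-hot or soft labels; within-batch pairing is a common choice).
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)      # mixing coefficient drawn from Beta(alpha, alpha)
    perm = rng.permutation(len(x))    # random pairing within the batch
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix
```

The mixed inputs and labels are then used in place of the original batch for one training step; larger alpha values produce stronger interpolation between the paired samples.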