Salvage Reusable Samples from Noisy Data for Robust Learning
- URL: http://arxiv.org/abs/2008.02427v1
- Date: Thu, 6 Aug 2020 02:07:21 GMT
- Title: Salvage Reusable Samples from Noisy Data for Robust Learning
- Authors: Zeren Sun, Xian-Sheng Hua, Yazhou Yao, Xiu-Shen Wei, Guosheng Hu, Jian Zhang
- Abstract summary: We propose a reusable sample selection and correction approach, termed CRSSC, for coping with label noise in training deep FG models with web images.
Our key idea is to additionally identify and correct reusable samples, and then leverage them together with clean examples to update the networks.
- Score: 70.48919625304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the presence of label noise in web images and the high memorization
capacity of deep neural networks, training deep fine-grained (FG) models
directly on web images tends to yield inferior recognition ability. In
the literature, loss correction methods alleviate this issue by estimating
the noise transition matrix, but inevitable false corrections cause severe
accumulated errors. Sample selection methods identify clean ("easy") samples
on the basis that small-loss samples are likely correctly labeled, which
alleviates the accumulated errors. However, "hard" and mislabeled examples,
both of which can boost the robustness of FG models, are also dropped. To
this end, we propose a certainty-based reusable sample selection and
correction approach, termed CRSSC, for coping with label noise when training
deep FG models with web images. Our key idea is to additionally identify and
correct reusable samples, and then leverage them together with clean examples
to update the networks. We demonstrate the superiority of the proposed
approach from both theoretical and experimental perspectives.
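The selection-and-correction idea in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's exact procedure: the function name, the fixed clean fraction, and the use of max-softmax confidence as the certainty measure are all assumptions made for the sketch.

```python
import numpy as np

def select_and_correct(losses, probs, labels, clean_frac=0.5, certainty_thresh=0.9):
    """Sketch of small-loss selection plus certainty-based correction.

    losses: (N,) per-sample training losses
    probs:  (N, C) softmax predictions of the current model
    labels: (N,) observed (possibly noisy) labels
    Returns indices of kept samples and their (possibly corrected) labels.
    """
    n = len(losses)
    order = np.argsort(losses)
    n_clean = int(clean_frac * n)
    clean_idx = order[:n_clean]          # small-loss samples treated as clean
    rest_idx = order[n_clean:]           # large-loss: hard or mislabeled

    # Among large-loss samples, keep those the model predicts with
    # high certainty; these are the "reusable" samples.
    confidence = probs[rest_idx].max(axis=1)
    reusable = rest_idx[confidence >= certainty_thresh]

    # Correct reusable samples with the model's confident prediction.
    corrected = labels.copy()
    corrected[reusable] = probs[reusable].argmax(axis=1)

    keep = np.concatenate([clean_idx, reusable])
    return keep, corrected[keep]
```

Low-confidence large-loss samples are dropped entirely, so the update uses clean samples with their original labels plus reusable samples with corrected labels.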
Related papers
- An accurate detection is not all you need to combat label noise in web-noisy datasets [23.020126612431746]
We show that direct estimation of the separating hyperplane can indeed offer an accurate detection of OOD samples.
We propose a hybrid solution that alternates between noise detection using linear separation and a state-of-the-art (SOTA) small-loss approach.
arXiv Detail & Related papers (2024-07-08T00:21:42Z)
- Learning with Imbalanced Noisy Data by Preventing Bias in Sample Selection [82.43311784594384]
Real-world datasets contain not only noisy labels but also class imbalance.
We propose a simple yet effective method to address noisy labels in imbalanced datasets.
arXiv Detail & Related papers (2024-02-17T10:34:53Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Combating Label Noise With A General Surrogate Model For Sample Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically.
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework, ReScore, that boosts causal discovery performance by dynamically learning adaptive weights for a reweighted score function.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrectly labeled data), deep neural networks gradually memorize the label noise, which impairs model performance.
To relieve this issue, curriculum learning is proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
arXiv Detail & Related papers (2022-08-21T13:38:55Z)
- Confidence Adaptive Regularization for Deep Learning with Noisy Labels [2.0349696181833337]
Recent studies on the memorization effects of deep neural networks on noisy labels show that the networks first fit the correctly-labeled training samples before memorizing the mislabeled samples.
Motivated by this early-learning phenomenon, we propose a novel method to prevent memorization of the mislabeled samples.
We provide the theoretical analysis and conduct the experiments on synthetic and real-world datasets, demonstrating that our approach achieves comparable results to the state-of-the-art methods.
arXiv Detail & Related papers (2021-08-18T15:51:25Z)
- Transform consistency for learning with noisy labels [9.029861710944704]
We propose a method to identify clean samples only using one single network.
Clean samples tend to reach consistent predictions for the original images and the transformed images.
In order to mitigate the negative influence of noisy labels, we design a classification loss by using the off-line hard labels and on-line soft labels.
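The two ingredients this summary mentions, consistency-based clean-sample identification and a loss blending off-line hard labels with on-line soft labels, can be sketched as follows. The agreement criterion (matching argmax) and the linear mixing with weight `alpha` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def consistent_clean_mask(probs_orig, probs_aug):
    """Mark samples whose predicted class agrees between an image and
    its transformed version; clean samples tend to stay consistent."""
    return probs_orig.argmax(axis=1) == probs_aug.argmax(axis=1)

def blended_targets(hard_labels, probs, num_classes, alpha=0.7):
    """Mix off-line hard labels with the model's on-line soft predictions.

    alpha weights the one-hot hard label; (1 - alpha) weights the
    current soft prediction.
    """
    one_hot = np.eye(num_classes)[hard_labels]
    return alpha * one_hot + (1.0 - alpha) * probs
```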
arXiv Detail & Related papers (2021-03-25T14:33:13Z)
- Robust and On-the-fly Dataset Denoising for Image Classification [72.10311040730815]
On-the-fly Data Denoising (ODD) is robust to mislabeled examples, while introducing almost zero computational overhead compared to standard training.
ODD is able to achieve state-of-the-art results on a wide range of datasets including real-world ones such as WebVision and Clothing1M.
arXiv Detail & Related papers (2020-03-24T03:59:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.