CrossSplit: Mitigating Label Noise Memorization through Data Splitting
- URL: http://arxiv.org/abs/2212.01674v2
- Date: Wed, 26 Apr 2023 15:33:27 GMT
- Title: CrossSplit: Mitigating Label Noise Memorization through Data Splitting
- Authors: Jihye Kim, Aristide Baratin, Yan Zhang, Simon Lacoste-Julien
- Abstract summary: We propose a novel training procedure to mitigate the memorization of noisy labels, called CrossSplit.
Experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and mini-WebVision datasets demonstrate that our method can outperform the current state-of-the-art in a wide range of noise ratios.
- Score: 25.344386272010397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We approach the problem of improving robustness of deep learning algorithms in the presence of label noise. Building upon existing label correction and co-teaching methods, we propose a novel training procedure to mitigate the memorization of noisy labels, called CrossSplit, which uses a pair of neural networks trained on two disjoint parts of the labelled dataset. CrossSplit combines two main ingredients: (i) Cross-split label correction. The idea is that, since the model trained on one part of the data cannot memorize example-label pairs from the other part, the training labels presented to each network can be smoothly adjusted by using the predictions of its peer network; (ii) Cross-split semi-supervised training. A network trained on one part of the data also uses the unlabeled inputs of the other part. Extensive experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and mini-WebVision datasets demonstrate that our method can outperform the current state-of-the-art in a wide range of noise ratios.
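To make ingredient (i) concrete, below is a minimal sketch of cross-split label correction in PyTorch. It is illustrative rather than the authors' implementation: the function name is made up, and the fixed mixing weight `alpha` stands in for the paper's per-sample weighting; the semi-supervised ingredient (ii) is not reproduced here.

```python
import torch
import torch.nn.functional as F

def cross_split_label_correction(labels_onehot, peer_logits, alpha=0.5):
    """Soften the training labels of one split with the predictions of the
    peer network. The peer was trained on the other split, so it cannot
    have memorized these example-label pairs. `alpha` is a hypothetical
    fixed mixing weight standing in for the paper's per-sample weighting.
    """
    peer_probs = F.softmax(peer_logits.detach(), dim=-1)  # no grad to peer
    return alpha * labels_onehot + (1.0 - alpha) * peer_probs

# Toy usage: network A trains on split A with targets softened by network B.
labels = F.one_hot(torch.tensor([0, 2, 1]), num_classes=3).float()
targets = cross_split_label_correction(labels, peer_logits=torch.randn(3, 3))
logits_a = torch.randn(3, 3, requires_grad=True)       # network A's outputs
loss = -(targets * F.log_softmax(logits_a, dim=-1)).sum(dim=-1).mean()
```

Detaching the peer logits keeps the label correction from backpropagating into the peer network, so each network only learns from its own split.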
Related papers
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- ProtoCon: Pseudo-label Refinement via Online Clustering and Prototypical Consistency for Efficient Semi-supervised Learning [60.57998388590556]
ProtoCon is a novel method for confidence-based pseudo-labeling.
The online nature of ProtoCon allows it to utilise the label history of the entire dataset in one training cycle.
It delivers significant gains and faster convergence over the state of the art.
arXiv Detail & Related papers (2023-03-22T23:51:54Z)
- Learning from Data with Noisy Labels Using Temporal Self-Ensemble [11.245833546360386]
Deep neural networks (DNNs) have an enormous capacity to memorize noisy labels.
Current state-of-the-art methods employ a co-training scheme that trains dual networks using samples associated with small losses (a generic selection step is sketched after this list).
We propose a simple yet effective robust training scheme that operates by training only a single network.
arXiv Detail & Related papers (2022-07-21T08:16:31Z)
- Synergistic Network Learning and Label Correction for Noise-robust Image Classification [28.27739181560233]
Deep Neural Networks (DNNs) tend to overfit training label noise, resulting in poorer model performance in practice.
We propose a robust label correction framework combining the ideas of small loss selection and noise correction.
We demonstrate our method on both synthetic and real-world datasets with different noise types and rates.
arXiv Detail & Related papers (2022-02-27T23:06:31Z)
- GuidedMix-Net: Semi-supervised Semantic Segmentation by Using Labeled Images as Reference [90.5402652758316]
We propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net.
It uses labeled information to guide the learning of unlabeled instances.
It achieves competitive segmentation accuracy and significantly improves the mIoU by +7% compared to previous approaches.
arXiv Detail & Related papers (2021-12-28T06:48:03Z)
- S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
At the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space (a minimal k-NN version is sketched after this list).
Our method significantly surpasses previous methods on both CIFAR-10/CIFAR-100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
- Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching [60.8427677151492]
We propose CMatch, a Character-level distribution matching method to perform fine-grained adaptation between each character in two domains.
Experiments on the Libri-Adapt dataset show that our proposed approach achieves 14.39% and 16.50% relative Word Error Rate (WER) reductions on cross-device and cross-environment ASR, respectively.
arXiv Detail & Related papers (2021-04-15T14:36:54Z)
- Co-Seg: An Image Segmentation Framework Against Label Corruption [8.219887855003648]
Supervised deep learning performance is heavily tied to the availability of high-quality labels for training.
We propose a novel framework, namely Co-Seg, to collaboratively train segmentation networks on datasets which include low-quality noisy labels.
Our framework can be easily implemented in any segmentation algorithm to increase its robustness to noisy labels.
arXiv Detail & Related papers (2021-01-31T20:01:40Z)
- Combating noisy labels by agreement: A joint training method with co-regularization [27.578738673827658]
We propose a robust learning paradigm called JoCoR, which aims to reduce the diversity of the two networks during training (a sketch of the joint loss appears after this list).
We show that JoCoR is superior to many state-of-the-art approaches for learning with noisy labels.
arXiv Detail & Related papers (2020-03-05T16:42:41Z)
- DivideMix: Learning with Noisy Labels as Semi-supervised Learning [111.03364864022261]
We propose DivideMix, a framework for learning with noisy labels.
Experiments on multiple benchmark datasets demonstrate substantial improvements over state-of-the-art methods.
arXiv Detail & Related papers (2020-02-18T06:20:06Z)
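The small-loss co-training scheme referenced in the Temporal Self-Ensemble and Synergistic Network Learning entries can be stated in a few lines. Below is a generic sketch, not any single paper's implementation; the function name and `keep_ratio` are assumptions.

```python
import torch
import torch.nn.functional as F

def small_loss_selection(logits, labels, keep_ratio=0.7):
    """Indices of the `keep_ratio` fraction of the batch with the smallest
    cross-entropy loss, treated as likely-clean samples."""
    per_sample_loss = F.cross_entropy(logits, labels, reduction="none")
    num_keep = max(1, int(keep_ratio * labels.size(0)))
    return torch.argsort(per_sample_loss)[:num_keep]

# In co-training, each network selects small-loss samples for its peer.
labels = torch.randint(0, 10, (8,))
idx_for_b = small_loss_selection(torch.randn(8, 10), labels)  # A picks for B
idx_for_a = small_loss_selection(torch.randn(8, 10), labels)  # B picks for A
```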
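The S3 entry's selection mechanism, agreement between a sample's annotated label and the labels in its feature-space neighbourhood, can be sketched as a k-nearest-neighbour vote. `k` and `threshold` are assumed hyperparameters and cosine similarity is a guess; S3's actual criterion may differ in detail.

```python
import torch
import torch.nn.functional as F

def neighborhood_consistent(features, labels, k=5, threshold=0.5):
    """Boolean mask marking samples whose label is shared by at least a
    `threshold` fraction of their k nearest neighbours (cosine similarity)."""
    feats = F.normalize(features, dim=-1)
    sims = feats @ feats.T
    sims.fill_diagonal_(float("-inf"))    # a sample is not its own neighbour
    knn = sims.topk(k, dim=-1).indices    # (N, k) neighbour indices
    agree = (labels[knn] == labels.unsqueeze(1)).float().mean(dim=1)
    return agree >= threshold

clean_mask = neighborhood_consistent(torch.randn(32, 64),
                                     torch.randint(0, 10, (32,)))
```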
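Finally, JoCoR's "reduce the diversity of the two networks" corresponds to a joint per-sample loss: supervised cross-entropy for both networks plus an agreement term. The symmetric-KL form and the weight `lam` below are an assumed reading of the paper, and JoCoR's subsequent small-loss selection under this joint loss is omitted.

```python
import torch
import torch.nn.functional as F

def jocor_style_loss(logits_a, logits_b, labels, lam=0.3):
    """Cross-entropy for both networks plus a symmetric-KL co-regularization
    term that pushes their predictions to agree."""
    ce = (F.cross_entropy(logits_a, labels, reduction="none")
          + F.cross_entropy(logits_b, labels, reduction="none"))
    log_pa = F.log_softmax(logits_a, dim=-1)
    log_pb = F.log_softmax(logits_b, dim=-1)
    pa, pb = log_pa.exp(), log_pb.exp()
    sym_kl = (pa * (log_pa - log_pb)).sum(-1) + (pb * (log_pb - log_pa)).sum(-1)
    # JoCoR keeps only the small-loss fraction of the batch here (omitted).
    return ((1 - lam) * ce + lam * sym_kl).mean()

loss = jocor_style_loss(torch.randn(8, 10), torch.randn(8, 10),
                        torch.randint(0, 10, (8,)))
```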