Related papers: S3: Supervised Self-supervised Learning under Label Noise

S3: Supervised Self-supervised Learning under Label Noise

URL: http://arxiv.org/abs/2111.11288v1
Date: Mon, 22 Nov 2021 15:49:20 GMT
Title: S3: Supervised Self-supervised Learning under Label Noise
Authors: Chen Feng, Georgios Tzimiropoulos, Ioannis Patras
Abstract summary: In this paper we address the problem of classification in the presence of label noise. In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space. Our method significantly surpasses previous methods on both CIFARCIFAR100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
Score: 53.02249460567745
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Despite the large progress in supervised learning with Neural Networks, there are significant challenges in obtaining high-quality, large-scale and accurately labeled datasets. In this context, in this paper we address the problem of classification in the presence of label noise and more specifically, both close-set and open-set label noise, that is when the true label of a sample may, or may not belong to the set of the given labels. In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space; a relabeling mechanism that relies on the confidence of the classifier across subsequent iterations; and a training strategy that trains the encoder both with a self-consistency loss and the classifier-encoder with the cross-entropy loss on the selected samples alone. Without bells and whistles, such as co-training so as to reduce the self-confirmation bias, and with robustness with respect to settings of its few hyper-parameters, our method significantly surpasses previous methods on both CIFAR10/CIFAR100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.

Related papers

Mitigating Instance-Dependent Label Noise: Integrating Self-Supervised Pretraining with Pseudo-Label Refinement [3.272177633069322]
Real-world datasets often contain noisy labels due to human error, ambiguity, or resource constraints during the annotation process. We propose a novel framework that combines self-supervised learning using SimCLR with iterative pseudo-label refinement. Our approach significantly outperforms several state-of-the-art methods, particularly under high noise conditions.
arXiv Detail & Related papers (2024-12-06T09:56:49Z)
Mitigating Noisy Supervision Using Synthetic Samples with Soft Labels [13.314778587751588]
Noisy labels are ubiquitous in real-world datasets, especially in the large-scale ones derived from crowdsourcing and web searching. It is challenging to train deep neural networks with noisy datasets since the networks are prone to overfitting the noisy labels during training. We propose a framework that trains the model with new synthetic samples to mitigate the impact of noisy labels.
arXiv Detail & Related papers (2024-06-22T04:49:39Z)
Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching. By setting a manually-specific probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously. Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
Combating Label Noise With A General Surrogate Model For Sample Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically. We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
Adversary-Aware Partial label learning with Label distillation [47.18584755798137]
We present Ad-Aware Partial Label Learning and introduce the $textitrival$, a set of noisy labels, to the collection of candidate labels for each instance. Our method achieves promising results on the CIFAR10, CIFAR100 and CUB200 datasets.
arXiv Detail & Related papers (2023-04-02T10:18:30Z)
Learning from Noisy Labels with Coarse-to-Fine Sample Credibility Modeling [22.62790706276081]
Training deep neural network (DNN) with noisy labels is practically challenging. Previous efforts tend to handle part or full data in a unified denoising flow. We propose a coarse-to-fine robust learning method called CREMA to handle noisy data in a divide-and-conquer manner.
arXiv Detail & Related papers (2022-08-23T02:06:38Z)
Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels. Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias. We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
Mitigating Memorization in Sample Selection for Learning with Noisy Labels [4.679610943608667]
We propose a criteria to penalize dominant-noisy-labeled samples intensively through class-wise penalty labels. Using the proposed sample selection, the learning process of the network becomes significantly robust to noisy labels.
arXiv Detail & Related papers (2021-07-08T06:44:04Z)
An Ensemble Noise-Robust K-fold Cross-Validation Selection Method for Noisy Labels [0.9699640804685629]
Large-scale datasets tend to contain mislabeled samples that can be memorized by deep neural networks (DNNs) We present Ensemble Noise-robust K-fold Cross-Validation Selection (E-NKCVS) to effectively select clean samples from noisy data. We evaluate our approach on various image and text classification tasks where the labels have been manually corrupted with different noise ratios.
arXiv Detail & Related papers (2021-07-06T02:14:52Z)
Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances. Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks. We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.