Instance Correction for Learning with Open-set Noisy Labels
- URL: http://arxiv.org/abs/2106.00455v1
- Date: Tue, 1 Jun 2021 13:05:55 GMT
- Title: Instance Correction for Learning with Open-set Noisy Labels
- Authors: Xiaobo Xia, Tongliang Liu, Bo Han, Mingming Gong, Jun Yu, Gang Niu,
Masashi Sugiyama
- Abstract summary: We use the sample selection approach to handle open-set noisy labels.
The discarded data are seen to be mislabeled and do not participate in training.
We use instance correction to modify the instances of the discarded data, making their predictions consistent with the given labels.
- Score: 145.06552420999986
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of open-set noisy labels denotes that part of the training data has a label space that does not contain the true classes. Many approaches, e.g., loss correction and label correction, cannot handle such open-set noisy labels well, since they require the training data and test data to share the same label space, which does not hold when learning with open-set noisy labels. State-of-the-art methods therefore employ the sample selection approach, which tries to select clean data from the noisy data for network parameter updates. The discarded data are regarded as mislabeled and do not participate in training. Such an approach is intuitive and reasonable at first glance. However, a natural question arises: can such data only be discarded during training? In this paper, we show that the answer is no. Specifically, we argue that the instances of the discarded data can contain meaningful information for generalization. For this reason, we do not abandon such data; instead, we use instance correction to modify the instances of the discarded data so that the predictions for the discarded data become consistent with the given labels. Instance correction is performed by targeted adversarial attacks. The corrected data are then exploited in training to help generalization. In addition to the analytical results, extensive empirical evidence is provided to justify our claims.
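The instance correction step described in the abstract can be sketched as a targeted attack on the input rather than on the model weights. Below is a minimal, hypothetical illustration using a two-class logistic model and plain gradient steps on the instance; the paper itself works with deep networks and targeted adversarial attacks, and the `predict_proba`/`correct_instance` names are assumptions for illustration, not the authors' API.

```python
import math

def predict_proba(w, b, x):
    """Probability of class 1 under a simple logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def correct_instance(w, b, x, given_label, step=0.1, n_steps=50):
    """Nudge instance x so the model's prediction matches given_label.

    For a logistic model, the input gradient of the cross-entropy loss
    toward the target label is (p - y) * w; descending it is a crude
    targeted attack on the instance (the weights stay fixed).
    """
    x = list(x)
    for _ in range(n_steps):
        p = predict_proba(w, b, x)
        for i in range(len(x)):
            x[i] -= step * (p - given_label) * w[i]
    return x

# A "discarded" instance the model predicts as class 0,
# whose given (possibly noisy) label is 1.
w, b = [1.0, -2.0], 0.0
x = [-1.0, 1.0]                        # model output sigmoid(-3), i.e. class 0
x_corr = correct_instance(w, b, x, given_label=1)
print(predict_proba(w, b, x_corr))     # now above 0.5, consistent with label 1
```

In the paper's setting the corrected instances, now consistent with their given labels, rejoin training instead of being thrown away.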
Related papers
- FlatMatch: Bridging Labeled Data and Unlabeled Data with Cross-Sharpness for Semi-Supervised Learning [73.13448439554497]
Semi-Supervised Learning (SSL) has been an effective way to leverage abundant unlabeled data with extremely scarce labeled data.
Most SSL methods are commonly based on instance-wise consistency between different data transformations.
We propose FlatMatch which minimizes a cross-sharpness measure to ensure consistent learning performance between the two datasets.
arXiv Detail & Related papers (2023-10-25T06:57:59Z)
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
- OpenCoS: Contrastive Semi-supervised Learning for Handling Open-set Unlabeled Data [65.19205979542305]
Unlabeled data may include out-of-class samples in practice.
OpenCoS is a method for handling this realistic semi-supervised learning scenario.
arXiv Detail & Related papers (2021-06-29T06:10:05Z)
- A Survey on Semi-Supervised Learning for Delayed Partially Labelled Data Streams [10.370629574634092]
This survey pays special attention to methods that leverage unlabelled data in a semi-supervised setting.
We discuss the delayed labelling issue, which impacts both fully supervised and semi-supervised methods.
arXiv Detail & Related papers (2021-06-16T23:14:20Z)
- A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels [49.990938653249415]
This research presents a methodology that assigns initial pseudo-labels to unlabeled data, treats them as noisy-labeled data, and trains a deep neural network on the result.
Experimental results demonstrate that the proposed method significantly outperforms the state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-03-08T11:46:02Z)
- Error-Bounded Correction of Noisy Labels [17.510654621245656]
We show that the prediction of a noisy classifier can indeed be a good indicator of whether the label of a training example is clean.
Based on the theoretical result, we propose a novel algorithm that corrects the labels based on the noisy classifier prediction.
We incorporate our label correction algorithm into the training of deep neural networks and train models that achieve superior testing performance on multiple public datasets.
arXiv Detail & Related papers (2020-11-19T19:23:23Z)
- Learning with Out-of-Distribution Data for Audio Classification [60.48251022280506]
We show that detecting and relabelling certain OOD instances, rather than discarding them, can have a positive effect on learning.
The proposed method is shown to improve the performance of convolutional neural networks by a significant margin.
arXiv Detail & Related papers (2020-02-11T21:08:06Z)
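The relabelling idea shared by the last two entries, i.e. replacing a given label when a trained classifier confidently disagrees with it, can be sketched as follows. The confidence threshold and function name are illustrative assumptions, not taken from either paper (Error-Bounded Correction derives a theoretical bound for this decision).

```python
def correct_labels(probs, labels, threshold=0.9):
    """Replace a given label with the classifier's prediction when the
    classifier is confident that the predicted class differs from it."""
    corrected = []
    for p, y in zip(probs, labels):
        pred = max(range(len(p)), key=lambda k: p[k])  # argmax class
        if pred != y and p[pred] >= threshold:
            corrected.append(pred)   # trust the confident prediction
        else:
            corrected.append(y)      # keep the original label
    return corrected

probs = [[0.95, 0.05],   # confidently class 0, label says 1 -> relabel
         [0.40, 0.60],   # disagrees but not confident -> keep label
         [0.10, 0.90]]   # agrees with label -> keep label
labels = [1, 0, 1]
print(correct_labels(probs, labels))  # → [0, 0, 1]
```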
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.