Noisy Label Learning for Large-scale Medical Image Classification
- URL: http://arxiv.org/abs/2103.04053v1
- Date: Sat, 6 Mar 2021 07:42:36 GMT
- Title: Noisy Label Learning for Large-scale Medical Image Classification
- Authors: Fengbei Liu, Yu Tian, Filipe R. Cordeiro, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro
- Abstract summary: We adapt a state-of-the-art noisy-label multi-class training approach to learn a multi-label classifier for the dataset Chest X-ray14.
We show that the majority of label noise on Chest X-ray14 is present in the class 'No Finding', which is intuitively correct because this is the most likely class to contain one or more of the 14 diseases due to labelling mistakes.
- Score: 37.79118840129632
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The classification accuracy of deep learning models depends not only on the
size of their training sets, but also on the quality of their labels. In
medical image classification, large-scale datasets are becoming abundant, but
their labels will be noisy when they are automatically extracted from radiology
reports using natural language processing tools. Given that deep learning
models can easily overfit these noisy-label samples, it is important to study
training approaches that can handle label noise. In this paper, we adapt a
state-of-the-art (SOTA) noisy-label multi-class training approach to learn a
multi-label classifier for the dataset Chest X-ray14, which is a large-scale
dataset known to contain label noise in the training set. Given that this
dataset also has label noise in the testing set, we propose a new theoretically
sound method to estimate the performance of the model on hidden clean testing
data, given the result on the noisy testing data. Using our clean data
performance estimation, we notice that the majority of label noise on Chest
X-ray14 is present in the class 'No Finding', which is intuitively correct
because this is the most likely class to contain one or more of the 14 diseases
due to labelling mistakes.
Related papers
- Training Gradient Boosted Decision Trees on Tabular Data Containing Label Noise for Classification Tasks [1.261491746208123]
This study aims to investigate the effects of label noise on gradient-boosted decision trees and methods to mitigate those effects.
The implemented methods demonstrate state-of-the-art noise detection performance on the Adult dataset and achieve the highest classification precision and recall on the Adult and Breast Cancer datasets.
arXiv Detail & Related papers (2024-09-13T09:09:24Z)
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specific probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- Learning to Detect Noisy Labels Using Model-Based Features [16.681748918518075]
We propose Selection-Enhanced Noisy label Training (SENT).
SENT does not rely on meta learning while having the flexibility of being data-driven.
It improves performance over strong baselines under the settings of self-training and label corruption.
arXiv Detail & Related papers (2022-12-28T10:12:13Z)
- BoMD: Bag of Multi-label Descriptors for Noisy Chest X-ray Classification [25.76256302330625]
New medical imaging classification problems may need to rely on machine-generated noisy labels extracted from radiology reports.
Current noisy-label learning methods designed for multi-class problems cannot be easily adapted.
We propose a new method designed for the noisy multi-label CXR learning, which detects and smoothly re-labels samples from the dataset.
arXiv Detail & Related papers (2022-03-03T08:04:59Z)
- Learning to Aggregate and Refine Noisy Labels for Visual Sentiment Analysis [69.48582264712854]
We propose a robust learning method to perform robust visual sentiment analysis.
Our method relies on an external memory to aggregate and filter noisy labels during training.
We establish a benchmark for visual sentiment analysis with label noise using publicly available datasets.
arXiv Detail & Related papers (2021-09-15T18:18:28Z)
- Co-Correcting: Noise-tolerant Medical Image Classification via mutual Label Correction [5.994566233473544]
This paper proposes a noise-tolerant medical image classification framework named Co-Correcting.
It significantly improves classification accuracy and obtains more accurate labels through dual-network mutual learning, label probability estimation, and curriculum label correcting.
Experiments show that Co-Correcting achieves the best accuracy and generalization under different noise ratios in various tasks.
arXiv Detail & Related papers (2021-09-11T02:09:52Z)
- Noisy Labels Can Induce Good Representations [53.47668632785373]
We study how architecture affects learning with noisy labels.
We show that training with noisy labels can induce useful hidden representations, even when the model generalizes poorly.
This finding leads to a simple method to improve models trained on noisy labels.
arXiv Detail & Related papers (2020-12-23T18:58:05Z)
- Error-Bounded Correction of Noisy Labels [17.510654621245656]
We show that the prediction of a noisy classifier can indeed be a good indicator of whether the label of a training data is clean.
Based on the theoretical result, we propose a novel algorithm that corrects the labels based on the noisy classifier prediction.
We incorporate our label correction algorithm into the training of deep neural networks and train models that achieve superior testing performance on multiple public datasets.
arXiv Detail & Related papers (2020-11-19T19:23:23Z)
- Attention-Aware Noisy Label Learning for Image Classification [97.26664962498887]
Deep convolutional neural networks (CNNs) learned on large-scale labeled samples have achieved remarkable progress in computer vision.
The cheapest way to obtain a large body of labeled visual data is to crawl from websites with user-supplied labels, such as Flickr.
This paper proposes the attention-aware noisy label learning approach to improve the discriminative capability of the network trained on datasets with potential label noise.
arXiv Detail & Related papers (2020-09-30T15:45:36Z)
- Learning with Out-of-Distribution Data for Audio Classification [60.48251022280506]
We show that detecting and relabelling certain OOD instances, rather than discarding them, can have a positive effect on learning.
The proposed method is shown to improve the performance of convolutional neural networks by a significant margin.
arXiv Detail & Related papers (2020-02-11T21:08:06Z)