A Conformal Prediction Score that is Robust to Label Noise
- URL: http://arxiv.org/abs/2405.02648v2
- Date: Tue, 21 May 2024 13:06:56 GMT
- Title: A Conformal Prediction Score that is Robust to Label Noise
- Authors: Coby Penso, Jacob Goldberger,
- Abstract summary: We introduce a conformal score that is robust to label noise.
The noise-free conformal score is estimated using the noisy labeled data and the noise level.
We show that our method outperforms current methods by a large margin, in terms of the average size of the prediction set.
- Score: 13.22445242068721
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conformal Prediction (CP) quantifies network uncertainty by building a small prediction set with a pre-defined probability that the correct class is within this set. In this study we tackle the problem of CP calibration based on a validation set with noisy labels. We introduce a conformal score that is robust to label noise. The noise-free conformal score is estimated using the noisy labeled data and the noise level. In the test phase the noise-free score is used to form the prediction set. We applied the proposed algorithm to several standard medical imaging classification datasets. We show that our method outperforms current methods by a large margin, in terms of the average size of the prediction set, while maintaining the required coverage.
Related papers
- A conformalized learning of a prediction set with applications to medical imaging classification [14.304858613146536]
We present an algorithm that can produce a prediction set containing the true label with a user-specified probability, such as 90%.
We applied the proposed algorithm to several standard medical imaging classification datasets.
arXiv Detail & Related papers (2024-08-09T12:49:04Z) - Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specific probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z) - Learning to Correct Noisy Labels for Fine-Grained Entity Typing via
Co-Prediction Prompt Tuning [9.885278527023532]
We introduce Co-Prediction Prompt Tuning for noise correction in FET.
We integrate prediction results to recall labeled labels and utilize a differentiated margin to identify inaccurate labels.
Experimental results on three widely-used FET datasets demonstrate that our noise correction approach significantly enhances the quality of training samples.
arXiv Detail & Related papers (2023-10-23T06:04:07Z) - PAC Prediction Sets Under Label Shift [52.30074177997787]
Prediction sets capture uncertainty by predicting sets of labels rather than individual labels.
We propose a novel algorithm for constructing prediction sets with PAC guarantees in the label shift setting.
We evaluate our approach on five datasets.
arXiv Detail & Related papers (2023-10-19T17:57:57Z) - Label Noise: Correcting the Forward-Correction [0.0]
Training neural network classifiers on datasets with label noise poses a risk of overfitting them to the noisy labels.
We propose an approach to tackling overfitting caused by label noise.
Motivated by this observation, we propose imposing a lower bound on the training loss to mitigate overfitting.
arXiv Detail & Related papers (2023-07-24T19:41:19Z) - Neighborhood Collective Estimation for Noisy Label Identification and
Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z) - S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space.
Our method significantly surpasses previous methods on both CIFARCIFAR100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z) - Towards Robustness to Label Noise in Text Classification via Noise
Modeling [7.863638253070439]
Large datasets in NLP suffer from noisy labels, due to erroneous automatic and human annotation procedures.
We study the problem of text classification with label noise, and aim to capture this noise through an auxiliary noise model over the classifier.
arXiv Detail & Related papers (2021-01-27T05:41:57Z) - A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z) - Unsupervised Domain Adaptation for Acoustic Scene Classification Using
Band-Wise Statistics Matching [69.24460241328521]
Machine learning algorithms can be negatively affected by mismatches between training (source) and test (target) data distributions.
We propose an unsupervised domain adaptation method that consists of aligning the first- and second-order sample statistics of each frequency band of target-domain acoustic scenes to the ones of the source-domain training dataset.
We show that the proposed method outperforms the state-of-the-art unsupervised methods found in the literature in terms of both source- and target-domain classification accuracy.
arXiv Detail & Related papers (2020-04-30T23:56:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.