Label-Retrieval-Augmented Diffusion Models for Learning from Noisy
Labels
- URL: http://arxiv.org/abs/2305.19518v2
- Date: Sat, 2 Dec 2023 07:30:10 GMT
- Title: Label-Retrieval-Augmented Diffusion Models for Learning from Noisy
Labels
- Authors: Jian Chen, Ruiyi Zhang, Tong Yu, Rohan Sharma, Zhiqiang Xu, Tong Sun,
Changyou Chen
- Abstract summary: Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
- Score: 61.97359362447732
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning from noisy labels is an important and long-standing problem in
machine learning for real applications. One of the main research lines focuses
on learning a label corrector to purify potential noisy labels. However, these
methods typically rely on strict assumptions and are limited to certain types
of label noise. In this paper, we reformulate the label-noise problem from a
generative-model perspective, $\textit{i.e.}$, labels are generated by
gradually refining an initial random guess. This new perspective immediately
enables existing powerful diffusion models to seamlessly learn the stochastic
generative process. Once the generative uncertainty is modeled, we can perform
classification inference using maximum likelihood estimation of labels. To
mitigate the impact of noisy labels, we propose the
$\textbf{L}$abel-$\textbf{R}$etrieval-$\textbf{A}$ugmented (LRA) diffusion
model, which leverages neighbor consistency to effectively construct
pseudo-clean labels for diffusion training. Our model is flexible and general,
allowing easy incorporation of different types of conditional information,
$\textit{e.g.}$, use of pre-trained models, to further boost model performance.
Extensive experiments are conducted for evaluation. Our model achieves new
state-of-the-art (SOTA) results on all the standard real-world benchmark
datasets. Remarkably, by incorporating conditional information from the
powerful CLIP model, our method can boost the current SOTA accuracy by 10-20
absolute points in many cases.
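To make the retrieval step concrete, the following is a minimal sketch of the neighbor-consistency idea described above: each training sample's pseudo-clean label is obtained by majority vote over the noisy labels of its nearest neighbors in a pre-trained embedding space (e.g., CLIP features). Function names, the choice of k, and the use of scikit-learn are illustrative assumptions, not the authors' code; in the full LRA method these pseudo-labels then supervise a conditional diffusion model over labels, and classification is performed by maximum-likelihood label inference.

```python
# Sketch of the label-retrieval step behind LRA: a sample's pseudo-clean
# label is retrieved by majority vote over the noisy labels of its nearest
# neighbors in a pre-trained embedding space (neighbor consistency).
# Names and hyperparameters here are illustrative, not the authors' code.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def retrieve_pseudo_labels(features, noisy_labels, k=10):
    """Majority-vote the noisy labels of each sample's k nearest neighbors."""
    features = features / np.linalg.norm(features, axis=1, keepdims=True)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)            # idx[:, 0] is the sample itself
    neighbor_labels = noisy_labels[idx[:, 1:]]  # shape (N, k), self excluded
    return np.array([np.bincount(row).argmax() for row in neighbor_labels])

# Usage (synthetic stand-in): features would come from a pre-trained
# encoder such as CLIP in the paper's strongest configuration.
rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 512)).astype(np.float32)
noisy = rng.integers(0, 10, size=1000)
pseudo = retrieve_pseudo_labels(feats, noisy, k=10)
```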
Related papers
- Learning with Confidence: Training Better Classifiers from Soft Labels [0.0]
In supervised machine learning, models are typically trained using data with hard labels, i.e., definite assignments of class membership.
We investigate whether incorporating label uncertainty, represented as discrete probability distributions over the class labels, improves the predictive performance of classification models.
arXiv Detail & Related papers (2024-09-24T13:12:29Z)
- Learning to Detect Noisy Labels Using Model-Based Features [16.681748918518075]
We propose Selection-Enhanced Noisy label Training (SENT).
SENT does not rely on meta-learning while retaining the flexibility of a data-driven approach.
It improves performance over strong baselines under the settings of self-training and label corruption.
arXiv Detail & Related papers (2022-12-28T10:12:13Z)
- Class Prototype-based Cleaner for Label Noise Learning [73.007001454085]
Semi-supervised learning methods are the current SOTA solutions to the noisy-label learning problem.
We propose a simple yet effective solution, named Class Prototype-based label noise Cleaner.
arXiv Detail & Related papers (2022-12-21T04:56:41Z)
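As a rough illustration of how a prototype-based cleaner can work, the sketch below builds one prototype per class from current features and relabels a sample only when another class's prototype beats its given label by a margin. This is a generic stand-in under stated assumptions (cosine similarity, a fixed margin), not the paper's exact procedure:

```python
# Generic prototype-based label cleaning sketch (illustrative, not the
# paper's exact method): one prototype per class, relabel by nearest
# prototype when it clearly beats the given label.
import numpy as np

def prototype_clean(features, labels, num_classes, margin=0.1):
    # Assumes every class appears at least once in `labels`.
    features = features / np.linalg.norm(features, axis=1, keepdims=True)
    protos = np.stack([features[labels == c].mean(axis=0)
                       for c in range(num_classes)])
    protos = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    sims = features @ protos.T                      # cosine similarity (N, C)
    best = sims.argmax(axis=1)
    rows = np.arange(len(labels))
    # Relabel only when the best class beats the given label by a margin.
    gain = sims[rows, best] - sims[rows, labels]
    return np.where(gain > margin, best, labels)
```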
- SELC: Self-Ensemble Label Correction Improves Learning with Noisy Labels [4.876988315151037]
Deep neural networks are prone to overfitting noisy labels, resulting in poor generalization performance.
We present a method self-ensemble label correction (SELC) to progressively correct noisy labels and refine the model.
SELC obtains more promising and stable results in the presence of class-conditional, instance-dependent, and real-world label noise.
arXiv Detail & Related papers (2022-05-02T18:42:47Z)
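A hedged sketch of the self-ensemble idea, assuming PyTorch: soft targets start as the one-hot noisy labels and, after a warm-up phase, are progressively blended with an exponential moving average of the model's own softmax predictions. The momentum value and schedule are illustrative, not the paper's exact recipe:

```python
# Self-ensemble label correction in the spirit of SELC: targets drift from
# the noisy one-hot labels toward an EMA of the model's own predictions.
# alpha and the warm-up schedule are illustrative assumptions.
import torch
import torch.nn.functional as F

def selc_style_update(targets, logits, alpha=0.9):
    """EMA update of soft targets toward the model's current predictions."""
    with torch.no_grad():
        probs = F.softmax(logits, dim=1)
        return alpha * targets + (1.0 - alpha) * probs

def soft_cross_entropy(logits, targets):
    """Cross-entropy against soft (corrected) targets."""
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# Usage: targets = F.one_hot(noisy_labels, num_classes).float()
# then, once per epoch after warm-up:
# targets = selc_style_update(targets, model(x))
```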
- Instance-Dependent Partial Label Learning [69.49681837908511]
Partial label learning is a typical weakly supervised learning problem.
Most existing approaches assume that the incorrect labels in each training example are randomly picked as the candidate labels.
In this paper, we consider instance-dependent partial label learning and assume that each example is associated with a latent label distribution, in which a real-valued number for each label indicates how well it describes the instance.
arXiv Detail & Related papers (2021-10-25T12:50:26Z)
- Instance-dependent Label-noise Learning under a Structural Causal Model [92.76400590283448]
Label noise degrades the performance of deep learning algorithms.
By leveraging a structural causal model, we propose a novel generative approach for instance-dependent label-noise learning.
arXiv Detail & Related papers (2021-09-07T10:42:54Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- Deep k-NN for Noisy Labels [55.97221021252733]
We show that a simple $k$-nearest neighbor-based filtering approach on the logit layer of a preliminary model can remove mislabeled data and produce more accurate models than many recently proposed methods.
arXiv Detail & Related papers (2020-04-26T05:15:36Z)
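The filtering step translates naturally into code. A minimal sketch, assuming logits from a preliminarily trained model; both k and the agreement threshold are illustrative choices:

```python
# k-NN filtering on a preliminary model's logits: keep a training point
# only if enough of its k nearest neighbors share its label.
# The threshold (agree_frac) is an illustrative assumption.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_filter(logits, labels, k=50, agree_frac=0.5):
    nn = NearestNeighbors(n_neighbors=k + 1).fit(logits)
    _, idx = nn.kneighbors(logits)
    neighbor_labels = labels[idx[:, 1:]]                   # drop self-match
    agreement = (neighbor_labels == labels[:, None]).mean(axis=1)
    return agreement >= agree_frac                         # boolean keep-mask
```

Filtering in logit space rather than input space leans on the preliminary model's learned geometry, which groups same-class points together and makes label agreement among neighbors informative.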