Multi-class Label Noise Learning via Loss Decomposition and Centroid
Estimation
- URL: http://arxiv.org/abs/2203.10858v1
- Date: Mon, 21 Mar 2022 10:28:50 GMT
- Title: Multi-class Label Noise Learning via Loss Decomposition and Centroid
Estimation
- Authors: Yongliang Ding, Tao Zhou, Chuang Zhang, Yijing Luo, Juan Tang, Chen
Gong
- Abstract summary: We propose a novel multi-class robust learning method for Loss Decomposition and Centroid Estimation (LDCE)
Specifically, we decompose the commonly adopted loss function into a label-dependent part and a label-independent part.
By defining a new form of data centroid, we transform the problem of recovering the label-dependent part into a centroid estimation problem.
- Score: 25.098485298561155
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In real-world scenarios, many large-scale datasets often contain inaccurate
labels, i.e., noisy labels, which may confuse model training and lead to
performance degradation. To overcome this issue, Label Noise Learning (LNL) has recently attracted much attention, and various methods have been proposed to design risk estimators that are unbiased with respect to the noise-free data in order to combat such label noise. Among them, a line of work based on Loss Decomposition and Centroid Estimation (LDCE) has shown very promising performance. However, existing LNL methods based on LDCE are designed only for binary classification and are not directly extendable to multi-class situations. In this paper, we propose a novel multi-class robust learning method based on LDCE, termed "MC-LDCE". Specifically, we decompose the commonly adopted loss function (e.g., the mean squared loss) into a label-dependent part and a label-independent part, of which only the former is influenced by label noise. Further, by defining a new form of data centroid, we transform the problem of recovering the label-dependent part into a centroid estimation problem. Finally, by critically examining the mathematical expectation of the clean data centroid given the observed noisy set, the centroid can be estimated, which helps to build an unbiased risk estimator for multi-class learning. The proposed MC-LDCE method is general and applicable to different types (i.e., linear and nonlinear) of classification models. Experimental results on five public datasets demonstrate the superiority of MC-LDCE over other representative LNL methods in tackling the multi-class label noise problem.
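To make the decomposition concrete, here is a minimal worked example for the squared loss with one-hot labels and a linear model; the paper's exact centroid definition may differ, so treat this as an illustrative sketch rather than MC-LDCE itself.

```latex
% Squared loss with one-hot label y_i and linear model f(x) = W x:
%   ||W x_i - y_i||^2 = ||W x_i||^2 - 2 <y_i, W x_i> + ||y_i||^2,
% where ||y_i||^2 = 1 for one-hot labels. Only the middle term depends
% on the labels, and the empirical risk collapses them into a centroid C:
\widehat{R}(W)
  = \underbrace{\frac{1}{n}\sum_{i=1}^{n}\lVert W x_i\rVert^{2}}_{\text{label-independent}}
  \;-\; 2\,\langle W,\, C\rangle_{F} \;+\; 1,
\qquad
C := \frac{1}{n}\sum_{i=1}^{n} y_i\, x_i^{\top}.
```

Because the labels enter the risk only through C, combating label noise reduces to estimating the expectation of the clean centroid from the noisy labels. The Python sketch below does this under a symmetric noise model in which each label flips with probability rho, uniformly to one of the other K-1 classes; the noise model, the correction formula, and all function names are assumptions made for illustration, not the paper's estimator.

```python
import numpy as np

def corrected_centroid(X, y_noisy, num_classes, rho):
    """Estimate the clean centroid C = (1/n) * sum_i y_i x_i^T from noisy
    integer labels, assuming symmetric noise: each label flips with
    probability rho (rho < (K-1)/K), uniformly to one of the other K-1
    classes. Illustrative only; MC-LDCE's noise model may differ."""
    n = X.shape[0]
    K = num_classes
    Y = np.eye(K)[y_noisy]             # one-hot noisy labels, shape (n, K)
    C_noisy = Y.T @ X / n              # noisy centroid, shape (K, d)
    x_bar = X.mean(axis=0)             # feature mean, shape (d,)
    # Under this noise model, E[y_noisy | y] = scale * y + (rho/(K-1)) * 1,
    # hence E[C_noisy] = scale * C + bias; invert that relation.
    scale = 1.0 - rho * K / (K - 1)
    bias = (rho / (K - 1)) * np.outer(np.ones(K), x_bar)
    return (C_noisy - bias) / scale

def plug_in_risk(W, X, C_hat):
    """Risk of f(x) = W x under squared loss with one-hot targets:
    (1/n) * sum_i ||W x_i||^2 - 2 * <W, C_hat>_F + 1."""
    n = X.shape[0]
    label_free = np.sum((X @ W.T) ** 2) / n   # label-independent part
    return label_free - 2.0 * np.sum(W * C_hat) + 1.0
```

Minimizing plug_in_risk with the corrected centroid is, in expectation over the noise, equivalent to minimizing the clean-data risk, which is the sense in which such an estimator is unbiased.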
Related papers
- Trusted Multi-view Learning with Label Noise [17.458306450909316]
Multi-view learning methods often focus on improving decision accuracy while neglecting the decision uncertainty.
We propose a Trusted Multi-view Noise Refining (TMNR) method to solve this problem.
We empirically compare TMNR with state-of-the-art trusted multi-view learning and label noise learning baselines on five publicly available datasets.
arXiv Detail & Related papers (2024-04-18T06:47:30Z)
- Combating Label Noise With A General Surrogate Model For Sample Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically.
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
- Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z)
- Learning with Noisy Labels through Learnable Weighting and Centroid Similarity [5.187216033152917]
Noisy labels are prevalent in domains such as medical diagnosis and autonomous driving.
We introduce a novel method for training machine learning models in the presence of noisy labels.
Our results show that our method consistently outperforms the existing state-of-the-art techniques.
arXiv Detail & Related papers (2023-03-16T16:43:24Z)
- Latent Class-Conditional Noise Model [54.56899309997246]
We introduce a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework.
We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us to efficiently infer the latent true labels.
Our approach safeguards the stable update of the noise transition, avoiding the arbitrary tuning from a mini-batch of samples required by previous methods.
arXiv Detail & Related papers (2023-02-19T15:24:37Z)
- Learning Confident Classifiers in the Presence of Label Noise [5.829762367794509]
This paper proposes a probabilistic model for noisy observations that allows us to build confident classification and segmentation models.
Our experiments show that our algorithm outperforms state-of-the-art solutions for the considered classification and segmentation problems.
arXiv Detail & Related papers (2023-01-02T04:27:25Z)
- Class-Imbalanced Complementary-Label Learning via Weighted Loss [8.934943507699131]
Complementary-label learning (CLL) is widely used in weakly supervised classification.
It faces a significant challenge in real-world datasets when confronted with class-imbalanced training samples.
We propose a novel problem setting that enables learning from class-imbalanced complementary labels for multi-class classification.
arXiv Detail & Related papers (2022-09-28T16:02:42Z)
- Learning from Label Proportions by Learning with Label Noise [30.7933303912474]
Learning from label proportions (LLP) is a weakly supervised classification problem where data points are grouped into bags.
We provide a theoretically grounded approach to LLP based on a reduction to learning with label noise.
Our approach demonstrates improved empirical performance in deep learning scenarios across multiple datasets and architectures.
arXiv Detail & Related papers (2022-03-04T18:52:21Z)
- L2B: Learning to Bootstrap Robust Models for Combating Label Noise [52.02335367411447]
This paper introduces a simple and effective method, named Learning to Bootstrap (L2B).
It enables models to bootstrap themselves using their own predictions without being adversely affected by erroneous pseudo-labels.
It achieves this by dynamically adjusting the importance weight between real observed and generated labels, as well as between different samples through meta-learning.
arXiv Detail & Related papers (2022-02-09T05:57:08Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label.
Most existing methods are elaborately designed as constrained optimizations that must be solved in specific manners, which makes their computational complexity a bottleneck for scaling up to big data.
This paper proposes a novel classifier-learning framework with flexibility in the choice of model and optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z)