CCMN: A General Framework for Learning with Class-Conditional
Multi-Label Noise
- URL: http://arxiv.org/abs/2105.07338v1
- Date: Sun, 16 May 2021 03:24:15 GMT
- Title: CCMN: A General Framework for Learning with Class-Conditional
Multi-Label Noise
- Authors: Ming-Kun Xie and Sheng-Jun Huang
- Abstract summary: Class-conditional noise commonly exists in machine learning tasks, where the class label is corrupted with a probability that depends on its ground-truth value.
In this paper, we formalize this problem as a general framework of learning with Class-Conditional Multi-label Noise (CCMN for short).
We establish two unbiased estimators with error bounds for solving the CCMN problems, and prove that they are consistent with commonly used multi-label loss functions.
- Score: 40.46921277898713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Class-conditional noise commonly exists in machine learning tasks, where the
class label is corrupted with a probability that depends on its ground-truth
value. Many research efforts have been made to improve model robustness
against class-conditional noise. However, they typically focus on the
single-label case by assuming that only one label is corrupted. In real
applications, an instance
is usually associated with multiple labels, which could be corrupted
simultaneously with their respective conditional probabilities. In this paper,
we formalize this problem as a general framework of learning with
Class-Conditional Multi-label Noise (CCMN for short). We establish two unbiased
estimators with error bounds for solving the CCMN problems, and further prove
that they are consistent with commonly used multi-label loss functions.
Finally, a new method for partial multi-label learning is implemented with an
unbiased estimator under the CCMN framework. Empirical studies on multiple
datasets and various evaluation metrics validate the effectiveness of the
proposed method.
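To make the noise model concrete: in CCMN each of the q labels is flipped independently with class-conditional rates, so the classic single-label unbiased correction of Natarajan et al. (2013) can be applied per label. The sketch below is a minimal illustration of that per-label correction with a logistic loss, not the paper's exact estimators; the function name is ours, and the flip rates rho_pos, rho_neg are assumed known.

```python
import numpy as np

def unbiased_multilabel_loss(scores, noisy_labels, rho_pos, rho_neg):
    """Per-label unbiased surrogate under class-conditional noise.

    scores:       (n, q) real-valued predictions f(x)
    noisy_labels: (n, q) observed labels in {+1, -1}
    rho_pos:      (q,) array, rho_pos[j] = P(observed -1 | true +1)
    rho_neg:      (q,) array, rho_neg[j] = P(observed +1 | true -1)
    Requires rho_pos[j] + rho_neg[j] < 1 for every label j.
    """
    def logistic(t, y):
        # logistic loss log(1 + exp(-y * t)) with labels in {+1, -1}
        return np.logaddexp(0.0, -y * t)

    # flip rate of the class the observed label claims, and of its opposite
    rho_obs = np.where(noisy_labels > 0, rho_pos, rho_neg)
    rho_opp = np.where(noisy_labels > 0, rho_neg, rho_pos)

    # ell~(t, y~) = [(1 - rho_{-y~}) ell(t, y~) - rho_{y~} ell(t, -y~)]
    #               / (1 - rho_pos - rho_neg)
    corrected = ((1.0 - rho_opp) * logistic(scores, noisy_labels)
                 - rho_obs * logistic(scores, -noisy_labels))
    return (corrected / (1.0 - rho_pos - rho_neg)).mean()
```

Taking the expectation over the noisy labels recovers the clean-label loss, which is what makes the estimator unbiased. In the same spirit, partial multi-label learning fits the framework by viewing irrelevant candidate labels, roughly, as negatives flipped to positives.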
Related papers
- Multi-Label Quantification [78.83284164605473]
Quantification, variously called "labelled prevalence estimation" or "learning to quantify", is the supervised learning task of generating predictors of the relative frequencies of the classes of interest in unlabelled data samples.
We propose methods for inferring estimators of class prevalence values that strive to leverage the dependencies among the classes of interest in order to predict their relative frequencies more accurately.
arXiv Detail & Related papers (2022-11-15T11:29:59Z)
- Label-Noise Learning with Intrinsically Long-Tailed Data [65.41318436799993]
We propose a learning framework for label-noise learning with intrinsically long-tailed data.
Specifically, we propose two-stage bi-dimensional sample selection (TABASCO) to better separate clean samples from noisy samples.
arXiv Detail & Related papers (2022-08-21T07:47:05Z)
- Multi-class Probabilistic Bounds for Self-learning [13.875239300089861]
Pseudo-labeling is prone to error and runs the risk of adding noisy labels into unlabeled training data.
We present a probabilistic framework for analyzing self-learning in the multi-class classification scenario with partially labeled data.
arXiv Detail & Related papers (2021-09-29T13:57:37Z)
- Unbiased Loss Functions for Multilabel Classification with Missing Labels [2.1549398927094874]
Missing labels are a ubiquitous phenomenon in extreme multi-label classification (XMC) tasks.
This paper derives the unique unbiased estimators for the different multilabel reductions.
arXiv Detail & Related papers (2021-09-23T10:39:02Z)
- PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which utilizes the per-class ratio of positive to negative labels during training.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z)
- Approximating Instance-Dependent Noise via Instance-Confidence Embedding [87.65718705642819]
Label noise in multiclass classification is a major obstacle to the deployment of learning systems.
We investigate the instance-dependent noise (IDN) model and propose an efficient approximation of IDN to capture the instance-specific label corruption.
arXiv Detail & Related papers (2021-03-25T02:33:30Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that errors in human-annotated labels are more likely to depend on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)
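The last two entries concern instance-dependent noise (IDN), where the flip probability varies per example rather than per class. As a contrast with the class-conditional sketch above, here is a toy construction (ours, not either paper's method) in which labels near the decision boundary are more likely to be flipped, mimicking difficulty-driven annotation errors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance-dependent noise: harder examples (smaller margin to the
# decision boundary x0 = 0) get a higher label-flip probability.
n = 1000
x = rng.normal(size=(n, 2))
y = np.sign(x[:, 0])                     # true labels in {+1, -1}

margin = np.abs(x[:, 0])                 # per-instance difficulty proxy
flip_prob = 0.4 * np.exp(-2.0 * margin)  # noise rate decays with margin
y_noisy = np.where(rng.random(n) < flip_prob, -y, y)

print(f"realized noise rate: {np.mean(y_noisy != y):.3f}")
```

Under a model like this, a single pair of class-conditional rates no longer describes the corruption, which is why IDN methods estimate per-instance confidence instead.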
This list is automatically generated from the titles and abstracts of the papers on this site.