Unbiased Loss Functions for Multilabel Classification with Missing Labels
- URL: http://arxiv.org/abs/2109.11282v1
- Date: Thu, 23 Sep 2021 10:39:02 GMT
- Title: Unbiased Loss Functions for Multilabel Classification with Missing Labels
- Authors: Erik Schultheis and Rohit Babbar
- Abstract summary: Missing labels are a ubiquitous phenomenon in extreme multi-label classification (XMC) tasks.
This paper derives the unique unbiased estimators for the different multilabel reductions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers binary and multilabel classification problems in a
setting where labels are missing independently and with a known rate. Missing
labels are a ubiquitous phenomenon in extreme multi-label classification (XMC)
tasks, such as matching Wikipedia articles to a small subset out of the
hundreds of thousands of possible tags, where no human annotator can possibly
check the validity of all the negative samples. For this reason,
propensity-scored precision -- an unbiased estimate for precision-at-k under a
known noise model -- has become one of the standard metrics in XMC. Few methods
take this problem into account during the training phase itself, and all of
them are limited to loss functions that decompose into a sum of contributions
from each individual label. A typical approach to training is to reduce the
multilabel problem to a series of binary or multiclass problems, and it has
been shown that if the surrogate task is to be consistent for optimizing
recall, the resulting loss function is not decomposable over labels. Therefore,
this paper derives the unique unbiased estimators for the different multilabel
reductions, including the non-decomposable ones. These estimators suffer from
increased variance and may lead to ill-posed optimization problems, which we
address by switching to convex upper-bounds. The theoretical considerations are
further supplemented by an experimental study showing that the switch to
unbiased estimators significantly alters the bias-variance trade-off and may
thus require stronger regularization, which in some cases can negate the
benefits of unbiased estimation.
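To make the metric concrete: under the standard missing-label model, propensity-scored precision-at-k up-weights each observed positive among the top-k predictions by the inverse of its propensity, i.e. the probability that a true positive is actually observed. Below is a minimal NumPy sketch of this idea; the function name and array layout are our own illustration, not code from the paper.

    import numpy as np

    def psp_at_k(scores, observed, propensities, k=5):
        """Propensity-scored precision@k for a single instance.

        scores:       (L,) real-valued model scores, one per label
        observed:     (L,) 0/1 indicators of *observed* positive labels
        propensities: (L,) known probabilities that a true positive for
                      each label is actually observed
        """
        top_k = np.argsort(scores)[::-1][:k]  # indices of the k highest scores
        # Re-weighting each observed positive by 1/p_l makes the estimate
        # unbiased for precision@k under the assumed noise model.
        return observed[top_k] @ (1.0 / propensities[top_k]) / k

Taking the expectation over the label-observation process, each true positive in the top-k contributes p_l * (1/p_l) = 1 to the sum, which is exactly what precision@k on the fully observed labels would count.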
Related papers
- Generating Unbiased Pseudo-labels via a Theoretically Guaranteed Chebyshev Constraint to Unify Semi-supervised Classification and Regression [57.17120203327993]
The threshold-to-pseudo-label process (T2L) in classification uses confidence to determine the quality of pseudo-labels.
Regression likewise requires unbiased methods to generate high-quality labels.
We propose a theoretically guaranteed constraint for generating unbiased labels based on Chebyshev's inequality.
arXiv Detail & Related papers (2023-11-03T08:39:35Z)
- Multi-Label Noise Transition Matrix Estimation with Label Correlations: Theory and Algorithm [73.94839250910977]
Noisy multi-label learning has garnered increasing attention due to the challenges posed by collecting large-scale accurate labels.
The introduction of transition matrices can help model multi-label noise and enable the development of statistically consistent algorithms.
We propose a novel estimator that leverages label correlations without the need for anchor points or precise fitting of noisy class posteriors.
arXiv Detail & Related papers (2023-09-22T08:35:38Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- Multi-Label Quantification [78.83284164605473]
Quantification, variously called "labelled prevalence estimation" or "learning to quantify", is the supervised learning task of generating predictors of the relative frequencies of the classes of interest in unsupervised data samples.
We propose methods for inferring estimators of class prevalence values that strive to leverage the dependencies among the classes of interest in order to predict their relative frequencies more accurately.
arXiv Detail & Related papers (2022-11-15T11:29:59Z)
- Learning from Multiple Unlabeled Datasets with Partial Risk Regularization [80.54710259664698]
In this paper, we aim to learn an accurate classifier without any class labels.
We first derive an unbiased estimator of the classification risk that can be estimated from the given unlabeled sets.
We then find that the classifier obtained in this way tends to overfit, as its empirical risk goes negative during training.
Experiments demonstrate that our method effectively mitigates overfitting and outperforms state-of-the-art methods for learning from multiple unlabeled sets.
arXiv Detail & Related papers (2022-07-04T16:22:44Z)
- Multi-class Probabilistic Bounds for Self-learning [13.875239300089861]
Pseudo-labeling is prone to error and runs the risk of adding noisy labels into unlabeled training data.
We present a probabilistic framework for analyzing self-learning in the multi-class classification scenario with partially labeled data.
arXiv Detail & Related papers (2021-09-29T13:57:37Z)
- sigmoidF1: A Smooth F1 Score Surrogate Loss for Multilabel Classification [42.37189502220329]
We propose a loss function, sigmoidF1, to account for the complexity of multilabel classification evaluation (a sketch of the smoothing idea appears after this list).
We show that sigmoidF1 outperforms other loss functions on four datasets and several metrics.
arXiv Detail & Related papers (2021-08-24T08:11:33Z)
- Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces [64.23172847182109]
We show that different negative sampling schemes implicitly trade off performance on dominant versus rare labels.
We provide a unified means to explicitly tackle both sampling bias, arising from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance.
arXiv Detail & Related papers (2021-05-12T15:40:13Z)
- Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation [17.212805760360954]
We use a framework centered on model misspecification in method-of-moments latent variable estimation.
We then introduce a correction that provably removes the bias caused by misspecification in certain cases.
We observe theoretically and with synthetic experiments that for well-specified models, labeled points are worth a constant factor more than unlabeled points.
arXiv Detail & Related papers (2021-03-03T23:52:38Z)
- A Flexible Class of Dependence-aware Multi-Label Loss Functions [4.265467042008983]
This paper introduces a new class of loss functions for multi-label classification.
It overcomes disadvantages of commonly used losses such as the Hamming and subset 0/1 losses.
The assessment of multilabel predictions in terms of these losses is illustrated in an empirical study.
arXiv Detail & Related papers (2020-11-02T07:42:15Z)
- Unbiased Loss Functions for Extreme Classification With Missing Labels [1.6011907050002954]
The goal in extreme multi-label classification (XMC) is to tag an instance with a small subset of relevant labels from an extremely large set of possible labels.
In this work, we derive an unbiased estimator for a general formulation of loss functions that decompose over labels (the first sketch after this list illustrates this correction).
We show that the derived unbiased estimators can be easily incorporated in state-of-the-art algorithms for extreme classification.
arXiv Detail & Related papers (2020-07-01T04:42:12Z)
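The correction derived in the last related paper above, and generalized in the main paper, can be sketched as follows for a loss that decomposes over labels. Under the missing-label model (a true positive is observed with known propensity p_l; negatives are never observed as positives), reweighting the loss on observed positives yields an unbiased estimate of the fully supervised loss. This is a minimal sketch under those assumptions; the names are illustrative and the actual implementation in the paper may differ.

    import numpy as np

    def unbiased_decomposable_loss(scores, observed, propensities, loss):
        """Unbiased estimate of sum_l E[loss(s_l, y_l)] from observed labels.

        loss(s, y) must evaluate elementwise for a scalar target y in {0, 1}.
        """
        l_pos = loss(scores, 1.0)  # per-label loss if truly positive
        l_neg = loss(scores, 0.0)  # per-label loss if truly negative
        inv_p = 1.0 / propensities
        # Observed positive => certainly positive: up-weight its positive-label
        # loss by 1/p and subtract the spuriously counted negative-label part.
        # Observed negative => use l_neg as-is; the reweighting of observed
        # positives already compensates for missed positives in expectation.
        per_label = np.where(observed == 1,
                             inv_p * l_pos + (1.0 - inv_p) * l_neg,
                             l_neg)
        return per_label.sum()

    def logistic_loss(s, y):
        # standard binary logistic loss with target y in {0, 1}
        return np.log1p(np.exp(-(2.0 * y - 1.0) * s))

Note that 1 - 1/p_l is negative, so the corrected loss can itself become negative and unbounded below; this is exactly the ill-posedness that the main paper addresses by switching to convex upper bounds.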
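For the sigmoidF1 entry above, the smoothing idea can be sketched like this: replace the hard true/false positive counts in the F1 score with sigmoid-transformed scores so that the objective becomes differentiable. The parameters beta and eta below are tunable smoothing constants; this is our own simplified rendering, and the exact formulation in the paper may differ.

    import numpy as np

    def sigmoid_f1_loss(logits, targets, beta=1.0, eta=0.0):
        # smoothed predictions in (0, 1) instead of hard 0/1 decisions
        s = 1.0 / (1.0 + np.exp(-beta * (logits + eta)))
        tp = (s * targets).sum()          # soft true positives
        fp = (s * (1.0 - targets)).sum()  # soft false positives
        fn = ((1.0 - s) * targets).sum()  # soft false negatives
        soft_f1 = 2.0 * tp / (2.0 * tp + fn + fp + 1e-12)
        return 1.0 - soft_f1              # minimizing this maximizes soft F1

As beta grows, the sigmoid approaches a step function and the soft counts approach the exact confusion-matrix entries, at the cost of vanishing gradients.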