Related papers: Multi-Label Quantification

URL: http://arxiv.org/abs/2211.08063v1
Date: Tue, 15 Nov 2022 11:29:59 GMT
Title: Multi-Label Quantification
Authors: Alejandro Moreo and Manuel Francisco and Fabrizio Sebastiani
Abstract summary: Quantification, variously called "supervised prevalence estimation" or "learning to quantify", is the supervised learning task of generating predictors of the relative frequencies of the classes of interest in unlabelled data samples. We propose methods for inferring estimators of class prevalence values that strive to leverage the dependencies among the classes of interest in order to predict their relative frequencies more accurately.
Score: 78.83284164605473
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Quantification, variously called "supervised prevalence estimation" or "learning to quantify", is the supervised learning task of generating predictors of the relative frequencies (a.k.a. "prevalence values") of the classes of interest in unlabelled data samples. While many quantification methods have been proposed in the past for binary problems and, to a lesser extent, single-label multiclass problems, the multi-label setting (i.e., the scenario in which the classes of interest are not mutually exclusive) remains by and large unexplored. A straightforward solution to the multi-label quantification problem could simply consist of recasting the problem as a set of independent binary quantification problems. Such a solution is simple but naïve, since the independence assumption upon which it rests is, in most cases, not satisfied. In these cases, knowing the relative frequency of one class could be of help in determining the prevalence of other related classes. We propose the first truly multi-label quantification methods, i.e., methods for inferring estimators of class prevalence values that strive to leverage the stochastic dependencies among the classes of interest in order to predict their relative frequencies more accurately. We show empirical evidence that natively multi-label solutions outperform the naïve approaches by a large margin. The code to reproduce all our experiments is available online.
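
Illustration (not from the paper): the naïve baseline mentioned in the abstract, i.e. treating each class as an independent binary quantification problem, can be sketched roughly as follows. The sketch assumes hard 0/1 predictions from some already-trained multi-label classifier and uses two standard binary quantifiers, classify-and-count (CC) and its adjusted variant (ACC). The function names and the per-class `tpr`/`fpr` inputs are hypothetical choices made for this example; this is not the authors' code, and it does not implement the multi-label methods they propose.

```python
import numpy as np


def classify_and_count(pred):
    """Naive per-class prevalence: fraction of items predicted positive.

    pred: (n_items, n_classes) array of hard 0/1 predictions.
    Each column is treated as its own binary quantification problem.
    """
    pred = np.asarray(pred, dtype=float)
    return pred.mean(axis=0)


def adjusted_classify_and_count(pred, tpr, fpr):
    """Per-class ACC: correct each CC estimate with that class's
    true-positive and false-positive rates (assumed to have been
    estimated on held-out labelled data).
    """
    cc = classify_and_count(pred)
    tpr = np.asarray(tpr, dtype=float)
    fpr = np.asarray(fpr, dtype=float)
    denom = tpr - fpr
    usable = np.abs(denom) > 1e-12
    # Where tpr == fpr the correction is undefined; fall back to plain CC.
    safe_denom = np.where(usable, denom, 1.0)
    adjusted = np.where(usable, (cc - fpr) / safe_denom, cc)
    # Prevalence values must lie in [0, 1].
    return np.clip(adjusted, 0.0, 1.0)


# Toy sample: 4 items, 3 non-mutually-exclusive classes.
sample_pred = [[1, 0, 1],
               [1, 1, 0],
               [0, 0, 1],
               [1, 0, 0]]
print(classify_and_count(sample_pred))
print(adjusted_classify_and_count(sample_pred,
                                  tpr=[0.9, 0.8, 1.0],
                                  fpr=[0.1, 0.2, 0.0]))
```

Note that every class is estimated in isolation here: nothing the sketch learns about one class informs the estimate for another, which is exactly the independence assumption the abstract describes as rarely satisfied.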

Related papers

Probably Approximately Precision and Recall Learning [62.912015491907994]
Precision and Recall are foundational metrics in machine learning. One-sided feedback--where only positive examples are observed during training--is inherent in many practical problems. We introduce a PAC learning framework where each hypothesis is represented by a graph, with edges indicating positive interactions.
arXiv Detail & Related papers (2024-11-20T04:21:07Z)
Trusted Multi-view Learning with Label Noise [17.458306450909316]
Multi-view learning methods often focus on improving decision accuracy while neglecting the decision uncertainty. We propose a trusted multi-view noise refining method to solve this problem. We empirically compare TMNR with state-of-the-art trusted multi-view learning and label noise learning baselines on 5 publicly available datasets.
arXiv Detail & Related papers (2024-04-18T06:47:30Z)
Multi-Label Noise Transition Matrix Estimation with Label Correlations: Theory and Algorithm [73.94839250910977]
Noisy multi-label learning has garnered increasing attention due to the challenges posed by collecting large-scale accurate labels. The introduction of transition matrices can help model multi-label noise and enable the development of statistically consistent algorithms. We propose a novel estimator that leverages label correlations without the need for anchor points or precise fitting of noisy class posteriors.
arXiv Detail & Related papers (2023-09-22T08:35:38Z)
Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data. This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels [87.48541631675889]
We propose a two-stage clean samples identification method. First, we employ a class-level feature clustering procedure for the early identification of clean samples. Second, for the remaining clean samples that are close to the ground truth class boundary, we propose a novel consistency-based classification method.
arXiv Detail & Related papers (2022-07-29T04:54:57Z)
Learning from Multiple Unlabeled Datasets with Partial Risk Regularization [80.54710259664698]
In this paper, we aim to learn an accurate classifier without any class labels. We first derive an unbiased estimator of the classification risk that can be estimated from the given unlabeled sets. We then find that the classifier obtained as such tends to cause overfitting as its empirical risks go negative during training. Experiments demonstrate that our method effectively mitigates overfitting and outperforms state-of-the-art methods for learning from multiple unlabeled sets.
arXiv Detail & Related papers (2022-07-04T16:22:44Z)
Evolving Multi-Label Fuzzy Classifier [5.53329677986653]
Multi-label classification has attracted much attention in the machine learning community to address the problem of assigning single samples to more than one class at the same time. We propose an evolving multi-label fuzzy classifier (EFC-ML) which is able to self-adapt and self-evolve its structure with new incoming multi-label samples in an incremental, single-pass manner.
arXiv Detail & Related papers (2022-03-29T08:01:03Z)
Multi-class Probabilistic Bounds for Self-learning [13.875239300089861]
Pseudo-labeling is prone to error and runs the risk of adding noisy labels into unlabeled training data. We present a probabilistic framework for analyzing self-learning in the multi-class classification scenario with partially labeled data.
arXiv Detail & Related papers (2021-09-29T13:57:37Z)
Unbiased Loss Functions for Multilabel Classification with Missing Labels [2.1549398927094874]
Missing labels are a ubiquitous phenomenon in extreme multi-label classification (XMC) tasks. This paper derives the unique unbiased estimators for the different multilabel reductions.
arXiv Detail & Related papers (2021-09-23T10:39:02Z)
CCMN: A General Framework for Learning with Class-Conditional Multi-Label Noise [40.46921277898713]
Class-conditional noise commonly exists in machine learning tasks, where the class label is corrupted with a probability depending on its ground truth. In this paper, we formalize this problem as a general framework of learning with Class-Conditional Multi-label Noise (CCMN for short). We establish two unbiased estimators with error bounds for solving the CCMN problems, and prove that they are consistent with commonly used multi-label loss functions.
arXiv Detail & Related papers (2021-05-16T03:24:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.