Learning from Label Proportions by Learning with Label Noise
- URL: http://arxiv.org/abs/2203.02496v2
- Date: Sun, 24 Sep 2023 20:29:03 GMT
- Title: Learning from Label Proportions by Learning with Label Noise
- Authors: Jianxin Zhang, Yutong Wang, Clayton Scott
- Abstract summary: Learning from label proportions (LLP) is a weakly supervised classification problem where data points are grouped into bags.
We provide a theoretically grounded approach to LLP based on a reduction to learning with label noise.
Our approach demonstrates improved empirical performance in deep learning scenarios across multiple datasets and architectures.
- Score: 30.7933303912474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning from label proportions (LLP) is a weakly supervised classification
problem where data points are grouped into bags, and the label proportions
within each bag are observed instead of the instance-level labels. The task is
to learn a classifier to predict the individual labels of future individual
instances. Prior work on LLP for multi-class data has yet to develop a
theoretically grounded algorithm. In this work, we provide a theoretically
grounded approach to LLP based on a reduction to learning with label noise,
using the forward correction (FC) loss of \citet{Patrini2017MakingDN}. We
establish an excess risk bound and generalization error analysis for our
approach, while also extending the theory of the FC loss which may be of
independent interest. Our approach demonstrates improved empirical performance
in deep learning scenarios across multiple datasets and architectures, compared
to the leading existing methods.
Related papers
- An Unbiased Risk Estimator for Partial Label Learning with Augmented Classes [46.663081214928226]
We propose an unbiased risk estimator with theoretical guarantees for PLLAC.
We provide a theoretical analysis of the estimation error bound of PLLAC.
Experiments on benchmark, UCI and real-world datasets demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-09-29T07:36:16Z) - Easy Learning from Label Proportions [17.71834385754893]
Easyllp is a flexible and simple-to-implement debiasing approach based on aggregate labels.
Our technique allows us to accurately estimate the expected loss of an arbitrary model at an individual level.
arXiv Detail & Related papers (2023-02-06T20:41:38Z) - Learning with Partial Labels from Semi-supervised Perspective [28.735185883881172]
Partial Label (PL) learning refers to the task of learning from partially labeled data.
We propose a novel PL learning method, namely Partial Label learning with Semi-Supervised Perspective (PLSP)
PLSP significantly outperforms the existing PL baseline methods, especially on high ambiguity levels.
arXiv Detail & Related papers (2022-11-24T15:12:16Z) - Contrastive Credibility Propagation for Reliable Semi-Supervised Learning [6.014538614447467]
We propose Contrastive Credibility Propagation (CCP) for deep SSL via iterative transductive pseudo-label refinement.
CCP unifies semi-supervised learning and noisy label learning for the goal of reliably outperforming a supervised baseline in any data scenario.
arXiv Detail & Related papers (2022-11-17T23:01:47Z) - PercentMatch: Percentile-based Dynamic Thresholding for Multi-Label
Semi-Supervised Classification [64.39761523935613]
We propose a percentile-based threshold adjusting scheme to dynamically alter the score thresholds of positive and negative pseudo-labels for each class during the training.
We achieve strong performance on Pascal VOC2007 and MS-COCO datasets when compared to recent SSL methods.
arXiv Detail & Related papers (2022-08-30T01:27:48Z) - Learning from Multiple Unlabeled Datasets with Partial Risk
Regularization [80.54710259664698]
In this paper, we aim to learn an accurate classifier without any class labels.
We first derive an unbiased estimator of the classification risk that can be estimated from the given unlabeled sets.
We then find that the classifier obtained as such tends to cause overfitting as its empirical risks go negative during training.
Experiments demonstrate that our method effectively mitigates overfitting and outperforms state-of-the-art methods for learning from multiple unlabeled sets.
arXiv Detail & Related papers (2022-07-04T16:22:44Z) - Learning with Proper Partial Labels [87.65718705642819]
Partial-label learning is a kind of weakly-supervised learning with inexact labels.
We show that this proper partial-label learning framework includes many previous partial-label learning settings.
We then derive a unified unbiased estimator of the classification risk.
arXiv Detail & Related papers (2021-12-23T01:37:03Z) - PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which utilizes this ratio during training.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z) - Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
This is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
arXiv Detail & Related papers (2020-03-02T13:59:41Z) - Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label.
Most existing methods elaborately designed as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data.
This paper proposes a novel framework of classifier with flexibility on the model and optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.