On the Complexity of Learning from Label Proportions
- URL: http://arxiv.org/abs/2004.03515v1
- Date: Tue, 7 Apr 2020 16:15:22 GMT
- Title: On the Complexity of Learning from Label Proportions
- Authors: Benjamin Fish, Lev Reyzin
- Abstract summary: We study the problem of learning with label proportions, in which the training data is unlabeled.
This model of learning is applicable to a wide variety of settings, including predicting the number of votes for candidates in political elections from polls.
We show, perhaps surprisingly, that for finite VC classes what can be efficiently LLP learned is a strict subset of what can be learned efficiently in PAC.
- Score: 4.111899441919163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the problem of learning with label proportions, which we call LLP
learning, the training data is unlabeled, and only the proportions of examples
receiving each label are given. The goal is to learn a hypothesis that predicts
the proportions of labels on the distribution underlying the sample. This model
of learning is applicable to a wide variety of settings, including predicting
the number of votes for candidates in political elections from polls.
In this paper, we formally define this class and resolve foundational
questions regarding the computational complexity of LLP and characterize its
relationship to PAC learning. Among our results, we show, perhaps surprisingly,
that for finite VC classes what can be efficiently LLP learned is a strict
subset of what can be learned efficiently in PAC, under standard complexity
assumptions. We also show that there exist classes of functions whose
learnability in LLP is independent of ZFC, the standard set theoretic axioms.
This implies that LLP learning cannot be easily characterized (like PAC by VC
dimension).
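To make the learning goal concrete, the guarantee described in the abstract can be sketched as follows for binary labels (the notation $h_S$, $D$, $\varepsilon$, $\delta$, $m$ is ours; the paper's formal definition may differ in detail):

```latex
% Sketch of an LLP guarantee: from an unlabeled sample S of size m (plus
% the sample's label proportion), the learner outputs h_S whose predicted
% proportion of positive labels must match the true proportion under D.
\[
\Pr_{S \sim D^m}\left[\;\Big|\Pr_{x \sim D_X}\big[h_S(x) = 1\big]
  \;-\; \Pr_{(x,y) \sim D}\big[y = 1\big]\Big| \le \varepsilon\;\right]
  \ge 1 - \delta
\]
```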
Related papers
- Learning Conditional Averages [52.361762722359366]
We introduce the problem of learning conditional averages in the PAC framework.
Instead of learning the target concept itself, the goal is to predict, for each instance, the average label over its neighborhood.
More generally, it extends PAC learning to a setting that captures learning tasks arising in several domains.
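As a toy illustration of this prediction target, one can compute neighborhood-averaged labels directly; the metric-ball neighborhood below is our own assumption, not necessarily the paper's definition:

```python
import numpy as np

def neighborhood_average_labels(X, y, radius):
    """Toy target for the conditional-averages task: for each point,
    the mean label over all training points within `radius` of it
    (each point counts as its own neighbor).  Illustrative only."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.array([y[dists[i] <= radius].mean() for i in range(len(X))])

X = [[0.0], [0.1], [1.0]]
y = [1, 0, 1]
print(neighborhood_average_labels(X, y, radius=0.5))  # [0.5 0.5 1. ]
```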
arXiv Detail & Related papers (2026-02-12T13:20:29Z)
- Samplability makes learning easier [10.233615909288188]
The standard definition of PAC learning requires learners to succeed under all distributions.
We show that samplable PAC substantially expands the power of efficient learners.
arXiv Detail & Related papers (2025-12-01T04:48:36Z)
- Proper Learnability and the Role of Unlabeled Data [10.168670899305232]
We show that there are problems whose proper learnability is logically undecidable, i.e., independent of the ZFC axioms.
We then give impossibility results that obstruct any characterization of proper learnability in the realizable PAC model.
arXiv Detail & Related papers (2025-02-14T18:41:53Z)
- Probably Approximately Precision and Recall Learning [62.912015491907994]
Precision and Recall are foundational metrics in machine learning.
One-sided feedback, where only positive examples are observed during training, is inherent in many practical problems.
We introduce a PAC learning framework where each hypothesis is represented by a graph, with edges indicating positive interactions.
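A minimal sketch of how precision and recall would be scored for a graph hypothesis of this kind; the edge representation below is our reading of the abstract, not the paper's formal model:

```python
def precision_recall(predicted_edges, true_edges):
    """Precision/recall of a graph hypothesis whose edges mark
    predicted positive interactions (toy version of the framing)."""
    predicted, true = set(predicted_edges), set(true_edges)
    tp = len(predicted & true)  # correctly predicted positive edges
    precision = tp / len(predicted) if predicted else 1.0
    recall = tp / len(true) if true else 1.0
    return precision, recall

# Hypothetical users u1..u3 and items a, b; edges are positive pairs.
pred = {("u1", "a"), ("u2", "a"), ("u3", "b")}
truth = {("u1", "a"), ("u3", "b"), ("u3", "a")}
print(precision_recall(pred, truth))  # (2/3, 2/3)
```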
arXiv Detail & Related papers (2024-11-20T04:21:07Z)
- Optimistic Rates for Learning from Label Proportions [19.980594971351014]
We consider a weakly supervised learning problem called Learning from Label Proportions (LLP), where examples are grouped into "bags".
We study various learning rules for LLP that achieve PAC learning guarantees for classification loss.
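A generic bag-level surrogate loss of the kind studied in the LLP literature is sketched below; the square-loss form is an assumption for illustration, not necessarily one of the paper's learning rules:

```python
import numpy as np

def bag_proportion_loss(scores, bags, bag_proportions):
    """Square loss between each bag's observed positive-label proportion
    and the mean predicted positive probability over that bag."""
    scores = np.asarray(scores, dtype=float)
    losses = [(scores[list(idx)].mean() - p) ** 2
              for idx, p in zip(bags, bag_proportions)]
    return float(np.mean(losses))

scores = [0.9, 0.2, 0.7, 0.1]   # predicted P(y=1) for each example
bags = [[0, 1], [2, 3]]         # example indices grouped into bags
props = [0.5, 0.5]              # observed label proportion per bag
print(bag_proportion_loss(scores, bags, props))  # 0.00625
```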
arXiv Detail & Related papers (2024-06-01T16:36:40Z)
- Appeal: Allow Mislabeled Samples the Chance to be Rectified in Partial Label Learning [55.4510979153023]
In partial label learning (PLL), each instance is associated with a set of candidate labels among which only one is ground-truth.
To help samples that have been mislabeled during training "appeal," we propose the first appeal-based framework.
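For background on the setting, a classic PLL disambiguation baseline is sketched below; the paper's appeal mechanism, which lets samples mislabeled by such a procedure be rectified, is not reproduced here:

```python
def disambiguate(candidate_sets, scores):
    """Classic PLL disambiguation baseline: among each instance's
    candidate labels, pick the one the current model scores highest."""
    return [max(cands, key=lambda c: s[c])
            for cands, s in zip(candidate_sets, scores)]

# Two instances over 3 classes; each candidate set hides one true label.
cands = [{0, 2}, {1, 2}]
scores = [[0.7, 0.1, 0.2], [0.2, 0.3, 0.5]]
print(disambiguate(cands, scores))  # [0, 2]
```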
arXiv Detail & Related papers (2023-12-18T09:09:52Z)
- Robust Representation Learning for Unreliable Partial Label Learning [86.909511808373]
Partial Label Learning (PLL) is a type of weakly supervised learning where each training instance is assigned a set of candidate labels, but only one label is the ground-truth.
When the candidate set may fail to contain the ground-truth label, the problem is known as Unreliable Partial Label Learning (UPLL), which introduces additional complexity due to the inherent unreliability and ambiguity of partial labels.
We propose the Unreliability-Robust Representation Learning framework (URRL), which leverages unreliability-robust contrastive learning to fortify the model against unreliable partial labels.
arXiv Detail & Related papers (2023-08-31T13:37:28Z)
- Correcting Underrepresentation and Intersectional Bias for Classification [49.1574468325115]
We consider the problem of learning from data corrupted by underrepresentation bias.
We show that with a small amount of unbiased data, we can efficiently estimate the group-wise drop-out rates.
We show that our algorithm permits efficient learning for model classes of finite VC dimension.
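A sketch of the estimation idea, under a simple retention model of our own choosing: if each group-g example survives into the biased sample with some fixed probability, retention ratios can be read off by comparing group frequencies in the biased and unbiased samples.

```python
from collections import Counter

def estimate_retention_rates(biased_groups, unbiased_groups):
    """Estimate each group's retention rate (1 - drop-out rate) from how
    over- or under-represented it is in the biased sample relative to a
    small unbiased one.  Retention is identifiable only up to a common
    scale, so the largest ratio is normalized to 1.  A sketch of the
    idea, not the paper's estimator."""
    n_b, n_u = len(biased_groups), len(unbiased_groups)
    q, p = Counter(biased_groups), Counter(unbiased_groups)
    ratios = {g: (q[g] / n_b) / (p[g] / n_u) for g in p}
    top = max(ratios.values())
    return {g: r / top for g, r in ratios.items()}

biased = ["A"] * 80 + ["B"] * 20    # group B is underrepresented
unbiased = ["A"] * 50 + ["B"] * 50  # small unbiased reference sample
print(estimate_retention_rates(biased, unbiased))  # {'A': 1.0, 'B': 0.25}
```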
arXiv Detail & Related papers (2023-06-19T18:25:44Z)
- Pseudo Labels Regularization for Imbalanced Partial-Label Learning [11.93067260471484]
The challenge of combining partial-label learning with long-tail learning lies in drawing pseudo labels that match a decent marginal prior distribution over the classes.
By penalizing the pseudo labels of head classes, our method achieves state-of-the-art performance on standardized benchmarks.
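One standard way to penalize head-class pseudo labels is prior-adjusted selection, sketched below; the exact form of the paper's regularizer may differ.

```python
import numpy as np

def rebalanced_pseudo_labels(logits, class_counts, tau=1.0):
    """Penalize head classes when drawing pseudo labels by subtracting
    tau * log(prior) from the logits before the argmax, so tail classes
    are picked more often.  A common rebalancing trick, shown only for
    illustration."""
    logits = np.asarray(logits, dtype=float)
    prior = np.asarray(class_counts, dtype=float)
    prior = prior / prior.sum()
    return (logits - tau * np.log(prior)).argmax(axis=1)

logits = np.array([[2.0, 1.8, 0.5]])  # head class 0 wins the raw argmax
counts = [900, 90, 10]                # long-tailed class frequencies
print(rebalanced_pseudo_labels(logits, counts, tau=0.5))  # [1]
```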
arXiv Detail & Related papers (2023-03-06T15:21:55Z)
- Learning with Partial Labels from Semi-supervised Perspective [28.735185883881172]
Partial Label (PL) learning refers to the task of learning from partially labeled data.
We propose a novel PL learning method, namely Partial Label learning with Semi-Supervised Perspective (PLSP).
PLSP significantly outperforms the existing PL baseline methods, especially on high ambiguity levels.
arXiv Detail & Related papers (2022-11-24T15:12:16Z)
- Learning versus Refutation in Noninteractive Local Differential Privacy [133.80204506727526]
We study two basic statistical tasks in non-interactive local differential privacy (LDP): learning and refutation.
Our main result is a complete characterization of the sample complexity of PAC learning for non-interactive LDP protocols.
arXiv Detail & Related papers (2022-10-26T03:19:24Z)
- Learning from Label Proportions by Learning with Label Noise [30.7933303912474]
Learning from label proportions (LLP) is a weakly supervised classification problem where data points are grouped into bags.
We provide a theoretically grounded approach to LLP based on a reduction to learning with label noise.
Our approach demonstrates improved empirical performance in deep learning scenarios across multiple datasets and architectures.
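A schematic version of such a reduction: give each example a random label drawn from its bag's label proportion, producing labels whose noise rate is determined by that proportion. The details here are our own simplification, not the paper's exact construction.

```python
import numpy as np

def bags_to_noisy_labels(bags, bag_proportions, rng=None):
    """Turn bag-level proportions into per-example noisy labels by
    sampling each example's label from its bag's proportion; the labels
    are correct in expectation, with noise governed by the proportion."""
    rng = np.random.default_rng(rng)
    labels = {}
    for idx, p in zip(bags, bag_proportions):
        for i in idx:
            labels[i] = int(rng.random() < p)  # P(label=1) = proportion
    return labels

bags = [[0, 1, 2, 3], [4, 5]]
props = [0.75, 0.0]  # an all-negative bag yields noise-free labels
print(bags_to_noisy_labels(bags, props, rng=0))
```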
arXiv Detail & Related papers (2022-03-04T18:52:21Z)
- Learning with Proper Partial Labels [87.65718705642819]
Partial-label learning is a kind of weakly-supervised learning with inexact labels.
We show that this proper partial-label learning framework includes many previous partial-label learning settings.
We then derive a unified unbiased estimator of the classification risk.
arXiv Detail & Related papers (2021-12-23T01:37:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.