Optimal Learning from Label Proportions with General Loss Functions
- URL: http://arxiv.org/abs/2509.15145v1
- Date: Thu, 18 Sep 2025 16:53:32 GMT
- Title: Optimal Learning from Label Proportions with General Loss Functions
- Authors: Lorne Applebaum, Travis Dick, Claudio Gentile, Haim Kaplan, Tomer Koren,
- Abstract summary: We introduce a novel and versatile low-variance de-biasing methodology to learn from aggregate label information. Our approach exhibits remarkable flexibility, seamlessly accommodating a broad spectrum of practically relevant loss functions. We empirically validate the efficacy of our proposed approach across a diverse array of benchmark datasets.
- Score: 33.827617632719864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motivated by problems in online advertising, we address the task of Learning from Label Proportions (LLP). In this partially-supervised setting, training data consists of groups of examples, termed bags, for which we only observe the average label value. The main goal, however, remains the design of a predictor for the labels of individual examples. We introduce a novel and versatile low-variance de-biasing methodology to learn from aggregate label information, significantly advancing the state of the art in LLP. Our approach exhibits remarkable flexibility, seamlessly accommodating a broad spectrum of practically relevant loss functions across both binary and multi-class classification settings. By carefully combining our estimators with standard techniques, we substantially improve sample complexity guarantees for a large class of losses of practical relevance. We also empirically validate the efficacy of our proposed approach across a diverse array of benchmark datasets, demonstrating compelling empirical advantages over standard baselines.
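To make the LLP setting concrete, here is a minimal sketch (illustrative only; the synthetic data and variable names are ours, not the paper's) of what a learner observes: features for every example, but labels only as per-bag averages.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy population: 2-D features with binary labels.
n, bag_size = 1000, 10
X = rng.normal(size=(n, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(float)

# Partition examples into bags. The learner sees every feature vector,
# but labels are revealed only as each bag's average value, i.e. its
# label proportion; the individual y_i are never observed.
bags = X.reshape(n // bag_size, bag_size, 2)
label_proportions = y.reshape(n // bag_size, bag_size).mean(axis=1)

print(bags.shape)              # (100, 10, 2): features fully observed
print(label_proportions[:5])   # only per-bag average labels
```

The goal remains a predictor for individual labels, which is why debiased estimates of the per-example loss, such as those developed in this paper and in EasyLLP below, are needed.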
Related papers
- Rethinking Consistent Multi-Label Classification under Inexact Supervision [60.79309683889278]
In partial multi-label learning, each instance is annotated with a candidate label set, among which only some labels are relevant. In complementary multi-label learning, each instance is annotated with complementary labels indicating the classes to which the instance does not belong.
arXiv Detail & Related papers (2025-10-05T08:30:32Z) - Multi-Label Contrastive Learning: A Comprehensive Study [48.81069245141415]
Multi-label classification has emerged as a key area in both research and industry. Applying contrastive learning to multi-label classification presents unique challenges. We conduct an in-depth study of contrastive learning loss for multi-label classification across diverse settings.
arXiv Detail & Related papers (2024-11-27T20:20:06Z) - Class-aware and Augmentation-free Contrastive Learning from Label Proportion [19.41511190742059]
Learning from Label Proportion (LLP) is a weakly supervised learning scenario in which training data is organized into predefined bags of instances.
We propose an augmentation-free contrastive framework TabLLP-BDC that introduces class-aware supervision at the instance level.
Our solution features a two-stage Bag Difference Contrastive (BDC) learning mechanism that establishes robust class-aware instance-level supervision.
arXiv Detail & Related papers (2024-08-13T09:04:47Z) - Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
arXiv Detail & Related papers (2023-11-27T02:59:17Z) - Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z) - Easy Learning from Label Proportions [17.71834385754893]
EasyLLP is a flexible and simple-to-implement debiasing approach based on aggregate labels.
Our technique allows us to accurately estimate the expected loss of an arbitrary model at the level of individual examples (a sketch of this style of estimator appears after this list).
arXiv Detail & Related papers (2023-02-06T20:41:38Z) - SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z) - Controller-Guided Partial Label Consistency Regularization with Unlabeled Data [49.24911720809604]
We propose a controller-guided consistency regularization at both the label-level and representation-level.
We dynamically adjust the confidence thresholds so that the number of samples from each class participating in consistency regularization remains roughly equal, alleviating the class-imbalance problem.
arXiv Detail & Related papers (2022-10-20T12:15:13Z) - A Deep Model for Partial Multi-Label Image Classification with Curriculum Based Disambiguation [42.0958430465578]
We study the partial multi-label (PML) image classification problem.
Existing PML methods typically design a disambiguation strategy to filter out noisy labels.
We propose a deep model for PML that enhances representation and discrimination ability.
arXiv Detail & Related papers (2022-07-06T02:49:02Z) - Learning from Label Proportions by Learning with Label Noise [30.7933303912474]
Learning from label proportions (LLP) is a weakly supervised classification problem where data points are grouped into bags.
We provide a theoretically grounded approach to LLP based on a reduction to learning with label noise.
Our approach demonstrates improved empirical performance in deep learning scenarios across multiple datasets and architectures.
arXiv Detail & Related papers (2022-03-04T18:52:21Z) - Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
Partial labelling is a type of incomplete annotation where, for each datapoint, supervision is given as a set of labels containing the true one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
arXiv Detail & Related papers (2020-03-02T13:59:41Z)
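As a concrete instance of debiasing from aggregate labels (see the EasyLLP entry above), here is a sketch of an EasyLLP-style unbiased per-example loss estimate for binary classification. It assumes bag members are drawn i.i.d. and that the label prior p = E[y] is known or well estimated; all names are ours, and the sketch follows the earlier EasyLLP form rather than the lower-variance estimators of the headline paper.

```python
import numpy as np

def easyllp_per_example_loss(l1, l0, alpha_bar, k, p):
    """EasyLLP-style unbiased per-example loss estimate (binary labels).

    l1, l0: losses of each bag example if its label were 1 or 0, respectively.
    alpha_bar: the bag's observed label proportion.
    Assumes i.i.d. bag members and a known label prior p = E[y].
    """
    return k * (alpha_bar - p) * (l1 - l0) + p * l1 + (1 - p) * l0

# Monte-Carlo sanity check on synthetic data (illustrative only).
rng = np.random.default_rng(0)
k, p, n_bags = 10, 0.3, 50_000
y = rng.random((n_bags, k)) < p                    # hidden labels, never shown to the learner
s = np.clip(0.35 + 0.3 * y + 0.2 * rng.random((n_bags, k)), 0.0, 1.0)  # model scores
l1, l0 = (1 - s) ** 2, s ** 2                      # squared loss against label 1 / 0
alpha_bar = y.mean(axis=1, keepdims=True)          # the only label signal observed

est = easyllp_per_example_loss(l1, l0, alpha_bar, k, p)
true = np.where(y, l1, l0)
print(est.mean(), true.mean())  # the two means should agree closely
```

The per-example estimates themselves are noisy (the correction term scales with the bag size k), which is precisely the variance issue that the low-variance methodology of the headline paper targets.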