A Unified Approach to Count-Based Weakly-Supervised Learning
- URL: http://arxiv.org/abs/2311.13718v1
- Date: Wed, 22 Nov 2023 22:23:34 GMT
- Title: A Unified Approach to Count-Based Weakly-Supervised Learning
- Authors: Vinay Shukla, Zhe Zeng, Kareem Ahmed, Guy Van den Broeck
- Abstract summary: We develop a unified approach to learning from weakly-labeled data.
We compute the probability of exactly k out of n outputs being set to true.
We evaluate our approach on three common weakly-supervised learning paradigms.
- Score: 30.953260850416157
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-quality labels are often very scarce, whereas unlabeled data with
inferred weak labels occurs more naturally. In many cases, these weak labels
dictate the frequency of each respective class over a set of instances. In this
paper, we develop a unified approach to learning from such weakly-labeled data,
which we call count-based weakly-supervised learning. At the heart of our
approach is the ability to compute the probability of exactly k out of n
outputs being set to true. This computation is differentiable, exact, and
efficient. Building on this computation, we derive a count loss
penalizing the model for deviations in its distribution from an arithmetic
constraint defined over label counts. We evaluate our approach on three common
weakly-supervised learning paradigms and observe that our proposed approach
achieves state-of-the-art or highly competitive results across all three paradigms.
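
The computation at the heart of the abstract, the probability that exactly k of n outputs are true, admits a standard exact dynamic program (it is the Poisson binomial distribution of the n predicted probabilities). The sketch below is a minimal PyTorch illustration of that idea, assuming independent per-output probabilities; the function names and the choice of negative log-likelihood as the count penalty are illustrative assumptions, not the authors' implementation.

# Minimal sketch: exact, differentiable probability that exactly k of n
# independent Bernoulli outputs are true, plus a count loss built on it.
# Function names and the NLL-style penalty are illustrative assumptions.
import torch

def count_probability(probs: torch.Tensor) -> torch.Tensor:
    """Return a length-(n + 1) tensor whose entry k is P(exactly k of the n
    outputs are true), given per-output probabilities `probs` of shape [n]."""
    n = probs.shape[0]
    # dp[k] = P(exactly k successes among the outputs processed so far)
    dp = torch.zeros(n + 1, dtype=probs.dtype, device=probs.device)
    dp[0] = 1.0
    for i in range(n):
        p = probs[i]
        # Either output i is false (count stays at k) or true (count moves to k + 1).
        shifted = torch.cat([torch.zeros(1, dtype=probs.dtype, device=probs.device), dp[:-1]])
        dp = dp * (1 - p) + shifted * p
    return dp

def count_loss(probs: torch.Tensor, k: int) -> torch.Tensor:
    """Negative log-probability that exactly k outputs are true; differentiable
    in `probs`, so it can penalize deviations from a count constraint."""
    dist = count_probability(probs)
    return -torch.log(dist[k] + 1e-12)

# Example: five outputs, weak label says exactly two of them are true.
probs = torch.tensor([0.9, 0.2, 0.7, 0.1, 0.4], requires_grad=True)
loss = count_loss(probs, k=2)
loss.backward()  # gradients flow through the exact dynamic program

Each of the n loop iterations updates the n + 1 count buckets, so the probability is exact in O(n^2) time and differentiable end to end, which is what allows a count-based constraint to be used directly as a training loss.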
Related papers
- Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
arXiv Detail & Related papers (2023-11-27T02:59:17Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
- Learning from Label Proportions by Learning with Label Noise [30.7933303912474]
Learning from label proportions (LLP) is a weakly supervised classification problem where data points are grouped into bags.
We provide a theoretically grounded approach to LLP based on a reduction to learning with label noise.
Our approach demonstrates improved empirical performance in deep learning scenarios across multiple datasets and architectures.
arXiv Detail & Related papers (2022-03-04T18:52:21Z)
- Learning with Proper Partial Labels [87.65718705642819]
Partial-label learning is a kind of weakly-supervised learning with inexact labels.
We show that this proper partial-label learning framework includes many previous partial-label learning settings.
We then derive a unified unbiased estimator of the classification risk.
arXiv Detail & Related papers (2021-12-23T01:37:03Z)
- Multi-class Probabilistic Bounds for Self-learning [13.875239300089861]
Pseudo-labeling is prone to error and runs the risk of adding noisy labels into unlabeled training data.
We present a probabilistic framework for analyzing self-learning in the multi-class classification scenario with partially labeled data.
arXiv Detail & Related papers (2021-09-29T13:57:37Z)
- Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces [64.23172847182109]
We show that different negative sampling schemes implicitly trade-off performance on dominant versus rare labels.
We provide a unified means to explicitly tackle both sampling bias, arising from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance.
arXiv Detail & Related papers (2021-05-12T15:40:13Z)
- Cost-Based Budget Active Learning for Deep Learning [0.9732863739456035]
We propose Cost-Based Budget Active Learning (CBAL), which considers classification uncertainty as well as instance diversity in a population constrained by a budget.
A principled min-max approach is used to minimize both the labeling and decision cost of the selected instances.
arXiv Detail & Related papers (2020-12-09T17:42:44Z)
- Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
Partial labelling is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
arXiv Detail & Related papers (2020-03-02T13:59:41Z)