Creating Training Sets via Weak Indirect Supervision
- URL: http://arxiv.org/abs/2110.03484v1
- Date: Thu, 7 Oct 2021 14:09:35 GMT
- Title: Creating Training Sets via Weak Indirect Supervision
- Authors: Jieyu Zhang, Bohan Wang, Xiangchen Song, Yujing Wang, Yaming Yang,
Jing Bai, Alexander Ratner
- Abstract summary: Weak Supervision (WS) frameworks synthesize training labels from multiple potentially noisy supervision sources.
We formulate Weak Indirect Supervision (WIS), a new research problem for automatically synthesizing training labels.
We develop a probabilistic modeling approach, PLRM, which uses user-provided label relations to model and leverage indirect supervision sources.
- Score: 66.77795318313372
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Creating labeled training sets has become one of the major roadblocks in
machine learning. To address this, recent Weak Supervision (WS) frameworks
synthesize training labels from multiple potentially noisy supervision sources.
However, existing frameworks are restricted to supervision sources that share
the same output space as the target task. To extend the scope of usable
sources, we formulate Weak Indirect Supervision (WIS), a new research problem
for automatically synthesizing training labels based on indirect supervision
sources that have different output label spaces. To overcome the challenge of
mismatched output spaces, we develop a probabilistic modeling approach, PLRM,
which uses user-provided label relations to model and leverage indirect
supervision sources. Moreover, we provide a theoretically-principled test of
the distinguishability of PLRM for unseen labels, along with a generalization
bound. On both image and text classification tasks as well as an industrial
advertising application, we demonstrate the advantages of PLRM by outperforming
baselines by a margin of 2%-9%.
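To make the mismatched-output-space setting concrete, below is a minimal, illustrative Python sketch (not the paper's PLRM model; the label names, relations, and the naive combination rule are all hypothetical) of how user-provided label relations can map an indirect source's vote onto the target label space:

```python
# Minimal sketch, NOT the actual PLRM implementation: combining votes from
# indirect supervision sources whose output spaces differ from the target
# task, via user-provided label relations. All names are illustrative.
import numpy as np

TARGET_LABELS = ["dog", "cat", "bird"]  # hypothetical target label space

# User-provided label relations: each source-side label maps to the set of
# target labels it is consistent with (e.g. "mammal" covers dog and cat).
LABEL_RELATIONS = {
    "mammal": {"dog", "cat"},
    "bird":   {"bird"},
    "pet":    {"dog", "cat", "bird"},
}

def vote_to_distribution(source_label: str) -> np.ndarray:
    """Spread an indirect vote uniformly over its consistent target labels."""
    consistent = LABEL_RELATIONS[source_label]
    probs = np.array([1.0 if t in consistent else 0.0 for t in TARGET_LABELS])
    return probs / probs.sum()

def synthesize_label(source_votes: list[str]) -> str:
    """Naively combine indirect votes by multiplying per-source distributions."""
    combined = np.ones(len(TARGET_LABELS))
    for vote in source_votes:
        combined *= vote_to_distribution(vote)
    if combined.sum() == 0:  # sources are mutually inconsistent
        return "abstain"
    return TARGET_LABELS[int(np.argmax(combined))]

# "mammal" and "pet" agree only on {dog, cat}; argmax breaks the tie.
print(synthesize_label(["mammal", "pet"]))  # -> "dog"
```

The sketch only shows the core idea of propagating votes through label relations; PLRM itself learns a probabilistic model over the sources rather than combining them with a fixed rule.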
Related papers
- Class-aware and Augmentation-free Contrastive Learning from Label Proportion [19.41511190742059]
Learning from Label Proportion (LLP) is a weakly supervised learning scenario in which training data is organized into predefined bags of instances.
We propose an augmentation-free contrastive framework TabLLP-BDC that introduces class-aware supervision at the instance level.
Our solution features a two-stage Bag Difference Contrastive (BDC) learning mechanism that establishes robust class-aware instance-level supervision.
arXiv Detail &amp; Related papers (2024-08-13T09:04:47Z)
- AutoWS: Automated Weak Supervision Framework for Text Classification [1.748907524043535]
We propose a novel framework for increasing the efficiency of the weak supervision process while decreasing the dependency on domain experts.
Our method requires a small set of labeled examples per label class and automatically creates a set of labeling functions to assign noisy labels to numerous unlabeled data.
arXiv Detail &amp; Related papers (2023-02-07T07:12:05Z)
- A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels [96.56299163691979]
This paper focuses on a new weakly-supervised salient object detection (SOD) task under hybrid labels.
To address the issues of label noise and quantity imbalance in this task, we design a new pipeline framework with three sophisticated training strategies.
Experiments on five SOD benchmarks show that our method achieves competitive performance against weakly-supervised/unsupervised methods.
arXiv Detail &amp; Related papers (2022-09-07T06:45:39Z)
- Binary Classification with Positive Labeling Sources [71.37692084951355]
We propose WEAPO, a simple yet competitive WS method for producing training labels without negative labeling sources.
We show WEAPO achieves the highest average performance on 10 benchmark datasets.
arXiv Detail &amp; Related papers (2022-08-02T19:32:08Z)
- Balancing Discriminability and Transferability for Source-Free Domain Adaptation [55.143687986324935]
Conventional domain adaptation (DA) techniques aim to improve domain transferability by learning domain-invariant representations.
The requirement of simultaneous access to labeled source and unlabeled target data renders them unsuitable for the challenging source-free DA setting.
We derive novel insights to show that a mixup between original and corresponding translated generic samples enhances the discriminability-transferability trade-off.
arXiv Detail &amp; Related papers (2022-06-16T09:06:22Z)
- Self-supervised learning for joint SAR and multispectral land cover classification [38.8529535887097]
We present a framework and specific tasks for self-supervised training of multichannel models.
We show that the proposed self-supervised approach is highly effective at learning features that correlate with the labels for land cover classification.
arXiv Detail &amp; Related papers (2021-08-20T09:02:07Z)
- MatchGAN: A Self-Supervised Semi-Supervised Conditional Generative Adversarial Network [51.84251358009803]
We present a novel self-supervised learning approach for conditional generative adversarial networks (GANs) under a semi-supervised setting.
We perform augmentation by randomly sampling sensible labels from the label space of the few labelled examples available.
Our method surpasses the baseline with only 20% of the labelled examples used to train the baseline.
arXiv Detail &amp; Related papers (2020-06-11T17:14:55Z)
- Universal Source-Free Domain Adaptation [57.37520645827318]
We propose a novel two-stage learning process for domain adaptation.
In the Procurement stage, we aim to equip the model for future source-free deployment, assuming no prior knowledge of the upcoming category-gap and domain-shift.
In the Deployment stage, the goal is to design a unified adaptation algorithm capable of operating across a wide range of category-gaps.
arXiv Detail & Related papers (2020-04-09T07:26:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.