Self-Training with Weak Supervision
- URL: http://arxiv.org/abs/2104.05514v1
- Date: Mon, 12 Apr 2021 14:45:04 GMT
- Title: Self-Training with Weak Supervision
- Authors: Giannis Karamanolakis, Subhabrata Mukherjee, Guoqing Zheng and Ahmed
Hassan Awadallah
- Abstract summary: State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks.
Weak supervision in the form of domain-specific rules has been shown to be useful in such settings.
We develop a weak supervision framework (ASTRA) that leverages all the available data for a given task.
- Score: 32.68342091430266
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art deep neural networks require large-scale labeled training
data that is often expensive to obtain or not available for many tasks. Weak
supervision in the form of domain-specific rules has been shown to be useful in
such settings to automatically generate weakly labeled training data. However,
learning with weak rules is challenging due to their inherent heuristic and
noisy nature. An additional challenge is rule coverage and overlap, where prior
work on weak supervision only considers instances that are covered by weak
rules, thus leaving valuable unlabeled data behind.
In this work, we develop a weak supervision framework (ASTRA) that leverages
all the available data for a given task. To this end, we leverage task-specific
unlabeled data through self-training with a model (student) that considers
contextualized representations and predicts pseudo-labels for instances that
may not be covered by weak rules. We further develop a rule attention network
(teacher) that learns how to aggregate student pseudo-labels with weak rule
labels, conditioned on their fidelity and the underlying context of an
instance. Finally, we construct a semi-supervised learning objective for
end-to-end training with unlabeled data, domain-specific rules, and a small
amount of labeled data. Extensive experiments on six benchmark datasets for
text classification demonstrate the effectiveness of our approach with
significant improvements over state-of-the-art baselines.
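To make the teacher-student interplay concrete, below is a minimal PyTorch sketch of a rule attention network that aggregates weak-rule votes with the student's pseudo-label, weighting each source by attention scores conditioned on the instance representation. The module name, dimensions, and toy data are illustrative assumptions, not the authors' released ASTRA implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RuleAttentionTeacher(nn.Module):
    """Aggregates weak-rule votes and the student's pseudo-label into one soft
    label per instance, with attention weights conditioned on the instance
    embedding (illustrative sketch, not the released ASTRA code)."""

    def __init__(self, embed_dim: int, num_rules: int):
        super().__init__()
        # One attention score per weak rule, plus one for the student "source".
        self.scorer = nn.Linear(embed_dim, num_rules + 1)

    def forward(self, x, rule_votes, rule_mask, student_probs):
        # x:             (B, embed_dim)  contextualized instance embedding
        # rule_votes:    (B, R, C)       one-hot class votes of R weak rules
        # rule_mask:     (B, R)          1 if the rule fires on the instance
        # student_probs: (B, C)          student's soft pseudo-label
        scores = self.scorer(x)                               # (B, R + 1)
        # The student always "fires"; rules that do not cover x are masked out.
        mask = torch.cat([rule_mask, torch.ones_like(rule_mask[:, :1])], dim=1)
        scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = F.softmax(scores, dim=1)                        # (B, R + 1)
        votes = torch.cat([rule_votes, student_probs.unsqueeze(1)], dim=1)
        return (attn.unsqueeze(-1) * votes).sum(dim=1)         # (B, C) soft label


# Toy usage: 4 instances, 3 weak rules, 2 classes.
teacher = RuleAttentionTeacher(embed_dim=8, num_rules=3)
x = torch.randn(4, 8)
rule_votes = F.one_hot(torch.randint(0, 2, (4, 3)), num_classes=2).float()
rule_mask = torch.tensor([[1, 0, 1], [0, 0, 0], [1, 1, 1], [0, 1, 0]]).float()
student_probs = torch.softmax(torch.randn(4, 2), dim=1)
soft_labels = teacher(x, rule_votes, rule_mask, student_probs)
print(soft_labels)  # rows sum to 1; uncovered instances fall back to the student
```

In the full framework, the resulting soft labels would in turn supervise the student on unlabeled instances, together with the small labeled set, under the end-to-end semi-supervised objective described above.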
Related papers
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and
Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
- Label Propagation with Weak Supervision [47.52032178837098]
We introduce a novel analysis of the classical label propagation algorithm (LPA) (Zhu & Ghahramani, 2002).
We provide an error bound that exploits both the local geometric properties of the underlying graph and the quality of the prior information.
We demonstrate the ability of our approach on multiple benchmark weakly supervised classification tasks, showing improvements upon existing semi-supervised and weakly supervised methods.
arXiv Detail & Related papers (2022-10-07T14:53:02Z)
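For context, the classical label propagation algorithm analysed in that paper can be sketched in a few lines of NumPy: labeled nodes are clamped to their labels, and every other node repeatedly absorbs the degree-normalized average of its neighbours' label distributions. The toy graph, clamping scheme, and convergence threshold below are generic textbook choices, not the paper's analysis setup.

```python
import numpy as np

def label_propagation(W, Y_labeled, labeled_idx, num_classes, max_iter=1000, tol=1e-6):
    """Classical label propagation (Zhu & Ghahramani, 2002): labeled nodes are
    clamped to their one-hot labels, and every other node repeatedly takes the
    degree-normalized average of its neighbours' label distributions."""
    n = W.shape[0]
    D_inv = 1.0 / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    P = D_inv * W                        # row-stochastic transition matrix
    F = np.full((n, num_classes), 1.0 / num_classes)
    F[labeled_idx] = Y_labeled
    for _ in range(max_iter):
        F_next = P @ F                   # propagate labels along graph edges
        F_next[labeled_idx] = Y_labeled  # clamp the labeled nodes
        if np.abs(F_next - F).max() < tol:
            return F_next
        F = F_next
    return F


# Toy usage: a 4-node chain graph with nodes 0 and 3 labeled (classes 0 and 1).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y_labeled = np.array([[1.0, 0.0], [0.0, 1.0]])
F = label_propagation(W, Y_labeled, labeled_idx=np.array([0, 3]), num_classes=2)
print(F.argmax(axis=1))  # -> [0 0 1 1]
```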
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performance on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
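A rough sketch of the decoupling idea described in that summary, assuming a shared encoder with two independently parameterized heads: one head only generates pseudo-labels on unlabeled data, while the other is trained on them together with the labeled data. The model sizes, confidence threshold, and loss weighting are illustrative assumptions rather than the paper's exact recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoHeadClassifier(nn.Module):
    """Shared encoder with two independent heads: a 'pseudo' head that only
    produces pseudo-labels and a 'main' head that consumes them (sketch of the
    decoupled generation/utilization idea, not the paper's exact model)."""

    def __init__(self, in_dim=32, hidden=64, num_classes=3):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.pseudo_head = nn.Linear(hidden, num_classes)  # generates pseudo-labels
        self.main_head = nn.Linear(hidden, num_classes)    # is trained on them

    def forward(self, x):
        z = self.encoder(x)
        return self.pseudo_head(z), self.main_head(z)


def debiased_step(model, x_lab, y_lab, x_unlab, threshold=0.95):
    """One training step: supervised loss on both heads, plus an unsupervised
    loss where the pseudo head labels the unlabeled batch and only the main
    head is trained on the confident pseudo-labels."""
    logits_p_lab, logits_m_lab = model(x_lab)
    sup_loss = F.cross_entropy(logits_p_lab, y_lab) + F.cross_entropy(logits_m_lab, y_lab)

    logits_p_unlab, logits_m_unlab = model(x_unlab)
    probs = F.softmax(logits_p_unlab.detach(), dim=1)  # generation: no gradient flows back
    conf, pseudo_y = probs.max(dim=1)
    keep = conf > threshold                            # keep only confident pseudo-labels
    if keep.any():
        unsup_loss = F.cross_entropy(logits_m_unlab[keep], pseudo_y[keep])
    else:
        unsup_loss = logits_m_unlab.sum() * 0.0
    return sup_loss + unsup_loss


# Toy usage with random data.
model = TwoHeadClassifier()
loss = debiased_step(model, torch.randn(8, 32), torch.randint(0, 3, (8,)),
                     torch.randn(16, 32))
loss.backward()
```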
- Data Consistency for Weakly Supervised Learning [15.365232702938677]
Training machine learning models involves using large amounts of human-annotated data.
We propose a novel weak supervision algorithm that processes noisy labels, i.e., weak signals.
We show that it significantly outperforms state-of-the-art weak supervision methods on both text and image classification tasks.
arXiv Detail & Related papers (2022-02-08T16:48:19Z)
- Noised Consistency Training for Text Summarization [23.16890559954038]
We argue that the reliance on large amounts of labeled data can be overcome by a semi-supervised approach based on consistency training.
We verify that leveraging large amounts of unlabeled data noticeably improves performance when only an insufficient labeled dataset is available.
arXiv Detail & Related papers (2021-05-28T07:21:39Z)
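To make the consistency-training idea above concrete, here is a generic sketch of the unsupervised term: the model's prediction on a clean unlabeled input serves as a soft target for its prediction on a noised copy of that input. The Gaussian feature noise and KL formulation are stand-in assumptions; the paper applies the idea to text summarization, where the inputs and noising would be sequence-level.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlab, noise_std=0.1):
    """Generic consistency-training term: predictions on a noised copy of an
    unlabeled input are pushed (via KL divergence) toward the model's own
    predictions on the clean input. Illustrative only."""
    with torch.no_grad():                       # clean prediction acts as a fixed target
        target = F.softmax(model(x_unlab), dim=-1)
    x_noised = x_unlab + noise_std * torch.randn_like(x_unlab)
    log_pred = F.log_softmax(model(x_noised), dim=-1)
    return F.kl_div(log_pred, target, reduction="batchmean")


# Toy usage: a linear "model" over 16-dim features and an unlabeled batch.
model = torch.nn.Linear(16, 4)
x_unlab = torch.randn(32, 16)
loss = consistency_loss(model, x_unlab)         # added to the supervised loss
loss.backward()
```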
- Self-Tuning for Data-Efficient Deep Learning [75.34320911480008]
Self-Tuning is a novel approach to enable data-efficient deep learning.
It unifies the exploration of labeled and unlabeled data and the transfer of a pre-trained model.
It outperforms its SSL and TL counterparts on five tasks by sharp margins.
arXiv Detail & Related papers (2021-02-25T14:56:19Z)
- Self-supervised driven consistency training for annotation efficient
histopathology image analysis [13.005873872821066]
Training a neural network with a large labeled dataset is still a dominant paradigm in computational histopathology.
We propose a self-supervised pretext task that harnesses the underlying multi-resolution contextual cues in histology whole-slide images to learn a powerful supervisory signal for unsupervised representation learning.
We also propose a new teacher-student semi-supervised consistency paradigm that learns to effectively transfer the pretrained representations to downstream tasks based on prediction consistency with the task-specific un-labeled data.
arXiv Detail & Related papers (2021-02-07T19:46:21Z)
- Predicting Themes within Complex Unstructured Texts: A Case Study on
Safeguarding Reports [66.39150945184683]
We focus on the problem of automatically identifying the main themes in a safeguarding report using supervised classification approaches.
Our results show the potential of deep learning models to simulate subject-expert behaviour even for complex tasks with limited labelled data.
arXiv Detail & Related papers (2020-10-27T19:48:23Z)
- Social Adaptive Module for Weakly-supervised Group Activity Recognition [143.68241396839062]
This paper presents a new task named weakly-supervised group activity recognition (GAR).
It differs from conventional GAR tasks in that only video-level labels are available, yet the important persons within each frame are not provided even in the training data.
This eases the collection and annotation of a large-scale NBA dataset and thus raises new challenges for GAR.
arXiv Detail & Related papers (2020-07-18T16:40:55Z)
- Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
This is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
arXiv Detail & Related papers (2020-03-02T13:59:41Z)
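The infimum-loss construction above admits a compact formulation: for a partially labelled example whose candidate set S contains the true label, the loss is the smallest per-label loss over S. A minimal sketch with cross-entropy as the base loss; the classifier and candidate sets are purely illustrative.

```python
import torch
import torch.nn.functional as F

def infimum_cross_entropy(logits, candidate_mask):
    """Infimum (partial-label) loss: for each example, take the minimum
    cross-entropy over the candidate label set S containing the true label.
    logits:         (B, C) classifier scores
    candidate_mask: (B, C) 1 where the class belongs to the candidate set S."""
    log_probs = F.log_softmax(logits, dim=1)                 # (B, C)
    losses = -log_probs                                      # per-class CE loss
    losses = losses.masked_fill(candidate_mask == 0, float("inf"))
    return losses.min(dim=1).values.mean()                   # inf over S, mean over batch


# Toy usage: 2 examples, 4 classes; each supervised only by a set of labels.
logits = torch.randn(2, 4, requires_grad=True)
candidate_mask = torch.tensor([[1, 1, 0, 0],    # true label is either class 0 or 1
                               [0, 0, 1, 1]])   # true label is either class 2 or 3
loss = infimum_cross_entropy(logits, candidate_mask)
loss.backward()
```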
This list is automatically generated from the titles and abstracts of the papers on this site.