PRBoost: Prompt-Based Rule Discovery and Boosting for Interactive
Weakly-Supervised Learning
- URL: http://arxiv.org/abs/2203.09735v1
- Date: Fri, 18 Mar 2022 04:23:20 GMT
- Title: PRBoost: Prompt-Based Rule Discovery and Boosting for Interactive
Weakly-Supervised Learning
- Authors: Rongzhi Zhang, Yue Yu, Pranav Shetty, Le Song, Chao Zhang
- Abstract summary: Weakly-supervised learning (WSL) has shown promising results in addressing label scarcity on many NLP tasks.
Our proposed model, named PRBoost, achieves this goal via iterative prompt-based rule discovery and model boosting.
Experiments on four tasks show PRBoost outperforms state-of-the-art WSL baselines by up to 7.1%.
- Score: 57.66155242473784
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Weakly-supervised learning (WSL) has shown promising results in addressing
label scarcity on many NLP tasks, but manually designing a comprehensive,
high-quality labeling rule set is tedious and difficult. We study interactive
weakly-supervised learning -- the problem of iteratively and automatically
discovering novel labeling rules from data to improve the WSL model. Our
proposed model, named PRBoost, achieves this goal via iterative prompt-based
rule discovery and model boosting. It uses boosting to identify large-error
instances and then discovers candidate rules from them by prompting pre-trained
LMs with rule templates. The candidate rules are judged by human experts, and
the accepted rules are used to generate complementary weak labels and
strengthen the current model. Experiments on four tasks show PRBoost
outperforms state-of-the-art WSL baselines by up to 7.1% and bridges the gap
with fully supervised models. Our implementation is available at
https://github.com/rz-zhang/PRBoost.
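The abstract describes an iterative loop: boost to find large-error instances, prompt a pre-trained LM with rule templates to propose candidate rules from those instances, have a human expert accept or reject the candidates, then weakly label data with the accepted rules and add a new weak learner. Below is a minimal Python sketch of that loop; every helper callable and the error threshold are hypothetical placeholders rather than the authors' released code (see the linked repository for that).

```python
# Minimal sketch of the interactive rule-discovery-and-boosting loop described
# in the abstract. Every helper callable (error_fn, propose_rule, human_accepts,
# apply_rules, fit_weak_learner) is a hypothetical placeholder, not the
# authors' released API.

def prboost_loop(labeled, unlabeled, templates, *,
                 error_fn, propose_rule, human_accepts, apply_rules,
                 fit_weak_learner, rounds=5, error_threshold=0.5):
    ensemble = [fit_weak_learner(labeled)]          # initial model on clean labels
    for _ in range(rounds):
        # 1. Boosting step: collect instances the current ensemble gets badly wrong.
        hard = [(x, y) for x, y in labeled
                if error_fn(ensemble, x, y) > error_threshold]

        # 2. Rule discovery: prompt a pre-trained LM with rule templates
        #    instantiated from the large-error instances.
        candidates = [propose_rule(template, x)
                      for template in templates for x, _ in hard]

        # 3. Human-in-the-loop judgment: keep only expert-accepted rules.
        accepted = [rule for rule in candidates if human_accepts(rule)]

        # 4. Weak labeling: apply accepted rules to unlabeled data and fit a new
        #    weak learner on the complementary weak labels.
        weak = [(x, apply_rules(accepted, x)) for x in unlabeled]
        weak = [(x, y) for x, y in weak if y is not None]
        ensemble.append(fit_weak_learner(labeled + weak))
    return ensemble
```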
Related papers
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
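As a rough illustration of weighting instead of thresholding, the sketch below turns each example's training dynamics (here, the mean probability a classifier assigned to its distantly supervised label across epochs) into a soft importance weight; the paper's exact weighting function is not reproduced here, so treat this as an assumption.

```python
import numpy as np

def dynamics_weights(prob_history):
    """prob_history: (epochs, n_examples) array of the probability a classifier
    assigned to each example's distantly supervised label at each epoch.
    Returns a soft importance weight per example (an assumed choice of weighting)."""
    return prob_history.mean(axis=0)

def threshold_filter(prob_history, tau=0.8):
    """The baseline behavior the summary contrasts against: a hard 0/1 mask that
    simply discards examples whose mean confidence falls below a threshold."""
    return (prob_history.mean(axis=0) > tau).astype(float)
```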
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
- Self-regulating Prompts: Foundational Model Adaptation without Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z)
- Multiclass Boosting: Simple and Intuitive Weak Learning Criteria [72.71096438538254]
We give a simple and efficient boosting algorithm that does not require realizability assumptions.
We present a new result on boosting for list learners, as well as provide a novel proof for the characterization of multiclass PAC learning.
arXiv Detail & Related papers (2023-07-02T19:26:58Z)
- Local Boosting for Weakly-Supervised Learning [21.95003048165616]
Boosting is a technique to enhance the performance of a set of base models by combining them into a strong ensemble model.
In weakly supervised learning, where most of the data is labeled through weak and noisy sources, it remains nontrivial to design effective boosting approaches.
We propose LocalBoost, a novel framework for weakly-supervised boosting.
arXiv Detail & Related papers (2023-06-05T13:24:03Z)
- To Copy Rather Than Memorize: A Vertical Learning Paradigm for Knowledge Graph Completion [35.05965140700747]
We extend embedding models by allowing them to explicitly copy target information from related factual triples for more accurate prediction.
We also propose a novel relative distance-based negative sampling technique (ReD) for more effective optimization.
arXiv Detail & Related papers (2023-05-23T14:53:20Z)
- Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
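A toy sketch of the idea: a small policy model produces an instance-specific stimulus (e.g., a few hint keywords) that is prepended to the input before it goes to the black-box LLM. Both callables below are placeholders, not the paper's actual models or any real API.

```python
# Toy sketch: a small tunable policy model emits a directional stimulus (hint)
# for each input, and the frozen black-box LLM is prompted with the hint prepended.
# `policy_model` and `llm_generate` are hypothetical callables, not a real API.

def directional_stimulus_generate(x, policy_model, llm_generate):
    stimulus = policy_model(x)                     # e.g., a few hint keywords for this input
    prompt = f"Hint: {stimulus}\nInput: {x}\nOutput:"
    return llm_generate(prompt)                    # only the small policy model is tuned
```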
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
- LabelPrompt: Effective Prompt-based Learning for Relation Classification [31.291466190218912]
This paper presents a novel prompt-based learning method, namely LabelPrompt, for the relation classification task.
Motivated by the intuition of "GIVE MODEL CHOICES!", we first define additional tokens to represent relation labels and regard these tokens as the verbaliser with semantic initialisation.
Then, to mitigate inconsistency between predicted relations and given entities, we implement an entity-aware module with contrastive learning.
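A minimal sketch of adding label tokens as a verbalizer with semantic initialisation, using the Hugging Face transformers API; the relation names and descriptive words are made-up examples, and the paper's exact initialisation scheme may differ.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# One new token per relation label; the [MASK] position is decoded against
# these tokens (the verbalizer). The label names and words are made-up examples.
label_tokens = {"[REL_founder_of]": "founder", "[REL_born_in]": "born"}
tokenizer.add_tokens(list(label_tokens))
model.resize_token_embeddings(len(tokenizer))

# Semantic initialisation: start each label token from the embedding of a
# descriptive word rather than from a random vector.
with torch.no_grad():
    emb = model.get_input_embeddings().weight
    for token, word in label_tokens.items():
        tid = tokenizer.convert_tokens_to_ids(token)
        wid = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(word))[0]
        emb[tid] = emb[wid].clone()
```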
arXiv Detail & Related papers (2023-02-16T04:06:25Z)
- Low Resource Pipeline for Spoken Language Understanding via Weak Supervision [5.9901156966011975]
In Weakly Supervised Learning (WSL), a model is trained over noisy labels obtained from semantic rules and task-specific pre-trained models.
We show that task-agnostic prompts are generalizable and can be used to obtain noisy labels for different Spoken Language Understanding (SLU) tasks.
We demonstrate that prompt-based methods generate reliable labels for the above SLU tasks and thus can be used as a universal weak source to train a weakly-supervised model (WSM) in the absence of labeled data.
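A toy illustration of a task-agnostic prompt acting as a weak labeler: the prompt, task, and verbalizer below are invented for illustration, and `lm_complete` stands in for any pre-trained LM's text-completion call.

```python
# Toy weak labeler: a task-agnostic prompt plus a verbalizer maps an LM completion
# to a noisy label for an utterance. `lm_complete` is a placeholder for any
# pre-trained LM's text-completion call; the task and verbalizer are invented.

DEFAULT_VERBALIZER = {"yes": "question", "no": "statement"}

def prompt_weak_label(utterance, lm_complete, verbalizer=DEFAULT_VERBALIZER):
    prompt = f'Utterance: "{utterance}"\nIs this utterance a question? Answer yes or no:'
    answer = lm_complete(prompt).strip().lower()
    return verbalizer.get(answer)      # None when the completion cannot be mapped
```

Noisy labels produced this way can then supervise the weakly-supervised model.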
arXiv Detail & Related papers (2022-06-21T17:36:31Z)
- Large-Scale Pre-training for Person Re-identification with Noisy Labels [125.49696935852634]
We develop a large-scale Pre-training framework utilizing Noisy Labels (PNL).
In principle, joint learning of these three modules not only clusters similar examples to one prototype, but also rectifies noisy labels based on the prototype assignment.
This simple pre-training task provides a scalable way to learn SOTA Re-ID representations from scratch on "LUPerson-NL" without bells and whistles.
arXiv Detail & Related papers (2022-03-30T17:59:58Z)
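A rough sketch of the prototype-based rectification idea above, under assumed details (cosine similarity, a fixed confidence margin): each example is assigned to its nearest class prototype, and its noisy label is overwritten only when that assignment confidently disagrees with the label.

```python
import numpy as np

def rectify_labels(features, noisy_labels, prototypes, margin=0.2):
    """features: (n, d) L2-normalized embeddings; prototypes: (c, d) normalized
    class centers; noisy_labels: (n,) int array. Metric and margin are assumptions."""
    sims = features @ prototypes.T                      # cosine similarity to each prototype
    assigned = sims.argmax(axis=1)                      # nearest-prototype assignment
    label_sim = np.take_along_axis(sims, noisy_labels[:, None], axis=1).squeeze(1)
    confident = sims.max(axis=1) - label_sim > margin   # assignment clearly beats the noisy label
    return np.where(confident, assigned, noisy_labels)  # rectify only confident disagreements
```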