Low Resource Pipeline for Spoken Language Understanding via Weak
Supervision
- URL: http://arxiv.org/abs/2206.10559v1
- Date: Tue, 21 Jun 2022 17:36:31 GMT
- Title: Low Resource Pipeline for Spoken Language Understanding via Weak
Supervision
- Authors: Ayush Kumar, Rishabh Kumar Tripathi, Jithendra Vepa
- Abstract summary: In Weakly Supervised Learning (WSL), a model is trained over noisy labels obtained from semantic rules and task-specific pre-trained models.
We show that task-agnostic prompts are generalizable and can be used to obtain noisy labels for different Spoken Language Understanding (SLU) tasks.
We demonstrate that prompt-based methods generate reliable labels for the above SLU tasks and can thus be used as a universal weak source to train a weakly supervised model (WSM) in the absence of labeled data.
- Score: 5.9901156966011975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Weakly Supervised Learning (WSL), a model is trained over noisy
labels obtained from semantic rules and task-specific pre-trained models. Rules
offer limited generalization across tasks and require significant manual
effort, while pre-trained models are available only for a limited set of tasks.
In this work, we
propose to utilize prompt-based methods as weak sources to obtain the noisy
labels on unannotated data. We show that task-agnostic prompts are
generalizable and can be used to obtain noisy labels for different Spoken
Language Understanding (SLU) tasks such as sentiment classification, disfluency
detection and emotion classification. These prompts could additionally be
updated to add task-specific contexts, thus providing flexibility to design
task-specific prompts. We demonstrate that prompt-based methods generate
reliable labels for the above SLU tasks and can thus be used as a universal
weak source to train a weakly supervised model (WSM) in the absence of labeled
data.
Our proposed WSL pipeline, trained over the prompt-based weak source,
outperforms other competitive low-resource baselines in zero- and few-shot
settings by more than 4% Macro-F1 on all three benchmark SLU datasets. The
proposed method also outperforms a conventional rule-based WSL pipeline by more
than 5% Macro-F1.
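
The sketch below illustrates the two-stage idea from the abstract: (1) apply a cloze-style prompt to a pre-trained masked language model to obtain noisy labels on unannotated utterances, and (2) train a lightweight weakly supervised model (WSM) on those noisy labels. It is a minimal sketch under assumed choices: the model name (roberta-base), the prompt template, the verbalizers, and the tf-idf plus logistic-regression WSM are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch: prompt-based weak labeling followed by training a simple WSM.
# Model name, prompt template, verbalizers, and classifier are assumptions for
# illustration only, not the authors' exact pipeline.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from transformers import pipeline

# Stage 1: prompt-based weak labeling (sentiment classification as the example task).
fill_mask = pipeline("fill-mask", model="roberta-base")
MASK = fill_mask.tokenizer.mask_token
VERBALIZERS = {"positive": " great", "negative": " terrible"}  # assumed label words

def weak_label(utterance: str) -> str:
    """Score each verbalizer in the masked slot and return the higher-scoring noisy label."""
    prompt = f"{utterance} It was{MASK}."
    scores = {}
    for label, word in VERBALIZERS.items():
        # `targets` restricts fill-mask scoring to the candidate label word.
        result = fill_mask(prompt, targets=[word])
        scores[label] = result[0]["score"]
    return max(scores, key=scores.get)

# Stage 2: train a lightweight weakly supervised model (WSM) on the noisy labels.
def train_wsm(unannotated_corpus):
    """Fit a tf-idf + logistic-regression WSM on prompt-derived noisy labels."""
    noisy_labels = [weak_label(u) for u in unannotated_corpus]
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(unannotated_corpus)
    wsm = LogisticRegression(max_iter=1000).fit(features, noisy_labels)
    return vectorizer, wsm

if __name__ == "__main__":
    unannotated = [
        "the agent fixed my problem right away, thank you",
        "i was on hold for an hour and nobody helped me",
        "really happy with how quickly this was resolved",
        "what a frustrating waste of my time",
    ]
    vectorizer, wsm = train_wsm(unannotated)
    print(wsm.predict(vectorizer.transform(["great support, very satisfied"])))
```

Task-specific context can be added by editing the prompt template and verbalizers, mirroring the flexibility described in the abstract.
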
Related papers
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
- Self-regulating Prompts: Foundational Model Adaptation without Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z)
- Automated Few-shot Classification with Instruction-Finetuned Language Models [76.69064714392165]
We show that AuT-Few outperforms state-of-the-art few-shot learning methods.
We also show that AuT-Few is the best ranking method across datasets on the RAFT few-shot benchmark.
arXiv Detail & Related papers (2023-05-21T21:50:27Z)
- Task Residual for Tuning Vision-Language Models [69.22958802711017]
We propose a new efficient tuning approach for vision-language models (VLMs) named Task Residual Tuning (TaskRes).
TaskRes explicitly decouples the prior knowledge of the pre-trained models from new knowledge regarding a target task.
The proposed TaskRes is simple yet effective, and it significantly outperforms previous methods on 11 benchmark datasets.
arXiv Detail & Related papers (2022-11-18T15:09:03Z)
- Meta Auxiliary Learning for Low-resource Spoken Language Understanding [11.002938634213734]
Spoken language understanding (SLU) treats automatic speech recognition (ASR) and natural language understanding (NLU) as a unified task.
We exploit a joint ASR and NLU training method based on meta auxiliary learning to improve performance on the low-resource SLU task.
arXiv Detail & Related papers (2022-06-26T03:12:33Z)
- PromptDA: Label-guided Data Augmentation for Prompt-based Few-shot Learners [15.130992223266734]
We propose a novel label-guided data augmentation framework, PromptDA, which exploits the enriched label semantic information for data augmentation.
Our experimental results on few-shot text classification tasks demonstrate the superior performance of the proposed framework.
arXiv Detail & Related papers (2022-05-18T22:15:20Z)
- AdaPrompt: Adaptive Model Training for Prompt-based NLP [77.12071707955889]
We propose AdaPrompt, which adaptively retrieves external data for continual pretraining of PLMs.
Experimental results on five NLP benchmarks show that AdaPrompt can improve over standard PLMs in few-shot settings.
In zero-shot settings, our method outperforms standard prompt-based methods by up to 26.35% relative error reduction.
arXiv Detail & Related papers (2022-02-10T04:04:57Z)
- The Role of Global Labels in Few-Shot Classification and How to Infer Them [55.64429518100676]
Few-shot learning is a central problem in meta-learning, where learners must quickly adapt to new tasks.
We propose Meta Label Learning (MeLa), a novel algorithm that infers global labels and obtains robust few-shot models via standard classification.
arXiv Detail & Related papers (2021-08-09T14:07:46Z)