Multitask Learning for Low Resource Spoken Language Understanding
- URL: http://arxiv.org/abs/2211.13703v1
- Date: Thu, 24 Nov 2022 16:38:17 GMT
- Title: Multitask Learning for Low Resource Spoken Language Understanding
- Authors: Quentin Meeus, Marie-Francine Moens, Hugo Van hamme
- Abstract summary: We train models on dual objectives with automatic speech recognition and intent classification or sentiment classification.
Our models, although modest in size, show improvements over models trained end-to-end on intent classification.
We study the performance of the models in low-resource scenarios by training the models with as few as one example per class.
- Score: 26.106133114838215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We explore the benefits that multitask learning offers to speech processing as
we train models on dual objectives with automatic speech recognition and intent
classification or sentiment classification. Our models, although modest in size,
show improvements over models trained end-to-end on intent classification. We
compare different settings to find the optimal arrangement of the task modules
relative to one another. Finally, we study the performance of the models in
low-resource scenarios by training the models with as few as one example per
class. We show that multitask learning in these scenarios competes with a
baseline model trained on text features and performs considerably better than a
pipeline model. On sentiment classification, we match the performance of an
end-to-end model with ten times as many parameters. We consider 4 tasks and 4
datasets in Dutch and English.
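The dual-objective setup described above can be illustrated with a minimal sketch (not the authors' code): a shared speech encoder feeds both a frame-level ASR head and an utterance-level intent classifier, and the two losses are summed. The layer sizes, the CTC choice for the ASR objective, and the equal loss weights are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of a dual-objective (ASR + intent classification) model.
# All dimensions and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class MultitaskSLU(nn.Module):
    def __init__(self, n_feats=80, hidden=256, vocab_size=32, n_intents=31):
        super().__init__()
        self.encoder = nn.LSTM(n_feats, hidden, num_layers=2,
                               batch_first=True, bidirectional=True)
        self.asr_head = nn.Linear(2 * hidden, vocab_size)     # per-frame token logits
        self.intent_head = nn.Linear(2 * hidden, n_intents)   # utterance-level logits

    def forward(self, feats):
        enc, _ = self.encoder(feats)                       # (batch, time, 2*hidden)
        asr_logits = self.asr_head(enc)                    # per-frame, for CTC
        intent_logits = self.intent_head(enc.mean(dim=1))  # mean-pool over time
        return asr_logits, intent_logits

model = MultitaskSLU()
ctc_loss = nn.CTCLoss(blank=0)
ce_loss = nn.CrossEntropyLoss()

feats = torch.randn(4, 120, 80)                            # dummy filterbank features
asr_logits, intent_logits = model(feats)

# CTC expects (time, batch, vocab) log-probabilities.
log_probs = asr_logits.log_softmax(-1).transpose(0, 1)
targets = torch.randint(1, 32, (4, 20))                    # dummy transcripts
input_lens = torch.full((4,), 120, dtype=torch.long)
target_lens = torch.full((4,), 20, dtype=torch.long)
intents = torch.randint(0, 31, (4,))                       # dummy intent labels

# Joint objective: equal weighting of ASR (CTC) and intent (cross-entropy) losses.
loss = 0.5 * ctc_loss(log_probs, targets, input_lens, target_lens) \
     + 0.5 * ce_loss(intent_logits, intents)
loss.backward()
```

In a low-resource setting, the ASR term acts as an auxiliary training signal that keeps the shared encoder informative even when only a handful of labeled examples per class are available for the classification head.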
Related papers
- STTATTS: Unified Speech-To-Text And Text-To-Speech Model [6.327929516375736]
We propose a parameter-efficient approach to learning ASR and TTS jointly via a multi-task learning objective and shared parameters.
Our evaluation demonstrates that the performance of our multi-task model is comparable to that of individually trained models.
arXiv Detail & Related papers (2024-10-24T10:04:24Z) - SpeechVerse: A Large-scale Generalizable Audio Language Model [38.67969337605572]
SpeechVerse is a robust multi-task training and curriculum learning framework.
It combines pre-trained speech and text foundation models via a small set of learnable parameters.
Our empirical experiments reveal that our multi-task SpeechVerse model is even superior to conventional task-specific baselines on 9 out of the 11 tasks.
arXiv Detail & Related papers (2024-05-14T03:33:31Z) - MiniGPT-v2: large language model as a unified interface for
vision-language multi-task learning [65.60607895153692]
MiniGPT-v2 is a model that can be treated as a unified interface for better handling various vision-language tasks.
We propose using unique identifiers for different tasks when training the model.
Our results show that MiniGPT-v2 achieves strong performance on many visual question-answering and visual grounding benchmarks.
arXiv Detail & Related papers (2023-10-14T03:22:07Z) - Contrastive Alignment of Vision to Language Through Parameter-Efficient
Transfer Learning [60.26952378997713]
Contrastive vision-language models (e.g. CLIP) are created by updating all the parameters of a vision model and language model through contrastive training.
We show that a minimal set of parameter updates (less than 7%) can achieve the same performance as full-model training.
We describe a series of experiments: we show that existing knowledge is conserved more strongly in parameter-efficient training.
arXiv Detail & Related papers (2023-03-21T14:12:08Z) - eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
We argue for directing effort toward efficient adaptation of existing models, and propose to augment Language Models with perception.
Existing approaches for adapting pretrained models for vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning; a minimal sketch of this recipe follows the related-papers list below.
arXiv Detail & Related papers (2023-03-20T19:20:34Z) - In-context Learning Distillation: Transferring Few-shot Learning Ability
of Pre-trained Language Models [55.78264509270503]
We introduce in-context learning distillation to transfer in-context few-shot learning ability from large models to smaller models.
We perform in-context learning distillation under two different few-shot learning paradigms: Meta In-context Tuning (Meta-ICT) and Multitask In-context Tuning (Multitask-ICT).
Our experiments and analysis reveal that in-context learning objectives and language modeling objectives are complementary under the Multitask-ICT paradigm.
arXiv Detail & Related papers (2022-12-20T22:11:35Z) - PaLM: Scaling Language Modeling with Pathways [180.69584031908113]
We trained a 540-billion parameter, densely activated, Transformer language model, which we call the Pathways Language Model (PaLM).
We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods.
We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks.
arXiv Detail & Related papers (2022-04-05T16:11:45Z) - Exploring Versatile Generative Language Model Via Parameter-Efficient
Transfer Learning [70.81910984985683]
We propose an effective way to fine-tune multiple down-stream generation tasks simultaneously using a single, large pre-trained model.
The experiments on five diverse language generation tasks show that by just using an additional 2-3% parameters for each task, our model can maintain or even improve the performance of fine-tuning the whole model.
arXiv Detail & Related papers (2020-04-08T06:18:44Z)
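As referenced in the eP-ALM entry above, here is a minimal sketch of the recipe its summary describes: freeze the pretrained language model, train only a linear projection of the perceptual features, and prepend a single trainable token. The GPT-2 backbone and the visual-feature dimension are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of parameter-efficient perceptual augmentation of a frozen LM.
# Backbone choice and feature dimensions are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel

lm = GPT2LMHeadModel.from_pretrained("gpt2")
for p in lm.parameters():                                  # freeze >99% of parameters
    p.requires_grad = False

d_vis, d_lm = 768, lm.config.n_embd
projection = nn.Linear(d_vis, d_lm)                        # the only trained layer
soft_token = nn.Parameter(torch.zeros(1, 1, d_lm))         # single trainable prepended token

def forward(vis_feats, input_ids):
    # vis_feats: (batch, d_vis), assumed precomputed by a frozen visual encoder
    vis = projection(vis_feats).unsqueeze(1)               # (batch, 1, d_lm)
    tok = lm.transformer.wte(input_ids)                    # frozen token embeddings
    prefix = soft_token.expand(input_ids.size(0), -1, -1)
    inputs_embeds = torch.cat([prefix, vis, tok], dim=1)
    return lm(inputs_embeds=inputs_embeds).logits

dummy_ids = torch.randint(0, lm.config.vocab_size, (2, 10))
dummy_vis = torch.randn(2, d_vis)
logits = forward(dummy_vis, dummy_ids)                     # (2, 12, vocab_size)
```

Only `projection` and `soft_token` receive gradient updates here; everything else stays frozen, which is the sense in which such approaches train well under 1% of the total parameters.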