Effective Cross-Task Transfer Learning for Explainable Natural Language
Inference with T5
- URL: http://arxiv.org/abs/2210.17301v1
- Date: Mon, 31 Oct 2022 13:26:08 GMT
- Title: Effective Cross-Task Transfer Learning for Explainable Natural Language
Inference with T5
- Authors: Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Aline
Villavicencio and Iryna Gurevych
- Abstract summary: We compare sequential fine-tuning with a model for multi-task learning in the context of boosting performance on two tasks.
Our results show that while sequential multi-task learning can be tuned to be good at the first of two target tasks, it performs less well on the second and additionally struggles with overfitting.
- Score: 50.574918785575655
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We compare sequential fine-tuning with a model for multi-task learning in the
context where we are interested in boosting performance on two tasks, one of
which depends on the other. We test these models on the FigLang2022 shared task
which requires participants to predict language inference labels on figurative
language along with corresponding textual explanations of the inference
predictions. Our results show that while sequential multi-task learning can be
tuned to be good at the first of two target tasks, it performs less well on the
second and additionally struggles with overfitting. Our findings show that
simple sequential fine-tuning of text-to-text models is an extraordinarily
powerful method for cross-task knowledge transfer while simultaneously
predicting multiple interdependent targets. So much so, that our best model
achieved the (tied) highest score on the task.
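For concreteness, here is a minimal sketch of what sequential fine-tuning of a text-to-text model for joint label-plus-explanation prediction could look like with Hugging Face Transformers. The prompt format, the `t5-base` checkpoint, the auxiliary dataset, and the hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: sequential fine-tuning of T5 so that the NLI label and its textual
# explanation are predicted together as one target string.
from transformers import (
    DataCollatorForSeq2Seq,
    T5ForConditionalGeneration,
    T5TokenizerFast,
    Trainer,
    TrainingArguments,
)
import torch

MODEL_NAME = "t5-base"  # assumption; the shared-task system may use a larger T5 variant
tokenizer = T5TokenizerFast.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)


def encode(premise, hypothesis, label, explanation):
    """Cast one example to text-to-text form: label and free-text explanation
    are emitted together in a single target string."""
    source = f"figurative nli: premise: {premise} hypothesis: {hypothesis}"
    target = f"{label} explanation: {explanation}"
    features = tokenizer(source, truncation=True, max_length=256)
    features["labels"] = tokenizer(target, truncation=True, max_length=128)["input_ids"]
    return features


class Seq2SeqDataset(torch.utils.data.Dataset):
    """Wraps (premise, hypothesis, label, explanation) tuples."""

    def __init__(self, rows):
        self.examples = [encode(*row) for row in rows]

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]


def finetune(rows, output_dir, epochs=3):
    """One fine-tuning stage; calling this twice on the same `model` object
    gives the sequential (stage 1 -> stage 2) transfer setup."""
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=epochs,
        per_device_train_batch_size=8,
        learning_rate=1e-4,  # illustrative hyperparameters
        save_strategy="no",
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=Seq2SeqDataset(rows),
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()


# Stage 1: fine-tune on an auxiliary explained-NLI dataset (e.g. e-SNLI-style rows).
# Stage 2: continue fine-tuning the same weights on the FigLang2022 target data.
# `aux_rows` and `target_rows` are hypothetical lists of
# (premise, hypothesis, label, explanation) tuples.
# finetune(aux_rows, "stage1_aux")
# finetune(target_rows, "stage2_figlang")
```

The key point the sketch illustrates is that sequential transfer needs no architectural change: the same text-to-text model and loss are reused, and only the training data changes between stages.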
Related papers
- SpeechVerse: A Large-scale Generalizable Audio Language Model [38.67969337605572]
SpeechVerse is a robust multi-task training and curriculum learning framework.
It combines pre-trained speech and text foundation models via a small set of learnable parameters.
Our empirical experiments reveal that our multi-task SpeechVerse model is superior even to conventional task-specific baselines on 9 out of the 11 tasks.
arXiv Detail & Related papers (2024-05-14T03:33:31Z)
- TIMIT Speaker Profiling: A Comparison of Multi-task learning and Single-task learning Approaches [27.152245569974678]
This study employs deep learning techniques to explore four speaker profiling tasks on the TIMIT dataset.
It highlights the potential and challenges of multi-task learning versus single-task models.
arXiv Detail & Related papers (2024-04-18T10:59:54Z)
- Rethinking and Improving Multi-task Learning for End-to-end Speech Translation [51.713683037303035]
We investigate the consistency between different tasks, considering different times and modules.
We find that the textual encoder primarily facilitates cross-modal conversion, but the presence of noise in speech impedes the consistency between text and speech representations.
We propose an improved multi-task learning (IMTL) approach for the ST task, which bridges the modal gap by mitigating the difference in length and representation.
arXiv Detail & Related papers (2023-11-07T08:48:46Z)
- Conciseness: An Overlooked Language Task [11.940413163824887]
We define the task and show that it is different from related tasks such as summarization and simplification.
We demonstrate that conciseness is a difficult task for which zero-shot setups with large neural language models often do not perform well.
arXiv Detail & Related papers (2022-11-08T09:47:11Z)
- FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue [70.65782786401257]
This work explores conversational task transfer by introducing FETA: a benchmark for few-sample task transfer in open-domain dialogue.
FETA contains two underlying sets of conversations upon which there are 10 and 7 tasks annotated, enabling the study of intra-dataset task transfer.
We utilize three popular language models and three learning algorithms to analyze the transferability between 132 source-target task pairs.
arXiv Detail & Related papers (2022-05-12T17:59:00Z)
- Improving Multi-task Generalization Ability for Neural Text Matching via Prompt Learning [54.66399120084227]
Recent state-of-the-art neural text matching models (PLMs) do not generalize well to different tasks.
We adopt a specialization-generalization training strategy and refer to it as Match-Prompt.
In the specialization stage, descriptions of different matching tasks are mapped to only a few prompt tokens.
In the generalization stage, the text matching model learns essential matching signals by being trained on diverse matching tasks.
arXiv Detail & Related papers (2022-04-06T11:01:08Z)
- Re-framing Incremental Deep Language Models for Dialogue Processing with Multi-task Learning [14.239355474794142]
We present a multi-task learning framework to enable the training of one universal incremental dialogue processing model.
We show that these tasks provide positive inductive biases to each other, with the optimal contribution of each task depending on the severity of the noise from that task.
arXiv Detail & Related papers (2020-11-13T04:31:51Z)
- Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos [76.21297023629589]
We propose a novel method for learning pairwise modality interactions in order to better exploit complementary information for each pair of modalities in videos.
Our method achieves state-of-the-art performance on four standard benchmark datasets.
arXiv Detail & Related papers (2020-07-28T12:40:59Z)
- Temporally Correlated Task Scheduling for Sequence Learning [143.70523777803723]
In many applications, a sequence learning task is usually associated with multiple temporally correlated auxiliary tasks.
We introduce a learnable scheduler to sequence learning, which can adaptively select auxiliary tasks for training.
Our method significantly improves the performance of simultaneous machine translation and stock trend forecasting.
arXiv Detail & Related papers (2020-07-10T10:28:54Z)
- Modelling Latent Skills for Multitask Language Generation [15.126163032403811]
We present a generative model for multitask conditional language generation.
Our guiding hypothesis is that a shared set of latent skills underlies many disparate language generation tasks.
We instantiate this task embedding space as a latent variable in a latent variable sequence-to-sequence model.
arXiv Detail & Related papers (2020-02-21T20:39:09Z)