Exploring and Predicting Transferability across NLP Tasks
- URL: http://arxiv.org/abs/2005.00770v2
- Date: Tue, 6 Oct 2020 18:49:48 GMT
- Title: Exploring and Predicting Transferability across NLP Tasks
- Authors: Tu Vu, Tong Wang, Tsendsuren Munkhdalai, Alessandro Sordoni, Adam
Trischler, Andrew Mattarella-Micke, Subhransu Maji, Mohit Iyyer
- Abstract summary: We study the transferability between 33 NLP tasks across three broad classes of problems.
Our results show that transfer learning is more beneficial than previously thought.
We also develop task embeddings that can be used to predict the most transferable source tasks for a given target task.
- Score: 115.6278033699853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in NLP demonstrate the effectiveness of training large-scale
language models and transferring them to downstream tasks. Can fine-tuning
these models on tasks other than language modeling further improve performance?
In this paper, we conduct an extensive study of the transferability between 33
NLP tasks across three broad classes of problems (text classification, question
answering, and sequence labeling). Our results show that transfer learning is
more beneficial than previously thought, especially when target task data is
scarce, and can improve performance even when the source task is small or
differs substantially from the target task (e.g., part-of-speech tagging
transfers well to the DROP QA dataset). We also develop task embeddings that
can be used to predict the most transferable source tasks for a given target
task, and we validate their effectiveness in experiments controlled for source
and target data size. Overall, our experiments reveal that factors such as
source data size, task and domain similarity, and task complexity all play a
role in determining transferability.
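As a rough illustration of the source-selection idea described in the abstract, the sketch below ranks candidate source tasks by the cosine similarity between their task embeddings and the target task's embedding. It assumes the embeddings are already available as fixed-length vectors (the paper derives them from a model fine-tuned on each task, e.g. Fisher-information-based features); the function names and random vectors here are purely illustrative, not the authors' released code.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two task-embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rank_source_tasks(target_emb: np.ndarray,
                      source_embs: dict[str, np.ndarray]) -> list[tuple[str, float]]:
    """Rank candidate source tasks by similarity of their embeddings to the
    target task's embedding (higher score = predicted more transferable)."""
    scores = {name: cosine_similarity(target_emb, emb)
              for name, emb in source_embs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Illustrative usage with random vectors standing in for real task embeddings.
rng = np.random.default_rng(0)
dim = 768
source_embs = {name: rng.normal(size=dim) for name in ["MNLI", "SQuAD", "POS"]}
target_emb = rng.normal(size=dim)
for name, score in rank_source_tasks(target_emb, source_embs):
    print(f"{name}: {score:.3f}")
```

In practice the ranking is only as good as the embeddings themselves; the paper validates this kind of prediction in experiments controlled for source and target data size.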
Related papers
- Distribution Matching for Multi-Task Learning of Classification Tasks: a
Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework in which multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful on classification tasks even when annotations are scarce or non-overlapping across tasks.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z) - An Information-Theoretic Approach to Transferability in Task Transfer
Learning [16.05523977032659]
Task transfer learning is a popular technique in image processing applications that uses pre-trained models to reduce the supervision cost of related tasks.
We present a novel metric, H-score, that estimates the performance of transferred representations from one task to another in classification problems (a minimal sketch of this metric appears after this list).
arXiv Detail & Related papers (2022-12-20T08:47:17Z) - An Exploration of Data Efficiency in Intra-Dataset Task Transfer for
Dialog Understanding [65.75873687351553]
This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain.
Counterintuitively, our data show that the size of the target-task training data often has minimal effect on how sequential transfer learning performs relative to the same model without transfer learning.
arXiv Detail & Related papers (2022-10-21T04:36:46Z) - Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as a strong foundation backbone for a wide range of tasks but also be used as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z) - FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue [70.65782786401257]
This work explores conversational task transfer by introducing FETA: a benchmark for few-sample task transfer in open-domain dialogue.
FETA contains two underlying sets of conversations, annotated with 10 and 7 tasks respectively, enabling the study of intra-dataset task transfer.
We utilize three popular language models and three learning algorithms to analyze the transferability between 132 source-target task pairs.
arXiv Detail & Related papers (2022-05-12T17:59:00Z) - Active Multi-Task Representation Learning [50.13453053304159]
We give the first formal study of resource task sampling by leveraging techniques from active learning.
We propose an algorithm that iteratively estimates the relevance of each source task to the target task and samples from each source task based on the estimated relevance.
arXiv Detail & Related papers (2022-02-02T08:23:24Z) - Adaptive Transfer Learning on Graph Neural Networks [4.233435459239147]
Graph neural networks (GNNs) are widely used to learn a powerful representation of graph-structured data.
Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks could further improve graph representation.
We propose a new transfer learning paradigm on GNNs which could effectively leverage self-supervised tasks as auxiliary tasks to help the target task.
arXiv Detail & Related papers (2021-07-19T11:46:28Z) - Towards All-around Knowledge Transferring: Learning From Task-irrelevant
Labels [44.036667329736225]
Existing efforts mainly focus on transferring task-relevant knowledge from other similar data to compensate for limited labeled data.
To date, no large-scale studies have been performed to investigate the impact of task-irrelevant features.
We propose Task-Irrelevant Transfer Learning to exploit task-irrelevant features, which are mainly extracted from task-irrelevant labels.
arXiv Detail & Related papers (2020-11-17T06:43:58Z)
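As referenced in the H-score entry above, below is a minimal numpy sketch of that transferability metric under its commonly cited definition, H(f) = tr(cov(f)^{-1} cov(E[f | Y])), where f(X) are the (centered) transferred features and Y are the class labels. This is an independent illustration under that assumed definition, not the authors' released implementation, and the toy data are random placeholders.

```python
import numpy as np

def h_score(features: np.ndarray, labels: np.ndarray) -> float:
    """H-score of features f(X) for labels Y:
    tr( cov(f)^{-1} @ cov( E[f | Y] ) ).
    Higher values suggest the representation transfers better to the
    classification task."""
    f = features - features.mean(axis=0, keepdims=True)  # center the features
    cov_f = np.cov(f, rowvar=False)                       # overall feature covariance
    # Replace each sample by its class-conditional mean E[f | Y = y], so the
    # empirical covariance of g weights classes by their frequency.
    g = np.zeros_like(f)
    for y in np.unique(labels):
        mask = labels == y
        g[mask] = f[mask].mean(axis=0)
    cov_g = np.cov(g, rowvar=False)
    # Pseudo-inverse for numerical stability when cov_f is ill-conditioned.
    return float(np.trace(np.linalg.pinv(cov_f) @ cov_g))

# Toy usage: random features and labels standing in for pretrained representations.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
y = rng.integers(0, 3, size=200)
print(h_score(X, y))
```

With random features and labels the score is near zero; features whose class-conditional means are well separated relative to their overall spread score higher.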