Analysis and Prediction of NLP Models Via Task Embeddings
- URL: http://arxiv.org/abs/2112.05647v1
- Date: Fri, 10 Dec 2021 16:23:24 GMT
- Title: Analysis and Prediction of NLP Models Via Task Embeddings
- Authors: Damien Sileo and Marie-Francine Moens
- Abstract summary: We propose MetaEval, a collection of $101$ NLP tasks.
We fit a single transformer to all MetaEval tasks jointly while conditioning it on learned embeddings.
The resulting task embeddings enable a novel analysis of the space of tasks.
- Score: 25.311690222754454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Task embeddings are low-dimensional representations that are trained to
capture task properties. In this paper, we propose MetaEval, a collection of
$101$ NLP tasks. We fit a single transformer to all MetaEval tasks jointly
while conditioning it on learned embeddings. The resulting task embeddings
enable a novel analysis of the space of tasks. We then show that task aspects
can be mapped to task embeddings for new tasks without using any annotated
examples.
Predicted embeddings can modulate the encoder for zero-shot inference and
outperform a zero-shot baseline on GLUE tasks. The provided multitask setup can
function as a benchmark for future transfer learning research.
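The abstract does not spell out the conditioning mechanism; below is a minimal PyTorch sketch of one plausible reading, in which a learned per-task embedding is prepended to the token embeddings of a single shared encoder. All sizes and module names are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

class TaskConditionedEncoder(nn.Module):
    """Shared encoder modulated by a learned per-task embedding (illustrative sketch)."""
    def __init__(self, vocab_size=30522, n_tasks=101, d_model=256, n_heads=4, n_layers=2):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.task_emb = nn.Embedding(n_tasks, d_model)   # one learned embedding per task
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.cls_head = nn.Linear(d_model, 2)            # toy 2-way head for illustration

    def forward(self, input_ids, task_id):
        x = self.tok_emb(input_ids)                      # (batch, seq, d_model)
        t = self.task_emb(task_id).unsqueeze(1)          # (batch, 1, d_model)
        h = self.encoder(torch.cat([t, x], dim=1))       # prepend the task "token"
        return self.cls_head(h[:, 0])                    # predict from the task position

model = TaskConditionedEncoder()
logits = model(torch.randint(0, 30522, (8, 16)), torch.randint(0, 101, (8,)))
```

A zero-shot variant along the lines described in the abstract would replace the embedding-table lookup with an embedding predicted from task aspects.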
Related papers
- Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework in which multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks that have little or non-overlapping annotation.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
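The summary does not state which distributions are matched; a minimal sketch, assuming the common choice of matching batch feature statistics from a shared backbone (the specific loss below is an assumption, not the paper's formulation):

```python
import torch

def feature_distribution_matching_loss(feats_a, feats_b):
    """Penalize mismatch between the feature distributions of two tasks by comparing
    batch means and variances (illustrative, not the paper's exact loss)."""
    mean_gap = (feats_a.mean(dim=0) - feats_b.mean(dim=0)).pow(2).sum()
    var_gap = (feats_a.var(dim=0) - feats_b.var(dim=0)).pow(2).sum()
    return mean_gap + var_gap

# toy usage: shared-backbone features from batches of two different tasks
loss = feature_distribution_matching_loss(torch.randn(32, 128), torch.randn(32, 128))
```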
arXiv Detail & Related papers (2024-01-02T14:18:11Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
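A task prefix is commonly realized by prepending a short task marker to each input so a single shared model can tell tasks apart; a minimal sketch with made-up marker strings (not the paper's exact prefixes):

```python
def add_task_prefix(task_name: str, text: str) -> str:
    """Prepend a task marker so a shared model can distinguish tasks (illustrative)."""
    return f"[{task_name}] {text}"

print(add_task_prefix("nli", "A man is playing guitar. </s> Someone plays music."))
# -> "[nli] A man is playing guitar. </s> Someone plays music."
```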
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
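A minimal sketch of the general idea, assuming the task-specific subset of layers is already given; TAPS itself learns which layers to modify, which is not shown here:

```python
import torch.nn as nn

def tune_task_specific_subset(model: nn.Module, layers_to_tune: set):
    """Freeze the base model except for a small, task-specific subset of layers
    (illustrative; TAPS learns which layers to modify rather than taking a fixed list)."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(layer) for layer in layers_to_tune)

# toy usage on a small stack of layers
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))
tune_task_specific_subset(model, {"2"})   # only the last Linear is trainable for this task
```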
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- On Steering Multi-Annotations per Sample for Multi-Task Learning [79.98259057711044]
The study of multi-task learning has drawn great attention from the community.
Despite the remarkable progress, the challenge of optimally learning different tasks simultaneously remains to be explored.
Previous works attempt to modify the gradients from different tasks, but these methods rely on subjective assumptions about the relationships between tasks, and the modified gradients may be less accurate.
In this paper, we introduce Stochastic Task Allocation (STA), a mechanism that addresses this issue by randomly allocating a subset of tasks to each sample.
For further progress, we propose Interleaved Stochastic Task Allocation (ISTA) to iteratively allocate all tasks to each sample over consecutive iterations.
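A minimal sketch of STA-style allocation, assuming a fixed subset size per sample (the paper's exact procedure may differ):

```python
import random

def allocate_tasks(num_samples: int, tasks: list, subset_size: int):
    """Randomly allocate a subset of tasks to each sample (illustrative STA-style allocation)."""
    return [random.sample(tasks, subset_size) for _ in range(num_samples)]

# toy usage: each of 4 samples is supervised on 2 of the 3 tasks this step
print(allocate_tasks(4, ["depth", "segmentation", "normals"], 2))
```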
arXiv Detail & Related papers (2022-03-06T11:57:18Z)
- Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation [24.488427641442694]
We propose a novel conditional neural process-based approach for few-shot text classification.
Our key idea is to represent each task using gradient information from a base model.
Our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta learning approaches.
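A minimal sketch of building a task representation from gradient information of a base model, as the summary describes; flattening all gradients into one vector is an illustrative choice, not necessarily how Grad2Task pools them:

```python
import torch
import torch.nn as nn

def gradient_task_embedding(base_model: nn.Module, inputs, labels):
    """Represent a task by the gradient of a base model's loss on its examples
    (illustrative; Grad2Task conditions on such gradient information)."""
    base_model.zero_grad()
    loss = nn.functional.cross_entropy(base_model(inputs), labels)
    loss.backward()
    grads = [p.grad.flatten() for p in base_model.parameters() if p.grad is not None]
    return torch.cat(grads)          # one flat vector summarizing the task

base = nn.Linear(8, 3)
emb = gradient_task_embedding(base, torch.randn(5, 8), torch.randint(0, 3, (5,)))
```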
arXiv Detail & Related papers (2022-01-27T15:29:30Z)
- Exploring Low-dimensional Intrinsic Task Subspace via Prompt Tuning [70.76016793057283]
In this work, we study how pre-trained language models (PLMs) learn universal representations and effectively adapt to a broad range of NLP tasks that differ widely.
In experiments, we study diverse few-shot NLP tasks and surprisingly find that, in a 5-dimensional subspace found using 100 random tasks, tuning only 5 free parameters recovers 87% and 65% of the full prompt tuning performance for seen and unseen tasks, respectively.
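A minimal sketch of tuning only a few free parameters that a fixed projection maps into the soft-prompt space; the random projection below stands in for the subspace the paper learns from many tasks:

```python
import torch
import torch.nn as nn

class IntrinsicSubspacePrompt(nn.Module):
    """Tune a few free parameters that a fixed projection maps into soft-prompt space
    (illustrative; the real projection is learned from many tasks beforehand)."""
    def __init__(self, intrinsic_dim=5, prompt_len=20, d_model=768):
        super().__init__()
        self.z = nn.Parameter(torch.zeros(intrinsic_dim))        # the only tuned parameters
        proj = torch.randn(intrinsic_dim, prompt_len * d_model)  # fixed projection (stand-in)
        self.register_buffer("proj", proj)
        self.prompt_len, self.d_model = prompt_len, d_model

    def forward(self):
        return (self.z @ self.proj).view(self.prompt_len, self.d_model)  # soft prompt

prompt = IntrinsicSubspacePrompt()()   # a (20, 768) prompt driven by 5 parameters
```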
arXiv Detail & Related papers (2021-10-15T05:43:59Z)
- Instance-Level Task Parameters: A Robust Multi-task Weighting Framework [17.639472693362926]
Recent works have shown that deep neural networks benefit from multi-task learning by learning a shared representation across several related tasks.
We let the training process dictate the optimal weighting of tasks for every instance in the dataset.
We conduct extensive experiments on the SURREAL and CityScapes datasets, covering human shape and pose estimation, depth estimation, and semantic segmentation tasks.
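A minimal sketch of letting training set a weight for every (instance, task) pair; parameterizing the weights with a learnable tensor and a softplus is an illustrative choice:

```python
import torch
import torch.nn as nn

class InstanceTaskWeights(nn.Module):
    """Learnable weight for every (instance, task) pair, optimized jointly with the model
    (illustrative sketch of instance-level task weighting)."""
    def __init__(self, num_instances: int, num_tasks: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_instances, num_tasks))

    def forward(self, instance_ids, task_losses):
        # task_losses: (batch, num_tasks) per-task losses for each instance in the batch
        weights = nn.functional.softplus(self.logits[instance_ids])
        return (weights * task_losses).sum(dim=1).mean()

weighter = InstanceTaskWeights(num_instances=1000, num_tasks=3)
loss = weighter(torch.tensor([0, 7, 42]), torch.rand(3, 3))
```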
arXiv Detail & Related papers (2021-06-11T02:35:42Z)
- Adaptive Task Sampling for Meta-Learning [79.61146834134459]
The key idea of meta-learning for few-shot classification is to mimic the few-shot situations faced at test time.
We propose an adaptive task sampling method to improve the generalization performance.
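A minimal sketch of adaptive task sampling, drawing the next training task with probability proportional to a difficulty score; the score itself is a stand-in for whatever signal the method actually uses:

```python
import random

def sample_task(task_ids, difficulty):
    """Sample a training task with probability proportional to its current difficulty
    (illustrative; the paper derives its sampling signal differently)."""
    return random.choices(task_ids, weights=difficulty, k=1)[0]

# toy usage: harder tasks (higher score) are drawn more often
print(sample_task(["t1", "t2", "t3"], difficulty=[0.2, 0.5, 1.0]))
```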
arXiv Detail & Related papers (2020-07-17T03:15:53Z)
- The Sample Complexity of Meta Sparse Regression [38.092179552223364]
This paper addresses the meta-learning problem in sparse linear regression with infinitely many tasks.
We show that $T \in O((k \log(p))/l)$ tasks are sufficient to recover the common support of all tasks.
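Spelling out the bound under the usual reading of the symbols (feature dimension $p$, common support size $k$, samples per task $l$); this interpretation is inferred from the abstract rather than stated in it:

```latex
% Sufficient number of tasks for common-support recovery (symbols as assumed above)
\[
  T \in O\!\left(\frac{k \log p}{l}\right), \qquad
  p:\ \text{feature dimension}, \quad
  k:\ \text{size of the common support}, \quad
  l:\ \text{samples per task}.
\]
```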
arXiv Detail & Related papers (2020-02-22T00:59:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information (including all of its content) and is not responsible for any consequences of its use.