Learning from Task Descriptions
- URL: http://arxiv.org/abs/2011.08115v1
- Date: Mon, 16 Nov 2020 17:25:24 GMT
- Title: Learning from Task Descriptions
- Authors: Orion Weller, Nicholas Lourie, Matt Gardner, Matthew E. Peters
- Abstract summary: We introduce a framework for developing NLP systems that solve new tasks after reading their descriptions.
We instantiate this framework with a new English language dataset, ZEST, structured for task-oriented evaluation.
We find that the state-of-the-art T5 model achieves a score of 12% on ZEST, leaving a significant challenge for NLP researchers.
- Score: 24.588252048132862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Typically, machine learning systems solve new tasks by training on thousands
of examples. In contrast, humans can solve new tasks by reading some
instructions, with perhaps an example or two. To take a step toward closing
this gap, we introduce a framework for developing NLP systems that solve new
tasks after reading their descriptions, synthesizing prior work in this area.
We instantiate this framework with a new English language dataset, ZEST,
structured for task-oriented evaluation on unseen tasks. Formulating task
descriptions as questions, we ensure each is general enough to apply to many
possible inputs, thus comprehensively evaluating a model's ability to solve
each task. Moreover, the dataset's structure tests specific types of systematic
generalization. We find that the state-of-the-art T5 model achieves a score of
12% on ZEST, leaving a significant challenge for NLP researchers.
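As a rough, hypothetical illustration of this framing (not the authors' released evaluation code), one task description, phrased as a question, is applied to many inputs, and the model must answer correctly across all of them to solve the task. A text-to-text prompt for a T5-style model might be assembled like this; the task description, passages, and prompt format below are invented for illustration:

```python
# Minimal sketch of ZEST-style zero-shot evaluation (illustrative only).
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Hypothetical task description and inputs; in ZEST each description is
# general enough to apply to many possible input paragraphs.
task_description = "Are dogs allowed at this national park?"
passages = [
    "Acadia National Park welcomes pets on most of its hiking trails.",
    "To protect wildlife, pets are prohibited beyond the parking areas.",
]

for passage in passages:
    # Text-to-text framing: question and input are concatenated, and the
    # model generates the answer as free-form text.
    prompt = f"zest question: {task_description} context: {passage}"
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    output = model.generate(ids, max_new_tokens=16)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```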
Related papers
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, Data-CUBE, that arranges the order of all multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z) - Improving Cross-task Generalization of Unified Table-to-text Models with
Compositional Task Configurations [63.04466647849211]
Existing methods typically encode task information with a simple dataset name as a prefix to the encoder input.
We propose compositional task configurations, a set of prompts prepended to the encoder input to improve cross-task generalization.
We show this not only allows the model to better learn shared knowledge across different tasks during training, but also lets us control the model by composing new configurations.
arXiv Detail & Related papers (2022-12-17T02:20:14Z) - Improving Task Generalization via Unified Schema Prompt [87.31158568180514]
Unified Schema Prompt is a flexible and extensible prompting method that automatically customizes learnable prompts for each task according to the task's input schema.
It models the shared knowledge between tasks while preserving the characteristics of different task schemas.
The framework achieves strong zero-shot and few-shot performance on 16 unseen downstream tasks from 8 task types.
arXiv Detail & Related papers (2022-08-05T15:26:36Z) - Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure, in terms of the subtask graph, from the training tasks.
Our experimental results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z) - Analysis and Prediction of NLP Models Via Task Embeddings [25.311690222754454]
We propose MetaEval, a collection of 101 NLP tasks.
We fit a single transformer to all MetaEval tasks jointly while conditioning it on learned task embeddings (a minimal sketch of this conditioning appears after this list).
The resulting task embeddings enable a novel analysis of the space of tasks.
arXiv Detail & Related papers (2021-12-10T16:23:24Z) - Quantifying Adaptability in Pre-trained Language Models with 500 Tasks [60.0364822929442]
We present a large-scale empirical study of the features and limits of LM adaptability using a new benchmark, TaskBench500.
We evaluate three facets of adaptability, finding that adaptation procedures differ dramatically in their ability to memorize small datasets.
Our experiments show that adaptability to new tasks, like generalization to new examples, can be systematically described and understood.
arXiv Detail & Related papers (2021-12-06T18:00:25Z) - Evaluating NLP Systems On a Novel Cloze Task: Judging the Plausibility
of Possible Fillers in Instructional Texts [2.3449131636069898]
The cloze task is widely used to evaluate an NLP system's language understanding ability.
A new task is proposed: predicting whether a filler word in a cloze task is a good, neutral, or bad candidate.
arXiv Detail & Related papers (2021-12-03T12:02:52Z) - LFPT5: A Unified Framework for Lifelong Few-shot Language Learning Based
on Prompt Tuning of T5 [3.04585143845864]
We propose a unified framework for Lifelong Few-shot Language Learning (LFLL) based on prompt tuning (PT) of T5.
Our framework, called LFPT5, takes full advantage of PT's strong few-shot learning ability and simultaneously trains the model as a task solver and a data generator.
With extensive experiments, we demonstrate that LFPT5 can be applied to various types of tasks and significantly outperforms previous methods in different LFLL settings.
arXiv Detail & Related papers (2021-10-14T12:06:29Z) - CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented
Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS), which exploits pre-trained language models (PLMs) with task-specific instructions.
We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in task-oriented dialog (ToD).
Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
arXiv Detail & Related papers (2021-09-10T03:23:06Z) - CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in
NLP [38.40614678878222]
We introduce CrossFit, a task setup for studying cross-task few-shot learning ability.
We present NLP Few-shot Gym, a repository of 160 few-shot NLP tasks.
Our empirical analysis reveals that the few-shot learning ability on unseen tasks can be improved via an upstream learning stage.
arXiv Detail & Related papers (2021-04-18T12:14:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.