Improving Cross-task Generalization of Unified Table-to-text Models with
Compositional Task Configurations
- URL: http://arxiv.org/abs/2212.08780v1
- Date: Sat, 17 Dec 2022 02:20:14 GMT
- Title: Improving Cross-task Generalization of Unified Table-to-text Models with
Compositional Task Configurations
- Authors: Jifan Chen, Yuhao Zhang, Lan Liu, Rui Dong, Xinchi Chen, Patrick Ng,
William Yang Wang, Zhiheng Huang
- Abstract summary: Existing methods typically encode task information with a simple dataset name as a prefix to the encoder input.
We propose compositional task configurations, a set of prompts prepended to the encoder to improve cross-task generalization.
We show this not only allows the model to better learn shared knowledge across different tasks at training, but also allows us to control the model by composing new configurations.
- Score: 63.04466647849211
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been great progress in unifying various table-to-text tasks using a
single encoder-decoder model trained via multi-task learning (Xie et al.,
2022). However, existing methods typically encode task information with a
simple dataset name as a prefix to the encoder. This not only limits the
effectiveness of multi-task learning, but also hinders the model's ability to
generalize to new domains or tasks that were not seen during training, which is
crucial for real-world applications. In this paper, we propose compositional
task configurations, a set of prompts prepended to the encoder to improve
cross-task generalization of unified models. We design the task configurations
to explicitly specify the task type, as well as its input and output types. We
show that this not only allows the model to better learn shared knowledge
across different tasks at training, but also allows us to control the model by
composing new configurations that apply novel input-output combinations in a
zero-shot manner. We demonstrate via experiments over ten table-to-text tasks
that our method outperforms the UnifiedSKG baseline by noticeable margins in
both in-domain and zero-shot settings, with average improvements of +0.5 and
+12.6, respectively, when using a T5-large backbone.
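As a concrete illustration of how such configurations compose, here is a minimal sketch in Python. The configuration syntax, delimiters, and table linearization below are assumptions made for illustration, not the paper's exact prompt format (the paper builds on UnifiedSKG-style linearized inputs):

    # A minimal sketch of compositional task configurations, assuming a
    # hypothetical "key: value" prompt syntax; the paper's exact strings,
    # delimiters, and table serialization may differ.

    def build_config(task_type: str, input_type: str, output_type: str) -> str:
        """Compose a task configuration from its three declared components."""
        return (f"task type: {task_type} ; input type: {input_type} ; "
                f"output type: {output_type}")

    def build_encoder_input(config: str, text: str, table: str) -> str:
        """Prepend the configuration to the linearized task input."""
        return f"{config} || {text} | {table}"

    TABLE = "col: year | host || row: 2022 | Qatar"  # toy linearized table

    # Seen during training: table question answering.
    qa = build_encoder_input(
        build_config("question answering", "table + question", "short answer"),
        "Which country hosted the 2022 World Cup?",
        TABLE,
    )

    # Zero-shot: recombine components seen across other training tasks into a
    # configuration never observed as a whole, e.g. table fact verification.
    fv = build_encoder_input(
        build_config("fact verification", "table + statement", "boolean"),
        "Qatar hosted the 2022 World Cup.",
        TABLE,
    )

    print(qa)
    print(fv)

The zero-shot case reuses task, input, and output types that each appeared in some training task, just never in this combination, which is what lets a newly composed configuration steer the trained model without further fine-tuning.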
Related papers
- Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate few-shot task generalization as a reinforcement learning problem where each task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z)
- Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z)
- Grad2Task: Improved Few-shot Text Classification Using Gradients for
  Task Representation [24.488427641442694]
We propose a novel conditional neural process-based approach for few-shot text classification.
Our key idea is to represent each task using gradient information from a base model.
Our approach outperforms traditional fine-tuning, sequential transfer learning, and state-of-the-art meta learning approaches.
arXiv Detail & Related papers (2022-01-27T15:29:30Z)
- Uni-Perceiver: Pre-training Unified Architecture for Generic Perception
  for Zero-shot and Few-shot Tasks [73.63892022944198]
We present a generic perception architecture named Uni-Perceiver.
It processes a variety of modalities and tasks with unified modeling and shared parameters.
Results show that our pre-trained model without any tuning can achieve reasonable performance even on novel tasks.
arXiv Detail & Related papers (2021-12-02T18:59:50Z)
- Adversarial Continual Learning [99.56738010842301]
We propose a hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features.
Our model combines architecture growth to prevent forgetting of task-specific skills and an experience replay approach to preserve shared skills.
arXiv Detail & Related papers (2020-03-21T02:08:17Z)