Compositional Zero-Shot Domain Transfer with Text-to-Text Models
- URL: http://arxiv.org/abs/2303.13386v1
- Date: Thu, 23 Mar 2023 15:58:41 GMT
- Title: Compositional Zero-Shot Domain Transfer with Text-to-Text Models
- Authors: Fangyu Liu, Qianchu Liu, Shruthi Bannur, Fernando Pérez-García,
Naoto Usuyama, Sheng Zhang, Tristan Naumann, Aditya Nori, Hoifung Poon,
Javier Alvarez-Valle, Ozan Oktay, Stephanie L. Hyland
- Abstract summary: We propose a novel compositional transfer learning framework (DoT5) for zero-shot domain transfer.
Without access to in-domain labels, DoT5 jointly learns domain knowledge and task knowledge in a multi-task manner.
DoT5 demonstrates the effectiveness of compositional transfer learning through multi-task learning.
In particular, DoT5 outperforms the current SOTA in zero-shot transfer by over 7 absolute points in accuracy on RadNLI.
- Score: 65.32821642379066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Label scarcity is a bottleneck for improving task performance in specialised
domains. We propose a novel compositional transfer learning framework (DoT5 -
domain compositional zero-shot T5) for zero-shot domain transfer. Without
access to in-domain labels, DoT5 jointly learns domain knowledge (from MLM of
unlabelled in-domain free text) and task knowledge (from task training on more
readily available general-domain data) in a multi-task manner. To improve the
transferability of task training, we design a strategy named NLGU: we
simultaneously train NLG for in-domain label-to-data generation, which enables
data augmentation for self-finetuning, and NLU for label prediction. We evaluate
DoT5 on the biomedical domain and the resource-lean subdomain of radiology,
focusing on NLI, text summarisation and embedding learning. DoT5 demonstrates
the effectiveness of compositional transfer learning through multi-task
learning. In particular, DoT5 outperforms the current SOTA in zero-shot
transfer by over 7 absolute points in accuracy on RadNLI. We validate DoT5 with
ablations and a case study demonstrating its ability to solve challenging NLI
examples requiring in-domain expertise.
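As a concrete illustration of the training recipe described in the abstract, the following is a minimal, hypothetical sketch rather than the authors' implementation: it assumes a HuggingFace T5 backbone, toy in-domain and general-domain NLI data, and invented prompt templates, and it combines a simplified in-domain MLM (span corruption) objective with NLGU-style NLU (premise + hypothesis to label) and NLG (label + premise to hypothesis) objectives in one joint multi-task loss.

```python
# Minimal sketch of DoT5-style compositional multi-task training (not the authors'
# code). It combines (a) a simplified span-corruption/MLM objective on unlabelled
# in-domain text with (b) NLGU-style task objectives on general-domain NLI data.
# The prompt templates, data, and hyperparameters are illustrative assumptions.
import random
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tok = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

in_domain_texts = ["The chest radiograph shows no focal consolidation."]       # unlabelled
general_nli = [("A man is eating.", "A person consumes food.", "entailment")]  # labelled

def corrupt(text, span_len=3):
    """Very simplified T5-style span corruption: hide one contiguous span."""
    words = text.split()
    start = random.randrange(max(1, len(words) - span_len))
    masked = words[:start] + ["<extra_id_0>"] + words[start + span_len:]
    target = ["<extra_id_0>"] + words[start:start + span_len] + ["<extra_id_1>"]
    return " ".join(masked), " ".join(target)

def seq2seq_loss(src, tgt):
    enc = tok(src, return_tensors="pt", padding=True, truncation=True)
    labels = tok(tgt, return_tensors="pt", padding=True, truncation=True).input_ids
    labels[labels == tok.pad_token_id] = -100          # ignore padding in the loss
    return model(**enc, labels=labels).loss

for step in range(10):                                 # toy training loop
    premise, hypothesis, label = random.choice(general_nli)
    mlm_src, mlm_tgt = corrupt(random.choice(in_domain_texts))
    nlu_src, nlu_tgt = f"nlu premise: {premise} hypothesis: {hypothesis}", label
    nlg_src, nlg_tgt = f"nlg {label} premise: {premise}", hypothesis
    loss = (seq2seq_loss([mlm_src], [mlm_tgt])          # domain knowledge (MLM)
            + seq2seq_loss([nlu_src], [nlu_tgt])        # task knowledge (NLU)
            + seq2seq_loss([nlg_src], [nlg_tgt]))       # task knowledge (NLG)
    loss.backward(); optim.step(); optim.zero_grad()
```

In the paper, the NLG direction additionally generates pseudo-labelled in-domain data for self-finetuning; that augmentation loop is omitted from this sketch.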
Related papers
- Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning [55.107329995417786]
Large language models (LLMs) have demonstrated impressive general understanding and generation abilities.
We establish a benchmark for multi-domain translation, featuring 25 German⇔English and 22 Chinese⇔English test sets.
We propose a domain Chain of Thought (CoT) fine-tuning technique that utilizes the intrinsic multi-domain intelligence of LLMs to improve translation performance.
arXiv Detail & Related papers (2024-10-03T16:15:04Z)
- Frustratingly Simple Entity Tracking with Effective Use of Multi-Task Learning Models [5.9585526937249]
We present SET, a frustratingly Simple-yet-effective approach for entity tracking in procedural text.
Compared with state-of-the-art entity tracking models that require domain-specific pre-training, SET simply fine-tunes off-the-shelf T5 with customized formats (see the illustrative sketch after this list).
We show that T5's supervised multi-task learning plays an important role in the success of SET.
arXiv Detail & Related papers (2022-10-12T17:46:16Z)
- Extreme Multi-Domain, Multi-Task Learning With Unified Text-to-Text Transfer Transformers [0.0]
We investigated the behavior of multi-domain, multi-task learning using multi-domain text-to-text transfer transformers (MD-T5).
We carried out experiments using three popular training strategies: BERT-style joint pretraining + successive finetuning, GPT-style joint pretraining + successive finetuning, and GPT-style joint pretraining + joint finetuning.
We show that while negative knowledge transfer and catastrophic forgetting are still considerable challenges for all the models, the GPT-style joint pretraining + joint finetuning strategy showed the most promise in multi-domain, multi-task learning.
arXiv Detail & Related papers (2022-09-21T04:21:27Z)
- Task Transfer and Domain Adaptation for Zero-Shot Question Answering [18.188082154309175]
We use supervised pretraining on source-domain data to reduce sample complexity on domain-specific downstream tasks.
We evaluate zero-shot performance on domain-specific reading comprehension tasks by combining task transfer with domain adaptation.
arXiv Detail & Related papers (2022-06-14T09:10:48Z)
- Target-Oriented Fine-tuning for Zero-Resource Named Entity Recognition [25.662899487595524]
We propose four practical guidelines to guide knowledge transfer and task fine-tuning.
Based on these guidelines, we design a target-oriented fine-tuning (TOF) framework to exploit various data from three aspects in a unified training manner.
arXiv Detail & Related papers (2021-07-22T08:48:34Z)
- Contrastive Learning and Self-Training for Unsupervised Domain Adaptation in Semantic Segmentation [71.77083272602525]
Unsupervised domain adaptation (UDA) attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z)
- Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation [78.28390172958643]
We identify two key aspects that can help to alleviate multiple domain shifts in multi-target domain adaptation (MTDA).
We propose Curriculum Graph Co-Teaching (CGCT), which uses dual classifier heads, one of which is a graph convolutional network (GCN) that aggregates features from similar samples across the domains.
When the domain labels are available, we propose Domain-aware Curriculum Learning (DCL), a sequential adaptation strategy that first adapts on the easier target domains, followed by the harder ones.
arXiv Detail & Related papers (2021-04-01T23:41:41Z)
- Learning Invariant Representations across Domains and Tasks [81.30046935430791]
We propose a novel Task Adaptation Network (TAN) to solve this unsupervised task transfer problem.
In addition to learning transferable features via domain-adversarial training, we propose a novel task semantic adaptor that uses the learning-to-learn strategy to adapt the task semantics.
TAN significantly increases the recall and F1 score by 5.0% and 7.8% compared to recent strong baselines.
arXiv Detail & Related papers (2021-03-03T11:18:43Z)
- Unsupervised Transfer Learning with Self-Supervised Remedy [60.315835711438936]
Generalising deep networks to novel domains without manual labels is a challenge for deep learning.
Pre-learned knowledge does not transfer well without making strong assumptions about the learned and the novel domains.
In this work, we aim to learn a discriminative latent space of the unlabelled target data in a novel domain by knowledge transfer from labelled related domains.
arXiv Detail & Related papers (2020-06-08T16:42:17Z)
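Several of the entries above, in particular SET, rely on the same idea as the main paper: cast a structured task into a text-to-text format and fine-tune an off-the-shelf T5. As a rough, hypothetical illustration of what such a customized input/output template might look like for entity tracking, here is a minimal sketch; the actual formats are defined in the SET paper, and the template, data, and hyperparameters below are invented.

```python
# Illustrative sketch of the SET-style recipe: entity tracking cast as
# text-to-text fine-tuning of an off-the-shelf T5. The input/output templates
# are hypothetical, not the formats from the SET paper.
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tok = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
optim = torch.optim.AdamW(model.parameters(), lr=3e-4)

# Toy procedural-text example: steps, an entity to track, and its gold state.
examples = [
    {
        "steps": "Pour water into the pot. Boil the water. Add the pasta.",
        "entity": "water",
        "state": "location: pot; exists: yes",
    }
]

for ex in examples:                                    # toy fine-tuning loop
    src = f"track entity: {ex['entity']} context: {ex['steps']}"
    enc = tok(src, return_tensors="pt", truncation=True)
    labels = tok(ex["state"], return_tensors="pt").input_ids
    loss = model(**enc, labels=labels).loss
    loss.backward(); optim.step(); optim.zero_grad()

# At inference time, the same template is paired with model.generate(...)
# to predict entity states for unseen procedures.
```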