Task Transfer and Domain Adaptation for Zero-Shot Question Answering
- URL: http://arxiv.org/abs/2206.06705v1
- Date: Tue, 14 Jun 2022 09:10:48 GMT
- Title: Task Transfer and Domain Adaptation for Zero-Shot Question Answering
- Authors: Xiang Pan, Alex Sheng, David Shimshoni, Aditya Singhal, Sara
Rosenthal, Avirup Sil
- Abstract summary: We use supervised pretraining on source-domain data to reduce sample complexity on domain-specific downstream tasks.
We evaluate zero-shot performance on domain-specific reading comprehension tasks by combining task transfer with domain adaptation.
- Score: 18.188082154309175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretrained language models have shown success in various areas of natural
language processing, including reading comprehension tasks. However, when
applying machine learning methods to new domains, labeled data may not always
be available. To address this, we use supervised pretraining on source-domain
data to reduce sample complexity on domain-specific downstream tasks. We
evaluate zero-shot performance on domain-specific reading comprehension tasks
by combining task transfer with domain adaptation to fine-tune a pretrained
model with no labelled data from the target task. Our approach outperforms
Domain-Adaptive Pretraining on downstream domain-specific reading comprehension
tasks in 3 out of 4 domains.
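Below is a minimal sketch (not the authors' code) of the zero-shot recipe the abstract describes. It assumes Hugging Face Transformers, SQuAD-style source-domain QA data, and uses the public SQuAD-tuned checkpoint deepset/roberta-base-squad2 as a stand-in for a model produced by domain-adaptive pretraining plus source-domain fine-tuning; the question and context are invented examples.

```python
# Sketch of the zero-shot pipeline (illustrative assumptions, not the paper's code):
#   1. Domain adaptation: continue masked-LM pretraining on unlabeled
#      target-domain text.
#   2. Task transfer: fine-tune the adapted encoder on labeled source-domain
#      QA data (e.g. SQuAD). Steps 1-2 are replaced here by a public
#      SQuAD-tuned checkpoint so the snippet runs on its own.
#   3. Zero-shot evaluation: answer target-domain questions with no labeled
#      data from the target task.
from transformers import pipeline

reader = pipeline("question-answering", model="deepset/roberta-base-squad2")

# Invented target-domain (biomedical-style) example; no target labels are used.
prediction = reader(
    question="Which kinase does the compound inhibit?",
    context=(
        "In preclinical assays the compound selectively inhibits kinase JAK2 "
        "while sparing closely related family members."
    ),
)
print(prediction["answer"], round(prediction["score"], 3))
```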
Related papers
- SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for
Classification in Low-Resource Domains [14.096170976149521]
SwitchPrompt is a novel and lightweight prompting methodology for adaptation of language models trained on datasets from the general domain to diverse low-resource domains.
Our few-shot experiments on three text classification benchmarks demonstrate the efficacy of the general-domain pre-trained language models when used with SwitchPrompt.
They often even outperform their domain-specific counterparts trained with baseline state-of-the-art prompting methods, improving accuracy by up to 10.7%.
arXiv Detail & Related papers (2023-02-14T07:14:08Z)
- Using Language to Extend to Unseen Domains [81.37175826824625]
It is expensive to collect training data for every possible domain that a vision model may encounter when deployed.
We consider how simply verbalizing the training domain as well as domains we want to extend to but do not have data for can improve robustness.
Using a multimodal model with a joint image and language embedding space, our method LADS learns a transformation of the image embeddings from the training domain to each unseen test domain.
arXiv Detail & Related papers (2022-10-18T01:14:02Z)
- Disentanglement by Cyclic Reconstruction [0.0]
In supervised learning, information specific to the dataset used for training, but irrelevant to the task at hand, may remain encoded in the extracted representations.
We propose splitting the information into a task-related representation and its complementary context representation.
We then adapt this method to the unsupervised domain adaptation problem, consisting of training a model capable of performing on both a source and a target domain.
arXiv Detail & Related papers (2021-12-24T07:47:59Z)
- AdaptSum: Towards Low-Resource Domain Adaptation for Abstractive Summarization [43.024669990477214]
We present a study of domain adaptation for the abstractive summarization task across six diverse target domains in a low-resource setting.
Experiments show that the effectiveness of pre-training is correlated with the similarity between the pre-training data and the target domain task.
arXiv Detail & Related papers (2021-03-21T08:12:19Z)
- Multi-Stage Pre-training for Low-Resource Domain Adaptation [24.689862495171408]
Current approaches directly adapt a pre-trained language model (LM) on in-domain text before fine-tuning to downstream tasks.
We show that extending the vocabulary of the LM with domain-specific terms leads to further gains.
We apply these approaches incrementally on a pre-trained RoBERTa-large LM and show considerable performance gains on three tasks in the IT domain (a minimal sketch of the vocabulary-extension step appears after this list).
arXiv Detail & Related papers (2020-10-12T17:57:00Z)
- Domain Adversarial Fine-Tuning as an Effective Regularizer [80.14528207465412]
In Natural Language Processing (NLP), pretrained language models (LMs) that are transferred to downstream tasks have been recently shown to achieve state-of-the-art results.
Standard fine-tuning can degrade the general-domain representations captured during pretraining.
We introduce a new regularization technique, AFTER: domain Adversarial Fine-Tuning as an Effective Regularizer.
arXiv Detail & Related papers (2020-09-28T14:35:06Z)
- Domain Adaptation for Semantic Parsing [68.81787666086554]
We propose a novel semantic parser for domain adaptation, where far less annotated data is available in the target domain than in the source domain.
Our parser benefits from a two-stage coarse-to-fine framework, which lets it give different, accurate treatments to the two stages.
Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies.
arXiv Detail & Related papers (2020-06-23T14:47:41Z)
- Multi-Domain Spoken Language Understanding Using Domain- and Task-Aware Parameterization [78.93669377251396]
Spoken language understanding has been addressed as a supervised learning problem, where a set of training data is available for each domain.
One existing approach solves the problem by conducting multi-domain learning, using shared parameters for joint training across domains.
We propose to improve the parameterization of this method by using domain-specific and task-specific model parameters.
arXiv Detail & Related papers (2020-04-30T15:15:40Z)
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [81.99843216550306]
We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks.
A second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains.
Adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining.
arXiv Detail & Related papers (2020-04-23T04:21:19Z)
- Unsupervised Domain Clusters in Pretrained Language Models [61.832234606157286]
We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision.
We propose domain data selection methods based on such models.
We evaluate our data selection methods for neural machine translation across five diverse domains.
arXiv Detail & Related papers (2020-04-05T06:22:16Z)
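As referenced in the Multi-Stage Pre-training entry above, one ingredient of that approach is extending the LM's vocabulary with domain-specific terms. The sketch below shows one plausible way to do this with Hugging Face Transformers; the IT-domain terms are hypothetical placeholders, and the actual paper selects terms from an in-domain corpus.

```python
# Minimal vocabulary-extension sketch (an assumption, not the paper's code):
# add domain-specific terms to the tokenizer and resize the embedding matrix
# before continuing masked-LM pretraining on in-domain text.
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "roberta-large"  # the paper adapts a RoBERTa-large LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Hypothetical IT-domain terms; in practice they would be mined from the
# target corpus (e.g. frequent strings the base tokenizer over-fragments).
domain_terms = ["hotfix", "segfault", "stacktrace", "hypervisor"]
num_added = tokenizer.add_tokens(domain_terms)

# Newly added embedding rows are randomly initialized and get trained during
# the subsequent in-domain masked-LM pretraining stage.
model.resize_token_embeddings(len(tokenizer))
print(f"added {num_added} tokens; vocabulary size is now {len(tokenizer)}")
```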
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.