UDApter -- Efficient Domain Adaptation Using Adapters
- URL: http://arxiv.org/abs/2302.03194v1
- Date: Tue, 7 Feb 2023 02:04:17 GMT
- Title: UDApter -- Efficient Domain Adaptation Using Adapters
- Authors: Bhavitvya Malik, Abhinav Ramesh Kashyap, Min-Yen Kan, Soujanya Poria
- Abstract summary: We propose two methods to make unsupervised domain adaptation more parameter efficient.
The first method deconstructs UDA into a two-step process: first adding a domain adapter to learn domain-invariant information, then adding a task adapter that uses that information to learn task representations in the source domain.
We come within 0.85% F1 on the natural language inference task while fine-tuning only a fraction of the full model parameters.
- Score: 29.70751969196527
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose two methods to make unsupervised domain adaptation (UDA) more
parameter efficient using adapters, small bottleneck layers interspersed with
every layer of the large-scale pre-trained language model (PLM). The first
method deconstructs UDA into a two-step process: first by adding a domain
adapter to learn domain-invariant information and then by adding a task adapter
that uses domain-invariant information to learn task representations in the
source domain. The second method jointly learns a supervised classifier while
reducing a divergence measure between source and target representations.
Compared to strong baselines, our simple
methods perform well in natural language inference (MNLI) and the cross-domain
sentiment classification task. We even outperform unsupervised domain
adaptation methods such as DANN and DSN in sentiment classification, and we are
within 0.85% F1 for the natural language inference task, by fine-tuning only a
fraction of the full model parameters. We release our code at
https://github.com/declare-lab/UDAPTER
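
To make the abstract concrete, here is a minimal PyTorch sketch of the two ingredients it describes: a bottleneck adapter inserted into a frozen PLM, and the two-step composition of a domain adapter and a task adapter. The class names, dimensions, and the linear-kernel MMD stand-in for the divergence measure are illustrative assumptions, not taken from the released UDAPTER code.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Small bottleneck layer interspersed with each PLM layer:
    down-project, non-linearity, up-project, plus a residual connection."""
    def __init__(self, hidden_dim: int = 768, bottleneck_dim: int = 48):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return hidden + self.up(self.act(self.down(hidden)))

class TwoStepAdapters(nn.Module):
    """First method (sketch): a domain adapter is trained on unlabeled
    source + target text to capture domain-invariant information, then
    frozen while a task adapter is trained on labeled source data."""
    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        self.domain_adapter = BottleneckAdapter(hidden_dim)
        self.task_adapter = BottleneckAdapter(hidden_dim)

    def freeze_domain_adapter(self) -> None:
        for p in self.domain_adapter.parameters():
            p.requires_grad = False

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # The task adapter consumes domain-invariant features.
        return self.task_adapter(self.domain_adapter(hidden))

def mmd_linear(f_src: torch.Tensor, f_tgt: torch.Tensor) -> torch.Tensor:
    """Linear-kernel MMD, a simple stand-in for the divergence measure the
    second method minimizes jointly with the supervised classifier loss."""
    delta = f_src.mean(dim=0) - f_tgt.mean(dim=0)
    return delta.dot(delta)
```

In both methods the PLM itself stays frozen; only the adapters (and a task head) receive gradients, which is why just a fraction of the full model parameters are fine-tuned. For the second method, the joint objective would be roughly task_loss + lam * mmd_linear(source_feats, target_feats).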
Related papers
- EMPL: A novel Efficient Meta Prompt Learning Framework for Few-shot Unsupervised Domain Adaptation [22.586094394391747]
We propose a novel Efficient Meta Prompt Learning Framework for FS-UDA.
Within this framework, we use the pre-trained CLIP model as the feature learning base model.
Our method improves over state-of-the-art methods by at least 15.4% on 5-way 1-shot and 8.7% on 5-way 5-shot.
arXiv Detail & Related papers (2024-07-04T17:13:06Z)
- Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models Memories [31.995033685838962]
Pre-trained language models (PLMs) demonstrate excellent abilities to understand texts in the generic domain while struggling in a specific domain.
In this paper, we investigate whether we can adapt PLMs both effectively and efficiently by only tuning a few parameters.
Specifically, we decouple the feed-forward networks (FFNs) of the Transformer architecture into two parts: the original pre-trained FFNs to maintain the old-domain knowledge and our novel domain-specific adapters to inject domain-specific knowledge in parallel.
arXiv Detail & Related papers (2023-06-08T17:54:36Z)
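
A minimal sketch of the decoupled FFN described above, assuming the two branches run in parallel and their outputs are summed; the module names and the combination rule are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class ParallelDomainFFN(nn.Module):
    """Frozen pre-trained FFN plus a parallel domain-specific adapter.
    Summing the two branches preserves old-domain knowledge while
    injecting new-domain knowledge. (Addition is an assumption; the
    paper may mix the branches differently.)"""
    def __init__(self, pretrained_ffn: nn.Module,
                 hidden: int = 768, bottleneck: int = 64):
        super().__init__()
        self.ffn = pretrained_ffn
        for p in self.ffn.parameters():      # keep old-domain knowledge fixed
            p.requires_grad = False
        self.domain_adapter = nn.Sequential(  # tunable domain-specific branch
            nn.Linear(hidden, bottleneck),
            nn.GELU(),
            nn.Linear(bottleneck, hidden),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.ffn(x) + self.domain_adapter(x)
```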
- AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models [127.04370753583261]
Pretrained language models (PLMs) are trained on massive corpora, but often need to specialize to specific domains.
A solution is to use a related-domain adapter for the novel domain at test time.
We introduce AdapterSoup, an approach that performs weight-space averaging of adapters trained on different domains.
arXiv Detail & Related papers (2023-02-14T13:09:23Z)
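
Weight-space averaging of adapters is straightforward to sketch: average the parameters of several domain adapters into one "soup" adapter. How AdapterSoup selects which adapters to include (e.g., by domain similarity) is not shown here; uniform weighting is an assumption.

```python
import torch
from typing import Dict, List

def average_adapters(
    state_dicts: List[Dict[str, torch.Tensor]],
) -> Dict[str, torch.Tensor]:
    """Uniformly average the parameters of adapters trained on different
    domains, yielding a single adapter for use on a novel domain."""
    avg: Dict[str, torch.Tensor] = {}
    for key in state_dicts[0]:
        avg[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]
        ).mean(dim=0)
    return avg
```

At test time the averaged state dict is loaded into a single adapter, so inference cost matches that of one adapter regardless of how many were averaged.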
- Domain Adaptation via Prompt Learning [39.97105851723885]
Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain.
We introduce a novel prompt learning paradigm for UDA, named Domain Adaptation via Prompt Learning (DAPL).
arXiv Detail & Related papers (2022-02-14T13:25:46Z)
- Efficient Hierarchical Domain Adaptation for Pretrained Language Models [77.02962815423658]
Generative language models are trained on diverse, general domain corpora.
We introduce a method to scale domain adaptation to many diverse domains using a computationally efficient adapter approach.
arXiv Detail & Related papers (2021-12-16T11:09:29Z)
- Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters [66.7986513246294]
We study the compositionality of language and domain adapters in the context of Machine Translation.
We find that in the partial resource scenario a naive combination of domain-specific and language-specific adapters often results in 'catastrophic forgetting' of the missing languages.
arXiv Detail & Related papers (2021-10-18T18:55:23Z)
- Deep Subdomain Adaptation Network for Image Classification [32.58984565281493]
Deep Subdomain Adaptation Network (DSAN) learns a transfer network by aligning the relevant subdomain distributions of domain-specific layer activations.
Our DSAN is simple yet effective: it needs no adversarial training and converges fast.
Experiments demonstrate remarkable results on both object recognition tasks and digit classification tasks.
arXiv Detail & Related papers (2021-06-17T11:07:21Z)
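
A simplified sketch of subdomain alignment: match per-class feature statistics across domains, using true labels on the source side and pseudo-labels on the target side. DSAN's actual local MMD uses kernelized, soft-weighted statistics; plain per-class mean matching here is a hedged simplification.

```python
import torch

def subdomain_alignment(f_src: torch.Tensor, y_src: torch.Tensor,
                        f_tgt: torch.Tensor, y_tgt_pseudo: torch.Tensor,
                        num_classes: int) -> torch.Tensor:
    """Align each class-conditional subdomain by matching per-class
    feature means between source (true labels) and target (pseudo-labels).
    A simplified stand-in for DSAN's kernelized local MMD."""
    loss = f_src.new_zeros(())
    for c in range(num_classes):
        src_c = f_src[y_src == c]
        tgt_c = f_tgt[y_tgt_pseudo == c]
        if len(src_c) and len(tgt_c):  # skip classes absent from a batch
            delta = src_c.mean(dim=0) - tgt_c.mean(dim=0)
            loss = loss + delta.dot(delta)
    return loss / num_classes
```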
- Cross-domain Contrastive Learning for Unsupervised Domain Adaptation [108.63914324182984]
Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a fully-labeled source domain to a different unlabeled target domain.
We build upon contrastive self-supervised learning to align features so as to reduce the domain discrepancy between training and testing sets.
arXiv Detail & Related papers (2021-06-10T06:32:30Z)
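
A sketch of cross-domain contrastive alignment in the spirit of the summary above: each target feature is pulled toward source features sharing its pseudo-label and pushed away from the rest. The pairing rule and temperature are assumptions of this sketch, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cross_domain_info_nce(f_src: torch.Tensor, y_src: torch.Tensor,
                          f_tgt: torch.Tensor, y_tgt_pseudo: torch.Tensor,
                          tau: float = 0.07) -> torch.Tensor:
    """InfoNCE over source samples: positives for each target feature are
    the source features with the same (pseudo-)label."""
    f_src = F.normalize(f_src, dim=1)
    f_tgt = F.normalize(f_tgt, dim=1)
    logits = f_tgt @ f_src.t() / tau  # (n_tgt, n_src) cosine similarities
    pos_mask = (y_tgt_pseudo[:, None] == y_src[None, :]).float()
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    # average log-probability over each target sample's positives
    pos_count = pos_mask.sum(dim=1).clamp(min=1)
    loss = -(pos_mask * log_prob).sum(dim=1) / pos_count
    return loss.mean()
```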
- Contrastive Learning and Self-Training for Unsupervised Domain Adaptation in Semantic Segmentation [71.77083272602525]
UDA attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z)
- OVANet: One-vs-All Network for Universal Domain Adaptation [78.86047802107025]
Existing methods manually set a threshold to reject unknown samples based on validation or a pre-defined ratio of unknown samples.
We propose a method to learn the threshold using source samples and to adapt it to the target domain.
Our idea is that a minimum inter-class distance in the source domain should be a good threshold to decide between known or unknown in the target.
arXiv Detail & Related papers (2021-04-07T18:36:31Z)
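
A sketch of one common reading of the one-vs-all decision rule: each class gets a binary known-vs-unknown head trained on source samples, and a target sample is rejected as unknown when the head of its predicted class scores it below the learned boundary. The tensor layout and the 0.5 cut are assumptions of this sketch.

```python
import torch

def ova_predict(closed_logits: torch.Tensor,
                ova_logits: torch.Tensor,
                unknown_id: int = -1) -> torch.Tensor:
    """Sketch: the closed-set classifier picks a class; that class's
    one-vs-all head then decides known vs. unknown.
    closed_logits: (batch, num_classes)
    ova_logits:    (batch, num_classes, 2) with [not-this-class, this-class]
    """
    pred = closed_logits.argmax(dim=1)
    ova_prob = ova_logits.softmax(dim=2)
    known_prob = ova_prob[torch.arange(pred.size(0)), pred, 1]
    pred = pred.clone()
    pred[known_prob < 0.5] = unknown_id  # boundary learned from source samples
    return pred
```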