How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation?
- URL: http://arxiv.org/abs/2401.17514v2
- Date: Mon, 1 Apr 2024 20:40:40 GMT
- Title: How Useful is Continued Pre-Training for Generative Unsupervised Domain Adaptation?
- Authors: Rheeya Uppaal, Yixuan Li, Junjie Hu,
- Abstract summary: We evaluate the utility of Continued Pre-Training (CPT) for generative UDA.
Our findings suggest that a implicitly learns the downstream task while predicting masked words informative to that task.
- Score: 23.454153602068786
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent breakthroughs in scale have enabled the emergence of powerful generative language models, and the ability to fine-tune these models on various tasks by casting them into prompts or instructions. In this landscape, the problem of Unsupervised Domain Adaptation (UDA), or the problem of leveraging knowledge from a labeled source domain to an unlabeled target domain, has been left behind, with recent UDA methods still addressing discriminative classification. In particular, two popular UDA approaches, involving Continued Pre-Training (CPT) and learning domain invariant representations, have been under-explored in the generative setting, signaling a gap. In this work, we evaluate the utility of CPT for generative UDA. We first perform an empirical evaluation to measure the trade-offs between CPT and strong methods promoting domain invariance. We further evaluate how well the benefits of CPT extend to different architectures, tuning methods and data regimes. We then motivate the use of CPT by studying to what degree it benefits classification performance on the target domain. Finally, we attempt to understand the mechanism behind which CPT improves classification performance on the unlabeled target domain. Our findings suggest that a implicitly learns the downstream task while predicting masked words informative to that task. Our work connects the body of UDA research with that of instruction tuning, enabling an initial step towards a wider applicability of modern language models.
Related papers
- Enhancing Domain Adaptation through Prompt Gradient Alignment [16.618313165111793]
We develop a line of works based on prompt learning to learn both domain-invariant and specific features.
We cast UDA as a multiple-objective optimization problem in which each objective is represented by a domain loss.
Our method consistently surpasses other prompt-based baselines by a large margin on different UDA benchmarks.
arXiv Detail & Related papers (2024-06-13T17:40:15Z) - Open-Set Domain Adaptation with Visual-Language Foundation Models [51.49854335102149]
Unsupervised domain adaptation (UDA) has proven to be very effective in transferring knowledge from a source domain to a target domain with unlabeled data.
Open-set domain adaptation (ODA) has emerged as a potential solution to identify these classes during the training phase.
arXiv Detail & Related papers (2023-07-30T11:38:46Z) - Domain Adaptation with Adversarial Training on Penultimate Activations [82.9977759320565]
Enhancing model prediction confidence on unlabeled target data is an important objective in Unsupervised Domain Adaptation (UDA)
We show that this strategy is more efficient and better correlated with the objective of boosting prediction confidence than adversarial training on input images or intermediate features.
arXiv Detail & Related papers (2022-08-26T19:50:46Z) - Prior Knowledge Guided Unsupervised Domain Adaptation [82.9977759320565]
We propose a Knowledge-guided Unsupervised Domain Adaptation (KUDA) setting where prior knowledge about the target class distribution is available.
In particular, we consider two specific types of prior knowledge about the class distribution in the target domain: Unary Bound and Binary Relationship.
We propose a rectification module that uses such prior knowledge to refine model generated pseudo labels.
arXiv Detail & Related papers (2022-07-18T18:41:36Z) - Connect, Not Collapse: Explaining Contrastive Learning for Unsupervised
Domain Adaptation [88.5448806952394]
We consider unsupervised domain adaptation (UDA), where labeled data from a source domain and unlabeled data from a target domain are used to learn a classifier for the target domain.
We show that contrastive pre-training, which learns features on unlabeled source and target data and then fine-tunes on labeled source data, is competitive with strong UDA methods.
arXiv Detail & Related papers (2022-04-01T16:56:26Z) - Domain Adaptation via Prompt Learning [39.97105851723885]
Unsupervised domain adaption (UDA) aims to adapt models learned from a well-annotated source domain to a target domain.
We introduce a novel prompt learning paradigm for UDA, named Domain Adaptation via Prompt Learning (DAPL)
arXiv Detail & Related papers (2022-02-14T13:25:46Z) - RefRec: Pseudo-labels Refinement via Shape Reconstruction for
Unsupervised 3D Domain Adaptation [23.275638060852067]
RefRec is the first approach to investigate pseudo-labels and self-training in UDA for point clouds.
We present two main innovations to make self-training effective on 3D data.
arXiv Detail & Related papers (2021-10-21T10:24:05Z) - Self-Domain Adaptation for Face Anti-Spoofing [31.441928816043536]
We propose a self-domain adaptation framework to leverage the unlabeled test domain data at inference.
A meta-learning based adaptor learning algorithm is proposed using the data of multiple source domains at the training step.
arXiv Detail & Related papers (2021-02-24T08:46:39Z) - Effective Label Propagation for Discriminative Semi-Supervised Domain
Adaptation [76.41664929948607]
Semi-supervised domain adaptation (SSDA) methods have demonstrated great potential in large-scale image classification tasks.
We present a novel and effective method to tackle this problem by using effective inter-domain and intra-domain semantic information propagation.
Our source code and pre-trained models will be released soon.
arXiv Detail & Related papers (2020-12-04T14:28:19Z) - Class-Incremental Domain Adaptation [56.72064953133832]
We introduce a practical Domain Adaptation (DA) paradigm called Class-Incremental Domain Adaptation (CIDA)
Existing DA methods tackle domain-shift but are unsuitable for learning novel target-domain classes.
Our approach yields superior performance as compared to both DA and CI methods in the CIDA paradigm.
arXiv Detail & Related papers (2020-08-04T07:55:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.