Evolving Domain Adaptation of Pretrained Language Models for Text
Classification
- URL: http://arxiv.org/abs/2311.09661v1
- Date: Thu, 16 Nov 2023 08:28:00 GMT
- Title: Evolving Domain Adaptation of Pretrained Language Models for Text
Classification
- Authors: Yun-Shiuan Chuang, Yi Wu, Dhruv Gupta, Rheeya Uppaal, Ananya Kumar,
Luhang Sun, Makesh Narsimhan Sreedhar, Sijia Yang, Timothy T. Rogers, Junjie
Hu
- Abstract summary: Adapting pre-trained language models (PLMs) for time-series text classification amidst evolving domain shifts (EDS) is critical for maintaining accuracy in applications like stance detection.
This study benchmarks the effectiveness of evolving domain adaptation (EDA) strategies, notably self-training, domain-adversarial training, and domain-adaptive pretraining, with a focus on an incremental self-training method.
- Score: 24.795214770636534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adapting pre-trained language models (PLMs) for time-series text
classification amidst evolving domain shifts (EDS) is critical for maintaining
accuracy in applications like stance detection. This study benchmarks the
effectiveness of evolving domain adaptation (EDA) strategies, notably
self-training, domain-adversarial training, and domain-adaptive pretraining,
with a focus on an incremental self-training method. Our analysis across
various datasets reveals that this incremental method excels at adapting PLMs
to EDS, outperforming traditional domain adaptation techniques. These findings
highlight the importance of continually updating PLMs to ensure their
effectiveness in real-world applications, paving the way for future research
into PLM robustness against the natural temporal evolution of language.
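As a rough illustration of the incremental self-training loop studied here, the sketch below pseudo-labels each chronological slice of unlabeled text with the current model and trains only on confident predictions; the confidence threshold and single-pass schedule are assumptions for exposition, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def incremental_self_training(model, optimizer, time_slices, threshold=0.9):
    """Incrementally self-train on chronologically ordered, unlabeled
    batches: pseudo-label each time slice with the current model, keep
    only confident predictions, and update before the next slice."""
    for unlabeled_loader in time_slices:  # one loader per time period
        for inputs in unlabeled_loader:
            with torch.no_grad():
                probs = F.softmax(model(inputs), dim=-1)
            conf, pseudo = probs.max(dim=-1)
            keep = conf >= threshold  # trust only confident pseudo-labels
            if not keep.any():
                continue
            loss = F.cross_entropy(model(inputs[keep]), pseudo[keep])
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```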
Related papers
- Stratified Domain Adaptation: A Progressive Self-Training Approach for Scene Text Recognition [1.2878987353423252]
Unsupervised domain adaptation (UDA) has become increasingly prevalent in scene text recognition (STR).
We introduce the Stratified Domain Adaptation (StrDA) approach, which stages the learning process along a gradual escalation of the domain gap.
We propose a novel method that employs domain discriminators to estimate how out-of-distribution and domain-discriminative each data sample is.
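A minimal sketch of the stratification idea, assuming a binary source-vs-target discriminator whose output probability serves as the domain-gap score; the bucket count and ordering rule are illustrative choices, not the paper's exact procedure.

```python
import torch

def stratify_by_domain_gap(discriminator, target_feats, n_strata=3):
    """Score unlabeled target samples with a source-vs-target discriminator
    and split them into strata of increasing estimated domain gap, so
    self-training can progress from the most source-like data outward."""
    with torch.no_grad():
        # Higher 'target' probability ~ farther from the source domain.
        scores = torch.sigmoid(discriminator(target_feats)).squeeze(-1)
    order = torch.argsort(scores)  # most source-like first
    return [target_feats[idx] for idx in torch.chunk(order, n_strata)]
```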
arXiv Detail & Related papers (2024-10-13T16:40:48Z)
- Investigating Continual Pretraining in Large Language Models: Insights and Implications [9.591223887442704]
This paper studies the evolving domain of Continual Learning in large language models (LLMs).
Our primary emphasis is on continual domain-adaptive pretraining, a process designed to equip LLMs with the ability to integrate new information from various domains.
We examine the impact of model size on learning efficacy and forgetting, as well as how the progression and similarity of emerging domains affect the knowledge transfer within these models.
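A schematic of the continual domain-adaptive pretraining setup, where `mlm_step` is an assumed helper returning the masked-LM loss for one batch; the actual study varies model sizes and domain orderings beyond this sketch.

```python
def continual_domain_pretraining(model, optimizer, domain_corpora, mlm_step):
    """Continue masked-LM pretraining on a stream of domain corpora in
    sequence; forgetting and transfer would be measured by re-probing
    earlier domains after each phase."""
    for domain_name, loader in domain_corpora:  # domains arrive in order
        for batch in loader:
            loss = mlm_step(model, batch)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```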
arXiv Detail & Related papers (2024-02-27T10:47:24Z)
- Progressive Conservative Adaptation for Evolving Target Domains [76.9274842289221]
Conventional domain adaptation typically transfers knowledge from a source domain to a stationary target domain.
In many real-world cases, however, target data arrive sequentially with continuously evolving distributions, and restoring and adapting to such target data escalates computational and resource consumption over time.
We propose a simple yet effective approach, termed progressive conservative adaptation (PCAda).
arXiv Detail & Related papers (2024-02-07T04:11:25Z)
- Open-Set Domain Adaptation with Visual-Language Foundation Models [51.49854335102149]
Unsupervised domain adaptation (UDA) has proven to be very effective in transferring knowledge from a source domain to a target domain with unlabeled data, but it assumes the two domains share the same label space.
Open-set domain adaptation (ODA) has emerged as a potential solution for identifying novel target-domain classes during the training phase.
arXiv Detail & Related papers (2023-07-30T11:38:46Z)
- IDA: Informed Domain Adaptive Semantic Segmentation [51.12107564372869]
We propose an Informed Domain Adaptation (IDA) model, a self-training framework that mixes the data based on class-level segmentation performance.
In our IDA model, class-level performance is tracked by an expected confidence score (ECS), and a dynamic schedule then determines the mixing ratio for data from the different domains.
Our proposed method outperforms the state-of-the-art UDA-SS method by a margin of 1.1 mIoU in the adaptation of GTA-V to Cityscapes and of 0.9 mIoU in the adaptation of SYNTHIA to Cityscapes.
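One plausible reading of the ECS-driven mixing, sketched with an exponential moving average and an inverse-confidence schedule; both are assumptions for illustration, not necessarily the paper's exact formulation.

```python
import torch

class ExpectedConfidenceScore:
    """Per-class expected confidence, tracked here with an exponential
    moving average; classes the model is less sure about get a larger
    share of the cross-domain data mix."""
    def __init__(self, n_classes, momentum=0.9):
        self.ecs = torch.full((n_classes,), 0.5)
        self.momentum = momentum

    def update(self, probs, pseudo_labels):
        for c in pseudo_labels.unique():
            conf_c = probs[pseudo_labels == c].max(dim=-1).values.mean()
            self.ecs[c] = self.momentum * self.ecs[c] + (1 - self.momentum) * conf_c

    def mixing_weights(self):
        # Inverse-confidence schedule: low-ECS classes get mixed in more.
        w = (1.0 - self.ecs).clamp_min(1e-6)
        return w / w.sum()
```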
arXiv Detail & Related papers (2023-03-05T18:16:34Z)
- On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey [15.533482481757353]
We propose a taxonomy of domain adaptation approaches from a machine learning system view.
We discuss and compare those methods and suggest promising future research directions.
arXiv Detail & Related papers (2022-11-06T15:32:00Z)
- Domain Adaptation with Adversarial Training on Penultimate Activations [82.9977759320565]
Enhancing model prediction confidence on unlabeled target data is an important objective in Unsupervised Domain Adaptation (UDA).
We show that adversarial training on the penultimate activations is more efficient, and better correlated with the objective of boosting prediction confidence, than adversarial training on input images or intermediate features.
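A simplified sketch of adversarial training on penultimate activations, using prediction entropy as the adversarial objective; the exact loss and step rule in the paper may differ, and in this sketch only the classifier head receives gradients.

```python
import torch
import torch.nn.functional as F

def adversarial_penultimate_loss(features, head, epsilon=1.0):
    """Perturb penultimate activations in the direction that most increases
    prediction entropy, then penalize entropy at the perturbed point so
    predictions stay confident in a neighborhood of the features."""
    feats = features.detach().requires_grad_(True)
    probs = F.softmax(head(feats), dim=-1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(-1).mean()
    grad, = torch.autograd.grad(entropy, feats)
    delta = epsilon * F.normalize(grad, dim=-1)  # worst-case direction
    adv_probs = F.softmax(head(feats + delta), dim=-1)
    return -(adv_probs * adv_probs.clamp_min(1e-8).log()).sum(-1).mean()
```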
arXiv Detail & Related papers (2022-08-26T19:50:46Z)
- Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training [47.12438995938133]
We adapt pre-trained language models (PrLMs) to new domains without fine-tuning.
We present class-aware feature self-distillation (CFd) to learn discriminative features from PrLMs.
Experiments on two monolingual and multilingual Amazon review datasets show that CFd can consistently improve the performance of self-training.
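A rough sketch of a class-aware feature self-distillation loss, with MSE distillation against the frozen PrLM and pseudo-class centroids as illustrative choices; the weighting and distance functions are assumptions, not CFd's published form.

```python
import torch
import torch.nn.functional as F

def class_aware_self_distillation_loss(student_feats, teacher_feats,
                                       pseudo_labels, class_weight=0.1):
    """Match the student's features to the frozen PLM teacher's (feature
    self-distillation) and pull each feature toward its pseudo-class
    centroid so the features stay class-discriminative."""
    distill = F.mse_loss(student_feats, teacher_feats.detach())
    class_term = student_feats.new_zeros(())
    for c in pseudo_labels.unique():
        members = student_feats[pseudo_labels == c]
        centroid = members.mean(dim=0, keepdim=True).detach()
        class_term = class_term + (members - centroid).pow(2).sum(-1).mean()
    return distill + class_weight * class_term
```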
arXiv Detail & Related papers (2020-09-24T08:04:37Z)
- Adaptive Risk Minimization: Learning to Adapt to Domain Shift [109.87561509436016]
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
In this work, we consider the problem setting of domain generalization, where the training data are structured into domains and there may be multiple test time shifts.
We introduce the framework of adaptive risk minimization (ARM), in which models are directly optimized for effective adaptation to shift by learning to adapt on the training domains.
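A schematic ARM training step, where an assumed `adapter` module summarizes each domain's batch into a context that conditions the model; the `model(inputs, context)` signature is hypothetical, standing in for whatever conditioning mechanism is used.

```python
def arm_training_step(model, adapter, optimizer, domain_batches, loss_fn):
    """One adaptive risk minimization step: adapt to each training domain's
    batch via a learned context, then minimize the post-adaptation loss so
    the model learns *how* to adapt at test time."""
    total = 0.0
    for inputs, labels in domain_batches:  # one batch per training domain
        context = adapter(inputs)          # batch-level context vector
        logits = model(inputs, context)    # prediction conditioned on context
        total = total + loss_fn(logits, labels)
    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    return float(total)
```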
arXiv Detail & Related papers (2020-07-06T17:59:30Z)
- Don't Stop Pretraining: Adapt Language Models to Domains and Tasks [81.99843216550306]
We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks.
A second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains.
Adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining.
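The two continued-pretraining phases can be sketched as one loop run twice, first over a broad in-domain corpus (DAPT) and then over the task's own unlabeled text (TAPT); `mlm_step` is an assumed helper returning the masked-LM loss for one batch.

```python
def dapt_then_tapt(model, optimizer, domain_corpus, task_corpus, mlm_step):
    """Two continued-pretraining phases before task fine-tuning:
    domain-adaptive pretraining (DAPT), then task-adaptive pretraining
    (TAPT) on the task's unlabeled data."""
    for corpus in (domain_corpus, task_corpus):  # DAPT, then TAPT
        for batch in corpus:
            loss = mlm_step(model, batch)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```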
arXiv Detail & Related papers (2020-04-23T04:21:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.