On the Domain Adaptation and Generalization of Pretrained Language
Models: A Survey
- URL: http://arxiv.org/abs/2211.03154v1
- Date: Sun, 6 Nov 2022 15:32:00 GMT
- Title: On the Domain Adaptation and Generalization of Pretrained Language
Models: A Survey
- Authors: Xu Guo, Han Yu
- Abstract summary: We propose a taxonomy of domain adaptation approaches from a machine learning system view.
We discuss and compare those methods and suggest promising future research directions.
- Score: 15.533482481757353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in NLP are brought by a range of large-scale pretrained
language models (PLMs). These PLMs have brought significant performance gains
for a range of NLP tasks, circumventing the need to customize complex designs
for specific tasks. However, most current work focus on finetuning PLMs on a
domain-specific datasets, ignoring the fact that the domain gap can lead to
overfitting and even performance drop. Therefore, it is practically important
to find an appropriate method to effectively adapt PLMs to a target domain of
interest. Recently, a range of methods have been proposed to achieve this
purpose. Early surveys on domain adaptation are not suitable for PLMs due to
the sophisticated behavior exhibited by PLMs from traditional models trained
from scratch and that domain adaptation of PLMs need to be redesigned to take
effect. This paper aims to provide a survey on these newly proposed methods and
shed light in how to apply traditional machine learning methods to newly
evolved and future technologies. By examining the issues of deploying PLMs for
downstream tasks, we propose a taxonomy of domain adaptation approaches from a
machine learning system view, covering methods for input augmentation, model
optimization and personalization. We discuss and compare those methods and
suggest promising future research directions.
Related papers
- Aligning CodeLLMs with Direct Preference Optimization [44.34483822102872]
This work first identifies that the commonly used PPO algorithm may be suboptimal for the alignment of CodeLLM.
Based on only preference data pairs, DPO can render the model rank data automatically, giving rise to a fine-grained rewarding pattern.
Studies show that our method significantly improves the performance of existing CodeLLMs on benchmarks such as MBPP and HumanEval.
arXiv Detail & Related papers (2024-10-24T09:36:13Z) - The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities [0.35998666903987897]
This report examines the fine-tuning of Large Language Models (LLMs)
It outlines the historical evolution of LLMs from traditional Natural Language Processing (NLP) models to their pivotal role in AI.
The report introduces a structured seven-stage pipeline for fine-tuning LLMs.
arXiv Detail & Related papers (2024-08-23T14:48:02Z) - MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization [73.7779735046424]
We show that different prompts should be adapted to different Large Language Models (LLM) to enhance their capabilities across various downstream tasks in NLP.
We then propose a model-adaptive prompt (MAPO) method that optimize the original prompts for each specific LLM in downstream tasks.
arXiv Detail & Related papers (2024-07-04T18:39:59Z) - Investigating Continual Pretraining in Large Language Models: Insights
and Implications [9.591223887442704]
This paper studies the evolving domain of Continual Learning in large language models (LLMs)
Our primary emphasis is on continual domain-adaptive pretraining, a process designed to equip LLMs with the ability to integrate new information from various domains.
We examine the impact of model size on learning efficacy and forgetting, as well as how the progression and similarity of emerging domains affect the knowledge transfer within these models.
arXiv Detail & Related papers (2024-02-27T10:47:24Z) - PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs [49.32067576992511]
Large language models often fall short of the performance achieved by domain-specific state-of-the-art models.
One potential approach to enhance domain-specific capabilities of LLMs involves fine-tuning them using corresponding datasets.
We propose Preference Adaptation for Enhancing Domain-specific Abilities of LLMs (PANDA)
Our experimental results reveal that PANDA significantly enhances the domain-specific ability of LLMs on text classification and interactive decision tasks.
arXiv Detail & Related papers (2024-02-20T09:02:55Z) - Evolving Domain Adaptation of Pretrained Language Models for Text
Classification [24.795214770636534]
Adapting pre-trained language models (PLMs) for time-series text classification amidst evolving domain shifts (EDS) is critical for maintaining accuracy in applications like stance detection.
This study benchmarks the effectiveness of evolving domain adaptation (EDA) strategies, notably self-training, domain-adversarial training, and domain-adaptive pretraining, with a focus on an incremental self-training method.
arXiv Detail & Related papers (2023-11-16T08:28:00Z) - Learning Transferable Conceptual Prototypes for Interpretable
Unsupervised Domain Adaptation [79.22678026708134]
In this paper, we propose an inherently interpretable method, named Transferable Prototype Learning ( TCPL)
To achieve this goal, we design a hierarchically prototypical module that transfers categorical basic concepts from the source domain to the target domain and learns domain-shared prototypes for explaining the underlying reasoning process.
Comprehensive experiments show that the proposed method can not only provide effective and intuitive explanations but also outperform previous state-of-the-arts.
arXiv Detail & Related papers (2023-10-12T06:36:41Z) - Open-Set Domain Adaptation with Visual-Language Foundation Models [51.49854335102149]
Unsupervised domain adaptation (UDA) has proven to be very effective in transferring knowledge from a source domain to a target domain with unlabeled data.
Open-set domain adaptation (ODA) has emerged as a potential solution to identify these classes during the training phase.
arXiv Detail & Related papers (2023-07-30T11:38:46Z) - Pre-trained Language Models for Keyphrase Generation: A Thorough
Empirical Study [76.52997424694767]
We present an in-depth empirical study of keyphrase extraction and keyphrase generation using pre-trained language models.
We show that PLMs have competitive high-resource performance and state-of-the-art low-resource performance.
Further results show that in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models.
arXiv Detail & Related papers (2022-12-20T13:20:21Z) - KALA: Knowledge-Augmented Language Model Adaptation [65.92457495576141]
We propose a novel domain adaption framework for pre-trained language models (PLMs)
Knowledge-Augmented Language model Adaptation (KALA) modulates the intermediate hidden representations of PLMs with domain knowledge.
Results show that, despite being computationally efficient, our KALA largely outperforms adaptive pre-training.
arXiv Detail & Related papers (2022-04-22T08:11:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.