Related papers: On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey

URL: http://arxiv.org/abs/2211.03154v1
Date: Sun, 6 Nov 2022 15:32:00 GMT
Title: On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey
Authors: Xu Guo, Han Yu
Abstract summary: We propose a taxonomy of domain adaptation approaches from a machine learning system view. We discuss and compare those methods and suggest promising future research directions.
Score: 15.533482481757353
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in NLP have been brought about by a range of large-scale pretrained language models (PLMs). These PLMs have brought significant performance gains for a range of NLP tasks, circumventing the need to customize complex designs for specific tasks. However, most current work focuses on finetuning PLMs on domain-specific datasets, ignoring the fact that the domain gap can lead to overfitting and even a performance drop. Therefore, it is practically important to find an appropriate method to effectively adapt PLMs to a target domain of interest. Recently, a range of methods have been proposed to achieve this purpose. Early surveys on domain adaptation are not suitable for PLMs, because PLMs behave in more sophisticated ways than traditional models trained from scratch and domain adaptation methods need to be redesigned to take effect on PLMs. This paper aims to provide a survey of these newly proposed methods and shed light on how to apply traditional machine learning methods to newly evolved and future technologies. By examining the issues of deploying PLMs for downstream tasks, we propose a taxonomy of domain adaptation approaches from a machine learning system view, covering methods for input augmentation, model optimization and personalization. We discuss and compare those methods and suggest promising future research directions.

Related papers

CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey [38.281260447611395]
This survey systematically explores the applications of Contrastive Language-Image Pretraining (CLIP) in domain generalization (DG) and domain adaptation (DA). CLIP offers powerful zero-shot capabilities that allow models to perform effectively in unseen domains. Key challenges, including overfitting, domain diversity, and computational efficiency, are addressed.
arXiv Detail & Related papers (2025-04-19T12:27:24Z)
A Survey of Direct Preference Optimization [103.59317151002693]
Large Language Models (LLMs) have demonstrated unprecedented generative capabilities. Their alignment with human values remains critical for ensuring helpful and harmless deployments. Direct Preference Optimization (DPO) has recently gained prominence as a streamlined alternative.
arXiv Detail & Related papers (2025-03-12T08:45:15Z)
LLM Post-Training: A Deep Dive into Reasoning Large Language Models [131.10969986056]
Large Language Models (LLMs) have transformed the natural language processing landscape and brought to life diverse applications. Post-training methods enable LLMs to refine their knowledge, improve reasoning, enhance factual accuracy, and align more effectively with user intents and ethical considerations.
arXiv Detail & Related papers (2025-02-28T18:59:54Z)
Aligning CodeLLMs with Direct Preference Optimization [44.34483822102872]
This work first identifies that the commonly used PPO algorithm may be suboptimal for the alignment of CodeLLMs. Based on only preference data pairs, DPO can make the model rank data automatically, giving rise to a fine-grained rewarding pattern. Studies show that our method significantly improves the performance of existing CodeLLMs on benchmarks such as MBPP and HumanEval.
arXiv Detail & Related papers (2024-10-24T09:36:13Z)
The Ultimate Guide to Fine-Tuning LLMs from Basics to Breakthroughs: An Exhaustive Review of Technologies, Research, Best Practices, Applied Research Challenges and Opportunities [0.35998666903987897]
This report examines the fine-tuning of Large Language Models (LLMs). It outlines the historical evolution of LLMs from traditional Natural Language Processing (NLP) models to their pivotal role in AI. The report introduces a structured seven-stage pipeline for fine-tuning LLMs.
arXiv Detail & Related papers (2024-08-23T14:48:02Z)
MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization [73.7779735046424]
We show that different prompts should be adapted to different Large Language Models (LLMs) to enhance their capabilities across various downstream tasks in NLP. We then propose a model-adaptive prompt optimization (MAPO) method that optimizes the original prompts for each specific LLM in downstream tasks.
arXiv Detail & Related papers (2024-07-04T18:39:59Z)
Investigating Continual Pretraining in Large Language Models: Insights and Implications [9.591223887442704]
This paper studies the evolving domain of Continual Learning in large language models (LLMs). Our primary emphasis is on continual domain-adaptive pretraining, a process designed to equip LLMs with the ability to integrate new information from various domains. We examine the impact of model size on learning efficacy and forgetting, as well as how the progression and similarity of emerging domains affect the knowledge transfer within these models.
arXiv Detail & Related papers (2024-02-27T10:47:24Z)
PANDA: Preference Adaptation for Enhancing Domain-Specific Abilities of LLMs [49.32067576992511]
Large language models often fall short of the performance achieved by domain-specific state-of-the-art models. One potential approach to enhance domain-specific capabilities of LLMs involves fine-tuning them using corresponding datasets. We propose Preference Adaptation for Enhancing Domain-specific Abilities of LLMs (PANDA). Our experimental results reveal that PANDA significantly enhances the domain-specific ability of LLMs on text classification and interactive decision tasks.
arXiv Detail & Related papers (2024-02-20T09:02:55Z)
Evolving Domain Adaptation of Pretrained Language Models for Text Classification [24.795214770636534]
Adapting pre-trained language models (PLMs) for time-series text classification amidst evolving domain shifts (EDS) is critical for maintaining accuracy in applications like stance detection. This study benchmarks the effectiveness of evolving domain adaptation (EDA) strategies, notably self-training, domain-adversarial training, and domain-adaptive pretraining, with a focus on an incremental self-training method.
arXiv Detail & Related papers (2023-11-16T08:28:00Z)
Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation [79.22678026708134]
In this paper, we propose an inherently interpretable method, named Transferable Conceptual Prototype Learning (TCPL). To achieve this goal, we design a hierarchically prototypical module that transfers categorical basic concepts from the source domain to the target domain and learns domain-shared prototypes for explaining the underlying reasoning process. Comprehensive experiments show that the proposed method can not only provide effective and intuitive explanations but also outperform previous state-of-the-art methods.
arXiv Detail & Related papers (2023-10-12T06:36:41Z)
Open-Set Domain Adaptation with Visual-Language Foundation Models [51.49854335102149]
Unsupervised domain adaptation (UDA) has proven to be very effective in transferring knowledge from a source domain to a target domain with unlabeled data. Open-set domain adaptation (ODA) has emerged as a potential solution to identify these classes during the training phase.
arXiv Detail & Related papers (2023-07-30T11:38:46Z)
Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study [76.52997424694767]
We present an in-depth empirical study of keyphrase extraction and keyphrase generation using pre-trained language models. We show that PLMs have competitive high-resource performance and state-of-the-art low-resource performance. Further results show that in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models.
arXiv Detail & Related papers (2022-12-20T13:20:21Z)
KALA: Knowledge-Augmented Language Model Adaptation [65.92457495576141]
We propose a novel domain adaptation framework for pre-trained language models (PLMs). Knowledge-Augmented Language model Adaptation (KALA) modulates the intermediate hidden representations of PLMs with domain knowledge. Results show that, despite being computationally efficient, our KALA largely outperforms adaptive pre-training.
arXiv Detail & Related papers (2022-04-22T08:11:59Z)
