Pre-trained Models for Natural Language Processing: A Survey
- URL: http://arxiv.org/abs/2003.08271v4
- Date: Wed, 23 Jun 2021 17:40:26 GMT
- Title: Pre-trained Models for Natural Language Processing: A Survey
- Authors: Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, and
Xuanjing Huang
- Abstract summary: The emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era.
This survey is intended to serve as a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
- Score: 75.95500552357429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the emergence of pre-trained models (PTMs) has brought natural
language processing (NLP) to a new era. In this survey, we provide a
comprehensive review of PTMs for NLP. We first briefly introduce language
representation learning and its research progress. Then we systematically
categorize existing PTMs based on a taxonomy with four perspectives. Next, we
describe how to adapt the knowledge of PTMs to the downstream tasks. Finally,
we outline some potential directions of PTMs for future research. This survey
is intended to serve as a hands-on guide for understanding, using, and developing
PTMs for various NLP tasks.
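The survey's discussion of adapting PTM knowledge to downstream tasks can be made concrete with a minimal fine-tuning sketch. This is an illustrative example rather than a procedure taken from the survey; it assumes the Hugging Face transformers library, and the bert-base-uncased checkpoint, toy data, and hyperparameters are arbitrary choices.

```python
# Minimal sketch: adapting a pre-trained encoder to a downstream
# classification task by fine-tuning. Checkpoint, data, and
# hyperparameters are illustrative assumptions, not the survey's.
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # adds a fresh task head on top of the pre-trained encoder
)

# Toy labelled examples standing in for a real downstream dataset.
texts = ["a wonderful, heartfelt film", "a dull and lifeless sequel"]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few gradient steps; real fine-tuning runs for epochs
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    print(model(**batch).logits.argmax(dim=-1))  # predicted class indices
```

Lighter-weight adaptation (e.g., keeping the encoder frozen and training only the task head) follows the same pattern with fewer trainable parameters.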
Related papers
- Continual Learning with Pre-Trained Models: A Survey [61.97613090666247]
Continual Learning (CL) aims to overcome the catastrophic forgetting of former knowledge when learning new tasks.
This paper presents a comprehensive survey of the latest advancements in PTM-based CL.
arXiv Detail & Related papers (2024-01-29T18:27:52Z) - A Communication Theory Perspective on Prompting Engineering Methods for
Large Language Models [30.57652062704016]
This article reviews existing prompt engineering (PE) methods from a novel perspective, within the well-established framework of communication theory.
It aims to give a deeper understanding of the development trends of PE methods used in four typical tasks.
arXiv Detail & Related papers (2023-10-24T03:05:21Z) - Naming Practices of Pre-Trained Models in Hugging Face [4.956536094440504]
Pre-Trained Models (PTMs) are increasingly adopted as components of computer systems.
Researchers publish PTMs, which engineers adapt for quality or performance prior to deployment.
Prior research has reported that model names are not always well chosen and are sometimes erroneous.
In this paper, we frame and conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry.
arXiv Detail & Related papers (2023-10-02T21:13:32Z) - A Survey on Knowledge-Enhanced Pre-trained Language Models [8.54551743144995]
Natural Language Processing (NLP) has been revolutionized by the use of Pre-trained Language Models (PLMs).
Despite setting new records in nearly every NLP task, PLMs still face a number of challenges, including poor interpretability, weak reasoning capability, and the need for large amounts of expensive annotated data when applied to downstream tasks.
Knowledge-Enhanced Pre-trained Language Models (KEPLMs) aim to address these limitations by integrating external knowledge into PLMs.
arXiv Detail & Related papers (2022-12-27T09:54:14Z) - Prompt Tuning for Discriminative Pre-trained Language Models [96.04765512463415]
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks.
It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned.
We present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem.
arXiv Detail & Related papers (2022-05-23T10:11:50Z) - A Survey on Programmatic Weak Supervision [74.13976343129966]
We give a brief introduction to the PWS learning paradigm and review representative approaches for each component of the PWS learning workflow.
We identify several critical challenges that remain under-explored, in the hope of inspiring future research directions in the field.
arXiv Detail & Related papers (2022-02-11T04:05:38Z) - AdaPrompt: Adaptive Model Training for Prompt-based NLP [77.12071707955889]
We propose AdaPrompt, which adaptively retrieves external data for continual pretraining of PLMs.
Experimental results on five NLP benchmarks show that AdaPrompt can improve over standard PLMs in few-shot settings.
In zero-shot settings, our method outperforms standard prompt-based methods by up to 26.35% relative error reduction. A minimal sketch of a generic prompt-based baseline of this kind appears after this list.
arXiv Detail & Related papers (2022-02-10T04:04:57Z) - Ranking and Tuning Pre-trained Models: A New Paradigm of Exploiting
Model Hubs [136.4492678691406]
We propose a new paradigm of exploiting model hubs by ranking and tuning pre-trained models.
The best ranked PTM can be fine-tuned and deployed if we have no preference for the model's architecture.
The tuning part introduces a novel method for tuning multiple PTMs, which surpasses dedicated methods. A toy ranking sketch based on a simple linear-probe proxy appears at the end of this list.
arXiv Detail & Related papers (2021-10-20T12:59:23Z) - A Survey of Knowledge Enhanced Pre-trained Models [28.160826399552462]
We refer to pre-trained language models with knowledge injection as knowledge-enhanced pre-trained language models (KEPLMs).
These models demonstrate deep understanding and logical reasoning and introduce interpretability.
arXiv Detail & Related papers (2021-10-01T08:51:58Z) - AMMUS : A Survey of Transformer-based Pretrained Models in Natural
Language Processing [0.0]
Transformer-based pretrained language models (T-PTLMs) have achieved great success in almost every NLP task.
Transformer-based PTLMs learn universal language representations from large volumes of text data using self-supervised learning.
These models provide useful background knowledge for downstream tasks, which avoids training downstream models from scratch.
arXiv Detail & Related papers (2021-08-12T05:32:18Z)
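For the prompt-based methods mentioned above (the standard baselines that AdaPrompt compares against, and the masked-LM-style prompting that DPT contrasts with its discriminative formulation), the following is a minimal zero-shot sketch: a masked language model scores hand-picked label words at a [MASK] slot. It assumes the Hugging Face transformers library; the template, verbalizer, and checkpoint are illustrative assumptions, not taken from any of the papers listed here.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def prompt_classify(text, verbalizer):
    """Zero-shot classification: score each label word at the [MASK] slot."""
    prompt = f"{text} It was {tokenizer.mask_token}."
    enc = tokenizer(prompt, return_tensors="pt")
    mask_pos = (enc["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = mlm(**enc).logits[0, mask_pos]  # vocabulary scores at the mask position
    scores = {label: logits[tokenizer.convert_tokens_to_ids(word)].item()
              for label, word in verbalizer.items()}
    return max(scores, key=scores.get)

# Template and verbalizer are illustrative assumptions.
print(prompt_classify("The plot dragged and the jokes fell flat.",
                      {"positive": "great", "negative": "terrible"}))
```

Prompt tuning methods build on this setup by learning the prompt (e.g., continuous prompt embeddings) rather than hand-writing it.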
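The ranking-then-tuning idea from "Ranking and Tuning Pre-trained Models" can be loosely approximated by scoring each candidate checkpoint with a cheap probe on a small labelled sample and fine-tuning only the winner. The sketch below uses cross-validated linear-probe accuracy as a stand-in proxy; it is not the transferability metric proposed in that paper, and the checkpoints and toy data are illustrative assumptions.

```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from transformers import AutoModel, AutoTokenizer

# Tiny labelled probe set standing in for real downstream data.
texts = [
    "an absolute joy to watch", "charming and well acted",
    "beautifully shot and moving", "a clever, satisfying story",
    "a tedious, forgettable mess", "poorly written and overlong",
    "flat characters and no tension", "a waste of a good cast",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

candidates = ["distilbert-base-uncased", "bert-base-uncased"]  # hub checkpoints to rank

def embed(name, sentences):
    tok = AutoTokenizer.from_pretrained(name)
    encoder = AutoModel.from_pretrained(name).eval()
    feats = []
    with torch.no_grad():
        for s in sentences:
            enc = tok(s, return_tensors="pt", truncation=True)
            hidden = encoder(**enc).last_hidden_state          # (1, seq_len, dim)
            feats.append(hidden.mean(dim=1).squeeze(0).numpy())  # mean-pooled sentence vector
    return np.stack(feats)

scores = {}
for name in candidates:
    X = embed(name, texts)
    # Cheap proxy for transferability: cross-validated linear-probe accuracy.
    scores[name] = cross_val_score(LogisticRegression(max_iter=1000), X, labels, cv=2).mean()

best = max(scores, key=scores.get)
print(scores, "->", best)  # the top-ranked checkpoint would then be fine-tuned
```

In practice, the paper's point is that such ranking avoids fine-tuning every checkpoint in a model hub; only the selected PTM (or a small shortlist) is tuned for deployment.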