Pre-trained Models for Natural Language Processing: A Survey
- URL: http://arxiv.org/abs/2003.08271v4
- Date: Wed, 23 Jun 2021 17:40:26 GMT
- Title: Pre-trained Models for Natural Language Processing: A Survey
- Authors: Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, and
Xuanjing Huang
- Abstract summary: The emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era.
This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
- Score: 75.95500552357429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the emergence of pre-trained models (PTMs) has brought natural
language processing (NLP) to a new era. In this survey, we provide a
comprehensive review of PTMs for NLP. We first briefly introduce language
representation learning and its research progress. Then we systematically
categorize existing PTMs based on a taxonomy with four perspectives. Next, we
describe how to adapt the knowledge of PTMs to the downstream tasks. Finally,
we outline some potential directions of PTMs for future research. This survey
is intended to be a hands-on guide for understanding, using, and developing
PTMs for various NLP tasks.
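The adaptation step the abstract mentions (transferring PTM knowledge to downstream tasks) can be sketched in the common "feature extraction" style, where a frozen pre-trained encoder supplies features and only a small task head is trained. Everything below (the toy embeddings, data, and names) is an illustrative assumption, not taken from the survey:

```python
import math

# Toy stand-in for a pre-trained encoder: fixed word vectors that would
# normally come from large-scale self-supervised pretraining. All names
# and data here are hypothetical, for illustration only.
PRETRAINED_EMBEDDINGS = {
    "great": [1.0, 0.2], "good": [0.8, 0.1],
    "bad": [-0.9, 0.3], "awful": [-1.0, 0.4],
    "movie": [0.0, 1.0], "plot": [0.1, 0.9],
}

def encode(sentence):
    """Feature extraction: average frozen pre-trained word vectors."""
    vecs = [PRETRAINED_EMBEDDINGS[w] for w in sentence.split()
            if w in PRETRAINED_EMBEDDINGS]
    n = max(len(vecs), 1)
    return [sum(v[i] for v in vecs) / n for i in range(2)]

def train_head(data, epochs=200, lr=0.5):
    """Fit a small task head (logistic regression) on frozen features:
    the encoder stays fixed, only the head's parameters are updated."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for sentence, label in data:
            x = encode(sentence)
            p = 1 / (1 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            g = p - label  # gradient of the log-loss w.r.t. the logit
            w = [w[i] - lr * g * x[i] for i in range(2)]
            b -= lr * g
    return w, b

def predict(w, b, sentence):
    x = encode(sentence)
    return 1 if (w[0] * x[0] + w[1] * x[1] + b) > 0 else 0

train = [("great movie", 1), ("good plot", 1),
         ("bad movie", 0), ("awful plot", 0)]
w, b = train_head(train)
print(predict(w, b, "great plot"))   # hypothetical positive review -> 1
print(predict(w, b, "awful movie"))  # hypothetical negative review -> 0
```

Full fine-tuning, the other adaptation strategy the survey covers, would additionally update the encoder's parameters rather than keeping them frozen.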
Related papers
- Continual Learning with Pre-Trained Models: A Survey [61.97613090666247]
Continual Learning aims to overcome the catastrophic forgetting of previously acquired knowledge when learning new tasks.
This paper presents a comprehensive survey of the latest advancements in PTM-based CL.
arXiv Detail & Related papers (2024-01-29T18:27:52Z) - A Communication Theory Perspective on Prompting Engineering Methods for
Large Language Models [30.57652062704016]
This article aims to illustrate a novel perspective to review existing prompt engineering (PE) methods, within the well-established communication theory framework.
It aims to facilitate a deeper understanding of the developing trends of existing PE methods used in four typical tasks.
arXiv Detail & Related papers (2023-10-24T03:05:21Z) - Naming Practices of Pre-Trained Models in Hugging Face [4.956536094440504]
Pre-Trained Models (PTMs) are increasingly adopted as components in computer systems.
Researchers publish PTMs, which engineers adapt for quality or performance prior to deployment.
Prior research has reported that model names are not always well chosen - and are sometimes erroneous.
In this paper, we frame and conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry.
arXiv Detail & Related papers (2023-10-02T21:13:32Z) - A Survey on Time-Series Pre-Trained Models [34.98332094625603]
Time-Series Mining (TSM) shows great potential in practical applications.
Deep learning models that rely on massive labeled data have been successfully applied to TSM.
Recently, Pre-Trained Models have gradually attracted attention in the time series domain.
arXiv Detail & Related papers (2023-05-18T05:27:46Z) - Revisiting Class-Incremental Learning with Pre-Trained Models:
Generalizability and Adaptivity are All You Need [76.10635571879762]
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones.
Recent pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL.
We argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring.
arXiv Detail & Related papers (2023-03-13T17:59:02Z) - A Survey on Knowledge-Enhanced Pre-trained Language Models [8.54551743144995]
Natural Language Processing (NLP) has been revolutionized by the use of Pre-trained Language Models (PLMs).
Despite setting new records in nearly every NLP task, PLMs still face a number of challenges, including poor interpretability, weak reasoning capability, and the need for large amounts of expensive annotated data when applied to downstream tasks.
By integrating external knowledge into PLMs, Knowledge-Enhanced Pre-trained Language Models (KEPLMs) aim to overcome these limitations.
arXiv Detail & Related papers (2022-12-27T09:54:14Z) - Prompt Tuning for Discriminative Pre-trained Language Models [96.04765512463415]
Recent works have shown promising results of prompt tuning in stimulating pre-trained language models (PLMs) for natural language processing (NLP) tasks.
It is still unknown whether and how discriminative PLMs, e.g., ELECTRA, can be effectively prompt-tuned.
We present DPT, the first prompt tuning framework for discriminative PLMs, which reformulates NLP tasks into a discriminative language modeling problem.
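To make the contrast concrete, a generic cloze-style prompt-tuning setup (the kind commonly used with generative PLMs) can be sketched as follows. The template, verbalizers, and toy scoring function are hypothetical stand-ins so the example is self-contained; DPT itself instead reformulates tasks as discriminative language modeling, which this sketch does not reproduce:

```python
# Cloze-style prompting: wrap the input in a template and score label
# words ("verbalizers") at the blank. Template and verbalizers are
# illustrative assumptions, not DPT's actual design.
TEMPLATE = "{text} It was {blank}."
VERBALIZERS = {"positive": "great", "negative": "terrible"}

def score(prompt, candidate):
    """Stand-in for a PLM's score of a candidate token at the blank;
    here a toy lexical-overlap heuristic, not a real model."""
    cues = {"great": {"loved", "wonderful"}, "terrible": {"boring", "hated"}}
    return sum(w in cues[candidate] for w in prompt.lower().split())

def classify(text):
    """Pick the label whose verbalizer scores highest at the blank."""
    prompt = TEMPLATE.format(text=text, blank="")
    return max(VERBALIZERS, key=lambda lbl: score(prompt, VERBALIZERS[lbl]))

print(classify("I loved this wonderful film"))  # -> positive
print(classify("I hated it, so boring"))        # -> negative
```

A discriminative PLM such as ELECTRA has no generative head to fill the blank, which is why prompt-tuning it requires the reformulation DPT proposes.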
arXiv Detail & Related papers (2022-05-23T10:11:50Z) - A Survey on Programmatic Weak Supervision [74.13976343129966]
We give a brief introduction to the PWS learning paradigm and review representative approaches for each stage of the PWS learning workflow.
We identify several critical challenges that remain underexplored in the area to hopefully inspire future directions in the field.
arXiv Detail & Related papers (2022-02-11T04:05:38Z) - AdaPrompt: Adaptive Model Training for Prompt-based NLP [77.12071707955889]
We propose AdaPrompt, which adaptively retrieves external data for continual pretraining of PLMs.
Experimental results on five NLP benchmarks show that AdaPrompt can improve over standard PLMs in few-shot settings.
In zero-shot settings, our method outperforms standard prompt-based methods by up to 26.35% relative error reduction.
arXiv Detail & Related papers (2022-02-10T04:04:57Z) - AMMUS : A Survey of Transformer-based Pretrained Models in Natural
Language Processing [0.0]
Transformer-based pretrained language models (T-PTLMs) have achieved great success in almost every NLP task.
Transformer-based PTLMs learn universal language representations from large volumes of text data using self-supervised learning.
These models provide good background knowledge for downstream tasks, avoiding training downstream models from scratch.
arXiv Detail & Related papers (2021-08-12T05:32:18Z)
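The self-supervised learning mentioned in the entry above can be illustrated with a minimal masked-language-modeling-style corruption step: hide some input tokens and ask the model to reconstruct them. The 15% masking rate and the `[MASK]` token follow common practice (e.g. BERT-style pretraining) and are assumptions here, not details from any of the listed surveys:

```python
import random

random.seed(7)

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15):
    """Return (corrupted_tokens, targets). targets holds the original
    token where it was masked, and None where the token was left
    intact (no loss is computed at those positions)."""
    corrupted, targets = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            corrupted.append(MASK)
            targets.append(tok)   # the model must predict this token
        else:
            corrupted.append(tok)
            targets.append(None)  # position contributes no loss
    return corrupted, targets

sentence = "pre-trained models learn universal language representations".split()
corrupted, targets = mask_tokens(sentence)
print(corrupted)
print(targets)
```

During pretraining, the model sees only `corrupted` and is trained to recover the tokens recorded in `targets`; no human labels are needed, which is what makes the objective self-supervised.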
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.