A Survey of Knowledge Enhanced Pre-trained Models
- URL: http://arxiv.org/abs/2110.00269v5
- Date: Sun, 29 Oct 2023 09:25:17 GMT
- Title: A Survey of Knowledge Enhanced Pre-trained Models
- Authors: Jian Yang, Xinyu Hu, Gang Xiao, Yulong Shen
- Abstract summary: We refer to pre-trained language models with knowledge injection as knowledge-enhanced pre-trained language models (KEPLMs).
These models demonstrate deep understanding and logical reasoning, and they introduce interpretability.
- Score: 28.160826399552462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained language models learn informative word representations on a
large-scale text corpus through self-supervised learning and achieve promising
performance on natural language processing (NLP) tasks after fine-tuning. These
models, however, suffer from poor robustness and lack of interpretability. We
refer to pre-trained language models with knowledge injection as
knowledge-enhanced pre-trained language models (KEPLMs). These models
demonstrate deep understanding and logical reasoning, and they introduce
interpretability. In this survey, we provide a comprehensive overview of KEPLMs
in NLP. We first discuss the advancements in pre-trained language models and
knowledge representation learning. Then we systematically categorize existing
KEPLMs from three different perspectives. Finally, we outline some potential
directions of KEPLMs for future research.
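To make the notion of "knowledge injection" concrete, the following is a minimal sketch (in PyTorch) of one common KEPLM pattern: fusing knowledge-graph entity embeddings into the token representations produced by a pre-trained encoder. The module name, dimensions, and alignment scheme here are illustrative assumptions, not the formulation of any particular model covered by the survey.

```python
# Minimal sketch of one common KEPLM pattern: fusing knowledge-graph (KG)
# entity embeddings into the token representations of a pre-trained encoder.
# Dimensions, module names, and the alignment scheme are illustrative
# assumptions only.
import torch
import torch.nn as nn

class KnowledgeFusionLayer(nn.Module):
    def __init__(self, hidden_dim: int = 768, entity_dim: int = 100):
        super().__init__()
        self.entity_proj = nn.Linear(entity_dim, hidden_dim)  # map KG space -> LM space
        self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)     # combine token + entity info
        self.act = nn.GELU()

    def forward(self, token_states, entity_embeds, entity_mask):
        """
        token_states:  (batch, seq_len, hidden_dim) from a pre-trained encoder
        entity_embeds: (batch, seq_len, entity_dim) KG embeddings aligned to tokens
                       (zeros where a token links to no entity)
        entity_mask:   (batch, seq_len, 1), 1.0 where an entity is linked
        """
        projected = self.entity_proj(entity_embeds) * entity_mask
        fused = self.fuse(torch.cat([token_states, projected], dim=-1))
        return self.act(fused)

# Usage with random tensors standing in for encoder outputs and linked entities.
layer = KnowledgeFusionLayer()
tokens = torch.randn(2, 16, 768)
entities = torch.randn(2, 16, 100)
mask = (torch.rand(2, 16, 1) > 0.7).float()
out = layer(tokens, entities, mask)
print(out.shape)  # torch.Size([2, 16, 768])
```

In practice, an entity linker typically supplies the token-to-KG alignment, and a fusion layer like this would be trained jointly with the usual pre-training objectives.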
Related papers
- Linguistically Grounded Analysis of Language Models using Shapley Head Values [2.914115079173979]
We investigate the processing of morphosyntactic phenomena by leveraging a recently proposed method for probing language models via Shapley Head Values (SHVs).
Using the English language BLiMP dataset, we test our approach on two widely used models, BERT and RoBERTa, and compare how linguistic constructions are handled.
Our results show that SHV-based attributions reveal distinct patterns across both models, providing insights into how language models organize and process linguistic information.
arXiv Detail & Related papers (2024-10-17T09:48:08Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model learns from self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z) - Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction (a minimal sketch of the mask-infilling objective appears after this list).
arXiv Detail & Related papers (2023-06-04T15:44:51Z) - An Overview on Language Models: Recent Developments and Outlook [32.528770408502396]
Conventional language models (CLMs) aim to predict the probability of linguistic sequences in a causal manner.
Pre-trained language models (PLMs) cover broader concepts and can be used in both causal sequential modeling and fine-tuning for downstream applications.
arXiv Detail & Related papers (2023-03-10T07:55:00Z) - A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs).
For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG) and rule knowledge.
The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z) - Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z) - Leveraging pre-trained language models for conversational information seeking from text [2.8425118603312]
In this paper, we investigate the use of in-context learning and pre-trained language representation models to address the problem of information extraction from process description documents.
The results highlight the potential of the approach and the usefulness of the in-context learning customizations.
arXiv Detail & Related papers (2022-03-31T09:00:46Z) - A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models [185.08295787309544]
We aim to summarize the current progress of pre-trained language model-based knowledge-enhanced models (PLMKEs).
We present the challenges of PLMKEs based on the discussion regarding the three elements and attempt to provide NLP practitioners with potential directions for further research.
arXiv Detail & Related papers (2022-02-17T17:17:43Z)
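To illustrate the "commonsense mask infilling" objective mentioned in the Commonsense Knowledge Transfer entry above, here is a minimal, hedged sketch: part of a commonsense statement is hidden and a masked language model is trained to recover it. The model choice (bert-base-uncased), the example sentence, and the single-token masking are illustrative assumptions; the paper's actual masking scheme and training data may differ.

```python
# Hedged sketch of a "commonsense mask infilling"-style objective: hide part of
# a commonsense statement and train a masked language model to recover it.
# Model, sentence, and single-token masking are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

statement = "You would read a book because you want to gain knowledge."
enc = tokenizer(statement, return_tensors="pt")
labels = enc["input_ids"].clone()

# Replace the token "knowledge" with [MASK]; every other position gets label
# -100 so the cross-entropy loss is computed on the masked position only.
mask_pos = enc["input_ids"][0].tolist().index(tokenizer.convert_tokens_to_ids("knowledge"))
enc["input_ids"][0, mask_pos] = tokenizer.mask_token_id
masked_labels = torch.full_like(labels, -100)
masked_labels[0, mask_pos] = labels[0, mask_pos]

outputs = model(**enc, labels=masked_labels)
outputs.loss.backward()  # an optimizer step would follow in actual refinement
print(float(outputs.loss))
```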