LP-BERT: Multi-task Pre-training Knowledge Graph BERT for Link
Prediction
- URL: http://arxiv.org/abs/2201.04843v1
- Date: Thu, 13 Jan 2022 09:18:30 GMT
- Title: LP-BERT: Multi-task Pre-training Knowledge Graph BERT for Link
Prediction
- Authors: Da Li, Ming Yi, Yukai He
- Abstract summary: LP-BERT contains two training stages: multi-task pre-training and knowledge graph fine-tuning.
We achieve state-of-the-art results on the WN18RR and UMLS datasets, with the Hits@10 metric improving by 5%.
- Score: 3.5382535469099436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Link prediction plays a significant role in knowledge graphs, which are an
important resource for many artificial intelligence tasks but are often
limited by incompleteness. In this paper, we propose a knowledge graph BERT for
link prediction, named LP-BERT, which contains two training stages: multi-task
pre-training and knowledge graph fine-tuning. The pre-training strategy not
only uses the Masked Language Model (MLM) to learn the knowledge of the context
corpus, but also introduces the Mask Entity Model (MEM) and Mask Relation Model
(MRM), which learn relational information from triples by predicting
semantics-based entity and relation elements. Structured triple relation
information can thus be transformed into unstructured semantic information and
integrated into the pre-training model together with the context corpus
information. In the fine-tuning phase, inspired by contrastive learning, we
carry out triple-style negative sampling within each sample batch, which greatly
increases the proportion of negative samples while keeping the training time
almost unchanged. Furthermore, we propose a data augmentation method based on
the inverse relationship of triples to further increase sample diversity. We
achieve state-of-the-art results on the WN18RR and UMLS datasets; in particular,
the Hits@10 metric improved by 5% over the previous state-of-the-art result on
the WN18RR dataset.
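To make the three pre-training objectives concrete, here is a minimal sketch, assuming a whitespace tokenizer and a simple head/relation/tail serialization; the serialization format and helper names are illustrative assumptions, not LP-BERT's actual implementation.

```python
# Hypothetical sketch of the MLM / MEM / MRM masking objectives described
# in the abstract. The tokenizer and serialization are assumptions.
import random

MASK, SEP = "[MASK]", "[SEP]"

def serialize(head, relation, tail):
    """Flatten a structured triple into one token sequence with separators."""
    h, r, t = head.split(), relation.split(), tail.split()
    tokens = h + [SEP] + r + [SEP] + t
    spans = {  # remember which positions belong to each triple element
        "head": (0, len(h)),
        "relation": (len(h) + 1, len(h) + 1 + len(r)),
        "tail": (len(h) + len(r) + 2, len(h) + len(r) + 2 + len(t)),
    }
    return tokens, spans

def mask_span(tokens, span):
    """Mask a whole element; the original tokens become prediction labels."""
    lo, hi = span
    labels = {i: tokens[i] for i in range(lo, hi)}
    return tokens[:lo] + [MASK] * (hi - lo) + tokens[hi:], labels

def mask_random(tokens, prob=0.15):
    """Standard MLM-style masking over arbitrary non-separator positions."""
    masked, labels = list(tokens), {}
    for i, tok in enumerate(tokens):
        if tok != SEP and random.random() < prob:
            masked[i], labels[i] = MASK, tok
    return masked, labels

tokens, spans = serialize("hypertension", "is treated by", "diuretics")
mlm_in, mlm_lbl = mask_random(tokens)                   # MLM: context corpus
mem_in, mem_lbl = mask_span(tokens, spans["tail"])      # MEM: predict an entity
mrm_in, mrm_lbl = mask_span(tokens, spans["relation"])  # MRM: predict the relation
print(mem_in)  # ['hypertension', '[SEP]', 'is', 'treated', 'by', '[SEP]', '[MASK]']
```

Masking whole entity or relation spans, rather than scattered subword positions, is what lets the model recover triple structure from an otherwise unstructured token sequence.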
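The fine-tuning ideas can be sketched similarly. In the toy loss below, every (head, relation) pair in a batch is scored against every tail in the same batch, so the off-diagonal entries act as free negatives; the additive scorer and random vectors are stand-ins, since LP-BERT derives its embeddings from a BERT encoder.

```python
# Toy sketch of triple-style in-batch negative sampling and inverse-relation
# data augmentation. The additive scorer is an assumption for illustration.
import torch
import torch.nn.functional as F

def in_batch_negative_loss(head_emb, rel_emb, tail_emb):
    """Score every (head_i, rel_i) query against every tail_j in the batch.

    Diagonal entries are the true tails (positives); the B*(B-1) off-diagonal
    entries serve as negatives, so the negative count grows with batch size
    at almost no extra encoding cost.
    """
    query = head_emb + rel_emb              # simple translational scorer (assumption)
    logits = query @ tail_emb.t()           # [B, B] similarity matrix
    targets = torch.arange(logits.size(0))  # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

def add_inverse_triples(triples):
    """Augment each (h, r, t) with (t, r^-1, h) to diversify the samples."""
    return triples + [(t, r + "^-1", h) for (h, r, t) in triples]

B, D = 8, 32  # batch size and embedding dimension for the toy example
loss = in_batch_negative_loss(torch.randn(B, D), torch.randn(B, D), torch.randn(B, D))
print(float(loss))
print(add_inverse_triples([("dog", "hypernym", "animal")]))
```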
Related papers
- MUSE: Integrating Multi-Knowledge for Knowledge Graph Completion [0.0]
Knowledge Graph Completion (KGC) aims to predict the missing part of a (head entity)-[relation]->(tail entity) triplet.
Most existing KGC methods focus on single features (e.g., relation types) or sub-graph aggregation.
We propose a knowledge-aware reasoning model (MUSE) which designs a novel multi-knowledge representation learning mechanism for missing relation prediction.
arXiv Detail & Related papers (2024-09-26T04:48:20Z) - Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective [60.64922606733441]
We introduce a mathematical model that formalizes relational learning as hypergraph recovery to study the pre-training of Foundation Models (FMs).
In our framework, the world is represented as a hypergraph, with data abstracted as random samples from hyperedges. We theoretically examine the feasibility of a Pre-Trained Model (PTM) to recover this hypergraph and analyze the data efficiency in a minimax near-optimal style.
arXiv Detail & Related papers (2024-06-17T06:20:39Z) - G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning [8.02547453169677]
We propose a novel Graph-based Structure-Aware Prompt Learning Model for commonsense reasoning, named G-SAP.
In particular, an evidence graph is constructed by integrating multiple knowledge sources, i.e., ConceptNet, Wikipedia, and the Cambridge Dictionary.
The results reveal a significant advancement over existing models, notably a 6.12% improvement over the SoTA LM+GNNs model on the OpenbookQA dataset.
arXiv Detail & Related papers (2024-05-09T08:28:12Z) - A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models [20.220781775335645]
We introduce a Condensed Transition Graph Framework for Zero-Shot Link Prediction (CTLP).
CTLP encodes the information of all paths in linear time complexity to predict unseen relations between entities.
Our proposed CTLP method achieves state-of-the-art performance on three standard ZSLP datasets.
arXiv Detail & Related papers (2024-02-16T16:02:33Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amount of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - SMiLE: Schema-augmented Multi-level Contrastive Learning for Knowledge
Graph Link Prediction [28.87290783250351]
Link prediction is the task of inferring missing links between entities in knowledge graphs.
We propose a novel Multi-level contrastive LEarning framework (SMiLE) to conduct knowledge graph link prediction.
arXiv Detail & Related papers (2022-10-10T17:40:19Z) - Multimodal Masked Autoencoders Learn Transferable Representations [127.35955819874063]
We propose a simple and scalable network architecture, the Multimodal Masked Autoencoder (M3AE).
M3AE learns a unified encoder for both vision and language data via masked token prediction.
We provide an empirical study of M3AE trained on a large-scale image-text dataset, and find that M3AE is able to learn generalizable representations that transfer well to downstream tasks.
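The blurb above is compact, so here is a toy sketch of the mechanism it describes: image patches and text tokens are embedded into one sequence, a random subset is replaced by a mask token, and a single shared encoder reconstructs both modalities. Every dimension and the tiny transformer below are illustrative assumptions, not M3AE's actual architecture.

```python
# Toy multimodal masked autoencoder: one encoder over patches + tokens.
import torch
import torch.nn as nn

class TinyM3AE(nn.Module):
    def __init__(self, dim=64, n_text_vocab=1000, patch_dim=48):
        super().__init__()
        self.text_embed = nn.Embedding(n_text_vocab, dim)
        self.patch_embed = nn.Linear(patch_dim, dim)   # flattened image patches
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.patch_head = nn.Linear(dim, patch_dim)    # reconstruct patch pixels
        self.text_head = nn.Linear(dim, n_text_vocab)  # predict masked token ids

    def forward(self, patches, token_ids, mask_ratio=0.5):
        # Embed both modalities into one shared sequence.
        x = torch.cat([self.patch_embed(patches), self.text_embed(token_ids)], dim=1)
        B, L, D = x.shape
        masked = torch.rand(B, L) < mask_ratio  # which positions to hide
        x = torch.where(masked.unsqueeze(-1), self.mask_token.expand(B, L, D), x)
        h = self.encoder(x)                     # one unified encoder for both
        n_patch = patches.size(1)
        return self.patch_head(h[:, :n_patch]), self.text_head(h[:, n_patch:]), masked

model = TinyM3AE()
patch_recon, text_logits, mask = model(torch.randn(2, 16, 48),
                                       torch.randint(0, 1000, (2, 10)))
```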
arXiv Detail & Related papers (2022-05-27T19:09:42Z) - Pre-training Co-evolutionary Protein Representation via A Pairwise
Masked Language Model [93.9943278892735]
A key problem in protein sequence representation learning is to capture the co-evolutionary information reflected by the inter-residue co-variation in the sequences.
We propose a novel method to capture this information directly by pre-training via a dedicated language model, i.e., the Pairwise Masked Language Model (PMLM).
Our results show that the proposed method can effectively capture the inter-residue correlations and improve the performance of contact prediction by up to 9% compared to the baseline.
arXiv Detail & Related papers (2021-10-29T04:01:32Z) - On the Transferability of Pre-trained Language Models: A Study from
Artificial Datasets [74.11825654535895]
Pre-training language models (LMs) on large-scale unlabeled text data makes it much easier for the model to achieve exceptional downstream performance.
We study what specific traits in the pre-training data, other than the semantics, make a pre-trained LM superior to its counterparts trained from scratch on downstream tasks.
arXiv Detail & Related papers (2021-09-08T10:39:57Z) - Improving BERT Model Using Contrastive Learning for Biomedical Relation
Extraction [13.354066085659198]
Contrastive learning is not widely utilized in natural language processing due to the lack of a general method of data augmentation for text data.
In this work, we explore the method of employing contrastive learning to improve the text representation from the BERT model for relation extraction.
The experimental results on three relation extraction benchmark datasets demonstrate that our method can improve the BERT model representation and achieve state-of-the-art performance.
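For readers unfamiliar with the objective, a minimal InfoNCE-style contrastive loss over two views of the same sentence embedding looks like the sketch below; the augmentation strategy and the encoder producing the embeddings are stand-ins, not the paper's exact setup.

```python
# Illustrative InfoNCE-style contrastive loss over sentence embeddings.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """Pull two views of the same sentence together; push the rest apart.

    z1, z2: [B, D] embeddings of two augmented views of the same batch.
    """
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature  # cosine similarity matrix [B, B]
    targets = torch.arange(z1.size(0))  # the matching view is on the diagonal
    return F.cross_entropy(logits, targets)

# In practice z1 and z2 would be BERT sentence embeddings of two augmented
# copies of the same relation-bearing sentence.
loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```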
arXiv Detail & Related papers (2021-04-28T17:50:24Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
Because the resulting dataset is large, we propose to apply a dataset distillation strategy to compress it into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.