LP-BERT: Multi-task Pre-training Knowledge Graph BERT for Link
Prediction
- URL: http://arxiv.org/abs/2201.04843v1
- Date: Thu, 13 Jan 2022 09:18:30 GMT
- Title: LP-BERT: Multi-task Pre-training Knowledge Graph BERT for Link
Prediction
- Authors: Da Li, Ming Yi, Yukai He
- Abstract summary: LP-BERT contains two training stages: multi-task pre-training and knowledge graph fine-tuning.
We achieve state-of-the-art results on the WN18RR and UMLS datasets, with the Hits@10 metric improving by 5%.
- Score: 3.5382535469099436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Link prediction plays a significant role in knowledge graphs, which are an
important resource for many artificial intelligence tasks but are often
limited by incompleteness. In this paper, we propose a knowledge graph BERT for
link prediction, named LP-BERT, which contains two training stages: multi-task
pre-training and knowledge graph fine-tuning. The pre-training strategy not
only uses the Masked Language Model (MLM) to learn the knowledge of the context
corpus, but also introduces the Mask Entity Model (MEM) and Mask Relation Model
(MRM), which learn relational information from triples by predicting
semantics-based entity and relation elements. Structured triple relation
information can thus be transformed into unstructured semantic information and
integrated into the pre-training model together with the context corpus
information. In the fine-tuning phase, inspired by contrastive learning, we
carry out triple-style negative sampling within each sample batch, which greatly
increases the proportion of negative samples while keeping the training time
almost unchanged. Furthermore, we propose a data augmentation method based on
the inverse relationship of triples to further increase sample diversity. We
achieve state-of-the-art results on the WN18RR and UMLS datasets; in particular,
the Hits@10 metric improved by 5% over the previous state-of-the-art result on
the WN18RR dataset.
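To make the three pre-training objectives concrete, here is a minimal sketch, assuming a whitespace tokenizer and a simple head/relation/tail serialization; the serialization format and helper names are illustrative assumptions, not LP-BERT's actual implementation.

```python
# Hypothetical sketch of the MLM / MEM / MRM masking objectives described
# in the abstract. The tokenizer and serialization are assumptions.
import random

MASK, SEP = "[MASK]", "[SEP]"

def serialize(head, relation, tail):
    """Flatten a structured triple into one token sequence with separators."""
    h, r, t = head.split(), relation.split(), tail.split()
    tokens = h + [SEP] + r + [SEP] + t
    spans = {  # remember which positions belong to each triple element
        "head": (0, len(h)),
        "relation": (len(h) + 1, len(h) + 1 + len(r)),
        "tail": (len(h) + len(r) + 2, len(h) + len(r) + 2 + len(t)),
    }
    return tokens, spans

def mask_span(tokens, span):
    """Mask a whole element; the original tokens become prediction labels."""
    lo, hi = span
    labels = {i: tokens[i] for i in range(lo, hi)}
    return tokens[:lo] + [MASK] * (hi - lo) + tokens[hi:], labels

def mask_random(tokens, prob=0.15):
    """Standard MLM-style masking over arbitrary non-separator positions."""
    masked, labels = list(tokens), {}
    for i, tok in enumerate(tokens):
        if tok != SEP and random.random() < prob:
            masked[i], labels[i] = MASK, tok
    return masked, labels

tokens, spans = serialize("hypertension", "is treated by", "diuretics")
mlm_in, mlm_lbl = mask_random(tokens)                   # MLM: context corpus
mem_in, mem_lbl = mask_span(tokens, spans["tail"])      # MEM: predict an entity
mrm_in, mrm_lbl = mask_span(tokens, spans["relation"])  # MRM: predict the relation
print(mem_in)  # ['hypertension', '[SEP]', 'is', 'treated', 'by', '[SEP]', '[MASK]']
```

Masking whole entity or relation spans, rather than scattered subword positions, is what lets the model recover triple structure from an otherwise unstructured token sequence.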
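The fine-tuning ideas can be sketched similarly. In the toy loss below, every (head, relation) pair in a batch is scored against every tail in the same batch, so the off-diagonal entries act as free negatives; the additive scorer and random vectors are stand-ins, since LP-BERT derives its embeddings from a BERT encoder.

```python
# Toy sketch of triple-style in-batch negative sampling and inverse-relation
# data augmentation. The additive scorer is an assumption for illustration.
import torch
import torch.nn.functional as F

def in_batch_negative_loss(head_emb, rel_emb, tail_emb):
    """Score every (head_i, rel_i) query against every tail_j in the batch.

    Diagonal entries are the true tails (positives); the B*(B-1) off-diagonal
    entries serve as negatives, so the negative count grows with batch size
    at almost no extra encoding cost.
    """
    query = head_emb + rel_emb              # simple translational scorer (assumption)
    logits = query @ tail_emb.t()           # [B, B] similarity matrix
    targets = torch.arange(logits.size(0))  # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

def add_inverse_triples(triples):
    """Augment each (h, r, t) with (t, r^-1, h) to diversify the samples."""
    return triples + [(t, r + "^-1", h) for (h, r, t) in triples]

B, D = 8, 32  # batch size and embedding dimension for the toy example
loss = in_batch_negative_loss(torch.randn(B, D), torch.randn(B, D), torch.randn(B, D))
print(float(loss))
print(add_inverse_triples([("dog", "hypernym", "animal")]))
```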
Related papers
- MUSE: Integrating Multi-Knowledge for Knowledge Graph Completion [0.0]
Knowledge Graph Completion (KGC) aims to predict the missing part of a (head entity)-[relation]->(tail entity) triplet.
Most existing KGC methods focus on single features (e.g., relation types) or sub-graph aggregation.
We propose a knowledge-aware reasoning model (MUSE) which designs a novel multi-knowledge representation learning mechanism for missing relation prediction.
arXiv Detail & Related papers (2024-09-26T04:48:20Z) - Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective [60.64922606733441]
We introduce a mathematical model that formalizes relational learning as hypergraph recovery to study the pre-training of Foundation Models (FMs).
In our framework, the world is represented as a hypergraph, with data abstracted as random samples from hyperedges. We theoretically examine the feasibility of a Pre-Trained Model (PTM) to recover this hypergraph and analyze the data efficiency in a minimax near-optimal style.
arXiv Detail & Related papers (2024-06-17T06:20:39Z) - G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning [8.02547453169677]
We propose a novel Graph-based Structure-Aware Prompt Learning Model for commonsense reasoning, named G-SAP.
In particular, an evidence graph is constructed by integrating multiple knowledge sources, i.e., ConceptNet, Wikipedia, and the Cambridge Dictionary.
The results reveal a significant advancement over existing models, notably a 6.12% improvement over the SoTA LM+GNNs model on the OpenbookQA dataset.
arXiv Detail & Related papers (2024-05-09T08:28:12Z) - A Condensed Transition Graph Framework for Zero-shot Link Prediction with Large Language Models [20.220781775335645]
We introduce a Condensed Transition Graph Framework for Zero-Shot Link Prediction (CTLP).
CTLP encodes the information of all paths in linear time complexity to predict unseen relations between entities.
Our proposed CTLP method achieves state-of-the-art performance on three standard ZSLP datasets.
arXiv Detail & Related papers (2024-02-16T16:02:33Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amount of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - SMiLE: Schema-augmented Multi-level Contrastive Learning for Knowledge
Graph Link Prediction [28.87290783250351]
Link prediction is the task of inferring missing links between entities in knowledge graphs.
We propose a novel Multi-level contrastive LEarning framework (SMiLE) to conduct knowledge graph link prediction.
arXiv Detail & Related papers (2022-10-10T17:40:19Z) - Multimodal Masked Autoencoders Learn Transferable Representations [127.35955819874063]
We propose a simple and scalable network architecture, the Multimodal Masked Autoencoder (M3AE).
M3AE learns a unified encoder for both vision and language data via masked token prediction.
We provide an empirical study of M3AE trained on a large-scale image-text dataset, and find that M3AE is able to learn generalizable representations that transfer well to downstream tasks.
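The blurb above is compact, so here is a toy sketch of the mechanism it describes: image patches and text tokens are embedded into one sequence, a random subset is replaced by a mask token, and a single shared encoder reconstructs both modalities. Every dimension and the tiny transformer below are illustrative assumptions, not M3AE's actual architecture.

```python
# Toy multimodal masked autoencoder: one encoder over patches + tokens.
import torch
import torch.nn as nn

class TinyM3AE(nn.Module):
    def __init__(self, dim=64, n_text_vocab=1000, patch_dim=48):
        super().__init__()
        self.text_embed = nn.Embedding(n_text_vocab, dim)
        self.patch_embed = nn.Linear(patch_dim, dim)   # flattened image patches
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.patch_head = nn.Linear(dim, patch_dim)    # reconstruct patch pixels
        self.text_head = nn.Linear(dim, n_text_vocab)  # predict masked token ids

    def forward(self, patches, token_ids, mask_ratio=0.5):
        # Embed both modalities into one shared sequence.
        x = torch.cat([self.patch_embed(patches), self.text_embed(token_ids)], dim=1)
        B, L, D = x.shape
        masked = torch.rand(B, L) < mask_ratio  # which positions to hide
        x = torch.where(masked.unsqueeze(-1), self.mask_token.expand(B, L, D), x)
        h = self.encoder(x)                     # one unified encoder for both
        n_patch = patches.size(1)
        return self.patch_head(h[:, :n_patch]), self.text_head(h[:, n_patch:]), masked

model = TinyM3AE()
patch_recon, text_logits, mask = model(torch.randn(2, 16, 48),
                                       torch.randint(0, 1000, (2, 10)))
```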
arXiv Detail & Related papers (2022-05-27T19:09:42Z) - Pre-training Co-evolutionary Protein Representation via A Pairwise
Masked Language Model [93.9943278892735]
A key problem in protein sequence representation learning is to capture the co-evolutionary information reflected by the inter-residue co-variation in the sequences.
We propose a novel method to capture this information directly by pre-training via a dedicated language model, i.e., the Pairwise Masked Language Model (PMLM).
Our results show that the proposed method can effectively capture the inter-residue correlations and improve the performance of contact prediction by up to 9% compared to the baseline.
arXiv Detail & Related papers (2021-10-29T04:01:32Z) - On the Transferability of Pre-trained Language Models: A Study from
Artificial Datasets [74.11825654535895]
Pre-training language models (LMs) on large-scale unlabeled text data makes it much easier for the model to achieve exceptional downstream performance.
We study what specific traits in the pre-training data, other than the semantics, make a pre-trained LM superior to its counterparts trained from scratch on downstream tasks.
arXiv Detail & Related papers (2021-09-08T10:39:57Z) - Improving BERT Model Using Contrastive Learning for Biomedical Relation
Extraction [13.354066085659198]
Contrastive learning is not widely utilized in natural language processing due to the lack of a general method of data augmentation for text data.
In this work, we explore the method of employing contrastive learning to improve the text representation from the BERT model for relation extraction.
The experimental results on three relation extraction benchmark datasets demonstrate that our method can improve the BERT model representation and achieve state-of-the-art performance.
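For readers unfamiliar with the objective, a minimal InfoNCE-style contrastive loss over two views of the same sentence embedding looks like the sketch below; the augmentation strategy and the encoder producing the embeddings are stand-ins, not the paper's exact setup.

```python
# Illustrative InfoNCE-style contrastive loss over sentence embeddings.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """Pull two views of the same sentence together; push the rest apart.

    z1, z2: [B, D] embeddings of two augmented views of the same batch.
    """
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature  # cosine similarity matrix [B, B]
    targets = torch.arange(z1.size(0))  # the matching view is on the diagonal
    return F.cross_entropy(logits, targets)

# In practice z1 and z2 would be BERT sentence embeddings of two augmented
# copies of the same relation-bearing sentence.
loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
```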
arXiv Detail & Related papers (2021-04-28T17:50:24Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
Because the resulting dataset is large, we propose to apply a dataset distillation strategy to compress it into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.