Structure-Augmented Text Representation Learning for Efficient Knowledge
Graph Completion
- URL: http://arxiv.org/abs/2004.14781v2
- Date: Wed, 24 Feb 2021 03:42:08 GMT
- Title: Structure-Augmented Text Representation Learning for Efficient Knowledge
Graph Completion
- Authors: Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
- Abstract summary: Human-curated knowledge graphs provide critical supporting information to various natural language processing tasks.
These graphs are usually incomplete, which calls for automatic completion.
Graph embedding approaches, e.g., TransE, learn structured knowledge by representing graph elements as dense embeddings.
Textual encoding approaches, e.g., KG-BERT, instead rely on the text of graph triples and triple-level contextualized representations.
- Score: 53.31911669146451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-curated knowledge graphs provide critical supporting
information to various natural language processing tasks, but these graphs are
usually incomplete, which calls for automatic completion. Prevalent graph
embedding approaches, e.g., TransE, learn structured knowledge by representing
graph elements as dense embeddings and capturing their triple-level
relationships with spatial distance. However, they hardly generalize to
elements unseen during training and are intrinsically vulnerable to graph
incompleteness. In contrast, textual encoding approaches, e.g., KG-BERT, resort
to the text of graph triples and triple-level contextualized representations.
They generalize well and are robust to incompleteness, especially when coupled
with pre-trained encoders. However, two major drawbacks limit their
performance: (1) high overhead due to the costly scoring of all possible
triples during inference, and (2) a lack of structured knowledge in the
textual encoder. In this paper, we follow the textual encoding paradigm and
aim to alleviate its drawbacks by augmenting it with graph embedding
techniques, yielding a complementary hybrid of both paradigms. Specifically,
we partition each triple into two asymmetric parts, as in translation-based
graph embedding approaches, and encode both parts into contextualized
representations with a Siamese-style textual encoder. Built upon these
representations, our model employs both a deterministic classifier and a
spatial measurement for representation learning and structure learning,
respectively. Moreover, we develop a self-adaptive ensemble scheme that
further improves performance by incorporating triple scores from an existing
graph embedding model. In experiments, we achieve state-of-the-art performance
on three benchmarks and a zero-shot dataset for link prediction, with
inference costs reduced by 1-2 orders of magnitude compared to a purely
textual encoding method.
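The method described in the abstract is concrete enough to sketch in code. The following is a minimal, illustrative PyTorch sketch of the partitioned-triple, Siamese-style encoding with the two scoring signals (a deterministic classifier and a translation-style spatial distance); it is not the authors' released implementation, and the class name SiameseTripleEncoder, the bert-base-uncased backbone, the [CLS] pooling, and the plain L2 distance are assumptions made for illustration only.

```python
# Minimal sketch, assuming a Hugging Face Transformers backbone; names and
# hyper-parameters are illustrative, not taken from the paper's code.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class SiameseTripleEncoder(nn.Module):
    """Encode the two asymmetric parts of a triple, (head, relation) and (tail),
    with one shared textual encoder, then score the pair in two ways."""

    def __init__(self, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)  # shared (Siamese) weights
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Linear(2 * hidden, 2)  # deterministic triple classifier

    def encode(self, inputs):
        out = self.encoder(input_ids=inputs["input_ids"],
                           attention_mask=inputs["attention_mask"])
        return out.last_hidden_state[:, 0]  # [CLS] vector as the part representation

    def forward(self, hr_inputs, t_inputs):
        u = self.encode(hr_inputs)  # contextualized (head, relation) part
        v = self.encode(t_inputs)   # contextualized (tail) part
        logits = self.classifier(torch.cat([u, v], dim=-1))   # representation learning signal
        distance = torch.norm(u - v, p=2, dim=-1)             # structure learning signal
        return logits, distance


# Toy usage with a hypothetical triple (the text choices are illustrative).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = SiameseTripleEncoder()
hr = tokenizer("Barack Obama [SEP] place of birth", return_tensors="pt")
t = tokenizer("Honolulu", return_tensors="pt")
logits, distance = model(hr, t)  # class logits and translation-style distance score
```

Because the two parts are encoded independently, candidate tail encodings can in principle be pre-computed once and ranked by distance against each (head, relation) encoding instead of re-running the encoder on every candidate triple; this is presumably the source of the reported 1-2 orders-of-magnitude reduction in inference cost. The self-adaptive ensemble then combines these scores with those of a conventional graph embedding model.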
Related papers
- Node Level Graph Autoencoder: Unified Pretraining for Textual Graph Learning [45.70767623846523]
We propose a novel unified unsupervised learning autoencoder framework, named Node Level Graph AutoEncoder (NodeGAE).
We employ language models as the backbone of the autoencoder, with pretraining on text reconstruction.
Our method maintains simplicity in the training process and demonstrates generalizability across diverse textual graphs and downstream tasks.
arXiv Detail & Related papers (2024-08-09T14:57:53Z)
- ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings [20.25180279903009]
We propose Contrastive Graph-Text pretraining (ConGraT) for jointly learning separate representations of texts and nodes in a text-attributed graph (TAG).
Our method trains a language model (LM) and a graph neural network (GNN) to align their representations in a common latent space using a batch-wise contrastive learning objective inspired by CLIP (a minimal sketch of such an objective appears after this list).
Experiments demonstrate that ConGraT outperforms baselines on various downstream tasks, including node and text category classification, link prediction, and language modeling.
arXiv Detail & Related papers (2023-05-23T17:53:30Z)
- Joint Language Semantic and Structure Embedding for Knowledge Graph Completion [66.15933600765835]
We propose to jointly embed the semantics in the natural language description of the knowledge triplets with their structure information.
Our method embeds knowledge graphs for the completion task via fine-tuning pre-trained language models.
Our experiments on a variety of knowledge graph benchmarks have demonstrated the state-of-the-art performance of our method.
arXiv Detail & Related papers (2022-09-19T02:41:02Z)
- Repurposing Knowledge Graph Embeddings for Triple Representation via Weak Supervision [77.34726150561087]
Current methods learn triple embeddings from scratch without utilizing entity and predicate embeddings from pre-trained models.
We develop a method for automatically sampling triples from a knowledge graph and estimating their pairwise similarities from pre-trained embedding models.
These pairwise similarity scores are then fed to a Siamese-like neural architecture to fine-tune triple representations.
arXiv Detail & Related papers (2022-08-22T14:07:08Z)
- VEM$^2$L: A Plug-and-play Framework for Fusing Text and Structure Knowledge on Sparse Knowledge Graph Completion [14.537509860565706]
We propose VEM2L, a plug-and-play framework over sparse Knowledge Graphs that fuses knowledge extracted from text and from graph structure into a unified model.
Specifically, we partition the knowledge acquired by the models into two non-overlapping parts.
We also propose a new fusion strategy, justified by the Variational EM algorithm, to fuse the generalization abilities of the models.
arXiv Detail & Related papers (2022-07-04T15:50:21Z)
- FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations [114.94628499698096]
We propose FactGraph, a method that decomposes the document and the summary into structured meaning representations (MRs).
MRs describe core semantic concepts and their relations, aggregating the main content in both document and summary in a canonical form, and reducing data sparsity.
Experiments on different benchmarks for evaluating factuality show that FactGraph outperforms previous approaches by up to 15%.
arXiv Detail & Related papers (2022-04-13T16:45:33Z)
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph [53.70520466556453]
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated data and original data to reinforce its capability of integrating information from the graph.
arXiv Detail & Related papers (2021-05-06T12:20:41Z)
- Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
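Several of the related papers above name mechanisms that are simple to picture in code. For instance, the ConGraT entry describes a CLIP-inspired, batch-wise contrastive objective that aligns a language model's and a GNN's representations of the same node. Below is a minimal, hypothetical sketch of such a symmetric contrastive loss; the temperature value and the function name are illustrative assumptions, not ConGraT's actual implementation.

```python
# Hedged sketch of a CLIP-style symmetric contrastive objective between text
# and node embeddings; not taken from the ConGraT codebase.
import torch
import torch.nn.functional as F


def clip_style_contrastive_loss(text_emb, node_emb, temperature=0.07):
    """The i-th text embedding should match the i-th node embedding in the
    batch and no other; the loss is averaged over both matching directions."""
    text_emb = F.normalize(text_emb, dim=-1)  # unit-normalize both views
    node_emb = F.normalize(node_emb, dim=-1)
    logits = text_emb @ node_emb.t() / temperature  # pairwise cosine similarities
    targets = torch.arange(text_emb.size(0), device=text_emb.device)
    loss_text_to_node = F.cross_entropy(logits, targets)
    loss_node_to_text = F.cross_entropy(logits.t(), targets)
    return (loss_text_to_node + loss_node_to_text) / 2
```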