Structure-Augmented Text Representation Learning for Efficient Knowledge
Graph Completion
- URL: http://arxiv.org/abs/2004.14781v2
- Date: Wed, 24 Feb 2021 03:42:08 GMT
- Title: Structure-Augmented Text Representation Learning for Efficient Knowledge
Graph Completion
- Authors: Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Yi Chang
- Abstract summary: Human-curated knowledge graphs provide critical supporting information to various natural language processing tasks.
These graphs are usually incomplete, which calls for automatic completion.
Graph embedding approaches, e.g., TransE, learn structured knowledge by representing graph elements as dense embeddings.
Textual encoding approaches, e.g., KG-BERT, instead rely on the text of graph triples and triple-level contextualized representations.
- Score: 53.31911669146451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-curated knowledge graphs provide critical supporting
information to various natural language processing tasks, but these graphs are
usually incomplete, which calls for automatic completion. Prevalent graph
embedding approaches, e.g., TransE, learn structured knowledge by representing
graph elements as dense embeddings and capturing their triple-level
relationships with spatial distance. However, they hardly generalize to
elements unseen during training and are intrinsically vulnerable to graph
incompleteness. In contrast, textual encoding approaches, e.g., KG-BERT, resort
to the text of graph triples and triple-level contextualized representations.
They generalize well and are robust to incompleteness, especially when coupled
with pre-trained encoders. However, two major drawbacks limit their
performance: (1) high overhead due to the costly scoring of all possible
triples during inference, and (2) a lack of structured knowledge in the
textual encoder. In this paper, we follow the textual encoding paradigm and
aim to alleviate its drawbacks by augmenting it with graph embedding
techniques, yielding a complementary hybrid of both paradigms. Specifically,
we partition each triple into two asymmetric parts, as in translation-based
graph embedding approaches, and encode both parts into contextualized
representations with a Siamese-style textual encoder. Built upon these
representations, our model employs both a deterministic classifier and a
spatial measurement for representation learning and structure learning,
respectively. Moreover, we develop a self-adaptive ensemble scheme that
further improves performance by incorporating triple scores from an existing
graph embedding model. In experiments, we achieve state-of-the-art performance
on three benchmarks and a zero-shot dataset for link prediction, with
inference costs reduced by 1-2 orders of magnitude compared to a purely
textual encoding method.
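The method described in the abstract is concrete enough to sketch in code. The following is a minimal, illustrative PyTorch sketch of the partitioned-triple, Siamese-style encoding with the two scoring signals (a deterministic classifier and a translation-style spatial distance); it is not the authors' released implementation, and the class name SiameseTripleEncoder, the bert-base-uncased backbone, the [CLS] pooling, and the plain L2 distance are assumptions made for illustration only.

```python
# Minimal sketch, assuming a Hugging Face Transformers backbone; names and
# hyper-parameters are illustrative, not taken from the paper's code.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class SiameseTripleEncoder(nn.Module):
    """Encode the two asymmetric parts of a triple, (head, relation) and (tail),
    with one shared textual encoder, then score the pair in two ways."""

    def __init__(self, model_name: str = "bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)  # shared (Siamese) weights
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Linear(2 * hidden, 2)  # deterministic triple classifier

    def encode(self, inputs):
        out = self.encoder(input_ids=inputs["input_ids"],
                           attention_mask=inputs["attention_mask"])
        return out.last_hidden_state[:, 0]  # [CLS] vector as the part representation

    def forward(self, hr_inputs, t_inputs):
        u = self.encode(hr_inputs)  # contextualized (head, relation) part
        v = self.encode(t_inputs)   # contextualized (tail) part
        logits = self.classifier(torch.cat([u, v], dim=-1))   # representation learning signal
        distance = torch.norm(u - v, p=2, dim=-1)             # structure learning signal
        return logits, distance


# Toy usage with a hypothetical triple (the text choices are illustrative).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = SiameseTripleEncoder()
hr = tokenizer("Barack Obama [SEP] place of birth", return_tensors="pt")
t = tokenizer("Honolulu", return_tensors="pt")
logits, distance = model(hr, t)  # class logits and translation-style distance score
```

Because the two parts are encoded independently, candidate tail encodings can in principle be pre-computed once and ranked by distance against each (head, relation) encoding instead of re-running the encoder on every candidate triple; this is presumably the source of the reported 1-2 orders-of-magnitude reduction in inference cost. The self-adaptive ensemble then combines these scores with those of a conventional graph embedding model.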
Related papers
- Node Level Graph Autoencoder: Unified Pretraining for Textual Graph Learning [45.70767623846523]
We propose a novel unified unsupervised learning autoencoder framework, named Node Level Graph AutoEncoder (NodeGAE).
We employ language models as the backbone of the autoencoder, with pretraining on text reconstruction.
Our method maintains simplicity in the training process and demonstrates generalizability across diverse textual graphs and downstream tasks.
arXiv Detail & Related papers (2024-08-09T14:57:53Z)
- ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings [20.25180279903009]
We propose Contrastive Graph-Text pretraining (ConGraT) for jointly learning separate representations of texts and nodes in a text-attributed graph (TAG).
Our method trains a language model (LM) and a graph neural network (GNN) to align their representations in a common latent space using a batch-wise contrastive learning objective inspired by CLIP (a minimal sketch of such an objective appears after this list).
Experiments demonstrate that ConGraT outperforms baselines on various downstream tasks, including node and text category classification, link prediction, and language modeling.
arXiv Detail & Related papers (2023-05-23T17:53:30Z)
- Joint Language Semantic and Structure Embedding for Knowledge Graph Completion [66.15933600765835]
We propose to jointly embed the semantics in the natural language description of the knowledge triplets with their structure information.
Our method embeds knowledge graphs for the completion task via fine-tuning pre-trained language models.
Our experiments on a variety of knowledge graph benchmarks have demonstrated the state-of-the-art performance of our method.
arXiv Detail & Related papers (2022-09-19T02:41:02Z)
- Repurposing Knowledge Graph Embeddings for Triple Representation via Weak Supervision [77.34726150561087]
Current methods learn triple embeddings from scratch without utilizing entity and predicate embeddings from pre-trained models.
We develop a method for automatically sampling triples from a knowledge graph and estimating their pairwise similarities from pre-trained embedding models.
These pairwise similarity scores are then fed to a Siamese-like neural architecture to fine-tune triple representations.
arXiv Detail & Related papers (2022-08-22T14:07:08Z)
- VEM$^2$L: A Plug-and-play Framework for Fusing Text and Structure Knowledge on Sparse Knowledge Graph Completion [14.537509860565706]
We propose VEM2L, a plug-and-play framework over sparse Knowledge Graphs that fuses knowledge extracted from text and from graph structure into a unified model.
Specifically, we partition the knowledge acquired by the models into two non-overlapping parts.
We also propose a new fusion strategy, justified by the Variational EM algorithm, to fuse the generalization abilities of the models.
arXiv Detail & Related papers (2022-07-04T15:50:21Z)
- FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations [114.94628499698096]
We propose FactGraph, a method that decomposes the document and the summary into structured meaning representations (MRs).
MRs describe core semantic concepts and their relations, aggregating the main content in both document and summary in a canonical form, and reducing data sparsity.
Experiments on different benchmarks for evaluating factuality show that FactGraph outperforms previous approaches by up to 15%.
arXiv Detail & Related papers (2022-04-13T16:45:33Z)
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph [53.70520466556453]
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated data and original data to reinforce its capability of integrating information from the graph.
arXiv Detail & Related papers (2021-05-06T12:20:41Z)
- Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
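Several of the related papers above name mechanisms that are simple to picture in code. For instance, the ConGraT entry describes a CLIP-inspired, batch-wise contrastive objective that aligns a language model's and a GNN's representations of the same node. Below is a minimal, hypothetical sketch of such a symmetric contrastive loss; the temperature value and the function name are illustrative assumptions, not ConGraT's actual implementation.

```python
# Hedged sketch of a CLIP-style symmetric contrastive objective between text
# and node embeddings; not taken from the ConGraT codebase.
import torch
import torch.nn.functional as F


def clip_style_contrastive_loss(text_emb, node_emb, temperature=0.07):
    """The i-th text embedding should match the i-th node embedding in the
    batch and no other; the loss is averaged over both matching directions."""
    text_emb = F.normalize(text_emb, dim=-1)  # unit-normalize both views
    node_emb = F.normalize(node_emb, dim=-1)
    logits = text_emb @ node_emb.t() / temperature  # pairwise cosine similarities
    targets = torch.arange(text_emb.size(0), device=text_emb.device)
    loss_text_to_node = F.cross_entropy(logits, targets)
    loss_node_to_text = F.cross_entropy(logits.t(), targets)
    return (loss_text_to_node + loss_node_to_text) / 2
```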