Task-Oriented GNNs Training on Large Knowledge Graphs for Accurate and Efficient Modeling
- URL: http://arxiv.org/abs/2403.05752v2
- Date: Fri, 22 Mar 2024 14:44:17 GMT
- Title: Task-Oriented GNNs Training on Large Knowledge Graphs for Accurate and Efficient Modeling
- Authors: Hussein Abdallah, Waleed Afandi, Panos Kalnis, Essam Mansour
- Abstract summary: This paper proposes KG-TOSA, an approach to automate the TOSG extraction for task-oriented HGNN training on a large Knowledge Graph (KG).
KG-TOSA helps state-of-the-art HGNN methods reduce training time and memory usage by up to 70% while improving the model performance, e.g., accuracy and inference time.
- Score: 5.460112864687281
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A Knowledge Graph (KG) is a heterogeneous graph encompassing a diverse range of node and edge types. Heterogeneous Graph Neural Networks (HGNNs) are popular for training machine learning tasks like node classification and link prediction on KGs. However, HGNN methods exhibit excessive complexity influenced by the KG's size, density, and the number of node and edge types. AI practitioners handcraft a subgraph of a KG G relevant to a specific task. We refer to this subgraph as a task-oriented subgraph (TOSG), which contains a subset of task-related node and edge types in G. Training the task on the TOSG instead of G alleviates the excessive computation required for a large KG. However, crafting the TOSG demands a deep understanding of the KG's structure and the task's objectives, making it challenging and time-consuming. This paper proposes KG-TOSA, an approach to automate TOSG extraction for task-oriented HGNN training on a large KG. In KG-TOSA, we define a generic graph pattern that captures the KG's local and global structure relevant to a specific task. We explore different techniques to extract subgraphs matching our graph pattern: (i) two techniques that sample around targeted nodes using biased random walks or influence scores, and (ii) a SPARQL-based extraction method that leverages RDF engines' built-in indices and hence incurs negligible preprocessing overhead compared to the sampling techniques. We develop a benchmark of large real KGs with various node classification and link prediction tasks. Our experiments show that KG-TOSA helps state-of-the-art HGNN methods reduce training time and memory usage by up to 70% while improving model performance, e.g., accuracy and inference time.
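To make the sampling idea concrete, here is a minimal Python sketch of the biased-random-walk variant: walks launched from the task's target nodes prefer task-relevant edge types, and the visited nodes induce the TOSG. The networkx representation, function name, and bias scheme are illustrative assumptions, not KG-TOSA's actual implementation.

```python
import random
import networkx as nx

def biased_random_walk_tosg(kg, target_nodes, task_edge_types,
                            walk_length=4, walks_per_node=10,
                            bias=4.0, seed=0):
    """Sample a task-oriented subgraph (TOSG): walks launched from the
    task's target nodes prefer edges whose type is task-relevant, and
    every visited node is kept (illustrative scheme, not KG-TOSA's)."""
    rng = random.Random(seed)
    visited = set(target_nodes)
    for start in target_nodes:
        for _ in range(walks_per_node):
            node = start
            for _ in range(walk_length):
                nbrs = [(v, d.get("type")) for _, v, d
                        in kg.out_edges(node, data=True)]
                if not nbrs:
                    break
                # Bias the next step toward task-relevant edge types.
                weights = [bias if t in task_edge_types else 1.0
                           for _, t in nbrs]
                node = rng.choices([v for v, _ in nbrs],
                                   weights=weights, k=1)[0]
                visited.add(node)
    return kg.subgraph(visited).copy()

# Toy heterogeneous KG with typed edges.
kg = nx.MultiDiGraph()
kg.add_edge("paper1", "venueA", type="published_in")
kg.add_edge("paper1", "authorX", type="written_by")
kg.add_edge("authorX", "instY", type="affiliated_with")
tosg = biased_random_walk_tosg(kg, ["paper1"], {"written_by"})
print(sorted(tosg.nodes()))
```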
Related papers
- GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning [21.057810495833063]
We introduce GNN-RAG, a novel method for combining the language-understanding abilities of LLMs with the reasoning abilities of GNNs in a retrieval-augmented generation (RAG) style.
In our GNN-RAG framework, the GNN acts as a dense subgraph reasoner to extract useful graph information.
Experiments show that GNN-RAG achieves state-of-the-art performance on two widely used KGQA benchmarks.
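As a rough, hedged illustration of this retrieve-then-reason flow (not GNN-RAG's actual code), the sketch below treats precomputed node scores as a stand-in for GNN output and verbalizes shortest paths from question entities to the top-scoring candidates as LLM context; the prompt format is an assumption.

```python
import networkx as nx

def gnn_rag_context(subgraph, question_entities, node_scores, top_k=2):
    """Verbalize shortest paths from question entities to the top-scoring
    answer candidates as textual context for an LLM (illustrative only)."""
    candidates = sorted(node_scores, key=node_scores.get, reverse=True)[:top_k]
    facts = []
    for q in question_entities:
        for a in candidates:
            try:
                path = nx.shortest_path(subgraph, q, a)
            except nx.NetworkXNoPath:
                continue
            for u, v in zip(path, path[1:]):
                rel = subgraph.edges[u, v].get("rel", "related_to")
                facts.append(f"{u} --{rel}--> {v}")
    # De-duplicate while keeping order, then build the prompt context.
    return "Knowledge:\n" + "\n".join(dict.fromkeys(facts))

g = nx.DiGraph()
g.add_edge("Jamaica", "English", rel="official_language")
g.add_edge("Jamaica", "Kingston", rel="capital")
scores = {"English": 0.9, "Kingston": 0.4}  # stand-in for GNN output
print(gnn_rag_context(g, ["Jamaica"], scores, top_k=1))
```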
arXiv Detail & Related papers (2024-05-30T15:14:24Z)
- Graph Transformers for Large Graphs [57.19338459218758]
This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints.
A key innovation of this work lies in the creation of a fast neighborhood sampling technique coupled with a local attention mechanism.
We report a 3x speedup and a 16.8% performance gain on ogbn-products and snap-patents, and scale LargeGT to ogbn-100M with a 5.9% performance improvement.
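A minimal sketch of the local-attention idea, assuming a fixed-size sampled neighborhood per node; the layer shape and the random sampling stub are assumptions, not LargeGT's actual architecture.

```python
import torch
import torch.nn as nn

class LocalAttentionLayer(nn.Module):
    """Attention restricted to a sampled neighborhood per node,
    in the spirit of local attention over large graphs (illustrative)."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x, neighbor_idx):
        # x: [N, dim]; neighbor_idx: [N, k] sampled neighbor ids per node.
        tokens = x[neighbor_idx]            # [N, k, dim] local token sets
        query = x.unsqueeze(1)              # each node attends to its block
        out, _ = self.attn(query, tokens, tokens)
        return out.squeeze(1)

N, k, dim = 100, 8, 32
x = torch.randn(N, dim)
neighbor_idx = torch.randint(0, N, (N, k))  # stand-in for fast sampling
print(LocalAttentionLayer(dim)(x, neighbor_idx).shape)  # torch.Size([100, 32])
```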
arXiv Detail & Related papers (2023-12-18T11:19:23Z)
- GRAPES: Learning to Sample Graphs for Scalable Graph Neural Networks [2.4175455407547015]
Graph neural networks learn to represent nodes by aggregating information from their neighbors.
Because aggregation over full neighborhoods becomes prohibitively expensive on large graphs, several existing methods sample a small subset of nodes, scaling GNNs to much larger graphs.
We introduce GRAPES, an adaptive sampling method that learns to identify the set of nodes crucial for training a GNN.
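A hedged sketch of the adaptive-sampling idea: a small learned scorer ranks candidate nodes and a subset is drawn per step. The scorer architecture, Gumbel-top-k sampling, and training signal here are assumptions rather than GRAPES' exact mechanism.

```python
import torch
import torch.nn as nn

class LearnedNodeSampler(nn.Module):
    """Score candidate frontier nodes and sample k of them per step
    (illustrative scorer and sampling; not GRAPES' exact mechanism)."""
    def __init__(self, dim):
        super().__init__()
        self.scorer = nn.Sequential(nn.Linear(dim, 16), nn.ReLU(),
                                    nn.Linear(16, 1))

    def forward(self, candidate_feats, k):
        logits = self.scorer(candidate_feats).squeeze(-1)  # [num_candidates]
        # Gumbel-top-k: stochastic selection of k nodes guided by scores.
        gumbel = -torch.log(-torch.log(
            torch.rand_like(logits).clamp_min(1e-9)))
        chosen = torch.topk(logits + gumbel, k).indices
        return chosen, logits  # logits can carry the training signal

sampler = LearnedNodeSampler(dim=32)
feats = torch.randn(500, 32)          # features of the candidate frontier
idx, logits = sampler(feats, k=64)
print(idx.shape)                      # torch.Size([64])
```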
arXiv Detail & Related papers (2023-10-05T09:08:47Z)
- Schema First! Learn Versatile Knowledge Graph Embeddings by Capturing Semantics with MASCHInE [3.174882428337821]
Knowledge graph embedding models (KGEMs) have gained considerable traction in recent years.
In this work, we design protographs -- small, modified versions of a KG that leverage RDF/S information.
The learnt protograph-based embeddings are meant to encapsulate the semantics of a KG and can be used to learn KG embeddings that, in turn, better capture those semantics.
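A minimal sketch of the protograph idea, assuming RDF/S type information is available as a node-to-class map: instance-level triples are lifted to class-level triples, producing a small schema-shaped graph on which embeddings can be pre-trained. MASCHInE's exact construction rules may differ; this only conveys the lifting step.

```python
def build_protograph(triples, node_class):
    """Lift instance-level (h, r, t) triples to class-level triples
    using RDF/S type information (illustrative protograph sketch)."""
    proto = set()
    for h, r, t in triples:
        for ch in node_class.get(h, {"UnknownClass"}):
            for ct in node_class.get(t, {"UnknownClass"}):
                proto.add((ch, r, ct))
    return sorted(proto)

triples = [("paper1", "written_by", "authorX"),
           ("paper2", "written_by", "authorY")]
node_class = {"paper1": {"Paper"}, "paper2": {"Paper"},
              "authorX": {"Author"}, "authorY": {"Author"}}
print(build_protograph(triples, node_class))
# [('Paper', 'written_by', 'Author')]
```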
arXiv Detail & Related papers (2023-06-06T13:22:54Z)
- Graph Mixture of Experts: Learning on Large-Scale Graphs with Explicit Diversity Modeling [60.0185734837814]
Graph neural networks (GNNs) have found extensive applications in learning from graph data.
To bolster the generalization capacity of GNNs, it has become customary to enrich training graph structures with techniques such as graph augmentation.
This study introduces the concept of Mixture-of-Experts (MoE) to GNNs, with the aim of augmenting their capacity to adapt to a diverse range of training graph structures.
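A hedged sketch of a Mixture-of-Experts GNN layer: a gating network softly routes each node's aggregated neighborhood features among expert transforms. The mean aggregation, soft routing, and expert design are illustrative assumptions, not the paper's exact layer.

```python
import torch
import torch.nn as nn

class GraphMoELayer(nn.Module):
    """Per-node gating over expert transforms of mean-aggregated
    neighbor features (illustrative MoE-for-GNN sketch)."""
    def __init__(self, dim, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim)
                                     for _ in range(num_experts))
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, x, adj):
        # Mean aggregation over neighbors.
        h = adj @ x / adj.sum(1, keepdim=True).clamp(min=1)
        weights = torch.softmax(self.gate(x), dim=-1)       # [N, E] routing
        expert_out = torch.stack([e(h) for e in self.experts], dim=1)
        return (weights.unsqueeze(-1) * expert_out).sum(1)  # [N, dim]

N, dim = 50, 16
adj = (torch.rand(N, N) < 0.1).float()
print(GraphMoELayer(dim)(torch.randn(N, dim), adj).shape)
```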
arXiv Detail & Related papers (2023-04-06T01:09:36Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training scheme, named EnGCN, to address these challenges.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
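As a rough sketch of an ensembling scheme in this spirit (EnGCN's actual recipe, e.g., any layer-wise training or self-label enhancement, is not reproduced here), the snippet below pairs one cheap classifier with each feature-propagation depth and averages their logits.

```python
import torch
import torch.nn as nn

def propagate(x, adj, hops):
    """hops rounds of degree-normalized neighbor averaging."""
    deg = adj.sum(1, keepdim=True).clamp(min=1)
    for _ in range(hops):
        x = adj @ x / deg
    return x

# One cheap classifier per propagation depth; the ensemble averages logits.
N, dim, classes = 200, 32, 5
x = torch.randn(N, dim)
adj = (torch.rand(N, N) < 0.05).float()
members = [nn.Linear(dim, classes) for _ in range(3)]
logits = torch.stack([m(propagate(x, adj, hops=i))
                      for i, m in enumerate(members)])
pred = logits.mean(0).argmax(-1)      # ensemble prediction per node
print(pred.shape)                     # torch.Size([200])
```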
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
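A minimal sketch of gradual magnitude pruning applied during training with no separate retraining phase; CGP also prunes graph structure and uses its own pruning criteria, which are not reproduced here.

```python
import torch
import torch.nn as nn

def gradual_prune_(module, sparsity):
    """Zero the smallest-|w| entries in place, raising sparsity over
    epochs (illustrative gradual pruning; weights only, no retrain)."""
    with torch.no_grad():
        for p in module.parameters():
            if p.dim() < 2:
                continue
            k = int(sparsity * p.numel())
            if k == 0:
                continue
            thresh = p.abs().flatten().kthvalue(k).values
            p.mul_((p.abs() > thresh).float())

layer = nn.Linear(64, 64)
for epoch in range(5):
    # ... a normal training step would go here ...
    gradual_prune_(layer, sparsity=0.1 * (epoch + 1))  # 10% -> 50%
print((layer.weight == 0).float().mean())  # ~0.5 sparsity
```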
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Sequence-to-Sequence Knowledge Graph Completion and Question Answering [8.207403859762044]
We show that an off-the-shelf encoder-decoder Transformer model can serve as a scalable and versatile KGE model.
We achieve this by posing KG link prediction as a sequence-to-sequence task, replacing the triple-scoring approach of prior KGE methods with autoregressive decoding.
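A hedged sketch of the seq2seq formulation: verbalize a (head, relation, ?) query as text and decode the tail entity autoregressively. The snippet loads a stock t5-small checkpoint from Hugging Face purely for illustration; the paper fine-tunes on KG triples, and the prompt format here is an assumption.

```python
# Verbalize (h, r, ?) as input text and generate the tail entity.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

query = "predict tail: Barack Obama | born in"   # verbalized (h, r, ?) query
ids = tok(query, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=8, num_beams=4)
print(tok.decode(out[0], skip_special_tokens=True))
# Without KG fine-tuning the output is not meaningful; this only
# demonstrates the autoregressive-decoding formulation.
```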
arXiv Detail & Related papers (2022-03-19T13:01:49Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
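A minimal sketch of the unified-sparsification idea: learnable masks over both the adjacency matrix and a weight matrix, pruned by magnitude to expose a sparse (sub-dataset, sub-network) pair. The mask parameterization and pruning schedule are assumptions; UGS's full procedure (iterative training and rewinding) is not shown.

```python
import torch
import torch.nn as nn

class UGSMasks(nn.Module):
    """Learnable masks over the graph adjacency and a weight matrix;
    low-magnitude mask entries are pruned (illustrative, not full UGS)."""
    def __init__(self, num_nodes, dim):
        super().__init__()
        # Small noise stands in for mask training in this sketch.
        self.adj_mask = nn.Parameter(torch.ones(num_nodes, num_nodes)
                                     + 0.01 * torch.randn(num_nodes, num_nodes))
        self.w_mask = nn.Parameter(torch.ones(dim, dim)
                                   + 0.01 * torch.randn(dim, dim))

    def apply_masks(self, adj, weight):
        return (adj * torch.sigmoid(self.adj_mask),
                weight * torch.sigmoid(self.w_mask))

    @torch.no_grad()
    def prune_(self, frac):
        for m in (self.adj_mask, self.w_mask):
            k = max(1, int(frac * m.numel()))
            thresh = m.flatten().kthvalue(k).values
            m[m <= thresh] = -10.0   # sigmoid(-10) ~ 0: entry pruned

masks = UGSMasks(num_nodes=100, dim=32)
masks.prune_(frac=0.2)  # drop the lowest-scored 20% of entries
adj = (torch.rand(100, 100) < 0.1).float()
sparse_adj, sparse_w = masks.apply_masks(adj, torch.randn(32, 32))
print((torch.sigmoid(masks.adj_mask) < 1e-3).float().mean())  # ~0.2
```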
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- Toward Subgraph-Guided Knowledge Graph Question Generation with Graph Neural Networks [53.58077686470096]
Knowledge graph (KG) question generation (QG) aims to generate natural language questions from KGs and target answers.
In this work, we focus on a more realistic setting where we aim to generate questions from a KG subgraph and target answers.
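As a small, hedged illustration of the input side of this setting, the snippet below linearizes a KG subgraph plus target answer into a single string of the kind a subgraph-guided question generator could consume; the format is an assumption, not the paper's encoder.

```python
def verbalize_for_qg(subgraph_triples, answer):
    """Linearize a KG subgraph and target answer into one input string
    for a question-generation model (illustrative format)."""
    facts = " ; ".join(f"{h} {r} {t}" for h, r, t in subgraph_triples)
    return f"generate question: {facts} | answer: {answer}"

print(verbalize_for_qg([("Jamaica", "capital", "Kingston")], "Kingston"))
```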
arXiv Detail & Related papers (2020-04-13T15:43:22Z)