DHIL-GT: Scalable Graph Transformer with Decoupled Hierarchy Labeling
- URL: http://arxiv.org/abs/2412.04738v1
- Date: Fri, 06 Dec 2024 02:59:01 GMT
- Title: DHIL-GT: Scalable Graph Transformer with Decoupled Hierarchy Labeling
- Authors: Ningyi Liao, Zihao Yu, Siqiang Luo
- Abstract summary: Graph Transformer (GT) has recently emerged as a promising neural network architecture for learning graph-structured data.
We propose DHIL-GT, a scalable Graph Transformer that simplifies network learning by fully decoupling the graph computation to a separate stage in advance.
DHIL-GT improves computational efficiency and mini-batch capability over existing scalable Graph Transformer designs on large-scale benchmarks.
- Score: 10.556366638048384
- Abstract: Graph Transformer (GT) has recently emerged as a promising neural network architecture for learning graph-structured data. However, its global attention mechanism, whose complexity is quadratic in the graph size, prevents wider application to large graphs. While current methods attempt to enhance GT scalability by altering the model architecture or encoding hierarchical graph data, our analysis reveals that these models still suffer from a computational bottleneck tied to graph-scale operations. In this work, we target the GT scalability issue and propose DHIL-GT, a scalable Graph Transformer that simplifies network learning by fully decoupling the graph computation to a separate stage in advance. DHIL-GT effectively retrieves hierarchical information by exploiting the graph labeling technique: as we show, the graph label hierarchy is more informative than plain adjacency, offering global connections while promoting locality, and is particularly suitable for handling complex graph patterns such as heterophily. We further design subgraph sampling and positional encoding schemes that precompute the model input on top of graph labels in an end-to-end manner. The training stage thus removes graph-related computations entirely, leading to ideal mini-batch capability and GPU utilization. Notably, the precomputation and training processes of DHIL-GT achieve complexities linear in the number of graph edges and nodes, respectively. Extensive experiments demonstrate that DHIL-GT outperforms existing scalable Graph Transformer designs in computational efficiency and mini-batch capability on large-scale benchmarks, while achieving top-tier effectiveness on both homophilous and heterophilous graphs.
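The decoupling the abstract describes, precomputing fixed-size per-node inputs offline so that training only ever touches dense tensors, can be illustrated with a minimal sketch. Everything below is an assumption for illustration rather than the authors' implementation: the naive neighbor sampler and random vectors merely stand in for DHIL-GT's label-based subgraph sampling and label-derived positional encodings, and the model is a plain PyTorch Transformer encoder, not the paper's architecture.

```python
# Minimal sketch of a precompute-then-train pipeline (illustrative assumptions only).
import torch
import torch.nn as nn


def precompute_node_tokens(num_nodes, edge_index, k=8, dim=16):
    """Offline stage: build a fixed-size token sequence per node.

    Here we sample k neighbors per node and add a random positional encoding;
    DHIL-GT instead derives both from its graph-label hierarchy.
    """
    feats = torch.randn(num_nodes, dim)          # toy node features
    pos = torch.randn(num_nodes, dim)            # stand-in positional encoding
    tokens = torch.empty(num_nodes, k, dim)
    neighbors = [[] for _ in range(num_nodes)]
    for src, dst in edge_index.t().tolist():
        neighbors[src].append(dst)
    for v in range(num_nodes):
        nbrs = neighbors[v] or [v]               # fall back to self-loop if isolated
        idx = torch.tensor([nbrs[i % len(nbrs)] for i in range(k)])
        tokens[v] = feats[idx] + pos[idx]        # fixed-size input; no graph needed afterwards
    return tokens


class TokenTransformer(nn.Module):
    """Training-stage model: attends only within each node's precomputed tokens."""

    def __init__(self, dim=16, heads=4, num_classes=5):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):                        # x: [batch, k, dim]
        return self.head(self.encoder(x).mean(dim=1))


# Mini-batch training over plain tensors: no adjacency access inside the loop.
edge_index = torch.randint(0, 100, (2, 400))     # toy random graph
tokens = precompute_node_tokens(100, edge_index)
labels = torch.randint(0, 5, (100,))
model = TokenTransformer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(tokens, labels), batch_size=32, shuffle=True)
for x, y in loader:
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
```

Because the training loop touches no graph structure, mini-batches are formed by slicing tensors, which is the property the abstract credits to the decoupled design for mini-batch capability and GPU utilization.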
Related papers
- Graph Transformers for Large Graphs [57.19338459218758]
This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints.
A key innovation of this work lies in the creation of a fast neighborhood sampling technique coupled with a local attention mechanism.
We report a 3x speedup and 16.8% performance gain on ogbn-products and snap-patents, while we also scale LargeGT on ogbn-100M with a 5.9% performance improvement.
arXiv Detail & Related papers (2023-12-18T11:19:23Z) - Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
arXiv Detail & Related papers (2023-09-18T20:12:17Z) - Transforming Graphs for Enhanced Attribute Clustering: An Innovative Graph Transformer-Based Method [8.989218350080844]
This study introduces an innovative method known as the Graph Transformer Auto-Encoder for Graph Clustering (GTAGC).
By melding the Graph Auto-Encoder with the Graph Transformer, GTAGC is adept at capturing global dependencies between nodes.
The architecture of GTAGC encompasses graph embedding, integration of the Graph Transformer within the autoencoder structure, and a clustering component.
arXiv Detail & Related papers (2023-06-20T06:04:03Z) - Hierarchical Transformer for Scalable Graph Learning [22.462712609402324]
Graph Transformer has demonstrated state-of-the-art performance on benchmarks for graph representation learning.
The complexity of the global self-attention mechanism presents a challenge for full-batch training when applied to larger graphs.
We introduce the Hierarchical Scalable Graph Transformer (HSGT) as a solution to these challenges.
HSGT successfully scales the Transformer architecture to node representation learning tasks on large-scale graphs, while maintaining high performance.
arXiv Detail & Related papers (2023-05-04T14:23:22Z) - Dynamic Graph Representation Learning via Graph Transformer Networks [41.570839291138114]
We propose a Transformer-based dynamic graph learning method named Dynamic Graph Transformer (DGT).
DGT has spatial-temporal encoding to effectively learn graph topology and capture implicit links.
We show that DGT presents superior performance compared with several state-of-the-art baselines.
arXiv Detail & Related papers (2021-11-19T21:44:23Z) - Spectral Graph Convolutional Networks With Lifting-based Adaptive Graph Wavelets [81.63035727821145]
Spectral graph convolutional networks (SGCNs) have been attracting increasing attention in graph representation learning.
We propose a novel class of spectral graph convolutional networks that implement graph convolutions with adaptive graph wavelets.
arXiv Detail & Related papers (2021-08-03T17:57:53Z) - Hierarchical Adaptive Pooling by Capturing High-order Dependency for Graph Representation Learning [18.423192209359158]
Graph neural networks (GNNs) have proven mature enough to handle graph-structured data in node-level representation learning tasks.
This paper proposes a hierarchical graph-level representation learning framework, which is adaptively sensitive to graph structures.
arXiv Detail & Related papers (2021-04-13T06:22:24Z) - Robust Optimization as Data Augmentation for Large-scale Graphs [117.2376815614148]
We propose FLAG (Free Large-scale Adversarial Augmentation on Graphs), which iteratively augments node features with gradient-based adversarial perturbations during training.
FLAG is a general-purpose approach for graph data, which universally works in node classification, link prediction, and graph classification tasks.
arXiv Detail & Related papers (2020-10-19T21:51:47Z) - Iterative Deep Graph Learning for Graph Neural Networks: Better and Robust Node Embeddings [53.58077686470096]
We propose an end-to-end graph learning framework, namely Iterative Deep Graph Learning (IDGL), for jointly and iteratively learning graph structure and graph embedding.
Our experiments show that our proposed IDGL models can consistently outperform or match the state-of-the-art baselines.
arXiv Detail & Related papers (2020-06-21T19:49:15Z) - Adaptive Graph Auto-Encoder for General Data Clustering [90.8576971748142]
Graph-based clustering plays an important role in clustering research.
Recent studies on graph convolutional neural networks have achieved impressive success on graph-structured data.
We propose a graph auto-encoder for general data clustering that constructs the graph adaptively from a generative perspective on graphs.
arXiv Detail & Related papers (2020-02-20T10:11:28Z)