Graph Propagation Transformer for Graph Representation Learning
- URL: http://arxiv.org/abs/2305.11424v3
- Date: Wed, 09 Oct 2024 04:25:18 GMT
- Title: Graph Propagation Transformer for Graph Representation Learning
- Authors: Zhe Chen, Hao Tan, Tao Wang, Tianrun Shen, Tong Lu, Qiuying Peng, Cheng Cheng, Yue Qi
- Abstract summary: We propose a new attention mechanism called Graph Propagation Attention (GPA).
It explicitly passes information among nodes and edges in three ways, i.e., node-to-node, node-to-edge, and edge-to-node.
We show that our method outperforms many state-of-the-art transformer-based graph models.
- Score: 36.01189696668657
- Abstract: This paper presents a novel transformer architecture for graph representation learning. The core insight of our method is to fully consider the information propagation among nodes and edges in a graph when building the attention module in the transformer blocks. Specifically, we propose a new attention mechanism called Graph Propagation Attention (GPA). It explicitly passes information among nodes and edges in three ways, i.e., node-to-node, node-to-edge, and edge-to-node, which is essential for learning graph-structured data. On this basis, we design an effective transformer architecture named Graph Propagation Transformer (GPTrans) to further help learn graph data. We verify the performance of GPTrans in a wide range of graph learning experiments on several benchmark datasets. These results show that our method outperforms many state-of-the-art transformer-based graph models. The code will be released at https://github.com/czczup/GPTrans.
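The abstract names the three propagation paths but not their form. The following is a minimal PyTorch-style sketch of how node-to-node, node-to-edge, and edge-to-node propagation could be wired into one attention block; tensor shapes, module names, and update rules are assumptions for illustration, not the released GPTrans implementation.

```python
import torch
import torch.nn as nn

class GraphPropagationAttentionSketch(nn.Module):
    """Single-head sketch of node-to-node, node-to-edge, and edge-to-node propagation."""

    def __init__(self, dim):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(dim, dim), nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.edge_bias = nn.Linear(dim, 1)       # edge embedding -> attention bias
        self.edge_update = nn.Linear(1, dim)     # pairwise score -> edge residual
        self.edge_to_node = nn.Linear(dim, dim)  # aggregated edges -> node residual
        self.scale = dim ** -0.5

    def forward(self, x, e):
        # x: [N, dim] node features, e: [N, N, dim] pairwise edge features
        q, k, v = self.q(x), self.k(x), self.v(x)
        # node-to-node: standard attention, biased by the edge embeddings
        scores = (q @ k.t()) * self.scale + self.edge_bias(e).squeeze(-1)  # [N, N]
        x = x + scores.softmax(dim=-1) @ v
        # node-to-edge: push the pairwise scores back into the edge features
        e = e + self.edge_update(scores.unsqueeze(-1))
        # edge-to-node: aggregate each node's incident edges back into its feature
        x = x + self.edge_to_node(e.mean(dim=1))
        return x, e

# toy usage on a 5-node graph with dense edge features
x, e = torch.randn(5, 16), torch.randn(5, 5, 16)
x_out, e_out = GraphPropagationAttentionSketch(16)(x, e)
```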
Related papers
- Technical Report: The Graph Spectral Token -- Enhancing Graph Transformers with Spectral Information [0.8184895397419141]
Graph Transformers have emerged as a powerful alternative to Message-Passing Graph Neural Networks (MP-GNNs)
We propose the Graph Spectral Token, a novel approach to directly encode graph spectral information.
We benchmark the effectiveness of our approach by enhancing two existing graph transformers, GraphTrans and SubFormer.
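A rough sketch of the idea as summarized above: compress the graph's Laplacian spectrum into one extra token prepended to the node-token sequence of a graph Transformer. The pooling and projection choices below are assumptions, not the paper's actual construction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectralTokenSketch(nn.Module):
    """Compress the normalized-Laplacian spectrum into one extra token."""

    def __init__(self, dim, k=8):
        super().__init__()
        self.k = k
        self.proj = nn.Linear(k, dim)

    def forward(self, adj, node_tokens):
        n = adj.shape[0]
        deg = adj.sum(-1).clamp(min=1e-6)
        lap = torch.eye(n) - torch.diag(deg.rsqrt()) @ adj @ torch.diag(deg.rsqrt())
        eigvals = torch.linalg.eigvalsh(lap)               # sorted ascending
        feats = F.pad(eigvals[: self.k], (0, max(0, self.k - n)))
        tok = self.proj(feats).unsqueeze(0)                # [1, dim] spectral token
        return torch.cat([tok, node_tokens], dim=0)        # [1 + N, dim]

adj = (torch.rand(6, 6) > 0.5).float()
adj = ((adj + adj.t()) > 0).float().fill_diagonal_(0)
tokens = SpectralTokenSketch(dim=32)(adj, torch.randn(6, 32))
```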
arXiv Detail & Related papers (2024-04-08T15:24:20Z)
- Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
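A generic sketch of the recipe described above: freeze a pre-trained graph Transformer and learn only a few extra prompt tokens prepended to the input token sequence. The `nn.TransformerEncoder` backbone here is a stand-in, not the paper's model.

```python
import torch
import torch.nn as nn

class GraphPromptTuningSketch(nn.Module):
    """Freeze a pre-trained encoder; train only a few prepended prompt tokens."""

    def __init__(self, pretrained_encoder: nn.Module, dim: int, num_prompts: int = 4):
        super().__init__()
        self.encoder = pretrained_encoder
        for p in self.encoder.parameters():
            p.requires_grad = False                      # keep the backbone frozen
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)

    def forward(self, tokens):                           # tokens: [N, dim]
        x = torch.cat([self.prompts, tokens], dim=0)     # prepend learnable prompts
        return self.encoder(x.unsqueeze(0)).squeeze(0)   # batch of one graph

dim = 32
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), num_layers=2
)
model = GraphPromptTuningSketch(backbone, dim)
out = model(torch.randn(10, dim))                        # only model.prompts is trainable
```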
arXiv Detail & Related papers (2023-09-18T20:12:17Z)
- AGFormer: Efficient Graph Representation with Anchor-Graph Transformer [95.1825252182316]
We propose a novel graph Transformer architecture, termed Anchor Graph Transformer (AGFormer).
AGFormer first obtains a set of representative anchors and then converts node-to-node message passing into an anchor-to-anchor and anchor-to-node message passing process.
Extensive experiments on several benchmark datasets demonstrate the effectiveness and benefits of the proposed AGFormer.
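A minimal sketch of the anchor idea as described: let a few anchors attend to all nodes, mix information among anchors, then broadcast back to the nodes, so a full N x N attention matrix is never formed. Anchor selection and the exact update rules are assumptions.

```python
import torch
import torch.nn as nn

class AnchorAttentionSketch(nn.Module):
    """Route node-to-node interaction through a small set of anchors."""

    def __init__(self, dim, num_anchors=4):
        super().__init__()
        self.num_anchors = num_anchors
        self.to_anchor = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.anchor_mix = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.to_node = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, x):                                  # x: [N, dim]
        x = x.unsqueeze(0)                                 # fake batch dimension
        # crude anchor selection: simply the first few nodes (a real model would
        # learn or sample representative anchors)
        anchors = x[:, : self.num_anchors]
        anchors, _ = self.to_anchor(anchors, x, x)                 # nodes -> anchors
        anchors, _ = self.anchor_mix(anchors, anchors, anchors)    # anchor -> anchor
        out, _ = self.to_node(x, anchors, anchors)                 # anchors -> nodes
        return out.squeeze(0)

out = AnchorAttentionSketch(32)(torch.randn(50, 32))       # cost ~ O(N * num_anchors)
```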
arXiv Detail & Related papers (2023-05-12T14:35:42Z)
- Diffusing Graph Attention [15.013509382069046]
We develop a new model for Graph Transformers that integrates the arbitrary graph structure into the architecture.
Graph Diffuser (GD) learns to extract structural and positional relationships between distant nodes in the graph, which it then uses to direct the Transformer's attention and node representations.
Experiments on eight benchmarks show Graph Diffuser to be a highly competitive model, outperforming the state-of-the-art in a diverse set of domains.
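The mechanism is only hinted at in this summary. The sketch below shows one plausible reading: a learned combination of normalized adjacency powers (a diffusion over the graph) used as an attention bias, letting distant but structurally related nodes influence each other. This is an illustrative guess, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class DiffusionAttentionBiasSketch(nn.Module):
    """Learned combination of adjacency powers used as an attention bias."""

    def __init__(self, num_hops=4):
        super().__init__()
        self.hop_weights = nn.Parameter(torch.zeros(num_hops))

    def forward(self, adj):                                 # adj: [N, N]
        deg = adj.sum(-1).clamp(min=1.0)
        a_norm = adj / deg.unsqueeze(-1)                    # row-normalized adjacency
        bias, power = 0.0, torch.eye(adj.shape[0])
        for w in self.hop_weights:                          # sum_k w_k * A^k
            power = power @ a_norm
            bias = bias + w * power
        return bias                                         # added to attention logits

adj = (torch.rand(8, 8) > 0.6).float()
bias = DiffusionAttentionBiasSketch()(adj)                  # [8, 8], used before softmax
```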
arXiv Detail & Related papers (2023-03-01T16:11:05Z)
- PatchGT: Transformer over Non-trainable Clusters for Learning Graph Representations [18.203910156450085]
We propose a new Transformer-based graph neural network: Patch Graph Transformer (PatchGT).
Unlike previous transformer-based models for learning graph representations, PatchGT learns from non-trainable graph patches, not from nodes directly.
PatchGT achieves higher expressiveness than 1-WL-type GNNs, and the empirical study shows that PatchGT achieves competitive performance on benchmark datasets.
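A rough sketch of the patch idea as summarized above: partition nodes into non-trainable clusters (here, a simple spectral bipartition by the Fiedler vector), pool each cluster into a patch token, and run a standard Transformer over the patches. The clustering and pooling choices are assumptions standing in for PatchGT's actual spectral clustering.

```python
import torch
import torch.nn as nn

def fiedler_patches(adj: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """Pool nodes into two non-trainable patches split by the sign of the Fiedler vector."""
    lap = torch.diag(adj.sum(-1)) - adj                      # graph Laplacian
    _, eigvecs = torch.linalg.eigh(lap)
    fiedler = eigvecs[:, 1]                                  # second-smallest eigenvector
    masks = [fiedler >= 0, fiedler < 0]
    return torch.stack([x[m].mean(0) if m.any() else x.mean(0) for m in masks])

# a standard Transformer then attends over patch tokens rather than raw nodes
dim = 32
adj = (torch.rand(12, 12) > 0.5).float()
adj = ((adj + adj.t()) > 0).float().fill_diagonal_(0)
tokens = fiedler_patches(adj, torch.randn(12, dim)).unsqueeze(0)    # [1, 2, dim]
encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), 2)
graph_repr = encoder(tokens).mean(dim=1)                            # [1, dim] graph embedding
```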
arXiv Detail & Related papers (2022-11-26T01:17:23Z)
- Adaptive Multi-Neighborhood Attention based Transformer for Graph Representation Learning [11.407118196728943]
We propose an adaptive graph Transformer, termed Multi-Neighborhood Attention based Graph Transformer (MNA-GT).
MNA-GT adaptively captures graph structural information for each node through its multi-neighborhood attention mechanism.
Experiments are conducted on a variety of graph benchmarks, and the empirical results show that MNA-GT outperforms many strong baselines.
arXiv Detail & Related papers (2022-11-15T08:12:44Z)
- Pure Transformers are Powerful Graph Learners [51.36884247453605]
We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice.
We prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers.
Our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines and competitive results compared to Transformer variants with sophisticated graph-specific inductive biases.
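The summary describes treating graph elements as plain tokens. Below is a minimal sketch of that recipe: one token per node and one per edge, tagged with node-identifier features so a completely standard Transformer can tell which edge connects which nodes. The identifier and feature choices are assumptions, not TokenGT's exact token augmentation.

```python
import torch
import torch.nn as nn

def tokenize_graph(x: torch.Tensor, edge_index: torch.Tensor, id_dim: int = 8):
    """One token per node and one per edge, tagged with node-identifier features."""
    n = x.shape[0]
    node_id = torch.linalg.qr(torch.randn(n, min(n, id_dim))).Q      # near-orthonormal ids
    node_id = nn.functional.pad(node_id, (0, id_dim - node_id.shape[1]))
    src, dst = edge_index
    node_tok = torch.cat([x, node_id, node_id], dim=-1)              # node repeats its own id
    edge_feat = (x[src] + x[dst]) / 2                                # stand-in edge features
    edge_tok = torch.cat([edge_feat, node_id[src], node_id[dst]], dim=-1)
    return torch.cat([node_tok, edge_tok], dim=0)                    # [N + E, x_dim + 2*id_dim]

x = torch.randn(6, 16)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
tokens = tokenize_graph(x, edge_index).unsqueeze(0)                  # [1, N + E, 32]
encoder = nn.TransformerEncoder(nn.TransformerEncoderLayer(32, nhead=4, batch_first=True), 2)
out = encoder(tokens)                            # standard Transformer, no graph-specific ops
```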
arXiv Detail & Related papers (2022-07-06T08:13:06Z)
- Transformer for Graphs: An Overview from Architecture Perspective [86.3545861392215]
It is imperative to sort out the existing Transformer models for graphs and systematically investigate their effectiveness on various graph tasks.
We first disassemble the existing models and identify three typical ways of incorporating graph information into the vanilla Transformer.
Our experiments confirm the benefits of current graph-specific modules on Transformer and reveal their advantages on different kinds of graph tasks.
arXiv Detail & Related papers (2022-02-17T06:02:06Z)
- Do Transformers Really Perform Bad for Graph Representation? [62.68420868623308]
We present Graphormer, which is built upon the standard Transformer architecture.
Our key insight is that, to utilize the Transformer on graphs, it is necessary to effectively encode the structural information of a graph into the model.
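The summary leaves the encoding unspecified; one well-known ingredient of Graphormer is a learnable attention bias indexed by the shortest-path distance between node pairs. The sketch below implements only that ingredient (centrality and edge encodings are omitted) and is a simplified illustration rather than the full Graphormer block.

```python
import torch
import torch.nn as nn

class SpatialEncodingSketch(nn.Module):
    """Learnable attention bias indexed by shortest-path distance between node pairs."""

    def __init__(self, max_dist: int = 10):
        super().__init__()
        self.max_dist = max_dist
        self.bias = nn.Embedding(max_dist + 2, 1)       # last bucket: far or unreachable

    def forward(self, adj):                             # adj: [N, N] in {0, 1}
        n = adj.shape[0]
        dist = torch.full((n, n), float("inf"))
        dist[adj.bool()] = 1.0
        dist.fill_diagonal_(0.0)
        for k in range(n):                              # Floyd-Warshall shortest paths
            dist = torch.minimum(dist, dist[:, k:k + 1] + dist[k:k + 1, :])
        idx = dist.nan_to_num(posinf=self.max_dist + 1).clamp(max=self.max_dist + 1).long()
        return self.bias(idx).squeeze(-1)               # [N, N], added to attention logits

adj = (torch.rand(8, 8) > 0.6).float()
adj = ((adj + adj.t()) > 0).float().fill_diagonal_(0)
attn_bias = SpatialEncodingSketch()(adj)
```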
arXiv Detail & Related papers (2021-06-09T17:18:52Z)
- Rethinking Graph Transformers with Spectral Attention [13.068288784805901]
We present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) to learn the position of each node in a given graph.
By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance.
Our model performs on par or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin.
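A compact sketch of the learned-positional-encoding idea described above: take the Laplacian eigendecomposition, feed each node's (eigenvalue, eigenvector-entry) pairs through a small learned network, and add the result to the node features before standard attention. The pooling MLP below is a stand-in for SAN's LPE module, not its exact architecture.

```python
import torch
import torch.nn as nn

class LearnedPositionalEncodingSketch(nn.Module):
    """Learn node positions from (eigenvalue, eigenvector-entry) pairs of the Laplacian."""

    def __init__(self, dim, k=8):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(nn.Linear(2, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, adj, x):                        # adj: [N, N], x: [N, dim]
        n = adj.shape[0]
        deg = adj.sum(-1).clamp(min=1e-6)
        lap = torch.eye(n) - torch.diag(deg.rsqrt()) @ adj @ torch.diag(deg.rsqrt())
        eigvals, eigvecs = torch.linalg.eigh(lap)
        k = min(self.k, n)
        pairs = torch.stack(                          # [N, k, 2]: (lambda_i, phi_i[node])
            [eigvals[:k].expand(n, k), eigvecs[:, :k]], dim=-1
        )
        lpe = self.mlp(pairs).sum(dim=1)              # pool over the k lowest frequencies
        return x + lpe                                # positions injected into node features

adj = (torch.rand(7, 7) > 0.5).float()
adj = ((adj + adj.t()) > 0).float().fill_diagonal_(0)
x = LearnedPositionalEncodingSketch(dim=32)(adj, torch.randn(7, 32))
```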
arXiv Detail & Related papers (2021-06-07T18:11:11Z)
- Dirichlet Graph Variational Autoencoder [65.94744123832338]
We present Dirichlet Graph Variational Autoencoder (DGVAE) with graph cluster memberships as latent factors.
Motivated by the low-pass characteristics of the balanced graph cut, we propose a new GNN variant named Heatts to encode the input graph into cluster memberships.
arXiv Detail & Related papers (2020-10-09T07:35:26Z)