PatchGT: Transformer over Non-trainable Clusters for Learning Graph
Representations
- URL: http://arxiv.org/abs/2211.14425v2
- Date: Fri, 7 Apr 2023 19:39:46 GMT
- Title: PatchGT: Transformer over Non-trainable Clusters for Learning Graph
Representations
- Authors: Han Gao, Xu Han, Jiaoyang Huang, Jian-Xun Wang, Li-Ping Liu
- Abstract summary: We propose a new Transformer-based graph neural network: Patch Graph Transformer (PatchGT)
Unlike previous transformer-based models for learning graph representations, PatchGT learns from non-trainable graph patches, not from nodes directly.
PatchGT achieves higher expressiveness than 1-WL-type GNNs, and the empirical study shows that PatchGT achieves competitive performance on benchmark datasets.
- Score: 18.203910156450085
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, the Transformer structure has shown good performance in graph
learning tasks. However, these Transformer models directly work on graph nodes
and may have difficulties learning high-level information. Inspired by the
vision transformer, which applies to image patches, we propose a new
Transformer-based graph neural network: Patch Graph Transformer (PatchGT).
Unlike previous transformer-based models for learning graph representations,
PatchGT learns from non-trainable graph patches, not from nodes directly. It
can help save computation and improve the model performance. The key idea is to
segment a graph into patches based on spectral clustering without any trainable
parameters, with which the model can first use GNN layers to learn patch-level
representations and then use Transformer to obtain graph-level representations.
The architecture leverages the spectral information of graphs and combines the
strengths of GNNs and Transformers. Further, we show the limitations of
previous hierarchical trainable clusters theoretically and empirically. We also
prove the proposed non-trainable spectral clustering method is permutation
invariant and can help address the information bottlenecks in the graph.
PatchGT achieves higher expressiveness than 1-WL-type GNNs, and the empirical
study shows that PatchGT achieves competitive performance on benchmark
datasets and provides interpretability to its predictions. The implementation
of our algorithm is released at our GitHub repo:
https://github.com/tufts-ml/PatchGT.
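The pipeline described in the abstract (non-trainable spectral clustering into patches, GNN layers for patch-level representations, then a Transformer over patch tokens) can be illustrated with a minimal PyTorch sketch. The single GCN-style layer, the k-means step on Laplacian eigenvectors, and the mean-pooling readout below are simplifying assumptions, not the paper's exact architecture.

```python
# Minimal PatchGT-style sketch (toy GNN layer; k-means on Laplacian eigenvectors
# stands in for the paper's exact spectral-clustering procedure).
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def spectral_patches(adj: np.ndarray, n_patches: int) -> np.ndarray:
    """Assign each node to a patch via non-trainable spectral clustering."""
    deg = adj.sum(1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
    lap = np.eye(len(adj)) - d_inv_sqrt @ adj @ d_inv_sqrt   # normalized Laplacian
    _, vecs = np.linalg.eigh(lap)
    emb = vecs[:, :n_patches]                                # low-frequency eigenvectors
    return KMeans(n_clusters=n_patches, n_init=10).fit_predict(emb)

class PatchGTSketch(nn.Module):
    def __init__(self, in_dim, hid_dim, n_heads=4, n_layers=2):
        super().__init__()
        self.gnn_w = nn.Linear(in_dim, hid_dim)              # one toy GNN layer
        enc = nn.TransformerEncoderLayer(hid_dim, n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(enc, n_layers)

    def forward(self, adj, x, patch_ids):
        # GNN message passing on the full graph: H = ReLU(A_hat X W)
        a_hat = adj + torch.eye(adj.size(0))
        a_hat = a_hat / a_hat.sum(1, keepdim=True)
        h = torch.relu(self.gnn_w(a_hat @ x))
        # Pool node embeddings into patch tokens, then attend over patches.
        n_patches = int(patch_ids.max()) + 1
        patches = torch.stack([h[patch_ids == p].mean(0) for p in range(n_patches)])
        out = self.transformer(patches.unsqueeze(0))          # (1, n_patches, hid)
        return out.mean(1)                                    # graph-level representation

# Toy usage: a random graph with 12 nodes split into 3 patches.
rng = np.random.default_rng(0)
adj_np = (rng.random((12, 12)) < 0.3).astype(float)
adj_np = np.triu(adj_np, 1); adj_np = adj_np + adj_np.T
patch_ids = torch.tensor(spectral_patches(adj_np, 3))
model = PatchGTSketch(in_dim=8, hid_dim=32)
graph_emb = model(torch.tensor(adj_np, dtype=torch.float32),
                  torch.randn(12, 8), patch_ids)
print(graph_emb.shape)   # torch.Size([1, 32])
```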
Related papers
- A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
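A minimal sketch of the random-walk context sampling that the GSPT summary describes; the walk length, seeding, and adjacency-list format are illustrative assumptions.

```python
# Sample a node context by a simple random walk over an adjacency list.
import random

def random_walk_context(adj_list, start, walk_len=8, seed=0):
    """Return the sequence of node ids visited by a random walk from `start`."""
    random.seed(seed)
    walk = [start]
    for _ in range(walk_len - 1):
        nbrs = adj_list[walk[-1]]
        if not nbrs:               # dead end: stop early
            break
        walk.append(random.choice(nbrs))
    return walk

# Toy graph: a 0-1-2-3 path plus a chord 1-3.
adj_list = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
print(random_walk_context(adj_list, start=0))
```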
arXiv Detail & Related papers (2024-06-19T22:30:08Z) - Technical Report: The Graph Spectral Token -- Enhancing Graph Transformers with Spectral Information [0.8184895397419141]
Graph Transformers have emerged as a powerful alternative to Message-Passing Graph Neural Networks (MP-GNNs)
We propose the Graph Spectral Token, a novel approach to directly encode graph spectral information.
We benchmark the effectiveness of our approach by enhancing two existing graph transformers, GraphTrans and SubFormer.
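A hedged sketch of encoding graph spectral information as an extra token: the smallest Laplacian eigenvalues are mapped by a small MLP to a vector that is prepended to the node token sequence. The MLP encoder and the number of eigenvalues kept are assumptions, not the paper's exact construction.

```python
import numpy as np
import torch
import torch.nn as nn

def laplacian_eigenvalues(adj: np.ndarray, k: int) -> torch.Tensor:
    deg = np.diag(adj.sum(1))
    lam = np.linalg.eigvalsh(deg - adj)          # unnormalized Laplacian spectrum
    return torch.tensor(np.sort(lam)[:k], dtype=torch.float32)  # k smallest eigenvalues

class SpectralToken(nn.Module):
    def __init__(self, k_eigs: int, d_model: int):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(k_eigs, d_model), nn.ReLU(),
                                    nn.Linear(d_model, d_model))

    def forward(self, node_tokens: torch.Tensor, eigs: torch.Tensor) -> torch.Tensor:
        spec = self.encode(eigs).unsqueeze(0)    # (1, d_model)
        return torch.cat([spec, node_tokens], 0) # spectral token goes first

adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
tokens = SpectralToken(k_eigs=3, d_model=16)(torch.randn(3, 16),
                                             laplacian_eigenvalues(adj, 3))
print(tokens.shape)  # torch.Size([4, 16])
```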
arXiv Detail & Related papers (2024-04-08T15:24:20Z) - Cell Graph Transformer for Nuclei Classification [78.47566396839628]
We develop a cell graph transformer (CGT) that treats nodes and edges as input tokens to enable learnable adjacency and information exchange among all nodes.
Poor features can lead to noisy self-attention scores and inferior convergence.
We propose a novel topology-aware pretraining method that leverages a graph convolutional network (GCN) to learn a feature extractor.
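A small sketch of the overall CGT-style pipeline, where a graph convolutional feature extractor produces the tokens consumed by a Transformer; the single GCN layer and toy graph are illustrative, and the topology-aware pretraining objective is omitted.

```python
import torch
import torch.nn as nn

class GCNFeatureExtractor(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, adj, x):
        a_hat = adj + torch.eye(adj.size(0))                 # add self-loops
        a_hat = a_hat / a_hat.sum(1, keepdim=True)           # row-normalize
        return torch.relu(self.lin(a_hat @ x))               # one GCN layer

adj = (torch.rand(6, 6) < 0.4).float()
adj = torch.triu(adj, 1); adj = adj + adj.T
feats = GCNFeatureExtractor(8, 32)(adj, torch.randn(6, 8))   # node tokens
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(32, nhead=4, batch_first=True), num_layers=1)
print(encoder(feats.unsqueeze(0)).shape)                     # torch.Size([1, 6, 32])
```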
arXiv Detail & Related papers (2024-02-20T12:01:30Z) - GraphGPT: Graph Learning with Generative Pre-trained Transformers [9.862004020075126]
We introduce GraphGPT, a novel model for graph learning by self-supervised Generative Pre-trained Transformers.
Our model transforms each graph or sampled subgraph into a sequence of tokens representing its nodes, edges, and attributes.
Generative pre-training enables us to scale GraphGPT to 400M+ parameters with consistently increasing performance.
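A toy sketch of serializing an attributed graph into a flat token sequence in the spirit of GraphGPT; the special tokens and node/edge layout are assumptions chosen for illustration, not the paper's vocabulary.

```python
def graph_to_tokens(node_attrs, edges):
    """node_attrs: {node_id: label}; edges: list of (src, dst, label)."""
    tokens = ["<graph>"]
    for nid, label in sorted(node_attrs.items()):
        tokens += ["<node>", f"id:{nid}", f"attr:{label}"]
    for u, v, label in edges:
        tokens += ["<edge>", f"src:{u}", f"dst:{v}", f"attr:{label}"]
    tokens.append("</graph>")
    return tokens

print(graph_to_tokens({0: "C", 1: "O"}, [(0, 1, "double")]))
```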
arXiv Detail & Related papers (2023-12-31T16:19:30Z) - Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
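A minimal sketch of the freeze-and-prompt idea: the backbone's pre-trained parameters are frozen and only a handful of prepended prompt tokens receive gradients. The plain nn.TransformerEncoder backbone and the placeholder loss are stand-ins, not the paper's graph transformer.

```python
import torch
import torch.nn as nn

d_model, n_prompts = 32, 4
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
for p in backbone.parameters():          # freeze the pre-trained weights
    p.requires_grad = False

prompts = nn.Parameter(torch.randn(1, n_prompts, d_model))  # the only trainable part
optimizer = torch.optim.Adam([prompts], lr=1e-3)

x = torch.randn(2, 10, d_model)                              # batch of token sequences
inp = torch.cat([prompts.expand(2, -1, -1), x], dim=1)       # prepend prompt tokens
loss = backbone(inp).mean()                                  # placeholder objective
loss.backward()
optimizer.step()
print(prompts.grad is not None,                              # prompts received gradients
      sum(p.requires_grad for p in backbone.parameters()))   # 0 trainable backbone params
```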
arXiv Detail & Related papers (2023-09-18T20:12:17Z) - SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) of a pre-trained LM on the downstream task.
We then generate node embeddings from the last hidden states of the fine-tuned LM.
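A hedged sketch of this two-step recipe using the Hugging Face transformers and peft libraries: LoRA-based parameter-efficient fine-tuning of a small pre-trained LM (the supervised training loop is omitted), followed by node embeddings taken from its last hidden states. The model name and LoRA hyperparameters are illustrative.

```python
import torch
from transformers import AutoModel, AutoTokenizer
from peft import LoraConfig, get_peft_model

name = "bert-base-uncased"                      # stand-in text encoder
tokenizer = AutoTokenizer.from_pretrained(name)
model = get_peft_model(AutoModel.from_pretrained(name),
                       LoraConfig(r=8, lora_alpha=16,
                                  target_modules=["query", "value"]))
# ... supervised fine-tuning on the downstream task would happen here ...

node_texts = ["paper about graph transformers", "paper about spectral clustering"]
batch = tokenizer(node_texts, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state   # (n_nodes, seq_len, d)
node_emb = hidden[:, 0]                         # [CLS] states as node features
print(node_emb.shape)
```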
arXiv Detail & Related papers (2023-08-03T07:00:04Z) - Graph Propagation Transformer for Graph Representation Learning [36.01189696668657]
We propose a new attention mechanism called Graph Propagation Attention (GPA)
It explicitly passes the information among nodes and edges in three ways, i.e. node-to-node, node-to-edge, and edge-to-node.
We show that our method outperforms many state-of-the-art Transformer-based graph models.
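A toy sketch of three-way propagation in the spirit of GPA, with explicit node-to-node, node-to-edge, and edge-to-node updates; the sum aggregation and linear maps are illustrative assumptions rather than the paper's exact attention mechanism.

```python
import torch
import torch.nn as nn

class ThreeWayPropagation(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.n2n = nn.Linear(d, d)
        self.n2e = nn.Linear(2 * d, d)
        self.e2n = nn.Linear(d, d)

    def forward(self, h, e, edges):
        # h: (n_nodes, d) node states; e: (n_edges, d) edge states;
        # edges: (n_edges, 2) long tensor of (src, dst) pairs.
        src, dst = edges[:, 0], edges[:, 1]
        # node -> node: aggregate source states into each destination node
        agg = torch.zeros_like(h).index_add_(0, dst, h[src])
        h_new = torch.relu(self.n2n(agg))
        # node -> edge: update each edge from its endpoint states
        e_new = torch.relu(self.n2e(torch.cat([h[src], h[dst]], dim=-1)))
        # edge -> node: push updated edge states back onto destination nodes
        h_new = h_new + torch.zeros_like(h).index_add_(0, dst, self.e2n(e_new))
        return h_new, e_new

edges = torch.tensor([[0, 1], [1, 2], [2, 0]])
layer = ThreeWayPropagation(16)
h, e = layer(torch.randn(3, 16), torch.randn(3, 16), edges)
print(h.shape, e.shape)
```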
arXiv Detail & Related papers (2023-05-19T04:42:58Z) - Pure Transformers are Powerful Graph Learners [51.36884247453605]
We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice.
We prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers.
Our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines and competitive results overall.
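A sketch of the "pure Transformer over node and edge tokens" idea: node and edge features are concatenated into one token sequence, distinguished only by a learned type embedding, and fed to a standard encoder. The simple type embedding is a stand-in for TokenGT's node-identifier/type-identifier scheme.

```python
import torch
import torch.nn as nn

class PureGraphTransformer(nn.Module):
    def __init__(self, d_model=32, n_heads=4, n_layers=2):
        super().__init__()
        self.type_emb = nn.Embedding(2, d_model)   # 0 = node token, 1 = edge token
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True), n_layers)

    def forward(self, node_feats, edge_feats):
        types = torch.cat([torch.zeros(node_feats.size(0), dtype=torch.long),
                           torch.ones(edge_feats.size(0), dtype=torch.long)])
        tokens = torch.cat([node_feats, edge_feats], dim=0) + self.type_emb(types)
        return self.encoder(tokens.unsqueeze(0)).mean(1)   # graph-level readout

model = PureGraphTransformer()
print(model(torch.randn(5, 32), torch.randn(7, 32)).shape)  # torch.Size([1, 32])
```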
arXiv Detail & Related papers (2022-07-06T08:13:06Z) - Do Transformers Really Perform Bad for Graph Representation? [62.68420868623308]
We present Graphormer, which is built upon the standard Transformer architecture.
Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model.
arXiv Detail & Related papers (2021-06-09T17:18:52Z) - Rethinking Graph Transformers with Spectral Attention [13.068288784805901]
We present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) to learn the position of each node in a given graph.
By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance.
Our model performs on par with or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin.
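A hedged sketch of a learned positional encoding built from the Laplacian spectrum: each node gets the k lowest (eigenvalue, eigenvector entry) pairs, which a small MLP turns into a positional vector. SAN itself processes these pairs with a Transformer; the MLP and the unnormalized Laplacian are simplifications.

```python
import numpy as np
import torch
import torch.nn as nn

def laplacian_eig(adj: np.ndarray, k: int):
    deg = np.diag(adj.sum(1))
    vals, vecs = np.linalg.eigh(deg - adj)
    return (torch.tensor(vals[:k], dtype=torch.float32),
            torch.tensor(vecs[:, :k], dtype=torch.float32))

class LearnedPositionalEncoding(nn.Module):
    def __init__(self, d_model):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2, d_model), nn.ReLU(),
                                 nn.Linear(d_model, d_model))

    def forward(self, vals, vecs):
        # pairs[v, i] = (eigenvalue_i, eigenvector_i[v]) for each node v
        n, k = vecs.shape
        pairs = torch.stack([vals.expand(n, k), vecs], dim=-1)  # (n, k, 2)
        return self.mlp(pairs).sum(dim=1)                       # (n, d_model)

adj = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], float)
vals, vecs = laplacian_eig(adj, k=3)
pos = LearnedPositionalEncoding(d_model=16)(vals, vecs)
print(pos.shape)  # torch.Size([4, 16])
```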
arXiv Detail & Related papers (2021-06-07T18:11:11Z)