On Structural Expressive Power of Graph Transformers
- URL: http://arxiv.org/abs/2305.13987v1
- Date: Tue, 23 May 2023 12:12:21 GMT
- Title: On Structural Expressive Power of Graph Transformers
- Authors: Wenhao Zhu, Tianyu Wen, Guojie Song, Liang Wang, Bo Zheng
- Abstract summary: We introduce the SEG-WL test (Structural Encoding enhanced Global Weisfeiler-Lehman test), a generalized graph isomorphism test algorithm.
We show how graph Transformers' expressive power is determined by the design of structural encodings.
- Score: 26.84413369075598
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Graph Transformer has recently received wide attention in the research
community for its outstanding performance, yet its structural expressive power
has not been well analyzed. Inspired by the connections between the
Weisfeiler-Lehman (WL) graph isomorphism test and graph neural networks (GNNs),
we introduce \textbf{SEG-WL test} (\textbf{S}tructural \textbf{E}ncoding
enhanced \textbf{G}lobal \textbf{W}eisfeiler-\textbf{L}ehman test), a
generalized graph isomorphism test algorithm as a powerful theoretical tool for
exploring the structural discriminative power of graph Transformers. We
theoretically prove that the SEG-WL test is an expressivity upper bound on a
wide range of graph Transformers, and that the representational power of the SEG-WL
test can be approximated arbitrarily well by a simple Transformer network under certain
conditions. With the SEG-WL test, we show how graph Transformers' expressive
power is determined by the design of structural encodings, and present
conditions under which the expressivity of graph Transformers goes beyond the WL test and
GNNs. Moreover, motivated by the popular shortest path distance encoding, we
follow theory-oriented principles and develop a provably stronger
structural encoding method, the Shortest Path Induced Subgraph (\textit{SPIS})
encoding. Our theoretical findings provide a novel and practical paradigm for
investigating the expressive power of graph Transformers, and extensive
synthetic and real-world experiments empirically verify the strengths of our
proposed methods.
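To make the construction above concrete, here is a minimal Python sketch, under stated assumptions, of a structural-encoding enhanced global WL-style colour refinement in the spirit of the SEG-WL test. The shortest path distance serves as the pairwise structural encoding, and a hypothetical SPIS-like summary is included for contrast; the function names, the networkx helpers, and both encoding definitions are illustrative assumptions, not the paper's exact formulations.

```python
# Illustrative sketch only: this is NOT the authors' reference implementation.
# Assumptions: the pairwise structural encoding is the shortest-path distance
# (the Graphormer-style SPD encoding mentioned in the abstract), and the
# SPIS-style encoding below is a guess at what a "shortest path induced
# subgraph" summary could look like; the paper's exact definitions may differ.
from collections import Counter

import networkx as nx


def spd_encoding(G):
    """Pairwise encoding: shortest-path distance (inf for unreachable pairs)."""
    dist = dict(nx.all_pairs_shortest_path_length(G))
    return {(u, v): dist.get(u, {}).get(v, float("inf"))
            for u in G.nodes for v in G.nodes}


def spis_encoding(G):
    """Hypothetical SPIS-like encoding: for each pair (u, v), summarise the
    subgraph induced by the nodes lying on some shortest u-v path."""
    dist = dict(nx.all_pairs_shortest_path_length(G))
    enc = {}
    for u in G.nodes:
        for v in G.nodes:
            d = dist.get(u, {}).get(v)
            if d is None:                       # unreachable pair
                enc[(u, v)] = ("inf",)
                continue
            on_path = [w for w in G.nodes
                       if dist.get(u, {}).get(w, float("inf"))
                       + dist.get(w, {}).get(v, float("inf")) == d]
            H = G.subgraph(on_path)
            enc[(u, v)] = (d, H.number_of_nodes(), H.number_of_edges())
    return enc


def seg_wl_colours(G, encoding=spd_encoding, num_iters=3):
    """Global colour refinement: every node aggregates over *all* other nodes,
    each contribution tagged by the pairwise structural encoding."""
    enc = encoding(G)
    colour = {v: 0 for v in G.nodes}            # uniform initial colours
    for _ in range(num_iters):
        colour = {
            v: (colour[v],
                frozenset(Counter(
                    (colour[u], enc[(v, u)]) for u in G.nodes if u != v
                ).items()))
            for v in G.nodes
        }
    return colour


def distinguished(G1, G2, **kw):
    """Two graphs are distinguished if their colour histograms differ."""
    return (Counter(seg_wl_colours(G1, **kw).values())
            != Counter(seg_wl_colours(G2, **kw).values()))


# Two disjoint triangles vs. a 6-cycle: indistinguishable by the 1-WL test,
# but separated once shortest-path distances enter the global refinement.
G1 = nx.disjoint_union(nx.cycle_graph(3), nx.cycle_graph(3))
G2 = nx.cycle_graph(6)
print(distinguished(G1, G2))                          # expected: True
print(distinguished(G1, G2, encoding=spis_encoding))  # expected: True
```

Swapping encoding=spis_encoding into seg_wl_colours illustrates the abstract's point that the refinement's discriminative power is governed by the choice of structural encoding.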
Related papers
- What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding [67.59552859593985]
Graph Transformers, which incorporate self-attention and positional encoding, have emerged as a powerful architecture for various graph learning tasks.
This paper introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised classification.
arXiv Detail & Related papers (2024-06-04T05:30:16Z) - Automatic Graph Topology-Aware Transformer [50.2807041149784]
We build a comprehensive graph Transformer search space with micro-level and macro-level designs.
EGTAS evolves graph Transformer topologies at the macro level and graph-aware strategies at the micro level.
We demonstrate the efficacy of EGTAS across a range of graph-level and node-level tasks.
arXiv Detail & Related papers (2024-05-30T07:44:31Z) - On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers [20.73012427295352]
We study the theoretical expressive power of order-$k$ graph transformers and sparse variants.
We show that a natural neighborhood-based sparse order-$k$ transformer model is not only computationally efficient, but also expressive.
arXiv Detail & Related papers (2024-04-04T11:26:51Z) - Graph Transformers without Positional Encodings [0.7252027234425334]
We introduce Eigenformer, a Graph Transformer employing a novel spectrum-aware attention mechanism cognizant of the Laplacian spectrum of the graph.
We empirically show that it achieves performance competitive with SOTA Graph Transformers on a number of standard GNN benchmarks.
arXiv Detail & Related papers (2024-01-31T12:33:31Z) - Towards Principled Graph Transformers [8.897857788525629]
Graph learning architectures based on the k-dimensional Weisfeiler-Leman (k-WL) hierarchy offer a theoretically well-understood expressive power.
We show that the proposed Edge Transformer, a global attention model operating on node pairs instead of nodes, has at least 3-WL expressive power.
arXiv Detail & Related papers (2024-01-18T16:50:55Z) - Are More Layers Beneficial to Graph Transformers? [97.05661983225603]
Current graph transformers hit a bottleneck: increasing depth does not improve performance.
Deep graph transformers are limited by the vanishing capacity of global attention.
We propose a novel graph transformer model named DeepGraph that explicitly employs substructure tokens in the encoded representation.
arXiv Detail & Related papers (2023-03-01T15:22:40Z) - Pure Transformers are Powerful Graph Learners [51.36884247453605]
We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice.
We prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers.
Our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines, along with competitive results.
arXiv Detail & Related papers (2022-07-06T08:13:06Z) - Deformable Graph Transformer [31.254872949603982]
We propose Deformable Graph Transformer (DGT) that performs sparse attention with dynamically sampled key and value pairs.
Experiments demonstrate that our novel graph Transformer consistently outperforms existing Transformer-based models.
arXiv Detail & Related papers (2022-06-29T00:23:25Z) - Do Transformers Really Perform Bad for Graph Representation? [62.68420868623308]
We present Graphormer, which is built upon the standard Transformer architecture.
Our key insight in applying the Transformer to graphs is the necessity of effectively encoding a graph's structural information into the model.
arXiv Detail & Related papers (2021-06-09T17:18:52Z) - Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting [63.04999833264299]
"Graph Substructure Networks" (GSN) is a topologically-aware message passing scheme based on substructure encoding.
We show that it is strictly more expressive than the Weisfeiler-Leman (WL) graph isomorphism test.
We perform an extensive evaluation on graph classification and regression tasks and obtain state-of-the-art results in diverse real-world settings.
arXiv Detail & Related papers (2020-06-16T15:30:31Z)