Automatic Graph Topology-Aware Transformer
- URL: http://arxiv.org/abs/2405.19779v2
- Date: Tue, 6 Aug 2024 01:25:33 GMT
- Title: Automatic Graph Topology-Aware Transformer
- Authors: Chao Wang, Jiaxuan Zhao, Lingling Li, Licheng Jiao, Fang Liu, Shuyuan Yang,
- Abstract summary: We build a comprehensive graph Transformer search space with the micro-level and macro-level designs.
EGTAS evolves graph Transformer topologies at the macro level and graph-aware strategies at the micro level.
We demonstrate the efficacy of EGTAS across a range of graph-level and node-level tasks.
- Score: 50.2807041149784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing efforts are dedicated to designing many topologies and graph-aware strategies for the graph Transformer, which greatly improve the model's representation capabilities. However, manually determining the suitable Transformer architecture for a specific graph dataset or task requires extensive expert knowledge and laborious trials. This paper proposes an evolutionary graph Transformer architecture search framework (EGTAS) to automate the construction of strong graph Transformers. We build a comprehensive graph Transformer search space with the micro-level and macro-level designs. EGTAS evolves graph Transformer topologies at the macro level and graph-aware strategies at the micro level. Furthermore, a surrogate model based on generic architectural coding is proposed to directly predict the performance of graph Transformers, substantially reducing the evaluation cost of evolutionary search. We demonstrate the efficacy of EGTAS across a range of graph-level and node-level tasks, encompassing both small-scale and large-scale graph datasets. Experimental results and ablation studies show that EGTAS can construct high-performance architectures that rival state-of-the-art manual and automated baselines.
Related papers
- Graph Transformers without Positional Encodings [0.7252027234425334]
We introduce Eigenformer, a Graph Transformer employing a novel spectrum-aware attention mechanism cognizant of the Laplacian spectrum of the graph.
We empirically show that it achieves performance competetive with SOTA Graph Transformers on a number of standard GNN benchmarks.
arXiv Detail & Related papers (2024-01-31T12:33:31Z) - Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
arXiv Detail & Related papers (2023-09-18T20:12:17Z) - SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that a one-layer attention can bring up surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone enlightens a new technical path of independent interest for building Transformers on large graphs.
arXiv Detail & Related papers (2023-06-19T08:03:25Z) - Are More Layers Beneficial to Graph Transformers? [97.05661983225603]
Current graph transformers suffer from the bottleneck of improving performance by increasing depth.
Deep graph transformers are limited by the vanishing capacity of global attention.
We propose a novel graph transformer model named DeepGraph that explicitly employs substructure tokens in the encoded representation.
arXiv Detail & Related papers (2023-03-01T15:22:40Z) - Transformers over Directed Acyclic Graphs [6.263470141349622]
We study transformers over directed acyclic graphs (DAGs) and propose architecture adaptations tailored to DAGs.
We show that it is effective in making graph transformers generally outperform graph neural networks tailored to DAGs and in improving SOTA graph transformer performance in terms of both quality and efficiency.
arXiv Detail & Related papers (2022-10-24T12:04:52Z) - Hierarchical Graph Transformer with Adaptive Node Sampling [19.45896788055167]
We identify the main deficiencies of current graph transformers.
Most sampling strategies only focus on local neighbors and neglect the long-range dependencies in the graph.
We propose a hierarchical attention scheme with graph coarsening to capture the long-range interactions.
arXiv Detail & Related papers (2022-10-08T05:53:25Z) - Transformer for Graphs: An Overview from Architecture Perspective [86.3545861392215]
It's imperative to sort out the existing Transformer models for graphs and systematically investigate their effectiveness on various graph tasks.
We first disassemble the existing models and conclude three typical ways to incorporate the graph information into the vanilla Transformer.
Our experiments confirm the benefits of current graph-specific modules on Transformer and reveal their advantages on different kinds of graph tasks.
arXiv Detail & Related papers (2022-02-17T06:02:06Z) - Do Transformers Really Perform Bad for Graph Representation? [62.68420868623308]
We present Graphormer, which is built upon the standard Transformer architecture.
Our key insight to utilizing Transformer in the graph is the necessity of effectively encoding the structural information of a graph into the model.
arXiv Detail & Related papers (2021-06-09T17:18:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.