Gophormer: Ego-Graph Transformer for Node Classification
- URL: http://arxiv.org/abs/2110.13094v1
- Date: Mon, 25 Oct 2021 16:43:32 GMT
- Title: Gophormer: Ego-Graph Transformer for Node Classification
- Authors: Jianan Zhao, Chaozhuo Li, Qianlong Wen, Yiqi Wang, Yuming Liu, Hao Sun, Xing Xie and Yanfang Ye
- Abstract summary: In this paper, we propose a novel Gophormer model which applies transformers on ego-graphs instead of full-graphs.
Specifically, a Node2Seq module is proposed to sample ego-graphs as the input of transformers, which alleviates the challenge of scalability.
In order to handle the uncertainty introduced by the ego-graph sampling, we propose a consistency regularization and a multi-sample inference strategy.
- Score: 27.491500255498845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformers have achieved remarkable performance in a myriad of fields
including natural language processing and computer vision. However, when it
comes to the graph mining area, where graph neural networks (GNNs) have been the
dominant paradigm, transformers haven't achieved competitive performance,
especially on the node classification task. Existing graph transformer models
typically adopt a fully-connected attention mechanism on the whole input graph
and thus suffer from severe scalability issues and are intractable to train in
data-insufficient cases. To alleviate these issues, we propose a novel
Gophormer model which applies transformers on ego-graphs instead of
full-graphs. Specifically, a Node2Seq module is proposed to sample ego-graphs as
the input of transformers, which alleviates the challenge of scalability and
serves as an effective data augmentation technique to boost model performance.
Moreover, different from the feature-based attention strategy in vanilla
transformers, we propose a proximity-enhanced attention mechanism to capture
the fine-grained structural bias. In order to handle the uncertainty introduced
by the ego-graph sampling, we further propose a consistency regularization and
a multi-sample inference strategy for stabilized training and testing,
respectively. Extensive experiments on six benchmark datasets are conducted to
demonstrate the superiority of Gophormer over existing graph transformers and
popular GNNs, revealing the promising future of graph transformers.
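
As a rough illustration of the two sampling-related ideas in the abstract (turning a sampled ego-graph into a transformer input sequence, and using several samples at inference time), here is a minimal Python sketch. It is not the authors' implementation. The function names `sample_ego_graph` and `multi_sample_predict`, the `depth` and `fanout` parameters, the toy graph, and the toy model are all hypothetical, and averaging the per-sample class probabilities is only one plausible way to combine multiple samples; the abstract does not specify the aggregation rule.

```python
import random


def sample_ego_graph(adj, center, depth=2, fanout=3, rng=random):
    """Sample an ego-graph around `center` and flatten it into a node sequence.

    Assumption: at each hop, at most `fanout` not-yet-visited neighbors of
    every frontier node are kept. The center node is always first.
    """
    sequence = [center]
    seen = {center}
    frontier = [center]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            candidates = [n for n in adj.get(node, ()) if n not in seen]
            picked = rng.sample(candidates, min(fanout, len(candidates)))
            for n in picked:
                seen.add(n)
                sequence.append(n)
                next_frontier.append(n)
        frontier = next_frontier
    return sequence


def multi_sample_predict(model, adj, center, num_samples=4, rng=random, **sampler_kwargs):
    """Average the model's class probabilities over several sampled ego-graphs.

    `model` is any callable mapping a node sequence to a list of class
    probabilities (a stand-in for a transformer over the sampled sequence).
    """
    totals = None
    for _ in range(num_samples):
        probs = model(sample_ego_graph(adj, center, rng=rng, **sampler_kwargs))
        if totals is None:
            totals = [0.0] * len(probs)
        totals = [t + p for t, p in zip(totals, probs)]
    return [t / num_samples for t in totals]


# Toy undirected graph as an adjacency list (hypothetical data).
toy_adj = {0: [1, 2, 3], 1: [0, 4], 2: [0, 5], 3: [0], 4: [1], 5: [2]}


def toy_model(sequence):
    """Stand-in classifier: class 0 probability is the share of even node ids."""
    even_share = sum(1 for n in sequence if n % 2 == 0) / len(sequence)
    return [even_share, 1.0 - even_share]


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_ego_graph(toy_adj, 0, depth=2, fanout=2, rng=rng))
    print(multi_sample_predict(toy_model, toy_adj, 0, num_samples=5, rng=rng,
                               depth=2, fanout=2))
```

Because each call draws a different ego-graph, a single prediction can vary from run to run; combining several samples is one way to reduce that variance at test time, which is the motivation the abstract gives for multi-sample inference.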
Related papers
- SGFormer: Single-Layer Graph Transformers with Approximation-Free Linear Complexity [74.51827323742506]
We evaluate the necessity of adopting multi-layer attentions in Transformers on graphs.
We show that multi-layer propagation can be reduced to one-layer propagation, with the same capability for representation learning.
It suggests a new technical path for building powerful and efficient Transformers on graphs.
arXiv Detail & Related papers (2024-09-13T17:37:34Z)
- Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
arXiv Detail & Related papers (2023-09-18T20:12:17Z)
- SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that a one-layer attention can bring up surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone enlightens a new technical path of independent interest for building Transformers on large graphs.
arXiv Detail & Related papers (2023-06-19T08:03:25Z)
- Are More Layers Beneficial to Graph Transformers? [97.05661983225603]
Current graph transformers suffer from the bottleneck of improving performance by increasing depth.
Deep graph transformers are limited by the vanishing capacity of global attention.
We propose a novel graph transformer model named DeepGraph that explicitly employs substructure tokens in the encoded representation.
arXiv Detail & Related papers (2023-03-01T15:22:40Z)
- Transformers over Directed Acyclic Graphs [6.263470141349622]
We study transformers over directed acyclic graphs (DAGs) and propose architecture adaptations tailored to DAGs.
We show that it is effective in making graph transformers generally outperform graph neural networks tailored to DAGs and in improving SOTA graph transformer performance in terms of both quality and efficiency.
arXiv Detail & Related papers (2022-10-24T12:04:52Z)
- Hierarchical Graph Transformer with Adaptive Node Sampling [19.45896788055167]
We identify the main deficiencies of current graph transformers.
Most sampling strategies only focus on local neighbors and neglect the long-range dependencies in the graph.
We propose a hierarchical attention scheme with graph coarsening to capture the long-range interactions.
arXiv Detail & Related papers (2022-10-08T05:53:25Z)
- Deformable Graph Transformer [31.254872949603982]
We propose Deformable Graph Transformer (DGT) that performs sparse attention with dynamically sampled key and value pairs.
Experiments demonstrate that our novel graph Transformer consistently outperforms existing Transformer-based models.
arXiv Detail & Related papers (2022-06-29T00:23:25Z)
- Glance-and-Gaze Vision Transformer [13.77016463781053]
We propose a new vision Transformer, named Glance-and-Gaze Transformer (GG-Transformer).
It is motivated by the Glance and Gaze behavior of human beings when recognizing objects in natural scenes.
We empirically demonstrate our method achieves consistently superior performance over previous state-of-the-art Transformers.
arXiv Detail & Related papers (2021-06-04T06:13:47Z)
- Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper which applies transformers into pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)
- Applying the Transformer to Character-level Transduction [68.91664610425114]
The transformer has been shown to outperform recurrent neural network-based sequence-to-sequence models in various word-level NLP tasks.
We show that with a large enough batch size, the transformer does indeed outperform recurrent models for character-level tasks.
arXiv Detail & Related papers (2020-05-20T17:25:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.