HopFormer: Sparse Graph Transformers with Explicit Receptive Field Control
- URL: http://arxiv.org/abs/2602.02268v1
- Date: Mon, 02 Feb 2026 16:09:58 GMT
- Title: HopFormer: Sparse Graph Transformers with Explicit Receptive Field Control
- Authors: Sanggeon Yun, Raheeb Hassan, Ryozo Masukawa, Sungheon Jeong, Mohsen Imani
- Abstract summary: We introduce HopFormer, a graph Transformer that injects structure exclusively through head-specific n-hop masked sparse attention. We show that our approach achieves competitive or superior performance across diverse graph structures.
- Score: 7.178718630094309
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Transformers typically rely on explicit positional or structural encodings and dense global attention to incorporate graph topology. In this work, we show that neither is essential. We introduce HopFormer, a graph Transformer that injects structure exclusively through head-specific n-hop masked sparse attention, without the use of positional encodings or architectural modifications. This design provides explicit and interpretable control over receptive fields while enabling genuinely sparse attention whose computational cost scales linearly with mask sparsity. Through extensive experiments on both node-level and graph-level benchmarks, we demonstrate that our approach achieves competitive or superior performance across diverse graph structures. Our results further reveal that dense global attention is often unnecessary: on graphs with strong small-world properties, localized attention yields more stable and consistently high performance, while on graphs with weaker small-world effects, global attention offers diminishing returns. Together, these findings challenge prevailing assumptions in graph Transformer design and highlight sparsity-controlled attention as a principled and efficient alternative.
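To make the mechanism concrete, here is a minimal sketch of head-specific n-hop masked attention, assuming a small dense boolean adjacency matrix. The helper names (`hop_masks`, `masked_attention`) and the dense masked-softmax formulation are illustrative only; per the abstract, the paper's implementation instead exploits mask sparsity so that cost scales with the number of unmasked entries rather than materializing dense N x N score matrices.

```python
import torch
import torch.nn.functional as F

def hop_masks(adj: torch.Tensor, hops: list[int]) -> torch.Tensor:
    """Per-head boolean n-hop reachability masks.

    adj:  (N, N) boolean adjacency of an undirected graph.
    hops: receptive-field radius assigned to each head (all >= 1).
    Returns (H, N, N): mask[h, i, j] is True iff node j is within
    hops[h] steps of node i (self-loops included).
    """
    n = adj.size(0)
    frontier = adj | torch.eye(n, dtype=torch.bool)  # within 1 hop
    per_head = {}
    for k in range(1, max(hops) + 1):
        if k > 1:  # grow reachability by one more hop
            frontier = ((frontier.float() @ adj.float()) > 0) | frontier
        for h, r in enumerate(hops):
            if r == k:
                per_head[h] = frontier.clone()
    return torch.stack([per_head[h] for h in range(len(hops))])

def masked_attention(q, k, v, mask):
    """Softmax attention with per-head boolean masks.
    q, k, v: (H, N, d); mask: (H, N, N)."""
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

# Toy usage: four heads with 1-, 2-, 2-, and 3-hop receptive fields.
adj = torch.rand(8, 8) < 0.3
adj = (adj | adj.T) & ~torch.eye(8, dtype=torch.bool)
mask = hop_masks(adj, [1, 2, 2, 3])
q = k = v = torch.randn(4, 8, 16)
out = masked_attention(q, k, v, mask)  # (4, 8, 16)
```

Because self-loops are included, every row of every mask has at least one unmasked entry, so the softmax is always well defined.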
Related papers
- Attention Beyond Neighborhoods: Reviving Transformer for Graph Clustering [21.941792185132996]
Attentive Graph Clustering Network (AGCN) is a novel architecture built on reinterpreting the graph itself as attention. AGCN embeds the attention mechanism into the graph structure, enabling effective global information extraction. The framework includes a theoretical analysis contrasting AGCN's behavior with that of Graph Neural Networks (GNNs) and Transformers.
arXiv Detail & Related papers (2025-09-18T14:51:13Z)
- Graph Positional Autoencoders as Self-supervised Learners [42.78083704462157]
Graph autoencoders (GAEs) take incomplete graphs as input and predict missing elements, such as masked nodes or edges. We propose Graph Positional Autoencoders (GraphPAE), which employs a dual-path architecture to reconstruct both node features and positions. We conduct extensive experiments to verify the effectiveness of GraphPAE, including heterophilic node classification, graph property prediction, and transfer learning.
arXiv Detail & Related papers (2025-05-29T11:10:11Z)
- SGFormer: Single-Layer Graph Transformers with Approximation-Free Linear Complexity [74.51827323742506]
We evaluate the necessity of adopting multi-layer attention in Transformers on graphs.
We show that multi-layer propagation can be reduced to one-layer propagation with the same capability for representation learning.
This suggests a new technical path for building powerful and efficient Transformers on graphs.
arXiv Detail & Related papers (2024-09-13T17:37:34Z)
- What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding [67.59552859593985]
Graph Transformers, which incorporate self-attention and positional encoding, have emerged as a powerful architecture for various graph learning tasks.
This paper introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised classification.
arXiv Detail & Related papers (2024-06-04T05:30:16Z)
- Masked Graph Transformer for Large-Scale Recommendation [56.37903431721977]
We propose an efficient Masked Graph Transformer, named MGFormer, capable of capturing all-pair interactions among nodes with linear complexity (a generic kernelized-attention sketch follows this entry).
Experimental results show the superior performance of our MGFormer, even with a single attention layer.
arXiv Detail & Related papers (2024-05-07T06:00:47Z)
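The linear-complexity claim is easiest to see with the standard kernelized-attention identity, sketched below: computing phi(K)^T V first turns the O(N^2) all-pair product into O(N d^2). The elu+1 feature map is a common stand-in (Katharopoulos et al., 2020), used here as an assumption; MGFormer's specific masked formulation may differ.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps: float = 1e-6):
    """All-pair attention in O(N d^2) instead of O(N^2 d).

    q, k, v: (N, d). A positive feature map phi lets us regroup the
    product as phi(Q) (phi(K)^T V), so no N x N matrix is ever formed.
    """
    phi_q = F.elu(q) + 1  # positive features, a common kernel choice
    phi_k = F.elu(k) + 1
    kv = phi_k.T @ v                                     # (d, d) summary
    z = phi_q @ phi_k.sum(dim=0, keepdim=True).T + eps   # (N, 1) normalizer
    return (phi_q @ kv) / z
```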
- Graph Transformers without Positional Encodings [0.7252027234425334]
We introduce Eigenformer, a Graph Transformer employing a novel spectrum-aware attention mechanism cognizant of the Laplacian spectrum of the graph (a hedged spectral-filter sketch follows this entry).
We empirically show that it achieves performance competitive with SOTA Graph Transformers on a number of standard GNN benchmarks.
arXiv Detail & Related papers (2024-01-31T12:33:31Z)
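As a rough illustration of what "spectrum-aware" can mean, the sketch below derives an additive attention bias from a learned polynomial filter over the eigenvalues of the normalized Laplacian. This is a generic spectral-filter construction under our own assumptions, not necessarily Eigenformer's exact mechanism.

```python
import torch

def spectral_attention_bias(adj: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """Attention bias sum_k phi(lambda_k) u_k u_k^T from a learned
    polynomial filter phi over the normalized Laplacian spectrum.

    adj:   (N, N) dense float adjacency of an undirected graph.
    theta: (P,) polynomial coefficients of phi (learnable).
    """
    deg = adj.sum(dim=-1).clamp(min=1)                # guard isolated nodes
    d_inv_sqrt = deg.pow(-0.5)
    lap = torch.eye(adj.size(0)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    evals, evecs = torch.linalg.eigh(lap)             # spectrum lies in [0, 2]
    phi = sum(t * evals.pow(i) for i, t in enumerate(theta))
    return evecs @ torch.diag(phi) @ evecs.T          # (N, N) bias for logits
```

Such a bias would be added to the attention logits before the softmax, letting each head weight node pairs by spectral affinity rather than by positional encodings.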
- Graph Transformers for Large Graphs [57.19338459218758]
This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints.
A key innovation of this work lies in a fast neighborhood sampling technique coupled with a local attention mechanism (a generic sampling sketch follows this entry).
We report a 3x speedup and a 16.8% performance gain on ogbn-products and snap-patents, and we scale LargeGT to ogbn-papers100M with a 5.9% performance improvement.
arXiv Detail & Related papers (2023-12-18T11:19:23Z)
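The blurb pairs neighborhood sampling with local attention; the sketch below shows only the generic, GraphSAGE-style uniform sampler such pipelines typically start from. The function name and COO edge layout are our assumptions, and LargeGT's actual sampler is faster and more elaborate.

```python
import torch

def sample_neighbors(edge_index: torch.Tensor, seeds: torch.Tensor, fanout: int):
    """Uniformly subsample each seed's out-neighborhood to `fanout` edges.

    edge_index: (2, E) COO edges (src, dst); seeds: (B,) node ids.
    Returns a (2, E') edge list restricted to the sampled neighborhoods,
    over which a local attention layer can then be run.
    """
    src, dst = edge_index
    keep_src, keep_dst = [], []
    for s in seeds.tolist():
        nbrs = dst[src == s]
        if nbrs.numel() > fanout:                     # cap large neighborhoods
            nbrs = nbrs[torch.randperm(nbrs.numel())[:fanout]]
        keep_src.append(torch.full((nbrs.numel(),), s, dtype=torch.long))
        keep_dst.append(nbrs)
    return torch.stack([torch.cat(keep_src), torch.cat(keep_dst)])
```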
- SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that one-layer attention can deliver surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), empowered by a simple attention model (an illustrative one-layer sketch follows this entry).
We believe this methodology alone opens a new technical path of independent interest for building Transformers on large graphs.
arXiv Detail & Related papers (2023-06-19T08:03:25Z)
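A hedged sketch of what a "simplified" single layer can look like: one softmax-free global attention pass blended with one hop of graph propagation. The normalization and the blending weight `alpha` are illustrative choices of ours, not SGFormer's exact design.

```python
import torch

def simple_global_layer(x, adj_norm, wq, wk, wv, alpha: float = 0.5):
    """One global-attention pass plus one local propagation step.

    x: (N, d) node features; adj_norm: (N, N) normalized adjacency;
    wq, wk, wv: (d, d) projection matrices.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    q = q / q.norm(dim=-1, keepdim=True).clamp(min=1e-6)  # stabilize scores
    k = k / k.norm(dim=-1, keepdim=True).clamp(min=1e-6)
    global_part = (q @ (k.T @ v)) / x.size(0)   # softmax-free, O(N d^2)
    local_part = adj_norm @ v                   # one hop of propagation
    return alpha * global_part + (1 - alpha) * local_part
```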
- Are More Layers Beneficial to Graph Transformers? [97.05661983225603]
Current graph Transformers struggle to improve performance by increasing depth.
Deep graph transformers are limited by the vanishing capacity of global attention.
We propose a novel graph transformer model named DeepGraph that explicitly employs substructure tokens in the encoded representation.
arXiv Detail & Related papers (2023-03-01T15:22:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences arising from its use.