Scaling Graph Transformers: A Comparative Study of Sparse and Dense Attention
- URL: http://arxiv.org/abs/2508.17175v1
- Date: Sun, 24 Aug 2025 01:12:59 GMT
- Title: Scaling Graph Transformers: A Comparative Study of Sparse and Dense Attention
- Authors: Leon Dimitrov
- Abstract summary: Graphs have become a central representation in machine learning for capturing structured data across various domains. Traditional graph neural networks often struggle to capture long-range dependencies between nodes. Graph transformers overcome this by using attention mechanisms that allow nodes to exchange information globally. We compare the two attention mechanisms used in graph transformers, dense and sparse, analyze their trade-offs, and highlight when to use each.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graphs have become a central representation in machine learning for capturing relational and structured data across various domains. Traditional graph neural networks often struggle to capture long-range dependencies between nodes due to their local structure. Graph transformers overcome this by using attention mechanisms that allow nodes to exchange information globally. However, there are two types of attention in graph transformers: dense and sparse. In this paper, we compare these two attention mechanisms, analyze their trade-offs, and highlight when to use each. We also outline current challenges and problems in designing attention for graph transformers.
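The dense-vs-sparse distinction the abstract draws can be illustrated with a minimal NumPy sketch (not taken from the paper): dense attention lets every node attend to every other node, while sparse attention masks the score matrix, for example with the graph's adjacency structure. The function names, the random features, and the path-graph example are all illustrative assumptions.

```python
import numpy as np

def attention(Q, K, V, mask=None):
    """Scaled dot-product attention. mask=None gives dense (all-pairs)
    attention; a boolean mask restricts which node pairs may attend
    to each other (sparse attention)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)  # block disallowed pairs
    # Numerically stable softmax over each row of scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.standard_normal((n, d))  # node features used as Q, K, and V

# Dense: every node attends to every other node.
dense_out = attention(X, X, X)

# Sparse: restrict attention to graph edges plus self-loops,
# here for a simple path graph 0-1-2-3.
A = np.eye(n, dtype=bool)
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = True
sparse_out = attention(X, X, X, mask=A)
```

Masking before the softmax (rather than zeroing weights afterwards) keeps each row a proper probability distribution over the allowed neighbours.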
Related papers
- HopFormer: Sparse Graph Transformers with Explicit Receptive Field Control [7.178718630094309]
We introduce HopFormer, a graph Transformer that injects structure exclusively through head-specific n-hop masked sparse attention. We show that our approach achieves competitive or superior performance across diverse graph structures.
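As a hypothetical sketch of the n-hop masking idea summarized above (not HopFormer's actual implementation), one can build a boolean reachability mask from powers of the adjacency matrix and use it to restrict attention to an n-hop neighbourhood; `n_hop_mask` and the path-graph example are illustrative assumptions.

```python
import numpy as np

def n_hop_mask(A, n):
    """Boolean mask: True where a walk of length <= n exists
    (including the trivial self walk)."""
    reach = np.eye(len(A), dtype=bool)
    step = np.eye(len(A), dtype=int)
    for _ in range(n):
        step = step @ A            # walks of exactly one more hop
        reach |= step > 0
    return reach

# Path graph 0-1-2-3-4, as an integer adjacency matrix.
N = 5
A = np.zeros((N, N), dtype=int)
for i in range(N - 1):
    A[i, i + 1] = A[i + 1, i] = 1

m1 = n_hop_mask(A, 1)  # immediate neighbours plus self
m2 = n_hop_mask(A, 2)  # 2-hop neighbourhood
```

A head assigned hop count n would then apply its mask to the attention scores, so each head sees a different, explicitly controlled receptive field.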
arXiv Detail & Related papers (2026-02-02T16:09:58Z)
- A Survey of Graph Transformers: Architectures, Theories and Applications [54.561539625830186]
Recent studies have proposed diverse architectures, enhanced explainability, and practical applications for Graph Transformers. We categorize the architectures of Graph Transformers according to their strategies for processing structural information. We provide a summary of the practical applications where Graph Transformers have been utilized, such as molecule, protein, language, vision, traffic, brain and material data.
arXiv Detail & Related papers (2025-02-23T10:55:19Z)
- Towards Mechanistic Interpretability of Graph Transformers via Attention Graphs [16.249474010042736]
We introduce Attention Graphs, a new tool for mechanistic interpretability of Graph Neural Networks (GNNs) and Graph Transformers. Attention Graphs aggregate attention matrices across Transformer layers and heads to describe how information flows among input nodes.
arXiv Detail & Related papers (2025-02-17T22:35:16Z)
- A Survey on Structure-Preserving Graph Transformers [2.5252594834159643]
We provide a comprehensive overview of structure-preserving graph transformers and generalize these methods from the perspective of their design objective.
We also discuss challenges and future directions for graph transformer models to preserve the graph structure and understand the nature of graphs.
arXiv Detail & Related papers (2024-01-29T14:18:09Z)
- Gramformer: Learning Crowd Counting via Graph-Modulated Transformer [68.26599222077466]
Gramformer is a graph-modulated transformer that enhances the network by adjusting both the attention and the input node features.
A feature-based encoding is proposed to discover the centrality positions or importance of nodes.
Experiments on four challenging crowd counting datasets have validated the competitiveness of the proposed method.
arXiv Detail & Related papers (2024-01-08T13:01:54Z)
- SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that one-layer attention can deliver surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone enlightens a new technical path of independent interest for building Transformers on large graphs.
arXiv Detail & Related papers (2023-06-19T08:03:25Z)
- Graph Transformer GANs for Graph-Constrained House Generation [223.739067413952]
We present a novel graph Transformer generative adversarial network (GTGAN) that learns effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.
arXiv Detail & Related papers (2023-03-14T20:35:45Z)
- Are More Layers Beneficial to Graph Transformers? [97.05661983225603]
Current graph transformers face a bottleneck when trying to improve performance by increasing depth.
Deep graph transformers are limited by the vanishing capacity of global attention.
We propose a novel graph transformer model named DeepGraph that explicitly employs substructure tokens in the encoded representation.
arXiv Detail & Related papers (2023-03-01T15:22:40Z)
- Transformers over Directed Acyclic Graphs [6.263470141349622]
We study transformers over directed acyclic graphs (DAGs) and propose architecture adaptations tailored to DAGs.
We show that it makes graph transformers generally outperform graph neural networks tailored to DAGs and improves SOTA graph transformer performance in terms of both quality and efficiency.
arXiv Detail & Related papers (2022-10-24T12:04:52Z)
- Graph Reasoning Transformer for Image Parsing [67.76633142645284]
We propose a novel Graph Reasoning Transformer (GReaT) for image parsing to enable image patches to interact following a relation reasoning pattern.
Compared to the conventional transformer, GReaT has higher interaction efficiency and a more purposeful interaction pattern.
Results show that GReaT achieves consistent performance gains with slight computational overhead over state-of-the-art transformer baselines.
arXiv Detail & Related papers (2022-09-20T08:21:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.