Investigating Expressiveness of Transformer in Spectral Domain for Graphs
- URL: http://arxiv.org/abs/2201.09332v2
- Date: Thu, 27 Jan 2022 14:48:00 GMT
- Title: Investigating Expressiveness of Transformer in Spectral Domain for Graphs
- Authors: Anson Bastos, Abhishek Nadgeri, Kuldeep Singh, Hiroki Kanezashi, Toyotaro Suzumura, Isaiah Onando Mulang'
- Abstract summary: We study and prove the link between the spatial and spectral domains in the realm of the transformer.
We propose FeTA, a framework that performs attention over the entire graph spectrum, analogous to attention in spatial space.
- Score: 6.092217185687028
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformers have been proven to be inadequate for graph representation
learning. To understand this inadequacy, there is a need to investigate whether
spectral analysis of the transformer reveals insights into its expressive power.
Similar studies have already established that spectral analysis of graph neural
networks (GNNs) provides extra perspectives on their expressiveness. In this
work, we systematically study and prove the link between the spatial and
spectral domains in the realm of the transformer. We further provide a
theoretical analysis showing that the spatial attention mechanism in the
transformer cannot effectively capture the desired frequency response, thus
inherently limiting its expressiveness in spectral space. Therefore, we propose
FeTA, a framework that performs attention over the entire graph spectrum,
analogous to attention in spatial space. Empirical results suggest that FeTA
provides consistent performance gains over the vanilla transformer across all
tasks on standard benchmarks and can easily be extended to GNN-based models
with low-pass characteristics (e.g., GAT). Furthermore, replacing the vanilla
transformer with FeTA in recently proposed position encoding schemes yields
performance comparable to or better than transformer and GNN baselines.
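To make the core idea concrete, here is a minimal NumPy sketch of attention over
the graph spectrum: decompose the normalized Laplacian, let each head apply a
frequency response g(λ) to the node features in the spectral basis, and map
back. This is an illustrative reading of the FeTA idea, not the authors' code;
the function name, the polynomial filter parameterization, and the random
stand-in parameters are all assumptions.

```python
import numpy as np

def spectral_filter_attention(adj, x, num_heads=4, rng=None):
    """Illustrative sketch of attention over the graph spectrum
    (a FeTA-like idea, not the paper's implementation)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    d_inv_sqrt = deg ** -0.5                      # assumes no isolated nodes
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(lap)            # full spectrum of the normalized Laplacian
    x_hat = evecs.T @ x                           # graph Fourier transform of node features
    heads = []
    for _ in range(num_heads):
        theta = rng.normal(size=3)                # stand-in for per-head learned filter params
        g = theta[0] + theta[1] * evals + theta[2] * evals ** 2  # frequency response g(lambda)
        heads.append(evecs @ (g[:, None] * x_hat))  # filter in spectral space, transform back
    return np.concatenate(heads, axis=-1)         # concatenate heads, as in multi-head attention

# Toy usage: a 3-node path graph with identity features.
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
out = spectral_filter_attention(A, np.eye(3))     # shape (3, 3 * num_heads)
```

The point of the paper's analysis is that purely spatial attention cannot
realize arbitrary responses g(λ), whereas an explicit spectral parameterization
like the sketch above can express band-pass or high-pass behavior as well.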
Related papers
- Spectral Invariant Learning for Dynamic Graphs under Distribution Shifts [57.19908334882441]
Dynamic graph neural networks (DyGNNs) currently struggle with handling distribution shifts that are inherent in dynamic graphs.
We propose to study distribution shifts on dynamic graphs in the spectral domain for the first time.
arXiv Detail & Related papers (2024-03-08T04:07:23Z)
- HoloNets: Spectral Convolutions do extend to Directed Graphs [59.851175771106625]
Conventional wisdom dictates that spectral convolutional networks may only be deployed on undirected graphs.
Here we show this traditional reliance on the graph Fourier transform to be superfluous.
We provide a frequency-response interpretation of newly developed filters, investigate the influence of the basis used to express filters and discuss the interplay with characteristic operators on which networks are based.
arXiv Detail & Related papers (2023-10-03T17:42:09Z)
- FactoFormer: Factorized Hyperspectral Transformers with Self-Supervised Pretraining [36.44039681893334]
Hyperspectral images (HSIs) contain rich spectral and spatial information.
Current state-of-the-art hyperspectral transformers only tokenize the input HSI sample along the spectral dimension.
We propose a novel factorized spectral-spatial transformer that incorporates factorized self-supervised pretraining procedures.
arXiv Detail & Related papers (2023-09-18T02:05:52Z)
- Are More Layers Beneficial to Graph Transformers? [97.05661983225603]
Current graph transformers suffer from the bottleneck of improving performance by increasing depth.
Deep graph transformers are limited by the vanishing capacity of global attention.
We propose a novel graph transformer model named DeepGraph that explicitly employs substructure tokens in the encoded representation.
arXiv Detail & Related papers (2023-03-01T15:22:40Z)
- Stable and Transferable Hyper-Graph Neural Networks [95.07035704188984]
We introduce an architecture for processing signals supported on hypergraphs via graph neural networks (GNNs).
We provide a framework for bounding the stability and transferability error of GNNs across arbitrary graphs via spectral similarity.
arXiv Detail & Related papers (2022-11-11T23:44:20Z)
- FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain.
Experiments on three benchmarks demonstrate strong performance, outperforming state-of-the-art methods by margins of 3%, 4%, and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z)
- Gophormer: Ego-Graph Transformer for Node Classification [27.491500255498845]
In this paper, we propose a novel Gophormer model, which applies transformers on ego-graphs instead of full graphs.
Specifically, a Node2Seq module is proposed to sample ego-graphs as transformer inputs, which alleviates the scalability challenge.
In order to handle the uncertainty introduced by the ego-graph sampling, we propose a consistency regularization and a multi-sample inference strategy.
arXiv Detail & Related papers (2021-10-25T16:43:32Z)
- SDA-GAN: Unsupervised Image Translation Using Spectral Domain Attention-Guided Generative Adversarial Network [0.0]
This work introduces a novel GAN architecture for unsupervised image translation on the task of face style transfer.
A spectral attention-based mechanism is embedded into the design along with spatial attention on the image contents.
arXiv Detail & Related papers (2021-10-06T15:59:43Z)
- SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers [91.09957836250209]
Hyperspectral (HS) images are characterized by approximately contiguous spectral information.
CNNs have been proven to be a powerful feature extractor in HS image classification.
We propose a novel backbone network called SpectralFormer for HS image classification.
arXiv Detail & Related papers (2021-07-07T02:59:21Z)
- Rethinking Graph Transformers with Spectral Attention [13.068288784805901]
We present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) to learn the position of each node in a given graph.
By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance.
Our model performs on par or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin.
arXiv Detail & Related papers (2021-06-07T18:11:11Z)
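As context for the SAN entry above, here is a hedged NumPy sketch of the
spectral ingredient a Laplacian-based LPE starts from: each node's coordinates
in the low-frequency eigenvectors of the normalized Laplacian serve as its
position. The function name and the sign-fixing convention are illustrative
assumptions; SAN additionally processes the eigenvalue/eigenvector pairs with a
learned encoder, which this sketch omits.

```python
import numpy as np

def laplacian_positional_encoding(adj, k=8):
    """Sketch of Laplacian-eigenvector positional encodings, the spectral
    ingredient SAN's learned positional encoding (LPE) builds on.
    Illustrative only; k should be smaller than the number of nodes."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    d_inv_sqrt = deg ** -0.5                      # assumes no isolated nodes
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(lap)
    pe = evecs[:, 1:k + 1]                        # k smallest nontrivial eigenvectors
    # Eigenvector signs are arbitrary; training-time implementations often
    # flip signs at random for invariance. Here we fix them deterministically.
    pe = pe * np.sign(pe[0:1, :] + 1e-12)
    return pe, evals[1:k + 1]                     # per-node positions + their frequencies
```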