Investigating Expressiveness of Transformer in Spectral Domain for Graphs
- URL: http://arxiv.org/abs/2201.09332v2
- Date: Thu, 27 Jan 2022 14:48:00 GMT
- Title: Investigating Expressiveness of Transformer in Spectral Domain for Graphs
- Authors: Anson Bastos, Abhishek Nadgeri, Kuldeep Singh, Hiroki Kanezashi, Toyotaro Suzumura, Isaiah Onando Mulang'
- Abstract summary: We study and prove the link between the spatial and spectral domains in the realm of the transformer.
We propose FeTA, a framework that performs attention over the entire graph spectrum, analogous to attention in spatial space.
- Score: 6.092217185687028
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transformers have been proven to be inadequate for graph representation
learning. To understand this inadequacy, there is a need to investigate whether
spectral analysis of the transformer reveals insights into its expressive power.
Similar studies have already established that spectral analysis of graph neural
networks (GNNs) provides extra perspectives on their expressiveness. In this
work, we systematically study and prove the link between the spatial and
spectral domains in the realm of the transformer. We further provide a
theoretical analysis showing that the spatial attention mechanism in the
transformer cannot effectively capture the desired frequency response, thus
inherently limiting its expressiveness in spectral space. Therefore, we propose
FeTA, a framework that performs attention over the entire graph spectrum,
analogous to attention in spatial space. Empirical results suggest that FeTA
provides consistent performance gains over the vanilla transformer across all
tasks on standard benchmarks and can easily be extended to GNN-based models
with low-pass characteristics (e.g., GAT). Furthermore, replacing the vanilla
transformer with FeTA in recently proposed position encoding schemes yields
performance comparable to or better than transformer and GNN baselines.
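To make the core idea concrete, here is a minimal NumPy sketch of attention over
the graph spectrum: decompose the normalized Laplacian, let each head apply a
frequency response g(λ) to the node features in the spectral basis, and map
back. This is an illustrative reading of the FeTA idea, not the authors' code;
the function name, the polynomial filter parameterization, and the random
stand-in parameters are all assumptions.

```python
import numpy as np

def spectral_filter_attention(adj, x, num_heads=4, rng=None):
    """Illustrative sketch of attention over the graph spectrum
    (a FeTA-like idea, not the paper's implementation)."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    d_inv_sqrt = deg ** -0.5                      # assumes no isolated nodes
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(lap)            # full spectrum of the normalized Laplacian
    x_hat = evecs.T @ x                           # graph Fourier transform of node features
    heads = []
    for _ in range(num_heads):
        theta = rng.normal(size=3)                # stand-in for per-head learned filter params
        g = theta[0] + theta[1] * evals + theta[2] * evals ** 2  # frequency response g(lambda)
        heads.append(evecs @ (g[:, None] * x_hat))  # filter in spectral space, transform back
    return np.concatenate(heads, axis=-1)         # concatenate heads, as in multi-head attention

# Toy usage: a 3-node path graph with identity features.
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
out = spectral_filter_attention(A, np.eye(3))     # shape (3, 3 * num_heads)
```

The point of the paper's analysis is that purely spatial attention cannot
realize arbitrary responses g(λ), whereas an explicit spectral parameterization
like the sketch above can express band-pass or high-pass behavior as well.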
Related papers
- Spectral Invariant Learning for Dynamic Graphs under Distribution Shifts [57.19908334882441]
Dynamic graph neural networks (DyGNNs) currently struggle with handling distribution shifts that are inherent in dynamic graphs.
We propose to study distribution shifts on dynamic graphs in the spectral domain for the first time.
arXiv Detail & Related papers (2024-03-08T04:07:23Z)
- HoloNets: Spectral Convolutions do extend to Directed Graphs [59.851175771106625]
Conventional wisdom dictates that spectral convolutional networks may only be deployed on undirected graphs.
Here we show this traditional reliance on the graph Fourier transform to be superfluous.
We provide a frequency-response interpretation of newly developed filters, investigate the influence of the basis used to express filters and discuss the interplay with characteristic operators on which networks are based.
arXiv Detail & Related papers (2023-10-03T17:42:09Z)
- FactoFormer: Factorized Hyperspectral Transformers with Self-Supervised Pretraining [36.44039681893334]
Hyperspectral images (HSIs) contain rich spectral and spatial information.
Current state-of-the-art hyperspectral transformers only tokenize the input HSI sample along the spectral dimension.
We propose a novel factorized spectral-spatial transformer that incorporates factorized self-supervised pretraining procedures.
arXiv Detail & Related papers (2023-09-18T02:05:52Z)
- Are More Layers Beneficial to Graph Transformers? [97.05661983225603]
Current graph transformers suffer from the bottleneck of improving performance by increasing depth.
Deep graph transformers are limited by the vanishing capacity of global attention.
We propose a novel graph transformer model named DeepGraph that explicitly employs substructure tokens in the encoded representation.
arXiv Detail & Related papers (2023-03-01T15:22:40Z)
- Stable and Transferable Hyper-Graph Neural Networks [95.07035704188984]
We introduce an architecture for processing signals supported on hypergraphs via graph neural networks (GNNs).
We provide a framework for bounding the stability and transferability error of GNNs across arbitrary graphs via spectral similarity.
arXiv Detail & Related papers (2022-11-11T23:44:20Z)
- FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain.
Experiments on three benchmarks demonstrate strong performance, outperforming state-of-the-art methods by margins of 3%, 4%, and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z)
- Gophormer: Ego-Graph Transformer for Node Classification [27.491500255498845]
In this paper, we propose a novel Gophormer model, which applies transformers on ego-graphs instead of full graphs.
Specifically, a Node2Seq module is proposed to sample ego-graphs as transformer inputs, which alleviates the scalability challenge.
In order to handle the uncertainty introduced by the ego-graph sampling, we propose a consistency regularization and a multi-sample inference strategy.
arXiv Detail & Related papers (2021-10-25T16:43:32Z)
- SDA-GAN: Unsupervised Image Translation Using Spectral Domain Attention-Guided Generative Adversarial Network [0.0]
This work introduces a novel GAN architecture for unsupervised image translation on the task of face style transfer.
A spectral attention-based mechanism is embedded into the design along with spatial attention on the image contents.
arXiv Detail & Related papers (2021-10-06T15:59:43Z)
- SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers [91.09957836250209]
Hyperspectral (HS) images are characterized by approximately contiguous spectral information.
CNNs have been proven to be a powerful feature extractor in HS image classification.
We propose a novel backbone network called SpectralFormer for HS image classification.
arXiv Detail & Related papers (2021-07-07T02:59:21Z)
- Rethinking Graph Transformers with Spectral Attention [13.068288784805901]
We present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) to learn the position of each node in a given graph.
By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance.
Our model performs on par or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin.
arXiv Detail & Related papers (2021-06-07T18:11:11Z)
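As context for the SAN entry above, here is a hedged NumPy sketch of the
spectral ingredient a Laplacian-based LPE starts from: each node's coordinates
in the low-frequency eigenvectors of the normalized Laplacian serve as its
position. The function name and the sign-fixing convention are illustrative
assumptions; SAN additionally processes the eigenvalue/eigenvector pairs with a
learned encoder, which this sketch omits.

```python
import numpy as np

def laplacian_positional_encoding(adj, k=8):
    """Sketch of Laplacian-eigenvector positional encodings, the spectral
    ingredient SAN's learned positional encoding (LPE) builds on.
    Illustrative only; k should be smaller than the number of nodes."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    d_inv_sqrt = deg ** -0.5                      # assumes no isolated nodes
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(lap)
    pe = evecs[:, 1:k + 1]                        # k smallest nontrivial eigenvectors
    # Eigenvector signs are arbitrary; training-time implementations often
    # flip signs at random for invariance. Here we fix them deterministically.
    pe = pe * np.sign(pe[0:1, :] + 1e-12)
    return pe, evals[1:k + 1]                     # per-node positions + their frequencies
```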