A Generalization of Transformer Networks to Graphs
- URL: http://arxiv.org/abs/2012.09699v2
- Date: Sun, 24 Jan 2021 09:38:54 GMT
- Title: A Generalization of Transformer Networks to Graphs
- Authors: Vijay Prakash Dwivedi, Xavier Bresson
- Abstract summary: We introduce a graph transformer with four new properties compared to the standard model.
The architecture is extended to edge feature representation, which can be critical to tasks such as chemistry (bond type) or link prediction (entity relationships in knowledge graphs).
- Score: 5.736353542430439
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a generalization of transformer neural network architecture for
arbitrary graphs. The original transformer was designed for Natural Language
Processing (NLP) and operates on fully connected graphs representing all
connections between the words in a sequence. Such an architecture does not
leverage the graph connectivity inductive bias, and can perform poorly when the
graph topology is important and has not been encoded into the node features. We
introduce a graph transformer with four new properties compared to the standard
model. First, the attention mechanism is a function of the neighborhood
connectivity for each node in the graph. Second, the positional encoding is
represented by the Laplacian eigenvectors, which naturally generalize the
sinusoidal positional encodings often used in NLP. Third, the layer
normalization is replaced by a batch normalization layer, which provides faster
training and better generalization performance. Finally, the architecture is
extended to edge feature representation, which can be critical to tasks such as
chemistry (bond type) or link prediction (entity relationships in knowledge
graphs). Numerical experiments on a graph benchmark demonstrate the performance
of the proposed graph transformer architecture. This work closes the gap
between the original transformer, which was designed for the limited case of
line graphs, and graph neural networks, which can work with arbitrary graphs. As
our architecture is simple and generic, we believe it can be used as a black
box for future applications that wish to combine transformers and graphs.
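Two of the four properties above lend themselves to a compact illustration. The following is a minimal sketch in plain NumPy/PyTorch, not the authors' released code: all names, shapes, and hyperparameters are illustrative assumptions. It shows (1) attention restricted to each node's neighborhood and (2) Laplacian-eigenvector positional encodings, with batch normalization on the layer output as in property three.

```python
# Minimal sketch, not the reference implementation; names and shapes are assumptions.
import numpy as np
import torch
import torch.nn as nn

def laplacian_pe(adj: np.ndarray, k: int) -> np.ndarray:
    """k smallest non-trivial eigenvectors of the symmetric normalized
    Laplacian L = I - D^{-1/2} A D^{-1/2}, used as node positional encodings."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg, dtype=float)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)   # columns sorted by ascending eigenvalue
    # Drop the trivial constant eigenvector; eigenvector signs are arbitrary,
    # which is why the paper randomly flips them during training.
    return eigvecs[:, 1:k + 1]

class NeighborhoodAttentionLayer(nn.Module):
    """Single-head attention computed only over existing edges (property 1),
    followed by batch normalization instead of layer normalization (property 3)."""
    def __init__(self, dim: int):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(dim, dim) for _ in range(3))
        self.bn = nn.BatchNorm1d(dim)
        self.scale = dim ** -0.5

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: (n, dim) node features (PEs already added); adj: (n, n) adjacency
        # with self-loops, so every row attends to at least one node.
        scores = (self.q(h) @ self.k(h).T) * self.scale
        scores = scores.masked_fill(adj == 0, float("-inf"))
        out = torch.softmax(scores, dim=-1) @ self.v(h)
        return self.bn(h + out)            # residual connection + batch norm
```

In use, a linear projection of laplacian_pe(adj, k) would be added to the initial node embeddings before the first layer, mirroring how sinusoidal PEs are added in NLP transformers; edge features (property 4) would further modulate each attention score with a learned representation of the corresponding edge before the softmax.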
Related papers
- What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding [67.59552859593985]
Graph Transformers, which incorporate self-attention and positional encoding, have emerged as a powerful architecture for various graph learning tasks.
This paper introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised classification.
arXiv Detail & Related papers (2024-06-04T05:30:16Z) - Graph as Point Set [31.448841287258116]
This paper introduces a novel graph-to-set conversion method that transforms interconnected nodes into a set of independent points.
It enables using set encoders to learn from graphs, thereby significantly expanding the design space of Graph Neural Networks.
To demonstrate the effectiveness of our approach, we introduce Point Set Transformer (PST), a transformer architecture that accepts a point set converted from a graph as input.
arXiv Detail & Related papers (2024-05-05T02:29:41Z) - Graph Transformers without Positional Encodings [0.7252027234425334]
We introduce Eigenformer, a Graph Transformer employing a novel spectrum-aware attention mechanism cognizant of the Laplacian spectrum of the graph.
We empirically show that it achieves performance competitive with SOTA Graph Transformers on a number of standard GNN benchmarks.
arXiv Detail & Related papers (2024-01-31T12:33:31Z) - Graph Transformer GANs with Graph Masked Modeling for Architectural Layout Generation [153.92387500677023]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations.
The proposed graph Transformer encoder combines graph convolutions and self-attention in a Transformer to model both local and global interactions.
We also propose a novel self-guided pre-training method for graph representation learning.
arXiv Detail & Related papers (2024-01-15T14:36:38Z) - Enhancing Graph Transformers with Hierarchical Distance Structural Encoding [9.070055955084364]
This paper presents a Hierarchical Distance Structural Encoding (HDSE) method to model node distances in a graph.
We introduce a novel framework to seamlessly integrate HDSE into the attention mechanism of existing graph transformers.
We demonstrate that graph transformers with HDSE excel at graph classification and regression on 7 graph-level datasets, and at node classification on 11 large-scale graphs.
arXiv Detail & Related papers (2023-08-22T02:22:34Z) - Graph Transformer GANs for Graph-Constrained House Generation [223.739067413952]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations.
The GTGAN learns effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task.
arXiv Detail & Related papers (2023-03-14T20:35:45Z) - Transformers over Directed Acyclic Graphs [6.263470141349622]
We study transformers over directed acyclic graphs (DAGs) and propose architecture adaptations tailored to DAGs.
We show that these adaptations make graph transformers generally outperform graph neural networks tailored to DAGs, and improve SOTA graph transformer performance in terms of both quality and efficiency.
arXiv Detail & Related papers (2022-10-24T12:04:52Z) - Pure Transformers are Powerful Graph Learners [51.36884247453605]
We show that standard Transformers without graph-specific modifications can lead to promising results in graph learning both in theory and practice.
We prove that this approach is theoretically at least as expressive as an invariant graph network (2-IGN) composed of equivariant linear layers.
Our method, coined Tokenized Graph Transformer (TokenGT), achieves significantly better results than GNN baselines and competitive results against Transformer variants with sophisticated graph-specific inductive biases.
arXiv Detail & Related papers (2022-07-06T08:13:06Z) - Graph Neural Networks with Learnable Structural and Positional Representations [83.24058411666483]
A major issue with arbitrary graphs is the absence of canonical positional information of nodes.
We introduce positional encodings (PE) of nodes and inject them into the input layer, as in Transformers.
We observe a performance increase on molecular datasets, from 2.87% up to 64.14%, when considering learnable PE for both GNN classes.
arXiv Detail & Related papers (2021-10-15T05:59:15Z) - Do Transformers Really Perform Bad for Graph Representation? [62.68420868623308]
We present Graphormer, which is built upon the standard Transformer architecture.
Our key insight for utilizing Transformers on graphs is the necessity of effectively encoding the structural information of a graph into the model.
arXiv Detail & Related papers (2021-06-09T17:18:52Z) - Rethinking Graph Transformers with Spectral Attention [13.068288784805901]
We present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) to learn the position of each node in a given graph.
By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance.
Our model performs on par with or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin; a hedged sketch of the LPE idea follows this list.
arXiv Detail & Related papers (2021-06-07T18:11:11Z)
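For contrast with the fixed Laplacian PEs sketched earlier, here is a hedged sketch of the learned-PE idea in the SAN entry, based only on its abstract rather than its released code: each node is summarized by its sequence of (eigenvalue, eigenvector entry) pairs, which a small Transformer encodes and pools into a positional vector. All module names and dimensions are assumptions.

```python
# Hedged sketch of a learned positional encoding over Laplacian eigenpairs,
# inferred from the SAN abstract; names and dimensions are assumptions.
import torch
import torch.nn as nn

class LearnedSpectralPE(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.embed = nn.Linear(2, dim)  # embed each (eigenvalue, entry) pair
        # dim must be divisible by nhead; both values are illustrative.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, eigvals: torch.Tensor, eigvecs: torch.Tensor) -> torch.Tensor:
        # eigvals: (m,) lowest Laplacian eigenvalues; eigvecs: (n, m) columns are
        # the matching eigenvectors, one entry per node.
        n, m = eigvecs.shape
        pairs = torch.stack([eigvals.expand(n, m), eigvecs], dim=-1)  # (n, m, 2)
        tokens = self.embed(pairs)                                    # (n, m, dim)
        return self.encoder(tokens).sum(dim=1)                        # (n, dim) PE per node
```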