Curve Your Attention: Mixed-Curvature Transformers for Graph
Representation Learning
- URL: http://arxiv.org/abs/2309.04082v1
- Date: Fri, 8 Sep 2023 02:44:37 GMT
- Title: Curve Your Attention: Mixed-Curvature Transformers for Graph
Representation Learning
- Authors: Sungjun Cho, Seunghyuk Cho, Sungwoo Park, Hankook Lee, Honglak Lee,
Moontae Lee
- Abstract summary: We propose a generalization of Transformers towards operating entirely on the product of constant curvature spaces.
We also provide a kernelized approach to non-Euclidean attention, which enables our model to run with time and memory cost linear in the number of nodes and edges.
- Score: 77.1421343649344
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-world graphs naturally exhibit hierarchical or cyclical structures that
are unfit for the typical Euclidean space. While there exist graph neural
networks that leverage hyperbolic or spherical spaces to learn representations
that embed such structures more accurately, these methods are confined to the
message-passing paradigm, making the models vulnerable to side effects such as
oversmoothing and oversquashing. More recent work has proposed global
attention-based graph Transformers that can easily model long-range
interactions, but their extensions to non-Euclidean geometry remain unexplored.
To bridge this gap, we propose the Fully Product-Stereographic Transformer, a
generalization of Transformers that operates entirely on a product of constant
curvature spaces. When combined with tokenized graph Transformers, our model
can learn the curvature appropriate for the input graph in an end-to-end
fashion, without the need for additional tuning over different curvature
initializations. We also provide a kernelized approach to non-Euclidean
attention, which enables our model to run with time and memory cost linear in
the number of nodes and edges while respecting the underlying
geometry. Experiments on graph reconstruction and node classification
demonstrate the benefits of generalizing Transformers to the non-Euclidean
domain.
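To make the two ingredients of the abstract concrete, below is a minimal, illustrative sketch (not the authors' released code). It pairs a kappa-stereographic exponential/logarithmic map at the origin, with the curvature kappa as a learnable parameter, with a generic kernelized "linear" attention (the elu(.)+1 feature map of linear Transformers) applied in the tangent space. The module and function names (expmap0, logmap0, KernelizedStereographicAttention) are hypothetical, the sign-based branching is a simplification of the unified kappa-stereographic formulas, and the paper's actual non-Euclidean attention kernel may differ.

```python
# Illustrative sketch only, NOT the paper's implementation: a single-component
# constant-curvature space with learnable curvature, plus kernelized attention
# computed in the tangent space at the origin.
import torch
import torch.nn as nn
import torch.nn.functional as F


def tan_kappa(x, kappa, eps=1e-6):
    """Curvature-dependent tangent used by the exp map at the origin."""
    if kappa.abs() < eps:                      # (near-)Euclidean component
        return x
    if kappa < 0:                              # hyperbolic component
        s = (-kappa).sqrt()
        return torch.tanh(s * x) / s
    s = kappa.sqrt()                           # spherical component
    return torch.tan(s * x) / s


def arctan_kappa(x, kappa, eps=1e-6):
    """Inverse of tan_kappa, used by the log map at the origin."""
    if kappa.abs() < eps:
        return x
    if kappa < 0:
        s = (-kappa).sqrt()
        return torch.atanh((s * x).clamp(-1 + 1e-5, 1 - 1e-5)) / s
    s = kappa.sqrt()
    return torch.atan(s * x) / s


def expmap0(v, kappa):
    """Map tangent vectors at the origin onto the kappa-stereographic model."""
    n = v.norm(dim=-1, keepdim=True).clamp_min(1e-9)
    return tan_kappa(n, kappa) * v / n


def logmap0(y, kappa):
    """Map points on the manifold back to the tangent space at the origin."""
    n = y.norm(dim=-1, keepdim=True).clamp_min(1e-9)
    return arctan_kappa(n, kappa) * y / n


class KernelizedStereographicAttention(nn.Module):
    """One attention head on one constant-curvature component (hypothetical
    module name). Linear attention with the elu(.)+1 feature map gives
    O(N d^2) time and memory instead of O(N^2 d)."""

    def __init__(self, dim):
        super().__init__()
        self.kappa = nn.Parameter(torch.tensor(0.0))   # learnable curvature
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x):                      # x: (N, dim) points on the manifold
        t = logmap0(x, self.kappa)             # work in the tangent space
        q, k, v = self.q(t), self.k(t), self.v(t)
        phi_q = F.elu(q) + 1                   # positive kernel features
        phi_k = F.elu(k) + 1
        kv = torch.einsum("nd,ne->de", phi_k, v)        # d x d summary, linear in N
        z = torch.einsum("nd,d->n", phi_q, phi_k.sum(0)).clamp_min(1e-9)
        out = torch.einsum("nd,de->ne", phi_q, kv) / z.unsqueeze(-1)
        return expmap0(out, self.kappa)        # project back onto the manifold


if __name__ == "__main__":
    x = torch.randn(128, 16) * 0.1             # toy node features near the origin
    print(KernelizedStereographicAttention(16)(x).shape)   # torch.Size([128, 16])
```

The linear-attention trick replaces the N x N attention matrix with a d x d summary of keys and values, which is what keeps the cost linear in the number of nodes, while the learnable curvature lets gradient descent move each component between hyperbolic (kappa < 0), Euclidean (kappa near 0), and spherical (kappa > 0) geometry.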
Related papers
- SGFormer: Single-Layer Graph Transformers with Approximation-Free Linear Complexity [74.51827323742506]
We evaluate the necessity of adopting multi-layer attention in Transformers on graphs.
We show that multi-layer attention can be reduced to one-layer propagation, with the same capability for representation learning.
It suggests a new technical path for building powerful and efficient Transformers on graphs.
arXiv Detail & Related papers (2024-09-13T17:37:34Z) - Scalable Graph Compressed Convolutions [68.85227170390864]
We propose a differentiable method that applies permutations to calibrate input graphs for Euclidean convolution.
Based on the graph calibration, we propose the Compressed Convolution Network (CoCN) for hierarchical graph representation learning.
arXiv Detail & Related papers (2024-07-26T03:14:13Z) - What Improves the Generalization of Graph Transformers? A Theoretical Dive into the Self-attention and Positional Encoding [67.59552859593985]
Graph Transformers, which incorporate self-attention and positional encoding, have emerged as a powerful architecture for various graph learning tasks.
This paper introduces the first theoretical investigation of a shallow Graph Transformer for semi-supervised classification.
arXiv Detail & Related papers (2024-06-04T05:30:16Z) - Attending to Topological Spaces: The Cellular Transformer [37.84207797241944]
Topological Deep Learning seeks to enhance the predictive performance of neural network models by harnessing topological structures in input data.
We introduce the Cellular Transformer (CT), a novel architecture that generalizes graph-based transformers to cell complexes.
CT achieves state-of-the-art performance, and it does so without the need for more complex enhancements.
arXiv Detail & Related papers (2024-05-23T01:48:32Z) - SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that a single layer of attention can deliver surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone opens a new technical path of independent interest for building Transformers on large graphs.
arXiv Detail & Related papers (2023-06-19T08:03:25Z) - Deformable Graph Transformer [31.254872949603982]
We propose Deformable Graph Transformer (DGT) that performs sparse attention with dynamically sampled key and value pairs.
Experiments demonstrate that our novel graph Transformer consistently outperforms existing Transformer-based models.
arXiv Detail & Related papers (2022-06-29T00:23:25Z) - Rethinking Graph Transformers with Spectral Attention [13.068288784805901]
We present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) to learn the position of each node in a given graph.
By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance (a generic Laplacian positional-encoding sketch appears after this list).
Our model performs on par or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin.
arXiv Detail & Related papers (2021-06-07T18:11:11Z) - Self-Supervised Graph Representation Learning via Topology
Transformations [61.870882736758624]
We present the Topology Transformation Equivariant Representation learning, a general paradigm of self-supervised learning for node representations of graph data.
In experiments, we apply the proposed model to the downstream node and graph classification tasks, and results show that the proposed method outperforms the state-of-the-art unsupervised approaches.
arXiv Detail & Related papers (2021-05-25T06:11:03Z)
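As referenced in the SAN entry above, the sketch below shows the generic ingredient behind Laplacian-based positional encodings: eigenvectors of the symmetric normalized Laplacian used as per-node positional features, with random sign flips because eigenvector signs are arbitrary. It is not SAN's actual LPE module, which processes eigenvalue/eigenvector pairs with its own learned attention network; the function name is illustrative.

```python
# Minimal sketch of Laplacian eigenvector positional encodings (assumed setup,
# not SAN's exact formulation).
import numpy as np


def laplacian_positional_encoding(adj: np.ndarray, k: int) -> np.ndarray:
    """Per-node features from the k lowest-frequency non-trivial eigenvectors
    of the symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)        # eigenvalues in ascending order
    pe = eigvecs[:, 1:k + 1]                      # drop the trivial 0-eigenvector
    # (SAN also feeds the eigenvalues to its learned encoder; omitted here.)
    # Eigenvector signs are arbitrary, so flip them randomly during training to
    # keep the downstream model from latching onto one sign convention.
    signs = np.random.choice([-1.0, 1.0], size=pe.shape[1])
    return pe * signs                             # shape: (num_nodes, k)


if __name__ == "__main__":
    # 4-cycle graph: adjacent nodes receive nearby low-frequency coordinates.
    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    print(laplacian_positional_encoding(A, k=2))
```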
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.