Unleashing the Power of Transformer for Graphs
- URL: http://arxiv.org/abs/2202.10581v1
- Date: Fri, 18 Feb 2022 06:40:51 GMT
- Title: Unleashing the Power of Transformer for Graphs
- Authors: Lingbing Guo, Qiang Zhang, Huajun Chen
- Abstract summary: Transformer suffers from the scalability problem when dealing with graphs.
We propose a new Transformer architecture, named dual-encoding Transformer (DET)
DET has a structural encoder to aggregate information from connected neighbors and a semantic encoder to focus on semantically useful distant nodes.
- Score: 28.750700720796836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite recent successes in natural language processing and computer vision,
Transformer suffers from the scalability problem when dealing with graphs. The
computational complexity is unacceptable for large-scale graphs, e.g.,
knowledge graphs. One solution is to consider only the near neighbors, which,
however, will lose the key merit of Transformer to attend to the elements at
any distance. In this paper, we propose a new Transformer architecture, named
dual-encoding Transformer (DET). DET has a structural encoder to aggregate
information from connected neighbors and a semantic encoder to focus on
semantically useful distant nodes. In comparison with resorting to multi-hop
neighbors, DET seeks the desired distant neighbors via self-supervised
training. We further find these two encoders can be combined to boost each
other's performance. Our experiments demonstrate that DET achieves superior
performance compared to the respective state-of-the-art methods in dealing with
molecules, networks and knowledge graphs with various sizes.
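The abstract describes two cooperating encoders but gives no implementation detail. As a rough illustration only, here is a minimal NumPy sketch of the dual-encoding idea: a structural pass that averages directly connected neighbors, and a semantic pass that attends to the most similar non-neighbor nodes. The function names, mean pooling, and top-k similarity selection are my assumptions, not DET's actual design (DET finds distant neighbors via self-supervised training):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def structural_encode(X, A):
    """Aggregate features from directly connected neighbors (mean pooling)."""
    deg = A.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0
    return (A @ X) / deg

def semantic_encode(X, A, k=2):
    """Attend to the k most similar *non-neighbor* nodes per node.
    (A stand-in for DET's self-supervised distant-neighbor selection.)"""
    sim = X @ X.T                        # dot-product similarity
    mask = (A > 0) | np.eye(len(X), dtype=bool)
    sim[mask] = -np.inf                  # exclude self and direct neighbors
    out = np.zeros_like(X)
    for i in range(len(X)):
        idx = np.argsort(sim[i])[-k:]    # k most similar distant nodes
        w = softmax(sim[i, idx])
        out[i] = w @ X[idx]
    return out

# Toy graph: 4 nodes on a path 0-1-2-3.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 8))
H = structural_encode(X, A) + semantic_encode(X, A, k=1)
print(H.shape)  # (4, 8)
```

Combining the two outputs by addition is the simplest possible fusion; the paper reports that the two encoders can be incorporated to boost each other, which presumably involves a learned interaction rather than a plain sum.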
Related papers
- SGFormer: Single-Layer Graph Transformers with Approximation-Free Linear Complexity [74.51827323742506]
We evaluate the necessity of adopting multi-layer attentions in Transformers on graphs.
We show that multi-layer propagation can be reduced to one-layer propagation, with the same capability for representation learning.
It suggests a new technical path for building powerful and efficient Transformers on graphs.
arXiv Detail & Related papers (2024-09-13T17:37:34Z) - Graph as Point Set [31.448841287258116]
This paper introduces a novel graph-to-set conversion method that transforms interconnected nodes into a set of independent points.
It enables using set encoders to learn from graphs, thereby significantly expanding the design space of Graph Neural Networks.
To demonstrate the effectiveness of our approach, we introduce Point Set Transformer (PST), a transformer architecture that accepts a point set converted from a graph as input.
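The abstract does not specify how nodes become "independent points." One common way to make a set of points carry graph structure is to attach spectral coordinates (Laplacian eigenvectors) to each node's features; the sketch below is a hedged illustration under that assumption, not necessarily the conversion PST actually uses:

```python
import numpy as np

def graph_to_point_set(A, X, k=2):
    """Turn a graph into an order-independent point set by attaching
    spectral coordinates (Laplacian eigenvectors) to each node's features."""
    deg = np.diag(A.sum(axis=1))
    L = deg - A                           # combinatorial graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    coords = eigvecs[:, 1:k + 1]          # skip the trivial constant eigenvector
    return np.concatenate([X, coords], axis=1)

A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
X = np.eye(4)                             # one-hot node features
points = graph_to_point_set(A, X, k=2)
print(points.shape)  # (4, 6): original features plus 2 spectral coordinates
```

Once the graph is a set of feature-plus-coordinate points, any permutation-invariant set encoder (such as a Transformer without positional order) can consume it.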
arXiv Detail & Related papers (2024-05-05T02:29:41Z) - GTC: GNN-Transformer Co-contrastive Learning for Self-supervised Heterogeneous Graph Representation [0.9249657468385781]
This paper proposes a collaborative learning scheme for GNN-Transformer and constructs GTC architecture.
For the Transformer branch, we propose Metapath-aware Hop2Token and CG-Hetphormer, which can cooperate with GNN to attentively encode neighborhood information from different levels.
Experiments on real datasets show that GTC exhibits superior performance compared with state-of-the-art methods.
arXiv Detail & Related papers (2024-03-22T12:22:44Z) - SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that one-layer attention can achieve surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone enlightens a new technical path of independent interest for building Transformers on large graphs.
arXiv Detail & Related papers (2023-06-19T08:03:25Z) - Dual Vision Transformer [114.1062057736447]
We propose a novel Transformer architecture, named Dual Vision Transformer (Dual-ViT), that aims to mitigate the cost issue.
The new architecture incorporates a critical semantic pathway that can more efficiently compress token vectors into global semantics with reduced order of complexity.
We empirically demonstrate that Dual-ViT achieves higher accuracy than SOTA Transformer architectures with reduced training complexity.
arXiv Detail & Related papers (2022-07-11T16:03:44Z) - Deformable Graph Transformer [31.254872949603982]
We propose Deformable Graph Transformer (DGT) that performs sparse attention with dynamically sampled key and value pairs.
Experiments demonstrate that our novel graph Transformer consistently outperforms existing Transformer-based models.
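DGT's core idea, per the abstract, is attending over a small sampled subset of key/value pairs rather than all of them. The sketch below illustrates that cost structure with uniform random sampling; DGT itself samples keys dynamically (a learned sampler), so the random choice here is a stand-in assumption:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sampled_sparse_attention(Q, K, V, num_samples=4, rng=None):
    """Each query attends to a small subset of key/value pairs instead of
    the full set, cutting cost from O(n^2) to O(n * s) for s samples.
    Uniform sampling stands in for DGT's learned dynamic sampler."""
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = Q.shape
    out = np.empty_like(Q)
    for i in range(n):
        idx = rng.choice(n, size=min(num_samples, n), replace=False)
        scores = Q[i] @ K[idx].T / np.sqrt(d)
        out[i] = softmax(scores) @ V[idx]
    return out

rng = np.random.default_rng(1)
Q = rng.normal(size=(16, 8))
K = rng.normal(size=(16, 8))
V = rng.normal(size=(16, 8))
out = sampled_sparse_attention(Q, K, V, num_samples=4)
print(out.shape)  # (16, 8)
```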
arXiv Detail & Related papers (2022-06-29T00:23:25Z) - What Dense Graph Do You Need for Self-Attention? [73.82686008622596]
We present Hypercube Transformer, a sparse Transformer that models token interactions in a hypercube and shows comparable or even better results with vanilla Transformer.
Experiments on tasks requiring various sequence lengths validate that our graph function works well.
arXiv Detail & Related papers (2022-05-27T14:36:55Z) - Graph Neural Networks with Learnable Structural and Positional Representations [83.24058411666483]
A major issue with arbitrary graphs is the absence of canonical positional information of nodes.
We introduce positional encodings (PE) of nodes, and inject them into the input layer, like in Transformers.
We observe a performance increase for molecular datasets, from 2.87% up to 64.14% when considering learnable PE for both GNN classes.
arXiv Detail & Related papers (2021-10-15T05:59:15Z) - Rethinking Graph Transformers with Spectral Attention [13.068288784805901]
We present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) to learn the position of each node in a given graph.
By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance.
Our model performs on par or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin.
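SAN's learned positional encoding consumes the full Laplacian spectrum. A minimal sketch of the raw input it operates on: for each node i, the sequence of (eigenvalue, eigenvector entry) pairs over all eigenpairs of the normalized Laplacian. The learned encoder that processes these pairs is omitted; the function name and layout are my assumptions:

```python
import numpy as np

def laplacian_spectrum_features(A):
    """For each node i, build the sequence of (eigenvalue, eigenvector entry)
    pairs over the full normalized-Laplacian spectrum: the kind of raw input
    a learned positional encoder like SAN's LPE consumes."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)       # assumes no isolated nodes
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)  # ascending eigenvalues
    n = len(A)
    # Shape (n_nodes, n_eigs, 2): node i sees every (lambda_j, phi_j[i]) pair.
    lam = np.broadcast_to(eigvals, (n, n))
    return np.stack([lam, eigvecs], axis=-1)

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-node path
feats = laplacian_spectrum_features(A)
print(feats.shape)  # (3, 3, 2)
```

Because eigenvalues of the normalized Laplacian lie in [0, 2] and encode global connectivity, feeding the whole spectrum (rather than a truncated top-k slice) is what gives SAN its claimed power to distinguish graphs.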
arXiv Detail & Related papers (2021-06-07T18:11:11Z) - Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper which applies transformers into pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.