An end-to-end attention-based approach for learning on graphs
- URL: http://arxiv.org/abs/2402.10793v2
- Date: Fri, 06 Dec 2024 15:44:46 GMT
- Title: An end-to-end attention-based approach for learning on graphs
- Authors: David Buterez, Jon Paul Janet, Dino Oglic, Pietro Lio
- Abstract summary: Transformer-based architectures for learning on graphs are motivated by attention as an effective learning mechanism.
We propose a purely attention-based approach consisting of an encoder and an attention pooling mechanism.
Despite its simplicity, the approach outperforms fine-tuned message passing baselines and recently proposed transformer-based methods on more than 70 node and graph-level tasks.
- Score: 8.552020965470113
- License:
- Abstract: There has been a recent surge in transformer-based architectures for learning on graphs, mainly motivated by attention as an effective learning mechanism and the desire to supersede handcrafted operators characteristic of message passing schemes. However, concerns over their empirical effectiveness, scalability, and complexity of the pre-processing steps have been raised, especially in relation to much simpler graph neural networks that typically perform on par with them across a wide range of benchmarks. To tackle these shortcomings, we consider graphs as sets of edges and propose a purely attention-based approach consisting of an encoder and an attention pooling mechanism. The encoder vertically interleaves masked and vanilla self-attention modules to learn effective representations of edges, while allowing possible misspecifications in input graphs to be handled. Despite its simplicity, the approach outperforms fine-tuned message passing baselines and recently proposed transformer-based methods on more than 70 node and graph-level tasks, including challenging long-range benchmarks. Moreover, we demonstrate state-of-the-art performance across different tasks, ranging from molecular to vision graphs, and heterophilous node classification. The approach also outperforms graph neural networks and transformers in transfer learning settings, and scales much better than alternatives with a similar performance level or expressive power.
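The abstract describes the architecture only at a high level. The following is a minimal PyTorch sketch of that description, not the authors' implementation: graphs are treated as sets of edge tokens, the encoder alternates masked self-attention (here restricted to pairs of edges sharing an endpoint) with vanilla self-attention, and a learned-query attention pooling produces the graph representation. All module names, dimensions, and the particular masking rule are assumptions.

```python
import torch
import torch.nn as nn

class EdgeSetAttentionEncoder(nn.Module):
    """Sketch: interleaved masked/vanilla self-attention over edge tokens + attention pooling."""

    def __init__(self, dim=128, heads=4, blocks=3):
        super().__init__()
        self.masked = nn.ModuleList([nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(blocks)])
        self.vanilla = nn.ModuleList([nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(blocks)])
        self.norms = nn.ModuleList([nn.LayerNorm(dim) for _ in range(2 * blocks)])
        self.pool_query = nn.Parameter(torch.randn(1, 1, dim))          # learned query for attention pooling
        self.pool_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, edge_tokens, edge_index):
        # edge_tokens: (1, E, dim) embeddings of edges; edge_index: (2, E) endpoints of each edge.
        src, dst = edge_index
        shares_node = ((src[:, None] == src[None, :]) | (src[:, None] == dst[None, :]) |
                       (dst[:, None] == src[None, :]) | (dst[:, None] == dst[None, :]))
        attn_mask = ~shares_node                 # True = blocked; masked blocks see only incident edges (assumption)
        h = edge_tokens
        for i, (masked, vanilla) in enumerate(zip(self.masked, self.vanilla)):
            m, _ = masked(h, h, h, attn_mask=attn_mask)
            h = self.norms[2 * i](h + m)         # graph-aware (masked) mixing
            v, _ = vanilla(h, h, h)
            h = self.norms[2 * i + 1](h + v)     # global (vanilla) mixing, tolerant of missing edges
        query = self.pool_query.expand(h.size(0), -1, -1)
        pooled, _ = self.pool_attn(query, h, h)  # attention pooling over the edge set
        return pooled.squeeze(1)                 # (batch, dim) graph representation

# Toy usage: a 4-node path graph with 3 edges.
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
edge_tokens = torch.randn(1, 3, 128)
print(EdgeSetAttentionEncoder()(edge_tokens, edge_index).shape)  # torch.Size([1, 128])
```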
Related papers
- SGFormer: Single-Layer Graph Transformers with Approximation-Free Linear Complexity [74.51827323742506]
We evaluate the necessity of adopting multi-layer attention in Transformers on graphs.
We show that multi-layer attention can be reduced to one-layer propagation, with the same capability for representation learning.
It suggests a new technical path for building powerful and efficient Transformers on graphs.
arXiv Detail & Related papers (2024-09-13T17:37:34Z)
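SGFormer's claim is that a single global attention layer can suffice. The snippet below is only a rough illustration of that idea, not the SGFormer code: one all-pair attention layer over node features is mixed with one step of ordinary graph propagation, with the mixing weight `alpha` chosen arbitrarily; all names are assumptions.

```python
import torch
import torch.nn as nn

class SingleLayerGraphAttention(nn.Module):
    """Rough illustration (not SGFormer itself): one global attention layer + one propagation step."""

    def __init__(self, in_dim, hidden, alpha=0.5):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(in_dim, hidden) for _ in range(3))
        self.local = nn.Linear(in_dim, hidden)
        self.alpha = alpha                     # assumed mixing weight between global and local terms
        self.scale = hidden ** 0.5

    def forward(self, x, adj_norm):
        # x: (N, in_dim) node features; adj_norm: (N, N) normalized adjacency.
        attn = torch.softmax(self.q(x) @ self.k(x).T / self.scale, dim=-1)
        global_out = attn @ self.v(x)          # one layer of all-pair attention
        local_out = adj_norm @ self.local(x)   # one step of graph propagation
        return self.alpha * global_out + (1 - self.alpha) * local_out

# Toy usage on a 5-node graph (identity used as a placeholder normalized adjacency).
x, adj = torch.randn(5, 16), torch.eye(5)
print(SingleLayerGraphAttention(16, 32)(x, adj).shape)  # torch.Size([5, 32])
```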
- GASE: Graph Attention Sampling with Edges Fusion for Solving Vehicle Routing Problems [6.084414764415137]
We propose an adaptive Graph Attention Sampling with Edges Fusion framework to solve vehicle routing problems.
Our proposed model outperforms the existing methods by 2.08%-6.23% and shows stronger generalization ability.
arXiv Detail & Related papers (2024-05-21T03:33:07Z)
- Message Detouring: A Simple Yet Effective Cycle Representation for Expressive Graph Learning [4.085624738017079]
We introduce the concept of message detouring to hierarchically characterize cycle representation throughout the entire graph.
Message detouring can significantly outperform current counterpart approaches on various benchmark datasets.
arXiv Detail & Related papers (2024-02-12T22:06:37Z)
- Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
arXiv Detail & Related papers (2023-09-18T20:12:17Z)
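The snippet above gives the core mechanism of prompt tuning. The sketch below is a generic, hypothetical rendering of it, not the paper's code: the pretrained encoder is frozen and a small set of learnable prompt tokens, prepended to the node tokens, are the only parameters that get updated. The stand-in backbone and all names are assumptions.

```python
import torch
import torch.nn as nn

def add_graph_prompts(pretrained_encoder: nn.Module, dim: int, num_prompts: int = 8):
    """Hypothetical sketch of prompt tuning: freeze the backbone, learn only extra tokens."""
    for p in pretrained_encoder.parameters():
        p.requires_grad = False                               # backbone stays fixed
    prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)  # the only trainable weights (pass to the optimizer)
    return prompts

def forward_with_prompts(pretrained_encoder, prompts, node_tokens):
    # node_tokens: (batch, N, dim). Prepend the shared prompt tokens to every graph.
    batch = node_tokens.size(0)
    tokens = torch.cat([prompts.unsqueeze(0).expand(batch, -1, -1), node_tokens], dim=1)
    return pretrained_encoder(tokens)

# Toy usage with a stand-in "pretrained" transformer encoder.
backbone = nn.TransformerEncoder(nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True), 2)
prompts = add_graph_prompts(backbone, dim=64)
out = forward_with_prompts(backbone, prompts, torch.randn(3, 10, 64))
print(out.shape, sum(p.requires_grad for p in backbone.parameters()))  # torch.Size([3, 18, 64]) 0
```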
- SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303]
We show that one-layer attention can bring surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone enlightens a new technical path of independent interest for building Transformers on large graphs.
arXiv Detail & Related papers (2023-06-19T08:03:25Z)
- State of the Art and Potentialities of Graph-level Learning [54.68482109186052]
Graph-level learning has been applied to many tasks including comparison, regression, classification, and more.
Traditional approaches to learning a set of graphs rely on hand-crafted features, such as substructures.
Deep learning has helped graph-level learning adapt to the growing scale of graphs by extracting features automatically and encoding graphs into low-dimensional representations.
arXiv Detail & Related papers (2023-01-14T09:15:49Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
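The summary does not say how the cost is reduced. Purely as an illustration of the general idea of avoiding fully-connected message passing (and not the paper's actual method), in the sketch below each node aggregates messages from a small random sample of nodes instead of all of them.

```python
import torch
import torch.nn as nn

class SampledMessagePassing(nn.Module):
    """Crude illustration (not the paper's method): each node aggregates messages from a
    random subset of k nodes instead of all N, cutting the cost from O(N^2) to O(N*k)."""

    def __init__(self, dim, k=8):
        super().__init__()
        self.msg = nn.Linear(dim, dim)
        self.update = nn.GRUCell(dim, dim)
        self.k = k

    def forward(self, x):
        # x: (N, dim) node features.
        n = x.size(0)
        idx = torch.randint(0, n, (n, min(self.k, n)))   # sampled "neighbours" per node (assumption)
        messages = self.msg(x)[idx].mean(dim=1)          # aggregate over the sampled set
        return self.update(messages, x)

print(SampledMessagePassing(32)(torch.randn(100, 32)).shape)  # torch.Size([100, 32])
```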
- Affinity-Aware Graph Networks [9.888383815189176]
Graph Neural Networks (GNNs) have emerged as a powerful technique for learning on relational data.
We explore the use of affinity measures as features in graph neural networks.
We propose message passing networks based on these features and evaluate their performance on a variety of node and graph property prediction tasks.
arXiv Detail & Related papers (2022-06-23T18:51:35Z)
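The summary does not name the affinity measures used. Effective resistance is one common affinity measure and is shown below only as an illustrative stand-in: a per-edge value that could be concatenated to edge features before message passing.

```python
import torch

def effective_resistance_features(adj: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
    """Illustrative affinity feature (assumption: effective resistance), one value per edge."""
    deg = torch.diag(adj.sum(dim=1))
    laplacian = deg - adj
    l_pinv = torch.linalg.pinv(laplacian)            # pseudo-inverse of the graph Laplacian
    src, dst = edge_index
    # R(u, v) = L+[u,u] + L+[v,v] - 2 * L+[u,v]
    return l_pinv[src, src] + l_pinv[dst, dst] - 2 * l_pinv[src, dst]

# Toy usage: a triangle graph; each edge gets one resistance value (about 0.667 here).
adj = torch.tensor([[0., 1., 1.], [1., 0., 1.], [1., 1., 0.]])
edge_index = torch.tensor([[0, 0, 1], [1, 2, 2]])
print(effective_resistance_features(adj, edge_index))
```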
- Graph Contrastive Learning with Adaptive Augmentation [23.37786673825192]
We propose a novel graph contrastive representation learning method with adaptive augmentation.
Specifically, we design augmentation schemes based on node centrality measures to highlight important connective structures.
Our proposed method consistently outperforms existing state-of-the-art baselines and even surpasses some supervised counterparts.
arXiv Detail & Related papers (2020-10-27T15:12:21Z)
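As a hedged sketch of the idea (with degree centrality standing in for whatever centrality measure the paper actually uses), the augmentation below drops edges with a probability that decreases with the centrality of their endpoints, so that important connective structure tends to survive in the augmented view.

```python
import torch

def adaptive_edge_drop(edge_index: torch.Tensor, num_nodes: int, max_drop: float = 0.7):
    """Sketch of centrality-based augmentation (assumption: degree centrality as importance):
    edges touching low-degree nodes are dropped more often, preserving important structure."""
    deg = torch.zeros(num_nodes)
    deg.scatter_add_(0, edge_index.reshape(-1), torch.ones(edge_index.numel()))
    src, dst = edge_index
    importance = torch.log1p((deg[src] + deg[dst]) / 2)          # edge importance from endpoint degrees
    drop_prob = max_drop * (1 - importance / importance.max())   # less important => higher drop probability
    keep = torch.rand(edge_index.size(1)) >= drop_prob
    return edge_index[:, keep]

# Toy usage: one augmented view of a small graph for contrastive training.
edge_index = torch.tensor([[0, 0, 1, 2, 3], [1, 2, 2, 3, 4]])
print(adaptive_edge_drop(edge_index, num_nodes=5))
```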
- Tensor Graph Convolutional Networks for Multi-relational and Robust Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs, which are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z)