Adaptive Depth Graph Attention Networks
- URL: http://arxiv.org/abs/2301.06265v1
- Date: Mon, 16 Jan 2023 05:22:29 GMT
- Title: Adaptive Depth Graph Attention Networks
- Authors: Jingbo Zhou, Yixuan Du, Ruqiong Zhang, Rui Zhang
- Abstract summary: The graph attention network (GAT) is considered the most advanced learning architecture for graph representation.
We find that the main factor limiting the accuracy of the GAT model as the number of layers increases is the oversquashing phenomenon.
We propose a GAT variant, ADGAT, that adaptively selects the number of layers based on the sparsity of the graph.
- Score: 19.673509341792606
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As one of the most popular GNN architectures, the graph attention
network (GAT) is considered the most advanced learning architecture for graph
representation and has been widely used in various graph mining tasks with
impressive results. However, since GAT was proposed, none of the existing
studies have provided systematic insight into the relationship between the
performance of GAT and the number of layers, which is a critical issue in
guiding model performance improvement. In this paper, we perform a systematic
experimental evaluation and based on the experimental results, we find two
important facts: (1) the main factor limiting the accuracy of the GAT model as
the number of layers increases is the oversquashing phenomenon; (2) among the
previous improvements applied to the GNN model, only the residual connection
can significantly improve the GAT model performance. We combine these two
important findings to provide a theoretical explanation that it is the residual
connection that mitigates the loss of original feature information due to
oversquashing and thus improves the deep GAT model performance. This provides
empirical insights and guidelines for researchers to design GAT variants with
appropriate depth and good performance. To demonstrate the effectiveness of
our proposed guidelines, we propose a GAT variant, ADGAT,
that adaptively selects the number of layers based on the sparsity of the
graph, and experimentally demonstrate that our model significantly outperforms
the original GAT.
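
To make the abstract's two findings concrete, below is a minimal, hypothetical sketch in plain PyTorch (dense adjacency, single attention head): a GAT stack with residual connections around every layer, plus a placeholder rule mapping graph sparsity to depth. This is not the authors' released code, and the sparsity-to-depth mapping in particular is an assumption for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGATLayer(nn.Module):
    """Single-head GAT layer on a dense adjacency (assumes adj has self-loops)."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)
        self.a = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, x, adj):
        h = self.W(x)                                   # (N, d)
        n = h.size(0)
        hi = h.unsqueeze(1).expand(n, n, -1)            # h_i broadcast over j
        hj = h.unsqueeze(0).expand(n, n, -1)            # h_j broadcast over i
        e = F.leaky_relu(self.a(torch.cat([hi, hj], -1)).squeeze(-1), 0.2)
        e = e.masked_fill(adj == 0, float("-inf"))      # attend to neighbors only
        return torch.softmax(e, dim=-1) @ h             # attention-weighted sum

class ResidualGAT(nn.Module):
    """GAT stack with residual connections, the one prior improvement the
    paper finds to significantly help deep GATs."""
    def __init__(self, dim, depth):
        super().__init__()
        self.layers = nn.ModuleList(DenseGATLayer(dim) for _ in range(depth))

    def forward(self, x, adj):
        for layer in self.layers:
            x = x + F.elu(layer(x, adj))                # residual keeps original features
        return x

def adaptive_depth(adj, d_min=2, d_max=8):
    """Placeholder sparsity-to-depth rule (an assumption, not the paper's rule):
    sparser graphs get more layers so information can reach distant nodes."""
    n = adj.size(0)
    density = adj.sum().item() / (n * (n - 1))
    return round(d_min + (d_max - d_min) * max(0.0, 1.0 - density))

Usage would be model = ResidualGAT(dim=64, depth=adaptive_depth(adj)), i.e. the depth is fixed per graph before training.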
Related papers
- Graph neural network surrogate for strategic transport planning [2.175217022338634]
This paper explores the application of advanced Graph Neural Network (GNN) architectures as surrogate models for strategic transport planning.
Building upon prior work that laid the foundation with graph convolutional networks (GCN), our study delves into a comparative analysis of the established GCN against the more expressive Graph Attention Network (GAT).
We propose a novel GAT variant (namely GATv3) to address over-smoothing issues in graph-based models.
arXiv Detail & Related papers (2024-08-14T14:18:47Z)
- Are GATs Out of Balance? [73.2500577189791]
We study the Graph Attention Network (GAT) in which a node's neighborhood aggregation is weighted by parameterized attention coefficients.
Our main theorem serves as a stepping stone to studying the learning dynamics of positive homogeneous models with attention mechanisms.
arXiv Detail & Related papers (2023-10-11T06:53:05Z)
- Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis [50.972595036856035]
We present a code that successfully replicates results from six popular and recent graph recommendation models.
We compare these graph models with traditional collaborative filtering models that historically performed well in offline evaluations.
By investigating the information flow from users' neighborhoods, we aim to identify which models are influenced by intrinsic features in the dataset structure.
arXiv Detail & Related papers (2023-08-01T09:31:44Z)
- Optimization and Interpretability of Graph Attention Networks for Small Sparse Graph Structures in Automotive Applications [11.581071131903775]
This work aims for a better understanding of the attention mechanism and analyzes its interpretability of identifying causal importance.
For automotive applications, the Graph Attention Network (GAT) is a prominently used architecture to include relational information of a traffic scenario during feature embedding.
arXiv Detail & Related papers (2023-05-25T15:55:59Z)
- Gradient Derivation for Learnable Parameters in Graph Attention Networks [11.581071131903775]
This work provides a comprehensive derivation of the parameter gradients for GATv2 [4], a widely used implementation of Graph Attention Networks (GATs).
As the gradient flow provides valuable insights into the training dynamics of statistical learning models, this work obtains the gradients for the trainable model parameters of GATv2.
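
As a companion sketch (an illustrative reimplementation, not the cited code), the toy dense-adjacency layer below mirrors the GATv2 scoring function e_ij = a^T LeakyReLU(W_l h_i + W_r h_j); calling backward() on any loss computed from it yields the parameter gradients the paper derives analytically.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGATv2Layer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.Wl = nn.Linear(dim, dim, bias=False)  # transform for the target node i
        self.Wr = nn.Linear(dim, dim, bias=False)  # transform for the source node j
        self.a = nn.Parameter(torch.randn(dim))    # attention vector

    def forward(self, x, adj):
        hl, hr = self.Wl(x), self.Wr(x)
        # GATv2 applies `a` after the nonlinearity, unlike the original GAT
        e = F.leaky_relu(hl.unsqueeze(1) + hr.unsqueeze(0), 0.2) @ self.a
        e = e.masked_fill(adj == 0, float("-inf"))  # neighbors only (self-loops assumed)
        return torch.softmax(e, dim=-1) @ hr        # attention-weighted aggregation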
arXiv Detail & Related papers (2023-04-21T13:23:38Z)
- Studying How to Efficiently and Effectively Guide Models with Explanations [52.498055901649025]
'Model guidance' is the idea of regularizing the models' explanations to ensure that they are "right for the right reasons".
We conduct an in-depth evaluation across various loss functions, attribution methods, models, and 'guidance depths' on the PASCAL VOC 2007 and MS COCO 2014 datasets.
Specifically, we guide the models via bounding box annotations, which are much cheaper to obtain than the commonly used segmentation masks.
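
As a rough, hedged illustration of such guidance (all concrete choices here, including the plain input-gradient attribution, are assumptions; the paper compares several attribution methods and losses), one can penalize the fraction of attribution mass falling outside the annotated box:

import torch

def guidance_loss(model, images, labels, box_mask):
    """box_mask: (B, 1, H, W) binary mask, 1 inside the bounding box."""
    images = images.clone().requires_grad_(True)
    logits = model(images)
    score = logits.gather(1, labels.unsqueeze(1)).sum()
    # Simple input-gradient attribution, kept differentiable for training
    attr = torch.autograd.grad(score, images, create_graph=True)[0].abs()
    attr = attr.sum(dim=1, keepdim=True)          # collapse color channels
    outside = (attr * (1 - box_mask)).sum(dim=(1, 2, 3))
    total = attr.sum(dim=(1, 2, 3)) + 1e-8
    return (outside / total).mean()               # fraction of mass outside the box

This term would be added to the usual classification loss with some weighting factor.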
arXiv Detail & Related papers (2023-03-21T15:34:50Z)
- Design Amortization for Bayesian Optimal Experimental Design [70.13948372218849]
We build off of successful variational approaches, which optimize a parameterized variational model with respect to bounds on the expected information gain (EIG).
We present a novel neural architecture that allows experimenters to optimize a single variational model that can estimate the EIG for potentially infinitely many designs.
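
A hedged toy sketch of the amortization idea (the prior, likelihood, and network here are assumptions for illustration): a single network q(theta | y, d) consumes the design d alongside the outcome y, so one fit gives the classical Barber-Agakov lower bound on the EIG for any candidate design.

import math
import torch
import torch.nn as nn

# q(theta | y, d): one variational network amortized over designs d
posterior_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))

def eig_lower_bound(design, n_samples=256):
    theta = torch.randn(n_samples, 1)                     # prior p(theta) = N(0, 1)
    y = design * theta + 0.1 * torch.randn(n_samples, 1)  # toy linear likelihood
    inp = torch.cat([y, design.expand(n_samples, 1)], dim=1)
    mu, log_std = posterior_net(inp).chunk(2, dim=1)
    log_q = (-0.5 * ((theta - mu) / log_std.exp()) ** 2
             - log_std - 0.5 * math.log(2 * math.pi))
    # Barber-Agakov: EIG(d) >= E[log q(theta | y, d)] + H[p(theta)]
    return log_q.mean() + 0.5 * math.log(2 * math.pi * math.e)

Training maximizes this bound over randomly drawn designs; afterwards the same network scores any design, e.g. eig_lower_bound(torch.tensor([[0.5]])), without refitting.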
arXiv Detail & Related papers (2022-10-07T02:12:34Z)
- Adaptive Fine-Grained Predicates Learning for Scene Graph Generation [122.4588401267544]
General Scene Graph Generation (SGG) models tend to predict head predicates, while re-balancing strategies prefer tail categories.
We propose Adaptive Fine-Grained Predicates Learning (FGPL-A), which aims at differentiating hard-to-distinguish predicates for SGG.
Our proposed model-agnostic strategy significantly boosts the performance of benchmark models on the VG-SGG and GQA-SGG datasets by up to 175% and 76% on Mean Recall@100, achieving new state-of-the-art performance.
arXiv Detail & Related papers (2022-07-11T03:37:57Z)
- Tackling Oversmoothing of GNNs with Contrastive Learning [35.88575306925201]
Graph neural networks (GNNs) combine the relational structure of graph data with representation learning capability.
Oversmoothing makes the final representations of nodes indiscriminative, thus deteriorating the node classification and link prediction performance.
We propose the Topology-guided Graph Contrastive Layer, named TGCL, which is the first de-oversmoothing method maintaining all three mentioned metrics.
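
An illustrative InfoNCE-style stand-in for the topology-guided idea (the actual TGCL formulation differs; this sketch is an assumption): adjacent nodes form positive pairs and random nodes serve as negatives, which keeps deep-layer representations from collapsing into indistinguishable vectors.

import torch
import torch.nn.functional as F

def topology_contrastive_loss(z, edge_index, num_neg=5, tau=0.5):
    """z: (N, d) node embeddings; edge_index: (2, E) edges as positive pairs."""
    z = F.normalize(z, dim=-1)
    src, dst = edge_index
    pos = (z[src] * z[dst]).sum(-1) / tau             # similarity of neighbors
    neg_idx = torch.randint(0, z.size(0), (src.size(0), num_neg))
    neg = (z[src].unsqueeze(1) * z[neg_idx]).sum(-1) / tau
    logits = torch.cat([pos.unsqueeze(1), neg], dim=1)
    labels = torch.zeros(src.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)            # positives sit in column 0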
arXiv Detail & Related papers (2021-10-26T15:56:16Z)
- Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach [55.44107800525776]
Graph Convolutional Networks (GCNs) are state-of-the-art graph based representation learning models.
In this paper, we revisit GCN-based Collaborative Filtering (CF) Recommender Systems (RS).
We show that removing non-linearities would enhance recommendation performance, consistent with the theories in simple graph convolutional networks.
We propose a residual network structure that is specifically designed for CF with user-item interaction modeling.
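
A loose sketch of those two design points, assuming an LR-GCCF/LightGCN-style setup (all names and shapes here are illustrative, not the paper's code): embeddings are propagated without activations, and every layer's output is concatenated so the original features survive into prediction, echoing the residual structure.

import torch
import torch.nn as nn

class LinearResidualGCF(nn.Module):
    def __init__(self, num_users, num_items, dim, depth=3):
        super().__init__()
        self.emb = nn.Embedding(num_users + num_items, dim)
        self.depth = depth

    def forward(self, norm_adj):
        """norm_adj: (U+I, U+I) symmetrically normalized user-item adjacency."""
        x = self.emb.weight
        outs = [x]
        for _ in range(self.depth):
            x = norm_adj @ x            # linear propagation, no non-linearity
            outs.append(x)
        return torch.cat(outs, dim=-1)  # residual-style: keep every layer's output

def score(z, user, item, num_users):
    # Preference as a dot product between user and item representations
    return (z[user] * z[num_users + item]).sum(-1)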
arXiv Detail & Related papers (2020-01-28T04:41:25Z)