Adaptive Depth Graph Attention Networks
- URL: http://arxiv.org/abs/2301.06265v1
- Date: Mon, 16 Jan 2023 05:22:29 GMT
- Title: Adaptive Depth Graph Attention Networks
- Authors: Jingbo Zhou, Yixuan Du, Ruqiong Zhang, Rui Zhang
- Abstract summary: The graph attention network (GAT) is considered the most advanced learning architecture for graph representation.
We find that the main factor limiting the accuracy of the GAT model as the number of layers increases is the oversquashing phenomenon.
We propose a GAT variant, ADGAT, that adaptively selects the number of layers based on the sparsity of the graph.
- Score: 19.673509341792606
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As one of the most popular GNN architectures, the graph attention
network (GAT) is considered the most advanced learning architecture for graph
representation and has been widely used in various graph mining tasks with
impressive results. However, since GAT was proposed, none of the existing
studies have provided systematic insight into the relationship between the
performance of GAT and the number of layers, which is a critical issue in
guiding model performance improvement. In this paper, we perform a systematic
experimental evaluation and based on the experimental results, we find two
important facts: (1) the main factor limiting the accuracy of the GAT model as
the number of layers increases is the oversquashing phenomenon; (2) among the
previous improvements applied to the GNN model, only the residual connection
can significantly improve the GAT model performance. We combine these two
important findings to provide a theoretical explanation that it is the residual
connection that mitigates the loss of original feature information due to
oversquashing and thus improves the deep GAT model performance. This provides
empirical insights and guidelines for researchers to design GAT variants with
appropriate depth and good performance. To demonstrate the effectiveness of
our proposed guidelines, we propose a GAT variant, ADGAT,
that adaptively selects the number of layers based on the sparsity of the
graph, and experimentally demonstrate that our model significantly outperforms
the original GAT.
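
To make the abstract's two findings concrete, below is a minimal, hypothetical sketch in plain PyTorch (dense adjacency, single attention head): a GAT stack with residual connections around every layer, plus a placeholder rule mapping graph sparsity to depth. This is not the authors' released code, and the sparsity-to-depth mapping in particular is an assumption for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGATLayer(nn.Module):
    """Single-head GAT layer on a dense adjacency (assumes adj has self-loops)."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)
        self.a = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, x, adj):
        h = self.W(x)                                   # (N, d)
        n = h.size(0)
        hi = h.unsqueeze(1).expand(n, n, -1)            # h_i broadcast over j
        hj = h.unsqueeze(0).expand(n, n, -1)            # h_j broadcast over i
        e = F.leaky_relu(self.a(torch.cat([hi, hj], -1)).squeeze(-1), 0.2)
        e = e.masked_fill(adj == 0, float("-inf"))      # attend to neighbors only
        return torch.softmax(e, dim=-1) @ h             # attention-weighted sum

class ResidualGAT(nn.Module):
    """GAT stack with residual connections, the one prior improvement the
    paper finds to significantly help deep GATs."""
    def __init__(self, dim, depth):
        super().__init__()
        self.layers = nn.ModuleList(DenseGATLayer(dim) for _ in range(depth))

    def forward(self, x, adj):
        for layer in self.layers:
            x = x + F.elu(layer(x, adj))                # residual keeps original features
        return x

def adaptive_depth(adj, d_min=2, d_max=8):
    """Placeholder sparsity-to-depth rule (an assumption, not the paper's rule):
    sparser graphs get more layers so information can reach distant nodes."""
    n = adj.size(0)
    density = adj.sum().item() / (n * (n - 1))
    return round(d_min + (d_max - d_min) * max(0.0, 1.0 - density))

Usage would be model = ResidualGAT(dim=64, depth=adaptive_depth(adj)), i.e. the depth is fixed per graph before training.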
Related papers
- Graph neural network surrogate for strategic transport planning [2.175217022338634]
This paper explores the application of advanced Graph Neural Network (GNN) architectures as surrogate models for strategic transport planning.
Building upon prior work that laid the foundation with graph convolutional networks (GCN), our study delves into a comparative analysis of the established GCN against the more expressive Graph Attention Network (GAT).
We propose a novel GAT variant (namely GATv3) to address over-smoothing issues in graph-based models.
arXiv Detail & Related papers (2024-08-14T14:18:47Z)
- Are GATs Out of Balance? [73.2500577189791]
We study the Graph Attention Network (GAT) in which a node's neighborhood aggregation is weighted by parameterized attention coefficients.
Our main theorem serves as a stepping stone to studying the learning dynamics of positive homogeneous models with attention mechanisms.
arXiv Detail & Related papers (2023-10-11T06:53:05Z)
- Challenging the Myth of Graph Collaborative Filtering: a Reasoned and Reproducibility-driven Analysis [50.972595036856035]
We present a code that successfully replicates results from six popular and recent graph recommendation models.
We compare these graph models with traditional collaborative filtering models that historically performed well in offline evaluations.
By investigating the information flow from users' neighborhoods, we aim to identify which models are influenced by intrinsic features in the dataset structure.
arXiv Detail & Related papers (2023-08-01T09:31:44Z)
- Optimization and Interpretability of Graph Attention Networks for Small Sparse Graph Structures in Automotive Applications [11.581071131903775]
This work aims for a better understanding of the attention mechanism and analyzes its interpretability of identifying causal importance.
For automotive applications, the Graph Attention Network (GAT) is a prominently used architecture to include relational information of a traffic scenario during feature embedding.
arXiv Detail & Related papers (2023-05-25T15:55:59Z)
- Gradient Derivation for Learnable Parameters in Graph Attention Networks [11.581071131903775]
This work provides a comprehensive derivation of the parameter gradients for GATv2 [4], a widely used implementation of Graph Attention Networks (GATs).
As the gradient flow provides valuable insights into the training dynamics of statistical learning models, this work obtains the gradients for the trainable model parameters of GATv2.
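
As a companion sketch (an illustrative reimplementation, not the cited code), the toy dense-adjacency layer below mirrors the GATv2 scoring function e_ij = a^T LeakyReLU(W_l h_i + W_r h_j); calling backward() on any loss computed from it yields the parameter gradients the paper derives analytically.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGATv2Layer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.Wl = nn.Linear(dim, dim, bias=False)  # transform for the target node i
        self.Wr = nn.Linear(dim, dim, bias=False)  # transform for the source node j
        self.a = nn.Parameter(torch.randn(dim))    # attention vector

    def forward(self, x, adj):
        hl, hr = self.Wl(x), self.Wr(x)
        # GATv2 applies `a` after the nonlinearity, unlike the original GAT
        e = F.leaky_relu(hl.unsqueeze(1) + hr.unsqueeze(0), 0.2) @ self.a
        e = e.masked_fill(adj == 0, float("-inf"))  # neighbors only (self-loops assumed)
        return torch.softmax(e, dim=-1) @ hr        # attention-weighted aggregation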
arXiv Detail & Related papers (2023-04-21T13:23:38Z)
- Studying How to Efficiently and Effectively Guide Models with Explanations [52.498055901649025]
'Model guidance' is the idea of regularizing the models' explanations to ensure that they are "right for the right reasons".
We conduct an in-depth evaluation across various loss functions, attribution methods, models, and 'guidance depths' on the PASCAL VOC 2007 and MS COCO 2014 datasets.
Specifically, we guide the models via bounding box annotations, which are much cheaper to obtain than the commonly used segmentation masks.
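
As a rough, hedged illustration of such guidance (all concrete choices here, including the plain input-gradient attribution, are assumptions; the paper compares several attribution methods and losses), one can penalize the fraction of attribution mass falling outside the annotated box:

import torch

def guidance_loss(model, images, labels, box_mask):
    """box_mask: (B, 1, H, W) binary mask, 1 inside the bounding box."""
    images = images.clone().requires_grad_(True)
    logits = model(images)
    score = logits.gather(1, labels.unsqueeze(1)).sum()
    # Simple input-gradient attribution, kept differentiable for training
    attr = torch.autograd.grad(score, images, create_graph=True)[0].abs()
    attr = attr.sum(dim=1, keepdim=True)          # collapse color channels
    outside = (attr * (1 - box_mask)).sum(dim=(1, 2, 3))
    total = attr.sum(dim=(1, 2, 3)) + 1e-8
    return (outside / total).mean()               # fraction of mass outside the box

This term would be added to the usual classification loss with some weighting factor.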
arXiv Detail & Related papers (2023-03-21T15:34:50Z)
- Design Amortization for Bayesian Optimal Experimental Design [70.13948372218849]
We build off of successful variational approaches, which optimize a parameterized variational model with respect to bounds on the expected information gain (EIG).
We present a novel neural architecture that allows experimenters to optimize a single variational model that can estimate the EIG for potentially infinitely many designs.
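
A hedged toy sketch of the amortization idea (the prior, likelihood, and network here are assumptions for illustration): a single network q(theta | y, d) consumes the design d alongside the outcome y, so one fit gives the classical Barber-Agakov lower bound on the EIG for any candidate design.

import math
import torch
import torch.nn as nn

# q(theta | y, d): one variational network amortized over designs d
posterior_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))

def eig_lower_bound(design, n_samples=256):
    theta = torch.randn(n_samples, 1)                     # prior p(theta) = N(0, 1)
    y = design * theta + 0.1 * torch.randn(n_samples, 1)  # toy linear likelihood
    inp = torch.cat([y, design.expand(n_samples, 1)], dim=1)
    mu, log_std = posterior_net(inp).chunk(2, dim=1)
    log_q = (-0.5 * ((theta - mu) / log_std.exp()) ** 2
             - log_std - 0.5 * math.log(2 * math.pi))
    # Barber-Agakov: EIG(d) >= E[log q(theta | y, d)] + H[p(theta)]
    return log_q.mean() + 0.5 * math.log(2 * math.pi * math.e)

Training maximizes this bound over randomly drawn designs; afterwards the same network scores any design, e.g. eig_lower_bound(torch.tensor([[0.5]])), without refitting.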
arXiv Detail & Related papers (2022-10-07T02:12:34Z)
- Adaptive Fine-Grained Predicates Learning for Scene Graph Generation [122.4588401267544]
General Scene Graph Generation (SGG) models tend to predict head predicates, while re-balancing strategies prefer tail categories.
We propose Adaptive Fine-Grained Predicates Learning (FGPL-A), which aims at differentiating hard-to-distinguish predicates for SGG.
Our proposed model-agnostic strategy significantly boosts the performance of benchmark models on the VG-SGG and GQA-SGG datasets by up to 175% and 76% on Mean Recall@100, achieving new state-of-the-art performance.
arXiv Detail & Related papers (2022-07-11T03:37:57Z)
- Tackling Oversmoothing of GNNs with Contrastive Learning [35.88575306925201]
Graph neural networks (GNNs) combine the relational structure of graph data with representation learning capability.
Oversmoothing makes the final representations of nodes indiscriminative, thus deteriorating the node classification and link prediction performance.
We propose the Topology-guided Graph Contrastive Layer, named TGCL, which is the first de-oversmoothing method maintaining all three mentioned metrics.
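
An illustrative InfoNCE-style stand-in for the topology-guided idea (the actual TGCL formulation differs; this sketch is an assumption): adjacent nodes form positive pairs and random nodes serve as negatives, which keeps deep-layer representations from collapsing into indistinguishable vectors.

import torch
import torch.nn.functional as F

def topology_contrastive_loss(z, edge_index, num_neg=5, tau=0.5):
    """z: (N, d) node embeddings; edge_index: (2, E) edges as positive pairs."""
    z = F.normalize(z, dim=-1)
    src, dst = edge_index
    pos = (z[src] * z[dst]).sum(-1) / tau             # similarity of neighbors
    neg_idx = torch.randint(0, z.size(0), (src.size(0), num_neg))
    neg = (z[src].unsqueeze(1) * z[neg_idx]).sum(-1) / tau
    logits = torch.cat([pos.unsqueeze(1), neg], dim=1)
    labels = torch.zeros(src.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)            # positives sit in column 0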
arXiv Detail & Related papers (2021-10-26T15:56:16Z)
- Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach [55.44107800525776]
Graph Convolutional Networks (GCNs) are state-of-the-art graph based representation learning models.
In this paper, we revisit GCN-based Collaborative Filtering (CF) Recommender Systems (RS).
We show that removing non-linearities would enhance recommendation performance, consistent with the theories in simple graph convolutional networks.
We propose a residual network structure that is specifically designed for CF with user-item interaction modeling.
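
A loose sketch of those two design points, assuming an LR-GCCF/LightGCN-style setup (all names and shapes here are illustrative, not the paper's code): embeddings are propagated without activations, and every layer's output is concatenated so the original features survive into prediction, echoing the residual structure.

import torch
import torch.nn as nn

class LinearResidualGCF(nn.Module):
    def __init__(self, num_users, num_items, dim, depth=3):
        super().__init__()
        self.emb = nn.Embedding(num_users + num_items, dim)
        self.depth = depth

    def forward(self, norm_adj):
        """norm_adj: (U+I, U+I) symmetrically normalized user-item adjacency."""
        x = self.emb.weight
        outs = [x]
        for _ in range(self.depth):
            x = norm_adj @ x            # linear propagation, no non-linearity
            outs.append(x)
        return torch.cat(outs, dim=-1)  # residual-style: keep every layer's output

def score(z, user, item, num_users):
    # Preference as a dot product between user and item representations
    return (z[user] * z[num_users + item]).sum(-1)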
arXiv Detail & Related papers (2020-01-28T04:41:25Z)