Related papers: On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning

On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning

URL: http://arxiv.org/abs/2502.10818v1
Date: Sat, 15 Feb 2025 14:43:41 GMT
Title: On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning
Authors: Álvaro Arroyo, Alessio Gravina, Benjamin Gutteridge, Federico Barbero, Claudio Gallicchio, Xiaowen Dong, Michael Bronstein, Pierre Vandergheynst,
Abstract summary: Graph Neural Networks (GNNs) are models that leverage the graph structure to transmit information between nodes.<n>We show that a simple state-space formulation of a GNN effectively alleviates over-smoothing and over-squashing at no extra trainable parameter cost.
Score: 15.409865070022951
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Graph Neural Networks (GNNs) are models that leverage the graph structure to transmit information between nodes, typically through the message-passing operation. While widely successful, this approach is well known to suffer from the over-smoothing and over-squashing phenomena, which result in representational collapse as the number of layers increases and insensitivity to the information contained at distant and poorly connected nodes, respectively. In this paper, we present a unified view of these problems through the lens of vanishing gradients, using ideas from linear control theory for our analysis. We propose an interpretation of GNNs as recurrent models and empirically demonstrate that a simple state-space formulation of a GNN effectively alleviates over-smoothing and over-squashing at no extra trainable parameter cost. Further, we show theoretically and empirically that (i) GNNs are by design prone to extreme gradient vanishing even after a few layers; (ii) Over-smoothing is directly related to the mechanism causing vanishing gradients; (iii) Over-squashing is most easily alleviated by a combination of graph rewiring and vanishing gradient mitigation. We believe our work will help bridge the gap between the recurrent and graph neural network literature and will unlock the design of new deep and performant GNNs.

Related papers

Rewiring Techniques to Mitigate Oversquashing and Oversmoothing in GNNs: A Survey [0.0]
Graph Neural Networks (GNNs) are powerful tools for learning from graph-structured data, but their effectiveness is often constrained by two critical challenges. Oversquashing, where the excessive compression of information from distant nodes results in significant information loss, and oversmoothing, where repeated message-passing iterations homogenize node representations, obscuring meaningful distinctions. In this survey, we examine graph rewiring techniques, a class of methods designed to address these structural bottlenecks by modifying graph topology to enhance information diffusion.
arXiv Detail & Related papers (2024-11-26T13:38:12Z)
Spiking Graph Neural Network on Riemannian Manifolds [51.15400848660023]
Graph neural networks (GNNs) have become the dominant solution for learning on graphs. Existing spiking GNNs consider graphs in Euclidean space, ignoring the structural geometry. We present a Manifold-valued Spiking GNN (MSG) MSG achieves superior performance to previous spiking GNNs and energy efficiency to conventional GNNs.
arXiv Detail & Related papers (2024-10-23T15:09:02Z)
Over-Squashing in Riemannian Graph Neural Networks [1.6317061277457001]
Most graph neural networks (GNNs) are prone to the phenomenon of over-squashing. Recent works have shown that the topology of the graph has the greatest impact on over-squashing. We explore whether over-squashing can be mitigated through the embedding space of the GNN.
arXiv Detail & Related papers (2023-11-27T15:51:07Z)
DEGREE: Decomposition Based Explanation For Graph Neural Networks [55.38873296761104]
We propose DEGREE to provide a faithful explanation for GNN predictions. By decomposing the information generation and aggregation mechanism of GNNs, DEGREE allows tracking the contributions of specific components of the input graph to the final prediction. We also design a subgraph level interpretation algorithm to reveal complex interactions between graph nodes that are overlooked by previous methods.
arXiv Detail & Related papers (2023-05-22T10:29:52Z)
Simple yet Effective Gradient-Free Graph Convolutional Networks [20.448409424929604]
Linearized Graph Neural Networks (GNNs) have attracted great attention in recent years for graph representation learning. In this paper, we relate over-smoothing with the vanishing gradient phenomenon and craft a gradient-free training framework. Our methods achieve better and more stable performances on node classification tasks with varying depths and cost much less training time.
arXiv Detail & Related papers (2023-02-01T11:00:24Z)
OrthoReg: Improving Graph-regularized MLPs via Orthogonality Regularization [66.30021126251725]
Graph Neural Networks (GNNs) are currently dominating in modeling graphstructure data. Graph-regularized networks (GR-MLPs) implicitly inject the graph structure information into model weights, while their performance can hardly match that of GNNs in most tasks. We show that GR-MLPs suffer from dimensional collapse, a phenomenon in which the largest a few eigenvalues dominate the embedding space. We propose OrthoReg, a novel GR-MLP model to mitigate the dimensional collapse issue.
arXiv Detail & Related papers (2023-01-31T21:20:48Z)
Gradient Gating for Deep Multi-Rate Learning on Graphs [62.25886489571097]
We present Gradient Gating (G$2$), a novel framework for improving the performance of Graph Neural Networks (GNNs) Our framework is based on gating the output of GNN layers with a mechanism for multi-rate flow of message passing information across nodes of the underlying graph.
arXiv Detail & Related papers (2022-10-02T13:19:48Z)
MentorGNN: Deriving Curriculum for Pre-Training GNNs [61.97574489259085]
We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs. We shed new light on the problem of domain adaption on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs.
arXiv Detail & Related papers (2022-08-21T15:12:08Z)
EEGNN: Edge Enhanced Graph Neural Networks [1.0246596695310175]
We propose a new explanation for such deteriorated performance phenomenon, mis-simplification. We show that such simplifying can reduce the potential of message-passing layers to capture the structural information of graphs. EEGNN uses the structural information extracted from the proposed Dirichlet mixture Poisson graph model to improve the performance of various deep message-passing GNNs.
arXiv Detail & Related papers (2022-08-12T15:24:55Z)
Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs. Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs. Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.