On Over-Squashing in Message Passing Neural Networks: The Impact of
Width, Depth, and Topology
- URL: http://arxiv.org/abs/2302.02941v3
- Date: Wed, 24 May 2023 11:42:49 GMT
- Title: On Over-Squashing in Message Passing Neural Networks: The Impact of
Width, Depth, and Topology
- Authors: Francesco Di Giovanni, Lorenzo Giusti, Federico Barbero, Giulia Luise,
Pietro Lio', Michael Bronstein
- Abstract summary: Message Passing Neural Networks (MPNNs) are instances of Graph Neural Networks that leverage the graph to send messages over the edges.
This inductive bias leads to a phenomenon known as over-squashing, where a node feature is insensitive to information contained at distant nodes.
Despite recent methods introduced to mitigate this issue, an understanding of the causes of over-squashing and of possible solutions is still lacking.
- Score: 4.809459273366461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Message Passing Neural Networks (MPNNs) are instances of Graph Neural
Networks that leverage the graph to send messages over the edges. This
inductive bias leads to a phenomenon known as over-squashing, where a node
feature is insensitive to information contained at distant nodes. Despite
recent methods introduced to mitigate this issue, an understanding of the
causes of over-squashing and of possible solutions is still lacking. In this
theoretical work, we prove that: (i) Neural network width can mitigate
over-squashing, but at the cost of making the whole network more sensitive;
(ii) Conversely, depth cannot help mitigate over-squashing: increasing the
number of layers leads to over-squashing being dominated by vanishing
gradients; (iii) The graph topology plays the greatest role, since
over-squashing occurs between nodes at high commute (access) time. Our analysis
provides a unified framework to study different recent methods introduced to
cope with over-squashing and serves as a justification for a class of methods
that fall under graph rewiring.
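Point (iii) can be probed numerically: the commute time between two nodes is read off the pseudoinverse of the graph Laplacian, and the entries of powers of the normalized adjacency give a crude proxy for how sensitive one node's feature is to another node's input after several rounds of message passing. Below is a minimal sketch of this comparison on a barbell graph (two cliques joined by a single bridge edge); it assumes NetworkX and NumPy, and the sensitivity proxy is an illustrative simplification, not the paper's exact Jacobian bound.

```python
# Minimal sketch: commute time vs. a crude message-passing sensitivity proxy
# on a bottlenecked graph. Assumes NetworkX and NumPy.
import networkx as nx
import numpy as np

G = nx.barbell_graph(5, 0)              # two 5-cliques joined by a single bridge edge
A = nx.to_numpy_array(G)
deg = A.sum(axis=1)
n, num_edges = A.shape[0], int(A.sum() / 2)

# Commute time C(u, v) = 2|E| * (L+_uu + L+_vv - 2 L+_uv), with L+ the Laplacian pseudoinverse.
L = np.diag(deg) - A
L_pinv = np.linalg.pinv(L)

def commute_time(u: int, v: int) -> float:
    return 2 * num_edges * (L_pinv[u, u] + L_pinv[v, v] - 2 * L_pinv[u, v])

# Crude sensitivity proxy: entry (u, v) of \hat{A}^k, the k-step normalized adjacency,
# which controls how much node v's input can affect node u's feature in a linear MPNN
# (an illustrative simplification of the Jacobian bounds in the paper).
A_hat = A / np.sqrt(np.outer(deg, deg))
k = 6
sensitivity = np.linalg.matrix_power(A_hat, k)

u, v_near, v_far = 0, 1, n - 1          # node 0: same clique as node 1, opposite clique from node 9
for v in (v_near, v_far):
    print(f"pair (0, {v}): commute time = {commute_time(u, v):7.2f}, "
          f"k-step sensitivity proxy = {sensitivity[u, v]:.4f}")
```

On this graph the cross-clique pair should show a much larger commute time and a smaller sensitivity proxy than the within-clique pair, which is the qualitative relationship the paper formalizes.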
Related papers
- Graph Elimination Networks [8.806990624643333]
Graph Neural Networks (GNNs) are widely applied across various domains, yet they perform poorly in deep layers.
We show that the root cause of GNNs' performance degradation in deep layers lies in ineffective neighborhood feature propagation.
We introduce Graph Elimination Networks (GENs), which employ a specific algorithm to eliminate redundancies during neighborhood propagation.
arXiv Detail & Related papers (2024-01-02T14:58:59Z)
- Over-Squashing in Riemannian Graph Neural Networks [1.6317061277457001]
Most graph neural networks (GNNs) are prone to the phenomenon of over-squashing.
Recent works have shown that the topology of the graph has the greatest impact on over-squashing.
We explore whether over-squashing can be mitigated through the embedding space of the GNN.
arXiv Detail & Related papers (2023-11-27T15:51:07Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator; a minimal dense sketch of the all-pair scheme appears after this list.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
- Is Rewiring Actually Helpful in Graph Neural Networks? [11.52174067809364]
We propose an evaluation setting based on message-passing models that do not require training to compute node and graph representations.
We perform a systematic experimental comparison on real-world node and graph classification tasks, showing that rewiring the underlying graph rarely confers a practical benefit for message passing.
arXiv Detail & Related papers (2023-05-31T10:12:23Z)
- Deep Graph Neural Networks via Posteriori-Sampling-based Node-Adaptive Residual Module [65.81781176362848]
Graph Neural Networks (GNNs) can learn from graph-structured data through neighborhood information aggregation.
As the number of layers increases, node representations become indistinguishable, which is known as over-smoothing.
We propose a Posterior-Sampling-based, Node-distinguishing Residual module (PSNR).
arXiv Detail & Related papers (2023-05-09T12:03:42Z)
- Understanding over-squashing and bottlenecks on graphs via curvature [17.359098638324546]
Over-squashing arises when the number of $k$-hop neighbors grows rapidly with $k$, forcing information from an exponentially large receptive field to be compressed into fixed-size node vectors.
We introduce a new edge-based curvature and prove that negatively curved edges are responsible for over-squashing.
We also propose and experimentally test a curvature-based rewiring method to alleviate over-squashing; a simplified sketch of such an edge curvature appears after this list.
arXiv Detail & Related papers (2021-11-29T13:27:56Z)
- Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthening continual learning in graph neural networks (GNNs).
At the heart of our approach is a generic module termed topology-aware weight preserving (TWP).
arXiv Detail & Related papers (2020-12-10T22:30:25Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute the performance deterioration of deeper models to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z)
- Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
Binarization inevitably causes severe information loss and, even worse, its discontinuity makes the deep network difficult to optimize.
We present a survey of these algorithms, categorized into native solutions that directly conduct binarization and optimized ones that use techniques such as minimizing the quantization error, improving the network loss function, and reducing the gradient error.
arXiv Detail & Related papers (2020-03-31T16:47:20Z)
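For the NodeFormer entry above, the all-pair scheme is easiest to see in its naive dense form: score every pair of nodes, sample a soft neighbour distribution per node with Gumbel-Softmax, and aggregate over all nodes. The sketch below (PyTorch) shows only that O(N^2) form; the paper's kernelized operator, which avoids materializing the N x N matrix, is not reproduced here, and the function name and scoring rule are illustrative assumptions.

```python
# Minimal dense sketch of all-pair message passing with Gumbel-Softmax edge sampling.
# This is the naive O(N^2) form; NodeFormer's kernelized operator avoids building the
# N x N matrix. Function name and scoring rule are illustrative assumptions.
import torch
import torch.nn.functional as F

def all_pair_message_passing(x: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """x: [N, d] node features; returns features aggregated over a sampled soft adjacency."""
    logits = x @ x.t() / x.shape[-1] ** 0.5          # [N, N] pairwise scores
    adj = F.gumbel_softmax(logits, tau=tau, dim=-1)  # differentiable row-wise edge sampling
    return adj @ x                                   # aggregate messages from all nodes

x = torch.randn(6, 16)
print(all_pair_message_passing(x).shape)             # torch.Size([6, 16])
```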
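For the curvature entry above, the sketch below computes a simplified triangle-based edge curvature in the spirit of the balanced Forman curvature of Topping et al., keeping only the degree and triangle terms and omitting the 4-cycle contribution; it is an illustrative approximation, not the paper's exact definition.

```python
# Simplified triangle-based edge curvature (degree and triangle terms only;
# the 4-cycle terms of the balanced Forman curvature are omitted). Assumes NetworkX.
import networkx as nx

def simple_edge_curvature(G: nx.Graph, i, j) -> float:
    d_i, d_j = G.degree(i), G.degree(j)
    triangles = len(set(G.neighbors(i)) & set(G.neighbors(j)))   # triangles on edge (i, j)
    return (2 / d_i + 2 / d_j - 2
            + 2 * triangles / max(d_i, d_j)
            + triangles / min(d_i, d_j))

G = nx.barbell_graph(5, 0)
for label, edge in [("bridge", (4, 5)), ("clique", (0, 1))]:
    print(label, edge, round(simple_edge_curvature(G, *edge), 3))
# The bridge edge comes out strongly negative while clique edges are positive,
# matching the intuition that negatively curved edges are where over-squashing occurs.
```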
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.