On Over-Squashing in Message Passing Neural Networks: The Impact of
Width, Depth, and Topology
- URL: http://arxiv.org/abs/2302.02941v3
- Date: Wed, 24 May 2023 11:42:49 GMT
- Title: On Over-Squashing in Message Passing Neural Networks: The Impact of
Width, Depth, and Topology
- Authors: Francesco Di Giovanni, Lorenzo Giusti, Federico Barbero, Giulia Luise,
Pietro Lio', Michael Bronstein
- Abstract summary: Message Passing Neural Networks (MPNNs) are instances of Graph Neural Networks that leverage the graph to send messages over the edges.
This inductive bias leads to a phenomenon known as over-squashing, where a node feature is insensitive to information contained at distant nodes.
Despite recent methods introduced to mitigate this issue, an understanding of the causes of over-squashing and of possible solutions is still lacking.
- Score: 4.809459273366461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Message Passing Neural Networks (MPNNs) are instances of Graph Neural
Networks that leverage the graph to send messages over the edges. This
inductive bias leads to a phenomenon known as over-squashing, where a node
feature is insensitive to information contained at distant nodes. Despite
recent methods introduced to mitigate this issue, an understanding of the
causes of over-squashing and of possible solutions is still lacking. In this
theoretical work, we prove that: (i) Neural network width can mitigate
over-squashing, but at the cost of making the whole network more sensitive;
(ii) Conversely, depth cannot help mitigate over-squashing: increasing the
number of layers leads to over-squashing being dominated by vanishing
gradients; (iii) The graph topology plays the greatest role, since
over-squashing occurs between nodes at high commute (access) time. Our analysis
provides a unified framework to study different recent methods introduced to
cope with over-squashing and serves as a justification for a class of methods
that fall under graph rewiring.
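Point (iii) can be probed numerically: the commute time between two nodes is read off the pseudoinverse of the graph Laplacian, and the entries of powers of the normalized adjacency give a crude proxy for how sensitive one node's feature is to another node's input after several rounds of message passing. Below is a minimal sketch of this comparison on a barbell graph (two cliques joined by a single bridge edge); it assumes NetworkX and NumPy, and the sensitivity proxy is an illustrative simplification, not the paper's exact Jacobian bound.

```python
# Minimal sketch: commute time vs. a crude message-passing sensitivity proxy
# on a bottlenecked graph. Assumes NetworkX and NumPy.
import networkx as nx
import numpy as np

G = nx.barbell_graph(5, 0)              # two 5-cliques joined by a single bridge edge
A = nx.to_numpy_array(G)
deg = A.sum(axis=1)
n, num_edges = A.shape[0], int(A.sum() / 2)

# Commute time C(u, v) = 2|E| * (L+_uu + L+_vv - 2 L+_uv), with L+ the Laplacian pseudoinverse.
L = np.diag(deg) - A
L_pinv = np.linalg.pinv(L)

def commute_time(u: int, v: int) -> float:
    return 2 * num_edges * (L_pinv[u, u] + L_pinv[v, v] - 2 * L_pinv[u, v])

# Crude sensitivity proxy: entry (u, v) of \hat{A}^k, the k-step normalized adjacency,
# which controls how much node v's input can affect node u's feature in a linear MPNN
# (an illustrative simplification of the Jacobian bounds in the paper).
A_hat = A / np.sqrt(np.outer(deg, deg))
k = 6
sensitivity = np.linalg.matrix_power(A_hat, k)

u, v_near, v_far = 0, 1, n - 1          # node 0: same clique as node 1, opposite clique from node 9
for v in (v_near, v_far):
    print(f"pair (0, {v}): commute time = {commute_time(u, v):7.2f}, "
          f"k-step sensitivity proxy = {sensitivity[u, v]:.4f}")
```

On this graph the cross-clique pair should show a much larger commute time and a smaller sensitivity proxy than the within-clique pair, which is the qualitative relationship the paper formalizes.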
Related papers
- Graph Elimination Networks [8.806990624643333]
Graph Neural Networks (GNNs) are widely applied across various domains, yet they perform poorly in deep layers.
We show that the root cause of GNNs' performance degradation in deep layers lies in ineffective neighborhood feature propagation.
We introduce Graph Elimination Networks (GENs), which employ a specific algorithm to eliminate redundancies during neighborhood propagation.
arXiv Detail & Related papers (2024-01-02T14:58:59Z)
- Over-Squashing in Riemannian Graph Neural Networks [1.6317061277457001]
Most graph neural networks (GNNs) are prone to the phenomenon of over-squashing.
Recent works have shown that the topology of the graph has the greatest impact on over-squashing.
We explore whether over-squashing can be mitigated through the embedding space of the GNN.
arXiv Detail & Related papers (2023-11-27T15:51:07Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator; a minimal dense sketch of the all-pair scheme appears after this list.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
- Is Rewiring Actually Helpful in Graph Neural Networks? [11.52174067809364]
We propose an evaluation setting based on message-passing models that do not require training to compute node and graph representations.
We perform a systematic experimental comparison on real-world node and graph classification tasks, showing that rewiring the underlying graph rarely confers a practical benefit for message passing.
arXiv Detail & Related papers (2023-05-31T10:12:23Z)
- Deep Graph Neural Networks via Posteriori-Sampling-based Node-Adaptive Residual Module [65.81781176362848]
Graph Neural Networks (GNNs) can learn from graph-structured data through neighborhood information aggregation.
As the number of layers increases, node representations become indistinguishable, which is known as over-smoothing.
We propose a Posterior-Sampling-based, Node-distinguishing Residual module (PSNR).
arXiv Detail & Related papers (2023-05-09T12:03:42Z)
- Understanding over-squashing and bottlenecks on graphs via curvature [17.359098638324546]
Over-squashing arises when the number of $k$-hop neighbors grows rapidly with $k$, forcing information from an exponentially large receptive field to be compressed into fixed-size node vectors.
We introduce a new edge-based curvature and prove that negatively curved edges are responsible for over-squashing.
We also propose and experimentally test a curvature-based rewiring method to alleviate over-squashing; a simplified sketch of such an edge curvature appears after this list.
arXiv Detail & Related papers (2021-11-29T13:27:56Z)
- Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthening continual learning in graph neural networks (GNNs).
At the heart of our approach is a generic module termed topology-aware weight preserving (TWP).
arXiv Detail & Related papers (2020-12-10T22:30:25Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute the performance deterioration of deeper models to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z)
- Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
Binarization inevitably causes severe information loss and, even worse, its discontinuity makes the deep network difficult to optimize.
We present a survey of these algorithms, categorized into native solutions that directly conduct binarization and optimized ones that use techniques such as minimizing the quantization error, improving the network loss function, and reducing the gradient error.
arXiv Detail & Related papers (2020-03-31T16:47:20Z)
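For the NodeFormer entry above, the all-pair scheme is easiest to see in its naive dense form: score every pair of nodes, sample a soft neighbour distribution per node with Gumbel-Softmax, and aggregate over all nodes. The sketch below (PyTorch) shows only that O(N^2) form; the paper's kernelized operator, which avoids materializing the N x N matrix, is not reproduced here, and the function name and scoring rule are illustrative assumptions.

```python
# Minimal dense sketch of all-pair message passing with Gumbel-Softmax edge sampling.
# This is the naive O(N^2) form; NodeFormer's kernelized operator avoids building the
# N x N matrix. Function name and scoring rule are illustrative assumptions.
import torch
import torch.nn.functional as F

def all_pair_message_passing(x: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """x: [N, d] node features; returns features aggregated over a sampled soft adjacency."""
    logits = x @ x.t() / x.shape[-1] ** 0.5          # [N, N] pairwise scores
    adj = F.gumbel_softmax(logits, tau=tau, dim=-1)  # differentiable row-wise edge sampling
    return adj @ x                                   # aggregate messages from all nodes

x = torch.randn(6, 16)
print(all_pair_message_passing(x).shape)             # torch.Size([6, 16])
```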
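For the curvature entry above, the sketch below computes a simplified triangle-based edge curvature in the spirit of the balanced Forman curvature of Topping et al., keeping only the degree and triangle terms and omitting the 4-cycle contribution; it is an illustrative approximation, not the paper's exact definition.

```python
# Simplified triangle-based edge curvature (degree and triangle terms only;
# the 4-cycle terms of the balanced Forman curvature are omitted). Assumes NetworkX.
import networkx as nx

def simple_edge_curvature(G: nx.Graph, i, j) -> float:
    d_i, d_j = G.degree(i), G.degree(j)
    triangles = len(set(G.neighbors(i)) & set(G.neighbors(j)))   # triangles on edge (i, j)
    return (2 / d_i + 2 / d_j - 2
            + 2 * triangles / max(d_i, d_j)
            + triangles / min(d_i, d_j))

G = nx.barbell_graph(5, 0)
for label, edge in [("bridge", (4, 5)), ("clique", (0, 1))]:
    print(label, edge, round(simple_edge_curvature(G, *edge), 3))
# The bridge edge comes out strongly negative while clique edges are positive,
# matching the intuition that negatively curved edges are where over-squashing occurs.
```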
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.