Universal Deep GNNs: Rethinking Residual Connection in GNNs from a Path
Decomposition Perspective for Preventing the Over-smoothing
- URL: http://arxiv.org/abs/2205.15127v1
- Date: Mon, 30 May 2022 14:19:45 GMT
- Title: Universal Deep GNNs: Rethinking Residual Connection in GNNs from a Path
Decomposition Perspective for Preventing the Over-smoothing
- Authors: Jie Chen, Weiqi Liu, Zhizhong Huang, Junbin Gao, Junping Zhang, Jian
Pu
- Abstract summary: Recent studies have shown that GNNs with residual connections only slightly slow down the degeneration.
In this paper, we investigate the forward and backward behavior of GNNs with residual connections from a novel path decomposition perspective.
We present a Universal Deep GNNs framework with cold-start adaptive residual connections (DRIVE) and feedforward modules.
- Score: 50.242926616772515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of GNNs degrades as they become deeper due to
over-smoothing. Among the attempts to prevent over-smoothing, the residual
connection is one of the most promising methods due to its simplicity. However,
recent studies have shown that GNNs with residual connections only slightly
slow down the degeneration. The reason why residual connections fail in GNNs is
still unknown. In this paper, we investigate the forward and backward behavior
of GNNs with residual connections from a novel path decomposition perspective.
We find that the recursive aggregation of the median-length paths from the
binomial distribution of residual connection paths dominates the output
representation, resulting in over-smoothing as GNNs go deeper. Entangled
propagation and weight matrices cause gradient smoothing and prevent GNNs with
residual connections from optimizing to the identity mapping. Based on these
findings, we present a Universal Deep GNNs (UDGNN) framework with cold-start
adaptive residual connections (DRIVE) and feedforward modules. Extensive
experiments demonstrate the effectiveness of our method, which achieves
state-of-the-art results over non-smooth heterophily datasets by simply
stacking standard GNNs.
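The path-decomposition claim above can be made concrete with a small numerical
sketch (an illustration under simplifying assumptions, not the authors' code:
weight matrices are dropped and a plain residual layer H_{l+1} = H_l + A_hat H_l
is assumed). Unrolling L such layers gives H_L = sum_k C(L, k) A_hat^k H_0, so
the contribution of k-hop propagation paths follows the binomial coefficients
and peaks at k ~ L/2, the median-length-path dominance described in the abstract.

    # Illustrative sketch only; the layer form, uniform weights, and names are
    # assumptions, not the authors' implementation.
    from math import comb
    import numpy as np

    L = 32                                          # depth of the residual GNN
    coeffs = np.array([comb(L, k) for k in range(L + 1)], dtype=float)
    weights = coeffs / coeffs.sum()                 # output share carried by k-hop paths

    # The mass concentrates around k = L/2: median-length paths dominate the
    # output, and A_hat^(L/2) H_0 is already heavily smoothed at this depth.
    print("peak path length:", int(np.argmax(weights)))        # ~ L/2
    mask = np.abs(np.arange(L + 1) - L / 2) <= L / 8
    print("output mass from near-median paths:", float(weights[mask].sum()))

Running the sketch shows the bulk of the output mass coming from paths whose
length is close to L/2, mirroring the over-smoothing mechanism the abstract
describes.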
Related papers
- Spiking Graph Neural Network on Riemannian Manifolds [51.15400848660023]
Graph neural networks (GNNs) have become the dominant solution for learning on graphs.
Existing spiking GNNs consider graphs in Euclidean space, ignoring the structural geometry.
We present a Manifold-valued Spiking GNN (MSG)
MSG achieves superior performance to previous spiking GNNs and better energy efficiency than conventional GNNs.
arXiv Detail & Related papers (2024-10-23T15:09:02Z)
- Bregman Graph Neural Network [27.64062763929748]
In node classification tasks, the smoothing effect induced by GNNs tends to assimilate representations and over-homogenize labels of connected nodes.
We propose a novel bilevel optimization framework for GNNs inspired by the notion of Bregman distance.
arXiv Detail & Related papers (2023-09-12T23:54:24Z)
- Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs [71.93227401463199]
This paper pinpoints the major source of GNNs' performance gain to their intrinsic generalization capability by introducing an intermediate model class dubbed P(ropagational)MLP.
We observe that PMLPs consistently perform on par with (or even exceed) their GNN counterparts, while being much more efficient in training.
arXiv Detail & Related papers (2022-12-18T08:17:32Z)
- Feature Overcorrelation in Deep Graph Neural Networks: A New Perspective [44.96635754139024]
Oversmoothing has been identified as one of the key issues which limit the performance of deep GNNs.
We propose a new perspective on the performance degradation of deep GNNs: feature overcorrelation.
To reduce feature correlation, we propose a general framework, DeCorr, which encourages GNNs to encode less redundant information.
arXiv Detail & Related papers (2022-06-15T18:13:52Z)
- Is Heterophily A Real Nightmare For Graph Neural Networks To Do Node Classification? [44.71818395535755]
Graph Neural Networks (GNNs) extend basic Neural Networks (NNs) by using the graph structure based on an inductive bias (the homophily assumption).
The performance advantages of GNNs over graph-agnostic NNs do not appear to be generally satisfactory.
Heterophily has been considered as a main cause and numerous works have been put forward to address it.
arXiv Detail & Related papers (2021-09-12T23:57:05Z)
- Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth [57.10183643449905]
Graph Neural Networks (GNNs) have been studied through the lens of expressive power and generalization.
We study the training dynamics of GNNs with skip connections and greater depth.
Our results provide the first theoretical support for the success of GNNs.
arXiv Detail & Related papers (2021-05-10T17:59:01Z)
- The Surprising Power of Graph Neural Networks with Random Node Initialization [54.4101931234922]
Graph neural networks (GNNs) are effective models for representation learning on relational data.
Standard GNNs are limited in their expressive power, as they cannot distinguish graphs beyond the capability of the Weisfeiler-Leman graph isomorphism test.
In this work, we analyze the expressive power of GNNs with random node initialization (RNI).
We prove that these models are universal, a first such result for GNNs not relying on computationally demanding higher-order properties.
arXiv Detail & Related papers (2020-10-02T19:53:05Z)
- Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks [60.22494363676747]
It is known that current graph neural networks (GNNs) are difficult to make deep due to the problem known as over-smoothing.
Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem.
We derive the optimization and generalization guarantees of transductive learning algorithms that include multi-scale GNNs.
arXiv Detail & Related papers (2020-06-15T17:06:17Z)
- On the Bottleneck of Graph Neural Networks and its Practical Implications [22.704284264177108]
We show that graph neural networks (GNNs) are susceptible to a bottleneck when aggregating messages across a long path.
This bottleneck causes the over-squashing of exponentially growing information into fixed-size vectors.
GNNs fail to propagate messages originating from distant nodes and perform poorly when the prediction task depends on long-range interaction.
arXiv Detail & Related papers (2020-06-09T12:04:50Z)
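The bottleneck described in the last entry is easy to quantify with a toy count
(the branching factor and hidden width below are assumptions for illustration,
not the cited paper's construction): the receptive field grows exponentially
with the number of hops, while the message vector that must summarize it stays
a fixed size.

    # Toy count of the over-squashing bottleneck: the k-hop receptive field
    # grows exponentially while the message vector stays fixed in size.
    # (Branching factor and hidden width are assumed for illustration.)
    hidden_dim = 64      # fixed-size node embedding / message vector
    branching = 2        # tree-structured graph, two children per node

    for radius in range(1, 11):
        receptive_field = sum(branching ** d for d in range(radius + 1))
        print(f"{radius:2d}-hop neighbourhood: {receptive_field:5d} nodes "
              f"compressed into a {hidden_dim}-dimensional vector")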
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.