Universal Deep GNNs: Rethinking Residual Connection in GNNs from a Path
Decomposition Perspective for Preventing the Over-smoothing
- URL: http://arxiv.org/abs/2205.15127v1
- Date: Mon, 30 May 2022 14:19:45 GMT
- Title: Universal Deep GNNs: Rethinking Residual Connection in GNNs from a Path
Decomposition Perspective for Preventing the Over-smoothing
- Authors: Jie Chen, Weiqi Liu, Zhizhong Huang, Junbin Gao, Junping Zhang, Jian
Pu
- Abstract summary: Recent studies have shown that GNNs with residual connections only slightly slow down the degeneration.
In this paper, we investigate the forward and backward behavior of GNNs with residual connections from a novel path decomposition perspective.
We present a Universal Deep GNNs framework with cold-start adaptive residual connections (DRIVE) and feedforward modules.
- Score: 50.242926616772515
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The performance of GNNs degrades as they become deeper due to
over-smoothing. Among the attempts to prevent over-smoothing, the residual
connection is one of the most promising methods due to its simplicity. However,
recent studies have shown that GNNs with residual connections only slightly
slow down the degeneration. The reason why residual connections fail in GNNs is
still unknown. In this paper, we investigate the forward and backward behavior
of GNNs with residual connections from a novel path decomposition perspective.
We find that the recursive aggregation of the median-length paths from the
binomial distribution of residual connection paths dominates the output
representation, resulting in over-smoothing as GNNs go deeper. Entangled
propagation and weight matrices cause gradient smoothing and prevent GNNs with
residual connections from optimizing to the identity mapping. Based on these
findings, we present a Universal Deep GNNs (UDGNN) framework with cold-start
adaptive residual connections (DRIVE) and feedforward modules. Extensive
experiments demonstrate the effectiveness of our method, which achieves
state-of-the-art results over non-smooth heterophily datasets by simply
stacking standard GNNs.
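The path-decomposition claim above can be made concrete with a small numerical
sketch (an illustration under simplifying assumptions, not the authors' code:
weight matrices are dropped and a plain residual layer H_{l+1} = H_l + A_hat H_l
is assumed). Unrolling L such layers gives H_L = sum_k C(L, k) A_hat^k H_0, so
the contribution of k-hop propagation paths follows the binomial coefficients
and peaks at k ~ L/2, the median-length-path dominance described in the abstract.

    # Illustrative sketch only; the layer form, uniform weights, and names are
    # assumptions, not the authors' implementation.
    from math import comb
    import numpy as np

    L = 32                                          # depth of the residual GNN
    coeffs = np.array([comb(L, k) for k in range(L + 1)], dtype=float)
    weights = coeffs / coeffs.sum()                 # output share carried by k-hop paths

    # The mass concentrates around k = L/2: median-length paths dominate the
    # output, and A_hat^(L/2) H_0 is already heavily smoothed at this depth.
    print("peak path length:", int(np.argmax(weights)))        # ~ L/2
    mask = np.abs(np.arange(L + 1) - L / 2) <= L / 8
    print("output mass from near-median paths:", float(weights[mask].sum()))

Running the sketch shows the bulk of the output mass coming from paths whose
length is close to L/2, mirroring the over-smoothing mechanism the abstract
describes.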
Related papers
- Spiking Graph Neural Network on Riemannian Manifolds [51.15400848660023]
Graph neural networks (GNNs) have become the dominant solution for learning on graphs.
Existing spiking GNNs consider graphs in Euclidean space, ignoring the structural geometry.
We present a Manifold-valued Spiking GNN (MSG)
MSG achieves superior performance to previous spiking GNNs and better energy efficiency than conventional GNNs.
arXiv Detail & Related papers (2024-10-23T15:09:02Z)
- Bregman Graph Neural Network [27.64062763929748]
In node classification tasks, the smoothing effect induced by GNNs tends to assimilate representations and over-homogenize labels of connected nodes.
We propose a novel bilevel optimization framework for GNNs inspired by the notion of Bregman distance.
arXiv Detail & Related papers (2023-09-12T23:54:24Z)
- Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs [71.93227401463199]
This paper pinpoints the major source of GNNs' performance gain to their intrinsic generalization capability by introducing an intermediate model class dubbed P(ropagational)MLP.
We observe that PMLPs consistently perform on par with (or even exceed) their GNN counterparts, while being much more efficient in training.
arXiv Detail & Related papers (2022-12-18T08:17:32Z)
- Feature Overcorrelation in Deep Graph Neural Networks: A New Perspective [44.96635754139024]
Oversmoothing has been identified as one of the key issues which limit the performance of deep GNNs.
We propose a new perspective on the performance degradation of deep GNNs: feature overcorrelation.
To reduce feature correlation, we propose a general framework, DeCorr, which encourages GNNs to encode less redundant information.
arXiv Detail & Related papers (2022-06-15T18:13:52Z)
- Is Heterophily A Real Nightmare For Graph Neural Networks To Do Node Classification? [44.71818395535755]
Graph Neural Networks (GNNs) extend basic Neural Networks (NNs) by using the graph structure based on an inductive bias (the homophily assumption).
The performance advantages of GNNs over graph-agnostic NNs do not appear to be generally satisfactory.
Heterophily has been considered as a main cause and numerous works have been put forward to address it.
arXiv Detail & Related papers (2021-09-12T23:57:05Z)
- Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth [57.10183643449905]
Graph Neural Networks (GNNs) have been studied through the lens of expressive power and generalization.
We study the training dynamics of GNNs with skip connections and greater depth.
Our results provide the first theoretical support for the success of GNNs.
arXiv Detail & Related papers (2021-05-10T17:59:01Z)
- The Surprising Power of Graph Neural Networks with Random Node Initialization [54.4101931234922]
Graph neural networks (GNNs) are effective models for representation learning on relational data.
Standard GNNs are limited in their expressive power, as they cannot distinguish graphs beyond the capability of the Weisfeiler-Leman graph isomorphism test.
In this work, we analyze the expressive power of GNNs with random node initialization (RNI).
We prove that these models are universal, a first such result for GNNs not relying on computationally demanding higher-order properties.
arXiv Detail & Related papers (2020-10-02T19:53:05Z)
- Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks [60.22494363676747]
It is known that current graph neural networks (GNNs) are difficult to make deep due to the problem known as over-smoothing.
Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem.
We derive the optimization and generalization guarantees of transductive learning algorithms that include multi-scale GNNs.
arXiv Detail & Related papers (2020-06-15T17:06:17Z)
- On the Bottleneck of Graph Neural Networks and its Practical Implications [22.704284264177108]
We show that graph neural networks (GNNs) are susceptible to a bottleneck when aggregating messages across a long path.
This bottleneck causes the over-squashing of exponentially growing information into fixed-size vectors.
GNNs fail to propagate messages originating from distant nodes and perform poorly when the prediction task depends on long-range interaction.
arXiv Detail & Related papers (2020-06-09T12:04:50Z)
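The bottleneck described in the last entry is easy to quantify with a toy count
(the branching factor and hidden width below are assumptions for illustration,
not the cited paper's construction): the receptive field grows exponentially
with the number of hops, while the message vector that must summarize it stays
a fixed size.

    # Toy count of the over-squashing bottleneck: the k-hop receptive field
    # grows exponentially while the message vector stays fixed in size.
    # (Branching factor and hidden width are assumed for illustration.)
    hidden_dim = 64      # fixed-size node embedding / message vector
    branching = 2        # tree-structured graph, two children per node

    for radius in range(1, 11):
        receptive_field = sum(branching ** d for d in range(radius + 1))
        print(f"{radius:2d}-hop neighbourhood: {receptive_field:5d} nodes "
              f"compressed into a {hidden_dim}-dimensional vector")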
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.