Graph Neural Networks Do Not Always Oversmooth
- URL: http://arxiv.org/abs/2406.02269v2
- Date: Fri, 15 Nov 2024 11:41:56 GMT
- Title: Graph Neural Networks Do Not Always Oversmooth
- Authors: Bastian Epping, Alexandre René, Moritz Helias, Michael T. Schaub
- Abstract summary: We study oversmoothing in graph convolutional networks (GCNs) by using their Gaussian process (GP) equivalence in the limit of infinitely many hidden features.
We identify a new, non-oversmoothing phase: if the initial weights of the network have sufficiently large variance, GCNs do not oversmooth, and node features remain informative even at large depth.
- Abstract: Graph neural networks (GNNs) have emerged as powerful tools for processing relational data in applications. However, GNNs suffer from the problem of oversmoothing, the property that the features of all nodes exponentially converge to the same vector over layers, prohibiting the design of deep GNNs. In this work we study oversmoothing in graph convolutional networks (GCNs) by using their Gaussian process (GP) equivalence in the limit of infinitely many hidden features. By generalizing methods from conventional deep neural networks (DNNs), we can describe the distribution of features at the output layer of deep GCNs in terms of a GP: as expected, we find that typical parameter choices from the literature lead to oversmoothing. The theory, however, allows us to identify a new, non-oversmoothing phase: if the initial weights of the network have sufficiently large variance, GCNs do not oversmooth, and node features remain informative even at large depth. We demonstrate the validity of this prediction in finite-size GCNs by training a linear classifier on their output. Moreover, using the linearization of the GCN GP, we generalize the concept of propagation depth of information from DNNs to GCNs. This propagation depth diverges at the transition between the oversmoothing and non-oversmoothing phase. We test the predictions of our approach and find good agreement with finite-size GCNs. Initializing GCNs near the transition to the non-oversmoothing phase, we obtain networks which are both deep and expressive.
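To make the abstract's central claim concrete, here is a minimal, self-contained simulation sketch (not the authors' code): a randomly initialised tanh GCN on a toy ring graph, tracking the mean pairwise distance between node features over depth for a small and a large weight variance. The graph, the depth/width values, the variance choices and the distance metric are illustrative assumptions; the critical variance separating the two phases depends on the graph and is precisely what the paper characterises via the GP limit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: a ring of n nodes with self-loops, using the standard symmetric
# GCN normalisation A_hat = D^{-1/2} (A + I) D^{-1/2}.
n = 20
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
A += np.eye(n)
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))

def mean_pairwise_distance(X):
    """Mean Euclidean distance between node feature vectors (0 = fully smoothed)."""
    diffs = X[:, None, :] - X[None, :, :]
    return np.sqrt((diffs ** 2).sum(-1)).mean()

def run_random_gcn(sigma_w, depth=50, width=256):
    """Random tanh GCN with i.i.d. N(0, sigma_w^2 / width) weights in each layer."""
    X = rng.normal(size=(n, width))           # random input features
    history = []
    for _ in range(depth):
        W = rng.normal(scale=sigma_w / np.sqrt(width), size=(width, width))
        X = np.tanh(A_hat @ X @ W)
        history.append(mean_pairwise_distance(X))
    return history

for sigma_w in (1.0, 4.0):                    # small vs. large weight variance
    h = run_random_gcn(sigma_w)
    print(f"sigma_w = {sigma_w}: distance at layer 10 = {h[9]:.3f}, "
          f"at layer 50 = {h[-1]:.3f}")
```

If the mean pairwise distance decays towards zero with depth, that run is in the oversmoothing phase; if it settles at a clearly non-zero value, node features remain distinguishable at large depth, which is the behaviour the abstract describes for sufficiently large initial weight variance.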
Related papers
- Spiking Graph Neural Network on Riemannian Manifolds [51.15400848660023]
Graph neural networks (GNNs) have become the dominant solution for learning on graphs.
Existing spiking GNNs consider graphs in Euclidean space, ignoring the structural geometry.
We present a Manifold-valued Spiking GNN (MSG).
MSG achieves superior performance to previous spiking GNNs and better energy efficiency than conventional GNNs.
arXiv Detail & Related papers (2024-10-23T15:09:02Z) - Superiority of GNN over NN in generalizing bandlimited functions [6.3151583550712065]
Graph Neural Networks (GNNs) have emerged as formidable resources for processing graph-based information across diverse applications.
In this study, we investigate the proficiency of GNNs for such classification tasks, which can also be cast as a function-approximation problem.
Our findings highlight a pronounced efficiency in utilizing GNNs to generalize a bandlimited function within an $\varepsilon$-error margin.
arXiv Detail & Related papers (2022-06-13T05:15:12Z) - New Insights into Graph Convolutional Networks using Neural Tangent Kernels [8.824340350342512]
This paper focuses on semi-supervised learning on graphs and explains empirically observed depth effects in GCNs through the lens of Neural Tangent Kernels (NTKs).
We derive NTKs corresponding to infinitely wide GCNs (with and without skip connections).
We use the derived NTKs to identify that, with suitable normalisation, network depth does not always drastically reduce the performance of GCNs.
arXiv Detail & Related papers (2021-10-08T15:36:52Z) - Enhance Information Propagation for Graph Neural Network by Heterogeneous Aggregations [7.3136594018091134]
Graph neural networks are emerging as a continuation of deep learning's success on graph data.
We propose HAG-Net, which enhances information propagation among GNN layers by combining heterogeneous aggregations.
We empirically validate the effectiveness of HAG-Net on a number of graph classification benchmarks.
arXiv Detail & Related papers (2021-02-08T08:57:56Z) - Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency of a neural network to "forget" previously learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthening continual learning in graph neural networks (GNNs).
At the heart of our approach is a generic module, termed topology-aware weight preserving (TWP).
arXiv Detail & Related papers (2020-12-10T22:30:25Z) - A Unified View on Graph Neural Networks as Graph Signal Denoising [49.980783124401555]
Graph Neural Networks (GNNs) have risen to prominence in learning representations for graph structured data.
In this work, we establish mathematically that the aggregation processes in a group of representative GNN models can be regarded as solving a graph denoising problem (a minimal worked form of this connection is sketched after this list).
We instantiate a novel GNN model, ADA-UGNN, derived from the resulting unified framework (UGNN), to handle graphs with adaptive smoothness across nodes.
arXiv Detail & Related papers (2020-10-05T04:57:18Z) - DeeperGCN: All You Need to Train Deeper GCNs [66.64739331859226]
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Unlike Convolutional Neural Networks (CNNs), which are able to take advantage of stacking very deep layers, GCNs suffer from vanishing gradient, over-smoothing and over-fitting issues when going deeper.
This paper proposes DeeperGCN, which is capable of reliably training very deep GCNs.
arXiv Detail & Related papers (2020-06-13T23:00:22Z) - Predicting the outputs of finite deep neural networks trained with noisy gradients [1.1470070927586014]
A recent line of work studied wide deep neural networks (DNNs) by approximating them as Gaussian Processes (GPs).
Here we consider a DNN training protocol involving noise, weight decay and finite width, whose outcome corresponds to a certain non-Gaussian process.
An analytical framework is then introduced to analyze this non-Gaussian process, whose deviation from a GP is controlled by the finite width.
arXiv Detail & Related papers (2020-04-02T18:00:01Z) - Infinitely Wide Graph Convolutional Networks: Semi-supervised Learning via Gaussian Processes [144.6048446370369]
Graph convolutional neural networks (GCNs) have recently demonstrated promising results on graph-based semi-supervised classification.
We propose a GP regression model via GCNs (GPGC) for graph-based semi-supervised learning.
We conduct extensive experiments to evaluate GPGC and demonstrate that it outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2020-02-26T10:02:32Z) - Generalization and Representational Limits of Graph Neural Networks [46.20253808402385]
We prove that several important graph properties cannot be computed by graph neural networks (GNNs) that rely entirely on local information.
We provide the first data dependent generalization bounds for message passing GNNs.
Our bounds are much tighter than existing VC-dimension based guarantees for GNNs, and are comparable to Rademacher bounds for recurrent neural networks.
arXiv Detail & Related papers (2020-02-14T18:10:14Z)
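On the graph-signal-denoising entry above: a minimal worked form of that connection, in one standard formulation that may differ in detail from the cited paper. Here $X$ is the matrix of input node features, $F$ the recovered signal, $\tilde{A}$ the normalised adjacency, and $c > 0$ a smoothness weight; the step size below is my illustrative choice.

```latex
% Graph smoothing as regularised denoising; \tilde{L} = I - \tilde{A} is the
% normalised graph Laplacian.
\min_{F}\; \mathcal{L}(F)
  = \lVert F - X \rVert_F^2 + c\,\operatorname{tr}\!\bigl(F^\top \tilde{L} F\bigr)

% One gradient-descent step from F = X with step size 1/(2(1+c)):
\nabla_F \mathcal{L}\big|_{F=X} = 2c\,\tilde{L}X
\quad\Longrightarrow\quad
F \leftarrow X - \tfrac{c}{1+c}\,\tilde{L}X
  = \tfrac{1}{1+c}\,X + \tfrac{c}{1+c}\,\tilde{A}X
  \;\xrightarrow{\;c\to\infty\;}\; \tilde{A}X
```

The limiting update $\tilde{A}X$ is the familiar GCN aggregation step (up to learned weights and the nonlinearity), which is the sense in which aggregation can be read as solving a graph denoising problem.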