Wide Graph Neural Networks: Aggregation Provably Leads to Exponentially
Trainability Loss
- URL: http://arxiv.org/abs/2103.03113v1
- Date: Wed, 3 Mar 2021 11:06:12 GMT
- Title: Wide Graph Neural Networks: Aggregation Provably Leads to Exponentially
Trainability Loss
- Authors: Wei Huang, Yayong Li, Weitao Du, Richard Yi Da Xu, Jie Yin, and Ling
Chen
- Abstract summary: Graph convolutional networks (GCNs) and their variants have achieved great success in dealing with graph-structured data.
It is well known that deep GCNs suffer from the over-smoothing problem.
Few theoretical analyses have been conducted to study the expressivity and trainability of deep GCNs.
- Score: 17.39060566854841
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph convolutional networks (GCNs) and their variants have achieved great
success in dealing with graph-structured data. However, it is well known that
deep GCNs suffer from the over-smoothing problem, where node representations
tend to become indistinguishable as more layers are stacked. Although extensive
research has confirmed this prevailing understanding, few theoretical analyses
have been conducted to study the expressivity and trainability of deep GCNs. In
this work, we obtain both characterizations by studying the Gaussian
Process Kernel (GPK) and Graph Neural Tangent Kernel (GNTK) of an
infinitely-wide GCN, corresponding to the analysis of expressivity and
trainability, respectively. We first prove that the expressivity of
infinitely-wide GCNs decays at an exponential rate by applying mean-field
theory to the GPK. We then derive the asymptotic behavior of the GNTK in the
large-depth limit, which reveals that the trainability of wide and deep GCNs
also drops at an exponential rate. Additionally, we extend our theoretical
framework to analyze residual-connection-like techniques. We find that these
techniques can mildly mitigate the exponential decay, but they fail to overcome
it fundamentally. Finally, all theoretical results in this work are corroborated
experimentally on a variety of graph-structured datasets.
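To make the two kernel recursions concrete, here is a minimal numerical sketch (not the authors' code) of the infinite-width GPK and GNTK of a vanilla ReLU GCN on a toy random graph. The graph, feature dimension, weight variance `sigma_w2`, depth, and the "spread" measure of node distinguishability are illustrative assumptions, not the paper's exact setup; if the analysis holds, the printed spreads should shrink roughly geometrically with depth.

```python
# Minimal sketch: GPK (expressivity) and GNTK (trainability) recursions for a
# vanilla ReLU GCN, H^{(l)} = ReLU(A_hat @ H^{(l-1)} @ W^{(l)}), in the
# infinite-width limit. Toy graph and hyperparameters are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)

# Toy random graph with self-loops and symmetric normalization.
n, d, sigma_w2, depth = 20, 16, 2.0, 30
A = (rng.random((n, n)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T
A_hat = A + np.eye(n)
deg = A_hat.sum(1)
A_hat = A_hat / np.sqrt(np.outer(deg, deg))

X = rng.standard_normal((n, d))

def relu_kernels(K):
    """Arc-cosine maps for ReLU: E[relu(u)relu(v)] and E[relu'(u)relu'(v)]."""
    diag = np.sqrt(np.clip(np.diag(K), 1e-12, None))
    corr = np.clip(K / np.outer(diag, diag), -1.0, 1.0)
    theta = np.arccos(corr)
    T = np.outer(diag, diag) * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)
    Td = (np.pi - theta) / (2 * np.pi)
    return T, Td

def spread(K):
    """Node distinguishability: spread of off-diagonal correlation entries."""
    diag = np.sqrt(np.clip(np.diag(K), 1e-12, None))
    C = K / np.outer(diag, diag)
    return C[np.triu_indices(n, 1)].std()

# Layer 1: pre-activation covariance (GPK) and tangent kernel (GNTK).
K = sigma_w2 * A_hat @ (X @ X.T / d) @ A_hat.T
Theta = K.copy()

for l in range(2, depth + 1):
    T, Td = relu_kernels(K)
    K_new = sigma_w2 * A_hat @ T @ A_hat.T
    Theta = K_new + sigma_w2 * A_hat @ (Td * Theta) @ A_hat.T
    K = K_new
    if l % 5 == 0:
        print(f"depth {l:3d}  GPK spread {spread(K):.3e}  GNTK spread {spread(Theta):.3e}")
```

Replacing the propagation step with a residual-style update (for example, mixing `A_hat` with the identity) in this sketch is one simple way to probe the paper's claim that such techniques only slow, rather than remove, the exponential decay.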
Related papers
- Spiking Graph Neural Network on Riemannian Manifolds [51.15400848660023]
Graph neural networks (GNNs) have become the dominant solution for learning on graphs.
Existing spiking GNNs consider graphs in Euclidean space, ignoring the structural geometry.
We present a Manifold-valued Spiking GNN (MSG).
MSG achieves superior performance to previous spiking GNNs and better energy efficiency than conventional GNNs.
arXiv Detail & Related papers (2024-10-23T15:09:02Z)
- Bridging Smoothness and Approximation: Theoretical Insights into Over-Smoothing in Graph Neural Networks [12.001676605529626]
We explore the approximation theory of functions defined on graphs.
We establish a framework to assess the lower bounds of approximation for target functions using Graph Convolutional Networks (GCNs).
We show how the high-frequency energy of the output, an indicator of over-smoothing, decays in GCNs.
arXiv Detail & Related papers (2024-07-01T13:35:53Z)
- A Manifold Perspective on the Statistical Generalization of Graph Neural Networks [84.01980526069075]
We take a manifold perspective to establish the statistical generalization theory of GNNs on graphs sampled from a manifold in the spectral domain.
We prove that the generalization bounds of GNNs decrease linearly with the logarithm of the graph size, and increase linearly with the spectral continuity constants of the filter functions.
arXiv Detail & Related papers (2024-06-07T19:25:02Z)
- Graph Neural Networks Do Not Always Oversmooth [46.57665708260211]
We study oversmoothing in graph convolutional networks (GCNs) by using their Gaussian process (GP) equivalence in the limit of infinitely many hidden features.
We identify a new, non-oversmoothing phase: if the initial weights of the network have sufficiently large variance, GCNs do not oversmooth, and node features remain informative even at large depth.
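(A small numerical probe of this weight-variance effect, based on the infinite-width GP recursion, is sketched after this list.)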
arXiv Detail & Related papers (2024-06-04T12:47:13Z)
- Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling [83.77955213766896]
Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graph-structured data.
To address their scalability issue, graph topology sampling has been proposed to reduce the memory and computational cost of training GCNs.
This paper provides the first theoretical justification of graph topology sampling in training (up to) three-layer GCNs.
arXiv Detail & Related papers (2022-07-07T21:25:55Z)
- Node Feature Kernels Increase Graph Convolutional Network Robustness [19.076912727990326]
The robustness of Graph Convolutional Networks (GCNs) to perturbations of their input is becoming a topic of increasing importance.
In this paper, a random matrix theory analysis of this setting is presented.
It is observed that enhancing the message-passing step in GCNs, by adding the node feature kernel to the adjacency matrix of the graph, addresses this robustness issue.
arXiv Detail & Related papers (2021-09-04T04:20:45Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute the performance deterioration of deeper models to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z)
- DeeperGCN: All You Need to Train Deeper GCNs [66.64739331859226]
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Unlike Convolutional Neural Networks (CNNs), which are able to take advantage of stacking very deep layers, GCNs suffer from vanishing gradient, over-smoothing and over-fitting issues when going deeper.
This paper proposes DeeperGCN that is capable of successfully and reliably training very deep GCNs.
arXiv Detail & Related papers (2020-06-13T23:00:22Z)
- Infinitely Wide Graph Convolutional Networks: Semi-supervised Learning via Gaussian Processes [144.6048446370369]
Graph convolutional neural networks (GCNs) have recently demonstrated promising results on graph-based semi-supervised classification.
We propose a GP regression model via GCNs (GPGC) for graph-based semi-supervised learning.
We conduct extensive experiments to evaluate GPGC and demonstrate that it outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2020-02-26T10:02:32Z)
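For the GPGC entry just above, the following is a minimal, hypothetical sketch of the GP-regression step for semi-supervised node labeling: it builds a generic two-layer GCN NNGP kernel and computes the usual GP posterior mean on the unlabeled nodes. The kernel depth, the `noise` level, and the synthetic two-cluster graph are illustrative assumptions; GPGC's actual kernel follows the architecture described in that paper.

```python
# Hypothetical sketch: GP regression with a GCN-style NNGP kernel for
# semi-supervised node classification (inspired by, not reproducing, GPGC).
import numpy as np

rng = np.random.default_rng(1)

# Synthetic two-cluster graph with denser intra-cluster connectivity.
n, d = 40, 8
labels = np.array([0] * 20 + [1] * 20)
prob = np.where(labels[:, None] == labels[None, :], 0.3, 0.05)
A = (rng.random((n, n)) < prob).astype(float)
A = np.triu(A, 1); A = A + A.T
A_hat = A + np.eye(n)
deg = A_hat.sum(1)
A_hat = A_hat / np.sqrt(np.outer(deg, deg))

# Node features: noisy cluster indicators.
X = rng.standard_normal((n, d)) + 2.0 * labels[:, None]

def relu_T(K):
    """ReLU arc-cosine kernel map E[relu(u) relu(v)] for (u, v) ~ N(0, K)."""
    diag = np.sqrt(np.clip(np.diag(K), 1e-12, None))
    corr = np.clip(K / np.outer(diag, diag), -1.0, 1.0)
    theta = np.arccos(corr)
    return np.outer(diag, diag) * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

# Two-layer GCN NNGP kernel (weight variance 2.0, no bias term).
K = 2.0 * A_hat @ (X @ X.T / d) @ A_hat.T
K = 2.0 * A_hat @ relu_T(K) @ A_hat.T

# GP regression: a few labeled nodes per class, predict the rest.
train = np.array([0, 1, 2, 20, 21, 22])
test = np.setdiff1d(np.arange(n), train)
y_train = 2.0 * labels[train] - 1.0          # +/-1 targets
noise = 1e-2

# Posterior mean: K_*l (K_ll + noise * I)^{-1} y_l
alpha = np.linalg.solve(K[np.ix_(train, train)] + noise * np.eye(len(train)), y_train)
pred = K[np.ix_(test, train)] @ alpha
acc = np.mean((pred > 0) == (labels[test] == 1))
print(f"test accuracy of GP posterior mean: {acc:.2f}")
```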
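As referenced in the "Graph Neural Networks Do Not Always Oversmooth" entry above, the short probe below tracks how node-feature correlations evolve with depth in the infinite-width GCN-GP recursion for several weight variances, using an error-function nonlinearity whose GP kernel map has a closed form. Whether a non-oversmoothing regime appears depends on the graph and the variances chosen; this is an exploratory sketch under those assumptions, not a reproduction of that paper's result.

```python
# Exploratory sketch: depth behaviour of node correlations in the GCN
# Gaussian-process limit for several weight variances sigma_w2, using the
# erf nonlinearity (closed-form GP kernel map). Illustrative setup only.
import numpy as np

rng = np.random.default_rng(2)

n, d, depth = 20, 16, 20
A = (rng.random((n, n)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T
A_hat = A + np.eye(n)
deg = A_hat.sum(1)
A_hat = A_hat / np.sqrt(np.outer(deg, deg))
X = rng.standard_normal((n, d))

def erf_T(K):
    """E[erf(u) erf(v)] for (u, v) ~ N(0, K): (2/pi) arcsin(2K_uv / sqrt((1+2K_uu)(1+2K_vv)))."""
    diag = 1.0 + 2.0 * np.diag(K)
    return (2.0 / np.pi) * np.arcsin(np.clip(2.0 * K / np.sqrt(np.outer(diag, diag)), -1.0, 1.0))

def corr_spread(K):
    """Spread of off-diagonal correlations: near zero means nodes look alike."""
    diag = np.sqrt(np.clip(np.diag(K), 1e-12, None))
    C = K / np.outer(diag, diag)
    return C[np.triu_indices(n, 1)].std()

for sigma_w2 in (0.5, 2.0, 8.0, 32.0):
    K = sigma_w2 * A_hat @ (X @ X.T / d) @ A_hat.T
    for _ in range(depth - 1):
        K = sigma_w2 * A_hat @ erf_T(K) @ A_hat.T
    print(f"sigma_w2={sigma_w2:5.1f}  correlation spread at depth {depth}: {corr_spread(K):.3e}")
```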