Wide Graph Neural Networks: Aggregation Provably Leads to Exponentially
Trainability Loss
- URL: http://arxiv.org/abs/2103.03113v1
- Date: Wed, 3 Mar 2021 11:06:12 GMT
- Title: Wide Graph Neural Networks: Aggregation Provably Leads to Exponentially
Trainability Loss
- Authors: Wei Huang, Yayong Li, Weitao Du, Richard Yi Da Xu, Jie Yin, and Ling
Chen
- Abstract summary: Graph convolutional networks (GCNs) and their variants have achieved great success in dealing with graph-structured data.
It is well known that deep GCNs suffer from the over-smoothing problem.
Few theoretical analyses have been conducted to study the expressivity and trainability of deep GCNs.
- Score: 17.39060566854841
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph convolutional networks (GCNs) and their variants have achieved great
success in dealing with graph-structured data. However, it is well known that
deep GCNs suffer from the over-smoothing problem, where node representations
tend to become indistinguishable as more layers are stacked. Although extensive
research has confirmed this prevailing understanding, few theoretical analyses
have been conducted to study the expressivity and trainability of deep GCNs. In
this work, we demonstrate these characterizations by studying the Gaussian
Process Kernel (GPK) and Graph Neural Tangent Kernel (GNTK) of an
infinitely-wide GCN, corresponding to the analysis on expressivity and
trainability, respectively. We first prove that the expressivity of
infinitely-wide GCNs decays at an exponential rate, by applying mean-field
theory to the GPK. We then formulate the asymptotic behavior of the GNTK at
large depth, which reveals that the trainability of wide and deep GCNs also
drops at an exponential rate. Additionally, we extend our theoretical framework
to analyze techniques resembling residual connections. We find that these
techniques can mildly mitigate the exponential decay but fail to overcome it
fundamentally. Finally, all theoretical results in this work are corroborated
experimentally on a variety of graph-structured datasets.
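The over-smoothing behavior described in the abstract can be illustrated with a small numerical sketch. This is a hedged toy example, not the paper's GPK/GNTK analysis: repeatedly applying the symmetrically normalized aggregation operator to random node features drives the node representations together, so their pairwise distances shrink rapidly with depth. The graph, feature dimension, and depths below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small random graph with self-loops (illustrative only).
n = 8
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)          # symmetrize
np.fill_diagonal(A, 1.0)        # add self-loops

# Symmetrically normalized aggregation operator D^{-1/2} A D^{-1/2}.
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))

def mean_pairwise_distance(H):
    """Average distance between all pairs of node representations."""
    return np.mean([np.linalg.norm(H[i] - H[j])
                    for i in range(n) for j in range(i + 1, n)])

X = rng.standard_normal((n, 4))  # random node features
for depth in (1, 5, 20):
    H = np.linalg.matrix_power(A_hat, depth) @ X
    print(f"depth {depth:2d}: "
          f"mean pairwise distance {mean_pairwise_distance(H):.4f}")
```

As depth grows, the aggregation operator converges to a rank-one projection, so all rows of the feature matrix approach multiples of a common vector and the spread collapses.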
Related papers
- Bridging Smoothness and Approximation: Theoretical Insights into Over-Smoothing in Graph Neural Networks [12.001676605529626]
We explore the approximation theory of functions defined on graphs.
We establish a framework to assess the lower bounds of approximation for target functions using Graph Convolutional Networks (GCNs).
We show how the high-frequency energy of the output decays, an indicator of over-smoothing, in GCNs.
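The high-frequency-energy indicator mentioned above can be sketched numerically. This is an illustrative toy, not that paper's derivation: track the ratio trace(H^T L H) / trace(H^T H), with L the symmetrically normalized Laplacian, under repeated graph convolution; it shrinks toward zero with depth, i.e. the output becomes increasingly smooth. The graph and sizes are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical small random graph with self-loops (illustrative only).
n = 8
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))   # D^{-1/2} A D^{-1/2}
L = np.eye(n) - A_hat                 # symmetrically normalized Laplacian

def high_freq_energy(H):
    """Rayleigh-quotient-style ratio trace(H^T L H) / trace(H^T H)."""
    return np.trace(H.T @ L @ H) / np.trace(H.T @ H)

X = rng.standard_normal((n, 4))
for depth in (0, 5, 20):
    H = np.linalg.matrix_power(A_hat, depth) @ X
    print(f"depth {depth:2d}: high-frequency energy {high_freq_energy(H):.4f}")
```

Every Laplacian eigenmode except the smoothest one is damped by repeated aggregation, so the energy ratio concentrates on the low-frequency component.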
arXiv Detail & Related papers (2024-07-01T13:35:53Z)
- A Manifold Perspective on the Statistical Generalization of Graph Neural Networks [84.01980526069075]
Graph Neural Networks (GNNs) combine information from adjacent nodes by successive applications of graph convolutions.
We study the generalization gaps of GNNs on both node-level and graph-level tasks.
We show that the generalization gaps decrease with the number of nodes in the training graphs.
arXiv Detail & Related papers (2024-06-07T19:25:02Z)
- Graph Neural Networks Do Not Always Oversmooth [46.57665708260211]
We study oversmoothing in graph convolutional networks (GCNs) by using their Gaussian process (GP) equivalence in the limit of infinitely many hidden features.
We identify a new, non-oversmoothing phase: if the initial weights of the network have sufficiently large variance, GCNs do not oversmooth, and node features remain informative even at large depth.
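A finite-width toy run consistent with this claim can be sketched as follows (a hedged stand-in for that paper's infinite-width GP analysis, with hypothetical graph, width, and depth): a deep tanh GCN is run twice, and in this toy the node features collapse under a small initial weight variance but remain far more spread out under a large one.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical graph and network sizes (illustrative choices only).
n, width, depth = 8, 128, 30
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))   # D^{-1/2} A D^{-1/2}

def final_layer_spread(sigma_w):
    """Mean pairwise distance between node features after `depth` layers
    of a random tanh GCN with weight scale sigma_w."""
    H = rng.standard_normal((n, width))
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        H = np.tanh(A_hat @ H @ W)
    return np.mean([np.linalg.norm(H[i] - H[j])
                    for i in range(n) for j in range(i + 1, n)])

# Small variance: features collapse; large variance: they stay apart.
print(f"sigma_w = 0.5: spread {final_layer_spread(0.5):.2e}")
print(f"sigma_w = 4.0: spread {final_layer_spread(4.0):.2e}")
```

The gap of many orders of magnitude between the two runs is the finite-width analogue of the phase distinction described above; the specific sigma_w values are illustrative, not thresholds from the paper.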
arXiv Detail & Related papers (2024-06-04T12:47:13Z)
- Graph Neural Tangent Kernel: Convergence on Large Graphs [7.624781434274796]
Graph neural networks (GNNs) achieve remarkable performance in graph machine learning tasks.
We investigate the training dynamics of large-graph GNNs using graph neural kernels (GNTKs) and graphons.
arXiv Detail & Related papers (2023-01-25T19:52:58Z)
- Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling [83.77955213766896]
Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graph-structured data.
To address their scalability issue, graph topology sampling has been proposed to reduce the memory and computational cost of training GCNs.
This paper provides the first theoretical justification of graph topology sampling in training (up to) three-layer GCNs.
arXiv Detail & Related papers (2022-07-07T21:25:55Z)
- Node Feature Kernels Increase Graph Convolutional Network Robustness [19.076912727990326]
The robustness of Graph Convolutional Networks (GCNs) to perturbations of their input is becoming a topic of increasing importance.
This paper presents a random matrix theory analysis of GCNs.
It is observed that enhancing the message-passing step in GCNs, by adding the node feature kernel to the adjacency matrix of the graph structure, addresses this robustness problem.
arXiv Detail & Related papers (2021-09-04T04:20:45Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute the performance deterioration of deeper models to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z)
- Simple and Deep Graph Convolutional Networks [63.76221532439285]
Graph convolutional networks (GCNs) are a powerful deep learning approach for graph-structured data.
Despite their success, most of the current GCN models are shallow, due to the over-smoothing problem.
We propose the GCNII, an extension of the vanilla GCN model with two simple yet effective techniques.
arXiv Detail & Related papers (2020-07-04T16:18:06Z)
- DeeperGCN: All You Need to Train Deeper GCNs [66.64739331859226]
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Unlike Convolutional Neural Networks (CNNs), which are able to take advantage of stacking very deep layers, GCNs suffer from vanishing gradient, over-smoothing and over-fitting issues when going deeper.
This paper proposes DeeperGCN that is capable of successfully and reliably training very deep GCNs.
arXiv Detail & Related papers (2020-06-13T23:00:22Z)
- Infinitely Wide Graph Convolutional Networks: Semi-supervised Learning via Gaussian Processes [144.6048446370369]
Graph convolutional neural networks (GCNs) have recently demonstrated promising results on graph-based semi-supervised classification.
We propose a GP regression model via GCNs (GPGC) for graph-based semi-supervised learning.
We conduct extensive experiments to evaluate GPGC and demonstrate that it outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2020-02-26T10:02:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.