Wide Graph Neural Networks: Aggregation Provably Leads to Exponentially
Trainability Loss
- URL: http://arxiv.org/abs/2103.03113v1
- Date: Wed, 3 Mar 2021 11:06:12 GMT
- Title: Wide Graph Neural Networks: Aggregation Provably Leads to Exponentially
Trainability Loss
- Authors: Wei Huang, Yayong Li, Weitao Du, Richard Yi Da Xu, Jie Yin, and Ling
Chen
- Abstract summary: Graph convolutional networks (GCNs) and their variants have achieved great success in dealing with graph-structured data.
It is well known that deep GCNs suffer from the over-smoothing problem.
Few theoretical analyses have been conducted to study the expressivity and trainability of deep GCNs.
- Score: 17.39060566854841
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph convolutional networks (GCNs) and their variants have achieved great
success in dealing with graph-structured data. However, it is well known that
deep GCNs suffer from the over-smoothing problem, where node representations
tend to become indistinguishable as more layers are stacked. Although extensive
research has confirmed this prevailing understanding, few theoretical analyses
have been conducted to study the expressivity and trainability of deep GCNs. In
this work, we demonstrate these characterizations by studying the Gaussian
Process Kernel (GPK) and Graph Neural Tangent Kernel (GNTK) of an
infinitely-wide GCN, corresponding to the analysis on expressivity and
trainability, respectively. We first prove that the expressivity of
infinitely-wide GCNs decays at an exponential rate, by applying mean-field
theory to the GPK. We then formulate the asymptotic behavior of the GNTK at
large depth, which reveals that the trainability of wide and deep GCNs also
drops at an exponential rate. Additionally, we extend our theoretical framework
to analyze techniques resembling residual connections. We find that these
techniques can mildly mitigate the exponential decay but fail to overcome it
fundamentally. Finally, all theoretical results in this work are corroborated
experimentally on a variety of graph-structured datasets.
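The over-smoothing behavior described in the abstract can be illustrated with a small numerical sketch. This is a hedged toy example, not the paper's GPK/GNTK analysis: repeatedly applying the symmetrically normalized aggregation operator to random node features drives the node representations together, so their pairwise distances shrink rapidly with depth. The graph, feature dimension, and depths below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small random graph with self-loops (illustrative only).
n = 8
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)          # symmetrize
np.fill_diagonal(A, 1.0)        # add self-loops

# Symmetrically normalized aggregation operator D^{-1/2} A D^{-1/2}.
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))

def mean_pairwise_distance(H):
    """Average distance between all pairs of node representations."""
    return np.mean([np.linalg.norm(H[i] - H[j])
                    for i in range(n) for j in range(i + 1, n)])

X = rng.standard_normal((n, 4))  # random node features
for depth in (1, 5, 20):
    H = np.linalg.matrix_power(A_hat, depth) @ X
    print(f"depth {depth:2d}: "
          f"mean pairwise distance {mean_pairwise_distance(H):.4f}")
```

As depth grows, the aggregation operator converges to a rank-one projection, so all rows of the feature matrix approach multiples of a common vector and the spread collapses.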
Related papers
- Bridging Smoothness and Approximation: Theoretical Insights into Over-Smoothing in Graph Neural Networks [12.001676605529626]
We explore the approximation theory of functions defined on graphs.
We establish a framework to assess the lower bounds of approximation for target functions using Graph Convolutional Networks (GCNs).
We show how the high-frequency energy of the output decays, an indicator of over-smoothing, in GCNs.
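The high-frequency-energy indicator mentioned above can be sketched numerically. This is an illustrative toy, not that paper's derivation: track the ratio trace(H^T L H) / trace(H^T H), with L the symmetrically normalized Laplacian, under repeated graph convolution; it shrinks toward zero with depth, i.e. the output becomes increasingly smooth. The graph and sizes are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical small random graph with self-loops (illustrative only).
n = 8
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))   # D^{-1/2} A D^{-1/2}
L = np.eye(n) - A_hat                 # symmetrically normalized Laplacian

def high_freq_energy(H):
    """Rayleigh-quotient-style ratio trace(H^T L H) / trace(H^T H)."""
    return np.trace(H.T @ L @ H) / np.trace(H.T @ H)

X = rng.standard_normal((n, 4))
for depth in (0, 5, 20):
    H = np.linalg.matrix_power(A_hat, depth) @ X
    print(f"depth {depth:2d}: high-frequency energy {high_freq_energy(H):.4f}")
```

Every Laplacian eigenmode except the smoothest one is damped by repeated aggregation, so the energy ratio concentrates on the low-frequency component.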
arXiv Detail & Related papers (2024-07-01T13:35:53Z)
- A Manifold Perspective on the Statistical Generalization of Graph Neural Networks [84.01980526069075]
Graph Neural Networks (GNNs) combine information from adjacent nodes by successive applications of graph convolutions.
We study the generalization gaps of GNNs on both node-level and graph-level tasks.
We show that the generalization gaps decrease with the number of nodes in the training graphs.
arXiv Detail & Related papers (2024-06-07T19:25:02Z)
- Graph Neural Networks Do Not Always Oversmooth [46.57665708260211]
We study oversmoothing in graph convolutional networks (GCNs) by using their Gaussian process (GP) equivalence in the limit of infinitely many hidden features.
We identify a new, non-oversmoothing phase: if the initial weights of the network have sufficiently large variance, GCNs do not oversmooth, and node features remain informative even at large depth.
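A finite-width toy run consistent with this claim can be sketched as follows (a hedged stand-in for that paper's infinite-width GP analysis, with hypothetical graph, width, and depth): a deep tanh GCN is run twice, and in this toy the node features collapse under a small initial weight variance but remain far more spread out under a large one.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical graph and network sizes (illustrative choices only).
n, width, depth = 8, 128, 30
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))   # D^{-1/2} A D^{-1/2}

def final_layer_spread(sigma_w):
    """Mean pairwise distance between node features after `depth` layers
    of a random tanh GCN with weight scale sigma_w."""
    H = rng.standard_normal((n, width))
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        H = np.tanh(A_hat @ H @ W)
    return np.mean([np.linalg.norm(H[i] - H[j])
                    for i in range(n) for j in range(i + 1, n)])

# Small variance: features collapse; large variance: they stay apart.
print(f"sigma_w = 0.5: spread {final_layer_spread(0.5):.2e}")
print(f"sigma_w = 4.0: spread {final_layer_spread(4.0):.2e}")
```

The gap of many orders of magnitude between the two runs is the finite-width analogue of the phase distinction described above; the specific sigma_w values are illustrative, not thresholds from the paper.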
arXiv Detail & Related papers (2024-06-04T12:47:13Z)
- Graph Neural Tangent Kernel: Convergence on Large Graphs [7.624781434274796]
Graph neural networks (GNNs) achieve remarkable performance in graph machine learning tasks.
We investigate the training dynamics of large-graph GNNs using graph neural kernels (GNTKs) and graphons.
arXiv Detail & Related papers (2023-01-25T19:52:58Z)
- Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling [83.77955213766896]
Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graph-structured data.
To address their scalability issue, graph topology sampling has been proposed to reduce the memory and computational cost of training GCNs.
This paper provides the first theoretical justification of graph topology sampling in training (up to) three-layer GCNs.
arXiv Detail & Related papers (2022-07-07T21:25:55Z)
- Node Feature Kernels Increase Graph Convolutional Network Robustness [19.076912727990326]
The robustness of Graph Convolutional Networks (GCNs) to perturbations of their input is becoming a topic of increasing importance.
This paper presents a random matrix theory analysis of GCNs.
It is observed that enhancing the message-passing step in GCNs, by adding the node feature kernel to the adjacency matrix of the graph structure, addresses this robustness problem.
arXiv Detail & Related papers (2021-09-04T04:20:45Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute the performance deterioration of deeper models to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z)
- Simple and Deep Graph Convolutional Networks [63.76221532439285]
Graph convolutional networks (GCNs) are a powerful deep learning approach for graph-structured data.
Despite their success, most of the current GCN models are shallow, due to the over-smoothing problem.
We propose the GCNII, an extension of the vanilla GCN model with two simple yet effective techniques.
arXiv Detail & Related papers (2020-07-04T16:18:06Z)
- DeeperGCN: All You Need to Train Deeper GCNs [66.64739331859226]
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Unlike Convolutional Neural Networks (CNNs), which are able to take advantage of stacking very deep layers, GCNs suffer from vanishing gradient, over-smoothing and over-fitting issues when going deeper.
This paper proposes DeeperGCN that is capable of successfully and reliably training very deep GCNs.
arXiv Detail & Related papers (2020-06-13T23:00:22Z)
- Infinitely Wide Graph Convolutional Networks: Semi-supervised Learning via Gaussian Processes [144.6048446370369]
Graph convolutional neural networks (GCNs) have recently demonstrated promising results on graph-based semi-supervised classification.
We propose a GP regression model via GCNs (GPGC) for graph-based semi-supervised learning.
We conduct extensive experiments to evaluate GPGC and demonstrate that it outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2020-02-26T10:02:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.