Analysis of Convolutions, Non-linearity and Depth in Graph Neural
Networks using Neural Tangent Kernel
- URL: http://arxiv.org/abs/2210.09809v4
- Date: Tue, 31 Oct 2023 14:25:42 GMT
- Title: Analysis of Convolutions, Non-linearity and Depth in Graph Neural
Networks using Neural Tangent Kernel
- Authors: Mahalakshmi Sabanayagam, Pascal Esser, Debarghya Ghoshdastidar
- Abstract summary: Graph Neural Networks (GNNs) are designed to exploit the structural information of the data by aggregating the neighboring nodes.
We theoretically analyze the influence of different aspects of the GNN architecture using the Graph Neural Tangent Kernel in a semi-supervised node classification setting.
We prove that: (i) linear networks capture the class information as well as ReLU networks; (ii) row normalization preserves the underlying class structure better than other convolutions; (iii) performance degrades with network depth due to over-smoothing; (iv) skip connections retain the class information even at infinite depth, thereby eliminating over-smoothing.
- Score: 8.824340350342512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The fundamental principle of Graph Neural Networks (GNNs) is to exploit the
structural information of the data by aggregating the neighboring nodes using a
'graph convolution' in conjunction with a suitable choice for the network
architecture, such as depth and activation functions. Therefore, understanding
the influence of each design choice on network performance is
crucial. Convolutions based on graph Laplacian have emerged as the dominant
choice with the symmetric normalization of the adjacency matrix as the most
widely adopted one. However, some empirical studies show that row normalization
of the adjacency matrix outperforms it in node classification. Despite the
widespread use of GNNs, there is no rigorous theoretical study of the
representation power of these convolutions that could explain this behavior.
Similarly, the empirical observation that linear GNNs perform on par with
non-linear ReLU GNNs lacks rigorous theoretical justification.
In this work, we theoretically analyze the influence of different aspects of
the GNN architecture using the Graph Neural Tangent Kernel in a semi-supervised
node classification setting. Under the population Degree Corrected Stochastic
Block Model, we prove that: (i) linear networks capture the class information
as well as ReLU networks; (ii) row normalization preserves the underlying class
structure better than other convolutions; (iii) performance degrades with
network depth due to over-smoothing, but the loss in class information is the
slowest in row normalization; (iv) skip connections retain the class
information even at infinite depth, thereby eliminating over-smoothing. We
finally validate our theoretical findings numerically and on real datasets such
as Cora and Citeseer.
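For reference, the convolutions compared above are conventionally written as follows. This is the standard notation from the GNN literature, included only as a reminder; details such as whether self-loops are added to the adjacency matrix are assumptions here and are not reproduced from the paper.

```latex
% Standard graph convolution operators (usual definitions; self-loop handling
% is an assumption here, not taken from the paper).
% A: adjacency matrix, D = diag(A\mathbf{1}): degree matrix, X: node features.
\[
  S_{\mathrm{sym}} = D^{-1/2} A D^{-1/2},
  \qquad
  S_{\mathrm{row}} = D^{-1} A .
\]
% A depth-$k$ linear GNN propagates features as
\[
  H^{(k)} = S \, H^{(k-1)} W^{(k)}, \qquad H^{(0)} = X,
\]
% so the class structure visible in $H^{(k)}$ depends on how repeated
% multiplication by $S$ mixes the rows of $X$.
```

Symmetric normalization weights a neighbor's contribution by both endpoint degrees, whereas row normalization simply averages over the neighborhood; the paper's claim is that the latter preserves the class structure of the degree-corrected block model better.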
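The over-smoothing and skip-connection claims can also be sanity-checked numerically. The sketch below is a minimal illustration under assumed settings, not the paper's experiments: it generates a toy two-block graph, propagates random features to depth d with each normalization, and reports the distance between the two class means as a crude proxy for the remaining class information. The block-model parameters, the initial-residual form of the skip connection, and the class-mean gap itself are illustrative choices, not the paper's definitions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class graph in the spirit of a stochastic block model:
# two blocks of 50 nodes, denser within blocks than across (assumed parameters).
n, p_in, p_out = 100, 0.2, 0.02
labels = np.repeat([0, 1], n // 2)
prob = np.where(labels[:, None] == labels[None, :], p_in, p_out)
A = np.triu((rng.random((n, n)) < prob).astype(float), 1)
A = A + A.T + np.eye(n)                       # symmetrize, add self-loops

deg = A.sum(axis=1)
S_sym = A / np.sqrt(np.outer(deg, deg))       # D^{-1/2} A D^{-1/2}
S_row = A / deg[:, None]                      # D^{-1} A

X = rng.standard_normal((n, 8))               # random node features


def class_mean_gap(S, X, depth, skip=False):
    """Distance between the two class means after `depth` propagation steps,
    used here as a rough proxy for how much class information survives."""
    H = X.copy()
    for _ in range(depth):
        if skip:
            H = 0.9 * (S @ H) + 0.1 * X       # initial-residual style skip (assumed form)
        else:
            H = S @ H                         # plain linear propagation
    H = H / (np.linalg.norm(H) + 1e-12)       # compare shapes, not magnitudes
    mu0, mu1 = H[labels == 0].mean(axis=0), H[labels == 1].mean(axis=0)
    return np.linalg.norm(mu0 - mu1)


print("depth   sym      row      row+skip")
for d in (1, 4, 16, 64):
    print(f"{d:>5}   "
          f"{class_mean_gap(S_sym, X, d):.4f}   "
          f"{class_mean_gap(S_row, X, d):.4f}   "
          f"{class_mean_gap(S_row, X, d, skip=True):.4f}")
```

If the abstract's claims carry over to this toy setting, the gap should shrink with depth for both plain convolutions (over-smoothing), while staying bounded away from zero when the skip connection keeps feeding the input features back in.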
Related papers
- On the Initialization of Graph Neural Networks [10.153841274798829]
We analyze the variance of forward and backward propagation across Graph Neural Networks layers.
We propose a new method for Variance Instability Reduction within GNN Optimization (Virgo).
We conduct comprehensive experiments on 15 datasets to show that Virgo can lead to superior model performance.
arXiv Detail & Related papers (2023-12-05T09:55:49Z)
- Neural Tangent Kernels Motivate Graph Neural Networks with Cross-Covariance Graphs [94.44374472696272]
We investigate NTKs and alignment in the context of graph neural networks (GNNs).
Our results establish the theoretical guarantees on the optimality of the alignment for a two-layer GNN.
These guarantees are characterized by the graph shift operator being a function of the cross-covariance between the input and the output data.
arXiv Detail & Related papers (2023-10-16T19:54:21Z)
- Label Deconvolution for Node Representation Learning on Large-scale Attributed Graphs against Learning Bias [75.44877675117749]
We propose an efficient label regularization technique, namely Label Deconvolution (LD), to alleviate the learning bias by a novel and highly scalable approximation to the inverse mapping of GNNs.
Experiments demonstrate that LD significantly outperforms state-of-the-art methods on the Open Graph Benchmark datasets.
arXiv Detail & Related papers (2023-09-26T13:09:43Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Geometric Graph Filters and Neural Networks: Limit Properties and Discriminability Trade-offs [122.06927400759021]
We study the relationship between a graph neural network (GNN) and a manifold neural network (MNN) when the graph is constructed from a set of points sampled from the manifold.
We prove non-asymptotic error bounds showing that convolutional filters and neural networks on these graphs converge to convolutional filters and neural networks on the continuous manifold.
arXiv Detail & Related papers (2023-05-29T08:27:17Z)
- Understanding Non-linearity in Graph Neural Networks from the Bayesian-Inference Perspective [33.01636846541052]
Graph neural networks (GNNs) have shown superiority in many prediction tasks over graphs.
We investigate the functions of non-linearity in GNNs for node classification tasks.
arXiv Detail & Related papers (2022-07-22T19:36:12Z)
- Adaptive Kernel Graph Neural Network [21.863238974404474]
Graph neural networks (GNNs) have demonstrated great success in representation learning for graph-structured data.
In this paper, we propose a novel framework, namely the Adaptive Kernel Graph Neural Network (AKGNN).
AKGNN is the first attempt at learning to adapt to the optimal graph kernel in a unified manner.
Experiments are conducted on acknowledged benchmark datasets and promising results demonstrate the outstanding performance of our proposed AKGNN.
arXiv Detail & Related papers (2021-12-08T20:23:58Z)
- Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
- Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks [60.22494363676747]
It is known that current graph neural networks (GNNs) are difficult to make deep due to the problem known as over-smoothing.
Multi-scale GNNs are a promising approach for mitigating the over-smoothing problem.
We derive the optimization and generalization guarantees of transductive learning algorithms that include multi-scale GNNs.
arXiv Detail & Related papers (2020-06-15T17:06:17Z)