Related papers: On the Initialization of Graph Neural Networks

On the Initialization of Graph Neural Networks

URL: http://arxiv.org/abs/2312.02622v1
Date: Tue, 5 Dec 2023 09:55:49 GMT
Title: On the Initialization of Graph Neural Networks
Authors: Jiahang Li, Yakun Song, Xiang Song, David Paul Wipf
Abstract summary: We analyze the variance of forward and backward propagation across Graph Neural Networks layers. We propose a new method for Variance Instability Reduction within GNN Optimization (Virgo) We conduct comprehensive experiments on 15 datasets to show that Virgo can lead to superior model performance.
Score: 10.153841274798829
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Graph Neural Networks (GNNs) have displayed considerable promise in graph representation learning across various applications. The core learning process requires the initialization of model weight matrices within each GNN layer, which is typically accomplished via classic initialization methods such as Xavier initialization. However, these methods were originally motivated to stabilize the variance of hidden embeddings and gradients across layers of Feedforward Neural Networks (FNNs) and Convolutional Neural Networks (CNNs) to avoid vanishing gradients and maintain steady information flow. In contrast, within the GNN context classical initializations disregard the impact of the input graph structure and message passing on variance. In this paper, we analyze the variance of forward and backward propagation across GNN layers and show that the variance instability of GNN initializations comes from the combined effect of the activation function, hidden dimension, graph structure and message passing. To better account for these influence factors, we propose a new initialization method for Variance Instability Reduction within GNN Optimization (Virgo), which naturally tends to equate forward and backward variances across successive layers. We conduct comprehensive experiments on 15 datasets to show that Virgo can lead to superior model performance and more stable variance at initialization on node classification, link prediction and graph classification tasks. Codes are in https://github.com/LspongebobJH/virgo_icml2023.

Related papers

Reducing Oversmoothing through Informed Weight Initialization in Graph Neural Networks [16.745718346575202]
We propose a new scheme (G-Init) that reduces oversmoothing, leading to very good results in node and graph classification tasks. Our results indicate that the new method (G-Init) reduces oversmoothing in deep GNNs, facilitating their effective use.
arXiv Detail & Related papers (2024-10-31T11:21:20Z)
Learning to Reweight for Graph Neural Network [63.978102332612906]
Graph Neural Networks (GNNs) show promising results for graph tasks. Existing GNNs' generalization ability will degrade when there exist distribution shifts between testing and training graph data. We propose a novel nonlinear graph decorrelation method, which can substantially improve the out-of-distribution generalization ability.
arXiv Detail & Related papers (2023-12-19T12:25:10Z)
Label Deconvolution for Node Representation Learning on Large-scale Attributed Graphs against Learning Bias [75.44877675117749]
We propose an efficient label regularization technique, namely Label Deconvolution (LD), to alleviate the learning bias by a novel and highly scalable approximation to the inverse mapping of GNNs. Experiments demonstrate LD significantly outperforms state-of-the-art methods on Open Graph datasets Benchmark.
arXiv Detail & Related papers (2023-09-26T13:09:43Z)
Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel [8.824340350342512]
Graph Neural Networks (GNNs) are designed to exploit the structural information of the data by aggregating the neighboring nodes. We theoretically analyze the influence of different aspects of the GNN architecture using the Graph Neural Kernel in a semi-supervised node classification setting. We prove that: (i) linear networks capture the class information as good as ReLU networks; (ii) row normalization preserves the underlying class structure better than other convolutions; (iii) performance degrades with network depth due to over-smoothing; (iv) skip connections retain the class information even at infinite depth, thereby eliminating over-smooth
arXiv Detail & Related papers (2022-10-18T12:28:37Z)
Invertible Neural Networks for Graph Prediction [22.140275054568985]
In this work, we address conditional generation using deep invertible neural networks. We adopt an end-to-end training approach since our objective is to address prediction and generation in the forward and backward processes at once.
arXiv Detail & Related papers (2022-06-02T17:28:33Z)
Orthogonal Graph Neural Networks [53.466187667936026]
Graph neural networks (GNNs) have received tremendous attention due to their superiority in learning node representations. stacking more convolutional layers significantly decreases the performance of GNNs. We propose a novel Ortho-GConv, which could generally augment the existing GNN backbones to stabilize the model training and improve the model's generalization performance.
arXiv Detail & Related papers (2021-09-23T12:39:01Z)
Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency that a neural network "forgets" the previous learned knowledge upon learning new tasks. We propose a novel scheme dedicated to overcoming this problem and hence strengthen continual learning in graph neural networks (GNNs) At the heart of our approach is a generic module, termed as topology-aware weight preserving(TWP)
arXiv Detail & Related papers (2020-12-10T22:30:25Z)
Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice. We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix. Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.