On the Initialization of Graph Neural Networks
- URL: http://arxiv.org/abs/2312.02622v1
- Date: Tue, 5 Dec 2023 09:55:49 GMT
- Title: On the Initialization of Graph Neural Networks
- Authors: Jiahang Li, Yakun Song, Xiang Song, David Paul Wipf
- Abstract summary: We analyze the variance of forward and backward propagation across Graph Neural Networks layers.
We propose a new method for Variance Instability Reduction within GNN Optimization (Virgo)
We conduct comprehensive experiments on 15 datasets to show that Virgo can lead to superior model performance.
- Score: 10.153841274798829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Neural Networks (GNNs) have displayed considerable promise in graph
representation learning across various applications. The core learning process
requires the initialization of model weight matrices within each GNN layer,
which is typically accomplished via classic initialization methods such as
Xavier initialization. However, these methods were originally motivated to
stabilize the variance of hidden embeddings and gradients across layers of
Feedforward Neural Networks (FNNs) and Convolutional Neural Networks (CNNs) to
avoid vanishing gradients and maintain steady information flow. In contrast,
within the GNN context classical initializations disregard the impact of the
input graph structure and message passing on variance. In this paper, we
analyze the variance of forward and backward propagation across GNN layers and
show that the variance instability of GNN initializations comes from the
combined effect of the activation function, hidden dimension, graph structure
and message passing. To better account for these influence factors, we propose
a new initialization method for Variance Instability Reduction within GNN
Optimization (Virgo), which naturally tends to equate forward and backward
variances across successive layers. We conduct comprehensive experiments on 15
datasets to show that Virgo can lead to superior model performance and more
stable variance at initialization on node classification, link prediction and
graph classification tasks. Codes are in
https://github.com/LspongebobJH/virgo_icml2023.
Related papers
- Reducing Oversmoothing through Informed Weight Initialization in Graph Neural Networks [16.745718346575202]
We propose a new scheme (G-Init) that reduces oversmoothing, leading to very good results in node and graph classification tasks.
Our results indicate that the new method (G-Init) reduces oversmoothing in deep GNNs, facilitating their effective use.
arXiv Detail & Related papers (2024-10-31T11:21:20Z) - Learning to Reweight for Graph Neural Network [63.978102332612906]
Graph Neural Networks (GNNs) show promising results for graph tasks.
Existing GNNs' generalization ability will degrade when there exist distribution shifts between testing and training graph data.
We propose a novel nonlinear graph decorrelation method, which can substantially improve the out-of-distribution generalization ability.
arXiv Detail & Related papers (2023-12-19T12:25:10Z) - Label Deconvolution for Node Representation Learning on Large-scale
Attributed Graphs against Learning Bias [75.44877675117749]
We propose an efficient label regularization technique, namely Label Deconvolution (LD), to alleviate the learning bias by a novel and highly scalable approximation to the inverse mapping of GNNs.
Experiments demonstrate LD significantly outperforms state-of-the-art methods on Open Graph datasets Benchmark.
arXiv Detail & Related papers (2023-09-26T13:09:43Z) - Analysis of Convolutions, Non-linearity and Depth in Graph Neural
Networks using Neural Tangent Kernel [8.824340350342512]
Graph Neural Networks (GNNs) are designed to exploit the structural information of the data by aggregating the neighboring nodes.
We theoretically analyze the influence of different aspects of the GNN architecture using the Graph Neural Kernel in a semi-supervised node classification setting.
We prove that: (i) linear networks capture the class information as good as ReLU networks; (ii) row normalization preserves the underlying class structure better than other convolutions; (iii) performance degrades with network depth due to over-smoothing; (iv) skip connections retain the class information even at infinite depth, thereby eliminating over-smooth
arXiv Detail & Related papers (2022-10-18T12:28:37Z) - Invertible Neural Networks for Graph Prediction [22.140275054568985]
In this work, we address conditional generation using deep invertible neural networks.
We adopt an end-to-end training approach since our objective is to address prediction and generation in the forward and backward processes at once.
arXiv Detail & Related papers (2022-06-02T17:28:33Z) - Orthogonal Graph Neural Networks [53.466187667936026]
Graph neural networks (GNNs) have received tremendous attention due to their superiority in learning node representations.
stacking more convolutional layers significantly decreases the performance of GNNs.
We propose a novel Ortho-GConv, which could generally augment the existing GNN backbones to stabilize the model training and improve the model's generalization performance.
arXiv Detail & Related papers (2021-09-23T12:39:01Z) - Overcoming Catastrophic Forgetting in Graph Neural Networks [50.900153089330175]
Catastrophic forgetting refers to the tendency that a neural network "forgets" the previous learned knowledge upon learning new tasks.
We propose a novel scheme dedicated to overcoming this problem and hence strengthen continual learning in graph neural networks (GNNs)
At the heart of our approach is a generic module, termed as topology-aware weight preserving(TWP)
arXiv Detail & Related papers (2020-12-10T22:30:25Z) - Fast Learning of Graph Neural Networks with Guaranteed Generalizability:
One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z) - Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.