Convergence of gradient based training for linear Graph Neural Networks
- URL: http://arxiv.org/abs/2501.14440v1
- Date: Fri, 24 Jan 2025 12:18:30 GMT
- Title: Convergence of gradient based training for linear Graph Neural Networks
- Authors: Dhiraj Patel, Anton Savostianov, Michael T. Schaub,
- Abstract summary: We show that the gradient flow training of a linear GNN with mean squared loss converges to the global minimum at an exponential rate.<n>We also study the convergence of linear GNNs under gradient descent training.
- Score: 5.079602839359521
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph Neural Networks (GNNs) are powerful tools for addressing learning problems on graph structures, with a wide range of applications in molecular biology and social networks. However, the theoretical foundations underlying their empirical performance are not well understood. In this article, we examine the convergence of gradient dynamics in the training of linear GNNs. Specifically, we prove that the gradient flow training of a linear GNN with mean squared loss converges to the global minimum at an exponential rate. The convergence rate depends explicitly on the initial weights and the graph shift operator, which we validate on synthetic datasets from well-known graph models and real-world datasets. Furthermore, we discuss the gradient flow that minimizes the total weights at the global minimum. In addition to the gradient flow, we study the convergence of linear GNNs under gradient descent training, an iterative scheme viewed as a discretization of gradient flow.
Related papers
- On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning [15.409865070022951]
Graph Neural Networks (GNNs) are models that leverage the graph structure to transmit information between nodes.
We show that a simple state-space formulation of a GNN effectively alleviates over-smoothing and over-squashing at no extra trainable parameter cost.
arXiv Detail & Related papers (2025-02-15T14:43:41Z) - A Manifold Perspective on the Statistical Generalization of Graph Neural Networks [84.01980526069075]
We take a manifold perspective to establish the statistical generalization theory of GNNs on graphs sampled from a manifold in the spectral domain.
We prove that the generalization bounds of GNNs decrease linearly with the size of the graphs in the logarithmic scale, and increase linearly with the spectral continuity constants of the filter functions.
arXiv Detail & Related papers (2024-06-07T19:25:02Z) - Gradient scarcity with Bilevel Optimization for Graph Learning [0.0]
gradient scarcity occurs when learning graphs by minimizing a loss on a subset of nodes causes edges between unlabelled nodes that are far from labelled ones to receive zero gradients.
We give a precise mathematical characterization of this phenomenon, and prove it also emerges in bilevel optimization.
To alleviate this issue, we study several solutions: we propose to resort to latent graph learning using a Graph-to-Graph model (G2G), graph regularization to impose a prior structure on the graph, or optimizing on a larger graph than the original one with a reduced diameter.
arXiv Detail & Related papers (2023-03-24T12:37:43Z) - MentorGNN: Deriving Curriculum for Pre-Training GNNs [61.97574489259085]
We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs.
We shed new light on the problem of domain adaption on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs.
arXiv Detail & Related papers (2022-08-21T15:12:08Z) - Understanding convolution on graphs via energies [23.18124653469668]
Graph Networks (GNNs) typically operate by message-passing, where the state of a node is updated based on the information received from its neighbours.
Most message-passing models act as graph convolutions, where features are mixed by a shared, linear transformation before being propagated over the edges.
On node-classification tasks, graph convolutions have been shown to suffer from two limitations: poor performance on heterophilic graphs, and over-smoothing.
arXiv Detail & Related papers (2022-06-22T11:45:36Z) - Learning Graph Structure from Convolutional Mixtures [119.45320143101381]
We propose a graph convolutional relationship between the observed and latent graphs, and formulate the graph learning task as a network inverse (deconvolution) problem.
In lieu of eigendecomposition-based spectral methods, we unroll and truncate proximal gradient iterations to arrive at a parameterized neural network architecture that we call a Graph Deconvolution Network (GDN)
GDNs can learn a distribution of graphs in a supervised fashion, perform link prediction or edge-weight regression tasks by adapting the loss function, and they are inherently inductive.
arXiv Detail & Related papers (2022-05-19T14:08:15Z) - Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z) - Increase and Conquer: Training Graph Neural Networks on Growing Graphs [116.03137405192356]
We consider the problem of learning a graphon neural network (WNN) by training GNNs on graphs sampled Bernoulli from the graphon.
Inspired by these results, we propose an algorithm to learn GNNs on large-scale graphs that, starting from a moderate number of nodes, successively increases the size of the graph during training.
arXiv Detail & Related papers (2021-06-07T15:05:59Z) - Implicit Graph Neural Networks [46.0589136729616]
We propose a graph learning framework called Implicit Graph Neural Networks (IGNN)
IGNNs consistently capture long-range dependencies and outperform state-of-the-art GNN models.
arXiv Detail & Related papers (2020-09-14T06:04:55Z) - Fast Learning of Graph Neural Networks with Guaranteed Generalizability:
One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.