Is Solving Graph Neural Tangent Kernel Equivalent to Training Graph
Neural Network?
- URL: http://arxiv.org/abs/2309.07452v1
- Date: Thu, 14 Sep 2023 06:24:33 GMT
- Title: Is Solving Graph Neural Tangent Kernel Equivalent to Training Graph
Neural Network?
- Authors: Lianke Qin, Zhao Song, Baocheng Sun
- Abstract summary: A rising trend in theoretical deep learning is to understand why deep learning works through Neural Tangent Kernel (NTK) [jgh18].
GNTK extends NTK to graph learning; it is a kernel method equivalent to training an infinitely-wide multi-layer GNN with gradient descent.
Prior work shows empirically that GNTK can achieve accuracy similar to GNNs on various bioinformatics datasets [dhs+19].
- Score: 9.599018775881275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A rising trend in theoretical deep learning is to understand why deep
learning works through Neural Tangent Kernel (NTK) [jgh18], a kernel method
that is equivalent to using gradient descent to train a multi-layer
infinitely-wide neural network. NTK is a major step forward in theoretical
deep learning because it allows researchers to use traditional mathematical
tools to analyze properties of deep neural networks and to explain various
neural network techniques from a theoretical view. A natural extension of NTK
on graph learning is \textit{Graph Neural Tangent Kernel (GNTK)}, and
researchers have already provided a GNTK formulation for graph-level regression
and shown empirically that this kernel method can achieve accuracy similar to
GNNs on various bioinformatics datasets [dhs+19]. The remaining question now is
whether solving GNTK regression is equivalent to training an infinitely-wide
multi-layer GNN using gradient descent. In this paper, we provide three new
theoretical results. First, we formally prove this equivalence for graph-level
regression. Second, we present the first GNTK formulation for node-level
regression. Finally, we prove the equivalence for node-level regression.
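To make the claim concrete, the sketch below illustrates what "solving GNTK regression" amounts to for graph-level prediction: build a Gram matrix between training graphs with a GNTK-style recursion, then solve one regularized linear system instead of running gradient descent on a wide GNN. This is a minimal, simplified reading of the recursion in [dhs+19], not the authors' code; the two-block depth, degree-averaged aggregation, and sum-over-node-pairs readout are illustrative assumptions.

```python
import numpy as np

def relu_expectations(Sigma, diag1, diag2, c_sigma=2.0):
    """Closed-form Gaussian expectations for ReLU (arc-cosine formulas) used in the NTK recursion."""
    denom = np.sqrt(np.outer(diag1, diag2))
    lam = np.clip(Sigma / denom, -1.0, 1.0)
    Sigma_new = c_sigma * denom * (lam * (np.pi - np.arccos(lam)) + np.sqrt(1.0 - lam ** 2)) / (2.0 * np.pi)
    Sigma_dot = c_sigma * (np.pi - np.arccos(lam)) / (2.0 * np.pi)
    return Sigma_new, Sigma_dot

def aggregate(M, A1, A2):
    """Degree-averaged neighbourhood aggregation; A1, A2 are adjacency matrices with self-loops."""
    c1 = 1.0 / A1.sum(axis=1, keepdims=True)
    c2 = 1.0 / A2.sum(axis=1, keepdims=True)
    return (c1 * A1) @ M @ (c2 * A2).T

def diag_covariances(X, A, num_blocks):
    """Per-node variances required by the ReLU formulas, taken from the (G, G) recursion."""
    Sigma = X @ X.T
    diags = []
    for _ in range(num_blocks):
        Sigma = aggregate(Sigma, A, A)
        d = np.diag(Sigma).copy()
        diags.append(d)
        Sigma, _ = relu_expectations(Sigma, d, d)
    return diags

def gntk_pair(X1, A1, X2, A2, num_blocks=2):
    """Simplified graph-level GNTK value between two graphs (sum over all node pairs as readout)."""
    d1, d2 = diag_covariances(X1, A1, num_blocks), diag_covariances(X2, A2, num_blocks)
    Sigma = X1 @ X2.T
    Theta = Sigma.copy()
    for l in range(num_blocks):
        Sigma = aggregate(Sigma, A1, A2)
        Theta = aggregate(Theta, A1, A2)
        Sigma, Sigma_dot = relu_expectations(Sigma, d1[l], d2[l])
        Theta = Theta * Sigma_dot + Sigma
    return Theta.sum()

def gntk_regression(train, y, test, ridge=1e-6):
    """'Solving GNTK regression': one linear solve replaces gradient-descent training of a wide GNN."""
    K = np.array([[gntk_pair(Xa, Aa, Xb, Ab) for Xb, Ab in train] for Xa, Aa in train])
    alpha = np.linalg.solve(K + ridge * np.eye(len(train)), y)
    K_star = np.array([[gntk_pair(Xs, As, Xb, Ab) for Xb, Ab in train] for Xs, As in test])
    return K_star @ alpha

# Toy usage: two random graphs as "training set", one as "test set".
rng = np.random.default_rng(0)
def random_graph(n, d=4):
    A = (rng.random((n, n)) < 0.4).astype(float)
    A = np.triu(A, 1); A = A + A.T + np.eye(n)   # symmetric adjacency with self-loops
    return rng.standard_normal((n, d)), A

train = [random_graph(5), random_graph(6)]
test = [random_graph(4)]
print(gntk_regression(train, np.array([0.3, -1.2]), test))
```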
Related papers
- Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime [52.00917519626559]
This paper presents two models of neural-networks and their training applicable to neural networks of arbitrary width, depth and topology.
We also present a novel, exact representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK).
This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models.
arXiv Detail & Related papers (2024-05-24T06:30:36Z) - A Unified Kernel for Neural Network Learning [4.0759204898334715]
We present the Unified Neural Kernel (UNK), which characterizes the learning dynamics of neural networks trained with gradient descent.
UNK maintains the limiting properties of both NNGP and NTK, exhibiting behaviors akin to NTK with a finite learning step.
We also theoretically characterize the uniform tightness and learning convergence of the UNK kernel.
arXiv Detail & Related papers (2024-03-26T07:55:45Z) - Infinite Width Graph Neural Networks for Node Regression/ Classification [0.0]
This work analyzes Graph Neural Networks, a generalization of fully-connected deep neural networks to graph-structured data, when their width, that is, the number of nodes in each fully-connected layer, increases to infinity.
arXiv Detail & Related papers (2023-10-12T10:01:39Z) - Graph Neural Networks Provably Benefit from Structural Information: A
Feature Learning Perspective [53.999128831324576]
Graph neural networks (GNNs) have pioneered advancements in graph representation learning.
This study investigates the role of graph convolution within the context of feature learning theory.
arXiv Detail & Related papers (2023-06-24T10:21:11Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - Graph Neural Tangent Kernel: Convergence on Large Graphs [7.624781434274796]
Graph neural networks (GNNs) achieve remarkable performance in graph machine learning tasks.
We investigate the training dynamics of large-graph GNNs using graph neural tangent kernels (GNTKs) and graphons.
arXiv Detail & Related papers (2023-01-25T19:52:58Z) - A Convergence Analysis of Nesterov's Accelerated Gradient Method in
Training Deep Linear Neural Networks [21.994004684742812]
Momentum methods are widely used in training neural networks for their fast convergence.
We show that NAG with a random initialization can converge to the global minimum at a $(1 - \mathcal{O}(1/\sqrt{\kappa}))^{t}$ rate, where $\kappa$ depends on the condition number of the feature matrix.
We extend our analysis to deep linear ResNets and derive a similar result.
arXiv Detail & Related papers (2022-04-18T13:24:12Z) - Scaling Neural Tangent Kernels via Sketching and Random Features [53.57615759435126]
Recent works report that NTK regression can outperform finitely-wide neural networks trained on small-scale datasets.
We design a near input-sparsity time approximation algorithm for NTK, by sketching the expansions of arc-cosine kernels.
We show that a linear regressor trained on our CNTK features matches the accuracy of the exact CNTK on the CIFAR-10 dataset while achieving a 150x speedup (a rough random-features illustration of this idea appears after this list).
arXiv Detail & Related papers (2021-06-15T04:44:52Z) - How Neural Networks Extrapolate: From Feedforward to Graph Neural
Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z) - Fast Learning of Graph Neural Networks with Guaranteed Generalizability:
One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
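The entry on "Scaling Neural Tangent Kernels via Sketching and Random Features" above describes replacing an exact kernel solve with a linear regressor trained on explicit feature maps. The sketch below is not that paper's sketching algorithm; it is a generic Monte-Carlo random-features approximation of the degree-1 arc-cosine kernel (the ReLU building block of NTK/CNTK expansions), included only to illustrate the feature-map idea. The synthetic data, feature count, and ridge parameter are arbitrary choices.

```python
import numpy as np

def arccos1_kernel(X, Z):
    """Exact degree-1 arc-cosine kernel (Cho & Saul), the ReLU building block of NTK expansions."""
    nx = np.linalg.norm(X, axis=1, keepdims=True)
    nz = np.linalg.norm(Z, axis=1, keepdims=True)
    cos = np.clip((X @ Z.T) / (nx * nz.T), -1.0, 1.0)
    theta = np.arccos(cos)
    return (nx * nz.T) * (np.sin(theta) + (np.pi - theta) * cos) / np.pi

def relu_features(X, W):
    """Random ReLU features; their inner products approximate arccos1_kernel in expectation."""
    return np.maximum(X @ W, 0.0) * np.sqrt(2.0 / W.shape[1])

rng = np.random.default_rng(0)
X_tr, X_te = rng.standard_normal((300, 10)), rng.standard_normal((50, 10))
y_tr = np.sin(X_tr[:, 0]) + 0.1 * rng.standard_normal(300)

# Exact kernel ridge regression: cubic-time solve on the full Gram matrix.
lam = 1e-2
K = arccos1_kernel(X_tr, X_tr)
alpha = np.linalg.solve(K + lam * np.eye(len(K)), y_tr)
pred_exact = arccos1_kernel(X_te, X_tr) @ alpha

# Linear regressor on m random features: matches the kernel predictor as m grows,
# and is much cheaper when the number of training points is large.
W = rng.standard_normal((10, 4096))
Phi_tr, Phi_te = relu_features(X_tr, W), relu_features(X_te, W)
w_hat = np.linalg.solve(Phi_tr.T @ Phi_tr + lam * np.eye(W.shape[1]), Phi_tr.T @ y_tr)
pred_rf = Phi_te @ w_hat

print(np.corrcoef(pred_exact, pred_rf)[0, 1])   # typically close to 1
```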