Learning Theory Can (Sometimes) Explain Generalisation in Graph Neural
Networks
- URL: http://arxiv.org/abs/2112.03968v1
- Date: Tue, 7 Dec 2021 20:06:23 GMT
- Title: Learning Theory Can (Sometimes) Explain Generalisation in Graph Neural
Networks
- Authors: Pascal Mattia Esser, Leena Chennuru Vankadara, Debarghya Ghoshdastidar
- Abstract summary: We provide a rigorous analysis of the performance of neural networks in the context of transductive inference.
We show that transductive Rademacher complexity can explain the generalisation properties of graph convolutional networks for block models.
- Score: 13.518582483147325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, several results in the supervised learning setting suggested
that classical statistical learning-theoretic measures, such as VC dimension,
do not adequately explain the performance of deep learning models, which
prompted a slew of work in the infinite-width and iteration regimes. However,
there is little theoretical explanation for the success of neural networks
beyond the supervised setting. In this paper we argue that, under some
distributional assumptions, classical learning-theoretic measures can
sufficiently explain generalization for graph neural networks in the
transductive setting. In particular, we provide a rigorous analysis of the
performance of neural networks in the context of transductive inference,
specifically by analysing the generalisation properties of graph convolutional
networks for the problem of node classification. While VC dimension yields only
trivial generalisation error bounds in this setting as well, we show that
transductive Rademacher complexity can explain the generalisation properties of
graph convolutional networks for stochastic block models. We further use the
generalisation error bounds based on transductive Rademacher complexity to
demonstrate the role of graph convolutions and network architectures in
achieving smaller generalisation error and provide insights into when the graph
structure can help in learning. The findings of this paper could renew
interest in studying generalisation in neural networks in terms of
learning-theoretic measures, albeit in specific problems.
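To make the abstract's claim concrete, the sketch below builds a toy two-block stochastic block model, applies one degree-normalised graph convolution, and Monte-Carlo estimates the transductive Rademacher complexity of a norm-bounded linear hypothesis class on the raw versus convolved features. This is an illustrative assumption-laden sketch, not the paper's actual construction: the SBM parameters, the feature model, the normalisation `D^-1 (A + I)`, and the linear class are all choices made here for illustration. It uses the standard transductive Rademacher variables taking values in {+1, -1, 0}.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Toy setup (hypothetical parameters): 2-block stochastic block model ---
n, d = 200, 16                        # nodes, feature dimension
y = np.repeat([1, -1], n // 2)        # block labels
p_in, p_out = 0.2, 0.02               # intra-/inter-block edge probabilities
P = np.where(np.equal.outer(y, y), p_in, p_out)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1); A = A + A.T        # symmetric adjacency, no self-loops

# Noisy class-aligned node features
X = y[:, None] * 0.5 + rng.normal(0.0, 1.0, (n, d))

# One graph convolution: degree-normalised aggregation A_hat X = D^-1 (A + I) X
D_inv = 1.0 / (A.sum(1) + 1)
X_conv = D_inv[:, None] * ((A + np.eye(n)) @ X)

# --- Monte-Carlo estimate of transductive Rademacher complexity ---
# For the norm-bounded linear class {x -> <w, x> : ||w|| <= B},
#   sup_w sum_i sigma_i <w, x_i> = B * || sum_i sigma_i x_i ||,
# where sigma_i in {+1, -1, 0} takes +-1 each with probability
# p = m*u / (m+u)^2 (m labelled, u unlabelled nodes).
def trans_rademacher(feats, m, u, B=1.0, trials=500, rng=rng):
    p = m * u / (m + u) ** 2
    vals = []
    for _ in range(trials):
        r = rng.random(len(feats))
        sigma = np.where(r < p, 1.0, np.where(r < 2 * p, -1.0, 0.0))
        vals.append(B * np.linalg.norm(sigma @ feats))
    return (1 / m + 1 / u) * float(np.mean(vals))

m = u = n // 2
print("raw features   :", trans_rademacher(X, m, u))
print("graph-convolved:", trans_rademacher(X_conv, m, u))
```

Because the convolution averages each node's features over its (mostly same-block) neighbourhood, it shrinks the noise while preserving the class signal, so the complexity estimate for the convolved features typically comes out smaller, mirroring the paper's point that graph convolutions can reduce generalisation error under block-model assumptions.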
Related papers
- Feature Contamination: Neural Networks Learn Uncorrelated Features and Fail to Generalize [5.642322814965062]
Learning representations that generalize under distribution shifts is critical for building robust machine learning models.
We show that even allowing a neural network to explicitly fit the representations obtained from a teacher network that can generalize out-of-distribution is insufficient for the generalization of the student network.
arXiv Detail & Related papers (2024-06-05T15:04:27Z) - Generalization Error of Graph Neural Networks in the Mean-field Regime [10.35214360391282]
We explore two widely utilized types of graph neural networks: graph convolutional neural networks and message passing graph neural networks.
Our novel approach involves deriving upper bounds within the mean-field regime for evaluating the generalization error of these graph neural networks.
arXiv Detail & Related papers (2024-02-10T19:12:31Z) - Generalization and Estimation Error Bounds for Model-based Neural
Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow one to construct model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z) - On the Expressiveness and Generalization of Hypergraph Neural Networks [77.65788763444877]
This extended abstract describes a framework for analyzing the expressiveness, learning, and (structural) generalization of hypergraph neural networks (HyperGNNs).
Specifically, we focus on how HyperGNNs can learn from finite datasets and generalize structurally to graph reasoning problems of arbitrary input sizes.
arXiv Detail & Related papers (2023-03-09T18:42:18Z) - How You Start Matters for Generalization [26.74340246715699]
We show that the generalization of neural networks is heavily tied to their initialization.
We make a case against the controversial flat-minima conjecture.
arXiv Detail & Related papers (2022-06-17T05:30:56Z) - Intrinsic Dimension, Persistent Homology and Generalization in Neural
Networks [19.99615698375829]
We show that generalization error can be equivalently bounded in terms of a notion called the 'persistent homology dimension' (PHD).
We develop an efficient algorithm to estimate PHD in the scale of modern deep neural networks.
Our experiments show that the proposed approach can efficiently compute a network's intrinsic dimension in a variety of settings.
arXiv Detail & Related papers (2021-11-25T17:06:15Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
However, stacking many graph convolution layers often degrades performance, and several recent studies attribute this deterioration to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
arXiv Detail & Related papers (2020-07-18T01:11:14Z) - Generalization bound of globally optimal non-convex neural network
training: Transportation map estimation by infinite dimensional Langevin
dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking limit of infinite width of the network to show its global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z) - Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily-measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.