The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis
- URL: http://arxiv.org/abs/2510.17515v1
- Date: Mon, 20 Oct 2025 13:13:35 GMT
- Title: The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis
- Authors: Hoang Pham, The-Anh Ta, Tom Jacobs, Rebekka Burkholz, Long Tran-Thanh
- Abstract summary: We study the training dynamics of sparse networks in the infinite width limit. Our framework provides theoretical insights into the impact of connectivity patterns on the trainability of various sparse network architectures.
- Score: 36.176112635780406
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sparse neural networks promise efficiency, yet training them effectively remains a fundamental challenge. Despite advances in pruning methods that create sparse architectures, why some sparse structures are more trainable than others at the same level of sparsity remains poorly understood. Aiming for a systematic approach to this fundamental problem, we propose a novel theoretical framework based on the theory of graph limits, particularly graphons, that characterizes sparse neural networks in the infinite-width regime. Our key insight is that the connectivity patterns induced by pruning methods converge to specific graphons as network width tends to infinity, and these graphons encode the implicit structural biases of the different pruning methods. We postulate this as the Graphon Limit Hypothesis and provide empirical evidence to support it. Leveraging the graphon representation, we derive a Graphon Neural Tangent Kernel (Graphon NTK) to study the training dynamics of sparse networks in the infinite-width limit. The Graphon NTK provides a general framework for the theoretical analysis of sparse networks. We show empirically that the spectral analysis of the Graphon NTK correlates with the observed training dynamics of sparse networks, explaining the varying convergence behaviours of different pruning methods. Our framework thus offers theoretical insight into how connectivity patterns affect the trainability of sparse network architectures.
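To make the graphon view concrete, the following minimal sketch estimates a step-function graphon from one layer's binary pruning mask and inspects its spectrum. It assumes a degree-sorted block-averaging heuristic and a random mask as stand-ins; it illustrates the idea, not the paper's actual pipeline.

```python
import numpy as np

def empirical_graphon(mask: np.ndarray, k: int = 32) -> np.ndarray:
    """Block-averaged (step-function) graphon estimate of a binary
    pruning mask for one layer. Rows/columns are sorted by degree
    first -- a standard graphon-estimation heuristic, not necessarily
    the paper's exact procedure."""
    order_r = np.argsort(-mask.sum(axis=1))
    order_c = np.argsort(-mask.sum(axis=0))
    m = mask[np.ix_(order_r, order_c)].astype(float)
    rows = np.array_split(np.arange(m.shape[0]), k)
    cols = np.array_split(np.arange(m.shape[1]), k)
    return np.array([[m[np.ix_(r, c)].mean() for c in cols] for r in rows])

# Spectrum of the step-function approximation; its decay is the kind of
# quantity a Graphon-NTK spectral analysis would examine.
mask = np.random.rand(512, 512) < 0.1   # stand-in for a real pruning mask
w = empirical_graphon(mask)
spectrum = np.sort(np.abs(np.linalg.eigvals(w)))[::-1]
print(spectrum[:5])
```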
Related papers
- Pruning at Initialisation through the lens of Graphon Limit: Convergence, Expressivity, and Generalisation [16.628325681877556]
Pruning at Initialisation (PaI) methods discover sparse, trainable networks before training, but their theoretical mechanisms remain elusive. In this work, we connect discrete PaI pruning masks to graph limit theory via graphons, establishing the graphon limit of PaI masks. We derive two fundamental theoretical results: (i) a Universal Approximation Theorem for sparse networks that depends only on the dimension of the active coordinate subspaces; and (ii) a Graphon-NTK generalisation bound demonstrating how the limit graphon's geometry aligns with informative features.
arXiv Detail & Related papers (2026-02-06T13:02:47Z)
- Spectral Theory for Edge Pruning in Asynchronous Recurrent Graph Neural Networks [0.0]
Asynchronous Recurrent Graph Neural Networks (ARGNNs) capture complex dependencies in dynamic graphs, resembling the intricate and adaptive nature of living organisms. This paper presents a dynamic pruning method based on graph spectral theory, leveraging the imaginary component of the eigenvalues of the network graph's Laplacian.
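As a rough illustration of that idea, the sketch below scores each directed edge by how much its removal perturbs the imaginary part of the Laplacian spectrum, then greedily prunes the least disruptive edges. The proxy score is an assumption made for illustration, not the paper's exact criterion.

```python
import numpy as np

def imag_spectral_mass(adj: np.ndarray) -> float:
    """Total imaginary mass of the directed-graph Laplacian spectrum;
    asymmetric (directed) structure is what produces complex eigenvalues."""
    lap = np.diag(adj.sum(axis=1)) - adj
    return float(np.abs(np.linalg.eigvals(lap).imag).sum())

def prune_k_edges(adj: np.ndarray, k: int) -> np.ndarray:
    """Greedily drop the k edges whose removal perturbs the imaginary
    spectrum the least (a plausible proxy; the paper's edge score may differ)."""
    adj = adj.astype(float).copy()
    for _ in range(k):
        base = imag_spectral_mass(adj)
        best, best_score = None, np.inf
        for i, j in zip(*np.nonzero(adj)):
            w = adj[i, j]
            adj[i, j] = 0.0                       # tentatively remove edge
            score = abs(imag_spectral_mass(adj) - base)
            adj[i, j] = w                         # restore it
            if score < best_score:
                best, best_score = (i, j), score
        if best is None:
            break
        adj[best] = 0.0                           # commit the cheapest removal
    return adj
```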
arXiv Detail & Related papers (2025-02-23T13:05:08Z)
- A Signed Graph Approach to Understanding and Mitigating Oversmoothing in GNNs [54.62268052283014]
We present a unified theoretical perspective based on the framework of signed graphs. We show that many existing strategies implicitly introduce negative edges that alter message-passing to resist oversmoothing. We propose Structural Balanced Propagation (SBP), a plug-and-play method that assigns signed edges based on either labels or feature similarity.
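A minimal sketch of the label-based variant, assuming a plain degree-normalized signed propagation; SBP's actual normalization and edge-assignment details may differ.

```python
import numpy as np

def signed_propagation(x, adj, labels, steps=2):
    """Structural-balance-style propagation: edges within a class keep
    sign +1, edges across classes get sign -1, so cross-class messages
    repel rather than average toward a common value."""
    same = labels[:, None] == labels[None, :]
    signed = np.where(same, adj, -adj)            # flip cross-class edges
    deg = np.abs(signed).sum(axis=1, keepdims=True).clip(min=1.0)
    p = signed / deg                              # normalized signed operator
    for _ in range(steps):
        x = p @ x
    return x
```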
arXiv Detail & Related papers (2025-02-17T03:25:36Z)
- Superhypergraph Neural Networks and Plithogenic Graph Neural Networks: Theoretical Foundations [0.0]
Hypergraphs extend traditional graphs by allowing edges to connect multiple nodes, while superhypergraphs further generalize this concept to represent even more complex relationships. Graph Neural Networks (GNNs), a well-established framework, have recently been extended to Hypergraph Neural Networks (HGNNs). This paper establishes the theoretical foundation for the development of SuperHyperGraph Neural Networks (SHGNNs) and Plithogenic Graph Neural Networks.
arXiv Detail & Related papers (2024-12-02T06:33:02Z)
- Revealing Decurve Flows for Generalized Graph Propagation [108.80758541147418]
This study addresses the limitations of the traditional analysis of message-passing, central to graph learning, by defining generalized propagation with directed and weighted graphs.
We include a preliminary exploration of learned propagation patterns in datasets, a first in the field.
arXiv Detail & Related papers (2024-02-13T14:13:17Z)
- Fine-tuning Graph Neural Networks by Preserving Graph Generative Patterns [13.378277755978258]
We show that the structural divergence between pre-training and downstream graphs significantly limits the transferability when using the vanilla fine-tuning strategy.
We propose G-Tuning to preserve the generative patterns of downstream graphs.
G-Tuning demonstrates average improvements of 0.5% and 2.6% on in-domain and out-of-domain transfer learning experiments, respectively.
arXiv Detail & Related papers (2023-12-21T05:17:10Z)
- Learning Graph Structure from Convolutional Mixtures [119.45320143101381]
We propose a graph convolutional relationship between the observed and latent graphs, and formulate the graph learning task as a network inverse (deconvolution) problem.
In lieu of eigendecomposition-based spectral methods, we unroll and truncate proximal gradient iterations to arrive at a parameterized neural network architecture that we call a Graph Deconvolution Network (GDN).
GDNs can learn a distribution of graphs in a supervised fashion, perform link prediction or edge-weight regression tasks by adapting the loss function, and they are inherently inductive.
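The unrolling idea can be sketched as follows, assuming a deliberately simplified linear observation model and hand-set step sizes; in the actual GDN these quantities are learned per layer.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the L1 norm, promoting edge sparsity."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def gdn_forward(obs, layers=5, tau=0.05, lam=0.01, alpha=1.0):
    """Unrolled, truncated proximal-gradient iterations that recover a
    sparse latent adjacency A from an observed graph `obs`, modeled here
    (simplistically) as obs ~ alpha * A. A hypothetical instantiation of
    the unrolling pattern, not the paper's learned architecture."""
    a = np.zeros_like(obs)
    for _ in range(layers):
        grad = alpha * (alpha * a - obs)     # gradient of 0.5||obs - alpha*A||_F^2
        a = soft_threshold(a - tau * grad, tau * lam)
        a = 0.5 * (a + a.T)                  # keep the estimate symmetric
        np.fill_diagonal(a, 0.0)             # no self-loops
    return a
```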
arXiv Detail & Related papers (2022-05-19T14:08:15Z)
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Graph-Coupled Oscillator Networks [23.597444325599835]
Graph-Coupled Oscillator Networks (GraphCON) is a novel framework for deep learning on graphs.
We show that our framework offers competitive performance with respect to the state-of-the-art on a variety of graph-based learning tasks.
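A sketch of one layer, following the published graph-coupled oscillator dynamics X'' = σ(F(X)) − γX − αX' with an IMEX time discretization; the coupling function here is a simplified GCN-style aggregation, so details are assumptions rather than the paper's exact architecture.

```python
import numpy as np

def graphcon_step(x, y, a_hat, w, dt=0.5, gamma=1.0, alpha=1.0):
    """One IMEX step of the oscillator dynamics: x is the node-feature
    'position' state (n x d), y its 'velocity'; a_hat is a normalized
    adjacency (n x n) and w a weight matrix (d x d) for the coupling."""
    coupling = np.tanh(a_hat @ x @ w)                # sigma(F(x))
    y = y + dt * (coupling - gamma * x - alpha * y)  # velocity update
    x = x + dt * y                                   # position update
    return x, y
```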
arXiv Detail & Related papers (2022-02-04T18:29:49Z)
- Implicit Graph Neural Networks [46.0589136729616]
We propose a graph learning framework called Implicit Graph Neural Networks (IGNN).
IGNNs consistently capture long-range dependencies and outperform state-of-the-art GNN models.
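"Implicit" here means the layer output is the fixed point of an equilibrium equation rather than a finite stack of propagations, which is how long-range dependencies are captured. A minimal sketch via naive fixed-point iteration, assuming ReLU activation and omitting IGNN's well-posedness projection on W.

```python
import numpy as np

def ignn_fixed_point(a_hat, w, b, u, iters=50, tol=1e-6):
    """Solve the implicit equilibrium X = phi(W X A_hat + B U) by simple
    fixed-point iteration. Shapes: a_hat (n, n), w (d, d), b (d, p),
    u (p, n); the state X is (d, n). A sketch, not IGNN's projected
    training procedure."""
    x = np.zeros((w.shape[0], a_hat.shape[0]))
    for _ in range(iters):
        x_new = np.maximum(w @ x @ a_hat + b @ u, 0.0)  # phi = ReLU
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```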
arXiv Detail & Related papers (2020-09-14T06:04:55Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
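A minimal sketch of the gating idea, assuming sigmoid edge gates over a complete DAG and plain linear + ReLU node operations; an illustrative reduction, not the paper's search space.

```python
import numpy as np

def gated_dag_forward(x, weights, theta):
    """Forward pass over a complete DAG: node j aggregates every earlier
    node's output, scaled by a learnable gate sigmoid(theta[i, j]). The
    gates make connectivity continuous and differentiable, so edge
    importance is trained jointly with the ops.

    x       : (d,) input vector
    weights : list of L (d, d) matrices, one per non-input node
    theta   : (L + 1, L + 1) matrix of edge logits
    """
    gates = 1.0 / (1.0 + np.exp(-theta))   # edge strengths in (0, 1)
    outs = [x]
    for j in range(1, len(weights) + 1):
        agg = sum(gates[i, j] * outs[i] for i in range(j))
        outs.append(np.maximum(weights[j - 1] @ agg, 0.0))
    return outs[-1]
```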
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Towards Deeper Graph Neural Networks [63.46470695525957]
Graph convolutions perform neighborhood aggregation and represent one of the most important graph operations.
Several recent studies attribute the performance deterioration of deeper models to the over-smoothing issue.
We propose Deep Adaptive Graph Neural Network (DAGNN) to adaptively incorporate information from large receptive fields.
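A sketch of the adaptive, decoupled propagation, assuming sigmoid retention scores over hop-wise representations and omitting the feature-transformation MLP that DAGNN applies before propagating.

```python
import numpy as np

def dagnn_propagate(x, a_hat, s, k=10):
    """Stack representations from 0..k hops and combine them with
    per-hop, per-node retention scores sigmoid(H_k @ s), so each node
    draws on a large receptive field without over-smoothing.

    x     : (n, d) node features, a_hat : (n, n) normalized adjacency,
    s     : (d, 1) learnable scoring vector.
    """
    hops = [x]
    for _ in range(k):
        hops.append(a_hat @ hops[-1])
    h = np.stack(hops)                         # (k+1, n, d)
    scores = 1.0 / (1.0 + np.exp(-(h @ s)))    # (k+1, n, 1)
    return (scores * h).sum(axis=0)            # adaptive hop mixture, (n, d)
```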
arXiv Detail & Related papers (2020-07-18T01:11:14Z)
- Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking the infinite-width limit of the network to show its global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z)