Polynomial Neural Sheaf Diffusion: A Spectral Filtering Approach on Cellular Sheaves
- URL: http://arxiv.org/abs/2512.00242v1
- Date: Fri, 28 Nov 2025 23:10:54 GMT
- Title: Polynomial Neural Sheaf Diffusion: A Spectral Filtering Approach on Cellular Sheaves
- Authors: Alessio Borgi, Fabrizio Silvestri, Pietro Liò
- Abstract summary: Sheaf Neural Networks equip graph structures with a cellular sheaf: a geometric structure that assigns local vector spaces (stalks) and learnable linear restriction/transport maps to nodes and edges. We introduce Polynomial Neural Sheaf Diffusion (PolyNSD), a new sheaf diffusion approach whose propagation operator is a degree-K polynomial in a normalised sheaf Laplacian.
- Score: 23.06390960959433
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sheaf Neural Networks equip graph structures with a cellular sheaf: a geometric structure that assigns local vector spaces (stalks) and learnable linear restriction/transport maps to nodes and edges, yielding an edge-aware inductive bias that handles heterophily and limits oversmoothing. However, common Neural Sheaf Diffusion implementations rely on SVD-based sheaf normalization and dense per-edge restriction maps, which scale poorly with stalk dimension, require frequent Laplacian rebuilds, and yield brittle gradients. To address these limitations, we introduce Polynomial Neural Sheaf Diffusion (PolyNSD), a new sheaf diffusion approach whose propagation operator is a degree-K polynomial in a normalised sheaf Laplacian, evaluated via a stable three-term recurrence on a spectrally rescaled operator. This provides an explicit K-hop receptive field in a single layer (independently of the stalk dimension), with a trainable spectral response obtained as a convex mixture of K+1 orthogonal polynomial basis responses. PolyNSD enforces stability via convex mixtures, spectral rescaling, and residual/gated paths, and reaches new state-of-the-art results on both homophilic and heterophilic benchmarks. It inverts the Neural Sheaf Diffusion trend by obtaining these results with just diagonal restriction maps, decoupling performance from large stalk dimensions while reducing runtime and memory requirements.
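To make the recipe above concrete, here is a minimal PyTorch sketch of a degree-K polynomial filter applied to a pre-built normalised sheaf Laplacian (a dense (N, N) matrix acting on stacked stalk features). The Chebyshev basis, the [0, 2] → [-1, 1] spectral rescaling, the softmax convex mixture, and the sigmoid-gated residual path are assumptions chosen for illustration: they follow the ingredients named in the abstract but are not claimed to match PolyNSD's exact design, and the construction of the Laplacian from (diagonal) restriction maps is omitted.

```python
import torch
import torch.nn as nn


class PolySheafFilter(nn.Module):
    """Illustrative degree-K polynomial filter in a normalised sheaf Laplacian,
    evaluated with a Chebyshev-style three-term recurrence.

    Assumptions (not the paper's exact design): Chebyshev basis, eigenvalues of
    L_norm in [0, 2], softmax convex mixture, sigmoid-gated residual path.
    """

    def __init__(self, dim: int, K: int):
        super().__init__()
        self.K = K
        self.mix_logits = nn.Parameter(torch.zeros(K + 1))  # -> convex weights
        self.lin = nn.Linear(dim, dim)
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, L_norm: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # Spectral rescaling: if the spectrum of L_norm lies in [0, 2],
        # then L_tilde = L_norm - I has spectrum in [-1, 1].
        n = L_norm.shape[0]
        L_tilde = L_norm - torch.eye(n, device=x.device, dtype=x.dtype)

        # Three-term recurrence: T_0 = x, T_1 = L_tilde x,
        # T_{k+1} = 2 L_tilde T_k - T_{k-1}.
        basis = [x, L_tilde @ x]
        for _ in range(2, self.K + 1):
            basis.append(2 * (L_tilde @ basis[-1]) - basis[-2])
        basis = basis[: self.K + 1]

        # Trainable spectral response as a convex mixture of the K+1 basis
        # responses (weights are non-negative and sum to one).
        w = torch.softmax(self.mix_logits, dim=0)
        out = sum(w_k * b_k for w_k, b_k in zip(w, basis))

        # Gated residual path for stability.
        g = torch.sigmoid(self.gate)
        return g * self.lin(out) + (1 - g) * x
```

Because the operator enters only through repeated matrix-vector products, the K-hop receptive field comes from a single layer and no eigendecomposition or SVD-based normalization step is needed at propagation time.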
Related papers
- Quantum-Inspired Tensor Networks for Approximating PDE Flow Maps [1.7887197093662073]
We investigate quantum-inspired tensor networks (QTNs) for approximating flow maps of hydrodynamic partial differential equations (PDEs). Motivated by the effective low-rank structure that emerges after tensorization, we encode PDE states as matrix product states (MPS). Experiments on one- and two-dimensional linear advection-diffusion and nonlinear viscous Burgers equations demonstrate accurate short-horizon prediction, favorable scaling in smooth diffusive regimes, and error growth in nonlinear multi-step predictions.
arXiv Detail & Related papers (2026-02-16T10:06:19Z) - Structure-Informed Estimation for Pilot-Limited MIMO Channels via Tensor Decomposition [51.56484100374058]
This paper formulates pilot-limited channel estimation as low-rank tensor completion from sparse observations. Experiments on synthetic channels demonstrate a 10-20 dB normalized mean-square error (NMSE) improvement over least-squares (LS), and evaluations on DeepMIMO ray-tracing channels show a 24-44% additional NMSE reduction over pure tensor-based methods.
arXiv Detail & Related papers (2026-02-03T23:38:05Z) - Manifold limit for the training of shallow graph convolutional neural networks [1.2744523252873352]
We study the consistency of the training of shallow graph convolutional neural networks (GCNNs) on proximity graphs of sampled point clouds. We prove $\Gamma$-convergence of regularized empirical risk minimization functionals and corresponding convergence of their global minimizers.
arXiv Detail & Related papers (2026-01-09T18:59:20Z) - The Homogeneity Trap: Spectral Collapse in Doubly-Stochastic Deep Networks [1.7523718031184992]
We identify a critical spectral degradation phenomenon inherent to structure-preserving deep architectures. We show that the maximum-entropy bias drives the mixing operator towards the uniform barycenter, suppressing the subdominant singular value. We derive a spectral bound linking this singular value to the network's effective depth, showing that high-entropy constraints restrict feature transformation to a shallow receptive field.
arXiv Detail & Related papers (2026-01-05T13:09:42Z) - Sheaf Graph Neural Networks via PAC-Bayes Spectral Optimization [13.021238902084647]
Over-smoothing in Graph Neural Networks (GNNs) causes distinct node features to collapse. We introduce SGPC (Sheaf GNNs with PAC-Bayes), a unified architecture that combines cellular-sheaf message passing with several mechanisms. Experiments on nine homophilic and heterophilic benchmarks show that SGPC outperforms state-of-the-art spectral and sheaf-based GNNs.
arXiv Detail & Related papers (2025-08-01T06:39:28Z) - Low-Rank Tensor Recovery via Variational Schatten-p Quasi-Norm and Jacobian Regularization [49.85875869048434]
We propose a CP-based low-rank tensor function parameterized by neural networks for implicit neural representation. To achieve a sparser CP decomposition, we introduce a variational Schatten-p quasi-norm to prune redundant rank-1 components. For smoothness, we propose a regularization term based on the spectral norm of the Jacobian and Hutchinson's trace estimator.
arXiv Detail & Related papers (2025-06-27T11:23:10Z) - Enabling Probabilistic Learning on Manifolds through Double Diffusion Maps [3.081704060720176]
We present a generative learning framework for probabilistic sampling based on an extension of the Probabilistic Learning on Manifolds (PLoM) approach. We solve a full-order ISDE directly in the latent space, preserving the full dynamical complexity of the system.
arXiv Detail & Related papers (2025-06-02T20:58:49Z) - Approximation properties of neural ODEs [5.828989070109041]
We prove the universal approximation property (UAP) of shallow neural networks in the space of continuous functions. In particular, we constrain the Lipschitz constant of the neural ODE's flow map and the norms of the weights to increase the network's stability.
arXiv Detail & Related papers (2025-03-19T21:11:28Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion [83.90492831583997]
We show that a batch-normalized network can keep the optimal signal propagation properties, but avoid exploding gradients in depth.
We use a Multi-Layer Perceptron (MLP) with linear activations and batch normalization that provably has bounded gradients at any depth.
We also design an activation shaping scheme that empirically achieves the same properties for certain non-linear activations.
arXiv Detail & Related papers (2023-10-03T12:35:02Z) - Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z) - Decomposed Diffusion Sampler for Accelerating Large-Scale Inverse Problems [64.29491112653905]
We propose a novel and efficient diffusion sampling strategy that synergistically combines diffusion sampling with Krylov subspace methods.
Specifically, we prove that if the tangent space at a sample denoised by Tweedie's formula forms a Krylov subspace, then CG initialized with the denoised data ensures that the data-consistency update remains in the tangent space.
Our proposed method achieves more than 80 times faster inference time than the previous state-of-the-art method.
arXiv Detail & Related papers (2023-03-10T07:42:49Z) - Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks [22.57374777395746]
We study the eigenvalue distributions of the Conjugate Kernel (CK) and Neural Tangent Kernel (NTK) associated with feedforward neural networks.
We show that the eigenvalue distributions of the CK and NTK converge to deterministic limits.
arXiv Detail & Related papers (2020-05-25T01:11:49Z)