On Universality of Deep Equivariant Networks
- URL: http://arxiv.org/abs/2510.15814v1
- Date: Fri, 17 Oct 2025 16:51:31 GMT
- Title: On Universality of Deep Equivariant Networks
- Authors: Marco Pacini, Mircea Petrache, Bruno Lepri, Shubhendu Trivedi, Robin Walters
- Abstract summary: Universality results for equivariant neural networks remain rare. We show that with sufficient depth or with the addition of appropriate readout layers, equivariant networks attain universality within the entry-wise separable regime.
- Score: 23.16940006451027
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Universality results for equivariant neural networks remain rare. Those that do exist typically hold only in restrictive settings: either they rely on regular or higher-order tensor representations, leading to impractically high-dimensional hidden spaces, or they target specialized architectures, often confined to the invariant setting. This work develops a more general account. For invariant networks, we establish a universality theorem under separation constraints, showing that the addition of a fully connected readout layer secures approximation within the class of separation-constrained continuous functions. For equivariant networks, where results are even scarcer, we demonstrate that standard separability notions are inadequate and introduce the sharper criterion of $\textit{entry-wise separability}$. We show that with sufficient depth or with the addition of appropriate readout layers, equivariant networks attain universality within the entry-wise separable regime. Together with prior results showing the failure of universality for shallow models, our findings identify depth and readout layers as a decisive mechanism for universality, additionally offering a unified perspective that subsumes and extends earlier specialized results.
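As a purely illustrative reading of the architectural recipe in the abstract (equivariant layers followed by a fully connected readout), the sketch below builds a small permutation-invariant network in PyTorch in the spirit of DeepSets: weight-shared layers act on each set element, a symmetric pooling step makes the output invariant, and an ordinary MLP readout is applied on top. The group (the symmetric group acting by permuting set elements), the layer sizes, and the pooling choice are assumptions made for this example; the sketch does not reproduce the paper's constructions or its entry-wise separability criterion. Loosely, a separation constraint restricts the target functions to those that never distinguish inputs the architecture itself cannot distinguish; the paper should be consulted for the formal definitions.

```python
# Illustrative sketch only: a permutation-invariant network with a fully
# connected readout layer, in the spirit of the architectures discussed above.
# Layer sizes, the group (S_n acting by permuting set elements), and the
# DeepSets-style sum pooling are assumptions made for this example.
import torch
import torch.nn as nn


class InvariantNetWithReadout(nn.Module):
    def __init__(self, in_dim: int, hidden: int = 64, out_dim: int = 1, depth: int = 3):
        super().__init__()
        # Equivariant part: the same MLP is applied to every set element,
        # so permuting the elements permutes the hidden features accordingly.
        layers = []
        d = in_dim
        for _ in range(depth):
            layers += [nn.Linear(d, hidden), nn.ReLU()]
            d = hidden
        self.phi = nn.Sequential(*layers)
        # Fully connected readout applied after invariant (sum) pooling.
        self.readout = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, out_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_elements, in_dim); the output is invariant to
        # permutations along the n_elements axis.
        h = self.phi(x)          # element-wise (equivariant) features
        pooled = h.sum(dim=1)    # permutation-invariant pooling
        return self.readout(pooled)


if __name__ == "__main__":
    net = InvariantNetWithReadout(in_dim=5)
    x = torch.randn(2, 7, 5)
    perm = torch.randperm(7)
    # The output should be (numerically) unchanged under permutation of elements.
    print(torch.allclose(net(x), net(x[:, perm, :]), atol=1e-5))
```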
Related papers
- The Inductive Bias of Convolutional Neural Networks: Locality and Weight Sharing Reshape Implicit Regularization [57.37943479039033]
We study how architectural inductive bias reshapes the implicit regularization induced by the edge-of-stability phenomenon in gradient descent. We show that locality and weight sharing fundamentally change this picture.
arXiv Detail & Related papers (2026-03-05T04:50:51Z)
- Dense Neural Networks are not Universal Approximators [53.27010448621372]
We show that dense neural networks are not universal approximators of arbitrary continuous functions. We consider ReLU neural networks subject to natural constraints on weights and on input and output dimensions.
arXiv Detail & Related papers (2026-02-07T16:52:38Z)
- Random-Matrix-Induced Simplicity Bias in Over-parameterized Variational Quantum Circuits [72.0643009153473]
We show that expressive variational ansatze enter a Haar-like universality class in which both observable expectation values and parameter gradients concentrate exponentially with system size. As a consequence, the hypothesis class induced by such circuits collapses with high probability to a narrow family of near-constant functions. We further show that this collapse is not unavoidable: tensor-structured VQCs, including tensor-network-based and tensor-hypernetwork parameterizations, lie outside the Haar-like universality class.
arXiv Detail & Related papers (2026-01-05T08:04:33Z)
- Drawback of Enforcing Equivariance and its Compensation via the Lens of Expressive Power [75.44625156899468]
We investigate the impact of equivariance constraints on the expressivity of equivariant and layer-wise equivariant networks. We show that despite a larger model size, the resulting architecture could still correspond to a hypothesis space with lower complexity.
arXiv Detail & Related papers (2025-12-10T14:18:59Z)
- On Universality Classes of Equivariant Networks [9.137637807153464]
We investigate the approximation power of equivariant neural networks beyond separation constraints. We show that separation power does not fully capture expressivity. We identify settings where shallow equivariant networks do achieve universality.
arXiv Detail & Related papers (2025-06-02T22:07:52Z)
- A Near Complete Nonasymptotic Generalization Theory For Multilayer Neural Networks: Beyond the Bias-Variance Tradeoff [57.25901375384457]
We propose a nonasymptotic generalization theory for multilayer neural networks with arbitrary Lipschitz activations and general Lipschitz loss functions. In particular, it does not require the boundedness of the loss function, as commonly assumed in the literature. We show the near minimax optimality of our theory for multilayer ReLU networks for regression problems.
arXiv Detail & Related papers (2025-03-03T23:34:12Z)
- On the Sample Complexity of One Hidden Layer Networks with Equivariance, Locality and Weight Sharing [12.845681770287005]
Weight sharing, equivariance, and local filters are believed to contribute to the sample efficiency of neural networks. We show that locality has generalization benefits; however, the uncertainty principle implies a trade-off between locality and expressivity.
arXiv Detail & Related papers (2024-11-21T16:36:01Z)
- Deep Ridgelet Transform and Unified Universality Theorem for Deep and Shallow Joint-Group-Equivariant Machines [15.67299102925013]
We present a constructive universal approximation theorem for learning machines equipped with joint-group-equivariant feature maps. Our main theorem also unifies the universal approximation theorems for both shallow and deep networks.
arXiv Detail & Related papers (2024-05-22T14:25:02Z)
- Generalization of Scaled Deep ResNets in the Mean-Field Regime [55.77054255101667]
We investigate scaled ResNets in the limit of infinitely deep and wide neural networks.
Our results offer new insights into the generalization ability of deep ResNets beyond the lazy training regime.
arXiv Detail & Related papers (2024-03-14T21:48:00Z)
- The Sample Complexity of One-Hidden-Layer Neural Networks [57.6421258363243]
We study a class of scalar-valued one-hidden-layer networks with inputs bounded in Euclidean norm.
We prove that controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees.
We analyze two important settings where a mere spectral norm control turns out to be sufficient.
arXiv Detail & Related papers (2022-02-13T07:12:02Z)
- Label-Based Diversity Measure Among Hidden Units of Deep Neural Networks: A Regularization Method [18.72270439152708]
We introduce a new definition of redundancy to describe the diversity of hidden units under supervised learning settings.
We prove an inverse relationship between the defined redundancy and the generalization capacity.
Experiments show that DNNs using the redundancy measure as a regularizer effectively reduce overfitting and decrease the generalization error.
arXiv Detail & Related papers (2020-09-19T04:27:44Z)
- Recursive Multi-model Complementary Deep Fusion for Robust Salient Object Detection via Parallel Sub Networks [62.26677215668959]
Fully convolutional networks have shown outstanding performance in the salient object detection (SOD) field.
This paper proposes a "wider" network architecture which consists of parallel sub-networks with totally different network architectures.
Experiments on several famous benchmarks clearly demonstrate the superior performance, good generalization, and powerful learning ability of the proposed wider framework.
arXiv Detail & Related papers (2020-08-07T10:39:11Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights (an illustrative sketch of this idea appears after this list).
arXiv Detail & Related papers (2020-07-13T18:05:36Z)
- Attentive Normalization for Conditional Image Generation [126.08247355367043]
We characterize long-range dependence with attentive normalization (AN), which is an extension to traditional instance normalization.
Compared with self-attention GAN, our attentive normalization does not need to measure the correlation of all locations.
Experiments on class-conditional image generation and semantic inpainting verify the efficacy of our proposed module.
arXiv Detail & Related papers (2020-04-08T06:12:25Z)
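The closed-form factorization entry above ("Closed-Form Factorization of Latent Semantics in GANs") describes discovering interpretable latent directions by directly decomposing pre-trained generator weights. The sketch below shows one common instantiation of that idea, assuming (as in SeFa-style factorization) that the relevant weights are those of the first affine layer applied to the latent code and that the top eigenvectors of A^T A serve as candidate semantic directions. The random matrix, the layer choice, and the number of directions are placeholders so the snippet runs; this is not a reproduction of the paper's exact procedure.

```python
# Illustrative sketch: closed-form discovery of latent directions by
# decomposing a pre-trained weight matrix (SeFa-style). The matrix `A` below
# stands in for the weights of the first affine layer applied to the latent
# code of a pre-trained GAN; here it is random only so the snippet runs.
import numpy as np

rng = np.random.default_rng(0)
latent_dim, feature_dim = 512, 1024
A = rng.standard_normal((feature_dim, latent_dim))  # placeholder for pre-trained weights

# Directions n maximizing ||A n||_2 under ||n||_2 = 1 are eigenvectors of A^T A.
eigvals, eigvecs = np.linalg.eigh(A.T @ A)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]            # sort descending by eigenvalue
directions = eigvecs[:, order]               # columns = candidate semantic directions

k = 5  # number of directions to keep (illustrative choice)
top_directions = directions[:, :k]
print(top_directions.shape)  # (latent_dim, k); edit a latent z via z + alpha * top_directions[:, i]
```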