SUPN: Shallow Universal Polynomial Networks
- URL: http://arxiv.org/abs/2511.21414v1
- Date: Wed, 26 Nov 2025 14:06:42 GMT
- Title: SUPN: Shallow Universal Polynomial Networks
- Authors: Zachary Morrow, Michael Penwarden, Brian Chen, Aurya Javeed, Akil Narayan, John D. Jakeman
- Abstract summary: Deep neural networks (DNNs) and Kolmogorov-Arnold networks (KANs) are popular methods for function approximation. We propose shallow universal polynomial networks (SUPNs) to produce a suitable approximation with far fewer trainable parameters. We show that SUPNs converge at the same rate as the best polynomial approximation of the same degree.
- Score: 2.817874121826956
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) and Kolmogorov-Arnold networks (KANs) are popular methods for function approximation due to their flexibility and expressivity. However, they typically require a large number of trainable parameters to produce a suitable approximation. Beyond making the resulting network less transparent, overparameterization creates a large optimization space, likely producing local minima in training that have quite different generalization errors. In this case, network initialization can have an outsize impact on the model's out-of-sample accuracy. For these reasons, we propose shallow universal polynomial networks (SUPNs). These networks replace all but the last hidden layer with a single layer of polynomials with learnable coefficients, leveraging the strengths of DNNs and polynomials to achieve sufficient expressivity with far fewer parameters. We prove that SUPNs converge at the same rate as the best polynomial approximation of the same degree, and we derive explicit formulas for quasi-optimal SUPN parameters. We complement theory with an extensive suite of numerical experiments involving SUPNs, DNNs, KANs, and polynomial projection in one, two, and ten dimensions, consisting of over 13,000 trained models. On the target functions we numerically studied, for a given number of trainable parameters, the approximation error and variability are often lower for SUPNs than for DNNs and KANs by an order of magnitude. In our examples, SUPNs even outperform polynomial projection on non-smooth functions.
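To make the architecture concrete, below is a minimal sketch (in JAX) of a SUPN-style model as described in the abstract: a single layer of polynomials with learnable coefficients feeding the one remaining hidden layer and a linear output. The Chebyshev basis, layer widths, toy target, and plain gradient-descent training loop are illustrative assumptions, not the authors' reference implementation (the paper instead derives explicit formulas for quasi-optimal SUPN parameters).

```python
# Illustrative SUPN-style model: a single layer of polynomials with
# learnable coefficients, one standard hidden layer, and a linear output.
# This is a sketch based on the abstract's description; the basis choice,
# widths, target function, and optimizer are assumptions for illustration.
import jax
import jax.numpy as jnp

DEGREE = 8     # polynomial degree per node (assumed)
N_POLY = 4     # number of learnable polynomial nodes (assumed)
N_HIDDEN = 16  # width of the single remaining hidden layer (assumed)

def chebyshev_features(x, degree):
    """Chebyshev basis T_0..T_degree evaluated at x in [-1, 1]."""
    feats = [jnp.ones_like(x), x]
    for _ in range(2, degree + 1):
        feats.append(2.0 * x * feats[-1] - feats[-2])
    return jnp.stack(feats, axis=-1)               # shape (n, degree + 1)

def init_params(key):
    k1, k2, k3 = jax.random.split(key, 3)
    return {
        "coef": 0.1 * jax.random.normal(k1, (DEGREE + 1, N_POLY)),  # learnable polynomial coefficients
        "W": 0.1 * jax.random.normal(k2, (N_POLY, N_HIDDEN)),
        "b": jnp.zeros(N_HIDDEN),
        "w_out": 0.1 * jax.random.normal(k3, (N_HIDDEN,)),
        "b_out": 0.0,
    }

def supn(params, x):
    phi = chebyshev_features(x, DEGREE)             # (n, DEGREE + 1)
    p = phi @ params["coef"]                        # polynomial layer, (n, N_POLY)
    h = jnp.tanh(p @ params["W"] + params["b"])     # last hidden layer
    return h @ params["w_out"] + params["b_out"]    # linear output

def loss(params, x, y):
    return jnp.mean((supn(params, x) - y) ** 2)

# Toy 1D regression target (non-smooth) and full-batch gradient descent.
x = jnp.linspace(-1.0, 1.0, 256)
y = jnp.abs(x) + 0.25 * jnp.sin(4.0 * jnp.pi * x)

params = init_params(jax.random.PRNGKey(0))
grad_fn = jax.jit(jax.grad(loss))
for step in range(2000):
    grads = grad_fn(params, x, y)
    params = jax.tree_util.tree_map(lambda p, g: p - 0.05 * g, params, grads)

print("final MSE:", float(loss(params, x, y)))
```

The plain gradient-descent loop is used here only for illustration; the quasi-optimal parameter formulas proved in the paper would replace or warm-start such training.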
Related papers
- efunc: An Efficient Function Representation without Neural Networks [46.76882780184126]
We propose a novel framework for continuous function modeling. Most existing works can be formulated using this framework. We then introduce a compact function representation, which is based on parameter-efficient functions, bypassing both neural networks and complex structures.
arXiv Detail & Related papers (2025-05-27T15:16:56Z) - Polynomial-Augmented Neural Networks (PANNs) with Weak Orthogonality Constraints for Enhanced Function and PDE Approximation [22.689531776611084]
We present polynomial-augmented neural networks (PANNs), a novel machine learning architecture that combines deep neural networks (DNNs) with a polynomial approximant.
arXiv Detail & Related papers (2024-06-04T14:06:15Z) - A Simple Solution for Homomorphic Evaluation on Large Intervals [0.0]
Homomorphic encryption (HE) is a promising technique used for privacy-preserving computation.
Homomorphic evaluation of approximations for non-polynomial functions plays an important role in privacy-preserving machine learning.
We introduce a simple solution for approximating arbitrary functions, one that may have been overlooked by researchers.
arXiv Detail & Related papers (2024-05-24T04:13:22Z) - Sparse Deep Neural Network for Nonlinear Partial Differential Equations [3.0069322256338906]
This paper is devoted to a numerical study of adaptive approximation of solutions of nonlinear partial differential equations.
We develop deep neural networks (DNNs) with a sparse regularization with multiple parameters to represent functions having certain singularities.
Numerical examples confirm that solutions generated by the proposed SDNN are sparse and accurate.
arXiv Detail & Related papers (2022-07-27T03:12:16Z) - Bagged Polynomial Regression and Neural Networks [0.0]
Series and polynomial regression are able to approximate the same function classes as neural networks.
Bagged polynomial regression (BPR) is an attractive alternative to neural networks.
BPR performs as well as neural networks in crop classification using satellite data.
arXiv Detail & Related papers (2022-05-17T19:55:56Z) - Statistically Meaningful Approximation: a Case Study on Approximating
Turing Machines with Transformers [50.85524803885483]
This work proposes a formal definition of statistically meaningful (SM) approximation which requires the approximating network to exhibit good statistical learnability.
We study SM approximation for two function classes: circuits and Turing machines.
arXiv Detail & Related papers (2021-07-28T04:28:55Z) - Probabilistic Circuits for Variational Inference in Discrete Graphical
Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating Evidence Lower Bound (ELBO)
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPN)
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs now maintain performance with dramatic reduction in parameters and computations.
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - Multipole Graph Neural Operator for Parametric Partial Differential
Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data in a structure suitable for neural networks.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed stochastic algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our algorithm requires far fewer communication rounds than a naive parallel baseline while still achieving a linear speedup in theory.
Experiments on several benchmark datasets demonstrate the effectiveness of our algorithm and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.