Universal Approximation Theorem for Input-Connected Multilayer Perceptrons
- URL: http://arxiv.org/abs/2601.14026v1
- Date: Tue, 20 Jan 2026 14:48:45 GMT
- Title: Universal Approximation Theorem for Input-Connected Multilayer Perceptrons
- Authors: Vugar Ismailov
- Abstract summary: We introduce the Input-Connected Multilayer Perceptron (IC-MLP), a feedforward neural network architecture in which each hidden neuron receives, in addition to the outputs of the preceding layer, a direct affine connection from the raw input. We prove a universal approximation theorem showing that deep IC-MLPs can approximate any continuous function on a closed interval of the real line if and only if the activation function is nonlinear. We then extend the analysis to vector-valued inputs and establish a corresponding universal approximation theorem for continuous functions on compact subsets of $\mathbb{R}^n$.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce the Input-Connected Multilayer Perceptron (IC-MLP), a feedforward neural network architecture in which each hidden neuron receives, in addition to the outputs of the preceding layer, a direct affine connection from the raw input. We first study this architecture in the univariate setting and give an explicit and systematic description of IC-MLPs with an arbitrary finite number of hidden layers, including iterated formulas for the network functions. In this setting, we prove a universal approximation theorem showing that deep IC-MLPs can approximate any continuous function on a closed interval of the real line if and only if the activation function is nonlinear. We then extend the analysis to vector-valued inputs and establish a corresponding universal approximation theorem for continuous functions on compact subsets of $\mathbb{R}^n$.
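As an illustration of the architecture described in the abstract, each hidden layer applies its activation to the sum of a linear map of the previous layer's output and a separate affine map of the raw input. The sketch below is an illustrative reading of that recursion, not the authors' reference implementation: the layer widths, the tanh activation, and the PyTorch framing are all assumptions made here for concreteness.

```python
import torch
import torch.nn as nn

class ICLayer(nn.Module):
    """One hidden layer of an input-connected MLP: sigma(W h + A x + b),
    where h is the previous layer's output and x is the raw network input."""
    def __init__(self, in_dim, hidden_dim, input_dim, activation=nn.Tanh()):
        super().__init__()
        self.from_hidden = nn.Linear(in_dim, hidden_dim)                # W h + b
        self.from_input = nn.Linear(input_dim, hidden_dim, bias=False)  # A x (direct input connection)
        self.activation = activation

    def forward(self, h, x):
        return self.activation(self.from_hidden(h) + self.from_input(x))

class ICMLP(nn.Module):
    """Deep IC-MLP: every hidden layer also receives the raw input x."""
    def __init__(self, input_dim, hidden_dim, depth, out_dim):
        super().__init__()
        widths = [input_dim] + [hidden_dim] * depth
        self.layers = nn.ModuleList(
            ICLayer(widths[i], widths[i + 1], input_dim) for i in range(depth)
        )
        self.readout = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        h = x
        for layer in self.layers:
            h = layer(h, x)   # pass the raw input alongside the hidden state
        return self.readout(h)

# Univariate example: a depth-3 IC-MLP mapping a closed interval of R to R.
model = ICMLP(input_dim=1, hidden_dim=16, depth=3, out_dim=1)
y = model(torch.linspace(-1.0, 1.0, 32).unsqueeze(-1))
```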
Related papers
- Dense Neural Networks are not Universal Approximators [53.27010448621372]
We show that dense neural networks are not universal approximators of arbitrary continuous functions. We consider ReLU neural networks subject to natural constraints on the weights and on the input and output dimensions.
arXiv Detail & Related papers (2026-02-07T16:52:38Z)
- Neural Network Approximation: A View from Polytope Decomposition [36.24385738607139]
We develop an explicit kernel method to derive a universal approximation of continuous functions. A ReLU network is constructed to approximate the kernel in each subdomain separately. We extend our approach to analytic functions to reach a higher approximation rate.
arXiv Detail & Related papers (2026-01-26T08:39:11Z)
- Universal approximation results for neural networks with non-polynomial activation function over non-compact domains [3.3379026542599934]
We extend the universal approximation property of single-hidden-layer feedforward neural networks beyond compact domains. By assuming that the activation function is non-polynomial, we establish universal approximation results within function spaces defined over non-compact subsets of a Euclidean space.
arXiv Detail & Related papers (2024-10-18T09:53:20Z)
- Generalized Activation via Multivariate Projection [46.837481855573145]
Activation functions are essential to introduce nonlinearity into neural networks.
We consider ReLU as a projection from $\mathbb{R}$ onto the nonnegative half-line $\mathbb{R}_+$.
We extend ReLU by substituting it with a generalized projection operator onto a convex cone, such as the Second-Order Cone (SOC) projection.
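For concreteness, the projection onto the second-order cone mentioned in this entry has a well-known closed form, sketched below. How pre-activations are grouped into cones and how the extra scalar coordinate is produced are assumptions made for this illustration, not details taken from the paper.

```python
import numpy as np

def soc_projection(x, t):
    """Euclidean projection of the point (x, t) onto the second-order cone
    K = {(x, t) : ||x||_2 <= t}; ReLU is the scalar analogue, i.e. the
    projection of a real number onto the nonnegative half-line."""
    norm = np.linalg.norm(x)
    if norm <= t:                    # already inside the cone
        return x.copy(), float(t)
    if norm <= -t:                   # inside the polar cone: project to the origin
        return np.zeros_like(x), 0.0
    alpha = (norm + t) / 2.0         # otherwise project onto the cone's boundary
    return alpha * x / norm, float(alpha)

# Hypothetical use as a multivariate activation: project a group of
# pre-activations z together with a scalar t, instead of applying ReLU
# coordinate-wise. The grouping and the choice of t are assumptions here.
z = np.array([1.5, -2.0])
print(soc_projection(z, 1.0))   # -> (array([ 1.05, -1.4 ]), 1.75)
```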
arXiv Detail & Related papers (2023-09-29T12:44:27Z)
- Data Topology-Dependent Upper Bounds of Neural Network Widths [52.58441144171022]
We first show that a three-layer neural network can be designed to approximate an indicator function over a compact set.
This is then extended to a simplicial complex, deriving width upper bounds based on its topological structure.
We prove the universal approximation property of three-layer ReLU networks using our topological approach.
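As a toy illustration of the first claim in this entry, a small ReLU network can approximate the indicator of an axis-aligned box: one hidden layer builds a trapezoidal bump per coordinate and a second hidden layer combines them. The box, the slope parameter, and the NumPy phrasing below are illustrative assumptions; the paper's construction covers general compact sets and simplicial complexes.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def coord_bump(x, a, b, k=50.0):
    """First hidden layer (4 ReLU units per coordinate): a trapezoid that is
    ~1 on [a, b] and ~0 outside it, with transition width 1/k."""
    return (relu(k * (x - a)) - relu(k * (x - a) - 1.0)
            - relu(k * (x - b)) + relu(k * (x - b) - 1.0))

def box_indicator(x1, x2, box=((0.2, 0.6), (0.3, 0.7)), k=50.0):
    """Second hidden layer (1 ReLU unit): ~1 only when both coordinate bumps
    are ~1, i.e. when (x1, x2) lies inside the box."""
    b1 = coord_bump(x1, box[0][0], box[0][1], k)
    b2 = coord_bump(x2, box[1][0], box[1][1], k)
    return relu(b1 + b2 - 1.0)

print(box_indicator(0.4, 0.5))   # ~1.0 (inside the box)
print(box_indicator(0.9, 0.5))   # ~0.0 (outside the box)
```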
arXiv Detail & Related papers (2023-05-25T14:17:15Z)
- Deep neural network approximation of composite functions without the curse of dimensionality [0.0]
In this article we identify a class of high-dimensional continuous functions that can be approximated by deep neural networks (DNNs).
The functions in our class can be expressed as potentially unbounded compositions of special functions, which include products, maxima, and certain parallelized Lipschitz continuous functions.
arXiv Detail & Related papers (2023-04-12T12:08:59Z)
- A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks [49.870593940818715]
We study the infinite-width limit of a type of three-layer NN model whose first layer is random and fixed.
Our theory accommodates different scaling choices of the model, resulting in two regimes of the MF limit that demonstrate distinctive behaviors.
arXiv Detail & Related papers (2022-10-28T17:26:27Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
- Representation Theorem for Matrix Product States [1.7894377200944511]
We investigate the universal representation capacity of Matrix Product States (MPS) from the perspective of Boolean functions and continuous functions.
We show that MPS can accurately realize arbitrary Boolean functions by providing a construction of the corresponding MPS structure for an arbitrarily given gate.
We study the relation between MPS and neural networks and show that the MPS with a scale-invariant sigmoidal function is equivalent to a one-hidden-layer neural network.
arXiv Detail & Related papers (2021-03-15T11:06:54Z)
- Quantitative Rates and Fundamental Obstructions to Non-Euclidean Universal Approximation with Deep Narrow Feed-Forward Networks [3.8073142980733]
We quantify the number of narrow layers required for "deep geometric feed-forward neural networks".
We find that both the global and local universal approximation guarantees can only coincide when approximating null-homotopic functions.
arXiv Detail & Related papers (2021-01-13T23:29:40Z)
- A Functional Perspective on Learning Symmetric Functions with Neural Networks [48.80300074254758]
We study the learning and representation of neural networks defined on measures.
We establish approximation and generalization bounds under different choices of regularization.
The resulting models can be learned efficiently and enjoy generalization guarantees that extend across input sizes.
arXiv Detail & Related papers (2020-08-16T16:34:33Z)
- Universal Approximation Power of Deep Residual Neural Networks via Nonlinear Control Theory [9.210074587720172]
We explain the universal approximation capabilities of deep residual neural networks through geometric nonlinear control.
Inspired by recent work establishing links between residual networks and control systems, we provide a general sufficient condition for a residual network to have the power of universal approximation.
arXiv Detail & Related papers (2020-07-12T14:53:30Z)
- Minimum Width for Universal Approximation [91.02689252671291]
We prove that the minimum width required for the universal approximation of $L^p$ functions is exactly $\max\{d_x+1, d_y\}$.
We also prove that the same conclusion does not hold for the uniform approximation with ReLU, but does hold with an additional threshold activation function.
arXiv Detail & Related papers (2020-06-16T01:24:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.