Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed
Number of Neurons
- URL: http://arxiv.org/abs/2107.02397v2
- Date: Wed, 7 Jul 2021 18:21:55 GMT
- Title: Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed
Number of Neurons
- Authors: Zuowei Shen and Haizhao Yang and Shijun Zhang
- Abstract summary: We develop feed-forward neural networks that achieve the universal approximation property for all continuous functions with a fixed finite number of neurons.
We prove that $\sigma$-activated networks with width $36d(2d+1)$ and depth $11$ can approximate any continuous function on a $d$-dimensional hypercube within an arbitrarily small error.
- Score: 5.37133760455631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper develops simple feed-forward neural networks that achieve the
universal approximation property for all continuous functions with a fixed
finite number of neurons. These neural networks are simple because they are
designed with a simple and computable continuous activation function $\sigma$
leveraging a triangular-wave function and a softsign function. We prove that
$\sigma$-activated networks with width $36d(2d+1)$ and depth $11$ can
approximate any continuous function on a $d$-dimensional hypercube within an
arbitrarily small error. Hence, for supervised learning and its related
regression problems, the hypothesis space generated by these networks with a
size not smaller than $36d(2d+1)\times 11$ is dense in the space of continuous
functions. Furthermore, classification functions arising from image and signal
classification are in the hypothesis space generated by $\sigma$-activated
networks with width $36d(2d+1)$ and depth $12$, when there exist pairwise
disjoint closed bounded subsets of $\mathbb{R}^d$ such that the samples of the
same class are located in the same subset.
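For intuition, here is a minimal NumPy sketch of such a fixed-size architecture. The piecewise form of the activation (a triangular wave pasted together with a softsign), the wave's period, and all helper names are illustrative assumptions made for this sketch only; the width $36d(2d+1)$ and the depths $11$ and $12$ are the quantities taken from the abstract.

```python
import numpy as np

def triangular_wave(x, period=2.0):
    """Periodic "hat" (triangular-wave) function oscillating between 0 and 1."""
    t = np.asarray(x, dtype=float) / period
    return 2.0 * np.abs(t - np.floor(t + 0.5))

def softsign(x):
    """Softsign x / (1 + |x|): a bounded, monotone squashing function."""
    x = np.asarray(x, dtype=float)
    return x / (1.0 + np.abs(x))

def sigma(x):
    """Illustrative activation pasting a triangular wave (x >= 0) with
    softsign (x < 0). This is an assumed form for demonstration only;
    the paper gives the precise definition of sigma."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0.0, triangular_wave(x), softsign(x))

def fixed_architecture(d, classification=False):
    """Width and depth prescribed by the theorems quoted in the abstract:
    width 36*d*(2*d + 1); depth 11 for approximating continuous functions
    on the d-dimensional hypercube, depth 12 in the classification setting."""
    width = 36 * d * (2 * d + 1)
    depth = 12 if classification else 11
    return width, depth

if __name__ == "__main__":
    # The prescribed size grows only quadratically in the input dimension d.
    for d in (1, 2, 10, 100):
        width, depth = fixed_architecture(d)
        print(f"d = {d:>3}: width = {width:>7}, depth = {depth}, "
              f"size = {width * depth}")
```

The point of the sketch is the bookkeeping: the architecture is fixed, so the same width and depth serve every target accuracy, which is the sense in which the hypothesis space of size $36d(2d+1)\times 11$ is dense in the space of continuous functions.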
Related papers
- Deep Neural Networks: Multi-Classification and Universal Approximation [0.0]
We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements drawn from $M$ classes.
We also provide depth estimates for approximating $W^{1,p}$ functions and width estimates for approximating $L^p(\Omega;\mathbb{R}^m)$ for $m \geq 1$.
arXiv Detail & Related papers (2024-09-10T14:31:21Z) - Optimal Neural Network Approximation for High-Dimensional Continuous Functions [5.748690310135373]
We present a family of continuous functions that requires at least width $d$, and therefore at least $d$ intrinsic neurons, to achieve arbitrary accuracy in its approximation.
This shows that the requirement of $\mathcal{O}(d)$ intrinsic neurons is optimal in the sense that it grows linearly with the input dimension $d$.
arXiv Detail & Related papers (2024-09-04T01:18:55Z) - Bayesian Inference with Deep Weakly Nonlinear Networks [57.95116787699412]
We show at a physics level of rigor that Bayesian inference with a fully connected neural network is solvable.
We provide techniques to compute the model evidence and posterior to arbitrary order in $1/N$ and at arbitrary temperature.
arXiv Detail & Related papers (2024-05-26T17:08:04Z) - Learning Hierarchical Polynomials with Three-Layer Neural Networks [56.71223169861528]
We study the problem of learning hierarchical functions of the form $h = g \circ p$, where $p$ is a degree-$k$ polynomial, over the standard Gaussian distribution with three-layer neural networks.
For a large subclass of degree-$k$ polynomials $p$, a three-layer neural network trained via layerwise gradient descent on the square loss learns the target $h$ up to vanishing test error.
This work demonstrates the ability of three-layer neural networks to learn complex features and as a result, learn a broad class of hierarchical functions.
arXiv Detail & Related papers (2023-11-23T02:19:32Z) - Neural Network Approximation of Continuous Functions in High Dimensions
with Applications to Inverse Problems [6.84380898679299]
Current theory predicts that networks should scale exponentially in the dimension of the problem.
We provide a general method for bounding the complexity required for a neural network to approximate a Hölder (or uniformly) continuous function.
arXiv Detail & Related papers (2022-08-28T22:44:07Z) - Shallow neural network representation of polynomials [91.3755431537592]
We show that $d$-variate polynomials of degree $R$ can be represented on $[0,1]^d$ as shallow neural networks of width $d+1+\sum_{r=2}^{R}\binom{r+d-1}{d-1}\left[\binom{r+d-1}{d-1}+1\right]$.
arXiv Detail & Related papers (2022-08-17T08:14:52Z) - Neural networks with superexpressive activations and integer weights [91.3755431537592]
An example of an activation function $\sigma$ is given such that networks with activations $\{\sigma, \lfloor\cdot\rfloor\}$, integer weights, and a fixed architecture can approximate continuous functions.
The range of integer weights required for $\varepsilon$-approximation of Hölder continuous functions is derived.
arXiv Detail & Related papers (2021-05-20T17:29:08Z) - Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z) - Function approximation by deep neural networks with parameters $\{0,\pm
\frac{1}{2}, \pm 1, 2\}$ [91.3755431537592]
It is shown that $C_\beta$-smooth functions can be approximated by neural networks with parameters $\{0, \pm\tfrac{1}{2}, \pm 1, 2\}$.
The depth, width and the number of active parameters of the constructed networks have, up to a logarithmic factor, the same dependence on the approximation error as the networks with parameters in $[-1,1]$.
arXiv Detail & Related papers (2021-03-15T19:10:02Z) - Nonclosedness of Sets of Neural Networks in Sobolev Spaces [0.0]
We show that realized neural networks are not closed in order-$(m-1)$ Sobolev spaces $W^{m-1,p}$ for $p \in [1,\infty]$.
For a real analytic activation function, we show that sets of realized neural networks are not closed in $W^{k,p}$ for any $k \in \mathbb{N}$.
arXiv Detail & Related papers (2020-07-23T00:57:25Z) - A Corrective View of Neural Networks: Representation, Memorization and
Learning [26.87238691716307]
We develop a corrective mechanism for neural network approximation.
We show that two-layer neural networks in the random features regime (RF) can memorize arbitrary labels.
We also consider three-layer neural networks and show that the corrective mechanism yields faster representation rates for smooth radial functions.
arXiv Detail & Related papers (2020-02-01T20:51:09Z)