Deep neural network approximation of analytic functions
- URL: http://arxiv.org/abs/2104.02095v1
- Date: Mon, 5 Apr 2021 18:02:04 GMT
- Title: Deep neural network approximation of analytic functions
- Authors: Aleksandr Beknazaryan
- Abstract summary: We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
- Score: 91.3755431537592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We provide an entropy bound for the spaces of neural networks with piecewise
linear activation functions, such as the ReLU and the absolute value functions.
This bound generalizes the known entropy bound for the space of linear
functions on $\mathbb{R}^d$ and it depends on the value at the point
$(1,1,...,1)$ of the networks obtained by taking the absolute values of all
parameters of original networks. Keeping this value together with the depth,
width and the parameters of the networks to have logarithmic dependence on
$1/\varepsilon$, we $\varepsilon$-approximate functions that are analytic on
certain regions of $\mathbb{C}^d$. As a statistical application we derive an
oracle inequality for the expected error of the considered penalized deep
neural network estimators.
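
The quantity the entropy bound depends on (the value at $(1,1,...,1)$ of the network obtained by taking absolute values of all parameters) can be illustrated with a small sketch. This is not the paper's code: it assumes a fully connected ReLU network given as a list of (weight matrix, bias vector) pairs with an affine last layer, and the names `forward`, `abs_network_value_at_ones` and the example `layers` are hypothetical. The paper's exact parametrization may differ.

```python
def relu(v):
    # Componentwise ReLU activation.
    return [max(0.0, x) for x in v]


def affine(W, b, x):
    # Compute W x + b for a matrix W given as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def forward(layers, x):
    # Evaluate a ReLU network; the last layer is affine with no activation
    # (an assumption of this sketch).
    for W, b in layers[:-1]:
        x = relu(affine(W, b, x))
    W, b = layers[-1]
    return affine(W, b, x)


def abs_network_value_at_ones(layers):
    # Replace every parameter by its absolute value, then evaluate the
    # resulting network at the all-ones input (1, 1, ..., 1).
    abs_layers = [
        ([[abs(w) for w in row] for row in W], [abs(bi) for bi in b])
        for W, b in layers
    ]
    input_dim = len(layers[0][0][0])
    return forward(abs_layers, [1.0] * input_dim)[0]


# Hypothetical two-layer network on R^2 with a scalar output.
layers = [
    ([[1.0, -2.0], [-0.5, 0.5]], [0.0, -1.0]),
    ([[2.0, -1.0]], [0.5]),
]

value = abs_network_value_at_ones(layers)
print(value)  # 8.5
```

Because ReLU satisfies $|\mathrm{ReLU}(t)| \le |t|$, this value also bounds $|f(x)|$ for every input $x$ with entries in $[-1,1]$, which gives some intuition for why it can play the role of a norm of the network in an entropy bound.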
Related papers
- A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparameterized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - Shallow neural network representation of polynomials [91.3755431537592]
We show that $d$-variate polynomials of degree $R$ can be represented on $[0,1]^d$ as shallow neural networks of width $d+1+\sum_{r=2}^{R}\binom{r+d-1}{d-1}[\binom{r+d-1}{d-1}\ldots]$.
arXiv Detail & Related papers (2022-08-17T08:14:52Z) - Scalable Lipschitz Residual Networks with Convex Potential Flows [120.27516256281359]
We show that using convex potentials in a residual network gradient flow provides a built-in $1$-Lipschitz transformation.
A comprehensive set of experiments on CIFAR-10 demonstrates the scalability of our architecture and the benefit of our approach for $\ell_2$ provable defenses.
arXiv Detail & Related papers (2021-10-25T07:12:53Z) - Theory of Deep Convolutional Neural Networks III: Approximating Radial
Functions [7.943024117353317]
We consider a family of deep neural networks consisting of two groups of convolutional layers, a down operator, and a fully connected layer.
The network structure depends on two structural parameters which determine the numbers of convolutional layers and the width of the fully connected layer.
arXiv Detail & Related papers (2021-07-02T08:22:12Z) - Neural networks with superexpressive activations and integer weights [91.3755431537592]
An example of an activation function $\sigma$ is given such that networks with activations $\{\sigma, \lfloor\cdot\rfloor\}$, integer weights and a fixed architecture depending on $d$ approximate continuous functions on $[0,1]^d$.
The range of integer weights required for $\varepsilon$-approximation of Hölder continuous functions is derived.
arXiv Detail & Related papers (2021-05-20T17:29:08Z) - Function approximation by deep neural networks with parameters $\{0,\pm
\frac{1}{2}, \pm 1, 2\}$ [91.3755431537592]
It is shown that $C_\beta$-smooth functions can be approximated by neural networks with parameters $\{0,\pm \frac{1}{2}, \pm 1, 2\}$.
The depth, width and the number of active parameters of the constructed networks have, up to a logarithmic factor, the same dependence on the approximation error as the networks with parameters in $[-1,1]$.
arXiv Detail & Related papers (2021-03-15T19:10:02Z) - Sample Complexity and Overparameterization Bounds for Projection-Free
Neural TD Learning [38.730333068555275]
Existing analysis of neural TD learning relies on either infinite-width analysis or constraining the network parameters in a (random) compact set.
We show that projection-free TD learning equipped with a two-layer ReLU network of any width exceeding $\mathrm{poly}(\overline{\nu},1/\epsilon)$ converges to the true value function with error $\epsilon$ given $\mathrm{poly}(\overline{\nu},1/\epsilon)$ iterations or samples.
arXiv Detail & Related papers (2021-03-02T01:05:19Z) - Theory of Deep Convolutional Neural Networks II: Spherical Analysis [9.099589602551573]
We consider a family of deep convolutional neural networks applied to approximate functions on the unit sphere $\mathbb{S}^{d-1}$ of $\mathbb{R}^d$.
Our analysis presents rates of uniform approximation when the approximated function lies in the Sobolev space $W^r_\infty(\mathbb{S}^{d-1})$ with $r>0$ or takes an additive ridge form.
arXiv Detail & Related papers (2020-07-28T14:54:30Z) - Nonclosedness of Sets of Neural Networks in Sobolev Spaces [0.0]
We show that realized neural networks are not closed in order-$(m-1)$ Sobolev spaces $W^{m-1,p}$ for $p \in [1,\infty]$.
For a real analytic activation function, we show that sets of realized neural networks are not closed in $W^{k,p}$ for any $k \in \mathbb{N}$.
arXiv Detail & Related papers (2020-07-23T00:57:25Z)