A general approximation lower bound in $L^p$ norm, with applications to
feed-forward neural networks
- URL: http://arxiv.org/abs/2206.04360v1
- Date: Thu, 9 Jun 2022 09:01:05 GMT
- Title: A general approximation lower bound in $L^p$ norm, with applications to
feed-forward neural networks
- Authors: El Mehdi Achour (IMT), Armand Foucault (IMT), Sébastien Gerchinovitz
(IMT), François Malgouyres (IMT)
- Abstract summary: We study the fundamental limits to the expressive power of neural networks.
We first prove a general lower bound on how well functions in $F$ can be approximated in $L^p(\mu)$ norm.
We then instantiate this bound to the case where $G$ corresponds to a piecewise-polynomial feed-forward neural network.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the fundamental limits to the expressive power of neural networks.
Given two sets $F$, $G$ of real-valued functions, we first prove a general
lower bound on how well functions in $F$ can be approximated in $L^p(\mu)$ norm
by functions in $G$, for any $p \geq 1$ and any probability measure $\mu$. The
lower bound depends on the packing number of $F$, the range of $F$, and the
fat-shattering dimension of $G$. We then instantiate this bound to the case
where $G$ corresponds to a piecewise-polynomial feed-forward neural network,
and describe in detail the application to two sets $F$: Hölder balls and
multivariate monotonic functions. Besides matching (known or new) upper bounds
up to log factors, our lower bounds shed light on the similarities and
differences between approximation in $L^p$ norm and in sup norm, solving an open
question by DeVore et al. (2021). Our proof strategy differs from the sup norm
case and uses a key probability result of Mendelson (2002).
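For intuition, the abstract suggests a trade-off of roughly the following schematic shape; this is a reading of the abstract, not the paper's exact theorem, and the constants $c$, $c'$, $C$ and the log factor are illustrative assumptions:
$$\sup_{f \in F} \inf_{g \in G} \|f - g\|_{L^p(\mu)} \geq c\,\varepsilon \quad \text{whenever} \quad \log \mathcal{M}(F, \varepsilon) \geq C \cdot \mathrm{fat}_{c'\varepsilon}(G) \cdot \log\frac{B}{\varepsilon},$$
where $\mathcal{M}(F, \varepsilon)$ is the $\varepsilon$-packing number of $F$ in $L^p(\mu)$, $B$ bounds the range of $F$, and $\mathrm{fat}_{\gamma}(G)$ is the fat-shattering dimension of $G$ at scale $\gamma$: the largest $n$ for which there exist points $x_1, \dots, x_n$ and levels $s_1, \dots, s_n$ such that, for every sign pattern $\sigma \in \{-1, +1\}^n$, some $g \in G$ satisfies $\sigma_i (g(x_i) - s_i) \geq \gamma$ for all $i$. In words: if $F$ contains exponentially many pairwise $\varepsilon$-separated functions while $G$ can only fat-shatter few points at scale proportional to $\varepsilon$, then some function in $F$ lies at distance of order $\varepsilon$ from every function in $G$.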
Related papers
- Approximation Rates for Shallow ReLU$^k$ Neural Networks on Sobolev Spaces via the Radon Transform [4.096453902709292]
We consider the problem of how efficiently shallow neural networks with the ReLU$^k$ activation function can approximate functions from Sobolev spaces.
We provide a simple proof of nearly optimal approximation rates in a variety of cases, including when $q \leq p$, $p \geq 2$, and $s \leq k + (d+1)/2$.
arXiv Detail & Related papers (2024-08-20T16:43:45Z) - Interplay between depth and width for interpolation in neural ODEs [0.0]
We examine the interplay between the width $p$ of neural ODEs and their number of layer transitions $L$ (i.e., their depth).
In the high-dimensional setting, we demonstrate that $p=O(N)$ neurons are likely sufficient to achieve exact control.
arXiv Detail & Related papers (2024-01-18T11:32:50Z) - Achieve the Minimum Width of Neural Networks for Universal Approximation [1.52292571922932]
We study the exact minimum width, $w_{\min}$, for the universal approximation property (UAP) of neural networks.
In particular, the critical width, $w^*_{\min}$, for $L^p$-UAP can be achieved by leaky-ReLU networks.
arXiv Detail & Related papers (2022-09-23T04:03:50Z) - Shallow neural network representation of polynomials [91.3755431537592]
We show that $d$-variate polynomials of degree $R$ can be represented on $[0,1]^d$ as shallow neural networks of width $d+1+\sum_{r=2}^{R}\binom{r+d-1}{d-1}$.
arXiv Detail & Related papers (2022-08-17T08:14:52Z) - Finite-Sample Maximum Likelihood Estimation of Location [16.44999338864628]
For fixed $f$ the maximum-likelihood estimate (MLE) is well-known to be optimal in the limit as $n \to \infty$.
We show for arbitrary $f$ and $n$ that one can recover a similar theory based on the Fisher information of a smoothed version of $f$, where the smoothing radius decays with $n$.
arXiv Detail & Related papers (2022-06-06T04:33:41Z) - A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a lower bound of $\Omega(\sqrt{n/p})$ on the Lipschitz constant of interpolating neural networks, valid for arbitrary distributions.
We then show the potential benefit of overparametrization for smooth data when $n = \mathrm{poly}(d)$.
We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n = \exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound of $O(d^{\max\{1, p/2\}} \log^2 n)$ for loss functions with at most degree-$p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Improved Sample Complexity for Incremental Autonomous Exploration in
MDPs [132.88757893161699]
We learn the set of $\epsilon$-optimal goal-conditioned policies attaining all states that are incrementally reachable within $L$ steps.
DisCo is the first algorithm that can return an $\epsilon/c_{\min}$-optimal policy for any cost-sensitive shortest-path problem.
arXiv Detail & Related papers (2020-12-29T14:06:09Z) - On the Modularity of Hypernetworks [103.1147622394852]
We show that for a structured target function, the overall number of trainable parameters in a hypernetwork is smaller by orders of magnitude than the number of trainable parameters of a standard neural network and an embedding method.
arXiv Detail & Related papers (2020-02-23T22:51:52Z) - A closer look at the approximation capabilities of neural networks [6.09170287691728]
A feedforward neural network with one hidden layer is able to approximate any continuous function $f$ to within any given approximation threshold $\varepsilon$.
We show that this uniform approximation property still holds even under seemingly strong conditions imposed on the weights.
arXiv Detail & Related papers (2020-02-16T04:58:43Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.