ReLU Network Approximation in Terms of Intrinsic Parameters
- URL: http://arxiv.org/abs/2111.07964v1
- Date: Mon, 15 Nov 2021 18:20:38 GMT
- Title: ReLU Network Approximation in Terms of Intrinsic Parameters
- Authors: Zuowei Shen and Haizhao Yang and Shijun Zhang
- Abstract summary: We study the approximation error of ReLU networks in terms of the number of intrinsic parameters.
We design a ReLU network with only three intrinsic parameters to approximate H\"older continuous functions with an arbitrarily small error.
- Score: 5.37133760455631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the approximation error of ReLU networks in terms of the
number of intrinsic parameters (i.e., those depending on the target function
$f$). First, we prove by construction that, for any Lipschitz continuous
function $f$ on $[0,1]^d$ with a Lipschitz constant $\lambda>0$, a ReLU network
with $n+2$ intrinsic parameters can approximate $f$ with an exponentially small
error $5\lambda \sqrt{d}\,2^{-n}$ measured in the $L^p$-norm for $p\in
[1,\infty)$. More generally for an arbitrary continuous function $f$ on
$[0,1]^d$ with a modulus of continuity $\omega_f(\cdot)$, the approximation
error is $\omega_f(\sqrt{d}\, 2^{-n})+2^{-n+2}\omega_f(\sqrt{d})$. Next, we
extend these two results from the $L^p$-norm to the $L^\infty$-norm at a price
of $3^d n+2$ intrinsic parameters. Finally, by using a high-precision binary
representation and the bit extraction technique via a fixed ReLU network
independent of the target function, we design, theoretically, a ReLU network
with only three intrinsic parameters to approximate H\"older continuous
functions with an arbitrarily small error.
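As a rough illustration of these results, the sketch below (ours, not the paper's construction) does two things: it checks numerically that sampling a $1$-Lipschitz function at the lower corners of a dyadic grid of side length $2^{-n}$ already gives an error within the quoted bound $5\lambda\sqrt{d}\,2^{-n}$, and it mimics the information-compression idea behind the three-parameter result by packing quantized samples of $f$ into a single binary number and recovering them by bit extraction, using plain integer operations in place of the paper's fixed ReLU extraction network. The helper names (`grid_error`, `pack_bits`, `extract`) are hypothetical.

```python
# Minimal numerical sketch, NOT the paper's ReLU construction.
# (1) Dyadic-grid sampling of a Lipschitz f gives an error of order lambda*sqrt(d)*2^{-n}.
# (2) All target-dependent samples can be packed into one binary number and read back
#     by bit extraction, so few "intrinsic" numbers need to carry information about f.
import numpy as np


def grid_error(f, d, n, lam, num_test=20_000, seed=0):
    """Empirical error of replacing f(x) by f at the lower corner of x's dyadic cell."""
    rng = np.random.default_rng(seed)
    x = rng.random((num_test, d))
    corners = np.floor(x * 2 ** n) / 2 ** n          # lower corner of each cell
    err = np.max(np.abs(f(x) - f(corners)))          # empirical sup error on test points
    bound = 5 * lam * np.sqrt(d) * 2 ** (-n)         # bound quoted in the abstract
    return err, bound


def pack_bits(values, bits=16):
    """Pack quantized samples in [0,1] into one big integer (one target-dependent number)."""
    quantized = [int(round(v * (2 ** bits - 1))) for v in values]
    theta = 0
    for q in quantized:
        theta = (theta << bits) | q                  # concatenate the bit blocks
    return theta


def extract(theta, index, count, bits=16):
    """Recover the index-th of `count` packed samples by bit extraction."""
    shift = bits * (count - 1 - index)
    return ((theta >> shift) & (2 ** bits - 1)) / (2 ** bits - 1)


if __name__ == "__main__":
    d, lam = 3, 1.0
    f = lambda x: np.linalg.norm(x - 0.5, axis=-1)   # 1-Lipschitz in the Euclidean norm
    for n in range(2, 7):
        err, bound = grid_error(f, d, n, lam)
        print(f"n={n}: empirical error {err:.4f} <= 5*lam*sqrt(d)*2^-n = {bound:.4f}")

    samples = [0.125, 0.75, 0.3]                     # toy grid samples of some target f
    theta = pack_bits(samples)
    recovered = [extract(theta, i, len(samples)) for i in range(len(samples))]
    print("packed integer:", theta)
    print("recovered samples:", [round(r, 4) for r in recovered])
```

Running the loop for $n=2,\dots,6$ should show the empirical error roughly halving with each increment of $n$, consistent with the exponential rate $5\lambda\sqrt{d}\,2^{-n}$ in the first theorem.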
Related papers
- Fast $(1+\varepsilon)$-Approximation Algorithms for Binary Matrix
Factorization [54.29685789885059]
We introduce efficient $(1+\varepsilon)$-approximation algorithms for the binary matrix factorization (BMF) problem.
The goal is to approximate $\mathbf{A}$ as a product of low-rank factors.
Our techniques generalize to other common variants of the BMF problem.
arXiv Detail & Related papers (2023-06-02T18:55:27Z) - The Approximate Degree of DNF and CNF Formulas [95.94432031144716]
For every $\delta>0$, we construct CNF and DNF formulas of polynomial size with approximate degree $\Omega(n^{1-\delta})$, essentially matching the trivial upper bound of $n$.
We show that for every $\delta>0$, these models require $\Omega(n^{1-\delta})$, $\Omega\big((n/4^{k}k^{2})^{1-\delta}\big)$, and $\Omega\big((n/4^{k}k^{2})^{1-\delta}\big)$, respectively.
arXiv Detail & Related papers (2022-09-04T10:01:39Z) - Expressive power of binary and ternary neural networks [91.3755431537592]
We show that deep sparse ReLU networks with ternary weights and deep ReLU networks with binary weights can approximate $\beta$-H\"older functions on $[0,1]^d$.
arXiv Detail & Related papers (2022-06-27T13:16:08Z) - On minimal representations of shallow ReLU networks [0.0]
We show that the minimal representation for $f$ uses either $n$, $n+1$ or $n+2$ neurons.
In particular, when the input layer is one-dimensional, minimal representations always use at most $n+1$ neurons, but in all higher-dimensional settings there are functions for which $n+2$ neurons are needed.
arXiv Detail & Related papers (2021-08-12T10:22:24Z) - Neural networks with superexpressive activations and integer weights [91.3755431537592]
An example of an activation function $\sigma$ is given such that networks with activations $\{\sigma, \lfloor\cdot\rfloor\}$, integer weights, and a fixed architecture can approximate continuous functions on $[0,1]^d$.
The range of integer weights required for $\varepsilon$-approximation of H\"older continuous functions is derived.
arXiv Detail & Related papers (2021-05-20T17:29:08Z) - Deep Neural Networks with ReLU-Sine-Exponential Activations Break Curse
of Dimensionality on H\"older Class [6.476766717110237]
We construct neural networks with ReLU, sine and $2^x$ as activation functions.
In addition to their super expressive power, functions implemented by ReLU-sine-$2^x$ networks are (generalized) differentiable.
arXiv Detail & Related papers (2021-02-28T15:57:42Z) - Optimal Approximation Rate of ReLU Networks in terms of Width and Depth [5.37133760455631]
This paper concentrates on the approximation power of deep feed-forward neural networks in terms of width and depth.
It is proved that ReLU networks with width $\mathcal{O}\big(\max\{d\lfloor N^{1/d}\rfloor,\, N+2\}\big)$ and depth $\mathcal{O}(L)$ can approximate a H\"older continuous function on $[0,1]^d$ with an approximation rate $\mathcal{O}\big(\lambda\sqrt{d}\,(N^2L^2\ln N)^{-\alpha/d}\big)$, where $\alpha\in(0,1]$ and $\lambda>0$ are the H\"older order and constant, respectively.
arXiv Detail & Related papers (2021-02-28T13:15:55Z) - Neural Network Approximation: Three Hidden Layers Are Enough [4.468952886990851]
A three-hidden-layer neural network with super approximation power is introduced.
The network is built with the floor function ($\lfloor x\rfloor$), the exponential function ($2^x$), the step function ($1_{x\geq 0}$), or their compositions as the activation function in each neuron.
arXiv Detail & Related papers (2020-10-25T18:30:57Z) - Deep Network with Approximation Error Being Reciprocal of Width to Power
of Square Root of Depth [4.468952886990851]
A new network with super approximation power is introduced.
This network is built with the floor ($\lfloor x\rfloor$) or ReLU ($\max\{0,x\}$) activation function in each neuron.
arXiv Detail & Related papers (2020-06-22T13:27:33Z) - On the Modularity of Hypernetworks [103.1147622394852]
We show that for a structured target function, the overall number of trainable parameters in a hypernetwork is smaller by orders of magnitude than the number of trainable parameters of a standard neural network and an embedding method.
arXiv Detail & Related papers (2020-02-23T22:51:52Z) - On the Complexity of Minimizing Convex Finite Sums Without Using the
Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$ upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with oracles and achieves complexity bounds of $\tilde{O}\big(n^2+n\sqrt{L/\mu}\big)\log(1/\epsilon)$ and $O(n\sqrt{L/\epsilon})$, for $\mu>0$ and $\mu=0$, respectively.
arXiv Detail & Related papers (2020-02-09T03:39:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.