Noncompact uniform universal approximation
- URL: http://arxiv.org/abs/2308.03812v2
- Date: Sun, 3 Mar 2024 11:15:35 GMT
- Title: Noncompact uniform universal approximation
- Authors: Teun D. H. van Nuland
- Abstract summary: The universal approximation theorem is generalised to uniform convergence on the (noncompact) input space $\mathbb{R}^n$.
All continuous functions that vanish at infinity can be uniformly approximated by neural networks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The universal approximation theorem is generalised to uniform convergence on
the (noncompact) input space $\mathbb{R}^n$. All continuous functions that
vanish at infinity can be uniformly approximated by neural networks with one
hidden layer, for all activation functions $\varphi$ that are continuous,
nonpolynomial, and asymptotically polynomial at $\pm\infty$. When $\varphi$ is
moreover bounded, we exactly determine which functions can be uniformly
approximated by neural networks, with the following unexpected results. Let
$\overline{\mathcal{N}_\varphi^l(\mathbb{R}^n)}$ denote the vector space of
functions that are uniformly approximable by neural networks with $l$ hidden
layers and $n$ inputs. For all $n$ and all $l\geq2$,
$\overline{\mathcal{N}_\varphi^l(\mathbb{R}^n)}$ turns out to be an algebra
under the pointwise product. If the left limit of $\varphi$ differs from its
right limit (for instance, when $\varphi$ is sigmoidal) the algebra
$\overline{\mathcal{N}_\varphi^l(\mathbb{R}^n)}$ ($l\geq2$) is independent of
$\varphi$ and $l$, and equals the closed span of products of sigmoids composed
with one-dimensional projections. If the left limit of $\varphi$ equals its
right limit, $\overline{\mathcal{N}_\varphi^l(\mathbb{R}^n)}$ ($l\geq1$) equals
the (real part of the) commutative resolvent algebra, a C*-algebra which is
used in mathematical approaches to quantum theory. In the latter case, the
algebra is independent of $l\geq1$, whereas in the former case
$\overline{\mathcal{N}_\varphi^2(\mathbb{R}^n)}$ is strictly bigger than
$\overline{\mathcal{N}_\varphi^1(\mathbb{R}^n)}$.
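To fix ideas, here is a minimal sketch of the objects the abstract refers to, using the standard one-hidden-layer parametrization (the paper's precise conventions, e.g. regarding output biases, may differ):
$$\mathcal{N}_\varphi^1(\mathbb{R}^n) \;=\; \Big\{\, x \mapsto \sum_{j=1}^{N} c_j\,\varphi(w_j\cdot x + b_j) \;:\; N\in\mathbb{N},\ c_j,b_j\in\mathbb{R},\ w_j\in\mathbb{R}^n \,\Big\}.$$
In this notation, the first result reads $C_0(\mathbb{R}^n)\subseteq\overline{\mathcal{N}_\varphi^1(\mathbb{R}^n)}$ for activations $\varphi$ as above, where $C_0(\mathbb{R}^n)$ is the space of continuous functions vanishing at infinity and the closure is taken in the supremum norm over all of $\mathbb{R}^n$, not merely over compact subsets.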
Related papers
- The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$.
As an application, we obtain an $\Omega(\frac{1}{k}\cdot n^2\log|\mathbb{F}|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - Provably learning a multi-head attention layer [55.2904547651831]
The multi-head attention layer is one of the key components that set the transformer architecture apart from traditional feed-forward models.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples.
We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z) - Dimension-free Remez Inequalities and norm designs [48.5897526636987]
A class of domains $X$ and test sets $Y$ -- termed norm designs -- enjoys dimension-free Remez-type estimates.
We show that the supremum of $f$ does not increase by more than $\mathcal{O}(\log K)^{2d}$ when $f$ is extended to the polytorus.
arXiv Detail & Related papers (2023-10-11T22:46:09Z) - The Approximate Degree of DNF and CNF Formulas [95.94432031144716]
For every $\delta>0$, we construct CNF and DNF formulas of polynomial size with approximate degree $\Omega(n^{1-\delta})$, essentially matching the trivial upper bound of $n$.
We show that for every $\delta>0$, these models require $\Omega(n^{1-\delta})$, $\Omega\left((n/4^{k}k^{2})^{1-\delta}\right)$, and $\Omega\left((n/4^{k}k^{2})^{1-\delta}\right)$, respectively.
arXiv Detail & Related papers (2022-09-04T10:01:39Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w})=C\,\epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - On Outer Bi-Lipschitz Extensions of Linear Johnson-Lindenstrauss
Embeddings of Low-Dimensional Submanifolds of $\mathbb{R}^N$ [0.24366811507669117]
Let $\mathcal{M}$ be a compact $d$-dimensional submanifold of $\mathbb{R}^N$ with reach $\tau$ and volume $V_{\mathcal{M}}$.
We prove that a nonlinear function $f: \mathbb{R}^N \rightarrow \mathbb{R}^m$ exists with $m \leq C\left(d/\epsilon^2\right)\log\left(\sqrt[d]{V_{\mathcal{M}}}/\cdots\right)$
arXiv Detail & Related papers (2022-06-07T15:10:46Z) - Deep Learning in High Dimension: Neural Network Approximation of
Analytic Functions in $L^2(\mathbb{R}^d,\gamma_d)$ [0.0]
We prove expression rates for analytic functions $f:\mathbb{R}^d\to\mathbb{R}$ in the norm of $L^2(\mathbb{R}^d,\gamma_d)$.
We consider in particular ReLU and ReLU$^k$ activations for integer $k\geq 2$.
As an application, we prove expression rate bounds of deep ReLU-NNs for response surfaces of elliptic PDEs with log-Gaussian random field inputs.
arXiv Detail & Related papers (2021-11-13T09:54:32Z) - Linear Bandits on Uniformly Convex Sets [88.3673525964507]
Linear bandit algorithms yield $\tilde{\mathcal{O}}(n\sqrt{T})$ pseudo-regret bounds on compact convex action sets.
Two types of structural assumptions lead to better pseudo-regret bounds.
arXiv Detail & Related papers (2021-03-10T07:33:03Z) - Algorithms and Hardness for Linear Algebra on Geometric Graphs [14.822517769254352]
We show that the exponential dependence on the dimension $d$ in the celebrated fast multipole method of Greengard and Rokhlin cannot be improved.
This is the first formal limitation proven about fast multipole methods.
arXiv Detail & Related papers (2020-11-04T18:35:02Z) - A Canonical Transform for Strengthening the Local $L^p$-Type Universal
Approximation Property [4.18804572788063]
$L^p$-type universal approximation theorems guarantee that a given machine learning model class $\mathscr{F}\subseteq C(\mathbb{R}^d,\mathbb{R}^D)$ is dense in $L^p_\mu(\mathbb{R}^d,\mathbb{R}^D)$.
This paper proposes a generic solution to this approximation-theoretic problem by introducing a canonical transformation which "upgrades $\mathscr{F}$'s approximation property".
arXiv Detail & Related papers (2020-06-24T17:46:35Z) - A closer look at the approximation capabilities of neural networks [6.09170287691728]
A feedforward neural network with one hidden layer is able to approximate any continuous function $f$ to any given approximation threshold $\varepsilon$.
We show that this uniform approximation property still holds even under seemingly strong conditions imposed on the weights.
arXiv Detail & Related papers (2020-02-16T04:58:43Z)