Implicit Hypersurface Approximation Capacity in Deep ReLU Networks
- URL: http://arxiv.org/abs/2407.03851v1
- Date: Thu, 4 Jul 2024 11:34:42 GMT
- Title: Implicit Hypersurface Approximation Capacity in Deep ReLU Networks
- Authors: Jonatan Vallin, Karl Larsson, Mats G. Larson,
- Abstract summary: We develop a geometric approximation theory for deep feed-forward neural networks with ReLU activations.
We show that a deep fully-connected ReLU network of width $d+1$ can implicitly construct an approximation as its zero contour.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a geometric approximation theory for deep feed-forward neural networks with ReLU activations. Given a $d$-dimensional hypersurface in $\mathbb{R}^{d+1}$ represented as the graph of a $C^2$-function $\phi$, we show that a deep fully-connected ReLU network of width $d+1$ can implicitly construct an approximation as its zero contour with a precision bound depending on the number of layers. This result is directly applicable to the binary classification setting where the sign of the network is trained as a classifier, with the network's zero contour as a decision boundary. Our proof is constructive and relies on the geometrical structure of ReLU layers provided in [doi:10.48550/arXiv.2310.03482]. Inspired by this geometrical description, we define a new equivalent network architecture that is easier to interpret geometrically, where the action of each hidden layer is a projection onto a polyhedral cone derived from the layer's parameters. By repeatedly adding such layers, with parameters chosen such that we project small parts of the graph of $\phi$ from the outside in, we, in a controlled way, construct a network that implicitly approximates the graph over a ball of radius $R$. The accuracy of this construction is controlled by a discretization parameter $\delta$ and we show that the tolerance in the resulting error bound scales as $(d-1)R^{3/2}\delta^{1/2}$ and the required number of layers is of order $d\big(\frac{32R}{\delta}\big)^{\frac{d+1}{2}}$.
Related papers
- Deep Neural Networks: Multi-Classification and Universal Approximation [0.0]
We demonstrate that a ReLU deep neural network with a width of $2$ and a depth of $2N+4M-1$ layers can achieve finite sample memorization for any dataset comprising $N$ elements.
We also provide depth estimates for approximating $W1,p$ functions and width estimates for approximating $Lp(Omega;mathbbRm)$ for $mgeq1$.
arXiv Detail & Related papers (2024-09-10T14:31:21Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(boldsymbolx) = textstylesigma_*left(langleboldsymbolx,boldsymbolthetarangleright)$ under isotropic Gaussian data.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ of arbitrary link function with a sample and runtime complexity of $n asymp T asymp C(q) cdot d
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - Learning Hierarchical Polynomials with Three-Layer Neural Networks [56.71223169861528]
We study the problem of learning hierarchical functions over the standard Gaussian distribution with three-layer neural networks.
For a large subclass of degree $k$s $p$, a three-layer neural network trained via layerwise gradientp descent on the square loss learns the target $h$ up to vanishing test error.
This work demonstrates the ability of three-layer neural networks to learn complex features and as a result, learn a broad class of hierarchical functions.
arXiv Detail & Related papers (2023-11-23T02:19:32Z) - Differential Equation Scaling Limits of Shaped and Unshaped Neural Networks [8.716913598251386]
We find similar differential equation based characterization for two types of unshaped networks.
We derive the first order correction to the layerwise correlation.
These results together provide a connection between shaped and unshaped network architectures.
arXiv Detail & Related papers (2023-10-18T16:15:10Z) - Geometric structure of Deep Learning networks and construction of global ${\mathcal L}^2$ minimizers [1.189367612437469]
We explicitly determine local and global minimizers of the $mathcalL2$ cost function in underparametrized Deep Learning (DL) networks.
arXiv Detail & Related papers (2023-09-19T14:20:55Z) - Neural Network Architecture Beyond Width and Depth [4.468952886990851]
This paper proposes a new neural network architecture by introducing an additional dimension called height beyond width and depth.
It is shown that neural networks with three-dimensional architectures are significantly more expressive than the ones with two-dimensional architectures.
arXiv Detail & Related papers (2022-05-19T10:29:11Z) - Function approximation by deep neural networks with parameters $\{0,\pm
\frac{1}{2}, \pm 1, 2\}$ [91.3755431537592]
It is shown that $C_beta$-smooth functions can be approximated by neural networks with parameters $0,pm frac12, pm 1, 2$.
The depth, width and the number of active parameters of constructed networks have, up to a logarithimc factor, the same dependence on the approximation error as the networks with parameters in $[-1,1]$.
arXiv Detail & Related papers (2021-03-15T19:10:02Z) - Deep Learning Meets Projective Clustering [66.726500395069]
A common approach for compressing NLP networks is to encode the embedding layer as a matrix $AinmathbbRntimes d$.
Inspired by emphprojective clustering from computational geometry, we suggest replacing this subspace by a set of $k$ subspaces.
arXiv Detail & Related papers (2020-10-08T22:47:48Z) - A deep network construction that adapts to intrinsic dimensionality
beyond the domain [79.23797234241471]
We study the approximation of two-layer compositions $f(x) = g(phi(x))$ via deep networks with ReLU activation.
We focus on two intuitive and practically relevant choices for $phi$: the projection onto a low-dimensional embedded submanifold and a distance to a collection of low-dimensional sets.
arXiv Detail & Related papers (2020-08-06T09:50:29Z) - Sharp Representation Theorems for ReLU Networks with Precise Dependence
on Depth [26.87238691716307]
We prove sharp-free representation results for neural networks with $D$ ReLU layers under square loss.
Our results confirm the prevailing hypothesis that deeper networks are better at representing less smooth functions.
arXiv Detail & Related papers (2020-06-07T05:25:06Z) - PUGeo-Net: A Geometry-centric Network for 3D Point Cloud Upsampling [103.09504572409449]
We propose a novel deep neural network based method, called PUGeo-Net, to generate uniform dense point clouds.
Thanks to its geometry-centric nature, PUGeo-Net works well for both CAD models with sharp features and scanned models with rich geometric details.
arXiv Detail & Related papers (2020-02-24T14:13:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.