Expressivity and Approximation Properties of Deep Neural Networks with
ReLU$^k$ Activation
- URL: http://arxiv.org/abs/2312.16483v2
- Date: Thu, 11 Jan 2024 04:48:47 GMT
- Title: Expressivity and Approximation Properties of Deep Neural Networks with
ReLU$^k$ Activation
- Authors: Juncai He, Tong Mao, Jinchao Xu
- Abstract summary: We investigate the expressivity and approximation properties of deep neural networks employing the ReLU$^k$ activation function for $k \geq 2$.
Although deep ReLU networks can approximate polynomials effectively, deep ReLU$^k$ networks have the capability to represent higher-degree polynomials precisely.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate the expressivity and approximation properties
of deep neural networks employing the ReLU$^k$ activation function for $k \geq
2$. Although deep ReLU networks can approximate polynomials effectively, deep
ReLU$^k$ networks have the capability to represent higher-degree polynomials
precisely. Our initial contribution is a comprehensive, constructive proof for
polynomial representation using deep ReLU$^k$ networks. This allows us to
establish an upper bound on both the size and count of network parameters.
Consequently, we are able to demonstrate a suboptimal approximation rate for
functions from Sobolev spaces as well as for analytic functions. Additionally,
through an exploration of the representation power of deep ReLU$^k$ networks
for shallow networks, we reveal that deep ReLU$^k$ networks can approximate
functions from a range of variation spaces, extending beyond those generated
solely by the ReLU$^k$ activation function. This finding demonstrates the
adaptability of deep ReLU$^k$ networks in approximating functions within
various variation spaces.
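As a rough illustration of the kind of exact polynomial representation discussed above (for the case $k = 2$), the sketch below uses the elementary identities $x^2 = \mathrm{ReLU}(x)^2 + \mathrm{ReLU}(-x)^2$ and $xy = \frac{1}{2}\left((x+y)^2 - x^2 - y^2\right)$. The function names and the two-neuron squaring gadget are illustrative assumptions for this note, not the paper's construction, which builds deep ReLU$^k$ networks with explicit bounds on depth, width, and parameter count.

```python
import numpy as np

def relu_k(x, k=2):
    """ReLU^k activation: max(0, x) raised to the k-th power."""
    return np.maximum(0.0, x) ** k

def square_via_relu2(x):
    """Exact identity x^2 = ReLU(x)^2 + ReLU(-x)^2:
    a two-neuron ReLU^2 layer with output weights (1, 1)."""
    return relu_k(x, 2) + relu_k(-x, 2)

def product_via_relu2(x, y):
    """Exact product via polarization: xy = ((x + y)^2 - x^2 - y^2) / 2,
    with each square realized by the ReLU^2 gadget above."""
    return 0.5 * (square_via_relu2(x + y) - square_via_relu2(x) - square_via_relu2(y))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = rng.standard_normal(1000), rng.standard_normal(1000)
    assert np.allclose(square_via_relu2(x), x ** 2)
    assert np.allclose(product_via_relu2(x, y), x * y)
    print("ReLU^2 gadgets reproduce x^2 and xy exactly (up to floating point).")
```

Composing such product gadgets across layers yields higher-degree monomials, which gives the intuition, though not the paper's precise argument, for why depth combined with the ReLU$^k$ nonlinearity permits exact representation of higher-degree polynomials.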
Related papers
- Piecewise Linear Functions Representable with Infinite Width Shallow
ReLU Neural Networks [0.0]
We prove a conjecture of Ongie et al. that every continuous piecewise linear function expressible with this kind of infinite width neural network is expressible as a finite width shallow ReLU neural network.
arXiv Detail & Related papers (2023-07-25T15:38:18Z) - Polynomial Width is Sufficient for Set Representation with
High-dimensional Features [69.65698500919869]
DeepSets is the most widely used neural network architecture for set representation.
We present two set-element embedding layers: (a) linear + power activation (LP) and (b) linear + exponential activations (LE).
arXiv Detail & Related papers (2023-07-08T16:00:59Z) - Bayesian Interpolation with Deep Linear Networks [92.1721532941863]
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory.
We show that linear networks make provably optimal predictions at infinite depth.
We also show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth.
arXiv Detail & Related papers (2022-12-29T20:57:46Z) - Benefits of Overparameterized Convolutional Residual Networks: Function
Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z) - Most Activation Functions Can Win the Lottery Without Excessive Depth [6.68999512375737]
The lottery ticket hypothesis has highlighted the potential for training deep neural networks by pruning.
For networks with ReLU activation functions, it has been proven that a target network of depth $L$ can be approximated by a subnetwork of a randomly initialized neural network that has twice the target's depth, $2L$, and is wider by a logarithmic factor.
arXiv Detail & Related papers (2022-05-04T20:51:30Z) - Theory of Deep Convolutional Neural Networks III: Approximating Radial
Functions [7.943024117353317]
We consider a family of deep neural networks consisting of two groups of convolutional layers, a down operator, and a fully connected layer.
The network structure depends on two structural parameters which determine the numbers of convolutional layers and the width of the fully connected layer.
arXiv Detail & Related papers (2021-07-02T08:22:12Z) - Adversarial Examples in Multi-Layer Random ReLU Networks [39.797621513256026]
Adversarial examples arise in ReLU networks with independent Gaussian parameters.
Bottleneck layers in the network play a key role: the minimal width up to some point determines scales and sensitivities of mappings computed up to that point.
arXiv Detail & Related papers (2021-06-23T18:16:34Z) - Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z) - Size and Depth Separation in Approximating Natural Functions with Neural
Networks [52.73592689730044]
We show the benefits of size and depth for approximation of natural functions with ReLU networks.
We show a complexity-theoretic barrier to proving such results beyond size $O(d)$.
We also show an explicit natural function that can be approximated with networks of size $O(d)$.
arXiv Detail & Related papers (2021-01-30T21:30:11Z) - Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$Nets are a new class of function approximators based on polynomial expansions.
$\Pi$Nets produce state-of-the-art results in three challenging tasks, i.e., image generation, face verification, and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z) - Sharp Representation Theorems for ReLU Networks with Precise Dependence
on Depth [26.87238691716307]
We prove sharp, dimension-free representation results for neural networks with $D$ ReLU layers under square loss.
Our results confirm the prevailing hypothesis that deeper networks are better at representing less smooth functions.
arXiv Detail & Related papers (2020-06-07T05:25:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.