Simultaneous approximation of a smooth function and its derivatives by
deep neural networks with piecewise-polynomial activations
- URL: http://arxiv.org/abs/2206.09527v2
- Date: Fri, 2 Dec 2022 10:01:35 GMT
- Title: Simultaneous approximation of a smooth function and its derivatives by
deep neural networks with piecewise-polynomial activations
- Authors: Denis Belomestny, Alexey Naumov, Nikita Puchkin, Sergey Samsonov
- Abstract summary: We derive the required depth, width, and sparsity of a deep neural network to approximate any Hölder smooth function up to a given approximation error in Hölder norms.
The latter feature is essential to control generalization errors in many statistical and machine learning applications.
- Score: 2.15145758970292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the approximation properties of deep neural networks
with piecewise-polynomial activation functions. We derive the required depth,
width, and sparsity of a deep neural network to approximate any Hölder
smooth function up to a given approximation error in Hölder norms in such a
way that all weights of this neural network are bounded by $1$. The latter
feature is essential to control generalization errors in many statistical and
machine learning applications.
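For orientation, the error above is measured in Hölder norms. The display below is a minimal sketch of the standard Hölder norm of order $\beta = s + \alpha$ (integer $s \ge 0$, $\alpha \in (0,1]$) on the cube $[0,1]^d$, assuming the usual multi-index notation; it is not quoted from the paper, whose own definitions and constants should be consulted directly.
\[
  \|f\|_{\mathcal{H}^{\beta}([0,1]^d)}
  = \max_{|\gamma| \le s}\, \sup_{x \in [0,1]^d} \bigl|\partial^{\gamma} f(x)\bigr|
  \;+\; \max_{|\gamma| = s}\, \sup_{x \ne y}
    \frac{\bigl|\partial^{\gamma} f(x) - \partial^{\gamma} f(y)\bigr|}{\|x - y\|^{\alpha}}
\]
Simultaneous approximation of $f$ and its derivatives then amounts to choosing a network $g$, with all weights bounded by $1$, such that $\|f - g\|$ is small in Hölder norms of lower order; the paper quantifies the depth, width, and sparsity required for a prescribed accuracy.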
Related papers
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- When Deep Learning Meets Polyhedral Theory: A Survey [6.899761345257773]
In the past decade, deep learning became the prevalent methodology for predictive modeling thanks to the remarkable accuracy of deep neural networks.
Meanwhile, the structure of neural networks converged back to simpler, piecewise linear functions.
arXiv Detail & Related papers (2023-04-29T11:46:53Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Exploring the Approximation Capabilities of Multiplicative Neural Networks for Smooth Functions [9.936974568429173]
We consider two classes of target functions: generalized bandlimited functions and Sobolev-type balls.
Our results demonstrate that multiplicative neural networks can approximate these functions with significantly fewer layers and neurons.
These findings suggest that multiplicative gates can outperform standard feed-forward layers and have potential for improving neural network design (a minimal illustrative sketch of such a gate follows this list).
arXiv Detail & Related papers (2023-01-11T17:57:33Z)
- A Local Geometric Interpretation of Feature Extraction in Deep Feedforward Neural Networks [13.159994710917022]
In this paper, we present a local geometric analysis to interpret how deep feedforward neural networks extract low-dimensional features from high-dimensional data.
Our study shows that, in a local geometric region, the optimal weight in one layer of the neural network and the optimal feature generated by the previous layer comprise a low-rank approximation of a matrix that is determined by the Bayes action of this layer.
arXiv Detail & Related papers (2022-02-09T18:50:00Z)
- Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
- On the approximation of functions by tanh neural networks [0.0]
We derive bounds on the error, in high-order Sobolev norms, incurred in the approximation of Sobolev-regular functions.
We show that tanh neural networks with only two hidden layers suffice to approximate functions at comparable or better rates than much deeper ReLU neural networks.
arXiv Detail & Related papers (2021-04-18T19:30:45Z)
- Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z)
- Topological obstructions in neural networks learning [67.8848058842671]
We study global properties of the loss gradient function flow.
We use topological data analysis of the loss function and its Morse complex to relate local behavior along gradient trajectories with global properties of the loss surface.
arXiv Detail & Related papers (2020-12-31T18:53:25Z)
- The Representation Power of Neural Networks: Breaking the Curse of Dimensionality [0.0]
We prove upper bounds on the number of parameters needed by shallow and deep neural networks to approximate Korobov functions.
We further prove that these bounds nearly match the minimal number of parameters any continuous function approximator needs to approximate Korobov functions.
arXiv Detail & Related papers (2020-12-10T04:44:07Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L^1$ and $L^2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)
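As referenced in the entry on multiplicative neural networks above, the sketch below illustrates the generic idea of a multiplicative gate (the elementwise product of two affine maps of the input) next to a standard feed-forward layer. It is a minimal NumPy illustration under common assumptions, not the architecture analyzed in that paper; the function names and sizes are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

def feedforward_layer(x, W, b):
    # Standard feed-forward layer: affine map followed by a ReLU.
    return np.maximum(W @ x + b, 0.0)

def multiplicative_gate(x, W1, b1, W2, b2):
    # Multiplicative gate: elementwise product of two affine maps of the input,
    # so a single layer can express degree-2 interactions between coordinates.
    return (W1 @ x + b1) * (W2 @ x + b2)

# Hypothetical sizes for illustration: input dimension 4, layer width 3.
d_in, width = 4, 3
x = rng.standard_normal(d_in)

W, b = rng.standard_normal((width, d_in)), rng.standard_normal(width)
W1, b1 = rng.standard_normal((width, d_in)), rng.standard_normal(width)
W2, b2 = rng.standard_normal((width, d_in)), rng.standard_normal(width)

print("feed-forward output:       ", feedforward_layer(x, W, b))
print("multiplicative gate output:", multiplicative_gate(x, W1, b1, W2, b2))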