On the Optimal Expressive Power of ReLU DNNs and Its Application in
Approximation with Kolmogorov Superposition Theorem
- URL: http://arxiv.org/abs/2308.05509v1
- Date: Thu, 10 Aug 2023 11:42:09 GMT
- Title: On the Optimal Expressive Power of ReLU DNNs and Its Application in
Approximation with Kolmogorov Superposition Theorem
- Authors: Juncai He
- Abstract summary: We study the optimal expressive power of ReLU deep neural networks (DNNs) and its application in approximation via the Kolmogorov Superposition Theorem.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper is devoted to studying the optimal expressive power of ReLU deep
neural networks (DNNs) and its application in approximation via the Kolmogorov
Superposition Theorem. We first constructively prove that any continuous
piecewise linear function on $[0,1]$ comprising $O(N^2L)$ segments can be
represented by ReLU DNNs with $L$ hidden layers and $N$ neurons per layer.
Subsequently, we demonstrate that this construction is optimal regarding the
parameter count of the DNNs, achieved through investigating the shattering
capacity of ReLU DNNs. Moreover, by invoking the Kolmogorov Superposition
Theorem, we achieve an enhanced approximation rate for ReLU DNNs of arbitrary
width and depth when dealing with continuous functions in high-dimensional
spaces.
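
For intuition, the representation result above can be contrasted with the classical shallow construction, which spends one ReLU neuron per breakpoint in a single hidden layer. The NumPy sketch below (function names and test data are illustrative, not taken from the paper) builds that shallow network exactly; the paper's deep construction is far more parameter-efficient, reaching $O(N^2L)$ segments with only $N$ neurons per layer across $L$ layers.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def cpwl_as_shallow_relu_net(breakpoints, values):
    """Return (knots, weights, offset) of a one-hidden-layer ReLU network that
    exactly reproduces the continuous piecewise linear interpolant of
    (breakpoints, values):

        f(x) = offset + sum_i weights[i] * relu(x - knots[i]),

    where weights[i] is the change of slope at knots[i]."""
    breakpoints = np.asarray(breakpoints, dtype=float)
    values = np.asarray(values, dtype=float)
    slopes = np.diff(values) / np.diff(breakpoints)           # slope on each segment
    weights = np.concatenate(([slopes[0]], np.diff(slopes)))  # slope changes at knots
    return breakpoints[:-1], weights, values[0]

def evaluate(x, knots, weights, offset):
    """Evaluate the shallow ReLU network at the points x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return offset + relu(x[:, None] - knots[None, :]) @ weights

# A CPwL function with 4 segments on [0, 1] (made-up data for illustration).
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [0.0, 1.0, 0.3, 0.8, 0.2]
knots, w, b = cpwl_as_shallow_relu_net(xs, ys)
assert np.allclose(evaluate(xs, knots, w, b), ys)  # exact at every breakpoint
```

Here the width equals the number of segments; the theorem in the paper shows that depth lets ReLU DNNs trade width for segments quadratically.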
Related papers
- Improving the Expressive Power of Deep Neural Networks through Integral
Activation Transform [12.36064367319084]
We generalize the traditional fully connected deep neural network (DNN) through the concept of continuous width.
We show that IAT-ReLU exhibits a continuous activation pattern when continuous basis functions are employed.
Our numerical experiments demonstrate that IAT-ReLU outperforms regular ReLU in terms of trainability and smoothness.
arXiv Detail & Related papers (2023-12-19T20:23:33Z) - Sample Complexity of Neural Policy Mirror Descent for Policy
Optimization on Low-Dimensional Manifolds [75.51968172401394]
We study the sample complexity of the neural policy mirror descent (NPMD) algorithm with deep convolutional neural networks (CNNs).
In each iteration of NPMD, both the value function and the policy can be well approximated by CNNs.
We show that NPMD can leverage the low-dimensional structure of state space to escape from the curse of dimensionality.
arXiv Detail & Related papers (2023-09-25T07:31:22Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Minimal Width for Universal Property of Deep RNN [6.744583770038476]
A recurrent neural network (RNN) is a widely used deep-learning network for dealing with sequential data.
We prove the universality of deep narrow RNNs and show that the upper bound of the minimum width for universality can be independent of the length of the data.
arXiv Detail & Related papers (2022-11-25T02:43:54Z) - On Feature Learning in Neural Networks with Global Convergence
Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z) - ReLU Deep Neural Networks from the Hierarchical Basis Perspective [8.74591882131599]
We study ReLU deep neural networks (DNNs) by investigating their connections with the hierarchical basis method in finite element methods.
We show that the approximation schemes of ReLU DNNs for $x^2$ and $xy$ are composition versions of the hierarchical basis approximation for these two functions.
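
The composition scheme mentioned here can be illustrated with the well-known sawtooth construction: composing a hat function with itself produces the higher-level hierarchical basis functions, and subtracting scaled copies from $x$ yields the piecewise linear interpolant of $x^2$ on a uniform grid. The sketch below is that standard construction (function names are mine, not the paper's); $xy$ then follows from the polarization identity $xy = \frac{1}{2}\big((x+y)^2 - x^2 - y^2\big)$.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def hat(x):
    """Hat function on [0, 1], written with three ReLU neurons: 0 -> 1 -> 0."""
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)

def approx_square(x, depth):
    """Piecewise linear interpolant of x**2 on a uniform grid with 2**depth cells,
    built by composing the hat function with itself (one level per composition)."""
    x = np.asarray(x, dtype=float)
    out, g = x.copy(), x.copy()
    for s in range(1, depth + 1):
        g = hat(g)                 # s-fold composition: sawtooth with 2**(s-1) teeth
        out = out - g / 4.0 ** s   # subtract the scaled level-s correction
    return out

x = np.linspace(0.0, 1.0, 1001)
for depth in (2, 4, 6):
    err = np.max(np.abs(approx_square(x, depth) - x ** 2))
    print(f"depth {depth}: max error {err:.2e}  (theory: {2.0 ** (-2 * depth - 2):.2e})")
```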
arXiv Detail & Related papers (2021-05-10T07:25:33Z) - On the Turnpike to Design of Deep Neural Nets: Explicit Depth Bounds [0.0]
This paper attempts a quantifiable answer to the question of how many layers should be considered in a deep neural network (DNN).
The underlying assumption is that the number of neurons per layer -- i.e., the width of the DNN -- is kept constant.
We prove explicit bounds on the required depths of DNNs based on reachability assumptions and a dissipativity-inducing choice of the regularization terms in the training problem.
arXiv Detail & Related papers (2021-01-08T13:23:37Z) - Improve the Robustness and Accuracy of Deep Neural Network with
$L_{2,\infty}$ Normalization [0.0]
The robustness and accuracy of a deep neural network (DNN) are enhanced by introducing $L_{2,\infty}$ normalization.
It is proved that $L_{2,\infty}$ normalization leads to large dihedral angles between two adjacent faces of the polyhedron graph of the DNN function.
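
The summary above does not spell out how the normalization is applied; one common reading of the $L_{2,\infty}$ norm of a weight matrix is the largest Euclidean norm among its rows, and a minimal, hypothetical sketch of rescaling a layer by that quantity is:

```python
import numpy as np

def l2_inf_norm(W):
    """L_{2,infinity} norm of a weight matrix: the largest Euclidean row norm."""
    return np.max(np.linalg.norm(W, axis=1))

def normalize_layer(W, target=1.0):
    """Rescale W so that its L_{2,infinity} norm equals `target` (illustrative scheme)."""
    return W * (target / l2_inf_norm(W))

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
print(l2_inf_norm(W), "->", l2_inf_norm(normalize_layer(W)))  # second value is 1.0
```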
arXiv Detail & Related papers (2020-10-10T05:45:45Z) - An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their
Asymptotic Overconfidence [65.24701908364383]
A Bayesian treatment can mitigate overconfidence in ReLU nets around the training data.
But far away from them, Bayesian ReLU networks (BNNs) can still underestimate uncertainty and thus be overconfident.
We show that the proposed infinite-feature extension can be applied post-hoc to any pre-trained ReLU BNN at a low cost.
arXiv Detail & Related papers (2020-10-06T13:32:18Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Exact posterior distributions of wide Bayesian neural networks [51.20413322972014]
We show that the exact BNN posterior converges (weakly) to the one induced by the GP limit of the prior.
For empirical validation, we show how to generate exact samples from a finite BNN on a small dataset via rejection sampling.
arXiv Detail & Related papers (2020-06-18T13:57:04Z)