Approximation smooth and sparse functions by deep neural networks
without saturation
- URL: http://arxiv.org/abs/2001.04114v1
- Date: Mon, 13 Jan 2020 09:28:50 GMT
- Title: Approximation smooth and sparse functions by deep neural networks
without saturation
- Authors: Xia Liu
- Abstract summary: In this paper, we aim at constructing deep neural networks with three hidden layers to approximate smooth and sparse functions.
We prove that the constructed deep nets can reach the optimal approximation rate in approximating both smooth and sparse functions with controllable magnitude of free parameters.
- Score: 0.6396288020763143
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Constructing neural networks for function approximation is a classical and
longstanding topic in approximation theory. In this paper, we aim at
constructing deep neural networks (deep nets for short) with three hidden
layers to approximate smooth and sparse functions. In particular, we prove that
the constructed deep nets can reach the optimal approximation rate in
approximating both smooth and sparse functions with controllable magnitude of
free parameters. Since the saturation that describes the bottleneck of
approximate is an insurmountable problem of constructive neural networks, we
also prove that deepening the neural network with only one more hidden layer
can avoid the saturation. The obtained results underlie advantages of deep nets
and provide theoretical explanations for deep learning.
Related papers
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z) - Neural Network Pruning as Spectrum Preserving Process [7.386663473785839]
We identify the close connection between matrix spectrum learning and neural network training for dense and convolutional layers.
We propose a matrix sparsification algorithm tailored for neural network pruning that yields better pruning result.
arXiv Detail & Related papers (2023-07-18T05:39:32Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z) - Approximation Power of Deep Neural Networks: an explanatory mathematical
survey [0.0]
The goal of this survey is to present an explanatory review of the approximation properties of deep neural networks.
We aim at understanding how and why deep neural networks outperform other classical linear and nonlinear approximation methods.
arXiv Detail & Related papers (2022-07-19T18:47:44Z) - Benefits of Overparameterized Convolutional Residual Networks: Function
Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness.
Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z) - Exact Solutions of a Deep Linear Network [2.2344764434954256]
This work finds the analytical expression of the global minima of a deep linear network with weight decay and neurons.
We show that weight decay strongly interacts with the model architecture and can create bad minima at zero in a network with more than $1$ hidden layer.
arXiv Detail & Related papers (2022-02-10T00:13:34Z) - Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z) - On the approximation of functions by tanh neural networks [0.0]
We derive bounds on the error, in high-order Sobolev norms, incurred in the approximation of Sobolev-regular.
We show that tanh neural networks with only two hidden layers suffice to approximate functions at comparable or better rates than much deeper ReLU neural networks.
arXiv Detail & Related papers (2021-04-18T19:30:45Z) - The Connection Between Approximation, Depth Separation and Learnability
in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z) - Theoretical Analysis of the Advantage of Deepening Neural Networks [0.0]
It is important to know the expressivity of functions computable by deep neural networks.
By the two criteria, we show that to increase layers is more effective than to increase units at each layer on improving the expressivity of deep neural networks.
arXiv Detail & Related papers (2020-09-24T04:10:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.