Related papers: Approximation smooth and sparse functions by deep neural networks without saturation

URL: http://arxiv.org/abs/2001.04114v1
Date: Mon, 13 Jan 2020 09:28:50 GMT
Title: Approximation smooth and sparse functions by deep neural networks without saturation
Authors: Xia Liu
Abstract summary: In this paper, we aim at constructing deep neural networks with three hidden layers to approximate smooth and sparse functions. We prove that the constructed deep nets can reach the optimal approximation rate in approximating both smooth and sparse functions with controllable magnitude of free parameters.
Score: 0.6396288020763143
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Constructing neural networks for function approximation is a classical and longstanding topic in approximation theory. In this paper, we aim at constructing deep neural networks (deep nets for short) with three hidden layers to approximate smooth and sparse functions. In particular, we prove that the constructed deep nets can reach the optimal approximation rate in approximating both smooth and sparse functions with controllable magnitude of free parameters. Since saturation, which describes the bottleneck of approximation, is an insurmountable problem of constructive neural networks, we also prove that deepening the neural network with only one more hidden layer can avoid the saturation. The obtained results underline the advantages of deep nets and provide theoretical explanations for deep learning.

Related papers

Dense Neural Networks are not Universal Approximators [53.27010448621372]
We show that dense neural networks do not possess universality of arbitrary continuous functions. We consider ReLU neural networks subject to natural constraints on weights and input and output dimensions.
arXiv Detail & Related papers (2026-02-07T16:52:38Z)
Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence. We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers. This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
Neural Network Pruning as Spectrum Preserving Process [7.386663473785839]
We identify the close connection between matrix spectrum learning and neural network training for dense and convolutional layers. We propose a matrix sparsification algorithm tailored for neural network pruning that yields better pruning results.
arXiv Detail & Related papers (2023-07-18T05:39:32Z)
Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations. We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective. We show how to compute this efficiently for tractable circuits. We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
Approximation Power of Deep Neural Networks: an explanatory mathematical survey [0.0]
The goal of this survey is to present an explanatory review of the approximation properties of deep neural networks. We aim at understanding how and why deep neural networks outperform other classical linear and nonlinear approximation methods.
arXiv Detail & Related papers (2022-07-19T18:47:44Z)
Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint [48.25573695787407]
We prove that large ConvResNets can not only approximate a target function in terms of function value, but also exhibit sufficient first-order smoothness. Our theory partially justifies the benefits of using deep and wide networks in practice.
arXiv Detail & Related papers (2022-06-09T15:35:22Z)
Exact Solutions of a Deep Linear Network [2.2344764434954256]
This work finds the analytical expression of the global minima of a deep linear network with weight decay and neurons. We show that weight decay strongly interacts with the model architecture and can create bad minima at zero in a network with more than $1$ hidden layer.
arXiv Detail & Related papers (2022-02-10T00:13:34Z)
Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations. This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z)
On the approximation of functions by tanh neural networks [0.0]
We derive bounds on the error, in high-order Sobolev norms, incurred in the approximation of Sobolev-regular functions. We show that tanh neural networks with only two hidden layers suffice to approximate functions at comparable or better rates than much deeper ReLU neural networks.
arXiv Detail & Related papers (2021-04-18T19:30:45Z)
The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity. We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
Theoretical Analysis of the Advantage of Deepening Neural Networks [0.0]
It is important to know the expressivity of functions computable by deep neural networks. By the two criteria, we show that increasing the number of layers is more effective than increasing the number of units in each layer for improving the expressivity of deep neural networks.
arXiv Detail & Related papers (2020-09-24T04:10:50Z)
Expressivity of Deep Neural Networks [2.7909470193274593]
In this review paper, we give a comprehensive overview of the large variety of approximation results for neural networks. While the main body of existing results is for general feedforward architectures, we also depict approximation results for convolutional, residual and recurrent neural networks.
arXiv Detail & Related papers (2020-07-09T13:08:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.