Approximation Power of Deep Neural Networks: an explanatory mathematical survey
- URL: http://arxiv.org/abs/2207.09511v2
- Date: Mon, 16 Dec 2024 21:06:21 GMT
- Title: Approximation Power of Deep Neural Networks: an explanatory mathematical survey
- Authors: Owen Davis, Mohammad Motamed
- Abstract summary: The survey examines how effectively neural networks approximate target functions and identifies conditions under which they outperform traditional approximation methods.
Key topics include the nonlinear, compositional structure of deep networks and the formalization of neural network tasks as optimization problems in regression and classification settings.
The survey explores the density of neural networks in the space of continuous functions, comparing the approximation capabilities of deep ReLU networks with those of other approximation methods.
- Score: 0.0
- License:
- Abstract: This survey provides an in-depth and explanatory review of the approximation properties of deep neural networks, with a focus on feed-forward and residual architectures. The primary objective is to examine how effectively neural networks approximate target functions and to identify conditions under which they outperform traditional approximation methods. Key topics include the nonlinear, compositional structure of deep networks and the formalization of neural network tasks as optimization problems in regression and classification settings. The survey also addresses the training process, emphasizing the role of stochastic gradient descent and backpropagation in solving these optimization problems, and highlights practical considerations such as activation functions, overfitting, and regularization techniques. Additionally, the survey explores the density of neural networks in the space of continuous functions, comparing the approximation capabilities of deep ReLU networks with those of other approximation methods. It discusses recent theoretical advancements in understanding the expressiveness and limitations of these networks. A detailed error-complexity analysis is also presented, focusing on error rates and computational complexity for neural networks with ReLU and Fourier-type activation functions in the context of bounded target functions with minimal regularity assumptions. Alongside recent known results, the survey introduces new findings, offering a valuable resource for understanding the theoretical foundations of neural network approximation. Concluding remarks and further reading suggestions are provided.
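To make the regression formalization concrete, the following is a minimal, self-contained sketch (not taken from the survey) of a one-hidden-layer ReLU network fitted to a bounded 1-D target by stochastic gradient descent with manually derived backpropagation; the width, learning rate, batch size, and target function are arbitrary illustrative choices.

```python
import numpy as np

# Minimal sketch: fit a 1-D target with a one-hidden-layer ReLU network,
# trained by stochastic gradient descent with manual backpropagation.
rng = np.random.default_rng(0)


def target(x):
    """Bounded target function to be approximated (illustrative choice)."""
    return np.sin(2.0 * np.pi * x)


width, lr, steps, batch = 64, 1e-2, 20_000, 32
W1 = rng.normal(0.0, 1.0, (width, 1))                    # hidden-layer weights
b1 = np.zeros((width, 1))
W2 = rng.normal(0.0, 1.0 / np.sqrt(width), (1, width))   # output weights
b2 = np.zeros((1, 1))

for _ in range(steps):
    x = rng.uniform(-1.0, 1.0, (1, batch))    # mini-batch of inputs
    y = target(x)
    z = W1 @ x + b1                           # pre-activations
    h = np.maximum(z, 0.0)                    # ReLU activation
    pred = W2 @ h + b2
    err = pred - y                            # residual of the MSE loss
    # Backpropagation: gradients of 0.5 * mean squared error.
    gW2 = err @ h.T / batch
    gb2 = err.mean(axis=1, keepdims=True)
    gz = (W2.T @ err) * (z > 0.0)             # ReLU derivative mask
    gW1 = gz @ x.T / batch
    gb1 = gz.mean(axis=1, keepdims=True)
    # Plain SGD update.
    W1 -= lr * gW1
    b1 -= lr * gb1
    W2 -= lr * gW2
    b2 -= lr * gb2

xs = np.linspace(-1.0, 1.0, 201)[None, :]
fit = W2 @ np.maximum(W1 @ xs + b1, 0.0) + b2
print("max pointwise error on a grid:", np.abs(fit - target(xs)).max())
```

Plain NumPy is used so the example stays self-contained; in practice the settings the survey analyzes would typically be explored with an automatic-differentiation framework.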
Related papers
- Neural Scaling Laws of Deep ReLU and Deep Operator Network: A Theoretical Study [8.183509993010983]
We study the neural scaling laws for deep operator networks using the Chen and Chen style architecture.
We quantify the neural scaling laws by analyzing the network's approximation and generalization errors.
Our results offer a partial explanation of the neural scaling laws in operator learning and provide a theoretical foundation for their applications.
arXiv Detail & Related papers (2024-10-01T03:06:55Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- A new approach to generalisation error of machine learning algorithms: Estimates and convergence [0.0]
We introduce a new approach to the estimation of the (generalisation) error and to convergence.
Our results include estimates of the error without any structural assumption on the neural networks.
arXiv Detail & Related papers (2023-06-23T20:57:31Z)
- Rank Diminishing in Deep Neural Networks [71.03777954670323]
The rank of a neural network measures the information flowing across its layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains poorly understood.
arXiv Detail & Related papers (2022-06-13T12:03:32Z)
- Multigoal-oriented dual-weighted-residual error estimation using deep neural networks [0.0]
Deep learning is considered a powerful and highly flexible tool for approximating functions.
Our approach is based on a posteriori error estimation in which the adjoint problem is solved for the error localization.
An efficient and easy-to-implement algorithm is developed to obtain a posteriori error estimates for multiple goal functionals.
arXiv Detail & Related papers (2021-12-21T16:59:44Z)
- What can linearized neural networks actually say about generalization? [67.83999394554621]
In certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization.
We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks.
Our work provides concrete examples of novel deep learning phenomena which can inspire future theoretical research.
arXiv Detail & Related papers (2021-06-12T13:05:11Z)
- Topological obstructions in neural networks learning [67.8848058842671]
We study global properties of the loss gradient function flow.
We use topological data analysis of the loss function and its Morse complex to relate local behavior along gradient trajectories with global properties of the loss surface.
arXiv Detail & Related papers (2020-12-31T18:53:25Z)
- Analytical aspects of non-differentiable neural networks [0.0]
We discuss the expressivity of quantized neural networks and approximation techniques for non-differentiable networks.
We show that QNNs have the same expressivity as DNNs in terms of approximation of Lipschitz functions in the $L^\infty$ norm.
We also consider networks defined by means of Heaviside-type activation functions, and prove for them a pointwise approximation result by means of smooth networks.
arXiv Detail & Related papers (2020-11-03T17:20:43Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and offers adaptability to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Expressivity of Deep Neural Networks [2.7909470193274593]
In this review paper, we give a comprehensive overview of the large variety of approximation results for neural networks.
While the main body of existing results is for general feedforward architectures, we also present approximation results for convolutional, residual, and recurrent neural networks.
arXiv Detail & Related papers (2020-07-09T13:08:01Z)
- Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
The binarization inevitably causes severe information loss, and even worse, its discontinuity brings difficulty to the optimization of the deep network.
We present a survey of these algorithms, mainly categorized into native solutions that directly conduct binarization and optimized ones that employ techniques such as minimizing the quantization error, improving the network loss function, and reducing the gradient error.
arXiv Detail & Related papers (2020-03-31T16:47:20Z)
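As a generic illustration of the "minimizing the quantization error" strategy mentioned in the last entry (not code from any of the surveyed papers), the sketch below binarizes a weight matrix to alpha * sign(W), with the scalar alpha = mean(|W|) chosen because it minimizes the squared quantization error; the matrix size and random weights are arbitrary stand-ins.

```python
import numpy as np

# Illustrative sketch: binarize a weight matrix W to alpha * sign(W), with the
# scalar alpha chosen to minimize the squared quantization error
#   || W - alpha * sign(W) ||_F^2,
# whose minimizer is alpha = mean(|W|).
rng = np.random.default_rng(0)
W = rng.normal(size=(256, 128))           # stand-in for a layer's weights

alpha = np.abs(W).mean()                  # optimal scale for sign binarization
W_bin = alpha * np.sign(W)                # weights restricted to {-alpha, +alpha}

err_scaled = np.linalg.norm(W - W_bin)        # quantization error with scaling
err_plain = np.linalg.norm(W - np.sign(W))    # error of unscaled sign(W)
print(err_scaled <= err_plain)                # the scaled version is never worse
```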