Complexity Measures for Neural Networks with General Activation
Functions Using Path-based Norms
- URL: http://arxiv.org/abs/2009.06132v1
- Date: Mon, 14 Sep 2020 01:15:11 GMT
- Title: Complexity Measures for Neural Networks with General Activation
Functions Using Path-based Norms
- Authors: Zhong Li and Chao Ma and Lei Wu
- Abstract summary: A simple approach is proposed to obtain complexity controls for neural networks with general activation functions.
We consider two-layer networks and deep residual networks, for which path-based norms are derived to control complexities.
- Score: 18.487936403345167
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A simple approach is proposed to obtain complexity controls for neural
networks with general activation functions. The approach is motivated by
approximating the general activation functions with one-dimensional ReLU
networks, which reduces the problem to the complexity controls of ReLU
networks. Specifically, we consider two-layer networks and deep residual
networks, for which path-based norms are derived to control complexities. We
also provide preliminary analyses of the function spaces induced by these norms
and a priori estimates of the corresponding regularized estimators.
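To make the core idea concrete, here is a minimal numpy sketch (not the paper's actual construction or constants): it approximates a general activation, tanh in this example, by a one-dimensional ReLU network built on a uniform knot grid, and evaluates a common weighted path norm for a two-layer network. The knot grid, network sizes, and the specific path-norm formula are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def fit_1d_relu_approximation(sigma, lo=-5.0, hi=5.0, num_knots=64):
    """Approximate a scalar activation sigma on [lo, hi] by a 1D ReLU network
    sigma_hat(x) = c + a0*x + sum_i a_i * relu(x - knot_i),
    i.e. the piecewise-linear interpolant of sigma on a uniform knot grid.
    (Illustrative construction; the paper's approximation argument is more careful.)"""
    b = np.linspace(lo, hi, num_knots)       # ReLU breakpoints
    y = sigma(b)
    slopes = np.diff(y) / np.diff(b)         # slope on each interval
    a0 = slopes[0]                           # initial slope
    a = np.diff(slopes)                      # slope changes at interior knots
    c = y[0] - a0 * b[0]
    knots = b[1:-1]
    def sigma_hat(x):
        x = np.asarray(x, dtype=float)
        return c + a0 * x + relu(x[..., None] - knots) @ a
    return sigma_hat

def two_layer_path_norm(W, v):
    """Path norm of a two-layer net f(x) = sum_k v_k * act(W[k] @ x):
    sum over hidden units of |v_k| * ||W[k]||_1 (a common path-norm definition)."""
    return float(np.sum(np.abs(v) * np.sum(np.abs(W), axis=1)))

if __name__ == "__main__":
    sigma_hat = fit_1d_relu_approximation(np.tanh)
    xs = np.linspace(-4, 4, 1001)
    print("max |tanh - relu-net| on [-4,4]:",
          np.abs(np.tanh(xs) - sigma_hat(xs)).max())

    rng = np.random.default_rng(0)
    W = rng.standard_normal((32, 8))   # hidden weights: 32 units, 8 inputs
    v = rng.standard_normal(32)        # output weights
    print("path norm:", two_layer_path_norm(W, v))
```

Replacing each general activation by such a 1D ReLU network turns the original model into a (deeper) ReLU network, for which path-based complexity bounds are available; that is the reduction the abstract describes.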
Related papers
- Generalization of Scaled Deep ResNets in the Mean-Field Regime [55.77054255101667]
We investigate scaled ResNet in the limit of infinitely deep and wide neural networks.
Our results offer new insights into the generalization ability of deep ResNet beyond the lazy training regime.
arXiv Detail & Related papers (2024-03-14T21:48:00Z)
- Generalization and Estimation Error Bounds for Model-based Neural Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow one to construct model-based networks with guaranteed high generalization.
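A canonical example of a model-based network for sparse recovery is an unrolled ISTA (LISTA-style) architecture. The sketch below is a generic, untrained unrolling under that assumption; it is not claimed to match the architecture analyzed in that paper.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of the l1 norm (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def unrolled_ista(A, y, num_layers=10, lam=0.1):
    """A minimal 'model-based network': each layer is one ISTA iteration for
    min_x 0.5*||A x - y||^2 + lam*||x||_1. In a learned (LISTA-style) variant,
    the matrices and thresholds below would be trainable per layer."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    step = 1.0 / L
    x = np.zeros(A.shape[1])
    for _ in range(num_layers):            # each loop iteration = one "layer"
        x = soft_threshold(x - step * A.T @ (A @ x - y), step * lam)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((30, 100)) / np.sqrt(30)
    x_true = np.zeros(100)
    x_true[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
    y = A @ x_true
    x_hat = unrolled_ista(A, y, num_layers=200, lam=0.01)
    print("largest-magnitude entries (indices):",
          np.sort(np.argsort(-np.abs(x_hat))[:5]))
```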
arXiv Detail & Related papers (2023-04-19T16:39:44Z)
- The Sample Complexity of One-Hidden-Layer Neural Networks [57.6421258363243]
We study a class of scalar-valued one-hidden-layer networks, and inputs bounded in Euclidean norm.
We prove that controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees.
We also analyze two important settings where mere spectral norm control turns out to be sufficient.
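For concreteness, the quantity being controlled, the spectral norm of the hidden-layer weight matrix, can be computed as below; the comparison with the Frobenius norm is only an illustrative aside, not part of that paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 64))      # hidden-layer weight matrix (units x inputs)

spectral = np.linalg.norm(W, 2)         # largest singular value of W
frobenius = np.linalg.norm(W, "fro")    # sqrt of the sum of squared entries

# ||W||_2 <= ||W||_F always holds, so a Frobenius-norm bound is a strictly
# stronger constraint than a spectral-norm bound of the same size.
print(f"spectral norm: {spectral:.2f}, Frobenius norm: {frobenius:.2f}")
```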
arXiv Detail & Related papers (2022-02-13T07:12:02Z)
- Imbedding Deep Neural Networks [0.0]
Continuous depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems.
We propose a new approach which explicates the network's 'depth' as a fundamental variable, thus reducing the problem to a system of forward-facing initial value problems.
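As a toy illustration of treating depth as a continuous variable, here is a generic Neural-ODE-style forward pass integrated with forward Euler; the vector field, step count, and integrator are assumptions made for illustration, not the construction proposed in that paper.

```python
import numpy as np

def residual_field(x, W, b):
    """Velocity field f(x) = tanh(W x + b): one 'infinitesimal residual block'."""
    return np.tanh(W @ x + b)

def odenet_forward(x0, W, b, depth=1.0, num_steps=100):
    """Integrate dx/dt = f(x) from t=0 to t=depth with forward Euler.
    num_steps plays the role of the (discretized) number of residual layers."""
    h = depth / num_steps
    x = x0.copy()
    for _ in range(num_steps):
        x = x + h * residual_field(x, W, b)   # x_{k+1} = x_k + h * f(x_k)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4
    W, b = 0.5 * rng.standard_normal((d, d)), 0.1 * rng.standard_normal(d)
    x0 = rng.standard_normal(d)
    print("output at depth 1.0:", odenet_forward(x0, W, b))
```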
arXiv Detail & Related papers (2022-01-31T22:00:41Z)
- Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks [75.33431791218302]
We study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape.
We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases.
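To illustrate what a path-based penalty on a parallel ReLU architecture might look like, the sketch below adds a generic path-norm regularizer to a squared-error loss; the branch structure, the exact penalty, and the hyperparameter `lam` are illustrative assumptions, not the convex reformulation studied in that paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def parallel_relu_forward(x, branches):
    """Parallel ReLU net: sum over branches of v^T relu(W x)."""
    return sum(v @ relu(W @ x) for W, v in branches)

def path_regularizer(branches):
    """Generic path-norm penalty: sum over branches, hidden units k, and inputs i
    of |v_k| * |W_{k,i}|, i.e. the total l1 weight of all input-to-output paths."""
    return float(sum(np.sum(np.abs(v)[:, None] * np.abs(W)) for W, v in branches))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, m, num_branches = 8, 16, 3
    branches = [(rng.standard_normal((m, d)), rng.standard_normal(m))
                for _ in range(num_branches)]
    x = rng.standard_normal(d)
    data_loss = (parallel_relu_forward(x, branches) - 1.0) ** 2
    lam = 1e-3
    print("regularized loss:", data_loss + lam * path_regularizer(branches))
```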
arXiv Detail & Related papers (2021-10-18T18:00:36Z)
- Towards Understanding Theoretical Advantages of Complex-Reaction Networks [77.34726150561087]
We show that a class of functions can be approximated by a complex-reaction network using a polynomial number of parameters.
For empirical risk minimization, our theoretical result shows that the critical point set of complex-reaction networks is a proper subset of that of real-valued networks.
arXiv Detail & Related papers (2021-08-15T10:13:49Z)
- Validation of RELU nets with tropical polyhedra [7.087237546722617]
We present an approach that abstracts ReLU feedforward neural networks using tropical polyhedra.
We show how the connection between ReLU networks and tropical rational functions can provide approaches for range analysis of ReLU neural networks.
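Tropical polyhedra give a much tighter abstraction than plain boxes; the sketch below only illustrates what range analysis of a ReLU network means, using simple interval (box) propagation as a stand-in for that paper's abstract domain.

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate a box [lo, hi] through x -> W x + b (exact for boxes)."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    new_lo = W_pos @ lo + W_neg @ hi + b
    new_hi = W_pos @ hi + W_neg @ lo + b
    return new_lo, new_hi

def relu_net_output_range(lo, hi, layers):
    """Sound (but generally loose) output bounds for a feedforward ReLU net.
    `layers` is a list of (W, b); ReLU follows every layer except the last."""
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = [(rng.standard_normal((16, 4)), rng.standard_normal(16)),
              (rng.standard_normal((1, 16)), rng.standard_normal(1))]
    lo, hi = relu_net_output_range(-np.ones(4), np.ones(4), layers)
    print("output range:", lo, hi)
```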
arXiv Detail & Related papers (2021-07-30T06:22:59Z)
- On reaction network implementations of neural networks [0.0]
This paper is concerned with the utilization of deterministically modeled chemical reaction networks for the implementation of (feed-forward) neural networks.
We develop a general mathematical framework and prove that the ordinary differential equations (ODEs) associated with certain reaction network implementations of neural networks have desirable properties.
arXiv Detail & Related papers (2020-10-26T02:37:26Z)
- Universal Approximation Power of Deep Residual Neural Networks via Nonlinear Control Theory [9.210074587720172]
We explain the universal approximation capabilities of deep residual neural networks through geometric nonlinear control.
Inspired by recent work establishing links between residual networks and control systems, we provide a general sufficient condition for a residual network to have the power of universal approximation.
arXiv Detail & Related papers (2020-07-12T14:53:30Z)
- Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation functions.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L_1$ and $L_2$ regularizations suppress the increase of model complexity.
arXiv Detail & Related papers (2020-06-16T07:38:06Z)