Convergence Analysis of Deep Residual Networks
- URL: http://arxiv.org/abs/2205.06571v1
- Date: Fri, 13 May 2022 11:53:09 GMT
- Title: Convergence Analysis of Deep Residual Networks
- Authors: Wentao Huang and Haizhang Zhang
- Abstract summary: Deep Residual Networks (ResNets) are of particular importance because they demonstrated great usefulness in computer vision.
We aim at characterizing the convergence of deep ResNets as the depth tends to infinity in terms of the parameters of the networks.
- Score: 3.274290296343038
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Various powerful deep neural network architectures have made great
contributions to the exciting successes of deep learning in the past two
decades. Among them, deep Residual Networks (ResNets) are of particular
importance because they demonstrated great usefulness in computer vision by
winning first place in many deep learning competitions. ResNets were also the
first class of truly deep neural networks in the history of deep learning. It
is of both mathematical interest and practical importance to
understand the convergence of deep ResNets. We aim at characterizing the
convergence of deep ResNets as the depth tends to infinity in terms of the
parameters of the networks. Toward this purpose, we first give a matrix-vector
description of general deep neural networks with shortcut connections and
formulate an explicit expression for the networks by using the notions of
activation domains and activation matrices. The convergence is then reduced to
the convergence of two series involving infinite products of non-square
matrices. By studying the two series, we establish a sufficient condition for
pointwise convergence of ResNets. Our result provides justification for the
design of ResNets. We also conduct experiments on benchmark machine
learning data to verify our results.
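As a rough numerical illustration of this convergence question, the sketch below builds a fully-connected ReLU ResNet whose per-layer weights decay like 1/n^2 and checks that its output on a fixed input stabilizes as the depth grows. The width, the decay rate, and the random initialization are assumptions made for illustration; this is not the paper's matrix-vector construction or its sufficient condition.
```python
# Minimal sketch (assumed settings, not the paper's construction): a ReLU
# ResNet x_{n+1} = x_n + W_n relu(V_n x_n + b_n) with weights decaying like
# 1/n^2, so that deepening the network changes the output less and less.
import numpy as np

d = 16                                            # layer width (assumption)
x0 = np.random.default_rng(0).standard_normal(d)  # a fixed input

def resnet_output(depth, seed=1):
    """Output of a depth-`depth` ResNet; the seed fixes the layer sequence."""
    rng = np.random.default_rng(seed)             # same layers for every depth
    x = x0.copy()
    for n in range(1, depth + 1):
        V = rng.standard_normal((d, d)) / n**2    # decaying inner weights
        W = rng.standard_normal((d, d)) / n**2    # decaying outer weights
        b = rng.standard_normal(d) / n**2         # decaying biases
        x = x + W @ np.maximum(V @ x + b, 0.0)    # residual (shortcut) update
    return x

outs = [resnet_output(depth) for depth in (10, 100, 1000)]
for shallow, deep in zip(outs, outs[1:]):
    print(np.linalg.norm(deep - shallow))         # differences shrink with depth
```
Because every call reuses the same random layer sequence, the depth-1000 network extends the depth-100 one, so the shrinking differences mimic pointwise convergence of the network outputs as the depth tends to infinity.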
Related papers
- Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet) to Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z)
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that these neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
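The sketch below is a loose companion to this entry: it computes the standard empirical NTK of a one-hidden-layer ReLU network from explicit parameter gradients and shows how a large negative bias (a simplifying assumption, constant across units) leaves only a small fraction of units active. It is an illustration only, not the paper's bias-generalized kernel.
```python
# Minimal sketch (assumptions: standard empirical NTK, constant negative bias;
# not the paper's bias-generalized kernel) for
#   f(x) = (1/sqrt(m)) * sum_j a_j * relu(w_j . x + b_j).
import numpy as np

def empirical_ntk(x1, x2, W, a, b):
    """<grad_theta f(x1), grad_theta f(x2)> over all parameters theta."""
    m = W.shape[0]
    def grads(x):
        pre = W @ x + b                       # pre-activations, shape (m,)
        ind = (pre > 0).astype(float)         # ReLU activation pattern
        g_a = np.maximum(pre, 0.0) / np.sqrt(m)              # d f / d a_j
        g_W = (a * ind)[:, None] * x[None, :] / np.sqrt(m)   # d f / d w_j
        g_b = a * ind / np.sqrt(m)                            # d f / d b_j
        return np.concatenate([g_a, g_W.ravel(), g_b])
    return float(grads(x1) @ grads(x2))

rng = np.random.default_rng(0)
d, m = 8, 4096
x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
W, a = rng.standard_normal((m, d)), rng.standard_normal(m)

for beta in (0.0, 2.0, 8.0):                  # bias magnitude (assumption)
    b = -beta * np.ones(m)                    # large negative bias
    active = np.mean(W @ x1 + b > 0)          # fraction of active units
    print(beta, active, empirical_ntk(x1, x2, W, a, b))
```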
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
This paper proposes a theoretical and computational framework for training and robustness verification of implicit neural networks.
We introduce a related embedded network and show that the embedded network can be used to provide an $\ell_\infty$-norm box over-approximation of the reachable sets of the original network.
We apply our algorithms to train implicit neural networks on the MNIST dataset and compare the robustness of our models with the models trained via existing approaches in the literature.
arXiv Detail & Related papers (2022-08-08T03:13:24Z)
- Rank Diminishing in Deep Neural Networks [71.03777954670323]
The rank of a neural network measures information flow across layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains unclear.
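As a small illustration of this phenomenon (a sketch under assumed settings, not the paper's analysis), the snippet below tracks the stable rank ||H||_F^2 / ||H||_2^2, used here as a soft surrogate for rank, of the feature matrix of a random deep ReLU network; it decays markedly with depth.
```python
# Minimal sketch (illustration only): the stable rank of the feature matrix
# of a random deep ReLU network shrinks as the features pass through more
# layers, a simple proxy for the rank-diminishing behavior described above.
import numpy as np

rng = np.random.default_rng(0)
n_samples, width, depth = 256, 64, 30            # assumed sizes

def stable_rank(M):
    """||M||_F^2 / ||M||_2^2, a soft surrogate for matrix rank."""
    s = np.linalg.svd(M, compute_uv=False)
    return float(np.sum(s**2) / s[0]**2)

H = rng.standard_normal((n_samples, width))      # random input features
print(0, round(stable_rank(H), 2))
for layer in range(1, depth + 1):
    W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)  # He init
    H = np.maximum(H @ W, 0.0)                   # plain ReLU layer
    if layer % 5 == 0:
        print(layer, round(stable_rank(H), 2))   # shrinks with depth
```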
arXiv Detail & Related papers (2022-06-13T12:03:32Z)
- Convergence of Deep Neural Networks with General Activation Functions and Pooling [5.316908050163474]
Convergence of deep neural networks is a fundamental issue in building the mathematical foundation for deep learning.
We study the convergence of deep neural networks as the depth tends to infinity for two other activation functions: the leaky ReLU and the sigmoid function.
arXiv Detail & Related papers (2022-05-13T11:49:03Z)
- Convergence of Deep Convolutional Neural Networks [2.5991265608180396]
Convergence of deep neural networks as the depth of the networks tends to infinity is fundamental in building the mathematical foundation for deep learning.
We first study convergence of general ReLU networks with increasing widths and then apply the results obtained to deep convolutional neural networks.
arXiv Detail & Related papers (2021-09-28T07:48:17Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges, which reflect the magnitude of the connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
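A rough sketch of the general idea (forward pass only; the sigmoid gating, sizes, and names are assumptions, not the paper's implementation): each node of a small complete DAG aggregates its predecessors' features weighted by learnable edge parameters, so the connectivity itself stays differentiable.
```python
# Minimal sketch (assumptions throughout, not the paper's implementation):
# a 4-node "cell" whose directed edges carry learnable scalars; a sigmoid
# keeps edge strengths in (0, 1) so connectivity can be learned by gradients.
import numpy as np

rng = np.random.default_rng(0)
d, n_nodes = 16, 4
edge_logits = rng.standard_normal((n_nodes, n_nodes))   # learnable edge params
W = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_nodes)]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cell_forward(x):
    """Node j aggregates all earlier nodes, weighted by sigmoid(edge_logits[i, j])."""
    nodes = [x]
    for j in range(1, n_nodes):
        agg = sum(sigmoid(edge_logits[i, j]) * nodes[i] for i in range(j))
        nodes.append(np.maximum(W[j] @ agg, 0.0))        # per-node ReLU transform
    return nodes[-1]

print(cell_forward(rng.standard_normal(d)).shape)        # -> (16,)
```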
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Doubly infinite residual neural networks: a diffusion process approach [8.642603456626393]
We show that deep ResNets do not suffer from undesirable forward-propagation properties.
We focus on doubly infinite fully-connected ResNets, for which we consider i.i.d. parameters.
Our results highlight a limited expressive power of doubly infinite ResNets when the unscaled network's parameters are i.i.d. and the residual blocks are shallow.
arXiv Detail & Related papers (2020-07-07T07:45:34Z)
- Quasi-Equivalence of Width and Depth of Neural Networks [10.365556153676538]
We investigate if the design of artificial neural networks should have a directional preference.
Inspired by the De Morgan law, we establish a quasi-equivalence between the width and depth of ReLU networks.
Based on our findings, a deep network has a wide equivalent, subject to an arbitrarily small error.
arXiv Detail & Related papers (2020-02-06T21:17:32Z)
- On Random Kernels of Residual Architectures [93.94469470368988]
We derive finite width and depth corrections for the Neural Tangent Kernel (NTK) of ResNets and DenseNets.
Our findings show that in ResNets, convergence to the NTK may occur when depth and width simultaneously tend to infinity.
In DenseNets, however, convergence of the NTK to its limit as the width tends to infinity is guaranteed.
arXiv Detail & Related papers (2020-01-28T16:47:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.