From NeurODEs to AutoencODEs: a mean-field control framework for
width-varying Neural Networks
- URL: http://arxiv.org/abs/2307.02279v2
- Date: Thu, 10 Aug 2023 15:30:15 GMT
- Title: From NeurODEs to AutoencODEs: a mean-field control framework for
width-varying Neural Networks
- Authors: Cristina Cipriani, Massimo Fornasier and Alessandro Scagliotti
- Abstract summary: We propose a new type of continuous-time control system, called AutoencODE, based on a modification of the controlled field that drives the dynamics.
We show that many of the results obtained for high Tikhonov regularization can be recovered in regions where the loss function is locally convex.
- Score: 68.8204255655161
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The connection between Residual Neural Networks (ResNets) and continuous-time
control systems (known as NeurODEs) has led to a mathematical analysis of
neural networks which has provided interesting results of both theoretical and
practical significance. However, by construction, NeurODEs have been limited to
describing constant-width layers, making them unsuitable for modeling deep
learning architectures with layers of variable width. In this paper, we propose
a continuous-time Autoencoder, which we call AutoencODE, based on a
modification of the controlled field that drives the dynamics. This adaptation
enables the extension of the mean-field control framework originally devised
for conventional NeurODEs. In this setting, we tackle the case of low Tikhonov
regularization, resulting in potentially non-convex cost landscapes. While the
global results obtained for high Tikhonov regularization may not hold globally,
we show that many of them can be recovered in regions where the loss function
is locally convex. Inspired by our theoretical findings, we develop a training
method tailored to this specific type of Autoencoder with residual
connections, and we validate our approach through numerical experiments
conducted on various examples.
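For readers who want the formulation at a glance, the NeurODE viewpoint identifies the forward pass of a residual network with a controlled dynamical system, and training with an optimal control problem over the control. Schematically (this is the standard setup in this literature; the precise AutoencODE construction and assumptions are given in the paper):

```latex
% Controlled dynamics (NeurODE): the state X plays the role of the hidden
% activations, the control \theta plays the role of the layer parameters.
\dot{X}(t) = \mathcal{F}\bigl(t, X(t), \theta(t)\bigr), \qquad X(0) = X_0, \qquad t \in [0, T]

% Mean-field training objective over the data distribution \mu_0,
% with Tikhonov (weight-decay) regularization of strength \lambda:
J(\theta) = \mathbb{E}_{(X_0, Y) \sim \mu_0}\bigl[\ell(X(T), Y)\bigr]
            + \lambda \int_0^T \lVert \theta(t) \rVert^2 \, dt
```

An explicit Euler discretization of the dynamics recovers the residual update x_{k+1} = x_k + h f(x_k, theta_k). The sketch below shows one way a width-varying autoencoder can be emulated inside a fixed ambient dimension by masking inactive coordinates along depth; the mask schedule, activation, and widths are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def residual_step(x, W, b, h=0.1):
    """One explicit-Euler / residual update: x + h * tanh(W x + b)."""
    return x + h * np.tanh(W @ x + b)

def autoencode_forward(x0, params, widths, h=0.1):
    """Width-varying residual forward pass emulated in a fixed ambient
    dimension: coordinates beyond the current layer 'width' are masked,
    so the active dimension can shrink (encoder) and grow again (decoder).
    Illustrative sketch only, not the AutoencODE construction of the paper.
    """
    d = x0.shape[0]                      # ambient (maximal) dimension
    x = x0.copy()
    for (W, b), width in zip(params, widths):
        mask = np.zeros(d)
        mask[:width] = 1.0               # only the first `width` coordinates are active
        x = mask * residual_step(mask * x, W, b, h)
    return x

# Toy usage with a hypothetical width profile 8 -> 4 -> 2 -> 4 -> 8
rng = np.random.default_rng(0)
d, widths = 8, [8, 4, 2, 4, 8]
params = [(0.1 * rng.standard_normal((d, d)), np.zeros(d)) for _ in widths]
x_out = autoencode_forward(rng.standard_normal(d), params, widths)
print(x_out.shape)  # (8,)
```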
Related papers
- Generalization and Estimation Error Bounds for Model-based Neural Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow the construction of model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
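As a toy illustration of the regime described in this entry (gradient flow approximated by small-step gradient descent, with input dimension at least the number of samples), the loss of an over-parameterized least-squares problem decays geometrically; this is a simplified analogue, not the paper's multi-layer setting, and all sizes and step sizes below are arbitrary choices.

```python
import numpy as np

# Simplified analogue of the regime above: gradient flow, approximated by
# small-step gradient descent, on an over-parameterized least-squares
# problem with input dimension d >= number of samples n.  The training
# loss decays to zero at a geometric ("linear") rate.  This is an
# illustration only, not the multi-layer neural-network setting of the paper.
rng = np.random.default_rng(0)
n, d = 20, 50
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

w = np.zeros(d)
eta = 0.05                      # small step ~ Euler discretization of dw/dt = -grad L(w)
for step in range(2001):
    residual = X @ w - y
    if step % 500 == 0:
        print(step, 0.5 * np.mean(residual ** 2))
    w -= eta * (X.T @ residual) / n
```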
- Imbedding Deep Neural Networks [0.0]
Continuous depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems.
We propose a new approach which explicates the network's 'depth' as a fundamental variable, thus reducing the problem to a system of forward-facing initial value problems.
arXiv Detail & Related papers (2022-01-31T22:00:41Z)
- A Dimensionality Reduction Approach for Convolutional Neural Networks [0.0]
We propose a generic methodology to reduce the number of layers of a pre-trained network by combining dimensionality reduction techniques with input-output mappings.
Our experiments show that the reduced networks can achieve a level of accuracy similar to the original Convolutional Neural Network under examination, while reducing memory allocation.
arXiv Detail & Related papers (2021-10-18T10:31:12Z)
- LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose a new approach, called LocalDrop, for regularizing neural networks via the local Rademacher complexity.
A new regularization function for both fully-connected networks (FCNs) and convolutional neural networks (CNNs) has been developed based on the proposed upper bound of the local Rademacher complexity.
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
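For reference, the classical mean-field picture in the simplest (two-layer) case replaces a finite average over neurons by an integral against a probability measure over parameters; the feature-based construction in this entry generalizes this idea to deep architectures. The two-layer analogue is:

```latex
% Finite-width two-layer network with N neurons:
f_N(x) = \frac{1}{N} \sum_{i=1}^{N} a_i \, \sigma(\langle w_i, x \rangle)

% Mean-field (continuous) limit: the empirical measure over the neuron
% parameters (a_i, w_i) is replaced by a probability measure \mu:
f_\mu(x) = \int a \, \sigma(\langle w, x \rangle) \, d\mu(a, w)
```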
- An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d).
This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z)
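To make the mechanism in the previous entry concrete, the sketch below shows a generic discretized matrix flow on the orthogonal group O(d): updating with exponentials of skew-symmetric generators keeps the matrix orthogonal at every step. The generator schedule here is arbitrary, and this is not the ODEtoODE parameterization itself.

```python
import numpy as np
from scipy.linalg import expm

# Generic sketch of a discretized matrix flow on the orthogonal group O(d):
# W' = W A(t) with A(t) skew-symmetric, integrated as W <- W expm(h A).
# Since expm of a skew-symmetric matrix is orthogonal, W stays on O(d)
# at every step.  This illustrates the general mechanism only, not the
# ODEtoODE parameterization from the paper.
rng = np.random.default_rng(0)
d, h, steps = 4, 0.1, 50
W = np.eye(d)                              # start on the group
for _ in range(steps):
    G = rng.standard_normal((d, d))
    A = G - G.T                            # skew-symmetric generator
    W = W @ expm(h * A)                    # group-preserving update

print(np.allclose(W.T @ W, np.eye(d)))     # True: W remains orthogonal
```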
This list is automatically generated from the titles and abstracts of the papers on this site.