Lipschitz Recurrent Neural Networks
- URL: http://arxiv.org/abs/2006.12070v3
- Date: Sat, 24 Apr 2021 03:31:59 GMT
- Title: Lipschitz Recurrent Neural Networks
- Authors: N. Benjamin Erichson, Omri Azencot, Alejandro Queiruga, Liam Hodgkinson, and Michael W. Mahoney
- Abstract summary: We show that our Lipschitz recurrent unit is more robust with respect to input and parameter perturbations as compared to other continuous-time RNNs.
Our experiments demonstrate that the Lipschitz RNN can outperform existing recurrent units on a range of benchmark tasks.
- Score: 100.72827570987992
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Viewing recurrent neural networks (RNNs) as continuous-time dynamical
systems, we propose a recurrent unit that describes the hidden state's
evolution with two parts: a well-understood linear component plus a Lipschitz
nonlinearity. This particular functional form facilitates stability analysis of
the long-term behavior of the recurrent unit using tools from nonlinear systems
theory. In turn, this enables architectural design decisions before
experimentation. Sufficient conditions for global stability of the recurrent
unit are obtained, motivating a novel scheme for constructing hidden-to-hidden
matrices. Our experiments demonstrate that the Lipschitz RNN can outperform
existing recurrent units on a range of benchmark tasks, including computer
vision, language modeling and speech prediction tasks. Finally, through
Hessian-based analysis we demonstrate that our Lipschitz recurrent unit is more
robust with respect to input and parameter perturbations as compared to other
continuous-time RNNs.
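The abstract does not spell out the update rule, so the following is only a minimal sketch of such a unit, assuming hidden dynamics of the form h' = A h + tanh(W h + U x + b), a forward-Euler time step, and hidden-to-hidden matrices built from a blend of symmetric and skew-symmetric parts with a diagonal shift. The parameter names beta, gamma, and dt, and their values, are illustrative assumptions rather than the paper's exact construction.

```python
import torch
import torch.nn as nn

class LipschitzRNNCell(nn.Module):
    """Sketch of a continuous-time recurrent unit: a linear term A @ h plus a
    Lipschitz nonlinearity tanh(W @ h + U @ x + b), stepped with forward Euler.
    The (beta, gamma) parameterization of A and W is an assumption."""

    def __init__(self, input_size, hidden_size, beta=0.75, gamma=0.001, dt=0.01):
        super().__init__()
        scale = hidden_size ** -0.5
        self.M_A = nn.Parameter(scale * torch.randn(hidden_size, hidden_size))
        self.M_W = nn.Parameter(scale * torch.randn(hidden_size, hidden_size))
        self.U = nn.Linear(input_size, hidden_size)
        self.beta, self.gamma, self.dt = beta, gamma, dt

    def _hidden_matrix(self, M):
        # Blend symmetric and skew-symmetric parts of M, then shift the
        # diagonal to push eigenvalues toward the stable half-plane.
        return (1 - self.beta) * (M + M.T) + self.beta * (M - M.T) \
               - self.gamma * torch.eye(M.shape[0], device=M.device)

    def forward(self, x, h):
        A = self._hidden_matrix(self.M_A)
        W = self._hidden_matrix(self.M_W)
        h_dot = h @ A.T + torch.tanh(h @ W.T + self.U(x))  # h' = A h + tanh(W h + U x + b)
        return h + self.dt * h_dot                          # explicit Euler step


# usage: one step on a batch of 8 inputs with a 64-dimensional hidden state
cell = LipschitzRNNCell(input_size=10, hidden_size=64)
h = torch.zeros(8, 64)
h = cell(torch.randn(8, 10), h)
```

With gamma > 0 and beta near 1, the hidden-to-hidden matrices are close to skew-symmetric with a slightly negative-definite symmetric part, which is the kind of structure a global-stability argument would rely on; the exact sufficient conditions are given in the paper, not here.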
Related papers
- Unconditional stability of a recurrent neural circuit implementing divisive normalization [0.0]
We prove the remarkable property of unconditional local stability for an arbitrary-dimensional ORGaNICs circuit.
We show that ORGaNICs can be trained by backpropagation through time without gradient clipping/scaling.
arXiv Detail & Related papers (2024-09-27T17:46:05Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Lipschitz Continuity Retained Binary Neural Network [52.17734681659175]
We introduce Lipschitz continuity as a rigorous criterion for defining the robustness of BNNs.
We then propose retaining Lipschitz continuity as a regularization term to improve model robustness.
Our experiments show that this BNN-specific regularization method effectively strengthens the robustness of BNNs.
arXiv Detail & Related papers (2022-07-13T22:55:04Z)
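The paper's exact regularizer is not given in the summary above, so the snippet below is only a generic sketch of the idea of penalizing Lipschitz constants during training: it sums the spectral norm (an upper bound on a linear layer's Lipschitz constant) of each 2-D weight matrix. The coefficient name lambda_lip and the layer-selection rule are illustrative assumptions, and nothing here is BNN-specific.

```python
import torch

def lipschitz_penalty(model, lambda_lip=1e-3):
    """Illustrative regularizer: sum of spectral norms of 2-D weight matrices,
    each an upper bound on that layer's Lipschitz constant."""
    penalty = 0.0
    for name, param in model.named_parameters():
        if param.ndim == 2 and "weight" in name:
            penalty = penalty + torch.linalg.matrix_norm(param, ord=2)
    return lambda_lip * penalty

# usage: add the penalty to the task loss before calling backward()
# loss = task_loss + lipschitz_penalty(model)
```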
- Linear systems with neural network nonlinearities: Improved stability analysis via acausal Zames-Falb multipliers [0.0]
We analyze the stability of feedback interconnections of a linear time-invariant system with a neural network nonlinearity in discrete time.
Our approach provides a flexible and versatile framework for stability analysis of feedback interconnections with neural network nonlinearities.
arXiv Detail & Related papers (2021-03-31T14:21:03Z)
- Certifying Incremental Quadratic Constraints for Neural Networks via Convex Optimization [2.388501293246858]
We propose a convex program to certify incremental quadratic constraints on the map of neural networks over a region of interest.
These certificates can capture several useful properties, such as (local) Lipschitz continuity, one-sided Lipschitz continuity, invertibility, and contraction.
arXiv Detail & Related papers (2020-12-10T21:15:00Z)
- Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies [15.2292571922932]
We propose a novel architecture for recurrent neural networks.
Our proposed RNN is based on a time-discretization of a system of second-order ordinary differential equations.
Experiments show that the proposed RNN is comparable in performance to the state of the art on a variety of benchmarks.
arXiv Detail & Related papers (2020-10-02T12:35:04Z)
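The coRNN summary only states that the recurrence comes from discretizing second-order ODEs; the sketch below shows one plausible reading, rewriting a damped, oscillator-like equation h'' = tanh(W h + W_dot h' + V x + b) - gamma*h - eps*h' as a first-order system in (h, h') and stepping it with explicit Euler. The damping terms, parameter names, and the discretization order are assumptions, not necessarily the paper's scheme.

```python
import torch
import torch.nn as nn

class OscillatorRNNCell(nn.Module):
    """Sketch of a second-order (oscillator-like) recurrent cell,
    written as a coupled first-order system in position h and velocity z."""

    def __init__(self, input_size, hidden_size, gamma=1.0, eps=1.0, dt=0.05):
        super().__init__()
        self.W = nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_dot = nn.Linear(hidden_size, hidden_size, bias=False)
        self.V = nn.Linear(input_size, hidden_size)
        self.gamma, self.eps, self.dt = gamma, eps, dt

    def forward(self, x, h, z):
        # acceleration: bounded forcing minus restoring and damping terms
        accel = torch.tanh(self.W(h) + self.W_dot(z) + self.V(x)) \
                - self.gamma * h - self.eps * z
        z_new = z + self.dt * accel   # velocity update
        h_new = h + self.dt * z_new   # hidden-state update
        return h_new, z_new


# usage: roll the cell over a sequence of length 20
cell = OscillatorRNNCell(input_size=10, hidden_size=32)
h = z = torch.zeros(4, 32)
for x_t in torch.randn(20, 4, 10):
    h, z = cell(x_t, h, z)
```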
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
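The Liquid Time-constant summary above describes networks of linear first-order dynamical systems; the sketch below illustrates one hedged interpretation, a leaky first-order unit whose effective time constant is modulated by a bounded, input-dependent gate. The gate form (a sigmoid of a linear map), the parameter A, and the names tau and dt are assumptions for illustration, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class LiquidTimeConstantCell(nn.Module):
    """Sketch of a first-order time-continuous unit whose decay rate
    (its "time constant") is modulated by an input-dependent gate f:
        dh/dt = -(1/tau + f) * h + f * A
    Stepped here with a single explicit Euler update per input."""

    def __init__(self, input_size, hidden_size, tau=1.0, dt=0.1):
        super().__init__()
        self.gate = nn.Linear(input_size + hidden_size, hidden_size)
        self.A = nn.Parameter(torch.zeros(hidden_size))
        self.tau, self.dt = tau, dt

    def forward(self, x, h):
        f = torch.sigmoid(self.gate(torch.cat([x, h], dim=-1)))  # bounded gate in (0, 1)
        h_dot = -(1.0 / self.tau + f) * h + f * self.A            # input-dependent time constant
        return h + self.dt * h_dot                                 # explicit Euler step


# usage
cell = LiquidTimeConstantCell(input_size=10, hidden_size=32)
h = torch.zeros(4, 32)
h = cell(torch.randn(4, 10), h)
```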
This list is automatically generated from the titles and abstracts of the papers on this site.