Lyapunov-Guided Representation of Recurrent Neural Network Performance
- URL: http://arxiv.org/abs/2204.04876v2
- Date: Wed, 27 Dec 2023 05:19:29 GMT
- Title: Lyapunov-Guided Representation of Recurrent Neural Network Performance
- Authors: Ryan Vogt, Yang Zheng and Eli Shlizerman
- Abstract summary: Recurrent Neural Networks (RNN) are ubiquitous computing systems for sequences and time series data.
We propose to treat RNN as dynamical systems and to correlate hyperparameters with accuracy through Lyapunov spectral analysis.
Our studies of various RNN architectures show that AeLLE successfully correlates RNN Lyapunov spectrum with accuracy.
- Score: 9.449520199858952
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent Neural Networks (RNN) are ubiquitous computing systems for
sequences and multivariate time series data. While several robust architectures
of RNN are known, it is unclear how to relate RNN initialization, architecture,
and other hyperparameters with accuracy for a given task. In this work, we
propose to treat RNN as dynamical systems and to correlate hyperparameters with
accuracy through Lyapunov spectral analysis, a methodology specifically
designed for nonlinear dynamical systems. To address the fact that RNN features
go beyond the existing Lyapunov spectral analysis, we propose to infer relevant
features from the Lyapunov spectrum with an Autoencoder and an embedding of its
latent representation (AeLLE). Our studies of various RNN architectures show
that AeLLE successfully correlates RNN Lyapunov spectrum with accuracy.
Furthermore, the latent representation learned by AeLLE is generalizable to
novel inputs from the same task and is formed early in the process of RNN
training. The latter property allows for the prediction of the accuracy to
which RNN would converge when training is complete. We conclude that
representation of RNN through Lyapunov spectrum along with AeLLE provides a
novel method for organization and interpretation of variants of RNN
architectures.
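The method sketched in the abstract can be pictured with a short, hedged example: estimate an RNN's Lyapunov spectrum by propagating an orthonormal basis through the hidden-state Jacobians with QR re-orthonormalization, then train a small autoencoder on the spectra of many candidate networks and inspect its latent codes against task accuracy. This is a minimal sketch under assumed forms (a vanilla tanh RNN, NumPy/PyTorch, illustrative sizes and names), not the authors' AeLLE implementation.

```python
# Hedged sketch, not the authors' code: QR-based Lyapunov spectrum estimate for a
# vanilla tanh RNN, plus a tiny autoencoder over spectra whose latent codes play
# the role of an AeLLE-style embedding. All names and sizes are illustrative.
import numpy as np
import torch
import torch.nn as nn

def lyapunov_spectrum(W_h, W_x, b, inputs, n_exponents=8, warmup=100):
    """Estimate leading Lyapunov exponents of h_{t+1} = tanh(W_h h_t + W_x x_t + b)."""
    n = W_h.shape[0]
    h = np.zeros(n)
    Q = np.linalg.qr(np.random.randn(n, n_exponents))[0]  # orthonormal basis
    le_sum = np.zeros(n_exponents)
    steps = 0
    for t, x in enumerate(inputs):                 # inputs: array of shape (T, input_dim)
        h = np.tanh(W_h @ h + W_x @ x + b)
        J = (1.0 - h**2)[:, None] * W_h            # Jacobian of the hidden-state map
        Q, R = np.linalg.qr(J @ Q)                 # re-orthonormalize the basis
        if t >= warmup:                            # discard the transient
            le_sum += np.log(np.abs(np.diag(R)) + 1e-12)
            steps += 1
    return le_sum / max(steps, 1)                  # average log expansion rates

class SpectrumAutoencoder(nn.Module):
    """Tiny autoencoder over Lyapunov spectra; its latent codes form the embedding."""
    def __init__(self, n_exponents=8, latent=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_exponents, 32), nn.ReLU(),
                                     nn.Linear(32, latent))
        self.decoder = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(),
                                     nn.Linear(32, n_exponents))
    def forward(self, s):
        z = self.encoder(s)
        return self.decoder(z), z

# Usage sketch: collect spectra from many hyperparameter settings, train the
# autoencoder on reconstruction loss, then correlate the latent codes z with
# each network's task accuracy (or with the accuracy it later converges to).
```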
Related papers
- Scalable Mechanistic Neural Networks [52.28945097811129]
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
By reformulating the original Mechanistic Neural Network (MNN) we reduce the computational time and space complexities from cubic and quadratic with respect to the sequence length, respectively, to linear.
Extensive experiments demonstrate that S-MNN matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z)
- Universal approximation property of invertible neural networks [76.95927093274392]
Invertible neural networks (INNs) are neural network architectures with invertibility by design.
Thanks to their invertibility and the tractability of their Jacobians, INNs have various machine learning applications such as probabilistic modeling, generative modeling, and representation learning.
arXiv Detail & Related papers (2022-04-15T10:45:26Z)
- Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems [24.0378100479104]
Recurrent neural networks (RNNs) are powerful models for processing time-series data.
The framework of reverse engineering a trained RNN by linearizing around its fixed points has provided insight, but the approach has significant challenges.
We present a new model that overcomes these limitations by co-training an RNN with a novel switching linear dynamical system (SLDS) formulation.
arXiv Detail & Related papers (2021-11-01T20:49:30Z)
- Framing RNN as a kernel method: A neural ODE approach [11.374487003189468]
We show that the solution of an RNN can be viewed as a linear function of a specific feature set of the input sequence, known as the signature.
We obtain theoretical guarantees on generalization and stability for a large class of recurrent networks.
arXiv Detail & Related papers (2021-06-02T14:46:40Z)
- Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies [15.2292571922932]
We propose a novel architecture for recurrent neural networks.
Our proposed RNN is based on a time-discretization of a system of second-order ordinary differential equations (a sketch of this kind of discretization appears after this list).
Experiments show that the proposed RNN is comparable in performance to the state of the art on a variety of benchmarks.
arXiv Detail & Related papers (2020-10-02T12:35:04Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- On Lyapunov Exponents for RNNs: Understanding Information Propagation Using Dynamical Systems Tools [6.199300239433395]
Lyapunov Exponents (LEs) measure the rates of expansion and contraction of nonlinear system trajectories.
LEs have a bearing on the stability of RNN training dynamics because forward propagation of information is related to the backward propagation of error gradients.
As a tool to understand and exploit the stability of training dynamics, the Lyapunov spectrum fills an existing gap between prescriptive mathematical approaches of limited scope and computationally expensive empirical approaches.
arXiv Detail & Related papers (2020-06-25T00:53:19Z)
- Understanding Recurrent Neural Networks Using Nonequilibrium Response Theory [5.33024001730262]
Recurrent neural networks (RNNs) are brain-inspired models widely used in machine learning for analyzing sequential data.
We show how RNNs process input signals using the response theory from nonequilibrium statistical mechanics.
arXiv Detail & Related papers (2020-06-19T10:09:09Z)
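For the coRNN entry above, the following is a hedged sketch of the kind of second-order ODE time-discretization that paper describes: a recurrent cell whose hidden state follows damped, driven oscillator dynamics. The published coRNN may use a different (implicit-explicit) scheme; the cell name, constants, and sizes here are illustrative assumptions, not the original implementation.

```python
# Hedged sketch: a recurrent cell from a simple time-discretization of coupled
# second-order ODEs  y'' = tanh(W_y y + W_z y' + V u) - gamma * y - eps * y',
# with hidden state y and "velocity" z = y'. Names and constants are placeholders.
import torch
import torch.nn as nn

class OscillatoryRNNCell(nn.Module):
    def __init__(self, input_size, hidden_size, dt=0.05, gamma=1.0, eps=0.01):
        super().__init__()
        self.W_y = nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_z = nn.Linear(hidden_size, hidden_size, bias=False)
        self.V = nn.Linear(input_size, hidden_size)   # input drive (includes bias)
        self.dt, self.gamma, self.eps = dt, gamma, eps

    def forward(self, u_t, state):
        y, z = state
        accel = torch.tanh(self.W_y(y) + self.W_z(z) + self.V(u_t)) \
                - self.gamma * y - self.eps * z       # second-order dynamics
        z_next = z + self.dt * accel                   # update velocity first
        y_next = y + self.dt * z_next                  # then the hidden state
        return y_next, (y_next, z_next)

# Usage sketch: iterate the cell over a sequence and read out from the final y.
cell = OscillatoryRNNCell(input_size=3, hidden_size=16)
u = torch.randn(10, 3)                                 # (time, features), batch omitted
y, z = torch.zeros(16), torch.zeros(16)
for t in range(u.shape[0]):
    out, (y, z) = cell(u[t], (y, z))
```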
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.