Reverse engineering recurrent neural networks with Jacobian switching
linear dynamical systems
- URL: http://arxiv.org/abs/2111.01256v1
- Date: Mon, 1 Nov 2021 20:49:30 GMT
- Title: Reverse engineering recurrent neural networks with Jacobian switching
linear dynamical systems
- Authors: Jimmy T.H. Smith, Scott W. Linderman, David Sussillo
- Abstract summary: Recurrent neural networks (RNNs) are powerful models for processing time-series data.
The framework of reverse engineering a trained RNN by linearizing around its fixed points has provided insight, but the approach has significant challenges.
We present a new model that overcomes these limitations by co-training an RNN with a novel switching linear dynamical system (SLDS) formulation.
- Score: 24.0378100479104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recurrent neural networks (RNNs) are powerful models for processing
time-series data, but it remains challenging to understand how they function.
Improving this understanding is of substantial interest to both the machine
learning and neuroscience communities. The framework of reverse engineering a
trained RNN by linearizing around its fixed points has provided insight, but
the approach has significant challenges. These include difficulty choosing
which fixed point to expand around when studying RNN dynamics and error
accumulation when reconstructing the nonlinear dynamics with the linearized
dynamics. We present a new model that overcomes these limitations by
co-training an RNN with a novel switching linear dynamical system (SLDS)
formulation. A first-order Taylor series expansion of the co-trained RNN and an
auxiliary function trained to pick out the RNN's fixed points govern the SLDS
dynamics. The results are a trained SLDS variant that closely approximates the
RNN, an auxiliary function that can produce a fixed point for each point in
state-space, and a trained nonlinear RNN whose dynamics have been regularized
such that its first-order terms perform the computation, if possible. This
model removes the post-training fixed point optimization and allows us to
unambiguously study the learned dynamics of the SLDS at any point in
state-space. It also generalizes SLDS models to continuous manifolds of
switching points while sharing parameters across switches. We validate the
utility of the model on two synthetic tasks relevant to previous work reverse
engineering RNNs. We then show that our model can be used as a drop-in
replacement in more complex architectures, such as LFADS, and apply this LFADS hybrid to analyze
single-trial spiking activity from the motor system of a non-human primate.
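
To make the mechanism concrete, below is a minimal sketch (not the authors' released code) of the Jacobian-switching idea described in the abstract: a hypothetical auxiliary network maps the current hidden state to an approximate fixed point, and the next state is given by a first-order Taylor expansion of the RNN update around that fixed point and a reference input. All function names, parameter shapes, and the choice of a vanilla tanh RNN are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

def rnn_cell(params, h, u):
    # Vanilla tanh RNN update: F(h, u) = tanh(W h + B u + b).
    W, B, b = params
    return jnp.tanh(W @ h + B @ u + b)

def aux_fixed_point(aux_params, h):
    # Stand-in for the trained auxiliary function that proposes an
    # approximate fixed point h* of the RNN for the current state h.
    A, c = aux_params
    return jnp.tanh(A @ h + c)

def jslds_step(params, aux_params, h, u, u_star):
    # One switching-linear step: first-order Taylor expansion of the RNN
    # around (h*, u*):  h_{t+1} ~ F(h*, u*) + J_h (h_t - h*) + J_u (u_t - u*).
    h_star = aux_fixed_point(aux_params, h)
    f_star = rnn_cell(params, h_star, u_star)
    J_h = jax.jacobian(rnn_cell, argnums=1)(params, h_star, u_star)
    J_u = jax.jacobian(rnn_cell, argnums=2)(params, h_star, u_star)
    return f_star + J_h @ (h - h_star) + J_u @ (u - u_star)

# Toy usage with illustrative dimensions.
key = jax.random.PRNGKey(0)
n, m = 4, 2
k_w, k_b, _ = jax.random.split(key, 3)
params = (0.5 * jax.random.normal(k_w, (n, n)),
          0.5 * jax.random.normal(k_b, (n, m)),
          jnp.zeros(n))
aux_params = (jnp.eye(n), jnp.zeros(n))
h, u, u_star = jnp.zeros(n), jnp.ones(m), jnp.zeros(m)
h_next = jslds_step(params, aux_params, h, u, u_star)
```

In the full model the RNN, the auxiliary function, and the resulting SLDS are co-trained so that the linearized update closely tracks the nonlinear one; the sketch only shows the forward computation of a single switching-linear step under those assumptions.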
Related papers
- Bifurcations and loss jumps in RNN training [7.937801286897863]
We introduce a novel algorithm for detecting all fixed points and k-cycles in ReLU-based RNNs, along with their existence and stability regions.
Our algorithm provides exact results and returns fixed points and cycles up to high orders with surprisingly good scaling behavior.
arXiv Detail & Related papers (2023-10-26T16:49:44Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Networks (SNNs) are promising energy-efficient AI models when implemented on neuromorphic hardware.
Efficiently training SNNs is challenging due to their non-differentiability.
We propose the Differentiation on Spike Representation (DSR) method, which can achieve high performance.
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- Regularized Sequential Latent Variable Models with Adversarial Neural Networks [33.74611654607262]
We present different ways of using high-level latent random variables in RNNs to model the variability in sequential data.
We explore possible ways of using adversarial methods to train a variational RNN model.
arXiv Detail & Related papers (2021-08-10T08:05:14Z)
- Skip-Connected Self-Recurrent Spiking Neural Networks with Joint Intrinsic Parameter and Synaptic Weight Training [14.992756670960008]
We propose a new type of RSNN called Skip-Connected Self-Recurrent SNNs (ScSr-SNNs).
ScSr-SNNs can boost performance by up to 2.55% compared with other types of RSNNs trained by state-of-the-art BP methods.
arXiv Detail & Related papers (2020-10-23T22:27:13Z)
- A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z)
- Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies [15.2292571922932]
We propose a novel architecture for recurrent neural networks.
Our proposed RNN is based on a time-discretization of a system of second-order ordinary differential equations.
Experiments show that the proposed RNN is comparable in performance to the state of the art on a variety of benchmarks.
arXiv Detail & Related papers (2020-10-02T12:35:04Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time, we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) in terms of low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)