Understanding Recurrent Neural Networks Using Nonequilibrium Response Theory
- URL: http://arxiv.org/abs/2006.11052v2
- Date: Mon, 18 Jan 2021 17:28:17 GMT
- Title: Understanding Recurrent Neural Networks Using Nonequilibrium Response Theory
- Authors: Soon Hoe Lim
- Abstract summary: Recurrent neural networks (RNNs) are brain-inspired models widely used in machine learning for analyzing sequential data.
We show how RNNs process input signals using the response theory from nonequilibrium statistical mechanics.
- Score: 5.33024001730262
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recurrent neural networks (RNNs) are brain-inspired models widely used in
machine learning for analyzing sequential data. The present work is a
contribution towards a deeper understanding of how RNNs process input signals
using the response theory from nonequilibrium statistical mechanics. For a
class of continuous-time stochastic RNNs (SRNNs) driven by an input signal, we
derive a Volterra type series representation for their output. This
representation is interpretable and disentangles the input signal from the SRNN
architecture. The kernels of the series are certain recursively defined
correlation functions with respect to the unperturbed dynamics that completely
determine the output. Exploiting connections of this representation and its
implications to rough paths theory, we identify a universal feature -- the
response feature, which turns out to be the signature of the tensor product of the
input signal and a natural support basis. In particular, we show that SRNNs,
with only the weights in the readout layer optimized and the weights in the
hidden layer kept fixed and not optimized, can be viewed as kernel machines
operating on a reproducing kernel Hilbert space associated with the response
feature.
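To make the last claim concrete, the sketch below trains only a linear readout on top of a stochastic RNN whose hidden-layer weights are drawn once at random and then frozen. It is a minimal sketch under assumptions that are not taken from the paper: a simple Euler-Maruyama discretization of a leaky tanh SRNN, Gaussian random weight matrices, and a ridge-regression readout; all function and variable names are illustrative.

```python
# Minimal illustrative sketch, not the paper's exact model: an Euler-Maruyama
# discretization of a leaky stochastic RNN with fixed random weights, where only
# the linear readout is trained (here by ridge regression). The hidden states act
# as features of the input path, so fitting the readout alone is a linear method
# operating on those features, in the spirit of the kernel-machine view above.
import numpy as np

rng = np.random.default_rng(0)

def srnn_states(u, n_hidden=200, dt=0.01, leak=1.0, noise_std=0.0):
    """Run the fixed-weight SRNN over an input sequence u of shape (T, d)."""
    T, d = u.shape
    W = rng.normal(0.0, 1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))  # fixed recurrent weights
    U = rng.normal(0.0, 1.0, size=(n_hidden, d))                              # fixed input weights
    h = np.zeros(n_hidden)
    H = np.empty((T, n_hidden))
    for t in range(T):
        drift = -leak * h + np.tanh(W @ h + U @ u[t])
        h = h + dt * drift + noise_std * np.sqrt(dt) * rng.normal(size=n_hidden)
        H[t] = h
    return H

def fit_readout(H, y, reg=1e-3):
    """Ridge regression for the readout weights, the only trained parameters."""
    A = H.T @ H + reg * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ y)

# Toy usage: reconstruct a delayed copy of a noisy sine input from the hidden states.
T = 500
u = np.sin(0.1 * np.arange(T))[:, None] + 0.1 * rng.normal(size=(T, 1))
y = np.roll(u[:, 0], 5)
H = srnn_states(u)
w = fit_readout(H, y)
print("train MSE:", float(np.mean((H @ w - y) ** 2)))
```

In the paper's terms, the frozen dynamics map the input path to a feature built from the unperturbed correlation functions (related to the response feature and the signature), and the trained readout is a linear functional of that feature; this is what makes the kernel-machine interpretation possible.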
Related papers
- Learning local discrete features in explainable-by-design convolutional neural networks [0.0]
We introduce an explainable-by-design convolutional neural network (CNN) based on the lateral inhibition mechanism.
The model consists of a predictor, which is a high-accuracy CNN with residual or dense skip connections.
By collecting observations and directly calculating probabilities, we can explain causal relationships between motifs of adjacent levels.
arXiv Detail & Related papers (2024-10-31T18:39:41Z)
- Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z)
- Use of Parallel Explanatory Models to Enhance Transparency of Neural Network Configurations for Cell Degradation Detection [18.214293024118145]
We build a parallel model to illuminate and understand the internal operation of neural networks.
We show how each layer of the RNN transforms the input distributions to increase detection accuracy.
At the same time we also discover a side effect acting to limit the improvement in accuracy.
arXiv Detail & Related papers (2024-04-17T12:22:54Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Signal Processing for Implicit Neural Representations [80.38097216996164]
Implicit Neural Representations (INRs) encode continuous multi-media data via multi-layer perceptrons.
Existing works manipulate such continuous representations by processing their discretized instances.
We propose an implicit neural signal processing network, dubbed INSP-Net, via differential operators on INR.
arXiv Detail & Related papers (2022-10-17T06:29:07Z)
- Lyapunov-Guided Representation of Recurrent Neural Network Performance [9.449520199858952]
Recurrent Neural Networks (RNNs) are ubiquitous computing systems for sequential and time series data.
We propose to treat RNNs as dynamical systems and to correlate hyperparameters with accuracy through Lyapunov spectral analysis.
Our studies of various RNN architectures show that AeLLE successfully correlates the RNN Lyapunov spectrum with accuracy.
arXiv Detail & Related papers (2022-04-11T05:38:38Z)
- Framing RNN as a kernel method: A neural ODE approach [11.374487003189468]
We show that the solution of an RNN can be viewed as a linear function of a specific feature set of the input sequence, known as the signature.
We obtain theoretical guarantees on generalization and stability for a large class of recurrent networks.
arXiv Detail & Related papers (2021-06-02T14:46:40Z)
- Stability of Algebraic Neural Networks to Small Perturbations [179.55535781816343]
Algebraic neural networks (AlgNNs) are composed of a cascade of layers, each associated with an algebraic signal model.
We show how any architecture that uses a formal notion of convolution can be stable beyond particular choices of the shift operator.
arXiv Detail & Related papers (2020-10-22T09:10:16Z)
- Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
- Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies [15.2292571922932]
We propose a novel architecture for recurrent neural networks.
Our proposed RNN is based on a time-discretization of a system of second-order ordinary differential equations.
Experiments show that the proposed RNN is comparable in performance to the state of the art on a variety of benchmarks.
arXiv Detail & Related papers (2020-10-02T12:35:04Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.