Understanding Recurrent Neural Networks Using Nonequilibrium Response Theory
- URL: http://arxiv.org/abs/2006.11052v2
- Date: Mon, 18 Jan 2021 17:28:17 GMT
- Title: Understanding Recurrent Neural Networks Using Nonequilibrium Response Theory
- Authors: Soon Hoe Lim
- Abstract summary: Recurrent neural networks (RNNs) are brain-inspired models widely used in machine learning for analyzing sequential data.
We show how RNNs process input signals using the response theory from nonequilibrium statistical mechanics.
- Score: 5.33024001730262
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recurrent neural networks (RNNs) are brain-inspired models widely used in
machine learning for analyzing sequential data. The present work is a
contribution towards a deeper understanding of how RNNs process input signals
using the response theory from nonequilibrium statistical mechanics. For a
class of continuous-time stochastic RNNs (SRNNs) driven by an input signal, we
derive a Volterra type series representation for their output. This
representation is interpretable and disentangles the input signal from the SRNN
architecture. The kernels of the series are certain recursively defined
correlation functions with respect to the unperturbed dynamics that completely
determine the output. Exploiting connections of this representation and its
implications to rough paths theory, we identify a universal feature -- the
response feature, which turns out to be the signature of the tensor product of the
input signal and a natural support basis. In particular, we show that SRNNs,
with only the weights in the readout layer optimized and the weights in the
hidden layer kept fixed and not optimized, can be viewed as kernel machines
operating on a reproducing kernel Hilbert space associated with the response
feature.
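To make the last claim concrete, the sketch below trains only a linear readout on top of a stochastic RNN whose hidden-layer weights are drawn once at random and then frozen. It is a minimal sketch under assumptions that are not taken from the paper: a simple Euler-Maruyama discretization of a leaky tanh SRNN, Gaussian random weight matrices, and a ridge-regression readout; all function and variable names are illustrative.

```python
# Minimal illustrative sketch, not the paper's exact model: an Euler-Maruyama
# discretization of a leaky stochastic RNN with fixed random weights, where only
# the linear readout is trained (here by ridge regression). The hidden states act
# as features of the input path, so fitting the readout alone is a linear method
# operating on those features, in the spirit of the kernel-machine view above.
import numpy as np

rng = np.random.default_rng(0)

def srnn_states(u, n_hidden=200, dt=0.01, leak=1.0, noise_std=0.0):
    """Run the fixed-weight SRNN over an input sequence u of shape (T, d)."""
    T, d = u.shape
    W = rng.normal(0.0, 1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))  # fixed recurrent weights
    U = rng.normal(0.0, 1.0, size=(n_hidden, d))                              # fixed input weights
    h = np.zeros(n_hidden)
    H = np.empty((T, n_hidden))
    for t in range(T):
        drift = -leak * h + np.tanh(W @ h + U @ u[t])
        h = h + dt * drift + noise_std * np.sqrt(dt) * rng.normal(size=n_hidden)
        H[t] = h
    return H

def fit_readout(H, y, reg=1e-3):
    """Ridge regression for the readout weights, the only trained parameters."""
    A = H.T @ H + reg * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ y)

# Toy usage: reconstruct a delayed copy of a noisy sine input from the hidden states.
T = 500
u = np.sin(0.1 * np.arange(T))[:, None] + 0.1 * rng.normal(size=(T, 1))
y = np.roll(u[:, 0], 5)
H = srnn_states(u)
w = fit_readout(H, y)
print("train MSE:", float(np.mean((H @ w - y) ** 2)))
```

In the paper's terms, the frozen dynamics map the input path to a feature built from the unperturbed correlation functions (related to the response feature and the signature), and the trained readout is a linear functional of that feature; this is what makes the kernel-machine interpretation possible.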
Related papers
- Learning local discrete features in explainable-by-design convolutional neural networks [0.0]
We introduce an explainable-by-design convolutional neural network (CNN) based on the lateral inhibition mechanism.
The model consists of a predictor, which is a high-accuracy CNN with residual or dense skip connections.
By collecting observations and directly calculating probabilities, we can explain causal relationships between motifs of adjacent levels.
arXiv Detail & Related papers (2024-10-31T18:39:41Z)
- Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z)
- Use of Parallel Explanatory Models to Enhance Transparency of Neural Network Configurations for Cell Degradation Detection [18.214293024118145]
We build a parallel model to illuminate and understand the internal operation of neural networks.
We show how each layer of the RNN transforms the input distributions to increase detection accuracy.
At the same time we also discover a side effect acting to limit the improvement in accuracy.
arXiv Detail & Related papers (2024-04-17T12:22:54Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Signal Processing for Implicit Neural Representations [80.38097216996164]
Implicit Neural Representations (INRs) encode continuous multi-media data via multi-layer perceptrons.
Existing works manipulate such continuous representations by processing their discretized instances.
We propose an implicit neural signal processing network, dubbed INSP-Net, via differential operators on INR.
arXiv Detail & Related papers (2022-10-17T06:29:07Z)
- Lyapunov-Guided Representation of Recurrent Neural Network Performance [9.449520199858952]
Recurrent Neural Networks (RNNs) are ubiquitous computing systems for sequential and time series data.
We propose to treat RNNs as dynamical systems and to correlate hyperparameters with accuracy through Lyapunov spectral analysis.
Our studies of various RNN architectures show that AeLLE successfully correlates the RNN Lyapunov spectrum with accuracy.
arXiv Detail & Related papers (2022-04-11T05:38:38Z)
- Framing RNN as a kernel method: A neural ODE approach [11.374487003189468]
We show that the solution of an RNN can be viewed as a linear function of a specific feature set of the input sequence, known as the signature.
We obtain theoretical guarantees on generalization and stability for a large class of recurrent networks.
arXiv Detail & Related papers (2021-06-02T14:46:40Z)
- Stability of Algebraic Neural Networks to Small Perturbations [179.55535781816343]
Algebraic neural networks (AlgNNs) are composed of a cascade of layers, each associated with an algebraic signal model.
We show how any architecture that uses a formal notion of convolution can be stable beyond particular choices of the shift operator.
arXiv Detail & Related papers (2020-10-22T09:10:16Z)
- Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
- Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies [15.2292571922932]
We propose a novel architecture for recurrent neural networks.
Our proposed RNN is based on a time-discretization of a system of second-order ordinary differential equations.
Experiments show that the proposed RNN is comparable in performance to the state of the art on a variety of benchmarks.
arXiv Detail & Related papers (2020-10-02T12:35:04Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.