The Power of Linear Recurrent Neural Networks
- URL: http://arxiv.org/abs/1802.03308v9
- Date: Wed, 24 Jan 2024 16:22:36 GMT
- Title: The Power of Linear Recurrent Neural Networks
- Authors: Frieder Stolzenburg, Sandra Litz, Olivia Michael, Oliver Obst
- Abstract summary: We show how autoregressive linear, i.e., linearly activated recurrent neural networks (LRNNs) can approximate any time-dependent function f(t).
LRNNs outperform the previous state-of-the-art for the MSO task with a minimal number of units.
- Score: 1.124958340749622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent neural networks are a powerful means to cope with time series. We show how autoregressive linear, i.e., linearly activated recurrent neural networks (LRNNs) can approximate any time-dependent function f(t). The approximation can be learned effectively simply by solving a system of linear equations; no backpropagation or similar methods are needed. Furthermore, and this is the main contribution of this article, the size of an LRNN can be reduced significantly in one step, after inspecting the spectrum of the network transition matrix, i.e., its eigenvalues, by keeping only the most relevant components. Therefore, in contrast to other approaches, we learn not only the network weights but also the network architecture. LRNNs have interesting properties: in the long run they end up on elliptical trajectories, and they allow the prediction of further values as well as compact representations of functions. We demonstrate this in several experiments, among them multiple superimposed oscillators (MSO), robotic soccer (RoboCup), and stock price prediction. LRNNs outperform the previous state of the art on the MSO task with a minimal number of units.
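To make the two steps in the abstract concrete (learning by solving a linear equation system, then shrinking the network after inspecting the spectrum of the transition matrix), here is a minimal NumPy sketch. It is not the authors' construction: the delay-window state, the window length d, the toy two-oscillator signal, and the magnitude-based ordering of eigenvalues are assumptions made only for illustration.

```python
# Minimal illustrative sketch, not the paper's exact method: fit a linear
# recurrence x_{t+1} ~= W x_t by least squares (no backpropagation), use it
# to continue a toy signal, and inspect the spectrum of W.
import numpy as np

def fit_lrnn(series, d):
    """Learn W from windows of d consecutive samples by solving a
    linear least-squares problem."""
    X = np.array([series[t:t + d] for t in range(len(series) - d)])
    X_now, X_next = X[:-1], X[1:]
    # Solve X_now @ W.T ~= X_next for W.
    W = np.linalg.lstsq(X_now, X_next, rcond=None)[0].T
    return W

def predict(W, last_window, steps):
    """Iterate the linear recurrence to forecast further values."""
    x = np.array(last_window, dtype=float)
    out = []
    for _ in range(steps):
        x = W @ x                # advance the window by one step
        out.append(x[-1])        # the newest sample is the last entry
    return np.array(out)

# Toy signal in the spirit of the MSO task: two superimposed oscillators.
t = np.arange(400)
signal = np.sin(0.2 * t) + 0.5 * np.sin(0.311 * t)

d = 16                           # window length (illustrative choice)
W = fit_lrnn(signal[:300], d)
forecast = predict(W, signal[300 - d:300], steps=50)
print("forecast RMSE:", np.sqrt(np.mean((forecast - signal[300:350]) ** 2)))

# Spectrum of the transition matrix: eigenvalues near the unit circle carry
# the persistent oscillatory components; small-magnitude eigenvalues are
# candidates for pruning (a stand-in for the paper's relevance criterion).
eigvals = np.linalg.eigvals(W)
print("eigenvalue magnitudes:", np.sort(np.abs(eigvals))[::-1])
```

In the paper itself the reduction step keeps only the most relevant spectral components and thereby learns the architecture along with the weights; the magnitude-based ordering above merely stands in for that criterion.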
Related papers
- Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z)
- A Novel Explanation Against Linear Neural Networks [1.223779595809275]
Linear regression and neural networks are widely used to model data.
We show that neural networks without activation functions, i.e., linear neural networks (LNNs), actually reduce both training and testing performance relative to linear regression.
We prove this hypothesis through an analysis of LNN optimization and rigorous tests comparing the performance of LNNs and linear regression on noisy datasets (a sketch of the underlying layer-collapse fact appears after this list).
arXiv Detail & Related papers (2023-12-30T09:44:51Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling and Design [68.1682448368636]
We present a supervised pretraining approach to learn circuit representations that can be adapted to new unseen topologies or unseen prediction tasks.
To cope with the variable topological structure of different circuits, we describe each circuit as a graph and use graph neural networks (GNNs) to learn node embeddings.
We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties.
arXiv Detail & Related papers (2022-03-29T21:18:47Z)
- Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems [24.0378100479104]
Recurrent neural networks (RNNs) are powerful models for processing time-series data.
The framework of reverse engineering a trained RNN by linearizing around its fixed points has provided insight, but the approach has significant challenges.
We present a new model that overcomes these limitations by co-training an RNN with a novel switching linear dynamical system (SLDS) formulation.
arXiv Detail & Related papers (2021-11-01T20:49:30Z)
- Metric Entropy Limits on Recurrent Neural Network Learning of Linear Dynamical Systems [0.0]
We show that RNNs can optimally learn, or identify in system-theory parlance, stable linear time-invariant (LTI) systems.
For LTI systems whose input-output relation is characterized through a difference equation, this means that RNNs can learn the difference equation from input-output traces in a metric-entropy optimal manner.
arXiv Detail & Related papers (2021-05-06T10:12:30Z)
- A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z)
- Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than matrix product operators (MPOs) in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression, through both a theoretical analysis and practical experiments on an NLP task.
arXiv Detail & Related papers (2020-06-09T18:25:39Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
- Achieving Online Regression Performance of LSTMs with Simple RNNs [0.0]
We introduce a first-order training algorithm with a linear time complexity in the number of parameters.
We show that when SRNNs are trained with our algorithm, they provide regression performance very similar to LSTMs in two to three times shorter training time.
arXiv Detail & Related papers (2020-05-16T11:41:13Z)
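The item above on "A Novel Explanation Against Linear Neural Networks" rests on a basic fact worth making explicit: without nonlinear activations, stacked layers compose into a single linear map, so a linear neural network gains no expressive power over ordinary linear regression. A minimal sketch of that collapse (layer sizes and data are arbitrary, not taken from the paper):

```python
# Minimal sketch (not from any of the papers above): a linear neural
# network, i.e., stacked layers with no nonlinear activation, collapses
# to a single linear map. Layer sizes are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))   # first "layer": 4 inputs -> 8 hidden units
W2 = rng.normal(size=(3, 8))   # second "layer": 8 hidden units -> 3 outputs
x = rng.normal(size=4)         # one input vector

deep = W2 @ (W1 @ x)           # forward pass through two linear layers
collapsed = (W2 @ W1) @ x      # the equivalent single linear map

print(np.allclose(deep, collapsed))  # True: same function, more parameters
```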