The Power of Linear Recurrent Neural Networks
- URL: http://arxiv.org/abs/1802.03308v9
- Date: Wed, 24 Jan 2024 16:22:36 GMT
- Title: The Power of Linear Recurrent Neural Networks
- Authors: Frieder Stolzenburg, Sandra Litz, Olivia Michael, Oliver Obst
- Abstract summary: We show how autoregressive linear, i.e., linearly activated recurrent neural networks (LRNNs) can approximate any time-dependent function f(t).
LRNNs outperform the previous state-of-the-art for the MSO task with a minimal number of units.
- Score: 1.124958340749622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recurrent neural networks are a powerful means to cope with time series. We show how autoregressive linear, i.e., linearly activated recurrent neural networks (LRNNs) can approximate any time-dependent function f(t). The approximation can be learned effectively simply by solving a system of linear equations; no backpropagation or similar methods are needed. Furthermore, and this is the main contribution of this article, the size of an LRNN can be reduced significantly in one step, after inspecting the spectrum of the network transition matrix, i.e., its eigenvalues, by keeping only the most relevant components. Therefore, in contrast to other approaches, we learn not only the network weights but also the network architecture. LRNNs have interesting properties: in the long run they end up on elliptical trajectories, and they allow the prediction of further values as well as compact representations of functions. We demonstrate this in several experiments, among them multiple superimposed oscillators (MSO), robotic soccer (RoboCup), and stock price prediction. LRNNs outperform the previous state of the art on the MSO task with a minimal number of units.
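To make the two steps in the abstract concrete (learning by solving a linear equation system, then shrinking the network after inspecting the spectrum of the transition matrix), here is a minimal NumPy sketch. It is not the authors' construction: the delay-window state, the window length d, the toy two-oscillator signal, and the magnitude-based ordering of eigenvalues are assumptions made only for illustration.

```python
# Minimal illustrative sketch, not the paper's exact method: fit a linear
# recurrence x_{t+1} ~= W x_t by least squares (no backpropagation), use it
# to continue a toy signal, and inspect the spectrum of W.
import numpy as np

def fit_lrnn(series, d):
    """Learn W from windows of d consecutive samples by solving a
    linear least-squares problem."""
    X = np.array([series[t:t + d] for t in range(len(series) - d)])
    X_now, X_next = X[:-1], X[1:]
    # Solve X_now @ W.T ~= X_next for W.
    W = np.linalg.lstsq(X_now, X_next, rcond=None)[0].T
    return W

def predict(W, last_window, steps):
    """Iterate the linear recurrence to forecast further values."""
    x = np.array(last_window, dtype=float)
    out = []
    for _ in range(steps):
        x = W @ x                # advance the window by one step
        out.append(x[-1])        # the newest sample is the last entry
    return np.array(out)

# Toy signal in the spirit of the MSO task: two superimposed oscillators.
t = np.arange(400)
signal = np.sin(0.2 * t) + 0.5 * np.sin(0.311 * t)

d = 16                           # window length (illustrative choice)
W = fit_lrnn(signal[:300], d)
forecast = predict(W, signal[300 - d:300], steps=50)
print("forecast RMSE:", np.sqrt(np.mean((forecast - signal[300:350]) ** 2)))

# Spectrum of the transition matrix: eigenvalues near the unit circle carry
# the persistent oscillatory components; small-magnitude eigenvalues are
# candidates for pruning (a stand-in for the paper's relevance criterion).
eigvals = np.linalg.eigvals(W)
print("eigenvalue magnitudes:", np.sort(np.abs(eigvals))[::-1])
```

In the paper itself the reduction step keeps only the most relevant spectral components and thereby learns the architecture along with the weights; the magnitude-based ordering above merely stands in for that criterion.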
Related papers
- Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z)
- A Novel Explanation Against Linear Neural Networks [1.223779595809275]
Linear regression and neural networks are widely used to model data.
We show that neural networks without activation functions, i.e., linear neural networks (LNNs), actually reduce both training and testing performance relative to linear regression.
We prove this hypothesis through an analysis of LNN optimization and rigorous tests comparing the performance of LNNs and linear regression on noisy datasets (a sketch of the underlying layer-collapse fact appears after this list).
arXiv Detail & Related papers (2023-12-30T09:44:51Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Pretraining Graph Neural Networks for few-shot Analog Circuit Modeling and Design [68.1682448368636]
We present a supervised pretraining approach to learn circuit representations that can be adapted to new unseen topologies or unseen prediction tasks.
To cope with the variable topological structure of different circuits, we describe each circuit as a graph and use graph neural networks (GNNs) to learn node embeddings.
We show that pretraining GNNs on prediction of output node voltages can encourage learning representations that can be adapted to new unseen topologies or prediction of new circuit level properties.
arXiv Detail & Related papers (2022-03-29T21:18:47Z)
- Reverse engineering recurrent neural networks with Jacobian switching linear dynamical systems [24.0378100479104]
Recurrent neural networks (RNNs) are powerful models for processing time-series data.
The framework of reverse engineering a trained RNN by linearizing around its fixed points has provided insight, but the approach has significant challenges.
We present a new model that overcomes these limitations by co-training an RNN with a novel switching linear dynamical system (SLDS) formulation.
arXiv Detail & Related papers (2021-11-01T20:49:30Z)
- Metric Entropy Limits on Recurrent Neural Network Learning of Linear Dynamical Systems [0.0]
We show that RNNs can optimally learn, or identify in system-theory parlance, stable linear time-invariant (LTI) systems.
For LTI systems whose input-output relation is characterized through a difference equation, this means that RNNs can learn the difference equation from input-output traces in a metric-entropy optimal manner.
arXiv Detail & Related papers (2021-05-06T10:12:30Z)
- A Fully Tensorized Recurrent Neural Network [48.50376453324581]
We introduce a "fully tensorized" RNN architecture which jointly encodes the separate weight matrices within each recurrent cell.
This approach reduces model size by several orders of magnitude, while still maintaining similar or better performance compared to standard RNNs.
arXiv Detail & Related papers (2020-10-08T18:24:12Z)
- Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than matrix product operators (MPOs) in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression, through both a theoretical analysis and practical experiments on an NLP task.
arXiv Detail & Related papers (2020-06-09T18:25:39Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
- Achieving Online Regression Performance of LSTMs with Simple RNNs [0.0]
We introduce a first-order training algorithm with a linear time complexity in the number of parameters.
We show that when SRNNs are trained with our algorithm, they provide regression performance very similar to LSTMs in two to three times shorter training time.
arXiv Detail & Related papers (2020-05-16T11:41:13Z)
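The item above on "A Novel Explanation Against Linear Neural Networks" rests on a basic fact worth making explicit: without nonlinear activations, stacked layers compose into a single linear map, so a linear neural network gains no expressive power over ordinary linear regression. A minimal sketch of that collapse (layer sizes and data are arbitrary, not taken from the paper):

```python
# Minimal sketch (not from any of the papers above): a linear neural
# network, i.e., stacked layers with no nonlinear activation, collapses
# to a single linear map. Layer sizes are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 4))   # first "layer": 4 inputs -> 8 hidden units
W2 = rng.normal(size=(3, 8))   # second "layer": 8 hidden units -> 3 outputs
x = rng.normal(size=4)         # one input vector

deep = W2 @ (W1 @ x)           # forward pass through two linear layers
collapsed = (W2 @ W1) @ x      # the equivalent single linear map

print(np.allclose(deep, collapsed))  # True: same function, more parameters
```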