Real-Time Recurrent Learning using Trace Units in Reinforcement Learning
- URL: http://arxiv.org/abs/2409.01449v2
- Date: Wed, 30 Oct 2024 04:07:51 GMT
- Title: Real-Time Recurrent Learning using Trace Units in Reinforcement Learning
- Authors: Esraa Elelimy, Adam White, Michael Bowling, Martha White
- Abstract summary: Recurrent Neural Networks (RNNs) are used to learn representations in partially observable environments.
For agents that learn online and continually interact with the environment, it is desirable to train RNNs with real-time recurrent learning (RTRL).
We build on these insights to provide a lightweight but effective approach for training RNNs in online RL.
- Score: 27.250024431890477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recurrent Neural Networks (RNNs) are used to learn representations in partially observable environments. For agents that learn online and continually interact with the environment, it is desirable to train RNNs with real-time recurrent learning (RTRL); unfortunately, RTRL is prohibitively expensive for standard RNNs. A promising direction is to use linear recurrent architectures, such as Linear Recurrent Units (LRUs), where dense recurrent weights are replaced with a complex-valued diagonal, making RTRL efficient. In this work, we build on these insights to provide a lightweight but effective approach for training RNNs in online RL. We introduce Recurrent Trace Units (RTUs), a small modification of LRUs that we nonetheless find to have significant performance benefits over LRUs when trained with RTRL. We find RTUs significantly outperform other recurrent architectures across several partially observable environments while using significantly less computation.
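The computational saving comes from the diagonal recurrence: each hidden unit's sensitivity to its own parameters can be carried forward as a single trace, so RTRL costs no more than the forward pass. Below is a minimal NumPy sketch of this idea, using a real-valued diagonal for readability (LRUs use a complex-valued diagonal, and RTUs a related parametrization, but the trace structure is the same); all sizes, the loss, and the learning rate are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, lr = 4, 8, 1e-2
lam = rng.uniform(0.5, 0.99, n_hidden)       # diagonal recurrent weights
W = 0.1 * rng.normal(size=(n_hidden, n_in))  # input weights

h = np.zeros(n_hidden)
d_lam = np.zeros(n_hidden)                   # RTRL trace: dh_i/dlam_i
d_W = np.zeros((n_hidden, n_in))             # RTRL trace: dh_i/dW_ij

for t in range(200):
    x = rng.normal(size=n_in)
    # Update the traces before h: they reference h_{t-1}.
    d_lam = h + lam * d_lam
    d_W = lam[:, None] * d_W + x[None, :]
    h = lam * h + W @ x
    # Dummy loss L = 0.5 * ||h||^2 gives dL/dh = h; a real agent would get
    # this error signal from the RL objective.
    err = h
    # Because the recurrence is diagonal, these updates cost O(#params) per
    # step, whereas RTRL with a dense recurrent matrix is prohibitive.
    lam -= lr * err * d_lam
    W -= lr * err[:, None] * d_W
```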
Related papers
- Real-Time Recurrent Reinforcement Learning [7.737685867200335]
RTRRL consists of three parts: (1) a Meta-RL RNN architecture that implements an actor-critic algorithm on its own; (2) an outer reinforcement learning algorithm, exploiting temporal-difference learning and Dutch eligibility traces to train the Meta-RL network; and (3) random-feedback local-online (RFLO) learning, an online automatic-differentiation algorithm for computing the gradients with respect to the parameters of the network.
arXiv Detail & Related papers (2023-11-08T16:56:16Z)
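As a concrete reference for part (2), here is a minimal NumPy sketch of true-online TD(lambda) with Dutch eligibility traces for linear value estimation, in the style of Sutton and Barto; the random features and rewards are placeholders, not the RTRRL architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
n_feat, alpha, gamma, lam = 16, 0.01, 0.99, 0.9
w = np.zeros(n_feat)   # value-function weights
e = np.zeros(n_feat)   # Dutch eligibility trace
v_old = 0.0

x = rng.normal(size=n_feat)              # features of the current state
for t in range(1000):
    x_next = rng.normal(size=n_feat)     # features after a (dummy) transition
    r = rng.normal()                     # dummy reward
    v, v_next = w @ x, w @ x_next
    delta = r + gamma * v_next - v
    # Dutch-trace update: decay the trace and add the current features, with
    # a correction term that makes the online update exactly match the
    # offline lambda-return ("true online" equivalence).
    e = gamma * lam * e + (1.0 - alpha * gamma * lam * (e @ x)) * x
    w += alpha * (delta + v - v_old) * e - alpha * (v - v_old) * x
    v_old, x = v_next, x_next
```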
- Universal Approximation of Linear Time-Invariant (LTI) Systems through RNNs: Power of Randomness in Reservoir Computing [19.995241682744567]
Reservoir computing (RC) is a special RNN where the recurrent weights are randomized and left untrained.
We show that RC can universally approximate a general linear time-invariant (LTI) system.
arXiv Detail & Related papers (2023-08-04T17:04:13Z)
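A minimal sketch of this setting: a reservoir with random, untrained recurrent weights and a trained linear readout, fit by ridge regression to imitate a simple LTI target (a one-pole filter). Sizes, the spectral-radius rescaling, and the target system are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(2)
n_res, T = 100, 2000
W_res = rng.normal(size=(n_res, n_res))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))  # stable reservoir
w_in = rng.normal(size=n_res)

# Target LTI system: y_t = 0.8 * y_{t-1} + u_t.
u = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.8 * y[t - 1] + u[t]

# Drive the reservoir; a linear activation keeps the whole map LTI.
H = np.zeros((T, n_res))
for t in range(1, T):
    H[t] = W_res @ H[t - 1] + w_in * u[t]

# Only the readout is trained (ridge regression); the recurrent weights
# stay at their random initialization, as in reservoir computing.
ridge = 1e-6
w_out = np.linalg.solve(H.T @ H + ridge * np.eye(n_res), H.T @ y)
print("readout MSE:", np.mean((H @ w_out - y) ** 2))
```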
- Exploring the Promise and Limits of Real-Time Recurrent Learning [14.162274619299902]
Real-time recurrent learning (RTRL) for sequence-processing recurrent neural networks (RNNs) offers certain conceptual advantages over backpropagation through time (BPTT).
We study actor-critic methods that combine RTRL and policy gradients, and test them on several subsets of DMLab-30, ProcGen, and Atari 2600 environments.
Our system, trained on fewer than 1.2B environment frames, is competitive with or outperforms well-known IMPALA and R2D2 baselines trained on 10B frames.
arXiv Detail & Related papers (2023-05-30T13:59:21Z)
- Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their predecessors.
In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL).
We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z)
- Single-Shot Pruning for Offline Reinforcement Learning [47.886329599997474]
Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems.
One way to tackle this problem is to prune neural networks, leaving only the necessary parameters.
We close the gap between RL and single-shot pruning techniques and present a general pruning approach for offline RL.
arXiv Detail & Related papers (2021-12-31T18:10:02Z)
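For illustration only, a single-shot magnitude-pruning sketch: score weights once by absolute value and zero all but a fixed fraction before further training. The paper's actual criterion may differ; the function name and keep fraction are hypothetical.

```python
import numpy as np

def prune_once(weights: np.ndarray, keep_frac: float = 0.05) -> np.ndarray:
    # Keep only the k largest-magnitude weights; zero everything else.
    flat = np.abs(weights).ravel()
    k = max(1, int(keep_frac * flat.size))
    threshold = np.partition(flat, -k)[-k]   # k-th largest magnitude
    return weights * (np.abs(weights) >= threshold)

W = np.random.default_rng(5).normal(size=(64, 64))
W_sparse = prune_once(W)
print("kept:", np.count_nonzero(W_sparse), "of", W.size)
```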
- Online learning of windmill time series using Long Short-term Cognitive Networks [58.675240242609064]
The amount of data generated on windmill farms makes online learning the most viable strategy to follow.
We use Long Short-term Cognitive Networks (LSTCNs) to forecast windmill time series in online settings.
Our approach achieved the lowest forecasting errors compared with a simple RNN, a Long Short-term Memory, a Gated Recurrent Unit, and a Hidden Markov Model.
arXiv Detail & Related papers (2021-07-01T13:13:24Z)
- Critic Regularized Regression [70.8487887738354]
We propose a novel offline RL algorithm to learn policies from data using a form of critic-regularized regression (CRR).
We find that CRR performs surprisingly well and scales to tasks with high-dimensional state and action spaces.
arXiv Detail & Related papers (2020-06-26T17:50:26Z)
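The core idea can be written in a few lines: behavioral cloning on dataset actions, reweighted by a function of the critic's advantage estimate. The sketch below uses random placeholders for the critic and policy outputs; the binary and clipped-exponential weightings are the two common choices for CRR-style objectives, with the clip value being an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(3)
batch = 32
adv = rng.normal(size=batch)           # A(s,a) = Q(s,a) - V(s) from a critic
log_pi = rng.normal(size=batch) - 2.0  # log pi(a|s) for the dataset actions

# Two common weightings for CRR-style objectives:
w_binary = (adv > 0).astype(float)            # keep only improving actions
beta = 1.0
w_exp = np.minimum(np.exp(adv / beta), 20.0)  # exponential, clipped

# Filtered behavioral cloning: maximize log-likelihood of "good" actions.
policy_loss = -np.mean(w_binary * log_pi)
print("policy loss:", policy_loss)
```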
- Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs).
We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs.
We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
- A Practical Sparse Approximation for Real Time Recurrent Learning [38.19296522866088]
Real Time Recurrent Learning (RTRL) eliminates the need for history storage and allows for online weight updates.
We introduce the Sparse n-step Approximation (SnAp) to the RTRL influence matrix, which only keeps entries that are nonzero within n steps of the recurrent core.
For highly sparse networks, SnAp with n=2 remains tractable and can outperform backpropagation through time in terms of learning speed when updates are done online.
arXiv Detail & Related papers (2020-06-12T14:38:15Z)
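To make the approximation concrete, here is a sketch of SnAp-1 for a dense tanh RNN: the influence matrix is truncated to the entries that become nonzero within one step of the recurrent core, i.e. dh_i/dW_ij, leaving one trace per parameter, so the update cost matches the forward pass. Sizes and the dummy loss are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n_in, n_h, lr = 3, 16, 1e-2
W = 0.1 * rng.normal(size=(n_h, n_h))   # recurrent weights
U = 0.1 * rng.normal(size=(n_h, n_in))  # input weights

h = np.zeros(n_h)
tr_W = np.zeros((n_h, n_h))    # approximate dh_i/dW_ij
tr_U = np.zeros((n_h, n_in))   # approximate dh_i/dU_ij

for t in range(200):
    x = rng.normal(size=n_in)
    h_prev = h
    h = np.tanh(W @ h_prev + U @ x)
    d = 1.0 - h ** 2              # tanh derivative at the new state
    # Under SnAp-1 only the self-recurrence W[i, i] propagates a kept
    # trace entry back into itself; all other paths are dropped.
    w_diag = np.diag(W)[:, None]
    tr_W = d[:, None] * (w_diag * tr_W + h_prev[None, :])
    tr_U = d[:, None] * (w_diag * tr_U + x[None, :])
    err = h                       # dL/dh for a dummy loss L = 0.5 * ||h||^2
    W -= lr * err[:, None] * tr_W
    U -= lr * err[:, None] * tr_U
```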
- Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning [67.34810824996887]
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments.
We propose Iterated Relearning (ITER) to improve generalisation of deep RL agents.
arXiv Detail & Related papers (2020-06-10T13:26:31Z)
- Achieving Online Regression Performance of LSTMs with Simple RNNs [0.0]
We introduce a first-order training algorithm with a linear time complexity in the number of parameters.
We show that when SRNNs are trained with our algorithm, they achieve regression performance very similar to LSTMs with two to three times shorter training time.
arXiv Detail & Related papers (2020-05-16T11:41:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.