Recurrent Reinforcement Learning with Memoroids
- URL: http://arxiv.org/abs/2402.09900v3
- Date: Mon, 28 Oct 2024 05:15:15 GMT
- Title: Recurrent Reinforcement Learning with Memoroids
- Authors: Steven Morad, Chris Lu, Ryan Kortvelesy, Stephan Liwicki, Jakob Foerster, Amanda Prorok
- Abstract summary: We study memory models such as Recurrent Neural Networks (RNNs) and Transformers, which address Partially Observable Markov Decision Processes (POMDPs) by mapping trajectories to latent Markov states.
Neither model scales particularly well to long sequences, especially compared to an emerging class of memory models called Linear Recurrent Models.
We reformulate existing models using a novel monoid-based framework that we call memoroids.
- Score: 11.302674177386383
- Abstract: Memory models such as Recurrent Neural Networks (RNNs) and Transformers address Partially Observable Markov Decision Processes (POMDPs) by mapping trajectories to latent Markov states. Neither model scales particularly well to long sequences, especially compared to an emerging class of memory models called Linear Recurrent Models. We discover that the recurrent update of these models resembles a monoid, leading us to reformulate existing models using a novel monoid-based framework that we call memoroids. We revisit the traditional approach to batching in recurrent reinforcement learning, highlighting theoretical and empirical deficiencies. We leverage memoroids to propose a batching method that improves sample efficiency, increases the return, and simplifies the implementation of recurrent loss functions in reinforcement learning.
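To make the monoid view concrete, here is a minimal sketch (not the authors' implementation; the scalar recurrence and all names are illustrative assumptions) of how a linear recurrent update h_t = a_t * h_{t-1} + b_t can be expressed through an associative binary operator, which is what allows a trajectory to be folded in any grouping and therefore computed with an associative scan.
```python
# Minimal sketch, assuming a scalar linear recurrence h_t = a_t*h_{t-1} + b_t.
# Each memoroid-style element (a, b) encodes the affine map h -> a*h + b;
# composing two such maps is associative, so any fold order gives the same state.
import math
from functools import reduce

IDENTITY = (1.0, 0.0)  # the identity map h -> h

def combine(e1, e2):
    """Associative operator: apply e1 first, then e2."""
    a1, b1 = e1
    a2, b2 = e2
    return (a2 * a1, a2 * b1 + b2)

def final_state(elements, h0=0.0):
    """Fold a trajectory of elements into the final latent state."""
    a, b = reduce(combine, elements, IDENTITY)
    return a * h0 + b

# Example: a leaky integrator h_t = 0.9*h_{t-1} + x_t over one trajectory.
xs = [1.0, 0.5, -0.2, 2.0]
elements = [(0.9, x) for x in xs]
print(final_state(elements))

# Associativity (up to floating point): grouping does not change the result.
left = combine(combine(elements[0], elements[1]), elements[2])
right = combine(elements[0], combine(elements[1], elements[2]))
assert all(math.isclose(l, r) for l, r in zip(left, right))
```
Because the operator is associative, prefix states for a batch of concatenated trajectories can be produced by a single scan; handling episode boundaries inside such a concatenated sequence is part of the paper's batching contribution and is not shown in this sketch.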
Related papers
- Bayesian sparsification for deep neural networks with Bayesian model reduction [0.6144680854063939]
We advocate for the use of Bayesian model reduction (BMR) as a more efficient alternative for pruning of model weights.
BMR allows a post-hoc elimination of redundant model weights based on the posterior estimates under a straightforward (non-hierarchical) generative model.
We illustrate the potential of BMR across various deep learning architectures, from classical networks like LeNet to modern frameworks such as Vision Transformers and MLP-Mixers.
arXiv Detail & Related papers (2023-09-21T14:10:47Z)
- ResMem: Learn what you can and memorize the rest [79.19649788662511]
We propose the residual-memorization (ResMem) algorithm to augment an existing prediction model.
By construction, ResMem can explicitly memorize the training labels.
We show that ResMem consistently improves the test set generalization of the original prediction model.
arXiv Detail & Related papers (2023-02-03T07:12:55Z)
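To make the ResMem entry above concrete, here is a hedged sketch of the residual-memorization pattern it describes: a base model is fit first, its training residuals are memorized with a k-nearest-neighbor regressor, and the retrieved residual is added back at prediction time. The ridge base model, the synthetic data, and all names are illustrative assumptions, not the paper's setup.
```python
# Illustrative sketch of residual memorization (not the paper's code):
# "learn what you can" with a base model, "memorize the rest" with k-NN.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

base = Ridge().fit(X, y)                        # base prediction model
residuals = y - base.predict(X)                 # what the base model misses
memory = KNeighborsRegressor(n_neighbors=5).fit(X, residuals)

def resmem_predict(x):
    # Final prediction = base model output + retrieved residual correction.
    return base.predict(x) + memory.predict(x)

print(resmem_predict(X[:3]))
```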
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- State-driven Implicit Modeling for Sparsity and Robustness in Neural Networks [3.604879434384177]
We present a new approach to training implicit models, called State-driven Implicit Modeling (SIM).
SIM constrains the internal states and outputs to match those of a baseline model, circumventing costly backward computations.
We demonstrate how the SIM approach can be applied to significantly improve the sparsity and robustness of baseline models.
arXiv Detail & Related papers (2022-09-19T23:58:48Z)
- Towards performant and reliable undersampled MR reconstruction via diffusion model sampling [67.73698021297022]
DiffuseRecon is a novel diffusion model-based MR reconstruction method.
It guides the generation process based on the observed signals.
It does not require additional training on specific acceleration factors.
arXiv Detail & Related papers (2022-03-08T02:25:38Z)
- Measuring and Reducing Model Update Regression in Structured Prediction for NLP [31.86240946966003]
Backward compatibility requires that the new model does not regress on cases that were correctly handled by its predecessor.
This work studies model update regression in structured prediction tasks.
We propose a simple and effective method, Backward-Congruent Re-ranking (BCR), by taking into account the characteristics of structured output.
arXiv Detail & Related papers (2022-02-07T07:04:54Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Automatic Recall Machines: Internal Replay, Continual Learning and the Brain [104.38824285741248]
Replay in neural networks involves training on sequential data with memorized samples, which counteracts forgetting of previous behavior caused by non-stationarity.
We present a method where these auxiliary samples are generated on the fly, given only the model that is being trained for the assessed objective.
Instead, the implicit memory of learned samples within the assessed model itself is exploited.
arXiv Detail & Related papers (2020-06-22T15:07:06Z)
- Dynamic Model Pruning with Feedback [64.019079257231]
We propose a novel model compression method that generates a sparse trained model without additional overhead.
We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models.
arXiv Detail & Related papers (2020-06-12T15:07:08Z)
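For the last entry above, the following is a hedged toy sketch of one way dynamic pruning with feedback can be realized: a magnitude mask is recomputed every step, the gradient is evaluated at the pruned weights, and the update is applied to the dense weights so pruned entries can later re-enter the mask. The least-squares setup, hyperparameters, and names are illustrative assumptions, not the paper's experiments.
```python
# Toy sketch of dynamic pruning with feedback on a least-squares problem:
# the mask is recomputed from the dense weights each step, the gradient is
# taken at the pruned weights, and the update is applied to the dense weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 20))
w_true = np.zeros(20)
w_true[:5] = rng.normal(size=5)                 # sparse ground-truth weights
y = X @ w_true + 0.01 * rng.normal(size=256)

w = 0.1 * rng.normal(size=20)                   # dense parameters, kept throughout
k, lr = 5, 0.05                                 # target sparsity and step size

for _ in range(200):
    mask = np.zeros_like(w)
    mask[np.argsort(np.abs(w))[-k:]] = 1.0      # keep the k largest magnitudes
    w_sparse = w * mask                         # the model actually evaluated
    grad = X.T @ (X @ w_sparse - y) / len(y)    # gradient at the sparse weights
    w -= lr * grad                              # feedback: dense weights updated

print("final mask support:", np.flatnonzero(mask))
print("sparse-model error:", float(np.linalg.norm(w * mask - w_true)))
```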
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.