HTMRL: Biologically Plausible Reinforcement Learning with Hierarchical
Temporal Memory
- URL: http://arxiv.org/abs/2009.08880v1
- Date: Fri, 18 Sep 2020 15:05:17 GMT
- Title: HTMRL: Biologically Plausible Reinforcement Learning with Hierarchical
Temporal Memory
- Authors: Jakob Struye, Kevin Mets, Steven Latré
- Abstract summary: We present HTMRL, the first strictly HTM-based Reinforcement Learning algorithm.
We empirically and statistically show that HTMRL scales to many states and actions, and demonstrate that HTM's ability to adapt to changing patterns extends to RL.
HTMRL is the first iteration of a novel RL approach, with the potential to extend into a capable algorithm for Meta-RL.
- Score: 1.138723572165938
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building Reinforcement Learning (RL) algorithms which are able to adapt to
continuously evolving tasks is an open research challenge. One technology that
is known to inherently handle such non-stationary input patterns well is
Hierarchical Temporal Memory (HTM), a general and biologically plausible
computational model for the human neocortex. As the RL paradigm is inspired by
human learning, HTM is a natural framework for an RL algorithm supporting
non-stationary environments. In this paper, we present HTMRL, the first
strictly HTM-based RL algorithm. We empirically and statistically show that
HTMRL scales to many states and actions, and demonstrate that HTM's ability to
adapt to changing patterns extends to RL. Specifically, HTMRL performs well
on a 10-armed bandit after 750 steps, but only needs a third of that to adapt
to the bandit suddenly shuffling its arms. HTMRL is the first iteration of a
novel RL approach, with the potential to extend into a capable algorithm for
Meta-RL.
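The evaluation protocol in the abstract is concrete enough to sketch. Below is a minimal, illustrative reproduction of the non-stationary testbed (not the authors' code): a 10-armed bandit whose arm-to-reward mapping is shuffled mid-run, with a simple epsilon-greedy agent standing in for HTMRL. The class, function, and parameter names (ShufflingBandit, pull, shuffle) and the reward distributions are assumptions.

```python
import numpy as np

class ShufflingBandit:
    """A 10-armed bandit whose arm-to-reward mapping can be shuffled
    mid-run, mirroring the non-stationary testbed from the abstract.
    Names and reward distributions are illustrative assumptions."""

    def __init__(self, n_arms=10, seed=0):
        self.rng = np.random.default_rng(seed)
        self.means = self.rng.normal(0.0, 1.0, size=n_arms)  # mean reward per arm

    def pull(self, arm):
        return self.rng.normal(self.means[arm], 1.0)  # noisy reward sample

    def shuffle(self):
        self.rng.shuffle(self.means)  # permute arms: the agent must re-adapt

# Protocol sketch: learn for 750 steps, shuffle, then watch how quickly the
# agent recovers (the paper reports HTMRL needs roughly a third of 750 steps).
bandit = ShufflingBandit()
q = np.zeros(10)  # epsilon-greedy stand-in agent with per-arm value estimates
for step in range(1500):
    if step == 750:
        bandit.shuffle()
    explore = bandit.rng.random() < 0.1
    arm = int(bandit.rng.integers(10)) if explore else int(np.argmax(q))
    r = bandit.pull(arm)
    q[arm] += 0.1 * (r - q[arm])  # constant step size suits non-stationary rewards
```

A constant step size (rather than an incremental sample mean) is used so the stand-in agent keeps tracking the reward means after the shuffle.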
Related papers
- Understanding the Synergies between Quality-Diversity and Deep
Reinforcement Learning [4.788163807490196]
Generalized Actor-Critic QD-RL is a unified modular framework for actor-critic deep RL methods in the QD-RL setting.
We introduce two new algorithms, PGA-ME (SAC) and PGA-ME (DroQ), which apply recent advancements in Deep RL to the QD-RL setting.
arXiv Detail & Related papers (2023-03-10T19:02:42Z)
- Train Hard, Fight Easy: Robust Meta Reinforcement Learning [78.16589993684698]
A major challenge of reinforcement learning (RL) in real-world applications is the variation between environments, tasks or clients.
Standard MRL methods optimize the average return over tasks, but often suffer from poor results in tasks of high risk or difficulty.
In this work, we define a robust MRL objective with a controlled robustness level.
The resulting data inefficiency is addressed via the novel Robust Meta RL algorithm (RoML).
arXiv Detail & Related papers (2023-01-26T14:54:39Z)
- Entropy Regularized Reinforcement Learning with Cascading Networks [9.973226671536041]
Deep RL uses neural networks as function approximators.
One of the major difficulties of RL is the absence of i.i.d. data.
In this work, we challenge the (un)supervised learning community's common practice of using a fixed neural architecture.
arXiv Detail & Related papers (2022-10-16T10:28:59Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed, but require large amounts of interaction between the agent and the environment.
We propose a new method to address this, using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- LCRL: Certified Policy Synthesis via Logically-Constrained Reinforcement Learning [78.2286146954051]
LCRL implements model-free Reinforcement Learning (RL) algorithms over unknown Markov Decision Processes (MDPs).
We present case studies to demonstrate the applicability, ease of use, scalability, and performance of LCRL.
arXiv Detail & Related papers (2022-09-21T13:21:00Z)
- Beyond Tabula Rasa: Reincarnating Reinforcement Learning [37.201451908129386]
Learning tabula rasa, that is, without any prior knowledge, is the prevalent workflow in reinforcement learning (RL) research.
We present reincarnating RL as an alternative workflow, where prior computational work is reused or transferred between design iterations of an RL agent.
We find that existing approaches fail in this setting and propose a simple algorithm to address their limitations.
arXiv Detail & Related papers (2022-06-03T15:11:10Z)
- Heuristic-Guided Reinforcement Learning [31.056460162389783]
Tabula rasa RL algorithms require environment interactions or computation that scales with the horizon of the decision-making task.
Our framework can be viewed as a horizon-based regularization for controlling bias and variance in RL under a finite interaction budget.
In particular, we introduce the novel concept of an "improvable heuristic" -- a heuristic that allows an RL agent to extrapolate beyond its prior knowledge (see the sketch below).
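For intuition, horizon-based regularization with a heuristic can be sketched as reward reshaping. This is a minimal sketch of that family of methods, not code from the paper: the heuristic h_next, the blending coefficient lam, and the specific reshaping formula are all illustrative assumptions.

```python
def reshaped_reward(r, h_next, gamma=0.99, lam=0.9):
    """Blend the environment reward with a heuristic value of the next state.

    Illustrative sketch: planning then uses the shortened discount
    gamma * lam, and the heuristic fills in the truncated long-term value.
    lam=1 recovers the original problem; lam=0 trusts the heuristic fully.
    """
    return r + (1.0 - lam) * gamma * h_next
```

Smaller lam shortens the effective horizon, trading bias from an imperfect heuristic against the variance of long-horizon learning under a finite interaction budget.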
arXiv Detail & Related papers (2021-06-05T00:04:09Z)
- RL-DARTS: Differentiable Architecture Search for Reinforcement Learning [62.95469460505922]
We introduce RL-DARTS, one of the first applications of Differentiable Architecture Search (DARTS) in reinforcement learning (RL).
By replacing the image encoder with a DARTS supernet, our search method is sample-efficient, requires minimal extra compute resources, and is also compatible with off-policy and on-policy RL algorithms, needing only minor changes in preexisting code.
We show that the supernet gradually learns better cells, leading to alternative architectures which can be highly competitive against manually designed policies; we also verify previous design choices for RL policies.
arXiv Detail & Related papers (2021-06-04T03:08:43Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- Maximum Entropy RL (Provably) Solves Some Robust RL Problems [94.80212602202518]
We prove theoretically that standard maximum entropy RL is robust to some disturbances in the dynamics and the reward function.
Our results suggest that MaxEnt RL by itself is robust to certain disturbances, without requiring any additional modifications.
arXiv Detail & Related papers (2021-03-10T18:45:48Z)
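As background for the result summarized above: maximum entropy RL augments the return with a policy-entropy bonus. Below is a minimal, textbook-style sketch of that objective (not code from the paper); the names maxent_return and alpha, and the single-sample entropy estimate, are illustrative.

```python
import numpy as np

def maxent_return(rewards, action_log_probs, alpha=0.1, gamma=0.99):
    """Discounted return with the standard entropy bonus:
        J = E[ sum_t gamma^t * (r_t + alpha * H(pi(.|s_t))) ]
    The per-step entropy H is estimated from the sampled action as
    -log pi(a_t | s_t), an illustrative single-sample estimate."""
    total = 0.0
    for t, (r, logp) in enumerate(zip(rewards, action_log_probs)):
        total += gamma ** t * (r - alpha * logp)  # -logp estimates entropy
    return total

# Example: two steps of rewards and sampled-action log-probabilities.
print(maxent_return([1.0, 0.5], [np.log(0.5), np.log(0.25)]))
```

One common intuition for the robustness claim: the entropy bonus keeps the policy stochastic, so it never relies on a single action too heavily, which hedges against perturbations of the dynamics or reward.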
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.