Planning with RL and episodic-memory behavioral priors
- URL: http://arxiv.org/abs/2207.01845v2
- Date: Thu, 7 Jul 2022 09:04:54 GMT
- Title: Planning with RL and episodic-memory behavioral priors
- Authors: Shivansh Beohar and Andrew Melnik
- Abstract summary: Learning from behavioral priors is a promising way to bootstrap agents with a better-than-random exploration policy.
We present a planning-based approach that can use these behavioral priors for effective exploration and learning in a reinforcement learning environment.
- Score: 0.20305676256390934
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The practical application of learning agents requires sample-efficient and
interpretable algorithms. Learning from behavioral priors is a promising way to
bootstrap agents with a better-than-random exploration policy or a safeguard
against the pitfalls of early learning. Existing solutions for imitation
learning require a large number of expert demonstrations and rely on
hard-to-interpret learning methods like Deep Q-learning. In this work we
present a planning-based approach that can use these behavioral priors for
effective exploration and learning in a reinforcement learning environment, and
we demonstrate that curated exploration policies in the form of behavioral
priors can help an agent learn faster.
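To make the exploration idea concrete, here is a minimal sketch (not the paper's algorithm) of how a behavioral prior, e.g. an action suggested by an episodic memory of earlier trajectories, can replace uniform random exploration in a tabular Q-learning action selector; the names `prior_policy` and `q_values` are illustrative assumptions.

```python
import random

def choose_action(state, q_values, prior_policy, n_actions, eps=0.1, prior_prob=0.5):
    """Epsilon-greedy action selection whose exploratory branch defers to a
    behavioral prior instead of always acting uniformly at random."""
    if random.random() < eps:
        if random.random() < prior_prob:
            return prior_policy(state)       # curated / episodic-memory prior action
        return random.randrange(n_actions)   # plain random exploration
    # greedy action under the current Q estimates (missing entries default to 0)
    return max(range(n_actions), key=lambda a: q_values.get((state, a), 0.0))
```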
Related papers
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can improve performance under assumptions that are similar to, and potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potentially suboptimal human expert.
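A hedged sketch of that reward relabeling, assuming transitions are stored as dicts with an `intervened` flag (an illustrative format, not the paper's implementation):

```python
def intervention_reward(intervened: bool) -> float:
    """Reward derived only from the human intervention signal:
    -1 when the expert intervenes, 0 otherwise."""
    return -1.0 if intervened else 0.0

def relabel_transitions(transitions):
    """Overwrite environment rewards with intervention-based rewards before
    handing the data to an ordinary off-policy RL learner."""
    for t in transitions:
        t["reward"] = intervention_reward(t["intervened"])
    return transitions
```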
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Inapplicable Actions Learning for Knowledge Transfer in Reinforcement Learning [3.194414753332705]
We show that learning inapplicable actions greatly improves the sample efficiency of RL algorithms.
Thanks to the transferability of the knowledge acquired, it can be reused in other tasks and domains to make the learning process more efficient.
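One plausible way to exploit such knowledge is to mask out actions predicted to be inapplicable before sampling, as in the short sketch below (an illustration of action masking in general, not the paper's exact method):

```python
import numpy as np

def masked_action_distribution(logits, applicable):
    """Softmax restricted to actions predicted applicable in the current state.
    `logits` are the policy's raw scores; `applicable` is a boolean array."""
    masked = np.where(applicable, logits, -np.inf)   # drop inapplicable actions
    z = np.exp(masked - masked.max())                # stable softmax over the rest
    return z / z.sum()

# Example: the second action is predicted inapplicable and gets probability 0.
probs = masked_action_distribution(np.array([1.0, 2.0, 0.5]),
                                   np.array([True, False, True]))
```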
arXiv Detail & Related papers (2022-11-28T17:45:39Z) - Learning and Retrieval from Prior Data for Skill-based Imitation Learning [47.59794569496233]
We develop a skill-based imitation learning framework that extracts temporally extended sensorimotor skills from prior data.
We identify several key design choices that significantly improve performance on novel tasks.
arXiv Detail & Related papers (2022-10-20T17:34:59Z) - Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
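A minimal sketch of the two-policy rollout structure, assuming a gym-style environment interface and callable policies (both assumptions made for illustration):

```python
def jump_start_rollout(env, guide_policy, explore_policy, guide_steps, max_steps=200):
    """The guide policy controls the first `guide_steps` steps of the episode,
    then the learning (exploration) policy takes over; shrinking `guide_steps`
    over training yields a curriculum of progressively harder start states."""
    obs, _ = env.reset()
    trajectory = []
    for t in range(max_steps):
        policy = guide_policy if t < guide_steps else explore_policy
        action = policy(obs)
        obs_next, reward, terminated, truncated, _ = env.step(action)
        trajectory.append((obs, action, reward, obs_next))
        obs = obs_next
        if terminated or truncated:
            break
    return trajectory
```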
arXiv Detail & Related papers (2022-04-05T17:25:22Z) - A Survey of Exploration Methods in Reinforcement Learning [64.01676570654234]
Reinforcement learning agents depend crucially on exploration to obtain informative data for the learning process.
In this article, we provide a survey of modern exploration methods in sequential reinforcement learning, as well as a taxonomy of exploration methods.
arXiv Detail & Related papers (2021-09-01T02:36:14Z) - Coverage as a Principle for Discovering Transferable Behavior in Reinforcement Learning [16.12658895065585]
We argue that representation alone is not enough for efficient transfer in challenging domains and explore how to transfer knowledge through behavior.
The behavior of pre-trained policies may be used for solving the task at hand (exploitation) or for collecting useful data to solve the problem (exploration).
arXiv Detail & Related papers (2021-02-24T16:51:02Z) - Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
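A rough sketch of acting through a learned prior: the RL agent chooses a latent vector and a pre-trained mapping turns (observation, latent) into an action, so even random latents yield structured behavior. The linear decoder below is purely illustrative; in the paper the prior is learned from successful trials.

```python
import numpy as np

class BehavioralPriorDecoder:
    """Stand-in for a pre-trained behavioral prior: maps (obs, z) to an action.
    Real weights would come from pre-training; random ones are used here only
    so the sketch runs."""

    def __init__(self, obs_dim, latent_dim, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W_obs = 0.1 * rng.normal(size=(action_dim, obs_dim))
        self.W_z = 0.1 * rng.normal(size=(action_dim, latent_dim))

    def act(self, obs, z):
        # The RL agent outputs z; the prior decodes it into an environment action.
        return np.tanh(self.W_obs @ obs + self.W_z @ z)
```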
arXiv Detail & Related papers (2020-11-19T18:47:40Z) - Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning.
The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior.
Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
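A one-function sketch of what "off-policy learning with a behavior prior" can look like, written here as a KL-regularized objective over a discrete action distribution (the formulation and names are assumptions, not the paper's exact objective):

```python
import numpy as np

def kl_regularized_objective(q_values, policy_probs, prior_probs, alpha=0.1, eps=1e-8):
    """Expected Q-value under the policy minus an alpha-weighted KL penalty for
    deviating from the behavior prior; all arrays are per-action."""
    expected_q = float(np.sum(policy_probs * q_values))
    kl = float(np.sum(policy_probs * np.log((policy_probs + eps) / (prior_probs + eps))))
    return expected_q - alpha * kl
```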
arXiv Detail & Related papers (2020-09-10T14:16:58Z) - Efficient Exploration in Constrained Environments with Goal-Oriented Reference Path [15.679210057474922]
We train a deep convolutional network that can predict collision-free paths based on a map of the environment.
The predicted path is then used by a reinforcement learning algorithm that learns to follow it closely.
We show that our method consistently improves the sample efficiency and generalization capability to novel environments.
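A small sketch of how a predicted reference path can shape the RL reward, assuming the path is simply an array of (x, y) waypoints produced elsewhere (e.g. by the convolutional network mentioned above):

```python
import numpy as np

def path_following_reward(agent_xy, reference_path):
    """Negative distance to the nearest waypoint: the closer the agent stays to
    the predicted collision-free path, the higher the reward."""
    waypoints = np.asarray(reference_path, dtype=float)
    dists = np.linalg.norm(waypoints - np.asarray(agent_xy, dtype=float), axis=1)
    return -float(dists.min())
```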
arXiv Detail & Related papers (2020-03-03T17:07:47Z)