COG: Connecting New Skills to Past Experience with Offline Reinforcement
Learning
- URL: http://arxiv.org/abs/2010.14500v1
- Date: Tue, 27 Oct 2020 17:57:29 GMT
- Title: COG: Connecting New Skills to Past Experience with Offline Reinforcement
Learning
- Authors: Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey
Levine
- Abstract summary: We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
- Score: 78.13740204156858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning has been applied to a wide variety of robotics
problems, but most such applications involve collecting data from scratch
for each new task. Since the amount of robot data we can collect for any single
task is limited by time and cost considerations, the learned behavior is
typically narrow: the policy can only execute the task in a handful of
scenarios that it was trained on. What if there was a way to incorporate a
large amount of prior data, either from previously solved tasks or from
unsupervised or undirected environment interaction, to extend and generalize
learned behaviors? While most prior work on extending robotic skills using
pre-collected data focuses on building explicit hierarchies or skill
decompositions, we show in this paper that we can reuse prior data to extend
new skills simply through dynamic programming. We show that even when the prior
data does not actually succeed at solving the new task, it can still be
utilized for learning a better policy, by providing the agent with a broader
understanding of the mechanics of its environment. We demonstrate the
effectiveness of our approach by chaining together several behaviors seen in
prior datasets for solving a new task, with our hardest experimental setting
involving composing four robotic skills in a row: picking, placing, drawer
opening, and grasping, where a +1/0 sparse reward is provided only on task
completion. We train our policies in an end-to-end fashion, mapping
high-dimensional image observations to low-level robot control commands, and
present results in both simulated and real world domains. Additional materials
and source code can be found on our project website:
https://sites.google.com/view/cog-rl
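The core mechanism in the abstract, propagating value from a sparse new-task reward back through unrewarded prior data via dynamic programming, can be illustrated with a short sketch. The snippet below is a minimal illustration, not the authors' implementation: it assumes synthetic low-dimensional states in place of image observations, uses a plain fitted-Q backup with a random-action maximization as a stand-in for the conservative offline RL machinery (e.g., CQL-style regularization) a real system would need, and all dimensions and hyperparameters are made-up placeholders.

```python
# Minimal sketch (assumed, not the authors' code): Bellman backups over a
# union of prior-task data and sparse-reward new-task data. Dimensions,
# networks, and hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, GAMMA = 32, 4, 0.99  # toy sizes, not the paper's

class QNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1)).squeeze(-1)

def sample_batch(n, success_rate):
    """Synthetic (s, a, r, s') transitions. Prior data carries no reward;
    new-task data gets the +1/0 sparse reward on (simulated) success."""
    s = torch.randn(n, STATE_DIM)
    a = torch.randn(n, ACTION_DIM)
    s_next = torch.randn(n, STATE_DIM)
    r = (torch.rand(n) < success_rate).float()
    return s, a, r, s_next

q_net, q_target = QNetwork(), QNetwork()
q_target.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=3e-4)

for step in range(1000):
    # Key idea: train on the union of prior data (reward always 0 for the
    # new task) and new-task data. The Bellman backup propagates value
    # from rewarded new-task states back through prior-data transitions,
    # so behaviors seen only in prior data become useful sub-skills.
    prior = sample_batch(128, success_rate=0.0)
    new = sample_batch(128, success_rate=0.1)
    s, a, r, s_next = (torch.cat(pair) for pair in zip(prior, new))

    with torch.no_grad():
        # Crude stand-in for max_a' Q(s', a'): maximize over random action
        # samples. An offline method such as CQL would instead constrain
        # this backup to actions supported by the data.
        candidates = torch.randn(10, s_next.shape[0], ACTION_DIM)
        q_next = torch.stack([q_target(s_next, c) for c in candidates])
        target = r + GAMMA * q_next.max(dim=0).values

    loss = ((q_net(s, a) - target) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if step % 200 == 0:
        q_target.load_state_dict(q_net.state_dict())  # periodic target sync
```

On real image observations the Q-network would start with a convolutional encoder, and the naive max over random actions would be replaced by an actor or a conservative value estimate; the sketch only shows why unrewarded prior transitions can still contribute to learning a new sparse-reward task.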
Related papers
- Online Continual Learning For Interactive Instruction Following Agents [20.100312650193228]
We argue that such a learning scenario is less realistic since a robotic agent is supposed to learn the world continuously as it explores and perceives it.
We propose two continual learning setups for embodied agents; learning new behaviors and new environments.
arXiv Detail & Related papers (2024-03-12T11:33:48Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Learning and Retrieval from Prior Data for Skill-based Imitation Learning [47.59794569496233]
We develop a skill-based imitation learning framework that extracts temporally extended sensorimotor skills from prior data.
We identify several key design choices that significantly improve performance on novel tasks.
arXiv Detail & Related papers (2022-10-20T17:34:59Z)
- Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials [97.95400776235736]
We present a framework based on offline RL that attempts to effectively learn new tasks.
It combines pre-training on existing robotic datasets with rapid fine-tuning on a new task, with as few as 10 demonstrations.
To our knowledge, PTR is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot.
arXiv Detail & Related papers (2022-10-11T06:30:53Z)
- Bridge Data: Boosting Generalization of Robotic Skills with Cross-Domain Datasets [122.85598648289789]
We study how multi-domain and multi-task datasets can improve the learning of new tasks in new environments.
We also find that data for only a few tasks in a new domain can bridge the domain gap and make it possible for a robot to perform a variety of prior tasks that were only seen in other domains.
arXiv Detail & Related papers (2021-09-27T23:42:12Z)
- Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills [93.12417203541948]
We propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset.
We find that our method can operate on high-dimensional camera images and learn a variety of skills on real robots that generalize to previously unseen scenes and objects.
arXiv Detail & Related papers (2021-04-15T20:10:11Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.