Real World Offline Reinforcement Learning with Realistic Data Source
- URL: http://arxiv.org/abs/2210.06479v1
- Date: Wed, 12 Oct 2022 17:57:05 GMT
- Title: Real World Offline Reinforcement Learning with Realistic Data Source
- Authors: Gaoyue Zhou, Liyiming Ke, Siddhartha Srinivasa, Abhinav Gupta, Aravind
Rajeswaran, Vikash Kumar
- Abstract summary: Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience.
Current ORL benchmarks are almost entirely in simulation and utilize contrived datasets like replay buffers of online RL agents or sub-optimal trajectories.
In this work (Real-ORL), we posit that data collected from safe operations of closely related tasks are more practical data sources for real-world robot learning.
- Score: 33.7474988142367
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Offline reinforcement learning (ORL) holds great promise for robot learning
due to its ability to learn from arbitrary pre-generated experience. However,
current ORL benchmarks are almost entirely in simulation and utilize contrived
datasets like replay buffers of online RL agents or sub-optimal trajectories,
and thus hold limited relevance for real-world robotics. In this work
(Real-ORL), we posit that data collected from safe operations of closely
related tasks are more practical data sources for real-world robot learning.
Under these settings, we perform an extensive (6500+ trajectories collected
over 800+ robot hours and 270+ human-labor hours) empirical study evaluating
generalization and transfer capabilities of representative ORL methods on four
real-world tabletop manipulation tasks. Our study finds that ORL and imitation
learning prefer different action spaces, and that ORL algorithms can generalize
by leveraging heterogeneous offline data sources and outperform imitation
learning. We release our dataset and implementations at
https://sites.google.com/view/real-orl
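As a concrete illustration of the offline setting studied here, below is a minimal behavior-cloning sketch over pre-collected trajectories; the dimensions, network, and synthetic data are illustrative assumptions, not the released Real-ORL dataset or the authors' code.

```python
# Minimal behavior-cloning sketch over pre-collected trajectories.
# Everything here (dimensions, network, synthetic data) is an
# illustrative assumption, not the released Real-ORL dataset format.
import torch
import torch.nn as nn

obs_dim, act_dim = 32, 7  # assumed: proprioceptive state -> joint targets

# Synthetic stand-in for logged robot trajectories.
obs = torch.randn(4096, obs_dim)
act = torch.randn(4096, act_dim)

policy = nn.Sequential(
    nn.Linear(obs_dim, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, act_dim),
)
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

for step in range(1000):
    idx = torch.randint(0, obs.shape[0], (256,))      # random minibatch
    loss = nn.functional.mse_loss(policy(obs[idx]), act[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()
```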
Related papers
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z)
- Robotic Offline RL from Internet Videos via Value-Function Pre-Training [67.44673316943475]
We develop a system for leveraging large-scale human video datasets in robotic offline RL.
We show that value learning on video datasets learns representations more conducive to downstream robotic offline RL than other approaches.
arXiv Detail & Related papers (2023-09-22T17:59:14Z)
- A Real-World Quadrupedal Locomotion Benchmark for Offline Reinforcement Learning [27.00483962026472]
We benchmark 11 offline reinforcement learning algorithms on a realistic quadrupedal locomotion dataset.
Experiments show that the best-performing ORL algorithms achieve performance competitive with model-free RL.
Our proposed benchmark will serve as a development platform for testing and evaluating the performance of ORL algorithms in real-world legged locomotion tasks.
arXiv Detail & Related papers (2023-09-13T13:18:29Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that this simple change to the action interface substantially improves both learning efficiency and task performance.
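Below is a hedged toy sketch of such a parameterized-primitive action interface; the primitive set, signatures, and controller are invented for illustration and are not the paper's actual RAPS library.

```python
# Toy parameterized-primitive interface in the spirit of RAPS.
# The primitives, their signatures, and the simple controller are
# assumptions for illustration, not the paper's implementation.
import numpy as np

def reach(state, target_xyz):
    # Move end-effector pose a fraction of the way toward a target point.
    return state + 0.1 * (np.asarray(target_xyz) - state)

def grasp(state, width):
    # Close gripper to a width; pose unchanged in this toy model.
    return state

PRIMITIVES = [(reach, 3), (grasp, 1)]  # (function, number of continuous args)

def execute(state, primitive_id, args):
    fn, n_args = PRIMITIVES[primitive_id]
    return fn(state, args[:n_args])

# An RL policy would output (primitive_id, args) instead of raw torques:
state = np.zeros(3)
state = execute(state, primitive_id=0, args=np.array([0.5, 0.2, 0.1]))
```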
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- A Workflow for Offline Model-Free Robotic Reinforcement Learning [117.07743713715291]
Offline reinforcement learning (RL) enables learning control policies using only prior experience, without any online interaction.
We develop a practical workflow for offline RL analogous to the relatively well-understood workflows for supervised learning problems.
We demonstrate the efficacy of this workflow in producing effective policies without any online tuning.
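As a hedged illustration of one possible offline-workflow step, the sketch below ranks saved Q-function checkpoints by TD error on held-out offline data, with no online tuning; this heuristic and all names in it are assumptions, not the paper's actual protocol.

```python
# Illustrative offline model-selection step (an assumption, not the
# paper's protocol): score Q-function checkpoints by TD error on
# held-out offline transitions, avoiding any online tuning.
import torch

def heldout_td_error(q_net, batch, gamma=0.99):
    s, a, r, s2, a2 = batch  # held-out (state, action, reward, next state, next action)
    with torch.no_grad():
        target = r + gamma * q_net(torch.cat([s2, a2], -1))
        return torch.mean((q_net(torch.cat([s, a], -1)) - target) ** 2).item()

# Toy usage with random data and a small Q-network:
q_net = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, 1))
batch = (torch.randn(64, 7), torch.randn(64, 3), torch.randn(64, 1),
         torch.randn(64, 7), torch.randn(64, 3))
print(heldout_td_error(q_net, batch))
# best = min(checkpoints, key=lambda q: heldout_td_error(q, batch))
```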
arXiv Detail & Related papers (2021-09-22T16:03:29Z)
- Robotic Surgery With Lean Reinforcement Learning [0.8258451067861933]
We describe adding reinforcement learning support to the da Vinci Skill Simulator.
We teach an RL-based agent to perform sub-tasks in the simulator environment, using either image or state data.
We tackle the sample inefficiency of RL using a simple-to-implement system which we term hybrid-batch learning (HBL).
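A minimal sketch of what a hybrid batch could look like, assuming HBL mixes pre-collected offline transitions with fresh online ones in each gradient batch; the 50/50 split and buffer layout are assumptions, not the paper's exact recipe.

```python
# Hedged sketch of hybrid-batch sampling: mix offline (pre-collected)
# and online (fresh) transitions in each update batch. The split ratio
# and buffer structure are assumptions, not the paper's exact scheme.
import random

offline_buffer = [("s", "a", 1.0, "s'")] * 10_000   # pre-generated transitions
online_buffer = []                                   # filled during interaction

def sample_hybrid_batch(batch_size=256, online_frac=0.5):
    n_online = min(int(batch_size * online_frac), len(online_buffer))
    batch = random.sample(online_buffer, n_online) if n_online else []
    batch += random.sample(offline_buffer, batch_size - n_online)
    return batch

print(len(sample_hybrid_batch()))  # 256, all offline while online buffer is empty
```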
arXiv Detail & Related papers (2021-05-03T16:52:26Z)
- S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning [28.947071041811586]
Offline reinforcement learning proposes to learn policies from large collected datasets without environment interaction.
Current algorithms overfit to the dataset they are trained on and generalize poorly out of distribution when deployed in the environment.
We propose a Surprisingly Simple Self-Supervision algorithm (S4RL) which utilizes data augmentations from states to learn value functions that are better at generalizing and extrapolating when deployed in the environment.
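A hedged sketch of the state-augmentation idea: perturb states with small Gaussian noise when evaluating the Q-function so values smooth over nearby states. The noise scale, averaging, and network are illustrative choices, not the S4RL implementation.

```python
# Sketch of S4RL-style state augmentation: average Q-values over
# noise-perturbed copies of the state. Noise scale and the number of
# augmentations are illustrative assumptions.
import torch

def augmented_q(q_net, state, action, n_aug=4, sigma=1e-3):
    qs = []
    for _ in range(n_aug):
        noisy = state + sigma * torch.randn_like(state)   # perturb the state
        qs.append(q_net(torch.cat([noisy, action], dim=-1)))
    return torch.stack(qs).mean(dim=0)                    # smoothed value estimate

# Toy usage with an assumed small Q-network:
q_net = torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.ReLU(),
                            torch.nn.Linear(64, 1))
s, a = torch.randn(32, 7), torch.randn(32, 3)
q = augmented_q(q_net, s, a)
```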
arXiv Detail & Related papers (2021-03-10T20:13:21Z)
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704]
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
arXiv Detail & Related papers (2020-06-24T17:14:51Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
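A minimal sketch of an AWAC-style actor update, weighting the log-likelihood of dataset actions by exp(advantage / lambda); the networks, advantage estimate, and weight clipping here are simplifications, not the authors' implementation.

```python
# Minimal AWAC-style actor loss: advantage-weighted regression toward
# dataset actions. Network sizes, the advantage estimate, and the
# clipping constant are simplifying assumptions.
import torch
import torch.nn as nn

obs_dim, act_dim, lam = 8, 2, 1.0
actor = nn.Linear(obs_dim, act_dim)          # mean of a unit-variance Gaussian

def awac_actor_loss(q_net, states, actions):
    with torch.no_grad():
        v = q_net(torch.cat([states, actor(states)], -1))    # policy-action baseline
        adv = q_net(torch.cat([states, actions], -1)) - v    # advantage of dataset action
        weights = torch.exp(adv / lam).clamp(max=20.0)       # clip for stability
    dist = torch.distributions.Normal(actor(states), 1.0)
    log_prob = dist.log_prob(actions).sum(-1, keepdim=True)
    return -(weights * log_prob).mean()

# Toy usage with an assumed Q-network and random offline data:
q_net = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, 1))
loss = awac_actor_loss(q_net, torch.randn(32, obs_dim), torch.randn(32, act_dim))
```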
This list is automatically generated from the titles and abstracts of the papers on this site.