A Real-World Quadrupedal Locomotion Benchmark for Offline Reinforcement
Learning
- URL: http://arxiv.org/abs/2309.16718v1
- Date: Wed, 13 Sep 2023 13:18:29 GMT
- Title: A Real-World Quadrupedal Locomotion Benchmark for Offline Reinforcement
Learning
- Authors: Hongyin Zhang, Shuyu Yang and Donglin Wang
- Abstract summary: We benchmark 11 offline reinforcement learning algorithms on a realistic quadrupedal locomotion dataset.
Experiments show that the best-performing ORL algorithms can achieve performance competitive with model-free online RL.
Our proposed benchmark will serve as a development platform for testing and evaluating the performance of ORL algorithms in real-world legged locomotion tasks.
- Score: 27.00483962026472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online reinforcement learning (RL) methods are often data-inefficient or
unreliable, making them difficult to train on real robotic hardware, especially
quadruped robots. Learning robotic tasks from pre-collected data is a promising
direction. Meanwhile, agile and stable legged robotic locomotion remains an
open problem in its general form. Offline reinforcement learning (ORL) has
the potential to make breakthroughs in this challenging field, but its current
bottleneck lies in the lack of diverse datasets for challenging realistic
tasks. To facilitate the development of ORL, we benchmark 11 ORL algorithms
on a realistic quadrupedal locomotion dataset. The dataset is collected with a
classic model predictive control (MPC) method, rather than the model-free
online RL methods commonly used by previous benchmarks. Extensive experimental
results show that the best-performing ORL algorithms can achieve performance
competitive with model-free online RL, and even surpass it in some tasks.
However, there is still a gap between the learning-based methods and MPC,
especially in terms of stability and rapid adaptation. Our proposed benchmark
will serve as a development platform for testing and evaluating the performance
of ORL algorithms in real-world legged locomotion tasks.
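To make concrete what "training an ORL algorithm on pre-collected data" involves, below is a minimal behavior-cloning baseline in PyTorch: it fits a policy to (state, action) pairs such as those logged by an MPC controller, with no online interaction. The tensor shapes and random stand-in data are illustrative assumptions, not the benchmark's actual dataset format or API.

```python
# Minimal behavior-cloning baseline sketch (hypothetical data layout,
# not the benchmark's actual API): learn a policy from (state, action)
# pairs logged by an MPC controller, with no online interaction.
import torch
import torch.nn as nn

obs_dim, act_dim = 48, 12                # assumed obs size / 12 joint targets
states = torch.randn(10_000, obs_dim)    # stand-in for the logged MPC dataset
actions = torch.randn(10_000, act_dim)

policy = nn.Sequential(
    nn.Linear(obs_dim, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, act_dim),
)
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

for step in range(1_000):
    idx = torch.randint(0, states.shape[0], (256,))   # random minibatch
    loss = ((policy(states[idx]) - actions[idx]) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

More conservative ORL algorithms differ mainly in how they regularize the policy or value function against out-of-distribution actions; the data pipeline above stays the same.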
Related papers
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z)
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample-efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, are extremely robust even under perturbations, and exhibit emergent recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
- Finetuning Offline World Models in the Real World [13.46766121896684]
Reinforcement Learning (RL) is notoriously data-inefficient, which makes training on a real robot difficult.
Offline RL has been proposed as a framework for training RL policies on pre-existing datasets without any online interaction.
In this work, we consider the problem of pretraining a world model with offline data collected on a real robot, and then finetuning the model on online data collected by planning with the learned model.
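As a rough sketch of the pretrain-then-finetune recipe this summary describes (using a simple one-step MLP dynamics model rather than the paper's latent world model, and stand-in data throughout):

```python
# Sketch of offline world-model pretraining followed by online finetuning.
# The one-step MLP dynamics model is a simplification, not the paper's method.
import torch
import torch.nn as nn

obs_dim, act_dim = 24, 6   # hypothetical dimensions

model = nn.Sequential(nn.Linear(obs_dim + act_dim, 256), nn.ReLU(),
                      nn.Linear(256, obs_dim))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def fit(s, a, s_next, steps):
    for _ in range(steps):
        pred = model(torch.cat([s, a], dim=-1))   # predict next state
        loss = ((pred - s_next) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

# Phase 1: pretrain on a large offline dataset collected on the real robot.
s_off, a_off, s1_off = (torch.randn(5000, obs_dim), torch.randn(5000, act_dim),
                        torch.randn(5000, obs_dim))
fit(s_off, a_off, s1_off, steps=500)

# Phase 2: finetune on the much smaller online dataset gathered by planning
# with the learned model.
s_on, a_on, s1_on = (torch.randn(200, obs_dim), torch.randn(200, act_dim),
                     torch.randn(200, obs_dim))
fit(s_on, a_on, s1_on, steps=50)
```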
arXiv Detail & Related papers (2023-10-24T17:46:12Z)
- Combining model-predictive control and predictive reinforcement learning for stable quadrupedal robot locomotion [0.0]
We study how stable locomotion can be achieved by a combination of model-predictive and predictive reinforcement learning controllers.
In this work, we combine both control methods to address the quadrupedal robot stable gait generation problem.
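For context, the model-predictive half of such a hybrid controller can be sketched as a random-shooting MPC loop: sample candidate action sequences, score them by rolling them through a dynamics model, and execute only the first action of the best sequence before replanning. The dynamics and cost below are placeholders, not the paper's controller:

```python
# Minimal random-shooting MPC sketch (placeholder dynamics and cost).
import numpy as np

def dynamics(s, a):
    return s + 0.1 * a            # placeholder model: s' = s + 0.1 a

def cost(s):
    return float(np.sum(s ** 2))  # placeholder: drive the state to zero

def mpc_action(s0, horizon=10, n_samples=256, act_dim=2, rng=np.random):
    best_cost, best_first = np.inf, None
    for _ in range(n_samples):
        seq = rng.uniform(-1.0, 1.0, size=(horizon, act_dim))
        s, total = s0, 0.0
        for a in seq:             # roll the sampled sequence through the model
            s = dynamics(s, a)
            total += cost(s)
        if total < best_cost:
            best_cost, best_first = total, seq[0]
    return best_first             # execute only the first action, then replan

state = np.array([1.0, -0.5])
print(mpc_action(state))
```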
arXiv Detail & Related papers (2023-07-15T09:22:37Z)
- Real World Offline Reinforcement Learning with Realistic Data Source [33.7474988142367]
Offline reinforcement learning (ORL) holds great promise for robot learning due to its ability to learn from arbitrary pre-generated experience.
Current ORL benchmarks are almost entirely in simulation and utilize contrived datasets like replay buffers of online RL agents or sub-optimal trajectories.
In this work (Real-ORL), we posit that data collected from safe operations of closely related tasks are more practical data sources for real-world robot learning.
arXiv Detail & Related papers (2022-10-12T17:57:05Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
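A hedged sketch of that action interface: the policy outputs a primitive index plus continuous arguments, and scripted controllers execute the chosen primitive. The primitive names and toy state updates below are illustrative, not the RAPS codebase:

```python
# Sketch of a parameterized-primitive action interface (illustrative only):
# the RL policy picks a primitive and its continuous arguments; a scripted
# controller turns that choice into motion.
import numpy as np

def move_delta(state, args):
    return state + args                  # args = (dx, dy, dz), toy effect

def grasp(state, args):
    return state * (1.0 - args[0])       # args[0] = grip strength, toy effect

PRIMITIVES = [move_delta, grasp]

def step_with_primitive(state, policy_output):
    # policy_output = (primitive_index, continuous_args); the policy learns
    # both which primitive to fire and how to parameterize it.
    idx, args = policy_output
    return PRIMITIVES[idx](state, np.asarray(args))

state = np.zeros(3)
state = step_with_primitive(state, (0, [0.1, 0.0, -0.05]))  # small translation
state = step_with_primitive(state, (1, [0.8]))              # strong grasp
print(state)
```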
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL [82.93243616342275]
We introduce Offline Model-based RL with Adaptive Behavioral Priors (MABE).
MABE is based on the finding that dynamics models, which support within-domain generalization, and behavioral priors, which support cross-domain generalization, are complementary.
In experiments that require cross-domain generalization, we find that MABE outperforms prior methods.
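One illustrative way to combine the two ingredients (not necessarily MABE's exact objective) is an actor loss that maximizes a model-based advantage while regularizing actions toward a learned behavioral prior:

```python
# Hedged sketch: maximize a model-based advantage while keeping actions
# close to those a learned behavioral prior considers likely.
import torch
import torch.nn as nn

obs_dim, act_dim = 16, 4
policy = nn.Linear(obs_dim, act_dim)
prior = nn.Linear(obs_dim, act_dim)      # stands in for a learned behavior prior

def actor_loss(states, model_advantage, beta=0.1):
    a = policy(states)
    # Gaussian prior with unit variance: log-prob reduces to a squared distance.
    log_prior = -0.5 * ((a - prior(states)) ** 2).sum(-1)
    return -(model_advantage(states, a) + beta * log_prior).mean()

adv = lambda s, a: -(a ** 2).sum(-1)     # placeholder model-based advantage
loss = actor_loss(torch.randn(32, obs_dim), adv)
loss.backward()
```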
arXiv Detail & Related papers (2021-06-16T20:48:49Z)
- Offline Reinforcement Learning from Images with Latent Space Models [60.69745540036375]
Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions.
We build on recent advances in model-based algorithms for offline RL, and extend them to high-dimensional visual observation spaces.
Our approach is both tractable in practice and corresponds to maximizing a lower bound of the ELBO in the unknown POMDP.
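For reference, a standard evidence lower bound for a sequential latent-state model with observations o_t, actions a_t, and latents s_t takes the following generic form (the textbook bound, not necessarily the paper's exact objective):

```latex
% Generic sequential ELBO: reconstruct observations from latents while
% keeping the inference posterior close to the learned latent dynamics.
\mathcal{L} = \sum_{t} \Big(
    \mathbb{E}_{q(s_t \mid o_{\le t}, a_{<t})} \big[ \log p(o_t \mid s_t) \big]
  - \mathbb{E}_{q(s_{t-1})} \,
    \mathrm{KL}\big( q(s_t \mid o_{\le t}, a_{<t}) \,\|\, p(s_t \mid s_{t-1}, a_{t-1}) \big)
\Big)
```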
arXiv Detail & Related papers (2020-12-21T18:28:17Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
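The actor update at the heart of AWAC is compact enough to sketch: a behavior-cloning loss on dataset actions, reweighted by the exponentiated advantage. The fixed-variance Gaussian policy and placeholder advantages below are simplifying assumptions:

```python
# Sketch of AWAC's advantage-weighted actor update: behavior cloning on
# dataset actions, reweighted by exp(advantage / lambda). The advantages
# here are placeholders; AWAC trains a critic with TD learning to supply them.
import torch
import torch.nn as nn

obs_dim, act_dim, lam = 17, 6, 1.0
policy = nn.Linear(obs_dim, act_dim)     # mean of a fixed-variance Gaussian
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

def awac_actor_step(states, actions, advantages):
    # Gaussian log-likelihood up to a constant: -||a - mu(s)||^2 / 2.
    log_prob = -0.5 * ((actions - policy(states)) ** 2).sum(-1)
    weights = torch.exp(advantages / lam).clamp(max=100.0).detach()
    loss = -(weights * log_prob).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

s = torch.randn(64, obs_dim)
a = torch.randn(64, act_dim)
adv = torch.randn(64)                    # placeholder critic advantages
awac_actor_step(s, a, adv)
```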
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.