Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2109.11978v1
- Date: Fri, 24 Sep 2021 14:04:19 GMT
- Title: Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning
- Authors: Nikita Rudin, David Hoeller, Philipp Reist, and Marco Hutter
- Abstract summary: We present and study a training set-up that achieves fast policy generation for real-world robotic tasks by using massive parallelism on a single workstation GPU.
We analyze and discuss the impact of different training algorithm components in the massively parallel regime on the final policy performance and training times.
We present a novel game-inspired curriculum that is well suited for training with thousands of simulated robots in parallel.
- Score: 2.930703970709558
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we present and study a training set-up that achieves fast
policy generation for real-world robotic tasks by using massive parallelism on
a single workstation GPU. We analyze and discuss the impact of different
training algorithm components in the massively parallel regime on the final
policy performance and training times. In addition, we present a novel
game-inspired curriculum that is well suited for training with thousands of
simulated robots in parallel. We evaluate the approach by training the
quadrupedal robot ANYmal to walk on challenging terrain. The parallel approach
allows training policies for flat terrain in under four minutes, and in twenty
minutes for uneven terrain. This represents a speedup of multiple orders of
magnitude compared to previous work. Finally, we transfer the policies to the
real robot to validate the approach. We open-source our training code to help
accelerate further research in the field of learned legged locomotion.
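The core idea behind the speedup is that every simulation and learning quantity is kept as a batched tensor on the GPU, so a single step() call advances thousands of robots simultaneously. The following is a minimal sketch of that pattern in PyTorch with a toy point-mass environment; all class and parameter names are illustrative assumptions, not the paper's actual Isaac Gym-based training code.

```python
# Minimal sketch of massively parallel rollout collection on one GPU.
# The environment and all names are illustrative, not the paper's API.
import torch

class BatchedPointMassEnv:
    """Toy batched environment: every tensor carries a leading num_envs
    dimension, so one step() advances all simulations in a few GPU kernels."""
    def __init__(self, num_envs: int, device: str):
        self.num_envs, self.device = num_envs, device
        self.pos = torch.zeros(num_envs, 2, device=device)
        self.goal = torch.rand(num_envs, 2, device=device)

    def observe(self) -> torch.Tensor:
        return torch.cat([self.pos, self.goal], dim=-1)

    def step(self, action: torch.Tensor):
        self.pos = self.pos + 0.05 * action.clamp(-1.0, 1.0)  # batched dynamics
        dist = (self.pos - self.goal).norm(dim=-1)
        reward = -dist                                         # batched reward
        done = dist < 0.05
        self.pos[done] = 0.0                                   # vectorized resets
        self.goal[done] = torch.rand(int(done.sum()), 2, device=self.device)
        return self.observe(), reward, done

device = "cuda" if torch.cuda.is_available() else "cpu"
num_envs = 4096                            # thousands of robots in parallel
env = BatchedPointMassEnv(num_envs, device)
policy = torch.nn.Sequential(
    torch.nn.Linear(4, 64), torch.nn.ELU(), torch.nn.Linear(64, 2)).to(device)

obs = env.observe()
for _ in range(24):                        # short on-policy rollout, PPO-style
    with torch.no_grad():
        action = policy(obs)               # one forward pass for all envs
    obs, reward, done = env.step(action)
```

With this layout, even a short 24-step rollout over 4096 environments gathers roughly 100,000 transitions per policy update, which is what makes minutes-scale training plausible.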
Related papers
- Multi-Objective Algorithms for Learning Open-Ended Robotic Problems [1.0124625066746598]
Quadrupedal locomotion is a complex, open-ended problem vital to expanding autonomous vehicle reach.
Traditional reinforcement learning approaches often fall short due to training instability and sample inefficiency.
We propose a novel method leveraging multi-objective evolutionary algorithms as an automatic curriculum learning mechanism.
arXiv Detail & Related papers (2024-11-11T16:26:42Z)
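The abstract does not spell out the algorithmic details, but a multi-objective evolutionary curriculum can be pictured as maintaining a population of task parameterizations and keeping the Pareto-optimal ones. The sketch below is a hypothetical illustration under that assumption; the objectives, task encoding, and mutation rule are all stand-ins, not the paper's published method.

```python
# Hypothetical sketch of Pareto-based task selection for an automatic
# curriculum; every detail here is an assumption for illustration.
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (maximization)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(population, scores):
    """Keep individuals whose objective vectors no one else dominates."""
    return [population[i] for i in range(len(population))
            if not any(dominates(scores[j], scores[i])
                       for j in range(len(population)) if j != i)]

# Each "individual" is a terrain parameterization; the objectives could be
# e.g. (policy progress, behavioral novelty) measured during training.
population = [{"roughness": random.random(), "slope": random.random()}
              for _ in range(16)]
scores = [(random.random(), random.random()) for _ in population]  # placeholders
survivors = pareto_front(population, scores)
# Mutate survivors to produce the next generation of training terrains.
next_gen = [{k: min(1.0, max(0.0, v + random.gauss(0, 0.1)))
             for k, v in s.items()}
            for s in survivors for _ in range(2)]
```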
- FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning [74.25049012472502]
FLaRe is a large-scale Reinforcement Learning framework that integrates robust pre-trained representations, large-scale training, and gradient stabilization techniques.
Our method aligns pre-trained policies towards task completion, achieving state-of-the-art (SoTA) performance on previously demonstrated and on entirely novel tasks and embodiments.
arXiv Detail & Related papers (2024-09-25T03:15:17Z)
- Grow Your Limits: Continuous Improvement with Real-World RL for Robotic Locomotion [66.69666636971922]
We present APRL, a policy regularization framework that modulates the robot's exploration over the course of training.
APRL enables a quadrupedal robot to efficiently learn to walk entirely in the real world within minutes.
arXiv Detail & Related papers (2023-10-26T17:51:46Z)
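One simple way to picture a regularizer that modulates exploration over training is a bound on actions that starts tight, for safety, and widens as the policy improves. The schedule below is an illustrative assumption, not APRL's actual mechanism.

```python
# Illustrative sketch of modulating exploration by growing an action bound
# over training; the linear schedule is an assumption, not APRL's rule.
import numpy as np

def action_limit(step: int, warmup_steps: int = 20_000,
                 min_scale: float = 0.2, max_scale: float = 1.0) -> float:
    """Start with small, safe actions and widen the limit as training progresses."""
    frac = min(1.0, step / warmup_steps)
    return min_scale + frac * (max_scale - min_scale)

def constrain(action: np.ndarray, step: int) -> np.ndarray:
    """Clip the policy's raw action to the current exploration bound."""
    limit = action_limit(step)
    return np.clip(action, -limit, limit)

raw_action = np.array([0.9, -0.7, 0.3])
print(constrain(raw_action, step=1_000))   # early: tightly clipped
print(constrain(raw_action, step=50_000))  # late: full action range
```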
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents.
We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead.
Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
arXiv Detail & Related papers (2023-06-27T17:58:39Z)
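Planning with multistep look-ahead can be sketched as scoring candidate action sequences with a learned value function and executing the best one. Everything below (the toy dynamics, value net, and candidate set) is a stand-in for illustration, not TRAVL's actual components.

```python
# Hedged sketch of value-based planning with multistep look-ahead.
import torch

value_net = torch.nn.Sequential(
    torch.nn.Linear(4, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1))

def rollout_state(state: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
    """Toy transition model: apply a short action sequence to the state."""
    for a in actions:
        state = state + 0.1 * torch.cat([a, a])  # placeholder dynamics
    return state

def plan(state: torch.Tensor, candidates: torch.Tensor) -> torch.Tensor:
    """Score each candidate action sequence by the value of its end state,
    then return the best sequence to execute."""
    with torch.no_grad():
        values = torch.stack([value_net(rollout_state(state, seq)).squeeze()
                              for seq in candidates])
    return candidates[values.argmax()]

state = torch.zeros(4)
candidates = torch.randn(8, 3, 2)   # 8 candidate trajectories, 3 steps, 2-D actions
best = plan(state, candidates)
```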
- Robust and Versatile Bipedal Jumping Control through Reinforcement Learning [141.56016556936865]
This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world.
We present a reinforcement learning framework for training a robot to accomplish a large variety of jumping tasks, such as jumping to different locations and directions.
We develop a new policy structure that encodes the robot's long-term input/output (I/O) history while also providing direct access to a short-term I/O history.
arXiv Detail & Related papers (2023-02-19T01:06:09Z)
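The described policy structure, a compressed long-term I/O history alongside direct access to a short-term one, can be sketched as follows; the encoder choice and layer sizes are assumptions, not the paper's reported architecture.

```python
# Minimal sketch of a policy that summarizes a long I/O history with an
# encoder while also consuming the raw short-term history directly.
import torch
import torch.nn as nn

class HistoryPolicy(nn.Module):
    def __init__(self, io_dim=42, short_len=4, act_dim=10):
        super().__init__()
        # Compress the long history (e.g., seconds of observations + actions).
        self.long_encoder = nn.GRU(io_dim, 64, batch_first=True)
        # The short history bypasses the encoder for direct, low-latency access.
        self.head = nn.Sequential(
            nn.Linear(64 + short_len * io_dim, 128), nn.ELU(),
            nn.Linear(128, act_dim))
        self.short_len = short_len

    def forward(self, history):            # history: (batch, long_len, io_dim)
        _, h = self.long_encoder(history)  # h: (1, batch, 64)
        short = history[:, -self.short_len:].flatten(1)
        return self.head(torch.cat([h.squeeze(0), short], dim=-1))

policy = HistoryPolicy()
action = policy(torch.randn(8, 100, 42))   # batch of 8 histories of length 100
```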
- Advanced Skills by Learning Locomotion and Local Navigation End-to-End [10.872193480485596]
In this work, we propose to solve the complete problem by training an end-to-end policy with deep reinforcement learning.
We demonstrate the successful deployment of policies on a real quadrupedal robot.
arXiv Detail & Related papers (2022-09-26T16:35:00Z)
- A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning [86.06110576808824]
Deep reinforcement learning is a promising approach to learning policies in uncontrolled environments.
Recent advances in machine learning algorithms and libraries, combined with a carefully tuned robot controller, enable a quadruped to learn to walk in only 20 minutes in the real world.
arXiv Detail & Related papers (2022-08-16T17:37:36Z)
- Learning Bipedal Walking On Planned Footsteps For Humanoid Robots [5.127310126394387]
Deep reinforcement learning (RL) based controllers for legged robots have demonstrated impressive robustness for walking in different environments for several robot platforms.
To enable the application of RL policies for humanoid robots in real-world settings, it is crucial to build a system that can achieve robust walking in any direction.
In this paper, we tackle this problem by learning a policy to follow a given step sequence.
We show that simply feeding the upcoming two steps to the policy is sufficient to achieve omnidirectional walking, turning in place, standing, and climbing stairs.
arXiv Detail & Related papers (2022-07-26T04:16:00Z)
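Conditioning the policy on only the upcoming two footsteps amounts to appending two target poses to the proprioceptive observation. The layout below is illustrative; the dimensions and the (x, y, heading) encoding are assumptions, not the paper's exact observation design.

```python
# Sketch of conditioning a walking policy on the next two planned footsteps.
import numpy as np

def build_observation(proprio, footstep_plan):
    """Append only the upcoming two footstep targets (x, y, heading) to the
    proprioceptive state; the rest of the plan is withheld from the policy."""
    upcoming = list(footstep_plan[:2])
    while len(upcoming) < 2:               # pad when the plan is nearly exhausted
        upcoming.append(upcoming[-1])
    return np.concatenate([proprio] + upcoming)

proprio = np.zeros(47)                     # joint states, IMU, etc. (assumed size)
plan = [np.array([0.3, 0.1, 0.0]),
        np.array([0.6, -0.1, 0.0]),
        np.array([0.9, 0.1, 0.0])]
obs = build_observation(proprio, plan)     # shape: (47 + 2 * 3,)
```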
- Robust High-speed Running for Quadruped Robots via Deep Reinforcement Learning [7.264355680723856]
In this paper, we explore learning foot positions in Cartesian space for a task of running as fast as possible subject to environmental disturbances.
Compared with other action spaces, we observe less needed reward shaping, much improved sample efficiency, and the emergence of natural gaits such as galloping and bounding.
Policies can be learned in only a few million time steps, even for challenging tasks of running over rough terrain with loads of over 100% of the nominal quadruped mass.
arXiv Detail & Related papers (2021-03-11T06:13:09Z)
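Learning foot positions in Cartesian space means the policy's actions are foot targets that a kinematics layer converts to joint commands. The sketch below pairs that idea with a textbook planar two-link IK solution; the link lengths, nominal stance, and scaling are assumptions, not the paper's controller.

```python
# Hedged sketch of a Cartesian foot-position action space with generic IK.
import numpy as np

def two_link_ik(x: float, z: float, l1: float = 0.25, l2: float = 0.25):
    """Analytic inverse kinematics for a planar 2-link leg: returns
    (hip, knee) angles that place the foot at (x, z) relative to the hip."""
    d2 = x * x + z * z
    cos_knee = np.clip((d2 - l1**2 - l2**2) / (2 * l1 * l2), -1.0, 1.0)
    knee = np.arccos(cos_knee)
    hip = np.arctan2(x, -z) - np.arctan2(l2 * np.sin(knee),
                                         l1 + l2 * np.cos(knee))
    return hip, knee

def action_to_joint_targets(action: np.ndarray) -> np.ndarray:
    """Map a normalized policy action to foot targets around a nominal stance,
    then through IK to joint-position targets for the low-level controller."""
    targets = []
    for leg in action.reshape(4, 2):                 # 4 legs, (dx, dz) each
        x = 0.0 + 0.10 * leg[0]                      # offsets/scales are assumptions
        z = -0.40 + 0.05 * leg[1]
        targets.extend(two_link_ik(x, z))
    return np.array(targets)

joint_targets = action_to_joint_targets(np.random.uniform(-1, 1, 8))
```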
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.