FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control
- URL: http://arxiv.org/abs/2505.22642v3
- Date: Sun, 01 Jun 2025 22:51:56 GMT
- Title: FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control
- Authors: Younggyo Seo, Carmelo Sferrazza, Haoran Geng, Michal Nauman, Zhao-Heng Yin, Pieter Abbeel
- Abstract summary: FastTD3 is a reinforcement learning (RL) algorithm that solves a range of HumanoidBench tasks in under 3 hours on a single A100 GPU. We also provide a lightweight and easy-to-use implementation of FastTD3 to accelerate RL research in robotics.
- Score: 49.08235196039602
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) has driven significant progress in robotics, but its complexity and long training times remain major bottlenecks. In this report, we introduce FastTD3, a simple, fast, and capable RL algorithm that significantly speeds up training for humanoid robots in popular suites such as HumanoidBench, IsaacLab, and MuJoCo Playground. Our recipe is remarkably simple: we train an off-policy TD3 agent with several modifications -- parallel simulation, large-batch updates, a distributional critic, and carefully tuned hyperparameters. FastTD3 solves a range of HumanoidBench tasks in under 3 hours on a single A100 GPU, while remaining stable during training. We also provide a lightweight and easy-to-use implementation of FastTD3 to accelerate RL research in robotics.
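The recipe in the abstract is concrete enough to sketch. Below is a minimal, illustrative PyTorch sketch of that recipe: TD3 with clipped double critics and target policy smoothing, a categorical (C51-style) distributional critic, and a batch size that only makes sense with massively parallel simulation. It is not the authors' released implementation; the dimensions, hyperparameters, and the specific categorical form of the distributional critic are assumptions made for illustration.

```python
# Illustrative sketch of the FastTD3 recipe (NOT the released implementation):
# off-policy TD3 + distributional critic + large-batch updates from parallel sim.
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, ACT_DIM = 64, 19                     # illustrative humanoid-like sizes
NUM_ATOMS, V_MIN, V_MAX = 101, -250.0, 250.0  # support of the categorical value distribution
GAMMA, TAU = 0.99, 0.005
POLICY_NOISE, NOISE_CLIP = 0.2, 0.5
BATCH_SIZE = 32768                            # "large-batch updates", fed by parallel simulation

def mlp(in_dim, out_dim, hidden=512):
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, hidden), nn.ReLU(),
                         nn.Linear(hidden, out_dim))

actor, actor_target = mlp(OBS_DIM, ACT_DIM), mlp(OBS_DIM, ACT_DIM)
actor_target.load_state_dict(actor.state_dict())
# Twin critics (TD3), each predicting a categorical distribution over returns.
critics = nn.ModuleList([mlp(OBS_DIM + ACT_DIM, NUM_ATOMS) for _ in range(2)])
critics_target = nn.ModuleList([mlp(OBS_DIM + ACT_DIM, NUM_ATOMS) for _ in range(2)])
critics_target.load_state_dict(critics.state_dict())
atoms = torch.linspace(V_MIN, V_MAX, NUM_ATOMS)

actor_opt = torch.optim.AdamW(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.AdamW(critics.parameters(), lr=3e-4)

def project(next_probs, reward, not_done):
    """C51-style projection of the bootstrapped return distribution onto the fixed support."""
    tz = (reward + not_done * GAMMA * atoms.unsqueeze(0)).clamp(V_MIN, V_MAX)
    b = (tz - V_MIN) / ((V_MAX - V_MIN) / (NUM_ATOMS - 1))
    lo, hi = b.floor().long(), b.ceil().long()
    proj = torch.zeros_like(next_probs)
    proj.scatter_add_(1, lo, next_probs * (hi.float() - b))
    proj.scatter_add_(1, hi, next_probs * (b - lo.float()))
    proj.scatter_add_(1, lo, next_probs * (lo == hi).float())  # restore mass lost when b is an integer
    return proj

def update(obs, act, rew, next_obs, not_done):
    """One TD3 update on a large batch; rew and not_done have shape [B, 1]."""
    with torch.no_grad():
        # Target policy smoothing (TD3).
        noise = (torch.randn_like(act) * POLICY_NOISE).clamp(-NOISE_CLIP, NOISE_CLIP)
        next_act = (torch.tanh(actor_target(next_obs)) + noise).clamp(-1.0, 1.0)
        x = torch.cat([next_obs, next_act], dim=-1)
        probs = [F.softmax(c(x), dim=-1) for c in critics_target]
        # Clipped double estimate: keep the distribution with the smaller mean return.
        means = torch.stack([(p * atoms).sum(-1) for p in probs])
        next_probs = torch.where((means[0] <= means[1]).unsqueeze(-1), probs[0], probs[1])
        target = project(next_probs, rew, not_done)
    xa = torch.cat([obs, act], dim=-1)
    critic_loss = sum(-(target * F.log_softmax(c(xa), dim=-1)).sum(-1).mean() for c in critics)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor maximizes the expected return under the first critic.
    pi = torch.tanh(actor(obs))
    q_pi = (F.softmax(critics[0](torch.cat([obs, pi], dim=-1)), dim=-1) * atoms).sum(-1)
    actor_opt.zero_grad(); (-q_pi.mean()).backward(); actor_opt.step()

    # Polyak-averaged target networks.
    for p, tp in zip([*actor.parameters(), *critics.parameters()],
                     [*actor_target.parameters(), *critics_target.parameters()]):
        tp.data.lerp_(p.data, TAU)

# Smoke test with random data standing in for a replay buffer; a small batch is
# used here, whereas the recipe-scale BATCH_SIZE assumes GPU-parallel simulation.
update(torch.randn(256, OBS_DIM), torch.rand(256, ACT_DIM) * 2 - 1,
       torch.randn(256, 1), torch.randn(256, OBS_DIM), torch.ones(256, 1))
```

In the actual system, batches would be drawn from a replay buffer filled by thousands of vectorized simulator instances (HumanoidBench, IsaacLab, or MuJoCo Playground); that data throughput is what makes large-batch off-policy updates pay off.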
Related papers
- Assistax: A Hardware-Accelerated Reinforcement Learning Benchmark for Assistive Robotics [18.70896736010314]
Games have dominated reinforcement learning benchmarks because they present relevant challenges, are inexpensive to run, and are easy to understand. We introduce Assistax: an open-source benchmark designed to address challenges arising in assistive robotics tasks. In terms of open-loop wall-clock time, Assistax runs up to 370× faster when vectorising training runs compared to CPU-based alternatives.
arXiv Detail & Related papers (2025-07-29T09:49:11Z) - RobocupGym: A challenging continuous control benchmark in Robocup [7.926196208425107]
We introduce a Robocup-based RL environment based on the open source rcssserver3d soccer server.
In each task, an RL agent controls a simulated robot, and can interact with the ball or other agents.
arXiv Detail & Related papers (2024-07-03T15:26:32Z) - RLingua: Improving Reinforcement Learning Sample Efficiency in Robotic Manipulations With Large Language Models [16.963228633341792]
Reinforcement learning (RL) has demonstrated its capability in solving various tasks but is notorious for its low sample efficiency.
We propose RLingua, a framework that can leverage the internal knowledge of large language models (LLMs) to reduce the sample complexity of RL in robotic manipulations.
arXiv Detail & Related papers (2024-03-11T04:13:26Z) - Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control [106.32794844077534]
This paper presents a study on using deep reinforcement learning to create dynamic locomotion controllers for bipedal robots.
We develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing.
This work pushes the limits of agility for bipedal robots through extensive real-world experiments.
arXiv Detail & Related papers (2024-01-30T10:48:43Z) - SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [82.46975428739329]
We develop a library containing a sample-efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment. We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation. These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z) - Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents.
We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead.
Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
arXiv Detail & Related papers (2023-06-27T17:58:39Z) - Rapid Locomotion via Reinforcement Learning [15.373208553045416]
We present an end-to-end learned controller that achieves record agility for the MIT Mini Cheetah.
This system runs and turns fast on natural terrains like grass, ice, and gravel and responds robustly to disturbances.
arXiv Detail & Related papers (2022-05-05T17:55:11Z) - Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks still remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
arXiv Detail & Related papers (2021-10-28T17:59:30Z) - Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning [68.2099740607854]
"Sample Factory" is a high- throughput training system optimized for a single-machine setting.
Our architecture combines a highly efficient, asynchronous, GPU-based sampler with off-policy correction techniques.
We extend Sample Factory to support self-play and population-based training and apply these techniques to train highly capable agents for a multiplayer first-person shooter game.
arXiv Detail & Related papers (2020-06-21T10:00:23Z) - Smooth Exploration for Robotic Reinforcement Learning [11.215352918313577]
Reinforcement learning (RL) enables robots to learn skills from interactions with the real world.
In practice, the unstructured step-based exploration used in Deep RL leads to jerky motion patterns on real robots.
We address these issues by adapting state-dependent exploration (SDE) to current Deep RL algorithms.
arXiv Detail & Related papers (2020-05-12T12:28:25Z)
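The state-dependent exploration idea in the last entry above can be sketched in a few lines: rather than adding freshly sampled Gaussian noise to every action (which produces jerky motion on hardware), the noise weights are resampled only occasionally and the per-step perturbation is a deterministic function of the policy features. This is an illustrative sketch, not the paper's implementation; the network sizes, the fixed log-std, and the resampling schedule are assumptions, and the machinery for learning the noise scale is omitted.

```python
# Illustrative sketch of state-dependent exploration (SDE), not the paper's code.
import torch
import torch.nn as nn

class SDEActor(nn.Module):
    def __init__(self, obs_dim=32, act_dim=8, feat_dim=64, log_std=-1.0):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.Tanh())
        self.mean_head = nn.Linear(feat_dim, act_dim)
        # Per-feature, per-action noise scale (learned in the full method; fixed here).
        self.log_std = nn.Parameter(torch.full((feat_dim, act_dim), log_std))
        self.theta_eps = None  # noise weights, resampled only occasionally

    def resample_noise(self):
        # Called once per episode (or every n steps), NOT at every control step.
        with torch.no_grad():
            self.theta_eps = torch.randn_like(self.log_std) * self.log_std.exp()

    def act(self, obs):
        feat = self.features(obs)
        mean = self.mean_head(feat)
        # For fixed theta_eps the noise is a deterministic function of the state,
        # so consecutive actions vary smoothly instead of jittering step to step.
        noise = feat @ self.theta_eps
        return torch.tanh(mean + noise)

# Usage: resample the noise weights at episode boundaries only.
actor = SDEActor()
actor.resample_noise()
obs = torch.randn(1, 32)
for _ in range(5):
    action = actor.act(obs)  # smooth exploration within the episode
```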