Related papers: Automatic Environment Shaping is the Next Frontier in RL

Automatic Environment Shaping is the Next Frontier in RL

URL: http://arxiv.org/abs/2407.16186v1
Date: Tue, 23 Jul 2024 05:22:29 GMT
Title: Automatic Environment Shaping is the Next Frontier in RL
Authors: Younghyo Park, Gabriel B. Margolis, Pulkit Agrawal,
Abstract summary: Many roboticists dream of presenting a robot with a task in the evening and returning the next morning to find the robot capable of solving the task. Sim-to-real reinforcement learning has achieved impressive performance on challenging robotics tasks, but requires substantial human effort to set up the task in a way that is amenable to RL. It's our position that algorithmic improvements in policy optimization and other ideas should be guided towards resolving the primary bottleneck of shaping the training environment.
Score: 20.894840942319323
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Many roboticists dream of presenting a robot with a task in the evening and returning the next morning to find the robot capable of solving the task. What is preventing us from achieving this? Sim-to-real reinforcement learning (RL) has achieved impressive performance on challenging robotics tasks, but requires substantial human effort to set up the task in a way that is amenable to RL. It's our position that algorithmic improvements in policy optimization and other ideas should be guided towards resolving the primary bottleneck of shaping the training environment, i.e., designing observations, actions, rewards and simulation dynamics. Most practitioners don't tune the RL algorithm, but other environment parameters to obtain a desirable controller. We posit that scaling RL to diverse robotic tasks will only be achieved if the community focuses on automating environment shaping procedures.

Related papers

STRIDE: Automating Reward Design, Deep Reinforcement Learning Training and Feedback Optimization in Humanoid Robotics Locomotion [33.91518509518502]
We introduce STRIDE, a novel framework built on agentic engineering to automate reward design, DRL training, and feedback optimization for humanoid robot locomotion tasks. By combining structured principles of agentic engineering with large language models (LLMs) for code-writing, zero-shot generation, and in-context optimization, STRIDE generates, evaluates, and iteratively refines reward functions without relying on task-specific prompts or templates. Across diverse environments featuring humanoid robot morphologies, STRIDE outperforms the state-of-the-art reward design framework EUREKA, achieving an average improvement of round 250% in
arXiv Detail & Related papers (2025-02-07T06:37:05Z)
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment. We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation. These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation [68.70755196744533]
RoboGen is a generative robotic agent that automatically learns diverse robotic skills at scale via generative simulation. Our work attempts to extract the extensive and versatile knowledge embedded in large-scale models and transfer them to the field of robotics.
arXiv Detail & Related papers (2023-11-02T17:59:21Z)
Reaching the Limit in Autonomous Racing: Optimal Control versus Reinforcement Learning [66.10854214036605]
A central question in robotics is how to design a control system for an agile mobile robot. We show that a neural network controller trained with reinforcement learning (RL) outperformed optimal control (OC) methods in this setting. Our findings allowed us to push an agile drone to its maximum performance, achieving a peak acceleration greater than 12 times the gravitational acceleration and a peak velocity of 108 kilometers per hour.
arXiv Detail & Related papers (2023-10-17T02:40:27Z)
Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data [101.43350024175157]
Self-supervised learning has the potential to decrease the amount of human annotation and engineering effort required to learn control strategies. Our work builds on prior work showing that the reinforcement learning (RL) itself can be cast as a self-supervised problem. We demonstrate that a self-supervised RL algorithm based on contrastive learning can solve real-world, image-based robotic manipulation tasks.
arXiv Detail & Related papers (2023-06-06T01:36:56Z)
Quality-Diversity Optimisation on a Physical Robot Through Dynamics-Aware and Reset-Free Learning [4.260312058817663]
We build upon the Reset-Free QD (RF-QD) algorithm to learn controllers directly on a physical robot. This method uses a dynamics model, learned from interactions between the robot and the environment, to predict the robot's behaviour. RF-QD also includes a recovery policy that returns the robot to a safe zone when it has walked outside of it, allowing continuous learning.
arXiv Detail & Related papers (2023-04-24T13:24:00Z)
Learning Bipedal Walking for Humanoids with Current Feedback [5.429166905724048]
We present an approach for overcoming the sim2real gap issue for humanoid robots arising from inaccurate torque-tracking at the actuator level. Our approach successfully trains a unified, end-to-end policy in simulation that can be deployed on a real HRP-5P humanoid robot to achieve bipedal locomotion.
arXiv Detail & Related papers (2023-03-07T08:16:46Z)
Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems. However, training RL agents to solve robotics tasks still remains challenging. In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy. We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
How to Train Your Robot with Deep Reinforcement Learning; Lessons We've Learned [111.06812202454364]
We present a number of case studies involving robotic deep RL. We discuss commonly perceived challenges in deep RL and how they have been addressed in these works. We also provide an overview of other outstanding challenges, many of which are unique to the real-world robotics setting.
arXiv Detail & Related papers (2021-02-04T22:09:28Z)
Robust Reinforcement Learning-based Autonomous Driving Agent for Simulation and Real World [0.0]
We present a DRL-based algorithm that is capable of performing autonomous robot control using Deep Q-Networks (DQN) In our approach, the agent is trained in a simulated environment and it is able to navigate both in a simulated and real-world environment. The trained agent is able to run on limited hardware resources and its performance is comparable to state-of-the-art approaches.
arXiv Detail & Related papers (2020-09-23T15:23:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.