Bridging the Sim-to-Real Gap for Athletic Loco-Manipulation
- URL: http://arxiv.org/abs/2502.10894v1
- Date: Sat, 15 Feb 2025 20:18:37 GMT
- Title: Bridging the Sim-to-Real Gap for Athletic Loco-Manipulation
- Authors: Nolan Fey, Gabriel B. Margolis, Martin Peticco, Pulkit Agrawal
- Abstract summary: We introduce the Unsupervised Actuator Net (UAN) to bridge the sim-to-real gap for complex actuation mechanisms.
UAN mitigates reward hacking by ensuring that the learned behaviors remain robust and transferable.
With these innovations, our robot athlete learns to lift, throw, and drag with remarkable fidelity from simulation to reality.
- Score: 18.451995260533682
- Abstract: Achieving athletic loco-manipulation on robots requires moving beyond traditional tracking rewards - which simply guide the robot along a reference trajectory - to task rewards that drive truly dynamic, goal-oriented behaviors. Commands such as "throw the ball as far as you can" or "lift the weight as quickly as possible" compel the robot to exhibit the agility and power inherent in athletic performance. However, training solely with task rewards introduces two major challenges: these rewards are prone to exploitation (reward hacking), and the exploration process can lack sufficient direction. To address these issues, we propose a two-stage training pipeline. First, we introduce the Unsupervised Actuator Net (UAN), which leverages real-world data to bridge the sim-to-real gap for complex actuation mechanisms without requiring access to torque sensing. UAN mitigates reward hacking by ensuring that the learned behaviors remain robust and transferable. Second, we use a pre-training and fine-tuning strategy that leverages reference trajectories as initial hints to guide exploration. With these innovations, our robot athlete learns to lift, throw, and drag with remarkable fidelity from simulation to reality.
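The actuator-net idea can be illustrated with a toy supervised variant: fit a model that maps a short history of joint-state features, logged on the real robot, to the torque the simulator should apply. Note the paper's UAN specifically avoids torque labels; the dynamics, features, and linear model below are hypothetical stand-ins for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground-truth actuator: torque depends on the recent history
# of position error and joint velocity (a stand-in for gearbox/motor
# effects that the simulator does not model).
def true_actuator(err_hist, vel_hist):
    return 20.0 * err_hist[-1] - 0.5 * vel_hist[-1] + 2.0 * np.tanh(err_hist[-2])

H = 3                       # history length fed to the model
X, y = [], []
for _ in range(2000):       # synthetic "real-world" log
    err = rng.normal(0.0, 0.2, H)
    vel = rng.normal(0.0, 1.0, H)
    X.append(np.concatenate([err, vel]))
    y.append(true_actuator(err, vel))
X, y = np.asarray(X), np.asarray(y)

# Fit a linear model with bias as a cheap stand-in for the small MLPs
# that actuator nets use in practice.
w, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)

def actuator_model(err_hist, vel_hist):
    # Replaces the simulator's idealized torque with a data-driven estimate.
    return float(np.concatenate([err_hist, vel_hist, [1.0]]) @ w)
```

At training time, such a model sits between the policy's commanded actions and the simulated joints, so the policy experiences actuation closer to hardware.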
Related papers
- Moto: Latent Motion Token as the Bridging Language for Robot Manipulation [66.18557528695924]
We introduce Moto, which converts video content into latent Motion Token sequences by a Latent Motion Tokenizer.
We pre-train Moto-GPT through motion token autoregression, enabling it to capture diverse visual motion knowledge.
To transfer learned motion priors to real robot actions, we implement a co-fine-tuning strategy that seamlessly bridges latent motion token prediction and real robot control.
arXiv Detail & Related papers (2024-12-05T18:57:04Z) - DexDribbler: Learning Dexterous Soccer Manipulation via Dynamic Supervision [26.9579556496875]
Joint manipulation of moving objects and locomotion with legs, such as playing soccer, receive scant attention in the learning community.
We propose a feedback control block to compute the necessary body-level movement accurately and use its outputs as dynamic joint-level locomotion supervision.
We observe that our learning scheme can not only make the policy network converge faster but also enable soccer robots to perform sophisticated maneuvers.
arXiv Detail & Related papers (2024-03-21T11:16:28Z) - Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z) - Barkour: Benchmarking Animal-level Agility with Quadruped Robots [70.97471756305463]
We introduce the Barkour benchmark, an obstacle course to quantify agility for legged robots.
Inspired by dog agility competitions, it consists of diverse obstacles and a time based scoring mechanism.
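A time-based course score of this kind might combine obstacle completion with a time penalty; the formula and constants below are hypothetical, not Barkour's actual metric.

```python
def course_score(finish_time_s: float, obstacles_cleared: int,
                 total_obstacles: int, time_budget_s: float = 45.0) -> float:
    """Hypothetical agility score in [0, 1]: fraction of obstacles cleared,
    discounted by how far the run exceeds its time budget."""
    clear_frac = obstacles_cleared / total_obstacles
    time_penalty = max(0.0, finish_time_s - time_budget_s) / time_budget_s
    return max(0.0, clear_frac - time_penalty)
```

A run that clears every obstacle under budget scores 1.0; skipped obstacles or slow finishes pull the score toward 0.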
We present two methods for tackling the benchmark.
arXiv Detail & Related papers (2023-05-24T02:49:43Z) - Learning and Adapting Agile Locomotion Skills by Transferring Experience [71.8926510772552]
We propose a framework for training complex robotic skills by transferring experience from existing controllers to jumpstart learning new tasks.
We show that our method enables learning complex agile jumping behaviors, navigating to goal locations while walking on hind legs, and adapting to new environments.
arXiv Detail & Related papers (2023-04-19T17:37:54Z) - Legs as Manipulator: Pushing Quadrupedal Agility Beyond Locomotion [34.33972863987201]
We train quadruped robots to use the front legs to climb walls, press buttons, and perform object interaction in the real world.
These skills are trained in simulation using curriculum and transferred to the real world using our proposed sim2real variant.
We evaluate our method in both simulation and real-world showing successful executions of both short as well as long-range tasks.
arXiv Detail & Related papers (2023-03-20T17:59:58Z) - GraspARL: Dynamic Grasping via Adversarial Reinforcement Learning [16.03016392075486]
We introduce an adversarial reinforcement learning framework for dynamic grasping, namely GraspARL.
We formulate the dynamic grasping problem as a 'move-and-grasp' game, where the robot is to pick up the object on the mover and the adversarial mover is to find a path to escape it.
In this way, the mover can auto-generate diverse moving trajectories while training. And the robot trained with the adversarial trajectories can generalize to various motion patterns.
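The move-and-grasp game can be sketched as a toy one-dimensional pursuit: a greedy adversarial mover steps to increase its distance from the robot, and the robot grasps once within reach. Everything below (dynamics, speeds, the greedy adversary) is a hypothetical simplification; the paper trains both sides with reinforcement learning.

```python
import random

REACH, ROBOT_SPEED, MOVER_SPEED = 0.1, 0.3, 0.2

def adversarial_mover(obj: float, robot: float) -> float:
    # Greedy adversary: step in whichever direction increases distance.
    return obj + MOVER_SPEED * (1.0 if obj >= robot else -1.0)

def play_episode(max_steps: int = 50):
    robot, obj = 0.0, random.uniform(-1.0, 1.0)
    trajectory = [obj]            # the mover auto-generates this trajectory
    for _ in range(max_steps):
        robot += ROBOT_SPEED * (1.0 if obj > robot else -1.0)
        if abs(obj - robot) < REACH:
            return True, trajectory   # grasp succeeded
        obj = adversarial_mover(obj, robot)
        trajectory.append(obj)
    return False, trajectory
```

Because the robot is slightly faster, each episode ends in a grasp, but different starts yield different escape trajectories, which is the diversity the adversarial setup exploits for training.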
arXiv Detail & Related papers (2022-03-04T03:25:09Z) - Real Robot Challenge using Deep Reinforcement Learning [6.332038240397164]
This paper details our winning submission to Phase 1 of the 2021 Real Robot Challenge.
In the challenge, a three-fingered robot must carry a cube along specified goal trajectories.
We use a pure reinforcement learning approach which requires minimal expert knowledge of the robotic system.
arXiv Detail & Related papers (2021-09-30T16:12:17Z) - Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots [121.42930679076574]
We present a model-free reinforcement learning framework for training robust locomotion policies in simulation.
Domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics.
We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw.
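Domain randomization of this kind can be sketched as resampling physical parameters at every episode reset; the parameter names and ranges below are illustrative, not those of the paper.

```python
import random
from dataclasses import dataclass

@dataclass
class DynamicsParams:
    torso_mass: float       # kg
    ground_friction: float  # friction coefficient
    motor_strength: float   # scale on commanded torque
    latency_steps: int      # control delay in simulator steps

def randomize_dynamics(rng: random.Random) -> DynamicsParams:
    # Each episode samples dynamics from broad ranges so the policy
    # cannot overfit to a single simulated robot.
    return DynamicsParams(
        torso_mass=rng.uniform(9.0, 13.0),
        ground_friction=rng.uniform(0.4, 1.2),
        motor_strength=rng.uniform(0.8, 1.2),
        latency_steps=rng.randint(0, 3),
    )

rng = random.Random(0)
episode_params = [randomize_dynamics(rng) for _ in range(3)]
```

The simulator is reconfigured with a fresh `DynamicsParams` at each reset, so the trained policy must succeed under the whole distribution rather than one nominal model.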
arXiv Detail & Related papers (2021-03-26T07:14:01Z) - Learning Agile Locomotion via Adversarial Training [59.03007947334165]
In this paper, we present a multi-agent learning system, in which a quadruped robot (protagonist) learns to chase another robot (adversary) while the latter learns to escape.
We find that this adversarial training process not only encourages agile behaviors but also effectively alleviates the laborious environment design effort.
In contrast to prior works that used only one adversary, we find that training an ensemble of adversaries, each of which specializes in a different escaping strategy, is essential for the protagonist to master agility.
arXiv Detail & Related papers (2020-08-03T01:20:37Z) - Learning to Play Table Tennis From Scratch using Muscular Robots [34.34824536814943]
This work is the first to (a) learn a safety-critical dynamic task fail-safe using anthropomorphic robot arms, (b) learn a precision-demanding problem with a PAM-driven system, and (c) train robots to play table tennis without real balls.
Videos and datasets are available at muscularTT.embodied.ml.
arXiv Detail & Related papers (2020-06-10T16:43:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.