Bipedalism for Quadrupedal Robots: Versatile Loco-Manipulation through Risk-Adaptive Reinforcement Learning
- URL: http://arxiv.org/abs/2507.20382v1
- Date: Sun, 27 Jul 2025 18:51:34 GMT
- Title: Bipedalism for Quadrupedal Robots: Versatile Loco-Manipulation through Risk-Adaptive Reinforcement Learning
- Authors: Yuyou Zhang, Radu Corcodel, Ding Zhao
- Abstract summary: We introduce bipedalism for quadrupedal robots, freeing the front legs for versatile interactions with the environment. We propose a risk-adaptive distributional Reinforcement Learning framework designed for quadrupedal robots walking on their hind legs.
- Score: 21.938067330028066
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Loco-manipulation of quadrupedal robots has broadened robotic applications, but using legs as manipulators often compromises locomotion, while mounting arms complicates the system. To mitigate this issue, we introduce bipedalism for quadrupedal robots, thus freeing the front legs for versatile interactions with the environment. We propose a risk-adaptive distributional Reinforcement Learning (RL) framework designed for quadrupedal robots walking on their hind legs, balancing worst-case conservativeness with optimal performance in this inherently unstable task. During training, the adaptive risk preference is dynamically adjusted based on the uncertainty of the return, measured by the coefficient of variation of the estimated return distribution. Extensive experiments in simulation show our method's superior performance over baselines. Real-world deployment on a Unitree Go2 robot further demonstrates the versatility of our policy, enabling tasks like cart pushing, obstacle probing, and payload transport, while showcasing robustness against challenging dynamics and external disturbances.
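The abstract describes adjusting the risk preference during training based on the coefficient of variation (CV) of the estimated return distribution. A minimal sketch of one plausible realization: a distributional critic outputs return quantiles, the CV measures uncertainty, and the CVaR tail fraction used for the policy objective is lowered (more conservative) when uncertainty is high. The linear CV-to-alpha mapping, the thresholds, and the function names here are illustrative assumptions, not the paper's exact schedule.

```python
import numpy as np

def adaptive_cvar_value(quantiles, cv_low=0.1, cv_high=0.5,
                        alpha_min=0.25, alpha_max=1.0):
    """Risk-adaptive value from an estimated return distribution.

    `quantiles`: quantile estimates of the return, as produced by a
    distributional critic (e.g. QR-DQN/IQN style). The coefficient of
    variation (std / |mean|) sets the risk level: high uncertainty maps
    to a small CVaR alpha (average over the worst tail, conservative);
    low uncertainty maps to alpha near 1 (risk-neutral mean).
    Returns (risk-adjusted value, chosen alpha).
    """
    q = np.sort(np.asarray(quantiles, dtype=float))
    mean, std = q.mean(), q.std()
    cv = std / (abs(mean) + 1e-8)  # uncertainty of the return estimate
    # Map cv in [cv_low, cv_high] linearly onto alpha in [alpha_max, alpha_min]
    t = np.clip((cv - cv_low) / (cv_high - cv_low), 0.0, 1.0)
    alpha = alpha_max - t * (alpha_max - alpha_min)
    # CVaR_alpha: mean of the lowest alpha-fraction of the quantiles
    k = max(1, int(np.ceil(alpha * len(q))))
    return q[:k].mean(), alpha
```

With a tight distribution (CV near zero) this reduces to the risk-neutral mean; with a widely spread distribution it averages only the worst quarter of the quantiles.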
Related papers
- GRoQ-LoCO: Generalist and Robot-agnostic Quadruped Locomotion Control using Offline Datasets [0.8678250057211367]
GRoQ-LoCO is a scalable, attention-based framework that learns a single generalist locomotion policy across multiple quadruped robots and terrains. Our framework operates solely on proprioceptive data from all robots without incorporating any robot-specific encodings. Results demonstrate the potential of offline, data-driven learning to generalize locomotion across diverse quadruped morphologies and behaviors.
arXiv Detail & Related papers (2025-05-16T08:17:01Z) - Training Directional Locomotion for Quadrupedal Low-Cost Robotic Systems via Deep Reinforcement Learning [4.669957449088593]
We present Deep Reinforcement Learning training of directional locomotion for low-cost quadrupedal robots in the real world. We exploit randomization of the heading that the robot must follow to foster exploration of action-state transitions. Resetting the heading at each episode to the current yaw plus a random value drawn from a normal distribution yields policies able to follow complex trajectories.
arXiv Detail & Related papers (2025-03-14T03:53:01Z) - Humanoid Whole-Body Locomotion on Narrow Terrain via Dynamic Balance and Reinforcement Learning [54.26816599309778]
We propose a novel whole-body locomotion algorithm based on dynamic balance and Reinforcement Learning (RL). Specifically, we introduce a dynamic balance mechanism by leveraging an extended measure of Zero-Moment Point (ZMP)-driven rewards and task-driven rewards in a whole-body actor-critic framework. Experiments conducted on a full-sized Unitree H1-2 robot verify the ability of our method to maintain balance on extremely narrow terrains.
arXiv Detail & Related papers (2025-02-24T14:53:45Z) - Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control [106.32794844077534]
This paper presents a study on using deep reinforcement learning to create dynamic locomotion controllers for bipedal robots.
We develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing.
This work pushes the limits of agility for bipedal robots through extensive real-world experiments.
arXiv Detail & Related papers (2024-01-30T10:48:43Z) - Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment [88.06408322210025]
We study the problem of adapting on-the-fly to novel scenarios during deployment. Our approach, RObust Autonomous Modulation (ROAM), introduces a mechanism based on the perceived value of pre-trained behaviors. We demonstrate that ROAM enables a robot to adapt rapidly to changes in dynamics both in simulation and on a real Go1 quadruped.
arXiv Detail & Related papers (2023-11-02T08:22:28Z) - Learning Bipedal Walking for Humanoids with Current Feedback [5.429166905724048]
We present an approach for overcoming the sim2real gap issue for humanoid robots arising from inaccurate torque-tracking at the actuator level.
Our approach successfully trains a unified, end-to-end policy in simulation that can be deployed on a real HRP-5P humanoid robot to achieve bipedal locomotion.
arXiv Detail & Related papers (2023-03-07T08:16:46Z) - VAE-Loco: Versatile Quadruped Locomotion by Learning a Disentangled Gait Representation [78.92147339883137]
We show that learning a latent space capturing the key stance phases constituting a particular gait is pivotal for increasing controller robustness.
We demonstrate that specific properties of the drive signal map directly to gait parameters such as cadence, footstep height and full stance duration.
The use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework.
arXiv Detail & Related papers (2022-05-02T19:49:53Z) - REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer [57.045140028275036]
We consider the problem of transferring a policy across two different robots with significantly different parameters such as kinematics and morphology.
Existing approaches that train a new policy by matching the action or state transition distribution, including imitation learning methods, fail because the optimal action and/or state distributions are mismatched across robots.
We propose a novel method named REvolveR that uses continuous evolutionary models for robotic policy transfer, implemented in a physics simulator.
arXiv Detail & Related papers (2022-02-10T18:50:25Z) - OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation [50.59541802645156]
Operational Space Control (OSC) has been used as an effective task-space controller for manipulation.
We propose OSC for Adaptation and Robustness (OSCAR), a data-driven variant of OSC that compensates for modeling errors.
We evaluate our method on a variety of simulated manipulation problems, and find substantial improvements over an array of controller baselines.
arXiv Detail & Related papers (2021-10-02T01:21:38Z) - Robust Quadruped Jumping via Deep Reinforcement Learning [10.095966161524043]
In this paper, we consider jumping varying distances and heights for a quadrupedal robot in noisy environments.
We propose a framework using deep reinforcement learning that leverages and augments the complex solution of nonlinear trajectory optimization for quadrupedal jumping.
We demonstrate robustness to foot disturbances of up to 6 cm in height, or 33% of the robot's nominal standing height, while jumping twice the body length in distance.
arXiv Detail & Related papers (2020-11-13T19:04:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.