MASH: Cooperative-Heterogeneous Multi-Agent Reinforcement Learning for Single Humanoid Robot Locomotion
- URL: http://arxiv.org/abs/2508.10423v1
- Date: Thu, 14 Aug 2025 07:54:31 GMT
- Title: MASH: Cooperative-Heterogeneous Multi-Agent Reinforcement Learning for Single Humanoid Robot Locomotion
- Authors: Qi Liu, Xiaopeng Zhang, Mingshan Tan, Shuaikang Ma, Jinliang Ding, Yanjie Li,
- Abstract summary: This paper proposes a novel method to enhance locomotion for a single humanoid robot through cooperative-heterogeneous multi-agent deep reinforcement learning (MARL)<n>Experiments demonstrate that MASH accelerates training convergence and improves whole-body cooperation ability, outperforming conventional single-agent reinforcement learning methods.<n>This work advances the integration of MARL into single-humanoid-robot control, offering new insights into efficient locomotion strategies.
- Score: 14.728331588105789
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel method to enhance locomotion for a single humanoid robot through cooperative-heterogeneous multi-agent deep reinforcement learning (MARL). While most existing methods typically employ single-agent reinforcement learning algorithms for a single humanoid robot or MARL algorithms for multi-robot system tasks, we propose a distinct paradigm: applying cooperative-heterogeneous MARL to optimize locomotion for a single humanoid robot. The proposed method, multi-agent reinforcement learning for single humanoid locomotion (MASH), treats each limb (legs and arms) as an independent agent that explores the robot's action space while sharing a global critic for cooperative learning. Experiments demonstrate that MASH accelerates training convergence and improves whole-body cooperation ability, outperforming conventional single-agent reinforcement learning methods. This work advances the integration of MARL into single-humanoid-robot control, offering new insights into efficient locomotion strategies.
Related papers
- FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions [147.04372611893032]
We present FRoM-W1, an open-source framework designed to achieve general humanoid whole-body motion control using natural language.<n>We extensively evaluate FRoM-W1 on Unitree H1 and G1 robots.<n>Results demonstrate superior performance on the HumanML3D-X benchmark for human whole-body motion generation.
arXiv Detail & Related papers (2026-01-19T07:59:32Z) - R2BC: Multi-Agent Imitation Learning from Single-Agent Demonstrations [8.790468078980306]
We introduce and study Round-Robin Behavior Cloning (R2BC), a method that enables a single human operator to effectively train multi-robot systems.<n>Our approach allows the human to teleoperate one agent at a time and incrementally teach multi-agent behavior to the entire system.<n>We show that R2BC methods match, and in some cases surpass, the performance of an oracle behavior cloning approach trained on privileged synchronized demonstrations.
arXiv Detail & Related papers (2025-10-20T20:24:23Z) - OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction [76.44108003274955]
A dominant paradigm for teaching humanoid robots complex skills is to retarget human motions as kinematic references to train reinforcement learning policies.<n>We introduce OmniRetarget, an interaction-preserving data generation engine based on an interaction mesh.<n>By minimizing the Laplacian deformation between the human and robot meshes, OmniRetarget generates kinematically feasible trajectories.
arXiv Detail & Related papers (2025-09-30T17:59:02Z) - UniPrototype: Humn-Robot Skill Learning with Uniform Prototypes [4.338344229716167]
UniPrototype is a novel framework that enables effective knowledge transfer from human to robot domains via shared motion primitives.<n>Our results show that UniPrototype successfully transfers human manipulation knowledge to robots, significantly improving learning efficiency and task performance.
arXiv Detail & Related papers (2025-09-27T00:33:39Z) - Mixed-Initiative Dialog for Human-Robot Collaborative Manipulation [8.446410154654467]
MICoBot handles the common scenario where both agents, using natural language, take initiative in formulating, accepting, or rejecting proposals on who can best complete different steps of a task.<n>To handle diverse, task-directed dialog, MICoBot makes decisions at three levels: (1) a meta-planner considers human dialog to formulate and code a high-level collaboration strategy, (2) a planner optimally allocates the remaining steps to either agent based on the robot's capabilities, and (3) an action decides the low-level actions to perform or words to say to the human.
arXiv Detail & Related papers (2025-08-07T16:09:12Z) - HACTS: a Human-As-Copilot Teleoperation System for Robot Learning [47.9126187195398]
We introduce HACTS (Human-As-Copilot Teleoperation System), a novel system that establishes bilateral, real-time joint synchronization between a robot arm and teleoperation hardware.<n>This simple yet effective feedback mechanism, akin to a steering wheel in autonomous vehicles, enables the human copilot to intervene seamlessly while collecting action-correction data for future learning.
arXiv Detail & Related papers (2025-03-31T13:28:13Z) - Enabling Multi-Robot Collaboration from Single-Human Guidance [5.016558275355615]
We propose an efficient way of learning collaborative behaviors in multi-agent systems by leveraging expertise from only a single human.<n>We show that agents can effectively learn to collaborate by allowing a human operator to dynamically switch between controlling agents for a short period.<n>Our experiments showed that our method improves the success rate of a challenging collaborative hide-and-seek task by up to 58% with only 40 minutes of human guidance.
arXiv Detail & Related papers (2024-09-30T00:02:56Z) - Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition [48.65867987106428]
We introduce a novel system for joint learning between human operators and robots.
It enables human operators to share control of a robot end-effector with a learned assistive agent.
It reduces the need for human adaptation while ensuring the collected data is of sufficient quality for downstream tasks.
arXiv Detail & Related papers (2024-06-29T03:37:29Z) - Learning Multimodal Latent Dynamics for Human-Robot Interaction [18.68936554172693]
This article presents a method for learning well-coordinated Human-Robot Interaction (HRI) from Human-Human Interactions (HHI)<n>We devise a hybrid approach using Hidden Markov Models (HMMs) as the latent space priors for a Variational Autoencoder to model a joint distribution over the interacting agents.<n>We find that users perceive our method as more human-like, timely, and accurate and rank our method with a higher degree of preference over other baselines.
arXiv Detail & Related papers (2023-11-27T23:56:59Z) - HERD: Continuous Human-to-Robot Evolution for Learning from Human
Demonstration [57.045140028275036]
We show that manipulation skills can be transferred from a human to a robot through the use of micro-evolutionary reinforcement learning.
We propose an algorithm for multi-dimensional evolution path searching that allows joint optimization of both the robot evolution path and the policy.
arXiv Detail & Related papers (2022-12-08T15:56:13Z) - Show Me What You Can Do: Capability Calibration on Reachable Workspace
for Human-Robot Collaboration [83.4081612443128]
We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground-truth.
We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
arXiv Detail & Related papers (2021-03-06T09:14:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.