Learning Flexible Heterogeneous Coordination with Capability-Aware Shared Hypernetworks
- URL: http://arxiv.org/abs/2501.06058v2
- Date: Tue, 18 Feb 2025 09:23:35 GMT
- Title: Learning Flexible Heterogeneous Coordination with Capability-Aware Shared Hypernetworks
- Authors: Kevin Fu, Pierce Howell, Shalin Jain, Harish Ravichandar,
- Abstract summary: We present Capability-Aware Shared Hypernetworks (CASH), a novel architecture for heterogeneous multi-agent coordination.<n>CASH generates sufficient diversity while maintaining sample-efficiency via soft parameter-sharing hypernetworks.<n>We present experiments across two heterogeneous coordination tasks and three standard learning paradigms.
- Score: 2.681242476043447
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cooperative heterogeneous multi-agent tasks require agents to effectively coordinate their behaviors while accounting for their relative capabilities. Learning-based solutions to this challenge span between two extremes: i) shared-parameter methods, which encode diverse behaviors within a single architecture by assigning an ID to each agent, and are sample-efficient but result in limited behavioral diversity; ii) independent methods, which learn a separate policy for each agent, and show greater behavioral diversity but lack sample-efficiency. Prior work has also explored selective parameter-sharing, allowing for a compromise between diversity and efficiency. None of these approaches, however, effectively generalize to unseen agents or teams. We present Capability-Aware Shared Hypernetworks (CASH), a novel architecture for heterogeneous multi-agent coordination that generates sufficient diversity while maintaining sample-efficiency via soft parameter-sharing hypernetworks. Intuitively, CASH allows the team to learn common strategies using a shared encoder, which are then adapted according to the team's individual and collective capabilities with a hypernetwork, allowing for zero-shot generalization to unseen teams and agents. We present experiments across two heterogeneous coordination tasks and three standard learning paradigms (imitation learning, on- and off-policy reinforcement learning). CASH is able to outperform baseline architectures in success rate and sample efficiency when evaluated on unseen teams and agents despite using less than half of the learnable parameters.
Related papers
- Fault-Tolerant Multi-Robot Coordination with Limited Sensing within Confined Environments [0.6144680854063939]
We propose a novel fault-tolerance technique leveraging physical contact interactions in multi-robot systems.<n>We introduce the "Active Contact Response" (ACR) method, where each robot modulates its behavior based on the likelihood of encountering an inoperative (faulty) robot.
arXiv Detail & Related papers (2025-05-21T02:43:36Z) - UniVLA: Learning to Act Anywhere with Task-centric Latent Actions [32.83715417294052]
UniVLA is a new framework for learning cross-embodiment vision-language-action (VLA) policies.<n>We derive task-centric action representations from videos with a latent action model.<n>We obtain state-of-the-art results across multiple manipulation and navigation benchmarks, as well as real-robot deployments.
arXiv Detail & Related papers (2025-05-09T15:11:13Z) - RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation [90.81956345363355]
RoBridge is a hierarchical intelligent architecture for general robotic manipulation.<n>It consists of a high-level cognitive planner (HCP) based on a large-scale pre-trained vision-language model (VLM)<n>It unleashes the procedural skill of reinforcement learning, effectively bridging the gap between cognition and execution.
arXiv Detail & Related papers (2025-05-03T06:17:18Z) - CoinRobot: Generalized End-to-end Robotic Learning for Physical Intelligence [12.629888401901418]
Our framework supports cross-platform adaptability, enabling seamless deployment across industrial-grade robots, collaborative arms, and novel embodiments without task-specific modifications.
We validate our framework through extensive experiments on seven manipulation tasks. Notably, Diffusion-based models trained in our framework demonstrated superior performance and generalizability compared to the LeRobot framework.
arXiv Detail & Related papers (2025-03-07T10:50:58Z) - EMOS: Embodiment-aware Heterogeneous Multi-robot Operating System with LLM Agents [33.77674812074215]
We introduce a novel multi-agent framework designed to enable effective collaboration among heterogeneous robots.
We propose a self-prompted approach, where agents comprehend robot URDF files and call robot kinematics tools to generate descriptions of their physics capabilities.
The Habitat-MAS benchmark is designed to assess how a multi-agent framework handles tasks that require embodiment-aware reasoning.
arXiv Detail & Related papers (2024-10-30T03:20:01Z) - Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers [41.069074375686164]
We propose Heterogeneous Pre-trained Transformers (HPT), which pre-train a trunk of a policy neural network to learn a task and embodiment shared representation.
We conduct experiments to investigate the scaling behaviors of training objectives, to the extent of 52 datasets.
HPTs outperform several baselines and enhance the fine-tuned policy performance by over 20% on unseen tasks.
arXiv Detail & Related papers (2024-09-30T17:39:41Z) - COHERENT: Collaboration of Heterogeneous Multi-Robot System with Large Language Models [49.24666980374751]
COHERENT is a novel LLM-based task planning framework for collaboration of heterogeneous multi-robot systems.
A Proposal-Execution-Feedback-Adjustment mechanism is designed to decompose and assign actions for individual robots.
The experimental results show that our work surpasses the previous methods by a large margin in terms of success rate and execution efficiency.
arXiv Detail & Related papers (2024-09-23T15:53:41Z) - Generalized Robot Learning Framework [10.03174544844559]
We present a low-cost robot learning framework that is both easily reproducible and transferable to various robots and environments.
We demonstrate that deployable imitation learning can be successfully applied even to industrial-grade robots.
arXiv Detail & Related papers (2024-09-18T15:34:31Z) - Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation [49.03165169369552]
By training a single policy across many different kinds of robots, a robot learning method can leverage much broader and more diverse datasets.
We propose CrossFormer, a scalable and flexible transformer-based policy that can consume data from any embodiment.
We demonstrate that the same network weights can control vastly different robots, including single and dual arm manipulation systems, wheeled robots, quadcopters, and quadrupeds.
arXiv Detail & Related papers (2024-08-21T17:57:51Z) - Robotic Control via Embodied Chain-of-Thought Reasoning [86.6680905262442]
Key limitation of learned robot control policies is their inability to generalize outside their training data.<n>Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability.<n>We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z) - Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning [61.294110816231886]
We introduce a sparse, reusable, and flexible policy, Sparse Diffusion Policy (SDP)
SDP selectively activates experts and skills, enabling efficient and task-specific learning without retraining the entire model.
Demos and codes can be found in https://forrest-110.io/sparse_diffusion_policy/.
arXiv Detail & Related papers (2024-07-01T17:59:56Z) - RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis [102.1876259853457]
We propose a tree-structured multimodal code generation framework for generalized robotic behavior synthesis, termed RoboCodeX.
RoboCodeX decomposes high-level human instructions into multiple object-centric manipulation units consisting of physical preferences such as affordance and safety constraints.
To further enhance the capability to map conceptual and perceptual understanding into control commands, a specialized multimodal reasoning dataset is collected for pre-training and an iterative self-updating methodology is introduced for supervised fine-tuning.
arXiv Detail & Related papers (2024-02-25T15:31:43Z) - RoboScript: Code Generation for Free-Form Manipulation Tasks across Real
and Simulation [77.41969287400977]
This paper presents textbfRobotScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for a code generation benchmark for robot manipulation tasks in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z) - Adaptive parameter sharing for multi-agent reinforcement learning [16.861543418593044]
We propose a novel parameter sharing method inspired by research pertaining to the brain in biology.
It maps each type of agent to different regions within a shared network based on their identity, resulting in distinctworks.
Our method can increase the diversity of strategies among different agents without additional training parameters.
arXiv Detail & Related papers (2023-12-14T15:00:32Z) - Robot Fleet Learning via Policy Merging [58.5086287737653]
We propose FLEET-MERGE to efficiently merge policies in the fleet setting.
We show that FLEET-MERGE consolidates the behavior of policies trained on 50 tasks in the Meta-World environment.
We introduce a novel robotic tool-use benchmark, FLEET-TOOLS, for fleet policy learning in compositional and contact-rich robot manipulation tasks.
arXiv Detail & Related papers (2023-10-02T17:23:51Z) - RoboAgent: Generalization and Efficiency in Robot Manipulation via
Semantic Augmentations and Action Chunking [54.776890150458385]
We develop an efficient system for training universal agents capable of multi-task manipulation skills.
We are able to train a single agent capable of 12 unique skills, and demonstrate its generalization over 38 tasks.
On average, RoboAgent outperforms prior methods by over 40% in unseen situations.
arXiv Detail & Related papers (2023-09-05T03:14:39Z) - AgentVerse: Facilitating Multi-Agent Collaboration and Exploring
Emergent Behaviors [93.38830440346783]
We propose a multi-agent framework framework that can collaboratively adjust its composition as a greater-than-the-sum-of-its-parts system.
Our experiments demonstrate that framework framework can effectively deploy multi-agent groups that outperform a single agent.
In view of these behaviors, we discuss some possible strategies to leverage positive ones and mitigate negative ones for improving the collaborative potential of multi-agent groups.
arXiv Detail & Related papers (2023-08-21T16:47:11Z) - CoMIX: A Multi-agent Reinforcement Learning Training Architecture for Efficient Decentralized Coordination and Independent Decision-Making [2.4555276449137042]
Robust coordination skills enable agents to operate cohesively in shared environments, together towards a common goal and, ideally, individually without hindering each other's progress.<n>This paper presents Coordinated QMIX, a novel training framework for decentralized agents that enables emergent coordination through flexible policies, allowing at the same time independent decision-making at individual level.
arXiv Detail & Related papers (2023-08-21T13:45:44Z) - Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time
Multi-Robot Cooperative Exploration [16.681164058779146]
We consider the problem of cooperative exploration where multiple robots need to cooperatively explore an unknown region as fast as possible.
Existing MARL-based methods adopt action-making steps as the metric for exploration efficiency by assuming all the agents are acting in a fully synchronous manner.
We propose an asynchronous MARL solution, Asynchronous Coordination Explorer (ACE), to tackle this real-world challenge.
arXiv Detail & Related papers (2023-01-09T14:53:38Z) - Learning Heterogeneous Agent Cooperation via Multiagent League Training [6.801749815385998]
This work proposes a general-purpose reinforcement learning algorithm named Heterogeneous League Training (HLT) to address heterogeneous multiagent problems.
HLT keeps track of a pool of policies that agents have explored during training, gathering a league of heterogeneous policies to facilitate future policy optimization.
A hyper-network is introduced to increase the diversity of agent behaviors when collaborating with teammates having different levels of cooperation skills.
arXiv Detail & Related papers (2022-11-13T13:57:15Z) - Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture the agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperforms prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z) - Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent
RL [107.58821842920393]
We quantify the agent's behavior difference and build its relationship with the policy performance via bf Role Diversity
We find that the error bound in MARL can be decomposed into three parts that have a strong relation to the role diversity.
The decomposed factors can significantly impact policy optimization on three popular directions.
arXiv Detail & Related papers (2022-06-01T04:58:52Z) - LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent
Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL.
To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy.
We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
arXiv Detail & Related papers (2022-05-05T10:46:16Z) - Centralizing State-Values in Dueling Networks for Multi-Robot
Reinforcement Learning Mapless Navigation [87.85646257351212]
We study the problem of multi-robot mapless navigation in the popular Training and Decentralized Execution (CTDE) paradigm.
This problem is challenging when each robot considers its path without explicitly sharing observations with other robots.
We propose a novel architecture for CTDE that uses a centralized state-value network to compute a joint state-value.
arXiv Detail & Related papers (2021-12-16T16:47:00Z) - Celebrating Diversity in Shared Multi-Agent Reinforcement Learning [20.901606233349177]
Deep multi-agent reinforcement learning has shown the promise to solve complex cooperative tasks.
In this paper, we aim to introduce diversity in both optimization and representation of shared multi-agent reinforcement learning.
Our method achieves state-of-the-art performance on Google Research Football and super hard StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2021-06-04T00:55:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.