Robot Fleet Learning via Policy Merging
- URL: http://arxiv.org/abs/2310.01362v3
- Date: Fri, 23 Feb 2024 03:51:51 GMT
- Title: Robot Fleet Learning via Policy Merging
- Authors: Lirui Wang, Kaiqing Zhang, Allan Zhou, Max Simchowitz, Russ Tedrake
- Abstract summary: We propose FLEET-MERGE to efficiently merge policies in the fleet setting.
We show that FLEET-MERGE consolidates the behavior of policies trained on 50 tasks in the Meta-World environment.
We introduce a novel robotic tool-use benchmark, FLEET-TOOLS, for fleet policy learning in compositional and contact-rich robot manipulation tasks.
- Score: 58.5086287737653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fleets of robots ingest massive amounts of heterogeneous streaming data silos
generated by interacting with their environments, far more than what can be
stored or transmitted with ease. At the same time, teams of robots should
co-acquire diverse skills through their heterogeneous experiences in varied
settings. How can we enable such fleet-level learning without having to
transmit or centralize fleet-scale data? In this paper, we investigate policy
merging (PoMe) from such distributed heterogeneous datasets as a potential
solution. To efficiently merge policies in the fleet setting, we propose
FLEET-MERGE, an instantiation of distributed learning that accounts for the
permutation invariance that arises when parameterizing the control policies
with recurrent neural networks. We show that FLEET-MERGE consolidates the
behavior of policies trained on 50 tasks in the Meta-World environment, with
good performance on nearly all training tasks at test time. Moreover, we
introduce a novel robotic tool-use benchmark, FLEET-TOOLS, for fleet policy
learning in compositional and contact-rich robot manipulation tasks, to
validate the efficacy of FLEET-MERGE on the benchmark.
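The permutation invariance mentioned in the abstract can be illustrated with a minimal sketch. This is not the FLEET-MERGE algorithm: it assumes a toy one-hidden-layer feedforward policy rather than a recurrent network, and it aligns hidden units by brute-force search over permutations. All function names (forward, permute, align, average) are hypothetical and chosen only for illustration. The sketch shows that reordering hidden units leaves the policy's output unchanged, that naively averaging two differently ordered copies changes it, and that averaging after alignment does not.

```python
import itertools
import math

def forward(w1, w2, x):
    """Toy policy: y = w2 . tanh(w1 @ x), with w1 of shape hidden x input."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return sum(v * h for v, h in zip(w2, hidden))

def permute(w1, w2, perm):
    """Reorder hidden units; the function computed by the policy is unchanged."""
    return [w1[i] for i in perm], [w2[i] for i in perm]

def align(ref_w1, w1, w2):
    """Brute-force the hidden-unit order whose input weights best match ref_w1.

    Assumption: matching on first-layer weights only, feasible for tiny layers.
    """
    def cost(perm):
        return sum((a - b) ** 2
                   for i, j in enumerate(perm)
                   for a, b in zip(ref_w1[i], w1[j]))
    best = min(itertools.permutations(range(len(w1))), key=cost)
    return permute(w1, w2, best)

def average(model_a, model_b):
    """Element-wise mean of two (w1, w2) parameter sets."""
    (w1a, w2a), (w1b, w2b) = model_a, model_b
    w1 = [[(p + q) / 2 for p, q in zip(ra, rb)] for ra, rb in zip(w1a, w1b)]
    w2 = [(p + q) / 2 for p, q in zip(w2a, w2b)]
    return w1, w2

# Two "robots" hold the same policy, but with hidden units stored in different orders.
model_a = ([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], [1.0, 2.0, 3.0])
model_b = permute(*model_a, [2, 0, 1])
x = [0.5, -1.0]

y_ref = forward(*model_a, x)
naive = average(model_a, model_b)
aligned = average(model_a, align(model_a[0], *model_b))

err_naive = abs(forward(*naive, x) - y_ref)      # averaging mismatched units
err_aligned = abs(forward(*aligned, x) - y_ref)  # averaging after alignment
print(err_naive, err_aligned)
```

Brute-force search scales factorially with the number of hidden units, so it is used here only to keep the example dependency-free; a practical implementation would need a scalable matching procedure.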
Related papers
- Fed-EC: Bandwidth-Efficient Clustering-Based Federated Learning For Autonomous Visual Robot Navigation [7.8839937556789375]
Federated-EmbedCluster (Fed-EC) is a clustering-based federated learning framework deployed with vision-based autonomous robot navigation in diverse outdoor environments.
Fed-EC reduces the communication size by 23x for each robot while matching the performance of centralized learning for goal-oriented navigation and outperforms local learning.
(2024-11-06T18:44:09Z)
- Efficient Data Collection for Robotic Manipulation via Compositional Generalization [70.76782930312746]
We show that policies can compose environmental factors from their data to succeed when encountering unseen factor combinations.
We propose better in-domain data collection strategies that exploit composition.
We provide videos at http://iliad.stanford.edu/robot-data-comp/.
(2024-03-08T07:15:38Z)
- PoCo: Policy Composition from and for Heterogeneous Robot Learning [44.1315170137613]
Current methods usually collect and pool all data from one domain to train a single policy.
We present a flexible approach, dubbed Policy Composition, to combine information across diverse modalities and domains.
Our method can use task-level composition for multi-task manipulation and be composed with analytic cost functions to adapt policy behaviors at inference time.
(2024-02-04T14:51:49Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
(2023-10-23T17:50:08Z)
- FedGradNorm: Personalized Federated Gradient-Normalized Multi-Task Learning [50.756991828015316]
Multi-task learning (MTL) is a novel framework to learn several tasks simultaneously with a single shared network.
We propose FedGradNorm, which uses a dynamic-weighting method to normalize gradient norms in order to balance learning speeds among different tasks.
(2022-03-24T17:43:12Z)
- On Addressing Heterogeneity in Federated Learning for Autonomous Vehicles Connected to a Drone Orchestrator [32.61132332561498]
We envision a federated learning (FL) scenario in service of amending the performance of autonomous road vehicles.
We focus on the issue of accelerating the learning of a particular class of critical object (CO) that may harm the nominal operation of an autonomous vehicle.
(2021-08-05T16:25:48Z)
- Scalable Multi-Robot System for Non-myopic Spatial Sampling [9.37678298330157]
This paper presents a scalable distributed multi-robot planning algorithm for non-uniform sampling of spatial fields.
We analyze the effect of communication between multiple robots, acting independently, on the overall sampling performance of the team.
(2021-05-20T20:30:10Z)
- Learning Connectivity for Data Distribution in Robot Teams [96.39864514115136]
We propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNN).
Our approach enables multi-agent algorithms based on global state information to function by ensuring it is available at each robot.
We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function and show that it improves training stability compared to task-specific reward functions.
(2021-03-08T21:48:55Z)
- Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms [60.59764170868101]
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform.
We formulate it as a few-shot meta-learning problem where the goal is to find a model that captures the common structure shared across different robotic platforms.
We experimentally evaluate our framework on a simulated reaching and a real-robot picking task using 400 simulated robots.
(2021-03-05T14:16:20Z)