MAMBPO: Sample-efficient multi-robot reinforcement learning using
learned world models
- URL: http://arxiv.org/abs/2103.03662v1
- Date: Fri, 5 Mar 2021 13:37:23 GMT
- Title: MAMBPO: Sample-efficient multi-robot reinforcement learning using
learned world models
- Authors: Daniël Willemsen, Mario Coppola and Guido C.H.E. de Croon
- Abstract summary: Multi-robot systems can benefit from reinforcement learning (RL) algorithms that learn behaviours in a small number of trials.
We present a novel multi-agent model-based RL algorithm: Multi-Agent Model-Based Policy Optimization (MAMBPO).
- Score: 4.84279798426797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-robot systems can benefit from reinforcement learning (RL) algorithms
that learn behaviours in a small number of trials, a property known as sample
efficiency. This research thus investigates the use of learned world models to
improve sample efficiency. We present a novel multi-agent model-based RL
algorithm: Multi-Agent Model-Based Policy Optimization (MAMBPO), utilizing the
Centralized Learning for Decentralized Execution (CLDE) framework. CLDE
algorithms allow a group of agents to act in a fully decentralized manner after
training. This is a desirable property for many systems comprising multiple
robots. MAMBPO uses a learned world model to improve sample efficiency compared
to model-free Multi-Agent Soft Actor-Critic (MASAC). We demonstrate this on two
simulated multi-robot tasks, where MAMBPO achieves performance similar to
MASAC, but requires far fewer samples to do so. Through this, we take an
important step towards making real-life learning for multi-robot systems
possible.
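The abstract's recipe lends itself to a Dyna-style reading: train a world model on real transitions, branch short imagined rollouts from real states, and feed both real and imagined data to a soft actor-critic learner with a centralized critic and decentralized per-agent actors. The sketch below illustrates that loop only; the toy linear model, the placeholder policy, and the names (LinearWorldModel, decentralized_policy, masac_update) are assumptions for illustration, not the authors' implementation.

```python
# Minimal Dyna-style sketch of the MAMBPO idea (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
OBS, ACT, N_AGENTS = 4, 2, 2  # per-agent sizes (assumed for the toy)

class LinearWorldModel:
    """Toy learned dynamics: predicts the next joint observation from
    the joint observation and joint action by ridge regression."""
    def fit(self, obs, act, next_obs):
        X = np.hstack([obs, act])
        self.W = np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]),
                                 X.T @ next_obs)

    def rollout(self, obs, policy, horizon=3):
        """Short imagined rollouts branched from a real state."""
        imagined = []
        for _ in range(horizon):
            act = policy(obs)
            next_obs = np.hstack([obs, act]) @ self.W
            imagined.append((obs, act, next_obs))
            obs = next_obs
        return imagined

def decentralized_policy(joint_obs):
    """Placeholder actors: agent i maps only its own observation slice
    to its own action (decentralized execution)."""
    acts = [np.tanh(joint_obs[:, i*OBS:(i+1)*OBS][:, :ACT])
            for i in range(N_AGENTS)]
    return np.hstack(acts)

real_buffer, model_buffer = [], []
for step in range(200):
    obs = rng.normal(size=(1, N_AGENTS * OBS))
    act = decentralized_policy(obs)
    next_obs = obs + 0.1 * rng.normal(size=obs.shape)  # stand-in environment
    real_buffer.append((obs, act, next_obs))
    if step >= 20 and step % 20 == 0:
        o, a, no = (np.vstack(cols) for cols in zip(*real_buffer))
        model = LinearWorldModel()
        model.fit(o, a, no)
        model_buffer += model.rollout(obs, decentralized_policy)
        # masac_update(real_buffer + model_buffer)  # hypothetical learner:
        # centralized critic over joint obs/actions, per-agent actors

print(f"real transitions: {len(real_buffer)}, imagined: {len(model_buffer)}")
```

Keeping imagined rollouts short and branching them from real states is the standard way such methods limit compounding model error; the commented masac_update stands in for the multi-agent soft actor-critic updates the paper compares against.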
Related papers
- Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models [106.94827590977337]
We propose a novel world model for Multi-Agent RL (MARL) that learns decentralized local dynamics for scalability.
We also introduce a Perceiver Transformer to enable centralized representation aggregation.
Results on the StarCraft Multi-Agent Challenge (SMAC) show that it outperforms strong model-free approaches and existing model-based methods in both sample efficiency and overall performance.
arXiv Detail & Related papers (2024-06-22T12:40:03Z)
- Efficient Multi-agent Reinforcement Learning by Planning [33.51282615335009]
Multi-agent reinforcement learning (MARL) algorithms have accomplished remarkable breakthroughs in solving large-scale decision-making tasks.
Most existing MARL algorithms are model-free, limiting sample efficiency and hindering their applicability in more challenging scenarios.
We propose the MAZero algorithm, which combines a centralized model with Monte Carlo Tree Search (MCTS) for policy search; a toy sketch of this plan-inside-a-model pattern appears after this list.
arXiv Detail & Related papers (2024-05-20T04:36:02Z)
- Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation [8.940998315746684]
We propose a model-based reinforcement learning (RL) approach for robotic arm end-tasks.
We employ Bayesian neural network models to represent, probabilistically, both the belief and the information encoded in the dynamics model during exploration.
Our experiments show the advantages of our Bayesian model-based RL approach, with results of similar quality to relevant alternatives.
arXiv Detail & Related papers (2024-04-02T11:44:37Z)
- Physics-informed reinforcement learning via probabilistic co-adjustment functions [3.6787556334630334]
We introduce co-kriging adjustments (CKA) and ridge regression adjustment (RRA) as novel ways to combine the advantages of both approaches.
Our adjustment methods are based on an autoregressive AR(1) co-kriging model that we integrate with GP priors.
arXiv Detail & Related papers (2023-09-11T12:10:19Z)
- SAM-RL: Sensing-Aware Model-Based Reinforcement Learning via Differentiable Physics-Based Simulation and Rendering [49.78647219715034]
We propose a sensing-aware model-based reinforcement learning system called SAM-RL.
With the sensing-aware learning pipeline, SAM-RL allows a robot to select an informative viewpoint to monitor the task process.
We apply our framework to real world experiments for accomplishing three manipulation tasks: robotic assembly, tool manipulation, and deformable object manipulation.
arXiv Detail & Related papers (2022-10-27T05:30:43Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting promising trajectories that solve prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning [61.28547338576706]
Population-based multi-agent reinforcement learning (PB-MARL) refers to the family of methods that nest reinforcement learning (RL) algorithms within population-based training.
We present MALib, a scalable and efficient computing framework for PB-MARL.
arXiv Detail & Related papers (2021-06-05T03:27:08Z)
- Energy-Efficient and Federated Meta-Learning via Projected Stochastic Gradient Ascent [79.58680275615752]
We propose an energy-efficient federated meta-learning framework.
We assume each task is owned by a separate agent, so a limited number of tasks is used to train a meta-model.
arXiv Detail & Related papers (2021-05-31T08:15:44Z)
- Fast Online Adaptation in Robotics through Meta-Learning Embeddings of Simulated Priors [3.4376560669160385]
In the real world, a robot might encounter any situation, from motor failures to finding itself on rocky terrain.
We show that FAMLE allows the robots to adapt to novel damages in significantly fewer time-steps than the baselines.
arXiv Detail & Related papers (2020-03-10T12:37:52Z)
- Ready Policy One: World Building Through Active Learning [35.358315617358976]
We introduce Ready Policy One (RP1), a framework that views Model-Based Reinforcement Learning as an active learning problem.
RP1 achieves this by utilizing a hybrid objective function, which crucially adapts during optimization.
We rigorously evaluate our method on a variety of continuous control tasks, and demonstrate statistically significant gains over existing approaches.
arXiv Detail & Related papers (2020-02-07T09:57:53Z)
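As a rough illustration of the plan-inside-a-model pattern named in the MAZero entry above (not MAZero itself, which uses learned networks and a far richer search), the following toy runs a UCT-style Monte Carlo Tree Search against a hand-coded stand-in for the centralized model; all dynamics, rewards, and names are assumptions.

```python
# Toy UCT search inside a stand-in "learned" model (illustrative only).
import math, random

random.seed(0)
ACTIONS = (-1, +1)
GOAL, HORIZON = 3, 6

def model_step(state, action):
    """Stand-in centralized model: deterministic walk on the integers,
    with reward 1 whenever the goal state is entered."""
    nxt = state + action
    return nxt, (1.0 if nxt == GOAL else 0.0)

class Node:
    def __init__(self, state, depth):
        self.state, self.depth = state, depth
        self.children = {}        # action -> child Node
        self.n, self.q = 0, 0.0   # visit count, mean return

def select_action(node, c=1.4):
    def score(a):
        child = node.children.get(a)
        if child is None or child.n == 0:
            return float("inf")   # try each action at least once
        return child.q + c * math.sqrt(math.log(node.n) / child.n)
    return max(ACTIONS, key=score)

def simulate(node):
    """One search iteration: descend by UCB, expand, back up returns."""
    if node.depth == HORIZON:
        return 0.0
    a = select_action(node)
    nxt, r = model_step(node.state, a)
    child = node.children.setdefault(a, Node(nxt, node.depth + 1))
    ret = r + simulate(child)
    child.n += 1
    child.q += (ret - child.q) / child.n
    node.n += 1
    return ret

def plan(root_state, iterations=300):
    root = Node(root_state, 0)
    for _ in range(iterations):
        simulate(root)
    return max(ACTIONS, key=lambda a: root.children[a].n)

print(plan(0))  # prints 1: stepping toward GOAL maximizes return
```

Because every "environment" interaction here happens inside the model, search effort substitutes for real samples, which is the sample-efficiency argument such model-plus-planning methods share with MAMBPO.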