Representation Learning For Efficient Deep Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2406.02890v1
- Date: Wed, 5 Jun 2024 03:11:44 GMT
- Title: Representation Learning For Efficient Deep Multi-Agent Reinforcement Learning
- Authors: Dom Huh, Prasant Mohapatra,
- Abstract summary: We present MAPO-LSO which applies a form of comprehensive representation learning devised to supplement MARL training.
Specifically, MAPO-LSO proposes a multi-agent extension of transition dynamics reconstruction and self-predictive learning.
Empirical results demonstrate MAPO-LSO to show notable improvements in sample efficiency and learning performance compared to its vanilla MARL counterpart.
- Score: 10.186029242664931
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sample efficiency remains a key challenge in multi-agent reinforcement learning (MARL). A promising approach is to learn a meaningful latent representation space through auxiliary learning objectives alongside the MARL objective to aid in learning a successful control policy. In our work, we present MAPO-LSO (Multi-Agent Policy Optimization with Latent Space Optimization) which applies a form of comprehensive representation learning devised to supplement MARL training. Specifically, MAPO-LSO proposes a multi-agent extension of transition dynamics reconstruction and self-predictive learning that constructs a latent state optimization scheme that can be trivially extended to current state-of-the-art MARL algorithms. Empirical results demonstrate MAPO-LSO to show notable improvements in sample efficiency and learning performance compared to its vanilla MARL counterpart without any additional MARL hyperparameter tuning on a diverse suite of MARL tasks.
Related papers
- EVOLvE: Evaluating and Optimizing LLMs For Exploration [76.66831821738927]
Large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty.
We measure LLMs' (in)ability to make optimal decisions in bandits, a state-less reinforcement learning setting relevant to many applications.
Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs.
arXiv Detail & Related papers (2024-10-08T17:54:03Z) - Multi-agent Reinforcement Learning for Dynamic Dispatching in Material Handling Systems [5.050348337816326]
This paper proposes a multi-agent reinforcement learning (MARL) approach to learn dynamic dispatching strategies.
To benchmark our method, we developed a material handling environment that reflects the complexities of an actual system.
arXiv Detail & Related papers (2024-09-27T03:57:54Z) - CoMMIT: Coordinated Instruction Tuning for Multimodal Large Language Models [68.64605538559312]
In this paper, we analyze the MLLM instruction tuning from both theoretical and empirical perspectives.
Inspired by our findings, we propose a measurement to quantitatively evaluate the learning balance.
In addition, we introduce an auxiliary loss regularization method to promote updating of the generation distribution of MLLMs.
arXiv Detail & Related papers (2024-07-29T23:18:55Z) - Meta Reasoning for Large Language Models [58.87183757029041]
We introduce Meta-Reasoning Prompting (MRP), a novel and efficient system prompting method for large language models (LLMs)
MRP guides LLMs to dynamically select and apply different reasoning methods based on the specific requirements of each task.
We evaluate the effectiveness of MRP through comprehensive benchmarks.
arXiv Detail & Related papers (2024-06-17T16:14:11Z) - Let's reward step by step: Step-Level reward model as the Navigators for
Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase.
We propose a greedy search algorithm that employs the step-level feedback from PRM to optimize the reasoning pathways explored by LLMs.
To explore the versatility of our approach, we develop a novel method to automatically generate step-level reward dataset for coding tasks and observed similar improved performance in the code generation tasks.
arXiv Detail & Related papers (2023-10-16T05:21:50Z) - ESP: Exploiting Symmetry Prior for Multi-Agent Reinforcement Learning [22.733348449818838]
Multi-agent reinforcement learning (MARL) has achieved promising results in recent years.
This paper proposes a framework for exploiting prior knowledge by integrating data augmentation and a well-designed consistency loss.
arXiv Detail & Related papers (2023-07-30T09:49:05Z) - MA2CL:Masked Attentive Contrastive Learning for Multi-Agent
Reinforcement Learning [128.19212716007794]
We propose an effective framework called textbfMulti-textbfAgent textbfMasked textbfAttentive textbfContrastive textbfLearning (MA2CL)
MA2CL encourages learning representation to be both temporal and agent-level predictive by reconstructing the masked agent observation in latent space.
Our method significantly improves the performance and sample efficiency of different MARL algorithms and outperforms other methods in various vision-based and state-based scenarios.
arXiv Detail & Related papers (2023-06-03T05:32:19Z) - Semantically Aligned Task Decomposition in Multi-Agent Reinforcement
Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA)
SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z) - Model-based Multi-agent Reinforcement Learning: Recent Progress and
Prospects [23.347535672670688]
Multi-Agent Reinforcement Learning (MARL) tackles sequential decision-making problems involving multiple participants.
MARL requires a tremendous number of samples for effective training.
Model-based methods have been shown to achieve provable advantages of sample efficiency.
arXiv Detail & Related papers (2022-03-20T17:24:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.