Remember and Forget Experience Replay for Multi-Agent Reinforcement
Learning
- URL: http://arxiv.org/abs/2203.13319v1
- Date: Thu, 24 Mar 2022 19:59:43 GMT
- Title: Remember and Forget Experience Replay for Multi-Agent Reinforcement
Learning
- Authors: Pascal Weber, Daniel Wälchli, Mustafa Zeqiri, Petros Koumoutsakos
- Abstract summary: We present the extension of the Remember and Forget for Experience Replay (ReF-ER) algorithm to Multi-Agent Reinforcement Learning (MARL).
ReF-ER was shown to outperform state-of-the-art algorithms for continuous control in problems ranging from the OpenAI Gym to complex fluid flows.
We find that employing a single feed-forward neural network for both the policy and the value function in ReF-ER MARL outperforms state-of-the-art algorithms that rely on complex neural network architectures.
- Score: 3.06414751922655
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present the extension of the Remember and Forget for Experience Replay
(ReF-ER) algorithm to Multi-Agent Reinforcement Learning (MARL). ReF-ER was
shown to outperform state-of-the-art algorithms for continuous control in
problems ranging from the OpenAI Gym to complex fluid flows. In MARL, the
dependencies between the agents are included in the state-value estimator and
the environment dynamics are modeled via the importance weights used by ReF-ER.
In collaborative environments, we find the best performance when the value is
estimated from individual rewards and the effects of the other agents' actions
on the transition map are ignored. We benchmark the performance of ReF-ER MARL
on the Stanford Intelligent Systems Laboratory (SISL) environments. We find
that employing a single feed-forward neural network for both the policy and the
value function in ReF-ER MARL outperforms state-of-the-art algorithms that rely
on complex neural network architectures.
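
To make the replay mechanism concrete, below is a minimal numpy sketch of the importance-weight filtering at the core of ReF-ER: replayed samples whose importance weight leaves a trust region (1/c, c) are classified as far-policy and excluded from the gradient ("forgotten"). The cut-off value, its annealing schedule, and the penalty term that keeps the policy close to the replayed behaviors are simplified, and the function names are ours.

```python
import numpy as np

def ref_er_mask(log_pi, log_mu, c=4.0):
    """Classify replayed samples as near- or far-policy (ReF-ER-style rule).

    log_pi: log-probability of the stored action under the current policy
    log_mu: log-probability under the behavior policy that generated it
    Only near-policy samples (1/c < rho < c) contribute gradients.
    """
    rho = np.exp(log_pi - log_mu)          # importance weight per sample
    near = (rho > 1.0 / c) & (rho < c)     # "remember" these, "forget" the rest
    return near, rho

# Hypothetical mini-batch drawn from a replay buffer:
rng = np.random.default_rng(0)
log_pi = rng.normal(0.0, 1.0, size=8)
log_mu = rng.normal(0.0, 1.0, size=8)
mask, rho = ref_er_mask(log_pi, log_mu)
print(np.round(rho, 2), mask)              # far-policy samples are masked out
```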
Related papers
- Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank [52.831993899183416]
We introduce a structural assumption -- the interaction rank -- and establish that functions with low interaction rank are significantly more robust to distribution shift than general ones.
We demonstrate that utilizing function classes with low interaction rank, when combined with regularization and no-regret learning, admits decentralized, computationally and statistically efficient learning in offline MARL.
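
For context, one plausible way to formalize a low-interaction-rank assumption is as an additive decomposition of a joint-agent function into terms that each touch at most k agents; the notation below is illustrative, not taken from the paper.

```latex
% Hedged sketch: "interaction rank at most k" for a function of
% m agents' actions a_1, ..., a_m (notation is ours).
\[
  f(a_1,\dots,a_m) \;=\; \sum_{\substack{S \subseteq \{1,\dots,m\} \\ |S| \le k}} g_S(a_S),
\]
% k = 1 recovers a fully decentralized (additive) function; larger k
% allows higher-order agent interactions at the cost of robustness.
```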
arXiv Detail & Related papers (2024-10-01T22:16:22Z)
- Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training [38.03693752287459]
Multi-agent Reinforcement Learning (MARL) relies on neural networks with numerous parameters in multi-agent scenarios.
This paper proposes applying dynamic sparse training (DST), a technique proven effective in deep supervised learning tasks.
We introduce an innovative Multi-Agent Sparse Training (MAST) framework aimed at simultaneously enhancing the reliability of learning targets and the rationality of sample distribution.
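
As a rough illustration of the prune-and-regrow loop that dynamic sparse training revolves around (the drop/grow criteria differ across DST methods, and this sketch is our own, not MAST's):

```python
import numpy as np

def dst_step(w, regrow_frac=0.1, rng=None):
    """One prune-and-regrow update in the spirit of dynamic sparse training.

    Drops the smallest-magnitude active weights, then re-activates an equal
    number of inactive connections, so the overall sparsity stays fixed.
    """
    rng = rng or np.random.default_rng()
    active_idx = np.flatnonzero(w)
    n_swap = int(regrow_frac * active_idx.size)

    # Prune: zero the n_swap active weights with the smallest magnitude.
    drop = active_idx[np.argsort(np.abs(w.flat[active_idx]))[:n_swap]]
    w.flat[drop] = 0.0

    # Regrow: re-activate n_swap randomly chosen inactive connections.
    grow = rng.choice(np.flatnonzero(w == 0.0), size=n_swap, replace=False)
    w.flat[grow] = 1e-3 * rng.standard_normal(n_swap)
    return w

# Hypothetical usage: a 90%-sparse weight matrix, one DST update.
rng = np.random.default_rng(3)
w = rng.standard_normal((32, 32))
w.flat[rng.choice(w.size, size=int(0.9 * w.size), replace=False)] = 0.0
w = dst_step(w, rng=rng)
print(round(float((w != 0).mean()), 3))  # overall sparsity is preserved
```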
arXiv Detail & Related papers (2024-09-28T15:57:24Z)
- FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves performance comparable to the source model, retaining up to 85% of its performance while obtaining over a 30% increase in inference speed.
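
A toy sketch of the general idea, factorizing a dense feed-forward block into expert slices selected by a router; the shapes, the top-1 routing, and all names here are illustrative assumptions rather than FactorLLM's actual decomposition:

```python
import numpy as np

rng = np.random.default_rng(1)
d, h, n_experts = 16, 64, 4

# A dense two-layer FFN whose hidden units are viewed as n_experts
# column blocks: each "expert" owns h // n_experts hidden units.
W1 = rng.standard_normal((d, h))
W2 = rng.standard_normal((h, d))
router = rng.standard_normal((d, n_experts))    # hypothetical learned router

def moe_ffn(x):
    """Route the input to its top-scoring expert sub-network only."""
    expert = int(np.argmax(x @ router))          # top-1 routing
    block = slice(expert * h // n_experts, (expert + 1) * h // n_experts)
    hidden = np.maximum(x @ W1[:, block], 0.0)   # ReLU on the expert's slice
    return hidden @ W2[block, :]

x = rng.standard_normal(d)
print(moe_ffn(x).shape)  # (16,): computed with one expert's weights only
```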
arXiv Detail & Related papers (2024-08-15T16:45:16Z)
- Uncovering cognitive taskonomy through transfer learning in masked autoencoder-based fMRI reconstruction [6.3348067441225915]
We employ the masked autoencoder (MAE) model to reconstruct functional magnetic resonance imaging (fMRI) data.
Our study suggests that fMRI reconstruction with the MAE model can uncover latent representations.
arXiv Detail & Related papers (2024-05-24T09:29:16Z)
- MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning [128.19212716007794]
We propose an effective framework called Multi-Agent Masked Attentive Contrastive Learning (MA2CL).
MA2CL encourages the learned representations to be both temporally and agent-level predictive by reconstructing masked agent observations in latent space.
Our method significantly improves the performance and sample efficiency of different MARL algorithms and outperforms other methods in various vision-based and state-based scenarios.
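
A minimal sketch of the agent-level masking step described above; the attentive reconstruction and the contrastive objective of MA2CL are omitted, and the toy linear encoder is our own stand-in:

```python
import numpy as np

rng = np.random.default_rng(2)
n_agents, obs_dim, latent_dim = 4, 10, 8
W_enc = rng.standard_normal((obs_dim, latent_dim))  # toy linear "encoder"

def masked_agent_latents(obs, mask_prob=0.5):
    """Mask entire agents' observations, then encode each agent.

    A training objective (omitted here) would reconstruct or contrast the
    masked agents' latents from the unmasked agents and past timesteps.
    """
    mask = rng.random(n_agents) < mask_prob     # which agents get hidden
    obs_in = np.where(mask[:, None], 0.0, obs)  # zero out masked agents
    return obs_in @ W_enc, mask                 # per-agent latent vectors

obs = rng.standard_normal((n_agents, obs_dim))
z, mask = masked_agent_latents(obs)
print(mask, z.shape)  # (4, 8): one latent per agent; masked ones to predict
```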
arXiv Detail & Related papers (2023-06-03T05:32:19Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
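
For intuition, hallucinated model-based exploration is commonly formalized by letting the agent pick any dynamics inside a calibrated confidence band; the following formulation is a generic sketch of this style, not necessarily H-MARL's exact construction.

```latex
% Hedged sketch: a calibrated model gives mean mu and confidence Sigma;
% the agent may "hallucinate" any dynamics inside the confidence band,
\[
  \tilde{f}(s, a) \;=\; \mu(s, a) + \beta\, \Sigma(s, a)\, \eta(s, a),
  \qquad \eta(s, a) \in [-1, 1]^{d},
\]
% and equilibrium policies are computed against an optimistic choice
% of eta before acting in the true environment.
```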
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Mask-based Latent Reconstruction for Reinforcement Learning [58.43247393611453]
Mask-based Latent Reconstruction (MLR) is proposed to predict the complete state representations in the latent space from the observations with spatially and temporally masked pixels.
Extensive experiments show that our MLR significantly improves the sample efficiency in deep reinforcement learning.
arXiv Detail & Related papers (2022-01-28T13:07:11Z)
- MHER: Model-based Hindsight Experience Replay [33.00149668905828]
We propose Model-based Hindsight Experience Replay (MHER) to solve multi-goal reinforcement learning problems.
Replacing original goals with virtual goals generated from interaction with a trained dynamics model yields a novel relabeling method.
MHER exploits experiences more efficiently by leveraging environmental dynamics to generate virtual achieved goals.
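
Below is a hedged sketch of this relabeling idea: roll a learned dynamics model forward from a stored transition under the current policy and treat the imagined end state as a virtual goal. The dynamics model, policy, and reward function here are toy stand-ins:

```python
import numpy as np

def reward_fn(achieved, goal, tol=0.05):
    """Sparse goal-reaching reward: 0 if the goal is met, else -1."""
    return 0.0 if np.linalg.norm(achieved - goal) < tol else -1.0

def model_relabel(s, a, dynamics, policy, k=3):
    """Roll a learned dynamics model forward from (s, a) under the current
    policy; the state reached after k imagined steps becomes a virtual goal."""
    for _ in range(k):
        s = dynamics(s, a)
        a = policy(s)
    return s  # virtual achieved goal

# Toy stand-ins for the trained dynamics model and current policy:
dyn = lambda s, a: s + 0.1 * a
pol = lambda s: -0.5 * s
s0, a0 = np.ones(3), np.zeros(3)
virtual_goal = model_relabel(s0, a0, dyn, pol)
# Recompute the stored transition's reward against the virtual goal:
print(reward_fn(dyn(s0, a0), virtual_goal))
```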
arXiv Detail & Related papers (2021-07-01T08:52:45Z)
- Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of the replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all other solutions in the well-known MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
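
A minimal sketch of a replay buffer in this spirit, mixing demonstration and agent experience while annealing the demonstration share so that low-quality demos are gradually "forgotten"; the class and its parameters are our own illustration, not the paper's implementation:

```python
import random
from collections import deque

class ForgetfulDemoBuffer:
    """Replay buffer holding demonstration and agent experience separately."""

    def __init__(self, capacity=10000):
        self.demo = deque(maxlen=capacity)
        self.agent = deque(maxlen=capacity)

    def add(self, transition, from_demo=False):
        (self.demo if from_demo else self.agent).append(transition)

    def sample(self, batch_size, demo_ratio):
        # demo_ratio is annealed toward 0 by the caller over training,
        # so low-quality demonstrations stop dominating the batches.
        n_demo = min(int(batch_size * demo_ratio), len(self.demo))
        batch = random.sample(self.demo, n_demo)
        batch += random.sample(self.agent, min(batch_size - n_demo, len(self.agent)))
        return batch

buf = ForgetfulDemoBuffer()
for t in range(100):
    buf.add(("demo", t), from_demo=True)
    buf.add(("agent", t))
print(len(buf.sample(8, demo_ratio=0.25)))  # 2 demo + 6 agent transitions
```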
arXiv Detail & Related papers (2020-06-17T15:38:40Z)