Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2003.10423v1
- Date: Mon, 23 Mar 2020 17:49:39 GMT
- Title: Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning
- Authors: Qian Long, Zihan Zhou, Abhinav Gupta, Fei Fang, Yi Wu, Xiaolong Wang
- Abstract summary: Evolutionary Population Curriculum scales up Multi-Agent Reinforcement Learning (MARL) by progressively increasing the population of training agents in a stage-wise manner.
We implement EPC on a popular MARL algorithm, MADDPG, and empirically show that our approach consistently outperforms baselines by a large margin as the number of agents grows exponentially.
- Score: 37.22210622432453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multi-agent games, the complexity of the environment can grow
exponentially as the number of agents increases, so it is particularly
challenging to learn good policies when the agent population is large. In this
paper, we introduce Evolutionary Population Curriculum (EPC), a curriculum
learning paradigm that scales up Multi-Agent Reinforcement Learning (MARL) by
progressively increasing the population of training agents in a stage-wise
manner. Furthermore, EPC uses an evolutionary approach to fix an objective
misalignment issue throughout the curriculum: agents successfully trained in an
early stage with a small population are not necessarily the best candidates for
adapting to later stages with scaled populations. Concretely, EPC maintains
multiple sets of agents in each stage, performs mix-and-match and fine-tuning
over these sets and promotes the sets of agents with the best adaptability to
the next stage. We implement EPC on a popular MARL algorithm, MADDPG, and
empirically show that our approach consistently outperforms baselines by a
large margin as the number of agents grows exponentially.
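The stage-wise loop in the abstract (maintain several agent sets, mix-and-match, fine-tune, promote the best adapters) can be made concrete with a short sketch. The following is a minimal, hypothetical rendering: `Agent`, `fine_tune`, `evaluate`, and all population sizes are invented placeholders, whereas the actual EPC fine-tunes MADDPG agents whose policies are designed to carry over across population sizes.

```python
"""Minimal, hypothetical sketch of EPC's stage-wise curriculum loop."""
import copy
import itertools
import random

class Agent:
    def __init__(self):
        self.skill = random.random()  # stand-in for learned parameters

def fine_tune(agent_set):
    # Placeholder for MADDPG fine-tuning in the scaled environment.
    for a in agent_set:
        a.skill += random.uniform(0.0, 0.1)

def evaluate(agent_set):
    # Placeholder for the set's average episode reward.
    return sum(a.skill for a in agent_set) / len(agent_set)

def epc(num_stages=3, init_pop=2, num_sets=3):
    # Stage 0: several independently trained sets of init_pop agents.
    sets = [[Agent() for _ in range(init_pop)] for _ in range(num_sets)]
    for stage in range(1, num_stages):
        # Mix-and-match: merge pairs of sets, doubling the population.
        candidates = [copy.deepcopy(s1 + s2)
                      for s1, s2 in itertools.combinations(sets, 2)]
        for c in candidates:
            fine_tune(c)  # adapt each merged set to the larger game
        # Promotion: keep the sets that adapt best to the new stage.
        sets = sorted(candidates, key=evaluate, reverse=True)[:num_sets]
        print(f"stage {stage}: population {len(sets[0])}, "
              f"best score {evaluate(sets[0]):.3f}")
    return sets[0]

if __name__ == "__main__":
    epc()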
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
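As a loose illustration of why a step-wise reward can help, the toy below contrasts a sparse episode-level signal with a dense per-step one; the `expert_action` oracle and the reward values are invented for this sketch and are not StepAgent's actual formulation.

```python
def expert_action(state):
    return state % 2  # hypothetical expert: parity of the state

def episode_reward(states, actions):
    # Sparse, outcome-level signal: 1 only if the whole episode matches.
    return 1.0 if all(a == expert_action(s)
                      for s, a in zip(states, actions)) else 0.0

def stepwise_rewards(states, actions):
    # Dense signal: one reward per step, easing credit assignment.
    return [1.0 if a == expert_action(s) else 0.0
            for s, a in zip(states, actions)]

states = [0, 1, 2, 3]
actions = [0, 1, 1, 1]  # one mistake at the third step
print(episode_reward(states, actions))   # 0.0 -- whole episode penalized
print(stepwise_rewards(states, actions)) # [1.0, 1.0, 0.0, 1.0]
```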
- Evolution with Opponent-Learning Awareness [10.689403855269704]
We show how large heterogeneous populations of learning agents evolve in normal-form games.
We derive a fast, parallelizable implementation of Opponent-Learning Awareness tailored for evolutionary simulations.
We demonstrate our approach in simulations of 200,000 agents, evolving in the classic games of Hawk-Dove, Stag-Hunt, and Rock-Paper-Scissors.
arXiv Detail & Related papers (2024-10-22T22:49:04Z)
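The population-scale setting above is straightforward to sketch. The toy below runs plain fitness-proportional selection in the Hawk-Dove game at the paper's 200,000-agent scale; Opponent-Learning Awareness itself (modeling how opponents learn) is omitted, and the payoff constants V and C are standard textbook choices, not values from the paper.

```python
import numpy as np

V, C = 2.0, 3.0  # assumed payoff constants (resource value, fight cost)
PAYOFF = np.array([[(V - C) / 2, V],       # Hawk vs Hawk, Hawk vs Dove
                   [0.0,         V / 2]])  # Dove vs Hawk, Dove vs Dove

def evolve(pop_size=200_000, steps=50, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=pop_size)  # 0 = Hawk, 1 = Dove
    for _ in range(steps):
        opponents = rng.permutation(pop)     # random pairwise matching
        fitness = PAYOFF[pop, opponents]
        probs = fitness - fitness.min() + 1e-9  # shift to non-negative
        probs /= probs.sum()
        # Next generation: reproduce proportionally to fitness.
        pop = rng.choice(pop, size=pop_size, p=probs)
    return pop.mean()

print(f"Dove share after evolution: {evolve():.3f}")
```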
- EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms [55.77492625524141]
EvoAgent is a generic method to automatically extend expert agents to multi-agent systems via evolutionary algorithms.
We show that EvoAgent can automatically generate multiple expert agents and significantly enhance the task-solving capabilities of LLM-based agents.
arXiv Detail & Related papers (2024-06-20T11:49:23Z)
- Agent Alignment in Evolving Social Norms [65.45423591744434]
We propose an evolutionary framework for agent evolution and alignment, named EvolutionaryAgent.
In an environment where social norms continuously evolve, agents better adapted to the current social norms will have a higher probability of survival and proliferation.
We show that EvolutionaryAgent can align progressively better with the evolving social norms while maintaining its proficiency in general tasks.
arXiv Detail & Related papers (2024-01-09T15:44:44Z)
- Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization [63.554226552130054]
Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL).
The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario.
We present the Level of Influence (LoI), a metric quantifying the interaction intensity among agents within a given scenario and environment.
arXiv Detail & Related papers (2023-10-11T06:09:26Z)
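The exact LoI definition isn't given in the summary above, but one plausible instantiation is the gap in a focal agent's return when its co-players are swapped from training partners to unseen ones. The toy `mean_return` environment below is invented so the sketch runs end to end.

```python
import random

def mean_return(focal_skill, partner_skills, episodes=1000,
                rng=random.Random(0)):
    # Toy rollout: return depends on focal skill plus partner effects.
    total = 0.0
    for _ in range(episodes):
        partner = rng.choice(partner_skills)
        total += focal_skill + 0.5 * partner + rng.gauss(0, 0.1)
    return total / episodes

def level_of_influence(focal_skill, train_partners, unseen_partners):
    gap = abs(mean_return(focal_skill, train_partners)
              - mean_return(focal_skill, unseen_partners))
    return gap  # large gap => co-player interaction matters a lot

print(level_of_influence(1.0, train_partners=[0.9, 1.1],
                         unseen_partners=[0.2, 0.4]))
```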
- Decentralized Adaptive Formation via Consensus-Oriented Multi-Agent Communication [9.216867817261493]
We propose a novel Consensus-based Decentralized Adaptive Formation (Cons-DecAF) framework.
Specifically, we develop a novel multi-agent reinforcement learning method, Consensus-oriented Multi-Agent Communication (ConsMAC).
Instead of pre-assigning specific positions to agents, we employ a displacement-based formation measured by the Hausdorff distance to significantly improve formation efficiency.
arXiv Detail & Related papers (2023-07-23T10:41:17Z)
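The displacement-based formation objective above admits a compact sketch: center both point sets (so only relative displacements matter, not absolute positions) and compute the Hausdorff distance between the achieved and target formations. The centering step is our reading of "displacement-based", not code from the paper.

```python
import numpy as np

def hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    # Symmetric Hausdorff distance via the pairwise distance matrix.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def formation_distance(agents: np.ndarray, target: np.ndarray) -> float:
    # Center each set so only relative displacements are compared.
    return hausdorff(agents - agents.mean(axis=0),
                     target - target.mean(axis=0))

agents = np.array([[0.0, 0.0], [1.0, 0.1], [0.5, 0.9]])
target = np.array([[5.0, 5.0], [6.0, 5.0], [5.5, 6.0]])  # same shape, shifted
print(formation_distance(agents, target))  # small: formation nearly achieved
```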
- Supplementing Gradient-Based Reinforcement Learning with Simple Evolutionary Ideas [4.873362301533824]
We present a simple, sample-efficient algorithm for introducing large but directed learning steps in reinforcement learning (RL).
The methodology uses a population of RL agents training with a common experience buffer, with occasional crossovers and mutations of the agents in order to search efficiently through the policy space.
arXiv Detail & Related papers (2023-05-10T09:46:53Z)
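A minimal sketch of the recipe described above, assuming a population of parameter vectors that share one experience buffer, with occasional crossover and mutation providing the large, directed jumps through policy space; the gradient step, fitness function, and hyperparameters are invented placeholders.

```python
import random

POLICY_DIM, POP, MUT_STD = 4, 6, 0.1
rng = random.Random(0)

shared_buffer = []  # transitions from *all* agents, reused by each one

def fitness(params):
    # Stand-in objective: prefer parameters near an arbitrary target.
    return -sum((p - 0.5) ** 2 for p in params)

def gradient_step(params):
    # Placeholder for an off-policy RL update using shared_buffer.
    return [p + rng.gauss(0, 0.01) for p in params]

population = [[rng.random() for _ in range(POLICY_DIM)] for _ in range(POP)]
for generation in range(20):
    population = [gradient_step(p) for p in population]  # usual RL updates
    shared_buffer.append(f"transitions from generation {generation}")
    if generation % 5 == 4:  # occasional evolutionary step
        population.sort(key=fitness, reverse=True)
        a, b = population[0], population[1]
        child = [ai if rng.random() < 0.5 else bi
                 for ai, bi in zip(a, b)]                 # crossover
        child = [c + rng.gauss(0, MUT_STD) for c in child]  # mutation
        population[-1] = child  # replace the weakest agent

print(max(fitness(p) for p in population))
```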
- RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning [90.43925357575543]
We propose ranked policy memory (RPM) to collect diverse multi-agent trajectories for training MARL policies with good generalizability.
RPM enables MARL agents to interact with unseen agents in multi-agent generalization evaluation scenarios and complete given tasks, boosting performance by up to 402% on average.
arXiv Detail & Related papers (2022-10-18T07:32:43Z)
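A sketch of a ranked-policy-memory-style buffer under assumed details: checkpoints are binned by their training return, and co-players are drawn across bins so the learner faces behaviors at many skill levels. The bin width and sampling rule here are invented, not taken from the paper.

```python
import random
from collections import defaultdict

class RankedPolicyMemory:
    def __init__(self, bin_width=10.0):
        self.bin_width = bin_width
        self.bins = defaultdict(list)  # return rank -> policy checkpoints

    def save(self, policy, train_return):
        rank = int(train_return // self.bin_width)
        self.bins[rank].append(policy)

    def sample_co_players(self, n, rng=random):
        # One bin per co-player, uniform over ranks => diverse skill mix.
        ranks = [rng.choice(list(self.bins)) for _ in range(n)]
        return [rng.choice(self.bins[r]) for r in ranks]

memory = RankedPolicyMemory()
for step in range(100):
    memory.save(policy=f"checkpoint-{step}", train_return=float(step))
print(memory.sample_co_players(3))  # mixes weak, mid, strong checkpoints
```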
- Quantifying environment and population diversity in multi-agent reinforcement learning [7.548322030720646]
Generalization is a major challenge for multi-agent reinforcement learning.
In this paper, we investigate and quantify the relationship between generalization and diversity in the multi-agent domain.
To better understand the effects of co-player variation, our experiments introduce a new environment-agnostic measure of behavioral diversity.
arXiv Detail & Related papers (2021-02-16T18:54:39Z)
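The summary above does not spell out the behavioral diversity measure, so the following is one hypothetical environment-agnostic choice: the mean pairwise Jensen-Shannon divergence between co-players' action distributions on a shared batch of states. The paper's actual measure may differ.

```python
import itertools
import math

def js_divergence(p, q):
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def behavioral_diversity(policies, states):
    # policies: callables mapping a state to an action distribution.
    pairs = list(itertools.combinations(policies, 2))
    total = sum(js_divergence(p(s), q(s)) for p, q in pairs for s in states)
    return total / (len(pairs) * len(states))

uniform = lambda s: [0.5, 0.5]
greedy = lambda s: [0.9, 0.1]
print(behavioral_diversity([uniform, greedy, uniform], states=[0, 1, 2]))
```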
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.