Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2003.10423v1
- Date: Mon, 23 Mar 2020 17:49:39 GMT
- Title: Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning
- Authors: Qian Long, Zihan Zhou, Abhinav Gupta, Fei Fang, Yi Wu, Xiaolong Wang
- Abstract summary: Evolutionary Population Curriculum scales up Multi-Agent Reinforcement Learning (MARL) by progressively increasing the population of training agents in a stage-wise manner.
We implement EPC on a popular MARL algorithm, MADDPG, and empirically show that our approach consistently outperforms baselines by a large margin as the number of agents grows exponentially.
- Score: 37.22210622432453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multi-agent games, the complexity of the environment can grow
exponentially as the number of agents increases, so it is particularly
challenging to learn good policies when the agent population is large. In this
paper, we introduce Evolutionary Population Curriculum (EPC), a curriculum
learning paradigm that scales up Multi-Agent Reinforcement Learning (MARL) by
progressively increasing the population of training agents in a stage-wise
manner. Furthermore, EPC uses an evolutionary approach to fix an objective
misalignment issue throughout the curriculum: agents successfully trained in an
early stage with a small population are not necessarily the best candidates for
adapting to later stages with scaled populations. Concretely, EPC maintains
multiple sets of agents in each stage, performs mix-and-match and fine-tuning
over these sets and promotes the sets of agents with the best adaptability to
the next stage. We implement EPC on a popular MARL algorithm, MADDPG, and
empirically show that our approach consistently outperforms baselines by a
large margin as the number of agents grows exponentially.
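The stage-wise loop in the abstract (maintain several agent sets, mix-and-match, fine-tune, promote the best adapters) can be made concrete with a short sketch. The following is a minimal, hypothetical rendering: `Agent`, `fine_tune`, `evaluate`, and all population sizes are invented placeholders, whereas the actual EPC fine-tunes MADDPG agents whose policies are designed to carry over across population sizes.

```python
"""Minimal, hypothetical sketch of EPC's stage-wise curriculum loop."""
import copy
import itertools
import random

class Agent:
    def __init__(self):
        self.skill = random.random()  # stand-in for learned parameters

def fine_tune(agent_set):
    # Placeholder for MADDPG fine-tuning in the scaled environment.
    for a in agent_set:
        a.skill += random.uniform(0.0, 0.1)

def evaluate(agent_set):
    # Placeholder for the set's average episode reward.
    return sum(a.skill for a in agent_set) / len(agent_set)

def epc(num_stages=3, init_pop=2, num_sets=3):
    # Stage 0: several independently trained sets of init_pop agents.
    sets = [[Agent() for _ in range(init_pop)] for _ in range(num_sets)]
    for stage in range(1, num_stages):
        # Mix-and-match: merge pairs of sets, doubling the population.
        candidates = [copy.deepcopy(s1 + s2)
                      for s1, s2 in itertools.combinations(sets, 2)]
        for c in candidates:
            fine_tune(c)  # adapt each merged set to the larger game
        # Promotion: keep the sets that adapt best to the new stage.
        sets = sorted(candidates, key=evaluate, reverse=True)[:num_sets]
        print(f"stage {stage}: population {len(sets[0])}, "
              f"best score {evaluate(sets[0]):.3f}")
    return sets[0]

if __name__ == "__main__":
    epc()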
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
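As a loose illustration of why a step-wise reward can help, the toy below contrasts a sparse episode-level signal with a dense per-step one; the `expert_action` oracle and the reward values are invented for this sketch and are not StepAgent's actual formulation.

```python
def expert_action(state):
    return state % 2  # hypothetical expert: parity of the state

def episode_reward(states, actions):
    # Sparse, outcome-level signal: 1 only if the whole episode matches.
    return 1.0 if all(a == expert_action(s)
                      for s, a in zip(states, actions)) else 0.0

def stepwise_rewards(states, actions):
    # Dense signal: one reward per step, easing credit assignment.
    return [1.0 if a == expert_action(s) else 0.0
            for s, a in zip(states, actions)]

states = [0, 1, 2, 3]
actions = [0, 1, 1, 1]  # one mistake at the third step
print(episode_reward(states, actions))   # 0.0 -- whole episode penalized
print(stepwise_rewards(states, actions)) # [1.0, 1.0, 0.0, 1.0]
```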
- Evolution with Opponent-Learning Awareness [10.689403855269704]
We show how large heterogeneous populations of learning agents evolve in normal-form games.
We derive a fast, parallelizable implementation of Opponent-Learning Awareness tailored for evolutionary simulations.
We demonstrate our approach in simulations of 200,000 agents, evolving in the classic games of Hawk-Dove, Stag-Hunt, and Rock-Paper-Scissors.
arXiv Detail & Related papers (2024-10-22T22:49:04Z)
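The population-scale setting above is straightforward to sketch. The toy below runs plain fitness-proportional selection in the Hawk-Dove game at the paper's 200,000-agent scale; Opponent-Learning Awareness itself (modeling how opponents learn) is omitted, and the payoff constants V and C are standard textbook choices, not values from the paper.

```python
import numpy as np

V, C = 2.0, 3.0  # assumed payoff constants (resource value, fight cost)
PAYOFF = np.array([[(V - C) / 2, V],       # Hawk vs Hawk, Hawk vs Dove
                   [0.0,         V / 2]])  # Dove vs Hawk, Dove vs Dove

def evolve(pop_size=200_000, steps=50, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=pop_size)  # 0 = Hawk, 1 = Dove
    for _ in range(steps):
        opponents = rng.permutation(pop)     # random pairwise matching
        fitness = PAYOFF[pop, opponents]
        probs = fitness - fitness.min() + 1e-9  # shift to non-negative
        probs /= probs.sum()
        # Next generation: reproduce proportionally to fitness.
        pop = rng.choice(pop, size=pop_size, p=probs)
    return pop.mean()

print(f"Dove share after evolution: {evolve():.3f}")
```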
- EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms [55.77492625524141]
EvoAgent is a generic method to automatically extend expert agents to multi-agent systems via evolutionary algorithms.
We show that EvoAgent can automatically generate multiple expert agents and significantly enhance the task-solving capabilities of LLM-based agents.
arXiv Detail & Related papers (2024-06-20T11:49:23Z)
- Agent Alignment in Evolving Social Norms [65.45423591744434]
We propose an evolutionary framework for agent evolution and alignment, named EvolutionaryAgent.
In an environment where social norms continuously evolve, agents better adapted to the current social norms will have a higher probability of survival and proliferation.
We show that EvolutionaryAgent can align progressively better with the evolving social norms while maintaining its proficiency in general tasks.
arXiv Detail & Related papers (2024-01-09T15:44:44Z)
- Quantifying Agent Interaction in Multi-agent Reinforcement Learning for Cost-efficient Generalization [63.554226552130054]
Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL).
The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario.
We present the Level of Influence (LoI), a metric quantifying the interaction intensity among agents within a given scenario and environment.
arXiv Detail & Related papers (2023-10-11T06:09:26Z)
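The exact LoI definition isn't given in the summary above, but one plausible instantiation is the gap in a focal agent's return when its co-players are swapped from training partners to unseen ones. The toy `mean_return` environment below is invented so the sketch runs end to end.

```python
import random

def mean_return(focal_skill, partner_skills, episodes=1000,
                rng=random.Random(0)):
    # Toy rollout: return depends on focal skill plus partner effects.
    total = 0.0
    for _ in range(episodes):
        partner = rng.choice(partner_skills)
        total += focal_skill + 0.5 * partner + rng.gauss(0, 0.1)
    return total / episodes

def level_of_influence(focal_skill, train_partners, unseen_partners):
    gap = abs(mean_return(focal_skill, train_partners)
              - mean_return(focal_skill, unseen_partners))
    return gap  # large gap => co-player interaction matters a lot

print(level_of_influence(1.0, train_partners=[0.9, 1.1],
                         unseen_partners=[0.2, 0.4]))
```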
- Decentralized Adaptive Formation via Consensus-Oriented Multi-Agent Communication [9.216867817261493]
We propose a novel Consensus-based Decentralized Adaptive Formation (Cons-DecAF) framework.
Specifically, we develop a novel multi-agent reinforcement learning method, Consensus-oriented Multi-Agent Communication (ConsMAC).
Instead of pre-assigning specific positions to agents, we employ a displacement-based formation measured by the Hausdorff distance to significantly improve formation efficiency.
arXiv Detail & Related papers (2023-07-23T10:41:17Z)
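The displacement-based formation objective above admits a compact sketch: center both point sets (so only relative displacements matter, not absolute positions) and compute the Hausdorff distance between the achieved and target formations. The centering step is our reading of "displacement-based", not code from the paper.

```python
import numpy as np

def hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    # Symmetric Hausdorff distance via the pairwise distance matrix.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def formation_distance(agents: np.ndarray, target: np.ndarray) -> float:
    # Center each set so only relative displacements are compared.
    return hausdorff(agents - agents.mean(axis=0),
                     target - target.mean(axis=0))

agents = np.array([[0.0, 0.0], [1.0, 0.1], [0.5, 0.9]])
target = np.array([[5.0, 5.0], [6.0, 5.0], [5.5, 6.0]])  # same shape, shifted
print(formation_distance(agents, target))  # small: formation nearly achieved
```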
- Supplementing Gradient-Based Reinforcement Learning with Simple Evolutionary Ideas [4.873362301533824]
We present a simple, sample-efficient algorithm for introducing large but directed learning steps in reinforcement learning (RL).
The methodology uses a population of RL agents training with a common experience buffer, with occasional crossovers and mutations of the agents in order to search efficiently through the policy space.
arXiv Detail & Related papers (2023-05-10T09:46:53Z)
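A minimal sketch of the recipe described above, assuming a population of parameter vectors that share one experience buffer, with occasional crossover and mutation providing the large, directed jumps through policy space; the gradient step, fitness function, and hyperparameters are invented placeholders.

```python
import random

POLICY_DIM, POP, MUT_STD = 4, 6, 0.1
rng = random.Random(0)

shared_buffer = []  # transitions from *all* agents, reused by each one

def fitness(params):
    # Stand-in objective: prefer parameters near an arbitrary target.
    return -sum((p - 0.5) ** 2 for p in params)

def gradient_step(params):
    # Placeholder for an off-policy RL update using shared_buffer.
    return [p + rng.gauss(0, 0.01) for p in params]

population = [[rng.random() for _ in range(POLICY_DIM)] for _ in range(POP)]
for generation in range(20):
    population = [gradient_step(p) for p in population]  # usual RL updates
    shared_buffer.append(f"transitions from generation {generation}")
    if generation % 5 == 4:  # occasional evolutionary step
        population.sort(key=fitness, reverse=True)
        a, b = population[0], population[1]
        child = [ai if rng.random() < 0.5 else bi
                 for ai, bi in zip(a, b)]                 # crossover
        child = [c + rng.gauss(0, MUT_STD) for c in child]  # mutation
        population[-1] = child  # replace the weakest agent

print(max(fitness(p) for p in population))
```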
- RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning [90.43925357575543]
We propose ranked policy memory (RPM) to collect diverse multi-agent trajectories for training MARL policies with good generalizability.
RPM enables MARL agents to interact with unseen agents in multi-agent generalization evaluation scenarios and complete given tasks, boosting performance by up to 402% on average.
arXiv Detail & Related papers (2022-10-18T07:32:43Z)
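A sketch of a ranked-policy-memory-style buffer under assumed details: checkpoints are binned by their training return, and co-players are drawn across bins so the learner faces behaviors at many skill levels. The bin width and sampling rule here are invented, not taken from the paper.

```python
import random
from collections import defaultdict

class RankedPolicyMemory:
    def __init__(self, bin_width=10.0):
        self.bin_width = bin_width
        self.bins = defaultdict(list)  # return rank -> policy checkpoints

    def save(self, policy, train_return):
        rank = int(train_return // self.bin_width)
        self.bins[rank].append(policy)

    def sample_co_players(self, n, rng=random):
        # One bin per co-player, uniform over ranks => diverse skill mix.
        ranks = [rng.choice(list(self.bins)) for _ in range(n)]
        return [rng.choice(self.bins[r]) for r in ranks]

memory = RankedPolicyMemory()
for step in range(100):
    memory.save(policy=f"checkpoint-{step}", train_return=float(step))
print(memory.sample_co_players(3))  # mixes weak, mid, strong checkpoints
```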
- Quantifying environment and population diversity in multi-agent reinforcement learning [7.548322030720646]
Generalization is a major challenge for multi-agent reinforcement learning.
In this paper, we investigate and quantify the relationship between generalization and diversity in the multi-agent domain.
To better understand the effects of co-player variation, our experiments introduce a new environment-agnostic measure of behavioral diversity.
arXiv Detail & Related papers (2021-02-16T18:54:39Z)
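The summary above does not spell out the behavioral diversity measure, so the following is one hypothetical environment-agnostic choice: the mean pairwise Jensen-Shannon divergence between co-players' action distributions on a shared batch of states. The paper's actual measure may differ.

```python
import itertools
import math

def js_divergence(p, q):
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def behavioral_diversity(policies, states):
    # policies: callables mapping a state to an action distribution.
    pairs = list(itertools.combinations(policies, 2))
    total = sum(js_divergence(p(s), q(s)) for p, q in pairs for s in states)
    return total / (len(pairs) * len(states))

uniform = lambda s: [0.5, 0.5]
greedy = lambda s: [0.9, 0.1]
print(behavioral_diversity([uniform, greedy, uniform], states=[0, 1, 2]))
```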
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.