An Empirical Study on Google Research Football Multi-agent Scenarios
- URL: http://arxiv.org/abs/2305.09458v1
- Date: Tue, 16 May 2023 14:18:53 GMT
- Title: An Empirical Study on Google Research Football Multi-agent Scenarios
- Authors: Yan Song, He Jiang, Zheng Tian, Haifeng Zhang, Yingping Zhang,
Jiangcheng Zhu, Zonghong Dai, Weinan Zhang, Jun Wang
- Abstract summary: We open-source our training framework Light-MALib, which extends MALib with a distributed and asynchronous implementation and additional analytical tools for football games.
We provide guidance for building strong football AI with population-based training and release diverse pretrained policies for benchmarking.
- Score: 30.926070192524193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Little multi-agent reinforcement learning (MARL) research on
Google Research Football (GRF) focuses on the 11v11 multi-agent full-game
scenario, and to the best of our knowledge no open benchmark for this
scenario has been released to the public. In this work, we fill the gap by
providing a population-based MARL training pipeline and hyperparameter
settings for the multi-agent football scenario that defeats the built-in bot
at difficulty 1.0, trained from scratch within 2 million steps. Our
experiments serve as a reference for the expected performance of Independent
Proximal Policy Optimization (IPPO), a state-of-the-art multi-agent
reinforcement learning algorithm in which each agent independently maximizes
its own policy objective, across various training configurations. Meanwhile,
we open-source our training framework, Light-MALib, which extends the MALib
codebase with a distributed, asynchronous implementation and additional
analytical tools for football games. Finally, we provide guidance for
building strong football AI with population-based training and release
diverse pretrained policies for benchmarking. The goal is to give the
community a head start for experimenting on GRF and a simple-to-use
population-based training framework for further improving agents through
self-play. The implementation is available at
https://github.com/Shanghai-Digital-Brain-Laboratory/DB-Football.
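As a rough illustration of the IPPO setup the abstract describes (each
controlled player optimizing its own clipped PPO objective on its local
observations, with no centralized critic), here is a minimal PyTorch sketch.
The network sizes, hyperparameters, and GRF-like observation and action
dimensions are illustrative assumptions, not the paper's or Light-MALib's
actual configuration.

```python
# Minimal sketch of Independent PPO (IPPO): every agent owns its policy and
# value network and runs the clipped PPO update on its own experience only.
# All dimensions and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torch.distributions import Categorical

OBS_DIM, N_ACTIONS, N_AGENTS = 115, 19, 10   # GRF-like sizes (assumed)
CLIP_EPS, LR, EPOCHS = 0.2, 3e-4, 4

class AgentNet(nn.Module):
    """One agent's policy and local value head; nothing is shared."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU())
        self.pi = nn.Linear(128, N_ACTIONS)  # action logits
        self.v = nn.Linear(128, 1)           # local value estimate

    def forward(self, obs):
        h = self.body(obs)
        return Categorical(logits=self.pi(h)), self.v(h).squeeze(-1)

agents = [AgentNet() for _ in range(N_AGENTS)]
optims = [torch.optim.Adam(a.parameters(), lr=LR) for a in agents]

def ippo_update(i, obs, acts, old_logp, returns):
    """Clipped PPO update for agent i on its own batch of transitions."""
    for _ in range(EPOCHS):
        dist, value = agents[i](obs)
        adv = (returns - value).detach()          # simple advantage estimate
        ratio = torch.exp(dist.log_prob(acts) - old_logp)
        clipped = torch.clamp(ratio, 1 - CLIP_EPS, 1 + CLIP_EPS) * adv
        policy_loss = -torch.min(ratio * adv, clipped).mean()
        value_loss = 0.5 * ((returns - value) ** 2).mean()
        optims[i].zero_grad()
        (policy_loss + value_loss).backward()
        optims[i].step()
```

In a population-based setup like the one the abstract describes, several such
independent learners would additionally play against a pool of frozen past
policies, with opponents sampled from the pool for self-play.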
Related papers
- MARL-LNS: Cooperative Multi-agent Reinforcement Learning via Large Neighborhoods Search [27.807695570974644]
We propose a general training framework, MARL-LNS, that reduces training cost by training on alternating subsets of agents.
We show that our algorithms can automatically reduce at least 10% of training time while reaching the same final skill level as the original algorithm.
arXiv Detail & Related papers (2024-04-03T22:51:54Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning [14.37986882249142]
We propose a benchmark for multiagent learning based on repeated play of the simple game Rock, Paper, Scissors.
We describe metrics to measure the quality of agents based both on average returns and exploitability (a simplified exploitability computation is sketched after this list).
arXiv Detail & Related papers (2023-03-02T15:06:52Z)
- Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets.
When the behavior policies behind those trajectories differ in quality (e.g., one agent acts randomly while its teammates act sensibly), an agent learned by offline MARL often inherits the random policy, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
- RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning [90.43925357575543]
We propose ranked policy memory (RPM) to collect diverse multi-agent trajectories for training MARL policies with good generalizability.
RPM enables MARL agents to interact with unseen agents in multi-agent generalization evaluation scenarios and complete given tasks, boosting performance by up to 402% on average.
arXiv Detail & Related papers (2022-10-18T07:32:43Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interactions between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Decentralized Cooperative Multi-Agent Reinforcement Learning with Exploration [35.75029940279768]
We study multi-agent reinforcement learning in the most basic cooperative setting -- Markov teams.
We propose an algorithm in which each agent independently runs a stage-based V-learning style algorithm.
We show that the agents can learn an $\epsilon$-approximate Nash equilibrium policy in at most $\widetilde{O}(1/\epsilon^4)$ episodes.
arXiv Detail & Related papers (2021-10-12T02:45:12Z)
- TiKick: Toward Playing Multi-agent Football Full Games from Single-agent Demonstrations [31.596018856092513]
To the best of our knowledge, TiKick is the first learning-based AI system that can take over the multi-agent Google Research Football full game.
arXiv Detail & Related papers (2021-10-09T08:34:58Z)
- MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning [61.28547338576706]
Population-based multi-agent reinforcement learning (PB-MARL) refers to the family of methods that nest reinforcement learning (RL) algorithms within population dynamics.
We present MALib, a scalable and efficient computing framework for PB-MARL.
arXiv Detail & Related papers (2021-06-05T03:27:08Z)
- Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? [100.48692829396778]
Independent PPO (IPPO) is a form of independent learning in which each agent simply estimates its local value function.
IPPO's strong performance may be due to its robustness to some forms of environment non-stationarity.
arXiv Detail & Related papers (2020-11-18T20:29:59Z)
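The repeated Rock-Paper-Scissors entry above scores agents by average return
and exploitability. As a deliberately simplified illustration of why the two
metrics differ (referenced from that entry), here is a NumPy sketch for
single-shot mixed strategies; the benchmark itself evaluates learned agents
over repeated play, and the helper names below are hypothetical.

```python
import numpy as np

# Row player's payoff matrix for zero-sum Rock-Paper-Scissors.
# Rows: our action, columns: opponent action; order Rock, Paper, Scissors.
M = np.array([[ 0., -1.,  1.],
              [ 1.,  0., -1.],
              [-1.,  1.,  0.]])

def exploitability(p):
    """Payoff a best-responding adversary earns against mixed strategy p.
    RPS has game value 0, so any positive result means p is exploitable."""
    opponent_payoffs = -M.T @ p  # adversary's expected payoff per pure action
    return float(opponent_payoffs.max())

def avg_return(p, q):
    """Expected per-round return of strategy p against strategy q."""
    return float(p @ M @ q)

uniform = np.ones(3) / 3
rock_heavy = np.array([0.5, 0.3, 0.2])   # over-plays Rock

print(exploitability(uniform))           # 0.0: the Nash strategy
print(exploitability(rock_heavy))        # 0.3: Paper best-responds
print(avg_return(rock_heavy, uniform))   # 0.0: looks fine on average return
```

The last two lines show why the benchmark pairs the metrics: the Rock-heavy
strategy matches the uniform strategy on average return against a uniform
opponent, yet a best-responding adversary beats it by 0.3 per round.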