TiKick: Toward Playing Multi-agent Football Full Games from Single-agent
Demonstrations
- URL: http://arxiv.org/abs/2110.04507v2
- Date: Tue, 12 Oct 2021 05:25:00 GMT
- Title: TiKick: Toward Playing Multi-agent Football Full Games from Single-agent
Demonstrations
- Authors: Shiyu Huang, Wenze Chen, Longfei Zhang, Ziyang Li, Fengming Zhu,
Deheng Ye, Ting Chen, Jun Zhu
- Abstract summary: To the best of our knowledge, TiKick is the first learning-based AI system that can take over the multi-agent Google Research Football full game.
- Score: 31.596018856092513
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep reinforcement learning (DRL) has achieved super-human performance on
complex video games (e.g., StarCraft II and Dota 2). However, current DRL
systems still suffer from challenges of multi-agent coordination, sparse
rewards, stochastic environments, etc. In seeking to address these challenges,
we employ a football video game, i.e., Google Research Football (GRF), as our
testbed and develop an end-to-end learning-based AI system (denoted as TiKick)
to complete this challenging task. In this work, we first generated a large
replay dataset from the self-play of single-agent experts obtained from
league training. We then developed a distributed learning system
and new offline algorithms to learn a powerful multi-agent AI from the fixed
single-agent dataset. To the best of our knowledge, TiKick is the first
learning-based AI system that can take over the multi-agent Google Research
Football full game, while previous work could either control a single agent or
experiment on toy academic scenarios. Extensive experiments further show that
our pre-trained model can accelerate the training of modern multi-agent
algorithms and that our method achieves state-of-the-art performance on
various academic scenarios.
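The pipeline the abstract sketches (fixed expert replays in, a multi-agent policy out) reduces, in its simplest form, to behavior cloning on the offline dataset. Below is a minimal sketch of that step only, not TiKick's actual distributed system or offline algorithm: the expert_replays.npz file, network, and hyperparameters are illustrative assumptions, while the 115-dimensional observation and 19 discrete actions match GRF's simple115 representation and default action set.

```python
# Minimal behavior-cloning sketch in the spirit of "learn a policy from a
# fixed single-agent replay dataset". Illustrative only: the replay file,
# network, and hyperparameters are assumptions, not the paper's setup.
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

OBS_DIM, N_ACTIONS = 115, 19  # GRF simple115 observations, default 19-action set

# Hypothetical replay dump: expert (observation, action) pairs saved to disk.
data = np.load("expert_replays.npz")
obs = torch.as_tensor(data["obs"], dtype=torch.float32)   # shape (N, OBS_DIM)
acts = torch.as_tensor(data["acts"], dtype=torch.long)    # shape (N,)
loader = DataLoader(TensorDataset(obs, acts), batch_size=256, shuffle=True)

# A deliberately small stand-in for the policy network.
policy = nn.Sequential(
    nn.Linear(OBS_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, N_ACTIONS),
)
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

for epoch in range(10):
    for batch_obs, batch_acts in loader:
        loss = nn.functional.cross_entropy(policy(batch_obs), batch_acts)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: cloning loss {loss.item():.4f}")
```

At inference time, one such policy (with shared weights) could drive each controlled player; the paper's offline algorithms go beyond plain cloning to handle the multi-agent, fixed-dataset setting, so treat this as a data-flow illustration only.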
Related papers
- Impact of Decentralized Learning on Player Utilities in Stackelberg Games [57.08270857260131]
In many two-agent systems, each agent learns separately and the rewards of the two agents are not perfectly aligned.
We model these systems as Stackelberg games with decentralized learning and show that standard regret benchmarks result in worst-case linear regret for at least one player.
We develop algorithms to achieve near-optimal $O(T^{2/3})$ regret for both players with respect to these benchmarks.
arXiv Detail & Related papers (2024-02-29T23:38:28Z)
- Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play [12.754819077905061]
Minimax Exploiter is a game-theoretic approach to exploiting Main Agents that leverages knowledge of its opponents.
We validate our approach in a variety of settings, including simple turn-based games, the Arcade Learning Environment, and For Honor, a modern video game.
arXiv Detail & Related papers (2023-11-28T19:34:40Z)
- Scaling Laws for Imitation Learning in Single-Agent Games [29.941613597833133]
We investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.
We first demonstrate our findings on a variety of Atari games, and thereafter focus on the extremely challenging game of NetHack.
We find that IL loss and mean return scale smoothly with the compute budget and are strongly correlated, resulting in power laws for training compute-optimal IL agents.
arXiv Detail & Related papers (2023-07-18T16:43:03Z)
- TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play [19.98100026335148]
TiZero is a self-evolving, multi-agent system that learns from scratch.
It outperforms previous systems by a large margin on the Google Research Football environment.
arXiv Detail & Related papers (2023-02-15T08:19:18Z)
- Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning [95.78031053296513]
No-press Diplomacy is a complex strategy game involving both cooperation and competition.
We introduce a planning algorithm we call DiL-piKL that regularizes a reward-maximizing policy toward a human imitation-learned policy.
We show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL.
arXiv Detail & Related papers (2022-10-11T14:47:35Z)
- Applying supervised and reinforcement learning methods to create neural-network-based agents for playing StarCraft II [0.0]
We propose a neural network architecture for playing the full two-player match of StarCraft II trained with general-purpose supervised and reinforcement learning.
Our implementation achieves non-trivial performance when compared to the in-game scripted bots.
arXiv Detail & Related papers (2021-09-26T20:08:10Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Multi-Agent Collaboration via Reward Attribution Decomposition [75.36911959491228]
We propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge.
CollaQ is evaluated on various StarCraft maps and shows that it outperforms existing state-of-the-art techniques.
arXiv Detail & Related papers (2020-10-16T17:42:11Z)
- Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation [53.262360083572005]
We consider solving a cooperative multi-robot object manipulation task using reinforcement learning (RL).
We propose two distributed multi-agent RL approaches: distributed approximate RL (DA-RL) and game-theoretic RL (GT-RL).
Although we focus on a small system of two agents in this paper, both DA-RL and GT-RL apply to general multi-agent systems, and are expected to scale well to large systems.
arXiv Detail & Related papers (2020-03-21T00:43:54Z)
- Neural MMO v1.3: A Massively Multiagent Game Environment for Training and Evaluating Neural Networks [48.5733173329785]
We present Neural MMO, a massively multiagent game environment inspired by MMOs.
We discuss our progress on two more general challenges in multiagent systems engineering for AI research: distributed infrastructure and game IO.
arXiv Detail & Related papers (2020-01-31T18:50:02Z)