Diversity is Strength: Mastering Football Full Game with Interactive
Reinforcement Learning of Multiple AIs
- URL: http://arxiv.org/abs/2306.15903v1
- Date: Wed, 28 Jun 2023 03:56:57 GMT
- Title: Diversity is Strength: Mastering Football Full Game with Interactive
Reinforcement Learning of Multiple AIs
- Authors: Chenglu Sun, Shuo Shen, Sijia Xu, Weidong Zhang
- Abstract summary: We propose Diversity is Strength (DIS), a novel DRL training framework that can simultaneously train multiple kinds of AIs.
These AIs are linked through an interconnected history model pool structure, which enhances their capabilities and strategy diversities.
We tested our method in an AI competition based on Google Research Football (GRF) and won the 5v5 and 11v11 tracks.
- Score: 4.020287169811583
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training AI with strong and rich strategies in multi-agent environments
remains an important research topic in Deep Reinforcement Learning (DRL). The
AI's strength is closely related to its diversity of strategies, and this
relationship can guide us to train AI with both strong and rich strategies. To
prove this point, we propose Diversity is Strength (DIS), a novel DRL training
framework that can simultaneously train multiple kinds of AIs. These AIs are
linked through an interconnected history model pool structure, which enhances
their capabilities and strategy diversities. We also design a model evaluation
and screening scheme to select the best models to enrich the model pool and
obtain the final AI. The proposed training method provides diverse,
generalizable, and strong AI strategies without using human data. We tested our
method in an AI competition based on Google Research Football (GRF) and won the
5v5 and 11v11 tracks. For the first time, a GRF AI reaches a high level of play
on both the 5v5 and 11v11 tracks, each of which is a complex multi-agent
environment. Behavior analysis shows that the trained AI has rich strategies,
and ablation experiments show that the designed modules benefit the training
process.
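The abstract names two mechanisms without giving details: history model pools that are linked across kinds of AIs, and a screening step that decides which models enter a pool. The following is a minimal toy sketch of that idea, not the authors' implementation. The names `Snapshot`, `HistoryPool`, `win_rate`, and `screen_and_add`, the scalar `skill` stand-in for a trained network, and the 0.5 admission threshold are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    kind: str      # hypothetical label for the kind of AI that produced it
    version: int
    skill: float   # toy stand-in for real network weights (assumption)


@dataclass
class HistoryPool:
    kind: str
    snapshots: list = field(default_factory=list)
    # Other pools this pool may draw opponents from (the "interconnection").
    linked: list = field(default_factory=list, repr=False, compare=False)

    def candidates(self):
        # Opponents come from this pool and from every linked pool.
        pools = [self] + self.linked
        return [s for p in pools for s in p.snapshots]

    def sample_opponent(self, rng):
        cands = self.candidates()
        return rng.choice(cands) if cands else None


def win_rate(model, opponents):
    # Toy evaluation: a model "beats" an opponent when its skill is higher.
    # A real system would play matches instead (assumption).
    if not opponents:
        return 1.0
    return sum(model.skill > o.skill for o in opponents) / len(opponents)


def screen_and_add(pool, model, threshold=0.5):
    # Screening: admit the snapshot only if it beats enough current candidates.
    if win_rate(model, pool.candidates()) >= threshold:
        pool.snapshots.append(model)
        return True
    return False
```

In this sketch, linking two pools in both directions means each kind of AI is evaluated against, and samples opponents from, the history of the other kind as well as its own, for example via `pool.sample_opponent(random.Random(0))`.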
Related papers
- Mastering Chinese Chess AI (Xiangqi) Without Search [2.309569018066392]
We have developed a high-performance Chinese Chess AI that operates without reliance on search algorithms.
This AI has demonstrated the capability to compete at a level commensurate with the top 0.1% of human players.
arXiv Detail & Related papers (2024-10-07T09:27:51Z)
- Mastering the Digital Art of War: Developing Intelligent Combat Simulation Agents for Wargaming Using Hierarchical Reinforcement Learning [0.0]
This dissertation proposes a comprehensive approach, including targeted observation abstractions, multi-model integration, a hybrid AI framework, and an overarching hierarchical reinforcement learning framework.
Our localized observation abstraction using piecewise linear spatial decay simplifies the RL problem, enhancing computational efficiency and demonstrating superior efficacy over traditional global observation methods.
Our hybrid AI framework synergizes RL with scripted agents, leveraging RL for high-level decisions and scripted agents for lower-level tasks, enhancing adaptability, reliability, and performance.
arXiv Detail & Related papers (2024-08-23T18:50:57Z) - Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning [50.47568731994238]
Key method for creating Artificial Intelligence (AI) agents is Reinforcement Learning (RL)
This paper presents a general framework model for integrating and learning structured reasoning into AI agents' policies.
arXiv Detail & Related papers (2023-12-22T17:57:57Z) - DanZero+: Dominating the GuanDan Game through Reinforcement Learning [95.90682269990705]
We develop an AI program for an exceptionally complex and popular card game called GuanDan.
We first put forward an AI program named DanZero for this game.
In order to further enhance the AI's capabilities, we apply a policy-based reinforcement learning algorithm to GuanDan.
arXiv Detail & Related papers (2023-12-05T08:07:32Z)
- Improving Generalization of Alignment with Human Preferences through Group Invariant Learning [56.19242260613749]
Reinforcement Learning from Human Feedback (RLHF) enables the generation of responses more aligned with human preferences.
Previous work shows that Reinforcement Learning (RL) often exploits shortcuts to attain high rewards and overlooks challenging samples.
We propose a novel approach that can learn a consistent policy via RL across various data groups or domains.
arXiv Detail & Related papers (2023-10-18T13:54:15Z)
- Diversity-based Deep Reinforcement Learning Towards Multidimensional Difficulty for Fighting Game AI [0.9645196221785693]
We introduce a diversity-based deep reinforcement learning approach for generating a set of agents of similar difficulty.
We find this approach outperforms a baseline trained with specialized, human-authored reward functions in both diversity and performance.
arXiv Detail & Related papers (2022-11-04T21:49:52Z)
- DIAMBRA Arena: a New Reinforcement Learning Platform for Research and Experimentation [91.3755431537592]
This work presents DIAMBRA Arena, a new platform for reinforcement learning research and experimentation.
It features a collection of high-quality environments exposing a Python API fully compliant with OpenAI Gym standard.
They are episodic tasks with discrete actions and observations composed of raw pixels plus additional numerical values.
arXiv Detail & Related papers (2022-10-19T14:39:10Z)
- Instructive artificial intelligence (AI) for human training, assistance, and explainability [0.24629531282150877]
We show how a neural network might instruct human trainees as an alternative to traditional approaches to explainable AI (XAI).
An AI examines human actions and calculates variations on the human strategy that lead to better performance.
Results will be presented on AI instruction's ability to improve human decision-making and human-AI teaming in Hanabi.
arXiv Detail & Related papers (2021-11-02T16:46:46Z)
- The MineRL BASALT Competition on Learning from Human Feedback [58.17897225617566]
The MineRL BASALT competition aims to spur forward research on this important class of techniques.
We design a suite of four tasks in Minecraft for which we expect it will be hard to write down hardcoded reward functions.
We provide a dataset of human demonstrations on each of the four tasks, as well as an imitation learning baseline.
arXiv Detail & Related papers (2021-07-05T12:18:17Z)
- Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork [54.309495231017344]
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance.
We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves.
Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance.
arXiv Detail & Related papers (2020-04-27T19:06:28Z)
- Multi-AI competing and winning against humans in iterated Rock-Paper-Scissors game [4.2124879433151605]
We use an AI algorithm based on Markov Models of one fixed memory length to compete against humans in an iterated Rock Paper Scissors game.
We develop an architecture of multi-AI with changeable parameters to adapt to different competition strategies.
Our strategy could win against more than 95% of human opponents.
arXiv Detail & Related papers (2020-03-15T06:39:59Z)