Mastering Strategy Card Game (Legends of Code and Magic) via End-to-End Policy and Optimistic Smooth Fictitious Play
- URL: http://arxiv.org/abs/2303.04096v1
- Date: Tue, 7 Mar 2023 17:55:28 GMT
- Title: Mastering Strategy Card Game (Legends of Code and Magic) via End-to-End Policy and Optimistic Smooth Fictitious Play
- Authors: Wei Xi, Yongxin Zhang, Changnan Xiao, Xuefeng Huang, Shihong Deng, Haowei Liang, Jie Chen, Peng Sun
- Abstract summary: We study the two-stage strategy card game Legends of Code and Magic.
We propose an end-to-end policy to address the difficulties that arise in multi-stage games.
Our approach won double championships at the COG2022 competition.
- Score: 11.480308614644041
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning combined with Fictitious Play shows impressive results on many benchmark games, most of which are, however, single-stage. In contrast, real-world decision-making problems may consist of multiple stages, where the observation spaces and the action spaces can be completely different across stages. We study the two-stage strategy card game Legends of Code and Magic and propose an end-to-end policy to address the difficulties that arise in multi-stage games. We also propose an optimistic smooth fictitious play algorithm to find the Nash equilibrium of the two-player game. Our approach won double championships at the COG2022 competition, and extensive studies verify and demonstrate its advantages.
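To make the equilibrium-finding idea concrete, below is a minimal sketch of smooth fictitious play with an optimistic extrapolation step on a two-player zero-sum matrix game. Everything here is an illustrative assumption rather than the paper's method: the abstract only names the algorithm, and the actual system trains end-to-end neural policies, so the payoff matrix, the softmax (logit) best response, and the function name `optimistic_sfp` are ours.

```python
import numpy as np

def softmax(v, tau):
    """Logit (smooth) best response: higher payoff -> higher probability."""
    z = v / tau
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def optimistic_sfp(A, iters=5000, tau=0.1):
    """Smooth fictitious play on the zero-sum matrix game with payoff x^T A y.

    Each player logit-best-responds to an *optimistic* prediction of the
    opponent: the empirical average of past play plus one extra copy of the
    opponent's most recent strategy.
    """
    m, n = A.shape
    x_avg, y_avg = np.full(m, 1.0 / m), np.full(n, 1.0 / n)
    x_last, y_last = x_avg.copy(), y_avg.copy()
    for t in range(1, iters + 1):
        y_pred = (t * y_avg + y_last) / (t + 1)  # lean toward latest play
        x_pred = (t * x_avg + x_last) / (t + 1)
        x_last = softmax(A @ y_pred, tau)        # row player maximizes
        y_last = softmax(-(A.T @ x_pred), tau)   # column player minimizes
        x_avg += (x_last - x_avg) / t            # running empirical averages
        y_avg += (y_last - y_avg) / t
    return x_avg, y_avg

# Rock-paper-scissors: the averages approach the uniform equilibrium (1/3 each).
A = np.array([[0.0, -1.0, 1.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]])
x, y = optimistic_sfp(A)
print(x.round(3), y.round(3))
```

In the paper's setting the two players are card-game policies with neural parameters, so the smooth best response above would be computed by policy optimization rather than a softmax over a payoff matrix; the sketch only conveys the averaging-plus-optimism loop.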
Related papers
- Securing Equal Share: A Principled Approach for Learning Multiplayer Symmetric Games [21.168085154982712]
Equilibria in multiplayer games are neither unique nor non-exploitable.
This paper takes an initial step towards addressing these challenges by focusing on the natural objective of equal share.
We design a series of efficient algorithms, inspired by no-regret learning, that provably attain approximate equal share across various settings.
arXiv Detail & Related papers (2024-06-06T15:59:17Z)
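For readers unfamiliar with the term, "no-regret learning" refers to online algorithms whose average regret against the best fixed action vanishes over time; multiplicative weights (Hedge) is the textbook instance. The sketch below is that generic algorithm, not the paper's; the random loss matrix and the learning rate are assumptions.

```python
import numpy as np

def hedge(losses, eta):
    """Multiplicative weights (Hedge), a classic no-regret learner.

    losses: (T, n) array with losses[t, i] in [0, 1] for action i at round t.
    Returns the (T, n) sequence of distributions played.
    """
    T, n = losses.shape
    w = np.ones(n)
    played = np.empty((T, n))
    for t in range(T):
        played[t] = w / w.sum()        # play the normalized weights
        w *= np.exp(-eta * losses[t])  # down-weight actions that did badly
    return played

# Regret vs. the best fixed action grows only O(sqrt(T log n)).
rng = np.random.default_rng(0)
L = rng.random((1000, 5))
P = hedge(L, eta=np.sqrt(np.log(5) / 1000))
print("total regret:", (P * L).sum() - L.sum(axis=0).min())
```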
- SPRING: Studying the Paper and Reasoning to Play Games [102.5587155284795]
We propose a novel approach, SPRING, which reads the game's original academic paper and uses the knowledge learned to reason about and play the game through a large language model (LLM).
In experiments, we study the quality of in-context "reasoning" induced by different forms of prompts under the setting of the Crafter open-world environment.
Our experiments suggest that LLMs, when prompted with consistent chain-of-thought, have great potential in completing sophisticated high-level trajectories.
arXiv Detail & Related papers (2023-05-24T18:14:35Z)
- Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions [145.54544979467872]
We propose and analyze new fictitious play policy optimization algorithms for zero-sum Markov games with structured but unknown transitions.
We prove tight $\widetilde{\mathcal{O}}(\sqrt{K})$ regret bounds after $K$ episodes in a two-agent competitive game scenario.
Our algorithms feature a combination of Upper Confidence Bound (UCB)-type optimism and fictitious play under the scope of simultaneous policy optimization.
arXiv Detail & Related papers (2022-07-25T18:29:16Z)
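For context, the regret bounded above is typically the cumulative gap between the best response in hindsight and the value the max-player actually achieved; in standard (not necessarily the paper's exact) notation:

```latex
% Regret of the max-player over K episodes of a zero-sum Markov game,
% where V^{\mu,\nu}(s_1) is the value from the initial state under
% policies (\mu, \nu); the bound hides logarithmic factors.
\mathrm{Regret}(K) \;=\; \sum_{k=1}^{K} \Big( \max_{\mu} V^{\mu,\nu^k}(s_1) - V^{\mu^k,\nu^k}(s_1) \Big)
\;\le\; \widetilde{\mathcal{O}}\big(\sqrt{K}\big).
```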
- Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing the turn-based strategy game Tribes.
We show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play.
Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
arXiv Detail & Related papers (2021-04-17T20:33:24Z)
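MAP-Elites, the quality-diversity algorithm named above, keeps the single best candidate ("elite") in each cell of a behavior-descriptor grid, which is how one run can cover many play-styles at a competitive level. Below is a generic sketch under assumed interfaces; `init`, `mutate`, `fitness`, and `descriptor` are placeholders, not the paper's actual operators.

```python
import random

def map_elites(init, mutate, fitness, descriptor, iters=10_000):
    """Generic MAP-Elites loop: one elite per behavior-descriptor cell.

    init()        -> random candidate (e.g., agent parameters)
    mutate(x)     -> perturbed copy of x
    fitness(x)    -> scalar quality (e.g., win rate)
    descriptor(x) -> hashable behavior cell (e.g., binned aggression level)
    """
    archive = {}  # cell -> (fitness, candidate)
    for _ in range(iters):
        if archive:
            _, parent = random.choice(list(archive.values()))
            cand = mutate(parent)
        else:
            cand = init()
        f, cell = fitness(cand), descriptor(cand)
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, cand)  # new elite for this cell
    return archive

# Toy usage: maximize -(x - 0.7)^2 while covering ten bins of x in [0, 1].
elites = map_elites(
    init=lambda: random.random(),
    mutate=lambda x: min(1.0, max(0.0, x + random.gauss(0, 0.1))),
    fitness=lambda x: -(x - 0.7) ** 2,
    descriptor=lambda x: min(9, int(x * 10)),
)
print(sorted(elites))
```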
- On the Power of Refined Skat Selection [1.3706331473063877]
Skat is a fascinating card game, showcasing many of the intrinsic challenges for modern AI systems.
We propose hard expert-rules and a scoring function based on refined skat evaluation features.
Experiments emphasize the impact of the refined skat putting algorithm on the playing performance of the bots.
arXiv Detail & Related papers (2021-04-07T08:54:58Z)
- Collaborative Agent Gameplay in the Pandemic Board Game [3.223284371460913]
Pandemic is an exemplar collaborative board game where all players coordinate to overcome challenges posed by events occurring during the game's progression.
This paper proposes an artificial agent which controls all players' actions and balances the chances of winning against the risk of losing in this highly stochastic environment.
Results show that the proposed algorithm can find winning strategies more consistently in different games of varying difficulty.
arXiv Detail & Related papers (2021-03-21T13:18:20Z)
- Learning to Play Imperfect-Information Games by Imitating an Oracle Planner [77.67437357688316]
We consider learning to play multiplayer imperfect-information games with simultaneous moves and large state-action spaces.
Our approach is based on model-based planning.
We show that the planner is able to discover efficient playing strategies in the games of Clash Royale and Pommerman.
arXiv Detail & Related papers (2020-12-22T17:29:57Z)
- Learning to Play Sequential Games versus Unknown Opponents [93.8672371143881]
We consider a repeated sequential game between a learner, who plays first, and an opponent who responds to the chosen action.
We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents.
Our results include regret guarantees for the algorithm that depend on the regularity of the opponent's responses.
arXiv Detail & Related papers (2020-07-10T09:33:05Z)
- Learning to Play No-Press Diplomacy with Best Response Policy Iteration [31.367850729299665]
We apply deep reinforcement learning methods to Diplomacy, a 7-player board game.
We show that our agents convincingly outperform the previous state-of-the-art, and game theoretic equilibrium analysis shows that the new process yields consistent improvements.
arXiv Detail & Related papers (2020-06-08T14:33:31Z)
- Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
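The "exploitability" objective mentioned above has a standard definition in two-player zero-sum games: the total gain available to the players from deviating to best responses. In common notation (not necessarily the paper's):

```latex
% Exploitability of a strategy profile (x, y) in a two-player zero-sum
% game, where u(x, y) is player 1's expected payoff; it is nonnegative
% and equals zero exactly at a Nash equilibrium.
\mathrm{expl}(x, y) \;=\; \max_{x'} u(x', y) \;-\; \min_{y'} u(x, y').
```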
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences of its use.