Related papers: Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search

URL: http://arxiv.org/abs/2511.07312v1
Date: Mon, 10 Nov 2025 17:13:41 GMT
Title: Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search
Authors: Samuel Sokota, Eugene Vinitsky, Hengyuan Hu, J. Zico Kolter, Gabriele Farina
Abstract summary: Stratego is a board wargame exemplifying the challenge of strategic decision making under massive amounts of hidden information. This work establishes a step change in both performance and cost for Stratego, showing that it is now possible not only to reach the level of top humans, but to achieve vastly superhuman level.
Score: 74.17074385045657
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Few classical games have been regarded as such significant benchmarks of artificial intelligence as to have justified training costs in the millions of dollars. Among these, Stratego -- a board wargame exemplifying the challenge of strategic decision making under massive amounts of hidden information -- stands apart as a case where such efforts failed to produce performance at the level of top humans. This work establishes a step change in both performance and cost for Stratego, showing that it is now possible not only to reach the level of top humans, but to achieve vastly superhuman level -- and that doing so requires not an industrial budget, but merely a few thousand dollars. We achieved this result by developing general approaches for self-play reinforcement learning and test-time search under imperfect information.

Related papers

Evaluating Intelligence via Trial and Error [59.80426744891971]
We introduce Survival Game as a framework to evaluate intelligence based on the number of failed attempts in a trial-and-error process. When the expectation and variance of failure counts are both finite, it signals the ability to consistently find solutions to new challenges. Our results show that while AI systems achieve the Autonomous Level in simple tasks, they are still far from it in more complex tasks.
arXiv Detail & Related papers (2025-02-26T05:59:45Z)
Reinforcement Learning for High-Level Strategic Control in Tower Defense Games [47.618236610219554]
In strategy games, one of the most important aspects of game design is maintaining a sense of challenge for players. We propose an automated approach that combines traditional scripted methods with reinforcement learning. Results show that combining a learned approach, such as reinforcement learning, with a scripted AI produces a higher-performing and more robust agent than using only AI.
arXiv Detail & Related papers (2024-06-12T08:06:31Z)
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning [86.37438204416435]
Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. Decisions in Stratego are made over a large number of discrete actions with no obvious link between action and outcome. DeepNash beats existing state-of-the-art AI methods in Stratego and achieved a yearly (2022) and all-time top-3 rank on the Gravon games platform.
arXiv Detail & Related papers (2022-06-30T15:53:19Z)
Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes). We show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play. Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
arXiv Detail & Related papers (2021-04-17T20:33:24Z)
Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay. During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well. Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
Deep RL Agent for a Real-Time Action Strategy Game [0.3867363075280543]
We introduce a reinforcement learning environment based on Heroic - Magic Duel, a 1 v 1 action strategy game. Our main contribution is a deep reinforcement learning agent playing the game at a competitive level. Our best self-play agent obtains around a $65\%$ win rate against the existing AI and over a $50\%$ win rate against a top human player.
arXiv Detail & Related papers (2020-02-15T01:09:56Z)
Provable Self-Play Algorithms for Competitive Reinforcement Learning [48.12602400021397]
We study self-play in competitive reinforcement learning under the setting of Markov games. We show that a self-play algorithm achieves regret $\tilde{\mathcal{O}}(\sqrt{T})$ after playing $T$ steps of the game. We also introduce an explore-then-exploit style algorithm, which achieves a slightly worse regret $\tilde{\mathcal{O}}(T^{2/3})$, but is guaranteed to run in polynomial time even in the worst case.
arXiv Detail & Related papers (2020-02-10T18:44:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.