Relevance-Zone Reduction in Game Solving
- URL: http://arxiv.org/abs/2510.00689v1
- Date: Wed, 01 Oct 2025 09:10:32 GMT
- Title: Relevance-Zone Reduction in Game Solving
- Authors: Chi-Huang Lin, Ting Han Wei, Chun-Jui Wang, Hung Guei, Chung-Chin Shih, Yun-Jui Tsai, I-Chen Wu, Ti-Rong Wu,
- Abstract summary: We propose an iterative RZ reduction method that repeatedly solves the same position while gradually restricting the region involved.<n>In experiments on 7x7 Killall-Go, our method reduces the average RZ size to 85.95% of the original.
- Score: 13.335750959467056
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Game solving aims to find the optimal strategies for all players and determine the theoretical outcome of a game. However, due to the exponential growth of game trees, many games remain unsolved, even though methods like AlphaZero have demonstrated super-human level in game playing. The Relevance-Zone (RZ) is a local strategy reuse technique that restricts the search to only the regions relevant to the outcome, significantly reducing the search space. However, RZs are not unique. Different solutions may result in RZs of varying sizes. Smaller RZs are generally more favorable, as they increase the chance of reuse and improve pruning efficiency. To this end, we propose an iterative RZ reduction method that repeatedly solves the same position while gradually restricting the region involved, guiding the solver toward smaller RZs. We design three constraint generation strategies and integrate an RZ Pattern Table to fully leverage past solutions. In experiments on 7x7 Killall-Go, our method reduces the average RZ size to 85.95% of the original. Furthermore, the reduced RZs can be permanently stored as reusable knowledge for future solving tasks, especially for larger board sizes or different openings.
Related papers
- Adversarial Learning in Games with Bandit Feedback: Logarithmic Pure-Strategy Maximin Regret [64.73231630190121]
Learning to play zero-sum games is a fundamental problem in game theory and machine learning.<n>We investigate adversarial learning in zero-sum games under bandit feedback, aiming to minimize the deficit against the maximin pure strategy.<n>We show that the Tsallis-INF algorithm achieves $O(c log T)$ instance-dependent regret with a game-dependent parameter $c$.
arXiv Detail & Related papers (2026-02-06T03:26:01Z) - The Hidden Game Problem [24.447454826751155]
We introduce the hidden game problem, where for each player, an unknown subset of strategies consistently yields higher rewards compared to the rest.<n>We develop a composition of regret minimization techniques that achieve optimal external and swap regret bounds.<n>Our approach ensures rapid convergence to correlated equilibria in hidden subgames, leveraging the hidden game structure for improved computational efficiency.
arXiv Detail & Related papers (2025-10-04T15:46:04Z) - Dominated Actions in Imperfect-Information Games [0.0]
We define and study the concept of dominated actions in imperfect-information games.<n>Our main result is a empirically-time algorithm for determining whether an action is dominated by any mixed strategy.<n>We explore the role of dominated actions in the "All In or Fold" No-Limit Texas Hold'em poker variant.
arXiv Detail & Related papers (2025-04-13T20:48:44Z) - Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z) - ApproxED: Approximate exploitability descent via learned best responses [61.17702187957206]
We study the problem of finding an approximate Nash equilibrium of games with continuous action sets.
We propose two new methods that minimize an approximation of exploitability with respect to the strategy profile.
arXiv Detail & Related papers (2023-01-20T23:55:30Z) - Learning Efficient Exploration through Human Seeded Rapidly-exploring
Random Trees [1.2993951779393873]
We introduce RRT and behavior-cloning-assisted RRT in testing the number of game states searched and the time taken to explore those game states.
We find HSRRT and CA-RRT both explore more game states in fewer tree/iterations when compared to the existing baseline.
In our tested environments, CA-RRT was able to reach the same number of states as RRT by 5000 than 5000 fewer iterations on average, almost a 50% reduction.
arXiv Detail & Related papers (2022-03-23T23:53:39Z) - Efficient Policy Space Response Oracles [61.71849698253696]
Policy Space Response Oracle method (PSRO) provides a general solution to Nash equilibrium in two-player zero-sum games.
Central to our development is the newly-introduced of minimax optimization on unrestricted-restricted (URR) games.
We report a 50x speedup in wall-time, 10x data efficiency, and similar exploitability as existing PSRO methods on Kuhn and Leduc Poker games.
arXiv Detail & Related papers (2022-01-28T17:54:45Z) - Multi-Stage Episodic Control for Strategic Exploration in Text Games [16.897326154822135]
This work proposes to tackle the explore-vs-exploit dilemma using a multi-stage approach that explicitly disentangles these two strategies within each episode.
Our algorithm, called eXploit-Then-eXplore (XTX), begins each episode using an exploitation policy that imitates a set of promising trajectories from the past.
Our method significantly outperforms prior approaches by 27% and 11% average normalized score over 12 games from the Jericho benchmark.
arXiv Detail & Related papers (2022-01-04T17:19:52Z) - A Novel Approach to Solving Goal-Achieving Problems for Board Games [18.627167345021835]
This paper first proposes a novel RZ-based approach, called the RZ-Based Search (RZS) to solving L&D problems for Go.<n>RZS tries moves before determining whether they are null moves post-hoc.<n>We also propose a new training method called Faster to Life (FTL), which modifies AlphaZero to entice it to win more quickly.
arXiv Detail & Related papers (2021-12-05T13:23:10Z) - Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes)
We show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play.
Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
arXiv Detail & Related papers (2021-04-17T20:33:24Z) - Model-Free Online Learning in Unknown Sequential Decision Making
Problems and Games [114.90723492840499]
In large two-player zero-sum imperfect-information games, modern extensions of counterfactual regret minimization (CFR) are currently the practical state of the art for computing a Nash equilibrium.
We formalize an online learning setting in which the strategy space is not known to the agent.
We give an efficient algorithm that achieves $O(T3/4)$ regret with high probability for that setting, even when the agent faces an adversarial environment.
arXiv Detail & Related papers (2021-03-08T04:03:24Z) - Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.