Spatial State-Action Features for General Games
- URL: http://arxiv.org/abs/2201.06401v2
- Date: Thu, 4 May 2023 11:43:32 GMT
- Title: Spatial State-Action Features for General Games
- Authors: Dennis J.N.J. Soemers and \'Eric Piette and Matthew Stephenson and
Cameron Browne
- Abstract summary: We formulate a design and efficient implementation of spatial state-action features for general games.
These are patterns that can be trained to incentivise or disincentivise actions based on whether or not they match variables of the state in a local area.
We propose an efficient approach for evaluating active features for any given set of features.
- Score: 5.849736173068868
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In many board games and other abstract games, patterns have been used as
features that can guide automated game-playing agents. Such patterns or
features often represent particular configurations of pieces, empty positions,
etc., which may be relevant for a game's strategies. Their use has been
particularly prevalent in the game of Go, but also many other games used as
benchmarks for AI research. In this paper, we formulate a design and efficient
implementation of spatial state-action features for general games. These are
patterns that can be trained to incentivise or disincentivise actions based on
whether or not they match variables of the state in a local area around action
variables. We provide extensive details on several design and implementation
choices, with a primary focus on achieving a high degree of generality to
support a wide variety of different games using different board geometries or
other graphs. Secondly, we propose an efficient approach for evaluating active
features for any given set of features. In this approach, we take inspiration
from heuristics used in problems such as SAT to optimise the order in which
parts of patterns are matched and prune unnecessary evaluations. This approach
is defined for a highly general and abstract description of the problem --
phrased as optimising the order in which propositions of formulas in
disjunctive normal form are evaluated -- and may therefore also be of interest
to other types of problems than board games. An empirical evaluation on 33
distinct games in the Ludii general game system demonstrates the efficiency of
this approach in comparison to a naive baseline, as well as a baseline based on
prefix trees, and demonstrates that the additional efficiency significantly
improves the playing strength of agents using the features to guide search.
Related papers
- The Update-Equivalence Framework for Decision-Time Planning [78.44953498421854]
We introduce an alternative framework for decision-time planning that is not based on solving subgames, but rather on update equivalence.
We derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.
arXiv Detail & Related papers (2023-04-25T20:28:55Z) - Revisiting Game Representations: The Hidden Costs of Efficiency in
Sequential Decision-making Algorithms [0.6749750044497732]
Recent advancements in algorithms for sequential decision-making under imperfect information have shown remarkable success in large games.
These algorithms traditionally formalize the games using the extensive-form game formalism.
We show that a popular workaround involves using a specialized representation based on player specific information-state trees.
arXiv Detail & Related papers (2021-12-20T22:34:19Z) - Optimised Playout Implementations for the Ludii General Game System [8.344476599818828]
The Ludii general game system can automatically infer, based on a game's description in its general game description language, whether any optimised implementations are applicable.
An empirical evaluation demonstrates major speedups over a standard implementation, with a median result of running playouts 5.08 times as fast, over 145 different games in Ludii.
arXiv Detail & Related papers (2021-11-04T12:59:53Z) - Rinascimento: searching the behaviour space of Splendor [0.0]
This research is to map the behavioural space (BSpace) of a game by using a general method.
In particular, the use of event-value functions has generally shown a remarkable improvement in the coverage of the BSpace compared to agents based on classic score-based reward signals.
arXiv Detail & Related papers (2021-06-15T18:46:57Z) - Portfolio Search and Optimization for General Strategy Game-Playing [58.896302717975445]
We propose a new algorithm for optimization and action-selection based on the Rolling Horizon Evolutionary Algorithm.
For the optimization of the agents' parameters and portfolio sets we study the use of the N-tuple Bandit Evolutionary Algorithm.
An analysis of the agents' performance shows that the proposed algorithm generalizes well to all game-modes and is able to outperform other portfolio methods.
arXiv Detail & Related papers (2021-04-21T09:28:28Z) - Generating Diverse and Competitive Play-Styles for Strategy Games [58.896302717975445]
We propose Portfolio Monte Carlo Tree Search with Progressive Unpruning for playing a turn-based strategy game (Tribes)
We show how it can be parameterized so a quality-diversity algorithm (MAP-Elites) is used to achieve different play-styles while keeping a competitive level of play.
Our results show that this algorithm is capable of achieving these goals even for an extensive collection of game levels beyond those used for training.
arXiv Detail & Related papers (2021-04-17T20:33:24Z) - Efficient Pure Exploration for Combinatorial Bandits with Semi-Bandit
Feedback [51.21673420940346]
Combinatorial bandits generalize multi-armed bandits, where the agent chooses sets of arms and observes a noisy reward for each arm contained in the chosen set.
We focus on the pure-exploration problem of identifying the best arm with fixed confidence, as well as a more general setting, where the structure of the answer set differs from the one of the action set.
Based on a projection-free online learning algorithm for finite polytopes, it is the first computationally efficient algorithm which is convexally optimal and has competitive empirical performance.
arXiv Detail & Related papers (2021-01-21T10:35:09Z) - Efficient Reasoning in Regular Boardgames [2.909363382704072]
We present the technical side of reasoning in Regular Boardgames (RBG) language.
RBG serves as a research tool that aims to aid in the development of generalized algorithms for knowledge inference, analysis, generation, learning, and playing games.
arXiv Detail & Related papers (2020-06-15T11:42:08Z) - Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z) - Learning Dynamic Belief Graphs to Generalize on Text-Based Games [55.59741414135887]
Playing text-based games requires skills in processing natural language and sequential decision making.
In this work, we investigate how an agent can plan and generalize in text-based games using graph-structured representations learned end-to-end from raw text.
arXiv Detail & Related papers (2020-02-21T04:38:37Z) - State Representation and Polyomino Placement for the Game Patchwork [0.0]
This paper studies the game Patchwork, a two player strategy game using polyomino tile drafting and placement.
The core polyomino placement mechanic is implemented in a constraint model using regular constraints.
Global propagation guided regret is introduced, choosing placements based on not ruling out later placements.
arXiv Detail & Related papers (2020-01-13T13:29:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.