Completeness of Unbounded Best-First Game Algorithms
- URL: http://arxiv.org/abs/2109.09468v1
- Date: Sat, 11 Sep 2021 23:13:52 GMT
- Title: Completeness of Unbounded Best-First Game Algorithms
- Authors: Quentin Cohen-Solal (LAMSADE, Universit\'e Paris-Dauphine, PSL, CNRS,
France)
- Abstract summary: We prove the completeness of two game search algorithms: best-first minimax with completion and descent with completion.
We generalize these algorithms in the context of perfect information multiplayer games.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this article, we prove the completeness of the following game search
algorithms: unbounded best-first minimax with completion and descent with
completion, i.e. we show that, with enough time, they find the best game
strategy. We then generalize these two algorithms in the context of perfect
information multiplayer games. We show that these generalizations are also
complete: they find one of the equilibrium points.
Related papers
- Learning to Play Stochastic Two-player Perfect-Information Games without
Knowledge [5.071342645033634]
We extend the Descent framework, which enables learning and planning in the context of two-player games with perfect information.
We evaluate them on the game Ein wurfelt! against state-of-the-art algorithms.
It is our generalization of Descent which obtains the best results.
arXiv Detail & Related papers (2023-02-08T20:27:45Z) - Predicting Winning Regions in Parity Games via Graph Neural Networks
(Extended Abstract) [68.8204255655161]
We present an incomplete-time approach to determining the winning regions of parity games via graph neural networks.
It correctly determines the winning regions of $$60% of the games in our data set and only incurs minor errors in the remaining ones.
arXiv Detail & Related papers (2022-10-18T15:10:25Z) - Near-Optimal Learning of Extensive-Form Games with Imperfect Information [54.55092907312749]
We present the first line of algorithms that require only $widetildemathcalO((XA+YB)/varepsilon2)$ episodes of play to find an $varepsilon$-approximate Nash equilibrium in two-player zero-sum games.
This improves upon the best known sample complexity of $widetildemathcalO((X2A+Y2B)/varepsilon2)$ by a factor of $widetildemathcalO(maxX,
arXiv Detail & Related papers (2022-02-03T18:18:28Z) - Final Adaptation Reinforcement Learning for N-Player Games [0.0]
This paper covers n-tuple-based reinforcement learning (RL) algorithms for games.
We present new algorithms for TD-, SARSA- and Q-learning which work seamlessly on various games with arbitrary number of players.
We add a new element called Final Adaptation RL (FARL) to all these algorithms.
arXiv Detail & Related papers (2021-11-29T08:36:39Z) - Last-iterate Convergence in Extensive-Form Games [49.31256241275577]
We study last-iterate convergence of optimistic algorithms in sequential games.
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
arXiv Detail & Related papers (2021-06-27T22:02:26Z) - Efficient Pure Exploration for Combinatorial Bandits with Semi-Bandit
Feedback [51.21673420940346]
Combinatorial bandits generalize multi-armed bandits, where the agent chooses sets of arms and observes a noisy reward for each arm contained in the chosen set.
We focus on the pure-exploration problem of identifying the best arm with fixed confidence, as well as a more general setting, where the structure of the answer set differs from the one of the action set.
Based on a projection-free online learning algorithm for finite polytopes, it is the first computationally efficient algorithm which is convexally optimal and has competitive empirical performance.
arXiv Detail & Related papers (2021-01-21T10:35:09Z) - Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies
in Extensive-Form Zero-Sum Games [123.76716667704625]
We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfect-information zero-sum extensive-form game.
In that setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game.
We provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile.
arXiv Detail & Related papers (2020-09-21T17:51:57Z) - From Poincar\'e Recurrence to Convergence in Imperfect Information
Games: Finding Equilibrium via Regularization [49.368421783733815]
We show how adapting the reward can give strong convergence guarantees in monotone games.
We also show how this reward adaptation technique can be leveraged to build algorithms that converge exactly to the Nash equilibrium.
arXiv Detail & Related papers (2020-02-19T21:36:58Z) - Fast Complete Algorithm for Multiplayer Nash Equilibrium [1.7132914341329848]
We describe a new complete algorithm for computing Nash equilibrium in multiplayer general-sum games.
We demonstrate that the algorithm runs significantly faster than the prior fastest complete algorithm on several game classes previously studied.
arXiv Detail & Related papers (2020-02-11T23:42:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.