Related papers: Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies in Extensive-Form Zero-Sum Games

Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies in Extensive-Form Zero-Sum Games

URL: http://arxiv.org/abs/2009.10061v1
Date: Mon, 21 Sep 2020 17:51:57 GMT
Title: Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies in Extensive-Form Zero-Sum Games
Authors: Gabriele Farina and Andrea Celli and Nicola Gatti and Tuomas Sandholm
Abstract summary: We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfect-information zero-sum extensive-form game. In that setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game. We provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile.
Score: 123.76716667704625
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfect-information zero-sum extensive-form game. Team members are not allowed to communicate during play but can coordinate before the game. In that setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game. In this paper, we first provide new modeling results about computing such an optimal distribution by drawing a connection to a different literature on extensive-form correlation. Second, we provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile. We can also cap the number of such profiles we allow in the solution. This begets an anytime algorithm by increasing the cap. We find that often a handful of well-chosen such profiles suffices to reach optimal utility for the team. This enables team members to reach coordination through a relatively simple and understandable plan. Finally, inspired by this observation and leveraging theoretical concepts that we introduce, we develop an efficient column-generation algorithm for finding an optimal distribution for the team. We evaluate it on a suite of common benchmark games. It is three orders of magnitude faster than the prior state of the art on games that the latter can solve and it can also solve several games that were previously unsolvable.

Related papers

Multi-agent Multi-armed Bandits with Stochastic Sharable Arm Capacities [69.34646544774161]
We formulate a new variant of multi-player multi-armed bandit (MAB) model, which captures arrival of requests to each arm and the policy of allocating requests to players. The challenge is how to design a distributed learning algorithm such that players select arms according to the optimal arm pulling profile. We design an iterative distributed algorithm, which guarantees that players can arrive at a consensus on the optimal arm pulling profile in only M rounds.
arXiv Detail & Related papers (2024-08-20T13:57:00Z)
Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach [2.908482270923597]
This paper shows how to disentangle decision variables while maintaining optimality under hierarchical information sharing. Our approach reveals that extensive-form games always exist with solutions to a single-stage subgame, significantly reducing time complexity.
arXiv Detail & Related papers (2024-02-05T12:33:05Z)
The Update-Equivalence Framework for Decision-Time Planning [78.44953498421854]
We introduce an alternative framework for decision-time planning that is not based on solving subgames, but rather on update equivalence. We derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.
arXiv Detail & Related papers (2023-04-25T20:28:55Z)
ApproxED: Approximate exploitability descent via learned best responses [61.17702187957206]
We study the problem of finding an approximate Nash equilibrium of games with continuous action sets. We propose two new methods that minimize an approximation of exploitability with respect to the strategy profile.
arXiv Detail & Related papers (2023-01-20T23:55:30Z)
Predicting Winning Regions in Parity Games via Graph Neural Networks (Extended Abstract) [68.8204255655161]
We present an incomplete-time approach to determining the winning regions of parity games via graph neural networks. It correctly determines the winning regions of $$60% of the games in our data set and only incurs minor errors in the remaining ones.
arXiv Detail & Related papers (2022-10-18T15:10:25Z)
Multi-Agent Coordination in Adversarial Environments through Signal Mediated Strategies [37.00818384785628]
Team members can coordinate their strategies before the beginning of the game, but are unable to communicate during the playing phase of the game. In this setting, model-free RL methods are oftentimes unable to capture coordination because agents' policies are executed in a decentralized fashion. We show convergence to coordinated equilibria in cases where previous state-of-the-art multi-agent RL algorithms did not.
arXiv Detail & Related papers (2021-02-09T18:44:16Z)
Efficient Pure Exploration for Combinatorial Bandits with Semi-Bandit Feedback [51.21673420940346]
Combinatorial bandits generalize multi-armed bandits, where the agent chooses sets of arms and observes a noisy reward for each arm contained in the chosen set. We focus on the pure-exploration problem of identifying the best arm with fixed confidence, as well as a more general setting, where the structure of the answer set differs from the one of the action set. Based on a projection-free online learning algorithm for finite polytopes, it is the first computationally efficient algorithm which is convexally optimal and has competitive empirical performance.
arXiv Detail & Related papers (2021-01-21T10:35:09Z)
Finding Core Members of Cooperative Games using Agent-Based Modeling [0.0]
Agent-based modeling (ABM) is a powerful paradigm to gain insight into social phenomena. In this paper, a algorithm is developed that can be embedded into an ABM to allow the agents to find coalition.
arXiv Detail & Related papers (2020-08-30T17:38:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.