PokerGPT: An End-to-End Lightweight Solver for Multi-Player Texas
Hold'em via Large Language Model
- URL: http://arxiv.org/abs/2401.06781v1
- Date: Thu, 4 Jan 2024 13:27:50 GMT
- Title: PokerGPT: An End-to-End Lightweight Solver for Multi-Player Texas
Hold'em via Large Language Model
- Authors: Chenghao Huang, Yanbo Cao, Yinlong Wen, Tao Zhou, Yanru Zhang
- Abstract summary: Poker, also known as Texas Hold'em, has always been a typical research target within imperfect information games (IIGs).
We introduce PokerGPT, an end-to-end solver for playing Texas Hold'em with an arbitrary number of players that achieves high win rates.
- Score: 14.14786217204364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Poker, also known as Texas Hold'em, has always been a typical research target
within imperfect information games (IIGs). IIGs have long served as a measure
of artificial intelligence (AI) development. Representative prior works, such
as DeepStack and Libratus, rely heavily on counterfactual regret minimization
(CFR) to tackle heads-up no-limit Poker. However, it is challenging for
subsequent researchers to learn CFR from previous models and apply it to other
real-world applications due to the expensive computational cost of CFR
iterations. Additionally, CFR is difficult to apply to multi-player games
because of the exponential growth of the game tree. In this work, we introduce
PokerGPT, an end-to-end solver for playing Texas Hold'em with an arbitrary
number of players that achieves high win rates, built on a lightweight large
language model (LLM). PokerGPT requires only simple textual descriptions of
Poker games to generate decision-making advice, enabling convenient
interaction between AI and humans. We transform a set of textual records
acquired from real games into prompts and use them to fine-tune a lightweight
pre-trained LLM with reinforcement learning from human feedback (RLHF). To
improve fine-tuning performance, we apply prompt engineering to the raw data:
filtering useful information, selecting the behaviors of players with high win
rates, and further processing them into textual instructions using multiple
prompt-engineering techniques. Through experiments, we demonstrate that
PokerGPT outperforms previous approaches in terms of win rate, model size,
training time, and response speed, indicating the great potential of LLMs in
solving IIGs.
Related papers
- Instruction-Driven Game Engine: A Poker Case Study [53.689520884467065]
The IDGE project aims to democratize game development by enabling a large language model to follow free-form game descriptions and generate game-play processes.
We train the IDGE in a curriculum manner that progressively increases its exposure to complex scenarios.
Our initial progress lies in developing an IDGE for Poker, which not only supports a wide range of poker variants but also allows for highly individualized new poker games through natural language inputs.
arXiv Detail & Related papers (2024-10-17T11:16:27Z)
- AlphaDou: High-Performance End-to-End Doudizhu AI Integrating Bidding [6.177038245239759]
This paper modifies the Deep Monte Carlo algorithm framework by using reinforcement learning to obtain a neural network that simultaneously estimates win rates and expectations.
The modified algorithm enables the AI to perform the full range of tasks in the Doudizhu game, including bidding and cardplay.
arXiv Detail & Related papers (2024-07-14T17:32:36Z)
- Instruction-Driven Game Engines on Large Language Models [59.280666591243154]
The IDGE project aims to democratize game development by enabling a large language model to follow free-form game rules.
We train the IDGE in a curriculum manner that progressively increases the model's exposure to complex scenarios.
Our initial progress lies in developing an IDGE for Poker, a universally cherished card game.
arXiv Detail & Related papers (2024-03-30T08:02:16Z)
- A Survey on Game Theory Optimal Poker [0.0]
No non-trivial imperfect information game has been solved to date.
This makes poker a great test bed for Artificial Intelligence research.
We discuss the intricacies of abstraction techniques, betting models, and specific strategies employed by successful poker bots.
arXiv Detail & Related papers (2024-01-02T04:19:25Z)
- DanZero+: Dominating the GuanDan Game through Reinforcement Learning [95.90682269990705]
We develop an AI program for an exceptionally complex and popular card game called GuanDan.
We first put forward an AI program named DanZero for this game.
In order to further enhance the AI's capabilities, we apply a policy-based reinforcement learning algorithm to GuanDan.
arXiv Detail & Related papers (2023-12-05T08:07:32Z)
- SPRING: Studying the Paper and Reasoning to Play Games [102.5587155284795]
We propose a novel approach, SPRING, that reads the game's original academic paper and uses the learned knowledge to reason about and play the game through a large language model (LLM).
In experiments, we study the quality of in-context "reasoning" induced by different forms of prompts under the setting of the Crafter open-world environment.
Our experiments suggest that LLMs, when prompted with consistent chain-of-thought, have great potential in completing sophisticated high-level trajectories.
arXiv Detail & Related papers (2023-05-24T18:14:35Z)
- Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals [69.76245723797368]
Read and Reward speeds up RL algorithms on Atari games by reading manuals released by the Atari game developers.
Various RL algorithms obtain significant improvement in performance and training speed when assisted by our design.
arXiv Detail & Related papers (2023-02-09T05:47:03Z)
- Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games [31.97631243571394]
We introduce a framework, LMAC, that automates the discovery of the update rule without explicit human design.
Surprisingly, even without human design, the discovered MARL algorithms achieve competitive or even better performance.
We show that LMAC is able to generalise from small games to large games, for example training on Kuhn Poker and outperforming PSRO.
arXiv Detail & Related papers (2021-06-04T22:30:25Z)
- Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games [114.90723492840499]
In large two-player zero-sum imperfect-information games, modern extensions of counterfactual regret minimization (CFR) are currently the practical state of the art for computing a Nash equilibrium.
We formalize an online learning setting in which the strategy space is not known to the agent.
We give an efficient algorithm that achieves $O(T^{3/4})$ regret with high probability for that setting, even when the agent faces an adversarial environment.
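For context, the guarantee above refers to the standard external-regret notion: the gap between the agent's cumulative utility and that of the best fixed strategy in hindsight. As a sketch, with $u$ the agent's utility, $x_t$ its strategy at round $t$, and $y_t$ the environment's play (symbols are illustrative, not the paper's notation):

```latex
R_T \;=\; \max_{x^\star} \sum_{t=1}^{T} u(x^\star, y_t) \;-\; \sum_{t=1}^{T} u(x_t, y_t) \;=\; O\!\left(T^{3/4}\right)
```

A sublinear bound like this means the average regret $R_T/T \to 0$ as $T$ grows, which is what makes such algorithms useful for approximating equilibria in self-play.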
arXiv Detail & Related papers (2021-03-08T04:03:24Z)
- ScrofaZero: Mastering Trick-taking Poker Game Gongzhu by Deep Reinforcement Learning [2.7178968279054936]
We study Gongzhu, a trick-taking game analogous to, but slightly simpler than, contract bridge.
We train a strong Gongzhu AI, ScrofaZero, from tabula rasa by deep reinforcement learning.
We introduce new techniques for imperfect-information games, including stratified sampling, importance weighting, integration over equivalence classes, and Bayesian inference.
arXiv Detail & Related papers (2021-02-15T12:01:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.