Fugu-MT Paper Translation (Abstract): PokerSkill: LLMs Can Play Expert-Level Poker without Training or Solvers

Paper abstract: PokerSkill: LLMs Can Play Expert-Level Poker without Training or Solvers

arxiv url: http://arxiv.org/abs/2605.30094v1
Date: Thu, 28 May 2026 15:38:33 GMT
Status: Translation complete
Last updated in system: 2026-05-30 02:45:56.431179
Title: PokerSkill: LLMs Can Play Expert-Level Poker without Training or Solvers
Title (reference translation): PokerSkill: LLMs Can Play Expert-Level Poker without Training or Solvers
Authors: Boning Li, Baoxiang Wang, Longbo Huang
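Abstract summary: Large Language Models (LLMs) possess extensive poker knowledge but perform far below solver-based agents when asked to play directly. Traditional rule-based poker agents are interpretable and training-free, but their strategic ceiling remains far below equilibrium play. PokerSkill is a training-free and solver-free framework that bridges this gap.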
Reference score (independently computed attention score): 42.39656827917113
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Poker is a landmark challenge for artificial intelligence. The dominant approach relies on equilibrium solvers built on counterfactual regret minimization, requiring millions of core-hours of training. Large Language Models (LLMs) possess extensive poker knowledge but perform far below solver-based agents when asked to play directly. Traditional rule-based poker agents are interpretable and training-free, but their strategic ceiling remains far below equilibrium play. We introduce PokerSkill, a training-free and solver-free framework that bridges this gap by using detailed rule-based poker skills as a structured action-grounding interface for LLMs. A deterministic context engine analyzes the current state and retrieves only the relevant fragments from a layered skill library, which is entirely designed by human poker experts, constraining the LLM's choice to reasonable actions. Against GTOWizard, a state-of-the-art GTO benchmark, GPT-5.5 XHigh with PokerSkill achieves $-57 \pm 21$ mbb/hand, Claude Opus 4.6 achieves $-80 \pm 29$ mbb/hand and Claude Opus 4.7 achieves $-87 \pm 64$ mbb/hand, reducing losses by 49-61% compared to default-prompt baselines and outperforming the strong bot Slumbot. Our key finding is that rule-based skills alone do not constitute a strong strategy, and LLMs alone cannot play well, but their combination yields an agent that requires neither training nor solver access yet competes with systems built on millions of core-hours of computation. To our knowledge, this is the first demonstration of an LLM achieving competitive performance in a complex imperfect-information game without game-specific training or solver queries. Code is available at https://github.com/lbn187/PokerSkill.
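For readers unfamiliar with the unit, mbb/hand is milli-big-blinds won per hand, so -57 mbb/hand means losing 0.057 big blinds per hand on average. The following is a minimal sketch of how such a figure and its ± margin could be computed from per-hand results. The function name `mbb_per_hand`, the sample data, and the reading of ± as a 95% normal-approximation interval are assumptions made for illustration, not details taken from the paper.

```python
import math
import statistics


def mbb_per_hand(results_bb, z=1.96):
    """Return (mean, margin) in mbb/hand from per-hand results in big blinds.

    Assumption: the margin is z times the standard error of the mean,
    i.e. a normal-approximation interval. The paper may use another method.
    """
    n = len(results_bb)
    mean_bb = statistics.fmean(results_bb)
    stderr_bb = statistics.stdev(results_bb) / math.sqrt(n)
    # 1 big blind = 1000 milli-big-blinds.
    return 1000 * mean_bb, 1000 * z * stderr_bb


# Hypothetical per-hand results in big blinds (not data from the paper).
sample = [1.5, -1.0, -0.5, 2.0, -3.0, 0.5, -1.0, 1.0]
mean, margin = mbb_per_hand(sample)
print(f"{mean:.1f} ± {margin:.1f} mbb/hand")
```

With only a handful of hands the margin is far wider than the mean, which is why evaluations of this kind need many hands before a difference between agents becomes meaningful.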
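Abstract (reference translation): Poker is a landmark challenge for artificial intelligence. The dominant approach relies on equilibrium solvers built on counterfactual regret minimization and requires millions of core-hours of training. Large language models (LLMs) have extensive poker knowledge, but when asked to play directly they perform far below solver-based agents. Traditional rule-based poker agents are interpretable and need no training, but their strategic ceiling remains far below equilibrium play. The paper introduces PokerSkill, a training-free and solver-free framework that bridges this gap by using detailed rule-based poker skills as a structured action-grounding interface for LLMs. A deterministic context engine analyzes the current state and retrieves only the relevant fragments from a layered skill library. Against GTOWizard, a state-of-the-art GTO benchmark, GPT-5.5 XHigh with PokerSkill achieves $-57 \pm 21$ mbb/hand, Claude Opus 4.6 achieves $-80 \pm 29$ mbb/hand, and Claude Opus 4.7 achieves $-87 \pm 64$ mbb/hand, reducing losses by 49-61% compared with default-prompt baselines and outperforming Slumbot. The key finding is that rule-based skills alone do not constitute a strong strategy and LLMs alone cannot play well, but their combination yields an agent that needs neither training nor solver access yet competes with systems built on millions of core-hours of computation. To the authors' knowledge, this is the first demonstration of an LLM achieving competitive performance in a complex imperfect-information game without game-specific training or solver queries. Code is available at https://github.com/lbn187/PokerSkill.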
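The pipeline described in the abstract (a deterministic context engine that retrieves relevant skill fragments and restricts the LLM to reasonable actions) can be illustrated with a small sketch. Everything here is hypothetical: the `SkillFragment` fields, the two-entry `LIBRARY`, and the helpers `retrieve`, `build_prompt`, and `constrain` are invented for illustration and are not the authors' code or the structure of their skill library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillFragment:
    street: str        # "preflop", "flop", "turn", or "river"
    facing_bet: bool   # whether the fragment applies when facing a bet
    advice: str        # expert-written guidance shown to the LLM
    allowed: tuple     # actions this fragment treats as reasonable


# Hypothetical two-entry library; a real one would be far larger and layered.
LIBRARY = [
    SkillFragment("flop", False, "With top pair or better, bet for value.",
                  ("check", "bet")),
    SkillFragment("flop", True, "Fold weak hands to large bets; raise strong draws.",
                  ("fold", "call", "raise")),
]


def retrieve(state, library=LIBRARY):
    """Deterministically select the fragments that match the current state."""
    return [f for f in library
            if f.street == state["street"] and f.facing_bet == state["facing_bet"]]


def build_prompt(state, fragments):
    """Assemble a prompt containing only the retrieved advice and legal choices."""
    allowed = sorted({a for f in fragments for a in f.allowed})
    lines = [f"Street: {state['street']}, facing bet: {state['facing_bet']}"]
    lines += [f"- {f.advice}" for f in fragments]
    lines.append("Choose one of: " + ", ".join(allowed))
    return "\n".join(lines)


def constrain(llm_choice, fragments, fallback):
    """Accept the LLM's choice only if a retrieved fragment allows it."""
    allowed = {a for f in fragments for a in f.allowed}
    return llm_choice if llm_choice in allowed else fallback


state = {"street": "flop", "facing_bet": True}
fragments = retrieve(state)
prompt = build_prompt(state, fragments)
# "shove" stands in for an LLM reply that falls outside the allowed set.
action = constrain("shove", fragments, fallback="fold")
print(prompt)
print("Action:", action)
```

The point of the design, as the abstract frames it, is that neither half suffices alone: the rules narrow the decision to sensible options, and the LLM picks among them.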

Related papers