Fugu-MT Paper Translation (Abstract): Code World Models for General Game Playing

Paper summary: Code World Models for General Game Playing

arxiv url: http://arxiv.org/abs/2510.04542v1
Date: Mon, 06 Oct 2025 07:16:07 GMT
Status: Translation complete
Last updated in system: 2025-10-07 16:52:59.721879
Title: Code World Models for General Game Playing
Title (reference translation): Code World Models for General-Purpose Games
Authors: Wolfgang Lehrach, Daniel Hennes, Miguel Lazaro-Gredilla, Xinghua Lou, Carter Wendelken, Zun Li, Antoine Dedieu, Jordi Grau-Moya, Marc Lanctot, Atil Iscen, John Schultz, Marcus Chiam, Ian Gemp, Piotr Zielinski, Satinder Singh, Kevin P. Murphy
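Abstract summary: We use Large Language Models to translate natural language rules and game trajectories into a formal, executable world model represented as Python code. The generated model serves as a verifiable simulation engine for high-performance planning algorithms. We find that our method outperforms or matches Gemini 2.5 Pro in 9 of the 10 games.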
Reference score (independently computed attention score): 22.382021070682256
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The reasoning abilities of Large Language Models (LLMs) are increasingly being applied to classical board and card games, but the dominant approach -- involving prompting for direct move generation -- has significant drawbacks. It relies on the model's implicit, fragile pattern-matching capabilities, leading to frequent illegal moves and strategically shallow play. Here we introduce an alternative approach: We use the LLM to translate natural language rules and game trajectories into a formal, executable world model represented as Python code. This generated model -- comprising functions for state transition, legal move enumeration, and termination checks -- serves as a verifiable simulation engine for high-performance planning algorithms like Monte Carlo tree search (MCTS). In addition, we prompt the LLM to generate heuristic value functions (to make MCTS more efficient), and inference functions (to estimate hidden states in imperfect information games). Our method offers three distinct advantages compared to directly using the LLM as a policy: (1) Verifiability: The generated code world model (CWM) serves as a formal specification of the game's rules, allowing planners to algorithmically enumerate valid actions and avoid illegal moves, contingent on the correctness of the synthesized model; (2) Strategic Depth: We combine LLM semantic understanding with the deep search power of classical planners; and (3) Generalization: We direct the LLM to focus on the meta-task of data-to-code translation, enabling it to adapt to new games more easily. We evaluate our agent on 10 different games, of which 4 are novel and created for this paper. Five of the games are fully observed (perfect information), and five are partially observed (imperfect information). We find that our method outperforms or matches Gemini 2.5 Pro in 9 out of the 10 considered games.
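Abstract (reference translation): The reasoning abilities of large language models (LLMs) are increasingly being applied to classical board and card games, but the dominant approach, which prompts the model to generate moves directly, has significant drawbacks. It relies on the model's implicit, fragile pattern-matching capabilities, which often leads to illegal moves and strategically shallow play. As an alternative, we use the LLM to translate natural language rules and game trajectories into a formal, executable world model represented as Python code. This model, comprising functions for state transition, legal move enumeration, and termination checks, serves as a verifiable simulation engine for high-performance planning algorithms such as Monte Carlo tree search (MCTS). In addition, we prompt the LLM to generate heuristic value functions (to make MCTS more efficient) and inference functions (to estimate hidden states in imperfect information games). Compared with using the LLM directly as a policy, the method offers three advantages: (1) Verifiability: the generated code world model (CWM) serves as a formal specification of the game's rules, so planners can algorithmically enumerate valid actions and avoid illegal moves, contingent on the correctness of the synthesized model; (2) Strategic depth: LLM semantic understanding is combined with the deep search power of classical planners; (3) Generalization: the LLM is directed to focus on the meta-task of data-to-code translation, so it can adapt to new games more easily. We evaluate the agent on 10 different games, 4 of which are novel and were created for this paper. Five are fully observed (perfect information) and five are partially observed (imperfect information). We find that our method outperforms or matches Gemini 2.5 Pro in 9 of the 10 games.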
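
To make the abstract's description concrete, here is a minimal sketch of what a code world model and a planner built on it might look like. Everything in it is an assumption for illustration, not the paper's code: the toy game (players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins), the function names `legal_actions`, `apply_action`, `is_terminal` and `returns`, and the plain UCT variant of MCTS with random rollouts. The heuristic value functions and hidden-state inference functions mentioned in the abstract are not shown.

```python
import math
import random
from dataclasses import dataclass

# --- Hypothetical world model for a toy stone-taking game (not from the paper) ---

@dataclass(frozen=True)
class State:
    stones: int   # stones left in the pile
    player: int   # player to move: 0 or 1

def legal_actions(state):
    """Legal move enumeration: take 1, 2 or 3 stones, if available."""
    return [n for n in (1, 2, 3) if n <= state.stones]

def apply_action(state, action):
    """State transition; rejects moves the model considers illegal."""
    if action not in legal_actions(state):
        raise ValueError(f"illegal action {action} in {state}")
    return State(state.stones - action, 1 - state.player)

def is_terminal(state):
    """Termination check: the game ends when the pile is empty."""
    return state.stones == 0

def returns(state):
    """Payoffs (player 0, player 1) at a terminal state."""
    if not is_terminal(state):
        raise ValueError("returns() is only defined for terminal states")
    winner = 1 - state.player  # the player who just moved took the last stone
    return (1.0, -1.0) if winner == 0 else (-1.0, 1.0)

# --- Plain UCT search that only talks to the world model functions above ---

class Node:
    def __init__(self, state):
        self.state = state
        self.children = {}   # action -> Node
        self.visits = 0
        self.value = 0.0     # summed payoff for the player who moved into this node

def rollout(state, rng):
    """Play uniformly random legal moves until the game ends."""
    while not is_terminal(state):
        state = apply_action(state, rng.choice(legal_actions(state)))
    return returns(state)

def uct_score(parent, child, c):
    mean = child.value / child.visits
    return mean + c * math.sqrt(math.log(parent.visits) / child.visits)

def mcts(root_state, n_sim=2000, c=1.4, seed=0):
    """Return the most-visited root action after n_sim simulations."""
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(n_sim):
        node, path = root, [root]
        while not is_terminal(node.state):
            untried = [a for a in legal_actions(node.state)
                       if a not in node.children]
            if untried:  # expansion: add one new child, then roll out from it
                action = rng.choice(untried)
                node.children[action] = Node(apply_action(node.state, action))
                node = node.children[action]
                path.append(node)
                break
            parent = node  # selection: descend to the child with the best UCT score
            node = max(parent.children.values(),
                       key=lambda ch: uct_score(parent, ch, c))
            path.append(node)
        payoff = rollout(node.state, rng)
        for visited in path:  # backpropagation
            visited.visits += 1
            visited.value += payoff[1 - visited.state.player]
    return max(root.children, key=lambda a: root.children[a].visits)
```

Calling `mcts(State(stones=5, player=0))` asks the planner for a move from a five-stone pile. Because the planner only picks from `legal_actions`, it cannot propose a move the model rejects, which illustrates the verifiability point; whether those moves are legal in the real game still depends on the model being correct.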

List of related papers