Fugu-MT Paper Translation (Abstract): Independent Learning of Nash Equilibria in Partially Observable Markov Potential Games with Decoupled Dynamics

Paper summary: Independent Learning of Nash Equilibria in Partially Observable Markov Potential Games with Decoupled Dynamics

arxiv url: http://arxiv.org/abs/2605.06377v1
Date: Thu, 07 May 2026 14:56:56 GMT
Status: Translation complete
System update date: 2026-05-08 22:27:11.92195
Title: Independent Learning of Nash Equilibria in Partially Observable Markov Potential Games with Decoupled Dynamics
Title (reference translation): Independent Learning of Nash Equilibria in Partially Observable Markov Potential Games with Decoupled Dynamics
Authors: Philip Jordan, Maryam Kamgarpour
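Abstract summary: This work studies Nash equilibrium learning in partially observable Markov games (POMGs). It proposes an independent learning algorithm in which each player observes only its own actions and observations and does not communicate with other players.
Reference score (independently computed attention score): 8.784438985280092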
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study Nash equilibrium learning in partially observable Markov games (POMGs), a multi-agent reinforcement learning framework in which agents cannot fully observe the underlying state. Prior work in this setting relies on centralization or information sharing, and suffers from sample and computational complexity that scales exponentially in the number of players. We focus on a subclass of POMGs with independent state transitions, where agents remain coupled through their rewards, and assume that the underlying fully observed Markov game is a Markov potential game. For this class, we present an independent learning algorithm in which players, observing only their own actions and observations and without communication, jointly converge to an approximate Nash equilibrium. Due to partial observability, optimal policies may in general depend on the full action-observation history. Under a filter stability assumption, we show that policies based on finite history windows provide sufficient approximation guarantees. This enables us to approximate the POMG by a surrogate Markov game that is near-potential, leading to quasi-polynomial sample and computational complexity for independent Nash equilibrium learning in the underlying POMG.
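Abstract (reference translation): We study Nash equilibrium learning in partially observable Markov games (POMGs), a multi-agent reinforcement learning framework in which agents cannot fully observe the underlying state. Prior work in this setting relies on centralization or information sharing and suffers from sample and computational complexity that scales exponentially in the number of players. We focus on a subclass of POMGs with independent state transitions, in which agents remain coupled through their rewards, and assume that the underlying fully observed Markov game is a Markov potential game. For this class, we present an independent learning algorithm in which players, observing only their own actions and observations and without communication, jointly converge to an approximate Nash equilibrium. Because of partial observability, optimal policies may in general depend on the full action-observation history. Under a filter stability assumption, we show that policies based on finite history windows provide sufficient approximation guarantees. This lets us approximate the POMG by a surrogate Markov game that is near-potential, which yields quasi-polynomial sample and computational complexity for independent Nash equilibrium learning in the underlying POMG.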

Related papers