Fugu-MT Paper Translation (Abstract): Coopetition-Gym v1: A Formally Grounded Platform for Mixed-Motive Multi-Agent Reinforcement Learning under Strategic Coopetition

Paper overview: Coopetition-Gym v1: A Formally Grounded Platform for Mixed-Motive Multi-Agent Reinforcement Learning under Strategic Coopetition

arxiv url: http://arxiv.org/abs/2605.02063v1
Date: Sun, 03 May 2026 21:14:06 GMT
Status: Translation complete
Last updated in system: 2026-05-05 20:33:50.065978
Title: Coopetition-Gym v1: A Formally Grounded Platform for Mixed-Motive Multi-Agent Reinforcement Learning under Strategic Coopetition
Authors: Vik Pant, Eric Yu
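Abstract summary: Coopetition-Gym v1 is a benchmark platform for mixed-motive multi-agent reinforcement learning under strategic coopetition. The platform exposes Gymnasium, PettingZoo Parallel, and PettingZoo AEC interfaces and ships 126 reference algorithms.
Reference score (attention level computed independently by this site): 0.33985917934283577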
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present Coopetition-Gym v1, a benchmark platform for mixed-motive multi-agent reinforcement learning under strategic coopetition. The platform comprises twenty environments organized into four mechanism classes that correspond to four foundational technical reports: interdependence and complementarity (arXiv:2510.18802), trust and reputation dynamics (arXiv:2510.24909), collective action and loyalty (arXiv:2601.16237), and sequential interaction and reciprocity (arXiv:2604.01240). Each environment carries a closed-form payoff structure and a calibrated interdependence matrix derived from the corresponding report. Every environment exposes a parameterized reward layer configurable across three structurally distinct modes (private, integrated, cooperative). This separation of payoff from reward enables reward-type ablation, the platform's principal methodological apparatus. Four of the twenty environments are calibrated against historically documented coopetitive relationships and reproduce their outcomes at 98.3, 81.7, 86.7, and 87.3 percent on the validation rubric (Samsung-Sony LCD, Renault-Nissan Alliance, Apache HTTP Server, Apple iOS App Store). The platform exposes Gymnasium, PettingZoo Parallel, and PettingZoo AEC interfaces and ships 126 reference algorithms: 16 learning algorithms, 7 game-theoretic oracles, 2 heuristic baselines, and 101 constant-action policies. A reference experimental study trained the 16 learning algorithms on every environment under every reward configuration with seven random seeds, producing a 25,708-run training corpus and a 1,116-run behavioral audit corpus, both released under CC-BY-4.0 with Croissant 1.0 metadata. Coopetition-Gym v1 is the first platform to combine continuous-action mixed-motive environments, parameterized reward mutuality, calibrated interdependence coefficients, game-theoretic oracle baselines, and validated case studies.
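The abstract names three reward modes (private, integrated, cooperative) but does not give their formulas. As a rough illustration of what separating payoff from reward could look like, the sketch below makes three assumptions: private returns each agent's own payoff, integrated adds the other agents' payoffs weighted by an interdependence matrix, and cooperative gives every agent the mean payoff. The names `RewardMode` and `shape_rewards`, the formulas, and the example numbers are all hypothetical. None of this is the Coopetition-Gym API.

```python
from enum import Enum

import numpy as np


class RewardMode(Enum):
    """Hypothetical labels for the three modes named in the abstract."""
    PRIVATE = "private"
    INTEGRATED = "integrated"
    COOPERATIVE = "cooperative"


def shape_rewards(payoffs, interdependence, mode):
    """Map raw per-agent payoffs to training rewards (illustrative formulas only)."""
    payoffs = np.asarray(payoffs, dtype=float)
    d = np.asarray(interdependence, dtype=float)
    if mode is RewardMode.PRIVATE:
        # Assumption: each agent is rewarded with its own payoff.
        return payoffs.copy()
    if mode is RewardMode.INTEGRATED:
        # Assumption: own payoff plus others' payoffs, weighted by the
        # off-diagonal entries of the interdependence matrix.
        off_diagonal = d - np.diag(np.diag(d))
        return payoffs + off_diagonal @ payoffs
    if mode is RewardMode.COOPERATIVE:
        # Assumption: every agent receives the mean payoff.
        return np.full_like(payoffs, payoffs.mean())
    raise ValueError(f"unknown reward mode: {mode!r}")


# Reward-type ablation in miniature: hold the payoffs fixed, vary only the mode.
example_payoffs = [1.0, 3.0]
example_matrix = [[0.0, 0.5], [0.25, 0.0]]  # made-up coefficients
ablation = {
    mode.value: shape_rewards(example_payoffs, example_matrix, mode)
    for mode in RewardMode
}
```

Because the payoff function is untouched across modes, any difference in learned behavior between runs can be attributed to the reward configuration alone. That is the point of the ablation the abstract describes.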
