Fugu-MT Paper Translation (Abstract): Real-Time Parallel Counterfactual Regret Minimization

Paper summary: Real-Time Parallel Counterfactual Regret Minimization

arxiv url: http://arxiv.org/abs/2605.19928v1
Date: Tue, 19 May 2026 14:49:30 GMT
Status: Translation complete
Last updated in system: 2026-05-20 15:03:09.412323
Title: Real-Time Parallel Counterfactual Regret Minimization
Title (reference translation): Real-Time Parallel Counterfactual Regret Minimization
Authors: Boning Li, Longbo Huang
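Abstract summary: In real-time game-playing systems, the solver must compute a near-equilibrium strategy within a strict time budget of a few seconds per decision. The paper presents Parallel CFR, the first parallelization framework for real-time depth-limited CFR solving. In experiments on Heads-Up No-Limit Texas Hold'em, Parallel CFR achieved a 3.3–3.4× speedup over the single-threaded baseline.
Reference score (independently computed attention measure): 38.06569764716213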
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Counterfactual Regret Minimization (CFR) is the dominant algorithmic family for solving large imperfect-information games, underpinning breakthroughs such as Libratus and Pluribus in No-Limit Texas Hold'em poker. In real-time game-playing systems, the solver must compute a near-equilibrium strategy within a strict time budget of only a few seconds per decision, and the number of CFR iterations completed in this window directly determines play strength. We present Parallel CFR, the first parallelization framework for real-time depth-limited CFR solving that seamlessly integrates pruning, abstraction, and advanced CFR variants. We decompose each CFR iteration into a pipeline of seven stages and identify two orthogonal dimensions of parallelism: by information set and by tree node. Leaf node evaluation is offloaded to GPUs via batched neural network inference, creating a heterogeneous CPU-GPU pipeline. Experiments on Heads-Up No-Limit Texas Hold'em demonstrate that Parallel CFR achieves 3.3–3.4× speedup over the single-threaded baseline on postflop streets, with per-iteration time of ~47–54 ms on a depth-limited game tree with over 1 billion histories. All experiments run on a single desktop-class device (NVIDIA DGX Spark), enabling hundreds of CFR iterations within a typical real-time decision budget without requiring datacenter-scale infrastructure.
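Abstract (reference translation): Counterfactual Regret Minimization (CFR) is the dominant family of algorithms for solving large imperfect-information games, underpinning breakthroughs such as Libratus and Pluribus in No-Limit Texas Hold'em poker. In real-time game-playing systems, the solver must compute a near-equilibrium strategy within a strict time budget of a few seconds per decision, and the number of CFR iterations completed in this window directly determines play strength. The paper describes Parallel CFR, the first parallelization framework for real-time depth-limited CFR solving. Each CFR iteration is decomposed into a seven-stage pipeline, and two orthogonal dimensions of parallelism are identified: by information set and by tree node. Leaf node evaluation is offloaded to GPUs via batched neural network inference, creating a heterogeneous CPU-GPU pipeline. Experiments on Heads-Up No-Limit Texas Hold'em show that Parallel CFR achieves a 3.3–3.4× speedup over the single-threaded baseline on postflop streets. All experiments run on a single desktop-class device (NVIDIA DGX Spark), enabling hundreds of CFR iterations within a typical real-time decision budget without requiring datacenter-scale infrastructure.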
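
Note: the abstract links per-iteration time to the number of CFR iterations that fit in a decision budget. The following is a minimal arithmetic sketch of that relationship, not code from the paper. The per-iteration range (47 to 54 ms) comes from the abstract; the function name `iterations_within_budget` and the 5 s and 10 s budgets are illustrative assumptions, since the abstract only says "a few seconds".

```python
import math


def iterations_within_budget(budget_s: float, per_iter_ms: float) -> int:
    """Return how many whole iterations of per_iter_ms fit into budget_s seconds."""
    if budget_s < 0 or per_iter_ms <= 0:
        raise ValueError("budget must be non-negative and per-iteration time positive")
    return math.floor(budget_s * 1000.0 / per_iter_ms)


# Per-iteration time range reported in the abstract (milliseconds).
FAST_ITER_MS = 47.0
SLOW_ITER_MS = 54.0

# Hypothetical decision budgets in seconds (assumed, not taken from the paper).
ASSUMED_BUDGETS_S = (5.0, 10.0)

if __name__ == "__main__":
    for budget in ASSUMED_BUDGETS_S:
        low = iterations_within_budget(budget, SLOW_ITER_MS)
        high = iterations_within_budget(budget, FAST_ITER_MS)
        print(f"{budget:.0f} s budget: {low} to {high} iterations")
```

Under these assumed budgets, the count scales linearly with the budget and inversely with per-iteration time, so any speedup in a single iteration converts directly into more iterations per decision.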

Related papers