Fugu-MT Paper Translation (Abstract): Bitboard version of Tetris AI

Paper overview: Bitboard version of Tetris AI

arxiv url: http://arxiv.org/abs/2603.26765v1
Date: Tue, 24 Mar 2026 02:35:09 GMT
Status: Translation complete
Last updated in system: 2026-03-31 23:18:44.58522
Title: Bitboard version of Tetris AI
Title (reference translation): Bitboard version of Tetris AI
Authors: Xingguo Chen, Pingshou Xiong, Zhenyu Luo, Mengfei Hu, Xinwen Li, Yongzhou Lü, Guang Yang, Chao Li, Shangdong Yang
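Abstract summary: Existing Tetris implementations suffer from low simulation speeds, suboptimal state evaluation, and inefficient training paradigms. This paper proposes a high-performance Tetris AI framework based on bitboard optimization and improved RL algorithms.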
Reference score (independently calculated attention score): 9.23305813094404
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The efficiency of game engines and policy optimization algorithms is crucial for training reinforcement learning (RL) agents in complex sequential decision-making tasks, such as Tetris. Existing Tetris implementations suffer from low simulation speeds, suboptimal state evaluation, and inefficient training paradigms, limiting their utility for large-scale RL research. To address these limitations, this paper proposes a high-performance Tetris AI framework based on bitboard optimization and improved RL algorithms. First, we redesign the Tetris game board and tetrominoes using bitboard representations, leveraging bitwise operations to accelerate core processes (e.g., collision detection, line clearing, and Dellacherie-Thiery Features extraction) and achieve a 53-fold speedup compared to OpenAI Gym-Tetris. Second, we introduce an afterstate-evaluating actor network that simplifies state value estimation by leveraging Tetris afterstate property, outperforming traditional action-value networks with fewer parameters. Third, we propose a buffer-optimized Proximal Policy Optimization (PPO) algorithm that balances sampling and update efficiency, achieving an average score of 3,829 on 10x10 grids within 3 minutes. Additionally, we develop a Python-Java interface compliant with the OpenAI Gym standard, enabling seamless integration with modern RL frameworks. Experimental results demonstrate that our framework enhances Tetris's utility as an RL benchmark by bridging low-level bitboard optimizations with high-level AI strategies, providing a sample-efficient and computationally lightweight solution for scalable sequential decision-making research.
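Abstract (reference translation): The efficiency of game engines and policy optimization algorithms is essential for training reinforcement learning (RL) agents on complex decision-making tasks such as Tetris. Existing Tetris implementations suffer from low simulation speeds, suboptimal state evaluation, and inefficient training paradigms, which limits their usefulness for large-scale RL research. To address these limitations, the paper proposes a high-performance Tetris AI framework based on bitboard optimization and improved RL algorithms. First, the Tetris game board and tetrominoes are redesigned using bitboard representations, which accelerates core processes (e.g., collision detection, line clearing, and Dellacherie-Thiery feature extraction) and achieves a 53-fold speedup compared to OpenAI Gym-Tetris. Second, the paper introduces an afterstate-evaluating actor network that simplifies state value estimation by exploiting the afterstate property of Tetris and outperforms traditional action-value networks while using fewer parameters. Third, it proposes a buffer-optimized Proximal Policy Optimization (PPO) algorithm that balances sampling and update efficiency, reaching an average score of 3,829 on 10x10 grids within 3 minutes. In addition, a Python-Java interface compliant with the OpenAI Gym standard is developed, enabling seamless integration with modern RL frameworks. The experiments show that the framework increases the usefulness of Tetris as an RL benchmark by bridging low-level bitboard optimizations with high-level AI strategies, and that it provides a sample-efficient, computationally lightweight solution for scalable sequential decision-making research.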
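
Illustrative sketch (not from the paper): the abstract says that bitboard representations speed up collision detection and line clearing, but it gives no implementation details. The Python code below is a minimal sketch of the general bitboard idea, assuming each board row is stored as one integer bitmask and a piece is a short list of row masks. All names (`WIDTH`, `shift_piece`, `collides`, `drop_row`, `place`, `clear_lines`) and the row layout are hypothetical choices made for illustration; the authors' engine may be organized quite differently.

```python
WIDTH = 10                    # assumed board width (the abstract mentions 10x10 grids)
FULL_ROW = (1 << WIDTH) - 1   # bitmask of a completely filled row


def shift_piece(piece_rows, x):
    """Shift a piece (row masks anchored at column 0) right by x columns.

    Returns None if any cell would leave the board horizontally.
    """
    shifted = [mask << x for mask in piece_rows]
    if x < 0 or any(mask & ~FULL_ROW for mask in shifted):
        return None
    return shifted


def collides(board, piece_rows, y):
    """True if the piece, with its lowest row at board row y, overlaps a
    filled cell or extends past the bottom or top of the board.

    board[0] is the bottom row; each row is an int bitmask.
    """
    for dy, mask in enumerate(piece_rows):
        row = y + dy
        if row < 0 or row >= len(board):
            return True
        if board[row] & mask:      # a single AND tests a whole row at once
            return True
    return False


def drop_row(board, piece_rows):
    """Hard drop: lowest row index where the piece rests, or None if it
    cannot even be placed at the top of the board."""
    y = len(board) - len(piece_rows)
    if collides(board, piece_rows, y):
        return None
    while y > 0 and not collides(board, piece_rows, y - 1):
        y -= 1
    return y


def place(board, piece_rows, y):
    """Return a new board with the piece merged in via bitwise OR."""
    new_board = list(board)
    for dy, mask in enumerate(piece_rows):
        new_board[y + dy] |= mask
    return new_board


def clear_lines(board):
    """Remove full rows and pad with empty rows on top.

    Returns (new_board, number_of_cleared_rows).
    """
    kept = [row for row in board if row != FULL_ROW]
    cleared = len(board) - len(kept)
    return kept + [0] * cleared, cleared
```

In this sketch a collision check costs one AND per piece row and a full-line test is a single integer comparison, which is the kind of saving a bitboard layout aims for over cell-by-cell loops. The board that results from `place` followed by `clear_lines` is what the RL literature calls an afterstate, the position an afterstate-evaluating network would score.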

Related papers