Fugu-MT Paper Translation (Abstract): VOiLA: Vectorized Online Planning with Learned Diffusion Model for POMDP Agents

Paper overview: VOiLA: Vectorized Online Planning with Learned Diffusion Model for POMDP Agents

arxiv url: http://arxiv.org/abs/2606.19729v1
Date: Thu, 18 Jun 2026 02:51:54 GMT
Status: Translation complete
Last updated in system: 2026-06-19 18:23:39.619192
Title: VOiLA: Vectorized Online Planning with Learned Diffusion Model for POMDP Agents
Title (reference translation): VOiLA: Vectorized Online Planning with Learned Diffusion Model for POMDP Agents
Authors: Marcus Hoerger, Rishikesh Joshi, Rahul Shome, Ian Manchester, Hanna Kurniawati
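Abstract summary: Planning under uncertainty is an essential capability for autonomous robots. This paper proposes a framework that learns task-agnostic POMDP models for online planning under uncertainty.
Reference score (independently computed attention score): 9.270170611697141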
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Planning under uncertainty is an essential capability for autonomous robots. The Partially Observable Markov Decision Process (POMDP) provides a powerful framework for such a capability. Although POMDP-based planning has advanced significantly, its application to real-world problems is often limited by the difficulty of obtaining faithful POMDP models. We present Vectorized Online planning wIth Learned diffusion model for POMDP Agents (VOiLA), a framework that learns task-agnostic POMDP models for online planning under uncertainty. VOiLA learns transition and observation samplers using conditional diffusion models and learns observation-likelihood models for particle-based belief updates. To enable efficient online planning, the diffusion samplers are distilled into compact feedforward generators and integrated with Vectorized Online POMDP Planner (VOPP), an online POMDP planner designed to leverage GPU parallelization. Experimental results indicate that the distillation strategy reduces sampling cost by up to nearly three orders of magnitude, making learned generative POMDP models practical for online planning. Evaluation of VOiLA on three benchmark problems indicates that VOiLA achieves equal or better performance than Recurrent Soft Actor Critic while using less than 10% of the training data, and generalizes much better to unseen environment configurations. Physical robot evaluation indicates that VOiLA, using models learned only from simulated data, generates a policy that successfully accomplishes the task in 10 of 10 runs.
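Abstract (reference translation): Planning under uncertainty is an essential capability for autonomous robots. The Partially Observable Markov Decision Process (POMDP) provides a powerful framework for such a capability. POMDP-based planning has advanced significantly, but its application to real-world problems is often limited by the difficulty of obtaining faithful POMDP models. We present Vectorized Online planning wIth Learned diffusion model for POMDP Agents (VOiLA), a framework that learns task-agnostic POMDP models for online planning under uncertainty. VOiLA learns transition and observation samplers using conditional diffusion models, and learns observation-likelihood models for particle-based belief updates. To enable efficient online planning, the diffusion samplers are distilled into compact feedforward generators and integrated with Vectorized Online POMDP Planner (VOPP), an online POMDP planner designed to leverage GPU parallelization. Experimental results indicate that the distillation strategy reduces sampling cost by up to nearly three orders of magnitude, making learned generative POMDP models practical for online planning. Evaluation of VOiLA on three benchmark problems indicates that VOiLA achieves performance equal to or better than Recurrent Soft Actor Critic while using less than 10% of the training data, and generalizes much better to unseen environment configurations. Physical robot evaluation indicates that VOiLA, using models learned only from simulated data, generates a policy that successfully accomplishes the task in 10 of 10 runs.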
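
The particle-based belief update mentioned in the abstract can be illustrated with a minimal sketch. This is not VOiLA's implementation, which the abstract does not detail; it is a generic particle-filter step under stated assumptions. The names `belief_update`, `toy_transition`, and `toy_likelihood` are hypothetical, and the two toy functions are hand-written 1-D stand-ins for what would be a learned transition sampler and a learned observation-likelihood model. The sketch is plain sequential Python, not GPU-vectorized.

```python
import math
import random


def belief_update(particles, action, observation,
                  sample_transition, obs_likelihood, rng):
    """One particle-filter step: propagate, weight, resample."""
    # Propagate each particle through the transition sampler.
    predicted = [sample_transition(s, action, rng) for s in particles]
    # Weight each predicted state by the observation likelihood.
    weights = [obs_likelihood(observation, s, action) for s in predicted]
    if sum(weights) <= 0.0:
        # No particle explains the observation; keep the predicted set.
        return predicted
    # Resample with replacement in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


# Hypothetical stand-in for a learned transition sampler (assumption).
def toy_transition(state, action, rng):
    return state + action + rng.gauss(0.0, 0.1)


# Hypothetical stand-in for a learned observation-likelihood model
# (assumption): unnormalized Gaussian around the state.
def toy_likelihood(observation, state, action):
    return math.exp(-0.5 * ((observation - state) / 0.2) ** 2)


rng = random.Random(0)
prior = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
posterior = belief_update(prior, 0.5, 1.0,
                          toy_transition, toy_likelihood, rng)
posterior_mean = sum(posterior) / len(posterior)
```

After the update, the particle set keeps its size and concentrates near states consistent with the observation. In a setting like the one the abstract describes, the two toy functions would be replaced by learned models, and the per-particle loops would presumably be batched.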

Related papers