Fugu-MT Paper Translation (Abstract): REBA: A Revealed Belief Automaton Framework for Online Planning in Continuous POMDPs

Paper overview: REBA: A Revealed Belief Automaton Framework for Online Planning in Continuous POMDPs

arxiv url: http://arxiv.org/abs/2606.21971v1
Date: Sat, 20 Jun 2026 10:01:05 GMT
Status: Translation complete
Last updated in system: 2026-06-25 23:28:47.902675
Title: REBA: A Revealed Belief Automaton Framework for Online Planning in Continuous POMDPs
Title (reference translation): REBA: A Revealed Belief Automaton Framework for Online Planning in Continuous POMDPs
Authors: Xiangwei Chen, Lingling Fang, Andreas Holzinger, Liming Chen
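Abstract summary: The Revealed Belief Automaton (REBA) is an event-driven framework for online certification of revelation events. We develop an incremental topology adaptation mechanism over the certified anchors to realise the online finite Belief Automaton. REBA matches or exceeds all evaluated baselines, with primary metric gains of +17.0% to +47.4% over state-of-the-art approaches.
Reference score (independently computed attention score): 10.520568737566732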
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Online planning in continuous partially observable Markov decision processes (POMDPs) using $ω$-regular specifications requires handling continuous belief dynamics within a finite symbolic memory in order to track temporal progress. Existing methods, based on either direct search in belief space or predefined discrete abstractions, suffer from drawbacks such as a lack of symbolic memory for long-horizon logical progress or difficulty of certification from noisy online beliefs. As such, obtaining reliable symbolic states online from continuous observations remains a challenge. To address this issue, we introduce the Revealed Belief Automaton (REBA), an event-driven framework that moves from global belief-space discretization to a fundamentally different approach, namely online certification of revelation events. Specifically, we propose an online revelation method that, through information-theoretic gates, can dynamically analyse and establish a belief abstraction from the continuous belief space by discovering reliable anchors among noisy beliefs. We then develop an incremental topology adaptation mechanism over the certified anchors to realise the online finite Belief Automaton. Combined with the $ω$-regular specification, REBA supports formal parity policy synthesis without a predefined discrete abstraction, which in turn can guide the Monte Carlo Tree Search process to search online beyond its local horizon. In addition, we design an error decomposition analysis that assesses the effectiveness and reliability of this discrete guidance for the underlying continuous POMDP. Empirical evaluations in patrolling and navigation scenarios show that REBA matches or exceeds all evaluated baselines, with primary metric gains of +17.0% to +47.4% over state-of-the-art approaches.
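Abstract (reference translation): Online planning for continuous partially observable Markov decision processes (POMDPs) using $ω$-regular specifications requires handling continuous belief dynamics within a finite symbolic memory in order to track temporal progress. Existing methods based on direct search in belief space or on predefined discrete abstractions suffer from drawbacks such as a lack of symbolic memory for long-horizon logical progress or difficulty of certification from noisy online beliefs. Obtaining reliable symbolic states from continuous observations therefore remains a challenge. To address this issue, we introduce the Revealed Belief Automaton (REBA), an event-driven framework that moves from global belief-space discretization to a fundamentally new way of thinking, namely online certification of revelation events. Specifically, we propose an online revelation method that, through information-theoretic gates, can dynamically analyse and establish a belief abstraction from the continuous belief space by discovering reliable anchors among noisy beliefs. We then develop an incremental topology adaptation mechanism over the certified anchors to realise the online finite automaton. Combined with the $ω$-regular specification, REBA can support formal parity policy synthesis without a predefined abstraction. In addition, we design an error decomposition analysis that can assess the effectiveness and reliability of this discrete guidance. Empirical evaluations in patrolling and navigation scenarios show that REBA matches or exceeds all evaluated baselines, with primary metric gains of +17.0% to +47.4%.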


Related papers