Fugu-MT Paper Translation (Abstract): Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration

Paper overview: Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration

arxiv url: http://arxiv.org/abs/2605.12084v1
Date: Tue, 12 May 2026 13:07:27 GMT
Status: Translation complete
Last updated in system: 2026-05-13 21:48:56.874115
Title: Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration
Authors: Youwei Yu, Jionghao Wang, Zhengming Yu, Wenping Wang, Lantao Liu
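Abstract summary: Information-theoretic exploration objectives aim to guide exploration toward data that reduces uncertainty in model parameters. Many parameter directions are weakly observable or unidentifiable, and even when identifiable directions are selected, omitted directions can still influence exploration and distort information measures. QOED is an adaptive information objective grounded in optimal experimental design.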
Reference score (independently computed attention score): 32.475564852975104
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Designing learnable information-theoretic objectives for robot exploration remains challenging. Such objectives aim to guide exploration toward data that reduces uncertainty in model parameters, yet it is often unclear what information the collected data can actually reveal. Although reinforcement learning (RL) can optimize a given objective, constructing objectives that reflect parametric learnability is difficult in high-dimensional robotic systems. Many parameter directions are weakly observable or unidentifiable, and even when identifiable directions are selected, omitted directions can still influence exploration and distort information measures. To address this challenge, we propose Quasi-Optimal Experimental Design (QOED), an adaptive information objective grounded in optimal experimental design. QOED (i) performs eigenspace analysis of the Fisher information matrix to identify an observable subspace and select identifiable parameter directions, and (ii) modifies the exploration objective to emphasize these directions while suppressing nuisance effects from non-critical parameters. Under bounded nuisance influence and limited coupling between critical and nuisance directions, QOED provides a constant-factor approximation to the ideal information objective that explores all parameters. We evaluate QOED on simulated and real-world navigation and manipulation tasks, where identifiable-direction selection and nuisance suppression yield performance improvements of 35.23% and 21.98%, respectively. When integrated as an exploration objective in model-based policy optimization, QOED further improves policy performance over established RL baselines.
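As a rough illustration of step (i) in the abstract, the following Python sketch eigen-decomposes a Fisher information matrix and keeps only the directions whose eigenvalues are non-negligible. It is not the authors' QOED implementation. The linear-Gaussian observation model, the relative threshold `rel_tol`, the log-determinant score, and all function names are assumptions made here for illustration only.

```python
import numpy as np


def fisher_information(jacobian, noise_std=1.0):
    """FIM for an assumed linear-Gaussian model y = J @ theta + noise."""
    return jacobian.T @ jacobian / noise_std**2


def identifiable_subspace(fim, rel_tol=1e-6):
    """Keep eigen-directions whose eigenvalue exceeds rel_tol * largest eigenvalue.

    rel_tol is a hypothetical cutoff, not a value taken from the paper.
    """
    eigvals, eigvecs = np.linalg.eigh(fim)  # symmetric input, ascending eigenvalues
    keep = eigvals > rel_tol * eigvals.max()
    return eigvals[keep], eigvecs[:, keep]


def subspace_log_det(fim, basis):
    """Log-determinant of the FIM projected onto the selected directions."""
    projected = basis.T @ fim @ basis
    _, logdet = np.linalg.slogdet(projected)
    return logdet


# Toy example: the third parameter never affects the measurements,
# so its direction carries zero Fisher information and is dropped.
J = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
F = fisher_information(J)
vals, U = identifiable_subspace(F)
score = subspace_log_det(F, U)
```

In this toy case the full matrix is singular, so a log-determinant over all three parameters would be undefined, while the score restricted to the two retained directions stays finite. The nuisance suppression of step (ii) is not modeled here.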
