Fugu-MT Paper Translation (Abstract): Explore Like Humans: Autonomous Exploration with Online SG-Memo Construction for Embodied Agents

Paper overview: Explore Like Humans: Autonomous Exploration with Online SG-Memo Construction for Embodied Agents

arxiv url: http://arxiv.org/abs/2604.19034v1
Date: Tue, 21 Apr 2026 03:35:31 GMT
Status: Translation complete
Last updated in system: 2026-04-22 22:41:49.599256
Title: Explore Like Humans: Autonomous Exploration with Online SG-Memo Construction for Embodied Agents
Title (reference translation): Explore Like Humans: Autonomous Exploration with Online SG-Memo Construction for Embodied Agents
Authors: Xu Chen, Shichao Xie, Zhining Gu, Lu Jia, Minghua Luo, Fei Liu, Zedong Chu, Yanfen Shen, Xiaolong Wu, Mu Xu
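Abstract summary: ABot-Explorer is a novel active exploration framework that unifies memory construction and exploration into an online, RGB-only process. At its core, ABot-Explorer leverages Large Vision-Language Models (VLMs) to distill Semantic Navigational Affordances (SNA). By dynamically integrating these SNAs into a hierarchical SG-Memo, ABot-Explorer mirrors human-like exploratory logic by prioritizing structural transit nodes.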
Reference score (independently computed attention metric): 15.948899354590408
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Constructing structured spatial memory is essential for enabling long-horizon reasoning in complex embodied navigation tasks. Current memory construction predominantly relies on a decoupled, two-stage paradigm: agents first aggregate environmental data through exploration, followed by the offline reconstruction of spatial memory. However, this post-hoc and geometry-centric approach precludes agents from leveraging high-level semantic intelligence, often causing them to overlook navigationally critical landmarks (e.g., doorways and staircases) that serve as fundamental semantic anchors in human cognitive maps. To bridge this gap, we propose ABot-Explorer, a novel active exploration framework that unifies memory construction and exploration into an online, RGB-only process. At its core, ABot-Explorer leverages Large Vision-Language Models (VLMs) to distill Semantic Navigational Affordances (SNA), which act as cognitive-aligned anchors to guide the agent's movement. By dynamically integrating these SNAs into a hierarchical SG-Memo, ABot-Explorer mirrors human-like exploratory logic by prioritizing structural transit nodes to facilitate efficient coverage. To support this framework, we contribute a large-scale dataset extending InteriorGS with SNA and SG-Memo annotations. Experimental results demonstrate that ABot-Explorer significantly outperforms current state-of-the-art methods in both exploration efficiency and environment coverage, while the resulting SG-Memo is shown to effectively support diverse downstream tasks.
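Abstract (reference translation): Building structured spatial memory is essential for enabling long-horizon reasoning in complex embodied navigation tasks. Current memory construction relies mainly on a decoupled, two-stage paradigm: the agent first aggregates environmental data through exploration, and the spatial memory is then reconstructed offline. However, this post-hoc, geometry-centric approach prevents the agent from leveraging high-level semantic intelligence, so it often overlooks navigationally critical landmarks (e.g., doorways and staircases) that serve as fundamental semantic anchors in human cognitive maps. To bridge this gap, we propose ABot-Explorer, a novel active exploration framework that unifies memory construction and exploration into an online, RGB-only process. ABot-Explorer uses Large Vision-Language Models (VLMs) to distill Semantic Navigational Affordances (SNA), which act as cognitively aligned anchors that guide the agent's movement. By dynamically integrating these SNAs into a hierarchical SG-Memo, ABot-Explorer prioritizes structural transit nodes, mirroring human-like exploratory logic and facilitating efficient coverage. To support this framework, we contribute a large-scale dataset that extends InteriorGS with SNA and SG-Memo annotations. Experimental results show that ABot-Explorer outperforms current state-of-the-art methods in both exploration efficiency and environment coverage, and that the resulting SG-Memo effectively supports diverse downstream tasks.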


Related papers