Fugu-MT paper translation (abstract): SysNav: Multi-Level Systematic Cooperation Enables Real-World, Cross-Embodiment Object Navigation

Paper overview: SysNav: Multi-Level Systematic Cooperation Enables Real-World, Cross-Embodiment Object Navigation

arxiv url: http://arxiv.org/abs/2603.06914v1
Date: Fri, 06 Mar 2026 22:20:51 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:42.019775
Title: SysNav: Multi-Level Systematic Cooperation Enables Real-World, Cross-Embodiment Object Navigation
Title (reference translation): SysNav: Multi-level system cooperation enables real-world, cross-embodiment object navigation
Authors: Haokun Zhu, Zongtai Li, Zihan Liu, Kevin Guo, Zhengzhi Lin, Yuxin Cai, Guofei Chen, Chen Lv, Wenshan Wang, Jean Oh, Ji Zhang
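Abstract summary: We introduce SysNav, a three-level ObjectNav system for real-world cross-embodiment deployment. SysNav decouples semantic reasoning, navigation planning, and motion control to ensure robustness and generalizability. The system achieves substantial improvements in both success rate and navigation efficiency.
Reference score (independently computed attention measure): 46.34939555586507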
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Object navigation (ObjectNav) in real-world environments is a complex problem that requires simultaneously addressing multiple challenges, including complex spatial structure, long-horizon planning and semantic understanding. Recent advances in Vision-Language Models (VLMs) offer promising capabilities for semantic understanding, yet effectively integrating them into real-world navigation systems remains a non-trivial challenge. In this work, we formulate real-world ObjectNav as a system-level problem and introduce SysNav, a three-level ObjectNav system designed for real-world cross-embodiment deployment. SysNav decouples semantic reasoning, navigation planning and motion control to ensure robustness and generalizability. At the high level, we summarize the environment into a structured scene representation and leverage VLMs to provide semantically grounded navigation guidance. At the mid level, we introduce a hierarchical room-based navigation strategy that reserves VLM guidance for room-level decisions, which effectively utilizes its reasoning ability while ensuring system efficiency. At the low level, planned waypoints are executed through different embodiment-specific motion control modules. We deploy our system on three embodiments, a custom-built wheeled robot, the Unitree Go2 quadruped and the Unitree G1 humanoid, and conduct 190 real-world experiments. Our system achieves substantial improvements in both success rate and navigation efficiency. To the best of our knowledge, SysNav is the first system capable of reliably and efficiently completing building-scale long-range object navigation in complex real-world environments. Furthermore, extensive experiments on four simulation benchmarks demonstrate state-of-the-art performance. The project page is available at: https://cmu-vln.github.io/.
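Abstract (reference translation): Object navigation (ObjectNav) in real-world environments is a complex problem that requires addressing several challenges at once, including complex spatial structure, long-horizon planning, and semantic understanding. Recent progress in VLMs (Vision-Language Models) offers promising capabilities for semantic understanding, but integrating them effectively into real-world navigation systems remains difficult. In this work, we formulate real-world ObjectNav as a system-level problem and introduce SysNav, a three-level ObjectNav system designed for real-world cross-embodiment deployment. SysNav decouples semantic reasoning, navigation planning, and motion control to ensure robustness and generalizability. At the high level, the environment is summarized into a structured scene representation, and VLMs are used to provide semantically grounded navigation guidance. At the mid level, we introduce a hierarchical room-based navigation strategy that reserves VLM guidance for room-level decisions, making effective use of its reasoning ability while keeping the system efficient. At the low level, planned waypoints are executed through different embodiment-specific motion control modules. We deployed the system on three embodiments (a custom-built wheeled robot, the Unitree Go2 quadruped, and the Unitree G1 humanoid) and conducted 190 real-world experiments. The system achieves substantial improvements in both success rate and navigation efficiency. To the best of our knowledge, SysNav is the first system that can reliably and efficiently complete building-scale, long-range object navigation in complex real-world environments. In addition, extensive experiments on four simulation benchmarks demonstrate state-of-the-art performance. The project page is available at https://cmu-vln.github.io/.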
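
The three-level decoupling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`Room`, `keyword_selector`, `WheeledController`, `LeggedController`, `navigate`) is hypothetical, the VLM is replaced by a toy keyword lookup, and motion control is reduced to returning a log string. The sketch only shows the separation of concerns: a room-level selector is consulted once, waypoints inside the chosen room are followed without further high-level queries, and the controller is swapped per embodiment.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol, Tuple

Waypoint = Tuple[float, float]


@dataclass
class Room:
    """Hypothetical structured scene summary for one room."""
    name: str
    objects: List[str]         # objects observed in the room so far
    waypoints: List[Waypoint]  # candidate waypoints inside the room


# High level: any callable mapping (goal, rooms) to a room name.
# In a real system this slot would be filled by a VLM query (assumption).
RoomSelector = Callable[[str, List[Room]], str]


def keyword_selector(goal: str, rooms: List[Room]) -> str:
    """Toy stand-in for the VLM: pick a room that already lists the goal."""
    for room in rooms:
        if goal in room.objects:
            return room.name
    return rooms[0].name  # fall back to the first room


class MotionController(Protocol):
    """Low level: embodiment-specific execution of a single waypoint."""
    def go_to(self, waypoint: Waypoint) -> str: ...


class WheeledController:
    def go_to(self, waypoint: Waypoint) -> str:
        return f"wheeled: drive to {waypoint}"


class LeggedController:
    def go_to(self, waypoint: Waypoint) -> str:
        return f"legged: walk to {waypoint}"


def navigate(goal: str, rooms: List[Room], select_room: RoomSelector,
             controller: MotionController) -> List[str]:
    """Mid level: one room-level decision, then waypoint following."""
    target = select_room(goal, rooms)  # single high-level query
    room = next(r for r in rooms if r.name == target)
    # Waypoints inside the room are executed without further high-level calls.
    return [controller.go_to(wp) for wp in room.waypoints]


rooms = [
    Room("hallway", ["door"], [(0.0, 0.0)]),
    Room("kitchen", ["mug", "sink"], [(1.0, 2.0), (1.5, 2.5)]),
]
wheeled_log = navigate("mug", rooms, keyword_selector, WheeledController())
legged_log = navigate("mug", rooms, keyword_selector, LeggedController())
```

Because `navigate` depends only on the `RoomSelector` and `MotionController` interfaces, changing the robot or the reasoning component does not touch the planning code, which is the kind of decoupling the abstract credits for robustness and generalizability.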

Related papers