Fugu-MT Paper Translation (Abstract): RobotPan: A 360$^\circ$ Surround-View Robotic Vision System for Embodied Perception

Paper abstract: RobotPan: A 360$^\circ$ Surround-View Robotic Vision System for Embodied Perception

arxiv url: http://arxiv.org/abs/2604.13476v1
Date: Wed, 15 Apr 2026 04:58:23 GMT
Status: Translation complete
Last updated in system: 2026-04-16 20:38:32.388915
Title: RobotPan: A 360$^\circ$ Surround-View Robotic Vision System for Embodied Perception
Title (reference translation): RobotPan: A 360$^\circ$ Surround-View Robotic Vision System for Embodied Perception
Authors: Jiahao Ma, Qiang Zhang, Peiran Liu, Zeran Su, Pihai Sun, Gang Han, Wen Zhao, Wei Cui, Zhang Zhang, Zhiyuan Xu, Renjing Xu, Jian Tang, Miaomiao Liu, Yijie Guo
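Abstract summary: We introduce a surround-view robotic vision system that combines six cameras with LiDAR to provide 360$^\circ$ visual coverage. We also present \textsc{RobotPan}, a feed-forward framework that predicts \emph{metric-scaled} and \emph{compact} 3D Gaussians from calibrated sparse-view inputs. Experiments show that \textsc{RobotPan} achieves quality competitive with prior feed-forward reconstruction and view-synthesis methods.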
Reference score (independently computed attention score): 47.76543396190029
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Surround-view perception is increasingly important for robotic navigation and loco-manipulation, especially in human-in-the-loop settings such as teleoperation, data collection, and emergency takeover. However, current robotic visual interfaces are often limited to narrow forward-facing views, or, when multiple on-board cameras are available, require cumbersome manual switching that interrupts the operator's workflow. Both configurations suffer from motion-induced jitter that causes simulator sickness in head-mounted displays. We introduce a surround-view robotic vision system that combines six cameras with LiDAR to provide full 360$^\circ$ visual coverage, while meeting the geometric and real-time constraints of embodied deployment. We further present \textsc{RobotPan}, a feed-forward framework that predicts \emph{metric-scaled} and \emph{compact} 3D Gaussians from calibrated sparse-view inputs for real-time rendering, reconstruction, and streaming. \textsc{RobotPan} lifts multi-view features into a unified spherical coordinate representation and decodes Gaussians using hierarchical spherical voxel priors, allocating fine resolution near the robot and coarser resolution at larger radii to reduce computational redundancy without sacrificing fidelity. To support long sequences, our online fusion updates dynamic content while preventing unbounded growth in static regions by selectively updating appearance. Finally, we release a multi-sensor dataset tailored to 360$^\circ$ novel view synthesis and metric 3D reconstruction for robotics, covering navigation, manipulation, and locomotion on real platforms. Experiments show that \textsc{RobotPan} achieves competitive quality against prior feed-forward reconstruction and view-synthesis methods while producing substantially fewer Gaussians, enabling practical real-time embodied deployment. Project website: https://robotpan.github.io/
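Abstract (reference translation): In robotic navigation and loco-manipulation, surround-view perception is becoming increasingly important, particularly in human-in-the-loop settings such as teleoperation, data collection, and emergency takeover. Current robotic visual interfaces, however, are often restricted to narrow forward-facing views or, when multiple on-board cameras are available, require cumbersome manual switching that interrupts the operator's workflow. Both configurations suffer from motion-induced jitter, which causes simulator sickness in head-mounted displays. This work proposes a surround-view robotic vision system that combines six cameras with LiDAR to provide full 360$^\circ$ visual coverage while meeting the geometric and real-time constraints of embodied deployment. It further presents \textsc{RobotPan}, a feed-forward framework that predicts \emph{metric-scaled} and \emph{compact} 3D Gaussians from calibrated sparse-view inputs for real-time rendering, reconstruction, and streaming. \textsc{RobotPan} lifts multi-view features into a unified spherical coordinate representation and decodes Gaussians using hierarchical spherical voxel priors, assigning fine resolution near the robot and coarser resolution at larger radii, which reduces computational redundancy without sacrificing fidelity. To support long sequences, the online fusion updates dynamic content while preventing unbounded growth in static regions by selectively updating appearance. Finally, the authors release a multi-sensor dataset tailored to 360$^\circ$ novel view synthesis and metric 3D reconstruction for robotics, covering navigation, manipulation, and locomotion on real platforms. Experiments show that \textsc{RobotPan} achieves quality competitive with prior feed-forward reconstruction and view-synthesis methods while producing substantially fewer Gaussians, enabling practical real-time embodied deployment. Project website: https://robotpan.github.io/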
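
The abstract's idea of a spherical voxel layout with fine resolution near the robot and coarser resolution at larger radii can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only and is not taken from the paper: the log-spaced radial bins, the uniform angular bins, the default ranges and bin counts, and the function names `cartesian_to_spherical`, `radial_bin_edges`, and `spherical_voxel_index`.

```python
import math


def cartesian_to_spherical(x, y, z):
    """Convert a robot-frame point to (radius, azimuth, elevation) in radians."""
    r = math.sqrt(x * x + y * y + z * z)
    azimuth = math.atan2(y, x)  # range [-pi, pi]
    # Clamp the ratio to guard against floating-point overshoot.
    elevation = math.asin(max(-1.0, min(1.0, z / r))) if r > 0 else 0.0
    return r, azimuth, elevation


def radial_bin_edges(r_min, r_max, n_r):
    """Log-spaced radial edges: thin shells near r_min, thick shells near r_max."""
    ratio = r_max / r_min
    return [r_min * ratio ** (k / n_r) for k in range(n_r + 1)]


def spherical_voxel_index(x, y, z, r_min=0.5, r_max=50.0,
                          n_r=32, n_az=64, n_el=32):
    """Map a point to (radial, azimuth, elevation) bin indices.

    Assumed layout: log-spaced radial bins and uniform angular bins.
    Points outside [r_min, r_max] are clamped into the nearest shell.
    """
    r, az, el = cartesian_to_spherical(x, y, z)
    r = min(max(r, r_min), r_max)
    t = math.log(r / r_min) / math.log(r_max / r_min)  # normalized to [0, 1]
    i_r = min(int(t * n_r), n_r - 1)
    i_az = min(int((az + math.pi) / (2 * math.pi) * n_az), n_az - 1)
    i_el = min(int((el + math.pi / 2) / math.pi * n_el), n_el - 1)
    return i_r, i_az, i_el


if __name__ == "__main__":
    edges = radial_bin_edges(0.5, 50.0, 4)
    widths = [b - a for a, b in zip(edges, edges[1:])]
    print("shell widths (m):", [round(w, 2) for w in widths])
    print("near point:", spherical_voxel_index(0.6, 0.0, 0.1))
    print("far point:", spherical_voxel_index(30.0, 5.0, 2.0))
```

With log spacing, each radial shell is thicker than the one inside it, so a fixed bin budget spends most of its resolution close to the robot. A hierarchical scheme as described in the abstract would presumably refine this further, but the details are not given in this summary.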

Related papers