Fugu-MT Paper Translation (Abstract): Fast-SegSim: Real-Time Open-Vocabulary Segmentation for Robotics in Simulation

Paper overview: Fast-SegSim: Real-Time Open-Vocabulary Segmentation for Robotics in Simulation

arxiv url: http://arxiv.org/abs/2604.10951v1
Date: Mon, 13 Apr 2026 03:49:06 GMT
Status: Translation complete
Last updated in system: 2026-04-14 20:13:16.297049
Title: Fast-SegSim: Real-Time Open-Vocabulary Segmentation for Robotics in Simulation
Title (reference translation): Fast-SegSim: Real-Time Open-Vocabulary Segmentation for Robots in Simulation
Authors: Xuan Yu, Yuxuan Xie, Shichao Zhai, Shuhao Ye, Rong Xiong, Yue Wang
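Abstract summary: Fast-SegSim is a novel, simple, end-to-end framework built upon 2D Gaussian Splatting. Its core contribution is a highly optimized rendering pipeline that specifically addresses the computational bottleneck of high-channel segmentation. Fast-SegSim provides critical value in robotic applications: its 3D-consistent outputs provide essential multi-view 'ground truth' labels.
Reference score (independently computed attention score): 23.703731324592656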
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Open-vocabulary panoptic reconstruction is crucial for advanced robotics and simulation. However, existing 3D reconstruction methods, such as NeRF or Gaussian Splatting variants, often struggle to achieve the real-time inference frequency required by robotic control loops. Existing methods incur prohibitive latency when processing the high-dimensional features required for robust open-vocabulary segmentation. We propose Fast-SegSim, a novel, simple, and end-to-end framework built upon 2D Gaussian Splatting, designed to realize real-time, high-fidelity, and 3D-consistent open-vocabulary segmentation reconstruction. Our core contribution is a highly optimized rendering pipeline that specifically addresses the computational bottleneck of high-channel segmentation feature accumulation. We introduce two key optimizations: Precise Tile Intersection to reduce rasterization redundancy, and a novel Top-K Hard Selection strategy. This strategy leverages the geometric sparsity inherent in the 2D Gaussian representation to greatly simplify feature accumulation and alleviate bandwidth limitations, achieving render rates exceeding 40 FPS. Fast-SegSim provides critical value in robotic applications: it serves both as a high-frequency sensor input for simulation platforms like Gazebo, and its 3D-consistent outputs provide essential multi-view 'ground truth' labels for fine-tuning downstream perception tasks. We demonstrate this utility by using the generated labels to fine-tune the perception module in object goal navigation, successfully doubling the navigation success rate. Our superior rendering speed and practical utility underscore Fast-SegSim's potential to bridge the sim-to-real gap.
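Abstract (reference translation): Open-vocabulary panoptic reconstruction is essential for advanced robotics and simulation. However, existing 3D reconstruction methods such as NeRF and Gaussian Splatting often struggle to reach the real-time inference frequency required by robot control loops. Existing methods incur prohibitive latency when processing the high-dimensional features needed for robust open-vocabulary segmentation. We propose Fast-SegSim, a novel, simple, end-to-end framework built on 2D Gaussian Splatting that achieves real-time, high-fidelity, 3D-consistent open-vocabulary segmentation reconstruction. Our core contribution is a highly optimized rendering pipeline that specifically addresses the computational bottleneck of high-channel segmentation feature accumulation. We introduce Precise Tile Intersection to reduce rasterization redundancy, and a novel Top-K Hard Selection strategy. This strategy exploits the geometric sparsity inherent in the 2D Gaussian representation to greatly simplify feature accumulation and ease bandwidth limitations, achieving render rates above 40 FPS. Fast-SegSim serves as a high-frequency sensor input for simulation platforms such as Gazebo, and its 3D-consistent outputs provide the multi-view 'ground truth' labels essential for fine-tuning downstream perception tasks. We demonstrate this utility by using the generated labels to fine-tune the perception module in object goal navigation, doubling the navigation success rate. Our superior rendering speed and practical utility underscore Fast-SegSim's potential to bridge the sim-to-real gap.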
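
The abstract names a Top-K Hard Selection strategy for high-channel feature accumulation but does not spell out the selection rule. The sketch below is one plausible reading, not the authors' implementation: it assumes that, for each pixel, only the k Gaussians with the largest front-to-back blending weights contribute their feature vectors. The function names (`blend_weights`, `accumulate_full`, `accumulate_top_k`) and the choice to skip renormalisation are hypothetical, introduced only for illustration.

```python
# Illustrative sketch only. The selection rule (largest blending weights),
# the lack of renormalisation, and all names are assumptions, not the paper's code.

def blend_weights(alphas):
    """Front-to-back compositing weights: w_i = alpha_i * prod_{j<i} (1 - alpha_j)."""
    weights = []
    transmittance = 1.0
    for alpha in alphas:
        weights.append(alpha * transmittance)
        transmittance *= 1.0 - alpha
    return weights


def accumulate_full(alphas, features):
    """Blend the feature vector of every Gaussian covering the pixel."""
    weights = blend_weights(alphas)
    out = [0.0] * len(features[0])
    for w, feat in zip(weights, features):
        for c, value in enumerate(feat):
            out[c] += w * value
    return out


def accumulate_top_k(alphas, features, k):
    """Blend only the k Gaussians with the largest blending weights."""
    weights = blend_weights(alphas)
    # Indices of the k largest weights (assumed selection criterion).
    top = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]
    out = [0.0] * len(features[0])
    for i in top:
        for c, value in enumerate(features[i]):
            out[c] += weights[i] * value
    return out


if __name__ == "__main__":
    # Three Gaussians along one ray, sorted front to back, with 2-channel features.
    alphas = [0.9, 0.5, 0.2]
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    print(accumulate_full(alphas, features))
    print(accumulate_top_k(alphas, features, k=1))
```

When surfaces are nearly opaque, most of the blending weight falls on a few Gaussians, so under these assumptions the top-k result stays close to the full blend while reading far fewer high-channel feature vectors per pixel.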

Related papers