Fugu-MT Paper Translation (Abstract): S2GS: Streaming Semantic Gaussian Splatting for Online Scene Understanding and Reconstruction

Paper abstract: S2GS: Streaming Semantic Gaussian Splatting for Online Scene Understanding and Reconstruction

arxiv url: http://arxiv.org/abs/2603.14232v1
Date: Sun, 15 Mar 2026 05:48:55 GMT
Status: translation complete
Last updated in system: 2026-03-17 16:19:35.688794
Title: S2GS: Streaming Semantic Gaussian Splatting for Online Scene Understanding and Reconstruction
Title (reference translation): S2GS: Streaming Semantic Gaussian Splatting for Online Scene Understanding and Reconstruction
Authors: Renhe Zhang, Yuyang Tan, Jingyu Gong, Zhizhong Zhang, Lizhuang Ma, Yuan Xie, Xin Tan
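Abstract summary: Streaming Semantic Gaussian Splatting (S2GS) is a strictly causal, incremental 3D Gaussian semantic field framework. It continuously updates scene geometry, appearance, and instance-level semantics without using future frames or reprocessing historical frames. S2GS matches or outperforms strong offline baselines on joint reconstruction-and-understanding benchmarks.
Reference score (independently computed attention score): 57.07346645250984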
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing offline feed-forward methods for joint scene understanding and reconstruction on long image streams often repeatedly perform global computation over an ever-growing set of past observations, causing runtime and GPU memory to increase rapidly with sequence length and limiting scalability. We propose Streaming Semantic Gaussian Splatting (S2GS), a strictly causal, incremental 3D Gaussian semantic field framework: it does not leverage future frames and continuously updates scene geometry, appearance, and instance-level semantics without reprocessing historical frames, enabling scalable online joint reconstruction and understanding. S2GS adopts a geometry-semantic decoupled dual-backbone design: the geometry branch performs causal modeling to drive incremental Gaussian updates, while the semantic branch leverages a 2D foundation vision model and a query-driven decoder to predict segmentation masks and identity embeddings, further stabilized by query-level contrastive alignment and lightweight online association with an instance memory. Experiments show that S2GS matches or outperforms strong offline baselines on joint reconstruction-and-understanding benchmarks, while significantly improving long-horizon scalability: it processes 1,000+ frames with much slower growth in runtime and GPU memory, whereas offline global-processing baselines typically run out of memory at around 80 frames under the same setting.
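Abstract (reference translation): Existing offline feed-forward methods for joint scene understanding and reconstruction on long image streams often repeat global computation over an ever-growing set of past observations, so runtime and GPU memory grow rapidly with sequence length, which limits scalability. Streaming Semantic Gaussian Splatting (S2GS) is a strictly causal, incremental 3D Gaussian semantic field framework: it does not use future frames and continuously updates scene geometry, appearance, and instance-level semantics without reprocessing historical frames, enabling scalable online joint reconstruction and understanding. S2GS adopts a geometry-semantic decoupled dual-backbone design. The geometry branch performs causal modeling to drive incremental Gaussian updates, while the semantic branch uses a 2D foundation vision model and a query-driven decoder to predict segmentation masks and identity embeddings, stabilized by query-level contrastive alignment and lightweight online association with an instance memory. Experiments show that S2GS matches or outperforms strong offline baselines on joint reconstruction-and-understanding benchmarks while significantly improving long-horizon scalability: it processes more than 1,000 frames with much slower growth in runtime and GPU memory, whereas offline global-processing baselines typically run out of memory at around 80 frames under the same setting.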
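The abstract mentions "lightweight online association with an instance memory" but gives no details. The following is only an illustrative sketch of what such a step could look like, not the authors' method: it assumes each query yields an identity embedding, matches it greedily by cosine similarity against one stored embedding per instance, and updates the match with an exponential moving average. The class name `InstanceMemory`, the `threshold` and `momentum` parameters, and the greedy matching rule are all assumptions introduced here for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


class InstanceMemory:
    """Hypothetical instance memory: one running-mean embedding per instance ID."""

    def __init__(self, threshold=0.7, momentum=0.9):
        self.threshold = threshold  # assumed similarity cutoff for a match
        self.momentum = momentum    # assumed EMA weight on the stored embedding
        self.prototypes = {}        # instance_id -> stored embedding
        self._next_id = 0

    def associate(self, embeddings):
        """Assign an instance ID to each identity embedding of the current frame.

        Only the memory built from earlier frames is consulted (causal),
        and earlier frames are never revisited.
        """
        ids = []
        used = set()  # each stored instance matches at most one query per frame
        for emb in embeddings:
            best_id, best_sim = None, self.threshold
            for inst_id, proto in self.prototypes.items():
                if inst_id in used:
                    continue
                sim = cosine(emb, proto)
                if sim > best_sim:
                    best_id, best_sim = inst_id, sim
            if best_id is None:
                # No stored instance is similar enough: register a new one.
                best_id = self._next_id
                self._next_id += 1
                self.prototypes[best_id] = list(emb)
            else:
                # Matched: blend the new embedding into the stored one.
                m = self.momentum
                self.prototypes[best_id] = [
                    m * p + (1 - m) * e
                    for p, e in zip(self.prototypes[best_id], emb)
                ]
            used.add(best_id)
            ids.append(best_id)
        return ids
```

Calling `associate` once per incoming frame keeps the per-frame cost proportional to the number of stored instances rather than the number of past frames, which is the kind of scaling behavior the abstract attributes to the streaming design.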

Related papers