Fugu-MT paper translation (abstract): Intrinsic 4D Gaussian Segmentation from Scene Cues

Paper overview: Intrinsic 4D Gaussian Segmentation from Scene Cues

arxiv url: http://arxiv.org/abs/2606.18623v1
Date: Wed, 17 Jun 2026 02:40:35 GMT
Status: translation complete
Last updated in system: 2026-06-18 17:16:50.974405
Title: Intrinsic 4D Gaussian Segmentation from Scene Cues
Authors: Hasan Yazar, Mohamed Rayan Barhdadi, Erchin Serpedin, Mehmet Tuncel, Hasan Kurban
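Abstract summary: Intrinsic-GS is a training-free, mask-free method that builds a sparse affinity graph over Gaussian primitives. The graph is partitioned with Leiden community detection, requiring no foundation model and no learned feature field. On the standard 4D Gaussian segmentation benchmarks, Neu3D and HyperNeRF, Intrinsic-GS recovers substantial object structure without mask supervision.
Reference score (attention score computed independently by this site): 2.2354043586480414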
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Dynamic 4D Gaussian Splatting reconstructs deforming scenes with high fidelity and is increasingly adopted as a representation for dynamic 3D scenes. Putting such a scene to use, for editing, manipulation or motion analysis, first requires segmenting it: grouping the Gaussian primitives into coherent objects. Current pipelines obtain this grouping by importing 2D masks from foundation models such as SAM and lifting or distilling them into the Gaussian representation. In dynamic scenes these masks must be generated across many frames and views, which is costly, and the resulting segmentation can depend strongly on the quality and consistency of those external masks. We ask how much object-level structure can instead be recovered from the Gaussians themselves, and propose Intrinsic-GS, a training-free, mask-free method that builds a sparse affinity graph over Gaussian primitives from appearance, orientation, scale, deformation-trajectory and non-learned rendered-boundary cues. The graph is partitioned with Leiden community detection, requiring no foundation model and no learned feature field. On the standard 4D Gaussian segmentation benchmarks, Neu3D and HyperNeRF, Intrinsic-GS recovers substantial object structure without mask supervision, reaching 0.746 mIoU on Neu3D and 0.575 on HyperNeRF; on Neu3D, a geometry-only variant reaches 0.902 mIoU, matching SAM-supervised TRASE. On HyperNeRF, Intrinsic-GS runs 12.5x faster than the mask-generation and feature-rendering stages used by mask-supervised pipelines. These results suggest that much of the segmentation signal is already encoded in the Gaussians themselves, offering a fast, mask-free direction for 3D and 4D Gaussian segmentation that may also point toward more generalizable, robust segmentation in settings where external masks are unreliable or expensive.
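The abstract describes building a sparse affinity graph over Gaussian primitives and partitioning it. The following is a minimal sketch of that idea, not the authors' implementation. Everything in it is an assumption made for illustration: the per-Gaussian fields (`pos`, `color`, `motion`), the kernel widths, the use of spatial k-nearest neighbours to keep the graph sparse, and the toy data. It uses only a subset of the cues named in the abstract, and it swaps Leiden community detection for plain connected components so that the example stays dependency-free.

```python
import math

def affinity(a, b, sigma_pos=1.0, sigma_col=0.5, sigma_motion=0.5):
    """Hypothetical affinity: product of Gaussian kernels over the
    position, color, and motion differences of two primitives."""
    d_pos = math.dist(a["pos"], b["pos"])
    d_col = math.dist(a["color"], b["color"])
    d_mot = math.dist(a["motion"], b["motion"])
    return math.exp(-(d_pos / sigma_pos) ** 2
                    - (d_col / sigma_col) ** 2
                    - (d_mot / sigma_motion) ** 2)

def build_sparse_graph(gaussians, k=2, min_affinity=0.1):
    """For each primitive, consider its k nearest spatial neighbours and
    keep an undirected edge only if the affinity clears the threshold."""
    edges = {}
    for i, g in enumerate(gaussians):
        neighbours = sorted(
            (j for j in range(len(gaussians)) if j != i),
            key=lambda j: math.dist(g["pos"], gaussians[j]["pos"]),
        )[:k]
        for j in neighbours:
            w = affinity(g, gaussians[j])
            if w >= min_affinity:
                edges[(min(i, j), max(i, j))] = w
    return edges

def partition(n, edges):
    """Stand-in for Leiden: label connected components with union-find.
    A real pipeline would run community detection on the weighted graph."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

# Toy scene (made up): a static red object next to a moving blue object.
def make(x, color, motion):
    return {"pos": (x, 0.0, 0.0), "color": color, "motion": motion}

gaussians = (
    [make(x, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)) for x in (0.0, 0.1, 0.2)]
    + [make(x, (0.0, 0.0, 1.0), (0.5, 0.0, 0.0)) for x in (0.3, 0.4, 0.5)]
)
edges = build_sparse_graph(gaussians)
labels = partition(len(gaussians), edges)
print(labels)
```

In this toy scene the two objects are spatially adjacent, so the neighbour search proposes an edge across the boundary, but the color and motion terms push its affinity below the threshold and the two groups stay separate. Connected components would merge objects as soon as a single cross-object edge survives, which is one reason a modularity-style method such as Leiden is a better fit for real graphs.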

Related papers