Fugu-MT Paper Translation (Summary): DivAS: Interactive 3D Segmentation of NeRFs via Depth-Weighted Voxel Aggregation

Paper summary: DivAS: Interactive 3D Segmentation of NeRFs via Depth-Weighted Voxel Aggregation

arxiv url: http://arxiv.org/abs/2601.04860v1
Date: Thu, 08 Jan 2026 11:53:04 GMT
Status: Translation complete
Last updated in system: 2026-01-09 17:01:53.188587
Title: DivAS: Interactive 3D Segmentation of NeRFs via Depth-Weighted Voxel Aggregation
Authors: Ayush Pande
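Abstract summary: Existing methods for segmenting Neural Radiance Fields (NeRFs) are often optimization-based, requiring slow per-scene training that sacrifices the zero-shot capabilities of 2D foundation models. DivAS is an optimization-free, fully interactive framework that addresses these limitations. The method operates via a fast GUI-based workflow in which 2D SAM masks, generated from user point prompts, are refined using NeRF-derived depth to improve geometric accuracy and foreground separation. The core of the contribution is a custom kernel that aggregates these refined multi-view masks into a unified 3D voxel grid.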
Reference score (independently computed attention score): 1.1458853556386799
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing methods for segmenting Neural Radiance Fields (NeRFs) are often optimization-based, requiring slow per-scene training that sacrifices the zero-shot capabilities of 2D foundation models. We introduce DivAS (Depth-interactive Voxel Aggregation Segmentation), an optimization-free, fully interactive framework that addresses these limitations. Our method operates via a fast GUI-based workflow where 2D SAM masks, generated from user point prompts, are refined using NeRF-derived depth priors to improve geometric accuracy and foreground-background separation. The core of our contribution is a custom CUDA kernel that aggregates these refined multi-view masks into a unified 3D voxel grid in under 200ms, enabling real-time visual feedback. This optimization-free design eliminates the need for per-scene training. Experiments on Mip-NeRF 360° and LLFF show that DivAS achieves segmentation quality comparable to optimization-based methods, while being 2-2.5x faster end-to-end, and up to an order of magnitude faster when excluding user prompting time.
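The abstract does not spell out how the aggregation kernel works, so the following is only a minimal CPU sketch of one plausible reading of "depth-weighted voxel aggregation", not the authors' CUDA implementation. It assumes simplified pinhole cameras that look along +z with no rotation, a Gaussian weight on the gap between a voxel's camera-space depth and the NeRF-rendered depth at the pixel it projects to, and a fixed selection threshold. The names `aggregate_masks`, `select_voxels`, `sigma`, and the `view` dictionary layout are all hypothetical.

```python
import math


def aggregate_masks(voxel_centers, views, sigma=0.5):
    """Return one depth-weighted score in [0, 1] per voxel center.

    Assumption: each view is a dict with keys "origin", "focal", "cx", "cy",
    "mask" (2D list of 0/1) and "depth" (2D list of rendered depths).
    """
    scores = []
    for x, y, z in voxel_centers:
        weighted_votes = 0.0
        views_seen = 0
        for view in views:
            ox, oy, oz = view["origin"]
            # Camera-space coordinates (camera looks along +z, no rotation).
            px, py, pz = x - ox, y - oy, z - oz
            if pz <= 0.0:
                continue  # voxel is behind this camera
            # Pinhole projection, rounded to the nearest pixel.
            u = int(math.floor(view["focal"] * px / pz + view["cx"] + 0.5))
            v = int(math.floor(view["focal"] * py / pz + view["cy"] + 0.5))
            mask = view["mask"]
            if not (0 <= v < len(mask) and 0 <= u < len(mask[0])):
                continue  # voxel projects outside the image
            views_seen += 1
            if mask[v][u]:
                # Down-weight votes from voxels far from the rendered surface.
                diff = pz - view["depth"][v][u]
                weighted_votes += math.exp(-(diff * diff) / (2.0 * sigma * sigma))
        scores.append(weighted_votes / views_seen if views_seen else 0.0)
    return scores


def select_voxels(scores, threshold=0.5):
    """Indices of voxels whose aggregated score reaches the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# One 3x3 view whose mask covers only the center pixel, with depth 2 everywhere.
view = {
    "origin": (0.0, 0.0, 0.0), "focal": 1.0, "cx": 1.0, "cy": 1.0,
    "mask": [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    "depth": [[2.0] * 3 for _ in range(3)],
}
centers = [(0.0, 0.0, 2.0), (0.0, 0.0, 4.0), (2.0, 0.0, 2.0)]
print(select_voxels(aggregate_masks(centers, [view])))  # prints [0]
```

In this toy scene only the voxel lying on the rendered surface inside the mask is kept. The voxel behind the surface on the same ray is suppressed by the depth weight, which is the kind of foreground-background separation the abstract attributes to the depth priors. A GPU version would typically run the per-voxel loop in parallel, one thread per voxel.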
