Fugu-MT paper translation (abstract): FreeOcc: Training-free Panoptic Occupancy Prediction via Foundation Models

Paper summary: FreeOcc: Training-free Panoptic Occupancy Prediction via Foundation Models

arxiv url: http://arxiv.org/abs/2603.06166v1
Date: Fri, 06 Mar 2026 11:19:33 GMT
Status: translation complete
Last updated in system: 2026-03-09 13:17:45.531467
Title: FreeOcc: Training-free Panoptic Occupancy Prediction via Foundation Models
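Title (reference translation): FreeOcc: Training-free Panoptic Occupancy Prediction via Foundation Models
Authors: Andrew Caunes, Thierry Chateau, Vincent Fremont
Abstract summary: FreeOcc is a training-free pipeline that recovers semantics and geometry from multi-view images. On Occ3D-nuScenes, FreeOcc achieves 16.9 mIoU and 16.5 RayIoU, on par with state-of-the-art weakly supervised methods.
Reference score (independently computed attention score): 0.9345376836714131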
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Semantic and panoptic occupancy prediction for road scene analysis provides a dense 3D representation of the ego vehicle's surroundings. Current camera-only approaches typically rely on costly dense 3D supervision or require training models on data from the target domain, limiting deployment in unseen environments. We propose FreeOcc, a training-free pipeline that leverages pretrained foundation models to recover both semantics and geometry from multi-view images. FreeOcc extracts per-view panoptic priors with a promptable foundation segmentation model and prompt-to-taxonomy rules, and reconstructs metric 3D points with a reconstruction foundation model. Depth- and confidence-aware filtering lifts reliable labels into 3D, which are fused over time and voxelized with a deterministic refinement stack. For panoptic occupancy, instances are recovered by fitting and merging robust current-view 3D box candidates, enabling instance-aware occupancy without any learned 3D model. On Occ3D-nuScenes, FreeOcc achieves 16.9 mIoU and 16.5 RayIoU train-free, on par with state-of-the-art weakly supervised methods. When employed as a pseudo-label generation pipeline for training downstream models, it achieves 21.1 RayIoU, surpassing the previous state-of-the-art weakly supervised baseline. Furthermore, FreeOcc sets new baselines for both train-free and weakly supervised panoptic occupancy prediction, achieving 3.1 RayPQ and 3.9 RayPQ, respectively. These results highlight foundation-model-driven perception as a practical route to training-free 3D scene understanding.
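Abstract (reference translation): Semantic and panoptic occupancy prediction for road scene analysis provides a dense 3D representation of the surroundings of the ego vehicle. Current camera-only approaches usually either depend on costly dense 3D supervision or require models trained on target-domain data, which restricts deployment in unseen environments. We propose FreeOcc, a training-free pipeline that uses pretrained foundation models to recover both semantics and geometry from multi-view images. FreeOcc extracts per-view panoptic priors with a promptable foundation segmentation model and prompt-to-taxonomy rules, and reconstructs metric 3D points with a reconstruction foundation model. Depth- and confidence-aware filtering lifts reliable labels into 3D, where they are fused over time and voxelized with a deterministic refinement stack. For panoptic occupancy, instances are recovered by fitting and merging robust current-view 3D box candidates, which enables instance-aware occupancy without any learned 3D model. On Occ3D-nuScenes, FreeOcc reaches 16.9 mIoU and 16.5 RayIoU without training, on par with state-of-the-art weakly supervised methods. Used as a pseudo-label generation pipeline for training downstream models, it reaches 21.1 RayIoU, surpassing the previous state-of-the-art weakly supervised baseline. FreeOcc also sets new baselines for both training-free and weakly supervised panoptic occupancy prediction, with 3.1 RayPQ and 3.9 RayPQ respectively. These results highlight foundation-model-driven perception as a practical route to training-free 3D scene understanding.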
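
Illustrative sketch (not from the paper): the abstract describes depth- and confidence-aware filtering that lifts labels into 3D, followed by fusion and voxelization. The Python below shows one plausible reading of that step under stated assumptions. The function name `lift_and_voxelize`, every threshold, the grid parameters, and the per-voxel majority vote are hypothetical choices made for this example; the paper's actual filtering rules and its deterministic refinement stack are not specified in the abstract and are not reproduced here.

```python
import numpy as np


def lift_and_voxelize(points, labels, depth, conf, *,
                      max_depth=40.0, min_conf=0.5,
                      voxel_size=0.4, grid_min=(-40.0, -40.0, -1.0),
                      grid_shape=(200, 200, 16), num_classes=17,
                      free_label=255):
    """Filter labeled 3D points, then assign each voxel its most frequent label.

    All defaults are illustrative assumptions, not values taken from the paper.
    points: (N, 3) metric coordinates, labels: (N,) ints in [0, num_classes),
    depth: (N,) per-point depth, conf: (N,) per-point confidence in [0, 1].
    """
    points = np.asarray(points, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)

    # Depth- and confidence-aware filtering: keep near, confident points only.
    keep = (np.asarray(depth) <= max_depth) & (np.asarray(conf) >= min_conf)
    pts, lab = points[keep], labels[keep]

    # Convert metric coordinates to integer voxel indices.
    idx = np.floor((pts - np.asarray(grid_min)) / voxel_size).astype(np.int64)
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    idx, lab = idx[inside], lab[inside]

    # Accumulate one vote per point in a (X, Y, Z, class) histogram.
    votes = np.zeros((*grid_shape, num_classes), dtype=np.int32)
    np.add.at(votes, (idx[:, 0], idx[:, 1], idx[:, 2], lab), 1)

    # Voxels with no votes stay marked as free; others take the majority label.
    grid = np.full(grid_shape, free_label, dtype=np.uint8)
    occupied = votes.sum(axis=-1) > 0
    grid[occupied] = votes.argmax(axis=-1)[occupied]
    return grid
```

Under the same assumptions, temporal fusion could be approximated by concatenating the points of several frames, once they are expressed in a common ego-aligned coordinate frame, before calling the function.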


Related papers