Fugu-MT Paper Translation (Abstract): FreeOcc: Training-Free Embodied Open-Vocabulary Occupancy Prediction

Paper Abstract: FreeOcc: Training-Free Embodied Open-Vocabulary Occupancy Prediction

arxiv url: http://arxiv.org/abs/2604.28115v1
Date: Thu, 30 Apr 2026 17:05:56 GMT
Status: Translation complete
System update date: 2026-05-01 16:31:54.211191
Title: FreeOcc: Training-Free Embodied Open-Vocabulary Occupancy Prediction
Title (reference translation): FreeOcc: Training-Free Embodied Open-Vocabulary Occupancy Prediction
Authors: Zeyu Jiang, Changqing Zhou, Xingxing Zuo, Changhao Chen
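Abstract summary: FreeOcc is a training-free framework for open-vocabulary occupancy prediction from monocular or RGB-D sequences. FreeOcc operates without 3D annotations. FreeOcc achieves over $2\times$ improvements in IoU and mIoU on EmbodiedOcc-ScanNet.
Reference score (independently computed attention score): 11.503430067699442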
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing learning-based occupancy prediction methods rely on large-scale 3D annotations and generalize poorly across environments. We present FreeOcc, a training-free framework for open-vocabulary occupancy prediction from monocular or RGB-D sequences. Unlike prior approaches that require voxel-level supervision and ground-truth camera poses, FreeOcc operates without 3D annotations, pose ground truth, or any learning stage. FreeOcc incrementally builds a globally consistent occupancy map via a four-layer pipeline: a SLAM backbone estimates poses and sparse geometry; a geometrically consistent Gaussian update constructs dense 3D Gaussian maps; open-vocabulary semantics from off-the-shelf vision-language models are associated with Gaussian primitives; and a probabilistic Gaussian-to-occupancy projection produces dense voxel occupancy. Despite being entirely training-free and pose-agnostic, FreeOcc achieves over $2\times$ improvements in IoU and mIoU on EmbodiedOcc-ScanNet compared to prior self-supervised methods. We further introduce ReplicaOcc, a benchmark for indoor open-vocabulary occupancy prediction, and show that FreeOcc transfers zero-shot to novel environments, substantially outperforming both supervised and self-supervised baselines. Project page: https://the-masses.github.io/freeocc-web/.
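Abstract (reference translation): Existing learning-based occupancy prediction methods depend on large-scale 3D annotations and generalize poorly across environments. We propose FreeOcc, a training-free framework for open-vocabulary occupancy prediction from monocular or RGB-D sequences. Unlike prior approaches that require voxel-level supervision and ground-truth camera poses, FreeOcc operates without 3D annotations, ground-truth poses, or any learning stage. It incrementally builds a globally consistent occupancy map through a four-layer pipeline: a SLAM backbone estimates poses and sparse geometry; a geometrically consistent Gaussian update constructs dense 3D Gaussian maps; open-vocabulary semantics from off-the-shelf vision-language models are associated with the Gaussian primitives; and a probabilistic Gaussian-to-occupancy projection produces dense voxel occupancy. Although it is entirely training-free and pose-agnostic, FreeOcc achieves over $2\times$ improvements in IoU and mIoU on EmbodiedOcc-ScanNet compared to prior self-supervised methods. The authors further introduce ReplicaOcc, a benchmark for indoor open-vocabulary occupancy prediction, and show that FreeOcc transfers zero-shot to novel environments, substantially outperforming both supervised and self-supervised baselines. Project page: https://the-masses.github.io/freeocc-web/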
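
Illustrative sketch (not from the paper): the abstract names a probabilistic Gaussian-to-occupancy projection and reports IoU, but gives no formulas. The Python below shows one way such a projection could look, under assumptions that are ours and not confirmed by the source: isotropic Gaussians described by a mean, a standard deviation, and an opacity; a noisy-OR combination of per-Gaussian hit probabilities; and a fixed 0.5 threshold. The function names `occupancy_probability`, `voxelize`, and `iou` are hypothetical and are not FreeOcc's API.

```python
import math
from itertools import product


def occupancy_probability(center, gaussians):
    """Noisy-OR occupancy at `center` (assumed rule, not taken from the paper).

    Each Gaussian is a (mean, sigma, opacity) tuple with an isotropic sigma.
    """
    p_free = 1.0
    for mean, sigma, opacity in gaussians:
        sq_dist = sum((c - m) ** 2 for c, m in zip(center, mean))
        p_hit = opacity * math.exp(-0.5 * sq_dist / sigma ** 2)
        p_free *= 1.0 - p_hit  # voxel stays free only if every Gaussian misses
    return 1.0 - p_free


def voxelize(gaussians, origin, voxel_size, shape, threshold=0.5):
    """Return the set of voxel indices whose occupancy probability >= threshold."""
    occupied = set()
    for idx in product(*(range(n) for n in shape)):
        # Evaluate the probability at the voxel center.
        center = tuple(o + (i + 0.5) * voxel_size for o, i in zip(origin, idx))
        if occupancy_probability(center, gaussians) >= threshold:
            occupied.add(idx)
    return occupied


def iou(pred, gt):
    """Intersection over union of two sets of occupied voxel indices."""
    union = pred | gt
    return len(pred & gt) / len(union) if union else 1.0


# Toy scene: one Gaussian centered in voxel (0, 0, 0) of a 2x2x2 grid.
toy_gaussians = [((0.5, 0.5, 0.5), 0.2, 0.9)]
pred = voxelize(toy_gaussians, origin=(0.0, 0.0, 0.0), voxel_size=1.0, shape=(2, 2, 2))
gt = {(0, 0, 0), (1, 0, 0)}
print(pred)           # {(0, 0, 0)}
print(iou(pred, gt))  # 0.5
```

mIoU would average this per-class IoU over semantic classes. A real system would likely use anisotropic covariances and a vectorized or GPU implementation; the pure-Python loop here only keeps the toy example dependency-free.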


Related Papers