Fugu-MT Paper Translation (Abstract): $M^2$-Occ: Resilient 3D Semantic Occupancy Prediction for Autonomous Driving with Incomplete Camera Inputs

Paper abstract: $M^2$-Occ: Resilient 3D Semantic Occupancy Prediction for Autonomous Driving with Incomplete Camera Inputs

arxiv url: http://arxiv.org/abs/2603.09737v1
Date: Tue, 10 Mar 2026 14:42:56 GMT
Status: Translation complete
Last updated in system: 2026-03-11 15:25:24.40072
Title: $M^2$-Occ: Resilient 3D Semantic Occupancy Prediction for Autonomous Driving with Incomplete Camera Inputs
Title (reference translation): $M^2$-Occ: Resilient 3D Semantic Occupancy Prediction for Autonomous Driving with Incomplete Camera Inputs
Authors: Kaixin Lin, Kunyu Peng, Di Wen, Yufan Chen, Ruiping Liu, Kailun Yang
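Abstract summary: $M^2$-Occ is a framework designed to preserve geometric structure and semantic coherence when camera views are missing. The paper introduces a systematic missing-view evaluation protocol on the nuScenes-based SurroundOcc benchmark. Under the missing back-view setting, $M^2$-Occ improves IoU by 4.93%.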
Reference score (independently computed attention score): 21.277554919824958
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Semantic occupancy prediction enables dense 3D geometric and semantic understanding for autonomous driving. However, existing camera-based approaches implicitly assume complete surround-view observations, an assumption that rarely holds in real-world deployment due to occlusion, hardware malfunction, or communication failures. We study semantic occupancy prediction under incomplete multi-camera inputs and introduce $M^2$-Occ, a framework designed to preserve geometric structure and semantic coherence when views are missing. $M^2$-Occ addresses two complementary challenges. First, a Multi-view Masked Reconstruction (MMR) module leverages the spatial overlap among neighboring cameras to recover missing-view representations directly in the feature space. Second, a Feature Memory Module (FMM) introduces a learnable memory bank that stores class-level semantic prototypes. By retrieving and integrating these global priors, the FMM refines ambiguous voxel features, ensuring semantic consistency even when observational evidence is incomplete. We introduce a systematic missing-view evaluation protocol on the nuScenes-based SurroundOcc benchmark, encompassing both deterministic single-view failures and stochastic multi-view dropout scenarios. Under the safety-critical missing back-view setting, $M^2$-Occ improves the IoU by 4.93%. As the number of missing cameras increases, the robustness gap further widens; for instance, under the setting with five missing views, our method boosts the IoU by 5.01%. These gains are achieved without compromising full-view performance. The source code will be publicly released at https://github.com/qixi7up/M2-Occ.
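Abstract (reference translation): Semantic occupancy prediction enables dense 3D geometric and semantic understanding for autonomous driving. However, existing camera-based approaches implicitly assume complete surround-view observations. We study semantic occupancy prediction under incomplete multi-camera inputs and introduce $M^2$-Occ, a framework for preserving geometric structure and semantic coherence when views are missing. $M^2$-Occ addresses two complementary challenges. First, a Multi-view Masked Reconstruction (MMR) module uses the spatial overlap among neighboring cameras to recover missing-view representations in the feature space. Second, a Feature Memory Module (FMM) introduces a learnable memory bank that stores class-level semantic prototypes. By retrieving and integrating these global priors, the FMM refines ambiguous voxel features and ensures semantic consistency even when observational evidence is incomplete. We introduce a systematic missing-view evaluation protocol on the nuScenes-based SurroundOcc benchmark, covering both deterministic single-view failures and stochastic multi-view dropout scenarios. Under the missing back-view setting, $M^2$-Occ improves IoU by 4.93%. As the number of missing cameras increases, the robustness gap widens further; for example, with five missing views, the method improves IoU by 5.01%. These gains are achieved without compromising full-view performance. The source code will be released at https://github.com/qixi7up/M2-Occ.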
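The two missing-view settings named in the abstract (deterministic single-view failure and stochastic multi-view dropout) can be illustrated with a small availability-mask sketch. This is not the authors' protocol implementation: the function names `single_view_failure`, `random_view_dropout`, and `apply_mask`, and the use of a placeholder value for a missing image, are assumptions made for illustration. Only the six camera channel names come from nuScenes.

```python
import random

# The six surround-view camera channels used by nuScenes.
CAMERAS = [
    "CAM_FRONT", "CAM_FRONT_RIGHT", "CAM_BACK_RIGHT",
    "CAM_BACK", "CAM_BACK_LEFT", "CAM_FRONT_LEFT",
]


def single_view_failure(missing):
    """Deterministic setting: exactly one named camera is unavailable."""
    if missing not in CAMERAS:
        raise ValueError(f"unknown camera: {missing}")
    return {cam: cam != missing for cam in CAMERAS}


def random_view_dropout(num_missing, rng):
    """Stochastic setting: drop num_missing cameras uniformly at random."""
    if not 0 <= num_missing < len(CAMERAS):
        raise ValueError("at least one camera must stay available")
    dropped = set(rng.sample(CAMERAS, num_missing))
    return {cam: cam not in dropped for cam in CAMERAS}


def apply_mask(images, mask, fill=None):
    """Replace the image of each unavailable camera with a placeholder.

    How a missing view is represented (placeholder, zeros, dropped entry)
    is a hypothetical convention here, not taken from the paper.
    """
    return {cam: (img if mask[cam] else fill) for cam, img in images.items()}


# Missing back-view setting and a five-missing-views setting.
back_missing = single_view_failure("CAM_BACK")
five_missing = random_view_dropout(5, random.Random(0))

# Stand-in "images" (strings) to show how a mask would be applied.
images = {cam: f"image:{cam}" for cam in CAMERAS}
masked = apply_mask(images, back_missing)
```

Passing an explicit `random.Random` instance keeps the stochastic setting reproducible across evaluation runs, which matters when comparing methods under the same dropout pattern.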


Related papers