Fugu-MT Paper Translation (Abstract): AIM-SLAM: Dense Monocular SLAM via Adaptive and Informative Multi-View Keyframe Prioritization with Foundation Model

Paper overview: AIM-SLAM: Dense Monocular SLAM via Adaptive and Informative Multi-View Keyframe Prioritization with Foundation Model

arxiv url: http://arxiv.org/abs/2603.05097v2
Date: Fri, 06 Mar 2026 14:11:46 GMT
Status: Translation complete
Last updated in system: 2026-03-23 08:17:41.934878
Title: AIM-SLAM: Dense Monocular SLAM via Adaptive and Informative Multi-View Keyframe Prioritization with Foundation Model
Title (reference translation): AIM-SLAM: Dense Monocular SLAM via Adaptive and Informative Multi-View Keyframe Prioritization with Foundation Model
Authors: Jinwoo Jeon, Dong-Uk Seo, Eungchang Mason Lee, Hyun Myung
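Abstract summary: AIM-SLAM is a dense monocular SLAM framework that exploits adaptive and informative multi-view keyframe prioritization. We formulate a joint multi-view Sim(3) optimization that enforces consistent alignment across selected views. The effectiveness of AIM-SLAM is demonstrated on real-world datasets, where it achieves state-of-the-art performance in both pose estimation and dense reconstruction.
Reference score (independently computed attention measure): 7.598885266145037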
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in geometric foundation models have emerged as a promising alternative for addressing the challenge of dense reconstruction in monocular visual simultaneous localization and mapping (SLAM). Although geometric foundation models enable SLAM to leverage variable input views, previous methods remain confined to two-view pairs or fixed-length inputs without sufficient consideration of geometric context for view selection. To tackle this problem, we propose AIM-SLAM, a dense monocular SLAM framework that exploits an adaptive and informative multi-view keyframe prioritization with dense pointmap predictions from visual geometry grounded transformer (VGGT). Specifically, we introduce the selective information- and geometric-aware multi-view adaptation (SIGMA) module, which employs voxel overlap and information gain to retrieve a candidate set of keyframes and adaptively determine its size. Furthermore, we formulate a joint multi-view Sim(3) optimization that enforces consistent alignment across selected views, substantially improving pose estimation accuracy. The effectiveness of AIM-SLAM is demonstrated on real-world datasets, where it achieves state-of-the-art performance in both pose estimation and dense reconstruction. Our system supports ROS integration, with code available at https://aimslam.github.io/.
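Abstract (reference translation): Recent advances in geometric foundation models have emerged as a promising alternative for addressing the challenge of dense reconstruction in monocular visual simultaneous localization and mapping (SLAM). Although geometric foundation models allow SLAM to use a variable number of input views, previous methods are limited to two-view pairs or fixed-length inputs and do not sufficiently consider geometric context when selecting views. To address this problem, we propose AIM-SLAM, a dense monocular SLAM framework that uses adaptive and informative multi-view keyframe prioritization with dense pointmap predictions from the visual geometry grounded transformer (VGGT). Specifically, we propose the selective information- and geometric-aware multi-view adaptation (SIGMA) module, which uses voxel overlap and information gain to retrieve a candidate set of keyframes and adaptively determine its size. Furthermore, we formulate a joint multi-view Sim(3) optimization that enforces consistent alignment across the selected views and substantially improves pose estimation accuracy. The effectiveness of AIM-SLAM is demonstrated on real-world datasets, where it achieves state-of-the-art performance in both pose estimation and dense reconstruction. Our system supports ROS integration, and the code is available at https://aimslam.github.io/.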

Related papers