Fugu-MT Paper Translation (Abstract): FoundationSLAM: Unleashing the Power of Depth Foundation Models for End-to-End Dense Visual SLAM

Paper overview: FoundationSLAM: Unleashing the Power of Depth Foundation Models for End-to-End Dense Visual SLAM

arxiv url: http://arxiv.org/abs/2512.25008v2
Date: Thu, 01 Jan 2026 17:02:51 GMT
Status: Translation complete
Last updated in system: 2026-01-05 13:15:27.778283
Title: FoundationSLAM: Unleashing the Power of Depth Foundation Models for End-to-End Dense Visual SLAM
Title (reference translation): FoundationSLAM: Unleashing the Power of Depth Foundation Models for End-to-End Dense Visual SLAM
Authors: Yuchen Wu, Jiahe Li, Fabio Tosi, Matteo Poggi, Jin Zheng, Xiao Bai
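Abstract summary: FoundationSLAM is a learning-based monocular dense SLAM system for accurate and robust tracking and mapping. Its core idea is to bridge flow estimation with geometric reasoning by leveraging guidance from foundation depth models.
Reference score (independently computed attention metric): 50.9765003472032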
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present FoundationSLAM, a learning-based monocular dense SLAM system that addresses the absence of geometric consistency in previous flow-based approaches for accurate and robust tracking and mapping. Our core idea is to bridge flow estimation with geometric reasoning by leveraging the guidance from foundation depth models. To this end, we first develop a Hybrid Flow Network that produces geometry-aware correspondences, enabling consistent depth and pose inference across diverse keyframes. To enforce global consistency, we propose a Bi-Consistent Bundle Adjustment Layer that jointly optimizes keyframe pose and depth under multi-view constraints. Furthermore, we introduce a Reliability-Aware Refinement mechanism that dynamically adapts the flow update process by distinguishing between reliable and uncertain regions, forming a closed feedback loop between matching and optimization. Extensive experiments demonstrate that FoundationSLAM achieves superior trajectory accuracy and dense reconstruction quality across multiple challenging datasets, while running in real-time at 18 FPS, demonstrating strong generalization to various scenarios and practical applicability of our method.
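Abstract (reference translation): This work proposes FoundationSLAM, a learning-based monocular dense SLAM system that addresses the lack of geometric consistency in earlier flow-based approaches in order to achieve accurate and robust tracking and mapping. The central idea is to bridge flow estimation and geometric reasoning by leveraging guidance from foundation depth models. To this end, the authors first develop a Hybrid Flow Network that produces geometry-aware correspondences, enabling consistent depth and pose inference across diverse keyframes. To enforce global consistency, they propose a Bi-Consistent Bundle Adjustment Layer that jointly optimizes keyframe pose and depth under multi-view constraints. They further introduce a Reliability-Aware Refinement mechanism that dynamically adapts the flow update process by distinguishing reliable regions from uncertain ones, forming a closed feedback loop between matching and optimization. Extensive experiments show that FoundationSLAM achieves superior trajectory accuracy and dense reconstruction quality on multiple challenging datasets while running in real time at 18 FPS, demonstrating strong generalization to various scenarios and the practical applicability of the method.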

Related papers