Fugu-MT Paper Translation (Abstract): Efficient Feature-Free Initialization for Monocular Visual-Inertial Systems Using a Feed-Forward 3D Model

Paper overview: Efficient Feature-Free Initialization for Monocular Visual-Inertial Systems Using a Feed-Forward 3D Model

arxiv url: http://arxiv.org/abs/2605.17327v1
Date: Sun, 17 May 2026 08:35:02 GMT
Status: Translation complete
System update date: 2026-05-19 17:57:47.903122
Title: Efficient Feature-Free Initialization for Monocular Visual-Inertial Systems Using a Feed-Forward 3D Model
Title (reference translation): Efficient feature-free initialization of monocular visual-inertial systems using a feed-forward 3D model
Authors: Yuantai Zhang, Jiaqi Yang, Huajian Zeng, Changhao Chen, Haoang Li, Liang Li, Dezhen Song, Xingxing Zuo
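Abstract summary: We propose a feature-free initialization framework for monocular visual-inertial navigation systems (VINS). It uses up-to-scale point clouds predicted by a feed-forward 3D model, avoiding the need for visual feature tracking and estimation. Experiments on public datasets show that the proposed method achieves the highest success rate, exceeding 90%.
Reference score (independently computed attention score): 23.41814839928409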
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Fast and reliable initialization is critical for monocular visual-inertial navigation systems (VINS), as it establishes the starting conditions for subsequent state estimation. Despite steady progress, most existing methods heavily rely on visual feature correspondences and require 3-4 seconds of sensory data for successful initialization, which limits their applicability and efficiency. With the advent of feed-forward 3D models that can directly predict point clouds from images, we revisit the visual-inertial initialization problem from a concise perspective. In this work, we propose a feature-free initialization framework that leverages up-to-scale point clouds predicted by a feed-forward 3D model, thereby obviating the need for visual feature tracking and estimation. This design substantially reduces system complexity and improves the reliability of initialization. Experiments on public datasets demonstrate that the proposed feature-free initialization method achieves the highest success rate, exceeding 90%, and significantly reduces the data duration required for successful initialization, typically to under 1.2 s. We further validate our method on a self-collected dataset covering various indoor and outdoor scenarios, demonstrating robust performance, particularly in visually degraded environments where existing methods often fail. The code and dataset are available at https://github.com/Yuantai-Z/FF-VIO-Init.
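Abstract (reference translation): Fast and reliable initialization is important for monocular visual-inertial navigation systems (VINS), because it establishes the starting conditions for subsequent state estimation. Despite steady progress, most existing methods rely heavily on visual feature correspondences and need 3-4 seconds of sensor data to initialize successfully. With the emergence of feed-forward 3D models that can predict point clouds directly from images, we revisit the visual-inertial initialization problem from a concise perspective. In this work, we propose a feature-free initialization framework that uses the up-to-scale point clouds predicted by a feed-forward 3D model, avoiding the need for visual feature tracking and estimation. This design greatly reduces system complexity and improves initialization reliability. Experiments on public datasets show that the proposed method achieves the highest success rate, exceeding 90%, and substantially shortens the data duration required for successful initialization, typically to under 1.2 s. We also validate the method on a self-collected dataset covering a variety of indoor and outdoor scenarios, where it performs robustly, particularly in visually degraded environments in which existing methods often fail. The code and dataset are available at https://github.com/Yuantai-Z/FF-VIO-Init.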

Related papers