MDE-VIO: Enhancing Visual-Inertial Odometry Using Learned Depth Priors
- URL: http://arxiv.org/abs/2602.11323v1
- Date: Wed, 11 Feb 2026 19:53:06 GMT
- Title: MDE-VIO: Enhancing Visual-Inertial Odometry Using Learned Depth Priors
- Authors: Arda Alniak, Sinan Kalkan, Mustafa Mert Ankarali, Afsar Saranli, Abdullah Aydin Alatan
- Abstract summary: We propose a novel framework that enforces affine-invariant depth consistency and pairwise ordinal constraints. This approach strictly adheres to the computational limits of edge devices while robustly recovering metric scale.
- Score: 8.2208199207543
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional monocular Visual-Inertial Odometry (VIO) systems struggle in low-texture environments where sparse visual features are insufficient for accurate pose estimation. To address this, dense Monocular Depth Estimation (MDE) has been widely explored as a complementary information source. While recent Vision Transformer (ViT)-based foundation models offer dense, geometrically consistent depth, their computational demands typically preclude them from real-time edge deployment. Our work bridges this gap by integrating learned depth priors directly into the VINS-Mono optimization backend. We propose a novel framework that enforces affine-invariant depth consistency and pairwise ordinal constraints, explicitly filtering unstable artifacts via variance-based gating. This approach strictly adheres to the computational limits of edge devices while robustly recovering metric scale. Extensive experiments on the TartanGround and M3ED datasets demonstrate that our method prevents divergence in challenging scenarios and delivers significant accuracy gains, reducing Absolute Trajectory Error (ATE) by up to 28.3%. Code will be made available.
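As a rough illustration of the affine-invariant alignment and variance-based gating the abstract describes, the sketch below fits a per-frame scale and shift between an MDE prediction and sparse metric depths (such as triangulated VIO features) by least squares, and drops high-variance patches. The function names, patch-based gate, and thresholds are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def align_affine(pred, sparse, mask):
    """Least-squares scale a and shift b so that a * pred + b matches the
    sparse metric depths at the pixels where they are available."""
    x, y = pred[mask], sparse[mask]
    A = np.stack([x, np.ones_like(x)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a, b

def variance_gate(pred, patch=8, thresh=0.5):
    """Boolean keep-mask that discards high-variance patches: a crude
    stand-in for gating unstable depth artifacts before optimization."""
    H, W = pred.shape
    keep = np.ones((H, W), dtype=bool)
    for i in range(0, H - patch + 1, patch):
        for j in range(0, W - patch + 1, patch):
            if pred[i:i + patch, j:j + patch].var() > thresh:
                keep[i:i + patch, j:j + patch] = False
    return keep
```

In a real backend the aligned residuals `a * pred + b - sparse` would enter the factor graph only at gated pixels; here the two functions are kept separate for clarity.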
Related papers
- GVGS: Gaussian Visibility-Aware Multi-View Geometry for Accurate Surface Reconstruction [15.170414649311441]
3D Gaussian Splatting enables efficient optimization and high-quality rendering, yet accurate surface reconstruction remains challenging.
We introduce a visibility-aware multi-view geometric consistency constraint that aggregates the visibility of shared Gaussian primitives across views.
We also propose a progressive quadtree-calibrated monocular depth constraint that performs block-wise affine calibration from coarse to fine spatial scales.
arXiv Detail & Related papers (2026-01-28T07:48:51Z) - Depth Anything with Any Prior [64.39991799606146]
Prior Depth Anything is a framework that combines incomplete but precise metric information in depth measurement with relative but complete geometric structures in depth prediction.
We develop a conditioned monocular depth estimation (MDE) model to refine the inherent noise of depth priors.
Our model showcases impressive zero-shot generalization across depth completion, super-resolution, and inpainting over 7 real-world datasets.
arXiv Detail & Related papers (2025-05-15T17:59:50Z) - Relative Pose Estimation through Affine Corrections of Monocular Depth Priors [69.59216331861437]
We develop three solvers for relative pose estimation that explicitly account for independent affine (scale and shift) ambiguities.
We propose a hybrid estimation pipeline that combines our proposed solvers with classic point-based solvers and epipolar constraints.
arXiv Detail & Related papers (2025-01-09T18:58:30Z) - AGS-Mesh: Adaptive Gaussian Splatting and Meshing with Geometric Priors for Indoor Room Reconstruction Using Smartphones [19.429461194706786]
We propose an approach for joint surface depth and normal refinement of Gaussian Splatting methods for accurate 3D reconstruction of indoor scenes.
Our filtering strategy and optimization design demonstrate significant improvements in both mesh estimation and novel-view synthesis.
arXiv Detail & Related papers (2024-11-28T17:04:32Z) - Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian [49.21866794516328]
3D Gaussian splatting has demonstrated impressive performance in real-time novel view synthesis.
Previous approaches have incorporated depth supervision into the training of 3D Gaussians to mitigate overfitting.
We introduce a novel method to supervise the depth distribution of 3D Gaussians, utilizing depth priors with integrated uncertainty estimates.
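The blurb above supervises depth with priors carrying uncertainty estimates. A minimal way to see why that matters is inverse-variance weighting of per-pixel residuals, sketched below; this toy weighting is an assumption for illustration and does not reproduce the paper's optimal-transport formulation over depth distributions.

```python
import numpy as np

def weighted_depth_loss(pred, prior, var, eps=1e-6):
    """Mean squared depth residual, with each pixel down-weighted by the
    variance of its depth prior so uncertain priors contribute less."""
    w = 1.0 / (var + eps)          # inverse-variance weights
    r = pred - prior               # per-pixel residuals
    return float(np.sum(w * r * r) / np.sum(w))
```

With uniform variances this reduces to an ordinary mean squared error; inflating the variance at unreliable prior pixels shrinks their influence on the fit.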
arXiv Detail & Related papers (2024-05-30T03:18:30Z) - MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization [29.713650915551632]
This letter introduces a novel framework for dense Visual Simultaneous Localization and Mapping based on Gaussian Splatting.
We jointly optimize sparse visual odometry tracking and 3D Gaussian Splatting scene representation for the first time.
The accuracy of our pose estimation surpasses existing state-of-the-art methods.
arXiv Detail & Related papers (2024-05-10T04:42:21Z) - Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be simplified as an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z) - Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
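A virtual-normal-style constraint of the kind described above can be sketched as follows: back-project depth through a pinhole camera, form normals from sampled point triplets via cross products, and penalize the mismatch between normals from predicted and reference depth. The intrinsics, triplet sampling, and loss form here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def backproject(depth, K):
    """Lift an H x W depth map to 3D camera-frame points via pinhole K."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    return np.stack([x, y, depth], axis=-1)

def triplet_normals(P, i, j, k, eps=1e-8):
    """Unit normals of the 'virtual' planes spanned by point triplets."""
    n = np.cross(P[j] - P[i], P[k] - P[i])
    return n / (np.linalg.norm(n, axis=-1, keepdims=True) + eps)

def vn_loss(depth_pred, depth_ref, K, triplets):
    """Mean discrepancy between virtual normals from two depth maps."""
    P_pred = backproject(depth_pred, K).reshape(-1, 3)
    P_ref = backproject(depth_ref, K).reshape(-1, 3)
    i, j, k = triplets.T
    n_pred = triplet_normals(P_pred, i, j, k)
    n_ref = triplet_normals(P_ref, i, j, k)
    return float(np.mean(np.linalg.norm(n_pred - n_ref, axis=-1)))
```

Because each normal depends on three widely separated points, this loss penalizes long-range geometric distortions that a per-pixel depth error would miss.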
arXiv Detail & Related papers (2021-03-07T00:08:21Z) - CodeVIO: Visual-Inertial Odometry with Learned Optimizable Dense Depth [83.77839773394106]
We present a lightweight, tightly-coupled deep depth network and visual-inertial odometry system.
We provide the network with previously marginalized sparse features from VIO to increase the accuracy of initial depth prediction.
We show that it can run in real-time with single-thread execution while utilizing GPU acceleration only for the network and code Jacobian.
arXiv Detail & Related papers (2020-12-18T09:42:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.