Non-learning Stereo-aided Depth Completion under Mis-projection via
Selective Stereo Matching
- URL: http://arxiv.org/abs/2210.01436v1
- Date: Tue, 4 Oct 2022 07:46:56 GMT
- Title: Non-learning Stereo-aided Depth Completion under Mis-projection via
Selective Stereo Matching
- Authors: Yasuhiro Yao, Ryoichi Ishikawa, Shingo Ando, Kana Kurata, Naoki Ito,
Jun Shimamura, and Takeshi Oishi
- Abstract summary: We propose a non-learning depth completion method for a sparse depth map captured using a light detection and ranging (LiDAR) sensor guided by a pair of stereo images.
The proposed method reduced the mean absolute error (MAE) of the depth estimation to 0.65 times that of the previous state of the art and was approximately twice as accurate at long range.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a non-learning depth completion method for a sparse depth map
captured using a light detection and ranging (LiDAR) sensor guided by a pair of
stereo images. Generally, conventional stereo-aided depth completion methods
have two limitations. (i) They assume the given sparse depth map is accurately
aligned to the input image, whereas the alignment is difficult to achieve in
practice. (ii) They have limited accuracy in the long range because the depth
is estimated by pixel disparity. To solve the abovementioned limitations, we
propose selective stereo matching (SSM) that searches the most appropriate
depth value for each image pixel from its neighborly projected LiDAR points
based on an energy minimization framework. This depth selection approach can
handle any type of mis-projection. Moreover, SSM has an advantage in terms of
long-range depth accuracy because it directly uses the LiDAR measurement rather
than the depth acquired from the stereo. SSM is a discrete process; thus, we
apply variational smoothing with binary anisotropic diffusion tensor (B-ADT) to
generate a continuous depth map while preserving depth discontinuity across
object boundaries. Experimentally, the proposed method reduced the mean
absolute error (MAE) of the depth estimation to 0.65 times that of the previous
state-of-the-art stereo-aided depth completion and was approximately twice as
accurate at long range. Moreover, under
various LiDAR-camera calibration errors, the proposed method reduced the depth
estimation MAE to 0.34-0.93 times from previous depth completion methods.
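The per-pixel depth selection described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' implementation: the neighborhood window, the absolute-difference photometric data term, and the mean-based smoothness prior are simplifying assumptions standing in for the paper's energy minimization framework, and the B-ADT smoothing stage is omitted. It only shows the core idea of scoring each neighborly projected LiDAR depth candidate by stereo photo-consistency, so that a mis-projected point with the wrong depth loses to a candidate whose implied disparity actually matches the stereo pair.

```python
import numpy as np

def disparity_from_depth(depth, focal, baseline):
    # Standard stereo relation: d = f * B / Z
    return focal * baseline / depth

def selective_stereo_matching(left, right, lidar_pts, focal, baseline,
                              win=2, lam=0.1):
    """For each pixel, pick the best depth among LiDAR points projected into a
    (2*win+1)^2 neighborhood. Each candidate is scored by a photometric data
    term (left/right intensity difference at the implied disparity) plus a
    toy smoothness prior toward the local candidate mean -- a stand-in for
    the paper's energy minimization. lidar_pts is a list of (u, v, depth)."""
    h, w = left.shape
    # Scatter sparse LiDAR depths into an image-aligned map (0 = empty).
    sparse = np.zeros((h, w))
    for u, v, z in lidar_pts:
        if 0 <= v < h and 0 <= u < w:
            sparse[v, u] = z
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - win), min(h, y + win + 1)
            x0, x1 = max(0, x - win), min(w, x + win + 1)
            cands = sparse[y0:y1, x0:x1]
            cands = cands[cands > 0]
            if cands.size == 0:
                continue  # no candidates: leave pixel empty
            mean_z = cands.mean()
            best_cost, best_z = np.inf, 0.0
            for z in cands:
                d = int(round(disparity_from_depth(z, focal, baseline)))
                xr = x - d
                if not (0 <= xr < w):
                    continue
                # Data term: photo-consistency between the stereo views.
                data = abs(float(left[y, x]) - float(right[y, xr]))
                # Smoothness term: penalize outliers among local candidates.
                smooth = lam * abs(z - mean_z)
                cost = data + smooth
                if cost < best_cost:
                    best_cost, best_z = cost, z
            out[y, x] = best_z
    return out
```

With a synthetic fronto-parallel scene (right image = left shifted by one pixel, i.e. true disparity 1 for depth 10 at focal 10 and baseline 1), a mis-projected candidate with depth 5 implies disparity 2, fails the photometric check, and is rejected even at its own pixel.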
Related papers
- Adaptive Stereo Depth Estimation with Multi-Spectral Images Across All Lighting Conditions [58.88917836512819]
We propose a novel framework incorporating stereo depth estimation to enforce accurate geometric constraints.
To mitigate the effects of poor lighting on stereo matching, we introduce Degradation Masking.
Our method achieves state-of-the-art (SOTA) performance on the Multi-Spectral Stereo (MS2) dataset.
arXiv Detail & Related papers (2024-11-06T03:30:46Z)
- ARAI-MVSNet: A multi-view stereo depth estimation network with adaptive depth range and depth interval [19.28042366225802]
Multi-View Stereo(MVS) is a fundamental problem in geometric computer vision.
We present a novel multi-stage coarse-to-fine framework to achieve adaptive all-pixel depth range and depth interval.
Our model achieves state-of-the-art performance and yields competitive generalization ability.
arXiv Detail & Related papers (2023-08-17T14:52:11Z)
- DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation [23.22005119986485]
DiffusionDepth is a new approach that reformulates monocular depth estimation as a denoising diffusion process.
It learns an iterative denoising process to denoise a random depth distribution into a depth map with the guidance of monocular visual conditions.
Experimental results on KITTI and NYU-Depth-V2 datasets suggest that a simple yet efficient diffusion approach could reach state-of-the-art performance in both indoor and outdoor scenarios with acceptable inference time.
arXiv Detail & Related papers (2023-03-09T03:48:24Z)
- Probabilistic Volumetric Fusion for Dense Monocular SLAM [33.156523309257786]
We present a novel method to reconstruct 3D scenes by leveraging deep dense monocular SLAM and fast uncertainty propagation.
The proposed approach is able to 3D reconstruct scenes densely, accurately, and in real-time while being robust to extremely noisy depth estimates.
We show that our approach achieves 92% better accuracy than directly fusing depths from monocular SLAM, and up to 90% improvements compared to the best competing approach.
arXiv Detail & Related papers (2022-10-03T23:53:35Z)
- Distortion-Tolerant Monocular Depth Estimation On Omnidirectional Images Using Dual-cubemap [37.82642960470551]
We propose a distortion-tolerant omnidirectional depth estimation algorithm using a dual-cubemap.
In DCDE module, we present a rotation-based dual-cubemap model to estimate the accurate NFoV depth.
Then a boundary revision module is designed to smooth the discontinuous boundaries, which contributes to the precise and visually continuous omnidirectional depths.
arXiv Detail & Related papers (2022-03-18T04:20:36Z)
- Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction [91.43066633305662]
We propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD)
Specifically, we unify three complementary tasks: depth estimation, salient object detection and contour estimation. The multi-task mechanism promotes the model to learn the task-aware features from the auxiliary tasks.
Experiments show that it not only significantly surpasses the depth-based RGB-D SOD methods on multiple datasets, but also precisely predicts a high-quality depth map and salient contour at the same time.
arXiv Detail & Related papers (2022-03-09T17:20:18Z)
- Weakly-Supervised Monocular Depth Estimation with Resolution-Mismatched Data [73.9872931307401]
We propose a novel weakly-supervised framework to train a monocular depth estimation network.
The proposed framework is composed of a sharing weight monocular depth estimation network and a depth reconstruction network for distillation.
Experimental results demonstrate that our method achieves superior performance than unsupervised and semi-supervised learning based schemes.
arXiv Detail & Related papers (2021-09-23T18:04:12Z)
- Deep Two-View Structure-from-Motion Revisited [83.93809929963969]
Two-view structure-from-motion (SfM) is the cornerstone of 3D reconstruction and visual SLAM.
We propose to revisit the problem of deep two-view SfM by leveraging the well-posedness of the classic pipeline.
Our method consists of 1) an optical flow estimation network that predicts dense correspondences between two frames; 2) a normalized pose estimation module that computes relative camera poses from the 2D optical flow correspondences, and 3) a scale-invariant depth estimation network that leverages epipolar geometry to reduce the search space, refine the dense correspondences, and estimate relative depth maps.
arXiv Detail & Related papers (2021-04-01T15:31:20Z)
- Robust Consistent Video Depth Estimation [65.53308117778361]
We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details.
In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations.
arXiv Detail & Related papers (2020-12-10T18:59:48Z)
- Direct Depth Learning Network for Stereo Matching [79.3665881702387]
A novel Direct Depth Learning Network (DDL-Net) is designed for stereo matching.
DDL-Net consists of two stages: the Coarse Depth Estimation stage and the Adaptive-Grained Depth Refinement stage.
We show that DDL-Net achieves an average improvement of 25% on the SceneFlow dataset and 12% on the DrivingStereo dataset.
arXiv Detail & Related papers (2020-12-10T10:33:57Z)
- Balanced Depth Completion between Dense Depth Inference and Sparse Range Measurements via KISS-GP [14.158132769768578]
Estimating a dense and accurate depth map is the key requirement for autonomous driving and robotics.
Recent advances in deep learning have allowed depth estimation in full resolution from a single image.
Despite this impressive result, many deep-learning-based monocular depth estimation algorithms have struggled to maintain accuracy, yielding meter-level estimation errors.
arXiv Detail & Related papers (2020-08-12T08:07:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.