Direct Depth Learning Network for Stereo Matching
- URL: http://arxiv.org/abs/2012.05570v1
- Date: Thu, 10 Dec 2020 10:33:57 GMT
- Title: Direct Depth Learning Network for Stereo Matching
- Authors: Hong Zhang and Haojie Li and Shenglun Chen and Tiantian Yan and Zhihui
Wang and Guo Lu and Wanli Ouyang
- Abstract summary: A novel Direct Depth Learning Network (DDL-Net) is designed for stereo matching.
DDL-Net consists of two stages: the Coarse Depth Estimation stage and the Adaptive-Grained Depth Refinement stage.
We show that DDL-Net achieves an average improvement of 25% on the SceneFlow dataset and 12% on the DrivingStereo dataset.
- Score: 79.3665881702387
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a crucial task in autonomous driving, stereo matching has made great
progress in recent years. Existing stereo matching methods estimate disparity
instead of depth. They treat the disparity errors as the evaluation metric of
the depth estimation errors, since the depth can be calculated from the
disparity according to the triangulation principle. However, we find that the
error of the depth depends not only on the error of the disparity but also on
the depth range of the points. Therefore, even if the disparity error is low,
the depth error can still be large, especially for distant points. In this
paper, a novel Direct Depth Learning Network (DDL-Net) is designed for stereo
matching. DDL-Net consists of two stages: the Coarse Depth Estimation stage and
the Adaptive-Grained Depth Refinement stage, both of which are supervised by depth
instead of disparity. Specifically, the Coarse Depth Estimation stage uniformly
samples the matching candidates according to the depth range to construct a cost
volume and output a coarse depth. The Adaptive-Grained Depth Refinement stage
performs further matching near the coarse depth to correct imprecise
and wrong matches. To make the Adaptive-Grained Depth Refinement
stage robust to the coarse depth and adaptive to the depth range of the points,
Granularity Uncertainty is introduced into the Adaptive-Grained Depth Refinement
stage. Granularity Uncertainty adjusts the matching range and selects the
candidates' features according to coarse prediction confidence and depth range.
We verify the performance of DDL-Net on the SceneFlow and DrivingStereo
datasets using different depth metrics. Results show that DDL-Net achieves an
average improvement of 25% on the SceneFlow dataset and 12% on the
DrivingStereo dataset compared with classical methods. More importantly, we
achieve state-of-the-art accuracy at a large distance.
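The abstract's point that depth error grows with depth range follows from the triangulation relation Z = f * B / d (depth Z, focal length f in pixels, baseline B, disparity d). The sketch below is an illustration only, not the DDL-Net implementation: the function names and the camera values (f = 1000 px, B = 0.5 m) are assumptions chosen for the example. It shows that the same one-pixel disparity error gives a much larger depth error for a distant point, and that depth candidates sampled uniformly in depth are unevenly spaced in disparity.

```python
# Illustrative sketch only; names and camera parameters are hypothetical.

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Triangulation: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def depth_error(true_depth_m, disparity_error_px, focal_px, baseline_m):
    """Depth error caused by underestimating the disparity by a fixed amount."""
    true_disparity = focal_px * baseline_m / true_depth_m
    estimated = depth_from_disparity(
        true_disparity - disparity_error_px, focal_px, baseline_m
    )
    return estimated - true_depth_m


def uniform_depth_candidates(min_depth_m, max_depth_m, count):
    """Matching candidates spaced evenly in depth rather than in disparity."""
    step = (max_depth_m - min_depth_m) / (count - 1)
    return [min_depth_m + i * step for i in range(count)]


if __name__ == "__main__":
    focal_px, baseline_m = 1000.0, 0.5  # assumed camera, not from the paper

    # Same 1 px disparity error at a near and a far point.
    for depth in (10.0, 100.0):
        err = depth_error(depth, 1.0, focal_px, baseline_m)
        print(f"depth {depth:6.1f} m -> depth error {err:.3f} m")

    # Uniform depth samples map to non-uniform disparities.
    for z in uniform_depth_candidates(10.0, 100.0, 4):
        print(f"depth {z:6.1f} m <-> disparity {focal_px * baseline_m / z:.3f} px")
```

With these assumed values, the one-pixel error costs about 0.2 m at 10 m but 25 m at 100 m, which is the effect that motivates supervising and evaluating in depth rather than disparity.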
Related papers
- ARAI-MVSNet: A multi-view stereo depth estimation network with adaptive
depth range and depth interval [19.28042366225802]
Multi-View Stereo (MVS) is a fundamental problem in geometric computer vision.
We present a novel multi-stage coarse-to-fine framework to achieve adaptive all-pixel depth range and depth interval.
Our model achieves state-of-the-art performance and yields competitive generalization ability.
arXiv Detail & Related papers (2023-08-17T14:52:11Z) - Depth Refinement for Improved Stereo Reconstruction [13.941756438712382]
Current techniques for depth estimation from stereoscopic images still suffer from a built-in drawback.
A simple analysis reveals that the depth error is quadratically proportional to the object's distance.
We propose a simple but effective method that uses a refinement network for depth estimation.
arXiv Detail & Related papers (2021-12-15T12:21:08Z) - Robust Depth Completion with Uncertainty-Driven Loss Functions [60.9237639890582]
We introduce uncertainty-driven loss functions to improve the robustness of depth completion and handle the uncertainty in depth completion.
Our method has been tested on KITTI Depth Completion Benchmark and achieved the state-of-the-art robustness performance in terms of MAE, IMAE, and IRMSE metrics.
arXiv Detail & Related papers (2021-12-15T05:22:34Z) - Depth Completion using Plane-Residual Representation [84.63079529738924]
We introduce a novel way of interpreting depth information with the closest depth plane label $p$ and a residual value $r$, as we call it, Plane-Residual (PR) representation.
By interpreting depth information in PR representation and using our corresponding depth completion network, we were able to acquire improved depth completion performance with faster computation.
arXiv Detail & Related papers (2021-04-15T10:17:53Z) - PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View
Depth Estimation with Neural Positional Encoding and Distilled Matting Loss [49.66736599668501]
We propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net.
Our method shows unprecedented accuracy levels, exceeding 95% in terms of the $\delta1$ metric on the KITTI dataset.
arXiv Detail & Related papers (2021-03-12T15:54:46Z) - Boundary-induced and scene-aggregated network for monocular depth
prediction [20.358133522462513]
We propose the Boundary-induced and Scene-aggregated network (BS-Net) to predict the dense depth of a single RGB image.
Several experimental results on the NYUD v2 dataset and the iBims-1 dataset illustrate the state-of-the-art performance of the proposed approach.
arXiv Detail & Related papers (2021-02-26T01:43:17Z) - Deep Multi-view Depth Estimation with Predicted Uncertainty [11.012201499666503]
We employ a dense-optical-flow network to compute correspondences and then triangulate the point cloud to obtain an initial depth map.
To further increase the triangulation accuracy, we introduce a depth-refinement network (DRN) that optimizes the initial depth map based on the image's contextual cues.
arXiv Detail & Related papers (2020-11-19T00:22:09Z) - Faster Depth-Adaptive Transformers [71.20237659479703]
Depth-adaptive neural networks can dynamically adjust depths according to the hardness of input words.
Previous works generally build a halting unit to decide whether the computation should continue or stop at each layer.
In this paper, we get rid of the halting unit and estimate the required depths in advance, which yields a faster depth-adaptive model.
arXiv Detail & Related papers (2020-04-27T15:08:10Z) - Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z) - DELTAS: Depth Estimation by Learning Triangulation And densification of
Sparse points [14.254472131009653]
Multi-view stereo (MVS) is the golden mean between the accuracy of active depth sensing and the practicality of monocular depth estimation.
Cost volume based approaches employing 3D convolutional neural networks (CNNs) have considerably improved the accuracy of MVS systems.
We propose an efficient depth estimation approach by first (a) detecting and evaluating descriptors for interest points, then (b) learning to match and triangulate a small set of interest points, and finally (c) densifying this sparse set of 3D points using CNNs.
arXiv Detail & Related papers (2020-03-19T17:56:41Z)