Analysis & Computational Complexity Reduction of Monocular and Stereo
Depth Estimation Techniques
- URL: http://arxiv.org/abs/2206.09071v1
- Date: Sat, 18 Jun 2022 00:47:33 GMT
- Title: Analysis & Computational Complexity Reduction of Monocular and Stereo
Depth Estimation Techniques
- Authors: Rajeev Patwari, Varo Ly
- Abstract summary: A high-accuracy algorithm may provide the best depth estimation but may consume tremendous compute and energy resources.
Previous work has shown that this trade-off can be improved with a state-of-the-art stereo depth estimation method (AnyNet).
Our experiments with this stereo vision method (AnyNet) show that depth estimation accuracy degrades by no more than 3% (three-pixel error metric) despite a ~20% reduction in model size.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate depth estimation at the lowest possible compute and energy cost is a
crucial requirement for unmanned and battery-operated autonomous systems. Robotic
applications require real-time depth estimation for navigation and decision
making in rapidly changing 3D surroundings. A high-accuracy algorithm may
provide the best depth estimation but may consume tremendous compute and energy
resources. A common trade-off is to use a less accurate method for an initial
depth estimate and a more accurate yet compute-intensive method only when needed.
Previous work has shown that this trade-off can be improved with a
state-of-the-art stereo depth estimation method (AnyNet).
We studied both monocular and stereo depth estimation methods as our baseline and
investigated ways to reduce their computational complexity. Our experiments show
that reducing the monocular depth estimation model size by ~75% reduces accuracy
by less than 2% (SSIM metric). Our experiments with the stereo vision method
(AnyNet) show that depth estimation accuracy degrades by no more than 3%
(three-pixel error metric) despite a ~20% reduction in model size. We have shown
that smaller models can indeed perform competitively.
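The abstract's accuracy claims rest on two metrics: SSIM for the monocular models and the three-pixel error for the stereo (AnyNet) models. Below is a minimal sketch of how these metrics are typically computed; the function names, the 3 px / 5% thresholds, and the use of scikit-image are illustrative assumptions, not details taken from the paper's code.

```python
# Minimal sketch of the two accuracy metrics quoted in the abstract:
# SSIM for monocular depth maps and the three-pixel error for stereo
# disparity maps. Names and thresholds are illustrative assumptions.
import numpy as np
from skimage.metrics import structural_similarity as ssim

def ssim_score(pred_depth: np.ndarray, gt_depth: np.ndarray) -> float:
    """SSIM between predicted and ground-truth depth maps (higher is better)."""
    data_range = float(gt_depth.max() - gt_depth.min())
    return ssim(pred_depth, gt_depth, data_range=data_range)

def three_pixel_error(pred_disp: np.ndarray, gt_disp: np.ndarray,
                      abs_thresh: float = 3.0, rel_thresh: float = 0.05) -> float:
    """Fraction of pixels whose disparity error exceeds both 3 px and 5% of
    the ground-truth disparity (the usual KITTI-style definition)."""
    valid = gt_disp > 0                  # ignore pixels without ground truth
    err = np.abs(pred_disp - gt_disp)
    bad = (err > abs_thresh) & (err > rel_thresh * gt_disp)
    return float(np.mean(bad[valid]))
```

Evaluating a ~75% smaller monocular model or a ~20% smaller AnyNet variant with such metrics against the full-size baselines yields the degradation figures quoted above.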
Related papers
- Deep Neighbor Layer Aggregation for Lightweight Self-Supervised
Monocular Depth Estimation [1.6775954077761863]
We present a fully convolutional depth estimation network using contextual feature fusion.
Compared to UNet++ and HRNet, we use high-resolution and low-resolution features to preserve information on small targets and fast-moving objects.
Our method reduces the parameters without sacrificing accuracy.
arXiv Detail & Related papers (2023-09-17T13:40:15Z) - Monocular Visual-Inertial Depth Estimation [66.71452943981558]
We present a visual-inertial depth estimation pipeline that integrates monocular depth estimation and visual-inertial odometry.
Our approach performs global scale and shift alignment against sparse metric depth, followed by learning-based dense alignment (a minimal sketch of scale-and-shift alignment appears after this list).
We evaluate on the TartanAir and VOID datasets, observing up to 30% reduction in RMSE with dense scale alignment.
arXiv Detail & Related papers (2023-03-21T18:47:34Z) - Depth Refinement for Improved Stereo Reconstruction [13.941756438712382]
Current techniques for depth estimation from stereoscopic images still suffer from a built-in drawback.
A simple analysis reveals that the depth error grows quadratically with the object's distance (a worked version of this relation appears after this list).
We propose a simple but effective method that uses a refinement network for depth estimation.
arXiv Detail & Related papers (2021-12-15T12:21:08Z) - Scale-aware direct monocular odometry [4.111899441919165]
We present a framework for direct monocular odometry based on depth prediction from a deep neural network.
Our proposal largely outperforms classic monocular SLAM, being 5 to 9 times more precise, with an accuracy which is closer to that of stereo systems.
arXiv Detail & Related papers (2021-09-21T10:30:15Z) - Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be simplified as an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z) - Geometry Uncertainty Projection Network for Monocular 3D Object
Detection [138.24798140338095]
We propose a Geometry Uncertainty Projection Network (GUP Net) to tackle the error amplification problem at both inference and training stages.
Specifically, a GUP module is proposed to obtain the geometry-guided uncertainty of the inferred depth.
At the training stage, we propose a Hierarchical Task Learning strategy to reduce the instability caused by error amplification.
arXiv Detail & Related papers (2021-07-29T06:59:07Z) - Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z) - Variational Monocular Depth Estimation for Reliability Prediction [12.951621755732544]
Self-supervised learning for monocular depth estimation is widely investigated as an alternative to the supervised learning approach.
Previous works have successfully improved the accuracy of depth estimation by modifying the model structure.
In this paper, we theoretically formulate a variational model for the monocular depth estimation to predict the reliability of the estimated depth image.
arXiv Detail & Related papers (2020-11-24T06:23:51Z) - Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z) - Fast Depth Estimation for View Synthesis [9.243157709083672]
Disparity/depth estimation from sequences of stereo images is an important element in 3D vision.
We propose a novel learning-based framework making use of dilated convolutions, densely connected convolutional modules, a compact decoder, and skip connections.
We show that our network outperforms state-of-the-art methods, with average improvements in depth estimation and view synthesis of approximately 45% and 34%, respectively.
arXiv Detail & Related papers (2020-03-14T14:10:42Z) - D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual
Odometry [57.5549733585324]
D3VO is a novel framework for monocular visual odometry that exploits deep networks on three levels -- deep depth, pose and uncertainty estimation.
We first propose a novel self-supervised monocular depth estimation network trained on stereo videos without any external supervision.
We model the photometric uncertainties of pixels on the input images, which improves the depth estimation accuracy.
arXiv Detail & Related papers (2020-03-02T17:47:13Z)
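On the scale-and-shift alignment mentioned in the Monocular Visual-Inertial Depth Estimation entry above: aligning a scale-ambiguous depth prediction to sparse metric depth is commonly posed as a closed-form least-squares fit. A minimal sketch, assuming NumPy arrays and a hypothetical align_scale_shift helper; this is a generic illustration of the technique, not that paper's implementation.

```python
import numpy as np

def align_scale_shift(pred: np.ndarray, sparse_metric: np.ndarray,
                      valid: np.ndarray) -> np.ndarray:
    """Fit scale s and shift t so that s*pred + t matches the sparse metric
    depth at valid pixels (least squares), then apply them densely.
    Generic illustration only, not the paper's implementation."""
    p = pred[valid].ravel()                     # predicted depth at sparse points
    m = sparse_metric[valid].ravel()            # metric depth from VIO / sensors
    A = np.stack([p, np.ones_like(p)], axis=1)  # design matrix [pred, 1]
    (s, t), *_ = np.linalg.lstsq(A, m, rcond=None)
    return s * pred + t                         # globally aligned dense depth
```

Here valid would typically mark the pixels where visual-inertial odometry provides sparse metric depth; a learning-based dense alignment stage, as in that paper, would then refine the result locally.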
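On the quadratic depth error noted in the Depth Refinement for Improved Stereo Reconstruction entry above: for a rectified stereo pair with focal length f and baseline b, depth is recovered from disparity d, so a fixed disparity error grows quadratically with distance. A minimal worked relation from standard stereo geometry, not specific to that paper:

```latex
Z = \frac{f\,b}{d}, \qquad
\left|\frac{\partial Z}{\partial d}\right| = \frac{f\,b}{d^{2}} = \frac{Z^{2}}{f\,b}
\quad\Longrightarrow\quad
|\Delta Z| \approx \frac{Z^{2}}{f\,b}\,|\Delta d|.
```

This is why refining disparity for distant objects yields disproportionately large gains in reconstructed depth accuracy.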