UDepth: Fast Monocular Depth Estimation for Visually-guided Underwater
Robots
- URL: http://arxiv.org/abs/2209.12358v1
- Date: Mon, 26 Sep 2022 01:08:36 GMT
- Title: UDepth: Fast Monocular Depth Estimation for Visually-guided Underwater
Robots
- Authors: Boxiao Yu, Jiayi Wu and Md Jahidul Islam
- Abstract summary: We present a fast monocular depth estimation method for enabling 3D perception capabilities of low-cost underwater robots.
We formulate a novel end-to-end deep visual learning pipeline named UDepth, which incorporates domain knowledge of image formation characteristics of natural underwater scenes.
- Score: 4.157415305926584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a fast monocular depth estimation method for
enabling 3D perception capabilities of low-cost underwater robots. We formulate
a novel end-to-end deep visual learning pipeline named UDepth, which
incorporates domain knowledge of image formation characteristics of natural
underwater scenes. First, we adapt a new input space from raw RGB image space
by exploiting an underwater light attenuation prior, and then devise a
least-squares formulation for coarse pixel-wise depth prediction. Subsequently,
we extend this into a domain projection loss that guides the end-to-end
learning of UDepth on over 9K RGB-D training samples. UDepth is designed with a
computationally light MobileNetV2 backbone and a Transformer-based optimizer
to ensure fast inference rates on embedded systems. Through domain-aware design
choices and comprehensive experimental analyses, we demonstrate that it
is possible to achieve state-of-the-art depth estimation performance while
ensuring a small computational footprint. Specifically, with 70%-80% fewer
network parameters than existing benchmarks, UDepth achieves comparable and
often better depth estimation performance. While the full model offers over 66
FPS (13 FPS) inference rates on a single GPU (CPU core), our domain projection
for coarse depth prediction runs at 51.5 FPS on single-board NVIDIA
Jetson TX2s. The inference pipelines are available at
https://github.com/uf-robopi/UDepth.
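The domain projection described above can be illustrated with a minimal sketch. The assumptions here are ours, not the released code: the attenuation prior is taken as the per-pixel gap between the strongest of the green/blue channels and the red channel (red light attenuates fastest underwater), and coarse depth is modeled as an affine function of that prior fitted by least squares on RGB-D samples. Function names and the exact prior definition are illustrative; see the repository above for the actual pipeline.

```python
# Hedged sketch (illustrative assumptions, not the released UDepth code):
# prior = max(G, B) - R, and coarse depth ~ a * prior + b fit by least squares.
import numpy as np

def attenuation_prior(rgb: np.ndarray) -> np.ndarray:
    """rgb: HxWx3 float image in [0, 1]. Red attenuates fastest underwater,
    so max(G, B) - R tends to increase with distance."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.maximum(g, b) - r

def fit_affine_depth(rgb_images, depth_maps):
    """Least-squares fit of depth ~ a * prior + b over RGB-D training pairs."""
    priors = np.concatenate([attenuation_prior(im).ravel() for im in rgb_images])
    depths = np.concatenate([d.ravel() for d in depth_maps])
    valid = np.isfinite(depths) & (depths > 0)      # keep pixels with valid depth
    A = np.stack([priors[valid], np.ones(valid.sum())], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, depths[valid], rcond=None)
    return a, b

def coarse_depth(rgb: np.ndarray, a: float, b: float) -> np.ndarray:
    """Coarse per-pixel depth prediction from the fitted affine map."""
    return a * attenuation_prior(rgb) + b
```

A hedged reading of the abstract is that this prior-to-depth mapping is then folded into a "domain projection loss" that supervises the end-to-end network alongside ground-truth depth; the exact loss form is not specified here.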
Related papers
- NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth
Supervision for Indoor Multi-View 3D Detection [72.0098999512727]
NeRF-Det has achieved impressive performance in indoor multi-view 3D detection by utilizing NeRF to enhance representation learning.
We present three corresponding solutions, including semantic enhancement, perspective-aware sampling, and ordinal depth supervision.
The resulting algorithm, NeRF-Det++, exhibits appealing performance on the ScanNetV2 and ARKitScenes datasets.
arXiv Detail & Related papers (2024-02-22T11:48:06Z)
- Metrically Scaled Monocular Depth Estimation through Sparse Priors for
Underwater Robots [0.0]
We formulate a deep learning model that fuses sparse depth measurements from triangulated features to improve depth predictions (a generic sparse-alignment sketch is given after this list).
The network is trained in a supervised fashion on the forward-looking underwater dataset, FLSea.
The method achieves real-time performance, running at 160 FPS on a laptop GPU and 7 FPS on a single CPU core.
arXiv Detail & Related papers (2023-10-25T16:32:31Z)
- Deep Neighbor Layer Aggregation for Lightweight Self-Supervised
Monocular Depth Estimation [1.6775954077761863]
We present a fully convolutional depth estimation network using contextual feature fusion.
Compared to UNet++ and HRNet, we use high-resolution and low-resolution features to preserve information about small targets and fast-moving objects.
Our method reduces the parameters without sacrificing accuracy.
arXiv Detail & Related papers (2023-09-17T13:40:15Z)
- Real-time Monocular Depth Estimation on Embedded Systems [32.40848141360501]
Two efficient architectures, RT-MonoDepth and RT-MonoDepth-S, are proposed.
RT-MonoDepth and RT-MonoDepth-S achieve frame rates of 18.4 and 30.5 FPS on the NVIDIA Jetson Nano and 253.0 and 364.1 FPS on the Jetson AGX Orin, respectively.
arXiv Detail & Related papers (2023-08-21T08:59:59Z)
- P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior [133.76192155312182]
We propose a method that learns to selectively leverage information from coplanar pixels to improve the predicted depth.
An extensive evaluation of our method shows that we set the new state of the art in supervised monocular depth estimation.
arXiv Detail & Related papers (2022-04-05T10:03:52Z)
- Sparse Auxiliary Networks for Unified Monocular Depth Prediction and
Completion [56.85837052421469]
Estimating scene geometry from data obtained with cost-effective sensors is key for robots and self-driving cars.
In this paper, we study the problem of predicting dense depth from a single RGB image with optional sparse measurements from low-cost active depth sensors.
We introduce Sparse Networks (SANs), a new module enabling monodepth networks to perform both the tasks of depth prediction and completion.
arXiv Detail & Related papers (2021-03-30T21:22:26Z)
- FastFlowNet: A Lightweight Network for Fast Optical Flow Estimation [81.76975488010213]
Dense optical flow estimation plays a key role in many robotic vision tasks.
Current networks often have a large number of parameters and incur heavy computation costs.
Our proposed FastFlowNet works in the well-known coarse-to-fine manner with the following innovations.
arXiv Detail & Related papers (2021-03-08T03:09:37Z)
- CodeVIO: Visual-Inertial Odometry with Learned Optimizable Dense Depth [83.77839773394106]
We present a lightweight, tightly-coupled deep depth network and visual-inertial odometry system.
We provide the network with previously marginalized sparse features from VIO to increase the accuracy of initial depth prediction.
We show that it can run in real-time with single-thread execution while utilizing GPU acceleration only for the network and code Jacobian.
arXiv Detail & Related papers (2020-12-18T09:42:54Z)
- Accurate RGB-D Salient Object Detection via Collaborative Learning [101.82654054191443]
RGB-D saliency detection shows impressive ability on some challenging scenarios.
We propose a novel collaborative learning framework where edge, depth and saliency are leveraged in a more efficient way.
arXiv Detail & Related papers (2020-07-23T04:33:36Z)
- MiniNet: An extremely lightweight convolutional neural network for
real-time unsupervised monocular depth estimation [22.495019810166397]
We propose a powerful network that uses a recurrent module to achieve the capability of a deep network, while maintaining an extremely lightweight size for real-time, high-performance unsupervised monocular depth prediction from video sequences.
Our new model can run at a speed of about 110 frames per second (fps) on a single GPU, 37 fps on a single CPU, and 2 fps on a Raspberry Pi 3.
arXiv Detail & Related papers (2020-06-27T12:13:22Z)
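For the "Metrically Scaled Monocular Depth Estimation through Sparse Priors" entry above, the summary only states that sparse triangulated depth measurements are fused into the prediction. A common, generic way to exploit such sparse measurements is a least-squares scale-and-shift alignment of a relative depth map, sketched below; this is an illustrative baseline under that assumption, not the fusion network described in the paper.

```python
# Generic sketch (not the paper's fusion network): align a scale-ambiguous
# monocular depth map to sparse metric depths from triangulated features by
# fitting a least-squares scale/shift, then applying it to the whole map.
import numpy as np

def align_to_sparse(rel_depth: np.ndarray, uv: np.ndarray, z: np.ndarray) -> np.ndarray:
    """rel_depth: HxW relative depth; uv: Nx2 integer (u, v) pixel coordinates of
    triangulated features; z: N metric depths. Solves min_{s,t} sum_i (s*d_i + t - z_i)^2."""
    d = rel_depth[uv[:, 1], uv[:, 0]]               # sample predicted depth at feature pixels
    A = np.stack([d, np.ones_like(d)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, z, rcond=None)  # least-squares scale and shift
    return s * rel_depth + t                        # metrically scaled depth map
```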
This list is automatically generated from the titles and abstracts of the papers on this site.