Full Surround Monodepth from Multiple Cameras
- URL: http://arxiv.org/abs/2104.00152v1
- Date: Wed, 31 Mar 2021 22:52:04 GMT
- Title: Full Surround Monodepth from Multiple Cameras
- Authors: Vitor Guizilini, Igor Vasiljevic, Rares Ambrus, Greg Shakhnarovich,
Adrien Gaidon
- Abstract summary: We extend self-supervised monocular depth and ego-motion estimation to large-baseline multi-camera rigs.
We learn a single network generating dense, consistent, and scale-aware point clouds that cover the same full surround 360 degree field of view as a typical LiDAR scanner.
- Score: 31.145598985137468
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised monocular depth and ego-motion estimation is a promising
approach to replace or supplement expensive depth sensors such as LiDAR for
robotics applications like autonomous driving. However, most research in this
area focuses on a single monocular camera or stereo pairs that cover only a
fraction of the scene around the vehicle. In this work, we extend monocular
self-supervised depth and ego-motion estimation to large-baseline multi-camera
rigs. Using generalized spatio-temporal contexts, pose consistency constraints,
and carefully designed photometric loss masking, we learn a single network
generating dense, consistent, and scale-aware point clouds that cover the same
full surround 360 degree field of view as a typical LiDAR scanner. We also
propose a new scale-consistent evaluation metric more suitable to multi-camera
settings. Experiments on two challenging benchmarks illustrate the benefits of
our approach over strong baselines.
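The abstract names two concrete ingredients: a masked photometric loss for cross-camera self-supervision, and a scale-consistent evaluation metric. The sketch below illustrates both under stated assumptions; the tensor shapes, mask sources, and function names are illustrative, not the authors' code, and the single shared median scale is one plausible reading of "scale-consistent".

```python
import torch
import numpy as np

def masked_photometric_loss(target, synthesized, valid_mask):
    """L1 photometric error between a camera's image and a view synthesized
    from a spatio-temporal context (another time step and/or another camera
    in the rig), averaged only over pixels marked valid.

    target, synthesized: (B, 3, H, W) tensors in [0, 1].
    valid_mask: (B, 1, H, W) binary mask that zeroes out non-overlapping
    regions and self-occlusions such as the ego-vehicle body.
    """
    error = (target - synthesized).abs().mean(dim=1, keepdim=True)
    return (error * valid_mask).sum() / valid_mask.sum().clamp(min=1)

def shared_scale_abs_rel(pred_depths, gt_depths):
    """Abs-rel error with ONE median scale factor shared by all cameras and
    frames, so per-camera scale inconsistencies are penalized instead of
    being hidden by per-image median scaling."""
    pred = np.concatenate([p.ravel() for p in pred_depths])
    gt = np.concatenate([g.ravel() for g in gt_depths])
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]
    scale = np.median(gt) / np.median(pred)  # a single global scale
    return np.mean(np.abs(scale * pred - gt) / gt)
```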
Related papers
- SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
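As a hedged illustration of the stereo-guidance idea summarized above (not the paper's code): stereo depth is trusted only on the overlap region between adjacent cameras and added as supervision there, while self-supervision covers the full image. Names, shapes, and the mask source are assumptions.

```python
import torch

def stereo_guided_loss(pred_depth, stereo_depth, overlap_mask, self_sup_loss):
    """pred_depth, stereo_depth: (B, 1, H, W) depth maps.
    overlap_mask: (B, 1, H, W), 1 where a stereo estimate exists and is trusted.
    self_sup_loss: scalar photometric loss already covering the full image."""
    guided = ((pred_depth - stereo_depth).abs() * overlap_mask).sum() \
             / overlap_mask.sum().clamp(min=1)
    return self_sup_loss + guided  # stereo guidance applies only on the overlap
```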
- Robust Self-Supervised Extrinsic Self-Calibration [25.727912226753247]
Multi-camera self-supervised monocular depth estimation from videos is a promising way to reason about the environment.
We introduce a novel method for extrinsic calibration that builds upon the principles of self-supervised monocular depth and ego-motion learning.
arXiv Detail & Related papers (2023-08-04T06:20:20Z)
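A minimal sketch of the self-calibration idea summarized above, assuming a translation-plus-axis-angle parametrization (an assumption, not necessarily the paper's): each camera's extrinsic pose becomes a learnable parameter updated by the same photometric objective that trains depth and ego-motion.

```python
import torch

class LearnableExtrinsics(torch.nn.Module):
    def __init__(self, num_cameras):
        super().__init__()
        # 3 translation + 3 axis-angle rotation parameters per camera,
        # initialized near a rough guess and refined by gradient descent.
        self.params = torch.nn.Parameter(torch.zeros(num_cameras, 6))

    def forward(self, cam_idx):
        # The returned pose parameters feed the view-synthesis warp, so the
        # photometric loss back-propagates into the extrinsics themselves:
        # loss = photometric(target, warp(src, depth, self.params[cam_idx]))
        return self.params[cam_idx]
```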
- BiFuse++: Self-supervised and Efficient Bi-projection Fusion for 360 Depth Estimation [59.11106101006008]
We propose BiFuse++ to explore the combination of bi-projection fusion and the self-training scenario.
We propose a new fusion module and Contrast-Aware Photometric Loss to improve the performance of BiFuse.
arXiv Detail & Related papers (2022-09-07T06:24:21Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves state-of-the-art performance on challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
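As a rough sketch of cross-view fusion in the spirit of the SurroundDepth summary above (the paper's actual transformer differs), per-camera feature maps are flattened into tokens and a stock attention layer lets every view attend to all others:

```python
import torch

class CrossViewFusion(torch.nn.Module):
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.attn = torch.nn.MultiheadAttention(channels, num_heads,
                                                batch_first=True)

    def forward(self, feats):
        # feats: (B, N_cams, C, H, W); tokens span all cameras and pixels,
        # which is only practical on downsampled feature maps.
        b, n, c, h, w = feats.shape
        tokens = feats.permute(0, 1, 3, 4, 2).reshape(b, n * h * w, c)
        fused, _ = self.attn(tokens, tokens, tokens)
        return fused.reshape(b, n, h, w, c).permute(0, 1, 4, 2, 3)
```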
- SGM3D: Stereo Guided Monocular 3D Object Detection [62.11858392862551]
We propose a stereo-guided monocular 3D object detection network, termed SGM3D.
We exploit robust 3D features extracted from stereo images to enhance the features learned from the monocular image.
Our method can be integrated into many other monocular approaches to boost performance without introducing any extra computational cost.
arXiv Detail & Related papers (2021-12-03T13:57:14Z)
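A hedged sketch of the stereo-to-monocular feature guidance summarized above: during training, the monocular features are pulled toward those of a stereo branch, which is dropped at inference so no extra cost remains. Which layers to match is an assumption.

```python
import torch

def feature_imitation_loss(mono_feats, stereo_feats):
    """L2 distance pulling monocular features toward detached stereo
    features; the stereo network acts as a training-time teacher only."""
    return torch.nn.functional.mse_loss(mono_feats, stereo_feats.detach())
```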
- LiDARTouch: Monocular metric depth estimation with a few-beam LiDAR [40.98198236276633]
Vision-based depth estimation is a key feature in autonomous systems.
In such a monocular setup, dense depth is typically obtained with additional input from one or several expensive LiDARs.
In this paper, we propose a new alternative for dense metric depth estimation that combines a monocular camera with a lightweight LiDAR.
arXiv Detail & Related papers (2021-09-08T12:06:31Z)
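As a minimal illustration of why a few-beam LiDAR helps (a stand-in for the paper's learned fusion, not its method): the sparse but metric returns can anchor the scale of a dense, scale-ambiguous monocular prediction via a least-squares fit.

```python
import numpy as np

def scale_from_sparse_lidar(mono_depth, lidar_depth):
    """mono_depth: (H, W) dense, scale-ambiguous prediction.
    lidar_depth: (H, W) metric depth, zero where no beam returned."""
    hits = lidar_depth > 0
    m, l = mono_depth[hits], lidar_depth[hits]
    scale = (m * l).sum() / (m * m).sum()  # argmin_s ||s*m - l||^2
    return scale * mono_depth  # dense and metric
```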
- SVDistNet: Self-Supervised Near-Field Distance Estimation on Surround View Fisheye Cameras [30.480562747903186]
A 360° perception of scene geometry is essential for automated driving, notably for parking and urban driving scenarios.
We present novel camera-geometry adaptive multi-scale convolutions which utilize the camera parameters as a conditional input.
We evaluate our approach on the Fisheye WoodScape surround-view dataset, significantly improving over previous approaches.
arXiv Detail & Related papers (2021-04-09T15:20:20Z)
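A rough sketch of camera-geometry conditioning as summarized above: per-pixel calibration information enters the network as extra input channels so a single model can adapt to each fisheye camera. Using normalized pixel coordinates as the geometry tensor is a simplification; the paper conditions on actual camera parameters.

```python
import torch

def add_camera_geometry_channels(images):
    """images: (B, 3, H, W) -> (B, 5, H, W), appending two channels that
    encode where each pixel sits in the camera's field of view."""
    b, _, h, w = images.shape
    ys = torch.linspace(-1, 1, h).view(1, 1, h, 1).expand(b, 1, h, w)
    xs = torch.linspace(-1, 1, w).view(1, 1, 1, w).expand(b, 1, h, w)
    return torch.cat([images, xs, ys], dim=1)
```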
- Monocular Depth Estimation with Self-supervised Instance Adaptation [138.0231868286184]
In robotics applications, multiple views of a scene may or may not be available, depending on the actions of the robot.
We propose a new approach that extends any off-the-shelf self-supervised monocular depth reconstruction system to use more than one image at test time.
arXiv Detail & Related papers (2020-04-13T08:32:03Z)
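A schematic sketch of test-time instance adaptation as summarized above: keep the self-supervised objective available at deployment and take a few gradient steps on the images actually observed. Step count and learning rate are illustrative assumptions.

```python
import torch

def adapt_at_test_time(model, frames, self_sup_loss_fn, steps=20, lr=1e-5):
    """frames: the image(s) observed at test time; self_sup_loss_fn: the same
    photometric objective used during training, evaluated on `frames`."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = self_sup_loss_fn(model, frames)
        loss.backward()
        opt.step()
    return model  # now specialized to this particular scene
```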
- Depth Sensing Beyond LiDAR Range [84.19507822574568]
We propose a novel three-camera system that utilizes small field of view cameras.
Our system, along with our novel algorithm for computing metric depth, does not require full pre-calibration.
It can output dense depth maps with practically acceptable accuracy for scenes and objects at long distances.
arXiv Detail & Related papers (2020-04-07T00:09:51Z)