FisheyeDepth: A Real Scale Self-Supervised Depth Estimation Model for Fisheye Camera
- URL: http://arxiv.org/abs/2409.15054v1
- Date: Mon, 23 Sep 2024 14:31:42 GMT
- Title: FisheyeDepth: A Real Scale Self-Supervised Depth Estimation Model for Fisheye Camera
- Authors: Guoyang Zhao, Yuxuan Liu, Weiqing Qi, Fulong Ma, Ming Liu, Jun Ma
- Abstract summary: We present FisheyeDepth, a self-supervised depth estimation model tailored for fisheye cameras.
We incorporate a fisheye camera model into the projection and reprojection stages during training to handle image distortions.
We also incorporate real-scale pose information into the geometric projection between consecutive frames, replacing the poses estimated by the conventional pose network.
- Score: 8.502741852406904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate depth estimation is crucial for 3D scene comprehension in robotics and autonomous vehicles. Fisheye cameras, known for their wide field of view, have inherent geometric benefits. However, their use in depth estimation is restricted by a scarcity of ground truth data and image distortions. We present FisheyeDepth, a self-supervised depth estimation model tailored for fisheye cameras. We incorporate a fisheye camera model into the projection and reprojection stages during training to handle image distortions, thereby improving depth estimation accuracy and training stability. Furthermore, we incorporate real-scale pose information into the geometric projection between consecutive frames, replacing the poses estimated by the conventional pose network. Essentially, this method offers the necessary physical depth for robotic tasks, and also streamlines the training and inference procedures. Additionally, we devise a multi-channel output strategy to improve robustness by adaptively fusing features at various scales, which reduces the noise from real pose data. We demonstrate the superior performance and robustness of our model in fisheye image depth estimation through evaluations on public datasets and real-world scenarios. The project website is available at: https://github.com/guoyangzhao/FisheyeDepth.
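The projection and reprojection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which fisheye model FisheyeDepth uses, so the code below assumes an ideal equidistant model (r = f * theta) with no distortion coefficients, treats depth as range along the viewing ray, and takes the relative pose as a known 4x4 matrix standing in for the real-scale pose. All function names and parameter values are hypothetical and are not taken from the project's code.

```python
import numpy as np


def unproject_equidistant(uv, depth, fx, fy, cx, cy):
    """Back-project pixels to 3D points in the camera frame.

    Assumes an ideal equidistant fisheye model (r = f * theta) and
    interprets `depth` as distance along the ray (an assumption).
    uv: (N, 2) pixel coordinates, depth: (N,) ranges in metres.
    """
    x = (uv[:, 0] - cx) / fx
    y = (uv[:, 1] - cy) / fy
    theta = np.sqrt(x**2 + y**2)   # angle between the ray and the optical axis
    phi = np.arctan2(y, x)         # azimuth around the optical axis
    rays = np.stack(
        [np.sin(theta) * np.cos(phi),
         np.sin(theta) * np.sin(phi),
         np.cos(theta)],
        axis=1,
    )                              # unit-length viewing rays
    return rays * depth[:, None]


def project_equidistant(points, fx, fy, cx, cy):
    """Project 3D camera-frame points to pixels with the same model."""
    X, Y, Z = points[:, 0], points[:, 1], points[:, 2]
    theta = np.arctan2(np.hypot(X, Y), Z)
    phi = np.arctan2(Y, X)
    u = fx * theta * np.cos(phi) + cx
    v = fy * theta * np.sin(phi) + cy
    return np.stack([u, v], axis=1)


def warp_pixels(uv, depth, T_src_from_tgt, fx, fy, cx, cy):
    """Map target-frame pixels into the source frame.

    T_src_from_tgt is a known 4x4 rigid transform with metric translation
    (a stand-in for real-scale pose; hypothetical name and convention).
    """
    pts_tgt = unproject_equidistant(uv, depth, fx, fy, cx, cy)
    pts_h = np.concatenate([pts_tgt, np.ones((len(pts_tgt), 1))], axis=1)
    pts_src = (T_src_from_tgt @ pts_h.T).T[:, :3]
    return project_equidistant(pts_src, fx, fy, cx, cy)


if __name__ == "__main__":
    # Hypothetical intrinsics and a 0.5 m forward motion of the camera.
    intr = dict(fx=300.0, fy=300.0, cx=320.0, cy=240.0)
    uv = np.array([[320.0, 240.0], [400.0, 300.0], [100.0, 50.0]])
    depth = np.array([2.0, 5.0, 3.0])
    T = np.eye(4)
    T[2, 3] = -0.5  # points move 0.5 m closer along the optical axis
    print(warp_pixels(uv, depth, T, **intr))
```

In a self-supervised pipeline, the warped coordinates would typically be used to sample the source image and compare the result with the target image; that sampling step is omitted here.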
Related papers
- FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z)
- Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image [91.71077190961688]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploit 3D point cloud data to predict the depth shift and the camera's focal length, which allow us to recover 3D scene shapes.
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
arXiv Detail & Related papers (2022-08-28T16:20:14Z)
- Improving Depth Estimation using Location Information [0.0]
This paper improves self-supervised deep learning techniques for accurate, generalized monocular depth estimation.
The main idea is to train the deep model on a sequence of frames, each geotagged with its location information.
arXiv Detail & Related papers (2021-12-27T22:30:14Z)
- Depth360: Monocular Depth Estimation using Learnable Axisymmetric Camera Model for Spherical Camera Image [2.3859169601259342]
We propose a learnable axisymmetric camera model which accepts distorted spherical camera images with two fisheye camera images.
We trained our models with a photo-realistic simulator to generate ground truth depth images.
We demonstrate the efficacy of our method using the spherical camera images from the GO Stanford dataset and pinhole camera images from the KITTI dataset.
arXiv Detail & Related papers (2021-10-20T07:21:04Z)
- Learning to Recover 3D Scene Shape from a Single Image [98.20106822614392]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then use 3D point cloud encoders to predict the missing depth shift and focal length that allow us to recover a realistic 3D scene shape.
arXiv Detail & Related papers (2020-12-17T02:35:13Z)
- Robust Consistent Video Depth Estimation [65.53308117778361]
We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details.
In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations.
arXiv Detail & Related papers (2020-12-10T18:59:48Z)
- Variational Monocular Depth Estimation for Reliability Prediction [12.951621755732544]
Self-supervised learning for monocular depth estimation is widely investigated as an alternative to supervised learning approaches.
Previous works have successfully improved the accuracy of depth estimation by modifying the model structure.
In this paper, we theoretically formulate a variational model for the monocular depth estimation to predict the reliability of the estimated depth image.
arXiv Detail & Related papers (2020-11-24T06:23:51Z)
- Neural Ray Surfaces for Self-Supervised Learning of Depth and Ego-motion [51.19260542887099]
We show that self-supervision can be used to learn accurate depth and ego-motion estimation without prior knowledge of the camera model.
Inspired by the geometric model of Grossberg and Nayar, we introduce Neural Ray Surfaces (NRS), convolutional networks that represent pixel-wise projection rays.
We demonstrate the use of NRS for self-supervised learning of visual odometry and depth estimation from raw videos obtained using a wide variety of camera systems.
arXiv Detail & Related papers (2020-08-15T02:29:13Z)
- SynDistNet: Self-Supervised Monocular Fisheye Camera Distance Estimation Synergized with Semantic Segmentation for Autonomous Driving [37.50089104051591]
State-of-the-art self-supervised learning approaches for monocular depth estimation usually suffer from scale ambiguity.
This paper introduces a novel multi-task learning strategy to improve self-supervised monocular distance estimation on fisheye and pinhole camera images.
arXiv Detail & Related papers (2020-08-10T10:52:47Z)
- Video Depth Estimation by Fusing Flow-to-Depth Proposals [65.24533384679657]
We present an approach with a differentiable flow-to-depth layer for video depth estimation.
The model consists of a flow-to-depth layer, a camera pose refinement module, and a depth fusion network.
Our approach outperforms state-of-the-art depth estimation methods and has reasonable cross-dataset generalization capability.
arXiv Detail & Related papers (2019-12-30T10:45:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.