360$^\circ$ Depth Estimation from Multiple Fisheye Images with Origami
Crown Representation of Icosahedron
- URL: http://arxiv.org/abs/2007.06891v1
- Date: Tue, 14 Jul 2020 08:02:53 GMT
- Title: 360$^\circ$ Depth Estimation from Multiple Fisheye Images with Origami
Crown Representation of Icosahedron
- Authors: Ren Komatsu, Hiromitsu Fujii, Yusuke Tamura, Atsushi Yamashita, Hajime
Asama
- Abstract summary: We propose a new icosahedron-based representation and ConvNets for omnidirectional images.
CrownConv can be applied to both fisheye images and equirectangular images to extract features.
As our proposed method is computationally efficient, the depth is estimated from four fisheye images in less than a second using a laptop with a GPU.
- Score: 5.384800591054856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, we present a method for all-around depth estimation from
multiple omnidirectional images for indoor environments. In particular, we
focus on plane-sweeping stereo as the method for depth estimation from the
images. We propose a new icosahedron-based representation and ConvNets for
omnidirectional images, which we name "CrownConv" because the representation
resembles a crown made of origami. CrownConv can be applied to both fisheye
images and equirectangular images to extract features. Furthermore, we propose
icosahedron-based spherical sweeping for generating the cost volume on an
icosahedron from the extracted features. The cost volume is regularized using
the three-dimensional CrownConv, and the final depth is obtained by depth
regression from the cost volume. Our proposed method is robust to changes in
camera alignment because it uses the extrinsic camera parameters; therefore, it
achieves precise depth estimation even when the camera alignment differs from
that of the training dataset. We evaluate the proposed model on synthetic datasets and
demonstrate its effectiveness. As our proposed method is computationally
efficient, the depth is estimated from four fisheye images in less than a
second using a laptop with a GPU. Therefore, it is suitable for real-world
robotics applications. Our source code is available at
https://github.com/matsuren/crownconv360depth.
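The final step of the pipeline described above, depth regression from a regularized cost volume, can be illustrated with a minimal NumPy sketch of soft-argmin regression. This is an illustrative simplification, not the authors' implementation; the function name, toy cost volume, and hypothesis count are invented for the example.

```python
import numpy as np

def depth_regression(cost_volume, depth_hypotheses):
    """Soft-argmin depth regression over a plane-sweep cost volume.

    cost_volume: (D, N) matching costs for D depth hypotheses at N
        locations (e.g., icosahedron vertices); lower cost = better match.
    depth_hypotheses: (D,) candidate depths swept over the scene.
    Returns the per-location expected depth, shape (N,).
    """
    # Convert costs to a probability distribution over hypotheses
    # (negate so that low cost maps to high probability).
    logits = -cost_volume
    logits -= logits.max(axis=0, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=0, keepdims=True)
    # Expected depth = probability-weighted sum of the hypotheses.
    return (probs * depth_hypotheses[:, None]).sum(axis=0)

# Toy example: 4 hypotheses, 2 locations; location 0 matches best
# at 1 m, location 1 at 4 m.
depths = np.array([1.0, 2.0, 3.0, 4.0])
cost = np.array([[0.0, 9.0],
                 [9.0, 9.0],
                 [9.0, 9.0],
                 [9.0, 0.0]])
est = depth_regression(cost, depths)
```

Because the soft-argmin is differentiable, this regression step can be trained end-to-end, unlike a hard argmin over the cost volume.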
Related papers
- SGDE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap regions.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
- ARAI-MVSNet: A multi-view stereo depth estimation network with adaptive depth range and depth interval [19.28042366225802]
Multi-View Stereo(MVS) is a fundamental problem in geometric computer vision.
We present a novel multi-stage coarse-to-fine framework to achieve adaptive all-pixel depth range and depth interval.
Our model achieves state-of-the-art performance and yields competitive generalization ability.
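The adaptive depth range idea in a coarse-to-fine MVS pipeline can be sketched as follows. This is a hypothetical simplification (uniform sampling in a narrowed per-pixel interval around a coarse estimate), not ARAI-MVSNet's actual scheme; the function name and parameters are illustrative.

```python
import numpy as np

def refine_depth_hypotheses(coarse_depth, num_samples, shrink=0.5,
                            prev_range=(0.5, 10.0)):
    """Narrow the per-pixel depth search range around a coarse estimate.

    coarse_depth: (H, W) depth map from the previous (coarser) stage.
    shrink: fraction of the previous range width kept at this stage.
    prev_range: (near, far) bounds of the previous stage's sweep.
    Returns (num_samples, H, W) depth hypotheses, uniformly spaced
    within the narrowed per-pixel interval.
    """
    lo, hi = prev_range
    half = 0.5 * shrink * (hi - lo)
    # Clamp the narrowed interval to the valid depth range.
    near = np.clip(coarse_depth - half, lo, hi)
    far = np.clip(coarse_depth + half, lo, hi)
    # Uniform sampling between the per-pixel near/far bounds.
    t = np.linspace(0.0, 1.0, num_samples)[:, None, None]
    return near[None] + t * (far - near)[None]

# Example: a flat coarse estimate of 5 m over a 2x2 patch.
coarse = np.full((2, 2), 5.0)
hyps = refine_depth_hypotheses(coarse, num_samples=4)
```

Each finer stage repeats this with a smaller `shrink`, so the same number of hypotheses covers a progressively tighter depth interval per pixel.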
arXiv Detail & Related papers (2023-08-17T14:52:11Z)
- Lightweight Monocular Depth Estimation [4.19709743271943]
We create a lightweight machine-learning model that predicts the depth value of each pixel from a single RGB image, using the U-Net structure of the image segmentation network.
The proposed method achieves relatively high accuracy and low root-mean-square error.
arXiv Detail & Related papers (2022-12-21T21:05:16Z)
- Towards Accurate Reconstruction of 3D Scene Shape from A Single Monocular Image [91.71077190961688]
We propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image.
We then exploit 3D point cloud data to predict the depth shift and the camera's focal length, which allow us to recover 3D scene shapes.
We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot evaluation.
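The scale-and-shift ambiguity mentioned above can be illustrated with a simple least-squares alignment. Note this is only a sketch of the ambiguity itself: the paper predicts the shift and focal length from 3D point clouds, whereas this hypothetical snippet fits scale and shift against a reference depth map.

```python
import numpy as np

def align_scale_shift(pred, ref):
    """Least-squares scale and shift aligning an up-to-scale-and-shift
    depth prediction to a reference depth map.

    pred, ref: (H, W) depth maps.
    Returns (aligned prediction, scale, shift).
    """
    # Solve min_{s,t} || s * pred + t - ref ||^2 via linear least squares.
    A = np.stack([pred.ravel(), np.ones(pred.size)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, ref.ravel(), rcond=None)
    return s * pred + t, s, t

# Example: the prediction differs from the reference by scale 2, shift 1.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
ref = 2.0 * pred + 1.0
aligned, s, t = align_scale_shift(pred, ref)
```

Without an external cue (point clouds, known focal length, or metric supervision), a monocular network can only be evaluated up to this per-image scale and shift.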
arXiv Detail & Related papers (2022-08-28T16:20:14Z)
- SCONE: Surface Coverage Optimization in Unknown Environments by Volumetric Integration [23.95135709027516]
Next Best View computation (NBV) is a long-standing problem in robotics.
We show that we can maximize surface metrics by Monte Carlo integration over a volumetric representation.
It takes as input an arbitrarily large point cloud gathered by a depth sensor such as LiDAR, as well as camera poses, to predict the NBV.
arXiv Detail & Related papers (2022-08-22T17:04:14Z)
- Monocular 3D Object Detection with Depth from Motion [74.29588921594853]
We take advantage of camera ego-motion for accurate object depth estimation and detection.
Our framework, named Depth from Motion (DfM), then uses the established geometry to lift 2D image features to the 3D space and detects 3D objects thereon.
Our framework outperforms state-of-the-art methods by a large margin on the KITTI benchmark.
arXiv Detail & Related papers (2022-07-26T15:48:46Z)
- SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation [101.55622133406446]
We propose the SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.
Specifically, we employ a joint network to process all the surrounding views and propose a cross-view transformer to effectively fuse the information from multiple views.
In experiments, our method achieves the state-of-the-art performance on the challenging multi-camera depth estimation datasets.
arXiv Detail & Related papers (2022-04-07T17:58:47Z)
- P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior [133.76192155312182]
We propose a method that learns to selectively leverage information from coplanar pixels to improve the predicted depth.
An extensive evaluation of our method shows that we set the new state of the art in supervised monocular depth estimation.
arXiv Detail & Related papers (2022-04-05T10:03:52Z)
- 360MonoDepth: High-Resolution 360$^\circ$ Monocular Depth Estimation [15.65828728205071]
Monocular depth estimation remains a challenge for 360$^\circ$ data.
Current CNN-based methods do not support such high resolutions due to limited GPU memory.
We propose a flexible framework for monocular depth estimation from high-resolution 360$^\circ$ images using tangent images.
arXiv Detail & Related papers (2021-11-30T18:57:29Z)
- Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo [71.59494156155309]
Existing approaches for multi-view 3D pose estimation explicitly establish cross-view correspondences to group 2D pose detections from multiple camera views.
We present our multi-view 3D pose estimation approach based on plane sweep stereo to jointly address the cross-view fusion and 3D pose reconstruction in a single shot.
arXiv Detail & Related papers (2021-04-06T03:49:35Z)
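Plane-sweep stereo, which several of the papers above build on, can be sketched as follows. This is a simplified two-view version with fronto-parallel planes, nearest-neighbor warping, and absolute-difference photometric cost; the function name is illustrative and the homography follows the textbook convention for a plane $n^\top X = d$ in the reference frame, which may differ in sign from a given implementation.

```python
import numpy as np

def plane_sweep_cost_volume(ref, src, K, R, t, depths):
    """Build a two-view plane-sweep cost volume.

    ref, src: (H, W) grayscale images (reference and source views).
    K: (3, 3) shared intrinsics; R, t: source pose w.r.t. the reference.
    depths: candidate depths of the fronto-parallel sweep planes.
    Returns (D, H, W) matching costs (lower = better match).
    """
    h, w = ref.shape
    Kinv = np.linalg.inv(K)
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous pixel coordinates of the reference image, shape (3, H*W).
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1)
    n = np.array([0.0, 0.0, 1.0])  # fronto-parallel plane normal
    cost = np.empty((len(depths), h, w))
    for i, d in enumerate(depths):
        # Plane-induced homography for the plane at depth d.
        Hmat = K @ (R - np.outer(t, n) / d) @ Kinv
        warped = Hmat @ pix
        # Nearest-neighbor lookup in the source image (clamped at borders).
        u = np.round(warped[0] / warped[2]).astype(int).clip(0, w - 1)
        v = np.round(warped[1] / warped[2]).astype(int).clip(0, h - 1)
        cost[i] = np.abs(ref - src[v, u].reshape(h, w))
    return cost

# Sanity check: identical views (R = I, t = 0) give zero cost everywhere.
rng = np.random.default_rng(0)
img = rng.random((8, 10))
K = np.array([[50.0, 0.0, 5.0], [0.0, 50.0, 4.0], [0.0, 0.0, 1.0]])
cost = plane_sweep_cost_volume(img, img, K, np.eye(3), np.zeros(3),
                               depths=[1.0, 2.0])
```

Spherical sweeping, as in the CrownConv paper above, replaces the planar hypotheses with spheres around the rig and the pixel grid with icosahedron vertices, but the cost-volume construction follows the same pattern.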
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.