${S}^{2}$Net: Accurate Panorama Depth Estimation on Spherical Surface
- URL: http://arxiv.org/abs/2301.05845v1
- Date: Sat, 14 Jan 2023 07:39:15 GMT
- Title: ${S}^{2}$Net: Accurate Panorama Depth Estimation on Spherical Surface
- Authors: Meng Li, Senbo Wang, Weihao Yuan, Weichao Shen, Zhe Sheng and Zilong Dong
- Abstract summary: We propose an end-to-end deep network for monocular panorama depth estimation on a unit spherical surface.
Specifically, we project the feature maps extracted from equirectangular images onto a unit spherical surface sampled with uniformly distributed grids.
We propose a global cross-attention-based fusion module to fuse the feature maps from the skip connections and enhance the network's ability to capture global context.
- Score: 4.649656275858966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular depth estimation is an ambiguous problem, so global structural
cues play an important role in current data-driven single-view depth estimation
methods. Panorama images capture the complete spatial information of their
surroundings using the equirectangular projection, which introduces large
distortion. A depth estimation method must therefore handle this distortion and
extract global context information from the image. In this paper, we propose an
end-to-end deep network for monocular panorama depth estimation on a unit
spherical surface. Specifically, we project the feature maps extracted from
equirectangular images onto a unit spherical surface sampled with uniformly
distributed grids, where the decoder network can aggregate information from the
distortion-reduced feature maps. Meanwhile, we propose a global
cross-attention-based fusion module that fuses the feature maps from the skip
connections and enhances the network's ability to capture global context.
Experiments on five panorama depth estimation datasets demonstrate that the
proposed method substantially outperforms previous state-of-the-art methods.
All related code will be open-sourced in the upcoming days.
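As a concrete illustration of the projection step, the sketch below bilinearly samples an equirectangular (ERP) feature map at near-uniformly distributed points on the unit sphere. This is a minimal sketch under stated assumptions, not the authors' released code: the Fibonacci-lattice grid, the function names, and the toy tensor shapes are all ours; only the idea of resampling ERP features onto a uniformly distributed spherical grid comes from the abstract.

```python
# Minimal sketch: resample an equirectangular (ERP) feature map at
# near-uniform spherical points. Illustrative assumptions throughout;
# not the paper's released implementation.
import torch
import torch.nn.functional as F

def fibonacci_sphere(n: int) -> torch.Tensor:
    """Return n near-uniform points on the unit sphere as (lat, lon) pairs."""
    i = torch.arange(n, dtype=torch.float32)
    golden = (1.0 + 5.0 ** 0.5) / 2.0
    lat = torch.asin(1.0 - 2.0 * (i + 0.5) / n)                        # [-pi/2, pi/2]
    lon = (2.0 * torch.pi * i / golden) % (2.0 * torch.pi) - torch.pi  # [-pi, pi)
    return torch.stack([lat, lon], dim=-1)                             # (n, 2)

def sample_erp_features(feat: torch.Tensor, pts: torch.Tensor) -> torch.Tensor:
    """Sample an ERP feature map (B, C, H, W) at spherical points (N, 2).

    ERP maps longitude linearly to x and latitude linearly to y, so the
    normalized coordinates grid_sample expects are just scaled angles.
    """
    lat, lon = pts[:, 0], pts[:, 1]
    x = lon / torch.pi                       # [-1, 1] across image width
    y = -lat / (torch.pi / 2.0)              # [-1, 1], north pole at the top
    grid = torch.stack([x, y], dim=-1).view(1, 1, -1, 2)
    grid = grid.expand(feat.shape[0], -1, -1, -1)
    out = F.grid_sample(feat, grid, mode="bilinear", align_corners=True)
    return out.squeeze(2)                    # (B, C, N) features on the sphere

feat = torch.randn(2, 64, 128, 256)          # toy ERP feature map
sph_feat = sample_erp_features(feat, fibonacci_sphere(5000))  # (2, 64, 5000)
```

In the full pipeline, the decoder would aggregate these distortion-reduced per-point features, and the cross-attention fusion module described above would combine them with skip-connection features; the abstract does not give enough detail to sketch those components faithfully.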
Related papers
- Decoupling Fine Detail and Global Geometry for Compressed Depth Map Super-Resolution [55.9977636042469]
Bit-depth compression produces a uniform depth representation in regions with subtle variations, hindering the recovery of detailed information.
Meanwhile, densely distributed random noise reduces the accuracy of estimating the global geometric structure of the scene.
We propose a novel framework, termed geometry-decoupled network (GDNet), for compressed depth map super-resolution.
arXiv Detail & Related papers (2024-11-05T16:37:30Z)
- Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation [75.99435808648784]
We propose a novel multi-camera collaborative depth prediction method.
It does not require large overlapping areas while maintaining structure consistency between cameras.
Experimental results on the DDAD and nuScenes datasets demonstrate the superior performance of our method.
arXiv Detail & Related papers (2022-10-05T03:44:34Z)
- SphereDepth: Panorama Depth Estimation from Spherical Domain [17.98608948955211]
This paper proposes SphereDepth, a novel panorama depth estimation method.
It predicts the depth directly on the spherical mesh without projection preprocessing.
It achieves results comparable to state-of-the-art panorama depth estimation methods.
arXiv Detail & Related papers (2022-08-29T16:50:19Z)
- Visual Attention-based Self-supervised Absolute Depth Estimation using Geometric Priors in Autonomous Driving [8.045833295463094]
We introduce a fully Visual Attention-based Depth (VADepth) network, where spatial attention and channel attention are applied to all stages.
By continuously extracting long-range feature dependencies along the spatial and channel dimensions, the VADepth network can effectively preserve important details.
Experimental results on the KITTI dataset show that this architecture achieves state-of-the-art performance, as illustrated by the sketch below.
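The summary does not specify VADepth's exact blocks, so the following is only a generic CBAM-style sketch of applying attention along both the channel and spatial dimensions; every name and hyperparameter here is an assumption for illustration.

```python
# Generic channel- and spatial-attention modules (CBAM-style sketch),
# not VADepth's actual architecture.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, c: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(c, c // reduction), nn.ReLU(inplace=True),
            nn.Linear(c // reduction, c))
    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        w = torch.sigmoid(avg + mx)[..., None, None]
        return x * w                         # reweight channels

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)
    def forward(self, x):
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(s))  # reweight spatial locations

x = torch.randn(2, 64, 32, 32)
y = SpatialAttention()(ChannelAttention(64)(x))  # channel-then-spatial order
```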
arXiv Detail & Related papers (2022-05-18T08:01:38Z)
- ACDNet: Adaptively Combined Dilated Convolution for Monocular Panorama Depth Estimation [9.670696363730329]
We propose ACDNet, built on adaptively combined dilated convolutions, to predict the dense depth map for a monocular panoramic image; a sketch of the idea follows.
We conduct depth estimation experiments on three datasets (both virtual and real-world) and the experimental results demonstrate that our proposed ACDNet substantially outperforms the current state-of-the-art (SOTA) methods.
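A minimal sketch of the "adaptively combined dilated convolution" idea: parallel convolutions with different dilation rates, fused with learned weights. The branch count, dilation rates, and softmax gating are assumptions for illustration, not ACDNet's exact design.

```python
# Parallel dilated convolutions combined with learned mixing weights;
# an illustrative sketch, not ACDNet's published module.
import torch
import torch.nn as nn

class AdaptiveDilatedConv(nn.Module):
    def __init__(self, c_in: int, c_out: int, rates=(1, 2, 4, 8)):
        super().__init__()
        # padding == dilation keeps the spatial size for 3x3 kernels
        self.branches = nn.ModuleList(
            nn.Conv2d(c_in, c_out, 3, padding=r, dilation=r) for r in rates)
        self.logits = nn.Parameter(torch.zeros(len(rates)))  # learned mix
    def forward(self, x):
        w = torch.softmax(self.logits, dim=0)
        return sum(wi * b(x) for wi, b in zip(w, self.branches))

y = AdaptiveDilatedConv(64, 64)(torch.randn(1, 64, 32, 32))  # (1, 64, 32, 32)
```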
arXiv Detail & Related papers (2021-12-29T08:04:19Z)
- Efficient Depth Completion Using Learned Bases [94.0808155168311]
We propose a new global geometry constraint for depth completion.
By assuming that depth maps often lie on low-dimensional subspaces, a dense depth map can be approximated by a weighted sum of full-resolution principal depth bases.
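In symbols, and keeping only what the summary states (the basis count $K$ and the weight notation are ours), the approximation is

$$\hat{D}(u, v) \;\approx\; \sum_{k=1}^{K} w_k \, B_k(u, v),$$

where the $B_k$ are full-resolution principal depth bases and the $w_k$ are weights estimated from the sparse input, so completion reduces to regressing $K$ coefficients rather than a full-resolution depth map.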
arXiv Detail & Related papers (2020-12-02T11:57:37Z)
- View-consistent 4D Light Field Depth Estimation [37.04038603184669]
We propose a method to compute depth maps for every sub-aperture image in a light field in a view consistent way.
Our method precisely defines depth edges via epipolar plane images (EPIs), then diffuses these edges spatially within the central view.
arXiv Detail & Related papers (2020-09-09T01:47:34Z)
- Learning Geocentric Object Pose in Oblique Monocular Images [18.15647135620892]
An object's geocentric pose, defined as the height above ground and orientation with respect to gravity, is a powerful representation of real-world structure for object detection, segmentation, and localization tasks using RGBD images.
We develop an encoding of geocentric pose to address this challenge and train a deep network to compute the representation densely, supervised by publicly available airborne lidar.
We exploit these attributes to rectify oblique images and remove observed object parallax, dramatically improving localization accuracy and enabling accurate alignment of multiple images taken from very different oblique viewpoints.
arXiv Detail & Related papers (2020-07-01T20:06:19Z)
- Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
- OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved, lightweight deep neural networks for omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)
- Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, instead of relying on different views, we rely on depth-from-defocus cues.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)