PanoFormer: Panorama Transformer for Indoor 360° Depth Estimation
- URL: http://arxiv.org/abs/2203.09283v1
- Date: Thu, 17 Mar 2022 12:19:43 GMT
- Title: PanoFormer: Panorama Transformer for Indoor 360° Depth Estimation
- Authors: Zhijie Shen, Chunyu Lin, Kang Liao, Lang Nie, Zishuo Zheng, and Yao
Zhao
- Abstract summary: Existing panoramic depth estimation methods based on convolutional neural networks (CNNs) focus on removing panoramic distortions.
This paper proposes the panorama transformer (named PanoFormer) to estimate the depth in panorama images.
In particular, we divide patches on the spherical tangent domain into tokens to reduce the negative effect of panoramic distortions.
- Score: 35.698249161263966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing panoramic depth estimation methods based on convolutional neural
networks (CNNs) focus on removing panoramic distortions, failing to perceive
panoramic structures efficiently due to the fixed receptive field in CNNs. This
paper proposes the panorama transformer (named PanoFormer) to estimate the
depth in panorama images, with tangent patches from the spherical domain,
learnable token flows, and panorama-specific metrics. In particular, we divide
patches on
the spherical tangent domain into tokens to reduce the negative effect of
panoramic distortions. Since the geometric structures are essential for depth
estimation, a self-attention module is redesigned with an additional learnable
token flow. In addition, considering the characteristic of the spherical
domain, we present two panorama-specific metrics to comprehensively evaluate
the panoramic depth estimation models' performance. Extensive experiments
demonstrate that our approach significantly outperforms the state-of-the-art
(SOTA) methods. Furthermore, the proposed method can be effectively extended to
solve semantic panorama segmentation, a similar pixel2pixel task. Code will be
available.
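To make the tokenization idea concrete, the sketch below samples tangent-plane patches from an equirectangular panorama via gnomonic projection and flattens them into a token sequence. This is an illustrative approximation only: the patch size, the lon/lat token grid, and the nearest-neighbour lookup are assumptions, not the released PanoFormer implementation.

```python
# Hedged sketch: tangent-patch tokenization of an equirectangular panorama.
# Patch size, token grid, and nearest-neighbour sampling are assumptions.
import numpy as np

def gnomonic_patch(erp, lon0, lat0, patch=7, fov=np.pi / 18):
    """Sample one tangent-plane patch centred at (lon0, lat0) from an
    equirectangular image erp of shape (H, W, C) via inverse gnomonic projection."""
    H, W = erp.shape[:2]
    # Regular grid on the tangent plane spanning `fov` radians.
    half = np.tan(fov / 2)
    x, y = np.meshgrid(np.linspace(-half, half, patch), np.linspace(-half, half, patch))
    # Inverse gnomonic projection: tangent-plane (x, y) -> sphere (lon, lat).
    rho = np.maximum(np.sqrt(x ** 2 + y ** 2), 1e-12)
    c = np.arctan(rho)
    lat = np.arcsin(np.clip(np.cos(c) * np.sin(lat0)
                            + y * np.sin(c) * np.cos(lat0) / rho, -1.0, 1.0))
    lon = lon0 + np.arctan2(x * np.sin(c),
                            rho * np.cos(lat0) * np.cos(c) - y * np.sin(lat0) * np.sin(c))
    # Sphere (lon, lat) -> equirectangular pixel indices (nearest neighbour).
    col = np.floor((lon / (2 * np.pi) + 0.5) * W).astype(int) % W
    row = np.clip(((0.5 - lat / np.pi) * H).astype(int), 0, H - 1)
    return erp[row, col]

def tokenize(erp, grid_h=32, grid_w=64, patch=7):
    """Flatten tangent patches sampled on a regular lon/lat grid into tokens."""
    lats = np.linspace(np.pi / 2, -np.pi / 2, grid_h, endpoint=False) - np.pi / (2 * grid_h)
    lons = np.linspace(-np.pi, np.pi, grid_w, endpoint=False)
    tokens = [gnomonic_patch(erp, lon, lat, patch).reshape(-1)
              for lat in lats for lon in lons]
    return np.stack(tokens)  # (grid_h * grid_w, patch * patch * C)

# Example: a 512x1024 RGB panorama becomes 2048 tokens of length 7*7*3 = 147.
panorama = np.zeros((512, 1024, 3), dtype=np.float32)
print(tokenize(panorama).shape)  # (2048, 147)
```

Because each patch is sampled on its own tangent plane, tokens near the poles are not stretched the way raw equirectangular patches are, which is the motivation for this tokenization.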
Related papers
- Calibrating Panoramic Depth Estimation for Practical Localization and Mapping [20.621442016969976]
The absolute depth values of surrounding environments provide crucial cues for various assistive technologies, such as localization, navigation, and 3D structure estimation.
We propose that accurate depth estimated from panoramic images can serve as a powerful and light-weight input for a wide range of downstream tasks requiring 3D information.
arXiv Detail & Related papers (2023-08-27T04:50:05Z)
- PanoGRF: Generalizable Spherical Radiance Fields for Wide-baseline Panoramas [54.4948540627471]
We propose PanoGRF, Generalizable Spherical Radiance Fields for Wide-baseline Panoramas.
Unlike generalizable radiance fields trained on perspective images, PanoGRF avoids the information loss from panorama-to-perspective conversion.
Results on multiple panoramic datasets demonstrate that PanoGRF significantly outperforms state-of-the-art generalizable view synthesis methods.
arXiv Detail & Related papers (2023-06-02T13:35:07Z)
- ${S}^{2}$Net: Accurate Panorama Depth Estimation on Spherical Surface [4.649656275858966]
We propose an end-to-end deep network for monocular panorama depth estimation on a unit spherical surface.
Specifically, we project the feature maps extracted from equirectangular images onto a unit spherical surface sampled by uniformly distributed grids.
We propose a global cross-attention-based fusion module to fuse the feature maps from skip connections and enhance the ability to capture global context.
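As a rough illustration of this sphere-resampling step, the sketch below bilinearly samples an equirectangular feature map at near-uniform points on the unit sphere. The Fibonacci lattice and the sampling resolution are illustrative assumptions; ${S}^{2}$Net's actual spherical grid and fusion module may differ.

```python
# Hedged sketch: resample an equirectangular feature map onto near-uniform
# points on the unit sphere. The Fibonacci lattice is an illustrative choice.
import numpy as np

def fibonacci_sphere(n):
    """Near-uniform points on the unit sphere, returned as (lon, lat) in radians."""
    i = np.arange(n)
    golden = (1 + 5 ** 0.5) / 2
    lat = np.arcsin(1 - 2 * (i + 0.5) / n)                 # [-pi/2, pi/2]
    lon = (2 * np.pi * i / golden) % (2 * np.pi) - np.pi   # [-pi, pi)
    return lon, lat

def resample_to_sphere(feat, n_points=4096):
    """Bilinearly sample an equirectangular feature map (H, W, C)
    at near-uniform spherical points; returns (n_points, C)."""
    H, W, _ = feat.shape
    lon, lat = fibonacci_sphere(n_points)
    x = (lon / (2 * np.pi) + 0.5) * W        # continuous column coordinate
    y = (0.5 - lat / np.pi) * H              # continuous row coordinate
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    wx, wy = (x - x0)[:, None], (y - y0)[:, None]
    x0, x1 = x0 % W, (x0 + 1) % W            # wrap around the longitude seam
    y0, y1 = np.clip(y0, 0, H - 1), np.clip(y0 + 1, 0, H - 1)
    return ((1 - wx) * (1 - wy) * feat[y0, x0] + wx * (1 - wy) * feat[y0, x1]
            + (1 - wx) * wy * feat[y1, x0] + wx * wy * feat[y1, x1])

# Example: a 64x128 feature map with 32 channels -> 4096 spherical samples.
feat = np.random.rand(64, 128, 32).astype(np.float32)
print(resample_to_sphere(feat).shape)        # (4096, 32)
```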
arXiv Detail & Related papers (2023-01-14T07:39:15Z)
- PanoViT: Vision Transformer for Room Layout Estimation from a Single Panoramic Image [11.053777620735175]
PanoViT is a panorama vision transformer that estimates the room layout from a single panoramic image.
Compared to CNN models, our PanoViT is more proficient in learning global information from the panoramic image.
Our method outperforms state-of-the-art solutions in room layout prediction accuracy.
arXiv Detail & Related papers (2022-12-23T05:37:11Z)
- SphereDepth: Panorama Depth Estimation from Spherical Domain [17.98608948955211]
This paper proposes SphereDepth, a novel panorama depth estimation method.
It predicts the depth directly on the spherical mesh without projection preprocessing.
It achieves comparable results with the state-of-the-art methods of panorama depth estimation.
arXiv Detail & Related papers (2022-08-29T16:50:19Z)
- Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning [93.6645991946674]
We introduce panoramic panoptic segmentation as the most holistic form of scene understanding.
A complete surrounding understanding provides a maximum of information to a mobile agent.
We propose a framework which allows model training on standard pinhole images and transfers the learned features to a different domain.
arXiv Detail & Related papers (2022-06-21T20:07:15Z)
- ACDNet: Adaptively Combined Dilated Convolution for Monocular Panorama Depth Estimation [9.670696363730329]
We propose ACDNet, based on adaptively combined dilated convolution, to predict the dense depth map for a monocular panoramic image.
We conduct depth estimation experiments on three datasets (both virtual and real-world) and the experimental results demonstrate that our proposed ACDNet substantially outperforms the current state-of-the-art (SOTA) methods.
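A minimal sketch of an "adaptively combined dilated convolution" block in the spirit of this summary is given below: parallel dilated convolutions merged with learned softmax weights. The dilation rates and the scalar weighting scheme are illustrative assumptions rather than ACDNet's exact design.

```python
# Hedged sketch: parallel dilated convolutions combined with learned weights.
# Dilation rates and the scalar weighting are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveDilatedConv(nn.Module):
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4, 8)):
        super().__init__()
        # One 3x3 convolution per dilation rate; padding keeps the spatial size.
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=d, dilation=d) for d in dilations
        )
        # One learnable scalar weight per branch, normalised with softmax.
        self.logits = nn.Parameter(torch.zeros(len(dilations)))

    def forward(self, x):
        w = F.softmax(self.logits, dim=0)                   # (num_branches,)
        outs = torch.stack([b(x) for b in self.branches])   # (branches, N, C, H, W)
        return (w.view(-1, 1, 1, 1, 1) * outs).sum(dim=0)   # weighted combination

# Example: combine multi-scale context on a panoramic feature map.
x = torch.randn(1, 64, 128, 256)
print(AdaptiveDilatedConv(64, 64)(x).shape)  # torch.Size([1, 64, 128, 256])
```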
arXiv Detail & Related papers (2021-12-29T08:04:19Z)
- Image Inpainting with Edge-guided Learnable Bidirectional Attention Maps [85.67745220834718]
We present an edge-guided learnable bidirectional attention map (Edge-LBAM) for improving image inpainting of irregular holes.
Our Edge-LBAM method contains dual procedures, including structure-aware mask-updating guided by predicted edges.
Extensive experiments show that our Edge-LBAM is effective in generating coherent image structures and preventing color discrepancy and blurriness.
arXiv Detail & Related papers (2021-04-25T07:25:16Z)
- Light Field Reconstruction Using Convolutional Network on EPI and Extended Applications [78.63280020581662]
A novel convolutional neural network (CNN)-based framework is developed for light field reconstruction from a sparse set of views.
We demonstrate the high performance and robustness of the proposed framework compared with state-of-the-art algorithms.
arXiv Detail & Related papers (2021-03-24T08:16:32Z)
- Panoramic Panoptic Segmentation: Towards Complete Surrounding Understanding via Unsupervised Contrastive Learning [97.37544023666833]
We introduce panoramic panoptic segmentation as the most holistic form of scene understanding.
A complete surrounding understanding provides a maximum of information to the agent.
We propose a framework which allows model training on standard pinhole images and transfers the learned features to a different domain.
arXiv Detail & Related papers (2021-03-01T09:37:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.