GPR-Net: Multi-view Layout Estimation via a Geometry-aware Panorama
Registration Network
- URL: http://arxiv.org/abs/2210.11419v2
- Date: Fri, 21 Oct 2022 14:26:37 GMT
- Authors: Jheng-Wei Su, Chi-Han Peng, Peter Wonka, Hung-Kuo Chu
- Abstract summary: We present a complete panoramic layout estimation framework that jointly learns panorama registration and layout estimation given a pair of panoramas.
The major improvement over PSMNet comes from a novel Geometry-aware Panorama Registration Network or GPR-Net.
Experimental results indicate that our method achieves state-of-the-art performance in both panorama registration and layout estimation on a large-scale indoor panorama dataset ZInD.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reconstructing 3D layouts from multiple $360^{\circ}$ panoramas has received
increasing attention recently as estimating a complete layout of a large-scale
and complex room from a single panorama is very difficult. The state-of-the-art
method, called PSMNet, introduces the first learning-based framework that
jointly estimates the room layout and registration given a pair of panoramas.
However, PSMNet relies on an approximate (i.e., "noisy") registration as input.
Obtaining this input requires a solution for wide baseline registration which
is a challenging problem. In this work, we present a complete multi-view
panoramic layout estimation framework that jointly learns panorama registration
and layout estimation given a pair of panoramas without relying on a pose
prior. The major improvement over PSMNet comes from a novel Geometry-aware
Panorama Registration Network or GPR-Net that effectively tackles the wide
baseline registration problem by exploiting the layout geometry and computing
fine-grained correspondences on the layout boundaries instead of in the
global pixel space. Our architecture consists of two parts. First, given two
panoramas, we adopt a vision transformer to learn a set of 1D horizon features
sampled on the panorama. These 1D horizon features encode the depths of
individual layout boundary samples and the correspondence and covisibility maps
between layout boundaries. We then exploit a non-linear registration module to
convert these 1D horizon features into a set of corresponding 2D boundary
points on the layout. Finally, we estimate the final relative camera pose via
RANSAC and obtain the complete layout simply by taking the union of registered
layouts. Experimental results indicate that our method achieves
state-of-the-art performance in both panorama registration and layout
estimation on a large-scale indoor panorama dataset ZInD.
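As a minimal illustration of the final step above — estimating the relative camera pose from corresponding 2D boundary points via RANSAC — the sketch below fits a 2D rigid transform (rotation plus translation) to correspondences with outlier rejection. This is an assumption-laden sketch, not the paper's implementation: the helper names `fit_rigid_2d` and `ransac_pose` are hypothetical, and the actual registration may differ in parameterization (e.g., handling of scale or covisibility weights).

```python
import numpy as np

def fit_rigid_2d(src, dst):
    """Least-squares 2D rotation + translation (Kabsch) mapping src -> dst."""
    sc, dc = src.mean(0), dst.mean(0)
    H = (src - sc).T @ (dst - dc)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dc - R @ sc
    return R, t

def ransac_pose(src, dst, iters=500, thresh=0.05, rng=None):
    """RANSAC over 2-point minimal samples; refits the pose on the inliers.

    src, dst: (N, 2) arrays of corresponding boundary points (hypothetical
    output of a boundary-correspondence network, not the paper's API)."""
    rng = rng or np.random.default_rng(0)
    n = len(src)
    best_inliers = np.zeros(n, dtype=bool)
    for _ in range(iters):
        idx = rng.choice(n, size=2, replace=False)
        R, t = fit_rigid_2d(src[idx], dst[idx])
        err = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_rigid_2d(src[best_inliers], dst[best_inliers])
```

With exact correspondences plus a few corrupted ones, the recovered rotation and translation match the ground-truth pose; in practice the threshold would be tuned to the noise level of the predicted boundaries.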
Related papers
- DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion [60.45000652592418]
We propose a novel text-driven panoramic generation framework, DiffPano, to achieve scalable, consistent, and diverse panoramic scene generation.
We show that DiffPano can generate consistent, diverse panoramic images with given unseen text descriptions and camera poses.
arXiv Detail & Related papers (2024-10-31T17:57:02Z)
- Multi-source Domain Adaptation for Panoramic Semantic Segmentation [22.367890439050786]
We propose a new task of multi-source domain adaptation for panoramic semantic segmentation.
We aim to utilize both real pinhole and synthetic panoramic images in the source domains, enabling the segmentation model to perform well on unlabeled real panoramic images.
DTA4PASS converts all pinhole images in the source domains into panoramic-like images, and then aligns the converted source domains with the target domain.
arXiv Detail & Related papers (2024-08-29T12:00:11Z)
- Pano2Room: Novel View Synthesis from a Single Indoor Panorama [20.262621556667852]
Pano2Room is designed to automatically reconstruct high-quality 3D indoor scenes from a single panoramic image.
The key idea is to initially construct a preliminary mesh from the input panorama, and iteratively refine this mesh using a panoramic RGBD inpainter.
The refined mesh is converted into a 3D Gaussian Splatting field and trained with the collected pseudo novel views.
arXiv Detail & Related papers (2024-08-21T08:19:12Z)
- Scaled 360 layouts: Revisiting non-central panoramas [5.2178708158547025]
We present a novel approach for 3D layout recovery of indoor environments using single non-central panoramas.
We exploit the properties of non-central projection systems in a new geometrical processing to recover the scaled layout.
arXiv Detail & Related papers (2024-02-02T14:55:36Z)
- 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
- PanoGRF: Generalizable Spherical Radiance Fields for Wide-baseline Panoramas [54.4948540627471]
We propose PanoGRF, Generalizable Spherical Radiance Fields for Wide-baseline Panoramas.
Unlike generalizable radiance fields trained on perspective images, PanoGRF avoids the information loss from panorama-to-perspective conversion.
Results on multiple panoramic datasets demonstrate that PanoGRF significantly outperforms state-of-the-art generalizable view synthesis methods.
arXiv Detail & Related papers (2023-06-02T13:35:07Z)
- PanoViT: Vision Transformer for Room Layout Estimation from a Single Panoramic Image [11.053777620735175]
PanoViT is a panorama vision transformer to estimate the room layout from a single panoramic image.
Compared to CNN models, our PanoViT is more proficient in learning global information from the panoramic image.
Our method outperforms state-of-the-art solutions in room layout prediction accuracy.
arXiv Detail & Related papers (2022-12-23T05:37:11Z)
- MVLayoutNet: 3D layout reconstruction with multi-view panoramas [12.981269280023469]
MVLayoutNet is an end-to-end network for holistic 3D reconstruction from multi-view panoramas.
We jointly train a layout module to produce an initial layout and a novel MVS module to obtain accurate layout geometry.
Our method leads to coherent layout geometry that enables the reconstruction of an entire scene.
arXiv Detail & Related papers (2021-12-12T03:04:32Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable.
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
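The horizon-line formulation above — one depth per panorama column, from which the floor boundary follows — can be sketched as below. The assumed geometry is not taken from the paper: a gravity-aligned camera with column u of a W-column panorama mapped to azimuth 2πu/W, and the function name is hypothetical. The reverse direction (rendering a layout back to per-column depths) is the conversion LED2-Net makes differentiable.

```python
import numpy as np

def horizon_depths_to_boundary(depths):
    """Map per-column horizon depths d(theta) to 2D floor-boundary points.

    Assumes column u of a W-column panorama corresponds to azimuth
    theta = 2*pi*u / W around a gravity-aligned camera at the origin."""
    W = len(depths)
    theta = 2 * np.pi * np.arange(W) / W
    return np.stack([depths * np.cos(theta), depths * np.sin(theta)], axis=1)
```

For a constant depth profile the recovered boundary is a circle around the camera; a real room produces a piecewise profile whose rendered points trace the layout polygon.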
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- Panoramic Panoptic Segmentation: Towards Complete Surrounding Understanding via Unsupervised Contrastive Learning [97.37544023666833]
We introduce panoramic panoptic segmentation as the most holistic scene understanding task.
A complete surrounding understanding provides a maximum of information to the agent.
We propose a framework which allows model training on standard pinhole images and transfers the learned features to a different domain.
arXiv Detail & Related papers (2021-03-01T09:37:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.