Disentangling Orthogonal Planes for Indoor Panoramic Room Layout
Estimation with Cross-Scale Distortion Awareness
- URL: http://arxiv.org/abs/2303.00971v2
- Date: Sat, 4 Mar 2023 02:48:14 GMT
- Title: Disentangling Orthogonal Planes for Indoor Panoramic Room Layout
Estimation with Cross-Scale Distortion Awareness
- Authors: Zhijie Shen, Zishuo Zheng, Chunyu Lin, Lang Nie, Kang Liao, Shuai
Zheng and Yao Zhao
- Abstract summary: We propose to disentangle the 1D representation by pre-segmenting orthogonal planes from a complex scene.
Considering the symmetry between the floor boundary and ceiling boundary, we also design a soft-flipping fusion strategy.
Experiments on four popular benchmarks demonstrate our superiority over existing SoTA solutions.
- Score: 38.096482841789275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Based on the Manhattan World assumption, most existing indoor layout
estimation schemes focus on recovering layouts from vertically compressed 1D
sequences. However, the compression procedure confuses the semantics of
different planes, yielding inferior performance with ambiguous
interpretability.
To address this issue, we propose to disentangle this 1D representation by
pre-segmenting orthogonal (vertical and horizontal) planes from a complex
scene, explicitly capturing the geometric cues for indoor layout estimation.
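A minimal sketch of this disentangling idea (not the authors' implementation; the module and its mask head are illustrative assumptions): predict soft masks for vertical and horizontal planes, then compress each masked feature map into its own 1D per-column sequence instead of mixing both plane types in one sequence.

```python
import torch
import torch.nn as nn

class OrthogonalPlaneDisentangler(nn.Module):
    """Predict soft masks for vertical (wall) and horizontal (floor/ceiling)
    planes, then compress each masked feature map into its own 1D sequence."""
    def __init__(self, channels: int):
        super().__init__()
        # A 1x1 conv head producing two competing plane masks.
        self.mask_head = nn.Conv2d(channels, 2, kernel_size=1)

    def forward(self, feat: torch.Tensor):
        # feat: (B, C, H, W) features of an equirectangular panorama.
        masks = torch.softmax(self.mask_head(feat), dim=1)  # (B, 2, H, W)
        vertical = feat * masks[:, 0:1]    # wall evidence only
        horizontal = feat * masks[:, 1:2]  # floor/ceiling evidence only
        # Vertical compression: one token per image column, but the two
        # plane types are now compressed separately instead of entangled.
        return vertical.mean(dim=2), horizontal.mean(dim=2)  # 2 x (B, C, W)

feat = torch.randn(1, 64, 128, 256)
seq_v, seq_h = OrthogonalPlaneDisentangler(64)(feat)
print(seq_v.shape, seq_h.shape)  # torch.Size([1, 64, 256]) twice
```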
Considering the symmetry between the floor boundary and ceiling boundary, we
also design a soft-flipping fusion strategy to assist the pre-segmentation.
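One plausible reading of the soft-flipping strategy (hedged; the paper's exact fusion may differ): flip the feature map top-to-bottom so floor rows align with ceiling rows, then mix in the flipped evidence through a learned gate rather than a hard copy.

```python
import torch
import torch.nn as nn

class SoftFlipFusion(nn.Module):
    """Exploit floor/ceiling symmetry: a vertical flip maps the floor
    boundary region onto the ceiling boundary region (and vice versa)."""
    def __init__(self, channels: int):
        super().__init__()
        # Per-position gate controlling how much flipped evidence to admit.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        flipped = torch.flip(feat, dims=[2])  # flip along image height
        g = self.gate(torch.cat([feat, flipped], dim=1))
        return feat + g * flipped  # "soft" fusion: gated, not a hard swap

fused = SoftFlipFusion(64)(torch.randn(1, 64, 128, 256))
```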
In addition, we present a feature assembling mechanism to effectively integrate
shallow and deep features with distortion distribution awareness.
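A hedged sketch of cross-scale fusion with distortion awareness (the cosine-latitude prior below is an assumption standing in for whatever distribution the paper uses): equirectangular distortion grows toward the poles, so rows are reweighted by cos(latitude) after shallow and deep features are merged.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistortionAwareAssembly(nn.Module):
    def __init__(self, channels: int, height: int):
        super().__init__()
        # cos(latitude) shrinks toward the poles, mirroring how strongly an
        # equirectangular row is stretched relative to the equator.
        lat = torch.linspace(-math.pi / 2, math.pi / 2, height)
        self.register_buffer("row_prior", torch.cos(lat).view(1, 1, height, 1))
        self.mix = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
        # Bring coarse deep features up to the shallow resolution, merge,
        # then damp rows in proportion to their distortion.
        deep = F.interpolate(deep, size=shallow.shape[2:], mode="bilinear",
                             align_corners=False)
        fused = self.mix(torch.cat([shallow, deep], dim=1))
        return fused * self.row_prior
```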
To compensate for the potential errors in pre-segmentation, we further leverage
triple attention to reconstruct the disentangled sequences for better performance.
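The abstract does not spell out the triple-attention design, so the following is only a generic sketch of the reconstruction step: each disentangled 1D sequence (e.g. wall, floor-boundary, and ceiling-boundary tokens, an illustrative split) attends over all three sequences, letting the others correct pre-segmentation errors.

```python
import torch
import torch.nn as nn

class TripleAttentionReconstruction(nn.Module):
    """Each sequence queries the concatenation of all sequences; a residual
    connection preserves the original signal while errors are repaired."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, seqs: list) -> list:
        # seqs: three (B, W, C) sequences sharing the same embedding dim.
        kv = torch.cat(seqs, dim=1)  # shared keys and values
        return [q + self.attn(q, kv, kv)[0] for q in seqs]

seqs = [torch.randn(1, 256, 64) for _ in range(3)]
out = TripleAttentionReconstruction(64)(seqs)
print([o.shape for o in out])  # three (1, 256, 64) tensors
```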
Experiments on four popular benchmarks demonstrate our superiority over
existing SoTA solutions, especially on the 3DIoU metric. The code is available
at \url{https://github.com/zhijieshen-bjtu/DOPNet}.
Related papers
- 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
- Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction [82.72686460985297]
We tackle the problem of estimating a Manhattan frame.
We derive two new 2-line solvers, one of which does not suffer from singularities affecting existing solvers.
We also design a new non-minimal method, running on an arbitrary number of lines, to boost the performance in local optimization.
arXiv Detail & Related papers (2023-08-21T13:03:25Z)
- Parallax-Tolerant Unsupervised Deep Image Stitching [57.76737888499145]
We propose UDIS++, a parallax-tolerant unsupervised deep image stitching technique.
First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion.
To further eliminate parallax artifacts, we propose to composite the stitched image seamlessly via unsupervised learning of seam-driven composition masks.
arXiv Detail & Related papers (2023-02-16T10:40:55Z)
- PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes [41.517947010531074]
Depth estimation based on multiple near-fronto-parallel planes has demonstrated impressive results in self-supervised monocular depth estimation (MDE).
We propose PlaneDepth, a novel plane-based representation that includes vertical planes and ground planes.
Our method can extract the ground plane in an unsupervised manner, which is important for autonomous driving.
arXiv Detail & Related papers (2022-10-04T13:51:59Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- Transferable End-to-end Room Layout Estimation via Implicit Encoding [34.99591465853653]
We study the problem of estimating room layouts from a single panorama image.
We propose an end-to-end method that directly predicts parametric layouts from an input panorama image.
arXiv Detail & Related papers (2021-12-21T16:33:14Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable.
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
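The layout-to-depth conversion that LED2-Net renders differentiably has a simple closed form for the floor boundary; a minimal NumPy sketch (the function name and the 1.6 m camera height are illustrative, not from the paper):

```python
import numpy as np

def floor_boundary_to_depth(v_floor: np.ndarray, img_h: int,
                            cam_h: float = 1.6) -> np.ndarray:
    """Horizontal depth on the horizon line from a floor-boundary row index
    per panorama column (equirectangular image, camera at height cam_h).
    Row v maps to latitude theta = (0.5 - v / img_h) * pi; the floor
    boundary lies below the horizon, so theta < 0 there."""
    theta = (0.5 - v_floor / img_h) * np.pi
    # Right triangle: camera height over the depression angle to the floor.
    return cam_h / np.tan(-theta)

# Example: boundary at row 400 of a 512-row panorama.
print(floor_boundary_to_depth(np.array([400.0]), 512))  # ~1.31 m
```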
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- Plane Pair Matching for Efficient 3D View Registration [7.920114031312631]
We present a novel method to estimate the motion matrix between overlapping pairs of 3D views in the context of indoor scenes.
We use the Manhattan world assumption to introduce lightweight geometric constraints, in the form of plane quadruplets, into the problem.
We validate our approach on a toy example and present quantitative experiments on a public RGB-D dataset, comparing against recent state-of-the-art methods.
arXiv Detail & Related papers (2020-01-20T11:15:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.