OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas
- URL: http://arxiv.org/abs/2104.09403v1
- Date: Mon, 19 Apr 2021 15:44:10 GMT
- Title: OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas
- Authors: Shivansh Rao and Vikas Kumar and Daniel Kifer and Lee Giles and Ankur Mali
- Abstract summary: Given a single RGB panorama, the goal of 3D layout reconstruction is to estimate the room layout by predicting the corners, floor boundary, and ceiling boundary.
A common approach has been to use standard convolutional networks to predict the corners and boundaries, followed by post-processing to generate the 3D layout.
We propose to use spherical convolutions, which perform convolutions directly on the sphere surface, sampling according to inverse equirectangular projection.
- Score: 16.38156002774853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a single RGB panorama, the goal of 3D layout reconstruction is to
estimate the room layout by predicting the corners, floor boundary, and ceiling
boundary. A common approach has been to use standard convolutional networks to
predict the corners and boundaries, followed by post-processing to generate the
3D layout. However, the space-varying distortions in panoramic images are not
compatible with the translational equivariance property of standard
convolutions, thus degrading performance. Instead, we propose to use spherical
convolutions. The resulting network, which we call OmniLayout, performs
convolutions directly on the sphere surface, sampling according to inverse
equirectangular projection, and is hence invariant to equirectangular
distortions. Using a new evaluation metric, we show that our network reduces
the error in the heavily distorted regions (near the poles) by approximately
25% when compared to standard convolutional networks. Experimental results
show that OmniLayout outperforms the state-of-the-art by approximately 4% on
two different benchmark datasets (PanoContext and Stanford 2D-3D). Code is
available at
https://github.com/rshivansh/OmniLayout.
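To make the sampling scheme concrete, here is a minimal Python sketch (not the authors' released code; see the GitHub link above for that) of how a distortion-aware 3x3 sampling grid can be built for an equirectangular panorama. It uses the inverse gnomonic projection for the tangent-plane taps, in the spirit of SphereNet-style spherical convolutions: each kernel is laid out on the tangent plane at a pixel's latitude/longitude and projected back onto the image, so receptive fields stay uniform on the sphere instead of stretching near the poles. The helper name `sphere_kernel_grid`, the 3x3 kernel, and the `fov` tap spacing are illustrative assumptions.

```python
import numpy as np

def sphere_kernel_grid(h, w, fov=np.pi / 64):
    """Hypothetical helper: (h, w, 3, 3, 2) panorama sampling coords (row, col)."""
    # latitude/longitude of every pixel centre in the equirectangular image
    lat = (0.5 - (np.arange(h) + 0.5) / h) * np.pi          # +pi/2 (top) .. -pi/2
    lon = ((np.arange(w) + 0.5) / w - 0.5) * 2.0 * np.pi    # -pi .. +pi
    lat, lon = np.meshgrid(lat, lon, indexing="ij")
    lat, lon = lat[..., None, None], lon[..., None, None]
    # 3x3 kernel taps on the tangent plane, spanning roughly `fov` radians
    d = np.tan(fov)
    x, y = np.meshgrid([-d, 0.0, d], [d, 0.0, -d])
    rho = np.sqrt(x ** 2 + y ** 2)
    nu = np.arctan(rho)                                     # angular distance of each tap
    rho = np.where(rho == 0.0, 1e-12, rho)                  # guard the centre tap
    # inverse gnomonic projection: tangent-plane tap -> (lat, lon) on the sphere
    lat_k = np.arcsin(np.clip(
        np.cos(nu) * np.sin(lat) + y * np.sin(nu) * np.cos(lat) / rho, -1.0, 1.0))
    lon_k = lon + np.arctan2(
        x * np.sin(nu),
        rho * np.cos(lat) * np.cos(nu) - y * np.sin(lat) * np.sin(nu))
    # back to (fractional) equirectangular pixel coordinates
    rows = (0.5 - lat_k / np.pi) * h - 0.5
    cols = ((lon_k / (2.0 * np.pi) + 0.5) % 1.0) * w - 0.5  # wrap longitude
    return np.stack([rows, cols], axis=-1)
```

The returned coordinates can be normalized to [-1, 1] and passed to a bilinear sampler such as `torch.nn.functional.grid_sample`, after which an ordinary convolution consumes the gathered taps.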
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
- Parcel3D: Shape Reconstruction from Single RGB Images for Applications in Transportation Logistics [62.997667081978825]
We focus on enabling damage and tampering detection in logistics and tackle the problem of 3D shape reconstruction of potentially damaged parcels.
We present a novel synthetic dataset, named Parcel3D, that is based on the Google Scanned Objects (GSO) dataset.
We present a novel architecture called CubeRefine R-CNN, which combines estimating a 3D bounding box with an iterative mesh refinement.
arXiv Detail & Related papers (2023-04-18T13:55:51Z)
- Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness [38.096482841789275]
We propose to disentangle 1D representation by pre-segmenting planes from a complex scene.
Considering the symmetry between the floor boundary and ceiling boundary, we also design a soft-flipping fusion strategy.
Experiments on four popular benchmarks demonstrate our superiority over existing SoTA solutions.
arXiv Detail & Related papers (2023-03-02T05:10:23Z)
- 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform [17.51123287432334]
We present an alternative approach to estimate the walls in 3D space by modeling long-range geometric patterns in a learnable Hough Transform block.
We transform the image feature from a cubemap tile to the Hough space of a Manhattan world and directly map the feature to geometric output.
The convolutional layers not only learn the local gradient-like line features, but also utilize the global information to successfully predict occluded walls with a simple network structure.
arXiv Detail & Related papers (2022-07-19T14:22:28Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- Transferable End-to-end Room Layout Estimation via Implicit Encoding [34.99591465853653]
We study the problem of estimating room layouts from a single panorama image.
We propose an end-to-end method that directly predicts parametric layouts from an input panorama image.
arXiv Detail & Related papers (2021-12-21T16:33:14Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable (a sketch of this layout-to-depth relation follows the list below).
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- Learning Deformable Tetrahedral Meshes for 3D Reconstruction [78.0514377738632]
3D shape representations that accommodate learning-based 3D reconstruction are an open problem in machine learning and computer graphics.
Previous work on neural 3D reconstruction demonstrated benefits, but also limitations, of point cloud, voxel, surface mesh, and implicit function representations.
We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes volumetric tetrahedral meshes for the reconstruction problem.
arXiv Detail & Related papers (2020-11-03T02:57:01Z)
- Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild [96.09941542587865]
We present a novel 3D pose refinement approach based on differentiable rendering for objects of arbitrary categories in the wild.
In this way, we precisely align 3D models to objects in RGB images which results in significantly improved 3D pose estimates.
We evaluate our approach on the challenging Pix3D dataset and achieve up to 55% relative improvement compared to state-of-the-art refinement methods in multiple metrics.
arXiv Detail & Related papers (2020-07-17T12:34:38Z)
- General 3D Room Layout from a Single View by Render-and-Compare [36.94817376590415]
We present a novel method to reconstruct the 3D layout of a room from a single perspective view.
Our dataset consists of 293 images from ScanNet, which we annotated with precise 3D layouts.
arXiv Detail & Related papers (2020-01-07T16:14:00Z)
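As referenced in the LED2-Net entry above, the layout-to-depth conversion it describes rests on simple panorama geometry: a floor-boundary point at latitude phi below the horizon, seen from a camera at height h above the floor, lies at horizontal distance h / tan(-phi). Below is a minimal sketch of that differentiable relation, assuming an axis-aligned camera; the function name, the default camera height, and the argument conventions are illustrative, not the paper's API.

```python
import torch

def horizon_depth_from_floor_boundary(lat_floor: torch.Tensor,
                                      cam_height: float = 1.6) -> torch.Tensor:
    """Per-column horizontal distance to the wall, from the floor-boundary
    latitude (radians, negative below the horizon) of a panorama.
    Differentiable in `lat_floor`, so a depth loss can supervise the layout.
    `cam_height` (metres) is an assumed camera height, not a paper constant."""
    return cam_height / torch.tan(-lat_floor)

# usage: 1024 columns of a predicted boundary, all 30 degrees below the horizon
lat = torch.full((1024,), -torch.pi / 6, requires_grad=True)
depth = horizon_depth_from_floor_boundary(lat)  # ~2.77 m for cam_height=1.6
depth.sum().backward()                          # gradients flow back to the layout
```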