OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas
- URL: http://arxiv.org/abs/2104.09403v1
- Date: Mon, 19 Apr 2021 15:44:10 GMT
- Title: OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas
- Authors: Shivansh Rao and Vikas Kumar and Daniel Kifer and Lee Giles and Ankur Mali
- Abstract summary: Given a single RGB panorama, the goal of 3D layout reconstruction is to estimate the room layout by predicting the corners, floor boundary, and ceiling boundary.
A common approach has been to use standard convolutional networks to predict the corners and boundaries, followed by post-processing to generate the 3D layout.
We propose to use spherical convolutions, which perform convolutions directly on the sphere surface, sampling according to inverse equirectangular projection.
- Score: 16.38156002774853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a single RGB panorama, the goal of 3D layout reconstruction is to
estimate the room layout by predicting the corners, floor boundary, and ceiling
boundary. A common approach has been to use standard convolutional networks to
predict the corners and boundaries, followed by post-processing to generate the
3D layout. However, the space-varying distortions in panoramic images are not
compatible with the translational equivariance property of standard
convolutions, thus degrading performance. Instead, we propose to use spherical
convolutions. The resulting network, which we call OmniLayout, performs
convolutions directly on the sphere surface, sampling according to inverse
equirectangular projection, and is hence invariant to equirectangular
distortions. Using a new evaluation metric, we show that our network reduces
the error in the heavily distorted regions (near the poles) by approximately
25% when compared to standard convolutional networks. Experimental results
show that OmniLayout outperforms the state-of-the-art by approximately 4% on
two different benchmark datasets (PanoContext and Stanford 2D-3D). Code is
available at
https://github.com/rshivansh/OmniLayout.
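To make the sampling scheme concrete, here is a minimal Python sketch (not the authors' released code; see the GitHub link above for that) of how a distortion-aware 3x3 sampling grid can be built for an equirectangular panorama. It uses the inverse gnomonic projection for the tangent-plane taps, in the spirit of SphereNet-style spherical convolutions: each kernel is laid out on the tangent plane at a pixel's latitude/longitude and projected back onto the image, so receptive fields stay uniform on the sphere instead of stretching near the poles. The helper name `sphere_kernel_grid`, the 3x3 kernel, and the `fov` tap spacing are illustrative assumptions.

```python
import numpy as np

def sphere_kernel_grid(h, w, fov=np.pi / 64):
    """Hypothetical helper: (h, w, 3, 3, 2) panorama sampling coords (row, col)."""
    # latitude/longitude of every pixel centre in the equirectangular image
    lat = (0.5 - (np.arange(h) + 0.5) / h) * np.pi          # +pi/2 (top) .. -pi/2
    lon = ((np.arange(w) + 0.5) / w - 0.5) * 2.0 * np.pi    # -pi .. +pi
    lat, lon = np.meshgrid(lat, lon, indexing="ij")
    lat, lon = lat[..., None, None], lon[..., None, None]
    # 3x3 kernel taps on the tangent plane, spanning roughly `fov` radians
    d = np.tan(fov)
    x, y = np.meshgrid([-d, 0.0, d], [d, 0.0, -d])
    rho = np.sqrt(x ** 2 + y ** 2)
    nu = np.arctan(rho)                                     # angular distance of each tap
    rho = np.where(rho == 0.0, 1e-12, rho)                  # guard the centre tap
    # inverse gnomonic projection: tangent-plane tap -> (lat, lon) on the sphere
    lat_k = np.arcsin(np.clip(
        np.cos(nu) * np.sin(lat) + y * np.sin(nu) * np.cos(lat) / rho, -1.0, 1.0))
    lon_k = lon + np.arctan2(
        x * np.sin(nu),
        rho * np.cos(lat) * np.cos(nu) - y * np.sin(lat) * np.sin(nu))
    # back to (fractional) equirectangular pixel coordinates
    rows = (0.5 - lat_k / np.pi) * h - 0.5
    cols = ((lon_k / (2.0 * np.pi) + 0.5) % 1.0) * w - 0.5  # wrap longitude
    return np.stack([rows, cols], axis=-1)
```

The returned coordinates can be normalized to [-1, 1] and passed to a bilinear sampler such as `torch.nn.functional.grid_sample`, after which an ordinary convolution consumes the gathered taps.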
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
- Parcel3D: Shape Reconstruction from Single RGB Images for Applications in Transportation Logistics [62.997667081978825]
We focus on enabling damage and tampering detection in logistics and tackle the problem of 3D shape reconstruction of potentially damaged parcels.
We present a novel synthetic dataset, named Parcel3D, that is based on the Google Scanned Objects (GSO) dataset.
We present a novel architecture called CubeRefine R-CNN, which combines estimating a 3D bounding box with an iterative mesh refinement.
arXiv Detail & Related papers (2023-04-18T13:55:51Z)
- Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness [38.096482841789275]
We propose to disentangle 1D representation by pre-segmenting planes from a complex scene.
Considering the symmetry between the floor boundary and ceiling boundary, we also design a soft-flipping fusion strategy.
Experiments on four popular benchmarks demonstrate our superiority over existing SoTA solutions.
arXiv Detail & Related papers (2023-03-02T05:10:23Z)
- 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform [17.51123287432334]
We present an alternative approach to estimate the walls in 3D space by modeling long-range geometric patterns in a learnable Hough Transform block.
We transform the image feature from a cubemap tile to the Hough space of a Manhattan world and directly map the feature to geometric output.
The convolutional layers not only learn the local gradient-like line features, but also utilize the global information to successfully predict occluded walls with a simple network structure.
arXiv Detail & Related papers (2022-07-19T14:22:28Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- Transferable End-to-end Room Layout Estimation via Implicit Encoding [34.99591465853653]
We study the problem of estimating room layouts from a single panorama image.
We propose an end-to-end method that directly predicts parametric layouts from an input panorama image.
arXiv Detail & Related papers (2021-12-21T16:33:14Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable (a sketch of this layout-to-depth relation follows the list below).
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- Learning Deformable Tetrahedral Meshes for 3D Reconstruction [78.0514377738632]
3D shape representations that accommodate learning-based 3D reconstruction are an open problem in machine learning and computer graphics.
Previous work on neural 3D reconstruction demonstrated benefits, but also limitations, of point cloud, voxel, surface mesh, and implicit function representations.
We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes volumetric tetrahedral meshes for the reconstruction problem.
arXiv Detail & Related papers (2020-11-03T02:57:01Z)
- Geometric Correspondence Fields: Learned Differentiable Rendering for 3D Pose Refinement in the Wild [96.09941542587865]
We present a novel 3D pose refinement approach based on differentiable rendering for objects of arbitrary categories in the wild.
In this way, we precisely align 3D models to objects in RGB images which results in significantly improved 3D pose estimates.
We evaluate our approach on the challenging Pix3D dataset and achieve up to 55% relative improvement compared to state-of-the-art refinement methods in multiple metrics.
arXiv Detail & Related papers (2020-07-17T12:34:38Z)
- General 3D Room Layout from a Single View by Render-and-Compare [36.94817376590415]
We present a novel method to reconstruct the 3D layout of a room from a single perspective view.
Our dataset consists of 293 images from ScanNet, which we annotated with precise 3D layouts.
arXiv Detail & Related papers (2020-01-07T16:14:00Z)
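As referenced in the LED2-Net entry above, the layout-to-depth conversion it describes rests on simple panorama geometry: a floor-boundary point at latitude phi below the horizon, seen from a camera at height h above the floor, lies at horizontal distance h / tan(-phi). Below is a minimal sketch of that differentiable relation, assuming an axis-aligned camera; the function name, the default camera height, and the argument conventions are illustrative, not the paper's API.

```python
import torch

def horizon_depth_from_floor_boundary(lat_floor: torch.Tensor,
                                      cam_height: float = 1.6) -> torch.Tensor:
    """Per-column horizontal distance to the wall, from the floor-boundary
    latitude (radians, negative below the horizon) of a panorama.
    Differentiable in `lat_floor`, so a depth loss can supervise the layout.
    `cam_height` (metres) is an assumed camera height, not a paper constant."""
    return cam_height / torch.tan(-lat_floor)

# usage: 1024 columns of a predicted boundary, all 30 degrees below the horizon
lat = torch.full((1024,), -torch.pi / 6, requires_grad=True)
depth = horizon_depth_from_floor_boundary(lat)  # ~2.77 m for cam_height=1.6
depth.sum().backward()                          # gradients flow back to the layout
```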