Transferable End-to-end Room Layout Estimation via Implicit Encoding
- URL: http://arxiv.org/abs/2112.11340v1
- Date: Tue, 21 Dec 2021 16:33:14 GMT
- Title: Transferable End-to-end Room Layout Estimation via Implicit Encoding
- Authors: Hao Zhao, Rene Ranftl, Yurong Chen, Hongbin Zha
- Abstract summary: We study the problem of estimating room layouts from a single panorama image.
We propose an end-to-end method that directly predicts parametric layouts from an input panorama image.
- Score: 34.99591465853653
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We study the problem of estimating room layouts from a single panorama image.
Most former works have two stages: feature extraction and parametric model
fitting. Here we propose an end-to-end method that directly predicts parametric
layouts from an input panorama image. It exploits an implicit encoding
procedure that embeds parametric layouts into a latent space. Then learning a
mapping from images to this latent space makes end-to-end room layout
estimation possible. However, end-to-end methods have several notorious
drawbacks despite many intriguing properties. A widely raised criticism is that
they are troubled with dataset bias and do not transfer to unfamiliar domains.
Our study echoes this common belief. To this end, we propose to use semantic
boundary prediction maps as an intermediate domain. This brings a significant
performance boost on four benchmarks (Structured3D, PanoContext, S3DIS, and
Matterport3D), notably in the zero-shot transfer setting. Code, data, and
models will be released.
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z)
- View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields [52.08335264414515]
We learn a novel feature field within a Neural Radiance Field (NeRF) representing a 3D scene.
Our method takes view-inconsistent multi-granularity 2D segmentations as input and produces a hierarchy of 3D-consistent segmentations as output.
We evaluate our method and several baselines on synthetic datasets with multi-view images and multi-granular segmentation, showcasing improved accuracy and viewpoint-consistency.
arXiv Detail & Related papers (2024-05-30T04:14:58Z)
- Neural Semantic Surface Maps [52.61017226479506]
We present an automated technique for computing a map between two genus-zero shapes, which matches semantically corresponding regions to one another.
Our approach can generate semantic surface-to-surface maps, eliminating manual annotations or any 3D training data requirement.
arXiv Detail & Related papers (2023-09-09T16:21:56Z)
- Disentangling Orthogonal Planes for Indoor Panoramic Room Layout Estimation with Cross-Scale Distortion Awareness [38.096482841789275]
We propose to disentangle 1D representation by pre-segmenting planes from a complex scene.
Considering the symmetry between the floor boundary and ceiling boundary, we also design a soft-flipping fusion strategy.
Experiments on four popular benchmarks demonstrate our superiority over existing SoTA solutions.
arXiv Detail & Related papers (2023-03-02T05:10:23Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- Self-supervised 360$^{\circ}$ Room Layout Estimation [20.062713286961326]
We present the first self-supervised method to train panoramic room layout estimation models without any labeled data.
Our approach also shows promising solutions in data-scarce scenarios and active learning, which would have an immediate value in real estate virtual tour software.
arXiv Detail & Related papers (2022-03-30T04:58:07Z)
- OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas [16.38156002774853]
Given a single RGB panorama, the goal of 3D layout reconstruction is to estimate the room layout by predicting the corners, floor boundary, and ceiling boundary.
A common approach has been to use standard convolutional networks to predict the corners and boundaries, followed by post-processing to generate the 3D layout.
We propose to use spherical convolutions, which perform convolutions directly on the sphere surface, sampling according to equirectangular projection.
arXiv Detail & Related papers (2021-04-19T15:44:10Z)
- Coherent Reconstruction of Multiple Humans from a Single Image [68.3319089392548]
In this work, we address the problem of multi-person 3D pose estimation from a single image.
A typical regression approach in the top-down setting of this problem would first detect all humans and then reconstruct each one of them independently.
Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.
arXiv Detail & Related papers (2020-06-15T17:51:45Z)
- General 3D Room Layout from a Single View by Render-and-Compare [36.94817376590415]
We present a novel method to reconstruct the 3D layout of a room from a single perspective view.
Our dataset consists of 293 images from ScanNet, which we annotated with precise 3D layouts.
arXiv Detail & Related papers (2020-01-07T16:14:00Z)