3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform
- URL: http://arxiv.org/abs/2207.09291v1
- Date: Tue, 19 Jul 2022 14:22:28 GMT
- Title: 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform
- Authors: Yining Zhao, Chao Wen, Zhou Xue, Yue Gao
- Abstract summary: We present an alternative approach to estimate the walls in 3D space by modeling long-range geometric patterns in a learnable Hough Transform block.
We transform the image feature from a cubemap tile to the Hough space of a Manhattan world and directly map the feature to geometric output.
The convolutional layers not only learn local gradient-like line features, but also exploit global information to predict occluded walls with a simple network structure.
- Score: 17.51123287432334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Significant geometric structures can be compactly described by global
wireframes in the estimation of 3D room layout from a single panoramic image.
Based on this observation, we present an alternative approach to estimate the
walls in 3D space by modeling long-range geometric patterns in a learnable
Hough Transform block. We transform the image feature from a cubemap tile to
the Hough space of a Manhattan world and directly map the feature to the
geometric output. The convolutional layers not only learn local
gradient-like line features, but also exploit global information to
successfully predict occluded walls with a simple network structure. Unlike
most previous work, predictions are performed individually on each cubemap
tile and then assembled to obtain the final layout estimate. Experimental
results show that our method achieves results comparable to recent
state-of-the-art methods in prediction accuracy and performance. Code is
available at https://github.com/Starrah/DMH-Net.
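To make the core idea concrete, below is a minimal PyTorch sketch of a learnable Hough transform block. It is not the DMH-Net implementation: the bin counts, the line parameterization rho = x*cos(theta) + y*sin(theta), and the single 3x3 convolution head are illustrative assumptions. The block votes per-pixel features along candidate lines into (theta, rho) bins, so subsequent convolutions can reason about long-range line structure globally.

```python
import math
import torch
import torch.nn as nn

class HoughTransformBlock(nn.Module):
    """Vote per-pixel features into (theta, rho) line bins, then apply a
    convolution so the network reasons globally over line candidates.
    `channels` must match the channel count of the input feature map."""

    def __init__(self, h, w, channels, n_theta=60, n_rho=60):
        super().__init__()
        ys, xs = torch.meshgrid(
            torch.arange(h, dtype=torch.float32),
            torch.arange(w, dtype=torch.float32),
            indexing="ij",
        )
        thetas = torch.arange(n_theta, dtype=torch.float32) * math.pi / n_theta
        # Line parameterization: rho = x*cos(theta) + y*sin(theta).
        rho = (xs[None] * torch.cos(thetas)[:, None, None]
               + ys[None] * torch.sin(thetas)[:, None, None])
        rho_max = math.hypot(h, w)
        rho_bin = ((rho + rho_max) / (2 * rho_max) * (n_rho - 1)).long()
        # Voting matrix: each pixel contributes to one rho bin per theta.
        theta_idx = torch.arange(n_theta)[:, None, None].expand_as(rho_bin)
        rows = (theta_idx * n_rho + rho_bin).reshape(-1)
        cols = torch.arange(h * w).repeat(n_theta)
        votes = torch.sparse_coo_tensor(
            torch.stack([rows, cols]),
            torch.ones(rows.numel()),
            size=(n_theta * n_rho, h * w),
        ).to_dense()  # dense for clarity; use sparse matmul for large maps
        self.register_buffer("votes", votes)
        self.n_theta, self.n_rho = n_theta, n_rho
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feat):                     # feat: (B, C, H, W)
        b, c, h, w = feat.shape
        flat = feat.reshape(b * c, h * w)        # vote every channel
        hough = (flat @ self.votes.t()).reshape(b, c, self.n_theta, self.n_rho)
        return self.conv(hough)                  # reason in Hough space
```

For instance, `HoughTransformBlock(64, 64, channels=32)` applied to a `(2, 32, 64, 64)` feature map returns a `(2, 32, 60, 60)` tensor over line candidates, from which a head could score wall-boundary lines on each cubemap tile before the per-tile outputs are assembled into the full layout.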
Related papers
- Self-supervised 3D Point Cloud Completion via Multi-view Adversarial Learning [61.14132533712537]
We propose MAL-SPC, a framework that effectively leverages both object-level and category-specific geometric similarities to complete missing structures.
Our MAL-SPC does not require any complete 3D supervision and needs only a single partial point cloud for each object.
arXiv Detail & Related papers (2024-07-13T06:53:39Z)
- Learning to Generate 3D Representations of Building Roofs Using Single-View Aerial Imagery [68.3565370706598]
We present a novel pipeline for learning the conditional distribution of a building roof mesh given pixels from an aerial image.
Unlike alternative methods that require multiple images of the same object, our approach estimates the 3D roof mesh from only a single image.
arXiv Detail & Related papers (2023-03-20T15:47:05Z)
- SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling [75.957103837167]
Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.
Existing works try to employ the global feature extracted from the sketch to directly predict the 3D coordinates, but they usually suffer from a loss of fine details, producing shapes that are not faithful to the input sketch.
arXiv Detail & Related papers (2022-08-14T16:37:51Z)
- Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes [50.317223783035075]
We present a new framework to reconstruct holistic 3D indoor scenes from single-view images.
We propose an instance-aligned implicit function (InstPIFu) for detailed object reconstruction.
Our code and model will be made publicly available.
arXiv Detail & Related papers (2022-07-18T14:54:57Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into recent implicit neural representation-based reconstruction methods; a minimal normal-alignment sketch follows this list.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- Pixel2Mesh++: 3D Mesh Generation and Refinement from Multi-View Images [82.32776379815712]
We study the problem of shape generation in 3D mesh representation from a small number of color images with or without camera poses.
We adopt a graph convolution network to further improve the shape quality by leveraging cross-view information.
Our model is robust to the quality of the initial mesh and to errors in camera pose, and can be combined with a differentiable renderer for test-time optimization.
arXiv Detail & Related papers (2022-04-21T03:42:31Z)
- LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network [1.3512949730789903]
We propose an efficient network, LGT-Net, for room layout estimation.
Experiments show that the proposed LGT-Net achieves better performance than current state-of-the-art (SOTA) methods on benchmark datasets.
arXiv Detail & Related papers (2022-03-03T16:28:10Z)
- Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image [32.5277483805739]
Single-image room layout reconstruction aims to reconstruct the enclosed 3D structure of a room from a single image.
This paper considers a more general indoor assumption, i.e., the room layout consists of a single ceiling, a single floor, and several vertical walls.
arXiv Detail & Related papers (2021-04-16T09:24:08Z)
- GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes [18.900646770506256]
We propose to incorporate geometric reasoning to deep learning for layout estimation.
Our approach learns to infer the depth maps of the dominant planes in the scene by predicting pixel-level surface parameters; the underlying plane-to-depth mapping is illustrated in the sketch after this list.
We present a new dataset with pixel-level depth annotation of dominant planes.
arXiv Detail & Related papers (2020-08-14T10:34:24Z)
- General 3D Room Layout from a Single View by Render-and-Compare [36.94817376590415]
We present a novel method to reconstruct the 3D layout of a room from a single perspective view.
Our dataset consists of 293 images from ScanNet, which we annotated with precise 3D layouts.
arXiv Detail & Related papers (2020-01-07T16:14:00Z)
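As referenced in the Manhattan-world entry above, planar constraints can enter an implicit reconstruction pipeline as a regularizer on predicted surface normals. Below is a minimal, hypothetical PyTorch sketch of one such constraint, a normal-alignment loss for floor and wall regions; the masks, up axis, and equal loss weighting are illustrative assumptions, not the authors' code.

```python
import torch

def manhattan_normal_loss(normals, floor_mask, wall_mask):
    """Pull floor normals toward the up axis and wall normals into the
    horizontal plane (perpendicular to up). Assumes both masks select
    at least one point; `normals` is (N, 3) with unit rows."""
    up = torch.tensor([0.0, 0.0, 1.0], device=normals.device)
    cos_up = normals @ up                         # cosine to the up axis
    floor_loss = (1.0 - cos_up[floor_mask].abs()).mean()  # parallel to up
    wall_loss = cos_up[wall_mask].abs().mean()            # perpendicular
    return floor_loss + wall_loss
```

In practice such a term would be weighted against photometric and depth losses, with the floor/wall masks coming from a 2D semantic segmentation network.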
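The GeoLayout entry above rests on a compact geometric identity: a point on the ray of pixel (u, v) is X = z * K^{-1} [u, v, 1]^T, so a plane n·X = d fixes the depth z = d / (n · K^{-1} [u, v, 1]^T) at every pixel it covers. A minimal NumPy sketch of that mapping follows; the function name and (n, d) parameterization are illustrative, not the authors' code.

```python
import numpy as np

def plane_depth_map(n, d, K, h, w):
    """Depth z(u, v) of the plane {X : n.X = d} under pinhole intrinsics K.

    Pixels whose ray is (nearly) parallel to the plane, or that hit it
    behind the camera, yield invalid (inf or negative) depths.
    """
    K_inv = np.linalg.inv(K)
    us, vs = np.meshgrid(np.arange(w), np.arange(h))          # pixel grid
    rays = K_inv @ np.stack([us, vs, np.ones_like(us)], 0).reshape(3, -1)
    with np.errstate(divide="ignore"):
        z = d / (np.asarray(n, dtype=float) @ rays)           # n.(z*ray) = d
    return z.reshape(h, w)
```

Predicting (n, d) per dominant plane, or pixel-level surface parameters that reduce to them, therefore yields dense layout depth maps that can be supervised directly.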
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.