GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of
Planes
- URL: http://arxiv.org/abs/2008.06286v1
- Date: Fri, 14 Aug 2020 10:34:24 GMT
- Title: GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of
Planes
- Authors: Weidong Zhang, Wei Zhang and Yinda Zhang
- Abstract summary: We propose to incorporate geometric reasoning into deep learning for layout estimation.
Our approach learns to infer the depth maps of the dominant planes in the scene by predicting the pixel-level surface parameters.
We present a new dataset with pixel-level depth annotation of dominant planes.
- Score: 18.900646770506256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of room layout estimation is to locate the wall-floor, wall-ceiling,
and wall-wall boundaries. Most recent methods solve this problem based on
edge/keypoint detection or semantic segmentation. However, these approaches
pay limited attention to the geometry of the dominant planes and the
intersection between them, which have a significant impact on room layout. In this
work, we propose to incorporate geometric reasoning into deep learning for layout
estimation. Our approach learns to infer the depth maps of the dominant planes
in the scene by predicting the pixel-level surface parameters, and the layout
can be generated by the intersection of the depth maps. Moreover, we present a
new dataset with pixel-level depth annotation of dominant planes. It is larger
than the existing datasets and contains both cuboid and non-cuboid rooms.
Experimental results show that our approach produces considerable performance
gains on both 2D and 3D datasets.
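The idea of obtaining a layout from intersecting plane depth maps can be illustrated with a small sketch. It is not the authors' implementation. It relies on one standard fact of pinhole geometry: for a plane n . X = d in the camera frame, the inverse depth 1/Z is an affine function of the pixel coordinates (u, v), written here as 1/Z = p*u + q*v + r. The function names, the toy camera (focal length `f`, principal point `cx`, `cy`) and the three example planes are all hypothetical. The per-pixel rule "keep the nearest plane with positive depth" is an assumption that holds for a convex room seen from inside; non-cuboid rooms need more care.

```python
import math


def plane_to_pqr(normal, d, f, cx, cy):
    """Convert a plane n . X = d (camera frame) to inverse-depth coefficients.

    For a pinhole camera, X = Z * ((u - cx) / f, (v - cy) / f, 1), hence
    1 / Z = p * u + q * v + r with the coefficients returned here.
    """
    nx, ny, nz = normal
    p = nx / (f * d)
    q = ny / (f * d)
    r = nz / d - (nx * cx + ny * cy) / (f * d)
    return p, q, r


def plane_depth(pqr, u, v):
    """Depth of one plane at pixel (u, v); inf if the ray never reaches it."""
    p, q, r = pqr
    inv_z = p * u + q * v + r
    return 1.0 / inv_z if inv_z > 1e-9 else math.inf


def layout_from_planes(planes, width, height):
    """Intersect per-plane depth maps: each pixel keeps its nearest plane.

    Returns (labels, depth), two height x width nested lists. Layout
    boundaries are the pixels where neighbouring labels differ.
    """
    labels, depth = [], []
    for v in range(height):
        label_row, depth_row = [], []
        for u in range(width):
            zs = [plane_depth(pqr, u, v) for pqr in planes]
            k = min(range(len(zs)), key=zs.__getitem__)
            label_row.append(k)
            depth_row.append(zs[k])
        labels.append(label_row)
        depth.append(depth_row)
    return labels, depth


# Hypothetical toy scene (image y axis points down):
# index 0 = floor (Y = 1), 1 = front wall (Z = 2), 2 = left wall (X = -1).
f, cx, cy = 4.0, 4.0, 3.0
planes = [
    plane_to_pqr((0.0, 1.0, 0.0), 1.0, f, cx, cy),
    plane_to_pqr((0.0, 0.0, 1.0), 2.0, f, cx, cy),
    plane_to_pqr((-1.0, 0.0, 0.0), 1.0, f, cx, cy),
]
labels, depth = layout_from_planes(planes, width=9, height=7)
for row in labels:
    print("".join(str(k) for k in row))
```

In a learned setting, the coefficients (p, q, r) would be predicted per pixel by a network rather than computed from known planes; the sketch only shows the geometric step that turns plane parameters into a layout.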
Related papers
- Ground Plane Matters: Picking Up Ground Plane Prior in Monocular 3D
Object Detection [92.75961303269548]
The ground plane prior is a very informative geometric cue in monocular 3D object detection (M3OD).
We propose a Ground Plane Enhanced Network (GPENet) which resolves both issues at one go.
Our GPENet can outperform other methods and achieve state-of-the-art performance, well demonstrating the effectiveness and the superiority of the proposed approach.
arXiv Detail & Related papers (2022-11-03T02:21:35Z)
- 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep
Manhattan Hough Transform [17.51123287432334]
We present an alternative approach to estimate the walls in 3D space by modeling long-range geometric patterns in a learnable Hough Transform block.
We transform the image feature from a cubemap tile to the Hough space of a Manhattan world and directly map the feature to geometric output.
The convolutional layers not only learn the local gradient-like line features, but also utilize the global information to successfully predict occluded walls with a simple network structure.
arXiv Detail & Related papers (2022-07-19T14:22:28Z)
- Neural 3D Scene Reconstruction with the Manhattan-world Assumption [58.90559966227361]
This paper addresses the challenge of reconstructing 3D indoor scenes from multi-view images.
Planar constraints can be conveniently integrated into the recent implicit neural representation-based reconstruction methods.
The proposed method outperforms previous methods by a large margin on 3D reconstruction quality.
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- Depth Completion using Geometry-Aware Embedding [22.333381291860498]
This paper proposes an efficient method to learn geometry-aware embedding.
It encodes local and global geometric structure information from 3D points, e.g., scene layout and object sizes and shapes, to guide dense depth estimation.
arXiv Detail & Related papers (2022-03-21T12:06:27Z)
- ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose
Estimation [76.31125154523056]
We present a discrete descriptor, which can represent the object surface densely.
We also propose a coarse to fine training strategy, which enables fine-grained correspondence prediction.
arXiv Detail & Related papers (2022-03-17T16:16:24Z)
- Monocular Road Planar Parallax Estimation [25.36368935789501]
Estimating the 3D structure of the drivable surface and surrounding environment is a crucial task for assisted and autonomous driving.
We propose Road Planar Parallax Attention Network (RPANet), a new deep neural network for 3D sensing from monocular image sequences.
RPANet takes a pair of images aligned by the homography of the road plane as input and outputs a $\gamma$ map for 3D reconstruction.
arXiv Detail & Related papers (2021-11-22T10:03:41Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth
Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable.
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- Depth Completion using Piecewise Planar Model [94.0808155168311]
A depth map can be represented by a set of learned bases and can be efficiently solved in a closed form solution.
However, one issue with this method is that it may create artifacts when colour boundaries are inconsistent with depth boundaries.
We enforce a stricter model in depth recovery: a piecewise planar model.
arXiv Detail & Related papers (2020-12-06T07:11:46Z)
- Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)