LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware
Transformer Network
- URL: http://arxiv.org/abs/2203.01824v1
- Date: Thu, 3 Mar 2022 16:28:10 GMT
- Title: LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware
Transformer Network
- Authors: Zhigang Jiang, Zhongzheng Xiang, Jinhua Xu, Ming Zhao
- Abstract summary: We propose an efficient network, LGT-Net, for room layout estimation.
Experiments show that the proposed LGT-Net achieves better performance than current state-of-the-art (SOTA) methods on benchmark datasets.
- Score: 1.3512949730789903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D room layout estimation by a single panorama using deep neural networks has
made great progress. However, previous approaches cannot obtain efficient
geometry awareness of the room layout from only the latitude of boundaries or
the horizon-depth. We show that using horizon-depth along with room height can
obtain omnidirectional-geometry awareness of room layout in both horizontal and
vertical directions. In addition, we propose a planar-geometry aware loss
function with normals and gradients of normals to supervise the planeness of
walls and turning of corners. We propose an efficient network, LGT-Net, for
room layout estimation, which contains a novel Transformer architecture called
SWG Transformer to model geometry relations. SWG Transformer consists of
(Shifted) Window Blocks and Global Blocks to combine the local and global
geometry relations. Moreover, we design a novel relative position embedding of
Transformer to enhance the spatial identification ability for the panorama.
Experiments show that the proposed LGT-Net achieves better performance than
current state-of-the-art (SOTA) methods on benchmark datasets.
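The omnidirectional-geometry idea in the abstract can be made concrete: once a network predicts a horizon-depth value per image column plus a single room height, the floor and ceiling boundaries follow from simple equirectangular geometry. The sketch below is illustrative only, not the authors' implementation; the helper name `layout_from_horizon_depth` and the normalized camera height are assumptions.

```python
import numpy as np

def layout_from_horizon_depth(depth, room_height, camera_height=1.6):
    """Convert per-column horizon-depth and a room height into 3D
    floor/ceiling boundary points (hypothetical helper, not LGT-Net code).

    depth: (N,) horizontal distance from the camera to the wall at each
           equirectangular image column.
    """
    n = len(depth)
    # Longitude of each column, spanning [-pi, pi) across the panorama.
    lon = (np.arange(n) / n) * 2 * np.pi - np.pi
    # Horizontal position of the wall point at each column.
    x = depth * np.sin(lon)
    z = depth * np.cos(lon)
    # Floor points sit camera_height below the camera; ceiling points sit
    # (room_height - camera_height) above it, sharing the same (x, z).
    floor = np.stack([x, np.full(n, -camera_height), z], axis=1)
    ceiling = np.stack([x, np.full(n, room_height - camera_height), z], axis=1)
    return floor, ceiling

# A toy cylindrical "room": constant depth 2.0 m, room height 3.0 m.
floor, ceiling = layout_from_horizon_depth(np.full(8, 2.0), room_height=3.0)
```

This is why a single extra scalar (room height) upgrades a purely horizontal representation to full vertical geometry: floor and ceiling share the same horizontal footprint and differ only in their y coordinate.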
Related papers
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset with the lowest required image resolution and the lightest image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- Atlanta Scaled layouts from non-central panoramas [5.2178708158547025]
We present a novel approach for 3D layout recovery of indoor environments using a non-central acquisition system.
Our approach is the first work using deep learning on non-central panoramas and recovering scaled layouts from single panoramas.
arXiv Detail & Related papers (2024-01-30T14:39:38Z)
- DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets [95.84755169585492]
We present Dynamic Sparse Voxel Transformer (DSVT), a single-stride window-based voxel Transformer backbone for outdoor 3D perception.
Our model achieves state-of-the-art performance with a broad range of 3D perception tasks.
arXiv Detail & Related papers (2023-01-15T09:31:58Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- End-to-end Graph-constrained Vectorized Floorplan Generation with Panoptic Refinement [16.103152098205566]
We aim to synthesize floorplans as sequences of 1-D vectors, which eases user interaction and design customization.
In the first stage, we encode the room connectivity graph input by users with a graph convolutional network (GCN), then apply an autoregressive transformer network to generate an initial floorplan sequence.
To polish the initial design and generate more visually appealing floorplans, we further propose a novel panoptic refinement network (PRN) composed of a GCN and a transformer network.
arXiv Detail & Related papers (2022-07-27T03:19:20Z)
- 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform [17.51123287432334]
We present an alternative approach to estimate the walls in 3D space by modeling long-range geometric patterns in a learnable Hough Transform block.
We transform the image feature from a cubemap tile to the Hough space of a Manhattan world and directly map the feature to geometric output.
The convolutional layers not only learn the local gradient-like line features, but also utilize the global information to successfully predict occluded walls with a simple network structure.
arXiv Detail & Related papers (2022-07-19T14:22:28Z)
- GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation [105.19949897812494]
Birds-eye-view (BEV) semantic segmentation is critical for autonomous driving.
We present a novel two-stage Geometry Prior-based Transformation framework named GitNet.
arXiv Detail & Related papers (2022-04-16T06:46:45Z)
- Geometry-Contrastive Transformer for Generalized 3D Pose Transfer [95.56457218144983]
The intuition of this work is to perceive the geometric inconsistency between the given meshes with the powerful self-attention mechanism.
We propose a novel geometry-contrastive Transformer that can efficiently perceive global geometric inconsistencies in 3D structure.
We present a latent isometric regularization module together with a novel semi-synthesized dataset for the cross-dataset 3D pose transfer task.
arXiv Detail & Related papers (2021-12-14T13:14:24Z)
- Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image [32.5277483805739]
Single-image room layout reconstruction aims to reconstruct the enclosed 3D structure of a room from a single image.
This paper considers a more general indoor assumption, i.e., the room layout consists of a single ceiling, a single floor, and several vertical walls.
arXiv Detail & Related papers (2021-04-16T09:24:08Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable.
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes [18.900646770506256]
We propose to incorporate geometric reasoning to deep learning for layout estimation.
Our approach learns to infer the depth maps of the dominant planes in the scene by predicting the pixel-level surface parameters.
We present a new dataset with pixel-level depth annotation of dominant planes.
arXiv Detail & Related papers (2020-08-14T10:34:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information provided and is not responsible for any consequences of its use.