LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware
Transformer Network
- URL: http://arxiv.org/abs/2203.01824v1
- Date: Thu, 3 Mar 2022 16:28:10 GMT
- Title: LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware
Transformer Network
- Authors: Zhigang Jiang, Zhongzheng Xiang, Jinhua Xu, Ming Zhao
- Abstract summary: We propose an efficient network, LGT-Net, for room layout estimation.
Experiments show that the proposed LGT-Net achieves better performance than current state-of-the-art (SOTA) methods on benchmark datasets.
- Score: 1.3512949730789903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D room layout estimation from a single panorama using deep neural networks has
made great progress. However, previous approaches cannot obtain efficient
geometry awareness of room layout using only the latitude of boundaries or the
horizon-depth. We show that using horizon-depth along with room height can
obtain omnidirectional-geometry awareness of room layout in both horizontal and
vertical directions. In addition, we propose a planar-geometry aware loss
function with normals and gradients of normals to supervise the planeness of
walls and turning of corners. We propose an efficient network, LGT-Net, for
room layout estimation, which contains a novel Transformer architecture called
SWG Transformer to model geometry relations. SWG Transformer consists of
(Shifted) Window Blocks and Global Blocks to combine the local and global
geometry relations. Moreover, we design a novel relative position embedding of
Transformer to enhance the spatial identification ability for the panorama.
Experiments show that the proposed LGT-Net achieves better performance than
current state-of-the-art (SOTA) methods on benchmark datasets.
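The geometric idea in the abstract (horizon-depth plus room height for horizontal and vertical awareness, and wall normals with their changes for planeness and corners) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal-angle longitude sampling, the known camera height, and the use of the angle between consecutive normals as a stand-in for "gradients of normals" are all assumptions made for illustration.

```python
import math


def layout_from_horizon_depth(depths, room_height, camera_height):
    """Recover floor-plan points and boundary latitudes (illustrative sketch).

    depths[i] is the assumed horizontal distance from the camera to the wall
    at the i-th of len(depths) equally spaced longitudes.
    """
    n = len(depths)
    points, floor_lat, ceil_lat = [], [], []
    for i, d in enumerate(depths):
        lon = 2.0 * math.pi * i / n - math.pi  # longitude in [-pi, pi)
        # Horizontal awareness: position of the wall on the floor plan.
        points.append((d * math.sin(lon), d * math.cos(lon)))
        # Vertical awareness: latitudes of the floor and ceiling boundaries.
        floor_lat.append(-math.atan2(camera_height, d))
        ceil_lat.append(math.atan2(room_height - camera_height, d))
    return points, floor_lat, ceil_lat


def wall_normals(points):
    """Unit normal of each segment between consecutive points (closed loop).

    With the sampling order above (clockwise), the normal (dy, -dx) points
    toward the camera.
    """
    n = len(points)
    normals = []
    for i in range(n):
        (x0, y0), (x1, y1) = points[i], points[(i + 1) % n]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        normals.append((dy / length, -dx / length))
    return normals


def normal_turning(normals):
    """Angle between consecutive normals (assumed proxy for normal gradients)."""
    n = len(normals)
    angles = []
    for i in range(n):
        (ax, ay), (bx, by) = normals[i], normals[(i + 1) % n]
        dot = max(-1.0, min(1.0, ax * bx + ay * by))  # clamp for acos
        angles.append(math.acos(dot))
    return angles
```

For a square room sampled so that the corners fall on sample longitudes, the turning angle is close to zero along each wall and close to pi/2 at the four corners, which is the kind of signal a loss on normals and their changes could supervise.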
Related papers
- GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation [28.299293407423455]
GALA is a novel representation of 3D shapes that excels at capturing and reproducing complex geometry and surface details.
With our optimized C++/CUDA implementation, GALA can be fitted to an object in less than 10 seconds.
We provide a cascaded generation pipeline capable of generating 3D shapes with great geometric detail.
arXiv Detail & Related papers (2024-10-13T22:53:58Z)
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well trained on massive image collections.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset with the lowest image resolution and the most lightweight image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z)
- SGFormer: Spherical Geometry Transformer for 360 Depth Estimation [54.13459226728249]
Panoramic distortion poses a significant challenge in 360 depth estimation.
We propose a spherical geometry transformer, named SGFormer, to address the above issues.
We also present a query-based global conditional position embedding to compensate for spatial structure at varying resolutions.
arXiv Detail & Related papers (2024-04-23T12:36:24Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- End-to-end Graph-constrained Vectorized Floorplan Generation with Panoptic Refinement [16.103152098205566]
We aim to synthesize floorplans as sequences of 1-D vectors, which eases user interaction and design customization.
In the first stage, we encode the room connectivity graph input by users with a graph convolutional network (GCN), then apply an autoregressive transformer network to generate an initial floorplan sequence.
To polish the initial design and generate more visually appealing floorplans, we further propose a novel panoptic refinement network (PRN) composed of a GCN and a transformer network.
arXiv Detail & Related papers (2022-07-27T03:19:20Z)
- 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform [17.51123287432334]
We present an alternative approach to estimate the walls in 3D space by modeling long-range geometric patterns in a learnable Hough Transform block.
We transform the image feature from a cubemap tile to the Hough space of a Manhattan world and directly map the feature to geometric output.
The convolutional layers not only learn the local gradient-like line features, but also utilize the global information to successfully predict occluded walls with a simple network structure.
arXiv Detail & Related papers (2022-07-19T14:22:28Z)
- GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation [105.19949897812494]
Birds-eye-view (BEV) semantic segmentation is critical for autonomous driving.
We present a novel two-stage Geometry Prior-based Transformation framework named GitNet.
arXiv Detail & Related papers (2022-04-16T06:46:45Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable.
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes [18.900646770506256]
We propose to incorporate geometric reasoning into deep learning for layout estimation.
Our approach learns to infer the depth maps of the dominant planes in the scene by predicting the pixel-level surface parameters.
We present a new dataset with pixel-level depth annotation of dominant planes.
arXiv Detail & Related papers (2020-08-14T10:34:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.