Polygon Detection for Room Layout Estimation using Heterogeneous Graphs and Wireframes
- URL: http://arxiv.org/abs/2306.12203v1
- Date: Wed, 21 Jun 2023 11:55:15 GMT
- Title: Polygon Detection for Room Layout Estimation using Heterogeneous Graphs and Wireframes
- Authors: David Gillsjö, Gabrielle Flood, Kalle Åström
- Abstract summary: This paper presents a neural network based method that can be used to solve room layout estimation tasks.
The network takes an RGB image and estimates a wireframe as well as a feature space using an hourglass backbone.
- Score: 2.76240219662896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a neural network based semantic plane detection method utilizing polygon representations. The method can, for example, be used to solve room layout estimation tasks. The method builds on, combines, and further develops several modules from previous research. The network takes an RGB image and estimates a wireframe as well as a feature space using an hourglass backbone. From these, line and junction features are sampled. The lines and junctions are then represented as an undirected graph, from which polygon representations of the sought planes are obtained. Two different methods for this last step are investigated, where the most promising one is built on a heterogeneous graph transformer. The final output is in all cases a projection of the semantic planes in 2D. The methods are evaluated on the Structured3D dataset, and we investigate the performance using both sampled and estimated wireframes. The experiments show the potential of the graph-based method, which outperforms state-of-the-art methods in room layout estimation on the 2D metrics when using synthetic wireframe detections.
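As a rough illustration of the graph step in the abstract (not the authors' learned pipeline, which samples line and junction features and applies a heterogeneous graph transformer), the sketch below builds the undirected junction/line graph from a hypothetical wireframe and reads out candidate polygons as graph cycles. The junction coordinates and line indices are made-up example data.

```python
# Illustrative sketch (not the paper's implementation): given wireframe
# junctions and line segments, build the undirected junction/line graph and
# read out candidate polygons as cycles. The paper replaces this purely
# combinatorial step with learned features and a heterogeneous graph transformer.
import networkx as nx

# Hypothetical wireframe output: junction coordinates and index pairs of lines.
junctions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
lines = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # one diagonal splits the square

G = nx.Graph()
G.add_nodes_from(junctions)   # junction nodes
G.add_edges_from(lines)       # line-segment edges

# A cycle basis gives a set of closed loops; each loop is a candidate polygon
# whose vertices are junction indices (here: the two triangles of the square).
for cycle in nx.cycle_basis(G):
    polygon = [junctions[j] for j in cycle]
    print(cycle, polygon)
```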
Related papers
- Boundary Detection Algorithm Inspired by Locally Linear Embedding [8.259071011958254]
We propose a method for detecting boundary points inspired by the widely used locally linear embedding algorithm.
We implement this method using two nearest neighborhood search schemes: the $\epsilon$-radius ball scheme and the $K$-nearest neighbor scheme.
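The summary above names the two neighborhood-search schemes but not the boundary test itself, so the following is only a minimal sketch: both schemes queried via a KD-tree, paired with a generic mean-offset heuristic as a stand-in for the paper's LLE-inspired criterion.

```python
# Minimal sketch of the two neighborhood-search schemes mentioned above.
# The boundary score (norm of the mean neighbor offset) is a common heuristic
# and only stands in for the paper's actual LLE-based test.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(2000, 2))   # toy point cloud on a square
tree = cKDTree(points)

def neighbors_eps(i, eps=0.1):
    """epsilon-radius ball scheme: all points within distance eps."""
    idx = tree.query_ball_point(points[i], r=eps)
    return [j for j in idx if j != i]

def neighbors_knn(i, k=15):
    """K-nearest neighbor scheme: the k closest points."""
    _, idx = tree.query(points[i], k=k + 1)   # +1: the query point itself is returned
    return [j for j in idx if j != i]

def boundary_score(i, scheme=neighbors_knn):
    """Neighbors of an interior point roughly cancel out; near the boundary
    the mean offset points inward, so its norm is large."""
    nbrs = scheme(i)
    return np.linalg.norm(points[nbrs].mean(axis=0) - points[i])

scores = np.array([boundary_score(i) for i in range(len(points))])
boundary = np.argsort(scores)[-200:]   # flag the highest-scoring points
```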
arXiv Detail & Related papers (2024-06-26T16:05:57Z)
- Contour Context: Abstract Structural Distribution for 3D LiDAR Loop Detection and Metric Pose Estimation [31.968749056155467]
This paper proposes a simple, effective, and efficient topological loop closure detection pipeline with accurate 3-DoF metric pose estimation.
We interpret the Cartesian bird's eye view (BEV) image projected from 3D LiDAR points as a layered distribution of structures.
A retrieval key is designed to accelerate the search of a database indexed by layered KD-trees.
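A minimal sketch of the kind of input representation described above, assuming a plain occupancy grid split into height layers; the paper's exact BEV encoding, descriptors, and layered KD-tree retrieval are not reproduced here.

```python
# Assumed layout, not the paper's code: project a 3D LiDAR scan to a Cartesian
# BEV occupancy grid, stacked into height layers that roughly correspond to the
# "layered distribution of structures" above.
import numpy as np

def lidar_to_layered_bev(points, xy_range=40.0, resolution=0.2,
                         z_edges=(-1.0, 0.5, 2.0, 4.0)):
    """points: (N, 3) array of x, y, z in the sensor frame."""
    size = int(2 * xy_range / resolution)
    bev = np.zeros((len(z_edges) - 1, size, size), dtype=np.float32)

    # Keep points inside the square region and convert metres to pixel indices.
    mask = (np.abs(points[:, 0]) < xy_range) & (np.abs(points[:, 1]) < xy_range)
    xs, ys, zs = points[mask].T
    cols = ((xs + xy_range) / resolution).astype(int)
    rows = ((ys + xy_range) / resolution).astype(int)

    # Assign each point to a height layer and mark the corresponding cell.
    layer = np.digitize(zs, z_edges) - 1
    valid = (layer >= 0) & (layer < bev.shape[0])
    bev[layer[valid], rows[valid], cols[valid]] = 1.0
    return bev

scan = np.random.uniform(-40, 40, size=(10000, 3)) * np.array([1, 1, 0.1])
layers = lidar_to_layered_bev(scan)   # shape: (n_layers, H, W)
```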
arXiv Detail & Related papers (2023-02-13T07:18:24Z)
- Monocular Road Planar Parallax Estimation [25.36368935789501]
Estimating the 3D structure of the drivable surface and surrounding environment is a crucial task for assisted and autonomous driving.
We propose Road Planar Parallax Attention Network (RPANet), a new deep neural network for 3D sensing from monocular image sequences.
RPANet takes a pair of images aligned by the homography of the road plane as input and outputs a $\gamma$ map for 3D reconstruction.
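The sketch below illustrates the input preparation implied above, under the assumption that the pair is aligned by warping one image with the road-plane homography; the network itself and the $\gamma$ parameterization are not reproduced.

```python
# Assumed input preparation, not the authors' code: warp the second image of a
# pair by the road-plane homography so the ground plane is aligned, leaving
# residual parallax only for points off the road plane.
import cv2
import numpy as np

def align_by_road_homography(img_ref, img_src, H_road):
    """H_road: 3x3 homography mapping img_src pixels onto img_ref pixels,
    induced by the road plane and the relative camera pose."""
    h, w = img_ref.shape[:2]
    warped = cv2.warpPerspective(img_src, H_road, (w, h))
    # Stacking the reference and warped images is one simple way to feed an
    # aligned pair to a network such as the RPANet described above.
    return np.concatenate([img_ref, warped], axis=-1)

img0 = np.zeros((256, 512, 3), dtype=np.uint8)
img1 = np.zeros((256, 512, 3), dtype=np.uint8)
H = np.eye(3)                                     # placeholder homography
pair = align_by_road_homography(img0, img1, H)    # shape: (256, 512, 6)
```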
arXiv Detail & Related papers (2021-11-22T10:03:41Z)
- Joint Deep Multi-Graph Matching and 3D Geometry Learning from Inhomogeneous 2D Image Collections [57.60094385551773]
We propose a trainable framework for learning a deformable 3D geometry model from inhomogeneous image collections.
In addition, we obtain the underlying 3D geometry of the objects depicted in the 2D images.
arXiv Detail & Related papers (2021-03-31T17:25:36Z)
- GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation [71.83992173720311]
6D pose estimation from a single RGB image is a fundamental task in computer vision.
We propose a simple yet effective Geometry-guided Direct Regression Network (GDR-Net) to learn the 6D pose in an end-to-end manner.
Our approach remarkably outperforms state-of-the-art methods on LM, LM-O and YCB-V datasets.
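GDR-Net's geometry-guided intermediate representations are not described in the snippet above, so the following is only a generic sketch of end-to-end direct pose regression: a small head that maps an image feature vector to a rotation, via the continuous 6D parameterization, plus a translation.

```python
# Generic direct-regression pose head (an illustrative sketch, not GDR-Net's
# exact architecture): image features -> rotation (6D parameterization) + translation.
import torch
import torch.nn as nn

def rotation_from_6d(x6):
    """Gram-Schmidt on two regressed 3-vectors -> valid rotation matrix."""
    a1, a2 = x6[..., :3], x6[..., 3:]
    b1 = nn.functional.normalize(a1, dim=-1)
    b2 = nn.functional.normalize(a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)
    return torch.stack([b1, b2, b3], dim=-2)   # (..., 3, 3)

class PoseHead(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                nn.Linear(256, 9))   # 6 for rotation + 3 for translation

    def forward(self, feats):
        out = self.fc(feats)
        return rotation_from_6d(out[:, :6]), out[:, 6:]

head = PoseHead()
R, t = head(torch.randn(4, 512))   # R: (4, 3, 3), t: (4, 3)
```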
arXiv Detail & Related papers (2021-02-24T09:11:31Z)
- Primal-Dual Mesh Convolutional Neural Networks [62.165239866312334]
We extend a primal-dual framework drawn from the graph-neural-network literature to triangle meshes.
Our method takes features for both edges and faces of a 3D mesh as input and dynamically aggregates them.
We provide theoretical insights into our approach using tools from the mesh-simplification literature.
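As an assumed illustration of the primal/dual structure hinted at above (not the paper's implementation): faces of a triangle mesh become nodes of one graph, mesh edges become nodes of the other, and each edge is incident to the faces that share it.

```python
# Assumed construction, not the paper's code: build the face-adjacency (primal)
# and edge-to-face incidence (dual) structure of a triangle mesh, which is the
# kind of graph over which edge and face features can be aggregated.
from collections import defaultdict
import numpy as np

faces = np.array([[0, 1, 2], [0, 2, 3]])   # two triangles sharing edge (0, 2)

edge_to_faces = defaultdict(list)
for f_idx, (a, b, c) in enumerate(faces):
    for u, v in ((a, b), (b, c), (c, a)):
        edge_to_faces[tuple(sorted((u, v)))].append(f_idx)

# Primal graph: faces are adjacent iff they share an edge.
primal_adj = [(fs[0], fs[1]) for fs in edge_to_faces.values() if len(fs) == 2]
# Dual/incidence structure: which face nodes each edge node aggregates from.
dual_incidence = dict(edge_to_faces)

print(primal_adj)       # [(0, 1)]
print(dual_incidence)   # {(0, 1): [0], (1, 2): [0], (0, 2): [0, 1], ...}
```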
arXiv Detail & Related papers (2020-10-23T14:49:02Z)
- DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data.
The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes.
We find that our proposed method outperforms the state of the art by 5% on object detection in ScanNet scenes, and achieves top results by a 3.4% margin on the Open dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)
- Holistically-Attracted Wireframe Parsing [123.58263152571952]
This paper presents a fast and parsimonious parsing method to detect a vectorized wireframe in an input image with a single forward pass.
The proposed method is end-to-end trainable, consisting of three components: (i) line segment and junction proposal generation, (ii) line segment and junction matching, and (iii) line segment and junction verification.
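A hedged sketch of what step (ii) might look like in its simplest form: snapping the endpoints of proposed line segments to nearby junction proposals and discarding segments whose endpoints find no junction. The paper's actual matching and verification rules may differ.

```python
# Assumed simplification of line/junction matching, not the paper's procedure:
# each line endpoint is matched to its nearest junction within a pixel threshold.
import numpy as np

def match_lines_to_junctions(lines, junctions, max_dist=5.0):
    """lines: (N, 2, 2) endpoint coordinates; junctions: (M, 2) points.
    Returns, per line, the indices of the matched junctions or None."""
    matches = []
    for seg in lines:
        pair = []
        for endpoint in seg:
            d = np.linalg.norm(junctions - endpoint, axis=1)
            j = int(np.argmin(d))
            pair.append(j if d[j] <= max_dist else None)
        # A line survives only if both endpoints found a junction (verification-style check).
        matches.append(tuple(pair) if None not in pair else None)
    return matches

junctions = np.array([[10.0, 10.0], [100.0, 12.0], [50.0, 80.0]])
lines = np.array([[[11.0, 9.0], [99.0, 13.0]],      # matches junctions 0 and 1
                  [[49.0, 79.0], [200.0, 200.0]]])  # second endpoint unmatched
print(match_lines_to_junctions(lines, junctions))   # [(0, 1), None]
```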
arXiv Detail & Related papers (2020-03-03T17:43:57Z)
- From Planes to Corners: Multi-Purpose Primitive Detection in Unorganized 3D Point Clouds [59.98665358527686]
We propose a new method for segmentation-free joint estimation of orthogonal planes.
Such unified scene exploration allows for multitudes of applications such as semantic plane detection or local and global scan alignment.
Our experiments demonstrate the validity of our approach in numerous scenarios from wall detection to 6D tracking.
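As a small, generic illustration of working with orthogonal planes (not the paper's segmentation-free estimator): checking that fitted plane normals are mutually orthogonal and intersecting three such planes to recover a corner point.

```python
# Generic sketch related to the idea above: verify pairwise orthogonality of
# plane normals and intersect three mutually orthogonal planes for a corner.
import numpy as np

def mutually_orthogonal(normals, tol_deg=5.0):
    """normals: (3, 3) plane normals; True if all pairs are ~90 degrees apart."""
    cos_tol = np.cos(np.radians(90.0 - tol_deg))
    n = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    for i in range(3):
        for j in range(i + 1, 3):
            if abs(n[i] @ n[j]) > cos_tol:
                return False
    return True

def corner_from_planes(normals, offsets):
    """Solve n_i . x = d_i for the intersection point of three planes."""
    return np.linalg.solve(normals, offsets)

N = np.eye(3)                   # floor and two walls of an axis-aligned room
d = np.array([0.0, 0.0, 2.5])   # plane offsets: n . x = d
if mutually_orthogonal(N):
    print(corner_from_planes(N, d))   # [0. 0. 2.5]
```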
arXiv Detail & Related papers (2020-01-21T06:51:47Z)