Related papers: RoIPoly: Vectorized Building Outline Extraction Using Vertex and Logit Embeddings

URL: http://arxiv.org/abs/2407.14920v1
Date: Sat, 20 Jul 2024 16:12:51 GMT
Title: RoIPoly: Vectorized Building Outline Extraction Using Vertex and Logit Embeddings
Authors: Weiqin Jiao, Hao Cheng, Claudio Persello, George Vosselman
Abstract summary: We propose a novel query-based approach for extracting building outlines from aerial or satellite imagery. We formulate each vertex as a query and constrain the query attention on the most relevant regions of a potential building. We evaluate our method on the vectorized building outline extraction dataset CrowdAI and the 2D floorplan reconstruction dataset Structured3D.
Score: 5.093758132026397
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Polygonal building outlines are crucial for geographic and cartographic applications. The existing approaches for outline extraction from aerial or satellite imagery are typically decomposed into subtasks, e.g., building masking and vectorization, or treat this task as a sequence-to-sequence prediction of ordered vertices. The former lacks efficiency, and the latter often generates redundant vertices, both resulting in suboptimal performance. To handle these issues, we propose a novel Region-of-Interest (RoI) query-based approach called RoIPoly. Specifically, we formulate each vertex as a query and constrain the query attention on the most relevant regions of a potential building, yielding reduced computational overhead and more efficient vertex level interaction. Moreover, we introduce a novel learnable logit embedding to facilitate vertex classification on the attention map; thus, no post-processing is needed for redundant vertex removal. We evaluated our method on the vectorized building outline extraction dataset CrowdAI and the 2D floorplan reconstruction dataset Structured3D. On the CrowdAI dataset, RoIPoly with a ResNet50 backbone outperforms existing methods with the same or better backbones on most MS-COCO metrics, especially on small buildings, and achieves competitive results in polygon quality and vertex redundancy without any post-processing. On the Structured3D dataset, our method achieves the second-best performance on most metrics among existing methods dedicated to 2D floorplan reconstruction, demonstrating our cross-domain generalization capability. The code will be released upon acceptance of this paper.
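The abstract states that vertex classification removes the need for post-processing to drop redundant vertices. As a rough illustration of that idea only, the sketch below filters a fixed set of predicted vertex slots by a per-vertex validity logit. It is not the authors' implementation (their code is unreleased at the time of the abstract); the function name `select_vertices`, the sigmoid-plus-threshold rule, the 0.5 threshold, and the sample coordinates and logits are all assumptions made for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def select_vertices(coords, logits, threshold=0.5):
    """Keep the vertex slots classified as valid, preserving their order.

    coords:    list of (x, y) predictions, one per vertex slot (hypothetical).
    logits:    list of raw validity scores, one per vertex slot (hypothetical).
    threshold: assumed probability cut-off for keeping a vertex.
    """
    if len(coords) != len(logits):
        raise ValueError("coords and logits must have the same length")
    return [xy for xy, z in zip(coords, logits) if sigmoid(z) >= threshold]


# Made-up predictions for one building: six vertex slots, two of them redundant.
coords = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0), (10.0, 8.0), (10.0, 8.1), (0.0, 8.0)]
logits = [3.2, -2.5, 4.0, 2.7, -3.1, 1.9]

polygon = select_vertices(coords, logits)
print(polygon)  # [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
```

Because the slots are filtered in order, the surviving vertices already form the polygon ring, which is one plausible reading of why no separate redundancy-removal step would be required.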

Related papers

Multi-Unit Floor Plan Recognition and Reconstruction Using Improved Semantic Segmentation of Raster-Wise Floor Plans [1.0436971860292366]
We propose two novel pixel-wise segmentation methods based on the MDA-Unet and MACU-Net architectures. The proposed methods are compared with two other state-of-the-art techniques on several benchmark datasets. On the commonly used CubiCasa benchmark dataset, our methods achieve a mean F1 score of 0.86 over the five examined classes.
arXiv Detail & Related papers (2024-08-02T18:36:45Z)
Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries. We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well trained on massive image datasets. Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves state-of-the-art UDA performance for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
P2PFormer: A Primitive-to-polygon Method for Regular Building Contour Extraction from Remote Sensing Images [5.589842901102337]
Existing methods struggle with irregular contours, rounded corners, and redundant points. We introduce a novel, streamlined pipeline that generates regular building contours without post-processing. P2PFormer achieves new state-of-the-art performance on the WHU, CrowdAI, and WHU-Mix datasets.
arXiv Detail & Related papers (2024-06-05T04:38:45Z)
BiSVP: Building Footprint Extraction via Bidirectional Serialized Vertex Prediction [43.61580149432732]
BiSVP is a refinement-free and end-to-end building footprint extraction method. We propose a cross-scale feature fusion (CSFF) module to facilitate high resolution and rich semantic feature learning. Our BiSVP outperforms state-of-the-art methods by considerable margins on three building instance segmentation benchmarks.
arXiv Detail & Related papers (2023-03-01T07:50:34Z)
Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology. Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
BuildMapper: A Fully Learnable Framework for Vectorized Building Contour Extraction [3.862461804734488]
We propose the first end-to-end learnable building contour extraction framework, named BuildMapper. BuildMapper can directly and efficiently delineate building polygons just as a human does. We show that BuildMapper achieves state-of-the-art performance, with higher mask average precision (AP) and boundary AP than both segmentation-based and contour-based methods.
arXiv Detail & Related papers (2022-11-07T08:58:35Z)
CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D. Our proposed method first generates some high-quality 3D proposals by leveraging the class-aware local group strategy on the object surface voxels. To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
arXiv Detail & Related papers (2022-10-09T13:38:48Z)
Accurate Polygonal Mapping of Buildings in Satellite Imagery [30.262871819346213]
This paper studies the problem of polygonal mapping of buildings by tackling the issue of mask reversibility. We propose a novel interaction mechanism of feature embedding sourced from different levels of supervision signals to obtain reversible building masks. We show that the learned reversible building masks take all the merits of the advances of deep convolutional neural networks for high-performing polygonal mapping of buildings.
arXiv Detail & Related papers (2022-08-01T04:54:55Z)
Learning Local Displacements for Point Cloud Completion [93.54286830844134]
We propose a novel approach aimed at object and semantic scene completion from a partial scan represented as a 3D point cloud. Our architecture relies on three novel layers that are used successively within an encoder-decoder structure. We evaluate both architectures on object and indoor scene completion tasks, achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-03-30T18:31:37Z)
Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection [89.66162518035144]
We present a flexible and high-performance framework, named Pyramid R-CNN, for two-stage 3D object detection from point clouds. We propose a novel second-stage module, named pyramid RoI head, to adaptively learn the features from the sparse points of interest. Our pyramid RoI head is robust to the sparse and imbalanced circumstances, and can be applied upon various 3D backbones to consistently boost the detection performance.
arXiv Detail & Related papers (2021-09-06T14:17:51Z)
PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving. Current approaches suffer from sparse and partial point clouds of distant and occluded objects. In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.