BiSVP: Building Footprint Extraction via Bidirectional Serialized Vertex
Prediction
- URL: http://arxiv.org/abs/2303.00300v1
- Date: Wed, 1 Mar 2023 07:50:34 GMT
- Authors: Mingming Zhang, Ye Du, Zhenghui Hu, Qingjie Liu, Yunhong Wang
- Abstract summary: BiSVP is a refinement-free and end-to-end building footprint extraction method.
We propose a cross-scale feature fusion (CSFF) module to facilitate high resolution and rich semantic feature learning.
Our BiSVP outperforms state-of-the-art methods by considerable margins on three building instance segmentation benchmarks.
- Score: 43.61580149432732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extracting building footprints from remote sensing images has been attracting
extensive attention recently. Dominant approaches address this challenging
problem by generating vectorized building masks with cumbersome refinement
stages, which limits the application of such methods. In this paper, we
introduce a new refinement-free and end-to-end building footprint extraction
method, which is conceptually intuitive, simple, and effective. Our method,
termed BiSVP, represents a building instance with ordered vertices and
formulates the building footprint extraction as predicting the serialized
vertices directly in a bidirectional fashion. Moreover, we propose a
cross-scale feature fusion (CSFF) module to facilitate high resolution and rich
semantic feature learning, which is essential for the dense building vertex
prediction task. Without bells and whistles, our BiSVP outperforms
state-of-the-art methods by considerable margins on three building instance
segmentation benchmarks, clearly demonstrating its superiority. The code and
datasets will be made publicly available.
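The abstract describes a building as a sequence of ordered vertices that is predicted in two directions. The sketch below is only an illustration of that idea, not the authors' implementation: the abstract does not say how the start vertex is chosen, how orientation is fixed, or how sequences are terminated, so the start-vertex rule (smallest y, then smallest x), the orientation normalization, and the names `signed_area` and `serialize_bidirectional` are all assumptions made here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def signed_area(poly: List[Point]) -> float:
    """Shoelace formula; the sign depends on the vertex order."""
    n = len(poly)
    return 0.5 * sum(
        poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
        for i in range(n)
    )


def serialize_bidirectional(poly: List[Point]) -> Tuple[List[Point], List[Point]]:
    """Return two vertex sequences that trace the same ring in opposite directions.

    Assumptions (not taken from the paper): the ring is normalized to a
    positive signed area, and both sequences start at the vertex with the
    smallest y, ties broken by the smallest x.
    """
    poly = list(poly)
    if len(poly) < 3:
        raise ValueError("a polygon needs at least three vertices")

    # Normalize orientation so the result does not depend on the input order.
    if signed_area(poly) < 0:
        poly = poly[::-1]

    # Hypothetical start-vertex rule: smallest y, then smallest x.
    start = min(range(len(poly)), key=lambda i: (poly[i][1], poly[i][0]))

    # Forward sequence: rotate the ring so it begins at the start vertex.
    forward = poly[start:] + poly[:start]
    # Backward sequence: same start vertex, remaining vertices reversed.
    backward = [forward[0]] + forward[:0:-1]
    return forward, backward


if __name__ == "__main__":
    footprint = [(0, 0), (4, 0), (4, 3), (0, 3)]
    fwd, bwd = serialize_bidirectional(footprint)
    print(fwd)
    print(bwd)
```

Both sequences describe the same footprint, so a model trained on this kind of target could in principle be supervised from either direction; how BiSVP actually combines the two directions is not specified in the abstract.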
Related papers
- RoIPoly: Vectorized Building Outline Extraction Using Vertex and Logit Embeddings [5.093758132026397]
We propose a novel query-based approach for extracting building outlines from aerial or satellite imagery.
We formulate each polygon as a query and constrain the query attention on the most relevant regions of a potential building.
We evaluate our method on the vectorized building outline extraction dataset CrowdAI and the 2D floorplan reconstruction dataset Structured3D.
arXiv Detail & Related papers (2024-07-20T16:12:51Z)
- P2PFormer: A Primitive-to-polygon Method for Regular Building Contour Extraction from Remote Sensing Images [5.589842901102337]
Existing methods struggle with irregular contours, rounded corners, and redundant points.
We introduce a novel, streamlined pipeline that generates regular building contours without post-processing.
P2PFormer achieves new state-of-the-art performance on the WHU, CrowdAI, and WHU-Mix datasets.
arXiv Detail & Related papers (2024-06-05T04:38:45Z)
- GDB: Gated convolutions-based Document Binarization [0.0]
We formulate text extraction as the learning of gating values and propose an end-to-end gated convolutions-based network (GDB) to solve the problem of imprecise stroke edge extraction.
Our proposed framework consists of two stages. Firstly, a coarse sub-network with an extra edge branch is trained to get more precise feature maps by feeding a priori mask and edge.
Secondly, a refinement sub-network is cascaded to refine the output of the first stage by gated convolutions based on the sharp edge.
arXiv Detail & Related papers (2023-02-04T02:56:40Z)
- BuildMapper: A Fully Learnable Framework for Vectorized Building Contour Extraction [3.862461804734488]
We propose the first end-to-end learnable building contour extraction framework, named BuildMapper.
BuildMapper can directly and efficiently delineate building polygons just as a human does.
We show that BuildMapper can achieve a state-of-the-art performance, with a higher mask average precision (AP) and boundary AP than both segmentation-based and contour-based methods.
arXiv Detail & Related papers (2022-11-07T08:58:35Z)
- Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision [63.429704654271475]
We propose a novel weakly supervised method RWSeg that only requires labeling one object with one point.
With these sparse weak labels, we introduce a unified framework with two branches to propagate semantic and instance information.
Specifically, we propose a Cross-graph Competing Random Walks (CRW) algorithm that encourages competition among different instance graphs.
arXiv Detail & Related papers (2022-08-10T02:14:39Z)
- Semantic keypoint-based pose estimation from single RGB frames [64.80395521735463]
We present an approach to estimating the continuous 6-DoF pose of an object from a single RGB image.
The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model.
We show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios.
arXiv Detail & Related papers (2022-04-12T15:03:51Z)
- Learning Local Displacements for Point Cloud Completion [93.54286830844134]
We propose a novel approach aimed at object and semantic scene completion from a partial scan represented as a 3D point cloud.
Our architecture relies on three novel layers that are used successively within an encoder-decoder structure.
We evaluate both architectures on object and indoor scene completion tasks, achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-03-30T18:31:37Z)
- PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
- Spatial Pyramid Based Graph Reasoning for Semantic Segmentation [67.47159595239798]
We apply graph convolution to the semantic segmentation task and propose an improved Laplacian.
The graph reasoning is directly performed in the original feature space organized as a spatial pyramid.
We achieve comparable performance with advantages in computational and memory overhead.
arXiv Detail & Related papers (2020-03-23T12:28:07Z)