Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery
- URL: http://arxiv.org/abs/2412.07899v1
- Date: Tue, 10 Dec 2024 20:10:46 GMT
- Title: Pix2Poly: A Sequence Prediction Method for End-to-end Polygonal Building Footprint Extraction from Remote Sensing Imagery
- Authors: Yeshwanth Kumar Adimoolam, Charalambos Poullis, Melinos Averkiou
- Abstract summary: Pix2Poly is an end-to-end trainable and differentiable deep neural network capable of directly generating explicit high-quality building footprints in a ring graph format.
Compared to previous graph learning methods, ours is a truly end-to-end trainable approach that extracts high-quality building footprints and road networks without requiring complicated, computationally intensive loss functions and intricate training pipelines.
- Score: 2.867517731896504
- Abstract: Extraction of building footprint polygons from remotely sensed data is essential for several urban understanding tasks such as reconstruction, navigation, and mapping. Despite significant progress in the area, extracting accurate polygonal building footprints remains an open problem. In this paper, we introduce Pix2Poly, an attention-based end-to-end trainable and differentiable deep neural network capable of directly generating explicit high-quality building footprints in a ring graph format. Pix2Poly employs a generative encoder-decoder transformer to produce a sequence of graph vertex tokens whose connectivity information is learned by an optimal matching network. Compared to previous graph learning methods, ours is a truly end-to-end trainable approach that extracts high-quality building footprints and road networks without requiring complicated, computationally intensive raster loss functions and intricate training pipelines. Upon evaluating Pix2Poly on several complex and challenging datasets, we report that Pix2Poly outperforms state-of-the-art methods in several vector shape quality metrics while being an entirely explicit method. Our code is available at https://github.com/yeshwanth95/Pix2Poly.
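As a rough illustration of what a "ring graph" output can look like, the sketch below groups predicted vertices into closed polygons by following connectivity links. It assumes the connectivity is available as a permutation in which each vertex points to its successor. That encoding, the function name `rings_from_successors`, and the sample coordinates are all hypothetical and are not taken from the paper or its code.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def rings_from_successors(
    vertices: Sequence[Point], successor: Sequence[int]
) -> List[List[Point]]:
    """Group vertices into closed rings by following successor links.

    successor[i] is the index of the vertex that follows vertex i.
    This is an assumed encoding of connectivity, used for illustration only.
    """
    n = len(vertices)
    if len(successor) != n:
        raise ValueError("vertices and successor must have the same length")
    if sorted(successor) != list(range(n)):
        raise ValueError("successor must be a permutation of the vertex indices")

    visited = [False] * n
    rings: List[List[Point]] = []
    for start in range(n):
        if visited[start]:
            continue
        # Walk the cycle that contains `start` until it closes on itself.
        ring: List[Point] = []
        current = start
        while not visited[current]:
            visited[current] = True
            ring.append(vertices[current])
            current = successor[current]
        rings.append(ring)
    return rings


# Hypothetical example: a rectangle (4 vertices) and a triangle (3 vertices).
vertices = [(0, 0), (4, 0), (4, 3), (0, 3), (10, 10), (12, 10), (11, 12)]
successor = [1, 2, 3, 0, 5, 6, 4]
rings = rings_from_successors(vertices, successor)
```

Because a permutation decomposes into disjoint cycles, each cycle maps to exactly one closed polygon, which is why a single flat vertex sequence can describe several buildings in one image.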
Related papers
- P2PFormer: A Primitive-to-polygon Method for Regular Building Contour Extraction from Remote Sensing Images [5.589842901102337]
Existing methods struggle with irregular contours, rounded corners, and redundant points.
We introduce a novel, streamlined pipeline that generates regular building contours without post-processing.
P2PFormer achieves new state-of-the-art performance on the WHU, CrowdAI, and WHU-Mix datasets.
arXiv Detail & Related papers (2024-06-05T04:38:45Z)
- Progressive Evolution from Single-Point to Polygon for Scene Text [79.29097971932529]
We introduce Point2Polygon, which can efficiently transform single-points into compact polygons.
Our method uses a coarse-to-fine process, starting with creating anchor points based on recognition confidence, then vertically and horizontally refining the polygon.
When training detectors with polygons generated by our method, we attained 86% of the accuracy achieved by training with ground truth (GT). Additionally, the proposed Point2Polygon can be seamlessly integrated to empower single-point spotters to generate polygons.
arXiv Detail & Related papers (2023-12-21T12:08:27Z)
- HiT: Building Mapping with Hierarchical Transformers [43.31497052507252]
We propose a simple and novel building mapping method with Hierarchical Transformers, called HiT.
HiT builds on a two-stage detection architecture by adding a polygon head parallel to classification and bounding box regression heads.
Our method achieves a new state of the art in instance segmentation and polygonal metrics.
arXiv Detail & Related papers (2023-09-18T10:24:25Z)
- PolyGNN: Polyhedron-based Graph Neural Network for 3D Building Reconstruction from Point Clouds [22.18061879431175]
PolyGNN is a graph neural network for building reconstruction from point clouds.
It learns to assemble primitives obtained by polyhedral decomposition.
We conduct a transferability analysis across cities and on real-world point clouds.
arXiv Detail & Related papers (2023-07-17T16:52:25Z)
- PolyBuilding: Polygon Transformer for End-to-End Building Extraction [9.196604757138825]
PolyBuilding predicts vector representation of buildings from remote sensing images.
Given a set of polygon queries, the model learns the relations among them and encodes context information from the image to predict the final set of building polygons.
It also achieves a new state-of-the-art in terms of pixel-level coverage, instance-level precision and recall, and geometry-level properties.
arXiv Detail & Related papers (2022-11-03T04:53:17Z)
- Towards General-Purpose Representation Learning of Polygonal Geometries [62.34832826705641]
We develop a general-purpose polygon encoding model, which can encode a polygonal geometry into an embedding space.
We conduct experiments on two tasks: 1) shape classification based on MNIST; 2) spatial relation prediction based on two new datasets - DBSR-46K and DBSR-cplx46K.
Our results show that NUFTspec and ResNet1D outperform multiple existing baselines with significant margins.
arXiv Detail & Related papers (2022-09-29T15:59:23Z)
- PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation [59.70430570779819]
We introduce a data-driven shape completion approach that focuses on completing geometric details of missing regions of 3D shapes.
Our key insight is to copy and deform patches from the partial input to complete missing regions.
We leverage repeating patterns by retrieving patches from the partial input, and learn global structural priors by using a neural network to guide the retrieval and deformation steps.
arXiv Detail & Related papers (2022-07-24T18:59:09Z)
- PolyWorld: Polygonal Building Extraction with Graph Neural Networks in Satellite Images [10.661430927191205]
This paper introduces PolyWorld, a neural network that directly extracts building vertices from an image and connects them correctly to create precise polygons.
PolyWorld significantly outperforms the state-of-the-art in building polygonization.
arXiv Detail & Related papers (2021-11-30T15:23:17Z)
- PolyNet: Polynomial Neural Network for 3D Shape Recognition with PolyShape Representation [51.147664305955495]
3D shape representation and its processing have substantial effects on 3D shape recognition.
We propose a deep neural network-based method (PolyNet) and a specific polygon representation (PolyShape).
Our experiments demonstrate the strength and the advantages of PolyNet on both 3D shape classification and retrieval tasks.
arXiv Detail & Related papers (2021-10-15T06:45:59Z)
- Voxel-based Network for Shape Completion by Leveraging Edge Generation [76.23436070605348]
We develop a voxel-based network for point cloud completion by leveraging edge generation (VE-PCN).
We first embed point clouds into regular voxel grids, and then generate complete objects with the help of the hallucinated shape edges.
This decoupled architecture together with a multi-scale grid feature learning is able to generate more realistic on-surface details.
arXiv Detail & Related papers (2021-08-23T05:10:29Z)
- Cascaded Refinement Network for Point Cloud Completion with Self-supervision [74.80746431691938]
We introduce a two-branch network for shape completion.
The first branch is a cascaded shape completion sub-network to synthesize complete objects.
The second branch is an auto-encoder to reconstruct the original partial input.
arXiv Detail & Related papers (2020-10-17T04:56:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.