HiT: Building Mapping with Hierarchical Transformers
- URL: http://arxiv.org/abs/2309.09643v2
- Date: Wed, 10 Jan 2024 09:50:20 GMT
- Title: HiT: Building Mapping with Hierarchical Transformers
- Authors: Mingming Zhang, Qingjie Liu, Yunhong Wang
- Abstract summary: We propose a simple and novel building mapping method with Hierarchical Transformers, called HiT.
HiT builds on a two-stage detection architecture by adding a polygon head parallel to classification and bounding box regression heads.
Our method achieves a new state-of-the-art in terms of instance segmentation and polygonal metrics compared with state-of-the-art methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning-based methods have been extensively explored for automatic
building mapping from high-resolution remote sensing images over recent years.
While most building mapping models produce vector polygons of buildings for
geographic and mapping systems, dominant methods typically decompose polygonal
building extraction into several sub-problems, including segmentation,
polygonization, and regularization, leading to complex inference procedures,
low accuracy, and poor generalization. In this paper, we propose a simple and
novel building mapping method with Hierarchical Transformers, called HiT,
improving polygonal building mapping quality from high-resolution remote
sensing images. HiT builds on a two-stage detection architecture by adding a
polygon head parallel to the classification and bounding box regression heads. HiT
simultaneously outputs building bounding boxes and vector polygons, and is
fully end-to-end trainable. The polygon head formulates a building polygon as
serialized vertices with a bidirectional characteristic, a simple and elegant
polygon representation that avoids the start- or end-vertex hypothesis. Under this
new perspective, the polygon head adopts a transformer encoder-decoder
architecture to predict serialized vertices supervised by the designed
bidirectional polygon loss. Furthermore, a hierarchical attention mechanism
combined with convolution operation is introduced in the encoder of the polygon
head, providing more geometric structures of building polygons at vertex and
edge levels. Comprehensive experiments on two benchmarks (the CrowdAI and Inria
datasets) demonstrate that our method achieves a new state of the art in
instance segmentation and polygonal metrics, surpassing prior
methods. Moreover, qualitative results verify the superiority and effectiveness
of our model under complex scenes.
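The bidirectional, start-free vertex serialization described above can be illustrated with a small sketch. This is not the authors' implementation: the function name, the use of a mean L1 distance, and the brute-force search over cyclic shifts and both traversal directions are assumptions made for illustration, capturing only the idea that a polygon matches its target regardless of which vertex the sequence starts from or which direction it is traversed in.

```python
import numpy as np

def bidirectional_polygon_loss(pred, target):
    """Hypothetical sketch of a bidirectional polygon loss.

    The target vertex sequence is compared with the prediction under
    every cyclic start vertex and both traversal directions, and the
    minimum mean L1 distance is returned, so no fixed start or end
    vertex hypothesis is imposed on the serialization.
    """
    pred = np.asarray(pred, dtype=float)      # (N, 2) predicted vertices
    target = np.asarray(target, dtype=float)  # (N, 2) ground-truth vertices
    n = len(target)
    best = np.inf
    for direction in (target, target[::-1]):  # forward and reversed traversal
        for shift in range(n):                # every cyclic start vertex
            candidate = np.roll(direction, shift, axis=0)
            best = min(best, np.abs(pred - candidate).mean())
    return float(best)

# A square matched against a cyclically shifted copy of itself
# incurs zero loss under this formulation.
square = [[0, 0], [0, 1], [1, 1], [1, 0]]
shifted = [[1, 0], [0, 0], [0, 1], [1, 1]]
print(bidirectional_polygon_loss(square, shifted))  # 0.0
```

In a training setting this minimum would be taken inside the loss so that gradients flow through whichever alignment matches best; the brute-force search here is for clarity, not efficiency.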
Related papers
- SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes
We present a scheme to directly generate manifold, polygonal meshes of complex connectivity as the output of a neural network.
Our key innovation is to define a continuous latent connectivity space at each mesh, which implies the discrete mesh.
In applications, this approach not only yields high-quality outputs from generative models, but also enables directly learning challenging geometry processing tasks such as mesh repair.
arXiv Detail & Related papers (2024-09-30T17:59:03Z)
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract priors from transformers well-trained on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- P2PFormer: A Primitive-to-polygon Method for Regular Building Contour Extraction from Remote Sensing Images
Existing methods struggle with irregular contours, rounded corners, and redundancy points.
We introduce a novel, streamlined pipeline that generates regular building contours without post-processing.
P2PFormer achieves new state-of-the-art performance on the WHU, CrowdAI, and WHU-Mix datasets.
arXiv Detail & Related papers (2024-06-05T04:38:45Z)
- PolyBuilding: Polygon Transformer for End-to-End Building Extraction
PolyBuilding predicts vector representation of buildings from remote sensing images.
The model learns the relations among polygon queries and encodes context information from the image to predict the final set of building polygons.
It also achieves a new state-of-the-art in terms of pixel-level coverage, instance-level precision and recall, and geometry-level properties.
arXiv Detail & Related papers (2022-11-03T04:53:17Z)
- Towards General-Purpose Representation Learning of Polygonal Geometries
We develop a general-purpose polygon encoding model, which can encode a polygonal geometry into an embedding space.
We conduct experiments on two tasks: 1) shape classification based on MNIST; 2) spatial relation prediction based on two new datasets - DBSR-46K and DBSR-cplx46K.
Our results show that NUFTspec and ResNet1D outperform multiple existing baselines with significant margins.
arXiv Detail & Related papers (2022-09-29T15:59:23Z)
- PolyWorld: Polygonal Building Extraction with Graph Neural Networks in Satellite Images
This paper introduces PolyWorld, a neural network that directly extracts building vertices from an image and connects them correctly to create precise polygons.
PolyWorld significantly outperforms the state-of-the-art in building polygonization.
arXiv Detail & Related papers (2021-11-30T15:23:17Z)
- Automated LoD-2 Model Reconstruction from Very-High-Resolution Satellite-derived Digital Surface Model and Orthophoto
We propose a model-driven method that reconstructs LoD-2 building models following a "decomposition-optimization-fitting" paradigm.
Our proposed method has addressed a few technical caveats over existing methods, resulting in practically high-quality results.
arXiv Detail & Related papers (2021-09-08T19:03:09Z)
- Machine-learned 3D Building Vectorization from Satellite Imagery
We propose a machine learning based approach for automatic 3D building reconstruction and vectorization.
Taking a single-channel photogrammetric digital surface model (DSM) and panchromatic (PAN) image as input, we first filter out non-building objects and refine the building shapes.
The refined DSM and the input PAN image are then used through a semantic segmentation network to detect edges and corners of building roofs.
arXiv Detail & Related papers (2021-04-13T19:57:30Z)
- DSG-Net: Learning Disentangled Structure and Geometry for 3D Shape Generation
We introduce DSG-Net, a deep neural network that learns a disentangled structured and geometric mesh representation for 3D shapes.
This supports a range of novel shape generation applications with disentangled control, such as varying structure while keeping geometry unchanged, and vice versa.
Our method not only supports controllable generation applications but also produces high-quality synthesized shapes, outperforming state-of-the-art methods.
arXiv Detail & Related papers (2020-08-12T17:06:51Z)
- Quantization in Relative Gradient Angle Domain For Building Polygon Estimation
CNN approaches often generate imprecise building morphologies including noisy edges and round corners.
We propose a module that uses prior knowledge of building corners to create angular and concise building polygons from CNN segmentation outputs.
Experimental results demonstrate that our method refines CNN output from a rounded approximation to a more clear-cut angular shape of the building footprint.
arXiv Detail & Related papers (2020-07-10T21:33:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.