Geometry-Aware Segmentation of Remote Sensing Images via Implicit Height
Estimation
- URL: http://arxiv.org/abs/2006.05848v2
- Date: Tue, 22 Sep 2020 01:48:22 GMT
- Title: Geometry-Aware Segmentation of Remote Sensing Images via Implicit Height
Estimation
- Authors: Xiang Li, Lingjing Wang, Yi Fang
- Abstract summary: We introduce a geometry-aware segmentation model that achieves accurate semantic labeling of aerial images via joint height estimation.
We develop a new geometry-aware convolution module that fuses the 3D geometric features from the height decoder branch and the 2D contextual features from the semantic segmentation branch.
The proposed model achieves remarkable performance on both datasets without using any hand-crafted features or post-processing.
- Score: 15.900382629390297
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have shown the benefits of using additional elevation data
(e.g., DSM) for enhancing the performance of the semantic segmentation of
aerial images. However, previous methods mostly adopt 3D elevation information
as an additional input. In many real-world applications, the corresponding DSM
information is not available, and the spatial resolution of acquired DSM images
often does not match that of the aerial images. To alleviate this
data constraint and also take advantage of 3D elevation information, in this
paper, we introduce a geometry-aware segmentation model that achieves accurate
semantic labeling of aerial images via joint height estimation. Instead of
using a single-stream encoder-decoder network for semantic labeling, we design
a separate decoder branch to predict the height map and use the DSM images as
side supervision to train this newly designed decoder branch. In this way, our
model does not require DSM as model input and still benefits from the helpful
3D geometric information during training. Moreover, we develop a new
geometry-aware convolution module that fuses the 3D geometric features from the
height decoder branch and the 2D contextual features from the semantic
segmentation branch. The fused feature embeddings can produce geometry-aware
segmentation maps with enhanced performance. Our model is trained with DSM
images as side supervision, while in the inference stage, it does not require
DSM data and directly predicts the semantic labels in an end-to-end fashion.
Experiments on ISPRS Vaihingen and Potsdam datasets demonstrate the
effectiveness of the proposed method for the semantic segmentation of aerial
images. The proposed model achieves remarkable performance on both datasets
without using any hand-crafted features or post-processing.
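The abstract's two-branch design can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's actual architecture: all module names, channel widths, and the concatenation-based fusion are illustrative stand-ins for the proposed geometry-aware convolution module. The key structural points it shows are (1) a shared encoder feeding both a height decoder and a segmentation decoder, (2) DSM used only as a training-time side supervision target, and (3) height-branch features fused into the segmentation branch before classification.

```python
# Hypothetical sketch of the dual-decoder, side-supervised design described
# above (names, channel sizes, and the fusion rule are assumptions).
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class GeometryAwareSegNet(nn.Module):
    def __init__(self, n_classes=6):
        super().__init__()
        self.encoder = conv_block(3, 32)         # shared 2D encoder
        self.height_dec = conv_block(32, 16)     # 3D-geometry branch
        self.height_head = nn.Conv2d(16, 1, 1)   # predicts the height map
        self.seg_dec = conv_block(32, 16)        # 2D-context branch
        # Stand-in for the geometry-aware convolution module: fuse the
        # geometric and contextual features by concatenation + convolution.
        self.fuse = conv_block(16 + 16, 16)
        self.seg_head = nn.Conv2d(16, n_classes, 1)

    def forward(self, x):
        f = self.encoder(x)
        g = self.height_dec(f)
        height = self.height_head(g)             # side output, training only
        s = self.seg_dec(f)
        seg = self.seg_head(self.fuse(torch.cat([g, s], dim=1)))
        return seg, height

# Training combines segmentation loss with DSM side supervision; at
# inference only `seg` is used, so no DSM input is ever required.
model = GeometryAwareSegNet()
img = torch.randn(2, 3, 64, 64)
dsm = torch.randn(2, 1, 64, 64)          # DSM target, training time only
labels = torch.randint(0, 6, (2, 64, 64))
seg, height = model(img)
loss = nn.CrossEntropyLoss()(seg, labels) + nn.L1Loss()(height, dsm)
```

Note that the DSM tensor appears only in the loss, never in `forward`, which is what lets the trained model run on plain aerial images.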
Related papers
- DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z)
- 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
- SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation [53.83313235792596]
We present a new methodology for real-time semantic mapping from RGB-D sequences.
It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping.
Our system achieves state-of-the-art semantic mapping quality within 2D-3D networks-based systems.
arXiv Detail & Related papers (2023-06-28T22:36:44Z)
- PointMCD: Boosting Deep Point Cloud Encoders via Multi-view Cross-modal Distillation for 3D Shape Recognition [55.38462937452363]
We propose a unified multi-view cross-modal distillation architecture, including a pretrained deep image encoder as the teacher and a deep point encoder as the student.
By pair-wise aligning multi-view visual and geometric descriptors, we can obtain more powerful deep point encoders without exhausting and complicated network modification.
arXiv Detail & Related papers (2022-07-07T07:23:20Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize the 3D voxelization and 3D convolution network.
We propose a new framework for the outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- Machine-learned 3D Building Vectorization from Satellite Imagery [7.887221474814986]
We propose a machine learning based approach for automatic 3D building reconstruction and vectorization.
Taking a single-channel photogrammetric digital surface model (DSM) and panchromatic (PAN) image as input, we first filter out non-building objects and refine the building shapes.
The refined DSM and the input PAN image are then used through a semantic segmentation network to detect edges and corners of building roofs.
arXiv Detail & Related papers (2021-04-13T19:57:30Z)
- H3D: Benchmark on Semantic Segmentation of High-Resolution 3D Point Clouds and textured Meshes from UAV LiDAR and Multi-View-Stereo [4.263987603222371]
This paper introduces a 3D dataset which is unique in three ways.
It depicts the village of Hessigheim (Germany), henceforth referred to as H3D.
It is designed both to promote research in the field of 3D data analysis and to evaluate and rank emerging approaches.
arXiv Detail & Related papers (2021-02-10T09:33:48Z)
- Improving Point Cloud Semantic Segmentation by Learning 3D Object Detection [102.62963605429508]
Point cloud semantic segmentation plays an essential role in autonomous driving.
Current 3D semantic segmentation networks focus on convolutional architectures that perform well for well-represented classes.
We propose a novel Detection Aware 3D Semantic Segmentation (DASS) framework that explicitly leverages localization features from an auxiliary 3D object detection task.
arXiv Detail & Related papers (2020-09-22T14:17:40Z)
- Height estimation from single aerial images using a deep ordinal regression network [12.991266182762597]
We deal with the ambiguous and unsolved problem of height estimation from a single aerial image.
Driven by the success of deep learning, especially deep convolutional neural networks (CNNs), some studies have proposed to estimate height information from a single aerial image.
In this paper, we propose to divide height values into spacing-increasing intervals and transform the regression problem into an ordinal regression problem.
arXiv Detail & Related papers (2020-06-04T12:03:51Z)
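The spacing-increasing discretization in the last entry can be sketched as follows. This is an illustrative interpretation, not the paper's exact scheme: the bin count, height range, and the choice of log-uniform edges are assumptions. The idea is that bin edges placed uniformly in log space give narrow bins near the ground (where most pixels lie) and progressively wider bins at larger heights, turning continuous height regression into an ordinal classification over bin indices.

```python
# Sketch of spacing-increasing interval discretization for ordinal height
# regression (bin count and height range are illustrative choices).
import numpy as np

def sid_bin_edges(h_min, h_max, n_bins):
    """Bin edges uniform in log space, so bin width grows with height."""
    # Shift by 1 so that h_min may be 0.
    return np.exp(np.linspace(np.log(h_min + 1.0),
                              np.log(h_max + 1.0),
                              n_bins + 1)) - 1.0

def height_to_ordinal(h, edges):
    """Map continuous heights to ordinal bin indices 0..n_bins-1."""
    return np.clip(np.searchsorted(edges, h, side="right") - 1,
                   0, len(edges) - 2)

edges = sid_bin_edges(0.0, 100.0, 10)          # 10 bins over 0-100 m
heights = np.array([0.0, 5.0, 50.0, 100.0])    # e.g. heights in metres
labels = height_to_ordinal(heights, edges)     # ordinal class targets
```

A network then predicts a distribution over these ordinal labels instead of a raw height value, and a decoded height can be recovered from the predicted bin (e.g. its center).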
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.