Height estimation from single aerial images using a deep ordinal
regression network
- URL: http://arxiv.org/abs/2006.02801v1
- Date: Thu, 4 Jun 2020 12:03:51 GMT
- Title: Height estimation from single aerial images using a deep ordinal
regression network
- Authors: Xiang Li, Mingyang Wang, Yi Fang
- Abstract summary: We deal with the ambiguous and unsolved problem of height estimation from a single aerial image.
Driven by the success of deep learning, especially deep convolutional neural networks (CNNs), some studies have proposed estimating height information from a single aerial image.
In this paper, we propose dividing height values into spacing-increasing intervals and transforming the regression problem into an ordinal regression problem.
- Score: 12.991266182762597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the 3D geometric structure of the Earth's surface has been an
active research topic in the photogrammetry and remote sensing community for
decades, serving as an essential building block for various applications such
as 3D digital city modeling, change detection, and city management. Previous
work has extensively studied the problem of height estimation from
aerial images based on stereo or multi-view image matching. These methods
require two or more images from different perspectives to reconstruct 3D
coordinates with camera information provided. In this paper, we deal with the
ambiguous and unsolved problem of height estimation from a single aerial image.
Driven by the great success of deep learning, especially deep convolutional
neural networks (CNNs), some studies have proposed to estimate height
information from a single aerial image by training a deep CNN model with
large-scale annotated datasets. These methods treat height estimation as a
regression problem and directly use an encoder-decoder network to regress the
height values. In this paper, we propose to divide height values into
spacing-increasing intervals and transform the regression problem into an
ordinal regression problem, using an ordinal loss for network training. To
enable multi-scale feature extraction, we further incorporate an Atrous Spatial
Pyramid Pooling (ASPP) module to extract features from multiple dilated
convolution layers. After that, a post-processing technique is designed to
transform the predicted height map of each patch into a seamless height map.
Finally, we conduct extensive experiments on ISPRS Vaihingen and Potsdam
datasets. Experimental results demonstrate significantly better performance of
our method compared to the state-of-the-art methods.
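The discretization step described in the abstract can be illustrated with a minimal Python sketch. It assumes the spacing-increasing intervals are uniform in log space, which is one common choice; the abstract does not state the exact rule. The function names, the `shift` offset, and the 0 to 30 height range are hypothetical and chosen only for illustration.

```python
import numpy as np


def sid_edges(h_min, h_max, num_bins, shift=1.0):
    """Bin edges spaced uniformly in log space, so bin width grows with height.

    Assumption: the log-uniform rule and the `shift` offset are illustrative;
    the abstract only says the intervals are spacing-increasing.
    """
    lo, hi = np.log(h_min + shift), np.log(h_max + shift)
    return np.exp(np.linspace(lo, hi, num_bins + 1)) - shift


def ordinal_encode(heights, edges):
    """Map heights to K-1 binary targets; target k is 1 when the bin index exceeds k."""
    interior = edges[1:-1]                      # the K-1 thresholds between bins
    bin_index = np.digitize(heights, interior)  # integers in 0..K-1
    return (bin_index[..., None] > np.arange(len(interior))).astype(np.float32)


def ordinal_decode(targets, edges):
    """Recover one height per sample as the midpoint of the selected bin."""
    bin_index = (targets > 0.5).sum(axis=-1)
    return 0.5 * (edges[bin_index] + edges[bin_index + 1])


# Hypothetical example: 5 bins covering heights from 0 to 30.
edges = sid_edges(0.0, 30.0, num_bins=5)
targets = ordinal_encode(np.array([0.5, 12.0, 30.0]), edges)
heights = ordinal_decode(targets, edges)
```

In this encoding, a network would predict one probability per threshold, and an ordinal loss would compare those probabilities with the binary targets. Decoding counts the thresholds that are exceeded and returns the midpoint of the corresponding bin, so low heights are resolved more finely than high ones.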
Related papers
- MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps [51.44887282336391]
The key challenge of multi-view indoor 3D object detection is to infer accurate geometry information from images for precise 3D detection.
The previous method relies on NeRF for geometry reasoning.
We propose MVSDet which utilizes plane sweep for geometry-aware 3D object detection.
arXiv Detail & Related papers (2024-10-28T21:58:41Z)
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
- AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation [51.143540967290114]
We propose a method that unlocks a wide range of previously infeasible geometric augmentations for unsupervised depth completion and estimation.
This is achieved by reversing, or "undo"-ing, geometric transformations to the coordinates of the output depth, warping the depth map back to the original reference frame.
arXiv Detail & Related papers (2023-10-15T05:15:45Z)
- HeightFormer: A Multilevel Interaction and Image-adaptive
Classification-regression Network for Monocular Height Estimation with Aerial
Images [10.716933766055755]
This paper presents a comprehensive solution for monocular height estimation in remote sensing.
It features the Multilevel Interaction Backbone (MIB) and the Image-adaptive Classification-regression Height Generator (ICG).
The ICG dynamically generates a height partition for each image and reframes the traditional regression task.
arXiv Detail & Related papers (2023-10-12T02:49:00Z)
- Multi-tiling Neural Radiance Field (NeRF) -- Geometric Assessment on Large-scale Aerial Datasets [5.391764618878545]
In this paper, we aim to scale Neural Radiance Fields (NeRF) to large-scale aerial datasets.
Specifically, we introduce a location-specific sampling technique as well as a multi-camera tiling (MCT) strategy to reduce memory consumption.
We implement our method on a representative approach, Mip-NeRF, and compare its geometry performance with three photogrammetric MVS pipelines.
arXiv Detail & Related papers (2023-10-01T00:21:01Z)
- GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs [49.55919802779889]
We propose a Graph Convolution based Spatial Propagation Network (GraphCSPN) as a general approach for depth completion.
In this work, we leverage convolution neural networks as well as graph neural networks in a complementary way for geometric representation learning.
Our method achieves state-of-the-art performance, especially when only a few propagation steps are used.
arXiv Detail & Related papers (2022-10-19T17:56:03Z)
- Towards Model Generalization for Monocular 3D Object Detection [57.25828870799331]
We present an effective unified camera-generalized paradigm (CGP) for Mono3D object detection.
We also propose the 2D-3D geometry-consistent object scaling strategy (GCOS) to bridge the gap via an instance-level augment.
Our method, called DGMono3D, achieves remarkable performance on all evaluated datasets and surpasses the SoTA unsupervised domain adaptation scheme.
arXiv Detail & Related papers (2022-05-23T23:05:07Z)
- GCNDepth: Self-supervised Monocular Depth Estimation based on Graph
Convolutional Network [11.332580333969302]
This work brings a new solution with a set of improvements, which increase the quantitative and qualitative understanding of depth maps.
A graph convolutional network (GCN) can handle the convolution on non-Euclidean data and it can be applied to irregular image regions within a topological structure.
Our method provides comparable and promising results, with a high prediction accuracy of 89% on the publicly available KITTI and Make3D datasets.
arXiv Detail & Related papers (2021-12-13T16:46:25Z)
- Large-scale Building Height Retrieval from Single SAR Imagery based on
Bounding Box Regression Networks [21.788338971571736]
Building height retrieval from synthetic aperture radar (SAR) imagery is of great importance for urban applications.
This paper addresses the issue of building height retrieval in large-scale urban areas from a single TerraSAR-X spotlight or stripmap image.
arXiv Detail & Related papers (2021-11-18T00:39:48Z)
- Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z)
- Attention Aware Cost Volume Pyramid Based Multi-view Stereo Network for
3D Reconstruction [12.728154351588053]
We present an efficient multi-view stereo (MVS) network for 3D reconstruction from multi-view images.
We introduce a coarse-to-fine depth inference strategy to achieve high-resolution depth.
arXiv Detail & Related papers (2020-11-25T13:34:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.