Enhancing Monocular Height Estimation from Aerial Images with
Street-view Images
- URL: http://arxiv.org/abs/2311.02121v1
- Date: Fri, 3 Nov 2023 06:43:32 GMT
- Title: Enhancing Monocular Height Estimation from Aerial Images with
Street-view Images
- Authors: Xiaomou Hou, Wanshui Gan and Naoto Yokoya
- Abstract summary: We propose a method that enhances monocular height estimation by incorporating street-view images.
Specifically, we aim to optimize an implicit 3D scene representation, density field, with geometry constraints from street-view images.
- Score: 14.555228716999737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate height estimation from monocular aerial imagery presents a
significant challenge due to its inherently ill-posed nature. This limitation
is rooted in the absence of adequate geometric constraints available to the
model when training with monocular imagery. Without additional geometric
information to supplement the monocular image data, the model's ability to
provide reliable estimations is compromised.
In this paper, we propose a method that enhances monocular height estimation
by incorporating street-view images. Our insight is that street-view images
provide a distinct viewing perspective and rich structural details of the
scene, serving as geometric constraints to enhance the performance of monocular
height estimation. Specifically, we aim to optimize an implicit 3D scene
representation, density field, with geometry constraints from street-view
images, thereby improving the accuracy and robustness of height estimation. Our
experimental results demonstrate the effectiveness of our proposed method,
outperforming the baseline and offering significant improvements in terms of
accuracy and structural consistency.
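The density-field idea in the abstract can be illustrated with a minimal volume-rendering sketch: given densities sampled along vertical (nadir) rays, the expected ray-termination height gives a per-pixel surface height. This is an illustrative reconstruction under assumed conventions (voxel grid layout, uniform sampling, function names), not the authors' implementation.

```python
import numpy as np

def expected_height_from_density(density, z_vals):
    """Render an expected height map from a density field.

    density: (H, W, D) non-negative densities sampled along vertical
             (nadir) rays, top of the volume first.
    z_vals:  (D,) uniformly spaced sample heights, descending.
    """
    dz = abs(float(z_vals[1] - z_vals[0]))      # uniform sample spacing
    alpha = 1.0 - np.exp(-density * dz)         # per-sample opacity
    # Transmittance: probability the ray survives all earlier samples.
    trans = np.cumprod(1.0 - alpha + 1e-10, axis=-1)
    trans = np.concatenate(
        [np.ones_like(trans[..., :1]), trans[..., :-1]], axis=-1)
    weights = alpha * trans                     # ray-termination probability
    return (weights * z_vals).sum(axis=-1)      # expected hit height
```

In the paper's setting, the street-view images would supervise the same density field along oblique ground-level rays; only the aerial (nadir) rendering is sketched here.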
Related papers
- Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation [9.032563775151074]
Monocular depth estimation is a key technique for 3D perception in computer vision.
It faces significant challenges in real-world scenarios, including adverse weather, motion blur, and poorly lit night-time scenes.
We devise a novel approach to reduce over-reliance on local textures, enhancing robustness against missing or interfering patterns.
arXiv Detail & Related papers (2024-10-09T15:20:29Z) - SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model [63.685132323224124]
Controllable spherical panoramic image generation holds substantial applicative potential across a variety of domains.
In this paper, we introduce a novel framework of SphereDiffusion to address these unique challenges.
Experiments on the Structured3D dataset show that SphereDiffusion significantly improves the quality of controllable spherical image generation, reducing FID by around 35% (relative) on average.
arXiv Detail & Related papers (2024-03-15T06:26:46Z) - RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering
Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z) - GUPNet++: Geometry Uncertainty Propagation Network for Monocular 3D
Object Detection [95.8940731298518]
We propose a novel Geometry Uncertainty Propagation Network (GUPNet++).
It models how uncertainty propagates through the geometric projection during training, improving the stability and efficiency of end-to-end model learning.
Experiments show that the proposed approach not only achieves state-of-the-art (SOTA) performance in image-based monocular 3D detection but also demonstrates superior efficacy with a simplified framework.
arXiv Detail & Related papers (2023-10-24T08:45:15Z) - Single-View Height Estimation with Conditional Diffusion Probabilistic
Models [1.8782750537161614]
In this paper we experiment with conditional denoising diffusion probabilistic models (DDPMs) for height estimation from a single remotely sensed image.
We train a generative diffusion model to learn the joint distribution of optical and DSM images as a Markov chain.
This is accomplished by minimizing a denoising score matching objective while conditioning on the source image to generate realistic high-resolution 3D surfaces.
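The denoising score matching objective summarized above can be sketched as the standard DDPM forward-noising step: the DSM is corrupted to a random timestep and a denoiser (conditioned on the optical image, omitted here) is trained to predict the injected noise. Names, shapes, and the schedule below are illustrative assumptions, not this paper's code.

```python
import numpy as np

def ddpm_noising_pair(dsm, t, alphas_cumprod, rng):
    """Forward diffusion step: returns the noisy DSM at timestep t and
    the injected noise, i.e. the (input, target) pair for the denoiser.
    The denoiser would additionally be conditioned on the optical image.
    """
    eps = rng.standard_normal(np.shape(dsm))        # injected Gaussian noise
    a_bar = alphas_cumprod[t]                       # cumulative alpha schedule
    x_t = np.sqrt(a_bar) * dsm + np.sqrt(1.0 - a_bar) * eps
    return x_t, eps
```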
arXiv Detail & Related papers (2023-04-26T00:37:05Z) - Wide-angle Image Rectification: A Survey [86.36118799330802]
Wide-angle images contain distortions that violate the assumptions underlying pinhole camera models.
Image rectification, which aims to correct these distortions, addresses this problem.
We present a detailed description and discussion of the camera models used in different approaches.
Next, we review both traditional geometry-based image rectification methods and deep learning-based methods.
arXiv Detail & Related papers (2020-10-30T17:28:40Z) - Self-Supervised Learning for Monocular Depth Estimation from Aerial
Imagery [0.20072624123275526]
We present a method for self-supervised learning for monocular depth estimation from aerial imagery.
For this, we only use an image sequence from a single moving camera and learn to simultaneously estimate depth and pose information.
By sharing the weights between pose and depth estimation, we achieve a relatively small model, which favors real-time application.
arXiv Detail & Related papers (2020-08-17T12:20:46Z) - Single View Metrology in the Wild [94.7005246862618]
We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground.
Our method relies on data-driven priors learned by a deep network specifically designed to imbibe weakly supervised constraints from the interplay of the unknown camera with 3D entities such as object heights.
We demonstrate state-of-the-art qualitative and quantitative results on several datasets as well as applications including virtual object insertion.
arXiv Detail & Related papers (2020-07-18T22:31:33Z) - D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual
Odometry [57.5549733585324]
D3VO is a novel framework for monocular visual odometry that exploits deep networks on three levels -- deep depth, pose and uncertainty estimation.
We first propose a novel self-supervised monocular depth estimation network trained on stereo videos without any external supervision.
We model the photometric uncertainties of pixels on the input images, which improves the depth estimation accuracy.
arXiv Detail & Related papers (2020-03-02T17:47:13Z)
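D3VO's photometric-uncertainty idea can be sketched as an uncertainty-weighted residual loss: pixels whose photometric error is predicted to be unreliable are down-weighted, while a log term keeps the network from predicting large uncertainty everywhere. The function below is an illustrative approximation of that formulation, not the paper's exact loss.

```python
import numpy as np

def uncertain_photometric_loss(residual, log_sigma):
    """Per-pixel photometric loss weighted by predicted uncertainty.

    residual:  photometric error between the warped and target image.
    log_sigma: network-predicted log uncertainty for each pixel.
    """
    # exp(-log_sigma) shrinks the residual term where uncertainty is
    # high; the additive log_sigma penalizes blanket high uncertainty.
    return np.abs(residual) * np.exp(-log_sigma) + log_sigma
```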
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.