Altitude-Aware Visual Place Recognition in Top-Down View
- URL: http://arxiv.org/abs/2602.23872v1
- Date: Fri, 27 Feb 2026 10:15:15 GMT
- Title: Altitude-Aware Visual Place Recognition in Top-Down View
- Authors: Xingyu Shao, Mengfan He, Chunyu Li, Liangzheng Sun, Ziyang Meng,
- Abstract summary: This study proposes an altitude-adaptive VPR approach that integrates ground feature density analysis with image classification techniques. The proposed method estimates airborne platforms' relative altitude by analyzing the density of ground features in images. Under significant altitude variations, incorporating our relative altitude estimation module into the VPR retrieval pipeline boosts average R@1 and R@5 by 29.85% and 60.20%, respectively.
- Score: 1.888468773682976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To address the challenge of aerial visual place recognition (VPR) under significant altitude variations, this study proposes an altitude-adaptive VPR approach that integrates ground feature density analysis with image classification techniques. The proposed method estimates airborne platforms' relative altitude by analyzing the density of ground features in images, then applies relative altitude-based cropping to generate canonical query images, which are subsequently used in a classification-based VPR strategy for localization. Extensive experiments across diverse terrains and altitude conditions demonstrate that the proposed approach achieves high accuracy and robustness in both altitude estimation and VPR under significant altitude changes. Compared to conventional methods relying on barometric altimeters or Time-of-Flight (ToF) sensors, this solution requires no additional hardware and is plug-and-play for downstream applications, making it suitable for small- and medium-sized airborne platforms operating in diverse environments, including rural and urban areas. Under significant altitude variations, incorporating our relative altitude estimation module into the VPR retrieval pipeline boosts average R@1 and R@5 by 29.85% and 60.20%, respectively, compared with applying VPR retrieval alone. Furthermore, compared to traditional Monocular Metric Depth Estimation (MMDE) methods, the proposed method reduces the mean error by 202.1 m, yielding average additional improvements of 31.4% in R@1 and 44% in R@5. These results demonstrate that our method establishes a robust, vision-only framework for three-dimensional visual place recognition, offering a practical and scalable solution for accurate localization of airborne platforms under large altitude variations and limited sensor availability.
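The relative altitude-based cropping step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a nadir (straight-down) pinhole camera over roughly flat ground, so that the ground footprint scales linearly with altitude, and it takes the relative altitude as a hypothetical ratio `query_altitude / reference_altitude`. The function names `canonical_crop` and `resize_nearest` are illustrative only.

```python
import numpy as np


def canonical_crop(image: np.ndarray, relative_altitude: float) -> np.ndarray:
    """Center-crop a top-down image to the footprint seen at the reference altitude.

    relative_altitude is an assumed ratio query_altitude / reference_altitude.
    Under the flat-ground nadir assumption, the footprint side length grows
    linearly with altitude, so 1 / relative_altitude of each side is kept.
    """
    if relative_altitude <= 0:
        raise ValueError("relative_altitude must be positive")
    h, w = image.shape[:2]
    # Cropping cannot enlarge the footprint, so ratios below 1 keep the full image.
    keep = min(1.0, 1.0 / relative_altitude)
    crop_h = max(1, round(h * keep))
    crop_w = max(1, round(w * keep))
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return image[top:top + crop_h, left:left + crop_w]


def resize_nearest(image: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbor resize so the crop matches the size a retrieval model expects."""
    rows = np.arange(out_h) * image.shape[0] // out_h
    cols = np.arange(out_w) * image.shape[1] // out_w
    return image[rows[:, None], cols]


if __name__ == "__main__":
    query = np.zeros((480, 640, 3), dtype=np.uint8)
    # Query taken at twice the reference altitude: keep the central half of each side.
    crop = canonical_crop(query, relative_altitude=2.0)
    canonical = resize_nearest(crop, 480, 640)
    print(crop.shape, canonical.shape)
```

In this sketch a query captured below the reference altitude is returned uncropped, since cropping alone cannot recover ground area outside the frame; how the paper handles that case is not stated in the abstract.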
Related papers
- Maritime Small Object Detection from UAVs using Deep Learning with Altitude-Aware Dynamic Tiling [2.944925363991407]
Small objects often remain difficult to detect from high altitudes due to low object-to-background pixel ratios. We propose an altitude-aware dynamic tiling method that scales and adaptively subdivides the image into tiles for enhanced small object detection.
arXiv Detail & Related papers (2025-11-24T21:45:01Z)
- Loc$^2$: Interpretable Cross-View Localization via Depth-Lifted Local Feature Matching [80.57282092735991]
We propose an accurate and interpretable fine-grained cross-view localization method. It estimates the 3 Degrees of Freedom (DoF) pose of a ground-level image by matching its local features with a reference aerial image. Experiments show state-of-the-art accuracy in challenging scenarios such as cross-area testing and unknown orientation.
arXiv Detail & Related papers (2025-09-11T18:52:16Z)
- Baltimore Atlas: FreqWeaver Adapter for Semi-supervised Ultra-high Spatial Resolution Land Cover Classification [9.706130801069143]
Land Cover Classification identifies land cover types on sub-meter remote imagery. Most existing methods focus on 1 m imagery and rely heavily on large-scale annotations. We introduce Baltimore Atlas, a generalization; land cover classification framework that reduces reliance on large-scale training data.
arXiv Detail & Related papers (2025-06-18T15:41:29Z)
- Aerial Multi-View Stereo via Adaptive Depth Range Inference and Normal Cues [38.954104931025704]
We propose an Adaptive Depth Range MVS (ADR-MVS) to improve multi-view depth estimation accuracy. ADR-MVS generates adaptive range maps from depth and normal estimates using cross-attention discrepancy learning. Experimental results demonstrate that ADR-MVS achieves state-of-the-art performance on the WHU, LuoJia-MVS, and München datasets.
arXiv Detail & Related papers (2025-06-06T01:14:55Z)
- TS-SatMVSNet: Slope Aware Height Estimation for Large-Scale Earth Terrain Multi-view Stereo [19.509863059288037]
3D terrain reconstruction with remote sensing imagery achieves cost-effective and large-scale earth observation. We propose an end-to-end slope-aware height estimation network named TS-SatMVSNet for large-scale remote sensing terrain reconstruction. To fully integrate slope information into the MVS pipeline, we design two slope-guided modules to enhance reconstruction outcomes at both micro and macro levels.
arXiv Detail & Related papers (2025-01-02T04:18:40Z)
- Observation-Guided Meteorological Field Downscaling at Station Scale: A Benchmark and a New Method [66.80344502790231]
We extend meteorological downscaling to arbitrary scattered station scales and establish a new benchmark and dataset.
Inspired by data assimilation techniques, we integrate observational data into the downscaling process, providing multi-scale observational priors.
Our proposed method outperforms other specially designed baseline models on multiple surface variables.
arXiv Detail & Related papers (2024-01-22T14:02:56Z)
- Hi-Map: Hierarchical Factorized Radiance Field for High-Fidelity Monocular Dense Mapping [51.739466714312805]
We introduce Hi-Map, a novel monocular dense mapping approach based on Neural Radiance Field (NeRF).
Hi-Map is exceptional in its capacity to achieve efficient and high-fidelity mapping using only posed RGB inputs.
arXiv Detail & Related papers (2024-01-06T12:32:25Z)
- View Consistent Purification for Accurate Cross-View Localization [59.48131378244399]
This paper proposes a fine-grained self-localization method for outdoor robotics.
The proposed method addresses limitations in existing cross-view localization methods.
It is the first sparse visual-only method that enhances perception in dynamic environments.
arXiv Detail & Related papers (2023-08-16T02:51:52Z)
- Vision Transformers, a new approach for high-resolution and large-scale mapping of canopy heights [50.52704854147297]
We present a new vision transformer (ViT) model optimized with a classification (discrete) and a continuous loss function.
This model achieves better accuracy than previously used convolutional based approaches (ConvNets) optimized with only a continuous loss function.
arXiv Detail & Related papers (2023-04-22T22:39:03Z)
- Very high resolution canopy height maps from RGB imagery using self-supervised vision transformer and convolutional decoder trained on Aerial Lidar [14.07306593230776]
This paper presents the first high-resolution canopy height map concurrently produced for multiple sub-national jurisdictions.
The maps are generated by the extraction of features from a self-supervised model trained on Maxar imagery from 2017 to 2020.
We also introduce a post-processing step using a convolutional network trained on GEDI observations.
arXiv Detail & Related papers (2023-04-14T15:52:57Z)
- A Large Scale Homography Benchmark [52.55694707744518]
We present a large-scale dataset of Planes in 3D, Pi3D, of roughly 1000 planes observed in 10 000 images from the 1DSfM dataset.
We also present HEB, a large-scale homography estimation benchmark leveraging Pi3D.
arXiv Detail & Related papers (2023-02-20T14:18:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.