Prompt2DEM: High-Resolution DEMs for Urban and Open Environments from Global Prompts Using a Monocular Foundation Model
- URL: http://arxiv.org/abs/2507.09681v2
- Date: Mon, 21 Jul 2025 13:29:28 GMT
- Title: Prompt2DEM: High-Resolution DEMs for Urban and Open Environments from Global Prompts Using a Monocular Foundation Model
- Authors: Osher Rafaeli, Tal Svoray, Ariel Nahlieli
- Abstract summary: High-resolution elevation estimations are essential to understand catchment and hillslope hydrology, study urban morphology and dynamics, and monitor the growth, decline, and mortality of terrestrial ecosystems. We present here a framework for the estimation of high-resolution DEMs as a new paradigm for absolute global elevation mapping.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: High-resolution elevation estimations are essential to understand catchment and hillslope hydrology, study urban morphology and dynamics, and monitor the growth, decline, and mortality of terrestrial ecosystems. Various deep learning approaches (e.g., super-resolution techniques, monocular depth estimation) have been developed to create high-resolution Digital Elevation Models (DEMs). However, super-resolution techniques are limited by the upscaling factor, and monocular depth estimation lacks global elevation context, making its conversion to a seamless DEM restricted. The recently introduced technique of prompt-based monocular depth estimation has opened new opportunities to extract estimates of absolute elevation in a global context. We present here a framework for the estimation of high-resolution DEMs as a new paradigm for absolute global elevation mapping. It is exemplified using low-resolution Shuttle Radar Topography Mission (SRTM) elevation data as prompts and high-resolution RGB imagery from the National Agriculture Imagery Program (NAIP). The approach fine-tunes a vision transformer encoder with LiDAR-derived DEMs and employs a versatile prompting strategy, enabling tasks such as DEM estimation, void filling, and updating. Our framework achieves a 100x resolution gain (from 30-m to 30-cm), surpassing prior methods by an order of magnitude. Evaluations across three diverse U.S. landscapes show robust generalization, capturing urban structures and fine-scale terrain features with < 5 m MAE relative to LiDAR, improving over SRTM by up to 18%. Hydrological analysis confirms suitability for hazard and environmental studies. We demonstrate scalability by applying the framework to large regions in the U.S. and Israel. All code and pretrained models are publicly available at: https://osherr1996.github.io/prompt2dem_propage/.
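The headline figures in the abstract (a 100x resolution gain, MAE relative to LiDAR, and a percentage improvement over SRTM) can be illustrated with a minimal sketch. The function names and the tiny arrays below are hypothetical and synthetic; the paper's actual evaluation protocol is not described in this listing, so this only shows how such metrics are commonly computed.

```python
import numpy as np


def resolution_gain(coarse_m: float, fine_m: float) -> float:
    """Linear ratio between coarse and fine ground sampling distances."""
    return coarse_m / fine_m


def mae(pred: np.ndarray, ref: np.ndarray) -> float:
    """Mean absolute error, skipping cells where either raster is NaN (voids)."""
    valid = ~(np.isnan(pred) | np.isnan(ref))
    return float(np.mean(np.abs(pred[valid] - ref[valid])))


def relative_improvement(baseline_mae: float, model_mae: float) -> float:
    """Fractional reduction in MAE relative to a baseline (0.2 means 20%)."""
    return (baseline_mae - model_mae) / baseline_mae


if __name__ == "__main__":
    # Synthetic elevations in metres, for illustration only (not paper data).
    lidar = np.array([[10.0, 12.0], [14.0, np.nan]])  # reference with one void
    srtm_like = lidar + 5.0   # hypothetical coarse baseline, resampled to the grid
    estimate = lidar + 4.0    # hypothetical model output

    print("resolution gain:", resolution_gain(30.0, 0.30))
    print("baseline MAE (m):", mae(srtm_like, lidar))
    print("estimate MAE (m):", mae(estimate, lidar))
    print("improvement:", relative_improvement(mae(srtm_like, lidar), mae(estimate, lidar)))
```

The gain here is a linear ratio of pixel sizes (30 m to 0.30 m); masking NaN cells keeps voids in either raster from contaminating the error.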
Related papers
- Digital Elevation Model Estimation from RGB Satellite Imagery using Generative Deep Learning [1.0207955314209534]
This study proposes an approach to generate DEM from freely available RGB satellite imagery using generative deep learning. We first developed a global dataset consisting of 12K RGB-DEM pairs using Landsat satellite imagery and NASA's SRTM digital elevation data. A unique preprocessing pipeline was implemented to select high-quality, cloud-free regions and aggregate normalized RGB composites from Landsat imagery.
arXiv Detail & Related papers (2025-11-26T23:50:00Z)
- UnLoc: Leveraging Depth Uncertainties for Floorplan Localization [80.55849461031879]
UnLoc is an efficient data-driven solution for sequential camera localization within floorplans. We introduce a novel probabilistic model that incorporates uncertainty estimation, modeling depth predictions as explicit probability distributions. We evaluate UnLoc on large-scale synthetic and real-world datasets, demonstrating significant improvements in terms of accuracy and robustness.
arXiv Detail & Related papers (2025-09-14T14:45:43Z)
- ROVR-Open-Dataset: A Large-Scale Depth Dataset for Autonomous Driving [62.9051914830949]
We present ROVR, a large-scale, diverse, and cost-efficient depth dataset designed to capture the complexity of real-world driving. A lightweight acquisition pipeline ensures scalable collection, while sparse but statistically sufficient ground truth supports robust training. Benchmarking with state-of-the-art monocular depth models reveals severe cross-dataset generalization failures.
arXiv Detail & Related papers (2025-08-19T16:13:49Z)
- Data Augmentation and Resolution Enhancement using GANs and Diffusion Models for Tree Segmentation [49.13393683126712]
Urban forests play a key role in enhancing environmental quality and supporting biodiversity in cities. Accurately detecting trees is challenging due to complex landscapes and the variability in image resolution caused by different satellite sensors or UAV flight altitudes. We propose a novel pipeline that integrates domain adaptation with GANs and Diffusion models to enhance the quality of low-resolution aerial images.
arXiv Detail & Related papers (2025-05-21T03:57:10Z)
- VRS-UIE: Value-Driven Reordering Scanning for Underwater Image Enhancement [104.78586859995333]
State Space Models (SSMs) have emerged as a promising backbone for vision tasks due to their linear complexity and global receptive field. The predominance of large-portion, homogeneous but useless oceanic backgrounds can dilute the feature representation responses of sparse yet valuable targets. We propose a novel Value-Driven Reordering Scanning framework for Underwater Image Enhancement (UIE). Our framework sets a new state-of-the-art, delivering superior enhancement performance (surpassing WMamba by 0.89 dB on average) by effectively suppressing water bias and preserving structural and color fidelity.
arXiv Detail & Related papers (2025-05-02T12:21:44Z)
- Multi-view Reconstruction via SfM-guided Monocular Depth Estimation [92.89227629434316]
We present a new method for multi-view geometric reconstruction. We incorporate SfM information, a strong multi-view prior, into the depth estimation process. Our method significantly improves the quality of depth estimation compared to previous monocular depth estimation works.
arXiv Detail & Related papers (2025-03-18T17:54:06Z)
- Tomographic SAR Reconstruction for Forest Height Estimation [4.1942958779358674]
Tree height estimation serves as an important proxy for biomass estimation in ecological and forestry applications. In this study, we use deep learning to estimate forest canopy height directly from 2D Single Look Complex (SLC) images, a derivative of Synthetic Aperture Radar (SAR). Our method attempts to bypass traditional tomographic signal processing, potentially reducing latency from SAR capture to end product.
arXiv Detail & Related papers (2024-12-01T17:37:25Z)
- MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation [9.639797094021988]
MetricGold is a novel approach that harnesses the rich priors of generative diffusion models to improve metric depth estimation. Our experiments demonstrate robust generalization across diverse datasets, producing sharper and higher quality metric depth estimates.
arXiv Detail & Related papers (2024-11-16T20:59:01Z)
- TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs [5.6168844664788855]
This work presents TanDepth, a practical scale recovery method for obtaining metric depth results from relative estimations at inference-time. Our method leverages sparse measurements from Global Digital Elevation Models (GDEM) by projecting them to the camera view. An adaptation to the Cloth Filter Simulation is presented, which allows selecting ground points from the estimated depth map to then correlate with the projected reference points.
arXiv Detail & Related papers (2024-09-08T15:54:43Z)
- MambaDS: Near-Surface Meteorological Field Downscaling with Topography Constrained Selective State Space Modeling [68.69647625472464]
Downscaling, a crucial task in meteorological forecasting, enables the reconstruction of high-resolution meteorological states for target regions.
Previous downscaling methods lacked tailored designs for meteorology and encountered structural limitations.
We propose a novel model called MambaDS, which enhances the utilization of multivariable correlations and topography information.
arXiv Detail & Related papers (2024-08-20T13:45:49Z)
- Monocular Visual-Inertial Depth Estimation [66.71452943981558]
We present a visual-inertial depth estimation pipeline that integrates monocular depth estimation and visual-inertial odometry.
Our approach performs global scale and shift alignment against sparse metric depth, followed by learning-based dense alignment.
We evaluate on the TartanAir and VOID datasets, observing up to 30% reduction in RMSE with dense scale alignment.
arXiv Detail & Related papers (2023-03-21T18:47:34Z)
- DARF: Depth-Aware Generalizable Neural Radiance Field [51.29437249009986]
We propose the Depth-Aware Generalizable Neural Radiance Field (DARF) with a Depth-Aware Dynamic Sampling (DADS) strategy. Our framework infers the unseen scenes on both pixel level and geometry level with only a few input images. Compared with state-of-the-art generalizable NeRF methods, DARF reduces samples by 50%, while improving rendering quality and depth estimation.
arXiv Detail & Related papers (2022-12-05T14:00:59Z)
- Guided deep learning by subaperture decomposition: ocean patterns from SAR imagery [36.922471841100176]
Sentinel 1 SAR wave mode vignettes have made it possible to capture many important oceanic and atmospheric phenomena since 2014.
In this study, we propose to apply subaperture decomposition as a preprocessing stage for SAR deep learning models.
arXiv Detail & Related papers (2022-04-09T09:49:05Z)
- Improving Monocular Visual Odometry Using Learned Depth [84.05081552443693]
We propose a framework to exploit monocular depth estimation for improving visual odometry (VO).
The core of our framework is a monocular depth estimation module with a strong generalization capability for diverse scenes.
Compared with current learning-based VO methods, our method demonstrates a stronger generalization ability to diverse scenes.
arXiv Detail & Related papers (2022-04-04T06:26:46Z)
- Flood Extent Mapping based on High Resolution Aerial Imagery and DEM: A Hidden Markov Tree Approach [10.72081512622396]
This paper evaluates the proposed geographical hidden Markov tree model through case studies on high-resolution aerial imagery.
Three scenes are selected in heavily vegetated floodplains near the cities of Grimesland and Kinston in North Carolina during Hurricane Matthew floods in 2016.
Results show that the proposed hidden Markov tree model outperforms several state-of-the-art machine learning algorithms.
arXiv Detail & Related papers (2020-08-25T18:35:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.