TS-SatMVSNet: Slope Aware Height Estimation for Large-Scale Earth Terrain Multi-view Stereo
- URL: http://arxiv.org/abs/2501.01049v1
- Date: Thu, 02 Jan 2025 04:18:40 GMT
- Title: TS-SatMVSNet: Slope Aware Height Estimation for Large-Scale Earth Terrain Multi-view Stereo
- Authors: Song Zhang, Zhiwei Wei, Wenjia Xu, Lili Zhang, Yang Wang, Jinming Zhang, Junyi Liu
- Abstract summary: 3D terrain reconstruction with remote sensing imagery achieves cost-effective and large-scale earth observation. We propose an end-to-end slope-aware height estimation network named TS-SatMVSNet for large-scale remote sensing terrain reconstruction. To fully integrate slope information into the MVS pipeline, we design two slope-guided modules to enhance reconstruction outcomes at both micro and macro levels.
- Score: 19.509863059288037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D terrain reconstruction with remote sensing imagery achieves cost-effective and large-scale earth observation and is crucial for safeguarding against natural disasters, monitoring ecological changes, and preserving the environment. Recently, learning-based multi-view stereo (MVS) methods have shown promise in this task. However, these methods simply modify the general learning-based MVS framework for height estimation, which overlooks terrain characteristics and results in insufficient accuracy. Considering that the Earth's surface generally undulates with no drastic changes and can be measured by slope, integrating slope considerations into MVS frameworks could enhance the accuracy of terrain reconstructions. To this end, we propose an end-to-end slope-aware height estimation network named TS-SatMVSNet for large-scale remote sensing terrain reconstruction. To effectively obtain the slope representation, drawing from mathematical gradient concepts, we propose a height-based slope calculation strategy that first calculates a slope map from a height map to measure the terrain undulation. To fully integrate slope information into the MVS pipeline, we separately design two slope-guided modules to enhance reconstruction outcomes at both micro and macro levels. Specifically, at the micro level, we design a slope-guided interval partition module for refined height estimation using slope values. At the macro level, we propose a height correction module that uses a learnable Gaussian smoothing operator to amend inaccurate height values. Additionally, to enhance the efficacy of height estimation, we propose a slope direction loss for implicitly optimizing height estimation results. Extensive experiments on the WHU-TLC dataset and MVS3D dataset show that our proposed method achieves state-of-the-art performance and demonstrates competitive generalization ability.
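The abstract says a slope map is computed from a height map using gradient concepts but does not give the formula. The following is a minimal sketch of one common way to do that, assuming finite differences on a regular grid. The function name `slope_map`, the `cell_size` parameter, and the choice of central differences are illustrative assumptions, not the paper's actual method.

```python
import math


def slope_map(height, cell_size=1.0):
    """Return per-cell slope magnitude (rise over run) for a 2D height grid.

    Hypothetical helper: central differences in the interior, one-sided
    differences at the borders. `height` is a list of equal-length rows;
    `cell_size` is the assumed ground spacing between neighbouring cells.
    """
    rows, cols = len(height), len(height[0])

    def diff(get, i, n):
        # Finite difference along one axis at index i out of n samples.
        if n == 1:
            return 0.0
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        return (get(hi) - get(lo)) / ((hi - lo) * cell_size)

    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            dz_dx = diff(lambda k: height[r][k], c, cols)  # along a row
            dz_dy = diff(lambda k: height[k][c], r, rows)  # along a column
            # Slope magnitude is the norm of the height gradient.
            row.append(math.hypot(dz_dx, dz_dy))
        out.append(row)
    return out


# A plane rising 2 height units per cell along x has slope 2 everywhere.
plane = [[2.0 * c for c in range(4)] for _ in range(3)]
slopes = slope_map(plane)
```

Taking `math.atan` of each value would give a slope angle in radians, and `math.atan2(dz_dy, dz_dx)` would give a direction, which is the kind of quantity a slope direction loss could compare. How the paper defines either is not stated in the abstract.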
Related papers
- Altitude-Aware Visual Place Recognition in Top-Down View [1.888468773682976]
This study proposes an altitude-adaptive VPR approach that integrates ground feature density analysis with image classification techniques. The proposed method estimates airborne platforms' relative altitude by analyzing the density of ground features in images. Under significant altitude variations, incorporating our relative altitude estimation module into the VPR retrieval pipeline boosts average R@1 and R@5 by 29.85% and 60.20%, respectively.
arXiv Detail & Related papers (2026-02-27T10:15:15Z) - Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications [2.9457242478147503]
We introduce Gamma-from-Mono (GfM), a lightweight monocular geometry estimation method. GfM predicts a dominant road surface plane together with residual variations expressed by gamma. With only the camera's height above ground, GfM deterministically recovers metric depth via a closed form.
arXiv Detail & Related papers (2025-12-03T22:37:38Z) - PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-Forward Planar Splatting [56.188624157291024]
We introduce PLANA3R, a pose-free framework for metric Planar 3D Reconstruction from unposed two-view images. Unlike prior feedforward methods that require 3D plane annotations during training, PLANA3R learns planar 3D structures without explicit plane supervision. We validate PLANA3R on multiple indoor-scene datasets with metric supervision and demonstrate strong generalization to out-of-domain indoor environments.
arXiv Detail & Related papers (2025-10-21T15:15:33Z) - Loc$^2$: Interpretable Cross-View Localization via Depth-Lifted Local Feature Matching [80.57282092735991]
We propose an accurate and interpretable fine-grained cross-view localization method. It estimates the 3 Degrees of Freedom (DoF) pose of a ground-level image by matching its local features with a reference aerial image. Experiments show state-of-the-art accuracy in challenging scenarios such as cross-area testing and unknown orientation.
arXiv Detail & Related papers (2025-09-11T18:52:16Z) - SC-Lane: Slope-aware and Consistent Road Height Estimation Framework for 3D Lane Detection [6.35342543540348]
We introduce SC-Lane, a novel slope-aware and temporally consistent heightmap estimation framework for 3D lane detection. SC-Lane adaptively determines the fusion of slope-specific height features, improving robustness to diverse road geometries. Extensive experiments on the OpenLane benchmark demonstrate that SC-Lane significantly improves both height estimation and 3D lane detection.
arXiv Detail & Related papers (2025-08-14T07:34:56Z) - Dual-Level Precision Edges Guided Multi-View Stereo with Accurate Planarization [3.597821311597427]
Multi-view stereo (MVS) reconstruction of low-textured areas is a prominent research focus. Traditional MVS methods often encounter issues such as crossing object boundaries and limited perception ranges. We introduce dual-level precision edge information, including fine and coarse edges, to enhance the robustness of plane model construction. Our method achieves state-of-the-art performance on the ETH3D and Tanks & Temples benchmarks.
arXiv Detail & Related papers (2024-12-29T02:54:01Z) - Tomographic SAR Reconstruction for Forest Height Estimation [4.1942958779358674]
Tree height estimation serves as an important proxy for biomass estimation in ecological and forestry applications.
In this study, we use deep learning to estimate forest canopy height directly from 2D Single Look Complex (SLC) images, a derivative of Synthetic Aperture Radar (SAR).
Our method attempts to bypass traditional tomographic signal processing, potentially reducing latency from SAR capture to end product.
arXiv Detail & Related papers (2024-12-01T17:37:25Z) - HeightLane: BEV Heightmap guided 3D Lane Detection [6.940660861207046]
Accurate 3D lane detection from monocular images presents significant challenges due to depth ambiguity and imperfect ground modeling.
Our study introduces HeightLane, an innovative method that predicts a height map from monocular images by creating anchors based on a multi-slope assumption.
HeightLane achieves state-of-the-art performance in terms of F-score, highlighting its potential in real-world applications.
arXiv Detail & Related papers (2024-08-15T17:14:57Z) - Estimating Canopy Height at Scale [15.744009072839425]
We propose a framework for global-scale canopy height estimation based on satellite data.
A comparison between predictions and ground-truth labels yields an MAE / RMSE of 2.43 / 4.73 (meters) overall and 4.45 / 6.72 (meters) for trees taller than five meters.
arXiv Detail & Related papers (2024-06-03T07:53:38Z) - 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z) - Volumetric Semantically Consistent 3D Panoptic Mapping [77.13446499924977]
We introduce an online 2D-to-3D semantic instance mapping algorithm aimed at generating semantic 3D maps suitable for autonomous agents in unstructured environments.
It introduces novel ways of integrating semantic prediction confidence during mapping, producing semantic and instance-consistent 3D regions.
The proposed method achieves accuracy superior to the state of the art on public large-scale datasets, improving on a number of widely used metrics.
arXiv Detail & Related papers (2023-09-26T08:03:10Z) - Semi-supervised Learning from Street-View Images and OpenStreetMap for
Automatic Building Height Estimation [59.6553058160943]
We propose a semi-supervised learning (SSL) method of automatically estimating building height from Mapillary SVI and OpenStreetMap data.
The proposed method leads to a clear performance boost in estimating building heights, with a Mean Absolute Error (MAE) of around 2.1 meters.
The preliminary result is promising and motivates our future work in scaling up the proposed method based on low-cost VGI data.
arXiv Detail & Related papers (2023-07-05T18:16:30Z) - Multi-View Stereo Representation Revisit: Region-Aware MVSNet [8.264851594332677]
Deep learning-based multi-view stereo has emerged as a powerful paradigm for reconstructing complete, geometrically detailed objects from multiple views.
We propose RA-MVSNet to take advantage of point-to-surface distance so that the model is able to perceive a wider range of surfaces.
Our proposed RA-MVSNet is patch-aware, since the perception range is enhanced by associating hypothetical planes with a patch of surface.
arXiv Detail & Related papers (2023-04-26T15:17:51Z) - Towards 3D Scene Reconstruction from Locally Scale-Aligned Monocular Video Depth [90.33296913575818]
In some video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift residing in per-frame predictions may cause depth inconsistency.
We propose a locally weighted linear regression method to recover the scale and shift with very sparse anchor points.
Our method can boost the performance of existing state-of-the-art approaches by up to 50% over several zero-shot benchmarks.
arXiv Detail & Related papers (2022-02-03T08:52:54Z) - TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework.
For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of alignments.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z) - Unsupervised Scale-consistent Depth Learning from Video [131.3074342883371]
We propose a monocular depth estimator SC-Depth, which requires only unlabelled videos for training.
Thanks to the capability of scale-consistent prediction, we show that our monocular-trained deep networks are readily integrated into the ORB-SLAM2 system.
The proposed hybrid Pseudo-RGBD SLAM shows compelling results in KITTI, and it generalizes well to the KAIST dataset without additional training.
arXiv Detail & Related papers (2021-05-25T02:17:56Z) - OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved, lightweight deep neural networks for omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.