Vision Transformers, a new approach for high-resolution and large-scale
mapping of canopy heights
- URL: http://arxiv.org/abs/2304.11487v1
- Date: Sat, 22 Apr 2023 22:39:03 GMT
- Authors: Ibrahim Fayad, Philippe Ciais, Martin Schwartz, Jean-Pierre Wigneron,
Nicolas Baghdadi, Aurélien de Truchis, Alexandre d'Aspremont, Frederic
Frappart, Sassan Saatchi, Agnes Pellissier-Tanon and Hassan Bazzi
- Abstract summary: We present a new vision transformer (ViT) model optimized concurrently with a classification (discrete) and a regression (continuous) loss function.
This model achieves better accuracy than previously used convolution-based approaches (ConvNets) optimized with only a continuous loss function.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate and timely monitoring of forest canopy heights is critical for
assessing forest dynamics, biodiversity, carbon sequestration as well as forest
degradation and deforestation. Recent advances in deep learning techniques,
coupled with the vast amount of spaceborne remote sensing data offer an
unprecedented opportunity to map canopy height at high spatial and temporal
resolutions. Current techniques for wall-to-wall canopy height mapping
correlate remotely sensed 2D information from optical and radar sensors to the
vertical structure of trees using LiDAR measurements. While studies using deep
learning algorithms have shown promising performances for the accurate mapping
of canopy heights, they have limitations due to the type of architectures and
loss functions employed. Moreover, mapping canopy heights over tropical forests
remains poorly studied, and the accurate height estimation of tall canopies is
a challenge due to signal saturation of optical and radar sensors, persistent
cloud cover, and the sometimes limited penetration capability of LiDAR.
Here, we map heights at 10 m resolution across the diverse landscape of Ghana
with a new vision transformer (ViT) model optimized concurrently with a
classification (discrete) and a regression (continuous) loss function. This
model achieves better accuracy than previously used convolution-based
approaches (ConvNets) optimized with only a continuous loss function. The ViT
model results show that our proposed discrete/continuous loss significantly
increases the sensitivity for very tall trees (i.e., > 35 m), for which other
approaches show saturation effects. The height maps generated by the ViT also
have a finer ground sampling distance and better sensitivity to sparse
vegetation than those from a convolutional model. Our ViT model has an RMSE of
3.12 m against a reference dataset, while the ConvNet model has an RMSE of
4.3 m.
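The concurrent discrete/continuous optimization described in the abstract can be sketched as a single objective combining a cross-entropy term over height bins with a mean-squared-error term on the continuous heights. The sketch below is a minimal NumPy illustration under our own assumptions: the function name `combined_height_loss`, the bin edges, and the equal weighting of the two terms are hypothetical, and the paper's actual prediction heads and bin definitions may differ.

```python
import numpy as np

def combined_height_loss(logits, pred_heights, true_heights, bin_edges, weight=1.0):
    """Combined discrete/continuous loss for canopy-height estimation.

    logits       : (N, K) unnormalized class scores over K height bins
    pred_heights : (N,) continuous height predictions in metres
    true_heights : (N,) reference (e.g., LiDAR) heights in metres
    bin_edges    : (K+1,) monotonically increasing edges of the height bins
    weight       : relative weight on the regression term
    """
    # Discretize the reference heights into bin labels for the
    # classification term.
    labels = np.clip(np.digitize(true_heights, bin_edges) - 1,
                     0, logits.shape[1] - 1)

    # Cross-entropy over height bins (log-softmax via log-sum-exp
    # for numerical stability).
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()

    # Mean-squared error on the continuous height predictions.
    mse = np.mean((pred_heights - true_heights) ** 2)

    return ce + weight * mse
```

The intuition behind the discrete term is that binning turns very tall canopies into their own classes, so the model is penalized for collapsing them toward the mean, which is one plausible way a combined loss counteracts the saturation effects reported for regression-only ConvNets.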
Related papers
- Comparing remote sensing-based forest biomass mapping approaches using new forest inventory plots in contrasting forests in northeastern and southwestern China [6.90293949599626]
Large-scale, high-spatial-resolution aboveground biomass (AGB) maps play a crucial role in determining forest carbon stocks and how they are changing.
GEDI is a sampling instrument that collects dispersed footprints, and its data must be combined with data from other continuous-cover satellites to create high-resolution maps.
We developed local models to estimate forest AGB from GEDI L2A data, as the models used to create the GEDI L4 AGB data incorporated minimal field data from China.
arXiv Detail & Related papers (2024-05-24T11:10:58Z)
- NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth Supervision for Indoor Multi-View 3D Detection [72.0098999512727]
NeRF-Det has achieved impressive performance in indoor multi-view 3D detection by utilizing NeRF to enhance representation learning.
We present three corresponding solutions: semantic enhancement, perspective-aware sampling, and ordinal depth supervision.
The resulting algorithm, NeRF-Det++, exhibits appealing performance on the ScanNetV2 and ARKitScenes datasets.
arXiv Detail & Related papers (2024-02-22T11:48:06Z)
- Accuracy and Consistency of Space-based Vegetation Height Maps for Forest Dynamics in Alpine Terrain [18.23260742076316]
The Swiss National Forest Inventory (NFI) provides countrywide vegetation height maps at a spatial resolution of 0.5 m.
This can be improved by using spaceborne remote sensing and deep learning to generate large-scale vegetation height maps.
We generate annual, countrywide vegetation height maps at a 10-meter ground sampling distance for the years 2017 to 2020 based on Sentinel-2 satellite imagery.
arXiv Detail & Related papers (2023-09-04T20:23:57Z)
- OCTraN: 3D Occupancy Convolutional Transformer Network in Unstructured Traffic Scenarios [0.0]
We propose OCTraN, a transformer architecture that uses iterative attention to convert 2D image features into 3D occupancy features.
We also develop a self-supervised training pipeline to generalize the model to any scene by eliminating the need for LiDAR ground truth.
arXiv Detail & Related papers (2023-07-20T15:06:44Z)
- MonoTDP: Twin Depth Perception for Monocular 3D Object Detection in Adverse Scenes [49.21187418886508]
This paper proposes a monocular 3D detection model designed to perceive twin depth in adverse scenes, termed MonoTDP.
We first introduce an adaptive learning strategy to help the model handle uncontrollable weather conditions, significantly resisting degradation caused by various degrading factors.
Then, to address the depth/content loss in adverse regions, we propose a novel twin depth perception module that simultaneously estimates scene and object depth.
arXiv Detail & Related papers (2023-05-18T13:42:02Z)
- Very high resolution canopy height maps from RGB imagery using self-supervised vision transformer and convolutional decoder trained on Aerial Lidar [14.07306593230776]
This paper presents the first high-resolution canopy height map concurrently produced for multiple sub-national jurisdictions.
The maps are generated by extracting features from a self-supervised model trained on Maxar imagery from 2017 to 2020.
We also introduce a post-processing step using a convolutional network trained on GEDI observations.
arXiv Detail & Related papers (2023-04-14T15:52:57Z)
- On Robust Cross-View Consistency in Self-Supervised Monocular Depth Estimation [56.97699793236174]
We study two kinds of robust cross-view consistency in this paper.
We exploit the temporal coherence in both depth feature space and 3D voxel space for self-supervised monocular depth estimation.
Experimental results on several outdoor benchmarks show that our method outperforms current state-of-the-art techniques.
arXiv Detail & Related papers (2022-09-19T03:46:13Z)
- Information fusion approach for biomass estimation in a plateau mountainous forest using a synergistic system comprising UAS-based digital camera and LiDAR [9.944631732226657]
The objective of this study was to quantify the aboveground biomass (AGB) of a plateau mountainous forest reserve.
We utilized digital aerial photogrammetry (DAP), which has the unique advantages of speed, high spatial resolution, and low cost.
Based on the CHM and spectral attributes obtained from multispectral images, we estimated and mapped the AGB of the region of interest with considerable cost efficiency.
arXiv Detail & Related papers (2022-04-14T04:04:59Z)
- A Multi-Stage model based on YOLOv3 for defect detection in PV panels based on IR and Visible Imaging by Unmanned Aerial Vehicle [65.99880594435643]
We propose a novel model to detect panel defects in aerial images captured by an unmanned aerial vehicle.
The model combines detections of panels and defects to refine its accuracy.
The proposed model has been validated on two large PV plants in the south of Italy.
arXiv Detail & Related papers (2021-11-23T08:04:32Z)
- Progressive Coordinate Transforms for Monocular 3D Object Detection [52.00071336733109]
We propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations.
arXiv Detail & Related papers (2021-08-12T15:22:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.