Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI
- URL: http://arxiv.org/abs/2311.14024v3
- Date: Fri, 15 Mar 2024 05:27:42 GMT
- Title: Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI
- Authors: Aleksis Pirinen, Nosheen Abid, Nuria Agues Paszkowsky, Thomas Ohlson Timoudas, Ronald Scheirer, Chiara Ceccobello, György Kovács, Anders Persson,
- Abstract summary: Cloud formations often obscure optical satellite-based monitoring of the Earth's surface.
We propose a novel synthetic dataset for cloud optical thickness estimation.
We leverage for obtaining reliable and versatile cloud masks on real data.
- Score: 3.4764766275808583
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cloud formations often obscure optical satellite-based monitoring of the Earth's surface, thus limiting Earth observation (EO) activities such as land cover mapping, ocean color analysis, and cropland monitoring. The integration of machine learning (ML) methods within the remote sensing domain has significantly improved performance on a wide range of EO tasks, including cloud detection and filtering, but there is still much room for improvement. A key bottleneck is that ML methods typically depend on large amounts of annotated data for training, which is often difficult to come by in EO contexts. This is especially true when it comes to cloud optical thickness (COT) estimation. A reliable estimation of COT enables more fine-grained and application-dependent control compared to using pre-specified cloud categories, as is commonly done in practice. To alleviate the COT data scarcity problem, in this work we propose a novel synthetic dataset for COT estimation, that we subsequently leverage for obtaining reliable and versatile cloud masks on real data. In our dataset, top-of-atmosphere radiances have been simulated for 12 of the spectral bands of the Multispectral Imagery (MSI) sensor onboard Sentinel-2 platforms. These data points have been simulated under consideration of different cloud types, COTs, and ground surface and atmospheric profiles. Extensive experimentation of training several ML models to predict COT from the measured reflectivity of the spectral bands demonstrates the usefulness of our proposed dataset. In particular, by thresholding COT estimates from our ML models, we show on two satellite image datasets (one that is publicly available, and one which we have collected and annotated) that reliable cloud masks can be obtained. The synthetic data, the collected real dataset, code and models have been made publicly available at https://github.com/aleksispi/ml-cloud-opt-thick.
Related papers
- PGCS: Physical Law embedded Generative Cloud Synthesis in Remote Sensing Images [9.655563155560658]
Physical law embedded generative cloud synthesis method (PGCS) is proposed to generate diverse realistic cloud images to enhance real data.
Two cloud correction methods are developed from PGCS and exhibits a superior performance compared to state-of-the-art methods in the cloud correction task.
arXiv Detail & Related papers (2024-10-22T12:36:03Z) - Quanv4EO: Empowering Earth Observation by means of Quanvolutional Neural Networks [62.12107686529827]
This article highlights a significant shift towards leveraging quantum computing techniques in processing large volumes of remote sensing data.
The proposed Quanv4EO model introduces a quanvolution method for preprocessing multi-dimensional EO data.
Key findings suggest that the proposed model not only maintains high precision in image classification but also shows improvements of around 5% in EO use cases.
arXiv Detail & Related papers (2024-07-24T09:11:34Z) - M3LEO: A Multi-Modal, Multi-Label Earth Observation Dataset Integrating Interferometric SAR and Multispectral Data [1.4053129774629076]
M3LEO is a multi-modal, multi-label Earth observation dataset.
It spans approximately 17M 4x4 km data chips from six diverse geographic regions.
arXiv Detail & Related papers (2024-06-06T16:30:41Z) - Rethinking Transformers Pre-training for Multi-Spectral Satellite
Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amount of unlabelled data.
In this paper, we re-visit transformers pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z) - Pushing the Limits of Pre-training for Time Series Forecasting in the
CloudOps Domain [54.67888148566323]
We introduce three large-scale time series forecasting datasets from the cloud operations domain.
We show it is a strong zero-shot baseline and benefits from further scaling, both in model and dataset size.
Accompanying these datasets and results is a suite of comprehensive benchmark results comparing classical and deep learning baselines to our pre-trained method.
arXiv Detail & Related papers (2023-10-08T08:09:51Z) - Learning to Simulate Realistic LiDARs [66.7519667383175]
We introduce a pipeline for data-driven simulation of a realistic LiDAR sensor.
We show that our model can learn to encode realistic effects such as dropped points on transparent surfaces.
We use our technique to learn models of two distinct LiDAR sensors and use them to improve simulated LiDAR data accordingly.
arXiv Detail & Related papers (2022-09-22T13:12:54Z) - Learning Digital Terrain Models from Point Clouds: ALS2DTM Dataset and
Rasterization-based GAN [6.267267911687337]
This paper collects from open sources a large-scale dataset of ALS point clouds and corresponding DTMs with various urban, forested, and mountainous scenes.
A baseline method is proposed as the first attempt to train a Deep neural network to extract digital Terrain models directly from ALS point clouds.
Extensive studies with well-established methods are performed to benchmark the dataset and analyze the challenges in learning to extract DTM from point clouds.
arXiv Detail & Related papers (2022-06-08T09:50:48Z) - A Recommender System-Inspired Cloud Data Filling Scheme for
Satellite-based Coastal Observation [0.0]
This study is inspired by the success of data imputation methods in recommender systems that are designed for online shopping.
A numerical experiment was designed and conducted for a LandSat dataset with a range of synthetic cloud covers.
The recommender system-inspired matrix factorization algorithm called Funk-SVD showed superior performance in computational accuracy and efficiency.
arXiv Detail & Related papers (2021-11-27T18:26:11Z) - Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in
Adverse Weather [92.84066576636914]
This work addresses the challenging task of LiDAR-based 3D object detection in foggy weather.
We tackle this problem by simulating physically accurate fog into clear-weather scenes.
We are the first to provide strong 3D object detection baselines on the Seeing Through Fog dataset.
arXiv Detail & Related papers (2021-08-11T14:37:54Z) - Lidar Light Scattering Augmentation (LISA): Physics-based Simulation of
Adverse Weather Conditions for 3D Object Detection [60.89616629421904]
Lidar-based object detectors are critical parts of the 3D perception pipeline in autonomous navigation systems such as self-driving cars.
They are sensitive to adverse weather conditions such as rain, snow and fog due to reduced signal-to-noise ratio (SNR) and signal-to-background ratio (SBR)
arXiv Detail & Related papers (2021-07-14T21:10:47Z) - Multi-Sensor Data Fusion for Cloud Removal in Global and All-Season
Sentinel-2 Imagery [15.459106705735376]
The majority of optical observations acquired via spaceborne earth imagery are affected by clouds.
We target the challenge of generalization by curating a large novel data set for training new cloud removal approaches.
Based on the observation that cloud coverage varies widely between clear skies and absolute coverage, we propose a novel model that can deal with either extremes.
arXiv Detail & Related papers (2020-09-16T13:40:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.