Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model
- URL: http://arxiv.org/abs/2404.19609v1
- Date: Tue, 30 Apr 2024 15:03:27 GMT
- Title: Seeing Through the Clouds: Cloud Gap Imputation with Prithvi Foundation Model
- Authors: Denys Godwin, Hanxi Li, Michael Cecil, Hamed Alemohammad
- Abstract summary: We compare a Vision Transformer (ViT) model with a baseline Conditional Generative Adversarial Network (CGAN) model for missing value imputation in time series of multispectral satellite imagery.
We randomly mask time series of satellite images using real-world cloud masks and train each model to reconstruct the missing pixels.
- Score: 1.2374541748245838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Filling cloudy pixels in multispectral satellite imagery is essential for accurate data analysis and downstream applications, especially for tasks which require time series data. To address this issue, we compare the performance of a foundational Vision Transformer (ViT) model with a baseline Conditional Generative Adversarial Network (CGAN) model for missing value imputation in time series of multispectral satellite imagery. We randomly mask time series of satellite images using real-world cloud masks and train each model to reconstruct the missing pixels. The ViT model is fine-tuned from a pretrained model, while the CGAN is trained from scratch. Using quantitative evaluation metrics such as structural similarity index and mean absolute error as well as qualitative visual analysis, we assess imputation accuracy and contextual preservation.
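The evaluation protocol in the abstract (mask time-series pixels with real cloud masks, reconstruct, then score with SSIM and mean absolute error) can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the `apply_cloud_mask`, `mae`, and `ssim_global` names are hypothetical, and a simplified single-window SSIM stands in for the usual sliding-window variant.

```python
import numpy as np

def apply_cloud_mask(series, mask, fill_value=0.0):
    """Corrupt a (time, bands, H, W) series: mask==1 marks cloudy pixels."""
    corrupted = series.copy()
    corrupted[mask.astype(bool)] = fill_value
    return corrupted

def mae(pred, target, mask):
    """Mean absolute error over the masked (imputed) pixels only."""
    m = mask.astype(bool)
    return float(np.abs(pred[m] - target[m]).mean())

def ssim_global(x, y, data_range=1.0):
    """Simplified SSIM computed over the whole image as one window."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return float(((2 * mx * my + c1) * (2 * cov + c2)) /
                 ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

# Example: a random 4-step, 6-band series with ~30% cloud cover.
rng = np.random.default_rng(0)
target = rng.random((4, 6, 64, 64))
mask = rng.random((4, 6, 64, 64)) < 0.3
corrupted = apply_cloud_mask(target, mask)
```

A model would be trained to map `corrupted` (plus `mask`) back to `target`; scoring its output against `target` with `mae` and `ssim_global` mirrors the quantitative evaluation described above.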
Related papers
- SatMamba: Development of Foundation Models for Remote Sensing Imagery Using State Space Models [0.0]
Foundation models refer to deep learning models pretrained on large unlabeled datasets through self-supervised algorithms.
Various foundation models have been developed for remote sensing, such as those for multispectral, high-resolution, and hyperspectral images.
This research proposes SatMamba, a new pretraining framework that combines masked autoencoders with State Space Models.
arXiv Detail & Related papers (2025-02-01T14:07:21Z)
- Improving Satellite Imagery Masking using Multi-task and Transfer Learning [13.987883100675438]
We present a collection of models offering different speed/accuracy trade-offs for masking.
Our models provide a 9% F1 score improvement compared to previous work on water pixel identification.
arXiv Detail & Related papers (2024-12-11T17:00:51Z)
- SatVision-TOA: A Geospatial Foundation Model for Coarse-Resolution All-Sky Remote Sensing Imagery [8.096413986108601]
We introduce SatVision-TOA, a novel foundation model pre-trained on 14-band MODIS L1B Top-Of-Atmosphere (TOA) radiance imagery.
The SatVision-TOA model is pre-trained using a Masked-Image-Modeling (MIM) framework and the SwinV2 architecture.
Results show that SatVision-TOA achieves superior performance over baseline methods on downstream tasks.
arXiv Detail & Related papers (2024-11-26T00:08:00Z)
- ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object [78.58860252442045]
We introduce a generative model as a data source for hard images that benchmark deep models' robustness.
We are able to generate images with more diversified backgrounds, textures, and materials than prior work; we term this benchmark ImageNet-D.
Our work suggests that diffusion models can be an effective source to test vision models.
arXiv Detail & Related papers (2024-03-27T17:23:39Z)
- DiffusionSat: A Generative Foundation Model for Satellite Imagery [63.2807119794691]
We present DiffusionSat, to date the largest generative foundation model trained on a collection of publicly available large, high-resolution remote sensing datasets.
Our method produces realistic samples and can be used to solve multiple generative tasks, including temporal generation, super-resolution given multi-spectral inputs, and in-painting.
arXiv Detail & Related papers (2023-12-06T16:53:17Z)
- CLiSA: A Hierarchical Hybrid Transformer Model using Orthogonal Cross Attention for Satellite Image Cloud Segmentation [5.178465447325005]
Deep learning algorithms have emerged as a promising approach to solving image segmentation problems.
In this paper, we introduce a deep-learning model for effective cloud mask generation named CLiSA - Cloud segmentation via Lipschitz Stable Attention network.
We demonstrate both qualitative and quantitative outcomes for multiple satellite image datasets including Landsat-8, Sentinel-2, and Cartosat-2s.
arXiv Detail & Related papers (2023-11-29T09:31:31Z)
- DeepDC: Deep Distance Correlation as a Perceptual Image Quality Evaluator [53.57431705309919]
ImageNet pre-trained deep neural networks (DNNs) show notable transferability for building effective image quality assessment (IQA) models.
We develop a novel full-reference IQA (FR-IQA) model based exclusively on pre-trained DNN features.
We conduct comprehensive experiments to demonstrate the superiority of the proposed quality model on five standard IQA datasets.
arXiv Detail & Related papers (2022-11-09T14:57:27Z)
- Efficient data-driven gap filling of satellite image time series using deep neural networks with partial convolutions [0.0]
This paper shows how three-dimensional partial convolutions can be used as layers in neural networks to fill gaps in satellite image time series.
To evaluate the approach we apply a U-Net-like model on incomplete time series of quasi-global carbon monoxide observations from the Sentinel-5P satellite.
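The core idea above, a convolution that renormalizes by the fraction of valid pixels under its kernel and shrinks the gap mask with each layer, can be sketched in 2D. This is a hedged single-channel NumPy illustration of the partial-convolution mechanism, not the paper's 3D U-Net implementation; the `partial_conv2d` name and its interface are assumptions.

```python
import numpy as np

def partial_conv2d(x, mask, kernel):
    """One partial-convolution step (2D, single channel, 'valid' padding).

    x: image with gaps; mask: 1 where observed, 0 where missing.
    Outputs are rescaled by (kernel size / valid pixel count) so that
    partially observed windows are not biased toward zero, and the
    returned mask marks every window with at least one valid pixel.
    """
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    new_mask = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            m = mask[i:i + kh, j:j + kw]
            valid = m.sum()
            if valid > 0:
                patch = x[i:i + kh, j:j + kw] * m
                out[i, j] = (kernel * patch).sum() * (kh * kw / valid)
                new_mask[i, j] = 1.0
    return out, new_mask
```

With an averaging kernel on a constant image, the output stays constant even where the window overlaps a gap, which is exactly the renormalization property that lets stacked partial convolutions fill holes layer by layer.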
arXiv Detail & Related papers (2022-08-18T11:32:04Z)
- CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z)
- A Comprehensive Study of Image Classification Model Sensitivity to Foregrounds, Backgrounds, and Visual Attributes [58.633364000258645]
We call this dataset RIVAL10; it consists of roughly 26k instances over 10 classes.
We evaluate the sensitivity of a broad set of models to noise corruptions in foregrounds, backgrounds and attributes.
In our analysis, we consider diverse state-of-the-art architectures (ResNets, Transformers) and training procedures (CLIP, SimCLR, DeiT, Adversarial Training).
arXiv Detail & Related papers (2022-01-26T06:31:28Z)
- Transferring and Regularizing Prediction for Semantic Segmentation [115.88957139226966]
In this paper, we exploit the intrinsic properties of semantic segmentation to alleviate this problem for model transfer.
We present a Regularizer of Prediction Transfer (RPT) that imposes the intrinsic properties as constraints to regularize model transfer in an unsupervised fashion.
Extensive experiments are conducted to verify the proposal of RPT on the transfer of models trained on GTA5 and SYNTHIA (synthetic data) to the Cityscapes dataset (urban street scenes).
arXiv Detail & Related papers (2020-06-11T16:19:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.