Detecting Cloud Presence in Satellite Images Using the RGB-based CLIP Vision-Language Model
- URL: http://arxiv.org/abs/2308.00541v1
- Date: Tue, 1 Aug 2023 13:36:46 GMT
- Title: Detecting Cloud Presence in Satellite Images Using the RGB-based CLIP Vision-Language Model
- Authors: Mikolaj Czerkawski, Robert Atkinson, Christos Tachtatzis
- Abstract summary: This work explores the capabilities of the pre-trained CLIP vision-language model to identify satellite images affected by clouds.
Several approaches to using the model to perform cloud presence detection are proposed and evaluated.
Results demonstrate that the representations learned by the CLIP model can be useful for satellite image processing tasks involving clouds.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work explores the capabilities of the pre-trained CLIP vision-language model
to identify satellite images affected by clouds. Several approaches to using
the model to perform cloud presence detection are proposed and evaluated,
including a purely zero-shot operation with text prompts and several
fine-tuning approaches. Furthermore, the transferability of the methods across
different datasets and sensor types (Sentinel-2 and Landsat-8) is tested. The
results show that CLIP can achieve non-trivial performance on the cloud presence
detection task with apparent capability to generalise across sensing modalities
and sensing bands. It is also found that a low-cost fine-tuning stage leads to
a strong increase in true negative rate. The results demonstrate that the
representations learned by the CLIP model can be useful for satellite image
processing tasks involving clouds.
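The zero-shot mode described in the abstract amounts to CLIP-style scoring: embed the image and two text prompts (one describing a cloudy scene, one a clear scene), then softmax the cosine similarities. The sketch below shows only that scoring step with plain NumPy; the prompt wordings, the temperature value, and the dummy embeddings are illustrative assumptions, not the paper's actual setup, and in practice the embeddings would come from CLIP's image and text encoders.

```python
import numpy as np

def cloud_presence_score(image_emb, cloudy_emb, clear_emb, temperature=100.0):
    """CLIP-style zero-shot scoring: softmax over cosine similarities
    between one image embedding and two text-prompt embeddings.
    Returns the probability assigned to the 'cloudy' prompt."""
    def unit(v):
        return v / np.linalg.norm(v)
    sims = np.array([
        unit(image_emb) @ unit(cloudy_emb),
        unit(image_emb) @ unit(clear_emb),
    ])
    logits = temperature * sims
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return probs[0]

# Dummy embeddings standing in for CLIP encoder outputs, e.g. for prompts
# like "a satellite photo with clouds" vs. "a clear satellite photo"
# (hypothetical wordings, not taken from the paper):
rng = np.random.default_rng(0)
img = rng.normal(size=512)
cloudy = img + 0.1 * rng.normal(size=512)   # nearly aligned with the image
clear = rng.normal(size=512)                # unrelated direction
p = cloud_presence_score(img, cloudy, clear)
```

Because the "cloudy" embedding here is constructed to be nearly parallel to the image embedding, the score is close to 1; thresholding this probability gives the binary cloud-presence decision the paper evaluates.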
Related papers
- Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images [22.054023867495722]
Cloud segmentation is a critical challenge in remote sensing image interpretation.
We present a parameter-efficient adaptive approach, termed Cloud-Adapter, to enhance the accuracy and robustness of cloud segmentation.
arXiv Detail & Related papers (2024-11-20T08:37:39Z)
- Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection [58.228940066769596]
We introduce a Dual-Image Enhanced CLIP approach, leveraging a joint vision-language scoring system.
Our methods process pairs of images, utilizing each as a visual reference for the other, thereby enriching the inference process with visual context.
Our approach significantly exploits the potential of vision-language joint anomaly detection and demonstrates comparable performance with current SOTA methods across various datasets.
arXiv Detail & Related papers (2024-05-08T03:13:20Z)
- Learning to detect cloud and snow in remote sensing images from noisy labels [26.61590605351686]
The complexity of scenes and the diversity of cloud types in remote sensing images result in many inaccurate labels.
This paper is the first to consider the impact of label noise on the detection of clouds and snow in remote sensing images.
arXiv Detail & Related papers (2024-01-17T03:02:31Z)
- CLiSA: A Hierarchical Hybrid Transformer Model using Orthogonal Cross Attention for Satellite Image Cloud Segmentation [5.178465447325005]
Deep learning algorithms have emerged as a promising approach to solve image segmentation problems.
In this paper, we introduce a deep-learning model for effective cloud mask generation named CLiSA - Cloud segmentation via Lipschitz Stable Attention network.
We demonstrate both qualitative and quantitative outcomes for multiple satellite image datasets including Landsat-8, Sentinel-2, and Cartosat-2s.
arXiv Detail & Related papers (2023-11-29T09:31:31Z)
- DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images [27.02507384522271]
This paper presents a novel framework called DiffCR, which leverages conditional guided diffusion with deep convolutional networks for high-performance cloud removal for optical satellite imagery.
We introduce a decoupled encoder for conditional image feature extraction, providing a robust color representation to ensure the close similarity of appearance information between the conditional input and the synthesized output.
arXiv Detail & Related papers (2023-08-08T17:34:28Z)
- Exploring the Application of Large-scale Pre-trained Models on Adverse Weather Removal [97.53040662243768]
We propose a CLIP embedding module to make the network handle different weather conditions adaptively.
This module integrates the sample specific weather prior extracted by CLIP image encoder together with the distribution specific information learned by a set of parameters.
arXiv Detail & Related papers (2023-06-15T10:06:13Z)
- Semantic Segmentation of Radar Detections using Convolutions on Point Clouds [59.45414406974091]
We introduce a deep-learning based method to convolve radar detections into point clouds.
We adapt this algorithm to radar-specific properties through distance-dependent clustering and pre-processing of input point clouds.
Our network outperforms state-of-the-art approaches that are based on PointNet++ on the task of semantic segmentation of radar point clouds.
arXiv Detail & Related papers (2023-05-22T07:09:35Z)
- Ponder: Point Cloud Pre-training via Neural Rendering [93.34522605321514]
We propose a novel approach to self-supervised learning of point cloud representations by differentiable neural encoders.
The learned point-cloud representations can be easily integrated into various downstream tasks, including not only high-level tasks like 3D detection and segmentation, but also low-level tasks like 3D reconstruction and image rendering.
arXiv Detail & Related papers (2022-12-31T08:58:39Z)
- SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE).
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
arXiv Detail & Related papers (2022-07-17T01:35:29Z)
- City-scale Scene Change Detection using Point Clouds [71.73273007900717]
We propose a method for detecting structural changes in a city using images captured from mounted cameras over two different times.
A direct comparison of the two point clouds for change detection is not ideal due to inaccurate geo-location information.
To circumvent this problem, we propose a deep learning-based non-rigid registration on the point clouds.
Experiments show that our method is able to detect scene changes effectively, even in the presence of viewpoint and illumination differences.
arXiv Detail & Related papers (2021-03-26T08:04:13Z)
- Single Image Cloud Detection via Multi-Image Fusion [23.641624507709274]
A primary challenge in developing algorithms is the cost of collecting annotated training data.
We demonstrate how recent advances in multi-image fusion can be leveraged to bootstrap single image cloud detection.
We collect a large dataset of Sentinel-2 images along with a per-pixel semantic labelling for land cover.
arXiv Detail & Related papers (2020-07-29T22:52:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.