Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images
- URL: http://arxiv.org/abs/2411.13127v2
- Date: Sat, 23 Nov 2024 16:55:16 GMT
- Title: Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images
- Authors: Xuechao Zou, Shun Zhang, Kai Li, Shiying Wang, Junliang Xing, Lei Jin, Congyan Lang, Pin Tao
- Abstract summary: Cloud segmentation is a critical challenge in remote sensing image interpretation.
We present a parameter-efficient adaptive approach, termed Cloud-Adapter, to enhance the accuracy and robustness of cloud segmentation.
- Score: 22.054023867495722
- Abstract: Cloud segmentation is a critical challenge in remote sensing image interpretation, as its accuracy directly impacts the effectiveness of subsequent data processing and analysis. Recently, vision foundation models (VFM) have demonstrated powerful generalization capabilities across various visual tasks. In this paper, we present a parameter-efficient adaptive approach, termed Cloud-Adapter, designed to enhance the accuracy and robustness of cloud segmentation. Our method leverages a VFM pretrained on general domain data, which remains frozen, eliminating the need for additional training. Cloud-Adapter incorporates a lightweight spatial perception module that initially utilizes a convolutional neural network (ConvNet) to extract dense spatial representations. These multi-scale features are then aggregated and serve as contextual inputs to an adapting module, which modulates the frozen transformer layers within the VFM. Experimental results demonstrate that the Cloud-Adapter approach, utilizing only 0.6% of the trainable parameters of the frozen backbone, achieves substantial performance gains. Cloud-Adapter consistently achieves state-of-the-art performance across various cloud segmentation datasets from multiple satellite sources, sensor series, data processing levels, land cover scenarios, and annotation granularities. We have released the code and model checkpoints at https://xavierjiezou.github.io/Cloud-Adapter/ to support further research.
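As a rough illustration of the adapter pattern the abstract describes (a frozen backbone layer whose output is modulated by a small trainable module fed with spatial context), here is a minimal NumPy sketch. The dimensions, the linear stand-ins for the ConvNet and transformer layer, and the multiplicative modulation are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen VFM "transformer layer": a fixed linear map standing in for a
# pretrained block whose weights are never updated.
D = 64
W_frozen = rng.normal(size=(D, D)) / np.sqrt(D)

# Lightweight trainable parts (stand-ins for the spatial perception
# module and the adapting module): a small bottleneck of width C.
C = 8
W_spatial = rng.normal(size=(D, C)) * 0.01  # context extraction
W_mod = rng.normal(size=(C, D)) * 0.01      # modulation head

def adapted_layer(x):
    """Frozen layer output, modulated by a context-dependent scale."""
    context = x @ W_spatial            # dense contextual features
    scale = 1.0 + context @ W_mod      # per-feature modulation signal
    return (x @ W_frozen) * scale      # frozen computation, adapted output

tokens = rng.normal(size=(16, D))      # 16 patch tokens of one image
out = adapted_layer(tokens)

# Only the bottleneck weights would be trained; the backbone stays frozen.
trainable = W_spatial.size + W_mod.size
frozen = W_frozen.size
```

With a real ViT backbone the frozen weights dominate by orders of magnitude, which is how a scheme like this can reach a trainable-parameter share as small as the 0.6% figure the paper reports.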
Related papers
- PGCS: Physical Law embedded Generative Cloud Synthesis in Remote Sensing Images [9.655563155560658]
A physical-law-embedded generative cloud synthesis method (PGCS) is proposed to generate diverse, realistic cloud images that augment real data.
Two cloud correction methods are developed from PGCS and exhibit superior performance compared to state-of-the-art methods on the cloud correction task.
arXiv Detail & Related papers (2024-10-22T12:36:03Z) - Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning [49.91297276176978]
We propose a novel Parameter-Efficient Fine-Tuning (PEFT) method for point cloud learning, called Point GST.
Point GST freezes the pre-trained model and introduces a trainable Point Cloud Spectral Adapter (PCSA) to finetune parameters in the spectral domain.
Extensive experiments on challenging point cloud datasets demonstrate that Point GST not only outperforms its fully finetuning counterpart but also significantly reduces trainable parameters.
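The spectral-domain idea above can be sketched roughly as: project frozen features onto a fixed graph-Fourier basis, rescale each frequency with a handful of trainable gains, and project back. In this NumPy sketch the path-graph Laplacian, the sizes, and the diagonal-gain adapter are illustrative assumptions, not the actual PCSA design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: N point tokens with D-dim features from a frozen backbone.
N, D = 32, 16
feats = rng.normal(size=(N, D))

# Fixed spectral basis: eigenvectors of a path-graph Laplacian over the
# tokens (a simple stand-in for a point-cloud graph).
L = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
L[0, 0] = L[-1, -1] = 1.0
_, U = np.linalg.eigh(L)          # orthonormal columns: the spectral basis

def spectral_adapter(x, gain):
    """Fine-tune features in the spectral domain via per-frequency gains."""
    spec = U.T @ x                      # graph Fourier transform
    return U @ (gain[:, None] * spec)   # rescale frequencies, invert transform

gain = np.ones(N)                 # the only trainable parameters (N of them)
adapted = spectral_adapter(feats, gain)
```

Initializing the gains at one makes the adapter start as the identity, so training only perturbs the frozen features where the downstream task demands it.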
arXiv Detail & Related papers (2024-10-10T17:00:04Z) - Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis [51.14136878142034]
Point cloud analysis has achieved outstanding performance by transferring point cloud pre-trained models.
Existing methods for model adaptation usually update all model parameters, which is inefficient because of the high computational cost involved.
In this paper, we aim to study parameter-efficient transfer learning for point cloud analysis with an ideal trade-off between task performance and parameter efficiency.
arXiv Detail & Related papers (2024-03-03T08:25:04Z) - Point Cloud Pre-training with Diffusion Models [62.12279263217138]
We propose a novel pre-training method called Point cloud Diffusion pre-training (PointDif)
PointDif achieves substantial improvement across various real-world datasets for diverse downstream tasks such as classification, segmentation and detection.
arXiv Detail & Related papers (2023-11-25T08:10:05Z) - Rotation-Invariant Completion Network [8.023732679237021]
Real-world point clouds usually suffer from incompleteness and display different poses.
Current point cloud completion methods excel in reproducing complete point clouds with consistent poses as seen in the training set.
We propose a network named Rotation-Invariant Completion Network (RICNet), which consists of two parts: a Dual Pipeline Completion Network (DPCNet) and an enhancing module.
arXiv Detail & Related papers (2023-08-23T07:58:20Z) - Detecting Cloud Presence in Satellite Images Using the RGB-based CLIP Vision-Language Model [0.0]
This work explores capabilities of the pre-trained CLIP vision-language model to identify satellite images affected by clouds.
Several approaches to using the model to perform cloud presence detection are proposed and evaluated.
Results demonstrate that the representations learned by the CLIP model can be useful for satellite image processing tasks involving clouds.
arXiv Detail & Related papers (2023-08-01T13:36:46Z) - Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models [64.49254199311137]
We propose a novel Instance-aware Dynamic Prompt Tuning (IDPT) strategy for pre-trained point cloud models.
The essence of IDPT is to develop a dynamic prompt generation module to perceive semantic prior features of each point cloud instance.
In experiments, IDPT outperforms full fine-tuning in most tasks with a mere 7% of the trainable parameters.
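The dynamic-prompt idea can be sketched as: pool each point cloud's tokens into an instance summary, map it through a small trainable generator, and prepend the result as a prompt token. The max-pooling and the single linear generator below are assumptions for illustration, not the actual IDPT architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 32                                     # token feature dimension
W_prompt = rng.normal(size=(D, D)) * 0.01  # trainable prompt generator

def with_dynamic_prompt(tokens):
    """Prepend an instance-conditioned prompt token to a token sequence."""
    pooled = tokens.max(axis=0)     # permutation-invariant instance summary
    prompt = pooled @ W_prompt      # prompt reflects this instance's features
    return np.vstack([prompt, tokens])

cloud_a = rng.normal(size=(64, D))  # tokens of one point cloud instance
cloud_b = rng.normal(size=(64, D))  # tokens of a different instance
out_a = with_dynamic_prompt(cloud_a)
out_b = with_dynamic_prompt(cloud_b)
```

Unlike static prompt tuning, the prepended prompt token here differs between instances, which is the "dynamic" property the summary describes.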
arXiv Detail & Related papers (2023-04-14T16:03:09Z) - AdaPoinTr: Diverse Point Cloud Completion with Adaptive Geometry-Aware Transformers [94.11915008006483]
We present a new method that reformulates point cloud completion as a set-to-set translation problem.
We design a new model, called PoinTr, which adopts a Transformer encoder-decoder architecture for point cloud completion.
Our method attains 6.53 CD on PCN, 0.81 CD on ShapeNet-55 and 0.392 MMD on real-world KITTI.
arXiv Detail & Related papers (2023-01-11T16:14:12Z) - Effective Utilisation of Multiple Open-Source Datasets to Improve Generalisation Performance of Point Cloud Segmentation Models [0.0]
Semantic segmentation of aerial point cloud data can be utilised to differentiate which points belong to classes such as ground, buildings, or vegetation.
Point clouds generated from aerial sensors mounted to drones or planes can utilise LIDAR sensors or cameras along with photogrammetry.
We show that a naive combination of datasets produces a model with improved generalisation performance as expected.
arXiv Detail & Related papers (2022-11-29T02:31:01Z) - CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
In this work, we adopt transformers and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z) - Point Cloud Pre-training by Mixing and Disentangling [35.18101910728478]
Mixing and Disentangling (MD) is a self-supervised learning approach for point cloud pre-training.
We show that an encoder pre-trained with our MD approach significantly surpasses the same encoder trained from scratch and converges quickly.
We hope this self-supervised learning attempt on point clouds can pave the way for reducing the deeply-learned model dependence on large-scale labeled data.
arXiv Detail & Related papers (2021-09-01T15:52:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.