Using Global Land Cover Product as Prompt for Cropland Mapping via
Visual Foundation Model
- URL: http://arxiv.org/abs/2310.10219v1
- Date: Mon, 16 Oct 2023 09:29:52 GMT
- Title: Using Global Land Cover Product as Prompt for Cropland Mapping via
Visual Foundation Model
- Authors: Chao Tao, Aoran Hu, Rong Xiao, Haifeng Li, and Yuze Wang
- Abstract summary: We introduce the "Pretrain+Prompting" paradigm for interpreting cropland scenes and design the auto-prompting (APT) method based on a freely available global land cover product.
It can achieve a fine-grained adaptation process from generic scenes to specialized cropland scenes without introducing additional label costs.
Our experiments using two sub-meter cropland datasets from southern and northern China demonstrated that the proposed method via visual foundation models outperforms traditional supervised learning and fine-tuning approaches in the field of remote sensing.
- Score: 6.35948253619752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data-driven deep learning methods have shown great potential in cropland
mapping. However, due to multiple factors such as attributes of cropland
(topography, climate, crop type) and imaging conditions (viewing angle,
illumination, scale), croplands under different scenes demonstrate a great
domain gap. This makes it difficult for models trained on specific scenes
to directly generalize to other scenes. A common way to handle this problem is
through the "Pretrain+Fine-tuning" paradigm. Unfortunately, considering the
variety of features of cropland that are affected by multiple factors, it is
hard to handle the complex domain gap between pre-trained data and target
data using only sparse fine-tuned samples as general constraints. Moreover, as
the number of model parameters grows, fine-tuning is no longer an easy and
low-cost task. With the emergence of prompt learning via visual foundation
models, the "Pretrain+Prompting" paradigm redesigns the optimization target by
introducing individual prompts for each single sample. This simplifies the
domain adaptation from generic to specific scenes during model inference.
Therefore, we introduce the "Pretrain+Prompting" paradigm for
interpreting cropland scenes and design the auto-prompting (APT) method based
on a freely available global land cover product. It can achieve a fine-grained
adaptation process from generic scenes to specialized cropland scenes without
introducing additional label costs. To the best of our knowledge, this work pioneers
the exploration of the domain adaptation problem for cropland mapping from a
prompt learning perspective. Our experiments using two sub-meter cropland
datasets from southern and northern China demonstrated that the proposed method
via visual foundation models outperforms traditional supervised learning and
fine-tuning approaches in the field of remote sensing.
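The abstract does not spell out how APT turns a land cover product into prompts, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the prompts are point prompts (pixel coordinates plus a foreground/background label, a format many promptable segmentation models accept) sampled from a coarse land cover grid. The function name `sample_point_prompts`, the parameters `cropland_code`, `scale`, and `n_per_class`, and the toy class codes are all hypothetical.

```python
import numpy as np


def sample_point_prompts(land_cover, cropland_code, scale, n_per_class=3, seed=0):
    """Sample point prompts from a coarse land cover grid (illustrative only).

    land_cover: 2D integer array, one class code per coarse cell.
    cropland_code: class code marking cropland (hypothetical, product-specific).
    scale: image pixels per coarse cell along each axis (assumed to be an integer).
    Returns (points, labels): points has shape (N, 2) as (x, y) image pixels,
    labels is 1 for cropland cells and 0 for all other cells.
    """
    rng = np.random.default_rng(seed)
    is_crop = land_cover == cropland_code
    points, labels = [], []
    for label, mask in ((1, is_crop), (0, ~is_crop)):
        rows, cols = np.nonzero(mask)
        if rows.size == 0:
            continue  # this class does not occur in the tile
        k = min(n_per_class, rows.size)
        idx = rng.choice(rows.size, size=k, replace=False)
        # Centre of each chosen coarse cell, expressed in image pixels.
        xs = cols[idx] * scale + scale // 2
        ys = rows[idx] * scale + scale // 2
        points.append(np.stack([xs, ys], axis=1))
        labels.append(np.full(k, label))
    if not points:
        return np.empty((0, 2), dtype=int), np.empty(0, dtype=int)
    return np.concatenate(points), np.concatenate(labels)


# Toy 2x3 land cover tile where code 1 stands in for cropland.
tile = np.array([[1, 1, 0],
                 [0, 1, 0]])
points, labels = sample_point_prompts(tile, cropland_code=1, scale=10, n_per_class=2)
```

The resulting `points` and `labels` arrays could then be passed to whatever promptable model is in use. Because the land cover product is much coarser than sub-meter imagery, sampling cell centres rather than cell edges is one simple way to reduce the chance of a prompt landing on a mislabeled boundary pixel.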
Related papers
- Semi-Self-Supervised Domain Adaptation: Developing Deep Learning Models with Limited Annotated Data for Wheat Head Segmentation [0.10923877073891444]
We introduce a semi-self-supervised domain adaptation technique based on deep convolutional neural networks with a probabilistic diffusion process.
We develop a two-branch convolutional encoder-decoder model architecture that uses both synthesized image-mask pairs and unannotated images.
The proposed model achieved a Dice score of 80.7% on an internal test dataset and a Dice score of 64.8% on an external test set.
arXiv Detail & Related papers (2024-05-12T04:35:49Z)
- Domain-Controlled Prompt Learning [49.45309818782329]
Existing prompt learning methods often lack domain-awareness or domain-transfer mechanisms.
We propose a Domain-Controlled Prompt Learning for the specific domains.
Our method achieves state-of-the-art performance in specific domain image recognition datasets.
arXiv Detail & Related papers (2023-09-30T02:59:49Z)
- Terrain Diffusion Network: Climatic-Aware Terrain Generation with
Geological Sketch Guidance [16.29267504093274]
Sketch-based terrain generation seeks to create realistic landscapes for virtual environments in various applications such as computer games, animation and virtual reality.
We propose a novel diffusion-based method, namely terrain diffusion network (TDN), which actively incorporates user guidance for enhanced controllability.
Three terrain synthesisers are designed for structural, intermediate, and fine-grained level denoising purposes, which allows each synthesiser to concentrate on a distinct terrain aspect.
arXiv Detail & Related papers (2023-08-31T13:41:34Z)
- Bilevel Fast Scene Adaptation for Low-Light Image Enhancement [50.639332885989255]
Enhancing images in low-light scenes is a challenging but widely studied task in computer vision.
The main obstacle lies in the modeling conundrum arising from distribution discrepancy across different scenes.
We introduce the bilevel paradigm to model the above latent correspondence.
A bilevel learning framework is constructed to endow the scene-irrelevant generality of the encoder towards diverse scenes.
arXiv Detail & Related papers (2023-06-02T08:16:21Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- Vision Transformers: From Semantic Segmentation to Dense Prediction [139.15562023284187]
We explore the global context learning potentials of vision transformers (ViTs) for dense visual prediction.
Our motivation is that through learning global context at full receptive field layer by layer, ViTs may capture stronger long-range dependency information.
We formulate a family of Hierarchical Local-Global (HLG) Transformers, characterized by local attention within windows and global-attention across windows in a pyramidal architecture.
arXiv Detail & Related papers (2022-07-19T15:49:35Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- Embedding Earth: Self-supervised contrastive pre-training for dense land
cover classification [61.44538721707377]
We present Embedding Earth, a self-supervised contrastive pre-training method for leveraging the large availability of satellite imagery.
We observe significant improvements up to 25% absolute mIoU when pre-trained with our proposed method.
We find that learnt features can generalize between disparate regions, opening up the possibility of using the proposed pre-training scheme.
arXiv Detail & Related papers (2022-03-11T16:14:14Z)
- Segmentation of VHR EO Images using Unsupervised Learning [19.00071868539993]
We propose an unsupervised semantic segmentation method that can be trained using just a single unlabeled scene.
The proposed method exploits this property to sample smaller patches from the larger scene.
After unsupervised training on the target image/scene, the model automatically segregates the major classes present in the scene and produces the segmentation map.
arXiv Detail & Related papers (2021-07-09T11:42:48Z)
- An Efficient Method for the Classification of Croplands in Scarce-Label
Regions [0.0]
Two of the main challenges for cropland classification by satellite time-series images are insufficient ground-truth data and inaccessibility of high-quality hyperspectral images for under-developed areas.
Unlabeled medium-resolution satellite images are abundant, but how to benefit from them is an open question.
We will show how to leverage their potential for cropland classification using self-supervised tasks.
arXiv Detail & Related papers (2021-03-17T12:10:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.