Generic Knowledge Boosted Pre-training For Remote Sensing Images
- URL: http://arxiv.org/abs/2401.04614v2
- Date: Sun, 21 Jan 2024 12:56:32 GMT
- Title: Generic Knowledge Boosted Pre-training For Remote Sensing Images
- Authors: Ziyue Huang, Mingming Zhang, Yuan Gong, Qingjie Liu, Yunhong Wang
- Abstract summary: Generic Knowledge Boosted Remote Sensing Pre-training (GeRSP) is a novel remote sensing pre-training framework.
GeRSP learns robust representations from remote sensing and natural images for remote sensing understanding tasks.
We show that GeRSP can effectively learn robust representations in a unified manner, improving the performance of remote sensing downstream tasks.
- Score: 46.071496675604884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models are essential for scene classification, change
detection, land cover segmentation, and other remote sensing image
understanding tasks. Most backbones of existing remote sensing deep learning
models are typically initialized by pre-trained weights obtained from ImageNet
pre-training (IMP). However, domain gaps exist between remote sensing images
and natural images (e.g., ImageNet), making deep learning models initialized by
pre-trained weights of IMP perform poorly for remote sensing image
understanding. Although some pre-training methods are studied in the remote
sensing community, current remote sensing pre-training methods face the problem
of vague generalization by only using remote sensing images. In this paper, we
propose a novel remote sensing pre-training framework, Generic Knowledge
Boosted Remote Sensing Pre-training (GeRSP), to learn robust representations
from remote sensing and natural images for remote sensing understanding tasks.
GeRSP contains two pre-training branches: (1) A self-supervised pre-training
branch is adopted to learn domain-related representations from unlabeled remote
sensing images. (2) A supervised pre-training branch is integrated into GeRSP
for general knowledge learning from labeled natural images. Moreover, GeRSP
combines two pre-training branches using a teacher-student architecture to
simultaneously learn representations with general and special knowledge, which
generates a powerful pre-trained model for deep learning model initialization.
Finally, we evaluate GeRSP and other remote sensing pre-training methods on
three downstream tasks, i.e., object detection, semantic segmentation, and
scene classification. The extensive experimental results consistently
demonstrate that GeRSP can effectively learn robust representations in a
unified manner, improving the performance of remote sensing downstream tasks.
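The two-branch objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss functions or the teacher update rule, so everything here is an assumption chosen for illustration: an InfoNCE-style contrastive loss for the self-supervised remote sensing branch, softmax cross-entropy for the supervised natural-image branch, an exponential moving average (EMA) update for the teacher, and the hypothetical names `info_nce`, `ema_update`, `combined_loss`, and `ssl_weight`. It is not the authors' implementation.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample, computed stably."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[label]


def normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(query, positive_key, negative_keys, temperature=0.2):
    """Contrastive loss (assumed): the student query should match the
    teacher's positive key rather than any of the negative keys."""
    q = normalize(query)
    keys = [normalize(positive_key)] + [normalize(k) for k in negative_keys]
    logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
    return cross_entropy(logits, 0)  # the positive key sits at index 0


def ema_update(teacher_params, student_params, momentum=0.999):
    """Teacher update (assumed): move teacher weights slowly toward the student."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]


def combined_loss(natural_logits, natural_label,
                  rs_query, rs_positive_key, rs_negative_keys,
                  ssl_weight=1.0):
    """Sum of the supervised loss on a labeled natural image and the
    self-supervised loss on an unlabeled remote sensing image."""
    supervised = cross_entropy(natural_logits, natural_label)
    self_supervised = info_nce(rs_query, rs_positive_key, rs_negative_keys)
    return supervised + ssl_weight * self_supervised
```

In this sketch a single student would receive gradients from both terms, while the teacher only supplies keys and is refreshed with `ema_update` after each step; the weighting between branches (`ssl_weight`) is a free parameter here, not a value taken from the paper.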
Related papers
- Pattern Integration and Enhancement Vision Transformer for Self-Supervised Learning in Remote Sensing [11.626527403157922]
We present the Pattern Integration and Enhancement Vision Transformer (PIEViT), a novel self-supervised learning framework for remote sensing imagery.
PIEViT enhances the representation of internal patch features, providing significant improvements over existing self-supervised baselines.
It achieves excellent results in object detection, land cover classification, and change detection, underscoring its robustness, generalization, and transferability for remote sensing image interpretation tasks.
arXiv Detail & Related papers (2024-11-09T07:06:31Z)
- Rethinking Feature Backbone Fine-tuning for Remote Sensing Object Detection [10.896464615994494]
We propose DBF (Dynamic Backbone Freezing) for feature backbone fine-tuning on remote sensing object detection.
Our method aims to handle the dilemma of whether the backbone should extract low-level generic features or possess specific knowledge of the remote sensing domain.
Our approach enables more accurate model learning while substantially reducing computational costs.
arXiv Detail & Related papers (2024-07-21T12:32:00Z)
- Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment [61.769441954135246]
We introduce a method to train vision-language models for remote-sensing images without using any textual annotations.
Our key insight is to use co-located internet imagery taken on the ground as an intermediary for connecting remote-sensing images and language.
arXiv Detail & Related papers (2023-12-12T03:39:07Z)
- An Empirical Study of Remote Sensing Pretraining [117.90699699469639]
We conduct an empirical study of remote sensing pretraining (RSP) on aerial images.
RSP can help deliver distinctive performances in scene recognition tasks.
RSP mitigates the data discrepancies of traditional ImageNet pretraining on RS images, but it may still suffer from task discrepancies.
arXiv Detail & Related papers (2022-04-06T13:38:11Z)
- Semantic-Aware Generation for Self-Supervised Visual Representation Learning [116.5814634936371]
We advocate for Semantic-aware Generation (SaGe) to facilitate richer semantics rather than details to be preserved in the generated image.
SaGe complements the target network with view-specific features and thus alleviates the semantic degradation brought by intensive data augmentations.
We execute SaGe on ImageNet-1K and evaluate the pre-trained models on five downstream tasks including nearest neighbor test, linear classification, and fine-scaled image recognition.
arXiv Detail & Related papers (2021-11-25T16:46:13Z)
- Geographical Knowledge-driven Representation Learning for Remote Sensing Images [18.79154074365997]
We propose a Geographical Knowledge-driven Representation learning method for remote sensing images (GeoKR).
The global land cover products and geographical location associated with each remote sensing image are regarded as geographical knowledge.
A large scale pre-training dataset Levir-KR is proposed to support network pre-training.
arXiv Detail & Related papers (2021-07-12T09:23:15Z)
- Self-Supervised Learning of Remote Sensing Scene Representations Using Contrastive Multiview Coding [0.0]
We conduct an analysis of the applicability of self-supervised learning in remote sensing image classification.
We show that, for the downstream task of remote sensing image classification, using self-supervised pre-training can give better results than using supervised pre-training on images of natural scenes.
arXiv Detail & Related papers (2021-04-14T18:25:43Z)
- Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data [64.40187171234838]
Seasonal Contrast (SeCo) is an effective pipeline to leverage unlabeled data for in-domain pre-training of remote sensing representations.
SeCo will be made public to facilitate transfer learning and enable rapid progress in remote sensing applications.
arXiv Detail & Related papers (2021-03-30T18:26:39Z)
- Remote Sensing Image Scene Classification Meets Deep Learning: Challenges, Methods, Benchmarks, and Opportunities [81.29441139530844]
This paper provides a systematic survey of deep learning methods for remote sensing image scene classification by covering more than 160 papers.
We discuss the main challenges of remote sensing image scene classification.
We introduce the benchmarks used for remote sensing image scene classification and summarize the performance of more than two dozen representative algorithms.
arXiv Detail & Related papers (2020-05-03T14:18:00Z)