In-Domain Self-Supervised Learning Improves Remote Sensing Image Scene
Classification
- URL: http://arxiv.org/abs/2307.01645v2
- Date: Mon, 5 Feb 2024 14:14:06 GMT
- Title: In-Domain Self-Supervised Learning Improves Remote Sensing Image Scene
Classification
- Authors: Ivica Dimitrovski, Ivan Kitanovski, Nikola Simidjievski, Dragi Kocev
- Abstract summary: Self-supervised learning has emerged as a promising approach for remote sensing image classification.
We present a study of different self-supervised pre-training strategies and evaluate their effect across 14 downstream datasets.
- Score: 5.323049242720532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate the utility of in-domain self-supervised pre-training of
vision models in the analysis of remote sensing imagery. Self-supervised
learning (SSL) has emerged as a promising approach for remote sensing image
classification due to its ability to exploit large amounts of unlabeled data.
Unlike traditional supervised learning, SSL aims to learn representations of
data without the need for explicit labels. This is achieved by formulating
auxiliary tasks that can be used for pre-training models before fine-tuning
them on a given downstream task. A common approach in practice to SSL
pre-training is utilizing standard pre-training datasets, such as ImageNet.
While relevant, such a general approach can have a sub-optimal influence on the
downstream performance of models, especially on tasks from challenging domains
such as remote sensing. In this paper, we analyze the effectiveness of SSL
pre-training by employing the iBOT framework coupled with Vision transformers
trained on Million-AID, a large and unlabeled remote sensing dataset. We
present a comprehensive study of different self-supervised pre-training
strategies and evaluate their effect across 14 downstream datasets with diverse
properties. Our results demonstrate that leveraging large in-domain datasets
for self-supervised pre-training consistently leads to improved predictive
downstream performance, compared to the standard approaches found in practice.
Related papers
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
- Self-supervised visual learning in the low-data regime: a comparative evaluation [40.27083924454058]
Self-Supervised Learning (SSL) is a robust training methodology for contemporary Deep Neural Networks (DNNs).
This work introduces a taxonomy of modern visual SSL methods, accompanied by detailed explanations and insights regarding the main categories of approaches.
For domain-specific downstream tasks, in-domain low-data SSL pretraining outperforms the common approach of large-scale pretraining.
arXiv Detail & Related papers (2024-04-26T07:23:14Z)
- Evaluating the Label Efficiency of Contrastive Self-Supervised Learning for Multi-Resolution Satellite Imagery [0.0]
Self-supervised learning has been applied in the remote sensing domain to exploit readily-available unlabeled data.
In this paper, we study self-supervised visual representation learning through the lens of label efficiency.
arXiv Detail & Related papers (2022-10-13T06:54:13Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Consecutive Pretraining: A Knowledge Transfer Learning Strategy with Relevant Unlabeled Data for Remote Sensing Domain [25.84756140221655]
ConSecutive PreTraining (CSPT) is proposed based on the idea of not stopping pretraining in natural language processing (NLP).
The proposed CSPT can also release the huge potential of unlabeled data for task-aware model training.
The results show that by utilizing the proposed CSPT for task-aware model training, almost all downstream tasks in RSD can outperform the previous method of supervised pretraining-then-fine-tuning.
arXiv Detail & Related papers (2022-07-08T12:32:09Z)
- UniVIP: A Unified Framework for Self-Supervised Visual Pre-training [50.87603616476038]
We propose a novel self-supervised framework to learn versatile visual representations on either single-centric-object or non-iconic datasets.
Extensive experiments show that UniVIP pre-trained on non-iconic COCO achieves state-of-the-art transfer performance.
Our method can also exploit single-centric-object datasets such as ImageNet and outperforms BYOL by 2.5% with the same pre-training epochs in linear probing.
arXiv Detail & Related papers (2022-03-14T10:04:04Z)
- Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z)
- Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis [96.53859361560505]
Aspect-based Sentiment Analysis (ABSA) aims to determine the sentiment polarity towards an aspect.
There is always a severe domain shift between the pretraining and downstream ABSA datasets.
We introduce a unified alignment pretraining framework into the vanilla pretrain-finetune pipeline.
arXiv Detail & Related papers (2021-10-26T04:03:45Z)
- Remote Sensing Image Scene Classification with Self-Supervised Paradigm under Limited Labeled Samples [11.025191332244919]
We introduce a new self-supervised learning (SSL) mechanism to obtain a high-performance pre-training model for RSIs scene classification from large unlabeled data.
Experiments on three commonly used RSIs scene classification datasets demonstrated that this new learning paradigm outperforms the traditional dominant ImageNet pre-trained model.
The insights distilled from our studies can help to foster the development of SSL in the remote sensing community.
arXiv Detail & Related papers (2020-10-02T09:27:19Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)