SAFE: a SAR Feature Extractor based on self-supervised learning and masked Siamese ViTs
- URL: http://arxiv.org/abs/2407.00851v1
- Date: Sun, 30 Jun 2024 23:11:20 GMT
- Title: SAFE: a SAR Feature Extractor based on self-supervised learning and masked Siamese ViTs
- Authors: Max Muzeau, Joana Frontera-Pons, Chengfang Ren, Jean-Philippe Ovarlez
- Abstract summary: We propose a novel self-supervised learning framework based on masked Siamese Vision Transformers to create a General SAR Feature Extractor coined SAFE.
Our method leverages contrastive learning principles to train a model on unlabeled SAR data, extracting robust and generalizable features.
We introduce tailored data augmentation techniques specific to SAR imagery, such as sub-aperture decomposition and despeckling.
Our network competes with or surpasses other state-of-the-art methods in few-shot classification and segmentation tasks, even without being trained on the sensors used for the evaluation.
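The sub-aperture decomposition mentioned above can be sketched as follows. This is a minimal illustration assuming single-look complex (SLC) input and a plain split of the azimuth spectrum into contiguous bands; the function name, the two-look split, and the axis convention are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def subaperture_decomposition(slc, n_subs=2, axis=0):
    """Split a single-look complex (SLC) SAR image into sub-aperture looks.

    Each look keeps one contiguous band of the (shifted) azimuth spectrum
    and transforms back, giving lower-resolution views of the same scene,
    a natural augmentation pair for contrastive self-supervision.
    """
    spectrum = np.fft.fftshift(np.fft.fft(slc, axis=axis), axes=axis)
    n = slc.shape[axis]
    looks = []
    for k in range(n_subs):
        band = np.zeros_like(spectrum)
        sl = [slice(None)] * slc.ndim
        sl[axis] = slice(k * n // n_subs, (k + 1) * n // n_subs)
        band[tuple(sl)] = spectrum[tuple(sl)]
        looks.append(np.fft.ifft(np.fft.ifftshift(band, axes=axis), axis=axis))
    return looks

# Toy usage: because the bands partition the spectrum and the FFT is
# linear, the sub-aperture looks sum back to the original image.
rng = np.random.default_rng(0)
img = rng.normal(size=(64, 64)) + 1j * rng.normal(size=(64, 64))
look_a, look_b = subaperture_decomposition(img)
```

Since the FFT is linear and the bands tile the full spectrum, summing the looks recovers the original image, which makes the decomposition easy to sanity-check.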
- Score: 5.961207817077044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to its all-weather and day-and-night capabilities, Synthetic Aperture Radar imagery is essential for various applications such as disaster management, earth monitoring, change detection and target recognition. However, the scarcity of labeled SAR data limits the performance of most deep learning algorithms. To address this issue, we propose a novel self-supervised learning framework based on masked Siamese Vision Transformers to create a General SAR Feature Extractor coined SAFE. Our method leverages contrastive learning principles to train a model on unlabeled SAR data, extracting robust and generalizable features. SAFE is applicable across multiple SAR acquisition modes and resolutions. We introduce tailored data augmentation techniques specific to SAR imagery, such as sub-aperture decomposition and despeckling. Comprehensive evaluations on various downstream tasks, including few-shot classification, segmentation, visualization, and pattern detection, demonstrate the effectiveness and versatility of the proposed approach. Our network competes with or surpasses other state-of-the-art methods in few-shot classification and segmentation tasks, even without being trained on the sensors used for the evaluation.
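As a rough illustration of the contrastive principle the abstract describes, the following sketches an InfoNCE-style loss between embeddings of two augmented views of each sample, together with the random patch masking applied by masked Siamese ViTs. All function names, shapes, and the temperature value are assumptions; the paper's actual loss and masking strategy may differ.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Contrastive (InfoNCE-style) loss between two batches of embeddings.

    z_a[i] and z_b[i] embed two augmented views of sample i (e.g. a
    despeckled image and one of its sub-aperture looks); all other
    pairs in the batch serve as negatives.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature           # (B, B) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # matched pairs on diagonal

def random_patch_mask(n_patches, mask_ratio, rng):
    """Indices of the ViT patches kept visible by the masked branch."""
    keep = int(n_patches * (1.0 - mask_ratio))
    return rng.permutation(n_patches)[:keep]

# Toy usage with random embeddings standing in for ViT outputs.
rng = np.random.default_rng(0)
z_a = rng.normal(size=(8, 16))
z_b = z_a + 0.05 * rng.normal(size=(8, 16))
loss = info_nce_loss(z_a, z_b)
keep_idx = random_patch_mask(n_patches=196, mask_ratio=0.75, rng=rng)
```

In a masked Siamese setup, one branch would see only the patches in `keep_idx` while the target branch sees the full view; the loss then pulls the two embeddings of each sample together.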
Related papers
- IncSAR: A Dual Fusion Incremental Learning Framework for SAR Target Recognition [7.9330990800767385]
Models' tendency to forget old knowledge when learning new tasks, known as catastrophic forgetting, remains an open challenge.
In this paper, an incremental learning framework, called IncSAR, is proposed to mitigate catastrophic forgetting in SAR target recognition.
IncSAR comprises a Vision Transformer (ViT) and a custom-designed Convolutional Neural Network (CNN) in individual branches combined through a late-fusion strategy.
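A minimal sketch of the late-fusion idea: per-image descriptors from the two branches are normalized and combined into a single vector before classification. The normalization, weighting, and dimensions here are assumptions for illustration, not IncSAR's actual design.

```python
import numpy as np

def late_fusion(vit_feat, cnn_feat, alpha=0.5):
    """Fuse ViT and CNN branch descriptors at the feature level.

    Each branch is L2-normalized so neither dominates, then the two are
    weighted and concatenated into one joint descriptor.
    """
    vit_feat = vit_feat / np.linalg.norm(vit_feat)
    cnn_feat = cnn_feat / np.linalg.norm(cnn_feat)
    return np.concatenate([alpha * vit_feat, (1.0 - alpha) * cnn_feat])

# Toy usage: a 384-d ViT descriptor fused with a 256-d CNN descriptor.
rng = np.random.default_rng(1)
fused = late_fusion(rng.normal(size=384), rng.normal(size=256))
```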
arXiv Detail & Related papers (2024-10-08T08:49:47Z)
- Towards SAR Automatic Target Recognition MultiCategory SAR Image Classification Based on Light Weight Vision Transformer [11.983317593939688]
This paper applies a lightweight vision-transformer-based model to classify SAR images.
The entire structure was verified on an open-access SAR dataset.
arXiv Detail & Related papers (2024-05-18T11:24:52Z)
- SARatrX: Towards Building A Foundation Model for SAR Target Recognition [22.770010893572973]
We make the first attempt towards building a foundation model for SAR ATR, termed SARatrX.
SARatrX learns generalizable representations via self-supervised learning (SSL) and provides a basis for label-efficient model adaptation to generic SAR target detection and classification tasks.
Specifically, SARatrX is trained on 0.18M unlabelled SAR target samples, curated by combining contemporary benchmarks, which constitute the largest publicly available dataset to date.
arXiv Detail & Related papers (2024-05-15T14:17:44Z)
- Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained Ship Classification [62.425462136772666]
Fine-grained ship classification in remote sensing (RS-FGSC) poses a significant challenge due to the high similarity between classes and the limited availability of labeled data.
Recent advancements in large pre-trained Vision-Language Models (VLMs) have demonstrated impressive capabilities in few-shot or zero-shot learning.
This study delves into harnessing the potential of VLMs to enhance classification accuracy for unseen ship categories.
arXiv Detail & Related papers (2024-03-13T05:48:58Z)
- Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amounts of unlabelled data.
In this paper, we revisit transformer pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Benchmarking Deep Learning Classifiers for SAR Automatic Target Recognition [7.858656052565242]
This paper comprehensively benchmarks several advanced deep learning models for SAR ATR with multiple distinct SAR imagery datasets.
We evaluate and compare the five classifiers in terms of classification accuracy, runtime performance (inference throughput), and analytical performance.
No clear winner emerges across all of our chosen metrics, and a one-model-rules-all scenario appears doubtful in the domain of SAR ATR.
arXiv Detail & Related papers (2023-12-12T02:20:39Z)
- SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE).
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
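The temporal embedding and independent per-timestep masking can be sketched as follows; the sinusoidal form, shapes, and function names are assumptions for illustration rather than SatMAE's exact implementation.

```python
import numpy as np

def temporal_patch_mask(n_frames, n_patches, mask_ratio, rng):
    """Choose the visible patches independently for every timestep,
    rather than reusing one spatial mask across the whole sequence."""
    keep = int(n_patches * (1.0 - mask_ratio))
    return np.stack([rng.permutation(n_patches)[:keep]
                     for _ in range(n_frames)])

def temporal_embedding(timestamps, dim):
    """Sinusoidal embedding of acquisition time, added to patch tokens
    so the encoder can distinguish frames in a temporal stack."""
    freqs = 1.0 / (10000.0 ** (np.arange(0, dim, 2) / dim))
    angles = np.outer(timestamps, freqs)          # (T, dim/2)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

# Toy usage: 3 frames of 196 patches, 75% masked, 32-d time embedding.
rng = np.random.default_rng(2)
masks = temporal_patch_mask(n_frames=3, n_patches=196, mask_ratio=0.75, rng=rng)
emb = temporal_embedding(np.array([0.0, 12.0, 24.0]), dim=32)
```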
arXiv Detail & Related papers (2022-07-17T01:35:29Z)
- Remote Sensing Image Classification using Transfer Learning and Attention Based Deep Neural Network [59.86658316440461]
We propose a deep learning based framework for RSISC, which makes use of the transfer learning technique and multihead attention scheme.
The proposed deep learning framework is evaluated on the benchmark NWPU-RESISC45 dataset and achieves the best classification accuracy of 94.7%.
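For reference, the multihead attention scheme this summary mentions reduces to scaled dot-product attention computed in parallel over several subspaces. This is a generic sketch with illustrative weights and shapes, not the paper's specific architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multihead_attention(x, wq, wk, wv, wo, n_heads):
    """Multihead scaled dot-product self-attention over a token sequence.

    x: (T, d) tokens; wq/wk/wv/wo: (d, d) projection matrices.
    Each head attends in a d/n_heads-dimensional subspace; the heads
    are concatenated and mixed by the output projection wo.
    """
    T, d = x.shape
    hd = d // n_heads
    q = (x @ wq).reshape(T, n_heads, hd).transpose(1, 0, 2)  # (h, T, hd)
    k = (x @ wk).reshape(T, n_heads, hd).transpose(1, 0, 2)
    v = (x @ wv).reshape(T, n_heads, hd).transpose(1, 0, 2)
    att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(hd))    # (h, T, T)
    out = (att @ v).transpose(1, 0, 2).reshape(T, d)
    return out @ wo

# Toy usage: 10 tokens of dimension 32, split across 4 heads.
rng = np.random.default_rng(3)
x = rng.normal(size=(10, 32))
ws = [0.1 * rng.normal(size=(32, 32)) for _ in range(4)]
y = multihead_attention(x, *ws, n_heads=4)
```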
arXiv Detail & Related papers (2022-06-20T10:05:38Z)
- Context-Preserving Instance-Level Augmentation and Deformable Convolution Networks for SAR Ship Detection [50.53262868498824]
Shape deformation of targets in SAR image due to random orientation and partial information loss is an essential challenge in SAR ship detection.
We propose a data augmentation method to train a deep network that is robust to partial information loss within the targets.
arXiv Detail & Related papers (2022-02-14T07:01:01Z)
- Learning Efficient Representations for Enhanced Object Detection on Large-scene SAR Images [16.602738933183865]
It is a challenging problem to detect and recognize targets on complex large-scene Synthetic Aperture Radar (SAR) images.
Recently developed deep learning algorithms can automatically learn the intrinsic features of SAR images.
We propose an efficient and robust deep learning based target detection method.
arXiv Detail & Related papers (2022-01-22T03:25:24Z)