SAFE: a SAR Feature Extractor based on self-supervised learning and masked Siamese ViTs
- URL: http://arxiv.org/abs/2407.00851v1
- Date: Sun, 30 Jun 2024 23:11:20 GMT
- Title: SAFE: a SAR Feature Extractor based on self-supervised learning and masked Siamese ViTs
- Authors: Max Muzeau, Joana Frontera-Pons, Chengfang Ren, Jean-Philippe Ovarlez
- Abstract summary: We propose a novel self-supervised learning framework based on masked Siamese Vision Transformers to create a General SAR Feature Extractor coined SAFE.
Our method leverages contrastive learning principles to train a model on unlabeled SAR data, extracting robust and generalizable features.
We introduce tailored data augmentation techniques specific to SAR imagery, such as sub-aperture decomposition and despeckling.
Our network competes with or surpasses other state-of-the-art methods in few-shot classification and segmentation tasks, even without being trained on the sensors used for the evaluation.
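The sub-aperture decomposition mentioned above can be sketched as follows. This is a minimal illustration assuming single-look complex (SLC) input and a plain split of the azimuth spectrum into contiguous bands; the function name, the two-look split, and the axis convention are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def subaperture_decomposition(slc, n_subs=2, axis=0):
    """Split a single-look complex (SLC) SAR image into sub-aperture looks.

    Each look keeps one contiguous band of the (shifted) azimuth spectrum
    and transforms back, giving lower-resolution views of the same scene,
    a natural augmentation pair for contrastive self-supervision.
    """
    spectrum = np.fft.fftshift(np.fft.fft(slc, axis=axis), axes=axis)
    n = slc.shape[axis]
    looks = []
    for k in range(n_subs):
        band = np.zeros_like(spectrum)
        sl = [slice(None)] * slc.ndim
        sl[axis] = slice(k * n // n_subs, (k + 1) * n // n_subs)
        band[tuple(sl)] = spectrum[tuple(sl)]
        looks.append(np.fft.ifft(np.fft.ifftshift(band, axes=axis), axis=axis))
    return looks

# Toy usage: because the bands partition the spectrum and the FFT is
# linear, the sub-aperture looks sum back to the original image.
rng = np.random.default_rng(0)
img = rng.normal(size=(64, 64)) + 1j * rng.normal(size=(64, 64))
look_a, look_b = subaperture_decomposition(img)
```

Since the FFT is linear and the bands tile the full spectrum, summing the looks recovers the original image, which makes the decomposition easy to sanity-check.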
- Score: 5.961207817077044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to its all-weather and day-and-night capabilities, Synthetic Aperture Radar imagery is essential for various applications such as disaster management, earth monitoring, change detection and target recognition. However, the scarcity of labeled SAR data limits the performance of most deep learning algorithms. To address this issue, we propose a novel self-supervised learning framework based on masked Siamese Vision Transformers to create a General SAR Feature Extractor coined SAFE. Our method leverages contrastive learning principles to train a model on unlabeled SAR data, extracting robust and generalizable features. SAFE is applicable across multiple SAR acquisition modes and resolutions. We introduce tailored data augmentation techniques specific to SAR imagery, such as sub-aperture decomposition and despeckling. Comprehensive evaluations on various downstream tasks, including few-shot classification, segmentation, visualization, and pattern detection, demonstrate the effectiveness and versatility of the proposed approach. Our network competes with or surpasses other state-of-the-art methods in few-shot classification and segmentation tasks, even without being trained on the sensors used for the evaluation.
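As a rough illustration of the contrastive principle the abstract describes, the following sketches an InfoNCE-style loss between embeddings of two augmented views of each sample, together with the random patch masking applied by masked Siamese ViTs. All function names, shapes, and the temperature value are assumptions; the paper's actual loss and masking strategy may differ.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """Contrastive (InfoNCE-style) loss between two batches of embeddings.

    z_a[i] and z_b[i] embed two augmented views of sample i (e.g. a
    despeckled image and one of its sub-aperture looks); all other
    pairs in the batch serve as negatives.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature           # (B, B) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # matched pairs on diagonal

def random_patch_mask(n_patches, mask_ratio, rng):
    """Indices of the ViT patches kept visible by the masked branch."""
    keep = int(n_patches * (1.0 - mask_ratio))
    return rng.permutation(n_patches)[:keep]

# Toy usage with random embeddings standing in for ViT outputs.
rng = np.random.default_rng(0)
z_a = rng.normal(size=(8, 16))
z_b = z_a + 0.05 * rng.normal(size=(8, 16))
loss = info_nce_loss(z_a, z_b)
keep_idx = random_patch_mask(n_patches=196, mask_ratio=0.75, rng=rng)
```

In a masked Siamese setup, one branch would see only the patches in `keep_idx` while the target branch sees the full view; the loss then pulls the two embeddings of each sample together.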
Related papers
- IncSAR: A Dual Fusion Incremental Learning Framework for SAR Target Recognition [7.9330990800767385]
Models' tendency to forget old knowledge when learning new tasks, known as catastrophic forgetting, remains an open challenge.
In this paper, an incremental learning framework, called IncSAR, is proposed to mitigate catastrophic forgetting in SAR target recognition.
IncSAR comprises a Vision Transformer (ViT) and a custom-designed Convolutional Neural Network (CNN) in individual branches combined through a late-fusion strategy.
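A minimal sketch of the late-fusion idea: per-image descriptors from the two branches are normalized and combined into a single vector before classification. The normalization, weighting, and dimensions here are assumptions for illustration, not IncSAR's actual design.

```python
import numpy as np

def late_fusion(vit_feat, cnn_feat, alpha=0.5):
    """Fuse ViT and CNN branch descriptors at the feature level.

    Each branch is L2-normalized so neither dominates, then the two are
    weighted and concatenated into one joint descriptor.
    """
    vit_feat = vit_feat / np.linalg.norm(vit_feat)
    cnn_feat = cnn_feat / np.linalg.norm(cnn_feat)
    return np.concatenate([alpha * vit_feat, (1.0 - alpha) * cnn_feat])

# Toy usage: a 384-d ViT descriptor fused with a 256-d CNN descriptor.
rng = np.random.default_rng(1)
fused = late_fusion(rng.normal(size=384), rng.normal(size=256))
```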
arXiv Detail & Related papers (2024-10-08T08:49:47Z)
- Towards SAR Automatic Target Recognition MultiCategory SAR Image Classification Based on Light Weight Vision Transformer [11.983317593939688]
This paper applies a lightweight vision-transformer-based model to classify SAR images.
The entire structure was verified on an open-access SAR dataset.
arXiv Detail & Related papers (2024-05-18T11:24:52Z)
- SARatrX: Towards Building A Foundation Model for SAR Target Recognition [22.770010893572973]
We make the first attempt towards building a foundation model for SAR ATR, termed SARatrX.
SARatrX learns generalizable representations via self-supervised learning (SSL) and provides a basis for label-efficient model adaptation to generic SAR target detection and classification tasks.
Specifically, SARatrX is trained on 0.18M unlabelled SAR target samples, curated by combining contemporary benchmarks, which constitute the largest publicly available dataset to date.
arXiv Detail & Related papers (2024-05-15T14:17:44Z)
- Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained Ship Classification [62.425462136772666]
Fine-grained ship classification in remote sensing (RS-FGSC) poses a significant challenge due to the high similarity between classes and the limited availability of labeled data.
Recent advancements in large pre-trained Vision-Language Models (VLMs) have demonstrated impressive capabilities in few-shot or zero-shot learning.
This study delves into harnessing the potential of VLMs to enhance classification accuracy for unseen ship categories.
arXiv Detail & Related papers (2024-03-13T05:48:58Z)
- Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amounts of unlabelled data.
In this paper, we revisit transformer pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Benchmarking Deep Learning Classifiers for SAR Automatic Target Recognition [7.858656052565242]
This paper comprehensively benchmarks several advanced deep learning models for SAR ATR with multiple distinct SAR imagery datasets.
We evaluate and compare the five classifiers in terms of classification accuracy, runtime performance (inference throughput), and analytical performance.
No clear winner emerges across all of our chosen metrics, and a one-model-rules-all scenario appears doubtful in the domain of SAR ATR.
arXiv Detail & Related papers (2023-12-12T02:20:39Z)
- SatMAE: Pre-training Transformers for Temporal and Multi-Spectral Satellite Imagery [74.82821342249039]
We present SatMAE, a pre-training framework for temporal or multi-spectral satellite imagery based on Masked Autoencoder (MAE).
To leverage temporal information, we include a temporal embedding along with independently masking image patches across time.
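The temporal embedding and independent per-timestep masking can be sketched as follows; the sinusoidal form, shapes, and function names are assumptions for illustration rather than SatMAE's exact implementation.

```python
import numpy as np

def temporal_patch_mask(n_frames, n_patches, mask_ratio, rng):
    """Choose the visible patches independently for every timestep,
    rather than reusing one spatial mask across the whole sequence."""
    keep = int(n_patches * (1.0 - mask_ratio))
    return np.stack([rng.permutation(n_patches)[:keep]
                     for _ in range(n_frames)])

def temporal_embedding(timestamps, dim):
    """Sinusoidal embedding of acquisition time, added to patch tokens
    so the encoder can distinguish frames in a temporal stack."""
    freqs = 1.0 / (10000.0 ** (np.arange(0, dim, 2) / dim))
    angles = np.outer(timestamps, freqs)          # (T, dim/2)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

# Toy usage: 3 frames of 196 patches, 75% masked, 32-d time embedding.
rng = np.random.default_rng(2)
masks = temporal_patch_mask(n_frames=3, n_patches=196, mask_ratio=0.75, rng=rng)
emb = temporal_embedding(np.array([0.0, 12.0, 24.0]), dim=32)
```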
arXiv Detail & Related papers (2022-07-17T01:35:29Z)
- Remote Sensing Image Classification using Transfer Learning and Attention Based Deep Neural Network [59.86658316440461]
We propose a deep learning based framework for RSISC, which makes use of the transfer learning technique and multihead attention scheme.
The proposed deep learning framework is evaluated on the benchmark NWPU-RESISC45 dataset and achieves the best classification accuracy of 94.7%.
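For reference, the multihead attention scheme this summary mentions reduces to scaled dot-product attention computed in parallel over several subspaces. This is a generic sketch with illustrative weights and shapes, not the paper's specific architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multihead_attention(x, wq, wk, wv, wo, n_heads):
    """Multihead scaled dot-product self-attention over a token sequence.

    x: (T, d) tokens; wq/wk/wv/wo: (d, d) projection matrices.
    Each head attends in a d/n_heads-dimensional subspace; the heads
    are concatenated and mixed by the output projection wo.
    """
    T, d = x.shape
    hd = d // n_heads
    q = (x @ wq).reshape(T, n_heads, hd).transpose(1, 0, 2)  # (h, T, hd)
    k = (x @ wk).reshape(T, n_heads, hd).transpose(1, 0, 2)
    v = (x @ wv).reshape(T, n_heads, hd).transpose(1, 0, 2)
    att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(hd))    # (h, T, T)
    out = (att @ v).transpose(1, 0, 2).reshape(T, d)
    return out @ wo

# Toy usage: 10 tokens of dimension 32, split across 4 heads.
rng = np.random.default_rng(3)
x = rng.normal(size=(10, 32))
ws = [0.1 * rng.normal(size=(32, 32)) for _ in range(4)]
y = multihead_attention(x, *ws, n_heads=4)
```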
arXiv Detail & Related papers (2022-06-20T10:05:38Z)
- Context-Preserving Instance-Level Augmentation and Deformable Convolution Networks for SAR Ship Detection [50.53262868498824]
Shape deformation of targets in SAR image due to random orientation and partial information loss is an essential challenge in SAR ship detection.
We propose a data augmentation method to train a deep network that is robust to partial information loss within the targets.
arXiv Detail & Related papers (2022-02-14T07:01:01Z)
- Learning Efficient Representations for Enhanced Object Detection on Large-scene SAR Images [16.602738933183865]
It is a challenging problem to detect and recognize targets on complex large-scene Synthetic Aperture Radar (SAR) images.
Recently developed deep learning algorithms can automatically learn the intrinsic features of SAR images.
We propose an efficient and robust deep learning based target detection method.
arXiv Detail & Related papers (2022-01-22T03:25:24Z)