SARatrX: Towards Building A Foundation Model for SAR Target Recognition
- URL: http://arxiv.org/abs/2405.09365v2
- Date: Mon, 07 Oct 2024 07:39:40 GMT
- Title: SARatrX: Towards Building A Foundation Model for SAR Target Recognition
- Authors: Weijie Li, Wei Yang, Yuenan Hou, Li Liu, Yongxiang Liu, Xiang Li,
- Abstract summary: We make the first attempt towards building a foundation model for SAR ATR, termed SARatrX.
SARatrX learns generalizable representations via self-supervised learning (SSL) and provides a basis for label-efficient model adaptation to generic SAR target detection and classification tasks.
Specifically, SARatrX is trained on 0.18 M unlabelled SAR target samples, which are curated by combining contemporary benchmarks and constitute the largest publicly available dataset till now.
- Score: 22.770010893572973
- License:
- Abstract: Despite the remarkable progress in synthetic aperture radar automatic target recognition (SAR ATR), recent efforts have concentrated on the detection or classification of a specific and coarse category, e.g., vehicles, ships, airplanes, or buildings. One of the fundamental limitations of the top-performing SAR ATR methods is that the learning paradigm is supervised, task-specific, limited-category, closed-world learning, which depends on massive amounts of accurately annotated samples that are expensively labeled by expert SAR analysts and has limited generalization capability and scalability. In this work, we make the first attempt towards building a foundation model for SAR ATR, termed SARatrX. SARatrX learns generalizable representations via self-supervised learning (SSL) and provides a basis for label-efficient model adaptation to generic SAR target detection and classification tasks. Specifically, SARatrX is trained on 0.18 M unlabelled SAR target samples, which are curated by combining contemporary benchmarks and constitute the largest publicly available dataset till now. Considering the characteristics of SAR images, a backbone tailored for SAR ATR is carefully designed, and a two-step SSL method endowed with multi-scale gradient features was applied to ensure the feature diversity and model scalability of SARatrX. The capabilities of SARatrX are evaluated on classification under few-shot and robustness settings and detection across various categories and scenes, and impressive performance is achieved, often competitive with or even superior to prior fully supervised, semi-supervised, or self-supervised algorithms. Our SARatrX and the curated dataset are released at https://github.com/waterdisappear/SARatrX to foster research into foundation models for SAR ATR and SAR image interpretation.
Related papers
- IncSAR: A Dual Fusion Incremental Learning Framework for SAR Target Recognition [7.9330990800767385]
Models' tendency to forget old knowledge when learning new tasks, known as catastrophic forgetting, remains an open challenge.
In this paper, an incremental learning framework, called IncSAR, is proposed to mitigate catastrophic forgetting in SAR target recognition.
IncSAR comprises a Vision Transformer (ViT) and a custom-designed Convolutional Neural Network (CNN) in individual branches combined through a late-fusion strategy.
arXiv Detail & Related papers (2024-10-08T08:49:47Z) - SAFE: a SAR Feature Extractor based on self-supervised learning and masked Siamese ViTs [5.961207817077044]
We propose a novel self-supervised learning framework based on masked Siamese Vision Transformers to create a General SAR Feature Extractor coined SAFE.
Our method leverages contrastive learning principles to train a model on unlabeled SAR data, extracting robust and generalizable features.
We introduce tailored data augmentation techniques specific to SAR imagery, such as sub-aperture decomposition and despeckling.
Our network competes with or surpasses other state-of-the-art methods in few-shot classification and segmentation tasks, even without being trained on the sensors used for the evaluation.
arXiv Detail & Related papers (2024-06-30T23:11:20Z) - Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark [101.23684938489413]
Anomaly detection (AD) is often focused on detecting anomalies for industrial quality inspection and medical lesion examination.
This work first constructs a large-scale and general-purpose COCO-AD dataset by extending COCO to the AD field.
Inspired by the metrics in the segmentation field, we propose several more practical threshold-dependent AD-specific metrics.
arXiv Detail & Related papers (2024-04-16T17:38:26Z) - Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained
Ship Classification [62.425462136772666]
Fine-grained ship classification in remote sensing (RS-FGSC) poses a significant challenge due to the high similarity between classes and the limited availability of labeled data.
Recent advancements in large pre-trained Vision-Language Models (VLMs) have demonstrated impressive capabilities in few-shot or zero-shot learning.
This study delves into harnessing the potential of VLMs to enhance classification accuracy for unseen ship categories.
arXiv Detail & Related papers (2024-03-13T05:48:58Z) - SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z) - Benchmarking Deep Learning Classifiers for SAR Automatic Target
Recognition [7.858656052565242]
This paper comprehensively benchmarks several advanced deep learning models for SAR ATR with multiple distinct SAR imagery datasets.
We evaluate and compare the five classifiers concerning their classification accuracy runtime performance in terms of inference throughput and analytical performance.
No clear model winner emerges from all of our chosen metrics and a one model rules all case is doubtful in the domain of SAR ATR.
arXiv Detail & Related papers (2023-12-12T02:20:39Z) - Predicting Gradient is Better: Exploring Self-Supervised Learning for SAR ATR with a Joint-Embedding Predictive Architecture [23.375515181854254]
Self-Supervised Learning (SSL) methods can achieve various SAR Automatic Target Recognition (ATR) tasks with pre-training in large-scale unlabeled data.
SSL aims to construct supervision signals directly from the data, which minimizes the need for expensive expert annotation.
This study investigates an effective SSL method for SAR ATR, which can pave the way for a foundation model in SAR ATR.
arXiv Detail & Related papers (2023-11-26T01:05:55Z) - A Global Model Approach to Robust Few-Shot SAR Automatic Target
Recognition [6.260916845720537]
It may not always be possible to collect hundreds of labeled samples per class for training deep learning-based SAR Automatic Target Recognition (ATR) models.
This work specifically tackles the few-shot SAR ATR problem, where only a handful of labeled samples may be available to support the task of interest.
arXiv Detail & Related papers (2023-03-20T00:24:05Z) - Open-Set Recognition: A Good Closed-Set Classifier is All You Need [146.6814176602689]
We show that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes.
We use this correlation to boost the performance of the cross-entropy OSR 'baseline' by improving its closed-set accuracy.
We also construct new benchmarks which better respect the task of detecting semantic novelty.
arXiv Detail & Related papers (2021-10-12T17:58:59Z) - X-ModalNet: A Semi-Supervised Deep Cross-Modal Network for
Classification of Remote Sensing Data [69.37597254841052]
We propose a novel cross-modal deep-learning framework called X-ModalNet.
X-ModalNet generalizes well, owing to propagating labels on an updatable graph constructed by high-level features on the top of the network.
We evaluate X-ModalNet on two multi-modal remote sensing datasets (HSI-MSI and HSI-SAR) and achieve a significant improvement in comparison with several state-of-the-art methods.
arXiv Detail & Related papers (2020-06-24T15:29:41Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.