MTSGL: Multi-Task Structure Guided Learning for Robust and Interpretable SAR Aircraft Recognition
- URL: http://arxiv.org/abs/2504.16467v1
- Date: Wed, 23 Apr 2025 07:27:08 GMT
- Title: MTSGL: Multi-Task Structure Guided Learning for Robust and Interpretable SAR Aircraft Recognition
- Authors: Qishan He, Lingjun Zhao, Ru Luo, Siqian Zhang, Lin Lei, Kefeng Ji, Gangyao Kuang
- Abstract summary: We propose a multi-task structure guided learning (MTSGL) network for robust and interpretable SAR aircraft recognition. The proposed MTSGL includes a structural semantic awareness (SSA) module and a structural consistency regularization (SCR) module. In conclusion, the MTSGL is presented with the expert-level aircraft prior knowledge and structure guided learning paradigm, aiming to comprehend the aircraft concept in a way analogous to the human cognitive process.
- Score: 16.88286091071643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Aircraft recognition in synthetic aperture radar (SAR) imagery is a fundamental mission in both military and civilian applications. Recently, deep learning (DL) has emerged as a dominant paradigm owing to its strong performance in extracting discriminative features. However, current classification algorithms focus primarily on learning a decision hyperplane without sufficient comprehension of aircraft structural knowledge. Inspired by the fine-grained aircraft annotation methods for optical remote sensing images (RSI), we first introduce a structure-based SAR aircraft annotation approach to provide supplementary structural and compositional information. On this basis, we propose a multi-task structure guided learning (MTSGL) network for robust and interpretable SAR aircraft recognition. Besides the classification task, MTSGL includes a structural semantic awareness (SSA) module and a structural consistency regularization (SCR) module. The SSA is designed to capture structural semantic information, which is conducive to gaining a human-like comprehension of aircraft knowledge. The SCR helps maintain the geometric consistency between the aircraft structure in SAR imagery and the proposed annotation. In this process, the structural attributes can be disentangled in a geometrically meaningful manner. In conclusion, MTSGL combines expert-level aircraft prior knowledge with a structure guided learning paradigm, aiming to comprehend the aircraft concept in a way analogous to the human cognitive process. Extensive experiments are conducted on a self-constructed multi-task SAR aircraft recognition dataset (MT-SARD), and the results illustrate the superior robustness and interpretability of the proposed MTSGL.
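The abstract describes three jointly trained objectives (classification, SSA, and SCR) but gives no formulas. A minimal sketch of how such a multi-task objective is commonly combined is shown below, purely as an illustration. Everything in it is an assumption and not the authors' method: the function names, the weights `w_ssa` and `w_scr`, the use of per-pixel part classification as a stand-in for the SSA task, and the use of a mean squared keypoint distance as a stand-in for the SCR term.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mean_squared_error(pred, target):
    """Mean squared difference between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_task_loss(cls_logits, cls_label,
                    part_logits, part_labels,
                    pred_points, annot_points,
                    w_ssa=1.0, w_scr=1.0):
    """Weighted sum of three hypothetical task losses.

    l_cls: aircraft type classification loss.
    l_ssa: assumed stand-in for SSA, averaged per-pixel part classification.
    l_scr: assumed stand-in for SCR, distance between predicted and
           annotated structure points (flattened coordinates).
    """
    l_cls = cross_entropy(cls_logits, cls_label)
    l_ssa = sum(cross_entropy(z, y)
                for z, y in zip(part_logits, part_labels)) / len(part_labels)
    l_scr = mean_squared_error(pred_points, annot_points)
    total = l_cls + w_ssa * l_ssa + w_scr * l_scr
    return total, (l_cls, l_ssa, l_scr)


if __name__ == "__main__":
    total, parts = multi_task_loss(
        cls_logits=[2.0, 0.5, -1.0], cls_label=0,
        part_logits=[[1.0, 0.0], [0.0, 1.0]], part_labels=[0, 1],
        pred_points=[0.1, 0.2, 0.9, 0.8], annot_points=[0.0, 0.2, 1.0, 0.8],
    )
    print(total, parts)
```

In a weighted-sum setup like this one, the auxiliary weights control how strongly the structural terms shape the shared features relative to the classification term; the paper itself should be consulted for the actual loss definitions.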
Related papers
- Unpaired Object-Level SAR-to-Optical Image Translation for Aircraft with Keypoints-Guided Diffusion Models [4.6570959687411975]
Translating SAR images into optical images is a promising solution to enhance interpretation and support downstream tasks. This study proposes a keypoint-guided diffusion model (KeypointDiff) for SAR-to-optical image translation of unpaired aircraft targets.
arXiv Detail & Related papers (2025-03-25T16:05:49Z) - Physics-Guided Detector for SAR Airplanes [48.11882103050703]
We propose a novel physics-guided detector (PGD) learning paradigm for SAR airplanes.
It comprehensively investigates their discreteness and variability to improve the detection performance.
The experiments demonstrate the flexibility and effectiveness of the proposed PGD.
arXiv Detail & Related papers (2024-11-19T07:41:09Z) - StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization [94.31508613367296]
Retrieval-augmented generation (RAG) is a key means to effectively enhance large language models (LLMs).
We propose StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure.
Experiments show that StructRAG achieves state-of-the-art performance, particularly excelling in challenging scenarios.
arXiv Detail & Related papers (2024-10-11T13:52:44Z) - SAFE: a SAR Feature Extractor based on self-supervised learning and masked Siamese ViTs [5.961207817077044]
We propose a novel self-supervised learning framework based on masked Siamese Vision Transformers to create a General SAR Feature Extractor coined SAFE.
Our method leverages contrastive learning principles to train a model on unlabeled SAR data, extracting robust and generalizable features.
We introduce tailored data augmentation techniques specific to SAR imagery, such as sub-aperture decomposition and despeckling.
Our network competes with or surpasses other state-of-the-art methods in few-shot classification and segmentation tasks, even without being trained on the sensors used for the evaluation.
arXiv Detail & Related papers (2024-06-30T23:11:20Z) - Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained Ship Classification [59.99976102069976]
Fine-grained ship classification in remote sensing (RS-FGSC) poses a significant challenge due to the high similarity between classes and the limited availability of labeled data. Recent advancements in large pre-trained Vision-Language Models (VLMs) have demonstrated impressive capabilities in few-shot or zero-shot learning. This study delves into harnessing the potential of VLMs to enhance classification accuracy for unseen ship categories.
arXiv Detail & Related papers (2024-03-13T05:48:58Z) - MS-Net: A Multi-modal Self-supervised Network for Fine-Grained Classification of Aircraft in SAR Images [8.54188605939881]
This article proposes a novel multi-modal self-supervised network (MS-Net) for fine-grained classification of aircraft.
Without labels, the proposed algorithm achieves an accuracy of 88.46% on a 17-type aircraft classification task.
arXiv Detail & Related papers (2023-08-28T14:28:50Z) - Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations [70.41385310930846]
We present an end-to-end framework Structure-CLIP to enhance multi-modal structured representations.
We use scene graphs to guide the construction of semantic negative examples, which results in an increased emphasis on learning structured representations.
A Knowledge-Enhance Encoder (KEE) is proposed to leverage scene graph knowledge (SGK) as input to further enhance structured representations.
arXiv Detail & Related papers (2023-05-06T03:57:05Z) - SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization [59.732036564862796]
We propose the Structure Information Modeling Transformer (SIM-Trans) to incorporate object structure information into transformer for enhancing discriminative representation learning.
The proposed two modules are light-weighted and can be plugged into any transformer network and trained end-to-end easily.
Experiments and analyses demonstrate that the proposed SIM-Trans achieves state-of-the-art performance on fine-grained visual categorization benchmarks.
arXiv Detail & Related papers (2022-08-31T03:00:07Z) - Attentional Feature Refinement and Alignment Network for Aircraft Detection in SAR Imagery [24.004052923372548]
Aircraft detection in Synthetic Aperture Radar (SAR) imagery is a challenging task due to the aircraft's discrete appearance, obvious intraclass variation, small size, and serious background interference.
In this paper, a single-shot detector, namely the Attentional Feature Refinement and Alignment Network (AFRAN), is proposed for detecting aircraft in SAR images with competitive accuracy and speed.
arXiv Detail & Related papers (2022-01-18T16:54:49Z) - RC-Struct: A Structure-based Neural Network Approach for MIMO-OFDM Detection [33.414673669107906]
We introduce a structure-based neural network architecture, namely RC-Struct, for signal detection.
The RC-Struct exploits the temporal structure of the signals through reservoir computing (RC).
The introduced RC-Struct sheds light on combining communication domain knowledge and learning-based receive processing for 5G and 5G Beyond.
arXiv Detail & Related papers (2021-10-03T19:39:21Z) - Structure-Preserving Image Super-Resolution [94.16949589128296]
Structures matter in single image super-resolution (SISR).
Recent studies have promoted the development of SISR by recovering photo-realistic images.
However, there are still undesired structural distortions in the recovered images.
arXiv Detail & Related papers (2021-09-26T08:48:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.