Semi-supervised Multiscale Matching for SAR-Optical Image
- URL: http://arxiv.org/abs/2508.07812v1
- Date: Mon, 11 Aug 2025 09:55:39 GMT
- Title: Semi-supervised Multiscale Matching for SAR-Optical Image
- Authors: Jingze Gai, Changchun Li,
- Abstract summary: We propose a semi-supervised multiscale matching method for SAR-optical image matching (S2M2-SAR). Specifically, we pseudo-label unlabeled SAR-optical image pairs with pseudo ground-truth similarity heatmaps. We also introduce a cross-modal feature enhancement module trained using a cross-modality mutual independence loss.
- Score: 5.25009884148204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Driven by the complementary nature of optical and synthetic aperture radar (SAR) images, SAR-optical image matching has garnered significant interest. Most existing SAR-optical image matching methods aim to capture effective matching features by employing the supervision of pixel-level matched correspondences within SAR-optical image pairs. This supervision, however, requires time-consuming and complex manual annotation, making it difficult to collect sufficient labeled SAR-optical image pairs. To address this, we design a semi-supervised SAR-optical image matching pipeline that leverages both scarce labeled and abundant unlabeled image pairs and propose a semi-supervised multiscale matching method for SAR-optical image matching (S2M2-SAR). Specifically, we pseudo-label the unlabeled SAR-optical image pairs with pseudo ground-truth similarity heatmaps by combining both deep and shallow level matching results, and train the matching model by employing labeled and pseudo-labeled similarity heatmaps. In addition, we introduce a cross-modal feature enhancement module trained using a cross-modality mutual independence loss, which requires no ground-truth labels. This unsupervised objective promotes the separation of modality-shared and modality-specific features by encouraging statistical independence between them, enabling effective feature disentanglement across optical and SAR modalities. To evaluate the effectiveness of S2M2-SAR, we compare it with existing competitors on benchmark datasets. Experimental results demonstrate that S2M2-SAR not only surpasses existing semi-supervised methods but also achieves performance competitive with fully supervised SOTA methods, demonstrating its efficiency and practical potential.
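The abstract names two ingredients: pseudo-labeling by fusing deep and shallow matching results, and a label-free independence objective. The NumPy sketch below is a toy illustration of both ideas, not the S2M2-SAR implementation. Everything in it is an assumption made for illustration: raw pixels stand in for shallow features, average pooling stands in for a deeper feature level, the equal-weight fusion, the Gaussian target and the `min_peak` threshold are arbitrary choices, and the cross-covariance penalty is only a decorrelation proxy, which is weaker than the statistical independence the abstract describes.

```python
import numpy as np


def similarity_heatmap(search, template):
    """Normalized cross-correlation of `template` at every valid offset in `search`."""
    H, W = search.shape
    h, w = template.shape
    t = template - template.mean()
    t_norm = np.linalg.norm(t) + 1e-8
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = search[i:i + h, j:j + w]
            p = patch - patch.mean()
            out[i, j] = (p * t).sum() / ((np.linalg.norm(p) + 1e-8) * t_norm)
    return out


def downsample(x, factor=2):
    """Average-pool by `factor`; a hypothetical stand-in for a deeper feature level."""
    H = x.shape[0] // factor * factor
    W = x.shape[1] // factor * factor
    return x[:H, :W].reshape(H // factor, factor, W // factor, factor).mean(axis=(1, 3))


def upsample_to(x, shape):
    """Nearest-neighbour resize of a 2-D array to `shape`."""
    rows = np.arange(shape[0]) * x.shape[0] // shape[0]
    cols = np.arange(shape[1]) * x.shape[1] // shape[1]
    return x[np.ix_(rows, cols)]


def pseudo_label(optical, sar_template, sigma=1.5, min_peak=0.5):
    """Fuse shallow and coarse heatmaps; return (Gaussian target or None, peak)."""
    shallow = similarity_heatmap(optical, sar_template)
    deep = similarity_heatmap(downsample(optical), downsample(sar_template))
    # Assumed fusion rule: equal-weight average at the shallow resolution.
    fused = 0.5 * shallow + 0.5 * upsample_to(deep, shallow.shape)
    peak = np.unravel_index(np.argmax(fused), fused.shape)
    if fused[peak] < min_peak:
        return None, peak  # low-confidence pair: skip instead of pseudo-labeling
    yy, xx = np.mgrid[:fused.shape[0], :fused.shape[1]]
    target = np.exp(-((yy - peak[0]) ** 2 + (xx - peak[1]) ** 2) / (2 * sigma ** 2))
    return target, peak


def cross_covariance_penalty(shared, specific):
    """Squared Frobenius norm of the cross-covariance of two (N, D) feature batches."""
    s = shared - shared.mean(axis=0)
    p = specific - specific.mean(axis=0)
    cov = s.T @ p / (shared.shape[0] - 1)
    return float((cov ** 2).sum())
```

Calling `pseudo_label(optical, sar_template)` on two 2-D arrays returns a heatmap that could serve as a training target alongside labeled heatmaps, or `None` when the fused peak falls below `min_peak`. `cross_covariance_penalty` shrinks as the two feature batches become less linearly correlated, so it could be minimized without any ground-truth labels.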
Related papers
- MOS: Mitigating Optical-SAR Modality Gap for Cross-Modal Ship Re-Identification [7.7794453452329]
Cross-modal ship re-identification (ReID) between optical and synthetic aperture radar (SAR) imagery has emerged as a critical yet underexplored task in maritime intelligence and surveillance. We propose MOS, a novel framework designed to mitigate the optical-SAR modality gap and achieve modality-consistent feature learning for optical-SAR cross-modal ship ReID.
arXiv Detail & Related papers (2025-12-03T03:23:19Z)
- Domain Adaptive SAR Wake Detection: Leveraging Similarity Filtering and Memory Guidance [5.026771815351906]
We propose a Similarity-Guided and Memory-Guided Domain Adaptation (termed SimMemDA) framework for unsupervised domain adaptive ship wake detection. We first utilize WakeGAN to perform style transfer on optical images, generating pseudo-images close to the SAR style. Then, an instance-level feature similarity filtering mechanism is designed to identify and prioritize source samples with target-like distributions.
arXiv Detail & Related papers (2025-09-14T08:35:39Z)
- Collaborative Learning of Scattering and Deep Features for SAR Target Recognition with Noisy Labels [7.324728751991982]
We propose collaborative learning of scattering and deep features (DF) for SAR automatic target recognition with noisy labels. Specifically, a multi-model feature fusion framework is designed to integrate scattering and deep features. The proposed method can achieve state-of-the-art performance under different operating conditions with various label noises.
arXiv Detail & Related papers (2025-08-11T06:10:23Z)
- Quantitative Comparison of Fine-Tuning Techniques for Pretrained Latent Diffusion Models in the Generation of Unseen SAR Image Concepts [0.0]
This work investigates the adaptation of large pre-trained latent diffusion models to a radically new imaging domain: Synthetic Aperture Radar (SAR). We explore and compare multiple fine-tuning strategies, including full model fine-tuning and parameter-efficient approaches like Low-Rank Adaptation (LoRA). Our results show that a hybrid tuning strategy yields the best performance, while LoRA-based partial tuning of the text encoder, combined with embedding learning of the <SAR> token, suffices to preserve prompt alignment.
arXiv Detail & Related papers (2025-06-16T09:48:01Z)
- Learning from Noisy Pseudo-labels for All-Weather Land Cover Mapping [20.979328369582486]
SAR imagery lacks detailed information and is plagued by significant speckle noise. Recent efforts have resorted to annotating paired optical-SAR images to generate pseudo-labels. We introduce a more precise method for generating pseudo-labels by incorporating semi-supervised learning alongside a novel image resolution alignment augmentation.
arXiv Detail & Related papers (2025-04-18T04:24:47Z)
- FUSE: Label-Free Image-Event Joint Monocular Depth Estimation via Frequency-Decoupled Alignment and Degradation-Robust Fusion [63.87313550399871]
Image-event joint depth estimation methods leverage complementary modalities for robust perception, yet face challenges in generalizability. We propose Self-supervised Transfer (PST) and a Frequency-Decoupled Fusion module (FreDF). PST establishes cross-modal knowledge transfer through latent space alignment with image foundation models. FreDF explicitly decouples high-frequency edge features from low-frequency structural components, resolving modality-specific frequency mismatches.
arXiv Detail & Related papers (2025-03-25T15:04:53Z)
- Joint Image De-noising and Enhancement for Satellite-Based SAR [0.0]
Images reconstructed from Synthetic Aperture Radar (SAR) data suffer from multiplicative noise as well as low contrast. We propose a technique to handle these shortcomings simultaneously. In fact, we combine the de-noising and contrast enhancement processes into a unified algorithm.
arXiv Detail & Related papers (2024-08-06T18:44:16Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based Image Retrieval [69.46139774646308]
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR).
It aims to use sketches from unseen categories as queries to match images of the same category.
We propose a novel Symmetrical Bidirectional Knowledge Alignment for zero-shot sketch-based image retrieval (SBKA).
arXiv Detail & Related papers (2023-12-16T04:50:34Z)
- Two-Stage Self-Supervised Cycle-Consistency Network for Reconstruction of Thin-Slice MR Images [62.4428833931443]
Thick-slice magnetic resonance (MR) images are often structurally blurred in coronal and sagittal views.
Deep learning has shown great potential to reconstruct high-resolution (HR) thin-slice MR images from those low-resolution (LR) cases.
We propose a novel Two-stage Self-supervised Cycle-consistency Network (TSCNet) for MR slice reconstruction.
arXiv Detail & Related papers (2021-06-29T13:29:18Z)
- Single-shot Hyperspectral-Depth Imaging with Learned Diffractive Optics [72.9038524082252]
We propose a compact single-shot monocular hyperspectral-depth (HS-D) imaging method.
Our method uses a diffractive optical element (DOE), the point spread function of which changes with respect to both depth and spectrum.
To facilitate learning the DOE, we present a first HS-D dataset by building a benchtop HS-D imager.
arXiv Detail & Related papers (2020-09-01T14:19:35Z)
- DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution [69.2432352477966]
Real image super-resolution (Real-SR) focuses on the relationship between real-world high-resolution (HR) and low-resolution (LR) images.
In this article, we propose a Dual-path Dynamic Enhancement Network (DDet) for Real-SR.
Unlike conventional methods, which stack up massive convolutional blocks for feature representation, we introduce a content-aware framework to study non-inherently aligned image pairs.
arXiv Detail & Related papers (2020-02-25T18:24:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.