M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection
- URL: http://arxiv.org/abs/2505.10931v1
- Date: Fri, 16 May 2025 07:10:07 GMT
- Title: M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection
- Authors: Chao Wang, Wei Lu, Xiang Li, Jian Yang, Lei Luo
- Abstract summary: Single-source remote sensing object detection using optical or SAR images struggles in complex environments. We propose the first comprehensive dataset for optical-SAR fusion object detection, named Multi-resolution, Multi-polarization, Multi-scene, Multi-source SAR dataset (M4-SAR). To enable standardized evaluation, we develop a unified benchmarking toolkit that integrates six state-of-the-art multi-source fusion methods.
- Score: 28.405249208866067
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single-source remote sensing object detection using optical or SAR images struggles in complex environments. Optical images offer rich textural details but are often affected by low-light, cloud-obscured, or low-resolution conditions, reducing the detection performance. SAR images are robust to weather, but suffer from speckle noise and limited semantic expressiveness. Optical and SAR images provide complementary advantages, and fusing them can significantly improve the detection accuracy. However, progress in this field is hindered by the lack of large-scale, standardized datasets. To address these challenges, we propose the first comprehensive dataset for optical-SAR fusion object detection, named Multi-resolution, Multi-polarization, Multi-scene, Multi-source SAR dataset (M4-SAR). It contains 112,184 precisely aligned image pairs and nearly one million labeled instances with arbitrary orientations, spanning six key categories. To enable standardized evaluation, we develop a unified benchmarking toolkit that integrates six state-of-the-art multi-source fusion methods. Furthermore, we propose E2E-OSDet, a novel end-to-end multi-source fusion detection framework that mitigates cross-domain discrepancies and establishes a robust baseline for future studies. Extensive experiments on M4-SAR demonstrate that fusing optical and SAR data can improve mAP by 5.7% over single-source inputs, with particularly significant gains in complex environments. The dataset and code are publicly available at https://github.com/wchao0601/M4-SAR.
Related papers
- AuxDet: Auxiliary Metadata Matters for Omni-Domain Infrared Small Target Detection [58.67129770371016]
We propose a novel IRSTD framework that reimagines the IRSTD paradigm by incorporating textual metadata for scene-aware optimization.
AuxDet consistently outperforms state-of-the-art methods, validating the critical role of auxiliary information in improving robustness and accuracy.
arXiv Detail & Related papers (2025-05-21T07:02:05Z)
- Benchmarking Suite for Synthetic Aperture Radar Imagery Anomaly Detection (SARIAD) Algorithms [0.3124884279860061]
Anomaly detection is a key research challenge in computer vision and machine learning.
In radar imaging, specifically synthetic aperture radar (SAR), anomaly detection can be used for the classification, detection, and segmentation of objects of interest.
SARIAD provides a comprehensive suite of algorithms and datasets for assessing and developing anomaly detection approaches on SAR imagery.
arXiv Detail & Related papers (2025-04-10T20:31:25Z)
- Multi-Resolution SAR and Optical Remote Sensing Image Registration Methods: A Review, Datasets, and Future Perspectives [13.749888089968373]
Synthetic Aperture Radar (SAR) and optical image registration is essential for remote sensing data fusion.
As image resolution increases, fine SAR textures become more significant, leading to alignment issues and 3D spatial discrepancies.
The MultiResSAR dataset was created, containing over 10k pairs of multi-source, multi-resolution, and multi-scene SAR and optical images.
arXiv Detail & Related papers (2025-02-03T02:51:30Z)
- Text-Guided Coarse-to-Fine Fusion Network for Robust Remote Sensing Visual Question Answering [26.8129265632403]
Current Remote Sensing Visual Question Answering (RSVQA) methods are limited by the imaging mechanisms of optical sensors.
We propose a Text-guided Coarse-to-Fine Fusion Network (TGFNet) to improve RSVQA performance.
We create the first large-scale benchmark dataset for evaluating optical-SAR RSVQA methods.
arXiv Detail & Related papers (2024-11-24T09:48:03Z)
- Scaling Efficient Masked Image Modeling on Large Remote Sensing Dataset [66.15872913664407]
We present a new pre-training pipeline for RS models, featuring the creation of a large-scale RS dataset and an efficient MIM approach.
We curated a high-quality dataset named OpticalRS-13M by collecting publicly available RS datasets and processing them through exclusion, slicing, and deduplication.
Experiments demonstrate that OpticalRS-13M significantly improves classification, detection, and segmentation performance, while SelectiveMAE increases training efficiency over 2 times.
arXiv Detail & Related papers (2024-06-17T15:41:57Z)
- 3MOS: Multi-sources, Multi-resolutions, and Multi-scenes dataset for Optical-SAR image matching [6.13702551312774]
We introduce a large-scale Multi-sources, Multi-resolutions, and Multi-scenes dataset for Optical-SAR image matching (3MOS).
It consists of 155K optical-SAR image pairs, including SAR data from six commercial satellites, with resolutions ranging from 1.25m to 12.5m.
The data has been classified into eight scenes including urban, rural, plains, hills, mountains, water, desert, and frozen earth.
arXiv Detail & Related papers (2024-04-01T00:31:11Z)
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder the development of multi-exposure image fusion: pixel misalignment and inefficient inference.
This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion.
The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z)
- A lightweight multi-scale context network for salient object detection in optical remote sensing images [16.933770557853077]
We propose a multi-scale context network, namely MSCNet, for salient object detection in optical RSIs.
Specifically, a multi-scale context extraction module is adopted to address the scale variation of salient objects.
In order to accurately detect complete salient objects in complex backgrounds, we design an attention-based pyramid feature aggregation mechanism.
arXiv Detail & Related papers (2022-05-18T14:32:47Z)
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse in the common space, either by iterative optimization or with deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls to a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv Detail & Related papers (2022-03-30T11:44:56Z)
- Multi-Content Complementation Network for Salient Object Detection in Optical Remote Sensing Images [108.79667788962425]
Salient object detection in optical remote sensing images (RSI-SOD) remains a challenging emerging topic.
We propose a novel Multi-Content Complementation Network (MCCNet) to explore the complementarity of multiple content for RSI-SOD.
In MCCM, we consider multiple types of features that are critical to RSI-SOD, including foreground features, edge features, background features, and global image-level features.
arXiv Detail & Related papers (2021-12-02T04:46:40Z)
- RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images [82.1679766706423]
Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs.
We propose a relational reasoning network with parallel multi-scale attention for SOD in optical RSIs.
Our proposed RRNet outperforms the existing state-of-the-art SOD competitors both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-10-27T07:18:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.