Related papers: MODA: The First Challenging Benchmark for Multispectral Object Detection in Aerial Images

URL: http://arxiv.org/abs/2512.09489v1
Date: Wed, 10 Dec 2025 10:07:06 GMT
Title: MODA: The First Challenging Benchmark for Multispectral Object Detection in Aerial Images
Authors: Shuaihao Han, Tingfa Xu, Peifu Liu, Jianan Li
Abstract summary: We introduce the first large-scale dataset for Multispectral Object Detection in Aerial images (MODA). This dataset comprises 14,041 MSIs and 330,191 annotations across diverse, challenging scenarios. We also propose OSSDet, a framework that integrates spectral and spatial information with object-aware cues.
Score: 26.48439423478357
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Aerial object detection faces significant challenges in real-world scenarios, such as small objects and extensive background interference, which limit the performance of RGB-based detectors with insufficient discriminative information. Multispectral images (MSIs) capture additional spectral cues across multiple bands, offering a promising alternative. However, the lack of training data has been the primary bottleneck to exploiting the potential of MSIs. To address this gap, we introduce the first large-scale dataset for Multispectral Object Detection in Aerial images (MODA), which comprises 14,041 MSIs and 330,191 annotations across diverse, challenging scenarios, providing a comprehensive data foundation for this field. Furthermore, to overcome challenges inherent to aerial object detection using MSIs, we propose OSSDet, a framework that integrates spectral and spatial information with object-aware cues. OSSDet employs a cascaded spectral-spatial modulation structure to optimize target perception, aggregates spectrally related features by exploiting spectral similarities to reinforce intra-object correlations, and suppresses irrelevant background via object-aware masking. Moreover, cross-spectral attention further refines object-related representations under explicit object-aware guidance. Extensive experiments demonstrate that OSSDet outperforms existing methods with comparable parameters and efficiency.

Related papers

MMOT: The First Challenging Benchmark for Drone-based Multispectral Multi-Object Tracking [30.3437683353074]
MMOT is the first benchmark for drone-based multispectral multi-object tracking. It features 125 video sequences with over 488.8K annotations across eight categories. To better extract spectral features and leverage oriented annotations, we present a multispectral and orientation-aware MOT scheme.
arXiv Detail & Related papers (2025-10-14T14:25:17Z)
AuxDet: Auxiliary Metadata Matters for Omni-Domain Infrared Small Target Detection [49.81255045696323]
We present the Auxiliary Metadata Driven Infrared Small Target Detector (AuxDet). AuxDet integrates metadata semantics with visual features, guiding adaptive representation learning for each sample. Experiments on the challenging WideIRSTD-Full benchmark demonstrate that AuxDet consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2025-05-21T07:02:05Z)
Hyperspectral Remote Sensing Images Salient Object Detection: The First Benchmark Dataset and Baseline [14.081609886645555]
We introduce the first HRSI-SOD dataset, termed HRSSD, which includes 704 hyperspectral images and 5327 pixel-level annotated salient objects. The HRSSD dataset poses substantial challenges for salient object detection algorithms due to large scale variation, diverse foreground-background relations, and multi-salient objects. We propose an innovative and efficient baseline model for HRSI-SOD, termed the Deep Spectral Saliency Network (DSSN).
arXiv Detail & Related papers (2025-04-03T09:12:42Z)
Hyperspectral Adapter for Object Tracking based on Hyperspectral Video [18.77789707539318]
A new hyperspectral object tracking method, hyperspectral adapter for tracking (HyA-T), is proposed in this work. The proposed method extracts spectral information directly from the hyperspectral images, which prevents the loss of spectral information. HyA-T achieves state-of-the-art performance on all the datasets.
arXiv Detail & Related papers (2025-03-28T07:31:48Z)
Spectrum-oriented Point-supervised Saliency Detector for Hyperspectral Images [13.79887292039637]
We introduce point supervision into hyperspectral salient object detection (HSOD). We incorporate Spectral Saliency, derived from conventional HSOD methods, as a pivotal spectral representation within the framework. We propose a novel pipeline, specifically designed for HSIs, to generate pseudo-labels, effectively mitigating the performance decline associated with the point supervision strategy.
arXiv Detail & Related papers (2024-12-24T02:52:43Z)
Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks [49.84182981950623]
Multispectral object detection, utilizing RGB and TIR (thermal infrared) modalities, is widely recognized as a challenging task. It requires not only the effective extraction of features from both modalities and robust fusion strategies, but also the ability to address issues such as spectral discrepancies. We introduce an efficient and easily deployable multispectral object detection framework that can seamlessly optimize high-performing single-modality models.
arXiv Detail & Related papers (2024-11-27T12:18:39Z)
SSF-Net: Spatial-Spectral Fusion Network with Spectral Angle Awareness for Hyperspectral Object Tracking [21.664141982246598]
Hyperspectral video (HSV) offers valuable spatial, spectral, and temporal information simultaneously. Existing methods primarily focus on band regrouping and rely on RGB trackers for feature extraction. In this paper, a spatial-spectral fusion network with spectral angle awareness (SSF-Net) is proposed for hyperspectral (HS) object tracking.
arXiv Detail & Related papers (2024-03-09T09:37:13Z)
Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head. The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement. This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
Improving Vision Anomaly Detection with the Guidance of Language Modality [64.53005837237754]
This paper tackles the challenges for vision modality from a multimodal point of view. We propose Cross-modal Guidance (CMG) to tackle the redundant information issue and sparse space issue. To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality.
arXiv Detail & Related papers (2023-10-04T13:44:56Z)
Object Detection in Hyperspectral Image via Unified Spectral-Spatial Feature Aggregation [55.9217962930169]
We present S2ADet, an object detector that harnesses the rich spectral and spatial complementary information inherent in hyperspectral images. S2ADet surpasses existing state-of-the-art methods, achieving robust and reliable results.
arXiv Detail & Related papers (2023-06-14T09:01:50Z)
RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images [82.1679766706423]
Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs. We propose a relational reasoning network with parallel multi-scale attention for SOD in optical RSIs. Our proposed RRNet outperforms the existing state-of-the-art SOD competitors both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-10-27T07:18:32Z)