Towards Robust Optical-SAR Object Detection under Missing Modalities: A Dynamic Quality-Aware Fusion Framework
- URL: http://arxiv.org/abs/2512.22447v1
- Date: Sat, 27 Dec 2025 03:16:48 GMT
- Title: Towards Robust Optical-SAR Object Detection under Missing Modalities: A Dynamic Quality-Aware Fusion Framework
- Authors: Zhicheng Zhao, Yuancheng Xu, Andong Lu, Chenglong Li, Jin Tang
- Abstract summary: Optical and Synthetic Aperture Radar (SAR) fusion-based object detection has attracted significant research interest in remote sensing. We propose a novel Quality-Aware Dynamic Fusion Network (QDFNet) for robust optical-SAR object detection.
- Score: 27.71603877164877
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optical and Synthetic Aperture Radar (SAR) fusion-based object detection has attracted significant research interest in remote sensing, as these modalities provide complementary information for all-weather monitoring. However, practical deployment is severely limited by inherent challenges. Due to distinct imaging mechanisms, temporal asynchrony, and registration difficulties, obtaining well-aligned optical-SAR image pairs remains extremely difficult, frequently resulting in missing or degraded modality data. Although recent approaches have attempted to address this issue, they still suffer from limited robustness to random missing modalities and lack effective mechanisms to ensure consistent performance improvement in fusion-based detection. To address these limitations, we propose a novel Quality-Aware Dynamic Fusion Network (QDFNet) for robust optical-SAR object detection. Our proposed method leverages learnable reference tokens to dynamically assess feature reliability and guide adaptive fusion in the presence of missing modalities. In particular, we design a Dynamic Modality Quality Assessment (DMQA) module that employs learnable reference tokens to iteratively refine feature reliability assessment, enabling precise identification of degraded regions and providing quality guidance for subsequent fusion. Moreover, we develop an Orthogonal Constraint Normalization Fusion (OCNF) module that employs orthogonal constraints to preserve modality independence while dynamically adjusting fusion weights based on reliability scores, effectively suppressing unreliable feature propagation. Extensive experiments on the SpaceNet6-OTD and OGSOD-2.0 datasets demonstrate the superiority and effectiveness of QDFNet compared to state-of-the-art methods, particularly under partial modality corruption or missing data scenarios.
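The fusion mechanism the abstract describes (per-modality reliability scores steering fusion weights, plus an orthogonal constraint keeping the two modalities' feature subspaces independent) can be illustrated with a minimal sketch. The function names, shapes, and the softmax weighting below are assumptions for illustration only, not the authors' implementation:

```python
import numpy as np

def quality_weighted_fuse(feat_opt, feat_sar, rel_opt, rel_sar):
    """Fuse two modality feature maps with reliability-derived weights.

    feat_*: (N, D) feature matrices; rel_*: scalar reliability scores in [0, 1].
    A missing or degraded modality can be signalled with a low rel_*,
    which shifts fusion weight toward the surviving modality.
    """
    # Softmax over the two reliability scores -> convex fusion weights.
    w = np.exp(np.array([rel_opt, rel_sar]))
    w = w / w.sum()
    fused = w[0] * feat_opt + w[1] * feat_sar
    return fused, w

def orthogonality_penalty(feat_opt, feat_sar):
    """Frobenius-norm penalty encouraging modality-independent feature
    subspaces: || F_opt^T F_sar ||_F^2, minimized during training."""
    return np.linalg.norm(feat_opt.T @ feat_sar, ord="fro") ** 2

rng = np.random.default_rng(0)
f_opt = rng.standard_normal((4, 8))
f_sar = rng.standard_normal((4, 8))

# A reliable optical branch (0.9) and a degraded SAR branch (0.2):
fused, w = quality_weighted_fuse(f_opt, f_sar, rel_opt=0.9, rel_sar=0.2)
print(w)  # more weight lands on the more reliable optical branch
print(orthogonality_penalty(f_opt, f_sar))
```

In a real model the reliability scores would come from a learned assessment module (the paper's DMQA) rather than being supplied by hand, and the penalty would be one term of the training loss.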
Related papers
- FeedbackSTS-Det: Sparse Frames-Based Spatio-Temporal Semantic Feedback Network for Infrared Small Target Detection [7.648318265124807]
Infrared small target detection (ISTD) under complex backgrounds remains a challenging task.
Existing methods still struggle with inefficient long-range dependency modeling.
We propose a novel scheme for ISTD based on a sparse spatio-temporal semantic feedback network.
arXiv Detail & Related papers (2026-01-21T06:06:36Z)
- LLM-Centric RAG with Multi-Granular Indexing and Confidence Constraints [5.2604064919135896]
This paper addresses the issues of insufficient coverage, unstable results, and limited reliability in retrieval-augmented generation under complex knowledge environments.
It proposes a confidence control method that integrates multi-granularity memory indexing with uncertainty estimation.
The results show that the method achieves superior performance over existing models in QA accuracy, retrieval recall, ranking quality, and factual consistency.
arXiv Detail & Related papers (2025-10-30T23:48:37Z)
- Diffuse to Detect: A Generalizable Framework for Anomaly Detection with Diffusion Models Applications to UAVs and Beyond [2.4449457537548036]
Anomaly detection in complex, high-dimensional data, such as UAV sensor readings, is essential for operational safety.
We propose the Diffuse to Detect (DTD) framework, a novel approach that adapts diffusion models for anomaly detection.
DTD employs a single-step diffusion process to predict noise patterns, enabling rapid and precise identification of anomalies without reconstruction errors.
arXiv Detail & Related papers (2025-10-27T02:08:08Z)
- Source-Free Object Detection with Detection Transformer [59.33653163035064]
Source-Free Object Detection (SFOD) enables knowledge transfer from a source domain to an unsupervised target domain for object detection without access to source data.
Most existing SFOD approaches are either confined to conventional object detection (OD) models like Faster R-CNN or designed as general solutions without tailored adaptations for novel OD architectures, especially Detection Transformer (DETR).
In this paper, we introduce Feature Reweighting ANd Contrastive Learning NetworK (FRANCK), a novel SFOD framework specifically designed to perform query-centric feature enhancement for DETRs.
arXiv Detail & Related papers (2025-10-13T07:35:04Z)
- Adaptive Dual Uncertainty Optimization: Boosting Monocular 3D Object Detection under Test-Time Shifts [80.32933059529135]
Test-Time Adaptation (TTA) methods have emerged to adapt to target distributions during inference.
We propose Dual Uncertainty Optimization (DUO), the first TTA framework designed to jointly minimize both uncertainties for robust monocular 3D object detection (M3OD).
In parallel, we design a semantic-aware normal field constraint that preserves geometric coherence in regions with clear semantic cues.
arXiv Detail & Related papers (2025-08-28T07:09:21Z)
- Quality-Aware Language-Conditioned Local Auto-Regressive Anomaly Synthesis and Detection [30.77558600436759]
ARAS is a language-conditioned, auto-regressive anomaly synthesis approach.
It injects local, text-specified defects into normal images via token-anchored latent editing.
It significantly enhances defect realism, preserves fine-grained material textures, and provides continuous semantic control over synthesized anomalies.
arXiv Detail & Related papers (2025-08-05T15:07:32Z)
- RoHOI: Robustness Benchmark for Human-Object Interaction Detection [84.78366452133514]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support.
We introduce the first robustness benchmark for HOI detection, evaluating model resilience under diverse challenges.
Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z)
- Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems [89.35169042718739]
Collaborative inference enables end users to leverage powerful deep learning models without exposing sensitive raw data to cloud servers.
Recent studies have revealed that these intermediate features may not sufficiently preserve privacy, as information can be leaked and raw data can be reconstructed via model inversion attacks (MIAs).
This work first theoretically proves that the conditional entropy of inputs given intermediate features provides a guaranteed lower bound on the reconstruction mean square error (MSE) under any MIA.
Then, we derive a differentiable and solvable measure for bounding this conditional entropy based on Gaussian mixture estimation and propose a conditional entropy maximization algorithm to enhance inversion robustness.
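The entropy-based bound this summary alludes to has a classical scalar form; as a hedged illustration (the paper's exact statement may differ, e.g. covering vector-valued inputs), for input $X$, intermediate feature $Z$, and any estimator $\hat{X}(Z)$:

```latex
\mathbb{E}\!\left[\big(X - \hat{X}(Z)\big)^{2}\right]
\;\ge\;
\frac{1}{2\pi e}\, e^{2\,h(X \mid Z)}
```

Here $h(X \mid Z)$ is the conditional differential entropy, so maximizing it raises the guaranteed floor on any inversion attack's reconstruction MSE.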
arXiv Detail & Related papers (2025-03-01T07:15:21Z)
- Reliability-Driven LiDAR-Camera Fusion for Robust 3D Object Detection [0.0]
We propose ReliFusion, a LiDAR-camera fusion framework operating in the bird's-eye view (BEV) space.
ReliFusion integrates three key components: the Spatio-Temporal Feature Aggregation (STFA) module, the Reliability module, and the Confidence-Weighted Mutual Cross-Attention (CW-MCA) module.
Experiments on the nuScenes dataset show that ReliFusion significantly outperforms state-of-the-art methods, achieving superior robustness and accuracy in scenarios with limited LiDAR fields of view and severe sensor malfunctions.
arXiv Detail & Related papers (2025-02-03T22:07:14Z)
- Towards Evaluating the Robustness of Visual State Space Models [63.14954591606638]
Vision State Space Models (VSSMs) have demonstrated remarkable performance in visual perception tasks.
However, their robustness under natural and adversarial perturbations remains a critical concern.
We present a comprehensive evaluation of VSSMs' robustness under various perturbation scenarios.
arXiv Detail & Related papers (2024-06-13T17:59:44Z)
- A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under simple synthesis strategies, it outperforms existing methods by a large margin and also achieves state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z)
- Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection [37.99031842449251]
Video anomaly detection under weak supervision presents significant challenges.
We present a weakly supervised anomaly detection framework that focuses on efficient context modeling and enhanced semantic discriminability.
Our approach significantly improves the detection accuracy of certain anomaly sub-classes, underscoring its practical value and efficacy.
arXiv Detail & Related papers (2023-06-26T06:45:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.