RS-OOD: A Vision-Language Augmented Framework for Out-of-Distribution Detection in Remote Sensing
- URL: http://arxiv.org/abs/2509.02273v2
- Date: Wed, 01 Oct 2025 16:11:04 GMT
- Title: RS-OOD: A Vision-Language Augmented Framework for Out-of-Distribution Detection in Remote Sensing
- Authors: Chenhao Wang, Yingrui Ji, Yu Meng, Yunjian Zhang, Yao Zhu,
- Abstract summary: Out-of-distribution (OOD) detection represents a critical challenge in remote sensing applications.<n>We propose RS-OOD, a novel framework that leverages remote sensing-specific vision-language modeling to enable robust few-shot OOD detection.<n>Our approach introduces three key innovations: spatial feature enhancement that improved scene discrimination, a dual-prompt alignment mechanism, and a confidence-guided self-training loop.
- Score: 19.743431031185736
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Out-of-distribution (OOD) detection represents a critical challenge in remote sensing applications, where reliable identification of novel or anomalous patterns is essential for autonomous monitoring, disaster response, and environmental assessment. Despite remarkable progress in OOD detection for natural images, existing methods and benchmarks remain poorly suited to remote sensing imagery due to data scarcity, complex multi-scale scene structures, and pronounced distribution shifts. To this end, we propose RS-OOD, a novel framework that leverages remote sensing-specific vision-language modeling to enable robust few-shot OOD detection. Our approach introduces three key innovations: spatial feature enhancement that improved scene discrimination, a dual-prompt alignment mechanism that cross-verifies scene context against fine-grained semantics for spatial-semantic consistency, and a confidence-guided self-training loop that dynamically mines pseudo-labels to expand training data without manual annotation. RS-OOD consistently outperforms existing methods across multiple remote sensing benchmarks and enables efficient adaptation with minimal labeled data, demonstrating the critical value of spatial-semantic integration.
Related papers
- Mind the Way You Select Negative Texts: Pursuing the Distance Consistency in OOD Detection with VLMs [80.03370593724422]
Out-of-distribution (OOD) detection seeks to identify samples from unknown classes.<n>Current methods often incorporate intra-modal distance during OOD detection, such as comparing negative texts with ID labels.<n>We propose InterNeg, a framework that systematically utilizes consistent inter-modal distance enhancement from textual and visual perspectives.
arXiv Detail & Related papers (2026-03-03T05:44:47Z) - Learning to Explore: Policy-Guided Outlier Synthesis for Graph Out-of-Distribution Detection [51.93878677594561]
In unsupervised graph-level OOD detection, models are typically trained using only in-distribution (ID) data.<n>We propose a Policy-Guided Outlier Synthesis framework that replaces statics with a learned exploration strategy.
arXiv Detail & Related papers (2026-02-28T11:40:18Z) - RSGround-R1: Rethinking Remote Sensing Visual Grounding through Spatial Reasoning [61.84363374647606]
Remote Sensing Visual Grounding (RSVG) aims to localize target objects in large-scale aerial imagery based on natural language descriptions.<n>These descriptions often rely heavily on positional cues, posing unique challenges for Multimodal Large Language Models (MLLMs) in spatial reasoning.<n>We propose a reasoning-guided, position-aware post-training framework, dubbed textbfRSGround-R1, to progressively enhance spatial understanding.
arXiv Detail & Related papers (2026-01-29T12:35:57Z) - GOOD: Training-Free Guided Diffusion Sampling for Out-of-Distribution Detection [61.96025941146103]
GOOD is a novel framework that guides sampling trajectories towards OOD regions using off-the-shelf in-distribution (ID) classifiers.<n> GOOD incorporates dual-level guidance: Image-level guidance based on the gradient of log partition to reduce input likelihood, drives samples toward low-density regions in pixel space.<n>We introduce a unified OOD score that adaptively combines image and feature discrepancies, enhancing detection robustness.
arXiv Detail & Related papers (2025-10-20T03:58:46Z) - RoHOI: Robustness Benchmark for Human-Object Interaction Detection [78.18946529195254]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support.<n>We introduce the first benchmark for HOI detection, evaluating model resilience under diverse challenges.<n>Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z) - DiffRIS: Enhancing Referring Remote Sensing Image Segmentation with Pre-trained Text-to-Image Diffusion Models [9.109484087832058]
DiffRIS is a novel framework that harnesses the semantic understanding capabilities of pre-trained text-to-image diffusion models for RRSIS tasks.<n>Our framework introduces two key innovations: a context perception adapter (CP-adapter) and a cross-modal reasoning decoder (PCMRD)
arXiv Detail & Related papers (2025-06-23T02:38:56Z) - Adaptive Contextual Embedding for Robust Far-View Borehole Detection [2.206623168926072]
In blasting operations, accurately detecting densely distributed tiny boreholes from far-view imagery is critical for operational safety and efficiency.<n>We propose an adaptive detection approach that builds upon existing architectures (e.g., YOLO) by explicitly leveraging consistent embedding representations derived through exponential moving average (EMA)-based statistical updates.<n>Our method introduces three synergistic components: (1) adaptive augmentation utilizing dynamically updated image statistics to robustly handle illumination and texture variations; (2) embedding stabilization to ensure consistent and reliable feature extraction; and (3) contextual refinement leveraging spatial context for improved detection accuracy.
arXiv Detail & Related papers (2025-05-08T07:25:42Z) - RUNA: Object-level Out-of-Distribution Detection via Regional Uncertainty Alignment of Multimodal Representations [33.971901643313856]
RUNA is a novel framework for detecting out-of-distribution (OOD) objects.<n>It employs a regional uncertainty alignment mechanism to distinguish ID from OOD objects effectively.<n>Our experiments show that RUNA substantially surpasses state-of-the-art methods in object-level OOD detection.
arXiv Detail & Related papers (2025-03-28T10:01:55Z) - Adaptive Residual Transformation for Enhanced Feature-Based OOD Detection in SAR Imagery [5.63530048112308]
The presence of unknown targets in real battlefield scenarios is unavoidable.
Various feature-based out-of-distribution approaches have been developed to address this issue.
We propose transforming feature-based OOD detection into a class-localized feature-residual-based approach.
arXiv Detail & Related papers (2024-11-01T00:09:02Z) - From Global to Local: Multi-scale Out-of-distribution Detection [129.37607313927458]
Out-of-distribution (OOD) detection aims to detect "unknown" data whose labels have not been seen during the in-distribution (ID) training process.
Recent progress in representation learning gives rise to distance-based OOD detection.
We propose Multi-scale OOD DEtection (MODE), a first framework leveraging both global visual information and local region details.
arXiv Detail & Related papers (2023-08-20T11:56:25Z) - Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual
Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision.
Existing literature addresses this challenge by employing local-based representation approaches.
This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z) - Improving Out-of-Distribution Detection with Disentangled Foreground and Background Features [23.266183020469065]
We propose a novel framework that disentangles foreground and background features from ID training samples via a dense prediction approach.
It is a generic framework that allows for a seamless combination with various existing OOD detection methods.
arXiv Detail & Related papers (2023-03-15T16:12:14Z) - Triggering Failures: Out-Of-Distribution detection by learning from
local adversarial attacks in Semantic Segmentation [76.2621758731288]
We tackle the detection of out-of-distribution (OOD) objects in semantic segmentation.
Our main contribution is a new OOD detection architecture called ObsNet associated with a dedicated training scheme based on Local Adversarial Attacks (LAA)
We show it obtains top performances both in speed and accuracy when compared to ten recent methods of the literature on three different datasets.
arXiv Detail & Related papers (2021-08-03T17:09:56Z) - Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluating on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.