WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
- URL: http://arxiv.org/abs/2508.09560v2
- Date: Thu, 14 Aug 2025 01:05:56 GMT
- Title: WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
- Authors: Jiahao Wen, Hang Yu, Zhedong Zheng,
- Abstract summary: We present WeatherPrompt, a multi-modality learning paradigm that establishes weather-invariant representations by fusing the image embedding with the text context. Our framework achieves competitive recall rates compared to state-of-the-art drone geo-localization methods.
- Score: 22.01591564940522
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual geo-localization for drones degrades critically under weather perturbations, e.g., rain and fog, where existing methods struggle with two inherent limitations: 1) heavy reliance on a limited set of weather categories, which constrains generalization, and 2) suboptimal disentanglement of entangled scene-weather features through pseudo weather categories. We present WeatherPrompt, a multi-modality learning paradigm that establishes weather-invariant representations by fusing the image embedding with the text context. Our framework introduces two key contributions. First, a Training-free Weather Reasoning mechanism employs off-the-shelf large multi-modality models to synthesize multi-weather textual descriptions through human-like reasoning; this improves scalability to unseen or complex weather and can reflect varying weather intensity. Second, to better disentangle scene and weather features, we propose a multi-modality framework with a dynamic gating mechanism, driven by the text embedding, that adaptively reweights and fuses visual features across modalities. The framework is further optimized by cross-modal objectives, including image-text contrastive learning and image-text matching, which map the same scene under different weather conditions closer together in the representation space. Extensive experiments validate that, under diverse weather conditions, our method achieves competitive recall rates compared to state-of-the-art drone geo-localization methods. Notably, it improves Recall@1 by +13.37% under night conditions and by +18.69% under fog and snow conditions.
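The text-driven dynamic gating described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding dimensions, the single linear projection, and the two-branch fusion are all hypothetical simplifications of the idea that per-channel gates derived from the weather text embedding reweight and fuse visual features.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Hypothetical embedding sizes, for illustration only.
d_text, d_vis = 16, 32

# Visual features from two branches (e.g., two modalities or views)
# and a weather text embedding for the same scene.
vis_a = rng.standard_normal(d_vis)
vis_b = rng.standard_normal(d_vis)
text_emb = rng.standard_normal(d_text)

# A (here random, in practice learnable) projection maps the text
# embedding to one gate value per visual channel, squashed to (0, 1).
W = rng.standard_normal((d_vis, d_text)) / np.sqrt(d_text)
gate = sigmoid(W @ text_emb)

# Gated fusion: the text context decides, channel by channel,
# how much each visual branch contributes to the fused feature.
fused = gate * vis_a + (1.0 - gate) * vis_b
print(fused.shape)
```

In a trained model, `W` would be learned jointly with the cross-modal contrastive and matching objectives, so that the gates suppress weather-specific channels and keep scene-specific ones.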
Related papers
- RoSe: Robust Self-supervised Stereo Matching under Adverse Weather Conditions [58.37558408672509]
We propose a robust self-supervised training paradigm consisting of two key steps: robust self-supervised scene correspondence learning and adverse weather distillation. Experiments demonstrate the effectiveness and versatility of our proposed solution, which outperforms existing state-of-the-art self-supervised methods.
arXiv Detail & Related papers (2025-09-23T15:41:40Z) - DA2Diff: Exploring Degradation-aware Adaptive Diffusion Priors for All-in-One Weather Restoration [32.16602874389847]
We propose an innovative diffusion paradigm with degradation-aware adaptive priors for all-in-one weather restoration, termed DA2Diff. We deploy a set of learnable prompts to capture degradation-aware representations via prompt-image similarity constraints in the CLIP space. We also propose a dynamic expert selection modulator that employs a dynamic weather-aware router to flexibly dispatch varying numbers of restoration experts for each weather-distorted image.
arXiv Detail & Related papers (2025-04-07T14:38:57Z) - MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers [44.600209414790854]
Restoring images captured under adverse weather conditions is a fundamental task for many computer vision applications.
We propose a multi-weather Transformer, or MWFormer, that aims to solve multiple weather-induced degradations using a single architecture.
We show that MWFormer achieves significant performance improvements compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2024-11-26T08:47:39Z) - WeatherGFM: Learning A Weather Generalist Foundation Model via In-context Learning [69.82211470647349]
We introduce the first generalist weather foundation model (WeatherGFM), which addresses a wide spectrum of weather understanding tasks in a unified manner. Our model can effectively handle up to ten weather understanding tasks, including weather forecasting, super-resolution, weather image translation, and post-processing.
arXiv Detail & Related papers (2024-11-08T09:14:19Z) - Multiple weather images restoration using the task transformer and adaptive mixup strategy [14.986500375481546]
We introduce a novel multi-task severe weather removal model that can effectively handle complex weather conditions in an adaptive manner.
Our model incorporates a weather task sequence generator, enabling the self-attention mechanism to selectively focus on features specific to different weather types.
Our proposed model has achieved state-of-the-art performance on the publicly available dataset.
arXiv Detail & Related papers (2024-09-05T04:55:40Z) - Boosting Adverse Weather Crowd Counting via Multi-queue Contrastive Learning [8.692139673789555]
We propose a two-stage crowd counting method named Multi-queue Contrastive Learning (MQCL). MQCL reduces the counting error under adverse weather conditions by 22%, while introducing only about a 13% increase in computational burden.
arXiv Detail & Related papers (2024-08-12T07:13:08Z) - Modeling Weather Uncertainty for Multi-weather Co-Presence Estimation [25.060597623607784]
Existing algorithms model the weather condition as a discrete status and estimate it using multi-label classification.
We consider the physical formulation of multi-weather conditions and model the impact of physics-related parameters on learning from the image appearance.
arXiv Detail & Related papers (2024-03-29T10:05:29Z) - All-weather Multi-Modality Image Fusion: Unified Framework and 100k Benchmark [42.49073228252726]
Multi-modality image fusion (MMIF) combines complementary information from different image modalities to provide a more comprehensive and objective interpretation of scenes.
Existing MMIF methods lack the ability to resist different weather interferences in real-world scenes, preventing them from being useful in practical applications such as autonomous driving.
We propose an all-weather MMIF model to achieve effective multi-tasking in this context.
Experimental results in both real-world and synthetic scenes show that the proposed algorithm excels in detail recovery and multi-modality feature extraction.
arXiv Detail & Related papers (2024-02-03T09:02:46Z) - MetaWeather: Few-Shot Weather-Degraded Image Restoration [17.63266150036311]
We introduce MetaWeather, a universal approach that can handle diverse and novel weather conditions with a single unified model.
We show that MetaWeather can adapt to unseen weather conditions, significantly outperforming state-of-the-art multi-weather image restoration methods.
arXiv Detail & Related papers (2023-08-28T06:25:40Z) - Exploring the Application of Large-scale Pre-trained Models on Adverse Weather Removal [97.53040662243768]
We propose a CLIP embedding module to make the network handle different weather conditions adaptively.
This module integrates the sample-specific weather prior extracted by the CLIP image encoder with the distribution-specific information learned by a set of parameters.
arXiv Detail & Related papers (2023-06-15T10:06:13Z) - Counting Crowds in Bad Weather [68.50690406143173]
We propose a method for robust crowd counting in adverse weather scenarios.
Our model learns effective features and adaptive queries to account for large appearance variations.
Experimental results show that the proposed algorithm is effective in counting crowds under different weather types on benchmark datasets.
arXiv Detail & Related papers (2023-06-02T00:00:09Z) - ScatterNeRF: Seeing Through Fog with Physically-Based Inverse Neural Rendering [83.75284107397003]
We introduce ScatterNeRF, a neural rendering method which renders scenes and decomposes the fog-free background.
We propose a disentangled representation for the scattering volume and the scene objects, and learn the scene reconstruction with physics-inspired losses.
We validate our method by capturing multi-view In-the-Wild data and controlled captures in a large-scale fog chamber.
arXiv Detail & Related papers (2023-05-03T13:24:06Z) - Weather GAN: Multi-Domain Weather Translation Using Generative Adversarial Networks [76.64158017926381]
We propose a new task, weather translation, which transfers the weather conditions of an image from one category to another.
We develop a multi-domain weather translation approach based on generative adversarial networks (GAN), denoted as Weather GAN.
Our approach suppresses the distortion and deformation caused by weather translation.
arXiv Detail & Related papers (2021-03-09T13:51:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.