GeoExplorer: Active Geo-localization with Curiosity-Driven Exploration
- URL: http://arxiv.org/abs/2508.00152v1
- Date: Thu, 31 Jul 2025 20:23:25 GMT
- Title: GeoExplorer: Active Geo-localization with Curiosity-Driven Exploration
- Authors: Li Mi, Manon Bechaz, Zeming Chen, Antoine Bosselut, Devis Tuia
- Abstract summary: Active Geo-localization (AGL) is the task of localizing a goal within a predefined search area. Current methods approach AGL as a goal-reaching reinforcement learning problem with a distance-based reward. We propose GeoExplorer, an AGL agent that incorporates curiosity-driven exploration through intrinsic rewards.
- Score: 24.01750902074338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Active Geo-localization (AGL) is the task of localizing a goal, represented in various modalities (e.g., aerial images, ground-level images, or text), within a predefined search area. Current methods approach AGL as a goal-reaching reinforcement learning (RL) problem with a distance-based reward. They localize the goal by implicitly learning to minimize the relative distance from it. However, when distance estimation becomes challenging or when encountering unseen targets and environments, the agent exhibits reduced robustness and generalization ability due to the less reliable exploration strategy learned during training. In this paper, we propose GeoExplorer, an AGL agent that incorporates curiosity-driven exploration through intrinsic rewards. Unlike distance-based rewards, our curiosity-driven reward is goal-agnostic, enabling robust, diverse, and contextually relevant exploration based on effective environment modeling. These capabilities have been proven through extensive experiments across four AGL benchmarks, demonstrating the effectiveness and generalization ability of GeoExplorer in diverse settings, particularly in localizing unfamiliar targets and environments.
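To make the reward contrast concrete, the sketch below compares a goal-dependent distance-based reward with a goal-agnostic curiosity reward computed as the prediction error of a learned forward dynamics model, one common way to instantiate curiosity-driven exploration. This is a minimal illustration under assumed embedding and action dimensions, not GeoExplorer's actual reward design or architecture; `ForwardModel`, `curiosity_reward`, and all shapes are hypothetical placeholders.

```python
# Minimal sketch (PyTorch): distance-based extrinsic reward vs. a curiosity-style
# intrinsic reward from forward-model prediction error. Illustrative only; not
# GeoExplorer's actual reward or architecture.
import torch
import torch.nn as nn


def distance_reward(agent_pos: torch.Tensor, goal_pos: torch.Tensor) -> torch.Tensor:
    """Goal-dependent reward: negative Euclidean distance to the goal."""
    return -torch.linalg.norm(agent_pos - goal_pos, dim=-1)


class ForwardModel(nn.Module):
    """Predicts the next observation embedding from the current one and the action."""

    def __init__(self, obs_dim: int = 128, act_dim: int = 4, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, obs_dim),
        )

    def forward(self, obs_emb: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs_emb, action], dim=-1))


def curiosity_reward(model: ForwardModel,
                     obs_emb: torch.Tensor,
                     action: torch.Tensor,
                     next_obs_emb: torch.Tensor) -> torch.Tensor:
    """Goal-agnostic intrinsic reward: forward-model prediction error.

    High error marks poorly modeled (novel) parts of the search area,
    so the agent is encouraged to visit them regardless of where the goal is.
    """
    with torch.no_grad():
        pred = model(obs_emb, action)
    return 0.5 * (pred - next_obs_emb).pow(2).mean(dim=-1)


if __name__ == "__main__":
    model = ForwardModel()
    obs, nxt = torch.randn(8, 128), torch.randn(8, 128)
    act = torch.randn(8, 4)
    r_int = curiosity_reward(model, obs, act, nxt)               # goal-agnostic
    r_ext = distance_reward(torch.rand(8, 2), torch.rand(8, 2))  # goal-dependent
    print(r_int.shape, r_ext.shape)
```

Because the intrinsic term depends only on how well the agent can model its observations, it keeps providing an exploration signal even when the goal is specified in an unseen modality or the distance to it is hard to estimate.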
Related papers
- VLM-Guided Visual Place Recognition for Planet-Scale Geo-Localization [24.433604332415204]
We propose a novel hybrid geo-localization framework that combines the strengths of vision-language models and visual place recognition. We evaluate our approach on multiple geo-localization benchmarks and show that it consistently outperforms prior state-of-the-art methods.
arXiv Detail & Related papers (2025-07-23T12:23:03Z)
- Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework [59.42946541163632]
We introduce a comprehensive geolocation framework with three key components: GeoComp, a large-scale dataset; GeoCoT, a novel reasoning method; and GeoEval, an evaluation metric. We demonstrate that GeoCoT significantly boosts geolocation accuracy by up to 25% while enhancing interpretability.
arXiv Detail & Related papers (2025-02-19T14:21:25Z)
- FrontierNet: Learning Visual Cues to Explore [54.8265603996238]
This work aims to leverage 2D visual cues for efficient autonomous exploration, addressing the limitations of extracting goal poses from a 3D map. We propose a visual-only frontier-based exploration system, with FrontierNet as its core component. Our approach provides an alternative to existing 3D-dependent goal-extraction approaches, achieving a 15% improvement in early-stage exploration efficiency.
arXiv Detail & Related papers (2025-01-08T16:25:32Z)
- Exploring the Edges of Latent State Clusters for Goal-Conditioned Reinforcement Learning [6.266160051617362]
"Cluster Edge Exploration" (CE2) is a new goal-directed exploration algorithm that gives priority to goal states that remain accessible to the agent.
In challenging robotics environments, CE2 demonstrates superior exploration efficiency compared to baseline methods and ablations.
arXiv Detail & Related papers (2024-11-03T01:21:43Z)
- Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework [51.26566634946208]
We introduce smileGeo, a novel visual geo-localization framework.
Through inter-agent communication, smileGeo integrates the agents' inherent knowledge with additional retrieved information.
Results show that our approach significantly outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2024-08-21T03:31:30Z)
- GOMAA-Geo: GOal Modality Agnostic Active Geo-localization [49.599465495973654]
We consider the task of active geo-localization (AGL) in which an agent uses a sequence of visual cues observed during aerial navigation to find a target specified through multiple possible modalities.
GOMAA-Geo is a goal modality agnostic active geo-localization agent capable of zero-shot generalization between different goal modalities.
arXiv Detail & Related papers (2024-06-04T02:59:36Z)
- GeoLLM: Extracting Geospatial Knowledge from Large Language Models [49.20315582673223]
We present GeoLLM, a novel method that can effectively extract geospatial knowledge from large language models.
We demonstrate the utility of our approach across multiple tasks of central interest to the international community, including the measurement of population density and economic livelihoods.
Our experiments reveal that LLMs are remarkably sample-efficient, rich in geospatial information, and robust across the globe.
arXiv Detail & Related papers (2023-10-10T00:03:23Z)
- Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning [64.97599673479678]
We present HIerarchical reinforcement learning Guided by Landmarks (HIGL).
HIGL is a novel framework for training a high-level policy with a reduced action space guided by landmarks.
Our experiments demonstrate that our framework outperforms prior methods across a variety of control tasks.
arXiv Detail & Related papers (2021-10-26T12:16:19Z)
- Deep Reinforcement Learning for Adaptive Exploration of Unknown Environments [6.90777229452271]
We develop an adaptive exploration approach for UAVs that trades off exploration and exploitation in a single step.
The proposed approach uses a map segmentation technique to decompose the environment map into smaller, tractable maps.
The results demonstrate that our proposed approach can navigate randomly generated environments and cover more AoI in fewer time steps than the baselines.
arXiv Detail & Related papers (2021-05-04T16:29:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.