ActLoc: Learning to Localize on the Move via Active Viewpoint Selection
- URL: http://arxiv.org/abs/2508.20981v1
- Date: Thu, 28 Aug 2025 16:36:02 GMT
- Title: ActLoc: Learning to Localize on the Move via Active Viewpoint Selection
- Authors: Jiajie Li, Boyang Sun, Luca Di Giammarino, Hermann Blum, Marc Pollefeys
- Abstract summary: ActLoc is an active viewpoint-aware planning framework for enhancing localization accuracy in general robot navigation tasks. At its core, ActLoc employs a large-scale trained attention-based model for viewpoint selection. ActLoc achieves state-of-the-art results on single-viewpoint selection and generalizes effectively to full-trajectory planning.
- Score: 52.909507162638526
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reliable localization is critical for robot navigation, yet most existing systems implicitly assume that all viewing directions at a location are equally informative. In practice, localization becomes unreliable when the robot observes unmapped, ambiguous, or uninformative regions. To address this, we present ActLoc, an active viewpoint-aware planning framework for enhancing localization accuracy in general robot navigation tasks. At its core, ActLoc employs a large-scale trained attention-based model for viewpoint selection. The model encodes a metric map and the camera poses used during map construction, and predicts localization accuracy across yaw and pitch directions at arbitrary 3D locations. These per-point accuracy distributions are incorporated into a path planner, enabling the robot to actively select camera orientations that maximize localization robustness while respecting task and motion constraints. ActLoc achieves state-of-the-art results on single-viewpoint selection and generalizes effectively to full-trajectory planning. Its modular design makes it readily applicable to diverse robot navigation and inspection tasks.
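The abstract's final selection step can be sketched in a few lines: given a predicted localization-accuracy grid over discretized yaw/pitch bins at one 3D point, pick the camera orientation whose bin scores highest. This is a minimal illustration, not the paper's implementation; the bin counts, pitch range, and grid values below are assumptions.

```python
# Hypothetical sketch of selecting the best camera orientation from a
# per-point accuracy distribution over yaw/pitch bins (values illustrative).
import math

def best_viewpoint(accuracy_grid, n_yaw=8, n_pitch=3):
    """Return (yaw_rad, pitch_rad) of the highest-scoring bin.

    accuracy_grid: n_pitch rows, each with n_yaw predicted accuracy scores.
    Yaw bins cover [0, 2*pi); pitch bins cover [-pi/4, pi/4] here (assumed).
    """
    best = (-1.0, 0, 0)  # (score, pitch index, yaw index)
    for p in range(n_pitch):
        for y in range(n_yaw):
            if accuracy_grid[p][y] > best[0]:
                best = (accuracy_grid[p][y], p, y)
    _, p, y = best
    # Map bin indices to bin-center angles.
    yaw = 2 * math.pi * (y + 0.5) / n_yaw
    pitch = -math.pi / 4 + (math.pi / 2) * (p + 0.5) / n_pitch
    return yaw, pitch

# Toy grid: 3 pitch rows x 8 yaw bins; the peak sits at pitch row 1, yaw bin 2.
grid = [
    [0.2, 0.3, 0.4, 0.3, 0.2, 0.1, 0.1, 0.2],
    [0.3, 0.5, 0.9, 0.6, 0.3, 0.2, 0.2, 0.3],
    [0.1, 0.2, 0.3, 0.2, 0.1, 0.1, 0.1, 0.1],
]
yaw, pitch = best_viewpoint(grid)
```

A planner could then treat these per-point orientations as soft preferences, trading them off against task and motion constraints as the abstract describes.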
Related papers
- To Move or Not to Move: Constraint-based Planning Enables Zero-Shot Generalization for Interactive Navigation [14.745622942938532]
In real-world scenarios, such as home environments and warehouses, clutter can block all routes. We introduce the Lifelong Interactive Navigation problem, where a mobile robot can move clutter to forge its own path. We propose an LLM-driven, constraint-based planning framework with active perception.
arXiv Detail & Related papers (2026-02-23T17:10:00Z) - ReasonNavi: Human-Inspired Global Map Reasoning for Zero-Shot Embodied Navigation [53.95797153529148]
Embodied agents often struggle with efficient navigation because they rely primarily on partial egocentric observations. We introduce ReasonNavi, a human-inspired framework that operationalizes this reason-then-act paradigm by coupling Multimodal Large Language Models (MLLMs) with deterministic planners.
arXiv Detail & Related papers (2026-01-26T19:09:20Z) - Sight Over Site: Perception-Aware Reinforcement Learning for Efficient Robotic Inspection [57.37596278863949]
In this work, we revisit inspection from a perception-aware perspective. We propose an end-to-end reinforcement learning framework that explicitly incorporates target visibility as the primary objective. We show that our method outperforms existing classical and learning-based navigation approaches.
arXiv Detail & Related papers (2025-09-22T15:14:02Z) - TopV-Nav: Unlocking the Top-View Spatial Reasoning Potential of MLLM for Zero-shot Object Navigation [52.422619828854984]
We introduce TopV-Nav, an MLLM-based method that directly reasons on the top-view map with sufficient spatial information. To fully unlock the MLLM's spatial reasoning potential in the top-view perspective, we propose the Adaptive Visual Prompt Generation (AVPG) method.
arXiv Detail & Related papers (2024-11-25T14:27:55Z) - Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information [68.10033984296247]
This paper explores the domain of active localization, emphasizing the importance of viewpoint selection to enhance localization accuracy.
Our contributions involve using a data-driven approach with a simple architecture designed for real-time operation, a self-supervised data training method, and the capability to consistently integrate our map into a planning framework tailored for real-world robotics applications.
arXiv Detail & Related papers (2024-07-22T12:32:09Z) - Active Visual Localization for Multi-Agent Collaboration: A Data-Driven Approach [47.373245682678515]
This work investigates how active visual localization can be used to overcome challenges of viewpoint changes.
Specifically, we focus on the problem of selecting the optimal viewpoint at a given location.
The result demonstrates the superior performance of the data-driven approach when compared to existing methods.
arXiv Detail & Related papers (2023-10-04T08:18:30Z) - Deep Reinforcement Learning for Localizability-Enhanced Navigation in Dynamic Human Environments [16.25625435648576]
Reliable localization is crucial for autonomous robots to navigate efficiently and safely.
We propose a novel approach for localizability-enhanced navigation via deep reinforcement learning.
Our method exhibits significant improvements in lost rate and arrival rate when tested in previously unseen environments.
arXiv Detail & Related papers (2023-03-22T07:44:35Z) - ViKiNG: Vision-Based Kilometer-Scale Navigation with Geographic Hints [94.60414567852536]
Long-range navigation requires both planning and reasoning about local traversability.
We propose a learning-based approach that integrates learning and planning.
ViKiNG can leverage its image-based learned controller and goal-directed heuristic to navigate to goals up to 3 kilometers away.
arXiv Detail & Related papers (2022-02-23T02:14:23Z) - Deep Multi-Task Learning for Joint Localization, Perception, and Prediction [68.50217234419922]
This paper investigates the issues that arise in state-of-the-art autonomy stacks under localization error.
We design a system that jointly performs perception, prediction, and localization.
Our architecture is able to reuse computation between the tasks, and is thus able to correct localization errors efficiently.
arXiv Detail & Related papers (2021-01-17T17:20:31Z) - Visual Localization for Autonomous Driving: Mapping the Accurate Location in the City Maze [16.824901952766446]
We propose a novel feature voting technique for visual localization.
In our work, we craft the proposed feature voting method into three state-of-the-art visual localization networks.
Our approach can predict location robustly even in challenging inner-city settings.
arXiv Detail & Related papers (2020-08-13T03:59:34Z) - Spatial Action Maps for Mobile Manipulation [30.018835572458844]
We show that it can be advantageous to learn with dense action representations defined in the same domain as the state.
We present "spatial action maps," in which the set of possible actions is represented by a pixel map.
We find that policies learned with spatial action maps achieve much better performance than traditional alternatives.
arXiv Detail & Related papers (2020-04-20T09:06:10Z)
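The "spatial action maps" idea above can be illustrated with a short sketch: the policy outputs one value per pixel of an overhead state image, and the chosen action is the highest-valued pixel, interpreted as a target location in the same spatial frame as the state. This is a hedged toy example; the grid size, values, and function name are illustrative, not from the paper.

```python
# Toy sketch of acting from a pixel-map action representation: pick the
# argmax pixel of a 2D value map as the robot's movement target.

def select_action(q_map):
    """Return (row, col) of the max-value pixel in a 2D value map."""
    best_rc, best_q = (0, 0), q_map[0][0]
    for r, row in enumerate(q_map):
        for c, q in enumerate(row):
            if q > best_q:
                best_rc, best_q = (r, c), q
    return best_rc

# Illustrative 3x3 value map; the peak is the center pixel.
q = [
    [0.1, 0.2, 0.1],
    [0.3, 0.8, 0.2],
    [0.1, 0.4, 0.1],
]
target = select_action(q)  # pixel the robot should move toward
```

Because actions and states share one spatial domain, the policy network can be fully convolutional, which is part of why such dense representations learn efficiently.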
This list is automatically generated from the titles and abstracts of the papers in this site.