LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry
- URL: http://arxiv.org/abs/2512.19629v2
- Date: Tue, 23 Dec 2025 05:37:16 GMT
- Title: LoGoPlanner: Localization Grounded Navigation Policy with Metric-aware Visual Geometry
- Authors: Jiaqi Peng, Wenzhe Cai, Yuqiang Yang, Tai Wang, Yuan Shen, Jiangmiao Pang,
- Abstract summary: Trajectory planning in unstructured environments is a fundamental and challenging capability for mobile robots. We introduce LoGoPlanner, a localization-grounded, end-to-end navigation framework. We evaluate LoGoPlanner in both simulation and real-world settings, where its fully end-to-end design reduces cumulative error.
- Score: 41.054069737969876
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trajectory planning in unstructured environments is a fundamental and challenging capability for mobile robots. Traditional modular pipelines suffer from latency and cascading errors across perception, localization, mapping, and planning modules. Recent end-to-end learning methods map raw visual observations directly to control signals or trajectories, promising greater performance and efficiency in open-world settings. However, most prior end-to-end approaches still rely on separate localization modules that depend on accurate sensor extrinsic calibration for self-state estimation, thereby limiting generalization across embodiments and environments. We introduce LoGoPlanner, a localization-grounded, end-to-end navigation framework that addresses these limitations by: (1) finetuning a long-horizon visual-geometry backbone to ground predictions with absolute metric scale, thereby providing implicit state estimation for accurate localization; (2) reconstructing surrounding scene geometry from historical observations to supply dense, fine-grained environmental awareness for reliable obstacle avoidance; and (3) conditioning the policy on implicit geometry bootstrapped by the aforementioned auxiliary tasks, thereby reducing error propagation. We evaluate LoGoPlanner in both simulation and real-world settings, where its fully end-to-end design reduces cumulative error while metric-aware geometry memory enhances planning consistency and obstacle avoidance, leading to more than a 27.3% improvement over oracle-localization baselines and strong generalization across embodiments and environments. The code and models have been made publicly available at https://steinate.github.io/logoplanner.github.io.
Related papers
- Fly0: Decoupling Semantic Grounding from Geometric Planning for Zero-Shot Aerial Navigation [14.466092698477858]
Current Visual-Language Navigation (VLN) methodologies face a trade-off between semantic understanding and control precision. We propose Fly0, a framework that decouples semantic reasoning from geometric planning. Fly0 reduces computational overhead and improves system stability.
arXiv Detail & Related papers (2026-02-02T09:06:50Z)
- OpenNavMap: Structure-Free Topometric Mapping via Large-Scale Collaborative Localization [12.686154192361913]
OpenNavMap is a lightweight, structure-free topometric system leveraging 3D geometric foundation models for on-demand reconstruction. Our method unifies dynamic programming-based sequence matching, geometric verification, and confidence-calibrated optimization for robust, coarse-to-fine submap alignment. Evaluations on the Map-Free benchmark demonstrate superior accuracy over structure-from-motion and regression baselines, achieving an average translation error of 0.62 m.
arXiv Detail & Related papers (2026-01-18T07:24:46Z)
- TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction [57.46712611558817]
3D vision foundation models have shown strong generalization in reconstructing key 3D attributes from uncalibrated images through a single feed-forward pass. Recent strategies align consecutive predictions by solving global transformation, yet our analysis reveals their fundamental limitations in assumption validity, local alignment scope, and robustness under noisy geometry. We propose a higher-DOF and long-term alignment framework based on Thin Plate Spline, leveraging globally propagated control points to correct spatially varying inconsistencies.
arXiv Detail & Related papers (2025-12-02T02:22:20Z)
- MetricNet: Recovering Metric Scale in Generative Navigation Policies [51.90872764552077]
MetricNet is an effective add-on for generative navigation that predicts the metric distance between waypoints. We show that executing MetricNet-scaled waypoints significantly improves both navigation and exploration performance. We also propose MetricNav, which integrates MetricNet into a navigation policy to guide the robot away from obstacles while still moving towards the goal.
arXiv Detail & Related papers (2025-09-17T13:37:13Z)
- TANGO: Traversability-Aware Navigation with Local Metric Control for Topological Goals [10.69725316052444]
We present a novel RGB-only, object-level topometric navigation pipeline that enables zero-shot, long-horizon robot navigation. Our approach integrates global topological path planning with local metric trajectory control, allowing the robot to navigate towards object-level sub-goals while avoiding obstacles. We demonstrate the effectiveness of our method in both simulated environments and real-world tests, highlighting its robustness and deployability.
arXiv Detail & Related papers (2025-09-10T15:43:32Z)
- Unified Linear Parametric Map Modeling and Perception-aware Trajectory Planning for Mobile Robotics [1.7495208770207367]
We introduce a lightweight linear parametric map by first mapping data to a high-dimensional space, followed by a sparse random projection for dimensionality reduction. For UAVs, our method unifies grid and Euclidean Signed Distance Field (ESDF) maps. For UGVs, the model characterizes terrain and provides closed-form gradients, enabling online planning to circumvent large holes.
arXiv Detail & Related papers (2025-07-12T16:39:19Z)
- NavTopo: Leveraging Topological Maps For Autonomous Navigation Of a Mobile Robot [1.0550841723235613]
We propose a full navigation pipeline based on topological map and two-level path planning.
The pipeline localizes in the graph by matching neural network descriptors and 2D projections of the input point clouds.
We test our approach in a large indoor photorealistic simulated environment and compare it to a metric map-based approach built on the popular metric mapping method RTAB-Map.
arXiv Detail & Related papers (2024-10-15T10:54:49Z)
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
- Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive? [93.10694819127608]
We propose a unified evaluation pipeline for forecasting methods with real-world perception inputs.
Our in-depth study uncovers a substantial performance gap when transitioning from curated to perception-based data.
arXiv Detail & Related papers (2023-06-15T17:03:14Z)
- Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation [87.03299519917019]
We propose a dual-scale graph transformer (DUET) for joint long-term action planning and fine-grained cross-modal understanding.
We build a topological map on-the-fly to enable efficient exploration in global action space.
The proposed approach, DUET, significantly outperforms state-of-the-art methods on goal-oriented vision-and-language navigation benchmarks.
arXiv Detail & Related papers (2022-02-23T19:06:53Z)
- Lightweight Object-level Topological Semantic Mapping and Long-term Global Localization based on Graph Matching [19.706907816202946]
We present a novel lightweight object-level mapping and localization method with high accuracy and robustness.
We use object-level features with both semantic and geometric information to model landmarks in the environment.
Based on the proposed map, the robust localization is achieved by constructing a novel local semantic scene graph descriptor.
arXiv Detail & Related papers (2022-01-16T05:47:07Z)
- Differentiable Spatial Planning using Transformers [87.90709874369192]
We propose Spatial Planning Transformers (SPT), which given an obstacle map learns to generate actions by planning over long-range spatial dependencies.
In the setting where the ground truth map is not known to the agent, we leverage pre-trained SPTs in an end-to-end framework.
SPTs outperform prior state-of-the-art differentiable planners across all the setups for both manipulation and navigation tasks.
arXiv Detail & Related papers (2021-12-02T06:48:16Z)
- Sign-Agnostic CONet: Learning Implicit Surface Reconstructions by Sign-Agnostic Optimization of Convolutional Occupancy Networks [39.65056638604885]
We learn implicit surface reconstruction by sign-agnostic optimization of convolutional occupancy networks.
We show this goal can be effectively achieved by a simple yet effective design.
arXiv Detail & Related papers (2021-05-08T03:35:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.