NaviHydra: Controllable Navigation-guided End-to-end Autonomous Driving with Hydra-distillation
- URL: http://arxiv.org/abs/2512.10660v1
- Date: Thu, 11 Dec 2025 14:05:18 GMT
- Title: NaviHydra: Controllable Navigation-guided End-to-end Autonomous Driving with Hydra-distillation
- Authors: Hanfeng Wu, Marlon Steiner, Michael Schmidt, Alvaro Marcos-Ramiro, Christoph Stiller
- Abstract summary: NaviHydra is a controllable navigation-guided end-to-end model distilled from an existing rule-based simulator. Our framework accepts high-level navigation commands as control signals, generating trajectories that align with specified intentions.
- Score: 6.722962084265695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The complexity of autonomous driving scenarios requires robust models that can interpret high-level navigation commands and generate safe trajectories. While traditional rule-based systems can react to these commands, they often struggle in dynamic environments, and end-to-end methods face challenges in complying with explicit navigation commands. To address this, we present NaviHydra, a controllable navigation-guided end-to-end model distilled from an existing rule-based simulator. Our framework accepts high-level navigation commands as control signals, generating trajectories that align with specified intentions. We utilize a Bird's Eye View (BEV) based trajectory gathering method to enhance trajectory feature extraction. Additionally, we introduce a novel navigation compliance metric to evaluate adherence to the intended route, improving controllability and navigation safety. To comprehensively assess our model's controllability, we design a test that evaluates its response to various navigation commands. Our method significantly outperforms baseline models, achieving state-of-the-art results on the NAVSIM benchmark, demonstrating its effectiveness in advancing autonomous driving.
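The abstract does not specify how the navigation compliance metric is computed. A minimal sketch of one plausible formulation (hypothetical, not the paper's actual definition) measures the fraction of predicted trajectory points that stay within a lateral tolerance of the commanded route:

```python
import numpy as np

def navigation_compliance(trajectory, route, tol=2.0):
    """Fraction of trajectory points within `tol` meters of the
    commanded route waypoints (illustrative formulation only)."""
    traj = np.asarray(trajectory, dtype=float)  # (T, 2) ego positions
    ref = np.asarray(route, dtype=float)        # (N, 2) route waypoints
    # Distance from each trajectory point to its nearest route waypoint.
    d = np.linalg.norm(traj[:, None, :] - ref[None, :, :], axis=-1).min(axis=1)
    return float((d <= tol).mean())

route = [(x, 0.0) for x in range(10)]
on_route = [(x, 0.1) for x in range(10)]   # follows the route
off_route = [(x, 5.0) for x in range(10)]  # drifts off the route
print(navigation_compliance(on_route, route))   # 1.0
print(navigation_compliance(off_route, route))  # 0.0
```

A point-to-waypoint distance is a crude stand-in for distance to the route polyline, but it keeps the sketch short; a real metric would likely also weight heading agreement.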
Related papers
- OpenFrontier: General Navigation with Visual-Language Grounded Frontiers [54.661157616245966]
Open-world navigation requires robots to make decisions in complex everyday environments. Recent advances in vision-language navigation (VLN) and vision-language-action (VLA) models enable end-to-end policies conditioned on natural language. We propose OpenFrontier, a training-free navigation framework that seamlessly integrates diverse vision-language prior models.
arXiv Detail & Related papers (2026-03-05T17:02:22Z) - NavigScene: Bridging Local Perception and Global Navigation for Beyond-Visual-Range Autonomous Driving [10.597463021650382]
NavigScene is an auxiliary navigation-guided natural language dataset that simulates a human-like driving environment within autonomous driving systems. We develop three paradigms to leverage NavigScene: (1) Navigation-guided Reasoning, which enhances vision-language models by incorporating navigation context into the prompting approach; (2) Navigation-guided Preference Optimization, a reinforcement learning method that extends Direct Preference Optimization to improve vision-language model responses; and (3) Navigation-guided Vision-Language-Action model, which integrates navigation guidance and vision-language models with conventional driving models through feature fusion.
arXiv Detail & Related papers (2025-07-07T17:37:01Z) - Human-like Semantic Navigation for Autonomous Driving using Knowledge Representation and Large Language Models [11.045662497663514]
Large Language Models are used to generate Answer Set Programming rules. ASP provides non-monotonic reasoning, allowing autonomous vehicles to adapt to evolving scenarios. Results show that LLM-driven ASP rule generation supports semantic-based decision-making.
arXiv Detail & Related papers (2025-05-22T10:32:43Z) - Designing Control Barrier Function via Probabilistic Enumeration for Safe Reinforcement Learning Navigation [55.02966123945644]
We propose a hierarchical control framework leveraging neural network verification techniques to design control barrier functions (CBFs) and policy correction mechanisms. Our approach relies on probabilistic enumeration to identify unsafe regions of operation, which are then used to construct a safe CBF-based control layer. Experiments demonstrate the ability of the proposed solution to correct unsafe actions while preserving efficient navigation behavior.
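The abstract describes building a CBF-based control layer over enumerated unsafe regions. A toy sketch of such a safety filter, assuming single-integrator dynamics and disk-shaped unsafe regions (both assumptions of this sketch, not details from the paper):

```python
import numpy as np

def cbf_filter(x, u_nominal, obstacles, alpha=1.0):
    """Minimal CBF-style safety filter for single-integrator dynamics.

    For each disk obstacle (center c, radius r), define
    h(x) = ||x - c||^2 - r^2 and enforce  grad_h(x) . u >= -alpha * h(x)
    by adding the smallest correction along grad_h when violated.
    Disk obstacles stand in for the paper's enumerated unsafe regions.
    """
    x = np.asarray(x, dtype=float)
    u = np.asarray(u_nominal, dtype=float)
    for c, r in obstacles:
        c = np.asarray(c, dtype=float)
        g = 2.0 * (x - c)                          # gradient of h at x
        h = float(np.dot(x - c, x - c) - r * r)    # barrier value
        slack = float(np.dot(g, u)) + alpha * h
        if slack < 0.0:  # constraint violated: project u back onto it
            u = u - slack * g / float(np.dot(g, g))
    return u

x = np.array([2.0, 0.0])
obstacles = [((0.0, 0.0), 1.0)]
u_safe = cbf_filter(x, [-5.0, 0.0], obstacles)  # nominal action heads at disk
print(u_safe)  # [-0.75  0.  ] -- corrected to satisfy the CBF condition
```

Processing constraints one at a time is a simplification; a full implementation would solve a small quadratic program over all active constraints at once.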
arXiv Detail & Related papers (2025-04-30T13:47:25Z) - Navigation World Models [68.58459393846461]
We introduce Navigation World Model (NWM), a controllable video generation model that predicts future visual observations based on past observations and navigation actions. In familiar environments, NWM can plan navigation trajectories by simulating them and evaluating whether they achieve the desired goal. Experiments demonstrate its effectiveness in planning trajectories from scratch or by ranking trajectories sampled from an external policy.
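The planning-by-simulation idea above reduces to rolling candidate action sequences through the world model and ranking them by goal achievement. A minimal sketch with a stand-in dynamics model in place of the learned video model (the `simulate` function here is a hypothetical toy, not the paper's model):

```python
import numpy as np

def rank_trajectories(x0, goal, candidates, simulate):
    """Roll each candidate action sequence through the world model and
    return the one whose predicted final state is closest to the goal."""
    scored = [(np.linalg.norm(simulate(x0, a) - goal), a) for a in candidates]
    scored.sort(key=lambda s: s[0])
    return scored[0][1]

# Toy single-integrator "world model" standing in for the learned one.
def simulate(x, actions):
    return np.asarray(x, dtype=float) + np.sum(np.asarray(actions, dtype=float), axis=0)

x0, goal = np.zeros(2), np.array([3.0, 0.0])
candidates = [
    [(1.0, 0.0)] * 3,  # ends at (3, 0): reaches the goal
    [(0.0, 1.0)] * 3,  # ends at (0, 3)
]
best = rank_trajectories(x0, goal, candidates, simulate)
print(best)  # [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]
```

The same ranking loop covers both modes in the abstract: candidates can be enumerated from scratch or sampled from an external policy.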
arXiv Detail & Related papers (2024-12-04T18:59:45Z) - Angle Robustness Unmanned Aerial Vehicle Navigation in GNSS-Denied Scenarios [66.05091704671503]
We present a novel angle navigation paradigm to deal with flight deviation in point-to-point navigation tasks.
We also propose a model that includes the Adaptive Feature Enhance Module, Cross-knowledge Attention-guided Module and Robust Task-oriented Head Module.
arXiv Detail & Related papers (2024-02-04T08:41:20Z) - Confidence-Controlled Exploration: Efficient Sparse-Reward Policy Learning for Robot Navigation [72.24964965882783]
Reinforcement learning (RL) is a promising approach for robotic navigation, allowing robots to learn through trial and error. Real-world robotic tasks often suffer from sparse rewards, leading to inefficient exploration and suboptimal policies. We introduce Confidence-Controlled Exploration (CCE), a novel method that improves sample efficiency in RL-based robotic navigation without modifying the reward function.
arXiv Detail & Related papers (2023-06-09T18:45:15Z) - ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments [56.194988818341976]
Vision-language navigation is a task that requires an agent to follow instructions to navigate in environments.
We propose ETPNav, which focuses on two critical skills: 1) the capability to abstract environments and generate long-range navigation plans, and 2) the ability of obstacle-avoiding control in continuous environments.
ETPNav yields more than 10% and 20% improvements over prior state-of-the-art on R2R-CE and RxR-CE datasets.
arXiv Detail & Related papers (2023-04-06T13:07:17Z) - Control Transformer: Robot Navigation in Unknown Environments through PRM-Guided Return-Conditioned Sequence Modeling [0.0]
We propose Control Transformer that models return-conditioned sequences from low-level policies guided by a sampling-based Probabilistic Roadmap planner.
We show that Control Transformer can successfully navigate through mazes and transfer to unknown environments.
arXiv Detail & Related papers (2022-11-11T18:44:41Z) - Online No-regret Model-Based Meta RL for Personalized Navigation [37.82017324353145]
We propose an online no-regret model-based RL method that quickly conforms to the dynamics of the current user.
Our theoretical analysis shows that our method is a no-regret algorithm and we provide the convergence rate in the agnostic setting.
Our empirical analysis with 60+ hours of real-world user data shows that our method can reduce the number of collisions by more than 60%.
arXiv Detail & Related papers (2022-04-05T01:28:06Z) - Topological Planning with Transformers for Vision-and-Language Navigation [31.64229792521241]
We propose a modular approach to vision-and-language navigation (VLN) using topological maps.
Given a natural language instruction and topological map, our approach leverages attention mechanisms to predict a navigation plan in the map.
Experiments show that our method outperforms previous end-to-end approaches, generates interpretable navigation plans, and exhibits intelligent behaviors such as backtracking.
arXiv Detail & Related papers (2020-12-09T20:02:03Z) - APPLD: Adaptive Planner Parameter Learning from Demonstration [48.63930323392909]
We introduce APPLD, Adaptive Planner Parameter Learning from Demonstration, which allows existing navigation systems to be successfully applied to new complex environments.
APPLD is verified on two robots running different navigation systems in different environments.
Experimental results show that APPLD can outperform navigation systems with the default and expert-tuned parameters, and even the human demonstrator themselves.
arXiv Detail & Related papers (2020-03-31T21:15:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.