NaviDiffusor: Cost-Guided Diffusion Model for Visual Navigation
- URL: http://arxiv.org/abs/2504.10003v1
- Date: Mon, 14 Apr 2025 09:06:02 GMT
- Title: NaviDiffusor: Cost-Guided Diffusion Model for Visual Navigation
- Authors: Yiming Zeng, Hao Ren, Shuhang Wang, Junlong Huang, Hui Cheng
- Abstract summary: We propose a hybrid approach that combines the strengths of learning-based methods and classical approaches for visual navigation. Our method first trains a conditional diffusion model on diverse path-RGB observation pairs. During inference, it integrates the gradients of differentiable scene-specific and task-level costs, guiding the diffusion model to generate valid paths that meet the constraints.
- Score: 18.542828322750996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual navigation, a fundamental challenge in mobile robotics, demands versatile policies to handle diverse environments. Classical methods leverage geometric solutions to minimize specific costs, offering adaptability to new scenarios, but are prone to system errors due to their multi-modular design and reliance on hand-crafted rules. Learning-based methods, while achieving high planning success rates, face difficulties in generalizing to unseen environments beyond the training data and often require extensive training. To address these limitations, we propose a hybrid approach that combines the strengths of learning-based methods and classical approaches for RGB-only visual navigation. Our method first trains a conditional diffusion model on diverse path-RGB observation pairs. During inference, it integrates the gradients of differentiable scene-specific and task-level costs, guiding the diffusion model to generate valid paths that meet the constraints. This approach alleviates the need for retraining, offering a plug-and-play solution. Extensive experiments in both indoor and outdoor settings, across simulated and real-world scenarios, demonstrate the zero-shot transfer capability of our approach, which achieves higher success rates and fewer collisions than baseline methods. Code will be released at https://github.com/SYSU-RoboticsLab/NaviD.
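The inference-time idea in the abstract (steering a pretrained diffusion model with the gradient of a differentiable cost) can be illustrated with a small toy sketch. This is not the authors' implementation. It assumes a DDPM-style reverse process and replaces the trained, RGB-conditioned denoiser with a closed-form noise predictor that is exact for a Gaussian distribution around a straight-line path. The obstacle, the cost function, the noise schedule, the guidance scale, and all names (`obstacle_cost`, `sample_paths`, and so on) are hypothetical choices made for illustration only.

```python
import numpy as np

# Toy setup: every value below is an illustrative assumption, not from the paper.
PRIOR_PATH = np.stack([np.linspace(0.0, 4.0, 9), np.zeros(9)], axis=1)  # (9, 2) waypoints
PRIOR_STD = 0.3                          # spread of the toy "training" paths
OBSTACLE_CENTER = np.array([2.0, 0.1])   # hypothetical circular obstacle
OBSTACLE_RADIUS = 0.6


def obstacle_cost(paths):
    """Squared penetration depth summed over waypoints; paths has shape (B, T, 2)."""
    dist = np.linalg.norm(paths - OBSTACLE_CENTER, axis=-1)
    pen = np.maximum(0.0, OBSTACLE_RADIUS - dist)
    return (pen ** 2).sum(axis=-1)


def obstacle_cost_grad(paths):
    """Analytic gradient of obstacle_cost with respect to every waypoint."""
    offset = paths - OBSTACLE_CENTER
    dist = np.maximum(np.linalg.norm(offset, axis=-1, keepdims=True), 1e-9)
    pen = np.maximum(0.0, OBSTACLE_RADIUS - dist)
    return -2.0 * pen * offset / dist


def sample_paths(guidance_scale, batch=128, steps=200, seed=0):
    """DDPM-style ancestral sampling with optional cost-gradient guidance."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.05, steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    x = rng.standard_normal((batch,) + PRIOR_PATH.shape)
    for t in reversed(range(steps)):
        ab = alpha_bars[t]
        # Stand-in for a trained noise-prediction network: this closed form is
        # exact when clean paths follow N(PRIOR_PATH, PRIOR_STD^2 I).
        var = ab * PRIOR_STD ** 2 + 1.0 - ab
        eps = np.sqrt(1.0 - ab) * (x - np.sqrt(ab) * PRIOR_PATH) / var
        # Estimate of the clean path, used as the point where the cost is evaluated.
        x0_hat = (x - np.sqrt(1.0 - ab) * eps) / np.sqrt(ab)
        # Guidance: shift the predicted noise along the cost gradient so the
        # reverse step moves toward lower cost.
        eps = eps + guidance_scale * np.sqrt(1.0 - ab) * obstacle_cost_grad(x0_hat)
        mean = (x - betas[t] / np.sqrt(1.0 - ab) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x


if __name__ == "__main__":
    for scale in (0.0, 20.0):
        mean_cost = obstacle_cost(sample_paths(scale)).mean()
        print(f"guidance_scale={scale}: mean obstacle cost = {mean_cost:.4f}")
```

Setting `guidance_scale=0.0` recovers plain sampling from the toy prior, so the same loop can be run with and without guidance to compare obstacle costs. Because the guidance term is added only at sampling time, the stand-in denoiser is never changed, which mirrors the plug-and-play property the abstract describes.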
Related papers
- Leveraging Constraint Violation Signals For Action-Constrained Reinforcement Learning [13.332006760984122]
Action-Constrained Reinforcement Learning (ACRL) employs a projection layer after the policy network to correct the action. Recent methods were proposed to train generative models to learn a differentiable mapping between latent variables and feasible actions.
arXiv Detail & Related papers (2025-02-08T12:58:26Z) - Training-free Quantum-Inspired Image Edge Extraction Method [4.8188571652305185]
We propose a training-free, quantum-inspired edge detection model. Our approach integrates classical Sobel edge detection, the Schrödinger wave equation refinement, and a hybrid framework. By eliminating the need for training, the model is lightweight and adaptable to diverse applications.
arXiv Detail & Related papers (2025-01-31T07:24:38Z) - Generalizable Spacecraft Trajectory Generation via Multimodal Learning with Transformers [14.176630393074149]
We present a novel trajectory generation framework that generalizes across diverse problem configurations.
We leverage high-capacity transformer neural networks capable of learning from data sources.
The framework is validated through simulations and experiments on a free-flyer platform.
arXiv Detail & Related papers (2024-10-15T15:55:42Z) - NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z) - Re-Evaluating LiDAR Scene Flow for Autonomous Driving [80.37947791534985]
Popular benchmarks for self-supervised LiDAR scene flow have unrealistic rates of dynamic motion, unrealistic correspondences, and unrealistic sampling patterns.
We evaluate a suite of top methods on a suite of real-world datasets.
We show that despite the emphasis placed on learning, most performance gains are caused by pre- and post-processing steps.
arXiv Detail & Related papers (2023-04-04T22:45:50Z) - Experimental study of Neural ODE training with adaptive solver for
dynamical systems modeling [72.84259710412293]
Adaptive ODE solvers can adjust their evaluation strategy to the complexity of the problem at hand.
This paper describes a simple set of experiments to show why adaptive solvers cannot be seamlessly leveraged as a black-box for dynamical systems modelling.
arXiv Detail & Related papers (2022-11-13T17:48:04Z) - Learning Control Admissibility Models with Graph Neural Networks for
Multi-Agent Navigation [9.05607520128194]
Control admissibility models (CAMs) can be easily composed and used for online inference for an arbitrary number of agents.
We show that the CAM models can be trained in environments with only a few agents and be easily composed for deployment in dense environments with hundreds of agents, achieving better performance than state-of-the-art methods.
arXiv Detail & Related papers (2022-10-17T19:20:58Z) - Adaptive Decision Making at the Intersection for Autonomous Vehicles
Based on Skill Discovery [13.134487965031667]
In urban environments, the complex and uncertain intersection scenarios are challenging for autonomous driving.
To ensure safety, it is crucial to develop an adaptive decision making system that can handle the interaction with other vehicles.
We propose a hierarchical framework that can autonomously accumulate and reuse knowledge.
arXiv Detail & Related papers (2022-07-24T11:56:45Z) - Switchable Representation Learning Framework with Self-compatibility [50.48336074436792]
We propose a Switchable representation learning Framework with Self-Compatibility (SFSC).
SFSC generates a series of compatible sub-models with different capacities through one training process.
SFSC achieves state-of-the-art performance on the evaluated datasets.
arXiv Detail & Related papers (2022-06-16T16:46:32Z) - Visual-Language Navigation Pretraining via Prompt-based Environmental
Self-exploration [83.96729205383501]
We introduce prompt-based learning to achieve fast adaptation for language embeddings.
Our model can adapt to diverse vision-language navigation tasks, including VLN and REVERIE.
arXiv Detail & Related papers (2022-03-08T11:01:24Z) - IQ-Learn: Inverse soft-Q Learning for Imitation [95.06031307730245]
Imitation learning from a small amount of expert data can be challenging in high-dimensional environments with complex dynamics.
Behavioral cloning is widely used because it is simple to implement and converges stably.
We introduce a method for dynamics-aware IL which avoids adversarial training by learning a single Q-function.
arXiv Detail & Related papers (2021-06-23T03:43:10Z) - Scalable Bayesian Inverse Reinforcement Learning [93.27920030279586]
We introduce Approximate Variational Reward Imitation Learning (AVRIL).
Our method addresses the ill-posed nature of the inverse reinforcement learning problem.
Applying our method to real medical data alongside classic control simulations, we demonstrate Bayesian reward inference in environments beyond the scope of current methods.
arXiv Detail & Related papers (2021-02-12T12:32:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.