Grid2Guide: A* Enabled Small Language Model for Indoor Navigation
- URL: http://arxiv.org/abs/2508.08100v2
- Date: Fri, 29 Aug 2025 20:09:29 GMT
- Title: Grid2Guide: A* Enabled Small Language Model for Indoor Navigation
- Authors: Md. Wasiul Haque, Sagar Dasgupta, Mizanur Rahman
- Abstract summary: This research presents a hybrid navigation framework that combines the A* search algorithm with a Small Language Model (SLM) to generate clear, human-readable route instructions. The results validate the proposed approach as a lightweight, infrastructure-free solution for real-time indoor navigation support.
- Score: 6.341317643879287
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reliable indoor navigation remains a significant challenge in complex environments, particularly where external positioning signals and dedicated infrastructures are unavailable. This research presents Grid2Guide, a hybrid navigation framework that combines the A* search algorithm with a Small Language Model (SLM) to generate clear, human-readable route instructions. The framework first constructs a binary occupancy matrix from a given indoor map. Using this matrix, the A* algorithm computes the optimal path between origin and destination, producing concise textual navigation steps. These steps are then transformed into natural language instructions by the SLM, enhancing interpretability for end users. Experimental evaluations across various indoor scenarios demonstrate the method's effectiveness in producing accurate and timely navigation guidance. The results validate the proposed approach as a lightweight, infrastructure-free solution for real-time indoor navigation support.
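The first two stages of this pipeline are straightforward to prototype. Below is a minimal sketch, assuming a 4-connected grid with unit move costs and a Manhattan-distance heuristic; the paper does not publish its exact cost model, step format, or prompt, so the grid encoding, the step-compression scheme, and all function names here are illustrative assumptions, not the authors' implementation. The final SLM rewriting stage is represented only by the prompt that would be fed to the model:

```python
import heapq

def astar(grid, start, goal):
    """A* over a binary occupancy grid (0 = free, 1 = blocked),
    4-connected moves, unit costs, Manhattan heuristic (admissible here)."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0, start, None)]   # (f, g, cell, parent)
    parent = {}                               # also serves as the closed set
    while frontier:
        _, g, cur, prev = heapq.heappop(frontier)
        if cur in parent:
            continue                          # already expanded more cheaply
        parent[cur] = prev
        if cur == goal:                       # reconstruct start -> goal path
            path = []
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        r, c = cur
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 0 and nxt not in parent):
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, cur))
    return None                               # no route exists

def path_to_steps(path):
    """Compress a cell path into concise textual steps such as 'move east 2'."""
    name = {(1, 0): "south", (-1, 0): "north", (0, 1): "east", (0, -1): "west"}
    steps, i = [], 0
    while i < len(path) - 1:
        d = (path[i + 1][0] - path[i][0], path[i + 1][1] - path[i][1])
        n = 1                                 # run length of this direction
        while (i + n < len(path) - 1 and
               (path[i + n + 1][0] - path[i + n][0],
                path[i + n + 1][1] - path[i + n][1]) == d):
            n += 1
        steps.append(f"move {name[d]} {n} cell(s)")
        i += n
    return steps

grid = [[0, 0, 0, 1],                         # toy 3x4 floor plan
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
steps = path_to_steps(astar(grid, (0, 0), (2, 3)))
print(steps)  # ['move east 2 cell(s)', 'move south 2 cell(s)', 'move east 1 cell(s)']
prompt = "Rewrite these route steps as friendly directions: " + "; ".join(steps)
# `prompt` would then go to the SLM for natural-language rewriting.
```

Everything above runs on-device with no positioning infrastructure, which is what makes the overall approach lightweight; only the last rewriting step requires the SLM.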
Related papers
- A Reliable Indoor Navigation System for Humans Using AR-based Technique [0.0]
An AR-based technique has been applied to campus and small-site navigation, where Vuforia Area Target is used for environment modeling. Compared to Dijkstra's algorithm, the system's pathfinding reaches a solution about two to three times faster for smaller search spaces (an illustrative sketch of this effect appears after this list). Results show that AR technology integrated with existing pathfinding algorithms is feasible and scalable.
arXiv Detail & Related papers (2026-02-27T06:18:49Z) - ReasonNavi: Human-Inspired Global Map Reasoning for Zero-Shot Embodied Navigation [53.95797153529148]
Embodied agents often struggle with efficient navigation because they rely primarily on partial egocentric observations. We introduce ReasonNavi, a human-inspired framework that operationalizes a reason-then-act paradigm by coupling Multimodal Large Language Models (MLLMs) with deterministic planners.
arXiv Detail & Related papers (2026-01-26T19:09:20Z) - DAgger Diffusion Navigation: DAgger Boosted Diffusion Policy for Vision-Language Navigation [73.80968452950854]
Vision-Language Navigation in Continuous Environments (VLN-CE) requires agents to follow natural language instructions through free-form 3D spaces. Existing VLN-CE approaches typically use a two-stage waypoint planning framework. We propose DAgger Diffusion Navigation (DifNav) as an end-to-end optimized VLN-CE policy.
arXiv Detail & Related papers (2025-08-13T02:51:43Z) - NavComposer: Composing Language Instructions for Navigation Trajectories through Action-Scene-Object Modularization [17.525269369227786]
We propose NavComposer, a framework for automatically generating high-quality navigation instructions. NavComposer explicitly decomposes semantic entities such as actions, scenes, and objects, and recomposes them into natural language instructions. It operates in a data-agnostic manner, supporting adaptation to diverse navigation trajectories without domain-specific training. A companion evaluation system, NavInstrCritic, provides a holistic assessment of instruction quality, addressing the limitations of traditional metrics that rely heavily on expert annotations.
arXiv Detail & Related papers (2025-07-15T01:20:22Z) - LLM-Guided Indoor Navigation with Multimodal Map Understanding [1.5325823985727567]
We explore the potential of a Large Language Model (LLM), i.e., ChatGPT, to generate context-aware navigation instructions from indoor map images. Our findings demonstrate the potential of LLMs for supporting personalized indoor navigation, with an average of 86.59% correct indications and a maximum of 97.14%. These results have key implications for AI-driven navigation and assistive technologies.
arXiv Detail & Related papers (2025-03-12T09:32:43Z) - Navigation-GPT: A Robust and Adaptive Framework Utilizing Large Language Models for Navigation Applications [6.990141986853289]
Existing navigation decision support systems often perform poorly when handling non-predefined scenarios. This research proposes a dual-core framework for LLM applications to address this issue.
arXiv Detail & Related papers (2025-02-23T01:41:58Z) - NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning [97.88246428240872]
Vision-and-Language Navigation (VLN), as a crucial research problem of Embodied AI, requires an embodied agent to navigate through complex 3D environments following natural language instructions. Recent research has highlighted the promising capacity of large language models (LLMs) in VLN by improving navigational reasoning accuracy and interpretability. This paper introduces a novel strategy called Navigational Chain-of-Thought (NavCoT), which performs parameter-efficient in-domain training to enable self-guided navigational decision-making.
arXiv Detail & Related papers (2024-03-12T07:27:02Z) - ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments [56.194988818341976]
Vision-language navigation is a task that requires an agent to follow instructions to navigate in environments.
We propose ETPNav, which focuses on two critical skills: 1) the capability to abstract environments and generate long-range navigation plans, and 2) the ability to perform obstacle-avoiding control in continuous environments.
ETPNav yields more than 10% and 20% improvements over the prior state of the art on the R2R-CE and RxR-CE datasets, respectively.
arXiv Detail & Related papers (2023-04-06T13:07:17Z) - Visual-Language Navigation Pretraining via Prompt-based Environmental Self-exploration [83.96729205383501]
We introduce prompt-based learning to achieve fast adaptation for language embeddings.
Our model can adapt to diverse vision-language navigation tasks, including VLN and REVERIE.
arXiv Detail & Related papers (2022-03-08T11:01:24Z) - Find a Way Forward: a Language-Guided Semantic Map Navigator [53.69229615952205]
This paper attacks the problem of language-guided navigation from a new perspective.
We use novel semantic navigation maps, which enable robots to carry out natural language instructions and move to a target position based on the map observations.
The proposed approach achieves noticeable performance gains, especially in long-distance navigation cases.
arXiv Detail & Related papers (2022-03-07T07:40:33Z) - Unsupervised Domain Adaptation for Visual Navigation [115.85181329193092]
We propose an unsupervised domain adaptation method for visual navigation.
Our method translates the images in the target domain to the source domain such that the translation is consistent with the representations learned by the navigation policy.
arXiv Detail & Related papers (2020-10-27T18:22:43Z)
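On the Dijkstra comparison noted in the first related paper above: A* with a zero heuristic is exactly Dijkstra's algorithm, so any speed gap comes from how many cells the heuristic lets the search skip. Below is a minimal, hypothetical micro-benchmark of that effect, assuming an open grid, unit costs, a Manhattan heuristic, and ties broken toward deeper nodes so the heuristic can pay off; it illustrates the direction of the effect only and reflects neither paper's actual code or reported speedup:

```python
import heapq

def grid_search(grid, start, goal, heuristic_on):
    """Best-first search on a binary grid; heuristic_on=True gives A*
    (Manhattan heuristic), False degenerates to Dijkstra's algorithm."""
    rows, cols = len(grid), len(grid[0])
    h = ((lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1]))
         if heuristic_on else (lambda p: 0))
    frontier = [(h(start), 0, start)]   # (f, -g, cell); -g breaks f-ties
    seen, expanded = set(), 0           # toward deeper nodes
    while frontier:
        f, neg_g, cur = heapq.heappop(frontier)
        if cur in seen:
            continue
        seen.add(cur)
        expanded += 1
        g = -neg_g
        if cur == goal:
            return g, expanded
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                heapq.heappush(frontier,
                               (g + 1 + h((nr, nc)), -(g + 1), (nr, nc)))
    return None, expanded

floor = [[0] * 30 for _ in range(30)]   # empty 30x30 floor plan
for on in (True, False):
    cost, expanded = grid_search(floor, (0, 0), (29, 29), on)
    print("A*" if on else "Dijkstra", "path cost:", cost,
          "cells expanded:", expanded)
# Both report the same optimal cost (58), but Dijkstra expands all ~900
# cells while A* expands only a few dozen along the heuristic's corridor.
```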