FASTNav: Fine-tuned Adaptive Small-language-models Trained for Multi-point Robot Navigation
- URL: http://arxiv.org/abs/2411.13262v1
- Date: Wed, 20 Nov 2024 12:28:13 GMT
- Title: FASTNav: Fine-tuned Adaptive Small-language-models Trained for Multi-point Robot Navigation
- Authors: Yuxuan Chen, Yixin Han, Xiao Li,
- Abstract summary: We propose FASTNav - a method for boosting lightweight language models (SLMs) for robot navigation.
We train and evaluate models with FASTNav in both simulation and real robots, proving that we can deploy them with low cost, high accuracy and low response time.
- Score: 10.3997505825422
- License:
- Abstract: With the rapid development of large language models (LLM), robots are starting to enjoy the benefits of new interaction methods that large language models bring. Because edge computing fulfills the needs for rapid response, privacy, and network autonomy, we believe it facilitates the extensive deployment of large models for robot navigation across various industries. To enable local deployment of language models on edge devices, we adopt some model boosting methods. In this paper, we propose FASTNav - a method for boosting lightweight LLMs, also known as small language models (SLMs), for robot navigation. The proposed method contains three modules: fine-tuning, teacher-student iteration, and language-based multi-point robot navigation. We train and evaluate models with FASTNav in both simulation and real robots, proving that we can deploy them with low cost, high accuracy and low response time. Compared to other model compression methods, FASTNav shows potential in the local deployment of language models and tends to be a promising solution for language-guided robot navigation on edge devices.
Related papers
- Hyp2Nav: Hyperbolic Planning and Curiosity for Crowd Navigation [58.574464340559466]
We advocate for hyperbolic learning to enable crowd navigation and we introduce Hyp2Nav.
Hyp2Nav leverages the intrinsic properties of hyperbolic geometry to better encode the hierarchical nature of decision-making processes in navigation tasks.
We propose a hyperbolic policy model and a hyperbolic curiosity module that results in effective social navigation, best success rates, and returns across multiple simulation settings.
arXiv Detail & Related papers (2024-07-18T14:40:33Z) - Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation using
Large Language Models [10.312968200748118]
Co-NavGPT is an innovative framework that integrates Large Language Models as a global planner for multi-robot cooperative visual target navigation.
It encodes the explored environment data into prompts, enhancing LLMs' scene comprehension.
It then assigns exploration frontiers to each robot for efficient target search.
arXiv Detail & Related papers (2023-10-11T23:17:43Z) - NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z) - RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic
Control [140.48218261864153]
We study how vision-language models trained on Internet-scale data can be incorporated directly into end-to-end robotic control.
Our approach leads to performant robotic policies and enables RT-2 to obtain a range of emergent capabilities from Internet-scale training.
arXiv Detail & Related papers (2023-07-28T21:18:02Z) - VoxPoser: Composable 3D Value Maps for Robotic Manipulation with
Language Models [38.503337052122234]
Large language models (LLMs) are shown to possess a wealth of actionable knowledge that can be extracted for robot manipulation.
We aim to synthesize robot trajectories for a variety of manipulation tasks given an open-set of instructions and an open-set of objects.
We demonstrate how the proposed framework can benefit from online experiences by efficiently learning a dynamics model for scenes that involve contact-rich interactions.
arXiv Detail & Related papers (2023-07-12T07:40:48Z) - ViNT: A Foundation Model for Visual Navigation [52.2571739391896]
Visual Navigation Transformer (ViNT) is a foundation model for vision-based robotic navigation.
ViNT is trained with a general goal-reaching objective that can be used with any navigation dataset.
It exhibits positive transfer, outperforming specialist models trained on singular datasets.
arXiv Detail & Related papers (2023-06-26T16:57:03Z) - GNM: A General Navigation Model to Drive Any Robot [67.40225397212717]
General goal-conditioned model for vision-based navigation can be trained on data obtained from many distinct but structurally similar robots.
We analyze the necessary design decisions for effective data sharing across robots.
We deploy the trained GNM on a range of new robots, including an under quadrotor.
arXiv Detail & Related papers (2022-10-07T07:26:41Z) - LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language,
Vision, and Action [76.71101507291473]
We present a system, LM-Nav, for robotic navigation that enjoys the benefits of training on unannotated large datasets of trajectories.
We show that such a system can be constructed entirely out of pre-trained models for navigation (ViNG), image-language association (CLIP), and language modeling (GPT-3), without requiring any fine-tuning or language-annotated robot data.
arXiv Detail & Related papers (2022-07-10T10:41:50Z) - Reshaping Robot Trajectories Using Natural Language Commands: A Study of
Multi-Modal Data Alignment Using Transformers [33.7939079214046]
We provide a flexible language-based interface for human-robot collaboration.
We take advantage of recent advancements in the field of large language models to encode the user command.
We train the model using imitation learning over a dataset containing robot trajectories modified by language commands.
arXiv Detail & Related papers (2022-03-25T01:36:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.