Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel
- URL: http://arxiv.org/abs/2412.08467v1
- Date: Wed, 11 Dec 2024 15:32:24 GMT
- Title: Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel
- Authors: Zun Wang, Jialu Li, Yicong Hong, Songze Li, Kunchang Li, Shoubin Yu, Yi Wang, Yu Qiao, Yali Wang, Mohit Bansal, Limin Wang
- Abstract summary: We introduce a Self-Refining Data Flywheel (SRDF) that generates high-quality and large-scale navigational instruction-trajectory pairs.
Our experiments demonstrate that after several flywheel rounds, the navigator elevates the performance boundary from 70% to 78% SPL on the classic R2R test set.
This process results in a superior generator, evidenced by a SPICE increase from 23.5 to 26.2, better than all previous VLN instruction generation methods.
- Score: 83.7466618084902
- Abstract: Creating high-quality data for training robust language-instructed agents is a long-lasting challenge in embodied AI. In this paper, we introduce a Self-Refining Data Flywheel (SRDF) that generates high-quality and large-scale navigational instruction-trajectory pairs by iteratively refining the data pool through the collaboration between two models, the instruction generator and the navigator, without any human-in-the-loop annotation. Specifically, SRDF starts with using a base generator to create an initial data pool for training a base navigator, followed by applying the trained navigator to filter the data pool. This leads to higher-fidelity data to train a better generator, which can, in turn, produce higher-quality data for training the next-round navigator. Such a flywheel establishes a data self-refining process, yielding a continuously improved and highly effective dataset for large-scale language-guided navigation learning. Our experiments demonstrate that after several flywheel rounds, the navigator elevates the performance boundary from 70% to 78% SPL on the classic R2R test set, surpassing human performance (76%) for the first time. Meanwhile, this process results in a superior generator, evidenced by a SPICE increase from 23.5 to 26.2, better than all previous VLN instruction generation methods. Finally, we demonstrate the scalability of our method through increasing environment and instruction diversity, and the generalization ability of our pre-trained navigator across various downstream navigation tasks, surpassing state-of-the-art methods by a large margin in all cases.
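The generate, train, filter, retrain loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `run_flywheel`, `train_navigator`, `train_generator`, `fidelity`, and `threshold` are hypothetical, and the abstract does not specify how the navigator scores or filters a pair, so the filter here is a simple score cutoff.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Pair:
    """One instruction-trajectory pair in the data pool."""
    trajectory: str
    instruction: str


Generator = Callable[[str], str]   # trajectory -> instruction
Navigator = Callable[[str], str]   # instruction -> predicted trajectory


def run_flywheel(
    trajectories: List[str],
    base_generator: Generator,
    train_navigator: Callable[[List[Pair]], Navigator],   # hypothetical trainer
    train_generator: Callable[[List[Pair]], Generator],   # hypothetical trainer
    fidelity: Callable[[str, str], float],                # assumed scoring function
    threshold: float,                                     # assumed filter cutoff
    rounds: int,
) -> Tuple[Generator, Optional[Navigator], List[Pair]]:
    """Alternate between training a navigator and a generator on a refined pool."""
    generator = base_generator
    navigator: Optional[Navigator] = None
    filtered: List[Pair] = []
    for _ in range(rounds):
        # 1. The current generator labels every trajectory to build the data pool.
        pool = [Pair(t, generator(t)) for t in trajectories]
        # 2. A navigator is trained on the full pool.
        navigator = train_navigator(pool)
        # 3. The navigator filters the pool: keep pairs whose instruction
        #    leads it back to (approximately) the original trajectory.
        filtered = [
            p for p in pool
            if fidelity(navigator(p.instruction), p.trajectory) >= threshold
        ]
        # 4. The higher-fidelity subset trains the next-round generator.
        generator = train_generator(filtered)
    return generator, navigator, filtered
```

In practice the two training callbacks would wrap real model training, and the fidelity function would be a path-similarity measure between the navigator's rollout and the source trajectory; both choices are left open here.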
Related papers
- Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning [3.586527534935176]
In offline reinforcement learning (RL), an RL agent learns to solve a task using only a fixed dataset of previously collected data.
We propose Guided Data Augmentation (GuDA), a human-guided DA framework that generates expert-quality augmented data.
GuDA enables learning given a small initial dataset of potentially suboptimal experience.
arXiv Detail & Related papers (2023-10-27T16:34:00Z)
- PlaceNav: Topological Navigation through Place Recognition [1.9382079036818822]
We present PlaceNav, subdividing the robot-independent part into navigation-specific and generic computer vision components.
We utilize visual place recognition for the subgoal selection of the topological navigation pipeline.
Our experimental results verify the design and the new method obtains a 76% higher success rate in indoor and 23% higher in outdoor navigation tasks with higher computational efficiency.
arXiv Detail & Related papers (2023-09-29T14:12:54Z)
- Scaling Data Generation in Vision-and-Language Navigation [116.95534559103788]
We propose an effective paradigm for generating large-scale data for learning.
We use 1200+ photo-realistic environments from the HM3D and Gibson datasets and synthesize 4.9 million instruction-trajectory pairs.
Thanks to our large-scale dataset, the performance of an existing agent can be pushed up (+11% absolute with regard to previous SoTA) to a significantly new best of 80% single-run success rate on the R2R test split by simple imitation learning.
arXiv Detail & Related papers (2023-07-28T16:03:28Z)
- Enhancing Navigation Benchmarking and Perception Data Generation for Row-based Crops in Simulation [0.3518016233072556]
This paper presents a synthetic dataset to train semantic segmentation networks and a collection of virtual scenarios for a fast evaluation of navigation algorithms.
An automatic parametric approach is developed to explore different field geometries and features.
The simulation framework and the dataset have been evaluated by training a deep segmentation network on different crops and benchmarking the resulting navigation.
arXiv Detail & Related papers (2023-06-27T14:46:09Z)
- Offline Reinforcement Learning for Visual Navigation [66.88830049694457]
ReViND is the first offline RL system for robotic navigation that can leverage previously collected data to optimize user-specified reward functions in the real world.
We show that ReViND can navigate to distant goals using only offline training from this dataset, and exhibit behaviors that qualitatively differ based on the user-specified reward function.
arXiv Detail & Related papers (2022-12-16T02:23:50Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
- Ultrasound-Guided Robotic Navigation with Deep Reinforcement Learning [38.136007056617885]
We introduce the first reinforcement learning (RL) based robotic navigation method which utilizes ultrasound (US) images as an input.
When testing our proposed model, we obtained an 82.91% chance of navigating correctly to the sacrum from 165 different starting positions.
arXiv Detail & Related papers (2020-03-30T10:13:23Z)
- Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN [80.17705319689139]
We propose a data-free knowledge amalgamation strategy to craft a well-behaved multi-task student network from multiple single/multi-task teachers.
Without any training data, the proposed method achieves surprisingly competitive results, even compared with some fully supervised methods.
arXiv Detail & Related papers (2020-03-20T03:20:52Z)
- Deep Learning based Pedestrian Inertial Navigation: Methods, Dataset and On-Device Inference [49.88536971774444]
Inertial measurement units (IMUs) are small, cheap, energy efficient, and widely employed in smart devices and mobile robots.
Exploiting inertial data for accurate and reliable pedestrian navigation support is a key component for emerging Internet-of-Things applications and services.
We present and release the Oxford Inertial Odometry dataset (OxIOD), a first-of-its-kind public dataset for deep learning based inertial navigation research.
arXiv Detail & Related papers (2020-01-13T04:41:54Z)