Adversarial Reinforced Instruction Attacker for Robust Vision-Language
Navigation
- URL: http://arxiv.org/abs/2107.11252v1
- Date: Fri, 23 Jul 2021 14:11:31 GMT
- Title: Adversarial Reinforced Instruction Attacker for Robust Vision-Language
Navigation
- Authors: Bingqian Lin, Yi Zhu, Yanxin Long, Xiaodan Liang, Qixiang Ye, Liang
Lin
- Abstract summary: Language instruction plays an essential role in the natural language grounded navigation tasks.
We propose to train a more robust navigator that is capable of dynamically extracting crucial factors from long instructions.
Specifically, we propose a Dynamic Reinforced Instruction Attacker (DR-Attacker), which learns to mislead the navigator into moving to the wrong target.
- Score: 145.84123197129298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language instruction plays an essential role in natural language
grounded navigation tasks. However, navigators trained with limited
human-annotated instructions may have difficulty accurately capturing key
information from complicated instructions at different timesteps, leading to
poor navigation performance. In this paper, we propose to train a more robust
navigator that is capable of dynamically extracting crucial factors from long
instructions, by using an adversarial attacking paradigm. Specifically, we
propose a Dynamic Reinforced Instruction Attacker (DR-Attacker), which learns
to mislead the navigator into moving to the wrong target by destroying the
most instructive information in instructions at different timesteps. By
formulating perturbation generation as a Markov Decision Process, DR-Attacker
is optimized with a reinforcement learning algorithm to generate perturbed
instructions sequentially during navigation, according to a learnable attack
score. The perturbed instructions, which serve as hard samples, are then used
to improve the robustness of the navigator through an effective adversarial
training strategy and an auxiliary self-supervised reasoning task.
Experimental results on both the Vision-and-Language Navigation (VLN) and
Navigation from Dialog History (NDH) tasks show the superiority of the
proposed method over state-of-the-art methods. Moreover, visualization
analysis shows the effectiveness of the proposed DR-Attacker, which can
successfully attack crucial information in the instructions at different
timesteps. Code is available at https://github.com/expectorlin/DR-Attacker.
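
The mechanism described above reduces to: at each navigation timestep, score every instruction word with a learnable attack score, sample the most instructive word as the attack action, perturb it, and reinforce attacks that mislead the navigator via a policy-gradient reward. Below is a minimal PyTorch sketch of this idea; all names (AttackScorer, attack_step, substitute_feat) and the reward definition are illustrative assumptions, not the authors' actual implementation (see the linked repository for that).

```python
# Minimal sketch of a per-timestep reinforced instruction attack.
# Hypothetical names throughout; not the authors' implementation.
import torch
import torch.nn as nn

class AttackScorer(nn.Module):
    """Scores each instruction word for how instructive it is at the
    current navigation timestep (the 'learnable attack score')."""
    def __init__(self, word_dim, state_dim, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(word_dim + state_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, word_feats, nav_state):
        # word_feats: (num_words, word_dim); nav_state: (state_dim,)
        state = nav_state.unsqueeze(0).expand(word_feats.size(0), -1)
        logits = self.mlp(torch.cat([word_feats, state], dim=-1)).squeeze(-1)
        return torch.softmax(logits, dim=-1)  # attack distribution over words

def attack_step(scorer, word_feats, nav_state, substitute_feat):
    """One MDP step: sample a word to attack and perturb it."""
    probs = scorer(word_feats, nav_state)
    dist = torch.distributions.Categorical(probs)
    idx = dist.sample()                  # stochastic action for REINFORCE
    perturbed = word_feats.clone()
    perturbed[idx] = substitute_feat     # destroy the selected key word
    return perturbed, dist.log_prob(idx)

# REINFORCE update (sketch): the reward is high when the perturbation
# misleads the navigator, e.g. when its action distribution shifts away
# from the ground-truth action at this timestep:
#   loss = -(reward * log_prob).mean()
```

Under this reading, the same perturbed instructions produced by the attacker would then be reused as hard samples when adversarially training the navigator, alongside the auxiliary self-supervised reasoning task mentioned in the abstract.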
Related papers
- I2EDL: Interactive Instruction Error Detection and Localization [65.25839671641218]
We propose a novel task of Interactive VLN in Continuous Environments (IVLN-CE).
It allows the agent to interact with the user during VLN-CE navigation to verify any doubts regarding instruction errors.
We leverage a pre-trained module to detect instruction errors and pinpoint them in the instruction by cross-referencing the textual input and past observations.
arXiv Detail & Related papers (2024-06-07T16:52:57Z)
- InstructNav: Zero-shot System for Generic Instruction Navigation in Unexplored Environment [5.43847693345519]
In this work, we propose InstructNav, a generic instruction navigation system.
InstructNav makes the first attempt to handle various instruction navigation tasks without any navigation training or pre-built maps.
With InstructNav, we complete the R2R-CE task in a zero-shot way for the first time and outperform many task-trained methods.
arXiv Detail & Related papers (2024-06-07T12:26:34Z)
- TINA: Think, Interaction, and Action Framework for Zero-Shot Vision Language Navigation [11.591176410027224]
This paper presents a Vision-Language Navigation (VLN) agent based on Large Language Models (LLMs).
We propose the Thinking, Interacting, and Action framework to compensate for the shortcomings of LLMs in environmental perception.
Our approach also outperformed some supervised learning-based methods, highlighting its efficacy in zero-shot navigation.
arXiv Detail & Related papers (2024-03-13T05:22:39Z) - $A^2$Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting
Vision-and-Language Ability of Foundation Models [89.64729024399634]
We study the task of zero-shot vision-and-language navigation (ZS-VLN), a practical yet challenging problem in which an agent learns to navigate following a path described by language instructions.
Normally, the instructions have complex grammatical structures and often contain various action descriptions.
How to correctly understand and execute these action demands is a critical problem, and the absence of annotated data makes it even more challenging.
arXiv Detail & Related papers (2023-08-15T19:01:19Z)
- Counterfactual Cycle-Consistent Learning for Instruction Following and Generation in Vision-Language Navigation [172.15808300686584]
We describe an approach that learns the two tasks simultaneously and exploits their intrinsic correlations to boost the training of each.
Our approach improves the performance of various follower models and produces accurate navigation instructions.
arXiv Detail & Related papers (2022-03-30T18:15:26Z)
- Contrastive Instruction-Trajectory Learning for Vision-Language Navigation [66.16980504844233]
A vision-language navigation (VLN) task requires an agent to reach a target with the guidance of natural language instruction.
Previous works fail to discriminate the similarities and discrepancies across instruction-trajectory pairs and ignore the temporal continuity of sub-instructions.
We propose a Contrastive Instruction-Trajectory Learning framework that explores invariance across similar data samples and variance across different ones to learn distinctive representations for robust navigation.
arXiv Detail & Related papers (2021-12-08T06:32:52Z)
- On the Evaluation of Vision-and-Language Navigation Instructions [76.92085026018427]
Vision-and-Language Navigation wayfinding agents can be enhanced by exploiting automatically generated navigation instructions.
Existing instruction generators have not been comprehensively evaluated.
BLEU, ROUGE, METEOR and CIDEr are ineffective for evaluating grounded navigation instructions.
arXiv Detail & Related papers (2021-01-26T01:03:49Z)