Soft Expert Reward Learning for Vision-and-Language Navigation
- URL: http://arxiv.org/abs/2007.10835v1
- Date: Tue, 21 Jul 2020 14:17:36 GMT
- Title: Soft Expert Reward Learning for Vision-and-Language Navigation
- Authors: Hu Wang, Qi Wu, Chunhua Shen
- Abstract summary: Vision-and-Language Navigation (VLN) requires an agent to find a specified spot in an unseen environment by following natural language instructions.
We introduce a Soft Expert Reward Learning (SERL) model to overcome the reward engineering and generalisation problems of the VLN task.
- Score: 94.86954695912125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision-and-Language Navigation (VLN) requires an agent to find a specified
spot in an unseen environment by following natural language instructions.
Dominant methods based on supervised learning clone the expert's behaviours and
thus perform well on seen environments, while showing restricted performance on
unseen ones. Reinforcement Learning (RL) based models show better
generalisation ability but have issues as well; requiring a large amount of
manual reward engineering is one of them. In this paper, we introduce a Soft
Expert Reward Learning (SERL) model to overcome the reward engineering and
generalisation problems of the VLN task. Our proposed method consists of two
complementary components: the Soft Expert Distillation (SED) module encourages
the agent to behave as much like an expert as possible, but in a soft fashion,
while the Self Perceiving (SP) module aims to push the agent towards the final
destination as fast as possible. Empirically, we evaluate our model on the VLN
seen, unseen and test splits, and it outperforms state-of-the-art methods on
most of the evaluation metrics.
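The abstract does not give the reward's functional form, so the following is only a minimal sketch of how the two components could combine into a per-step reward, assuming the SED signal is the log-probability of the chosen action under a learned expert action distribution and the SP signal is the step-wise reduction in distance to the goal; `serl_step_reward` and both weights are hypothetical names, not the authors' code.
```python
import torch
import torch.nn.functional as F

def serl_step_reward(expert_logits: torch.Tensor, action: int,
                     dist_prev: float, dist_now: float,
                     sed_weight: float = 1.0, sp_weight: float = 1.0) -> torch.Tensor:
    # SED-style term: how likely the chosen action is under a learned expert
    # action distribution -- a soft imitation signal rather than a hard
    # behaviour-cloning target (assumed form, not the paper's exact one).
    sed_reward = F.log_softmax(expert_logits, dim=-1)[action]
    # SP-style term: self-perceived progress, i.e. how much closer the agent
    # got to the final destination on this step.
    sp_reward = dist_prev - dist_now
    return sed_weight * sed_reward + sp_weight * sp_reward

# Example: an expert head that strongly prefers action 2, and a step that
# shrinks the remaining distance from 8.0 m to 6.5 m.
reward = serl_step_reward(torch.tensor([0.1, 0.2, 2.5, 0.3]), action=2,
                          dist_prev=8.0, dist_now=6.5)
```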
Related papers
- Vision-Language Navigation with Energy-Based Policy [66.04379819772764]
Vision-language navigation (VLN) requires an agent to execute actions following human instructions.
We propose an Energy-based Navigation Policy (ENP) to model the joint state-action distribution.
ENP achieves promising performance on R2R, REVERIE, RxR, and R2R-CE.
arXiv Detail & Related papers (2024-10-18T08:01:36Z)
- Offline Imitation Learning with Model-based Reverse Augmentation [48.64791438847236]
We propose a novel model-based framework, called offline Imitation Learning with Self-paced Reverse Augmentation.
Specifically, we build a reverse dynamic model from the offline demonstrations, which can efficiently generate trajectories leading to the expert-observed states.
We then use reinforcement learning to learn from the augmented trajectories and transition from expert-unobserved states to expert-observed states.
arXiv Detail & Related papers (2024-06-18T12:27:02Z)
- TINA: Think, Interaction, and Action Framework for Zero-Shot Vision Language Navigation [11.591176410027224]
This paper presents a Vision-Language Navigation (VLN) agent based on Large Language Models (LLMs).
We propose the Thinking, Interacting, and Action framework to compensate for the shortcomings of LLMs in environmental perception.
Our approach also outperformed some supervised learning-based methods, highlighting its efficacy in zero-shot navigation.
arXiv Detail & Related papers (2024-03-13T05:22:39Z)
- Towards Learning a Generalist Model for Embodied Navigation [24.816490551945435]
We propose the first generalist model for embodied navigation, NaviLLM.
It adapts LLMs to embodied navigation by introducing schema-based instruction.
We conduct extensive experiments to evaluate the performance and generalizability of our model.
arXiv Detail & Related papers (2023-12-04T16:32:51Z)
- ExpeL: LLM Agents Are Experiential Learners [60.54312035818746]
We introduce the Experiential Learning (ExpeL) agent to allow learning from agent experiences without requiring parametric updates.
Our agent autonomously gathers experiences and extracts knowledge using natural language from a collection of training tasks.
At inference, the agent recalls its extracted insights and past experiences to make informed decisions.
arXiv Detail & Related papers (2023-08-20T03:03:34Z)
- Curricular Subgoals for Inverse Reinforcement Learning [21.038691420095525]
Inverse Reinforcement Learning (IRL) aims to reconstruct the reward function from expert demonstrations to facilitate policy learning.
Existing IRL methods mainly focus on learning global reward functions to minimize the trajectory difference between the imitator and the expert.
We propose a novel Curricular Subgoal-based Inverse Reinforcement Learning framework that explicitly disentangles one task into several local subgoals to guide agent imitation.
arXiv Detail & Related papers (2023-06-14T04:06:41Z)
- ULN: Towards Underspecified Vision-and-Language Navigation [77.81257404252132]
Underspecified Vision-and-Language Navigation (ULN) is a new setting for Vision-and-Language Navigation (VLN).
We propose a VLN framework that consists of a classification module, a navigation agent, and an Exploitation-to-Exploration (E2E) module.
Our framework is more robust and outperforms the baselines on ULN by 10% relative success rate across all levels.
arXiv Detail & Related papers (2022-10-18T17:45:06Z)
- Imitation by Predicting Observations [17.86983397979034]
We present a new method for imitation solely from observations that achieves comparable performance to experts on challenging continuous control tasks.
Our method, which we call FORM, is derived from an inverse RL objective and imitates using a model of expert behavior learned by generative modelling of the expert's observations.
We show that FORM performs comparably to a strong baseline IRL method (GAIL) on the DeepMind Control Suite benchmark, while outperforming GAIL in the presence of task-irrelevant features.
arXiv Detail & Related papers (2021-07-08T14:09:30Z)
- A Recurrent Vision-and-Language BERT for Navigation [54.059606864535304]
We propose a recurrent BERT model that is time-aware for use in vision-and-language navigation.
Our model can replace more complex encoder-decoder models to achieve state-of-the-art results.
arXiv Detail & Related papers (2020-11-26T00:23:00Z)
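The last entry's recurrent, time-aware BERT can be pictured as a single encoder reused at every navigation step, with an explicit state token carried forward as memory. Below is a minimal sketch under that reading; the class name, layer sizes and action head are illustrative assumptions, not the paper's implementation.
```python
import torch
import torch.nn as nn

class RecurrentVLNCell(nn.Module):
    """Hypothetical sketch: a BERT-style encoder made time-aware by feeding
    an explicit state token back in at every navigation step."""

    def __init__(self, hidden: int = 768, heads: int = 12, num_actions: int = 6):
        super().__init__()
        layer = nn.TransformerEncoderLayer(hidden, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.action_head = nn.Linear(hidden, num_actions)

    def forward(self, state, text_tokens, vision_tokens):
        # state: (B, 1, H) recurrent memory; text/vision tokens: (B, T, H).
        tokens = torch.cat([state, text_tokens, vision_tokens], dim=1)
        out = self.encoder(tokens)
        new_state = out[:, :1, :]          # updated state token, reused next step
        action_logits = self.action_head(new_state.squeeze(1))
        return new_state, action_logits
```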
This list is automatically generated from the titles and abstracts of the papers in this site.