Last-Mile Embodied Visual Navigation
- URL: http://arxiv.org/abs/2211.11746v1
- Date: Mon, 21 Nov 2022 18:59:58 GMT
- Title: Last-Mile Embodied Visual Navigation
- Authors: Justin Wasserman, Karmesh Yadav, Girish Chowdhary, Abhinav Gupta,
Unnat Jain
- Abstract summary: We propose SLING to improve the performance of image-goal navigation systems.
We focus on last-mile navigation and leverage the underlying geometric structure of the problem with neural descriptors.
On a standardized image-goal navigation benchmark, we improve performance across policies, scenes, and episode complexity, raising the state-of-the-art from 45% to 55% success rate.
- Score: 31.622495628224403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Realistic long-horizon tasks like image-goal navigation involve exploratory
and exploitative phases. Assigned with an image of the goal, an embodied agent
must explore to discover the goal, i.e., search efficiently using learned
priors. Once the goal is discovered, the agent must accurately calibrate the
last-mile of navigation to the goal. As with any robust system, switches
between exploratory goal discovery and exploitative last-mile navigation enable
better recovery from errors. Following these intuitive guide rails, we propose
SLING to improve the performance of existing image-goal navigation systems.
Entirely complementing prior methods, we focus on last-mile navigation and
leverage the underlying geometric structure of the problem with neural
descriptors. With simple but effective switches, we can easily connect SLING
with heuristic, reinforcement learning, and neural modular policies. On a
standardized image-goal navigation benchmark (Hahn et al. 2021), we improve
performance across policies, scenes, and episode complexity, raising the
state-of-the-art from 45% to 55% success rate. Beyond photorealistic
simulation, we conduct real-robot experiments in three physical scenes and find
these improvements to transfer well to real environments.
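The exploitative phase the abstract describes is concrete enough to sketch. The snippet below is a minimal stand-in, not SLING's implementation: ORB features substitute for learned neural descriptors, the match-count threshold playing the role of the switch is illustrative, and the camera intrinsics and action names are assumptions. The geometric core, recovering the relative pose to the goal view from matched keypoints via the essential matrix, is the structure the abstract refers to.

```python
import cv2
import numpy as np

# Assumed camera intrinsics; replace with the robot's calibration.
K = np.array([[320.0, 0.0, 320.0],
              [0.0, 320.0, 240.0],
              [0.0, 0.0, 1.0]])

MIN_MATCHES = 30  # illustrative switch threshold, not the paper's value


def last_mile_direction(obs_gray, goal_gray):
    """Return a (yaw, translation) estimate toward the goal view, or
    None when the goal is not confidently matched (keep exploring)."""
    orb = cv2.ORB_create(1000)  # ORB stands in for learned neural descriptors
    kp_o, des_o = orb.detectAndCompute(obs_gray, None)
    kp_g, des_g = orb.detectAndCompute(goal_gray, None)
    if des_o is None or des_g is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_o, des_g)
    if len(matches) < MIN_MATCHES:
        return None  # switch: stay in the exploratory phase
    pts_o = np.float32([kp_o[m.queryIdx].pt for m in matches])
    pts_g = np.float32([kp_g[m.trainIdx].pt for m in matches])
    # Epipolar geometry recovers relative rotation and scale-free translation.
    E, inliers = cv2.findEssentialMat(pts_o, pts_g, K, cv2.RANSAC, 0.999, 1.0)
    if E is None:
        return None
    _, R, t, _ = cv2.recoverPose(E, pts_o, pts_g, K, mask=inliers)
    yaw = np.arctan2(R[0, 2], R[2, 2])  # heading correction toward the goal
    return yaw, t.ravel()


def step(obs_gray, goal_gray, explore_policy):
    """Switch: exploit the geometric estimate when available, else explore."""
    estimate = last_mile_direction(obs_gray, goal_gray)
    if estimate is None:
        return explore_policy(obs_gray)  # any heuristic/RL/modular policy
    yaw, _ = estimate
    if yaw > 0.1:
        return "TURN_RIGHT"
    if yaw < -0.1:
        return "TURN_LEFT"
    return "MOVE_FORWARD"
```

Because the switch fires only when enough descriptors match, a failed last-mile attempt simply drops the agent back into exploration, which is the error-recovery behavior the abstract highlights.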
Related papers
- Transformers for Image-Goal Navigation [0.0]
We present a generative Transformer based model that jointly models image goals, camera observations and the robot's past actions to predict future actions.
Our model demonstrates capability in capturing and associating visual information across long time horizons, helping in effective navigation.
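A hedged sketch of what such a joint model could look like, assuming precomputed image features and a small discrete action space; the token layout, layer sizes, and names below are illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn

NUM_ACTIONS = 4  # assumed discrete action space (e.g. stop/forward/left/right)


class GoalConditionedTransformer(nn.Module):
    """Sketch: a goal-image token, followed by observation tokens and
    past-action tokens, jointly attended to predict the next action."""

    def __init__(self, feat_dim=512, d_model=256, n_heads=8, n_layers=4,
                 max_tokens=128):
        super().__init__()
        self.img_proj = nn.Linear(feat_dim, d_model)  # frozen-encoder features
        self.act_emb = nn.Embedding(NUM_ACTIONS, d_model)
        self.pos_emb = nn.Embedding(max_tokens, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, NUM_ACTIONS)

    def forward(self, goal_feat, obs_feats, past_actions):
        # goal_feat: (B, 512); obs_feats: (B, T, 512); past_actions: (B, T)
        tokens = torch.cat([self.img_proj(goal_feat).unsqueeze(1),
                            self.img_proj(obs_feats),
                            self.act_emb(past_actions)], dim=1)
        positions = torch.arange(tokens.size(1), device=tokens.device)
        h = self.backbone(tokens + self.pos_emb(positions))
        return self.head(h[:, -1])  # next-action logits from the final token
```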
arXiv Detail & Related papers (2024-05-23T03:01:32Z)
- FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation [54.25416624924669]
We propose a Fine-grained Goal Prompting (FGPrompt) method for image-goal navigation.
FGPrompt preserves detailed information in the goal image and guides the observation encoder to pay attention to goal-relevant regions.
Our method brings significant performance improvement on 3 benchmark datasets.
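The fusion mechanism is not spelled out in this summary, so the sketch below uses feature-wise modulation (FiLM) as one assumed way to let a goal image steer an observation encoder toward goal-relevant regions; sizes and names are illustrative, not FGPrompt's actual design:

```python
import torch
import torch.nn as nn


class GoalFiLM(nn.Module):
    """Sketch of goal prompting via feature-wise modulation: the goal
    image produces per-channel scale/shift parameters that re-weight
    the observation encoder's feature maps."""

    def __init__(self, channels=64, goal_dim=512):
        super().__init__()
        self.to_gamma_beta = nn.Linear(goal_dim, 2 * channels)

    def forward(self, obs_featmap, goal_feat):
        # obs_featmap: (B, C, H, W); goal_feat: (B, goal_dim)
        gamma, beta = self.to_gamma_beta(goal_feat).chunk(2, dim=-1)
        gamma = gamma[:, :, None, None]  # broadcast over spatial dims
        beta = beta[:, :, None, None]
        return (1 + gamma) * obs_featmap + beta
```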
arXiv Detail & Related papers (2023-10-11T13:19:29Z)
- Navigating to Objects in the Real World [76.1517654037993]
We present a large-scale empirical study of semantic visual navigation methods comparing methods from classical, modular, and end-to-end learning approaches.
We find that modular learning works well in the real world, attaining a 90% success rate.
In contrast, end-to-end learning does not, dropping from 77% simulation to 23% real-world success rate due to a large image domain gap between simulation and reality.
arXiv Detail & Related papers (2022-12-02T01:10:47Z)
- Towards self-attention based visual navigation in the real world [0.0]
Vision guided navigation requires processing complex visual information to inform task-orientated decisions.
Deep Reinforcement Learning agents trained in simulation often exhibit unsatisfactory results when deployed in the real-world.
This is the first demonstration of a self-attention based agent successfully trained to navigate in a 3D action space using fewer than 4000 parameters.
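To make the parameter-budget claim tangible, here is a toy patch-attention agent in the same spirit; the architecture is a guess, not the paper's model, and the printed count holds only for these illustrative sizes:

```python
import torch
import torch.nn as nn


class TinySelfAttentionAgent(nn.Module):
    """Sketch of a parameter-frugal self-attention agent: attention over
    image patches picks the most salient patch locations, and a small MLP
    maps those locations to an action."""

    def __init__(self, patch=7, d=8, top_k=8, n_actions=3):
        super().__init__()
        self.q = nn.Linear(patch * patch, d)
        self.k = nn.Linear(patch * patch, d)
        self.top_k = top_k
        self.policy = nn.Sequential(
            nn.Linear(2 * top_k, 16), nn.Tanh(), nn.Linear(16, n_actions))

    def forward(self, patches, centers):
        # patches: (N, patch*patch) flattened grayscale patches
        # centers: (N, 2) normalized (x, y) patch centers
        scores = self.q(patches) @ self.k(patches).T  # (N, N) attention
        saliency = scores.softmax(dim=-1).sum(dim=0)  # votes per patch
        idx = saliency.topk(self.top_k).indices
        return self.policy(centers[idx].flatten())    # action logits


agent = TinySelfAttentionAgent()
print(sum(p.numel() for p in agent.parameters()))  # 1123 with these sizes
```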
arXiv Detail & Related papers (2022-09-15T04:51:42Z)
- Augmented reality navigation system for visual prosthesis [67.09251544230744]
We propose an augmented reality navigation system for visual prosthesis that incorporates software for reactive navigation and path planning.
It consists of four steps: locating the subject on a map, planning the subject's trajectory, showing it to the subject, and re-planning to avoid obstacles.
Results show that our augmented reality navigation system improves navigation performance by reducing the time and distance needed to reach goals, and significantly reduces the number of obstacle collisions.
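The four steps map directly onto a reactive control loop. In the sketch below, `localize`, `plan`, `render_overlay`, and `detect_obstacles` are hypothetical stand-ins for the system's actual components:

```python
import time


def navigation_loop(grid_map, goal, localize, plan, render_overlay,
                    detect_obstacles):
    """Sketch of the four-step reactive loop."""
    pose = localize(grid_map)                       # 1. locate subject on map
    path = plan(grid_map, pose, goal, avoid=set())  # 2. plan the trajectory
    while pose != goal:
        render_overlay(path, pose)                  # 3. show path to subject
        pose = localize(grid_map)
        obstacles = detect_obstacles()
        if any(waypoint in obstacles for waypoint in path):
            path = plan(grid_map, pose, goal, avoid=obstacles)  # 4. re-plan
        time.sleep(0.1)  # illustrative control rate
```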
arXiv Detail & Related papers (2021-09-30T09:41:40Z)
- Deep Learning for Embodied Vision Navigation: A Survey [108.13766213265069]
"Embodied visual navigation" problem requires an agent to navigate in a 3D environment mainly rely on its first-person observation.
This paper attempts to establish an outline of the current works in the field of embodied visual navigation by providing a comprehensive literature survey.
arXiv Detail & Related papers (2021-07-07T12:09:04Z)
- Rapid Exploration for Open-World Navigation with Latent Goal Models [78.45339342966196]
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images.
We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration.
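A compact sketch of such a latent distance-and-action model with a KL information bottleneck, in the VAE style the summary suggests; dimensions, heads, and the loss weighting are illustrative, not the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentGoalModel(nn.Module):
    """Sketch: encode (observation, goal) into a compact latent z, then
    predict time-to-goal distance and the action toward the goal. The KL
    term keeps z compressed; sampling z from the prior can propose
    feasible exploration goals."""

    def __init__(self, feat_dim=512, z_dim=32, n_actions=4):
        super().__init__()
        self.encoder = nn.Linear(2 * feat_dim, 2 * z_dim)  # mean and log-var
        self.dist_head = nn.Linear(feat_dim + z_dim, 1)
        self.act_head = nn.Linear(feat_dim + z_dim, n_actions)

    def forward(self, obs_feat, goal_feat):
        mu, logvar = self.encoder(
            torch.cat([obs_feat, goal_feat], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        h = torch.cat([obs_feat, z], -1)
        return self.dist_head(h), self.act_head(h), mu, logvar


def training_loss(model, obs, goal, dist_true, act_true, beta=0.01):
    dist, act_logits, mu, logvar = model(obs, goal)
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    return (F.mse_loss(dist.squeeze(-1), dist_true)
            + F.cross_entropy(act_logits, act_true)
            + beta * kl)  # the information bottleneck term
```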
arXiv Detail & Related papers (2021-04-12T23:14:41Z)
- Robot Perception enables Complex Navigation Behavior via Self-Supervised Learning [23.54696982881734]
We propose an approach to unify successful robot perception systems for active target-driven navigation tasks via reinforcement learning (RL).
Our method temporally incorporates compact motion and visual perception data, directly obtained using self-supervision from a single image sequence.
We demonstrate our approach on two real-world driving datasets, KITTI and Oxford RobotCar, using the new interactive CityLearn framework.
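One plausible reading of this temporal fusion is a recurrent policy over concatenated visual and ego-motion features; the sketch below assumes that reading, and all input shapes and sizes are illustrative:

```python
import torch
import torch.nn as nn


class MotionVisionPolicy(nn.Module):
    """Sketch: fuse a compact visual embedding with a self-supervised
    ego-motion estimate (e.g. from consecutive frames) and integrate
    them over time with a GRU before predicting the action."""

    def __init__(self, vis_dim=128, motion_dim=6, hidden=64, n_actions=4):
        super().__init__()
        self.rnn = nn.GRU(vis_dim + motion_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, vis_feats, motion_feats):
        # vis_feats: (B, T, 128); motion_feats: (B, T, 6) ego-motion per step
        h, _ = self.rnn(torch.cat([vis_feats, motion_feats], dim=-1))
        return self.head(h[:, -1])  # action logits from the last step
```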
arXiv Detail & Related papers (2020-06-16T07:45:47Z)
- Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships [52.72020203771489]
We investigate target-driven visual navigation using deep reinforcement learning (DRL) in 3D indoor scenes.
Our proposed method combines visual features and 3D spatial representations to learn navigation policy.
Our experiments, performed in the AI2-THOR environment, show that our model outperforms the baselines on both success rate (SR) and success weighted by path length (SPL) metrics.
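As a hedged illustration of combining visual features with 3D spatial representations, the sketch below encodes pairwise object relations and fuses them with a global visual feature; every input and size here is assumed rather than taken from the paper:

```python
import torch
import torch.nn as nn


class SpatialRelationPolicy(nn.Module):
    """Sketch: combine visual features with an encoding of pairwise 3D
    spatial relationships between detected objects, then predict the
    navigation action."""

    def __init__(self, vis_dim=512, obj_dim=32, hidden=128, n_actions=6):
        super().__init__()
        # encode (object_i, object_j, relative 3D offset) triples
        self.rel_enc = nn.Sequential(
            nn.Linear(2 * obj_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden))
        self.policy = nn.Sequential(
            nn.Linear(vis_dim + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions))

    def forward(self, vis_feat, obj_feats, offsets):
        # vis_feat: (B, 512); obj_feats: (B, P, 2, 32) object-pair features;
        # offsets: (B, P, 3) relative 3D positions per pair
        pairs = torch.cat([obj_feats.flatten(2), offsets], dim=-1)
        rel = self.rel_enc(pairs).mean(dim=1)  # pool over all pairs
        return self.policy(torch.cat([vis_feat, rel], dim=-1))
```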
arXiv Detail & Related papers (2020-04-29T08:46:38Z)