A Landmark-Aware Visual Navigation Dataset
- URL: http://arxiv.org/abs/2402.14281v1
- Date: Thu, 22 Feb 2024 04:43:20 GMT
- Title: A Landmark-Aware Visual Navigation Dataset
- Authors: Faith Johnson, Bryan Bo Cao, Kristin Dana, Shubham Jain, Ashwin Ashok
- Abstract summary: We present a Landmark-Aware Visual Navigation dataset to allow for supervised learning of human-centric exploration policies and map building.
We collect pairs of RGB observations and human point-clicks as a human annotator explores virtual and real-world environments.
Our dataset covers a wide spectrum of scenes, including rooms in indoor environments, as well as walkways outdoors.
- Score: 6.564789361460195
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Map representations learned from expert demonstrations have shown promising research value. However, recent advances in visual navigation are held back by the lack of real-world human demonstration datasets for efficient supervised representation learning of environments. We present the Landmark-Aware Visual Navigation (LAVN) dataset to enable supervised learning of human-centric exploration policies and map building. We collect pairs of RGB observations and human point-clicks as a human annotator explores virtual and real-world environments with the goal of fully covering the space. The annotators also provide distinct landmark examples along each trajectory, which we intuit will simplify the task of map or graph building and localization. These human point-clicks serve as direct supervision for waypoint prediction when learning to explore an environment. Our dataset covers a wide spectrum of scenes, including rooms in indoor environments as well as outdoor walkways. The dataset is available at DOI: 10.5281/zenodo.10608067.
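To make this supervision signal concrete, here is a minimal sketch of waypoint prediction trained directly on RGB/point-click pairs. The model, tensor shapes, and normalized click coordinates are illustrative assumptions, not the dataset's actual schema or the authors' architecture; consult the Zenodo archive (DOI: 10.5281/zenodo.10608067) for the real file layout.

```python
# Sketch: supervised waypoint prediction from RGB/point-click pairs.
# All shapes and the click normalization are assumptions for illustration.
import torch
import torch.nn as nn

class WaypointPredictor(nn.Module):
    """Maps an RGB observation to a normalized (x, y) point-click."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(nn.Linear(64, 2), nn.Sigmoid())  # in [0, 1]^2

    def forward(self, rgb):
        return self.head(self.backbone(rgb))

model = WaypointPredictor()
rgb = torch.rand(8, 3, 224, 224)   # batch of RGB observations
clicks = torch.rand(8, 2)          # human point-clicks, normalized to [0, 1]
loss = nn.functional.mse_loss(model(rgb), clicks)  # direct supervision
loss.backward()
```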
Related papers
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigation-specific visual representation learning method that contrasts the agent's egocentric views with semantic maps.
Ego$^2$-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
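As a rough illustration of the view-map contrasting described in the entry above, here is a minimal InfoNCE-style sketch. The embedding dimension, temperature, and loss form are generic contrastive-learning assumptions, not the Ego$^2$-Map paper's exact objective.

```python
# Sketch: contrast egocentric-view embeddings against semantic-map embeddings.
import torch
import torch.nn.functional as F

def contrastive_loss(ego_emb, map_emb, temperature=0.07):
    """InfoNCE over a batch: the i-th egocentric view should match
    the i-th semantic-map embedding and repel all others."""
    ego = F.normalize(ego_emb, dim=-1)
    maps = F.normalize(map_emb, dim=-1)
    logits = ego @ maps.t() / temperature    # (B, B) similarity matrix
    targets = torch.arange(ego.size(0))      # positives on the diagonal
    return F.cross_entropy(logits, targets)

loss = contrastive_loss(torch.randn(16, 128), torch.randn(16, 128))
```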
- Palm up: Playing in the Latent Manifold for Unsupervised Pretraining [31.92145741769497]
We propose an algorithm that exhibits exploratory behavior while utilizing large, diverse datasets.
Our key idea is to leverage deep generative models that are pretrained on static datasets and introduce a dynamics model in the latent space.
We then employ an unsupervised reinforcement learning algorithm to explore in this environment and perform unsupervised representation learning on the collected data.
arXiv Detail & Related papers (2022-10-19T22:26:12Z)
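A minimal sketch of the idea in the entry above: pair a frozen generative model (pretrained on static data) with a learned dynamics model in its latent space, so an agent can "act" by stepping latents forward and decoding observations. The network sizes and linear decoder are placeholders, not the paper's architecture.

```python
# Sketch: latent-space dynamics over a frozen pretrained decoder.
import torch
import torch.nn as nn

latent_dim, action_dim = 64, 8

# Frozen decoder standing in for a generative model pretrained on static data.
decoder = nn.Linear(latent_dim, 3 * 64 * 64).requires_grad_(False)

# Learned dynamics in the latent space: (z_t, a_t) -> z_{t+1}.
dynamics = nn.Sequential(
    nn.Linear(latent_dim + action_dim, 256), nn.ReLU(),
    nn.Linear(256, latent_dim),
)

z = torch.randn(1, latent_dim)              # initial latent state
for _ in range(10):                         # roll out a latent trajectory
    a = torch.randn(1, action_dim)          # exploratory action
    z = dynamics(torch.cat([z, a], dim=-1))
    obs = decoder(z).view(1, 3, 64, 64)     # decoded "observation" for RL / representation learning
```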
- Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation [87.52136927091712]
We address the practical yet challenging problem of training robot agents to navigate an environment by following a path described in natural-language instructions.
To achieve accurate and efficient navigation, it is critical to build a map that accurately represents both spatial location and the semantic information of the environment objects.
We propose a multi-granularity map, which contains both fine-grained object details (e.g., color, texture) and semantic classes, to represent objects more comprehensively.
arXiv Detail & Related papers (2022-10-14T04:23:27Z)
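To make the multi-granularity idea above concrete, here is a toy data structure: each map cell stores a coarse semantic class plus fine-grained attributes, so an instruction like "the red sofa" can be resolved against the map. The field names and query helper are illustrative assumptions, not the paper's actual representation.

```python
# Sketch: a sparse top-down map with two granularities per cell.
from dataclasses import dataclass, field

@dataclass
class CellEntry:
    semantic_class: str                             # coarse granularity, e.g. "sofa"
    attributes: dict = field(default_factory=dict)  # fine granularity: color, texture, ...

# Sparse map keyed by grid coordinates.
grid_map: dict[tuple[int, int], CellEntry] = {}
grid_map[(4, 7)] = CellEntry("sofa", {"color": "red", "texture": "fabric"})
grid_map[(4, 8)] = CellEntry("sofa", {"color": "red", "texture": "fabric"})

def cells_matching(grid_map, **attrs):
    """Resolve an instruction like 'go to the red sofa' against the map."""
    return [xy for xy, c in grid_map.items()
            if all(c.attributes.get(k) == v for k, v in attrs.items())]

print(cells_matching(grid_map, color="red"))        # [(4, 7), (4, 8)]
```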
- Pathdreamer: A World Model for Indoor Navigation [62.78410447776939]
We introduce Pathdreamer, a visual world model for agents navigating in novel indoor environments.
Given one or more previous visual observations, Pathdreamer generates plausible high-resolution 360° visual observations.
In regions of high uncertainty, Pathdreamer can predict diverse scenes, allowing an agent to sample multiple realistic outcomes.
arXiv Detail & Related papers (2021-05-18T18:13:53Z)
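A minimal sketch of the "sample multiple realistic outcomes" idea above: conditioning a generator on the current observation plus independent noise draws yields several plausible predicted futures in uncertain regions. The stand-in generator below is a placeholder, not Pathdreamer's actual hierarchical model.

```python
# Sketch: drawing k diverse future predictions from one observation context.
import torch
import torch.nn as nn

noise_dim = 32

# Stand-in generator: (observation features, noise) -> predicted scene features.
generator = nn.Sequential(nn.Linear(128 + noise_dim, 256), nn.ReLU(),
                          nn.Linear(256, 128))

def sample_outcomes(obs_feat, k=5):
    """In uncertain regions, draw k noise codes to get k plausible futures."""
    z = torch.randn(k, noise_dim)
    ctx = obs_feat.expand(k, -1)
    return generator(torch.cat([ctx, z], dim=-1))   # (k, feature_dim)

futures = sample_outcomes(torch.randn(1, 128), k=5)
# An agent could score each sampled future with its policy or value model
# and plan against the distribution rather than a single point estimate.
```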
- Rapid Exploration for Open-World Navigation with Latent Goal Models [78.45339342966196]
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images.
We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration.
arXiv Detail & Related papers (2021-04-12T23:14:41Z)
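As a rough sketch of the information-bottleneck regularization mentioned above, here is a variational goal encoder that compresses goal features into a compact latent with a KL penalty toward a unit Gaussian prior, so feasible goals can later be sampled from the prior. The dimensions, beta weight, and placeholder task loss are illustrative assumptions, not the paper's objective.

```python
# Sketch: variational information bottleneck over goal representations.
import torch
import torch.nn as nn

class GoalEncoder(nn.Module):
    """Variational encoder: goal-image features -> compact latent goal z."""
    def __init__(self, feat_dim=128, z_dim=16):
        super().__init__()
        self.mu = nn.Linear(feat_dim, z_dim)
        self.log_var = nn.Linear(feat_dim, z_dim)

    def forward(self, g):
        mu, log_var = self.mu(g), self.log_var(g)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterization
        # KL(q(z|g) || N(0, I)): the bottleneck term that compresses goals
        kl = 0.5 * (mu.pow(2) + log_var.exp() - 1.0 - log_var).sum(-1).mean()
        return z, kl

enc = GoalEncoder()
z, kl = enc(torch.randn(8, 128))
task_loss = z.pow(2).mean()     # placeholder for the learned distance/action objective
loss = task_loss + 1e-3 * kl    # beta-weighted information bottleneck
loss.backward()
# Sampling z ~ N(0, I) and decoding it yields candidate exploration goals.
```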
- SOON: Scenario Oriented Object Navigation with Graph-based Exploration [102.74649829684617]
The ability to navigate like a human towards a language-guided target from anywhere in a 3D embodied environment is one of the 'holy grail' goals of intelligent robots.
Most visual navigation benchmarks focus on navigating toward a target from a fixed starting point, guided by an elaborate set of step-by-step instructions.
This setting deviates from real-world problems, in which a human only describes what the object and its surroundings look like and asks the robot to start navigating from anywhere.
arXiv Detail & Related papers (2021-03-31T15:01:04Z)
- Occupancy Anticipation for Efficient Exploration and Navigation [97.17517060585875]
We propose occupancy anticipation, where the agent uses its egocentric RGB-D observations to infer the occupancy state beyond the visible regions.
By exploiting context in both the egocentric views and top-down maps, our model successfully anticipates a broader map of the environment.
Our approach is the winning entry in the 2020 Habitat PointNav Challenge.
arXiv Detail & Related papers (2020-08-21T03:16:51Z)
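A minimal sketch of the occupancy-anticipation setup in the entry above, assuming a toy convolutional model: egocentric RGB-D is mapped to a local top-down occupancy window and supervised everywhere, including cells outside the visible field of view. The architecture and map size are placeholders, not the winning entry's model.

```python
# Sketch: predict a top-down occupancy window larger than the visible region.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(4, 32, kernel_size=3, padding=1), nn.ReLU(),  # RGB-D = 4 channels
    nn.Conv2d(32, 2, kernel_size=1),                        # 2 maps: occupied, explored
    nn.AdaptiveAvgPool2d(64),                               # 64x64 local top-down window
)

rgbd = torch.rand(8, 4, 128, 128)                           # egocentric RGB-D batch
target = torch.randint(0, 2, (8, 2, 64, 64)).float()        # ground-truth occupancy
logits = model(rgbd)
# Supervise the full window, including cells beyond the visible field of view.
loss = nn.functional.binary_cross_entropy_with_logits(logits, target)
loss.backward()
```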
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.