SeanNet: Semantic Understanding Network for Localization Under Object Dynamics
- URL: http://arxiv.org/abs/2110.02276v1
- Date: Tue, 5 Oct 2021 18:29:07 GMT
- Title: SeanNet: Semantic Understanding Network for Localization Under Object Dynamics
- Authors: Xiao Li, Yidong Du, Zhen Zeng, Odest Chadwicke Jenkins
- Abstract summary: Under the object-level scene dynamics induced by human daily activities, a robot needs to robustly localize itself in the environment.
Previous works have addressed vision-based localization in static environments, yet object-level scene dynamics challenge existing methods during long-term deployment of the robot.
This paper proposes SEmantic understANding Network (SeanNet) that enables robots to measure the similarity between two scenes on both visual and semantic aspects.
- Score: 14.936899865448892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We aim for domestic robots to operate indoors for long-term service. Under the
object-level scene dynamics induced by human daily activities, a robot needs to
robustly localize itself in the environment subject to scene uncertainties.
Previous works have addressed vision-based localization in static environments,
yet object-level scene dynamics challenge existing methods during long-term
deployment of the robot. This paper proposes SEmantic understANding Network
(SeanNet) that enables robots to measure the similarity between two scenes on
both visual and semantic aspects. We further develop a similarity-based
localization method based on SeanNet for monitoring the progress of visual
navigation tasks. In our experiments, we benchmarked SeanNet against baseline
methods on scene similarity measures, as well as on visual navigation performance
once integrated with a visual navigator. We demonstrate that SeanNet
outperforms all baseline methods by robustly localizing the robot under object
dynamics, thus reliably informing visual navigation about the task status.
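The abstract gives no implementation detail, so the following is only a minimal Python sketch of the idea: blend a visual-feature similarity with a semantic (object-set) overlap into one scene score, and declare the robot localized when the score clears a threshold. All function names, the 0.5 weighting, and the 0.8 threshold are illustrative assumptions, not the paper's learned model.

```python
# A minimal sketch of similarity-based localization in the spirit of SeanNet.
# The real network learns a joint visual-semantic embedding; here we
# approximate the idea with a hand-crafted score. All names, weights, and
# thresholds are illustrative assumptions, not the paper's method.
import numpy as np

def visual_similarity(feat_a: np.ndarray, feat_b: np.ndarray) -> float:
    """Cosine similarity between two visual feature vectors."""
    return float(np.dot(feat_a, feat_b) /
                 (np.linalg.norm(feat_a) * np.linalg.norm(feat_b) + 1e-8))

def semantic_similarity(objs_a: set, objs_b: set) -> float:
    """Jaccard overlap between the object labels detected in each scene."""
    if not objs_a and not objs_b:
        return 1.0
    return len(objs_a & objs_b) / len(objs_a | objs_b)

def scene_similarity(feat_a, objs_a, feat_b, objs_b, w_vis=0.5) -> float:
    """Blend visual and semantic cues into one scene-similarity score."""
    return (w_vis * visual_similarity(feat_a, feat_b)
            + (1.0 - w_vis) * semantic_similarity(objs_a, objs_b))

def localized_at(goal_feat, goal_objs, cur_feat, cur_objs, thresh=0.8) -> bool:
    """Declare the navigation goal reached when similarity clears a threshold."""
    return scene_similarity(goal_feat, goal_objs, cur_feat, cur_objs) >= thresh

# Usage: a moved cup changes the semantic set, but the scene still matches.
goal = (np.random.rand(128), {"sofa", "tv", "cup"})
view = (goal[0] + 0.01 * np.random.rand(128), {"sofa", "tv"})
print(localized_at(goal[0], goal[1], view[0], view[1]))
```

Under this reading, the semantic term is what keeps the score stable when an object is moved or removed, the failure mode of purely visual matching that the paper targets.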
Related papers
- OpenObject-NAV: Open-Vocabulary Object-Oriented Navigation Based on Dynamic Carrier-Relationship Scene Graph [10.475404599532157]
This paper captures the relationships between frequently used objects and their static carriers.
We propose an instance navigation strategy that models the navigation process as a Markov Decision Process.
The results demonstrate that by updating the CRSG, the robot can efficiently navigate to moved targets.
arXiv Detail & Related papers (2024-09-27T13:33:52Z)
- DISCO: Embodied Navigation and Interaction via Differentiable Scene Semantics and Dual-level Control [53.80518003412016]
Building a general-purpose intelligent home-assistant agent skilled in diverse tasks by human commands is a long-term blueprint of embodied AI research.
We study primitive mobile manipulations for embodied agents, i.e. how to navigate and interact based on an instructed verb-noun pair.
We propose DISCO, which features non-trivial advancements in contextualized scene modeling and efficient controls.
arXiv Detail & Related papers (2024-07-20T05:39:28Z)
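As a toy illustration of the instructed verb-noun interface in the DISCO summary above, the sketch below dispatches a command into a navigation step followed by an interaction primitive. The skill registry and all names are hypothetical; DISCO learns these controls rather than dispatching on strings.

```python
# Toy verb-noun dispatch: navigate to the noun, then apply the verb as a
# low-level primitive. Illustrative assumption only, not DISCO's controller.
SKILLS = {"pick": "close_gripper", "open": "pull_handle", "push": "extend_arm"}

def execute(instruction: str) -> list:
    """Turn 'verb noun' into a two-stage plan: navigation, then interaction."""
    verb, noun = instruction.lower().split(maxsplit=1)
    if verb not in SKILLS:
        raise ValueError(f"no skill for verb '{verb}'")
    return [f"navigate_to('{noun}')",      # high-level: reach the object
            f"{SKILLS[verb]}('{noun}')"]   # low-level: interact with it

print(execute("pick cup"))  # ["navigate_to('cup')", "close_gripper('cup')"]
```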
- Mapping High-level Semantic Regions in Indoor Environments without Object Recognition [50.624970503498226]
The present work proposes a method for semantic region mapping via embodied navigation in indoor environments.
To enable region identification, the method uses a vision-to-language model to provide scene information for mapping.
By projecting egocentric scene understanding into the global frame, the proposed method generates a semantic map as a distribution over possible region labels at each location.
arXiv Detail & Related papers (2024-03-11T18:09:50Z)
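The per-location distribution over region labels described in the summary above can be pictured as a grid of categorical distributions updated with each observation. Below is a minimal numpy sketch under that reading; the vision-to-language scoring and the egocentric-to-global projection are stubbed out, and all names are assumptions.

```python
# Keep a per-cell distribution over region labels and fuse observations
# with a Bayesian update. Illustrative, not the paper's implementation.
import numpy as np

REGIONS = ["kitchen", "bedroom", "bathroom", "living room"]
H = W = 10
# Start from a uniform distribution over region labels at every cell.
region_map = np.full((H, W, len(REGIONS)), 1.0 / len(REGIONS))

def update_cell(region_map, row, col, label_scores):
    """Bayesian update: fuse a new observation with the cell's prior."""
    posterior = region_map[row, col] * np.asarray(label_scores)
    region_map[row, col] = posterior / posterior.sum()

# Two observations at the same cell sharpen the belief toward 'kitchen'.
update_cell(region_map, 3, 4, [0.7, 0.1, 0.1, 0.1])
update_cell(region_map, 3, 4, [0.6, 0.2, 0.1, 0.1])
print(REGIONS[int(np.argmax(region_map[3, 4]))])  # kitchen
```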
- Interactive Semantic Map Representation for Skill-based Visual Object Navigation [43.71312386938849]
This paper introduces a new representation of a scene semantic map formed during the embodied agent interaction with the indoor environment.
We have implemented this representation into a full-fledged navigation approach called SkillTron.
The proposed approach makes it possible to form both intermediate goals for robot exploration and the final goal for object navigation.
arXiv Detail & Related papers (2023-11-07T16:30:12Z)
- NoMaD: Goal Masked Diffusion Policies for Navigation and Exploration [57.15811390835294]
This paper describes how we can train a single unified diffusion policy to handle both goal-directed navigation and goal-agnostic exploration.
We show that this unified policy results in better overall performance when navigating to visually indicated goals in novel environments.
Our experiments, conducted on a real-world mobile robot platform, show effective navigation in unseen environments in comparison with five alternative methods.
arXiv Detail & Related papers (2023-10-11T21:07:14Z)
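The unifying trick in the NoMaD summary, one policy for both goal-directed navigation and goal-agnostic exploration, can be sketched as goal masking on the conditioning vector. The numpy snippet below shows only that input format; the diffusion action head is omitted, and all shapes and names are assumptions.

```python
# One conditioning format serves both modes: goal-directed (mask=1) and
# goal-agnostic exploration (mask=0, goal features zeroed). Illustrative only.
import numpy as np

def policy_input(obs_emb: np.ndarray, goal_emb: np.ndarray, use_goal: bool):
    """Build the conditioning vector for a single unified policy."""
    mask = 1.0 if use_goal else 0.0
    # Masked goal features plus an explicit mask flag let the same network
    # learn both behaviors from one set of weights.
    return np.concatenate([obs_emb, mask * goal_emb, [mask]])

obs, goal = np.random.rand(64), np.random.rand(64)
nav_input = policy_input(obs, goal, use_goal=True)        # goal-directed
explore_input = policy_input(obs, goal, use_goal=False)   # exploration
print(nav_input.shape, explore_input.shape)  # (129,) (129,)
```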
- Learning Navigational Visual Representations with Semantic Map Supervision [85.91625020847358]
We propose a navigational-specific visual representation learning method by contrasting the agent's egocentric views and semantic maps.
Ego²-Map learning transfers the compact and rich information from a map, such as objects, structure, and transitions, to the agent's egocentric representations for navigation.
arXiv Detail & Related papers (2023-07-23T14:01:05Z)
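A standard way to realize the contrast between egocentric views and semantic maps mentioned above is an InfoNCE objective over paired embeddings. The numpy sketch below is that generic objective under assumed shapes and a 0.1 temperature, not the paper's training code.

```python
# Generic InfoNCE between view embeddings and their paired map embeddings:
# matched pairs are pulled together, mismatched pairs pushed apart.
import numpy as np

def info_nce(view_emb: np.ndarray, map_emb: np.ndarray, temp: float = 0.1):
    """Contrastive loss over a batch of (view, map) pairs."""
    v = view_emb / np.linalg.norm(view_emb, axis=1, keepdims=True)
    m = map_emb / np.linalg.norm(map_emb, axis=1, keepdims=True)
    logits = v @ m.T / temp                    # pairwise similarities
    labels = np.arange(len(v))                 # i-th view matches i-th map
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[labels, labels].mean()

views, maps_ = np.random.rand(4, 32), np.random.rand(4, 32)
print(info_nce(views, maps_))
```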
- How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
arXiv Detail & Related papers (2023-05-26T13:38:33Z)
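The semantic-frontier idea above, attaching language-prior scores to geometric frontiers, can be sketched as frontier ranking by object co-occurrence. The co-occurrence table below is a made-up stand-in for the language priors and scene statistics the paper propagates.

```python
# Rank geometric frontiers by how strongly nearby observed objects co-occur
# with the target. The prior table is a hypothetical stand-in.
from typing import Dict, List, Tuple

# Hypothetical prior: P(target nearby | observed object).
CO_OCCURRENCE: Dict[Tuple[str, str], float] = {
    ("mug", "sink"): 0.8, ("mug", "sofa"): 0.2, ("mug", "bed"): 0.1,
}

def score_frontier(target: str, nearby_objects: List[str]) -> float:
    """Propagate semantic knowledge onto a geometric frontier."""
    scores = [CO_OCCURRENCE.get((target, o), 0.05) for o in nearby_objects]
    return max(scores) if scores else 0.05

frontiers = {"frontier_A": ["sink"], "frontier_B": ["sofa", "bed"]}
best = max(frontiers, key=lambda f: score_frontier("mug", frontiers[f]))
print(best)  # frontier_A: the sink region is most promising for a mug
```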
- Sparse Image based Navigation Architecture to Mitigate the need of precise Localization in Mobile Robots [3.1556608426768324]
This paper focuses on mitigating the need for exact localization of a mobile robot to pursue autonomous navigation using a sparse set of images.
The proposed method consists of a model architecture, RoomNet, for unsupervised learning that yields a coarse identification of the environment.
The method then uses sparse image matching to characterize the similarity between the frames the robot currently sees and the frames viewed during the mapping and training stage.
arXiv Detail & Related papers (2022-03-29T06:38:18Z)
- ViNG: Learning Open-World Navigation with Visual Goals [82.84193221280216]
We propose a learning-based navigation system for reaching visually indicated goals.
We show that our system, which we call ViNG, outperforms previously proposed methods for goal-conditioned reinforcement learning.
We demonstrate ViNG on a number of real-world applications, such as last-mile delivery and warehouse inspection.
arXiv Detail & Related papers (2020-12-17T18:22:32Z)
- One-Shot Informed Robotic Visual Search in the Wild [29.604267552742026]
We consider the task of underwater robot navigation for the purpose of collecting scientifically relevant video data for environmental monitoring.
The majority of field robots that currently perform monitoring tasks in unstructured natural environments navigate by path-tracking a pre-specified sequence of waypoints.
We propose a method that enables informed visual navigation via a learned visual similarity operator that guides the robot's visual search towards parts of the scene that look like exemplar images.
arXiv Detail & Related papers (2020-03-22T22:14:42Z)
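The learned visual similarity operator in the last summary can be approximated, for illustration only, by scoring image patches of the current view against an exemplar feature and steering toward the best match. The mean-color featurizer below is a deliberate stub for the learned operator.

```python
# Exemplar-guided visual search: score patches of the current view against
# an exemplar feature and steer toward the best match. Illustrative stub.
import numpy as np

def patch_features(image: np.ndarray, grid: int = 4) -> np.ndarray:
    """Split the view into grid x grid patches; feature = mean color."""
    h, w = image.shape[0] // grid, image.shape[1] // grid
    return np.array([[image[i*h:(i+1)*h, j*w:(j+1)*w].mean(axis=(0, 1))
                      for j in range(grid)] for i in range(grid)])

def best_patch(image: np.ndarray, exemplar_feat: np.ndarray):
    """Return the (row, col) of the patch most similar to the exemplar."""
    feats = patch_features(image)
    dists = np.linalg.norm(feats - exemplar_feat, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)

view = np.random.rand(64, 64, 3)
exemplar = np.array([0.9, 0.2, 0.2])   # e.g., a reddish coral exemplar
print(best_patch(view, exemplar))      # steer the camera toward this patch
```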