Object Goal Navigation Based on Semantics and RGB Ego View
- URL: http://arxiv.org/abs/2210.11543v1
- Date: Thu, 20 Oct 2022 19:23:08 GMT
- Title: Object Goal Navigation Based on Semantics and RGB Ego View
- Authors: Snehasis Banerjee, Brojeshwar Bhowmick, Ruddra Dev Roychoudhury
- Abstract summary: This paper presents an architecture and methodology to empower a service robot to navigate an indoor environment with semantic decision making, given an RGB ego view.
The robot navigates based on a GeoSem map, a relational combination of a geometric and a semantic map.
The presented approach was found to outperform human users in gamified evaluations with respect to average completion time.
- Score: 9.702784248870522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents an architecture and methodology to empower a service
robot to navigate an indoor environment with semantic decision making, given an
RGB ego view. The method leverages knowledge of the robot's actuation
capability and of scenes, objects and their relations, represented in a
semantic form. The robot navigates based on a GeoSem map, a relational
combination of a geometric and a semantic map. The goal given to the robot is to
find an object in an unknown environment with no navigational map and only
egocentric RGB camera perception. The approach is tested both in a simulation
environment and in real-life indoor settings. The presented approach was found to
outperform human users in gamified evaluations with respect to average
completion time.
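The abstract describes the GeoSem map only at a high level. As a rough illustration, such a map can be pictured as a geometric occupancy grid joined relationally to a semantic store of object sightings and object-object relations. The Python sketch below is a minimal, hypothetical reading of that idea; the class, its methods, and the relation format are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

class GeoSemMap:
    """Hypothetical sketch of a 'GeoSem'-style map: a geometric
    occupancy grid linked to a semantic store of objects and relations."""

    def __init__(self, size=(200, 200), resolution=0.05):
        self.grid = np.zeros(size, dtype=np.int8)  # 0 = free/unseen, 1 = occupied
        self.resolution = resolution               # metres per grid cell (assumed)
        self.objects = {}                          # label -> list of grid cells
        self.relations = []                        # (subject, predicate, object) triples

    def add_object(self, label, cell):
        """Register a detected object at a grid cell and mark the cell occupied."""
        self.objects.setdefault(label, []).append(cell)
        self.grid[cell] = 1

    def relate(self, subj, predicate, obj):
        """Store a semantic relation, e.g. ('cup', 'on', 'table')."""
        self.relations.append((subj, predicate, obj))

    def candidate_cells(self, goal_label):
        """Cells worth visiting for a goal object: its own past sightings,
        plus sightings of objects it is semantically related to."""
        cells = list(self.objects.get(goal_label, []))
        for s, _, o in self.relations:
            if goal_label in (s, o):
                other = o if s == goal_label else s
                cells.extend(self.objects.get(other, []))
        return cells

# Usage: even if the goal 'cup' has never been observed, the prior
# relation ('cup', 'on', 'table') steers search toward observed tables.
m = GeoSemMap()
m.add_object("table", (120, 80))
m.relate("cup", "on", "table")
print(m.candidate_cells("cup"))  # -> [(120, 80)]
```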
Related papers
- Autonomous Exploration and Semantic Updating of Large-Scale Indoor Environments with Mobile Robots [1.8791971592960612]
We introduce a new robotic system that enables a mobile robot to autonomously explore an unknown environment.
The robot can semantically map a 93 m x 90 m floor and update the semantic map when objects are moved in the environment.
arXiv Detail & Related papers (2024-09-23T19:25:03Z)
- Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris.
Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models.
We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z)
- Learning Semantic Traversability with Egocentric Video and Automated Annotation Strategy [3.713586225621126]
A robot must be able to identify semantically traversable terrain in an image based on its semantic understanding of the scene.
This reasoning over semantic traversability is frequently achieved using semantic segmentation models fine-tuned on the testing domain.
We present an effective methodology for training a semantic traversability estimator using egocentric videos and an automated annotation process.
arXiv Detail & Related papers (2024-06-05T06:40:04Z)
- Interactive Semantic Map Representation for Skill-based Visual Object Navigation [43.71312386938849]
This paper introduces a new representation of a scene semantic map formed during the embodied agent's interaction with the indoor environment.
We have implemented this representation into a full-fledged navigation approach called SkillTron.
The proposed approach makes it possible to form both intermediate goals for robot exploration and the final goal for object navigation.
arXiv Detail & Related papers (2023-11-07T16:30:12Z)
- Object Goal Navigation with Recursive Implicit Maps [92.6347010295396]
We propose an implicit spatial map for object goal navigation.
Our method significantly outperforms the state of the art on the challenging MP3D dataset.
We deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes.
arXiv Detail & Related papers (2023-08-10T14:21:33Z)
- Gesture2Path: Imitation Learning for Gesture-aware Navigation [54.570943577423094]
We present Gesture2Path, a novel social navigation approach that combines image-based imitation learning with model-predictive control.
We deploy our method on real robots and showcase the effectiveness of our approach in four gesture-navigation scenarios.
arXiv Detail & Related papers (2022-09-19T23:05:36Z)
- Rapid Exploration for Open-World Navigation with Latent Goal Models [78.45339342966196]
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images.
We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration.
arXiv Detail & Related papers (2021-04-12T23:14:41Z)
- SOON: Scenario Oriented Object Navigation with Graph-based Exploration [102.74649829684617]
The ability to navigate like a human towards a language-guided target from anywhere in a 3D embodied environment is one of the 'holy grail' goals of intelligent robots.
Most visual navigation benchmarks focus on navigating toward a target from a fixed starting point, guided by an elaborate set of instructions that depicts the route step by step.
This setting deviates from real-world problems in which a human only describes what the object and its surroundings look like and asks the robot to start navigation from anywhere.
arXiv Detail & Related papers (2021-03-31T15:01:04Z)
- Few-Shot Visual Grounding for Natural Human-Robot Interaction [0.0]
We propose a software architecture that segments a target object, indicated verbally by a human user, from a crowded scene.
At the core of our system, we employ a multi-modal deep neural network for visual grounding.
We evaluate the performance of the proposed model on real RGB-D data collected from public scene datasets.
arXiv Detail & Related papers (2021-03-17T15:24:02Z)
- Lifelong update of semantic maps in dynamic environments [2.343080600040765]
A robot understands its world through the raw information it senses from its surroundings.
A semantic map, containing high-level information that both the robot and user understand, is better suited to be a shared representation.
We use the semantic map as the user-facing interface on our fleet of floor-cleaning robots.
arXiv Detail & Related papers (2020-10-17T18:44:33Z)
- Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments.
We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently (a minimal sketch of this episodic-map idea follows this list).
arXiv Detail & Related papers (2020-07-01T17:52:32Z)
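Several entries above build a top-down episodic semantic map from egocentric observations and choose navigation goals from it, most explicitly 'Object Goal Navigation using Goal-Oriented Semantic Exploration'. The sketch below is a minimal, hypothetical Python rendering of that loop under stated assumptions; the channel layout, fusion rule, detection threshold, and frontier pick are inventions for illustration, not the published method.

```python
import numpy as np

NUM_CLASSES = 16   # assumed number of semantic categories
MAP_SIZE = 240     # assumed map side length in cells

def update_episodic_map(sem_map, projected_obs):
    """Fuse one projected egocentric observation into the episodic map.
    Both arrays are (NUM_CLASSES, MAP_SIZE, MAP_SIZE) with values in [0, 1];
    we keep the strongest evidence seen so far per class and cell."""
    return np.maximum(sem_map, projected_obs)

def select_goal(sem_map, goal_class, explored):
    """If the goal category has been observed, head to its strongest cell;
    otherwise pick an unexplored cell (frontier-style) to keep exploring."""
    goal_channel = sem_map[goal_class]
    if goal_channel.max() > 0.5:  # assumed detection threshold
        return np.unravel_index(goal_channel.argmax(), goal_channel.shape)
    frontier = np.argwhere(~explored)
    return tuple(frontier[0]) if len(frontier) else None

# One step of the loop: project the current frame's semantic predictions
# onto a top-down grid (projection omitted here), fuse, then re-plan.
sem_map = np.zeros((NUM_CLASSES, MAP_SIZE, MAP_SIZE))
explored = np.zeros((MAP_SIZE, MAP_SIZE), dtype=bool)
obs = np.zeros_like(sem_map)
obs[3, 100, 120] = 0.9  # fake evidence for class 3 ('chair', say)
sem_map = update_episodic_map(sem_map, obs)
print(select_goal(sem_map, goal_class=3, explored=explored))  # -> (100, 120)
```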