Learning hierarchical relationships for object-goal navigation
- URL: http://arxiv.org/abs/2003.06749v2
- Date: Wed, 18 Nov 2020 22:22:11 GMT
- Title: Learning hierarchical relationships for object-goal navigation
- Authors: Yiding Qiu, Anwesan Pal, Henrik I. Christensen
- Abstract summary: We present Memory-utilized Joint hierarchical Object Learning for Navigation in Indoor Rooms (MJOLNIR)
MJOLNIR is a target-driven navigation algorithm that considers the inherent relationship between target objects and the more salient contextual objects occurring in their surroundings.
Our model converges much faster than other algorithms, without suffering from the well-known overfitting problem.
- Score: 7.074818959144171
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Direct search for objects as part of navigation poses a challenge for small
items. Utilizing context in the form of object-object relationships enables
efficient hierarchical search for targets. Most current approaches
tend to directly incorporate sensory input into a reward-based learning
approach, without learning about object relationships in the natural
environment, and thus generalize poorly across domains. We present
Memory-utilized Joint hierarchical Object Learning for Navigation in Indoor
Rooms (MJOLNIR), a target-driven navigation algorithm that considers the
inherent relationship between target objects and the more salient contextual
objects occurring in their surroundings. Extensive experiments conducted across
multiple environment settings show an $82.9\%$ and $93.5\%$ gain over existing
state-of-the-art navigation methods in terms of the success rate (SR), and
success weighted by path length (SPL), respectively. We also show that our
model converges much faster than other algorithms, without suffering
from the well-known overfitting problem. Additional details regarding the
supplementary material and code are available at
https://sites.google.com/eng.ucsd.edu/mjolnir.
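The success rate (SR) and success weighted by path length (SPL) figures quoted above follow the standard embodied-navigation definitions: SR is the fraction of successful episodes, and SPL weights each success by the ratio of the shortest-path length to the length actually traveled. A minimal sketch of computing SPL from per-episode logs (the tuple field names here are illustrative, not from the paper):

```python
def spl(episodes):
    """episodes: list of (success: bool, shortest: float, taken: float).
    SPL = (1/N) * sum(S_i * l_i / max(p_i, l_i))."""
    if not episodes:
        return 0.0
    total = sum(
        (1.0 if success else 0.0) * shortest / max(taken, shortest)
        for success, shortest, taken in episodes
    )
    return total / len(episodes)

# One optimal success, one success with a detour, one failure.
print(spl([(True, 5.0, 5.0), (True, 5.0, 10.0), (False, 5.0, 7.0)]))  # → 0.5
```

The `max(taken, shortest)` guard keeps the per-episode term at most 1 even if the logged path is shorter than the computed shortest path due to discretization.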
Related papers
- Leveraging Large Language Model-based Room-Object Relationships Knowledge for Enhancing Multimodal-Input Object Goal Navigation [11.510823733292519]
We propose a data-driven, modular-based approach, trained on a dataset that incorporates common-sense knowledge of object-to-room relationships extracted from a large language model.
The results in the Habitat simulator demonstrate that our framework outperforms the baseline by an average of 10.6% in the efficiency metric, Success weighted by Path Length (SPL).
arXiv Detail & Related papers (2024-03-21T06:32:36Z) - Zero-Shot Object Goal Visual Navigation With Class-Independent Relationship Network [3.0820097046465285]
"Zero-shot" means that the target the agent needs to find is not trained during the training phase.
We propose the Class-Independent Relationship Network (CIRN) to address the issue of coupling navigation ability with target features during training.
Our method outperforms the current state-of-the-art approaches in the zero-shot object goal visual navigation task.
arXiv Detail & Related papers (2023-10-15T16:42:14Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
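The PCA-based localization idea above can be sketched as follows: project each pixel's feature vector onto the first principal component of the feature map and threshold the projection. This is a generic sketch of the technique, not the paper's exact pipeline; the assumption that the object occupies the minority of pixels is mine, used to resolve the principal component's sign ambiguity.

```python
import numpy as np

def pca_localize(feat):
    """Project each pixel's feature onto the first principal component and
    threshold at zero to get a coarse foreground mask (feat: H x W x C)."""
    h, w, c = feat.shape
    x = feat.reshape(-1, c).astype(np.float64)
    x -= x.mean(axis=0)                      # center each channel
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    proj = x @ vt[0]                         # per-pixel score along the 1st PC
    mask = proj > 0
    if mask.mean() > 0.5:                    # assumption: object is the minority
        mask = ~mask                         # fixes the PC's sign ambiguity
    return mask.reshape(h, w)
```

On real feature maps (e.g., from a self-supervised backbone) the same projection gives a soft saliency map before thresholding.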
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
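The propagation step described above can be sketched as scoring each geometric frontier by the relevance of nearby observed objects, decayed with graph distance. This is a hypothetical illustration of the general idea, not the paper's algorithm; the decay factor and the max-over-objects aggregation are my assumptions.

```python
from collections import deque

def score_frontiers(graph, object_scores, frontiers, decay=0.5):
    """graph: {node: [neighbors]}; object_scores: {node: goal relevance of an
    observed object (e.g., from a language prior)}; frontiers: frontier nodes.
    Each frontier keeps the best hop-decayed score over all scored objects."""
    best = {f: 0.0 for f in frontiers}
    for src, s in object_scores.items():
        dist = {src: 0}                    # BFS hop distances from this object
        q = deque([src])
        while q:
            u = q.popleft()
            for v in graph.get(u, []):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        for f in frontiers:
            if f in dist:
                best[f] = max(best[f], s * decay ** dist[f])
    return best
```

The agent would then greedily navigate toward the highest-scoring frontier and rescore as new objects are observed.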
arXiv Detail & Related papers (2023-05-26T13:38:33Z) - ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation [75.13546386761153]
We present a novel zero-shot object navigation method, Exploration with Soft Commonsense constraints (ESC).
ESC transfers commonsense knowledge in pre-trained models to open-world object navigation without any navigation experience.
Experiments on MP3D, HM3D, and RoboTHOR benchmarks show that our ESC method improves significantly over baselines.
arXiv Detail & Related papers (2023-01-30T18:37:32Z) - A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations [20.15854546504947]
We propose a modular framework for object-nav that is able to efficiently search indoor environments for not just static objects but also movable objects.
Our contextual-bandit agent efficiently explores the environment by showing optimism in the face of uncertainty.
We evaluate our algorithms in two simulated environments and a real-world setting, to demonstrate high sample efficiency and reliability.
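"Optimism in the face of uncertainty" is the classic upper-confidence-bound idea: prefer the option whose plausible best-case value is highest. A minimal sketch using plain UCB1 over discrete search regions (this is the generic bandit principle, not the paper's contextual algorithm; the two-region toy task is invented for illustration):

```python
import math
import random

class UCB1:
    """Plain UCB1: pick the arm maximizing empirical mean reward plus an
    exploration bonus sqrt(2 ln t / n) -- optimism under uncertainty."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms     # pulls per arm
        self.values = [0.0] * n_arms   # running mean reward per arm
        self.t = 0

    def select(self):
        self.t += 1
        for arm, n in enumerate(self.counts):
            if n == 0:                 # try every arm once before comparing
                return arm
        return max(range(len(self.counts)),
                   key=lambda a: self.values[a]
                   + math.sqrt(2 * math.log(self.t) / self.counts[a]))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Toy search: region 1 contains the target far more often than region 0.
random.seed(0)
bandit = UCB1(2)
for _ in range(500):
    arm = bandit.select()
    reward = 1.0 if random.random() < (0.2, 0.8)[arm] else 0.0
    bandit.update(arm, reward)
```

As the bonus term shrinks for well-sampled arms, the agent concentrates its pulls on the region that actually pays off.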
arXiv Detail & Related papers (2022-11-29T15:48:54Z) - Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation [87.52136927091712]
We address a practical yet challenging problem of training robot agents to navigate in an environment following a path described by some language instructions.
To achieve accurate and efficient navigation, it is critical to build a map that accurately represents both spatial location and the semantic information of the environment objects.
We propose a multi-granularity map, which contains both object fine-grained details (e.g., color, texture) and semantic classes, to represent objects more comprehensively.
arXiv Detail & Related papers (2022-10-14T04:23:27Z) - PONI: Potential Functions for ObjectGoal Navigation with Interaction-free Learning [125.22462763376993]
We propose Potential functions for ObjectGoal Navigation with Interaction-free learning (PONI).
PONI disentangles the skills of 'where to look?' for an object and 'how to navigate to (x, y)?'
arXiv Detail & Related papers (2022-01-25T01:07:32Z) - Towards Optimal Correlational Object Search [25.355936023640506]
The Correlational Object Search POMDP can be solved to produce search strategies that use correlational information.
We conduct experiments using AI2-THOR, a realistic simulator of household environments, as well as YOLOv5, a widely-used object detector.
arXiv Detail & Related papers (2021-10-19T14:03:43Z) - Visuomotor Mechanical Search: Learning to Retrieve Target Objects in Clutter [43.668395529368354]
We present a novel deep RL procedure that combines (i) teacher-aided exploration, (ii) a critic with privileged information, and (iii) mid-level representations.
Our approach trains faster and converges to more efficient uncovering solutions than baselines and ablations, and our uncovering policies lead to an average improvement in the graspability of the target object.
arXiv Detail & Related papers (2020-08-13T18:23:00Z) - Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments.
We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently.
arXiv Detail & Related papers (2020-07-01T17:52:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.