Landmark Policy Optimization for Object Navigation Task
- URL: http://arxiv.org/abs/2109.09512v1
- Date: Fri, 17 Sep 2021 12:28:46 GMT
- Title: Landmark Policy Optimization for Object Navigation Task
- Authors: Aleksey Staroverov, Aleksandr I. Panov
- Abstract summary: This work studies the object goal navigation task, which involves navigating to the closest object of a given semantic category in unseen environments.
Recent works have shown significant achievements with both the end-to-end Reinforcement Learning approach and modular systems, but a big step forward is still needed for them to be robust and optimal.
We propose a hierarchical method that incorporates the standard task formulation and additional area knowledge as landmarks, together with a way to extract these landmarks.
- Score: 77.34726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work studies the object goal navigation task, which involves navigating to
the closest object of a given semantic category in unseen
environments. Recent works have shown significant achievements with both the
end-to-end Reinforcement Learning approach and modular systems, but a big
step forward is still needed for them to be robust and optimal. We propose a
hierarchical method that incorporates the standard task formulation and
additional area knowledge as landmarks, together with a way to extract these
landmarks. In the hierarchy, the low level consists of separately trained
algorithms for the most intuitive skills, and the high level decides which
skill is needed at a given moment. With all proposed solutions, we achieve a
0.75 success rate in the realistic Habitat simulator. After a short stage of
additional model training in a reconstructed virtual area in the simulator, we
successfully confirmed our results in a real-world case.
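The two-level structure described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Observation` fields, the three skill names, the placeholder actions, and the hand-written `select_skill` rule are hypothetical and are not taken from the paper, where the skills are separately trained algorithms and the abstract does not specify how the high level makes its decision.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Observation:
    """Hypothetical, heavily simplified observation (not the paper's input)."""
    goal_visible: bool     # an object of the target category is in view
    landmark_known: bool   # a previously extracted landmark is available


# A low-level skill maps an observation to a discrete action name.
Skill = Callable[[Observation], str]


def explore(obs: Observation) -> str:
    return "turn_left"        # placeholder for a trained exploration skill


def go_to_landmark(obs: Observation) -> str:
    return "move_forward"     # placeholder for a landmark-reaching skill


def reach_goal(obs: Observation) -> str:
    return "stop"             # placeholder for a goal-reaching skill


def select_skill(obs: Observation) -> str:
    """Hand-written stand-in for the high level (assumed, not from the paper)."""
    if obs.goal_visible:
        return "reach_goal"
    if obs.landmark_known:
        return "go_to_landmark"
    return "explore"


class HierarchicalAgent:
    def __init__(self, skills: Dict[str, Skill],
                 selector: Callable[[Observation], str]):
        self.skills = skills
        self.selector = selector

    def act(self, obs: Observation) -> Tuple[str, str]:
        name = self.selector(obs)            # high level: choose a skill
        return name, self.skills[name](obs)  # low level: skill picks the action


agent = HierarchicalAgent(
    {"explore": explore, "go_to_landmark": go_to_landmark, "reach_goal": reach_goal},
    select_skill,
)
print(agent.act(Observation(goal_visible=False, landmark_known=True)))
```

Keeping the selector and the skills behind the same small interface means any one of them could be swapped for a learned policy without touching the others, which is the practical appeal of training the low-level skills separately.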
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction [19.59151245929067]
We study whether giving an agent an object-centric mapping (describing a set of items and their attributes) allows for more efficient learning.
We find this problem is best solved hierarchically by modelling items at a higher level of state abstraction than pixels.
We make use of this to propose a fully model-based algorithm that learns a discriminative world model.
arXiv Detail & Related papers (2024-08-21T17:59:31Z)
- Leveraging Large Language Model-based Room-Object Relationships Knowledge for Enhancing Multimodal-Input Object Goal Navigation [11.510823733292519]
We propose a data-driven, modular-based approach, trained on a dataset that incorporates common-sense knowledge of object-to-room relationships extracted from a large language model.
The results in the Habitat simulator demonstrate that our framework outperforms the baseline by an average of 10.6% in the efficiency metric, Success weighted by Path Length (SPL).
arXiv Detail & Related papers (2024-03-21T06:32:36Z)
- Multi-Object Navigation in real environments using hybrid policies [18.52681391843433]
We introduce a hybrid navigation method, which decomposes the problem into two different skills.
We show the advantages of this approach compared to end-to-end methods both in simulation and a real environment.
arXiv Detail & Related papers (2024-01-24T20:41:25Z)
- How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
arXiv Detail & Related papers (2023-05-26T13:38:33Z)
- Navigating to Objects in the Real World [76.1517654037993]
We present a large-scale empirical study of semantic visual navigation methods comparing methods from classical, modular, and end-to-end learning approaches.
We find that modular learning works well in the real world, attaining a 90% success rate.
In contrast, end-to-end learning does not, dropping from 77% simulation to 23% real-world success rate due to a large image domain gap between simulation and reality.
arXiv Detail & Related papers (2022-12-02T01:10:47Z)
- Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning [120.38381203153159]
Reinforcement learning can train policies that effectively perform complex tasks.
For long-horizon tasks, the performance of these methods degrades with horizon, often necessitating reasoning over and composing lower-level skills.
We propose Value Function Spaces: a simple approach that produces such a representation by using the value functions corresponding to each lower-level skill.
arXiv Detail & Related papers (2021-11-04T22:46:16Z)
- Efficient Robotic Object Search via HIEM: Hierarchical Policy Learning with Intrinsic-Extrinsic Modeling [33.89793938441333]
We present a novel policy learning paradigm for the object search task, based on hierarchical and interpretable modeling with an intrinsic-extrinsic reward setting.
Experiments conducted on the House3D environment validate and show that the robot, trained with our model, can perform the object search task in a more optimal and interpretable way.
arXiv Detail & Related papers (2020-10-16T19:21:38Z)
- Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation which involves navigating to an instance of the given object category in unseen environments.
We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently.
arXiv Detail & Related papers (2020-07-01T17:52:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.