3D-Aware Object Goal Navigation via Simultaneous Exploration and
Identification
- URL: http://arxiv.org/abs/2212.00338v3
- Date: Fri, 31 Mar 2023 03:49:07 GMT
- Title: 3D-Aware Object Goal Navigation via Simultaneous Exploration and
Identification
- Authors: Jiazhao Zhang, Liu Dai, Fanpeng Meng, Qingnan Fan, Xuelin Chen, Kai
Xu, He Wang
- Abstract summary: We propose a framework for 3D-aware ObjectNav based on two straightforward sub-policies.
Our framework achieves the best performance among all modular-based methods on the Matterport3D and Gibson datasets.
- Score: 19.125633699422117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object goal navigation (ObjectNav) in unseen environments is a fundamental
task for Embodied AI. Agents in existing works learn ObjectNav policies based
on 2D maps, scene graphs, or image sequences. Since this task takes place in
3D space, a 3D-aware agent can advance its ObjectNav capability by learning
from fine-grained spatial information. However, leveraging a 3D scene
representation can be prohibitively impractical for policy learning in this
floor-level task, due to low sample efficiency and high computational cost. In
this work, we propose a framework for the challenging 3D-aware ObjectNav based
on two straightforward sub-policies. The two sub-policies, namely the
corner-guided exploration policy and the category-aware identification policy,
run simultaneously and use online-fused 3D points as their observation.
Through extensive experiments, we show that this framework can dramatically
improve ObjectNav performance by learning from a 3D scene representation. Our
framework achieves the best performance among all modular-based methods on the
Matterport3D and Gibson datasets, while requiring up to 30x less computational
cost for training.
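To make the two-sub-policy idea concrete, below is a minimal, self-contained sketch assuming a toy point-fusion step and hypothetical function names (FusedPointCloud, identify, explore); it is not the authors' code or API, only an illustration of one fused 3D point observation being shared by a category-aware identification step and a corner-guided exploration step.

```python
import numpy as np

class FusedPointCloud:
    """Toy stand-in for online-fused 3D points with per-point semantic labels."""
    def __init__(self):
        self.xyz = np.empty((0, 3))
        self.labels = np.empty((0,), dtype=int)

    def integrate(self, xyz, labels):
        # append the current frame's back-projected, labeled points
        self.xyz = np.vstack([self.xyz, xyz])
        self.labels = np.concatenate([self.labels, labels])

def identify(fused, goal_label, min_points=5):
    """Category-aware identification (sketch): if enough fused points carry the
    goal label, return their centroid as the navigation target."""
    pts = fused.xyz[fused.labels == goal_label]
    return pts.mean(axis=0) if len(pts) >= min_points else None

def explore(fused, agent_xy, corner_candidates):
    """Corner-guided exploration (sketch): head to the closest corner-like
    candidate; the real policy is learned, this is a nearest-corner stand-in."""
    d = np.linalg.norm(corner_candidates[:, :2] - agent_xy, axis=1)
    return corner_candidates[np.argmin(d)]

# Toy episode step: fuse one frame, then let both sub-policies act on the
# same 3D observation.
fused = FusedPointCloud()
fused.integrate(np.random.rand(50, 3), np.random.randint(0, 3, size=50))
goal = identify(fused, goal_label=1)
waypoint = goal if goal is not None else explore(fused, np.zeros(2), np.random.rand(4, 3))
print("next waypoint:", waypoint)
```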
Related papers
- HM3D-OVON: A Dataset and Benchmark for Open-Vocabulary Object Goal Navigation [39.54854283833085]
We present the Habitat-Matterport 3D Open Vocabulary Object Goal Navigation dataset (HM3D-OVON)
HM3D-OVON incorporates over 15k annotated instances of household objects across 379 distinct categories.
We find that HM3D-OVON can be used to train an open-vocabulary ObjectNav agent that achieves higher performance and is more robust to localization and actuation noise than the state-of-the-art ObjectNav approach.
arXiv Detail & Related papers (2024-09-22T02:12:29Z)
- Task-oriented Sequential Grounding in 3D Scenes [35.90034571439091]
We propose a new task: Task-oriented Sequential Grounding in 3D scenes.
Agents must follow detailed step-by-step instructions to complete daily activities by locating a sequence of target objects in indoor scenes.
To facilitate this task, we introduce SG3D, a large-scale dataset containing 22,346 tasks with 112,236 steps across 4,895 real-world 3D scenes.
arXiv Detail & Related papers (2024-08-07T18:30:18Z)
- Volumetric Environment Representation for Vision-Language Navigation [66.04379819772764]
Vision-language navigation (VLN) requires an agent to navigate through a 3D environment based on visual observations and natural language instructions.
We introduce a Volumetric Environment Representation (VER), which voxelizes the physical world into structured 3D cells.
VER predicts 3D occupancy, 3D room layout, and 3D bounding boxes jointly (a minimal voxelization sketch follows this list).
arXiv Detail & Related papers (2024-03-21T06:14:46Z)
- PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm [114.47216525866435]
We introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation.
For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks, implying its effectiveness.
arXiv Detail & Related papers (2023-10-12T17:59:57Z)
- Object Goal Navigation with Recursive Implicit Maps [92.6347010295396]
We propose an implicit spatial map for object goal navigation.
Our method significantly outperforms the state of the art on the challenging MP3D dataset.
We deploy our model on a real robot and achieve encouraging object goal navigation results in real scenes.
arXiv Detail & Related papers (2023-08-10T14:21:33Z)
- NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization [80.3424839706698]
We present NeurOCS, a framework that uses instance masks and 3D boxes as input to learn 3D object shapes by means of differentiable rendering.
Our approach rests on insights in learning a category-level shape prior directly from real driving scenes.
We make critical design choices to learn object coordinates more effectively from an object-centric view.
arXiv Detail & Related papers (2023-05-28T16:18:41Z)
- Hierarchical Representations and Explicit Memory: Learning Effective Navigation Policies on 3D Scene Graphs using Graph Neural Networks [16.19099481411921]
We present a reinforcement learning framework that leverages high-level hierarchical representations to learn navigation policies.
For each node in the scene graph, our method uses features that capture occupancy and semantic content, while explicitly retaining memory of the robot trajectory.
arXiv Detail & Related papers (2021-08-02T21:21:27Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships [52.72020203771489]
We investigate target-driven visual navigation using deep reinforcement learning (DRL) in 3D indoor scenes.
Our proposed method combines visual features and 3D spatial representations to learn navigation policy.
Our experiments, performed in AI2-THOR, show that our model outperforms the baselines in both SR and SPL metrics.
arXiv Detail & Related papers (2020-04-29T08:46:38Z)
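As a companion to the Volumetric Environment Representation entry above, here is a minimal sketch, assuming nothing beyond NumPy, of how fused 3D points can be mapped into the kind of structured 3D cells such a representation builds on; the grid parameters and the voxelize_points function are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def voxelize_points(points, origin, voxel_size=0.1, grid_shape=(64, 64, 16)):
    """Map an (N, 3) array of world-frame points to a binary occupancy grid."""
    idx = np.floor((points - origin) / voxel_size).astype(int)
    # keep only points that fall inside the grid bounds
    inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    grid = np.zeros(grid_shape, dtype=bool)
    grid[tuple(idx[inside].T)] = True
    return grid

# Example: two points a few cells apart near the grid origin.
pts = np.array([[0.05, 0.12, 0.30], [1.00, 2.00, 0.50]])
occupancy = voxelize_points(pts, origin=np.zeros(3))
print("occupied cells:", int(occupancy.sum()))  # -> 2
```

Downstream, VER predicts occupancy, room layout, and bounding boxes from such cells; this toy only illustrates the point-to-cell mapping.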