Exploiting Scene-specific Features for Object Goal Navigation
- URL: http://arxiv.org/abs/2008.09403v1
- Date: Fri, 21 Aug 2020 10:16:01 GMT
- Title: Exploiting Scene-specific Features for Object Goal Navigation
- Authors: Tommaso Campari, Paolo Eccher, Luciano Serafini and Lamberto Ballan
- Abstract summary: We introduce a new reduced dataset that speeds up the training of navigation models.
Our proposed dataset permits the training of models that do not exploit online-built maps in a reasonable time.
We propose the SMTSC model, an attention-based model capable of exploiting the correlation between scenes and objects contained in them.
- Score: 9.806910643086043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Can the intrinsic relation between an object and the room in which it is
usually located help agents in the Visual Navigation Task? We study this
question in the context of Object Navigation, a problem in which an agent has
to reach an object of a specific class while moving in a complex domestic
environment. In this paper, we introduce a new reduced dataset that speeds up
the training of navigation models, a notoriously complex task. Our proposed
dataset permits the training of models that do not exploit online-built maps in
a reasonable time, even without huge computational resources. This reduced
dataset therefore provides a meaningful benchmark and can be used to identify
promising models that can then be tried on bigger and more challenging
datasets. Finally, we propose the SMTSC model, an attention-based model that
exploits the correlation between scenes and the objects they contain, and we
quantitatively show that this idea is effective.
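To make the scene-object correlation idea concrete, the following is a minimal sketch of scene-conditioned attention over detected objects. It is not the authors' SMTSC implementation; the module names, feature dimensions, and the use of a learned scene embedding are all assumptions.

```python
# Minimal sketch of scene-conditioned attention over detected objects.
# Not the authors' SMTSC code; shapes and module names are hypothetical.
import torch
import torch.nn as nn

class SceneObjectAttention(nn.Module):
    def __init__(self, obj_dim=256, scene_dim=128, n_scenes=10):
        super().__init__()
        self.scene_emb = nn.Embedding(n_scenes, scene_dim)  # e.g. kitchen, bedroom...
        self.query = nn.Linear(scene_dim, obj_dim)          # scene -> attention query
        self.scale = obj_dim ** 0.5

    def forward(self, obj_feats, scene_id):
        # obj_feats: (B, N, obj_dim) features of N detected objects
        # scene_id:  (B,) index of the predicted scene/room type
        q = self.query(self.scene_emb(scene_id)).unsqueeze(1)  # (B, 1, obj_dim)
        attn = torch.softmax(q @ obj_feats.transpose(1, 2) / self.scale, dim=-1)
        return (attn @ obj_feats).squeeze(1)  # (B, obj_dim) scene-weighted object context

# Usage: weight objects by how relevant they are to the current room type.
feats = torch.randn(2, 5, 256)
ctx = SceneObjectAttention()(feats, torch.tensor([0, 3]))
print(ctx.shape)  # torch.Size([2, 256])
```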
Related papers
- Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments [44.6372390798904]
We propose a new task denominated Personalized Instance-based Navigation (PIN), in which an embodied agent is tasked with locating and reaching a specific personal object.
In each episode, the target object is presented to the agent using two modalities: a set of visual reference images on a neutral background and manually annotated textual descriptions.
arXiv Detail & Related papers (2024-10-23T18:01:09Z) - DeTra: A Unified Model for Object Detection and Trajectory Forecasting [68.85128937305697]
Our approach formulates the union of the two tasks as a trajectory refinement problem.
To tackle this unified task, we design a refinement transformer that infers the presence, pose, and multi-modal future behaviors of objects.
In our experiments, we observe that our model outperforms the state of the art on the Argoverse 2 Sensor and Waymo Open datasets.
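As a loose illustration of trajectory refinement with a transformer (shapes, layer counts, and module names below are hypothetical, not the DeTra architecture), coarse trajectory proposals can be encoded and corrected with per-waypoint offsets:

```python
# Sketch: refine coarse trajectory proposals with a transformer encoder that
# predicts per-waypoint (x, y) offsets. Purely illustrative, not DeTra.
import torch
import torch.nn as nn

class TrajRefiner(nn.Module):
    def __init__(self, dim=64, steps=8):
        super().__init__()
        self.embed = nn.Linear(2, dim)   # (x, y) waypoint -> token
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.offset = nn.Linear(dim, 2)  # token -> (dx, dy) correction

    def forward(self, traj):
        # traj: (B, steps, 2) coarse trajectory proposals
        return traj + self.offset(self.encoder(self.embed(traj)))

coarse = torch.cumsum(torch.randn(2, 8, 2) * 0.1, dim=1)  # two noisy paths
print(TrajRefiner()(coarse).shape)  # torch.Size([2, 8, 2])
```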
arXiv Detail & Related papers (2024-06-06T18:12:04Z) - Temporal-Spatial Object Relations Modeling for Vision-and-Language Navigation [11.372544701050044]
Vision-and-Language Navigation (VLN) is a challenging task in which an agent must navigate to a location described in natural language using visual observations.
The agent's navigation ability can be enhanced by relations between objects, which are usually learned from objects inside the environment or from external datasets.
arXiv Detail & Related papers (2024-03-23T02:44:43Z) - Task-Driven Graph Attention for Hierarchical Relational Object Navigation [25.571175038938527]
Embodied AI agents in large scenes often need to navigate to find objects.
We study a naturally emerging variant of the object navigation task, hierarchical relational object navigation (HRON).
We propose a solution that uses scene graphs as part of its input and integrates graph neural networks as its backbone.
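A minimal sketch of the scene-graph-plus-GNN design mentioned above, assuming a simple sum-aggregation message-passing layer (the paper's actual graph layout and layer choices may differ):

```python
# Sketch: one round of message passing over a scene graph (rooms/objects as
# nodes, spatial relations as edges). Hypothetical, not the paper's code.
import torch
import torch.nn as nn

class GraphLayer(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)  # message from (sender, receiver) pair
        self.upd = nn.GRUCell(dim, dim)     # node update from aggregated messages

    def forward(self, x, edges):
        # x: (N, dim) node features; edges: (E, 2) [src, dst] index pairs
        src, dst = edges[:, 0], edges[:, 1]
        m = torch.relu(self.msg(torch.cat([x[src], x[dst]], dim=-1)))  # (E, dim)
        agg = torch.zeros_like(x).index_add_(0, dst, m)  # sum messages per node
        return self.upd(agg, x)

nodes = torch.randn(4, 64)                      # e.g. house, kitchen, fridge, apple
edges = torch.tensor([[0, 1], [1, 2], [2, 3]])  # containment edges
print(GraphLayer()(nodes, edges).shape)         # torch.Size([4, 64])
```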
arXiv Detail & Related papers (2023-06-23T19:50:48Z) - Dense Video Object Captioning from Disjoint Supervision [77.47084982558101]
We propose a new task and model for dense video object captioning.
This task unifies spatial and temporal localization in video.
We show how our model improves upon a number of strong baselines for this new task.
arXiv Detail & Related papers (2023-06-20T17:57:23Z) - How To Not Train Your Dragon: Training-free Embodied Object Goal Navigation with Semantic Frontiers [94.46825166907831]
We present a training-free solution to tackle the object goal navigation problem in Embodied AI.
Our method builds a structured scene representation based on the classic visual simultaneous localization and mapping (V-SLAM) framework.
Our method propagates semantics on the scene graphs based on language priors and scene statistics to introduce semantic knowledge to the geometric frontiers.
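To illustrate the idea of attaching semantic scores to geometric frontiers (a hypothetical sketch; the co-occurrence prior and distance weighting below are invented, not the paper's statistics):

```python
# Sketch: rank exploration frontiers by a semantic score derived from
# nearby observed object classes. All prior values here are invented.
import numpy as np

# Hypothetical language prior: P(goal is near class c), e.g. from co-occurrence
# statistics. Example goal class: "pillow".
PRIOR = {"bed": 0.9, "nightstand": 0.8, "sink": 0.1}

def score_frontier(frontier_xy, observations):
    # observations: list of (class_name, (x, y)) detections on the map
    score = 0.0
    for cls, (x, y) in observations:
        dist = np.hypot(frontier_xy[0] - x, frontier_xy[1] - y)
        score += PRIOR.get(cls, 0.0) / (1.0 + dist)  # nearer evidence counts more
    return score

obs = [("bed", (2.0, 1.0)), ("sink", (8.0, 8.0))]
frontiers = [(2.5, 1.5), (7.5, 8.0)]
best = max(frontiers, key=lambda f: score_frontier(f, obs))
print(best)  # (2.5, 1.5): the frontier near the bed wins for goal "pillow"
```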
arXiv Detail & Related papers (2023-05-26T13:38:33Z) - Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z) - Visual Navigation with Spatial Attention [26.888916048408895]
This work focuses on object goal visual navigation, aiming at finding the location of an object from a given class.
We propose to learn the agent's policy using a reinforcement learning algorithm.
Our key contribution is a novel attention probability model for visual navigation tasks.
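As a rough sketch of what a spatial attention probability model can look like (illustrative only; not the paper's model), attention can assign a probability to each location of the visual feature map:

```python
# Sketch: a spatial attention map over CNN features, giving a probability
# per image location. Hypothetical shapes and modules.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # per-location logit

    def forward(self, feats):
        # feats: (B, C, H, W) visual features from the agent's observation
        b, c, h, w = feats.shape
        p = torch.softmax(self.score(feats).view(b, -1), dim=-1)    # (B, H*W)
        attended = (feats.view(b, c, -1) * p.unsqueeze(1)).sum(-1)  # (B, C)
        return attended, p.view(b, 1, h, w)

feats = torch.randn(2, 64, 7, 7)
ctx, attn_map = SpatialAttention()(feats)
print(ctx.shape, attn_map.shape)  # torch.Size([2, 64]) torch.Size([2, 1, 7, 7])
```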
arXiv Detail & Related papers (2021-04-20T07:39:52Z) - SoDA: Multi-Object Tracking with Soft Data Association [75.39833486073597]
Multi-object tracking (MOT) is a prerequisite for the safe deployment of self-driving cars.
We propose a novel approach to MOT that uses attention to compute track embeddings that encode dependencies between observed objects.
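A minimal sketch of soft data association (illustrative, not SoDA itself): detections are soft-assigned to tracks with a softmax over embedding similarity, rather than a hard one-to-one matching:

```python
# Sketch of soft data association between detections and tracks.
# Embedding dimensions and the temperature value are assumptions.
import torch

def soft_associate(track_emb, det_emb, temp=0.1):
    # track_emb: (T, D) track embeddings; det_emb: (N, D) detection embeddings
    sim = det_emb @ track_emb.t()             # (N, T) similarity matrix
    return torch.softmax(sim / temp, dim=-1)  # soft assignment per detection

tracks = torch.randn(3, 32)
dets = torch.randn(5, 32)
print(soft_associate(tracks, dets).sum(dim=-1))  # each row sums to 1
```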
arXiv Detail & Related papers (2020-08-18T03:40:25Z) - Object Goal Navigation using Goal-Oriented Semantic Exploration [98.14078233526476]
This work studies the problem of object goal navigation, which involves navigating to an instance of a given object category in unseen environments.
We propose a modular system called 'Goal-Oriented Semantic Exploration', which builds an episodic semantic map and uses it to explore the environment efficiently.
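A rough sketch of an episodic semantic map (grid size, resolution, and update rule below are assumptions, not the authors' implementation): one channel per object class, marking cells where the class was observed:

```python
# Sketch: a top-down grid with one channel per object class; cells are
# marked where a class is detected. Resolution and size are invented.
import numpy as np

class SemanticMap:
    def __init__(self, classes, size=100, res=0.05):
        self.classes = {c: i for i, c in enumerate(classes)}
        self.grid = np.zeros((len(classes), size, size), dtype=np.float32)
        self.res, self.size = res, size

    def update(self, cls, world_xy):
        # Project a world-frame detection into the grid and mark its cell.
        col = int(world_xy[0] / self.res) % self.size
        row = int(world_xy[1] / self.res) % self.size
        self.grid[self.classes[cls], row, col] = 1.0

m = SemanticMap(["chair", "sofa", "bed"])
m.update("sofa", (1.2, 0.8))
print(m.grid[1].sum())  # 1.0: one sofa cell marked
```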
arXiv Detail & Related papers (2020-07-01T17:52:32Z)