Collaborative Visual Navigation
- URL: http://arxiv.org/abs/2107.01151v1
- Date: Fri, 2 Jul 2021 15:48:16 GMT
- Title: Collaborative Visual Navigation
- Authors: Haiyang Wang, Wenguan Wang, Xizhou Zhu, Jifeng Dai, Liwei Wang
- Abstract summary: We propose a large-scale 3D dataset, CollaVN, for multi-agent visual navigation (MAVN).
Diverse MAVN variants are explored to make our problem more general.
A memory-augmented communication framework is proposed. Each agent is equipped with a private, external memory to persistently store communication information.
- Score: 69.20264563368762
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As a fundamental problem for Artificial Intelligence, multi-agent system
(MAS) is making rapid progress, mainly driven by multi-agent reinforcement
learning (MARL) techniques. However, previous MARL methods largely focused on
grid-world-like or game environments; MAS in visually rich environments has
remained less explored. To narrow this gap and emphasize the crucial role of
perception in MAS, we propose a large-scale 3D dataset, CollaVN, for
multi-agent visual navigation (MAVN). In CollaVN, multiple agents are required
to cooperatively navigate across photo-realistic environments to reach target
locations. Diverse MAVN variants are explored to make our problem more general.
Moreover, a memory-augmented communication framework is proposed. Each agent is
equipped with a private, external memory to persistently store communication
information. This allows agents to make better use of their past communication
information, enabling more efficient collaboration and robust long-term
planning. In our experiments, several baselines and evaluation metrics are
designed. We also empirically verify the efficacy of our proposed MARL approach
across different MAVN task settings.
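The private, external memory described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the architecture, so everything here is an assumption made for illustration: the class name `PrivateMemory`, the fixed-capacity first-in-first-out eviction, the dot-product attention read, and the broadcast helper `communication_round` are hypothetical and are not taken from the paper.

```python
import math
from collections import deque


class PrivateMemory:
    """Hypothetical per-agent external memory for received messages."""

    def __init__(self, capacity=32):
        # Assumption: oldest messages are evicted once capacity is reached.
        self.slots = deque(maxlen=capacity)

    def write(self, message):
        """Persistently store one received message vector."""
        self.slots.append(list(message))

    def read(self, query):
        """Return a softmax-weighted average of stored messages.

        Assumption: relevance is the dot product between query and slot.
        """
        if not self.slots:
            return [0.0] * len(query)
        scores = [sum(q * m for q, m in zip(query, slot)) for slot in self.slots]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        return [
            sum(w * slot[i] for w, slot in zip(weights, self.slots)) / total
            for i in range(len(query))
        ]


def communication_round(memories, messages):
    """Each agent stores the messages broadcast by its teammates."""
    for receiver, memory in enumerate(memories):
        for sender, message in enumerate(messages):
            if sender != receiver:
                memory.write(message)


# Two agents exchange one message each; agent 0 then queries its own memory.
memories = [PrivateMemory(capacity=4) for _ in range(2)]
communication_round(memories, [[1.0, 0.0], [0.0, 1.0]])
summary = memories[0].read([0.0, 1.0])
```

Because each memory belongs to one agent, an agent can query messages from earlier steps at any later time instead of relying only on the most recent exchange, which is the benefit the abstract attributes to the approach.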
Related papers
- OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation [96.46961207887722]
OVER-NAV aims to go beyond the current state of the art in IVLN techniques.
To fully exploit the interpreted navigation data, we introduce a structured representation, coded Omnigraph.
arXiv Detail & Related papers (2024-03-26T02:34:48Z)
- OVEL: Large Language Model as Memory Manager for Online Video Entity Linking [57.70595589893391]
We propose a task called Online Video Entity Linking (OVEL), aiming to establish connections between mentions in online videos and a knowledge base with high accuracy and timeliness.
To effectively handle the OVEL task, we leverage a memory block managed by a Large Language Model and retrieve entity candidates from the knowledge base to augment LLM performance on memory management.
arXiv Detail & Related papers (2024-03-03T06:47:51Z)
- Attention Graph for Multi-Robot Social Navigation with Deep Reinforcement Learning [0.0]
We present MultiSoc, a new method for learning multi-agent socially aware navigation strategies using deep reinforcement learning (RL).
Inspired by recent works on multi-agent deep RL, our method leverages graph-based representation of agent interactions, combining the positions and fields of view of entities (pedestrians and agents).
Our method learns faster than social navigation deep RL mono-agent techniques, and enables efficient multi-agent implicit coordination in challenging crowd navigation with multiple heterogeneous humans.
arXiv Detail & Related papers (2024-01-31T15:24:13Z)
- Co-NavGPT: Multi-Robot Cooperative Visual Semantic Navigation using Large Language Models [10.312968200748118]
Co-NavGPT is an innovative framework that integrates Large Language Models as a global planner for multi-robot cooperative visual target navigation.
It encodes the explored environment data into prompts, enhancing LLMs' scene comprehension.
It then assigns exploration frontiers to each robot for efficient target search.
arXiv Detail & Related papers (2023-10-11T23:17:43Z)
- Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets.
One agent learned by offline MARL often inherits this random policy, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
- Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey [71.10448142010422]
Multi-object tracking (MOT) aims to associate target objects across video frames in order to obtain entire moving trajectories.
Embedding methods play an essential role in object location estimation and temporal identity association in MOT.
We first conduct a comprehensive overview with in-depth analysis for embedding methods in MOT from seven different perspectives.
arXiv Detail & Related papers (2022-05-22T06:54:33Z)
- Multi-modal Transformers Excel at Class-agnostic Object Detection [105.10403103027306]
We argue that existing methods lack a top-down supervision signal governed by human-understandable semantics.
We develop an efficient and flexible MViT architecture using multi-scale feature processing and deformable self-attention.
We show the significance of MViT proposals in a diverse range of applications.
arXiv Detail & Related papers (2021-11-22T18:59:29Z)
- Multi-Agent Embodied Visual Semantic Navigation with Scene Prior Knowledge [42.37872230561632]
In visual semantic navigation, the robot is given the class label of a target object and navigates to it using egocentric visual observations.
Most of the existing models are only effective for single-agent navigation, and a single agent has low efficiency and poor fault tolerance when completing more complicated tasks.
We propose multi-agent visual semantic navigation, in which multiple agents collaborate to find multiple target objects.
arXiv Detail & Related papers (2021-09-20T13:31:03Z)
- A Visual Communication Map for Multi-Agent Deep Reinforcement Learning [7.003240657279981]
Multi-agent learning poses significant challenges in the effort to allocate a concealed communication medium.
Recent studies typically combine a specialized neural network with reinforcement learning to enable communication between agents.
This paper proposes a more scalable approach that not only deals with a great number of agents but also enables collaboration between dissimilar functional agents.
arXiv Detail & Related papers (2020-02-27T02:38:21Z)