CORE: Cooperative Reconstruction for Multi-Agent Perception
- URL: http://arxiv.org/abs/2307.11514v2
- Date: Tue, 25 Jul 2023 02:44:55 GMT
- Title: CORE: Cooperative Reconstruction for Multi-Agent Perception
- Authors: Binglu Wang, Lei Zhang, Zhaozhong Wang, Yongqiang Zhao, Tianfei Zhou
- Abstract summary: CORE is a conceptually simple, effective and communication-efficient model for multi-agent cooperative perception.
It addresses the task from a novel perspective of cooperative reconstruction, based on two key insights.
We validate CORE on OPV2V, a large-scale multi-agent perception dataset.
- Score: 24.306731432524227
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents CORE, a conceptually simple, effective and
communication-efficient model for multi-agent cooperative perception. It
addresses the task from a novel perspective of cooperative reconstruction,
based on two key insights: 1) cooperating agents together provide a more
holistic observation of the environment, and 2) the holistic observation can
serve as valuable supervision to explicitly guide the model in learning how to
reconstruct the ideal observation based on collaboration. CORE instantiates the
idea with three major components: a compressor for each agent to create more
compact feature representation for efficient broadcasting, a lightweight
attentive collaboration component for cross-agent message aggregation, and a
reconstruction module to reconstruct the observation based on aggregated
feature representations. This learning-to-reconstruct idea is task-agnostic,
and offers clear and reasonable supervision to inspire more effective
collaboration, ultimately benefiting perception tasks. We validate CORE on
OPV2V, a large-scale multi-agent perception dataset, in two tasks, i.e., 3D
object detection and semantic segmentation. Results demonstrate that the model
achieves state-of-the-art performance on both tasks, and is more
communication-efficient.
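The abstract outlines a three-stage pipeline: per-agent feature compression for efficient broadcasting, lightweight attentive cross-agent aggregation, and a reconstruction module supervised by the holistic observation. The sketch below is a minimal PyTorch illustration of how such a pipeline could be wired together; the 1x1-conv compressor, softmax-weighted fusion head, MSE reconstruction loss, and all names (CompressorDecompressor, AttentiveCollaboration, ReconstructionHead, core_forward) are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


class CompressorDecompressor(nn.Module):
    """Channel-wise compression of per-agent BEV features before broadcasting.

    The compression ratio and 1x1-conv design are assumptions; the abstract only
    states that each agent creates a more compact representation.
    """

    def __init__(self, channels: int, ratio: int = 4):
        super().__init__()
        self.compress = nn.Conv2d(channels, channels // ratio, kernel_size=1)
        self.decompress = nn.Conv2d(channels // ratio, channels, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.decompress(self.compress(feat))


class AttentiveCollaboration(nn.Module):
    """Aggregate messages from cooperating agents with per-location attention.

    A lightweight stand-in: each agent's feature map is scored by a small conv
    head and the maps are fused as a softmax-weighted sum.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (num_agents, C, H, W), already warped into the ego frame.
        weights = torch.softmax(self.score(feats), dim=0)   # (A, 1, H, W)
        return (weights * feats).sum(dim=0, keepdim=True)   # (1, C, H, W)


class ReconstructionHead(nn.Module):
    """Reconstruct a holistic BEV observation from the aggregated feature,
    supervised against a feature computed from all agents' merged observations."""

    def __init__(self, channels: int):
        super().__init__()
        self.decoder = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, fused: torch.Tensor) -> torch.Tensor:
        return self.decoder(fused)


def core_forward(agent_feats, codec, collab, recon, ideal_feat=None):
    """One ego-centric forward pass over features from all cooperating agents."""
    # 1) each agent compresses its feature map before broadcasting
    received = torch.cat([codec(f.unsqueeze(0)) for f in agent_feats], dim=0)
    # 2) the ego agent attentively aggregates the received messages
    fused = collab(received)
    # 3) reconstruct the holistic observation; compare to the ideal one if given
    reconstructed = recon(fused)
    recon_loss = None
    if ideal_feat is not None:
        recon_loss = nn.functional.mse_loss(reconstructed, ideal_feat)
    return fused, reconstructed, recon_loss
```

For example, with four agents and 64-channel BEV features, agent_feats would be a (4, 64, H, W) tensor already warped into the ego frame, and the fused feature would then feed a detection or segmentation head.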
Related papers
- Multi-branch Collaborative Learning Network for 3D Visual Grounding [66.67647903507927]
3D referring expression comprehension (3DREC) and segmentation (3DRES) have overlapping objectives, indicating their potential for collaboration.
We argue that employing separate branches for 3DREC and 3DRES tasks enhances the model's capacity to learn specific information for each task.
arXiv Detail & Related papers (2024-07-07T13:27:14Z)
- Beyond Isolation: Multi-Agent Synergy for Improving Knowledge Graph Construction [10.1305370182537]
This paper introduces a novel framework, CooperKGC, for knowledge graph construction.
CooperKGC establishes a collaborative processing network, assembling a KGC collaboration team capable of concurrently addressing entity, relation, and event extraction tasks.
Our experiments unequivocally demonstrate that fostering collaboration and information interaction among diverse agents within CooperKGC yields superior results compared to individual cognitive processes operating in isolation.
arXiv Detail & Related papers (2023-12-05T07:27:08Z)
- Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception [18.358998861454477]
Multi-agent collaborative perception, a potential application of vehicle-to-everything communication, could significantly improve the perception performance of autonomous vehicles over single-agent perception.
We propose SCOPE, a novel collaborative perception framework that aggregates awareness characteristics across agents in an end-to-end manner.
arXiv Detail & Related papers (2023-07-26T03:00:31Z)
- A Dynamic Feature Interaction Framework for Multi-task Visual Perception [100.98434079696268]
We devise an efficient unified framework to solve multiple common perception tasks.
These tasks include instance segmentation, semantic segmentation, monocular 3D detection, and depth estimation.
Our proposed framework, termed D2BNet, demonstrates a unique approach to parameter-efficient predictions for multi-task perception.
arXiv Detail & Related papers (2023-06-08T09:24:46Z)
- UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework [20.713675020714835]
We propose a Unified Collaborative perception framework named UMC.
It is designed to optimize the communication, collaboration, and reconstruction processes with the Multi-resolution technique.
Our experiments prove that the proposed UMC greatly outperforms the state-of-the-art collaborative perception approaches.
arXiv Detail & Related papers (2023-03-22T09:09:02Z)
- Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization [74.34699679568818]
Weakly supervised temporal action localization (WS-TAL) is a challenging task that aims to localize action instances in the given video with video-level categorical supervision.
We propose a cross-modal consensus network (CO2-Net) to tackle this problem.
arXiv Detail & Related papers (2021-07-27T04:21:01Z)
- CoADNet: Collaborative Aggregation-and-Distribution Networks for Co-Salient Object Detection [91.91911418421086]
Co-Salient Object Detection (CoSOD) aims at discovering salient objects that repeatedly appear in a given query group containing two or more relevant images.
One challenging issue is how to effectively capture co-saliency cues by modeling and exploiting inter-image relationships.
We present an end-to-end collaborative aggregation-and-distribution network (CoADNet) to capture both salient and repetitive visual patterns from multiple images.
arXiv Detail & Related papers (2020-11-10T04:28:11Z)
- Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment [73.9469267445146]
First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as the AI2Thor pose significant sample-efficiency challenges for reinforcement learning agents.
We show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task.
arXiv Detail & Related papers (2020-10-28T19:27:26Z)
- Cascaded Human-Object Interaction Recognition [175.60439054047043]
We introduce a cascade architecture for a multi-stage, coarse-to-fine HOI understanding.
At each stage, an instance localization network progressively refines HOI proposals and feeds them into an interaction recognition network.
With our carefully-designed human-centric relation features, these two modules work collaboratively towards effective interaction understanding.
arXiv Detail & Related papers (2020-03-09T17:05:04Z)
- A Visual Communication Map for Multi-Agent Deep Reinforcement Learning [7.003240657279981]
Multi-agent learning poses significant challenges in the effort to allocate a concealed communication medium.
Recent studies typically combine a specialized neural network with reinforcement learning to enable communication between agents.
This paper proposes a more scalable approach that not only deals with a great number of agents but also enables collaboration between dissimilar functional agents.
arXiv Detail & Related papers (2020-02-27T02:38:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.