Is Intermediate Fusion All You Need for UAV-based Collaborative Perception?
- URL: http://arxiv.org/abs/2504.21774v1
- Date: Wed, 30 Apr 2025 16:22:14 GMT
- Title: Is Intermediate Fusion All You Need for UAV-based Collaborative Perception?
- Authors: Jiuwu Hao, Liguo Sun, Yuting Wan, Yueyang Wu, Ti Xiang, Haolin Song, Pin Lv
- Abstract summary: We propose a novel communication-efficient collaborative perception framework based on late-intermediate fusion, dubbed LIF. We leverage vision-guided positional embedding (VPE) and box-based virtual augmented feature (BoBEV) to effectively integrate complementary information from various agents. Experimental results demonstrate that our LIF achieves superior performance with minimal communication bandwidth, proving its effectiveness and practicality.
- Score: 1.8689461238197957
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collaborative perception enhances environmental awareness through inter-agent communication and is regarded as a promising solution to intelligent transportation systems. However, existing collaborative methods for Unmanned Aerial Vehicles (UAVs) overlook the unique characteristics of the UAV perspective, resulting in substantial communication overhead. To address this issue, we propose a novel communication-efficient collaborative perception framework based on late-intermediate fusion, dubbed LIF. The core concept is to exchange informative and compact detection results and shift the fusion stage to the feature representation level. In particular, we leverage vision-guided positional embedding (VPE) and box-based virtual augmented feature (BoBEV) to effectively integrate complementary information from various agents. Additionally, we innovatively introduce an uncertainty-driven communication mechanism that uses uncertainty evaluation to select high-quality and reliable shared areas. Experimental results demonstrate that our LIF achieves superior performance with minimal communication bandwidth, proving its effectiveness and practicality. Code and models are available at https://github.com/uestchjw/LIF.
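To make the late-intermediate idea above concrete, here is a minimal, hypothetical sketch (not the authors' released implementation): each agent transmits only detection boxes whose estimated uncertainty falls below a threshold, and the receiving UAV rasterizes the shared boxes into a BEV grid as a crude stand-in for the paper's BoBEV feature. All function names, the threshold `tau`, and the grid parameters are illustrative assumptions.

```python
import numpy as np

def select_reliable_detections(boxes, scores, uncertainties, tau=0.3):
    """Keep only detections whose estimated uncertainty is below tau.

    boxes:         (N, 4) BEV boxes as (x_center, y_center, width, length) in metres
    scores:        (N,)   detection confidences
    uncertainties: (N,)   per-box uncertainty estimates (e.g. predictive entropy)
    Returns the compact payload an agent would transmit instead of full feature maps.
    """
    keep = uncertainties < tau
    return boxes[keep], scores[keep]

def rasterize_boxes_to_bev(boxes, scores, grid_size=128, extent=50.0):
    """Paint shared boxes into a single-channel BEV map on the receiving agent.

    A crude stand-in for a learned box-based BEV feature: each shared box simply
    adds its confidence to the grid cells it covers (square grid, +/- extent metres).
    """
    bev = np.zeros((grid_size, grid_size), dtype=np.float32)
    res = 2.0 * extent / grid_size  # metres per cell
    for (cx, cy, w, l), s in zip(boxes, scores):
        x0 = max(int((cx - w / 2 + extent) / res), 0)
        x1 = min(int((cx + w / 2 + extent) / res) + 1, grid_size)
        y0 = max(int((cy - l / 2 + extent) / res), 0)
        y1 = min(int((cy + l / 2 + extent) / res) + 1, grid_size)
        bev[y0:y1, x0:x1] += s  # rows index y, columns index x
    return bev

# Toy usage: a collaborating UAV shares two of its three detections.
boxes = np.array([[5.0, 3.0, 2.0, 4.5], [-10.0, 8.0, 1.8, 4.0], [20.0, -15.0, 2.2, 5.0]])
scores = np.array([0.9, 0.8, 0.4])
uncertainties = np.array([0.1, 0.2, 0.7])
shared_boxes, shared_scores = select_reliable_detections(boxes, scores, uncertainties)
fused_bev = rasterize_boxes_to_bev(shared_boxes, shared_scores)
print(shared_boxes.shape, fused_bev.shape)  # (2, 4) (128, 128)
```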
Related papers
- CoCMT: Communication-Efficient Cross-Modal Transformer for Collaborative Perception [14.619784179608361]
Multi-agent collaborative perception enhances each agent's capabilities by sharing sensing information to cooperatively perform robot perception tasks. Existing representative collaborative perception systems transmit intermediate feature maps, which contain a significant amount of non-critical information. We introduce CoCMT, an object-query-based collaboration framework that reduces communication bandwidth by selectively extracting and transmitting essential features.
arXiv Detail & Related papers (2025-03-13T06:41:25Z)
- CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization [23.958663737034318]
We propose a novel communication-efficient collaborative perception framework based on supply-demand awareness and intermediate-late hybridization. Experiments on multiple datasets, including both simulated and real-world scenarios, demonstrate that CoSDH achieves state-of-the-art detection accuracy and optimal bandwidth trade-offs.
arXiv Detail & Related papers (2025-03-05T12:02:04Z)
- Semantic Communication for Cooperative Perception using HARQ [51.148203799109304]
We leverage an importance map to distill critical semantic information, introducing a cooperative perception semantic communication framework.
To counter the challenges posed by time-varying multipath fading, our approach incorporates orthogonal frequency-division multiplexing (OFDM) along with channel estimation and equalization strategies.
We introduce a novel semantic error detection method that is integrated with our semantic communication framework in the spirit of hybrid automatic repeat request (HARQ).
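As a rough illustration of an HARQ-style loop for shared semantic features (a hypothetical sketch, not this paper's scheme; the oracle MSE check below stands in for a learned semantic error detector):

```python
import numpy as np

rng = np.random.default_rng(0)

def awgn_channel(x, noise_std):
    """Transmit a real-valued semantic feature vector over an AWGN channel."""
    return x + rng.normal(0.0, noise_std, size=x.shape)

def harq_chase_combining(feature, noise_std=0.3, max_rounds=4, mse_threshold=0.05):
    """HARQ-style loop with chase combining.

    The receiver keeps averaging retransmitted copies of the same feature and
    asks for another round while a (here: oracle) semantic error measure stays
    above the threshold. A real system would replace the oracle MSE with a
    learned semantic error detector.
    """
    received = []
    for round_idx in range(1, max_rounds + 1):
        received.append(awgn_channel(feature, noise_std))
        combined = np.mean(received, axis=0)   # chase combining of all copies so far
        mse = float(np.mean((combined - feature) ** 2))
        if mse < mse_threshold:                # semantic check passed: send ACK
            return combined, round_idx
    return combined, max_rounds                # retransmission budget exhausted

feature = rng.normal(size=256)                 # toy semantic feature vector to share
recovered, rounds_used = harq_chase_combining(feature)
print(rounds_used, round(float(np.mean((recovered - feature) ** 2)), 4))
```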
arXiv Detail & Related papers (2024-08-29T08:53:26Z)
- V2X-PC: Vehicle-to-everything Collaborative Perception via Point Cluster [58.79477191603844]
We introduce a new message unit, namely point cluster, to represent the scene sparsely with a combination of low-level structure information and high-level semantic information.
This framework includes a Point Cluster Packing (PCP) module to preserve object features and manage bandwidth.
Experiments on two widely recognized collaborative perception benchmarks showcase the superior performance of our method compared to the previous state-of-the-art approaches.
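A point cluster can be pictured as a compact message unit that carries both geometry and semantics; the dataclass below is a hypothetical illustration of such a unit, not the paper's actual definition.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class PointCluster:
    """Hypothetical message unit pairing low-level structure with high-level semantics."""
    points: np.ndarray    # (N, 3) xyz coordinates of the cluster's LiDAR points
    feature: np.ndarray   # (D,) semantic feature vector summarizing the cluster
    score: float          # detection confidence usable for packing / bandwidth decisions

cluster = PointCluster(points=np.zeros((25, 3)), feature=np.zeros(128), score=0.87)
print(cluster.points.shape, cluster.feature.shape, cluster.score)
```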
arXiv Detail & Related papers (2024-03-25T11:24:02Z)
- What Makes Good Collaborative Views? Contrastive Mutual Information Maximization for Multi-Agent Perception [52.41695608928129]
Multi-agent perception (MAP) allows autonomous systems to understand complex environments by interpreting data from multiple sources.
This paper investigates intermediate collaboration for MAP with a specific focus on exploring "good" properties of collaborative views.
We propose a novel framework named CMiMC for intermediate collaboration.
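Contrastive mutual information maximization is typically instantiated with an InfoNCE-style objective between paired views; the sketch below is a generic example of that objective (not CMiMC's exact loss), with all names illustrative.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE objective between paired views.

    anchors, positives: (B, D) L2-normalized embeddings; row i of each forms a
    positive pair, every other row serves as an in-batch negative. Minimizing
    this loss maximizes a lower bound on the mutual information between views.
    """
    logits = anchors @ positives.T / temperature        # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)         # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                  # cross-entropy with targets on the diagonal

rng = np.random.default_rng(0)
ego = rng.normal(size=(8, 32)); ego /= np.linalg.norm(ego, axis=1, keepdims=True)
collab = ego + 0.1 * rng.normal(size=(8, 32)); collab /= np.linalg.norm(collab, axis=1, keepdims=True)
print(round(float(info_nce(ego, collab)), 3))
```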
arXiv Detail & Related papers (2024-03-15T07:18:55Z)
- Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird's Eye View Segmentation for Connected and Autonomous Driving [49.03947018718156]
We propose a unified domain generalization framework to be utilized during the training and inference stages of collaborative perception.
We also introduce an intra-system domain alignment mechanism to reduce or potentially eliminate the domain discrepancy among connected and autonomous vehicles.
arXiv Detail & Related papers (2023-11-28T12:52:49Z)
- Integrated Sensing, Computation, and Communication for UAV-assisted Federated Edge Learning [52.7230652428711]
Federated edge learning (FEEL) enables privacy-preserving model training through periodic communication between edge devices and the server.
Unmanned Aerial Vehicle (UAV)-mounted edge devices are particularly advantageous for FEEL due to their flexibility and mobility in efficient data collection.
arXiv Detail & Related papers (2023-06-05T16:01:33Z)
- Attention Based Feature Fusion For Multi-Agent Collaborative Perception [4.120288148198388]
We propose an intermediate collaborative perception solution in the form of a graph attention network (GAT).
The proposed approach develops an attention-based aggregation strategy to fuse intermediate representations exchanged among multiple connected agents.
This approach adaptively highlights important regions in the intermediate feature maps at both the channel and spatial levels, resulting in improved object detection precision.
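As a minimal, parameter-free stand-in for this kind of channel- and spatial-level attention fusion (a sketch under illustrative assumptions, not the paper's GAT-based model):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_spatial_attention_fuse(ego_feat, neighbor_feats):
    """Fuse intermediate BEV features with channel and spatial attention.

    ego_feat:       (C, H, W) ego agent's feature map
    neighbor_feats: list of (C, H, W) maps received from connected agents
    The attention here is hand-crafted (global-average channel gating and a
    per-cell softmax over agents); a learned GAT-based version would replace
    these heuristics.
    """
    stack = np.stack([ego_feat] + list(neighbor_feats))        # (A, C, H, W)

    # Channel attention: gate each agent's channels by their global response.
    channel_logits = stack.mean(axis=(2, 3), keepdims=True)    # (A, C, 1, 1)
    stack = stack * sigmoid(channel_logits)

    # Spatial attention: per-cell softmax over agents decides who contributes where.
    spatial_logits = stack.mean(axis=1, keepdims=True)         # (A, 1, H, W)
    weights = np.exp(spatial_logits - spatial_logits.max(axis=0, keepdims=True))
    weights = weights / weights.sum(axis=0, keepdims=True)     # softmax over the agent axis
    return (weights * stack).sum(axis=0)                       # (C, H, W)

ego = rng.normal(size=(64, 32, 32))
neighbors = [rng.normal(size=(64, 32, 32)) for _ in range(2)]
fused = channel_spatial_attention_fuse(ego, neighbors)
print(fused.shape)  # (64, 32, 32)
```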
arXiv Detail & Related papers (2023-05-03T12:06:11Z)
- Learning to Communicate and Correct Pose Errors [75.03747122616605]
We study the setting proposed in V2VNet, where nearby self-driving vehicles jointly perform object detection and motion forecasting in a cooperative manner.
We propose a novel neural reasoning framework that learns to communicate, to estimate potential errors, and to reach a consensus about those errors.
arXiv Detail & Related papers (2020-11-10T18:19:40Z)
- Bandwidth-Adaptive Feature Sharing for Cooperative LIDAR Object Detection [2.064612766965483]
Situational awareness is a necessity in the connected and autonomous vehicles (CAV) domain.
Cooperative mechanisms have provided a solution to improve situational awareness by utilizing high-speed wireless vehicular networks.
We propose a mechanism that adapts feature sharing to the available communication channel capacity, along with a novel decentralized shared-data alignment method.
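One simple way to picture bandwidth adaptation (a hypothetical sketch, not the paper's mechanism): size the transmitted payload to the current link budget by keeping only the strongest feature channels. All names and the energy-ranking heuristic are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def adapt_payload_to_bandwidth(feature, budget_bytes, bytes_per_value=4):
    """Shrink a shared feature map to fit the currently available channel capacity.

    feature:      (C, H, W) intermediate feature map to share
    budget_bytes: how many bytes the vehicular link can carry this frame
    Keeps the top-k whole channels ranked by L1 energy (an illustrative choice);
    a learned, bandwidth-conditioned encoder would replace this heuristic.
    """
    c, h, w = feature.shape
    bytes_per_channel = h * w * bytes_per_value
    k = max(1, min(c, budget_bytes // bytes_per_channel))
    energy = np.abs(feature).sum(axis=(1, 2))            # (C,) channel importance
    keep = np.argsort(energy)[::-1][:k]                  # indices of the k strongest channels
    payload = feature[keep]                              # (k, H, W) actually transmitted
    return keep, payload

def reassemble(keep, payload, shape):
    """Receiver side: scatter the transmitted channels back into a zero-filled map."""
    full = np.zeros(shape, dtype=payload.dtype)
    full[keep] = payload
    return full

feat = rng.normal(size=(64, 32, 32)).astype(np.float32)
keep, payload = adapt_payload_to_bandwidth(feat, budget_bytes=64 * 1024)  # ~64 KB link budget
restored = reassemble(keep, payload, feat.shape)
print(len(keep), payload.nbytes, restored.shape)  # 16 65536 (64, 32, 32)
```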
arXiv Detail & Related papers (2020-10-22T00:12:58Z)
- Cooperative LIDAR Object Detection via Feature Sharing in Deep Networks [11.737037965090535]
We introduce the concept of feature sharing for cooperative object detection (FS-COD).
In our proposed approach, a better understanding of the environment is achieved by sharing partially processed data between cooperative vehicles.
It is shown that the proposed approach significantly outperforms conventional single-vehicle object detection approaches.
arXiv Detail & Related papers (2020-02-19T20:47:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.