WhisperNet: A Scalable Solution for Bandwidth-Efficient Collaboration
- URL: http://arxiv.org/abs/2603.01708v1
- Date: Mon, 02 Mar 2026 10:33:25 GMT
- Title: WhisperNet: A Scalable Solution for Bandwidth-Efficient Collaboration
- Authors: Gong Chen, Chaokun Zhang, Xinyan Zhao,
- Abstract summary: Collaborative perception is vital for autonomous driving yet remains constrained by tight communication budgets. We introduce WhisperNet, a bandwidth-aware framework that proposes a novel, receiver-centric paradigm for global coordination across agents. We show that WhisperNet achieves state-of-the-art performance, improving AP@0.7 on OPV2V by 2.4% with only 0.5% of the communication cost.
- Score: 7.294662317293144
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Collaborative perception is vital for autonomous driving yet remains constrained by tight communication budgets. Earlier work reduced bandwidth by compressing full feature maps with fixed-rate encoders, which adapts poorly to a changing environment. Later spatial selection methods improved efficiency by focusing on salient regions, but this object-centric approach often sacrifices global context, weakening holistic scene understanding. To overcome these limitations, we introduce WhisperNet, a bandwidth-aware framework that proposes a novel, receiver-centric paradigm for global coordination across agents. Senders generate lightweight saliency metadata, while the receiver formulates a global request plan that dynamically budgets feature contributions across agents and features, retrieving only the most informative features. A collaborative feature routing module then aligns related messages before fusion to ensure structural consistency. Extensive experiments show that WhisperNet achieves state-of-the-art performance, improving AP@0.7 on OPV2V by 2.4% with only 0.5% of the communication cost. As a plug-and-play component, it boosts strong baselines with merely 5% of full bandwidth while maintaining robustness under localization noise. These results demonstrate that globally coordinated allocation across "what" and "where" to share is the key to achieving efficient collaborative perception.
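The receiver-side request plan described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not say how the plan is computed, so the code below assumes the simplest possible stand-in, a global top-k over the saliency scores reported by all senders. The function name `plan_requests`, the `(channel, cell)` key layout, and the agent names are hypothetical and chosen only for illustration.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical key: (channel index, spatial cell index), i.e. "what" and "where".
Cell = Tuple[int, int]


def plan_requests(
    saliency: Dict[str, Dict[Cell, float]],
    budget: int,
) -> Dict[str, List[Cell]]:
    """Choose the `budget` highest-scoring (agent, cell) pairs across all senders.

    Assumption: a plain global top-k stands in for the paper's planner.
    The budget is shared by all agents instead of being split evenly, so an
    agent with more informative features receives more of the bandwidth.
    """
    # Flatten every sender's metadata into one candidate pool.
    candidates = [
        (score, agent, cell)
        for agent, cells in saliency.items()
        for cell, score in cells.items()
    ]
    # Rank the pool globally; ties fall back to agent name, then cell.
    top = heapq.nlargest(budget, candidates)

    plan: Dict[str, List[Cell]] = {agent: [] for agent in saliency}
    for _score, agent, cell in top:
        plan[agent].append(cell)
    return plan


# Toy metadata from two senders (made-up scores).
saliency = {
    "car_a": {(0, 0): 0.9, (0, 1): 0.2, (1, 0): 0.7},
    "car_b": {(0, 0): 0.4, (1, 1): 0.8},
}
plan = plan_requests(saliency, budget=3)
print(plan)  # {'car_a': [(0, 0), (1, 0)], 'car_b': [(1, 1)]}
```

With a budget of three cells, `car_a` is asked for two features and `car_b` for one. A fixed per-agent quota would have split the budget without regard to the scores, which is the kind of allocation the receiver-centric plan is meant to avoid.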
Related papers
- COOPERTRIM: Adaptive Data Selection for Uncertainty-Aware Cooperative Perception [4.26607743838444]
Cooperative perception enables autonomous agents to share encoded representations over wireless communication to enhance each other's live situational awareness. Recent studies have explored selection strategies that share only a subset of features per frame while striving to keep the performance on par. We take a proactive approach, exploiting the temporal continuity to identify features that capture environment dynamics, while avoiding repetitive and redundant transmission of static information. We instantiate this intuition into an adaptive selection framework, COOPERTRIM, which introduces a novel conformal temporal uncertainty metric to gauge feature relevance, and a data-driven mechanism to dynamically determine the sharing quantity.
arXiv Detail & Related papers (2026-02-07T21:18:46Z)
- CoCo-Fed: A Unified Framework for Memory- and Communication-Efficient Federated Learning at the Wireless Edge [50.42067935605982]
We propose a novel Compression and Combination-based Federated learning framework that unifies local memory efficiency and global communication reduction. CoCo-Fed significantly outperforms state-of-the-art baselines in both memory and communication efficiency while maintaining robust convergence under non-IID settings.
arXiv Detail & Related papers (2026-01-02T03:39:50Z)
- UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction [83.48950950780554]
Building extraction from remote sensing images is a challenging task due to the complex structure variations of buildings. Existing methods employ convolutional or self-attention blocks to capture the multi-scale features in the segmentation models. We present an Uncertainty-Aggregated Global-Local Fusion Network (UAGLNet) to exploit high-quality global-local visual semantics.
arXiv Detail & Related papers (2025-12-15T02:59:16Z)
- JigsawComm: Joint Semantic Feature Encoding and Transmission for Communication-Efficient Cooperative Perception [7.867653563872962]
JigsawComm is an end-to-end trained, semantic-aware, and communication-efficient CP framework. It uses a regularized encoder to extract semantically-relevant and sparse features. It uses a lightweight Feature Utility Estimator to predict the contribution of each agent's features to the final perception task.
arXiv Detail & Related papers (2025-11-21T23:36:24Z)
- Towards Federated Clustering: A Client-wise Private Graph Aggregation Framework [57.04850867402913]
Federated clustering addresses the challenge of extracting patterns from decentralized, unlabeled data. We propose Structural Privacy-Preserving Federated Graph Clustering (SPP-FGC), a novel algorithm that innovatively leverages local structural graphs as the primary medium for privacy-preserving knowledge sharing. Our framework achieves state-of-the-art performance, improving clustering accuracy by up to 10% (NMI) over federated baselines while maintaining provable privacy guarantees.
arXiv Detail & Related papers (2025-11-14T03:05:22Z)
- Referring Remote Sensing Image Segmentation with Cross-view Semantics Interaction Network [65.01521002836611]
We propose a parallel yet unified segmentation framework, Cross-view Semantics Interaction Network (CSINet), to address these limitations. Motivated by human behavior in observing targets of interest, the network orchestrates visual cues from remote and close distances to conduct synergistic prediction. At every encoding stage, a Cross-View Window-attention module (CVWin) is utilized to supplement global and local semantics into close-view and remote-view branch features.
arXiv Detail & Related papers (2025-08-02T11:57:56Z)
- CoCMT: Communication-Efficient Cross-Modal Transformer for Collaborative Perception [14.619784179608361]
Multi-agent collaborative perception enhances each agent's capabilities by sharing sensing information to cooperatively perform robot perception tasks. Existing representative collaborative perception systems transmit intermediate feature maps, which contain a significant amount of non-critical information. We introduce CoCMT, an object-query-based collaboration framework that maximizes communication bandwidth by selectively extracting and transmitting essential features.
arXiv Detail & Related papers (2025-03-13T06:41:25Z)
- V2X-PC: Vehicle-to-everything Collaborative Perception via Point Cluster [58.79477191603844]
We introduce a new message unit, namely point cluster, to represent the scene sparsely with a combination of low-level structure information and high-level semantic information.
This framework includes a Point Cluster Packing (PCP) module to preserve object features and manage bandwidth.
Experiments on two widely recognized collaborative perception benchmarks showcase the superior performance of our method compared to the previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-03-25T11:24:02Z)
- Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
arXiv Detail & Related papers (2021-12-03T19:23:48Z)
- Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm.
Our algorithm is provably optimal even with communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.