Related papers: Rate-Distortion Optimized Communication for Collaborative Perception

URL: http://arxiv.org/abs/2509.21994v1
Date: Fri, 26 Sep 2025 07:21:32 GMT
Title: Rate-Distortion Optimized Communication for Collaborative Perception
Authors: Genjia Liu, Anning Hu, Yue Hu, Wenjun Zhang, Siheng Chen
Abstract summary: We introduce a pragmatic rate-distortion theory for multi-agent collaboration, specifically formulated to analyze performance-communication trade-off. We propose RDcomm, a communication-efficient collaborative perception framework that introduces two key innovations. Experiments on 3D object detection and BEV segmentation demonstrate that RDcomm achieves state-of-the-art accuracy on DAIR-V2X and OPV2V, while reducing communication volume by up to 108 times.
Score: 47.737814518681326
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Collaborative perception emphasizes enhancing environmental understanding by enabling multiple agents to share visual information with limited bandwidth resources. While prior work has explored the empirical trade-off between task performance and communication volume, a significant gap remains in the theoretical foundation. To fill this gap, we draw on information theory and introduce a pragmatic rate-distortion theory for multi-agent collaboration, specifically formulated to analyze performance-communication trade-off in goal-oriented multi-agent systems. This theory concretizes two key conditions for designing optimal communication strategies: supplying pragmatically relevant information and transmitting redundancy-less messages. Guided by these two conditions, we propose RDcomm, a communication-efficient collaborative perception framework that introduces two key innovations: i) task entropy discrete coding, which assigns features with task-relevant codeword-lengths to maximize the efficiency in supplying pragmatic information; ii) mutual-information-driven message selection, which utilizes mutual information neural estimation to approach the optimal redundancy-less condition. Experiments on 3D object detection and BEV segmentation demonstrate that RDcomm achieves state-of-the-art accuracy on DAIR-V2X and OPV2V, while reducing communication volume by up to 108 times. The code will be released.

Related papers

On the Rate-Distortion-Complexity Tradeoff for Semantic Communication [42.300429885256435]
This paper proposes a rate-distortion-complexity (RDC) framework which extends the classical rate-distortion theory. We derive the theoretical results of the minimum achievable rate under given constraints on semantic distance and complexity. Our results show a fundamental three-way tradeoff among achievable rate, semantic distance, and model complexity.
arXiv Detail & Related papers (2026-02-16T05:45:52Z)
Communication-Efficient Multi-Agent 3D Detection via Hybrid Collaboration [34.67157102711333]
Collaborative 3D detection can substantially boost detection performance by allowing agents to exchange complementary information. We propose a novel hybrid collaboration that adaptively integrates two types of communication messages. We present HyComm, a communication-efficient LiDAR-based collaborative 3D detection system.
arXiv Detail & Related papers (2025-08-09T20:33:37Z)
Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks [55.32199894495722]
We investigate an LMM-based vehicle AI assistant using a Large Language and Vision Assistant (LLaVA). To reduce computational demands and shorten response time, we optimize LLaVA's image slicing to selectively focus on areas of utmost interest to users. We construct a Visual Question Answering (VQA) dataset for traffic scenarios to evaluate effectiveness.
arXiv Detail & Related papers (2025-05-05T07:18:47Z)
Multi-Modal Self-Supervised Semantic Communication [52.76990720898666]
We propose a multi-modal semantic communication system that leverages multi-modal self-supervised learning to enhance task-agnostic feature extraction. The proposed approach effectively captures both modality-invariant and modality-specific features while minimizing training-related communication overhead. The findings underscore the advantages of multi-modal self-supervised learning in semantic communication, paving the way for more efficient and scalable edge inference systems.
arXiv Detail & Related papers (2025-03-18T06:13:02Z)
CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization [23.958663737034318]
We propose a novel communication-efficient collaborative perception framework based on supply-demand awareness and intermediate-late hybridization. Experiments on multiple datasets, including both simulated and real-world scenarios, demonstrate that CoSDH achieves state-of-the-art detection accuracy and optimal bandwidth trade-offs.
arXiv Detail & Related papers (2025-03-05T12:02:04Z)
Deep Reinforcement Learning-Based User Scheduling for Collaborative Perception [24.300126250046894]
Collaborative perception is envisioned to improve perceptual accuracy by using vehicle-to-everything (V2X) communication. Due to limited communication resources, it is impractical for all units to transmit sensing data such as point clouds or high-definition video. We propose a deep reinforcement learning-based V2X user scheduling algorithm for collaborative perception.
arXiv Detail & Related papers (2025-02-12T04:45:00Z)
Communication Learning in Multi-Agent Systems from Graph Modeling Perspective [62.13508281188895]
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph. We introduce a temporal gating mechanism for each agent, enabling dynamic decisions on whether to receive shared information at a given time.
arXiv Detail & Related papers (2024-11-01T05:56:51Z)
Semantic Communication for Cooperative Perception using HARQ [51.148203799109304]
We leverage an importance map to distill critical semantic information, introducing a cooperative perception semantic communication framework. To counter the challenges posed by time-varying multipath fading, our approach incorporates the use of orthogonal frequency-division multiplexing (OFDM) along with channel estimation and equalization strategies. We introduce a novel semantic error detection method that is integrated with our semantic communication framework in the spirit of hybrid automatic repeat request (HARQ).
arXiv Detail & Related papers (2024-08-29T08:53:26Z)
Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck [28.661084093544684]
We propose a novel approach based on the information bottleneck (IB) principle and invariant risk minimization (IRM) framework. The proposed method aims to extract compact and informative features that possess high capability for effective domain-shift generalization. We show that the proposed scheme outperforms state-of-the-art approaches and achieves a better rate-distortion tradeoff.
arXiv Detail & Related papers (2024-05-15T17:07:55Z)
PACE: A Pragmatic Agent for Enhancing Communication Efficiency Using Large Language Models [29.016842120305892]
This paper proposes an image pragmatic communication framework based on a Pragmatic Agent for Communication Efficiency (PACE) using Large Language Models (LLMs). PACE sequentially performs semantic perception, intention resolution, and intention-oriented coding. For experimental validation, this paper constructs an image pragmatic communication dataset along with corresponding evaluation standards.
arXiv Detail & Related papers (2024-01-30T06:55:17Z)
Pragmatic Communication in Multi-Agent Collaborative Perception [80.14322755297788]
Collaborative perception results in a trade-off between perception ability and communication costs. We propose PragComm, a multi-agent collaborative perception system with two key components. PragComm consistently outperforms previous methods with more than 32.7K times lower communication volume.
arXiv Detail & Related papers (2024-01-23T11:58:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.