LangCoop: Collaborative Driving with Language
- URL: http://arxiv.org/abs/2504.13406v2
- Date: Mon, 21 Apr 2025 02:00:43 GMT
- Title: LangCoop: Collaborative Driving with Language
- Authors: Xiangbo Gao, Yuheng Wu, Rujia Wang, Chenxi Liu, Yang Zhou, Zhengzhong Tu,
- Abstract summary: LangCoop is a new paradigm for collaborative autonomous driving that leverages natural language as a compact yet expressive medium for inter-agent communication.<n>LangCoop achieves a remarkable 96% reduction in communication bandwidth ( 2KB per message) compared to image-based communication.
- Score: 13.25814019477039
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent collaboration holds great promise for enhancing the safety, reliability, and mobility of autonomous driving systems by enabling information sharing among multiple connected agents. However, existing multi-agent communication approaches are hindered by limitations of existing communication media, including high bandwidth demands, agent heterogeneity, and information loss. To address these challenges, we introduce LangCoop, a new paradigm for collaborative autonomous driving that leverages natural language as a compact yet expressive medium for inter-agent communication. LangCoop features two key innovations: Mixture Model Modular Chain-of-thought (M$^3$CoT) for structured zero-shot vision-language reasoning and Natural Language Information Packaging (LangPack) for efficiently packaging information into concise, language-based messages. Through extensive experiments conducted in the CARLA simulations, we demonstrate that LangCoop achieves a remarkable 96\% reduction in communication bandwidth (< 2KB per message) compared to image-based communication, while maintaining competitive driving performance in the closed-loop evaluation. Our project page and code are at https://xiangbogaobarry.github.io/LangCoop/.
Related papers
- MT-PingEval: Evaluating Multi-Turn Collaboration with Private Information Games [70.37904949359938]
We evaluate language models in multi-turn interactions using a suite of collaborative games that require effective communication about private information.<n>We find that language models are unable to use interactive collaboration to improve over the non-interactive baseline scenario.<n>We analyze the linguistic features of these dialogues, assessing the roles of sycophancy, information density, and discourse coherence.
arXiv Detail & Related papers (2026-02-27T17:13:20Z) - SafeCoop: Unravelling Full Stack Safety in Agentic Collaborative Driving [16.620713493180165]
Collaborative driving systems leverage vehicle-to-everything (V2X) communication across multiple agents to enhance driving safety and efficiency.<n>Traditional V2X systems face persistent challenges, including high bandwidth demands, semantic loss, and interoperability issues.<n>Recent advances investigate natural language as a promising medium, which can provide semantic richness, decision-level reasoning, and human-machine interoperability at significantly lower bandwidth.
arXiv Detail & Related papers (2025-10-20T21:41:28Z) - UNCAP: Uncertainty-Guided Planning Using Natural Language Communication for Cooperative Autonomous Vehicles [79.10221881250759]
Uncertainty-Guided Natural Language Cooperative Autonomous Planning (UNCAP) is a vision-language model-based planning approach.<n>It enables CAVs to communicate via lightweight natural language messages while explicitly accounting for perception uncertainty in decision-making.<n> Experiments across diverse driving scenarios show a 63% reduction in communication bandwidth with a 31% increase in driving safety score, a 61% reduction in decision uncertainty, and a four-fold increase in collision distance margin.
arXiv Detail & Related papers (2025-10-14T21:09:09Z) - Cued-Agent: A Collaborative Multi-Agent System for Automatic Cued Speech Recognition [17.451829471077858]
Cued Speech (CS) is a visual communication system that combines lip-reading with hand coding to facilitate communication for individuals with hearing impairments.<n> Automatic CS Recognition (ACSR) aims to convert CS hand gestures and lip movements into text via AI-driven methods.<n>We propose the first collaborative multi-agent system for ACSR, named Cued-Agent.
arXiv Detail & Related papers (2025-08-01T07:40:39Z) - AnyMAC: Cascading Flexible Multi-Agent Collaboration via Next-Agent Prediction [70.60422261117816]
We propose a new framework that rethinks multi-agent coordination through a sequential structure rather than a graph structure.<n>Our method focuses on two key directions: (1) Next-Agent Prediction, which selects the most suitable agent role at each step, and (2) Next-Context Selection, which enables each agent to selectively access relevant information from any previous step.
arXiv Detail & Related papers (2025-06-21T18:34:43Z) - LANGTRAJ: Diffusion Model and Dataset for Language-Conditioned Trajectory Simulation [94.84458417662404]
LangTraj is a language-conditioned scene-diffusion model that simulates the joint behavior of all agents in traffic scenarios.
By conditioning on natural language inputs, LangTraj provides flexible and intuitive control over interactive behaviors.
LangTraj demonstrates strong performance in realism, language controllability, and language-conditioned safety-critical simulation.
arXiv Detail & Related papers (2025-04-15T17:14:06Z) - Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning [9.48183472865413]
We develop a scalable communication protocol for cooperative multi-agent reinforcement learning (MARL)<n>We propose utilizing the exponential topology to enable rapid information dissemination among agents by leveraging its small-diameter and small-size properties.<n>Experiments on large-scale cooperative benchmarks, including MAgent and Infrastructure Management Planning, demonstrate the superior performance and robust zero-shot transferability of ExpoComm.
arXiv Detail & Related papers (2025-02-27T03:15:31Z) - Semantic Communication for Cooperative Perception using HARQ [51.148203799109304]
We leverage an importance map to distill critical semantic information, introducing a cooperative perception semantic communication framework.
To counter the challenges posed by time-varying multipath fading, our approach incorporates the use of frequency-division multiplexing (OFDM) along with channel estimation and equalization strategies.
We introduce a novel semantic error detection method that is integrated with our semantic communication framework in the spirit of hybrid automatic repeated request (HARQ)
arXiv Detail & Related papers (2024-08-29T08:53:26Z) - Bidirectional Emergent Language in Situated Environments [4.950411915351642]
We introduce two novel cooperative environments: Multi-Agent Pong and Collectors.
optimal performance requires the emergence of a communication protocol, but moderate success can be achieved without one.
We find that the emerging communication is sparse, with the agents only generating meaningful messages and acting upon incoming messages in states where they cannot succeed without coordination.
arXiv Detail & Related papers (2024-08-26T21:25:44Z) - Verco: Learning Coordinated Verbal Communication for Multi-agent Reinforcement Learning [42.27106057372819]
We propose a novel multi-agent reinforcement learning algorithm that embeds large language models into agents.
The framework has a message module and an action module.
Experiments conducted on the Overcooked game demonstrate our method significantly enhances the learning efficiency and performance of existing methods.
arXiv Detail & Related papers (2024-04-27T05:10:33Z) - Context-aware Communication for Multi-agent Reinforcement Learning [6.109127175562235]
We develop a context-aware communication scheme for multi-agent reinforcement learning (MARL)
In the first stage, agents exchange coarse representations in a broadcast fashion, providing context for the second stage.
Following this, agents utilize attention mechanisms in the second stage to selectively generate messages personalized for the receivers.
To evaluate the effectiveness of CACOM, we integrate it with both actor-critic and value-based MARL algorithms.
arXiv Detail & Related papers (2023-12-25T03:33:08Z) - ChatDev: Communicative Agents for Software Development [84.90400377131962]
ChatDev is a chat-powered software development framework in which specialized agents are guided in what to communicate.
These agents actively contribute to the design, coding, and testing phases through unified language-based communication.
arXiv Detail & Related papers (2023-07-16T02:11:34Z) - Building Cooperative Embodied Agents Modularly with Large Language
Models [104.57849816689559]
We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.
We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework.
Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
arXiv Detail & Related papers (2023-07-05T17:59:27Z) - CAMEL: Communicative Agents for "Mind" Exploration of Large Language
Model Society [58.04479313658851]
This paper explores the potential of building scalable techniques to facilitate autonomous cooperation among communicative agents.
We propose a novel communicative agent framework named role-playing.
Our contributions include introducing a novel communicative agent framework, offering a scalable approach for studying the cooperative behaviors and capabilities of multi-agent systems.
arXiv Detail & Related papers (2023-03-31T01:09:00Z) - On the Role of Emergent Communication for Social Learning in Multi-Agent
Reinforcement Learning [0.0]
Social learning uses cues from experts to align heterogeneous policies, reduce sample complexity, and solve partially observable tasks.
This paper proposes an unsupervised method based on the information bottleneck to capture both referential complexity and task-specific utility.
arXiv Detail & Related papers (2023-02-28T03:23:27Z) - Learning to Infer Belief Embedded Communication [9.862909791015237]
This paper introduces a novel algorithm to mimic an agent's language learning ability.
It contains a perception module for decoding other agents' intentions in response to their past actions.
It also includes a language generation module for learning implicit grammar during communication with two or more agents.
arXiv Detail & Related papers (2022-03-15T12:42:10Z) - Multi-agent Communication with Graph Information Bottleneck under
Limited Bandwidth (a position paper) [92.11330289225981]
In many real-world scenarios, communication can be expensive and the bandwidth of the multi-agent system is subject to certain constraints.
Redundant messages who occupy the communication resources can block the transmission of informative messages and thus jeopardize the performance.
We propose a novel multi-agent communication module, CommGIB, which effectively compresses the structure information and node information in the communication graph to deal with bandwidth-constrained settings.
arXiv Detail & Related papers (2021-12-20T07:53:44Z) - Learning Structured Communication for Multi-agent Reinforcement Learning [104.64584573546524]
This work explores the large-scale multi-agent communication mechanism under a multi-agent reinforcement learning (MARL) setting.
We propose a novel framework termed as Learning Structured Communication (LSC) by using a more flexible and efficient communication topology.
arXiv Detail & Related papers (2020-02-11T07:19:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.