Towards Natural Language Communication for Cooperative Autonomous Driving via Self-Play
- URL: http://arxiv.org/abs/2505.18334v1
- Date: Fri, 23 May 2025 19:40:09 GMT
- Title: Towards Natural Language Communication for Cooperative Autonomous Driving via Self-Play
- Authors: Jiaxun Cui, Chen Tang, Jarrett Holtz, Janice Nguyen, Alessandro G. Allievi, Hang Qiu, Peter Stone
- Abstract summary: Using natural language as a vehicle-to-vehicle (V2V) communication protocol offers the potential for autonomous vehicles to drive cooperatively. This paper introduces a novel method, LLM+Debrief, to learn a message generation and high-level decision-making policy for autonomous vehicles. Our experimental results demonstrate that LLM+Debrief is more effective at generating meaningful and human-understandable natural language messages.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Past work has demonstrated that autonomous vehicles can drive more safely if they communicate with one another than if they do not. However, their communication has often not been human-understandable. Using natural language as a vehicle-to-vehicle (V2V) communication protocol offers the potential for autonomous vehicles to drive cooperatively not only with each other but also with human drivers. In this work, we propose a suite of traffic tasks in autonomous driving where vehicles in a traffic scenario need to communicate in natural language to facilitate coordination in order to avoid an imminent collision and/or support efficient traffic flow. To this end, this paper introduces a novel method, LLM+Debrief, to learn a message generation and high-level decision-making policy for autonomous vehicles through multi-agent discussion. To evaluate LLM agents for driving, we developed a gym-like simulation environment that contains a range of driving scenarios. Our experimental results demonstrate that LLM+Debrief is more effective at generating meaningful and human-understandable natural language messages to facilitate cooperation and coordination than a zero-shot LLM agent. Our code and demo videos are available at https://talking-vehicles.github.io/.
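The abstract describes a gym-like simulation environment in which vehicles must exchange natural-language messages and make high-level decisions to avoid collisions. The sketch below illustrates what such an interface could look like; all names here (`TalkingVehiclesEnv`, `VehicleObs`, the maneuver strings) are illustrative assumptions, not the actual API released at https://talking-vehicles.github.io/.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class VehicleObs:
    """Per-vehicle observation: local state plus messages heard over V2V."""
    speed: float
    lane: int
    inbox: List[str] = field(default_factory=list)

class TalkingVehiclesEnv:
    """Toy two-vehicle merge scenario: each step, every agent picks a
    high-level maneuver and broadcasts a natural-language message that
    all other agents receive in their inbox."""

    def __init__(self) -> None:
        self._obs: Dict[str, VehicleObs] = {}

    def reset(self) -> Dict[str, VehicleObs]:
        self._obs = {
            "car_0": VehicleObs(speed=10.0, lane=0),
            "car_1": VehicleObs(speed=12.0, lane=1),
        }
        return self._obs

    def step(self, actions: Dict[str, Tuple[str, str]]):
        """Each action is a (high_level_decision, outgoing_message) pair."""
        for sender, (maneuver, message) in actions.items():
            # Broadcast the message to every other agent's inbox.
            for receiver, obs in self._obs.items():
                if receiver != sender:
                    obs.inbox.append(f"{sender}: {message}")
            # Apply the high-level decision as a crude kinematic update.
            if maneuver == "slow_down":
                self._obs[sender].speed -= 2.0
            elif maneuver == "speed_up":
                self._obs[sender].speed += 2.0
        # Stand-in conflict check: the merge succeeds if the two vehicles
        # end up at different speeds (i.e., one yielded to the other).
        done = self._obs["car_0"].speed != self._obs["car_1"].speed
        reward = {agent: (1.0 if done else 0.0) for agent in self._obs}
        return self._obs, reward, done

env = TalkingVehiclesEnv()
env.reset()
obs, reward, done = env.step({
    "car_0": ("slow_down", "I'll yield; merge ahead of me."),
    "car_1": ("speed_up", "Thanks, merging now."),
})
```

In LLM+Debrief, the message and maneuver for each agent would be produced by an LLM policy conditioned on the observation and inbox, with the post-episode debrief discussion used to refine that policy; the environment loop itself stays this simple.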
Related papers
- Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases [102.05741859030951]
We propose CODA-LM, the first benchmark for the automatic evaluation of LVLMs for self-driving corner cases. We show that using text-only large language models as judges reveals even better alignment with human preferences than the LVLM judges. Our CODA-VLM performs comparably with GPT-4V, even surpassing GPT-4V by +21.42% on the regional perception task.
arXiv Detail & Related papers (2024-04-16T14:20:55Z)
- Driving Style Alignment for LLM-powered Driver Agent [9.057138382259065]
We propose a framework to align driver agents with human driving styles through demonstrations and feedback.
We construct a natural language dataset of human driver behaviors through naturalistic driving experiments and post-driving interviews.
The framework's effectiveness is validated through simulation experiments in the CARLA urban traffic simulator and further corroborated by human evaluations.
arXiv Detail & Related papers (2024-03-17T23:07:13Z)
- SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems [53.94772445896213]
Large Language Model (LLM)-based multi-agent systems have demonstrated promising performance in simulating human society.
We propose SpeechAgents, a multi-modal LLM based multi-agent system designed for simulating human communication.
arXiv Detail & Related papers (2024-01-08T15:01:08Z)
- Personalized Autonomous Driving with Large Language Models: Field Experiments [11.429053835807697]
We introduce an LLM-based framework, Talk2Drive, capable of translating natural verbal commands into executable controls.
This is the first-of-its-kind multi-scenario field experiment that deploys LLMs on a real-world autonomous vehicle.
We validate that the proposed memory module considers personalized preferences and further reduces the takeover rate by up to 65.2%.
arXiv Detail & Related papers (2023-12-14T23:23:37Z)
- DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving [69.82743399946371]
DriveMLM is a framework that can perform closed-loop autonomous driving in realistic simulators.
We employ a multi-modal LLM (MLLM) to model the behavior planning module of a modular AD system.
This model can plug and play in existing AD systems such as Apollo for closed-loop driving.
arXiv Detail & Related papers (2023-12-14T18:59:05Z)
- LMDrive: Closed-Loop End-to-End Driving with Large Language Models [37.910449013471656]
Large language models (LLMs) have shown impressive reasoning capabilities that approach "Artificial General Intelligence".
This paper introduces LMDrive, a novel language-guided, end-to-end, closed-loop autonomous driving framework.
arXiv Detail & Related papers (2023-12-12T18:24:15Z)
- Receive, Reason, and React: Drive as You Say with Large Language Models in Autonomous Vehicles [13.102404404559428]
We propose a novel framework that leverages Large Language Models (LLMs) to enhance the decision-making process in autonomous vehicles.
Our research includes experiments in HighwayEnv, a collection of environments for autonomous driving and tactical decision-making tasks.
We also examine real-time personalization, demonstrating how LLMs can influence driving behaviors based on verbal commands.
arXiv Detail & Related papers (2023-10-12T04:56:01Z)
- LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving [84.31119464141631]
This work employs Large Language Models (LLMs) as a decision-making component for complex autonomous driving scenarios. Extensive experiments demonstrate that our proposed method not only consistently surpasses baseline approaches in single-vehicle tasks, but also helps handle complex driving behaviors, including multi-vehicle coordination.
arXiv Detail & Related papers (2023-10-04T17:59:49Z)
- Drive as You Speak: Enabling Human-Like Interaction with Large Language Models in Autonomous Vehicles [13.102404404559428]
We present a novel framework that leverages Large Language Models (LLMs) to enhance autonomous vehicles' decision-making processes.
The proposed framework holds the potential to revolutionize the way autonomous vehicles operate, offering personalized assistance, continuous learning, and transparent decision-making.
arXiv Detail & Related papers (2023-09-19T00:47:13Z)
- Building Cooperative Embodied Agents Modularly with Large Language Models [104.57849816689559]
We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.
We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework.
Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
arXiv Detail & Related papers (2023-07-05T17:59:27Z)
- COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles [54.61668577827041]
We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving.
Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate.
arXiv Detail & Related papers (2022-05-04T17:55:12Z)