MMCD: Multi-Modal Collaborative Decision-Making for Connected Autonomy with Knowledge Distillation
- URL: http://arxiv.org/abs/2509.18198v1
- Date: Fri, 19 Sep 2025 23:38:18 GMT
- Title: MMCD: Multi-Modal Collaborative Decision-Making for Connected Autonomy with Knowledge Distillation
- Authors: Rui Liu, Zikang Wang, Peng Gao, Yu Shen, Pratap Tokekar, Ming Lin
- Abstract summary: We introduce MMCD (Multi-Modal Collaborative Decision-Making), a novel framework for connected autonomy. The framework fuses multi-modal observations from the ego and collaborating vehicles to enhance decision-making under challenging conditions. Our method improves driving safety by up to ${\it 20.7}\%$, surpassing the best existing baseline in detecting potential accidents.
- Score: 22.153121162641735
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Autonomous systems have advanced significantly, but challenges persist in accident-prone environments where robust decision-making is crucial. A single vehicle's limited sensor range and obstructed views increase the likelihood of accidents. Multi-vehicle connected systems and multi-modal approaches, leveraging RGB images and LiDAR point clouds, have emerged as promising solutions. However, existing methods often assume the availability of all data modalities and connected vehicles during both training and testing, which is impractical due to potential sensor failures or missing connected vehicles. To address these challenges, we introduce a novel framework MMCD (Multi-Modal Collaborative Decision-making) for connected autonomy. Our framework fuses multi-modal observations from ego and collaborative vehicles to enhance decision-making under challenging conditions. To ensure robust performance when certain data modalities are unavailable during testing, we propose an approach based on cross-modal knowledge distillation with a teacher-student model structure. The teacher model is trained with multiple data modalities, while the student model is designed to operate effectively with reduced modalities. In experiments on $\textit{connected autonomous driving with ground vehicles}$ and $\textit{aerial-ground vehicles collaboration}$, our method improves driving safety by up to ${\it 20.7}\%$, surpassing the best-existing baseline in detecting potential accidents and making safe driving decisions. More information can be found on our website https://ruiiu.github.io/mmcd.
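The abstract describes a cross-modal knowledge distillation setup in which a teacher trained on all modalities guides a student that sees only a reduced modality set. The paper does not publish its loss in this listing, so the following is a minimal NumPy sketch of the standard temperature-scaled distillation loss commonly used for such teacher–student training; the function names and the temperature value are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-softened softmax; subtracting the max keeps exp() stable.
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) between temperature-softened distributions,
    # scaled by T^2 as in standard knowledge distillation, so gradients
    # keep a comparable magnitude across temperatures.
    p = softmax(teacher_logits, T)  # soft targets from the full-modality teacher
    q = softmax(student_logits, T)  # predictions from the reduced-modality student
    return float(T * T * np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))
```

When the student's logits match the teacher's, the loss is zero; any divergence yields a positive penalty, which is what pushes the reduced-modality student toward the teacher's decision distribution.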
Related papers
- Large Multimodal Models for Embodied Intelligent Driving: The Next Frontier in Self-Driving? [68.82027978227008]
This article introduces a novel semantics- and policy-dual-driven hybrid decision framework to tackle this challenge. The framework merges LMMs for semantic understanding and cognitive representation with deep reinforcement learning (DRL) for real-time policy optimization. A case study is conducted experimentally to validate the framework's performance superiority on a lane-change planning task.
arXiv Detail & Related papers (2026-01-13T11:05:12Z) - CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems [38.20651868834145]
We propose Collaborative Auxiliary Modality Learning (CAML), a novel multi-modal multi-agent framework. We show that CAML achieves up to a ${\it 58.1}\%$ improvement in accident detection. We also validate CAML on real-world aerial-ground robot data for collaborative semantic segmentation.
arXiv Detail & Related papers (2025-02-25T03:59:40Z) - V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models [31.537045261401666]
We propose a novel problem setting that integrates a Multi-Modal Large Language Model into cooperative autonomous driving. We also propose our baseline method, the Vehicle-to-Vehicle Multi-Modal Large Language Model (V2V-LLM). Experimental results show that V2V-LLM can be a promising unified model architecture for performing various tasks in cooperative autonomous driving.
arXiv Detail & Related papers (2025-02-14T08:05:41Z) - Towards Interactive and Learnable Cooperative Driving Automation: a Large Language Model-Driven Decision-Making Framework [87.7482313774741]
Connected Autonomous Vehicles (CAVs) have begun open-road testing around the world, but their safety and efficiency in complex scenarios are still not satisfactory. This paper proposes CoDrivingLLM, an interactive and learnable LLM-driven cooperative driving framework.
arXiv Detail & Related papers (2024-09-19T14:36:00Z) - M2DA: Multi-Modal Fusion Transformer Incorporating Driver Attention for Autonomous Driving [11.36165122994834]
We propose a Multi-Modal fusion transformer incorporating Driver Attention (M2DA) for autonomous driving.
By incorporating driver attention, we endow autonomous vehicles with human-like scene understanding, enabling them to identify crucial areas precisely and ensure safety.
arXiv Detail & Related papers (2024-03-19T08:54:52Z) - LLM4Drive: A Survey of Large Language Models for Autonomous Driving [62.10344445241105]
Large language models (LLMs) have demonstrated abilities including understanding context, logical reasoning, and generating answers.
In this paper, we systematically review the research line of $\textit{Large Language Models for Autonomous Driving}$ (LLM4AD).
arXiv Detail & Related papers (2023-11-02T07:23:33Z) - Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models [114.69732301904419]
We present an approach to apply end-to-end open-set (any environment/scene) autonomous driving that is capable of providing driving decisions from representations queryable by image and text.
Our approach demonstrates unparalleled results in diverse tests while achieving significantly greater robustness in out-of-distribution situations.
arXiv Detail & Related papers (2023-10-26T17:56:35Z) - Learning Driver Models for Automated Vehicles via Knowledge Sharing and Personalization [2.07180164747172]
This paper describes a framework for learning Automated Vehicles (AVs) driver models via knowledge sharing between vehicles and personalization.
The framework finds several applications across transportation engineering, including intelligent transportation systems, traffic management, and vehicle-to-vehicle communication.
arXiv Detail & Related papers (2023-08-31T17:18:15Z) - Generative AI-empowered Simulation for Autonomous Driving in Vehicular Mixed Reality Metaverses [130.15554653948897]
In the vehicular mixed reality (MR) Metaverse, the distance between physical and virtual entities can be overcome.
Large-scale traffic and driving simulation via realistic data collection and fusion from the physical world is difficult and costly.
We propose an autonomous driving architecture, where generative AI is leveraged to synthesize unlimited conditioned traffic and driving data in simulations.
arXiv Detail & Related papers (2023-02-16T16:54:10Z) - COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles [54.61668577827041]
We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving.
Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate.
arXiv Detail & Related papers (2022-05-04T17:55:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.