Related papers: Topology Enhanced MARL for Multi-Vehicle Cooperative Decision-Making of CAVs

URL: http://arxiv.org/abs/2507.12110v1
Date: Wed, 16 Jul 2025 10:27:36 GMT
Title: Topology Enhanced MARL for Multi-Vehicle Cooperative Decision-Making of CAVs
Authors: Ye Han, Lijun Zhang, Dejian Meng, Zhuang Zhang,
Abstract summary: TPE-MARL balances exploration and exploitation in mixed traffic scenarios. It exhibits superior performance in terms of traffic efficiency, safety, decision smoothness, and task completion.
Score: 11.569616198957887
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The exploration-exploitation trade-off constitutes one of the fundamental challenges in reinforcement learning (RL), which is exacerbated in multi-agent reinforcement learning (MARL) due to the exponential growth of joint state-action spaces. This paper proposes a topology-enhanced MARL (TPE-MARL) method for optimizing cooperative decision-making of connected and autonomous vehicles (CAVs) in mixed traffic. This work presents two primary contributions: First, we construct a game topology tensor for dynamic traffic flow, effectively compressing high-dimensional traffic state information and decreasing the search space for MARL algorithms. Second, building upon the designed game topology tensor and using QMIX as the backbone RL algorithm, we establish a topology-enhanced MARL framework incorporating visit counts and agent mutual information. Extensive simulations across varying traffic densities and CAV penetration rates demonstrate the effectiveness of TPE-MARL. Evaluations encompassing training dynamics, exploration patterns, macroscopic traffic performance metrics, and microscopic vehicle behaviors reveal that TPE-MARL successfully balances exploration and exploitation. Consequently, it exhibits superior performance in terms of traffic efficiency, safety, decision smoothness, and task completion. Furthermore, the algorithm demonstrates decision-making rationality comparable to or exceeding that of human drivers in both mixed-autonomy and fully autonomous traffic scenarios. Code for our work is available at https://github.com/leoPub/tpemarl.

Related papers

Topology-Assisted Spatio-Temporal Pattern Disentangling for Scalable MARL in Large-scale Autonomous Traffic Control [14.929720580977152]
This paper introduces a novel MARL framework that integrates Dynamic Graph Neural Networks (DGNNs) and Topological Data Analysis (TDA). Inspired by the Mixture of Experts (MoE) architecture in Large Language Models (LLMs), a topology-assisted spatial pattern disentangling (TSD)-enhanced MoE is proposed. Extensive experiments conducted on real-world traffic scenarios, together with comprehensive theoretical analysis, validate the superior performance of the proposed framework.
arXiv Detail & Related papers (2025-06-14T11:18:12Z)
Reinforced Model Merging [53.84354455400038]
We present an innovative framework termed Reinforced Model Merging (RMM), which encompasses an environment and agent tailored for merging tasks. By utilizing data subsets during the evaluation process, we addressed the bottleneck in the reward feedback phase, thereby accelerating RMM by up to 100 times.
arXiv Detail & Related papers (2025-03-27T08:52:41Z)
Learning Multi-Robot Coordination through Locality-Based Factorized Multi-Agent Actor-Critic Algorithm [54.98788921815576]
We present a novel cooperative multi-agent reinforcement learning method called Locality-based Factorized Multi-Agent Actor-Critic (Loc-FACMAC). We integrate the concept of locality into critic learning, where strongly related robots form partitions during training. Our method improves existing algorithms by focusing on local rewards and leveraging partition-based learning to enhance training efficiency and performance.
arXiv Detail & Related papers (2025-03-24T16:00:16Z)
CoLLMLight: Cooperative Large Language Model Agents for Network-Wide Traffic Signal Control [7.0964925117958515]
Traffic Signal Control (TSC) plays a critical role in urban traffic management by optimizing traffic flow and mitigating congestion. Existing approaches fail to address the essential need for inter-agent coordination. We propose CoLLMLight, a cooperative LLM agent framework for TSC.
arXiv Detail & Related papers (2025-03-14T15:40:39Z)
Heterogeneous Multi-Agent Reinforcement Learning for Distributed Channel Access in WLANs [47.600901884970845]
This paper investigates the use of multi-agent reinforcement learning (MARL) to address distributed channel access in wireless local area networks. In particular, we consider the challenging yet more practical case where the agents heterogeneously adopt value-based or policy-based reinforcement learning algorithms to train the model. We propose a heterogeneous MARL training framework, named QPMIX, which adopts a centralized training with distributed execution paradigm to enable heterogeneous agents to collaborate.
arXiv Detail & Related papers (2024-12-18T13:50:31Z)
CoMAL: Collaborative Multi-Agent Large Language Models for Mixed-Autonomy Traffic [11.682456863110767]
CoMAL is a framework designed to address the mixed-autonomy traffic problem by collaboration among autonomous vehicles to optimize traffic flow. CoMAL is built upon large language models, operating in an interactive traffic simulation environment.
arXiv Detail & Related papers (2024-10-18T10:53:44Z)
Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition [49.20086587208214]
We propose a new strategy called think twice before recognizing to improve fine-grained traffic sign recognition (TSR). Our strategy achieves effective fine-grained TSR by stimulating the multiple-thinking capability of large multimodal models (LMM).
arXiv Detail & Related papers (2024-09-03T02:08:47Z)
Learning Decentralized Traffic Signal Controllers with Multi-Agent Graph Reinforcement Learning [42.175067773481416]
We design a new decentralized control architecture with improved environmental observability to capture the spatial-temporal correlation. Specifically, we first develop a topology-aware information aggregation strategy to extract correlation-related information from unstructured data gathered in the road network. A diffusion convolution module is developed, forming a new MARL algorithm, which endows agents with the capabilities of graph learning.
arXiv Detail & Related papers (2023-11-07T06:43:15Z)
MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning [128.19212716007794]
We propose an effective framework called Multi-Agent Masked Attentive Contrastive Learning (MA2CL). MA2CL encourages learning representation to be both temporal and agent-level predictive by reconstructing the masked agent observation in latent space. Our method significantly improves the performance and sample efficiency of different MARL algorithms and outperforms other methods in various vision-based and state-based scenarios.
arXiv Detail & Related papers (2023-06-03T05:32:19Z)
MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms. Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return. We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
Large-Scale Traffic Signal Control by a Nash Deep Q-network Approach [7.23135508361981]
We introduce an off-policy Nash deep Q-Network (OPNDQN) algorithm, which mitigates the weaknesses of both fully centralized and MARL approaches. One of the main advantages of OPNDQN is that it mitigates the non-stationarity of the multi-agent Markov process. We show the dominant superiority of OPNDQN over several existing MARL approaches in terms of average queue length, episode training reward and average waiting time.
arXiv Detail & Related papers (2023-01-02T12:58:51Z)
A Driving Behavior Recognition Model with Bi-LSTM and Multi-Scale CNN [59.57221522897815]
We propose a neural network model based on trajectory information for driving behavior recognition. We evaluate the proposed model on the public BLVD dataset, achieving satisfactory performance.
arXiv Detail & Related papers (2021-03-01T06:47:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.