A Large Language Model-Enhanced Q-learning for Capacitated Vehicle Routing Problem with Time Windows
- URL: http://arxiv.org/abs/2505.06178v2
- Date: Mon, 21 Jul 2025 02:53:14 GMT
- Title: A Large Language Model-Enhanced Q-learning for Capacitated Vehicle Routing Problem with Time Windows
- Authors: Linjiang Cao, Maonan Wang, Xi Xiong
- Abstract summary: This paper proposes a novel LLM-enhanced Q-learning framework to address the Capacitated Vehicle Routing Problem with Time Windows (CVRPTW). Our framework achieves a 7.3% average reduction in cost compared to traditional Q-learning, with fewer training steps required for convergence.
- Score: 3.0518581575184225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Capacitated Vehicle Routing Problem with Time Windows (CVRPTW) is a classic NP-hard combinatorial optimization problem widely applied in logistics distribution and transportation management. Its complexity stems from the constraints of vehicle capacity and time windows, which pose significant challenges to traditional approaches. Advances in Large Language Models (LLMs) provide new possibilities for finding approximate solutions to CVRPTW. This paper proposes a novel LLM-enhanced Q-learning framework to address the CVRPTW with real-time emergency constraints. Our solution introduces an adaptive two-phase training mechanism that transitions from an LLM-guided exploration phase to an autonomous optimization phase of the Q-network. To ensure reliability, we design a three-tier self-correction mechanism based on Chain-of-Thought (CoT) prompting for LLMs: syntactic validation, semantic verification, and physical constraint enforcement. In addition, we apply prioritized replay to the experience generated by the LLM to amplify its regulatory role in the architecture. Experimental results demonstrate that our framework achieves a 7.3% average reduction in cost compared to traditional Q-learning, with fewer training steps required for convergence.
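The abstract names three concrete mechanisms: a two-phase schedule that hands exploration from the LLM to the Q-network, a three-tier self-correction filter on LLM proposals, and prioritized replay that up-weights LLM-generated experience. The sketch below is a minimal, self-contained Python illustration of those ideas, not the authors' implementation: the state layout, key names, boost factor, and phase threshold are all assumptions made for this example.

```python
# Illustrative sketch, not the authors' code. State layout, key names,
# the boost factor, and the phase threshold are assumptions.
import random
from collections import deque

def validate_llm_action(raw_reply, state):
    """Three-tier self-correction: returns a feasible customer id or None."""
    # Tier 1: syntactic validation -- the LLM reply must parse as an integer id.
    try:
        node = int(raw_reply)
    except (TypeError, ValueError):
        return None
    # Tier 2: semantic verification -- the id must name an unvisited customer.
    if node not in state["unvisited"]:
        return None
    # Tier 3: physical constraints -- capacity and time-window feasibility.
    if state["demand"][node] > state["capacity_left"]:
        return None
    arrival = state["now"] + state["travel"][(state["at"], node)]
    if arrival > state["window"][node][1]:  # later than the latest service time
        return None
    return node

class PrioritizedReplay:
    """Proportional replay buffer; transitions tagged as LLM-generated are
    sampled with a higher weight (the boost factor here is an assumption)."""
    def __init__(self, capacity=10_000, llm_boost=2.0):
        self.items = deque(maxlen=capacity)
        self.prios = deque(maxlen=capacity)
        self.llm_boost = llm_boost

    def push(self, transition, from_llm=False):
        self.items.append(transition)
        self.prios.append(self.llm_boost if from_llm else 1.0)

    def sample(self, k):
        return random.choices(list(self.items), weights=list(self.prios), k=k)

def choose_action(state, episode, q_table, llm_reply=None,
                  llm_phase=200, eps=0.1):
    """Two-phase action selection: early episodes defer to a validated LLM
    suggestion, later episodes run plain epsilon-greedy on the Q-table."""
    feasible = [n for n in state["unvisited"]
                if validate_llm_action(str(n), state) is not None]
    if not feasible:
        return None  # no feasible customer: return to the depot
    if episode < llm_phase and llm_reply is not None:
        node = validate_llm_action(llm_reply, state)
        if node is not None:  # proposal cleared all three tiers
            return node       # otherwise fall through to the Q-values
    if random.random() < eps:
        return random.choice(feasible)
    return max(feasible, key=lambda n: q_table.get((state["at"], n), 0.0))

# Toy usage: one vehicle at the depot (node 0), two candidate customers.
state = {
    "unvisited": {1, 2},
    "demand": {1: 4, 2: 9},
    "capacity_left": 5,
    "now": 0.0,
    "at": 0,
    "travel": {(0, 1): 3.0, (0, 2): 8.0},
    "window": {1: (0.0, 10.0), 2: (0.0, 5.0)},
}
print(validate_llm_action("1", state))       # 1 (feasible)
print(validate_llm_action("2", state))       # None (violates capacity)
print(validate_llm_action("depot", state))   # None (fails syntactic tier)
print(choose_action(state, episode=0, q_table={}, llm_reply="1"))  # 1
```

The point the sketch tries to capture is that the LLM only ever proposes: a proposal acts on the environment only after clearing all three tiers, and the transitions it produces are then up-weighted at replay time, matching the amplification role the abstract describes.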
Related papers
- Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey [69.45421620616486]
This work presents the first structured taxonomy and analysis of discrete tokenization methods designed for large language models (LLMs). We categorize 8 representative VQ variants that span classical and modern paradigms and analyze their algorithmic principles, training dynamics, and integration challenges with LLM pipelines. We identify key challenges including codebook collapse, unstable gradient estimation, and modality-specific encoding constraints.
arXiv Detail & Related papers (2025-07-21T10:52:14Z)
- SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving [51.47621083057114]
SOLVE is an innovative framework that synergizes Vision-Language Models (VLMs) with end-to-end (E2E) models to enhance autonomous vehicle planning. Our approach emphasizes knowledge sharing at the feature level through a shared visual encoder, enabling comprehensive interaction between VLM and E2E components.
arXiv Detail & Related papers (2025-05-22T15:44:30Z)
- CoLLMLight: Cooperative Large Language Model Agents for Network-Wide Traffic Signal Control [7.0964925117958515]
Traffic Signal Control (TSC) plays a critical role in urban traffic management by optimizing traffic flow and mitigating congestion. Existing approaches fail to address the essential need for inter-agent coordination. We propose CoLLMLight, a cooperative LLM agent framework for TSC.
arXiv Detail & Related papers (2025-03-14T15:40:39Z)
- TeLL-Drive: Enhancing Autonomous Driving with Teacher LLM-Guided Deep Reinforcement Learning [61.33599727106222]
TeLL-Drive is a hybrid framework that integrates a Teacher LLM to guide an attention-based Student DRL policy. A self-attention mechanism then fuses the teacher's strategies with the DRL agent's exploration, accelerating policy convergence and boosting robustness.
arXiv Detail & Related papers (2025-02-03T14:22:03Z)
- Can Large Language Models Be Trusted as Evolutionary Optimizers for Network-Structured Combinatorial Problems? [8.082897040940447]
Large Language Models (LLMs) have shown strong capabilities in language understanding and reasoning across diverse domains. In this work, we propose a systematic framework to evaluate the capability of LLMs to engage with problem structures. We adopt the commonly used evolutionary (EVO) approach and propose a comprehensive evaluation framework that rigorously assesses the output fidelity of LLM-based operators.
arXiv Detail & Related papers (2025-01-25T05:19:19Z)
- HEART: Achieving Timely Multi-Model Training for Vehicle-Edge-Cloud-Integrated Hierarchical Federated Learning [30.75025062952915]
The rapid growth of AI-enabled Internet of Vehicles (IoV) calls for efficient machine learning (ML) solutions. Vehicles often need to execute multiple ML tasks simultaneously, and this multi-model training environment introduces crucial challenges. We propose a framework for multi-model training in dynamic vehicle-edge-cloud-integrated hierarchical federated learning (VEC-HFL) with the goal of minimizing global training latency.
arXiv Detail & Related papers (2025-01-17T03:15:03Z)
- Learning for Cross-Layer Resource Allocation in MEC-Aided Cell-Free Networks [71.30914500714262]
Cross-layer resource allocation over mobile edge computing (MEC)-aided cell-free networks can fully exploit the transmission and computing resources to improve the data rate. Joint subcarrier allocation and beamforming optimization are investigated for the MEC-aided cell-free network from the perspective of deep learning.
arXiv Detail & Related papers (2024-12-21T10:18:55Z)
- A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs [74.35290684163718]
A primary challenge in large language model (LLM) development is their onerous pre-training cost.
This paper explores a promising paradigm to improve LLM pre-training efficiency and quality by leveraging a small language model (SLM).
arXiv Detail & Related papers (2024-10-24T14:31:52Z)
- Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification [76.14641982122696]
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control.
We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
arXiv Detail & Related papers (2024-10-07T23:38:58Z)
- New Solutions on LLM Acceleration, Optimization, and Application [14.995654657013741]
Large Language Models (LLMs) have become extremely potent instruments with exceptional capacities for comprehending and producing human-like text in a range of applications.
However, the increasing size and complexity of LLMs present significant challenges in both training and deployment.
We provide a review of recent advancements and research directions aimed at addressing these challenges.
arXiv Detail & Related papers (2024-06-16T11:56:50Z)
- LLM-Assisted Light: Leveraging Large Language Model Capabilities for Human-Mimetic Traffic Signal Control in Complex Urban Environments [3.7788636451616697]
This work introduces an innovative approach that integrates Large Language Models into traffic signal control systems.
A hybrid framework that augments LLMs with a suite of perception and decision-making tools is proposed.
The findings from our simulations attest to the system's adeptness in adjusting to a multiplicity of traffic environments.
arXiv Detail & Related papers (2024-03-13T08:41:55Z)
- Training Neural Networks from Scratch with Parallel Low-Rank Adapters [46.764982726136054]
We introduce LoRA-the-Explorer (LTE), a novel bi-level optimization algorithm designed to enable parallel training of multiple low-rank heads across computing nodes.
Our approach includes extensive experimentation on vision transformers using various vision datasets, demonstrating that LTE is competitive with standard pre-training.
arXiv Detail & Related papers (2024-02-26T18:55:13Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)