Enhancing Column Generation by Reinforcement Learning-Based
Hyper-Heuristic for Vehicle Routing and Scheduling Problems
- URL: http://arxiv.org/abs/2310.09686v1
- Date: Sun, 15 Oct 2023 00:05:50 GMT
- Title: Enhancing Column Generation by Reinforcement Learning-Based
Hyper-Heuristic for Vehicle Routing and Scheduling Problems
- Authors: Kuan Xu and Li Shen and Lindong Liu
- Abstract summary: Column generation (CG) is a vital method to solve large-scale problems by dynamically generating variables.
We propose a reinforcement learning-based hyper-heuristic framework, dubbed RLHH, to enhance the performance of CG.
- Score: 9.203492057735074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Column generation (CG) is a vital method to solve large-scale problems by
dynamically generating variables. It has extensive applications in common
combinatorial optimization, such as vehicle routing and scheduling problems,
where each iteration step requires solving an NP-hard constrained shortest path
problem. Although some heuristic methods for acceleration already exist, they
are not versatile enough to solve different problems. In this work, we propose
a reinforcement learning-based hyper-heuristic framework, dubbed RLHH, to
enhance the performance of CG. RLHH is a selection module embedded in CG to
accelerate convergence and get better integer solutions. In each CG iteration,
the RL agent selects a low-level heuristic to construct a reduced network only
containing the edges with a greater chance of being part of the optimal
solution. In addition, we specify RLHH to solve two typical combinatorial
optimization problems: Vehicle Routing Problem with Time Windows (VRPTW) and
Bus Driver Scheduling Problem (BDSP). The total cost can be reduced by up to
27.9\% in VRPTW and 15.4\% in BDSP compared to the best low-level heuristic
in our tested scenarios, within equivalent or even less computational time. The
proposed RLHH is the first RL-based CG method that outperforms traditional
approaches in terms of solution quality, which can promote the application of
CG in combinatorial optimization.
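For intuition, here is a minimal sketch, assuming a tabular epsilon-greedy agent, of how a hyper-heuristic selection module can be embedded in a CG loop: the agent picks one low-level edge-pruning heuristic per iteration, the pricing problem is solved on the resulting reduced network, and the decrease of the restricted master objective is fed back as the reward. This is not the authors' implementation; the heuristic names, the single-state agent, the reward definition, and the caller-supplied `solve_master` / `solve_pricing` functions are illustrative assumptions.
```python
import random
from typing import Callable, Dict, List, Tuple

Edge = Tuple[int, int]

def prune_by_reduced_cost(edges: Dict[Edge, float], duals: Dict[int, float], keep: float = 0.5) -> List[Edge]:
    # Keep the fraction `keep` of arcs with the smallest approximate reduced cost c_ij - dual_j (assumed heuristic).
    scored = sorted(edges, key=lambda e: edges[e] - duals.get(e[1], 0.0))
    return scored[: max(1, int(keep * len(scored)))]

def prune_by_cost(edges: Dict[Edge, float], duals: Dict[int, float], keep: float = 0.5) -> List[Edge]:
    # Keep the fraction `keep` of cheapest arcs, ignoring the duals (assumed heuristic).
    scored = sorted(edges, key=lambda e: edges[e])
    return scored[: max(1, int(keep * len(scored)))]

HEURISTICS: List[Callable[..., List[Edge]]] = [prune_by_reduced_cost, prune_by_cost]

def rlhh_column_generation(edges, solve_master, solve_pricing,
                           max_iters=50, eps=0.2, alpha=0.1, gamma=0.9):
    # Tabular epsilon-greedy agent with a single abstract state (the paper uses richer state features).
    q = [0.0] * len(HEURISTICS)
    columns: List[list] = []
    prev_obj, prev_action = None, None
    for _ in range(max_iters):
        obj, duals = solve_master(columns)            # restricted master LP: objective value and duals
        if prev_action is not None:                   # credit the last heuristic with the observed improvement
            reward = prev_obj - obj                   # decrease of the master objective (assumed reward)
            q[prev_action] += alpha * (reward + gamma * max(q) - q[prev_action])
        if random.random() < eps:
            a = random.randrange(len(HEURISTICS))     # explore
        else:
            a = max(range(len(HEURISTICS)), key=q.__getitem__)  # exploit
        reduced_edges = HEURISTICS[a](edges, duals)   # build the reduced pricing network
        new_cols, best_rc = solve_pricing(reduced_edges, duals)  # constrained shortest paths on the reduced network
        if not new_cols or best_rc >= -1e-6:          # no column with negative reduced cost: stop
            break
        columns.extend(new_cols)
        prev_obj, prev_action = obj, a
    return columns
```
In a full implementation, the agent state would typically encode iteration statistics (for example, recent objective improvements and dual values), and `solve_pricing` would be a labeling algorithm for the resource-constrained shortest path problem on the reduced network.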
Related papers
- Multiobjective Vehicle Routing Optimization with Time Windows: A Hybrid Approach Using Deep Reinforcement Learning and NSGA-II [52.083337333478674]
This paper proposes a weight-aware deep reinforcement learning (WADRL) approach designed to address the multiobjective vehicle routing problem with time windows (MOVRPTW).
The Non-dominated sorting genetic algorithm-II (NSGA-II) method is then employed to optimize the outcomes produced by the WADRL.
arXiv Detail & Related papers (2024-07-18T02:46:06Z)
- Transform then Explore: a Simple and Effective Technique for Exploratory Combinatorial Optimization with Reinforcement Learning [11.531786269804707]
We propose a gauge transformation (GT) technique to solve combinatorial optimization problems (COPs) over graphs.
GT is very simple: it can be implemented in fewer than 10 lines of Python code and can be applied to the vast majority of reinforcement learning models.
We show that traditional RL models with the GT technique produce state-of-the-art performance on the MaxCut problem.
arXiv Detail & Related papers (2024-04-06T15:31:17Z)
- Spatial-temporal-demand clustering for solving large-scale vehicle routing problems with time windows [0.0]
We propose a decompose-route-improve (DRI) framework that groups customers using clustering.
Its similarity metric incorporates customers' spatial, temporal, and demand data.
We apply pruned local search (LS) between solved subproblems to improve the overall solution.
arXiv Detail & Related papers (2024-01-20T06:06:01Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
- Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Compared with the state-of-the-art baseline, our model's network parameters are reduced to only 37%, and the solution gap towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z)
- Multi-Agent Deep Reinforcement Learning in Vehicular OCC [14.685237010856953]
We introduce a spectral efficiency optimization approach in vehicular OCC.
We model the optimization problem as a Markov decision process (MDP) to enable the use of solutions that can be applied online.
We verify the performance of our proposed scheme through extensive simulations and compare it with various variants of our approach and a random method.
arXiv Detail & Related papers (2022-05-05T14:25:54Z)
- Distributed Multi-agent Meta Learning for Trajectory Design in Wireless Drone Networks [151.27147513363502]
This paper studies the problem of trajectory design for a group of energy-constrained drones operating in dynamic wireless network environments.
A value-decomposition-based reinforcement learning (VDRL) solution and a meta-training mechanism are proposed.
arXiv Detail & Related papers (2020-12-06T01:30:12Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
- Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization [5.669790037378094]
In combinatorial optimization, the goal is to find an optimal solution among a finite set of possibilities.
Deep reinforcement learning (DRL) has shown its promise for solving NP-hard optimization problems.
Constraint programming (CP) is a generic tool for solving optimization problems.
In this work, we propose a general and hybrid approach, based on DRL and CP, for solving optimization problems.
arXiv Detail & Related papers (2020-06-02T13:54:27Z)
- Reinforcement Learning Based Vehicle-cell Association Algorithm for Highly Mobile Millimeter Wave Communication [53.47785498477648]
This paper investigates the problem of vehicle-cell association in millimeter wave (mmWave) communication networks.
We first formulate the vehicle user (VU) association problem as a discrete optimization problem.
The proposed solution achieves up to 15% gains in sum rate and a 20% reduction in VUE outages compared to several baseline designs.
arXiv Detail & Related papers (2020-01-22T08:51:05Z)