Graph Attention-based Deep Reinforcement Learning for solving the
Chinese Postman Problem with Load-dependent costs
- URL: http://arxiv.org/abs/2310.15516v4
- Date: Sun, 3 Mar 2024 03:39:21 GMT
- Title: Graph Attention-based Deep Reinforcement Learning for solving the
Chinese Postman Problem with Load-dependent costs
- Authors: Truong Son Hy, Cong Dao Tran
- Abstract summary: This paper proposes a novel DRL framework to address the Chinese Postman Problem with load-dependent costs (CPP-LC).
We introduce an autoregressive model based on DRL, namely Arc-DRL, consisting of an encoder and a decoder to address the CPP-LC challenge effectively.
We also propose a new bio-inspired meta-heuristic solution based on an Evolutionary Algorithm (EA) for CPP-LC.
- Score: 2.1212179660694104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, deep reinforcement learning (DRL) models have shown promising
results in solving routing problems. However, most DRL solvers are commonly
proposed to solve node routing problems, such as the Traveling Salesman Problem
(TSP). Meanwhile, there has been limited research on applying neural methods to
arc routing problems, such as the Chinese Postman Problem (CPP), since they
often feature irregular and complex solution spaces compared to TSP. To fill
these gaps, this paper proposes a novel DRL framework to address the CPP with
load-dependent costs (CPP-LC) (Corberan et al., 2018), which is a complex arc
routing problem with load constraints. The novelty of our method is two-fold.
First, we formulate the CPP-LC as a Markov Decision Process (MDP) sequential
model. Subsequently, we introduce an autoregressive model based on DRL, namely
Arc-DRL, consisting of an encoder and decoder to address the CPP-LC challenge
effectively. Such a framework allows the DRL model to work efficiently and
scalably to arc routing problems. Furthermore, we propose a new bio-inspired
meta-heuristic solution based on Evolutionary Algorithm (EA) for CPP-LC.
Extensive experiments show that Arc-DRL outperforms existing meta-heuristic
methods such as Iterative Local Search (ILS) and Variable Neighborhood Search
(VNS) proposed by Corberan et al. (2018) on large CPP-LC benchmark datasets in
both solution quality and running time, while the EA achieves the best solution
quality at the cost of much longer running time. We release our C++
implementations for metaheuristics such as EA, ILS and VNS along with the code
for data generation and our generated data at
https://github.com/HySonLab/Chinese_Postman_Problem
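To make the MDP formulation concrete, the following is an illustrative sketch (not the authors' implementation) of casting a load-dependent arc routing problem as a sequential decision process: the state is (current node, accumulated load, set of unserved edges), an action picks the next edge to service, and the cost of each traversal depends on the load carried at that moment. The linear cost model `length * (1 + load)`, the tiny three-edge instance, and the greedy decoding rule standing in for a learned autoregressive policy are all assumptions for demonstration only.

```python
from itertools import permutations

# Edges as (u, v, length, demand); DIST holds precomputed deadheading
# distances between vertices (assumed given for this toy instance).
EDGES = [(0, 1, 2.0, 1.0), (1, 2, 3.0, 2.0), (2, 0, 1.0, 1.0)]
DIST = {(0, 1): 2.0, (1, 0): 2.0, (1, 2): 3.0, (2, 1): 3.0,
        (0, 2): 1.0, (2, 0): 1.0, (0, 0): 0.0, (1, 1): 0.0, (2, 2): 0.0}

def traversal_cost(length, load):
    # Load-dependent cost: traversing a segment gets more expensive
    # the more load the vehicle already carries.
    return length * (1.0 + load)

def tour_cost(order, start=0):
    """Cost of servicing edges in `order`: deadhead to the nearer
    endpoint, then traverse the edge while collecting its demand."""
    node, load, cost = start, 0.0, 0.0
    for (u, v, length, demand) in order:
        # Enter the edge from whichever endpoint is cheaper to reach.
        if DIST[(node, u)] <= DIST[(node, v)]:
            enter, leave = u, v
        else:
            enter, leave = v, u
        cost += traversal_cost(DIST[(node, enter)], load)  # deadheading
        load += demand                                     # collect demand
        cost += traversal_cost(length, load)               # service the edge
        node = leave
    return cost

def greedy_rollout(start=0):
    """Autoregressive decoding loop: at each step append the edge that
    minimizes the partial tour cost (a stand-in for a learned policy)."""
    remaining, order = list(EDGES), []
    while remaining:
        best_edge = min(remaining, key=lambda e: tour_cost(order + [e], start))
        order.append(best_edge)
        remaining.remove(best_edge)
    return order, tour_cost(order, start)

order, cost = greedy_rollout()
# Brute-force optimum over all service orders, for comparison.
best = min(tour_cost(list(p)) for p in permutations(EDGES))
print(cost, best)
```

The key property the sketch illustrates is why CPP-LC is harder than TSP: the contribution of an edge to the objective is not fixed but depends on how much demand has been collected before it, so the decoding order changes every subsequent step's cost.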
Related papers
- Offline Reinforcement Learning for Learning to Dispatch for Job Shop Scheduling [0.9831489366502301]
We introduce Offline Reinforcement Learning for Learning to Dispatch (Offline-LD), a novel approach for Job Shop Scheduling Problem (JSSP)
Offline-LD adapts two CQL-based Q-learning methods for maskable action spaces, introduces a new entropy bonus modification for discrete SAC, and exploits reward normalization through preprocessing.
Our experiments show that Offline-LD outperforms online RL on both generated and benchmark instances.
arXiv Detail & Related papers (2024-09-16T15:18:10Z) - Tractable Offline Learning of Regular Decision Processes [50.11277112628193]
This work studies offline Reinforcement Learning (RL) in a class of non-Markovian environments called Regular Decision Processes (RDPs).
In RDPs, the unknown dependency of future observations and rewards on past interactions can be captured by some hidden finite-state automaton.
Many algorithms first reconstruct this unknown dependency using automata learning techniques.
arXiv Detail & Related papers (2024-09-04T14:26:58Z) - Multiobjective Vehicle Routing Optimization with Time Windows: A Hybrid Approach Using Deep Reinforcement Learning and NSGA-II [52.083337333478674]
This paper proposes a weight-aware deep reinforcement learning (WADRL) approach designed to address the multiobjective vehicle routing problem with time windows (MOVRPTW)
The Non-dominated sorting genetic algorithm-II (NSGA-II) method is then employed to optimize the outcomes produced by the WADRL.
arXiv Detail & Related papers (2024-07-18T02:46:06Z) - Leveraging Constraint Programming in a Deep Learning Approach for Dynamically Solving the Flexible Job-Shop Scheduling Problem [1.3927943269211593]
This paper aims to integrate constraint programming (CP) within a deep learning (DL) based methodology, leveraging the benefits of both.
We introduce a method that involves training a DL model using optimal solutions generated by CP, ensuring the model learns from high-quality data.
Our hybrid approach has been extensively tested on three public FJSSP benchmarks, demonstrating superior performance over five state-of-the-art DRL approaches.
arXiv Detail & Related papers (2024-03-14T10:16:57Z) - Enhancing Column Generation by Reinforcement Learning-Based
Hyper-Heuristic for Vehicle Routing and Scheduling Problems [9.203492057735074]
Column generation (CG) is a vital method to solve large-scale problems by dynamically generating variables.
We propose a reinforcement learning-based hyper-heuristic framework, dubbed RLHH, to enhance the performance of CG.
arXiv Detail & Related papers (2023-10-15T00:05:50Z) - Learning RL-Policies for Joint Beamforming Without Exploration: A Batch
Constrained Off-Policy Approach [1.0080317855851213]
We consider the problem of network parameter optimization for wireless networks.
We show that a policy can be learned from previously collected data, without real-world exploration.
arXiv Detail & Related papers (2023-10-12T18:36:36Z) - Efficient Diffusion Policies for Offline Reinforcement Learning [85.73757789282212]
Diffusion-QL significantly boosts the performance of offline RL by representing a policy with a diffusion model.
We propose efficient diffusion policy (EDP) to overcome these two challenges.
EDP constructs actions from corrupted ones at training to avoid running the sampling chain.
arXiv Detail & Related papers (2023-05-31T17:55:21Z) - Pointerformer: Deep Reinforced Multi-Pointer Transformer for the
Traveling Salesman Problem [67.32731657297377]
Traveling Salesman Problem (TSP) is a classic routing optimization problem originally arising in the domain of transportation and logistics.
Recently, Deep Reinforcement Learning has been increasingly employed to solve TSP due to its high inference efficiency.
We propose a novel end-to-end DRL approach, referred to as Pointerformer, based on multi-pointer Transformer.
arXiv Detail & Related papers (2023-04-19T03:48:32Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov
Decision Processes [61.11090361892306]
Reward-free reinforcement learning (RL) considers the setting where the agent does not have access to a reward function during exploration.
We show that this separation does not exist in the setting of linear MDPs.
We develop a computationally efficient algorithm for reward-free RL in a $d$-dimensional linear MDP.
arXiv Detail & Related papers (2022-01-26T22:09:59Z) - Learning Collaborative Policies to Solve NP-hard Routing Problems [13.13675711285772]
This paper proposes a novel hierarchical problem-solving strategy called learning collaborative policies (LCP)
It can effectively find the near-optimum solution using two iterative DRL policies: the seeder and reviser.
Extensive experiments demonstrate that the proposed two-policies collaboration scheme improves over single-policy DRL framework on various NP-hard routing problems.
arXiv Detail & Related papers (2021-10-26T19:46:21Z) - Track-Assignment Detailed Routing Using Attention-based Policy Model
With Supervision [0.27998963147546135]
We propose a machine learning driven method for solving the track-assignment detailed routing problem.
Our approach adopts an attention-based reinforcement learning (RL) policy model.
We show that, especially for complex problems, our supervised RL method provides good-quality solutions.
arXiv Detail & Related papers (2020-10-26T16:40:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.