Sym-NCO: Leveraging Symmetricity for Neural Combinatorial Optimization
- URL: http://arxiv.org/abs/2205.13209v1
- Date: Thu, 26 May 2022 07:55:43 GMT
- Title: Sym-NCO: Leveraging Symmetricity for Neural Combinatorial Optimization
- Authors: Minsu Kim, Junyoung Park, Jinkyoo Park
- Abstract summary: Deep reinforcement learning (DRL)-based combinatorial optimization (CO) methods have shown significant merit over conventional CO solvers.
This paper presents a novel training scheme, Sym-NCO, that achieves significant performance improvements over existing DRL-NCO methods.
- Score: 16.127824824652077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (DRL)-based combinatorial
optimization (CO) methods (i.e., DRL-NCO) have shown significant merit over
conventional CO solvers, as DRL-NCO can learn CO solvers without supervised
labels obtained from a verified solver. This paper presents a novel training
scheme, Sym-NCO, that achieves significant performance improvements over
existing DRL-NCO methods. Sym-NCO is a regularizer-based training scheme
that leverages universal symmetricities found in various CO problems and
solutions. Imposing symmetricities such as rotational and reflectional
invariance can greatly improve the generalization capability of DRL-NCO,
as these symmetricities are invariant features shared across CO tasks. Our
experimental results verify that Sym-NCO greatly improves the performance
of DRL-NCO methods on four CO tasks: the traveling salesman problem (TSP),
the capacitated vehicle routing problem (CVRP), the prize collecting TSP
(PCTSP), and the orienteering problem (OP), without employing
problem-specific techniques. Remarkably, Sym-NCO outperformed not only
existing DRL-NCO methods but also a competitive conventional solver,
iterated local search (ILS), on PCTSP while running 240 times faster.
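A minimal PyTorch sketch of the rotational-invariance idea follows. The names here (TinyPolicy, invariance_loss) and the KL consistency term are illustrative stand-ins, not the authors' implementation; the actual Sym-NCO regularizer also exploits solution symmetricities.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def random_rotation(coords: torch.Tensor) -> torch.Tensor:
    """Rotate 2D node coordinates (batch, n, 2) by one random angle."""
    theta = torch.rand(()) * 2 * torch.pi
    c, s = torch.cos(theta), torch.sin(theta)
    rot = torch.stack([torch.stack([c, -s]), torch.stack([s, c])])
    return coords @ rot.T

class TinyPolicy(nn.Module):
    """Toy encoder producing per-node scores for a routing instance."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, coords):            # (batch, n, 2) -> (batch, n)
        return self.net(coords).squeeze(-1)

def invariance_loss(policy, coords):
    """KL consistency between node log-probs on an instance and on a
    rotated copy: a rotation should not change the preferred solution."""
    logp = F.log_softmax(policy(coords), dim=-1)
    logp_rot = F.log_softmax(policy(random_rotation(coords)), dim=-1)
    return F.kl_div(logp_rot, logp, log_target=True, reduction="batchmean")

coords = torch.rand(8, 20, 2)             # 8 random 20-node instances
policy = TinyPolicy()
loss = invariance_loss(policy, coords)    # add to the usual RL loss
loss.backward()
```

In training, such a term would be added to the usual REINFORCE-style loss, pushing the policy toward the same solution on rotated copies of an instance.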
Related papers
- Enhancing Spectrum Efficiency in 6G Satellite Networks: A GAIL-Powered Policy Learning via Asynchronous Federated Inverse Reinforcement Learning [67.95280175998792]
A novel generative adversarial imitation learning (GAIL)-powered policy learning approach is proposed for optimizing beamforming, spectrum allocation, and remote user equipment (RUE) association in 6G satellite networks.
We employ inverse RL (IRL) to automatically learn reward functions without manual tuning.
We show that the proposed MA-AL method outperforms traditional RL approaches, achieving a 14.6% improvement in convergence and reward value.
arXiv Detail & Related papers (2024-09-27T13:05:02Z)
- Continuous Control with Coarse-to-fine Reinforcement Learning [15.585706638252441]
We present a framework that trains RL agents to zoom into a continuous action space in a coarse-to-fine manner.
Within this framework, we introduce a concrete value-based algorithm called Coarse-to-fine Q-Network (CQN).
CQN robustly learns to solve real-world manipulation tasks within a few minutes of online training.
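A toy sketch of the coarse-to-fine zoom is shown below; toy_q and all constants are hypothetical, and CQN itself learns the Q-function and handles multi-dimensional actions.

```python
import numpy as np

def toy_q(state: np.ndarray, action: float) -> float:
    """Stand-in Q-function; peaks where action equals state.mean()."""
    return -(action - state.mean()) ** 2

def coarse_to_fine_action(state, q_fn, low=-1.0, high=1.0,
                          bins=5, levels=3):
    """Zoom into [low, high]: at each level, score the bin centers with
    the Q-function, then recurse into the interval around the best bin."""
    for _ in range(levels):
        centers = np.linspace(low, high, bins)
        best = max(centers, key=lambda a: q_fn(state, a))
        width = (high - low) / bins
        low, high = best - width / 2, best + width / 2
    return float(best)

state = np.array([0.3, 0.1])
print(coarse_to_fine_action(state, toy_q))  # ~= state.mean() = 0.2
```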
arXiv Detail & Related papers (2024-07-10T16:04:08Z)
- Neural Combinatorial Optimization Algorithms for Solving Vehicle Routing Problems: A Comprehensive Survey with Perspectives [14.47130974868925]
This survey aims to provide a comprehensive taxonomy of NCO solvers for VRPs.
We divide all NCO solvers into four distinct categories, namely Learning to Construct, Learning to Improve, Learning to Predict-Once, and Learning to Predict-Multiplicity solvers.
We discuss the inadequacies of state-of-the-art (SOTA) solvers, including poor generalization, inability to solve large-scale VRPs, and the difficulty of comparing NCO solvers with conventional Operations Research algorithms.
arXiv Detail & Related papers (2024-06-01T12:18:39Z)
- Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization [15.842155380912002]
This work proposes a novel Instance-Conditioned Adaptation Model (ICAM) for better large-scale generalization of neural combinatorial optimization.
In particular, we design a powerful yet lightweight instance-conditioned routing adaptation module for the NCO model.
We develop an efficient three-stage reinforcement learning-based training scheme that enables the model to learn cross-scale features without any labeled optimal solution.
arXiv Detail & Related papers (2024-05-03T08:00:19Z)
- RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark [69.19502244910632]
Deep reinforcement learning (RL) has shown significant benefits in solving combinatorial optimization (CO) problems.
We introduce RL4CO, a unified benchmark with in-depth library coverage of 23 state-of-the-art methods and more than 20 CO problems.
Built on efficient software libraries and best practices in implementation, RL4CO features modularized implementation and flexible configuration of diverse RL algorithms, neural network architectures, inference techniques, and environments.
arXiv Detail & Related papers (2023-06-29T16:57:22Z)
- Learning to Sail Dynamic Networks: The MARLIN Reinforcement Learning Framework for Congestion Control in Tactical Environments [53.08686495706487]
This paper proposes an RL framework that leverages an accurate and parallelizable emulation environment to reenact the conditions of a tactical network.
We evaluate our RL framework by training a MARLIN agent in conditions replicating a bottleneck link transition between a Satellite Communication (SATCOM) link and a UHF Wide Band (UHF) radio link.
arXiv Detail & Related papers (2023-06-27T16:15:15Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
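As a reference point, here is a minimal sketch of the entropy-regularized (soft) return that SAC maximizes; this is the generic SAC objective with illustrative numbers, not MARLIN's code.

```python
def soft_return(rewards, log_probs, alpha=0.2, gamma=0.99):
    """Discounted return plus the SAC entropy bonus -alpha * log pi(a|s)."""
    g = 0.0
    for r, lp in zip(reversed(rewards), reversed(log_probs)):
        g = (r - alpha * lp) + gamma * g
    return g

# Toy 3-step trajectory: rewards and action log-probabilities.
print(soft_return([1.0, 0.5, 0.2], [-0.7, -1.1, -0.3]))
```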
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
- How Good Is Neural Combinatorial Optimization? A Systematic Evaluation on the Traveling Salesman Problem [31.451338706630583]
This work presents a comprehensive comparative study of neural combinatorial optimization (NCO) solvers and alternative solvers.
Our results show that the solvers learned by NCO approaches, in general, still fall short of traditional solvers in nearly all evaluated aspects.
arXiv Detail & Related papers (2022-09-22T10:50:36Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
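A numpy sketch of these two ingredients follows; the weighting uses the sigmoid-of-negative-std form described above, but the ensemble values, temperature, and lambda below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Q-ensemble: 5 Q-networks' value estimates for 4 candidate actions.
q_ensemble = rng.normal(loc=1.0, scale=0.3, size=(5, 4))
target_mean = q_ensemble.mean(axis=0)
target_std = q_ensemble.std(axis=0)  # ensemble disagreement = uncertainty

# (a) Weighted Bellman backup: sigma(-std * T) + 0.5 down-weights
# high-uncertainty targets; the weight scales each sample's Bellman error.
temperature = 10.0
weights = 1.0 / (1.0 + np.exp(temperature * target_std)) + 0.5

# (b) UCB exploration: act greedily w.r.t. mean + lambda * std.
ucb_lambda = 1.0
action = int(np.argmax(target_mean + ucb_lambda * target_std))
print(weights.round(3), action)
```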
arXiv Detail & Related papers (2020-07-09T17:08:44Z)