Reversible Action Design for Combinatorial Optimization with
Reinforcement Learning
- URL: http://arxiv.org/abs/2102.07210v1
- Date: Sun, 14 Feb 2021 18:05:42 GMT
- Title: Reversible Action Design for Combinatorial Optimization with
Reinforcement Learning
- Authors: Fan Yao, Renqin Cai, Hongning Wang
- Abstract summary: Reinforcement learning (RL) has recently emerged as a new framework to tackle these problems.
We propose a general RL framework that not only exhibits state-of-the-art empirical performance but also generalizes to a variety class of COPs.
- Score: 35.50454156611722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Combinatorial optimization problem (COP) over graphs is a fundamental
challenge in optimization. Reinforcement learning (RL) has recently emerged as
a new framework to tackle these problems and has demonstrated promising
results. However, most RL solutions employ a greedy manner to construct the
solution incrementally, thus inevitably pose unnecessary dependency on action
sequences and need a lot of problem-specific designs. We propose a general RL
framework that not only exhibits state-of-the-art empirical performance but
also generalizes to a variety class of COPs. Specifically, we define state as a
solution to a problem instance and action as a perturbation to this solution.
We utilize graph neural networks (GNN) to extract latent representations for
given problem instances for state-action encoding, and then apply deep
Q-learning to obtain a policy that gradually refines the solution by flipping
or swapping vertex labels. Experiments are conducted on Maximum $k$-Cut and
Traveling Salesman Problem and performance improvement is achieved against a
set of learning-based and heuristic baselines.
Related papers
- Offline reinforcement learning for job-shop scheduling problems [1.3927943269211593]
This paper introduces a novel offline RL method designed for optimization problems with complex constraints.
Our approach encodes actions in edge attributes and balances expected rewards with the imitation of expert solutions.
We demonstrate the effectiveness of this method on job-shop scheduling and flexible job-shop scheduling benchmarks.
arXiv Detail & Related papers (2024-10-21T07:33:42Z) - Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization [15.842155380912002]
This work proposes a novel Instance-Conditioned Adaptation Model (ICAM) for better large-scale generalization of neural optimization.
In particular, we design a powerful yet lightweight instance-conditioned Routing adaptation module for the NCO model.
We develop an efficient three-stage reinforcement learning-based training scheme that enables the model to learn cross-scale features without any labeled optimal solution.
arXiv Detail & Related papers (2024-05-03T08:00:19Z) - Analyzing and Enhancing the Backward-Pass Convergence of Unrolled
Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
arXiv Detail & Related papers (2023-12-28T23:15:18Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max-and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - Deep Reinforcement Learning Guided Improvement Heuristic for Job Shop
Scheduling [30.45126420996238]
This paper proposes a novel DRL-guided improvement for solving JSSP, where graph representation is employed to encode complete solutions.
We design a Graph Neural-Network-based representation scheme, consisting of two modules to effectively capture the information of dynamic topology and different types of nodes in graphs encountered during the improvement process.
We prove that our method scales linearly with problem size. Experiments on classic benchmarks show that the improvement policy learned by our method outperforms state-of-the-art DRL-based methods by a large margin.
arXiv Detail & Related papers (2022-11-20T10:20:13Z) - Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across non-family problems.
We show that for weakly-ized objectives and under mild conditions, the method converges globally.
arXiv Detail & Related papers (2022-01-28T05:53:28Z) - On the Difficulty of Generalizing Reinforcement Learning Framework for
Combinatorial Optimization [6.935838847004389]
Combinatorial optimization problems (COPs) on the graph with real-life applications are canonical challenges in Computer Science.
The underlying principle of this approach is to deploy a graph neural network (GNN) for encoding both the local information of the nodes and the graph-structured data.
We use the security-aware phone clone allocation in the cloud as a classical quadratic assignment problem (QAP) to investigate whether or not deep RL-based model is generally applicable to solve other classes of such hard problems.
arXiv Detail & Related papers (2021-08-08T19:12:04Z) - A Bi-Level Framework for Learning to Solve Combinatorial Optimization on
Graphs [91.07247251502564]
We propose a hybrid approach to combine the best of the two worlds, in which a bi-level framework is developed with an upper-level learning method to optimize the graph.
Such a bi-level approach simplifies the learning on the original hard CO and can effectively mitigate the demand for model capacity.
arXiv Detail & Related papers (2021-06-09T09:18:18Z) - Constrained Combinatorial Optimization with Reinforcement Learning [0.30938904602244344]
This paper presents a framework to tackle constrained optimization problems using deep Reinforcement Learning (RL)
We extend the Neural Combinatorial Optimization (NCO) theory in order to deal with constraints in its formulation.
In that context, the solution is iteratively constructed based on interactions with the environment.
arXiv Detail & Related papers (2020-06-22T03:13:07Z) - Learning What to Defer for Maximum Independent Sets [84.00112106334655]
We propose a novel DRL scheme, coined learning what to defer (LwD), where the agent adaptively shrinks or stretch the number of stages by learning to distribute the element-wise decisions of the solution at each stage.
We apply the proposed framework to the maximum independent set (MIS) problem, and demonstrate its significant improvement over the current state-of-the-art DRL scheme.
arXiv Detail & Related papers (2020-06-17T02:19:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.