Accelerate Presolve in Large-Scale Linear Programming via Reinforcement Learning
- URL: http://arxiv.org/abs/2310.11845v1
- Date: Wed, 18 Oct 2023 09:51:59 GMT
- Title: Accelerate Presolve in Large-Scale Linear Programming via Reinforcement Learning
- Authors: Yufei Kuang, Xijun Li, Jie Wang, Fangzhou Zhu, Meng Lu, Zhihai Wang,
Jia Zeng, Houqiang Li, Yongdong Zhang, Feng Wu
- Abstract summary: We propose a simple and efficient reinforcement learning framework -- namely, reinforcement learning for presolve (RL4Presolve) -- to tackle (P1)-(P3) simultaneously.
Experiments on two solvers and eight benchmarks (real-world and synthetic) demonstrate that RL4Presolve significantly and consistently improves the efficiency of solving large-scale LPs.
- Score: 92.31528918811007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale LP problems from industry usually contain much redundancy that
severely hurts the efficiency and reliability of solving LPs, making presolve
(i.e., the problem simplification module) one of the most critical components
in modern LP solvers. However, how to design high-quality presolve routines --
that is, the program determining (P1) which presolvers to select, (P2) in what
order to execute, and (P3) when to stop -- remains a highly challenging task
due to the extensive requirements on expert knowledge and the large search
space. Due to the sequential decision property of the task and the lack of
expert demonstrations, we propose a simple and efficient reinforcement learning
(RL) framework -- namely, reinforcement learning for presolve (RL4Presolve) --
to tackle (P1)-(P3) simultaneously. Specifically, we formulate the routine
design task as a Markov decision process and propose an RL framework with
adaptive action sequences to generate high-quality presolve routines
efficiently. Note that adaptive action sequences help learn complex behaviors
efficiently and adapt to various benchmarks. Experiments on two solvers
(open-source and commercial) and eight benchmarks (real-world and synthetic)
demonstrate that RL4Presolve significantly and consistently improves the
efficiency of solving large-scale LPs, especially on benchmarks from industry.
Furthermore, we optimize the hard-coded presolve routines in LP solvers by
extracting rules from learned policies for simple and efficient deployment to
Huawei's supply chain. The results show encouraging economic and academic
potential for incorporating machine learning into modern solvers.
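To make the routine-design task concrete, here is a minimal sketch of the MDP formulation described above: the state is a set of cheap LP statistics, an action picks the next presolver (P1, P2) or stops (P3), and the reward penalizes elapsed time. The presolver names and the `lp`/`solver` interfaces are hypothetical placeholders, not the paper's implementation.

```python
import time

PRESOLVERS = ["remove_empty_rows", "remove_fixed_cols", "bound_tightening"]
STOP = len(PRESOLVERS)  # extra action index: terminate presolve (P3)

class PresolveEnv:
    """State: cheap statistics of the current LP. Action: which presolver
    to apply next (addresses P1 and P2) or STOP (P3). Reward: negative
    wall-clock time, so a learned routine minimizes presolve + solve time."""

    def __init__(self, lp, solver):
        self.lp, self.solver = lp, solver  # hypothetical LP and solver handles

    def state(self):
        return (self.lp.num_rows(), self.lp.num_cols(), self.lp.num_nonzeros())

    def step(self, action):
        start = time.perf_counter()
        if action == STOP:
            self.solver.solve(self.lp)  # hand the reduced LP to the solver
            return self.state(), -(time.perf_counter() - start), True
        self.lp = self.lp.apply(PRESOLVERS[action])  # run one presolver
        return self.state(), -(time.perf_counter() - start), False
```

Note that in RL4Presolve a single agent decision emits an adaptive *sequence* of presolvers rather than one presolver at a time, which is what lets the policy learn complex routines efficiently; the single-action step above is simplified for readability.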
Related papers
- Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [50.485788083202124]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks.
We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model.
Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z) - PEAR: Primitive enabled Adaptive Relabeling for boosting Hierarchical Reinforcement Learning [25.84621883831624]
- PEAR: Primitive enabled Adaptive Relabeling for boosting Hierarchical Reinforcement Learning [25.84621883831624]
Hierarchical reinforcement learning has the potential to solve complex long-horizon tasks using temporal abstraction and increased exploration.
We present primitive enabled adaptive relabeling (PEAR).
We first perform adaptive relabeling on a few expert demonstrations to generate efficient subgoal supervision.
We then jointly optimize HRL agents by employing reinforcement learning (RL) and imitation learning (IL).
arXiv Detail & Related papers (2023-06-10T09:41:30Z) - Flexible Job Shop Scheduling via Dual Attention Network Based
- Flexible Job Shop Scheduling via Dual Attention Network Based Reinforcement Learning [73.19312285906891]
In the flexible job shop scheduling problem (FJSP), operations can be processed on multiple machines, leading to intricate relationships between operations and machines.
Recent works have employed deep reinforcement learning (DRL) to learn priority dispatching rules (PDRs) for solving FJSP.
This paper presents a novel end-to-end learning framework that weds the merits of self-attention models for deep feature extraction and DRL for scalable decision-making.
arXiv Detail & Related papers (2023-05-09T01:35:48Z) - Deep reinforcement learning applied to an assembly sequence planning
problem with user preferences [1.0558951653323283]
We propose an approach for applying DRL methods to assembly sequence planning problems.
The proposed approach introduces parametric actions into the RL environment to improve training time and sample efficiency.
The results support the potential for the application of deep reinforcement learning in assembly sequence planning problems with human interaction.
arXiv Detail & Related papers (2023-04-13T14:25:15Z) - Learning to Optimize for Reinforcement Learning [58.01132862590378]
- Learning to Optimize for Reinforcement Learning [58.01132862590378]
Reinforcement learning (RL) is essentially different from supervised learning, and in practice, learned optimizers designed for supervised learning do not work well even in simple RL tasks.
The agent-gradient distribution is non-independent and non-identically distributed, leading to inefficient meta-training.
We show that, although only trained in toy tasks, our learned optimizer can generalize to unseen complex tasks in Brax.
arXiv Detail & Related papers (2023-02-03T00:11:02Z) - Learning to Optimize Permutation Flow Shop Scheduling via Graph-based
- Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Compared with the state-of-the-art model, our network's parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z) - Learning Algorithms for Intelligent Agents and Mechanisms [4.251500966181852]
- Learning Algorithms for Intelligent Agents and Mechanisms [4.251500966181852]
In this thesis, we research learning algorithms for optimal decision making in two different contexts: Reinforcement Learning in Part I and Auction Design in Part II.
In Chapter 2, inspired by statistical physics, we develop a novel approach to Reinforcement Learning (RL) that not only learns optimal policies with enhanced desirable properties but also sheds new light on maximum entropy RL.
In Chapter 3, we tackle the generalization problem in RL using a Bayesian perspective. We show that imperfect knowledge of the environment's dynamics effectively turns a fully-observed Markov Decision Process (MDP) into a Partially Observed MDP (POMDP).
arXiv Detail & Related papers (2022-10-06T03:12:43Z) - Math Programming based Reinforcement Learning for Multi-Echelon
Inventory Management [1.9161790404101895]
Reinforcement learning has led to considerable breakthroughs in diverse areas such as robotics, games, and many others.
But the application of RL to complex real-world decision-making problems remains limited.
These characteristics make the problem considerably harder to solve for existing RL methods that rely on enumeration techniques to solve per-step action problems.
We show that a properly selected discretization of the underlying uncertain distribution can yield near optimal actor policy even with very few samples from the underlying uncertainty.
We find that PARL outperforms commonly used base stock policies by 44.7% and the best-performing RL method by up to 12.1% on average.
arXiv Detail & Related papers (2021-12-04T01:40:34Z) - Safe-Critical Modular Deep Reinforcement Learning with Temporal Logic
- Safe-Critical Modular Deep Reinforcement Learning with Temporal Logic through Gaussian Processes and Control Barrier Functions [3.5897534810405403]
Reinforcement learning (RL) is a promising approach, but its success in real-world applications remains limited.
In this paper, we propose a learning-based control framework consisting of several aspects.
We show that such an ECBF-based modular deep RL algorithm achieves near-perfect success rates and guards safety with high probability.
arXiv Detail & Related papers (2021-09-07T00:51:12Z) - Combining Deep Learning and Optimization for Security-Constrained
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)