Branch Ranking for Efficient Mixed-Integer Programming via Offline
Ranking-based Policy Learning
- URL: http://arxiv.org/abs/2207.13701v1
- Date: Tue, 26 Jul 2022 15:34:10 GMT
- Title: Branch Ranking for Efficient Mixed-Integer Programming via Offline
Ranking-based Policy Learning
- Authors: Zeren Huang, Wenhao Chen, Weinan Zhang, Chuhan Shi, Furui Liu,
Hui-Ling Zhen, Mingxuan Yuan, Jianye Hao, Yong Yu, Jun Wang
- Abstract summary: We formulate learning to branch as an offline reinforcement learning (RL) problem.
We train a branching model named Branch Ranking via offline policy learning.
Experiments on synthetic MIP benchmarks and real-world tasks demonstrate that Branch Ranking is more efficient and robust.
- Score: 45.1011106869493
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deriving a good variable selection strategy in branch-and-bound is essential
for the efficiency of modern mixed-integer programming (MIP) solvers. With MIP
branching data collected during the previous solution process, learning to
branch methods have recently surpassed heuristics. As
branch-and-bound is naturally a sequential decision making task, one should
learn to optimize the utility of the whole MIP solving process instead of being
myopic on each step. In this work, we formulate learning to branch as an
offline reinforcement learning (RL) problem, and propose a long-sighted hybrid
search scheme to construct the offline MIP dataset, which values the long-term
utilities of branching decisions. During the policy training phase, we deploy a
ranking-based reward assignment scheme to distinguish the promising samples
from the long-term or short-term view, and train the branching model named
Branch Ranking via offline policy learning. Experiments on synthetic MIP
benchmarks and real-world tasks demonstrate that Branch Ranking is more
efficient and robust, and can better generalize to large scales of MIP
instances compared to the widely used heuristics and state-of-the-art
learning-based branching models.
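The paper's code is not reproduced here, but the ranking-based reward assignment described in the abstract can be illustrated schematically. The following is a minimal Python sketch under assumed conventions: the utility key `subtree_size`, the quartile reward tiers, and the pairwise hinge surrogate are hypothetical stand-ins, not the authors' actual design.

```python
import torch

def assign_ranking_rewards(samples, key="subtree_size"):
    """Rank offline branching samples by a long-term utility signal and map
    ranks to discrete rewards. The key name and quartile tiers are assumed
    for illustration; a smaller resulting subtree = better long-term utility.
    """
    ranked = sorted(samples, key=lambda s: s[key])
    n = len(ranked)
    for rank, s in enumerate(ranked):
        s["reward"] = 1.0 if rank < n // 4 else (0.5 if rank < n // 2 else 0.0)
    return ranked

def pairwise_ranking_loss(scores, rewards, margin=1.0):
    """Generic ranking surrogate: for every ordered pair where one candidate
    carries a strictly higher reward, penalize the policy unless it also
    scores that candidate higher by at least `margin`.
    scores, rewards: 1-D tensors over branching candidates at one node.
    """
    diff_s = scores.unsqueeze(0) - scores.unsqueeze(1)    # [i, j] = s_j - s_i
    diff_r = rewards.unsqueeze(0) - rewards.unsqueeze(1)  # [i, j] = r_j - r_i
    mask = (diff_r > 0).float()                           # j should outrank i
    hinge = torch.clamp(margin - diff_s, min=0.0)
    return (mask * hinge).sum() / mask.sum().clamp(min=1.0)
```

In this sketch the reward tiers only decide which pairs the ranking loss compares, which is the sense in which promising samples are distinguished rather than regressed to exact values.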
Related papers
- Lifelong Learning for Neural powered Mixed Integer Programming [6.297356207581819]
Mixed integer programs (MIPs) are typically solved by the Branch-and-Bound algorithm.
We propose LIMIP, which is powered by the idea of modeling an MIP instance in the form of a bipartite graph.
We evaluate LIMIP on a series of NP-hard problems and establish that in comparison to existing baselines, LIMIP is up to 50% better when confronted with lifelong learning.
arXiv Detail & Related papers (2022-08-24T09:05:40Z)
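LIMIP's exact featurization is not given in this summary; the bipartite encoding it refers to is, in the learning-to-branch literature, usually the variable/constraint graph with coefficient-weighted edges. A minimal sketch under that assumption (the function name and one-dimensional node features are placeholders):

```python
import numpy as np

def mip_to_bipartite(A, b, c):
    """Encode min c^T x s.t. A x <= b as a bipartite graph: variable nodes
    on one side, constraint nodes on the other, one edge per nonzero A[i, j].
    The single-feature node vectors below are illustrative only.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    var_feats = np.asarray(c, dtype=float).reshape(n, 1)  # objective coefficients
    con_feats = np.asarray(b, dtype=float).reshape(m, 1)  # right-hand sides
    edges = [(i, j, A[i, j]) for i in range(m) for j in range(n) if A[i, j] != 0.0]
    return var_feats, con_feats, edges

# Toy instance: 2 constraints, 2 variables
var_feats, con_feats, edges = mip_to_bipartite(A=[[1, 2], [3, 0]], b=[4, 5], c=[1, -1])
```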
- Making Linear MDPs Practical via Contrastive Representation Learning [101.75885788118131]
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
We consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning.
We demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
arXiv Detail & Related papers (2022-07-14T18:18:02Z)
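The alternative linear-MDP definition mentioned above rests on Q-values being (near-)linear in a feature map, Q(s, a) ≈ φ(s, a)ᵀw. One practical consequence, shown here as a rough sketch, is that a value-iteration step reduces to ridge regression; the feature matrix and bootstrap targets are assumed to be precomputed:

```python
import numpy as np

def lsvi_step(phi, rewards, next_values, lam=1.0):
    """One least-squares value-iteration update under a linear-Q assumption:
    fit w so that phi(s, a)^T w ~= r + max_a' phi(s', a')^T w_prev.

    phi:         (N, d) features of observed (s, a) pairs
    rewards:     (N,) observed rewards
    next_values: (N,) bootstrapped values max_a' phi(s', a')^T w_prev
    """
    targets = rewards + next_values
    d = phi.shape[1]
    # Ridge-regularized normal equations: (Phi^T Phi + lam I) w = Phi^T y
    return np.linalg.solve(phi.T @ phi + lam * np.eye(d), phi.T @ targets)
```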
- Deep Reinforcement Learning for Exact Combinatorial Optimization: Learning to Branch [13.024115985194932]
We propose a new approach for addressing the data labeling and inference issues in combinatorial optimization, based on the reinforcement learning (RL) paradigm.
We use imitation learning to bootstrap an RL agent, and then use Proximal Policy Optimization (PPO) to further explore globally optimal actions.
arXiv Detail & Related papers (2022-06-14T16:35:58Z)
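The PPO stage referenced above uses the standard clipped surrogate objective of Schulman et al. (2017); for reference, a minimal PyTorch version with the conventional clip range ε = 0.2 (how the paper wires it to branching states is not shown here):

```python
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, eps=0.2):
    """Clipped PPO surrogate: maximize min(r * A, clip(r, 1-eps, 1+eps) * A)
    with r = pi_new(a|s) / pi_old(a|s); returned negated as a loss to minimize.
    """
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```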
- Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories [72.15369769265398]
Machine learning has emerged as a promising paradigm for branching.
We propose retro branching, a simple yet effective approach to RL for branching.
We outperform the current state-of-the-art RL branching algorithm by 3-5x and come within 20% of the best IL method's performance on MILPs with 500 constraints and 1000 variables.
arXiv Detail & Related papers (2022-05-28T06:08:07Z)
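As a rough reading of the abstract above, retro branching builds training trajectories retrospectively out of the already-solved search tree instead of following the solver's live node ordering. The sketch below only captures that decomposition idea: the dict-based tree, the depth-first enumeration of all root-to-leaf paths, and the absence of rewards are simplifications, not the paper's actual construction:

```python
def retrospective_trajectories(tree, root):
    """Decompose a solved branch-and-bound tree into root-to-leaf
    trajectories of (node, chosen_child) decisions for offline training.
    `tree` maps a node id to its list of child ids; leaves have no entry.
    """
    stack = [(root, [])]
    while stack:
        node, path = stack.pop()
        children = tree.get(node, [])
        if not children:          # reached a leaf: one finished trajectory
            yield path
            continue
        for child in children:
            stack.append((child, path + [(node, child)]))

# Toy tree rooted at 0
for traj in retrospective_trajectories({0: [1, 2], 1: [3, 4], 2: []}, 0):
    print(traj)
```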
- Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning [156.5667417159582]
This paper proposes a novel model of sequential information design, namely Markov persuasion processes (MPPs).
Planning in MPPs faces the unique challenge of finding a signaling policy that is simultaneously persuasive to the myopic receivers and induces the optimal long-term cumulative utility for the sender.
We design a provably efficient no-regret learning algorithm, the Optimism-Pessimism Principle for Persuasion Process (OP4), which features a novel combination of both optimism and pessimism principles.
arXiv Detail & Related papers (2022-02-22T05:41:43Z)
- Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as reward.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
arXiv Detail & Related papers (2021-06-14T18:48:40Z)
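The soft Q-learning view mentioned above ties the policy to Q-values through a softmax, π(a|s) ∝ exp(Q(s, a)/τ), with the soft value V(s) = τ log Σ_a exp(Q(s, a)/τ); in text generation the actions are vocabulary tokens. A minimal sketch (the vocabulary size and temperature are placeholders):

```python
import torch
import torch.nn.functional as F

def soft_policy_and_value(q_values, tau=1.0):
    """From per-token Q-values, return the soft policy softmax(Q / tau)
    and the soft state value tau * logsumexp(Q / tau)."""
    pi = F.softmax(q_values / tau, dim=-1)
    v = tau * torch.logsumexp(q_values / tau, dim=-1)
    return pi, v

q = torch.randn(32000)                  # Q-values over a hypothetical vocabulary
pi, v = soft_policy_and_value(q, tau=0.7)
next_token = torch.multinomial(pi, 1)   # sample the next token from the policy
```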
- Learning to Schedule Heuristics in Branch-and-Bound [25.79025327341732]
Real-world applications typically require finding good solutions early in the search to enable fast decision-making.
We propose the first data-driven framework for scheduling heuristics in an exact MIP solver.
Compared to the default settings of a state-of-the-art academic MIP solver, we are able to reduce the average primal integral by up to 49% on a class of challenging instances.
arXiv Detail & Related papers (2021-03-18T14:49:52Z)
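The primal integral behind the 49% figure above is a standard MIP benchmark metric: the integral over solving time of the primal gap between the incumbent and the best known objective, so finding good solutions earlier yields a smaller value. A small helper computing it from a log of incumbent updates, assuming minimization and the usual gap normalization:

```python
def primal_integral(updates, best_obj, t_end):
    """Primal integral of a minimization run.

    updates:  time-sorted list of (time, incumbent_objective) pairs
    best_obj: best known (or optimal) objective value
    t_end:    total solving time

    The primal gap |inc - best| / max(|inc|, |best|) lies in [0, 1] and is
    taken as 1 before the first incumbent; the incumbent trace is
    piecewise constant, so the integral is a sum of rectangles.
    """
    def gap(inc):
        if inc is None:
            return 1.0
        denom = max(abs(inc), abs(best_obj))
        return 0.0 if denom == 0 else abs(inc - best_obj) / denom

    total, t_prev, inc = 0.0, 0.0, None
    for t, obj in updates:
        total += gap(inc) * (t - t_prev)
        t_prev, inc = t, obj
    return total + gap(inc) * (t_end - t_prev)

# Incumbents at t=2 (obj 10) and t=5 (obj 8), optimum 8, 10s run -> 2.6
print(primal_integral([(2.0, 10.0), (5.0, 8.0)], best_obj=8.0, t_end=10.0))
```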
- A Study of Learning Search Approximation in Mixed Integer Branch and Bound: Node Selection in SCIP [2.528056693920671]
We present an offline method to learn such a policy in two settings.
One setting corresponds to a child selector during plunging, while the other is akin to a diving heuristic.
Empirical results on five MIP datasets indicate that our node selection policy leads to solutions significantly faster than the state-of-the-art in the literature.
arXiv Detail & Related papers (2020-07-08T08:12:44Z)
- Parameterizing Branch-and-Bound Search Trees to Learn Branching Policies [76.83991682238666]
Branch and Bound (B&B) is the exact tree search method typically used to solve Mixed-Integer Linear Programming problems (MILPs).
We propose a novel imitation learning framework, and introduce new input features and architectures to represent branching.
arXiv Detail & Related papers (2020-02-12T17:43:23Z)