Related papers: Balans: Multi-Armed Bandits-based Adaptive Large Neighborhood Search for Mixed-Integer Programming Problem

Balans: Multi-Armed Bandits-based Adaptive Large Neighborhood Search for Mixed-Integer Programming Problem

URL: http://arxiv.org/abs/2412.14382v3
Date: Wed, 23 Jul 2025 17:09:55 GMT
Title: Balans: Multi-Armed Bandits-based Adaptive Large Neighborhood Search for Mixed-Integer Programming Problem
Authors: Junyang Cai, Serdar Kadioglu, Bistra Dilkina,
Abstract summary: Mixed-integer programming (MIP) is a powerful paradigm for modeling and solving various important optimization problems.<n>We propose Balans, an adaptive meta-solver for MIPs with online learning capability that does not require any supervision or apriori training.<n>Balans offers significant performance gains over the default MIP solver, is better than committing to any single best neighborhood, and improves over the state-of-the-art large-neighborhood search for MIPs.
Score: 13.381261742433367
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Mixed-integer programming (MIP) is a powerful paradigm for modeling and solving various important combinatorial optimization problems. Recently, learning-based approaches have shown a potential to speed up MIP solving via offline training that then guides important design decisions during the search. However, a significant drawback of these methods is their heavy reliance on offline training, which requires collecting training datasets and computationally costly training epochs yet offering only limited generalization to unseen (larger) instances. In this paper, we propose Balans, an adaptive meta-solver for MIPs with online learning capability that does not require any supervision or apriori training. At its core, Balans is based on adaptive large-neighborhood search, operating on top of an MIP solver by successive applications of destroy and repair neighborhood operators. During the search, the selection among different neighborhood definitions is guided on the fly for the instance at hand via multi-armed bandit algorithms. Our extensive experiments on hard optimization instances show that Balans offers significant performance gains over the default MIP solver, is better than committing to any single best neighborhood, and improves over the state-of-the-art large-neighborhood search for MIPs. Finally, we release Balans as a highly configurable, MIP solver agnostic, open-source software.

Related papers

Applying a Random-Key Optimizer on Mixed Integer Programs [0.36700088931938835]
Mixed-Integer Programs (MIPs) are NP-hard optimization models that arise in a broad range of decision-making applications.<n>This paper explores the use of the Random-Key integer (RKO) framework as a flexible, metaheuristic alternative for computing high-quality solutions to MIPs.
arXiv Detail & Related papers (2026-02-25T18:20:03Z)
Assignment-Routing Optimization: Solvers for Problems Under Constraints [0.0]
We study the Joint Routing-Assignment (JRA) problem in which items must be assigned one-to-one to placeholders.<n>We develop a solver tailored for practical packaging-planning scenarios with richer constraints.<n>Results highlight the practical applicability of MIP-based JRA optimization for robotic packaging, motion planning, and complex logistics.
arXiv Detail & Related papers (2025-12-21T06:32:31Z)
SolverLLM: Leveraging Test-Time Scaling for Optimization Problem via LLM-Guided Search [58.116954449750544]
We introduce a training-free framework that leverages test-time scaling to solve diverse optimization problems.<n>Rather than solving directly, it generates mathematical formulations and translates them into solver-ready code, guided by a novel Monte Carlo Tree Search strategy.
arXiv Detail & Related papers (2025-10-19T16:21:19Z)
FORGE: Foundational Optimization Representations from Graph Embeddings [3.9124823111588163]
Existing learning-based methods require training dedicated models for each problem distribution.<n>We introduce Forge: Foundational Optimization Representations from Graph Embeddings, a framework that pre-trains a vector-quantized graph autoencoder.<n>In both tasks, we improve the performance of a commercial optimization solver and outperform state-of-the-art learning-based methods.
arXiv Detail & Related papers (2025-08-28T00:15:57Z)
ParBalans: Parallel Multi-Armed Bandits-based Adaptive Large Neighborhood Search [12.249099965011458]
This paper investigates the parallelization capabilities of Balans, a proposed adaptive large neighborhood search for MIPs.<n>We introduce ParBalans, an extension that leverages both solver-level and algorithmic-level parallelism to improve performance on challenging MIP instances.<n>Our experimental results demonstrate that ParBalans exhibits competitive performance compared to the state-of-the-art commercial solver Gurobi.
arXiv Detail & Related papers (2025-08-08T22:30:19Z)
FMIP: Joint Continuous-Integer Flow For Mixed-Integer Linear Programming [52.52020895303244]
Mixed-Integer Linear Programming (MILP) is a foundational tool for complex decision-making problems.<n>We propose Joint Continuous-Integer Flow for Mixed-Integer Linear Programming (FMIP), which is the first generative framework that models joint distribution of both integer and continuous variables for MILP solutions.<n>FMIP is fully compatible with arbitrary backbone networks and various downstream solvers, making it well-suited for a broad range of real-world MILP applications.
arXiv Detail & Related papers (2025-07-31T10:03:30Z)
Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models? [62.579951798437115]
This work investigates iterative approximate evaluation for arbitrary prompts.<n>It introduces Model Predictive Prompt Selection (MoPPS), a Bayesian risk-predictive framework.<n>MoPPS reliably predicts prompt difficulty and accelerates training with significantly reduced rollouts.
arXiv Detail & Related papers (2025-07-07T03:20:52Z)
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [50.485788083202124]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks. We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model. Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z)
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path. The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
UCB-driven Utility Function Search for Multi-objective Reinforcement Learning [75.11267478778295]
In Multi-objective Reinforcement Learning (MORL) agents are tasked with optimising decision-making behaviours. We focus on the case of linear utility functions parameterised by weight vectors w. We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process.
arXiv Detail & Related papers (2024-05-01T09:34:42Z)
Moco: A Learnable Meta Optimizer for Combinatorial Optimization [5.359176539960004]
Moco learns a graph neural network that updates the solution construction procedure based on features extracted from the current search state. This meta training procedure targets the overall best solution found during the search procedure given information such as the search budget. Moco is a fully learnable meta that does not utilize any problem specific local search or decomposition.
arXiv Detail & Related papers (2024-02-07T14:41:17Z)
Adaptive Anytime Multi-Agent Path Finding Using Bandit-Based Large Neighborhood Search [30.364955687049292]
State-of-the-art anytime MAPF is based on Large Neighborhood Search (LNS) We propose Bandit-based Adaptive LArge Neighborhood search Combined with Exploration (BALANCE) We empirically demonstrate cost improvements of at least 50% compared to state-of-the-art anytime MAPF in large-scale scenarios.
arXiv Detail & Related papers (2023-12-28T01:24:42Z)
A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs) MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
Online Control of Adaptive Large Neighborhood Search using Deep Reinforcement Learning [4.374837991804085]
We introduce a Deep Reinforcement Learning based approach called DR-ALNS that selects operators, adjusts parameters, and controls the acceptance criterion throughout the search. We evaluate the proposed method on a problem with orienteering weights and time windows, as presented in an IJCAI competition. The results show that our approach outperforms vanilla ALNS, ALNS tuned with Bayesian optimization, and two state-of-the-art DRL approaches.
arXiv Detail & Related papers (2022-11-01T21:33:46Z)
Branch Ranking for Efficient Mixed-Integer Programming via Offline Ranking-based Policy Learning [45.1011106869493]
We formulate learning to branch as an offline reinforcement learning (RL) problem. We train a branching model named Branch Ranking via offline policy learning. Experiments on synthetic MIP benchmarks and real-world tasks demonstrate that Branch Rankink is more efficient and robust.
arXiv Detail & Related papers (2022-07-26T15:34:10Z)
A Survey for Solving Mixed Integer Programming via Machine Learning [76.04988886859871]
This paper surveys the trend of machine learning to solve mixed integer (MIP) problems. In this paper, we first introduce the formulation and preliminaries of MIP and several traditional algorithms to solve MIP. Then, we advocate further promoting the different integration of machine learning and MIP algorithms.
arXiv Detail & Related papers (2022-03-06T05:03:37Z)
Low-Latency Federated Learning over Wireless Channels with Differential Privacy [142.5983499872664]
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server. In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement.
arXiv Detail & Related papers (2021-06-20T13:51:18Z)
Learning to Schedule Heuristics in Branch-and-Bound [25.79025327341732]
Real-world applications typically require finding good solutions early in the search to enable fast decision-making. We propose the first data-driven framework for schedulings in an exact MIP solver. Compared to the default settings of a state-of-the-art academic MIP solver, we are able to reduce the average primal integral by up to 49% on a class of challenging instances.
arXiv Detail & Related papers (2021-03-18T14:49:52Z)
Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems. Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs. This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.