Promoting Generalization for Exact Solvers via Adversarial Instance
Augmentation
- URL: http://arxiv.org/abs/2310.14161v1
- Date: Sun, 22 Oct 2023 03:15:36 GMT
- Title: Promoting Generalization for Exact Solvers via Adversarial Instance
Augmentation
- Authors: Haoyang Liu and Yufei Kuang and Jie Wang and Xijun Li and Yongdong
Zhang and Feng Wu
- Abstract summary: AdaSolver is a framework for understanding and improving the generalization of both imitation-learning-based (IL-based) and reinforcement-learning-based (RL-based) solvers.
- Score: 62.738582127114704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning has been successfully applied to improve the efficiency of
Mixed-Integer Linear Programming (MILP) solvers. However, the learning-based
solvers often suffer from severe performance degradation on unseen MILP
instances -- especially on large-scale instances from a perturbed environment
-- due to the limited diversity of training distributions. To tackle this
problem, we propose Adversarial Instance Augmentation (AdaSolver), a novel
approach that promotes data diversity for learning-based branching modules in
branch-and-bound (B&B) solvers without requiring knowledge of the problem type
to generate new instances. We use bipartite graph
representations for MILP instances and obtain various perturbed instances to
regularize the solver by augmenting the graph structures with a learned
augmentation policy. The major technical contribution of AdaSolver is that we
formulate the non-differentiable instance augmentation as a contextual bandit
problem and adversarially train the learning-based solver and augmentation
policy, enabling efficient gradient-based training of the augmentation policy.
To the best of our knowledge, AdaSolver is the first general and effective
framework for understanding and improving the generalization of both
imitation-learning-based (IL-based) and reinforcement-learning-based (RL-based)
B&B solvers. Extensive experiments demonstrate that, by producing diverse
augmented instances, AdaSolver delivers remarkable efficiency improvements
across various distributions.
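
To make the pipeline concrete, below is a minimal PyTorch sketch of the idea as described in the abstract: a MILP instance encoded as a variable-constraint bipartite graph, an augmentation policy that perturbs the graph structure, and adversarial training in which the non-differentiable perturbation is handled as a contextual bandit via a REINFORCE-style gradient. All names (`AugmentationPolicy`, `solver_loss`, the linear stand-in for the branching model) and the toy instance are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy MILP (min c^T x  s.t.  A x <= b) encoded as a bipartite graph:
# constraint nodes on one side, variable nodes on the other, and an edge
# for each nonzero coefficient A[i, j].
n_cons, n_vars = 4, 6
A = torch.randn(n_cons, n_vars)
b, c = torch.randn(n_cons), torch.randn(n_vars)
edge_index = A.nonzero()                             # (E, 2): (constraint, variable)
coef = A[edge_index[:, 0], edge_index[:, 1]]
# One feature vector per edge: [coefficient, rhs of its row, objective of its column].
edge_feats = torch.stack([coef, b[edge_index[:, 0]], c[edge_index[:, 1]]], dim=1)


class AugmentationPolicy(nn.Module):
    """Contextual bandit policy: score each edge and sample a Bernoulli
    keep/drop action. Sampling is non-differentiable, so the policy is
    trained with a REINFORCE-style gradient."""

    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, feats):
        dist = torch.distributions.Bernoulli(logits=self.net(feats).squeeze(-1))
        action = dist.sample()                       # 1 = keep edge, 0 = drop edge
        return action, dist.log_prob(action).sum()


# Linear stand-in for the learned branching module; a real system would
# embed the bipartite graph (e.g., with a GNN) and score branching inside B&B.
solver = nn.Linear(edge_feats.size(1), 1)
policy = AugmentationPolicy(edge_feats.size(1))
opt_solver = torch.optim.Adam(solver.parameters(), lr=1e-2)
opt_policy = torch.optim.Adam(policy.parameters(), lr=1e-2)


def solver_loss(mask):
    # Placeholder for the solver's training loss (imitation or RL objective)
    # evaluated on the structurally perturbed instance.
    return solver(edge_feats * mask.unsqueeze(-1)).pow(2).mean()


for step in range(200):
    mask, log_prob = policy(edge_feats)              # sample a perturbed instance

    # Solver step: minimize the loss on the augmented instance.
    opt_solver.zero_grad()
    solver_loss(mask.detach()).backward()
    opt_solver.step()

    # Adversarial bandit step: the solver's loss is the policy's reward,
    # pushing the policy toward perturbations the solver handles poorly.
    reward = solver_loss(mask).detach()
    opt_policy.zero_grad()
    (-reward * log_prob).backward()
    opt_policy.step()
```

The REINFORCE update is what makes the non-differentiable structural perturbation trainable by gradients, matching the contextual-bandit formulation above; the toy instance, linear solver stand-in, and squared loss are placeholders for the paper's learned branching model and actual B&B rollouts.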
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Optimization by Parallel Quasi-Quantum Annealing with Gradient-Based Sampling [0.0]
This study proposes a different approach that integrates gradient-based updates through continuous relaxation, combined with Quasi-Quantum Annealing (QQA).
Numerical experiments demonstrate that our method is a competitive general-purpose solver, achieving performance comparable to iSCO and learning-based solvers.
arXiv Detail & Related papers (2024-09-02T12:55:27Z)
- DiffSG: A Generative Solver for Network Optimization with Diffusion Model [75.27274046562806]
Diffusion generative models can consider a broader range of solutions and exhibit stronger generalization by learning parameters.
We propose a new framework that leverages the intrinsic distribution learning of diffusion generative models to learn high-quality solutions.
arXiv Detail & Related papers (2024-08-13T07:56:21Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances, utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter-tuning algorithms that markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Attention, Filling in The Gaps for Generalization in Routing Problems [5.210197476419621]
This paper aims to encourage the consolidation of the field through understanding and improving existing models.
We first target model discrepancies by adapting the Kool et al. method and its loss function for Sparse Dynamic Attention.
We then target inherent differences through the use of a mixed-instance training method that has been shown to outperform single-instance training in certain scenarios.
arXiv Detail & Related papers (2022-07-14T21:36:51Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Reversible Action Design for Combinatorial Optimization with Reinforcement Learning [35.50454156611722]
Reinforcement learning (RL) has recently emerged as a new framework to tackle combinatorial optimization problems (COPs).
We propose a general RL framework that not only exhibits state-of-the-art empirical performance but also generalizes to a wide class of COPs.
arXiv Detail & Related papers (2021-02-14T18:05:42Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information-theoretic regularization objective and an annealing-based optimization method to achieve better generalization in RL agents.
We demonstrate the extreme generalization benefits of our approach in domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z)