Toward TransfORmers: Revolutionizing the Solution of Mixed Integer Programs with Transformers
- URL: http://arxiv.org/abs/2402.13380v3
- Date: Fri, 24 May 2024 16:17:43 GMT
- Title: Toward TransfORmers: Revolutionizing the Solution of Mixed Integer Programs with Transformers
- Authors: Joshua F. Cooper, Seung Jin Choi, I. Esra Buyuktahtakin
- Abstract summary: We introduce an innovative deep learning framework that employs a transformer model to address the challenges of mixed-integer programs.
Our approach is the first to utilize transformers to predict the binary variables of a mixed-integer programming (MIP) problem.
We present an efficient algorithm in which CLSP solutions are learned through a transformer neural network.
- Score: 3.107843027522116
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this study, we introduce an innovative deep learning framework that employs a transformer model to address the challenges of mixed-integer programs, specifically focusing on the Capacitated Lot Sizing Problem (CLSP). Our approach, to our knowledge, is the first to utilize transformers to predict the binary variables of a mixed-integer programming (MIP) problem. Specifically, our approach harnesses the encoder-decoder transformer's ability to process sequential data, making it well-suited for predicting the binary variables that indicate production setup decisions in each period of the CLSP. The problem is inherently dynamic and requires sequential decision-making under constraints. We present an efficient algorithm in which CLSP solutions are learned through a transformer neural network. The proposed post-processed transformer algorithm surpasses the state-of-the-art solver CPLEX and Long Short-Term Memory (LSTM) networks in solution time, optimality gap, and percent infeasibility over the 240K benchmark CLSP instances tested. Once the ML model is trained, conducting inference reduces the MIP to a linear program (LP). This transforms the ML-based algorithm, combined with an LP solver, into a polynomial-time approximation algorithm for a well-known NP-hard problem, with almost perfect solution quality.
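The key post-processing idea in the abstract is that fixing the predicted binary setup variables collapses the CLSP from a MIP to an LP. The sketch below illustrates that reduction on a toy instance; it is not the authors' code, the instance data and the hard-coded "prediction" are assumptions, and SciPy stands in for the LP solver (the paper benchmarks against CPLEX).

```python
import numpy as np
from scipy.optimize import linprog

# Toy CLSP instance (assumed data, not from the paper).
T = 6                                    # number of periods
demand = np.array([40, 60, 30, 80, 20, 50], dtype=float)
capacity = 100.0                         # production capacity per period
prod_cost = np.full(T, 2.0)              # unit production cost c_t
hold_cost = np.full(T, 0.5)              # unit holding cost h_t

# Binary setup decisions as a trained transformer would predict them
# (hard-coded here purely for illustration).
y_pred = np.array([1, 1, 0, 1, 0, 1])

# Decision vector z = [x_1..x_T, I_1..I_T]; setup costs are constant once y
# is fixed, so they drop out of the LP objective.
c = np.concatenate([prod_cost, hold_cost])

# Flow balance per period: I_{t-1} + x_t - I_t = d_t (with I_0 = 0).
A_eq = np.zeros((T, 2 * T))
for t in range(T):
    A_eq[t, t] = 1.0                     # x_t
    A_eq[t, T + t] = -1.0                # -I_t
    if t > 0:
        A_eq[t, T + t - 1] = 1.0         # +I_{t-1}
b_eq = demand

# Fixing y_t turns the coupling x_t <= capacity * y_t into simple bounds,
# which is exactly what reduces the MIP to an LP.
bounds = [(0.0, capacity * y_pred[t]) for t in range(T)] + [(0.0, None)] * T

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
if res.success:
    x, inv = res.x[:T], res.x[T:]
    print("production:", np.round(x, 2))
    print("inventory: ", np.round(inv, 2))
else:
    # An infeasible LP means the predicted setups cannot cover demand; this is
    # where a post-processing/repair step would intervene.
    print("LP infeasible for the predicted setup plan")
```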
Related papers
- Progressive Mixed-Precision Decoding for Efficient LLM Inference [49.05448842542558]
We introduce Progressive Mixed-Precision Decoding (PMPD) to address the memory-boundedness of decoding.
PMPD achieves a 1.4×-12.2× speedup in matrix-vector multiplications over fp16 models.
Our approach delivers a throughput gain of 3.8×-8.0× over fp16 models and up to 1.54× over uniform quantization approaches.
arXiv Detail & Related papers (2024-10-17T11:46:33Z)
- A hybrid Quantum-Classical Algorithm for Mixed-Integer Optimization in Power Systems [0.0]
We present a framework for solving power system optimization problems with a Quantum Computer (QC).
Our guiding applications are the optimal transmission switching and the verification of neural networks trained to solve a DC Optimal Power Flow.
arXiv Detail & Related papers (2024-04-16T16:11:56Z)
- Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization [1.3124513975412255]
We introduce TranSDDP, a novel Transformer-based stagewise decomposition algorithm.
We show it efficiently generates a piecewise linear approximation for the value function.
arXiv Detail & Related papers (2024-04-03T09:08:15Z)
- Mixed Integer Linear Programming Solver Using Benders Decomposition Assisted by Neutral Atom Quantum Processor [0.0]
This paper presents a new hybrid classical-quantum approach to solving Mixed-Integer Linear Programming (MILP) problems.
We apply Benders decomposition (BD) to split MILPs into a master problem (MP) and a subproblem (SP); a classical sketch of this split appears after this list.
Our MILP-to-QUBO conversion tightens the upper bounds of the involved continuous variables, reducing the required qubit count and improving the convergence of the algorithm.
arXiv Detail & Related papers (2024-02-08T15:33:09Z)
- Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection [88.23337313766353]
This work first provides a comprehensive statistical theory for transformers to perform ICL.
We show that transformers can implement a broad class of standard machine learning algorithms in context.
A single transformer can adaptively select different base ICL algorithms.
arXiv Detail & Related papers (2023-06-07T17:59:31Z)
- Deep-Learning Based Linear Precoding for MIMO Channels with Finite-Alphabet Signaling [0.5076419064097732]
This paper studies the problem of linear precoding for multiple-input multiple-output (MIMO) communication channels.
Existing solutions typically suffer from high computational complexity due to costly computations of the constellation-constrained mutual information.
A data-driven approach, based on deep learning, is proposed to tackle the problem.
arXiv Detail & Related papers (2021-11-05T13:48:45Z)
- Joint Deep Reinforcement Learning and Unfolding: Beam Selection and Precoding for mmWave Multiuser MIMO with Lens Arrays [54.43962058166702]
Millimeter wave (mmWave) multiuser multiple-input multiple-output (MU-MIMO) systems with discrete lens arrays (DLA) have received great attention.
In this work, we investigate the joint design of beam selection and precoding matrices for mmWave MU-MIMO systems with DLA.
arXiv Detail & Related papers (2021-01-05T03:55:04Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
- Iterative Algorithm Induced Deep-Unfolding Neural Networks: Precoding Design for Multiuser MIMO Systems [59.804810122136345]
We propose a framework for deep-unfolding, where a general form of iterative algorithm induced deep-unfolding neural network (IAIDNN) is developed.
An efficient IAIDNN based on the structure of the classic weighted minimum mean-square error (WMMSE) iterative algorithm is developed.
We show that the proposed IAIDNN efficiently achieves the performance of the iterative WMMSE algorithm with reduced computational complexity.
arXiv Detail & Related papers (2020-06-15T02:57:57Z)
- Channel Assignment in Uplink Wireless Communication using Machine Learning Approach [54.012791474906514]
This letter investigates a channel assignment problem in uplink wireless communication systems.
Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints.
Due to the high computational complexity of the problem, machine learning approaches are employed to obtain computationally efficient solutions.
arXiv Detail & Related papers (2020-01-12T15:54:20Z)
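The Benders decomposition entry above splits a MILP into a master problem over the integer variables and an LP subproblem over the continuous ones. As referenced there, the following is a minimal, purely classical sketch of that loop on a made-up commitment/dispatch instance; the cited paper instead solves the master problem as a QUBO on a neutral-atom quantum processor, which is not reproduced here, and all data, names, and the cut construction are illustrative assumptions (optimality cuts only, since the toy subproblem is always feasible).

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, linprog, milp

# Toy data (assumed): commit generators (binary y), then dispatch (continuous x)
# to meet demand D, with a penalized shortfall slack that guarantees feasibility.
f = np.array([50.0, 80.0, 30.0])     # fixed commitment costs
v = np.array([2.0, 1.0, 3.0])        # variable dispatch costs
cap = np.array([40.0, 60.0, 30.0])   # capacities
D, p = 90.0, 100.0                   # demand and shortfall penalty
J = len(f)

def subproblem_dual(y):
    """Solve the dual of the dispatch LP for fixed y.

    Dual variables: u (demand constraint), w_j (capacity constraints).
    The dual is always bounded because the primal has complete recourse.
    """
    c_dual = np.concatenate([[-D], cap * y])          # linprog minimizes -dual objective
    A_ub = np.zeros((J + 1, J + 1))
    A_ub[:J, 0] = 1.0
    A_ub[:J, 1:] = -np.eye(J)                         # u - w_j <= v_j
    A_ub[J, 0] = 1.0                                  # u <= p
    b_ub = np.concatenate([v, [p]])
    res = linprog(c_dual, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (J + 1), method="highs")
    return -res.fun, res.x[0], res.x[1:]              # dual value, u*, w*

cuts = []   # each optimality cut: sum_j cap_j * w_j * y_j + eta >= D * u
for it in range(20):
    # Master problem over [y_1..y_J, eta]; a plain MILP here (the paper's
    # hybrid method maps this step to a QUBO for the quantum processor).
    c_mp = np.concatenate([f, [1.0]])
    cons = [LinearConstraint(np.concatenate([cap * w, [1.0]]), lb=D * u)
            for (u, w) in cuts]
    mp = milp(c_mp,
              integrality=np.concatenate([np.ones(J), [0]]),
              bounds=Bounds(np.zeros(J + 1), np.concatenate([np.ones(J), [np.inf]])),
              constraints=cons or None)
    y, lower = np.round(mp.x[:J]), mp.fun

    # Evaluate the true cost of this commitment and generate a new cut.
    sp_val, u, w = subproblem_dual(y)
    upper = f @ y + sp_val
    print(f"iter {it}: LB={lower:.2f}  UB={upper:.2f}  y={y.astype(int)}")
    if upper - lower < 1e-6:
        break
    cuts.append((u, w))
```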