All You Need is an Improving Column: Enhancing Column Generation for Parallel Machine Scheduling via Transformers
- URL: http://arxiv.org/abs/2410.15601v1
- Date: Mon, 21 Oct 2024 02:53:37 GMT
- Title: All You Need is an Improving Column: Enhancing Column Generation for Parallel Machine Scheduling via Transformers
- Authors: Amira Hijazi, Osman Ozaltin, Reha Uzsoy
- Abstract summary: We present a neural network-enhanced column generation (CG) approach for a parallel machine scheduling problem.
By training the neural network offline and using it in inference mode to predict columns with negative reduced cost, we achieve significant computational time savings.
For large-sized instances, the proposed approach achieves an 80% improvement in the objective value in under 500 seconds.
- Abstract: We present a neural network-enhanced column generation (CG) approach for a parallel machine scheduling problem. The proposed approach utilizes an encoder-decoder attention model, namely the transformer and pointer architectures, to develop job sequences with negative reduced cost and thus generate columns to add to the master problem. By training the neural network offline and using it in inference mode to predict columns with negative reduced cost, we achieve significant computational time savings compared to dynamic programming (DP). Since the exact DP procedure is used to verify that no further columns with negative reduced cost can be identified at termination, the optimality guarantee of the original CG procedure is preserved. For small to medium-sized instances, our approach achieves an average 45% reduction in computation time compared to solving the subproblems with DP. Furthermore, the model generalizes not only to unseen, larger problem instances from the same probability distribution but also to instances from different probability distributions than those presented at training time. For large-sized instances, the proposed approach achieves an 80% improvement in the objective value in under 500 seconds, demonstrating both its scalability and efficiency.
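As a minimal sketch of the workflow the abstract describes, the Python below outlines a column generation loop in which a trained pricing model proposes candidate columns and the exact dynamic program is invoked only to certify that no column with negative reduced cost remains. All names here (solve_master_lp, neural_pricer, dp_pricing, reduced_cost) are hypothetical interfaces assumed for illustration; they are not the authors' implementation.

```python
# Hedged sketch of neural-network-assisted column generation (CG).
# The callables passed in are assumed interfaces, not the paper's code:
#   solve_master_lp(instance, columns) -> (duals, objective) of the restricted master LP
#   neural_pricer(instance, duals)     -> candidate job sequences from the trained model
#   dp_pricing(instance, duals)        -> best column found by exact dynamic programming
#   reduced_cost(column, duals)        -> reduced cost of a column under the current duals

def column_generation(instance, initial_columns, solve_master_lp,
                      neural_pricer, dp_pricing, reduced_cost,
                      tol=1e-6, max_iters=1000):
    columns = list(initial_columns)
    duals, obj = solve_master_lp(instance, columns)
    for _ in range(max_iters):
        # 1) Fast heuristic pricing: the transformer/pointer model predicts
        #    job sequences expected to have negative reduced cost.
        improving = [c for c in neural_pricer(instance, duals)
                     if reduced_cost(c, duals) < -tol]

        if not improving:
            # 2) Exact fallback: DP verifies that no negative reduced cost
            #    column remains, preserving CG's optimality guarantee.
            best = dp_pricing(instance, duals)
            if reduced_cost(best, duals) >= -tol:
                return columns, obj  # restricted master LP is optimal
            improving = [best]

        columns.extend(improving)
        duals, obj = solve_master_lp(instance, columns)
    return columns, obj
```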
Related papers
- Optimizing the Optimal Weighted Average: Efficient Distributed Sparse Classification [50.406127962933915]
ACOWA allows an extra round of communication to achieve noticeably better approximation quality with minor runtime increases.
Results show that ACOWA obtains solutions that are more faithful to the empirical risk minimizer and attain substantially higher accuracy than other distributed algorithms.
arXiv Detail & Related papers (2024-06-03T19:43:06Z) - CoRMF: Criticality-Ordered Recurrent Mean Field Ising Solver [4.364088891019632]
We propose an efficient RNN-based Ising model solver, the Criticality-ordered Recurrent Mean Field (CoRMF).
By leveraging the approximated tree structure of the underlying Ising graph, the newly-obtained criticality order enables the unification between variational mean-field and RNN.
CoRMF solves the Ising problems in a self-training fashion without data/evidence, and the inference tasks can be executed by directly sampling from the RNN.
arXiv Detail & Related papers (2024-03-05T16:55:06Z) - Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model [89.8764435351222]
We propose a new family of unbiased estimators called WTA-CRS for matrix multiplication with reduced variance.
Our work provides both theoretical and experimental evidence that, in the context of tuning transformers, our proposed estimators exhibit lower variance compared to existing ones; a generic column-row sampling estimator is sketched after this list for reference.
arXiv Detail & Related papers (2023-05-24T15:52:08Z) - Flag Aggregator: Scalable Distributed Training under Failures and Augmented Losses using Convex Optimization [14.732408788010313]
ML applications increasingly rely on complex deep learning models and large datasets.
To scale computation and data, these models are inevitably trained in a distributed manner in clusters of nodes, and their updates are aggregated before being applied to the model.
With data augmentation added to these settings, there is a critical need for robust and efficient aggregation systems.
We show that our approach significantly enhances the robustness of state-of-the-art Byzantine resilient aggregators.
arXiv Detail & Related papers (2023-02-12T06:38:30Z) - Using Intermediate Forward Iterates for Intermediate Generator Optimization [14.987013151525368]
Intermediate Generator Optimization can be incorporated into any standard autoencoder pipeline for the generative task.
We show applications of IGO on two dense predictive tasks, viz. image extrapolation and point cloud denoising.
arXiv Detail & Related papers (2023-02-05T08:46:15Z) - DR-DSGD: A Distributionally Robust Decentralized Learning Algorithm over Graphs [54.08445874064361]
We propose to solve a regularized distributionally robust learning problem in the decentralized setting.
By adding a Kullback-Leibler regularization function to the robust min-max optimization problem, the learning problem can be reduced to a modified robust problem.
We show that our proposed algorithm can improve the worst distribution test accuracy by up to 10%.
arXiv Detail & Related papers (2022-08-29T18:01:42Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Highly Parallel Autoregressive Entity Linking with Discriminative Correction [51.947280241185]
We propose a very efficient approach that parallelizes autoregressive linking across all potential mentions.
Our model is >70 times faster and more accurate than the previous generative method.
arXiv Detail & Related papers (2021-09-08T17:28:26Z) - Partitioning sparse deep neural networks for scalable training and inference [8.282177703075453]
State-of-the-art deep neural networks (DNNs) have significant computational and data management requirements.
Sparsification and pruning methods are shown to be effective in removing a large fraction of connections in DNNs.
The resulting sparse networks present unique challenges to further improve the computational efficiency of training and inference in deep learning.
arXiv Detail & Related papers (2021-04-23T20:05:52Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
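For reference on the WTA-CRS entry above, the sketch below shows a plain column-row sampling (CRS) estimator of a matrix product: drawing index pairs with probability proportional to the product of the column and row norms and rescaling gives an unbiased estimate of A @ B. This is the standard construction such estimators build on, shown only for illustration; it is not the WTA-CRS estimator itself.

```python
import numpy as np

def crs_matmul_estimate(A, B, k, rng=None):
    """Unbiased column-row sampling estimate of A @ B (generic CRS, not WTA-CRS).

    Draws k indices i with probability p_i proportional to ||A[:, i]|| * ||B[i, :]||
    and averages the rescaled outer products, so the expectation equals A @ B.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[1]
    weights = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    p = weights / weights.sum()
    idx = rng.choice(n, size=k, replace=True, p=p)
    est = np.zeros((A.shape[0], B.shape[1]))
    for i in idx:
        est += np.outer(A[:, i], B[i, :]) / (k * p[i])
    return est

# Example usage: the estimate converges to A @ B as k grows.
# A, B = np.random.randn(64, 512), np.random.randn(512, 32)
# approx = crs_matmul_estimate(A, B, k=128)
```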
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.