DIG-MILP: a Deep Instance Generator for Mixed-Integer Linear Programming with Feasibility Guarantee
- URL: http://arxiv.org/abs/2310.13261v1
- Date: Fri, 20 Oct 2023 03:45:29 GMT
- Title: DIG-MILP: a Deep Instance Generator for Mixed-Integer Linear Programming with Feasibility Guarantee
- Authors: Haoyu Wang, Jialin Liu, Xiaohan Chen, Xinshang Wang, Pan Li, Wotao Yin
- Abstract summary: Mixed-integer linear programming (MILP) stands as a notable NP-hard problem pivotal to numerous crucial industrial applications.
We present DIG-MILP, a deep generative framework based on variational auto-encoder (VAE), adept at extracting deep-level structural features from highly limited MILP data.
- Score: 47.11455377400096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mixed-integer linear programming (MILP) stands as a notable NP-hard problem
pivotal to numerous crucial industrial applications. The development of
effective algorithms, the tuning of solvers, and the training of machine
learning models for MILP resolution all hinge on access to extensive, diverse,
and representative data. Yet compared to the abundant naturally occurring data
in image and text realms, MILP is markedly data deficient, underscoring the
vital role of synthetic MILP generation. We present DIG-MILP, a deep generative
framework based on variational auto-encoder (VAE), adept at extracting
deep-level structural features from highly limited MILP data and producing
instances that closely mirror the target data. Notably, by leveraging the MILP
duality, DIG-MILP guarantees a correct and complete generation space and
ensures the boundedness and feasibility of the generated instances. Our
empirical study highlights the novelty and quality of the instances generated
by DIG-MILP through two distinct downstream tasks: (S1) Data sharing, where
solver solution times on the original and the DIG-MILP-generated instances are
highly positively correlated, allowing data sharing for solver tuning without
publishing the original data; (S2) Data Augmentation, wherein the
DIG-MILP-generated instances bolster the generalization performance of machine
learning models tasked with resolving MILP problems.
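To make the duality-based feasibility guarantee concrete, the sketch below is a hypothetical illustration (not the authors' implementation; all names are ours) of the underlying construction: plant a primal point x* and a nonnegative dual point y*, then derive b and c from them. The instance min c^T x s.t. Ax >= b, x >= 0 is then feasible by construction (x* satisfies every constraint) and its LP relaxation is bounded below by weak duality.

```python
import random

def generate_feasible_milp(m=4, n=6, seed=0):
    """Hypothetical sketch of duality-based MILP generation: construct
    min c^T x  s.t.  A x >= b, x >= 0 integer,
    such that a planted x* is feasible and the LP relaxation is bounded."""
    rng = random.Random(seed)
    A = [[rng.randint(-3, 3) for _ in range(n)] for _ in range(m)]
    x_star = [rng.randint(0, 5) for _ in range(n)]   # planted primal point
    y_star = [rng.randint(0, 3) for _ in range(m)]   # planted nonneg. dual point
    # Feasibility: b_i = (A x*)_i - slack_i with slack_i >= 0, so A x* >= b.
    b = [sum(A[i][j] * x_star[j] for j in range(n)) - rng.randint(0, 2)
         for i in range(m)]
    # Boundedness: c_j = (A^T y*)_j + reduced_cost_j with y* >= 0, so weak
    # duality gives c^T x >= b^T y* for every feasible x.
    c = [sum(A[i][j] * y_star[i] for i in range(m)) + rng.randint(0, 2)
         for j in range(n)]
    return A, b, c, x_star

A, b, c, x_star = generate_feasible_milp()
# The planted point is feasible by construction:
assert all(sum(Ai[j] * x_star[j] for j in range(len(x_star))) >= bi
           for Ai, bi in zip(A, b))
```

A generative model such as DIG-MILP's VAE would learn to produce the structural data (A and the planted solutions) from real instances; the algebra above is only the feasibility/boundedness scaffolding.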
Related papers
- Multi-task Representation Learning for Mixed Integer Linear Programming [13.106799330951842]
This paper introduces the first multi-task learning framework for ML-guided MILP solving.
We demonstrate that our multi-task learning model performs similarly to specialized models within the same distribution.
It significantly outperforms them in generalization across problem sizes and tasks.
arXiv Detail & Related papers (2024-12-18T23:33:32Z)
- Evaluating Language Models as Synthetic Data Generators [74.80905172696366]
AgoraBench is a benchmark that provides standardized settings and metrics to evaluate LMs' data generation abilities.
Through synthesizing 1.26 million training instances using 6 LMs and training 99 student models, we uncover key insights about LMs' data generation capabilities.
arXiv Detail & Related papers (2024-12-04T19:20:32Z)
- Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval Augmented Generation [13.120801609024147]
Retrieval-augmented generation (RAG) has been shown to enhance the factuality of large language model (LLM) outputs.
RAG inputs are more complex than most datasets used for training NLI models.
We introduce Automatic Generative Domain Adaptation (Auto-GDA) to enable unsupervised domain adaptation.
arXiv Detail & Related papers (2024-10-04T14:21:27Z)
- UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models [88.16197692794707]
UniGen is a comprehensive framework designed to produce diverse, accurate, and highly controllable datasets.
To augment data diversity, UniGen incorporates an attribute-guided generation module and a group checking feature.
Extensive experiments demonstrate the superior quality of data generated by UniGen.
arXiv Detail & Related papers (2024-06-27T07:56:44Z)
- Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes [57.62036621319563]
We introduce CLLM, which leverages the prior knowledge of Large Language Models (LLMs) for data augmentation in the low-data regime.
We demonstrate the superior performance of CLLM in the low-data regime compared to conventional generators.
arXiv Detail & Related papers (2023-12-19T12:34:46Z)
- Filling the Missing: Exploring Generative AI for Enhanced Federated Learning over Heterogeneous Mobile Edge Devices [72.61177465035031]
We propose a generative AI-empowered federated learning to address these challenges by leveraging the idea of FIlling the MIssing (FIMI) portion of local data.
Experiment results demonstrate that FIMI can save up to 50% of the device-side energy to achieve the target global test accuracy.
arXiv Detail & Related papers (2023-10-21T12:07:04Z)
- A Deep Instance Generative Framework for MILP Solvers Under Limited Data Availability [66.37474135424637]
We propose G2MILP, the first deep generative framework for MILP instances.
G2MILP represents MILP instances as bipartite graphs, and applies a masked variational autoencoder to iteratively corrupt and replace parts of the original graphs to generate new ones.
We design a suite of benchmarks to evaluate the quality of the generated MILP instances.
arXiv Detail & Related papers (2023-10-04T13:34:34Z)
- FedADMM: A Robust Federated Deep Learning Framework with Adaptivity to System Heterogeneity [4.2059108111562935]
Federated Learning (FL) is an emerging framework for distributed processing of large data volumes by edge devices.
In this paper, we introduce FedADMM, a new FL protocol based on ADMM.
We show that FedADMM consistently outperforms all baseline methods in terms of communication efficiency.
arXiv Detail & Related papers (2022-04-07T15:58:33Z)
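G2MILP above represents MILP instances as bipartite variable-constraint graphs before applying its masked variational autoencoder. A minimal sketch of such an encoding (hypothetical helper, not the paper's code): one node per constraint carrying its right-hand side, one node per variable carrying its objective coefficient, and one edge per nonzero coefficient of A.

```python
def milp_to_bipartite(A, b, c):
    """Hypothetical sketch of a bipartite variable-constraint encoding:
    constraint node i stores b[i], variable node j stores c[j], and an
    edge (i, j, w) records each nonzero coefficient w = A[i][j]."""
    cons_nodes = [{"rhs": b[i]} for i in range(len(A))]
    var_nodes = [{"obj": c[j]} for j in range(len(c))]
    edges = [(i, j, A[i][j])
             for i in range(len(A)) for j in range(len(A[0]))
             if A[i][j] != 0]
    return cons_nodes, var_nodes, edges

cons, vars_, edges = milp_to_bipartite([[1, 0], [2, 3]], [4, 5], [1, 1])
# edges -> [(0, 0, 1), (1, 0, 2), (1, 1, 3)]
```

This lossless graph view is what makes graph-based generative models (and GNN solvers) applicable to MILP: permuting rows or columns of A only relabels nodes, so the representation respects the problem's natural symmetries.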
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.