Towards Foundation Models for Mixed Integer Linear Programming
- URL: http://arxiv.org/abs/2410.08288v1
- Date: Thu, 10 Oct 2024 18:20:44 GMT
- Title: Towards Foundation Models for Mixed Integer Linear Programming
- Authors: Sirui Li, Janardhan Kulkarni, Ishai Menache, Cathy Wu, Beibin Li
- Abstract summary: Current deep learning approaches for MILP focus on specific problem classes and do not generalize to unseen classes.
We introduce MILP-Evolve, a novel evolutionary framework that is capable of generating a large set of diverse MILP classes.
Our empirical results show that models trained on the data generated by MILP-Evolve achieve significant improvements on unseen problems.
- Score: 15.064109397239086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mixed Integer Linear Programming (MILP) is essential for modeling complex decision-making problems but faces challenges in computational tractability and requires expert formulation. Current deep learning approaches for MILP focus on specific problem classes and do not generalize to unseen classes. To address this shortcoming, we take a foundation model training approach, where we train a single deep learning model on a diverse set of MILP problems to generalize across problem classes. As existing datasets for MILP lack diversity and volume, we introduce MILP-Evolve, a novel LLM-based evolutionary framework that is capable of generating a large set of diverse MILP classes with an unlimited amount of instances. We study our methodology on three key learning tasks that capture diverse aspects of MILP: (1) integrality gap prediction, (2) learning to branch, and (3) a new task of aligning MILP instances with natural language descriptions. Our empirical results show that models trained on the data generated by MILP-Evolve achieve significant improvements on unseen problems, including MIPLIB benchmarks. Our work highlights the potential of moving towards a foundation model approach for MILP that can generalize to a broad range of MILP applications. We are committed to fully open-sourcing our work to advance further research.
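To make the first learning task concrete, the sketch below computes an integrality gap for a tiny 0/1 knapsack instance, a simple MILP. It is illustrative only and is not taken from the paper: the function names, the example instance, and the relative-gap normalization (LP - IP) / |LP| are assumptions, and the paper may define the gap differently.

```python
from itertools import product


def knapsack_ip(values, weights, capacity):
    """Exact 0/1 knapsack optimum by enumeration (only viable for tiny instances)."""
    best = 0.0
    for x in product((0, 1), repeat=len(values)):
        if sum(w * xi for w, xi in zip(weights, x)) <= capacity:
            best = max(best, sum(v * xi for v, xi in zip(values, x)))
    return best


def knapsack_lp(values, weights, capacity):
    """Optimum of the LP relaxation (0 <= x_i <= 1), solved greedily by value/weight ratio.

    Assumes strictly positive weights.
    """
    remaining = capacity
    total = 0.0
    items = sorted(zip(values, weights), key=lambda p: p[0] / p[1], reverse=True)
    for v, w in items:
        take = min(1.0, remaining / w)  # fraction of this item that still fits
        total += take * v
        remaining -= take * w
        if remaining <= 0:
            break
    return total


def integrality_gap(lp_value, ip_value):
    """Relative gap between relaxation and integer optimum (assumed normalization)."""
    return (lp_value - ip_value) / abs(lp_value)


# Hypothetical instance chosen for illustration (a maximization problem).
values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
ip = knapsack_ip(values, weights, capacity)
lp = knapsack_lp(values, weights, capacity)
gap = integrality_gap(lp, ip)
```

For this instance the integer optimum is 220 and the LP relaxation reaches 240, so the relative gap is about 0.083. A learned predictor for this task would estimate such a quantity from the instance alone, without solving the integer program.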
Related papers
- PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models [30.909294336713845]
Multimodal Large Language Models (MLLMs) have demonstrated remarkable advancements in tasks such as visual question answering, visual understanding, and reasoning.
However, this impressive progress relies on vast amounts of data collected from the internet, raising significant concerns about privacy and security.
Machine unlearning (MU) has emerged as a promising solution, enabling the removal of specific knowledge from an already trained model without requiring retraining from scratch.
arXiv Detail & Related papers (2025-03-16T15:26:20Z)
- A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications [7.414857515253022]
We introduce the basic design of MoE, including gating functions, expert networks, routing mechanisms, training strategies, and system design.
We then explore the algorithm design of MoE in important machine learning paradigms such as continual learning, meta-learning, multi-task learning, and reinforcement learning.
arXiv Detail & Related papers (2025-03-10T10:08:55Z)
- Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications.
The dataset includes a diverse set of multi-hop questions, including true/false and multiple-choice types, spanning varying difficulty levels from easy to hard.
We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
arXiv Detail & Related papers (2025-01-16T16:19:53Z)
- Multi-task Representation Learning for Mixed Integer Linear Programming [13.106799330951842]
This paper introduces the first multi-task learning framework for ML-guided MILP solving.
We demonstrate that our multi-task learning model performs similarly to specialized models within the same distribution.
It significantly outperforms them in generalization across problem sizes and tasks.
arXiv Detail & Related papers (2024-12-18T23:33:32Z)
- RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training [55.54020926284334]
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
Retrieval augmentation techniques have proven to be effective plugins for both LLMs and MLLMs.
In this study, we propose multimodal adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training (RA-BLIP), a novel retrieval-augmented framework for various MLLMs.
arXiv Detail & Related papers (2024-10-18T03:45:19Z)
- NVLM: Open Frontier-Class Multimodal LLMs [64.00053046838225]
We introduce NVLM 1.0, a family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks.
We propose a novel architecture that enhances both training efficiency and multimodal reasoning capabilities.
We develop production-grade multimodality for the NVLM-1.0 models, enabling them to excel in vision-language tasks.
arXiv Detail & Related papers (2024-09-17T17:59:06Z)
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient empowerment technique in the machine learning community.
There is a significant gap in the literature regarding a systematic and thorough review of these techniques.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
- LLAVADI: What Matters For Multimodal Large Language Models Distillation [77.73964744238519]
In this work, we do not propose a new efficient model structure or train small-scale MLLMs from scratch.
Our studies involve training strategies, model choices, and distillation algorithms in the knowledge distillation process.
With a proper strategy, evaluated across different benchmarks, even a small 2.7B model can perform on par with larger models with 7B or 13B parameters.
arXiv Detail & Related papers (2024-07-28T06:10:47Z)
- HEMM: Holistic Evaluation of Multimodal Foundation Models [91.60364024897653]
Multimodal foundation models can holistically process text alongside images, video, audio, and other sensory modalities.
It is challenging to characterize and study progress in multimodal foundation models, given the range of possible modeling decisions, tasks, and domains.
arXiv Detail & Related papers (2024-07-03T18:00:48Z)
- From Efficient Multimodal Models to World Models: A Survey [28.780451336834876]
Multimodal Large Models (MLMs) are becoming a significant research focus, combining powerful language models with multimodal learning.
This review explores the latest developments and challenges in MLMs, emphasizing their potential in achieving artificial general intelligence.
arXiv Detail & Related papers (2024-06-27T15:36:43Z)
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- Solution-oriented Agent-based Models Generation with Verifier-assisted Iterative In-context Learning [10.67134969207797]
Agent-based models (ABMs) stand as an essential paradigm for proposing and validating hypothetical solutions or policies.
Large language models (LLMs) encapsulating cross-domain knowledge and programming proficiency could potentially alleviate the difficulty of this process.
We present SAGE, a general solution-oriented ABM generation framework designed for automatic modeling and generating solutions for targeted problems.
arXiv Detail & Related papers (2024-02-04T07:59:06Z)
- MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)