Towards an Automatic Optimisation Model Generator Assisted with
Generative Pre-trained Transformer
- URL: http://arxiv.org/abs/2305.05811v1
- Date: Tue, 9 May 2023 23:51:14 GMT
- Title: Towards an Automatic Optimisation Model Generator Assisted with
Generative Pre-trained Transformer
- Authors: Boris Almonacid
- Abstract summary: This article presents a framework for generating optimisation models using a pre-trained generative transformer.
The framework involves specifying the features that the optimisation model should have and using a language model to generate an initial version of the model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article presents a framework for generating optimisation models using a
pre-trained generative transformer. The framework involves specifying the
features that the optimisation model should have and using a language model to
generate an initial version of the model. The model is then tested and
validated, and if it contains build errors, an automatic editing process is
triggered. An experiment was performed using MiniZinc as the target language
and two GPT-3.5 language models for generation and debugging. The results show
that the use of language models for the generation of optimisation models is
feasible, with some models satisfying the requested specifications, while
others require further refinement. The study provides promising evidence for
the use of language models in the modelling of optimisation problems and
suggests avenues for future research.
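As a rough illustration of the workflow the abstract describes, the sketch below implements a generate, check, repair loop in Python. It is only a minimal sketch under stated assumptions: the helper names (`check_minizinc`, `generate_model`, `ask_llm`) are invented for illustration, the language-model call is left abstract rather than tied to a specific GPT-3.5 API, and the MiniZinc `--model-check-only` flag is assumed to be available for detecting build errors; none of this is taken from the paper's actual implementation.

```python
# Minimal sketch of the generate-validate-repair loop described in the
# abstract. Helper names and the repair-round limit are illustrative
# assumptions, not details taken from the paper.
import subprocess
import tempfile
from pathlib import Path
from typing import Callable


def check_minizinc(model_text: str) -> tuple[bool, str]:
    """Try to compile a MiniZinc model and return (ok, compiler output).

    Assumes the `minizinc` CLI is on PATH; `--model-check-only` is assumed
    to check the model for build errors without solving it.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "model.mzn"
        path.write_text(model_text)
        proc = subprocess.run(
            ["minizinc", "--model-check-only", str(path)],
            capture_output=True, text=True,
        )
        return proc.returncode == 0, proc.stderr


def generate_model(spec: str, ask_llm: Callable[[str], str],
                   max_repair_rounds: int = 3) -> str:
    """Generate a MiniZinc model from a natural-language specification.

    `ask_llm` stands in for a call to a GPT-3.5-style language model; the
    second prompt plays the role of the automatic editing step.
    """
    model = ask_llm(f"Write a MiniZinc model with these features:\n{spec}")
    for _ in range(max_repair_rounds):
        ok, errors = check_minizinc(model)
        if ok:
            return model
        model = ask_llm(
            "The following MiniZinc model fails to compile.\n"
            f"Errors:\n{errors}\n\nModel:\n{model}\n\n"
            "Return a corrected version of the model."
        )
    return model  # returned as-is if it still does not compile
```

In this reading, the compiler's error messages are fed back to the language model and the loop stops once the model compiles or the round limit is reached; whether the resulting model also satisfies the requested specification is a separate validation question that the abstract leaves to testing.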
Related papers
- Can bidirectional encoder become the ultimate winner for downstream applications of foundation models? [1.8120356834558644]
Foundation models are characterised by pre-training, transfer learning, and self-supervised learning.
BERT broke through the limitation of purely unidirectional language modeling in pre-training by using a masked language model.
This article analyzes one-way and bidirectional models based on GPT and BERT and compares their differences based on the purpose of the model.
arXiv Detail & Related papers (2024-11-27T03:31:14Z)
- ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets [106.7760874400261]
This paper presents ML-SUPERB 2.0, a new benchmark for evaluating pre-trained SSL and supervised speech models.
We find performance improvements over the setup of ML-SUPERB, but performance depends on the downstream model design.
Also, we find large performance differences between languages and datasets, suggesting the need for more targeted approaches.
arXiv Detail & Related papers (2024-06-12T21:01:26Z)
- Generative Pre-training for Speech with Flow Matching [81.59952572752248]
We pre-trained a generative model, named SpeechFlow, on 60k hours of untranscribed speech with Flow Matching and masked conditions.
Experiment results show the pre-trained generative model can be fine-tuned with task-specific data to match or surpass existing expert models on speech enhancement, separation, and synthesis.
arXiv Detail & Related papers (2023-10-25T03:40:50Z)
- Artificial Interrogation for Attributing Language Models [0.0]
The challenge provides twelve open-sourced base versions of popular language models and twelve fine-tuned language models for text generation.
The goal of the contest is to identify which fine-tuned models originated from which base model.
We have employed four distinct approaches for measuring the resemblance between the responses generated from the models of both sets.
arXiv Detail & Related papers (2022-11-20T05:46:29Z)
- Investigating Ensemble Methods for Model Robustness Improvement of Text
Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate that there is no single model that works best in all cases.
By choosing an appropriate bias model, we can obtain better robustness than baselines that use more sophisticated model designs.
arXiv Detail & Related papers (2022-10-28T17:52:10Z)
- Composing Ensembles of Pre-trained Models via Iterative Consensus [95.10641301155232]
We propose a unified framework for composing ensembles of different pre-trained models.
We use pre-trained models as "generators" or "scorers" and compose them via closed-loop iterative consensus optimization.
We demonstrate that consensus achieved by an ensemble of scorers outperforms the feedback of a single scorer.
arXiv Detail & Related papers (2022-10-20T18:46:31Z)
- N-Grammer: Augmenting Transformers with latent n-grams [35.39961549040385]
We propose a simple yet effective modification to the Transformer architecture inspired by the literature in statistical language modeling, by augmenting the model with n-grams that are constructed from a discrete latent representation of the text sequence.
We evaluate our model, the N-Grammer, on language modeling on the C4 dataset and on text classification on the SuperGLUE dataset, and find that it outperforms several strong baselines such as the Transformer and the Primer.
arXiv Detail & Related papers (2022-07-13T17:18:02Z)
- DIRECTOR: Generator-Classifiers For Supervised Language Modeling [27.86870968048833]
Current language models achieve low perplexity but their resulting generations still suffer from toxic responses, repetitiveness and contradictions.
We introduce a new architecture, Director, which consists of a unified generator-classifier with both a language modeling head and a classification head for each output token.
arXiv Detail & Related papers (2022-06-15T17:44:08Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks as a sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) on average performance by a large margin in both few-shot and full-shot settings.
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- Model Selection for Cross-Lingual Transfer [15.197350103781739]
We propose a machine learning approach to model selection that uses the fine-tuned model's own internal representations to predict its cross-lingual capabilities.
In extensive experiments we find that this method consistently selects better models than English validation data across twenty-five languages.
arXiv Detail & Related papers (2020-10-13T02:36:48Z)
- Exploring Versatile Generative Language Model Via Parameter-Efficient
Transfer Learning [70.81910984985683]
We propose an effective way to fine-tune multiple downstream generation tasks simultaneously using a single, large pre-trained model.
Experiments on five diverse language generation tasks show that by using only an additional 2-3% of parameters for each task, our model can maintain or even improve the performance of fine-tuning the whole model.
arXiv Detail & Related papers (2020-04-08T06:18:44Z)