Text2Model: Generating dynamic chemical reactor models using large language models (LLMs)
- URL: http://arxiv.org/abs/2503.17004v1
- Date: Fri, 21 Mar 2025 10:09:34 GMT
- Title: Text2Model: Generating dynamic chemical reactor models using large language models (LLMs)
- Authors: Sophia Rupprecht, Yassine Hounat, Monisha Kumar, Giacomo Lastrucci, Artur M. Schweidtmann
- Abstract summary: We generate dynamic chemical reactor models in Modelica code format from textual descriptions as user input. We fine-tune Llama 3.1 8B Instruct on synthetically generated Modelica code for different reactor scenarios.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As large language models have shown remarkable capabilities in conversing via natural language, the question arises as to how LLMs could potentially assist chemical engineers in research and industry with domain-specific tasks. We generate dynamic chemical reactor models in Modelica code format from textual descriptions as user input. We fine-tune Llama 3.1 8B Instruct on synthetically generated Modelica code for different reactor scenarios. We compare the performance of our fine-tuned model to the baseline Llama 3.1 8B Instruct model and GPT-4o. We manually assess the models' predictions regarding the syntactic and semantic accuracy of the generated dynamic models. We find that considerable improvements are achieved by the fine-tuned model with respect to both the semantic and the syntactic accuracy of the Modelica models. However, the fine-tuned model lacks a satisfactory ability to generalize to unseen scenarios compared to GPT-4o.
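The abstract describes fine-tuning Llama 3.1 8B Instruct on synthetic (textual description, Modelica code) pairs, but no training code appears in this listing. Below is a minimal sketch of such a setup, assuming a Hugging Face transformers/trl/peft stack with LoRA adapters; the reactor description, the toy Modelica CSTR model, the prompt template, and all hyperparameters are illustrative placeholders, not taken from the paper.

```python
# Minimal sketch (not the authors' code): supervised fine-tuning of
# Llama 3.1 8B Instruct on synthetic description -> Modelica pairs.
from datasets import Dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Placeholder training pair: a textual reactor description and a toy
# Modelica model (isothermal CSTR with a first-order reaction A -> B).
pairs = [
    {
        "description": "An isothermal, constant-volume CSTR in which "
                       "species A is consumed by a first-order reaction A -> B.",
        "modelica": (
            "model CSTR\n"
            "  parameter Real k = 0.5 \"Rate constant (1/s)\";\n"
            "  parameter Real tau = 10 \"Residence time (s)\";\n"
            "  parameter Real cA_in = 1.0 \"Inlet concentration (mol/L)\";\n"
            "  Real cA(start = 0) \"Concentration of A (mol/L)\";\n"
            "equation\n"
            "  der(cA) = (cA_in - cA)/tau - k*cA;\n"
            "end CSTR;"
        ),
    },
]

# Join each pair into a single training string; the template is an assumption.
dataset = Dataset.from_list(
    [{"text": f"### Description:\n{p['description']}\n### Modelica:\n{p['modelica']}"}
     for p in pairs]
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",  # base model named in the abstract
    train_dataset=dataset,
    args=SFTConfig(output_dir="llama31-modelica-sft", max_seq_length=2048),
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),  # illustrative values
)
trainer.train()
```

In practice the paper's synthetic dataset covers many reactor scenarios; a fine-tuned checkpoint like this would then be compared against the base model and GPT-4o on held-out descriptions, as the abstract outlines.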
Related papers
- Using (Not so) Large Language Models for Generating Simulation Models in a Formal DSL -- A Study on Reaction Networks [0.0]
We evaluate how a Large Language Model might be used for formalizing natural language into simulation models. We develop a synthetic data generator to serve as the basis for fine-tuning and evaluation. Our evaluation shows that our fine-tuned Mistral model can recover the ground truth simulation model in up to 84.5% of cases.
arXiv Detail & Related papers (2025-03-03T15:48:01Z)
- Scalable Language Models with Posterior Inference of Latent Thought Vectors [52.63299874322121]
Latent-Thought Language Models (LTMs) incorporate explicit latent thought vectors that follow an explicit prior model in latent space. LTMs possess additional scaling dimensions beyond traditional LLMs, yielding a structured design space. LTMs significantly outperform conventional autoregressive models and discrete diffusion models in validation perplexity and zero-shot language modeling.
arXiv Detail & Related papers (2025-02-03T17:50:34Z)
- Scaling Diffusion Language Models via Adaptation from Autoregressive Models [105.70889434492143]
Diffusion Language Models (DLMs) have emerged as a promising new paradigm for text generative modeling.
We show that we can convert AR models ranging from 127M to 7B parameters into diffusion models DiffuGPT and DiffuLLaMA, using less than 200B tokens for training.
Our experimental results reveal that these models outperform earlier DLMs and are competitive with their AR counterparts.
arXiv Detail & Related papers (2024-10-23T14:04:22Z)
- Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines [74.42485647685272]
We focus on Generative Masked Language Models (GMLMs).
We train a model to fit conditional probabilities of the data distribution via masking, which are subsequently used as inputs to a Markov Chain to draw samples from the model.
We adapt the T5 model for iteratively-refined parallel decoding, achieving 2-3x speedup in machine translation with minimal sacrifice in quality.
arXiv Detail & Related papers (2024-07-22T18:00:00Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Language-Guided World Models: A Model-Based Approach to AI Control [31.9089380929602]
This paper introduces the concept of Language-Guided World Models (LWMs).
LWMs are probabilistic models that can simulate environments by reading texts.
We take initial steps in developing robust LWMs that can generalize to compositionally novel language descriptions.
arXiv Detail & Related papers (2024-01-24T03:11:36Z)
- Reimagining Retrieval Augmented Language Models for Answering Queries [23.373952699385427]
We present a reality check on large language models and inspect the promise of retrieval augmented language models in comparison.
Such language models are semi-parametric, where models integrate model parameters and knowledge from external data sources to make their predictions.
arXiv Detail & Related papers (2023-06-01T18:08:51Z)
- Towards an Automatic Optimisation Model Generator Assisted with Generative Pre-trained Transformer [0.0]
This article presents a framework for generating optimisation models using a pre-trained generative transformer.
The framework involves specifying the features that the optimisation model should have and using a language model to generate an initial version of the model.
arXiv Detail & Related papers (2023-05-09T23:51:14Z)
- Composing Ensembles of Pre-trained Models via Iterative Consensus [95.10641301155232]
We propose a unified framework for composing ensembles of different pre-trained models.
We use pre-trained models as "generators" or "scorers" and compose them via closed-loop iterative consensus optimization.
We demonstrate that consensus achieved by an ensemble of scorers outperforms the feedback of a single scorer.
arXiv Detail & Related papers (2022-10-20T18:46:31Z)
- Efficient Large Scale Language Modeling with Mixtures of Experts [61.45159383372181]
Mixture of Experts layers (MoEs) enable efficient scaling of language models through conditional computation.
This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings.
arXiv Detail & Related papers (2021-12-20T17:05:11Z)
- Towards Trustworthy Deception Detection: Benchmarking Model Robustness across Domains, Modalities, and Languages [10.131671217810581]
We evaluate model robustness to out-of-domain data, modality-specific features, and languages other than English.
We find that with additional image content as input, ELMo embeddings yield significantly fewer errors compared to BERT or GloVe.
arXiv Detail & Related papers (2021-04-23T18:05:52Z)