ModiGen: A Large Language Model-Based Workflow for Multi-Task Modelica Code Generation
- URL: http://arxiv.org/abs/2503.18460v1
- Date: Mon, 24 Mar 2025 09:04:49 GMT
- Title: ModiGen: A Large Language Model-Based Workflow for Multi-Task Modelica Code Generation
- Authors: Jiahui Xiang, Tong Ye, Peiyu Liu, Yinan Zhang, Wenhai Wang
- Abstract summary: Large language models (LLMs) have demonstrated promising capabilities in code generation, but their application to modeling remains largely unexplored. Our evaluation reveals substantial limitations in current LLMs, as the generated code often fails to simulate successfully. We propose a specialized workflow that integrates supervised fine-tuning, graph retrieval-augmented generation, and feedback optimization to improve the accuracy and reliability of Modelica code generation.
- Score: 26.965467452327445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modelica is a widely adopted language for simulating complex physical systems, yet effective model creation and optimization require substantial domain expertise. Although large language models (LLMs) have demonstrated promising capabilities in code generation, their application to modeling remains largely unexplored. To address this gap, we have developed benchmark datasets specifically designed to evaluate the performance of LLMs in generating Modelica component models and test cases. Our evaluation reveals substantial limitations in current LLMs, as the generated code often fails to simulate successfully. To overcome these challenges, we propose a specialized workflow that integrates supervised fine-tuning, graph retrieval-augmented generation, and feedback optimization to improve the accuracy and reliability of Modelica code generation. The evaluation results demonstrate significant performance gains: the maximum improvement in pass@1 reached 0.3349 for the component generation task and 0.2457 for the test case generation task. This research underscores the potential of LLMs to advance intelligent modeling tools and offers valuable insights for future developments in system modeling and engineering applications.
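A note on the reported metrics: pass@1 here is the probability that a single generated model simulates successfully. Below is a minimal sketch, assuming the standard unbiased pass@k estimator of Chen et al. (2021) together with a generic simulate-and-repair loop in the spirit of the feedback optimization described above; `generate`, `simulate`, and the repair prompt are hypothetical placeholders, not ModiGen's actual interface.

```python
import math
from typing import Callable, Tuple

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: samples generated per task, c: samples that simulate
    successfully, k: evaluation budget.
    pass@k = 1 - C(n-c, k) / C(n, k)
    """
    if n - c < k:
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

def generate_with_feedback(
    prompt: str,
    generate: Callable[[str], str],               # LLM call (assumed)
    simulate: Callable[[str], Tuple[bool, str]],  # -> (ok, error log), assumed
    max_rounds: int = 3,
) -> str:
    """Hypothetical feedback-optimization loop: generate Modelica code,
    attempt simulation, and feed diagnostics back for repair."""
    code = generate(prompt)
    for _ in range(max_rounds):
        ok, errors = simulate(code)
        if ok:
            break
        # Append simulator diagnostics so the next attempt can fix them.
        code = generate(
            f"{prompt}\n\nPrevious attempt:\n{code}\n"
            f"Simulation errors:\n{errors}\nPlease fix the model."
        )
    return code

# Example: 20 samples per task, 7 simulate successfully -> pass@1 = 0.35.
print(round(pass_at_k(n=20, c=7, k=1), 4))  # 0.35
```

Under this reading, the reported gains (e.g., 0.3349 for component generation) are the change in this per-task estimate, averaged over the benchmark, between the baseline and the full workflow.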
Related papers
- Efficient Model Selection for Time Series Forecasting via LLMs [52.31535714387368]
We propose to leverage Large Language Models (LLMs) as a lightweight alternative for model selection.
Our method eliminates the need for explicit performance matrices by utilizing the inherent knowledge and reasoning capabilities of LLMs.
arXiv Detail & Related papers (2025-04-02T20:33:27Z)
- Evaluating the Process Modeling Abilities of Large Language Models -- Preliminary Foundations and Results [1.3812010983144802]
Large language models (LLMs) have revolutionized the processing of natural language. It is currently under debate to what extent an LLM can generate good process models. We discuss these challenges in detail and outline future experiments to tackle them scientifically.
arXiv Detail & Related papers (2025-03-14T18:52:18Z)
- Learning to Solve and Verify: A Self-Play Framework for Code and Test Generation [69.62857948698436]
Recent advances in large language models (LLMs) have improved their performance on coding benchmarks. However, progress is plateauing due to the exhaustion of readily available high-quality data. We propose Sol-Ver, a self-play solver-verifier framework that jointly improves a single model's code and test generation capacity.
arXiv Detail & Related papers (2025-02-20T18:32:19Z)
- Applying Large Language Models in Knowledge Graph-based Enterprise Modeling: Challenges and Opportunities [0.0]
Large language models (LLMs) in enterprise modeling have recently begun to shift from academic research to industrial applications. In this paper we employ a knowledge graph-based approach for enterprise modeling and investigate the potential benefits of LLMs.
arXiv Detail & Related papers (2025-01-07T06:34:17Z)
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
- On the Modeling Capabilities of Large Language Models for Sequential Decision Making [52.128546842746246]
Large pretrained models show increasingly strong performance in reasoning and planning tasks.
We evaluate their ability to produce decision-making policies, either directly, by generating actions, or indirectly, via reward modeling.
In environments with unfamiliar dynamics, we explore how fine-tuning LLMs with synthetic data can significantly improve their reward modeling capabilities.
arXiv Detail & Related papers (2024-10-08T03:12:57Z)
- Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach [1.8874331450711404]
We propose a conceptual framework that combines modeling event logs, intelligent modeling assistants, and the generation of modeling operations.
In particular, the architecture comprises modeling components that help the designer specify the system, record its operation within a graphical modeling environment, and automatically recommend relevant operations.
arXiv Detail & Related papers (2024-08-26T13:26:44Z)
- UICoder: Finetuning Large Language Models to Generate User Interface Code through Automated Feedback [21.858896845159208]
Large language models (LLMs) struggle to consistently generate UI code that compiles and produces visually relevant designs.
Existing approaches to improve generation rely on expensive human feedback or distilling a proprietary model.
Our method starts with an existing LLM and iteratively produces improved models by self-generating a large synthetic dataset.
arXiv Detail & Related papers (2024-06-11T21:53:46Z)
- ORLM: A Customizable Framework in Training Large Models for Automated Optimization Modeling [15.67321902882617]
We introduce OR-Instruct, a semi-automated data synthesis framework for optimization modeling. We also introduce IndustryOR, the first industrial benchmark for evaluating LLMs in solving practical OR problems.
arXiv Detail & Related papers (2024-05-28T01:55:35Z)
- Enhancing Code Generation Performance of Smaller Models by Distilling the Reasoning Ability of LLMs [36.409470894115074]
We propose the CodePLAN framework, which aims to transfer LLMs' code generation reasoning capabilities to smaller models.
Our approach improves the smaller model's code generation performance by over 130% on the challenging APPS benchmark.
arXiv Detail & Related papers (2024-03-20T03:09:54Z)
- LLM-Assisted Code Cleaning For Training Accurate Code Generators [53.087019724256606]
We investigate data quality for code and find that making the code more structured and readable improves the system's code generation performance.
We build a novel data-cleaning pipeline that uses these principles to transform existing programs.
We evaluate our approach on two challenging algorithmic code generation benchmarks and find that fine-tuning CodeLLaMa-7B improves the performance by up to 30% compared to fine-tuning on the original dataset.
arXiv Detail & Related papers (2023-11-25T02:45:50Z)
- Quantitatively Assessing the Benefits of Model-driven Development in Agent-based Modeling and Simulation [80.49040344355431]
This paper compares the use of MDD and ABMS platforms in terms of effort and developer mistakes.
The results show that MDD4ABMS requires less effort to develop simulations with similar (and sometimes better) design quality than NetLogo.
arXiv Detail & Related papers (2020-06-15T23:29:04Z)