Related papers: Assessing the Business Process Modeling Competences of Large Language Models

Assessing the Business Process Modeling Competences of Large Language Models

URL: http://arxiv.org/abs/2601.21787v1
Date: Thu, 29 Jan 2026 14:34:20 GMT
Title: Assessing the Business Process Modeling Competences of Large Language Models
Authors: Chantale Lauer, Peter Pfeiffer, Alexander Rombach, Nijat Mehdiyev,
Abstract summary: Large language models (LLMs) have significantly expanded the possibilities for generating Business Process Model and Notation (BPMN) models directly from natural language.<n>We introduce BEF4LLM, a novel evaluation framework comprising four perspectives: syntactic quality, pragmatic quality, semantic quality, and validity.<n>Using BEF4LLM, we conduct a comprehensive analysis of open-source LLMs and benchmark their performance against human modeling experts.
Score: 40.495149980011924
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The creation of Business Process Model and Notation (BPMN) models is a complex and time-consuming task requiring both domain knowledge and proficiency in modeling conventions. Recent advances in large language models (LLMs) have significantly expanded the possibilities for generating BPMN models directly from natural language, building upon earlier text-to-process methods with enhanced capabilities in handling complex descriptions. However, there is a lack of systematic evaluations of LLM-generated process models. Current efforts either use LLM-as-a-judge approaches or do not consider established dimensions of model quality. To this end, we introduce BEF4LLM, a novel LLM evaluation framework comprising four perspectives: syntactic quality, pragmatic quality, semantic quality, and validity. Using BEF4LLM, we conduct a comprehensive analysis of open-source LLMs and benchmark their performance against human modeling experts. Results indicate that LLMs excel in syntactic and pragmatic quality, while humans outperform in semantic aspects; however, the differences in scores are relatively modest, highlighting LLMs' competitive potential despite challenges in validity and semantic quality. The insights highlight current strengths and limitations of using LLMs for BPMN modeling and guide future model development and fine-tuning. Addressing these areas is essential for advancing the practical deployment of LLMs in business process modeling.

Related papers

Evaluation of LLMs for Process Model Analysis and Optimization [0.0]
We show that a vanilla, untrained LLM like ChatGPT (model o3) in a zero-shot setting is effective in understanding BPMN process models from images.<n>We also study the LLM's "thought process" and ability to perform deeper reasoning in the context of process analysis and optimization.
arXiv Detail & Related papers (2025-10-08T19:39:19Z)
Evaluating the Process Modeling Abilities of Large Language Models -- Preliminary Foundations and Results [1.3812010983144802]
Large language models (LLM) have revolutionized the processing of natural language.<n>It is currently under debate to what extent an LLM can generate good process models.<n>We discuss these challenges in detail and discuss future experiments to tackle these challenges scientifically.
arXiv Detail & Related papers (2025-03-14T18:52:18Z)
A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models [74.48084001058672]
The rise of foundation models has transformed machine learning research.<n> multimodal foundation models (MMFMs) pose unique interpretability challenges beyond unimodal frameworks.<n>This survey explores two key aspects: (1) the adaptation of LLM interpretability methods to multimodal models and (2) understanding the mechanistic differences between unimodal language models and crossmodal systems.
arXiv Detail & Related papers (2025-02-22T20:55:26Z)
Applying Large Language Models in Knowledge Graph-based Enterprise Modeling: Challenges and Opportunities [0.0]
Large language models (LLMs) in enterprise modeling have recently started to shift from academic research to that of industrial applications.<n>In this paper we employ a knowledge graph-based approach for enterprise modeling and investigate the potential benefits of LLMs.
arXiv Detail & Related papers (2025-01-07T06:34:17Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding LLMs decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
LLMs for Mathematical Modeling: Towards Bridging the Gap between Natural and Mathematical Languages [14.04286044600141]
Large Language Models (LLMs) have demonstrated strong performance across various natural language processing tasks.<n>But their proficiency in mathematical reasoning remains a key challenge.<n>We propose a process-oriented framework to evaluate LLMs' ability to construct mathematical models.
arXiv Detail & Related papers (2024-05-21T18:29:54Z)
Process Modeling With Large Language Models [42.0652924091318]
This paper explores the integration of Large Language Models (LLMs) into process modeling. We propose a framework that leverages LLMs for the automated generation and iterative refinement of process models. Preliminary results demonstrate the framework's ability to streamline process modeling tasks.
arXiv Detail & Related papers (2024-03-12T11:27:47Z)
Towards Modeling Learner Performance with Large Language Models [7.002923425715133]
This paper investigates whether the pattern recognition and sequence modeling capabilities of LLMs can be extended to the domain of knowledge tracing. We compare two approaches to using LLMs for this task, zero-shot prompting and model fine-tuning, with existing, non-LLM approaches to knowledge tracing. While LLM-based approaches do not achieve state-of-the-art performance, fine-tuned LLMs surpass the performance of naive baseline models and perform on par with standard Bayesian Knowledge Tracing approaches.
arXiv Detail & Related papers (2024-02-29T14:06:34Z)
Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs) We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z)
Model Composition for Multimodal Large Language Models [71.5729418523411]
We propose a new paradigm through the model composition of existing MLLMs to create a new model that retains the modal understanding capabilities of each original model. Our basic implementation, NaiveMC, demonstrates the effectiveness of this paradigm by reusing modality encoders and merging LLM parameters.
arXiv Detail & Related papers (2024-02-20T06:38:10Z)
Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP) What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining. How the model's world knowledge interacts with the factual information presented in the context remains under explored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.