Related papers: NOMAD: A Multi-Agent LLM System for UML Class Diagram Generation from Natural Language Requirements

URL: http://arxiv.org/abs/2511.22409v1
Date: Thu, 27 Nov 2025 12:36:25 GMT
Title: NOMAD: A Multi-Agent LLM System for UML Class Diagram Generation from Natural Language Requirements
Authors: Polydoros Giannouris, Sophia Ananiadou
Abstract summary: Large Language Models (LLMs) are increasingly utilised in software engineering, yet their ability to generate structured artefacts such as diagrams remains underexplored. In this work we present NOMAD, a cognitively inspired, modular multi-agent framework that decomposes generation into a series of role-specialised subtasks. Each agent handles a distinct modelling activity, such as entity extraction, relationship classification, and diagram synthesis, mirroring the goal-directed reasoning processes of an engineer.
Score: 20.080985332719383
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) are increasingly utilised in software engineering, yet their ability to generate structured artefacts such as UML diagrams remains underexplored. In this work we present NOMAD, a cognitively inspired, modular multi-agent framework that decomposes UML generation into a series of role-specialised subtasks. Each agent handles a distinct modelling activity, such as entity extraction, relationship classification, and diagram synthesis, mirroring the goal-directed reasoning processes of an engineer. This decomposition improves interpretability and allows for targeted verification strategies. We evaluate NOMAD through a mixed design: a large case study (Northwind) for in-depth probing and error analysis, and human-authored UML exercises for breadth and realism. NOMAD outperforms all selected baselines, while revealing persistent challenges in fine-grained attribute extraction. Building on these observations, we introduce the first systematic taxonomy of errors in LLM-generated UML diagrams, categorising them as structural, relationship, and semantic/logical errors. Finally, we examine verification as a design probe, showing its mixed effects and outlining adaptive strategies as promising directions. Together, these contributions position NOMAD as both an effective framework for UML class diagram generation and a lens onto the broader research challenges of reliable language-to-model workflows.
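The role-specialised decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `DiagramState` container, the line-based response formats, and the canned `fake_llm` stand-in are all hypothetical, chosen only to show how entity extraction, relationship classification, a verification step, and diagram synthesis might be chained. The output uses PlantUML class-diagram syntax as one possible target format.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical model interface: (role, requirements text) -> raw model output.
LLM = Callable[[str, str], str]

# PlantUML arrows for the relationship kinds used in this sketch.
ARROWS = {"association": "-->", "composition": "*--", "inheritance": "--|>"}


@dataclass
class DiagramState:
    """Shared state that each agent reads from and writes to."""
    requirements: str
    classes: dict[str, list[str]] = field(default_factory=dict)
    relationships: list[tuple[str, str, str]] = field(default_factory=list)


def extract_entities(state: DiagramState, llm: LLM) -> None:
    # Assumed response format: one "ClassName: attr1, attr2" entry per line.
    for line in llm("entities", state.requirements).splitlines():
        name, _, attrs = line.partition(":")
        state.classes[name.strip()] = [a.strip() for a in attrs.split(",") if a.strip()]


def classify_relationships(state: DiagramState, llm: LLM) -> None:
    # Assumed response format: one "Source kind Target" triple per line.
    for line in llm("relationships", state.requirements).splitlines():
        source, kind, target = line.split()
        state.relationships.append((source, kind, target))


def verify(state: DiagramState) -> None:
    # Targeted check: keep only relationships between known classes
    # whose kind has a known arrow.
    state.relationships = [
        (s, k, t) for s, k, t in state.relationships
        if s in state.classes and t in state.classes and k in ARROWS
    ]


def synthesise(state: DiagramState) -> str:
    # Render the accumulated classes and relationships as PlantUML text.
    lines = ["@startuml"]
    for name, attrs in state.classes.items():
        lines.append(f"class {name} {{")
        lines.extend(f"  {attr}" for attr in attrs)
        lines.append("}")
    for s, k, t in state.relationships:
        lines.append(f"{s} {ARROWS[k]} {t}")
    lines.append("@enduml")
    return "\n".join(lines)


def run_pipeline(requirements: str, llm: LLM) -> str:
    state = DiagramState(requirements)
    extract_entities(state, llm)
    classify_relationships(state, llm)
    verify(state)
    return synthesise(state)


def fake_llm(role: str, requirements: str) -> str:
    # Canned responses standing in for real model calls (illustration only).
    canned = {
        "entities": "Customer: name, email\nOrder: orderDate",
        "relationships": "Customer association Order\nOrder composition Invoice",
    }
    return canned[role]


diagram = run_pipeline("Each customer places orders.", fake_llm)
print(diagram)
```

In this sketch the verification step drops the `Order composition Invoice` relationship because `Invoice` was never extracted as a class, which is one simple way a per-stage check can target a specific error type.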

Related papers

Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition [51.68340973140949]
Grounded Multimodal Named Entity Recognition (GMNER) aims to extract text-based entities, assign them semantic categories, and ground them to corresponding visual regions. MLLMs exhibit modality bias, including visual bias and textual bias, which stems from their tendency to take unimodal shortcuts. We propose Modality-aware Consistency Reasoning (MCR), which enforces structured cross-modal reasoning.
arXiv Detail & Related papers (2026-02-04T12:12:49Z)
Empowering LLMs for Structure-Based Drug Design via Exploration-Augmented Latent Inference [5.052013621974765]
Large Language Models (LLMs) possess strong representation and reasoning capabilities, but their application to structure-based drug design (SBDD) is limited by insufficient understanding of protein structures and unpredictable molecular generation. We propose Exploration-Augmented Latent Inference for LLMs (ELILLM), a framework that reinterprets the LLM generation process as an encoding, latent space exploration, and decoding workflow. ELILLM explicitly explores portions of the design problem beyond the model's current knowledge while using a decoding module to handle familiar regions, generating chemically valid and synthetically reasonable molecules.
arXiv Detail & Related papers (2026-01-20T08:10:48Z)
Beyond Isolated Dots: Benchmarking Structured Table Construction as Deep Knowledge Extraction [80.88654868264645]
The Arranged and Organized Extraction (AOE) Benchmark is designed to evaluate the ability of large language models to comprehend fragmented documents. AOE includes 11 carefully crafted tasks across three diverse domains, requiring models to generate context-specific schema tailored to varied input queries. Results show that even the most advanced models struggled significantly.
arXiv Detail & Related papers (2025-07-22T06:37:51Z)
Behavioral Augmentation of UML Class Diagrams: An Empirical Study of Large Language Models for Method Generation [0.0]
This study evaluates nine large language models (LLMs) in augmenting a methodless diagram (21 classes, 17 relationships) using 21 structured waste-management use cases. A total of 90 diagrams (3,373 methods) were assessed across six iterations.
arXiv Detail & Related papers (2025-06-01T02:33:40Z)
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering [57.156093929365255]
MLE-Dojo is a Gym-style framework for systematically training, evaluating, and improving autonomous large language model (LLM) agents. MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios. Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning.
arXiv Detail & Related papers (2025-05-12T17:35:43Z)
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks [26.992997870540435]
GraphOmni is a benchmark to evaluate the reasoning capabilities of LLMs on graph-theoretic tasks articulated in natural language. We identify critical interactions among graph types, serialization formats, and prompting schemes, demonstrating their substantial impact on model performance. We propose a reinforcement learning-inspired framework that adaptively selects the optimal factors influencing LLM reasoning capabilities.
arXiv Detail & Related papers (2025-04-17T09:01:16Z)
Unified Modeling Language Code Generation from Diagram Images Using Multimodal Large Language Models [0.41942958779358674]
This paper proposes a new approach to automatically generate code using a large multimodal language model. Domain-adapted MM-LLMs are evaluated for code generation automation; the best model achieved BLEU and SSIM scores of 0.779 and 0.942 on sequence diagrams.
arXiv Detail & Related papers (2025-03-15T23:20:26Z)
Benchmarking Agentic Workflow Generation [80.74757493266057]
We introduce WorfBench, a unified workflow generation benchmark with multi-faceted scenarios and intricate graph workflow structures. We also present WorfEval, a systemic evaluation protocol utilizing subsequence and subgraph matching algorithms. We observe that the generated workflows can enhance downstream tasks, enabling them to achieve superior performance with less time during inference.
arXiv Detail & Related papers (2024-10-10T12:41:19Z)
CoMMIT: Coordinated Multimodal Instruction Tuning [90.1532838391285]
Multimodal large language models (MLLMs) generally involve cooperative learning between a backbone LLM and a feature encoder of non-text input modalities. In this paper, we analyze the MLLM instruction tuning from both theoretical and empirical perspectives. We propose a Multimodal Balance Coefficient that enables quantitative measurement of the balance of learning.
arXiv Detail & Related papers (2024-07-29T23:18:55Z)
Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs. We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer. This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z)
LLMs Understand Glass-Box Models, Discover Surprises, and Suggest Repairs [10.222281712562705]
We show that large language models (LLMs) are remarkably good at working with interpretable models. By adopting a hierarchical approach to reasoning, LLMs can provide comprehensive model-level summaries. We present the package TalkToEBM as an open-source LLM-GAM interface.
arXiv Detail & Related papers (2023-08-02T13:59:35Z)
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models [74.22729793816451]
Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability. We propose CREATOR, a novel framework that enables LLMs to create their own tools using documentation and code realization. We evaluate CREATOR on MATH and TabMWP benchmarks, respectively consisting of challenging math competition problems.
arXiv Detail & Related papers (2023-05-23T17:51:52Z)
MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays. We provide a new perspective to the working mechanism of MAML and discover that: MAML is analogous to a meta-learner using a supervised contrastive objective function. We propose a simple but effective technique, zeroing trick, to alleviate such interference.
arXiv Detail & Related papers (2021-06-29T12:52:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.