Accurate and Consistent Graph Model Generation from Text with Large Language Models
- URL: http://arxiv.org/abs/2508.00255v1
- Date: Fri, 01 Aug 2025 01:52:25 GMT
- Title: Accurate and Consistent Graph Model Generation from Text with Large Language Models
- Authors: Boqi Chen, Ou Wei, Bingzhou Zheng, Gunter Mussbacher
- Abstract summary: Graph model generation from natural language description is an important task with many applications in software engineering. With the rise of large language models (LLMs), there is a growing interest in using LLMs for graph model generation. We propose a novel abstraction-concretization framework that enhances the consistency and quality of generated graph models.
- Score: 1.9049294570026933
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph model generation from natural language description is an important task with many applications in software engineering. With the rise of large language models (LLMs), there is a growing interest in using LLMs for graph model generation. Nevertheless, LLM-based graph model generation typically produces partially correct models that suffer from three main issues: (1) syntax violations: the generated model may not adhere to the syntax defined by its metamodel, (2) constraint inconsistencies: the structure of the model might not conform to some domain-specific constraints, and (3) inaccuracy: due to the inherent uncertainty in LLMs, the models can include inaccurate, hallucinated elements. While the first issue is often addressed through techniques such as constraint decoding or filtering, the latter two remain largely unaddressed. Motivated by recent self-consistency approaches in LLMs, we propose a novel abstraction-concretization framework that enhances the consistency and quality of generated graph models by considering multiple outputs from an LLM. Our approach first constructs a probabilistic partial model that aggregates all candidate outputs and then refines this partial model into the most appropriate concrete model that satisfies all constraints. We evaluate our framework on several popular open-source and closed-source LLMs using diverse datasets for model generation tasks. The results demonstrate that our approach significantly improves both the consistency and quality of the generated graph models.
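To make the aggregate-then-refine pattern described in the abstract concrete, here is a minimal sketch, not the authors' implementation: several candidate graph models sampled from an LLM are represented as edge sets, aggregated into a probabilistic partial model (edge frequencies), and then concretized by greedily keeping the most probable edges that do not violate a domain constraint. The edge representation, the example constraint, the threshold, and all names are hypothetical.

```python
from collections import Counter

# Hypothetical representation: a candidate graph model is a set of
# (source, relation, target) triples sampled from the LLM.

def build_partial_model(candidates):
    """Aggregate candidate edge sets into a probabilistic partial model
    mapping each edge to its relative frequency across candidates."""
    counts = Counter(edge for cand in candidates for edge in set(cand))
    return {edge: count / len(candidates) for edge, count in counts.items()}

def violates_constraints(edges):
    """Placeholder domain constraint: each source node has at most one
    'supervisor' edge (stand-in for a metamodel multiplicity constraint)."""
    sources = [src for (src, rel, _) in edges if rel == "supervisor"]
    return len(sources) != len(set(sources))

def concretize(partial_model, threshold=0.5):
    """Greedily add edges in order of decreasing probability, skipping any
    edge that would break the constraint or fall below the threshold."""
    concrete = set()
    for edge, prob in sorted(partial_model.items(), key=lambda kv: -kv[1]):
        if prob < threshold:
            break
        if not violates_constraints(concrete | {edge}):
            concrete.add(edge)
    return concrete

# Three hypothetical LLM samples for the same description:
candidates = [
    {("Student", "supervisor", "Professor"), ("Student", "enrolledIn", "Course")},
    {("Student", "supervisor", "Professor"), ("Professor", "teaches", "Course")},
    {("Student", "supervisor", "Lecturer"), ("Student", "enrolledIn", "Course")},
]
partial = build_partial_model(candidates)
print(concretize(partial))  # keeps the high-frequency, constraint-consistent edges
```

The paper's actual framework operates over metamodel-conformant models and richer domain constraints; this sketch only illustrates the idea of aggregating multiple LLM outputs into a probabilistic partial model and refining it into a consistent concrete model.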
Related papers
- GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching [41.96482857947199]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation. LLMs typically come with a substantial model size, which presents significant challenges in deployment and inference. We develop a novel strategy to compress models by strategically combining or merging layers from finetuned model variants.
arXiv Detail & Related papers (2025-06-25T14:24:59Z)
- CP-Bench: Evaluating Large Language Models for Constraint Modelling [6.273426548149088]
Constraint Programming (CP) is a well-suited problem-solving paradigm, but its core process, namely constraint modelling, is a bottleneck for wider adoption. Recent studies have explored using Large Language Models (LLMs) as modelling assistants, transforming problem descriptions to executable constraint models. This work addresses the gap by introducing CP-Bench, a novel benchmark dataset that includes a diverse set of well-known problem classes.
arXiv Detail & Related papers (2025-06-06T12:56:02Z)
- Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging [103.98582374569789]
Model merging aims to combine multiple expert models into a single model, thereby reducing storage and serving costs. Previous studies have primarily focused on merging visual classification models or Large Language Models (LLMs) for code and math tasks. We introduce the model merging benchmark for MLLMs, which includes multiple tasks such as VQA, Geometry, Chart, OCR, and Grounding, providing both LoRA and full fine-tuning models.
arXiv Detail & Related papers (2025-05-26T12:23:14Z)
- Matchmaker: Self-Improving Large Language Model Programs for Schema Matching [60.23571456538149]
We propose a compositional language model program for schema matching, composed of candidate generation, refinement, and confidence scoring.
Matchmaker self-improves in a zero-shot manner without the need for labeled demonstrations.
Empirically, we demonstrate on real-world medical schema matching benchmarks that Matchmaker outperforms previous ML-based approaches.
arXiv Detail & Related papers (2024-10-31T16:34:03Z)
- Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild [84.57103623507082]
This paper introduces Model-GLUE, a holistic Large Language Model scaling guideline. We benchmark existing scaling techniques, especially selective merging, and variants of mixture. We then formulate an optimal strategy for the selection and aggregation of a heterogeneous model zoo. Our methodology involves clustering mergeable models, selecting the optimal merging strategy, and integrating the clusters.
arXiv Detail & Related papers (2024-10-07T15:55:55Z)
- Unlocking the Potential of Model Merging for Low-Resource Languages [66.7716891808697]
Adapting large language models to new languages typically involves continual pre-training (CT) followed by supervised fine-tuning (SFT).
We propose model merging as an alternative for low-resource languages, combining models with distinct capabilities into a single model without additional training.
Experiments based on Llama-2-7B demonstrate that model merging effectively endows LLMs for low-resource languages with task-solving abilities, outperforming CT-then-SFT in scenarios with extremely scarce data.
arXiv Detail & Related papers (2024-07-04T15:14:17Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Entity Matching using Large Language Models [3.7277730514654555]
This paper investigates using generative large language models (LLMs) as an alternative to PLM-based matchers that depends less on task-specific training data.
We show that GPT-4 can generate structured explanations for matching decisions and can automatically identify potential causes of matching errors.
arXiv Detail & Related papers (2023-10-17T13:12:32Z)
- Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains.
In this paper, we introduce how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)