Investigating the Role of LLMs Hyperparameter Tuning and Prompt Engineering to Support Domain Modeling
- URL: http://arxiv.org/abs/2507.14735v1
- Date: Sat, 19 Jul 2025 19:49:58 GMT
- Title: Investigating the Role of LLMs Hyperparameter Tuning and Prompt Engineering to Support Domain Modeling
- Authors: Vladyslav Bulhakov, Giordano d'Aloisio, Claudio Di Sipio, Antinisca Di Marco, Davide Di Ruscio
- Abstract summary: Large language models (LLMs) have enhanced automation in software engineering tasks. This paper explores how hyperparameter tuning and prompt engineering can improve the accuracy of the Llama 3.1 model.
- Score: 6.283288241585592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The introduction of large language models (LLMs) has enhanced automation in software engineering tasks, including in Model Driven Engineering (MDE). However, using general-purpose LLMs for domain modeling has its limitations. One approach is to adopt fine-tuned models, but this requires significant computational resources and can lead to issues like catastrophic forgetting. This paper explores how hyperparameter tuning and prompt engineering can improve the accuracy of the Llama 3.1 model for generating domain models from textual descriptions. We use search-based methods to tune hyperparameters for a specific medical data model, resulting in a notable quality improvement over the baseline LLM. We then test the optimized hyperparameters across ten diverse application domains. While the solutions were not universally applicable, we demonstrate that combining hyperparameter tuning with prompt engineering can enhance results across nearly all examined domain models.
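The search-based tuning described in the abstract can be pictured with a minimal sketch. The abstract names neither the search algorithm nor the tuned hyperparameters, so everything below is an assumption for illustration: plain random search stands in for the paper's search-based method, the decoding knobs and ranges in `SEARCH_SPACE` are hypothetical, and `toy_score` is a placeholder for "generate a domain model with the LLM and score it against a reference".

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, float]

# Hypothetical search space: these decoding hyperparameters and ranges are
# assumptions for illustration, not the ones used in the paper.
SEARCH_SPACE: Dict[str, Tuple[float, float]] = {
    "temperature": (0.0, 1.5),
    "top_p": (0.1, 1.0),
    "repeat_penalty": (1.0, 1.5),
}


def sample_config(rng: random.Random) -> Config:
    """Draw one candidate configuration uniformly from the search space."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SEARCH_SPACE.items()}


def random_search(
    score_fn: Callable[[Config], float], n_trials: int, seed: int = 0
) -> Tuple[Config, float]:
    """Return the best-scoring configuration found in n_trials random draws."""
    rng = random.Random(seed)
    best_cfg: Config = {}
    best_score = float("-inf")
    for _ in range(n_trials):
        cfg = sample_config(rng)
        score = score_fn(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_score(cfg: Config) -> float:
    """Placeholder objective (assumption): in a real setup this would call the
    LLM with cfg and compare the generated domain model to a reference.
    Here it simply peaks at temperature 0.3 and top_p 0.9."""
    return 1.0 - abs(cfg["temperature"] - 0.3) - abs(cfg["top_p"] - 0.9)


if __name__ == "__main__":
    best_cfg, best_score = random_search(toy_score, n_trials=50)
    print(best_cfg, best_score)
```

Replacing `toy_score` with a function that actually queries the model would turn this into a tuning loop for one domain; the abstract's finding that the tuned values did not transfer universally suggests re-running such a search, or adjusting the prompt, per domain.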
Related papers
- Predictable Scale: Part I, Step Law -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining [59.369484219304866]
In this study, we conduct an unprecedented empirical investigation, training over 3,700 Large Language Models (LLMs) from scratch across 100 trillion tokens.
We empirically observe that, under fixed model size ($N$) and dataset size ($D$), the hyperparameter landscape exhibits convexity with a broad optimum.
Building on this insight, we formally define and empirically validate the Step Law: the optimal learning rate follows a power-law relationship with $N$ and $D$, while the optimal batch size is primarily influenced by $D$ and remains largely invariant to $N$.
arXiv Detail & Related papers (2025-03-06T18:58:29Z)
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation [37.456499537121886]
Recent advancements in Large Language Models have transformed ML/AI development, necessitating a reevaluation of AutoML principles for Retrieval-Augmented Generation (RAG) systems.
arXiv Detail & Related papers (2024-06-27T15:18:21Z)
- LLM can Achieve Self-Regulation via Hyperparameter Aware Generation [88.69052513433603]
Large Language Models (LLMs) employ diverse decoding strategies to control the generated text.
Are LLMs conscious of the existence of these decoding strategies and capable of regulating themselves?
We propose a novel text generation paradigm termed Hyperparameter Aware Generation (HAG).
arXiv Detail & Related papers (2024-02-17T11:18:22Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Using Large Language Models for Hyperparameter Optimization [29.395931874196805]
This paper explores the use of foundational large language models (LLMs) in hyperparameter optimization (HPO).
Our empirical evaluations on standard benchmarks reveal that within constrained search budgets, LLMs can match or outperform traditional HPO methods.
arXiv Detail & Related papers (2023-12-07T18:46:50Z)
- Fairer and More Accurate Tabular Models Through NAS [14.147928131445852]
We propose using multi-objective Neural Architecture Search (NAS) and Hyperparameter Optimization (HPO) in the first application to the very challenging domain of tabular data.
We show that models optimized solely for accuracy with NAS often fail to inherently address fairness concerns.
We produce architectures that consistently dominate state-of-the-art bias mitigation methods either in fairness, accuracy or both.
arXiv Detail & Related papers (2023-10-18T17:56:24Z)
- Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose A graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first utilize the mode approximation to generate 0.1M trainable parameters to implement the multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
- Hierarchical Collaborative Hyper-parameter Tuning [0.0]
Hyper-parameter tuning is among the most critical stages in building machine learning solutions.
This paper demonstrates how multi-agent systems can be utilized to develop a distributed technique for determining near-optimal values.
arXiv Detail & Related papers (2022-05-11T05:16:57Z)
- Surrogate Model Based Hyperparameter Tuning for Deep Learning with SPOT [0.40611352512781856]
This article demonstrates how the architecture-level parameters of deep learning models implemented in Keras/TensorFlow can be optimized.
The implementation of the tuning procedure is 100% based on R, the software environment for statistical computing.
arXiv Detail & Related papers (2021-05-30T21:16:51Z)
- On the Sparsity of Neural Machine Translation Models [65.49762428553345]
We investigate whether redundant parameters can be reused to achieve better performance.
Experiments and analyses are systematically conducted on different datasets and NMT architectures.
arXiv Detail & Related papers (2020-10-06T11:47:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.