Towards Generating Executable Metamorphic Relations Using Large Language Models
- URL: http://arxiv.org/abs/2401.17019v2
- Date: Fri, 7 Jun 2024 15:10:22 GMT
- Title: Towards Generating Executable Metamorphic Relations Using Large Language Models
- Authors: Seung Yeob Shin, Fabrizio Pastore, Domenico Bianculli, Alexandra Baicoianu
- Abstract summary: We propose an approach for automatically deriving executable MRs from requirements using large language models (LLMs).
To assess the feasibility of our approach, we conducted a questionnaire-based survey in collaboration with Siemens Industry Software.
- Score: 46.26208489175692
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Metamorphic testing (MT) has proven to be a successful solution to automating testing and addressing the oracle problem. However, it entails manually deriving metamorphic relations (MRs) and converting them into an executable form; these steps are time-consuming and may prevent the adoption of MT. In this paper, we propose an approach for automatically deriving executable MRs (EMRs) from requirements using large language models (LLMs). Instead of merely asking the LLM to produce EMRs, our approach relies on a few-shot prompting strategy to instruct the LLM to perform activities in the MT process, by providing requirements and API specifications, as one would do with software engineers. To assess the feasibility of our approach, we conducted a questionnaire-based survey in collaboration with Siemens Industry Software, a worldwide leader in providing industry software and services, focusing on four of their software applications. Additionally, we evaluated the accuracy of the generated EMRs for a Web application. The outcomes of our study are highly promising, as they demonstrate the capability of our approach to generate MRs and EMRs that are both comprehensible and pertinent for testing purposes.
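To make the notion of an executable metamorphic relation concrete, the following is a minimal illustrative sketch in Python. It is not taken from the paper and does not reproduce its LLM-based approach. The system under test (`sut_sort`), the relation (`mr_permutation_invariance`), and the driver (`run_mr`) are all hypothetical names chosen for this example. The relation checked is that permuting the input of a sorting routine must not change its output, so no expected output (oracle) is needed for any individual input.

```python
import random


def sut_sort(items):
    """Hypothetical system under test; any sorting routine could stand here."""
    return sorted(items)


def mr_permutation_invariance(sut, source_input, rng):
    """MR: shuffling the input must leave the sorted output unchanged."""
    follow_up_input = list(source_input)  # copy so the source input is untouched
    rng.shuffle(follow_up_input)          # input transformation of the MR
    # Output relation of the MR: source and follow-up outputs must be equal.
    return sut(source_input) == sut(follow_up_input)


def run_mr(sut=sut_sort, trials=100, seed=0):
    """Check the MR on randomly generated source inputs; True if it always holds."""
    rng = random.Random(seed)
    for _ in range(trials):
        source = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        if not mr_permutation_invariance(sut, source, rng):
            return False  # MR violated: the SUT is likely faulty
    return True
```

Calling `run_mr()` exercises the relation against the stand-in sorter. Passing a faulty routine, for example one that returns its input unchanged, would be expected to violate the relation on some generated input.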
Related papers
- Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks.
We propose a text-based generative IoT (GIoT) system deployed in the local network setting.
arXiv Detail & Related papers (2024-06-14T19:24:00Z)
- ORLM: Training Large Language Models for Optimization Modeling [16.348267803499404]
Large Language Models (LLMs) have emerged as powerful tools for tackling complex Operations Research (OR) problems.
To tackle this issue, we propose training open-source LLMs for optimization modeling.
Our best-performing ORLM achieves state-of-the-art performance on the NL4OPT, MAMO, and IndustryOR benchmarks.
arXiv Detail & Related papers (2024-05-28T01:55:35Z)
- Using Large Language Models to Understand Telecom Standards [35.343893798039765]
Large Language Models (LLMs) may provide faster access to relevant information.
We evaluate the capability of state-of-the-art LLMs to be used as Question Answering (QA) assistants.
Results show that LLMs can be used as a credible reference tool on telecom technical documents.
arXiv Detail & Related papers (2024-04-02T09:54:51Z)
- MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria [44.401826163314716]
We propose a new evaluation paradigm for MLLMs using a potent MLLM as the judge.
We benchmark 21 popular MLLMs in a pairwise-comparison fashion, showing diverse performance across models.
The validity of our benchmark manifests itself in reaching 88.02% agreement with human evaluation.
arXiv Detail & Related papers (2023-11-23T12:04:25Z)
- Towards a Complete Metamorphic Testing Pipeline [56.75969180129005]
Metamorphic Testing (MT) addresses the test oracle problem by examining the relationships between input-output pairs in consecutive executions of the System Under Test (SUT).
These relations, known as Metamorphic Relations (MRs), specify the expected output changes resulting from specific input changes.
Our research aims to develop methods and tools that assist testers in generating MRs, defining constraints, and providing explainability for MR outcomes.
arXiv Detail & Related papers (2023-09-30T10:49:22Z)
- Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions.
Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning part and a non-machine learning part.
We show, in a case study on the industrial use case of price forecasting, that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
- Just Tell Me: Prompt Engineering in Business Process Management [63.08166397142146]
GPT-3 and other language models (LMs) can effectively address various natural language processing (NLP) tasks.
We argue that prompt engineering can help bring the capabilities of LMs to BPM research.
arXiv Detail & Related papers (2023-04-14T14:55:19Z)
- Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP [77.817293104436]
We propose a framework that relies on passing natural language texts in sophisticated pipelines between an LM and an RM.
We have written novel DSP programs for answering questions in open-domain, multi-hop, and conversational settings.
arXiv Detail & Related papers (2022-12-28T18:52:44Z)