Generative AI for Requirements Engineering: A Systematic Literature Review
- URL: http://arxiv.org/abs/2409.06741v2
- Date: Thu, 23 Jan 2025 11:12:26 GMT
- Title: Generative AI for Requirements Engineering: A Systematic Literature Review
- Authors: Haowei Cheng, Jati H. Husen, Yijun Lu, Teeradaj Racharak, Nobukazu Yoshioka, Naoyasu Ubayashi, Hironori Washizaki
- Abstract summary: The emergence of generative AI (GenAI) offers new opportunities and challenges in requirements engineering (RE).
This systematic literature review aims to analyze and synthesize current research on GenAI applications in RE.
- Score: 4.444308664613162
- Abstract: Context: Requirements engineering (RE) faces mounting challenges in handling increasingly complex software systems. The emergence of generative AI (GenAI) offers new opportunities and challenges in RE. Objective: This systematic literature review aims to analyze and synthesize current research on GenAI applications in RE, focusing on identifying research trends, methodologies, challenges, and future directions. Method: We conducted a comprehensive review of 105 articles published between 2019 and 2024, obtained from major academic databases, using a systematic methodology for paper selection, data extraction, and feature analysis. Results: The analysis revealed the following. (1) While GPT-series models dominate current applications, appearing in 67.3% of studies, existing architectures face technical challenges in interpretability (61.9%), reproducibility (52.4%), and controllability (47.6%), which show strong correlations (>35% co-occurrence). (2) Reproducibility is identified as a major concern by 52.4% of studies, highlighting the difficulty of achieving consistent results given the stochastic nature and parameter sensitivity of GenAI. (3) Governance-related issues (e.g., ethics and security) form a distinct cluster of challenges that requires coordinated solutions, yet they are addressed by fewer than 20% of studies. Conclusions: While GenAI exhibits potential in RE, our findings reveal critical issues: (1) the high correlations among interpretability, reproducibility, and controllability suggest the need for specialized architectures that target the interdependencies of these attributes; (2) the widespread concern about result consistency and reproducibility demands standardized evaluation frameworks; (3) the emergence of interconnected governance challenges demands comprehensive governance structures.
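The ">35% co-occurrence" figure is a pairwise statistic over the reviewed studies. A minimal sketch of how such a rate can be computed, assuming one plausible definition (the share of studies tagged with both challenges); the tags and the five example studies below are illustrative, not the paper's actual data:

```python
from itertools import combinations

# Illustrative per-study challenge tags; the real review extracted such
# tags from 105 articles (the five sets below are made up for this sketch).
studies = [
    {"interpretability", "reproducibility"},
    {"interpretability", "reproducibility", "controllability"},
    {"reproducibility"},
    {"ethics", "security"},
    {"interpretability", "controllability"},
]

tags = sorted(set().union(*studies))
n = len(studies)

# Assumed definition of co-occurrence rate: the fraction of all studies
# in which both challenge tags appear together.
for a, b in combinations(tags, 2):
    both = sum(1 for s in studies if a in s and b in s)
    print(f"{a} & {b}: {100 * both / n:.1f}%")
```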
Related papers
- Neuro-Symbolic AI in 2024: A Systematic Review [0.29260385019352086]
The review followed the PRISMA methodology, utilizing databases such as IEEE Xplore, Google Scholar, arXiv, ACM, and SpringerLink.
From an initial pool of 1,428 papers, 167 met the inclusion criteria and were analyzed in detail.
The majority of research efforts are concentrated in the areas of learning and inference, logic and reasoning, and knowledge representation.
arXiv Detail & Related papers (2025-01-09T18:48:35Z) - The Impossible Test: A 2024 Unsolvable Dataset and A Chance for an AGI Quiz [0.0]
We evaluate large language models' (LLMs) ability to acknowledge uncertainty on 675 fundamentally unsolvable problems.
The best models achieved 62-68% accuracy in admitting that the problem's solution was unknown, across fields ranging from biology to philosophy and mathematics.
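Grading whether a model "admits the solution is unknown" usually reduces to checking its response for acknowledgment phrases. A minimal sketch of such a grader under an assumed keyword rule; the marker list is hypothetical, not the paper's actual protocol:

```python
# Hypothetical acknowledgment phrases; the paper's actual grading
# protocol is not specified in this summary.
UNKNOWN_MARKERS = (
    "unknown", "unsolved", "open problem",
    "no known solution", "cannot be determined",
)

def admits_unknown(response: str) -> bool:
    """True if the response acknowledges that the problem is unsolvable."""
    text = response.lower()
    return any(marker in text for marker in UNKNOWN_MARKERS)

# Accuracy = fraction of unsolvable problems where uncertainty is admitted.
responses = ["This remains an open problem.", "The answer is 42."]
accuracy = sum(admits_unknown(r) for r in responses) / len(responses)
print(f"accuracy: {accuracy:.0%}")  # 50%
```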
arXiv Detail & Related papers (2024-11-20T04:12:29Z) - Generative Artificial Intelligence Meets Synthetic Aperture Radar: A Survey [49.29751866761522]
This paper aims to investigate the intersection of GenAI and SAR.
First, we illustrate common data-generation-based applications in the SAR field.
Then, an overview of the latest GenAI models is systematically reviewed.
Finally, the corresponding applications in the SAR domain are also included.
arXiv Detail & Related papers (2024-11-05T03:06:00Z) - An Adaptive Framework for Generating Systematic Explanatory Answer in Online Q&A Platforms [62.878616839799776]
We propose SynthRAG, an innovative framework designed to enhance Question Answering (QA) performance.
SynthRAG improves on conventional models by employing adaptive outlines for dynamic content structuring.
An online deployment on the Zhihu platform revealed that SynthRAG's answers achieved notable user engagement.
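One way to read "adaptive outlines for dynamic content structuring" is as a two-stage pipeline: draft an outline for the question, then generate each section grounded in retrieved context. A minimal sketch under assumed interfaces; `llm` and `retrieve` are hypothetical stand-ins, not SynthRAG's actual API:

```python
def llm(prompt: str) -> str:
    """Hypothetical chat-completion call; swap in any LLM client."""
    raise NotImplementedError

def retrieve(query: str, k: int = 3) -> list[str]:
    """Hypothetical retriever over a supporting document corpus."""
    raise NotImplementedError

def outline_then_answer(question: str) -> str:
    # Stage 1: let the model draft an outline adapted to the question.
    outline = llm(f"Draft a short section outline for answering: {question}")
    headings = [line.strip() for line in outline.splitlines() if line.strip()]
    # Stage 2: generate each section grounded in retrieved passages.
    sections = []
    for heading in headings:
        context = "\n".join(retrieve(f"{question} {heading}"))
        sections.append(llm(
            f"Context:\n{context}\n\n"
            f"Write the section '{heading}' for the question: {question}"
        ))
    return "\n\n".join(sections)
```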
arXiv Detail & Related papers (2024-10-23T09:14:57Z) - SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories [55.161075901665946]
SUPER aims to capture the realistic challenges faced by researchers working with Machine Learning (ML) and Natural Language Processing (NLP) research repositories.
Our benchmark comprises three distinct problem sets: 45 end-to-end problems with annotated expert solutions, 152 sub-problems derived from the expert set that focus on specific challenges, and 602 automatically generated problems for larger-scale development.
We show that state-of-the-art approaches struggle to solve these problems, with the best model (GPT-4o) solving only 16.3% of the end-to-end set and 46.1% of the scenarios.
arXiv Detail & Related papers (2024-09-11T17:37:48Z) - How Mature is Requirements Engineering for AI-based Systems? A Systematic Mapping Study on Practices, Challenges, and Future Research Directions [5.6818729232602205]
It is unclear whether existing RE methods are sufficient or whether new ones are needed to address the challenges posed by AI-based systems.
Existing RE4AI research focuses mainly on requirements analysis and elicitation, with most practices applied in these areas.
We identified requirements specification, explainability, and the gap between machine learning engineers and end-users as the most prevalent challenges.
arXiv Detail & Related papers (2024-09-11T11:28:16Z) - OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI [73.75520820608232]
We introduce OlympicArena, which includes 11,163 bilingual problems across both text-only and interleaved text-image modalities.
These challenges encompass a wide range of disciplines spanning seven fields and 62 international Olympic competitions, rigorously examined for data leakage.
Our evaluations reveal that even advanced models like GPT-4o only achieve a 39.97% overall accuracy, illustrating current AI limitations in complex reasoning and multimodal integration.
arXiv Detail & Related papers (2024-06-18T16:20:53Z) - A Systematic Literature Review on Explainability for Machine/Deep Learning-based Software Engineering Research [23.273934717819795]
This paper presents a systematic literature review of approaches that aim to improve the explainability of AI models within the context of Software Engineering.
We aim to (1) summarize the SE tasks where XAI techniques have shown success to date; (2) classify and analyze different XAI techniques; and (3) investigate existing evaluation approaches.
arXiv Detail & Related papers (2024-01-26T03:20:40Z) - Classification, Challenges, and Automated Approaches to Handle Non-Functional Requirements in ML-Enabled Systems: A Systematic Literature Review [10.09767622002672]
We propose a systematic literature review targeting two key aspects: the classification of the non-functional requirements investigated so far, and the challenges to be faced when developing models in ML-enabled systems.
We report that current research identified 30 different non-functional requirements, which can be grouped into six main classes.
We also compiled a catalog of more than 23 software engineering challenges that future research on the non-functional requirements of ML-enabled systems should take into account.
arXiv Detail & Related papers (2023-11-29T09:45:41Z) - How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks [65.7949334650854]
GPT-3.5 models have demonstrated impressive performance in various Natural Language Processing (NLP) tasks.
However, their robustness and ability to handle the varied complexities of the open world have yet to be explored.
We show that GPT-3.5 faces some specific robustness challenges, including instability, prompt sensitivity, and number sensitivity.
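Prompt sensitivity of this kind can be probed by posing the same question under paraphrased templates and measuring answer agreement. A minimal sketch, where `ask` is a hypothetical call to the model under test:

```python
from collections import Counter

def ask(prompt: str) -> str:
    """Hypothetical call to the model under test (e.g., GPT-3.5)."""
    raise NotImplementedError

# Paraphrased templates for the same underlying question.
TEMPLATES = [
    "{q}",
    "Please answer concisely: {q}",
    "Q: {q}\nA:",
]

def prompt_sensitivity(question: str) -> float:
    """1 - share of the majority answer; 0.0 means fully prompt-stable."""
    answers = [ask(t.format(q=question)) for t in TEMPLATES]
    majority = Counter(answers).most_common(1)[0][1]
    return 1 - majority / len(answers)
```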
arXiv Detail & Related papers (2023-03-01T07:39:01Z) - GLUECons: A Generic Benchmark for Learning Under Constraints [102.78051169725455]
In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision.
We model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints.
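A common way to turn external knowledge into a constraint is to add a differentiable penalty for constraint violations to the task loss. A minimal PyTorch sketch, assuming a mutual-exclusivity constraint chosen for illustration rather than any of the benchmark's nine tasks:

```python
import torch
import torch.nn.functional as F

def constrained_loss(logits: torch.Tensor, targets: torch.Tensor,
                     weight: float = 0.1) -> torch.Tensor:
    """Multi-label task loss plus a soft penalty for violating the
    (illustrative) constraint that labels are mutually exclusive."""
    task_loss = F.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    # Penalize any two labels being active at once: the product of two
    # label probabilities should be near zero for off-diagonal pairs.
    pairwise = probs.unsqueeze(2) * probs.unsqueeze(1)
    mask = 1.0 - torch.eye(probs.size(1), device=probs.device)
    violation = (pairwise * mask).sum(dim=(1, 2)).mean()
    return task_loss + weight * violation
```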
arXiv Detail & Related papers (2023-02-16T16:45:36Z)