Advancing Software Quality: A Standards-Focused Review of LLM-Based Assurance Techniques
- URL: http://arxiv.org/abs/2505.13766v1
- Date: Mon, 19 May 2025 22:49:30 GMT
- Title: Advancing Software Quality: A Standards-Focused Review of LLM-Based Assurance Techniques
- Authors: Avinash Patil
- Abstract summary: Large Language Models (LLMs) present new opportunities to enhance existing Software Quality Assurance (SQA) processes. LLMs can automate tasks like requirement analysis, code review, test generation, and compliance checks. This paper surveys the intersection of LLM-based SQA methods and recognized standards.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Software Quality Assurance (SQA) is critical for delivering reliable, secure, and efficient software products. The Software Quality Assurance Process aims to provide assurance that work products and processes comply with predefined provisions and plans. Recent advancements in Large Language Models (LLMs) present new opportunities to enhance existing SQA processes by automating tasks like requirement analysis, code review, test generation, and compliance checks. Simultaneously, established standards such as ISO/IEC 12207, ISO/IEC 25010, ISO/IEC 5055, ISO 9001/ISO/IEC 90003, CMMI, and TMM provide structured frameworks for ensuring robust quality practices. This paper surveys the intersection of LLM-based SQA methods and these recognized standards, highlighting how AI-driven solutions can augment traditional approaches while maintaining compliance and process maturity. We first review the foundational software quality standards and the technical fundamentals of LLMs in software engineering. Next, we explore various LLM-based SQA applications, including requirement validation, defect detection, test generation, and documentation maintenance. We then map these applications to key software quality frameworks, illustrating how LLMs can address specific requirements and metrics within each standard. Empirical case studies and open-source initiatives demonstrate the practical viability of these methods. At the same time, discussions on challenges (e.g., data privacy, model bias, explainability) underscore the need for deliberate governance and auditing. Finally, we propose future directions encompassing adaptive learning, privacy-focused deployments, multimodal analysis, and evolving standards for AI-driven software quality.
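To make the kind of automation the abstract describes concrete, here is a minimal sketch of an LLM-assisted requirement check (illustrative only, not the paper's implementation; it assumes the `openai` Python client, an `OPENAI_API_KEY` in the environment, and a chat-capable model):

```python
# Minimal sketch of an LLM-assisted requirement review (illustrative; not the
# surveyed paper's implementation). Assumes the `openai` Python client and an
# OPENAI_API_KEY in the environment; the criteria echo common standards language.
from openai import OpenAI

client = OpenAI()

def review_requirement(req: str) -> str:
    """Ask the model to flag ambiguity, unverifiability, and incompleteness."""
    prompt = (
        "You are a software quality auditor. Check the requirement below for "
        "ambiguity, verifiability, and completeness, and list any issues.\n\n"
        f"Requirement: {req}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",            # any chat-capable model would do
        messages=[{"role": "user", "content": prompt}],
        temperature=0,                  # keep audit output repeatable
    )
    return response.choices[0].message.content

print(review_requirement("The system shall respond quickly to user input."))
```

In practice such a check would run as one gate in a larger pipeline, with its findings logged for the audit trail that standards like ISO 9001 expect.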
Related papers
- Leveraging LLMs for Formal Software Requirements -- Challenges and Prospects
VERIFAI aims to investigate automated and semi-automated approaches to bridge this gap. This position paper presents a preliminary synthesis of relevant literature to identify recurring challenges and prospective research directions.
arXiv Detail & Related papers (2025-07-18T19:15:50Z)
- MERA Code: A Unified Framework for Evaluating Code Generation Across Tasks
We propose MERA Code, a benchmark for evaluating the latest code-generation LLMs on tasks in Russian. The benchmark includes 11 evaluation tasks spanning 8 programming languages. We evaluate open LLMs and frontier API models, analyzing their limitations on practical coding tasks in non-English languages.
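Execution-based scoring is the usual core of such benchmarks; a generic sketch of pass@1 scoring follows (illustrative only, not MERA Code's actual harness; each task is assumed to supply a prompt and a test snippet that raises on wrong output):

```python
# Generic sketch of execution-based scoring for code-generation benchmarks
# (illustrative; not MERA Code's actual harness).
from typing import Callable

def pass_at_1(tasks: list[dict], generate: Callable[[str], str]) -> float:
    """Fraction of tasks whose single generated solution passes its tests."""
    passed = 0
    for task in tasks:
        candidate = generate(task["prompt"])   # one sample per task
        scope: dict = {}
        try:
            exec(candidate, scope)             # define the candidate solution
            exec(task["tests"], scope)         # run the task's hidden tests
            passed += 1
        except Exception:
            pass                               # any failure counts as a miss
    return passed / len(tasks)
```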
arXiv Detail & Related papers (2025-07-16T14:31:33Z)
- OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics
We introduce OpenUnlearning, a standardized framework for benchmarking large language model (LLM) unlearning methods and metrics. OpenUnlearning integrates 9 unlearning algorithms and 16 diverse evaluations across 3 leading benchmarks. We also benchmark diverse unlearning methods and provide a comparative analysis against an extensive evaluation suite.
arXiv Detail & Related papers (2025-06-14T20:16:37Z)
- Software Bill of Materials in Software Supply Chain Security: A Systematic Literature Review
Software Bills of Materials (SBOMs) are increasingly regarded as essential tools for securing software supply chains (SSCs). This systematic literature review synthesizes evidence from 40 peer-reviewed studies to evaluate how SBOMs are currently used to bolster SSC security. Despite clear promise, adoption is hindered by significant barriers: generation tooling, data privacy, format standardization, sharing and distribution, cost and overhead, vulnerability exploitability, maintenance, analysis tooling, false positives, hidden packages, and tampering.
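As a rough illustration of SBOM consumption (not drawn from the review), a CycloneDX-style JSON document carries a top-level "components" array that downstream checks can walk; real pipelines would also match entries against vulnerability feeds:

```python
# Minimal sketch of consuming an SBOM for supply-chain checks (illustrative).
# Assumes a CycloneDX-style JSON document with a top-level "components" array.
import json

def list_components(sbom_path: str) -> list[tuple[str, str]]:
    """Return (name, version) pairs declared in a CycloneDX SBOM."""
    with open(sbom_path, encoding="utf-8") as fh:
        sbom = json.load(fh)
    return [(c.get("name", "?"), c.get("version", "?"))
            for c in sbom.get("components", [])]
```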
arXiv Detail & Related papers (2025-06-04T02:49:04Z)
- Training Language Models to Generate Quality Code with Program Analysis Feedback
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality. We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
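One way to picture this is a reward signal that blends functional checks with static-analysis feedback; the sketch below is in that spirit but is not REAL's actual reward design (the weights and tools, `pytest` and `ruff`, are illustrative choices):

```python
# Sketch of a quality-aware reward combining test results with program-analysis
# feedback (illustrative; not REAL's actual design).
import subprocess

def quality_reward(workdir: str) -> float:
    """Reward = 1.0 if tests pass, minus a penalty per linter finding."""
    tests = subprocess.run(["pytest", "-q"], cwd=workdir, capture_output=True)
    functional = 1.0 if tests.returncode == 0 else 0.0
    lint = subprocess.run(["ruff", "check", "."], cwd=workdir,
                          capture_output=True, text=True)
    findings = len(lint.stdout.splitlines())      # crude count of reported issues
    return functional - 0.05 * min(findings, 10)  # cap the analysis penalty
```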
arXiv Detail & Related papers (2025-05-28T17:57:47Z)
- AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
Large Language Models (LLMs) have demonstrated advanced capabilities in real-world agentic applications. AgentIF is the first benchmark for systematically evaluating the instruction-following ability of LLMs in agentic scenarios.
arXiv Detail & Related papers (2025-05-22T17:31:10Z)
- Assessing and Advancing Benchmarks for Evaluating Large Language Models in Software Engineering Tasks
Large language models (LLMs) are gaining increasing popularity in software engineering (SE), so evaluating their effectiveness is crucial for understanding their potential in this field. This paper offers a thorough review of 191 benchmarks, addressing three main aspects: what benchmarks are available, how benchmarks are constructed, and the future outlook for these benchmarks.
arXiv Detail & Related papers (2025-05-13T18:45:10Z)
- Requirements-Driven Automated Software Testing: A Systematic Review
This systematic literature review (SLR) explores the landscape of requirements-driven automated software testing (REDAST) by analyzing requirements input, transformation techniques, test outcomes, evaluation methods, and existing limitations. The study synthesizes the current state of REDAST research, highlights trends, and proposes future directions.
arXiv Detail & Related papers (2025-02-25T23:13:09Z)
- Quality Assurance Practices in Agile Methodology
The complexity of software is increasing day by day as the requirement and need for a variety of software products grows.
The practice of applying software metrics to the development process and to a software product is a critical task that requires study and discipline.
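A tiny worked example of such a product metric is defect density, defects per thousand lines of code (KLOC); the figures below are illustrative, not from the paper:

```python
# Defect density: defects per thousand lines of code (KLOC).
def defect_density(defects: int, lines_of_code: int) -> float:
    """Defects per KLOC; a common product-quality metric."""
    return defects / (lines_of_code / 1000)

# e.g. 12 confirmed defects in a 30,000-line release -> 0.4 defects/KLOC
assert round(defect_density(12, 30_000), 2) == 0.40
```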
arXiv Detail & Related papers (2024-11-07T19:45:40Z)
- Leveraging LLMs for the Quality Assurance of Software Requirements
We introduce and assess the capabilities of a Large Language Model (LLM) to evaluate the quality characteristics of software requirements according to the ISO 29148 standard.
We show how an LLM can assess requirements and explain its decision-making process, and we examine its capacity to propose improved versions of requirements.
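A prompt along the following lines illustrates the assess-and-rewrite workflow described above (a sketch, not the paper's actual prompt; the listed criteria paraphrase ISO/IEC/IEEE 29148's well-formed requirement characteristics):

```python
# Illustrative prompt template for requirement assessment and rewriting
# (a sketch; not the paper's actual prompt).
ASSESS_AND_REWRITE = """For the requirement below:
1. Rate each characteristic as pass/fail: unambiguous, verifiable, singular,
   feasible, complete.
2. Briefly justify each failing rating.
3. Propose a rewritten requirement that fixes the failures.

Requirement: "{requirement}"
"""

def build_prompt(requirement: str) -> str:
    return ASSESS_AND_REWRITE.format(requirement=requirement)

print(build_prompt("The system should usually be fast."))
```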
arXiv Detail & Related papers (2024-08-20T14:17:50Z)
- Design of a Quality Management System based on the EU Artificial Intelligence Act
The EU AI Act mandates that providers and deployers of high-risk AI systems establish a quality management system (QMS).
This paper introduces a new design concept and prototype for a QMS as a microservice Software as a Service web application.
arXiv Detail & Related papers (2024-08-08T12:14:02Z)
- Agent-Driven Automatic Software Improvement
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, becoming better aligned to the task of automated software improvement.
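The generate-test-refine loop this describes can be sketched schematically as follows (`propose_patch` stands in for an LLM call and is hypothetical; the tooling choices of `git apply` and `pytest` are illustrative):

```python
# Schematic agent loop with iterative test feedback (a sketch; `propose_patch`
# is a hypothetical stand-in for an LLM call that writes a diff to a file).
import subprocess

def improve(repo: str, propose_patch, max_rounds: int = 3) -> bool:
    """Apply LLM-proposed patches until the test suite passes or rounds run out."""
    feedback = ""
    for _ in range(max_rounds):
        patch = propose_patch(repo, feedback)  # path to a diff file from the LLM
        subprocess.run(["git", "apply", patch], cwd=repo, check=False)
        result = subprocess.run(["pytest", "-q"], cwd=repo,
                                capture_output=True, text=True)
        if result.returncode == 0:
            return True                        # tests pass: improvement accepted
        feedback = result.stdout[-2000:]       # feed failures back to the agent
    return False
```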
arXiv Detail & Related papers (2024-06-24T15:45:22Z)
- Benchmarking Uncertainty Quantification Methods for Large Language Models with LM-Polygraph
Uncertainty quantification (UQ) is a key element of machine learning applications. We introduce a novel benchmark that implements a collection of state-of-the-art UQ baselines. We conduct a large-scale empirical investigation of UQ and normalization techniques across eleven tasks, identifying the most effective approaches.
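Among the simplest baselines such benchmarks include are sequence-probability scores; the generic sketch below shows the idea (it does not use LM-Polygraph's API):

```python
# Simple information-based UQ baselines over a generation's token log-probs
# (generic sketch, not LM-Polygraph's API).
import math

def mean_logprob_confidence(token_logprobs: list[float]) -> float:
    """Higher (closer to 0) mean log-prob => the model was more confident."""
    return sum(token_logprobs) / len(token_logprobs)

def perplexity(token_logprobs: list[float]) -> float:
    """Sequence perplexity; larger values indicate higher uncertainty."""
    return math.exp(-mean_logprob_confidence(token_logprobs))

# a confidently generated answer scores lower perplexity than a hesitant one
assert perplexity([-0.1, -0.2, -0.1]) < perplexity([-2.0, -1.5, -2.5])
```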
arXiv Detail & Related papers (2024-06-21T20:06:31Z)
- Towards Generating Executable Metamorphic Relations Using Large Language Models
We propose an approach for automatically deriving executable metamorphic relations (MRs) from requirements using large language models (LLMs).
To assess the feasibility of our approach, we conducted a questionnaire-based survey in collaboration with Siemens Industry Software.
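For readers unfamiliar with the term, an executable MR is a test relating outputs across transformed inputs; the example below is a generic illustration for a sort routine, not an MR generated by the paper's approach:

```python
# A classic executable metamorphic relation: permuting the input of a sort
# routine must not change its output (generic illustration).
import random

def mr_permutation_invariant(sort_fn, xs: list[int]) -> bool:
    """MR holds if sorting is insensitive to input order."""
    shuffled = xs[:]
    random.shuffle(shuffled)
    return sort_fn(xs) == sort_fn(shuffled)

assert mr_permutation_invariant(sorted, [3, 1, 2, 1])
```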
arXiv Detail & Related papers (2024-01-30T13:52:47Z)
- Technology Readiness Levels for AI & ML
Development of machine learning systems can be executed easily with modern tools, but the process is typically rushed and treated as a means to an end.
Engineering systems follow well-defined processes and testing standards to streamline development for high-quality, reliable results.
We propose a proven systems engineering approach for machine learning development and deployment.
arXiv Detail & Related papers (2020-06-21T17:14:34Z)