Agentic RAG for Software Testing with Hybrid Vector-Graph and Multi-Agent Orchestration
- URL: http://arxiv.org/abs/2510.10824v1
- Date: Sun, 12 Oct 2025 22:25:15 GMT
- Title: Agentic RAG for Software Testing with Hybrid Vector-Graph and Multi-Agent Orchestration
- Authors: Mohanakrishnan Hariharan, Satish Arvapalli, Seshu Barma, Evangeline Sheela,
- Abstract summary: We present an approach to software testing automation using Agentic Retrieval-Augmented Generation (RAG) systems for Quality Engineering (QE) artifact creation. We combine autonomous AI agents with hybrid vector-graph knowledge systems to automate test plan, case, and QE metric generation.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present an approach to software testing automation using Agentic Retrieval-Augmented Generation (RAG) systems for Quality Engineering (QE) artifact creation. We combine autonomous AI agents with hybrid vector-graph knowledge systems to automate test plan, case, and QE metric generation. Our approach addresses traditional software testing limitations by leveraging LLMs such as Gemini and Mistral, multi-agent orchestration, and enhanced contextualization. The system improves accuracy from 65% to 94.8% while ensuring comprehensive document traceability throughout the quality engineering lifecycle. Experimental validation on enterprise Corporate Systems Engineering and SAP migration projects demonstrates an 85% reduction in testing timeline, an 85% improvement in test suite efficiency, and projected 35% cost savings, resulting in a 2-month acceleration of go-live.
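The hybrid vector-graph retrieval the abstract describes can be sketched in a few lines: vector similarity selects seed artifacts, and a traceability graph then expands the result to linked QE documents. The sketch below is an illustrative assumption, not the paper's implementation; the toy corpus, the bag-of-words `embed`, and the `hybrid_retrieve` function are all invented for this example.

```python
# Minimal sketch of hybrid vector-graph retrieval. All names and data
# are illustrative assumptions, not the paper's actual system.
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Vector store: artifact id -> embedding.
# Graph: traceability edges between requirements and test artifacts.
docs = {
    "REQ-1": "user login must lock after three failed attempts",
    "REQ-2": "payment service retries failed transactions twice",
    "TP-1":  "test plan for login lockout and session handling",
}
graph = {"REQ-1": ["TP-1"], "TP-1": ["REQ-1"], "REQ-2": []}
vectors = {doc_id: embed(text) for doc_id, text in docs.items()}

def hybrid_retrieve(query, k=1):
    """Vector top-k seeds, then graph expansion for traceability."""
    q = embed(query)
    seeds = sorted(vectors, key=lambda d: cosine(q, vectors[d]), reverse=True)[:k]
    expanded = set(seeds)
    for d in seeds:
        expanded.update(graph.get(d, []))
    return sorted(expanded)

print(hybrid_retrieve("generate test cases for login lockout"))
# -> ['REQ-1', 'TP-1']: the test plan matches by similarity, and the
# traceability edge pulls in the requirement it covers.
```

In an agentic setting, the expanded artifact set would be handed to downstream agents (test-plan writer, test-case generator, metrics reporter); the graph expansion step is what keeps generated artifacts traceable back to requirements.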
Related papers
- SWE-Universe: Scale Real-World Verifiable Environments to Millions [84.63665266236963]
SWE-Universe is a framework for automatically constructing real-world software engineering (SWE) verifiable environments from GitHub pull requests (PRs). We propose a building agent powered by an efficient custom-trained model to overcome the prevalent challenges of automatic building. We demonstrate the profound value of our environments through large-scale agentic mid-training and reinforcement learning.
arXiv Detail & Related papers (2026-02-02T17:20:30Z)
- The Rise of Agentic Testing: Multi-Agent Systems for Robust Software Quality Assurance [0.0]
Current AI-based test generators produce invalid, redundant, or non-executable tests due to a lack of execution-aware feedback. This paper introduces a closed-loop, self-correcting system in which a Test Generation Agent, an Execution and Analysis Agent, and a Review and Optimization Agent collaboratively generate, execute, analyze, and refine tests.
arXiv Detail & Related papers (2026-01-05T18:20:14Z)
- An Agentic Framework for Autonomous Materials Computation [70.24472585135929]
Large Language Models (LLMs) have emerged as powerful tools for accelerating scientific discovery. Recent advances integrate LLMs into agentic frameworks, enabling retrieval, reasoning, and tool use for complex scientific experiments. Here, we present a domain-specialized agent designed for reliable automation of first-principles materials computations.
arXiv Detail & Related papers (2025-12-22T15:03:57Z)
- Reinforcement Learning Integrated Agentic RAG for Software Test Cases Authoring [0.0]
This paper introduces a framework that integrates reinforcement learning (RL) with autonomous agents to enable continuous improvement in the automated process of software test cases authoring from business requirement documents within Quality Engineering (QE). Our proposed Reinforcement Infused Agentic RAG (Retrieve, Augment, Generate) framework overcomes this limitation by employing AI agents that learn from QE feedback, assessments, and defect discovery outcomes to automatically improve their test case generation strategies.
arXiv Detail & Related papers (2025-12-05T17:52:26Z)
- AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security [74.22452069013289]
AegisLLM is a cooperative multi-agent defense against adversarial attacks and information leakage. We show that scaling the agentic reasoning system at test time substantially enhances robustness without compromising model utility. Comprehensive evaluations across key threat scenarios, including unlearning and jailbreaking, demonstrate the effectiveness of AegisLLM.
arXiv Detail & Related papers (2025-04-29T17:36:05Z)
- From Code Generation to Software Testing: AI Copilot with Context-Based RAG [8.28588489551341]
We propose a novel perspective on software testing by positing bug detection and coding with fewer bugs as two interconnected problems. We introduce Copilot for Testing, an automated testing system that synchronizes bug detection with updates. Our evaluation demonstrates a 31.2% improvement in bug detection accuracy, a 12.6% increase in critical test coverage, and a 10.5% higher user acceptance rate.
arXiv Detail & Related papers (2025-04-02T16:20:05Z)
- The Potential of LLMs in Automating Software Testing: From Generation to Reporting [0.0]
Manual testing, while effective, can be time-consuming and costly, leading to increased demand for automated methods. Recent advancements in Large Language Models (LLMs) have significantly influenced software engineering. This paper explores an agent-oriented approach to automated software testing, using LLMs to reduce human intervention and enhance testing efficiency.
arXiv Detail & Related papers (2024-12-31T02:06:46Z)
- AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z)
- Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement [62.94719119451089]
The Lingma SWE-GPT series learns from and simulates real-world code submission activities.
Lingma SWE-GPT 72B resolves 30.20% of GitHub issues, marking a significant improvement in automatic issue resolution.
arXiv Detail & Related papers (2024-11-01T14:27:16Z)
- Enriching Automatic Test Case Generation by Extracting Relevant Test Inputs from Bug Reports [10.587260348588064]
We introduce BRMiner, a novel approach that leverages Large Language Models (LLMs) in combination with traditional techniques to extract relevant inputs from bug reports. In this study, we evaluate BRMiner using the Defects4J benchmark and test generation tools such as EvoSuite and Randoop. Our results demonstrate that BRMiner achieves a Relevant Input Rate (RIR) of 60.03% and a Relevant Input Extraction Accuracy Rate (RIEAR) of 31.71%.
arXiv Detail & Related papers (2023-12-22T18:19:33Z)
- SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning [62.997667081978825]
Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems.
We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub.
The direct impact of this has been observed to be a reduction of 55% or more in testing hours for an undisclosed sports game title.
arXiv Detail & Related papers (2022-03-10T00:47:46Z)
- Using Sampling to Estimate and Improve Performance of Automated Scoring Systems with Guarantees [63.62448343531963]
We propose a combination of the existing paradigms, intelligently sampling the responses to be scored by humans.
We observe significant gains in accuracy (19.80% increase on average) and quadratic weighted kappa (QWK) (25.60% on average) with a relatively small human budget.
arXiv Detail & Related papers (2021-11-17T05:00:51Z)
- On Introducing Automatic Test Case Generation in Practice: A Success Story and Lessons Learned [7.717446055777458]
This paper reports our experience in introducing techniques for automatically generating system test suites in a medium-size company.
We describe the technical and organisational obstacles that we faced when introducing automatic test case generation.
We present ABT2.0, the test case generator that we developed.
arXiv Detail & Related papers (2021-02-28T11:31:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.