AI5GTest: AI-Driven Specification-Aware Automated Testing and Validation of 5G O-RAN Components
- URL: http://arxiv.org/abs/2506.10111v1
- Date: Wed, 11 Jun 2025 18:49:57 GMT
- Title: AI5GTest: AI-Driven Specification-Aware Automated Testing and Validation of 5G O-RAN Components
- Authors: Abiodun Ganiyu, Pranshav Gajjar, Vijay K Shah
- Abstract summary: We present AI5GTest -- an AI-powered, specification-aware testing framework designed to automate the validation of O-RAN components. It demonstrates a significant reduction in overall test execution time compared to traditional manual methods.
- Score: 1.1879716317856948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advent of Open Radio Access Networks (O-RAN) has transformed the telecommunications industry by promoting interoperability, vendor diversity, and rapid innovation. However, its disaggregated architecture introduces complex testing challenges, particularly in validating multi-vendor components against O-RAN ALLIANCE and 3GPP specifications. Existing frameworks, such as those provided by Open Testing and Integration Centres (OTICs), rely heavily on manual processes, are fragmented and prone to human error, leading to inconsistency and scalability issues. To address these limitations, we present AI5GTest -- an AI-powered, specification-aware testing framework designed to automate the validation of O-RAN components. AI5GTest leverages a cooperative Large Language Model (LLM) framework consisting of Gen-LLM, Val-LLM, and Debug-LLM. Gen-LLM automatically generates expected procedural flows for test cases based on 3GPP and O-RAN specifications, while Val-LLM cross-references signaling messages against these flows to validate compliance and detect deviations. If anomalies arise, Debug-LLM performs root cause analysis, providing insight into the failure cause. To enhance transparency and trustworthiness, AI5GTest incorporates a human-in-the-loop mechanism, where the Gen-LLM presents top-k relevant official specifications to the tester for approval before proceeding with validation. Evaluated using a range of test cases obtained from O-RAN TIFG and WG5-IOT test specifications, AI5GTest demonstrates a significant reduction in overall test execution time compared to traditional manual methods, while maintaining high validation accuracy.
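The three-stage flow the abstract describes (Gen-LLM proposes an expected procedural flow, a tester approves the top-k cited specifications, Val-LLM checks observed signaling against the flow, and Debug-LLM analyzes failures) can be sketched as below. This is a minimal illustration only: all function names, the canned RRC message flow, and the stubbed LLM behavior are hypothetical, not the paper's implementation.

```python
# Illustrative sketch of the cooperative LLM pipeline from the abstract:
# Gen-LLM -> human-in-the-loop approval -> Val-LLM -> Debug-LLM.
# All names and behaviors here are stand-ins, not the actual AI5GTest code.

def gen_llm(test_case, specs):
    """Stub: return an expected procedural flow and top-k supporting specs."""
    # A real implementation would query an LLM over 3GPP/O-RAN documents;
    # here we return a canned RRC setup flow for demonstration.
    flow = ["RRCSetupRequest", "RRCSetup", "RRCSetupComplete"]
    return flow, specs[:3]

def val_llm(expected_flow, observed_messages):
    """Cross-reference observed signaling against the expected flow."""
    deviations = [m for e, m in zip(expected_flow, observed_messages) if e != m]
    missing = expected_flow[len(observed_messages):]
    return deviations + missing  # empty list => compliant

def debug_llm(deviations):
    """Root-cause analysis stub: report the first deviating message."""
    return f"First deviation at message: {deviations[0]}" if deviations else None

def run_pipeline(test_case, specs, observed, tester_approves=lambda s: True):
    """Drive one test case through the three cooperating stages."""
    flow, top_k = gen_llm(test_case, specs)
    if not tester_approves(top_k):  # human-in-the-loop gate
        return "aborted", None
    deviations = val_llm(flow, observed)
    if deviations:
        return "fail", debug_llm(deviations)
    return "pass", None
```

Under these assumptions, a compliant capture yields `("pass", None)`, while a deviating capture returns `"fail"` together with the Debug-LLM diagnosis.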
Related papers
- AI/ML Life Cycle Management for Interoperable AI Native RAN [50.61227317567369]
Artificial intelligence (AI) and machine learning (ML) models are rapidly permeating the 5G Radio Access Network (RAN). These developments lay the foundation for AI-native transceivers as a key enabler for 6G.
arXiv Detail & Related papers (2025-07-24T16:04:59Z) - Impact of Code Context and Prompting Strategies on Automated Unit Test Generation with Modern General-Purpose Large Language Models [0.0]
Generative AI is gaining increasing attention in software engineering. Unit tests constitute the majority of test cases and are often schematic. This paper investigates the impact of code context and prompting strategies on the quality and adequacy of unit tests.
arXiv Detail & Related papers (2025-07-18T11:23:17Z) - ASSURE: Metamorphic Testing for AI-powered Browser Extensions [27.444724767037922]
Traditional browser extension testing approaches fail to address the non-deterministic behavior, context-sensitivity, and complex web environment integration inherent to AI-powered extensions. We present ASSURE, a modular automated testing framework specifically designed for AI-powered browser extensions. ASSURE achieves 6.4x improved testing throughput compared to manual approaches, detecting critical security vulnerabilities within 12.4 minutes on average.
arXiv Detail & Related papers (2025-07-07T09:11:16Z) - An Automated Blackbox Noncompliance Checker for QUIC Server Implementations [2.9248916859490173]
QUICtester is an automated approach for uncovering non-compliant behaviors in implementations of the ratified QUIC protocol (RFC 9000). We used QUICtester to analyze 186 learned models from 19 QUIC implementations under five security settings and discovered 55 implementation errors.
arXiv Detail & Related papers (2025-05-19T04:28:49Z) - The BrowserGym Ecosystem for Web Agent Research [151.90034093362343]
The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents. We propose an extended BrowserGym-based ecosystem for web agent research, which unifies existing benchmarks from the literature. We conduct the first large-scale, multi-benchmark web agent experiment and compare the performance of 6 state-of-the-art LLMs across 6 popular web agent benchmarks.
arXiv Detail & Related papers (2024-12-06T23:43:59Z) - Automated Proof Generation for Rust Code via Self-Evolution [69.25795662658356]
We introduce SAFE, a framework that overcomes the lack of human-written proof snippets to enable automated proof generation for Rust code. SAFE re-purposes the large number of synthesized incorrect proofs to train the self-debugging capability of the fine-tuned models. We achieve a 52.52% accuracy rate on a benchmark crafted by human experts, a significant leap over GPT-4o's performance of 14.39%.
arXiv Detail & Related papers (2024-10-21T08:15:45Z) - Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design [63.24275274981911]
Compound AI Systems consisting of many language model inference calls are increasingly employed.
In this work, we construct systems, which we call Networks of Networks (NoNs), organized around the distinction between generating a proposed answer and verifying its correctness.
We introduce a verifier-based judge NoN with K generators, an instantiation of "best-of-K" or "judge-based" compound AI systems.
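A minimal sketch of the "best-of-K" judge NoN described above, with plain Python callables standing in for the K generator calls and the verifier/judge model call (all names are illustrative; this is not the paper's code):

```python
# Best-of-K verifier-judge sketch: K generator calls propose candidate
# answers and a judge selects the highest-scoring one. `generate` and
# `judge` stand in for language model inference calls.

def best_of_k(question, generate, judge, k=5):
    """Draw K candidate answers, return the one the judge scores highest."""
    candidates = [generate(question) for _ in range(k)]
    return max(candidates, key=lambda ans: judge(question, ans))
```

With an LLM behind `generate` and a verifier prompt behind `judge`, this one-liner is the whole compound system: quality comes from sampling diversity plus verification rather than from any single call.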
arXiv Detail & Related papers (2024-07-23T20:40:37Z) - Cloud-based XAI Services for Assessing Open Repository Models Under Adversarial Attacks [7.500941533148728]
We propose a cloud-based service framework that encapsulates computing components and assessment tasks into pipelines.
We demonstrate the application of XAI services for assessing five quality attributes of AI models.
arXiv Detail & Related papers (2024-01-22T00:37:01Z) - Towards a Complete Metamorphic Testing Pipeline [56.75969180129005]
Metamorphic Testing (MT) addresses the test oracle problem by examining the relationships between input-output pairs in consecutive executions of the System Under Test (SUT).
These relations, known as Metamorphic Relations (MRs), specify the expected output changes resulting from specific input changes.
Our research aims to develop methods and tools that assist testers in generating MRs, defining constraints, and providing explainability for MR outcomes.
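As a concrete illustration of a Metamorphic Relation, the sketch below checks the relation sin(x) = sin(pi - x) against a system under test, needing no oracle for the exact output values. The helper name, trial count, and tolerance are assumptions for this example, not taken from the paper:

```python
# Toy metamorphic test: the MR sin(x) == sin(pi - x) is checked on random
# inputs, so only the relation between two executions of the SUT is needed,
# never the "correct" value of either output (the test oracle problem).
import math
import random

def check_mr(sut, n_trials=100, tol=1e-9):
    """Check the metamorphic relation sut(x) == sut(pi - x) on random inputs."""
    rng = random.Random(0)  # fixed seed for reproducibility
    for _ in range(n_trials):
        x = rng.uniform(-10.0, 10.0)
        if abs(sut(x) - sut(math.pi - x)) > tol:
            return False  # relation violated: likely a bug in the SUT
    return True
```

A correct `math.sin` passes every trial, while an SUT that does not satisfy the relation (e.g. `lambda x: x * x`) is flagged on the first violating input.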
arXiv Detail & Related papers (2023-09-30T10:49:22Z) - HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs) [0.09208007322096533]
We present HuntGPT, a specialized intrusion detection dashboard applying a Random Forest classifier.
The paper delves into the system's architecture, components, and technical accuracy, assessed through Certified Information Security Manager (CISM) Practice Exams.
The results demonstrate that conversational agents, supported by LLM and integrated with XAI, provide robust, explainable, and actionable AI solutions in intrusion detection.
arXiv Detail & Related papers (2023-09-27T20:58:13Z) - Smart Fuzzing of 5G Wireless Software Implementation [4.1439060468480005]
We introduce a comprehensive approach to bolstering the security, reliability, and comprehensibility of OpenAirInterface5G (OAI5G).
First, we employ AFL++, a powerful fuzzing tool, to rigorously fuzz-test OAI5G's configuration files. Second, we harness the capabilities of Large Language Models such as Google Bard to automatically decipher and document the meanings of the OAI5G parameters used in fuzzing.
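For illustration only, the loop below shows the bare idea behind configuration-file fuzzing: mutate a seed config and watch the target for crashes. AFL++ itself is a coverage-guided fuzzer and far more sophisticated than this; nothing here reflects the paper's actual harness, and all names are hypothetical.

```python
# Naive configuration-fuzzing sketch: flip random bytes of a seed config
# and collect every mutated input that makes the target raise. Real
# fuzzers like AFL++ additionally use coverage feedback to guide mutation.
import random

def mutate(config_bytes, rng):
    """Flip one random byte of the seed configuration."""
    data = bytearray(config_bytes)
    i = rng.randrange(len(data))
    data[i] ^= rng.randrange(1, 256)  # nonzero XOR always changes the byte
    return bytes(data)

def fuzz(seed_config, target, iterations=1000, rng_seed=0):
    """Feed mutated configs to `target`; collect inputs that raise."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed_config, rng)
        try:
            target(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes
```

Pointing `target` at a config parser (here, any callable that raises on malformed input) yields the crashing inputs that a triage step or, as in the paper's second stage, an LLM-generated parameter glossary would then help explain.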
arXiv Detail & Related papers (2023-09-22T16:45:42Z) - Task-Oriented Over-the-Air Computation for Multi-Device Edge AI [57.50247872182593]
6G networks supporting edge AI feature task-oriented techniques that focus on the effective and efficient execution of AI tasks.
A task-oriented over-the-air computation (AirComp) scheme is proposed in this paper for a multi-device split-inference system.
arXiv Detail & Related papers (2022-11-02T16:35:14Z) - Auditing AI models for Verified Deployment under Semantic Specifications [65.12401653917838]
AuditAI bridges the gap between interpretable formal verification and scalability.
We show how AuditAI allows us to obtain controlled variations for verification and certified training while addressing the limitations of verifying using only pixel-space perturbations.
arXiv Detail & Related papers (2021-09-25T22:53:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.