Related papers: A Defect Classification Framework for AI-Based Software Systems (AI-ODC)

URL: http://arxiv.org/abs/2508.17900v1
Date: Mon, 25 Aug 2025 11:15:31 GMT
Title: A Defect Classification Framework for AI-Based Software Systems (AI-ODC)
Authors: Mohammed O. Alannsary
Abstract summary: This paper proposes a framework inspired by the Orthogonal Defect Classification (ODC) paradigm. The framework was adapted to accommodate the Data, Learning, and Thinking aspects of AI systems.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Artificial Intelligence has gained a lot of attention recently and has been utilized in several fields, ranging from daily life activities, such as responding to emails and scheduling appointments, to manufacturing and automating work activities. Artificial Intelligence systems are mainly implemented as software solutions, so it is essential to discover and remove software defects to assure their quality through defect analysis, one of the major activities that contribute to software quality. Despite the proliferation of AI-based systems, current defect analysis models fail to capture their unique attributes. This paper proposes a framework that is inspired by the Orthogonal Defect Classification (ODC) paradigm and enables defect analysis of Artificial Intelligence systems while recognizing their special attributes and characteristics. The study demonstrates the feasibility of modifying ODC to classify the defects of AI systems. ODC was adjusted to accommodate the Data, Learning, and Thinking aspects of AI systems, which are newly introduced classification dimensions. This adjustment involved the introduction of an additional attribute to the ODC attributes, the incorporation of a new severity level, and the substitution of impact areas with characteristics pertinent to AI systems. The framework was showcased by applying it to a publicly available Machine Learning bug dataset, with results analyzed through one-way and two-way analysis. The case study indicated that defects occurring during the Learning phase were the most prevalent and were significantly linked to high-severity classifications. In contrast, defects identified in the Thinking phase had a disproportionate effect on trustworthiness and accuracy. These findings illustrate AI-ODC's capability to identify high-risk defect categories and inform focused quality assurance measures.
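
The one-way and two-way analysis mentioned in the abstract can be illustrated with a minimal sketch. It assumes each classified defect is reduced to a (phase, severity) pair; the records, the attribute choice, and the helper names `one_way` and `two_way` are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter

# Hypothetical defect records as (phase, severity) pairs.
# The values are illustrative only and do not come from the paper's dataset.
defects = [
    ("Learning", "High"),
    ("Learning", "High"),
    ("Learning", "Low"),
    ("Data", "Low"),
    ("Thinking", "High"),
]


def one_way(records, index):
    """Count defects along a single attribute (one-way analysis)."""
    return Counter(record[index] for record in records)


def two_way(records, row_index, col_index):
    """Count defects for each pair of attribute values (two-way analysis)."""
    return Counter((record[row_index], record[col_index]) for record in records)


phase_counts = one_way(defects, 0)
phase_by_severity = two_way(defects, 0, 1)

print(phase_counts.most_common())
print(phase_by_severity[("Learning", "High")])
```

A one-way count shows which value of a single attribute dominates, while the two-way count exposes combinations, such as a phase that is disproportionately associated with one severity level.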

Related papers

Barbarians at the Gate: How AI is Upending Systems Research [58.95406995634148]
We argue that systems research, long focused on designing and evaluating new performance-oriented algorithms, is particularly well-suited for AI-driven solution discovery. We term this approach AI-Driven Research for Systems (ADRS), which iteratively generates, evaluates, and refines solutions. Our results highlight both the disruptive potential and the urgent need to adapt systems research practices in the age of AI.
arXiv Detail & Related papers (2025-10-07T17:49:24Z)
AI Agentic Vulnerability Injection And Transformation with Optimized Reasoning [2.918225266151982]
We present AVIATOR, the first AI-agentic vulnerability injection workflow. It automatically injects realistic, category-specific vulnerabilities for high-fidelity, diverse, large-scale vulnerability dataset generation. It combines semantic analysis, injection synthesis enhanced with LoRA-based fine-tuning and Retrieval-Augmented Generation, as well as post-injection validation via static analysis and LLM-based discriminators.
arXiv Detail & Related papers (2025-08-28T14:59:39Z)
Benchmarking Image Perturbations for Testing Automated Driving Assistance Systems [1.9526430269580959]
Advanced Driver Assistance Systems (ADAS) based on deep neural networks (DNNs) are widely used in autonomous vehicles for critical perception tasks. These systems are highly sensitive to input variations, such as noise and changes in lighting, which can compromise their effectiveness and potentially lead to safety-critical failures. This study offers a comprehensive empirical evaluation of image perturbations to validate and improve the robustness and generalization of ADAS perception systems.
arXiv Detail & Related papers (2025-01-21T16:40:44Z)
Bringing Order Amidst Chaos: On the Role of Artificial Intelligence in Secure Software Engineering [0.0]
The ever-evolving technological landscape offers both opportunities and threats, creating a dynamic space where chaos and order compete. Secure software engineering (SSE) must continuously address vulnerabilities that endanger software systems. This thesis seeks to bring order to the chaos in SSE by addressing domain-specific differences that impact AI accuracy.
arXiv Detail & Related papers (2025-01-09T11:38:58Z)
A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy. We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods. By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z)
A Survey on Failure Analysis and Fault Injection in AI Systems [28.30817443151044]
The complexity of AI systems has exposed their vulnerabilities, necessitating robust methods for failure analysis (FA) and fault injection (FI) to ensure resilience and reliability. This study fills this gap by presenting a detailed survey of existing FA and FI approaches across six layers of AI systems. Our findings reveal a taxonomy of AI system failures, assess the capabilities of existing FI tools, and highlight discrepancies between real-world and simulated failures.
arXiv Detail & Related papers (2024-06-28T00:32:03Z)
Cloud-based XAI Services for Assessing Open Repository Models Under Adversarial Attacks [7.500941533148728]
We propose a cloud-based service framework that encapsulates computing components and assessment tasks into pipelines. We demonstrate the application of XAI services for assessing five quality attributes of AI models.
arXiv Detail & Related papers (2024-01-22T00:37:01Z)
Progressing from Anomaly Detection to Automated Log Labeling and Pioneering Root Cause Analysis [53.24804865821692]
This study introduces a taxonomy for log anomalies and explores automated data labeling to mitigate labeling challenges. The study envisions a future where root cause analysis follows anomaly detection, unraveling the underlying triggers of anomalies.
arXiv Detail & Related papers (2023-12-22T15:04:20Z)
Position: AI Evaluation Should Learn from How We Test Humans [65.36614996495983]
We argue that psychometrics, a theory originating in the 20th century for human assessment, could be a powerful solution to the challenges in today's AI evaluations.
arXiv Detail & Related papers (2023-06-18T09:54:33Z)
Anomaly Detection Based on Selection and Weighting in Latent Space [73.01328671569759]
We propose a novel selection-and-weighting-based anomaly detection framework called SWAD. Experiments on both benchmark and real-world datasets have shown the effectiveness and superiority of SWAD.
arXiv Detail & Related papers (2021-03-08T10:56:38Z)
TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance. One direction aims at the recognition of recurring anomaly types to enable remediation automation. We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
Towards Characterizing Adversarial Defects of Deep Learning Software from the Lens of Uncertainty [30.97582874240214]
Adversarial examples (AEs) represent a typical and important type of defect that urgently needs to be addressed. The intrinsically uncertain nature of deep learning decisions can be a fundamental reason for their incorrect behavior. We identify and categorize the uncertainty patterns of benign examples (BEs) and AEs, and find that while BEs and AEs generated by existing methods do follow common uncertainty patterns, some other uncertainty patterns are largely missed.
arXiv Detail & Related papers (2020-04-24T07:29:47Z)
Bias in Multimodal AI: Testbed for Fair Automatic Recruitment [73.85525896663371]
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data. We train automatic recruitment algorithms using a set of multimodal synthetic profiles consciously scored with gender and racial biases. Our methodology and results show how to generate fairer AI-based tools in general, and in particular fairer automated recruitment systems.
arXiv Detail & Related papers (2020-04-15T15:58:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.