Mutation Testing framework for Machine Learning
- URL: http://arxiv.org/abs/2102.10961v1
- Date: Fri, 19 Feb 2021 18:02:31 GMT
- Title: Mutation Testing framework for Machine Learning
- Authors: Raju
- Abstract summary: Failure of Machine Learning models can lead to severe consequences in terms of loss of life or property.
Developers, scientists, and the ML community around the world must build a highly reliable test architecture for critical ML applications.
This article provides an insight into the journey of Machine Learning Systems (MLS) testing: its evolution, current paradigm, and future work.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This is an article, or technical note, intended to provide an
insight into the journey of Machine Learning Systems (MLS) testing: its
evolution, current paradigm, and future work. Machine Learning models are used
in critical applications such as healthcare, automobiles, air traffic control,
and share trading, and the failure of an ML model can lead to severe
consequences in terms of loss of life or property. To remediate this,
developers, scientists, and the ML community around the world must build a
highly reliable test architecture for critical ML applications. At the very
foundation layer, any test model must satisfy the core testing attributes, such
as test properties and their components. These attributes come from software
engineering, but they cannot be applied as-is to ML testing, and we will tell
you why.
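The paper's subject, mutation testing, transplants a classic software-testing idea into ML: deliberately inject small faults ("mutants") into a model or its training data, then check whether the test suite detects them; the fraction of mutants detected (the mutation score) measures the suite's strength. The paper does not prescribe an implementation, so the following is only a minimal sketch of that idea; the model, the mutation operator, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(X, y):
    """Stand-in model-under-test: ordinary least-squares regression."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def mutate_labels(y, rate=0.3):
    """Hypothetical mutation operator: corrupt a fraction of training labels."""
    y_mut = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y_mut[idx] += rng.normal(0.0, 5.0 * y.std(), size=len(idx))
    return y_mut

def test_kills(w, X_test, y_test, tol):
    """A test 'kills' a mutant when its error exceeds the tolerance set
    by the original, unmutated model."""
    return np.mean((X_test @ w - y_test) ** 2) > tol

# Toy data: y = 2*x0 - x1 + noise
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(0.0, 0.1, size=200)
X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]

tol = 2 * np.mean((X_te @ train(X_tr, y_tr) - y_te) ** 2)
killed = sum(test_kills(train(X_tr, mutate_labels(y_tr)), X_te, y_te, tol)
             for _ in range(20))
print(f"mutation score: {killed}/20")  # higher = stronger test suite
```

A surviving mutant (one the suite fails to kill) points at behaviour the tests never exercise, which is exactly the gap a reliable test architecture for critical ML applications must close.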
Related papers
- AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z)
- Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models [49.06068319380296]
We introduce context-aware testing (CAT) which uses context as an inductive bias to guide the search for meaningful model failures.
We instantiate the first CAT system, SMART Testing, which employs large language models to hypothesize relevant and likely failures.
arXiv Detail & Related papers (2024-10-31T15:06:16Z)
- Using Quality Attribute Scenarios for ML Model Test Case Generation [3.9111051646728527]
Current practice for machine learning (ML) model testing prioritizes testing for model performance.
This paper presents an approach based on quality attribute (QA) scenarios to elicit and define system- and model-relevant test cases.
The QA-based approach has been integrated into MLTE, a process and tool to support ML model test and evaluation.
arXiv Detail & Related papers (2024-06-12T18:26:42Z)
- On Extending the Automatic Test Markup Language (ATML) for Machine Learning [3.6458439734112695]
This paper examines the suitability of the IEEE Standard 1671 (IEEE Std 1671), known as the Automatic Test Markup Language (ATML), for machine learning (ML) application testing.
Through modeling various tests such as adversarial robustness and drift detection, this paper offers a framework adaptable to specific applications.
We conclude that ATML is a promising tool for effective, near real-time operational T&E of ML applications.
arXiv Detail & Related papers (2024-04-04T19:28:38Z)
- Learning continuous models for continuous physics [94.42705784823997]
We develop a test based on numerical analysis theory to validate machine learning models for science and engineering applications.
Our results illustrate how principled numerical analysis methods can be coupled with existing ML training/testing methodologies to validate models for science and engineering applications.
arXiv Detail & Related papers (2022-02-17T07:56:46Z)
- ML4ML: Automated Invariance Testing for Machine Learning Models [7.017320068977301]
We propose an automatic testing framework that is applicable to a variety of invariance qualities.
We employ machine learning techniques to analyse such "imagery" testing data automatically, hence facilitating ML4ML (a bare-bones invariance check is sketched after this list).
Our testing results show that the trained ML4ML assessors can perform such analytical tasks with sufficient accuracy.
arXiv Detail & Related papers (2021-09-27T10:23:44Z)
- Man versus Machine: AutoML and Human Experts' Role in Phishing Detection [4.124446337711138]
This paper compares the performance of six well-known, state-of-the-art AutoML frameworks on ten different phishing datasets.
Our results indicate that AutoML-based models are able to outperform manually developed machine learning models in complex classification tasks.
arXiv Detail & Related papers (2021-08-27T09:26:20Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
- Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and treated as a means to an end.
We have developed a proven systems engineering approach for machine learning development and deployment.
Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
- Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth-order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model based solely on its input-output responses (the query-only gradient trick is sketched after this list).
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
- Insights into Performance Fitness and Error Metrics for Machine Learning [1.827510863075184]
Machine learning (ML) is the field of training machines to achieve a high level of cognition and perform human-like analysis.
This paper examines a number of the most commonly used performance fitness and error metrics for regression and classification algorithms (a few standard ones are sketched after this list).
arXiv Detail & Related papers (2020-05-17T22:59:04Z)
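On the ML4ML entry: the framework's core question is whether a model's output stays stable under a family of transformations. The paper's actual method additionally trains ML assessors on "imagery" testing data; the fragment below is only a bare-bones version of the underlying invariance check, with hypothetical names.

```python
import numpy as np

def invariance_gap(predict, x, transforms):
    """Worst-case output drift of `predict` over a transformation family.
    A gap near zero is evidence of invariance to that family."""
    base = predict(x)
    return max(float(np.abs(predict(t(x)) - base).max()) for t in transforms)

# Toy model that is shift-invariant by construction: it only sees differences.
predict = lambda x: np.diff(x, axis=-1).sum(axis=-1)

x = np.random.default_rng(1).normal(size=(4, 8))
shifts = [lambda x, c=c: x + c for c in (0.5, 1.0, -2.0)]
print(invariance_gap(predict, x, shifts))  # ~0.0: invariant to constant shifts
```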
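On the BAR entry: reprogramming a model from input-output queries alone means no backpropagation is available, so gradients must be estimated from function values. Below is a generic random-direction finite-difference estimator of the kind zeroth-order methods rely on, not BAR's exact recipe; names and defaults are illustrative.

```python
import numpy as np

def zo_gradient(loss, x, q=100, mu=1e-4, rng=None):
    """Estimate grad(loss)(x) from function values only: average q
    directional finite differences along random unit vectors."""
    rng = rng or np.random.default_rng(0)
    f0, g = loss(x), np.zeros_like(x)
    for _ in range(q):
        u = rng.normal(size=x.shape)
        u /= np.linalg.norm(u)
        g += (loss(x + mu * u) - f0) / mu * u
    return g * x.size / q  # E[u u^T] = I/dim, so scale by dim to debias

# Sanity check against a known gradient: loss(x) = ||x||^2 has grad 2x.
x = np.array([1.0, -2.0, 0.5])
print(zo_gradient(lambda v: float(v @ v), x))  # roughly [2, -4, 1] (stochastic)
```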
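On the metrics entry: the most commonly used performance fitness and error measures reduce to a few lines each. These are the standard definitions, not anything specific to that paper.

```python
import numpy as np

def rmse(y, yhat):
    """Root mean squared error (regression)."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mae(y, yhat):
    """Mean absolute error (regression)."""
    return float(np.mean(np.abs(y - yhat)))

def f1(y, yhat):
    """F1 score for binary labels in {0, 1} (classification)."""
    tp = int(np.sum((yhat == 1) & (y == 1)))
    fp = int(np.sum((yhat == 1) & (y == 0)))
    fn = int(np.sum((yhat == 0) & (y == 1)))
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

y, yhat = np.array([1, 0, 1, 1, 0]), np.array([1, 0, 0, 1, 1])
print(f1(y, yhat))  # precision = recall = 2/3, so F1 = 2/3
```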