On Extending the Automatic Test Markup Language (ATML) for Machine Learning
- URL: http://arxiv.org/abs/2404.03769v1
- Date: Thu, 4 Apr 2024 19:28:38 GMT
- Title: On Extending the Automatic Test Markup Language (ATML) for Machine Learning
- Authors: Tyler Cody, Bingtong Li, Peter A. Beling
- Abstract summary: This paper examines the suitability of the IEEE Standard 1671 (IEEE Std 1671), known as the Automatic Test Markup Language (ATML), for machine learning (ML) application testing.
Through modeling various tests such as adversarial robustness and drift detection, this paper offers a framework adaptable to specific applications.
We conclude that ATML is a promising tool for effective, near real-time operational T&E of ML applications.
- Score: 3.6458439734112695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the urgent need for messaging standards in the operational test and evaluation (T&E) of machine learning (ML) applications, particularly in edge ML applications embedded in systems like robots, satellites, and unmanned vehicles. It examines the suitability of the IEEE Standard 1671 (IEEE Std 1671), known as the Automatic Test Markup Language (ATML), an XML-based standard originally developed for electronic systems, for ML application testing. The paper explores extending IEEE Std 1671 to encompass the unique challenges of ML applications, including the use of datasets and dependencies on software. Through modeling various tests such as adversarial robustness and drift detection, this paper offers a framework adaptable to specific applications, suggesting that minor modifications to ATML might suffice to address the novelties of ML. This paper differentiates ATML's focus on testing from other ML standards like Predictive Model Markup Language (PMML) or Open Neural Network Exchange (ONNX), which concentrate on ML model specification. We conclude that ATML is a promising tool for effective, near real-time operational T&E of ML applications, an essential aspect of AI lifecycle management, safety, and governance.
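To make the proposed extension concrete, the following is a minimal Python sketch of how a drift-detection test for an ML model might be expressed as ATML-style XML using only the standard library. The element and attribute names (TestDescription, UnitUnderTest, Dataset, Limit) and the file paths are illustrative assumptions for this sketch, not the actual IEEE Std 1671 schema or the authors' extension.

```python
# Illustrative sketch only: element and attribute names are hypothetical,
# not taken from the IEEE Std 1671 schema or the paper's proposed extension.
import xml.etree.ElementTree as ET


def build_drift_test_description(model_uri: str, dataset_uri: str, threshold: float) -> str:
    """Assemble an ATML-style XML description of a data-drift test for an ML model."""
    root = ET.Element("TestDescription", attrib={"name": "DriftDetectionTest"})

    # The unit under test is the deployed ML model artifact.
    uut = ET.SubElement(root, "UnitUnderTest")
    ET.SubElement(uut, "ModelArtifact", attrib={"uri": model_uri})

    # Dataset dependency: one of the ML-specific additions the paper discusses.
    deps = ET.SubElement(root, "Dependencies")
    ET.SubElement(deps, "Dataset", attrib={"uri": dataset_uri, "role": "reference"})

    # Test action and pass/fail limit (drift statistic must stay at or below the threshold).
    action = ET.SubElement(root, "Action", attrib={"name": "PopulationStabilityIndex"})
    limit = ET.SubElement(action, "Limit", attrib={"comparator": "LE"})
    limit.text = str(threshold)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_drift_test_description("models/classifier.onnx", "data/train_reference.csv", 0.2))
```

The same structure could describe an adversarial-robustness test by swapping out the action and limit, which is the kind of minor modification the abstract suggests ATML can absorb.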
Related papers
- AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z) - Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML)
VML constrains the parameter space to be human-interpretable natural language.
We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z) - A Cyber Manufacturing IoT System for Adaptive Machine Learning Model Deployment by Interactive Causality Enabled Self-Labeling [0.0]
This paper proposes the AdaptIoT system, comprising an end-to-end data streaming pipeline, ML service integration, and an automated self-labeling service.
The self-labeling service consists of causal knowledge bases and automated full-cycle self-labeling to adapt multiple ML models simultaneously.
A field demonstration of a self-labeling adaptive ML application is conducted with a makerspace and shows reliable performance.
arXiv Detail & Related papers (2024-04-09T03:10:45Z) - SWITCH: An Exemplar for Evaluating Self-Adaptive ML-Enabled Systems [1.2277343096128712]
Addressing runtime uncertainties in Machine Learning-Enabled Systems (MLS) is crucial for maintaining Quality of Service (QoS).
The Machine Learning Model Balancer is a concept that addresses these uncertainties by facilitating dynamic ML model switching.
This paper introduces SWITCH, an exemplar developed to enhance self-adaptive capabilities in such systems.
arXiv Detail & Related papers (2024-02-09T11:56:44Z) - A Multivocal Literature Review on the Benefits and Limitations of Automated Machine Learning Tools [9.69672653683112]
We conducted a multivocal literature review, which allowed us to identify 54 sources from the academic literature and 108 sources from the grey literature reporting on AutoML benefits and limitations.
Concerning the benefits, we highlight that AutoML tools can help streamline the core steps of ML.
We highlight several limitations that may represent obstacles to the widespread adoption of AutoML.
arXiv Detail & Related papers (2024-01-21T01:39:39Z) - LM-Polygraph: Uncertainty Estimation for Language Models [71.21409522341482]
Uncertainty estimation (UE) methods are one path to safer, more responsible, and more effective use of large language models (LLMs).
We introduce LM-Polygraph, a framework with implementations of a battery of state-of-the-art UE methods for LLMs in text generation tasks, with unified program interfaces in Python.
It introduces an extendable benchmark for consistent evaluation of UE techniques by researchers, and a demo web application that enriches the standard chat dialog with confidence scores.
arXiv Detail & Related papers (2023-11-13T15:08:59Z) - Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MUST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z) - Vulnerability of Machine Learning Approaches Applied in IoT-based Smart Grid: A Review [51.31851488650698]
Machine learning (ML) is increasingly used in internet-of-things (IoT)-based smart grids.
Adversarial distortion injected into the power signal can greatly affect the system's normal control and operation.
It is imperative to conduct vulnerability assessment for MLsgAPPs applied in the context of safety-critical power systems.
arXiv Detail & Related papers (2023-08-30T03:29:26Z) - Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions.
Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part.
We show in a case study on the industrial use case of price forecasting that domain knowledge combined with AutoML can reduce the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z) - MDE for Machine Learning-Enabled Software Systems: A Case Study and Comparison of MontiAnna & ML-Quadrat [5.839906946900443]
We propose to adopt the MDE paradigm for the development of Machine Learning-enabled software systems with a focus on the Internet of Things (IoT) domain.
We illustrate how two state-of-the-art open-source modeling tools, namely MontiAnna and ML-Quadrat, can be used for this purpose, as demonstrated through a case study.
arXiv Detail & Related papers (2022-09-15T13:21:16Z) - Mutation Testing framework for Machine Learning [0.0]
Failure of machine learning models can lead to severe consequences, including loss of life or property.
Developers, scientists, and the ML community around the world must build a highly reliable test architecture for critical ML applications.
This article provides an insight into the journey of Machine Learning Systems (MLS) testing: its evolution, current paradigm, and future work.
arXiv Detail & Related papers (2021-02-19T18:02:31Z)