MLTEing Models: Negotiating, Evaluating, and Documenting Model and
System Qualities
- URL: http://arxiv.org/abs/2303.01998v1
- Date: Fri, 3 Mar 2023 15:10:38 GMT
- Authors: Katherine R. Maffey, Kyle Dotterrer, Jennifer Niemann, Iain
Cruickshank, Grace A. Lewis, Christian Kästner
- Abstract summary: MLTE is a framework and implementation to evaluate machine learning models and systems.
It compiles state-of-the-art evaluation techniques into an organizational process.
MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements.
- Score: 1.1352560842946413
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many organizations seek to ensure that machine learning (ML) and artificial
intelligence (AI) systems work as intended in production but currently do not
have a cohesive methodology in place to do so. To fill this gap, we propose
MLTE (Machine Learning Test and Evaluation, colloquially referred to as
"melt"), a framework and implementation to evaluate ML models and systems. The
framework compiles state-of-the-art evaluation techniques into an
organizational process for interdisciplinary teams, including model developers,
software engineers, system owners, and other stakeholders. MLTE tooling
supports this process by providing a domain-specific language that teams can
use to express model requirements, an infrastructure to define, generate, and
collect ML evaluation metrics, and the means to communicate results.
Related papers
- A Survey of Small Language Models [104.80308007044634]
Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources.
We present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques.
arXiv Detail & Related papers (2024-10-25T23:52:28Z)
- Leveraging Large Language Models for Enhanced Process Model Comprehension [33.803742664323856]
In Business Process Management (BPM), effectively comprehending process models is crucial yet poses significant challenges.
This paper introduces a novel framework utilizing the advanced capabilities of Large Language Models (LLMs) to enhance the interpretability of complex process models.
arXiv Detail & Related papers (2024-08-08T13:12:46Z)
- Benchmarks as Microscopes: A Call for Model Metrology [76.64402390208576]
Modern language models (LMs) pose a new challenge in capability assessment.
To be confident in our metrics, we need a new discipline of model metrology.
arXiv Detail & Related papers (2024-07-22T17:52:12Z)
- A Framework to Model ML Engineering Processes [1.9744907811058787]
Development of Machine Learning (ML) based systems is complex and requires multidisciplinary teams with diverse skill sets.
Current process modeling languages are not suitable for describing the development of such systems.
We introduce a framework for modeling ML-based software development processes, built around a domain-specific language.
arXiv Detail & Related papers (2024-04-29T09:17:36Z)
- Process Modeling With Large Language Models [42.0652924091318]
This paper explores the integration of Large Language Models (LLMs) into process modeling.
We propose a framework that leverages LLMs for the automated generation and iterative refinement of process models.
Preliminary results demonstrate the framework's ability to streamline process modeling tasks.
arXiv Detail & Related papers (2024-03-12T11:27:47Z)
- Machine Learning-Enabled Software and System Architecture Frameworks [48.87872564630711]
The stakeholders with data science and Machine Learning related concerns, such as data scientists and data engineers, are yet to be included in existing architecture frameworks.
We surveyed 61 subject matter experts from over 25 organizations in 10 countries.
arXiv Detail & Related papers (2023-08-09T21:54:34Z)
- Benchmarking Automated Machine Learning Methods for Price Forecasting
Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions.
Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part.
In a case study on the industrial use case of price forecasting, we show that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
- MDE for Machine Learning-Enabled Software Systems: A Case Study and
Comparison of MontiAnna & ML-Quadrat [5.839906946900443]
We propose to adopt the MDE paradigm for the development of Machine Learning-enabled software systems with a focus on the Internet of Things (IoT) domain.
We illustrate how two state-of-the-art open-source modeling tools, namely MontiAnna and ML-Quadrat can be used for this purpose as demonstrated through a case study.
arXiv Detail & Related papers (2022-09-15T13:21:16Z)
- Data Analytics and Machine Learning Methods, Techniques and Tool for
Model-Driven Engineering of Smart IoT Services [0.0]
This dissertation proposes a novel approach to enhance the development of smart services for the Internet of Things (IoT) and smart Cyber-Physical Systems (CPS).
The proposed approach offers abstraction and automation to the software engineering processes, as well as the Data Analytics (DA) and Machine Learning (ML) practices.
We implement and validate the proposed approach by extending an open source modeling tool, called ThingML.
arXiv Detail & Related papers (2021-02-12T11:09:54Z)
- Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end.
We have developed a proven systems engineering approach for machine learning development and deployment.
Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
- Technology Readiness Levels for AI & ML [79.22051549519989]
Development of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end.
Engineering systems follow well-defined processes and testing standards to streamline development for high-quality, reliable results.
We propose a proven systems engineering approach for machine learning development and deployment.
arXiv Detail & Related papers (2020-06-21T17:14:34Z)