An investigation of challenges encountered when specifying training data
and runtime monitors for safety critical ML applications
- URL: http://arxiv.org/abs/2301.13476v1
- Date: Tue, 31 Jan 2023 08:56:40 GMT
- Title: An investigation of challenges encountered when specifying training data
and runtime monitors for safety critical ML applications
- Authors: Hans-Martin Heyn and Eric Knauss and Iswarya Malleswaran and Shruthi
Dinakaran
- Abstract summary: The development and operation of critical software that contains machine learning (ML) models requires diligence and established processes.
We see major uncertainty in how to specify training data and runtime monitoring for critical ML models.
- Score: 5.553426007439564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Context and motivation: The development and operation of critical software
that contains machine learning (ML) models requires diligence and established
processes. In particular, the training data used during the development of ML
models has a major influence on the later behaviour of the system. Runtime
monitors are used to provide guarantees for that behaviour. Question / problem:
We see major uncertainty in how to specify training data and runtime monitoring
for critical ML models, and thereby the final functionality of the
system. In this interview-based study we investigate the challenges underlying
these difficulties. Principal ideas/results: Based on ten interviews with
for these difficulties. Principal ideas/results: Based on ten interviews with
practitioners who develop ML models for critical applications in the automotive
and telecommunication sectors, we identified 17 underlying challenges in 6
challenge groups related to specifying training data and
runtime monitoring. Contribution: The article provides a list of the identified
underlying challenges related to the difficulties practitioners experience when
specifying training data and runtime monitoring for ML models. Furthermore,
interconnections between the challenges were found, and based on these
connections, recommendations are proposed to address the root causes of the
challenges.
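For context on the kind of artefact the interviews revolve around: a runtime monitor typically wraps an ML model and rejects inputs or outputs that violate a specified envelope. The sketch below is a minimal illustration only, not a method from the paper; the envelope bounds, the confidence threshold, and all names are hypothetical assumptions.

```python
import numpy as np

class RuntimeMonitor:
    """Minimal sketch of a runtime monitor wrapping an ML classifier.

    It rejects inputs outside a specified operational envelope and
    predictions whose confidence falls below a threshold. The bounds
    and threshold here are hypothetical placeholders.
    """

    def __init__(self, model, input_low, input_high, min_confidence=0.9):
        self.model = model                        # any classifier with predict_proba
        self.input_low = np.asarray(input_low)    # per-feature lower bounds
        self.input_high = np.asarray(input_high)  # per-feature upper bounds
        self.min_confidence = min_confidence

    def predict(self, x):
        """Return (prediction, status); prediction is None if rejected."""
        x = np.asarray(x, dtype=float)
        # Input check: reject samples outside the specified data envelope.
        if np.any(x < self.input_low) or np.any(x > self.input_high):
            return None, "rejected: input outside operational envelope"
        # Output check: reject low-confidence predictions.
        probs = self.model.predict_proba(x.reshape(1, -1))[0]
        if probs.max() < self.min_confidence:
            return None, "rejected: confidence below threshold"
        return int(np.argmax(probs)), "accepted"

# Example usage with a hypothetical trained scikit-learn classifier `clf`:
# monitor = RuntimeMonitor(clf, input_low=X_train.min(0), input_high=X_train.max(0))
# label, status = monitor.predict(x_new)
```

In a safety-critical setting the envelope and threshold would have to be derived from the training data specification and the safety requirements, which is exactly the specification task the interviewees found difficult.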
Related papers
- Guiding Through Complexity: What Makes Good Supervision for Hard Reasoning Tasks? [74.88417042125985]
We investigate various data-driven strategies that offer supervision data at different quality levels upon tasks of varying complexity.
We find that even when the outcome error rate for hard task supervision is high, training on such data can outperform perfectly correct supervision on easier subtasks.
Our results also reveal that supplementing hard task supervision with the corresponding subtask supervision can yield notable performance improvements.
arXiv Detail & Related papers (2024-10-27T17:55:27Z)
- Federated Large Language Models: Current Progress and Future Directions [63.68614548512534]
This paper surveys Federated learning for LLMs (FedLLM), highlighting recent advances and future directions.
We focus on two key aspects: fine-tuning and prompt learning in a federated setting, discussing existing work and associated research challenges.
arXiv Detail & Related papers (2024-09-24T04:14:33Z)
- Maintainability Challenges in ML: A Systematic Literature Review [5.669063174637433]
This study aims to identify and synthesise the maintainability challenges in different stages of the Machine Learning workflow.
We screened more than 13,000 papers, then selected and qualitatively analysed 56 of them.
arXiv Detail & Related papers (2024-08-17T13:24:15Z)
- A Review of the Challenges with Massive Web-mined Corpora Used in Large Language Models Pre-Training [0.0]
This review identifies key challenges in this domain, including noise (irrelevant or misleading information), duplicated content, low-quality or incorrect information, biases, and the inclusion of sensitive or personal information in web-mined corpora.
Through an examination of current methodologies for data cleaning, pre-processing, bias detection and mitigation, we highlight the gaps in existing approaches and suggest directions for future research.
arXiv Detail & Related papers (2024-07-10T13:09:23Z)
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose MiDl, a novel approach that addresses this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- ML-Enabled Systems Model Deployment and Monitoring: Status Quo and Problems [7.280443300122617]
We conducted an international survey to gather practitioner insights on how ML-enabled systems are engineered.
We analyzed the status quo and problems reported for the model deployment and monitoring phases.
Our results help provide a better understanding of the adopted practices and problems in practice.
arXiv Detail & Related papers (2024-02-08T00:25:30Z)
- Competition-Level Problems are Effective LLM Evaluators [121.15880285283116]
This paper aims to evaluate the reasoning capacities of large language models (LLMs) in solving recent programming problems in Codeforces.
We first provide a comprehensive evaluation of GPT-4's perceived zero-shot performance on this task, considering various aspects such as problems' release time, difficulties, and types of errors encountered.
Surprisingly, the perceived performance of GPT-4 has experienced a cliff-like decline on problems released after September 2021, consistently across all difficulties and types of problems.
arXiv Detail & Related papers (2023-12-04T18:58:57Z)
- Towards leveraging LLMs for Conditional QA [1.9649272351760063]
This study delves into the capabilities and limitations of Large Language Models (LLMs) in the challenging domain of conditional question-answering.
Our findings reveal that fine-tuned LLMs can surpass the state-of-the-art (SOTA) performance in some cases, even without fully encoding all input context.
These models encounter challenges in extractive question answering, where they lag behind the SOTA by over 10 points, and in mitigating the risk of injecting false information.
arXiv Detail & Related papers (2023-12-02T14:02:52Z)
- Contrastive Example-Based Control [163.6482792040079]
We propose a method for offline, example-based control that learns an implicit model of multi-step transitions, rather than a reward function.
Across a range of state-based and image-based offline control tasks, our method outperforms baselines that use learned reward functions.
arXiv Detail & Related papers (2023-07-24T19:43:22Z)
- Causal Scene BERT: Improving object detection by searching for challenging groups of data [125.40669814080047]
Computer vision applications rely on learning-based perception modules parameterized with neural networks for tasks like object detection.
These modules frequently have low expected error overall but high error on atypical groups of data due to biases inherent in the training process.
Our main contribution is a pseudo-automatic method to discover such groups in foresight by performing causal interventions on simulated scenes.
arXiv Detail & Related papers (2022-02-08T05:14:16Z)
- Automatic Feasibility Study via Data Quality Analysis for ML: A Case-Study on Label Noise [21.491392581672198]
We present Snoopy, with the goal of supporting data scientists and machine learning engineers performing a systematic and theoretically founded feasibility study.
We approach this problem by estimating the irreducible error of the underlying task, also known as the Bayes error rate (BER).
We demonstrate in end-to-end experiments how users are able to save substantial labeling time and monetary efforts.
arXiv Detail & Related papers (2020-10-16T14:21:19Z)
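As a generic illustration of BER estimation (this sketch is not Snoopy; the dataset and all names are illustrative assumptions), the classical Cover-Hart result links the asymptotic 1-nearest-neighbour error to the Bayes error rate for binary classification, giving computable lower and upper bounds:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_predict

# Illustrative binary task with injected label noise; in a real
# feasibility study this would be the project's own dataset.
X, y = make_classification(n_samples=5000, n_features=20, flip_y=0.1,
                           random_state=0)

# Estimate the 1-NN error with cross-validation.
y_pred = cross_val_predict(KNeighborsClassifier(n_neighbors=1), X, y, cv=5)
err_1nn = float(np.mean(y_pred != y))

# Asymptotic Cover-Hart bounds for binary classification:
#   BER <= err_1NN <= 2 * BER * (1 - BER)
upper = err_1nn
lower = 0.5 * (1.0 - np.sqrt(max(0.0, 1.0 - 2.0 * err_1nn)))
print(f"1-NN error: {err_1nn:.3f}  ->  BER in [{lower:.3f}, {upper:.3f}]")
```

If the estimated lower bound already exceeds the application's target error, the feasibility study can stop before any labelling or training budget is spent, which is the kind of saving the paper reports.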