Addressing Quality Challenges in Deep Learning: The Role of MLOps and Domain Knowledge
- URL: http://arxiv.org/abs/2501.08402v2
- Date: Fri, 31 Jan 2025 16:47:16 GMT
- Title: Addressing Quality Challenges in Deep Learning: The Role of MLOps and Domain Knowledge
- Authors: Santiago del Rey, Adrià Medina, Xavier Franch, Silverio Martínez-Fernández
- Abstract summary: Deep learning (DL) systems present unique challenges in software engineering, especially concerning quality attributes like correctness and resource efficiency.
This experience paper explores the role of MLOps practices in creating transparent and reproducible experimentation environments.
We report on experiences addressing the quality challenges by embedding domain knowledge into the design of a DL model and its integration within a larger system.
- Score: 5.190998244098203
- Abstract: Deep learning (DL) systems present unique challenges in software engineering, especially concerning quality attributes like correctness and resource efficiency. While DL models excel in specific tasks, engineering DL systems is still essential. The effort, cost, and potential diminishing returns of continual improvements must be carefully evaluated, as software engineers often face the critical decision of when to stop refining a system relative to its quality attributes. This experience paper explores the role of MLOps practices -- such as monitoring and experiment tracking -- in creating transparent and reproducible experimentation environments that enable teams to assess and justify the impact of design decisions on quality attributes. Furthermore, we report on experiences addressing the quality challenges by embedding domain knowledge into the design of a DL model and its integration within a larger system. The findings offer actionable insights into the benefits of domain knowledge and MLOps and the strategic consideration of when to limit further optimizations in DL projects to maximize overall system quality and reliability.
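To make the experiment-tracking practice concrete, here is a minimal sketch of how a run assessing a design decision might be logged. The paper does not prescribe a specific tool; MLflow is assumed here, and the parameter and metric names are illustrative.

```python
# Minimal experiment-tracking sketch (illustrative; the paper does not
# prescribe MLflow or these parameter/metric names).
import mlflow

def run_experiment(model_variant: str, learning_rate: float) -> None:
    with mlflow.start_run(run_name=model_variant):
        # Record the design decision under evaluation.
        mlflow.log_param("model_variant", model_variant)
        mlflow.log_param("learning_rate", learning_rate)

        # ... train and evaluate the DL model here (omitted) ...
        accuracy, inference_ms = 0.0, 0.0  # placeholders for real measurements

        # Log the quality attributes of interest: correctness and
        # resource efficiency.
        mlflow.log_metric("accuracy", accuracy)
        mlflow.log_metric("inference_latency_ms", inference_ms)

run_experiment("baseline_cnn", learning_rate=1e-3)
```

Runs logged this way can be compared side by side, giving teams the evidence the paper calls for when deciding whether further refinement still pays off.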
Related papers
- Enhancing Machine Learning Performance through Intelligent Data Quality Assessment: An Unsupervised Data-centric Framework [0.0]
Poor data quality limits the benefits that Machine Learning (ML) can deliver.
We propose an intelligent data-centric evaluation framework that can identify high-quality data and improve the performance of an ML system; a toy filtering sketch follows this entry.
arXiv Detail & Related papers (2025-02-18T18:01:36Z)
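As a toy illustration of the unsupervised, data-centric idea above (the paper's actual framework is not detailed in this summary; the IsolationForest heuristic and the 5% contamination rate below are assumptions):

```python
# Hypothetical unsupervised data-quality filter; a stand-in for the
# paper's framework, which is more elaborate.
import numpy as np
from sklearn.ensemble import IsolationForest

def filter_low_quality(X: np.ndarray, contamination: float = 0.05) -> np.ndarray:
    """Keep only rows an unsupervised outlier detector considers inliers."""
    detector = IsolationForest(contamination=contamination, random_state=0)
    labels = detector.fit_predict(X)  # +1 = inlier, -1 = outlier
    return X[labels == 1]

X = np.random.default_rng(0).normal(size=(1000, 8))
X_clean = filter_low_quality(X)
print(f"kept {len(X_clean)} of {len(X)} rows")
```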
- Dynamic Knowledge Integration for Enhanced Vision-Language Reasoning [0.0]
We propose Adaptive Knowledge-Guided Pretraining for Large Vision-Language Models (AKGP-LVLM).
It incorporates structured and unstructured knowledge into LVLMs during pretraining and fine-tuning.
We evaluate our method on four benchmark datasets, demonstrating significant performance improvements over state-of-the-art models.
arXiv Detail & Related papers (2025-01-15T05:45:04Z)
- Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment [56.87031484108484]
Large Language Models (LLMs) are increasingly recognized for their practical applications, yet they can answer unreliably on questions that fall outside their knowledge boundaries.
Retrieval-Augmented Generation (RAG) tackles this challenge and has shown a significant impact on LLMs.
By minimizing retrieval requests that yield neutral or harmful results, we can effectively reduce both time and computational costs; a minimal gating sketch follows this entry.
arXiv Detail & Related papers (2024-11-09T15:12:28Z)
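A minimal sketch of retrieval gating in this spirit; the confidence function, the 0.7 threshold, and the callable interfaces are assumptions, not the paper's method:

```python
# Hypothetical retrieval gate: only issue a retrieval request when the
# model's own confidence is low. Threshold and scoring are illustrative.
from typing import Callable

def answer_with_gated_retrieval(
    question: str,
    confidence: Callable[[str], float],           # self-confidence in [0, 1]
    generate: Callable[[str], str],               # answer from parametric knowledge
    retrieve_and_generate: Callable[[str], str],  # full RAG pipeline
    threshold: float = 0.7,
) -> str:
    if confidence(question) >= threshold:
        # The question appears to lie within the model's knowledge
        # boundary: skip retrieval, saving time and compute.
        return generate(question)
    return retrieve_and_generate(question)
```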
- A Theoretical Framework for AI-driven data quality monitoring in high-volume data environments [1.2753215270475886]
This paper presents a theoretical framework for an AI-driven data quality monitoring system designed to address the challenges of maintaining data quality in high-volume environments.
We examine the limitations of traditional methods in managing the scale, velocity, and variety of big data and propose a conceptual approach leveraging advanced machine learning techniques.
Key components include an intelligent data ingestion layer, adaptive preprocessing mechanisms, context-aware feature extraction, and AI-based quality assessment modules; a skeleton of this staging follows the entry.
arXiv Detail & Related papers (2024-10-11T07:06:36Z)
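The four components listed above suggest a staged pipeline. The skeleton below is a hypothetical rendering of that architecture; every class and method name, and each placeholder heuristic, is invented for illustration:

```python
# Hypothetical skeleton of the four-stage architecture described above.
from typing import Iterable

class DataQualityPipeline:
    def ingest(self, source: Iterable[dict]) -> Iterable[dict]:
        """Intelligent data ingestion layer: stream records from a source."""
        yield from source

    def preprocess(self, record: dict) -> dict:
        """Adaptive preprocessing: drop missing values (placeholder rule)."""
        return {k: v for k, v in record.items() if v is not None}

    def extract_features(self, record: dict) -> list[float]:
        """Context-aware feature extraction (trivial placeholder feature)."""
        return [float(len(record))]

    def assess_quality(self, features: list[float]) -> float:
        """AI-based quality assessment: score in [0, 1] (placeholder)."""
        return min(1.0, features[0] / 10.0)

    def run(self, source: Iterable[dict]) -> list[float]:
        return [
            self.assess_quality(self.extract_features(self.preprocess(r)))
            for r in self.ingest(source)
        ]
```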
- Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.26328612791924]
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs).
We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy (see the scorecard sketch after this entry).
arXiv Detail & Related papers (2024-09-16T09:06:44Z)
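The six dimensions can be pictured as a simple scorecard, as in this hypothetical sketch (the survey proposes the dimensions; the unweighted-mean aggregation below is an assumption):

```python
# Hypothetical trustworthiness scorecard over the six surveyed dimensions.
from dataclasses import dataclass, fields

@dataclass
class RAGTrustScores:
    factuality: float
    robustness: float
    fairness: float
    transparency: float
    accountability: float
    privacy: float

    def overall(self) -> float:
        """Unweighted mean; a real assessment might weight dimensions."""
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(values) / len(values)

scores = RAGTrustScores(0.9, 0.8, 0.7, 0.6, 0.8, 0.9)
print(f"overall trustworthiness: {scores.overall():.2f}")
```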
- The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources [100.23208165760114]
Foundation model development attracts a rapidly expanding body of contributors, scientists, and applications.
To help shape responsible development practices, we introduce the Foundation Model Development Cheatsheet.
arXiv Detail & Related papers (2024-06-24T15:55:49Z)
- Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z)
- Rethinking Machine Unlearning for Large Language Models [85.92660644100582]
We explore machine unlearning in the domain of large language models (LLMs).
This initiative aims to eliminate undesirable data influence (e.g., sensitive or illegal information) and the associated model capabilities.
arXiv Detail & Related papers (2024-02-13T20:51:58Z)
- A Closer Look at the Limitations of Instruction Tuning [52.587607091917214]
We show that Instruction Tuning (IT) fails to enhance knowledge or skills in large language models (LLMs).
We also show that popular methods to improve IT do not lead to performance improvements over a simple LoRA fine-tuned model.
Our findings reveal that responses generated solely from pre-trained knowledge consistently outperform responses by models that learn any form of new knowledge from IT on open-source datasets.
arXiv Detail & Related papers (2024-02-03T04:45:25Z)
- Large Process Models: A Vision for Business Process Management in the Age of Generative AI [4.1636123511446055]
A Large Process Model (LPM) combines the correlation power of Large Language Models with the analytical precision and reliability of knowledge-based systems and automated reasoning approaches.
An LPM would allow organizations to receive context-specific (tailored) process models and other business models, analytical deep-dives, and improvement recommendations.
arXiv Detail & Related papers (2023-09-02T10:32:53Z)
- Quality Monitoring and Assessment of Deployed Deep Learning Models for Network AIOps [9.881249708266237]
Deep Learning (DL) models are software artifacts; they need to be regularly maintained and updated.
In the lifecycle of a DL model deployment, it is important to assess the quality of deployed models, to detect "stale" models and prioritize their update.
This article proposes simple yet effective techniques for (i) quality assessment of individual inferences, and (ii) overall model quality tracking over multiple inferences; a minimal monitor in this spirit follows the entry.
arXiv Detail & Related papers (2022-02-28T09:37:12Z)
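A minimal monitor in the spirit of (i) and (ii) above; the moving-average window, the confidence proxy, and the 0.6 floor are assumed stand-ins, not the article's techniques:

```python
# Hypothetical staleness monitor: track a moving average of per-inference
# confidence and flag the model when it drifts below a floor.
from collections import deque

class ModelQualityTracker:
    def __init__(self, window: int = 500, floor: float = 0.6) -> None:
        self.scores: deque[float] = deque(maxlen=window)
        self.floor = floor

    def record(self, confidence: float) -> None:
        """(i) Per-inference quality proxy, e.g. the top-class probability."""
        self.scores.append(confidence)

    def is_stale(self) -> bool:
        """(ii) Overall quality over the recent window of inferences."""
        if not self.scores:
            return False
        return sum(self.scores) / len(self.scores) < self.floor
```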