Related papers: MLDemon: Deployment Monitoring for Machine Learning Systems

MLDemon: Deployment Monitoring for Machine Learning Systems

URL: http://arxiv.org/abs/2104.13621v2
Date: Thu, 29 Apr 2021 06:31:52 GMT
Title: MLDemon: Deployment Monitoring for Machine Learning Systems
Authors: Antonio Ginart, Martin Zhang, James Zou
Abstract summary: We propose a novel approach, MLDemon, for ML DEployment MONitoring. MLDemon integrates both unlabeled features and a small amount of on-demand labeled examples over time to produce a real-time estimate. On temporal datasets with diverse distribution drifts and models, MLDemon substantially outperforms existing monitoring approaches.
Score: 10.074466859579571
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Post-deployment monitoring of the performance of ML systems is critical for ensuring reliability, especially as new user inputs can differ from the training distribution. Here we propose a novel approach, MLDemon, for ML DEployment MONitoring. MLDemon integrates both unlabeled features and a small amount of on-demand labeled examples over time to produce a real-time estimate of the ML model's current performance on a given data stream. Subject to budget constraints, MLDemon decides when to acquire additional, potentially costly, supervised labels to verify the model. On temporal datasets with diverse distribution drifts and models, MLDemon substantially outperforms existing monitoring approaches. Moreover, we provide theoretical analysis to show that MLDemon is minimax rate optimal up to logarithmic factors and is provably robust against broad distribution drifts whereas prior approaches are not.

Related papers

Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric [99.56567010306807]
Large Language Models (LLMs) have become indispensable across academia, industry, and daily applications.<n>One core challenge of evaluation in the large language model (LLM) era is the generalization issue.<n>We propose Model Utilization Index (MUI), a mechanism interpretability enhanced metric that complements traditional performance scores.
arXiv Detail & Related papers (2025-04-10T04:09:47Z)
Network Resource Optimization for ML-Based UAV Condition Monitoring with Vibration Analysis [54.550658461477106]
Condition Monitoring (CM) uses Machine Learning (ML) models to identify abnormal and adverse conditions. This work explores the optimization of network resources for ML-based UAV CM frameworks. By leveraging dimensionality reduction techniques, there is a 99.9% reduction in network resource consumption.
arXiv Detail & Related papers (2025-02-21T14:36:12Z)
Quantifying the Prediction Uncertainty of Machine Learning Models for Individual Data [2.1248439796866228]
This study investigates pNML's learnability for linear regression and neural networks. It demonstrates that pNML can improve the performance and robustness of these models on various tasks.
arXiv Detail & Related papers (2024-12-10T13:58:19Z)
An Efficient Model Maintenance Approach for MLOps [14.239954811469506]
Existing machine learning model maintenance approaches are often computationally resource intensive, costly, time consuming, and model dependent. We propose an improved MLOps pipeline, a new model maintenance approach and a Similarity Based Model Reuse (SimReuse) tool to address the challenges of ML model maintenance. Our evaluation results on four time series datasets demonstrate that our model reuse approach can maintain the performance of models while significantly reducing maintenance time and costs.
arXiv Detail & Related papers (2024-12-05T23:02:02Z)
On the Cost of Model-Serving Frameworks: An Experimental Evaluation [2.6232657671486983]
Serving strategies are crucial for deploying and managing models in production environments effectively. These strategies ensure that models are available, scalable, reliable, and performant for real-world applications. We show that DL-specific frameworks (TensorFlow Serving and TorchServe) display significantly lower latencies than the three general-purpose ML frameworks.
arXiv Detail & Related papers (2024-11-15T16:36:21Z)
ML-On-Rails: Safeguarding Machine Learning Models in Software Systems A Case Study [4.087995998278127]
We introduce ML-On-Rails, a protocol designed to safeguard machine learning models. ML-On-Rails establishes a well-defined endpoint interface for different ML tasks, and clear communication between ML providers and ML consumers. We evaluate the protocol through a real-world case study of the MoveReminder application.
arXiv Detail & Related papers (2024-01-12T11:27:15Z)
Semi-Supervised Class-Agnostic Motion Prediction with Pseudo Label Regeneration and BEVMix [59.55173022987071]
We study the potential of semi-supervised learning for class-agnostic motion prediction. Our framework adopts a consistency-based self-training paradigm, enabling the model to learn from unlabeled data. Our method exhibits comparable performance to weakly and some fully supervised methods.
arXiv Detail & Related papers (2023-12-13T09:32:50Z)
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets. Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal metric to identify discrepancies between a model's expected responses and its intrinsic generation capability.
arXiv Detail & Related papers (2023-08-23T09:45:29Z)
Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF) It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model. We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
Scaling up Trustless DNN Inference with Zero-Knowledge Proofs [47.42532753464726]
We present the first practical ImageNet-scale method to verify ML model inference non-interactively, i.e., after the inference has been done. We provide the first ZKSNARK proof of valid inference for a full resolution ImageNet model, achieving 79% top-5 accuracy.
arXiv Detail & Related papers (2022-10-17T00:35:38Z)
Scanflow: A multi-graph framework for Machine Learning workflow management, supervision, and debugging [0.0]
We propose a novel containerized directed graph framework to support end-to-end Machine Learning workflow management. The framework allows defining and deploying ML in containers, tracking their metadata, checking their behavior in production, and improving the models by using both learned and human-provided knowledge.
arXiv Detail & Related papers (2021-11-04T17:01:12Z)
FairCanary: Rapid Continuous Explainable Fairness [8.362098382773265]
We present Quantile Demographic Drift (QDD), a novel model bias quantification metric. QDD is ideal for continuous monitoring scenarios, does not suffer from the statistical limitations of conventional threshold-based bias metrics. We incorporate QDD into a continuous model monitoring system, called FairCanary, that reuses existing explanations computed for each individual prediction.
arXiv Detail & Related papers (2021-06-13T17:47:44Z)
Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model. Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses. BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
Semi-Supervised Learning with Normalizing Flows [54.376602201489995]
FlowGMM is an end-to-end approach to generative semi supervised learning with normalizing flows. We show promising results on a wide range of applications, including AG-News and Yahoo Answers text data.
arXiv Detail & Related papers (2019-12-30T17:36:33Z)

This list is automatically generated from the titles and abstracts of the papers in this site.