MLDemon: Deployment Monitoring for Machine Learning Systems
- URL: http://arxiv.org/abs/2104.13621v2
- Date: Thu, 29 Apr 2021 06:31:52 GMT
- Title: MLDemon: Deployment Monitoring for Machine Learning Systems
- Authors: Antonio Ginart, Martin Zhang, James Zou
- Abstract summary: We propose a novel approach, MLDemon, for ML DEployment MONitoring.
MLDemon integrates both unlabeled features and a small amount of on-demand labeled examples over time to produce a real-time estimate of the ML model's current performance on a given data stream.
On temporal datasets with diverse distribution drifts and models, MLDemon substantially outperforms existing monitoring approaches.
- Score: 10.074466859579571
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Post-deployment monitoring of the performance of ML systems is critical for
ensuring reliability, especially as new user inputs can differ from the
training distribution. Here we propose a novel approach, MLDemon, for ML
DEployment MONitoring. MLDemon integrates both unlabeled features and a small
amount of on-demand labeled examples over time to produce a real-time estimate
of the ML model's current performance on a given data stream. Subject to budget
constraints, MLDemon decides when to acquire additional, potentially costly,
supervised labels to verify the model. On temporal datasets with diverse
distribution drifts and models, MLDemon substantially outperforms existing
monitoring approaches. Moreover, we provide theoretical analysis to show that
MLDemon is minimax rate optimal up to logarithmic factors and is provably
robust against broad distribution drifts whereas prior approaches are not.
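The idea in the abstract of buying labels sparingly and buying more when the unlabeled stream looks unusual can be illustrated with a toy sketch. The Python below is not the MLDemon algorithm and carries none of its guarantees. The class `BudgetedMonitor`, its parameters, and the scalar `drift_signal` are all hypothetical names introduced only for illustration. It assumes some external routine already turns unlabeled features into a single drift score.

```python
from collections import deque


class BudgetedMonitor:
    """Toy monitor: estimates accuracy from sparsely queried labels.

    Hypothetical illustration only; this is NOT the MLDemon algorithm.
    """

    def __init__(self, base_period=10, fast_period=2,
                 drift_threshold=0.5, window=20):
        self.base_period = base_period          # steps between label queries when stable
        self.fast_period = fast_period          # steps between queries when drift is suspected
        self.drift_threshold = drift_threshold  # cutoff on the unlabeled drift score (assumed)
        self.recent = deque(maxlen=window)      # 1/0 correctness of recently labeled points
        self.since_query = 0                    # steps since the last label query
        self.labels_used = 0                    # total labels purchased so far

    def step(self, drift_signal, prediction, label_oracle):
        """Process one stream item; return the accuracy estimate, or None if no labels yet."""
        self.since_query += 1
        # Query more often while the unlabeled drift score is above the cutoff.
        if drift_signal > self.drift_threshold:
            period = self.fast_period
        else:
            period = self.base_period
        if self.since_query >= period:
            true_label = label_oracle()  # the costly call: only made when the policy decides
            self.recent.append(1 if prediction == true_label else 0)
            self.labels_used += 1
            self.since_query = 0
        if not self.recent:
            return None
        return sum(self.recent) / len(self.recent)


if __name__ == "__main__":
    monitor = BudgetedMonitor()
    # Stable phase: low drift score, model is correct.
    for _ in range(20):
        estimate = monitor.step(0.0, prediction=1, label_oracle=lambda: 1)
    # Drift phase: high drift score, model is now wrong.
    for _ in range(10):
        estimate = monitor.step(1.0, prediction=1, label_oracle=lambda: 0)
    print(monitor.labels_used, estimate)
```

In this sketch the label budget is controlled only through the two query periods. A real system would also have to define the drift score and bound the total number of queries, both of which are left open here.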
Related papers
- ML-On-Rails: Safeguarding Machine Learning Models in Software Systems -- A Case Study [4.087995998278127]
We introduce ML-On-Rails, a protocol designed to safeguard machine learning models.
ML-On-Rails establishes a well-defined endpoint interface for different ML tasks, and clear communication between ML providers and ML consumers.
We evaluate the protocol through a real-world case study of the MoveReminder application.
arXiv Detail & Related papers (2024-01-12T11:27:15Z)
- Semi-Supervised Class-Agnostic Motion Prediction with Pseudo Label Regeneration and BEVMix [59.55173022987071]
We study the potential of semi-supervised learning for class-agnostic motion prediction.
Our framework adopts a consistency-based self-training paradigm, enabling the model to learn from unlabeled data.
Our method exhibits comparable performance to weakly and some fully supervised methods.
arXiv Detail & Related papers (2023-12-13T09:32:50Z)
- From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning [52.257422715393574]
We introduce a self-guided methodology for Large Language Models (LLMs) to autonomously discern and select cherry samples from open-source datasets.
Our key innovation, the Instruction-Following Difficulty (IFD) metric, identifies discrepancies between a model's expected responses and its intrinsic generation capability.
arXiv Detail & Related papers (2023-08-23T09:45:29Z)
- Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF), which learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
- Scaling up Trustless DNN Inference with Zero-Knowledge Proofs [47.42532753464726]
We present the first practical ImageNet-scale method to verify ML model inference non-interactively, i.e., after the inference has been done.
We provide the first ZK-SNARK proof of valid inference for a full-resolution ImageNet model, achieving 79% top-5 accuracy.
arXiv Detail & Related papers (2022-10-17T00:35:38Z)
- Scanflow: A multi-graph framework for Machine Learning workflow management, supervision, and debugging [0.0]
We propose a novel containerized directed graph framework to support end-to-end Machine Learning workflow management.
The framework allows defining and deploying ML in containers, tracking their metadata, checking their behavior in production, and improving the models by using both learned and human-provided knowledge.
arXiv Detail & Related papers (2021-11-04T17:01:12Z)
- Machine Learning Model Drift Detection Via Weak Data Slices [5.319802998033767]
We propose a method that utilizes feature space rules, called data slices, for drift detection.
We provide experimental indications that our method is likely to identify when the ML model's performance will change, based on changes in the underlying data.
arXiv Detail & Related papers (2021-08-11T16:55:34Z)
- FairCanary: Rapid Continuous Explainable Fairness [8.362098382773265]
We present Quantile Demographic Drift (QDD), a novel model bias quantification metric.
QDD is ideal for continuous monitoring scenarios and does not suffer from the statistical limitations of conventional threshold-based bias metrics.
We incorporate QDD into a continuous model monitoring system, called FairCanary, that reuses existing explanations computed for each individual prediction.
arXiv Detail & Related papers (2021-06-13T17:47:44Z)
- Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model.
Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses.
BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
- Semi-Supervised Learning with Normalizing Flows [54.376602201489995]
FlowGMM is an end-to-end approach to generative semi-supervised learning with normalizing flows.
We show promising results on a wide range of applications, including AG-News and Yahoo Answers text data.
arXiv Detail & Related papers (2019-12-30T17:36:33Z)