Related papers: Machine Learning Model Drift Detection Via Weak Data Slices

Machine Learning Model Drift Detection Via Weak Data Slices

URL: http://arxiv.org/abs/2108.05319v1
Date: Wed, 11 Aug 2021 16:55:34 GMT
Title: Machine Learning Model Drift Detection Via Weak Data Slices
Authors: Samuel Ackerman, Parijat Dube, Eitan Farchi, Orna Raz, Marcel Zalmanovici
Abstract summary: We propose a method that utilizes feature space rules, called data slices, for drift detection. We provide experimental indications that our method is likely to identify that the ML model will likely change in performance, based on changes in the underlying data.
Score: 5.319802998033767
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Detecting drift in performance of Machine Learning (ML) models is an acknowledged challenge. For ML models to become an integral part of business applications it is essential to detect when an ML model drifts away from acceptable operation. However, it is often the case that actual labels are difficult and expensive to get, for example, because they require expert judgment. Therefore, there is a need for methods that detect likely degradation in ML operation without labels. We propose a method that utilizes feature space rules, called data slices, for drift detection. We provide experimental indications that our method is likely to identify that the ML model will likely change in performance, based on changes in the underlying data.

Related papers

Analysis of Zero Day Attack Detection Using MLP and XAI [0.0]
This paper analyzes Machine Learning (ML) and Deep Learning (DL) based approaches to create Intrusion Detection Systems (IDS) The focus is on using the KDD99 dataset, which has the most research done among all the datasets for detecting zero-day attacks. We evaluate the performance of four multilayer perceptron (MLP) trained on the KDD99 dataset, including baseline ML models, weighted ML models, truncated ML models, and weighted truncated ML models.
arXiv Detail & Related papers (2025-01-28T02:20:34Z)
Predicting the Performance of Black-box LLMs through Self-Queries [60.87193950962585]
Large language models (LLMs) are increasingly relied on in AI systems, predicting when they make mistakes is crucial. In this paper, we extract features of LLMs in a black-box manner by using follow-up prompts and taking the probabilities of different responses as representations. We demonstrate that training a linear model on these low-dimensional representations produces reliable predictors of model performance at the instance level.
arXiv Detail & Related papers (2025-01-02T22:26:54Z)
Time to Retrain? Detecting Concept Drifts in Machine Learning Systems [1.4499463058550683]
We propose a model-agnostic technique (CDSeer) for detecting concept drift in machine learning (ML) models. Results show that CDSeer has better precision and recall compared to the state-of-the-art while requiring significantly less manual labeling. The improved performance and ease of adoption of CDSeer are valuable in making ML systems more reliable.
arXiv Detail & Related papers (2024-10-11T18:47:39Z)
Loss-Free Machine Unlearning [51.34904967046097]
We present a machine unlearning approach that is both retraining- and label-free. Retraining-free approaches often utilise Fisher information, which is derived from the loss and requires labelled data which may not be available. We present an extension to the Selective Synaptic Dampening algorithm, substituting the diagonal of the Fisher information matrix for the gradient of the l2 norm of the model output to approximate sensitivity.
arXiv Detail & Related papers (2024-02-29T16:15:34Z)
Mitigating ML Model Decay in Continuous Integration with Data Drift Detection: An Empirical Study [7.394099294390271]
This study aims to investigate the performance of using data drift detection techniques for automatically detecting the retraining points for ML models for TCP in CI environments. We employed the Hellinger distance to identify changes in both the values and distribution of input data and leveraged these changes as retraining points for the ML model. Our experimental evaluation of the Hellinger distance-based method demonstrated its efficacy and efficiency in detecting retraining points and reducing the associated costs.
arXiv Detail & Related papers (2023-05-22T05:55:23Z)
Learn to Unlearn: A Survey on Machine Unlearning [29.077334665555316]
This article presents a review of recent machine unlearning techniques, verification mechanisms, and potential attacks. We highlight emerging challenges and prospective research directions. We aim for this paper to provide valuable resources for integrating privacy, equity, andresilience into ML systems.
arXiv Detail & Related papers (2023-05-12T14:28:02Z)
AI Model Disgorgement: Methods and Choices [127.54319351058167]
We introduce a taxonomy of possible disgorgement methods that are applicable to modern machine learning systems. We investigate the meaning of "removing the effects" of data in the trained model in a way that does not require retraining from scratch.
arXiv Detail & Related papers (2023-04-07T08:50:18Z)
Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving [62.053071723903834]
Multi-object state estimation is a fundamental problem for robotic applications. We consider learning maximum-likelihood parameters using particle methods. We apply our method to real data collected from autonomous vehicles.
arXiv Detail & Related papers (2022-12-14T01:21:05Z)
Machine Unlearning of Features and Labels [72.81914952849334]
We propose first scenarios for unlearning and labels in machine learning models. Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
FreaAI: Automated extraction of data slices to test machine learning models [2.475112368179548]
We show the feasibility of automatically extracting feature models that result in explainable data slices over which the ML solution under-performs. Our novel technique, IBM FreaAI aka FreaAI, extracts such slices from structured ML test data or any other labeled data.
arXiv Detail & Related papers (2021-08-12T09:21:16Z)
Detecting Faults during Automatic Screwdriving: A Dataset and Use Case of Anomaly Detection for Automatic Screwdriving [80.6725125503521]
Data-driven approaches, using Machine Learning (ML) for detecting faults have recently gained increasing interest. We present a use case of using ML models for detecting faults during automated screwdriving operations.
arXiv Detail & Related papers (2021-07-05T11:46:00Z)
Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model. Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses. BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.