Detection of data drift and outliers affecting machine learning model
performance over time
- URL: http://arxiv.org/abs/2012.09258v2
- Date: Wed, 20 Jan 2021 09:31:46 GMT
- Title: Detection of data drift and outliers affecting machine learning model
performance over time
- Authors: Samuel Ackerman, Eitan Farchi, Orna Raz, Marcel Zalmanovici, Parijat
Dube
- Abstract summary: Drift is distribution change between the training and deployment data.
We wish to detect these changes but can't measure accuracy without deployment data labels.
We instead detect drift indirectly by nonparametrically testing the distribution of model prediction confidence for changes.
- Score: 5.319802998033767
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A trained ML model is deployed on another `test' dataset where target feature
values (labels) are unknown. Drift is distribution change between the training
and deployment data, which is concerning if model performance changes. For a
cat/dog image classifier, for instance, drift during deployment could be rabbit
images (new class) or cat/dog images with changed characteristics (change in
distribution). We wish to detect these changes but can't measure accuracy
without deployment data labels. We instead detect drift indirectly by
nonparametrically testing the distribution of model prediction confidence for
changes. This generalizes our method and sidesteps domain-specific feature
representation.
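Since the approach needs only the distribution of prediction confidences, a minimal batch-mode illustration of the underlying idea is a nonparametric two-sample test (e.g., Kolmogorov-Smirnov) comparing confidences on a reference window against an unlabeled deployment window. This sketch is not the paper's sequential CPM procedure; the model and data names are hypothetical, and the model is assumed to expose a scikit-learn-style predict_proba().

```python
# Minimal sketch: flag drift from prediction confidences alone, using a
# nonparametric two-sample Kolmogorov-Smirnov test. Illustrative only --
# not the paper's sequential CPM method.
from scipy.stats import ks_2samp

def confidences(model, X):
    """Top-class predicted probability per example; no labels required."""
    probs = model.predict_proba(X)      # shape: (n_examples, n_classes)
    return probs.max(axis=1)

def drift_test(model, reference_X, deployment_X, alpha=0.05):
    """Compare confidence distributions on reference vs. deployment data."""
    ref_conf = confidences(model, reference_X)
    dep_conf = confidences(model, deployment_X)
    stat, p_value = ks_2samp(ref_conf, dep_conf)
    return {"ks_stat": stat, "p_value": p_value, "drift": p_value < alpha}
```

Working with confidences rather than raw inputs is what makes the test domain-agnostic, as the abstract notes.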
We address important statistical issues, particularly Type-1 error control in
sequential testing, using Change Point Models (CPMs; see Adams and Ross 2012).
We also use nonparametric outlier methods to show the user suspicious
observations for model diagnosis, since the before/after change confidence
distributions overlap significantly. In experiments to demonstrate robustness,
we train on a subset of MNIST digit classes, then insert drift (e.g., unseen
digit class) in deployment data in various settings (gradual/sudden changes in
the drift proportion). A novel loss function is introduced to compare the
performance (detection delay, Type-1 and 2 errors) of a drift detector under
different levels of drift class contamination.
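The experimental protocol can be approximated without the paper's code: build a deployment stream in which confidences from an unseen class are mixed in either suddenly or gradually, then monitor the stream sequentially. The sliding-window test below is only a stand-in for the CPM framework (it does not reproduce its Type-1 error control for sequential testing), and all names and parameters are illustrative assumptions.

```python
# Sketch of the drift-injection setup (sudden vs. gradual contamination by an
# unseen class) plus a simple sequential monitor. A sliding-window KS test
# stands in for the Change Point Model (CPM) framework used in the paper and
# does NOT carry its Type-1 error guarantees; names are illustrative.
import numpy as np
from scipy.stats import ks_2samp

def build_stream(clean_conf, drift_conf, n_steps=2000, schedule="sudden",
                 change_at=1000, max_contamination=0.5, seed=0):
    """Mix confidences from an unseen ('drift') class into a deployment stream."""
    rng = np.random.default_rng(seed)
    stream = []
    for t in range(n_steps):
        if schedule == "sudden":
            p_drift = max_contamination if t >= change_at else 0.0
        else:  # "gradual": contamination ramps up after the change point
            p_drift = min(max_contamination, max(0.0, (t - change_at) / n_steps))
        source = drift_conf if rng.random() < p_drift else clean_conf
        stream.append(rng.choice(source))
    return np.array(stream)

def sequential_monitor(reference_conf, stream, window=200, alpha=0.01):
    """Return the first time step at which the recent window looks drifted."""
    for t in range(window, len(stream) + 1):
        _, p_value = ks_2samp(reference_conf, stream[t - window:t])
        if p_value < alpha:
            return t        # detection time; delay = t - true change point
    return None             # no detection (a Type-2 error if drift was present)
```

Detection delay, false alarms before the change point (Type-1 errors), and missed detections (Type-2 errors) can then be tabulated across runs; these are the quantities the paper's loss function combines into a single score.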
Related papers
- CADM: Confusion Model-based Detection Method for Real-drift in Chunk
Data Stream [3.0885191226198785]
Concept drift detection has attracted considerable attention due to its importance in many real-world applications such as health monitoring and fault diagnosis.
We propose a new approach to detect real-drift in the chunk data stream with limited annotations based on concept confusion.
arXiv Detail & Related papers (2023-03-25T08:59:27Z)
- Back to the Source: Diffusion-Driven Test-Time Adaptation [77.4229736436935]
Test-time adaptation harnesses test inputs to improve accuracy of a model trained on source data when tested on shifted target data.
We instead update the target data, by projecting all test inputs toward the source domain with a generative diffusion model.
arXiv Detail & Related papers (2022-07-07T17:14:10Z)
- Deep learning model solves change point detection for multiple change types [69.77452691994712]
Change point detection aims to catch an abrupt disorder in the data distribution.
We propose an approach that works in the multiple-distributions scenario.
arXiv Detail & Related papers (2022-04-15T09:44:21Z)
- On-the-Fly Test-time Adaptation for Medical Image Segmentation [63.476899335138164]
Adapting the source model to target data distribution at test-time is an efficient solution for the data-shift problem.
We propose a new framework called Adaptive UNet where each convolutional block is equipped with an adaptive batch normalization layer.
During test-time, the model takes in just the new test image and generates a domain code to adapt the features of source model according to the test data.
arXiv Detail & Related papers (2022-03-10T18:51:29Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (a sketch appears after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Automatically detecting data drift in machine learning classifiers [2.202253618096515]
We term changes that affect machine learning performance 'data drift' or 'drift'.
We propose an approach based solely on classifier suggested labels and its confidence in them, for alerting on data distribution or feature space changes that are likely to cause data drift.
arXiv Detail & Related papers (2021-11-10T12:34:14Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Task-Sensitive Concept Drift Detector with Metric Learning [7.706795195017394]
We propose a novel task-sensitive drift detection framework, which is able to detect drifts without access to true labels during inference.
It is able to detect real drift, where the drift affects the classification performance, while it properly ignores virtual drift.
We evaluate the performance of the proposed framework with a novel metric, which accumulates the standard metrics of detection accuracy, false positive rate and detection delay into one value.
arXiv Detail & Related papers (2021-08-16T09:10:52Z)
- Unsupervised Model Drift Estimation with Batch Normalization Statistics for Dataset Shift Detection and Model Selection [0.0]
We propose a novel method of model drift estimation by exploiting statistics of batch normalization layer on unlabeled test data.
We show the effectiveness of our method not only for dataset shift detection but also for unsupervised model selection when there are multiple candidate models in a model zoo or along training trajectories.
arXiv Detail & Related papers (2021-07-01T03:04:47Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate a method we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
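For the ATC entry referenced above, a minimal sketch of the usual formulation is included here: choose the confidence threshold on labeled source validation data so that the fraction of confidences above it matches source accuracy, then report the fraction of unlabeled target confidences above that threshold. This is a hedged reading of the abstract, not the authors' released code, and all names are illustrative.

```python
# Hedged sketch of Average Thresholded Confidence (ATC): learn a confidence
# threshold on labeled source data, then predict target accuracy as the
# fraction of unlabeled target examples whose confidence exceeds it.
# The threshold rule below (match source validation accuracy) is an assumption
# based on the abstract, not the authors' implementation.
import numpy as np

def learn_threshold(source_conf, source_correct):
    """Pick t so that the source fraction with confidence > t equals source accuracy."""
    accuracy = np.mean(source_correct)              # fraction of correct predictions
    return np.quantile(source_conf, 1.0 - accuracy) # top-`accuracy` mass lies above t

def predict_target_accuracy(threshold, target_conf):
    """Estimated target accuracy = fraction of unlabeled confidences above t."""
    return float(np.mean(target_conf > threshold))
```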
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.