An Empirical Study of Uncertainty Estimation Techniques for Detecting
Drift in Data Streams
- URL: http://arxiv.org/abs/2311.13374v1
- Date: Wed, 22 Nov 2023 13:17:55 GMT
- Title: An Empirical Study of Uncertainty Estimation Techniques for Detecting
Drift in Data Streams
- Authors: Anton Winter, Nicolas Jourdan, Tristan Wirth, Volker Knauthe, Arjan
Kuijper
- Abstract summary: This study conducts a comprehensive empirical evaluation of using uncertainty values as substitutes for error rates in detecting drifts.
We examine five uncertainty estimation methods in conjunction with the ADWIN detector across seven real-world datasets.
Our results reveal that while the SWAG method exhibits superior calibration, the overall accuracy in detecting drifts is not notably impacted by the choice of uncertainty estimation method.
- Score: 4.818865062632567
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In safety-critical domains such as autonomous driving and medical diagnosis,
the reliability of machine learning models is crucial. One significant
challenge to reliability is concept drift, which can cause model deterioration
over time. Traditionally, drift detectors rely on true labels, which are often
scarce and costly. This study conducts a comprehensive empirical evaluation of
using uncertainty values as substitutes for error rates in detecting drifts,
aiming to alleviate the reliance on labeled post-deployment data. We examine
five uncertainty estimation methods in conjunction with the ADWIN detector
across seven real-world datasets. Our results reveal that while the SWAG method
exhibits superior calibration, the overall accuracy in detecting drifts is not
notably impacted by the choice of uncertainty estimation method, with even the
most basic method demonstrating competitive performance. These findings offer
valuable insights into the practical applicability of uncertainty-based drift
detection in real-world, safety-critical applications.
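Since the abstract names both the detector (ADWIN) and the observation that even the most basic uncertainty estimate is competitive, a minimal end-to-end sketch is easy to give. The following Python example is an illustrative assumption, not the authors' setup: it assumes the river library for ADWIN (a recent release exposing `drift_detected`) and a synthetic two-regime stream, and it monitors softmax entropy in place of the labeled error rate.

```python
# Hedged sketch: unsupervised drift detection by feeding per-sample
# predictive uncertainty (softmax entropy, the "most basic" estimator)
# into ADWIN instead of the labeled error rate. Model, stream, and
# parameters are illustrative assumptions.
import numpy as np
from river import drift  # assumes a recent river release with `drift_detected`
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Fit on one regime, then stream data whose distribution shifts halfway.
X_train = rng.normal(0.0, 1.0, size=(500, 2))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X_train, y_train)

X_stream = np.vstack([
    rng.normal(0.0, 1.0, size=(1000, 2)),  # in-distribution segment
    rng.normal(2.0, 1.0, size=(1000, 2)),  # shifted segment: should trigger drift
])

def softmax_entropy(p: np.ndarray) -> float:
    """Predictive entropy of a single probability vector."""
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

detector = drift.ADWIN(delta=0.002)
for i, x in enumerate(X_stream):
    probs = model.predict_proba(x.reshape(1, -1))[0]
    detector.update(softmax_entropy(probs))  # no true label required
    if detector.drift_detected:
        print(f"drift signalled at stream index {i}")
```

Because the detector only sees one scalar per sample, swapping in MC dropout, deep ensembles, or SWAG changes nothing but how that scalar is computed.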
Related papers
- A Neighbor-Searching Discrepancy-based Drift Detection Scheme for Learning Evolving Data [40.00357483768265]
This work presents a novel real concept drift detection method based on Neighbor-Searching Discrepancy.
The proposed method is able to detect real concept drift with high accuracy while ignoring virtual drift.
It can also indicate the direction of the classification boundary change by identifying the invasion or retreat of a certain class.
arXiv Detail & Related papers (2024-05-23T04:03:36Z)
- Revisiting Confidence Estimation: Towards Reliable Failure Prediction [53.79160907725975]
We identify a general, widespread, yet largely neglected phenomenon: most confidence estimation methods are harmful for detecting misclassification errors.
We propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance.
arXiv Detail & Related papers (2024-03-05T11:44:14Z)
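"Finding flat minima" is commonly done with sharpness-aware minimization (SAM); the PyTorch sketch of a single SAM update below is an illustrative stand-in and may differ from the cited paper's exact procedure.

```python
# Hedged sketch of one sharpness-aware minimization (SAM) step, a common
# flat-minima method; the cited paper's exact procedure may differ.
import torch

def sam_step(model, loss_fn, x, y, optimizer, rho=0.05):
    # 1) Ascent: perturb weights toward the locally worst-case direction.
    loss_fn(model(x), y).backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads]))
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
    optimizer.zero_grad()
    # 2) Descent: gradient at the perturbed point, applied to the original weights.
    loss_fn(model(x), y).backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)  # restore the unperturbed weights
    optimizer.step()
    optimizer.zero_grad()
```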
- Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z)
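The stated objective, minimizing confidence on an uncertainty dataset, can be sketched as a two-term loss. This is a hedged reading of the framework, not the authors' implementation; the weight `lam` and the choice of auxiliary data are assumptions.

```python
# Hedged sketch of a confidence-minimization objective in the spirit of
# DCM: cross-entropy on labeled data plus a penalty pushing predictions
# on an auxiliary "uncertainty" dataset toward the uniform distribution.
import torch
import torch.nn.functional as F

def dcm_style_loss(model, x_labeled, y_labeled, x_uncertainty, lam=0.5):
    ce = F.cross_entropy(model(x_labeled), y_labeled)
    log_probs = F.log_softmax(model(x_uncertainty), dim=1)
    # Cross-entropy against the uniform distribution: minimizing it
    # minimizes confidence (maximizes predictive entropy) on these inputs.
    confidence_penalty = -log_probs.mean()
    return ce + lam * confidence_penalty
```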
- Detecting Concept Drift in the Presence of Sparsity -- A Case Study of Automated Change Risk Assessment System [0.8021979227281782]
Missing values, widely referred to as sparsity in the literature, are a common characteristic of many real-world datasets.
We study different patterns of missing values and various statistical and ML-based data imputation methods for different kinds of sparsity.
We then select the best concept drift detector for a given dataset with missing values based on several evaluation metrics.
arXiv Detail & Related papers (2022-07-27T04:27:49Z)
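A comparison of the kind this summary describes can be prototyped in a few lines of scikit-learn. The imputers, synthetic data, and RMSE metric below are illustrative assumptions, not the paper's pipeline.

```python
# Illustrative sketch: comparing statistical and ML-based imputers on
# synthetically injected missingness, scored on the held-out true values.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, KNNImputer, SimpleImputer

rng = np.random.default_rng(0)
X_full = rng.normal(size=(300, 5))
X_miss = X_full.copy()
X_miss[rng.random(X_miss.shape) < 0.2] = np.nan  # ~20% missing completely at random
mask = np.isnan(X_miss)                           # entries to score against ground truth

imputers = {
    "mean": SimpleImputer(strategy="mean"),
    "knn": KNNImputer(n_neighbors=5),
    "iterative": IterativeImputer(random_state=0),
}
for name, imputer in imputers.items():
    X_hat = imputer.fit_transform(X_miss)
    rmse = float(np.sqrt(np.mean((X_hat[mask] - X_full[mask]) ** 2)))
    print(f"{name:>9}: RMSE on imputed entries = {rmse:.3f}")
```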
- Bayesian autoencoders with uncertainty quantification: Towards trustworthy anomaly detection [78.24964622317634]
In this work, the formulation of Bayesian autoencoders (BAEs) is adopted to quantify the total anomaly uncertainty.
To evaluate the quality of uncertainty, we consider the task of classifying anomalies with the additional option of rejecting predictions of high uncertainty.
Our experiments demonstrate the effectiveness of the BAE and total anomaly uncertainty on a set of benchmark datasets and two real datasets for manufacturing.
arXiv Detail & Related papers (2022-02-25T12:20:04Z)
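The rejection-based evaluation mentioned above reduces to ranking predictions by uncertainty and measuring accuracy on what is kept. A hedged sketch follows, with placeholder data standing in for a BAE's total anomaly uncertainty.

```python
# Hedged sketch of a reject-option evaluation: rank predictions by
# uncertainty and report accuracy on the retained fraction.
import numpy as np

def accuracy_vs_rejection(y_true, y_pred, uncertainty, reject_rate):
    order = np.argsort(uncertainty)           # most certain first
    correct = (y_true[order] == y_pred[order])
    keep = int(round((1.0 - reject_rate) * len(correct)))
    return correct[:keep].mean() if keep else float("nan")

# With well-calibrated uncertainty, accuracy should rise as more of the
# most uncertain predictions are rejected. Placeholder data:
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, size=1000)
y_pred = np.where(rng.random(1000) < 0.8, y_true, 1 - y_true)  # ~80% accurate
uncertainty = rng.random(1000) + 0.5 * (y_pred != y_true)      # errors score higher
for r in (0.0, 0.2, 0.5):
    acc = accuracy_vs_rejection(y_true, y_pred, uncertainty, r)
    print(f"reject {r:.0%}: accuracy {acc:.3f}")
```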
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
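The ATC recipe as summarized is compact enough to state directly in code. A hedged sketch with illustrative names; the confidence score (max softmax, negative entropy, or similar) is left to the caller.

```python
# Hedged sketch of ATC as summarized above: pick a threshold on labeled
# source data so the fraction of source examples above it matches source
# accuracy, then predict target accuracy as the fraction of unlabeled
# target examples above that threshold.
import numpy as np

def atc_fit_threshold(source_confidence, source_correct):
    # The (1 - accuracy)-quantile leaves exactly `accuracy` of the mass above it.
    return np.quantile(source_confidence, 1.0 - source_correct.mean())

def atc_predict_accuracy(threshold, target_confidence):
    return float((target_confidence > threshold).mean())
```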
- Detecting Concept Drift With Neural Network Model Uncertainty [0.0]
Uncertainty Drift Detection (UDD) is able to detect drifts without access to true labels.
In contrast to input data-based drift detection, our approach considers the effects of the current input data on the properties of the prediction model.
We show that UDD outperforms other state-of-the-art strategies on two synthetic as well as ten real-world data sets for both regression and classification tasks.
arXiv Detail & Related papers (2021-07-05T08:56:36Z)
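A UDD-style uncertainty signal can be obtained with Monte Carlo dropout. The sketch below shows the standard estimator, predictive entropy of the dropout-averaged softmax, whose per-sample output can be fed to a drift detector; the sample count is an assumption.

```python
# Hedged sketch of Monte Carlo dropout: keep dropout layers stochastic at
# inference, average several softmax passes, and return the predictive
# entropy of the mean as a per-sample uncertainty value.
import torch
import torch.nn.functional as F

@torch.no_grad()
def mc_dropout_entropy(model, x, n_samples=20):
    model.eval()
    for m in model.modules():              # re-enable only the dropout layers
        if isinstance(m, torch.nn.Dropout):
            m.train()
    probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(n_samples)])
    mean_p = probs.mean(dim=0)
    model.eval()                           # restore fully deterministic mode
    return -(mean_p * mean_p.clamp_min(1e-12).log()).sum(dim=1)
```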
- Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z)
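The pseudo-labelling step described above amounts to a threshold split of the unlabeled pool. A hedged sketch follows; the threshold and query budget are illustrative values, not the paper's.

```python
# Hedged sketch of confidence-based pseudo-labelling in active learning:
# very confident predictions become pseudo-labels, while the most
# uncertain samples are sent to annotators.
import numpy as np

def split_unlabeled_pool(confidence, predictions, pseudo_thr=0.95, query_budget=100):
    pseudo_idx = np.flatnonzero(confidence >= pseudo_thr)
    remaining = np.flatnonzero(confidence < pseudo_thr)
    query_idx = remaining[np.argsort(confidence[remaining])[:query_budget]]
    return pseudo_idx, predictions[pseudo_idx], query_idx  # auto-labels + queries
```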
- Know Your Limits: Uncertainty Estimation with ReLU Classifiers Fails at Reliable OOD Detection [0.0]
This paper gives a theoretical explanation for previously reported experimental findings and illustrates it on synthetic data.
We prove that such techniques are not able to reliably identify OOD samples in a classification setting.
arXiv Detail & Related papers (2020-12-09T21:35:55Z)
- Out-of-Distribution Detection for Automotive Perception [58.34808836642603]
Neural networks (NNs) are widely used for object classification in autonomous driving.
NNs can fail on input data not well represented by the training dataset, known as out-of-distribution (OOD) data.
This paper presents a method for determining whether inputs are OOD, which does not require OOD data during training and does not increase the computational cost of inference.
arXiv Detail & Related papers (2020-11-03T01:46:35Z)
- Deep Learning based Uncertainty Decomposition for Real-time Control [9.067368638784355]
We propose a novel method for detecting the absence of training data using deep learning.
We show its advantages over existing approaches on synthetic and real-world datasets.
We further demonstrate the practicality of this uncertainty estimate in deploying online data-efficient control on a simulated quadcopter.
arXiv Detail & Related papers (2020-10-06T10:46:27Z)