Handling Concept Drift for Predictions in Business Process Mining
- URL: http://arxiv.org/abs/2005.05810v2
- Date: Mon, 18 May 2020 13:24:30 GMT
- Title: Handling Concept Drift for Predictions in Business Process Mining
- Authors: Lucas Baier, Josua Reimold, Niklas Kühl
- Abstract summary: Machine learning models are challenged by changing data streams over time which is described as concept drift.
Current research lacks a recommendation which data should be selected for the retraining of the model.
We show that we can improve accuracy from 0.5400 to 0.7010 with concept drift handling.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predictive services nowadays play an important role across all business
sectors. However, deployed machine learning models are challenged by changing
data streams over time which is described as concept drift. Prediction quality
of models can be largely influenced by this phenomenon. Therefore, concept
drift is usually handled by retraining of the model. However, current research
lacks a recommendation which data should be selected for the retraining of the
machine learning model. Therefore, we systematically analyze different data
selection strategies in this work. Subsequently, we instantiate our findings on
a use case in process mining which is strongly affected by concept drift. We
can show that we can improve accuracy from 0.5400 to 0.7010 with concept drift
handling. Furthermore, we depict the effects of the different data selection
strategies.
Related papers
- Methods for Generating Drift in Text Streams [49.3179290313959]
Concept drift is a frequent phenomenon in real-world datasets and corresponds to changes in data distribution over time.
This paper provides four textual drift generation methods to ease the production of datasets with labeled drifts.
Results show that all methods have their performance degraded right after the drifts, and the incremental SVM is the fastest to run and recover the previous performance levels.
arXiv Detail & Related papers (2024-03-18T23:48:33Z)
- Explaining Drift using Shapley Values [0.0]
Machine learning models often deteriorate in their performance when they are used to predict the outcomes over data on which they were not trained.
There is no framework to identify the drivers behind the drift in model performance.
We propose a novel framework - DBShap that uses principled Shapley values to identify the main contributors of the drift.
arXiv Detail & Related papers (2024-01-18T07:07:42Z)
- On the Change of Decision Boundaries and Loss in Learning with Concept Drift [8.686667049158476]
Concept drift refers to the phenomenon that the distribution generating the observed data changes over time.
Many technologies for learning with drift rely on the interleaved test-train error (ITTE) as a quantity which approximates the model generalization error.
arXiv Detail & Related papers (2022-12-02T14:58:13Z)
- Unsupervised Unlearning of Concept Drift with Autoencoders [5.41354952642957]
Concept drift refers to a change in the data distribution affecting the data stream of future samples.
This paper proposes an unsupervised and model-agnostic concept drift adaptation method at the global level.
arXiv Detail & Related papers (2022-11-23T14:52:49Z)
- Autoregressive based Drift Detection Method [0.0]
We propose a new concept drift detection method based on autoregressive models called ADDM.
Our results show that this new concept drift detection method outperforms the state-of-the-art drift detection methods.
arXiv Detail & Related papers (2022-03-09T14:36:16Z)
- How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z)
- Machine Unlearning of Features and Labels [72.81914952849334]
We propose the first method for unlearning features and labels in machine learning models.
Our approach builds on the concept of influence functions and realizes unlearning through closed-form updates of model parameters.
arXiv Detail & Related papers (2021-08-26T04:42:24Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Injecting Knowledge in Data-driven Vehicle Trajectory Predictors [82.91398970736391]
Vehicle trajectory prediction tasks have been commonly tackled from two perspectives: knowledge-driven or data-driven.
In this paper, we propose to learn a "Realistic Residual Block" (RRB) which effectively connects these two perspectives.
Our proposed method outputs realistic predictions by confining the residual range and taking into account its uncertainty.
arXiv Detail & Related papers (2021-03-08T16:03:09Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
- Switching Scheme: A Novel Approach for Handling Incremental Concept Drift in Real-World Data Sets [0.0]
Concept drifts can severely affect the prediction performance of a machine learning system.
In this work, we analyze the effects of concept drifts in the context of a real-world data set.
We introduce the switching scheme which combines the two principles of retraining and updating of a machine learning model.
arXiv Detail & Related papers (2020-11-05T10:16:54Z)