Employing chunk size adaptation to overcome concept drift
- URL: http://arxiv.org/abs/2110.12881v1
- Date: Mon, 25 Oct 2021 12:36:22 GMT
- Title: Employing chunk size adaptation to overcome concept drift
- Authors: Jędrzej Kozal, Filip Guzy, Michał Woźniak
- Abstract summary: We propose a new Chunk Adaptive Restoration framework that can be adapted to any block-based data stream classification algorithm.
The proposed algorithm adjusts the data chunk size in the case of concept drift detection to minimize the impact of the change on the predictive performance of the used model.
- Score: 2.277447144331876
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern analytical systems must be ready to process streaming data and
correctly respond to data distribution changes. The phenomenon of changes in
data distributions is called concept drift, and it may harm the quality of the
used models. Additionally, because concept drift may appear at any time, the
algorithms used must be ready to continuously adapt the model to the
changing data distributions. This work focuses on non-stationary
data stream classification, where a classifier ensemble is used. To keep the
ensemble model up to date, the new base classifiers are trained on the incoming
data blocks and added to the ensemble while, at the same time, outdated models
are removed from the ensemble. One of the problems with this type of model is
reacting quickly to changes in data distributions. We propose a new Chunk
Adaptive Restoration framework that can be adapted to any block-based data
stream classification algorithm. The proposed algorithm adjusts the data chunk
size in the case of concept drift detection to minimize the impact of the
change on the predictive performance of the used model. Experiments,
backed by statistical tests, show that
Chunk Adaptive Restoration significantly reduces the model's restoration time.
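
The chunk size adjustment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ChunkSizeScheduler`, the parameters `base_size`, `drift_size`, and `growth`, and the choice to shrink the chunk on a drift signal and then grow it back multiplicatively are all assumptions made for illustration. The drift signal itself is taken as given, from whatever detector is in use.

```python
from dataclasses import dataclass


@dataclass
class ChunkSizeScheduler:
    """Hypothetical chunk-size schedule: shrink on drift, then grow back."""

    base_size: int = 1000   # chunk size while the stream is stable (assumed value)
    drift_size: int = 100   # chunk size right after a drift signal (assumed value)
    growth: float = 2.0     # growth factor per chunk while restoring (assumed value)

    def __post_init__(self) -> None:
        # Start in the stable regime.
        self.current = self.base_size

    def next_size(self, drift_detected: bool) -> int:
        """Return the size of the next chunk to read from the stream."""
        if drift_detected:
            # Drift signalled: switch to small chunks so the model is updated sooner.
            self.current = self.drift_size
        else:
            # No drift: grow back toward the base size, never exceeding it.
            self.current = min(self.base_size, int(self.current * self.growth))
        return self.current


def chunk_sizes(drift_flags, scheduler):
    """Map a sequence of per-chunk drift flags to the chunk sizes that follow them."""
    return [scheduler.next_size(flag) for flag in drift_flags]


# Drift is signalled on the third chunk; the size drops and then recovers.
sizes = chunk_sizes(
    [False, False, True, False, False, False, False],
    ChunkSizeScheduler(),
)
print(sizes)  # [1000, 1000, 100, 200, 400, 800, 1000]
```

In a block-based ensemble, each returned size would determine how many samples are collected before the next base classifier is trained. Under the assumptions above, smaller chunks right after a drift mean that classifiers trained on the new concept enter the ensemble earlier.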
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Distilled Datamodel with Reverse Gradient Matching [74.75248610868685]
We introduce an efficient framework for assessing data impact, comprising offline training and online evaluation stages.
Our proposed method achieves comparable model behavior evaluation while significantly speeding up the process compared to the direct retraining method.
arXiv Detail & Related papers (2024-04-22T09:16:14Z)
- Addressing Concept Shift in Online Time Series Forecasting: Detect-then-Adapt [37.98336090671441]
Concept Drift Detection anD Adaptation (D3A)
It first detects drifting concepts and then aggressively adapts the current model to the drifted concepts after detection for rapid adaptation.
It helps mitigate the data distribution gap, a critical factor contributing to train-test performance inconsistency.
arXiv Detail & Related papers (2024-03-22T04:44:43Z)
- Quilt: Robust Data Segment Selection against Concept Drifts [30.62320149405819]
Continuous machine learning pipelines are common in industrial settings where models are periodically trained on data streams.
Concept drifts may occur in data streams, where the joint distribution of the data X and label y, P(X, y), changes over time and may degrade model accuracy.
Existing concept drift adaptation approaches mostly focus on updating the model to the new data and tend to discard the drifted historical data.
We propose Quilt, a data-centric framework for identifying and selecting data segments that maximize model accuracy.
arXiv Detail & Related papers (2023-12-15T11:10:34Z)
- Unsupervised Unlearning of Concept Drift with Autoencoders [5.41354952642957]
Concept drift refers to a change in the data distribution affecting the data stream of future samples.
This paper proposes an unsupervised and model-agnostic concept drift adaptation method at the global level.
arXiv Detail & Related papers (2022-11-23T14:52:49Z)
- On-the-Fly Test-time Adaptation for Medical Image Segmentation [63.476899335138164]
Adapting the source model to target data distribution at test-time is an efficient solution for the data-shift problem.
We propose a new framework called Adaptive UNet where each convolutional block is equipped with an adaptive batch normalization layer.
During test-time, the model takes in just the new test image and generates a domain code to adapt the features of source model according to the test data.
arXiv Detail & Related papers (2022-03-10T18:51:29Z)
- Autoregressive based Drift Detection Method [0.0]
We propose a new concept drift detection method based on autoregressive models called ADDM.
Our results show that this new concept drift detection method outperforms the state-of-the-art drift detection methods.
arXiv Detail & Related papers (2022-03-09T14:36:16Z)
- Unsupervised Model Drift Estimation with Batch Normalization Statistics for Dataset Shift Detection and Model Selection [0.0]
We propose a novel method of model drift estimation by exploiting statistics of batch normalization layer on unlabeled test data.
We show the effectiveness of our method not only on dataset shift detection but also on model selection when there are multiple candidate models among model zoo or training trajectories in an unsupervised way.
arXiv Detail & Related papers (2021-07-01T03:04:47Z)
- Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn.
We then show that distillation performs strongly for low churn training against a number of recent baselines.
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
- Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
- On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves the distributional shift robustness.
arXiv Detail & Related papers (2020-07-16T18:39:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.