Improving Real-Time Concept Drift Detection using a Hybrid Transformer-Autoencoder Framework
- URL: http://arxiv.org/abs/2508.07085v1
- Date: Sat, 09 Aug 2025 19:39:33 GMT
- Title: Improving Real-Time Concept Drift Detection using a Hybrid Transformer-Autoencoder Framework
- Authors: N Harshit, K Mounvik,
- Abstract summary: In applied machine learning, concept drift can significantly reduce model performance.<n>Our study proposes a hybrid framework consisting of Transformers and Autoencoders to model complex temporal dynamics.<n>Our results support that the Transformation-Autoencoder detected drift earlier and with more sensitivity than the autoencoders commonly used in the literature.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In applied machine learning, concept drift, which is either gradual or abrupt changes in data distribution, can significantly reduce model performance. Typical detection methods,such as statistical tests or reconstruction-based models,are generally reactive and not very sensitive to early detection. Our study proposes a hybrid framework consisting of Transformers and Autoencoders to model complex temporal dynamics and provide online drift detection. We create a distinct Trust Score methodology, which includes signals on (1) statistical and reconstruction-based drift metrics, more specifically, PSI, JSD, Transformer-AE error, (2) prediction uncertainty, (3) rules violations, and (4) trend of classifier error aligned with the combined metrics defined by the Trust Score. Using a time sequenced airline passenger data set with synthetic drift, our proposed model allows for a better detection of drift using as a whole and at different detection thresholds for both sensitivity and interpretability compared to baseline methods and provides a strong pipeline for drift detection in real time for applied machine learning. We evaluated performance using a time-sequenced airline passenger dataset having the gradually injected stimulus of drift in expectations,e.g. permuted ticket prices in later batches, broken into 10 time segments [1].In the data, our results support that the Transformation-Autoencoder detected drift earlier and with more sensitivity than the autoencoders commonly used in the literature, and provided improved modeling over more error rates and logical violations. Therefore, a robust framework was developed to reliably monitor concept drift.
Related papers
- STREAM-VAE: Dual-Path Routing for Slow and Fast Dynamics in Vehicle Telemetry Anomaly Detection [1.7751300245073598]
STREAM-VAE is a variational autoencoder for anomaly detection in automotive telemetry time-series data.<n>Our model uses a dual-path encoder to separate slow drift and fast spike signal dynamics, and a decoder that represents transient deviations.
arXiv Detail & Related papers (2025-11-19T10:58:40Z) - ResAD: Normalized Residual Trajectory Modeling for End-to-End Autonomous Driving [64.42138266293202]
ResAD is a Normalized Residual Trajectory Modeling framework.<n>It reframes the learning task to predict the residual deviation from an inertial reference.<n>On the NAVSIM benchmark, ResAD achieves a state-of-the-art PDMS of 88.6 using a vanilla diffusion policy.
arXiv Detail & Related papers (2025-10-09T17:59:36Z) - Revisiting Multivariate Time Series Forecasting with Missing Values [65.30332997607141]
Missing values are common in real-world time series.<n>Current approaches have developed an imputation-then-prediction framework that uses imputation modules to fill in missing values, followed by forecasting on the imputed data.<n>This framework overlooks a critical issue: there is no ground truth for the missing values, making the imputation process susceptible to errors that can degrade prediction accuracy.<n>We introduce Consistency-Regularized Information Bottleneck (CRIB), a novel framework built on the Information Bottleneck principle.
arXiv Detail & Related papers (2025-09-27T20:57:48Z) - Flexible and Efficient Drift Detection without Labels [4.517793609535323]
A lot of research on concept drift has focused on the supervised case that assumes the true labels of supervised tasks are available immediately after making predictions.<n>We propose a flexible and efficient concept drift detection algorithm that uses classical statistical process control in a label-less setting.
arXiv Detail & Related papers (2025-06-10T12:31:04Z) - Unsupervised Concept Drift Detection from Deep Learning Representations in Real-time [5.999777817331315]
Concept drift is the phenomenon in which the underlying data distributions and statistical properties of a target domain change over time.<n>textscDriftLens is an unsupervised framework for real-time concept drift detection and characterization.
arXiv Detail & Related papers (2024-06-24T23:41:46Z) - Unsupervised Domain Adaptation for Self-Driving from Past Traversal
Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z) - Uncovering Drift in Textual Data: An Unsupervised Method for Detecting
and Mitigating Drift in Machine Learning Models [9.035254826664273]
Drift in machine learning refers to the phenomenon where the statistical properties of data or context, in which the model operates, change over time leading to a decrease in its performance.
In our proposed unsupervised drift detection method, we follow a two step process. Our first step involves encoding a sample of production data as the target distribution, and the model training data as the reference distribution.
Our method also identifies the subset of production data that is the root cause of the drift.
The models retrained using these identified high drift samples show improved performance on online customer experience quality metrics.
arXiv Detail & Related papers (2023-09-07T16:45:42Z) - CADM: Confusion Model-based Detection Method for Real-drift in Chunk
Data Stream [3.0885191226198785]
Concept drift detection has attracted considerable attention due to its importance in many real-world applications such as health monitoring and fault diagnosis.
We propose a new approach to detect real-drift in the chunk data stream with limited annotations based on concept confusion.
arXiv Detail & Related papers (2023-03-25T08:59:27Z) - Correlating sparse sensing for large-scale traffic speed estimation: A
Laplacian-enhanced low-rank tensor kriging approach [76.45949280328838]
We propose a Laplacian enhanced low-rank tensor (LETC) framework featuring both lowrankness and multi-temporal correlations for large-scale traffic speed kriging.
We then design an efficient solution algorithm via several effective numeric techniques to scale up the proposed model to network-wide kriging.
arXiv Detail & Related papers (2022-10-21T07:25:57Z) - A Robust and Explainable Data-Driven Anomaly Detection Approach For
Power Electronics [56.86150790999639]
We present two anomaly detection and classification approaches, namely the Matrix Profile algorithm and anomaly transformer.
The Matrix Profile algorithm is shown to be well suited as a generalizable approach for detecting real-time anomalies in streaming time-series data.
A series of custom filters is created and added to the detector to tune its sensitivity, recall, and detection accuracy.
arXiv Detail & Related papers (2022-09-23T06:09:35Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Bayesian Autoencoders for Drift Detection in Industrial Environments [69.93875748095574]
Autoencoders are unsupervised models which have been used for detecting anomalies in multi-sensor environments.
Anomalies can come either from real changes in the environment (real drift) or from faulty sensory devices (virtual drift)
arXiv Detail & Related papers (2021-07-28T10:19:58Z) - Detecting Concept Drift With Neural Network Model Uncertainty [0.0]
Uncertainty Drift Detection (UDD) is able to detect drifts without access to true labels.
In contrast to input data-based drift detection, our approach considers the effects of the current input data on the properties of the prediction model.
We show that UDD outperforms other state-of-the-art strategies on two synthetic as well as ten real-world data sets for both regression and classification tasks.
arXiv Detail & Related papers (2021-07-05T08:56:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.