A Dataset for Evaluating Online Anomaly Detection Approaches for Discrete Multivariate Time Series
- URL: http://arxiv.org/abs/2411.13951v2
- Date: Mon, 25 Nov 2024 14:24:57 GMT
- Title: A Dataset for Evaluating Online Anomaly Detection Approaches for Discrete Multivariate Time Series
- Authors: Lucas Correia, Jan-Christoph Goos, Thomas Bäck, Anna V. Kononova
- Abstract summary: Current publicly available datasets are too small, not diverse and feature trivial anomalies.
We propose a solution: a diverse, extensive, and non-trivial dataset generated via state-of-the-art simulation tools.
We make different versions of the dataset available, where training and test subsets are offered in contaminated and clean versions.
As expected, the baseline experimentation shows that the approaches trained on the semi-supervised version of the dataset outperform their unsupervised counterparts.
- Score: 0.01874930567916036
- Abstract: Benchmarking anomaly detection approaches for multivariate time series is challenging due to the lack of high-quality datasets. Current publicly available datasets are too small, not diverse and feature trivial anomalies, which hinders measurable progress in this research area. We propose a solution: a diverse, extensive, and non-trivial dataset generated via state-of-the-art simulation tools that reflects realistic behaviour of an automotive powertrain, including its multivariate, dynamic and variable-state properties. To cater for both unsupervised and semi-supervised anomaly detection settings, as well as time series generation and forecasting, we make different versions of the dataset available, where training and test subsets are offered in contaminated and clean versions, depending on the task. We also provide baseline results from a small selection of approaches based on deterministic and variational autoencoders, as well as a non-parametric approach. As expected, the baseline experimentation shows that the approaches trained on the semi-supervised version of the dataset outperform their unsupervised counterparts, highlighting a need for approaches more robust to contaminated training data.
Related papers
- TeVAE: A Variational Autoencoder Approach for Discrete Online Anomaly Detection in Variable-state Multivariate Time-series Data [0.017476232824732776]
We propose a temporal variational autoencoder (TeVAE) that can detect anomalies with minimal false positives when trained on unlabelled data.
When properly configured, TeVAE is wrong only 6% of the time when it flags an anomaly and detects 65% of the anomalies present.
arXiv Detail & Related papers (2024-07-09T13:32:33Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- Towards a General Time Series Anomaly Detector with Adaptive Bottlenecks and Dual Adversarial Decoders [16.31103717602631]
Time series anomaly detection plays a vital role in a wide range of applications.
Existing methods require training one specific model for each dataset.
We propose a general time series anomaly detection model, which is pre-trained on extensive multi-domain datasets.
arXiv Detail & Related papers (2024-05-24T06:59:43Z)
- Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and an anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
- MA-VAE: Multi-head Attention-based Variational Autoencoder Approach for Anomaly Detection in Multivariate Time-series Applied to Automotive Endurance Powertrain Testing [0.7499722271664147]
We propose a variational autoencoder with multi-head attention (MA-VAE).
When trained on unlabelled data, MA-VAE provides very few false positives but also manages to detect the majority of anomalies presented.
It is wrong 9% of the time when it flags an anomaly and discovers 67% of the anomalies present.
arXiv Detail & Related papers (2023-09-05T14:05:37Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
- Detection of Anomalies in Multivariate Time Series Using Ensemble Techniques [3.2422067155309806]
We propose an ensemble technique that combines multiple base models toward the final decision.
A semi-supervised approach using a Logistic Regressor to combine the base models' outputs is also proposed.
The performance improvement in terms of anomaly detection accuracy reaches 2% for the unsupervised and at least 10% for the semi-supervised models.
arXiv Detail & Related papers (2023-08-06T17:51:22Z)
- Detecting Multivariate Time Series Anomalies with Zero Known Label [17.930211011723447]
MTGFlow is an unsupervised approach for multivariate time series anomaly detection.
The complex interdependencies among entities and the diverse inherent characteristics of each entity pose significant challenges for density estimation.
In experiments on five public datasets with seven baselines, MTGFlow outperforms the SOTA methods by up to 5.0 AUROC%.
arXiv Detail & Related papers (2022-08-03T14:38:19Z)
- Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.