Current Time Series Anomaly Detection Benchmarks are Flawed and are
Creating the Illusion of Progress
- URL: http://arxiv.org/abs/2009.13807v5
- Date: Sat, 3 Sep 2022 04:46:38 GMT
- Authors: Renjie Wu, Eamonn J. Keogh
- Abstract summary: We introduce the UCR Time Series Anomaly Archive.
This resource is expected to play a role similar to that of the UCR Time Series Classification Archive.
- Score: 11.689905300531917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Time series anomaly detection has been a perennially important topic in data
science, with papers dating back to the 1950s. However, in recent years there
has been an explosion of interest in this topic, much of it driven by the
success of deep learning in other domains and for other time series tasks. Most
of these papers test on one or more of a handful of popular benchmark datasets,
created by Yahoo, Numenta, NASA, etc. In this work we make a surprising claim.
The majority of the individual exemplars in these datasets suffer from one or
more of four flaws. Because of these four flaws, we believe that many published
comparisons of anomaly detection algorithms may be unreliable, and more
importantly, much of the apparent progress in recent years may be illusionary.
In addition to demonstrating these claims, with this paper we introduce the UCR
Time Series Anomaly Archive. We believe that this resource will perform a
similar role as the UCR Time Series Classification Archive, by providing the
community with a benchmark that allows meaningful comparisons between
approaches and a meaningful gauge of overall progress.
Related papers
- Online Model-based Anomaly Detection in Multivariate Time Series: Taxonomy, Survey, Research Challenges and Future Directions [0.017476232824732776]
Time-series anomaly detection plays an important role in engineering processes.
This survey introduces a novel taxonomy where a distinction between online and offline, and training and inference is made.
It presents the most popular data sets and evaluation metrics used in the literature, as well as a detailed analysis.
(2024-08-07T13:01:10Z)
- Too Good To Be True: performance overestimation in (re)current practices for Human Activity Recognition [49.1574468325115]
Sliding windows for data segmentation followed by standard random k-fold cross validation produce biased results.
It is important to raise awareness in the scientific community about this problem, whose negative effects are being overlooked.
Several experiments with different types of datasets and different types of classification models allow us to exhibit the problem and show it persists independently of the method or dataset.
(2023-10-18T13:24:05Z)
- Image Classification with Small Datasets: Overview and Benchmark [0.0]
We systematically organize and connect past studies to consolidate a community that is currently fragmented and scattered.
We propose a common benchmark that allows for an objective comparison of approaches.
We use this benchmark to re-evaluate the standard cross-entropy baseline and ten existing methods published between 2017 and 2021 at renowned venues.
(2022-12-23T17:11:16Z)
- Are we certain it's anomalous? [57.729669157989235]
Anomaly detection in time series is a complex task since anomalies are rare due to highly non-linear temporal correlations.
Here we propose the novel use of Hyperbolic uncertainty for Anomaly Detection (HypAD).
HypAD learns self-supervisedly to reconstruct the input signal.
(2022-11-16T21:31:39Z)
- Geodesics, Non-linearities and the Archive of Novelty Search [69.6462706723023]
We show that a key effect of the archive is that it counterbalances the exploration biases that result from the use of inadequate behavior metrics.
Our observations seem to hint that attributing a more active role to the archive in sampling can be beneficial.
(2022-05-06T12:03:40Z)
- Time Series Analysis via Network Science: Concepts and Algorithms [62.997667081978825]
This review provides a comprehensive overview of existing mapping methods for transforming time series into networks.
We describe the main conceptual approaches, provide authoritative references and give insight into their advantages and limitations in a unified notation and language.
Although still very recent, this research area has much potential and with this survey we intend to pave the way for future research on the topic.
(2021-10-11T13:33:18Z)
- A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning [60.720251418816815]
We present a large-scale study on unsupervised representation learning from videos.
Our objective encourages temporally-persistent features in the same video.
We find that encouraging long-spanned persistency can be effective even if the timespan is 60 seconds.
(2021-04-29T17:59:53Z)
- Uncertain Time Series Classification With Shapelet Transform [1.4467794332678539]
Time series classification is a task that aims at classifying chronological data.
We propose a new uncertain dissimilarity measure based on Euclidean distance.
We then propose the uncertain shapelet transform algorithm for the classification of uncertain time series.
(2021-02-03T14:46:01Z)
- Exathlon: A Benchmark for Explainable Anomaly Detection over Time Series [6.085662888748731]
We present Exathlon, the first benchmark for explainable anomaly detection over high-dimensional time series data.
Exathlon has been constructed based on real data traces from repeated executions of large-scale stream processing jobs on an Apache Spark cluster.
For each of the anomaly instances, ground truth labels for the root cause interval as well as those for the extended effect interval are provided.
(2020-10-10T19:31:22Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
(2020-09-16T15:52:04Z)
- Monash University, UEA, UCR Time Series Extrinsic Regression Archive [6.5513221781395465]
We aim to motivate and support the research into Time Series Extrinsic Regression (TSER) by introducing the first TSER benchmarking archive.
This archive contains 19 datasets from different domains, with varying number of dimensions, unequal length dimensions, and missing values.
In this paper, we introduce the datasets in this archive and run an initial benchmark on existing models.
(2020-06-19T07:47:57Z)