Exathlon: A Benchmark for Explainable Anomaly Detection over Time Series
- URL: http://arxiv.org/abs/2010.05073v3
- Date: Sun, 5 Sep 2021 22:03:08 GMT
- Title: Exathlon: A Benchmark for Explainable Anomaly Detection over Time Series
- Authors: Vincent Jacob, Fei Song, Arnaud Stiegler, Bijan Rad, Yanlei Diao,
Nesime Tatbul
- Abstract summary: We present Exathlon, the first benchmark for explainable anomaly detection over high-dimensional time series data.
Exathlon has been constructed based on real data traces from repeated executions of large-scale stream processing jobs on an Apache Spark cluster.
For each of the anomaly instances, ground truth labels for the root cause interval as well as those for the extended effect interval are provided.
- Score: 6.085662888748731
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Access to high-quality data repositories and benchmarks have been
instrumental in advancing the state of the art in many experimental research
domains. While advanced analytics tasks over time series data have been gaining
lots of attention, lack of such community resources severely limits scientific
progress. In this paper, we present Exathlon, the first comprehensive public
benchmark for explainable anomaly detection over high-dimensional time series
data. Exathlon has been systematically constructed based on real data traces
from repeated executions of large-scale stream processing jobs on an Apache
Spark cluster. Some of these executions were intentionally disturbed by
introducing instances of six different types of anomalous events (e.g.,
misbehaving inputs, resource contention, process failures). For each of the
anomaly instances, ground truth labels for the root cause interval as well as
those for the extended effect interval are provided, supporting the development
and evaluation of a wide range of anomaly detection (AD) and explanation
discovery (ED) tasks. We demonstrate the practical utility of Exathlon's
dataset, evaluation methodology, and end-to-end data science pipeline design
through an experimental study with three state-of-the-art AD and ED techniques.
Related papers
- See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers [23.701716999879636]
Time series anomaly detection (TSAD) is becoming increasingly vital due to the rapid growth of time series data.
We introduce a pioneering framework called the Time Series Anomaly Multimodal Analyzer (TAMA) to enhance both the detection and interpretation of anomalies.
arXiv Detail & Related papers (2024-11-04T10:28:41Z) - Online Model-based Anomaly Detection in Multivariate Time Series: Taxonomy, Survey, Research Challenges and Future Directions [0.017476232824732776]
Time-series anomaly detection plays an important role in engineering processes.
This survey introduces a novel taxonomy where a distinction between online and offline, and training and inference is made.
It presents the most popular data sets and evaluation metrics used in the literature, as well as a detailed analysis.
arXiv Detail & Related papers (2024-08-07T13:01:10Z) - A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a one-for-all'' GAD model to detect anomalies across various graph datasets on-the-fly.
equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset.
Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z) - Graph Spatiotemporal Process for Multivariate Time Series Anomaly
Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graphtemporal process and anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z) - Unraveling the "Anomaly" in Time Series Anomaly Detection: A
Self-supervised Tri-domain Solution [89.16750999704969]
Anomaly labels hinder traditional supervised models in time series anomaly detection.
Various SOTA deep learning techniques, such as self-supervised learning, have been introduced to tackle this issue.
We propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD)
arXiv Detail & Related papers (2023-11-19T05:37:18Z) - A Critical Review of Common Log Data Sets Used for Evaluation of
Sequence-based Anomaly Detection Techniques [2.5339493426758906]
We analyze six publicly available log data sets with focus on the manifestations of anomalies and simple techniques for their detection.
Our findings suggest that most anomalies are not directly related to sequential manifestations and that advanced detection techniques are not required to achieve high detection rates on these data sets.
arXiv Detail & Related papers (2023-09-06T09:31:17Z) - DIVERSIFY: A General Framework for Time Series Out-of-distribution
Detection and Generalization [58.704753031608625]
Time series is one of the most challenging modalities in machine learning research.
OOD detection and generalization on time series tend to suffer due to its non-stationary property.
We propose DIVERSIFY, a framework for OOD detection and generalization on dynamic distributions of time series.
arXiv Detail & Related papers (2023-08-04T12:27:11Z) - Sintel: A Machine Learning Framework to Extract Insights from Signals [13.04826679898367]
We introduce Sintel, a machine learning framework for end-to-end time series tasks such as anomaly detection.
Sintel logs the entire anomaly detection journey, providing detailed documentation of anomalies over time.
It enables users to analyze signals, compare methods, and investigate anomalies through an interactive visualization tool.
arXiv Detail & Related papers (2022-04-19T19:38:27Z) - TadGAN: Time Series Anomaly Detection Using Generative Adversarial
Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs)
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.