Exathlon: A Benchmark for Explainable Anomaly Detection over Time Series
- URL: http://arxiv.org/abs/2010.05073v3
- Date: Sun, 5 Sep 2021 22:03:08 GMT
- Title: Exathlon: A Benchmark for Explainable Anomaly Detection over Time Series
- Authors: Vincent Jacob, Fei Song, Arnaud Stiegler, Bijan Rad, Yanlei Diao,
Nesime Tatbul
- Abstract summary: We present Exathlon, the first benchmark for explainable anomaly detection over high-dimensional time series data.
Exathlon has been constructed based on real data traces from repeated executions of large-scale stream processing jobs on an Apache Spark cluster.
For each of the anomaly instances, ground truth labels for the root cause interval as well as those for the extended effect interval are provided.
- Score: 6.085662888748731
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Access to high-quality data repositories and benchmarks has been
instrumental in advancing the state of the art in many experimental research
domains. While advanced analytics tasks over time series data have been gaining
lots of attention, lack of such community resources severely limits scientific
progress. In this paper, we present Exathlon, the first comprehensive public
benchmark for explainable anomaly detection over high-dimensional time series
data. Exathlon has been systematically constructed based on real data traces
from repeated executions of large-scale stream processing jobs on an Apache
Spark cluster. Some of these executions were intentionally disturbed by
introducing instances of six different types of anomalous events (e.g.,
misbehaving inputs, resource contention, process failures). For each of the
anomaly instances, ground truth labels for the root cause interval as well as
those for the extended effect interval are provided, supporting the development
and evaluation of a wide range of anomaly detection (AD) and explanation
discovery (ED) tasks. We demonstrate the practical utility of Exathlon's
dataset, evaluation methodology, and end-to-end data science pipeline design
through an experimental study with three state-of-the-art AD and ED techniques.
Related papers
- ADer: A Comprehensive Benchmark for Multi-class Visual Anomaly Detection [52.228708947607636]
This paper proposes a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new anomaly detection methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on-the-fly.
Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset.
Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z)
- Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and an anomaly scorer to detect anomalies.
Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
- Unraveling the "Anomaly" in Time Series Anomaly Detection: A Self-supervised Tri-domain Solution [89.16750999704969]
The scarcity of anomaly labels hinders traditional supervised models in time series anomaly detection.
Various SOTA deep learning techniques, such as self-supervised learning, have been introduced to tackle this issue.
We propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD).
arXiv Detail & Related papers (2023-11-19T05:37:18Z)
- A Critical Review of Common Log Data Sets Used for Evaluation of Sequence-based Anomaly Detection Techniques [2.5339493426758906]
We analyze six publicly available log data sets with focus on the manifestations of anomalies and simple techniques for their detection.
Our findings suggest that most anomalies are not directly related to sequential manifestations and that advanced detection techniques are not required to achieve high detection rates on these data sets.
arXiv Detail & Related papers (2023-09-06T09:31:17Z)
- Multivariate Time-Series Anomaly Detection with Contaminated Data [9.46389554092506]
This paper presents a novel and practical end-to-end unsupervised time-series anomaly detection (TSAD) approach for training data that are contaminated with anomalies.
The introduced approach, called TSAD-C, has no access to anomaly labels during the training phase.
Our experiments conducted on three reliable datasets conclusively demonstrate that our approach surpasses existing methodologies.
arXiv Detail & Related papers (2023-08-24T05:10:18Z)
- DIVERSIFY: A General Framework for Time Series Out-of-distribution Detection and Generalization [58.704753031608625]
Time series is one of the most challenging modalities in machine learning research.
OOD detection and generalization on time series tend to suffer due to its non-stationary property.
We propose DIVERSIFY, a framework for OOD detection and generalization on dynamic distributions of time series.
arXiv Detail & Related papers (2023-08-04T12:27:11Z)
- Sintel: A Machine Learning Framework to Extract Insights from Signals [13.04826679898367]
We introduce Sintel, a machine learning framework for end-to-end time series tasks such as anomaly detection.
Sintel logs the entire anomaly detection journey, providing detailed documentation of anomalies over time.
It enables users to analyze signals, compare methods, and investigate anomalies through an interactive visualization tool.
arXiv Detail & Related papers (2022-04-19T19:38:27Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)