Rare Yet Popular: Evidence and Implications from Labeled Datasets for
Network Anomaly Detection
- URL: http://arxiv.org/abs/2211.10129v1
- Date: Fri, 18 Nov 2022 10:14:03 GMT
- Title: Rare Yet Popular: Evidence and Implications from Labeled Datasets for
Network Anomaly Detection
- Authors: Jose Manuel Navarro, Alexis Huet and Dario Rossi
- Abstract summary: We present a systematic analysis of available public and private ground truth for anomaly detection in the context of network environments.
Our analysis reveals that, while anomalies are, by definition, temporally rare events, their spatial characterization clearly shows that some types of anomalies are significantly more popular than others.
- Score: 9.717823994163277
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly detection research works generally propose algorithms or end-to-end
systems that are designed to automatically discover outliers in a dataset or a
stream. While literature abounds concerning algorithms or the definition of
metrics for better evaluation, the quality of the ground truth against which
they are evaluated is seldom questioned. In this paper, we present a systematic
analysis of available public (and additionally our private) ground truth for
anomaly detection in the context of network environments, where data is
intrinsically temporal, multivariate and, in particular, exhibits spatial
properties, which, to the best of our knowledge, we are the first to explore.
Our analysis reveals that, while anomalies are, by definition, temporally rare
events, their spatial characterization clearly shows that some types of anomalies
are significantly more popular than others. We find that simple clustering can
reduce the need for human labeling by a factor of 2x-10x, a reduction that we
are the first to quantitatively analyze in the wild.
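The labeling claim in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract only says "simple clustering", so the greedy leader clustering used here, the `leader_clustering` helper, the `radius` threshold and the synthetic feature vectors are all assumptions made for illustration. The idea is that if many anomalies fall into a few popular groups, an expert may only need to label one representative per group.

```python
from math import dist


def leader_clustering(points, radius):
    """Greedy 'leader' clustering (an assumed stand-in for "simple clustering").

    A point joins the first cluster whose leader lies within `radius`;
    otherwise it becomes the leader of a new cluster.
    """
    leaders = []      # one representative point per cluster
    assignment = []   # cluster index for every input point
    for p in points:
        for idx, leader in enumerate(leaders):
            if dist(p, leader) <= radius:
                assignment.append(idx)
                break
        else:
            # No existing leader is close enough: start a new cluster.
            leaders.append(p)
            assignment.append(len(leaders) - 1)
    return leaders, assignment


# Synthetic, hypothetical anomaly feature vectors: two popular types, one rare.
anomalies = [
    (0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (0.0, 0.0),  # popular type A
    (5.0, 5.1), (5.1, 5.0), (4.9, 5.0),                # popular type B
    (9.0, 0.0),                                        # rare type C
]

leaders, assignment = leader_clustering(anomalies, radius=0.5)

# Labeling only the leaders and propagating each label to its cluster
# replaces len(anomalies) manual labels with len(leaders) labels.
reduction = len(anomalies) / len(leaders)
print(f"{len(anomalies)} anomalies, {len(leaders)} labels needed, "
      f"{reduction:.1f}x fewer labels")
```

On this toy input, eight anomalies collapse into three clusters, so three labels stand in for eight. The achievable factor on real data depends on how skewed the anomaly types are and on the chosen distance and threshold.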
Related papers
- Open Challenges in Time Series Anomaly Detection: An Industry Perspective [0.0]
We list several areas that are of practical relevance and that we believe are either under-investigated or missing entirely from the current discourse.
Based on an investigation of systems deployed in a cloud environment, we motivate the areas of streaming algorithms, human-in-the-loop scenarios, point processes, conditional anomalies and populations analysis of time series.
arXiv Detail & Related papers (2025-02-08T00:38:07Z)
- Anomaly Detection by Context Contrasting [57.695202846009714]
Anomaly detection focuses on identifying samples that deviate from the norm.
Recent advances in self-supervised learning have shown great promise in this regard.
We propose Con$_2$, which learns through context augmentations.
arXiv Detail & Related papers (2024-05-29T07:59:06Z)
- Anomaly component analysis [3.046315755726937]
We introduce a new statistical tool dedicated to exploratory analysis of abnormal observations using data depth as a score.
Anomaly component analysis (ACA for short) is a method that searches for a low-dimensional data representation that best visualises and explains anomalies.
arXiv Detail & Related papers (2023-12-26T17:57:46Z)
- Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation: A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the one-class classification (OCC) setting.
Our method performs on par with other existing state-of-the-art PA-generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z)
- Precursor-of-Anomaly Detection for Irregular Time Series [31.73234935455713]
We present a novel type of anomaly detection, called Precursor-of-Anomaly (PoA) detection.
To solve both conventional anomaly detection and PoA detection at the same time, we present a neural controlled differential equation-based neural network and its multi-task learning algorithm.
arXiv Detail & Related papers (2023-06-27T14:10:09Z)
- A Taxonomy of Anomalies in Log Data [0.09558392439655014]
A common taxonomy for anomalies already exists, but it has not yet been applied specifically to log data.
We present a taxonomy for different kinds of log data anomalies and introduce a method for analyzing such anomalies in labeled datasets.
Our results show that the most common anomaly type is also the easiest to predict.
arXiv Detail & Related papers (2021-11-26T12:23:06Z)
- Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability.
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)
- Algorithmic Frameworks for the Detection of High Density Anomalies [0.0]
High-density anomalies are deviant cases positioned in the most normal regions of the data space.
This study introduces several non-parametric algorithmic frameworks for unsupervised detection.
arXiv Detail & Related papers (2020-10-09T17:48:02Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)
- Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data [150.9270911031327]
We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset.
Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data.
We propose here instead a deep reinforcement learning-based approach that enables an end-to-end optimization of the detection of both labeled and unlabeled anomalies.
arXiv Detail & Related papers (2020-09-15T03:05:39Z)
- Deep Weakly-supervised Anomaly Detection [118.55172352231381]
Pairwise Relation prediction Network (PReNet) learns pairwise relation features and anomaly scores.
PReNet can detect any seen/unseen abnormalities that fit the learned pairwise abnormal patterns.
Empirical results on 12 real-world datasets show that PReNet significantly outperforms nine competing methods in detecting seen and unseen anomalies.
arXiv Detail & Related papers (2019-10-30T00:40:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.