Can Tree Based Approaches Surpass Deep Learning in Anomaly Detection? A
Benchmarking Study
- URL: http://arxiv.org/abs/2402.07281v2
- Date: Mon, 26 Feb 2024 04:56:07 GMT
- Title: Can Tree Based Approaches Surpass Deep Learning in Anomaly Detection? A
Benchmarking Study
- Authors: Santonu Sarkar, Shanay Mehta, Nicole Fernandes, Jyotirmoy Sarkar and
Snehanshu Saha
- Abstract summary: This paper evaluates a diverse array of machine learning-based anomaly detection algorithms.
The paper contributes significantly by conducting an unbiased comparison of various anomaly detection algorithms.
- Score: 0.6291443816903801
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detection of anomalous situations for complex mission-critical systems holds
paramount importance when their service continuity needs to be ensured. A major
challenge in detecting anomalies from the operational data arises due to the
imbalanced class distribution problem since the anomalies are supposed to be
rare events. This paper evaluates a diverse array of machine learning-based
anomaly detection algorithms through a comprehensive benchmark study. The paper
contributes significantly by conducting an unbiased comparison of various
anomaly detection algorithms, spanning classical machine learning including
various tree-based approaches to deep learning and outlier detection methods.
The inclusion of 104 publicly available and a few proprietary industrial
systems datasets enhances the diversity of the study, allowing for a more
realistic evaluation of algorithm performance and emphasizing the importance of
adaptability to real-world scenarios. The paper dispels the deep learning myth,
demonstrating that though powerful, deep learning is not a universal solution
in this case. We observed that recently proposed tree-based evolutionary
algorithms outperform in many scenarios. We noticed that tree-based approaches
catch a singleton anomaly in a dataset where deep learning methods fail. On the
other hand, classical SVM performs the best on datasets with more than 10%
anomalies, implying that such scenarios can be best modeled as a classification
problem rather than anomaly detection. To our knowledge, such a study on a
large number of state-of-the-art algorithms using diverse data sets, with the
objective of guiding researchers and practitioners in making informed
algorithmic choices, has not been attempted earlier.
Related papers
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - Unraveling the "Anomaly" in Time Series Anomaly Detection: A
Self-supervised Tri-domain Solution [89.16750999704969]
Anomaly labels hinder traditional supervised models in time series anomaly detection.
Various SOTA deep learning techniques, such as self-supervised learning, have been introduced to tackle this issue.
We propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD)
arXiv Detail & Related papers (2023-11-19T05:37:18Z) - Active anomaly detection based on deep one-class classification [9.904380236739398]
We tackle two essential problems of active learning for Deep SVDD: query strategy and semi-supervised learning method.
First, rather than solely identifying anomalies, our query strategy selects uncertain samples according to an adaptive boundary.
Second, we apply noise contrastive estimation in training a one-class classification model to incorporate both labeled normal and abnormal data effectively.
arXiv Detail & Related papers (2023-09-18T03:56:45Z) - Deep Learning for Time Series Anomaly Detection: A Survey [53.83593870825628]
Time series anomaly detection has applications in a wide range of research fields and applications, including manufacturing and healthcare.
The large size and complex patterns of time series have led researchers to develop specialised deep learning models for detecting anomalous patterns.
This survey focuses on providing structured and comprehensive state-of-the-art time series anomaly detection models through the use of deep learning.
arXiv Detail & Related papers (2022-11-09T22:40:22Z) - Deep Learning for Anomaly Detection in Log Data: A Survey [3.508620069426877]
Self-learning anomaly detection techniques capture patterns in log data and report unexpected log event occurrences.
Deep learning neural networks for this purpose have been presented.
There exist many different architectures for deep learning and it is non-trivial to encode raw and unstructured log data.
arXiv Detail & Related papers (2022-07-08T10:58:28Z) - Meta-learning One-class Classifiers with Eigenvalue Solvers for
Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z) - Theoretical Insights Into Multiclass Classification: A High-dimensional
Asymptotic View [82.80085730891126]
We provide the first modernally precise analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z) - Algorithmic Frameworks for the Detection of High Density Anomalies [0.0]
High-density anomalies are deviant cases positioned in the most normal regions of the data space.
This study introduces several non-parametric algorithmic frameworks for unsupervised detection.
arXiv Detail & Related papers (2020-10-09T17:48:02Z) - TadGAN: Time Series Anomaly Detection Using Generative Adversarial
Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs)
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z) - Anomaly Detection in Univariate Time-series: A Survey on the
State-of-the-Art [0.0]
Anomaly detection for time-series data has been an important research field for a long time.
Recent years an increasing number of machine learning algorithms have been developed to detect anomalies on time-series.
Researchers tried to improve these techniques using (deep) neural networks.
arXiv Detail & Related papers (2020-04-01T13:22:34Z) - Effectiveness of Tree-based Ensembles for Anomaly Discovery: Insights, Batch and Streaming Active Learning [18.49217234413188]
This paper makes four main contributions to improve the state-of-the-art in anomaly discovery using tree-based ensembles.
We develop a novel batch active learning algorithm to improve the diversity of discovered anomalies.
We present a data drift detection algorithm that not only detects the drift robustly, but also allows us to take corrective actions to adapt the anomaly detector in a principled manner.
arXiv Detail & Related papers (2019-01-23T23:41:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.