Deep Isolation Forest for Anomaly Detection
- URL: http://arxiv.org/abs/2206.06602v4
- Date: Fri, 9 Jun 2023 01:19:55 GMT
- Title: Deep Isolation Forest for Anomaly Detection
- Authors: Hongzuo Xu and Guansong Pang and Yijie Wang and Yongjun Wang
- Abstract summary: Isolation forest (iForest) has been emerging as arguably the most popular anomaly detector in recent years.
Our model achieves significant improvement over state-of-the-art isolation-based methods and deep detectors on datasets.
- Score: 16.581154394513025
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Isolation forest (iForest) has been emerging as arguably the most popular
anomaly detector in recent years due to its general effectiveness across
different benchmarks and strong scalability. Nevertheless, its linear
axis-parallel isolation method often leads to (i) failure in detecting hard
anomalies that are difficult to isolate in
high-dimensional/non-linear-separable data space, and (ii) notorious
algorithmic bias that assigns unexpectedly lower anomaly scores to artefact
regions. These issues contribute to high false negative errors. Several iForest
extensions are introduced, but they essentially still employ shallow, linear
data partition, restricting their power in isolating true anomalies. Therefore,
this paper proposes deep isolation forest. We introduce a new representation
scheme that utilises randomly initialised neural networks to map original data
into random representation ensembles, where random axis-parallel cuts are
subsequently applied to perform the data partition. This representation scheme
facilitates high freedom of the partition in the original data space
(equivalent to non-linear partition on subspaces of varying sizes), encouraging
a unique synergy between random representations and random partition-based
isolation. Extensive experiments show that our model achieves significant
improvement over state-of-the-art isolation-based methods and deep detectors on
tabular, graph and time series datasets; our model also inherits desired
scalability from iForest.
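The representation scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes small untrained two-layer tanh networks as the random mappings, a plain isolation tree built in each representation space, and the standard iForest score 2^(-E[h]/c(psi)). All function names, layer sizes, and ensemble sizes (`make_random_net`, `deep_iforest_scores`, `n_nets`, `trees_per_net`, `sample_size`) are hypothetical choices made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def make_random_net(in_dim, hidden_dim, out_dim, rng):
    """Return the forward function of an untrained, randomly weighted MLP."""
    w1 = [[rng.gauss(0, 1 / math.sqrt(in_dim)) for _ in range(in_dim)]
          for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0, 1 / math.sqrt(hidden_dim)) for _ in range(hidden_dim)]
          for _ in range(out_dim)]

    def forward(x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return forward


def build_tree(points, depth, max_depth, rng):
    """Isolation tree: pick a random feature, then a random axis-parallel cut."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    cut = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return {"dim": dim, "cut": cut,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def avg_path(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def path_length(tree, x, depth=0):
    """Depth at which x lands, plus the leaf-size correction term."""
    if "size" in tree:
        return depth + avg_path(tree["size"])
    child = tree["left"] if x[tree["dim"]] < tree["cut"] else tree["right"]
    return path_length(child, x, depth + 1)


def deep_iforest_scores(data, n_nets=5, trees_per_net=6, sample_size=64, seed=0):
    """Score every point; assumes data holds at least two points."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    totals = [0.0] * len(data)
    for _ in range(n_nets):
        # One random representation per network (hidden=8, output=4 assumed).
        net = make_random_net(len(data[0]), 8, 4, rng)
        reps = [net(x) for x in data]
        for _ in range(trees_per_net):
            tree = build_tree(rng.sample(reps, sample_size), 0, max_depth, rng)
            for i, rep in enumerate(reps):
                totals[i] += path_length(tree, rep)
    n_trees = n_nets * trees_per_net
    norm = avg_path(sample_size)
    return [2.0 ** (-(total / n_trees) / norm) for total in totals]


# Toy data: a tight Gaussian cluster plus one planted outlier (last point).
data_rng = random.Random(1)
data = [[data_rng.gauss(0, 0.2), data_rng.gauss(0, 0.2)] for _ in range(200)]
data.append([1.5, 1.5])
scores = deep_iforest_scores(data)
print("outlier score:", round(scores[-1], 3))
```

Because each cut is axis-parallel in a randomly mapped space, it corresponds to a non-linear boundary in the original space. Higher scores mean shorter average isolation paths, so the planted outlier would be expected to score above typical cluster points.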
Related papers
- GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection [60.78684630040313]
Diffusion models tend to reconstruct normal counterparts of test images with certain noises added.
From the global perspective, the difficulty of reconstructing images with different anomalies is uneven.
We propose a global and local adaptive diffusion model (abbreviated to GLAD) for unsupervised anomaly detection.
arXiv Detail & Related papers (2024-06-11T17:27:23Z)
- Anomaly Detection Based on Isolation Mechanisms: A Survey [13.449446806837422]
Isolation-based unsupervised anomaly detection is a novel and effective approach for identifying anomalies in data.
We review the state-of-the-art isolation-based anomaly detection methods, including their data partitioning strategies, anomaly score functions, and algorithmic details.
arXiv Detail & Related papers (2024-03-16T04:29:21Z)
- MLAD: A Unified Model for Multi-system Log Anomaly Detection [35.68387377240593]
We propose MLAD, a novel anomaly detection model that incorporates semantic relational reasoning across multiple systems.
Specifically, we employ Sentence-BERT to capture the similarities between log sequences and convert them into high-dimensional learnable semantic vectors.
We revamp the formulas of the Attention layer to discern the significance of each keyword in the sequence and model the overall distribution of the multi-system dataset.
arXiv Detail & Related papers (2024-01-15T12:51:13Z)
- Subspace-Guided Feature Reconstruction for Unsupervised Anomaly Localization [5.085309164633571]
Unsupervised anomaly localization plays a critical role in industrial manufacturing.
Most recent methods perform feature matching or reconstruction for the target sample with pre-trained deep neural networks.
We propose a novel subspace-guided feature reconstruction framework to pursue adaptive feature approximation for anomaly localization.
arXiv Detail & Related papers (2023-09-25T06:58:57Z)
- ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions.
We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution.
Finally, we focus on 3D point clouds for which we utilize the explicit nature of NFs, i.e. surface normals extracted from the gradient of the log-likelihood and the log-likelihood itself.
arXiv Detail & Related papers (2022-08-18T16:07:59Z)
- Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
- Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity-regularized models and apply it to shared-memory and distributed-memory architectures, respectively.
We prove that the proposed method achieves the linear convergence rate with lower overall complexity and can eliminate almost all the inactive features in a finite number of iterations almost surely.
arXiv Detail & Related papers (2022-04-23T02:45:55Z)
- TiWS-iForest: Isolation Forest in Weakly Supervised and Tiny ML scenarios [2.7285752469525315]
Isolation Forest is a popular algorithm able to define an anomaly score by means of an ensemble of peculiar trees called isolation trees.
We show that the standard algorithm might be improved in terms of memory requirements, latency and performance.
We propose TiWS-iForest, an approach that, by leveraging weak supervision, is able to reduce Isolation Forest complexity and to enhance detection performance.
arXiv Detail & Related papers (2021-11-30T14:24:27Z)
- Discriminative-Generative Dual Memory Video Anomaly Detection [81.09977516403411]
Recently, people tried to use a few anomalies for video anomaly detection (VAD) instead of only normal data during the training process.
We propose a DiscRiminative-gEnerative duAl Memory (DREAM) anomaly detection model to take advantage of a few anomalies and solve data imbalance.
arXiv Detail & Related papers (2021-04-29T15:49:01Z)
- Online stochastic gradient descent on non-convex losses from high-dimensional inference [2.2344764434954256]
Stochastic gradient descent (SGD) is a popular algorithm for optimization problems in high-dimensional tasks.
In this paper we produce an estimator of non-trivial correlation from data.
We illustrate our approach by applying it to a set of tasks such as phase retrieval, and estimation for generalized models.
arXiv Detail & Related papers (2020-03-23T17:34:06Z)