Related papers: Deep Isolation Forest for Anomaly Detection

URL: http://arxiv.org/abs/2206.06602v4
Date: Fri, 9 Jun 2023 01:19:55 GMT
Title: Deep Isolation Forest for Anomaly Detection
Authors: Hongzuo Xu and Guansong Pang and Yijie Wang and Yongjun Wang
Abstract summary: Isolation forest (iForest) has been emerging as arguably the most popular anomaly detector in recent years. Our model achieves significant improvement over state-of-the-art isolation-based methods and deep detectors on tabular, graph and time series datasets.
Score: 16.581154394513025
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Isolation forest (iForest) has been emerging as arguably the most popular anomaly detector in recent years due to its general effectiveness across different benchmarks and strong scalability. Nevertheless, its linear axis-parallel isolation method often leads to (i) failure in detecting hard anomalies that are difficult to isolate in high-dimensional/non-linear-separable data space, and (ii) notorious algorithmic bias that assigns unexpectedly lower anomaly scores to artefact regions. These issues contribute to high false negative errors. Several iForest extensions are introduced, but they essentially still employ shallow, linear data partition, restricting their power in isolating true anomalies. Therefore, this paper proposes deep isolation forest. We introduce a new representation scheme that utilises casually initialised neural networks to map original data into random representation ensembles, where random axis-parallel cuts are subsequently applied to perform the data partition. This representation scheme facilitates high freedom of the partition in the original data space (equivalent to non-linear partition on subspaces of varying sizes), encouraging a unique synergy between random representations and random partition-based isolation. Extensive experiments show that our model achieves significant improvement over state-of-the-art isolation-based methods and deep detectors on tabular, graph and time series datasets; our model also inherits desired scalability from iForest.
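A minimal sketch of the scheme the abstract describes, in plain Python: untrained, randomly weighted networks produce an ensemble of representations, and isolation trees with random axis-parallel cuts are grown in each representation. This is an illustration under stated assumptions, not the authors' implementation. The function names, layer sizes, tanh activation and the use of the classic iForest score 2^(-E[h(x)]/c(n)) are all choices made for this sketch.

```python
import math
import random


def random_network(in_dim, hidden_dim, out_dim, rng):
    """Build a two-layer tanh network with fixed random weights (never trained)."""
    w1 = [[rng.gauss(0, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]

    def forward(x):
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w1]
        return [sum(w * v for w, v in zip(row, hidden)) for row in w2]

    return forward


def build_tree(points, depth, max_depth, rng):
    """Grow one isolation tree using random axis-parallel cuts."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(points)}
    cut = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    return {
        "dim": dim,
        "cut": cut,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def c(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def path_length(x, node, depth=0):
    """Depth at which x lands, plus the expected depth of the unsplit leaf."""
    if "dim" not in node:
        return depth + c(node["size"])
    child = node["left"] if x[node["dim"]] < node["cut"] else node["right"]
    return path_length(x, child, depth + 1)


def score_points(data, n_reps=10, trees_per_rep=5, sample_size=64, seed=0):
    """Return one anomaly score in (0, 1] per row; higher means more anomalous."""
    if len(data) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    totals = [0.0] * len(data)
    for _ in range(n_reps):
        # Each random network yields one member of the representation ensemble.
        net = random_network(len(data[0]), 8, 4, rng)
        reps = [net(x) for x in data]
        for _ in range(trees_per_rep):
            tree = build_tree(rng.sample(reps, psi), 0, max_depth, rng)
            for i, r in enumerate(reps):
                totals[i] += path_length(r, tree)
    n_trees = n_reps * trees_per_rep
    return [2.0 ** (-(t / n_trees) / c(psi)) for t in totals]


if __name__ == "__main__":
    gen = random.Random(1)
    cluster = [[gen.gauss(0, 0.3), gen.gauss(0, 0.3)] for _ in range(200)]
    scores = score_points(cluster + [[3.0, 3.0]])
    print("outlier score:", scores[-1])
    print("mean inlier score:", sum(scores[:-1]) / 200)
```

Because each cut is axis-parallel in a non-linearly transformed space, it corresponds to a curved boundary in the original data space, which is the effect the abstract attributes to the representation scheme.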

Related papers

A Dataset for Semantic Segmentation in the Presence of Unknowns [49.795683850385956]
Existing datasets allow evaluation of only knowns or unknowns - but not both. We propose a novel anomaly segmentation dataset, ISSU, that features a diverse set of anomaly inputs from cluttered real-world environments. The dataset is twice as large as existing anomaly segmentation datasets.
arXiv Detail & Related papers (2025-03-28T10:31:01Z)
Robust Isolation Forest using Soft Sparse Random Projection and Valley Emphasis Method [9.115927248875568]
Isolation Forest (iForest) is an unsupervised anomaly detection algorithm designed to effectively detect anomalies under the assumption that anomalies are "few and different." Various studies have aimed to enhance iForest, but the resulting algorithms often exhibited significant performance disparities across datasets. To address these challenges, we introduce Robust iForest (RiForest). RiForest leverages both existing features and random hyperplanes obtained through soft sparse random projection to identify superior split features for anomaly detection, independent of datasets.
arXiv Detail & Related papers (2025-03-15T13:08:50Z)
GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection [60.78684630040313]
Diffusion models tend to reconstruct normal counterparts of test images with certain noises added. From the global perspective, the difficulty of reconstructing images with different anomalies is uneven. We propose a global and local adaptive diffusion model (abbreviated to GLAD) for unsupervised anomaly detection.
arXiv Detail & Related papers (2024-06-11T17:27:23Z)
Anomaly Detection Based on Isolation Mechanisms: A Survey [13.449446806837422]
Isolation-based unsupervised anomaly detection is a novel and effective approach for identifying anomalies in data. We review the state-of-the-art isolation-based anomaly detection methods, including their data partitioning strategies, anomaly score functions, and algorithmic details.
arXiv Detail & Related papers (2024-03-16T04:29:21Z)
MLAD: A Unified Model for Multi-system Log Anomaly Detection [35.68387377240593]
We propose MLAD, a novel anomaly detection model that incorporates semantic relational reasoning across multiple systems. Specifically, we employ Sentence-BERT to capture the similarities between log sequences and convert them into high-dimensional learnable semantic vectors. We revamp the formulas of the Attention layer to discern the significance of each keyword in the sequence and model the overall distribution of the multi-system dataset.
arXiv Detail & Related papers (2024-01-15T12:51:13Z)
Subspace-Guided Feature Reconstruction for Unsupervised Anomaly Localization [5.085309164633571]
Unsupervised anomaly localization plays a critical role in industrial manufacturing. Most recent methods perform feature matching or reconstruction for the target sample with pre-trained deep neural networks. We propose a novel subspace-guided feature reconstruction framework to pursue adaptive feature approximation for anomaly localization.
arXiv Detail & Related papers (2023-09-25T06:58:57Z)
ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions. We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution. Finally, we focus on 3D point clouds for which we utilize the explicit nature of NFs, i.e. surface normals extracted from the gradient of the log-likelihood and the log-likelihood itself.
arXiv Detail & Related papers (2022-08-18T16:07:59Z)
Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces. We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting. This suggests that evolutive pressure acts on a low-dimensional manifold despite the high-dimensionality of sequences' space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z)
Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold. We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples. We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity regularized models and apply it on shared-memory and distributed-memory architecture respectively. We prove that the proposed method achieves the linear convergence rate with lower overall complexity and can eliminate almost all the inactive features in a finite number of iterations almost surely.
arXiv Detail & Related papers (2022-04-23T02:45:55Z)
TiWS-iForest: Isolation Forest in Weakly Supervised and Tiny ML scenarios [2.7285752469525315]
Isolation Forest is a popular algorithm able to define an anomaly score by means of an ensemble of peculiar trees called isolation trees. We show that the standard algorithm might be improved in terms of memory requirements, latency and performance. We propose TiWS-iForest, an approach that, by leveraging weak supervision, is able to reduce Isolation Forest complexity and to enhance detection performance.
arXiv Detail & Related papers (2021-11-30T14:24:27Z)
Discriminative-Generative Dual Memory Video Anomaly Detection [81.09977516403411]
Recently, people tried to use a few anomalies for video anomaly detection (VAD) instead of only normal data during the training process. We propose a DiscRiminative-gEnerative duAl Memory (DREAM) anomaly detection model to take advantage of a few anomalies and solve data imbalance.
arXiv Detail & Related papers (2021-04-29T15:49:01Z)
Online stochastic gradient descent on non-convex losses from high-dimensional inference [2.2344764434954256]
Stochastic gradient descent (SGD) is a popular algorithm for optimization problems in high-dimensional tasks. In this paper we produce an estimator of non-trivial correlation from data. We illustrate our approach by applying it to a set of tasks such as phase retrieval, and estimation for generalized models.
arXiv Detail & Related papers (2020-03-23T17:34:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.