Robust Isolation Forest using Soft Sparse Random Projection and Valley Emphasis Method
- URL: http://arxiv.org/abs/2503.12125v1
- Date: Sat, 15 Mar 2025 13:08:50 GMT
- Title: Robust Isolation Forest using Soft Sparse Random Projection and Valley Emphasis Method
- Authors: Hun Kang, Kyoungok Kim
- Abstract summary: Isolation Forest (iForest) is an unsupervised anomaly detection algorithm designed to effectively detect anomalies under the assumption that anomalies are "few and different." Various studies have aimed to enhance iForest, but the resulting algorithms often exhibited significant performance disparities across datasets. To address these challenges, we introduce Robust iForest (RiForest). RiForest leverages both existing features and random hyperplanes obtained through soft sparse random projection to identify superior split features for anomaly detection, independent of datasets.
- Score: 9.115927248875568
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Isolation Forest (iForest) is an unsupervised anomaly detection algorithm designed to effectively detect anomalies under the assumption that anomalies are "few and different." Various studies have aimed to enhance iForest, but the resulting algorithms often exhibited significant performance disparities across datasets. Additionally, the challenge of isolating rare and widely distributed anomalies persisted in research focused on improving splits. To address these challenges, we introduce Robust iForest (RiForest). RiForest leverages both existing features and random hyperplanes obtained through soft sparse random projection to identify superior split features for anomaly detection, independent of datasets. It utilizes the underutilized valley emphasis method for optimal split point determination and incorporates sparsity randomization in soft sparse random projection for enhanced anomaly detection robustness. Across 24 benchmark datasets, experiments demonstrate RiForest's consistent outperformance of existing algorithms in anomaly detection, emphasizing stability and robustness to noise variables.
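The abstract names the valley emphasis method as the way RiForest picks split points. The following is a minimal sketch of that general idea, not the authors' implementation: it scores each candidate histogram threshold by Otsu's between-class variance multiplied by (1 - p_t), where p_t is the probability mass of the bin at the threshold, so that thresholds lying in sparsely populated "valleys" are favored. The function name `valley_emphasis_split`, the bin count, and the use of the between-class-variance form of the Otsu objective (chosen here so the result does not depend on shifting the data) are all assumptions made for illustration.

```python
import numpy as np


def valley_emphasis_split(values, bins=32):
    """Pick a 1-D split point with a valley-emphasis-weighted Otsu criterion.

    Hypothetical helper for illustration; RiForest's actual procedure may differ.
    """
    values = np.asarray(values, dtype=float)
    counts, edges = np.histogram(values, bins=bins)
    total = counts.sum()
    p = counts / total                        # probability mass per bin
    centers = 0.5 * (edges[:-1] + edges[1:])  # representative value per bin

    best_score, best_split = -np.inf, None
    # Candidate t: bins 0..t go to the left child, bins t+1.. to the right.
    for t in range(bins - 1):
        n_left = counts[: t + 1].sum()
        n_right = total - n_left
        if n_left == 0 or n_right == 0:
            continue  # a split that leaves one side empty isolates nothing
        w0, w1 = n_left / total, n_right / total
        mu0 = (p[: t + 1] * centers[: t + 1]).sum() / w0
        mu1 = (p[t + 1:] * centers[t + 1:]).sum() / w1
        between_class_var = w0 * w1 * (mu0 - mu1) ** 2
        # Valley emphasis: down-weight thresholds that sit on a populated bin.
        score = (1.0 - p[t]) * between_class_var
        if score > best_score:
            best_score, best_split = score, edges[t + 1]
    return best_split


if __name__ == "__main__":
    # A dense group of normal points and a small, distant group of anomalies.
    normal = np.linspace(-1.0, 1.0, 200)
    anomalies = np.linspace(9.0, 10.0, 10)
    print(valley_emphasis_split(np.concatenate([normal, anomalies])))
```

In an isolation tree, a function like this would be applied to the values of one candidate split feature, either an original column or a projection onto a random hyperplane, at each node; how RiForest chooses among candidate features is not covered by this sketch.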
Related papers
- Explainable Unsupervised Anomaly Detection with Random Forest [1.0485739694839669]
We describe the use of an unsupervised Random Forest for similarity learning and improved anomaly detection.
By training a Random Forest to discriminate between real data and synthetic data sampled from a uniform distribution over the real data bounds, a distance measure is obtained that anisometrically transforms the data.
We show that using distances recovered from this transformation improves the accuracy of unsupervised anomaly detection, compared to other commonly used detectors.
arXiv Detail & Related papers (2025-04-22T17:54:44Z)
- Calibrated Unsupervised Anomaly Detection in Multivariate Time-series using Reinforcement Learning [0.0]
This paper investigates unsupervised anomaly detection in time-series data using reinforcement learning (RL) in the latent space of an autoencoder.
We use wavelet analysis to enhance anomaly detection, enabling time-series data decomposition into both time and frequency domains.
We calibrate the decision boundary by generating synthetic anomalies and embedding a supervised framework within the model.
arXiv Detail & Related papers (2025-02-05T15:02:40Z)
- Stable Neighbor Denoising for Source-free Domain Adaptive Segmentation [91.83820250747935]
Pseudo-label noise is mainly contained in unstable samples in which predictions of most pixels undergo significant variations during self-training.
We introduce the Stable Neighbor Denoising (SND) approach, which effectively discovers highly correlated stable and unstable samples.
SND consistently outperforms state-of-the-art methods in various SFUDA semantic segmentation settings.
arXiv Detail & Related papers (2024-06-10T21:44:52Z)
- Anomaly Detection Based on Isolation Mechanisms: A Survey [13.449446806837422]
Isolation-based unsupervised anomaly detection is a novel and effective approach for identifying anomalies in data.
We review the state-of-the-art isolation-based anomaly detection methods, including their data partitioning strategies, anomaly score functions, and algorithmic details.
arXiv Detail & Related papers (2024-03-16T04:29:21Z)
- Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection [59.34318192698142]
We introduce a prior-less anomaly generation paradigm and develop an innovative unsupervised anomaly detection framework named GRAD.
PatchDiff effectively exposes various types of anomaly patterns.
Experiments on both MVTec AD and MVTec LOCO datasets also support the aforementioned observation.
arXiv Detail & Related papers (2023-12-26T07:08:06Z)
- Unraveling the "Anomaly" in Time Series Anomaly Detection: A Self-supervised Tri-domain Solution [89.16750999704969]
Anomaly labels hinder traditional supervised models in time series anomaly detection.
Various SOTA deep learning techniques, such as self-supervised learning, have been introduced to tackle this issue.
We propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD).
arXiv Detail & Related papers (2023-11-19T05:37:18Z)
- Valid Inference After Causal Discovery [73.87055989355737]
We develop tools for valid post-causal-discovery inference.
We show that a naive combination of causal discovery and subsequent inference algorithms leads to highly inflated miscoverage rates.
arXiv Detail & Related papers (2022-08-11T17:40:45Z)
- Deep Isolation Forest for Anomaly Detection [16.581154394513025]
Isolation forest (iForest) has been emerging as arguably the most popular anomaly detector in recent years.
Our model achieves significant improvement over state-of-the-art isolation-based methods and deep detectors on datasets.
arXiv Detail & Related papers (2022-06-14T05:47:07Z)
- Anomaly Rule Detection in Sequence Data [2.3757190901941736]
We present a new anomaly detection framework called DUOS that enables Discovery of Utility-aware Outlier Sequential rules from a set of sequences.
In this work, we incorporate both the anomalousness and utility of a group, and then introduce the concept of utility-aware outlier rule (UOSR).
arXiv Detail & Related papers (2021-11-29T23:52:31Z)
- Deconfounded Score Method: Scoring DAGs with Dense Unobserved Confounding [101.35070661471124]
We show that unobserved confounding leaves a characteristic footprint in the observed data distribution that allows for disentangling spurious and causal effects.
We propose an adjusted score-based causal discovery algorithm that may be implemented with general-purpose solvers and scales to high-dimensional problems.
arXiv Detail & Related papers (2021-03-28T11:07:59Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.