Randomized PCA Forest for Outlier Detection
- URL: http://arxiv.org/abs/2508.12776v2
- Date: Fri, 22 Aug 2025 08:24:43 GMT
- Title: Randomized PCA Forest for Outlier Detection
- Authors: Muhammad Rajabinasab, Farhad Pakdaman, Moncef Gabbouj, Peter Schneider-Kamp, Arthur Zimek,
- Abstract summary: We develop a novel unsupervised outlier detection method that utilizes RPCA Forest for outlier detection.<n> Experimental results showcase the superiority of the proposed approach compared to the classical and state-of-the-art methods.
- Score: 22.978540389975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel unsupervised outlier detection method based on Randomized Principal Component Analysis (PCA). Inspired by the performance of Randomized PCA (RPCA) Forest in approximate K-Nearest Neighbor (KNN) search, we develop a novel unsupervised outlier detection method that utilizes RPCA Forest for outlier detection. Experimental results showcase the superiority of the proposed approach compared to the classical and state-of-the-art methods in performing the outlier detection task on several datasets while performing competitively on the rest. The extensive analysis of the proposed method reflects it high generalization power and its computational efficiency, highlighting it as a good choice for unsupervised outlier detection.
Related papers
- Prediction-Powered Risk Monitoring of Deployed Models for Detecting Harmful Distribution Shifts [51.37000123503367]
We propose prediction-powered risk monitoring (PPRM), a semi-supervised risk-monitoring approach based on prediction-powered inference (PPI)<n>PPRM constructs anytime-valid lower bounds on the running risk by combining synthetic labels with a small set of true labels.<n>We demonstrate the effectiveness of PPRM through extensive experiments on image classification, large language model (LLM) and telecommunications monitoring tasks.
arXiv Detail & Related papers (2026-02-02T15:32:14Z) - Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection [15.184096796229115]
We propose a post-hoc method, Perturbation-Rectified OOD detection (PRO), based on the insight that prediction confidence for OOD inputs is more susceptible to reduction under perturbation than in-distribution (IND) inputs.<n>On a CIFAR-10 model with adversarial training, PRO effectively detects near-OOD inputs, achieving a reduction of more than 10% on FPR@95 compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-03-24T15:32:33Z) - Chain-of-Retrieval Augmented Generation [72.06205327186069]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer.<n>Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
arXiv Detail & Related papers (2025-01-24T09:12:52Z) - Fuzzy Granule Density-Based Outlier Detection with Multi-Scale Granular Balls [65.44462297594308]
Outlier detection refers to the identification of anomalous samples that deviate significantly from the distribution of normal data.<n>Most unsupervised outlier detection methods are carefully designed to detect specified outliers.<n>We propose a fuzzy rough sets-based multi-scale outlier detection method to identify various types of outliers.
arXiv Detail & Related papers (2025-01-06T12:35:51Z) - Unsupervised Outlier Detection using Random Subspace and Subsampling Ensembles of Dirichlet Process Mixtures [1.3258129717033857]
We propose a novel outlier detection method that utilizes ensembles of Dirichlet process Gaussian mixtures.
This unsupervised algorithm employs random subspace and subsampling ensembles to ensure efficient computation and improve the robustness of the outlier detector.
arXiv Detail & Related papers (2024-01-01T14:34:11Z) - On the Universal Adversarial Perturbations for Efficient Data-free
Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z) - Free Lunch for Generating Effective Outlier Supervision [46.37464572099351]
We propose an ultra-effective method to generate near-realistic outlier supervision.
Our proposed textttBayesAug significantly reduces the false positive rate over 12.50% compared with the previous schemes.
arXiv Detail & Related papers (2023-01-17T01:46:45Z) - Boosting Out-of-Distribution Detection with Multiple Pre-trained Models [41.66566916581451]
Post hoc detection utilizing pre-trained models has shown promising performance and can be scaled to large-scale problems.
We propose a detection enhancement method by ensembling multiple detection decisions derived from a zoo of pre-trained models.
Our method substantially improves the relative performance by 65.40% and 26.96% on the CIFAR10 and ImageNet benchmarks.
arXiv Detail & Related papers (2022-12-24T12:11:38Z) - CoMadOut -- A Robust Outlier Detection Algorithm based on CoMAD [0.3749861135832073]
Outliers play a significant role, since they bear the potential to distort the predictions of a machine learning algorithm on a given dataset.
To address this problem, we propose the robust outlier detection algorithm CoMadOut.
Our approach can be seen as a robust alternative for outlier detection tasks.
arXiv Detail & Related papers (2022-11-23T21:33:34Z) - Label-Efficient Object Detection via Region Proposal Network
Pre-Training [58.50615557874024]
We propose a simple pretext task that provides an effective pre-training for the region proposal network (RPN)
In comparison with multi-stage detectors without RPN pre-training, our approach is able to consistently improve downstream task performance.
arXiv Detail & Related papers (2022-11-16T16:28:18Z) - Large-Scale Sequential Learning for Recommender and Engineering Systems [91.3755431537592]
In this thesis, we focus on the design of an automatic algorithms that provide personalized ranking by adapting to the current conditions.
For the former, we propose novel algorithm called SAROS that take into account both kinds of feedback for learning over the sequence of interactions.
The proposed idea of taking into account the neighbour lines shows statistically significant results in comparison with the initial approach for faults detection in power grid.
arXiv Detail & Related papers (2022-05-13T21:09:41Z) - Out-of-Distribution Detection using Outlier Detection Methods [0.0]
Out-of-distribution detection (OOD) deals with anomalous input to neural networks.
We use outlier detection algorithms to detect anomalous input as reliable as specialized methods from the field of OOD.
No neural network adaptation is required; detection is based on the model's softmax score.
arXiv Detail & Related papers (2021-08-18T16:05:53Z) - IPOF: An Extremely and Excitingly Simple Outlier Detection Booster via
Infinite Propagation [30.91911545889579]
Outlier detection is one of the most popular and continuously rising topics in the data mining field.
In this paper, we consider the score-based outlier detection category and point out that the performance of current outlier detection algorithms might be further boosted by score propagation.
Specifically, we propose Infinite propagation of Outlier Factor (iPOF) algorithm, an extremely and excitingly simple outlier detection booster via infinite propagation.
arXiv Detail & Related papers (2021-08-01T03:48:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.