Text Anomaly Detection with Simplified Isolation Kernel
- URL: http://arxiv.org/abs/2510.13197v1
- Date: Wed, 15 Oct 2025 06:35:54 GMT
- Title: Text Anomaly Detection with Simplified Isolation Kernel
- Authors: Yang Cao, Sikun Yang, Yujiu Yang, Lianyong Qi, Ming Liu
- Abstract summary: Two-step approaches combine pre-trained large language model embeddings and anomaly detectors. High-dimensional dense embeddings extracted by large language models pose challenges due to substantial memory requirements and high computation time. We introduce the Simplified Isolation Kernel (SIK), which maps high-dimensional dense embeddings to lower-dimensional sparse representations.
- Score: 58.13924648777626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Two-step approaches combining pre-trained large language model embeddings and anomaly detectors demonstrate strong performance in text anomaly detection by leveraging rich semantic representations. However, high-dimensional dense embeddings extracted by large language models pose challenges due to substantial memory requirements and high computation time. To address this challenge, we introduce the Simplified Isolation Kernel (SIK), which maps high-dimensional dense embeddings to lower-dimensional sparse representations while preserving crucial anomaly characteristics. SIK has linear time complexity and significantly reduces space complexity through its innovative boundary-focused feature mapping. Experiments across 7 datasets demonstrate that SIK achieves better detection performance than 11 state-of-the-art (SOTA) anomaly detection algorithms while maintaining computational efficiency and low memory cost. All code and demonstrations are available at https://github.com/charles-cao/SIK.
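The abstract does not spell out how SIK builds its feature map, so the following is only a minimal sketch of the general idea of turning dense embeddings into a sparse, isolation-based code. It uses a generic hypersphere-style isolation kernel, not the paper's SIK. The function names, the parameters `t` and `psi`, the random vectors standing in for LLM embeddings, and fitting on a mostly normal training set are all assumptions made for illustration.

```python
import numpy as np

def fit_reference_sets(X, t=100, psi=16, seed=0):
    """Draw t subsamples of psi points (hypothetical parameters). Each sampled
    point gets a ball whose radius is the distance to its nearest neighbour
    inside the same subsample."""
    rng = np.random.default_rng(seed)
    centers = np.stack(
        [X[rng.choice(len(X), size=psi, replace=False)] for _ in range(t)]
    )  # shape (t, psi, d)
    diff = centers[:, :, None, :] - centers[:, None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))               # shape (t, psi, psi)
    dist[:, np.arange(psi), np.arange(psi)] = np.inf  # ignore self-distance
    radii = dist.min(axis=2)                          # shape (t, psi)
    return centers, radii

def cell_ids(X, centers, radii):
    """Sparse code: one integer per subsample, the index of the ball the point
    falls in, or -1 if it lies outside the ball of its nearest centre."""
    t = centers.shape[0]
    rows = np.arange(len(X))
    ids = np.full((len(X), t), -1, dtype=np.int64)
    for j in range(t):
        d = np.linalg.norm(X[:, None, :] - centers[j][None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        inside = d[rows, nearest] <= radii[j, nearest]
        ids[inside, j] = nearest[inside]
    return ids

def mean_embedding(ids, psi):
    """Fraction of training points captured by each ball: shape (t, psi)."""
    n, t = ids.shape
    freq = np.zeros((t, psi))
    for j in range(t):
        hit = ids[:, j] >= 0
        freq[j] = np.bincount(ids[hit, j], minlength=psi) / n
    return freq

def similarity_to_mean(ids, freq):
    """Average mass of the balls a point falls in; lower suggests more anomalous."""
    cols = np.arange(ids.shape[1])
    mass = np.where(ids >= 0, freq[cols, np.clip(ids, 0, None)], 0.0)
    return mass.mean(axis=1)

# Toy data: random vectors stand in for dense text embeddings (an assumption).
rng = np.random.default_rng(42)
train = rng.normal(size=(500, 8))
normal_test = rng.normal(size=(5, 8))
anomalies = rng.normal(size=(5, 8)) + 10.0  # shifted far from the training data

centers, radii = fit_reference_sets(train)
freq = mean_embedding(cell_ids(train, centers, radii), psi=16)
scores_normal = similarity_to_mean(cell_ids(normal_test, centers, radii), freq)
scores_anom = similarity_to_mean(cell_ids(anomalies, centers, radii), freq)
```

In this sketch each point is stored as `t` small integers rather than a dense float vector, which is one way a sparse representation can reduce memory. Points far from every sampled ball receive a similarity of zero and would be ranked as the most anomalous.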
Related papers
- Isolation-based Spherical Ensemble Representations for Anomaly Detection [60.989157958972356]
Anomaly detection is a critical task in data mining and management with applications spanning fraud detection, network security, and log monitoring.
Existing unsupervised anomaly detection methods face fundamental challenges including conflicting distributional assumptions, computational inefficiency, and difficulty handling different anomaly types.
We propose ISER (Isolation-based Spherical Ensemble Representations) that extends existing isolation-based methods by using hypersphere radii as proxies for local density characteristics while maintaining linear time and constant space complexity.
arXiv Detail & Related papers (2025-10-15T09:00:05Z)
- SDS-Net: Shallow-Deep Synergism-detection Network for infrared small target detection [0.18641315013048293]
Current CNN-based infrared small target detection methods overlook the heterogeneity between shallow and deep features.
The dependency relationships and fusion mechanisms fail to fully exploit the complementarity of multilevel features.
This paper proposes a shallow-deep synergistic detection network (SDS-Net) that efficiently models multilevel feature representations.
arXiv Detail & Related papers (2025-06-06T12:44:41Z)
- Efficient High-Resolution Visual Representation Learning with State Space Model for Human Pose Estimation [60.80423207808076]
Capturing long-range dependencies while preserving high-resolution visual representations is crucial for dense prediction tasks such as human pose estimation.
We propose the Dynamic Visual State Space (DVSS) block, which augments visual state space models with multi-scale convolutional operations.
We build HRVMamba, a novel model for efficient high-resolution representation learning.
arXiv Detail & Related papers (2024-10-04T06:19:29Z)
- Detecting Anomalies in Dynamic Graphs via Memory enhanced Normality [39.476378833827184]
Anomaly detection in dynamic graphs presents a significant challenge due to the temporal evolution of graph structures and attributes.
We introduce a novel spatial-temporal memories-enhanced graph autoencoder (STRIPE).
STRIPE significantly outperforms existing methods, with a 5.8% improvement in AUC scores and 4.62X faster training.
arXiv Detail & Related papers (2024-03-14T02:26:10Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation.
We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z)
- Neural Architecture Search for Visual Anomaly Segmentation [4.035753155957698]
This paper presents the first application of neural architecture search to the complex task of segmenting visual anomalies.
The region-weighted Average Precision (rwAP) metric is proposed as an alternative to existing metrics.
The AutoPatch neural architecture search method is proposed, which enables efficient segmentation of visual anomalies without any training.
arXiv Detail & Related papers (2023-04-18T13:15:00Z)
- FRE: A Fast Method For Anomaly Detection And Segmentation [5.0468312081378475]
This paper presents a principled approach for solving the visual anomaly detection and segmentation problem.
We propose the application of linear statistical dimensionality reduction techniques on the intermediate features produced by a pretrained DNN on the training data.
We show that the feature reconstruction error (FRE), which is the $\ell_2$-norm of the difference between the original feature in the high-dimensional space and the pre-image of its low-dimensional reduced embedding, is extremely effective for anomaly detection.
arXiv Detail & Related papers (2022-11-23T01:03:20Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.