Tighten The Lasso: A Convex Hull Volume-based Anomaly Detection Method
- URL: http://arxiv.org/abs/2502.18601v1
- Date: Tue, 25 Feb 2025 19:39:20 GMT
- Title: Tighten The Lasso: A Convex Hull Volume-based Anomaly Detection Method
- Authors: Uri Itai, Asael Bar Ilan, Teddy Lazebnik
- Abstract summary: We propose a novel anomaly detection algorithm based on the convex hull property of a dataset. Our algorithm computes the CH's volume as an increasing number of data points are removed from the dataset. We show that with a computationally cheap and simple check, one can detect datasets that are well-suited for the proposed algorithm.
- Score: 0.6144680854063939
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The rapid advancements in data-driven methodologies have underscored the critical importance of ensuring data quality. Consequently, detecting out-of-distribution (OOD) data has emerged as an essential task to maintain the reliability and robustness of data-driven models, in general, and machine and deep learning models, in particular. In this study, we leveraged the convex hull (CH) property of a dataset and the fact that anomalies highly contribute to the increase of the CH's volume to propose a novel anomaly detection algorithm. Our algorithm computes the CH's volume as an increasing number of data points are removed from the dataset to define a decision line between OOD and in-distribution data points. We compared the proposed algorithm to seven widely used anomaly detection algorithms over ten datasets, showing results comparable to state-of-the-art (SOTA) algorithms. Moreover, we show that with a computationally cheap and simple check, one can detect datasets that are well-suited for the proposed algorithm, for which it outperforms the SOTA anomaly detection algorithms.
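The idea in the abstract, tracking how the convex hull's volume shrinks as points are removed, can be illustrated with a minimal 2D sketch in which area stands in for volume. This is not the authors' implementation: the greedy removal order, the `min_rel_drop` stopping threshold, the `max_fraction` cap, and all function names are assumptions made for illustration, and the input points are assumed to be distinct.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(points: List[Point]) -> float:
    """Area of the convex hull (2D analogue of volume), via the shoelace formula."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    total = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def peel_anomalies(points: List[Point],
                   min_rel_drop: float = 0.2,
                   max_fraction: float = 0.5) -> List[Point]:
    """Greedily remove the hull vertex whose removal shrinks the area most.

    Stops when the best removal shrinks the area by less than `min_rel_drop`
    (relative), or when `max_fraction` of the points has been removed.
    Both stopping rules are illustrative assumptions, not the paper's rule.
    """
    remaining = list(points)
    removed: List[Point] = []
    limit = int(len(points) * max_fraction)
    while len(removed) < limit and len(remaining) > 3:
        area = hull_area(remaining)
        if area == 0.0:
            break
        best_point, best_area = None, area
        # Only hull vertices can change the area, so interior points are skipped.
        for vertex in convex_hull(remaining):
            candidate = hull_area([p for p in remaining if p != vertex])
            if candidate < best_area:
                best_point, best_area = vertex, candidate
        if best_point is None or (area - best_area) / area < min_rel_drop:
            break
        remaining = [p for p in remaining if p != best_point]
        removed.append(best_point)
    return removed


if __name__ == "__main__":
    grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
    print(peel_anomalies(grid + [(20.0, 20.0)]))
```

In this toy setup a single far-away point accounts for most of the hull's area, so removing it produces a large relative drop, while removing any vertex of the remaining dense grid changes the area only slightly. In higher dimensions a library routine for hull volume would replace `hull_area`; that substitution is likewise outside what the abstract specifies.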
Related papers
- Predictive Sample Assignment for Semantically Coherent Out-of-Distribution Detection [62.1052001316508]
Semantically coherent out-of-distribution detection (SCOOD) is a recently proposed realistic OOD detection setting. We propose a concise SCOOD framework based on predictive sample assignment (PSA). Our approach outperforms the state-of-the-art methods by a significant margin.
arXiv Detail & Related papers (2025-12-15T01:18:38Z)
- Triply Laplacian Scale Mixture Modeling for Seismic Data Noise Suppression [51.87076090814921]
Sparsity-based tensor recovery methods have shown great potential in suppressing seismic data noise. We propose a novel triply Laplacian scale mixture (TLSM) approach for seismic data noise suppression.
arXiv Detail & Related papers (2025-02-20T08:28:01Z)
- Fuzzy Granule Density-Based Outlier Detection with Multi-Scale Granular Balls [65.44462297594308]
Outlier detection refers to the identification of anomalous samples that deviate significantly from the distribution of normal data. Most unsupervised outlier detection methods are carefully designed to detect specified outliers. We propose a fuzzy rough sets-based multi-scale outlier detection method to identify various types of outliers.
arXiv Detail & Related papers (2025-01-06T12:35:51Z)
- An Efficient Outlier Detection Algorithm for Data Streaming [51.56874851156008]
Traditional outlier detection methods, such as the Local Outlier Factor (LOF) algorithm, struggle with real-time data. We propose a novel approach to enhance the efficiency of LOF algorithms for online anomaly detection, named the Efficient Incremental LOF (EILOF) algorithm. The EILOF algorithm not only significantly reduces computational costs, but also systematically improves detection accuracy when the number of additional points increases.
arXiv Detail & Related papers (2025-01-02T05:12:43Z)
- Unsupervised Anomaly Detection for Tabular Data Using Noise Evaluation [26.312206159418903]
Unsupervised anomaly detection (UAD) plays an important role in modern data analytics. We present a novel UAD method by evaluating how much noise is in the data. We provide theoretical guarantees, proving that the proposed method can detect anomalous data successfully.
arXiv Detail & Related papers (2024-12-16T05:35:58Z)
- A Mallows-like Criterion for Anomaly Detection with Random Forest Implementation [7.569443648362081]
This paper proposes a novel criterion to select the weights on aggregation of multiple models, wherein the focal loss function accounts for the classification of extremely imbalanced data.
We have evaluated the proposed method on benchmark datasets across various domains, including network intrusion.
arXiv Detail & Related papers (2024-05-29T09:36:57Z)
- Study of Robust Direction Finding Based on Joint Sparse Representation [2.3333781137726137]
We propose a novel DOA estimation method based on sparse signal recovery (SSR).
To address the issue of grid mismatch, we utilize an alternating optimization approach.
Simulation results demonstrate that the proposed method exhibits robustness against large outliers.
arXiv Detail & Related papers (2024-05-27T02:26:37Z)
- DeepHYDRA: Resource-Efficient Time-Series Anomaly Detection in Dynamically-Configured Systems [3.44012349879073]
We present DeepHYDRA (Deep Hybrid DBSCAN/Reduction-Based Anomaly Detection).
It combines DBSCAN and learning-based anomaly detection.
It is shown to reliably detect different types of anomalies in both large and complex datasets.
arXiv Detail & Related papers (2024-05-13T13:47:15Z)
- Bagged Regularized $k$-Distances for Anomaly Detection [9.899763598214122]
We propose a new distance-based algorithm called bagged regularized $k$-distances for anomaly detection (BRDAD).
Our BRDAD algorithm selects the weights by minimizing the surrogate risk, i.e., the finite sample bound of the empirical risk of the bagged weighted $k$-distances for density estimation (BWDDE).
On the theoretical side, we establish fast convergence rates of the AUC regret of our algorithm and demonstrate that the bagging technique significantly reduces the computational complexity.
arXiv Detail & Related papers (2023-12-02T07:00:46Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Towards Efficient and Accurate Approximation: Tensor Decomposition Based on Randomized Block Krylov Iteration [27.85452105378894]
This work designs an rBKI-based Tucker decomposition (rBKI-TK) for accurate approximation, together with a hierarchical tensor ring decomposition based on rBKI-TK for efficient compression of large-scale data.
Numerical experiments demonstrate the efficiency, accuracy and scalability of the proposed methods in both data compression and denoising.
arXiv Detail & Related papers (2022-11-27T13:45:28Z)
- Denoising diffusion models for out-of-distribution detection [2.113925122479677]
We exploit the view of denoising probabilistic diffusion models (DDPM) as denoising autoencoders.
We use DDPMs to reconstruct an input that has been noised to a range of noise levels, and use the resulting multi-dimensional reconstruction error to classify out-of-distribution inputs.
arXiv Detail & Related papers (2022-11-14T20:35:11Z)
- Framing Algorithmic Recourse for Anomaly Detection [18.347886926848563]
We present an approach -- Context preserving Algorithmic Recourse for Anomalies in Tabular data (CARAT).
CARAT uses a transformer based encoder-decoder model to explain an anomaly by finding features with low likelihood.
Semantically coherent counterfactuals are generated by modifying the highlighted features, using the overall context of features in the anomalous instance(s).
arXiv Detail & Related papers (2022-06-29T03:30:51Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
- Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
- An Efficient Anomaly Detection Approach using Cube Sampling with Streaming Data [2.0515785954568626]
Anomaly detection is critical in various fields, including intrusion detection, health monitoring, fault diagnosis, and sensor network event detection.
The isolation forest (or iForest) approach is a well-known technique for detecting anomalies.
We propose an efficient iForest based approach for anomaly detection using cube sampling that is effective on streaming data.
arXiv Detail & Related papers (2021-10-05T04:23:00Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Am I Rare? An Intelligent Summarization Approach for Identifying Hidden Anomalies [0.0]
In this paper, we propose an INtelligent Summarization approach for IDENTifying hidden anomalies, called INSIDENT.
Our approach is a clustering-based algorithm that dynamically maps original feature space to a new feature space by locally weighting features in each cluster. Besides, selecting representatives based on cluster size keeps the same distribution as the original data in summarized data.
arXiv Detail & Related papers (2020-12-24T23:22:57Z)
- Stochastic Hard Thresholding Algorithms for AUC Maximization [49.00683387735522]
We develop a hard thresholding algorithm for AUC in distributiond classification.
We conduct experiments to show the efficiency and effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-04T16:49:29Z)
- Bayesian Optimization with Machine Learning Algorithms Towards Anomaly Detection [66.05992706105224]
In this paper, an effective anomaly detection framework is proposed utilizing Bayesian Optimization technique.
The performance of the considered algorithms is evaluated using the ISCX 2012 dataset.
Experimental results show the effectiveness of the proposed framework in terms of accuracy rate, precision, low false-alarm rate, and recall.
arXiv Detail & Related papers (2020-08-05T19:29:35Z)
- Optimally Displaced Threshold Detection for Discriminating Binary Coherent States Using Imperfect Devices [50.09039506170243]
We analytically study the performance of the generalized Kennedy receiver having optimally displaced threshold detection (ODTD) in a realistic situation with noises and imperfect devices.
We show that the proposed greedy search algorithm can obtain a lower and smoother error probability than the existing works.
arXiv Detail & Related papers (2020-07-21T21:52:29Z)
- Contextual-Bandit Anomaly Detection for IoT Data in Distributed Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural network (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delays.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.