Sparse, self-organizing ensembles of local kernels detect rare statistical anomalies
- URL: http://arxiv.org/abs/2511.03095v1
- Date: Wed, 05 Nov 2025 00:55:56 GMT
- Title: Sparse, self-organizing ensembles of local kernels detect rare statistical anomalies
- Authors: Gaia Grosso, Sai Sumedh R. Hindupur, Thomas Fel, Samuel Bright-Thonney, Philip Harris, Demba Ba,
- Abstract summary: Weak or rare signals can remain hidden within the apparent regularity of normal data, creating a gap in our ability to detect and interpret anomalies.<n>We propose a class of self-organizing local kernels that adaptively partition the representation space around regions of statistical imbalance.<n>We provide theoretical insights into the mechanisms that drive detection and self-organization in the proposed model, and demonstrate the effectiveness of this approach on realistic high-dimensional problems.
- Score: 7.472760645164855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern artificial intelligence has revolutionized our ability to extract rich and versatile data representations across scientific disciplines. Yet, the statistical properties of these representations remain poorly controlled, causing misspecified anomaly detection (AD) methods to falter. Weak or rare signals can remain hidden within the apparent regularity of normal data, creating a gap in our ability to detect and interpret anomalies. We examine this gap and identify a set of structural desiderata for detection methods operating under minimal prior information: sparsity, to enforce parsimony; locality, to preserve geometric sensitivity; and competition, to promote efficient allocation of model capacity. These principles define a class of self-organizing local kernels that adaptively partition the representation space around regions of statistical imbalance. As an instantiation of these principles, we introduce SparKer, a sparse ensemble of Gaussian kernels trained within a semi-supervised Neyman--Pearson framework to locally model the likelihood ratio between a sample that may contain anomalies and a nominal, anomaly-free reference. We provide theoretical insights into the mechanisms that drive detection and self-organization in the proposed model, and demonstrate the effectiveness of this approach on realistic high-dimensional problems of scientific discovery, open-world novelty detection, intrusion detection, and generative-model validation. Our applications span both the natural- and computer-science domains. We demonstrate that ensembles containing only a handful of kernels can identify statistically significant anomalous locations within representation spaces of thousands of dimensions, underscoring both the interpretability, efficiency and scalability of the proposed approach.
Related papers
- Multi-Cue Anomaly Detection and Localization under Data Contamination [0.6703429330486276]
We propose a robust anomaly detection framework that integrates limited anomaly supervision into the adaptive deviation learning paradigm.<n>Our framework achieves strong detection and localization performance, interpretability, and robustness under various levels of data contamination.
arXiv Detail & Related papers (2026-01-30T12:34:13Z) - Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods.<n>We propose Generate grained Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework.<n>GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z) - CLIP Meets Diffusion: A Synergistic Approach to Anomaly Detection [54.85000884785013]
Anomaly detection is a complex problem due to the ambiguity in defining anomalies, the diversity of anomaly types, and the scarcity of training data.<n>We propose CLIPfusion, a method that leverages both discriminative and generative foundation models.<n>We believe that our method underscores the effectiveness of multi-modal and multi-model fusion in tackling the multifaceted challenges of anomaly detection.
arXiv Detail & Related papers (2025-06-13T13:30:15Z) - Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detection [50.343419243749054]
Anomaly detection is critical in fields such as medical diagnostics and industrial defect detection.<n> CLIP's coarse-grained image-text alignment limits localization and detection performance for fine-grained anomalies.<n>Crane improves the state-of-the-art ZSAD from 2% to 28%, at both image and pixel levels, while remaining competitive in inference speed.
arXiv Detail & Related papers (2025-04-15T10:42:25Z) - Back to Bayesics: Uncovering Human Mobility Distributions and Anomalies with an Integrated Statistical and Neural Framework [14.899157568336731]
DeepBayesic is a novel framework that integrates Bayesian principles with deep neural networks to model the underlying distributions.
We evaluate our approach on several mobility datasets, demonstrating significant improvements over state-of-the-art anomaly detection methods.
arXiv Detail & Related papers (2024-10-01T19:02:06Z) - Towards a Unified Framework of Clustering-based Anomaly Detection [18.30208347233284]
Unsupervised Anomaly Detection (UAD) plays a crucial role in identifying abnormal patterns within data without labeled examples.
We propose a novel probabilistic mixture model for anomaly detection to establish a theoretical connection among representation learning, clustering, and anomaly detection.
We have devised an improved anomaly score that more effectively harnesses the combined power of representation learning and clustering.
arXiv Detail & Related papers (2024-06-01T14:30:12Z) - Seeing Unseen: Discover Novel Biomedical Concepts via
Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z) - PUAD: Frustratingly Simple Method for Robust Anomaly Detection [0.0]
We argue that logical anomalies, such as the wrong number of objects, can not be well-represented by the spatial feature maps.
We propose a method that incorporates a simple out-of-distribution detection method on the feature space against state-of-the-art reconstruction-based approaches.
Our method achieves state-of-the-art performance on the MVTec LOCO AD dataset.
arXiv Detail & Related papers (2024-02-23T06:57:31Z) - Improving Intrusion Detection with Domain-Invariant Representation Learning in Latent Space [5.823403993020438]
We introduce a multi-task representation learning technique that fuses information across related domains into a unified latent space.<n>By jointly optimizing classification, reconstruction, and mutual information regularization losses, our method learns a minimal(bottleneck), domain-invariant representation that discards spurious correlations.<n>Our experimental results demonstrate significant improvements in zero-day or novel anomaly detection across diverse anomaly detection datasets.
arXiv Detail & Related papers (2023-12-28T17:24:13Z) - Learning a Cross-modality Anomaly Detector for Remote Sensing Imagery [21.444315419064882]
A remote sensing anomaly detector can find objects deviating from the background as potential targets for Earth monitoring.
Current anomaly detectors aim to learn the certain background distribution, the trained model cannot be transferred to unseen images.
This study exploits the learning target conversion from the varying background distribution to the consistent deviation metric.
arXiv Detail & Related papers (2023-10-11T14:07:05Z) - Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data.
We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism.
We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability.
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.