Calibrating Tabular Anomaly Detection via Optimal Transport
- URL: http://arxiv.org/abs/2602.06810v1
- Date: Fri, 06 Feb 2026 15:58:22 GMT
- Title: Calibrating Tabular Anomaly Detection via Optimal Transport
- Authors: Hangting Ye, He Zhao, Wei Fan, Xiaozhuang Song, Dandan Guo, Yi Chang, Hongyuan Zha
- Abstract summary: We present CTAD (Calibrating Tabular Anomaly Detection), a model-agnostic post-processing framework that enhances any existing TAD detector through sample-specific calibration. Our approach characterizes normal data via two complementary distributions, i.e., an empirical distribution from random sampling and a structural distribution from K-means centroids. We prove that OT distance has a lower bound proportional to the test sample's distance from centroids, and establish that anomalies systematically receive higher calibration scores than normals in expectation.
- Score: 38.82475750342141
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tabular anomaly detection (TAD) remains challenging due to the heterogeneity of tabular data: features lack natural relationships, vary widely in distribution and scale, and exhibit diverse types. Consequently, each TAD method makes implicit assumptions about anomaly patterns that work well on some datasets but fail on others, and no method consistently outperforms across diverse scenarios. We present CTAD (Calibrating Tabular Anomaly Detection), a model-agnostic post-processing framework that enhances any existing TAD detector through sample-specific calibration. Our approach characterizes normal data via two complementary distributions, i.e., an empirical distribution from random sampling and a structural distribution from K-means centroids, and measures how adding a test sample disrupts their compatibility using Optimal Transport (OT) distance. Normal samples maintain low disruption while anomalies cause high disruption, providing a calibration signal to amplify detection. We prove that OT distance has a lower bound proportional to the test sample's distance from centroids, and establish that anomalies systematically receive higher calibration scores than normals in expectation, explaining why the method generalizes across datasets. Extensive experiments on 34 diverse tabular datasets with 7 representative detectors spanning all major TAD categories (density estimation, classification, reconstruction, and isolation-based methods) demonstrate that CTAD consistently improves performance with statistical significance. Remarkably, CTAD enhances even state-of-the-art deep learning methods and shows robust performance across diverse hyperparameter settings, requiring no additional tuning for practical deployment.
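The calibration signal described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not specify the OT solver, the point weights, or how the signal is combined with a detector's score. The sketch uses uniform weights, a squared Euclidean cost, a log-domain Sinkhorn (entropic) approximation in place of exact OT, and defines the hypothetical `disruption` as the increase in OT cost between the sampled normal points and the K-means centroids when the test sample is added. The helper names (`kmeans`, `sinkhorn_cost`, `disruption`) and the toy data are likewise made up.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns k centroids as lists of floats."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
    return centroids


def logsumexp(vals):
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))


def sinkhorn_cost(xs, ys, reg=1.0, iters=50):
    """Entropic OT cost between uniform measures on xs and ys (log domain).

    Assumption: Sinkhorn stands in for whichever OT solver the paper uses.
    """
    n, m = len(xs), len(ys)
    cost = [[sq_dist(x, y) for y in ys] for x in xs]
    log_a, log_b = -math.log(n), -math.log(m)
    f, g = [0.0] * n, [0.0] * m
    for _ in range(iters):
        # Update g, then f, so the final plan matches the row marginals exactly.
        g = [reg * (log_b - logsumexp([(f[i] - cost[i][j]) / reg for i in range(n)]))
             for j in range(m)]
        f = [reg * (log_a - logsumexp([(g[j] - cost[i][j]) / reg for j in range(m)]))
             for i in range(n)]
    return sum(math.exp((f[i] + g[j] - cost[i][j]) / reg) * cost[i][j]
               for i in range(n) for j in range(m))


def disruption(x, sample, centroids):
    """Hypothetical calibration signal: OT cost increase when x joins the sample."""
    base = sinkhorn_cost(sample, centroids)
    return sinkhorn_cost(sample + [x], centroids) - base


# Toy "normal" data: two Gaussian blobs in 2D (made-up values).
rng = random.Random(0)
normal_data = [[rng.gauss(c, 0.5), rng.gauss(c, 0.5)]
               for c in (0.0, 4.0) for _ in range(30)]
centroids = kmeans(normal_data, k=2)    # structural distribution
sample = rng.sample(normal_data, 20)    # empirical distribution

normal_score = disruption([0.2, -0.1], sample, centroids)
anomaly_score = disruption([20.0, -20.0], sample, centroids)
print(f"normal: {normal_score:.3f}  anomaly: {anomaly_score:.3f}")
```

In this toy setting the far-away point must ship its share of mass to a distant centroid, so its disruption is much larger than that of a point near a cluster, which is consistent with the lower bound stated in the abstract. How such a signal would be folded into a base detector's score is left out here because the abstract does not say.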
Related papers
- Multi-Cue Anomaly Detection and Localization under Data Contamination [0.6703429330486276]
We propose a robust anomaly detection framework that integrates limited anomaly supervision into the adaptive deviation learning paradigm. Our framework achieves strong detection and localization performance, interpretability, and robustness under various levels of data contamination.
arXiv Detail & Related papers (2026-01-30T12:34:13Z) - Correcting False Alarms from Unseen: Adapting Graph Anomaly Detectors at Test Time [60.341117019125214]
We propose a lightweight and plug-and-play Test-time adaptation framework for correcting Unseen Normal pattErns in graph anomaly detection (GAD). To address semantic confusion, a graph aligner is employed to align the shifted data to the original one at the graph attribute level. Extensive experiments on 10 real-world datasets demonstrate that TUNE significantly enhances the generalizability of pre-trained GAD models to both synthetic and real unseen normal patterns.
arXiv Detail & Related papers (2025-11-10T12:10:05Z) - Adaptive Deviation Learning for Visual Anomaly Detection with Data Contamination [20.4008901760593]
We introduce a systematic adaptive method that employs deviation learning to compute anomaly scores end-to-end.
Our proposed method surpasses competing techniques and exhibits both stability and robustness in the presence of data contamination.
arXiv Detail & Related papers (2024-11-14T16:10:15Z) - Enhancing Anomaly Detection via Generating Diversified and Hard-to-distinguish Synthetic Anomalies [7.021105583098609]
Recent approaches have focused on leveraging domain-specific transformations or perturbations to generate synthetic anomalies from normal samples.
We introduce a novel domain-agnostic method that employs a set of conditional perturbators and a discriminator.
We demonstrate the superiority of our method over state-of-the-art benchmarks.
arXiv Detail & Related papers (2024-09-16T08:15:23Z) - GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection [60.78684630040313]
Diffusion models tend to reconstruct normal counterparts of test images with certain noises added.
From the global perspective, the difficulty of reconstructing images with different anomalies is uneven.
We propose a global and local adaptive diffusion model (abbreviated to GLAD) for unsupervised anomaly detection.
arXiv Detail & Related papers (2024-06-11T17:27:23Z) - TabADM: Unsupervised Tabular Anomaly Detection with Diffusion Models [5.314466196448187]
We present a diffusion-based probabilistic model effective for unsupervised anomaly detection.
Our model is trained to learn the density of normal samples by utilizing a unique rejection scheme.
At inference, we identify anomalies as samples in low-density regions.
arXiv Detail & Related papers (2023-07-23T14:02:33Z) - Anomaly Detection under Distribution Shift [24.094884041252044]
Anomaly detection (AD) is a crucial machine learning task that aims to learn patterns from a set of normal training samples to identify abnormal samples in test data.
Most existing AD studies assume that the training and test data are drawn from the same data distribution, but the test data can have large distribution shifts.
We introduce a novel robust AD approach to diverse distribution shifts by minimizing the distribution gap between in-distribution and OOD normal samples in both the training and inference stages.
arXiv Detail & Related papers (2023-03-24T07:39:08Z) - Diversity-Measurable Anomaly Detection [106.07413438216416]
We propose Diversity-Measurable Anomaly Detection (DMAD) framework to enhance reconstruction diversity.
PDM essentially decouples deformation from embedding and makes the final anomaly score more reliable.
arXiv Detail & Related papers (2023-03-09T05:52:42Z) - Hierarchical Semi-Supervised Contrastive Learning for Contamination-Resistant Anomaly Detection [81.07346419422605]
Anomaly detection aims at identifying deviant samples from the normal data distribution.
Contrastive learning has provided a successful way to sample representation that enables effective discrimination on anomalies.
We propose a novel hierarchical semi-supervised contrastive learning framework, for contamination-resistant anomaly detection.
arXiv Detail & Related papers (2022-07-24T18:49:26Z) - Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability.
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z)