A Latent Space Correlation-Aware Autoencoder for Anomaly Detection in
Skewed Data
- URL: http://arxiv.org/abs/2301.00462v3
- Date: Thu, 15 Feb 2024 16:18:32 GMT
- Title: A Latent Space Correlation-Aware Autoencoder for Anomaly Detection in
Skewed Data
- Authors: Padmaksha Roy
- Abstract summary: We propose a kernelized autoencoder that measures latent dimension correlation to effectively detect both near and far anomalies.
The multi-objective function has two goals: it measures correlation information in the latent feature space in the form of a robust Mahalanobis distance (MD), and it preserves useful correlation information from the original data space by maximizing mutual information between the prior and latent space.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised learning-based anomaly detection in latent space has gained
importance since discriminating anomalies from normal data becomes difficult in
high-dimensional space. Both density estimation and distance-based methods to
detect anomalies in latent space have been explored in the past. These methods
prove that retaining valuable properties of input data in latent space helps to
better reconstruct test data. Moreover, real-world sensor data is skewed and
non-Gaussian in nature, making mean-based estimators unreliable. Furthermore,
anomaly detection methods based on reconstruction error
rely on Euclidean distance, which does not consider useful correlation
information in the feature space and also fails to accurately reconstruct the
data when it deviates from the training distribution. In this work, we address
the limitations of reconstruction error-based autoencoders and propose a
kernelized autoencoder that leverages a robust form of Mahalanobis distance
(MD) to measure latent dimension correlation to effectively detect both near
and far anomalies. This hybrid loss is aided by the principle of maximizing the
mutual information gain between the latent dimension and the high-dimensional
prior data space by maximizing the entropy of the latent space while preserving
useful correlation information of the original data in the low-dimensional
latent space. The multi-objective function has two goals -- it measures
correlation information in the latent feature space in the form of a robust MD
and simultaneously tries to preserve useful correlation information
from the original data space in the latent space by maximizing mutual
information between the prior and latent space.
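The paper's exact kernelized loss is not reproduced here, but the core scoring idea (embed the data, model latent correlations with a robust covariance estimate, and flag points by their Mahalanobis distance) can be sketched in a few lines. Everything below is an assumption made for illustration: the `encode` placeholder stands for any trained (e.g. kernelized) encoder, scikit-learn's MinCovDet stands in for the paper's robust MD estimator, and the quantile threshold is our own choice.

```python
# Minimal sketch (not the authors' implementation): score test points by a
# robust squared Mahalanobis distance computed in the latent space.
import numpy as np
from sklearn.covariance import MinCovDet

def latent_md_scores(encode, X_train, X_test):
    Z_train = encode(X_train)              # (n_train, latent_dim)
    Z_test = encode(X_test)                # (n_test,  latent_dim)
    robust_cov = MinCovDet().fit(Z_train)  # robust location and covariance
    return robust_cov.mahalanobis(Z_test)  # squared MD; larger => more anomalous

# Usage sketch: flag test points whose score exceeds a high quantile of the
# training scores (the 0.99 threshold is illustrative, not from the paper).
# scores = latent_md_scores(encoder, X_train, X_test)
# tau = np.quantile(latent_md_scores(encoder, X_train, X_train), 0.99)
# anomalies = scores > tau
```

The robust estimator matters for the skewed, non-Gaussian data the abstract targets: an empirical mean and covariance would be pulled toward outliers, inflating the covariance and masking the very anomalies the distance is meant to expose.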
Related papers
- Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning [50.84938730450622]
We propose TV score, a trajectory-based method that uses trajectory volatility for OOD detection in mathematical reasoning.
Our method outperforms all traditional algorithms on GLMs under mathematical reasoning scenarios.
Our method can be extended to more applications with high-density features in output spaces, such as multiple-choice questions.
arXiv Detail & Related papers (2024-05-22T22:22:25Z)
- stMCDI: Masked Conditional Diffusion Model with Graph Neural Network for Spatial Transcriptomics Data Imputation [8.211887623977214]
We introduce stMCDI, a novel conditional diffusion model for spatial transcriptomics data imputation.
It employs a denoising network trained using randomly masked data portions as guidance, with the unmasked data serving as conditions.
The results obtained from spatial transcriptomics datasets elucidate the performance of our method relative to existing approaches.
arXiv Detail & Related papers (2024-03-16T09:06:38Z)
- Physics-Guided Abnormal Trajectory Gap Detection [2.813613899641924]
We propose a Space Time-Aware Gap Detection (STAGD) approach that leverages space-time indexing and merging of trajectory gaps.
We also incorporate a Dynamic Region-based Merge (DRM) approach to efficiently compute gap abnormality scores.
arXiv Detail & Related papers (2024-03-10T17:07:28Z)
- Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z)
- Analyzing the Latent Space of GAN through Local Dimension Estimation [4.688163910878411]
The success of style-based GANs (StyleGANs) in high-fidelity image synthesis has motivated research to understand the semantic properties of their latent spaces.
We propose a local dimension estimation algorithm for arbitrary intermediate layers in a pre-trained GAN model.
Our proposed metric, called Distortion, measures the inconsistency of the intrinsic space on the learned latent space.
arXiv Detail & Related papers (2022-05-26T06:36:06Z)
- Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta-learn either diagonal or diagonal-plus-low-rank factors to efficiently construct task-specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
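The entry does not spell out how a diagonal-plus-low-rank covariance is used at test time, but the usual appeal of that structure is that a Mahalanobis-type distance can be computed without forming or inverting the full matrix, via the Woodbury identity. The sketch below is our own illustration of that point; the function name and array shapes are assumptions, not taken from the paper.

```python
# Squared Mahalanobis distance under Sigma = diag(d) + U @ U.T, using the
# Woodbury identity so only a k x k system is solved (k = rank of U).
import numpy as np

def mahalanobis_diag_plus_lowrank(x, mu, d, U):
    r = x - mu                                # residual, shape (dim,)
    Dinv_r = r / d                            # D^{-1} r
    Dinv_U = U / d[:, None]                   # D^{-1} U, shape (dim, k)
    cap = np.eye(U.shape[1]) + U.T @ Dinv_U   # capacitance: I + U^T D^{-1} U
    u = U.T @ Dinv_r                          # U^T D^{-1} r, shape (k,)
    return float(r @ Dinv_r - u @ np.linalg.solve(cap, u))
```

For k much smaller than the feature dimension, the cost scales with dim times k squared rather than dim cubed, which is what makes per-task covariance matrices practical.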
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
- Featurized Density Ratio Estimation [82.40706152910292]
In our work, we propose to leverage an invertible generative model to map the two distributions into a common feature space prior to estimation.
This featurization brings the densities closer together in latent space, sidestepping pathological scenarios where the learned density ratios in input space can be arbitrarily inaccurate.
At the same time, the invertibility of our feature map guarantees that the ratios computed in feature space are equivalent to those in input space.
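The equivalence claim follows from a one-line change-of-variables argument; writing it out (with f the invertible feature map and p_Z, q_Z the pushforward densities, notation ours) shows why the Jacobian factors cancel:

```latex
p_X(x) = p_Z\!\big(f(x)\big)\,\lvert\det J_f(x)\rvert, \qquad
q_X(x) = q_Z\!\big(f(x)\big)\,\lvert\det J_f(x)\rvert
\;\;\Longrightarrow\;\;
r(x) = \frac{p_X(x)}{q_X(x)} = \frac{p_Z\!\big(f(x)\big)}{q_Z\!\big(f(x)\big)}.
```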
arXiv Detail & Related papers (2021-07-05T18:30:26Z)
- Regressive Domain Adaptation for Unsupervised Keypoint Detection [67.2950306888855]
Domain adaptation (DA) aims at transferring knowledge from a labeled source domain to an unlabeled target domain.
We present a method of regressive domain adaptation (RegDA) for unsupervised keypoint detection.
Our method brings large improvements of 8% to 11% in PCK across different datasets.
arXiv Detail & Related papers (2021-03-10T16:45:22Z)
- Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso [102.84915019938413]
Magnetoencephalography (MEG) and electroencephalography (EEG) offer the promise of non-invasive measurement of brain activity.
The problem of source localization, or source imaging, however, poses a high-dimensional statistical inference challenge.
We propose an ensemble of desparsified multi-task Lasso (ecd-MTLasso) to deal with this problem.
arXiv Detail & Related papers (2020-09-29T21:17:16Z)
- Correlation-aware Deep Generative Model for Unsupervised Anomaly Detection [9.578395294627057]
Unsupervised anomaly detection aims to identify anomalous samples from highly complex and unstructured data.
We propose a method for Correlation-aware unsupervised Anomaly detection via a Deep Gaussian Mixture Model (CADGMM).
Experiments on real-world datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2020-02-18T03:32:06Z)