Explaining Anomalies using Denoising Autoencoders for Financial Tabular
Data
- URL: http://arxiv.org/abs/2209.10658v1
- Date: Wed, 21 Sep 2022 21:02:22 GMT
- Title: Explaining Anomalies using Denoising Autoencoders for Financial Tabular
Data
- Authors: Timur Sattarov, Dayananda Herurkar, Jörn Hees
- Abstract summary: We propose a framework for explaining anomalies using denoising autoencoders designed for mixed type tabular data.
This is achieved by localizing individual sample columns with potential errors and assigning corresponding confidence scores.
Our framework is designed for a domain expert to understand abnormal characteristics of an anomaly, as well as to improve in-house data quality management processes.
- Score: 5.071227866936205
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in Explainable AI (XAI) have increased the demand for the
deployment of safe and interpretable AI models in various industry sectors. Despite the
latest success of deep neural networks in a variety of domains, understanding
the decision-making process of such complex models still remains a challenging
task for domain experts. Especially in the financial domain, merely pointing to
an anomaly that often comprises hundreds of mixed-type columns has limited value
for experts.
Hence, in this paper, we propose a framework for explaining anomalies using
denoising autoencoders designed for mixed type tabular data. We specifically
focus our technique on anomalies that are erroneous observations. This is
achieved by localizing individual sample columns (cells) with potential errors
and assigning corresponding confidence scores. In addition, the model provides
the expected cell value estimates to fix the errors.
We evaluate our approach based on three standard public tabular datasets
(Credit Default, Adult, IEEE Fraud) and one proprietary dataset (Holdings). We
find that denoising autoencoders applied to this task already outperform other
approaches in the cell error detection rates as well as in the expected value
rates. Additionally, we analyze how a specialized loss designed for cell error
detection can further improve these metrics. Our framework is designed for a
domain expert to understand abnormal characteristics of an anomaly, as well as
to improve in-house data quality management processes.
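The cell-level localization described in the abstract can be illustrated roughly as follows. This is a minimal sketch and not the authors' implementation: it assumes a trained denoising autoencoder has already produced a reconstruction for one row, with a point estimate for each numeric column and a probability distribution for each categorical column. The function names `cell_scores` and `flag_cells`, the `1 - exp(-z)` squashing of numeric errors, the per-column error scale, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math


def cell_scores(row, recon_numeric, recon_categorical, numeric_scale):
    """Score every cell of one row against a model reconstruction.

    row:               column -> observed value
    recon_numeric:     column -> reconstructed numeric value
    recon_categorical: column -> {category: predicted probability}
    numeric_scale:     column -> typical reconstruction error on clean data
                       (an assumed normalizer, not taken from the paper)

    Returns column -> (error_score in [0, 1], expected_value).
    """
    scores = {}
    for col, expected in recon_numeric.items():
        # Scaled absolute reconstruction error, squashed into [0, 1).
        z = abs(row[col] - expected) / numeric_scale[col]
        scores[col] = (1.0 - math.exp(-z), expected)
    for col, probs in recon_categorical.items():
        # A low probability for the observed category means a likely error.
        p_observed = probs.get(row[col], 0.0)
        expected = max(probs, key=probs.get)
        scores[col] = (1.0 - p_observed, expected)
    return scores


def flag_cells(scores, threshold=0.5):
    """Return the columns whose score exceeds the threshold, worst first."""
    flagged = [col for col, (score, _) in scores.items() if score > threshold]
    return sorted(flagged, key=lambda col: -scores[col][0])


# Illustrative row and reconstruction (made-up values, not from any dataset).
row = {"amount": 1_000_000.0, "currency": "USD", "country": "DE"}
recon_numeric = {"amount": 1200.0}
numeric_scale = {"amount": 300.0}
recon_categorical = {
    "currency": {"EUR": 0.9, "USD": 0.1},
    "country": {"DE": 0.95, "FR": 0.05},
}

scores = cell_scores(row, recon_numeric, recon_categorical, numeric_scale)
flagged = flag_cells(scores)
for col in flagged:
    score, expected = scores[col]
    print(f"{col}: score={score:.2f}, observed={row[col]!r}, expected={expected!r}")
```

In this toy row the `amount` and `currency` cells are flagged and `country` is not, and each flagged cell comes with a suggested replacement value, which mirrors the three outputs the abstract names: error location, confidence score, and expected value.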
Related papers
- A Transfer Learning Framework for Anomaly Detection in Multivariate IoT Traffic Data [6.229535970620059]
We propose a transfer learning-based model for anomaly detection in time-series datasets.
Unlike conventional methods, our approach does not require labeled data in either the source or target domains.
Empirical evaluations on novel intrusion detection datasets demonstrate that our model outperforms existing techniques.
arXiv Detail & Related papers (2025-01-26T02:03:49Z)
- MeLIAD: Interpretable Few-Shot Anomaly Detection with Metric Learning and Entropy-based Scoring [2.394081903745099]
We propose MeLIAD, a novel methodology for interpretable anomaly detection.
MeLIAD is based on metric learning and achieves interpretability by design without relying on any prior distribution assumptions of true anomalies.
Experiments on five public benchmark datasets, including quantitative and qualitative evaluation of interpretability, demonstrate that MeLIAD achieves improved anomaly detection and localization performance.
arXiv Detail & Related papers (2024-09-20T16:01:43Z)
- GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features [68.14842693208465]
GeneralAD is an anomaly detection framework designed to operate in semantic, near-distribution, and industrial settings.
We propose a novel self-supervised anomaly generation module that employs straightforward operations like noise addition and shuffling to patch features.
We extensively evaluated our approach on ten datasets, achieving state-of-the-art results in six and on-par performance in the remaining four.
arXiv Detail & Related papers (2024-07-17T09:27:41Z)
- Fin-Fed-OD: Federated Outlier Detection on Financial Tabular Data [11.027356898413139]
Anomaly detection in real-world scenarios poses challenges due to dynamic and often unknown anomaly distributions.
This paper addresses the question of enhancing outlier detection within individual organizations without compromising data confidentiality.
We propose a novel method leveraging representation learning and federated learning techniques to improve the detection of unknown anomalies.
arXiv Detail & Related papers (2024-04-23T11:22:04Z)
- Progressing from Anomaly Detection to Automated Log Labeling and Pioneering Root Cause Analysis [53.24804865821692]
This study introduces a taxonomy for log anomalies and explores automated data labeling to mitigate labeling challenges.
The study envisions a future where root cause analysis follows anomaly detection, unraveling the underlying triggers of anomalies.
arXiv Detail & Related papers (2023-12-22T15:04:20Z)
- Unraveling the "Anomaly" in Time Series Anomaly Detection: A Self-supervised Tri-domain Solution [89.16750999704969]
Anomaly labels hinder traditional supervised models in time series anomaly detection.
Various SOTA deep learning techniques, such as self-supervised learning, have been introduced to tackle this issue.
We propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD).
arXiv Detail & Related papers (2023-11-19T05:37:18Z)
- Anomaly Detection with Score Distribution Discrimination [4.468952886990851]
We propose to optimize the anomaly scoring function from the view of score distribution.
We design a novel loss function called Overlap loss that minimizes the overlap area between the score distributions of normal and abnormal samples.
arXiv Detail & Related papers (2023-06-26T03:32:57Z)
- Leveraging variational autoencoders for multiple data imputation [0.5156484100374059]
We investigate the ability of deep models, namely variational autoencoders (VAEs), to account for uncertainty in missing data through multiple imputation strategies.
We find that VAEs provide poor empirical coverage of missing data, with underestimation and overconfident imputations.
To overcome this, we employ $\beta$-VAEs, which, viewed from a generalized Bayes framework, provide robustness to model misspecification.
arXiv Detail & Related papers (2022-09-30T08:58:43Z)
- An Outlier Exposure Approach to Improve Visual Anomaly Detection Performance for Mobile Robots [76.36017224414523]
We consider the problem of building visual anomaly detection systems for mobile robots.
Standard anomaly detection models are trained using large datasets composed only of non-anomalous data.
We tackle the problem of exploiting these data to improve the performance of a Real-NVP anomaly detection model.
arXiv Detail & Related papers (2022-09-20T15:18:13Z)
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.