Multiscale Feature Attribution for Outliers
- URL: http://arxiv.org/abs/2310.20012v1
- Date: Mon, 30 Oct 2023 20:58:28 GMT
- Title: Multiscale Feature Attribution for Outliers
- Authors: Jeff Shen, Peter Melchior
- Abstract summary: We propose a new feature attribution method, Inverse Multiscale Occlusion, specifically designed for outliers.
We demonstrate our method on outliers detected in galaxy spectra from the Dark Energy Spectroscopic Instrument.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning techniques can automatically identify outliers in massive
datasets, much faster and more reproducibly than human inspection ever could.
But finding such outliers immediately leads to the question: which features
render this input anomalous? We propose a new feature attribution method,
Inverse Multiscale Occlusion, that is specifically designed for outliers, for
which we have little knowledge of the type of features we want to identify and
expect that the model performance is questionable because anomalous test data
likely exceed the limits of the training data. We demonstrate our method on
outliers detected in galaxy spectra from the Dark Energy Spectroscopic Instrument and
find its results to be much more interpretable than alternative attribution
approaches.
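
The method described above belongs to the occlusion family of attribution techniques: parts of the input are replaced and the resulting change in the model's outlier score is credited to the replaced features. Below is a minimal sketch of that general idea for a 1D spectrum, written under stated assumptions: the scalar `outlier_score` callable, the window widths, and the linear-interpolation baseline are all illustrative choices, not the authors' Inverse Multiscale Occlusion implementation.

```python
# Sketch (assumption, not the authors' code): occlusion-style attribution for an
# outlier score on a 1D spectrum. Each window, at several widths, is replaced by
# a smooth stand-in and the resulting drop in the outlier score is credited to
# the pixels in that window.
import numpy as np

def occlusion_attribution(spectrum, outlier_score, widths=(4, 16, 64)):
    """spectrum: 1D float array; outlier_score: callable mapping an array to a float."""
    base = outlier_score(spectrum)
    attribution = np.zeros(len(spectrum))
    counts = np.zeros(len(spectrum))
    for width in widths:
        for start in range(0, len(spectrum) - width + 1, width):
            stop = start + width
            occluded = spectrum.copy()
            # Replace the window with a linear ramp between its endpoints as a
            # crude stand-in for a plausible, in-distribution value (illustrative).
            occluded[start:stop] = np.linspace(spectrum[start], spectrum[stop - 1], width)
            # If replacing this window lowers the outlier score, these pixels
            # contributed to the input being flagged as anomalous.
            attribution[start:stop] += base - outlier_score(occluded)
            counts[start:stop] += 1
    return attribution / np.maximum(counts, 1)
```

Sweeping several window widths is what makes the attribution multiscale: narrow spikes and broad features can both register, which matters when one does not know in advance what kind of anomaly to expect.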
Related papers
- Diversified Outlier Exposure for Out-of-Distribution Detection via Informative Extrapolation [110.34982764201689]
Out-of-distribution (OOD) detection is important for deploying reliable machine learning models on real-world applications.
Recent advances in outlier exposure have shown promising results on OOD detection by fine-tuning models with informatively sampled auxiliary outliers.
We propose a novel framework, Diversified Outlier Exposure (DivOE), for effective OOD detection via informative extrapolation based on the given auxiliary outliers; the generic outlier-exposure objective underlying this line of work is sketched after this list.
arXiv Detail & Related papers (2023-10-21T07:16:09Z)
- SaliencyCut: Augmenting Plausible Anomalies for Anomaly Detection [24.43321988051129]
We propose a novel saliency-guided data augmentation method, SaliencyCut, to produce pseudo but more common anomalies.
We then design a novel patch-wise residual module in the anomaly learning head to extract and assess the fine-grained anomaly features from each sample.
arXiv Detail & Related papers (2023-06-14T08:55:36Z)
- WePaMaDM-Outlier Detection: Weighted Outlier Detection using Pattern Approaches for Mass Data Mining [0.6754597324022876]
Outlier detection can reveal vital information about system faults, fraudulent activities, and patterns in the data.
This article proposes WePaMaDM-Outlier Detection for distinct mass data mining domains.
It also investigates the significance of data modeling in outlier detection techniques in surveillance, fault detection, and trend analysis.
arXiv Detail & Related papers (2023-06-09T07:00:00Z)
- Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z)
- SLA$^2$P: Self-supervised Anomaly Detection with Adversarial Perturbation [77.71161225100927]
Anomaly detection is a fundamental yet challenging problem in machine learning.
We propose a novel and powerful framework, dubbed SLA$^2$P, for unsupervised anomaly detection.
arXiv Detail & Related papers (2021-11-25T03:53:43Z)
- Learning to Rank Anomalies: Scalar Performance Criteria and Maximization of Two-Sample Rank Statistics [0.0]
We propose a data-driven scoring function defined on the feature space which reflects the degree of abnormality of the observations.
This scoring function is learnt through a well-designed binary classification problem.
We illustrate our methodology with preliminary encouraging numerical experiments.
arXiv Detail & Related papers (2021-09-20T14:45:56Z)
- Unsupervised Outlier Detection using Memory and Contrastive Learning [53.77693158251706]
We think outlier detection can be done in the feature space by measuring the feature distance between outliers and inliers.
We propose a framework, MCOD, using a memory module and a contrastive learning module.
Our proposed MCOD achieves considerable performance and outperforms nine state-of-the-art methods.
arXiv Detail & Related papers (2021-07-27T07:35:42Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Deep Visual Anomaly detection with Negative Learning [18.79849041106952]
In this paper, we propose anomaly detection with negative learning (ADNL), which employs the negative learning concept for the enhancement of anomaly detection.
The idea is to limit the reconstruction capability of a generative model using a given small amount of anomalous examples.
This way, the network not only learns to reconstruct normal data but also encloses the normal distribution far from the possible distribution of anomalies.
arXiv Detail & Related papers (2021-05-24T01:48:44Z)
- Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data [150.9270911031327]
We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset.
Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data.
We propose here instead a deep reinforcement learning-based approach that enables an end-to-end optimization of the detection of both labeled and unlabeled anomalies.
arXiv Detail & Related papers (2020-09-15T03:05:39Z)
- Interpretable Anomaly Detection with Mondrian Pólya Forests on Data Streams [6.177270420667713]
Anomaly detection at scale is an extremely challenging problem of great practicality.
Recent work has coalesced on variations of (random) $k$d-trees to summarise data for anomaly detection.
These methods rely on ad-hoc score functions that are not easy to interpret.
We contextualise these methods in a probabilistic framework which we call the Mondrian Pólya Forest.
arXiv Detail & Related papers (2020-08-04T13:19:07Z)
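
Several of the entries above (DivOE in particular) build on outlier exposure, where a classifier is fine-tuned with auxiliary outliers in addition to its labeled in-distribution data. As a point of reference, here is a sketch of the generic outlier-exposure objective only; the function name and the weight `lam` are illustrative assumptions, and this is not DivOE's informative-extrapolation procedure.

```python
# Generic outlier-exposure fine-tuning loss (illustrative sketch, not DivOE):
# standard cross-entropy on in-distribution batches plus a term that pushes
# predictions on auxiliary outliers toward the uniform distribution.
import torch.nn.functional as F

def outlier_exposure_loss(logits_in, labels_in, logits_out, lam=0.5):
    # Supervised loss on labeled in-distribution samples.
    ce = F.cross_entropy(logits_in, labels_in)
    # Cross-entropy against the uniform distribution (up to a constant):
    # minimizing -mean(log_softmax) flattens the predictions on outliers.
    uniform_term = -F.log_softmax(logits_out, dim=1).mean()
    return ce + lam * uniform_term
```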
This list is automatically generated from the titles and abstracts of the papers on this site.