DATE: Detecting Anomalies in Text via Self-Supervision of Transformers
- URL: http://arxiv.org/abs/2104.05591v1
- Date: Mon, 12 Apr 2021 16:08:05 GMT
- Title: DATE: Detecting Anomalies in Text via Self-Supervision of Transformers
- Authors: Andrei Manolache and Florin Brad and Elena Burceanu
- Abstract summary: Recent deep methods for anomalies in images learn better features of normality in an end-to-end self-supervised setting.
We use this approach for Anomaly Detection in text, by introducing a novel pretext task on text sequences.
We show strong quantitative and qualitative results on the 20Newsgroups and AG News datasets.
- Score: 5.105840060102528
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging deep learning models for Anomaly Detection (AD) has seen
widespread use in recent years due to superior performances over traditional
methods. Recent deep methods for anomalies in images learn better features of
normality in an end-to-end self-supervised setting. These methods train a model
to discriminate between different transformations applied to visual data and
then use the output to compute an anomaly score. We use this approach for AD in
text, by introducing a novel pretext task on text sequences. We learn our DATE
model end-to-end, enforcing two independent and complementary self-supervision
signals, one at the token-level and one at the sequence-level. Under this new
task formulation, we show strong quantitative and qualitative results on the
20Newsgroups and AG News datasets. In the semi-supervised setting, we
outperform state-of-the-art results by +13.5% and +6.9%, respectively (AUROC).
In the unsupervised configuration, DATE surpasses all other methods even when
10% of its training data is contaminated with outliers (compared with 0% for
the others).
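As a rough illustration of the transformation-discrimination recipe described in the abstract (train a classifier to recognize which self-supervised transformation was applied to normal data, then score anomalies by how poorly it recognizes them), here is a minimal PyTorch sketch. The masking scheme, model sizes, and names (SequenceEncoder, apply_pattern) are illustrative assumptions and do not reproduce DATE's actual architecture, which combines a token-level and a sequence-level signal.

```python
import torch
import torch.nn as nn

K_PATTERNS = 4            # number of masking patterns (the pretext "transformations")
VOCAB, DIM, MASK_ID = 1000, 64, 0

class SequenceEncoder(nn.Module):
    """Tiny Transformer encoder with a sequence-level pretext head."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(DIM, K_PATTERNS)

    def forward(self, tokens):                # tokens: (B, T) integer ids
        h = self.enc(self.emb(tokens)).mean(dim=1)
        return self.head(h)                   # logits over the K masking patterns

def apply_pattern(tokens, k):
    """Toy 'transformation': mask every position i with i % K_PATTERNS == k."""
    out = tokens.clone()
    out[:, k::K_PATTERNS] = MASK_ID
    return out

model = SequenceEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss(reduction="none")

def train_step(batch):                        # batch: (B, T) ids of in-class (normal) text
    k = int(torch.randint(0, K_PATTERNS, (1,)))
    target = torch.full((batch.size(0),), k)
    loss = loss_fn(model(apply_pattern(batch, k)), target).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

@torch.no_grad()
def anomaly_score(batch):
    """Average pretext loss over all patterns: higher loss => more anomalous."""
    per_pattern = []
    for k in range(K_PATTERNS):
        target = torch.full((batch.size(0),), k)
        per_pattern.append(loss_fn(model(apply_pattern(batch, k)), target))
    return torch.stack(per_pattern).mean(dim=0)

batch = torch.randint(1, VOCAB, (8, 32))      # dummy token ids
train_step(batch)
print(anomaly_score(batch).shape)             # torch.Size([8])
```

In this toy setup, sequences resembling the training (normal) class yield low pretext loss, while out-of-class sequences tend to score higher.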
Related papers
- RoSAS: Deep Semi-Supervised Anomaly Detection with Contamination-Resilient Continuous Supervision [21.393509817509464]
This paper proposes a novel semi-supervised anomaly detection method that devises contamination-resilient continuous supervisory signals.
Our approach significantly outperforms state-of-the-art competitors by 20%-30% in AUC-PR.
arXiv Detail & Related papers (2023-07-25T04:04:49Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Confidence-Guided Data Augmentation for Deep Semi-Supervised Training [0.9968241071319184]
We propose a new data augmentation technique for semi-supervised learning settings that emphasizes learning from the most challenging regions of the feature space.
We perform experiments on two benchmark RGB datasets: CIFAR-100 and STL-10, and show that the proposed scheme improves classification performance in terms of accuracy and robustness.
arXiv Detail & Related papers (2022-09-16T21:23:19Z)
- AnoShift: A Distribution Shift Benchmark for Unsupervised Anomaly Detection [7.829710051617368]
We introduce an unsupervised anomaly detection benchmark with data that shifts over time, built over Kyoto-2006+, a traffic dataset for network intrusion detection.
We first highlight the non-stationary nature of the data, using a basic per-feature analysis, t-SNE, and an Optimal Transport approach for measuring the overall distribution distances between years.
We validate the performance degradation over time with diverse models, ranging from classical approaches to deep learning.
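As a hedged sketch of the distribution-distance analysis mentioned above, the snippet below approximates drift between two years of traffic features by averaging a per-feature 1-D Wasserstein (optimal transport) distance with SciPy. The arrays features_2006/features_2014 and the per-feature averaging are illustrative assumptions, not AnoShift's exact protocol.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def yearly_drift(features_a: np.ndarray, features_b: np.ndarray) -> float:
    """Mean per-feature 1-D Wasserstein distance between two sample sets."""
    dists = [
        wasserstein_distance(features_a[:, j], features_b[:, j])
        for j in range(features_a.shape[1])
    ]
    return float(np.mean(dists))

# Hypothetical feature matrices for two years, shape (n_samples, n_features).
rng = np.random.default_rng(0)
features_2006 = rng.normal(0.0, 1.0, size=(5000, 8))
features_2014 = rng.normal(0.5, 1.3, size=(5000, 8))   # shifted distribution

print(f"mean per-feature OT distance: {yearly_drift(features_2006, features_2014):.3f}")
```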
arXiv Detail & Related papers (2022-06-30T17:59:22Z)
- Fake It Till You Make It: Near-Distribution Novelty Detection by Score-Based Generative Models [54.182955830194445]
Existing models either fail or face a dramatic drop under the so-called "near-distribution" setting.
We propose to exploit a score-based generative model to produce synthetic near-distribution anomalous data.
Our method improves the near-distribution novelty detection by 6% and passes the state-of-the-art by 1% to 5% across nine novelty detection benchmarks.
arXiv Detail & Related papers (2022-05-28T02:02:53Z)
- Self-Trained One-class Classification for Unsupervised Anomaly Detection [56.35424872736276]
Anomaly detection (AD) has various applications across domains, from manufacturing to healthcare.
In this work, we focus on unsupervised AD problems whose entire training data are unlabeled and may contain both normal and anomalous samples.
To tackle this problem, we build a robust one-class classification framework via data refinement.
We show that our method outperforms the state-of-the-art one-class classification method by 6.3 AUC and 12.5 average precision.
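The data-refinement idea can be sketched generically: fit a one-class model on the (possibly contaminated) training set, drop the samples it scores as most anomalous, and refit on the cleaner subset. The snippet below is a toy illustration using scikit-learn's OneClassSVM; the refinement rule, contamination rate, and number of rounds are assumptions for illustration, not the paper's method.

```python
import numpy as np
from sklearn.svm import OneClassSVM

def refined_one_class(X, contamination=0.1, rounds=3):
    """Iteratively refit a one-class model after discarding likely outliers."""
    keep = np.ones(len(X), dtype=bool)
    model = OneClassSVM(nu=0.1, gamma="scale")
    for _ in range(rounds):
        model.fit(X[keep])
        scores = model.decision_function(X)          # lower = more anomalous
        threshold = np.quantile(scores, contamination)
        keep = scores > threshold                    # refine the training set
    return model

# Toy unlabeled data: 90% inliers, 10% outliers.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (900, 2)), rng.normal(6, 1, (100, 2))])
model = refined_one_class(X)
print((model.decision_function(X) < 0).mean())       # fraction flagged anomalous
```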
arXiv Detail & Related papers (2021-06-11T01:36:08Z)
- RLAD: Time Series Anomaly Detection through Reinforcement Learning and Active Learning [17.089402177923297]
We introduce a new semi-supervised, time series anomaly detection algorithm.
It uses deep reinforcement learning and active learning to efficiently learn and adapt to anomalies in real-world time series data.
It requires no manual tuning of parameters and outperforms all state-of-the-art methods we compare with.
arXiv Detail & Related papers (2021-03-31T15:21:15Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
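A minimal sketch of the linearization idea: label tokens are interleaved with the words so that a language model trained on the resulting sequences can generate new labeled sentences, which are then de-linearized back into token/tag pairs. The exact scheme below (tags inserted before non-O words) is an assumption for illustration, not necessarily DAGA's formulation.

```python
def linearize(tokens, tags):
    """Interleave each non-O tag before its word, e.g. ['B-LOC', 'Paris']."""
    out = []
    for tok, tag in zip(tokens, tags):
        if tag != "O":
            out.append(tag)
        out.append(tok)
    return " ".join(out)

def delinearize(sequence):
    """Recover (tokens, tags) from a generated linearized sequence."""
    tokens, tags, pending = [], [], "O"
    for piece in sequence.split():
        if piece.startswith(("B-", "I-")):
            pending = piece
        else:
            tokens.append(piece)
            tags.append(pending)
            pending = "O"
    return tokens, tags

print(linearize(["Ana", "visited", "Paris"], ["B-PER", "O", "B-LOC"]))
# -> "B-PER Ana visited B-LOC Paris"
print(delinearize("B-PER Ana visited B-LOC Paris"))
# -> (['Ana', 'visited', 'Paris'], ['B-PER', 'O', 'B-LOC'])
```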
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
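A minimal PyTorch sketch of LSTM-based Generator and Critic modules for a GAN over one-dimensional time series, in the spirit of the description above; layer sizes and names are assumptions, and the sketch omits TadGAN's encoder and reconstruction/cycle-consistency components.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a latent sequence to a one-dimensional time series."""
    def __init__(self, latent_dim=20, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(latent_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, z):                      # z: (B, T, latent_dim)
        h, _ = self.lstm(z)
        return self.out(h)                     # generated series: (B, T, 1)

class Critic(nn.Module):
    """Scores how realistic a time series looks."""
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(1, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (B, T, 1)
        h, _ = self.lstm(x)
        return self.score(h[:, -1])            # realness score per sequence: (B, 1)

G, C = Generator(), Critic()
z = torch.randn(8, 100, 20)                    # batch of latent sequences
fake = G(z)
print(fake.shape, C(fake).shape)               # torch.Size([8, 100, 1]) torch.Size([8, 1])
```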
arXiv Detail & Related papers (2020-09-16T15:52:04Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
The method, which recomputes batch normalization statistics at prediction time (hence "prediction-time batch normalization"), significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
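As a small, hedged illustration of the idea: at prediction time, normalize with statistics computed from the incoming test batch rather than the stored running averages. Switching only the BatchNorm layers into train mode, as below, is one simple way to approximate this in PyTorch (note that it also updates the running averages, which a more careful implementation would avoid).

```python
import torch
import torch.nn as nn

# Hypothetical model with a BatchNorm layer.
model = nn.Sequential(
    nn.Linear(16, 32), nn.BatchNorm1d(32), nn.ReLU(), nn.Linear(32, 2)
)

@torch.no_grad()
def predict_with_batch_stats(model, x):
    model.eval()
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.train()          # normalize with the test batch's mean/variance
    logits = model(x)
    model.eval()               # restore standard inference behaviour afterwards
    return logits

x_shifted = torch.randn(64, 16) * 2.0 + 1.0                # a covariate-shifted test batch
print(predict_with_batch_stats(model, x_shifted).shape)    # torch.Size([64, 2])
```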
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.