Anomaly-Injected Deep Support Vector Data Description for Text Outlier
Detection
- URL: http://arxiv.org/abs/2110.14729v1
- Date: Wed, 27 Oct 2021 19:29:19 GMT
- Title: Anomaly-Injected Deep Support Vector Data Description for Text Outlier
Detection
- Authors: Zeyu You, Yichu Zhou, Tao Yang, Wei Fan
- Abstract summary: Anomaly detection or outlier detection is a common task in various domains.
In this work, we propose a deep anomaly-injected support vector data description (AI-SVDD) framework.
To tackle text input, we employ a multilayer perceptron (MLP) network in conjunction with BERT to obtain enriched text representations.
- Score: 6.420355190628236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly detection or outlier detection is a common task in various domains,
which has attracted significant research efforts in recent years. Existing
works mainly focus on structured data such as numerical or categorical data;
however, anomaly detection on unstructured textual data has received less attention. In
this work, we target the textual anomaly detection problem and propose a deep
anomaly-injected support vector data description (AI-SVDD) framework. AI-SVDD
not only learns a more compact representation of the data hypersphere but also
adopts a small number of known anomalies to increase the discriminative power.
To tackle text input, we employ a multilayer perceptron (MLP) network in
conjunction with BERT to obtain enriched text representations. We conduct
experiments on three text anomaly detection applications with multiple
datasets. Experimental results show that the proposed AI-SVDD is promising and
outperforms existing works.
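As a rough illustration of the idea in the abstract (a compact hypersphere around normal data, plus a few known anomalies used to sharpen the boundary), the following is a minimal sketch and not the authors' objective. The hinge-style penalty, the `margin` and `weight` parameters, and all function names are assumptions introduced here; the toy 2-D vectors stand in for the MLP-over-BERT text representations.

```python
def squared_distance(z, center):
    """Squared Euclidean distance between an embedding and the center."""
    return sum((a - b) ** 2 for a, b in zip(z, center))


def mean_center(embeddings):
    """Centroid of the normal embeddings, used here as the hypersphere center."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]


def svdd_loss_with_anomalies(normal, anomalies, center, margin=1.0, weight=1.0):
    """Hypothetical loss (an assumption, not taken from the paper):
    pull normal points toward the center and penalize known anomalies
    whose squared distance to the center is smaller than `margin`."""
    pull = sum(squared_distance(z, center) for z in normal) / len(normal)
    push = 0.0
    if anomalies:
        push = sum(
            max(0.0, margin - squared_distance(z, center)) for z in anomalies
        ) / len(anomalies)
    return pull + weight * push


def anomaly_score(z, center):
    """Larger distance from the center means more anomalous."""
    return squared_distance(z, center)


# Toy embeddings: four normal points and two labeled anomalies.
normal = [[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [0.2, 0.2]]
anomalies = [[2.0, 2.0], [0.1, 0.1]]

center = mean_center(normal)
loss = svdd_loss_with_anomalies(normal, anomalies, center)
scores = [anomaly_score(z, center) for z in normal + anomalies]
```

In a trained model the embeddings would be updated to reduce this loss; here only the loss and the distance-based score are computed, so the second toy anomaly (placed at the center) contributes the full margin penalty.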
Related papers
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on-the-fly.
Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset.
Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z)
- Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text [61.22649031769564]
We propose a novel framework, paraphrased text span detection (PTD).
PTD aims to identify paraphrased text spans within a text.
We construct a dedicated dataset, PASTED, for paraphrased text span detection.
arXiv Detail & Related papers (2024-05-21T11:22:27Z)
- Multi-Class Deep SVDD: Anomaly Detection Approach in Astronomy with
Distinct Inlier Categories [46.34797489552547]
We propose Multi-Class Deep Support Vector Data Description (MCDSVDD) to handle different inlier categories with distinct data distributions.
MCDSVDD uses a neural network to map the data into hyperspheres, where each hypersphere represents a specific inlier category.
Our results demonstrate the efficacy of MCDSVDD in detecting anomalous sources while leveraging the presence of different inlier categories.
arXiv Detail & Related papers (2023-08-09T15:10:53Z)
- On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approaches human-like quality, the sample size needed for detection increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
- Fine-grained Anomaly Detection in Sequential Data via Counterfactual
Explanations [19.836395281552626]
We propose a novel framework called CFDet for fine-grained anomalous entry detection.
Given a sequence that is detected as anomalous, we can consider anomalous entry detection as an interpretable machine learning task.
We make use of the deep support vector data description (Deep SVDD) approach to detect anomalous sequences and propose a novel counterfactual interpretation-based approach to identify anomalous entries in the sequences.
arXiv Detail & Related papers (2022-10-09T02:38:11Z)
- An Outlier Exposure Approach to Improve Visual Anomaly Detection
Performance for Mobile Robots [76.36017224414523]
We consider the problem of building visual anomaly detection systems for mobile robots.
Standard anomaly detection models are trained using large datasets composed only of non-anomalous data.
We tackle the problem of exploiting the few available anomalous examples to improve the performance of a Real-NVP anomaly detection model.
arXiv Detail & Related papers (2022-09-20T15:18:13Z)
- Deep Anomaly Detection and Search via Reinforcement Learning [22.005663849044772]
We propose Deep Anomaly Detection and Search (DADS) to balance exploitation and exploration.
During the training process, DADS searches for possible anomalies with hierarchically-structured datasets.
Results show that DADS can efficiently and precisely search anomalies from unlabeled data and learn from them.
arXiv Detail & Related papers (2022-08-31T13:03:33Z)
- DASVDD: Deep Autoencoding Support Vector Data Descriptor for Anomaly
Detection [9.19194451963411]
Semi-supervised anomaly detection aims to detect anomalies from normal samples using a model that is trained on normal data.
We propose a method, DASVDD, that jointly learns the parameters of an autoencoder while minimizing the volume of an enclosing hyper-sphere on its latent representation.
arXiv Detail & Related papers (2021-06-09T21:57:41Z)
- Feature Encoding with AutoEncoders for Weakly-supervised Anomaly
Detection [46.76220474310698]
Weakly-supervised anomaly detection aims at learning an anomaly detector from a limited amount of labeled data and abundant unlabeled data.
Recent works build deep neural networks for anomaly detection by discriminatively mapping the normal samples and abnormal samples to different regions in the feature space or fitting different distributions.
This paper proposes a novel strategy to transform the input data into a more meaningful representation that could be used for anomaly detection.
arXiv Detail & Related papers (2021-05-22T16:23:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.