Enhancing Robustness of On-line Learning Models on Highly Noisy Data
- URL: http://arxiv.org/abs/2103.10824v1
- Date: Fri, 19 Mar 2021 14:13:16 GMT
- Title: Enhancing Robustness of On-line Learning Models on Highly Noisy Data
- Authors: Zilong Zhao, Robert Birke, Rui Han, Bogdan Robu, Sara Bouchenak, Sonia
Ben Mokhtar, Lydia Y. Chen
- Abstract summary: We extend Robust Anomaly Detector (RAD), a two-layer on-line data selection framework, with a newly designed ensemble prediction.
RAD can robustly improve the accuracy of anomaly detection, reaching up to 98.95% for IoT device attacks, up to 85.03% for cloud task failures, and up to 77.51% for face recognition.
- Score: 7.812139470551903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classification algorithms have been widely adopted to detect anomalies for
various systems, e.g., IoT, cloud and face recognition, under the common
assumption that the data source is clean, i.e., features and labels are
correctly set. However, data collected from the wild can be unreliable due to
careless annotations or malicious data transformations that aim to cause
incorrect anomaly detection. In this paper, we extend Robust Anomaly Detector
(RAD), a two-layer on-line data selection framework, with a newly designed
ensemble prediction in which both layers contribute to the final anomaly
detection decision. To adapt to the on-line nature of anomaly detection, we consider
additional features of conflicting opinions of classifiers, repetitive
cleaning, and oracle knowledge. We learn on-line from incoming data streams and
continuously cleanse the data, so as to adapt to the increasing learning
capacity from the larger accumulated data set. Moreover, we explore the concept
of oracle learning, which provides additional information about the true labels of
difficult data points. We specifically focus on three use cases: (i) detecting
10 classes of IoT attacks, (ii) predicting 4 classes of task failures of big
data jobs, and (iii) recognising the faces of 100 celebrities. Our evaluation results
show that RAD can robustly improve the accuracy of anomaly detection, to reach
up to 98.95% for IoT device attacks (i.e., +7%), up to 85.03% for cloud task
failures (i.e., +14%) under 40% label noise, and its extension can
reach up to 77.51% for face recognition (i.e., +39%) under 30% label noise. The
proposed RAD and its extensions are general and can be applied to different
anomaly detection algorithms.
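The two-layer selection and ensemble prediction described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact selection rule or how the two layers are combined, so the code below makes assumptions: layer 1 keeps a sample only when its own prediction agrees with the incoming label, the first batch is clean, and the ensemble averages the class probabilities of both layers. All names (`flip_labels`, `CentroidModel`, `KnnModel`, `run_stream`, `ensemble_predict`) are hypothetical and the toy models stand in for whatever classifiers are actually used.

```python
import math
import random


def flip_labels(labels, classes, noise_rate, rng):
    """Symmetric label noise: swap a label for another class with probability noise_rate."""
    return [rng.choice([c for c in classes if c != y]) if rng.random() < noise_rate else y
            for y in labels]


class CentroidModel:
    """Layer 1 stand-in (assumption): nearest-centroid label-quality model."""

    def __init__(self, classes):
        self.classes = list(classes)
        self.sums = {c: None for c in self.classes}
        self.counts = {c: 0 for c in self.classes}

    def update(self, points, labels):
        for x, y in zip(points, labels):
            if self.sums[y] is None:
                self.sums[y] = [0.0] * len(x)
            self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
            self.counts[y] += 1

    def proba(self, x):
        # Softmax over negative squared distances to the centroids seen so far.
        logits = {}
        for c in self.classes:
            if self.counts[c]:
                centroid = [s / self.counts[c] for s in self.sums[c]]
                logits[c] = -sum((a - b) ** 2 for a, b in zip(x, centroid))
        top = max(logits.values())
        exps = {c: math.exp(v - top) for c, v in logits.items()}
        total = sum(exps.values())
        return {c: exps.get(c, 0.0) / total for c in self.classes}


class KnnModel:
    """Layer 2 stand-in (assumption): k-nearest-neighbour classifier on kept samples."""

    def __init__(self, classes, k=5):
        self.classes = list(classes)
        self.k = k
        self.memory = []  # (point, label) pairs that passed the layer-1 filter

    def update(self, points, labels):
        self.memory.extend(zip(points, labels))

    def proba(self, x):
        nearest = sorted(
            self.memory,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(x, item[0])),
        )[: self.k]
        return {c: sum(1 for _, y in nearest if y == c) / len(nearest) for c in self.classes}


def predict(model, x):
    p = model.proba(x)
    return max(p, key=p.get)


def ensemble_predict(layer1, layer2, x):
    """Both layers contribute: average their class probabilities, then take the arg max."""
    p1, p2 = layer1.proba(x), layer2.proba(x)
    avg = {c: (p1[c] + p2[c]) / 2 for c in p1}
    return max(avg, key=avg.get)


def run_stream(batches, classes):
    """batches: iterable of (points, labels); the first batch is assumed to be clean."""
    layer1, layer2 = CentroidModel(classes), KnnModel(classes)
    kept_total = 0
    for i, (points, labels) in enumerate(batches):
        if i == 0:
            kept = list(zip(points, labels))
        else:
            # Keep a sample only if layer 1 agrees with the label it arrived with.
            kept = [(x, y) for x, y in zip(points, labels) if predict(layer1, x) == y]
        if kept:
            xs, ys = zip(*kept)
            layer1.update(xs, ys)
            layer2.update(xs, ys)
        kept_total += len(kept)
    return layer1, layer2, kept_total
```

Feeding `run_stream` a clean bootstrap batch followed by batches whose labels went through `flip_labels` shows the intended effect on easy data: samples with flipped labels are dropped before either layer trains on them. The repetitive cleaning and oracle queries mentioned in the abstract are not modelled here.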
Related papers
- An ica-ensemble learning approach for prediction of uwb nlos signals data classification [0.0]
This research focuses on harmonizing information through wireless communication and identifying individuals in NLOS scenarios using ultra-wideband radar signals.
Experiments demonstrate categorization accuracies of 88.37% for static data and 87.20% for dynamic data, highlighting the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-02-27T11:42:26Z)
- Unraveling the "Anomaly" in Time Series Anomaly Detection: A Self-supervised Tri-domain Solution [89.16750999704969]
Anomaly labels hinder traditional supervised models in time series anomaly detection.
Various SOTA deep learning techniques, such as self-supervised learning, have been introduced to tackle this issue.
We propose a novel self-supervised learning based Tri-domain Anomaly Detector (TriAD)
arXiv Detail & Related papers (2023-11-19T05:37:18Z)
- Anomaly Detection in Power Generation Plants with Generative Adversarial Networks [0.5439020425819]
This study explores the use of Generative Adversarial Networks (GANs) for anomaly detection in power generation plants.
The data was initially collected in response to observed irregularities in the fuel consumption patterns of the generating sets situated at the company's base stations.
A GANs model was trained and fine-tuned both with and without data augmentation, with the goal of increasing the dataset size to enhance performance.
arXiv Detail & Related papers (2023-09-30T10:44:05Z)
- Meta-learning with GANs for anomaly detection, with deployment in high-speed rail inspection system [7.220842608593749]
Key challenges for anomaly detection in the AI era with big data include lack of prior knowledge of potential anomaly types.
Within this framework, we incorporate the idea of generative adversarial networks (GANs) with appropriate choices of loss functions.
Our framework has been deployed in five high-speed railways of China since 2021: it has reduced more than 99.7% workload and saved 96.7% inspection time.
arXiv Detail & Related papers (2022-02-11T17:43:49Z)
- Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning, that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z)
- Self-Trained One-class Classification for Unsupervised Anomaly Detection [56.35424872736276]
Anomaly detection (AD) has various applications across domains, from manufacturing to healthcare.
In this work, we focus on unsupervised AD problems whose entire training data are unlabeled and may contain both normal and anomalous samples.
To tackle this problem, we build a robust one-class classification framework via data refinement.
We show that our method outperforms state-of-the-art one-class classification method by 6.3 AUC and 12.5 average precision.
arXiv Detail & Related papers (2021-06-11T01:36:08Z)
- RLAD: Time Series Anomaly Detection through Reinforcement Learning and Active Learning [17.089402177923297]
We introduce a new semi-supervised, time series anomaly detection algorithm.
It uses deep reinforcement learning and active learning to efficiently learn and adapt to anomalies in real-world time series data.
It requires no manual tuning of parameters and outperforms all state-of-the-art methods we compare with.
arXiv Detail & Related papers (2021-03-31T15:21:15Z)
- TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance.
One direction aims at the recognition of re-occurring anomaly types to enable remediation automation.
We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
- Contextual-Bandit Anomaly Detection for IoT Data in Distributed Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural networks (DNN) models, and offloading anomaly detection tasks to the cloud incurs long delay.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)
- Hybrid Model For Intrusion Detection Systems [0.0]
This project involves analysis of different machine learning algorithms used in intrusion detection systems.
After the analysis of different intrusion detection systems on both the datasets, this project aimed to develop a new hybrid model for intrusion detection systems.
arXiv Detail & Related papers (2020-03-19T05:52:29Z)
- Adaptive Anomaly Detection for IoT Data in Hierarchical Edge Computing [71.86955275376604]
We propose an adaptive anomaly detection approach for hierarchical edge computing (HEC) systems to solve this problem.
We design an adaptive scheme to select one of the models based on the contextual information extracted from input data, to perform anomaly detection.
We evaluate our proposed approach using a real IoT dataset, and demonstrate that it reduces detection delay by 84% while maintaining almost the same accuracy as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-01-10T05:29:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.