Related papers: Practical Anomaly Detection over Multivariate Monitoring Metrics for Online Services

URL: http://arxiv.org/abs/2308.09937v1
Date: Sat, 19 Aug 2023 08:08:05 GMT
Title: Practical Anomaly Detection over Multivariate Monitoring Metrics for Online Services
Authors: Jinyang Liu, Tianyi Yang, Zhuangbin Chen, Yuxin Su, Cong Feng, Zengyin Yang, Michael R. Lyu
Abstract summary: CMAnomaly is an anomaly detection framework on multivariate monitoring metrics based on collaborative machine. The proposed framework is extensively evaluated with both public data and industrial data collected from a large-scale online service system of Huawei Cloud. Compared with state-of-the-art baseline models, CMAnomaly achieves an average F1 score of 0.9494, outperforming baselines by 6.77% to 10.68%, and runs 10X to 20X faster.
Score: 29.37493773435177
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: As modern software systems continue to grow in terms of complexity and volume, anomaly detection on multivariate monitoring metrics, which profile systems' health status, becomes more and more critical and challenging. In particular, the dependency between different metrics and their historical patterns plays a critical role in pursuing prompt and accurate anomaly detection. Existing approaches fall short of industrial needs for being unable to capture such information efficiently. To fill this significant gap, in this paper, we propose CMAnomaly, an anomaly detection framework on multivariate monitoring metrics based on collaborative machine. The proposed collaborative machine is a mechanism to capture the pairwise interactions along with feature and temporal dimensions with linear time complexity. Cost-effective models can then be employed to leverage both the dependency between monitoring metrics and their historical patterns for anomaly detection. The proposed framework is extensively evaluated with both public data and industrial data collected from a large-scale online service system of Huawei Cloud. The experimental results demonstrate that compared with state-of-the-art baseline models, CMAnomaly achieves an average F1 score of 0.9494, outperforming baselines by 6.77% to 10.68%, and runs 10X to 20X faster. Furthermore, we also share our experience of deploying CMAnomaly in Huawei Cloud.

Related papers

Federated Koopman-Reservoir Learning for Large-Scale Multivariate Time-Series Anomaly Detection [12.44225906937484]
FedKO is a novel unsupervised Federated Learning framework. It is deployed on edge devices for efficient detection of anomalies in local MVTS streams. It reduces communication size and memory usage by up to 8x and 2x, respectively, making it highly suitable for large-scale systems.
arXiv Detail & Related papers (2025-03-14T10:06:52Z)
Enhancing Web Service Anomaly Detection via Fine-grained Multi-modal Association and Frequency Domain Analysis [8.860339665670255]
Anomaly detection is crucial for ensuring the stability and reliability of web service systems. Existing anomaly detection methods use logs and metrics to detect anomalies. We propose a novel anomaly detection method named FFAD to address these two issues.
arXiv Detail & Related papers (2025-01-28T12:00:45Z)
Online Multi-modal Root Cause Analysis [61.94987309148539]
Root Cause Analysis (RCA) is essential for pinpointing the root causes of failures in microservice systems. Existing online RCA methods handle only single-modal data, overlooking complex interactions in multi-modal systems. We introduce OCEAN, a novel online multi-modal causal structure learning method for root cause localization.
arXiv Detail & Related papers (2024-10-13T21:47:36Z)
A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods. The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics. We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
With increasing privacy concerns, we propose a Parameter-efficient Federated Anomaly Detection framework named PeFAD. We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
HCL-MTSAD: Hierarchical Contrastive Consistency Learning for Accurate Detection of Industrial Multivariate Time Series Anomalies [4.806959791183183]
We propose a novel self-supervised hierarchical contrastive consistency learning method for detecting anomalies in industrial MTS. By developing a multi-layer contrastive loss, HCL-MTSAD can extensively mine data consistency and timestamp-temporal association. Experiments conducted on six diverse MTS retrieved from real cyber-physical systems and server machines indicate that HCL-MTSAD's anomaly detection capability outperforms the state-of-the-art benchmark models by an average of 1.8% in terms of F1 score.
arXiv Detail & Related papers (2024-04-12T03:39:33Z)
MELODY: Robust Semi-Supervised Hybrid Model for Entity-Level Online Anomaly Detection with Multivariate Time Series [11.754433499581879]
A faulty code change may degrade the target service's performance and cause cascading outages in downstream services. In this paper, we study the problem of anomaly detection for deployments. We propose a novel framework, semi-supervised hybrid Model for Entity-Level Online Detection of anomalY (MELODY).
arXiv Detail & Related papers (2024-01-18T19:02:41Z)
Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values [67.76168547245237]
We introduce a novel framework called GST-Pro, which utilizes a graph spatiotemporal process and anomaly scorer to detect anomalies. Our experimental results show that the GST-Pro method can effectively detect anomalies in time series data and outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-01-11T10:10:16Z)
Twin Graph-based Anomaly Detection via Attentive Multi-Modal Learning for Microservice System [24.2074235652359]
We propose MSTGAD, which seamlessly integrates all available data modalities via attentive multi-modal learning. We construct a transformer-based neural network with both spatial and temporal attention mechanisms to model the inter-correlations between different modalities. This enables us to detect anomalies automatically and accurately in real-time.
arXiv Detail & Related papers (2023-10-07T06:28:41Z)
Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data. We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z)
Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data. We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism. We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z)
Federated Variational Learning for Anomaly Detection in Multivariate Time Series [13.328883578980237]
We propose an unsupervised time series anomaly detection framework in a federated fashion. We leave the training data distributed at the edge to learn a shared Variational Autoencoder (VAE) based on Convolutional Gated Recurrent Unit (ConvGRU) model. Experiments on three real-world networked sensor datasets illustrate the advantage of our approach over other state-of-the-art models.
arXiv Detail & Related papers (2021-08-18T22:23:15Z)
Unsupervised Deep Anomaly Detection for Multi-Sensor Time-Series Signals [10.866594993485226]
We propose a novel deep learning-based anomaly detection algorithm called Deep Convolutional Autoencoding Memory network (CAE-M). We first build a Deep Convolutional Autoencoder to characterize spatial dependence of multi-sensor data with a Maximum Mean Discrepancy (MMD). Then, we construct a Memory Network consisting of linear (Autoregressive Model) and non-linear predictions (Bidirectional LSTM with Attention) to capture temporal dependence from time-series data.
arXiv Detail & Related papers (2021-07-27T06:48:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.