LO2: Microservice API Anomaly Dataset of Logs and Metrics
- URL: http://arxiv.org/abs/2504.12067v1
- Date: Wed, 16 Apr 2025 13:21:56 GMT
- Title: LO2: Microservice API Anomaly Dataset of Logs and Metrics
- Authors: Alexander Bakhtin, Jesse Nyyssölä, Yuqing Wang, Noman Ahmad, Ke Ping, Matteo Esposito, Mika Mäntylä, Davide Taibi
- Abstract summary: This dataset supports research on anomaly detection and architectural degradation in microservice systems. We generate a comprehensive dataset of logs, metrics, and traces from a production microservice system.
- Score: 42.61470118436856
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Context. Microservice-based systems have gained significant attention over the past years. A critical factor for understanding and analyzing the behavior of these systems is the collection of monitoring data such as logs, metrics, and traces. These data modalities can be used for anomaly detection and root cause analysis of failures. In particular, multi-modal methods utilizing several types of this data at once have gained traction in the research community since these three modalities capture different dimensions of system behavior. Aim. We provide a dataset that supports research on anomaly detection and architectural degradation in microservice systems. We generate a comprehensive dataset of logs, metrics, and traces from a production microservice system to enable the exploration of multi-modal fusion methods that integrate multiple data modalities. Method. We dynamically tested the various APIs of the MS-based system, implementing the OAuth2.0 protocol using the Locust tool. For each execution of the prepared test suite, we collect logs and performance metrics for correct and erroneous calls with data labeled according to the error triggered during the call. Contributions. We collected approximately 657,000 individual log files, totaling over two billion log lines. In addition, we collected more than 45 million individual metric files that contain 485 unique metrics. We provide an initial analysis of logs, identify key metrics through PCA, and discuss challenges in collecting traces for this system. Moreover, we highlight the possibilities for making a more fine-grained version of the data set. This work advances anomaly detection in microservice systems using multiple data sources.
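The abstract mentions identifying key metrics through PCA but does not describe the procedure. The following is only a minimal sketch of one common way to do this, not the authors' method: standardize each metric, find the first principal component by power iteration, and rank metrics by the absolute value of their loadings. The metric names (`cpu_percent`, `latency_ms`, `gc_pause_ms`), the synthetic data, and all function names are hypothetical and are not taken from the LO2 dataset.

```python
import math
import random


def standardize(rows):
    """Z-score every column; a constant column becomes all zeros."""
    n = len(rows)
    out_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        out_cols.append([(x - mean) / std if std > 0 else 0.0 for x in col])
    return [list(r) for r in zip(*out_cols)]


def covariance(rows):
    """Covariance matrix of rows whose columns are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]


def first_component(cov, iters=1000, tol=1e-12, seed=0):
    """Power iteration: approximates the dominant eigenvector of cov."""
    d = len(cov)
    rng = random.Random(seed)  # seeded start vector, for reproducibility
    v = [rng.uniform(0.5, 1.0) for _ in range(d)]
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        w = [x / norm for x in w]
        converged = max(abs(a - b) for a, b in zip(v, w)) < tol
        v = w
        if converged:
            break
    return v


def rank_metrics(names, rows):
    """Return (name, loading) pairs sorted by |loading| on the first component."""
    loadings = first_component(covariance(standardize(rows)))
    return sorted(zip(names, loadings), key=lambda p: abs(p[1]), reverse=True)


if __name__ == "__main__":
    # Hypothetical data: two metrics driven by a shared load factor, one unrelated.
    rng = random.Random(42)
    names = ["cpu_percent", "latency_ms", "gc_pause_ms"]
    rows = []
    for _ in range(200):
        load = rng.gauss(0, 1)
        rows.append([50 + 10 * load + rng.gauss(0, 1),
                     120 + 30 * load + rng.gauss(0, 3),
                     rng.gauss(5, 1)])
    for name, loading in rank_metrics(names, rows):
        print(f"{name}: {loading:+.3f}")
```

Only the first component is computed here to keep the sketch dependency-free. With several hundred metrics, as in this dataset, a library implementation that returns multiple components and their explained variance would be the more practical choice.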
Related papers
- CHASE: A Causal Heterogeneous Graph based Framework for Root Cause Analysis in Multimodal Microservice Systems [22.00860661894853]
We propose a Causal Heterogeneous grAph baSed framEwork for root cause analysis, namely CHASE, for microservice systems with multimodal data.
CHASE learns from the constructed hypergraph with hyperedges representing the flow of causality and performs root cause localization.
arXiv Detail & Related papers (2024-06-28T07:46:51Z)
- GLAD: Content-aware Dynamic Graphs For Log Anomaly Detection [49.9884374409624]
We introduce GLAD, a Graph-based Log Anomaly Detection framework designed to detect anomalies in system logs.
arXiv Detail & Related papers (2023-09-12T04:21:30Z)
- Robust Multimodal Failure Detection for Microservice Systems [32.25907616511765]
AnoFusion is an unsupervised failure detection approach for microservice systems.
It learns the correlation of the heterogeneous multimodal data and integrates a Graph Attention Network (GAT) and a Gated Recurrent Unit (GRU).
It achieves F1-scores of 0.857 and 0.922, respectively, outperforming state-of-the-art failure detection approaches.
arXiv Detail & Related papers (2023-05-30T12:39:42Z)
- Interactive System-wise Anomaly Detection [66.3766756452743]
Anomaly detection plays a fundamental role in various applications.
It is challenging for existing methods to handle the scenarios where the instances are systems whose characteristics are not readily observed as data.
We develop an end-to-end approach which includes an encoder-decoder module that learns system embeddings.
arXiv Detail & Related papers (2023-04-21T02:20:24Z)
- Robust Failure Diagnosis of Microservice System through Multimodal Data [14.720995687799668]
We propose DiagFusion, a robust failure diagnosis approach that uses multimodal data.
Our evaluations show that DiagFusion outperforms existing methods in terms of root cause instance localization and failure type determination.
arXiv Detail & Related papers (2023-02-21T08:28:28Z)
- Heterogeneous Anomaly Detection for Software Systems via Semi-supervised Cross-modal Attention [29.654681594903114]
We propose Hades, the first end-to-end semi-supervised approach to identify system anomalies based on heterogeneous data.
Our approach employs a hierarchical architecture to learn a global representation of the system status by fusing log semantics and metric patterns.
We evaluate Hades extensively on large-scale simulated data and datasets from Huawei Cloud.
arXiv Detail & Related papers (2023-02-14T09:02:11Z)
- PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows.
Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
- Lightweight Automated Feature Monitoring for Data Streams [1.4658400971135652]
We propose a flexible system, Feature Monitoring (FM), that detects data drifts in such data sets.
It monitors all features that are used by the system, while providing an interpretable features ranking whenever an alarm occurs.
This illustrates how FM eliminates the need to add custom signals to detect specific types of problems and that monitoring the available space of features is often enough.
arXiv Detail & Related papers (2022-07-18T14:38:11Z)
- LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision [63.08516384181491]
We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts.
Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect.
Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
arXiv Detail & Related papers (2021-11-02T15:16:08Z)
- PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning [55.32009000204512]
We present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support.
Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space.
It also provides unified interfaces and visualizations for users with or without data science or machine learning background.
arXiv Detail & Related papers (2020-03-12T03:30:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.