Related papers: Analyzing Logs of Large-Scale Software Systems using Time Curves Visualization

URL: http://arxiv.org/abs/2411.05533v1
Date: Fri, 08 Nov 2024 12:42:45 GMT
Title: Analyzing Logs of Large-Scale Software Systems using Time Curves Visualization
Authors: Dmytro Borysenkov, Adriano Vogel, Sören Henning, Esteban Perez-Wohlfeil,
Abstract summary: We show that our approach can explain the main events in logs collected from different applications without prior knowledge. As a result, we expect a significant reduction of the time required to identify performance bottlenecks and security risks.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Logs are crucial for analyzing large-scale software systems, offering insights into system health, performance, security threats, potential bugs, etc. However, their chaotic nature (characterized by sheer volume, lack of standards, and variability) makes manual analysis complex. Clustering algorithms can assist by grouping logs into a smaller set of templates, but they lose the temporal and relational context in doing so. In contrast, Large Language Models (LLMs) can provide meaningful explanations but struggle to process large collections efficiently. Moreover, representation techniques for both approaches are typically limited to either plain text or traditional charting, especially when dealing with large-scale systems. In this paper, we combine clustering and LLM summarization with event detection and Multidimensional Scaling through the use of Time Curves to produce a holistic pipeline that enables efficient and automatic summarization of vast collections of software system logs. The core of our approach is the proposal of a semimetric distance that effectively measures similarity between events, thus enabling a meaningful representation. We show that our method can explain the main events of logs collected from different applications without prior knowledge. We also show how the approach can be used to detect general trends as well as outliers in parallel and distributed systems by overlapping multiple projections. As a result, we expect a significant reduction of the time required to analyze and resolve system-wide issues, identify performance bottlenecks and security risks, debug applications, etc.
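
Illustrative sketch (not from the paper): the abstract describes projecting time-ordered events with Multidimensional Scaling so that similar events land close together on a Time Curve. The Python below shows that one step under loud assumptions. The Jaccard distance over sets of template IDs is a stand-in, not the semimetric the authors propose, and the names `jaccard_distance`, `classical_mds`, `time_curve`, and the sample `windows` are hypothetical.

```python
import numpy as np


def jaccard_distance(a: set, b: set) -> float:
    """Stand-in distance between two event windows (NOT the paper's semimetric)."""
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def classical_mds(dist: np.ndarray, dims: int = 2) -> np.ndarray:
    """Embed points in `dims` dimensions from a symmetric distance matrix."""
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip tiny negatives caused by rounding
    # or by a distance that is not Euclidean.
    order = np.argsort(eigvals)[::-1][:dims]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


def time_curve(windows: list) -> np.ndarray:
    """Return one 2D point per window; row order is temporal order."""
    n = len(windows)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = jaccard_distance(windows[i], windows[j])
    return classical_mds(dist)


# Hypothetical input: the set of log-template IDs seen in each time window.
windows = [{"T1", "T2"}, {"T1", "T2"}, {"T3", "T4"}, {"T1", "T2"}]
curve = time_curve(windows)
```

Connecting consecutive rows of `curve` with line segments gives the curve itself. In this toy input the first, second, and fourth windows coincide, so the curve leaves their shared position for the third window and then returns, which is the kind of revisited state a Time Curve is meant to make visible.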

Related papers

A system identification approach to clustering vector autoregressive time series [50.66782357329375]
Clustering time series based on their underlying dynamics continues to attract researchers because of its impact on complex system modelling. Most current time series clustering methods handle only scalar time series, treat them as white noise, or rely on domain knowledge for high-quality feature construction. Instead of relying on feature/metric construction, the system identification approach handles vector time series clustering by explicitly considering their underlying autoregressive dynamics.
arXiv Detail & Related papers (2025-05-20T14:31:44Z)
Quantifying Memory Utilization with Effective State-Size [73.52115209375343]
We develop a measure of memory utilization. This metric is tailored to the fundamental class of systems with input-invariant and input-varying linear operators.
arXiv Detail & Related papers (2025-04-28T08:12:30Z)
BEACON: A Benchmark for Efficient and Accurate Counting of Subgraphs [18.281284442275457]
We introduce BEACON: a benchmark designed to rigorously evaluate both algorithmic (AL) and machine learning (ML) subgraph counting methods. BEACON provides a standardized dataset with verified ground truths, an integrated evaluation environment, and a public leaderboard. Our experiments reveal that while AL methods excel in efficiently counting subgraphs on very large graphs, they struggle with complex patterns.
arXiv Detail & Related papers (2025-04-15T07:53:47Z)
Graphint: Graph-based Time Series Clustering Visualisation Tool [21.763409747687348]
Graphint is an innovative system based on the $k$-Graph methodology. It integrates a robust time series clustering algorithm with an interactive tool for comparison and interpretation.
arXiv Detail & Related papers (2025-03-10T17:20:02Z)
Scalable Graph Attention-based Instance Selection via Mini-Batch Sampling and Hierarchical Hashing [0.24578723416255752]
Instance selection (IS) is important in machine learning for reducing dataset size while keeping key characteristics. This paper introduces a graph attention-based instance selection (GAIS) method that uses attention mechanisms to identify informative instances. We present two approaches for scalable graph construction: a distance-based mini-batch sampling technique that reduces computation through strategic batch processing, and a hierarchical hashing approach that allows for efficient similarity computation through random projections.
arXiv Detail & Related papers (2025-02-27T17:17:53Z)
FastGAS: Fast Graph-based Annotation Selection for In-Context Learning [53.17606395275021]
In-context learning (ICL) empowers large language models (LLMs) to tackle new tasks by using a series of training instances as prompts. Existing methods have proposed to select a subset of unlabeled examples for annotation. We propose a graph-based selection method, FastGAS, designed to efficiently identify high-quality instances.
arXiv Detail & Related papers (2024-06-06T04:05:54Z)
Lemur: Log Parsing with Entropy Sampling and Chain-of-Thought Merging [33.522495018321386]
We introduce a cutting-edge Log parsing framework with Entropy sampling and Chain-of-Thought Merging (Lemur). We propose a novel sampling method inspired by information entropy, which efficiently clusters typical logs. Lemur achieves state-of-the-art performance and impressive efficiency.
arXiv Detail & Related papers (2024-02-28T09:51:55Z)
Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization. We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data. We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
MLAD: A Unified Model for Multi-system Log Anomaly Detection [35.68387377240593]
We propose MLAD, a novel anomaly detection model that incorporates semantic relational reasoning across multiple systems. Specifically, we employ Sentence-BERT to capture the similarities between log sequences and convert them into high-dimensional learnable semantic vectors. We revamp the formulas of the Attention layer to discern the significance of each keyword in the sequence and model the overall distribution of the multi-system dataset.
arXiv Detail & Related papers (2024-01-15T12:51:13Z)
Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking [61.69892497726235]
Composite Node Message Passing Network (CoNo-Link) is a framework for modeling information across ultra-long frame sequences for association. In addition to the previous method of treating objects as nodes, the network treats object trajectories as nodes for information interaction. Our model can learn better predictions on longer time scales by adding composite nodes.
arXiv Detail & Related papers (2023-12-14T14:00:30Z)
Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data. We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z)
ClusterLog: Clustering Logs for Effective Log-based Anomaly Detection [3.3196401064045014]
This study proposes ClusterLog, a log pre-processing method that clusters the temporal sequence of log keys based on their semantic similarity. By grouping semantically and sentimentally similar logs, this approach aims to represent log sequences with the smallest number of unique log keys, so that a downstream sequence-based model can learn the log patterns more effectively.
arXiv Detail & Related papers (2023-01-19T01:54:48Z)
Task-aware Similarity Learning for Event-triggered Time Series [25.101509208153804]
The overarching goal of this paper is to develop an unsupervised learning framework that is capable of learning task-aware similarities among unlabeled event-triggered time series. The proposed framework aspires to offer a stepping stone that gives rise to a systematic approach to model and learn similarities among a multitude of event-triggered time series.
arXiv Detail & Related papers (2022-07-17T12:54:10Z)
Inferring Unobserved Events in Systems With Shared Resources and Queues [0.8602553195689513]
Real-life systems often record only a subset of all events taking place. To understand and analyze the behavior of processes with shared resources, we aim to reconstruct bounds for timestamps of events that must have happened but were not recorded. We use linear programming over entity traces to derive the timestamps of unobserved events in an efficient manner.
arXiv Detail & Related papers (2021-02-27T09:34:01Z)
Learned Factor Graphs for Inference from Stationary Time Sequences [107.63351413549992]
We propose a framework that combines model-based algorithms and data-driven ML tools for stationary time sequences. Neural networks are developed to separately learn specific components of a factor graph describing the distribution of the time sequence. We present an inference algorithm based on learned stationary factor graphs, which learns to implement the sum-product scheme from labeled data.
arXiv Detail & Related papers (2020-06-05T07:06:19Z)
Self-Supervised Log Parsing [59.04636530383049]
Large-scale software systems generate massive volumes of semi-structured log records. Existing approaches rely on log-specifics or manual rule extraction. We propose NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
arXiv Detail & Related papers (2020-03-17T19:25:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.