LLM-based event log analysis techniques: A survey
- URL: http://arxiv.org/abs/2502.00677v1
- Date: Sun, 02 Feb 2025 05:28:17 GMT
- Title: LLM-based event log analysis techniques: A survey
- Authors: Siraaj Akhtar, Saad Khan, Simon Parkinson
- Abstract summary: Event logs record key information on activities that occur on computing devices.
Researchers have developed automated techniques to improve the event log analysis process.
This paper aims to survey LLM-based event log analysis techniques.
- Abstract: Event log analysis is an important task that security professionals undertake. Event logs record key information on activities that occur on computing devices, and, due to the substantial number of events generated, they consume a large amount of time and resources to analyse. This demanding and repetitive task is also prone to errors. To address these concerns, researchers have developed automated techniques to improve the event log analysis process. Large Language Models (LLMs) have recently demonstrated the ability to perform, to a high standard, a wide range of tasks that individuals would usually undertake, at a pace and degree of complexity that outperform humans. Consequently, researchers are rapidly investigating the use of LLMs for event log analysis, applying techniques such as fine-tuning, Retrieval-Augmented Generation (RAG) and in-context learning, each of which affects performance. These works demonstrate good progress, yet there is a need to understand the developing body of knowledge, identify commonalities between works, and identify key challenges and potential solutions to guide further developments in this domain. This paper surveys LLM-based event log analysis techniques, providing readers with an in-depth overview of the domain and the gaps identified in previous research, and concluding with potential avenues for future exploration.
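Of the techniques the abstract names, in-context learning is the simplest to illustrate: the analyst places a handful of labelled log lines in the prompt and asks the model to classify a new one. The sketch below only assembles such a prompt; the example log lines, labels, and prompt wording are hypothetical illustrations, not drawn from any of the surveyed papers, and a real system would pass the resulting string to an LLM API rather than printing it.

```python
# Minimal sketch of few-shot (in-context learning) prompt construction
# for event log classification. All logs and labels are invented examples.

FEW_SHOT_EXAMPLES = [
    ("sshd[2201]: Failed password for root from 10.0.0.5 port 4422", "suspicious"),
    ("systemd[1]: Started Daily apt upgrade and clean activities.", "benign"),
    ("sshd[2209]: Accepted publickey for alice from 192.168.1.7", "benign"),
]

def build_prompt(event: str) -> str:
    """Assemble a few-shot classification prompt for one log event."""
    lines = ["Classify each event log line as 'benign' or 'suspicious'.", ""]
    for log, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Log: {log}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The unlabelled query event goes last; the model completes the label.
    lines.append(f"Log: {event}")
    lines.append("Label:")
    return "\n".join(lines)

prompt = build_prompt(
    "sshd[2300]: Failed password for admin from 203.0.113.9 port 5100"
)
print(prompt)
```

In this pattern no model weights change, which is the key contrast with fine-tuning; RAG would additionally retrieve relevant context (e.g. similar historical logs) to include in the prompt.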
Related papers
- The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead?
Large Language Models (LLMs) have shown capabilities close to human performance in various analytical tasks.
This paper investigates the efficiency and accuracy of LLMs in specialized tasks through a structured user study focusing on Human-LLM partnership.
arXiv Detail & Related papers (2024-10-07T02:30:18Z) - Data Analysis in the Era of Generative AI
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges.
We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow.
We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z) - The Imperative of Conversation Analysis in the Era of LLMs: A Survey of Tasks, Techniques, and Trends
Conversation Analysis (CA) strives to uncover and analyze critical information from conversation data.
In this paper, we perform a thorough review and systematize the CA task to summarize the existing related work.
We derive four key steps of CA: conversation scene reconstruction, in-depth attribution analysis, targeted training, and finally conversation generation.
arXiv Detail & Related papers (2024-09-21T16:52:43Z) - Using Large Language Models for Template Detection from Security Event Logs
Event log analysis techniques are essential for the timely detection of cyber attacks and for assisting security experts with the analysis of past security incidents.
The detection of line patterns or templates from unstructured textual event logs has been identified as an important task of event log analysis.
This paper investigates the application of Large Language Models (LLMs) for unsupervised detection of templates from unstructured security event logs.
arXiv Detail & Related papers (2024-09-08T10:06:54Z) - Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data
We introduce an extended concept of memorization, distributional memorization, which measures the correlation between the output probabilities and the pretraining data frequency.
We show that memorization plays a larger role in simpler, knowledge-intensive tasks, while generalization is the key for harder, reasoning-based tasks.
arXiv Detail & Related papers (2024-07-20T21:24:40Z) - DISCOVER: A Data-driven Interactive System for Comprehensive Observation, Visualization, and ExploRation of Human Behaviour
We introduce a modular, flexible, yet user-friendly software framework specifically developed to streamline computational-driven data exploration for human behavior analysis.
Our primary objective is to democratize access to advanced computational methodologies, thereby enabling researchers across disciplines to engage in detailed behavioral analysis without the need for extensive technical proficiency.
arXiv Detail & Related papers (2024-07-18T11:28:52Z) - LogEval: A Comprehensive Benchmark Suite for Large Language Models In Log Analysis
We introduce LogEval, a benchmark suite designed to evaluate the capabilities of Large Language Models in log analysis tasks.
This benchmark covers tasks such as log parsing, log anomaly detection, log fault diagnosis, and log summarization.
LogEval evaluates each task using 4,000 publicly available log data entries and employs 15 different prompts for each task to ensure a thorough and fair assessment.
arXiv Detail & Related papers (2024-07-02T02:39:33Z) - Characterization of Large Language Model Development in the Datacenter
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z) - The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions
We analyze a large-scale collection of real user queries to GPT.
We find that tasks such as "design" and "planning" are prevalent in user interactions but are largely neglected by, or differ from, traditional NLP benchmarks.
arXiv Detail & Related papers (2023-10-19T02:12:17Z) - Learning Representations on Logs for AIOps
Large Language Models (LLMs) are trained using self-supervision on a vast amount of unlabeled data.
This paper introduces an LLM for log data, trained on public and proprietary log data.
The proposed model offers superior performance on multiple downstream tasks.
arXiv Detail & Related papers (2023-08-18T20:34:46Z) - A Comprehensive Survey of Logging in Software: From Logging Statements Automation to Log Mining and Analysis
We study a large number of conference and journal papers that appeared on top-level peer-reviewed venues.
We provide a set of challenges and opportunities that will lead the researchers in academia and industry in moving the field forward.
arXiv Detail & Related papers (2021-10-24T17:15:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.