MicLog: Towards Accurate and Efficient LLM-based Log Parsing via Progressive Meta In-Context Learning
- URL: http://arxiv.org/abs/2601.07005v1
- Date: Sun, 11 Jan 2026 17:46:10 GMT
- Title: MicLog: Towards Accurate and Efficient LLM-based Log Parsing via Progressive Meta In-Context Learning
- Authors: Jianbo Yu, Yixuan Li, Hai Xu, Kang Xu, Junjielong Xu, Zhijing Li, Pinjia He, Wanyuan Wang
- Abstract summary: Recent large language model (LLM)-based parsers leverage in-context learning (ICL) to extract semantics from examples, demonstrating superior accuracy. We present MicLog, the first progressive meta in-context learning (ProgMeta-ICL) log parsing framework that combines meta-learning with ICL on small open-source LLMs. MicLog achieves 10.3% higher parsing accuracy than the state of the art, while reducing parsing time by 42.4%.
- Score: 26.849902115105255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Log parsing converts semi-structured logs into structured templates, forming a critical foundation for downstream analysis. Traditional syntax- and semantics-based parsers often struggle with semantic variations in evolving logs and with data scarcity stemming from their limited domain coverage. Recent large language model (LLM)-based parsers leverage in-context learning (ICL) to extract semantics from examples, demonstrating superior accuracy. However, LLM-based parsers face two main challenges: 1) underutilization of ICL capabilities, particularly in dynamic example selection and cross-domain generalization, leading to inconsistent performance; 2) time-consuming and costly LLM querying. To address these challenges, we present MicLog, the first progressive meta in-context learning (ProgMeta-ICL) log parsing framework that combines meta-learning with ICL on small open-source LLMs (i.e., Qwen-2.5-3B). Specifically, MicLog: i) enhances LLMs' ICL capability through a zero-shot to k-shot ProgMeta-ICL paradigm, employing weighted DBSCAN candidate sampling and enhanced BM25 demonstration selection; ii) accelerates parsing via a multi-level pre-query cache that dynamically matches and refines recently parsed templates. Evaluated on Loghub-2.0, MicLog achieves 10.3% higher parsing accuracy than the state-of-the-art parser while reducing parsing time by 42.4%.
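A minimal sketch of the two ICL components the abstract names, weighted DBSCAN candidate sampling and BM25 demonstration selection. The TF-IDF features, frequency weights, and hyperparameters below are illustrative assumptions, not MicLog's published design:

```python
from collections import Counter

from rank_bm25 import BM25Okapi                      # pip install rank-bm25
from sklearn.cluster import DBSCAN
from sklearn.feature_extraction.text import TfidfVectorizer


def sample_candidates(logs, eps=0.5, min_samples=2):
    """Cluster raw logs with frequency-weighted DBSCAN and keep one
    representative per cluster as an ICL candidate."""
    counts = Counter(logs)
    unique = list(counts)
    X = TfidfVectorizer(token_pattern=r"\S+").fit_transform(unique)
    labels = DBSCAN(eps=eps, min_samples=min_samples, metric="cosine").fit(
        X, sample_weight=[counts[m] for m in unique]   # duplicates weigh more
    ).labels_
    reps = {}
    for msg, label in zip(unique, labels):
        reps.setdefault(label, msg)   # first member represents the cluster
    return list(reps.values())


def select_demonstrations(query, candidates, k=3):
    """Rank candidates against the query with BM25; the top-k become the
    demonstrations of the k-shot prompt."""
    scores = BM25Okapi([c.split() for c in candidates]).get_scores(query.split())
    ranked = sorted(zip(scores, candidates), key=lambda p: -p[0])
    return [c for _, c in ranked[:k]]
```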
Related papers
- DiffuRank: Effective Document Reranking with Diffusion Language Models [71.16830004674513]
We propose DiffuRank, a reranking framework built upon diffusion language models (dLLMs).
dLLMs support more flexible decoding and generation processes that are not constrained to a left-to-right order.
We show dLLMs achieve performance comparable to, and in some cases exceeding, that of autoregressive LLMs with similar model sizes.
arXiv Detail & Related papers (2026-02-13T02:18:14Z) - LLM-SrcLog: Towards Proactive and Unified Log Template Extraction via Large Language Models [19.933913707655467]
LLM-SrcLog is a proactive and unified framework for log template parsing.
It extracts templates directly from source code prior to deployment.
It supplements them with data-driven parsing for logs without available code.
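A hedged sketch of the "templates from source code" idea in this summary: scan logging calls and replace format placeholders with wildcards. The regexes and the `<*>` placeholder convention are assumptions for illustration:

```python
import re

# Matches e.g. logger.info("...") / log.error('...') and captures the format string.
LOG_CALL = re.compile(
    r'log(?:ger)?\.(?:debug|info|warn|warning|error)\(\s*[\'"](.*?)[\'"]', re.I
)
PLACEHOLDER = re.compile(r'%[sdif]|\{[^}]*\}')   # %s-style and {}-style placeholders


def templates_from_source(source: str) -> list[str]:
    """Turn format placeholders in logging statements into <*> wildcards."""
    return [PLACEHOLDER.sub("<*>", fmt) for fmt in LOG_CALL.findall(source)]


print(templates_from_source('logger.info("Connected to %s on port %d")'))
# -> ['Connected to <*> on port <*>']
```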
arXiv Detail & Related papers (2025-12-04T05:30:15Z) - Last Layer Logits to Logic: Empowering LLMs with Logic-Consistent Structured Knowledge Reasoning [55.55968342644846]
Large Language Models (LLMs) achieve excellent performance in natural language reasoning tasks through pre-training on vast unstructured text.
We propose the Logits-to-Logic framework, which incorporates logits strengthening and logits filtering as core modules to correct logical defects in LLM outputs.
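The summary does not spell out the two modules, but a generic logit-level intervention of this flavor might look like the sketch below, where masking plays the role of filtering and a score offset plays the role of strengthening; both rules are assumptions, not the paper's actual design:

```python
import torch


def adjust_logits(logits: torch.Tensor, allowed_ids: torch.Tensor,
                  boost: float = 2.0) -> torch.Tensor:
    """Mask tokens outside the allowed set, then strengthen the rest."""
    out = torch.full_like(logits, float("-inf"))    # filtering: disallow everything...
    out[allowed_ids] = logits[allowed_ids] + boost  # ...strengthening for allowed tokens
    return out
```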
arXiv Detail & Related papers (2025-11-11T07:08:27Z) - LLM-guided Hierarchical Retrieval [54.73080745446999]
LATTICE is a hierarchical retrieval framework that enables an LLM to reason over and navigate large corpora with logarithmic search complexity.
A central challenge in such LLM-guided search is that the model's relevance judgments are noisy, context-dependent, and unaware of the hierarchy.
Our framework achieves state-of-the-art zero-shot performance on the reasoning-intensive BRIGHT benchmark.
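A sketch of the LLM-guided descent the summary describes: score each child of the current node and recurse into the best one, so the number of LLM calls grows with tree depth (logarithmic in corpus size) rather than corpus size. `llm_score` is a hypothetical stand-in for an LLM relevance call:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    summary: str                                          # LLM-readable subtree description
    children: list["Node"] = field(default_factory=list)
    documents: list[str] = field(default_factory=list)   # leaves hold documents


def navigate(root: Node, query: str, llm_score) -> list[str]:
    """Greedy descent: score children at each level, recurse into the best."""
    node = root
    while node.children:
        node = max(node.children, key=lambda c: llm_score(query, c.summary))
    return node.documents
```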
arXiv Detail & Related papers (2025-10-15T07:05:17Z) - InferLog: Accelerating LLM Inference for Online Log Parsing via ICL-oriented Prefix Caching [38.87172392333867]
We present InferLog, the first inference optimization method for online log parsing.
InferLog accelerates inference with a Prefix-aware ICL Refinement policy that refines the selection and permutation of in-context examples to improve prefix caching efficiency.
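One way to realize the prefix-caching intuition, sketched under assumptions (the actual refinement policy is not described here): keep the selected demonstrations in a fixed global order so consecutive prompts share the longest possible prefix and the serving engine can reuse cached KV states:

```python
def build_prompt(query: str, demos: list[str], global_rank: dict[str, int]) -> str:
    """Order demonstrations by a fixed global rank rather than per-query
    relevance, so prompts for different queries share a long common prefix
    and the serving engine can reuse the cached KV states for it."""
    ordered = sorted(demos, key=global_rank.__getitem__)
    return "\n".join(ordered) + f"\n\nParse this log: {query}"
```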
arXiv Detail & Related papers (2025-07-11T12:21:29Z) - Unleashing the Power of LLMs in Dense Retrieval with Query Likelihood Modeling [69.84963245729826]
We propose an auxiliary query likelihood (QL) task to enhance the backbone for subsequent contrastive learning of the retriever.
We introduce our model, which incorporates two key components: Attention Block (AB) and Document Corruption (DC).
arXiv Detail & Related papers (2025-04-07T16:03:59Z) - LibreLog: Accurate and Efficient Unsupervised Log Parsing Using Open-Source Large Language Models [3.7960472831772774]
This paper introduces LibreLog, an unsupervised log parsing approach that enhances privacy and reduces operational costs while achieving state-of-the-art parsing accuracy.
Our evaluation on LogHub-2.0 shows that LibreLog achieves 25% higher parsing accuracy and processes 2.7 times faster compared to state-of-the-art LLMs.
arXiv Detail & Related papers (2024-08-02T21:54:13Z) - Log Parsing using LLMs with Self-Generated In-Context Learning and Self-Correction [15.93927602769091]
The recent emergence of large language models (LLMs) has demonstrated strong abilities in understanding natural language and code.
Ada is an effective and adaptive log parsing framework using LLMs with self-generated in-context learning (SG-ICL) and self-correction.
Ada outperforms state-of-the-art methods across all metrics, even in zero-shot scenarios.
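A hedged sketch of the self-correction loop this summary implies: parse, verify that the produced template actually matches the log, and re-prompt with the failure as feedback. `query_llm` is a hypothetical stand-in, and the verification rule is an assumption:

```python
import re


def template_matches(template: str, log: str) -> bool:
    """Check the template against the log by turning <*> into a wildcard."""
    pattern = re.escape(template).replace(re.escape("<*>"), r".+?")
    return re.fullmatch(pattern, log) is not None


def parse_with_self_correction(log: str, query_llm, max_tries: int = 3) -> str:
    prompt = f"Extract the template of this log:\n{log}"
    for _ in range(max_tries):
        template = query_llm(prompt)
        if template_matches(template, log):
            return template
        # Self-correction: feed the failed attempt back as feedback.
        prompt += f"\nYour answer {template!r} does not match the log; try again."
    return template  # fall back to the last attempt
```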
arXiv Detail & Related papers (2024-06-05T15:31:43Z) - Leveraging Code to Improve In-context Learning for Semantic Parsing [48.66031267718704]
In-context learning (ICL) is an appealing approach for semantic parsing due to its few-shot nature and improved generalization.
We improve the effectiveness of ICL for semantic parsing by (1) using general-purpose programming languages such as Python instead of DSLs, and (2) augmenting prompts with a structured domain description.
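A sketch of the two ideas in this summary, targeting Python rather than a DSL and prepending a structured domain description; the prompt layout is an assumption for illustration:

```python
def build_parsing_prompt(domain_api: list[str],
                         examples: list[tuple[str, str]],
                         utterance: str) -> str:
    """Prompt = structured domain description + few-shot (utterance, Python) pairs."""
    header = "# Available functions:\n" + "\n".join(domain_api)
    shots = "\n\n".join(f"# Request: {u}\n{code}" for u, code in examples)
    return f"{header}\n\n{shots}\n\n# Request: {utterance}\n"
```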
arXiv Detail & Related papers (2023-11-16T02:50:06Z) - LILAC: Log Parsing using LLMs with Adaptive Parsing Cache [38.04960745458878]
We propose LILAC, the first practical log parsing framework using large language models (LLMs) with adaptive parsing cache.
LLMs' lack of specialized log parsing capabilities currently hinders their parsing accuracy.
We show LILAC outperforms state-of-the-art methods by 69.5% in terms of the average F1 score of template accuracy.
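A sketch of the adaptive-cache idea: try cached templates first and only fall back to the LLM on a miss. A linear scan stands in for LILAC's actual tree-structured cache, and `query_llm` is hypothetical:

```python
import re


class ParsingCache:
    def __init__(self):
        self.templates: list[str] = []

    def match(self, log: str) -> str | None:
        """Return the first cached template whose pattern matches the log."""
        for t in self.templates:
            pattern = re.escape(t).replace(re.escape("<*>"), r".+?")
            if re.fullmatch(pattern, log):
                return t
        return None

    def parse(self, log: str, query_llm) -> str:
        cached = self.match(log)
        if cached:
            return cached                 # cache hit: no LLM call
        template = query_llm(log)         # cache miss: ask the LLM once
        self.templates.append(template)   # future duplicates hit the cache
        return template
```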
arXiv Detail & Related papers (2023-10-03T04:46:59Z) - Self-Supervised Log Parsing [59.04636530383049]
Large-scale software systems generate massive volumes of semi-structured log records.
Existing approaches rely on log-specific heuristics or manual rule extraction.
We propose NuLog, which utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
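A sketch of the masked-language-modeling formulation: mask each token in turn, and treat tokens the model reconstructs confidently as template constants and the rest as parameters. `predict_masked` (returning the probability of the original token) is a hypothetical stand-in:

```python
def parse_via_mlm(tokens: list[str], predict_masked, threshold: float = 0.5) -> str:
    """Tokens the model predicts with high probability are template constants;
    hard-to-predict tokens are treated as variable parameters (<*>)."""
    out = []
    for i, tok in enumerate(tokens):
        masked = tokens[:i] + ["<MASK>"] + tokens[i + 1:]
        p = predict_masked(masked, position=i, target=tok)
        out.append(tok if p >= threshold else "<*>")
    return " ".join(out)
```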
arXiv Detail & Related papers (2020-03-17T19:25:25Z)