Related papers: LogICL: Distilling LLM Reasoning to Bridge the Semantic Gap in Cross-Domain Log Anomaly Detection

LogICL: Distilling LLM Reasoning to Bridge the Semantic Gap in Cross-Domain Log Anomaly Detection

URL: http://arxiv.org/abs/2512.09627v1
Date: Wed, 10 Dec 2025 13:13:30 GMT
Title: LogICL: Distilling LLM Reasoning to Bridge the Semantic Gap in Cross-Domain Log Anomaly Detection
Authors: Jingwei Ye, Zhi Wang, Chenbin Su, Jieshuai Yang, Jiayi Ding, Chunbo Liu, Ge Chu,
Abstract summary: We propose LogICL, a framework distilling Large Language Model (LLM) reasoning into a lightweight encoder for cross-domain anomaly detection.<n>At inference, the optimized encoder retrieves reasoning-aware demonstrations using semantic similarity and delta scores.<n>Experiments on few-shot and zero-shot cross-domain benchmarks confirm LogICL achieves state-of-the-art performance across heterogeneous systems.
Score: 4.319103554448838
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Effective log anomaly detection is critical to sustaining reliability in large-scale IT infrastructures. Transformer-based models require substantial resources and labeled data, exacerbating the cold-start problem in target domains where logs are scarce. Existing cross-domain methods leverage source logs but struggle with generalization due to reliance on surface lexical similarity, failing to capture latent semantic equivalence amid structural divergences. To address this, we propose LogICL, a framework distilling Large Language Model (LLM) reasoning into a lightweight encoder for cross-domain anomaly detection. During training, LogICL constructs a delta matrix measuring the utility of demonstrations selected via Maximal Marginal Relevance relative to zero-shot inference. The encoder is optimized via a multi-objective loss comprising an ICL-Guided term that aligns representations based on reasoning assistance utility, maximum mean discrepancy for domain alignment, and supervised contrastive loss. At inference, the optimized encoder retrieves reasoning-aware demonstrations using semantic similarity and delta scores, enabling frozen-LLM in-context learning with Chain-of-Thought for accurate and interpretable detection. Experiments on few-shot and zero-shot cross-domain benchmarks confirm LogICL achieves state-of-the-art performance across heterogeneous systems. Further analysis via visualizations and case studies confirms LogICL bridges the semantic gap beyond surface lexical similarity, effectively capturing latent semantic equivalence for rapid deployment.

Related papers

LaSER: Internalizing Explicit Reasoning into Latent Space for Dense Retrieval [74.72139580745511]
LaSER is a novel self-distillation framework that internalizes explicit reasoning into the latent space of retrievers.<n>Our method successfully combines the reasoning depth of explicit CoT pipelines with the inference efficiency of standard dense retrievers.
arXiv Detail & Related papers (2026-03-02T04:11:18Z)
Log anomaly detection via Meta Learning and Prototypical Networks for Cross domain generalization [0.0]
We propose a new end-to-end framework based on a meta-learning approach to detect log anomalies.<n>Our methodology first gets the data ready by combining a Drain3 log parsing mechanism with a dynamic drift-based labeling technique.<n>Model Agnostic Meta-Learning (MAML) and Prototypical Networks models are trained to adapt quickly and effectively.
arXiv Detail & Related papers (2026-01-20T12:09:33Z)
From Laboratory to Real-World Applications: Benchmarking Agentic Code Reasoning at the Repository Level [38.24989792739013]
We present RepoReason, a diagnostic benchmark centered on abductive assertion verification.<n>We implement an execution-driven mutation framework that utilizes the environment as a semantic to regenerate ground-truth states.<n>Our findings provide granular white-box insights for optimizing the next generation of agentic software engineering.
arXiv Detail & Related papers (2026-01-07T09:22:28Z)
Accelerate Speculative Decoding with Sparse Computation in Verification [49.74839681322316]
Speculative decoding accelerates autoregressive language model inference by verifying multiple draft tokens in parallel.<n>Existing sparsification methods are designed primarily for standard token-by-token autoregressive decoding.<n>We propose a sparse verification framework that jointly sparsifies attention, FFN, and MoE components during the verification stage to reduce the dominant computation cost.
arXiv Detail & Related papers (2025-12-26T07:53:41Z)
Directional Attractors in LLM Reasoning: How Similarity Retrieval Steers Iterative Summarization Based Reasoning [0.0]
We introduce InftyThink with Cross-Chain Memory, an extension that augments iterative reasoning with an embedding-based semantic cache of previously successful reasoning patterns.<n> Experiments show that semantic lemma retrieval improves accuracy in structured domains while exposing failure modes in tests that include heterogeneous domains.
arXiv Detail & Related papers (2025-12-22T00:26:54Z)
Formal that "Floats" High: Formal Verification of Floating Point Arithmetic [0.0]
This paper presents a scalable methodology for verifying floating-point arithmetic using direct RTL-to-RTL model checking against a golden reference model.<n>The methodology is extended with agentic AI-based formal property generation, integrating large language model (LLM)-driven automation with Human-in-the-Loop (HITL) refinement.<n>Results show that direct RTL-to-RTL model checking achieves higher coverage efficiency and requires fewer assertions than standalone verification.
arXiv Detail & Related papers (2025-12-07T14:03:44Z)
RationAnomaly: Log Anomaly Detection with Rationality via Chain-of-Thought and Reinforcement Learning [27.235259453535537]
RationAnomaly is a novel framework that enhances log anomaly detection by synergizing Chain-of-Thought fine-tuning with reinforcement learning.<n>We have released the corresponding resources, including code and datasets.
arXiv Detail & Related papers (2025-09-18T07:35:58Z)
How Is LLM Reasoning Distracted by Irrelevant Context? An Analysis Using a Controlled Benchmark [43.7702541309867]
Grade School Math with Distracting Context is a benchmark to evaluate Large Language Models' (LLMs) reasoning against systematically controlled context (IC)<n>Our experiments demonstrate that LLMs are significantly sensitive to IC, affecting both reasoning path selection and arithmetic accuracy.<n>We propose a stepwise tree search guided by a process reward model, which notably enhances robustness in out-of-distribution conditions.
arXiv Detail & Related papers (2025-05-24T15:56:22Z)
Retrieval is Not Enough: Enhancing RAG Reasoning through Test-Time Critique and Optimization [58.390885294401066]
Retrieval-augmented generation (RAG) has become a widely adopted paradigm for enabling knowledge-grounded large language models (LLMs)<n>RAG pipelines often fail to ensure that model reasoning remains consistent with the evidence retrieved, leading to factual inconsistencies or unsupported conclusions.<n>We propose AlignRAG, a novel iterative framework grounded in Critique-Driven Alignment (CDA)<n>We introduce AlignRAG-auto, an autonomous variant that dynamically terminates refinement, removing the need to pre-specify the number of critique iterations.
arXiv Detail & Related papers (2025-04-21T04:56:47Z)
Let Synthetic Data Shine: Domain Reassembly and Soft-Fusion for Single Domain Generalization [68.41367635546183]
Single Domain Generalization aims to train models with consistent performance across diverse scenarios using data from a single source.<n>We propose Discriminative Domain Reassembly and Soft-Fusion (DRSF), a training framework leveraging synthetic data to improve model generalization.
arXiv Detail & Related papers (2025-03-17T18:08:03Z)
LUNAR: Unsupervised LLM-based Log Parsing [34.344687402936835]
We propose LUNAR, an unsupervised-based method for efficient and off-the-shelf log parsing. Our key insight is that while LLMs may struggle with direct log parsing, their performance can be significantly enhanced through comparative analysis. Experiments on large-scale public datasets demonstrate that LUNAR significantly outperforms state-of-the-art log crafts in terms of accuracy and efficiency.
arXiv Detail & Related papers (2024-06-11T11:32:01Z)
Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning [50.84938730450622]
We propose a trajectory-based method TV score, which uses trajectory volatility for OOD detection in mathematical reasoning. Our method outperforms all traditional algorithms on GLMs under mathematical reasoning scenarios. Our method can be extended to more applications with high-density features in output spaces, such as multiple-choice questions.
arXiv Detail & Related papers (2024-05-22T22:22:25Z)
Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization. We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data. We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies. A neural network is employed to act as structure prior and reveal the underlying signal interdependencies. Deep unrolling and Deep equilibrium based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
arXiv Detail & Related papers (2022-03-29T21:00:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.