Information-Dense Reasoning for Efficient and Auditable Security Alert Triage
- URL: http://arxiv.org/abs/2512.08169v1
- Date: Tue, 09 Dec 2025 01:57:24 GMT
- Title: Information-Dense Reasoning for Efficient and Auditable Security Alert Triage
- Authors: Guangze Zhao, Yongzheng Zhang, Changbo Tian, Dan Xie, Hongri Liu, Bailing Wang
- Abstract summary: Security Operations Centers face massive, heterogeneous alert streams under minute-level service windows. Existing solutions fail: signature systems are brittle, anomaly methods lack actionability, and fully cloud-hosted LLMs raise latency, cost, and privacy concerns. We propose AIDR, a hybrid cloud-edge framework that addresses this trade-off through constrained information-density optimization.
- Score: 5.3761282240937796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Security Operations Centers face massive, heterogeneous alert streams under minute-level service windows, creating the Alert Triage Latency Paradox: verbose reasoning chains ensure accuracy and compliance but incur prohibitive latency and token costs, while minimal chains sacrifice transparency and auditability. Existing solutions fail: signature systems are brittle, anomaly methods lack actionability, and fully cloud-hosted LLMs raise latency, cost, and privacy concerns. We propose AIDR, a hybrid cloud-edge framework that addresses this trade-off through constrained information-density optimization. The core innovation is gradient-based compression of reasoning chains to retain only decision-critical steps--minimal evidence sufficient to justify predictions while respecting token and latency budgets. We demonstrate that this approach preserves decision-relevant information while minimizing complexity. We construct compact datasets by distilling alerts into 3-5 high-information bullets (68% token reduction), train domain-specialized experts via LoRA, and deploy a cloud-edge architecture: a cloud LLM routes alerts to on-premises experts generating SOAR-ready JSON. Experiments demonstrate AIDR achieves higher accuracy and 40.6% latency reduction versus Chain-of-Thought, with robustness to data corruption and out-of-distribution generalization, enabling auditable and efficient SOC triage with full data residency compliance.
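The cloud-edge flow described in the abstract (a cloud LLM routes each alert to an on-premises domain expert, which emits compact, SOAR-ready JSON built from a few decision-critical evidence bullets) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the expert names, field names, and routing/verdict logic are all assumptions for the example.

```python
import json

# Hypothetical domain experts; the paper's actual expert taxonomy is not given here.
EXPERTS = {"malware", "phishing", "lateral_movement"}

def route_alert(alert: dict) -> str:
    """Stand-in for the cloud LLM router: map an alert to an on-prem expert."""
    category = alert.get("category", "")
    return category if category in EXPERTS else "malware"

def triage(alert: dict, max_bullets: int = 5) -> str:
    """Stand-in for an on-prem expert: keep only decision-critical evidence
    bullets (3-5 in the paper) and emit SOAR-ready JSON."""
    bullets = alert.get("evidence", [])[:max_bullets]
    verdict = {
        "alert_id": alert["id"],
        "expert": route_alert(alert),
        # Toy decision rule for illustration only.
        "verdict": "escalate" if len(bullets) >= 3 else "close",
        "evidence": bullets,
    }
    return json.dumps(verdict)

result = json.loads(triage({
    "id": "A-1",
    "category": "phishing",
    "evidence": ["spoofed sender", "credential form", "newly registered domain",
                 "URL shortener"],
}))
```

The key property the framework targets is that the JSON payload carries only the minimal evidence needed to justify the verdict, keeping both token cost and audit surface small.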
Related papers
- Bridging the Perception Gap: A Lightweight Coarse-to-Fine Architecture for Edge Audio Systems [10.143590597259792]
CoFi-Agent is a hybrid architecture targeting edge servers and gateways. It performs fast local perception and triggers conditional forensic refinement only when uncertainty is detected. On the MMAR benchmark, CoFi-Agent improves accuracy from 27.20% to 53.60%, while achieving a better accuracy-efficiency trade-off than an always-on investigation pipeline.
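The coarse-to-fine gating pattern above can be sketched as a confidence-thresholded two-stage pipeline. This is an illustrative sketch, not CoFi-Agent's actual code; both model functions and the threshold are hypothetical.

```python
def coarse_model(x: float) -> tuple[str, float]:
    """Fast local perception: returns (label, confidence). Hypothetical stand-in."""
    return ("event", 0.9) if x > 0.5 else ("event?", 0.4)

def fine_model(x: float) -> str:
    """Expensive forensic refinement, invoked only on uncertainty. Hypothetical."""
    return "event" if x > 0.3 else "noise"

def coarse_to_fine(x: float, threshold: float = 0.6) -> str:
    label, conf = coarse_model(x)
    if conf >= threshold:
        return label      # cheap path: confident local answer, no refinement
    return fine_model(x)  # conditional refinement only when uncertain
```

The efficiency win comes from the cheap path handling the confident majority of inputs, so the always-on cost of the forensic stage is avoided.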
arXiv Detail & Related papers (2026-01-22T05:57:25Z) - Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding [58.92526489742584]
We propose a provably lossless verification method that significantly boosts the expected number of accepted tokens. We show that HSD yields consistent improvements in acceptance rates across diverse model families and benchmarks.
arXiv Detail & Related papers (2026-01-09T11:10:29Z) - Differential Privacy for Secure Machine Learning in Healthcare IoT-Cloud Systems [11.452686322120428]
We propose a multi-layer IoT, Edge and Cloud architecture to enhance the speed of response for emergency healthcare. Privacy of patient data is assured by proposing a Differential Privacy framework across several machine learning models.
arXiv Detail & Related papers (2025-12-11T08:37:37Z) - CoSense-LLM: Semantics at the Edge with Cost- and Uncertainty-Aware Cloud-Edge Cooperation [0.0]
CoSense-LLM is an edge-first framework that turns continuous multimodal sensor streams into compact semantic tokens. The system works with modern serving optimizations, including paged or streaming KV caches, Flash-style kernels, speculative decoding, and quantized LoRA adapters.
arXiv Detail & Related papers (2025-10-22T15:16:56Z) - Intra-request branch orchestration for efficient LLM reasoning [52.68946975865865]
Large Language Models (LLMs) increasingly rely on inference-time reasoning algorithms to improve accuracy on complex tasks. Prior work has largely focused on reducing token usage, often at the expense of accuracy, while overlooking other latency factors. We present DUCHESS, an LLM serving system that reduces cost and latency without sacrificing accuracy through intra-request branch orchestration guided by predictions.
arXiv Detail & Related papers (2025-09-29T15:52:08Z) - Dynamic Speculative Agent Planning [57.630218933994534]
Large language-model-based agents face critical deployment challenges due to prohibitive latency and inference costs. We introduce Dynamic Speculative Planning (DSP), an online reinforcement learning framework that provides lossless acceleration with substantially reduced costs. Experiments on two standard agent benchmarks demonstrate that DSP achieves comparable efficiency to the fastest acceleration method while reducing total cost by 30% and unnecessary cost up to 60%.
arXiv Detail & Related papers (2025-09-02T03:34:36Z) - Efficient and Verifiable Privacy-Preserving Convolutional Computation for CNN Inference with Untrusted Clouds [1.1545092788508224]
We propose a verifiable privacy-preserving scheme tailored for CNN convolutional layers. Our scheme enables efficient encryption and decryption, allowing resource-constrained clients to securely offload computations to the untrusted cloud server.
arXiv Detail & Related papers (2025-08-18T11:17:53Z) - Task-Oriented Feature Compression for Multimodal Understanding via Device-Edge Co-Inference [54.53508601749513]
We propose a task-oriented feature compression (TOFC) method for multimodal understanding in a device-edge co-inference framework. To enhance compression efficiency, multiple entropy models are adaptively selected based on the characteristics of the visual features. Results show that TOFC achieves up to 52% reduction in data transmission overhead and 63% reduction in system latency.
arXiv Detail & Related papers (2025-03-17T08:37:22Z) - How Breakable Is Privacy: Probing and Resisting Model Inversion Attacks in Collaborative Inference [13.453033795109155]
Collaborative inference improves computational efficiency for edge devices by transmitting intermediate features to cloud models. However, there is no established criterion for assessing the difficulty of model inversion attacks (MIAs). We propose the first theoretical criterion to assess MIA difficulty in CI, identifying mutual information, entropy, and effective information volume as key influencing factors.
arXiv Detail & Related papers (2025-01-01T13:00:01Z) - Privacy-Preserving Distributed Learning for Residential Short-Term Load Forecasting [11.185176107646956]
Power system load data can inadvertently reveal the daily routines of residential users, posing a risk to their property security.
We introduce a Markovian Switching-based distributed training framework, the convergence of which is substantiated through rigorous theoretical analysis.
Case studies employing real-world power system load data validate the efficacy of our proposed algorithm.
arXiv Detail & Related papers (2024-02-02T16:39:08Z) - TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection [59.04634695294402]
Video anomaly detection (VAD) without human monitoring is a complex computer vision task.
Privacy leakage in VAD allows models to pick up and amplify unnecessary biases related to people's personal information.
We propose TeD-SPAD, a privacy-aware video anomaly detection framework that destroys visual private information in a self-supervised manner.
arXiv Detail & Related papers (2023-08-21T22:42:55Z) - Over-the-Air Federated Learning with Privacy Protection via Correlated Additive Perturbations [57.20885629270732]
We consider privacy aspects of wireless federated learning with Over-the-Air (OtA) transmission of gradient updates from multiple users/agents to an edge server.
Traditional perturbation-based methods provide privacy protection while sacrificing training accuracy.
In this work, we aim at minimizing privacy leakage to the adversary and the degradation of model accuracy at the edge server.
arXiv Detail & Related papers (2022-10-05T13:13:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.