Rethinking Provenance Completeness with a Learning-Based Linux Scheduler
- URL: http://arxiv.org/abs/2510.08479v2
- Date: Fri, 10 Oct 2025 21:24:58 GMT
- Title: Rethinking Provenance Completeness with a Learning-Based Linux Scheduler
- Authors: Jinsong Mao, Benjamin E. Ujcich, Shiqing Ma,
- Abstract summary: Provenance plays a critical role in maintaining traceability of a system's actions for root cause analysis of security threats and impacts.<n>Recent research has questioned whether existing provenance collection systems fail to ensure the security guarantees of a true reference monitor.<n>We introduce Aegis, a scheduler for Linux specifically designed for provenance.
- Score: 23.33056415010496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Provenance plays a critical role in maintaining traceability of a system's actions for root cause analysis of security threats and impacts. Provenance collection is often incorporated into the reference monitor of systems to ensure that an audit trail exists of all events, that events are completely captured, and that logging of such events cannot be bypassed. However, recent research has questioned whether existing state-of-the-art provenance collection systems fail to ensure the security guarantees of a true reference monitor due to the 'super producer threat' in which provenance generation can overload a system to force the system to drop security-relevant events and allow an attacker to hide their actions. One approach towards solving this threat is to enforce resource isolation, but that does not fully solve the problems resulting from hardware dependencies and performance limitations. In this paper, we show how an operating system's kernel scheduler can mitigate this threat, and we introduce Aegis, a learned scheduler for Linux specifically designed for provenance. Unlike conventional schedulers that ignore provenance completeness requirements, Aegis leverages reinforcement learning to learn provenance task behavior and to dynamically optimize resource allocation. We evaluate Aegis's efficacy and show that Aegis significantly improves both the completeness and efficiency of provenance collection systems compared to traditional scheduling, while maintaining reasonable overheads and even improving overall runtime in certain cases compared to the default Linux scheduler.
Related papers
- Detecting Object Tracking Failure via Sequential Hypothesis Testing [80.7891291021747]
Real-time online object tracking in videos constitutes a core task in computer vision.<n>We propose interpreting object tracking as a sequential hypothesis test, wherein evidence for or against tracking failures is gradually accumulated over time.<n>We propose both supervised and unsupervised variants by leveraging either ground-truth or solely internal tracking information.
arXiv Detail & Related papers (2026-02-13T14:57:15Z) - Patch-to-PoC: A Systematic Study of Agentic LLM Systems for Linux Kernel N-Day Reproduction [27.460244103362935]
We present the first large-scale study of LLM-based Linux kernel vulnerability reproduction.<n>Using kernel security patches as input, K-Repro automates end-to-end bug reproduction of N-day vulnerabilities in the Linux kernel.<n>Our results show that K-Repro can generate PoCs that reproduce over 50% of the cases with practical time and monetary cost.
arXiv Detail & Related papers (2026-02-07T00:34:08Z) - CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents [60.98294016925157]
AI agents are vulnerable to prompt injection attacks, where malicious content hijacks agent behavior to steal credentials or cause financial loss.<n>We introduce Single-Shot Planning for CUAs, where a trusted planner generates a complete execution graph with conditional branches before any observation of potentially malicious content.<n>Although this architectural isolation successfully prevents instruction injections, we show that additional measures are needed to prevent Branch Steering attacks.
arXiv Detail & Related papers (2026-01-14T23:06:35Z) - Operational Runtime Behavior Mining for Open-Source Supply Chain Security [2.9552238960255255]
HeteroGAT-Rank is an industry-oriented runtime behavior mining system.<n>It surfaces actionable runtime signals to guide manual investigation and threat hunting.<n>An evaluation on a large-scale OSS execution dataset shows that HeteroGAT-Rank effectively highlights meaningful and interpretable behavioral indicators.
arXiv Detail & Related papers (2026-01-11T15:14:18Z) - Agentic AI for Autonomous Defense in Software Supply Chain Security: Beyond Provenance to Vulnerability Mitigation [0.0]
The current paper includes an example of agentic artificial intelligence (AI) based on autonomous software supply chain security.<n>It combines large language model (LLM)-based reasoning, reinforcement learning (RL), and multi-agent coordination.<n>Results show that agentic AI can facilitate the transition to self defending, proactive software supply chains.
arXiv Detail & Related papers (2025-12-29T14:06:09Z) - A Dynamical Systems Framework for Reinforcement Learning Safety and Robustness Verification [1.104960878651584]
This paper introduces a novel framework that addresses the lack of formal methods for verifying the robustness and safety of learned policies.<n>By leveraging tools from dynamical systems theory, we identify and visualize Lagrangian Coherent Structures (LCS) that act as the hidden "skeleton" governing the system's behavior.<n>We show that this framework provides a comprehensive and interpretable assessment of policy behavior, successfully identifying critical flaws in policies that appear successful based on reward alone.
arXiv Detail & Related papers (2025-08-21T14:00:26Z) - DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z) - Safely Learning Controlled Stochastic Dynamics [61.82896036131116]
We introduce a method that ensures safe exploration and efficient estimation of system dynamics.<n>After training, the learned model enables predictions of the system's dynamics and permits safety verification of any given control.<n>We provide theoretical guarantees for safety and derive adaptive learning rates that improve with increasing Sobolev regularity of the true dynamics.
arXiv Detail & Related papers (2025-06-03T11:17:07Z) - Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection [56.66677293607114]
We propose Code-as-Monitor (CaM) for both open-set reactive and proactive failure detection.<n>To enhance the accuracy and efficiency of monitoring, we introduce constraint elements that abstract constraint-related entities.<n>Experiments show that CaM achieves a 28.7% higher success rate and reduces execution time by 31.8% under severe disturbances.
arXiv Detail & Related papers (2024-12-05T18:58:27Z) - Enhancing Attack Resilience in Real-Time Systems through Variable Control Task Sampling Rates [2.238622204691961]
We propose a novel schedule vulnerability analysis methodology, enabling runtime switching between valid schedules for various control task sampling rates.
We present the Multi-Rate Attack-Aware Randomized Scheduling (MAARS) framework for fixed-priority schedulers, designed to reduce the success rate of timing inference attacks on real-time systems.
arXiv Detail & Related papers (2024-08-01T07:25:15Z) - Monitoring Auditable Claims in the Cloud [0.0]
We propose a flexible monitoring approach that is independent of the implementation of the observed system.
Our approach is based on combining distributed Datalog-based programs with tamper-proof storage based on Trillian.
We apply our approach to an industrial use case that uses a cloud infrastructure for orchestrating unmanned air vehicles.
arXiv Detail & Related papers (2023-12-19T11:21:18Z) - Prov2vec: Learning Provenance Graph Representation for Unsupervised APT Detection [2.07180164747172]
It is necessary to detect Advanced Persistent Threats as early in the campaign as possible.
This paper proposes, Prov2Vec, a system for the continuous monitoring of enterprise host's behavior to detect attackers' activities.
arXiv Detail & Related papers (2023-10-02T01:38:13Z) - Safety Margins for Reinforcement Learning [53.10194953873209]
We show how to leverage proxy criticality metrics to generate safety margins.
We evaluate our approach on learned policies from APE-X and A3C within an Atari environment.
arXiv Detail & Related papers (2023-07-25T16:49:54Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z) - Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.