Model-based Script Synthesis for Fuzzing
- URL: http://arxiv.org/abs/2308.04115v1
- Date: Tue, 8 Aug 2023 08:07:50 GMT
- Title: Model-based Script Synthesis for Fuzzing
- Authors: Zian Liu, Chao Chen, Muhammad Ejaz Ahmed, Jun Zhang, Dongxi Liu
- Abstract summary: Existing approaches fuzz the kernel by modeling syscall sequences from traces or static analysis of system code.
We propose WinkFuzz, an approach to learn and mutate traced syscall sequences in order to reach different kernel states.
- Score: 10.739464605434977
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Kernel fuzzing is important for finding critical kernel vulnerabilities.
Closed-source (e.g., Windows) operating system kernel fuzzing is even more
challenging due to the lack of source code. Existing approaches fuzz the kernel
by modeling syscall sequences from traces or static analysis of system code.
However, a common limitation is that they do not learn and mutate the syscall
sequences to reach different kernel states, which could reveal more bugs or
crashes.
In this paper, we propose WinkFuzz, an approach to learn and mutate traced
syscall sequences in order to reach different kernel states. WinkFuzz learns
syscall dependencies from the trace, identifies potential syscalls in the trace
that can have dependent subsequent syscalls, and applies these dependencies to
insert additional syscalls into the trace while preserving them. WinkFuzz then
fuzzes the synthesized syscall sequence to find system crashes.
We applied WinkFuzz to four seed applications and found a total increase in
syscall number of 70.8%, with a success rate of 61%, within three insert
levels. The average times for tracing, dependency analysis, recovering the
model script, and synthesizing the script were 600, 39, 34, and 129 seconds,
respectively.
The instantaneous fuzzing rate is 3742 syscall executions per second. However,
the average fuzzing efficiency dropped to 155 syscall executions per second
once initialization time, waiting time, and other factors were taken into
account. We fuzzed each seed application for 24 seconds and, on average, obtained 12.25
crashes within that time frame.
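To make the dependency-guided insertion concrete, here is a minimal Python sketch of the idea under simplified assumptions: syscalls are modeled only by the resource types they produce and consume, the syscall names and templates are hypothetical, and a candidate is inserted only where every resource it consumes has already been produced earlier in the trace. This illustrates the technique described above, not WinkFuzz's actual implementation.

```python
# Minimal sketch of dependency-guided syscall insertion (illustrative only;
# not WinkFuzz's implementation). The syscall names and the handle-based
# dependency model are hypothetical simplifications.
import random
from dataclasses import dataclass, field

@dataclass
class Syscall:
    name: str
    produces: list = field(default_factory=list)   # resource types this call returns (e.g. "HANDLE")
    consumes: list = field(default_factory=list)   # resource types its arguments need

# Hypothetical templates of syscalls that could be inserted into a trace.
TEMPLATES = [
    Syscall("NtQueryInformationFile", consumes=["HANDLE"]),
    Syscall("NtSetInformationFile",   consumes=["HANDLE"]),
    Syscall("NtClose",                consumes=["HANDLE"]),
]

def dependency_satisfied(candidate, prefix):
    """A candidate may be inserted only if every resource it consumes is
    produced by some earlier syscall in the trace prefix."""
    available = {r for call in prefix for r in call.produces}
    return all(r in available for r in candidate.consumes)

def insert_level(trace, templates, rng):
    """One insertion level: try to place one dependent syscall after each
    position whose prefix can satisfy it, preserving the existing order."""
    new_trace = []
    for i, call in enumerate(trace):
        new_trace.append(call)
        candidates = [t for t in templates
                      if dependency_satisfied(t, trace[:i + 1])]
        if candidates:
            new_trace.append(rng.choice(candidates))
    return new_trace

if __name__ == "__main__":
    rng = random.Random(0)
    traced = [Syscall("NtCreateFile", produces=["HANDLE"]),
              Syscall("NtReadFile",   consumes=["HANDLE"])]
    synthesized = traced
    for _ in range(3):                  # three insert levels, as in the paper
        synthesized = insert_level(synthesized, TEMPLATES, rng)
    print([c.name for c in synthesized])
```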
Related papers
- KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution [59.20933707301566]
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate if ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z)
- Explore as a Storm, Exploit as a Raindrop: On the Benefit of Fine-Tuning Kernel Schedulers with Coordinate Descent [48.791943145735]
We show the potential to reduce Ansor's search time while enhancing kernel quality.
We apply this approach to the first 300 kernels that Ansor generates.
This result has been replicated in 20 well-known deep-learning models.
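A minimal sketch of coordinate descent over a kernel schedule's tuning knobs, assuming a hypothetical measure_latency callback that stands in for compiling and timing each candidate on hardware; this illustrates the general technique rather than the paper's exact procedure.

```python
# Generic coordinate-descent sketch for fine-tuning a kernel schedule's tuning
# knobs (illustrative; not the paper's exact algorithm). `measure_latency` is a
# hypothetical stand-in for compiling and timing a candidate schedule.
def coordinate_descent(schedule, search_space, measure_latency, sweeps=2):
    best = dict(schedule)
    best_cost = measure_latency(best)
    for _ in range(sweeps):
        for knob, values in search_space.items():   # optimize one knob at a time
            for v in values:
                candidate = dict(best, **{knob: v})
                cost = measure_latency(candidate)
                if cost < best_cost:
                    best, best_cost = candidate, cost
    return best, best_cost

if __name__ == "__main__":
    space = {"tile_x": [8, 16, 32], "tile_y": [8, 16, 32], "unroll": [0, 2, 4]}
    # Toy cost model standing in for real measurements on hardware.
    cost = lambda s: abs(s["tile_x"] - 16) + abs(s["tile_y"] - 32) + s["unroll"] * 0.1
    print(coordinate_descent({"tile_x": 8, "tile_y": 8, "unroll": 4}, space, cost))
```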
arXiv Detail & Related papers (2024-06-28T16:34:22Z)
- Making 'syscall' a Privilege not a Right [4.674007120771649]
nexpoline is a secure syscall interception mechanism that combines Memory Protection Keys (MPK) with Seccomp or Syscall User Dispatch (SUD).
It is more efficient than secure interception techniques such as ptrace because nexpoline can intercept syscalls securely through binary rewriting.
Notably, it operates without kernel modifications, making it viable on current Linux systems without needing root privileges.
arXiv Detail & Related papers (2024-06-11T16:33:56Z)
- RelayAttention for Efficient Large Language Model Serving with Long System Prompts [59.50256661158862]
This paper aims to improve the efficiency of LLM services that involve long system prompts.
Handling these system prompts requires heavily redundant memory accesses in existing causal attention algorithms.
We propose RelayAttention, an attention algorithm that allows reading hidden states from DRAM exactly once for a batch of input tokens.
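The core idea can be sketched in NumPy: attention over the shared system-prompt KV cache and over the per-request context are computed separately and then merged exactly via their log-sum-exp terms, so the system-prompt cache is reused across the whole batch. This is a simplified, single-head illustration, not the paper's fused GPU kernel, and all sizes below are made up.

```python
# NumPy sketch of prefix sharing for a long system prompt: compute attention
# over the shared prompt KV and over the per-request context separately, then
# merge the two partial results exactly using their log-sum-exp terms.
import numpy as np

def partial_attention(q, k, v):
    """Attention over one KV segment, plus the log-sum-exp needed for merging."""
    scores = q @ k.T / np.sqrt(q.shape[-1])          # (n_q, n_k)
    m = scores.max(axis=-1, keepdims=True)
    p = np.exp(scores - m)
    s = p.sum(axis=-1, keepdims=True)
    return (p / s) @ v, (m + np.log(s)).squeeze(-1)  # (n_q, d), (n_q,)

def relay_style_attention(q, k_sys, v_sys, k_ctx, v_ctx):
    """Merge system-prompt attention and request-context attention exactly."""
    o_sys, lse_sys = partial_attention(q, k_sys, v_sys)
    o_ctx, lse_ctx = partial_attention(q, k_ctx, v_ctx)
    w_sys = np.exp(lse_sys - np.logaddexp(lse_sys, lse_ctx))[:, None]
    return w_sys * o_sys + (1.0 - w_sys) * o_ctx

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 64
    # Shared system-prompt KV, computed once and reused by every request.
    k_sys, v_sys = rng.standard_normal((128, d)), rng.standard_normal((128, d))
    for _ in range(4):                               # a small batch of requests
        q = rng.standard_normal((1, d))
        k_ctx, v_ctx = rng.standard_normal((16, d)), rng.standard_normal((16, d))
        print(relay_style_attention(q, k_sys, v_sys, k_ctx, v_ctx).shape)
```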
arXiv Detail & Related papers (2024-02-22T18:58:28Z)
- KernelGPT: Enhanced Kernel Fuzzing via Large Language Models [9.860752730040709]
We propose KernelGPT, the first approach to automatically inferring Syzkaller specifications via Large Language Models.
Our preliminary results demonstrate that KernelGPT can help Syzkaller achieve higher coverage and find multiple previously unknown bugs.
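A rough sketch of an LLM-in-the-loop specification-inference workflow under assumed interfaces: query_llm and validate_spec are hypothetical placeholders for the model call and the Syzkaller-side validation, and the loop is a generic generate-validate-repair pattern rather than KernelGPT's exact pipeline.

```python
# Hypothetical generate-validate-repair loop for inferring syscall
# specifications with an LLM. `query_llm` and `validate_spec` are placeholders
# rather than real APIs, and the prompt format is made up for illustration.
def query_llm(prompt):
    """Placeholder for a real LLM API call returning a candidate spec."""
    raise NotImplementedError

def validate_spec(spec):
    """Placeholder for Syzkaller-side validation; returns a list of error strings."""
    raise NotImplementedError

def infer_spec(driver_source, max_rounds=3):
    prompt = ("Given this kernel driver source, write a syscall description "
              "for its ioctl interface:\n" + driver_source)
    for _ in range(max_rounds):
        spec = query_llm(prompt)
        errors = validate_spec(spec)
        if not errors:
            return spec                      # the spec validates cleanly
        # Feed validation errors back to the model and ask for a repaired spec.
        prompt += "\nThe previous attempt failed with:\n" + "\n".join(errors)
    return None
```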
arXiv Detail & Related papers (2023-12-31T18:47:33Z)
- DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training [82.06732962485754]
FlashAttention effectively reduces the quadratic peak memory usage to linear in training transformer-based large language models (LLMs) on a single GPU.
We introduce DISTFLASHATTN, a memory-efficient attention mechanism optimized for long-context LLMs training.
It achieves 1.67x and 1.26-1.88x speedups compared to the recent Ring Attention and DeepSpeed-Ulysses, respectively.
arXiv Detail & Related papers (2023-10-05T03:47:57Z)
- RLTrace: Synthesizing High-Quality System Call Traces for OS Fuzz Testing [10.644829779197341]
We propose a deep reinforcement learning-based solution, called RLTrace, to synthesize diverse and comprehensive system call traces as the seed to fuzz OS kernels.
During model training, the deep learning model interacts with OS kernels and infers optimal system call traces.
Our evaluation shows that RLTrace outperforms other seed generators by producing more comprehensive system call traces.
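A toy sketch of the reinforcement-learning idea: actions are syscalls, the reward is a stand-in coverage signal (new ordered syscall pairs), and a small tabular Q-learning loop takes the place of the paper's deep model interacting with a real kernel. Everything here, including the syscall set, is illustrative.

```python
# Toy sketch of RL-driven syscall-trace synthesis: tabular Q-learning where
# actions are syscalls and the reward is a made-up coverage signal.
import random
from collections import defaultdict

SYSCALLS = ["open", "read", "write", "mmap", "ioctl", "close"]

def synthesize(episodes=500, length=8, eps=0.2, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)                  # Q[(previous_syscall, next_syscall)]
    best_trace, best_score = [], -1
    for _ in range(episodes):
        trace, last, seen, score = [], "<start>", set(), 0
        for _ in range(length):
            if rng.random() < eps:          # epsilon-greedy action selection
                action = rng.choice(SYSCALLS)
            else:
                action = max(SYSCALLS, key=lambda a: q[(last, a)])
            reward = 1.0 if (last, action) not in seen else 0.0   # "coverage" gain
            seen.add((last, action))
            target = reward + gamma * max(q[(action, a)] for a in SYSCALLS)
            q[(last, action)] += alpha * (target - q[(last, action)])
            trace.append(action)
            last = action
            score += reward
        if score > best_score:
            best_trace, best_score = list(trace), score
    return best_trace, best_score

if __name__ == "__main__":
    print(synthesize())
```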
arXiv Detail & Related papers (2023-10-04T06:46:00Z)
- SYSPART: Automated Temporal System Call Filtering for Binaries [4.445982681030902]
Recent approaches automatically identify the system calls required by programs to block unneeded ones.
SYSPART is an automatic system-call filtering system designed for binary-only server programs.
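The temporal-filtering idea can be sketched as two phase-specific allowlists: permissive during initialization, then tightened once the server reaches its main loop. The syscall sets below are hypothetical, and the real system derives them from binary analysis and enforces them with seccomp rather than in Python.

```python
# Simplified sketch of temporal syscall filtering: a broad allowlist during
# initialization, then a tighter allowlist once the server enters its serving
# loop. The syscall sets are hypothetical; real enforcement would use seccomp.
INIT_SYSCALLS    = {"open", "read", "mmap", "socket", "bind", "listen"}
SERVING_SYSCALLS = {"accept", "read", "write", "close", "epoll_wait"}

class TemporalFilter:
    def __init__(self):
        self.allowed = INIT_SYSCALLS | SERVING_SYSCALLS   # permissive during startup

    def enter_serving_phase(self):
        """Called at the transition point (e.g., just before the accept loop)."""
        self.allowed = SERVING_SYSCALLS

    def check(self, syscall):
        if syscall not in self.allowed:
            raise PermissionError(f"syscall {syscall!r} blocked in current phase")

if __name__ == "__main__":
    f = TemporalFilter()
    f.check("bind")                 # allowed during initialization
    f.enter_serving_phase()
    f.check("accept")               # allowed while serving
    try:
        f.check("bind")             # now blocked: not needed after startup
    except PermissionError as e:
        print(e)
```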
arXiv Detail & Related papers (2023-09-10T23:57:07Z)
- Kernel Continual Learning [117.79080100313722]
Kernel continual learning is a simple but effective variant of continual learning for tackling catastrophic forgetting.
An episodic memory unit stores a subset of samples for each task to learn task-specific classifiers based on kernel ridge regression.
Variational random features are used to learn a data-driven kernel for each task.
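A NumPy sketch of the episodic-memory and kernel-ridge-regression part under simplifying assumptions: a fixed RBF kernel and a random memory subset per task stand in for the learned, data-driven kernels (variational random features) used in the paper.

```python
# NumPy sketch of per-task classifiers via kernel ridge regression on an
# episodic-memory subset (a fixed RBF kernel here; the paper additionally
# learns data-driven kernels with variational random features).
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class KernelContinualLearner:
    def __init__(self, memory_size=50, lam=1e-2):
        self.memory_size, self.lam = memory_size, lam
        self.tasks = {}                         # task_id -> (memory_x, alpha)

    def learn_task(self, task_id, x, y):
        """Store a memory subset and fit alpha = (K + lam*I)^-1 y for this task."""
        idx = np.random.default_rng(0).choice(len(x), min(self.memory_size, len(x)),
                                              replace=False)
        mem_x, mem_y = x[idx], y[idx]
        k = rbf_kernel(mem_x, mem_x)
        alpha = np.linalg.solve(k + self.lam * np.eye(len(mem_x)), mem_y)
        self.tasks[task_id] = (mem_x, alpha)

    def predict(self, task_id, x):
        mem_x, alpha = self.tasks[task_id]
        return np.sign(rbf_kernel(x, mem_x) @ alpha)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((200, 2))
    y = np.sign(x[:, 0] * x[:, 1])              # toy binary task
    model = KernelContinualLearner()
    model.learn_task("task-0", x, y)
    print((model.predict("task-0", x) == y).mean())
```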
arXiv Detail & Related papers (2021-07-12T22:09:30Z)
- Latency-Aware Differentiable Neural Architecture Search [113.35689580508343]
Differentiable neural architecture search methods have become popular in recent years, mainly due to their low search costs and flexibility in designing the search space.
However, these methods have difficulty optimizing the network, so the searched network is often unfriendly to hardware.
This paper addresses the problem by adding a differentiable latency loss term to the optimization, so that the search process can trade off accuracy and latency with a balancing coefficient.
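A small NumPy sketch of what such a latency-aware search objective can look like: the expected latency of a candidate cell is the softmax-weighted sum of per-operation latencies, added to the task loss with a balancing coefficient. The latency table and cell structure are made up; this is not the paper's search code.

```python
# NumPy sketch of a latency-aware search objective: expected latency is the
# softmax-weighted sum of per-operation latencies, combined with the task loss
# via a balancing coefficient lam. The latency table below is hypothetical.
import numpy as np

LATENCY_TABLE = {"skip": 0.1, "conv3x3": 1.0, "conv5x5": 2.3, "maxpool": 0.4}  # ms
OPS = list(LATENCY_TABLE)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def expected_latency(arch_params):
    """arch_params: (n_edges, n_ops) architecture logits, one row per edge."""
    lat = np.array([LATENCY_TABLE[op] for op in OPS])
    return sum(softmax(row) @ lat for row in arch_params)

def search_objective(task_loss, arch_params, lam=0.05):
    # The balancing coefficient lam trades accuracy against latency.
    return task_loss + lam * expected_latency(arch_params)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    arch = rng.standard_normal((4, len(OPS)))   # 4 edges in a toy cell
    print(search_objective(task_loss=0.9, arch_params=arch))
```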
arXiv Detail & Related papers (2020-01-17T15:55:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.