Pipe-Cleaner: Flexible Fuzzing Using Security Policies
- URL: http://arxiv.org/abs/2411.00261v1
- Date: Thu, 31 Oct 2024 23:35:22 GMT
- Title: Pipe-Cleaner: Flexible Fuzzing Using Security Policies
- Authors: Allison Naaktgeboren, Sean Noble Anderson, Andrew Tolmach, Greg Sullivan,
- Abstract summary: Pipe-Cleaner is a system for detecting and analyzing C code vulnerabilities.
It is based on flexible developer-designed security policies enforced by a tag-based runtime reference monitor.
We demonstrate the potential of this approach on several heap-related security vulnerabilities.
- Score: 0.07499722271664144
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fuzzing has proven to be very effective for discovering certain classes of software flaws, but less effective in helping developers process these discoveries. Conventional crash-based fuzzers lack enough information about failures to determine their root causes, or to differentiate between new or known crashes, forcing developers to manually process long, repetitious lists of crash reports. Also, conventional fuzzers typically cannot be configured to detect the variety of bugs developers care about, many of which are not easily converted into crashes. To address these limitations, we propose Pipe-Cleaner, a system for detecting and analyzing C code vulnerabilities using a refined fuzzing approach. Pipe-Cleaner is based on flexible developer-designed security policies enforced by a tag-based runtime reference monitor, which communicates with a policy-aware fuzzer. Developers are able to customize the types of faults the fuzzer detects and the level of detail in fault reports. Adding more detail helps the fuzzer to differentiate new bugs, discard duplicate bugs, and improve the clarity of results for bug triage. We demonstrate the potential of this approach on several heap-related security vulnerabilities, including classic memory safety violations and two novel non-crashing classes outside the reach of conventional fuzzers: leftover secret disclosure, and heap address leaks.
Related papers
- LLM-Powered Silent Bug Fuzzing in Deep Learning Libraries via Versatile and Controlled Bug Transfer [15.118579443741659]
We build on the observation that historical bug reports contain rich, underutilized information about silent bugs.<n>We leverage large language models (LLMs) to perform versatile yet controlled bug transfer for silent bug fuzzing.<n>This enables proactive detection of silent bugs by transferring high-risk contexts and oracle designs from known buggy to functionally similar target.
arXiv Detail & Related papers (2026-02-26T14:53:26Z) - Outrunning LLM Cutoffs: A Live Kernel Crash Resolution Benchmark for All [57.23434868678603]
Live-kBench is an evaluation framework for self-evolving benchmarks that scrapes and evaluates agents on freshly discovered kernel bugs.<n> kEnv is an agent-agnostic crash-resolution environment for kernel compilation, execution, and feedback.<n>Using kEnv, we benchmark three state-of-the-art agents, showing that they resolve 74% of crashes on the first attempt.
arXiv Detail & Related papers (2026-02-02T19:06:15Z) - The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search [58.8834056209347]
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs.<n>We introduce the Correlated Knowledge Attack Agent (CKA-Agent), a dynamic framework that reframes jailbreaking as an adaptive, tree-structured exploration of the target model's knowledge base.
arXiv Detail & Related papers (2025-12-01T07:05:23Z) - FalseCrashReducer: Mitigating False Positive Crashes in OSS-Fuzz-Gen Using Agentic AI [9.23077204004445]
This paper presents two AI-driven strategies to reduce false positives in OSS-Fuzz-Gen, a multi-agent system for automated fuzz driver generation.<n>First, constraint-based fuzz driver generation proactively enforces constraints on a function's inputs and state to guide driver creation.<n>Second, context-based crash validation reactively analyzes function callers to determine whether reported crashes are feasible from program entry points.
arXiv Detail & Related papers (2025-10-02T16:36:56Z) - DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models [50.21378052667732]
We conduct an in-depth analysis of dLLM vulnerabilities to jailbreak attacks across two distinct dimensions: intra-step and inter-step dynamics.<n>We propose DiffuGuard, a training-free defense framework that addresses vulnerabilities through a dual-stage approach.
arXiv Detail & Related papers (2025-09-29T05:17:10Z) - What Do They Fix? LLM-Aided Categorization of Security Patches for Critical Memory Bugs [46.325755802511026]
We developLM, a dual-method pipeline that integrates two approaches based on a Large Language Model (LLM) and a fine-tuned small language model.<n>LM successfully identified 111 of 5,140 recent Linux kernel patches addressing OOB or UAF vulnerabilities, with 90 true positives confirmed by manual verification.
arXiv Detail & Related papers (2025-09-26T18:06:36Z) - FuzzRDUCC: Fuzzing with Reconstructed Def-Use Chain Coverage [6.827408090670258]
Binary-only fuzzing often struggles with achieving thorough code coverage and uncovering hidden vulnerabilities.<n>We introduce FuzzRDUCC, a novel fuzzing framework that employs symbolic execution to reconstruct definition-use (def-use) chains directly from binary executables.
arXiv Detail & Related papers (2025-09-05T09:47:34Z) - Evaluating Language Model Reasoning about Confidential Information [95.64687778185703]
We study whether language models exhibit contextual robustness, or the capability to adhere to context-dependent safety specifications.<n>We develop a benchmark (PasswordEval) that measures whether language models can correctly determine when a user request is authorized.<n>We find that current open- and closed-source models struggle with this seemingly simple task, and that, perhaps surprisingly, reasoning capabilities do not generally improve performance.
arXiv Detail & Related papers (2025-08-27T15:39:46Z) - deepSURF: Detecting Memory Safety Vulnerabilities in Rust Through Fuzzing LLM-Augmented Harnesses [8.093479682590825]
Rust ensures memory safety by default, but it also permits the use of unsafe code, which can introduce memory safety vulnerabilities if misused.<n>We present deepSURF, a tool that integrates static analysis with Large Language Model (LLM)-guided fuzzing harness generation.<n>We evaluate deepSURF on 27 real-world Rust crates, successfully rediscovering 20 known memory safety bugs and uncovering 6 previously unknown vulnerabilities.
arXiv Detail & Related papers (2025-06-18T17:18:23Z) - DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z) - CrashFixer: A crash resolution agent for the Linux kernel [58.152358195983155]
This work builds upon kGym, which shares a benchmark for system-level Linux kernel bugs and a platform to run experiments on the Linux kernel.
This paper introduces CrashFixer, the first LLM-based software repair agent that is applicable to Linux kernel bugs.
arXiv Detail & Related papers (2025-04-29T04:18:51Z) - CKGFuzzer: LLM-Based Fuzz Driver Generation Enhanced By Code Knowledge Graph [29.490817477791357]
We propose an automated fuzz testing method driven by a code knowledge graph and powered by an intelligent agent system.
The code knowledge graph is constructed through interprocedural program analysis, where each node in the graph represents a code entity.
CKGFuzzer achieved an average improvement of 8.73% in code coverage compared to state-of-the-art techniques.
arXiv Detail & Related papers (2024-11-18T12:41:16Z) - Fixing Security Vulnerabilities with AI in OSS-Fuzz [9.730566646484304]
OSS-Fuzz is the most significant and widely used infrastructure for continuous validation of open source systems.
We customise the well-known AutoCodeRover agent for fixing security vulnerabilities.
Our experience with OSS-Fuzz vulnerability data shows that LLM agent autonomy is useful for successful security patching.
arXiv Detail & Related papers (2024-11-03T16:20:32Z) - FuzzCoder: Byte-level Fuzzing Test via Large Language Model [46.18191648883695]
We propose to adopt fine-tuned large language models (FuzzCoder) to learn patterns in the input files from successful attacks.
FuzzCoder can predict mutation locations and strategies locations in input files to trigger abnormal behaviors of the program.
arXiv Detail & Related papers (2024-09-03T14:40:31Z) - Leveraging Stack Traces for Spectrum-based Fault Localization in the Absence of Failing Tests [44.13331329339185]
We introduce a new approach, SBEST, that integrates stack trace data with test coverage to enhance fault localization.
Our approach shows a significant improvement, increasing Mean Average Precision (MAP) by 32.22% and Mean Reciprocal Rank (MRR) by 17.43% over traditional stack trace ranking methods.
arXiv Detail & Related papers (2024-05-01T15:15:52Z) - Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations [76.19419888353586]
Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations.
We present our efforts to create and deploy a library of detectors: compact and easy-to-build classification models that provide labels for various harms.
arXiv Detail & Related papers (2024-03-09T21:07:16Z) - Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug
Unearthing [2.4287247817521096]
Vulnerabilities in BusyBox can have far-reaching consequences.
The study revealed the prevalence of older BusyBox versions in real-world embedded products.
We introduce two techniques to fortify software testing.
arXiv Detail & Related papers (2024-03-06T17:57:03Z) - Online Corrupted User Detection and Regret Minimization [49.536254494829436]
In real-world online web systems, multiple users usually arrive sequentially into the system.
We present an important online learning problem named LOCUD to learn and utilize unknown user relations from disrupted behaviors.
We devise a novel online detection algorithm OCCUD based on RCLUB-WCU's inferred user relations.
arXiv Detail & Related papers (2023-10-07T10:20:26Z) - Hyperfuzzing: black-box security hypertesting with a grey-box fuzzer [2.2000560351723504]
LeakFuzzer advances the state of the art by using a noninterference security property together with a security flow policy as an oracle.
LeakFuzzer can detect the same set of errors that a normal fuzzer can detect, with the addition of being able to detect violations of secure information flow policies.
arXiv Detail & Related papers (2023-08-17T16:15:02Z) - What Happens When We Fuzz? Investigating OSS-Fuzz Bug History [0.9772968596463595]
We analyzed 44,102 reported issues made public by OSS-Fuzz prior to March 12, 2022.
We identified the bug-contributing commits to estimate when the bug containing code was introduced, and measure the timeline from introduction to detection to fix.
arXiv Detail & Related papers (2023-05-19T05:15:36Z) - Beyond the Prior Forgery Knowledge: Mining Critical Clues for General
Face Forgery Detection [61.74632676703288]
We propose a novel Critical Forgery Mining framework, which can be flexibly assembled with various backbones to boost generalization and performance.
Specifically, we first build a fine-grained triplet and suppress specific forgery traces through prior knowledge-agnostic data augmentation.
We then propose a fine-grained relation learning prototype to mine critical information in forgeries through instance and local similarity-aware losses.
arXiv Detail & Related papers (2023-04-24T23:02:27Z) - Enhancing Multiple Reliability Measures via Nuisance-extended
Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can be rather from some biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z) - Multi-context Attention Fusion Neural Network for Software Vulnerability
Identification [4.05739885420409]
We propose a deep learning model that learns to detect some of the common categories of security vulnerabilities in source code efficiently.
The model builds an accurate understanding of code semantics with a lot less learnable parameters.
The proposed AI achieves 98.40% F1-score on specific CWEs from the benchmarked NIST SARD dataset.
arXiv Detail & Related papers (2021-04-19T11:50:36Z) - Robust and Transferable Anomaly Detection in Log Data using Pre-Trained
Language Models [59.04636530383049]
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, as a major troubleshooting source of system information.
arXiv Detail & Related papers (2021-02-23T09:17:05Z) - DeFuzz: Deep Learning Guided Directed Fuzzing [41.61500799890691]
We propose a deep learning (DL) guided directed fuzzing for software vulnerability detection, named DeFuzz.
DeFuzz includes two main schemes: (1) we employ a pre-trained DL prediction model to identify the potentially vulnerable functions and the locations (i.e., vulnerable addresses)
Precisely, we employ Bidirectional-LSTM (BiLSTM) to identify attention words, and the vulnerabilities are associated with these attention words in functions.
arXiv Detail & Related papers (2020-10-23T03:44:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.