Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug
Unearthing
- URL: http://arxiv.org/abs/2403.03897v1
- Date: Wed, 6 Mar 2024 17:57:03 GMT
- Title: Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug
Unearthing
- Authors: Asmita, Yaroslav Oliinyk, Michael Scott, Ryan Tsang, Chongzhou Fang,
Houman Homayoun
- Abstract summary: Vulnerabilities in BusyBox can have far-reaching consequences.
The study revealed the prevalence of older BusyBox versions in real-world embedded products.
We introduce two techniques to fortify software testing.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: BusyBox, an open-source software bundling over 300 essential Linux commands
into a single executable, is ubiquitous in Linux-based embedded devices.
Vulnerabilities in BusyBox can have far-reaching consequences, affecting a wide
array of devices. This research, driven by the extensive use of BusyBox, delved
into its analysis. The study revealed the prevalence of older BusyBox versions
in real-world embedded products, prompting us to conduct fuzz testing on
BusyBox. Fuzzing, a pivotal software testing method, aims to induce crashes
that are subsequently scrutinized to uncover vulnerabilities. Within this
study, we introduce two techniques to fortify software testing. The first
technique enhances fuzzing by leveraging Large Language Models (LLM) to
generate target-specific initial seeds. Our study showed a substantial increase
in crashes when using LLM-generated initial seeds, highlighting the potential
of LLM to efficiently tackle the typically labor-intensive task of generating
target-specific initial seeds. The second technique involves repurposing
previously acquired crash data from similar fuzzed targets before initiating
fuzzing on a new target. This approach streamlines the time-consuming fuzz
testing process by providing crash data directly to the new target before
commencing fuzzing. We successfully identified crashes in the latest BusyBox
target without conducting traditional fuzzing, emphasizing the effectiveness of
LLM and crash reuse techniques in enhancing software testing and improving
vulnerability detection in embedded systems. Additionally, manual triaging was
performed to identify the nature of crashes in the latest BusyBox.
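The crash-reuse idea described above can be sketched roughly as follows: replay crash inputs saved from previously fuzzed, similar targets against a new binary before launching a fresh fuzzing campaign. This is an illustrative sketch, not the paper's implementation; the directory layout, helper name, and target invocation are all assumptions.

```python
# Hypothetical sketch of crash reuse: feed each saved crash input from an
# earlier campaign to a new target and record which inputs still crash it.
# The function name and paths are illustrative, not from the paper.
import subprocess
from pathlib import Path

def replay_crashes(target_cmd, crash_dir, timeout=5):
    """Replay saved crash inputs on stdin; return the ones that reproduce."""
    reproduced = []
    for crash_file in sorted(Path(crash_dir).glob("*")):
        data = crash_file.read_bytes()
        try:
            proc = subprocess.run(
                target_cmd, input=data, timeout=timeout,
                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
            )
            # A negative return code means the process was killed by a
            # signal (e.g. -11 for SIGSEGV), i.e. the old crash carries over.
            if proc.returncode < 0:
                reproduced.append((crash_file.name, -proc.returncode))
        except subprocess.TimeoutExpired:
            reproduced.append((crash_file.name, "hang"))
    return reproduced
```

Inputs that reproduce immediately can be triaged right away, skipping the exploration phase of a full fuzzing run for those cases.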
Related papers
- FuzzTheREST: An Intelligent Automated Black-box RESTful API Fuzzer
This work introduces a black-box RESTful API fuzzing tool that employs Reinforcement Learning (RL) for vulnerability detection.
The tool found a total of six unique vulnerabilities and achieved 55% code coverage.
arXiv Detail & Related papers (2024-07-19T14:43:35Z)
- Improving Logits-based Detector without Logits from Black-box LLMs
Large Language Models (LLMs) have revolutionized text generation, producing outputs that closely mimic human writing.
We present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection.
DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations.
arXiv Detail & Related papers (2024-06-07T19:38:05Z)
- Are you still on track!? Catching LLM Task Drift with Activations
Large Language Models (LLMs) are routinely used in retrieval-augmented applications to orchestrate tasks and process inputs from users and other sources.
This opens the door to prompt injection attacks, where the LLM receives and acts upon instructions from supposedly data-only sources, thus deviating from the user's original instructions.
We define this as task drift, and we propose to catch it by scanning and analyzing the LLM's activations.
We show that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions, without being trained on any of these attacks.
arXiv Detail & Related papers (2024-06-02T16:53:21Z)
- Semi-supervised Open-World Object Detection
We introduce a more realistic formulation, named semi-supervised open-world detection (SS-OWOD).
We demonstrate that the performance of the state-of-the-art OWOD detector dramatically deteriorates in the proposed SS-OWOD setting.
Our experiments on 4 datasets including MS COCO, PASCAL, Objects365 and DOTA demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-02-25T07:12:51Z)
- Revisiting Neural Program Smoothing for Fuzzing
This paper presents the most extensive evaluation of NPS fuzzers against standard gray-box fuzzers.
We implement Neuzz++, which shows that addressing the practical limitations of NPS fuzzers improves performance.
We present MLFuzz, a platform with GPU access for easy and reproducible evaluation of ML-based fuzzers.
arXiv Detail & Related papers (2023-09-28T17:17:11Z)
- Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Large Language Models (LLMs) are increasingly being integrated into various applications.
We show how attackers can override the original instructions and the employed controls using Prompt Injection attacks.
We derive a comprehensive taxonomy from a computer security perspective to systematically investigate impacts and vulnerabilities.
arXiv Detail & Related papers (2023-02-23T17:14:38Z)
- FuSeBMC v4: Improving code coverage with smart seeds via BMC, fuzzing and static analysis
FuSeBMC v4 is a test generator that synthesizes seeds with useful properties.
FuSeBMC works by first analyzing and incrementally injecting goal labels into the given C program.
arXiv Detail & Related papers (2022-06-28T15:13:37Z)
- D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis
We propose D2A, a differential analysis based approach to label issues reported by static analysis tools.
We use D2A to generate a large labeled dataset to train models for vulnerability identification.
arXiv Detail & Related papers (2021-02-16T07:46:53Z)
- DeFuzz: Deep Learning Guided Directed Fuzzing
We propose a deep learning (DL) guided directed fuzzing for software vulnerability detection, named DeFuzz.
DeFuzz includes two main schemes: (1) we employ a pre-trained DL prediction model to identify the potentially vulnerable functions and the locations (i.e., vulnerable addresses).
Precisely, we employ Bidirectional-LSTM (BiLSTM) to identify attention words, and the vulnerabilities are associated with these attention words in functions.
arXiv Detail & Related papers (2020-10-23T03:44:03Z)
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that does not only encompass and generalize previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.