Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug
Unearthing
- URL: http://arxiv.org/abs/2403.03897v1
- Date: Wed, 6 Mar 2024 17:57:03 GMT
- Title: Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug
Unearthing
- Authors: Asmita, Yaroslav Oliinyk, Michael Scott, Ryan Tsang, Chongzhou Fang,
Houman Homayoun
- Abstract summary: Vulnerabilities in BusyBox can have far-reaching consequences.
The study revealed the prevalence of older BusyBox versions in real-world embedded products.
We introduce two techniques to fortify software testing.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: BusyBox, an open-source software bundling over 300 essential Linux commands
into a single executable, is ubiquitous in Linux-based embedded devices.
Vulnerabilities in BusyBox can have far-reaching consequences, affecting a wide
array of devices. This research, driven by the extensive use of BusyBox, delved
into its analysis. The study revealed the prevalence of older BusyBox versions
in real-world embedded products, prompting us to conduct fuzz testing on
BusyBox. Fuzzing, a pivotal software testing method, aims to induce crashes
that are subsequently scrutinized to uncover vulnerabilities. Within this
study, we introduce two techniques to fortify software testing. The first
technique enhances fuzzing by leveraging Large Language Models (LLM) to
generate target-specific initial seeds. Our study showed a substantial increase
in crashes when using LLM-generated initial seeds, highlighting the potential
of LLM to efficiently tackle the typically labor-intensive task of generating
target-specific initial seeds. The second technique involves repurposing
previously acquired crash data from similar fuzzed targets before initiating
fuzzing on a new target. This approach streamlines the time-consuming fuzz
testing process by providing crash data directly to the new target before
commencing fuzzing. We successfully identified crashes in the latest BusyBox
target without conducting traditional fuzzing, emphasizing the effectiveness of
LLM and crash reuse techniques in enhancing software testing and improving
vulnerability detection in embedded systems. Additionally, manual triaging was
performed to identify the nature of crashes in the latest BusyBox.
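The crash-reuse idea described above can be sketched roughly as follows: replay crash inputs saved from previously fuzzed, similar targets against a new binary before launching a fresh fuzzing campaign. This is an illustrative sketch, not the paper's implementation; the directory layout, helper name, and target invocation are all assumptions.

```python
# Hypothetical sketch of crash reuse: feed each saved crash input from an
# earlier campaign to a new target and record which inputs still crash it.
# The function name and paths are illustrative, not from the paper.
import subprocess
from pathlib import Path

def replay_crashes(target_cmd, crash_dir, timeout=5):
    """Replay saved crash inputs on stdin; return the ones that reproduce."""
    reproduced = []
    for crash_file in sorted(Path(crash_dir).glob("*")):
        data = crash_file.read_bytes()
        try:
            proc = subprocess.run(
                target_cmd, input=data, timeout=timeout,
                stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
            )
            # A negative return code means the process was killed by a
            # signal (e.g. -11 for SIGSEGV), i.e. the old crash carries over.
            if proc.returncode < 0:
                reproduced.append((crash_file.name, -proc.returncode))
        except subprocess.TimeoutExpired:
            reproduced.append((crash_file.name, "hang"))
    return reproduced
```

Inputs that reproduce immediately can be triaged right away, skipping the exploration phase of a full fuzzing run for those cases.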
Related papers
- FuzzTheREST: An Intelligent Automated Black-box RESTful API Fuzzer
This work introduces a black-box RESTful API fuzzing tool that employs Reinforcement Learning (RL) for vulnerability detection.
The tool found a total of six unique vulnerabilities and achieved 55% code coverage.
arXiv Detail & Related papers (2024-07-19T14:43:35Z)
- Improving Logits-based Detector without Logits from Black-box LLMs
Large Language Models (LLMs) have revolutionized text generation, producing outputs that closely mimic human writing.
We present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection.
DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations.
arXiv Detail & Related papers (2024-06-07T19:38:05Z)
- Are you still on track!? Catching LLM Task Drift with Activations
Large Language Models (LLMs) are routinely used in retrieval-augmented applications to orchestrate tasks and process inputs from users and other sources.
This opens the door to prompt injection attacks, where the LLM receives and acts upon instructions from supposedly data-only sources, thus deviating from the user's original instructions.
We define this as task drift, and we propose to catch it by scanning and analyzing the LLM's activations.
We show that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions, without being trained on any of these attacks.
arXiv Detail & Related papers (2024-06-02T16:53:21Z)
- Semi-supervised Open-World Object Detection
We introduce a more realistic formulation, named semi-supervised open-world detection (SS-OWOD).
We demonstrate that the performance of the state-of-the-art OWOD detector dramatically deteriorates in the proposed SS-OWOD setting.
Our experiments on 4 datasets including MS COCO, PASCAL, Objects365 and DOTA demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-02-25T07:12:51Z)
- Revisiting Neural Program Smoothing for Fuzzing
This paper presents the most extensive evaluation of NPS fuzzers against standard gray-box fuzzers.
We implement Neuzz++, which shows that addressing the practical limitations of NPS fuzzers improves performance.
We present MLFuzz, a platform with GPU access for easy and reproducible evaluation of ML-based fuzzers.
arXiv Detail & Related papers (2023-09-28T17:17:11Z)
- Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Large Language Models (LLMs) are increasingly being integrated into various applications.
We show how attackers can override the original instructions and the employed controls using Prompt Injection attacks.
We derive a comprehensive taxonomy from a computer security perspective to systematically investigate impacts and vulnerabilities.
arXiv Detail & Related papers (2023-02-23T17:14:38Z)
- FuSeBMC v4: Improving code coverage with smart seeds via BMC, fuzzing and static analysis
FuSeBMC v4 is a test generator that synthesizes seeds with useful properties.
FuSeBMC works by first analyzing and incrementally injecting goal labels into the given C program.
arXiv Detail & Related papers (2022-06-28T15:13:37Z)
- D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis
We propose D2A, a differential analysis based approach to label issues reported by static analysis tools.
We use D2A to generate a large labeled dataset to train models for vulnerability identification.
arXiv Detail & Related papers (2021-02-16T07:46:53Z)
- DeFuzz: Deep Learning Guided Directed Fuzzing
We propose a deep learning (DL) guided directed fuzzing for software vulnerability detection, named DeFuzz.
DeFuzz includes two main schemes: (1) we employ a pre-trained DL prediction model to identify the potentially vulnerable functions and the locations (i.e., vulnerable addresses).
Precisely, we employ Bidirectional-LSTM (BiLSTM) to identify attention words, and the vulnerabilities are associated with these attention words in functions.
arXiv Detail & Related papers (2020-10-23T03:44:03Z)
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that does not only encompass and generalize previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.