CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement
Learning for LLM-based Mutation
- URL: http://arxiv.org/abs/2402.12222v1
- Date: Mon, 19 Feb 2024 15:30:40 GMT
- Title: CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement
Learning for LLM-based Mutation
- Authors: Jueon Eom, Seyeon Jeong, Taekyoung Kwon
- Abstract summary: This paper presents a novel technique called CovRL (Coverage-guided Reinforcement Learning) that combines Large Language Models (LLMs) with reinforcement learning from coverage feedback.
CovRL-Fuzz identifies 48 real-world security-related bugs in the latest JavaScript engines, including 39 previously unknown vulnerabilities and 11 CVEs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fuzzing is an effective bug-finding technique but it struggles with complex
systems like JavaScript engines that demand precise grammatical input.
Recently, researchers have adopted language models for context-aware mutation
in fuzzing to address this problem. However, existing techniques make limited
use of coverage guidance, so fuzzing is still performed in a largely
black-box manner. This paper presents a novel technique called CovRL
(Coverage-guided Reinforcement Learning) that combines Large Language Models
(LLMs) with reinforcement learning from coverage feedback. Our fuzzer,
CovRL-Fuzz, integrates coverage feedback directly into the LLM by leveraging
the Term Frequency-Inverse Document Frequency (TF-IDF) method to construct a
weighted coverage map. This map is key in calculating the fuzzing reward, which
is then applied to the LLM-based mutator through reinforcement learning.
Through this approach, CovRL-Fuzz generates test cases that are more likely
to discover new coverage areas, thus improving vulnerability detection while
minimizing syntax and semantic errors, all without needing extra
post-processing. Our evaluation results indicate that CovRL-Fuzz
outperforms the state-of-the-art fuzzers in terms of code coverage and
bug-finding capabilities: CovRL-Fuzz identified 48 real-world security-related
bugs in the latest JavaScript engines, including 39 previously unknown
vulnerabilities and 11 CVEs.
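As a concrete illustration of the mechanism described above, here is a minimal Python sketch of a TF-IDF-style weighted coverage reward: coverage edges hit by few prior test cases earn higher reward, and the resulting scalar could drive RL fine-tuning of the LLM mutator. The class name, smoothing constants, and normalization are illustrative assumptions, not CovRL's exact formulation.

```python
import math
from collections import Counter

class CoverageReward:
    """Toy TF-IDF-weighted coverage reward (a sketch, not CovRL's exact scheme)."""

    def __init__(self):
        self.doc_freq = Counter()  # how many test cases have hit each edge
        self.num_docs = 0          # test cases executed so far

    def update(self, covered_edges):
        """Record the coverage of one executed test case."""
        self.num_docs += 1
        for edge in set(covered_edges):
            self.doc_freq[edge] += 1

    def reward(self, covered_edges):
        """Reward rare coverage: edges hit by few prior inputs score higher."""
        edges = set(covered_edges)
        if not edges:
            return 0.0
        score = sum(
            math.log((1 + self.num_docs) / (1 + self.doc_freq[e]))
            for e in edges
        )
        return score / len(edges)  # normalize so long traces don't dominate

# Usage: feed the scalar reward into an RL fine-tuning loop for the mutator.
cov = CoverageReward()
cov.update({"e1", "e2"})      # seed corpus coverage
r = cov.reward({"e2", "e3"})  # mutated input; rarely-hit e3 raises the reward
cov.update({"e2", "e3"})
```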
Related papers
- $\mathbb{USCD}$: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding [64.00025564372095]
Large language models (LLMs) have shown remarkable capabilities in code generation.
The effects of hallucinations (e.g., output noise) make it challenging for LLMs to generate high-quality code in one pass.
We propose a simple and effective uncertainty-aware selective contrastive decoding ($\mathbb{USCD}$) mechanism.
arXiv Detail & Related papers (2024-09-09T02:07:41Z)
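As a rough sketch of the contrastive-decoding idea behind $\mathbb{USCD}$: subtract the logits obtained under a noise-inducing prompt from the standard logits, and do so only when the model looks uncertain. The entropy gate and all parameter names here are assumptions; the paper's actual selection criterion differs in detail.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def selective_contrastive_step(expert_logits, noisy_logits,
                               alpha=1.0, entropy_gate=2.0):
    """One decoding step: apply the contrastive correction only when the
    expert distribution is uncertain (high entropy)."""
    p = softmax(expert_logits)
    entropy = -(p * np.log(p + 1e-12)).sum()
    if entropy < entropy_gate:            # confident: trust the model as-is
        return int(np.argmax(expert_logits))
    contrast = expert_logits - alpha * noisy_logits
    return int(np.argmax(contrast))       # uncertain: subtract the noise
```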
- LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing [6.042114639413868]
Specialized fuzzers can handle complex structured data, but they require additional grammar-engineering effort and suffer from low throughput.
In this paper, we explore the potential of utilizing the Large Language Model to enhance greybox fuzzing for structured data.
Our LLM-based fuzzer, LLAMAFUZZ, integrates the power of LLMs to understand and mutate structured data for fuzzing.
arXiv Detail & Related papers (2024-06-11T20:48:28Z)
- FOX: Coverage-guided Fuzzing as Online Stochastic Control [13.3158115776899]
Fuzzing is an effective technique for discovering software vulnerabilities by generating random test inputs and executing them against the target program.
This paper addresses the limitations of existing coverage-guided fuzzers, focusing on the scheduler and mutator components.
We present FOX, a proof-of-concept implementation of our control-theoretic approach, and compare it to industry-standard fuzzers.
arXiv Detail & Related papers (2024-06-06T21:21:05Z)
- FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping [49.66872823080736]
Autoregressive Large Language Models (e.g., LLaMa, GPTs) are omnipresent, achieving remarkable success in language understanding and generation.
To mitigate the overhead incurred during generation, several early-exit and layer-dropping strategies have been proposed.
We propose FFN-SkipLLM, which is an input-adaptive feed-forward skipping strategy.
arXiv Detail & Related papers (2024-04-05T02:35:43Z)
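The following sketch guesses at one way input-adaptive FFN skipping could look: reuse a cached FFN output when the incoming hidden state has barely changed. The cosine-similarity gate, threshold, and caching policy are assumptions, not FFN-SkipLLM's actual criterion.

```python
import numpy as np

class SkippableFFN:
    """Input-adaptive FFN skipping (a sketch under assumed skip criteria)."""

    def __init__(self, ffn, threshold=0.99):
        self.ffn = ffn              # the layer's feed-forward sub-block
        self.threshold = threshold  # cosine-similarity gate for skipping
        self.prev_in = None
        self.prev_out = None

    def __call__(self, hidden):
        if self.prev_in is not None:
            sim = hidden @ self.prev_in / (
                np.linalg.norm(hidden) * np.linalg.norm(self.prev_in) + 1e-12)
            if sim > self.threshold:           # input barely changed:
                return hidden + self.prev_out  # reuse cached FFN output
        out = self.ffn(hidden)
        self.prev_in, self.prev_out = hidden, out
        return hidden + out                    # standard residual connection
```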
- How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
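One generic way to fold an LLM's action prior into value-based RL as a regularizer is a KL-regularized Bellman backup, sketched below; LINVIT's concrete algorithm and sample-complexity guarantees go beyond this.

```python
import numpy as np

def kl_regularized_backup(q_next, llm_prior, lam=1.0):
    """V(s') = lam * log( sum_a prior(a) * exp(Q(s',a)/lam) ): the value of
    the best policy that pays a KL penalty for straying from the LLM prior."""
    return lam * np.log(np.sum(llm_prior * np.exp(q_next / lam)))

def bellman_target(reward, q_next, llm_prior, gamma=0.99, lam=1.0):
    # Larger lam leans harder on the LLM prior; lam -> 0 recovers a hard max.
    return reward + gamma * kl_regularized_backup(q_next, llm_prior, lam)
```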
- Reinforcement learning guided fuzz testing for a browser's HTML rendering engine [0.9176056742068814]
We propose a novel approach to combine a trained deep-learning test-case generator with a double deep Q-network (DDQN).
The DDQN guides test case creation based on a code coverage signal.
Our approach improves the code coverage performance of the underlying generator model by up to 18.5% for the Firefox HTML rendering engine.
arXiv Detail & Related papers (2023-07-27T00:31:02Z)
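A minimal sketch of the two ingredients this entry combines, under a guessed coverage-based reward and assumed array shapes (the paper's exact signal and state encoding may differ):

```python
import numpy as np

def coverage_reward(prev_edges, new_edges):
    """Toy reward: count of coverage edges not seen before this test case."""
    return len(set(new_edges) - set(prev_edges))

def double_dqn_target(reward, q_online_next, q_target_next, gamma=0.99):
    """Double-DQN backup: the online network chooses the next action, the
    target network evaluates it, which curbs Q-value overestimation."""
    a_star = int(np.argmax(q_online_next))
    return reward + gamma * q_target_next[a_star]
```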
- Fuzzing with Quantitative and Adaptive Hot-Bytes Identification [6.442499249981947]
American Fuzzy Lop (AFL), a leading fuzzing tool, has demonstrated its powerful bug-finding ability through a vast number of reported CVEs.
We propose an approach that is designed based on the following principles.
Our evaluation results on 10 real-world programs and the LAVA-M dataset show that our approach achieves sustained increases in branch coverage and discovers more bugs than other fuzzers.
arXiv Detail & Related papers (2023-07-05T13:41:35Z)
- Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as rewards.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
arXiv Detail & Related papers (2021-06-14T18:48:40Z)
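The core reframing can be sketched in a few lines: treat the language model's next-token logits as Q-values and back up a soft (log-sum-exp) state value. This is only the formulation; the paper's efficient training objective (e.g., path consistency) is omitted.

```python
import numpy as np

def soft_q_target(reward, next_logits, gamma=1.0, tau=1.0):
    """TD target for Q(s, a) when logits are read as Q-values:
    V(s') = tau * logsumexp(Q(s', .) / tau), target = r + gamma * V(s')."""
    z = next_logits / tau
    v_next = tau * (np.max(z) + np.log(np.sum(np.exp(z - np.max(z)))))
    return reward + gamma * v_next
```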
- Multi-context Attention Fusion Neural Network for Software Vulnerability Identification [4.05739885420409]
We propose a deep learning model that learns to detect some of the common categories of security vulnerabilities in source code efficiently.
The model builds an accurate understanding of code semantics with far fewer learnable parameters.
The proposed model achieves a 98.40% F1-score on specific CWEs from the benchmark NIST SARD dataset.
arXiv Detail & Related papers (2021-04-19T11:50:36Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction [96.90215318875859]
We show that bootstrapping-based Q-learning algorithms do not necessarily benefit from corrective feedback.
We propose a new algorithm, DisCor, which computes an approximation to this optimal distribution and uses it to re-weight the transitions used for training.
arXiv Detail & Related papers (2020-03-16T16:18:52Z)
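The re-weighting step itself is simple to sketch; estimating the target error Δ (which the paper trains a separate model for) is omitted, and the temperature is an assumed hyperparameter.

```python
import numpy as np

def discor_weights(delta_next, gamma=0.99, tau=1.0):
    """Downweight transitions whose bootstrap targets rest on poorly estimated
    next-state values: w_i ∝ exp(-gamma * delta_next_i / tau), normalized over
    the sampled batch. delta_next estimates |Q - Q*| at (s', a')."""
    w = np.exp(-gamma * np.asarray(delta_next) / tau)
    return w / w.sum()
```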
This list is automatically generated from the titles and abstracts of the papers on this site.