FuzzDistill: Intelligent Fuzzing Target Selection using Compile-Time Analysis and Machine Learning
- URL: http://arxiv.org/abs/2412.08100v1
- Date: Wed, 11 Dec 2024 04:55:58 GMT
- Title: FuzzDistill: Intelligent Fuzzing Target Selection using Compile-Time Analysis and Machine Learning
- Authors: Saket Upadhyay,
- Abstract summary: I present FuzzDistill, an approach that harnesses compile-time data and machine learning to refine fuzzing targets.
I demonstrate the efficacy of my approach through experiments conducted on real-world software, demonstrating substantial reductions in testing time.
- Score: 0.0
- License:
- Abstract: Fuzz testing is a fundamental technique employed to identify vulnerabilities within software systems. However, the process can be protracted and resource-intensive, especially when confronted with extensive codebases. In this work, I present FuzzDistill, an approach that harnesses compile-time data and machine learning to refine fuzzing targets. By analyzing compile-time information, such as function call graphs' features, loop information, and memory operations, FuzzDistill identifies high-priority areas of the codebase that are more probable to contain vulnerabilities. I demonstrate the efficacy of my approach through experiments conducted on real-world software, demonstrating substantial reductions in testing time.
Related papers
- CKGFuzzer: LLM-Based Fuzz Driver Generation Enhanced By Code Knowledge Graph [29.490817477791357]
We propose an automated fuzz testing method driven by a code knowledge graph and powered by an intelligent agent system.
The code knowledge graph is constructed through interprocedural program analysis, where each node in the graph represents a code entity.
CKGFuzzer achieved an average improvement of 8.73% in code coverage compared to state-of-the-art techniques.
arXiv Detail & Related papers (2024-11-18T12:41:16Z) - FuzzCoder: Byte-level Fuzzing Test via Large Language Model [46.18191648883695]
We propose to adopt fine-tuned large language models (FuzzCoder) to learn patterns in the input files from successful attacks.
FuzzCoder can predict mutation locations and strategies locations in input files to trigger abnormal behaviors of the program.
arXiv Detail & Related papers (2024-09-03T14:40:31Z) - Rethinking Negative Pairs in Code Search [56.23857828689406]
We propose a simple yet effective Soft-InfoNCE loss that inserts weight terms into InfoNCE.
We analyze the effects of Soft-InfoNCE on controlling the distribution of learnt code representations and on deducing a more precise mutual information estimation.
arXiv Detail & Related papers (2023-10-12T06:32:42Z) - A Discrepancy Aware Framework for Robust Anomaly Detection [51.710249807397695]
We present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies.
Our method leverages an appearance-agnostic cue to guide the decoder in identifying defects, thereby alleviating its reliance on synthetic appearance.
Under the simple synthesis strategies, it outperforms existing methods by a large margin. Furthermore, it also achieves the state-of-the-art localization performance.
arXiv Detail & Related papers (2023-10-11T15:21:40Z) - Using Machine Learning To Identify Software Weaknesses From Software
Requirement Specifications [49.1574468325115]
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications.
Keywords extracted using latent semantic analysis help map the CWE categories to PROMISE_exp. Naive Bayes, support vector machine (SVM), decision trees, neural network, and convolutional neural network (CNN) algorithms were tested.
arXiv Detail & Related papers (2023-08-10T13:19:10Z) - A Unified Active Learning Framework for Annotating Graph Data with
Application to Software Source Code Performance Prediction [4.572330678291241]
We develop a unified active learning framework specializing in software performance prediction.
We investigate the impact of using different levels of information for active and passive learning.
Our approach aims to improve the investment in AI models for different software performance predictions.
arXiv Detail & Related papers (2023-04-06T14:00:48Z) - Statement-Level Vulnerability Detection: Learning Vulnerability Patterns Through Information Theory and Contrastive Learning [31.15123852246431]
We propose a novel end-to-end deep learning-based approach to identify the vulnerability-relevant code statements of a specific function.
Inspired by the structures observed in real-world vulnerable code, we first leverage mutual information for learning a set of latent variables.
We then propose novel clustered spatial contrastive learning in order to further improve the representation learning.
arXiv Detail & Related papers (2022-09-20T00:46:20Z) - Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models.
The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z) - VELVET: a noVel Ensemble Learning approach to automatically locate
VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z) - DeFuzz: Deep Learning Guided Directed Fuzzing [41.61500799890691]
We propose a deep learning (DL) guided directed fuzzing for software vulnerability detection, named DeFuzz.
DeFuzz includes two main schemes: (1) we employ a pre-trained DL prediction model to identify the potentially vulnerable functions and the locations (i.e., vulnerable addresses)
Precisely, we employ Bidirectional-LSTM (BiLSTM) to identify attention words, and the vulnerabilities are associated with these attention words in functions.
arXiv Detail & Related papers (2020-10-23T03:44:03Z) - Estimating Structural Target Functions using Machine Learning and
Influence Functions [103.47897241856603]
We propose a new framework for statistical machine learning of target functions arising as identifiable functionals from statistical models.
This framework is problem- and model-agnostic and can be used to estimate a broad variety of target parameters of interest in applied statistics.
We put particular focus on so-called coarsening at random/doubly robust problems with partially unobserved information.
arXiv Detail & Related papers (2020-08-14T16:48:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.