An Empirical Study on the Distance Metric in Guiding Directed Grey-box Fuzzing
- URL: http://arxiv.org/abs/2409.12701v1
- Date: Thu, 19 Sep 2024 12:15:54 GMT
- Title: An Empirical Study on the Distance Metric in Guiding Directed Grey-box Fuzzing
- Authors: Tingke Wen, Yuwei Li, Lu Zhang, Huimin Ma, Zulie Pan
- Abstract summary: Directed grey-box fuzzing (DGF) aims to discover vulnerabilities in specific code areas efficiently.
It remains opaque about how different distance metrics guide the fuzzing process and affect the fuzzing result in practice.
- Score: 13.43238098819184
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Directed grey-box fuzzing (DGF) aims to discover vulnerabilities in specific code areas efficiently. Distance metric, which is used to measure the quality of seed in DGF, is a crucial factor in affecting the fuzzing performance. Despite distance metrics being widely applied in existing DGF frameworks, it remains opaque about how different distance metrics guide the fuzzing process and affect the fuzzing result in practice. In this paper, we conduct the first empirical study to explore how different distance metrics perform in guiding DGFs. Specifically, we systematically discuss different distance metrics in the aspect of calculation method and granularity. Then, we implement different distance metrics based on AFLGo. On this basis, we conduct comprehensive experiments to evaluate the performance of these distance metrics on the benchmarks widely used in existing DGF-related work. The experimental results demonstrate the following insights. First, the difference among different distance metrics with varying methods of calculation and granularities is not significant. Second, the distance metrics may not be effective in describing the difficulty of triggering the target vulnerability. In addition, by scrutinizing the quality of testcases, our research highlights the inherent limitation of existing mutation strategies in generating high-quality testcases, calling for designing effective mutation strategies for directed fuzzing. We open-source the implementation code and experiment dataset to facilitate future research in DGF.
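The abstract contrasts distance metrics by calculation method and granularity. The sketch below illustrates that idea on a toy call graph. It is a minimal example under stated assumptions, not the paper's or AFLGo's implementation: the graph, the target names, and the helpers `bfs_distances`, `node_distance`, and `seed_distance` are all hypothetical, and the harmonic and arithmetic means are textbook definitions rather than the exact formulas any particular fuzzer uses.

```python
from collections import deque

# Hypothetical function-level call graph (caller -> callees); not from the paper.
CALL_GRAPH = {
    "main": ["parse", "init"],
    "parse": ["decode"],
    "decode": ["target_a"],
    "init": ["target_b"],
}
TARGETS = ["target_a", "target_b"]


def bfs_distances(graph, source):
    """Return shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in dist:
                dist[succ] = dist[node] + 1
                queue.append(succ)
    return dist


def node_distance(graph, node, targets, method="harmonic"):
    """Aggregate the hop counts from node to all reachable targets.

    Returns None when no target is reachable from node.
    """
    reachable = bfs_distances(graph, node)
    hops = [reachable[t] for t in targets if t in reachable]
    if not hops:
        return None
    if 0 in hops:
        # The node is itself a target.
        return 0.0
    if method == "harmonic":
        return len(hops) / sum(1.0 / h for h in hops)
    if method == "arithmetic":
        return sum(hops) / len(hops)
    raise ValueError(f"unknown method: {method}")


def seed_distance(graph, trace, targets, method="harmonic"):
    """Average the node distances over the nodes a seed executed."""
    values = []
    for node in trace:
        d = node_distance(graph, node, targets, method)
        if d is not None:
            values.append(d)
    return sum(values) / len(values) if values else None


if __name__ == "__main__":
    trace = ["main", "parse", "decode"]  # hypothetical execution trace
    for method in ("harmonic", "arithmetic"):
        print(method, seed_distance(CALL_GRAPH, trace, TARGETS, method))
```

On this toy graph, `main` reaches the two targets in 3 and 2 hops, so its harmonic distance is about 2.4 and its arithmetic distance is 2.5. Swapping the graph for a basic-block control-flow graph would change the granularity while leaving the calculation method untouched.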
Related papers
- Attention Distance: A Novel Metric for Directed Fuzzing with Large Language Models [23.471848775985364]
We introduce attention distance, a novel metric that computes attention scores between code elements and reveals their intrinsic connections.
Compared to state-of-the-art directed fuzzers DAFL and WindRanger, our approach achieves 2.89× and 7.13× improvements, respectively.
arXiv Detail & Related papers (2025-12-19T17:03:50Z) - MeshMetrics: A Precise Implementation of Distance-Based Image Segmentation Metrics [0.0]
MeshMetrics is a mesh-based framework that provides a more precise computation of distance-based metrics than conventional grid-based approaches.
We demonstrate that MeshMetrics achieves higher accuracy and precision than established tools, and is substantially less affected by discretization artifacts.
arXiv Detail & Related papers (2025-09-06T10:16:40Z) - DIDS: Domain Impact-aware Data Sampling for Large Language Model Training [61.10643823069603]
We present Domain Impact-aware Data Sampling (DIDS) for large language models.
DIDS groups training data based on learning effects, where a proxy language model and dimensionality reduction are employed.
It achieves 3.4% higher average performance while maintaining comparable training efficiency.
arXiv Detail & Related papers (2025-04-17T13:09:38Z) - Human Re-ID Meets LVLMs: What can we expect? [14.370360290704197]
We compare the performance of the leading large vision-language models in the human re-identification task.
Our results confirm the strengths of LVLMs, but also their severe limitations that often lead to catastrophic answers.
arXiv Detail & Related papers (2025-01-30T19:00:40Z) - Episodic Novelty Through Temporal Distance [39.66260812278513]
Episodic Novelty Through Temporal Distance (ETD) is a novel approach that introduces temporal distance as a robust metric for state similarity and intrinsic reward.
By employing contrastive learning, ETD accurately estimates temporal distances and derives intrinsic rewards based on the novelty of states within the current episode.
arXiv Detail & Related papers (2025-01-26T06:43:45Z) - Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning [50.84938730450622]
We propose a trajectory-based method TV score, which uses trajectory volatility for OOD detection in mathematical reasoning.
Our method outperforms all traditional algorithms on GLMs under mathematical reasoning scenarios.
Our method can be extended to more applications with high-density features in output spaces, such as multiple-choice questions.
arXiv Detail & Related papers (2024-05-22T22:22:25Z) - Measuring Domain Shifts using Deep Learning Remote Photoplethysmography Model Similarity [0.9208007322096533]
We study the domain shift problem in the context of remote photoplethysmography (rPPG).
We propose metrics based on model similarity which may be used as a measure of domain shift.
One of the proposed metrics with viable correlations, DS-diff, does not assume access to the ground truth of the target domain.
arXiv Detail & Related papers (2024-04-12T01:13:23Z) - Simple Ingredients for Offline Reinforcement Learning [86.1988266277766]
Offline reinforcement learning algorithms have proven effective on datasets highly connected to the target downstream task.
We show that existing methods struggle with diverse data: their performance considerably deteriorates as data collected for related but different tasks is simply added to the offline buffer.
We show that scale, more than algorithmic considerations, is the key factor influencing performance.
arXiv Detail & Related papers (2024-03-19T18:57:53Z) - Navigating the Metrics Maze: Reconciling Score Magnitudes and Accuracies [24.26653413077486]
Ten years ago a single metric, BLEU, governed progress in machine translation research.
This paper investigates the "dynamic range" of modern metrics.
arXiv Detail & Related papers (2024-01-12T18:47:40Z) - RGB-based Category-level Object Pose Estimation via Decoupled Metric Scale Recovery [72.13154206106259]
We propose a novel pipeline that decouples the 6D pose and size estimation to mitigate the influence of imperfect scales on rigid transformations.
Specifically, we leverage a pre-trained monocular estimator to extract local geometric information.
A separate branch is designed to directly recover the metric scale of the object based on category-level statistics.
arXiv Detail & Related papers (2023-09-19T02:20:26Z) - Towards Multiple References Era -- Addressing Data Leakage and Limited Reference Diversity in NLG Evaluation [55.92852268168816]
N-gram matching-based evaluation metrics, such as BLEU and chrF, are widely utilized across a range of natural language generation (NLG) tasks.
Recent studies have revealed a weak correlation between these matching-based metrics and human evaluations.
We propose to utilize multiple references to enhance the consistency between these metrics and human evaluations.
arXiv Detail & Related papers (2023-08-06T14:49:26Z) - Comparative Study Between Distance Measures On Supervised Optimum-Path Forest Classification [0.0]
Optimum-Path Forest (OPF) uses a graph-based methodology and a distance measure to create arcs between nodes and hence sets of trees.
This work proposes a comparative study over a wide range of distance measures applied to the supervised Optimum-Path Forest classification.
arXiv Detail & Related papers (2022-02-08T13:34:09Z) - Benchmarking Deep Models for Salient Object Detection [67.07247772280212]
We construct a general SALient Object Detection (SALOD) benchmark to conduct a comprehensive comparison among several representative SOD methods.
In the above experiments, we find that existing loss functions are usually specialized for some metrics but report inferior results on the others.
We propose a novel Edge-Aware (EA) loss that promotes deep networks to learn more discriminative features by integrating both pixel- and image-level supervision signals.
arXiv Detail & Related papers (2022-02-07T03:43:16Z) - Leaning Compact and Representative Features for Cross-Modality Person Re-Identification [18.06382007908855]
This paper pays close attention to the cross-modality visible-infrared person re-identification (VI Re-ID) task.
The proposed method outperforms the other most advanced methods.
arXiv Detail & Related papers (2021-03-26T01:53:16Z) - Rethink Maximum Mean Discrepancy for Domain Adaptation [77.2560592127872]
This paper theoretically proves two essential facts: 1) minimizing the Maximum Mean Discrepancy is equivalent to maximizing the source and target intra-class distances respectively while jointly minimizing their variance with some implicit weights, so that the feature discriminability degrades.
Experiments on several benchmark datasets not only prove the validity of theoretical results but also demonstrate that our approach could perform better than the comparative state-of-art methods substantially.
arXiv Detail & Related papers (2020-07-01T18:25:10Z) - Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits [101.68525259222164]
We present a study of various distance-based measures in the context of NLP tasks, that characterize the dissimilarity between domains based on sample estimates.
We develop a DistanceNet model which uses these distance measures as an additional loss function to be minimized jointly with the task's loss function.
We extend this model to a novel DistanceNet-Bandit model, which employs a multi-armed bandit controller to dynamically switch between multiple source domains.
arXiv Detail & Related papers (2020-01-13T15:53:41Z)