CveBinarySheet: A Comprehensive Pre-built Binaries Database for IoT Vulnerability Analysis
- URL: http://arxiv.org/abs/2501.08840v1
- Date: Wed, 15 Jan 2025 14:50:46 GMT
- Title: CveBinarySheet: A Comprehensive Pre-built Binaries Database for IoT Vulnerability Analysis
- Authors: Lingfeng Chen,
- Abstract summary: CveBinarySheet is a database containing 1033 CVE entries spanning from 1999 to 2024.<n>Our dataset encompasses 16 essential third-party components, including busybox and curl.<n>Each precompiled binary is available at two compiler optimization levels (O0 and O3), facilitating comprehensive vulnerability analysis under different compilation scenarios.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Binary Static Code Analysis (BSCA) is a pivotal area in software vulnerability research, focusing on the precise localization of vulnerabilities within binary executables. Despite advancements in BSCA techniques, there is a notable scarcity of comprehensive and readily usable vulnerability datasets tailored for diverse environments such as IoT, UEFI, and MCU firmware. To address this gap, we present CveBinarySheet, a meticulously curated database containing 1033 CVE entries spanning from 1999 to 2024. Our dataset encompasses 16 essential third-party components, including busybox and curl, and supports five CPU architectures: x86-64, i386, MIPS, ARMv7, and RISC-V64. Each precompiled binary is available at two compiler optimization levels (O0 and O3), facilitating comprehensive vulnerability analysis under different compilation scenarios. By providing detailed metadata and diverse binary samples, CveBinarySheet aims to accelerate the development of state-of-the-art BSCA tools, binary similarity analysis, and vulnerability matching applications.
Related papers
- BinPool: A Dataset of Vulnerabilities for Binary Security Analysis [5.423608359320192]
The ideal dataset consists of a large and diverse collection of real-world vulnerabilities, paired so as to contain both vulnerable and patched versions of each program.
Previous datasets are either publicly unavailable, lack semantic diversity, involve artificially introduced vulnerabilities, or were collected using static analyzers.
In this paper, we describe a new publicly available dataset which we dubbed Binpool, containing numerous samples of vulnerable versions of Debian packages.
arXiv Detail & Related papers (2025-04-27T00:07:34Z) - Beyond the Edge of Function: Unraveling the Patterns of Type Recovery in Binary Code [55.493408628371235]
We propose ByteTR, a framework for recovering variable types in binary code.
In light of the ubiquity of variable propagation across functions, ByteTR conducts inter-procedural analysis to trace variable propagation and employs a gated graph neural network to capture long-range data flow dependencies for variable type recovery.
arXiv Detail & Related papers (2025-03-10T12:27:05Z) - Bridging the PLC Binary Analysis Gap: A Cross-Compiler Dataset and Neural Framework for Industrial Control Systems [14.826593801448032]
PLC-BEAD is a dataset containing 2431 compiled binaries from 700+ PLC programs across four major industrial compilers.
This novel dataset uniquely pairs each binary with its original Structured Text source code and standardized functionality labels.
We demonstrate the dataset's utility through PLCEmbed, a transformer-based framework for binary code analysis.
arXiv Detail & Related papers (2025-02-27T03:27:37Z) - Enhancing Reverse Engineering: Investigating and Benchmarking Large Language Models for Vulnerability Analysis in Decompiled Binaries [2.696054049278301]
We introduce DeBinVul, a novel decompiled binary code vulnerability dataset.
We fine-tune state-of-the-art LLMs using DeBinVul and report on a performance increase of 19%, 24%, and 21% in detecting binary code vulnerabilities.
arXiv Detail & Related papers (2024-11-07T18:54:31Z) - BinaryAI: Binary Software Composition Analysis via Intelligent Binary Source Code Matching [8.655595404611821]
We introduce BinaryAI, a novel binary-to-source SCA technique with two-phase binary source code matching to capture both syntactic and semantic code features.
Our experimental results demonstrate the superior performance of BinaryAI in terms of binary source code matching and the downstream SCA task.
arXiv Detail & Related papers (2024-01-20T07:57:57Z) - BEC: Bit-Level Static Analysis for Reliability against Soft Errors [0.26107298043931204]
We propose a bit-level error coalescing (BEC) static program analysis to understand and improve program reliability against soft errors.
BEC analysis tracks each bit corruption in the register file and classifies the effect of the corruption by its semantics at compile time.
The proposed method is generic and not limited to a specific computer architecture.
arXiv Detail & Related papers (2024-01-11T09:03:47Z) - SOCI^+: An Enhanced Toolkit for Secure OutsourcedComputation on Integers [50.608828039206365]
We propose SOCI+ which significantly improves the performance of SOCI.
SOCI+ employs a novel (2, 2)-threshold Paillier cryptosystem with fast encryption and decryption as its cryptographic primitive.
Compared with SOCI, our experimental evaluation shows that SOCI+ is up to 5.4 times more efficient in computation and 40% less in communication overhead.
arXiv Detail & Related papers (2023-09-27T05:19:32Z) - BiBench: Benchmarking and Analyzing Network Binarization [72.59760752906757]
Network binarization emerges as one of the most promising compression approaches offering extraordinary computation and memory savings.
Common challenges of binarization, such as accuracy degradation and efficiency limitation, suggest that its attributes are not fully understood.
We present BiBench, a rigorously designed benchmark with in-depth analysis for network binarization.
arXiv Detail & Related papers (2023-01-26T17:17:16Z) - BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to
Real-Network Performance [54.214426436283134]
Deep neural networks, such as the Deep-FSMN, have been widely studied for keyword spotting (KWS) applications.
We present a strong yet efficient binary neural network for KWS, namely BiFSMNv2, pushing it to the real-network accuracy performance.
We highlight that benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage-saving on edge hardware.
arXiv Detail & Related papers (2022-11-13T18:31:45Z) - VELVET: a noVel Ensemble Learning approach to automatically locate
VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z) - CARAFE++: Unified Content-Aware ReAssembly of FEatures [132.49582482421246]
We propose unified Content-Aware ReAssembly of FEatures (CARAFE++), a universal, lightweight and highly effective operator to fulfill this goal.
CARAFE++ generates adaptive kernels on-the-fly to enable instance-specific content-aware handling.
It shows consistent and substantial gains across all the tasks with negligible computational overhead.
arXiv Detail & Related papers (2020-12-07T07:34:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.