Levels of Binary Equivalence for the Comparison of Binaries from Alternative Builds
- URL: http://arxiv.org/abs/2410.08427v1
- Date: Fri, 11 Oct 2024 00:16:26 GMT
- Title: Levels of Binary Equivalence for the Comparison of Binaries from Alternative Builds
- Authors: Jens Dietrich, Tim White, Behnaz Hassanshahi, Paddy Krishnan
- Abstract summary: Build platform variability can strengthen security as it facilitates the detection of compromised build environments.
The availability of multiple binaries built from the same sources creates new challenges and opportunities.
Answering such questions requires a notion of equivalence between binaries.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In response to challenges in software supply chain security, several organisations have created infrastructures to independently build commodity open source projects and release the resulting binaries. Build platform variability can strengthen security as it facilitates the detection of compromised build environments. Furthermore, by improving the security posture of the build platform and collecting provenance information during the build, the resulting artifacts can be used with greater trust. Such offerings are now available from Google, Oracle and RedHat. The availability of multiple binaries built from the same sources creates new challenges and opportunities, and raises questions such as: 'Does build A confirm the integrity of build B?' or 'Can build A reveal a compromised build B?'. To answer such questions requires a notion of equivalence between binaries. We demonstrate that the obvious approach based on bitwise equality has significant shortcomings in practice, and that there is value in opting for alternative notions. We conceptualise this by introducing levels of equivalence, inspired by clone detection types. We demonstrate the value of these new levels through several experiments. We construct a dataset consisting of Java binaries built from the same sources independently by different providers, resulting in 14,156 pairs of binaries in total. We then compare the compiled class files in those jar files and find that for 3,750 pairs of jars (26.49%) there is at least one such file that is different, also forcing the jar files and their cryptographic hashes to be different. However, based on the new equivalence levels, we can still establish that many of them are practically equivalent. We evaluate several candidate equivalence relations on a semi-synthetic dataset that provides oracles consisting of pairs of binaries that either should be, or must not be equivalent.
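The strictest equivalence level discussed in the abstract, bitwise equality, can be checked by comparing the bytes of each compiled class file inside two jars. The following is a minimal illustrative sketch of that idea, not the authors' actual tooling: it hashes every `.class` entry in each jar and reports the entries whose bytes differ, which is what forces the jar files and their cryptographic hashes apart even when only one class changed.

```python
import hashlib
import zipfile

def entry_hashes(jar_path):
    """Map each .class entry in a jar to the SHA-256 of its bytes."""
    with zipfile.ZipFile(jar_path) as jar:
        return {
            name: hashlib.sha256(jar.read(name)).hexdigest()
            for name in jar.namelist()
            if name.endswith(".class")
        }

def differing_classes(jar_a, jar_b):
    """Return the class entries whose bytes differ between two jars,
    i.e. the pairs that fail bitwise equality (the strictest level).
    Entries present in only one jar are also reported."""
    a, b = entry_hashes(jar_a), entry_hashes(jar_b)
    return sorted(
        name for name in a.keys() | b.keys()
        if a.get(name) != b.get(name)
    )
```

An empty result means the two builds are bitwise-identical at the class-file level; a non-empty result is where the weaker, clone-detection-inspired equivalence levels proposed in the paper would be needed to decide whether the builds are still practically equivalent.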
Related papers
- DALEQ -- Explainable Equivalence for Java Bytecode [1.4003844469021811]
We present daleq, a tool that disassembles Java bytecode into a relational database. It can then normalise this database by applying datalog rules and infer equivalence between two classes. We demonstrate the impact of daleq in an industrial context through a large-scale evaluation involving 2,714 pairs of jars.
arXiv Detail & Related papers (2025-08-03T01:17:25Z) - Decompile-Bench: Million-Scale Binary-Source Function Pairs for Real-World Binary Decompilation [12.983487033256448]
Decompile-Bench is the first open-source dataset comprising two million binary-source function pairs condensed from 100 million collected function pairs. For evaluation purposes, we developed a benchmark, Decompile-Bench-Eval, including manually crafted binaries from the well-established HumanEval and MBPP. We find that fine-tuning with Decompile-Bench yields a 20% improvement over previous benchmarks in terms of the re-executability rate.
arXiv Detail & Related papers (2025-05-19T03:34:33Z) - BinMetric: A Comprehensive Binary Analysis Benchmark for Large Language Models [50.17907898478795]
We introduce BinMetric, a benchmark designed to evaluate the performance of large language models on binary analysis tasks. BinMetric comprises 1,000 questions derived from 20 real-world open-source projects across 6 practical binary analysis tasks. Our empirical study on this benchmark investigates the binary analysis capabilities of various state-of-the-art LLMs, revealing their strengths and limitations in this field.
arXiv Detail & Related papers (2025-05-12T08:54:07Z) - Attestable builds: compiling verifiable binaries on untrusted systems using trusted execution environments [3.207381224848367]
Attestable builds provide strong source-to-binary correspondence in software artifacts. We tackle the challenge of opaque build pipelines that disconnect the trust between source code and the final binary artifact.
arXiv Detail & Related papers (2025-05-05T10:00:04Z) - ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.
This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z) - Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation [37.25839260805938]
Skeleton-Guided-Translation is a framework for repository-level Java to C# code translation with fine-grained quality evaluation.
We present TRANSREPO-BENCH, a benchmark of high quality open-source Java repositories and their corresponding C# skeletons.
arXiv Detail & Related papers (2025-01-27T13:44:51Z) - Binary Code Similarity Detection via Graph Contrastive Learning on Intermediate Representations [52.34030226129628]
Binary Code Similarity Detection (BCSD) plays a crucial role in numerous fields, including vulnerability detection, malware analysis, and code reuse identification.
In this paper, we propose IRBinDiff, which mitigates compilation differences by leveraging LLVM-IR with higher-level semantic abstraction.
Our extensive experiments, conducted under varied compilation settings, demonstrate that IRBinDiff outperforms other leading BCSD methods in both One-to-one comparison and One-to-many search scenarios.
arXiv Detail & Related papers (2024-10-24T09:09:20Z) - BinSimDB: Benchmark Dataset Construction for Fine-Grained Binary Code Similarity Analysis [6.093226756571566]
We construct a benchmark dataset for fine-grained binary code similarity analysis called BinSimDB.
Specifically, we propose BMerge and BPair algorithms to bridge the discrepancies between two binary code snippets.
The experimental results demonstrate that BinSimDB significantly improves the performance of binary code similarity comparison.
arXiv Detail & Related papers (2024-10-14T05:13:48Z) - Assemblage: Automatic Binary Dataset Construction for Machine Learning [35.674339346299654]
Assemblage is a cloud-based distributed system that crawls, configures, and builds Windows PE binaries.
We have run Assemblage on AWS over the past year, producing 890k Windows PE and 428k Linux ELF binaries across 29 configurations.
arXiv Detail & Related papers (2024-05-07T04:10:01Z) - Advanced Detection of Source Code Clones via an Ensemble of Unsupervised Similarity Measures [0.0]
This research introduces a novel ensemble learning approach for code similarity assessment.
The key idea is that the strengths of a diverse set of similarity measures can complement each other and mitigate individual weaknesses.
arXiv Detail & Related papers (2024-05-03T13:42:49Z) - BinGo: Identifying Security Patches in Binary Code with Graph Representation Learning [19.22004583230725]
We propose BinGo, a new security patch detection system for binary code.
BinGo consists of four phases, namely, patch data pre-processing, graph extraction, embedding generation, and graph representation learning.
Our experimental results show BinGo can achieve up to 80.77% accuracy in identifying security patches between two neighboring versions of binary code.
arXiv Detail & Related papers (2023-12-13T06:35:39Z) - CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion [86.01508183157613]
CrossCodeEval is built on a diverse set of real-world, open-sourced, permissively-licensed repositories in four popular programming languages.
We show that CrossCodeEval is extremely challenging when the relevant cross-file context is absent.
We also show that CrossCodeEval can also be used to measure the capability of code retrievers.
arXiv Detail & Related papers (2023-10-17T13:18:01Z) - On the Security Blind Spots of Software Composition Analysis [46.1389163921338]
We present a novel approach to detect vulnerable clones in the Maven repository.
We retrieve over 53k potential vulnerable clones from Maven Central.
We detect 727 confirmed vulnerable clones and synthesize a testable proof-of-vulnerability project for each of those.
arXiv Detail & Related papers (2023-06-08T20:14:46Z) - Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies [52.691032025163175]
Existing Binary Neural Networks (BNNs) operate mainly on local convolutions with binarization function.
We present new designs of binary neural modules, which outperform existing binary neural modules by a large margin.
arXiv Detail & Related papers (2022-09-03T11:51:04Z) - Repo2Vec: A Comprehensive Embedding Approach for Determining Repository Similarity [2.095199622772379]
Repo2Vec is a comprehensive embedding approach to represent a repository as a distributed vector.
We evaluate our method with two real datasets from GitHub for a combined 1013 repositories.
arXiv Detail & Related papers (2021-07-11T18:57:03Z) - Contextualizing Meta-Learning via Learning to Decompose [125.76658595408607]
We propose Learning to Decompose Network (LeadNet) to contextualize the meta-learned "support-to-target" strategy.
LeadNet learns to automatically select the strategy associated with the right context by incorporating the change of comparison across contexts with polysemous embeddings.
arXiv Detail & Related papers (2021-06-15T13:10:56Z) - D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis [55.15995704119158]
We propose D2A, a differential analysis based approach to label issues reported by static analysis tools.
We use D2A to generate a large labeled dataset to train models for vulnerability identification.
arXiv Detail & Related papers (2021-02-16T07:46:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.