Formally Verified Binary-level Pointer Analysis
- URL: http://arxiv.org/abs/2501.17766v1
- Date: Wed, 29 Jan 2025 16:57:15 GMT
- Title: Formally Verified Binary-level Pointer Analysis
- Authors: Freek Verbeek, Ali Shokri, Daniel Engel, Binoy Ravindran
- Abstract summary: This paper presents an approach to binary-level pointer analysis that is formally proven correct.
A salient property of our approach is that it first considers, generically, what proof obligations an abstract domain for pointer analysis must satisfy.
- Score: 1.9561775591923982
- Abstract: Binary-level pointer analysis can be of use in symbolic execution, testing, verification, and decompilation of software binaries. In various such contexts, it is crucial that the result is trustworthy, i.e., it can be formally established that the pointer designations are overapproximative. This paper presents an approach to binary-level pointer analysis that is formally proven correct. A salient property of our approach is that it first considers, generically, what proof obligations an abstract domain for pointer analysis must satisfy. This allows easy instantiation of different domains, varying in precision, while preserving the correctness of the analysis. In the trade-off between scalability and precision, such customization allows "meaningful" precision (sufficiently precise to ensure basic sanity properties, such as that relevant parts of the stack frame are not overwritten during function execution) while also allowing coarse analysis when pointer computations have become too obfuscated during compilation for sound and accurate bounds analysis. We experiment with three abstract domains of high, medium, and low precision. Evaluation shows that our approach soundly derives designations for memory writes in COTS binaries, in a context-sensitive, interprocedural fashion.
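To make the abstract-domain idea concrete, below is a minimal sketch in Python, offered purely as a hypothetical illustration: the paper's actual development is a formal proof, and the names PointerDomain, Region, and CoarseDomain are invented for this sketch. The interface fixes only the soundness (overapproximation) obligations, leaving precision to each instantiation.

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Region:
    """A coarse memory-region designation, e.g. a slice of the stack frame."""
    base: str  # symbolic base, e.g. "rsp0" for the initial stack pointer
    lo: int    # lower offset bound (inclusive)
    hi: int    # upper offset bound (inclusive)


class PointerDomain(ABC):
    """Generic abstract domain for pointer values.

    Informal proof obligations every instance must discharge:
      * abstraction is overapproximative: every concrete address a write may
        touch is covered by the regions returned by abstract_write;
      * join is an upper bound: the regions covered by join(x, y) include
        everything covered by x and by y.
    Only these obligations are fixed; precision is left to the instance.
    """

    @abstractmethod
    def abstract_write(self, base: str, lo: int, hi: int) -> FrozenSet[Region]:
        """Designate the regions a memory write may touch."""

    @abstractmethod
    def join(self, x: FrozenSet[Region], y: FrozenSet[Region]) -> FrozenSet[Region]:
        """Sound upper bound of two designations (e.g. at control-flow joins)."""


class CoarseDomain(PointerDomain):
    """Low-precision instance: keeps bounds as given and never refines them."""

    def abstract_write(self, base: str, lo: int, hi: int) -> FrozenSet[Region]:
        return frozenset({Region(base, lo, hi)})

    def join(self, x: FrozenSet[Region], y: FrozenSet[Region]) -> FrozenSet[Region]:
        # Set union is trivially an upper bound, hence sound but imprecise.
        return x | y


if __name__ == "__main__":
    d = CoarseDomain()
    w1 = d.abstract_write("rsp0", -16, -8)   # write into the local stack frame
    w2 = d.abstract_write("rsp0", -32, -24)
    print(d.join(w1, w2))                    # union of both designations

A higher-precision instance could, for example, split regions per stack slot or track value-set bounds, as long as it discharges the same two obligations.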
Related papers
- On the Role of Pre-trained Embeddings in Binary Code Analysis [7.161446721947512]
Pre-trained embeddings of assembly code have become a gold standard for solving binary code analysis tasks.
In contrast to natural language processing, label information is not scarce for many tasks in binary code analysis.
We systematically evaluate recent embeddings for assembly code on five downstream tasks using a corpus of 1.2 million functions.
arXiv Detail & Related papers (2025-02-12T10:50:46Z)
- Checkification: A Practical Approach for Testing Static Analysis Truths [0.0]
We propose a method for testing abstract interpretation-based static analyzers.
The main advantage of our approach lies in its simplicity, which stems directly from framing it within the Ciao assertion-based validation framework.
We have applied our approach to the CiaoPP static analyzer, resulting in the identification of many bugs with reasonable overhead.
arXiv Detail & Related papers (2025-01-21T12:38:04Z)
- StrTune: Data Dependence-based Code Slicing for Binary Similarity Detection with Fine-tuned Representation [5.41477941455399]
BCSD can address binary tasks such as malicious code snippets identification and binary patch analysis by comparing code patterns.
Because binaries are compiled with different compilation configurations, existing approaches still face notable limitations when comparing binary similarity.
We propose StrTune, which slices binary code based on data dependence and performs slice-level fine-tuning.
arXiv Detail & Related papers (2024-11-19T12:20:08Z)
- Smart Contract Vulnerability Detection based on Static Analysis and Multi-Objective Search [3.297959314391795]
This paper introduces a method for detecting vulnerabilities in smart contracts using static analysis and a multi-objective optimization algorithm.
We focus on four types of vulnerabilities: reentrancy, call stack overflow, integer overflow, and timestamp dependencies.
We validate our approach using an open-source dataset collected from Etherscan, containing 6,693 smart contracts.
arXiv Detail & Related papers (2024-09-30T23:28:17Z)
- FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [54.27040631527217]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries.
We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language.
We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
- DeepSec: Deciding Equivalence Properties for Security Protocols -- Improved theory and practice [8.735998284944436]
We contribute both to the theory and practice of this verification problem.
We establish new complexity results for static equivalence, trace equivalence and labelled bisimilarity.
Our procedure is the first to decide trace equivalence and labelled bisimilarity exactly for a large variety of cryptographic primitives.
arXiv Detail & Related papers (2022-11-06T22:01:04Z)
- Measuring the Interpretability of Unsupervised Representations via Quantized Reverse Probing [97.70862116338554]
We investigate the problem of measuring interpretability of self-supervised representations.
We formulate the latter as estimating the mutual information between the representation and a space of manually labelled concepts (the standard definition is recalled in the sketch after this list).
We use our method to evaluate a large number of self-supervised representations, ranking them by interpretability.
arXiv Detail & Related papers (2022-09-07T16:18:50Z)
- Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework [59.578339075658995]
We propose a purely point-based framework for joint crowd counting and individual localization.
We design an intuitive solution under this framework, called Point to Point Network (P2PNet).
arXiv Detail & Related papers (2021-07-27T11:41:50Z)
- Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation.
We introduce a novel approach for more accurate and efficient unseen-temporal segmentation.
We evaluate the proposed approach on DAVIS17 and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods in both segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
- Making Affine Correspondences Work in Camera Geometry Computation [62.7633180470428]
Local features provide region-to-region rather than point-to-point correspondences.
We propose guidelines for effective use of region-to-region matches in the course of a full model estimation pipeline.
Experiments show that affine solvers can achieve accuracy comparable to point-based solvers at faster run-times.
arXiv Detail & Related papers (2020-07-20T12:07:48Z)
- Deep Hough Transform for Semantic Line Detection [70.28969017874587]
We focus on a fundamental task of detecting meaningful line structures, a.k.a. semantic lines, in natural scenes.
Previous methods neglect the inherent characteristics of lines, leading to sub-optimal performance.
We propose a one-shot end-to-end learning framework for line detection.
arXiv Detail & Related papers (2020-03-10T13:08:42Z)
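For the interpretability entry above (Measuring the Interpretability of Unsupervised Representations via Quantized Reverse Probing), the mutual-information formulation it mentions corresponds, in standard notation, to

I(Z;C) \;=\; H(C) - H(C \mid Z) \;=\; \sum_{z,c} p(z,c)\,\log\frac{p(z,c)}{p(z)\,p(c)},

where Z is the (quantized) representation and C ranges over the manually labelled concepts. This is only the textbook definition; the paper's actual estimator, based on quantized reverse probing, may approximate p(c \mid z) differently.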
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.