Do Unit Proofs Work? An Empirical Study of Compositional Bounded Model Checking for Memory Safety Verification
- URL: http://arxiv.org/abs/2503.13762v1
- Date: Mon, 17 Mar 2025 22:55:12 GMT
- Title: Do Unit Proofs Work? An Empirical Study of Compositional Bounded Model Checking for Memory Safety Verification
- Authors: Paschal C. Amusuo, Owen Cochell, Taylor Le Lievre, Parth V. Patil, Aravind Machiry, James C. Davis
- Abstract summary: Memory safety defects pose a major threat to software reliability, enabling cyberattacks, outages, and crashes. To mitigate these risks, organizations adopt Compositional Bounded Model Checking (BMC), using unit proofs to formally verify memory safety. However, methods for creating unit proofs vary across organizations and are inconsistent within the same project, leading to errors and missed defects. This work presents the first empirical study on unit proofing for memory safety verification.
- Score: 4.826167697662746
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Memory safety defects pose a major threat to software reliability, enabling cyberattacks, outages, and crashes. To mitigate these risks, organizations adopt Compositional Bounded Model Checking (BMC), using unit proofs to formally verify memory safety. However, methods for creating unit proofs vary across organizations and are inconsistent within the same project, leading to errors and missed defects. In addition, unit proofing remains understudied, with no systematic development methods or empirical evaluations. This work presents the first empirical study on unit proofing for memory safety verification. We introduce a systematic method for creating unit proofs that leverages verification feedback and objective criteria. Using this approach, we develop 73 unit proofs for four embedded operating systems and evaluate their effectiveness, characteristics, cost, and generalizability. Our results show unit proofs are cost-effective, detecting 74% of recreated defects, with an additional 9% found with increased BMC bounds, and 19 new defects exposed. We also found that embedded software requires small unit proofs, which can be developed in 87 minutes and executed in 61 minutes on average. These findings provide practical guidance for engineers and empirical data to inform tooling design.
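For context, the unit proof below is an illustrative sketch only, not an artifact from this paper. In the compositional BMC workflow the abstract describes, a unit proof is a small harness that calls one function with nondeterministic inputs constrained by assumed preconditions, and a checker such as CBMC then discharges memory-safety checks up to a bound. The target function `buffer_append`, the capacity `BUF_CAP`, and the harness structure are assumptions made for this example; `__CPROVER_assume`, `__CPROVER_assert`, and the `nondet_` prefix are standard CBMC conventions.

```c
/*
 * Illustrative unit proof (verification harness) in the CBMC style.
 * The target function `buffer_append` is hypothetical; the pattern is:
 * nondeterministic inputs, assumed preconditions, a bounded call, and
 * memory-safety checks discharged by the model checker.
 */
#include <stddef.h>
#include <string.h>

#define BUF_CAP 16  /* small bound keeps the proof tractable */

typedef struct {
    char data[BUF_CAP];
    size_t len;
} buffer_t;

/* Hypothetical function under verification. */
static int buffer_append(buffer_t *buf, const char *src, size_t n) {
    if (buf == NULL || src == NULL || buf->len + n > BUF_CAP)
        return -1;
    memcpy(buf->data + buf->len, src, n);
    buf->len += n;
    return 0;
}

/* CBMC treats any undefined nondet_* function as a nondeterministic value. */
size_t nondet_size_t(void);

void harness(void) {
    buffer_t buf;
    buf.len = nondet_size_t();
    __CPROVER_assume(buf.len <= BUF_CAP);   /* precondition on buffer state */

    size_t n = nondet_size_t();
    __CPROVER_assume(n <= BUF_CAP);         /* bound the appended length */

    char src[BUF_CAP];                      /* contents left nondeterministic */
    (void)buffer_append(&buf, src, n);      /* bounds/pointer checks apply here */

    __CPROVER_assert(buf.len <= BUF_CAP, "length stays within capacity");
}
```

A typical invocation for such a harness would be along the lines of `cbmc harness.c --function harness --bounds-check --pointer-check --unwind 20`, which enables the array-bounds and pointer checks that this kind of unit proof targets.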
Related papers
- Advancing Embodied Agent Security: From Safety Benchmarks to Input Moderation [52.83870601473094]
Embodied agents exhibit immense potential across a multitude of domains.
Existing research predominantly concentrates on the security of general large language models.
This paper introduces a novel input moderation framework, meticulously designed to safeguard embodied agents.
arXiv Detail & Related papers (2025-04-22T08:34:35Z) - LeakGuard: Detecting Memory Leaks Accurately and Scalably [3.256598917442277]
LeakGuard is a memory leak detection tool that provides a satisfactory balance of accuracy and scalability.
For accuracy, LeakGuard analyzes the behaviors of library and developer-defined memory allocation and deallocation functions.
For scalability, LeakGuard examines each function of interest independently by using its function summary and under-constrained symbolic execution technique.
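As a hedged illustration (not taken from the LeakGuard paper), the function below shows the kind of error-path leak that a summary-based detector can report by analyzing one function in isolation; the function name and sizes are invented for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative only: a leak on the error path, the kind of defect a
 * function-summary-based leak detector can report without whole-program
 * analysis. */
char *read_name(FILE *f) {
    char *buf = malloc(64);
    if (buf == NULL)
        return NULL;
    if (fgets(buf, 64, f) == NULL)
        return NULL;              /* BUG: `buf` is never freed on this path */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;                   /* caller is expected to free() */
}
```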
arXiv Detail & Related papers (2025-04-06T09:11:37Z) - Automated Proof Generation for Rust Code via Self-Evolution [69.25795662658356]
We introduce SAFE, a novel framework that overcomes the lack of human-written proofs to enable automated proof generation for Rust code.
We demonstrate superior efficiency and precision compared to GPT-4o.
This advancement leads to a significant improvement in performance, achieving a 70.50% accuracy rate in a benchmark crafted by human experts.
arXiv Detail & Related papers (2024-10-21T08:15:45Z) - Enabling Unit Proofing for Software Implementation Verification [2.7325338323814328]
We propose a research agenda for a unit proofing framework, both methods and tools.
This will enable engineers to discover code-level defects early.
arXiv Detail & Related papers (2024-10-18T18:37:36Z) - What Makes and Breaks Safety Fine-tuning? A Mechanistic Study [64.9691741899956]
Safety fine-tuning helps align Large Language Models (LLMs) with human preferences for their safe deployment.
We design a synthetic data generation framework that captures salient aspects of an unsafe input.
Using this, we investigate three well-known safety fine-tuning methods.
arXiv Detail & Related papers (2024-07-14T16:12:57Z) - To Err is Machine: Vulnerability Detection Challenges LLM Reasoning [8.602355712876815]
We present a challenging code reasoning task: vulnerability detection. State-of-the-art (SOTA) models reported only 54.5% Balanced Accuracy in our vulnerability detection evaluation. New models, new training methods, or more execution-specific pretraining data may be needed to conquer vulnerability detection.
arXiv Detail & Related papers (2024-03-25T21:47:36Z) - Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
However, backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z) - Rely-guarantee Reasoning about Concurrent Memory Management: Correctness, Safety and Security [0.0]
An incorrect specification or implementation of memory management may lead to system crashes or exploitable attacks.
This article presents the first formal specification and mechanized proof of a concurrent memory management for a real-world OS.
arXiv Detail & Related papers (2023-09-17T03:41:10Z) - Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
Their defense, MESAS, is the first that is robust against strong adaptive adversaries, effective in real-world data scenarios, and with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification [8.733354577147093]
This paper introduces an innovative approach that combines Large Language Models (LLMs) with Formal Verification strategies for automatic software vulnerability repair.
We present the ESBMC-AI framework as a proof of concept, leveraging the well-recognized and industry-adopted Efficient SMT-based Context-Bounded Model Checker (ESBMC) and a pre-trained transformer model.
Our results demonstrate ESBMC-AI's capability to automate the detection and repair of issues such as buffer overflow, arithmetic overflow, and pointer dereference failures with high accuracy.
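For illustration only (this snippet is not from the ESBMC-AI paper), the following code contains an off-by-one buffer overflow of the kind a bounded model checker reports when bounds checking is enabled; the function and sizes are hypothetical.

```c
#include <stddef.h>

/* Illustrative only: off-by-one overflow that a BMC tool such as ESBMC
 * or CBMC flags when array-bounds checking is enabled. */
void zero_fill(int *dst, size_t len) {
    for (size_t i = 0; i <= len; i++)   /* BUG: `<=` writes one past the end */
        dst[i] = 0;
}

int main(void) {
    int buf[8];
    zero_fill(buf, 8);   /* writes buf[8], out of bounds */
    return 0;
}
```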
arXiv Detail & Related papers (2023-05-24T05:54:10Z) - PAC-Based Formal Verification for Out-of-Distribution Data Detection [4.406331747636832]
This study places probably approximately correct (PAC) based guarantees on OOD detection using the encoding process within VAEs.
It is used to bound the detection error on unfamiliar instances with user-defined confidence.
arXiv Detail & Related papers (2023-04-04T07:33:02Z) - No Need to Know Physics: Resilience of Process-based Model-free Anomaly Detection for Industrial Control Systems [95.54151664013011]
We present a novel framework to generate adversarial spoofing signals that violate physical properties of the system.
We analyze four anomaly detectors published at top security conferences.
arXiv Detail & Related papers (2020-12-07T11:02:44Z)