Benchmarking Symbolic Execution Using Constraint Problems -- Initial
Results
- URL: http://arxiv.org/abs/2001.07914v1
- Date: Wed, 22 Jan 2020 08:48:55 GMT
- Title: Benchmarking Symbolic Execution Using Constraint Problems -- Initial
Results
- Authors: Sahil Verma, Roland H.C. Yap
- Abstract summary: Symbolic execution is a powerful technique for bug finding and program testing.
We transform CSP benchmarks into C programs suitable for testing the reasoning capabilities of symbolic execution tools.
- Score: 6.961253535504978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Symbolic execution is a powerful technique for bug finding and program
testing. It is successful in finding bugs in real-world code. The core
reasoning techniques use constraint solving, path exploration, and search,
which are also the same techniques used in solving combinatorial problems,
e.g., finite-domain constraint satisfaction problems (CSPs). We propose CSP
instances as more challenging benchmarks to evaluate the effectiveness of the
core techniques in symbolic execution. We transform CSP benchmarks into C
programs suitable for testing the reasoning capabilities of symbolic execution
tools. From a single CSP P, we generate different C programs depending on the
transformation choice. Preliminary testing with the KLEE, Tracer-X, and LLBMC
tools shows substantial runtime differences arising from transformation and solver
choice. Our C benchmarks are effective in showing the limitations of existing
symbolic execution tools. The motivation for this work is our belief that
benchmarks of this form can spur the development and engineering of improved
core reasoning in symbolic execution engines.
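To make the idea of turning a CSP into a C program concrete, here is a minimal sketch. The toy CSP (three variables over the domain 0..3 with the constraints x != y, y != z, x + y == z), the two encodings, and all function names are illustrative assumptions, not the paper's actual transformations or benchmarks. The block compiles as ordinary C; the KLEE driver is only built when the hypothetical `USE_KLEE` macro is defined and uses the KLEE intrinsics `klee_make_symbolic` and `klee_assert` from `klee/klee.h`.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef USE_KLEE
#include <klee/klee.h> /* only needed when the driver below runs under KLEE */
#endif

enum { DOMAIN_SIZE = 4 }; /* assumed toy domain: 0..3 */

bool in_domain(int v) { return v >= 0 && v < DOMAIN_SIZE; }

/* Encoding A (assumed): the whole CSP as a single conjunction. */
bool csp_holds_conjunction(int x, int y, int z) {
    return in_domain(x) && in_domain(y) && in_domain(z) &&
           x != y && y != z && x + y == z;
}

/* Encoding B (assumed): one branch per constraint, so a symbolic
 * executor forks a path at every test. */
bool csp_holds_branches(int x, int y, int z) {
    if (!in_domain(x)) return false;
    if (!in_domain(y)) return false;
    if (!in_domain(z)) return false;
    if (x == y) return false;
    if (y == z) return false;
    if (x + y != z) return false;
    return true;
}

/* Brute-force reference: count the solutions accepted by an encoding. */
int count_solutions(bool (*holds)(int, int, int)) {
    int count = 0;
    for (int x = 0; x < DOMAIN_SIZE; x++)
        for (int y = 0; y < DOMAIN_SIZE; y++)
            for (int z = 0; z < DOMAIN_SIZE; z++)
                if (holds(x, y, z)) count++;
    return count;
}

/* Sanity check: both encodings agree, including just outside the domain. */
void check_encodings_agree(void) {
    for (int x = -1; x <= DOMAIN_SIZE; x++)
        for (int y = -1; y <= DOMAIN_SIZE; y++)
            for (int z = -1; z <= DOMAIN_SIZE; z++)
                assert(csp_holds_conjunction(x, y, z) ==
                       csp_holds_branches(x, y, z));
}

#ifdef USE_KLEE
/* Under KLEE, a reported assertion failure corresponds to a CSP solution;
 * no failure on any path means the CSP is unsatisfiable. */
int main(void) {
    int x, y, z;
    klee_make_symbolic(&x, sizeof x, "x");
    klee_make_symbolic(&y, sizeof y, "y");
    klee_make_symbolic(&z, sizeof z, "z");
    klee_assert(!csp_holds_branches(x, y, z));
    return 0;
}
#endif
```

Both encodings describe the same CSP, but they may present a tool with different numbers of paths and differently shaped solver queries, which is the kind of transformation choice the abstract refers to.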
Related papers
- To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning [55.52872152909785]
Chain-of-thought (CoT) via prompting is the de facto method for eliciting reasoning capabilities from large language models (LLMs).
We show that CoT gives strong performance benefits primarily on tasks involving math or logic, with much smaller gains on other types of tasks.
arXiv Detail & Related papers (2024-09-18T17:55:00Z)
- DOCE: Finding the Sweet Spot for Execution-Based Code Generation [69.5305729627198]
We propose a comprehensive framework that includes candidate generation, $n$-best reranking, minimum Bayes risk (MBR) decoding, and self-debugging as the core components.
Our findings highlight the importance of execution-based methods and the gap between execution-based and execution-free methods.
arXiv Detail & Related papers (2024-08-25T07:10:36Z)
- SEPE-SQED: Symbolic Quick Error Detection by Semantically Equivalent Program Execution [34.48913676566932]
Symbolic quick error detection (SQED) has greatly improved efficiency in formal chip verification.
We propose a new variant called symbolic quick error detection by semantically equivalent program execution (SEPE-SQED).
SEPE-SQED effectively detects single-instruction bugs by differentiating their impact on the original instruction and its semantically equivalent program.
arXiv Detail & Related papers (2024-04-04T03:11:24Z)
- Parallel Program Analysis on Path Ranges [3.018638214344819]
Ranged symbolic execution performs symbolic execution on program parts, so-called path ranges, in parallel.
We present a verification approach that splits programs into path ranges and then runs arbitrary analyses on the ranges in parallel.
arXiv Detail & Related papers (2024-02-19T08:26:52Z)
- Quantum Algorithm Exploration using Application-Oriented Performance Benchmarks [0.0]
The QED-C suite of Application-Oriented Benchmarks provides the ability to gauge performance characteristics of quantum computers.
We investigate challenges in broadening the relevance of this benchmarking methodology to applications of greater complexity.
arXiv Detail & Related papers (2024-02-14T06:55:50Z)
- Divide, Conquer and Verify: Improving Symbolic Execution Performance [0.14999444543328289]
Symbolic Execution is a formal method that can be used to verify the behavior of computer programs and detect software vulnerabilities.
Despite advances in performance in recent years, Symbolic Execution is too slow to be applied to real-world software.
We present a divide-and-conquer approach for symbolic execution by executing individual slices and later combining the side effects.
arXiv Detail & Related papers (2023-10-05T15:21:10Z)
- Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning [75.74103236299477]
Chain-of-thought (CoT) prompting and tool augmentation have been validated as effective practices for improving large language models.
We propose a new approach, namely DELI, that can deliberate the reasoning steps with tool interfaces.
Experimental results on CARP and six other datasets show that the proposed DELI mostly outperforms competitive baselines.
arXiv Detail & Related papers (2023-06-04T17:02:59Z)
- COPS: Controlled Pruning Before Training Starts [68.8204255655161]
State-of-the-art deep neural network (DNN) pruning techniques, applied one-shot before training starts, evaluate sparse architectures with the help of a single criterion -- called pruning score.
In this work we do not concentrate on a single pruning criterion, but provide a framework for combining arbitrary generalized synaptic scores (GSSs) to create more powerful pruning strategies.
arXiv Detail & Related papers (2021-07-27T08:48:01Z)
- Performance Evaluation of Adversarial Attacks: Discrepancies and Solutions [51.8695223602729]
Adversarial attack methods have been developed to challenge the robustness of machine learning models.
We propose a Piece-wise Sampling Curving (PSC) toolkit to effectively address the discrepancy.
The PSC toolkit offers options for balancing computational cost and evaluation effectiveness.
arXiv Detail & Related papers (2021-04-22T14:36:51Z)
- CoCoMoT: Conformance Checking of Multi-Perspective Processes via SMT (Extended Version) [62.96267257163426]
We introduce the CoCoMoT (Computing Conformance Modulo Theories) framework.
First, we show how SAT-based encodings studied in the pure control-flow setting can be lifted to our data-aware case.
Second, we introduce a novel preprocessing technique based on a notion of property-preserving clustering.
arXiv Detail & Related papers (2021-03-18T20:22:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.