Scaling Symbolic Execution to Large Software Systems
- URL: http://arxiv.org/abs/2408.01909v1
- Date: Sun, 4 Aug 2024 02:54:58 GMT
- Title: Scaling Symbolic Execution to Large Software Systems
- Authors: Gabor Horvath, Reka Kovacs, Zoltan Porkolab,
- Abstract summary: Symbolic execution is a popular static analysis technique used both in program verification and in bug detection software.
We focus on an error finding framework called the Clang Static Analyzer, and the infrastructure built around it named CodeChecker.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Static analysis is the analysis of a program without executing it, usually carried out by an automated tool. Symbolic execution is a popular static analysis technique used both in program verification and in bug detection software. It works by interpreting the code, introducing a symbol for each value unknown at compile time (e.g. user-given inputs), and carrying out calculations symbolically. The analysis engine strives to explore multiple execution paths simultaneously, although checking all paths is an intractable problem, due to the vast number of possibilities. We focus on an error finding framework called the Clang Static Analyzer, and the infrastructure built around it named CodeChecker. The emphasis is on achieving end-to-end scalability. This includes the run time and memory consumption of the analysis, bug presentation to the users, automatic false positive suppression, incremental analysis, pattern discovery in the results, and usage in continuous integration loops. We also outline future directions and open problems concerning these tools. While a rich literature exists on program verification software, error finding tools normally need to settle for survey papers on individual techniques. In this paper, we not only discuss individual methods, but also how these decisions interact and reinforce each other, creating a system that is greater than the sum of its parts. Although the Clang Static Analyzer can only handle C-family languages, the techniques introduced in this paper are mostly language-independent and applicable to other similar static analysis tools.
Related papers
- Easing Maintenance of Academic Static Analyzers [0.0]
Mopsa is a static analysis platform that aims at being sound.
This article documents the tools and techniques we have come up with to simplify the maintenance of Mopsa since 2017.
arXiv Detail & Related papers (2024-07-17T11:29:21Z) - Efficacy of static analysis tools for software defect detection on open-source projects [0.0]
The study used popular analysis tools such as SonarQube, PMD, Checkstyle, and FindBugs to perform the comparison.
The study results show that SonarQube performs considerably well than all other tools in terms of its defect detection.
arXiv Detail & Related papers (2024-05-20T19:05:32Z) - Customizing Static Analysis using Codesearch [1.7205106391379021]
A commonly used language to describe a range of static analysis applications is Datalog.
We aim to make building custom static analysis tools much easier for developers, while at the same time providing a familiar framework for application security and static analysis experts.
Our approach introduces a language called StarLang, a variant of Datalog which only includes programs with a fast runtime.
arXiv Detail & Related papers (2024-04-19T09:50:02Z) - Integrating Static Code Analysis Toolchains [0.8246494848934447]
State of the art toolchains support features for either test execution and build automation or traceability between tests, requirements and design information.
Our approach combines all those features and extends traceability to the source code level, incorporating static code analysis.
arXiv Detail & Related papers (2024-03-09T18:59:50Z) - Symbol-Specific Sparsification of Interprocedural Distributive
Environment Problems [3.9777369380822956]
This paper presents Sparse IDE, a framework that realizes sparsification for any static analysis that fits the Interprocedural Distributive Environment (IDE) framework.
We design, implement and evaluate a linear constant propagation analysis client on top of SparseHeros.
arXiv Detail & Related papers (2024-01-26T12:31:30Z) - ControlLLM: Augment Language Models with Tools by Searching on Graphs [97.62758830255002]
We present ControlLLM, a novel framework that enables large language models (LLMs) to utilize multi-modal tools for solving real-world tasks.
Our framework comprises three key components: (1) a textittask decomposer that breaks down a complex task into clear subtasks with well-defined inputs and outputs; (2) a textitThoughts-on-Graph (ToG) paradigm that searches the optimal solution path on a pre-built tool graph; and (3) an textitexecution engine with a rich toolbox that interprets the solution path and runs the
arXiv Detail & Related papers (2023-10-26T21:57:21Z) - Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z) - A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
static analysis tools such as linters, which can detect errors without running the program, haven't been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z) - Structural Analysis of Branch-and-Cut and the Learnability of Gomory
Mixed Integer Cuts [88.94020638263467]
The incorporation of cutting planes within the branch-and-bound algorithm, known as branch-and-cut, forms the backbone of modern integer programming solvers.
We conduct a novel structural analysis of branch-and-cut that pins down how every step of the algorithm is affected by changes in the parameters defining the cutting planes added to the input integer program.
Our main application of this analysis is to derive sample complexity guarantees for using machine learning to determine which cutting planes to apply during branch-and-cut.
arXiv Detail & Related papers (2022-04-15T03:32:40Z) - Searching for More Efficient Dynamic Programs [61.79535031840558]
We describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a search procedure to improve this metric.
We show that in practice, automated search can find substantial improvements to the initial program.
arXiv Detail & Related papers (2021-09-14T20:52:55Z) - D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using
Differential Analysis [55.15995704119158]
We propose D2A, a differential analysis based approach to label issues reported by static analysis tools.
We use D2A to generate a large labeled dataset to train models for vulnerability identification.
arXiv Detail & Related papers (2021-02-16T07:46:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.