Customizing Static Analysis using Codesearch
- URL: http://arxiv.org/abs/2404.12747v1
- Date: Fri, 19 Apr 2024 09:50:02 GMT
- Title: Customizing Static Analysis using Codesearch
- Authors: Avi Hayoun, Veselin Raychev, Jack Hair,
- Abstract summary: A commonly used language to describe a range of static analysis applications is Datalog.
We aim to make building custom static analysis tools much easier for developers, while at the same time providing a familiar framework for application security and static analysis experts.
Our approach introduces a language called StarLang, a variant of Datalog which only includes programs with a fast runtime.
- Score: 1.7205106391379021
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Static analysis is a growing application of software engineering, leading to a range of essential security tools, bug-finding tools, as well as software verification. Recent years have seen an increase in universal static analysis tools that validate a range of properties and allow customizing parts of the scanner to validate additional properties or "static analysis rules". A commonly used language to describe a range of static analysis applications is Datalog. Unfortunately, the language is still non-trivial to use, leading to analyses that are difficult to implement in a precise but performant way. In this work, we aim to make building custom static analysis tools much easier for developers, while at the same time providing a familiar framework for application security and static analysis experts. Our approach introduces a language called StarLang, a variant of Datalog which only includes programs with a fast runtime, by means of a low-complexity decision procedure.
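The abstract describes static analysis rules written in Datalog. StarLang itself is not shown here, so the following is a minimal Python sketch of how a single Datalog-style rule — `tainted(Y) :- tainted(X), flows(X, Y).` — could be evaluated bottom-up to a fixpoint. All relation and node names are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not actual StarLang) of a Datalog-style taint rule:
#   tainted(Y) :- tainted(X), flows(X, Y).
# evaluated naively until no new facts are derived.

def taint_fixpoint(sources, flows):
    """Apply the rule repeatedly until the set of tainted facts stabilizes."""
    tainted = set(sources)
    changed = True
    while changed:
        changed = False
        for x, y in flows:
            if x in tainted and y not in tainted:
                tainted.add(y)
                changed = True
    return tainted

# Hypothetical data-flow edges: user_input -> query -> db_exec
flows = {("user_input", "query"), ("query", "db_exec"), ("config", "logger")}
print(sorted(taint_fixpoint({"user_input"}, flows)))
# -> ['db_exec', 'query', 'user_input']
```

Real Datalog engines use semi-naive evaluation and indexing; the point here is only that a rule set like this has a simple bottom-up semantics with a guaranteed fixpoint.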
Related papers
- Scaling Symbolic Execution to Large Software Systems [0.0]
Symbolic execution is a popular static analysis technique used both in program verification and in bug detection software.
We focus on an error finding framework called the Clang Static Analyzer, and the infrastructure built around it named CodeChecker.
arXiv Detail & Related papers (2024-08-04T02:54:58Z)
- Easing Maintenance of Academic Static Analyzers [0.0]
Mopsa is a static analysis platform that aims at being sound.
This article documents the tools and techniques we have come up with to simplify the maintenance of Mopsa since 2017.
arXiv Detail & Related papers (2024-07-17T11:29:21Z)
- Efficacy of static analysis tools for software defect detection on open-source projects [0.0]
The study used popular analysis tools such as SonarQube, PMD, Checkstyle, and FindBugs to perform the comparison.
The study results show that SonarQube performs considerably better than all the other tools in terms of defect detection.
arXiv Detail & Related papers (2024-05-20T19:05:32Z)
- Integrating Static Code Analysis Toolchains [0.8246494848934447]
State of the art toolchains support features for either test execution and build automation or traceability between tests, requirements and design information.
Our approach combines all those features and extends traceability to the source code level, incorporating static code analysis.
arXiv Detail & Related papers (2024-03-09T18:59:50Z)
- E&V: Prompting Large Language Models to Perform Static Analysis by Pseudo-code Execution and Verification [7.745665775992235]
Large Language Models (LLMs) offer new capabilities for software engineering tasks.
LLMs simulate the execution of pseudo-code, effectively conducting static analysis encoded in the pseudo-code with minimal human effort.
E&V includes a verification process for pseudo-code execution without needing an external oracle.
arXiv Detail & Related papers (2023-12-13T19:31:00Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM, then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
Static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
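As a rough illustration of the idea above — detecting static errors in generated Python via abstract syntax trees — here is a flow-insensitive sketch using Python's `ast` module. The function name and the two error kinds (syntax errors, undefined names) are assumptions for illustration; the paper's actual framework and error taxonomy differ.

```python
import ast
import builtins

def static_errors(code):
    """Report syntax errors and names that are loaded but never bound.

    A rough, flow-insensitive approximation: collect every binding
    (assignments, function args, def/class names, imports), then flag
    any loaded name outside that set and the builtins.
    """
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error at line {exc.lineno}"]
    bound = set(dir(builtins))
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
    return [f"undefined name: {node.id}"
            for node in ast.walk(tree)
            if isinstance(node, ast.Name)
            and isinstance(node.ctx, ast.Load)
            and node.id not in bound]

print(static_errors("def f(x):\n    return x + y\n"))
# -> ['undefined name: y']
```

Because no execution is needed, a check like this scales to large batches of model completions, which is the appeal of static evaluation over execution-based benchmarks.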
arXiv Detail & Related papers (2023-06-05T19:23:34Z)
- Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)
- Rissanen Data Analysis: Examining Dataset Characteristics via Description Length [78.42578316883271]
We introduce a method to determine if a certain capability helps to achieve an accurate model of given data.
Since minimum program length is uncomputable, we estimate the labels' minimum description length (MDL) as a proxy.
We call the method Rissanen Data Analysis (RDA) after the father of MDL.
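The MDL proxy described above can be made concrete with a toy calculation: the description length of the labels under a model is the sum of `-log2 p(true label)` over examples, so anything that raises those probabilities shortens the code. The function name and the probability values below are illustrative assumptions, not from the paper.

```python
import math

def description_length_bits(probs):
    """Codelength of the labels under a model: sum of -log2 p(true label)."""
    return sum(-math.log2(p) for p in probs)

# If granting the model some capability raises its probabilities on the
# true labels, the labels compress better -> the capability helps.
without_capability = description_length_bits([0.5, 0.5, 0.5, 0.5])
with_capability = description_length_bits([0.8, 0.9, 0.8, 0.9])
print(without_capability, round(with_capability, 2))  # prints 4.0 0.95
```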
arXiv Detail & Related papers (2021-03-05T18:58:32Z)
- D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis [55.15995704119158]
We propose D2A, a differential analysis based approach to label issues reported by static analysis tools.
We use D2A to generate a large labeled dataset to train models for vulnerability identification.
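The core of differential labeling can be sketched in a few lines: run the analyzer before and after a bug-fixing commit and compare the reports. This is a simplified sketch inspired by the description above, not D2A's actual pipeline; the issue fingerprints and label names are hypothetical.

```python
# Hedged sketch of differential labeling: issues flagged before a
# bug-fixing commit but gone after it were likely real bugs; issues
# that persist across the fix are likely spurious.

def label_issues(before, after):
    """before/after: sets of issue fingerprints (checker name, location)."""
    return {
        # gone after the fix: the fix likely addressed a real bug
        "likely_true_positive": before - after,
        # still reported after the fix: likely a false alarm
        "likely_false_positive": before & after,
    }

before = {("BUFFER_OVERFLOW", "foo.c:42"), ("NULL_DEREF", "bar.c:7")}
after = {("NULL_DEREF", "bar.c:7")}
result = label_issues(before, after)
print(sorted(result["likely_true_positive"]))
```

Applied across many fix commits, set differences like this yield labeled examples at scale without manual triage, which is what makes the resulting dataset usable for training.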
arXiv Detail & Related papers (2021-02-16T07:46:53Z)
- Exploring Software Naturalness through Neural Language Models [56.1315223210742]
The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing.
We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks.
arXiv Detail & Related papers (2020-06-22T21:56:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.