Symbol-Specific Sparsification of Interprocedural Distributive
Environment Problems
- URL: http://arxiv.org/abs/2401.14813v1
- Date: Fri, 26 Jan 2024 12:31:30 GMT
- Title: Symbol-Specific Sparsification of Interprocedural Distributive
Environment Problems
- Authors: Kadiray Karakaya and Eric Bodden
- Abstract summary: This paper presents Sparse IDE, a framework that realizes sparsification for any static analysis that fits the Interprocedural Distributive Environment (IDE) framework.
We design, implement and evaluate a linear constant propagation analysis client on top of SparseHeros.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work has shown that one can often greatly speed up static analysis
by computing data flows not for every edge in the program's control-flow graph
but instead only along definition-use chains. This yields a so-called sparse
static analysis. Recent work on SparseDroid has shown that specifically taint
analysis can be "sparsified" with extraordinary effectiveness because the taint
state of one variable does not depend on those of others. This allows one to
soundly omit more flow-function computations than in the general case.
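To make this concrete, below is a minimal sketch of propagation along definition-use chains in straight-line code. The `Stmt` record, the `nextRelevant` helper, and the example trace are hypothetical illustrations, not the SparseDroid or Sparse IDE implementation; the sketch only shows why statements that neither define nor use the tracked symbol can be skipped soundly.

```java
import java.util.List;

/** Minimal sketch of sparse propagation along def-use chains
 *  (hypothetical types; not the SparseDroid/Sparse IDE code). */
public class SparsePropagation {
    record Stmt(String text, List<String> defs, List<String> uses) {
        @Override public String toString() { return text; }
    }

    /** A dense analysis applies a flow function at every statement.
     *  A sparse one forwards the fact for `symbol` directly to the
     *  next statement that defines or uses it; everything in between
     *  is skipped, because the fact cannot change there. */
    static Stmt nextRelevant(List<Stmt> trace, int from, String symbol) {
        for (int i = from + 1; i < trace.size(); i++) {
            Stmt s = trace.get(i);
            if (s.defs().contains(symbol) || s.uses().contains(symbol))
                return s; // first statement on the def-use chain
        }
        return null; // symbol is dead from here on
    }

    public static void main(String[] args) {
        List<Stmt> trace = List.of(
            new Stmt("x = source()", List.of("x"), List.of()),
            new Stmt("y = 1",        List.of("y"), List.of()),
            new Stmt("z = y + 2",    List.of("z"), List.of("y")),
            new Stmt("sink(x)",      List.of(),    List.of("x")));
        // The two middle statements are never visited for symbol "x":
        System.out.println(nextRelevant(trace, 0, "x")); // sink(x)
    }
}
```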
In this work, we now assess whether this result carries over to the more
generic setting of so-called Interprocedural Distributive Environment (IDE)
problems. As opposed to taint analysis, IDE comprises distributive problems with
large or even infinitely broad domains, such as typestate analysis or linear
constant propagation. Specifically, this paper presents Sparse IDE, a framework
that realizes sparsification for any static analysis that fits the IDE
framework.
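For intuition, linear constant propagation is the textbook IDE instance: its edge functions have the shape l -> a*l + b, and composition stays within this closed family, which keeps the infinitely broad value domain tractable. The following self-contained sketch illustrates such an edge function; the class and method names are illustrative and do not reproduce the Heros API.

```java
/** Sketch of a linear constant propagation edge function l -> a*l + b,
 *  the classic IDE example. Illustrative only; not the Heros API. */
public class LinearFunction {
    final long a, b; // represents the transformer l -> a*l + b

    LinearFunction(long a, long b) { this.a = a; this.b = b; }

    /** Apply the environment transformer to a concrete value. */
    long computeTarget(long l) { return a * l + b; }

    /** Compose with a function applied first:
     *  this(first(l)) = a*(a'*l + b') + b = (a*a')*l + (a*b' + b). */
    LinearFunction composeWith(LinearFunction first) {
        return new LinearFunction(a * first.a, a * first.b + b);
    }

    /** At control-flow merges, two distinct linear functions generally
     *  have no linear meet; a real solver falls back to a
     *  "non-constant" (bottom) function, modeled here as null. */
    LinearFunction joinWith(LinearFunction other) {
        return (a == other.a && b == other.b) ? this : null;
    }

    public static void main(String[] args) {
        LinearFunction f = new LinearFunction(2, 1); // l -> 2l + 1
        LinearFunction g = new LinearFunction(3, 0); // l -> 3l
        // g after f: l -> 3*(2l + 1) = 6l + 3, so 5 maps to 33
        System.out.println(g.composeWith(f).computeTarget(5)); // 33
    }
}
```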
We implement Sparse IDE in SparseHeros, as an extension to the popular Heros
IDE solver, and evaluate its performance on real-world Java libraries by
comparing it to the baseline IDE algorithm. To this end, we design, implement
and evaluate a linear constant propagation analysis client on top of
SparseHeros. Our experiments show that, although IDE analyses can only be
sparsified with respect to symbols and not (numeric) values, Sparse IDE can
nonetheless yield significantly lower runtimes and often also lower memory
consumption compared to the original IDE.
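The distinction between symbols and values in the last sentence can be illustrated with a tiny hypothetical fragment: values flow across symbols, so they cannot be tracked in isolation, but statements that touch neither symbol still require no flow-function computation.

```java
/** Hypothetical fragment illustrating symbol-specific sparsification. */
public class SymbolsVsValues {
    static void log(String s) { System.out.println(s); }

    public static void main(String[] args) {
        int x = 1;         // def of x
        int y = x + 2;     // use of x: y's VALUE depends on x's value,
                           //   so sparsification cannot work per value
        log("unrelated");  // defines/uses neither x nor y: a sparse IDE
                           //   solver skips this statement for both symbols
        int z = y * 3;     // next statement on y's def-use chain
        System.out.println(x + z);
    }
}
```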
Related papers
- Scaling Symbolic Execution to Large Software Systems
Symbolic execution is a popular static analysis technique used both in program verification and in bug detection software.
We focus on an error finding framework called the Clang Static Analyzer, and the infrastructure built around it named CodeChecker.
arXiv Detail & Related papers (2024-08-04T02:54:58Z)
- Customizing Static Analysis using Codesearch
A commonly used language to describe a range of static analysis applications is Datalog.
We aim to make building custom static analysis tools much easier for developers, while at the same time providing a familiar framework for application security and static analysis experts.
Our approach introduces a language called StarLang, a variant of Datalog which only includes programs with a fast runtime.
arXiv Detail & Related papers (2024-04-19T09:50:02Z)
- Supporting Error Chains in Static Analysis for Precise Evaluation Results and Enhanced Usability
Static analyses tend to report where a vulnerability manifests rather than the fix location.
This can cause presumed false positives or imprecise results.
We designed an adaptation of an existing static analysis algorithm that can distinguish between a manifestation and a fix location.
arXiv Detail & Related papers (2024-03-12T16:46:29Z)
- Context-Sensitive Abstract Interpretation of Dynamic Languages
There is a vast gap in the quality of IDE tooling between static languages like Java and dynamic languages like Python or JavaScript.
Modern frameworks and libraries in these languages heavily use their dynamic capabilities to achieve the best ergonomics and readability.
We propose an algorithm that can bridge this gap by statically analyzing dynamic metaprogramming and runtime behavior in programs.
arXiv Detail & Related papers (2024-01-31T17:45:05Z)
- Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics.
These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing.
Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
arXiv Detail & Related papers (2023-10-30T03:45:15Z)
- Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation
Recognizing the predicate between subject and object pairs is an inherently imbalanced and multi-label problem.
Recent state-of-the-art methods predominantly focus on the most frequently occurring predicate classes.
We introduce a multi-label meta-learning framework to deal with the biased predicate distribution.
arXiv Detail & Related papers (2023-06-16T18:14:23Z)
- Hexatagging: Projective Dependency Parsing as Tagging
We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags.
Our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other.
We achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set.
arXiv Detail & Related papers (2023-06-08T18:02:07Z)
- A Static Evaluation of Code Completion by Large Language Models
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
However, static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z)
- SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation
Deep neural networks (DNNs) are usually trained on a closed set of semantic classes.
They are ill-equipped to handle previously-unseen objects.
However, detecting and localizing such objects is crucial for safety-critical applications such as perception for automated driving.
arXiv Detail & Related papers (2021-04-30T07:58:19Z)
- D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis
We propose D2A, a differential analysis based approach to label issues reported by static analysis tools.
We use D2A to generate a large labeled dataset to train models for vulnerability identification.
arXiv Detail & Related papers (2021-02-16T07:46:53Z)
- Comparative Code Structure Analysis using Deep Learning for Performance Prediction
This paper aims to assess the feasibility of using purely static information (e.g., abstract syntax tree or AST) of applications to predict performance change based on the change in code structure.
Our evaluations of several deep embedding learning methods demonstrate that tree-based Long Short-Term Memory (LSTM) models can leverage the hierarchical structure of source code to discover latent representations and achieve up to 84% (individual problem) and 73% (combined dataset with multiple problems) accuracy in predicting the change in performance.
arXiv Detail & Related papers (2021-02-12T16:59:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.