Symbol-Specific Sparsification of Interprocedural Distributive
Environment Problems
- URL: http://arxiv.org/abs/2401.14813v1
- Date: Fri, 26 Jan 2024 12:31:30 GMT
- Title: Symbol-Specific Sparsification of Interprocedural Distributive
Environment Problems
- Authors: Kadiray Karakaya and Eric Bodden
- Abstract summary: This paper presents Sparse IDE, a framework that realizes sparsification for any static analysis that fits the Interprocedural Distributive Environment (IDE) framework.
We design, implement and evaluate a linear constant propagation analysis client on top of SparseHeros.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work has shown that one can often greatly speed up static analysis
by computing data flows not for every edge in the program's control-flow graph
but instead only along definition-use chains. This yields a so-called sparse
static analysis. Recent work on SparseDroid has shown that specifically taint
analysis can be "sparsified" with extraordinary effectiveness because the taint
state of one variable does not depend on those of others. This allows one to
soundly omit more flow-function computations than in the general case.
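To make this concrete, below is a minimal sketch of propagation along definition-use chains in straight-line code. The `Stmt` record, the `nextRelevant` helper, and the example trace are hypothetical illustrations, not the SparseDroid or Sparse IDE implementation; the sketch only shows why statements that neither define nor use the tracked symbol can be skipped soundly.

```java
import java.util.List;

/** Minimal sketch of sparse propagation along def-use chains
 *  (hypothetical types; not the SparseDroid/Sparse IDE code). */
public class SparsePropagation {
    record Stmt(String text, List<String> defs, List<String> uses) {
        @Override public String toString() { return text; }
    }

    /** A dense analysis applies a flow function at every statement.
     *  A sparse one forwards the fact for `symbol` directly to the
     *  next statement that defines or uses it; everything in between
     *  is skipped, because the fact cannot change there. */
    static Stmt nextRelevant(List<Stmt> trace, int from, String symbol) {
        for (int i = from + 1; i < trace.size(); i++) {
            Stmt s = trace.get(i);
            if (s.defs().contains(symbol) || s.uses().contains(symbol))
                return s; // first statement on the def-use chain
        }
        return null; // symbol is dead from here on
    }

    public static void main(String[] args) {
        List<Stmt> trace = List.of(
            new Stmt("x = source()", List.of("x"), List.of()),
            new Stmt("y = 1",        List.of("y"), List.of()),
            new Stmt("z = y + 2",    List.of("z"), List.of("y")),
            new Stmt("sink(x)",      List.of(),    List.of("x")));
        // The two middle statements are never visited for symbol "x":
        System.out.println(nextRelevant(trace, 0, "x")); // sink(x)
    }
}
```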
In this work, we now assess whether this result carries over to the more
generic setting of so-called Interprocedural Distributive Environment (IDE)
problems. As opposed to taint analysis, IDE comprises distributive problems with
large or even infinitely broad domains, such as typestate analysis or linear
constant propagation. Specifically, this paper presents Sparse IDE, a framework
that realizes sparsification for any static analysis that fits the IDE
framework.
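For intuition, linear constant propagation is the textbook IDE instance: its edge functions have the shape l -> a*l + b, and composition stays within this closed family, which keeps the infinitely broad value domain tractable. The following self-contained sketch illustrates such an edge function; the class and method names are illustrative and do not reproduce the Heros API.

```java
/** Sketch of a linear constant propagation edge function l -> a*l + b,
 *  the classic IDE example. Illustrative only; not the Heros API. */
public class LinearFunction {
    final long a, b; // represents the transformer l -> a*l + b

    LinearFunction(long a, long b) { this.a = a; this.b = b; }

    /** Apply the environment transformer to a concrete value. */
    long computeTarget(long l) { return a * l + b; }

    /** Compose with a function applied first:
     *  this(first(l)) = a*(a'*l + b') + b = (a*a')*l + (a*b' + b). */
    LinearFunction composeWith(LinearFunction first) {
        return new LinearFunction(a * first.a, a * first.b + b);
    }

    /** At control-flow merges, two distinct linear functions generally
     *  have no linear meet; a real solver falls back to a
     *  "non-constant" (bottom) function, modeled here as null. */
    LinearFunction joinWith(LinearFunction other) {
        return (a == other.a && b == other.b) ? this : null;
    }

    public static void main(String[] args) {
        LinearFunction f = new LinearFunction(2, 1); // l -> 2l + 1
        LinearFunction g = new LinearFunction(3, 0); // l -> 3l
        // g after f: l -> 3*(2l + 1) = 6l + 3, so 5 maps to 33
        System.out.println(g.composeWith(f).computeTarget(5)); // 33
    }
}
```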
We implement Sparse IDE in SparseHeros, as an extension to the popular Heros
IDE solver, and evaluate its performance on real-world Java libraries by
comparing it to the baseline IDE algorithm. To this end, we design, implement
and evaluate a linear constant propagation analysis client on top of
SparseHeros. Our experiments show that, although IDE analyses can only be
sparsified with respect to symbols and not (numeric) values, Sparse IDE can
nonetheless yield significantly lower runtimes and often also lower memory
consumption compared to the original IDE.
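The distinction between symbols and values in the last sentence can be illustrated with a tiny hypothetical fragment: values flow across symbols, so they cannot be tracked in isolation, but statements that touch neither symbol still require no flow-function computation.

```java
/** Hypothetical fragment illustrating symbol-specific sparsification. */
public class SymbolsVsValues {
    static void log(String s) { System.out.println(s); }

    public static void main(String[] args) {
        int x = 1;         // def of x
        int y = x + 2;     // use of x: y's VALUE depends on x's value,
                           //   so sparsification cannot work per value
        log("unrelated");  // defines/uses neither x nor y: a sparse IDE
                           //   solver skips this statement for both symbols
        int z = y * 3;     // next statement on y's def-use chain
        System.out.println(x + z);
    }
}
```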
Related papers
- Scaling Symbolic Execution to Large Software Systems
Symbolic execution is a popular static analysis technique used both in program verification and in bug detection software.
We focus on an error finding framework called the Clang Static Analyzer, and the infrastructure built around it named CodeChecker.
arXiv Detail & Related papers (2024-08-04T02:54:58Z)
- Customizing Static Analysis using Codesearch
A commonly used language to describe a range of static analysis applications is Datalog.
We aim to make building custom static analysis tools much easier for developers, while at the same time providing a familiar framework for application security and static analysis experts.
Our approach introduces a language called StarLang, a variant of Datalog which only includes programs with a fast runtime.
arXiv Detail & Related papers (2024-04-19T09:50:02Z)
- Supporting Error Chains in Static Analysis for Precise Evaluation Results and Enhanced Usability
Static analyses tend to report where a vulnerability manifests rather than the fix location.
This can cause presumed false positives or imprecise results.
We designed an adaptation of an existing static analysis algorithm that can distinguish between a manifestation and a fix location.
arXiv Detail & Related papers (2024-03-12T16:46:29Z)
- Context-Sensitive Abstract Interpretation of Dynamic Languages
There is a vast gap in the quality of IDE tooling between static languages like Java and dynamic languages like Python or JavaScript.
Modern frameworks and libraries in these languages heavily use their dynamic capabilities to achieve the best ergonomics and readability.
We propose an algorithm that can bridge this gap by statically analyzing dynamic metaprogramming and runtime behavior in programs.
arXiv Detail & Related papers (2024-01-31T17:45:05Z)
- Revisiting Evaluation Metrics for Semantic Segmentation: Optimization and Evaluation of Fine-grained Intersection over Union
We propose the use of fine-grained mIoUs along with corresponding worst-case metrics.
These fine-grained metrics offer less bias towards large objects, richer statistical information, and valuable insights into model and dataset auditing.
Our benchmark study highlights the necessity of not basing evaluations on a single metric and confirms that fine-grained mIoUs reduce the bias towards large objects.
arXiv Detail & Related papers (2023-10-30T03:45:15Z)
- Multi-Label Meta Weighting for Long-Tailed Dynamic Scene Graph Generation
Recognizing the predicate between subject and object pairs is an inherently imbalanced and multi-label problem.
Recent state-of-the-art methods predominantly focus on the most frequently occurring predicate classes.
We introduce a multi-label meta-learning framework to deal with the biased predicate distribution.
arXiv Detail & Related papers (2023-06-16T18:14:23Z)
- Hexatagging: Projective Dependency Parsing as Tagging
We introduce a novel dependency parser, the hexatagger, that constructs dependency trees by tagging the words in a sentence with elements from a finite set of possible tags.
Our approach is fully parallelizable at training time, i.e., the structure-building actions needed to build a dependency parse can be predicted in parallel to each other.
We achieve state-of-the-art performance of 96.4 LAS and 97.4 UAS on the Penn Treebank test set.
arXiv Detail & Related papers (2023-06-08T18:02:07Z)
- A Static Evaluation of Code Completion by Large Language Models
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
However, static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z)
- SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation
Deep neural networks (DNNs) are usually trained on a closed set of semantic classes.
They are ill-equipped to handle previously-unseen objects.
However, detecting and localizing such objects is crucial for safety-critical applications such as perception for automated driving.
arXiv Detail & Related papers (2021-04-30T07:58:19Z)
- D2A: A Dataset Built for AI-Based Vulnerability Detection Methods Using Differential Analysis
We propose D2A, a differential analysis based approach to label issues reported by static analysis tools.
We use D2A to generate a large labeled dataset to train models for vulnerability identification.
arXiv Detail & Related papers (2021-02-16T07:46:53Z)
- Comparative Code Structure Analysis using Deep Learning for Performance Prediction
This paper aims to assess the feasibility of using purely static information (e.g., abstract syntax tree or AST) of applications to predict performance change based on the change in code structure.
Our evaluations of several deep embedding learning methods demonstrate that tree-based Long Short-Term Memory (LSTM) models can leverage the hierarchical structure of source code to discover latent representations and achieve up to 84% (individual problem) and 73% (combined dataset with multiple problems) accuracy in predicting the change in performance.
arXiv Detail & Related papers (2021-02-12T16:59:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.