Combined Static Analysis and Machine Learning Prediction for Application Debloating
- URL: http://arxiv.org/abs/2404.00196v1
- Date: Sat, 30 Mar 2024 00:14:17 GMT
- Title: Combined Static Analysis and Machine Learning Prediction for Application Debloating
- Authors: Chris Porter, Sharjeel Khan, Kangqi Ni, Santosh Pande
- Abstract summary: We develop a framework, Predictive Debloat with Static Guarantees (PDSG).
PDSG predicts the dynamic callee set emanating from a callsite, and to resolve mispredictions, it employs a lightweight audit based on static invariants of call chains.
It achieves the highest gadget reductions among similar techniques on SPEC CPU 2017, reducing 82.5% of the total gadgets on average.
- Score: 2.010931857032585
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Software debloating can effectively thwart certain code reuse attacks by reducing attack surfaces to break gadget chains. Approaches based on static analysis restrict the set of functions reachable at a callsite by leveraging static properties of the callgraph. This achieves low runtime overhead, but the function set is conservatively computed, negatively affecting reduction. In contrast, approaches based on machine learning (ML) have much better precision and can sharply reduce function sets, leading to a significant reduction in attack surface. Nevertheless, mispredictions occur in ML-based approaches. These cause overheads, and worse, there is no clear way to distinguish between mispredictions and actual attacks. In this work, we contend that a software debloating approach that incorporates ML-based predictions at runtime is realistic in a whole-application setting, and that it can achieve significant attack surface reductions beyond the state of the art. We develop a framework, Predictive Debloat with Static Guarantees (PDSG). PDSG is fully sound and works on application source code. At runtime it predicts the dynamic callee set emanating from a callsite, and to resolve mispredictions, it employs a lightweight audit based on static invariants of call chains. We deduce the invariants offline and assert that they hold at runtime when there is a misprediction. To the best of our knowledge, it achieves the highest gadget reductions among similar techniques on SPEC CPU 2017, reducing 82.5% of the total gadgets on average. It triggers misprediction checks on only 3.8% of the total predictions invoked at runtime, and it leverages Datalog to verify that dynamic call sequences conform to the static call relations. It has an overhead of 8.9%, which makes the scheme attractive for practical deployments.
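The prediction-plus-audit flow can be pictured with a small, hypothetical sketch. The Python below is not the PDSG implementation (which works on whole applications and expresses its call-chain invariants in Datalog); it only illustrates the control flow: allow calls that fall in the predicted callee set, and on a misprediction, check that the observed call edge and the current call chain are consistent with the statically computed call graph before treating the event as an attack.

```python
# Minimal sketch of the PDSG idea (illustrative only, not the actual
# implementation): a fast path allows calls in the ML-predicted callee set;
# mispredictions fall back to an audit of static call-graph invariants.

# Static call graph computed offline: caller -> set of statically legal callees.
STATIC_CALL_GRAPH = {
    "main": {"parse", "dispatch"},
    "dispatch": {"handler_a", "handler_b", "handler_c"},
}

# ML-predicted (much smaller) callee sets per callsite, also produced offline.
PREDICTED_CALLEES = {
    "dispatch": {"handler_a"},  # prediction: only handler_a is likely here
}

# Dynamic call chain at the moment "dispatch" makes an indirect call.
call_stack = ["main", "dispatch"]


def audit(caller: str, callee: str, chain: list) -> bool:
    """Static-invariant audit: the offending edge must exist in the static
    call graph, and every adjacent pair in the dynamic call chain must be a
    static call-graph edge (i.e., the chain is a valid static path)."""
    if callee not in STATIC_CALL_GRAPH.get(caller, set()):
        return False
    return all(b in STATIC_CALL_GRAPH.get(a, set())
               for a, b in zip(chain, chain[1:]))


def guarded_call(caller: str, callee: str) -> str:
    """Fast path for predicted callees; audit only on mispredictions."""
    if callee in PREDICTED_CALLEES.get(caller, set()):
        return "allow (predicted)"
    if audit(caller, callee, call_stack + [callee]):
        return "allow (misprediction, but static invariants hold)"
    return "block (violates static invariants -> treat as attack)"


print(guarded_call("dispatch", "handler_a"))  # allow (predicted)
print(guarded_call("dispatch", "handler_b"))  # allow (misprediction, audit passed)
print(guarded_call("dispatch", "attacker"))   # block (treat as attack)
```

In this toy flow, the audit runs only on the minority of calls that fall outside the predicted set, which mirrors why the reported misprediction-check rate (3.8% of predictions) keeps the overall overhead low.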
Related papers
- HO-FMN: Hyperparameter Optimization for Fast Minimum-Norm Attacks [14.626176607206748]
We propose a parametric variation of the well-known fast minimum-norm attack algorithm.
We re-evaluate 12 robust models, showing that our attack finds smaller adversarial perturbations without requiring any additional tuning.
arXiv Detail & Related papers (2024-07-11T18:30:01Z)
- Enhanced Bug Prediction in JavaScript Programs with Hybrid Call-Graph Based Invocation Metrics [0.7099737083842057]
Bug prediction aims at finding source code elements in a software system that are likely to contain defects.
In this paper, we propose a function-level JavaScript bug prediction model based on static source code metrics, augmented with a hybrid (static and dynamic) code analysis based metric: the number of incoming and outgoing function calls.
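A rough sketch of this kind of feature construction (hypothetical data and metric names, not the paper's pipeline): fan-in/fan-out counts derived from a call graph are appended to per-function static metrics and fed to an off-the-shelf classifier.

```python
# Illustrative sketch: combine static metrics with call-graph fan-in/fan-out
# features for function-level bug prediction. Data and features are made up.
from sklearn.ensemble import RandomForestClassifier

# Hybrid call graph: function -> functions it calls (hypothetical example).
call_graph = {
    "parseConfig": ["readFile", "validate"],
    "validate":    ["logError"],
    "readFile":    [],
    "logError":    [],
}

def fan_out(fn):
    return len(call_graph.get(fn, []))

def fan_in(fn):
    return sum(fn in callees for callees in call_graph.values())

# Static source-code metrics per function: [lines_of_code, cyclomatic_complexity]
static_metrics = {
    "parseConfig": [120, 14],
    "validate":    [45, 6],
    "readFile":    [10, 1],
    "logError":    [8, 1],
}
labels = {"parseConfig": 1, "validate": 0, "readFile": 0, "logError": 0}  # 1 = buggy

X = [static_metrics[f] + [fan_in(f), fan_out(f)] for f in static_metrics]
y = [labels[f] for f in static_metrics]

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict(X))  # toy in-sample prediction, for illustration only
```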
arXiv Detail & Related papers (2024-05-12T10:31:43Z)
- E&V: Prompting Large Language Models to Perform Static Analysis by Pseudo-code Execution and Verification [7.745665775992235]
Large Language Models (LLMs) offer new capabilities for software engineering tasks.
LLMs simulate the execution of pseudo-code, effectively conducting static analysis encoded in the pseudo-code with minimal human effort.
E&V includes a verification process for pseudo-code execution without needing an external oracle.
arXiv Detail & Related papers (2023-12-13T19:31:00Z)
- Equation Discovery with Bayesian Spike-and-Slab Priors and Efficient Kernels [57.46832672991433]
We propose a novel equation discovery method based on Kernel learning and BAyesian Spike-and-Slab priors (KBASS).
We use kernel regression to estimate the target function, which is flexible, expressive, and more robust to data sparsity and noise.
We develop an expectation-propagation expectation-maximization algorithm for efficient posterior inference and function estimation.
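For context, a generic spike-and-slab prior (a standard form, not necessarily the exact prior used by KBASS) mixes a point mass at zero with a continuous "slab" over each weight:

```latex
% Generic spike-and-slab prior over a weight w_j (illustrative form):
% a point mass ("spike") at zero mixed with a Gaussian "slab".
p(w_j) = (1 - \pi)\,\delta_0(w_j) + \pi\,\mathcal{N}(w_j \mid 0, \sigma^2)
```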
arXiv Detail & Related papers (2023-10-09T03:55:09Z)
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
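Dynamic layer skipping, one of the paradigms listed above, can be pictured as a per-sample gate in front of a residual block. The snippet below is an illustrative PyTorch-style sketch, not LAUDNet's actual gating or latency model.

```python
# Illustrative sketch of dynamic layer skipping: a tiny gate decides, per
# sample, whether to execute a residual block or pass the input through.
# (Real dynamic networks typically train the gate with e.g. Gumbel-softmax;
# the hard threshold here is for illustration only.)
import torch
import torch.nn as nn


class SkippableBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Gate: global average pool -> linear -> scalar "execute" probability.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, 1), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p = self.gate(x)                               # (batch, 1) probabilities
        keep = (p > 0.5).float().view(-1, 1, 1, 1)     # per-sample skip decision
        # The block acts as identity for samples whose gate is closed.
        return x + keep * self.body(x)


block = SkippableBlock(channels=16)
out = block(torch.randn(2, 16, 8, 8))
print(out.shape)  # torch.Size([2, 16, 8, 8])
```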
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
- How to Find Actionable Static Analysis Warnings [28.866251060033537]
We show that effective predictors of such warnings can be created by methods that adjust the decision boundary.
For eight open-source Java projects (CASSANDRA, JMETER, COMMONS, LUCENE-SOLR, ANT, TOMCAT, DERBY) we achieve perfect test results on 4/8 datasets.
arXiv Detail & Related papers (2022-05-21T04:47:02Z)
- Efficient and Differentiable Conformal Prediction with General Function Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximately valid population coverage and near-optimal efficiency within the given function class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
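For background, plain split conformal prediction (the non-learnable special case that this line of work generalizes) builds prediction sets from a calibration quantile of nonconformity scores. A minimal regression sketch with made-up data:

```python
# Minimal split conformal prediction sketch (plain version, not the paper's
# learnable generalization): calibrate a residual quantile, then emit
# intervals that cover new targets with probability ~(1 - alpha).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(600, 1))
y = np.sin(X[:, 0]) + 0.2 * rng.standard_normal(600)

# Split into a proper training set and a calibration set.
X_tr, y_tr = X[:400], y[:400]
X_cal, y_cal = X[400:], y[400:]

model = LinearRegression().fit(X_tr, y_tr)

# Nonconformity scores: absolute residuals on the calibration set.
scores = np.abs(y_cal - model.predict(X_cal))
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Prediction set for a new point: [prediction - q, prediction + q].
x_new = np.array([[1.0]])
pred = model.predict(x_new)[0]
print(f"90% prediction interval: [{pred - q:.3f}, {pred + q:.3f}]")
```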
arXiv Detail & Related papers (2022-02-22T18:37:23Z)
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels.
Recent efforts additionally combine this with an l_infty constraint on the perturbation magnitudes.
We propose a homotopy algorithm to jointly tackle the sparsity and the perturbation bound in one unified framework.
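The combined constraint can be written as the following generic optimization problem; this is the standard l_0 plus l_infty formulation, not necessarily the exact objective from the paper:

```latex
% Sparse, bounded-magnitude adversarial attack (generic formulation):
% perturb as few pixels as possible while keeping each change small
% and flipping the classifier's decision.
\min_{\delta} \ \|\delta\|_0
\quad \text{s.t.} \quad
f(x + \delta) \neq y, \qquad \|\delta\|_\infty \le \epsilon
```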
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- The Hammer and the Nut: Is Bilevel Optimization Really Needed to Poison Linear Classifiers? [27.701693158702753]
Data poisoning is a particularly worrisome subset of poisoning attacks.
We propose a counter-intuitive but efficient framework to combat data poisoning.
Our framework achieves comparable, or even better, performances in terms of the attacker's objective.
arXiv Detail & Related papers (2021-03-23T09:08:10Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- Probabilistic Regression for Visual Tracking [193.05958682821444]
We propose a probabilistic regression formulation and apply it to tracking.
Our network predicts the conditional probability density of the target state given an input image.
Our tracker sets a new state-of-the-art on six datasets, achieving 59.8% AUC on LaSOT and 75.8% Success on TrackingNet.
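Training a network to output a conditional density of the target state is commonly done by minimizing the KL divergence between a label distribution and the predicted density; the form below is a generic statement of that objective, and the paper's exact loss may differ in its details:

```latex
% Generic training objective for conditional density prediction:
% minimize the KL divergence from a label distribution p(y | y_gt)
% to the predicted density q_theta(y | x), averaged over training pairs.
L(\theta) = \mathbb{E}_{(x,\, y_{\mathrm{gt}})}\!\left[
  \mathrm{KL}\big(p(y \mid y_{\mathrm{gt}}) \,\|\, q_\theta(y \mid x)\big)
\right]
```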
arXiv Detail & Related papers (2020-03-27T17:58:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents (including all listed content) and is not responsible for any consequences arising from its use.