A Comprehensive Study on Static Application Security Testing (SAST) Tools for Android
- URL: http://arxiv.org/abs/2410.20740v1
- Date: Mon, 28 Oct 2024 05:10:22 GMT
- Title: A Comprehensive Study on Static Application Security Testing (SAST) Tools for Android
- Authors: Jingyun Zhu, Kaixuan Li, Sen Chen, Lingling Fan, Junjie Wang, Xiaofei Xie
- Abstract summary: VulsTotal is a unified evaluation platform that enables comprehensive, versatile analysis across diverse Android SAST tools.
We select 11 free and open-source SAST tools from a pool of 97 existing options, adhering to clearly defined criteria.
We then unify 67 general/common vulnerability types for Android SAST tools.
- Score: 22.558610938860124
- Abstract: To identify security vulnerabilities in Android applications, numerous static application security testing (SAST) tools have been proposed. However, assessing their overall performance across diverse vulnerability types is non-trivial and poses considerable challenges. Firstly, the absence of a unified evaluation platform for defining and describing tools' supported vulnerability types, coupled with the lack of normalization for the intricate and varied reports generated by different tools, significantly adds to the complexity. Secondly, there is a scarcity of adequate benchmarks, particularly those derived from real-world scenarios. To address these problems, we propose VulsTotal, the first unified platform that supports various vulnerability types and enables comprehensive, versatile analysis across diverse SAST tools. Specifically, we begin by meticulously selecting 11 free and open-source SAST tools from a pool of 97 existing options, adhering to clearly defined criteria. We then invest significant effort in understanding the detection rules of each tool and unify 67 general/common vulnerability types for Android SAST tools. We also define and implement a standardized reporting format, ensuring uniformity in presenting results across all tools. Additionally, to mitigate the scarcity of benchmarks, we manually analyzed a large number of CVEs to construct a new CVE-based benchmark grounded in our understanding of Android app vulnerabilities. Leveraging the evaluation platform, which integrates both existing synthetic benchmarks and the newly constructed CVE-based benchmark, we conducted a comprehensive analysis to evaluate and compare the selected tools from various perspectives, such as general vulnerability type coverage, type consistency, tool effectiveness, and time performance.
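The abstract's two core mechanisms, mapping each tool's own rule identifiers onto the 67 unified vulnerability types and normalizing heterogeneous reports into one schema, can be made concrete with a short sketch. This is a minimal illustration rather than VulsTotal's actual implementation: MobSF and QARK are real Android SAST tools, but the rule IDs, the `UnifiedFinding` schema, and the mapping entries below are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical mapping from (tool, tool-specific rule ID) to a unified
# vulnerability type; the paper's real mapping covers 67 types and is not
# reproduced here.
TYPE_MAP = {
    ("mobsf", "android_manifest_debuggable"): "Debuggable-Application",
    ("qark", "DEBUGGABLE"): "Debuggable-Application",
}

@dataclass
class UnifiedFinding:
    tool: str
    unified_type: str  # one of the unified vulnerability types
    file: str
    line: int

def normalize(tool: str, raw_findings: list[dict]) -> list[UnifiedFinding]:
    """Map one tool's raw report entries onto the unified schema,
    dropping findings that fall outside the unified taxonomy."""
    unified = []
    for f in raw_findings:
        unified_type = TYPE_MAP.get((tool, f["rule_id"]))
        if unified_type is not None:
            unified.append(UnifiedFinding(tool, unified_type, f["file"], f["line"]))
    return unified

# Two tools flag the same issue under different rule IDs, but both land on
# the same unified type, making their reports directly comparable.
print(normalize("mobsf", [{"rule_id": "android_manifest_debuggable",
                           "file": "AndroidManifest.xml", "line": 12}]))
print(normalize("qark", [{"rule_id": "DEBUGGABLE",
                          "file": "AndroidManifest.xml", "line": 12}]))
```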
Related papers
- LLMs in Software Security: A Survey of Vulnerability Detection Techniques and Insights [12.424610893030353]
Large Language Models (LLMs) are emerging as transformative tools for software vulnerability detection.
This paper provides a detailed survey of LLMs in vulnerability detection.
We address challenges such as cross-language vulnerability detection, multimodal data integration, and repository-level analysis.
arXiv Detail & Related papers (2025-02-10T21:33:38Z) - ACEBench: Who Wins the Match Point in Tool Usage? [68.54159348899891]
ACEBench is a comprehensive benchmark for assessing tool usage in Large Language Models (LLMs).
It categorizes data into three primary types based on evaluation methodology: Normal, Special, and Agent.
It provides a more granular examination of error causes across different data types.
arXiv Detail & Related papers (2025-01-22T12:59:08Z) - A Comparative Analysis of Vulnerability Management Tools: Evaluating Nessus, Acunetix, and Nikto for Risk Based Security Solutions [0.0]
This paper presents a comparative analysis of three widely used vulnerability management tools: Nessus, Acunetix, and Nikto.
Each tool is assessed based on its detection accuracy, risk scoring using the Common Vulnerability Scoring System (CVSS), ease of use, automation and reporting capabilities, performance metrics, and cost effectiveness; a worked CVSS scoring example appears after this list.
arXiv Detail & Related papers (2024-11-28T13:14:24Z) - The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
A Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z) - Static Application Security Testing (SAST) Tools for Smart Contracts: How Far Are We? [14.974832502863526]
In recent years, the importance of smart contract security has been heightened by the increasing number of attacks against them.
To address this issue, a multitude of static application security testing (SAST) tools have been proposed for detecting vulnerabilities in smart contracts.
In this paper, we propose an up-to-date and fine-grained taxonomy that includes 45 unique vulnerability types for smart contracts.
arXiv Detail & Related papers (2024-04-28T13:40:18Z) - An Extensive Comparison of Static Application Security Testing Tools [1.3927943269211593]
Static Application Security Testing Tools (SASTTs) identify software vulnerabilities to support the security and reliability of software applications.
Several studies have suggested that alternative solutions may be more effective than SASTTs, owing to SASTTs' tendency to generate false alarms.
Our SASTTs evaluation is based on a controlled, though synthetic, Java codebase.
arXiv Detail & Related papers (2024-03-14T09:37:54Z) - SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models [107.82336341926134]
SALAD-Bench is a safety benchmark specifically designed for evaluating Large Language Models (LLMs).
It transcends conventional benchmarks through its large scale, rich diversity, intricate taxonomy spanning three levels, and versatile functionalities.
arXiv Detail & Related papers (2024-02-07T17:33:54Z) - ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models [65.79770974145983]
ASSERT, Automated Safety Scenario Red Teaming, consists of three methods -- semantically aligned augmentation, target bootstrapping, and adversarial knowledge injection.
We partition our prompts into four safety domains for a fine-grained analysis of how the domain affects model performance.
We find statistically significant performance differences of up to 11% in absolute classification accuracy among semantically related scenarios, and absolute error rates of up to 19% in zero-shot adversarial settings.
arXiv Detail & Related papers (2023-10-14T17:10:28Z) - DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection [55.70982767084996]
A critical yet frequently overlooked challenge in the field of deepfake detection is the lack of a standardized, unified, comprehensive benchmark.
We present the first comprehensive benchmark for deepfake detection, called DeepfakeBench, which offers three key contributions.
DeepfakeBench contains 15 state-of-the-art detection methods, 9 deepfake datasets, a series of deepfake detection evaluation protocols and analysis tools, as well as comprehensive evaluations.
arXiv Detail & Related papers (2023-07-04T01:34:41Z) - AIBugHunter: A Practical Tool for Predicting, Classifying and Repairing Software Vulnerabilities [27.891905729536372]
AIBugHunter is a novel ML-based software vulnerability analysis tool for C/C++ languages that is integrated into Visual Studio Code.
We propose a novel multi-objective optimization (MOO)-based vulnerability classification approach and a transformer-based estimation approach to help AIBugHunter accurately identify vulnerability types and estimate severity.
arXiv Detail & Related papers (2023-05-26T04:21:53Z) - Differential privacy and robust statistics in high dimensions [49.50869296871643]
High-dimensional Propose-Test-Release (HPTR) builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism.
We show that HPTR nearly achieves the optimal sample complexity under several scenarios studied in the literature.
arXiv Detail & Related papers (2021-11-12T06:36:40Z)
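The CVSS-based risk scoring referenced in the Nessus/Acunetix/Nikto comparison above follows a published formula. As a reference point only (not the paper's own pipeline, and independent of how each scanner derives its metric values), the sketch below implements the CVSS v3.1 base-score computation from the public FIRST specification.

```python
import math

# CVSS v3.1 base-score calculation, per the public FIRST specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20}   # Attack Vector
AC = {"L": 0.77, "H": 0.44}                          # Attack Complexity
PR_U = {"N": 0.85, "L": 0.62, "H": 0.27}             # Privileges Required, Scope unchanged
PR_C = {"N": 0.85, "L": 0.68, "H": 0.50}             # Privileges Required, Scope changed
UI = {"N": 0.85, "R": 0.62}                          # User Interaction
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}               # C/I/A impact

def roundup(x: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1 Appendix A."""
    i = round(x * 100000)
    return i / 100000.0 if i % 10000 == 0 else (math.floor(i / 10000) + 1) / 10.0

def base_score(av, ac, pr, ui, s, c, i, a) -> float:
    changed = (s == "C")
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = (7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15) if changed else 6.42 * iss
    expl = 8.22 * AV[av] * AC[ac] * (PR_C if changed else PR_U)[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min((1.08 if changed else 1.0) * (impact + expl), 10))

# Vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
print(base_score("N", "L", "N", "N", "U", "H", "H", "H"))
```

Running the example prints 9.8, the familiar Critical score for a network-exploitable, low-complexity vulnerability with high impact on confidentiality, integrity, and availability.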
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.