Does the Vulnerability Threaten Our Projects? Automated Vulnerable API Detection for Third-Party Libraries
- URL: http://arxiv.org/abs/2409.02753v1
- Date: Wed, 4 Sep 2024 14:31:16 GMT
- Title: Does the Vulnerability Threaten Our Projects? Automated Vulnerable API Detection for Third-Party Libraries
- Authors: Fangyuan Zhang, Lingling Fan, Sen Chen, Miaoying Cai, Sihan Xu, Lida Zhao,
- Abstract summary: We propose VAScanner, which can effectively identify vulnerable root methods causing vulnerabilities in TPLs.
VAScanner eliminates 5.78% false positives and 2.16% false negatives owing to the proposed sifting and augmentation mechanisms.
In a large-scale analysis of 3,147 projects using vulnerable TPLs, we find only 21.51% of projects were threatened by vulnerable APIs.
- Score: 11.012017507408078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developers commonly use third-party libraries (TPLs) to speed up development and avoid reinventing the wheel; however, vulnerable TPLs pose severe security threats. Most existing research considers only whether a project uses a vulnerable TPL and neglects whether the project actually uses the TPL's vulnerable code, which inevitably produces false positives and incurs unnecessary patching effort and maintenance cost. To address this, we propose VAScanner, which effectively identifies the vulnerable root methods that cause vulnerabilities in TPLs and, from them, all vulnerable APIs of TPLs used by Java projects. Specifically, we first collect the initial patch methods from patch commits and extract accurate patch methods with a patch-unrelated sifting mechanism; we then identify the vulnerable root methods for each vulnerability with an augmentation mechanism. Based on these root methods, we use backward call graph analysis to identify all vulnerable APIs for each vulnerable TPL version and construct a database of 90,749 vulnerable APIs (2,410,779 when counted with library versions) from 362 TPLs with 14,775 versions, with a false positive proportion of 1.45% (95% CI [1.31%, 1.59%]). Our experiments show that VAScanner eliminates 5.78% of false positives and 2.16% of false negatives owing to the proposed sifting and augmentation mechanisms. It also outperforms Eclipse Steady, the state-of-the-art method-level tool, in analyzing direct dependencies, detecting vulnerable APIs more effectively. Furthermore, in a large-scale analysis of 3,147 projects using vulnerable TPLs, we find that only 21.51% of projects (with a false positive proportion of 1.83% and a 95% CI of [0.71%, 4.61%]) were threatened by vulnerable TPLs through vulnerable APIs, demonstrating that VAScanner can potentially reduce false positives significantly.
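The backward call graph step mentioned in the abstract can be illustrated with a small sketch. This is not VAScanner's implementation: it assumes the call graph is already available as (caller, callee) pairs, and the function names `find_vulnerable_apis` and `project_is_threatened`, as well as the example method names, are hypothetical. The idea is to walk the graph from each vulnerable root method toward its callers and keep the public methods reached along the way.

```python
from collections import defaultdict, deque


def find_vulnerable_apis(call_edges, vulnerable_roots, public_methods):
    """Return the public methods that can reach a vulnerable root method.

    call_edges: iterable of (caller, callee) pairs (assumed input format).
    vulnerable_roots: methods known to contain the vulnerable code.
    public_methods: methods exposed by the library as its API.
    """
    # Build the reversed graph: callee -> set of direct callers.
    callers = defaultdict(set)
    for caller, callee in call_edges:
        callers[callee].add(caller)

    # Breadth-first search backward from the vulnerable roots.
    reached = set(vulnerable_roots)
    queue = deque(vulnerable_roots)
    while queue:
        method = queue.popleft()
        for caller in callers.get(method, ()):
            if caller not in reached:
                reached.add(caller)
                queue.append(caller)

    # A root that is itself public also counts as a vulnerable API.
    return reached & set(public_methods)


def project_is_threatened(project_calls, vulnerable_apis):
    """A project is threatened if it calls at least one vulnerable API."""
    return bool(set(project_calls) & set(vulnerable_apis))


# Hypothetical library call graph, for illustration only.
edges = [
    ("Api.load", "Api.parse"),
    ("Api.parse", "Parser.read"),
    ("Parser.read", "Parser.unsafeDecode"),
    ("Api.version", "Util.format"),
]
apis = find_vulnerable_apis(
    edges,
    vulnerable_roots={"Parser.unsafeDecode"},
    public_methods={"Api.load", "Api.parse", "Api.version"},
)
print(sorted(apis))
print(project_is_threatened({"Api.version"}, apis))
```

In this toy graph a project that only calls `Api.version` depends on the vulnerable library without reaching the vulnerable code, which is the kind of false positive that library-level detection would report.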
Related papers
- Fine-Grained 1-Day Vulnerability Detection in Binaries via Patch Code Localization [12.73365645156957]
1-day vulnerabilities in binaries have become a major threat to software security.
Patch presence testing is one of the effective ways to detect such vulnerabilities.
We propose a novel approach named PLocator, which leverages stable values from both the patch code and its context.
arXiv Detail & Related papers (2025-01-29T04:35:37Z) - Enhancing Security in Third-Party Library Reuse -- Comprehensive Detection of 1-day Vulnerability through Code Patch Analysis [8.897599530972638]
Third-party libraries (TPLs) can introduce vulnerabilities (known as 1-day vulnerabilities) because of the low maintenance of TPLs.
VULTURE aims at identifying 1-day vulnerabilities that arise from the reuse of vulnerable TPLs.
VULTURE successfully identified 175 vulnerabilities from 178 reused TPLs.
arXiv Detail & Related papers (2024-11-29T12:02:28Z) - The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z) - How Well Do Large Language Models Serve as End-to-End Secure Code Producers? [42.119319820752324]
We studied GPT-3.5 and GPT-4's capability to identify and repair vulnerabilities in the code generated by four popular LLMs.
By manually or automatically reviewing 4,900 pieces of code, our study reveals that large language models lack awareness of scenario-relevant security risks.
To address the limitation of a single round of repair, we developed a lightweight tool that prompts LLMs to construct safer source code.
arXiv Detail & Related papers (2024-08-20T02:42:29Z) - Comparison of Static Application Security Testing Tools and Large Language Models for Repo-level Vulnerability Detection [11.13802281700894]
Static Application Security Testing (SAST) is usually utilized to scan source code for security vulnerabilities.
Deep learning (DL)-based methods have demonstrated their potential in software vulnerability detection.
This paper compares 15 diverse SAST tools with 12 popular or state-of-the-art open-source LLMs in detecting software vulnerabilities.
arXiv Detail & Related papers (2024-07-23T07:21:14Z) - Static Application Security Testing (SAST) Tools for Smart Contracts: How Far Are We? [14.974832502863526]
In recent years, the importance of smart contract security has been heightened by the increasing number of attacks against them.
To address this issue, a multitude of static application security testing (SAST) tools have been proposed for detecting vulnerabilities in smart contracts.
In this paper, we propose an up-to-date and fine-grained taxonomy that includes 45 unique vulnerability types for smart contracts.
arXiv Detail & Related papers (2024-04-28T13:40:18Z) - Vulnerability Detection with Code Language Models: How Far Are We? [40.455600722638906]
PrimeVul is a new dataset for training and evaluating code LMs for vulnerability detection.
It incorporates a novel set of data labeling techniques that achieve comparable label accuracy to human-verified benchmarks.
It also implements a rigorous data de-duplication and chronological data splitting strategy to mitigate data leakage issues.
arXiv Detail & Related papers (2024-03-27T14:34:29Z) - Vulnerability Scanners for Ethereum Smart Contracts: A Large-Scale Study [44.25093111430751]
In 2023 alone, such vulnerabilities led to substantial financial losses exceeding a billion US dollars.
Various tools have been developed to detect and mitigate vulnerabilities in smart contracts.
This study investigates the gap between the effectiveness of existing security scanners and the vulnerabilities that still persist in practice.
arXiv Detail & Related papers (2023-12-27T11:26:26Z) - Exploiting Library Vulnerability via Migration Based Automating Test Generation [16.39796265296833]
In software development, developers extensively utilize third-party libraries to avoid implementing existing functionalities.
Vulnerability exploits, as code snippets provided for reproducing vulnerabilities after disclosure, contain a wealth of vulnerability-related information.
This study proposes a new method based on vulnerability exploits, called VESTA, which provides vulnerability exploit tests as the basis for developers to decide whether to update dependencies.
arXiv Detail & Related papers (2023-12-15T06:46:45Z) - On the Security Blind Spots of Software Composition Analysis [46.1389163921338]
We present a novel approach to detect vulnerable clones in the Maven repository.
We retrieve over 53k potential vulnerable clones from Maven Central.
We detect 727 confirmed vulnerable clones and synthesize a testable proof-of-vulnerability project for each of those.
arXiv Detail & Related papers (2023-06-08T20:14:46Z) - ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z)