Boosting Static Resource Leak Detection via LLM-based Resource-Oriented Intention Inference
- URL: http://arxiv.org/abs/2311.04448v4
- Date: Thu, 12 Dec 2024 12:57:59 GMT
- Title: Boosting Static Resource Leak Detection via LLM-based Resource-Oriented Intention Inference
- Authors: Chong Wang, Jianan Liu, Xin Peng, Yang Liu, Yiling Lou
- Abstract summary: Existing static detection techniques rely on mechanical matching of predefined resource acquisition/release APIs and null-checking conditions to find unreleased resources.
We propose InferROI, a novel approach that directly infers resource-oriented intentions (acquisition, release, and reachability validation) in code.
We evaluate the effectiveness of InferROI in both resource-oriented intention inference and resource leak detection.
- Score: 14.783216988363804
- Abstract: Resource leaks, caused by resources not being released after acquisition, often lead to performance issues and system crashes. Existing static detection techniques rely on mechanical matching of predefined resource acquisition/release APIs and null-checking conditions to find unreleased resources, suffering from both (1) false negatives caused by the incompleteness of predefined resource acquisition/release APIs and (2) false positives caused by the incompleteness of resource reachability validation identification. To overcome these challenges, we propose InferROI, a novel approach that leverages the code comprehension capability of large language models (LLMs) to directly infer resource-oriented intentions (acquisition, release, and reachability validation) in code. InferROI first prompts the LLM to infer the intentions involved in a given code snippet, and then applies a two-stage static analysis to check control-flow paths for resource leaks based on the inferred intentions. We evaluate the effectiveness of InferROI in both resource-oriented intention inference and resource leak detection. Experimental results on the DroidLeaks and JLeaks datasets demonstrate that InferROI achieves promising bug detection rates (59.3% and 62.5%) and false alarm rates (18.6% and 19.5%). Compared to three industrial static detectors, InferROI detects 14~45 and 149~485 more bugs in DroidLeaks and JLeaks, respectively. When applied to real-world open-source projects, InferROI identifies 29 previously unknown resource leak bugs (verified by the authors), 7 of which have been confirmed by developers. In addition, the results of an ablation study underscore the importance of combining LLM-based inference with static analysis.
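The abstract's pipeline can be pictured with a short sketch. The following is a minimal illustration only, not the authors' implementation: infer_intentions is a hypothetical stand-in for the LLM prompting step (it returns a canned answer so the sketch runs), and the leak check walks a hand-built toy control-flow graph. InferROI's actual two-stage static analysis, including reachability validations such as null checks, is considerably more involved.

```python
# Minimal sketch of the two-stage idea: (1) infer resource-oriented
# intentions for a snippet (stubbed; InferROI prompts an LLM here), then
# (2) check whether some control-flow path reaches the exit with an
# acquired resource that was never released.
from dataclasses import dataclass

@dataclass
class Intention:
    kind: str  # "acquire", "release", or "validate"
    line: int  # 1-based line carrying the intention

def infer_intentions(snippet: str) -> list[Intention]:
    """Hypothetical stand-in for the LLM call; returns a canned answer
    matching the toy CFG below."""
    return [Intention("acquire", 1), Intention("release", 4)]

def has_leak(cfg: dict[int, list[int]], intentions: list[Intention],
             exit_line: int) -> bool:
    """True if some path from an acquisition reaches exit_line without
    passing a release (a deliberately simplistic path check)."""
    releases = {i.line for i in intentions if i.kind == "release"}
    acquires = {i.line for i in intentions if i.kind == "acquire"}

    def unreleased_path(node: int, seen: frozenset[int]) -> bool:
        if node in releases:
            return False   # resource freed along this path
        if node == exit_line:
            return True    # exited while the resource is still held
        return any(unreleased_path(succ, seen | {node})
                   for succ in cfg.get(node, []) if succ not in seen)

    return any(unreleased_path(a, frozenset()) for a in acquires)

# Toy CFG: line 1 acquires; the branch at line 2 releases only via line 4.
cfg = {1: [2], 2: [3, 4], 3: [5], 4: [5]}
print(has_leak(cfg, infer_intentions("..."), exit_line=5))  # True (1->2->3->5)
```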
Related papers
- Provenance: A Light-weight Fact-checker for Retrieval Augmented LLM Generation Output [49.893971654861424]
We present a light-weight approach for detecting nonfactual outputs from retrieval-augmented generation (RAG).
We compute a factuality score that can be thresholded to yield a binary decision.
Our experiments show high area under the ROC curve (AUC) across a wide range of relevant open source datasets.
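The scoring mechanism is not detailed in the summary above; as a hedged illustration only, thresholding a factuality score into a binary decision might look like the sketch below, where score_factuality is a hypothetical placeholder (a real system would use, e.g., an NLI model over retrieved evidence):

```python
# Hedged sketch: threshold a factuality score into a binary decision.
# `score_factuality` is a hypothetical placeholder, not the paper's model.
def score_factuality(evidence: str, answer: str) -> float:
    """Crude stand-in returning a score in [0, 1]: word overlap between
    the retrieved evidence and the generated answer."""
    overlap = set(evidence.lower().split()) & set(answer.lower().split())
    return len(overlap) / max(len(answer.split()), 1)

def is_factual(evidence: str, answer: str, threshold: float = 0.5) -> bool:
    return score_factuality(evidence, answer) >= threshold

print(is_factual("Paris is the capital of France.",
                 "The capital of France is Paris."))  # True
```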
arXiv Detail & Related papers (2024-11-01T20:44:59Z)
- On the Effectiveness of LLMs for Manual Test Verifications [1.920300814128832]
This study aims to explore the use of Large Language Models (LLMs) to produce verifications for manual tests.
Open-source models Mistral-7B and Phi-3-mini-4k demonstrated effectiveness and consistency comparable to closed-source models.
There were also concerns about AI hallucinations, where verifications significantly deviated from expectations.
arXiv Detail & Related papers (2024-09-19T02:03:04Z)
- Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z)
- Open-Source Drift Detection Tools in Action: Insights from Two Use Cases [0.0]
D3Bench examines the capabilities of Evidently AI, NannyML, and Alibi-Detect, leveraging real-world data from two smart building use cases.
We consider a comprehensive set of non-functional criteria, such as the integrability with ML pipelines, the adaptability to diverse data types, user-friendliness, computational efficiency, and resource demands.
Our findings reveal that Evidently AI stands out for its general data drift detection, whereas NannyML excels at pinpointing the precise timing of shifts and evaluating their consequent effects on predictive accuracy.
arXiv Detail & Related papers (2024-04-29T13:13:10Z)
- Collaborative Knowledge Infusion for Low-resource Stance Detection [83.88515573352795]
Target-related knowledge is often needed to assist stance detection models.
We propose a collaborative knowledge infusion approach for low-resource stance detection tasks.
arXiv Detail & Related papers (2024-03-28T08:32:14Z)
- A Little Leak Will Sink a Great Ship: Survey of Transparency for Large Language Models from Start to Finish [47.3916421056009]
Large Language Models (LLMs) are trained on massive web-crawled corpora.
LLMs produce leaked information in most cases, despite such data being scarce in their training sets.
A self-detection method showed superior performance compared to existing detection methods.
arXiv Detail & Related papers (2024-03-24T13:21:58Z)
- A New Benchmark and Reverse Validation Method for Passage-level Hallucination Detection [63.56136319976554]
Large Language Models (LLMs) generate hallucinations, which can cause significant damage when deployed for mission-critical tasks.
We propose a self-check approach based on reverse validation to detect factual errors automatically in a zero-resource fashion.
We empirically evaluate our method and existing zero-resource detection methods on two datasets.
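The summary leaves the mechanics of reverse validation open; one plausible reading (an assumption, not necessarily the paper's exact procedure) is to reconstruct a question from the generated passage, have the model answer it independently, and flag disagreement. In the sketch below, ask_model is a hypothetical LLM interface with canned replies so the example runs:

```python
# Hedged sketch of a zero-resource self-check in the spirit of reverse
# validation; `ask_model` is a hypothetical placeholder for an LLM call.
def ask_model(prompt: str) -> str:
    """Canned replies keep the sketch runnable without a real model."""
    canned = {
        "Write a question whose answer is the key claim of: "
        "'Mount Everest is 8,849 m tall.'": "How tall is Mount Everest?",
        "How tall is Mount Everest?": "8,849 m",
    }
    return canned.get(prompt, "unknown")

def reverse_validate(passage: str) -> bool:
    """Regenerate a question from the passage, answer it independently,
    and report whether the answers agree (a crude consistency test)."""
    question = ask_model("Write a question whose answer is the key claim of: "
                         f"'{passage}'")
    independent_answer = ask_model(question)
    return independent_answer in passage

print(reverse_validate("Mount Everest is 8,849 m tall."))  # True: consistent
```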
arXiv Detail & Related papers (2023-10-10T10:14:59Z)
- Inference of Resource Management Specifications [2.8975089867684436]
A resource leak occurs when a program fails to free some finite resource after it is no longer needed.
Recent work proposed an approach to prevent resource leaks based on checking resource management specifications.
This paper presents a novel technique to automatically infer a resource management specification for a program.
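For concreteness, here is a minimal example of the kind of leak such specifications guard against (shown in Python for consistency with the other sketches, although the cited work targets Java-style resources): a file handle acquired but released on only one path.

```python
# Minimal illustration of a resource leak: the handle escapes unclosed
# on the early-return path.
def leaky_read(path: str) -> str | None:
    f = open(path)      # resource acquired
    data = f.read()
    if not data:
        return None     # early return: f is never closed -> leak
    f.close()           # released only on this path
    return data

def safe_read(path: str) -> str | None:
    with open(path) as f:   # context manager releases on every path
        data = f.read()
    return data or None
```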
arXiv Detail & Related papers (2023-06-21T00:42:42Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
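As the abstract sketches it, ATC reduces to two small steps; the code below is a hedged reconstruction on synthetic confidences, not the paper's reference implementation:

```python
# Hedged sketch of Average Thresholded Confidence (ATC): fit a threshold t
# on labeled source data so that the fraction of examples with confidence
# above t matches source accuracy, then estimate target accuracy as the
# fraction of unlabeled target examples above t.
import numpy as np

def atc_fit(source_conf: np.ndarray, source_correct: np.ndarray) -> float:
    """Choose t whose confident fraction best matches source accuracy."""
    acc = source_correct.mean()
    candidates = np.sort(source_conf)
    fracs = np.array([(source_conf > t).mean() for t in candidates])
    return float(candidates[np.abs(fracs - acc).argmin()])

def atc_predict(target_conf: np.ndarray, t: float) -> float:
    """Estimated target accuracy: fraction of unlabeled examples above t."""
    return float((target_conf > t).mean())

rng = np.random.default_rng(0)
src_conf = rng.uniform(0.4, 1.0, 1000)
src_correct = (rng.random(1000) < src_conf).astype(float)  # synthetic labels
t = atc_fit(src_conf, src_correct)
tgt_conf = rng.uniform(0.3, 1.0, 1000)                     # shifted target
print(f"threshold={t:.3f}, predicted target acc={atc_predict(tgt_conf, t):.3f}")
```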