Utilizing Precise and Complete Code Context to Guide LLM in Automatic False Positive Mitigation
- URL: http://arxiv.org/abs/2411.03079v1
- Date: Tue, 05 Nov 2024 13:24:56 GMT
- Title: Utilizing Precise and Complete Code Context to Guide LLM in Automatic False Positive Mitigation
- Authors: Jinbao Chen, Hongjing Xiang, Luhao Li, Yu Zhang, Boyao Ding, Qingwei Li
- Abstract summary: Static Application Security Testing (SAST) tools are crucial for early bug detection and code quality but often generate false positives that slow development.
Large Language Models, adept at understanding natural language and code, offer promising ways to improve the accuracy and usability of SAST tools.
Our work emphasizes the critical impact of precise and complete code context and highlights the potential of combining program analysis with LLMs.
- Score: 3.0538467265507574
- License:
- Abstract: Static Application Security Testing (SAST) tools are crucial for early bug detection and code quality but often generate false positives that slow development. Automating false positive mitigation is thus essential for advancing SAST tools. Past efforts use static/dynamic analysis or machine learning. The advent of Large Language Models, adept at understanding natural language and code, offers promising ways to improve the accuracy and usability of SAST tools. However, existing LLM-based methods need improvement in two key areas: first, extracted code snippets related to warnings are often cluttered with irrelevant control and data flows, reducing precision; second, critical code contexts are often missing, leading to incomplete representations that can mislead LLMs and cause inaccurate assessments. To ensure the use of precise and complete code context, thereby avoiding misguidance and enabling LLMs to reach accurate conclusions, we propose LLM4FPM. One of its core components is eCPG-Slicer, which builds an extended code property graph and extracts line-level, precise code context. Moreover, LLM4FPM incorporates the FARF algorithm, which builds a file reference graph and then efficiently detects all files related to a warning in linear time, enabling eCPG-Slicer to gather complete code context across these files. We evaluate LLM4FPM on the Juliet dataset, where it comprehensively outperforms the baseline, achieving an F1 score above 99% across various CWEs. LLM4FPM leverages a free, open-source model, avoiding costly alternatives and reducing inspection costs by up to $2758 per run on Juliet, with an average inspection time of 4.7 seconds per warning. Our work emphasizes the critical impact of precise and complete code context and highlights the potential of combining program analysis with LLMs, improving the quality and efficiency of software development.
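As a rough illustration of the file-reference-graph idea described in the abstract (not the authors' implementation of FARF), the sketch below builds a toy file reference graph and collects every file reachable from the one containing a warning with a breadth-first traversal, which costs time linear in the number of files and references. The edge sources (e.g. #include directives or cross-file call edges) and all function names are assumptions for this sketch.

```python
from collections import defaultdict, deque

def build_file_reference_graph(references):
    """references: iterable of (src_file, dst_file) pairs, e.g. derived from
    #include directives or cross-file call edges (assumed edge sources)."""
    graph = defaultdict(set)
    for src, dst in references:
        graph[src].add(dst)
        graph[dst].add(src)  # treat references as undirected for reachability
    return graph

def files_related_to_warning(graph, warning_file):
    """Breadth-first traversal: each file and edge is visited at most once,
    so the cost is linear in the size of the reference graph."""
    seen = {warning_file}
    queue = deque([warning_file])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

if __name__ == "__main__":
    refs = [("a.c", "util.h"), ("util.h", "util.c"), ("b.c", "other.h")]
    g = build_file_reference_graph(refs)
    print(files_related_to_warning(g, "a.c"))  # {'a.c', 'util.h', 'util.c'}
```

In the paper's pipeline, the files collected this way would then be handed to a slicer (eCPG-Slicer in the paper) to extract line-level code context before the warning is presented to the LLM; the snippet above only illustrates the linear-time reachability step.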
Related papers
- Enhancing Fault Localization Through Ordered Code Analysis with LLM Agents and Self-Reflection [8.22737389683156]
Large Language Models (LLMs) offer promising improvements in fault localization by enhancing code comprehension and reasoning.
We introduce LLM4FL, a novel LLM-agent-based fault localization approach that integrates SBFL rankings with a divide-and-conquer strategy.
Our results demonstrate that LLM4FL outperforms AutoFL by 19.27% in Top-1 accuracy and surpasses state-of-the-art supervised techniques such as DeepFL and Grace.
arXiv Detail & Related papers (2024-09-20T16:47:34Z) - Impact of Large Language Models of Code on Fault Localization [2.936007114555107]
We propose a simple but effective sequence generation approach for fine-tuning large language models of code for FL tasks.
Specifically, we fine-tune 13 representative encoder-based, encoder-decoder, and decoder-based LLMCs for FL tasks.
Experimental results show that LLMCs fine-tuned with our approach successfully pinpoint error positions in 50.6%, 64.2%, and 72.3% of 1,291 methods in Defects4J for Top-2/3/5 prediction.
arXiv Detail & Related papers (2024-08-19T02:36:07Z) - CodecLM: Aligning Language Models with Tailored Synthetic Data [51.59223474427153]
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data for instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
arXiv Detail & Related papers (2024-04-08T21:15:36Z) - PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion [96.47420221442397]
We construct adversarial user instructions by attacking user instructions at sentence, semantic, and multi-language levels.
We test 3 closed-source and 4 open-source LLMs using a benchmark that incorporates robustness settings.
We find that GPT-4 exhibits the highest performance and strong robustness in our benchmark.
arXiv Detail & Related papers (2024-03-06T15:33:32Z) - Efficient Tool Use with Chain-of-Abstraction Reasoning [65.18096363216574]
Large language models (LLMs) need to ground their reasoning in real-world knowledge.
There remain challenges in fine-tuning LLM agents to invoke tools in multi-step reasoning problems.
We propose a new method for LLMs to better leverage tools in multi-step reasoning.
arXiv Detail & Related papers (2024-01-30T21:53:30Z) - Large Language Models for Test-Free Fault Localization [11.080712737595174]
We propose a language model based fault localization approach that locates buggy lines of code without any test coverage information.
We fine-tune language models with 350 million, 6 billion, and 16 billion parameters on small, manually curated corpora of buggy programs.
Our empirical evaluation shows that LLMAO improves the Top-1 results over the state-of-the-art machine learning fault localization (MLFL) baselines by 2.3%-54.4%, and Top-5 results by 14.4%-35.6%.
arXiv Detail & Related papers (2023-10-03T01:26:39Z) - ZeroLeak: Using LLMs for Scalable and Cost Effective Side-Channel Patching [6.556868623811133]
Security-critical software, e.g., OpenSSL, comes with numerous side-channel leakages left unpatched due to a lack of resources or experts.
We explore the use of Large Language Models (LLMs) in generating patches for vulnerable code with microarchitectural side-channel leakages.
arXiv Detail & Related papers (2023-08-24T20:04:36Z) - LLMDet: A Third Party Large Language Models Generated Text Detection Tool [119.0952092533317]
Text generated by large language models (LLMs) is remarkably close to high-quality human-authored text.
Existing detection tools can only differentiate between machine-generated and human-authored text.
We propose LLMDet, a model-specific, secure, efficient, and extendable detection tool.
arXiv Detail & Related papers (2023-05-24T10:45:16Z) - Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models [75.75038268227554]
Self-Checker is a framework comprising a set of plug-and-play modules that facilitate fact-checking.
This framework provides a fast and efficient way to construct fact-checking systems in low-resource environments.
arXiv Detail & Related papers (2023-05-24T01:46:07Z) - Inference with Reference: Lossless Acceleration of Large Language Models [97.04200102556551]
LLMA is an accelerator to speed up Large Language Model (LLM) inference with references.
It is motivated by the observation that there are abundant identical text spans between an LLM's decoding output and the reference text available in many real-world scenarios.
arXiv Detail & Related papers (2023-04-10T09:55:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.