Using ChatGPT as a Static Application Security Testing Tool
- URL: http://arxiv.org/abs/2308.14434v1
- Date: Mon, 28 Aug 2023 09:21:37 GMT
- Title: Using ChatGPT as a Static Application Security Testing Tool
- Authors: Atieh Bakhshandeh, Abdalsamad Keramatfar, Amir Norouzi, and Mohammad
Mahdi Chekidehkhoun
- Abstract summary: ChatGPT has attracted considerable attention for its remarkable performance.
We study the feasibility of using ChatGPT for vulnerability detection in Python source code.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, artificial intelligence has grown conspicuously in
almost every aspect of life. One of the most promising application areas is
security code review, for which many AI-based tools and approaches have been
proposed. Recently, ChatGPT has attracted considerable attention for its
remarkable performance in following instructions and providing detailed
responses. Given the similarities between natural language and code, in this
paper we study the feasibility of using ChatGPT for vulnerability detection in
Python source code. Toward this goal, we feed an appropriate prompt along with
vulnerable data to ChatGPT and compare its results on two datasets with the
results of three widely used Static Application Security Testing (SAST) tools
(Bandit, Semgrep, and SonarQube). We conduct several kinds of experiments with
ChatGPT, and the results indicate that ChatGPT reduces the false positive and
false negative rates and has the potential to be used for Python source code
vulnerability detection.
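The workflow the abstract describes, wrapping a code snippet in a detection prompt, collecting the model's verdict, and scoring verdicts against ground truth by false positive and false negative rates, can be sketched roughly as follows. The prompt wording, the binary VULNERABLE/SAFE label scheme, and the metric helper are illustrative assumptions, not the authors' exact setup.

```python
# Sketch of the evaluation loop described in the abstract. The prompt text
# and the binary label scheme are illustrative assumptions, not the paper's
# exact protocol.

def build_prompt(source_code: str) -> str:
    """Wrap a Python snippet in a vulnerability-detection instruction
    (hypothetical wording; the paper's prompt may differ)."""
    return (
        "You are a static analysis assistant. Examine the following Python "
        "code and answer VULNERABLE or SAFE.\n\n```python\n"
        f"{source_code}\n```"
    )

def false_rates(predictions, ground_truth):
    """False positive / false negative rates for binary verdicts.

    predictions, ground_truth: sequences of booleans (True = vulnerable).
    """
    fp = sum(p and not g for p, g in zip(predictions, ground_truth))
    fn = sum(g and not p for p, g in zip(predictions, ground_truth))
    negatives = sum(not g for g in ground_truth)
    positives = sum(1 for g in ground_truth if g)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr

if __name__ == "__main__":
    # Toy ground truth for four snippets and one tool's verdicts:
    # one false positive and one false negative.
    truth = [True, True, False, False]
    verdicts = [True, False, True, False]
    fpr, fnr = false_rates(verdicts, truth)
    print(f"FPR={fpr:.2f}  FNR={fnr:.2f}")
```

In the paper's setting, the same `false_rates`-style comparison would be run once per tool (ChatGPT, Bandit, Semgrep, SonarQube) over each labeled dataset, which is what allows the false positive and false negative rates to be compared head to head.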
Related papers
- Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data [49.1574468325115]
ChatGPT is an AI tool that enhances software production efficiency.
We estimate ChatGPT's effects on the number of git pushes, repositories, and unique developers per 100,000 people.
These results suggest that AI tools like ChatGPT can substantially boost developer productivity, though further analysis is needed to address potential downsides such as low-quality code and privacy concerns.
arXiv Detail & Related papers (2024-06-16T19:11:15Z) - Pros and Cons! Evaluating ChatGPT on Software Vulnerability [0.0]
We evaluate ChatGPT using Big-Vul covering five different common software vulnerability tasks.
We found that the existing state-of-the-art methods are generally superior to ChatGPT in software vulnerability detection.
ChatGPT exhibits limited vulnerability repair capabilities in both providing and not providing context information.
arXiv Detail & Related papers (2024-04-05T10:08:34Z) - Exploring ChatGPT's Capabilities on Vulnerability Management [56.4403395100589]
We explore ChatGPT's capabilities on 6 tasks involving the complete vulnerability management process with a large-scale dataset containing 70,346 samples.
One notable example is ChatGPT's proficiency in tasks like generating titles for software bug reports.
Our findings reveal the difficulties encountered by ChatGPT and shed light on promising future directions.
arXiv Detail & Related papers (2023-11-11T11:01:13Z) - Primacy Effect of ChatGPT [69.49920102917598]
We study the primacy effect of ChatGPT: the tendency of selecting the labels at earlier positions as the answer.
We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions.
arXiv Detail & Related papers (2023-10-20T00:37:28Z) - ChatGPT for Vulnerability Detection, Classification, and Repair: How Far Are We? [24.61869093475626]
Large language models (LLMs) like ChatGPT exhibited remarkable advancement in a range of software engineering tasks.
We compare ChatGPT with state-of-the-art language models designed for software vulnerability tasks.
We found that ChatGPT achieves limited performance, trailing behind other language models in vulnerability contexts by a significant margin.
arXiv Detail & Related papers (2023-10-15T12:01:35Z) - When ChatGPT Meets Smart Contract Vulnerability Detection: How Far Are We? [34.61179425241671]
We present an empirical study to investigate the performance of ChatGPT in identifying smart contract vulnerabilities.
ChatGPT achieves a high recall rate, but its precision in pinpointing smart contract vulnerabilities is limited.
Our research provides insights into the strengths and weaknesses of employing large language models, specifically ChatGPT, for the detection of smart contract vulnerabilities.
arXiv Detail & Related papers (2023-09-11T15:02:44Z) - Fighting Fire with Fire: Can ChatGPT Detect AI-generated Text? [20.37071875344405]
We evaluate the zero-shot performance of ChatGPT in the task of human-written vs. AI-generated text detection.
We empirically investigate if ChatGPT is symmetrically effective in detecting AI-generated or human-written text.
arXiv Detail & Related papers (2023-08-02T17:11:37Z) - Is ChatGPT Involved in Texts? Measure the Polish Ratio to Detect ChatGPT-Generated Text [48.36706154871577]
We introduce a novel dataset termed HPPT (ChatGPT-polished academic abstracts)
It differs from existing corpora by comprising pairs of human-written and ChatGPT-polished abstracts rather than purely ChatGPT-generated texts.
We also propose the "Polish Ratio" method, an innovative measure of the degree of modification made by ChatGPT compared to the original human-written text.
arXiv Detail & Related papers (2023-07-21T06:38:37Z) - ChatLog: Carefully Evaluating the Evolution of ChatGPT Across Time [54.18651663847874]
ChatGPT has achieved great success and can be considered to have attained infrastructural status.
Existing benchmarks encounter two challenges: (1) Disregard for periodical evaluation and (2) Lack of fine-grained features.
We construct ChatLog, an ever-updating dataset with large-scale records of diverse long-form ChatGPT responses for 21 NLP benchmarks from March 2023 to the present.
arXiv Detail & Related papers (2023-04-27T11:33:48Z) - To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.