An Empirical Study of Vulnerability Handling Times in CPython
- URL: http://arxiv.org/abs/2411.00447v1
- Date: Fri, 01 Nov 2024 08:46:14 GMT
- Title: An Empirical Study of Vulnerability Handling Times in CPython
- Authors: Jukka Ruohonen
- Abstract summary: The paper examines the handling times of software vulnerabilities in CPython.
The paper contributes to the recent effort to better understand the security of the Python ecosystem.
- Score: 0.2538209532048867
- Abstract: The paper examines the handling times of software vulnerabilities in CPython, the reference implementation and interpreter of Python, likely today's most popular programming language. The background comes from so-called vulnerability life cycle analysis, the literature on bug fixing times, and recent research on the security of Python software. Based on regression analysis, the associated vulnerability fixing times can be explained very well merely by knowing who reported the vulnerabilities. Severity, proof-of-concept code, commits made to a version control system, comments posted on a bug tracker, and references to other sources do not explain the vulnerability fixing times. With these results, the paper contributes to the recent effort to better understand the security of the Python ecosystem.
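The abstract describes a regression analysis of vulnerability handling times against reporter identity and other covariates. A minimal sketch of that kind of analysis is given below; it is not the paper's actual code, and the dataset columns (handling_days, reporter, severity, has_poc) are hypothetical.

```python
# Hedged illustration of regressing (log-transformed) vulnerability handling
# times on reporter identity and other covariates; data are made up.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical dataset: one row per CPython vulnerability.
df = pd.DataFrame({
    "handling_days": [12, 450, 30, 7, 210, 90],
    "reporter": ["core_dev", "external", "core_dev", "core_dev", "external", "external"],
    "severity": [7.5, 5.0, 9.8, 4.3, 6.1, 7.2],
    "has_poc": [1, 0, 1, 0, 0, 1],
})

# Log-transform the typically skewed handling times before fitting OLS.
df["log_days"] = np.log1p(df["handling_days"])

# Reporter identity enters as a categorical regressor alongside the covariates
# the abstract lists (severity, proof-of-concept code, and so on).
model = smf.ols("log_days ~ C(reporter) + severity + has_poc", data=df).fit()
print(model.summary())
```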
Related papers
- SafePyScript: A Web-Based Solution for Machine Learning-Driven Vulnerability Detection in Python [0.0]
We present SafePyScript, a machine learning-based web application designed specifically to identify vulnerabilities in Python source code.
Despite Python's significance as a major programming language, there is currently no convenient and easy-to-use machine learning-based web application for detecting vulnerabilities in its source code.
arXiv Detail & Related papers (2024-11-01T14:49:33Z) - How Maintainable is Proficient Code? A Case Study of Three PyPI Libraries [3.0105723746073]
We investigate the risk level of proficient code inside a file.
We identify several instances of highly proficient code that was also high risk.
We envision that the study should help developers identify scenarios where proficient code might be detrimental to future code maintenance activities.
arXiv Detail & Related papers (2024-10-08T04:45:11Z) - CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution [50.7413285637879]
The CRUXEVAL-X code reasoning benchmark contains 19 programming languages.
It comprises at least 600 subjects for each language, along with 19K content-consistent tests in total.
Even a model trained solely on Python can achieve at most 34.4% Pass@1 in other languages.
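Pass@1 in this entry refers to the standard functional-correctness metric for code generation benchmarks. The sketch below shows the commonly used unbiased pass@k estimator (Chen et al., 2021); it is illustrative and not taken from the CRUXEval-X paper.

```python
# Unbiased pass@k estimator commonly used on code benchmarks (illustrative).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """n = samples generated per task, c = samples passing all tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 samples per task, 3 pass the tests -> Pass@1 = 0.3.
print(pass_at_k(n=10, c=3, k=1))
```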
arXiv Detail & Related papers (2024-08-23T11:43:00Z) - Towards Identifying Code Proficiency through the Analysis of Python Textbooks [7.381102801726683]
The aim is to gauge the level of proficiency a developer must have to understand a piece of source code.
Prior attempts, which relied heavily on expert opinions and developer surveys, have led to considerable discrepancies.
This paper presents a new approach to identifying Python competency levels through the systematic analysis of introductory Python programming textbooks.
arXiv Detail & Related papers (2024-08-05T06:37:10Z) - Python Fuzzing for Trustworthy Machine Learning Frameworks [0.0]
We propose a dynamic analysis pipeline for Python projects using Sydr-Fuzz.
Our pipeline includes fuzzing, corpus minimization, crash triaging, and coverage collection.
To identify the most vulnerable parts of machine learning frameworks, we analyze their potential attack surfaces and develop fuzz targets for PyTorch and related projects such as h5py.
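For context, a Python fuzz target of the kind such pipelines orchestrate typically looks like the sketch below, written against Google's Atheris fuzzer; the library under test (json) is only illustrative and not one of the paper's targets.

```python
# Minimal coverage-guided fuzz target in the Atheris style (illustrative).
import sys
import json
import atheris

def TestOneInput(data: bytes) -> None:
    fdp = atheris.FuzzedDataProvider(data)
    text = fdp.ConsumeUnicodeNoSurrogates(4096)
    try:
        json.loads(text)          # exercise the parser with fuzzer-generated input
    except json.JSONDecodeError:
        pass                      # expected parse failures are not interesting crashes

if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()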
arXiv Detail & Related papers (2024-03-19T13:41:11Z) - Causal-learn: Causal Discovery in Python [53.17423883919072]
Causal discovery aims at revealing causal relations from observational data.
causal-learn is an open-source Python library for causal discovery.
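A minimal usage sketch of the library's constraint-based PC algorithm is shown below on synthetic data; the exact API details may differ across causal-learn versions.

```python
# Hedged sketch: constraint-based causal discovery (PC) with causal-learn.
import numpy as np
from causallearn.search.ConstraintBased.PC import pc

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)      # ground truth: x -> y
z = 0.5 * y + rng.normal(size=n)      # ground truth: y -> z
data = np.column_stack([x, y, z])

cg = pc(data)                         # object holding the estimated causal graph
print(cg.G)                           # adjacency structure of the recovered CPDAG
```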
arXiv Detail & Related papers (2023-07-31T05:00:35Z) - Exploring Security Commits in Python [11.533638656389137]
Most security issues in Python have not been indexed by CVE and may only be fixed by 'silent' security commits.
It is therefore critical to identify these hidden security commits, which is difficult due to limited data variety, non-comprehensive code semantics, and uninterpretable learned features.
We construct the first security commit dataset in Python, PySecDB, which consists of three subsets including a base dataset, a pilot dataset, and an augmented dataset.
arXiv Detail & Related papers (2023-07-21T18:46:45Z) - A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
Static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
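As a hedged illustration (not the paper's framework), Python's built-in ast module can flag one class of static errors, syntax errors, in generated completions:

```python
# Minimal sketch: using the Abstract Syntax Tree parser to detect syntax
# errors in code completions without executing them.
import ast

completions = [
    "def add(a, b):\n    return a + b\n",   # parses cleanly
    "def add(a, b):\n    return a +\n",     # syntax error
]

for i, code in enumerate(completions):
    try:
        ast.parse(code)                      # builds the AST, raising on bad syntax
        print(f"completion {i}: no syntax errors")
    except SyntaxError as exc:
        print(f"completion {i}: SyntaxError at line {exc.lineno}: {exc.msg}")
```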
arXiv Detail & Related papers (2023-06-05T19:23:34Z) - CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z) - BackdoorBox: A Python Toolbox for Backdoor Learning [67.53987387581222]
This Python toolbox implements representative and advanced backdoor attacks and defenses.
It allows researchers and developers to easily implement and compare different methods on benchmark or their local datasets.
arXiv Detail & Related papers (2023-02-01T09:45:42Z) - Break-It-Fix-It: Unsupervised Learning for Program Repair [90.55497679266442]
We propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas.
We use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data.
Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data.
BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python and 71.7% on DeepFix.
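The "critic" in this setup can be as cheap as a syntax checker. The sketch below is an illustrative reconstruction of the idea, not the paper's code; the fixer passed in stands in for the learned repair model.

```python
# Hedged sketch of the Break-It-Fix-It critic idea: keep only (broken, fixed)
# pairs whose fixed side passes a cheap correctness check.
def critic(code: str) -> bool:
    """Return True if the code at least compiles (no syntax errors)."""
    try:
        compile(code, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

def collect_training_pairs(bad_examples, fixer):
    """Harvest (broken, fixed) pairs whose fixed output the critic accepts."""
    pairs = []
    for broken in bad_examples:
        fixed = fixer(broken)        # in the paper, the fixer is a learned model
        if critic(fixed):
            pairs.append((broken, fixed))
    return pairs

# Trivial stand-in fixer for demonstration: append a missing closing parenthesis.
demo_pairs = collect_training_pairs(["print('hi'"], lambda code: code + ")")
print(demo_pairs)
```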
arXiv Detail & Related papers (2021-06-11T20:31:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.