BUGSPHP: A dataset for Automated Program Repair in PHP
- URL: http://arxiv.org/abs/2401.07356v2
- Date: Sun, 21 Jan 2024 15:22:50 GMT
- Title: BUGSPHP: A dataset for Automated Program Repair in PHP
- Authors: K.D. Pramod, W.T.N. De Silva, W.U.K. Thabrew, Ridwan Shariffdeen,
Sandareka Wickramanayake
- Abstract summary: This paper presents a benchmark dataset of bugs on real-world applications called BUGSPHP.
The training dataset includes more than 600,000 bug-fixing commits.
The test dataset contains 513 manually validated bug-fixing commits equipped with developer-provided test cases.
- Score: 2.236957801565796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated Program Repair (APR) improves developer productivity by saving
debugging and bug-fixing time. While APR has been extensively explored for
C/C++ and Java programs, there is little research on bugs in PHP programs due
to the lack of a benchmark PHP bug dataset. This is surprising given that PHP
has been one of the most widely used server-side languages for over two
decades, being used in a variety of contexts such as e-commerce, social
networking, and content management. This paper presents a benchmark dataset of
PHP bugs on real-world applications called BUGSPHP, which can enable research
on analysis, testing, and repair for PHP programs. The dataset consists of
training and test datasets, separately curated from GitHub and processed
locally. The training dataset includes more than 600,000 bug-fixing commits.
The test dataset contains 513 manually validated bug-fixing commits equipped
with developer-provided test cases to assess patch correctness.
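The abstract describes curating bug-fixing commits from GitHub. The paper's exact filtering pipeline is not given here, but a common first-pass heuristic for this kind of mining is to match bug-related keywords in commit messages and keep only commits touching PHP files; the sketch below illustrates that heuristic under those assumptions (the keyword list and commit structure are illustrative, not BUGSPHP's actual criteria).

```python
import re

# Keyword heuristic often used to flag bug-fixing commits.
# Illustrative only: the actual BUGSPHP filters are not specified here.
BUG_KEYWORDS = re.compile(
    r"\b(fix(es|ed)?|bug|defect|fault|patch|issue|resolve[sd]?)\b",
    re.IGNORECASE,
)

def is_bug_fixing(commit_message: str) -> bool:
    """Return True if the commit message matches a bug-fix keyword."""
    return bool(BUG_KEYWORDS.search(commit_message))

def filter_bug_fixing(commits: list[dict]) -> list[dict]:
    """Keep commits that look like bug fixes and touch at least one .php file."""
    return [
        c for c in commits
        if is_bug_fixing(c["message"])
        and any(path.endswith(".php") for path in c["files"])
    ]

commits = [
    {"message": "Fix null pointer in login handler", "files": ["auth/Login.php"]},
    {"message": "Add dark mode styles", "files": ["css/theme.css"]},
    {"message": "Resolve issue #42: wrong tax rounding", "files": ["cart/Tax.php"]},
]
print([c["message"] for c in filter_bug_fixing(commits)])
```

In practice such keyword mining is noisy, which is why the test split described above is manually validated and paired with developer-provided test cases.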
Related papers
- MultiMend: Multilingual Program Repair with Context Augmentation and Multi-Hunk Patch Generation [2.7036595757881323]
MultiMend is a learning-based APR approach designed to improve repair performance on multiple programming languages.
It embeds source code lines and applies retrieval-augmented generation to augment the buggy context with relevant lines during patch generation.
We evaluate MultiMend on four benchmarks with four programming languages and compare it with state-of-the-art methods.
arXiv Detail & Related papers (2025-01-27T13:37:43Z)
- Evaluating Agent-based Program Repair at Google [9.62742759337993]
Agent-based program repair offers the promise of automatically resolving complex bugs end-to-end.
Recent work has explored the use of agent-based repair approaches on the popular open-source SWE-Bench.
This paper explores the viability of using an agentic approach to address bugs in an enterprise context.
arXiv Detail & Related papers (2025-01-13T18:09:25Z)
- Leveraging Data Characteristics for Bug Localization in Deep Learning Programs [21.563130049562357]
We propose Theia, which detects and localizes structural bugs in Deep Learning (DL) programs.
Our results show that Theia successfully localizes 57/75 structural bugs in 40 buggy programs, whereas NeuraLint, a state-of-the-art approach capable of localizing structural bugs before training, localizes 17/75 bugs.
arXiv Detail & Related papers (2024-12-08T01:52:06Z)
- Fuzzing the PHP Interpreter via Dataflow Fusion [13.303933700280343]
This paper introduces FlowFusion, the first automatic fuzzing framework to detect memory errors in the PHP interpreter.
In our evaluation, FlowFusion found 158 unknown bugs in the PHP interpreter, with 125 fixed and 11 confirmed.
FlowFusion also outperformed state-of-the-art fuzzers AFL++ and Polyglot, covering 24% more lines of code after 24 hours of fuzzing.
arXiv Detail & Related papers (2024-10-29T03:54:59Z)
- Yama: Precise Opcode-based Data Flow Analysis for Detecting PHP Applications Vulnerabilities [4.262259005587605]
Yama is a context-sensitive and path-sensitive interprocedural data flow analysis method for PHP.
We have found that the precise semantics and clear control flow of PHP opcodes enable data flow analysis to be more precise and efficient.
We evaluated Yama from three dimensions: basic data flow analysis capabilities, complex semantic analysis capabilities, and the ability to discover vulnerabilities in real-world applications.
arXiv Detail & Related papers (2024-10-16T08:14:37Z)
- KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution [59.20933707301566]
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate if ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z)
- A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
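The RTT pipeline shape can be sketched as: translate buggy code to a pivot representation, translate back, and keep candidates that pass the test suite. The `to_pivot` and `from_pivot` stubs below are hypothetical stand-ins for the paper's pretrained translation models; only the pipeline structure is being illustrated.

```python
# Hedged sketch of Round-Trip Translation (RTT) for repair. The two
# translators are hard-coded stubs standing in for learned models.

def to_pivot(code: str) -> str:
    # Stub: a real system would translate the code to, e.g., natural language.
    return "add the numbers a and b and return the result"

def from_pivot(description: str) -> str:
    # Stub: regeneration from the pivot tends to produce canonical code,
    # which is what lets RTT drop the bug along the way.
    return "def add(a, b):\n    return a + b\n"

def passes_tests(code: str) -> bool:
    """Validate a candidate patch against a developer-provided test case.
    Assumes the candidate at least parses."""
    env: dict = {}
    exec(code, env)
    return env["add"](2, 3) == 5

buggy = "def add(a, b):\n    return a - b\n"   # the defect: '-' instead of '+'
candidate = from_pivot(to_pivot(buggy))
print(passes_tests(buggy), passes_tests(candidate))  # False True
```

The key design point carried over from the paper is that no repair-specific fine-tuning is needed: the translators are used as-is and the test suite acts as the patch filter.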
arXiv Detail & Related papers (2024-01-15T22:36:31Z)
- RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair [75.40584530380589]
We propose RAP-Gen, a novel Retrieval-Augmented Patch Generation framework that explicitly leverages relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
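The retrieval step of such a pipeline can be sketched as follows. RAP-Gen itself uses a learned retriever; the Jaccard token-overlap scorer and the tiny fix-pair list below are illustrative stand-ins, not the paper's method.

```python
# Illustrative sketch of retrieval-augmented patch generation's first
# stage: fetch the most similar previous (bug, fix) pair for a query.
# Jaccard similarity is a stand-in for RAP-Gen's learned retriever.

def tokens(code: str) -> set[str]:
    return set(code.replace("(", " ").replace(")", " ").split())

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

# Toy "list of previous bug-fix pairs" (hypothetical examples).
fix_pairs = [
    ("if (x = null)", "if (x == null)"),
    ("for (i = 0; i <= n; i++)", "for (i = 0; i < n; i++)"),
]

def retrieve(buggy: str) -> tuple[str, str]:
    """Return the stored (bug, fix) pair most similar to the query."""
    q = tokens(buggy)
    return max(fix_pairs, key=lambda p: jaccard(q, tokens(p[0])))

bug, fix = retrieve("if (user = null)")
print(bug, "->", fix)
```

The retrieved pair is then fed to the generator (CodeT5 in RAP-Gen) alongside the buggy code, so the model can imitate a known fix pattern rather than invent one from scratch.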
arXiv Detail & Related papers (2023-09-12T08:52:56Z)
- Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z)
- BigIssue: A Realistic Bug Localization Benchmark [89.8240118116093]
BigIssue is a benchmark for realistic bug localization.
We provide a general benchmark with a diversity of real and synthetic Java bugs.
We hope to advance the state of the art in bug localization, in turn improving APR performance and increasing its applicability to the modern development cycle.
arXiv Detail & Related papers (2022-07-21T20:17:53Z)
- Break-It-Fix-It: Unsupervised Learning for Program Repair [90.55497679266442]
We propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas.
We use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data.
Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data.
BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python and 71.7% on DeepFix.
arXiv Detail & Related papers (2021-06-11T20:31:04Z)
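The BIFI loop described above can be sketched in miniature: a critic decides whether code is "good", a fixer is run on real bad inputs, and only critic-approved outputs are kept as new (bad, fixed) training pairs, with a breaker synthesizing further pairs. The rule-based fixer and breaker below are toy stand-ins for the paper's learned models; only the data-generation loop is being illustrated.

```python
# Toy sketch of one Break-It-Fix-It (BIFI) round.

def critic(code: str) -> bool:
    """Accept code iff it parses (a syntax-level notion of 'good')."""
    try:
        compile(code, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False

def fixer(code: str) -> str:
    # Toy rule: append a missing ':' to def/if/for headers.
    fixed = []
    for line in code.splitlines():
        s = line.rstrip()
        if s.split(" ")[0] in {"def", "if", "for"} and not s.endswith(":"):
            s += ":"
        fixed.append(s)
    return "\n".join(fixed)

def breaker(code: str) -> str:
    # Toy rule: drop ':' to synthesize new bad examples from good code.
    return code.replace(":", "")

real_bad = ["def f(x)\n    return x", "if True\n    pass"]
paired_data = []
for bad in real_bad:
    out = fixer(bad)
    if critic(out):                              # keep critic-approved fixes
        paired_data.append((bad, out))
        paired_data.append((breaker(out), out))  # breaker adds more pairs

print(len(paired_data))  # 4
```

In the full method these pairs are used to retrain both the fixer and the breaker, and iterating the loop is what yields the reported gains on GitHub-Python and DeepFix.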
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.