Repairing Bugs in Python Assignments Using Large Language Models
- URL: http://arxiv.org/abs/2209.14876v1
- Date: Thu, 29 Sep 2022 15:41:17 GMT
- Title: Repairing Bugs in Python Assignments Using Large Language Models
- Authors: Jialu Zhang, Jos\'e Cambronero, Sumit Gulwani, Vu Le, Ruzica Piskac,
Gustavo Soares, Gust Verbruggen
- Abstract summary: We propose to use a large language model trained on code to build an APR system for programming assignments.
Our system can fix both syntactic and semantic mistakes by combining multi-modal prompts, iterative querying, test-case-based selection of few-shots, and program chunking.
We evaluate MMAPR on 286 real student programs and compare to a baseline built by combining a state-of-the-art Python syntax repair engine, BIFI, and state-of-the-art Python semantic repair engine for student assignments, Refactory.
- Score: 9.973714032271708
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Students often make mistakes on their introductory programming assignments as
part of their learning process. Unfortunately, providing custom repairs for
these mistakes can require a substantial amount of time and effort from class
instructors. Automated program repair (APR) techniques can be used to
synthesize such fixes. Prior work has explored the use of symbolic and neural
techniques for APR in the education domain. Both types of approaches require
either substantial engineering efforts or large amounts of data and training.
We propose to use a large language model trained on code, such as Codex, to
build an APR system -- MMAPR -- for introductory Python programming
assignments. Our system can fix both syntactic and semantic mistakes by
combining multi-modal prompts, iterative querying, test-case-based selection of
few-shots, and program chunking. We evaluate MMAPR on 286 real student programs
and compare to a baseline built by combining a state-of-the-art Python syntax
repair engine, BIFI, and state-of-the-art Python semantic repair engine for
student assignments, Refactory. We find that MMAPR can fix more programs and
produce smaller patches on average.
Related papers
- Repairs in a Block World: A New Benchmark for Handling User Corrections with Multi-Modal Language Models [48.42142115255159]
We release BlockWorld-Repairs: a dataset of multi-modal TPR sequences in an instruction-following manipulation task.
We evaluate several state-of-the-art Vision and Language Models (VLM) across multiple settings, focusing on their capability to process and accurately respond to TPRs.
Our results suggest that these models are not yet ready to be deployed in multi-modal collaborative settings.
arXiv Detail & Related papers (2024-09-21T21:06:25Z) - AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation [14.831115535710692]
We propose the concept of AI-oriented grammar.
This aims to represent code in a way that better suits the working mechanism of AI models.
Code written with AI-oriented grammar discards formats and uses a minimum number of tokens.
arXiv Detail & Related papers (2024-04-25T04:46:02Z) - A Novel Approach for Automatic Program Repair using Round-Trip
Translation with Large Language Models [50.86686630756207]
Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back.
Current generative models for Automatic Program Repair (APR) are pre-trained on source code and fine-tuned for repair.
This paper proposes bypassing the fine-tuning step and using Round-Trip Translation (RTT): translation of code from one programming language to another programming or natural language, and back.
arXiv Detail & Related papers (2024-01-15T22:36:31Z) - How Helpful do Novice Programmers Find the Feedback of an Automated
Repair Tool? [1.2990666399718034]
We describe our experience of using CLARA, an automated repair tool, to provide feedback to novices.
First, we extended CLARA to support a larger subset of the Python language, before integrating it with the Jupyter Notebooks used for our programming exercises.
We found that novices often struggled to understand the proposed repairs, echoing the well-known challenge to understand compiler/interpreter messages.
arXiv Detail & Related papers (2023-10-02T07:45:56Z) - Repair Is Nearly Generation: Multilingual Program Repair with LLMs [9.610685299268825]
We introduce RING, a multilingual repair engine powered by a large language model trained on code (LLMC) such as Codex.
Taking inspiration from the way programmers manually fix bugs, we show that a prompt-based strategy that conceptualizes repair as localization, transformation, and candidate ranking, can successfully repair programs in multiple domains with minimal effort.
arXiv Detail & Related papers (2022-08-24T16:25:58Z) - problexity -- an open-source Python library for binary classification
problem complexity assessment [0.0]
The classification problem's complexity assessment is an essential element of many topics in the supervised learning domain.
The tools currently available for the academic community, which would enable the calculation of problem complexity measures, are available only as libraries of the C++ and R languages.
This paper describes the software module that allows for the estimation of 22 complexity measures for the Python language.
arXiv Detail & Related papers (2022-07-14T07:32:15Z) - AVATAR: A Parallel Corpus for Java-Python Program Translation [77.86173793901139]
Program translation refers to migrating source code from one language to another.
We present AVATAR, a collection of 9,515 programming problems and their solutions written in two popular languages, Java and Python.
arXiv Detail & Related papers (2021-08-26T05:44:20Z) - ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z) - Break-It-Fix-It: Unsupervised Learning for Program Repair [90.55497679266442]
We propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas.
We use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data.
Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data.
BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python and 71.7% on DeepFix.
arXiv Detail & Related papers (2021-06-11T20:31:04Z) - OPFython: A Python-Inspired Optimum-Path Forest Classifier [68.8204255655161]
This paper proposes a Python-based Optimum-Path Forest framework, denoted as OPFython.
As OPFython is a Python-based library, it provides a more friendly environment and a faster prototyping workspace than the C language.
arXiv Detail & Related papers (2020-01-28T15:46:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.