Conversing with Copilot: Exploring Prompt Engineering for Solving CS1
Problems Using Natural Language
- URL: http://arxiv.org/abs/2210.15157v1
- Date: Thu, 27 Oct 2022 03:48:24 GMT
- Title: Conversing with Copilot: Exploring Prompt Engineering for Solving CS1
Problems Using Natural Language
- Authors: Paul Denny and Viraj Kumar and Nasser Giacaman
- Abstract summary: GitHub Copilot is an artificial intelligence model for automatically generating source code from natural language problem descriptions.
Since June 2022, Copilot has officially been available for free to all students as a plug-in to development environments like Visual Studio Code.
- Score: 3.155277175705079
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: GitHub Copilot is an artificial intelligence model for automatically
generating source code from natural language problem descriptions. Since June
2022, Copilot has officially been available for free to all students as a
plug-in to development environments like Visual Studio Code. Prior work
exploring OpenAI Codex, the underlying model that powers Copilot, has shown it
performs well on typical CS1 problems, thus raising concerns about the impact it
will have on how introductory programming courses are taught. However, little
is known about the types of problems for which Copilot does not perform well,
or about the natural language interactions that a student might have with
Copilot when resolving errors. We explore these questions by evaluating the
performance of Copilot on a publicly available dataset of 166 programming
problems. We find that it successfully solves around half of these problems on
its very first attempt, and that it solves 60% of the remaining problems using
only natural language changes to the problem description. We argue that this
type of prompt engineering, which we believe will become a standard interaction
between human and Copilot when it initially fails, is a potentially useful
learning activity that promotes computational thinking skills, and is likely to
change the nature of code writing skill development.
Related papers
- Exploring the Effect of Multiple Natural Languages on Code Suggestion
Using GitHub Copilot [46.822148186169144]
GitHub Copilot is an AI-enabled tool that automates program synthesis.
Recent studies have extensively examined Copilot's capabilities in various programming tasks.
However, little is known about the effect of different natural languages on code suggestion.
arXiv Detail & Related papers (2024-02-02T14:30:02Z)
- Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study on GitHub and Stack Overflow [6.724815667295355]
GitHub Copilot, the AI pair programmer, utilizes machine learning models trained on a large corpus of code snippets to generate code suggestions.
Despite its popularity in software development, there is limited empirical evidence on the actual experiences of practitioners who work with Copilot.
We collected data from 473 GitHub issues, 706 GitHub discussions, and 142 Stack Overflow posts.
arXiv Detail & Related papers (2023-11-02T06:24:38Z)
- SWE-bench: Can Language Models Resolve Real-World GitHub Issues? [80.52201658231895]
SWE-bench is an evaluation framework consisting of 2,294 software engineering problems drawn from real GitHub issues and corresponding pull requests across 12 popular Python repositories.
We show that both state-of-the-art proprietary models and our fine-tuned model SWE-Llama can resolve only the simplest issues.
arXiv Detail & Related papers (2023-10-10T16:47:29Z)
- Demystifying Practices, Challenges and Expected Features of Using GitHub
Copilot [3.655281304961642]
We conducted an empirical study by collecting and analyzing the data from Stack Overflow (SO) and GitHub Discussions.
We identified the programming languages, technologies used with Copilot, functions implemented, benefits, limitations, and challenges when using Copilot.
Our results suggest that using Copilot is like a double-edged sword, which requires developers to carefully consider various aspects when deciding whether or not to use it.
arXiv Detail & Related papers (2023-09-11T16:39:37Z)
- Giving Feedback on Interactive Student Programs with Meta-Exploration [74.5597783609281]
Developing interactive software, such as websites or games, is a particularly engaging way to learn computer science.
Standard approaches require instructors to manually grade student-implemented interactive programs.
Online platforms that serve millions, like Code.org, are unable to provide any feedback on assignments for implementing interactive programs.
arXiv Detail & Related papers (2022-11-16T10:00:23Z)
- Human-guided Collaborative Problem Solving: A Natural Language based
Framework [74.27063862727849]
Our framework consists of three components -- a natural language engine that parses the language utterances to a formal representation and vice-versa.
We illustrate the ability of this framework to address the key challenges of collaborative problem solving by demonstrating it on a collaborative building task in a Minecraft-based blocksworld domain.
arXiv Detail & Related papers (2022-07-19T21:52:37Z)
- GitHub Copilot AI pair programmer: Asset or Liability? [14.572381978575182]
We study the capabilities of Copilot in two different programming tasks.
We compare Copilot's proposed solutions with those of human programmers on a set of programming tasks.
The results show that Copilot is capable of providing solutions for almost all fundamental algorithmic problems.
arXiv Detail & Related papers (2022-06-30T15:00:03Z)
- Competition-Level Code Generation with AlphaCode [74.87216298566942]
We introduce AlphaCode, a system for code generation that can create novel solutions to problems that require deeper reasoning.
In simulated evaluations on recent programming competitions on the Codeforces platform, AlphaCode achieved on average a ranking of top 54.3%.
arXiv Detail & Related papers (2022-02-08T23:16:31Z)
- An Empirical Cybersecurity Evaluation of GitHub Copilot's Code
Contributions [8.285068188878578]
GitHub Copilot is a language model trained over open-source GitHub code.
Code often contains bugs - and so, it is certain that the language model will have learned from exploitable, buggy code.
This raises concerns on the security of Copilot's code contributions.
arXiv Detail & Related papers (2021-08-20T17:30:33Z)
- Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation.
Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges.
Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z)