Related papers: GitHub Copilot AI pair programmer: Asset or Liability?

GitHub Copilot AI pair programmer: Asset or Liability?

URL: http://arxiv.org/abs/2206.15331v2
Date: Fri, 14 Apr 2023 20:52:00 GMT
Title: GitHub Copilot AI pair programmer: Asset or Liability?
Authors: Arghavan Moradi Dakhel, Vahid Majdinasab, Amin Nikanjam, Foutse Khomh, Michel C. Desmarais, Zhen Ming (Jack) Jiang
Abstract summary: We study the capabilities of Copilot in two different programming tasks. We compare Copilot's proposed solutions with those of human programmers on a set of programming tasks. The results show that Copilot is capable of providing solutions for almost all fundamental algorithmic problems.
Score: 14.572381978575182
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Automatic program synthesis is a long-lasting dream in software engineering. Recently, a promising Deep Learning (DL) based solution, called Copilot, has been proposed by OpenAI and Microsoft as an industrial product. Although some studies evaluate the correctness of Copilot solutions and report its issues, more empirical evaluations are necessary to understand how developers can benefit from it effectively. In this paper, we study the capabilities of Copilot in two different programming tasks: (i) generating (and reproducing) correct and efficient solutions for fundamental algorithmic problems, and (ii) comparing Copilot's proposed solutions with those of human programmers on a set of programming tasks. For the former, we assess the performance and functionality of Copilot in solving selected fundamental problems in computer science, like sorting and implementing data structures. In the latter, a dataset of programming problems with human-provided solutions is used. The results show that Copilot is capable of providing solutions for almost all fundamental algorithmic problems, however, some solutions are buggy and non-reproducible. Moreover, Copilot has some difficulties in combining multiple methods to generate a solution. Comparing Copilot to humans, our results show that the correct ratio of humans' solutions is greater than Copilot's suggestions, while the buggy solutions generated by Copilot require less effort to be repaired.

Related papers

A Human Centric Requirements Engineering Framework for Assessing Github Copilot Output [0.0]
GitHub Copilot introduces new challenges in how these software tools address human needs.<n>I analyzed GitHub Copilot's interaction with users through its chat interface.<n>I established a human-centered requirements framework with clear metrics to evaluate these qualities.
arXiv Detail & Related papers (2025-08-05T21:33:23Z)
Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows [66.1850490474361]
We conduct the first academic study to explore developer interactions with coding agents.<n>We evaluate two leading copilot and agentic coding assistants, GitHub Copilot and OpenHands.<n>Our results show agents have the potential to assist developers in ways that surpass copilots.
arXiv Detail & Related papers (2025-07-10T20:12:54Z)
CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming [56.17331530444765]
CPRet is a retrieval-oriented benchmark suite for competitive programming.<n>It covers four retrieval tasks: two code-centric (i.e., Text-to-Code and Code-to-Code) and two newly proposed problem-centric tasks (i.e., Problem-to-Duplicate and Simplified-to-Full)<n>Our contribution includes both high-quality training data and temporally separated test sets for reliable evaluation.
arXiv Detail & Related papers (2025-05-19T10:07:51Z)
Copilot Arena: A Platform for Code LLM Evaluation in the Wild [44.33771124408514]
Copilot Arena is a platform to collect user preferences for code generation through native integration into a developer's working environment. Copilot Arena has served over 4.5 million suggestions from 10 models and collected over 11k pairwise judgements.
arXiv Detail & Related papers (2025-02-13T13:40:52Z)
Robotic warehousing operations: a learn-then-optimize approach to large-scale neighborhood search [84.39855372157616]
This paper supports robotic parts-to-picker operations in warehousing by optimizing order-workstation assignments, item-pod assignments and the schedule of order fulfillment at workstations. We solve it via large-scale neighborhood search, with a novel learn-then-optimize approach to subproblem generation. In collaboration with Amazon Robotics, we show that our model and algorithm generate much stronger solutions for practical problems than state-of-the-art approaches.
arXiv Detail & Related papers (2024-08-29T20:22:22Z)
GitHub Copilot: the perfect Code compLeeter? [3.708656266586145]
This paper aims to evaluate GitHub Copilot's generated code quality based on the LeetCode problem set. We evaluate Copilot's reliability in the code generation stage, the correctness of the generated code and its dependency on the programming language.
arXiv Detail & Related papers (2024-06-17T08:38:29Z)
Learning Task Decomposition to Assist Humans in Competitive Programming [90.4846613669734]
We introduce a novel objective for learning task decomposition, termed value (AssistV) We collect a dataset of human repair experiences on different decomposed solutions. Under 177 hours of human study, our method enables non-experts to solve 33.3% more problems, speeds them up by 3.3x, and empowers them to match unassisted experts.
arXiv Detail & Related papers (2024-06-07T03:27:51Z)
Exploring the Effect of Multiple Natural Languages on Code Suggestion Using GitHub Copilot [46.822148186169144]
GitHub Copilot is an AI-enabled tool that automates program synthesis. Recent studies have extensively examined Copilot's capabilities in various programming tasks. However, little is known about the effect of different natural languages on code suggestion.
arXiv Detail & Related papers (2024-02-02T14:30:02Z)
Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study on GitHub and Stack Overflow [6.724815667295355]
GitHub Copilot, the AI programmer pair, utilize machine learning models trained on a large corpus of code snippets to generate code suggestions. Despite its popularity in software development, there is limited empirical evidence on the actual experiences of practitioners who work with Copilot. We collected data from 473 GitHub issues, 706 GitHub discussions, and 142 Stack Overflow posts.
arXiv Detail & Related papers (2023-11-02T06:24:38Z)
Demystifying Practices, Challenges and Expected Features of Using GitHub Copilot [3.655281304961642]
We conducted an empirical study by collecting and analyzing the data from Stack Overflow (SO) and GitHub Discussions. We identified the programming languages, technologies used with Copilot, functions implemented, benefits, limitations, and challenges when using Copilot. Our results suggest that using Copilot is like a double-edged sword, which requires developers to carefully consider various aspects when deciding whether or not to use it.
arXiv Detail & Related papers (2023-09-11T16:39:37Z)
Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow [49.724842920942024]
Industries such as finance, meteorology, and energy generate vast amounts of data daily. We propose Data-Copilot, a data analysis agent that autonomously performs querying, processing, and visualization of massive data tailored to diverse human requests.
arXiv Detail & Related papers (2023-06-12T16:12:56Z)
Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets [86.43523688236077]
Combinatorial optimization (CO) problems are often NP-hard and out of reach for exact algorithms. GFlowNets have emerged as a powerful machinery to efficiently sample from composite unnormalized densities sequentially. In this paper, we design Markov decision processes (MDPs) for different problems and propose to train conditional GFlowNets to sample from the solution space.
arXiv Detail & Related papers (2023-05-26T15:13:09Z)
Measuring the Runtime Performance of Code Produced with GitHub Copilot [1.6021036144262577]
We evaluate the runtime performance of code produced when developers use GitHub Copilot versus when they do not. Our results suggest that using Copilot may produce code with a significantly slower runtime performance.
arXiv Detail & Related papers (2023-05-10T20:14:52Z)
Conversing with Copilot: Exploring Prompt Engineering for Solving CS1 Problems Using Natural Language [3.155277175705079]
GitHub Copilot is an artificial intelligence model for automatically generating source code from natural language problem descriptions. Since June 2022, Copilot has officially been available for free to all students as a plug-in to development environments like Visual Studio Code.
arXiv Detail & Related papers (2022-10-27T03:48:24Z)
Competition-Level Code Generation with AlphaCode [74.87216298566942]
We introduce AlphaCode, a system for code generation that can create novel solutions to problems that require deeper reasoning. In simulated evaluations on recent programming competitions on the Codeforces platform, AlphaCode achieved on average a ranking of top 54.3%.
arXiv Detail & Related papers (2022-02-08T23:16:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.