Related papers: Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study on GitHub and Stack Overflow

Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study on GitHub and Stack Overflow

URL: http://arxiv.org/abs/2311.01020v4
Date: Sat, 31 Aug 2024 12:46:51 GMT
Title: Exploring the Problems, their Causes and Solutions of AI Pair Programming: A Study on GitHub and Stack Overflow
Authors: Xiyu Zhou, Peng Liang, Beiqi Zhang, Zengyang Li, Aakash Ahmad, Mojtaba Shahin, Muhammad Waseem,
Abstract summary: GitHub Copilot, the AI programmer pair, utilize machine learning models trained on a large corpus of code snippets to generate code suggestions. Despite its popularity in software development, there is limited empirical evidence on the actual experiences of practitioners who work with Copilot. We collected data from 473 GitHub issues, 706 GitHub discussions, and 142 Stack Overflow posts.
Score: 6.724815667295355
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With the recent advancement of Artificial Intelligence (AI) and Large Language Models (LLMs), AI-based code generation tools become a practical solution for software development. GitHub Copilot, the AI pair programmer, utilizes machine learning models trained on a large corpus of code snippets to generate code suggestions using natural language processing. Despite its popularity in software development, there is limited empirical evidence on the actual experiences of practitioners who work with Copilot. To this end, we conducted an empirical study to understand the problems that practitioners face when using Copilot, as well as their underlying causes and potential solutions. We collected data from 473 GitHub issues, 706 GitHub discussions, and 142 Stack Overflow posts. Our results reveal that (1) Operation Issue and Compatibility Issue are the most common problems faced by Copilot users, (2) Copilot Internal Error, Network Connection Error, and Editor/IDE Compatibility Issue are identified as the most frequent causes, and (3) Bug Fixed by Copilot, Modify Configuration/Setting, and Use Suitable Version are the predominant solutions. Based on the results, we discuss the potential areas of Copilot for enhancement, and provide the implications for the Copilot users, the Copilot team, and researchers.

Related papers

A Human Centric Requirements Engineering Framework for Assessing Github Copilot Output [0.0]
GitHub Copilot introduces new challenges in how these software tools address human needs.<n>I analyzed GitHub Copilot's interaction with users through its chat interface.<n>I established a human-centered requirements framework with clear metrics to evaluate these qualities.
arXiv Detail & Related papers (2025-08-05T21:33:23Z)
Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows [66.1850490474361]
We conduct the first academic study to explore developer interactions with coding agents.<n>We evaluate two leading copilot and agentic coding assistants, GitHub Copilot and OpenHands.<n>Our results show agents have the potential to assist developers in ways that surpass copilots.
arXiv Detail & Related papers (2025-07-10T20:12:54Z)
SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving [90.32201622392137]
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs)<n>Unlike traditional static benchmarks, SwingArena models the collaborative process of software by pairing LLMs as iterations, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines.
arXiv Detail & Related papers (2025-05-29T18:28:02Z)
Fundamental Risks in the Current Deployment of General-Purpose AI Models: What Have We (Not) Learnt From Cybersecurity? [60.629883024152576]
Large Language Models (LLMs) have seen rapid deployment in a wide range of use cases. OpenAIs Altera are just a few examples of increased autonomy, data access, and execution capabilities. These methods come with a range of cybersecurity challenges.
arXiv Detail & Related papers (2024-12-19T14:44:41Z)
GitHub Copilot: the perfect Code compLeeter? [3.708656266586145]
This paper aims to evaluate GitHub Copilot's generated code quality based on the LeetCode problem set. We evaluate Copilot's reliability in the code generation stage, the correctness of the generated code and its dependency on the programming language.
arXiv Detail & Related papers (2024-06-17T08:38:29Z)
Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data [49.1574468325115]
ChatGPT is an AI tool that enhances software production efficiency. We estimate ChatGPT's effects on the number of git pushes, repositories, and unique developers per 100,000 people. These results suggest that AI tools like ChatGPT can substantially boost developer productivity, though further analysis is needed to address potential downsides such as low quality code and privacy concerns.
arXiv Detail & Related papers (2024-06-16T19:11:15Z)
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement [48.29860831901484]
We introduce OS-Copilot, a framework to build generalist agents capable of interfacing with comprehensive elements in an operating system (OS) We use OS-Copilot to create FRIDAY, a self-improving embodied agent for automating general computer tasks. On GAIA, a general AI assistants benchmark, FRIDAY outperforms previous methods by 35%, showcasing strong generalization to unseen applications via accumulated skills from previous tasks.
arXiv Detail & Related papers (2024-02-12T07:29:22Z)
Exploring the Effect of Multiple Natural Languages on Code Suggestion Using GitHub Copilot [46.822148186169144]
GitHub Copilot is an AI-enabled tool that automates program synthesis. Recent studies have extensively examined Copilot's capabilities in various programming tasks. However, little is known about the effect of different natural languages on code suggestion.
arXiv Detail & Related papers (2024-02-02T14:30:02Z)
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? [80.52201658231895]
SWE-bench is an evaluation framework consisting of $2,294$ software engineering problems drawn from real GitHub issues and corresponding pull requests across $12$ popular Python repositories. We show that both state-of-the-art proprietary models and our fine-tuned model SWE-Llama can resolve only the simplest issues.
arXiv Detail & Related papers (2023-10-10T16:47:29Z)
Demystifying Practices, Challenges and Expected Features of Using GitHub Copilot [3.655281304961642]
We conducted an empirical study by collecting and analyzing the data from Stack Overflow (SO) and GitHub Discussions. We identified the programming languages, technologies used with Copilot, functions implemented, benefits, limitations, and challenges when using Copilot. Our results suggest that using Copilot is like a double-edged sword, which requires developers to carefully consider various aspects when deciding whether or not to use it.
arXiv Detail & Related papers (2023-09-11T16:39:37Z)
Measuring the Runtime Performance of Code Produced with GitHub Copilot [1.6021036144262577]
We evaluate the runtime performance of code produced when developers use GitHub Copilot versus when they do not. Our results suggest that using Copilot may produce code with a significantly slower runtime performance.
arXiv Detail & Related papers (2023-05-10T20:14:52Z)
Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code. We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z)
Conversing with Copilot: Exploring Prompt Engineering for Solving CS1 Problems Using Natural Language [3.155277175705079]
GitHub Copilot is an artificial intelligence model for automatically generating source code from natural language problem descriptions. Since June 2022, Copilot has officially been available for free to all students as a plug-in to development environments like Visual Studio Code.
arXiv Detail & Related papers (2022-10-27T03:48:24Z)
GitHub Copilot AI pair programmer: Asset or Liability? [14.572381978575182]
We study the capabilities of Copilot in two different programming tasks. We compare Copilot's proposed solutions with those of human programmers on a set of programming tasks. The results show that Copilot is capable of providing solutions for almost all fundamental algorithmic problems.
arXiv Detail & Related papers (2022-06-30T15:00:03Z)
An Empirical Cybersecurity Evaluation of GitHub Copilot's Code Contributions [8.285068188878578]
GitHub Copilot is a language model trained over open-source GitHub code. Code often contains bugs - and so, it is certain that the language model will have learned from exploitable, buggy code. This raises concerns on the security of Copilot's code contributions.
arXiv Detail & Related papers (2021-08-20T17:30:33Z)
Measuring Coding Challenge Competence With APPS [54.22600767666257]
We introduce APPS, a benchmark for code generation. Our benchmark includes 10,000 problems, which range from having simple one-line solutions to being substantial algorithmic challenges. Recent models such as GPT-Neo can pass approximately 15% of the test cases of introductory problems.
arXiv Detail & Related papers (2021-05-20T17:58:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.