What Developers Ask to ChatGPT in GitHub Pull Requests? An Exploratory Study
- URL: http://arxiv.org/abs/2508.17161v1
- Date: Sat, 23 Aug 2025 23:24:47 GMT
- Title: What Developers Ask to ChatGPT in GitHub Pull Requests? An Exploratory Study
- Authors: Julyanara R. Silva, Carlos Eduardo C. Dantas, Marcelo A. Maia
- Abstract summary: Large Language Models (LLMs) such as ChatGPT have introduced a new set of tools to support software developers in solving programming tasks. To explore this limitation, we conducted a manual evaluation of 155 valid ChatGPT links extracted from 139 merged Pull Requests. Our results produced a catalog of 14 types of ChatGPT requests categorized into four main groups.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emergence of Large Language Models (LLMs), such as ChatGPT, has introduced a new set of tools to support software developers in solving programming tasks. However, our understanding of the interactions (i.e., prompts) between developers and ChatGPT that result in contributions to the codebase remains limited. To explore this limitation, we conducted a manual evaluation of 155 valid ChatGPT share links extracted from 139 merged Pull Requests (PRs), revealing the interactions between developers and reviewers with ChatGPT that led to merges into the main codebase. Our results produced a catalog of 14 types of ChatGPT requests categorized into four main groups. We found a significant number of requests involving code review and the implementation of code snippets based on specific tasks. Developers also sought to clarify doubts by requesting technical explanations or by asking for text refinements for their web pages. Furthermore, we verified that prompts involving code generation generally required more interactions to produce the desired answer compared to prompts requesting text review or technical information.
Related papers
- Why Do Developers Engage with ChatGPT in Issue-Tracker? Investigating Usage and Reliance on ChatGPT-Generated Code [4.605779671279481]
We analyzed 1,152 Developer-ChatGPT conversations across 1,012 issues in GitHub. ChatGPT is primarily utilized for ideation, whereas its usage for validation is minimal. ChatGPT-generated code was used as-is to resolve only 5.83% of the issues.
arXiv Detail & Related papers (2024-12-09T18:47:31Z) - ChatGPT Inaccuracy Mitigation during Technical Report Understanding: Are We There Yet? [6.079560395398429]
It is unknown how ChatGPT hallucinates for technical texts that contain both textual and technical terms.
ChiME uses context-free grammar to parse stack traces in technical reports.
ChiME achieves 30.3% more corrections over ChatGPT responses.
arXiv Detail & Related papers (2024-11-11T20:54:54Z) - You Augment Me: Exploring ChatGPT-based Data Augmentation for Semantic Code Search [47.54163552754051]
Code search plays a crucial role in software development, enabling developers to retrieve and reuse code using natural language queries.
Recently, large language models (LLMs) have made remarkable progress in both natural and programming language understanding and generation.
We propose a novel approach ChatDANCE, which utilizes high-quality and diverse augmented data generated by a large language model.
arXiv Detail & Related papers (2024-08-10T12:51:21Z) - An Empirical Study on Developers Shared Conversations with ChatGPT in GitHub Pull Requests and Issues [20.121332699827633]
ChatGPT has significantly impacted software development practices.
Despite its widespread adoption, the impact of ChatGPT as an assistant in collaborative coding remains largely unexplored.
We analyze a dataset of 210 and 370 developer-shared conversations with ChatGPT in GitHub pull requests (PRs) and issues, respectively.
arXiv Detail & Related papers (2024-03-15T16:58:37Z) - Exploring ChatGPT's Capabilities on Vulnerability Management [56.4403395100589]
We explore ChatGPT's capabilities on 6 tasks involving the complete vulnerability management process with a large-scale dataset containing 70,346 samples.
One notable example is ChatGPT's proficiency in tasks like generating titles for software bug reports.
Our findings reveal the difficulties encountered by ChatGPT and shed light on promising future directions.
arXiv Detail & Related papers (2023-11-11T11:01:13Z) - Social Commonsense-Guided Search Query Generation for Open-Domain Knowledge-Powered Conversations [66.16863141262506]
We present a novel approach that focuses on generating internet search queries guided by social commonsense.
Our proposed framework addresses passive user interactions by integrating topic tracking, commonsense response generation and instruction-driven query generation.
arXiv Detail & Related papers (2023-10-22T16:14:56Z) - Exploring the Potential of ChatGPT in Automated Code Refinement: An Empirical Study [0.0]
ChatGPT, a cutting-edge language model, has demonstrated impressive performance in various natural language processing tasks.
We conduct the first empirical study to understand the capabilities of ChatGPT in code review tasks.
Our results show that ChatGPT achieves higher EM and BLEU scores of 22.78 and 76.44 respectively, while the state-of-the-art method achieves only 15.50 and 62.88 on a high-quality code review dataset.
arXiv Detail & Related papers (2023-09-15T07:41:33Z) - DevGPT: Studying Developer-ChatGPT Conversations [12.69439932665687]
This paper introduces DevGPT, a dataset curated to explore how software developers interact with ChatGPT.
The dataset encompasses 29,778 prompts and responses from ChatGPT, including 19,106 code snippets.
arXiv Detail & Related papers (2023-08-31T06:55:40Z) - Uncovering the Potential of ChatGPT for Discourse Analysis in Dialogue: An Empirical Study [51.079100495163736]
This paper systematically inspects ChatGPT's performance in two discourse analysis tasks: topic segmentation and discourse parsing.
ChatGPT demonstrates proficiency in identifying topic structures in general-domain conversations yet struggles considerably in specific-domain conversations.
Our deeper investigation indicates that ChatGPT can give more reasonable topic structures than human annotations but only linearly parses the hierarchical rhetorical structures.
arXiv Detail & Related papers (2023-05-15T07:14:41Z) - A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity [79.12003701981092]
We carry out an extensive technical evaluation of ChatGPT using 23 data sets covering 8 different common NLP application tasks.
We evaluate the multitask, multilingual and multi-modal aspects of ChatGPT based on these data sets and a newly designed multimodal dataset.
ChatGPT is 63.41% accurate on average in 10 different reasoning categories under logical reasoning, non-textual reasoning, and commonsense reasoning.
arXiv Detail & Related papers (2023-02-08T12:35:34Z) - Multi-hop Question Generation with Graph Convolutional Network [58.31752179830959]
Multi-hop Question Generation (QG) aims to generate answer-related questions by aggregating and reasoning over multiple scattered evidence from different paragraphs.
We propose the Multi-Hop Convolution Fusion Network for Question Generation (MulQG), which performs context encoding in multiple hops.
Our proposed model is able to generate fluent questions with high completeness and outperforms the strongest baseline by 20.8% in the multi-hop evaluation.
arXiv Detail & Related papers (2020-10-19T06:15:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.